fix(analytics): handle new Langfuse trace input format after PR#33 by Sergey-Zeltyn · Pull Request #40 · cuga-project/cuga-eval

Sergey-Zeltyn · 2026-06-04T12:18:58Z

Adapt trace comparison analytics to new Langfuse format

Related Issue

Fixes #33 (downstream breakage)
Closes #43

Description

The trace comparison analytics pipeline crashed with AttributeError: 'list' object has no attribute 'get' after PR#33 was merged.

Type of Changes

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix that would cause existing functionality to not work as expected)

Root Cause

LangfuseAdapter.load_trace() assumed the top-level input field of a Langfuse trace was always a dict {"intent": "...", "task_name": "...", ...} — set explicitly by the old start_as_current_observation(input={...}) eval wrapper span.

PR#33 removed that wrapper span and made the LangGraph CallbackHandler the trace root. LangGraph records the raw message list passed to agent.invoke as the trace input, so data["input"] is now a list of {"role": ..., "content": ...} dicts. Calling .get("intent") on a list raises AttributeError.

Solution

Handle both formats in langfuse_adapter.py:

Old format (dict): extract input["intent"] directly
New format (list): extract content from the first user-role message, which is the task intent string passed to HumanMessage(content=intent)

Testing

I have tested this fix locally (./scripts/analyze.sh --analytics trace_compare --config all_trajectories.conf no longer crashes)
I have added tests that prove my fix works
All new and existing tests passed
I have verified the bug no longer occurs

Checklist

My code follows the code style of this project
I have performed a self-review of my own code
I have made corresponding changes to the documentation if needed

Summary by CodeRabbit

Bug Fixes
- Enhanced trace input data extraction to properly support multiple input format variations, ensuring consistent and reliable handling across different data structures.

PR#33 removed the eval wrapper span (start_as_current_observation), making the LangGraph callback the trace root. The top-level input is now a list of LangChain message dicts instead of {"intent": ...}, causing AttributeError in the trace comparison pipeline. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-04T12:19:10Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bd39670e-26a5-43d2-be63-b4676c688e29

📥 Commits

Reviewing files that changed from the base of the PR and between 28cd6b8 and 6e0ae6e.

📒 Files selected for processing (1)

analytics/trace_comparison_rules/src/langfuse_adapter.py

🚧 Files skipped from review as they are similar to previous changes (1)

analytics/trace_comparison_rules/src/langfuse_adapter.py

📝 Walkthrough

Walkthrough

Updated Langfuse adapter's load_trace to extract task formulation from data["input"] supporting dict input (use intent) or list-of-messages input (use first message with role == "user"'s content), assigning only when the result is a string.

Changes

Langfuse Input Shape Handling

Layer / File(s)	Summary
Task formulation extraction from dual input shapes `analytics/trace_comparison_rules/src/langfuse_adapter.py`	`load_trace` now handles `data["input"]` as either a dict with `intent` or a list of message dicts, extracting `task_formulation` from the first user-role message's `content` for the list case and only assigning when the value is a string.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: fixing the handler to support the new Langfuse trace input format introduced in PR#33.
Linked Issues check	✅ Passed	The PR directly addresses the crash in issue `#43` by handling both dict and list input formats, fulfilling the requirement from `#33` that changed the input structure.
Out of Scope Changes check	✅ Passed	The changes are narrowly focused on updating langfuse_adapter.py to handle the new input format from PR#33, with no unrelated modifications detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/trace-comparison-new-langfuse-format

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@analytics/trace_comparison_rules/src/langfuse_adapter.py`:
- Around line 73-85: The extracted task formulation from raw_input may be
non-string (m.get("content")), so ensure trace.task_formulation is either a
string or None: after obtaining the candidate (from raw_input dict or the
next(...) for list items), check isinstance(candidate, str) and assign it to
trace.task_formulation only if true; otherwise, either set
trace.task_formulation = None or convert safely to a string representation
(e.g., json.dumps(candidate)) before assigning. Update the logic around
raw_input, the dict branch (raw_input.get("intent")), and the list branch (the
next(...) generator that reads m.get("content")) to include this
validation/conversion so downstream code that calls .strip() on
trace.task_formulation is safe.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 33464d9b-e979-4bbb-a3cb-384206320906

📥 Commits

Reviewing files that changed from the base of the PR and between 167291d and 28cd6b8.

📒 Files selected for processing (1)

analytics/trace_comparison_rules/src/langfuse_adapter.py

haroldship

Looks good to me, CodeRabbit comment is minor. Up to you if you want to include it.

Address PR review: LangChain message `content` may be a list (multi-modal blocks), so extracting it directly could break downstream code that calls .strip() and string concatenation on task_formulation. Only assign when the candidate is a string; otherwise leave as None (downstream already handles None with an "Unknown Task" fallback). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Sergey-Zeltyn requested a review from haroldship June 4, 2026 12:20

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread analytics/trace_comparison_rules/src/langfuse_adapter.py

haroldship approved these changes Jun 4, 2026

View reviewed changes

Sergey-Zeltyn merged commit 1f57f77 into main Jun 4, 2026
3 of 4 checks passed

Sergey-Zeltyn deleted the fix/trace-comparison-new-langfuse-format branch June 4, 2026 13:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(analytics): handle new Langfuse trace input format after PR#33#40

fix(analytics): handle new Langfuse trace input format after PR#33#40
Sergey-Zeltyn merged 2 commits into
mainfrom
fix/trace-comparison-new-langfuse-format

Sergey-Zeltyn commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

haroldship left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Sergey-Zeltyn commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Adapt trace comparison analytics to new Langfuse format

Related Issue

Description

Type of Changes

Root Cause

Solution

Testing

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

haroldship left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sergey-Zeltyn commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading