[Chore]: Document cuga-agent SDK trajectory requirement (supersedes eval workaround PR #5) #42

@haroldship

Summary

Document the cuga-agent requirement for full SDK trajectory capture (steps[].prompts, cuga-viz) so contributors do not re-add an eval-side TokenUsageTracker workaround.

Background

  • cuga-agent#71: closed; fixed in cuga-agent#236 (CugaAgent wires TokenUsageTracker in _build_callbacks()).
  • cuga-eval PR #5: duplicate callback on the eval-repo side; closed without merge (with current cuga-agent it would register token tracking twice).

There is no matching open issue in cuga-eval for PR #5 (the PR only referenced upstream cuga-agent#71).

Tasks

  • In CONTRIBUTING.md (cuga-agent path dependency section): note that SDK benchmarks need a ../cuga-agent checkout that includes the fix for #71 (cuga-agent#236) to get rich trajectories; do not add a harness-level copy of TokenUsageTracker.
  • (Optional) One-line log/warning in setup_agent_with_tools if an old cuga-agent is detected (only if a cheap version/import check exists).
  • (Optional) Add direct langchain-core / python-multipart floor pins in pyproject.toml for CVE policy — uv.lock already resolves newer transitive versions; evaluate whether explicit pins add value.
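
For the optional detection task, a cheap check could look roughly like the sketch below. It is a heuristic under stated assumptions: it only tests whether the agent class exposes `_build_callbacks()`, the method named in Background. This issue does not say whether that method existed before cuga-agent#236, so a real check may need a version comparison instead. The helper name `warn_if_old_cuga_agent` is hypothetical, and the class is passed in as an argument so that no cuga-agent import path has to be guessed.

```python
import logging

logger = logging.getLogger(__name__)


def warn_if_old_cuga_agent(agent_cls: type) -> bool:
    """Return True (and log one warning) if agent_cls looks like an old cuga-agent.

    Assumption: a fixed CugaAgent defines a callable _build_callbacks().
    The presence of the method does not prove it wires TokenUsageTracker.
    """
    if callable(getattr(agent_cls, "_build_callbacks", None)):
        return False
    logger.warning(
        "%s has no _build_callbacks(); SDK trajectories may lack steps[].prompts. "
        "Update the ../cuga-agent checkout.",
        agent_cls.__name__,
    )
    return True
```

Called once from setup_agent_with_tools with the imported agent class, this would add a single log line and no extra callbacks.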

Acceptance criteria

  • New contributors running SDK/AppWorld/M3 evals understand trajectory richness comes from cuga-agent, not cuga-eval callbacks.
  • No duplicate TokenUsageTracker / SDKTokenUsageTrackerCallback in this repo.
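
The second criterion could be guarded by a small repository scan, sketched below. The function name `find_tracker_definitions` and the choice to match `class` statements by regex are assumptions made for illustration. The scan only catches classes defined under these exact names, so a renamed copy would slip through.

```python
import re
from pathlib import Path

# Class names taken from the acceptance criteria above.
FORBIDDEN = ("TokenUsageTracker", "SDKTokenUsageTrackerCallback")
_PATTERN = re.compile(r"^\s*class\s+(%s)\b" % "|".join(FORBIDDEN), re.MULTILINE)


def find_tracker_definitions(root: str) -> list[tuple[str, str]]:
    """Return (relative path, class name) for each forbidden class defined under root."""
    hits = []
    base = Path(root)
    # rglob also walks virtualenvs and vendored code; point root at the source tree.
    for path in sorted(base.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in _PATTERN.finditer(text):
            hits.append((path.relative_to(base).as_posix(), match.group(1)))
    return hits
```

A test asserting that this returns an empty list for the cuga-eval source directory would fail if someone re-added the workaround. Imports and subclasses of the upstream class are deliberately not flagged.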

Metadata

Assignees: none
Labels: documentation
Status: Backlog