fix(teammate-progress): keep cumulative token+tool counts across prompts (#475) #1402

Open
0xghost42 wants to merge 1 commit into Gitlawb:main from 0xghost42:fix/475-teammate-cumulative-token-tracker

@0xghost42
Contributor

Closes #475.

What broke

TeammateSpinnerLine, InProcessTeammateDetailDialog, and the leader's spinner aggregate (Spinner.tsx:191-200) all read teammate.progress?.tokenCount and teammate.progress?.toolUseCount straight from AppState. Those fields are populated by getProgressUpdate(tracker) from inside the in-process teammate runner's per-message loop. The runner used to do this:

while (!abortController.signal.aborted && !shouldExit) {
  // ...
  // Create fresh progress tracker for this prompt
  const tracker = createProgressTracker()
  // ...
}

A fresh tracker per prompt iteration meant that whenever the leader sent a second prompt to the same teammate (i.e. any multi-turn agent-team flow), the previous prompts' cumulativeOutputTokens and toolUseCount were dropped on the floor and the pill jumped back to (or near) zero.

The Claude API reports input_tokens per request, covering the whole prompt that was sent (cumulative-per-request), and forkContextMessages already re-sends the full history each turn, so latestInputTokens was always large and partially masked the bug. The visible symptom matched the report on #475: output tokens and tool-use counts repeatedly looked low or stale to leaders, even after long sessions.
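
To make the failure mode concrete, here is a minimal sketch of the old per-prompt behavior. The `Tracker` shape, `recordMessage`, and `runWithFreshTrackerPerPrompt` are hypothetical, simplified stand-ins written for illustration; only the field names and `createProgressTracker` come from this PR, and the real implementations may differ.

```typescript
// Hypothetical, simplified stand-ins; the real tracker in the repo may differ.
type Usage = { input_tokens: number; output_tokens: number };
type Message = { usage: Usage; toolUses: number };
type Tracker = {
  latestInputTokens: number;
  cumulativeOutputTokens: number;
  toolUseCount: number;
};

function createProgressTracker(): Tracker {
  return { latestInputTokens: 0, cumulativeOutputTokens: 0, toolUseCount: 0 };
}

// Input is overwritten with the latest request's value; output and tool uses add up.
function recordMessage(tracker: Tracker, message: Message): void {
  tracker.latestInputTokens = message.usage.input_tokens;
  tracker.cumulativeOutputTokens += message.usage.output_tokens;
  tracker.toolUseCount += message.toolUses;
}

// Old behavior: a new tracker for every prompt, so each prompt starts from zero.
function runWithFreshTrackerPerPrompt(prompts: Message[][]): Tracker[] {
  return prompts.map((messages) => {
    const tracker = createProgressTracker();
    for (const message of messages) recordMessage(tracker, message);
    return tracker;
  });
}

const twoPromptSession: Message[][] = [
  [
    { usage: { input_tokens: 1000, output_tokens: 200 }, toolUses: 2 },
    { usage: { input_tokens: 1300, output_tokens: 150 }, toolUses: 1 },
  ],
  [{ usage: { input_tokens: 1600, output_tokens: 50 }, toolUses: 1 }],
];

const perPrompt = runWithFreshTrackerPerPrompt(twoPromptSession);
console.log(perPrompt[0].cumulativeOutputTokens, perPrompt[0].toolUseCount); // 350 3
console.log(perPrompt[1].cumulativeOutputTokens, perPrompt[1].toolUseCount); // 50 1
console.log(perPrompt[1].latestInputTokens); // 1600
```

In this sketch the second prompt reports only 50 output tokens and 1 tool use, while the input figure stays large, which is the masking effect described above.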

Fix

Hoist createProgressTracker() (and the resolver) out of the while loop so a single tracker spans the whole teammate lifetime. cumulativeOutputTokens and toolUseCount now keep running totals across every prompt; latestInputTokens continues to track the latest request's cumulative-per-request input (which is what we want for "running context cost"). recentActivities stays capped at MAX_RECENT_ACTIVITIES = 5 so the preview doesn't bloat over long teammate sessions.

Behavior on a single-prompt teammate is unchanged.
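
The hoisted shape can be sketched roughly as follows. This is not the runner's actual code: `recordMessage`, `runWithHoistedTracker`, and the `Tracker` and `Message` types are assumed, simplified stand-ins, and only `createProgressTracker` and the field names are taken from this PR.

```typescript
// Hypothetical, simplified stand-ins; the real tracker in the repo may differ.
type Usage = { input_tokens: number; output_tokens: number };
type Message = { usage: Usage; toolUses: number };
type Tracker = {
  latestInputTokens: number;
  cumulativeOutputTokens: number;
  toolUseCount: number;
};

function createProgressTracker(): Tracker {
  return { latestInputTokens: 0, cumulativeOutputTokens: 0, toolUseCount: 0 };
}

function recordMessage(tracker: Tracker, message: Message): void {
  tracker.latestInputTokens = message.usage.input_tokens; // latest request only
  tracker.cumulativeOutputTokens += message.usage.output_tokens; // running total
  tracker.toolUseCount += message.toolUses; // running total
}

// New behavior: one tracker created before the prompt loop and reused throughout.
// A snapshot is taken after each prompt to show what the UI would see.
function runWithHoistedTracker(prompts: Message[][]): Tracker[] {
  const tracker = createProgressTracker(); // hoisted out of the loop
  const snapshots: Tracker[] = [];
  for (const messages of prompts) {
    for (const message of messages) recordMessage(tracker, message);
    snapshots.push({ ...tracker });
  }
  return snapshots;
}

const snapshots = runWithHoistedTracker([
  [
    { usage: { input_tokens: 1000, output_tokens: 200 }, toolUses: 2 },
    { usage: { input_tokens: 1300, output_tokens: 150 }, toolUses: 1 },
  ],
  [{ usage: { input_tokens: 1600, output_tokens: 50 }, toolUses: 1 }],
]);

console.log(snapshots[0].cumulativeOutputTokens, snapshots[0].toolUseCount); // 350 3
console.log(snapshots[1].cumulativeOutputTokens, snapshots[1].toolUseCount); // 400 4
console.log(snapshots[1].latestInputTokens); // 1600
```

With a single prompt the loop body runs once, so the result is the same as with a per-prompt tracker, consistent with the unchanged single-prompt behavior noted above.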

Tests

New src/tasks/LocalAgentTask/progressTracker.test.ts pins the contract the runner relies on. Six cases:

  1. Output tokens accumulate across multiple assistant messages.
  2. Cumulative semantics survive a simulated multi-prompt teammate session (3 turns prompt-1 + 2 turns prompt-2 → monotonically increasing tokenCount).
  3. Fresh-tracker-per-prompt regression repro: confirms that under the old behavior prompt-1's output tokens and tool uses are lost when a new tracker is created for prompt-2.
  4. Tool use count accumulates across messages.
  5. cache_creation_input_tokens + cache_read_input_tokens fold into latestInputTokens.
  6. recentActivities stays capped at 5 while toolUseCount keeps climbing.

bun test (full): 2998/2998 pass. bun run build clean (no React/Ink leakage in SDK bundle; external lists valid).
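
For readers who want a feel for cases 5 and 6, the following is a rough sketch of the two properties using plain checks rather than the actual test file. `recordUsage` and `recordToolUse` are hypothetical helpers and the tracker shape is assumed; `MAX_RECENT_ACTIVITIES`, the tracker field names, and the usage field names follow this PR.

```typescript
// Hypothetical, simplified stand-ins; the real tracker and tests may differ.
const MAX_RECENT_ACTIVITIES = 5;

type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
};

type Tracker = {
  latestInputTokens: number;
  cumulativeOutputTokens: number;
  toolUseCount: number;
  recentActivities: string[];
};

function createProgressTracker(): Tracker {
  return {
    latestInputTokens: 0,
    cumulativeOutputTokens: 0,
    toolUseCount: 0,
    recentActivities: [],
  };
}

// Case 5: cache creation and cache read tokens are folded into the input figure.
function recordUsage(tracker: Tracker, usage: Usage): void {
  tracker.latestInputTokens =
    usage.input_tokens +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0);
  tracker.cumulativeOutputTokens += usage.output_tokens;
}

// Case 6: the count keeps growing while the preview list drops its oldest entry.
function recordToolUse(tracker: Tracker, toolName: string): void {
  tracker.toolUseCount += 1;
  tracker.recentActivities.push(toolName);
  if (tracker.recentActivities.length > MAX_RECENT_ACTIVITIES) {
    tracker.recentActivities.shift();
  }
}

const tracker = createProgressTracker();
recordUsage(tracker, {
  input_tokens: 100,
  output_tokens: 20,
  cache_creation_input_tokens: 30,
  cache_read_input_tokens: 400,
});
for (let i = 1; i <= 8; i++) recordToolUse(tracker, `tool-${i}`);

console.log(tracker.latestInputTokens); // 530
console.log(tracker.toolUseCount); // 8
console.log(tracker.recentActivities.join(",")); // tool-4,tool-5,tool-6,tool-7,tool-8
```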

Repro for reviewer

  1. Start two teammates (e.g. /team with two agents).
  2. Send a first prompt to one — note token + tool-use count growing on its pill.
  3. Send a second prompt to the same teammate. Pre-fix, the count snaps back to the second prompt's per-call value (often visibly lower); post-fix, it keeps climbing.

…pts (Gitlawb#475)

The in-process teammate runner re-created the progress tracker on every
prompt iteration, so task.progress.tokenCount and toolUseCount were reset
between leader prompts to the same teammate. TeammateSpinnerLine,
InProcessTeammateDetailDialog and the Spinner aggregate all read these
counters directly, which is why agent-team pills appeared to lose tokens
and tool uses partway through a session.

The Claude API returns input_tokens as cumulative-per-request (each turn
re-sends forkContextMessages history), so latestInputTokens already
captures the running context cost. The fix moves createProgressTracker
out of the while-loop so cumulativeOutputTokens and toolUseCount also
keep their running totals across multiple prompts.

Adds src/tasks/LocalAgentTask/progressTracker.test.ts pinning:
- output tokens accumulate across multiple assistant messages
- cumulative semantics survive a simulated multi-prompt teammate session
- fresh-tracker-per-prompt regression repro (prior outputs + tool uses lost)
- tool use count accumulates
- cache_creation/read input tokens fold into latestInputTokens
- recentActivities stays capped while toolUseCount keeps climbing

bun test (full): 2998/2998 pass. bun run build clean.
Collaborator

@jatmn jatmn left a comment

Thanks for the contribution. I do not see any actionable issues from my review.

Development

Successfully merging this pull request may close these issues.

tokens and tool uses not showing in agent teams
