feat: add token debug view for rollout rows in ep logs#430
Closing in favor of #431, which is based on current main and contains the same intended UI changes without the outdated branch history.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 086a9776ff
key={i}
token={tok}
tokenId={allIds[i]}
turnIdx={isPrompt ? 0 : trace.step_index}
Offset turn indices before rendering completion tokens
TurnSection uses trace.step_index directly as turnIdx for completion tokens, but this breaks when traces are zero-based (a valid RL convention where the first turn is 0): EpisodeToken then treats first-turn completions as prompt tokens because turnIdx > 0 is false, so those tokens render as masked and lose logprob coloring/tooltip semantics. Normalize the displayed turn index to a positive completion index before passing it to token rendering.
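One way to do the normalization suggested above, as a minimal sketch: infer the base from the smallest step_index in the episode and shift zero-based traces so the first completion turn is 1. The TurnTrace shape and the helper names completionTurnIdx and turnIdxFor are hypothetical; only step_index, isPrompt, and turnIdx come from the snippet, and inferring the base from the data is an assumption (an explicit flag on the trace would be more robust).

```typescript
// Hypothetical trace shape: only step_index is taken from the reviewed code.
interface TurnTrace {
  step_index: number;
}

// Returns a positive completion turn index whether traces are zero-based
// or one-based. The base is inferred from the smallest step_index seen in
// the episode (an assumption, not something the PR confirms).
function completionTurnIdx(trace: TurnTrace, traces: TurnTrace[]): number {
  const minStep = Math.min(...traces.map((t) => t.step_index));
  // Shift only when the episode starts at 0 (or below); one-based traces pass through.
  const offset = minStep <= 0 ? 1 - minStep : 0;
  return trace.step_index + offset;
}

// Prompt tokens keep turn 0; completion tokens always get an index > 0,
// so a `turnIdx > 0` check downstream still separates the two.
function turnIdxFor(
  isPrompt: boolean,
  trace: TurnTrace,
  traces: TurnTrace[],
): number {
  return isPrompt ? 0 : completionTurnIdx(trace, traces);
}
```

With this in place the call site would pass turnIdxFor(isPrompt, trace, traces) instead of the inline ternary, leaving the token component's own check unchanged.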
const isTool = message.role === "tool";
const hasToolCalls = message.tool_calls && message.tool_calls.length > 0;
const hasFunctionCall = message.function_call;
const hideMessageContent = message.role === "assistant" && hasToolCalls;
Preserve assistant text when tool calls carry real content
The new hideMessageContent condition suppresses all assistant message text whenever tool_calls is present, which drops legitimate assistant narration in responses that include both natural-language content and tool calls. In those cases the transcript becomes incomplete (only tool-call cards remain), so this should be gated on payload-like/empty content rather than every assistant message with tool_calls.
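A possible shape for the narrower gate described above, as a sketch only: hide assistant content when tool calls are present and the content is empty or looks like a serialized payload. The ChatMessage interface, the helper names isPayloadLike and shouldHideContent, and the "parses as a JSON object or array" heuristic are all assumptions; the sketch also assumes content is a plain string rather than a list of content parts.

```typescript
// Hypothetical minimal message shape; field names follow the reviewed snippet.
interface ChatMessage {
  role: string;
  content?: string | null;
  tool_calls?: unknown[];
}

// Heuristic (an assumption): content is "payload-like" when it is empty or
// whitespace, or when it parses as a JSON object/array, i.e. it looks like
// a serialized tool call rather than natural-language narration.
function isPayloadLike(content: string | null | undefined): boolean {
  const text = (content ?? "").trim();
  if (text === "") return true;
  if (!(text.startsWith("{") || text.startsWith("["))) return false;
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Hide content only for assistant messages that carry tool calls AND whose
// content adds nothing beyond the tool-call cards.
function shouldHideContent(message: ChatMessage): boolean {
  const hasToolCalls = (message.tool_calls?.length ?? 0) > 0;
  return (
    message.role === "assistant" &&
    hasToolCalls &&
    isPayloadLike(message.content)
  );
}
```

Narration such as "Let me look that up." next to a tool call would stay visible under this rule, while an empty or JSON-only body would still be collapsed into the tool-call card.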
Summary
- Add a token debug view to expanded ep logs rows for full_episode and token_turn_traces
- Hide raw assistant content when tool_calls are present
Validation
- npm run build
Screenshots
Kimi K2.5 VL
Qwen3 VL
Note
Low Risk
UI-only changes to log rendering and debug visualization with no authentication, persistence, or backend behavior changes; main risk is regressions in message formatting or expanded-row layout.
Overview
Adds a new TokenDebugView panel to expanded evaluation/rollout rows (when extra.full_episode or extra.token_turn_traces are present) to visualize token IDs, masking, and logprobs across turns.
Updates the expanded row chat transcript to be more prompt-faithful by injecting tool declarations into the first system message, and tweaks MessageBubble rendering to hide raw assistant content when tool_calls exist while still showing/copying tool-call arguments.
Also removes the committed built CSS asset (vite-app/dist/assets/index-*.css), indicating build artifacts are no longer tracked/updated in this PR.
Written by Cursor Bugbot for commit 086a977.