Summary
When a session is resumed (or trimmed mid-session) after a long autonomous run with no operator input in the recent tail, getResumeMessages can return a slice whose first message is a tool (tool_result). Anthropic rejects every subsequent submit on that session with:
messages.0.content.0: unexpected `tool_use_id` found in `tool_result` blocks: <id>.
Each `tool_result` block must have a corresponding `tool_use` block in the previous message.
The error is sticky: each new submit fails identically, and the rollback path re-persists the broken slice to messages.json, so the session is effectively bricked until the file is manually repaired.
Reproduction (deterministic)
- Start a session in operator/auto mode and let the agent run autonomously past ~200 model messages without typing any operator input.
- Close and reopen the session (or trigger any code path that re-runs getResumeMessages on the persisted history, e.g. the post-step sync at src/tui/components/operator-dashboard/index.tsx:1314).
- Type any prompt.
- The request to the model fails with the orphaned tool_result error above. Every retry fails identically with the same tool_use_id.
Root cause
src/core/session/index.ts — getResumeMessages:
if (messages.length <= limit) return messages;
let cutIndex = messages.length - limit;
while (cutIndex < messages.length) {
  if (messages[cutIndex].role === "user") break;
  cutIndex++;
}
if (cutIndex >= messages.length) {
  cutIndex = messages.length - limit; // raw fallback: can land on a `tool` message
}
return messages.slice(cutIndex);
- The walk only searches forward for a user boundary.
- When the most recent limit messages contain no user role (common in long autonomous runs), the fallback does a raw cut at messages.length - limit.
- That index can land on a tool message, putting an orphaned tool-result at result[0]. The matching assistant tool-call has been trimmed off the front.
- The AI SDK converts a leading tool role to an Anthropic user message containing a tool_result block with no preceding tool_use, which is exactly the error condition.
normalizeMessages does not repair this; it only merges consecutive user messages and upgrades raw-string output fields to { type: "text", value: ... }. It does not enforce the tool_use/tool_result pairing invariant.
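The fallback can be reproduced in isolation. The sketch below copies the trimming logic quoted above into a standalone function; the name getResumeMessagesBuggy, the explicit limit parameter, and the minimal Msg type are assumptions made for illustration (the real message type comes from the AI SDK).

```typescript
// Minimal stand-in for the message shape (assumption, not the project's real type).
type Role = "user" | "assistant" | "tool";
interface Msg { role: Role; content: unknown }

// Copy of the trimming logic from the issue, with `limit` passed in explicitly.
function getResumeMessagesBuggy(messages: Msg[], limit: number): Msg[] {
  if (messages.length <= limit) return messages;
  let cutIndex = messages.length - limit;
  while (cutIndex < messages.length) {
    if (messages[cutIndex].role === "user") break;
    cutIndex++;
  }
  if (cutIndex >= messages.length) {
    cutIndex = messages.length - limit; // raw fallback
  }
  return messages.slice(cutIndex);
}

// One operator prompt, then an autonomous tail of tool-call / tool-result pairs.
const history: Msg[] = [{ role: "user", content: "start" }];
for (let i = 0; i < 4; i++) {
  history.push({ role: "assistant", content: [{ type: "tool-call", toolCallId: `call_${i}` }] });
  history.push({ role: "tool", content: [{ type: "tool-result", toolCallId: `call_${i}` }] });
}

// With limit = 5 the raw cut lands on index 4, which is a tool message.
const resumed = getResumeMessagesBuggy(history, 5);
console.log(resumed[0].role); // "tool"
```

The assistant message carrying call_1 sits at index 3, just outside the slice, so the tool-result at the head has nothing to pair with.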
Why the broken state is sticky
Two paths re-persist the broken slice on disk:
- User submit: src/tui/components/operator-dashboard/index.tsx:817-826 writes [...conversationRef.current, { role: "user", content: prompt }] to messages.json. If conversationRef.current[0] is a tool, the orphan is at the head of the persisted file.
- Error rollback: src/tui/components/operator-dashboard/index.tsx:1342-1354. When the API rejects the request, the catch block rolls conversationRef.current back to prevMessages and writes that to disk, i.e. the orphan-headed state without the new user message. Subsequent submits append a user at the end, but the orphan at the head persists.
Existing test gap
src/core/session/persistence.test.ts ("handles conversations with no user messages after cut point") only asserts result.length === 5. It never asserts that result[0] is a safe role, so the regression slipped past existing coverage.
Suggested fix shape
In getResumeMessages, after picking cutIndex, advance past any leading tool messages and any leading assistant message that begins with a tool-call whose paired tool-result was trimmed. The chosen slice must start with either a user message or an assistant whose content has no orphan tool-call parts.
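One possible shape for that cut selection is sketched below. It is not the project's implementation: the names isSafeHead and getResumeMessagesSafe, the explicit limit parameter, and the simplified Msg and Part types are all hypothetical. It keeps the existing preference for a user boundary and only changes the fallback, which now walks forward to the first head that satisfies the invariant above.

```typescript
// Simplified message shape (assumption); only the fields used here are modelled.
type Role = "user" | "assistant" | "tool";
interface Part { type: string; toolCallId?: string }
interface Msg { role: Role; content: string | Part[] }

// Tool-call IDs carried by parts of the given type in one message.
function idsOf(msg: Msg, partType: string): string[] {
  if (typeof msg.content === "string") return [];
  return msg.content
    .filter((p) => p.type === partType && p.toolCallId !== undefined)
    .map((p) => p.toolCallId as string);
}

// A head is safe if it is a user message, or an assistant message whose
// tool-calls are all answered by the tool messages that directly follow it.
function isSafeHead(messages: Msg[], start: number): boolean {
  const head = messages[start];
  if (head.role === "user") return true;
  if (head.role === "tool") return false;
  const answered = new Set<string>();
  for (let i = start + 1; i < messages.length && messages[i].role === "tool"; i++) {
    for (const id of idsOf(messages[i], "tool-result")) answered.add(id);
  }
  return idsOf(head, "tool-call").every((id) => answered.has(id));
}

function getResumeMessagesSafe(messages: Msg[], limit: number): Msg[] {
  if (messages.length <= limit) return messages;
  const rawCut = messages.length - limit;
  // Prefer a user boundary, as the current code does.
  let cutIndex = messages.findIndex((m, i) => i >= rawCut && m.role === "user");
  if (cutIndex === -1) {
    // Fallback: advance from the raw cut until the head is safe.
    cutIndex = rawCut;
    while (cutIndex < messages.length && !isSafeHead(messages, cutIndex)) cutIndex++;
  }
  return messages.slice(cutIndex);
}

// Autonomous tail with no recent user input.
const tail: Msg[] = [{ role: "user", content: "start" }];
for (let i = 0; i < 4; i++) {
  tail.push({ role: "assistant", content: [{ type: "tool-call", toolCallId: `call_${i}` }] });
  tail.push({ role: "tool", content: [{ type: "tool-result", toolCallId: `call_${i}` }] });
}
const safeSlice = getResumeMessagesSafe(tail, 5);
```

In this sketch the slice may come back shorter than limit (here 4 messages instead of 5), and it is empty if no safe head exists after the raw cut; whether that trade-off is acceptable is a design decision for the actual fix.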
Hardening (defense in depth): have normalizeMessages strip leading orphaned tool messages and leading tool-call-only assistant messages, so any caller that constructs a conversation prefix (not just resume) gets the same invariant for free.
Also worth adding: an explicit test that asserts the slice is API-valid (no orphaned tool_use/tool_result at head) for the all-tool-and-assistant tail case.
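For the test, a small pairing check could back the assertion. The helper below is a hypothetical sketch (orphanedToolResultIds and the simplified types are not from the codebase): it reports every tool-result whose ID was not issued as a tool-call by the most recent assistant message before it.

```typescript
// Simplified message shape (assumption); only the fields used here are modelled.
type Role = "user" | "assistant" | "tool";
interface Part { type: string; toolCallId?: string }
interface Msg { role: Role; content: string | Part[] }

// Returns the IDs of tool-results that have no matching tool-call in the
// assistant message immediately preceding their run of tool messages.
function orphanedToolResultIds(messages: Msg[]): string[] {
  const orphans: string[] = [];
  let openCalls = new Set<string>();
  for (const msg of messages) {
    const parts = typeof msg.content === "string" ? [] : msg.content;
    if (msg.role === "assistant") {
      // A new assistant turn replaces the set of calls that may be answered.
      openCalls = new Set(
        parts
          .filter((p) => p.type === "tool-call" && p.toolCallId !== undefined)
          .map((p) => p.toolCallId as string),
      );
    } else if (msg.role === "tool") {
      for (const p of parts) {
        if (p.type === "tool-result" && p.toolCallId !== undefined && !openCalls.has(p.toolCallId)) {
          orphans.push(p.toolCallId);
        }
      }
    } else {
      // A user message closes any pending tool exchange.
      openCalls = new Set();
    }
  }
  return orphans;
}

const brokenSlice: Msg[] = [
  { role: "tool", content: [{ type: "tool-result", toolCallId: "call_1" }] },
  { role: "assistant", content: [{ type: "tool-call", toolCallId: "call_2" }] },
  { role: "tool", content: [{ type: "tool-result", toolCallId: "call_2" }] },
];
const validSlice = brokenSlice.slice(1);
```

A regression test for the all-tool-and-assistant tail case could then assert that the helper returns an empty array for the slice returned by getResumeMessages, in addition to the existing length check.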
Workaround for affected sessions
The session is recoverable: edit ~/.pensar/sessions/<sessionId>/messages.json so the array starts at the first assistant message with no tool-call content parts (or the first user/safe-assistant further in). Deleting only messages[0] is not enough — the next message is typically also an orphan.
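The manual edit can be expressed as a small pure function. This is a sketch under the assumption that messages.json holds a JSON array of AI SDK-style messages; trimToSafeHead, hasToolCall, and the simplified types are hypothetical names introduced here. It applies the simpler rule from the workaround: start at the first user message or the first assistant message with no tool-call parts.

```typescript
// Simplified message shape (assumption); only the fields used here are modelled.
type Role = "user" | "assistant" | "tool";
interface Part { type: string; toolCallId?: string }
interface Msg { role: Role; content: string | Part[] }

function hasToolCall(msg: Msg): boolean {
  return typeof msg.content !== "string" && msg.content.some((p) => p.type === "tool-call");
}

// Drops leading messages until the array starts with a user message or an
// assistant message that carries no tool-call parts.
function trimToSafeHead(messages: Msg[]): Msg[] {
  const start = messages.findIndex(
    (m) => m.role === "user" || (m.role === "assistant" && !hasToolCall(m)),
  );
  return start === -1 ? [] : messages.slice(start);
}

// A bricked history: removing only index 0 would still leave tool traffic at the head.
const bricked: Msg[] = [
  { role: "tool", content: [{ type: "tool-result", toolCallId: "call_1" }] },
  { role: "assistant", content: [{ type: "tool-call", toolCallId: "call_2" }] },
  { role: "tool", content: [{ type: "tool-result", toolCallId: "call_2" }] },
  { role: "assistant", content: "Scan finished." },
  { role: "user", content: "continue" },
];
const repaired = trimToSafeHead(bricked);
```

To apply it to an affected session, one could back up messages.json, parse it with JSON.parse, pass the array through trimToSafeHead, and write the result back with JSON.stringify.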
Impact
- Affects any long-running session resumed in auto/autopilot mode without recent operator input.
- Once triggered, the session cannot be used until messages.json is repaired by hand.
- Silent for the operator: the failure mode looks like an unrelated 400 from the model provider.