Message history state machine: invariant by construction #170

Merged
seamus-brady merged 1 commit into main from feature/message-history-state-machine
Apr 26, 2026

Conversation

@seamus-brady
Owner

Summary

The cog has been dying mid-cycle with API 400s of the form messages.40.content.0: unexpected tool_use_id — orphan tool_result blocks whose matching tool_use was lost. Each historical fix patched one new shape; the next code path introduced another. This PR ends the family at the root.

Root cause: state.messages was a public List(Message). ~30 handlers across the cog list.append-ed directly. Provider-API invariants (alternation, leading-user, tool_use ↔ tool_result pairing) lived only in a reactive sweep at the LLM boundary that covered one direction of one rule. New mutations kept introducing new violations.

Structural fix: opaque MessageHistory with one chokepoint (add) that maintains every invariant by construction. Direct list.append on state.messages is now impossible — the type prevents it.
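
A minimal sketch of the chokepoint idea, assuming simplified stand-in types: the `Role` and `Message` definitions below, the `newest_first` field, and merging by concatenating content are all illustrative assumptions, not the contents of src/llm/message_history.gleam. It covers only the leading-user and alternation rules.

```gleam
import gleam/list

// Hypothetical, simplified stand-ins for the project's real message types.
pub type Role {
  User
  Assistant
}

pub type Message {
  Message(role: Role, content: List(String))
}

// Opaque: other modules can hold a MessageHistory but cannot construct one
// or reach the inner list, so every mutation has to go through `add`.
pub opaque type MessageHistory {
  MessageHistory(newest_first: List(Message))
}

pub fn new() -> MessageHistory {
  MessageHistory(newest_first: [])
}

pub fn add(history: MessageHistory, message: Message) -> MessageHistory {
  case history.newest_first {
    [] ->
      case message.role {
        // Leading message must be User: a leading Assistant is dropped.
        Assistant -> history
        User -> MessageHistory(newest_first: [message])
      }
    [last, ..rest] ->
      case last.role == message.role {
        // Same role twice in a row: coalesce to keep alternation.
        // (Concatenating content is an assumption of this sketch.)
        True -> {
          let merged =
            Message(
              role: last.role,
              content: list.append(last.content, message.content),
            )
          MessageHistory(newest_first: [merged, ..rest])
        }
        False ->
          MessageHistory(newest_first: [message, ..history.newest_first])
      }
  }
}

// Oldest-first export for callers that need a plain list.
pub fn to_list(history: MessageHistory) -> List(Message) {
  list.reverse(history.newest_first)
}
```

Because the constructor is not exported, code outside this module has no way to append to the inner list, which is the property the PR relies on.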

What's enforced where

| Invariant | Enforced by |
| --- | --- |
| Leading message must be User | add silently drops a leading Assistant; from_list strips one at ingest |
| User/assistant alternation | add coalesces consecutive same-role messages |
| No orphan tool_result | add strips any tool_result whose tool_use_id has no matching tool_use in the prior assistant message; if that empties the message, the message is dropped |
| No orphan tool_use | from_list injects synthetic stub tool_results at ingest (the opposite direction; useful when loading persisted history with a half-completed cycle) |
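
The orphan tool_result rule can be sketched roughly as follows. The `Block` type and the `strip_orphan_results` function are hypothetical names for illustration; the real module's types and signatures are not shown in this PR.

```gleam
import gleam/list

// Hypothetical content-block type; the project's real type may differ.
pub type Block {
  Text(String)
  ToolUse(id: String)
  ToolResult(tool_use_id: String)
}

// Collect the IDs of every tool_use block in the prior assistant message.
fn tool_use_ids(assistant_content: List(Block)) -> List(String) {
  list.filter_map(assistant_content, fn(block) {
    case block {
      ToolUse(id) -> Ok(id)
      _ -> Error(Nil)
    }
  })
}

/// Drop tool_result blocks whose tool_use_id has no matching tool_use in
/// the prior assistant message. Returns Error(Nil) when nothing is left,
/// signalling that the whole user message should be dropped.
pub fn strip_orphan_results(
  prior_assistant: List(Block),
  user_content: List(Block),
) -> Result(List(Block), Nil) {
  let known = tool_use_ids(prior_assistant)
  let kept =
    list.filter(user_content, fn(block) {
      case block {
        ToolResult(id) -> list.contains(known, id)
        _ -> True
      }
    })
  case kept {
    [] -> Error(Nil)
    _ -> Ok(kept)
  }
}
```

Returning a Result here is a design choice of the sketch: it forces the caller to handle the "message emptied" case explicitly rather than appending an empty user message.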

for_send exports a wire-ready List(Message). The reactive boundary sweep (repair_orphans_and_warn + llm/message_repair.gleam) is deleted — there's nothing left for it to repair.

What changed

  • New: src/llm/message_history.gleam (442 lines) — opaque type, add chokepoint, from_list ingest sanitisation, to_list / for_send exports
  • New: test/llm/message_history_test.gleam (260 lines) — 16 tests, one per invariant, each constructing the exact malformation that caused a production bug
  • Modified: src/agent/cognitive_state.gleam, where state.messages is now MessageHistory (was List(Message))
  • Modified: src/agent/cognitive.gleam, src/agent/cognitive/agents.gleam, src/agent/cognitive/llm.gleam, src/agent/cognitive/safety.gleam — every direct list.append(state.messages, ...) rewritten through message_history.add / add_user / add_assistant / add_user_text
  • Deleted: src/llm/message_repair.gleam, test/llm/message_repair_test.gleam — module redundant; its pipeline is intrinsic to MessageHistory.from_list
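
For the opposite direction handled by from_list at ingest (an assistant tool_use with no answering tool_result), one possible shape is sketched below. The `Block` type, the `stub_missing_results` name, and the stub text are assumptions made for illustration; the PR does not show what the synthetic stubs actually contain.

```gleam
import gleam/list

// Hypothetical content-block type; the project's real type may differ.
pub type Block {
  Text(String)
  ToolUse(id: String)
  ToolResult(tool_use_id: String, output: String)
}

/// Given an assistant message's content and the content of the user
/// message that follows it, prepend a stub tool_result for every
/// tool_use that was never answered.
pub fn stub_missing_results(
  assistant_content: List(Block),
  user_content: List(Block),
) -> List(Block) {
  // IDs that already have a tool_result in the following user message.
  let answered =
    list.filter_map(user_content, fn(block) {
      case block {
        ToolResult(tool_use_id: id, ..) -> Ok(id)
        _ -> Error(Nil)
      }
    })
  // One stub per unanswered tool_use (the stub text is an assumption).
  let stubs =
    list.filter_map(assistant_content, fn(block) {
      case block {
        ToolUse(id) ->
          case list.contains(answered, id) {
            True -> Error(Nil)
            False ->
              Ok(ToolResult(tool_use_id: id, output: "[no result recorded]"))
          }
        _ -> Error(Nil)
      }
    })
  list.append(stubs, user_content)
}
```

This only makes sense at ingest, where a half-completed cycle may have been persisted; during a live cycle the tool_result simply has not arrived yet.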

Test plan

  • gleam build clean
  • gleam format clean
  • gleam test: 2155 passing (gained 16 from new tests, lost the deleted message_repair_test cases since the module is gone)
  • add_user_with_orphan_tool_result_strips_it_test reproduces the operator-reported cog-killer
  • Run gleam run against the live agent; confirm no API 400s after a turn that previously dropped the cog

🤖 Generated with Claude Code

The cog has been dying mid-cycle with API 400s like
"messages.40.content.0: unexpected `tool_use_id`" — orphan tool_result
blocks whose matching tool_use was lost. Each fix patches one new
shape; the next code path introduces another.

Root cause: state.messages was a public List(Message) anyone could
list.append to. Provider-API invariants (alternation, leading-user,
tool_use ↔ tool_result pairing) lived only in a reactive sweep at
the LLM boundary that covered one direction of one rule. New
mutations kept introducing new violations.

Structural fix: opaque MessageHistory with one chokepoint (`add`)
that maintains every invariant by construction.

* `add` enforces:
  - leading assistant → silently dropped
  - consecutive same-role → coalesced (alternation invariant)
  - user message containing tool_result → orphans dropped; message
    dropped if it empties
* `from_list` (used at ingest from disk / tests) runs the full
  sanitisation pipeline including the opposite direction
  (synthesise stub tool_results for orphan tool_uses)
* `for_send` returns wire-ready List(Message) — already valid by
  construction, no boundary repair needed
* All ~30 mutation sites across cognitive.gleam, cognitive/agents.gleam,
  cognitive/safety.gleam, cognitive/llm.gleam now go through the typed
  API. Direct list.append on state.messages is impossible.

Removed:
- llm/message_repair.gleam — its repair pipeline is intrinsic to
  MessageHistory.from_list. The reactive sweep at the LLM boundary
  (repair_orphans_and_warn) is gone.

Tests:
- 16 new tests in test/llm/message_history_test.gleam covering each
  invariant with the exact malformations that caused production bugs.
- 2155 passing (gained 16, lost the message_repair_test cases).

Build clean, format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@seamus-brady seamus-brady merged commit abf26ec into main Apr 26, 2026
1 check passed
@seamus-brady seamus-brady deleted the feature/message-history-state-machine branch April 26, 2026 20:04