Add referenced_artifacts and checkpoint tool #167

Merged
seamus-brady merged 1 commit into main from feat/agent-context-bundle-and-checkpoint
Apr 26, 2026
@seamus-brady
Owner

Summary

PR 2 of the sub-agent-resilience plan. Implements Fixes 3 and 4: a parent → child structural-context handoff via referenced_artifacts, and a lighter checkpoint tool with skill discipline updates, so that agents save work in chunks during synthesis instead of trying to fit everything into one final response.

Together these address the redundant-bootstrap and lost-work-on-cap patterns seen in the 2026-04-26 Nemo session, where 13 researcher delegations each re-discovered the same 309-section book and one writer delegation produced zero words after blowing its token cap.

What changed

Fix 3 — referenced_artifacts on agent_* tool calls

  • New parameter on every agent_* tool: referenced_artifacts (comma-separated artifact IDs).
  • Framework intercepts in dispatch_single_agent (src/agent/cognitive/agents.gleam): retrieves each artifact's content via the librarian, renders as <reference_artifact id="...">CONTENT</reference_artifact> blocks, prepends to the agent's first message.
  • Resolution failures render status="not_found" markers; artifacts that would push the bundle over the 50KB cap render status="elided" reason="bundle_size" markers. The agent always sees what was attempted, and nothing is silently dropped.
  • New helpers parse_referenced_artifacts_csv and render_referenced_artifacts_bundle (both pub for testability).
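
The parse-and-render flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Gleam implementation in this PR: the function names parse_csv_ids and render_bundle, the lookup callback standing in for the librarian, and the per-artifact way the cap is applied are all assumptions. It also does no escaping of IDs or content.

```python
import json
from typing import Callable, Optional

# Assumed cap, mirroring the 50,000-char figure in this PR (name is illustrative).
BUNDLE_CAP_CHARS = 50_000


def parse_csv_ids(tool_input_json: str) -> list[str]:
    """Extract artifact IDs from a tool call's JSON input; [] on any problem."""
    try:
        args = json.loads(tool_input_json)
    except json.JSONDecodeError:
        return []  # invalid JSON falls back to "no references"
    raw = args.get("referenced_artifacts", "") if isinstance(args, dict) else ""
    if not isinstance(raw, str):
        return []
    # Trim whitespace so "a, b, c" is tolerated; drop empty entries.
    return [part.strip() for part in raw.split(",") if part.strip()]


def render_bundle(
    ids: list[str],
    lookup: Callable[[str], Optional[str]],  # hypothetical stand-in for the librarian
    cap: int = BUNDLE_CAP_CHARS,
) -> str:
    """Render one block per ID, in order, marking missing or elided artifacts."""
    if not ids:
        return ""
    blocks = []
    used = 0
    for artifact_id in ids:
        content = lookup(artifact_id)
        if content is None:
            blocks.append(f'<reference_artifact id="{artifact_id}" status="not_found"/>')
        elif used + len(content) > cap:
            blocks.append(
                f'<reference_artifact id="{artifact_id}" status="elided" reason="bundle_size"/>'
            )
        else:
            used += len(content)
            blocks.append(
                f'<reference_artifact id="{artifact_id}">{content}</reference_artifact>'
            )
    return "<reference_artifacts>\n" + "\n".join(blocks) + "\n</reference_artifacts>"
```

In this sketch every requested ID produces exactly one block, so the child agent can always tell which references were attempted even when some could not be included.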

Fix 4 — checkpoint tool + skill discipline

  • New checkpoint(label, content) tool in src/tools/artifacts.gleam. Lighter than store_result — auto-fills tool="checkpoint", uses label as the summary, otherwise behaves identically.
  • Wired into writer + researcher routing predicates and executors. Both agents can now call it.
  • Updated .springdrift_example/skills/document-library/SKILL.md with explicit guidance on the checkpointing pattern (save sections as you produce them, don't assemble the whole final output in one response) and the reconnaissance-then-followups pattern using referenced_artifacts.
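
As a rough illustration of the "lighter than store_result" relationship, checkpoint can be thought of as a thin wrapper that fills in the tool and summary fields. The Python below is a sketch under assumptions: the in-memory ArtifactStore, the record fields other than tool, the ID format, and the error type are all hypothetical and do not reflect src/tools/artifacts.gleam.

```python
import itertools
from dataclasses import dataclass


@dataclass
class ArtifactRecord:
    # Only the `tool` field is named in this PR; the others are assumed.
    artifact_id: str
    tool: str
    summary: str
    content: str


class ArtifactStore:
    """Hypothetical in-memory store used only for this illustration."""

    def __init__(self) -> None:
        self.records: dict[str, ArtifactRecord] = {}
        self._counter = itertools.count(1)

    def store_result(self, tool: str, summary: str, content: str) -> str:
        artifact_id = f"art-{next(self._counter)}"  # ID format is made up
        self.records[artifact_id] = ArtifactRecord(artifact_id, tool, summary, content)
        return artifact_id

    def checkpoint(self, label: str, content: str) -> str:
        # Reject missing arguments, as the checkpoint dispatch tests describe.
        if not label:
            raise ValueError("checkpoint requires a label")
        if not content:
            raise ValueError("checkpoint requires content")
        # Auto-fill tool="checkpoint" and reuse the label as the summary.
        return self.store_result(tool="checkpoint", summary=label, content=content)
```

An agent following the checkpointing pattern would call checkpoint once per finished section and keep only the returned artifact IDs in its working context.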

Test plan

  • gleam build clean, no warnings
  • gleam test — 2074 passed (15 new), no failures
  • Pure tests on parse_referenced_artifacts_csv (3) — empty / present / invalid-json fallbacks
  • Bundle rendering tests (6) — empty CSV, no librarian, valid artifact resolves with content, missing artifact gets not_found marker, multiple artifacts render in order, whitespace in CSV is trimmed
  • Checkpoint dispatch tests (4) — happy path returns artifact_id, label becomes summary, missing label rejected, missing content rejected
  • Routing tests (2) — writer.routes_tool("checkpoint") and researcher.routes_tool("checkpoint") both True (regression guard for the dispatch-routing bug class fixed in PR #163, "Fix three agent-tool dispatch bugs of the same family")
  • Operator: dispatch a synthesis task with reconnaissance pattern (agent_researcher returns an artifact_id in its summary, then dispatch parallel followups with referenced_artifacts: <that_id>) and confirm the followups receive the content as a <reference_artifact> block in their first message
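
For the operator check above, the arguments for each follow-up dispatch might be assembled as in this Python sketch. Only the referenced_artifacts parameter and its comma-separated string form come from this PR; the helper name followup_args and the instruction field are hypothetical.

```python
import json


def followup_args(instruction: str, artifact_ids: list[str]) -> str:
    """Build JSON arguments for a follow-up agent_* call (hypothetical shape)."""
    cleaned = [artifact_id.strip() for artifact_id in artifact_ids if artifact_id.strip()]
    for artifact_id in cleaned:
        if "," in artifact_id:
            # A comma inside an ID would be misread as a separator.
            raise ValueError(f"artifact ID contains a comma: {artifact_id!r}")
    return json.dumps(
        {
            "instruction": instruction,  # assumed field name
            # A single comma-separated string, not a JSON array.
            "referenced_artifacts": ",".join(cleaned),
        }
    )
```

Each parallel follow-up would carry the same reconnaissance artifact ID, so every child starts from the shared outline.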

Notes for review

  • referenced_artifacts is a string parameter (comma-separated), not a JSON array. Gleam's tool builder doesn't have an array param helper yet, and string-with-CSV parsing is robust enough for this use. The description on the tool param explicitly tells the LLM to comma-separate.
  • The 50KB bundle cap is a constant in agents.gleam (referenced_artifacts_bundle_cap_chars). If operators routinely need bigger bundles, consider lifting to config; for now it matches the existing artifact-record-truncation cap of the same size.
  • The existing artifact_id (singular) param remains unchanged: it embeds just the ID as a hint, and the agent must call retrieve_result itself. referenced_artifacts is the new auto-prepend-content path. Both can coexist.
  • The tool field on the ArtifactRecord is the discriminator for checkpoints: downstream filtering (tool == "checkpoint") can find them specifically. No new schema field needed.
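
The discriminator note can be illustrated with a small filter. This is a Python sketch, not project code: the record shape is assumed apart from the tool field, and only_checkpoints is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRecord:
    # Only `tool` is confirmed by this PR; the other fields are assumed.
    artifact_id: str
    tool: str
    summary: str


def only_checkpoints(records: list[ArtifactRecord]) -> list[ArtifactRecord]:
    """Keep records whose tool field marks them as checkpoints, preserving order."""
    return [record for record in records if record.tool == "checkpoint"]
```

Because the filter keys on an existing field, records written by store_result and by checkpoint can share one schema.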

What's next

PR 3 — Fix 5: codify Nemo's emergent strategies as orchestration skills (orchestration-large-inputs/SKILL.md + when-to-use-writer/SKILL.md) and add Strategy Registry seeding so fresh instances boot with the floor strategies populated.

🤖 Generated with Claude Code

…ience)

Implements Fixes 3 and 4 of the sub-agent-resilience plan
(docs/roadmap/planned/sub-agent-resilience.md).

Fix 3 — referenced_artifacts on agent_* tool calls
- New `referenced_artifacts` parameter on every agent_* tool. Takes a
  comma-separated list of artifact IDs.
- Framework intercepts on dispatch_single_agent: each ID is resolved
  via the librarian, content prepended to the agent's first message
  as `<reference_artifact id="...">CONTENT</reference_artifact>`
  blocks wrapped in `<reference_artifacts>...</reference_artifacts>`.
- Eliminates the redundant-bootstrap pattern observed in 2026-04-26
  Nemo session: 13 researchers each independently re-discovered the
  same 309-section book. The new flow is one reconnaissance
  delegation → checkpoint the structural outline → dispatch
  downstream agents with that artifact_id via referenced_artifacts.
  Each child sees the structure immediately without calling
  retrieve_result.
- Differs from existing `artifact_id` param: that param embeds just
  the ID as a hint; the agent must call retrieve_result itself.
  referenced_artifacts auto-renders the CONTENT.
- Resolution failures render `<reference_artifact id="X"
  status="not_found"/>` markers rather than silently dropping —
  agent sees what was attempted.
- Bundle size capped at 50,000 chars; over-cap artifacts render
  `status="elided" reason="bundle_size"` markers.
- Whitespace in the CSV is trimmed so LLMs passing `"a, b, c"` are
  tolerated.

Fix 4 — checkpoint tool + skill discipline
- New `checkpoint(label, content)` tool — lighter sibling of
  store_result. Auto-fills tool="checkpoint", uses label as the
  summary, otherwise behaves identically. Lower friction means
  agents are more likely to actually use it during synthesis.
- Wired into writer + researcher routing predicates
  (routes_tool / is_<x>_tool) and into their executors.
- Updated .springdrift_example/skills/document-library/SKILL.md
  with explicit checkpointing pattern + referenced_artifacts
  reconnaissance pattern. Replaces the implicit "agents will figure
  out store_result discipline through CBR over many cycles" with
  explicit guidance.

Tests:
- 9 tests in test/agent/referenced_artifacts_test.gleam covering
  parse extraction (3) and bundle rendering (6), including order
  preservation, missing artifact markers, and whitespace tolerance.
- 6 tests in test/tools/checkpoint_tool_test.gleam covering happy
  path, label-becomes-summary, missing-label rejection,
  missing-content rejection, and routing through writer + researcher.
- 2074 tests passing (15 new on top of 2059 from PR #166).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@seamus-brady seamus-brady merged commit 032415a into main Apr 26, 2026
1 check passed