Add referenced_artifacts and checkpoint tool by seamus-brady · Pull Request #167 · seamus-brady/springdrift

seamus-brady · 2026-04-26T17:09:37Z

Summary

PR 2 of the sub-agent-resilience plan. Implements Fixes 3 + 4: parent → child structural-context handoff via referenced_artifacts, and a lighter checkpoint tool plus skill discipline updates so agents save in chunks during synthesis instead of trying to fit it all in one final response.

Together these address the redundant-bootstrap and lost-work-on-cap patterns seen in the 2026-04-26 Nemo session, where 13 researcher delegations each re-discovered the same 309-section book and one writer delegation produced zero words after blowing its token cap.

What changed

Fix 3 — `referenced_artifacts` on agent_* tool calls

New parameter on every agent_* tool: referenced_artifacts (comma-separated artifact IDs).
Framework intercepts in dispatch_single_agent (src/agent/cognitive/agents.gleam): retrieves each artifact's content via the librarian, renders as <reference_artifact id="...">CONTENT</reference_artifact> blocks, prepends to the agent's first message.
Resolution failures render status="not_found" markers, and artifacts that would push the bundle over the 50KB cap render status="elided" reason="bundle_size" markers, so the agent always sees what was attempted and nothing is silently dropped.
New helpers parse_referenced_artifacts_csv and render_referenced_artifacts_bundle (both pub for testability).
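The parse-and-render flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Gleam code in src/agent/cognitive/agents.gleam: the function names `parse_csv` and `render_bundle`, the `lookup` callback standing in for the librarian, and the way the cap is counted (content characters only, with later small artifacts still allowed in after an elision) are all assumptions. The outer `<reference_artifacts>` wrapper follows the commit message.

```python
from typing import Callable, Optional

# Assumed to mirror the 50KB / 50,000-char bundle cap described in this PR.
BUNDLE_CAP_CHARS = 50_000


def parse_csv(raw: str) -> list[str]:
    """Split a comma-separated ID string, trimming whitespace and dropping empties."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def render_bundle(ids: list[str], lookup: Callable[[str], Optional[str]]) -> str:
    """Render one block per requested ID, in order, never dropping an ID silently.

    `lookup` is a hypothetical stand-in for the librarian: it returns the
    artifact content, or None when the ID cannot be resolved.
    """
    if not ids:
        return ""
    blocks = []
    used = 0  # content characters already included in the bundle
    for artifact_id in ids:
        content = lookup(artifact_id)
        if content is None:
            blocks.append(
                f'<reference_artifact id="{artifact_id}" status="not_found"/>'
            )
        elif used + len(content) > BUNDLE_CAP_CHARS:
            blocks.append(
                f'<reference_artifact id="{artifact_id}" '
                'status="elided" reason="bundle_size"/>'
            )
        else:
            used += len(content)
            blocks.append(
                f'<reference_artifact id="{artifact_id}">{content}</reference_artifact>'
            )
    return "<reference_artifacts>\n" + "\n".join(blocks) + "\n</reference_artifacts>"
```

With a dictionary as the lookup, `render_bundle(parse_csv("a1, missing"), {"a1": "outline"}.get)` yields a content block for `a1` followed by a `not_found` marker for `missing`, so the child agent can tell which references were requested but unavailable.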

Fix 4 — `checkpoint` tool + skill discipline

New checkpoint(label, content) tool in src/tools/artifacts.gleam. Lighter than store_result — auto-fills tool="checkpoint", uses label as the summary, otherwise behaves identically.
Wired into writer + researcher routing predicates and executors. Both agents can now call it.
Updated .springdrift_example/skills/document-library/SKILL.md with explicit guidance on the checkpointing pattern (save sections as you produce them, don't assemble the whole final output in one response) and the reconnaissance-then-followups pattern using referenced_artifacts.

Test plan

gleam build clean, no warnings
gleam test — 2074 passed (15 new), no failures
Pure tests on parse_referenced_artifacts_csv (3) — empty / present / invalid-json fallbacks
Bundle rendering tests (6) — empty CSV, no librarian, valid artifact resolves with content, missing artifact gets not_found marker, multiple artifacts render in order, whitespace in CSV is trimmed
Checkpoint dispatch tests (4) — happy path returns artifact_id, label becomes summary, missing label rejected, missing content rejected
Routing tests (2) — writer.routes_tool("checkpoint") and researcher.routes_tool("checkpoint") both True (regression guard for the dispatch-routing bug class fixed in PR #163, "Fix three agent-tool dispatch bugs of the same family")
Operator: dispatch a synthesis task with reconnaissance pattern (agent_researcher returns an artifact_id in its summary, then dispatch parallel followups with referenced_artifacts: <that_id>) and confirm the followups receive the content as a <reference_artifact> block in their first message

Notes for review

referenced_artifacts is a string parameter (comma-separated), not a JSON array. Gleam's tool builder doesn't have an array param helper yet, and string-with-CSV parsing is robust enough for this use. The description on the tool param explicitly tells the LLM to comma-separate.
The 50KB bundle cap is a constant in agents.gleam (referenced_artifacts_bundle_cap_chars). If operators routinely need bigger bundles, consider lifting to config; for now it matches the existing artifact-record-truncation cap of the same size.
The existing artifact_id (singular) param remains unchanged: it embeds just the ID as a hint, and the agent must call retrieve_result itself to get the content. referenced_artifacts is the new auto-prepend-content path. Both can coexist.
The tool field on the ArtifactRecord is the discriminator for checkpoints: downstream filtering (tool == "checkpoint") can find them specifically. No new schema field needed.
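The discriminator note above amounts to a plain filter on the record's tool field. A minimal Python sketch, assuming a record shape with only the fields needed here (the real ArtifactRecord is defined in the Gleam source and is not reproduced):

```python
from dataclasses import dataclass


@dataclass
class ArtifactRecord:
    # Hypothetical subset of fields, enough to show the filter.
    artifact_id: str
    tool: str
    summary: str


def checkpoints_only(records: list[ArtifactRecord]) -> list[ArtifactRecord]:
    """Select checkpoint artifacts using the existing tool field as discriminator."""
    return [record for record in records if record.tool == "checkpoint"]
```

Because the filter relies on a field every record already carries, artifacts written before this change are simply excluded rather than needing migration.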

What's next

PR 3 — Fix 5: codify Nemo's emergent strategies as orchestration skills (orchestration-large-inputs/SKILL.md + when-to-use-writer/SKILL.md) and add Strategy Registry seeding so fresh instances boot with the floor strategies populated.

🤖 Generated with Claude Code

…ience)

Implements Fixes 3 and 4 of the sub-agent-resilience plan (docs/roadmap/planned/sub-agent-resilience.md).

Fix 3 — referenced_artifacts on agent_* tool calls

- New `referenced_artifacts` parameter on every agent_* tool. Takes a comma-separated list of artifact IDs.
- Framework intercepts on dispatch_single_agent: each ID is resolved via the librarian, content prepended to the agent's first message as `<reference_artifact id="...">CONTENT</reference_artifact>` blocks wrapped in `<reference_artifacts>...</reference_artifacts>`.
- Eliminates the redundant-bootstrap pattern observed in 2026-04-26 Nemo session: 13 researchers each independently re-discovered the same 309-section book. The new flow is one reconnaissance delegation → checkpoint the structural outline → dispatch downstream agents with that artifact_id via referenced_artifacts. Each child sees the structure immediately without calling retrieve_result.
- Differs from existing `artifact_id` param: that param embeds just the ID as a hint; the agent must call retrieve_result itself. referenced_artifacts auto-renders the CONTENT.
- Resolution failures render `<reference_artifact id="X" status="not_found"/>` markers rather than silently dropping — agent sees what was attempted.
- Bundle size capped at 50,000 chars; over-cap artifacts render `status="elided" reason="bundle_size"` markers.
- Whitespace in the CSV is trimmed so LLMs passing `"a, b, c"` are tolerated.

Fix 4 — checkpoint tool + skill discipline

- New `checkpoint(label, content)` tool — lighter sibling of store_result. Auto-fills tool="checkpoint", uses label as the summary, otherwise behaves identically. Lower friction means agents are more likely to actually use it during synthesis.
- Wired into writer + researcher routing predicates (routes_tool / is_<x>_tool) and into their executors.
- Updated .springdrift_example/skills/document-library/SKILL.md with explicit checkpointing pattern + referenced_artifacts reconnaissance pattern. Replaces the implicit "agents will figure out store_result discipline through CBR over many cycles" with explicit guidance.

Tests:

- 9 tests in test/agent/referenced_artifacts_test.gleam covering parse extraction (3), bundle rendering (5), order preservation, missing artifact markers, whitespace tolerance.
- 6 tests in test/tools/checkpoint_tool_test.gleam covering happy path, label-becomes-summary, missing-label rejection, missing-content rejection, and routing through writer + researcher.
- 2074 tests passing (15 new on top of 2059 from PR #166).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

seamus-brady merged commit 032415a into main Apr 26, 2026
1 check passed

seamus-brady mentioned this pull request Apr 26, 2026

Codify Nemo's strategies as skills with Strategy Registry seeding #169

Merged

9 tasks

seamus-brady merged 1 commit into main from feat/agent-context-bundle-and-checkpoint
