Add referenced_artifacts and checkpoint tool #167
Merged
seamus-brady merged 1 commit into main on Apr 26, 2026
…ience)

Implements Fixes 3 and 4 of the sub-agent-resilience plan (docs/roadmap/planned/sub-agent-resilience.md).

Fix 3: referenced_artifacts on agent_* tool calls

- New `referenced_artifacts` parameter on every agent_* tool. Takes a comma-separated list of artifact IDs.
- Framework intercepts on dispatch_single_agent: each ID is resolved via the librarian, content prepended to the agent's first message as `<reference_artifact id="...">CONTENT</reference_artifact>` blocks wrapped in `<reference_artifacts>...</reference_artifacts>`.
- Eliminates the redundant-bootstrap pattern observed in the 2026-04-26 Nemo session: 13 researchers each independently re-discovered the same 309-section book. The new flow is one reconnaissance delegation → checkpoint the structural outline → dispatch downstream agents with that artifact_id via referenced_artifacts. Each child sees the structure immediately without calling retrieve_result.
- Differs from the existing `artifact_id` param: that param embeds just the ID as a hint; the agent must call retrieve_result itself. referenced_artifacts auto-renders the CONTENT.
- Resolution failures render `<reference_artifact id="X" status="not_found"/>` markers rather than silently dropping, so the agent sees what was attempted.
- Bundle size capped at 50,000 chars; over-cap artifacts render `status="elided" reason="bundle_size"` markers.
- Whitespace in the CSV is trimmed so LLMs passing `"a, b, c"` are tolerated.

Fix 4: checkpoint tool + skill discipline

- New `checkpoint(label, content)` tool, a lighter sibling of store_result. Auto-fills tool="checkpoint", uses label as the summary, otherwise behaves identically. Lower friction means agents are more likely to actually use it during synthesis.
- Wired into writer + researcher routing predicates (routes_tool / is_<x>_tool) and into their executors.
- Updated .springdrift_example/skills/document-library/SKILL.md with explicit checkpointing pattern + referenced_artifacts reconnaissance pattern. Replaces the implicit "agents will figure out store_result discipline through CBR over many cycles" with explicit guidance.

Tests:

- 9 tests in test/agent/referenced_artifacts_test.gleam covering parse extraction (3), bundle rendering (5), order preservation, missing artifact markers, whitespace tolerance.
- 6 tests in test/tools/checkpoint_tool_test.gleam covering happy path, label-becomes-summary, missing-label rejection, missing-content rejection, and routing through writer + researcher.
- 2074 tests passing (15 new on top of 2059 from PR #166).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
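The Fix 3 flow described in the commit message (trim the CSV, resolve each ID, render a block or a status marker, stop adding content once the cap is hit) can be sketched roughly as follows. This is a Python illustration, not the Gleam code in `src/agent/cognitive/agents.gleam`. The names `parse_ids_csv`, `render_bundle`, and the `lookup` callable standing in for the librarian are hypothetical. How the real implementation counts characters against the 50,000 cap, and what it returns for an empty ID list, are assumptions here.

```python
from typing import Callable, Optional

# Cap stated in the PR; this constant name is illustrative only.
BUNDLE_CAP_CHARS = 50_000


def parse_ids_csv(raw: str) -> list[str]:
    """Split a comma-separated ID list, trimming whitespace and dropping empties."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def render_bundle(
    ids: list[str],
    lookup: Callable[[str], Optional[str]],  # hypothetical stand-in for the librarian
    cap: int = BUNDLE_CAP_CHARS,
) -> str:
    """Render each ID as a content block or a status marker, preserving input order."""
    blocks = []
    used = 0  # assumption: only artifact content counts toward the cap
    for artifact_id in ids:
        content = lookup(artifact_id)
        if content is None:
            # Resolution failure: emit a marker instead of dropping the ID.
            blocks.append(f'<reference_artifact id="{artifact_id}" status="not_found"/>')
        elif used + len(content) > cap:
            # Over the cap: emit an elided marker so the agent knows it was attempted.
            blocks.append(
                f'<reference_artifact id="{artifact_id}" '
                'status="elided" reason="bundle_size"/>'
            )
        else:
            used += len(content)
            blocks.append(
                f'<reference_artifact id="{artifact_id}">{content}</reference_artifact>'
            )
    if not blocks:
        return ""  # assumption: nothing is prepended when no IDs were passed
    return "<reference_artifacts>\n" + "\n".join(blocks) + "\n</reference_artifacts>"


if __name__ == "__main__":
    store = {"outline-1": "Part I: sections 1-120"}
    print(render_bundle(parse_ids_csv("outline-1, missing-2"), store.get))
```

The sketch does not escape artifact content, so content that itself contains a `</reference_artifact>` tag would need handling in a real implementation.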
Summary
PR 2 of the sub-agent-resilience plan. Implements Fixes 3 + 4: parent → child structural-context handoff via `referenced_artifacts`, and a lighter `checkpoint` tool plus skill discipline updates so agents save in chunks during synthesis instead of trying to fit it all in one final response.

Together these address the redundant-bootstrap and lost-work-on-cap patterns seen in the 2026-04-26 Nemo session, where 13 researcher delegations each re-discovered the same 309-section book and one writer delegation produced zero words after blowing its token cap.
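To make the `checkpoint` half of this concrete, here is a minimal Python sketch of a checkpoint call as a thin wrapper over `store_result`. It is an illustration only, not the Gleam code in the project. The `ArtifactStore` class, the record fields shown, the `store_result` signature, and the ID format are all assumptions. The behaviour taken from this PR is that `tool` is auto-filled with `"checkpoint"`, the label becomes the summary, and a missing label or content is rejected.

```python
from dataclasses import dataclass


@dataclass
class ArtifactRecord:
    # Hypothetical subset of fields; the real ArtifactRecord is not shown in this PR.
    artifact_id: str
    tool: str
    summary: str
    content: str


class ArtifactStore:
    """In-memory stand-in for the artifact store (illustrative only)."""

    def __init__(self) -> None:
        self.records: list[ArtifactRecord] = []

    def store_result(self, tool: str, summary: str, content: str) -> str:
        # Assumed signature: the caller supplies tool and summary explicitly.
        artifact_id = f"art-{len(self.records) + 1}"
        self.records.append(ArtifactRecord(artifact_id, tool, summary, content))
        return artifact_id

    def checkpoint(self, label: str, content: str) -> str:
        # Reject a missing label or missing content.
        if not label.strip():
            raise ValueError("checkpoint requires a label")
        if not content.strip():
            raise ValueError("checkpoint requires content")
        # tool is auto-filled and the label doubles as the summary.
        return self.store_result(tool="checkpoint", summary=label, content=content)

    def checkpoints(self) -> list[ArtifactRecord]:
        # Filter on the existing tool field; no extra schema field is needed.
        return [r for r in self.records if r.tool == "checkpoint"]
```

An agent following the chunked-synthesis pattern would call `checkpoint` once per finished section, so that hitting a token cap loses at most the section in progress.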
What changed
Fix 3: `referenced_artifacts` on agent_* tool calls

- New parameter on every `agent_*` tool: `referenced_artifacts` (comma-separated artifact IDs).
- Framework intercepts in `dispatch_single_agent` (`src/agent/cognitive/agents.gleam`): retrieves each artifact's content via the librarian, renders as `<reference_artifact id="...">CONTENT</reference_artifact>` blocks, prepends to the agent's first message.
- Resolution failures render `status="not_found"` markers; an over-50KB bundle renders `status="elided" reason="bundle_size"` markers. The agent always sees what was attempted; nothing is silently dropped.
- `parse_referenced_artifacts_csv` and `render_referenced_artifacts_bundle` (both pub for testability).

Fix 4: `checkpoint` tool + skill discipline

- New `checkpoint(label, content)` tool in `src/tools/artifacts.gleam`. Lighter than `store_result`: auto-fills `tool="checkpoint"`, uses `label` as the summary, otherwise behaves identically.
- Updated `.springdrift_example/skills/document-library/SKILL.md` with explicit guidance on the checkpointing pattern (save sections as you produce them, don't assemble the whole final output in one response) and the reconnaissance-then-followups pattern using `referenced_artifacts`.

Test plan
- `gleam build` clean, no warnings
- `gleam test`: 2074 passed (15 new), no failures
- `parse_referenced_artifacts_csv` (3): empty / present / invalid-json fallbacks
- `writer.routes_tool("checkpoint")` and `researcher.routes_tool("checkpoint")` both True (regression guard for the dispatch-routing bug class fixed in PR #163, "Fix three agent-tool dispatch bugs of the same family")
- … `agent_researcher` returns an artifact_id in its summary, then dispatch parallel followups with `referenced_artifacts: <that_id>`) and confirm the followups receive the content as a `<reference_artifact>` block in their first message

Notes for review
- `referenced_artifacts` is a string parameter (comma-separated), not a JSON array. Gleam's tool builder doesn't have an array param helper yet, and string-with-CSV parsing is robust enough for this use. The description on the tool param explicitly tells the LLM to comma-separate.
- The bundle cap lives in `agents.gleam` (`referenced_artifacts_bundle_cap_chars`). If operators routinely need bigger bundles, consider lifting to config; for now it matches the existing artifact-record-truncation cap of the same size.
- The `artifact_id` (singular) param remains unchanged: it embeds just the ID as a hint that the agent must `retrieve_result` itself. `referenced_artifacts` is the new auto-prepend-content path. Both can coexist.
- The `checkpoint` tool's `tool` field on the ArtifactRecord is the discriminator: downstream filtering (`tool == "checkpoint"`) can find checkpoints specifically. No new schema field needed.

What's next
PR 3, Fix 5: codify Nemo's emergent strategies as orchestration skills (`orchestration-large-inputs/SKILL.md` + `when-to-use-writer/SKILL.md`) and add Strategy Registry seeding so fresh instances boot with the floor strategies populated.

🤖 Generated with Claude Code