45 changes: 45 additions & 0 deletions .springdrift_example/skills/document-library/SKILL.md
@@ -80,3 +80,48 @@ Reports in progress. Revise across multiple sessions, promote when ready.
Search results include formatted citations:
`[Document Title, §Section Name, lines 145-178]`
Include these in reports so claims are traceable to sources.

## Checkpointing — don't blow your token cap on a synthesis

When producing structured output (multi-section drafts, multi-topic
comparisons, anything over ~500 words), **call `checkpoint(label,
content)` after each major section**. Don't try to assemble the
whole final output in one response — that's how you blow your
token cap and lose all the work.

`checkpoint` is the lightweight sibling of `store_result`: just a
label and content. It writes to the artifact store with sensible
defaults and returns a compact `artifact_id` you can reference
later or hand back to your orchestrator.

**Pattern**:
1. Write section 1 → `checkpoint("draft-section-1-memory", content)` → got `art-abc123`
2. Write section 2 → `checkpoint("draft-section-2-affect", content)` → got `art-def456`
3. Write section 3 → `checkpoint("draft-section-3-safety", content)` → got `art-789xyz`
4. Final response: brief summary + the three artifact IDs. Orchestrator can retrieve any of them with `retrieve_result`.
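
For concreteness, a single checkpoint round-trip looks roughly like this. The label, artifact ID, and character count are illustrative; the input is just the two fields `label` and `content`, and the confirmation line mirrors what the tool returns:

```
call:   checkpoint {"label": "draft-section-1-memory", "content": "<full text of section 1>"}
result: Checkpointed as artifact_id="art-abc123" (4812 chars). Reference this ID to
        retrieve later or pass it back to your orchestrator via referenced_artifacts.
```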

**Anti-pattern**: writing all three sections inline in one final
response, hitting `max_tokens` mid-section 2, and shipping a
half-finished output (or worse, getting caught by the truncation
guard and shipping a [truncation_guard:writer] admission with no
artifacts to recover from). The truncation guard is a safety net,
not a substitute for discipline.

This is especially important for the writer agent (synthesis is its
job) and for the researcher when extracting from large documents.

## Sharing context between agents — referenced_artifacts

When you delegate to a sub-agent that needs the same structural
context another agent already produced (e.g. a section outline,
prior findings, a checkpoint), pass the artifact_ids on the
delegation tool call via `referenced_artifacts: "art-abc123,art-def456"`.
The framework auto-prepends the artifact CONTENT as
`<reference_artifact>` blocks to the child's first message — the
child sees it immediately without calling `retrieve_result`.
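
Concretely, the child's first message starts with a bundle of roughly this shape. The IDs and content here are illustrative; an ID the framework cannot resolve is rendered as a status marker rather than silently dropped:

```
<reference_artifacts>
<reference_artifact id="art-abc123">
...the stored outline content, verbatim...
</reference_artifact>
<reference_artifact id="art-def456" status="not_found"/>
</reference_artifacts>
```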

**Pattern for large document tasks**: one reconnaissance delegation
to map the structure → checkpoint the outline → dispatch N parallel
followups, each carrying the outline ID via `referenced_artifacts`.
This eliminates the redundant-bootstrap cost of every child
re-discovering the same structure.
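
As a sketch, each follow-up delegation in that pattern just adds the outline's artifact ID. The agent tool name and IDs here are hypothetical; `referenced_artifacts` is the real parameter:

```
agent_researcher {
  ...usual delegation fields...,
  "referenced_artifacts": "art-outline-123"
}
```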
122 changes: 122 additions & 0 deletions src/agent/cognitive/agents.gleam
@@ -641,6 +641,21 @@ fn dispatch_single_agent(
Some(task_subject) -> {
let agent_task_id = cycle_log.generate_uuid()
let #(raw_instruction, ctx) = parse_agent_params(call.input_json)
// Auto-prepend referenced_artifacts content as <reference_artifact>
// blocks. Eliminates the redundant-bootstrapping pattern observed
// in 2026-04-26 Nemo session: orchestrator does ONE reconnaissance
// delegation, stores the structural outline, then passes the
// artifact_id via referenced_artifacts to N parallel followups —
// each downstream agent sees the structure immediately without
// calling retrieve_result. Empty when no IDs supplied or no
// librarian available.
let refs_csv = parse_referenced_artifacts_csv(call.input_json)
let refs_bundle =
render_referenced_artifacts_bundle(refs_csv, state.memory.librarian)
let raw_instruction = case refs_bundle {
"" -> raw_instruction
bundle -> bundle <> raw_instruction
}
// L2 failure-context injection: when an agent is re-dispatched
// this cycle after a prior failure (same agent_id), prepend a
// short block describing what failed last time. Blind retries
@@ -1883,6 +1898,113 @@ pub fn parse_agent_params(input_json: String) -> #(String, String) {
}
}

/// Hard cap on the total size of the auto-prepended artifact bundle
/// produced by `referenced_artifacts`. Protects a child agent's
/// context window from being swamped by a parent that supplies many
/// or huge artifact references. When the cap is exceeded, the
/// remaining artifacts are still listed by id with
/// `status="elided"` markers so the agent sees what was attempted.
const referenced_artifacts_bundle_cap_chars: Int = 50_000

/// Resolve a comma-separated `referenced_artifacts` string into a
/// list of `<reference_artifact>` XML blocks containing the actual
/// artifact content, ready to prepend to the agent's instruction.
///
/// Resolution failures (artifact not found, content not on disk) are
/// rendered as `<reference_artifact id="X" status="not_found"/>`
/// blocks rather than silently dropped — the agent sees what the
/// orchestrator tried to pass and can react. Cap-busts are rendered
/// as `<reference_artifact id="X" status="elided" reason="bundle_size"/>`
/// markers so the agent knows the content existed but was withheld.
///
/// Returns "" when the input is empty or there's no librarian — the
/// caller can treat the empty string as "no bundle, dispatch normally."
pub fn render_referenced_artifacts_bundle(
csv: String,
lib: Option(Subject(librarian.LibrarianMessage)),
) -> String {
case csv, lib {
"", _ -> ""
_, None -> ""
csv, Some(librarian_subj) -> {
let ids =
string.split(csv, ",")
|> list.map(string.trim)
|> list.filter(fn(s) { s != "" })
let #(blocks_rev, _final_size) =
list.fold(ids, #([], 0), fn(acc, id) {
let #(blocks, size_so_far) = acc
case size_so_far >= referenced_artifacts_bundle_cap_chars {
True -> {
let elided =
"<reference_artifact id=\""
<> id
<> "\" status=\"elided\" reason=\"bundle_size\"/>"
#([elided, ..blocks], size_so_far)
}
False ->
case librarian.lookup_artifact(librarian_subj, id) {
Error(Nil) -> {
let missing =
"<reference_artifact id=\""
<> id
<> "\" status=\"not_found\"/>"
#([missing, ..blocks], size_so_far)
}
Ok(meta) ->
case
librarian.retrieve_artifact_content(
librarian_subj,
id,
meta.stored_at,
)
{
Error(Nil) -> {
let missing =
"<reference_artifact id=\""
<> id
<> "\" status=\"content_missing\"/>"
#([missing, ..blocks], size_so_far)
}
Ok(content) -> {
let block =
"<reference_artifact id=\""
<> id
<> "\">\n"
<> content
<> "\n</reference_artifact>"
#([block, ..blocks], size_so_far + string.length(content))
}
}
}
}
})
let blocks = list.reverse(blocks_rev)
case blocks {
[] -> ""
_ ->
"<reference_artifacts>\n"
<> string.join(blocks, "\n")
<> "\n</reference_artifacts>\n\n"
}
}
}
}

/// Extract the comma-separated `referenced_artifacts` value from the
/// agent_* tool call's input JSON. Returns "" if the param isn't
/// present.
pub fn parse_referenced_artifacts_csv(input_json: String) -> String {
let decoder = {
use refs <- decode.optional_field("referenced_artifacts", "", decode.string)
decode.success(refs)
}
case json.parse(input_json, decoder) {
Ok(refs) -> refs
Error(_) -> ""
}
}

fn parse_refs_prefix(input_json: String) -> String {
let decoder = {
use artifact_id <- decode.optional_field("artifact_id", "", decode.string)
5 changes: 5 additions & 0 deletions src/agent/types.gleam
@@ -47,6 +47,11 @@ pub fn agent_to_tool(spec: AgentSpec) -> Tool {
"If the instruction continues or refers to prior work stored as an artifact (researcher's large fetched content), pass the artifact ID here so the specialist can read it.",
False,
)
|> tool.add_string_param(
"referenced_artifacts",
"Comma-separated artifact IDs whose CONTENT should be auto-prepended to the agent's first message — the agent sees the material immediately without calling retrieve_result. Use this for reconnaissance-then-followups patterns: parent does one delegation to map a large input, stores the outline as an artifact, then dispatches downstream agents with that ID here so they don't re-bootstrap. Differs from artifact_id (which only embeds the ID as a hint that the agent must retrieve itself). Total content capped at ~50KB to protect the child's context.",
False,
)
|> tool.add_string_param(
"task_id",
"If the instruction targets an existing planner task, pass its ID here.",
3 changes: 2 additions & 1 deletion src/agents/researcher.gleam
@@ -120,6 +120,7 @@ pub fn routes_tool(name: String) -> Bool {
|| name == "jina_reader"
|| name == "store_result"
|| name == "retrieve_result"
|| name == "checkpoint"
|| name == "calculator"
|| name == "get_current_datetime"
|| name == "read_skill"
@@ -262,7 +263,7 @@ fn researcher_executor(
brave_cache_ttl_ms,
)
"jina_reader" -> jina.execute(call)
"store_result" | "retrieve_result" ->
"store_result" | "retrieve_result" | "checkpoint" ->
artifacts.execute(
call,
artifacts_dir,
8 changes: 6 additions & 2 deletions src/agents/writer.gleam
@@ -72,6 +72,7 @@ pub fn routes_tool(name: String) -> Bool {
knowledge_tools.is_knowledge_tool(name)
|| name == "store_result"
|| name == "retrieve_result"
|| name == "checkpoint"
|| name == "calculator"
|| name == "get_current_datetime"
|| name == "read_skill"
@@ -155,15 +156,18 @@ fn writer_executor(
)
False ->
case call.name, lib {
"store_result", Some(l) | "retrieve_result", Some(l) ->
"store_result", Some(l)
| "retrieve_result", Some(l)
| "checkpoint", Some(l)
->
artifacts.execute(
call,
artifacts_dir,
"writer",
l,
max_artifact_chars,
)
"store_result", None | "retrieve_result", None ->
"store_result", None | "retrieve_result", None | "checkpoint", None ->
llm_types.ToolFailure(
tool_use_id: call.id,
error: "Artifact tools unavailable (no librarian)",
104 changes: 103 additions & 1 deletion src/tools/artifacts.gleam
@@ -34,7 +34,7 @@ fn iso_now() -> String
// ---------------------------------------------------------------------------

pub fn all() -> List(llm_types.Tool) {
[store_result_tool(), retrieve_result_tool()]
[store_result_tool(), retrieve_result_tool(), checkpoint_tool()]
}

fn store_result_tool() -> llm_types.Tool {
@@ -59,6 +59,35 @@ fn store_result_tool() -> llm_types.Tool {
|> tool.build()
}

fn checkpoint_tool() -> llm_types.Tool {
tool.new("checkpoint")
|> tool.with_description(
"Save in-progress work as an artifact mid-task. Lighter than"
<> " store_result — auto-fills tool=\"checkpoint\" and uses your"
<> " label as the summary. Use this every major section when"
<> " producing structured output (multi-section drafts,"
<> " comparisons, anything over ~500 words). DO NOT try to assemble"
<> " the whole final output in one response — that's how you blow"
<> " your token cap and lose all the work. Save in chunks, then"
<> " reference them by artifact_id when you respond to your"
<> " orchestrator. Returns a compact artifact_id.",
)
|> tool.add_string_param(
"label",
"Short label for this checkpoint (e.g. \"draft-section-1-memory\")."
<> " Used as the artifact's summary so you can find it later.",
True,
)
|> tool.add_string_param(
"content",
"The work to save. No upper limit on the agent side — content is"
<> " written to disk and a compact ID returned. Truncation only"
<> " kicks in at the storage layer's hard cap.",
True,
)
|> tool.build()
}

fn retrieve_result_tool() -> llm_types.Tool {
tool.new("retrieve_result")
|> tool.with_description(
@@ -88,6 +117,8 @@ pub fn execute(
"store_result" ->
run_store_result(call, artifacts_dir, cycle_id, lib, max_artifact_chars)
"retrieve_result" -> run_retrieve_result(call, lib)
"checkpoint" ->
run_checkpoint(call, artifacts_dir, cycle_id, lib, max_artifact_chars)
_ ->
llm_types.ToolFailure(
tool_use_id: call.id,
Expand Down Expand Up @@ -165,6 +196,77 @@ fn run_store_result(
}
}

/// Lighter sibling of `run_store_result`: takes just `label` and
/// `content`, fills the rest from sensible defaults so the agent
/// doesn't pay token cost on metadata. Same on-disk shape as
/// store_result; the discriminator is `tool: "checkpoint"` in the
/// ArtifactRecord, which downstream lookup tools can filter on.
fn run_checkpoint(
call: llm_types.ToolCall,
artifacts_dir: String,
cycle_id: String,
lib: Subject(LibrarianMessage),
max_artifact_chars: Int,
) -> llm_types.ToolResult {
let decoder = {
use label <- decode.field("label", decode.string)
use content <- decode.field("content", decode.string)
decode.success(#(label, content))
}
case json.parse(call.input_json, decoder) {
Error(_) ->
llm_types.ToolFailure(
tool_use_id: call.id,
error: "Invalid checkpoint input — expected `label` and `content`.",
)
Ok(#(label, content)) -> {
let artifact_id = "art-" <> uuid_v4()
let now = iso_now()
let summary = "checkpoint: " <> label
let record =
ArtifactRecord(
schema_version: 1,
artifact_id:,
cycle_id:,
stored_at: now,
tool: "checkpoint",
url: "",
summary:,
char_count: string.length(content),
truncated: False,
)
artifacts_log.append(artifacts_dir, record, content, max_artifact_chars)
let meta =
ArtifactMeta(
artifact_id:,
cycle_id:,
stored_at: now,
tool: "checkpoint",
url: "",
summary:,
char_count: string.length(content),
truncated: False,
)
librarian.index_artifact(lib, meta)
slog.debug(
"tools/artifacts",
"checkpoint",
"Stored checkpoint " <> artifact_id <> " (" <> label <> ")",
Some(cycle_id),
)
llm_types.ToolSuccess(
tool_use_id: call.id,
content: "Checkpointed as artifact_id=\""
<> artifact_id
<> "\" ("
<> string.inspect(string.length(content))
<> " chars). Reference this ID to retrieve later or pass it"
<> " back to your orchestrator via referenced_artifacts.",
)
}
}
}

fn run_retrieve_result(
call: llm_types.ToolCall,
lib: Subject(LibrarianMessage),