Skip to content

Add tool result budgeting to persist large outputs to disk#705

Open
iamnbutler wants to merge 1 commit intomainfrom
tasks/gh-iamnbutler-tasks-issue-667--a8454b7d
Open

Add tool result budgeting to persist large outputs to disk#705
iamnbutler wants to merge 1 commit intomainfrom
tasks/gh-iamnbutler-tasks-issue-667--a8454b7d

Conversation

@iamnbutler
Copy link
Copy Markdown
Owner

Summary

  • Adds max_result_size field to Tool struct (default 100KB, None to disable budgeting)
  • New tool_result_budget module: when a tool result exceeds its limit, the full output is written to disk and replaced with a byte-count, file path, and 2KB preview
  • Integrated into Session::apply_tool_results() — when output_dir is set, results are automatically budgeted before entering conversation history
  • Tools like Read can opt out via Tool::no_result_budget() to avoid read→file→read loops

Test plan

  • Unit tests for small results (unchanged), large results (persisted + preview), None max (skipped), unknown tools (default budget), multi-byte UTF-8 boundary handling, mixed tool results
  • All 64 existing agent crate tests pass
  • Full workspace builds cleanly (no new warnings)

Closes #667

🤖 Generated with Claude Code

When a tool result exceeds its configured max_result_size (default 100KB),
the full output is written to disk and replaced with a byte-count summary,
file path, and 2KB preview. This prevents context bloat from large outputs
while keeping full content available for debugging.

- Add max_result_size field to Tool (default Some(100_000), None to disable)
- Add tool_result_budget module with budget_tool_results() function
- Integrate budgeting into Session::apply_tool_results() via output_dir
- Add SessionBuilder::output_dir() for configuration
- Add Tool::no_result_budget() and Tool::with_max_result_size() helpers

Closes #667

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR adds tool result budgeting cleanly and completely. All 64 tests pass. The pre-existing clippy error in chain.rs:58 is unrelated to this change.

The implementation is correct: pending_tool_calls is read before being cleared in apply_tool_results, so the call-ID → tool-name mapping is always available when budgeting runs. The UTF-8 char boundary handling is right. The opt-in design (output_dir: None by default) means no behavioral change for existing callers.

Three inline suggestions, all minor:

  • create_dir_all is called per-file instead of once before the loop
  • No cleanup of persisted files when a session ends (may be intentional, but worth documenting)
  • No test covering the is_error = true path

None are blockers. Good to merge once the suggestions are considered.


References:

Reviewed by PR / Review

// Persist full output to disk
let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());
let path = output_dir.join(&filename);
if let Err(e) = std::fs::create_dir_all(output_dir) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SUGGESTION]

Priority: Code Quality

create_dir_all is called once per result rather than once before the loop. It's idempotent so no correctness issue, but for sessions with many large results it's an unnecessary syscall per file.

Consider calling it once before the loop (with early return on failure) and skipping the per-result call:

if let Err(e) = std::fs::create_dir_all(output_dir) {
    tracing::warn!("failed to create tool output dir {}: {}", output_dir.display(), e);
    return;
}

for result in results.iter_mut() {
    // ...no create_dir_all here
}

}

// Persist full output to disk
let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SUGGESTION]

Priority: Code Quality

Output files accumulate in output_dir indefinitely — there's no cleanup when a session ends or a container is reclaimed. For long-running sessions generating many large tool outputs this could fill the directory.

Worth noting whether cleanup is intentional (files useful for post-session debugging) or a TODO. If it's a known gap, a // TODO: clean up on session drop comment here would signal intent.


#[test]
fn test_none_max_size_skips_budgeting() {
let dir = tempfile::tempdir().unwrap();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[SUGGESTION]

Priority: Code Quality

There's no test covering the is_error = true path. While the logic treats error results identically to success results (which is probably correct — a 100KB stack trace is still too big for context), a small test would make that intention explicit and guard against future changes that might want to special-case errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

agent: Tool result budgeting - persist large outputs to disk

1 participant