Add tool result budgeting to persist large outputs to disk#705
iamnbutler wants to merge 1 commit into main
When a tool result exceeds its configured max_result_size (default 100KB), the full output is written to disk and replaced with a byte-count summary, file path, and 2KB preview. This prevents context bloat from large outputs while keeping full content available for debugging.

- Add max_result_size field to Tool (default Some(100_000), None to disable)
- Add tool_result_budget module with budget_tool_results() function
- Integrate budgeting into Session::apply_tool_results() via output_dir
- Add SessionBuilder::output_dir() for configuration
- Add Tool::no_result_budget() and Tool::with_max_result_size() helpers

Closes #667

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
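As a rough illustration of the behavior described above, here is a minimal std-only sketch of budgeting a single result. The names `budget_content`, `preview`, and `PREVIEW_BYTES` are hypothetical and not taken from the PR, the 2KB preview is assumed to mean 2,000 bytes, and the summary wording is invented for the example; the actual budget_tool_results() function may differ in all of these respects.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Assumed preview size; the description says "2KB" without giving an exact value.
const PREVIEW_BYTES: usize = 2_000;

/// Returns the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a UTF-8 character boundary, so slicing never panics.
fn preview(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// Hypothetical helper: if `content` exceeds `max_size`, write it to
/// `out_path` and return a short summary; otherwise return it unchanged.
/// `None` disables budgeting entirely.
fn budget_content(content: &str, max_size: Option<usize>, out_path: &Path) -> io::Result<String> {
    let limit = match max_size {
        Some(limit) if content.len() > limit => limit,
        _ => return Ok(content.to_string()),
    };
    if let Some(parent) = out_path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(out_path, content)?;
    Ok(format!(
        "[tool output was {} bytes, over the {} byte limit; full output saved to {}]\npreview:\n{}",
        content.len(),
        limit,
        out_path.display(),
        preview(content, PREVIEW_BYTES),
    ))
}
```

In this sketch a write failure is returned to the caller; the diff context in the review suggests the PR logs a warning instead, which is a reasonable alternative when budgeting is best effort.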
PR adds tool result budgeting cleanly and completely. All 64 tests pass. The pre-existing clippy error in chain.rs:58 is unrelated to this change.
The implementation is correct: pending_tool_calls is read before being cleared in apply_tool_results, so the call-ID → tool-name mapping is always available when budgeting runs. The UTF-8 char boundary handling is right. The opt-in design (output_dir: None by default) means no behavioral change for existing callers.
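To make the ordering point concrete, here is a small sketch of resolving a per-result size limit from the pending calls before they are cleared. The types and the `resolve_limits` method are hypothetical stand-ins, not the PR's actual Session API; only the field name pending_tool_calls and the 100_000 default come from the PR.

```rust
use std::collections::HashMap;

/// Default limit in bytes, matching the Some(100_000) default in the PR.
const DEFAULT_MAX_RESULT_SIZE: usize = 100_000;

/// Hypothetical record of a tool call awaiting its result.
struct PendingCall {
    id: String,
    tool_name: String,
}

/// Hypothetical tool result, reduced to the field this sketch needs.
struct ToolResult {
    call_id: String,
}

#[derive(Default)]
struct Session {
    pending_tool_calls: Vec<PendingCall>,
    /// Tool name -> max_result_size (None disables budgeting for that tool).
    tool_limits: HashMap<String, Option<usize>>,
}

impl Session {
    /// Resolves the limit for each result, then clears the pending calls.
    /// The lookup must happen first: after `clear()` the call-ID to
    /// tool-name mapping is gone.
    fn resolve_limits(&mut self, results: &[ToolResult]) -> Vec<Option<usize>> {
        let names: HashMap<&str, &str> = self
            .pending_tool_calls
            .iter()
            .map(|c| (c.id.as_str(), c.tool_name.as_str()))
            .collect();
        let limits: Vec<Option<usize>> = results
            .iter()
            .map(|r| {
                names
                    .get(r.call_id.as_str())
                    .and_then(|name| self.tool_limits.get(*name))
                    .copied()
                    // Unknown call IDs or tools fall back to the default budget.
                    .unwrap_or(Some(DEFAULT_MAX_RESULT_SIZE))
            })
            .collect();
        self.pending_tool_calls.clear();
        limits
    }
}
```

Swapping the `clear()` and the lookup would make every result fall through to the default budget, which is the failure mode the ordering avoids.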
Three inline suggestions, all minor:
- create_dir_all is called per-file instead of once before the loop
- No cleanup of persisted files when a session ends (may be intentional, but worth documenting)
- No test covering the is_error = true path
None are blockers. Good to merge once the suggestions are considered.
Reviewed by PR / Review
    // Persist full output to disk
    let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());
    let path = output_dir.join(&filename);
    if let Err(e) = std::fs::create_dir_all(output_dir) {
[SUGGESTION]
Priority: Code Quality
create_dir_all is called once per result rather than once before the loop. It's idempotent so no correctness issue, but for sessions with many large results it's an unnecessary syscall per file.
Consider calling it once before the loop (with early return on failure) and skipping the per-result call:
    if let Err(e) = std::fs::create_dir_all(output_dir) {
        tracing::warn!("failed to create tool output dir {}: {}", output_dir.display(), e);
        return;
    }
    for result in results.iter_mut() {
        // ...no create_dir_all here
    }
    // Persist full output to disk
    let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());
[SUGGESTION]
Priority: Code Quality
Output files accumulate in output_dir indefinitely — there's no cleanup when a session ends or a container is reclaimed. For long-running sessions generating many large tool outputs this could fill the directory.
Worth noting whether cleanup is intentional (files useful for post-session debugging) or a TODO. If it's a known gap, a // TODO: clean up on session drop comment here would signal intent.
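If cleanup does turn out to be wanted, one possible shape is a small tracker that removes persisted files when it is dropped. This is only a sketch under assumptions: `PersistedOutputs` and its `keep` flag are hypothetical names, and the PR does not say where such a tracker would live or whether files should survive the session for debugging.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical tracker for files written by the budgeting step.
struct PersistedOutputs {
    paths: Vec<PathBuf>,
    /// When true, files are left on disk for post-session debugging.
    keep: bool,
}

impl PersistedOutputs {
    fn new(keep: bool) -> Self {
        PersistedOutputs { paths: Vec::new(), keep }
    }

    /// Remembers a file so it can be removed when the tracker is dropped.
    fn record(&mut self, path: &Path) {
        self.paths.push(path.to_path_buf());
    }
}

impl Drop for PersistedOutputs {
    fn drop(&mut self) {
        if self.keep {
            return;
        }
        for path in &self.paths {
            // Best effort: a file that is already gone is not an error here.
            let _ = fs::remove_file(path);
        }
    }
}
```

Holding a value like this inside the session would tie file lifetime to the session's lifetime, while `keep = true` preserves the current behavior.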
    #[test]
    fn test_none_max_size_skips_budgeting() {
        let dir = tempfile::tempdir().unwrap();
[SUGGESTION]
Priority: Code Quality
There's no test covering the is_error = true path. While the logic treats error results identically to success results (which is probably correct — a 100KB stack trace is still too big for context), a small test would make that intention explicit and guard against future changes that might want to special-case errors.
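A test along these lines might look like the sketch below. It is self-contained for illustration only: `ToolResult` and `budget` here are simplified stand-ins that skip the disk write, not the PR's types, and a real test would call budget_tool_results() with a temporary output directory instead.

```rust
/// Simplified stand-in for the PR's tool result type.
#[derive(Debug, Clone, PartialEq)]
struct ToolResult {
    content: String,
    is_error: bool,
}

/// Stand-in for the budgeting step: replaces oversized content with a short
/// marker and deliberately ignores `is_error`, mirroring the behavior the
/// review describes. No file is written in this sketch.
fn budget(result: &mut ToolResult, max_size: Option<usize>) {
    if let Some(limit) = max_size {
        if result.content.len() > limit {
            result.content = format!("[{} bytes omitted]", result.content.len());
        }
    }
}

#[test]
fn test_error_result_is_budgeted_like_success() {
    let big = "x".repeat(200);
    let mut ok = ToolResult { content: big.clone(), is_error: false };
    let mut err = ToolResult { content: big, is_error: true };

    budget(&mut ok, Some(100));
    budget(&mut err, Some(100));

    // Error results are shortened exactly like success results...
    assert_eq!(ok.content, err.content);
    assert!(err.content.len() < 200);
    // ...and the error flag survives budgeting.
    assert!(err.is_error);
}
```

Asserting that `is_error` is preserved matters as much as the size check, since a future refactor that rebuilds the result could silently drop the flag.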
Summary

- Add max_result_size field to Tool struct (default 100KB, None to disable budgeting)
- tool_result_budget module: when a tool result exceeds its limit, the full output is written to disk and replaced with a byte-count, file path, and 2KB preview
- Session::apply_tool_results(): when output_dir is set, results are automatically budgeted before entering conversation history
- Read can opt out via Tool::no_result_budget() to avoid read→file→read loops

Test plan

- None max (skipped), unknown tools (default budget), multi-byte UTF-8 boundary handling, mixed tool results

Closes #667
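For readers unfamiliar with the opt-out helpers, the following sketch shows one plausible shape for them. The method names Tool::with_max_result_size() and Tool::no_result_budget() and the Some(100_000) default come from the PR description; the signatures, the builder style, and the `Tool::new` constructor are assumptions made for this example.

```rust
/// Reduced stand-in for the PR's Tool struct.
#[derive(Debug, Clone)]
struct Tool {
    name: String,
    /// Maximum result size in bytes; None disables budgeting.
    max_result_size: Option<usize>,
}

impl Tool {
    /// Hypothetical constructor applying the default 100_000 byte budget.
    fn new(name: &str) -> Self {
        Tool {
            name: name.to_string(),
            max_result_size: Some(100_000),
        }
    }

    /// Overrides the budget for this tool (assumed builder-style signature).
    fn with_max_result_size(mut self, bytes: usize) -> Self {
        self.max_result_size = Some(bytes);
        self
    }

    /// Disables budgeting, e.g. for a file-reading tool whose output would
    /// otherwise be persisted and then read back again.
    fn no_result_budget(mut self) -> Self {
        self.max_result_size = None;
        self
    }
}
```

Under these assumptions, a read tool would be declared as `Tool::new("read").no_result_budget()`, while a noisy shell tool could tighten its limit with `Tool::new("bash").with_max_result_size(20_000)`.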
🤖 Generated with Claude Code