Add tool result budgeting to persist large outputs to disk by iamnbutler · Pull Request #705 · iamnbutler/tasks

iamnbutler · 2026-03-31T16:59:37Z

Summary

Adds max_result_size field to Tool struct (default 100KB, None to disable budgeting)
New tool_result_budget module: when a tool result exceeds its limit, the full output is written to disk and replaced with a byte-count, file path, and 2KB preview
Integrated into Session::apply_tool_results() — when output_dir is set, results are automatically budgeted before entering conversation history
Tools like Read can opt out via Tool::no_result_budget() to avoid read→file→read loops

Test plan

Unit tests for small results (unchanged), large results (persisted + preview), None max (skipped), unknown tools (default budget), multi-byte UTF-8 boundary handling, mixed tool results
All 64 existing agent crate tests pass
Full workspace builds cleanly (no new warnings)

Closes #667

🤖 Generated with Claude Code

When a tool result exceeds its configured max_result_size (default 100KB), the full output is written to disk and replaced with a byte-count summary, file path, and 2KB preview. This prevents context bloat from large outputs while keeping full content available for debugging. - Add max_result_size field to Tool (default Some(100_000), None to disable) - Add tool_result_budget module with budget_tool_results() function - Integrate budgeting into Session::apply_tool_results() via output_dir - Add SessionBuilder::output_dir() for configuration - Add Tool::no_result_budget() and Tool::with_max_result_size() helpers Closes #667 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions

PR adds tool result budgeting cleanly and completely. All 64 tests pass. The pre-existing clippy error in chain.rs:58 is unrelated to this change.

The implementation is correct: pending_tool_calls is read before being cleared in apply_tool_results, so the call-ID → tool-name mapping is always available when budgeting runs. The UTF-8 char boundary handling is right. The opt-in design (output_dir: None by default) means no behavioral change for existing callers.

Three inline suggestions, all minor:

create_dir_all is called per-file instead of once before the loop
No cleanup of persisted files when a session ends (may be intentional, but worth documenting)
No test covering the is_error = true path

None are blockers. Good to merge once the suggestions are considered.

References:

§23809523685

Reviewed by PR / Review

github-actions · 2026-03-31T17:05:33Z

+        // Persist full output to disk
+        let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());
+        let path = output_dir.join(&filename);
+        if let Err(e) = std::fs::create_dir_all(output_dir) {


[SUGGESTION]

Priority: Code Quality

create_dir_all is called once per result rather than once before the loop. It's idempotent so no correctness issue, but for sessions with many large results it's an unnecessary syscall per file.

Consider calling it once before the loop (with early return on failure) and skipping the per-result call:

if let Err(e) = std::fs::create_dir_all(output_dir) { tracing::warn!("failed to create tool output dir {}: {}", output_dir.display(), e); return; } for result in results.iter_mut() { // ...no create_dir_all here }

github-actions · 2026-03-31T17:05:33Z

+        }
+
+        // Persist full output to disk
+        let filename = format!("tool_output_{}.txt", uuid::Uuid::new_v4());


[SUGGESTION]

Priority: Code Quality

Output files accumulate in output_dir indefinitely — there's no cleanup when a session ends or a container is reclaimed. For long-running sessions generating many large tool outputs this could fill the directory.

Worth noting whether cleanup is intentional (files useful for post-session debugging) or a TODO. If it's a known gap, a // TODO: clean up on session drop comment here would signal intent.

github-actions · 2026-03-31T17:05:33Z

+
+    #[test]
+    fn test_none_max_size_skips_budgeting() {
+        let dir = tempfile::tempdir().unwrap();


[SUGGESTION]

Priority: Code Quality

There's no test covering the is_error = true path. While the logic treats error results identically to success results (which is probably correct — a 100KB stack trace is still too big for context), a small test would make that intention explicit and guard against future changes that might want to special-case errors.

github-actions bot reviewed Mar 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add tool result budgeting to persist large outputs to disk#705

Add tool result budgeting to persist large outputs to disk#705
iamnbutler wants to merge 1 commit intomainfrom
tasks/gh-iamnbutler-tasks-issue-667--a8454b7d

iamnbutler commented Mar 31, 2026

Uh oh!

github-actions bot left a comment

Uh oh!

github-actions bot Mar 31, 2026

Uh oh!

github-actions bot Mar 31, 2026

Uh oh!

github-actions bot Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

iamnbutler commented Mar 31, 2026

Summary

Test plan

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant