fix(ai): preserve Anthropic server tool replay by code-yeongyu · Pull Request #48 · code-yeongyu/senpi

code-yeongyu · 2026-06-17T10:01:37Z

Summary

Fix Anthropic same-model replay for assistant messages that contain provider-native server tool blocks and signed thinking.

The failing session 019ed4f0-2914-701d-9614-dd0f58c0b103 ended with Anthropic rejecting the next request because thinking blocks had been modified. The local replay serializer preserved signed thinking text, but it dropped same-model Anthropic providerNative blocks such as server_tool_use and web_search_tool_result. That changed the protected assistant content sequence before the follow-up tool-result request.

This patch keeps replayable Anthropic provider-native raw blocks only for the same provider/api/model, while continuing to drop non-portable cross-provider native blocks.
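The rule above can be sketched roughly as follows. This is a minimal illustration, not the actual senpi serializer: the `ModelRef` and `ContentBlock` types, the `selectReplayBlocks` helper, the `"anthropic"` provider id, and the example api/model strings are all hypothetical names chosen for the sketch. The allowlist contains only the two block types named in this description; the real list may be longer.

```typescript
// Hypothetical types: the real senpi types and field names are not shown in this PR.
interface ModelRef {
  provider: string;
  api: string;
  model: string;
}

type ContentBlock =
  | { kind: "text"; text: string }
  | { kind: "thinking"; thinking: string; signature: string }
  | { kind: "toolCall"; id: string; name: string }
  | { kind: "providerNative"; origin: ModelRef; raw: { type: string; [key: string]: unknown } };

// Assumed allowlist, limited to the block types named in the PR description.
const REPLAYABLE_ANTHROPIC_NATIVE_TYPES: ReadonlySet<string> = new Set([
  "server_tool_use",
  "web_search_tool_result",
]);

function isSameModel(a: ModelRef, b: ModelRef): boolean {
  return a.provider === b.provider && a.api === b.api && a.model === b.model;
}

// Keeps every portable block, and keeps a provider-native block only when it
// originated from the same Anthropic provider/api/model and is on the allowlist.
// Array.prototype.filter preserves the original block order.
function selectReplayBlocks(blocks: ContentBlock[], target: ModelRef): ContentBlock[] {
  return blocks.filter((block) => {
    if (block.kind !== "providerNative") return true;
    return (
      target.provider === "anthropic" && // assumed provider id
      isSameModel(block.origin, target) &&
      REPLAYABLE_ANTHROPIC_NATIVE_TYPES.has(block.raw.type)
    );
  });
}

// Example with placeholder identifiers.
const sameModel: ModelRef = { provider: "anthropic", api: "example-api", model: "model-a" };
const otherModel: ModelRef = { provider: "other-provider", api: "example-api", model: "model-b" };

const history: ContentBlock[] = [
  { kind: "providerNative", origin: sameModel, raw: { type: "server_tool_use", id: "srv_1" } },
  { kind: "providerNative", origin: sameModel, raw: { type: "web_search_tool_result" } },
  { kind: "thinking", thinking: "signed reasoning", signature: "sig" },
  { kind: "text", text: "answer" },
];

const keptForSameModel = selectReplayBlocks(history, sameModel);
const keptForOtherModel = selectReplayBlocks(history, otherModel);
console.log(keptForSameModel.length, keptForOtherModel.length);
```

With the same model as the replay target, all four blocks survive in order; with a different target, only the thinking and text blocks remain.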

QA Evidence

Evidence directory: local-ignore/qa-evidence/20260617-thinking-session-fix/

- Failing-first: red-anthropic-provider-native.txt showed same-model provider-native server blocks were dropped from the Anthropic replay payload.
- Regression: green-anthropic-provider-native-replay-after-review.txt passed test/anthropic-provider-native-replay.test.ts with the real failure shape: server native blocks + signed thinking + text + tool call + tool result.
- Focused hermetic suite: green-focused-ai-tests-hermetic.txt passed 13 tests with 1 live Anthropic E2E skipped.
- Full check: npm-run-check-after-install.txt passed npm run check after npm install --ignore-scripts hydrated already-declared local @types packages.
- Senpi real CLI QA: senpi-qa-mock-loop-anthropic-self.txt and senpi-qa-mock-loop-anthropic-tool.txt passed the Anthropic mock loop and multi-turn tool loop with zero real provider calls and real auth unchanged.
- Review: ultrawork reviewer approval recorded after the regression test was staged.
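To illustrate what the failing-first and regression evidence checks, here is a small sketch of a sequence comparison. It is not the code in test/anthropic-provider-native-replay.test.ts: `WireBlock` and `firstSequenceDrift` are hypothetical names, and the block order is only an approximation of the failure shape listed above.

```typescript
// Hypothetical minimal block shape; only the `type` discriminator matters here.
interface WireBlock {
  type: string;
}

// Returns the index of the first position where the replayed sequence of block
// types diverges from the original, or -1 when the two sequences match.
function firstSequenceDrift(original: WireBlock[], replayed: WireBlock[]): number {
  const length = Math.max(original.length, replayed.length);
  for (let i = 0; i < length; i++) {
    if (original[i]?.type !== replayed[i]?.type) return i;
  }
  return -1;
}

// Approximate assistant content for the failure shape (order is illustrative).
const originalBlocks: WireBlock[] = [
  { type: "server_tool_use" },
  { type: "web_search_tool_result" },
  { type: "thinking" },
  { type: "text" },
  { type: "tool_use" },
];

// Pre-fix behaviour as described in the summary: native server blocks are dropped.
const lossyReplay = originalBlocks.filter(
  (b) => b.type !== "server_tool_use" && b.type !== "web_search_tool_result",
);
// Post-fix behaviour: the sequence is replayed unchanged.
const faithfulReplay = [...originalBlocks];

console.log(firstSequenceDrift(originalBlocks, lossyReplay)); // 0
console.log(firstSequenceDrift(originalBlocks, faithfulReplay)); // -1
```

A check of this kind fails against the old serializer at index 0, where the first native block is missing, and passes once the sequence is replayed intact.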

Summary by cubic

Fixes Anthropic same-model replay to keep server-native tool blocks around signed thinking so follow-up tool-result requests are accepted. Restores stable multi-turn tool flows and scopes PR530 benchmark suites by package for faster CI.

Bug Fixes
- Preserve a whitelist of Anthropic provider-native blocks when replaying the same provider/api/model (e.g., server_tool_use, web_search_tool_result); continue dropping cross-provider native blocks.
- Added a regression test to verify signed thinking + native server blocks + tool use replay; removed the obsolete test that expected provider-native blocks to be dropped.

Written for commit 856288b. Summary will update on new commits.

code-yeongyu added 2 commits June 17, 2026 19:00

fix(ai): preserve Anthropic server tool replay

a7fabcc

ci: scope PR530 benchmark suites by package

856288b

code-yeongyu merged commit 5f1c892 into main Jun 18, 2026
4 checks passed

code-yeongyu deleted the codex/fix-anthropic-thinking-replay branch June 18, 2026 01:42

code-yeongyu merged 2 commits into main from codex/fix-anthropic-thinking-replay

code-yeongyu commented Jun 17, 2026 • edited by cubic-dev-ai Bot
