refactor(proxy): audit thinking-mode protocol and refactor test suite #33
Merged
Allowlists `user`, `seed`, `n`, and `logit_bias` so the proxy stops log-spamming when Cursor (and any OpenAI-SDK client) sends them on every request. DeepSeek either honors or safely ignores these; the warning was telling users we were silently dropping a stable end-user identifier they probably want forwarded.
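A minimal sketch of the allowlist idea, not the proxy's actual code: the set contents and the `filter_request_fields` name are assumptions for illustration. Keys on the allowlist are forwarded untouched; anything else is dropped with a `WARNING`.

```python
import logging

logger = logging.getLogger("proxy_sketch")

# Hypothetical allowlist; the proxy's real list may contain other fields.
FORWARDED_FIELDS = {
    "model", "messages", "stream", "tools", "temperature",
    "user", "seed", "n", "logit_bias",
}


def filter_request_fields(payload: dict) -> dict:
    """Return a copy of payload holding only allowlisted keys.

    Every dropped key is logged, so a field that clients send on each
    request belongs on the allowlist rather than in the warning path.
    """
    forwarded = {}
    for key, value in payload.items():
        if key in FORWARDED_FIELDS:
            forwarded[key] = value
        else:
            logger.warning("dropping unsupported request field: %s", key)
    return forwarded
```

With this shape, a request carrying `user` and `seed` passes through silently, while `parallel_tool_calls` still triggers one warning.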
yxlao added a commit that referenced this pull request on May 1, 2026
PR #33's test refactor trimmed test_transform.py from 1489 to 321 lines and accidentally dropped five regression tests that locked in PR #28's cross-mode/model context-preservation mechanisms (Pro/Flash family normalization, portable turn-scoped keys, recovery-boundary continuation). The production code is intact, but coverage was gone.

Restore the originals verbatim from commit 5f14da3 as a new CrossModeAndModelTests class:
- test_deepseek_pro_and_flash_share_reasoning_namespace
- test_strict_hit_backfills_portable_cache_for_mode_switch
- test_portable_turn_cache_restores_final_assistant_after_tool_result
- test_portable_turn_cache_isolated_for_reused_tool_call_id
- test_recovered_response_is_recorded_under_pre_recovery_scope

Adjusted only the imports and the namespace/scope helpers to fit the post-PR-#33 layout. All 87 tests pass.
maThiaslI152 added a commit to maThiaslI152/deepseek-cursor-proxy that referenced this pull request on May 6, 2026
Merges 6 upstream commits from yxlao/deepseek-cursor-proxy:
- feat(streaming): add collapsible reasoning display (yxlao#32)
- refactor(proxy): audit thinking-mode protocol and refactor test suite (yxlao#33)
- fix(server): honor missing-reasoning reject mode (yxlao#34)
- fix: prevent recovery cascade and improve Stop-scenario reasoning lookup (yxlao#25)
- feat(config): default reasoning effort to max (yxlao#36)
- refactor(logging): simplified prints and add spinner (yxlao#37)

Combines both branches' changes:
- Concurrent request deduplication (ours) + dual-scope recording (upstream)
- Broad namespace cache keys (ours) + portable tool_name keys (upstream)
- Spinner, build_arg_parser, main() function (upstream)
- Metrics tracking, schema migration, graceful shutdown (ours)

Co-authored-by: Cursor <cursoragent@cursor.com>
TL;DR
This PR audits the proxy against DeepSeek's official thinking-mode tool-call protocol and confirms that the multi-round `reasoning_content` chain is reconstructed and forwarded correctly across tool-call turns. Along the way it tightens a few protocol edges, fixes minor bugs, and consolidates the test suite.

Summary
Tightens the thinking-mode protocol behavior based on a code audit, and refactors the test suite around a single end-to-end protocol harness so each scenario lives in exactly one place.
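The rule the harness enforces can be sketched as a pure check over the outgoing `messages` array. This is an illustration only: the `find_missing_reasoning` name is hypothetical, and the assumption that only assistant messages carrying `tool_calls` are checked is mine, not taken from the proxy or from DeepSeek's documentation.

```python
def find_missing_reasoning(messages: list[dict]) -> list[int]:
    """Return indexes of assistant tool-call messages without reasoning_content.

    Assumption: only assistant messages that carry tool_calls must echo
    their reasoning_content back upstream; other messages are not checked.
    """
    missing = []
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        if not message.get("tool_calls"):
            continue
        if not message.get("reasoning_content"):
            missing.append(index)
    return missing
```

A strict fake upstream could answer with HTTP 400 whenever this list is non-empty, which is what makes a dropped `reasoning_content` visible in tests instead of passing silently.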
Changes Made
Protocol fixes
- `src/deepseek_cursor_proxy/transform.py`:
  - Added `strip_recovery_notice_for_upstream()` so the proxy-generated recovery notice is no longer echoed back to DeepSeek. The notice still serves as a boundary marker in the with-prefix history (preserved for cache-scope alignment), but is stripped before the request leaves the proxy.
  - Removed the `LEGACY_RECOVERY_NOTICE_TEXT` constant and its `has_recovery_notice()` branch; only the current recovery notice is recognized now.
  - Added `WARNING` logs for unsupported request fields (`parallel_tool_calls`, `service_tier`, etc.) that the proxy silently drops, and for non-DeepSeek model names that get rewritten to the configured upstream model.
  - `rewrite_response_body()` now runs in the order prefix → record reasoning → fold reasoning, so cache scope signatures still match what Cursor will echo back.
- `src/deepseek_cursor_proxy/streaming.py`: Added `fold_reasoning_into_content()` for non-streaming responses, mirroring `reasoning_content` into a `<details><summary>Thinking</summary>...</details>` block (or `<think>` if `collapsible_reasoning=False`). The streaming adapter already did this; non-streaming responses now match.
- `src/deepseek_cursor_proxy/config.py` + `server.py`: Removed the `pass-through` thinking mode. The proxy was the only thing that knew how to rebuild missing `reasoning_content`; pass-through bypassed it and produced 400s upstream.

Test suite refactor
- `tests/test_protocol.py` is the single end-to-end harness. It uses a `StrictFakeDeepSeek` HTTP handler that 400s on missing `reasoning_content` (the same behavior the real DeepSeek API has). Test classes cover: canonical loop, strict-reject mode, thinking-disabled, recovery, streaming-then-non-streaming, streaming/non-streaming display, concurrent threads, and streaming cache timing.
- Removed `tests/test_proxy_end_to_end.py` (1414 lines); its scenarios now live in `test_protocol.py`.
- Trimmed `tests/test_transform.py` from 1489 → 321 lines: kept only pure-helper unit tests (content extraction, request prep, recovery-notice stripping, response rewrite). Behavior that needs an upstream now lives in `test_protocol.py`.
- `tests/test_server.py` carries the HTTP-boundary tests: bearer token forwarding, oversized body rejection, healthz, streaming close-after-`[DONE]`, normal vs verbose logging.
- `tests/test_trace.py` has three integration tests that exercise the trace writer through a running proxy (non-streaming replay, streaming chunks, recovery diagnostics).
- Added `tests/test_streaming.py::FoldReasoningTests` for the new non-streaming fold helper.

Final layout: 80 tests across 9 files (~3500 lines), down from 98 tests across 10 files (~5000 lines).
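For readers unfamiliar with the fold that `FoldReasoningTests` targets, here is a rough sketch of the behavior described in this PR. The `fold_reasoning_sketch` name, its signature, and the exact whitespace inside the block are assumptions; the real `fold_reasoning_into_content()` may differ.

```python
def fold_reasoning_sketch(message: dict, collapsible: bool = True) -> dict:
    """Return a copy of message with reasoning_content mirrored into content.

    The reasoning is wrapped in a collapsible <details> block, or in a
    <think> block when collapsible is False. reasoning_content itself is
    kept, since the PR describes the fold as mirroring, not moving.
    """
    reasoning = message.get("reasoning_content")
    if not reasoning:
        # Nothing to fold; hand back an unchanged copy.
        return dict(message)

    if collapsible:
        block = f"<details><summary>Thinking</summary>\n\n{reasoning}\n\n</details>"
    else:
        block = f"<think>\n{reasoning}\n</think>"

    content = message.get("content") or ""
    folded = dict(message)
    folded["content"] = f"{block}\n\n{content}" if content else block
    return folded
```

Returning a copy rather than mutating in place keeps the original message available for recording reasoning before the fold, in line with the prefix → record reasoning → fold reasoning ordering.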
Repo cleanup
- Removed `docs/audit_report.md` and `docs/thinking-tools.md`: internal audit notes, not user-facing docs.
- Removed `scripts/audit_deepseek_protocol.py`; the mock DeepSeek server is now part of `tests/test_protocol.py`.
- Removed `.claude/settings.local.json` and added `.claude/` to `.gitignore`.
- Removed the `scripts/audit_deepseek_protocol.py` ruff per-file ignore from `pyproject.toml`.

Breaking Changes
- `--thinking pass-through` is no longer supported. Configs that specified `pass-through` will fall back to the default (`enabled`). The mode bypassed the proxy's reasoning-cache patching, which is the proxy's reason for existing.
- The legacy recovery notice (`"Note: recovered this DeepSeek chat..."`) is no longer recognized as a boundary marker. Live conversations that still carry the old prefix in their history will have it treated as ordinary assistant content. New recoveries always emit the current `[deepseek-cursor-proxy] Refreshed reasoning_content history.` prefix.

Test plan
`uv run python -m unittest discover -s tests`: 80 tests pass (1 skipped: live DeepSeek API test, requires real API key)