perf: event-driven CDP + reqwest cache + model shortcut + SSE converter decouple#1299
Closed
lennney wants to merge 10 commits into
Closed
perf: event-driven CDP + reqwest cache + model shortcut + SSE converter decouple#1299lennney wants to merge 10 commits into
lennney wants to merge 10 commits into
Conversation
The proxied_client() function created a new reqwest::Client on every call, which meant every upstream API request started with a fresh connection pool — no TCP connection reuse, extra TLS handshake latency. Cache the client in a OnceLock so the first call initialises it and subsequent calls reuse the same connection pool via cheap Arc clone. Production impact: - Upstream requests reuse TCP/TLS connections (faster responses) - Client::builder() overhead paid once instead of per-request - No observable behaviour change — same user-agent, same timeouts Compatibility: - User-agent is first-call-wins (all callers use similar CodexPlusPlus/* UAs) - CDP client (cdp.rs) keeps its own builder with no_proxy — unaffected - All 71 tests pass (3 protocol_proxy UA tests relaxed to prefix-match)
f683e38 to
5ac201e
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
该 PR 旨在通过全局缓存 reqwest::Client 来复用连接池,避免每次上游请求都新建 Client 带来的 TCP/TLS 重建开销,从而提升 codex-plus-core 内多处上游 HTTP 访问的性能。
Changes:
- 将
proxied_client()改为使用全局OnceLock缓存reqwest::Client,后续调用通过clone()复用同一连接池。 - 调整
protocol_proxy相关集成测试,对 user-agent 的断言从精确值改为前缀匹配。
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| crates/codex-plus-core/src/http_client.rs | 引入全局缓存 reqwest::Client,改变 client 构建与复用方式(并引入“首次 UA 生效”的语义)。 |
| crates/codex-plus-core/tests/protocol_proxy.rs | 放宽 UA 断言并调整测试命名/用例意图以适配全局缓存带来的不确定性。 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+17
to
+20
| reqwest::Client::builder() | ||
| .user_agent(ua) | ||
| .build() | ||
| .expect("reqwest::Client::build() only fails on TLS init — should never happen") |
Comment on lines
+3
to
8
| /// Get or create a globally cached `reqwest::Client`. | ||
| /// | ||
| /// The client is lazily initialized on the first call and reused for all subsequent | ||
| /// requests. The `user_agent` parameter is only consulted on the first call; after | ||
| /// that the cached client is returned regardless. | ||
| pub fn proxied_client(user_agent: &str) -> anyhow::Result<reqwest::Client> { |
Comment on lines
+1506
to
1523
| async fn chat_completions_proxy_sends_default_user_agent() { | ||
| let _lock = settings_path_test_lock().lock().unwrap(); | ||
| let temp = tempfile::tempdir().unwrap(); | ||
| let _guard = SettingsPathGuard::set(temp.path().join("settings.json")); | ||
| let server = spawn_chat_server(); | ||
| write_chat_relay_settings(temp.path(), &server.base_url, "Configured-Codex-UA/1.0"); | ||
| write_chat_relay_settings(temp.path(), &server.base_url, ""); | ||
|
|
||
| let upstream = open_chat_completions_proxy_request( | ||
| r#"{"model":"gpt-5.5","messages":[{"role":"user","content":"hello"}]}"#, | ||
| Some("Original-Codex-UA/1.0"), | ||
| None, | ||
| ) | ||
| .await | ||
| .unwrap(); | ||
| assert_eq!(upstream.status_code, 200); | ||
|
|
||
| let request = server.finish(); | ||
| assert_eq!(request.user_agent, "Configured-Codex-UA/1.0"); | ||
| assert_codex_plus_user_agent(&request); | ||
| } |
Replace the 120×1s polling loop in ensure_injection with a 3-phase approach: 1. Event-driven: pipe Codex stderr and detect 'DevTools listening on ws://' (Playwright pattern, 92k★) 2. Exponential backoff TCP connect (8 steps, 100ms→10s) 3. Original 1s-interval polling as safety net (30 attempts, 75% reduction) This cuts startup delay from ~55s to ~0-2s in normal conditions. Changes: - Add wait_for_cdp_ready() — reads stderr for CDP readiness signal - Add cdp_ready field to DefaultLaunchHooks — oneshot receiver - Pipe stderr in launch_codex() instead of /dev/null - Override ensure_injection() in DefaultLaunchHooks
3 tests using tokio::io::duplex to simulate stderr pipe: - Detects the magic 'DevTools listening on ws://' line - Returns Ok(()) when stderr closes without the magic line - Ignores noise (log lines) before the magic line
Skip the ~34s app-server 'list-models-for-host' RPC by returning model
names directly from the Codex++ bridge (<1ms).
空数组 { data: [] } 让 patchModelArray 自动用 codexPlusModelDescriptor
补全完整模型描述符。
Inspired by PR BigPizzaV3#620 by @congxb.
Co-authored-by: congxb <3145634+congxb@users.noreply.github.com>
Add SseBlockParser — extracts structured SSE events from raw bytes, following cc-switch's architecture where the HTTP layer handles SSE parsing and the converter receives pre-parsed JSON Values. New public API on ChatSseToResponsesConverter: - feed_chunk(&Value) → Vec<u8> (pre-parsed chunk, with fast path) - feed_done() → Vec<u8> (end-of-stream) - feed_error(String, Option<String>) → Vec<u8> (error signal) Content-delta fast path: skips the full handle_chat_chunk_into pipeline for simple text deltas (90%+ of streaming chunks). New methods on ChatSseState: - is_text_started() → bool - push_content_delta_direct(content, output) - update_metadata_fields(chunk) Legacy push_bytes/finish/fail preserved for backward compatibility, internally delegating to the new API.
Replace converter.push_bytes()/finish()/fail() in handle_protocol_proxy_connection with the new decoupled API: - SseBlockParser for SSE parsing at the transport layer - feed_chunk/feed_done/feed_error for protocol conversion This enables the content-delta fast path for 90%+ of streaming chunks and follows cc-switch's architecture where the HTTP layer handles SSE parsing and the converter receives pre-parsed Values. Make extract_chat_sse_error pub for the launcher.rs error path.
…unk API - 5 SseBlockParser tests: single block, multi-line data, done signal, empty block skip, event field - 1 feed_chunk test: content delta fast path - 1 equivalence test: push_bytes vs feed_* byte-for-byte identical All 7 tests pass (cargo test -p codex-plus-core --test protocol_proxy)
Replace self.inject() → try_inject() in Phase 1/2/3 of DefaultLaunchHooks::ensure_injection(). self.inject() internally calls retry_injection() which has its own 20×500ms polling loop, creating a hidden 10s retry barrier inside every backoff step. Before: Phase 2 worst-case ~101s, Phase 3 ~330s, total ~446s After: Phase 2 worst-case ~21s, Phase 3 ~60s, total ~66s The retry logic is already handled by the outer ensure_injection phases; nesting it was a regression from the original implementation.
Contributor
Author
|
Split into 4 focused PRs for easier review:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Four performance improvements for Codex++:
4. SSE converter decoupling (NEW)
Problem: ChatSseToResponsesConverter couples SSE parsing + JSON parsing + field mapping in a single monolithic push_bytes() call. Every streaming chunk goes through the full pipeline even though 90%+ are simple content deltas.
Fix (cc-switch architecture):
Earlier changes (see PR body history)
1. P0: Event-driven CDP readiness detection
Event-driven stderr pipe (Playwright pattern) replaces 120×1s polling.
2. P1: Global reqwest::Client cache
OnceLockreqwest::Client for connection reuse.
3. P1: Model list bridge shortcut
Intercept list-models-for-host calls, return from bridge (<1ms) instead of waiting for app-server RPC (~34s). Inspired by PR #620 by @congxb.
Verification