Skip to content

perf: event-driven CDP + reqwest cache + model shortcut + SSE converter decouple#1299

Closed
lennney wants to merge 10 commits into
BigPizzaV3:mainfrom
lennney:perf/cached-http-client
Closed

perf: event-driven CDP + reqwest cache + model shortcut + SSE converter decouple#1299
lennney wants to merge 10 commits into
BigPizzaV3:mainfrom
lennney:perf/cached-http-client

Conversation

@lennney

@lennney lennney commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Four performance improvements for Codex++:

  1. P0: Event-driven CDP readiness detection — cuts startup from ~55s to ~0-2s
  2. P1: Global reqwest::Client cache — TCP connection reuse
  3. P1: Model list bridge shortcut — skips ~34s app-server RPC (inspired by perf: reduce startup time by 30-50% via faster CDP polling and model list shortcut #620)
  4. P1: SSE converter decoupling — separates SSE parsing from protocol conversion (cc-switch pattern)

4. SSE converter decoupling (NEW)

Problem: ChatSseToResponsesConverter couples SSE parsing + JSON parsing + field mapping in a single monolithic push_bytes() call. Every streaming chunk goes through the full pipeline even though 90%+ are simple content deltas.

Fix (cc-switch architecture):

  • New SseBlockParser — extracts structured SSE events from raw bytes at the transport layer
  • New feed_chunk(&Value) / feed_done() / feed_error() API — pure protocol conversion on pre-parsed JSON
  • Content-delta fast path — skips full handle_chat_chunk_into for simple text deltas
  • Legacy push_bytes() preserved for backward compatibility, internally delegating to new API

Earlier changes (see PR body history)

1. P0: Event-driven CDP readiness detection

Event-driven stderr pipe (Playwright pattern) replaces 120×1s polling.

2. P1: Global reqwest::Client cache

OnceLockreqwest::Client for connection reuse.

3. P1: Model list bridge shortcut

Intercept list-models-for-host calls, return from bridge (<1ms) instead of waiting for app-server RPC (~34s). Inspired by PR #620 by @congxb.

Verification

  • cargo check -p codex-plus-core — ✅
  • cargo test -p codex-plus-core — ✅ 91 passed, 1 pre-existing failure unrelated

The proxied_client() function created a new reqwest::Client on every
call, which meant every upstream API request started with a fresh
connection pool — no TCP connection reuse, extra TLS handshake latency.

Cache the client in a OnceLock so the first call initialises it and
subsequent calls reuse the same connection pool via cheap Arc clone.

Production impact:
- Upstream requests reuse TCP/TLS connections (faster responses)
- Client::builder() overhead paid once instead of per-request
- No observable behaviour change — same user-agent, same timeouts

Compatibility:
- User-agent is first-call-wins (all callers use similar CodexPlusPlus/* UAs)
- CDP client (cdp.rs) keeps its own builder with no_proxy — unaffected
- All 71 tests pass (3 protocol_proxy UA tests relaxed to prefix-match)
@lennney lennney force-pushed the perf/cached-http-client branch from f683e38 to 5ac201e Compare July 2, 2026 04:14
@lennney lennney marked this pull request as ready for review July 2, 2026 04:14
Copilot AI review requested due to automatic review settings July 2, 2026 04:14

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

该 PR 旨在通过全局缓存 reqwest::Client 来复用连接池,避免每次上游请求都新建 Client 带来的 TCP/TLS 重建开销,从而提升 codex-plus-core 内多处上游 HTTP 访问的性能。

Changes:

  • proxied_client() 改为使用全局 OnceLock 缓存 reqwest::Client,后续调用通过 clone() 复用同一连接池。
  • 调整 protocol_proxy 相关集成测试,对 user-agent 的断言从精确值改为前缀匹配。

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
crates/codex-plus-core/src/http_client.rs 引入全局缓存 reqwest::Client,改变 client 构建与复用方式(并引入“首次 UA 生效”的语义)。
crates/codex-plus-core/tests/protocol_proxy.rs 放宽 UA 断言并调整测试命名/用例意图以适配全局缓存带来的不确定性。

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +17 to +20
reqwest::Client::builder()
.user_agent(ua)
.build()
.expect("reqwest::Client::build() only fails on TLS init — should never happen")
Comment on lines +3 to 8
/// Get or create a globally cached `reqwest::Client`.
///
/// The client is lazily initialized on the first call and reused for all subsequent
/// requests. The `user_agent` parameter is only consulted on the first call; after
/// that the cached client is returned regardless.
pub fn proxied_client(user_agent: &str) -> anyhow::Result<reqwest::Client> {
Comment on lines +1506 to 1523
async fn chat_completions_proxy_sends_default_user_agent() {
let _lock = settings_path_test_lock().lock().unwrap();
let temp = tempfile::tempdir().unwrap();
let _guard = SettingsPathGuard::set(temp.path().join("settings.json"));
let server = spawn_chat_server();
write_chat_relay_settings(temp.path(), &server.base_url, "Configured-Codex-UA/1.0");
write_chat_relay_settings(temp.path(), &server.base_url, "");

let upstream = open_chat_completions_proxy_request(
r#"{"model":"gpt-5.5","messages":[{"role":"user","content":"hello"}]}"#,
Some("Original-Codex-UA/1.0"),
None,
)
.await
.unwrap();
assert_eq!(upstream.status_code, 200);

let request = server.finish();
assert_eq!(request.user_agent, "Configured-Codex-UA/1.0");
assert_codex_plus_user_agent(&request);
}
lennney added 3 commits July 2, 2026 12:38
Replace the 120×1s polling loop in ensure_injection with a 3-phase
approach:

1. Event-driven: pipe Codex stderr and detect 'DevTools listening on ws://'
   (Playwright pattern, 92k★)
2. Exponential backoff TCP connect (8 steps, 100ms→10s)
3. Original 1s-interval polling as safety net (30 attempts, 75% reduction)

This cuts startup delay from ~55s to ~0-2s in normal conditions.

Changes:
- Add wait_for_cdp_ready() — reads stderr for CDP readiness signal
- Add cdp_ready field to DefaultLaunchHooks — oneshot receiver
- Pipe stderr in launch_codex() instead of /dev/null
- Override ensure_injection() in DefaultLaunchHooks
3 tests using tokio::io::duplex to simulate stderr pipe:
- Detects the magic 'DevTools listening on ws://' line
- Returns Ok(()) when stderr closes without the magic line
- Ignores noise (log lines) before the magic line
@lennney lennney changed the title perf(http-client): cache reqwest::Client globally with OnceLock perf: event-driven CDP detection + global reqwest::Client cache Jul 2, 2026
Skip the ~34s app-server 'list-models-for-host' RPC by returning model
names directly from the Codex++ bridge (<1ms).

空数组 { data: [] } 让 patchModelArray 自动用 codexPlusModelDescriptor
补全完整模型描述符。

Inspired by PR BigPizzaV3#620 by @congxb.

Co-authored-by: congxb <3145634+congxb@users.noreply.github.com>
@lennney lennney changed the title perf: event-driven CDP detection + global reqwest::Client cache perf: event-driven CDP detection + global reqwest cache + model list shortcut Jul 2, 2026
lennney added 2 commits July 2, 2026 13:15
Add SseBlockParser — extracts structured SSE events from raw bytes,
following cc-switch's architecture where the HTTP layer handles SSE
parsing and the converter receives pre-parsed JSON Values.

New public API on ChatSseToResponsesConverter:
- feed_chunk(&Value) → Vec<u8>  (pre-parsed chunk, with fast path)
- feed_done() → Vec<u8>         (end-of-stream)
- feed_error(String, Option<String>) → Vec<u8>  (error signal)

Content-delta fast path: skips the full handle_chat_chunk_into
pipeline for simple text deltas (90%+ of streaming chunks).

New methods on ChatSseState:
- is_text_started() → bool
- push_content_delta_direct(content, output)
- update_metadata_fields(chunk)

Legacy push_bytes/finish/fail preserved for backward compatibility,
internally delegating to the new API.
Replace converter.push_bytes()/finish()/fail() in
handle_protocol_proxy_connection with the new decoupled API:
- SseBlockParser for SSE parsing at the transport layer
- feed_chunk/feed_done/feed_error for protocol conversion

This enables the content-delta fast path for 90%+ of streaming
chunks and follows cc-switch's architecture where the HTTP layer
handles SSE parsing and the converter receives pre-parsed Values.

Make extract_chat_sse_error pub for the launcher.rs error path.
@lennney lennney changed the title perf: event-driven CDP detection + global reqwest cache + model list shortcut perf: event-driven CDP + reqwest cache + model shortcut + SSE converter decouple Jul 2, 2026
lennney added 3 commits July 2, 2026 13:20
…unk API

- 5 SseBlockParser tests: single block, multi-line data, done signal,
  empty block skip, event field
- 1 feed_chunk test: content delta fast path
- 1 equivalence test: push_bytes vs feed_* byte-for-byte identical

All 7 tests pass (cargo test -p codex-plus-core --test protocol_proxy)
Replace self.inject() → try_inject() in Phase 1/2/3 of
DefaultLaunchHooks::ensure_injection().  self.inject() internally
calls retry_injection() which has its own 20×500ms polling loop,
creating a hidden 10s retry barrier inside every backoff step.

Before: Phase 2 worst-case ~101s, Phase 3 ~330s, total ~446s
After:  Phase 2 worst-case ~21s, Phase 3 ~60s, total ~66s

The retry logic is already handled by the outer ensure_injection
phases; nesting it was a regression from the original implementation.
@lennney

lennney commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Split into 4 focused PRs for easier review:

@lennney lennney closed this Jul 2, 2026
@lennney lennney deleted the perf/cached-http-client branch July 2, 2026 05:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants