Fix CLI timeout enforcement, web-fetch cooldown persistence, vertex test assertions #16

Merged
quangdang46 merged 2 commits into main from devin/1778594042-fix-bugs
May 12, 2026

@quangdang46 quangdang46 commented May 12, 2026

Summary

Three latent bugs were sitting in the Rust backend behind "unused variable: ..." compile warnings. None of them caused a compile or test failure, so CI was green, but each silently broke a real feature. Fixed all three and added regression tests.

1. src/server/api/cli_tools.rs: timeout_secs never enforced

POST /api/cli-tools/execute and POST /api/cli-tools/run/:tool both accept a timeout_secs field, capped at 120s. The handlers computed timeout_secs but then called tokio::process::Command::output().await directly, with no timeout wrapper. A slow or hung child process would block the request indefinitely, and any response that did come back reported timed_out: false regardless of how long the command ran.

Pulled child execution into a new run_command_with_timeout helper that:

  • spawns with kill_on_drop(true) so the child is killed when the timed-out future (and the Child handle it owns) is dropped,
  • wraps wait_with_output() in tokio::time::timeout(Duration::from_secs(timeout_secs), …),
  • on expiry returns success: false, exit_code: None, stderr: "Command timed out after Ns", timed_out: true.

Added 4 tokio tests: happy path, stdout capture, hard timeout (verifies the kill happens within seconds for a sleep 30 child), and the missing-binary error path.
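
For readers who want to see the timeout contract in isolation, here is a minimal std-only sketch of the same idea. It is not the code from this PR: the real helper is async and uses tokio::time::timeout with kill_on_drop, whereas this version polls try_wait against a deadline and kills the child explicitly. The names CommandOutcome and run_command_with_deadline are hypothetical; the result fields mirror the ones listed above.

```rust
use std::io::Read;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical result shape, mirroring the fields named in the description.
#[derive(Debug)]
pub struct CommandOutcome {
    pub success: bool,
    pub exit_code: Option<i32>,
    pub stdout: String,
    pub stderr: String,
    pub timed_out: bool,
}

/// Blocking stand-in for the async helper: run `program`, give up after
/// `timeout_secs`, and kill the child so nothing is left behind.
/// Caveat: output is only read after exit, so this suits small outputs;
/// a child that fills the pipe buffer would stall until the deadline.
pub fn run_command_with_deadline(
    program: &str,
    args: &[&str],
    timeout_secs: u64,
) -> std::io::Result<CommandOutcome> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?; // a missing binary surfaces here as Err

    let deadline = Instant::now() + Duration::from_secs(timeout_secs);
    loop {
        if let Some(status) = child.try_wait()? {
            // Child exited in time: collect whatever it wrote.
            let mut out = Vec::new();
            let mut err = Vec::new();
            if let Some(mut pipe) = child.stdout.take() {
                pipe.read_to_end(&mut out)?;
            }
            if let Some(mut pipe) = child.stderr.take() {
                pipe.read_to_end(&mut err)?;
            }
            return Ok(CommandOutcome {
                success: status.success(),
                exit_code: status.code(),
                stdout: String::from_utf8_lossy(&out).into_owned(),
                stderr: String::from_utf8_lossy(&err).into_owned(),
                timed_out: false,
            });
        }
        if Instant::now() >= deadline {
            // Deadline passed: kill and reap, then report the timeout.
            let _ = child.kill();
            let _ = child.wait();
            return Ok(CommandOutcome {
                success: false,
                exit_code: None,
                stdout: String::new(),
                stderr: format!("Command timed out after {}s", timeout_secs),
                timed_out: true,
            });
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

Calling run_command_with_deadline("sleep", &["30"], 1) on a Unix host should come back after roughly one second with timed_out set, which is the same behavior the hard-timeout test in this PR checks for the async helper.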

2. src/server/api/web_fetch.rs: rate_limited_until never persisted

mark_connection_unavailable is called when a /v1/web/fetch upstream returns a fallback-worthy error (429 / 5xx). It computed an until: DateTime<Utc> from the cooldown duration but never wrote it to c.rate_limited_until. The fallback selectors elsewhere (select_connection, filter_available_accounts, is_account_unavailable; see core/account_fallback/) key off exactly that field to decide whether to skip a cooling-down connection.

Net effect: the per-connection cooldown for web-fetch fallback was a no-op across requests. The excluded HashSet inside do_fetch_with_fallback covered the current request only; the next request would happily re-pick the same dead connection until it died again. Compare to apply_error_state in core/account_fallback/mod.rs:614 which sets rate_limited_until correctly for the chat path — same intent, just missed in this code path.

Fix: persist until.to_rfc3339() into c.rate_limited_until in mark_connection_unavailable, and clear it back to None in clear_connection_error on success.
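
The shape of the bug and the fix can be sketched in a few lines. This is an illustration under assumptions, not the repository's code: the Connection struct and is_cooling_down are hypothetical, the selector is reduced to a single function, and the deadline is stored as Unix seconds to keep the example std-only, whereas the real field holds an RFC 3339 string produced by until.to_rfc3339().

```rust
use std::time::Duration;

/// Hypothetical, trimmed-down provider connection.
#[derive(Debug, Default)]
pub struct Connection {
    pub id: String,
    /// Assumption: Unix seconds here; the real field is an RFC 3339 string.
    pub rate_limited_until: Option<u64>,
}

/// On a fallback-worthy error, compute the cooldown deadline AND store it.
/// Before the fix, `until` was computed but the assignment was missing.
pub fn mark_connection_unavailable(c: &mut Connection, now_unix: u64, cooldown: Duration) {
    let until = now_unix + cooldown.as_secs();
    c.rate_limited_until = Some(until);
}

/// On success, drop the cooldown so the connection is eligible again.
pub fn clear_connection_error(c: &mut Connection) {
    c.rate_limited_until = None;
}

/// Hypothetical availability check keyed off the persisted deadline.
pub fn is_cooling_down(c: &Connection, now_unix: u64) -> bool {
    matches!(c.rate_limited_until, Some(until) if now_unix < until)
}

/// Simplified selector: first connection that is not cooling down.
pub fn select_connection(conns: &[Connection], now_unix: u64) -> Option<&Connection> {
    conns.iter().find(|c| !is_cooling_down(c, now_unix))
}
```

With the assignment in place, a second request made during the cooldown window skips the failed connection. Without it, rate_limited_until stays None and the selector has nothing to key off, which matches the no-op behavior described above.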

3. src/core/executor/vertex.rs — missing test assertions

test_parse_vertex_model_partner destructured (location, project_id, actual_model, is_partner) but only asserted on location and is_partner. test_parse_vertex_model_standard ignored project_id similarly. Added the missing assert_eq!s so the partner branch's actual_model == "glm-5-maas" (no models/ prefix) and the always-empty project_id contract are locked in.
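
As a rough illustration of the pattern, the sketch below destructures all four tuple fields and asserts on each one, so that no binding is left unused. Everything about parse_model_stub is hypothetical: the real parser's name, signature, input format, and partner detection are not shown in this PR, and the "global" location and the "-maas" suffix check are placeholders chosen only to make the example run. Only the tuple order, the "glm-5-maas" value, and the empty project_id come from the description.

```rust
/// Hypothetical stand-in for the parser under test. Returns
/// (location, project_id, actual_model, is_partner).
pub fn parse_model_stub(model: &str) -> (String, String, String, bool) {
    // Assumption for illustration only: partner models end in "-maas".
    let is_partner = model.ends_with("-maas");
    let actual_model = if is_partner {
        model.to_string() // partner branch: no "models/" prefix
    } else {
        format!("models/{model}")
    };
    // project_id is always empty, per the contract described above.
    ("global".to_string(), String::new(), actual_model, is_partner)
}

/// Assert on every destructured field, not just a subset.
pub fn check_partner_contract() {
    let (location, project_id, actual_model, is_partner) = parse_model_stub("glm-5-maas");
    assert_eq!(location, "global");
    assert!(is_partner);
    assert_eq!(actual_model, "glm-5-maas");
    assert_eq!(project_id, "");
}
```

If a test binds a field and never asserts on it, rustc emits an unused-variable warning for that binding, which is how the gaps in the two vertex tests were noticed.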

Review & Testing Checklist for Human

  • cargo test --lib --tests — 54 test binaries, 0 failures locally; CI should be the same.
  • Exercise /api/cli-tools/execute from the dashboard with a long-running command (e.g. sleep 60, timeout_secs: 2) and confirm the response returns within ~2s with timed_out: true and that no sleep process is left behind on the host (pgrep sleep).
  • Exercise /v1/web/fetch against a provider that 429s, then immediately send a second request and confirm it skips that connection (look for rate_limited_until in db.json's provider_connections row, and the dashboard's provider card showing 'cooling down').
  • Sanity check that the Astro dashboard still builds (cd web && npm run build).

Notes

The compile warnings around unused variable: timeout_secs, unused variable: until, unused variable: project_id, and unused variable: actual_model are now gone, which is the simplest sniff-test that the bugs are gone too. A few unrelated unused-variable warnings remain (p, tool_name_map, model_str, plan, provider, headers, req for StartMitmRequest) — I reviewed each and they're stale parameters / leftover refactors, not behavior-affecting. Happy to clean those up in a follow-up if you want.

The CI workflow at .github/workflows/rust-ci.yml runs cargo fmt --all --check and cargo test --lib --tests. Both pass locally on Rust 1.95.0.

…est assertions

Three latent bugs surfaced by 'unused variable' warnings:

1. server/api/cli_tools.rs: execute_command and run_tool computed
   timeout_secs but never enforced it, so /api/cli-tools/execute and
   /api/cli-tools/run/:name could hang the request indefinitely on a
   slow child process. Pulled command execution into
   run_command_with_timeout that spawns with kill_on_drop, uses
   tokio::time::timeout, and reports timed_out=true on expiry.

2. server/api/web_fetch.rs: mark_connection_unavailable computed an
   'until' timestamp from the cooldown but never wrote it back to
   the provider connection. As a result the per-connection cooldown
   was effectively a no-op across requests: select_connection /
   filter_available_accounts use rate_limited_until to skip
   cooling-down accounts, but it was always None. Now persist
   rate_limited_until on failure and clear it in
   clear_connection_error on success.

3. core/executor/vertex.rs: test_parse_vertex_model_standard and
   test_parse_vertex_model_partner destructured project_id and (for
   partner) actual_model but never asserted on them. Added the
   missing assertions so the parser contract is locked in.

Plus four tokio tests for run_command_with_timeout covering happy
path, stdout capture, hard timeout (kill within seconds), and the
missing-binary error path.
@quangdang46 quangdang46 merged commit dda3d50 into main May 12, 2026
3 checks passed
@quangdang46 quangdang46 deleted the devin/1778594042-fix-bugs branch May 12, 2026 14:01