fix: bound install-scan timeout + sweep/release correlation slots #16
Merged
Self-review hardening of the governance hot path before 0.1.0.

#1 install-scan timeout: /install-scan is a gating call like /evaluate but was unbounded, so a hung proxy would stall the install hook indefinitely (fail-closed but never prompt). Factor a `bounded()` helper that runs a gating call under an AbortController + evaluateTimeoutMs, and apply it to both evaluate and installScan.

#2 correlation slot leak / false ambiguity-block:
- Add a periodic sweep (at most once per TTL) on reserve(): lazy per-key eviction never reclaims a per-call-unique toolCallId key whose after_tool_call never arrives, so such slots leaked. Sweeping on the growth path bounds the map without a background timer. Expose a `size` getter (telemetry + leak guard).
- Release the reservation when an approval resolves to deny/timeout/cancelled: the tool won't run, so no after_tool_call claims the slot. Freeing it prevents the leak and stops a stale slot from wrongly ambiguity-blocking the next call on a runId/no-ID key.
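The sweep and release behaviour in #2 can be pictured with a minimal sketch. This is not the PR's code: `CorrelationRegistry`, `Slot`, `claim()`, `release()`, the injectable clock, and the `ApprovalDecision` type are names assumed for illustration. Only `reserve()`, the `size` getter, `onResolution`, and the decision values come from the description above.

```typescript
// Hypothetical sketch of a TTL-bounded correlation registry.
// Class, field, and helper names are assumptions, not the PR's identifiers.
interface Slot {
  reservedAt: number;
}

class CorrelationRegistry {
  private readonly slots = new Map<string, Slot>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private lastSweepAt = 0;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Reserve a slot; returns false when a live slot already holds the key. */
  reserve(key: string): boolean {
    const t = this.now();
    this.sweepIfDue(t); // reserve() is the only growth path, so sweep here
    const existing = this.slots.get(key);
    if (existing !== undefined && t - existing.reservedAt < this.ttlMs) {
      return false; // ambiguous: an unexpired reservation is still pending
    }
    this.slots.set(key, { reservedAt: t });
    return true;
  }

  /** Called from after_tool_call: consume the slot if it exists. */
  claim(key: string): boolean {
    return this.slots.delete(key);
  }

  /** Free a slot whose tool call will never run. */
  release(key: string): void {
    this.slots.delete(key);
  }

  /** Current slot count, for telemetry and leak guards. */
  get size(): number {
    return this.slots.size;
  }

  /** Evict every expired slot, at most once per TTL. */
  private sweepIfDue(t: number): void {
    if (t - this.lastSweepAt < this.ttlMs) return;
    this.lastSweepAt = t;
    this.slots.forEach((slot, key) => {
      if (t - slot.reservedAt >= this.ttlMs) this.slots.delete(key);
    });
  }
}

type ApprovalDecision =
  | "allow-once"
  | "allow-always"
  | "deny"
  | "timeout"
  | "cancelled";

// On deny/timeout/cancelled the tool never runs, so nothing will claim the slot.
function onResolution(
  registry: CorrelationRegistry,
  key: string,
  decision: ApprovalDecision,
): void {
  if (decision === "deny" || decision === "timeout" || decision === "cancelled") {
    registry.release(key);
  }
}
```

Sweeping inside `reserve()` means no timer has to be started or torn down. The trade-off is that expired slots linger until the next reservation, which is acceptable because the map cannot grow without one.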
Two hardening fixes from the pre-`0.1.0` self-review, both on the governance hot path. TDD throughout.

#1: `install-scan` had no timeout

`/install-scan` is a gating call like `/evaluate`, but it was unbounded while `evaluate` had an `AbortController` + `evaluateTimeoutMs`. A hung proxy would stall the install hook indefinitely: fail-closed (never proceeds) but not prompt, contradicting the spirit of the bounded-`/evaluate` non-negotiable.

Factored out a `bounded()` helper that runs a gating call under an `AbortController` + `evaluateTimeoutMs` (covers fetch and body read), and applied it to both `evaluate` and `installScan`.

#2: correlation slot leak + false ambiguity-block

- Lazy per-key eviction never reclaimed a per-call-unique `toolCallId` key whose `after_tool_call` never arrives → slow unbounded memory growth. Added a sweep (at most once per TTL) on `reserve()`, the only growth path, so memory stays bounded without a background timer. Exposed a `size` getter (telemetry + leak guard).
- When an approval resolves to `deny`/`timeout`/`cancelled`, the tool doesn't run, so no `after_tool_call` claims the slot. Beyond the leak, on a `runId`/no-ID key this stale slot would wrongly ambiguity-block the next legitimate call until TTL. The slot is now released in `onResolution` for those decisions; `allow-once`/`allow-always` keep the slot (the tool runs and `after_tool_call` claims it). Both directions are covered by tests.

Verification

`pnpm verify` green: 196 tests (was 185; +11), 0 type errors, build ✓. New tests: install-scan timeout fail-closed; registry sweep evicts never-claimed unique-key slots; approval deny/timeout/cancelled release the slot while allow-once/allow-always keep it.

Self-review also raised minor nits (`AdapterConfig` readonly, `baseUrl` scheme, `mapSession` fallback docs); these are deferred and not in this PR.
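For readers who want a concrete picture of the #1 fix, here is a minimal sketch of what a `bounded()` helper could look like. It is an illustration under assumptions, not the PR's implementation: the `GatingCall` and `Decision` types, the `installScanFailClosed` wrapper, and the denial reason string are hypothetical. Only `bounded`, `AbortController`, and `evaluateTimeoutMs` are named in the description.

```typescript
// Hypothetical sketch: a gating call receives an AbortSignal and must pass it
// to fetch AND finish reading the body before resolving, so that a single
// deadline covers both phases.
type GatingCall<T> = (signal: AbortSignal) => Promise<T>;

async function bounded<T>(
  call: GatingCall<T>,
  evaluateTimeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), evaluateTimeoutMs);
  try {
    return await call(controller.signal);
  } finally {
    clearTimeout(timer); // never leave a timer pending after the call settles
  }
}

// Assumed decision shape, for illustration only.
interface Decision {
  allowed: boolean;
  reason: string;
}

// Fail closed: any error, including an abort on timeout, becomes a denial.
async function installScanFailClosed(
  call: GatingCall<Decision>,
  evaluateTimeoutMs: number,
): Promise<Decision> {
  try {
    return await bounded(call, evaluateTimeoutMs);
  } catch {
    return { allowed: false, reason: "install-scan unavailable or timed out" };
  }
}
```

This sketch only bounds calls that honor the signal. A call that ignores it would still hang, so a production helper might also race the call against a timeout promise.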