fix(daemon): keep tier-2 inference off the fast-path SDK round-trip #14

Merged
venkateshamatam merged 1 commit into main from fix/tier-2-fast-path-budget
May 3, 2026
@venkateshamatam
Owner

fix(daemon): keep tier-2 inference off the fast-path SDK round-trip

  • /v1/check no longer awaits the classifier on actions the rules
    already allowed or rejected. The verdict returns in microseconds
    and the score lands in the action_ledger via a fire-and-forget
    backfill once inference completes.
  • The detached classify() call is deferred with setTimeout(fn, 0)
    so any synchronous work the classifier does on its first tick
    (tokenizer warm-up, IPC encode) runs after the response is sent,
    not on the SDK round-trip.
  • A concurrency cap on detached classifier work prevents a runaway
    sidecar from queueing thousands of in-flight inferences. When the
    cap is hit, new scoring is dropped; the verdict still lands.
  • backfillTier2 logs a warning if the row id no longer exists.
    finalize already throws on missing rows; the asymmetry is
    intentional because backfill is best-effort and detached.
  • A rules-allowed action that the classifier scores as malicious
    now logs the disagreement so it becomes review and training
    signal. The classifier remains advisory; no verdict change.
  • Paused-path behavior is unchanged: the modal still gets the
    score in real time, persisted to the pending row before the
    WebSocket fan-out.
  • New backfillTier2(rowId, score) ledger helper and a tier2Cols
    helper that collapses three open-coded copies of the
    nullable-tier2 spread.
  • appendResolved now returns the row id so the fast-path callsite
    can chain backfillTier2 without a second query.
  • New integration tests at src/daemon/check.test.ts drive the Hono
    app via app.fetch and assert that: response latency is independent
    of classifier delay, the backfill lands after the response, the
    paused path still surfaces the score on the modal and in the SDK
    response, a classifier that throws after the response has returned
    doesn't break the request, and a flood of 80 concurrent fast-path
    requests survives.
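
For reviewers who want a mental model of the detached fast path, here is a minimal sketch of the deferral plus concurrency cap described above. It is not the daemon's code: `createDetachedScorer`, `schedule`, the `Classify` and `Backfill` types, and the injectable `defer` parameter are all hypothetical names chosen for illustration. Only `setTimeout(fn, 0)`, the drop-on-cap behavior, and the best-effort error handling come from the description.

```typescript
// Illustrative sketch only; every name here is hypothetical.
type Classify = (action: string) => Promise<number>;
type Backfill = (rowId: number, score: number) => void;
type Defer = (fn: () => void) => void;

interface DetachedScorer {
  // Returns false when the cap is hit and scoring is dropped.
  schedule(rowId: number, action: string): boolean;
  inFlight(): number;
}

function createDetachedScorer(
  classify: Classify,
  backfill: Backfill,
  maxInFlight: number,
  // Default: a zero-delay timer, so classifier work starts on a later
  // turn of the event loop rather than inside the request handler.
  defer: Defer = (fn) => { setTimeout(fn, 0); },
): DetachedScorer {
  let inFlight = 0;
  const release = (): void => { inFlight -= 1; };

  return {
    schedule(rowId: number, action: string): boolean {
      // Cap reached: skip scoring. The rules verdict is unaffected.
      if (inFlight >= maxInFlight) return false;
      inFlight += 1;
      defer(() => {
        // Wrapping in Promise.resolve() also catches a synchronous throw
        // from classify, so a failing classifier cannot surface as an
        // unhandled error after the response has gone out.
        Promise.resolve()
          .then(() => classify(action))
          .then((score) => backfill(rowId, score))
          .catch((err: unknown) => { console.warn("tier-2 scoring failed:", err); })
          .then(release);
      });
      return true;
    },
    inFlight: (): number => inFlight,
  };
}
```

In this sketch the handler would call `schedule(rowId, action)` with the id returned by `appendResolved` and respond immediately, never awaiting the result. Injecting `defer` is only a testing convenience so the cap can be exercised without real timers.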

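The ledger-side asymmetry between backfillTier2 and finalize can be sketched with an in-memory stand-in. This assumes nothing about the real action_ledger schema: `InMemoryLedger`, `LedgerRow`, the `tier2Score` field, and the boolean return value of `backfillTier2` are hypothetical. What it keeps from the description is that `appendResolved` returns the row id, `backfillTier2` warns on a missing row, and `finalize` throws.

```typescript
// Illustrative stand-in for the ledger; field and class names are hypothetical.
type Verdict = "allow" | "reject";

interface LedgerRow {
  verdict: Verdict;
  tier2Score: number | null; // null until a score is backfilled
}

class InMemoryLedger {
  private rows = new Map<number, LedgerRow>();
  private nextId = 1;

  // Returns the new row id so the caller can chain a backfill
  // without issuing a second lookup.
  appendResolved(verdict: Verdict): number {
    const id = this.nextId++;
    this.rows.set(id, { verdict, tier2Score: null });
    return id;
  }

  // Best-effort and detached: a missing row is logged, not thrown.
  backfillTier2(rowId: number, score: number): boolean {
    const row = this.rows.get(rowId);
    if (!row) {
      console.warn(`backfillTier2: row ${rowId} no longer exists; score dropped`);
      return false;
    }
    row.tier2Score = score;
    return true;
  }

  // On the request path: a missing row is a real error, so throw.
  finalize(rowId: number, verdict: Verdict): void {
    const row = this.rows.get(rowId);
    if (!row) throw new Error(`finalize: row ${rowId} not found`);
    row.verdict = verdict;
  }

  get(rowId: number): LedgerRow | undefined {
    return this.rows.get(rowId);
  }
}
```

The design point is that a detached backfill has no caller left to handle an exception, so logging is the only useful failure mode there, while finalize still has a caller that can react.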
@venkateshamatam venkateshamatam merged commit 67cf781 into main May 3, 2026
2 checks passed
