Skip to content

fix(limits): pre-flight fail-fast — abort BEFORE the tool call, not after (paper §4.2) #153

@manzil-infinity180

Description

@manzil-infinity180

Problem

Paper §4.2 (Resource Limits) defines two enforcement modes:

  • fail-fast: Abort immediately when breached (for cost, security)
  • post-hoc: Verify at completion (for quality, turn count)

The phrase "abort immediately when breached" implies pre-flight abort — refuse to run the tool call that would put the session over a fail-fast limit.

Code reality: fail-fast limits run at PostToolUse, i.e. after the tool already executed.

  • internal/hooks/handler.go PostToolUse path: evaluator.CheckLimits(metricsForLimits, "fail-fast") (hooks-mode).
  • internal/policy/evaluator.go:CheckLimits is the same call shape on the MCP side.

Effect: a single Bash call costing $100 against maxSpendUSD: {"value": 10, "enforcement": "fail-fast"} runs to completion and only then gets flagged. The spend already happened; the limit is closer to "post-action" than "abort immediately".

Modes affected

  • MCPaflock_check_limits is read-only; the gate that could pre-flight-block lives in handleBash etc. via EvaluatePreToolUse, but cost/token limits aren't consulted there today.
  • Hooks — PostToolUse is by definition after the call. PreToolUse currently checks tool/file allowlists, not cost/token limits.

Both paths need the change to honor the paper wording.

Paper reference

paper/aflock.tex §4.2 "Resource Limits" — the "fail-fast: Abort immediately when breached" line.

Proposed fix

Add a PreToolUse-time check (or PreCall in MCP) that estimates the projected cumulative metrics if the about-to-run tool succeeds, and aborts if it would push a fail-fast limit over.

Practical approximation: at PreToolUse, compare current metrics.CostUSD / metrics.TokensIn against the policy limit minus a small safety margin (e.g. 5%). If we're already within margin, decline the tool call with a clear message. This is conservative but matches the paper's "abort immediately" intent better than today's after-the-fact gate.

A stricter variant requires a per-tool cost/token estimate, which is harder; the margin-based approach is good enough for v1 and can be tightened later.

Acceptance for closing this issue

  • PreToolUse / pre-MCP-call returns deny when adding any plausible delta to CostUSD / TokensIn would exceed a fail-fast limit.
  • Test that asserts a fail-fast limit blocks the next tool call after the limit is breached, not just the call that breached it.
  • Docs or paper note clarifies the safety-margin approximation if not exact.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingpaper-90Close set (assumes paper-85 done) → reaches ~90%paper-gapGap between paper claims and implementation

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions