Summary
On MCP-only aflock deployments, the policy limits maxTurns and
maxSpendUSD are silently never enforced — counters stay at 0, so
neither fail-fast nor post-hoc enforcement ever fires. Found live
during PR #88 + SPIRE end-to-end testing.
Repro
With the standard MCP-UDS sandbox setup and a policy carrying real
limits:
    limits:
      maxSpendUSD: { value: 5, enforcement: fail-fast }
      maxTurns: { value: 50, enforcement: post-hoc }
After a Claude Code session running 4 tool calls, the resulting
~/.aflock/sessions/<id>/state.json shows:
    "metrics": {
      "tokensIn": 0,
      "tokensOut": 0,
      "costUSD": 0,
      "turns": 0,
      "toolCalls": 4, // ← only this counter increments
      "tools": { "Bash": 2, "Read": 2 }
    }
Why it happens
- RecordAction (called from internal/mcp/server.go:1379) increments
  ToolCalls and the Tools map ✅
- IncrementTurns / UpdateMetrics(tokensIn, tokensOut, costUSD) are
  ONLY called from internal/hooks/handler.go:668 ❌
- The MCP path has no equivalent observable for either:
  - Tokens / cost: the MCP protocol doesn't carry Claude's model usage
    telemetry. aflock cannot see how many tokens Claude used to process
    a request; that information only reaches us via the
    claude-code PostToolUse hook.
  - Turns: MCP has no "turn boundary" signal. One user message can
    trigger N tools/call requests; aflock sees the individual requests
    but not the boundary.
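The mechanism can be modeled in a few lines. This is a minimal sketch, not aflock's actual code: the method names RecordAction, IncrementTurns, and UpdateMetrics come from this issue, but the struct layout, the signatures, and the SimulateMCPOnly helper are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SessionMetrics models the "metrics" object in state.json.
// The field layout is an assumption based on the JSON shown in Repro.
type SessionMetrics struct {
	TokensIn  int            `json:"tokensIn"`
	TokensOut int            `json:"tokensOut"`
	CostUSD   float64        `json:"costUSD"`
	Turns     int            `json:"turns"`
	ToolCalls int            `json:"toolCalls"`
	Tools     map[string]int `json:"tools"`
}

// RecordAction is the only mutator the MCP path reaches.
func (m *SessionMetrics) RecordAction(tool string) {
	if m.Tools == nil {
		m.Tools = make(map[string]int)
	}
	m.ToolCalls++
	m.Tools[tool]++
}

// IncrementTurns is reached only from the hooks path.
func (m *SessionMetrics) IncrementTurns() { m.Turns++ }

// UpdateMetrics is reached only from the hooks path.
// Accumulating (rather than overwriting) is an assumption.
func (m *SessionMetrics) UpdateMetrics(tokensIn, tokensOut int, costUSD float64) {
	m.TokensIn += tokensIn
	m.TokensOut += tokensOut
	m.CostUSD += costUSD
}

// SimulateMCPOnly (hypothetical helper) replays tool calls the way an
// MCP-only session would: every call hits RecordAction, and nothing
// ever calls IncrementTurns or UpdateMetrics.
func SimulateMCPOnly(tools []string) *SessionMetrics {
	m := &SessionMetrics{}
	for _, t := range tools {
		m.RecordAction(t)
	}
	return m
}

func main() {
	m := SimulateMCPOnly([]string{"Bash", "Read", "Bash", "Read"})
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because turns and costUSD never move off zero, any comparison against maxTurns or maxSpendUSD passes trivially, whichever enforcement mode is configured.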
Why this is security-relevant
Post-hoc (verify-time) and fail-fast enforcement of maxTurns and
maxSpendUSD are both bypassed silently. Policy authors who write these
limits have no way to know their constraints are unenforced on the
deployment shape they're using.
Fix options
Option A — Refuse-or-warn at startup (smallest)
At MCP server start, inspect the policy. If either maxTurns or
maxSpendUSD is set, and we are running MCP-only (no hooks active in
the same process), either:
- Refuse to start (fail-fast enforcement is impossible to honor)
- Log a loud warning (post-hoc could still be honored later)
Pros: small (~50 lines + tests). Closes the silent-bypass UX trap.
Cons: doesn't actually make the limits work.
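A rough sketch of what the startup check might look like. Everything here is hypothetical: the Limit and Limits types, the CheckMCPOnlyLimits function, and the hooksActive flag are illustrative names, not aflock's real policy types or API. Treating any enforcement value other than fail-fast as a warning is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Limit is a hypothetical stand-in for one entry under `limits:`.
type Limit struct {
	Value       float64
	Enforcement string // "fail-fast" or "post-hoc"
}

// Limits holds the two limits the MCP path cannot measure; nil means unset.
type Limits struct {
	MaxTurns    *Limit
	MaxSpendUSD *Limit
}

// ErrUnenforceable marks a refusal to start.
var ErrUnenforceable = errors.New("limit cannot be enforced on an MCP-only deployment")

// CheckMCPOnlyLimits returns warnings for limits that are set but not
// fail-fast, and an error if any fail-fast limit is set while no hooks
// are active to feed the counters.
func CheckMCPOnlyLimits(l Limits, hooksActive bool) ([]string, error) {
	if hooksActive {
		return nil, nil // hooks feed the counters; nothing to flag
	}
	entries := []struct {
		name  string
		limit *Limit
	}{
		{"maxTurns", l.MaxTurns},
		{"maxSpendUSD", l.MaxSpendUSD},
	}
	var warnings, refused []string
	for _, e := range entries {
		if e.limit == nil {
			continue
		}
		if e.limit.Enforcement == "fail-fast" {
			refused = append(refused, e.name)
			continue
		}
		warnings = append(warnings, fmt.Sprintf(
			"%s (%s) is not enforced on MCP-only deployments",
			e.name, e.limit.Enforcement))
	}
	if len(refused) > 0 {
		return warnings, fmt.Errorf("%w: %s",
			ErrUnenforceable, strings.Join(refused, ", "))
	}
	return warnings, nil
}

func main() {
	policy := Limits{
		MaxSpendUSD: &Limit{Value: 5, Enforcement: "fail-fast"},
		MaxTurns:    &Limit{Value: 50, Enforcement: "post-hoc"},
	}
	warnings, err := CheckMCPOnlyLimits(policy, false)
	for _, w := range warnings {
		fmt.Println("WARNING:", w)
	}
	if err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("server would start here")
}
```

Returning the warnings alongside the error lets the caller log the post-hoc gaps even when a fail-fast limit forces the refusal.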
Option B — Heuristic counters
- Increment turns per tools/call request on MCP path. Overcounts
  vs hooks path's "turn = user message" definition, but better than 0.
- Estimate costUSD from response payload sizes. Very approximate.
Pros: limits become somewhat enforceable.
Cons: numbers are misleading; "turn" no longer means the same thing
across deployment shapes; encourages false sense of safety.
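For reference, the heuristic could be as small as the sketch below. The HeuristicMetrics type, the RecordToolsCall method, and especially the assumedUSDPerKiB rate are invented for illustration; no real pricing or aflock API is implied. The Estimated flag is one possible way to keep approximate numbers from being mistaken for measured ones.

```go
package main

import "fmt"

// assumedUSDPerKiB is an invented conversion rate, for illustration only.
const assumedUSDPerKiB = 0.0005

// HeuristicMetrics is a hypothetical counter set for the MCP path.
type HeuristicMetrics struct {
	Turns     int
	ToolCalls int
	CostUSD   float64
	Estimated bool // true once any value comes from a heuristic
}

// RecordToolsCall counts one tools/call request as one turn and adds a
// size-based cost estimate. Both are approximations: one user message
// may trigger several requests, so Turns overcounts.
func (m *HeuristicMetrics) RecordToolsCall(responseBytes int) {
	m.ToolCalls++
	m.Turns++
	m.CostUSD += float64(responseBytes) / 1024 * assumedUSDPerKiB
	m.Estimated = true
}

// OverTurnLimit reports whether the (over)counted turns exceed maxTurns.
func (m *HeuristicMetrics) OverTurnLimit(maxTurns int) bool {
	return m.Turns > maxTurns
}

func main() {
	m := &HeuristicMetrics{}
	for i := 0; i < 4; i++ {
		m.RecordToolsCall(2048)
	}
	fmt.Printf("turns=%d toolCalls=%d estimated=%t costUSD~%.4f\n",
		m.Turns, m.ToolCalls, m.Estimated, m.CostUSD)
	fmt.Println("over maxTurns=50:", m.OverTurnLimit(50))
}
```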
Option C — Hybrid deployment
Recommend running both MCP + hooks in the same session. Hooks path
provides accurate metrics; MCP path provides identity attestation.
Document the requirement and provide a config flag.
Pros: zero approximation.
Cons: complexity for users; adds a setup step.
Option D — Upstream MCP extension
Propose to Anthropic that MCP carry usage metadata in tools/call
results (similar to how the Anthropic API returns usage with each
response). Long-term proper fix.
Suggested next step
Land Option A as a near-term mitigation — it closes the silent
bypass, even if it doesn't make the limits work. Track Options B/C/D
as separate issues once Option A ships.
Files of interest
internal/mcp/server.go — server start, policy load
internal/state/session.go:237 (UpdateMetrics),
:247 (IncrementTurns) — only called from hooks
internal/hooks/handler.go:668 — the only call site
Acceptance for Option A
- Starting aflock serve --unix with a policy containing maxSpendUSD
  with enforcement: fail-fast refuses to start with a clear error
  pointing to this issue
- enforcement: post-hoc logs a warning at startup but allows start
- Same behavior for maxTurns
- Regression test asserts both behaviors