fix: MCP-mode metrics never populated — fail-fast limits (maxSpendUSD, maxTokensIn) cannot fire #72

@manzil-infinity180

Summary

In MCP mode, state.Metrics.{TokensIn, TokensOut, CostUSD, Turns} are never incremented from real agent activity. Only ToolCalls, FilesRead, and FilesWritten update. Consequence: any limits entry with enforcement: "fail-fast" (paper §4) that depends on token or spend counters is effectively a no-op.

Evidence

Live test session (aflock-mcp-test, 2 tool calls executed):

// aflock get_session output:
{
  "costUSD": 0,
  "filesRead": 1,
  "filesWritten": 0,
  "tokensIn": 0,
  "tokensOut": 0,
  "toolCalls": 2,
  "turns": 0
}

After ~5 real Claude round-trips consuming ~10k tokens total, metrics still read zero.

Root cause

Only two code paths populate those fields:

  • internal/state/session.go:239 UpdateMetrics() — takes explicit tokensIn, tokensOut, costUSD
  • internal/state/session.go:249 IncrementTurns()

In hooks mode, internal/hooks/handler.go reads usage/cost fields from the PostToolUse hook input JSON and calls those functions. In MCP mode, there is no equivalent — aflock is not on the Anthropic API path, so it has no visibility into tokens/cost. No MCP handler ever calls UpdateMetrics or IncrementTurns.
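The asymmetry between the two modes can be illustrated with a minimal sketch. Only the names UpdateMetrics and IncrementTurns and the counter names come from this issue; the Session type, recordToolCall, hooksPostToolUse, and mcpToolCall are hypothetical stand-ins, and the real signatures in internal/state/session.go may differ.

```go
package main

import "sync"

// Metrics mirrors the counters shown in the get_session output above.
type Metrics struct {
	TokensIn  int64
	TokensOut int64
	CostUSD   float64
	Turns     int
	ToolCalls int
}

// Session is a hypothetical stand-in for the real session state.
type Session struct {
	mu      sync.Mutex
	Metrics Metrics
}

// UpdateMetrics adds explicitly supplied usage to the running totals.
func (s *Session) UpdateMetrics(tokensIn, tokensOut int64, costUSD float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.Metrics.TokensIn += tokensIn
	s.Metrics.TokensOut += tokensOut
	s.Metrics.CostUSD += costUSD
}

// IncrementTurns bumps the turn counter by one.
func (s *Session) IncrementTurns() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.Metrics.Turns++
}

// recordToolCall (hypothetical) is the only update both modes share.
func (s *Session) recordToolCall() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.Metrics.ToolCalls++
}

// hooksPostToolUse models the hooks-mode path: usage taken from the
// hook input is forwarded to UpdateMetrics and IncrementTurns.
func hooksPostToolUse(s *Session, tokensIn, tokensOut int64, costUSD float64) {
	s.recordToolCall()
	s.UpdateMetrics(tokensIn, tokensOut, costUSD)
	s.IncrementTurns()
}

// mcpToolCall models the MCP path: no usage data is available, so only
// the tool-call counter moves.
func mcpToolCall(s *Session) {
	s.recordToolCall()
}
```

Calling mcpToolCall twice on a fresh Session leaves ToolCalls at 2 and every token, cost, and turn counter at zero, which matches the get_session output in the Evidence section.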

Why this matters

Paper §4 (Resource Limits):

fail-fast: Abort immediately when breached (for cost, security)

With zero metrics, fail-fast limits never trip. An adversarial or runaway agent can exceed maxSpendUSD: 1.00 by an arbitrary amount without aflock noticing. The only remaining backstop would be running aflock verify --session post-hoc, by which point the spend has already happened.

The limits:post-hoc check in verify also reports metrics as zero, because the session state file it reads contains zeros, so even post-hoc enforcement doesn't work in MCP mode.
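To make the failure mode concrete, here is a hedged sketch of a fail-fast check against the policy shape used in the Repro section. The limits JSON layout and the names maxTokensIn and maxSpendUSD come from this issue. The Limit, Policy, and Usage types, the parsePolicy and checkFailFast helpers, and the "observed value greater than limit" breach rule are assumptions for illustration, not aflock's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Limit matches one entry under "limits" in the .aflock policy.
type Limit struct {
	Value       float64 `json:"value"`
	Enforcement string  `json:"enforcement"`
}

// Policy keeps only the part of the policy this sketch needs;
// other fields (version, name, tools) are ignored by Unmarshal.
type Policy struct {
	Limits map[string]Limit `json:"limits"`
}

// Usage holds the observed counters (hypothetical type).
type Usage struct {
	TokensIn float64
	CostUSD  float64
}

// parsePolicy decodes a policy document from its JSON text.
func parsePolicy(raw string) (Policy, error) {
	var p Policy
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

// checkFailFast returns an error for any fail-fast limit whose observed
// value exceeds the configured value (assumed breach rule).
func checkFailFast(p Policy, u Usage) error {
	observed := map[string]float64{
		"maxTokensIn": u.TokensIn,
		"maxSpendUSD": u.CostUSD,
	}
	for name, lim := range p.Limits {
		if lim.Enforcement != "fail-fast" {
			continue
		}
		got, ok := observed[name]
		if !ok {
			continue // limit not modelled in this sketch
		}
		if got > lim.Value {
			return fmt.Errorf("limit %s breached: %v > %v", name, got, lim.Value)
		}
	}
	return nil
}
```

With the repro policy (maxTokensIn set to 1), checkFailFast returns nil for a zero Usage and an error once TokensIn reflects real consumption. Any check of this shape is only as good as the counters fed into it, which is why zeroed metrics disable both fail-fast and post-hoc enforcement.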

Options

  1. Hybrid mode: keep hooks installed purely for metrics, even when the authorization path is MCP. Requires the PostToolUse hook to write into the same session state that MCP uses. Works today but defeats the "MCP-only" deployment story.

  2. Client-supplied usage via _usage tool arg: extend each MCP tool handler (handleBash, etc.) to accept _tokens_in, _tokens_out, and _cost_usd, and require the client to supply them. Claude Code does not currently populate these, so this would need upstream work.

  3. Anthropic SDK proxy: the aflock MCP server could proxy Anthropic API calls (or wrap the SDK) so it sees usage responses. Large scope change; probably overkill.

  4. Documented gap: update paper and docs to say fail-fast limits only work in hooks mode; MCP supports post-hoc only.

Recommendation: (1) as a short-term workaround + (4) in docs, with (2) as a proper fix tracked for later.
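As a rough sketch of what option (2) might involve on the server side, the helper below pulls the proposed fields out of decoded tool arguments. The argument names _tokens_in, _tokens_out, and _cost_usd come from this issue. The ClientUsage type, the number and extractUsage helpers, and the assumption that arguments arrive as a map[string]any decoded from JSON (where numbers become float64) are hypothetical.

```go
package main

// ClientUsage holds usage figures reported by the MCP client.
type ClientUsage struct {
	TokensIn  int64
	TokensOut int64
	CostUSD   float64
}

// number reads a non-negative JSON number from the argument map.
func number(args map[string]any, key string) (float64, bool) {
	v, ok := args[key]
	if !ok {
		return 0, false
	}
	f, ok := v.(float64) // encoding/json decodes numbers into float64
	if !ok || f < 0 {
		return 0, false
	}
	return f, true
}

// extractUsage returns the client-reported usage, or false if any of
// the three fields is missing or malformed.
func extractUsage(args map[string]any) (ClientUsage, bool) {
	in, okIn := number(args, "_tokens_in")
	out, okOut := number(args, "_tokens_out")
	cost, okCost := number(args, "_cost_usd")
	if !(okIn && okOut && okCost) {
		return ClientUsage{}, false
	}
	return ClientUsage{TokensIn: int64(in), TokensOut: int64(out), CostUSD: cost}, true
}
```

A handler such as handleBash could call extractUsage first and forward the result to UpdateMetrics, rejecting the call when the fields are absent. The values are self-reported by the client, so this design trusts the same agent that the limits are meant to constrain.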

Repro

# Policy with an aggressive limit
echo '{"version":"1.0","name":"metrics-test","limits":{"maxTokensIn":{"value":1,"enforcement":"fail-fast"}},"tools":{"allow":["Bash"]}}' > .aflock
# Start Claude with aflock MCP
# Make 10 bash calls via aflock bash
# Observe: all 10 succeed, session.Metrics.TokensIn stays 0

Labels: bug (Something isn't working), paper-85 (Close set → reaches ~85% paper compliance), paper-gap (Gap between paper claims and implementation)