
feat: implement token/cost/turn tracking via Claude Code session JSONL #96

@manzil-infinity180

Summary

Implement token, cost, and turn tracking on the MCP path by parsing the Claude Code session JSONL. This makes maxSpendUSD, maxTokensIn, maxTokensOut, and maxTurns enforceable on MCP-only deployments — closing the silent-bypass gap raised in #94.

Background

Claude Code writes a session log to ~/.claude/projects/<encoded-cwd>/<session-uuid>.jsonl. The file is append-only and is updated in near real time after each assistant turn. Every assistant line carries:

"message": {
  "usage": {
    "input_tokens": ...,
    "output_tokens": ...,
    "cache_creation_input_tokens": ...,
    "cache_read_input_tokens": ...,
    "cache_creation": {
      "ephemeral_5m_input_tokens": ...,
      "ephemeral_1h_input_tokens": ...
    }
  }
}

This is the same usage block that the Anthropic API returns, persisted unchanged. aflock already locates this file at startup (see the "[aflock] Most recent session file: ..." log line); it just does not parse the usage field today.
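A minimal sketch of decoding one such line in Go, to make the shape above concrete. The type and function names (Usage, sessionLine, ParseUsage) are hypothetical, and the top-level "type": "assistant" discriminator is an assumption about the log format rather than something this issue confirms.

```go
package main

import (
	"encoding/json"
)

// Usage mirrors the token counters under message.usage.
// JSON keys are the ones quoted in this issue.
type Usage struct {
	InputTokens              int64 `json:"input_tokens"`
	OutputTokens             int64 `json:"output_tokens"`
	CacheCreationInputTokens int64 `json:"cache_creation_input_tokens"`
	CacheReadInputTokens     int64 `json:"cache_read_input_tokens"`
	CacheCreation            struct {
		Ephemeral5m int64 `json:"ephemeral_5m_input_tokens"`
		Ephemeral1h int64 `json:"ephemeral_1h_input_tokens"`
	} `json:"cache_creation"`
}

// sessionLine is the subset of one JSONL record the tracker needs.
// ASSUMPTION: assistant records carry a top-level "type": "assistant".
type sessionLine struct {
	Type    string `json:"type"`
	Message struct {
		Usage *Usage `json:"usage"`
	} `json:"message"`
}

// ParseUsage returns the usage block of an assistant line.
// ok is false for non-assistant lines, lines without a usage block,
// and lines that are not valid JSON.
func ParseUsage(line []byte) (u Usage, ok bool) {
	var rec sessionLine
	if err := json.Unmarshal(line, &rec); err != nil {
		return Usage{}, false
	}
	if rec.Type != "assistant" || rec.Message.Usage == nil {
		return Usage{}, false
	}
	return *rec.Message.Usage, true
}
```

Returning ok=false instead of an error keeps the caller's loop simple: anything that is not a usable assistant record is skipped.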

ccusage, claude-usage, and other community usage trackers read the same data path. It works identically with Claude Max subscription and API-key auth, since the usage field is populated in both cases.

Proposed integration

Hooks deployment (PreToolUse) — primary path:

  1. Read stdin JSON, extract session_id and transcript_path.
  2. Memoize per-session JSONL byte offset.
  3. On each invocation, read new lines from offset to EOF.
  4. Sum message.usage.{input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens} across assistant lines.
  5. Apply cache-tier weights to compute shadow-USD: input ~1×, cache_write ~1.25×, cache_read ~0.1×, each as a multiple of the base input-token price (using current Anthropic public pricing for the model in use).
  6. Compare cumulative values against policy limits.maxTokensIn / maxTokensOut / maxSpendUSD / maxTurns.
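  7. When a limit is exceeded, exit with code 2 and write a denial message to stderr to block the next tool call. Claude Code treats exit code 2 from a PreToolUse hook as blocking; other non-zero codes are reported as non-blocking errors.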

MCP server arm — fallback path:

  • aflock's MCP server already discovers the same JSONL at startup. Reuse that discovery and tail the file in a goroutine using the same parser.
  • There is a slight write-flush race relative to the hook path: PreToolUse fires before the next tool call and hook ordering is deterministic, whereas MCP-server tailing is asynchronous. This is acceptable for post-hoc enforcement, but not safe for fail-fast until we add a synchronous "check before dispatch" path.

Acceptance

  • After implementation, a session that consumes more than maxTokensIn is blocked at the next PreToolUse with a clear denial.
  • A session that completes under-limit shows non-zero tokensIn, tokensOut, costUSD, turns in state.json and in the aflock verify --session output.
  • Works identically on Claude Max subscription and API-key Claude Code (validated by reading the same JSONL fields).
  • Cache-tier weights configurable via policy or env (so users can override pricing if model rates change).

Files

  • internal/state/session.go — already has UpdateMetrics / IncrementTurns, just needs new callers.
  • internal/hooks/handler.go:668 — already calls IncrementTurns; PreToolUse path needs to also feed UpdateMetrics from JSONL.
  • internal/mcp/server.go — JSONL discovery already happens at startup. Add tail-reader.
  • New: internal/usage/jsonl.go — JSONL parser, offset memoization, cache-tier cost math (parallel to ccusage's logic).

Why now

References

Out of scope

  • Plan-utilization % on Max subscription (% of monthly cap consumed) — that comes from server-side rate-limit headers, not the JSONL. Separate concern, not addressable from this fix.
