feat(usage): optional wire-interception proxy for complete, real-time, subscription-accurate accounting + enforcement #157

@manzil-infinity180

Goal

Match Claude Code's own /usage numbers exactly (tokens + cost) in any auth mode, and gain real-time, pre-request spend/usage enforcement — by optionally reading model traffic on the wire instead of (only) the transcript. This is the architectural fix for the accuracy ceiling in #156, and complements (does not replace) the hooks/transcript path.

Reference implementation

Datadog lapdog (open source: DataDog/dd-apm-test-agent, dir lapdog/ + ddapm_test_agent/claude_proxy.py, claude_cost_tracker.py; local clone at ~/work-dir/dd-apm-test-agent) does exactly this. It runs a localhost proxy, reads the usage receipt off each Anthropic response (incl. background calls the transcript omits), and prices it from the same table Claude Code uses — its $0.29 matched /usage $0.2908 to the cent. Note: lapdog also pulls the tool/agent span tree from Claude Code hooks, not the transcript (it never reads the Claude JSONL).
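
As a rough illustration of the pricing step, the sketch below turns one usage receipt into a dollar figure. The field names follow the `usage` object on Anthropic Messages responses; the model name, the rates, and the `price_usage` helper are hypothetical placeholders, not lapdog's code and not real prices.

```python
from decimal import Decimal

# Placeholder USD rates per million tokens. These numbers are invented for
# illustration and are NOT real Anthropic prices.
PLACEHOLDER_RATES = {
    "example-model": {
        "input_tokens": Decimal("2.00"),
        "output_tokens": Decimal("10.00"),
        "cache_creation_input_tokens": Decimal("2.50"),
        "cache_read_input_tokens": Decimal("0.20"),
    },
}

MILLION = Decimal(1_000_000)


def price_usage(model, usage, rates=PLACEHOLDER_RATES):
    """Return the USD cost of one usage receipt, or None for an unknown model."""
    model_rates = rates.get(model)
    if model_rates is None:
        return None  # unknown model: report "unpriced" rather than guess
    total = Decimal(0)
    for field, rate in model_rates.items():
        tokens = usage.get(field) or 0  # fields may be absent or null
        total += Decimal(tokens) * rate / MILLION
    return total


receipt = {
    "input_tokens": 1200,
    "output_tokens": 350,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 8000,
}
cost = price_usage("example-model", receipt)
```

Using `Decimal` rather than floats keeps per-request sums exact, which matters when the goal is to agree with /usage to the cent.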

How it works

  • Set ANTHROPIC_BASE_URL to a local aflock listener (documented Claude Code feature; same mechanism LiteLLM / Bedrock gateways use). Chain to any existing value; forward upstream unchanged.
  • Stream pass-through: proxy the SSE bytes live, tee the message_start / message_delta usage events to update the running budget (no buffering → preserves streaming UX).
  • Pre-request gate: before forwarding the next /v1/messages, check budget; if over and enforcing, return a clean 429-style error so the agent stops gracefully (never kill mid-stream).
  • No CA cert / no TLS MITM — localhost hop is plaintext to the proxy; the proxy does its own HTTPS upstream.
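
The stream pass-through step could look roughly like the sketch below: each chunk is returned to the caller untouched, while a small side buffer reassembles SSE events so the `usage` fields can be read. `UsageTee` is a hypothetical name, and the sketch assumes that `message_delta` usage counts are cumulative (as Anthropic's streaming documentation describes), so later values overwrite earlier ones.

```python
import json


class UsageTee:
    """Scan SSE bytes for usage while the caller forwards them unchanged."""

    def __init__(self):
        self._buf = b""   # holds only the current, not-yet-complete event
        self.model = None
        self.usage = {}

    def feed(self, chunk: bytes) -> bytes:
        """Record any usage completed by `chunk`; return the same bytes."""
        # Normalise CRLF on the combined buffer so a split "\r\n" is handled.
        self._buf = (self._buf + chunk).replace(b"\r\n", b"\n")
        while b"\n\n" in self._buf:
            raw_event, self._buf = self._buf.split(b"\n\n", 1)
            self._handle_event(raw_event)
        return chunk

    def _handle_event(self, raw_event: bytes) -> None:
        data_lines = [
            line[5:].lstrip(b" ")
            for line in raw_event.split(b"\n")
            if line.startswith(b"data:")
        ]
        if not data_lines:
            return
        try:
            payload = json.loads(b"\n".join(data_lines))
        except ValueError:
            return  # a parse problem must never disturb the forwarded stream
        if not isinstance(payload, dict):
            return
        if payload.get("type") == "message_start":
            message = payload.get("message") or {}
            self.model = message.get("model")
            self.usage = dict(message.get("usage") or {})
        elif payload.get("type") == "message_delta":
            # Assumed cumulative: the latest counts replace earlier ones.
            self.usage.update(payload.get("usage") or {})
```

A proxy handler would call `feed` on every upstream chunk and write the return value straight to the client, then read `tee.usage` once the stream ends.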

Why it's worth it

Risks / caveats (why it's opt-in, not default)

  • In-band SPOF: aflock now sits in the critical path. Fail-open means enforcement silently disappears whenever aflock fails (bypassable); fail-closed means an aflock bug can stall the agent (a prod outage). Default to fail-open, which is acceptable against the agent (our real adversary); make fail-closed opt-in.
  • Credentials and full prompts/responses transit aflock, which is a large trust-boundary expansion. Localhost-only by default; never log auth headers; never persist bodies (extract only usage + model); no default forwarding. PII/compliance implications in prod.
  • Off-localhost needs TLS; transparent (no-env-var) interception would require a CA cert = MITM = avoid.
  • Coupling/fragility to Anthropic API shape (SSE events, /v1/messages, usage) and, for the fetch-patch variant, to Claude Code's Bun runtime.
  • Backend conflicts: must chain to existing ANTHROPIC_BASE_URL; Bedrock/Vertex use a different protocol/auth (SigV4) → Anthropic usage parsing won't apply, so detect & disable there.
  • $ enforcement stays notional under flat-rate subscription — the proxy fixes accuracy, not the meaning of a dollar cap; token/turn caps remain the honest control there.
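
Two of the mitigations above, sketched under assumptions: choosing the upstream to chain to (or disabling wire accounting for Bedrock/Vertex), and resolving an internal accounting failure according to the configured failure mode. The function names and the `FailureMode` enum are hypothetical. `CLAUDE_CODE_USE_BEDROCK` and `CLAUDE_CODE_USE_VERTEX` are the switches Claude Code documents for those backends; treating any value other than empty, `0`, or `false` as "set" is an assumption of this sketch.

```python
from enum import Enum
from typing import Callable, Mapping, Optional

DEFAULT_UPSTREAM = "https://api.anthropic.com"


class FailureMode(Enum):
    FAIL_OPEN = "fail-open"
    FAIL_CLOSED = "fail-closed"


def _flag_set(env: Mapping[str, str], name: str) -> bool:
    # Assumption: empty, "0" and "false" mean the flag is off.
    return env.get(name, "").strip().lower() not in ("", "0", "false")


def resolve_upstream(env: Mapping[str, str]) -> Optional[str]:
    """Return the upstream to chain to, or None to disable wire accounting."""
    if _flag_set(env, "CLAUDE_CODE_USE_BEDROCK") or _flag_set(env, "CLAUDE_CODE_USE_VERTEX"):
        return None  # different protocol/auth: Anthropic usage parsing won't apply
    # Chain to an existing gateway if one was already configured.
    return env.get("ANTHROPIC_BASE_URL") or DEFAULT_UPSTREAM


def should_forward(check_budget: Callable[[], bool],
                   mode: FailureMode = FailureMode.FAIL_OPEN) -> bool:
    """Return True if the request may go upstream.

    `check_budget` returns True while within budget and may raise if
    accounting itself is broken; the failure mode decides that case.
    """
    try:
        return bool(check_budget())
    except Exception:
        return mode is FailureMode.FAIL_OPEN
```

`resolve_upstream` has to read the environment before aflock overwrites `ANTHROPIC_BASE_URL` with its own listener address. `should_forward` only covers failures inside a running proxy; if the process itself is down, the agent's requests fail regardless of the mode.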

Phased plan

  • Phase 0 — fix Opus 4.5+ pricing (fix(usage): Opus 4.5+ priced at pre-4.5 rates → ~3x cost overcount #155). Independent, ships now.
  • Phase 1 — observe-only proxy (off by default): prove aflock-on-wire reproduces /usage to the cent incl. background calls. No enforcement.
  • Phase 2 — pre-request enforcement, fail-open: enforce maxSpendUSD / token caps from wire data in subscription mode.
  • Phase 3 — prod hardening (if demand): fail-closed mode, cross-host TLS, gateway chaining, multi-tenant accounting, compliance story.
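
A minimal sketch of the Phase 2 pre-request gate, assuming a `Budget` record kept up to date from wire data. `Budget`, `gate_request`, and the `max_tokens` cap are hypothetical names; `max_spend_usd` mirrors `maxSpendUSD`. The error body follows the shape Anthropic uses for API errors, but how Claude Code reacts to this status (for example, whether it retries) is not verified here.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional, Tuple


@dataclass
class Budget:
    max_spend_usd: Optional[Decimal] = None  # mirrors maxSpendUSD
    max_tokens: Optional[int] = None         # hypothetical token cap
    spent_usd: Decimal = Decimal("0")
    tokens_used: int = 0

    def exceeded(self) -> Optional[str]:
        """Return a human-readable reason if any cap is reached, else None."""
        if self.max_spend_usd is not None and self.spent_usd >= self.max_spend_usd:
            return f"spend cap of ${self.max_spend_usd} reached"
        if self.max_tokens is not None and self.tokens_used >= self.max_tokens:
            return f"token cap of {self.max_tokens} reached"
        return None


Rejection = Tuple[int, Dict[str, str], bytes]


def gate_request(budget: Budget, enforcing: bool) -> Optional[Rejection]:
    """Return None to forward, or (status, headers, body) to reject.

    Called before a /v1/messages request is sent upstream, so an in-flight
    stream is never cut off.
    """
    reason = budget.exceeded()
    if reason is None or not enforcing:
        return None  # observe-only mode never blocks (Phase 1 behaviour)
    body = json.dumps({
        "type": "error",
        "error": {"type": "rate_limit_error", "message": f"aflock: {reason}"},
    })
    return 429, {"content-type": "application/json"}, body.encode()
```

Because the check runs only between requests, a session can overshoot its cap by at most one response; that is the cost of never killing a stream mid-flight.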

Threat-model note

Enforces against the agent (won't bypass a localhost proxy it doesn't know about), fail-open by default. Not a control against a malicious human operator (who can kill the proxy / unset the env var) — don't market it as tamper-proof.

Related: #111, #96, #153, #100, #87. Supersedes the accuracy ceiling in #156; Phase 0 is #155.

Labels: architecture (architectural change needed), deferred (not needed for current milestone; revisit later), enhancement (new feature or request), epic (multiple sub-issues)
