feat(usage): optional wire-interception proxy for complete, real-time, subscription-accurate accounting + enforcement

## Goal
Match Claude Code's own `/usage` numbers exactly (tokens + cost) in **any** auth mode, and gain **real-time, pre-request** spend/usage enforcement — by optionally reading model traffic on the wire instead of (only) the transcript. This is the architectural fix for the accuracy ceiling in #156, and complements (does not replace) the hooks/transcript path.

## Reference implementation
Datadog **lapdog** (open source: `DataDog/dd-apm-test-agent`, dir `lapdog/` + `ddapm_test_agent/claude_proxy.py`, `claude_cost_tracker.py`; local clone at `~/work-dir/dd-apm-test-agent`) does exactly this. It runs a localhost proxy, reads the `usage` receipt off each Anthropic response (incl. background calls the transcript omits), and prices it from the same table Claude Code uses — its `$0.29` matched `/usage` `$0.2908` to the cent. Note: lapdog also pulls the tool/agent span tree from Claude Code **hooks**, not the transcript (it never reads the Claude JSONL).

## How it works
- Set `ANTHROPIC_BASE_URL` to a local aflock listener (documented Claude Code feature; same mechanism LiteLLM / Bedrock gateways use). **Chain** to any existing value; forward upstream unchanged.
- Stream pass-through: proxy the SSE bytes live, tee the `message_start` / `message_delta` `usage` events to update the running budget (no buffering → preserves streaming UX).
- Pre-request gate: before forwarding the next `/v1/messages`, check budget; if over and enforcing, return a clean 429-style error so the agent stops gracefully (never kill mid-stream).
- No CA cert / no TLS MITM — localhost hop is plaintext to the proxy; the proxy does its own HTTPS upstream.

## Why it's worth it
- **Completeness** — captures every call incl. haiku title-gen / compaction → exact, auth-mode-independent (fixes #156, removes the subscription advisory downgrade in #111).
- **Real-time pre-request enforcement** — stops spend at the source, earlier than even the pre-tool fail-fast in #153, and catches model calls from native subagents that bypass hooks (relevant to #100).
- **Works for any Anthropic-SDK agent**, not just Claude Code (hooks/transcript are CC-only) → supports the broader-agents goal in #87; the natural enforcement point for autonomous agents in prod.

## Risks / caveats (why it's opt-in, not default)
- **In-band SPOF**: aflock now sits in the critical path. fail-open = bypassable / no enforcement on failure; fail-closed = an aflock bug can stall the agent (a prod outage). Default **fail-open** (safe vs the *agent*, our real adversary); fail-closed opt-in.
- **Credential + full prompt/response transit aflock** → large trust-boundary expansion. Localhost-only by default; **never log auth headers; never persist bodies (extract only `usage`+`model`); no default forwarding.** PII/compliance implications in prod.
- **Off-localhost needs TLS**; transparent (no-env-var) interception would require a CA cert = MITM = avoid.
- **Coupling/fragility** to Anthropic API shape (SSE events, `/v1/messages`, `usage`) and, for the fetch-patch variant, to Claude Code's Bun runtime.
- **Backend conflicts**: must chain to existing `ANTHROPIC_BASE_URL`; **Bedrock/Vertex** use a different protocol/auth (SigV4) → Anthropic `usage` parsing won't apply, so detect & disable there.
- **$ enforcement stays notional under flat-rate** subscription — the proxy fixes accuracy, not the meaning of a dollar cap; token/turn caps remain the honest control there.

## Phased plan
- [ ] **Phase 0** — fix Opus 4.5+ pricing (#155). Independent, ships now.
- [ ] **Phase 1** — observe-only proxy (off by default): prove aflock-on-wire reproduces `/usage` to the cent incl. background calls. No enforcement.
- [ ] **Phase 2** — pre-request enforcement, fail-open: enforce `maxSpendUSD` / token caps from wire data in subscription mode.
- [ ] **Phase 3** — prod hardening (if demand): fail-closed mode, cross-host TLS, gateway chaining, multi-tenant accounting, compliance story.

## Threat-model note
Enforces against the **agent** (won't bypass a localhost proxy it doesn't know about), fail-open by default. **Not** a control against a malicious human operator (who can kill the proxy / unset the env var) — don't market it as tamper-proof.

Related: #111, #96, #153, #100, #87. Supersedes the accuracy ceiling in #156; Phase 0 is #155.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(usage): optional wire-interception proxy for complete, real-time, subscription-accurate accounting + enforcement #157

Goal

Reference implementation

How it works

Why it's worth it

Risks / caveats (why it's opt-in, not default)

Phased plan

Threat-model note

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

feat(usage): optional wire-interception proxy for complete, real-time, subscription-accurate accounting + enforcement #157

Description

Goal

Reference implementation

How it works

Why it's worth it

Risks / caveats (why it's opt-in, not default)

Phased plan

Threat-model note

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions