
feat: prompt cache hit rate analytics panel (closes #851) #863

Open
vivekchand wants to merge 1 commit into main from feat/gh-clawmetry-851-cache-analytics

Conversation

@vivekchand
Owner

Closes #851

What

Dedicated prompt cache analytics panel showing how effectively Anthropic prompt caching is being used across sessions.

How

  • New endpoint GET /api/cache-analytics in routes/cache_analytics.py — scans 7 days of session JSONL files for cacheRead/cacheWrite usage data
  • Overview card in the overview tab showing cache hit rate %, estimated savings, and a 7-day sparkline
  • Frontend JS loadCacheAnalytics() in app.js following the loadAutonomy() pattern

Response shape

{
  "cache_hit_ratio": 0.73,
  "total_cache_read_tokens": 1234567,
  "total_cache_write_tokens": 456789,
  "total_input_tokens": 2345678,
  "estimated_savings_usd": 3.33,
  "series_daily": [...],
  "per_model": [...],
  "per_session": [...]
}

Cost savings estimation

Uses Anthropic pricing: cached tokens cost $0.30/1M vs $3.00/1M for normal input, so each cached token saves $2.70/1M.
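
The arithmetic above can be sketched as follows. This is a minimal illustration using the two prices quoted in this description; the constant and function names are assumptions, not copied from the diff, and the sketch ignores any cost attached to cache writes.

```python
# Prices quoted in the PR description, converted to USD per token.
NORMAL_INPUT_PER_TOKEN = 3.00 / 1_000_000
CACHED_INPUT_PER_TOKEN = 0.30 / 1_000_000


def estimate_savings_usd(cache_read_tokens: int) -> float:
    """Estimate savings from reading tokens out of cache instead of paying full input price."""
    saved_per_token = NORMAL_INPUT_PER_TOKEN - CACHED_INPUT_PER_TOKEN
    return round(cache_read_tokens * saved_per_token, 2)


# The sample response above reports 1,234,567 cache-read tokens.
print(estimate_savings_usd(1_234_567))  # 3.33
```

With the sample figures, 1,234,567 cached tokens at $2.70/1M come to about $3.33, which matches the estimated_savings_usd value in the response shape.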

Tests

7 test cases in TestCacheAnalytics covering endpoint availability, response structure, types, and value ranges.

Owner Author

@vivekchand vivekchand left a comment

Test plan & review notes

What changed

  • Adds routes/cache_analytics.py (new Blueprint bp_cache_analytics) with a single GET /api/cache-analytics endpoint that scans session JSONL files for the last 7 days.
  • The endpoint computes prompt cache hit ratio, estimated cost savings (Anthropic pricing hardcoded at $3.00/$0.30 per 1M input/cached tokens), per-model and per-session breakdowns, and a daily time series.
  • Wires up a new Overview tab card with a sparkline SVG in clawmetry/templates/tabs/overview.html and clawmetry/static/js/app.js.
  • Registers the blueprint in dashboard.py.
  • Adds 7 integration tests in tests/test_api.py.

Smoke commands

  • make test or make test-api
  • python3 dashboard.py --port 8900 then navigate to the Overview/Usage section

What to look at visually

  • http://localhost:8900/ → Overview tab — "Prompt Cache Analytics" card should appear with hit-rate %, savings badge, and sparkline
  • curl -sS http://localhost:8900/api/cache-analytics — verify all keys (cache_hit_ratio, estimated_savings_usd, series_daily, per_model, per_session) are present in the JSON response

Likely failure modes from the diff

  • Hardcoded Anthropic pricing: _NORMAL_INPUT_PER_TOKEN and _CACHED_INPUT_PER_TOKEN are fixed constants; users on other providers (OpenAI, Google, OpenRouter) or on newer Anthropic model tiers with different pricing will see incorrect savings estimates. Consider sourcing these from clawmetry/providers_pricing.py keyed by model name.
  • cacheRead/cacheWrite field names: the parser assumes these exact camelCase keys in usage; if the JSONL was written by a different provider or an older OpenClaw version that uses the Anthropic API names cache_read_input_tokens / cache_creation_input_tokens, every call will silently show zero cache hits.
  • Module-level mutable cache (_cache_result, _cache_ts) — these globals are not protected by a lock, so under Waitress's multi-threaded dispatch two requests arriving simultaneously just after TTL expiry will both recompute and the second write will silently win; harmless for correctness but worth noting for future thread-safety hardening.
  • file_mtime cutoff is a coarse gate: files whose mtime is older than 7 days are skipped entirely via file_mtime < cutoff_ts, and individual line timestamps are only checked once a file passes that gate. Appending to a file normally refreshes its mtime, so an actively written long-running session still passes; the risk is a file whose mtime does not reflect its newest entries (for example, one copied or restored with an older timestamp), which is excluded even though it contains in-window lines.
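
One way to soften the field-name failure mode above is to accept both spellings when reading a usage record. The following is a minimal sketch under that assumption; the helper names are hypothetical and not part of the diff, and whether older JSONL files actually carry the snake_case keys still needs confirming.

```python
from typing import Mapping, Sequence, Tuple

# Candidate key names per counter. The camelCase names are the ones the parser
# reads today; the snake_case names are the Anthropic API usage field names.
_READ_KEYS = ("cacheRead", "cache_read_input_tokens")
_WRITE_KEYS = ("cacheWrite", "cache_creation_input_tokens")


def _first_int(usage: Mapping[str, object], keys: Sequence[str]) -> int:
    """Return the first numeric value found under any of the candidate keys, else 0."""
    for key in keys:
        value = usage.get(key)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return int(value)
    return 0


def read_cache_tokens(usage: Mapping[str, object]) -> Tuple[int, int]:
    """Return (cache_read_tokens, cache_write_tokens) for one usage record."""
    return _first_int(usage, _READ_KEYS), _first_int(usage, _WRITE_KEYS)
```

For example, read_cache_tokens({"cache_read_input_tokens": 5}) yields (5, 0) instead of silently reporting no cache reads.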

Issue link

  • Closes #851 ✓ (already in title)

Generated by Claude Code

Owner Author

@vivekchand vivekchand left a comment

Test plan & review notes

Repo: vivekchand/clawmetry

What changed

  • New routes/cache_analytics.py blueprint: GET /api/cache-analytics scans 7 days of session JSONL for cacheRead/cacheWrite usage
  • Overview card showing cache hit rate %, estimated savings, and a 7-day sparkline
  • Frontend loadCacheAnalytics() following the loadAutonomy() pattern

Smoke commands

  • python3 -c 'import ast; ast.parse(open("routes/cache_analytics.py").read())' — syntax clean
  • curl -sS http://localhost:8900/api/cache-analytics: expect {"cache_hit_ratio": 0..1, "total_cache_read_tokens": N, "estimated_savings_usd": N, "series_daily": [...]}
  • Sessions with no cache data at all should return cache_hit_ratio: 0, not NaN or a crash

What to look at visually

  • http://localhost:8900 → Overview tab → cache analytics card (hit rate %, savings estimate, 7-day sparkline)

Likely failure modes from the diff

  • Division-by-zero: cache_hit_ratio = cacheRead / (cacheRead + input) — confirm the denominator is guarded against zero
  • Overlap watch: PR #779 (/api/token-attribution) also computes cache stats from the same JSONL files — if both merge, watch for the two endpoints diverging on cache_hit_ratio due to different scan windows or normalisation logic
  • Blueprint registration: routes/cache_analytics.py must be imported and registered in dashboard.py
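
The division-by-zero guard mentioned above could look like the sketch below. It uses the formula quoted in the bullet; the function name is illustrative and the real implementation may differ.

```python
def cache_hit_ratio(cache_read: int, input_tokens: int) -> float:
    """cacheRead / (cacheRead + input), returning 0.0 when there is no usage at all."""
    denominator = cache_read + input_tokens
    if denominator <= 0:
        # No tokens recorded: report 0 rather than raising or producing NaN.
        return 0.0
    return cache_read / denominator
```

A workspace with no cache data then reports cache_hit_ratio: 0, as the smoke check above expects.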

Issue link


Generated by Claude Code

Owner Author

@vivekchand vivekchand left a comment

Test plan & review notes

Repo: vivekchand/clawmetry

What changed

  • New GET /api/cache-analytics endpoint (routes/cache_analytics.py) that scans 7 days of session JSONL files for cacheRead/cacheWrite usage, plus a matching overview card in overview.html and loadCacheAnalytics() in app.js.

Merge state

  • Branch is currently dirty (merge conflict with main) — needs a rebase or merge before this can land.

Smoke commands

  • make test or make test-api — runs the 7 new TestCacheAnalytics cases
  • python3 dashboard.py --port 8900 then open the Overview tab to see the cache card render

What to look at visually

  • http://localhost:8900/ — Overview tab: "Prompt Cache Analytics" card should appear between the existing top cards and the refresh bar
  • http://localhost:8900/api/cache-analytics — raw JSON; confirm all 8 keys are present and series_daily has exactly 7 entries
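
The manual check on the raw JSON can also be scripted. The sketch below validates an already-parsed payload against the response shape in the PR description; the function name and the exact checks are assumptions made for illustration.

```python
from typing import Any, Dict, List

# The 8 top-level keys listed in the PR description's response shape.
EXPECTED_KEYS = {
    "cache_hit_ratio",
    "total_cache_read_tokens",
    "total_cache_write_tokens",
    "total_input_tokens",
    "estimated_savings_usd",
    "series_daily",
    "per_model",
    "per_session",
}


def check_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the payload looks sane."""
    problems = []
    missing = EXPECTED_KEYS - set(payload)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    ratio = payload.get("cache_hit_ratio")
    is_number = isinstance(ratio, (int, float)) and not isinstance(ratio, bool)
    if not is_number or not 0 <= ratio <= 1:
        problems.append("cache_hit_ratio is not a number in [0, 1]")
    series = payload.get("series_daily")
    if not isinstance(series, list) or len(series) != 7:
        problems.append("series_daily does not have exactly 7 entries")
    return problems
```

Feeding it the parsed body of the curl response gives a quick pass/fail without eyeballing the JSON.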

Likely failure modes from the diff

  • Module-level cache is process-global and unlocked: _cache_result / _cache_ts are bare module globals. Under multi-threaded serving (Waitress with threads) two requests can race past the TTL check and both recompute simultaneously. This is not a correctness bug, but a threading.Lock is worth adding if the scan is slow on large workspaces.
  • File-mtime cutoff runs before any line is read: the 7-day cutoff is applied to file_mtime first, so a session file last written more than 7 days ago is skipped entirely, whatever timestamps its lines carry. Probably the intended behaviour, but worth confirming.
  • cacheRead/cacheWrite key names: the code reads usage.get("cacheRead"). Confirm OpenClaw JSONL actually uses camelCase here rather than cache_read_input_tokens / cache_creation_input_tokens (the Anthropic API field names). A silent zero-read is the failure mode if the keys don't match.
  • Savings math uses a fixed Anthropic price: $3.00/$0.30 per 1M is hardcoded and doesn't account for other providers or model-tier pricing variations; sessions using non-Anthropic models will silently over/under-count savings.
  • SVG innerHTML injection: svgEl.innerHTML = svgContent is built from server-provided strings. If model names or session IDs ever end up in SVG content in a future extension, this is an XSS surface — fine for now but worth noting.
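
For the module-level cache race noted in the first bullet above, a lock around the TTL check is one option. This is a sketch only: the 300-second TTL, the get_cached name, and the injectable clock are assumptions, and only the _cache_result / _cache_ts names come from the review.

```python
import threading
import time
from typing import Callable, Optional

_CACHE_TTL_SECONDS = 300.0  # assumed TTL; the value in the diff may differ
_cache_lock = threading.Lock()
_cache_result: Optional[dict] = None
_cache_ts = 0.0


def get_cached(compute: Callable[[], dict],
               now: Callable[[], float] = time.monotonic) -> dict:
    """Return the cached result, recomputing at most once per TTL window."""
    global _cache_result, _cache_ts
    with _cache_lock:
        expired = _cache_result is None or now() - _cache_ts >= _CACHE_TTL_SECONDS
        if expired:
            # The lock is held during the scan, so concurrent callers wait
            # for this result instead of starting a second scan.
            _cache_result = compute()
            _cache_ts = now()
        return _cache_result
```

Holding the lock while computing trades some latency for the waiting requests against never running two scans at once; if that wait is unacceptable, serving the stale value while one thread refreshes is an alternative.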

Issue link

  • Closes #851 ✓ (already linked in PR body)

Generated by Claude Code

Owner Author

Test plan & review notes

Repo: vivekchand/clawmetry

What changed

  • New routes/cache_analytics.py with GET /api/cache-analytics scanning 7 days of session JSONL for cacheRead/cacheWrite tokens; returns cache_hit_ratio, estimated_savings_usd, series_daily, per_model, per_session; overview card with hit-rate %, estimated savings, and sparkline; 7 unit tests

Smoke commands

  • python3 -m pytest -k cache_analytics -v
  • python3 dashboard.py --port 8900
  • curl -sS http://localhost:8900/api/cache-analytics → expect all response fields listed above with correct types

What to look at visually

  • http://localhost:8900 → Overview tab → cache analytics card with hit rate % and 7-day sparkline

Likely failure modes from the diff

  • Potential overlap with #779 (feat: Per-message cost attribution with cache-hit breakdown): that PR adds /api/token-attribution, which also returns cache_hit_ratio. Both endpoints scan session JSONL for cache tokens. Worth aligning on a single source of truth before both land; at minimum the ratios should agree.
  • Anthropic pricing hardcoded at $0.30/$3.00 per 1M — varies by model (Haiku vs Sonnet vs Opus); providers_pricing.py already has a model table, using it here would give more accurate savings estimates
  • 7-day JSONL scan on a large workspace could be slow; a short-lived cache (e.g. 5 min TTL) would help
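
The per-model pricing suggestion above can be sketched as a lookup with a fallback to the currently hardcoded figures. The shape of the pricing table here is hypothetical and does not reflect the actual contents or API of providers_pricing.py; the model name and its prices in the usage example are placeholders.

```python
from typing import Dict, Mapping, Tuple

# (normal input, cached input) in USD per 1M tokens.
# Fallback is the pair hardcoded in this PR.
DEFAULT_PRICING: Tuple[float, float] = (3.00, 0.30)


def savings_by_model(cache_read_by_model: Mapping[str, int],
                     pricing: Mapping[str, Tuple[float, float]]) -> Dict[str, float]:
    """Estimate savings per model, using model-specific prices where known."""
    savings = {}
    for model, tokens in cache_read_by_model.items():
        normal, cached = pricing.get(model, DEFAULT_PRICING)
        savings[model] = round(tokens * (normal - cached) / 1_000_000, 2)
    return savings


# Placeholder model name and prices, for illustration only.
example_pricing = {"placeholder-model": (1.00, 0.10)}
print(savings_by_model({"placeholder-model": 1_000_000, "unknown": 1_000_000},
                       example_pricing))
```

Models missing from the table fall back to the $3.00/$0.30 pair, so the estimate degrades to today's behaviour rather than failing.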

Issue link


Generated by Claude Code

@vivekchand vivekchand force-pushed the feat/gh-clawmetry-851-cache-analytics branch from afa43b5 to 286a312 on May 8, 2026 00:21
Owner Author

Auto-rebase pushed; CI now running. If still not green in 10min, may need manual attention.


Generated by Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vivekchand vivekchand force-pushed the feat/gh-clawmetry-851-cache-analytics branch from 286a312 to 5246ffb on May 8, 2026 21:15
Owner Author

Auto-rebase pushed; CI now running. If still not green in 10min, may need manual attention.


Generated by Claude Code

Development

Successfully merging this pull request may close these issues.

feat: Prompt cache hit rate analytics

1 participant