feat: prompt cache hit rate analytics panel (closes #851) #863
vivekchand wants to merge 1 commit into `main`
Conversation
vivekchand
left a comment
Test plan & review notes
What changed
- Adds `routes/cache_analytics.py` (new Blueprint `bp_cache_analytics`) with a single `GET /api/cache-analytics` endpoint that scans session JSONL files for the last 7 days and computes the prompt cache hit ratio, estimated cost savings (Anthropic pricing hardcoded at $3.00/$0.30 per 1M input/cached tokens), per-model and per-session breakdowns, and a daily time series
- Wires up a new Overview tab card with a sparkline SVG in `clawmetry/templates/tabs/overview.html` and `clawmetry/static/js/app.js`
- Registers the blueprint in `dashboard.py`
- Adds 7 integration tests in `tests/test_api.py`
Smoke commands
- `make test` or `make test-api`
- `python3 dashboard.py --port 8900`, then navigate to the Overview/Usage section
What to look at visually
- `http://localhost:8900/` → Overview tab — the "Prompt Cache Analytics" card should appear with hit-rate %, savings badge, and sparkline
- `curl -sS http://localhost:8900/api/cache-analytics` — verify all keys (`cache_hit_ratio`, `estimated_savings_usd`, `series_daily`, `per_model`, `per_session`) are present in the JSON response
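The key check in the curl step can be scripted; a minimal sketch (the key names come from this review, but `missing_keys` and `fetch_payload` are hypothetical helpers, not part of the PR):

```python
# Structural check for the /api/cache-analytics payload: report which of
# the expected top-level keys are absent from a decoded JSON response.
import json
import urllib.request

EXPECTED_KEYS = {
    "cache_hit_ratio",
    "estimated_savings_usd",
    "series_daily",
    "per_model",
    "per_session",
}

def missing_keys(payload: dict) -> set:
    """Return the expected keys that the payload does not contain."""
    return EXPECTED_KEYS - payload.keys()

def fetch_payload(url: str = "http://localhost:8900/api/cache-analytics") -> dict:
    """Fetch and decode the endpoint JSON (assumes the dashboard is running)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Run `python3 dashboard.py --port 8900` first, then `missing_keys(fetch_payload())` should return an empty set.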
Likely failure modes from the diff
- Hardcoded Anthropic pricing — `_NORMAL_INPUT_PER_TOKEN` and `_CACHED_INPUT_PER_TOKEN` are fixed constants; users on other providers (OpenAI, Google, OpenRouter) or on newer Anthropic model tiers with different pricing will see incorrect savings estimates. Consider sourcing these from `clawmetry/providers_pricing.py` keyed by model name.
- `cacheRead`/`cacheWrite` field names — the parser assumes Anthropic's exact field names in `usage`; if the JSONL was written by a different provider or an older OpenClaw version that uses `cache_read_input_tokens`/`cache_creation_input_tokens`, all calls will silently show zero cache hits.
- Module-level mutable cache (`_cache_result`, `_cache_ts`) — these globals are not protected by a lock, so under Waitress's multi-threaded dispatch two requests arriving simultaneously just after TTL expiry will both recompute, and the second write will silently win; harmless for correctness but worth noting for future thread-safety hardening.
- `file_mtime` cutoff may skip active sessions — files modified before 7 days ago are skipped entirely via `file_mtime < cutoff_ts`, but a long-running session JSONL created more than 7 days ago that still has recently appended lines will be excluded; individual line timestamps are not checked unless the file passes the mtime gate.
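The thread-safety hardening suggested above could be a `threading.Lock` around the TTL check; a sketch under assumed names (`_TTL_SECONDS` and the `compute` callback are illustrative — the PR's actual TTL value and call structure aren't shown here):

```python
import threading
import time

_TTL_SECONDS = 60.0        # assumed TTL window, not the PR's actual value
_lock = threading.Lock()
_cache_result = None
_cache_ts = 0.0

def get_analytics(compute):
    """Return cached analytics, recomputing at most once per TTL window.

    `compute` is the (possibly slow) JSONL scan callable. Holding the lock
    across the check-and-recompute means only one thread rescans after
    expiry under Waitress's threaded dispatch, instead of two racing.
    """
    global _cache_result, _cache_ts
    with _lock:
        now = time.monotonic()
        if _cache_result is None or now - _cache_ts > _TTL_SECONDS:
            _cache_result = compute()
            _cache_ts = now
        return _cache_result
```

The trade-off is that a slow scan blocks concurrent readers; a double-checked pattern that serves the stale result while one thread recomputes would avoid that if it matters.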
Issue link
- Closes #851 ✓ (already in title)
Generated by Claude Code
vivekchand
left a comment
Test plan & review notes
Repo: vivekchand/clawmetry
What changed
- New `routes/cache_analytics.py` blueprint: `GET /api/cache-analytics` scans 7 days of session JSONL for `cacheRead`/`cacheWrite` usage
- Overview card showing cache hit rate %, estimated savings, and a 7-day sparkline
- Frontend `loadCacheAnalytics()` following the `loadAutonomy()` pattern
Smoke commands
- `python3 -c 'import ast; ast.parse(open("routes/cache_analytics.py").read())'` — syntax clean
- `curl -sS http://localhost:8900/api/cache-analytics` — expect `{"cache_hit_ratio": 0..1, "total_cache_read_tokens": N, "estimated_savings_usd": N, "daily": [...]}`
- Sessions with no cache data at all should return `cache_hit_ratio: 0`, not `NaN` or a crash
What to look at visually
- `http://localhost:8900` → Overview tab → cache analytics card (hit rate %, savings estimate, 7-day sparkline)
Likely failure modes from the diff
- Division by zero: `cache_hit_ratio = cacheRead / (cacheRead + input)` — confirm the denominator is guarded against zero
- Overlap watch: PR #779 (`/api/token-attribution`) also computes cache stats from the same JSONL files — if both merge, watch for the two endpoints diverging on `cache_hit_ratio` due to different scan windows or normalisation logic
- Blueprint registration: `routes/cache_analytics.py` must be imported and registered in `dashboard.py`
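A zero-guarded ratio along the lines the first bullet asks for might look like this (a hypothetical helper, not the PR's actual code):

```python
def cache_hit_ratio(cache_read: int, normal_input: int) -> float:
    """Fraction of input tokens served from the prompt cache.

    Returns 0.0 when no tokens were seen at all, so an empty workspace
    yields a clean zero rather than a ZeroDivisionError or a NaN leaking
    into the JSON response.
    """
    denominator = cache_read + normal_input
    if denominator == 0:
        return 0.0
    return cache_read / denominator
```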
Issue link
- Closes #851
vivekchand
left a comment
Test plan & review notes
Repo: vivekchand/clawmetry
What changed
- New `GET /api/cache-analytics` endpoint (`routes/cache_analytics.py`) that scans 7 days of session JSONL files for `cacheRead`/`cacheWrite` usage, plus a matching overview card in `overview.html` and `loadCacheAnalytics()` in `app.js`.
Merge state
- Branch is currently dirty (merge conflict with `main`) — needs a rebase or merge before this can land.
Smoke commands
- `make test` or `make test-api` — runs the 7 new `TestCacheAnalytics` cases
- `python3 dashboard.py --port 8900`, then open the Overview tab to see the cache card render
What to look at visually
- `http://localhost:8900/` — Overview tab: the "Prompt Cache Analytics" card should appear between the existing top cards and the refresh bar
- `http://localhost:8900/api/cache-analytics` — raw JSON; confirm all 8 keys are present and `series_daily` has exactly 7 entries
Likely failure modes from the diff
- Module-level cache is process-global: `_cache_result`/`_cache_ts` are bare module globals. Under multi-worker deployments (Waitress with threads) two requests can race past the TTL check and both recompute simultaneously — not a correctness bug, but worth a `threading.Lock` if the scan is slow on large workspaces.
- File-mtime cutoff skips live sessions: the 7-day cutoff is applied to `file_mtime` before reading any lines, so a session file last written >7 days ago (e.g. a long-running stale session) is skipped entirely even if it has recent timestamps inside. Probably the intended behaviour, but worth confirming.
- `cacheRead`/`cacheWrite` key names: the code reads `usage.get("cacheRead")` — confirm OpenClaw JSONL actually uses camelCase here rather than `cache_read_tokens`/`cache_creation_input_tokens` (the Anthropic API field names). A silent zero read is the failure mode if the keys don't match.
- Savings math uses a fixed Anthropic price: `$3.00`/`$0.30` per 1M is hardcoded and doesn't account for other providers or model-tier pricing variations; sessions using non-Anthropic models will silently over- or under-count savings.
- SVG innerHTML injection: `svgEl.innerHTML = svgContent` is built from server-provided strings. If `model` names or session IDs ever end up in SVG content in a future extension, this is an XSS surface — fine for now, but worth noting.
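One way to defuse the key-name risk above is to accept both spellings when reading `usage`; a hedged sketch (`extract_cache_tokens` and the exact key lists are illustrative — confirm the real JSONL field names before relying on this):

```python
# Accept both the camelCase names this PR assumes (cacheRead/cacheWrite)
# and the Anthropic API snake_case names, so a mismatched JSONL writer
# degrades gracefully instead of silently reporting zero cache hits.
_READ_KEYS = ("cacheRead", "cache_read_input_tokens", "cache_read_tokens")
_WRITE_KEYS = ("cacheWrite", "cache_creation_input_tokens")

def extract_cache_tokens(usage: dict) -> tuple:
    """Return (cache_read, cache_write) token counts from a usage dict."""
    read = next((usage[k] for k in _READ_KEYS if k in usage), 0)
    write = next((usage[k] for k in _WRITE_KEYS if k in usage), 0)
    return int(read or 0), int(write or 0)
```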
Issue link
- Closes #851 ✓ (already linked in PR body)
vivekchand force-pushed from afa43b5 to 286a312

Auto-rebase pushed; CI now running. If still not green in 10 min, may need manual attention.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vivekchand force-pushed from 286a312 to 5246ffb

Auto-rebase pushed; CI now running. If still not green in 10 min, may need manual attention.
Closes #851
What
Dedicated prompt cache analytics panel showing how effectively Anthropic prompt caching is being used across sessions.
How
- `GET /api/cache-analytics` in `routes/cache_analytics.py` — scans 7 days of session JSONL files for `cacheRead`/`cacheWrite` usage data
- `loadCacheAnalytics()` in `app.js`, following the `loadAutonomy()` pattern

Response shape

```json
{
  "cache_hit_ratio": 0.73,
  "total_cache_read_tokens": 1234567,
  "total_cache_write_tokens": 456789,
  "total_input_tokens": 2345678,
  "estimated_savings_usd": 3.33,
  "series_daily": [...],
  "per_model": [...],
  "per_session": [...]
}
```

Cost savings estimation
Uses Anthropic pricing: cached tokens cost $0.30/1M vs $3.00/1M for normal input, so each cached token saves $2.70/1M.
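That arithmetic can be checked directly; a sketch using the prices stated above (the module-level constant names here are illustrative — the PR itself uses per-token constants `_NORMAL_INPUT_PER_TOKEN`/`_CACHED_INPUT_PER_TOKEN`):

```python
# Stated Anthropic pricing: $3.00 per 1M normal input tokens vs $0.30 per
# 1M cached input tokens, i.e. $2.70 saved per million cache-read tokens.
NORMAL_INPUT_PER_MTOK = 3.00
CACHED_INPUT_PER_MTOK = 0.30

def estimated_savings_usd(cache_read_tokens: int) -> float:
    """USD saved by serving the given token count from the prompt cache."""
    per_token_saving = (NORMAL_INPUT_PER_MTOK - CACHED_INPUT_PER_MTOK) / 1_000_000
    return cache_read_tokens * per_token_saving
```

With the example payload above, 1,234,567 cache-read tokens × $2.70/1M ≈ $3.33, matching the sample `estimated_savings_usd`.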
Tests
7 test cases in `TestCacheAnalytics` covering endpoint availability, response structure, types, and value ranges.