From f7fb4679e378c7ee0927cec43d61fabf3e24efee Mon Sep 17 00:00:00 2001 From: john988 Date: Wed, 17 Jun 2026 11:28:02 +0800 Subject: [PATCH 1/2] =?UTF-8?q?docs:=20add=20AUDIT.md=20=E2=80=94=20point-?= =?UTF-8?q?in-time=20security/quality=20audit=20with=20current=20status?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Multi-dimension audit of the pre-v1.4.0 local copy (73 confirmed findings). Most are already resolved by v1.4.0 and PR #140; the doc marks the three still genuinely open (scanner shrink-skip, today/week timezone, dashboard query cache) and records the refuted/safe items. --- AUDIT.md | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 AUDIT.md diff --git a/AUDIT.md b/AUDIT.md new file mode 100644 index 0000000..225efdb --- /dev/null +++ b/AUDIT.md @@ -0,0 +1,74 @@ +# Security & Quality Audit — claude-usage + +**Audited:** the pre-sync local working copy (`scanner.py` / `cli.py` / `dashboard.py` +at commit `d3b7985` plus uncommitted changes), 2026-06-17. +**Method:** 8-dimension review (scanner correctness, cost accuracy, data integrity, +security, cross-platform, performance, maintainability, docs) with adversarial +per-finding verification — 80 raised → **73 confirmed / 7 refuted**. + +> ⚠️ **Read this first.** The audited tree was ~2 months behind upstream. Syncing +> to **v1.4.0** resolved most findings, and the subagent + ccusage work in **PR +> #140** resolves several more. The status column reflects the **current** code, +> not the audited snapshot. Only the rows marked **⚠️ OPEN** still need action. 
+ +## Severity summary (confirmed, as of the audited snapshot) + +`0 critical · 2 high · 9 medium · 42 low · 20 info` + +## Key findings & current status + +| Finding | Severity | Status | +|---|---|---| +| `claude-opus-4-8` mispriced ~3× via greedy `claude-opus-4` prefix match | High | ✅ Fixed in v1.4.0 (explicit 4.8 entry) | +| Cross-file token inflation (session totals accumulate duplicates) | Med | ✅ Fixed in v1.4.0 (`message_id` unique index + recompute-from-`turns`) | +| CLI had no billable gate → unknown/local models charged Sonnet rates | Med | ✅ Fixed in v1.4.0 (`get_pricing` → `None` → $0) | +| `claude-haiku-4-6` (and future minors) mispriced by CLI | Med | ✅ Fixed in v1.4.0 | +| README pricing table omitted models / `opus-4-8` | Med | ✅ Fixed in v1.4.0 | +| Pricing duplicated in `cli.py` (Python) vs `dashboard.py` (JS) → drift | Med | ✅ Fixed in **PR #140** (`pricing.py` single source; JS reads `/api/data`) | +| DOM XSS: model/project/agent names into `innerHTML` unescaped | Low | ✅ Fixed in v1.4.0 (`esc()`) | +| Single-threaded server: a slow `/api/data` blocks other requests | Low | ✅ Fixed in v1.4.0 (`ThreadingHTTPServer`) | +| Incremental scan read each updated file multiple times | Low | ✅ Fixed in v1.4.0 | +| No automated tests / CI | Med | ✅ Fixed in v1.4.0 (`tests/` + GitHub Actions) | +| `launch.json` ran `python dashboard.py` (no scan; `python` on macOS) | Med | ✅ Fixed in v1.4.0 (orphan `launch.json` removed) | +| Usage-limit events not tracked | Low | ✅ Added in **PR #140** (`limit_events`, gated on `isApiErrorMessage`) | +| "Transcript ≠ Anthropic billing" not disclosed | Info | ✅ Added in **PR #140** (footer caveat) | +| **Scanner: a shrunk/compacted JSONL is skipped forever** | Med | ⚠️ **OPEN** | +| **`today` / `week` compare local date vs UTC timestamps** | Low | ⚠️ **OPEN** | +| **Dashboard recomputes every query per request + ships full history; no cache** | Med (perf) | ⚠️ **OPEN** | + +## Still open — actionable on the current 
codebase + +1. **Scanner shrink/compaction permanent-skip** (`scanner.py`, the + `if line_count <= old_lines:` branch). It updates `processed_files.mtime` but + not `lines`; on the next scan the mtime matches and the file is skipped, so a + compacted (rewritten-smaller) transcript is never re-ingested and stale turns + linger. **Fix:** also `SET lines = ?` in that branch (and, for full + correctness, treat `current_lines < old_lines` as a rewrite — delete the + file's turns and re-parse). Low real-world frequency (transcripts are usually + append-only), hence Medium. + +2. **Timezone "today"/"week"** (`cli.py`): `date.today()` (local) is compared to + `substr(timestamp,1,10)` (UTC). Near midnight, users far from UTC see the + wrong day. The dashboard's `getRangeBounds` has the same local-vs-UTC seam. + **Fix:** compare in a single, explicit timezone. + +3. **Dashboard query cost** (`dashboard.py:get_dashboard_data`): every + `/api/data` hit runs all GROUP BY/JOIN queries from scratch and the client + re-fetches the full history each 30s. Fine today; it won't scale to very large + DBs. **Fix:** an mtime-keyed result cache + server-side date filtering. + +## Verified safe (refuted findings) + +- **SQL injection** via the `PRAGMA table_info(...)` and `AGENT_TYPE_EXPR` + f-strings — not exploitable; every interpolated value is an internal constant. +- **Cross-platform paths** — `Path.home()` + `pathlib` + checking both + `\subagents\` and `/subagents/` is correct on Windows and POSIX. +- **CLI vs dashboard cache pricing** — numerically identical (the CLI's derived + `input×0.10/×1.25` equals the dashboard's explicit per-model cache rates). + +## Full data + +The complete per-finding output (all 73, each with its adversarial-verification +rationale) was produced by the audit workflow and is large; it lives outside the +repo at the run's task-output file (`tasks/w2o3go5v6.output`) and was not +committed. 
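The scanner fix proposed in item 1 of this patch (store `lines` together with `mtime`, and recognise a shrink as a rewrite) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `scanner.py`: the `processed_files` column layout, the `record_scan` helper, and the returned action names are all hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with an assumed processed_files layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_files (path TEXT PRIMARY KEY, mtime REAL, lines INTEGER)"
    )
    return conn


def record_scan(conn, path, mtime, line_count):
    """Update bookkeeping for one transcript file (hypothetical helper).

    Returns (action, first_new_line). first_new_line is the index at which
    parsing should resume, or None when there is nothing new to parse.
    """
    row = conn.execute(
        "SELECT mtime, lines FROM processed_files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO processed_files (path, mtime, lines) VALUES (?, ?, ?)",
            (path, mtime, line_count),
        )
        return ("new", 0)

    old_mtime, old_lines = row
    if mtime == old_mtime:
        return ("unchanged", None)

    # Key point: mtime and lines are always written together, so the stored
    # line count can never stay stale after a shrink.
    conn.execute(
        "UPDATE processed_files SET mtime = ?, lines = ? WHERE path = ?",
        (mtime, line_count, path),
    )
    if line_count < old_lines:
        # Rewritten smaller (compaction). A fuller fix would also purge and
        # re-parse this file's turns; this sketch only resyncs the counter.
        return ("shrunk", None)
    if line_count == old_lines:
        return ("touched", None)
    return ("appended", old_lines)
```

With the counter resynced on shrink, a file compacted from 10 lines to 4 and later appended to 6 lines resumes parsing at line 4 rather than being compared against the stale count of 10.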
From 50db034c43d148ea6e9f36fdb9fdf1e017698cae Mon Sep 17 00:00:00 2001 From: john988 Date: Wed, 17 Jun 2026 11:40:27 +0800 Subject: [PATCH 2/2] docs(audit): mark the 3 remaining findings fixed in PR #140 Scanner shrink-skip, today/week UTC dates, and dashboard /api/data caching are resolved; note the residual stale-turn limitation on compaction. --- AUDIT.md | 45 +++++++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 20 deletions(-) diff --git a/AUDIT.md b/AUDIT.md index 225efdb..7fd1e95 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -8,8 +8,10 @@ per-finding verification — 80 raised → **73 confirmed / 7 refuted**. > ⚠️ **Read this first.** The audited tree was ~2 months behind upstream. Syncing > to **v1.4.0** resolved most findings, and the subagent + ccusage work in **PR -> #140** resolves several more. The status column reflects the **current** code, -> not the audited snapshot. Only the rows marked **⚠️ OPEN** still need action. +> #140** resolves the rest. The status column reflects the **current** code, not +> the audited snapshot. As of PR #140 every confirmed finding is addressed; the +> section below records the three that were still open after the v1.4.0 sync and +> how PR #140 fixed them. One residual limitation is noted inline. ## Severity summary (confirmed, as of the audited snapshot) @@ -32,30 +34,33 @@ per-finding verification — 80 raised → **73 confirmed / 7 refuted**. 
| `launch.json` ran `python dashboard.py` (no scan; `python` on macOS) | Med | ✅ Fixed in v1.4.0 (orphan `launch.json` removed) | | Usage-limit events not tracked | Low | ✅ Added in **PR #140** (`limit_events`, gated on `isApiErrorMessage`) | | "Transcript ≠ Anthropic billing" not disclosed | Info | ✅ Added in **PR #140** (footer caveat) | -| **Scanner: a shrunk/compacted JSONL is skipped forever** | Med | ⚠️ **OPEN** | -| **`today` / `week` compare local date vs UTC timestamps** | Low | ⚠️ **OPEN** | -| **Dashboard recomputes every query per request + ships full history; no cache** | Med (perf) | ⚠️ **OPEN** | +| Scanner: a shrunk/compacted JSONL was skipped forever | Med | ✅ Fixed in **PR #140** (shrink path syncs `lines`, not just mtime) | +| `today` / `week` compared local date vs UTC timestamps | Low | ✅ Fixed in **PR #140** (UTC date in CLI; `getRangeBounds` UTC) | +| Dashboard recomputed every query per request | Med (perf) | ✅ Fixed in **PR #140** (mtime-keyed `/api/data` cache) | -## Still open — actionable on the current codebase +## Open after the v1.4.0 sync — now fixed in PR #140 1. **Scanner shrink/compaction permanent-skip** (`scanner.py`, the - `if line_count <= old_lines:` branch). It updates `processed_files.mtime` but - not `lines`; on the next scan the mtime matches and the file is skipped, so a - compacted (rewritten-smaller) transcript is never re-ingested and stale turns - linger. **Fix:** also `SET lines = ?` in that branch (and, for full - correctness, treat `current_lines < old_lines` as a rewrite — delete the - file's turns and re-parse). Low real-world frequency (transcripts are usually - append-only), hence Medium. + `if line_count <= old_lines:` branch). It updated `processed_files.mtime` but + not `lines`; on the next scan the mtime matched and the file was skipped, so a + compacted (rewritten-smaller) transcript was never re-ingested. **Fixed:** the + branch now also `SET lines = ?`, so later appends are detected. 
+ *Residual limitation:* stale turns from the pre-compaction content aren't + purged (turns aren't linked to a source file); a full re-ingest-on-shrink + would need a `source_file` column on `turns`. Low real-world frequency + (transcripts are usually append-only). -2. **Timezone "today"/"week"** (`cli.py`): `date.today()` (local) is compared to - `substr(timestamp,1,10)` (UTC). Near midnight, users far from UTC see the - wrong day. The dashboard's `getRangeBounds` has the same local-vs-UTC seam. - **Fix:** compare in a single, explicit timezone. +2. **Timezone "today"/"week"** (`cli.py`): `date.today()` (local) was compared to + `substr(timestamp,1,10)` (UTC), so near midnight users far from UTC saw the + wrong day; the dashboard's `getRangeBounds` had the same seam. **Fixed:** the + CLI uses a UTC date (`_utc_today()`) and `getRangeBounds` uses UTC date math. 3. **Dashboard query cost** (`dashboard.py:get_dashboard_data`): every - `/api/data` hit runs all GROUP BY/JOIN queries from scratch and the client - re-fetches the full history each 30s. Fine today; it won't scale to very large - DBs. **Fix:** an mtime-keyed result cache + server-side date filtering. + `/api/data` hit re-ran all GROUP BY/JOIN queries. **Fixed:** the payload is now + cached keyed on the DB's path + mtime, so the 30s poll reuses it until a + scan/ingest changes the DB. *Note:* the client still receives full history and + filters client-side; server-side date filtering remains a larger future change + (acceptable at current scale). ## Verified safe (refuted findings)
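The cache described in item 3 (payload keyed on the DB's path and mtime) can be sketched roughly as below. This is an assumption-laden illustration, not the implementation in `dashboard.py`: `cached_payload` and the `build` callback are hypothetical names, and the lock is included only because the server is described as a `ThreadingHTTPServer`.

```python
import os
import threading

_cache_lock = threading.Lock()
_cache = {"key": None, "payload": None}


def cached_payload(db_path, build):
    """Return build(db_path), reusing the last result while the DB file is unchanged.

    `build` stands in for whatever computes the /api/data payload.
    """
    # The key changes whenever the file is modified or a different DB is used.
    key = (os.path.abspath(db_path), os.stat(db_path).st_mtime_ns)
    with _cache_lock:
        if _cache["key"] == key:
            return _cache["payload"]

    # Build outside the lock so a slow rebuild does not block other readers.
    payload = build(db_path)
    with _cache_lock:
        _cache["key"] = key
        _cache["payload"] = payload
    return payload
```

Two requests arriving at the same moment may both rebuild the payload, which is harmless here because the result is identical. One caveat worth checking against the real setup: if the SQLite database runs in WAL mode, commits may land in the `-wal` file before the main file's mtime changes, so a key based only on the main file could lag behind the latest writes.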