
docs(readme): add per-paper SVG bar charts + Mermaid pipeline diagram #6

Merged
Declade merged 3 commits into main from feat/readme-figures-and-mermaid on May 24, 2026


@Declade Declade commented May 24, 2026

Summary

Adds 4 generated SVG figures + a native Mermaid block to the README to break up walls of text and visualise the per-category numbers behind Paper 1 (HIPAA Safe Harbor / MTSamples) and Paper 2 (GLBA NPI / CFPB).

  • scripts/build-figures.ts — deterministic SVG generator, reads Paper 1 raw NDJSON + Paper 2 SUMMARY-tuned.json, emits 4 SVGs.
  • docs/figures/paper1-recall-per-category.svg (6.6 KB) — 18 HIPAA Safe Harbor categories × recall, green ≥ 90 % / orange < 90 %.
  • docs/figures/paper2-recall-per-category.svg (6.2 KB) — 17 GLBA NPI categories × recall, same visual treatment.
  • docs/figures/paper1-vs-paper2-precision-recall.svg (4.8 KB) — shared-category comparison.
  • docs/figures/methodology-pipeline.svg (3.6 KB) — static SVG twin of the inline Mermaid L1→L4 diagram.
  • README: figures inlined under each paper, new "How it works" section with the Mermaid diagram, "Regenerate the README figures" subsection, structure-tree updates.

What stayed unchanged

The existing description, regulatory framing (Codex r1→r2 approved Annex III phrasing), citations, reproduce commands, methodology summary — none of it was touched. Figures are added on top.

Aggregated numbers (cross-check)

  • Paper 1: 79.0 % overall recall (8 916 TP / 11 285 ground-truth). Weakest category by impact: DATE (53.9 % recall), which accounts for 77 % of all false negatives (FN).
  • Paper 2 (tuned): 72.2 % overall recall (8 118 TP / 11 243 ground-truth). Weak categories cluster around digit-shape ambiguity (CARD_EXPIRATION, ACCOUNT_BALANCE, BANK_ACCOUNT_NUMBER, CARD_CVV).
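
The two overall recall figures above can be re-derived from the quoted counts. The following is a minimal sketch of that arithmetic, not code from the repository; the `Totals` type and the helper names are hypothetical.

```typescript
// Hypothetical container for the counts quoted in the PR description.
interface Totals {
  truePositives: number;
  groundTruth: number;
}

// Recall = TP / ground-truth positives, formatted to one decimal place.
function recallPercent(t: Totals): string {
  if (t.groundTruth <= 0) {
    throw new Error("groundTruth must be positive");
  }
  return ((t.truePositives / t.groundTruth) * 100).toFixed(1);
}

// False negatives are whatever part of the ground truth was not matched.
function falseNegatives(t: Totals): number {
  return t.groundTruth - t.truePositives;
}

const paper1Totals: Totals = { truePositives: 8916, groundTruth: 11285 };
const paper2Totals: Totals = { truePositives: 8118, groundTruth: 11243 };

console.log(recallPercent(paper1Totals), falseNegatives(paper1Totals)); // 79.0 2369
console.log(recallPercent(paper2Totals), falseNegatives(paper2Totals)); // 72.2 3125
```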

Constraint compliance

  • Zero new npm dependencies (SVGs built from typed template literals).
  • Deterministic: identical inputs → byte-identical outputs (verified by running twice + shasum).
  • All 4 SVGs ≤ 7 KB (well under the 50 KB constraint).
  • Dark-mode-friendly via embedded prefers-color-scheme media queries.
  • pnpm typecheck && pnpm typecheck:test && pnpm test → 46/46 PASS.
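
As an illustration of what "SVGs built from typed template literals" with an embedded dark-mode block can look like, here is a minimal sketch. It is not the code in scripts/build-figures.ts: the `Bar` type, the `renderBars` function, the layout constants, and the hex colours are all assumptions. Only the 90 % green/orange threshold and the DATE recall value come from the description above.

```typescript
// Hypothetical input row: one category and its recall as a ratio in [0, 1].
interface Bar {
  label: string;
  recall: number;
}

// Escape the characters that are unsafe inside SVG text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Pure function of its input: no timestamps, no randomness, fixed number formats.
function renderBars(bars: readonly Bar[]): string {
  const rowHeight = 20;
  const labelWidth = 160;
  const barMax = 300;
  const width = 560;
  const height = bars.length * rowHeight + 10;

  const rows = bars
    .map((bar, i) => {
      const y = 5 + i * rowHeight;
      const cls = bar.recall >= 0.9 ? "ok" : "weak"; // green >= 90 %, orange below
      const barWidth = (bar.recall * barMax).toFixed(1);
      const pct = (bar.recall * 100).toFixed(1);
      return [
        `  <text x="0" y="${y + 12}">${escapeXml(bar.label)}</text>`,
        `  <rect class="${cls}" x="${labelWidth}" y="${y}" width="${barWidth}" height="14"/>`,
        `  <text x="${labelWidth + barMax + 8}" y="${y + 12}">${pct}%</text>`,
      ].join("\n");
    })
    .join("\n");

  // Colours are placeholders; the dark-mode override lives inside the SVG itself.
  return `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${width} ${height}">
  <style>
    text { font: 12px sans-serif; fill: #1f2328; }
    .ok { fill: #2da44e; }
    .weak { fill: #d4720b; }
    @media (prefers-color-scheme: dark) { text { fill: #e6edf3; } }
  </style>
${rows}
</svg>
`;
}

// DATE recall is quoted in the PR; the second row is a placeholder value.
const demoSvg = renderBars([
  { label: "DATE", recall: 0.539 },
  { label: "PLACEHOLDER", recall: 0.95 },
]);
console.log(demoSvg.length > 0);
```

Because the renderer only concatenates strings from its arguments, it needs no charting dependency, and calling it twice with the same input yields the same string.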

How to regenerate

pnpm run build-figures

To add a new figure: write a render…() function in scripts/build-figures.ts, push it into the figures array in main(). No new dependencies needed.
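
The registration step described above might look roughly like the sketch below. The `Figure` shape, the `renderExample` name, and the output file name are hypothetical, and this stand-in `main()` returns the rendered strings instead of writing files so the example stays self-contained; the real figures array may be shaped differently.

```typescript
// Hypothetical shape of one entry in the figures array.
interface Figure {
  fileName: string;
  render: () => string;
}

// Hypothetical new renderer: a tiny fixed-size placeholder figure.
function renderExample(): string {
  const width = 200;
  const height = 40;
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${width} ${height}">` +
    `<text x="10" y="25">example</text></svg>\n`
  );
}

// Stand-in for main(): collects rendered figures keyed by file name.
function main(): Record<string, string> {
  const figures: Figure[] = [];
  figures.push({ fileName: "example.svg", render: renderExample });

  const rendered: Record<string, string> = {};
  for (const figure of figures) {
    rendered[figure.fileName] = figure.render();
  }
  return rendered;
}

console.log(Object.keys(main()));
```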

Test plan

  • Re-run pnpm run build-figures and confirm byte-identical SVG output (shasum diff clean).
  • pnpm typecheck && pnpm typecheck:test && pnpm test all green.
  • GitHub renders Mermaid block + SVGs inline in the README preview.
  • All 4 SVGs render visibly in light + dark mode on GitHub.
  • Reviewer chain: bug-hunter-reviewer + claim-enforcement-guard + personal-info-leak-detector + regulator-validator.
  • Codex round 1 substantive-PASS.
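
The first test-plan item (run twice, compare hashes) can also be expressed in code. This is a sketch under the assumption that a figure renderer is a zero-argument function returning the SVG text; `sha256Hex` and `isDeterministic` are hypothetical helpers, not part of the repository.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a UTF-8 string.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Calls the renderer `runs` times and reports whether every output hashes the same.
function isDeterministic(render: () => string, runs: number = 2): boolean {
  const first = sha256Hex(render());
  for (let i = 1; i < runs; i++) {
    if (sha256Hex(render()) !== first) {
      return false;
    }
  }
  return true;
}

// A stable renderer passes; one that leaks a counter into its output fails.
let callCount = 0;
console.log(isDeterministic(() => "<svg/>")); // true
console.log(isDeterministic(() => `<svg data-run="${callCount++}"/>`)); // false
```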

Declade and others added 3 commits May 24, 2026 21:44
Adds 4 generated SVG figures + a native Mermaid block to break up the
README walls of text and visualise the per-category numbers behind both
papers.

What's new

- `scripts/build-figures.ts` — deterministic SVG generator. Reads Paper 1
  from `paper1-AFTER-500row-20260522T080037Z.ndjson` (aggregating
  gateway-attested matches/missed by HIPAA Safe Harbor category) and
  Paper 2 from `SUMMARY-tuned.json` (already aggregated). Re-runnable
  via `pnpm run build-figures`; identical inputs → byte-identical output.

- `docs/figures/paper1-recall-per-category.svg` (6.6 KB) — 18 HIPAA
  Safe Harbor categories × recall, color-coded ≥/<90 %.

- `docs/figures/paper2-recall-per-category.svg` (6.2 KB) — 17 GLBA NPI
  categories × recall, same visual treatment.

- `docs/figures/paper1-vs-paper2-precision-recall.svg` (4.8 KB) —
  shared-category comparison (NAME, EMAIL, PHONE, SSN, DATE/DOB,
  ADDRESS).

- `docs/figures/methodology-pipeline.svg` (3.6 KB) — static SVG twin of
  the inline Mermaid L1→L4 pipeline diagram (for non-GitHub renderers
  that don't render Mermaid natively).

- README updates: SVG figure inlines in the Published papers section,
  a new "How it works" section with the Mermaid diagram + L1-L4
  layer description, a "Regenerate the README figures" subsection in
  Reproduce, and updates to the Repository structure tree.

What stayed unchanged

The existing description, regulatory framing, citations, reproduce
commands, and methodology summary were not touched — figures are added
on top of the previously shipped content. No paper data, no datasets,
no harness code touched.

Verification

- `pnpm typecheck && pnpm typecheck:test && pnpm test` → 46/46 PASS.
- Re-running `pnpm run build-figures` twice produces byte-identical
  SVGs (shasum diff clean).
- All 4 SVGs ≤7 KB (constraint: ≤50 KB each).
- Aggregated overall numbers cross-check: Paper 1 79.0 % recall
  (8 916 TP / 11 285), Paper 2 72.2 % recall (8 118 TP / 11 243).

Constraint compliance

- Zero new npm deps (SVGs built from typed template literals).
- Determinism: no timestamps, no PRNG, no locale-dependent number formats.
- Dark-mode-friendly via embedded `prefers-color-scheme` media queries.
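
To make the "no locale-dependent number formats" point concrete, here is a small sketch of number formatting that avoids `toLocaleString` and `Intl`, whose output can vary with the runtime's locale data. The helper names and the choice of a plain space as thousands separator are assumptions, not taken from scripts/build-figures.ts.

```typescript
// Groups digits in threes with a fixed separator instead of relying on the host locale.
function formatCount(n: number): string {
  if (!Number.isInteger(n) || n < 0) {
    throw new Error(`expected a non-negative integer, got ${n}`);
  }
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, " ");
}

// toFixed always uses "." as the decimal mark, regardless of locale.
function formatPercent(ratio: number): string {
  return `${(ratio * 100).toFixed(1)} %`;
}

console.log(formatCount(11285)); // 11 285
console.log(formatPercent(0.539)); // 53.9 %
```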
…L2 labels + aria descriptions + dark-mode + source labels

- HIGH (reproducibility): Decouple build-figures.ts from the gitignored Paper 1 NDJSON. New script aggregate-paper1-summary.ts produces papers/paper-1-healthcare/SUMMARY-tuned.json (2.6 KB, checked in) from the raw harness output (~5 MB, gitignored per the "only summaries are checked in" repo convention, mirroring Paper 2). build-figures.ts now reads only committed inputs, so a fresh git clone can regenerate all four SVGs.
- HIGH (L1/L2 labels): Correct the Mermaid diagram in README and the static methodology-pipeline.svg. L1 = "Known-entity match (exact / phonetic / fuzzy)", L2 = "Presidio NER + custom recognizers", L3 = PII Shield (Qwen 2.5 7B), L4 = reid-guard (Llama-3.1-8B). Deny-list / safelist is documented explicitly as a post-detection FP filter applied across L1+L2, NOT a layer. Aligned with services/sanitizer/known_entity.py:155, services/sanitizer/presidio_scan.py:147, and the L4 reid-guard PRD.
- MED (malformed NDJSON): aggregate-paper1-summary.ts throws a descriptive Error citing file:line on any malformed JSON line and exits non-zero without writing output. build-figures.ts wraps both summary readers in try/catch with descriptive errors. Published benchmark figures can no longer be generated from a silently-truncated input.
- LOW (SVG accessibility): All four SVGs now emit a <desc> element with a 1-2 sentence figure description, and the root <svg> declares aria-describedby in addition to the existing aria-labelledby + <title>. Element ids are namespaced by figure dimensions to avoid collisions when multiple SVGs render on one page.
- LOW (dark-mode): The pipeline diagram's grouping-bracket rectangles now use class="groupBracket" with a prefers-color-scheme dark override (#30363d) instead of the hardcoded #d0d7de stroke that ignored the dark-mode block.
- NIT (source label): Strip the "Source: " prefix from SOURCE_LABEL_P1 / SOURCE_LABEL_P2 constants and let the renderer prepend it once. Eliminates the "Source: Source: ... + Source: ..." double-prefix in the comparison chart footnote.
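
The fail-fast NDJSON behaviour described in the MED item could be sketched as follows. This is an illustration, not the contents of aggregate-paper1-summary.ts: `parseNdjson` is a hypothetical name, it takes the file text as a string to stay self-contained, and skipping blank lines is an assumption.

```typescript
// Parses NDJSON text strictly: any malformed line aborts with a file:line error.
function parseNdjson(text: string, fileLabel: string): unknown[] {
  const rows: unknown[] = [];
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "") {
      continue; // assumption: blank lines (such as a trailing newline) are allowed
    }
    try {
      rows.push(JSON.parse(line));
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      // Line numbers are 1-based so they match what an editor shows.
      throw new Error(`${fileLabel}:${i + 1}: malformed JSON (${reason})`);
    }
  }
  return rows;
}

console.log(parseNdjson('{"a":1}\n{"b":2}\n', "example.ndjson").length); // 2
```

A caller would write the summary only after `parseNdjson` returns, so a truncated or corrupted input never produces a partial output file.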

Determinism preserved: ran pnpm build-figures twice; all four SVG SHA1s identical. All four SVGs still well under 50 KB. pnpm typecheck + typecheck:test + test all PASS (46 tests).
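
For the SVG accessibility item in the list above, the root element wiring might look like this sketch. The `svgOpen` helper, the `role="img"` attribute, and the exact `fig-<width>x<height>` id scheme are assumptions; the commit message only states that ids are namespaced by figure dimensions and that `<title>`, `<desc>`, `aria-labelledby`, and `aria-describedby` are emitted.

```typescript
// Escape the characters that are unsafe inside SVG text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds the opening <svg> tag plus <title> and <desc>, with ids derived from
// the figure dimensions so two figures on one page do not share ids.
function svgOpen(width: number, height: number, title: string, desc: string): string {
  const ns = `fig-${width}x${height}`;
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 ${width} ${height}" role="img"`,
    `     aria-labelledby="${ns}-title" aria-describedby="${ns}-desc">`,
    `  <title id="${ns}-title">${escapeXml(title)}</title>`,
    `  <desc id="${ns}-desc">${escapeXml(desc)}</desc>`,
  ].join("\n");
}

console.log(svgOpen(880, 400, "Recall per category", "Horizontal bars, one per category."));
```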

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1. SUMMARY-tuned.json input_path leak (HIGH per personal-info-leak-detector)

   aggregate-paper1-summary.ts wrote the absolute resolved input path into
   the summary's `input_path` field, so the committed SUMMARY-tuned.json
   contained `/Users/marcschuelke/lucairn-research/...`. Split the function
   signature into (inputAbs, inputLabel): inputAbs is used for fs reads + error
   messages, inputLabel is the repo-relative (or user-supplied) path embedded
   into the summary. Matches Paper 2's SUMMARY-tuned.json convention.

   Regenerated papers/paper-1-healthcare/SUMMARY-tuned.json with the relative
   path. Determinism preserved: rerun produces byte-identical output.

2. SVG right-edge clipping in all four figures

   - Bar charts (Paper 1 / Paper 2 / comparison): WIDTH 760 -> 880 with
     RIGHT_PAD bumped to 130 (per-paper) / 120 (comparison). At 100% recall
     with 4-digit n, the "<pct>% (n=NNNN)" / "Paper N: <pct>%" right-side
     labels overran the 760px viewBox by 25-40px and were clipped on the
     right edge.
   - Pipeline diagram: WIDTH 760 -> 800 + HEIGHT 320 -> 340. The rightmost
     box ("Signed claim", x=640 w=130) ended at x=770, 10px past the old
     viewBox. The "out: signed claim -> customer" footer (text-anchor=end at
     x=770) was clipped the same way. The 270-char footer note was also
     split onto two stacked lines so it stops overrunning the right edge.
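
The path split from item 1 can be sketched as below. Only the `input_path` field name and the `inputAbs` / `inputLabel` parameter names come from the commit message; the `Summary` shape, the `rows` field, `toRepoLabel`, the injected `readText` callback, and the example paths are hypothetical.

```typescript
import { posix } from "node:path";

// Hypothetical, trimmed-down summary shape.
interface Summary {
  input_path: string;
  rows: number;
}

// Derives a repo-relative label; refuses paths that escape the repository root.
function toRepoLabel(repoRoot: string, inputAbs: string): string {
  const rel = posix.relative(repoRoot, inputAbs);
  if (rel === "" || rel === ".." || rel.startsWith("../")) {
    throw new Error(`${inputAbs} is not a file inside ${repoRoot}`);
  }
  return rel;
}

// inputAbs is used for I/O only; inputLabel is the only path that gets serialised.
function buildSummary(
  inputAbs: string,
  inputLabel: string,
  readText: (path: string) => string,
): Summary {
  const text = readText(inputAbs);
  const rows = text.split("\n").filter((line) => line.trim() !== "").length;
  return { input_path: inputLabel, rows };
}

const repoRoot = "/home/alice/repo";
const inputAbs = "/home/alice/repo/papers/raw/run.ndjson";
const summary = buildSummary(inputAbs, toRepoLabel(repoRoot, inputAbs), () => '{"a":1}\n{"b":2}\n');
console.log(JSON.stringify(summary)); // {"input_path":"papers/raw/run.ndjson","rows":2}
```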

Verification:
  - `rsvg-convert -f png` round-trip on all four SVGs: zero clipping at the
    right edge in any figure; "Signed claim" box and footer text fully
    visible in pipeline diagram.
  - `pnpm typecheck && pnpm typecheck:test && pnpm test` all PASS (46 tests).
  - Determinism check: `pnpm aggregate:paper1 && pnpm build-figures` rerun
    produces byte-identical sha256 hashes for the summary + all four SVGs.
  - File sizes: all four SVGs <7KB (well under the 50KB ceiling).
  - `grep -r "/Users/" papers/paper-1-healthcare/` returns nothing.
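
The pipeline-diagram clipping in item 2 comes down to comparing each element's right edge with the viewBox width. A minimal sketch of that check follows, using the "Signed claim" coordinates quoted above; the `Box` type and `overflowPx` helper are hypothetical.

```typescript
// Hypothetical description of a rectangle placed in the diagram.
interface Box {
  label: string;
  x: number;
  width: number;
}

// Number of pixels by which the box extends past the right edge of the viewBox.
function overflowPx(box: Box, viewBoxWidth: number): number {
  return Math.max(0, box.x + box.width - viewBoxWidth);
}

const signedClaim: Box = { label: "Signed claim", x: 640, width: 130 };

console.log(overflowPx(signedClaim, 760)); // 10 (old width: right edge at 770)
console.log(overflowPx(signedClaim, 800)); // 0  (new width)
```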

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Declade Declade merged commit 53bf600 into main May 24, 2026
0 of 6 checks passed