Advanced Forensic Validation Sprint — Complete
Commit: 1c18b98
Tests: 561 passing (+49 new)
Delivered
Replay subsystem (forensics/replay.py)
- execute_replay(case_dir, source_run_id, engine_version) re-parses evidence, re-runs plugins, diffs findings/hypotheses, persists replay_{id}.json to exports/, and writes an audit entry.
- States: EXACT_MATCH / EXPECTED_DRIFT / UNEXPECTED_DRIFT / INCOMPATIBLE. No synthetic or fake replays; missing evidence returns INCOMPATIBLE with a reason.
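The state classification can be sketched roughly as follows. `ReplayState` and `classify_replay` are hypothetical names (the real logic lives inside `execute_replay`); the decision order mirrors the description above:

```python
from enum import Enum

class ReplayState(Enum):
    EXACT_MATCH = "exact_match"
    EXPECTED_DRIFT = "expected_drift"
    UNEXPECTED_DRIFT = "unexpected_drift"
    INCOMPATIBLE = "incompatible"

def classify_replay(original: set, replayed: set,
                    evidence_present: bool, engine_changed: bool) -> ReplayState:
    """Classify a replay outcome from diffed finding sets (illustrative)."""
    if not evidence_present:
        # No fabricated replays: without the original evidence we refuse to guess.
        return ReplayState.INCOMPATIBLE
    if original == replayed:
        return ReplayState.EXACT_MATCH
    # A diff is "expected" only when the engine version changed between runs.
    return ReplayState.EXPECTED_DRIFT if engine_changed else ReplayState.UNEXPECTED_DRIFT
```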
Run model + diff engine (forensics/models.py, forensics/diff.py)
- AnalysisRun extended with parser/plugin/tuning provenance (backward-compatible defaults; from_dict drops unknown keys).
- compare_runs(case_dir, run_a, run_b) returns a RunComparison covering findings, plugin executions, hypotheses, and tuning-profile changes.
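The findings portion of such a comparison reduces to set arithmetic over stable finding identifiers. A minimal sketch, assuming identifier-based comparison; `FindingDiff` and `diff_findings` are illustrative names, not the actual `forensics/diff.py` API:

```python
from dataclasses import dataclass, field

@dataclass
class FindingDiff:
    added: set = field(default_factory=set)      # present only in run B
    removed: set = field(default_factory=set)    # present only in run A
    unchanged: set = field(default_factory=set)  # present in both runs

def diff_findings(run_a: set, run_b: set) -> FindingDiff:
    """Set-based diff of finding identifiers between two runs (illustrative)."""
    return FindingDiff(
        added=run_b - run_a,
        removed=run_a - run_b,
        unchanged=run_a & run_b,
    )
```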
Tuning profile system (forensics/tuning.py)
- TuningProfile.default() provides a per-plugin AnalyzerConfigProfile + ThresholdSet for all 11 plugins; threshold values mirror the analyzer source constants.
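The shape of such a profile can be sketched as below. The class layout follows the names above, but the plugin keys and threshold numbers are placeholders, not the real analyzer constants:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThresholdSet:
    warn: float
    critical: float

@dataclass(frozen=True)
class AnalyzerConfigProfile:
    enabled: bool
    thresholds: ThresholdSet

def default_profile() -> dict:
    """Hypothetical default profile keyed by plugin name (placeholder values)."""
    return {
        "vibration": AnalyzerConfigProfile(True, ThresholdSet(warn=30.0, critical=60.0)),
        "battery": AnalyzerConfigProfile(True, ThresholdSet(warn=3.5, critical=3.3)),
    }
```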
Validation corpus + harness (validation/, tests/corpus/)
- Seeded 3 cases: CORPUS-normal-flight, CORPUS-crash, CORPUS-vibration-crash.
- run_validation() parses evidence, runs all registered plugins via the trust policy, and compares results against expected/should-not-find lists.
- compute_quality_report() produces per-analyzer TP/FP/FN plus precision/recall; numbers are derived from real runs only.
- scripts/validate_corpus.py: standalone CLI for CI.
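The per-analyzer metrics reduce to standard precision/recall over the TP/FP/FN counts. A hedged sketch (`quality_metrics` is an illustrative helper, not the harness API), guarding against empty denominators:

```python
def quality_metrics(tp: int, fp: int, fn: int) -> dict:
    """Per-analyzer precision/recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of findings raised, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of findings expected, how many were raised
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}
```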
API routes
- POST /api/cases/{id}/runs/{run_id}/replay
- GET /api/cases/{id}/runs/{run_id}/replay-verification
- POST /api/cases/{id}/compare-runs
- GET /api/cases/{id}/tuning-profile
- POST /api/validation/run + GET /api/validation/results
GUI updates (web/static/index.html, vanilla JS, >=12px)
- Run selector shows crit/warn counts, plugin count, tuning profile, REPLAY badge.
- Replay button + color-coded replay result panel.
- Compare Runs panel inside Exports.
- Validation tab with per-case PASS/FAIL and per-analyzer quality table.
- Tuning Profile badge on Plugins tab.
Tests added (49)
- tests/test_validation/test_replay.py
- tests/test_validation/test_diff_engine.py
- tests/test_validation/test_tuning.py
- tests/test_validation/test_corpus.py
- tests/test_web/test_replay_routes.py
Deferred intentionally
- Real hypothesis count wiring in the harness (stub set to 0).
- Full threshold-override plumbing from tuning_profile.json into plugin execution (the profile is persisted per run, but plugins still read source constants).
- Corpus growth beyond the 3 seed cases; the hook-up point is ready, the dataset is not.
Recommended next sprint
- Wire TuningProfile overrides into the plugin runtime (replace hard-coded constants with profile lookups).
- Grow the corpus to 20+ cases spanning all analyzer categories; enable regression gating in CI.
- Implement true hypothesis replay by persisting hypotheses alongside findings per run.
- Add /api/validation/history so the Validation tab can chart precision/recall over time.
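One way the first item could look: thresholds resolve through the active profile first and fall back to the source constant, so behavior is unchanged when no override exists. Names and values here are illustrative, not the project's actual constants:

```python
# Hypothetical fallback table standing in for the analyzers' hard-coded constants.
SOURCE_CONSTANTS = {"vibration.warn": 30.0, "vibration.critical": 60.0}

def resolve_threshold(profile_overrides: dict, key: str) -> float:
    """Prefer the active tuning profile's value; fall back to the source constant."""
    return profile_overrides.get(key, SOURCE_CONSTANTS[key])
```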