Eval tracking companion skill — log structured audit output #7

## Idea

A lightweight, developer-only skill that reads harden's structured convergence summary and logs one record per run: date, deliverable type, passes to convergence, severity counts, and hits/misses. The accumulated log makes it possible to track harden's performance over time and to benchmark whether new features improve or degrade audit quality.
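As a rough illustration of the record described above, one logged run might look like the sketch below. The class name `EvalRecord`, the field names, the severity labels, and the `hit_rate` helper are all hypothetical; the real shape depends on whatever format harden#6 ends up emitting. It also assumes "hits" means issues the audit caught and "misses" means issues it failed to catch, which the issue does not define.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class EvalRecord:
    """One logged harden run (hypothetical field names)."""
    run_date: date
    deliverable_type: str
    passes_to_convergence: int
    severity_counts: Dict[str, int]  # e.g. {"critical": 1, "minor": 4}
    hits: int    # assumed: issues the audit caught
    misses: int  # assumed: issues the audit failed to catch

    def hit_rate(self) -> Optional[float]:
        """Fraction of known issues caught, or None if nothing was scored."""
        total = self.hits + self.misses
        return self.hits / total if total else None


# Made-up values, purely to show the shape of a record.
record = EvalRecord(
    run_date=date(2025, 1, 15),
    deliverable_type="design-doc",
    passes_to_convergence=3,
    severity_counts={"critical": 1, "major": 2, "minor": 4},
    hits=3,
    misses=1,
)
as_dict = asdict(record)  # plain dict, ready for whichever storage is chosen
```

Keeping the record flat and free of derived values (the hit rate is computed, not stored) leaves room to change metrics later without rewriting old logs.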

## Scope

- **Dev tool only** — not part of harden's user-facing package
- Users install harden and use it. Developers use this to measure quality.
- Kept explicitly separate from harden to avoid scope creep and platform incompatibility in the user-facing package
- Original idea notes: "only worth building after 10+ manual eval logs prove the data is useful"

## Dependencies

- **Blocked by:** harden#6 (structured convergence summary) — needs parseable output format to exist first

## Key Design Questions

- Storage format for eval logs (JSON Lines? SQLite? Markdown table?)
- What metrics to track beyond basic counts
- How to correlate eval results with harden SKILL.md versions
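To make the storage and correlation questions concrete, here is a minimal sketch of one possible answer, not a decision: append each run to a JSON Lines file and tag it with a short content hash of SKILL.md, so results can later be grouped by skill version. The function names, the `skill_version` key, the `passes_to_convergence` key, and the file names are assumptions for illustration. A git commit SHA would be an equally reasonable version tag.

```python
import hashlib
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def skill_version(skill_path: Path) -> str:
    """Short SHA-256 content hash of SKILL.md, used as a version tag."""
    return hashlib.sha256(skill_path.read_bytes()).hexdigest()[:12]


def append_eval(log_path: Path, entry: dict, skill_path: Path) -> dict:
    """Append one run to the JSON Lines log, tagged with the skill version."""
    row = {**entry, "skill_version": skill_version(skill_path)}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(row, sort_keys=True) + "\n")
    return row


def mean_passes_by_version(log_path: Path) -> dict:
    """Average passes to convergence, grouped by skill version."""
    passes = defaultdict(list)
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                row = json.loads(line)
                passes[row["skill_version"]].append(row["passes_to_convergence"])
    return {version: sum(vals) / len(vals) for version, vals in passes.items()}


# Demo in a temporary directory with made-up data.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "SKILL.md"
    log = Path(tmp) / "evals.jsonl"
    skill.write_text("skill v1\n", encoding="utf-8")
    append_eval(log, {"passes_to_convergence": 3}, skill)
    append_eval(log, {"passes_to_convergence": 5}, skill)
    skill.write_text("skill v2\n", encoding="utf-8")  # simulate an edit to the skill
    append_eval(log, {"passes_to_convergence": 2}, skill)
    averages = mean_passes_by_version(log)
```

JSON Lines is append-only and diff-friendly, which suits a log that starts out hand-maintained; SQLite would make ad hoc queries easier once the volume justifies it.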
