Migrated from spboyer/waza#389
Summary
Design how users construct eval.yaml from a catalog of shared, versioned grader definitions.
Concept
Users should be able to:
- Browse available graders (waza registry search 'factual')
- Add a grader to their eval (waza registry add github.com/waza-evals/fact@v1.0)
- Reference it in eval.yaml with local config overrides
- Run evals that mix local and remote graders seamlessly
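As a starting point for discussion, here is a minimal sketch of how a versioned grader reference such as github.com/waza-evals/fact@v1.0 might be split into a source and a pinned version. The `source@version` shape is inferred only from the example above; `GraderRef` and `parse_grader_ref` are hypothetical names, not part of waza.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraderRef:
    """Hypothetical parsed form of a registry grader reference."""
    source: str   # e.g. "github.com/waza-evals/fact"
    version: str  # e.g. "v1.0"


def parse_grader_ref(ref: str) -> GraderRef:
    """Split '<source>@<version>' at the last '@'.

    The reference format is an assumption based on the example in this issue.
    """
    source, sep, version = ref.rpartition("@")
    if not sep or not source or not version:
        raise ValueError(f"expected '<source>@<version>', got {ref!r}")
    return GraderRef(source=source, version=version)


ref = parse_grader_ref("github.com/waza-evals/fact@v1.0")
```

Rejecting unpinned references outright is one possible choice; the design could instead resolve a missing version to the latest release.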
Design Questions
- What does waza registry search look like? Metadata format?
- How do remote graders interact with local graders in the same eval?
- Can you override config on a remote grader (e.g., change threshold)?
- How does waza init scaffold with registry graders?
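One possible answer to the override question, sketched under assumptions: the remote grader publishes default config, and eval.yaml supplies a partial override that is merged on top. The function name, the config keys, and the choice to reject unknown keys are all hypothetical and only meant to make the trade-off concrete.

```python
def resolve_config(defaults: dict, overrides: dict) -> dict:
    """Apply local overrides on top of a remote grader's published defaults.

    Shallow merge; keys the remote grader does not declare are rejected so
    that typos (e.g. 'treshold') fail loudly. Both behaviors are assumptions.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    # Build a new dict so the published defaults are never mutated.
    return {**defaults, **overrides}


# Hypothetical defaults shipped with a remote grader, and a local override.
remote_defaults = {"threshold": 0.8, "model": "example-model"}
local_overrides = {"threshold": 0.9}

resolved = resolve_config(remote_defaults, local_overrides)
```

A shallow merge keeps the rule easy to explain; nested config would need an explicit decision on whether overrides replace or deep-merge sub-sections.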
Parent: #385