feat: Composable eval construction from registry graders #17

@spboyer

Migrated from spboyer/waza#389

Summary

Design how users construct eval.yaml from a catalog of shared, versioned grader definitions.

Concept

Users should be able to:

  1. Browse available graders (waza registry search 'factual')
  2. Add a grader to their eval (waza registry add github.com/waza-evals/fact@v1.0)
  3. Reference it in eval.yaml with local config overrides
  4. Run evals that mix local and remote graders seamlessly
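As a starting point for discussion, a versioned reference such as github.com/waza-evals/fact@v1.0 could be split into a source and a pinned version before resolution. The sketch below is Python for illustration only. GraderRef, parse_grader_ref, and the rule that a reference without "@" is a local grader are assumptions, not part of waza.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GraderRef:
    """Hypothetical parsed form of a grader reference (not a waza type)."""
    source: str             # e.g. "github.com/waza-evals/fact", or a local grader name
    version: Optional[str]  # e.g. "v1.0"; None when no version is pinned

    @property
    def is_remote(self) -> bool:
        # Assumption: only registry graders carry a pinned version.
        return self.version is not None


def parse_grader_ref(ref: str) -> GraderRef:
    """Split 'source@version'; a reference without '@' is treated as local."""
    source, sep, version = ref.rpartition("@")
    if not sep:
        return GraderRef(source=ref, version=None)
    if not source or not version:
        raise ValueError(f"malformed grader reference: {ref!r}")
    return GraderRef(source=source, version=version)


# One eval mixing a registry grader and a local one (names are illustrative).
refs = [parse_grader_ref(r) for r in ("github.com/waza-evals/fact@v1.0", "my_local_grader")]
```

Keeping the version in the parsed reference would let a single eval pin each remote grader independently while local graders stay unversioned.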

Design Questions

  • What does waza registry search output look like, and what metadata format do registry graders use?
  • How do remote graders interact with local graders in the same eval?
  • Can you override config on a remote grader (e.g., change threshold)?
  • How does waza init scaffold an eval that uses registry graders?
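One possible answer to the override question is a recursive merge in which local values win over the defaults published with the remote grader. The Python sketch below illustrates that option only. apply_overrides and the config keys (threshold, model) are hypothetical and do not reflect an existing waza schema.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_overrides(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config: defaults with overrides applied, recursing into nested dicts."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults published with a registry grader at a pinned version.
remote_defaults = {"threshold": 0.8, "model": {"name": "example-model", "temperature": 0.0}}
# Hypothetical local overrides from the user's eval.yaml.
local_overrides = {"threshold": 0.9, "model": {"temperature": 0.2}}

effective = apply_overrides(remote_defaults, local_overrides)
```

With this approach the published defaults are never mutated, so the same pinned grader can be reused across evals with different thresholds. Whether overrides should instead be restricted to an allow-list of keys remains an open design choice.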

Parent: #385

Labels: enhancement (New feature or request)