To run a sample evaluation:
- Go to the parent directory of `simple-evals`, so that Python can resolve the `simple-evals.simple_evals` module path.
- Run `python -m simple-evals.simple_evals --eval=healthbench --model=qwen3:4b --grader-model=qwen3:4b --examples=1`
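
If you would rather launch the sample run from a script, the two steps above can be sketched roughly as follows. This is only an illustration: `build_eval_command` and `run_eval` are hypothetical helper names that are not part of the repository, and the flags are copied from the command above without assuming anything else about the CLI. The sketch uses `sys.executable` in place of `python` so the run stays in the current interpreter's environment.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_eval_command(model: str = "qwen3:4b",
                       grader_model: str = "qwen3:4b",
                       examples: int = 1) -> list[str]:
    """Assemble the sample HealthBench command as an argument list.

    Hypothetical helper: only the flags shown in the README command are used.
    """
    return [
        sys.executable, "-m", "simple-evals.simple_evals",
        "--eval=healthbench",
        f"--model={model}",
        f"--grader-model={grader_model}",
        f"--examples={examples}",
    ]


def run_eval(parent_dir: Path, **kwargs) -> int:
    """Run the evaluation with parent_dir as the working directory.

    parent_dir is assumed to be the directory that contains simple-evals.
    """
    completed = subprocess.run(build_eval_command(**kwargs), cwd=parent_dir)
    return completed.returncode


# Show the arguments that would follow the interpreter, without running anything.
print(shlex.join(build_eval_command()[1:]))
```

Calling `run_eval(Path(".."))` from inside the `simple-evals` directory would then be roughly equivalent to the two manual steps, assuming the models named in the flags are available to the evaluation.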