v0.19.2: Fix TrainedVerifier input format by doramirdor · Pull Request #63 · NadirRouter/NadirClaw

doramirdor · 2026-05-29T13:09:06Z

Summary

NadirClaw's TrainedVerifier.score() was passing the bare cheap answer to the tokenizer as text_pair. The released cross-encoder (nadirclaw/cascade-verifier-v1), however, was trained on a structured format:

text_pair = f"CHEAP:\n{cheap_answer}\n\nEXPENSIVE:\n{reference_answer or ''}"

This matches the Pro production backend at getnadir.dev/backend/app/services/verifier_model.py:195 and the HF model card.

Without the wrapper, the verifier's scores drift relative to the calibrated tau=0.80 acceptance threshold, the same threshold that produced the numbers in RouterArena PR #112.
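
The difference between the two input formats can be sketched as a small pure function. This is only an illustration built from the f-string quoted above; the helper name `build_text_pair` is hypothetical and is not taken from the NadirClaw source.

```python
from typing import Optional


def build_text_pair(cheap_answer: str, reference_answer: Optional[str] = None) -> str:
    """Wrap the cheap answer in the CHEAP:/EXPENSIVE: format quoted in this PR.

    Hypothetical helper for illustration; the real code lives in
    nadirclaw/trained_verifier.py and may differ in detail.
    """
    # `or ''` turns None (and an empty string) into an empty EXPENSIVE: block.
    return f"CHEAP:\n{cheap_answer}\n\nEXPENSIVE:\n{reference_answer or ''}"


if __name__ == "__main__":
    # Before v0.19.2 the tokenizer received only the bare answer ("Paris").
    print(repr(build_text_pair("Paris")))
    print(repr(build_text_pair("Paris", "The capital of France is Paris.")))
```

Taken literally, `reference_answer or ''` passes a whitespace-only reference through unchanged. How the released code treats that case is not stated in this description.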

Changes

nadirclaw/trained_verifier.py: wrap the tokenizer input in the CHEAP:/EXPENSIVE: format; fold reference_answer into the EXPENSIVE: block (empty when None); update the docstring, which no longer describes reference_answer as "ignored".
tests/test_trained_verifier.py: add test_trained_verifier_wraps_input_in_production_format, which uses a mock tokenizer to cover three cases: reference provided, None, and whitespace-only.
nadirclaw/__init__.py: bump __version__ to 0.19.2.

Calibration impact

| Version | Behavior |
| --- | --- |
| Before v0.19.2 | `text_pair = cheap_answer` (bare); scores drifted vs tau=0.80 |
| After v0.19.2 | `text_pair = f"CHEAP:\n{cheap}\n\nEXPENSIVE:\n{ref or ''}"`; matches production |
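
To see why drift around a fixed threshold matters, here is a minimal sketch of a verifier-gated cascade. It assumes the cheap answer is accepted when its score is at least tau; the PR does not state the comparison direction, and the function name `route`, the stub models, and the two scores below are all made up for illustration rather than measured.

```python
from typing import Callable, Tuple

TAU = 0.80  # acceptance threshold quoted in this PR


def route(
    prompt: str,
    cheap_model: Callable[[str], str],
    expensive_model: Callable[[str], str],
    score: Callable[[str, str], float],
    tau: float = TAU,
) -> Tuple[str, str]:
    """Return (answer, source) for a verifier-gated cascade.

    Assumption: the cheap answer is accepted when score >= tau.
    """
    cheap_answer = cheap_model(prompt)
    if score(prompt, cheap_answer) >= tau:
        return cheap_answer, "cheap"
    # Only pay for the expensive model when the verifier rejects the cheap answer.
    return expensive_model(prompt), "expensive"


if __name__ == "__main__":
    cheap = lambda p: "4"
    expensive = lambda p: "2 + 2 = 4"
    # Two invented scores on either side of tau; a small drift flips the route.
    print(route("2+2?", cheap, expensive, lambda p, a: 0.83))
    print(route("2+2?", cheap, expensive, lambda p, a: 0.77))
```

Because the decision is a hard cut at tau, a score that shifts by a few hundredths can change which model answers, which is why the input format has to match the one used during calibration.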

References

HF model card: https://huggingface.co/nadirclaw/cascade-verifier-v1
Production reference: getnadir.dev/backend/app/services/verifier_model.py:195
RouterArena submission: PR #112

Test plan

pytest tests/test_trained_verifier.py -v (9 passed, 1 slow-gated skip)
Full suite: pytest tests/ -v (773 passed, 1 skipped)
Mock-tokenizer test asserts exact text_pair wrapping for three cases (with ref, None, whitespace)
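
The mock-tokenizer approach can be sketched roughly as follows. This is not the test added in this PR: `tokenize_for_verifier` is a hypothetical stand-in for the tokenization step inside score(), and passing the prompt as the first positional argument is an assumption. The whitespace-only case is left out because its expected output is not given here.

```python
from typing import Optional
from unittest.mock import MagicMock


def tokenize_for_verifier(tokenizer, prompt: str, cheap_answer: str,
                          reference_answer: Optional[str] = None):
    """Hypothetical stand-in for the tokenization step inside score()."""
    text_pair = f"CHEAP:\n{cheap_answer}\n\nEXPENSIVE:\n{reference_answer or ''}"
    return tokenizer(prompt, text_pair=text_pair)


def captured_text_pair(cheap_answer: str, reference_answer: Optional[str]) -> str:
    """Run the step against a mock tokenizer and return the text_pair it saw."""
    tokenizer = MagicMock()
    tokenize_for_verifier(tokenizer, "What is 2+2?", cheap_answer, reference_answer)
    # call_args.kwargs holds the keyword arguments of the most recent call.
    return tokenizer.call_args.kwargs["text_pair"]


def test_wraps_input_in_production_format():
    assert captured_text_pair("4", "four") == "CHEAP:\n4\n\nEXPENSIVE:\nfour"
    assert captured_text_pair("4", None) == "CHEAP:\n4\n\nEXPENSIVE:\n"


if __name__ == "__main__":
    test_wraps_input_in_production_format()
    print("ok")
```

Mocking the tokenizer keeps the test free of model downloads and lets it assert on the exact string, which is the part this fix changes.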

NadirClaw's TrainedVerifier was passing the cheap answer as the bare text_pair to the tokenizer. The model was trained on a structured format with CHEAP:/EXPENSIVE: markers, matching what the Pro production backend uses. Without that wrapper, scores are miscalibrated against the production tau=0.80 threshold.

This patch wraps the input in the production format:

text_pair = f"CHEAP:\n{cheap}\n\nEXPENSIVE:\n{reference or ''}"

reference_answer is now used when provided (was previously documented as ignored). Behavior with reference_answer=None matches production: empty string substitution.

Aligns NadirClaw with:
- https://huggingface.co/nadirclaw/cascade-verifier-v1 (model card)
- getnadir.dev/backend/app/services/verifier_model.py (production)

Repo: https://github.com/NadirRouter/NadirClaw
Service: https://getnadir.com

doramirdor force-pushed the fix/verifier-input-format-v0.19.2 branch from 216225c to bc46c72 on May 29, 2026 13:09

doramirdor merged commit 3be0f72 into main May 29, 2026
3 checks passed

doramirdor mentioned this pull request May 29, 2026

Add Nadir router (verifier-gated cascade + cost-min baseline) RouteWorks/RouterArena#112

Merged

doramirdor merged 1 commit into main from fix/verifier-input-format-v0.19.2
