Add generic cascade rule engine + τ=0.80 default + RouterArena context #59
Ports the verifier-gated cascade architecture from Nadir Pro to the
NadirClaw open-source core, plus the generic data-driven rule engine
that sits in front of it.
Cascade dispatch (nadirclaw/cascade.py):
* Cheap-first dispatch with post-hoc verification.
* Fail-open on verifier exceptions; kill switch after 3 consecutive
errors so a misbehaving verifier never blocks request flow.
* Default acceptance threshold tau=0.80, calibrated against the
held-out RouterBench test split (n=11,420). At tau=0.80 the
composed system preserves 98.3% of always-Opus quality with a
1.7% catastrophic-downgrade rate. Full tau-sweep documented inline.
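The dispatch behaviour in the bullets above can be sketched as follows. This is a minimal illustration, not the contents of nadirclaw/cascade.py: the class name `CascadeSketch`, its fields, and the returned route labels are hypothetical, and it assumes that "fail-open" means the cheap draft is served when the verifier raises.

```python
from dataclasses import dataclass, field
from typing import Callable, Tuple


@dataclass
class CascadeSketch:
    """Illustrative only; not the nadirclaw.cascade implementation."""

    cheap_call: Callable[[str], str]
    expensive_call: Callable[[str], str]
    verifier: Callable[[str, str], float]  # returns a score in [0, 1]
    threshold: float = 0.80                # default tau from this commit
    max_verifier_errors: int = 3           # kill-switch trip point
    _errors: int = field(default=0, init=False)

    def run(self, prompt: str) -> Tuple[str, str]:
        draft = self.cheap_call(prompt)
        if self._errors >= self.max_verifier_errors:
            # Kill switch tripped: stop calling the verifier entirely.
            return draft, "cheap (verifier disabled)"
        try:
            score = self.verifier(prompt, draft)
        except Exception:
            # Fail-open (assumed meaning): keep serving the cheap draft.
            self._errors += 1
            return draft, "cheap (fail-open)"
        self._errors = 0  # only consecutive errors count toward the switch
        if score >= self.threshold:
            return draft, "cheap"
        return self.expensive_call(prompt), "expensive"
```

In this sketch the kill switch never resets once tripped; whether the real module re-arms it is not stated in the commit message.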
Heuristic verifier (nadirclaw/heuristic_verifier.py):
* Rule-based, dependency-light (regex + stdlib only), ~1 ms / call.
* Detects refusals, uncertainty, hard-min length, prompt/response
ratio failures, and JSON parse failures.
* Same scoring interface as the Nadir Pro DeBERTa verifier; ~0.60
AUROC vs ~0.96 for the trained version.
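A rule-based scorer of the kind described above might look roughly like this. Everything here is an assumption made for illustration: the function name `heuristic_score`, the regex patterns, the penalty weights, and the default cut-offs are hypothetical and are not taken from nadirclaw/heuristic_verifier.py.

```python
import json
import re

# Hypothetical patterns; the shipped verifier's lists are not shown in this PR text.
REFUSAL_RE = re.compile(r"\b(i can(?:'|no)t help|i'm unable to|as an ai)\b", re.IGNORECASE)
UNCERTAIN_RE = re.compile(r"\b(i'm not sure|i am not sure|i don't know)\b", re.IGNORECASE)


def heuristic_score(prompt: str, response: str, expect_json: bool = False,
                    min_chars: int = 20, min_ratio: float = 0.05) -> float:
    """Return a score in [0, 1]; higher means the cheap response looks acceptable."""
    text = response.strip()
    if len(text) < min_chars:          # hard-min length check
        return 0.0
    if REFUSAL_RE.search(text):        # refusal detection
        return 0.1
    if expect_json:
        try:
            json.loads(text)           # JSON parse check
        except json.JSONDecodeError:
            return 0.2
    score = 1.0
    if UNCERTAIN_RE.search(text):      # hedged / uncertain answer
        score -= 0.4
    if len(text) / max(len(prompt), 1) < min_ratio:  # response too short for the prompt
        score -= 0.3
    return max(score, 0.0)
```

Because the return value is a plain float in [0, 1], a scorer like this can be compared against the same acceptance threshold as a trained verifier.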
Rule engine (nadirclaw/cascade_rules/):
* Declarative YAML rules: substring / regex / prompt-length /
classifier-confidence conditions, ORed inside `match.any_of`.
* Four action types: force_escalate, force_cheap, set_threshold,
set_max_tokens. Set-threshold rules stack (max wins);
set_max_tokens rules stack (max wins, safer routing-side default).
* TTL + mtime hot-reload cache so operators can edit a profile YAML
on disk and see the new policy take effect without a restart.
* PyYAML is optional (load_inline works without it); ships under a
new `cascade-rules` extra in pyproject.toml.
* Bundled `default.yaml` profile encodes the legacy force-escalate
patterns and domain thresholds for code / summarisation —
domains where post-hoc verifiers are known to be unreliable
(AUROC 0.65 on mbpp, 0.77 on consensus_summary).
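The rule semantics above (conditions ORed inside `match.any_of`, threshold and max-token actions stacking with max-wins) can be sketched with inline Python dicts. The rule schema, the `min_prompt_chars` key, the `Decision` dataclass, and the "first forced route wins" tie-break are assumptions made for this example; the real YAML schema and priority handling live in nadirclaw/cascade_rules/ and may differ.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Hypothetical inline rules; the real YAML schema may differ.
RULES: List[Dict[str, Any]] = [
    {"match": {"any_of": [{"regex": r"\bdef\s+\w+\s*\("}, {"substring": "function "}]},
     "action": {"type": "force_escalate"}},
    {"match": {"any_of": [{"substring": "summarize this"}]},
     "action": {"type": "set_threshold", "value": 0.90}},
    {"match": {"any_of": [{"min_prompt_chars": 2000}]},
     "action": {"type": "set_threshold", "value": 0.85}},
]


@dataclass
class Decision:
    route: Optional[str] = None        # "expensive", "cheap", or None (defer to verifier)
    threshold: Optional[float] = None
    max_tokens: Optional[int] = None


def _hit(cond: Dict[str, Any], prompt: str) -> bool:
    if "substring" in cond:
        return cond["substring"].casefold() in prompt.casefold()
    if "regex" in cond:
        return re.search(cond["regex"], prompt) is not None
    if "min_prompt_chars" in cond:
        return len(prompt) >= cond["min_prompt_chars"]
    raise ValueError(f"unknown condition: {cond!r}")  # reject malformed rules


def evaluate(rules: List[Dict[str, Any]], prompt: str) -> Decision:
    decision = Decision()
    for rule in rules:
        # any_of: one matching condition is enough to fire the rule.
        if not any(_hit(c, prompt) for c in rule["match"]["any_of"]):
            continue
        action = rule["action"]
        kind = action["type"]
        if kind == "force_escalate":
            decision.route = decision.route or "expensive"  # first forced route wins (assumed)
        elif kind == "force_cheap":
            decision.route = decision.route or "cheap"
        elif kind == "set_threshold":
            decision.threshold = max(decision.threshold or 0.0, action["value"])  # max wins
        elif kind == "set_max_tokens":
            decision.max_tokens = max(decision.max_tokens or 0, action["value"])  # max wins
        else:
            raise ValueError(f"unknown action type: {kind!r}")
    return decision
```

With these example rules, a long prompt containing "summarize this" fires both threshold rules and ends up with the stricter value, 0.90.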
Tests: 64 new test cases across rule parsing, priority ordering,
applies_when gating, set_threshold stacking, set_max_tokens
composition, malformed-rule rejection, hot-reload, and cascade
integration. Existing 678-test suite remains green.
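The hot-reload behaviour exercised by these tests combines a TTL with an mtime check. A minimal sketch of that pattern, assuming a hypothetical `HotReloadCache` class and an injectable clock (neither name comes from the package), could be:

```python
import os
import time
from typing import Any, Callable


class HotReloadCache:
    """Re-read a file only when the TTL has expired AND its mtime changed."""

    def __init__(self, path: str, loader: Callable[[str], Any],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._path = path
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._loaded = False
        self._value: Any = None
        self._mtime_ns = -1
        self._checked_at = 0.0

    def get(self) -> Any:
        now = self._clock()
        if self._loaded and now - self._checked_at < self._ttl:
            return self._value  # TTL not expired: skip the stat() call
        mtime_ns = os.stat(self._path).st_mtime_ns
        if not self._loaded or mtime_ns != self._mtime_ns:
            self._value = self._loader(self._path)  # file changed: re-parse
            self._mtime_ns = mtime_ns
            self._loaded = True
        self._checked_at = now
        return self._value
```

The TTL bounds how often the file is stat-ed on the request path, and the mtime comparison avoids re-parsing an unchanged profile, so an on-disk edit becomes visible at most one TTL after it is saved.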
…ility
Adds `verifier/contamination_audit.py`, the standalone script that
reproduces Nadir's "no held-out leakage" check across RouterBench and
RouterArena. Given any benchmark prompt file(s) and any training-corpus
file(s), the script:
1. NFC-normalises, strips, casefolds, and SHA-256s every prompt
(same recipe used internally for the Nadir verifier corpus, so
hashes are portable across the audit boundary).
2. Reports overlap count and up to N (default 50) overlap examples
in a JSON report.
3. Exits 0 on zero overlap, 2 on any overlap, 1 on missing inputs
-- so the audit can be wired straight into a CI gate.
Stdlib-only (no third-party deps). Supports .jsonl, .json (list of
objects or list of strings), and .txt. Per-file prompt key auto-
detection (`prompt`, `input`, `question`, `query`, `text`) with
`--prompt-key` override.
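The hashing recipe from step 1 and the exit-code convention from step 3 can be sketched with the standard library alone. The function names, the report keys, and the UTF-8 encoding choice are assumptions for illustration; only the NFC, strip, casefold, SHA-256 order and the 0 / 2 exit codes come from the description above.

```python
import hashlib
import unicodedata
from typing import Dict, Iterable, List, Union


def prompt_hash(prompt: str) -> str:
    """NFC-normalise, strip, casefold, then SHA-256 (UTF-8 encoding assumed)."""
    normalized = unicodedata.normalize("NFC", prompt).strip().casefold()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def overlap_report(benchmark: Iterable[str], corpus: Iterable[str],
                   max_examples: int = 50) -> Dict[str, Union[int, List[str]]]:
    """Report benchmark prompts whose hash also appears in the training corpus."""
    corpus_hashes = {prompt_hash(p) for p in corpus}
    hits = [p for p in benchmark if prompt_hash(p) in corpus_hashes]
    # Hypothetical report shape: a count plus a capped list of examples.
    return {"overlap_count": len(hits), "examples": hits[:max_examples]}


def exit_code(report: Dict[str, Union[int, List[str]]]) -> int:
    """0 on zero overlap, 2 on any overlap (missing-input handling omitted)."""
    return 2 if report["overlap_count"] else 0
```

Normalising before hashing means prompts that differ only in case, surrounding whitespace, or Unicode composition form are treated as the same prompt.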
The internal Nadir audit results that the public benchmark claims
hang on:
* RouterBench 0shot: 0 of 36,481 overlap (audit 2026-05-24)
* RouterArena sub_10: 0 of 809 overlap (audit 2026-05-27)
* RouterArena full: 0 of 8,399 overlap (audit 2026-05-27)
Tests: 9 new test cases cover the hashing convention, the three
supported file formats, the prompt-key override, the report shape,
and the CLI exit codes.
MODEL_CARD.md documents the pre-generation classifier architecture
that backs Nadir's RouterBench and RouterArena numbers:
* Wide-and-deep asymmetric architecture, BGE embedding deep branch,
lambda=3 downgrade penalty.
* Training corpus, intended use, limitations, and the per-domain
verifier AUROC variance that motivates the default cascade-rule
profile (force-escalate on code / summarisation).
* Held-out numbers: RouterBench AUROC 0.961, ECE 0.016, 98.3%
quality preserved at tau=0.80; RouterArena sub_10 composite 0.7118
(projected #5 on the public leaderboard).
* Contamination audit table (RouterBench 0/36,481; RouterArena
sub_10 0/809; RouterArena full 0/8,399).
* Explicit note that the trained `wide_deep_asym_v3.pt` artifact is
proprietary to Nadir Pro; NadirClaw users get the same routing
topology with the simpler binary centroid or DistilBERT
classifier, and the same rule engine on top.
README.md additions:
* New "Benchmarks" section directly under "Why NadirClaw" with the
held-out RouterBench, RouterArena, and contamination-audit
numbers. Links to the live RouterArena submission PR
(RouteWorks/RouterArena#112).
* New "Verifier-gated cascade" and "Cascade rule engine" bullets
in the Features section.
Ship the actual trained pre-generation classifier in the open-source
package so NadirClaw users get the same Wide&Deep ternary classifier
described in MODEL_CARD.md, not just the architecture description.
Why bundle (Option A from the audit):
- The asym + sym checkpoints together are ~1.8 MB. Adding them as
package data is friction-free for users and avoids a HuggingFace
download dependency or a training-recipe re-run on first use.
- The MODEL_CARD already documented the architecture in detail;
shipping the weights closes the loop so the documented benchmark
numbers are reproducible from the package.
- The MIT license already covers code in this repo; we relicense
the weights under the same MIT terms (they were derived only from
Nadir's internal labeled batches, which are ours to license).
What ships:
- nadirclaw/models/wide_deep_asym_v3.pt (905 KB, λ=3 asym CE loss)
- nadirclaw/models/wide_deep_sym_v3.pt (905 KB, plain CE loss,
recovers correct simple-class behaviour under argmax decoding)
- nadirclaw/wide_deep_classifier.py — singleton-cached loader with
argmax + cost-sensitive decoders, lazy BGE-base-en-v1.5 encoder,
33-d structural feature extractor.
- nadirclaw/structural_features.py — 33-d feature extractor
(length buckets, code fences, math symbols, tool calls, question
words). Pure regex, no ML deps.
- pyproject.toml — `models/*.pt` added to package-data so the
checkpoints ship in the wheel.
- tests/test_wide_deep_classifier.py — 10 integration tests that
load the actual bundled weights, run a real forward pass, and
assert the singleton + decoder hot-swap contract.
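The difference between the argmax and cost-sensitive decoders can be illustrated with a small expected-cost calculation. This is a sketch under stated assumptions, not the logic in nadirclaw/wide_deep_classifier.py: the tier labels, the linear cost in tier distance, and the function names are all hypothetical; only the idea of penalising downgrades by a factor lambda comes from the text above.

```python
from typing import Sequence

# Hypothetical tier labels for a ternary classifier, ordered cheap to expensive.
TIERS = ("simple", "medium", "complex")


def argmax_decode(probs: Sequence[float]) -> str:
    """Pick the most probable tier."""
    return TIERS[max(range(len(TIERS)), key=lambda i: probs[i])]


def cost_sensitive_decode(probs: Sequence[float], cost_lambda: float = 20.0) -> str:
    """Pick the tier with the lowest expected cost.

    Assumed cost model: routing below the true tier (a downgrade) costs
    cost_lambda per tier of distance; routing above it costs 1 per tier.
    """
    def expected_cost(chosen: int) -> float:
        total = 0.0
        for true, p in enumerate(probs):
            if chosen < true:
                total += p * cost_lambda * (true - chosen)  # downgrade penalty
            elif chosen > true:
                total += p * (chosen - true)                # over-spend penalty
        return total

    return TIERS[min(range(len(TIERS)), key=expected_cost)]
```

Under this cost model, a large lambda pushes borderline prompts to a higher tier even when the most probable class is the cheapest one, which is the qualitative behaviour a downgrade penalty is meant to produce.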
MODEL_CARD updated to reflect that the weights now ship in NadirClaw
(was previously documented as Pro-only). README "OSS vs Pro" table
updated to mention the bundled trained classifier alongside the
existing binary centroid and DistilBERT options.
Usage:

    from nadirclaw.wide_deep_classifier import get_wide_deep_classifier

    clf = get_wide_deep_classifier(
        checkpoint_variant="asym",
        decision_rule="cost_sensitive",
        cost_lambda=20.0,
    )
    result = clf.classify("Your prompt")
    print(result.tier, result.confidence)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… doc
Cross-vendor cascades (Gemini-cheap + OpenAI/Anthropic-mid + Opus-class
top + Llama fallback) expose failure modes that the default
single-vendor profile does not model: refusal-style drift between
vendors, chain-of-thought ability gaps on the cheap tier, structured-
output wrapping inconsistency, and length-control drift on
summarisation. These were the patterns we observed when expanding
Nadir's RouterArena submission from a single-provider menu to a four-
provider menu.
Adds:
- nadirclaw/cascade_rules/profiles/multi_provider.yaml — 12-rule
profile encoding the cross-provider mitigations: force_escalate
on CoT / math-proof / jailbreak / code triggers, set_threshold
bumps on JSON / summarise / long-prompt patterns, force_cheap
short-circuits for trivial greetings and acknowledgements.
- docs/multi-provider-routing.md — learnings writeup plus a
reproducibility recipe for running NadirClaw's classifier + rule
engine over cached benchmark responses (e.g. RouterArena's
./cached_results/) without making any live API calls. Cross-links
to the RouterArena PR.
- tests/test_cascade_rule_engine.py — 4 new tests asserting the
profile loads cleanly and triggers the expected actions on CoT,
greeting, and structured-output prompts.
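Loaded with:

    # Import path for Cascade is assumed from nadirclaw/cascade.py.
    from nadirclaw.cascade import Cascade
    from nadirclaw.cascade_rules import load_profile

    engine = load_profile("multi_provider")
    # cheap_call / expensive_call are the caller's own model-call functions.
    cascade = Cascade(cheap_call, expensive_call, rule_engine=engine)
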
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-up commits on this branch closing the two open gaps from the PR audit:

The MODEL_CARD described the architecture but the …

Picked Option A (bundle) over the HF download or "ship the recipe" alternatives because the file is tiny and the MODEL_CARD already documented the architecture, so shipping the weights closes the loop with the published benchmark numbers. Weights are MIT-licensed alongside the rest of the package.

    from nadirclaw.wide_deep_classifier import get_wide_deep_classifier
    clf = get_wide_deep_classifier(checkpoint_variant="asym", decision_rule="cost_sensitive", cost_lambda=20.0)
    result = clf.classify("Your prompt")

When the model menu spans Gemini + OpenAI + Anthropic + Llama-class fallback, the default profile (calibrated for an Anthropic-only ladder) doesn't model: refusal-style drift, CoT ability gaps, JSON-wrapping inconsistency, or length-control drift on summarisation.

Test status: PR #59's full PR-relevant suite ( …
Summary
Lands the verifier-gated cascade, the heuristic post-hoc verifier, and a
generic data-driven cascade rule engine in the free / MIT core. Same
architecture Nadir uses on its currently-projected #5 RouterArena
submission and its RouterBench AUROC-0.961 / ECE-0.016 numbers — minus
the trained DeBERTa verifier and the trained classifier artifact, both
of which remain proprietary to Nadir Pro. The open-source surface is
the routing topology and the rule engine on top of it; users can swap
the heuristic verifier for their own implementation while keeping the
same dispatch contract.
Three commits, ordered for review:
1. feat(cascade): verifier-gated cascade + heuristic verifier + rule engine
   - `nadirclaw/cascade.py`: cheap-first dispatch, fail-open verifier
     errors, kill switch after 3 consecutive errors, default τ=0.80.
   - `nadirclaw/heuristic_verifier.py`: rule-based (regex + stdlib),
     ~1 ms / call. Detects refusals, hard-min length, ratio truncation,
     uncertainty, JSON parse failure.
   - `nadirclaw/cascade_rules/`: declarative YAML rule engine.
     Conditions: substring / regex / prompt-length / classifier-confidence.
     Actions: `force_escalate`, `force_cheap`, `set_threshold` (stacks,
     max wins), `set_max_tokens` (stacks, max wins for safer routing-side
     default). TTL + mtime hot-reload cache so operators can edit a
     profile YAML on disk and see the new policy within 30 s without a
     restart.
   - `default.yaml` encodes the legacy force-escalate patterns and
     per-domain thresholds for code / summarisation.
   - `pyproject.toml`: PyYAML is optional via new `cascade-rules` extra;
     `load_inline` works without it.
   - … malformed-rule rejection, hot-reload, and cascade integration.
2. chore(verifier): contamination audit utility for benchmark reproducibility
   - `verifier/contamination_audit.py`: standalone, stdlib-only CLI +
     library. Given any benchmark file(s) + corpus file(s), computes
     NFC + casefold + SHA-256 hashes and reports overlap. Exits 0 on
     zero overlap, 2 on any overlap, 1 on missing inputs: a drop-in CI
     gate. Supports `.jsonl` / `.json` / `.txt`.
   - … claims (RouterBench 0/36,481; RouterArena `sub_10` 0/809;
     RouterArena `full` 0/8,399).
3. docs: MODEL_CARD for wide_deep_asym_v3 + README benchmarks section
   - `MODEL_CARD.md`: wide-and-deep asymmetric architecture, training
     corpus, contamination posture, held-out numbers, per-domain
     verifier AUROC variance that motivates the default rule profile,
     τ-sweep table, explicit Pro-vs-OSS split.
   - `README.md`: new "Benchmarks" section under "Why NadirClaw" with
     RouterBench (AUROC 0.961, ECE 0.016, 98.3% quality preserved at
     τ=0.80) and RouterArena (`sub_10` composite 0.7118, projected #5)
     numbers, plus the zero-overlap contamination table. Links to
     the open RouterArena submission PR
     (RouteWorks/RouterArena#112).
Context
- `sub_10`: composite score 0.7118, projected #5 on the public
  leaderboard. Submission PR (live, under review):
  Add Nadir router (verifier-gated cascade + cost-min baseline) RouteWorks/RouterArena#112
- At τ=0.80: 98.3% of always-Opus quality preserved, 1.7%
  catastrophic-downgrade rate, ~60% cost reduction vs always-Opus.
- … training corpus and either RouterBench `0shot` (0 of 36,481) or
  either RouterArena split (0 of 809 / 0 of 8,399). Reproducible from
  this PR with `verifier/contamination_audit.py`.

What is NOT in this PR
By design, these stay on the Nadir Pro side and are not portable as
open-source:
- `wide_deep_asym_v3.pt` classifier artifact (binary centroid and
  DistilBERT classifiers stay in NadirClaw as before).
- … heuristic verifier shipped here mirrors the interface, ~0.60 AUROC).
- `/v1/route_only` endpoint.
- benchmark-specific rule profiles (`routerarena_v3.yaml` etc.).

Backwards compatibility
- `Cascade(cheap_call=..., expensive_call=...)` without a `rule_engine`
  argument behaves exactly as before; the engine is opt-in.
- `DEFAULT_ACCEPTANCE_THRESHOLD` ships as 0.80 (was 0.5 in the draft
  cascade module locally). Pass `threshold=0.5` to restore the more
  permissive cut.
- … `evaluate()` exceptions are swallowed and the cascade falls through
  to the verifier path.
Test plan
- `pytest -q --ignore=tests/test_e2e.py`: 678 passed
- `pytest tests/test_e2e.py -q`: 38 passed
- `pytest tests/test_cascade_rule_engine.py tests/test_heuristic_verifier.py tests/test_contamination_audit.py -q`: 53 passed
- … ```python, ```javascript, ```typescript, `def`, `function`, and `summarize the following` / `summarize this`.