Add generic cascade rule engine + τ=0.80 default + RouterArena context by doramirdor · Pull Request #59 · NadirRouter/NadirClaw

doramirdor · 2026-05-27T20:51:26Z

Summary

Lands the verifier-gated cascade, the heuristic post-hoc verifier, and a
generic data-driven cascade rule engine in the free / MIT core. Same
architecture Nadir uses on its currently-projected #5 RouterArena
submission and its RouterBench AUROC-0.961 / ECE-0.016 numbers — minus
the trained DeBERTa verifier and the trained classifier artifact, both
of which remain proprietary to Nadir Pro. The open-source surface is
the routing topology and the rule engine on top of it; users can swap
the heuristic verifier for their own implementation while keeping the
same dispatch contract.

Three commits, ordered for review:

feat(cascade): verifier-gated cascade + heuristic verifier + rule engine
- nadirclaw/cascade.py — cheap-first dispatch, fail-open verifier
  errors, kill switch after 3 consecutive errors, default τ=0.80.
- nadirclaw/heuristic_verifier.py — rule-based (regex + stdlib),
  ~1 ms / call. Detects refusals, hard-min length, ratio truncation,
  uncertainty, JSON parse failure.
- nadirclaw/cascade_rules/ — declarative YAML rule engine.
  Conditions: substring / regex / prompt-length / classifier-
  confidence. Actions: force_escalate, force_cheap,
  set_threshold (stacks, max wins), set_max_tokens (stacks, max
  wins for safer routing-side default). TTL + mtime hot-reload
  cache so operators can edit a profile YAML on disk and see the
  new policy within 30 s without a restart.
- Bundled default.yaml encodes the legacy force-escalate
  patterns and per-domain thresholds for code / summarisation.
- pyproject.toml — PyYAML is optional via new cascade-rules
  extra; load_inline works without it.
- 64 new tests covering parsing, priority ordering, stacking,
  malformed-rule rejection, hot-reload, and cascade integration.

chore(verifier): contamination audit utility for benchmark reproducibility
- verifier/contamination_audit.py — standalone, stdlib-only CLI +
  library. Given any benchmark file(s) + corpus file(s), computes
  NFC + casefold + SHA-256 hashes and reports overlap. Exits 0 on
  zero overlap, 2 on any overlap, 1 on missing inputs — drop-in CI
  gate. Supports .jsonl / .json / .txt.
- Reproduces the audit numbers behind Nadir's published held-out
  claims (RouterBench 0/36,481; RouterArena sub_10 0/809;
  RouterArena full 0/8,399).
- 9 new tests.

docs: MODEL_CARD for wide_deep_asym_v3 + README benchmarks section
- MODEL_CARD.md — wide-and-deep asymmetric architecture, training
  corpus, contamination posture, held-out numbers, per-domain
  verifier AUROC variance that motivates the default rule profile,
  τ-sweep table, explicit Pro-vs-OSS split.
- README.md — new "Benchmarks" section under "Why NadirClaw" with
  RouterBench (AUROC 0.961, ECE 0.016, 98.3% quality preserved at
  τ=0.80) and RouterArena (sub_10 composite 0.7118, projected
  #5) numbers, plus the zero-overlap contamination table. Links to
  the open RouterArena submission PR
  (RouteWorks/RouterArena#112).
- New cascade + rule-engine bullets in the Features list.
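
As a rough illustration of the dispatch contract summarised above, the control flow might look like the sketch below. `SketchCascade` and its field names are hypothetical, not the API of nadirclaw/cascade.py, and returning the cheap response when the verifier raises is one assumed reading of "fail-open".

```python
from dataclasses import dataclass
from typing import Callable

DEFAULT_ACCEPTANCE_THRESHOLD = 0.80  # default τ stated in this PR
MAX_CONSECUTIVE_VERIFIER_ERRORS = 3  # kill-switch trip point stated in this PR


@dataclass
class SketchCascade:
    """Hypothetical stand-in for the real Cascade class (illustration only)."""

    cheap_call: Callable[[str], str]
    expensive_call: Callable[[str], str]
    verifier: Callable[[str, str], float]  # (prompt, response) -> score in [0, 1]
    threshold: float = DEFAULT_ACCEPTANCE_THRESHOLD
    consecutive_errors: int = 0

    @property
    def verifier_disabled(self) -> bool:
        # Kill switch: stop calling the verifier after repeated failures.
        return self.consecutive_errors >= MAX_CONSECUTIVE_VERIFIER_ERRORS

    def run(self, prompt: str) -> str:
        response = self.cheap_call(prompt)  # cheap-first dispatch
        if self.verifier_disabled:
            return response
        try:
            score = self.verifier(prompt, response)
        except Exception:
            # Assumed fail-open behaviour: keep the cheap answer, count the error.
            self.consecutive_errors += 1
            return response
        self.consecutive_errors = 0
        if score >= self.threshold:
            return response
        return self.expensive_call(prompt)  # verifier rejected the cheap answer
```

Whether the real kill switch ever re-arms is not stated in this PR; the sketch leaves it tripped once it fires.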

Context

- RouterArena sub_10: composite score 0.7118, projected #5
  on the public leaderboard. Submission PR (live, under review):
  RouteWorks/RouterArena#112, "Add Nadir router (verifier-gated cascade + cost-min baseline)".
- RouterBench held-out (n=11,420): AUROC 0.961, ECE 0.016.
  At τ=0.80: 98.3% of always-Opus quality preserved, 1.7%
  catastrophic-downgrade rate, ~60% cost reduction vs always-Opus.
- Contamination audit: zero overlap between Nadir's classifier
  training corpus and RouterBench 0shot (0 of 36,481) or
  either RouterArena split (0 of 809 / 0 of 8,399). Reproducible from
  this PR with verifier/contamination_audit.py.
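
For readers who want to see what the audit's hashing recipe amounts to, here is a minimal sketch of the NFC + strip + casefold + SHA-256 step and the overlap check. The function names are hypothetical and UTF-8 encoding is an assumption; verifier/contamination_audit.py is the authoritative implementation.

```python
import hashlib
import unicodedata


def prompt_hash(prompt: str) -> str:
    """NFC-normalise, strip, casefold, then SHA-256 the UTF-8 bytes (encoding assumed)."""
    canonical = unicodedata.normalize("NFC", prompt).strip().casefold()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def overlap_exit_code(benchmark: list[str], corpus: list[str]) -> int:
    """Return 0 on zero overlap and 2 on any overlap.

    The third documented code (1, missing inputs) is not modelled here.
    """
    corpus_hashes = {prompt_hash(p) for p in corpus}
    overlap = [p for p in benchmark if prompt_hash(p) in corpus_hashes]
    return 2 if overlap else 0
```

Hashing the normalised form means prompts that differ only in case, surrounding whitespace, or Unicode composition count as the same prompt.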

What is NOT in this PR

By design — these stay on the Nadir Pro side and are not portable as
open-source:

- The trained wide_deep_asym_v3.pt classifier artifact (binary
  centroid and DistilBERT classifiers stay in NadirClaw as before).
- The trained DeBERTa-v3-small cross-encoder verifier (the rule-based
  heuristic verifier shipped here mirrors the interface, ~0.60 AUROC).
- Per-tenant Supabase wiring and the production /v1/route_only
  endpoint.
- Internal labeled-data shards used to train the classifier and
  benchmark-specific rule profiles (routerarena_v3.yaml etc.).

Backwards compatibility

- Cascade(cheap_call=..., expensive_call=...) without a rule_engine
  argument behaves exactly as before; the engine is opt-in.
- DEFAULT_ACCEPTANCE_THRESHOLD ships as 0.80 (was 0.5 in the
  draft cascade module locally). Pass threshold=0.5 to restore the
  more permissive cut.
- A buggy custom rule engine cannot fail the request: evaluate()
  exceptions are swallowed and the cascade falls through to the
  verifier path.
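
The guarantee that a faulty rule engine cannot fail a request could be pictured as a guard like the one below. This is a sketch under assumptions: `safe_evaluate` is a hypothetical helper, and the single-argument `evaluate(prompt)` signature is assumed rather than taken from the real engine.

```python
from typing import Any, Optional


def safe_evaluate(rule_engine: Any, prompt: str) -> Optional[Any]:
    """Return the engine's decision, or None when there is no usable decision."""
    if rule_engine is None:
        return None  # engine is opt-in: no engine means the plain verifier path
    try:
        return rule_engine.evaluate(prompt)  # assumed signature
    except Exception:
        return None  # swallow the error and fall through to the verifier path
```

A caller would treat `None` as "no rule applied" and continue with ordinary verification.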

Test plan

- pytest -q --ignore=tests/test_e2e.py: 678 passed
- pytest tests/test_e2e.py -q: 38 passed
- pytest tests/test_cascade_rule_engine.py tests/test_heuristic_verifier.py tests/test_contamination_audit.py -q: 53 passed
- Default profile loads and force-escalates on ```python, ```javascript, ```typescript, def , function , and summarize the following / summarize this.
- Threshold-stacking rules raise the verifier bar; only the strictest matched threshold wins.
- Hot-reload picks up file changes via mtime invalidation.
- Contamination audit returns exit-code 0 on zero overlap, 2 on overlap, 1 on missing inputs.
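
The threshold-stacking behaviour checked above ("only the strictest matched threshold wins") can be sketched as follows. `ThresholdRule` is a hypothetical shape, not the engine's real rule class; case-sensitive substring matching and ignoring the default once any rule matches are both assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ThresholdRule:
    """Hypothetical rule: fires if ANY substring or regex matches the prompt."""

    threshold: float
    substrings: list[str] = field(default_factory=list)
    regexes: list[str] = field(default_factory=list)

    def matches(self, prompt: str) -> bool:
        return any(s in prompt for s in self.substrings) or any(
            re.search(rx, prompt) for rx in self.regexes
        )


def effective_threshold(
    prompt: str, rules: list[ThresholdRule], default: float = 0.80
) -> float:
    """Stack set_threshold rules: the highest matched threshold wins."""
    matched = [rule.threshold for rule in rules if rule.matches(prompt)]
    return max(matched) if matched else default
```

For example, a prompt that matches both a 0.85 rule and a 0.90 rule is verified at 0.90, while a prompt that matches nothing keeps the 0.80 default.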

Ports the verifier-gated cascade architecture from Nadir Pro to the NadirClaw open-source core, plus the generic data-driven rule engine that sits in front of it.

Cascade dispatch (nadirclaw/cascade.py):
* Cheap-first dispatch with post-hoc verification.
* Fail-open on verifier exceptions; kill switch after 3 consecutive errors so a misbehaving verifier never blocks request flow.
* Default acceptance threshold tau=0.80, calibrated against the held-out RouterBench test split (n=11,420). At tau=0.80 the composed system preserves 98.3% of always-Opus quality with a 1.7% catastrophic-downgrade rate. Full tau-sweep documented inline.

Heuristic verifier (nadirclaw/heuristic_verifier.py):
* Rule-based, dependency-light (regex + stdlib only), ~1 ms / call.
* Detects refusals, uncertainty, hard-min length, prompt/response ratio failures, and JSON parse failures.
* Same scoring interface as the Nadir Pro DeBERTa verifier; ~0.60 AUROC vs ~0.96 for the trained version.

Rule engine (nadirclaw/cascade_rules/):
* Declarative YAML rules: substring / regex / prompt-length / classifier-confidence conditions, ORed inside `match.any_of`.
* Four action types: force_escalate, force_cheap, set_threshold, set_max_tokens. Set-threshold rules stack (max wins); set_max_tokens rules stack (max wins, safer routing-side default).
* TTL + mtime hot-reload cache so operators can edit a profile YAML on disk and see the new policy take effect without a restart.
* PyYAML is optional (load_inline works without it); ships under a new `cascade-rules` extra in pyproject.toml.
* Bundled `default.yaml` profile encodes the legacy force-escalate patterns and domain thresholds for code / summarisation, domains where post-hoc verifiers are known to be unreliable (AUROC 0.65 on mbpp, 0.77 on consensus_summary).

Tests: 64 new test cases across rule parsing, priority ordering, applies_when gating, set_threshold stacking, set_max_tokens composition, malformed-rule rejection, hot-reload, and cascade integration.
Existing 678-test suite remains green.
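
The TTL + mtime hot-reload idea in the rule-engine notes can be sketched like this. `HotReloadCache` is a hypothetical class, not NadirClaw's implementation, and the 30-second default is inferred from the "within 30 s" figure in the PR summary.

```python
import os
import time
from typing import Any, Callable


class HotReloadCache:
    """Hypothetical TTL + mtime cache for a profile file (illustration only)."""

    def __init__(self, path: str, loader: Callable[[str], Any], ttl_seconds: float = 30.0):
        self.path = path
        self.loader = loader
        self.ttl_seconds = ttl_seconds
        self._loaded = False
        self._value: Any = None
        self._mtime = 0.0
        self._checked_at = 0.0

    def get(self) -> Any:
        now = time.monotonic()
        if self._loaded and now - self._checked_at < self.ttl_seconds:
            return self._value  # within the TTL: no filesystem access at all
        self._checked_at = now
        mtime = os.stat(self.path).st_mtime
        if not self._loaded or mtime != self._mtime:
            self._value = self.loader(self.path)  # first load, or the file changed
            self._mtime = mtime
            self._loaded = True
        return self._value
```

The TTL bounds how often the file is stat-ed on the hot path, and the mtime comparison avoids re-parsing a profile that has not changed.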

…ility

Adds `verifier/contamination_audit.py`, the standalone script that reproduces Nadir's "no held-out leakage" check across RouterBench and RouterArena. Given any benchmark prompt file(s) and any training-corpus file(s), the script:

1. NFC-normalises, strips, casefolds, and SHA-256s every prompt (same recipe used internally for the Nadir verifier corpus, so hashes are portable across the audit boundary).
2. Reports overlap count and up to N (default 50) overlap examples in a JSON report.
3. Exits 0 on zero overlap, 2 on any overlap, 1 on missing inputs -- so the audit can be wired straight into a CI gate.

Stdlib-only (no third-party deps). Supports .jsonl, .json (list of objects or list of strings), and .txt. Per-file prompt key auto-detection (`prompt`, `input`, `question`, `query`, `text`) with `--prompt-key` override.

The internal Nadir audit results that the public benchmark claims hang on:
* RouterBench 0shot: 0 of 36,481 overlap (audit 2026-05-24)
* RouterArena sub_10: 0 of 809 overlap (audit 2026-05-27)
* RouterArena full: 0 of 8,399 overlap (audit 2026-05-27)

Tests: 9 new test cases cover the hashing convention, the three supported file formats, the prompt-key override, the report shape, and the CLI exit codes.
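
To make the per-file prompt-key detection concrete, here is a small sketch for the .jsonl case. The helper names are hypothetical, and trying the keys in the order they are listed above is an assumption about priority.

```python
import json
from typing import Optional

DEFAULT_PROMPT_KEYS = ("prompt", "input", "question", "query", "text")


def detect_prompt_key(records: list[dict], override: Optional[str] = None) -> Optional[str]:
    """Pick one key for the whole file: the override, else the first known key seen."""
    if override:
        return override
    for key in DEFAULT_PROMPT_KEYS:
        if any(key in record for record in records):
            return key
    return None


def load_jsonl_prompts(text: str, override: Optional[str] = None) -> list[str]:
    """Parse JSON Lines text and return the prompt field of every record that has it."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    key = detect_prompt_key(records, override)
    if key is None:
        return []
    return [record[key] for record in records if key in record]
```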

MODEL_CARD.md documents the pre-generation classifier architecture that backs Nadir's RouterBench and RouterArena numbers:
* Wide-and-deep asymmetric architecture, BGE embedding deep branch, lambda=3 downgrade penalty.
* Training corpus, intended use, limitations, and the per-domain verifier AUROC variance that motivates the default cascade-rule profile (force-escalate on code / summarisation).
* Held-out numbers: RouterBench AUROC 0.961, ECE 0.016, 98.3% quality preserved at tau=0.80; RouterArena sub_10 composite 0.7118 (projected #5 on the public leaderboard).
* Contamination audit table (RouterBench 0/36,481; RouterArena sub_10 0/809; RouterArena full 0/8,399).
* Explicit note that the trained `wide_deep_asym_v3.pt` artifact is proprietary to Nadir Pro; NadirClaw users get the same routing topology with the simpler binary centroid or DistilBERT classifier, and the same rule engine on top.

README.md additions:
* New "Benchmarks" section directly under "Why NadirClaw" with the held-out RouterBench, RouterArena, and contamination-audit numbers. Links to the live RouterArena submission PR (RouteWorks/RouterArena#112).
* New "Verifier-gated cascade" and "Cascade rule engine" bullets in the Features section.

Ship the actual trained pre-generation classifier in the open-source package so NadirClaw users get the same Wide&Deep ternary classifier described in MODEL_CARD.md, not just the architecture description.

Why bundle (Option A from the audit):
- The asym + sym checkpoints together are ~1.8 MB. Adding them as package data is friction-free for users and avoids a HuggingFace download dependency or a training-recipe re-run on first use.
- The MODEL_CARD already documented the architecture in detail; shipping the weights closes the loop so the documented benchmark numbers are reproducible from the package.
- The MIT license already covers code in this repo; we relicense the weights under the same MIT terms (they were derived only from Nadir's internal labeled batches, which are ours to license).

What ships:
- nadirclaw/models/wide_deep_asym_v3.pt (905 KB, λ=3 asym CE loss)
- nadirclaw/models/wide_deep_sym_v3.pt (905 KB, plain CE loss, recovers correct simple-class behaviour under argmax decoding)
- nadirclaw/wide_deep_classifier.py: singleton-cached loader with argmax + cost-sensitive decoders, lazy BGE-base-en-v1.5 encoder, 33-d structural feature extractor.
- nadirclaw/structural_features.py: 33-d feature extractor (length buckets, code fences, math symbols, tool calls, question words). Pure regex, no ML deps.
- pyproject.toml: `models/*.pt` added to package-data so the checkpoints ship in the wheel.
- tests/test_wide_deep_classifier.py: 10 integration tests that load the actual bundled weights, run a real forward pass, and assert the singleton + decoder hot-swap contract.

MODEL_CARD updated to reflect that the weights now ship in NadirClaw (was previously documented as Pro-only). README "OSS vs Pro" table updated to mention the bundled trained classifier alongside the existing binary centroid and DistilBERT options.

Usage:

    from nadirclaw.wide_deep_classifier import get_wide_deep_classifier

    clf = get_wide_deep_classifier(
        checkpoint_variant="asym",
        decision_rule="cost_sensitive",
        cost_lambda=20.0,
    )
    result = clf.classify("Your prompt")
    print(result.tier, result.confidence)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
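
To give a feel for what a pure-regex structural feature extractor does, here is a toy version producing 7 features rather than the 33 shipped in nadirclaw/structural_features.py. The bucket edges, patterns, and feature order are all assumptions made for illustration.

```python
import re

CODE_FENCE = "`" * 3  # a Markdown code-fence marker
MATH_SYMBOL = re.compile(r"[=+\-*/^<>∑∫√≤≥]")
QUESTION_WORD = re.compile(r"\b(?:what|why|how|when|where|who|which)\b", re.IGNORECASE)
LENGTH_BUCKET_EDGES = (50, 200, 1000)  # hypothetical edges, in characters


def structural_features(prompt: str) -> list[float]:
    """Return [4 one-hot length buckets, has_code_fence, n_math_symbols, n_question_words]."""
    bucket = sum(len(prompt) > edge for edge in LENGTH_BUCKET_EDGES)
    one_hot = [1.0 if i == bucket else 0.0 for i in range(len(LENGTH_BUCKET_EDGES) + 1)]
    return one_hot + [
        1.0 if CODE_FENCE in prompt else 0.0,
        float(len(MATH_SYMBOL.findall(prompt))),
        float(len(QUESTION_WORD.findall(prompt))),
    ]
```

Features like these need no ML dependencies, so they can feed the "wide" side of a wide-and-deep model cheaply.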

… doc

Cross-vendor cascades (Gemini-cheap + OpenAI/Anthropic-mid + Opus-class top + Llama fallback) expose failure modes that the default single-vendor profile does not model: refusal-style drift between vendors, chain-of-thought ability gaps on the cheap tier, structured-output wrapping inconsistency, and length-control drift on summarisation. These were the patterns we observed when expanding Nadir's RouterArena submission from a single-provider menu to a four-provider menu.

Adds:
- nadirclaw/cascade_rules/profiles/multi_provider.yaml: 12-rule profile encoding the cross-provider mitigations: force_escalate on CoT / math-proof / jailbreak / code triggers, set_threshold bumps on JSON / summarise / long-prompt patterns, force_cheap short-circuits for trivial greetings and acknowledgements.
- docs/multi-provider-routing.md: learnings writeup plus a reproducibility recipe for running NadirClaw's classifier + rule engine over cached benchmark responses (e.g. RouterArena's ./cached_results/) without making any live API calls. Cross-links to the RouterArena PR.
- tests/test_cascade_rule_engine.py: 4 new tests asserting the profile loads cleanly and triggers the expected actions on CoT, greeting, and structured-output prompts.

Loaded with:

    from nadirclaw.cascade_rules import load_profile

    engine = load_profile("multi_provider")
    cascade = Cascade(cheap_call, expensive_call, rule_engine=engine)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doramirdor · 2026-05-28T01:30:19Z

Two follow-up commits on this branch closing the two open gaps from the PR audit:

1326016 — bundle the trained wide_deep_asym_v3 checkpoint

The MODEL_CARD described the architecture but the .pt weights were Pro-only, so OSS users got just the heuristic verifier. Bundling fixes that:

- nadirclaw/models/wide_deep_asym_v3.pt (905 KB, λ=3 asym CE loss)
- nadirclaw/models/wide_deep_sym_v3.pt (905 KB, plain CE; recovers correct simple-class behaviour under argmax decoding)
- nadirclaw/wide_deep_classifier.py: singleton-cached loader with argmax + cost-sensitive decoders
- nadirclaw/structural_features.py: 33-d feature extractor (pure regex)
- pyproject.toml: models/*.pt added to package-data so the weights ship in the wheel
- tests/test_wide_deep_classifier.py: 10 integration tests loading the real bundled weights

Picked Option A (bundle) over the HF download or "ship the recipe" alternatives because the file is tiny and the MODEL_CARD already documented the architecture, so shipping the weights closes the loop with the published benchmark numbers. Weights are MIT-licensed alongside the rest of the package.

from nadirclaw.wide_deep_classifier import get_wide_deep_classifier
clf = get_wide_deep_classifier(checkpoint_variant="asym", decision_rule="cost_sensitive", cost_lambda=20.0)
result = clf.classify("Your prompt")
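
For intuition about what a cost-sensitive decoder does differently from argmax, here is a generic expected-cost sketch over tier probabilities. It is not the decision rule implemented in nadirclaw/wide_deep_classifier.py: the cost matrix (picking a tier below the true one costs `cost_lambda` per step, picking one above costs 1 per step) is an assumption chosen only to illustrate the idea.

```python
def argmax_decode(probs: list[float]) -> int:
    """Pick the most probable tier index."""
    return max(range(len(probs)), key=lambda i: probs[i])


def cost_sensitive_decode(probs: list[float], cost_lambda: float) -> int:
    """Pick the tier with the lowest expected cost under an assumed cost matrix."""

    def expected_cost(choice: int) -> float:
        total = 0.0
        for true_tier, p in enumerate(probs):
            gap = true_tier - choice
            # gap > 0: we under-routed (downgrade), penalised by cost_lambda.
            total += p * (cost_lambda * gap if gap > 0 else -gap)
        return total

    return min(range(len(probs)), key=expected_cost)
```

With probabilities `[0.6, 0.3, 0.1]`, argmax picks tier 0, and so does the cost-sensitive rule at `cost_lambda=1.0`; at `cost_lambda=20.0` the downgrade penalty dominates and the rule picks tier 2.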

c0fd1b6 — multi-provider routing profile + reproducibility doc

When the model menu spans Gemini + OpenAI + Anthropic + Llama-class fallback, the default profile (calibrated for an Anthropic-only ladder) doesn't model: refusal-style drift, CoT ability gaps, JSON-wrapping inconsistency, or length-control drift on summarisation.

- nadirclaw/cascade_rules/profiles/multi_provider.yaml: 12 rules covering those four failure modes
- docs/multi-provider-routing.md: learnings writeup + reproducibility recipe for running NadirClaw's classifier + rule engine over cached benchmark responses (RouterArena's ./cached_results/ pattern) with no live API calls
- 4 new tests in tests/test_cascade_rule_engine.py

Test status: PR #59's full PR-relevant suite (tests/test_cascade_rule_engine.py + tests/test_heuristic_verifier.py + tests/test_contamination_audit.py) plus the new tests/test_wide_deep_classifier.py = 67 passed, 0 failed.

Nadir Research and others added 5 commits May 27, 2026 16:50

doramirdor merged commit de1c42c into main May 29, 2026
3 checks passed

doramirdor deleted the feat/cascade-rule-engine-2026-05-27 branch May 29, 2026 01:37
