feat(llm): complexity-based triage routing for multi-provider pool by bug-ops · Pull Request #2153 · bug-ops/zeph

bug-ops · 2026-03-23T05:57:26Z

Summary

Implements complexity-based pre-inference triage routing for the multi-provider LLM pool (closes #2141).

Before each LLM call, a configurable classifier provider evaluates input complexity and routes to a tier-specific provider:

Simple → fast/cheap model (e.g. Haiku, qwen3:1.7b)
Medium → default model
Complex → smart model (e.g. Sonnet, GPT-4o)
Expert → expert model (e.g. Opus, o1)

Changes

crates/zeph-llm/src/router/triage.rs (new) — ComplexityTier, TriageRouter, TriageMetrics
crates/zeph-llm/src/any.rs — AnyProvider::Triage variant + all match sites
crates/zeph-config/src/providers.rs — LlmRoutingStrategy::Triage, ComplexityRoutingConfig, TierMapping
crates/zeph-core/src/bootstrap/provider.rs — build_triage_provider()
crates/zeph-core/src/agent/tool_execution/native.rs — status indicator
src/init.rs — --init wizard step

Config

[llm.complexity_routing]
enabled = true
triage_provider = "fast"
bypass_single_provider = true
triage_timeout_secs = 5

[llm.complexity_routing.tiers]
simple = "fast"
medium = "default"
complex = "smart"
expert = "expert"
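A minimal sketch of how the `[llm.complexity_routing.tiers]` table could be resolved at runtime, using only the standard library. The `TierMapping` field layout, the `provider_for` and `is_single_provider` helpers, and `example_mapping` are assumptions for illustration, not the actual code in `crates/zeph-config/src/providers.rs`. The sketch also assumes that `bypass_single_provider` means "skip triage when every tier resolves to the same config entry name".

```rust
use std::collections::HashSet;

/// Complexity tiers named in the PR summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

/// Hypothetical mirror of `[llm.complexity_routing.tiers]`: each field holds
/// the config entry name of a provider in the pool.
#[derive(Debug, Clone)]
pub struct TierMapping {
    pub simple: String,
    pub medium: String,
    pub complex: String,
    pub expert: String,
}

impl TierMapping {
    /// Returns the provider entry name configured for `tier`.
    pub fn provider_for(&self, tier: ComplexityTier) -> &str {
        match tier {
            ComplexityTier::Simple => &self.simple,
            ComplexityTier::Medium => &self.medium,
            ComplexityTier::Complex => &self.complex,
            ComplexityTier::Expert => &self.expert,
        }
    }

    /// True when every tier resolves to the same entry name (assumed bypass
    /// condition): a triage call could not change the routing outcome.
    pub fn is_single_provider(&self) -> bool {
        let names: HashSet<&str> = [&self.simple, &self.medium, &self.complex, &self.expert]
            .iter()
            .map(|name| name.as_str())
            .collect();
        names.len() == 1
    }
}

/// Builds the mapping shown in the example config above.
pub fn example_mapping() -> TierMapping {
    TierMapping {
        simple: "fast".to_string(),
        medium: "default".to_string(),
        complex: "smart".to_string(),
        expert: "expert".to_string(),
    }
}
```

Comparing entry names rather than provider types keeps the check meaningful when two entries share a backend type but point at different models.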

Key design decisions

TriageRouter implements LlmProvider directly via Box::pin to break the recursive AnyProvider type cycle
Classifier call wrapped in tokio::time::timeout; regex fallback on malformed JSON output
Context-window auto-escalation skips gracefully when provider returns None
last_provider_idx (AtomicUsize) tracks last-used tier for correct cost/token reporting
set_status_tx() propagates to all tier providers — streaming indicators work during inference
bypass_single_provider compares config entry names (not runtime type names) for heterogeneous pools
All metrics use AtomicU64 — no Mutex in async paths
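The context-window escalation and the lock-free metrics from the list above can be sketched together. This is an illustration under assumptions, not the code in `triage.rs`: `EscalationMetrics`, `escalate_for_context`, the `next` helper, and the closure-based context-window lookup are hypothetical names, and the policy "move up one tier at a time until the prompt fits" is a guess at what auto-escalation means.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

impl ComplexityTier {
    /// Next tier up, or `None` when already at the top.
    pub fn next(self) -> Option<Self> {
        match self {
            Self::Simple => Some(Self::Medium),
            Self::Medium => Some(Self::Complex),
            Self::Complex => Some(Self::Expert),
            Self::Expert => None,
        }
    }
}

/// Hypothetical counter set: atomics only, so no Mutex is held in async paths.
#[derive(Debug, Default)]
pub struct EscalationMetrics {
    pub escalations: AtomicU64,
}

/// Escalates `start` until `prompt_tokens` fits the tier's context window.
/// `context_window` returns `None` when a provider does not report its
/// window; in that case escalation is skipped and the current tier is kept.
pub fn escalate_for_context<F>(
    start: ComplexityTier,
    prompt_tokens: u64,
    context_window: F,
    metrics: &EscalationMetrics,
) -> ComplexityTier
where
    F: Fn(ComplexityTier) -> Option<u64>,
{
    let mut tier = start;
    loop {
        match context_window(tier) {
            // Unknown window: nothing to compare against, keep the tier.
            None => return tier,
            // Prompt fits: done.
            Some(window) if prompt_tokens <= window => return tier,
            // Prompt too large: move up if a higher tier exists.
            Some(_) => match tier.next() {
                Some(higher) => {
                    metrics.escalations.fetch_add(1, Ordering::Relaxed);
                    tier = higher;
                }
                None => return tier,
            },
        }
    }
}
```

`Ordering::Relaxed` is enough here because each counter is an independent statistic and no other memory is synchronized through it.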

Tests

18 new unit tests: tier routing, timeout fallback, context-window escalation (incl. None case), metrics counters, config TOML round-trip, LlmRoutingStrategy::Triage deserialization.

Total: 6443 passed (was 6432, +11 net after deduplication).

Deferred (follow-up issues)

TUI metrics panel wiring (metrics available via TriageRouter::metrics())
Interactive --init wizard step (config schema added, literal updated)
Prompt injection hardening (system message prefix in triage call)
chat_stream/chat_with_tools async delegation tests

…2141)

Add LlmRoutingStrategy::Triage and TriageRouter to zeph-llm. Before each LLM call, a configurable classifier provider evaluates input complexity and routes to a tier-specific provider (Simple/Medium/Complex/Expert).

Key design:

- TriageRouter implements LlmProvider directly via Box::pin to break the recursive AnyProvider type cycle; bootstrap wires it as the top-level provider when routing = "triage"
- Classifier call is wrapped in tokio::time::timeout (default 5s) with fallback to the default tier on timeout or JSON parse failure
- Regex extraction fallback when structured JSON output is malformed
- Context-window auto-escalation skips gracefully when provider returns None
- last_provider_idx (AtomicUsize) tracks the last-used tier; last_usage() and last_cache_usage() delegate to that provider for correct cost tracking
- set_status_tx() propagates to all tier providers so streaming indicators work during actual inference
- bypass_single_provider compares config entry names, not runtime provider type names, to handle heterogeneous pools correctly
- AtomicU64 metrics: per-tier call counts, latency_us_total, escalations, fallbacks, timeout_fallbacks
- AnyProvider::Triage variant added; all match sites updated

Config: [llm.complexity_routing] with triage_provider, tiers.{simple,medium,complex,expert}, triage_timeout_secs, max_triage_tokens, bypass_single_provider, fallback_strategy; --init wizard step; --migrate-config support

Agent loop emits "Evaluating complexity..." status indicator during triage.

18 new unit tests covering tier routing, timeout fallback, context-window escalation (including None case), metrics counters, config TOML round-trip, and LlmRoutingStrategy::Triage deserialization.

Closes #2141
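The three-stage handling of classifier output described in this commit message (structured parse, extraction fallback, default tier) might look roughly like the sketch below. It uses only the standard library, so the strict path is a simplistic stand-in for a real JSON parser and the lenient path is a whole-word scan rather than a regex. The `{"tier": "..."}` response shape, `ParseOutcome`, and all function names are assumptions, not the PR's actual API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComplexityTier {
    Simple,
    Medium,
    Complex,
    Expert,
}

/// Records which stage produced the tier, e.g. for a fallback counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParseOutcome {
    Structured(ComplexityTier),
    Extracted(ComplexityTier),
    Fallback(ComplexityTier),
}

fn tier_from_name(name: &str) -> Option<ComplexityTier> {
    match name {
        "simple" => Some(ComplexityTier::Simple),
        "medium" => Some(ComplexityTier::Medium),
        "complex" => Some(ComplexityTier::Complex),
        "expert" => Some(ComplexityTier::Expert),
        _ => None,
    }
}

/// Strict path: accepts only `{"tier":"<name>"}` once whitespace is removed
/// (assumed response shape; a real implementation would use a JSON parser).
fn parse_strict(raw: &str) -> Option<ComplexityTier> {
    let compact: String = raw.chars().filter(|c| !c.is_whitespace()).collect();
    let name = compact.strip_prefix("{\"tier\":\"")?.strip_suffix("\"}")?;
    tier_from_name(&name.to_ascii_lowercase())
}

/// Lenient path: first whole word in the text that names a tier.
/// Splitting on non-letters keeps "complexity" from matching "complex".
fn extract_lenient(raw: &str) -> Option<ComplexityTier> {
    let lower = raw.to_ascii_lowercase();
    lower
        .split(|c: char| !c.is_ascii_alphabetic())
        .find_map(tier_from_name)
}

/// Tries the strict path, then the lenient path, then gives up and returns
/// the default tier.
pub fn parse_tier(raw: &str, default: ComplexityTier) -> ParseOutcome {
    if let Some(tier) = parse_strict(raw) {
        return ParseOutcome::Structured(tier);
    }
    if let Some(tier) = extract_lenient(raw) {
        return ParseOutcome::Extracted(tier);
    }
    ParseOutcome::Fallback(default)
}
```

Returning the stage alongside the tier lets the caller increment a fallback counter without re-parsing the response.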

- book/src/advanced/complexity-triage.md (new): full feature page covering config reference, bypass optimization, timeout/fallback, cascade hybrid
- book/src/advanced/adaptive-inference.md: add triage row to strategy table
- book/src/reference/configuration.md: document [llm.complexity_routing] fields and add cross-reference link
- book/src/SUMMARY.md: insert new page under Advanced
- README.md: mention LlmRoutingStrategy::Triage in routing description
- crates/zeph-llm/README.md: add TriageRouter section with config example
- crates/zeph-config/README.md: add ComplexityRoutingConfig to key types

github-actions bot added the documentation, llm, rust, core, enhancement, and size/XL labels Mar 23, 2026

bug-ops enabled auto-merge (squash) March 23, 2026 05:58

bug-ops added 4 commits March 23, 2026 07:08

fix(llm): resolve clippy warnings in TriageRouter CI job

58e1c77

- TriageRouter::set_status_tx takes &StatusTx (needless_pass_by_value)
- remove unused LlmProvider import in bootstrap/provider.rs
- collapse nested if blocks in bypass_single_provider check

chore: merge origin/main, resolve CHANGELOG conflict

62369d7

docs(specs): add complexity triage routing spec (#2141)

bc34db5

bug-ops merged commit bc9af89 into main Mar 23, 2026
29 checks passed

bug-ops deleted the feat-issue-2141-complexity-triage-routing branch March 23, 2026 06:24

bug-ops merged 5 commits into main from feat-issue-2141-complexity-triage-routing
