feat(config): unified [[llm.providers]] schema (#2134) #2148

Merged
bug-ops merged 22 commits into main from feat/issue-2134/config-unified-providers
Mar 23, 2026

bug-ops (Owner) commented Mar 23, 2026

Closes #2134, #2135, #2136, #2137, #2138, #2139

Summary

Replaces 6 separate LLM provider structs and the dual orchestrator/router concepts with a single [[llm.providers]] array. Each provider is defined exactly once; routing is declared via the routing field.

New config types:

  • ProviderEntry flat-union struct replaces CloudLlmConfig, OpenAiConfig, GeminiConfig, OllamaConfig, CompatibleConfig, OrchestratorProviderConfig
  • LlmRoutingStrategy enum (None/Ema/Thompson/Cascade/Task) replaces provider = "router" / provider = "orchestrator"
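
To make the shape of the two new types concrete, here is a minimal Rust sketch. The field set is inferred from the Before / After example in this description (name, type, model, max_tokens, default) plus base_url from the legacy fields. The real ProviderEntry in zeph-config very likely has more fields and serde attributes, and the `parse` helper with its lowercase spellings is an assumption made for illustration only.

```rust
/// Routing strategies named in the PR summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum LlmRoutingStrategy {
    #[default]
    None,
    Ema,
    Thompson,
    Cascade,
    Task,
}

impl LlmRoutingStrategy {
    /// Hypothetical parser for the `routing = "..."` value.
    /// The lowercase spellings are assumed from `routing = "task"`.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "none" => Some(Self::None),
            "ema" => Some(Self::Ema),
            "thompson" => Some(Self::Thompson),
            "cascade" => Some(Self::Cascade),
            "task" => Some(Self::Task),
            _ => None,
        }
    }
}

/// Flat-union entry: one struct for every provider type, with the
/// type-specific fields left optional. Field list is an assumption.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct ProviderEntry {
    pub name: String,
    /// Corresponds to the `type` key in TOML (`type` is a Rust keyword).
    pub kind: String,
    pub model: Option<String>,
    pub base_url: Option<String>,
    pub max_tokens: Option<u32>,
    pub default: bool,
}
```

The point of the flat union is that a single struct can describe a Claude, Ollama, or compatible provider, so the routing strategy no longer needs its own nested provider definitions.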

Implementation:

  • create_provider_from_pool() dispatches on routing strategy, building RouterProvider for Ema/Thompson/Cascade
  • resolve_secrets() updated for compatible providers in new providers pool
  • migrate_llm_to_providers() converts all 6 legacy formats; old-format configs produce startup error with --migrate-config hint
  • --init wizard generates new format
  • All legacy structs removed in same PR (no deprecation shim)
  • 10 pages of documentation updated (book/src/)
  • Root and crate READMEs updated

Tests: 6406/6406 pass (+9 net vs main baseline of 6397)

Breaking Change

Old-format configs ([llm.cloud], [llm.openai], [llm.orchestrator], [llm.router]) produce a startup error:

legacy LLM config format detected — run: zeph --migrate-config

All removed types documented in CHANGELOG.md.
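
As a rough illustration of how old-format configs could be recognized, the sketch below scans raw TOML text for the legacy table headers listed above. It is not the project's check_legacy_format(); the function name `find_legacy_sections` and the line-based scan are assumptions, and a real implementation would more likely inspect the parsed TOML document.

```rust
/// Legacy tables named in the Breaking Change section.
const LEGACY_SECTIONS: [&str; 4] = [
    "llm.cloud",
    "llm.openai",
    "llm.orchestrator",
    "llm.router",
];

/// Hypothetical helper: returns every legacy table header found in the
/// given TOML text, in order of first appearance.
pub fn find_legacy_sections(toml_text: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();
    for line in toml_text.lines() {
        let line = line.trim();
        // Keep only `[table]` headers.
        let Some(header) = line.strip_prefix('[').and_then(|l| l.strip_suffix(']')) else {
            continue;
        };
        // Array-of-tables headers such as `[[llm.providers]]` are new-format.
        if header.starts_with('[') {
            continue;
        }
        // Match the table itself or any nested table below it.
        let is_legacy = LEGACY_SECTIONS
            .iter()
            .any(|s| header == *s || header.starts_with(&format!("{s}.")));
        if is_legacy && !found.iter().any(|f| f.as_str() == header) {
            found.push(header.to_string());
        }
    }
    found
}
```

A caller could treat a non-empty result as the startup error and print the `zeph --migrate-config` hint quoted above.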

Before / After

# Before: 35 lines, triplicated model field
[llm]
provider = "orchestrator"
model = "claude-sonnet-4-6"
[llm.cloud]
model = "claude-sonnet-4-6"
max_tokens = 4096
[llm.orchestrator.providers.claude]
type = "claude"
model = "claude-sonnet-4-6"

# After: 12 lines, single source of truth
[llm]
routing = "task"
[llm.routes]
chat = ["claude", "ollama"]
[[llm.providers]]
name = "claude"
type = "claude"
model = "claude-sonnet-4-6"
max_tokens = 4096
default = true

Test Plan

  • cargo +nightly fmt --check — PASS
  • cargo clippy --workspace --features full -- -D warnings — PASS
  • cargo nextest run --workspace --features full --lib --bins — 6406/6406 PASS
  • Live LLM session test (LLM serialization gate) — required before merge

bug-ops added 6 commits March 22, 2026 23:28
Add unified ProviderEntry struct that replaces CloudLlmConfig, OpenAiConfig,
GeminiConfig, OllamaConfig, CompatibleConfig, and OrchestratorProviderConfig.
Add LlmRoutingStrategy enum replacing the orchestrator/router split.
Add CandleInlineConfig for use inside ProviderEntry.
Add validate() and validate_pool() with B1/B2 blocker checks.

No behavior change: existing types untouched, new types are additive.
…bility

Introduce [[llm.providers]] array format alongside legacy [llm] fields.
LlmConfig fields provider/base_url/model become Option<T> with effective_*()
helpers that fall back to sensible defaults. Add migrate_llm_to_providers()
to auto-convert all legacy formats (ollama, claude, openai, gemini,
compatible, orchestrator, router) during --migrate-config. Update
config/default.toml to use new [[llm.providers]] format.

All 6397 tests pass.
…ders]] pool

Add build_provider_from_entry() that constructs AnyProvider directly from
a ProviderEntry without relying on legacy config sections. Update
create_provider() to dispatch via the new pool when providers is non-empty,
with fallback to the next pool entry on initialization failure.
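
The fallback behavior described in this commit can be sketched as a loop over the pool that returns the first provider that initializes. The generic helper `first_initialized` is hypothetical; the actual create_provider() works with ProviderEntry and AnyProvider and may log or report failures differently.

```rust
/// Hypothetical helper: try each pool entry in order and return the first
/// one that builds successfully. If every entry fails, return all errors.
pub fn first_initialized<E, P, Fail>(
    entries: &[E],
    mut build: impl FnMut(&E) -> Result<P, Fail>,
) -> Result<P, Vec<Fail>> {
    let mut failures = Vec::new();
    for entry in entries {
        match build(entry) {
            Ok(provider) => return Ok(provider),
            // Remember the failure and fall through to the next entry.
            Err(failure) => failures.push(failure),
        }
    }
    Err(failures)
}
```

Returning the collected failures keeps the "all entries failed" case diagnosable, which matters once the pool is the only source of providers.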
…ew format

- check_legacy_format() is now called during bootstrap; old-format configs
  fail with actionable error: "Run zeph --migrate-config"
- effective_provider/base_url/model() now check providers[0] first, enabling
  new-format configs to work with all legacy call sites
- Default impl added for ProviderEntry
- --init wizard now generates [[llm.providers]] entries instead of legacy sections
- Update tests to verify new-format output from build_config()

Completes Phase 2 acceptance: starting with an old-format config now produces an error, and --init
generates the new format. All 6397 tests pass.
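
The lookup order described for the effective_*() helpers in this transitional phase (providers[0] first, then the legacy field, then a default) might look like the sketch below. The struct layouts are reduced to the fields needed here, and both default constants are assumptions: the base URL is Ollama's usual local address, and the model name is a placeholder rather than the project's real default.

```rust
#[derive(Debug, Default)]
pub struct ProviderEntry {
    pub model: Option<String>,
    pub base_url: Option<String>,
}

#[derive(Debug, Default)]
pub struct LlmConfig {
    /// Legacy top-level fields, optional during the transition.
    pub model: Option<String>,
    pub base_url: Option<String>,
    /// New-format pool.
    pub providers: Vec<ProviderEntry>,
}

/// Assumed defaults for illustration only.
pub const DEFAULT_BASE_URL: &str = "http://localhost:11434";
pub const DEFAULT_MODEL: &str = "example-model";

impl LlmConfig {
    /// First pool entry wins, then the legacy field, then the default.
    pub fn effective_model(&self) -> &str {
        self.providers
            .first()
            .and_then(|p| p.model.as_deref())
            .or(self.model.as_deref())
            .unwrap_or(DEFAULT_MODEL)
    }

    /// Same precedence for the base URL.
    pub fn effective_base_url(&self) -> &str {
        self.providers
            .first()
            .and_then(|p| p.base_url.as_deref())
            .or(self.base_url.as_deref())
            .unwrap_or(DEFAULT_BASE_URL)
    }
}
```

Keeping the precedence inside the helpers is what lets existing call sites keep working while the config format underneath them changes.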
… format

C1: create_provider_from_pool() now dispatches on LlmRoutingStrategy —
Ema/Thompson/Cascade initialize all pool providers and wrap in RouterProvider;
Task strategy logs a warning and falls back to single provider.
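
The dispatch in C1 can be pictured with the following sketch. It uses provider names in place of real provider objects, and `BuiltProvider` and `create_from_pool` are hypothetical stand-ins for RouterProvider and create_provider_from_pool(). Which entry counts as the "single provider" is not stated in the commit message; the first pool entry is assumed here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmRoutingStrategy {
    None,
    Ema,
    Thompson,
    Cascade,
    Task,
}

/// Stand-in for the constructed provider.
#[derive(Debug, PartialEq, Eq)]
pub enum BuiltProvider {
    Single(String),
    Router {
        strategy: LlmRoutingStrategy,
        members: Vec<String>,
    },
}

/// Hypothetical dispatch over the routing strategy.
/// Returns `None` for an empty pool; the caller decides on a fallback.
pub fn create_from_pool(strategy: LlmRoutingStrategy, pool: &[&str]) -> Option<BuiltProvider> {
    let first = pool.first()?;
    match strategy {
        // Router strategies initialize every pool member.
        LlmRoutingStrategy::Ema | LlmRoutingStrategy::Thompson | LlmRoutingStrategy::Cascade => {
            Some(BuiltProvider::Router {
                strategy,
                members: pool.iter().map(|name| name.to_string()).collect(),
            })
        }
        // Task routing warns and falls back to one provider, as in C1.
        LlmRoutingStrategy::Task => {
            eprintln!("warning: task routing falls back to a single provider");
            Some(BuiltProvider::Single(first.to_string()))
        }
        LlmRoutingStrategy::None => Some(BuiltProvider::Single(first.to_string())),
    }
}
```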

C2: resolve_secrets() now iterates self.llm.providers for compatible entries
and fetches ZEPH_COMPATIBLE_<NAME>_API_KEY from vault.
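
For illustration, the vault key for a compatible provider could be derived from the entry name as sketched below. Only the ZEPH_COMPATIBLE_<NAME>_API_KEY pattern comes from the commit message; the helper name and the normalization (uppercase, other characters replaced with underscores) are assumptions.

```rust
/// Hypothetical helper: build the vault key for a compatible provider.
/// ASCII letters and digits are uppercased; anything else becomes `_`.
pub fn compatible_api_key_name(provider_name: &str) -> String {
    let normalized: String = provider_name
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_uppercase()
            } else {
                '_'
            }
        })
        .collect();
    format!("ZEPH_COMPATIBLE_{normalized}_API_KEY")
}
```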

H1: migrate_llm_to_providers() now copies [llm.cloud.thinking] as TOML
inline table into the migrated [[llm.providers]] entry.

H2: migrate_llm_to_providers() now copies thinking_level, thinking_budget,
include_thoughts from [llm.gemini] into the migrated entry.

H3: check_legacy_format() no longer includes orchestrator in has_legacy
detection — orchestrator is not yet migrated to [[llm.providers]] format.

H4: LlmConfig.summary_provider changed from OrchestratorProviderConfig to
ProviderEntry; build_summary_provider() uses build_provider_from_entry().
bug-ops added 12 commits March 23, 2026 01:37
Remove CloudLlmConfig, OpenAiConfig, GeminiConfig, OllamaConfig,
OrchestratorConfig, OrchestratorProviderConfig, CompatibleConfig from
zeph-config. Remove ProviderKind::Orchestrator and ::Router variants.
Remove legacy LlmConfig fields: provider, base_url, model, cloud,
openai, gemini, ollama, compatible, orchestrator, vision_model.

All bootstrap paths now use create_provider_from_pool() exclusively.
Empty pool falls back to default Ollama on localhost. The --init wizard
Orchestrator option now produces a two-entry [[llm.providers]] pool.
check_legacy_format() simplified to always return Ok(()).

Restore 10 tests from bootstrap/tests.rs that were removed during the
legacy struct cleanup but test live functionality unrelated to legacy
config: create_mcp_manager_with_stdio_transport, create_mcp_manager_empty_servers,
create_mcp_registry_when_semantic_disabled, managed_skills_dir_returns_skills_subdir,
app_builder_managed_skills_dir_matches_free_fn, skill_paths_includes_managed_dir,
skill_paths_does_not_duplicate_managed_dir, create_skill_matcher_when_semantic_disabled,
appbuilder_qdrant_ops_invalid_url_returns_err, appbuilder_qdrant_ops_valid_url_succeeds.

Test count: 5942 → 5952 (+10).
bug-ops force-pushed the feat/issue-2134/config-unified-providers branch from 80733e0 to 7243710 March 23, 2026 00:47
bug-ops enabled auto-merge (squash) March 23, 2026 00:47
bug-ops added 2 commits March 23, 2026 02:01
…mpty pool

When ZEPH_LLM_PROVIDER, ZEPH_LLM_BASE_URL, or ZEPH_LLM_MODEL are set on a
config with no [[llm.providers]] entries (e.g. default config), the overrides
were silently dropped because first_mut() returned None on the empty Vec.

Fix: create a default ProviderEntry before applying env overrides when the
pool is empty and any LLM env override is present.

Fixes CI failure in config_defaults_and_env_overrides (tests/integration.rs:331).
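
The bug and its fix can be reproduced in miniature. In the sketch below the overrides are passed in as a struct rather than read from the environment, so the example stays deterministic; `EnvOverrides` and `apply_env_overrides` are hypothetical names, and ProviderEntry is reduced to the three fields the overrides touch.

```rust
#[derive(Debug, Default, PartialEq)]
pub struct ProviderEntry {
    pub kind: Option<String>,
    pub base_url: Option<String>,
    pub model: Option<String>,
}

/// Values that would come from ZEPH_LLM_PROVIDER, ZEPH_LLM_BASE_URL
/// and ZEPH_LLM_MODEL.
#[derive(Debug, Default)]
pub struct EnvOverrides {
    pub provider: Option<String>,
    pub base_url: Option<String>,
    pub model: Option<String>,
}

impl EnvOverrides {
    fn is_empty(&self) -> bool {
        self.provider.is_none() && self.base_url.is_none() && self.model.is_none()
    }
}

/// Apply overrides to the first pool entry, creating a default entry
/// first when the pool is empty so the overrides are not dropped.
pub fn apply_env_overrides(pool: &mut Vec<ProviderEntry>, env: EnvOverrides) {
    if env.is_empty() {
        return;
    }
    if pool.is_empty() {
        pool.push(ProviderEntry::default());
    }
    // Without the push above, first_mut() would be None on an empty Vec.
    let first = pool.first_mut().expect("pool is non-empty at this point");
    if let Some(provider) = env.provider {
        first.kind = Some(provider);
    }
    if let Some(base_url) = env.base_url {
        first.base_url = Some(base_url);
    }
    if let Some(model) = env.model {
        first.model = Some(model);
    }
}
```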
Resolve conflict in env.rs: keep fix for env var override when providers
pool is empty. Remote adds response verifier and sanitizer changes.
bug-ops merged commit d4f751e into main Mar 23, 2026
29 checks passed
bug-ops deleted the feat/issue-2134/config-unified-providers branch March 23, 2026 01:12
Labels

  • config: Configuration file changes
  • core: zeph-core crate
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • rust: Rust code changes
  • size/XL: Extra large PR (500+ lines)
  • tests: Test-related changes
