Summary
ByteRover currently appears to use one active provider/model for all task types. It would be useful to configure different models per task category so latency-sensitive tasks can use a fast model, while background/offline tasks can use a higher-quality model.
Motivation
Some ByteRover tasks sit directly on the critical path of an agent conversation. Others are background or offline maintenance.
For example:
- query / memory prefetch affects perceived response latency in an agent chat.
- search should stay fast and cheap.
- curate and automatic turn sync are usually background writes, where quality matters more.
- dream is offline/background consolidation, where using a stronger model is desirable and latency is less important.
With only one active model, users have to choose between:
- a fast model that keeps chat responsive but may be weaker for curation/consolidation, or
- a strong model that improves curation/dream quality but can slow down query/prefetch.
Proposed config
Something like:
models:
  default:
    provider: google
    model: gemini-3.1-flash-lite-preview
  tasks:
    query:
      provider: google
      model: gemini-3.1-flash-lite-preview
    search:
      provider: google
      model: gemini-3.1-flash-lite-preview
    curate:
      provider: google
      model: gemini-3-flash-preview
    sync_turn:
      provider: google
      model: gemini-3-flash-preview
    dream:
      provider: google
      model: gemini-3.1-pro
or if providers are shared globally:
modelRouting:
  query: gemini-3.1-flash-lite-preview
  search: gemini-3.1-flash-lite-preview
  curate: gemini-3-flash-preview
  sync_turn: gemini-3-flash-preview
  dream: gemini-3.1-pro
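To illustrate that the two shapes can carry the same information, here is a minimal sketch that normalizes either one into a single task-to-model table. It assumes the config has already been parsed into a dict; `ModelRef` and `normalize_routing` are hypothetical names for illustration, not existing ByteRover APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRef:
    provider: str
    model: str


def normalize_routing(config: dict, active: ModelRef) -> dict[str, ModelRef]:
    """Build a task -> ModelRef table from either proposed config shape.

    `active` is the currently active provider/model, used to fill in
    anything the config leaves out. (Hypothetical helper, not a ByteRover API.)
    """
    table: dict[str, ModelRef] = {}

    # Shape 1: models.tasks.<task> = {provider, model}
    models = config.get("models", {})
    default_cfg = models.get("default", {})
    default = ModelRef(
        default_cfg.get("provider", active.provider),
        default_cfg.get("model", active.model),
    )
    for task, entry in models.get("tasks", {}).items():
        table[task] = ModelRef(
            entry.get("provider", default.provider),
            entry.get("model", default.model),
        )

    # Shape 2: modelRouting.<task> = model name, provider shared globally.
    # Entries from shape 1 win if both shapes name the same task.
    for task, model in config.get("modelRouting", {}).items():
        table.setdefault(task, ModelRef(active.provider, model))

    return table
```

Either shape would work for the use case here; the shorthand is simply less verbose when every task uses the same provider.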
Desired behavior
- brv query uses the configured query model.
- brv search uses the configured search model, or no LLM when the search path is pure retrieval.
- brv curate uses the configured curate model.
- automatic conversation sync uses the configured sync_turn model.
- brv dream and daemon-triggered dream use the configured dream model.
- if no task-specific model is configured, ByteRover falls back to the current active model to preserve existing behavior.
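The resolution rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not ByteRover's implementation: `resolve_model`, `task_models`, and the `needs_llm` flag are hypothetical names.

```python
from typing import Optional


def resolve_model(
    task: str,
    task_models: dict[str, str],
    active_model: str,
    needs_llm: bool = True,
) -> Optional[str]:
    """Pick the model for a task, or None when no LLM call is needed.

    task_models:  per-task overrides, e.g. {"dream": "gemini-3.1-pro"}
    active_model: the single model ByteRover uses today (the fallback)
    needs_llm:    False for paths such as pure-retrieval search
    """
    if not needs_llm:
        return None
    # An unset (or empty) override falls back to the active model,
    # which preserves today's behavior for unconfigured tasks.
    return task_models.get(task) or active_model


overrides = {"query": "gemini-3.1-flash-lite-preview", "dream": "gemini-3.1-pro"}
active = "gemini-3-flash-preview"

dream_model = resolve_model("dream", overrides, active)     # task override
curate_model = resolve_model("curate", overrides, active)   # falls back to active
search_model = resolve_model("search", overrides, active, needs_llm=False)  # None
```

With an empty `task_models`, every task resolves to the active model, so existing setups would behave exactly as they do now.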
Example use case
For an agent using ByteRover as a memory provider:
- query/search: gemini-3.1-flash-lite-preview
- curate/sync_turn: gemini-3-flash-preview
- dream: gemini-3.1-pro
This keeps regular agent responses fast while improving quality for background memory writes and offline consolidation.
Why this matters
In an interactive agent, even a few extra seconds of prefetch latency is noticeable. But for dream and background curation, latency is much less important than quality and reliability.
Task-specific model routing would let users optimize for both:
- low perceived latency during normal chat
- higher quality long-term memory organization
Backwards compatibility
This can be fully backwards compatible if the current active provider/model remains the default for all task types unless a task override is configured.
Related issue
This is related to a curator quality issue where a lighter model appeared to copy prompt examples into the context tree. Task-specific routing would let users keep fast models for latency-sensitive retrieval while using stronger models for write/consolidation tasks.