Feature request: Support task-specific model routing for query, search, curate, sync, and dream #648

@999cleo

Summary

ByteRover currently appears to use one active provider/model for all task types. It would be useful to configure different models per task category so latency-sensitive tasks can use a fast model, while background/offline tasks can use a higher-quality model.

Motivation

Some ByteRover tasks sit directly on the critical path of an agent conversation. Others are background or offline maintenance.

For example:

  • query / memory prefetch affects perceived response latency in an agent chat.
  • search should stay fast and cheap.
  • curate and automatic turn sync are usually background writes, where quality matters more.
  • dream is offline/background consolidation, where using a stronger model is desirable and latency is less important.

With only one active model, users have to choose between:

  • a fast model that keeps chat responsive but may be weaker for curation/consolidation, or
  • a strong model that improves curation/dream quality but can slow down query/prefetch.

Proposed config

Something like:

models:
  default:
    provider: google
    model: gemini-3.1-flash-lite-preview
  tasks:
    query:
      provider: google
      model: gemini-3.1-flash-lite-preview
    search:
      provider: google
      model: gemini-3.1-flash-lite-preview
    curate:
      provider: google
      model: gemini-3-flash-preview
    sync_turn:
      provider: google
      model: gemini-3-flash-preview
    dream:
      provider: google
      model: gemini-3.1-pro

or if providers are shared globally:

modelRouting:
  query: gemini-3.1-flash-lite-preview
  search: gemini-3.1-flash-lite-preview
  curate: gemini-3-flash-preview
  sync_turn: gemini-3-flash-preview
  dream: gemini-3.1-pro
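
As a rough sketch of how the two shapes might relate, the shorthand task-to-model map could be expanded into the per-task provider/model form by attaching one shared provider. The helper name `expand_model_routing` and the `shared_provider` parameter are hypothetical and are not part of ByteRover today.

```python
from typing import Dict


def expand_model_routing(
    model_routing: Dict[str, str], shared_provider: str
) -> Dict[str, Dict[str, str]]:
    """Expand a task -> model map into task -> {provider, model} entries.

    Hypothetical helper: assumes every task uses the same provider.
    """
    return {
        task: {"provider": shared_provider, "model": model}
        for task, model in model_routing.items()
    }


# Shorthand form, as in the modelRouting example (subset of tasks shown).
routing = {
    "query": "gemini-3.1-flash-lite-preview",
    "dream": "gemini-3.1-pro",
}
expanded = expand_model_routing(routing, shared_provider="google")
```

With this approach the shorthand stays a convenience layer, and the rest of the routing logic would only need to understand the per-task form.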

Desired behavior

  • brv query uses the configured query model.
  • brv search uses the configured search model, or no LLM when the search path is pure retrieval.
  • brv curate uses the configured curate model.
  • automatic conversation sync uses the configured sync_turn model.
  • brv dream and daemon-triggered dream use the configured dream model.
  • if no task-specific model is configured, ByteRover falls back to the current active model to preserve existing behavior.
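
The behavior listed above could be sketched roughly as follows. This is a minimal illustration, not ByteRover's implementation: `ModelRef`, `resolve_model`, and the `needs_llm` flag are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelRef:
    provider: str
    model: str


def resolve_model(
    task: str,
    task_overrides: Dict[str, ModelRef],
    active_model: ModelRef,
    needs_llm: bool = True,
) -> Optional[ModelRef]:
    """Pick the model for a task, falling back to the active model.

    Returns None when the task path needs no LLM
    (for example, a pure-retrieval search).
    """
    if not needs_llm:
        return None
    # An unconfigured task keeps today's behavior: use the active model.
    return task_overrides.get(task, active_model)


active = ModelRef("google", "gemini-3.1-flash-lite-preview")
overrides = {"dream": ModelRef("google", "gemini-3.1-pro")}

dream_model = resolve_model("dream", overrides, active)
query_model = resolve_model("query", overrides, active)
search_model = resolve_model("search", overrides, active, needs_llm=False)
```

Here `dream` picks up its override, `query` has no override and falls back to the active model, and a pure-retrieval `search` resolves to no model at all.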

Example use case

For an agent using ByteRover as a memory provider:

query/search: gemini-3.1-flash-lite-preview
curate/sync_turn: gemini-3-flash-preview
dream: gemini-3.1-pro

This keeps regular agent responses fast while improving quality for background memory writes and offline consolidation.

Why this matters

In an interactive agent, even a few extra seconds of prefetch latency are noticeable. For dream and background curation, latency matters much less than quality and reliability.

Task-specific model routing would let users optimize for both:

  • low perceived latency during normal chat
  • higher quality long-term memory organization

Backwards compatibility

This can be fully backwards compatible if the current active provider/model remains the default for all task types unless a task override is configured.

Related issue

This is related to a curator quality issue where a lighter model appeared to copy prompt examples into the context tree. Task-specific routing would let users keep fast models for latency-sensitive retrieval while using stronger models for write/consolidation tasks.
