feat: context engineering — selective tools + adaptive compaction (#69)
kienbui1995 merged 1 commit into main
Conversation
1. Selective tool loading by model tier:
   - Tier 1 (frontier): all 30 tools
   - Tier 2 (strong): 25 tools (no worktree/notebook/mcp)
   - Tier 3 (local): 8 core tools only
   - Tier 4 (Qwen): 10 tools (core + memory + codebase_search)

   Saves ~1500 tokens for small models.

2. Adaptive compaction thresholds:
   - Small context (<64K): compact at 70%, light at 55%
   - Large context (≥64K): compact at 90%, light at 80%

   Prevents Qwen 3.5 (32K) from running out of context.

274 tests, 0 fail.
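The tier gate described above can be sketched as a single predicate. This is a hypothetical reconstruction, not the PR's actual implementation: the tool names and the "core" set are illustrative stand-ins; only the tier semantics follow the description.

```rust
/// Sketch of per-tier tool filtering. Tool names and the core set are
/// assumed for illustration; only the tier rules follow the PR description.
fn tool_allowed_for_tier(name: &str, tier: u8) -> bool {
    // Minimal stand-in for the 8 "core" tools of tier 3.
    const CORE: &[&str] = &[
        "read_file", "write_file", "edit_file", "bash",
        "grep", "glob", "ls", "todo",
    ];
    match tier {
        // Frontier: every registered tool.
        1 => true,
        // Strong: drop the heavy worktree/notebook/mcp integrations.
        2 => !(name.starts_with("worktree")
            || name.starts_with("notebook")
            || name.starts_with("mcp")),
        // Local: core tools only.
        3 => CORE.contains(&name),
        // Qwen: core plus memory and codebase_search.
        4 => CORE.contains(&name) || name == "memory" || name == "codebase_search",
        // Unknown tier: fail open rather than silently dropping tools.
        _ => true,
    }
}
```

Filtering at spec-selection time (rather than hiding tools in the prompt) keeps the system prompt and the request's tool list consistent.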
📝 Walkthrough

This PR introduces model-based tool-tier restrictions to the conversation runtime.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Code Review
This pull request introduces a tiered tool system and dynamic context compaction thresholds to optimize performance for models with smaller context windows. Key feedback includes the need to initialize and update the tool tier in all execution modes and model changes, and to use an enum for better type safety. Additionally, the compaction logic should be adjusted to prevent shadowing of micro-compaction, token estimation should account for filtered tools, and magic numbers should be replaced with constants.
```rust
let mut rt =
    mc_core::ConversationRuntime::new(model.to_string(), max_tokens, system.to_string());
rt.set_tool_registry(tool_registry);
rt.tool_tier = model_prompt_tier(&model);
```
The tool_tier is correctly initialized here for the TUI mode, but it is missing in the run_single function (around line 1593). This will cause models to use the default tier 1 (all tools) when running in CLI or pipe mode, which may exceed the context window or reasoning capabilities of smaller models like Qwen or Llama. Additionally, the system prompt (built at line 187) will be tiered, but the actual tools sent in the request will not be, leading to a discrepancy between the model's instructions and its available tools.
```rust
task_manager: crate::tasks::TaskManager,
hierarchical_instructions: Option<String>,
/// Model capability tier for tool filtering (1=frontier, 2=strong, 3=local, 4=qwen).
pub tool_tier: u8,
```
Using a raw u8 for tool_tier is error-prone and reduces readability. Consider defining a ModelTier enum to provide better type safety. Furthermore, note that tool_tier is currently not updated when the model is changed via set_model (line 238), which will result in incorrect tool filtering if a user switches between frontier and local models mid-session.
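A minimal sketch of the suggested typed tier. The variant names mirror the doc comment on `tool_tier`; the name-matching heuristics in `from_model` are assumptions for illustration, not the repository's actual `model_prompt_tier` logic.

```rust
/// Sketch of a typed tier replacing the raw u8. Variant names follow the
/// doc comment on `tool_tier`; the mapping below is assumed.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ModelTier {
    Frontier = 1,
    Strong = 2,
    Local = 3,
    Qwen = 4,
}

impl ModelTier {
    /// Hypothetical name-based mapping; the substring checks are
    /// illustrative stand-ins for the real tier-selection helper.
    fn from_model(model: &str) -> Self {
        let m = model.to_lowercase();
        if m.contains("qwen") {
            ModelTier::Qwen
        } else if m.contains("llama") {
            ModelTier::Local
        } else {
            ModelTier::Frontier
        }
    }
}
```

Calling a constructor like `ModelTier::from_model` inside `set_model` would also address the second point: the tier stays in sync when the user switches between frontier and local models mid-session.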
```diff
- if usage_pct > 90 {
+ // Small context models (<64K) compact earlier
+ let (full_threshold, light_threshold) = if ctx_window < 65_536 {
      crate::compact::compact_session(&mut self.session, preserve);
  }
- } else if usage_pct > 80 {
+ } else if usage_pct > light_threshold {
```
For small context models, light_threshold is set to 55. This else if block will match for any usage_pct above 55, which effectively shadows the subsequent else if usage_pct > 60 check (line 1383) that performs micro_compact. As a result, micro_compact (which trims long outputs for all tools) will never be executed for small models when they are under context pressure. Consider reordering the checks or making them independent.
```rust
.tool_registry
.all_specs()
.iter()
.filter(|s| tool_allowed_for_tier(s.name.as_str(), self.tool_tier))
```
While tools are correctly filtered here before being sent to the model, the tool_schema_tokens calculation at line 1465 still iterates over all tools in the registry. This leads to an overestimation of the context used by tool definitions for lower-tier models, resulting in a smaller effective_max_tokens budget than actually available. The filter should also be applied during the token estimation at line 1465.
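The fix amounts to reusing the same predicate in the estimate. A sketch under assumed types: `ToolSpec`, the function signature, and the chars-per-token heuristic are stand-ins for the real runtime code, not its actual API.

```rust
/// Assumed stand-in for the registry's tool spec type.
struct ToolSpec {
    name: String,
    schema_json: String,
}

/// Estimate schema tokens over *filtered* tools only, so lower-tier
/// models get their full effective token budget back.
fn tool_schema_tokens(specs: &[ToolSpec], tier: u8, allowed: fn(&str, u8) -> bool) -> usize {
    specs
        .iter()
        .filter(|s| allowed(&s.name, tier)) // same predicate as the request path
        .map(|s| s.schema_json.len() / 4)   // crude ~4 chars per token
        .sum()
}

/// Example predicate: tier 3 only allows "read_file" (illustrative only).
fn demo_allowed(name: &str, tier: u8) -> bool {
    tier != 3 || name == "read_file"
}
```

Using one shared predicate for both the request payload and the budget estimate keeps the two from drifting apart again.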
Actionable comments posted: 1
🧹 Nitpick comments (1)
mc/crates/mc-core/src/runtime.rs (1)
1362-1386: Micro-compact branch is unreachable for small context models.

For models with `ctx_window < 65_536`, the thresholds are `(70, 55)`. The `else if usage_pct > 60` check at line 1383 will never trigger because any value above 60 is already caught by `usage_pct > light_threshold` (55).

If this is intentional (small models skip micro-compact and jump to light compact earlier), consider adding a comment to clarify. Otherwise, the micro-compact threshold should also be adaptive:
♻️ Proposed fix to make micro-compact threshold adaptive

```diff
  // Small context models (<64K) compact earlier
  let (full_threshold, light_threshold) = if ctx_window < 65_536 {
      (70, 55) // compact at 70%, light at 55%
  } else {
      (90, 80)
  };
+ let micro_threshold = if ctx_window < 65_536 { 40 } else { 60 };
  if usage_pct > full_threshold {
      // Full smart compact ...
  } else if usage_pct > light_threshold {
      // Collapse reads + snip thinking ...
- } else if usage_pct > 60 {
+ } else if usage_pct > micro_threshold {
      // Micro-compact only
      crate::compact::micro_compact(&mut self.session);
  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@mc/crates/mc-core/src/runtime.rs` around lines 1362 - 1386, The micro-compact branch is currently unreachable for small models because light_threshold is 55 while the hardcoded micro threshold is 60; define an adaptive micro_threshold (e.g., let micro_threshold = if ctx_window < 65_536 { 50 } else { 60 }) and replace the final else-if (usage_pct > 60) with else if usage_pct > micro_threshold so micro_compact(&mut self.session) can be reached for small and large ctx_window values; reference variables/functions: ctx_window, full_threshold, light_threshold, usage_pct, micro_threshold, crate::compact::micro_compact, crate::compact::collapse_reads, crate::compact::snip_thinking.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Line 421: In run_single, after creating the ConversationRuntime via
ConversationRuntime::new (in function run_single), set runtime.tool_tier =
model_prompt_tier(&model) so the same model-based tool tier selection done in
run_tui is applied to pipe/non-interactive runs; locate the runtime variable in
run_single and assign tool_tier using the model_prompt_tier(&model) helper
before the runtime is used.
---
Nitpick comments:
In `@mc/crates/mc-core/src/runtime.rs`:
- Around line 1362-1386: The micro-compact branch is currently unreachable for
small models because light_threshold is 55 while the hardcoded micro threshold
is 60; define an adaptive micro_threshold (e.g., let micro_threshold = if
ctx_window < 65_536 { 50 } else { 60 }) and replace the final else-if (usage_pct
> 60) with else if usage_pct > micro_threshold so micro_compact(&mut
self.session) can be reached for small and large ctx_window values; reference
variables/functions: ctx_window, full_threshold, light_threshold, usage_pct,
micro_threshold, crate::compact::micro_compact, crate::compact::collapse_reads,
crate::compact::snip_thinking.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 07f7191e-dff9-4ef1-9099-2fa4191ad50c
📒 Files selected for processing (2)
- mc/crates/mc-cli/src/main.rs
- mc/crates/mc-core/src/runtime.rs
```rust
let mut rt =
    mc_core::ConversationRuntime::new(model.to_string(), max_tokens, system.to_string());
rt.set_tool_registry(tool_registry);
rt.tool_tier = model_prompt_tier(&model);
```
LGTM for TUI path, but run_single is missing the same assignment.
The tier assignment in run_tui is correct. However, the run_single function (lines 1569-1593) creates a ConversationRuntime without setting tool_tier:
```rust
// Line 1569-1570
let mut runtime =
    mc_core::ConversationRuntime::new(model.to_string(), max_tokens, system.to_string());
// ... tool_tier is never set, defaults to 1
```

This means pipe/non-interactive mode (`magic-code --pipe "fix bug" --model qwen3.5`) will always send all 30 tools regardless of model tier, defeating the context engineering optimization.
🐛 Proposed fix to add tier assignment in run_single

```diff
  async fn run_single(
      model: &str,
      ...
  ) -> Result<()> {
      let cancel = CancellationToken::new();
      let mut runtime =
          mc_core::ConversationRuntime::new(model.to_string(), max_tokens, system.to_string());
+     runtime.tool_tier = model_prompt_tier(model);
      let mut tool_registry = mc_tools::ToolRegistry::new()
          .with_workspace_root(std::env::current_dir().unwrap_or_default());
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@mc/crates/mc-cli/src/main.rs` at line 421, In run_single, after creating the
ConversationRuntime via ConversationRuntime::new (in function run_single), set
runtime.tool_tier = model_prompt_tier(&model) so the same model-based tool tier
selection done in run_tui is applied to pipe/non-interactive runs; locate the
runtime variable in run_single and assign tool_tier using the
model_prompt_tier(&model) helper before the runtime is used.
Context Engineering for Small Models
Selective Tool Loading
Adaptive Compaction
Critical for Qwen 3.5 9B self-hosted (32K context).
274 tests, 0 fail.
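The adaptive thresholds summarized above reduce to a small pure function. The 64K boundary and the percentages come from the PR description; the function name and shape are a sketch, not the runtime's actual code.

```rust
/// Returns (full_compact_pct, light_compact_pct) for a context window.
/// Numbers follow the PR description; the function itself is assumed.
fn compaction_thresholds(ctx_window: usize) -> (u8, u8) {
    if ctx_window < 65_536 {
        (70, 55) // small context (<64K): compact early to avoid overflow
    } else {
        (90, 80) // large context: let the session grow further first
    }
}
```

For a 32K model like the self-hosted Qwen 3.5 9B, this triggers a full compact at ~22.9K tokens instead of ~29.5K, leaving headroom for the next turn's output.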