-
Notifications
You must be signed in to change notification settings - Fork 0
[Enhancement] Intent Classification to Route Queries Correctly #2
Copy link
Copy link
Open
Description
Problem
All queries go through the full search + RAG pipeline, even ones that don't need web search
(e.g., "explain recursion", "what is 2+2", "hello").
Expected Behavior
Classify intent before routing:
| Intent | Action |
|---|---|
| Factual / Current Events | Full search + RAG |
| Conceptual / Definitional | LLM-only (skip search) |
| Math / Code | Structured generation |
| Greeting / Chitchat | Lightweight response |
| Ambiguous | Search with low confidence flag |
Why It Matters
- Reduces unnecessary SearXNG calls (critical since we self-host)
- Faster response for non-search queries
- Lower Groq token usage
Suggested Approach
- Use
llama-3.1-8b-instantas a fast intent classifier (single prompt, JSON output) - Gate the search pipeline on intent result
Labels
enhancement performance good first issue
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels