- Compile pipeline: renamed index pipeline to compile pipeline with passes-based architecture
- Compiler refactor: renamed stages to passes, removed deprecated `StageResult` alias and `CustomStageBuilder`
- New backend compilation passes: query routing, reasoning chains, overlap detection, and scoring
- Agent acceleration data added to compiled documents
- LLM-powered cross-document insight extraction in ask module
- Enhanced JSON parsing with proper error handling
- Upgraded minimum Python version to 3.11
- Removed unused modules: agent, memory backend, validation, ReferenceResolver, SufficiencyLevel
- Restructured configuration modules and removed legacy retrieval config
- Simplified storage layer by removing memory backend
- Documentation updates for architecture and compilation pipeline
- Project description updated to "reasoning-based document engine"
- Core principles documentation (Reason don't vector, Model fails we fail, No thought no answer)
- Updated homepage with three core principles and key features
- Description generation enabled by default
- `timeout_secs` option for Python indexing
- Agent-based navigation documentation
- Agent-based retrieval architecture: replaced pilot/search with Orchestrator + Workers
- Navigation commands: `ls`, `cd`, `cat`, `grep`, `find`, `head`, `pwd`, `wc`
- Orchestrator supervisor loop with dynamic re-planning
- Query understanding pipeline with `QueryPlan`
- Evidence evaluation and replanning modules
- `NavigationIndex` with `DocCard` and `SectionCard`
- LLM-based confidence scoring (replaced BM25)
- Unified rerank pipeline (replaced synthesis/fusion)
- `DocCard` catalog in workspace storage
- Shared concurrency control for LLM clients
- Memoization for LLM operations in retrieval pipeline
- LLM request timeout configuration
- GitHub Actions workflow for automated releases
- Endpoint parameter support for API configuration
- Custom config option in `EngineBuilder`
- Enhanced error messages with detailed failure info
- Endpoint validation in engine builder
- Runtime metrics reports (LLM, Pilot, Retrieval)
- Recursive option for `from_dir` method
- Directory indexing support via `IndexContext`
- Centralized `LlmPool` configuration system
- Shared LLM client injected into pipeline context
- Pipeline checkpoint for resumable indexing
- `source_path` field and updated `QueryContext` API
- `IndexMetrics` binding with detailed indexing statistics
- `StrategyPreference` for controlling retrieval strategies
- Pure Pilot search algorithm, beam search with backtracking
- Per-step reasoning support in search algorithms
- Binary pruning and pre-filtering for wide nodes
- LLM-based query complexity detection
- Cross-document strategy with graph-based boosting
- Synonym expansion for improved query recall
- Default summary strategy changed to Full
- PDF parser: switch to `pdf-extract` for reliable text extraction
- Concurrent LLM verification for TOC entries
- PDF indexing example
- Internal module naming cleanup (`_` prefix for private functions)
- Search-from functionality and ToC-based navigation
- Reasoning chain (replacing navigation trace)
- Adaptive budget controller for pipeline token management
- Structural path constraints and hints extraction
- Reasoning index for fast retrieval path resolution
- Document graph system for cross-document relationships
- Streaming retrieval with `RetrieveEvent` support
- Multi-document query support
- Incremental indexing with content and logic fingerprinting
- Parallel processing for multiple document sources
- Pipeline checkpoint and content merging/splitting support
- Workspace-managed dependencies and configuration
- LLM pilot functionality and summary generation
- Query decomposition support
- LLM-first search with TOC-based location
- Restructured Python examples
Initial Python SDK release.
- PyO3 bindings for the Rust engine core
- Basic `Engine` class with `index()` and `query()` methods
- `pyproject.toml` with maturin build backend
- Ruff formatting configuration