An educational repo that builds the same Q&A system three ways so you can see, run, and measure the differences.
| Pattern | Core idea | Strongest when |
|---|---|---|
| KG-RAG | Retrieve from a knowledge graph; answer with structured facts and relationships | Multi-hop reasoning, entity-rich domains (biomed, finance, legal) |
| Hybrid RAG | Fuse dense (embeddings) + sparse (BM25) retrieval, then rerank | Mixed terminology, exact terms matter, broad corpora |
| Agentic RAG | An agent decides which tool/index to query, refines, and retries | Heterogeneous sources, complex queries, tool use needed |
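
The "fuse" step in the Hybrid RAG row can be sketched with reciprocal rank fusion (RRF), one common way to merge a dense ranking and a sparse ranking. This is a minimal sketch and not necessarily what the code in `src/` does: the function name, the document IDs, and the constant `k=60` are illustrative assumptions, and the rerank step is omitted.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list (best first).

    Assumption: each ranking is ordered best-first and contains no duplicates.
    Each document earns 1 / (k + rank) per list it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from an embedding index and a BM25 index.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) rise above documents that appear in only one, which is why fusion helps when exact terms and semantic similarity both matter.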
- Read docs/01-concepts.md
- Run notebooks/00_setup_and_data.ipynb to load the shared corpus
- Walk through the patterns one by one: notebooks 01, 02, 03
- See them on the same questions in notebooks/04_side_by_side_comparison.ipynb
- Measure quality, latency, and cost in notebooks/05_evaluation_and_costs.ipynb
# 1. Install (uses uv; falls back to pip)
uv sync # or: pip install -e ".[dev]"
# 2. Copy env template and add API keys
cp .env.example .env
# 3. Start backing services (Neo4j + Qdrant)
docker compose up -d
# 4. Launch notebooks
jupyter lab

data/ Source corpus + processed artifacts
docs/ Concepts, architecture diagrams, tradeoffs, decision guide
src/ Python package implementing each pattern + shared utilities
notebooks/ Hands-on walkthroughs and the side-by-side comparison
tests/ Smoke tests so notebooks don't rot
See docs/04-when-to-use.md for a decision tree. Short version:
- Need multi-hop joins across entities? → KG-RAG
- Need best general retrieval quality with low effort? → Hybrid RAG
- Need to route across multiple sources or refine queries? → Agentic RAG
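
The short version above can be written as a small lookup, shown here as a hedged sketch. The function name and its boolean parameters are hypothetical and not part of the repo. Falling back to Hybrid RAG when neither special need applies is an assumption based on the "best general retrieval quality with low effort" line.

```python
def suggest_patterns(multi_hop=False, multi_source_or_refinement=False):
    """Return the RAG patterns suggested by the short decision list.

    Assumption: Hybrid RAG is the default when no special need is flagged.
    """
    suggestions = []
    if multi_hop:
        # Multi-hop joins across entities point to a knowledge graph.
        suggestions.append("KG-RAG")
    if multi_source_or_refinement:
        # Routing across sources or refining queries points to an agent.
        suggestions.append("Agentic RAG")
    if not suggestions:
        suggestions.append("Hybrid RAG")
    return suggestions


print(suggest_patterns(multi_hop=True))
print(suggest_patterns())
```

When both needs are flagged, the function returns both patterns and leaves the final choice to the fuller decision tree in docs/04-when-to-use.md.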
MIT. Corpus licenses noted in data/README.md.