RAG (Retrieval-Augmented Generation) from scratch. 8 Jupyter notebooks building every component by hand.
```bash
# install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# install dependencies
uv sync

# install ollama (macOS)
brew install ollama

# pull a local LLM model
ollama pull llama3.2

# start ollama server (keep running in background)
ollama serve
```

Then launch the notebooks:

```bash
uv run jupyter notebook notebooks/
```

| # | Notebook | Topic | Needs Ollama? |
|---|---|---|---|
| 01 | 01_text_and_chunking | Text splitting strategies, chunk size experiments | No |
| 02 | 02_vectors_and_similarity | Dot product, cosine similarity, Euclidean distance, all from scratch | No |
| 03 | 03_tfidf_from_scratch | TF-IDF vectors, keyword search, the synonym problem | No |
| 04 | 04_neural_embeddings | sentence-transformers, semantic search, PCA visualization | No |
| 05 | 05_vector_store_from_scratch | Naive vector store, ChromaDB, scaling benchmark | No |
| 06 | 06_retrieval_experiments | Chunk size impact, multi-query retrieval, score distributions | No |
| 07 | 07_generation | Prompt templates, RAG vs. no-RAG, hallucination test | Yes |
| 08 | 08_full_rag_pipeline | Complete pipeline, interactive Q&A | Yes |
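The similarity math underlying notebooks 02 through 06 fits in a few lines of plain Python. As a taste of the "from scratch" approach, here is a minimal cosine-similarity sketch (function names are illustrative, not the notebooks' actual API):

```python
import math

def dot(a, b):
    # elementwise multiply, then sum
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # dot product normalized by both vector magnitudes
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

Because cosine similarity ignores magnitude, two embeddings pointing the same way score 1.0 regardless of their lengths, which is why it is the default metric for semantic search.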
```python
from rag import RAGPipeline

pipe = RAGPipeline("data/knowledge_base.txt")
answer = pipe.ask("How does caffeine work?")
print(answer)
```

```bash
# quick test
uv run python -c "from rag import RAGPipeline; print('OK')"
```
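For a flavor of what notebook 01 covers, a fixed-size chunker with overlap can be sketched like this (a simplified illustration, not the repo's actual implementation):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # slide a window of chunk_size chars, stepping by chunk_size - overlap
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("x" * 500, chunk_size=200, overlap=50)
print(len(chunks))  # 4 overlapping windows cover 500 chars
```

Overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of some duplicated text in the store.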