nt-suuri/basic-ragg

basic-ragg

RAG (Retrieval-Augmented Generation) from scratch: eight Jupyter notebooks that build every component by hand.

Setup

# install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# install dependencies
uv sync

# install ollama (macOS)
brew install ollama

# start the ollama server (leave it running in a separate terminal)
ollama serve

# pull a local LLM model (the server must be running)
ollama pull llama3.2
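
Before opening the notebooks that need Ollama, it can help to confirm the server is reachable. The sketch below is not part of this repository. It assumes Ollama's default address (http://localhost:11434) and its model-listing endpoint (/api/tags), and the function name ollama_is_running is made up for illustration.

```python
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers 200 at <base_url>/api/tags.

    Assumption: Ollama listens on its default port and exposes /api/tags.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers malformed URLs.
        return False
```

Calling ollama_is_running() with no arguments checks the default address. A False result usually means ollama serve is not running yet.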

Run Notebooks

uv run jupyter notebook notebooks/

Notebooks

| #  | Notebook | Topic | Needs Ollama? |
|----|----------|-------|---------------|
| 01 | 01_text_and_chunking | Text splitting strategies, chunk size experiments | No |
| 02 | 02_vectors_and_similarity | Dot product, cosine similarity, Euclidean distance (from scratch) | No |
| 03 | 03_tfidf_from_scratch | TF-IDF vectors, keyword search, synonym problem | No |
| 04 | 04_neural_embeddings | sentence-transformers, semantic search, PCA visualization | No |
| 05 | 05_vector_store_from_scratch | Naive vector store, ChromaDB, scaling benchmark | No |
| 06 | 06_retrieval_experiments | Chunk size impact, multi-query retrieval, score distributions | No |
| 07 | 07_generation | Prompt templates, RAG vs no-RAG, hallucination test | Yes |
| 08 | 08_full_rag_pipeline | Complete pipeline, interactive Q&A | Yes |
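
As a taste of the "from scratch" style, the three measures named for notebook 02 can be written in a few lines of plain Python. This is a minimal sketch, not code taken from the notebooks. The function names are assumptions, and returning 0.0 for a zero-length vector is one possible convention.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either has zero norm)."""
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if denom == 0:
        return 0.0  # assumed convention for zero vectors
    return dot(a, b) / denom


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine similarity ignores vector length and compares direction only, while Euclidean distance is sensitive to both. That difference matters when comparing embeddings of texts with very different lengths.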

Use as Module

from rag import RAGPipeline

pipe = RAGPipeline("data/knowledge_base.txt")
answer = pipe.ask("How does caffeine work?")
print(answer)

# quick test (run from the shell)
uv run python -c "from rag import RAGPipeline; print('OK')"
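
For orientation, the retrieve-then-prompt flow behind a call like pipe.ask(...) can be sketched with no dependencies. This is a hypothetical illustration, not the repository's RAGPipeline: the names chunk_text, retrieve, and build_prompt are assumptions, the scoring is plain word overlap rather than embeddings, and the final LLM call is left out.

```python
import re


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

A real pipeline would replace the word-overlap score with embedding similarity and send the prompt to the local model. The overall shape (chunk, retrieve, build a prompt, generate) stays the same.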
