langchain-simple-RAG

A simple Retrieval-Augmented Generation (RAG) system built with LangChain, Ollama, and PyPDF. This system allows users to ask questions about PDF documents and receive AI-generated answers based on the document content.

Features

PDF document loading and processing
Document chunking for efficient retrieval
Vector embeddings using Ollama's BGE-M3 model
Question answering using Llama 3.2 Vision (11B parameters)
Interactive command-line interface
Source attribution for answers

Prerequisites

Python 3.12 or higher
Ollama installed and running locally
Required models pulled in Ollama:
- bge-m3 for embeddings
- llama3.2-vision:11b for text generation

Installation

Clone this repository:

git clone https://github.com/WytheHuang/langchain-simple-RAG.git
cd langchain-simple-RAG

Set up a Python virtual and activate it:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install the required packages:

uv sync

Usage

Create a documents directory in your project root and place your PDF files there:

mkdir documents
cp /path/to/your/pdfs/*.pdf documents/

Run the main script:

python main.py

Enter your questions when prompted. Type /quit to exit the program.

Example interaction:

Loading documents...
Loaded 25 document chunks

Enter your question (or '/quit' to exit): What is machine learning?

Answer: [AI-generated answer will appear here]

Sources:
Source: documents/example.pdf
Page: 1
Content: [Relevant excerpt from the document]

Project Structure

langchain-simple-RAG/
├── documents/            # Directory for PDF files
├── main.py              # Main application script
├── README.md            # This file
├── pyproject.toml       # Project configuration
└── uv.lock             # Dependencies lock file

How It Works

Document Loading: The system loads PDF documents from the documents directory using PyPDFLoader.
Document Splitting: Documents are split into smaller chunks using RecursiveCharacterTextSplitter for efficient processing.
Embedding: Document chunks are embedded using Ollama's BGE-M3 model and stored in an in-memory vector store.
Question Answering: When a user asks a question:
- The system retrieves relevant document chunks
- Passes them to the Llama 3.2 model
- Generates a contextual answer
- Provides source attribution

Configuration

The system uses the following default settings:

Chunk size: 1000 characters
Chunk overlap: 200 characters
Embedding model: bge-m3
LLM model: llama3.2-vision:11b

Development

To contribute to this project:

Install development dependencies:

uv sync --group lint

Run linters:

ruff check .
black .

License

APACHE-2.0 License See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.vscode		.vscode
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
LICENSE		LICENSE
README.md		README.md
main.py		main.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

langchain-simple-RAG

Features

Prerequisites

Installation

Usage

Project Structure

How It Works

Configuration

Development

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

langchain-simple-RAG

Features

Prerequisites

Installation

Usage

Project Structure

How It Works

Configuration

Development

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages