A powerful Retrieval-Augmented Generation (RAG) API that lets you upload documents, store them in a vector database, and chat with them using AI.
- Upload documents (PDF, DOCX, TXT, MD, CSV, JSON, HTML)
- Automatically chunks the text into smaller pieces
- Converts chunks to embeddings (vector representations) using OpenRouter
- Stores embeddings in Pinecone vector database
- Chat with your documents - ask questions and get AI-generated answers based on the content
Your Document → Text Extraction → Chunking → Embeddings → Pinecone
                                                             ↓
Your Question → Embedding → Search Similar Chunks → Send to LLM → Answer
Example:
- You upload a 50-page PDF about climate change (or any other document you have)
- You ask: "What are the main causes of global warming?" (or any other question about the document you uploaded)
- The system finds the most relevant chunks from your PDF
- An AI reads those chunks and generates a concise answer
- Node.js (v18 or higher)
- Pinecone account - Sign up free
- OpenRouter account - Sign up free
- For HTTPS:
git clone https://github.com/daviddozie/rag-index.git
- For SSH
git clone git@github.com:daviddozie/rag-index.git
cd rag-index
npm install dotenv express multer cors @pinecone-database/pinecone @openrouter/sdk uuid pdf-parse mammoth
Create a .env file in the project root:
# Required
PINECONE_API_KEY=your_pinecone_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
PINECONE_INDEX_NAME=rag-index
LLM_MODEL=google/gemini-2.0-flash-exp:free   # or any LLM you prefer
RAG_DATA_DIR=./data
RAG_CHUNK_SIZE=500
RAG_CHUNK_OVERLAP=100
PORT=3000
Getting API Keys:
Pinecone:
- Go to https://www.pinecone.io/
- Sign up and create a new project
- Copy your API key from the dashboard
OpenRouter:
- Go to https://openrouter.ai/
- Sign up and go to Settings → Keys
- Create a new API key
- Add credits at https://openrouter.ai/settings/credits (or use free models)
node pinecone.js
You should see:
RAG API running on http://localhost:3000
Using LLM: google/gemini-2.0-flash-exp:free (or whichever model you configured)
Using embeddings: openai/text-embedding-3-small
Endpoint: POST /upload
Upload a single file:
curl -X POST http://localhost:3000/upload \
-F "files=@document.pdf"
Upload multiple files:
curl -X POST http://localhost:3000/upload \
-F "files=@document1.pdf" \
-F "files=@document2.txt"
Response:
{
"context": "ctx-a1b2c3d4",
"chunks": 15,
"files": 1
}
Important: Save the context ID - you'll need it to chat with these documents!
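If you prefer calling the API from code instead of curl, here is a minimal client sketch for the full upload-then-chat round trip. It assumes Node 18+, which ships with global `fetch`, `FormData`, and `Blob`; the function name `uploadAndChat` is illustrative, not part of the API:

```javascript
const API = "http://localhost:3000";

// Upload a file buffer, then ask a question against the returned context.
async function uploadAndChat(fileBuffer, filename, question) {
  // "files" matches the multipart field name the /upload endpoint expects.
  const form = new FormData();
  form.append("files", new Blob([fileBuffer]), filename);

  const upload = await fetch(`${API}/upload`, { method: "POST", body: form });
  const { context } = await upload.json(); // e.g. "ctx-a1b2c3d4"

  const res = await fetch(`${API}/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ context, query: question }),
  });
  return (await res.json()).answer;
}
```

Usage: `await uploadAndChat(await fs.readFile("document.pdf"), "document.pdf", "What is this about?")`.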
Endpoint: POST /chat
curl -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{
"context": "ctx-a1b2c3d4",
"query": "What is the main topic of this document?"
}'
Response:
{
"answer": "The main topic of this document is...",
"context": [
"Relevant chunk 1 from your document...",
"Relevant chunk 2 from your document..."
]
}
Endpoint: GET /contexts
curl http://localhost:3000/contexts
Response:
[
"ctx-a1b2c3d4",
"ctx-e5f6g7h8",
"ctx-i9j0k1l2"
]
Endpoint: GET /context/:name/metadata
curl http://localhost:3000/context/ctx-a1b2c3d4/metadata
Response:
[
{
"id": "chunk-uuid-1",
"context": "ctx-a1b2c3d4",
"filename": "document.pdf",
"offset_start": 0,
"offset_end": 500,
"text": "First 500 characters of your document..."
},
...
]
- PDF (.pdf) - Uses pdf-parse
- Word Documents (.docx) - Uses mammoth
- Text Files (.txt)
- Markdown (.md)
- CSV (.csv)
- JSON (.json)
- HTML (.html, .htm)
Retrieval-Augmented Generation is a technique that combines:
- Information Retrieval - Finding relevant documents
- Language Generation - Using AI to generate answers
This solves the problem of AI hallucination by grounding responses in your actual documents.
1. Embeddings
- Text converted to numbers (vectors)
- Similar text has similar vectors
- Example: "cat" and "kitten" have similar embeddings
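"Similar vectors" is usually measured with cosine similarity: vectors pointing in nearly the same direction score close to 1. A toy sketch (real embeddings have hundreds of dimensions, and the vectors below are made up for illustration):

```javascript
// Cosine similarity: dot product divided by the product of magnitudes.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings":
const cat = [0.9, 0.1, 0.0];
const kitten = [0.85, 0.15, 0.05];
const car = [0.0, 0.2, 0.9];

console.log(cosineSimilarity(cat, kitten) > cosineSimilarity(cat, car)); // true
```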
2. Vector Database (Pinecone)
- Stores embeddings efficiently
- Searches by similarity, not keywords
- Example: Search for "climate change" finds "global warming"
3. Chunking
- Breaking documents into smaller pieces
- Default: 500 characters with 100 character overlap
- Overlap ensures context isn't lost at chunk boundaries
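Sliding-window chunking with overlap can be sketched like this (the actual implementation lives in pinecone.js; the function name and shape here are illustrative):

```javascript
// Split text into windows of `size` characters, stepping forward by
// size - overlap so consecutive chunks share `overlap` characters.
function chunkText(text, size = 500, overlap = 100) {
  const chunks = [];
  const stride = size - overlap;
  for (let start = 0; start < text.length; start += stride) {
    chunks.push({
      text: text.slice(start, start + size),
      offset_start: start,
      offset_end: Math.min(start + size, text.length),
    });
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

const chunks = chunkText("x".repeat(1200));
console.log(chunks.length); // 3 chunks: [0,500), [400,900), [800,1200)
```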
4. Semantic Search
- Searches by meaning, not exact words
- Example: "How do I reset my password?" matches "Password recovery instructions"
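In this project the similarity search is delegated to Pinecone, but the retrieval step boils down to scoring every stored chunk against the query embedding and keeping the top-k. A toy, self-contained version (the 2-dimensional vectors are made up for illustration):

```javascript
// Cosine similarity between two vectors.
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Score every chunk, sort by similarity, keep the best k.
function topK(queryVec, chunks, k = 2) {
  return chunks
    .map((c) => ({ ...c, score: cosine(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy index: pretend these are embedded chunks.
const index = [
  { text: "Password recovery instructions", vector: [0.9, 0.1] },
  { text: "Office opening hours", vector: [0.1, 0.9] },
];
const query = [0.8, 0.2]; // embedding of "How do I reset my password?"
console.log(topK(query, index, 1)[0].text); // "Password recovery instructions"
```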
pinecone-js/
├── pinecone.js       # Main API server
├── .env              # API keys (DO NOT COMMIT)
├── package.json      # Dependencies
├── data/             # Local storage (created automatically)
│   └── ctx-xxxxxxxx/ # Each context has its own folder
│       ├── files/    # Original uploaded files
│       └── metadata.json # Chunk metadata
└── README.md         # This file
Control how documents are split:
RAG_CHUNK_SIZE=500 # Characters per chunk
RAG_CHUNK_OVERLAP=100   # Characters that overlap between chunks
When to adjust:
- Larger chunks (1000+): Better for long, coherent passages
- Smaller chunks (300-500): Better for precise answers
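Chunk size also determines how many embedding API calls an upload costs. A back-of-envelope estimate, assuming simple sliding-window chunking (the helper name is illustrative):

```javascript
// With a stride of size - overlap, a document of docLength characters
// produces roughly ceil((docLength - overlap) / stride) chunks.
function estimateChunks(docLength, size = 500, overlap = 100) {
  const stride = size - overlap;
  return Math.max(1, Math.ceil((docLength - overlap) / stride));
}

console.log(estimateChunks(50_000));            // defaults → 125 chunks
console.log(estimateChunks(50_000, 1000, 100)); // larger chunks → 56 chunks
```

So doubling the chunk size roughly halves the number of embedding calls, at the cost of less precise retrieval.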
Choose different AI models from OpenRouter:
# Free options
LLM_MODEL=google/gemini-2.0-flash-exp:free
LLM_MODEL=meta-llama/llama-3.2-3b-instruct:free
# Paid options (better quality)
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_MODEL=openai/gpt-4o
Problem: OpenRouter API returns 402 error
Solutions:
- Use a free model: LLM_MODEL=google/gemini-2.0-flash-exp:free
- Add credits at https://openrouter.ai/settings/credits
Problem: Large files take time to process
Why: Each chunk needs to be converted to embeddings via API
Expected times:
- Small file (1-5 pages): 5-10 seconds
- Medium file (10-50 pages): 30-60 seconds
- Large file (100+ pages): 2-5 minutes
Problem: Context ID doesn't exist locally
Solution: Upload a new document and use the returned context ID
Problem: Chat returns empty context array
Possible causes:
- Wrong context ID
- Query is too different from document content
- Document didn't upload correctly
Fix: Check metadata endpoint to verify chunks exist
- Research Assistant - Upload papers and ask questions
- Document Q&A - Chat with user manuals, contracts, reports
- Knowledge Base - Create a searchable company wiki
- Study Helper - Upload textbooks and get explanations
- Legal Document Analysis - Query contracts and agreements
- Never commit .env to version control
- Add .env to .gitignore
- Keep API keys secret
- The ./data folder contains uploaded files - protect it
RAG Concepts:
Technologies Used:
Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
- Improve documentation
MIT License - feel free to use this project for learning and commercial purposes.
Questions? Open an issue or check the troubleshooting section above!
Happy RAG building!