Financial RAG Chatbot & Document Automation 🤖📈

An automated Retrieval-Augmented Generation (RAG) pipeline built with n8n to dynamically ingest, parse, and vectorize financial reports, featuring an intelligent intent-routing system powered by local LLMs.

📖 Overview

This project automates the extraction and semantic analysis of financial documents (like quarterly earnings reports). Instead of manually searching through dense PDFs, users can query a chat interface. An intelligent routing layer classifies the user's intent to either query the vector database for specific financial data or respond conversationally for general inquiries.

✨ Key Features

Automated Data Ingestion: Actively monitors a Google Drive folder for new financial reports and downloads them automatically.
Intelligent Intent Routing: Uses a custom HTTP routing layer with a lightweight local LLM to classify user queries.
- Search Intent: Routes to the RAG pipeline for data retrieval.
- Direct Intent: Routes to a standard conversational AI agent.
Vectorized Knowledge Base: Splits documents with LangChain's recursive character splitter and embeds the chunks into a Pinecone vector database for semantic search.
Local AI Integration: Runs on Ollama, using qwen2.5:1.5b for generation and nomic-embed-text for embeddings, which keeps model inference on the local machine and avoids hosted LLM API costs.
Context-Aware Chat: Maintains conversational memory and instructs the agent to answer only from the retrieved financial context, which reduces hallucinations.
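
The intent-routing feature can be illustrated with a minimal sketch. It assumes the router calls Ollama's `/api/generate` endpoint on the default local address; the prompt wording, the function names, and the fallback to DIRECT are illustrative assumptions, not the workflow's actual node configuration.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # default local Ollama endpoint

# Hypothetical router prompt; the workflow's actual wording may differ.
ROUTER_PROMPT = (
    "Classify the user message. Reply with exactly one word: "
    "SEARCH if it asks for data from a financial report, otherwise DIRECT.\n\n"
    "Message: {message}"
)


def ask_ollama(prompt, model="qwen2.5:1.5b"):
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


def normalize_intent(raw_reply):
    """Map a free-form model reply onto SEARCH or DIRECT (DIRECT is the assumed fallback)."""
    return "SEARCH" if "SEARCH" in raw_reply.upper() else "DIRECT"


def route(message, llm=ask_ollama):
    """Return the label that decides which branch handles the message."""
    return normalize_intent(llm(ROUTER_PROMPT.format(message=message)))
```

Small models do not always reply with a single clean token, so normalizing the reply before branching keeps the router from failing on output such as `search.` or an empty string. The `llm` parameter exists only so the routing logic can be exercised without a running Ollama server.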

🛠️ Tech Stack

Workflow Automation: n8n
Frameworks: LangChain
LLMs (Local): Ollama (qwen2.5:1.5b)
Embeddings: nomic-embed-text
Vector Database: Pinecone
Integrations: Google Drive API, REST APIs

🏗️ Architecture Flow

Ingestion: A Google Drive trigger detects new file uploads.
Processing: Files are downloaded, parsed by LangChain, and split into chunks (100-character overlap).
Embedding: Text chunks are vectorized using Ollama and upserted into Pinecone namespaces based on the filename.
Query Routing: When a chat message is received, an LLM evaluates the prompt. If it detects a request for financial data, it outputs SEARCH. Otherwise, it outputs DIRECT.
Retrieval & Generation (RAG):
- The system searches Google Drive for the relevant filename.
- A Pinecone vector search pulls the top 5 most relevant semantic chunks.
- The Agent synthesizes the retrieved chunks into a cited, factual response.
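
Two steps of the flow above, overlapping chunks and top-5 retrieval, can be sketched in plain Python. This is a simplified illustration rather than the workflow's implementation: it uses fixed-size character windows instead of LangChain's separator-aware recursive splitting, the `chunk_size` of 1000 is an assumed value (only the 100-character overlap is stated), and cosine similarity is an assumed metric, since the actual metric depends on how the Pinecone index is configured.

```python
import math


def split_with_overlap(text, chunk_size=1000, overlap=100):
    """Cut text into character windows; each chunk repeats the tail of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, records, k=5):
    """records is a list of (chunk_text, vector) pairs; return the k best-scoring texts."""
    ranked = sorted(records, key=lambda r: cosine(query_vector, r[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The overlap means a figure that straddles a chunk boundary still appears whole in at least one chunk, which is why the splitter advances by `chunk_size - overlap` rather than by `chunk_size`.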

🚀 Getting Started

Prerequisites

n8n installed (local, Docker, or Cloud).
Ollama installed and running locally with the following models pulled:
```
ollama pull qwen2.5:1.5b
ollama pull nomic-embed-text
```
A Pinecone account and API key.
Google Cloud Console account with the Google Drive API enabled and OAuth2 credentials generated.
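
Before importing the workflow, it can help to confirm that both models are actually available. The sketch below assumes Ollama is reachable on its default address and reads the model list from its `/api/tags` endpoint; the helper names are illustrative.

```python
import json
import urllib.request

REQUIRED_MODELS = ("qwen2.5:1.5b", "nomic-embed-text")


def installed_models(base_url="http://127.0.0.1:11434"):
    """Return the model names reported by Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        return [model["name"] for model in json.loads(response.read())["models"]]


def missing_models(installed, required=REQUIRED_MODELS):
    """List required models that are absent; an untagged name matches any tag of that model."""
    def present(name):
        return any(tag == name or tag.split(":")[0] == name for tag in installed)
    return [name for name in required if not present(name)]
```

With Ollama running, `missing_models(installed_models())` returns an empty list when both models are present. Ollama reports names with their tag (for example `nomic-embed-text:latest`), which is why the untagged requirement is matched against the part before the colon.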

Installation

Clone this repository.
Open your n8n instance and click Import from File (or copy the JSON and click Import from Clipboard).
Select the workflow file (listed in this repository as Workflow.json).
Configure your credentials inside the n8n nodes:
- Authenticate the Google Drive nodes.
- Add your Pinecone API Key.
- Ensure the Ollama nodes point to your local instance (default: http://127.0.0.1:11434).
Set up a Google Drive folder for your source documents and map the Folder ID in the Trigger node.
Activate the workflow!

💡 Usage

Once the workflow is active, upload a financial report (e.g., an Apple Q3 Earnings PDF) to your designated Google Drive folder. Wait a moment for the ingestion pipeline to vectorize the document.

Access the n8n chat trigger interface and ask questions like:

"What was the total revenue reported for the quarter?"
"How did iPhone sales compare to Mac sales?"

The bot routes the query, retrieves the most relevant chunks from Pinecone, and generates an answer grounded in the uploaded report.
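
Grounding of this kind is typically enforced through the prompt. The sketch below shows one way the retrieved chunks might be combined with the user's question; the instruction wording and the function name are assumptions, not the prompt stored in the workflow.

```python
def build_grounded_prompt(question, chunks):
    """Assemble a prompt that confines the model to the numbered context chunks."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below and cite chunk numbers like [1]. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Numbering the chunks gives the model something concrete to cite, and the explicit instruction to admit a missing answer is what discourages it from filling gaps with invented figures.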
