An advanced Retrieval-Augmented Generation (RAG) system powered by LangChain agents that can dynamically search the web, fetch webpage content, and query local knowledge bases to answer questions with real-time information.
- Dynamic Web Search: Real-time information from DuckDuckGo (no API key needed)
- Webpage Fetching: Extract and analyze content from specific URLs
- Local Knowledge Base: Optional pgvector-powered semantic search on scraped content
- Intelligent Agent: Multi-step reasoning with tool selection
- Source Citations: Automatic source attribution with URLs
- FastAPI Backend: Modern async web framework with streaming responses
- Clean UI: Simple, responsive web interface
- Architecture
- Prerequisites
- Installation
- Configuration
- Running the Application
- Testing
- Usage Examples
- Project Structure
- API Endpoints
- Troubleshooting
- Development Notes
User Query → Agent Planning → Multi-Tool Execution → Response Generation
                                      ↓
              [Web Search | Fetch Webpage | Local Knowledge Base]
The agent intelligently selects and combines tools based on the query:
- Current events/facts: Uses web search
- Specific URLs: Fetches and analyzes webpage content
- Domain-specific queries: Searches local vector database (if available)
- Agent Framework: LangChain 1.0+ with langchain-classic agents
- LLM: OpenAI GPT OSS 20B via OpenRouter (free tier)
- Embeddings: HuggingFace all-MiniLM-L6-v2 (384 dimensions)
- Vector DB: PostgreSQL 14+ with pgvector extension
- Web Framework: FastAPI with async streaming
- Frontend: Vanilla HTML/CSS/JavaScript
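To make the semantic-search part of the stack concrete, here is a toy sketch of ranking documents by cosine similarity. It is only an illustration: in this project the ranking is done by pgvector inside PostgreSQL over 384-dimensional embeddings, and the function names and two-dimensional vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=3):
    """Return the k (doc_id, score) pairs most similar to query_vec.

    docs maps a document id to its embedding vector.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    toy_docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], toy_docs, k=2))
```

A real query would first embed the question with the same model used at indexing time, so that query and document vectors live in the same space.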
- Python 3.12+
- PostgreSQL 14+ with pgvector extension (optional, for local knowledge base)
- 4GB+ RAM
- Internet connection (for web search and LLM API)
- OpenRouter Account (Free tier available)
  - Sign up at https://openrouter.ai/
  - Get your API key from the dashboard
  - Free tier includes: OpenAI GPT OSS 20B, DeepSeek models, etc.
- PostgreSQL with pgvector (Optional)
  - Only needed if you want local knowledge base functionality
  - See PostgreSQL Setup below
git clone https://github.com/nyang64/agentic-rag20
cd agentic-rag20
python3 -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
pip install -r requirements.txt
If you want to use the local knowledge base feature:
# Install PostgreSQL with pgvector
# On macOS:
brew install postgresql@18
brew install pgvector
# Start PostgreSQL
brew services start postgresql@18
# Create database and enable extension
createdb myprojdb
psql myprojdb -c "CREATE EXTENSION IF NOT EXISTS vector;"
psql myprojdb -c "CREATE SCHEMA IF NOT EXISTS scraper;"
psql myprojdb -c "CREATE USER myuser WITH PASSWORD 'mypassword';"
psql myprojdb -c "GRANT ALL PRIVILEGES ON DATABASE myprojdb TO myuser;"
psql myprojdb -c "GRANT ALL ON SCHEMA scraper TO myuser;"
Create or edit the .env file in the project root:
# Required - OpenRouter API Configuration
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Required - Model Selection (OpenAI Free Model via OpenRouter)
OPENAI_FREE_MODEL=openai/gpt-oss-20b:free
# Optional - PostgreSQL for Local Knowledge Base
PGVECTOR_DB_URL=postgresql://myuser:mypassword@localhost:5432/myprojdb
# Optional - LangSmith Tracing (for debugging)
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_TRACING=true
LANGSMITH_API_KEY=your-langsmith-key
# System Configuration
TOKENIZERS_PARALLELISM=false
SCRAPY_SETTINGS_MODULE=scraper.settings
The agent uses openai/gpt-oss-20b:free because:
- Function calling support: Required for agent tool selection
- Reliable via OpenRouter: Stable routing and availability
- Free tier: No cost for testing and development
- DeepSeek models, by contrast, don't support function calling via OpenRouter
Available free models on OpenRouter:
- openai/gpt-oss-20b:free: Best for agents (function calling support)
- x-ai/grok-4.1-fast:free: Fast responses
- tngtech/deepseek-r1t2-chimera:free: Good for general queries
- deepseek/deepseek-chat-v3.1:free: Advanced reasoning
Important: Only openai/gpt-oss-20b:free currently supports the function calling needed for agentic workflows.
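Because a missing key only surfaces later as an authentication error, it can help to check the configuration up front. The sketch below is an assumption about how such a check might look, not code from this repository; the helper names are hypothetical, and the variable names are the ones listed in the .env example.

```python
import os

# Variables the .env example marks as required.
REQUIRED_VARS = ("OPENROUTER_API_KEY", "OPENAI_FREE_MODEL")


def missing_required(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def knowledge_base_enabled(env=None):
    """The local knowledge base is optional: treat it as enabled only if a DB URL is set."""
    env = os.environ if env is None else env
    return bool(env.get("PGVECTOR_DB_URL", "").strip())


if __name__ == "__main__":
    missing = missing_required()
    if missing:
        print("Missing required settings:", ", ".join(missing))
    print("Local knowledge base enabled:", knowledge_base_enabled())
```

Passing a plain dict instead of reading os.environ keeps the check easy to exercise without touching the real environment.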
# Activate virtual environment
source env/bin/activate # On Windows: env\Scripts\activate
# Run the web application
python web_app.py
The application will start on http://localhost:8000
# With auto-reload for development
uvicorn web_app:app --reload --port 8000
# Production mode
uvicorn web_app:app --host 0.0.0.0 --port 8000 --workers 4
Open your browser and navigate to:
- Main Interface: http://localhost:8000
- API Documentation: http://localhost:8000/docs (FastAPI auto-generated)
python test_agent.py
This diagnostic script checks:
- Environment variables
- Package imports
- LLM connection (OpenRouter)
- Web search functionality (DuckDuckGo)
- Vector database connection (optional)
- Agent import and initialization
- Agent execution with sample query
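The general shape of such a diagnostic run can be sketched as a list of named checks that each either return a detail string or raise. This is a hypothetical pattern for illustration, not the contents of test_agent.py; the check functions below are stand-ins.

```python
def run_checks(checks):
    """Run each (name, callable) pair and return a list of (name, ok, detail) tuples."""
    results = []
    for name, check in checks:
        try:
            detail = check()
            results.append((name, True, str(detail)))
        except Exception as exc:  # a diagnostic should report the failure, not crash
            results.append((name, False, f"{type(exc).__name__}: {exc}"))
    return results


def check_python_imports():
    """Stand-in for an import check; a real one would import the project's dependencies."""
    import json  # noqa: F401
    return "json importable"


def failing_check():
    """Stand-in for a check that fails, e.g. an unreachable database."""
    raise RuntimeError("simulated failure")


def format_report(results):
    """Render one PASS/FAIL line per check."""
    return "\n".join(f"[{'PASS' if ok else 'FAIL'}] {name}: {detail}" for name, ok, detail in results)
```

Calling `print(format_report(run_checks([("imports", check_python_imports), ("simulated", failing_check)])))` prints one line per check, and a failing optional check does not stop the remaining ones from running.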
# Test agent directly from command line
python -c "
from scraper.agent_classic import create_agentic_rag
agent = create_agentic_rag()
result = agent.invoke({'input': 'What is the weather in New York today?'})
print(result['output'])
"
# Test LLM connection
python -c "
from langchain_openai import ChatOpenAI
import os
from dotenv import load_dotenv
load_dotenv()
llm = ChatOpenAI(
model=os.getenv('OPENAI_FREE_MODEL'),
openai_api_key=os.getenv('OPENROUTER_API_KEY'),
openai_api_base='https://openrouter.ai/api/v1'
)
print(llm.invoke('Say hello!').content)
"
# Test web search
python -c "
from scraper.agent_classic import web_search
result = web_search.func('Python programming')
print(result[:500])
"
# Test vector database (if configured)
python -c "
from scraper.raq_query import retrieve_top3
docs = retrieve_top3('test query')
print(f'Retrieved {len(docs)} documents')
"
Try these queries in the web interface:
What are the latest developments in AI?
What happened in the news today?
Who won the recent elections?
Below is a screenshot of a sample run, asking "What is the weather like in New York City today?"
What is the weather like in San Francisco today?
What is the current stock price of Apple?
What is the exchange rate USD to EUR?
Compare the top 3 fastest trains in the world
What are the benefits of electric vehicles?
Explain quantum computing in simple terms
Find information about renewable energy adoption in Europe and compare it to Asia
What are the main causes of climate change and what solutions are being proposed?
curl -X POST "http://localhost:8000/ask" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "query=What is the capital of France?"
curl -X POST "http://localhost:8000/ask_verbose" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "query=What is the weather in Tokyo?"
Response format:
{
"answer": "The current weather in Tokyo...",
"steps": [
{
"tool": "web_search",
"input": "Tokyo weather today",
"output": "Current weather data..."
}
]
}
curl -X POST "http://localhost:8000/ask_streaming" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "query=Tell me about the Mars rover"
agentic-rag20/
├── README.md                  # This file
├── .env                       # Environment variables (create this)
├── requirements.txt           # Python dependencies
├── pyproject.toml             # Project metadata
├── web_app.py                 # FastAPI web application
├── test_agent.py              # Diagnostic test suite
├── test_imports.py            # Import verification
├── adhoc_query.py             # Ad-hoc vector DB queries
│
├── scraper/                   # Main package directory
│   ├── __init__.py            # Package initialization
│   ├── agent_classic.py       # Main agent (WORKING)
│   ├── agent.py               # Has library issues
│   ├── agent_simple.py        # Has library issues
│   ├── langgraph_agent.py     # Has library issues
│   ├── raq_query.py           # Vector DB RAG queries
│   ├── embed_and_store.py     # Embedding & storage utilities
│   ├── cache.py               # Caching utilities
│   ├── settings.py            # Scrapy settings
│   ├── pipelines.py           # Data processing pipelines
│   └── spiders/               # Scrapy spiders (for data collection)
│
├── static/                    # Frontend assets
│   ├── index.html             # Web UI
│   └── img/                   # Images
│
├── env/                       # Virtual environment (git-ignored)
├── __pycache__/               # Python cache (git-ignored)
└── .git/                      # Git repository
- scraper/agent_classic.py: The working agent implementation using langchain-classic
- web_app.py: FastAPI application with streaming endpoints
- test_agent.py: Comprehensive diagnostic and testing script
- scraper/raq_query.py: Vector database retrieval (optional feature)
Root path (/): Returns the main web interface (HTML)
/ask: Standard query endpoint with streaming response
Request:
Content-Type: application/x-www-form-urlencoded
query=Your question here
Response:
Streaming text/plain response with answer and sources
/ask_verbose: Returns a detailed trace of the agent's reasoning
Request:
Content-Type: application/x-www-form-urlencoded
query=Your question here
Response:
{
"answer": "The answer to your question...",
"steps": [
{
"tool": "web_search",
"input": "search query",
"output": "search results..."
}
]
}
/ask_streaming: Streams the agent's thought process in real time
Request:
Content-Type: application/x-www-form-urlencoded
query=Your question here
Response:
Streaming text/plain with agent thoughts:
Thinking about your question...
Using tool: web_search
Found: [results]
Answer: [final answer]
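The same endpoints can be called from Python with only the standard library. The sketch below is a hypothetical client, not code shipped with the project: it assumes the server is running on http://localhost:8000 and that /ask_verbose returns the JSON shape shown above; the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local development address


def build_request(endpoint, query, base_url=BASE_URL):
    """Build a form-encoded POST request for one of the query endpoints."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def summarize_steps(payload):
    """Return 'tool(input)' strings from an /ask_verbose-style response dict."""
    return [f"{step['tool']}({step['input']})" for step in payload.get("steps", [])]


def ask_verbose(query):
    """Send the query to /ask_verbose and decode the JSON reply (requires a running server)."""
    with urllib.request.urlopen(build_request("ask_verbose", query), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server up, `summarize_steps(ask_verbose("What is the weather in Tokyo?"))` lists which tools the agent used and with what inputs.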
Solution:
pip install langchain-classic==1.0.0
Symptoms: openai.AuthenticationError: Error code: 401
Solution:
- Verify your API key is correct in .env
- Check that the .env file exists in the project root
- Ensure OPENROUTER_API_KEY=sk-or-v1-... is set correctly
- Get a new key from https://openrouter.ai/ if needed
Symptoms: RateLimitError: 429
Solution:
- Free tier has rate limits (check OpenRouter dashboard)
- Wait a few minutes between requests
- Consider upgrading to paid tier for higher limits
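Waiting between requests can be automated with exponential backoff. The helper below is a generic sketch under stated assumptions: `retry_with_backoff` and its parameters are hypothetical names, and the caller supplies the predicate that recognizes a rate-limit error, since the exact exception class depends on the client library in use.

```python
import random
import time


def retry_with_backoff(call, is_rate_limited, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on rate-limit errors with exponentially growing delays.

    is_rate_limited(exc) decides whether an exception is worth retrying.
    Any other exception, or the final failed attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Delays of roughly 1s, 2s, 4s, ... plus a little jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1 * base_delay)
            sleep(delay)
```

A call site might look like `retry_with_backoff(lambda: llm.invoke("Say hello!"), lambda exc: "429" in str(exc))`, where matching on the string "429" is itself only an assumption about how the error is reported.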
Symptoms: psycopg2.OperationalError
Solution:
- Check PostgreSQL is running: brew services list (macOS)
- Verify database exists: psql -l | grep myprojdb
- Check credentials in .env match your PostgreSQL setup
- Note: PostgreSQL is optional; the agent works without it for web search
Symptoms: No search results found
Solution:
- Check internet connection
- DuckDuckGo might be rate-limiting (wait a minute)
- Try a different query
- Check if you're behind a firewall/proxy
Symptoms: Agent doesn't combine multiple tools
Solution:
- This is normal behavior - agent picks the most appropriate tool
- Try queries that clearly need multiple sources
- Check agent reasoning in verbose mode
- Adjust system prompt in agent_classic.py if needed
If you encounter issues:
- Run diagnostics: python test_agent.py
- Check logs: Look for error messages in terminal
- Verify environment: Ensure all .env variables are set
- Test components: Use individual component tests above
- Check documentation: Review this README and AGENTIC_RAG_UPGRADE_GUIDE.md
The project has multiple agent files, but only agent_classic.py works because:
- agent_classic.py: Uses langchain-classic 1.0.0 (stable, working)
- agent.py: Uses newer LangChain APIs with dependency conflicts
- agent_simple.py: Simplified version with library compatibility issues
- langgraph_agent.py: Advanced LangGraph implementation with dependency issues
Recommendation: Use agent_classic.py for production. Other files are kept for reference.
The agent has access to three tools:
- web_search: DuckDuckGo search (no API key needed)
  - Used for: current events, facts, recent information
  - Returns: top 5 search results with titles, snippets, URLs
- fetch_webpage: Webpage content extraction
  - Used for: getting detailed content from specific URLs
  - Returns: cleaned text content (up to 5000 chars)
- search_local_knowledge: Vector database search (optional)
  - Used for: domain-specific previously scraped content
  - Returns: top 3 most relevant documents from local DB
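To illustrate what "cleaned text content (up to 5000 chars)" can mean in practice, here is a minimal standard-library sketch that strips tags, skips script and style content, collapses whitespace, and truncates. It is not the project's fetch_webpage implementation, which may use a different parser; `clean_text` and `_TextExtractor` are hypothetical names.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script/style/noscript elements."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_text(html, max_chars=5000):
    """Return the page's visible text with whitespace collapsed, cut to max_chars."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return text[:max_chars]
```

Truncating keeps a single long page from crowding out the rest of the prompt the agent sends to the LLM.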
Edit scraper/agent_classic.py:
system_message = """You are an intelligent research assistant...
[Customize instructions here]
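"""

from langchain_core.tools import tool

@tool
def your_custom_tool(input: str) -> str:
    """Description of what this tool does"""
    # Your implementation goes here; this placeholder simply echoes the input
    result = input
    return result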
# Add to tools list
tools = [
web_search,
fetch_webpage,
search_local_knowledge,
your_custom_tool, # Add here
]
Edit .env:
# Try different free models
OPENAI_FREE_MODEL=openai/gpt-oss-20b:free
# or
OPENAI_FREE_MODEL=x-ai/grok-4.1-fast:free
Note: Function calling support required for agents!
# scraper/cache.py provides a caching layer
from scraper.cache import SearchCache
cache = SearchCache(ttl_hours=24)
# Implement in agent_classic.py
# Execute multiple tools simultaneously
from langchain_core.runnables import RunnableParallel
# Implementation needed
- Create a new branch
- Make your changes
- Test with python test_agent.py
- Submit a pull request
- Implement search result caching
- Add Brave Search API support (better results)
- Parallel tool execution
- Better source validation and credibility scoring
- Enhanced UI with agent visualization
- User authentication and query history
- API rate limiting
- Docker containerization
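For the search-result caching item, one possible shape is a small time-to-live cache placed in front of the search function. The internals of scraper/cache.py are not shown in this README, so the `TTLCache` class and `cached_search` helper below are hypothetical stand-ins rather than the project's SearchCache; only the `ttl_hours` parameter name is borrowed from the example above.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_hours (hypothetical stand-in)."""

    def __init__(self, ttl_hours=24, clock=time.time):
        self._ttl_seconds = ttl_hours * 3600
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl_seconds:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock(), value)


def cached_search(cache, query, search_fn):
    """Return a cached result for the normalized query, calling search_fn only on a miss."""
    key = query.strip().lower()
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = search_fn(query)
    cache.set(key, result)
    return result
```

Normalizing the query before lookup means "Python" and " python " share one entry, which is a design choice that trades a little precision for a higher hit rate.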
- AGENTIC_RAG_UPGRADE_GUIDE.md: Detailed guide on upgrading from RAG 1.0 to 2.0
- FastAPI Docs: Auto-generated at http://localhost:8000/docs
- LangChain Docs: https://python.langchain.com/docs/
- OpenRouter Docs: https://openrouter.ai/docs
This project is for educational and development purposes.
Nanchang Yang
Email: nyang63@gmail.com
- LangChain for the agent framework
- OpenRouter for free LLM access
- PostgreSQL and pgvector for vector storage
- FastAPI for the web framework
Last Updated: December 2024
Status: Ready / Working solution
Version: 2.0.0
