A self-learning Text-to-SQL agent with natural language interface, built for production.
- Natural Language to SQL: Ask questions in plain English, get SQL queries and results
- Self-Learning: Improves with every interaction through feedback loops
- Multi-Database Support: PostgreSQL, MySQL, SQLite, and more
- Human-in-the-Loop: Validation for complex queries before execution
- Custom Instructions: Define project-specific rules and terminology
- Multiple Chats: Maintain context across conversations
- Plan & Confirm: See execution plans before running complex queries
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Frontend (Next.js + shadcn) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β Chat UI β Projects β Data Sources β Custom Instructions β KB β
ββββββββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββ
β Backend API (FastAPI) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β Orchestrator β Planner β SQL Agent β Validator β Learning β
ββββββββββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββ
β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββ
β Services Layer β
ββββββββββββββ¬βββββββββββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββββ€
β LLM (Groq) β DataSource β Schema β Knowledge β Embeddings β
β β Connectors β Indexer β Base β (Nomic) β
ββββββββββββββ΄βββββββββββββ΄βββββββββββββ΄βββββββββββββ΄ββββββββββββββ
β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββ
β Data Layer β
βββββββββββββββββββ¬ββββββββββββββββββ¬ββββββββββββββββββββββββββββββ€
β SQLite/PG β ChromaDB β User Data Sources β
β (Metadata) β (Vectors) β (via Connectors) β
βββββββββββββββββββ΄ββββββββββββββββββ΄ββββββββββββββββββββββββββββββ
| Layer | Technology |
|---|---|
| Frontend | Next.js 14, shadcn/ui, Tailwind, Framer Motion |
| Backend | Python 3.12, FastAPI, Pydantic |
| LLM | Groq API (default), OpenAI, Ollama |
| Vector DB | ChromaDB |
| Embeddings | nomic-embed-text (via Ollama) |
| Database | SQLite (dev), PostgreSQL (prod) |
- Python 3.12+
- Node.js 20+
- Ollama (for local embeddings)
- Groq API Key (free at console.groq.com)
cd Text2Sql
# Backend
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Frontend
cd ../frontend
npm installCreate .env in the backend directory:
# LLM
LLM_PROVIDER=groq
GROQ_API_KEY=your-groq-api-key
# Embeddings (Ollama)
OLLAMA_BASE_URL=http://localhost:11434
# Database
DATABASE_URL=sqlite:///./datachat.db
# Debug
DEBUG=true# Install Ollama (macOS)
brew install ollama
# Start server
ollama serve
# Pull embedding model
ollama pull nomic-embed-text# Terminal 1: Backend
cd backend
source venv/bin/activate
uvicorn backend.main:app --reload --port 8000
# Terminal 2: Frontend
cd frontend
npm run devVisit http://localhost:3000
text2sql/
βββ backend/
β βββ api/ # FastAPI routes
β βββ agents/ # Agent implementations (SOLID)
β β βββ orchestrator.py
β β βββ planner.py
β β βββ sql_agent.py
β β βββ validator.py
β β βββ learning.py
β βββ services/ # Business logic
β β βββ llm/ # LLM providers
β β βββ datasources/ # DB connectors
β β βββ embeddings/ # Embedding service
β β βββ knowledge/ # Schema indexer
β β βββ vector_store/ # ChromaDB
β βββ models/ # Pydantic models
β βββ db/ # SQLAlchemy
β βββ core/ # Config & utils
β βββ main.py
β
βββ frontend/
β βββ app/ # Next.js pages
β βββ components/ # React components
β β βββ ui/ # shadcn components
β β βββ chat/ # Chat components
β βββ lib/ # Utilities
β βββ hooks/ # React hooks
β
βββ docker-compose.yaml
βββ README.md
Projects organize your data sources and custom instructions.
Connect to your database:
- PostgreSQL, MySQL, SQLite
- CSV/Parquet files
- REST APIs
Ask questions naturally:
- "Show me top 10 products by revenue"
- "What's the average order value this month?"
- "List customers who churned last quarter"
Help the agent learn:
- π Correct - adds to knowledge base
- π Incorrect - learn from corrections
Add project-specific rules:
- "status field uses values: active, pending, cancelled"
- "Always filter by tenant_id"
- "Revenue = quantity * unit_price"
| Endpoint | Method | Description |
|---|---|---|
/api/projects |
GET | List projects |
/api/projects |
POST | Create project |
/api/chat/message |
POST | Send chat message |
/api/chat/sessions/{id} |
GET | List chat sessions |
/api/datasources |
POST | Add data source |
/api/datasources/{id}/sync |
POST | Sync schema |
/api/feedback |
POST | Submit feedback |
# Groq (default - fast & free)
LLM_PROVIDER=groq
GROQ_API_KEY=your-key
# OpenAI
LLM_PROVIDER=openai
OPENAI_API_KEY=your-key
# Ollama (local)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434llama-3.3-70b-versatile(default)llama-3.1-8b-instantmixtral-8x7b-32768
cd backend
pytest# Backend
black backend/
ruff check backend/
# Frontend
npm run lint# Build and run
docker-compose up -d
# View logs
docker-compose logs -f- Set
DEBUG=false - Use PostgreSQL for metadata
- Configure proper CORS origins
- Set up rate limiting
- Enable SSL/TLS
- Configure logging
- Set up monitoring
MIT