Chat with your MongoDB database using natural language queries!
- This AI-powered agent understands user queries from natural language
- Convert user queries into MongoDB queries to extract requiered data from database
- Uses **Agentic Reasoning** via CrewAI to return insightful results
- Loom Video to show working model - https://www.loom.com/share/8565bcc353994287a404885fcaf1db5b?sid=b8845cfc-d1fc-4a04-8c5e-c948cec09dad
- 💬 Chat interface using
Streamlit - 🧠 Intent matching via semantic similarity search using
FAISS. - 🔌 Connects to MongoDB collections (
accounts,transactions,customersfromsample_analyticsdatabase) - 🤖 Agent-based architecture using
LangChain - 📚 Sample question dataset for guided queries
- 🔎 Tools for structured DB retrieval: get customer tiers, accounts, transactions, etc.
- Handle
Multiple database operationsin one conversation Error Handlingfor ambiguous/ impossible queries
git clone https://github.com/your-username/conversational-db-agent.git
cd conversational-db-agentWe recommend using a virtual environment:
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txtCreate a .env file in the root directory:
MONGO_URI="your-mongodb-uri"
GOOGLE_API_KEY="your-generative-ai-key"streamlit run app.pyThe app will open in your browser at http://localhost:8501.
- The system allows users to query a MongoDB database using natural language, powered by LangChain Agent, Google Generative AI Chat Model, and Streamlit for an interactive UI.
- Provides a chat-based user interface.
- Displays both user queries and agent responses.
- Handles session management and chat history.
- Central logic for handling natural language input.
- Uses LangChain tools and memory for interactive querying.
- Integrates a custom toolset for MongoDB operations.
Functions registered with the LangChain agent to perform specific DB operations:
get_customer_tiersget_customers_with_email_domainget_accounts_for_username- And many more defined in
main.py.
- Performs similarity search to identify user intent.
- Matches user input with stored intents from
sample_questions.json.
- Primary data source (MongoDB Atlas or local).
- Collections:
customers,accounts,transactions. - Link for dataset - https://www.mongodb.com/docs/atlas/sample-data/sample-analytics/
- User enters natural language query in Streamlit chat.
- app.py sends the query to the agent (
agent.run(user_input)). - Vectorstore checks the query against
sample_questions.jsonfor intent matching using FAISS. - LangChain Agent:
- Determines the best matching tool or route.
- Executes the appropriate function (Tool).
- Tool executes MongoDB query using
pymongo. - Extracted data from databse analyzed by Agent and outputs structured response for user query in natural language.
- Response is returned from Agent → Streamlit.
- User sees the assistant's response in the chat.
| Component | Technology |
|---|---|
| UI | Streamlit |
| LLM Integration | LangChain + Google GenAI |
| Vector Database | FAISS |
| Backend Logic | Python + LangChain Tools |
| Database | MongoDB (via PyMongo) |
├── app.py # Streamlit frontend app
├── main.py # Core agent logic and tools
├── .env # MongoDB URI KEY and LLM API KEY
├── requirements.txt # Python dependencies
├── sample_questions.json # Pre-defined sample questions for few shot learning
├── README.md # Project documentation (you are here)
Stored in a .env file:
- MONGO_URI=your_mongodb_connection_string
- GOOGLE_API_KEY=your_google_genai_api_key
Sample questions are loaded from sample_questions.json to help agent to analyze different types of user’s query intent. Examples:
- "Show me transactions greater than 1000 for account 328304."
- "List customer tiers and their benefits."
- "Retrieve all customers with 'xyz@gmail.com' email."
