
Himanshu-Laddhad/Multi-agent-analytics


Agentic Analytics (Gemini Code Pipeline)

Agentic Analytics is a Streamlit application that answers natural-language questions about tabular data using a multi-agent code-execution workflow.

The current architecture does not use an NL-to-SQL pipeline. Instead, it plans analysis steps, generates Python code, runs it in a controlled sandbox, and returns charts/files plus a business-readable summary.
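
As a rough illustration of the "runs it in a controlled sandbox" step, the sketch below executes a generated code string in a separate Python process with a timeout. The function name `run_generated_code` and its parameters are hypothetical and are not taken from services/code_sandbox.py. A subprocess with a timeout only bounds runtime; it is not a security boundary on its own.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 30.0) -> tuple[bool, str]:
    """Run a code string in a separate Python process.

    Returns (ok, text): stdout on success, stderr or a timeout message on failure.
    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "step.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,          # keep any files the code writes inside the temp dir
                capture_output=True,
                text=True,
                timeout=timeout_s,    # the child process is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout_s} seconds"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()


ok, output = run_generated_code("print(1 + 1)")
failed_ok, error_text = run_generated_code("raise ValueError('boom')")
```

Returning the captured stderr instead of raising keeps the error text available, so a later step can feed it back to the code generator.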

Features

  • Natural-language analysis over CSV, PostgreSQL, and URL-loaded datasets.
  • Multi-step agent pipeline: query decomposition, planning, code generation, execution, error feedback, and conclusion.
  • Conversation context persisted per chat in markdown context logs.
  • Chart and artifact output support (PNG/HTML/CSV) saved under outputs/{chat_id}.
  • Streamlit chat UI with chat history and data-source setup workflow.
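
The generate, execute, and error-feedback stages listed above can be pictured as a bounded retry loop. This is a minimal sketch under assumptions: `run_with_feedback`, the `Generate` and `Execute` signatures, and the stand-in callables are hypothetical and do not mirror the interfaces in agents/.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures, not the repository's actual interfaces.
Generate = Callable[[str, Optional[str]], str]  # (question, last_error) -> code
Execute = Callable[[str], Tuple[bool, str]]     # code -> (ok, output_or_error)


def run_with_feedback(
    question: str,
    generate: Generate,
    execute: Execute,
    max_retries: int = 2,
) -> str:
    """Generate code, run it, and retry with the error text until it succeeds."""
    last_error: Optional[str] = None
    for _ in range(max_retries + 1):  # first attempt plus max_retries retries
        code = generate(question, last_error)
        ok, output = execute(code)
        if ok:
            return output
        last_error = output  # pass the failure back into the next generation
    raise RuntimeError(f"gave up after {max_retries} retries: {last_error}")


# Stand-in callables: the first attempt fails, the retry succeeds.
def fake_generate(question: str, last_error: Optional[str]) -> str:
    return "good code" if last_error else "bad code"


def fake_execute(code: str) -> Tuple[bool, str]:
    if code == "good code":
        return True, "42 rows"
    return False, "NameError: df"


result = run_with_feedback("How many rows?", fake_generate, fake_execute)
```

In a real pipeline the two callables would wrap an LLM call and the sandbox, respectively.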

Tech Stack

  • UI: Streamlit
  • LLM: Google Gemini via google-genai
  • Data: pandas, numpy, Plotly, seaborn
  • Storage: SQLite (SQLAlchemy ORM)
  • Config: pydantic-settings

Quick Start

  1. Create and activate a virtual environment.
python -m venv .venv
# Windows PowerShell
.venv\Scripts\Activate.ps1
# macOS/Linux
source .venv/bin/activate
  2. Install dependencies.
pip install -r requirements.txt
  3. Configure the environment.

Create .env in the repository root and set at least:

GOOGLE_API_KEY=your_google_ai_studio_key
GEMINI_MODEL=gemini-2.5-flash
GEMINI_CODE_MODEL=gemini-2.5-flash
  4. Run the app from the repository root.
streamlit run app/main.py

Open http://localhost:8501.

Project Structure

Analytics-Agent/
├── app/
│   ├── main.py                  # Streamlit entrypoint
│   ├── state.py                 # Session state helpers
│   └── ui/
│       ├── chat_view.py
│       ├── datasource_modal.py
│       └── sidebar.py
├── agents/
│   ├── pipeline_agent.py        # Main runtime orchestrator
│   ├── query_decomposer.py
│   ├── data_planner.py
│   ├── code_generator.py
│   ├── error_feedback.py
│   ├── conclusion_agent.py
│   └── schema_agent.py
├── services/
│   ├── llm_client.py
│   ├── code_sandbox.py
│   ├── context_manager.py
│   ├── data_loader.py
│   └── db_service.py
├── models/
│   ├── schemas.py
│   └── db_models.py
├── config/
│   ├── settings.py
│   └── version.py
├── tests/
│   └── unit/
├── alembic/
├── requirements.txt
└── requirements-dev.txt

Testing

pip install -r requirements-dev.txt
pytest

Configuration Reference

Key settings from config/settings.py:

  • GOOGLE_API_KEY: required Gemini API key.
  • GEMINI_MODEL: default model for the planner, error-feedback, conclusion, query-decomposition, and schema-summary agents.
  • GEMINI_CODE_MODEL: model used by code generation.
  • SQLITE_DB_PATH: SQLite database file path.
  • SANDBOX_TIMEOUT: code execution timeout in seconds.
  • SANDBOX_MAX_RETRIES: maximum number of retries after a failed execution.
  • LOG_LEVEL: logging level (INFO by default).
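
To show how these settings fit together, here is a standard-library stand-in that reads them from an environment mapping. It is not the pydantic-settings class in config/settings.py, and it does not parse a .env file itself. The `Settings` and `load_settings` names are hypothetical. Every default except LOG_LEVEL is a placeholder chosen for this sketch, not a value confirmed by the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    gemini_model: str
    gemini_code_model: str
    sqlite_db_path: str
    sandbox_timeout: int       # seconds
    sandbox_max_retries: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "")
    if not api_key:
        raise ValueError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=api_key,
        # Placeholder defaults below are assumptions made for this sketch.
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
        gemini_code_model=env.get("GEMINI_CODE_MODEL", "gemini-2.5-flash"),
        sqlite_db_path=env.get("SQLITE_DB_PATH", "app.db"),
        sandbox_timeout=int(env.get("SANDBOX_TIMEOUT", "30")),
        sandbox_max_retries=int(env.get("SANDBOX_MAX_RETRIES", "2")),
        log_level=env.get("LOG_LEVEL", "INFO"),  # INFO is the documented default
    )


settings = load_settings(
    {"GOOGLE_API_KEY": "your_google_ai_studio_key", "SANDBOX_TIMEOUT": "10"}
)
```

Failing fast on a missing GOOGLE_API_KEY surfaces the misconfiguration at startup rather than on the first LLM call.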

Notes

  • Run Streamlit from the repository root so that imports and relative paths resolve correctly.
  • Outputs and context are chat-scoped and stored under outputs/ and context/.
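
One way to picture the chat-scoped layout is a small path helper. This is a sketch under assumptions: `chat_paths` is a hypothetical function, and the `context/{chat_id}.md` file name is a guess based on the markdown context logs mentioned under Features; only `outputs/{chat_id}` is stated in this README.

```python
import tempfile
from pathlib import Path


def chat_paths(chat_id: str, root: Path = Path(".")) -> tuple[Path, Path]:
    """Return (output_dir, context_file) for one chat, creating directories."""
    # Reject ids that could escape the outputs/ and context/ directories.
    if not chat_id or any(part in chat_id for part in ("/", "\\", "..")):
        raise ValueError(f"unsafe chat id: {chat_id!r}")
    output_dir = root / "outputs" / chat_id
    context_file = root / "context" / f"{chat_id}.md"  # assumed naming scheme
    output_dir.mkdir(parents=True, exist_ok=True)
    context_file.parent.mkdir(parents=True, exist_ok=True)
    return output_dir, context_file


# Usage against a throwaway directory instead of the real repository root.
with tempfile.TemporaryDirectory() as tmp:
    out_dir, ctx_file = chat_paths("chat-123", root=Path(tmp))
    relative_out = out_dir.relative_to(tmp).as_posix()
    relative_ctx = ctx_file.relative_to(tmp).as_posix()
    created = out_dir.is_dir()
```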
