🤖 Autonomous Data Science Agent

A production-grade multi-agent system that performs end-to-end data science workflows autonomously. Built with LangGraph and Llama 3.2 via Ollama.

🎯 What This Does

Give it a dataset + objective, and the system:

Plans the execution strategy
Explores and preprocesses data autonomously
Selects and trains appropriate models
Evaluates performance and compares models
Explains results with feature importance
Critiques its own work and iterates if needed
Generates a comprehensive markdown report

No hardcoded pipelines. True agent behavior.

🏗️ Architecture

┌─────────────┐
│   Planner   │ ← Decomposes objective into tasks
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Data Agent  │ ← Explores, cleans, preprocesses
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Model Agent │ ← Trains baseline → advanced models
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Evaluator  │ ← Compares models, selects best
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Explainer   │ ← Feature importance, insights
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Critic    │ ← Reviews pipeline, decides iterate/finish
└──────┬──────┘
       │
       ▼
  Iterate? ──No─→ Report Generator
     │
    Yes
     │
     └──────────┐
                │
                ▼
          Model Agent (again)
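
The control flow in the diagram can be sketched in plain Python. This is only an illustration of the routing, not the project's LangGraph code: the stage functions are placeholders that record their own name, and the `run_pipeline` helper and the state keys (`trace`, `iteration`, `iterate`) are assumed names.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def make_stage(name: str) -> Callable[[State], State]:
    """Return a placeholder stage that only records that it ran."""
    def stage(state: State) -> State:
        state["trace"].append(name)
        return state
    return stage


def critic(state: State) -> State:
    """Placeholder critic: request another pass until max_iterations is reached."""
    state["trace"].append("critic")
    state["iteration"] += 1
    state["iterate"] = state["iteration"] < state["max_iterations"]
    return state


def run_pipeline(max_iterations: int = 2) -> List[str]:
    state: State = {
        "trace": [],
        "iteration": 0,
        "max_iterations": max_iterations,
        "iterate": False,
    }
    # Planner and Data Agent run once at the start.
    for stage in (make_stage("planner"), make_stage("data_agent")):
        state = stage(state)
    # Model Agent -> Evaluator -> Explainer -> Critic repeats while the critic asks for it.
    loop = [
        make_stage("model_agent"),
        make_stage("evaluator"),
        make_stage("explainer"),
        critic,
    ]
    while True:
        for stage in loop:
            state = stage(state)
        if not state["iterate"]:
            break
    # The report is generated once the critic decides to finish.
    state = make_stage("report_generator")(state)
    return state["trace"]
```

Calling `run_pipeline(2)` visits the Model Agent twice before the report generator runs, which mirrors the "Iterate? Yes" branch above.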

🚀 Quick Start

Prerequisites

Python 3.10+
Ollama installed locally
Llama 3.2 model downloaded

Installation

# Clone repository
git clone <your-repo>
cd autonomous_data_science_agent

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the Ollama server (skip if it already runs as a background service)
ollama serve

# In a second terminal, download the Llama 3.2 model
ollama pull llama3.2

Run the Agent

python main.py \
  --dataset data/raw/your_dataset.csv \
  --objective "Predict air quality and explain pollution drivers"
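
For reference, the two flags above could be parsed with `argparse` roughly as follows. This is a sketch under assumptions: only the flag names `--dataset` and `--objective` come from the command above; `build_parser`, the help texts, and making both flags required are illustrative choices, and the real main.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the two flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Autonomous Data Science Agent")
    parser.add_argument("--dataset", required=True, help="Path to the input CSV file")
    parser.add_argument("--objective", required=True, help="Objective in plain language")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args([
    "--dataset", "data/raw/your_dataset.csv",
    "--objective", "Predict air quality and explain pollution drivers",
])
print(args.dataset)
print(args.objective)
```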

Example Run

python main.py \
  --dataset ../Housing.csv \
  --objective "Predict house prices and identify key value drivers"

Output:

Trained models saved in data/outputs/
Processed data in data/processed/
Final report in reports/generated/report_TIMESTAMP.md

📁 Project Structure

autonomous_data_science_agent/
│
├── main.py                      # Entry point
├── config.yaml                  # Configuration
├── requirements.txt
│
├── agents/                      # Multi-agent system
│   ├── planner_agent.py         # Task decomposition
│   ├── data_agent.py            # Data exploration & preprocessing
│   ├── modeling_agent.py        # Model training & selection
│   ├── evaluation_agent.py      # Model comparison & evaluation
│   ├── explanation_agent.py     # Interpretability & insights
│   └── critic_agent.py          # Self-critique ⭐
│
├── graph/
│   ├── agent_graph.py           # LangGraph orchestration
│   └── states.py                # Shared state definition
│
├── tools/
│   └── data_tools.py            # Data utilities
│
├── reports/
│   ├── report_generator.py      # Report creation
│   └── generated/               # Output reports
│
└── data/
    ├── processed/               # Cleaned data
    └── outputs/                 # Models, artifacts

⚙️ Configuration

Edit config.yaml to customize:

llm:
  provider: "ollama"
  model: "llama3.2"
  temperature: 0.3

max_iterations: 3
performance_threshold: 0.75
improvement_threshold: 0.05

modeling:
  baseline_models:
    - "linear_regression"
    - "random_forest"
    - "gradient_boosting"
  
  advanced_models:
    - "xgboost"
    - "lightgbm"
    - "neural_network"
  
  cv_folds: 5
  hyperparameter_tuning: true

data:
  max_missing_ratio: 0.3
  outlier_std_threshold: 3.0

🧠 What Makes This "Agentic"

✅ Dynamic Planning

No fixed pipeline. The planner creates a task graph based on the objective using LLM reasoning.

✅ Autonomous Decision-Making

The Data Agent uses the LLM to decide the preprocessing strategy (imputation, encoding, scaling)
The Modeling Agent selects algorithms dynamically based on task type and iteration
The Critic Agent decides whether to iterate or finish based on performance thresholds

✅ Self-Reflection & Iteration

The Critic Agent reviews results and triggers improvements:

if performance < threshold:
    → Iterate with advanced models
elif critic_has_suggestions and iteration < 2:
    → Try suggested improvements
else:
    → Finish and generate report
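
The pseudocode above could be written as a small runnable function. This is a sketch, not the Critic Agent's real implementation: the function name, the returned labels, and the extra `max_iterations` guard (taken from the key in config.yaml) are assumptions.

```python
def critic_decision(
    performance: float,
    iteration: int,
    has_suggestions: bool,
    performance_threshold: float = 0.75,
    max_iterations: int = 3,
) -> str:
    """Return the next action following the three branches of the pseudocode."""
    # Assumed guard: never iterate past the configured maximum.
    if iteration >= max_iterations:
        return "finish"
    if performance < performance_threshold:
        return "iterate_advanced_models"
    if has_suggestions and iteration < 2:
        return "try_suggestions"
    return "finish"


# R² of 0.666 on the first iteration is below 0.75, so the critic iterates.
print(critic_decision(performance=0.666, iteration=1, has_suggestions=False))
```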

✅ Multi-Agent Collaboration

Each agent has a specific role and communicates via shared state (LangGraph TypedDict):

State flows through the graph
Agents can access previous agent outputs
Conditional branching based on critique
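
A minimal sketch of what such a shared state could look like. `AgentState` and `processed_data_path` are names mentioned in the Troubleshooting section; every other field, the file name `clean.csv`, and the `apply_update` helper are hypothetical. The helper mimics the idea that each agent returns only the keys it changed and the update is merged into the shared state.

```python
from typing import Dict, TypedDict


class AgentState(TypedDict, total=False):
    # Fields other than processed_data_path are illustrative guesses.
    objective: str
    dataset_path: str
    processed_data_path: str
    model_results: Dict[str, Dict[str, float]]
    best_model: str
    iteration: int
    should_iterate: bool


def data_agent(state: AgentState) -> AgentState:
    """Placeholder agent: returns only the keys it wants to update."""
    return {"processed_data_path": "data/processed/clean.csv"}


def apply_update(state: AgentState, update: AgentState) -> AgentState:
    """Merge a partial update into a copy of the state; later keys win."""
    merged = dict(state)
    merged.update(update)
    return merged  # type: ignore[return-value]


state: AgentState = {"objective": "Predict house prices", "iteration": 0}
state = apply_update(state, data_agent(state))
print(state["processed_data_path"])
```

Declaring every key that agents read or write in the TypedDict helps keep the state definition and the agents in sync.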

✅ Model Comparison & Selection

Evaluator Agent automatically:

Compares all trained models
Ranks by appropriate metric (R² for regression, accuracy for classification)
Selects best performer for final report
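
The ranking step can be illustrated with a few lines of Python. This is a sketch under assumptions: the `rank_models` helper and the shape of the results dictionary are hypothetical, and the R² values are the ones shown in the Example Output section.

```python
from typing import Dict, List, Tuple


def rank_models(
    results: Dict[str, Dict[str, float]],
    task_type: str,
) -> List[Tuple[str, float]]:
    """Sort models by R² (regression) or accuracy (classification), best first."""
    metric = "r2" if task_type == "regression" else "accuracy"
    scored = [(name, metrics[metric]) for name, metrics in results.items()]
    # Both metrics are "higher is better", so sort in descending order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


results = {
    "linear_regression": {"r2": 0.6529},
    "random_forest": {"r2": 0.6119},
    "gradient_boosting": {"r2": 0.6660},
}
ranking = rank_models(results, task_type="regression")
best_model = ranking[0][0]
print(best_model)
```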

📊 Example Output

Console Output:

INFO:agents.modeling_agent:🤖 Modeling Agent: Training models
INFO:agents.modeling_agent:Training linear_regression...
INFO:agents.modeling_agent:  ✓ linear_regression - ('rmse', 1324506.96)
INFO:agents.modeling_agent:Training random_forest...
INFO:agents.modeling_agent:  ✓ random_forest - ('rmse', 1400565.97)
INFO:agents.modeling_agent:Training gradient_boosting...
INFO:agents.modeling_agent:  ✓ gradient_boosting - ('rmse', 1299385.98)

INFO:agents.evaluation_agent:Model Rankings:
INFO:agents.evaluation_agent:  1. gradient_boosting: R²=0.6660
INFO:agents.evaluation_agent:  2. linear_regression: R²=0.6529
INFO:agents.evaluation_agent:  3. random_forest: R²=0.6119

INFO:agents.critic_agent:Decision: ITERATE - r2 (0.666) below threshold (0.75)

Generated Report (reports/generated/report_TIMESTAMP.md):

# Autonomous Data Science Report

## 🎯 Objective
Predict house prices and identify key value drivers

## 📊 Dataset Summary
- Source: `../Housing.csv`
- Rows: 545
- Columns: 13
- Target Variable: price
- Task Type: Regression

## 🔧 Preprocessing Pipeline
1. Drop High Missing Cols
2. Impute Numeric Median
3. Encode Categorical Onehot

## 🏆 Best Model
**Selected Model:** Gradient Boosting

### Performance Metrics
- RMSE: 1299385.98
- MAE: 959748.96
- R²: 0.6660

## 🧠 Feature Importance
Top 10 Most Important Features:

1. **area**: 0.4521
2. **bedrooms**: 0.1823
3. **bathrooms**: 0.1456
4. **stories**: 0.0892
5. **mainroad_yes**: 0.0543

## 💡 Key Insights
1. The gradient_boosting model achieved 0.666 R² score
2. Area is the strongest predictor of house prices
3. Model performance suggests room for improvement
4. Additional feature engineering may improve results
5. Results should be validated on new data

## 🎬 Conclusion
The autonomous agent completed 3 iteration(s) and selected **gradient_boosting** as the best performing model.

🔧 Advanced Usage

Custom Preprocessing Steps

Add new preprocessing options in data_agent.py:

elif step == "remove_outliers":
    # Your custom outlier removal logic
    pass
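
One possible body for that branch is a standard-deviation filter driven by `outlier_std_threshold` from config.yaml. The sketch below works on a plain list so it runs without dependencies; the `remove_outliers` helper is hypothetical, and the actual Data Agent likely operates on pandas DataFrames, where the same rule would be applied per numeric column.

```python
from statistics import mean, pstdev
from typing import List


def remove_outliers(values: List[float], std_threshold: float = 3.0) -> List[float]:
    """Keep values within std_threshold standard deviations of the mean."""
    if len(values) < 2:
        return list(values)
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - center) <= std_threshold * spread]


prices = [10.0] * 20 + [1000.0]
cleaned = remove_outliers(prices)
print(len(prices), len(cleaned))
```

Note that a single extreme value also inflates the standard deviation, so this rule only works when the sample is large enough relative to the outlier.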

Custom Models

Add models to modeling_agent.py:

elif s_lower == "xgboost":
    from xgboost import XGBRegressor
    model = XGBRegressor(n_estimators=200, random_state=42)

Adjusting Iteration Behavior

Modify thresholds in config.yaml:

max_iterations: 5  # Allow more iterations
performance_threshold: 0.80  # Higher bar for satisfaction

🎓 Academic Context

Perfect for:

Master's thesis in AI/ML Engineering
PFE (Projet de Fin d'Études) requiring production systems
Research on autonomous agent systems
Portfolio projects for Data Science/ML Engineer roles

Key Differentiators:

Multi-agent architecture (not single LLM chain)
Self-critique loop with iterative improvement
Production-ready code structure with proper state management
Comprehensive logging and reporting
Uses local LLM (Ollama) - no API costs

Technical Highlights:

LangGraph for agent orchestration
TypedDict for type-safe state management
scikit-learn for ML pipeline
Autonomous decision-making via LLM reasoning

🛠️ Tech Stack

Orchestration: LangGraph
LLM: Ollama + Llama 3.2
ML: scikit-learn
Data Processing: pandas, numpy
Logging: Python logging module

🐛 Troubleshooting

Issue: KeyError: 'processed_data_path'

Solution: Ensure states.py includes all required fields in AgentState TypedDict

Issue: Unicode encoding error in report

Solution: Fixed - reports now use UTF-8 encoding
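
If the error reappears in custom code, writing the file with an explicit encoding usually avoids it. A minimal sketch, assuming a hypothetical `save_report` helper and timestamp format (the real report generator may name files differently):

```python
from datetime import datetime
from pathlib import Path


def save_report(markdown: str, out_dir: str = "reports/generated") -> Path:
    """Write the report as UTF-8 so emoji and symbols such as R² survive."""
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # The timestamp format here is an assumption.
    path = target_dir / f"report_{datetime.now():%Y%m%d_%H%M%S}.md"
    path.write_text(markdown, encoding="utf-8")
    return path
```

Without `encoding="utf-8"`, Python falls back to the platform default, which on some Windows setups cannot encode the emoji used in the report headings.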

Issue: Ollama connection refused

Solution: Run ollama serve in a separate terminal
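
To confirm the server is reachable before starting the agent, a quick check like the one below can help. It is a sketch that assumes Ollama's default address `http://localhost:11434` and queries its `/api/tags` endpoint; the `ollama_is_up` helper is not part of this project.

```python
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama HTTP API answers, False if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors (URLError is an OSError).
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```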

Issue: LangChain deprecation warnings

Solution: These are warnings only and do not affect functionality. To silence them, upgrade to the langchain-ollama package.

📝 Requirements

pandas>=2.0.0
numpy>=1.24.0
scikit-learn>=1.3.0
langchain>=0.1.0
langchain-community>=0.0.20
langgraph>=0.0.26
pyyaml>=6.0
joblib>=1.3.0

🤝 Contributing

Contributions welcome! Areas for improvement:

Additional agents (AutoML, Feature Engineering Agent)
Support for more model types (deep learning, time series)
Enhanced explainability (SHAP, LIME)
Web interface for interaction
MLflow integration for experiment tracking

📄 License

MIT License

🔗 Resources

Built with autonomy in mind. No human intervention required. 🚀
