A comprehensive deep research system built from scratch using LangGraph, implementing multi-agent coordination, iterative research, and intelligent report generation. The system follows a three-phase architecture: Scope → Research → Write.
- 🔍 User Clarification — Determines if additional context is needed before research begins
- 📋 Brief Generation — Transforms conversations into structured research questions
- 🔎 Iterative Research — Agent-based research with custom tools and search integration
- 🌐 MCP Integration — Model Context Protocol for standardized tool access
- 👥 Multi-Agent Supervisor — Coordinates parallel research agents for complex topics
- 📝 Report Generation — Synthesizes findings into comprehensive reports
- 🐳 Docker Ready — Full Docker + docker-compose setup with LangGraph Studio
This repo contains 6 progressive tutorial notebooks building a deep research system:
| # | Notebook | Focus |
|---|---|---|
| 1 | 1_scoping.ipynb | User clarification & brief generation |
| 2 | 2_research_agent.ipynb | Research agent with custom tools |
| 3 | 3_research_agent_mcp.ipynb | Research agent with MCP servers |
| 4 | 4_research_supervisor.ipynb | Multi-agent supervisor coordination |
| 5 | 5_full_agent.ipynb | Complete end-to-end system |
| 6 | 6_checkpoint_agent.ipynb | Checkpoint verification agent |

| Component | Technology |
|---|---|
| Framework | LangGraph, LangChain |
| LLM Providers | Groq, Google Gemini, OpenAI, Anthropic |
| Search | Tavily API |
| Protocol | Model Context Protocol (MCP) |
| Package Manager | uv |
| Containerization | Docker + docker-compose |
| Monitoring | LangSmith (optional) |
| Code Quality | ruff, mypy |
# Clone the repository
git clone https://github.com/your-username/Autonomous-Learning-Agent.git
cd Autonomous-Learning-Agent
# Copy and configure environment variables
cp .env.example .env
# Edit .env with your API keys
# Build and run
docker-compose build
docker-compose up -d

Access the services:
| Service | URL |
|---|---|
| LangGraph Studio | https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:8000 |
| API Documentation | http://127.0.0.1:8000/docs |
# Prerequisites: Python 3.11+, Node.js, uv
uv sync
cp .env.example .env
# Edit .env with your API keys
# Run LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking
# Or run notebooks
uv run jupyter notebook

Create a .env file from .env.example:
TAVILY_API_KEY=your_tavily_api_key # Required: Web search
GROQ_API_KEY=your_groq_api_key # Required: LLM inference
GOOGLE_API_KEY=your_google_api_key # Optional: Gemini models
LANGSMITH_API_KEY=your_langsmith_key # Optional: Tracing
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=deep_research_from_scratch
⚠️ Never commit your .env file. Use .env.example as a template.
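
A startup check can catch a missing key before the first API call fails. The sketch below is not part of the repository; `missing_keys` and `check_environment` are hypothetical helper names, and the required/optional split simply follows the comments in the listing above.

```python
import os

# Key names come from the .env listing; the required/optional split follows its comments.
REQUIRED_KEYS = ("TAVILY_API_KEY", "GROQ_API_KEY")
OPTIONAL_KEYS = ("GOOGLE_API_KEY", "LANGSMITH_API_KEY")


def missing_keys(env, keys=REQUIRED_KEYS):
    """Return the keys that are unset or blank in the given mapping."""
    return [key for key in keys if not env.get(key, "").strip()]


def check_environment(env=None):
    """Raise RuntimeError naming every missing required key.

    Returns the list of optional keys that are absent, so the caller can
    decide whether to log a notice (hypothetical behavior, not the project's).
    """
    env = os.environ if env is None else env
    missing = missing_keys(env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return missing_keys(env, OPTIONAL_KEYS)
```

Calling `check_environment()` with no arguments reads from `os.environ`, so it assumes the .env file has already been loaded into the process environment by whatever mechanism the project uses.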
User Query → Scope Phase → Research Phase → Write Phase → Final Report
                  │               │
            Clarification    Multi-Agent
             Brief Gen.      Supervisor
                             ├── Agent 1
                             ├── Agent 2
                             └── Agent N
- Scope — Clarify research scope and generate structured research briefs
- Research — Iterative research using agents with tools (Tavily, MCP servers)
- Write — Synthesize all findings into a comprehensive report
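
The three phases can be illustrated with a dependency-free sketch. It uses plain `asyncio` rather than LangGraph's API, and every function here (`scope`, `research_agent`, `supervisor`, `write`, `run_pipeline`) is a hypothetical stand-in: the real system calls LLMs and search tools where this sketch returns canned strings.

```python
import asyncio


async def scope(query: str) -> list[str]:
    """Scope phase stand-in: turn the query into a brief of sub-topics."""
    # Hypothetical rule: one sub-topic per comma-separated part of the query.
    return [part.strip() for part in query.split(",") if part.strip()]


async def research_agent(topic: str) -> str:
    """Research phase stand-in: a real agent would call search tools here."""
    await asyncio.sleep(0)  # yield control, as an I/O-bound tool call would
    return f"findings on {topic}"


async def supervisor(topics: list[str]) -> list[str]:
    """Run one research agent per topic concurrently and collect the results."""
    # asyncio.gather returns results in the same order as its arguments.
    return list(await asyncio.gather(*(research_agent(t) for t in topics)))


async def write(query: str, findings: list[str]) -> str:
    """Write phase stand-in: join the findings into a plain-text report."""
    lines = [f"Report: {query}"] + [f"- {finding}" for finding in findings]
    return "\n".join(lines)


async def run_pipeline(query: str) -> str:
    """Scope, then research in parallel, then write."""
    topics = await scope(query)
    findings = await supervisor(topics)
    return await write(query, findings)


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("vector databases, rerankers")))
```

The supervisor is the only step that fans out; scope and write stay sequential because each depends on the full output of the phase before it.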
Autonomous-Learning-Agent/
├── notebooks/ # Tutorial notebooks (SOURCE OF TRUTH)
│ ├── 1_scoping.ipynb # User clarification & brief generation
│ ├── 2_research_agent.ipynb # Research agent with custom tools
│ ├── 3_research_agent_mcp.ipynb # MCP server integration
│ ├── 4_research_supervisor.ipynb # Multi-agent supervisor
│ ├── 5_full_agent.ipynb # Complete end-to-end system
│ ├── 6_checkpoint_agent.ipynb # Checkpoint verification
│ └── utils.py # Shared utilities
├── src/deep_research_from_scratch/ # Generated source code (DO NOT EDIT)
│ ├── autonomous_learning_agent.py
│ ├── multi_agent_supervisor.py
│ ├── research_agent.py
│ ├── prompts.py
│ ├── state_*.py
│ └── utils.py
├── Dockerfile
├── docker-compose.yml
├── langgraph.json
├── pyproject.toml
├── .env.example
└── README.md
Important: The notebooks in notebooks/ are the source of truth. Source code in src/ is auto-generated from notebooks using %%writefile magic. See CLAUDE.md for development guidelines.
- Structured Output — Pydantic schemas for reliable AI decision making
- Async Orchestration — Parallel coordination vs synchronous simplicity
- Agent Patterns — ReAct loops, supervisor patterns, multi-agent coordination
- Search Integration — External APIs, MCP servers, content processing
- State Management — Complex state flows across subgraphs
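
As an illustration of the structured-output idea, the sketch below validates a model's JSON reply against a small schema before the rest of the workflow acts on it. The project uses Pydantic for this; the sketch swaps in standard-library dataclasses to stay dependency-free, and the `ClarificationDecision` schema and its field names are assumptions, not the repository's actual models.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ClarificationDecision:
    """Hypothetical schema for the clarification step's decision."""

    need_clarification: bool
    question: str = ""


def parse_decision(raw: str) -> ClarificationDecision:
    """Validate a JSON model reply against the schema; raise ValueError on mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    need = data.get("need_clarification")
    if not isinstance(need, bool):
        raise ValueError("need_clarification must be a boolean")

    question = data.get("question", "")
    if not isinstance(question, str):
        raise ValueError("question must be a string")
    if need and not question.strip():
        raise ValueError("a question is required when need_clarification is true")

    return ClarificationDecision(need, question)
```

Rejecting malformed replies at this boundary means downstream routing can branch on `need_clarification` without re-checking types.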
This project is open source and available under the MIT License.
Sujitha Kotyada — @sujitha-kotyada