A production-ready template for building agentic AI systems with SLMs / local LLMs: no API keys, no cloud costs, everything runs on your own machine.
Agentful is an engineer-facing starter template for building agentic AI applications backed entirely by local LLMs. It wires together the most important open protocols and frameworks in the agentic AI ecosystem — A2A, MCP, LangGraph, and Chainlit — into a clean, extensible architecture you can clone and build on today.
Use Agentful to:
- Learn how agentic AI systems are structured in practice.
- Bootstrap a new local agent project with best-practice patterns already in place.
- Experiment with A2A multi-agent orchestration and MCP tool serving without writing boilerplate.
| Component | Role |
|---|---|
| Ollama | Serves the LLM locally via an OpenAI-compatible HTTP API |
| Gemma 4 e2b | Lightweight quantised model (~2 GB), runs on laptop GPU or CPU |
| LangGraph | Agent reasoning loop with persistent memory and checkpointing |
| A2A SDK | Agent-to-Agent protocol — standardised inter-agent communication |
| FastMCP | Exposes Python functions as MCP tools over HTTP |
| langchain-mcp-adapters | Discovers and wraps MCP tools for use in LangChain agents |
| Chainlit | Streaming chat web UI with per-session thread management |
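Because Ollama exposes an OpenAI-compatible endpoint, any OpenAI-style client can talk to the local model directly. A minimal sketch, assuming Ollama is running on the default port (the `api_key` is a required placeholder that the local server never checks):

```python
from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="gemma4:e2b",
    messages=[{"role": "user", "content": "What is 6 x 7?"}],
)
print(response.choices[0].message.content)
```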
```mermaid
flowchart LR
User["💬 Chat UI\nlocalhost:8000"]
subgraph Orchestrator ["🧭 Orchestrator — picks the right agent"]
Router["Agent Router\nreads agents.yaml"]
Sender["<<A2A Client>>\n📡 Sends tasks & receives\nstreamed responses over SSE"]
end
subgraph Agent ["🤖 Data Analysis Agent — localhost:10001"]
Handler["<<A2A Server>>\n📡 Accepts tasks &\nstreams events back"]
Brain["<<LangGraph Agent>>\n🔁 Reasons step by step\nusing tools & memory"]
LLM["🧠 Local LLM\ngemma4:e2b · localhost:11434"]
Tools["<<MCP Tools Server>>\n🔧 add · multiply\nlocalhost:8001"]
end
User -- "sends question" --> Router
Router -- "picks best agent" --> Sender
Sender -- "A2A: forwards question" --> Handler
Handler --> Brain
Brain -- "asks LLM" --> LLM
LLM -- "streams answer tokens" --> Brain
Brain -- "MCP: calls a tool" --> Tools
Tools -- "MCP: returns result" --> Brain
Brain -- "sends answer" --> Handler
Handler -- "A2A: streams back" --> Sender
Sender -- "token by token" --> User
| Pattern | Where | Why |
|---|---|---|
| A2A protocol | `src/a2a/` | Standardised agent-to-agent communication — swap or add agents without touching the orchestrator |
| MCP tool serving | `src/mcp/` | Tools are plain Python functions; discovery is automatic |
| LangGraph reasoning loop | `src/agents/da_agent/graph.py` | Persistent memory, tool-use loop, and checkpointing out of the box |
| Token streaming | `adapter.py` → `executor_base.py` → `client.py` → `app.py` | Each LLM token is forwarded end-to-end via SSE artifact chunks |
| Declarative agent registry | `config/agents.yaml` | Add a new agent server without touching any Python |
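The exact schema of `config/agents.yaml` is defined by this repo; as a rough illustration of the idea, a registry entry might look like the sketch below (field names are guesses, not the authoritative schema):

```yaml
# Hypothetical shape — see config/agents.yaml in the repo for the real schema.
agents:
  - name: data-analysis
    url: http://localhost:10001  # Card is fetched from <url>/.well-known/agent-card.json
```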
Before you begin, make sure you have the following installed and running:
| Requirement | Version / notes |
|---|---|
| Python | 3.12 or later |
| uv | Recommended package manager |
| Ollama | Must be running (`ollama serve`) |
| Gemma 4 e2b | Pull with `ollama pull gemma4:e2b` |
```
.
├── config/
│ └── agents.yaml # Declarative A2A agent registry
├── src/
│ ├── app.py # Chainlit UI — session lifecycle and message streaming
│ ├── agents/
│ │ └── da_agent/
│ │ └── graph.py # LangGraph agent (Ollama LLM + MCP tools + MemorySaver)
│ ├── a2a/
│ │ ├── agents/
│ │ │ └── da_agent/
│ │ │ ├── adapter.py # LangGraph → A2A stream adapter (token streaming)
│ │ │ ├── card.py # Agent Card definition
│ │ │ ├── executor.py # A2A executor wiring
│ │ │ └── __main__.py # Agent server entrypoint (uvicorn)
│ │ ├── base/
│ │ │ ├── agent_base.py # BaseA2AAgent ABC
│ │ │ ├── executor_base.py # BaseAgentExecutor — task lifecycle and token streaming
│ │ │ ├── response_format.py # AgentStreamChunk TypedDict
│ │ │ └── server_factory.py # Starlette ASGI app factory
│ │ └── orchestrator/
│ │ ├── client.py # A2AAgentClient — SSE streaming client
│ │ └── registry.py # AgentRegistry — discovery and skill-based routing
│ └── mcp/
│ ├── client/
│ │ └── master_mcp_client.py # MultiServerMCPClient — tool discovery
│ └── server/
│ └── math/
│ └── server.py # FastMCP server — add() and multiply() tools
├── main.py # Chainlit entrypoint
├── chainlit.md # Chainlit welcome screen
├── pyproject.toml # Project metadata and dependencies
└── .env.example # Environment variable template
```
```bash
git clone https://github.com/ai-with-ali/agentful.git
cd agentful
uv sync
cp .env.example .env
```

Edit `.env` with your settings:
```
OLLAMA_SERVER_URL=http://localhost:11434
MCP_DataAnalysis_Host=localhost
MCP_DataAnalysis_Port=8001
```

```bash
ollama serve
ollama pull gemma4:e2b   # first run only
```

Three processes must run simultaneously. Open three separate terminals.
Terminal 1 — MCP tool server

```bash
uv run python -m src.mcp.server.math.server
```

Runs at http://localhost:8001 (or whichever port you set in `.env`).
Terminal 2 — Data Analysis A2A agent

```bash
uv run python -m src.a2a.agents.da_agent --port 10001
```

The Agent Card is available at http://localhost:10001/.well-known/agent-card.json.
Terminal 3 — Chainlit web UI

```bash
uv run chainlit run main.py --port 8000
```

Open http://localhost:8000 in your browser.
VS Code users: use the Run & Debug panel. Select each configuration from the dropdown and press F5.
- User sends a message in the Chainlit UI.
- `AgentRegistry` reads `config/agents.yaml`, fetches each agent's Card from `/.well-known/agent-card.json`, and routes the query to the best-matching agent by skill-tag matching.
- `A2AAgentClient` opens a JSON-RPC/SSE stream to the selected A2A agent server.
- `BaseAgentExecutor` runs the LangGraph agent and forwards events upstream:
  - Tool call details (name and arguments) → `TASK_STATE_WORKING` status update
  - Tool results → `TASK_STATE_WORKING` status update
  - LLM tokens → `TaskArtifactUpdateEvent` chunks (streamed immediately)
- Chainlit renders working events as a collapsible step and streams each LLM token into the reply message in real time (see the sketch below).
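On the UI side, the streaming step boils down to Chainlit's `stream_token` API. A minimal sketch (here `agent_client` and its `stream` method are stand-ins for the real `A2AAgentClient` wiring in `src/app.py`):

```python
import chainlit as cl

@cl.on_message
async def on_message(message: cl.Message):
    reply = cl.Message(content="")
    # agent_client is a stand-in for the orchestrator's A2AAgentClient;
    # each yielded token corresponds to one SSE artifact chunk.
    async for token in agent_client.stream(message.content):
        await reply.stream_token(token)
    await reply.send()
```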
- Create `src/agents/<your_agent>/graph.py` with your LangGraph graph (a minimal sketch follows this list).
- Create `src/a2a/agents/<your_agent>/` mirroring the `da_agent` structure: `adapter.py`, `card.py`, `executor.py`, `__main__.py`.
- Add the agent URL to `config/agents.yaml` — no other changes are needed.
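A minimal `graph.py` can be as small as a prebuilt ReAct agent. The sketch below is a starting point (assuming the `langchain-ollama` and `langgraph` packages), not a copy of `da_agent`:

```python
from langchain_ollama import ChatOllama
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

def build_graph(tools):
    """Sketch: local Ollama LLM + externally discovered MCP tools + memory."""
    llm = ChatOllama(model="gemma4:e2b", base_url="http://localhost:11434")
    # MemorySaver checkpoints state per thread, so each Chainlit session
    # keeps its own conversation history.
    return create_react_agent(llm, tools, checkpointer=MemorySaver())
```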
Open `src/mcp/server/math/server.py` and add a decorated function:

```python
@mcp.tool()
def divide(a: float, b: float) -> float:
    """Divide a by b."""
    return a / b
```

Restart the MCP server. The tool is automatically discovered by the agent on the next startup.
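Discovery is handled by `langchain-mcp-adapters`. Conceptually, the client side looks like the sketch below (the `math` key and URL mirror the `.env` settings above; exact transport names can vary between adapter versions):

```python
from langchain_mcp_adapters.client import MultiServerMCPClient

async def load_tools():
    client = MultiServerMCPClient(
        {
            "math": {
                "url": "http://localhost:8001/mcp",
                "transport": "streamable_http",
            }
        }
    )
    # Every @mcp.tool() on the server, including the new divide(),
    # comes back as a ready-to-use LangChain tool.
    return await client.get_tools()
```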
- Fork the repository.
- Create a feature branch: `git checkout -b feat/your-feature`
- Commit your changes using Conventional Commits.
- Open a pull request against `main`.
