Local-first, config-driven multi-agent orchestration platform with full observability and zero cloud lock-in
Build production-grade LLM agent workflows through YAML configs or conversational chat. Deploy from laptop to enterprise Kubernetes with complete observability, multi-LLM support, and advanced control flow.
A local-first, config-driven agent orchestration platform built for:
- Developers prototyping multi-agent systems in minutes, not days
- Researchers experimenting with agent architectures without infrastructure overhead
- Teams validating LLM use cases with production-grade observability
- Enterprises deploying agent workflows anywhere (laptop → Docker → K8s)
Why this exists:
Building multi-agent systems today requires stitching together LLM providers, orchestration frameworks, state management, observability, and deployment infrastructure. This platform delivers all of that through YAML configs and conversational interfaces, with zero cloud lock-in.
v1.0 Shipped (2026-02-04):
Production-ready local-first agent orchestration platform with multi-LLM support, advanced control flow, and complete observability: all 27 requirements satisfied across 4 phases and 19 plans, backed by 1,000+ tests.
```yaml
# research_agent.yaml - Multi-step workflow with control flow
schema_version: "1.0"
flow:
  name: research_agent
  config:
    llm:
      provider: "openai"  # or anthropic, google, ollama
      model: "gpt-4"
      api_key_env: "OPENAI_API_KEY"
  state:
    fields:
      topic: {type: str, required: true}
      research: {type: str, default: ""}
      quality_score: {type: int, default: 0}
  nodes:
    - id: search
      prompt: "Search for information about {state.topic}"
      tools: [serper_search]
      outputs: [research]
    - id: evaluate
      prompt: "Evaluate research quality on a scale of 1-10"
      inputs: [research]
      outputs: [quality_score]
      output_schema:
        type: object
        fields:
          - name: quality_score
            type: int
  edges:
    - {from: START, to: search}
    - {from: search, to: evaluate}
    - from: evaluate
      routes:
        - condition: "{state.quality_score} >= 7"
          to: END
        - condition: "{state.quality_score} < 7"
          to: search  # Retry with better search
config:
  memory:
    enabled: true
    backend: "sqlite"  # Persistent across runs
  observability:
    mlflow:
      enabled: true
      tracking_uri: "sqlite:///mlflow.db"
```

```shell
configurable-agents run research_agent.yaml --input topic="AI Safety"

# Or use Chat UI to generate configs
configurable-agents chat
```

- YAML as code: Declarative workflow definitions
- Chat UI: Generate configs through conversation (Gradio-based)
- No programming required: Accessible to non-developers
- Version control friendly: Track workflow evolution in git
- Shareable: Exchange configs like recipes
- Conditional routing: Branch based on agent outputs
- Loops and retry: Iterate until conditions met
- Parallel execution: Fan-out/fan-in patterns
- Sandboxed code: Execute agent-generated code safely
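As a rough illustration, route conditions like `{state.quality_score} >= 7` could be resolved by interpolating state into the template and comparing the result. This is a hypothetical sketch; the parsing rules and function names are assumptions, not the platform's actual internals.

```python
import operator
import re

# Comparison operators supported by this sketch; ">=" must be checked before ">".
_OPS = {">=": operator.ge, "<=": operator.le,
        ">": operator.gt, "<": operator.lt, "==": operator.eq}

def interpolate(template: str, state: dict) -> str:
    # Replace {state.field} placeholders with current state values.
    return re.sub(r"\{state\.(\w+)\}", lambda m: str(state[m.group(1)]), template)

def evaluate_condition(condition: str, state: dict) -> bool:
    resolved = interpolate(condition, state)
    for symbol, fn in _OPS.items():
        if symbol in resolved:
            left, right = resolved.split(symbol, 1)
            return fn(float(left), float(right))
    # A bare expression like "{state.approved}" is treated as a truthiness check.
    return resolved.strip().lower() in ("true", "1", "yes")

def route(routes: list, state: dict) -> str:
    # First matching route wins; a route without a condition is the fallback.
    for r in routes:
        if "condition" not in r or evaluate_condition(r["condition"], state):
            return r["to"]
    raise ValueError("no matching route")
```

With `{"quality_score": 4}`, the two-route example from the YAML above falls through the `>= 7` check and returns the fallback target.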
- 4 providers: OpenAI, Anthropic, Google, Ollama
- Local-first: Run entirely on Ollama (zero cloud cost)
- Unified cost tracking: Compare provider costs
- Per-node configuration: Mix providers in one workflow
- Namespaced storage: Per-node, per-agent, per-workflow
- Pluggable backends: SQLite, PostgreSQL, Redis
- Automatic persistence: Survives crashes and restarts
- Context retention: Learn from previous executions
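A namespaced, pluggable memory backend of this kind can be sketched with an abstract interface plus a SQLite implementation. Class and method names here are illustrative assumptions, not the platform's real API.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional

class MemoryBackend(ABC):
    """Pluggable key-value memory, scoped by a namespace (node/agent/workflow)."""

    @abstractmethod
    def put(self, namespace: str, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, namespace: str, key: str) -> Optional[str]: ...

class SqliteBackend(MemoryBackend):
    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory "
            "(namespace TEXT, key TEXT, value TEXT, PRIMARY KEY (namespace, key))"
        )

    def put(self, namespace: str, key: str, value: str) -> None:
        # Upsert: repeated writes update in place instead of duplicating rows.
        self.conn.execute(
            "INSERT INTO memory VALUES (?, ?, ?) "
            "ON CONFLICT(namespace, key) DO UPDATE SET value = excluded.value",
            (namespace, key, value),
        )
        self.conn.commit()

    def get(self, namespace: str, key: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT value FROM memory WHERE namespace = ? AND key = ?",
            (namespace, key),
        ).fetchone()
        return row[0] if row else None
```

Pointing the constructor at a file path instead of `:memory:` is what makes state survive crashes and restarts; swapping in a PostgreSQL or Redis implementation only requires another `MemoryBackend` subclass.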
- MLFlow 3.9 integration: Automatic tracing and metrics
- Multi-provider cost tracking: Unified cost reporting
- Performance profiling: Bottleneck detection
- Execution traces: Per-node latency, tokens, cost
- Optimization: A/B testing, quality gates, prompt optimization
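Unified cross-provider cost reporting boils down to aggregating per-call token counts against a price table. The record fields and prices below are illustrative assumptions, not real provider pricing or the platform's tracking schema.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class LLMCall:
    provider: str
    node: str
    prompt_tokens: int
    completion_tokens: int

# Hypothetical ($ per 1K input tokens, $ per 1K output tokens) for illustration.
PRICES = {"openai": (0.03, 0.06), "anthropic": (0.015, 0.075), "ollama": (0.0, 0.0)}

def cost_report(calls: list) -> dict:
    # Sum input and output token costs per provider.
    totals = defaultdict(float)
    for c in calls:
        inp, out = PRICES[c.provider]
        totals[c.provider] += (c.prompt_tokens / 1000) * inp \
            + (c.completion_tokens / 1000) * out
    return dict(totals)
```

A local Ollama run naturally reports zero cost, which is what makes local-first prototyping attractive before switching a node to a hosted provider.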
- Chat UI (Gradio): Config generation through conversation
- Orchestration Dashboard (FastAPI + HTMX): Runtime management
- Agent Registry: Service discovery and health monitoring
- MLFlow UI: Embedded observability dashboard
- Real-time updates: SSE streaming for live monitoring
- WhatsApp: Trigger workflows from messages
- Telegram: Bot integration for workflow execution
- Generic webhooks: Any external system integration
- HMAC verification: Secure webhook endpoints
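HMAC verification for webhook endpoints can be sketched with the standard library. The header name and hex-encoded SHA-256 scheme are assumptions for illustration; the constant-time comparison is the essential part.

```python
import hashlib
import hmac

# Assumed scheme: sender sends hex(HMAC-SHA256(secret, raw_body))
# in a signature header alongside the webhook payload.

def sign(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    # compare_digest avoids timing side channels when checking signatures.
    return hmac.compare_digest(sign(secret, body), received_signature)
```

The endpoint rejects any request whose signature does not match the raw request body, so a forged `workflow_name` payload never reaches the orchestrator.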
- Parse-time validation: Catch errors before spending money
- Type safety: Full Pydantic schema validation
- Structured outputs: Guaranteed response formats
- Error handling: Graceful degradation and retries
- Security: Sandboxed execution, secret management
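The idea behind parse-time validation is to reject broken configs before a single LLM call is billed. The checks below follow the field names used in this README's YAML examples, but they are a simplified sketch, not the platform's full Pydantic validator.

```python
def validate_flow(flow: dict) -> list:
    """Return a list of error strings; empty means the flow parses clean."""
    errors = []
    node_ids = {n["id"] for n in flow.get("nodes", [])}
    state_fields = set(flow.get("state", {}).get("fields", {}))
    valid_targets = node_ids | {"START", "END"}

    # Every edge endpoint must be a declared node or START/END.
    for edge in flow.get("edges", []):
        for endpoint in (edge.get("from"), edge.get("to")):
            if endpoint is not None and endpoint not in valid_targets:
                errors.append(f"edge references unknown node: {endpoint}")

    # Every node output must map to a declared state field.
    for node in flow.get("nodes", []):
        for field in node.get("outputs", []):
            if field not in state_fields:
                errors.append(
                    f"node {node['id']} writes undeclared state field: {field}")
    return errors
```

Running this against a config with a typo'd edge target or an undeclared output field surfaces both problems immediately, offline and for free.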
- Local development: Run on laptop with SQLite
- Docker Compose: Multi-container deployments
- Kubernetes: Enterprise-scale with auto-scaling (v1.1+)
- Storage abstraction: Swap backends without code changes
| Version | Status | Shipped | Focus |
|---|---|---|---|
| v1.0 | ✅ Complete | 2026-02-04 | Multi-LLM, Control Flow, Observability, UIs |
| v1.1 | 🔮 Planning | TBD | Next milestone goals TBD |
📋 v1.0 Details: See .planning/milestones/v1.0-ROADMAP.md for complete breakdown
Status: 27/27 requirements complete | 4 phases, 19 plans | 1,000+ tests (98%+ pass rate)
- ✅ Multi-LLM support (OpenAI, Anthropic, Google, Ollama via LiteLLM)
- ✅ Advanced control flow (conditional routing, loops, parallel execution)
- ✅ Storage abstraction (SQLite → PostgreSQL → Cloud)
- ✅ Cost tracking across all providers
- ✅ Agent registry with heartbeat/TTL
- ✅ Minimal agent containers (~50-100MB)
- ✅ Lifecycle management (registration, health checks, deregistration)
- ✅ Enhanced observability (performance profiling, bottleneck detection)
- ✅ Gradio Chat UI for config generation
- ✅ FastAPI + HTMX orchestration dashboard
- ✅ Agent discovery and registration interface
- ✅ MLFlow UI iframe integration
- ✅ Real-time monitoring (SSE streaming)
- ✅ Webhook integrations (WhatsApp, Telegram, generic API)
- ✅ Code execution sandboxes (RestrictedPython + Docker)
- ✅ Persistent memory (namespaced, per-agent)
- ✅ 15 pre-built tools (web, file, data, system)
- ✅ A/B testing and optimization
- ✅ Quality gates and prompt optimization
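The agent registry's heartbeat/TTL lifecycle listed above can be sketched as a map from agent ID to last-seen timestamp. Names and the default TTL are illustrative assumptions; the injectable clock just makes the behavior testable.

```python
import time

class AgentRegistry:
    """Agents stay registered as long as they heartbeat within the TTL."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._last_seen = {}

    def heartbeat(self, agent_id: str) -> None:
        # Each heartbeat refreshes the agent's expiry window.
        self._last_seen[agent_id] = self.clock()

    def alive(self) -> list:
        now = self.clock()
        # Agents whose last heartbeat is older than the TTL are deregistered.
        expired = [a for a, t in self._last_seen.items() if now - t > self.ttl]
        for a in expired:
            del self._last_seen[a]
        return sorted(self._last_seen)
```

An agent that stops heartbeating simply ages out on the next liveness sweep; no explicit deregistration call is required for crash cleanup.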
Current capabilities:
```shell
# Run workflows with advanced control flow
configurable-agents run workflow.yaml --input topic="AI"

# Generate configs through chat
configurable-agents chat

# Manage workflows through dashboard
configurable-agents dashboard
# → http://localhost:8000 (Dashboard)
# → http://localhost:5000 (MLFlow UI)

# Trigger via webhooks
curl -X POST http://localhost:8000/webhooks/generic \
  -H "Content-Type: application/json" \
  -d '{"workflow_name": "research", "inputs": {"topic": "AI"}}'

# View costs and performance
configurable-agents report costs --period last_7_days
configurable-agents report profile --workflow research

# List agents
configurable-agents agents list

# Deploy workflows
configurable-agents deploy workflow.yaml
```

```shell
# Clone repository
git clone https://github.com/thatAverageGuy/configurable-agents.git
cd configurable-agents

# Install
pip install -e ".[dev]"

# Set up API keys (for providers you'll use)
cp .env.example .env
# Edit .env and add your API keys:
# - OPENAI_API_KEY (for OpenAI)
# - ANTHROPIC_API_KEY (for Anthropic)
# - GOOGLE_API_KEY (for Google Gemini)
# - No key needed for Ollama (local models)
```

Create hello.yaml:
```yaml
schema_version: "1.0"
flow:
  name: hello_world
  config:
    llm:
      provider: "openai"  # or anthropic, google, ollama
      model: "gpt-4"
      api_key_env: "OPENAI_API_KEY"
  state:
    fields:
      name: {type: str, required: true}
      greeting: {type: str, default: ""}
  nodes:
    - id: greet
      prompt: "Generate a friendly greeting for {state.name}"
      outputs: [greeting]
      output_schema:
        type: object
        fields:
          - name: greeting
            type: str
  edges:
    - {from: START, to: greet}
    - {from: greet, to: END}
```

Run it:

```shell
configurable-agents run hello.yaml --input name="Alice"
```

Learn more:
- QUICKSTART.md - Complete tutorial with v1.0 features
- examples/ - More working examples
- CONFIG_REFERENCE.md - Full config schema reference
```shell
configurable-agents chat
# → http://localhost:7860
```

Describe your workflow in natural language and get a valid YAML config instantly. Session persistence keeps your conversation history.
```shell
configurable-agents dashboard
# → http://localhost:8000 (Dashboard)
# → http://localhost:5000 (MLFlow UI embedded)
```

- View running workflows
- Inspect state and logs
- Trigger new executions
- Monitor agent registry
- Real-time updates via SSE
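The real-time updates rely on the Server-Sent Events wire format: `event:` and `data:` lines terminated by a blank line. A minimal formatter can be sketched as follows; the event name and payload shape are assumptions, not the dashboard's actual stream schema.

```python
import json

def sse_event(event: str, data: dict) -> str:
    # An SSE message is an "event:" line, a "data:" line, and a blank line.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"
```

A server yields strings like `sse_event("node_finished", {"node": "search", "latency_ms": 1200})` over a long-lived HTTP response, and the browser's `EventSource` dispatches them by event name.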
Stack:
- LangGraph: Graph execution engine
- LiteLLM: Multi-LLM abstraction
- Pydantic: Type validation
- MLFlow: Observability (v3.9+)
- FastAPI: API servers
- Gradio: Chat UI
- HTMX: Dashboard interactivity
- SQLite/PostgreSQL: Storage backends
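The parallel fan-out/fan-in pattern mentioned among the control-flow features can be sketched with plain Python concurrency. The branch signature and last-write-wins merge strategy here are assumptions, not how LangGraph actually reduces state.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out_fan_in(state: dict, branches: dict) -> dict:
    """Run each branch against a copy of state, then merge results back in."""
    with ThreadPoolExecutor() as pool:
        # Fan-out: every branch gets its own snapshot of the current state.
        futures = {name: pool.submit(fn, dict(state)) for name, fn in branches.items()}
        # Fan-in: merge each branch's partial update into the shared state.
        for name, future in futures.items():
            state.update(future.result())
    return state
```

Branches that write disjoint state fields compose cleanly; overlapping writes would need an explicit reducer rather than a blind `update`.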
Design philosophy: Local-first, config-driven, pluggable, observable.
See ARCHITECTURE.md for detailed system design.
4 Phases, 19 Plans, 27 Requirements, 1,000+ Tests
- Multi-LLM abstraction via LiteLLM
- Advanced control flow (conditionals, loops, parallel)
- Storage abstraction (SQLite/PostgreSQL)
- Unified cost tracking
- Agent registry with heartbeat/TTL
- Minimal container design (~50-100MB)
- Performance profiling
- Bottleneck detection
- Gradio Chat UI
- FastAPI + HTMX dashboard
- Agent discovery interface
- Webhook integrations (WhatsApp, Telegram, generic)
- Code execution sandboxes (RestrictedPython + Docker)
- Persistent memory backend
- 15 pre-built tools
- A/B testing and optimization
Full details: .planning/milestones/v1.0-ROADMAP.md
- ARCHITECTURE.md - System design overview
- SPEC.md - Complete config schema specification
- PROJECT_VISION.md - Long-term vision and philosophy
- CONTEXT.md - Development context (living document)
- Architecture Decision Records - Design decisions and rationale
- 18 ADRs covering all major decisions
- Immutable history (append-only)
- Alternatives considered with tradeoffs
- QUICKSTART.md - Get started in 5 minutes
- CONFIG_REFERENCE.md - Config schema guide
- OBSERVABILITY.md - Monitoring and tracking
- DEPLOYMENT.md - Docker deployment guide
- TROUBLESHOOTING.md - Common issues and solutions
- SECURITY_GUIDE.md - Security best practices
- TOOL_DEVELOPMENT.md - Custom tool creation
- PERFORMANCE_OPTIMIZATION.md - A/B testing and quality gates
- PRODUCTION_DEPLOYMENT.md - Production patterns
- ADVANCED_TOPICS.md - Advanced features overview
- SETUP.md - Development setup guide
- API Reference - Complete API documentation (auto-generated)
This is an active open-source project (v1.0 shipped).
We welcome contributions:
- ⭐ Star the repo to follow progress
- 📝 Open issues for bugs or feature requests
- 💬 Join discussions
- 🔧 Submit pull requests
```yaml
# Multi-step research with quality gates
nodes:
  - id: search
    tools: [serper_search]
  - id: evaluate_quality
  - id: summarize
    routes:
      - condition: "{state.quality} >= 7"
        to: END
      - to: search  # Retry if low quality
```

```yaml
# Blog post pipeline with review loop
nodes:
  - id: outline
  - id: draft
  - id: review
    routes:
      - condition: "{state.approved}"
        to: END
      - to: draft  # Revise if not approved
  - id: polish
```

```yaml
# ETL workflow with parallel processing
nodes:
  - id: extract
  - id: transform_parallel
    parallel: true  # Run in parallel
  - id: validate
  - id: load
```

```yaml
# Email triage with webhook trigger
webhooks:
  - trigger: generic
    workflow: email_triage
nodes:
  - id: classify
  - id: prioritize
  - id: draft_response
```

MIT License - see LICENSE for details.
Built with inspiration from:
- LangGraph (graph-based agent execution)
- LiteLLM (multi-LLM abstraction)
- MLFlow (LLM observability)
- Infrastructure as Code (Terraform, Docker Compose)
Questions? Ideas? Feedback?
- 📧 Email: yogesh.singh893@gmail.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Made with ❤️ for the agent builder community
Star ⭐ this repo to follow our progress!