A kernel architecture for governing autonomous AI agents
pip install agent-os-kernel

from agent_os import StatelessKernel

# Create a governed agent in 3 lines
kernel = StatelessKernel()

# Your agent runs with policy enforcement
result = await kernel.execute(
    action="database_query",
    params={"query": "SELECT * FROM users"},
    policies=["read_only"]
)
# ✅ Safe queries execute
# ❌ "DROP TABLE users" → Blocked by kernel

That's it! Your agent now has deterministic policy enforcement.
from agent_os import StatelessKernel
kernel = StatelessKernel()
kernel.load_policy_yaml("""
version: "1.0"
name: api-safety
rules:
  - name: block-destructive-sql
    condition: "action == 'database_query'"
    action: deny
    pattern: "DROP|TRUNCATE|DELETE FROM .* WHERE 1=1"
  - name: rate-limit-api
    condition: "action == 'api_call'"
    limit: "100/hour"
""")
result = await kernel.execute(action="database_query", params={"query": "DROP TABLE users"})
# ❌ Blocked: Matched rule 'block-destructive-sql'

from agent_os import KernelSpace
kernel = KernelSpace()
# Every kernel action is automatically recorded
result = await kernel.execute(action="read_file", params={"path": "/data/report.csv"})
# Query the flight recorder
entries = kernel.flight_recorder.query(agent_id="agent-001", limit=10)
for entry in entries:
    print(f"{entry.timestamp} | {entry.action} | {entry.outcome}")

from agent_os import KernelSpace
from agent_os.emk import EpisodicMemory
kernel = KernelSpace(policy_file="policies.yaml")
memory = EpisodicMemory(max_turns=50)
@kernel.register
async def chat(message: str, conversation_id: str = "default") -> str:
    history = memory.get_history(conversation_id)
    response = await call_llm(history + [{"role": "user", "content": message}])
    memory.add_turn(conversation_id, message, response)
    return response
# Outputs are checked against content policies; violations trigger SIGSTOP

See examples/ for 20+ runnable demos including SQL agents, GitHub reviewers, and compliance bots.
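The chat example above passes max_turns=50 to its memory object. One plausible reading of that cap is a per-conversation buffer that discards the oldest turns. The sketch below illustrates this with the standard library only; BoundedMemory is a hypothetical stand-in, not the EpisodicMemory class from agent_os.emk, whose eviction behavior is not described here.

```python
from collections import defaultdict, deque


class BoundedMemory:
    """Hypothetical stand-in for a per-conversation memory with a turn cap."""

    def __init__(self, max_turns=50):
        # Each conversation gets its own deque; old turns fall off the left.
        self._turns = defaultdict(lambda: deque(maxlen=max_turns))

    def add_turn(self, conversation_id, message, response):
        self._turns[conversation_id].append((message, response))

    def get_history(self, conversation_id):
        # Flatten stored turns into the role/content shape used by chat APIs.
        history = []
        for message, response in self._turns[conversation_id]:
            history.append({"role": "user", "content": message})
            history.append({"role": "assistant", "content": response})
        return history


memory = BoundedMemory(max_turns=2)
for i in range(3):
    memory.add_turn("default", f"question {i}", f"answer {i}")

history = memory.get_history("default")
```

With a cap of two turns, the first of the three turns is dropped, so the history starts at "question 1".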
from agent_os import StatelessKernel, stateless_execute
# 1. Define safety policies (not prompts — actual enforcement)
kernel = StatelessKernel(policies=["read_only", "no_pii"])
# 2. Actions are checked against policies before execution
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT revenue FROM sales"},
    agent_id="analyst-001",
    policies=["read_only"]
)
# ✅ Safe queries execute
# ❌ "DROP TABLE users" → BLOCKED (not by prompt, by kernel)

Result: Defined policies are deterministically enforced by the kernel, not by hoping the LLM follows instructions.
For the full kernel with signals, VFS, and protection rings:
from agent_os import KernelSpace, AgentSignal, AgentVFS
# Requires: pip install agent-os-kernel[full]
kernel = KernelSpace()
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "Hello World")

Note: KernelSpace, AgentSignal, and AgentVFS require installing the control-plane module: pip install agent-os-kernel[full]
Agent OS applies operating system concepts to AI agent governance. Instead of relying on prompts to enforce safety ("please don't do dangerous things"), it provides application-level middleware that intercepts and validates agent actions before execution.
Note: This is application-level enforcement (Python middleware), not OS kernel-level isolation. Agents run in the same process. For true isolation, run agents in containers.
┌─────────────────────────────────────────────────────────┐
│ USER SPACE (Agent Code) │
│ Your agent code runs here. The kernel intercepts │
│ actions before they execute. │
├─────────────────────────────────────────────────────────┤
│ KERNEL SPACE │
│ Policy Engine │ Flight Recorder │ Signal Dispatch │
│ Actions are checked against policies before execution │
└─────────────────────────────────────────────────────────┘
Prompt-based safety asks the LLM to follow rules. The LLM decides whether to comply.
Kernel-based safety intercepts actions before execution. The policy engine decides, not the LLM.
This is the same principle operating systems use: applications request resources, the kernel grants or denies access based on permissions.
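The interception principle can be sketched in a few lines of plain Python. Everything here (MiniKernel, DenyRule, PolicyViolation, run_query) is hypothetical and only illustrates the idea of checking an action before its handler runs; it is not the Agent OS implementation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DenyRule:
    """Hypothetical rule: deny an action whose parameters match a regex."""
    name: str
    action: str
    pattern: str


class PolicyViolation(Exception):
    """Raised when a rule blocks an action before it runs."""


class MiniKernel:
    """Illustrative gate only; not the Agent OS kernel."""

    def __init__(self, rules):
        self._rules = list(rules)

    def execute(self, action, params, handler):
        # The check runs before the handler, so a blocked action never executes.
        text = " ".join(str(value) for value in params.values())
        for rule in self._rules:
            if rule.action == action and re.search(rule.pattern, text, re.IGNORECASE):
                raise PolicyViolation(f"Blocked by rule '{rule.name}'")
        return handler(**params)


def run_query(query):
    # Stand-in for a real database call.
    return f"executed: {query}"


rules = [DenyRule("block-destructive-sql", "database_query", r"\b(DROP|TRUNCATE)\b")]
kernel = MiniKernel(rules)

allowed = kernel.execute("database_query", {"query": "SELECT * FROM users"}, run_query)

try:
    kernel.execute("database_query", {"query": "DROP TABLE users"}, run_query)
    blocked = False
except PolicyViolation:
    blocked = True
```

The decision depends only on the rule set and the action parameters, which is what makes it deterministic. A regex blocklist like this one still only catches the patterns it lists.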
graph TB
subgraph "Layer 4: Intelligence"
SCAK[Self-Correcting Agent Kernel]
MUTE[Mute Agent]
end
subgraph "Layer 3: Control Plane"
KERNEL[🎯 THE KERNEL<br/>Policy Engine + Signals]
OBS[Observability<br/>Prometheus + OTEL]
end
subgraph "Layer 2: Infrastructure"
AMB[Agent Message Bus]
IATP[Inter-Agent Trust Protocol]
ATR[Agent Tool Registry]
end
subgraph "Layer 1: Primitives"
PRIM[Base Types + Failures]
CMVK[Cross-Model Verification]
CAAS[Context-as-a-Service]
EMK[Episodic Memory Kernel]
end
SCAK --> KERNEL
MUTE --> KERNEL
KERNEL --> AMB
KERNEL --> IATP
KERNEL --> OBS
AMB --> PRIM
IATP --> PRIM
ATR --> PRIM
CMVK --> PRIM
EMK --> PRIM
CAAS --> PRIM
agent-os/
├── src/agent_os/ # Core Python package
│ ├── __init__.py # Public API (re-exports from all layers)
│ ├── stateless.py # StatelessKernel (zero-dependency core)
│ ├── base_agent.py # BaseAgent, ToolUsingAgent classes
│ ├── agents_compat.py # AGENTS.md parser (OpenAI/Anthropic standard)
│ ├── cli.py # CLI (agent-os check, review, init, etc.)
│ └── integrations/ # Framework adapters (LangChain, OpenAI, etc.)
├── modules/ # Kernel Modules (4-layer architecture)
│ ├── primitives/ # Layer 1: Base types and failures
│ ├── cmvk/ # Layer 1: Cross-model verification
│ ├── emk/ # Layer 1: Episodic memory kernel
│ ├── caas/ # Layer 1: Context-as-a-Service
│ ├── amb/ # Layer 2: Agent message bus
│ ├── iatp/ # Layer 2: Inter-agent trust protocol
│ ├── atr/ # Layer 2: Agent tool registry
│ ├── observability/ # Layer 3: Prometheus + OpenTelemetry
│ ├── control-plane/ # Layer 3: THE KERNEL (policies, signals)
│ ├── scak/ # Layer 4: Self-correcting agent kernel
│ ├── mute-agent/ # Layer 4: Face/Hands architecture
│ ├── nexus/ # Experimental: Trust exchange network
│ └── mcp-kernel-server/ # Integration: MCP protocol support
├── extensions/ # IDE & AI Assistant Extensions
│ ├── mcp-server/ # ⭐ MCP Server (Copilot, Claude, Cursor)
│ ├── vscode/ # VS Code extension
│ ├── copilot/ # GitHub Copilot extension
│ ├── jetbrains/ # IntelliJ/PyCharm plugin
│ ├── cursor/ # Cursor IDE extension
│ ├── chrome/ # Chrome extension
│ └── github-cli/ # gh CLI extension
├── examples/ # Working examples
├── docs/ # Documentation
├── tests/ # Test suite (organized by layer)
├── notebooks/ # Jupyter tutorials
├── papers/ # Research papers
└── templates/ # Policy templates
| Module | Layer | PyPI Package | Description | Status |
|---|---|---|---|---|
| primitives | 1 | agent-primitives | Base failure types, severity levels | ✅ Stable |
| cmvk | 1 | cmvk | Cross-model verification, drift detection | ✅ Stable |
| emk | 1 | emk | Episodic memory kernel (append-only ledger) | ✅ Stable |
| caas | 1 | caas-core | Context-as-a-Service, RAG pipeline | ✅ Stable |
| amb | 2 | amb-core | Agent message bus (async pub/sub) | ✅ Stable |
| iatp | 2 | inter-agent-trust-protocol | Sidecar trust protocol, typed IPC pipes | ✅ Stable |
| atr | 2 | agent-tool-registry | Tool registry with LLM schema generation | ✅ Stable |
| control-plane | 3 | agent-control-plane | THE KERNEL: policy engine, signals, VFS | ✅ Stable |
| observability | 3 | agent-os-observability | Prometheus metrics + OpenTelemetry tracing | |
| scak | 4 | scak | Self-correcting agent kernel | ✅ Stable |
| mute-agent | 4 | mute-agent | Decoupled reasoning/execution architecture | |
| nexus | - | Not published | Trust exchange network | 🔬 Prototype |
| mcp-kernel-server | Int | mcp-kernel-server | MCP server for Claude Desktop | |
| Extension | Description | Status |
|---|---|---|
| mcp-server | ⭐ MCP Server: works with Claude, Copilot, Cursor (npx agentos-mcp-server) | ✅ Published (v1.0.1) |
| vscode | VS Code extension with real-time policy checks, enterprise features | ✅ Published (v1.0.1) |
| copilot | GitHub Copilot extension (Vercel/Docker deployment) | ✅ Published (v1.0.0) |
| jetbrains | IntelliJ, PyCharm, WebStorm plugin (Kotlin) | ✅ Built (v1.0.0) |
| cursor | Cursor IDE extension (Composer integration) | ✅ Built (v0.1.0) |
| chrome | Chrome extension for GitHub, Jira, AWS, GitLab | ✅ Built (v1.0.0) |
| github-cli | gh agent-os CLI extension | |
pip install agent-os-kernel

Or with optional components:
pip install agent-os-kernel[cmvk] # + cross-model verification
pip install agent-os-kernel[iatp] # + inter-agent trust
pip install agent-os-kernel[observability] # + Prometheus/OpenTelemetry
pip install agent-os-kernel[nexus] # + trust exchange network
pip install agent-os-kernel[full] # Everything

macOS/Linux:
curl -sSL https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.sh | bash

Windows (PowerShell):
iwr -useb https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.ps1 | iex

from agent_os import StatelessKernel, stateless_execute
# Create kernel with policy
kernel = StatelessKernel(policies=["read_only", "no_pii"])
# Execute with policy enforcement
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT * FROM users"},
    agent_id="analyst-001",
    policies=["read_only"]
)

from agent_os import KernelSpace, AgentSignal, PolicyRule
kernel = KernelSpace()
# Create agent context with VFS
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "analyze this data")
# Policy enforcement
from agent_os import PolicyEngine
engine = PolicyEngine()
engine.add_rule(PolicyRule(name="no_sql_injection", pattern="DROP|DELETE|TRUNCATE"))

Agent OS borrows concepts from POSIX operating systems:
| Concept | POSIX | Agent OS |
|---|---|---|
| Process control | SIGKILL, SIGSTOP | AgentSignal.SIGKILL, AgentSignal.SIGSTOP |
| Filesystem | /proc, /tmp | VFS with /mem/working, /mem/episodic |
| IPC | Pipes (\|) | Typed IPC pipes between agents |
| Syscalls | open(), read() | kernel.execute() |
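The process-control row can be pictured with plain asyncio. The sketch below is an assumption-laden illustration: PausableAgent and its sigstop, sigcont, and sigkill methods are hypothetical names, and nothing here reflects how the Agent OS signal dispatcher is actually built. It only shows one way a cooperative agent loop could honor pause, resume, and terminate requests.

```python
import asyncio


class PausableAgent:
    """Hypothetical agent loop that honors stop, continue, and kill requests."""

    def __init__(self):
        self._running = asyncio.Event()
        self._running.set()          # start in the running state
        self._killed = False
        self.steps = 0

    def sigstop(self):
        self._running.clear()        # pause before the next step

    def sigcont(self):
        self._running.set()          # resume

    def sigkill(self):
        self._killed = True
        self._running.set()          # wake a paused loop so it can exit

    async def run(self):
        while True:
            await self._running.wait()
            if self._killed:
                break
            self.steps += 1
            await asyncio.sleep(0)   # yield control between steps


async def demo():
    agent = PausableAgent()
    task = asyncio.create_task(agent.run())
    await asyncio.sleep(0.01)
    agent.sigstop()
    await asyncio.sleep(0.01)        # let any in-flight step finish
    paused_at = agent.steps
    await asyncio.sleep(0.01)
    still_paused = agent.steps == paused_at
    agent.sigcont()
    await asyncio.sleep(0.01)
    resumed = agent.steps > paused_at
    agent.sigkill()
    await task
    return still_paused, resumed


still_paused, resumed = asyncio.run(demo())
```

Unlike POSIX signals, this kind of pause is cooperative: the loop only stops at the points where it checks the event, which fits the application-level scope described earlier.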
# Requires: pip install agent-os-kernel[full]
from agent_os import SignalDispatcher, AgentSignal
dispatcher = SignalDispatcher()
dispatcher.signal(agent_id, AgentSignal.SIGSTOP) # Pause
dispatcher.signal(agent_id, AgentSignal.SIGCONT) # Resume
dispatcher.signal(agent_id, AgentSignal.SIGKILL) # Terminate

# Requires: pip install agent-os-kernel[full]
from agent_os import AgentVFS
vfs = AgentVFS(agent_id="agent-001")
vfs.write("/mem/working/task.txt", "Current task")
vfs.read("/policy/rules.yaml") # Read-only from user space

Wrap existing frameworks with Agent OS governance:
# LangChain
from agent_os.integrations import LangChainKernel
governed = LangChainKernel().wrap(my_chain)
# OpenAI Assistants
from agent_os.integrations import OpenAIKernel
governed = OpenAIKernel().wrap_assistant(assistant, client)
# Semantic Kernel
from agent_os.integrations import SemanticKernelWrapper
governed = SemanticKernelWrapper().wrap(sk_kernel)
# CrewAI
from agent_os.integrations import CrewAIKernel
governed = CrewAIKernel().wrap(my_crew)
# AutoGen
from agent_os.integrations import AutoGenKernel
governed = AutoGenKernel().wrap(autogen_agent)
# OpenAI Agents SDK
from agent_os.integrations import OpenAIAgentsSDKKernel
governed = OpenAIAgentsSDKKernel().wrap(agent)

Note: These adapters use lazy interception: they don't require the target framework to be installed until you call .wrap().
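A deferred import is one common way to get this behavior. The sketch below is a guess at the general pattern, not the code of the Agent OS adapters: LazyAdapter is a hypothetical class, and the stdlib json module stands in for an installed framework.

```python
import importlib


class LazyAdapter:
    """Hypothetical adapter that defers importing its target framework."""

    def __init__(self, module_name):
        # Only the name is stored here; nothing is imported yet.
        self._module_name = module_name
        self._module = None

    def wrap(self, target):
        # The import happens on first use, so a missing framework
        # surfaces here rather than at construction time.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return {"framework": self._module.__name__, "target": target}


# Constructing an adapter for an uninstalled package does not fail.
adapter = LazyAdapter("framework_that_is_not_installed")

try:
    adapter.wrap(object())
    import_failed = False
except ImportError:
    import_failed = True

# A module that is present (the stdlib json module stands in here) wraps fine.
wrapped = LazyAdapter("json").wrap("my_chain")
```

The practical effect is that importing the integrations package stays cheap, and users only need the frameworks they actually wrap.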
See integrations documentation for full details.
Agent Frameworks (LangChain, CrewAI): Build agents. Agent OS governs them. Use together.
Safety Tools (NeMo Guardrails, LlamaGuard): Input/output filtering. Agent OS intercepts actions mid-execution.
| Tool | Focus | When it acts |
|---|---|---|
| LangChain/CrewAI | Building agents | N/A (framework) |
| NeMo Guardrails | Input/output filtering | Before/after LLM call |
| LlamaGuard | Content classification | Before/after LLM call |
| Agent OS | Action interception | During execution |
The examples/ directory contains demos at various levels:
| Demo | Description | Command |
|---|---|---|
| demo-app | Uses the stateless API (most reliable) | cd examples/demo-app && python demo.py |
| hello-world | Minimal example | cd examples/hello-world && python agent.py |
| quickstart | Quick intro | cd examples/quickstart && python my_first_agent.py |
These examples are self-contained and don't require external Agent OS imports:
| Demo | Description |
|---|---|
| healthcare-hipaa | HIPAA-compliant agent |
| customer-service | Customer support agent |
| legal-review | Legal document analysis |
| crewai-safe-mode | CrewAI with safety wrappers |
| Demo | Description | Command |
|---|---|---|
| carbon-auditor | Multi-model verification | cd examples/carbon-auditor && docker-compose up |
| grid-balancing | Multi-agent coordination | cd examples/grid-balancing && docker-compose up |
| defi-sentinel | Real-time attack detection | cd examples/defi-sentinel && docker-compose up |
| pharma-compliance | Document analysis | cd examples/pharma-compliance && docker-compose up |
Each production demo includes:
- Grafana dashboard on port 300X
- Prometheus metrics on port 909X
- Jaeger tracing on port 1668X
# Run carbon auditor with full observability
cd examples/carbon-auditor
cp .env.example .env # Optional: add API keys
docker-compose up
# Open dashboards
open http://localhost:3000 # Grafana (admin/admin)
open http://localhost:16686 # Jaeger traces

Agent OS includes pre-built safe tools via the Agent Tool Registry:
# Requires: pip install agent-os-kernel[full]
from atr import ToolRegistry, tool
@tool(name="safe_http", description="Rate-limited HTTP requests")
async def safe_http(url: str) -> dict:
# Tool is automatically registered and sandboxed
...
registry = ToolRegistry()
registry.register(safe_http)
# Generate schemas for any LLM
openai_tools = registry.to_openai_schema()
anthropic_tools = registry.to_anthropic_schema()

Connect agents using the async message bus:
# Requires: pip install agent-os-kernel[full]
from amb_core import MessageBus, Message
bus = MessageBus()
await bus.subscribe("tasks", handler)
await bus.publish("tasks", Message(payload={"task": "analyze"}))

Broker adapters available for Redis, Kafka, and NATS (requires optional dependencies).
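For readers new to async pub/sub, the subscribe/publish flow can be reduced to a single-process sketch. InMemoryBus below is hypothetical and far simpler than amb-core: it has no broker, persistence, dead-letter queue, or tracing, and it only shows how topic-based delivery to async handlers works.

```python
import asyncio
from collections import defaultdict


class InMemoryBus:
    """Hypothetical single-process bus; no broker, no persistence."""

    def __init__(self):
        self._handlers = defaultdict(list)

    async def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    async def publish(self, topic, payload):
        # Deliver to every subscriber of the topic concurrently.
        await asyncio.gather(*(handler(payload) for handler in self._handlers[topic]))


async def demo():
    received = []

    async def handler(payload):
        received.append(payload)

    bus = InMemoryBus()
    await bus.subscribe("tasks", handler)
    await bus.publish("tasks", {"task": "analyze"})
    await bus.publish("other", {"task": "ignored"})  # no subscribers on this topic
    return received


received = asyncio.run(demo())
```

Only the message published to "tasks" reaches the handler; the message on "other" has no subscribers and is dropped.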
Agent OS includes a CLI for terminal workflows:
# Check files for safety violations
agent-os check src/app.py
# Check staged git files (pre-commit)
agent-os check --staged
# Multi-model code review (simulated in current version)
agent-os review src/app.py
# Install git pre-commit hook
agent-os install-hooks
# Initialize Agent OS in project
agent-os init
# Validate AGENTS.md configuration
agent-os validate

Agent OS provides an MCP server that works with any MCP-compatible AI assistant:
# Quick install via npx
npx agentos-mcp-server

npm: agentos-mcp-server
MCP Registry: io.github.imran-siddique/agentos
Add to your config file:
Claude Desktop (%APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "agentos": {
      "command": "npx",
      "args": ["-y", "agentos-mcp-server"]
    }
  }
}

Features: 10 tools for agent creation, policy enforcement, compliance checking (SOC 2, GDPR, HIPAA), human-in-the-loop approvals, and audit logging.
See MCP server documentation for full details.
- 5-Minute Quickstart — Get running fast
- 30-Minute Deep Dive — Comprehensive walkthrough
- Building Your First Governed Agent — Complete tutorial
- Using Message Bus Adapters — Connect agents
- Creating Custom Tools — Build safe tools
- Cheatsheet — Quick reference
| Notebook | Description | Time |
|---|---|---|
| Hello Agent OS | Your first governed agent | 5 min |
| Episodic Memory | Agent memory that persists | 15 min |
| Time-Travel Debugging | Replay and debug decisions | 20 min |
| Cross-Model Verification | Detect hallucinations | 15 min |
| Multi-Agent Coordination | Trust between agents | 20 min |
| Policy Engine | Deep dive into policies | 15 min |
- Quickstart Guide — 60 seconds to first agent
- Framework Integrations — LangChain, OpenAI, etc.
- Kernel Internals — How the kernel works
- Architecture Overview — System design
- CMVK Algorithm — Cross-model verification
- RFC-003: Agent Signals — POSIX-style signals
- RFC-004: Agent Primitives — Core primitives
This is a research project exploring kernel concepts for AI agent governance.
These components are fully implemented and tested:
| Component | Tests |
|---|---|
| StatelessKernel: Zero-dependency policy enforcement (src/agent_os/) | ✅ Full coverage |
| Policy Engine: Deterministic rule enforcement | ✅ Tested |
| Flight Recorder: SQLite-based audit logging | ✅ Tested |
| CLI: agent-os check, init, secure, validate | ✅ Tested |
| Framework Adapters: LangChain, OpenAI, Semantic Kernel, CrewAI, AutoGen, OpenAI Agents SDK | ✅ Implemented |
| AGENTS.md Parser: OpenAI/Anthropic standard agent config | ✅ Full coverage |
| Primitives (agent-primitives): Failure types, severity levels | ✅ Tested |
| CMVK (cmvk): Drift detection, distance metrics (955+ lines) | ✅ Tested |
| EMK (emk): Episodic memory with JSONL storage | ✅ 8 test files |
| AMB (amb-core): Async message bus, DLQ, tracing | ✅ 6 test files |
| IATP (inter-agent-trust-protocol): Sidecar trust, typed IPC | ✅ 9 test files |
| ATR (agent-tool-registry): Multi-LLM schema generation | ✅ 6 test files |
| Control Plane (agent-control-plane): Signals, VFS, protection rings | ✅ 18 test files |
| SCAK (scak): Self-correcting agent kernel | ✅ 23 test files |
| Component | What's Missing |
|---|---|
| Mute Agent (mute-agent) | No tests; all layer dependencies use mock adapters |
| Observability (agent-os-observability) | No tests; Prometheus metrics, Grafana dashboards, OTel tracing implemented |
| MCP Kernel Server (mcp-kernel-server) | No tests; 1173-line implementation |
| GitHub CLI Extension | Single bash script with simulated output |
| Control Plane MCP Adapter | Placeholder — returns canned responses |
| Control Plane A2A Adapter | Placeholder — negotiation accepts all params |
| Component | What's Missing |
|---|---|
| Nexus Trust Exchange | No pyproject.toml, no tests, placeholder cryptography (XOR — not secure), all signature verification stubbed, in-memory storage only |
| Limitation | Impact | Mitigation |
|---|---|---|
| Application-level only | Direct stdlib calls (subprocess, open) bypass kernel | Pair with container isolation for production |
| Blocklist-based policies | Novel attack patterns not in rules will pass | Add AST-level parsing (#32), use defense in depth |
| Shadow Mode single-step | Multi-step agent simulations diverge from reality | Use for single-turn validation only |
| No tamper-proof audit | Flight Recorder SQLite can be modified by compromised agent | Write to external sink for critical audits |
| Provider-coupled adapters | Each SDK needs separate adapter | Abstract interface planned (#47) |
See GitHub Issues for the full roadmap.
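The audit limitation in the table above can be partly offset by making tampering detectable. The sketch below shows a generic hash-chained log using only the standard library. It is not part of Agent OS, and the function names and record fields are assumptions for illustration.

```python
import hashlib
import json


def append_entry(log, record):
    """Append a record whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    log.append({"record": record, "prev": prev_hash, "hash": digest})


def verify(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = "0" * 64
    for entry in log:
        body = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"agent": "agent-001", "action": "read_file", "outcome": "allowed"})
append_entry(log, {"agent": "agent-001", "action": "database_query", "outcome": "blocked"})
intact = verify(log)

# Simulate a compromised agent rewriting history in place.
log[1]["record"]["outcome"] = "allowed"
tampered_detected = not verify(log)
```

A chain like this only detects edits if the latest hash is stored somewhere the agent cannot write, since an attacker with full access could recompute every hash. That is consistent with the mitigation of writing to an external sink.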
ModuleNotFoundError: No module named 'agent_os'
# Install from source
git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e .

Optional modules not available
# Check what's installed
python -c "from agent_os import check_installation; check_installation()"
# Install everything
pip install -e ".[full]"

Permission errors on Windows
# Run PowerShell as Administrator, or use --user flag
pip install --user -e .

Docker not working
# Build with Dockerfile (no Docker Compose needed for simple tests)
docker build -t agent-os .
docker run -it agent-os python examples/demo-app/demo.py

Tests failing with API errors
# Most tests work without API keys — mock mode is default
pytest tests/ -v
# For real LLM tests, set environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e ".[dev]"
pytest

MIT (see LICENSE)