A Safety-First Kernel for Autonomous AI Agents - POSIX-inspired primitives with 0% policy violation guarantee

imran-siddique/agent-os

Agent OS

A kernel architecture for governing autonomous AI agents

If this project helps you, please star it! It helps others discover Agent OS.

Quick Start · Documentation · VS Code Extension · Examples


Open in Gitpod: try Agent OS instantly in your browser, no installation required.


⚡ Quick Start in 30 Seconds

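pip install agent-os-kernel

import asyncio
from agent_os import StatelessKernel

async def main():
    # Create a governed agent in 3 lines
    kernel = StatelessKernel()

    # Your agent runs with policy enforcement
    result = await kernel.execute(
        action="database_query",
        params={"query": "SELECT * FROM users"},
        policies=["read_only"]
    )
    # ✅ Safe queries execute
    # ❌ "DROP TABLE users" → Blocked by kernel
    return result

# `await` needs a running event loop; the later snippets assume one,
# for example inside an `async def main()` like this.
asyncio.run(main())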

That's it! Your agent now has deterministic policy enforcement. Learn more →

📋 More examples

Policy enforcement with custom rules

from agent_os import StatelessKernel

kernel = StatelessKernel()
kernel.load_policy_yaml("""
version: "1.0"
name: api-safety
rules:
  - name: block-destructive-sql
    condition: "action == 'database_query'"
    action: deny
    pattern: "DROP|TRUNCATE|DELETE FROM .* WHERE 1=1"
  - name: rate-limit-api
    condition: "action == 'api_call'"
    limit: "100/hour"
""")

result = await kernel.execute(action="database_query", params={"query": "DROP TABLE users"})
# ❌ Blocked: Matched rule 'block-destructive-sql'
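
To make the rule semantics concrete, here is a minimal sketch of how a rule such as block-destructive-sql might be evaluated. It is not the Agent OS policy engine: the `Rule` dataclass, the `applies_to` field (a simplified stand-in for the `condition` expression), and the case-insensitive search are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical in-memory form of one YAML rule."""
    name: str
    applies_to: str  # simplified stand-in for the `condition` expression
    pattern: str     # regular expression searched in the query text


RULES = [
    Rule(
        name="block-destructive-sql",
        applies_to="database_query",
        pattern=r"DROP|TRUNCATE|DELETE FROM .* WHERE 1=1",
    ),
]


def first_denying_rule(action: str, query: str,
                       rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the name of the first rule that denies the action, or None."""
    for rule in rules:
        # Assumption: patterns are matched case-insensitively.
        if rule.applies_to == action and re.search(rule.pattern, query, re.IGNORECASE):
            return rule.name
    return None


print(first_denying_rule("database_query", "DROP TABLE users"))
print(first_denying_rule("database_query", "SELECT * FROM users"))
```

Because the pattern is an unanchored substring search, a query that merely mentions a column such as `dropped_at` would also match; adding word boundaries (`\b`) to the pattern tightens it.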

Audit logging

from agent_os import KernelSpace

kernel = KernelSpace()

# Every kernel action is automatically recorded
result = await kernel.execute(action="read_file", params={"path": "/data/report.csv"})

# Query the flight recorder
entries = kernel.flight_recorder.query(agent_id="agent-001", limit=10)
for entry in entries:
    print(f"{entry.timestamp} | {entry.action} | {entry.outcome}")
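
The Status section describes the Flight Recorder as SQLite-based. The sketch below shows what such an audit log could look like using only the standard library. The class name `MiniFlightRecorder`, the table schema, and the column names are hypothetical and are not taken from Agent OS.

```python
import sqlite3
import time
from typing import List, Tuple


class MiniFlightRecorder:
    """Append-only audit log backed by SQLite (hypothetical schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS audit ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "ts REAL NOT NULL, agent_id TEXT NOT NULL, "
            "action TEXT NOT NULL, outcome TEXT NOT NULL)"
        )

    def record(self, agent_id: str, action: str, outcome: str) -> None:
        """Append one entry; parameters are bound, never string-formatted."""
        self._db.execute(
            "INSERT INTO audit (ts, agent_id, action, outcome) VALUES (?, ?, ?, ?)",
            (time.time(), agent_id, action, outcome),
        )
        self._db.commit()

    def query(self, agent_id: str, limit: int = 10) -> List[Tuple[float, str, str]]:
        """Return the most recent entries for one agent, newest first."""
        rows = self._db.execute(
            "SELECT ts, action, outcome FROM audit "
            "WHERE agent_id = ? ORDER BY id DESC LIMIT ?",
            (agent_id, limit),
        )
        return rows.fetchall()


recorder = MiniFlightRecorder()
recorder.record("agent-001", "read_file", "allowed")
recorder.record("agent-001", "database_query", "denied")
for ts, action, outcome in recorder.query("agent-001"):
    print(f"{ts:.0f} | {action} | {outcome}")
```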

Governed chatbot with memory

from agent_os import KernelSpace
from agent_os.emk import EpisodicMemory

kernel = KernelSpace(policy_file="policies.yaml")
memory = EpisodicMemory(max_turns=50)

@kernel.register
async def chat(message: str, conversation_id: str = "default") -> str:
    history = memory.get_history(conversation_id)
    # call_llm is a placeholder for your own LLM client call
    response = await call_llm(history + [{"role": "user", "content": message}])
    memory.add_turn(conversation_id, message, response)
    return response
# Outputs are checked against content policies; violations trigger SIGSTOP
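
For readers who want to see the memory calls in isolation, the following sketch mimics `get_history` and `add_turn` with a bounded per-conversation buffer. It assumes that `max_turns` caps the number of user/assistant exchanges returned as history; that reading, and the `MiniMemory` class itself, are assumptions and not the EMK implementation.

```python
from collections import deque
from typing import Deque, Dict, List


class MiniMemory:
    """Keeps at most `max_turns` exchanges per conversation (hypothetical)."""

    def __init__(self, max_turns: int = 50) -> None:
        self._max_messages = 2 * max_turns  # one user + one assistant message per turn
        self._turns: Dict[str, Deque[dict]] = {}

    def add_turn(self, conversation_id: str, user_message: str, response: str) -> None:
        turns = self._turns.setdefault(conversation_id, deque(maxlen=self._max_messages))
        # Messages are appended in pairs, so the oldest pair is evicted together.
        turns.append({"role": "user", "content": user_message})
        turns.append({"role": "assistant", "content": response})

    def get_history(self, conversation_id: str) -> List[dict]:
        return list(self._turns.get(conversation_id, ()))


memory = MiniMemory(max_turns=2)
for i in range(3):
    memory.add_turn("default", f"question {i}", f"answer {i}")
print(memory.get_history("default"))
```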

See examples/ for 20+ runnable demos including SQL agents, GitHub reviewers, and compliance bots.



🎯 What You'll Build in 5 Minutes

from agent_os import StatelessKernel, stateless_execute

# 1. Define safety policies (not prompts — actual enforcement)
kernel = StatelessKernel(policies=["read_only", "no_pii"])

# 2. Actions are checked against policies before execution
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT revenue FROM sales"},
    agent_id="analyst-001",
    policies=["read_only"]
)
# ✅ Safe queries execute
# ❌ "DROP TABLE users" → BLOCKED (not by prompt, by kernel)

Result: Defined policies are deterministically enforced by the kernel—not by hoping the LLM follows instructions.

For the full kernel with signals, VFS, and protection rings:

from agent_os import KernelSpace, AgentSignal, AgentVFS

# Requires: pip install agent-os-kernel[full]
kernel = KernelSpace()
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "Hello World")

Note: KernelSpace, AgentSignal, and AgentVFS require installing the control-plane module: pip install agent-os-kernel[full]


What is Agent OS?

Agent OS applies operating system concepts to AI agent governance. Instead of relying on prompts to enforce safety ("please don't do dangerous things"), it provides application-level middleware that intercepts and validates agent actions before execution.

Note: This is application-level enforcement (Python middleware), not OS kernel-level isolation. Agents run in the same process. For true isolation, run agents in containers.

┌─────────────────────────────────────────────────────────┐
│              USER SPACE (Agent Code)                    │
│   Your agent code runs here. The kernel intercepts      │
│   actions before they execute.                          │
├─────────────────────────────────────────────────────────┤
│              KERNEL SPACE                               │
│   Policy Engine │ Flight Recorder │ Signal Dispatch     │
│   Actions are checked against policies before execution │
└─────────────────────────────────────────────────────────┘

The Idea

Prompt-based safety asks the LLM to follow rules. The LLM decides whether to comply.

Kernel-based safety intercepts actions before execution. The policy engine decides, not the LLM.

This is the same principle operating systems use: applications request resources, the kernel grants or denies access based on permissions.
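
The interception principle can be sketched in plain Python. This is an illustration under stated assumptions, not the Agent OS implementation: `MiniKernel`, `PolicyViolation`, and the `read_only` check are hypothetical names invented for the example.

```python
import asyncio
import re
from typing import Any, Awaitable, Callable, Dict, List, Optional

# A policy returns a reason string to deny the action, or None to allow it.
Policy = Callable[[str, Dict[str, Any]], Optional[str]]


class PolicyViolation(Exception):
    """Raised when a policy denies an action (hypothetical name)."""


def read_only(action: str, params: Dict[str, Any]) -> Optional[str]:
    """Deny database queries that contain a write or DDL keyword."""
    if action != "database_query":
        return None
    query = str(params.get("query", ""))
    if re.search(r"\b(INSERT|UPDATE|DELETE|DROP|TRUNCATE|ALTER)\b", query, re.IGNORECASE):
        return "read_only: write statement rejected"
    return None


class MiniKernel:
    def __init__(self, policies: List[Policy]) -> None:
        self._policies = list(policies)
        self._handlers: Dict[str, Callable[..., Awaitable[Any]]] = {}

    def register(self, action: str, handler: Callable[..., Awaitable[Any]]) -> None:
        self._handlers[action] = handler

    async def execute(self, action: str, params: Dict[str, Any]) -> Any:
        # Policies run first; on a deny, the handler is never called.
        for policy in self._policies:
            reason = policy(action, params)
            if reason is not None:
                raise PolicyViolation(reason)
        return await self._handlers[action](**params)


async def run_query(query: str) -> str:
    return f"executed: {query}"


async def demo():
    kernel = MiniKernel([read_only])
    kernel.register("database_query", run_query)
    allowed = await kernel.execute("database_query", {"query": "SELECT * FROM users"})
    try:
        await kernel.execute("database_query", {"query": "DROP TABLE users"})
        blocked = False
    except PolicyViolation:
        blocked = True
    return allowed, blocked


print(asyncio.run(demo()))
```

The decision is made by ordinary code that runs before the handler, so it does not depend on what the model was told in its prompt.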


Architecture

graph TB
    subgraph "Layer 4: Intelligence"
        SCAK[Self-Correcting Agent Kernel]
        MUTE[Mute Agent]
    end
    
    subgraph "Layer 3: Control Plane"
        KERNEL[🎯 THE KERNEL<br/>Policy Engine + Signals]
        OBS[Observability<br/>Prometheus + OTEL]
    end
    
    subgraph "Layer 2: Infrastructure"
        AMB[Agent Message Bus]
        IATP[Inter-Agent Trust Protocol]
        ATR[Agent Tool Registry]
    end
    
    subgraph "Layer 1: Primitives"
        PRIM[Base Types + Failures]
        CMVK[Cross-Model Verification]
        CAAS[Context-as-a-Service]
        EMK[Episodic Memory Kernel]
    end
    
    SCAK --> KERNEL
    MUTE --> KERNEL
    KERNEL --> AMB
    KERNEL --> IATP
    KERNEL --> OBS
    AMB --> PRIM
    IATP --> PRIM
    ATR --> PRIM
    CMVK --> PRIM
    EMK --> PRIM
    CAAS --> PRIM

Directory Structure

agent-os/
├── src/agent_os/             # Core Python package
│   ├── __init__.py           # Public API (re-exports from all layers)
│   ├── stateless.py          # StatelessKernel (zero-dependency core)
│   ├── base_agent.py         # BaseAgent, ToolUsingAgent classes
│   ├── agents_compat.py      # AGENTS.md parser (OpenAI/Anthropic standard)
│   ├── cli.py                # CLI (agent-os check, review, init, etc.)
│   └── integrations/         # Framework adapters (LangChain, OpenAI, etc.)
├── modules/                  # Kernel Modules (4-layer architecture)
│   ├── primitives/           # Layer 1: Base types and failures
│   ├── cmvk/                 # Layer 1: Cross-model verification
│   ├── emk/                  # Layer 1: Episodic memory kernel
│   ├── caas/                 # Layer 1: Context-as-a-Service
│   ├── amb/                  # Layer 2: Agent message bus
│   ├── iatp/                 # Layer 2: Inter-agent trust protocol
│   ├── atr/                  # Layer 2: Agent tool registry
│   ├── observability/        # Layer 3: Prometheus + OpenTelemetry
│   ├── control-plane/        # Layer 3: THE KERNEL (policies, signals)
│   ├── scak/                 # Layer 4: Self-correcting agent kernel
│   ├── mute-agent/           # Layer 4: Face/Hands architecture
│   ├── nexus/                # Experimental: Trust exchange network
│   └── mcp-kernel-server/    # Integration: MCP protocol support
├── extensions/               # IDE & AI Assistant Extensions
│   ├── mcp-server/           # ⭐ MCP Server (Copilot, Claude, Cursor)
│   ├── vscode/               # VS Code extension
│   ├── copilot/              # GitHub Copilot extension
│   ├── jetbrains/            # IntelliJ/PyCharm plugin
│   ├── cursor/               # Cursor IDE extension
│   ├── chrome/               # Chrome extension
│   └── github-cli/           # gh CLI extension
├── examples/                 # Working examples
├── docs/                     # Documentation
├── tests/                    # Test suite (organized by layer)
├── notebooks/                # Jupyter tutorials
├── papers/                   # Research papers
└── templates/                # Policy templates

Core Modules

| Module | Layer | PyPI Package | Description | Status |
|---|---|---|---|---|
| primitives | 1 | agent-primitives | Base failure types, severity levels | ✅ Stable |
| cmvk | 1 | cmvk | Cross-model verification, drift detection | ✅ Stable |
| emk | 1 | emk | Episodic memory kernel (append-only ledger) | ✅ Stable |
| caas | 1 | caas-core | Context-as-a-Service, RAG pipeline | ✅ Stable |
| amb | 2 | amb-core | Agent message bus (async pub/sub) | ✅ Stable |
| iatp | 2 | inter-agent-trust-protocol | Sidecar trust protocol, typed IPC pipes | ✅ Stable |
| atr | 2 | agent-tool-registry | Tool registry with LLM schema generation | ✅ Stable |
| control-plane | 3 | agent-control-plane | THE KERNEL: policy engine, signals, VFS | ✅ Stable |
| observability | 3 | agent-os-observability | Prometheus metrics + OpenTelemetry tracing | ⚠️ No tests |
| scak | 4 | scak | Self-correcting agent kernel | ✅ Stable |
| mute-agent | 4 | mute-agent | Decoupled reasoning/execution architecture | ⚠️ No tests |
| nexus | | Not published | Trust exchange network | 🔬 Prototype |
| mcp-kernel-server | Int | mcp-kernel-server | MCP server for Claude Desktop | ⚠️ No tests |

IDE & CLI Extensions

| Extension | Description | Status |
|---|---|---|
| mcp-server | MCP Server: works with Claude, Copilot, Cursor (npx agentos-mcp-server) | ✅ Published (v1.0.1) |
| vscode | VS Code extension with real-time policy checks, enterprise features | ✅ Published (v1.0.1) |
| copilot | GitHub Copilot extension (Vercel/Docker deployment) | ✅ Published (v1.0.0) |
| jetbrains | IntelliJ, PyCharm, WebStorm plugin (Kotlin) | ✅ Built (v1.0.0) |
| cursor | Cursor IDE extension (Composer integration) | ✅ Built (v0.1.0) |
| chrome | Chrome extension for GitHub, Jira, AWS, GitLab | ✅ Built (v1.0.0) |
| github-cli | gh agent-os CLI extension | ⚠️ Basic |

Install

pip install agent-os-kernel

Or with optional components:

pip install agent-os-kernel[cmvk]           # + cross-model verification
pip install agent-os-kernel[iatp]           # + inter-agent trust
pip install agent-os-kernel[observability]  # + Prometheus/OpenTelemetry
pip install agent-os-kernel[nexus]          # + trust exchange network
pip install agent-os-kernel[full]           # Everything

One-Command Quickstart

macOS/Linux:

curl -sSL https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.sh | bash

Windows (PowerShell):

iwr -useb https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.ps1 | iex

Quick Example

Stateless API (Always Available — Zero Dependencies Beyond Pydantic)

from agent_os import StatelessKernel, stateless_execute

# Create kernel with policy
kernel = StatelessKernel(policies=["read_only", "no_pii"])

# Execute with policy enforcement
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT * FROM users"},
    agent_id="analyst-001",
    policies=["read_only"]
)

Full Kernel API (Requires pip install agent-os-kernel[full])

from agent_os import KernelSpace, PolicyEngine, PolicyRule

kernel = KernelSpace()

# Create agent context with VFS
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "analyze this data")

# Policy enforcement
engine = PolicyEngine()
engine.add_rule(PolicyRule(name="no_sql_injection", pattern="DROP|DELETE|TRUNCATE"))

POSIX-Inspired Primitives

Agent OS borrows concepts from POSIX operating systems:

| Concept | POSIX | Agent OS |
|---|---|---|
| Process control | SIGKILL, SIGSTOP | AgentSignal.SIGKILL, AgentSignal.SIGSTOP |
| Filesystem | /proc, /tmp | VFS with /mem/working, /mem/episodic |
| IPC | Pipes (\|) | Typed IPC pipes between agents |
| Syscalls | open(), read() | kernel.execute() |

Signals

# Requires: pip install agent-os-kernel[full]
from agent_os import SignalDispatcher, AgentSignal

agent_id = "agent-001"  # example ID, matching the other snippets
dispatcher = SignalDispatcher()
dispatcher.signal(agent_id, AgentSignal.SIGSTOP)  # Pause
dispatcher.signal(agent_id, AgentSignal.SIGCONT)  # Resume
dispatcher.signal(agent_id, AgentSignal.SIGKILL)  # Terminate
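
The pause, resume, and terminate semantics can be modelled as a small state machine. The sketch below is an assumption about how such signals could map to agent states; `MiniDispatcher`, `Signal`, and `State` are hypothetical and do not reproduce the control-plane module.

```python
from enum import Enum
from typing import Dict


class Signal(Enum):
    SIGSTOP = "SIGSTOP"
    SIGCONT = "SIGCONT"
    SIGKILL = "SIGKILL"


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    TERMINATED = "terminated"


class MiniDispatcher:
    """Tracks one state per agent and applies signals to it."""

    def __init__(self) -> None:
        self._states: Dict[str, State] = {}

    def spawn(self, agent_id: str) -> None:
        self._states[agent_id] = State.RUNNING

    def signal(self, agent_id: str, sig: Signal) -> State:
        current = self._states[agent_id]
        if current is State.TERMINATED:
            return current  # a terminated agent ignores further signals
        if sig is Signal.SIGKILL:
            new = State.TERMINATED
        elif sig is Signal.SIGSTOP:
            new = State.STOPPED
        elif sig is Signal.SIGCONT and current is State.STOPPED:
            new = State.RUNNING
        else:
            new = current  # e.g. SIGCONT on an agent that is already running
        self._states[agent_id] = new
        return new


dispatcher = MiniDispatcher()
dispatcher.spawn("agent-001")
print(dispatcher.signal("agent-001", Signal.SIGSTOP))
print(dispatcher.signal("agent-001", Signal.SIGCONT))
print(dispatcher.signal("agent-001", Signal.SIGKILL))
```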

VFS (Virtual File System)

# Requires: pip install agent-os-kernel[full]
from agent_os import AgentVFS

vfs = AgentVFS(agent_id="agent-001")
vfs.write("/mem/working/task.txt", "Current task")
vfs.read("/policy/rules.yaml")  # Read-only from user space
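
The read-only behaviour of /policy can be illustrated with an in-memory store that rejects user-space writes under protected prefixes. This is a sketch, not `AgentVFS`: the `MiniVFS` class, the `kernel_write` method, and the use of `PermissionError` are assumptions.

```python
from typing import Dict, Tuple


class MiniVFS:
    """In-memory path-to-content store with read-only prefixes (hypothetical)."""

    def __init__(self, read_only_prefixes: Tuple[str, ...] = ("/policy/",)) -> None:
        self._files: Dict[str, str] = {}
        self._read_only = read_only_prefixes

    def kernel_write(self, path: str, content: str) -> None:
        """Privileged write, standing in for the kernel populating /policy."""
        self._files[path] = content

    def write(self, path: str, content: str) -> None:
        """User-space write; str.startswith accepts a tuple of prefixes."""
        if path.startswith(self._read_only):
            raise PermissionError(f"{path} is read-only from user space")
        self._files[path] = content

    def read(self, path: str) -> str:
        try:
            return self._files[path]
        except KeyError:
            raise FileNotFoundError(path) from None


vfs = MiniVFS()
vfs.kernel_write("/policy/rules.yaml", "version: '1.0'")
vfs.write("/mem/working/task.txt", "Current task")
print(vfs.read("/mem/working/task.txt"))
print(vfs.read("/policy/rules.yaml"))
```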

Framework Integrations

Wrap existing frameworks with Agent OS governance:

# LangChain
from agent_os.integrations import LangChainKernel
governed = LangChainKernel().wrap(my_chain)

# OpenAI Assistants
from agent_os.integrations import OpenAIKernel
governed = OpenAIKernel().wrap_assistant(assistant, client)

# Semantic Kernel
from agent_os.integrations import SemanticKernelWrapper
governed = SemanticKernelWrapper().wrap(sk_kernel)

# CrewAI
from agent_os.integrations import CrewAIKernel
governed = CrewAIKernel().wrap(my_crew)

# AutoGen
from agent_os.integrations import AutoGenKernel
governed = AutoGenKernel().wrap(autogen_agent)

# OpenAI Agents SDK
from agent_os.integrations import OpenAIAgentsSDKKernel
governed = OpenAIAgentsSDKKernel().wrap(agent)

Note: These adapters use lazy interception — they don't require the target framework to be installed until you call .wrap().
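
One common way to get this behaviour is to defer the framework import until `wrap()` is called. The sketch below shows that pattern with `importlib`; the `LazyAdapter` class and its return value are hypothetical, and the real adapters may be structured differently.

```python
import importlib
from typing import Any, Dict, Optional
from types import ModuleType


class LazyAdapter:
    """Stores the framework's module name and imports it on first wrap()."""

    def __init__(self, module_name: str) -> None:
        # Only the name is stored; nothing is imported yet.
        self._module_name = module_name
        self._module: Optional[ModuleType] = None

    def wrap(self, target: Any) -> Dict[str, Any]:
        if self._module is None:
            # The import happens here, so a missing framework raises
            # ImportError at wrap() time rather than at construction time.
            self._module = importlib.import_module(self._module_name)
        return {"framework": self._module.__name__, "target": target}


# Constructing an adapter for a framework that is not installed succeeds.
adapter = LazyAdapter("framework_that_is_not_installed")

# The standard-library json module stands in for an installed framework.
print(LazyAdapter("json").wrap("my_chain"))
```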

See integrations documentation for full details.


How It Differs from Other Tools

Agent Frameworks (LangChain, CrewAI): Build agents. Agent OS governs them. Use together.

Safety Tools (NeMo Guardrails, LlamaGuard): Input/output filtering. Agent OS intercepts actions mid-execution.

| Tool | Focus | When it acts |
|---|---|---|
| LangChain/CrewAI | Building agents | N/A (framework) |
| NeMo Guardrails | Input/output filtering | Before/after LLM call |
| LlamaGuard | Content classification | Before/after LLM call |
| Agent OS | Action interception | During execution |

Examples

The examples/ directory contains demos at various levels:

Getting Started

| Demo | Description | Command |
|---|---|---|
| demo-app | Uses the stateless API (most reliable) | cd examples/demo-app && python demo.py |
| hello-world | Minimal example | cd examples/hello-world && python agent.py |
| quickstart | Quick intro | cd examples/quickstart && python my_first_agent.py |

Domain Examples (Self-Contained)

These examples are self-contained and don't require external Agent OS imports:

| Demo | Description |
|---|---|
| healthcare-hipaa | HIPAA-compliant agent |
| customer-service | Customer support agent |
| legal-review | Legal document analysis |
| crewai-safe-mode | CrewAI with safety wrappers |

Production Demos (with Docker + Observability)

| Demo | Description | Command |
|---|---|---|
| carbon-auditor | Multi-model verification | cd examples/carbon-auditor && docker-compose up |
| grid-balancing | Multi-agent coordination | cd examples/grid-balancing && docker-compose up |
| defi-sentinel | Real-time attack detection | cd examples/defi-sentinel && docker-compose up |
| pharma-compliance | Document analysis | cd examples/pharma-compliance && docker-compose up |

Each production demo includes:

  • Grafana dashboard on port 300X
  • Prometheus metrics on port 909X
  • Jaeger tracing on port 1668X

# Run carbon auditor with full observability
cd examples/carbon-auditor
cp .env.example .env  # Optional: add API keys
docker-compose up

# Open dashboards
open http://localhost:3000  # Grafana (admin/admin)
open http://localhost:16686 # Jaeger traces

Safe Tool Plugins

Agent OS includes pre-built safe tools via the Agent Tool Registry:

# Requires: pip install agent-os-kernel[full]
from atr import ToolRegistry, tool

@tool(name="safe_http", description="Rate-limited HTTP requests")
async def safe_http(url: str) -> dict:
    # Tool is automatically registered and sandboxed
    ...

registry = ToolRegistry()
registry.register(safe_http)

# Generate schemas for any LLM
openai_tools = registry.to_openai_schema()
anthropic_tools = registry.to_anthropic_schema()
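
To show what schema generation involves, the sketch below derives an OpenAI-style function tool definition from a Python signature using `inspect`. It approximates the idea only: `to_openai_tool` and the type mapping are hypothetical, and the output of the ATR `to_openai_schema()` call may differ.

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_openai_tool(func: Callable[..., Any], description: str) -> Dict[str, Any]:
    """Build an OpenAI-style function tool definition from a signature."""
    properties: Dict[str, Dict[str, str]] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unmapped parameters fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


async def safe_http(url: str, timeout: float = 10.0) -> dict:
    """Placeholder tool body for the example."""
    return {"url": url, "timeout": timeout}


print(to_openai_tool(safe_http, "Rate-limited HTTP requests"))
```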

Message Bus

Connect agents using the async message bus:

# Requires: pip install agent-os-kernel[full]
from amb_core import MessageBus, Message

bus = MessageBus()
await bus.subscribe("tasks", handler)
await bus.publish("tasks", Message(payload={"task": "analyze"}))

Broker adapters are available for Redis, Kafka, and NATS (these require optional dependencies).
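
The subscribe and publish calls follow the usual async pub/sub shape. As a rough in-process illustration, with no broker involved, the sketch below fans a payload out to every handler on a topic. `MiniBus` is a hypothetical stand-in and does not reflect the amb-core internals.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, DefaultDict, List

Handler = Callable[[Any], Awaitable[None]]


class MiniBus:
    """In-process topic-based pub/sub (hypothetical stand-in)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    async def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    async def publish(self, topic: str, payload: Any) -> int:
        """Deliver payload to every subscriber of topic; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        await asyncio.gather(*(handler(payload) for handler in handlers))
        return len(handlers)


async def demo() -> list:
    received = []

    async def handler(payload: Any) -> None:
        received.append(payload)

    bus = MiniBus()
    await bus.subscribe("tasks", handler)
    await bus.publish("tasks", {"task": "analyze"})
    await bus.publish("other", {"task": "ignored"})  # no subscribers on this topic
    return received


print(asyncio.run(demo()))
```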


CLI Tool

Agent OS includes a CLI for terminal workflows:

# Check files for safety violations
agent-os check src/app.py

# Check staged git files (pre-commit)
agent-os check --staged

# Multi-model code review (simulated in current version)
agent-os review src/app.py

# Install git pre-commit hook
agent-os install-hooks

# Initialize Agent OS in project
agent-os init

# Validate AGENTS.md configuration
agent-os validate

MCP Integration (Claude Desktop, GitHub Copilot, Cursor)

Agent OS provides an MCP server that works with any MCP-compatible AI assistant:

# Quick install via npx
npx agentos-mcp-server

npm: agentos-mcp-server
MCP Registry: io.github.imran-siddique/agentos

Add to your config file:

Claude Desktop (%APPDATA%\Claude\claude_desktop_config.json on Windows):

{
  "mcpServers": {
    "agentos": {
      "command": "npx",
      "args": ["-y", "agentos-mcp-server"]
    }
  }
}

Features: 10 tools for agent creation, policy enforcement, compliance checking (SOC 2, GDPR, HIPAA), human-in-the-loop approvals, and audit logging.

See MCP server documentation for full details.


Documentation

Tutorials

Interactive Notebooks

| Notebook | Description | Time |
|---|---|---|
| Hello Agent OS | Your first governed agent | 5 min |
| Episodic Memory | Agent memory that persists | 15 min |
| Time-Travel Debugging | Replay and debug decisions | 20 min |
| Cross-Model Verification | Detect hallucinations | 15 min |
| Multi-Agent Coordination | Trust between agents | 20 min |
| Policy Engine | Deep dive into policies | 15 min |

Reference


Status & Maturity

This is a research project exploring kernel concepts for AI agent governance.

✅ Production-Ready

These components are fully implemented and tested:

| Component | Tests |
|---|---|
| StatelessKernel: zero-dependency policy enforcement (src/agent_os/) | ✅ Full coverage |
| Policy Engine: deterministic rule enforcement | ✅ Tested |
| Flight Recorder: SQLite-based audit logging | ✅ Tested |
| CLI: agent-os check, init, secure, validate | ✅ Tested |
| Framework Adapters: LangChain, OpenAI, Semantic Kernel, CrewAI, AutoGen, OpenAI Agents SDK | ✅ Implemented |
| AGENTS.md Parser: OpenAI/Anthropic standard agent config | ✅ Full coverage |
| Primitives (agent-primitives): failure types, severity levels | ✅ Tested |
| CMVK (cmvk): drift detection, distance metrics (955+ lines) | ✅ Tested |
| EMK (emk): episodic memory with JSONL storage | ✅ 8 test files |
| AMB (amb-core): async message bus, DLQ, tracing | ✅ 6 test files |
| IATP (inter-agent-trust-protocol): sidecar trust, typed IPC | ✅ 9 test files |
| ATR (agent-tool-registry): multi-LLM schema generation | ✅ 6 test files |
| Control Plane (agent-control-plane): signals, VFS, protection rings | ✅ 18 test files |
| SCAK (scak): self-correcting agent kernel | ✅ 23 test files |

⚠️ Experimental (Code Exists, Tests Missing or Incomplete)

| Component | What's Missing |
|---|---|
| Mute Agent (mute-agent) | No tests; all layer dependencies use mock adapters |
| Observability (agent-os-observability) | No tests; Prometheus metrics, Grafana dashboards, OTel tracing implemented |
| MCP Kernel Server (mcp-kernel-server) | No tests; 1173-line implementation |
| GitHub CLI Extension | Single bash script with simulated output |
| Control Plane MCP Adapter | Placeholder: returns canned responses |
| Control Plane A2A Adapter | Placeholder: negotiation accepts all params |

🔬 Research Prototype

| Component | What's Missing |
|---|---|
| Nexus Trust Exchange | No pyproject.toml, no tests, placeholder cryptography (XOR, not secure), all signature verification stubbed, in-memory storage only |

Known Architectural Limitations

| Limitation | Impact | Mitigation |
|---|---|---|
| Application-level only | Direct stdlib calls (subprocess, open) bypass kernel | Pair with container isolation for production |
| Blocklist-based policies | Novel attack patterns not in rules will pass | Add AST-level parsing (#32), use defense in depth |
| Shadow Mode single-step | Multi-step agent simulations diverge from reality | Use for single-turn validation only |
| No tamper-proof audit | Flight Recorder SQLite can be modified by compromised agent | Write to external sink for critical audits |
| Provider-coupled adapters | Each SDK needs separate adapter | Abstract interface planned (#47) |
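
The blocklist limitation is easy to reproduce with the pattern from the custom-rules example earlier in this README. The sketch below compares that blocklist with a deliberately strict allowlist that accepts only a single SELECT statement. Both helper functions and the allowlist pattern are hypothetical, and a regex allowlist is itself only an approximation compared with real SQL parsing.

```python
import re

# Pattern copied from the block-destructive-sql rule shown earlier.
BLOCKLIST = re.compile(r"DROP|TRUNCATE|DELETE FROM .* WHERE 1=1", re.IGNORECASE)

# Hypothetical allowlist: one SELECT statement, optionally ending in a semicolon.
ALLOWLIST = re.compile(r"\s*SELECT\b[^;]*;?\s*", re.IGNORECASE)


def blocklist_allows(query: str) -> bool:
    """Allow anything that does not contain a listed pattern."""
    return BLOCKLIST.search(query) is None


def allowlist_allows(query: str) -> bool:
    """Allow only queries that match the accepted shape in full."""
    return ALLOWLIST.fullmatch(query) is not None


for query in (
    "SELECT * FROM users",
    "DELETE FROM users",            # deletes every row, but is not in the blocklist
    "UPDATE users SET admin = 1",   # a write the blocklist never mentions
    "SELECT 1; DROP TABLE users",   # stacked statement
):
    print(f"{query!r}: blocklist={blocklist_allows(query)} allowlist={allowlist_allows(query)}")
```

The second and third queries pass the blocklist because nothing in the rule names them, which is the gap the table describes.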

See GitHub Issues for the full roadmap.


Troubleshooting

Common Issues

ModuleNotFoundError: No module named 'agent_os'

# Install from source
git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e .

Optional modules not available

# Check what's installed
python -c "from agent_os import check_installation; check_installation()"

# Install everything
pip install -e ".[full]"

Permission errors on Windows

# Run PowerShell as Administrator, or use --user flag
pip install --user -e .

Docker not working

# Build with Dockerfile (no Docker Compose needed for simple tests)
docker build -t agent-os .
docker run -it agent-os python examples/demo-app/demo.py

Tests failing with API errors

# Most tests work without API keys — mock mode is default
pytest tests/ -v

# For real LLM tests, set environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

Contributing

git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e ".[dev]"
pytest

License

MIT — See LICENSE


Exploring kernel concepts for AI agent safety.

GitHub · Docs
