NEXUS OS — Governed Agent Operating System

Local-first, auditable, multi-agent governance for AI systems. Status: Phase 0 security hardening complete, active development. Azure/Foundry is dead (subscription blocked 2026-05-15).

NEXUS OS turns local models, research evidence, and external teams into a governed, audited, low-VRAM execution system where every action is proposal-bound, test-gated, and provenance-tracked.

Architecture

                         ┌──────────────────────┐
                         │      BRIDGE           │
                         │  JSON-RPC, MCP, SDK   │
                         └──────────┬───────────┘
                                    │
         ┌──────────────────────────┼──────────────────────────┐
         ▼                          ▼                          ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│    GOVERNOR      │    │  ENGINE / GMR    │    │      VAULT       │
│ KAIJU gates      │    │ Hermes router    │    │ 5-tracks memory  │
│ TrustEngine v2.2 │    │ model rotation   │    │ encryption,trust │
│ VAP proof chain  │    │ circuit breaker  │    │                  │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │                      │                        │
         └──────────────────────┼────────────────────────┘
                                ▼
         ┌─────────────────────────────────────────────┐
         │                  SWARM                       │
         │   Foreman, workers, auction, OpenClaw       │
         └─────────────────────────────────────────────┘
                                ▼
         ┌─────────────────────────────────────────────┐
         │              MONITORING                      │
         │   TokenGuard, counters, telemetry           │
         └─────────────────────────────────────────────┘
                                ▼
         ┌─────────────────────────────────────────────┐
         │         TWAVE v2.0 (NEW 2026-05-15)         │
         │  ChimeraRouterV2 + Landau-Ginzburg tracker  │
         │  EDT / LEAD / EPR / LED / CK-PLUG           │
         └─────────────────────────────────────────────┘

Port Map

Port	Service	Protocol
3000	Next.js Command Center Dashboard	HTTP
7352	Nexus Governance API (canonical)	FastAPI
7353	TWAVE Wrapper	HTTP
3003	WebSocket Swarm Events	Socket.io
11434	Local Ollama (internal)	HTTP
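A quick way to see which of the services in the table above are up is to probe each port. The sketch below is a minimal example using only the standard library; the host `127.0.0.1` and the helper names `is_listening` and `port_report` are assumptions for illustration, not part of the NEXUS OS codebase.

```python
import socket

# Ports copied from the Port Map table; the host is an assumption.
PORT_MAP = {
    3000: "Next.js Command Center Dashboard",
    7352: "Nexus Governance API (canonical)",
    7353: "TWAVE Wrapper",
    3003: "WebSocket Swarm Events",
    11434: "Local Ollama (internal)",
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def port_report(host: str = "127.0.0.1") -> dict[int, bool]:
    """Map every documented port to whether something accepts connections on it."""
    return {port: is_listening(port, host) for port in PORT_MAP}


if __name__ == "__main__":
    for port, up in port_report().items():
        print(f"{port:>5}  {'UP' if up else 'DOWN':<4}  {PORT_MAP[port]}")
```

A successful TCP connect only shows that something is bound to the port. It does not confirm that the expected service is the one answering.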

Repository Structure

nexus_os/                 # Python governance backend (CANONICAL — ~50 modules)
  bridge/                 #   JSON-RPC server, SDK, secrets, MCP auth
  governor/               #   KAIJU gates, TrustEngine v2.2, compliance, VAP proof chain
  vault/                  #   5-track memory (EVENT/TRUST/CAP/FAIL/GOV), encryption, trust
  engine/                 #   Hermes router, executor, skillsmith, tool discipline
  gmr/                    #   Model rotation, circuit breaker, telemetry, savings
  swarm/                  #   Foreman, workers, auction, OpenClaw spawner
  monitoring/             #   TokenGuard, counters, strategies
  observability/          #   Tracing, Squeez log compression
  twave/                  #   TWAVE v2.0: ChimeraRouterV2, Landau-Ginzburg tracker (NEW)
  stresslab/              #   ISC benchmark runner, templates
  db/                     #   Thread-safe database manager (SQLite/PostgreSQL)
  relay/                  #   Transparent model relay proxy
  cron/                   #   Scheduled agent cycles
  team/                   #   Agent coordinator

twave/                    # (symlinked / legacy path from HF dataset — kept for reference)

src/                      # Next.js Frontend Dashboard
  app/api/                #   19 API route files
  components/nexus/tabs/  #   11 tab panels (overview, stresslab, gmr, governor, vault,
                          #   research, swarm, tokens, providers, ratelimit, kpi)
  components/nexus/       #   26 custom components
  store/                  #   Zustand global state
  hooks/                  #   Custom React hooks (use-api-data, use-swarm-ws, etc.)
  lib/                    #   Prisma, rate limiter, cache, provider bridge

prisma/                   #   Database schema — 12 models

tests/                    #   Python test suite — 642 tests (632 passing)
  governor/               #   Trust scoring, compliance, kaiju auth, proof chain
  vault/                  #   Memory, trust, cache, manager, adapter, tracks
  bridge/                 #   Server, SDK, MCP auth, token integration
  engine/                 #   Router, skillsmith, tool discipline, Hermes GMR
  gmr/                    #   Core GMR (selection, budgets, circuit breakers)
  swarm/                  #   Auction, spawner budget gate
  monitoring/             #   Token guard, strategies, GMR
  team/                   #   Coordinator
  integration/            #   Compliance, bridge, heartbeat, Hermes, Squeez
  security/               #   Encryption hard-fail, poisoning v2, sanitizer
  contracts/              #   Protocol contracts
  cron/                   #   Agent cycle
  unit/                   #   Executor v2, secrets

benchmarks/               #   Dataset generators (stres5, stres6, tool taxonomy)
foundry_datasets/         #   Generated eval/stress datasets (~1.5 GB, gitignored)
  stress_lab/             #   v1→v6 stress datasets (ISC, frontier, TAMAS, tool taxonomy)
  eggroll/                #   Model SFT training data from frontier models
  opusman_eval/           #   OPUSman evaluation datasets
  state/                  #   Dataset quality tracking, gap analysis

scripts/                  #   Utility and repair scripts (~43 files — many legacy stubs)
bin/                      #   CLI tools (nexusctl, nexus-pi)
docs/
  handbook/               #   Operational guides (AFK workflow, Kiloclaw fastboot)
  reviews/                #   Asset inventory, audits

.brv/                     #   Backup/archive (gitignored)
  archive_static/         #   Archived: session logs, QA screenshots, old zips,
  azure_archive/          #   stale .pyc dumps, quarantined Azure creds
  git_history_backup/     #   Git bundle backup

research/                 #   Research reports (session logs, R&D topics)
  session_logs/           #   STRES5, STRES6, AFK session reports
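The `vault/` entry above names five memory tracks (EVENT/TRUST/CAP/FAIL/GOV), and the Vault exposes `store_track` / `retrieve_track`. The following is a toy in-memory sketch of that shape, not the real implementation: the class name `InMemoryVault`, the method signatures, and the record format are assumptions, and encryption and trust checks are left out entirely.

```python
from collections import defaultdict
from enum import Enum
from typing import Any, Optional


class Track(Enum):
    """The five track names listed for vault/ in the repository tree."""
    EVENT = "EVENT"
    TRUST = "TRUST"
    CAP = "CAP"
    FAIL = "FAIL"
    GOV = "GOV"


class InMemoryVault:
    """Hypothetical stand-in for the Vault: one append-only list per track."""

    def __init__(self) -> None:
        self._tracks: dict[Track, list[dict[str, Any]]] = defaultdict(list)

    def store_track(self, track: Track, record: dict[str, Any]) -> int:
        """Append a copy of the record and return its index within the track."""
        self._tracks[track].append(dict(record))
        return len(self._tracks[track]) - 1

    def retrieve_track(self, track: Track, limit: Optional[int] = None) -> list[dict[str, Any]]:
        """Return the newest `limit` records of a track, oldest first."""
        records = self._tracks[track]
        if limit is None:
            return list(records)
        if limit <= 0:
            return []
        return list(records[-limit:])


vault = InMemoryVault()
vault.store_track(Track.GOV, {"action": "proposal_opened", "id": "p-1"})
vault.store_track(Track.GOV, {"action": "tests_passed", "id": "p-1"})
latest = vault.retrieve_track(Track.GOV, limit=1)
```

Keeping each track append-only mirrors the audit goal stated at the top of this README: records are added, never rewritten in place.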

Current Test State

Suite	Count	Status
Python pytest	642 tests	632 passing (10 heartbeat infra-dependent)
TWAVE v2.0	25 tests	All passing
Security tests	23 tests	All passing
Dashboard lint	0 errors	Clean

Known issues:

test_heartbeat.py (10 tests) — depends on external infrastructure, times out if ports not available
scripts/ (43 files) — contains many legacy stubs and one-off repair scripts

What's Working

TrustEngine v2.2 — HARDWALL defense: logistic scaling, adaptive decay, non-compensatory CRITICAL, 6-stage CDR (Nominal→Caution→Restricted→High Risk→Critical→Collapsed)
Vault — Canonical 5-track memory schema (store_track / retrieve_track), encryption hard-fail by default
GMR — Model rotation with circuit breakers, domain mapping, telemetry, savings tracking
Bridge — JSON-RPC governance server, SDK with circuit breaker, retry policy, token integration
Dashboard — Next.js 16 frontend, 11 tabs, all wired to real API data, zero lint errors
Stress Lab — v1–v6 datasets (ISC, frontier, TAMAS, tool taxonomy), 240 tools, 1.6 GB total
TWAVE v2.0 (NEW) — ChimeraRouterV2 (tiered routing + ERNIE), Landau-Ginzburg hallucination tracker (EDT/LEAD/EPR/LED/CK-PLUG), 25 tests
Phase 0 Security — TerminalSanitizer (ANSI injection defense), VerifiableOutput (SHA-256 integrity), AgentPTY isolation
12 API providers — NVIDIA, SambaNova, SiliconFlow, OpenCode, OpenRouter, Groq, etc.
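To make the Phase 0 Security item more concrete, here is a minimal sketch of the two ideas it names: stripping ANSI escape sequences from untrusted terminal text, and attaching a SHA-256 digest to an output so later tampering can be detected. The function names `sanitize_terminal_text`, `seal`, and `verify` are hypothetical and do not reflect the actual TerminalSanitizer or VerifiableOutput APIs.

```python
import hashlib
import hmac
import re

# CSI sequences (ESC [ ... final byte) and OSC sequences (ESC ] ... BEL or ESC \).
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"               # CSI: colors, cursor movement, erase
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"    # OSC: window title and similar
)


def sanitize_terminal_text(text: str) -> str:
    """Remove ANSI escapes, then any remaining C0 control characters and DEL,
    keeping newline and tab."""
    without_ansi = _ANSI_RE.sub("", text)
    return "".join(
        ch for ch in without_ansi
        if ch in "\n\t" or (ch >= " " and ch != "\x7f")
    )


def seal(output: str) -> dict[str, str]:
    """Pair an output with the SHA-256 hex digest of its UTF-8 bytes."""
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
    return {"output": output, "sha256": digest}


def verify(sealed: dict[str, str]) -> bool:
    """Recompute the digest and compare it with the stored one."""
    expected = hashlib.sha256(sealed["output"].encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, sealed["sha256"])


record = seal(sanitize_terminal_text("\x1b[31mdeploy ok\x1b[0m"))
assert verify(record)
```

A bare digest only detects changes if it is stored somewhere the writer of the output cannot also modify, for example a separate append-only log.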

What's Incomplete / Needs Work

Stubs & Placeholders (critical)

Component	File	Issue
AsyncBridgeExecutor	`nexus_os/engine/executor.py:115`	Production executor always returns `success=False` — not wired to real Bridge RPC
CVAVerifier	`nexus_os/governor/base.py:329`	Core Value Alignment check always passes — stub returns `(True, "passed stub")`
ModelRelay	`nexus_os/relay/model_relay.py`	Partially wired to ChimeraRouterV2 + Ollama; still needs production health policy and end-to-end server validation
Worker execute_task	`nexus_os/swarm/worker.py:180`	Produces fake simulated outputs, no real task execution
TaskClassifier	`nexus_os/engine/hermes.py:401`	"Minimal stub for test collection" — keyword-based heuristic fallback
ISC-Runner templates	`nexus_os/stresslab/isc_runner.py:79`	Only downloads 1 template per domain — placeholder

Missing API Endpoints (per FUSION_RECOMMENDATIONS.md)

GET /health — HIGH priority
POST /tasks/heartbeat — HIGH priority
POST /tasks/result — HIGH priority
GET /tasks/status/{id} — MEDIUM priority
POST /skills/propose — LOW priority
GET /skills/status/{id} — LOW priority

Dashboard Gaps

3 of 11 tabs never screenshotted during QA (Providers, RateLimit, KPI)
Prisma HealthSnapshot and TokenSnapshot models are referenced but do not exist in the Prisma client
Dashboard uses a mock/proxy layer instead of the real Python governance API on port 7352

Code Quality

~15 bare except: pass blocks across the codebase
43 files in scripts/, many of them one-off repair scripts (evidence of ongoing breakage-repair cycles)
13 empty __init__.py files (fine but untidy)
4 empty test methods (body is only pass)
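The bare `except: pass` count above can be re-checked mechanically. The sketch below uses the standard `ast` module to list handlers that catch everything and do nothing; the function names and the choice to report unparsable files instead of hiding them are illustrative assumptions, not an existing project tool.

```python
import ast
from pathlib import Path


def find_bare_except_pass(source: str, filename: str = "<string>") -> list[int]:
    """Return sorted line numbers of `except:` handlers whose whole body is `pass`."""
    tree = ast.parse(source, filename=filename)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler)
        and node.type is None                 # bare `except:` has no exception type
        and len(node.body) == 1
        and isinstance(node.body[0], ast.Pass)
    )


def scan_tree(root: Path) -> list[tuple[str, int]]:
    """Collect (path, line) pairs for every bare `except: pass` under root."""
    hits: list[tuple[str, int]] = []
    for path in sorted(root.rglob("*.py")):
        try:
            lines = find_bare_except_pass(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, UnicodeDecodeError) as exc:
            print(f"skipped {path}: {exc}")   # report the problem, do not swallow it
            continue
        hits.extend((str(path), line) for line in lines)
    return hits
```

Calling `scan_tree(Path("nexus_os"))` from the repository root would return one `(path, line)` pair per offending handler. Handlers that name an exception type, or that do anything besides `pass`, are not reported.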

Dataset Pipeline (v1 → v6)

Version	Rows	Description
v1 ISC-Bench	1,164	84 ISC templates, 9 domains
v2 ISC Expander	10,000	Combinatorial expansion, 10 phases
v3 Multi-source	484	AgentHazard + SOSBench + AgentGovBench + ClawsBench
v4 Regenerator	536,530	13 governance × 13 templates × 12 domains
v5 Frontier	181,000	7 frontier types, 11 research sources
v6 TAMAS	6,840	7 attack types, 3 topologies, 5 domains
v6.1 Tool Taxonomy	7,200	240 tools, 12 categories (NEW, closes TAMAS gap)

All generators live in benchmarks/: regenerate_datasets.py, regenerate_frontier_v5.py, stres5_final.py, stres6_tamas_generator.py, stres6_tool_taxonomy.py

Getting Started

# Python backend
pip install -e .
pytest tests/ -q --ignore=tests/integration/test_heartbeat.py

# Dashboard
bun install
bun run dev

# TWAVE v2.0 demo
python -m nexus_os.twave.demo_e2e_v2 --prompt "Explain quantum entanglement" --policy auto

# Stress lab dataset generation
python benchmarks/stres6_tool_taxonomy.py --gen

# CLI
nexusctl doctor

Key Canons

Python/FastAPI is canonical for governance — dashboard proxies, does not decide.
No git add . — stage explicit reviewed paths only.
No auto-commit — every change is proposal-bound and test-gated.
Azure/Foundry: DEAD (2026-05-15) — all cloud model pipelines deprecated.
Datasets are gitignored — foundry_datasets/ is 1.5+ GB, never commit.
55 stale branches archived as archive/* tags (2026-05-15 cleanup).

Documentation Index

File	Purpose
`01_PROJECT_STATE.md`	Canonical project state (updated 2026-05-15)
`AGENTS.md`	Agent operating protocol v2.0 (safety-gated)
`knowledge.md`	Compact knowledge base
`NEXUS_OS_V4_MASTER_PLAN.md`	12-week architecture roadmap (1,182 lines)
`NEXUS_ZO_CLAW_INTEGRATION_PLAN.md`	Zo/Local/Claw A2A integration
`HEARTBEAT.md`	Cloud orchestrator heartbeat protocol
`SOUL.md`	Cloud orchestrator identity
`docs/handbook/05_NEXUS_AFK_WORKFLOW_STYLE.md`	Autonomous AFK operation rules
`docs/handbook/08_C_KILOCLAW_FASTBOOT.md`	Kiloclaw experimental lab setup
`docs/reviews/NEXUS_ASSET_INVENTORY_...md`	Complete asset inventory
`worklog.md`	Development task log

Repo Hygiene Rules

NEVER commit .env, foundry_datasets/, node_modules/, venv/, session-*.md, *.zip, *.tmp
NEVER git add . — always explicit paths
Verify tests before staging: pytest tests/ -q --ignore=tests/integration/test_heartbeat.py
Dataset work stays in foundry_datasets/ (gitignored) — use benchmarks/ generators
Research goes in research/session_logs/ (gitignored)
Backups go in .brv/ (gitignored)
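The NEVER-commit rule above lends itself to an automated check before committing. This is a minimal sketch that tests staged paths against the patterns in that rule; the helper names are hypothetical, and wiring it into a pre-commit hook is an assumption, not something the repository is known to do. It reads the staged list with `git diff --cached --name-only`.

```python
import subprocess
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Taken from the NEVER-commit rule above.
FORBIDDEN_DIRS = {"foundry_datasets", "node_modules", "venv"}
FORBIDDEN_NAME_PATTERNS = (".env", "session-*.md", "*.zip", "*.tmp")


def is_forbidden(path: str) -> bool:
    """True if the path sits under a forbidden directory or its file name
    matches a forbidden pattern."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    if any(part in FORBIDDEN_DIRS for part in parts[:-1]):
        return True
    return any(fnmatchcase(parts[-1], pattern) for pattern in FORBIDDEN_NAME_PATTERNS)


def violations(staged: list[str]) -> list[str]:
    """Return the staged paths that break the hygiene rules, in input order."""
    return [path for path in staged if is_forbidden(path)]


def staged_paths() -> list[str]:
    """List the paths currently staged in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

A hook could call `violations(staged_paths())` and exit non-zero when the list is not empty. The check is a backstop for the explicit-paths rule, not a replacement for reviewing what is staged.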

License

Internal — R&D Backend Team.
