Proves the full internal boundary in one self-contained command — no Plane, no GitHub, no Kodo CLI, no Claude CLI, no network access required.
```
PlanningContext
  -> TaskProposal                        (proposal_builder)
  -> LaneDecision                        (stub routing, labeled as offline)
  -> ProposalDecisionBundle
  -> ExecutionCoordinator                (mandatory policy gate)
  -> DemoStubBackendAdapter
  -> ExecutionResult
  -> ExecutionRecord + ExecutionTrace    (observability recorder)
  -> retained evidence files
```
This is the complete internal path that OperationsCenter's README describes.
After this demo, the answer to "does OperationsCenter work?" is:
Run this command. Look at this output. Open these evidence files. That is OperationsCenter working.
The demo does not use:

- SwitchBoard real routing (stub routing is used and labeled as such)
- Kodo, Claude CLI, Codex CLI, Aider, or any live coding backend
- Git operations, branch push, or PR creation
- Plane board integration
- Network connectivity
For the Plane-driven golden path, see the Autonomy-Cycle Ritual section at the bottom.
```bash
cd /path/to/OperationsCenter
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
```

No config files, environment variables, or external services are needed.
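To confirm the editable install worked, a minimal import check (the package name comes from the demo entrypoint used below):

```bash
python -c "import operations_center; print('operations_center importable')"
```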
Run the demo:

```bash
python -m operations_center.entrypoints.demo.run \
  --goal "Write a tiny hello-world execution artifact" \
  --repo-key demo \
  --workspace-path /tmp/operations-center-demo \
  --backend stub
```

Expected output:

```
============================================================
OperationsCenter Demo Run
============================================================
[PLANNING — TaskProposal]
proposal_id : <uuid>
task_id : auto-simple-edit-<hash>
task_type : simple_edit
risk_level : low
goal : Write a tiny hello-world execution artifact
[ROUTING — LaneDecision [stub mode — labeled, not production]]
decision_id : <uuid>
lane : aider_local
backend : demo_stub
rule : demo.stub_routing
rationale : Offline stub routing for demo mode — deterministic, no external services required
[PROPOSAL-DECISION BUNDLE]
<proposal_id[:8]> + <decision_id[:8]> -> bundled
[POLICY — gate result]
status : ALLOW
(no violations or warnings)
executed : True
[EXECUTION — DemoStubBackendAdapter]
run_id : <uuid>
status : SUCCEEDED
success : True
diff_stat : 1 file changed, 6 insertions(+)
artifact : /tmp/operations-center-demo/artifacts/demo_result.txt
[OBSERVABILITY — retained records]
headline : SUCCEEDED | demo_stub @ aider_local | run=<run_id[:8]>
summary : Run <run_id[:8]>; changed 1 file; 1 file changed, 6 insertions(+)
trace warn : validation was skipped for this run
trace warn : no primary artifacts produced by this run
Evidence files:
/tmp/operations-center-demo/.operations_center/runs/<run_id>/proposal.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/decision.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/execution_request.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/result.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/execution_record.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/execution_trace.json
/tmp/operations-center-demo/.operations_center/runs/<run_id>/run_metadata.json
============================================================
Demo completed successfully.
============================================================
```
Trace warnings are expected and correct:

- `validation was skipped`: the stub adapter doesn't run validation commands.
- `no primary artifacts`: the stub produces a log-excerpt artifact, not a code diff; the observability system correctly distinguishes these.
Both warnings demonstrate the observability system is working, not that something is broken.
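To see both warnings in the retained trace itself, a quick grep over the pretty-printed JSON works (a sketch; it deliberately avoids assuming the trace schema):

```bash
# Pretty-print each retained trace and surface the warning lines.
for trace in /tmp/operations-center-demo/.operations_center/runs/*/execution_trace.json; do
  echo "== $trace"
  python3 -m json.tool "$trace" | grep -i -E "validation|artifact"
done
```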
After the demo, inspect the evidence tree:
```
/tmp/operations-center-demo/
├── artifacts/
│   └── demo_result.txt                ← stub adapter output
└── .operations_center/
    └── runs/
        └── <run_id>/
            ├── proposal.json          ← canonical TaskProposal
            ├── decision.json          ← canonical LaneDecision
            ├── execution_request.json ← built by ExecutionCoordinator
            ├── result.json            ← canonical ExecutionResult
            ├── execution_record.json  ← normalized observability record
            ├── execution_trace.json   ← inspectable trace report
            └── run_metadata.json      ← run summary (lane, backend, status, flags)
```
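A completeness check against the seven files listed above (a sketch):

```bash
# Verify every expected evidence file exists in the most recent run directory.
run_dir=$(ls -td /tmp/operations-center-demo/.operations_center/runs/*/ | head -1)
for name in proposal decision execution_request result execution_record execution_trace run_metadata; do
  if [ -f "$run_dir$name.json" ]; then echo "ok   $name.json"; else echo "MISS $name.json"; fi
done
```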
Inspect any file:
```bash
cat /tmp/operations-center-demo/.operations_center/runs/*/run_metadata.json
cat /tmp/operations-center-demo/artifacts/demo_result.txt
```

Prove the policy gate prevents adapter invocation:
```bash
python -m operations_center.entrypoints.demo.run \
  --goal "Write a tiny hello-world execution artifact" \
  --repo-key demo \
  --workspace-path /tmp/operations-center-demo-blocked \
  --blocked-policy
```

Expected: exit code 1, no `artifacts/demo_result.txt`, and evidence files retained (record and trace are written even for blocked runs).
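Those three expectations can be asserted mechanically (a sketch of the same contract, nothing more):

```bash
#!/usr/bin/env bash
# Assert the blocked-run contract: non-zero exit, no stub artifact,
# evidence files still retained.
WORKSPACE=/tmp/operations-center-demo-blocked

python -m operations_center.entrypoints.demo.run \
  --goal "Write a tiny hello-world execution artifact" \
  --repo-key demo \
  --workspace-path "$WORKSPACE" \
  --blocked-policy
[ $? -eq 1 ] && echo "ok   exit code 1" || echo "FAIL exit code"

[ ! -f "$WORKSPACE/artifacts/demo_result.txt" ] && echo "ok   no artifact" || echo "FAIL artifact exists"
ls "$WORKSPACE"/.operations_center/runs/*/execution_record.json >/dev/null 2>&1 \
  && echo "ok   evidence retained" || echo "FAIL no evidence files"
```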
Run the tests:

```bash
# Just the demo tests
pytest tests/test_demo_stub_adapter.py tests/test_demo_routing.py tests/test_demo_cli.py -v

# Full suite (3 integration tests require a live SwitchBoard; skip them if it is not running)
pytest --ignore=tests/integration/ -q
```

The demo test suite covers:
- `DemoStubBackendAdapter` unit contracts (artifact write, result shape, changed-file evidence)
- Stub routing produces a canonical `LaneDecision`
- Demo policy gates: ALLOW path and BLOCK path
- `ExecutionCoordinator` boundary: request → policy → adapter → result → record → trace
- CLI smoke: exit 0 on success, exit 1 on blocked, all evidence files created
The self-contained demo above proves the internal boundary.
The full-stack ritual proves the Plane integration and live backend:
```bash
# Dry-run first: inspect what would be proposed
./scripts/operations-center.sh autonomy-cycle

# If the dry-run output looks reasonable, execute
./scripts/operations-center.sh autonomy-cycle --execute
```

Confirm:
- Dry-run output shows at least one candidate or a clear suppression reason
- Artifact paths are printed and the files exist under `tools/report/operations_center/` (see the sketch after this list)
- `--execute` creates a Plane task with `source: autonomy` and `source: propose` labels
- The task description includes `## Proposal Provenance` with traceable run IDs
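The artifact check can be scripted against the dry-run output (a sketch; it assumes the dry-run prints artifact paths as bare `tools/report/operations_center/...` strings):

```bash
# Extract printed artifact paths from the dry-run and confirm each exists.
./scripts/operations-center.sh autonomy-cycle \
  | grep -o 'tools/report/operations_center/[^ ]*' \
  | while read -r path; do
      if [ -e "$path" ]; then echo "ok   $path"; else echo "MISS $path"; fi
    done
```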
Full end-to-end walkthrough from local startup to a completed task with retained artifacts.
- Docker (for Plane)
- Python 3.11+
- A GitHub account with a repo and a personal access token (repo scope)
- `gh` CLI authenticated (`gh auth login`) or a `GITHUB_TOKEN` PAT
- Kodo installed and accessible via `scripts/kodo-shim`
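A quick pre-flight over the checkable items (a sketch; Plane and repo access are exercised later by the setup itself):

```bash
command -v docker >/dev/null && echo "ok   docker" || echo "MISS docker"
python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 11) else 1)' \
  && echo "ok   python 3.11+" || echo "MISS python 3.11+"
{ gh auth status >/dev/null 2>&1 || [ -n "${GITHUB_TOKEN:-}" ]; } \
  && echo "ok   GitHub auth" || echo "MISS gh auth / GITHUB_TOKEN"
[ -x scripts/kodo-shim ] && echo "ok   kodo-shim" || echo "MISS scripts/kodo-shim"
```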
```bash
./scripts/operations-center.sh setup
```

This creates `.venv`, installs dependencies, and walks through the initial config wizard.
Alternatively, copy the templates manually:
```bash
cp config/operations_center.example.yaml config/operations_center.local.yaml
cp .env.operations-center.example .env.operations-center.local
```

Edit both files. Minimum required changes:
`config/operations_center.local.yaml`:

```yaml
plane:
  project_id: <your-plane-project-uuid>

repos:
  MyRepo:
    clone_url: git@github.com:yourorg/yourrepo.git
    default_branch: main
```

`.env.operations-center.local`:

```bash
export PLANE_API_TOKEN='your-plane-api-token'
export GITHUB_TOKEN='github_pat_...'
```

Then load the environment:

```bash
source .env.operations-center.local
```
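Before starting the stack, a structural sanity check on the YAML (a sketch; assumes PyYAML is available in the venv):

```bash
python3 - <<'EOF'
import yaml  # PyYAML; an assumption about the installed dev dependencies

cfg = yaml.safe_load(open("config/operations_center.local.yaml"))
assert cfg.get("plane", {}).get("project_id"), "plane.project_id missing"
assert cfg.get("repos"), "no repos configured"
print("config parses; required keys present")
EOF
```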
Start the stack:

```bash
./scripts/operations-center.sh dev-up
```

This starts:
- Plane on `http://localhost:8080`. Plane infra is owned by PlatformDeployment; `dev-up` delegates to `PlatformDeployment/scripts/plane.sh` automatically.
- Watchers: `goal`, `test`, `improve`, `propose`, `review`

Prerequisite: PlatformDeployment must be cloned as a sibling of OperationsCenter (or `OPERATIONS_CENTER_PLATFORM_DEPLOYMENT_DIR` set). If PlatformDeployment is not found, the Plane step will print instructions and exit.
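That prerequisite can be checked up front (a sketch; the default sibling path is an assumption, the override variable comes from the note above):

```bash
PD_DIR="${OPERATIONS_CENTER_PLATFORM_DEPLOYMENT_DIR:-../PlatformDeployment}"
if [ -x "$PD_DIR/scripts/plane.sh" ]; then
  echo "ok   PlatformDeployment at $PD_DIR"
else
  echo "MISS PlatformDeployment; clone it as a sibling or set OPERATIONS_CENTER_PLATFORM_DEPLOYMENT_DIR" >&2
fi
```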
Confirm everything is running:
```bash
./scripts/operations-center.sh dev-status
```

Expected output: each watcher shows `state: idle` or `state: polling`.
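To pull the same states straight from the status files (a sketch; assumes each watcher writes a `<name>.status.json` with a top-level `state` key, matching the `goal.status.json` used later in this walkthrough):

```bash
# Print one line per watcher: status filename and current state.
for status in logs/local/watch-all/*.status.json; do
  printf '%-28s %s\n' "$(basename "$status")" \
    "$(python3 -c 'import json, sys; print(json.load(open(sys.argv[1])).get("state", "?"))' "$status")"
done
```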
- Open `http://localhost:8080` and navigate to your project.
- Create a new work item with:
  - Title: `Add a hello-world utility function`
  - Labels: `repo: MyRepo`, `task-kind: goal`
  - Description:

    ```
    ## Goal
    Add a small `hello_world()` function that returns the string "Hello, world!".
    Place it in `src/myrepo/utils.py` and add a corresponding test in `tests/test_utils.py`.
    ```

- Move the work item to the `Ready for AI` state.
The goal watcher picks up the task within its poll interval (default 30s).
Follow progress in the watcher log:
```bash
tail -f logs/local/watch-all/$(ls -t logs/local/watch-all/ | grep goal | head -1)
```

Or check the watcher status file:

```bash
cat logs/local/watch-all/goal.status.json | python3 -m json.tool
```

In Plane, the task transitions:

- `Ready for AI` → `Running` (worker claimed the task)
- `Running` → `In Review` (if `await_review: true` and the push succeeded), or `Review` (if no PR automation), or `Blocked` (if validation failed)
Every run writes structured artifacts under `tools/report/kodo_plane/`:

```
tools/report/kodo_plane/
└── TASK-<id>/
    └── <run-id>/
        ├── request_context.json   # task + repo metadata at time of run
        ├── request.json           # full execution request
        ├── kodo_command.json      # exact kodo CLI command used
        ├── kodo_stdout.txt        # kodo stdout
        ├── kodo_stderr.txt        # kodo stderr
        ├── validation.json        # validation command results
        ├── diff_stat.txt          # git diff --stat
        ├── diff_patch.txt         # full patch
        ├── summary.json           # outcome summary (success/failure/reason)
        └── control_outcome.json   # structured outcome metadata
```
Inspect the outcome:
```bash
cat tools/report/kodo_plane/TASK-*/*/summary.json | python3 -m json.tool
```

Success (branch pushed, PR created):
- Plane task is in `In Review`
- A PR is open on GitHub pointing to `plane/<task-id>-<slug>` → `main`
- `summary.json` has `"outcome_status": "executed"` and `"success": true`

Success (no changes needed):

- Plane task moves to `Blocked` with outcome `no_op`
- `summary.json` has `"outcome_status": "no_op"`

Failure (validation failed):

- Plane task moves to `Blocked`
- `summary.json` has `"validation_passed": false`
- `validation.json` shows which command failed and its stdout/stderr
- A draft branch is still pushed (for inspection) if `push_on_validation_failure: true`

Failure (contract violation):

- No execution started; task moves to `Blocked` immediately
- Plane comment includes the specific failure reason (unknown repo key, missing goal, disallowed branch)
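To classify the most recent run without reading the JSON by hand (a sketch; the field names are the ones the outcomes above describe):

```bash
# Pick the newest run directory and print the fields that distinguish outcomes.
latest=$(ls -td tools/report/kodo_plane/TASK-*/*/ | head -1)
python3 - "$latest/summary.json" <<'EOF'
import json, sys

summary = json.load(open(sys.argv[1]))
for key in ("outcome_status", "success", "validation_passed"):
    print(f"{key}: {summary.get(key)}")
EOF
```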
A successful cycle looks like:

- `dev-status` shows all watchers as `idle` or `polling`
- Task transitions from `Ready for AI` → `Running` within one poll interval
- Artifacts appear in `tools/report/kodo_plane/`
- `summary.json` exists and has a recognisable outcome
- Plane task comment shows execution result details
- GitHub branch or PR exists if push succeeded
If a task misbehaves, diagnose it:

```bash
./scripts/operations-center.sh plane-doctor --task-id <task-id>
./scripts/operations-center.sh smoke --task-id <task-id> --comment-only
```

Tear down when finished:

```bash
./scripts/operations-center.sh dev-down
```

Run this ritual after significant changes to the system:
- After threshold tuning (changed `min_consecutive_runs`, cooldown values, or family gates)
- After watcher restarts following a budget-exhaustion or rate-limit event
- After promoting a new candidate family from gated to active
- After any change to PR automation config (`await_review`, `bot_logins`, `max_self_review_loops`)
- Before and after a new repo is added to config