
Harden MCP runtime with operator policy and integration coverage #14

Merged
N1ghthill merged 1 commit into main from runtime-policy-mcp-hardening on Mar 24, 2026

@N1ghthill (Owner) commented Mar 24, 2026

Summary

This change hardens the MCP-first runtime baseline in four areas:

  • adds operator-configurable policy loading through a versioned TOML file with safe defaults and fail-closed behavior
  • wires policy diagnostics and config-target loading into the runtime and mc doctor
  • adds explicit runtime/MCP integration coverage, including a real stdio subprocess test for mc mcp-serve
  • syncs docs and CI with the current product posture and validation model

What Changed

Runtime and policy

  • added versioned policy loading in src/master_control/policy/config.py
  • extended PolicyEngine to enforce tool enablement, confirmation overrides, allowed scopes, and service patterns
  • connected policy diagnostics into MasterControlRuntime.doctor()
  • made ConfigManager consume managed targets from policy instead of a parallel hardcoded source
  • surfaced policy denial reasons directly in runtime/MCP payloads
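The enforcement order described above can be sketched roughly as follows. This is an illustrative stand-in, not the actual PolicyEngine: the `Policy` and `Decision` types, the field names, the argument keys (`scope`, `service`), and the use of glob-style matching via `fnmatch` are assumptions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Decision:
    allowed: bool
    needs_confirmation: bool = False
    reason: str = ""  # denial reason, suitable for a runtime/MCP payload


@dataclass(frozen=True)
class Policy:
    disabled_tools: frozenset[str] = frozenset()
    confirm_tools: frozenset[str] = frozenset()
    allowed_scopes: frozenset[str] = frozenset({"system", "user"})
    service_patterns: tuple[str, ...] = ("*.service",)


def evaluate(policy: Policy, tool: str, arguments: dict[str, str]) -> Decision:
    """Check enablement, then scope, then service pattern, then confirmation."""
    if tool in policy.disabled_tools:
        return Decision(False, reason=f"tool '{tool}' is disabled by policy")
    scope = arguments.get("scope")
    if scope is not None and scope not in policy.allowed_scopes:
        return Decision(False, reason=f"scope '{scope}' is not allowed by policy")
    service = arguments.get("service")
    if service is not None and not any(
        fnmatchcase(service, pattern) for pattern in policy.service_patterns
    ):
        return Decision(False, reason=f"service '{service}' matches no allowed pattern")
    return Decision(True, needs_confirmation=tool in policy.confirm_tools)
```

Returning the reason alongside the verdict is what lets a caller pass the denial text through to the client unchanged.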

Interface and ownership cleanup

  • reduced direct core.runtime -> interfaces.agent.* coupling by moving runtime imports to master_control.agent.*
  • simplified MasterControlApp.doctor() to delegate to the runtime
  • exposed policy status in CLI doctor output

Integration coverage

  • added tests/test_runtime_policy_integration.py
  • added tests/test_mcp_stdio_integration.py
  • split pytest in CI into unit and runtime-integration slices
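For readers unfamiliar with stdio subprocess tests, the general shape is sketched below. The child process here is a stand-in that echoes a JSON-RPC result, not mc mcp-serve, and the sketch assumes newline-delimited JSON-RPC framing; the real test in tests/test_mcp_stdio_integration.py may differ in every detail.

```python
import json
import subprocess
import sys

# Stand-in child: answers each JSON-RPC request with an empty result.
# A real integration test would launch the actual server command instead.
CHILD = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    req = json.loads(line)\n"
    "    resp = {'jsonrpc': '2.0', 'id': req['id'], 'result': {}}\n"
    "    sys.stdout.write(json.dumps(resp) + '\\n')\n"
    "    sys.stdout.flush()\n"
)


def roundtrip(request: dict) -> dict:
    """Send one request over stdin and return the first response line."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # communicate() writes the input, closes stdin, and waits for exit.
        out, _ = proc.communicate(json.dumps(request) + "\n", timeout=10)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()
        raise
    return json.loads(out.splitlines()[0])
```

The timeout plus kill keeps a hung child from stalling the CI slice, which is one reason to run such tests separately from the unit slice.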

Docs

  • added docs/policy.md
  • added docs/runtime-integration-testing.md
  • updated README/status/roadmap/architecture/operator-workflows/runtime-mcp-maturation-plan
  • moved older planning/release records under docs/history/

Review Guide

  1. Review the new policy model and failure semantics in src/master_control/policy/config.py and src/master_control/policy/engine.py.
  2. Review runtime wiring in src/master_control/core/runtime.py, especially doctor() and run_tool() policy application.
  3. Review the new runtime/MCP integration coverage in tests/test_runtime_policy_integration.py and tests/test_mcp_stdio_integration.py.
  4. Review the CI split and doc realignment in .github/workflows/ci.yml, docs/policy.md, and docs/runtime-integration-testing.md.

Validation

  • python3 -m ruff check .
  • python3 -m mypy src
  • PYTHONPATH=src python3 -m unittest discover -s tests
  • PYTHONPATH=src python3 -m pytest -q tests --ignore tests/test_runtime_policy_integration.py --ignore tests/test_mcp_stdio_integration.py
  • PYTHONPATH=src python3 -m pytest -q tests/test_runtime_policy_integration.py tests/test_mcp_stdio_integration.py
  • python3 -m compileall src
  • PYTHONPATH=src python3 -m master_control --json doctor
  • python3 scripts/validate_operator_bootstrap.py --output-dir /tmp/mc-bootstrap-analysis-final --provider heuristic --python python3

Residual Risks

  • no container-backed integration harness yet for repeatable systemd scenarios
  • no automated real desktop MCP client validation yet
  • concurrency/schema-governance hardening is still follow-up work

Follow-up Checklist

  • Add a container-backed integration harness for repeatable service/config scenarios.
  • Validate the approval flow from a real desktop MCP client and capture a stable transcript.
  • Define tool-schema compatibility and release policy for the MCP/runtime contract.
  • Add deeper contention/concurrency coverage around approval claiming and execution paths.

@N1ghthill added the enhancement and documentation labels Mar 24, 2026
@N1ghthill self-assigned this Mar 24, 2026
@N1ghthill (Owner, Author)

Review guide for first pass:

  1. Start with src/master_control/policy/config.py and src/master_control/policy/engine.py to validate the policy shape and fail-closed semantics.
  2. Then review src/master_control/core/runtime.py for the runtime wiring, especially doctor() diagnostics and run_tool() enforcement.
  3. Use tests/test_runtime_policy_integration.py and tests/test_mcp_stdio_integration.py as the contract reference for expected operator and MCP behavior.
  4. Finish with .github/workflows/ci.yml, docs/policy.md, and docs/runtime-integration-testing.md to confirm the validation story matches the code.

Main follow-ups intentionally left for later:

  • container-backed systemd integration harness
  • real desktop MCP client validation
  • schema compatibility/versioning policy
  • deeper concurrency contention coverage

@N1ghthill merged commit da5e28e into main Mar 24, 2026
3 checks passed
@N1ghthill deleted the runtime-policy-mcp-hardening branch March 24, 2026 02:22

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

    try:
        result = tool.invoke(argument_payload)
    except ToolError as exc:

[P1] Finalize approvals when tool invocation raises a non-ToolError exception

After a confirmed call claims a pending approval, the invoke path handles only ToolError. If tool.invoke(...) raises any other exception (for example, PermissionError or OSError from filesystem operations), execution exits before _finalize_claimed_tool_approval(...) runs. The approval is then stuck in the executing state and can no longer be rejected or approved through normal flows. This wedges approval state and can require manual DB cleanup. Finalize the claimed approval in a broader except or finally path.
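One possible shape for the suggested fix is a finally block with a pessimistic default outcome. The sketch below uses a hypothetical in-memory `ApprovalStore` and a `run_claimed` helper; neither name comes from the codebase, and the real finalize call (_finalize_claimed_tool_approval) may take different arguments.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ApprovalStore:
    """Hypothetical in-memory stand-in for the approval table."""

    def __init__(self) -> None:
        self.status: dict[int, str] = {}

    def claim(self, approval_id: int) -> None:
        self.status[approval_id] = "executing"

    def finalize(self, approval_id: int, outcome: str) -> None:
        self.status[approval_id] = outcome


def run_claimed(store: ApprovalStore, approval_id: int, invoke: Callable[[], T]) -> T:
    """Claim an approval, run the tool, and always finalize the approval."""
    store.claim(approval_id)
    outcome = "failed"  # pessimistic default covers every exception type
    try:
        result = invoke()
        outcome = "completed"
        return result
    finally:
        # Runs for ToolError, OSError, and anything else, so the row
        # never stays in "executing" after the call returns or raises.
        store.finalize(approval_id, outcome)
```

The exception still propagates to the caller; only the approval bookkeeping is guaranteed.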

Comment on lines +429 to +432
AND status = 'pending'
AND arguments_json = ?
AND audit_context_json = ?
ORDER BY id DESC
[P2] Claim pending approvals by id, not full audit_context

Approval claiming requires exact audit_context_json equality in addition to tool and arguments. A confirmation executed from a different interface context (e.g., a pending approval created with source="mcp_stdio" but confirmed via a CLI/chat command) will therefore run the tool without claiming the original pending row. That leaves a stale pending approval that can later be approved, replaying the same mutation unexpectedly. Matching by approval id (or normalizing the context for the claim) avoids this orphan/replay behavior.
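A claim keyed on the approval id could look roughly like the sketch below. The `approvals` table name and the `claim_by_id` helper are assumptions for illustration; only the `status = 'pending'` condition is taken from the quoted query.

```python
import sqlite3


def claim_by_id(conn: sqlite3.Connection, approval_id: int) -> bool:
    """Atomically move one pending approval to 'executing'.

    Returns True only if this call claimed the row, so a second caller
    (or a replay of the same confirmation) gets False.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE approvals SET status = 'executing' "
            "WHERE id = ? AND status = 'pending'",
            (approval_id,),
        )
    return cur.rowcount == 1
```

Because the match does not involve audit_context_json, a confirmation arriving from a different interface still claims the original row instead of leaving it pending.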
