
[codex] Add Syrin v0.12 sandbox execute loop #18

Open
rhein1 wants to merge 9 commits into syrin-labs:main from rhein1:codex/agoragentic-syrin-sandbox-v012


rhein1 (Contributor) commented May 5, 2026

Summary

  • add a Syrin v0.12 native Sandbox execute-loop example for Agoragentic routing
  • document the shared SANDBOX_WORKSPACE contract, resource limits, and approval-gated execute payload
  • add regression coverage for version targeting, workspace artifacts, sensitive action gating, boundary-safe matching, and zero-budget preservation

Why

Syrin v0.12.0 added first-party sandboxed bash/Python execution, package installation, resource pools, and recursive sandbox propagation. This gives the Agoragentic integration a native sandbox path instead of relying only on external sandbox scaffolding.

Validation

  • python -m compileall -q agoragentic tests
  • python -m unittest tests.test_agoragentic_autonomous_lifecycle -v
  • python -m unittest discover -s tests -v
  • python -m ruff check agoragentic\examples\syrin_sandbox_execute_loop.py tests\test_agoragentic_autonomous_lifecycle.py
  • python agoragentic\examples\syrin_sandbox_execute_loop.py --task "Preview a v0.12 sandbox route" --max-cost 0 --packages pandas --requested-action "deploy live spend"

Summary by CodeRabbit

  • New Features

    • Native Syrin v0.12 sandbox "execute-loop" workflow: preview-first approval gating, resource-limit handling, and a CLI-friendly example to build execution plans.
  • Documentation

    • Expanded sandbox & deployment guide, workflow schema, examples index, and quick-start entries for the native sandbox pattern.
  • Tests

    • Added tests validating plan generation, guardrail/approval behavior, workspace artifact contract, and budget validation/edge cases.

coderabbitai Bot commented May 5, 2026

Warning

Rate limit exceeded

@rhein1 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 60 minutes before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2320adad-1611-4aa4-96e8-4f1ca009521c

📥 Commits

Reviewing files that changed from the base of the PR and between 06c1f84 and 20f8f30.

📒 Files selected for processing (6)
  • agoragentic/MICRO_ECF_POLICY_PACK.md
  • agoragentic/README.md
  • agoragentic/WORKFLOW_SCHEMAS.md
  • agoragentic/examples/README.md
  • agoragentic/examples/micro_ecf_policy_pack.py
  • tests/test_agoragentic_autonomous_lifecycle.py
📝 Walkthrough

This PR introduces Syrin v0.12 native sandbox support to Agoragentic, including a new example module that generates sandbox execute-loop plans with guardrail logic, shared workspace contracts, and resource limits, alongside supporting documentation, schema definitions, and comprehensive tests.

Changes

Syrin v0.12 Native Sandbox Integration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Data Models | agoragentic/examples/syrin_sandbox_execute_loop.py | Introduces SandboxStep and SyrinSandboxPlan dataclasses with JSON-safe as_dict() methods to represent sandbox execution plan components and overall plan structure. |
| Guardrail & Contract Logic | agoragentic/examples/syrin_sandbox_execute_loop.py | Implements guardrail classification via regex boundary matching, workspace contract definition (task/output files, cleanup), and a fixed three-step sandbox sequence (prep, preview analysis, reflection writing). |
| Sandbox Configuration & Code Generation | agoragentic/examples/syrin_sandbox_execute_loop.py | Builds execute payloads with approval gating logic, resource limits tied to cost budgets, and generates a Syrin v0.12 Python snippet that orchestrates task writing and async sandbox execution with attempt record capture. |
| CLI & Integration | agoragentic/examples/syrin_sandbox_execute_loop.py | Provides CLI argument parsing (--task, --max-cost, --packages, --backend, --run-live, --requested-action) and main() entrypoint that wires all builders together and outputs a JSON plan. |
| Schema & Workflow Documentation | agoragentic/WORKFLOW_SCHEMAS.md, agoragentic/SANDBOX_AND_DEPLOYMENT.md, agoragentic/README.md, agoragentic/examples/README.md, README.md | Defines the syrin_sandbox_execute_loop workflow schema with inputs, controls, and expected outputs; documents the Syrin native sandbox path and its interaction contract with bash/Python via SANDBOX_WORKSPACE; adds quick-start and example index entries for the new example. |
| Tests & Validation | tests/test_agoragentic_autonomous_lifecycle.py | Adds tests that load the example and assert native v0.12 targeting, shared workspace contract and step sequencing, guardrail review gating behavior disabling prefer_execute, preview-only defaults, max_cost validation (NaN rejection), boundary-aware action matching, and preservation of zero-budget constraints. |
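
The "regex boundary matching" mentioned for the guardrail layer can be illustrated with a minimal sketch. This is not the code from syrin_sandbox_execute_loop.py: the term list, the `classify_action` name, and the `"review"` label are assumptions made for illustration (only the `"allow"` decision value appears in the review text).

```python
import re

# Hypothetical sensitive terms; the real example's list is not shown on this page.
SENSITIVE_TERMS = ("deploy", "live spend", "delete")

# \b anchors make each term match only as a whole word or phrase, so "deploy"
# does not fire inside "redeployment" or "deployable".
_SENSITIVE_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in SENSITIVE_TERMS) + r")\b",
    re.IGNORECASE,
)


def classify_action(requested_action: str) -> str:
    """Return "review" when a sensitive term appears as a whole word, else "allow"."""
    return "review" if _SENSITIVE_RE.search(requested_action) else "allow"
```

Under these assumptions, `classify_action("deploy live spend")` is routed to review, while `classify_action("redeployment notes")` is allowed because the term only occurs inside a longer word.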

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I built a sandbox, snug and small,
With guarded gates to check each call,
Plans printed clear in JSON light,
Preview first, then take flight—
Hoppity tests say everything's right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title '[codex] Add Syrin v0.12 sandbox execute loop' directly and clearly summarizes the main change: adding a Syrin v0.12 sandbox execute-loop example. It is concise, specific, and accurately reflects the primary contribution of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/test_agoragentic_autonomous_lifecycle.py (1)

226-284: ⚡ Quick win

Add a regression for “safe action while live mode is off.”

Current additions don’t assert the default live_enabled=False path for non-sensitive actions. Add one test to verify the payload stays preview-only in that case, so execute preference cannot silently reappear.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_agoragentic_autonomous_lifecycle.py` around lines 226 - 284, Add a
regression test that verifies non-sensitive actions with the default live flag
remain preview-only: call syrin_sandbox.build_guardrail_report with
live_enabled=False using a clearly non-sensitive prompt (e.g., "preview route"
or "list items"), then call syrin_sandbox.build_execute_payload with that report
and check payload["constraints"]["preview_only"] is True and
payload["constraints"]["prefer_execute"] is False (also assert
report["decision"] == "allow" if helpful); put this new test alongside the other
tests (e.g., a new def
test_syrin_sandbox_default_live_off_keeps_preview_only(self):) so it covers the
default live_enabled=False path for non-sensitive actions.
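
A minimal sketch of the requested regression test follows. The two builder functions here are simplified stand-ins written only so the test runs on its own; their signatures and the `live_enabled` key on the report are assumptions, and the real test would call `syrin_sandbox.build_guardrail_report` and `syrin_sandbox.build_execute_payload` from the loaded example instead.

```python
import unittest


def build_guardrail_report(requested_action: str, live_enabled: bool = False) -> dict:
    """Stand-in for the example's builder (assumed signature, simplified logic)."""
    sensitive = "deploy" in requested_action.lower().split()
    return {
        "decision": "review" if sensitive else "allow",
        "live_enabled": live_enabled,
    }


def build_execute_payload(report: dict) -> dict:
    """Stand-in: execution is preferred only when allowed AND live mode is on."""
    can_execute = report["decision"] == "allow" and report["live_enabled"]
    return {
        "constraints": {
            "prefer_execute": can_execute,
            "preview_only": not can_execute,
        }
    }


class DefaultLiveOffTest(unittest.TestCase):
    def test_syrin_sandbox_default_live_off_keeps_preview_only(self):
        # A clearly non-sensitive action with live mode left off.
        report = build_guardrail_report("preview route", live_enabled=False)
        payload = build_execute_payload(report)

        self.assertEqual(report["decision"], "allow")
        self.assertTrue(payload["constraints"]["preview_only"])
        self.assertFalse(payload["constraints"]["prefer_execute"])
```

The point of the test is that an "allow" decision alone must not flip the payload to execute; only the combination with live mode should.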
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agoragentic/examples/syrin_sandbox_execute_loop.py`:
- Around line 171-197: Update build_execute_payload to factor in live-mode when
deciding execution: add a boolean parameter (e.g., run_live or live_mode) and
compute can_execute = (guardrail_report["decision"] == "allow") and run_live so
preview-first is preserved when live mode is off; keep constraints using
prefer_execute = can_execute and preview_only = not can_execute, and update any
callers of build_execute_payload accordingly (refer to function
build_execute_payload, variable can_execute, and the constraints dict).
- Around line 255-284: The builder build_syrin_sandbox_plan currently accepts
max_cost from callers without validation; add input validation at the start of
build_syrin_sandbox_plan to ensure max_cost is a finite non-negative number (not
NaN or ±inf) and raise a clear ValueError (or similar) if it fails, so invalid
values cannot propagate into build_execute_payload or build_resource_limits;
include the check before any calls to
build_execute_payload/build_resource_limits and mention max_cost in the
exception message for easier debugging.
- Around line 221-252: The generated script uses top-level "async with
Sandbox(...)" which is invalid; wrap the async usage in an async function (e.g.,
define async def main(): and move the Sandbox block and calls to sb.exec_bash /
sb.exec_python into it), add "import asyncio" at the top, and invoke
asyncio.run(main()) at module scope so the Sandbox context and awaits run inside
a proper async event loop.
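
The three inline findings above can be sketched together as follows. This is an illustrative sketch, not the PR's code: `validate_max_cost`, the `max_cost` payload key, and `FakeSandbox` are hypothetical, and `FakeSandbox` only stands in for an async context manager with an `exec_bash` method; it does not reproduce the Syrin `Sandbox` API.

```python
import asyncio
import math


def validate_max_cost(max_cost: float) -> float:
    """Reject non-numbers, NaN, infinities, and negatives; zero is a valid budget."""
    if isinstance(max_cost, bool) or not isinstance(max_cost, (int, float)):
        raise ValueError(f"max_cost must be a number, got {max_cost!r}")
    if not math.isfinite(max_cost) or max_cost < 0:
        raise ValueError(f"max_cost must be finite and non-negative, got {max_cost!r}")
    return float(max_cost)


def build_execute_payload(guardrail_report: dict, max_cost: float, run_live: bool = False) -> dict:
    """Prefer execution only when the guardrail allows it AND live mode is on."""
    can_execute = guardrail_report["decision"] == "allow" and run_live
    return {
        "max_cost": validate_max_cost(max_cost),  # payload key is an assumption
        "constraints": {
            "prefer_execute": can_execute,
            "preview_only": not can_execute,
        },
    }


class FakeSandbox:
    """Hypothetical stand-in for an async sandbox; NOT the Syrin API."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def exec_bash(self, command: str) -> str:
        return f"ran: {command}"


async def main() -> str:
    # "async with" is only valid inside an async function, never at module top level.
    async with FakeSandbox() as sb:
        return await sb.exec_bash("echo preview")


if __name__ == "__main__":
    # asyncio.run creates the event loop and drives main() to completion.
    print(asyncio.run(main()))
```

With `run_live` defaulting to `False`, an allowed action still yields a preview-only payload, and a `max_cost` of 0 passes validation unchanged rather than being treated as missing.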


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9b5d4062-25cb-45a2-93ba-3e4329125a36

📥 Commits

Reviewing files that changed from the base of the PR and between ded42d2 and 307c973.

📒 Files selected for processing (7)
  • README.md
  • agoragentic/README.md
  • agoragentic/SANDBOX_AND_DEPLOYMENT.md
  • agoragentic/WORKFLOW_SCHEMAS.md
  • agoragentic/examples/README.md
  • agoragentic/examples/syrin_sandbox_execute_loop.py
  • tests/test_agoragentic_autonomous_lifecycle.py

Comment thread agoragentic/examples/syrin_sandbox_execute_loop.py
Comment thread agoragentic/examples/syrin_sandbox_execute_loop.py
Comment thread agoragentic/examples/syrin_sandbox_execute_loop.py
rhein1 added 2 commits May 8, 2026 23:34
…cy' into codex/agoragentic-syrin-sandbox-v012

# Conflicts:
#	agoragentic/README.md
#	agoragentic/WORKFLOW_SCHEMAS.md
#	agoragentic/examples/README.md
#	tests/test_agoragentic_autonomous_lifecycle.py