release: v0.2.0 — repo-local guardrails, framework-agnostic core, pinned adapters #2
Merged
Repositions agent-contracts around fail-closed guardrails for autonomous coding/build agents in a repository: filesystem read/write scopes, shell command authorization, shell-command budgets, and a durable verdict artifact that CI can gate on.

- filesystem read/write authorization scopes
- shell command authorization scopes + max_shell_commands budget
- verdict artifact emission and CLI verdict gating
- coding-agent trace bootstrap improvements
- demo contracts for blocked file writes, blocked commands, failed checks
- canonical AGENT_CONTRACT.yaml repositioned as a repo-build agent
- README, spec, and examples rewritten around the coding/build scope

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
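To make "a durable verdict artifact that CI can gate on" concrete, here is a minimal sketch of a fail-closed gate. It is not the project's CLI: the field names `outcome` and `violations` and both function names are assumptions made for illustration, and the real verdict schema may differ.

```python
import json
from typing import Any, Mapping


def gate_exit_code(verdict: Mapping[str, Any]) -> int:
    """Return 0 only for an explicit, clean pass; anything else fails closed."""
    # "outcome" and "violations" are hypothetical field names for this sketch.
    if verdict.get("outcome") != "pass":
        return 1
    if verdict.get("violations"):
        return 1
    return 0


def gate_from_text(raw: str) -> int:
    """Parse a verdict document and gate on it; unreadable input blocks the build."""
    try:
        verdict = json.loads(raw)
    except json.JSONDecodeError:
        return 1  # a verdict that cannot be parsed is treated as a failure
    if not isinstance(verdict, dict):
        return 1  # valid JSON of the wrong shape is also a failure
    return gate_exit_code(verdict)
```

The design point is that a missing, malformed, or ambiguous verdict is handled exactly like a failing one, so a CI step using the returned value as its exit code never turns green by accident.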
…pinned versions

The contract, CLI, verdict artifact, and GitHub Action are framework- and provider-agnostic by design. The CI verdict gate is the source of truth for enforcement; in-runtime adapters are optional ergonomic helpers that forward host hook calls into the same enforcer.

- pin claude-agent-sdk==0.1.56, openai-agents==0.13.5, langchain-core==1.2.26 in their respective extras
- gate all three SDK extras on Python 3.10+ (core stays 3.9+)
- fix OpenAI adapter import path (from agents import RunHooks)
- add real-SDK integration tests using pytest.importorskip so adapters are validated against the actual installed SDK base classes / hook surfaces, not stub fallbacks
- wire CI to install [claude,openai,langchain] extras on Python 3.10+ matrix entries so the integration tests run
- mypy: skip following imports into framework SDKs (newer-Python syntax)
- drop CrewAI and Pydantic AI adapters/extras/tests
- README: lead with "CI verdict gate = source of truth", document pinned SDK versions, add v0.3.0 TypeScript adapter roadmap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
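The idea that an adapter only forwards host hook calls into the same enforcer can be sketched as follows. Everything here is a stand-in written for illustration: `DemoEnforcer`, `DemoAdapter`, the `on_tool_start` hook shape, and the `read_file` tool name are assumptions, not the API of this package or of any SDK. Only the names `check_file_read` and `EffectDeniedError` come from the commit messages, and their signatures are assumed.

```python
import posixpath
from typing import Any, Iterable, Mapping


class EffectDeniedError(Exception):
    """Stand-in for the library's exception of the same name."""


class DemoEnforcer:
    """Hypothetical enforcer holding the read scope from a contract."""

    def __init__(self, read_prefixes: Iterable[str]) -> None:
        self._read_prefixes = tuple(read_prefixes)

    def check_file_read(self, path: str) -> None:
        # Normalize first so "src/../.env" cannot pass as a path under "src/".
        normalized = posixpath.normpath(path)
        if not normalized.startswith(self._read_prefixes):
            raise EffectDeniedError(f"read outside scope: {path}")


class DemoAdapter:
    """Hypothetical adapter: translates a host hook call into an enforcer check."""

    def __init__(self, enforcer: DemoEnforcer) -> None:
        self._enforcer = enforcer

    def on_tool_start(self, tool_name: str, tool_input: Mapping[str, Any]) -> None:
        if tool_name == "read_file":
            self._enforcer.check_file_read(tool_input["path"])
        else:
            # Fail closed: a tool the adapter does not understand is denied.
            raise EffectDeniedError(f"unknown tool: {tool_name}")
```

The adapter holds no policy of its own, so under this shape a second framework would only need a different hook translation while the enforcement logic stays in one place.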
Explains why the contract is a structured YAML artifact rather than prose: deterministic parse, typed fields for fail-closed enforcement, diff-friendly review, versioned schema, and consistency with existing cloud-native policy formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…launch

Security fix
============

The shell command matcher used fnmatch.fnmatch(). Pattern "python -m pytest *" matched commands like "python -m pytest tests/ ; rm -rf /" because the * glob consumed shell operators (;, &&, ||, |, &, >, <, `, $(, newline) as ordinary characters. An agent could bypass any allowlist entry by appending arbitrary chained or substituted commands after an authorized prefix.

v0.2.x now strict-rejects any command containing one of those metacharacters, regardless of pattern match. New ShellMetacharacterError subclasses EffectDeniedError so existing handlers keep working but verdicts can distinguish "matched no allowlist entry" from "attempted to chain commands". Regression coverage in tests/test_effects.py covers ;, &&, ||, |, >, <, >>, $(, backtick, newline, and trailing &.

A future v0.3.x may introduce a shlex-based token matcher for richer command shapes; until then, strict reject is the only correct fail-closed behavior. The README now documents the threat model and the trade-off explicitly.

README rewrite for launch
=========================

- New headline: "Declare what your coding agent may read, write, run, and spend — in one YAML file. Enforced at runtime. Gated in CI. Fails closed."
- New "Why this, why now" section grounding urgency in 2026 coding-agent failure modes (Claude Code, Codex, Cursor, Devin, Aider).
- New "What an agent cannot do under a contract" before/after table making every abstract term concrete (.env writes, rm -rf, shell injection, unauthorized network, token overruns, fake green runs).
- Step 2 of the quick start now shows the Claude Agent SDK adapter forwarding tool calls into the enforcer, instead of manual enforcer.check_file_read() calls that made it look like the user was the enforcer.
- Quick start contract trimmed to drop the redundant `enforcement: sync_block` and `severity: critical` fields (sensible defaults).
- aicontracts init template emits the trimmed shape too, so the README matches what `init --template coding -o AGENT_CONTRACT.yaml` actually writes.
- All CLI examples now use the `aicontracts` console script instead of `python -m agent_contracts.cli`.
- Verdict artifact JSON example pruned (drops final_gate, tool_calls).
- New "Shell command matching: threat model" section documents the strict-reject behavior and the v0.3.x roadmap for token-based matching.

Tests
=====

196 tests pass on Python 3.12 with [dev,claude,openai,langchain] installed (was 183, +13 shell bypass regression cases). All 5 real-SDK integration tests still pass. Lint + mypy clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
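The bypass and the strict-reject fix described in the security section can be reproduced in a few lines. This is a sketch, not the code in the repository: the two exception classes are stand-ins named after the ones in the commit message, and `check_command` and `SHELL_METACHARACTERS` are hypothetical names.

```python
import fnmatch
from typing import Sequence


class EffectDeniedError(Exception):
    """Stand-in for the library's base denial error."""


class ShellMetacharacterError(EffectDeniedError):
    """Stand-in: the command contains a shell operator."""


# Operators named in the commit message. "&&", "||" and ">>" need no separate
# entries because they contain "&", "|" and ">" respectively.
SHELL_METACHARACTERS = (";", "&", "|", ">", "<", "`", "$(", "\n")


def check_command(command: str, allowlist: Sequence[str]) -> None:
    """Raise unless the command is operator-free and matches an allowlist glob."""
    # Reject operators before any pattern matching, so a glob can never
    # swallow a chained or substituted command.
    for token in SHELL_METACHARACTERS:
        if token in command:
            raise ShellMetacharacterError(f"shell operator {token!r} in command")
    if not any(fnmatch.fnmatch(command, pattern) for pattern in allowlist):
        raise EffectDeniedError("command matched no allowlist entry")


# The original bug: "*" matches the operator and everything after it.
bypass_matches = fnmatch.fnmatch(
    "python -m pytest tests/ ; rm -rf /", "python -m pytest *"
)
```

Because `ShellMetacharacterError` subclasses `EffectDeniedError`, a caller that already catches the base class keeps working, while a verdict writer can check for the subclass first to record an attempted chain separately.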
Pushed Critical security fix
README rewrite for launch
Surfaced by parallel reviews (positioning, competitive landscape, code/claim audit, launch craft).
Test status
Deferred to v0.2.1
Holding the tag for 24h per your request: your eyeball pass tomorrow before merge.
Summary

Adapter matrix

- `aicontracts[claude]`: `claude-agent-sdk==0.1.56`
- `aicontracts[openai]`: `openai-agents==0.13.5`
- `aicontracts[langchain]`: `langchain-core==1.2.26`

Bug fixed in this PR

… `from openai_agents import RunHooks`. The actual package import path is `from agents import RunHooks`; the old import would have failed at first use once a user installed the `[openai]` extra. Now fixed and verified against the installed SDK.

Removed

- … (`src/agent_contracts/adapters/crewai.py`) + `[crewai]` extra
- … (`src/agent_contracts/adapters/pydantic_ai.py`) + `[pydantic-ai]` extra
- … (`CLAUDE.md`, `docs/plans/`), now ignored and gated by the `publish.yml` repo-hygiene check

Verification performed locally

- … (`[dev,claude,openai,langchain]`): 183 tests pass, all real-SDK integration tests pass against the actual installed SDK base classes
- `ruff check src/ tests/`: clean
- `mypy src/agent_contracts`: clean (18 source files)
- `aicontracts validate AGENT_CONTRACT.yaml`: passes, Tier 2

Test plan

- … `v0.2.0` to trigger `publish.yml` → PyPI publish + GitHub Release
- `pip install aicontracts==0.2.0` works in a fresh venv

Release sequence after merge

1. `git checkout main && git pull`
2. `git tag v0.2.0 && git push origin v0.2.0`, which triggers PyPI publish + GitHub Release
3. … `aicontracts==0.2.0` on PyPI

🤖 Generated with Claude Code