release: v0.2.0 — repo-local guardrails, framework-agnostic core, pinned adapters by pyyush · Pull Request #2 · pyyush/agentcontracts

pyyush · 2026-04-06T21:03:53Z

Summary

Repositions agent-contracts as repo-local, fail-closed guardrails for autonomous coding/build agents: filesystem read/write scopes, shell command authorization, shell-command budgets, and a durable verdict artifact CI can gate on.
Establishes the CI verdict gate as the source of truth for enforcement. The contract, CLI, verdict artifact, and GitHub Action are framework-agnostic and provider-agnostic. In-runtime adapters are optional ergonomic helpers.
Pins all framework adapter SDKs to exact versions, gates them on Python 3.10+ (core stays 3.9+), and adds real-SDK integration tests so adapters are validated against actual installed SDK base classes — not stub fallbacks.
Drops the CrewAI and Pydantic AI adapters and extras. The Vercel AI SDK adapter and TypeScript companion package are on the roadmap for v0.3.0.
Adds a "Why YAML, not Markdown?" design rationale section to the README.
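To make "fail-closed" concrete for the CI verdict gate described above, here is a minimal sketch of a gate that lets a build pass only on an explicit pass verdict. It is not the project's implementation: the function name `gate` and the `outcome` field are assumptions for illustration, and the real verdict artifact schema may differ.

```python
import json
from pathlib import Path


def gate(verdict_path: str) -> int:
    """Return a process exit code: 0 only for an explicit pass verdict.

    The "outcome" key is a hypothetical field name, not the real schema.
    """
    try:
        verdict = json.loads(Path(verdict_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed artifact: fail closed.
        return 1
    if not isinstance(verdict, dict):
        # Valid JSON but not an object: still not an explicit pass.
        return 1
    return 0 if verdict.get("outcome") == "pass" else 1
```

The design point is that every path other than a well-formed, explicit pass returns a non-zero code, so a missing artifact cannot be mistaken for a green run.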

Adapter matrix

| Framework | Extra | Pinned SDK |
| --- | --- | --- |
| Claude Agent SDK | `aicontracts[claude]` | `claude-agent-sdk==0.1.56` |
| OpenAI Agents SDK | `aicontracts[openai]` | `openai-agents==0.13.5` |
| LangChain | `aicontracts[langchain]` | `langchain-core==1.2.26` |

Bug fixed in this PR

The OpenAI adapter previously used `from openai_agents import RunHooks`. The package's actual import path is `from agents import RunHooks`, so the old import would have failed at first use once a user installed the `[openai]` extra. This is now fixed and verified against the installed SDK.
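A check against the installed SDK catches this class of bug because it resolves the symbol from the real package rather than from a stub. The sketch below illustrates the idea. The PR's own tests use `pytest.importorskip`; this version swaps in the standard library's `importlib` and `unittest` so it runs without pytest. The helper `resolve_sdk_symbol` and the test class name are hypothetical.

```python
import importlib
import importlib.util
import unittest


def resolve_sdk_symbol(module_name: str, attr: str):
    """Import module_name and return attr; raise ImportError if either is missing."""
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError as exc:
        raise ImportError(f"{module_name!r} has no attribute {attr!r}") from exc


class OpenAIAdapterImportTest(unittest.TestCase):
    def test_run_hooks_resolves_from_agents(self):
        # Skip, rather than fail, when the [openai] extra is not installed.
        if importlib.util.find_spec("agents") is None:
            self.skipTest("openai-agents is not installed")
        # The corrected import path from this PR: `from agents import RunHooks`.
        self.assertIsNotNone(resolve_sdk_symbol("agents", "RunHooks"))
```

On an interpreter without the extra, the test is reported as skipped, which matches the "5 real-SDK integration tests skipped as expected" result on Python 3.9.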

Removed

CrewAI adapter (src/agent_contracts/adapters/crewai.py) + [crewai] extra
Pydantic AI adapter (src/agent_contracts/adapters/pydantic_ai.py) + [pydantic-ai] extra
Internal planning files (CLAUDE.md, docs/plans/) — now ignored and gated by publish.yml repo-hygiene check

Verification performed locally

✅ Python 3.9 (system, no extras): 177 tests pass, 5 real-SDK integration tests skipped as expected
✅ Python 3.12 (fresh venv with [dev,claude,openai,langchain]): 183 tests pass, all real-SDK integration tests pass against the actual installed SDK base classes
✅ ruff check src/ tests/ — clean
✅ mypy src/agent_contracts — clean (18 source files)
✅ aicontracts validate AGENT_CONTRACT.yaml — passes, Tier 2

Test plan

CI matrix passes on all of 3.9, 3.10, 3.11, 3.12, 3.13
Real-SDK integration tests run (not skipped) on 3.10+ matrix entries
After merge: tag v0.2.0 to trigger publish.yml → PyPI publish + GitHub Release
After publish: pip install aicontracts==0.2.0 works in a fresh venv

Release sequence after merge

git checkout main && git pull
git tag v0.2.0 && git push origin v0.2.0 — triggers PyPI publish + GitHub Release
Verify aicontracts==0.2.0 on PyPI
Post HN / LinkedIn

🤖 Generated with Claude Code

Repositions agent-contracts around fail-closed guardrails for autonomous coding/build agents in a repository: filesystem read/write scopes, shell command authorization, shell-command budgets, and a durable verdict artifact that CI can gate on.

- filesystem read/write authorization scopes
- shell command authorization scopes + max_shell_commands budget
- verdict artifact emission and CLI verdict gating
- coding-agent trace bootstrap improvements
- demo contracts for blocked file writes, blocked commands, failed checks
- canonical AGENT_CONTRACT.yaml repositioned as a repo-build agent
- README, spec, and examples rewritten around the coding/build scope

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…pinned versions

The contract, CLI, verdict artifact, and GitHub Action are framework- and provider-agnostic by design. The CI verdict gate is the source of truth for enforcement; in-runtime adapters are optional ergonomic helpers that forward host hook calls into the same enforcer.

- pin claude-agent-sdk==0.1.56, openai-agents==0.13.5, langchain-core==1.2.26 in their respective extras
- gate all three SDK extras on Python 3.10+ (core stays 3.9+)
- fix OpenAI adapter import path (from agents import RunHooks)
- add real-SDK integration tests using pytest.importorskip so adapters are validated against the actual installed SDK base classes / hook surfaces, not stub fallbacks
- wire CI to install [claude,openai,langchain] extras on Python 3.10+ matrix entries so the integration tests run
- mypy: skip following imports into framework SDKs (newer-Python syntax)
- drop CrewAI and Pydantic AI adapters/extras/tests
- README: lead with "CI verdict gate = source of truth", document pinned SDK versions, add v0.3.0 TypeScript adapter roadmap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Explains why the contract is a structured YAML artifact rather than prose: deterministic parse, typed fields for fail-closed enforcement, diff-friendly review, versioned schema, and consistency with existing cloud-native policy formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…launch

Security fix
============

The shell command matcher used fnmatch.fnmatch(). Pattern "python -m pytest *" matched commands like "python -m pytest tests/ ; rm -rf /" because the * glob consumed shell operators (;, &&, ||, |, &, >, <, `, $(, newline) as ordinary characters. An agent could bypass any allowlist entry by appending arbitrary chained or substituted commands after an authorized prefix.

v0.2.x now strict-rejects any command containing one of those metacharacters, regardless of pattern match. New ShellMetacharacterError subclasses EffectDeniedError so existing handlers keep working but verdicts can distinguish "matched no allowlist entry" from "attempted to chain commands". Regression coverage in tests/test_effects.py covers ;, &&, ||, |, >, <, >>, $(, backtick, newline, and trailing &.

A future v0.3.x may introduce a shlex-based token matcher for richer command shapes; until then, strict reject is the only correct fail-closed behavior. The README now documents the threat model and the trade-off explicitly.

README rewrite for launch
=========================

- New headline: "Declare what your coding agent may read, write, run, and spend — in one YAML file. Enforced at runtime. Gated in CI. Fails closed."
- New "Why this, why now" section grounding urgency in 2026 coding-agent failure modes (Claude Code, Codex, Cursor, Devin, Aider).
- New "What an agent cannot do under a contract" before/after table making every abstract term concrete (.env writes, rm -rf, shell injection, unauthorized network, token overruns, fake green runs).
- Step 2 of the quick start now shows the Claude Agent SDK adapter forwarding tool calls into the enforcer, instead of manual enforcer.check_file_read() calls that made it look like the user was the enforcer.
- Quick start contract trimmed to drop redundant enforcement: sync_block and severity: critical fields (sensible defaults).
- aicontracts init template emits the trimmed shape too, so the README matches what `init --template coding -o AGENT_CONTRACT.yaml` actually writes.
- All CLI examples now use the `aicontracts` console script instead of `python -m agent_contracts.cli`.
- Verdict artifact JSON example pruned (drops final_gate, tool_calls).
- New "Shell command matching: threat model" section documents the strict-reject behavior and the v0.3.x roadmap for token-based matching.

Tests
=====

196 tests pass on Python 3.12 with [dev,claude,openai,langchain] installed (was 183, +13 shell bypass regression cases). All 5 real-SDK integration tests still pass. Lint + mypy clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pyyush · 2026-04-07T06:53:17Z

Pushed 1a73734 — security fix + README launch rewrite.

Critical security fix

Shell command injection bypass closed. Pattern "python -m pytest *" previously matched "python -m pytest tests/ ; rm -rf /" because fnmatch's * consumed shell operators. The matcher now strict-rejects any command containing ;, &, |, <, >, `, $(, or newline. New ShellMetacharacterError subclasses EffectDeniedError. 13 regression tests cover the bypass vectors.
Threat model section added to README documenting the strict-reject behavior and the v0.3.x roadmap for richer (shlex-based) matching.
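The bypass and the strict-reject fix can be reproduced in a few lines. This is a minimal sketch, not the library's code: the exception names come from the PR, but the classes here are local stand-ins, and `check_shell_command` and `SHELL_METACHARACTERS` are hypothetical names.

```python
import fnmatch
from typing import Iterable

# Operators named in the PR. "&&", "||", and ">>" are caught by their
# single-character forms, so they need no separate entries.
SHELL_METACHARACTERS = (";", "&", "|", "<", ">", "`", "$(", "\n")


class EffectDeniedError(Exception):
    """Local stand-in for the library's base denial error."""


class ShellMetacharacterError(EffectDeniedError):
    """Raised when a command contains a shell operator."""


def check_shell_command(command: str, allowlist: Iterable[str]) -> None:
    """Raise unless the command is operator-free and matches an allowlist glob."""
    # Reject operators first, regardless of whether a pattern would match.
    for token in SHELL_METACHARACTERS:
        if token in command:
            raise ShellMetacharacterError(
                f"shell operator {token!r} in command: {command!r}"
            )
    if not any(fnmatch.fnmatch(command, pattern) for pattern in allowlist):
        raise EffectDeniedError(f"no allowlist entry matches: {command!r}")


# The original bug: the glob alone accepts a chained command.
chained = "python -m pytest tests/ ; rm -rf /"
glob_alone_accepts = fnmatch.fnmatch(chained, "python -m pytest *")
```

Because `ShellMetacharacterError` subclasses `EffectDeniedError`, an existing `except EffectDeniedError` handler still catches the new case, while code that needs to tell the two denials apart can catch the subclass first.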

README rewrite for launch

Surfaced by parallel reviews (positioning, competitive landscape, code/claim audit, launch craft).

Headline: Declare what your coding agent may read, write, run, and spend — in one YAML file. Enforced at runtime. Gated in CI. Fails closed.
New "Why this, why now" section grounding the launch in 2026 coding-agent failure modes
New "What an agent cannot do under a contract" before/after table — concrete agent attempts (.env writes, rm -rf, shell injection, network, token overrun, fake green runs) mapped to verdict outcomes
Quick start step 2 now shows the Claude Agent SDK adapter forwarding tool calls into the enforcer, instead of manual check_file_* calls that made it look like the user enforces the contract
All CLI examples switched from python -m agent_contracts.cli to the aicontracts console script
aicontracts init --template coding now emits the trimmed contract shape (no redundant enforcement: sync_block / severity: critical) — README YAML matches the template output exactly
Verdict JSON example pruned (final_gate, tool_calls)

Test status

Python 3.12 + [dev,claude,openai,langchain]: 196 tests pass (was 183, +13 shell bypass regression cases)
Python 3.9 (no extras): tests pass, 5 real-SDK integration tests skip as expected
ruff + mypy clean
`aicontracts init --template coding -o file && aicontracts validate file` round-trips and validates as Tier 2

Deferred to v0.2.1

shlex-based token matcher for richer command shapes (current strict-reject is the safe default; the trade-off is documented)
Trim "Why YAML, not Markdown?" section from ~400 to ~120 words and move to appendix

Holding the tag for 24h per your request — your eyeball pass tomorrow before merge.

pyyush and others added 4 commits April 6, 2026 15:43

pyyush merged commit 28b521d into main Apr 9, 2026
5 checks passed

pyyush merged 4 commits into main from release/v0.2.0
