release: v0.2.0 — repo-local guardrails, framework-agnostic core, pinned adapters#2

Merged
pyyush merged 4 commits into main from release/v0.2.0
Apr 9, 2026

Conversation

pyyush commented Apr 6, 2026

Summary

  • Repositions agent-contracts as repo-local, fail-closed guardrails for autonomous coding/build agents: filesystem read/write scopes, shell command authorization, shell-command budgets, and a durable verdict artifact CI can gate on.
  • Establishes the CI verdict gate as the source of truth for enforcement. The contract, CLI, verdict artifact, and GitHub Action are framework-agnostic and provider-agnostic. In-runtime adapters are optional ergonomic helpers.
  • Pins all framework adapter SDKs to exact versions, gates them on Python 3.10+ (core stays 3.9+), and adds real-SDK integration tests so adapters are validated against actual installed SDK base classes — not stub fallbacks.
  • Drops the CrewAI and Pydantic AI adapters and extras. The Vercel AI SDK adapter and a TypeScript companion package are on the roadmap for v0.3.0.
  • Adds a "Why YAML, not Markdown?" design rationale section to the README.
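To make "fail-closed verdict gate" concrete, here is a minimal sketch of a CI step that gates on a verdict file. It is not the project's implementation: the function name `gate`, the `outcome` key, and the `"pass"` value are assumptions for illustration, since the verdict schema is not shown in this PR. The point is the shape of the logic: anything other than an explicit pass, including a missing or unreadable artifact, fails the build.

```python
import json
from pathlib import Path


def gate(verdict_path: str) -> int:
    """Return a process exit code: 0 only for an explicit pass, 1 otherwise."""
    try:
        verdict = json.loads(Path(verdict_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed artifact: fail closed.
        return 1
    if not isinstance(verdict, dict):
        return 1
    # "outcome" and "pass" are hypothetical field names, not the real schema.
    return 0 if verdict.get("outcome") == "pass" else 1


# Typical use in a CI step (hypothetical file name):
#   raise SystemExit(gate("verdict.json"))
```

Because the gate treats absence of evidence as failure, an agent run that crashes before writing a verdict cannot produce a green build.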

Adapter matrix

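| Framework         | Extra                  | Pinned SDK               |
|-------------------|------------------------|--------------------------|
| Claude Agent SDK  | aicontracts[claude]    | claude-agent-sdk==0.1.56 |
| OpenAI Agents SDK | aicontracts[openai]    | openai-agents==0.13.5    |
| LangChain         | aicontracts[langchain] | langchain-core==1.2.26   |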

Bug fixed in this PR

  • The OpenAI adapter previously used the import "from openai_agents import RunHooks". The package's actual import path is "from agents import RunHooks", so the old import would have failed at first use once a user installed the [openai] extra. Now fixed and verified against the installed SDK.
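This class of bug (distribution name differs from import name) is what a real-SDK check is meant to catch. The sketch below uses only the standard library to resolve a class from an optionally installed module. The helper `resolve_attr` is hypothetical and is not part of agent-contracts; the only facts taken from this PR are the module name `agents` and the class name `RunHooks`.

```python
import importlib
from typing import Optional


def resolve_attr(module_name: str, attr: str) -> Optional[object]:
    """Return module.attr if the module imports and defines it, else None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # SDK not installed (or wrong import name): nothing to validate against.
        return None
    return getattr(module, attr, None)


if __name__ == "__main__":
    # Per this PR, the openai-agents distribution is imported as "agents".
    hooks_base = resolve_attr("agents", "RunHooks")
    if hooks_base is None:
        print("agents.RunHooks not importable; real-SDK check skipped")
    else:
        print(f"resolved {hooks_base.__module__}.{hooks_base.__qualname__}")
```

In a pytest suite the same idea is usually expressed with `pytest.importorskip`, which is what the commit message says the integration tests use.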

Removed

  • CrewAI adapter (src/agent_contracts/adapters/crewai.py) + [crewai] extra
  • Pydantic AI adapter (src/agent_contracts/adapters/pydantic_ai.py) + [pydantic-ai] extra
  • Internal planning files (CLAUDE.md, docs/plans/) — now ignored and gated by publish.yml repo-hygiene check

Verification performed locally

  • ✅ Python 3.9 (system, no extras): 177 tests pass, 5 real-SDK integration tests skipped as expected
  • ✅ Python 3.12 (fresh venv with [dev,claude,openai,langchain]): 183 tests pass, all real-SDK integration tests pass against the actual installed SDK base classes
  • ruff check src/ tests/ — clean
  • mypy src/agent_contracts — clean (18 source files)
  • aicontracts validate AGENT_CONTRACT.yaml — passes, Tier 2

Test plan

  • CI matrix passes on all of 3.9, 3.10, 3.11, 3.12, 3.13
  • Real-SDK integration tests run (not skipped) on 3.10+ matrix entries
  • After merge: tag v0.2.0 to trigger publish.yml → PyPI publish + GitHub Release
  • After publish: pip install aicontracts==0.2.0 works in a fresh venv

Release sequence after merge

  1. git checkout main && git pull
  2. git tag v0.2.0 && git push origin v0.2.0 — triggers PyPI publish + GitHub Release
  3. Verify aicontracts==0.2.0 on PyPI
  4. Post HN / LinkedIn

🤖 Generated with Claude Code

pyyush and others added 4 commits April 6, 2026 15:43
Repositions agent-contracts around fail-closed guardrails for autonomous
coding/build agents in a repository: filesystem read/write scopes, shell
command authorization, shell-command budgets, and a durable verdict
artifact that CI can gate on.

- filesystem read/write authorization scopes
- shell command authorization scopes + max_shell_commands budget
- verdict artifact emission and CLI verdict gating
- coding-agent trace bootstrap improvements
- demo contracts for blocked file writes, blocked commands, failed checks
- canonical AGENT_CONTRACT.yaml repositioned as a repo-build agent
- README, spec, and examples rewritten around the coding/build scope
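
As an illustration of what a fail-closed filesystem write scope can look like, here is a minimal sketch. It assumes scopes are expressed as glob patterns over repo-relative POSIX paths, which this PR does not confirm; the function name `is_write_allowed` and the example patterns are hypothetical and are not the project's API.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def is_write_allowed(path: str, write_scopes: List[str]) -> bool:
    """Allow a write only if the path is repo-relative and matches a scope."""
    candidate = PurePosixPath(path)
    # Absolute paths and parent-directory traversal are denied outright.
    if candidate.is_absolute() or ".." in candidate.parts:
        return False
    # No matching scope (including an empty scope list) means deny.
    return any(fnmatchcase(str(candidate), scope) for scope in write_scopes)


scopes = ["src/*", "tests/*"]  # hypothetical contract scopes
print(is_write_allowed("src/app.py", scopes))
print(is_write_allowed(".env", scopes))
print(is_write_allowed("src/../.env", scopes))
```

Note that `fnmatchcase` lets `*` match across `/`, so `src/*` covers nested files in this sketch; a real implementation may choose stricter semantics.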

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pinned versions

The contract, CLI, verdict artifact, and GitHub Action are framework-
and provider-agnostic by design. The CI verdict gate is the source of
truth for enforcement; in-runtime adapters are optional ergonomic
helpers that forward host hook calls into the same enforcer.

- pin claude-agent-sdk==0.1.56, openai-agents==0.13.5,
  langchain-core==1.2.26 in their respective extras
- gate all three SDK extras on Python 3.10+ (core stays 3.9+)
- fix OpenAI adapter import path (from agents import RunHooks)
- add real-SDK integration tests using pytest.importorskip so adapters
  are validated against the actual installed SDK base classes / hook
  surfaces, not stub fallbacks
- wire CI to install [claude,openai,langchain] extras on Python 3.10+
  matrix entries so the integration tests run
- mypy: skip following imports into framework SDKs (newer-Python syntax)
- drop CrewAI and Pydantic AI adapters/extras/tests
- README: lead with "CI verdict gate = source of truth", document
  pinned SDK versions, add v0.3.0 TypeScript adapter roadmap

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explains why the contract is a structured YAML artifact rather than
prose: deterministic parse, typed fields for fail-closed enforcement,
diff-friendly review, versioned schema, and consistency with existing
cloud-native policy formats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…launch

Security fix
============

The shell command matcher used fnmatch.fnmatch(). Pattern "python -m pytest *"
matched commands like "python -m pytest tests/ ; rm -rf /" because the * glob
consumed shell operators (;, &&, ||, |, &, >, <, `, $(, newline) as ordinary
characters. An agent could bypass any allowlist entry by appending arbitrary
chained or substituted commands after an authorized prefix.

v0.2.x now strict-rejects any command containing one of those metacharacters,
regardless of pattern match. New ShellMetacharacterError subclasses
EffectDeniedError so existing handlers keep working but verdicts can
distinguish "matched no allowlist entry" from "attempted to chain commands".

Regression coverage in tests/test_effects.py covers ;, &&, ||, |, >, <, >>,
$(, backtick, newline, and trailing &.

A future v0.3.x may introduce a shlex-based token matcher for richer command
shapes; until then, strict reject is the only correct fail-closed behavior.
The README now documents the threat model and the trade-off explicitly.
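
The bypass and the fix can be reproduced in a few lines. This is a sketch under stated assumptions, not the code in this PR: the exception names come from the commit message, but the function `authorize_command`, the `SHELL_METACHARACTERS` tuple, and the class bodies are written here for illustration only.

```python
import fnmatch
from typing import List


class EffectDeniedError(Exception):
    """Command matched no allowlist entry (illustrative stand-in)."""


class ShellMetacharacterError(EffectDeniedError):
    """Command contained a shell operator (illustrative stand-in)."""


# Operators listed in the commit message. "&" also catches "&&",
# "|" also catches "||", and ">" also catches ">>".
SHELL_METACHARACTERS = (";", "&", "|", "<", ">", "`", "$(", "\n")


def authorize_command(command: str, allowlist: List[str]) -> None:
    """Raise unless the command is operator-free and matches an allowlist glob."""
    for token in SHELL_METACHARACTERS:
        if token in command:
            raise ShellMetacharacterError(
                f"shell operator {token!r} in command {command!r}"
            )
    if not any(fnmatch.fnmatch(command, pattern) for pattern in allowlist):
        raise EffectDeniedError(f"no allowlist entry matches {command!r}")


allowlist = ["python -m pytest *"]
attack = "python -m pytest tests/ ; rm -rf /"

# The glob alone accepts the chained command, which is the original bug.
print(fnmatch.fnmatch(attack, allowlist[0]))

# With the strict-reject pass in front, the same command is denied.
try:
    authorize_command(attack, allowlist)
except ShellMetacharacterError as exc:
    print(f"denied: {exc}")
```

Checking for operators before pattern matching is what lets a verdict distinguish the two failure kinds while `except EffectDeniedError` handlers keep working.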

README rewrite for launch
=========================

- New headline: "Declare what your coding agent may read, write, run, and
  spend — in one YAML file. Enforced at runtime. Gated in CI. Fails closed."
- New "Why this, why now" section grounding urgency in 2026 coding-agent
  failure modes (Claude Code, Codex, Cursor, Devin, Aider).
- New "What an agent cannot do under a contract" before/after table making
  every abstract term concrete (.env writes, rm -rf, shell injection,
  unauthorized network, token overruns, fake green runs).
- Step 2 of the quick start now shows the Claude Agent SDK adapter
  forwarding tool calls into the enforcer, instead of manual
  enforcer.check_file_read() calls that made it look like the user was
  the enforcer.
- Quick start contract trimmed to drop the redundant "enforcement: sync_block"
  and "severity: critical" fields (sensible defaults).
- aicontracts init template emits the trimmed shape too, so the README
  matches what `init --template coding -o AGENT_CONTRACT.yaml` actually
  writes.
- All CLI examples now use the `aicontracts` console script instead of
  `python -m agent_contracts.cli`.
- Verdict artifact JSON example pruned (drops final_gate, tool_calls).
- New "Shell command matching: threat model" section documents the
  strict-reject behavior and the v0.3.x roadmap for token-based matching.

Tests
=====

196 tests pass on Python 3.12 with [dev,claude,openai,langchain] installed
(was 183, +13 shell bypass regression cases). All 5 real-SDK integration
tests still pass. Lint + mypy clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pyyush commented Apr 7, 2026

Pushed 1a73734 — security fix + README launch rewrite.

Critical security fix

  • Shell command injection bypass closed. Pattern "python -m pytest *" previously matched "python -m pytest tests/ ; rm -rf /" because fnmatch's * consumed shell operators. The matcher now strict-rejects any command containing ;, &, |, <, >, `, $(, or newline. New ShellMetacharacterError subclasses EffectDeniedError. 13 regression tests cover the bypass vectors.
  • Threat model section added to README documenting the strict-reject behavior and the v0.3.x roadmap for richer (shlex-based) matching.

README rewrite for launch

These changes were surfaced by parallel reviews (positioning, competitive landscape, code/claim audit, launch craft).

  • Headline: Declare what your coding agent may read, write, run, and spend — in one YAML file. Enforced at runtime. Gated in CI. Fails closed.
  • New "Why this, why now" section grounding the launch in 2026 coding-agent failure modes
  • New "What an agent cannot do under a contract" before/after table — concrete agent attempts (.env writes, rm -rf, shell injection, network, token overrun, fake green runs) mapped to verdict outcomes
  • Quick start step 2 now shows the Claude Agent SDK adapter forwarding tool calls into the enforcer, instead of manual check_file_* calls that made it look like the user enforces the contract
  • All CLI examples switched from python -m agent_contracts.cli to the aicontracts console script
  • aicontracts init --template coding now emits the trimmed contract shape (no redundant enforcement: sync_block / severity: critical) — README YAML matches the template output exactly
  • Verdict JSON example pruned (drops final_gate, tool_calls)

Test status

  • Python 3.12 + [dev,claude,openai,langchain]: 196 tests pass (was 183, +13 shell bypass regression cases)
  • Python 3.9 (no extras): tests pass, 5 real-SDK integration tests skip as expected
  • ruff + mypy clean
  • aicontracts init --template coding -o file && aicontracts validate file round-trips, validates as Tier 2

Deferred to v0.2.1

  • shlex-based token matcher for richer command shapes (current strict-reject is the safe default; the trade-off is documented)
  • Trim "Why YAML, not Markdown?" section from ~400 to ~120 words and move to appendix

Holding the tag for 24h per your request, so you can do your eyeball pass tomorrow before merge.

@pyyush pyyush merged commit 28b521d into main Apr 9, 2026
5 checks passed