Infrastructure repository for agentic workloads: runtime, skills, harness, memory.
The L1 framework, the full L2 implementation wave, the L3 default-path
wiring + audit wave, and the third-audit + L3 capability wave are all on
main (see docs/backlog.md, ADR 0007, ADR 0010, ADR 0011).
Every L2/L3 change is additive to the L1 Protocols: new optional
parameters, new modules, and side-by-side Protocols; nothing in the L1
surface was removed. The package imports and type-checks with no
optional dependencies installed.
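One common way to keep a package importable without its optional extras is to defer the third-party import until the feature is first used. The sketch below illustrates that pattern only; `load_optional` and `LazyRedisBackedStore` are hypothetical names chosen for illustration and are not taken from this repository.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency on first use, not at package import time.

    Hypothetical helper: the name and the error message are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install 'agents[{extra}]'"
        ) from exc


class LazyRedisBackedStore:
    """Hypothetical adapter: defining and constructing it needs no extra."""

    def __init__(self, url: str) -> None:
        self._url = url
        self._client: Optional[Any] = None  # created on first use

    def client(self) -> Any:
        # The redis package is only imported here, when a client is needed.
        if self._client is None:
            redis = load_optional("redis", extra="redis")
            self._client = redis.Redis.from_url(self._url)
        return self._client
```

With this shape, importing the module and constructing the adapter succeed on a bare install; a missing extra only surfaces, with an actionable message, when `client()` is first called.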
See CLAUDE.md for repository structure and conventions.
| Path | Purpose |
| --- | --- |
| agents/ | operator CLI (python -m agents) |
| workloads/ | individual agent workloads + loader (in-tree and out-of-tree) |
| skills/ | Agent Skills bundles, registry, dispatchers, install sources |
| harness/ | contracts, enforcement, runtime adapter, budgets, events |
| memory/ | namespace-bound stores and production adapters |
| evaluation/ | behavioural regression gate (dispatch P@1/MRR, trajectory) |
| tests/ | test suite (mirrors the source layout) |
| docs/ | architecture, ADRs, the L2 backlog, generated JSON Schema |
| scripts/ | operational and developer scripts |

- Harness. Behavioral contracts (pre/invariant/post/governance, hard/soft severity), run_under_contract enforcement with opt-in default-path wiring (skill-contract composition, drift recording + threshold events, recovery directives, run-scoped lifecycles), action budgets (steps/tokens/wall-clock/tool-calls, per-tool quotas, plus a cost dimension and per-tool token/wall-clock caps, cumulative across an approval pause), structured OTel-ready events, Jensen-Shannon distributional drift, and opt-in self-attesting run-provenance records (record_sink, contract_digest, verify_run_record, the scripts/check_run_records.py offline gate).
- Provider batch capabilities (optional extras).
AnthropicBatchProcessor (Message Batches) and cache_control_system (prefix-stable prompt caching) under the anthropic extra; OpenAIBatchProcessor (OpenAI Batch API) under the openai extra. Async bulk processing at roughly 50% of the regular token price; both are lazily imported, so the package type-checks without either SDK.
- Runtime adapter.
PydanticAIRuntime wires the guard and budget into the tool-call path: every local and MCP tool call passes the same guard gate (approve / reject / require-approval). It also provides a wall-clock watchdog (preempts at an await boundary), streaming budget enforcement, a pause/ResumableState/resume approval flow, an opt-in RetryPolicy (backoff + circuit breaker), and an opt-in structured soft-reject. Provider selection and credentials: docs/runtime-providers.md.
- Memory. Namespace-bound
MemoryStore with InMemoryStore reference plus SQLiteStore, RedisStore, S3Store, DynamoDBStore adapters; extension Protocols for batch, cursor scan, content-addressing, CAS, MVCC version tokens (VersionedMemoryStore), and similarity query (SemanticMemoryStore + InMemorySemanticStore); TTLSweeper; transparent EncryptedStore (AES-256-GCM) with static / env / file / rotating (VersionedKeyProvider) key providers, and ACLStore with role and attribute-based (AttributeACL) policies and an audited AccessDenied event, both with wrap_encrypted / wrap_acl forwarding the wrapped backend's extension Protocols truthfully; optional audit events.
- Evaluation. A behavioural regression gate:
evaluate_dispatch (P@1 / MRR over a JSON golden set) and evaluate_trajectory (expected vs actual contract terminal outcome), deterministic and network-free, run as a blocking CI job via scripts/eval.py.
- Skills. Agent Skills spec-compliant loader/registry, skill
versioning (name@version), seven router dispatchers (the five core keyword, LLM, lane, routing-chain, skill-based, plus the L2 multi-ensemble and embedding), an InstrumentedDispatcher telemetry wrapper, and a default_dispatcher factory for the recommended instrumented chain; a deterministic HashingEmbeddingProvider; skill-level contracts; and pluggable install sources (local, GitHub, marketplace) with bounded symlink-safe extraction, optional checksum and signature verification, and gated contract execution for untrusted bundles.
- CLI.
python -m agents workloads list | skills list | skills install <name> --from <src> | run <wl> <q> [--json].
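The Harness item above mentions Jensen-Shannon distributional drift. As a rough illustration of the metric itself, the sketch below computes the Jensen-Shannon divergence (base 2, so the result lies in [0, 1]) between two count distributions, such as baseline and observed tool-call frequencies. The function names and the choice of tool-call counts as input are assumptions for illustration; the repository's own drift API is not shown here.

```python
import math
from collections import Counter
from typing import Dict, Mapping


def _normalize(counts: Mapping[str, float]) -> Dict[str, float]:
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("distribution must have positive total mass")
    return {key: value / total for key, value in counts.items()}


def js_divergence(p_counts: Mapping[str, float], q_counts: Mapping[str, float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    p = _normalize(p_counts)
    q = _normalize(q_counts)
    support = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint support.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in support}

    def kl_to_mixture(a: Mapping[str, float]) -> float:
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(pk * math.log2(pk / m[k]) for k, pk in a.items() if pk > 0.0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical usage: compare a baseline tool-call mix with an observed one.
baseline = Counter(["search", "search", "read"])
observed = Counter(["search", "write", "write"])
drift = js_divergence(baseline, observed)
```

A drift threshold could then be expressed as a cut-off on this value; where that threshold sits is a policy decision this sketch does not address.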
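The Evaluation item above scores dispatch with P@1 and MRR over a golden set. The sketch below shows how those two metrics are conventionally computed; `GoldenCase` and both function names are hypothetical, and the actual signature of evaluate_dispatch and the golden-set JSON layout are not reproduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GoldenCase:
    """Hypothetical golden-set entry: one query and the dispatcher's ranking."""

    query: str
    expected: str          # the skill that should be chosen
    ranked: Sequence[str]  # candidates returned by the dispatcher, best first


def precision_at_1(cases: Sequence[GoldenCase]) -> float:
    """Fraction of cases whose top-ranked candidate is the expected one."""
    if not cases:
        raise ValueError("golden set is empty")
    hits = sum(1 for c in cases if c.ranked and c.ranked[0] == c.expected)
    return hits / len(cases)


def mean_reciprocal_rank(cases: Sequence[GoldenCase]) -> float:
    """Mean of 1/rank of the expected candidate; 0 when it is absent."""
    if not cases:
        raise ValueError("golden set is empty")
    total = 0.0
    for case in cases:
        for rank, name in enumerate(case.ranked, start=1):
            if name == case.expected:
                total += 1.0 / rank
                break
    return total / len(cases)


cases = [
    GoldenCase("summarise this file", "summarise", ["summarise", "search"]),
    GoldenCase("find the config", "search", ["summarise", "search"]),
    GoldenCase("deploy the service", "deploy", ["summarise", "search"]),
]
```

Both functions are pure and need no network access, which is consistent with running such a check deterministically in CI.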
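To make the action-budget idea from the Harness item concrete, here is a minimal sketch of a budget that tracks steps, total tool calls, and per-tool quotas, and refuses a charge that would exceed any limit. `ActionBudget`, `BudgetExceeded`, and all field names are assumptions for illustration; the repository's budget types, and dimensions such as tokens, wall-clock, and cost, are not modelled here.

```python
from dataclasses import dataclass, field
from typing import Dict


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push usage past a configured limit."""


@dataclass
class ActionBudget:
    """Hypothetical budget over steps, tool calls, and per-tool quotas."""

    max_steps: int
    max_tool_calls: int
    per_tool_quota: Dict[str, int] = field(default_factory=dict)
    steps: int = 0
    tool_calls: int = 0
    per_tool_used: Dict[str, int] = field(default_factory=dict)

    def charge_step(self) -> None:
        if self.steps + 1 > self.max_steps:
            raise BudgetExceeded("step budget exhausted")
        self.steps += 1

    def charge_tool_call(self, tool: str) -> None:
        used = self.per_tool_used.get(tool, 0)
        quota = self.per_tool_quota.get(tool)  # None means no per-tool cap
        # Check every limit before mutating, so a rejected call costs nothing.
        if self.tool_calls + 1 > self.max_tool_calls:
            raise BudgetExceeded("tool-call budget exhausted")
        if quota is not None and used + 1 > quota:
            raise BudgetExceeded(f"quota for tool {tool!r} exhausted")
        self.tool_calls += 1
        self.per_tool_used[tool] = used + 1
```

Because usage lives on the budget object, the same instance can be carried across a pause and resume so that consumption stays cumulative, which is the behaviour the Harness item describes for approval pauses.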
uv sync --all-extras # dev: every adapter + test doubles

Production backends are optional extras, lazily imported:

pip install 'agents[redis]' # RedisStore
pip install 'agents[aws]' # S3Store, DynamoDBStore
pip install 'agents[crypto]' # EncryptedStore (AES-256-GCM)
pip install 'agents[otel]' # OTelSink (OTLP/HTTP)

make check # ruff + mypy + pytest
make schema # regenerate docs/schema/*.json from the models
uv run python scripts/eval.py # the BL-130 dispatch regression gate

Pre-1.0 infrastructure. See STATUS.md for phase and document maturity, LIMITATIONS.md for explicit scope boundaries and known gaps, CHANGELOG.md for material changes, docs/releasing.md for the versioning, release, and operations policy, and SECURITY.md for the hardening posture and disclosure process. Roadmap: docs/backlog.md; decisions: docs/adr/.