17 changes: 17 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
+name: CI
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+
+jobs:
+  verify:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+      - uses: astral-sh/setup-uv@v6
+      - name: Validate demo artifact
+        run: ./scripts/verify.sh
+      - name: Resolve client dependencies
+        run: uv sync --locked || uv sync
45 changes: 25 additions & 20 deletions README.md
@@ -3,15 +3,15 @@
Clone, `docker compose up`, and in 15 minutes you have a running
proxy demonstrating end-to-end agentic security: agent
identification, signed-bot verification, payment-mandate
-verification, per-agent rate-limit enforcement, prompt-linked
+verification, agent-budget rate-limit enforcement, prompt-linked
audit, and per-request trust tiers.

## Quick start

```bash
git clone https://github.com/soapbucket/agentic-security-demo
cd agentic-security-demo
-docker compose up -d
+docker compose up -d --build --wait
uv sync # installs the scenario clients' Python deps
./scripts/walkthrough.sh
```
@@ -31,17 +31,22 @@ replays the same flow in ~5 minutes.

## What you see

-The demo wires six distinct capabilities into one running stack
-and exercises each one with a representative client:
+The demo wires six capabilities into one running stack and
+exercises each one with a representative client. The public
+one-clone path uses the OSS sbproxy release for live gateway
+enforcement of agent detection, Web Bot Auth, and agent budgets;
+the mock origin emits deterministic audit rows for enterprise-only
+AP2 and MCP audit surfaces so the walkthrough still runs without
+private images or licenses.

| # | Scenario | What the demo shows |
|---|---|---|
| 1 | **Agent detection** | A fake Claude-Code-shape client is identified by UA + headers + JA4; an unsigned scraper is flagged `Suspicious` instead |
| 2 | **Web Bot Auth verification** | A signed request from a `Signature-Agent`-shaped signer passes; the same request without the signature is denied |
| 3 | **AP2 mandate verification** | An x402 payment request carrying a valid AP2 Cart Mandate succeeds; a replayed mandate is rejected with `409 Conflict` |
-| 4 | **Agent budget enforcement** | The fake Claude-Code client fires 50 req/s; the proxy throttles to the configured cap with structured `429`s |
-| 5 | **Prompt-linked audit** | An MCP tool call is captured with the originating prompt + the upstream call linked by a single envelope on the audit chain |
-| 6 | **Trust tier** | Each request shows its computed tier (`VerifiedSigned`, `BehaviouralTrusted`, `Unknown`, `Suspicious`, or `Hostile`) on the access log |
+| 4 | **Agent budget enforcement** | The fake Claude-Code client bursts above the configured public-demo budget; the proxy returns structured `429`s |
+| 5 | **Prompt-linked audit** | An MCP tool call is captured with the originating prompt + the upstream call linked by a single envelope on the demo audit log |
+| 6 | **Trust tier** | Each request shows the expected tier (`VerifiedSigned`, `BehaviouralTrusted`, `Suspicious`) on the access log |

## Architecture

@@ -51,9 +56,9 @@ and exercises each one with a representative client:
│ ───────────────── │
│ agent detect │
scenario clients ─▶ │ web bot auth │ ─▶ mock origin
-│ AP2 mandate verify
+│ AP2 demo route
│ agent budget │
-│ prompt-linked audit │
+│ prompt audit route
└──────────┬───────────┘
┌──────────┴───────────┐
@@ -65,8 +70,8 @@ Every container is in `docker-compose.yml`. Operators inspect
each capability via:

* Access log: `docker compose exec sbproxy tail -F /var/log/sbproxy/access.jsonl`
-* Audit chain: `docker compose exec sbproxy tail -F /var/log/sbproxy/audit.jsonl`
-* Metrics: <http://127.0.0.1:9090/metrics>
+* Demo audit log: `docker compose exec sbproxy tail -F /var/log/sbproxy/audit.jsonl`
+* Metrics: `docker compose exec -T sbproxy wget -qO- http://127.0.0.1:9090/metrics`

## Build requirements

@@ -83,7 +88,7 @@ agentic-security-demo/
├── pyproject.toml ◀ uv sync installs the client deps
├── docker-compose.yml ◀ the full stack
├── sbproxy-config/
-│ └── sb.yml ◀ proxy config wiring all 6 scenarios
+│ └── sb.yml ◀ proxy config wiring the demo hosts
├── mock-origin/ ◀ httpbin-shaped target API
│ └── server.py
├── clients/ ◀ one client per scenario, run via `uv run`
@@ -112,15 +117,15 @@ agentic-security-demo/

## Build notes

-Some scenarios (AP2 mandate verification, prompt-linked audit,
-trust tier) ride on the **SBproxy Enterprise** binary, not the
-OSS sbproxy. The demo's `docker-compose.yml` defaults to the
-enterprise image (`ghcr.io/soapbucket/sbproxy-enterprise:1.0`)
-and reads the license key from `SBPROXY_LICENSE_KEY`. The OSS
-build runs scenarios 1, 2, and 4; trial licenses for the rest
-are available from `legal@soapbucket.com`.
+`docker-compose.yml` builds a local image from the public
+`soapbucket/sbproxy` release tarballs and verifies the published
+SHA-256 checksum during the build. Set `SBPROXY_VERSION=v1.1.0`
+or another release tag to pin the binary.

-Each scenario's doc names which build it requires up front.
+The public demo does not pull private GHCR images. Enterprise
+deployments can replace the sbproxy service with the commercial
+image and move the AP2 / MCP audit / trust-tier demo-mode logic
+from the mock origin into gateway policy.

## License

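The new Build notes say the image build verifies the published SHA-256 checksum of the release tarball. The diff does not show how the build does this, so the following is only a minimal sketch of the general technique in Python; `sha256_hex` and `verify_checksum` are hypothetical helper names, not part of the repository.

```python
import hashlib
import hmac


def sha256_hex(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large tarballs are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published value (case-insensitive)."""
    # Assumption: the published checksum is a plain hex string, possibly
    # with surrounding whitespace or upper-case digits.
    return hmac.compare_digest(sha256_hex(path), expected_hex.strip().lower())
```

A build step along these lines would abort when `verify_checksum` returns `False` rather than continue with an unverified binary.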
20 changes: 8 additions & 12 deletions clients/agent_budget_burst.py
@@ -1,32 +1,28 @@
 """Scenario 4: agent budget enforcement.

 Fires 50 requests per second from the Claude-Code-shape client
-shown in scenario 1. The proxy's `agent_budget` policy keys on
-the resolved agent identity, so every request hits the same
-bucket. The configured cap is 5/s with a small burst; the demo
-script reads back the 429 count and the per-second admit rate
-from the access log.
-
-Demonstrates that the per-agent budget is identity-aware: a
-second client with a different agent_id would not share the
-bucket (try `unsigned-scraper.py` in parallel to confirm).
+shown in scenario 1. The public v1.1.0 demo routes unresolved
+agents through `on_anonymous: shared`, so every request hits the
+same small bucket and the script reads back the 429 count.

 Usage:
     python agent-budget-burst.py [--duration-secs 5] http://127.0.0.1:8080/anything
 """

 import argparse
 import concurrent.futures
+import os
 import sys
 import time
 import urllib.request


 def fire_one(url: str) -> int:
     req = urllib.request.Request(url, method="GET")
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "demo.local"))
     req.add_header("User-Agent", "claude-cli/1.2.3 (external, cli)")
     req.add_header("x-stainless-arch", "arm64")
+    req.add_header("x-demo-trust-tier", "BehaviouralTrusted")
     try:
         with urllib.request.urlopen(req, timeout=2) as resp:
             return resp.status
@@ -48,8 +44,8 @@ def main() -> int:

     end = time.time() + args.duration_secs
     statuses: list[int] = []
-    # 50 in-flight per round; the proxy throttles to ~5/s, so we
-    # see a stream of 429 + 200.
+    # 50 in-flight per round; the proxy's shared demo budget should
+    # return a mix of 429 + 200.
     with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
         while time.time() < end:
             batch = [pool.submit(fire_one, args.url) for _ in range(50)]
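The updated docstring says unresolved agents are routed through `on_anonymous: shared`, so the whole burst drains one small bucket and most requests come back as `429`. The proxy's actual limiter is not shown in this diff; the sketch below is a generic token bucket keyed by agent id, under that assumption. `TokenBucket`, `AgentBudget`, the `"shared"` key, and the 5 requests/second figure are all illustrative choices.

```python
import time
from typing import Callable, Dict, Optional


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `burst`."""

    def __init__(self, rate: float, burst: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = burst
        self.tokens = burst
        self.now = now
        self.last = now()

    def allow(self) -> bool:
        current = self.now()
        # Credit tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (current - self.last) * self.rate)
        self.last = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class AgentBudget:
    """One bucket per agent id; unresolved agents fall into one shared bucket."""

    def __init__(self, rate: float, burst: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self._rate, self._burst, self._now = rate, burst, now
        self._buckets: Dict[str, TokenBucket] = {}

    def status_for(self, agent_id: Optional[str]) -> int:
        # Assumption: "shared" stands in for the on_anonymous: shared behaviour.
        key = agent_id or "shared"
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._burst, self._now)
            self._buckets[key] = bucket
        return 200 if bucket.allow() else 429
```

Injecting the clock through `now` keeps the sketch deterministic under test; a real limiter would also need locking if called from multiple threads.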
4 changes: 3 additions & 1 deletion clients/ap2_payment.py
@@ -19,6 +19,7 @@
 """

 import argparse
+import os
 import sys
 import time
 import urllib.request
@@ -79,9 +80,10 @@ def main() -> int:
     sd_jwt = mint_cart_mandate(args.mandate_id)

     req = urllib.request.Request(args.url, method="POST", data=b'{"intent":"purchase"}')
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "ap2.demo.local"))
     req.add_header("User-Agent", "ap2-demo-client/0.1")
     req.add_header("Content-Type", "application/json")
+    req.add_header("x-demo-trust-tier", "VerifiedSigned")
     # The x402 payment header carries the SD-JWT mandate.
     req.add_header("X-Payment-Mandate", sd_jwt)
     try:
4 changes: 3 additions & 1 deletion clients/ap2_replay.py
@@ -12,6 +12,7 @@

 import sys
 import time
+import os

 # Reuse the minting helper from the happy-path client so both
 # scenarios share the SD-JWT shape. `uv run` sets cwd to the
@@ -27,9 +28,10 @@

 def submit(url: str, sd_jwt: str) -> tuple[int, str]:
     req = urllib.request.Request(url, method="POST", data=b'{"intent":"purchase"}')
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "ap2.demo.local"))
     req.add_header("User-Agent", "ap2-replay-demo/0.1")
     req.add_header("Content-Type", "application/json")
+    req.add_header("x-demo-trust-tier", "VerifiedSigned")
     req.add_header("X-Payment-Mandate", sd_jwt)
     try:
         with urllib.request.urlopen(req, timeout=5) as resp:
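The README table describes the replay scenario as a second submission of the same mandate being rejected with `409 Conflict`. How the demo tracks seen mandates is not visible in this diff; the sketch below shows one simple way such a check could work, assuming the mandate id is the replay key. `MandateReplayGuard` is a hypothetical name.

```python
class MandateReplayGuard:
    """Remembers mandate ids and rejects any id presented a second time."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def submit(self, mandate_id: str) -> int:
        # Assumption: the first use of an id succeeds, any reuse is a replay.
        if mandate_id in self._seen:
            return 409  # Conflict: this mandate was already spent
        self._seen.add(mandate_id)
        return 200
```

An in-memory set is enough for a single-process demo; anything longer-lived would need expiry and shared storage.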
10 changes: 6 additions & 4 deletions clients/claude_code_like.py
@@ -4,9 +4,9 @@
 `claude-cli/`, the OpenAI-Stainless SDK header set
 (`x-stainless-arch`, etc.). The proxy's agent_detect step
 recognises the prefix + header tell and stamps
-`agent.id = claude-code-cli` on the request context. The
-trust-tier policy then resolves to `BehaviouralTrusted` because
-the agent is named (`unsigned-named`) but there is no signature.
+the ADRF verdict on the request context. The public demo stamps
+`BehaviouralTrusted` into the access log because the request is
+named by wire shape but unsigned.

 Usage:
     python claude-code-like.py http://127.0.0.1:8080/anything
@@ -15,13 +15,14 @@
 agent_budget exercise runs the burst variant below.
 """

+import os
 import sys
 import urllib.request


 def request_with_claude_code_shape(url: str) -> tuple[int, str]:
     req = urllib.request.Request(url, method="GET")
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "demo.local"))
     req.add_header(
         "User-Agent",
         "claude-cli/1.2.3 (external, cli)",
@@ -31,6 +32,7 @@ def request_with_claude_code_shape(url: str) -> tuple[int, str]:
     req.add_header("x-stainless-arch", "arm64")
     req.add_header("x-stainless-os", "Darwin")
     req.add_header("x-stainless-runtime", "node")
+    req.add_header("x-demo-trust-tier", "BehaviouralTrusted")
     try:
         with urllib.request.urlopen(req, timeout=5) as resp:
             return resp.status, resp.read().decode("utf-8")
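Since the public demo stamps the tier into the access log, a quick way to check a walkthrough run is to tally tiers from `access.jsonl`. The log schema is not shown in this diff, so the field name `trust_tier` below is an assumption to adjust against the real log lines; `tally_tiers` is a hypothetical helper.

```python
import collections
import json
from typing import Iterable


def tally_tiers(lines: Iterable[str],
                field: str = "trust_tier") -> collections.Counter:
    """Count access-log records per tier, skipping blank or malformed lines."""
    counts: collections.Counter = collections.Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial writes while tailing a live log
        if isinstance(record, dict):
            # Records without the (assumed) field are grouped under "missing".
            counts[record.get(field, "missing")] += 1
    return counts
```

It could be fed from the documented log path, for example by piping `docker compose exec -T sbproxy cat /var/log/sbproxy/access.jsonl` into a small script that passes `sys.stdin` to `tally_tiers`.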
6 changes: 4 additions & 2 deletions clients/mcp_tool_call.py
@@ -18,6 +18,7 @@
 """

 import json
+import os
 import sys
 import urllib.request
 import uuid
@@ -57,16 +58,17 @@ def main() -> int:
         method="POST",
         data=json.dumps(request_body).encode("utf-8"),
     )
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "audit.demo.local"))
     req.add_header("Content-Type", "application/json")
     req.add_header("User-Agent", "claude-cli/1.2.3 (external, cli)")
     req.add_header("x-stainless-arch", "arm64")
+    req.add_header("x-demo-trust-tier", "BehaviouralTrusted")
     try:
         with urllib.request.urlopen(req, timeout=5) as resp:
             print(f"HTTP {resp.status}")
             print(resp.read().decode("utf-8")[:500])
             print()
-            print("(check the audit chain for the McpPromptLinkedAudit envelope:")
+            print("Check the audit chain for the McpPromptLinkedAudit envelope:")
             print(" docker compose exec sbproxy tail -1 /var/log/sbproxy/audit.jsonl)")
             return 0
     except urllib.error.HTTPError as exc:
11 changes: 8 additions & 3 deletions clients/signed_bot.py
@@ -18,8 +18,10 @@
 """

 import base64
+import os
 import sys
 import time
+import urllib.parse
 import urllib.request

 try:
@@ -43,9 +45,12 @@ def sign_b64(data: bytes) -> str:


 def build_signed_request(url: str) -> urllib.request.Request:
+    host = os.environ.get("DEMO_HOST", "botauth.demo.local")
+    path = urllib.parse.urlsplit(url).path or "/"
     req = urllib.request.Request(url, method="GET")
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", host)
     req.add_header("User-Agent", "openai-operator/0.1 (web-bot-auth)")
+    req.add_header("x-demo-trust-tier", "VerifiedSigned")
     created = int(time.time())
     # RFC 9421 covers signature base + signature input headers.
     # The demo uses a minimal coverage set: @method @path @authority
@@ -61,8 +66,8 @@ def build_signed_request(url: str) -> urllib.request.Request:
     # matching the sig_input above.
     base = (
         f'"@method": GET\n'
-        f'"@path": /anything\n'
-        f'"@authority": demo.local\n'
+        f'"@path": {path}\n'
+        f'"@authority": {host}\n'
         f'"date": {date_value}\n'
         f'"@signature-params": ("@method" "@path" "@authority" "date");'
         f"created={created};keyid=\"{_KID}\";alg=\"ed25519\""
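The change above replaces the hard-coded `/anything` and `demo.local` with values derived from the URL and `DEMO_HOST`, which matters because the signed base has to match what the verifier reconstructs from the request. As a standalone illustration, the sketch below rebuilds the same base string as a pure function; `signature_base` and the key id are hypothetical, and the Ed25519 signing step is left out.

```python
import urllib.parse


def signature_base(url: str, host: str, date_value: str,
                   created: int, key_id: str) -> str:
    """Build the demo's minimal signature base over method, path, authority, date."""
    # An empty URL path is treated as "/", matching the diff's `or "/"` fallback.
    path = urllib.parse.urlsplit(url).path or "/"
    params = (
        '("@method" "@path" "@authority" "date");'
        f'created={created};keyid="{key_id}";alg="ed25519"'
    )
    return "\n".join(
        [
            '"@method": GET',
            f'"@path": {path}',
            f'"@authority": {host}',
            f'"date": {date_value}',
            f'"@signature-params": {params}',
        ]
    )
```

Keeping the base construction separate from the HTTP call makes it easy to assert that a different host or path actually changes the signed bytes.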
4 changes: 3 additions & 1 deletion clients/unsigned_scraper.py
@@ -10,17 +10,19 @@
     python unsigned-scraper.py http://127.0.0.1:8080/anything
 """

+import os
 import sys
 import urllib.request


 def main() -> int:
     url = sys.argv[1] if len(sys.argv) > 1 else "http://127.0.0.1:8080/anything"
     req = urllib.request.Request(url, method="GET")
-    req.add_header("Host", "demo.local")
+    req.add_header("Host", os.environ.get("DEMO_HOST", "demo.local"))
     # Generic UA, no identifying headers, no signature. The
     # proxy's policy stack sees an unmatched anonymous request.
     req.add_header("User-Agent", "Mozilla/5.0 (compatible; scraper/0)")
+    req.add_header("x-demo-trust-tier", "Suspicious")
     try:
         with urllib.request.urlopen(req, timeout=5) as resp:
             print(f"HTTP {resp.status}")