test(harness): full x402 exact interop matrix (client/server cross-pairs + parity locks) #133

Closed
EfeDurmaz16 wants to merge 16 commits into solana-foundation:main from EfeDurmaz16:pr/x402-harness-matrix

EfeDurmaz16 (Collaborator) commented May 25, 2026

Summary

Hardens the x402 exact interop harness with a three-tier test architecture stacked on #132: a fast wire-compat suite, the existing env-gated cross-spine matrix expanded with per-language self-pair grouping, and a new live-matrix tier that enumerates every allowedPair automatically.

Scope

  • Tier 1 wire compat: harness/test/x402-exact.compat.test.ts drives every registered fast in-process adapter against canonical fixtures and an attack suite (17 tests)
  • Tier 2 cross-spine + self-pair: harness/test/x402-exact.e2e.test.ts expanded with explicit per-language self-pair group so regressions stand out
  • Tier 3 live full matrix: harness/test/x402-exact.live.matrix.test.ts enumerates EVERY allowedPair from the shared pair policy in harness/src/x402-pair-policy.ts
  • Parity-lock fixtures at harness/fixtures/x402-exact/: canonical-challenge.json, canonical-payment-signature.json, canonical-payment-signature-rust.json, canonical-reject-tokens.json, attack-scenarios.json
  • Replay and network-mismatch scenarios that demand canonical reject tokens rather than the generic payment_invalid fallback

Files changed

  • Tests: harness/test/x402-exact.compat.test.ts, harness/test/x402-exact.e2e.test.ts, harness/test/x402-exact.live.matrix.test.ts
  • Policy: harness/src/x402-pair-policy.ts
  • Fixtures: harness/fixtures/x402-exact/*.json

Security highlights

  • Full verifiers MUST emit a specific reject token; only adapters in WIRE_ONLY_ADAPTER_IDS keep the payment_invalid fallback
  • Full verifiers that accept the TS-wire stub credential are flagged as a verifier bypass (adapters whose verifier accepts the stub by design can be exempted via the X402_COMPAT_STUB_ACCEPT CSV list)
  • Reject-token taxonomy match is longest-first so ..._compute_price_instruction_too_high is not greedily credited as ..._compute_price_instruction
  • canonical-reject-tokens.json is strict-equality checked against rust/crates/x402/src/protocol/schemes/exact/verify.rs; test fails if the spine adds/removes/renames a token
  • Replay assertion requires first submission to be accepted; a server that rejects every credential cannot trivially pass
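
The longest-first rule in the list above can be illustrated with a minimal sketch. This is not the harness code: `matchRejectToken` and the `demo_payload_*` token names are hypothetical stand-ins (the real tokens live in canonical-reject-tokens.json), and only the ordering idea is taken from this PR.

```typescript
// Hypothetical taxonomy for illustration; the second token is a suffixed
// variant of the first, which is the case longest-first ordering protects.
const TAXONOMY: readonly string[] = [
  "demo_payload_compute_price_instruction",
  "demo_payload_compute_price_instruction_too_high",
  "signature_consumed",
];

// Return the most specific known token contained in `text`, or undefined.
function matchRejectToken(
  text: string,
  taxonomy: readonly string[],
): string | undefined {
  // Sort a copy by descending length so a suffixed token is tested
  // before the shorter token that is its prefix.
  const longestFirst = [...taxonomy].sort((a, b) => b.length - a.length);
  return longestFirst.find((token) => text.includes(token));
}

const message = "demo_payload_compute_price_instruction_too_high: price exceeds cap";
console.log(matchRejectToken(message, TAXONOMY));
// prints "demo_payload_compute_price_instruction_too_high"
```

With the taxonomy in declaration order, a plain first-match scan would credit the shorter token here, which is the regression the bullet describes.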

Test evidence

  • pnpm install: clean
  • pnpm exec vitest run test/x402-exact.compat.test.ts: 17/17 pass
  • pnpm exec vitest run test/x402-exact.e2e.test.ts test/x402-exact.live.matrix.test.ts: 2 skip (env-gated, expected)
  • X402_INTEROP_MATRIX=1 pnpm exec vitest run test/x402-exact.live.matrix.test.ts: enumerates pairs and skips with an explicit "missing required env vars: ..." reason
  • Codex r8: 0 P1, confidence 4/5 (notes/codex-review/pr-133-r8.md)

Closes / supersedes

None.

Reviewer notes

  • Stacked on #132 ("test(interop): add x402-exact intent + TS reference fixtures + matrix wiring"); base is pr/x402-harness-intent. Standalone diff: git diff pr/x402-harness-intent...pr/x402-harness-matrix
  • extractRejectToken searches every response field for a known taxonomy token (the rust spine wraps verifier errors as { error: "payment_invalid", message: "<specific-token>: ..." }, so the most-specific token is in message, not error)
  • One pre-existing failure in compute-budget-caps.test.ts predates this PR (verified by stashing changes)
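
For reviewers, a rough sketch of the field-search idea behind extractRejectToken, assuming a flat JSON response body. The function body, the `ResponseBody` type, and the short token list are illustrative assumptions, not the code in this PR; the token names themselves are the ones mentioned in this thread.

```typescript
type ResponseBody = Record<string, unknown>;

const GENERIC_TOKEN = "payment_invalid";
// Specific tokens named elsewhere in this PR; the real list comes from the fixture.
const SPECIFIC_TOKENS: readonly string[] = [
  "challenge_verification_failed",
  "signature_consumed",
  "wrong_network",
];

// Search every top-level string field, preferring a specific token over
// the generic wrapper the Rust spine puts in `error`.
function extractRejectToken(body: ResponseBody): string | undefined {
  const fields = Object.values(body).filter(
    (value): value is string => typeof value === "string",
  );
  for (const token of SPECIFIC_TOKENS) {
    if (fields.some((field) => field.includes(token))) return token;
  }
  // Fall back to the generic token only when nothing more specific exists.
  return fields.some((field) => field.includes(GENERIC_TOKEN))
    ? GENERIC_TOKEN
    : undefined;
}

console.log(
  extractRejectToken({ error: "payment_invalid", message: "wrong_network: expected A" }),
);
// prints "wrong_network"
```

This sketch only inspects top-level fields; nested bodies would need a recursive walk.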

…eferences (solana-foundation#122)

Per maintainer guidance in solana-foundation#122, this is a transversal cleanup PR:

Part A — remove internal kitchen references
- Drop M1/M2/M3 milestone framing from swift/README.md, swift/Examples/README.md
- Reword 'M1 baseline / M2-followup' coverage gate comments in python/pyproject.toml
  and .github/workflows/python.yml as plain coverage gate descriptions
- Remove 'M1 closure / L6 audit row' tag from lua/mpp/protocol/core/error_codes.lua

Part B — rename tests/interop to harness
- git mv tests/interop harness
- Update all path references repo-wide (.github/workflows/*, READMEs,
  .gitignore, docs, composer.json, .php-cs-fixer.dist.php, skill files)
- Fix relative paths inside the harness now that depth dropped by one
  (rust-client/Cargo.toml, php-server, ruby-server, go.mod replace lines,
  src/implementations.ts, test/compute-budget-caps.test.ts REPO_ROOT)
- Update Go module identifiers harness/{go-client,go-server} to match path
- Refresh internal comments/docs that still mentioned tests/interop

Part C — skill / README polish
- Skill references and intent docs now point at harness/* paths

Closes solana-foundation#122.

After renaming tests/interop/swift-client to harness/swift-client the
.package(path:) relative depth dropped by one; the previous '../../../swift'
resolved outside the repo. Surfaced by codex self-review.

Refs solana-foundation#122.

Adds the canonical x402 `exact` intent to the cross-language interop
harness, plus TypeScript reference client and server fixtures and
matrix wiring that registers the Rust spine adapters already shipped
under `rust/crates/x402/src/bin/`. Language adapters can now target
the harness contract (X402_INTEROP_* env vars, ready/result JSON
shapes) to validate against the Rust spine cell.

The TS reference fixture carries a stub credential payload (challenge
id + resource) so the harness wiring, negative-code classification,
cross-server portability, and idempotent-resubmit flows can run
without a full Solana signer. Pair restriction in the matrix gates
TS↔TS and Rust↔Rust by default; full TS↔Rust on-chain settlement
parity lands with a follow-up SDK port.

The legacy MPP charge runner hard-skips the new intent so default
`pnpm test` behaviour is unchanged.

Three fixes so the x402-exact matrix actually executes once the
language adapter PRs (solana-foundation#124, solana-foundation#126, solana-foundation#127, solana-foundation#128, solana-foundation#129, solana-foundation#130) rebase on top
of this branch:

1. Pair filter is data-driven. Previously only ts-x402 self-pair and
   rust-x402 self-pair were accepted. Now the filter walks the
   registered adapters and accepts any pair where: both sides are the
   TS reference (stub-payload), both sides are the Rust spine, the two
   sides share a base language id (same-language self-pair, e.g.
   go-x402-client <-> go-x402-server), or one side is the Rust spine
   (cross-spine pair in either direction). TS reference is locked to
   its self-pair only because its stub payload would fail real
   signature verification on any other server.

2. rust-x402 cargo --manifest-path corrected from ../../rust/Cargo.toml
   to ../rust/Cargo.toml. The path was stale after the
   tests/interop -> harness rename; the existing rust (charge) entries
   already used the correct relative path, the x402 entries did not.

3. Pair selector docstring rewritten to spell out the data-driven
   matrix policy so future language ports don't need to touch the test.
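
The data-driven pair filter from item 1 can be sketched roughly as follows. The helper names match the ones later extracted to src/x402-pair-policy.ts, but the signatures, the id-parsing rule (base language is the text before the first hyphen), and the `python-x402-server` id are assumptions made for illustration.

```typescript
// Assumed convention: adapter ids look like "<lang>-x402[-client|-server]".
function baseLang(adapterId: string): string {
  return adapterId.split("-")[0] ?? adapterId;
}

function isRustSpine(adapterId: string): boolean {
  return baseLang(adapterId) === "rust";
}

function isTsReference(adapterId: string): boolean {
  return baseLang(adapterId) === "ts";
}

function allowedX402Pair(clientId: string, serverId: string): boolean {
  // The TS reference carries a stub payload, so it only pairs with itself.
  if (isTsReference(clientId) || isTsReference(serverId)) {
    return isTsReference(clientId) && isTsReference(serverId);
  }
  // Same-language self-pair (this also covers Rust <-> Rust).
  if (baseLang(clientId) === baseLang(serverId)) return true;
  // Cross-spine pair: one side is the Rust spine, in either direction.
  return isRustSpine(clientId) || isRustSpine(serverId);
}

console.log(allowedX402Pair("go-x402-client", "go-x402-server")); // prints true
console.log(allowedX402Pair("ts-x402", "rust-x402")); // prints false
```

Because the decision depends only on the two ids, a newly registered adapter widens the matrix without edits to the test file.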
…rver

The TS x402 fixture server gated its cross-server portability
rejection behind `issued.size > 0`, so a freshly started server (or
one that had not yet issued any 402 challenge) would accept any
challengeId from another server's credential and settle it. That
contradicts canonical Rust behavior, which rejects unknown
challengeIds with `challenge_verification_failed` from the very
first request.

Drop the `issued.size > 0` guard so the membership check fires
unconditionally. The happy-path flow (GET /protected -> 402 with
challengeId -> POST with challengeId) is unaffected because the
served challengeId is added to `issued` on the 402 path before
the client returns.

Codex r8 solana-foundation#132 P2: cross-server replay to a fresh TS server now
returns `challenge_verification_failed` immediately rather than
settling.
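
A minimal sketch of the corrected check, assuming the fixture server tracks issued challenge ids in a `Set`. The `ChallengeRegistry` class and `Verdict` type are hypothetical; only the `issued` set and the `challenge_verification_failed` token come from the commit message above.

```typescript
type Verdict = "accepted" | "challenge_verification_failed";

class ChallengeRegistry {
  private readonly issued = new Set<string>();

  // Called on the 402 path, before the challenge is returned to the client.
  issue(challengeId: string): void {
    this.issued.add(challengeId);
  }

  // The old guard was equivalent to
  //   this.issued.size > 0 && !this.issued.has(challengeId)
  // which let a fresh server (empty set) accept any foreign challengeId.
  verify(challengeId: string): Verdict {
    return this.issued.has(challengeId)
      ? "accepted"
      : "challenge_verification_failed";
  }
}

const fresh = new ChallengeRegistry();
console.log(fresh.verify("issued-by-another-server"));
// prints "challenge_verification_failed"
```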
… key envs

The x402-exact-network-mismatch and x402-exact-cross-route-replay
scenarios were registered in src/intents/x402-exact.ts but never run
by any test (only the happy path was exercised in e2e.test.ts).
Add a TS-only negative-scenario suite that drives each one against
the TS reference server with hand-crafted credentials and asserts
the canonical reject code.

The network-mismatch scenario was previously a no-op even if invoked
because the scenario.network value flowed to both client and server.
The new test sends distinct networks: server advertises offers on
network A, credential claims network B, server emits wrong_network.

The TS reference fixture parsed X402_INTEROP_CLIENT_SECRET_KEY and
X402_INTEROP_FACILITATOR_SECRET_KEY as required envs but never read
them (stub credential, no on-chain signing). Drop the requirement
so the verifier surface can be exercised without standing up a
Surfpool RPC or funded keypair; the live matrix in
x402-exact.e2e.test.ts still requires them via the test guard.
Real-signing language adapters read their own keypair envs.
…irs + parity locks)

Add three-tier x402-exact test architecture on top of solana-foundation#132:

1. Wire compat (no RPC, default `pnpm test`):
   - `harness/test/x402-exact.compat.test.ts`
   - Drives every registered x402-exact adapter (gated by
     COMPAT_INCLUDE_IDS) against canonical fixtures and an attack
     suite. Catches wire-format drift before the live matrix runs.

2. Parity-lock fixtures (`harness/fixtures/x402-exact/`):
   - canonical-challenge.json — 402 envelope every client must parse.
   - canonical-payment-signature.json — credential every server must
     parse (accept or reject with a known token).
   - canonical-reject-tokens.json — union of high-level reject tokens
     and the invalid_exact_svm_payload_* family mirrored from
     rust/crates/x402/src/protocol/schemes/exact/verify.rs.
   - attack-scenarios.json — 9 tampered credential scenarios + replay.

3. Live full matrix (`harness/test/x402-exact.live.matrix.test.ts`):
   - Env-gated (X402_INTEROP_MATRIX=1 + funded keypair). Enumerates
     every allowedPair from the policy in implementations.ts and runs
     the happy path. Widens automatically as new adapters register.

Also expand `harness/test/x402-exact.e2e.test.ts` with an explicit
self-pair group so per-language regressions stand out in vitest output,
and update `harness/README.md` with the three-tier documentation and
extension recipe.
…lback

Address review findings on the x402-exact matrix:

- Drop blanket `payment_invalid` fallback in attack-scenario assertion;
  only wire-only adapters (WIRE_ONLY_ADAPTER_IDS) may emit the generic
  token. Full verifiers must emit a specific reject token per scenario.

- Rework extractRejectToken: the Rust spine wraps verifier failures as
  `{ error: "payment_invalid", message: "<verifier-token>: ..." }`, so
  the most-specific token is in `message`, not `error`. Search every
  field for a known taxonomy token (svm-payload tokens before
  high-level) and return that; previously the test masked specific
  tokens behind the high-level error.

- Replay test now requires the first submission to be accepted; a
  server that rejects every credential previously passed trivially.

- Reject-token taxonomy is now strict-checked against the rust spine
  source (rust/crates/x402/src/protocol/schemes/exact/verify.rs): the
  fixture set must equal the set of `invalid_exact_svm_payload_*`
  literals in the spine. Token add/remove/rename in the spine fails
  the test with a pointed diff.

- Add canonical-payment-signature-rust.json with the Rust-spine
  PaymentProof::Transaction shape (vs the existing TS-wire stub).

- Reframe TS-wire fixture descriptions to honestly document the
  PaymentRequiredEnvelope `resource: ResourceInfo` and
  `payload: PaymentProof` differences vs the Rust spine.

- Replace `it.fails` skip in the live matrix with a hard `it` failure
  so a broken scenario registry fails CI loudly.
- Remove generic `payment_invalid` from per-scenario expectedRejectTokens
  in attack-scenarios.json. Wire-only adapters still get the fallback
  via WIRE_ONLY_ADAPTER_IDS in the test runner; full verifiers must now
  emit a specific token (no silent regression to generic rejection).
- Document each scenario's true scope: wire-binding checks (rejected by
  the TS reference's classifyCredential / rust spine's requirement
  binding) vs SVM transaction structural checks (live matrix only).
- Tone down canonical-payment-signature-rust.json description: the
  placeholder transaction fails bincode-deserialization BEFORE
  verify.rs runs, so the fixture asserts envelope parsing + structured
  402 emission, not `invalid_exact_svm_payload_*` tokens.
- Add `once("error")` rejection on the in-process fixture server's
  listen call so EPERM/EADDRINUSE fails the test cleanly instead of
  hanging 60s on the adapter timeout.
- X402_COMPAT_INCLUDE_RUST=1 opts the rust-x402 adapter into the compat
  suite (off by default to keep `pnpm test` cargo-free).
- Replay assertion gated by WIRE_ONLY_ADAPTER_IDS + opt-in
  X402_COMPAT_REPLAY_TRUST list: adapters whose verifier requires a
  real signed transaction (rust spine) skip the stub-credential replay
  test cleanly with a documented skip message; live matrix covers
  replay against them with a real PaymentProof::Transaction.
- README documents both opt-in flags.
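
The strict taxonomy check against the spine source, described in the review findings above, might look roughly like this. It is a sketch under assumptions: the Rust source is passed in as a string rather than read from disk, the `_alpha`/`_beta`/`_gamma` suffixes are made-up placeholders, and `taxonomyDiff` is a hypothetical helper name.

```typescript
// Collect every invalid_exact_svm_payload_* literal that appears in the source text.
function svmPayloadLiterals(rustSource: string): Set<string> {
  const matches = rustSource.match(/invalid_exact_svm_payload_[a-z0-9_]+/g) ?? [];
  return new Set(matches);
}

// Compare the fixture against the spine and report both directions of drift.
function taxonomyDiff(
  fixtureTokens: readonly string[],
  rustSource: string,
): { missing: string[]; extra: string[] } {
  const spine = svmPayloadLiterals(rustSource);
  const fixture = new Set(fixtureTokens);
  return {
    // In the spine but absent from the fixture (token added or renamed upstream).
    missing: Array.from(spine).filter((t) => !fixture.has(t)).sort(),
    // In the fixture but absent from the spine (token removed or renamed upstream).
    extra: Array.from(fixture).filter((t) => !spine.has(t)).sort(),
  };
}

// Placeholder source and fixture; the suffixes are invented for this example.
const sampleSource = [
  'const A: &str = "invalid_exact_svm_payload_alpha";',
  'const B: &str = "invalid_exact_svm_payload_beta";',
].join("\n");
const sampleFixture = ["invalid_exact_svm_payload_alpha", "invalid_exact_svm_payload_gamma"];

console.log(taxonomyDiff(sampleFixture, sampleSource));
```

A test would then assert that both `missing` and `extra` are empty, so any add, remove, or rename in the spine shows up as a named diff.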
…olicy

- Default full-verifier behavior: server adapters not in
  WIRE_ONLY_ADAPTER_IDS that accept the TS-wire stub credential are now
  flagged as a verifier bypass. Opt-in via X402_COMPAT_STUB_ACCEPT
  (CSV) for adapters whose verifier accepts the stub by design.
- Drop payment_invalid fallback for the replay second-submit assertion:
  once first=200 the second submission MUST be classified as
  signature_consumed by every adapter (no generic-rejection regression).
- Add explicit canonical-payment-signature-rust.json shape lock: every
  field rust spine's PaymentSignatureEnvelope requires must be present,
  payload must deserialize as PaymentProof::Transaction xor
  PaymentProof::Signature, base64 round-trip stable. Fixture can no
  longer drift undetected.
- Reject-token taxonomy match is now longest-first so suffixed tokens
  (e.g. ..._compute_price_instruction_too_high) match before their
  prefix (..._compute_price_instruction).
- Extract allowedX402Pair / baseLang / isRustSpine to
  src/x402-pair-policy.ts so the e2e and live-matrix tests share one
  source of truth and cannot drift apart silently.
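
The replay rule (first submission must return 200, second must be classified as signature_consumed) can be sketched against an in-memory fake. Everything here except the `signature_consumed` token is an assumption for illustration: the `Submit` type, the synchronous call shape, and both fake servers.

```typescript
interface SubmitResult {
  status: number;
  rejectToken?: string;
}
type Submit = (credential: string) => SubmitResult;

// Throws unless the server accepts once and then rejects the replay specifically.
function assertReplayRejected(submit: Submit, credential: string): void {
  const first = submit(credential);
  if (first.status !== 200) {
    // A server that rejects everything must not pass the replay test.
    throw new Error(`first submission must be accepted, got ${first.status}`);
  }
  const second = submit(credential);
  if (second.status === 200 || second.rejectToken !== "signature_consumed") {
    throw new Error(`replay must be rejected as signature_consumed, got ${second.rejectToken}`);
  }
}

// Fake server that remembers every credential it has settled.
function makeReplaySafeServer(): Submit {
  const consumed = new Set<string>();
  return (credential) => {
    if (consumed.has(credential)) {
      return { status: 402, rejectToken: "signature_consumed" };
    }
    consumed.add(credential);
    return { status: 200 };
  };
}

// Fake server that rejects every credential with the generic token.
const rejectEverything: Submit = () => ({ status: 402, rejectToken: "payment_invalid" });

assertReplayRejected(makeReplaySafeServer(), "credential-1"); // passes silently
```

Passing `rejectEverything` instead makes the helper throw on the first submission, which is the trivial pass this PR closes off.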
…pat keypair requirement

- Add console.warn for live-matrix skip-due-to-missing-env so CI
  matrix misconfiguration is visible in job logs (per spec the
  behavior remains skip-not-fail, since the matrix is opt-in by env).
- Document that X402_COMPAT_INCLUDE_RUST=1 requires real ed25519
  keypairs (rust spine validates via MemorySigner::from_bytes); the
  README and inline comment make this contract explicit.

The TS-wire canonical credential carries `payload.challengeId/resource`,
which the rust spine rejects at PaymentProof deserialization with the
generic `payment_invalid` token — defeating the per-scenario
specific-token assertions that make the compat suite robust. Rather
than ship a half-functional opt-in, drop it: the compat suite is now
honestly TS-only, and the live matrix (tier 3) is the canonical place
for rust spine coverage. README documents the rationale.

Two Codex r8 P2 findings on the x402 harness matrix:

1) TS x402 fixture server gated its cross-server portability rejection
   behind `issued.size > 0`, so a freshly started server (or one that
   had not yet issued any 402) would accept any challengeId. Drop the
   size guard so the membership check fires unconditionally. The
   happy-path flow (GET /protected -> 402 with challengeId -> POST
   with challengeId) is unaffected because the served challengeId is
   added to `issued` on the 402 path before the client returns.

2) `cross-server-scenarios.test.ts:extractCanonicalCode` searched
   `error` before `message`. The Rust x402 interop server wraps
   verifier failures as `{ error: "payment_invalid", message:
   "<specific-verifier-token>" }`, so the first-match strategy
   resolved to the generic `payment_invalid` and silently discarded
   the verifier-specific token that the canonical taxonomy needs to
   classify. Combine both fields into one string before classifying
   so the richer signal survives.

The cross-server portability scenario previously listed a single TS->Rust
crossServerPair. Add the TS->TS control pair so the assertion exercises
the canonical challenge_verification_failed reject token on the TS
reference server itself (two independent server instances issue
disjoint challenge id sets), not only on the Rust spine's proof-layer
reject path. Document why the reverse Rust->TS direction is gated to
the live matrix: the Rust spine adapter does not echo the captured
credential to the harness by design, so credential-capture replay
flows can only originate from the TS client; the canonical Rust->TS
portability is asserted end-to-end via the live matrix where a real
signed transaction is exchanged.

EfeDurmaz16 (Collaborator, Author) commented:

Folded into #134 (combined with #132). Closing as superseded.
