test: E2E documentation + supporting tests (#15, draft) #18

Merged
paolino merged 13 commits into main from
002-e2e-tests
Apr 18, 2026

Conversation

@paolino
Contributor

@paolino paolino commented Apr 18, 2026

Refs #15. Draft.

Summary

End-to-end tests for harvest's spend protocol, delivered first as executable documentation (per the user's directive: "what counts for me is E2E as documentation") and second as a regression gate. The full narrative of how a voucher spend works end-to-end lives at the top of offchain/test/DevnetSpendSpec.hs — reading that module top-to-bottom is the recommended path for a new contributor.

Shape of this PR

Landed and green

  • Spec-driven artefacts under specs/002-e2e-tests/: spec, plan, research, data model, contract (canonical 106-byte signed_data layout), quickstart, tasks. The whole feature was driven through /speckit.
  • DevnetSpendSpec.hs module header = the E2E narrative. Six numbered phases from "coalition runs a devnet" to "node accepts or rejects". Every scenario has an inline paragraph explaining the threat model it defends against in plain English.
  • Byte-layout cross-check (SignedDataLayoutSpec.hs, 8 cases): parses the fixture's 106-byte signed_data and asserts each of the five bound fields (txid, ix, acceptor_ax, acceptor_ay, d) matches what the Node-side signer claims to have set. If the JS signer and the Aiken validator ever disagree on the byte layout, this test fails deterministically, in milliseconds.
  • Ed25519 independent verifier (Ed25519Spec.hs, 4 cases, FR-003): verifies the fixture's (customer_pubkey, signed_data, customer_signature) using the verifyDSIGN primitive re-exported by cardano-node-clients PR #65. Positive + flipped-byte negative both pass, proving the Node-produced bytes are byte-compatible with what the Plutus builtin will consume.
  • Pre-existing Groth16Spec / E2ESpec path fixes: those tests were silently skipped in CI (the old check wrapper compiled the binary but never ran it). Now that the wrapper executes via nix run, they run for real and had to switch from hardcoded relative paths to Fixtures.fixturesDir.
  • CI (.github/workflows/ci.yml): splits into derivation-backed checks (via nix build — library, groth16-ffi, circuit, circuit-tests, aiken-check — their tests run during the build phase) and shell-app checks (via nix run — unit-tests, lint — which must be executed). Ensures regressions in the test suite actually fail CI.
  • Fixture discovery: HARVEST_FIXTURES_DIR env var wired through nix/checks.nix so the test binary finds fixtures in the nix-store sandbox.
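
The fixture-discovery rule can be sketched in a few lines. This is a minimal sketch, not the PR's actual Fixtures module: `resolveFixturesDir` and `defaultFixturesDir` are hypothetical names, and the real shape of `Fixtures.fixturesDir` is assumed. Only the env var name and the `test/fixtures` fallback come from the commit messages in this PR.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- | Fallback used when the env var is unset (relative to the package root).
defaultFixturesDir :: FilePath
defaultFixturesDir = "test/fixtures"

-- | Pure core (hypothetical helper): prefer the override, else the default.
resolveFixturesDir :: Maybe FilePath -> FilePath
resolveFixturesDir = fromMaybe defaultFixturesDir

-- | Assumed shape of the lookup: read HARVEST_FIXTURES_DIR once at start-up.
fixturesDir :: IO FilePath
fixturesDir = resolveFixturesDir <$> lookupEnv "HARVEST_FIXTURES_DIR"
```

Keeping the decision in a pure function makes the fallback testable without touching the process environment.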

Test status

31 examples, 0 failures, 5 pending

The five pending scenarios are the devnet execution of the golden path (T021) and the four negatives (T030–T033). They are clearly named and each carries its threat-model explanation inline.

Deferred to a follow-up

The devnet bracket from cardano-node-clients:devnet is consumable but needs a fixture-signing affordance before the golden path can actually submit: the fixture's signed_data is bound to a zero txid (placeholder), and a real devnet tx consumes a real TxOutRef, so the test has to re-sign signed_data at runtime with sk_c and a live txid. That needs signDSIGN re-exported from Cardano.Node.Client.E2E.Setup — a one-line upstream patch analogous to the verifyDSIGN one that already landed in cardano-node-clients PR #65.

Rather than bundling a second upstream patch into an already-substantial PR, this is left as the natural next follow-up. The narrative and supporting tests deliver the documentation value now; the devnet execution is additive.

Guided tour of changes

offchain/test/DevnetSpendSpec.hs

The docstring is ~70 lines and walks the reader through every phase of a spend. Each it block title names a defended invariant in the protocol's vocabulary (e.g. "defends against customer-key substitution") rather than an implementation assertion.

offchain/test/SignedDataLayout.hs

Single Haskell authority for the 106-byte layout. Offsets as named constants, decoder that errors clearly on wrong-size inputs. Mirrors specs/002-e2e-tests/contracts/signed-data-layout.md.
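
To make the layout concrete, here is a minimal sketch of a decoder for the 106-byte payload (txid 32 + ix 2 + acceptor_ax 32 + acceptor_ay 32 + d 8, all big-endian). The function name `parseSignedData` appears in this PR, but its signature, the `SignedData` record, and the field names below are assumptions made for illustration. The real module may differ.

```haskell
import Data.Bits (Bits, shiftL, (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word16, Word64)

-- Field widths taken from the layout contract; constant names are illustrative.
txidLen, ixLen, coordLen, dLen, signedDataLen :: Int
txidLen = 32
ixLen = 2
coordLen = 32
dLen = 8
signedDataLen = txidLen + ixLen + 2 * coordLen + dLen -- 106

-- Hypothetical record; coordinates are kept as raw 32-byte strings.
data SignedData = SignedData
  { sdTxId :: BS.ByteString
  , sdIx :: Word16
  , sdAcceptorAx :: BS.ByteString
  , sdAcceptorAy :: BS.ByteString
  , sdD :: Word64
  }
  deriving (Eq, Show)

-- Big-endian decode of a short byte string into a fixed-width integer.
beInt :: (Num a, Bits a) => BS.ByteString -> a
beInt = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Reject anything that is not exactly 106 bytes, then slice by offset.
parseSignedData :: BS.ByteString -> Either String SignedData
parseSignedData bs
  | BS.length bs /= signedDataLen =
      Left ("signed_data: expected 106 bytes, got " ++ show (BS.length bs))
  | otherwise =
      Right
        SignedData
          { sdTxId = txid
          , sdIx = beInt ix
          , sdAcceptorAx = ax
          , sdAcceptorAy = ay
          , sdD = beInt d
          }
  where
    (txid, r1) = BS.splitAt txidLen bs
    (ix, r2) = BS.splitAt ixLen r1
    (ax, r3) = BS.splitAt coordLen r2
    (ay, d) = BS.splitAt coordLen r3
```

Checking the total length up front means every later slice has the expected width, so a truncated payload fails with one clear message instead of producing short fields.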

offchain/test/Fixtures.hs

Loads proof, VK, public signals, customer bundle, and applied-script hex into a single SpendBundle record. No parallel fixture copies; reads directly from the authoritative JSON/hex files.

nix/checks.nix, flake.nix

unit-tests is now a writeShellApplication that prepends cardano-node to PATH, sets HARVEST_FIXTURES_DIR, exports LD_LIBRARY_PATH for the groth16-ffi library, and execs the test binary. cardano-node input is pinned to the same version (10.7.0) that cardano-node-clients itself tests against — no drift.

.github/workflows/ci.yml

nix build for derivation-backed checks + nix run for shell-app checks. The lint job is folded into the main build-gate since it's cheap and there's no parallelism benefit to keeping it separate.

Local CI

All checks green locally:

  • nix build .#checks.x86_64-linux.library
  • nix build .#checks.x86_64-linux.groth16-ffi
  • nix build .#checks.x86_64-linux.circuit
  • nix build .#checks.x86_64-linux.circuit-tests
  • nix build .#checks.x86_64-linux.aiken-check
  • nix run .#checks.x86_64-linux.unit-tests → 31 examples, 0 failures, 5 pending
  • nix run .#checks.x86_64-linux.lint → no hints

Test plan

  • CI green on GitHub (validates the nix run workflow end-to-end).
  • Read DevnetSpendSpec.hs top-to-bottom — does it actually teach a reader how a harvest spend works? Feedback on phrasing or missing steps is welcome.
  • Read specs/002-e2e-tests/spec.md and contracts/signed-data-layout.md and confirm they match the implementation.
  • Decide whether the 5 pending devnet scenarios land as a follow-up PR (recommended — they need another upstream signDSIGN re-export) or whether this slice merges and the follow-up issues unblock separately.

paolino added 7 commits April 18, 2026 14:56
Issue #15. Spec-Driven Development artefacts under specs/002-e2e-tests/:

- spec.md: 3 user stories (golden-path acceptance, tampered rejections,
  byte-layout cross-check), 8 functional requirements, 5 measurable SCs.
- plan.md: cardano-node-clients-first approach — real devnet via
  :devnet sub-library, no Aiken fixture generation, patch upstream if
  primitives are missing.
- research.md: 7 decisions with rationale + rejected alternatives.
- data-model.md: SpendBundle record + devnet handle types.
- contracts/signed-data-layout.md: authoritative 106-byte layout (txid
  32 + ix 2 + acceptor_ax 32 + acceptor_ay 32 + d 8, all big-endian).
- quickstart.md: how to run, regenerate fixtures, add new negatives,
  patch cardano-node-clients if needed.
- tasks.md: 18 tasks in 6 phases with dependency graph and FR mapping.

Satisfies the standing rule: a feature is not shipped without
end-to-end tests exercising the validator and the proof-verification
path.

Refs #15.
Ed25519 verify primitives not public on origin/main; upstream PR #65
merged to re-export them. Consumption blocked on harvest's node 10.7.0
dep bump (separate ticket); E2E work paused until that lands.
Wires the harness layer without any real assertions yet:

- offchain/harvest.cabal adds cardano-node-clients:devnet as a
  test-suite dep plus stubs for the four new test modules.
- flake.nix adds cardano-node 10.7.0 as an input and passes it into
  the unit-tests check derivation.
- nix/checks.nix wraps the test binary in a shell app that prepends
  cardano-node to PATH (so withDevnet can spawn it) and keeps the
  groth16-ffi LD_LIBRARY_PATH wiring.
- offchain/test/Main.hs registers DevnetSpendSpec, Ed25519Spec, and
  SignedDataLayoutSpec alongside the existing specs.
- offchain/test/SignedDataLayout.hs is the single Haskell authority
  for the canonical 106-byte signed_data layout, mirroring
  specs/002-e2e-tests/contracts/signed-data-layout.md.
- offchain/test/Fixtures.hs loads the authoritative JSON/hex fixtures
  into a SpendBundle record (FR-006: no parallel fixture copies).
- DevnetSpendSpec, Ed25519Spec, SignedDataLayoutSpec currently report
  pending; real assertions land in follow-up commits per the tasks.md
  phasing (US3 first, then US1 golden + Ed25519, then US2 negatives).

Refs #15.
SignedDataLayoutSpec now:
  - asserts sbSignedData is exactly 106 bytes;
  - parses it via SignedDataLayout.parseSignedData;
  - checks each of the five fields matches what the Node-side signer
    recorded in customer.json / public.json:
      - txid (hex-encoded comparison against txid_hex),
      - ix (matches customer.json.ix),
      - d (matches public.json[0]),
      - acceptor_ax / acceptor_ay are both in [0, 2^256);
  - rejects a truncated payload so the parser has teeth.

Also fixes hardcoded fixture paths in the pre-existing Groth16Spec and
E2ESpec that were silently skipped by the old check wrapper (nix build
compiled but didn't run the binary). Now that nix/checks.nix wraps the
binary to execute with HARVEST_FIXTURES_DIR set, those tests actually
run and needed to use Fixtures.fixturesDir instead of hardcoded
relative paths.

Test run: 27 examples, 0 failures, 6 pending (US1 + US2 still to
implement).

Refs #15.
Ed25519Spec verifies the (customer_pubkey, signed_data,
customer_signature) triple produced by Node's crypto.sign(null, ...)
using the verifyDSIGN primitive now re-exported from
Cardano.Node.Client.E2E.Setup (lambdasistemi/cardano-node-clients#65,
merged as 408a890).

This is the cross-toolchain gate for FR-003 / SC-003: if Node's
Ed25519 output is ever byte-incompatible with cardano-crypto-class
(the same library Plutus's VerifyEd25519Signature calls internally),
this test fails deterministically, in milliseconds, without needing
the devnet to be involved.

Four cases:
- customer_pubkey is 32 bytes,
- customer_signature is 64 bytes,
- positive verify,
- negative verify with one byte of signed_data flipped (ensures the
  check has teeth; FR-008 positive/negative pairing).
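
The tampering step behind the negative case can be sketched as below. `flipByteAt` is a hypothetical helper, not something this PR is known to define, and the real spec may corrupt the payload differently. The verification call is left out because its exact signature in cardano-node-clients is not shown here.

```haskell
import Data.Bits (complement)
import qualified Data.ByteString as BS

-- | Invert every bit of the byte at index i; out-of-range indices leave
-- the input unchanged. The result always has the same length as the input.
flipByteAt :: Int -> BS.ByteString -> BS.ByteString
flipByteAt i bs
  | i < 0 || i >= BS.length bs = bs
  | otherwise =
      let (pre, rest) = BS.splitAt i bs
       in pre <> BS.cons (complement (BS.head rest)) (BS.tail rest)
```

A negative test would verify the original triple successfully, then verify again with `flipByteAt 0 signedData` and expect a failure.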

Refs #15.
Per the directive 'E2E as documentation', the module header of
DevnetSpendSpec is now the primary artefact: reading it top-to-bottom
is the recommended way for a new contributor to understand how the
three harvest components (Groth16 circuit, customer Ed25519 signature,
Aiken validator) compose into a submitted Cardano transaction.

Each scenario is named by the actor or the defended invariant in the
spec's vocabulary, not by implementation detail. Every negative case
carries a paragraph of commentary that explains the attack it defends
against in plain English, so a reader sees not just the test shape
but the threat model.

The devnet bracket and submit path land in T021+T030-T033 as a
follow-up commit; scenarios are 'pendingWith' until then. The
narrative and the bundle-load sanity check compile and run clean on
the current cardano-node-clients:devnet surface, so the documentation
value is live on this commit without waiting for the harness iteration.

Test run: 31 examples, 0 failures, 5 pending.

Refs #15.
The previous CI step only did 'nix build' on every check derivation.
For shell-app checks (unit-tests, lint) that means CI realised the
wrapper but never invoked the test binary — a regression in the test
suite would silently pass. This was fine when no tests existed; now
that the #15 E2E suite is landing, it would hide real regressions.

Changes:
- .github/workflows/ci.yml now splits the checks into two steps:
  a 'nix build' block for the derivation-backed checks (library,
  groth16-ffi, circuit, circuit-tests, aiken-check — their tests run
  during their build phase) and a 'nix run' block for the shell-app
  checks (unit-tests, lint). The lint job is folded into the main
  build-gate job since the split-out dependency is no longer useful.
- test/*.hs: fourmolu + hlint clean-up surfaced by running the new
  lint step locally (drop unused DataKinds / TypeApplications pragmas,
  reflow multi-line let-bindings, prefer multi-line Haddock on
  split-across-lines docstrings). No behavioural change.

Refs #15.
@paolino paolino added the enhancement New feature or request label Apr 18, 2026
@paolino paolino self-assigned this Apr 18, 2026
paolino added 3 commits April 18, 2026 17:38
`nix run` against `.#checks.<system>.unit-tests` resolves to the
haskell.nix test component's bin/unit-tests, not the
writeShellApplication wrapper, so the wrapper's env vars
(HARVEST_FIXTURES_DIR, cardano-node PATH, LD_LIBRARY_PATH for
groth16-ffi) never fire. The test binary then fails with
'test/fixtures/proof.json: does not exist' because it falls back to
the default relative path.

Switching to `.#apps.<system>.unit-tests` forces resolution to
nix/apps.nix, which uses pkgs.lib.getExe on the shell wrapper —
guaranteed to be the env-setting wrapper.

Same fix applied to the lint step.

Refs #15.
The nix/checks.nix change that wires HARVEST_FIXTURES_DIR into the
unit-tests shell wrapper was modified locally but never staged, so the
wrapper CI built was missing that env var line and the test binary
fell back to the default 'test/fixtures' relative path — nonexistent
in the nix-store sandbox.

Also remove the temporary CI debug step (cat the wrapper) that
surfaced the mismatch.

Refs #15.
paolino added 2 commits April 18, 2026 17:59
cardano-node-clients main now includes the Ed25519 signing-side
re-exports from PR #66 (signDSIGN, rawDeserialiseSignKeyDSIGN). Pinned
at 18ef6fdc705b37740b1398720924f234e5dd6860 with nix32 sha
04z900hvgqnswgb063sf6ambyj9d9xixwcpyf1jjl600d2f909bf.

Also surface sk_c through the fixture loader. The devnet test needs
to re-sign signed_data at runtime with a live TxOutRef, so SpendBundle
gains a sbSkC field sourced from customer.json's sk_c_hex. A doc
comment flags that the production reificator never sees this value;
it lives in the fixture only because the test needs to generate valid
signatures on demand.

Refs #15.
Adds SpendHarness.hs with two helpers:

  * replaceTxOutRef — rewrites the (txid, ix) prefix of signed_data
    without touching the signature, used by negative tests that
    tamper with the signed payload;
  * resignedData — rewrites (txid, ix) AND re-signs with the
    customer's Ed25519 key, so the devnet golden path can bind a
    live, devnet-chosen TxOutRef into the redeemer.
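
A minimal sketch of the first helper, assuming the 106-byte layout described earlier in this PR (32-byte txid, then a 2-byte big-endian ix). The name `replaceTxOutRef` comes from this commit, but the argument order, the `Either String` error type, and the `encodeIx` helper are assumptions. The re-signing variant is not sketched because it depends on the cardano-node-clients signing API.

```haskell
import Data.Bits (shiftR)
import qualified Data.ByteString as BS
import Data.Word (Word16)

-- | Big-endian encoding of the output index (hypothetical helper).
encodeIx :: Word16 -> BS.ByteString
encodeIx ix = BS.pack [fromIntegral (ix `shiftR` 8), fromIntegral ix]

-- | Overwrite the (txid, ix) prefix and keep the remaining 72 bytes
-- (acceptor_ax, acceptor_ay, d) untouched. No signature is recomputed.
replaceTxOutRef ::
  BS.ByteString -> Word16 -> BS.ByteString -> Either String BS.ByteString
replaceTxOutRef txid ix signedData
  | BS.length txid /= 32 = Left "txid must be 32 bytes"
  | BS.length signedData /= 106 = Left "signed_data must be 106 bytes"
  | otherwise = Right (BS.concat [txid, encodeIx ix, BS.drop 34 signedData])
```

Because the original signature covers the old prefix, the output of this function no longer verifies against it, which is what the negative tests rely on.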

SpendHarnessSpec exercises the re-sign path directly against the
fixture bundle: parses pk_c with cardano-node-clients'
rawDeserialiseVerKeyDSIGN, re-signs with signDSIGN, serialises with
rawSerialiseSigDSIGN, verifies with verifyDSIGN. 7 cases covering
shape (106 bytes, 64-byte sig), field correctness (txid, ix, d), and
the signature round-trip.

All primitives flow through the cardano-node-clients:devnet public
surface. The pin bumps to a feature-branch commit
(f6296179878729562e29c31ecf5b2333537897d2) that carries the
raw-serialise re-exports; the upstream PR is open for review, and the
pin will move to main once it merges.

Test run: 38 examples, 0 failures, 5 pending (devnet scenarios).

Refs #15.
…erged)

Upstream PR #67 merged to main as commit b9fbbb504eaa903c95517f3285a5b048a4a7f537.
Bump the downstream pin off the feature branch to that main commit per
the pins-main-only rule (feature-branch pins are for iteration only).
@paolino paolino marked this pull request as ready for review April 18, 2026 17:28
@paolino paolino merged commit 2906c1f into main Apr 18, 2026
1 check passed
Labels

enhancement New feature or request
