test: E2E documentation + supporting tests (#15, draft) #18
Merged
Issue #15. Spec-Driven Development artefacts under specs/002-e2e-tests/:
- spec.md: 3 user stories (golden-path acceptance, tampered rejections, byte-layout cross-check), 8 functional requirements, 5 measurable SCs.
- plan.md: cardano-node-clients-first approach: real devnet via the :devnet sub-library, no Aiken fixture generation, patch upstream if primitives are missing.
- research.md: 7 decisions with rationale + rejected alternatives.
- data-model.md: SpendBundle record + devnet handle types.
- contracts/signed-data-layout.md: authoritative 106-byte layout (txid 32 + ix 2 + acceptor_ax 32 + acceptor_ay 32 + d 8, all big-endian).
- quickstart.md: how to run, regenerate fixtures, add new negatives, patch cardano-node-clients if needed.
- tasks.md: 18 tasks in 6 phases with dependency graph and FR mapping.

Closes the standing rule directive: a feature is not shipped without end-to-end tests exercising the validator and the proof-verification path.

Refs #15.
Ed25519 verify primitives not public on origin/main; upstream PR #65 merged to re-export them. Consumption blocked on harvest's node 10.7.0 dep bump (separate ticket); E2E work paused until that lands.
Wires the harness layer without any real assertions yet:
- offchain/harvest.cabal adds cardano-node-clients:devnet as a test-suite dep plus stubs for the four new test modules.
- flake.nix adds cardano-node 10.7.0 as an input and passes it into the unit-tests check derivation.
- nix/checks.nix wraps the test binary in a shell app that prepends cardano-node to PATH (so withDevnet can spawn it) and keeps the groth16-ffi LD_LIBRARY_PATH wiring.
- offchain/test/Main.hs registers DevnetSpendSpec, Ed25519Spec, and SignedDataLayoutSpec alongside the existing specs.
- offchain/test/SignedDataLayout.hs is the single Haskell authority for the canonical 106-byte signed_data layout, mirroring specs/002-e2e-tests/contracts/signed-data-layout.md.
- offchain/test/Fixtures.hs loads the authoritative JSON/hex fixtures into a SpendBundle record (FR-006: no parallel fixture copies).
- DevnetSpendSpec, Ed25519Spec, SignedDataLayoutSpec currently report pending; real assertions land in follow-up commits per the tasks.md phasing (US3 first, then US1 golden + Ed25519, then US2 negatives).

Refs #15.
SignedDataLayoutSpec now:
- asserts sbSignedData is exactly 106 bytes;
- parses it via SignedDataLayout.parseSignedData;
- checks each of the five fields matches what the Node-side signer
recorded in customer.json / public.json:
- txid (hex-encoded comparison against txid_hex),
- ix (matches customer.json.ix),
- d (matches public.json[0]),
- acceptor_ax / acceptor_ay are both in [0, 2^256);
- rejects a truncated payload so the parser has teeth.
Also fixes hardcoded fixture paths in the pre-existing Groth16Spec and
E2ESpec that were silently skipped by the old check wrapper (nix build
compiled but didn't run the binary). Now that nix/checks.nix wraps the
binary to execute with HARVEST_FIXTURES_DIR set, those tests actually
run and needed to use Fixtures.fixturesDir instead of hardcoded
relative paths.
Test run: 27 examples, 0 failures, 6 pending (US1 + US2 still to
implement).
Refs #15.
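
To make the layout check concrete, here is a minimal sketch of a decoder for the 106-byte payload (txid 32 + ix 2 + acceptor_ax 32 + acceptor_ay 32 + d 8, all big-endian). It is not the repository's `SignedDataLayout.parseSignedData`; the record, its field names, and `parseSignedDataSketch` are hypothetical and only illustrate the offsets, using nothing beyond `base` and `bytestring`.

```haskell
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word16, Word64)

-- | Decoded view of the payload. Field names are illustrative assumptions.
data SignedDataSketch = SignedDataSketch
  { sdTxId :: BS.ByteString -- 32 bytes at offset 0
  , sdIx :: Word16 -- 2 bytes at offset 32
  , sdAcceptorAx :: Integer -- 32 bytes at offset 34
  , sdAcceptorAy :: Integer -- 32 bytes at offset 66
  , sdD :: Word64 -- 8 bytes at offset 98
  }
  deriving (Eq, Show)

-- | Total size implied by the five field widths (106).
signedDataSize :: Int
signedDataSize = 32 + 2 + 32 + 32 + 8

-- | Interpret a byte string as a big-endian natural number.
beNatural :: BS.ByteString -> Integer
beNatural = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Decode the payload, rejecting any input that is not exactly 106 bytes.
parseSignedDataSketch :: BS.ByteString -> Either String SignedDataSketch
parseSignedDataSketch bs
  | BS.length bs /= signedDataSize =
      Left ("expected " ++ show signedDataSize ++ " bytes, got " ++ show (BS.length bs))
  | otherwise =
      Right
        SignedDataSketch
          { sdTxId = slice 0 32
          , sdIx = fromIntegral (beNatural (slice 32 2))
          , sdAcceptorAx = beNatural (slice 34 32)
          , sdAcceptorAy = beNatural (slice 66 32)
          , sdD = fromIntegral (beNatural (slice 98 8))
          }
  where
    slice off len = BS.take len (BS.drop off bs)
```

Because acceptor_ax and acceptor_ay are decoded from exactly 32 bytes each, the range check [0, 2^256) holds by construction in this sketch; the length guard is what rejects a truncated payload.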
Ed25519Spec verifies the (customer_pubkey, signed_data, customer_signature) triple produced by Node's crypto.sign(null, ...) using the verifyDSIGN primitive now re-exported from Cardano.Node.Client.E2E.Setup (lambdasistemi/cardano-node-clients#65, merged as 408a890).

This is the cross-toolchain gate for FR-003 / SC-003: if Node's Ed25519 output is ever byte-incompatible with cardano-crypto-class (the same library Plutus's VerifyEd25519Signature calls internally), this test fails deterministically, in milliseconds, without needing the devnet to be involved.

Four cases:
- customer_pubkey is 32 bytes,
- customer_signature is 64 bytes,
- positive verify,
- negative verify with one byte of signed_data flipped (ensures the check has teeth; FR-008 positive/negative pairing).

Refs #15.
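
The shape checks and the flipped-byte tampering used by the negative case can be sketched in plain `bytestring` code. This is only an illustration: `flipByteAt` and `hasEd25519Shapes` are hypothetical names, and the actual verification call through `verifyDSIGN` is deliberately not shown because its re-exported signature does not appear in this text.

```haskell
import Data.Bits (xor)
import qualified Data.ByteString as BS

-- | Flip the lowest bit of the byte at index i.
-- Returns Nothing when the index is out of range.
flipByteAt :: Int -> BS.ByteString -> Maybe BS.ByteString
flipByteAt i bs
  | i < 0 || i >= BS.length bs = Nothing
  | otherwise =
      let (before, rest) = BS.splitAt i bs
       in case BS.uncons rest of
            Nothing -> Nothing
            Just (b, after) -> Just (before <> BS.cons (b `xor` 0x01) after)

-- | Size checks from the spec: a 32-byte public key and a 64-byte signature.
hasEd25519Shapes :: BS.ByteString -> BS.ByteString -> Bool
hasEd25519Shapes pk sig = BS.length pk == 32 && BS.length sig == 64
```

A negative test would feed the tampered payload, together with the untouched key and signature, to the verifier and expect rejection; the tampered payload keeps its original length, so only the signature check can fail.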
Per the directive 'E2E as documentation', the module header of DevnetSpendSpec is now the primary artefact: reading it top-to-bottom is the recommended way for a new contributor to understand how the three harvest components (Groth16 circuit, customer Ed25519 signature, Aiken validator) compose into a submitted Cardano transaction.

Each scenario is named by the actor or the defended invariant in the spec's vocabulary, not by implementation detail. Every negative case carries a paragraph of commentary that explains the attack it defends against in plain English, so a reader sees not just the test shape but the threat model.

The devnet bracket and submit path land in T021+T030-T033 as a follow-up commit; scenarios are 'pendingWith' until then. The narrative and the bundle-load sanity check compile and run clean on the current cardano-node-clients:devnet surface, so the documentation value is live on this commit without waiting for the harness iteration.

Test run: 31 examples, 0 failures, 5 pending.

Refs #15.
The previous CI step only did 'nix build' on every check derivation. For shell-app checks (unit-tests, lint) that means CI realised the wrapper but never invoked the test binary, so a regression in the test suite would silently pass. This was fine when no tests existed; now that the #15 E2E suite is landing, it would hide real regressions.

Changes:
- .github/workflows/ci.yml now splits the checks into two steps: a 'nix build' block for the derivation-backed checks (library, groth16-ffi, circuit, circuit-tests, aiken-check; their tests run during their build phase) and a 'nix run' block for the shell-app checks (unit-tests, lint). The lint job is folded into the main build-gate job since the split-out dependency is no longer useful.
- test/*.hs: fourmolu + hlint clean-up surfaced by running the new lint step locally (drop unused DataKinds / TypeApplications pragmas, reflow multi-line let-bindings, prefer multi-line Haddock on split-across-lines docstrings). No behavioural change.

Refs #15.
`nix run` against `.#checks.<system>.unit-tests` resolves to the haskell.nix test component's bin/unit-tests, not the writeShellApplication wrapper, so the wrapper's env vars (HARVEST_FIXTURES_DIR, cardano-node PATH, LD_LIBRARY_PATH for groth16-ffi) never fire. The test binary then fails with 'test/fixtures/proof.json: does not exist' because it falls back to the default relative path. Switching to `.#apps.<system>.unit-tests` forces resolution to nix/apps.nix, which uses pkgs.lib.getExe on the shell wrapper — guaranteed to be the env-setting wrapper. Same fix applied to the lint step. Refs #15.
The nix/checks.nix change that wires HARVEST_FIXTURES_DIR into the unit-tests shell wrapper was modified locally but never staged, so the wrapper CI built was missing that env var line and the test binary fell back to the default 'test/fixtures' relative path — nonexistent in the nix-store sandbox. Also remove the temporary CI debug step (cat the wrapper) that surfaced the mismatch. Refs #15.
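
The fallback behaviour described above can be sketched as follows. This is an assumption about the shape of the lookup, not the actual body of `Fixtures.fixturesDir`: the names `resolveFixturesDir` and `fixturesDirSketch` are hypothetical, while the `HARVEST_FIXTURES_DIR` variable and the `test/fixtures` default come from the commit messages.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- | Relative default used when the wrapper did not set the variable.
defaultFixturesDir :: FilePath
defaultFixturesDir = "test/fixtures"

-- | Pure core: prefer the environment value, else the relative default.
resolveFixturesDir :: Maybe FilePath -> FilePath
resolveFixturesDir = fromMaybe defaultFixturesDir

-- | IO wrapper that reads HARVEST_FIXTURES_DIR from the environment.
fixturesDirSketch :: IO FilePath
fixturesDirSketch = resolveFixturesDir <$> lookupEnv "HARVEST_FIXTURES_DIR"
```

With this shape, a wrapper that omits the variable silently selects the relative path, which is exactly the failure mode the missing line produced in the nix-store sandbox.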
cardano-node-clients main now includes the Ed25519 signing-side re-exports from PR #66 (signDSIGN, rawDeserialiseSignKeyDSIGN). Pinned at 18ef6fdc705b37740b1398720924f234e5dd6860 with nix32 sha 04z900hvgqnswgb063sf6ambyj9d9xixwcpyf1jjl600d2f909bf. Also surface sk_c through the fixture loader. The devnet test needs to re-sign signed_data at runtime with a live TxOutRef, so SpendBundle gains a sbSkC field sourced from customer.json's sk_c_hex. A doc comment flags that the production reificator never sees this value; it lives in the fixture only because the test needs to generate valid signatures on demand. Refs #15.
Adds SpendHarness.hs with two helpers:
* replaceTxOutRef — rewrites the (txid, ix) prefix of signed_data
without touching the signature, used by negative tests that
tamper with the signed payload;
* resignedData — rewrites (txid, ix) AND re-signs with the
customer's Ed25519 key, so the devnet golden path can bind a
live, devnet-chosen TxOutRef into the redeemer.
SpendHarnessSpec exercises the re-sign path directly against the
fixture bundle: parses pk_c with cardano-node-clients'
rawDeserialiseVerKeyDSIGN, re-signs with signDSIGN, serialises with
rawSerialiseSigDSIGN, verifies with verifyDSIGN. 7 cases covering
shape (106 bytes, 64-byte sig), field correctness (txid, ix, d), and
the signature round-trip.
All primitives flow through the cardano-node-clients:devnet public
surface. The pin bumps to a feature-branch commit
(f6296179878729562e29c31ecf5b2333537897d2) that carries the
raw-serialise re-exports; the upstream PR is open for review, and the
pin will move to main once it is merged.
Test run: 38 examples, 0 failures, 5 pending (devnet scenarios).
Refs #15.
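
The prefix rewrite performed by the first helper can be sketched in pure `bytestring` code. This is not the repository's `replaceTxOutRef`; the name `replaceTxOutRefSketch`, its argument order, and its error messages are assumptions. It relies only on the documented layout, in which txid occupies bytes 0 to 31 and ix bytes 32 to 33, big-endian.

```haskell
import Data.Bits (shiftR)
import qualified Data.ByteString as BS
import Data.Word (Word16)

-- | Overwrite the (txid, ix) prefix of a 106-byte signed_data payload,
-- leaving the remaining 72 bytes (acceptor_ax, acceptor_ay, d) untouched.
replaceTxOutRefSketch ::
  BS.ByteString -> -- new 32-byte txid
  Word16 -> -- new output index
  BS.ByteString -> -- original 106-byte signed_data
  Either String BS.ByteString
replaceTxOutRefSketch txid ix signedData
  | BS.length txid /= 32 = Left "txid must be 32 bytes"
  | BS.length signedData /= 106 = Left "signed_data must be 106 bytes"
  | otherwise = Right (BS.concat [txid, ixBytes, BS.drop 34 signedData])
  where
    -- Big-endian encoding of the 16-bit index: high byte first.
    ixBytes = BS.pack [fromIntegral (ix `shiftR` 8), fromIntegral ix]
```

A signature made over the original payload no longer matches the rewritten bytes, which is what the negative tests rely on; the golden path additionally re-signs the result with the customer's key.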
…erged) Upstream PR #67 merged to main as commit b9fbbb504eaa903c95517f3285a5b048a4a7f537. Bump the downstream pin off the feature branch to that main commit per the pins-main-only rule (feature-branch pins are for iteration only).
Refs #15. Draft.
Summary
End-to-end tests for harvest's spend protocol, delivered first as executable documentation (per the user's directive: "what counts for me is E2E as documentation") and second as a regression gate. The full narrative of how a voucher spend works end-to-end lives at the top of `offchain/test/DevnetSpendSpec.hs`; reading that module top-to-bottom is the recommended path for a new contributor.

Shape of this PR
Landed and green
- `specs/002-e2e-tests/`: spec, plan, research, data model, contract (canonical 106-byte `signed_data` layout), quickstart, tasks. The whole feature was driven through `/speckit`.
- `DevnetSpendSpec.hs` module header = the E2E narrative. Six numbered phases from "coalition runs a devnet" to "node accepts or rejects". Every scenario has an inline paragraph explaining, in plain English, the threat model it defends against.
- `SignedDataLayoutSpec.hs` (8 cases): parses the fixture's 106-byte `signed_data` and asserts each of the five bound fields (txid, ix, acceptor_ax, acceptor_ay, d) matches what the Node-side signer claims to have set. If the JS signer and the Aiken validator ever disagree on the byte layout, this test fails deterministically, in milliseconds.
- `Ed25519Spec.hs` (4 cases, FR-003): verifies the fixture's `(customer_pubkey, signed_data, customer_signature)` using the `verifyDSIGN` primitive re-exported by cardano-node-clients PR #65. Positive + flipped-byte negative both pass, proving the Node-produced bytes are byte-compatible with what the Plutus builtin will consume.
- Pre-existing Groth16Spec and E2ESpec: now that these tests actually execute under `nix run`, their hardcoded relative paths needed to use `Fixtures.fixturesDir` instead.
- `.github/workflows/ci.yml`: splits into derivation-backed checks (via `nix build`: library, groth16-ffi, circuit, circuit-tests, aiken-check; their tests run during the build phase) and shell-app checks (via `nix run`: unit-tests, lint; these must be executed). Ensures regressions in the test suite actually fail CI.
- `HARVEST_FIXTURES_DIR` env var wired through `nix/checks.nix` so the test binary finds fixtures in the nix-store sandbox.

Test status
The five pending scenarios are the devnet execution of the golden path (T021) and the four negatives (T030–T033). They are clearly named and each carries its threat-model explanation inline.
Deferred to a follow-up
The devnet bracket from `cardano-node-clients:devnet` is consumable but needs a fixture-signing affordance before the golden path can actually submit: the fixture's `signed_data` is bound to a zero txid (placeholder), and a real devnet tx consumes a real TxOutRef, so the test has to re-sign `signed_data` at runtime with `sk_c` and a live txid. That needs `signDSIGN` re-exported from `Cardano.Node.Client.E2E.Setup`, a one-line upstream patch analogous to the `verifyDSIGN` one that already landed in cardano-node-clients PR #65.

Rather than bundling a second upstream patch into an already-substantial PR, this is left as the natural next follow-up. The narrative and supporting tests deliver the documentation value now; the devnet execution is additive.
Changes guided tour
`offchain/test/DevnetSpendSpec.hs`
The docstring is ~70 lines and walks the reader through every phase of a spend. Each `it` block title names a defended invariant in the protocol's vocabulary (e.g. "defends against customer-key substitution") rather than an implementation assertion.

`offchain/test/SignedDataLayout.hs`
Single Haskell authority for the 106-byte layout. Offsets as named constants, decoder that errors clearly on wrong-size inputs. Mirrors `specs/002-e2e-tests/contracts/signed-data-layout.md`.

`offchain/test/Fixtures.hs`
Loads proof, VK, public signals, customer bundle, and applied-script hex into a single `SpendBundle` record. No parallel fixture copies; reads directly from the authoritative JSON/hex files.

`nix/checks.nix`, `flake.nix`
- `unit-tests` is now a `writeShellApplication` that prepends `cardano-node` to `PATH`, sets `HARVEST_FIXTURES_DIR`, exports `LD_LIBRARY_PATH` for the groth16-ffi library, and execs the test binary.
- `cardano-node` input is pinned to the same version (10.7.0) that `cardano-node-clients` itself tests against, so there is no drift.

`.github/workflows/ci.yml`
`nix build` for derivation-backed checks + `nix run` for shell-app checks. The lint job is folded into the main build-gate since it's cheap and there's no parallelism benefit to keeping it separate.

Local CI
All checks green locally:
- `nix build .#checks.x86_64-linux.library`
- `nix build .#checks.x86_64-linux.groth16-ffi`
- `nix build .#checks.x86_64-linux.circuit`
- `nix build .#checks.x86_64-linux.circuit-tests`
- `nix build .#checks.x86_64-linux.aiken-check`
- `nix run .#checks.x86_64-linux.unit-tests` → 31 examples, 0 failures, 5 pending
- `nix run .#checks.x86_64-linux.lint` → no hints

Test plan
- … `nix run` workflow end-to-end).
- … `DevnetSpendSpec.hs` top-to-bottom: does it actually teach a reader how a harvest spend works? Feedback on phrasing or missing steps is welcome.
- … `specs/002-e2e-tests/spec.md` and `contracts/signed-data-layout.md` and confirm they match the implementation.
- … `signDSIGN` re-export) or whether this slice merges and the follow-up issues unblock separately.