Import RFC 8785 (JCS) test vectors from cyberphone reference suite across SDKs #110

@EfeDurmaz16

Filed from the PR #102 review thread (two inline comments from Ludo).

Scope

Import battle-tested RFC 8785 (JSON Canonicalization Scheme) test vectors from an external reference suite. Apply the same corpus to every SDK that ships a JCS encoder so cross-SDK byte-for-byte equality is mechanically verifiable.

Authoritative sources to research

  1. RFC 8785 itself (Rundgren / WebPKI.org): https://datatracker.ietf.org/doc/html/rfc8785
    • Section 3.1: I-JSON requirement for input data.
    • Section 3.2.3: property sorting with worked examples.
    • Appendix B: number serialization samples (IEEE 754 value -> expected JSON text pairs).
  2. cyberphone/json-canonicalization reference repository: https://github.com/cyberphone/json-canonicalization
    • `testdata/input/`: input JSON files.
    • `testdata/output/`: expected canonical output bytes.
    • `testdata/numbers/`: number serialization edge cases (ES6 ToString).
    • Cross-validated against Java, Node, Go, C#, Ruby, Python reference implementations in the same repo.
  3. ECMA-262 Number::toString (referenced by JCS sec 3.2.2.3, Serialization of Numbers).
  4. Cross-language reference implementations to compare against:
    • JavaScript: npm `canonicalize` package by Samuel Erdtman + cyberphone repo's `npm/canonicalize`.
    • Java: `org.webpki.json.JSONCanonicalizer` in cyberphone repo.
    • Python: `jcs` package, plus the Python implementation in the cyberphone repo. (PyJWT is a JWT library, not a JCS encoder.)
    • Go: `github.com/cyberphone/json-canonicalization/go/src/webpki.org/jsoncanonicalizer`.
  5. NIST / WebPKI public test vectors: https://webpki.org/jose/canonicalization/
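
For anyone triaging failures once the corpus is in: the two rules in the sources above where encoders tend to diverge are property ordering (sec 3.2.3, sorted by UTF-16 code units) and number serialization (ECMAScript Number::toString). Below is a minimal Python sketch of both, not taken from any SDK. `utf16_sort_key` and `es6_number_to_string` are hypothetical names, and the number formatter assumes Python's `repr()` produces the same shortest round-trip digits as ECMAScript, which should be checked against the imported number vectors rather than trusted.

```python
import math
from decimal import Decimal


def utf16_sort_key(name: str) -> bytes:
    # Big-endian UTF-16 bytes compare in the same order as UTF-16 code units.
    # Raises on lone surrogates, which I-JSON input should not contain.
    return name.encode("utf-16-be")


def es6_number_to_string(x: float) -> str:
    """Format a double following the ECMAScript Number::toString layout rules.

    Assumption: repr(x) yields the same shortest round-trip digits as ECMAScript.
    """
    if math.isnan(x) or math.isinf(x):
        raise ValueError("NaN and Infinity cannot be canonicalized")
    if x == 0:
        return "0"  # also covers -0.0
    if x < 0:
        return "-" + es6_number_to_string(-x)

    # Split into significant digits and a power-of-ten exponent.
    _, digit_tuple, exponent = Decimal(repr(x)).as_tuple()
    raw = "".join(str(d) for d in digit_tuple)
    digits = raw.rstrip("0")
    exponent += len(raw) - len(digits)

    k = len(digits)   # number of significant digits
    n = k + exponent  # position of the decimal point relative to the digits

    if k <= n <= 21:
        return digits + "0" * (n - k)           # plain integer
    if 0 < n <= 21:
        return digits[:n] + "." + digits[n:]    # decimal point inside digits
    if -6 < n <= 0:
        return "0." + "0" * (-n) + digits       # small fraction
    e = n - 1                                   # exponential notation
    mantissa = digits[0] + ("." + digits[1:] if k > 1 else "")
    return f"{mantissa}e{'+' if e >= 0 else '-'}{abs(e)}"
```

Under these assumptions, `es6_number_to_string(1e21)` gives `1e+21`, `es6_number_to_string(1e-7)` gives `1e-7`, and `es6_number_to_string(-0.0)` gives `0`. Sorting with `key=utf16_sort_key` places a supplementary-plane character such as U+1F600 before U+FB33, which a plain code-point sort does not.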

Current test coverage to replace

Each SDK has 20-30 hand-rolled JCS cases today (see the lua/php/ruby/python/ts canonical-JSON test files; lua now in `json_canonical_rfc8785_spec.lua`, ruby in `json_canonical_rfc8785_test.rb`). The import should replace these per-language cases with the shared corpus.

Cross-SDK scope

Apply imported vectors to PHP, Ruby, Lua, Python, and TypeScript JCS encoders. Each language's test file should import the SAME corpus.
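
A rough sketch of what "import the SAME corpus" could look like on the Python side, with the other SDKs mirroring the same loop. It assumes the vendored corpus keeps the `testdata/input/` and `testdata/output/` layout with matching file names, and that expected files carry no trailing newline; `check_corpus` and the `canonicalize` callable are hypothetical names standing in for whatever each SDK exposes.

```python
import json
from pathlib import Path
from typing import Callable, List


def check_corpus(corpus_dir: Path, canonicalize: Callable[[object], bytes]) -> List[str]:
    """Return the names of vectors whose canonical bytes differ from the expected file."""
    failures = []
    for input_path in sorted((corpus_dir / "input").glob("*.json")):
        # Expected output is compared as raw bytes, never re-parsed.
        expected = (corpus_dir / "output" / input_path.name).read_bytes()
        parsed = json.loads(input_path.read_text(encoding="utf-8"))
        if canonicalize(parsed) != expected:
            failures.append(input_path.name)
    return failures
```

Comparing raw bytes keeps the check byte-for-byte across SDKs; a per-language test would then just assert that `check_corpus(...)` returns an empty list.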

Out of scope

Do NOT propose fabricated vectors. The whole point is to import from a reference suite that other implementations have already battle-tested.

Label as `m2-followup` (or whatever the repo uses for post-M1 cross-SDK hardening).
