feat: add JSON as a second canonicalized language #27

Merged
bdelanghe merged 1 commit into main from feat/json-roundtrip-filter on Jun 29, 2026

@bdelanghe bdelanghe commented Jun 28, 2026

Collaborator

Summary

Adds JSON as a second canonicalized language in git-ast's clean/smudge round-trip, alongside the Rust support from #26. JSON is additive: it slots into the per-extension dispatch in filters::transform — no change to the pkt-line filter-process protocol, the printer, or pktline.
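A minimal sketch of what a per-extension dispatch of this shape might look like. The names `Direction`, `Route`, and `route` are hypothetical and only illustrate the routing decision; the actual `filters::transform` signature is not shown in this PR description.

```rust
use std::path::Path;

/// Which side of the round-trip Git is asking for (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    Clean,
    Smudge,
}

/// Where a blob gets sent (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Route {
    Rust,
    Json,
    PassThrough,
}

/// Pick a route from the file extension. Smudge is the identity, and any
/// extension without a registered language passes through untouched.
fn route(path: &str, direction: Direction) -> Route {
    if direction == Direction::Smudge {
        return Route::PassThrough;
    }
    match Path::new(path).extension().and_then(|ext| ext.to_str()) {
        Some("rs") => Route::Rust,
        Some("json") => Route::Json,
        _ => Route::PassThrough,
    }
}
```

Under this shape, adding a language is one more match arm plus its own module, which is the sense in which JSON is additive here.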

Rebased onto current main (#26). An earlier revision of this branch was built on a stale pre-#26 base; it has been replaced with this additive integration — no conflicts, no duplication.

  • src/json.rs: canonicalize() parses with serde_json and re-emits a deterministic canonical form (object keys sorted, pretty-printed, trailing newline). Same contract as printer::canonicalize; fail-closed on invalid JSON (rejects the commit).
  • src/filters.rs: transform() now dispatches by extension (.rs → printer, .json → json), with smudge identity and pass-through for everything else.
  • src/setup.rs: routes both *.rs and *.json in .gitattributes (idempotent).
  • src/main.rs / src/lib.rs: help text + crate docs updated for two languages.
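To illustrate what "idempotent" means for the .gitattributes routing, here is a rough sketch that appends a route only when it is missing. The filter name `git-ast` in the attribute lines and the helper `ensure_routes` are assumptions for illustration; the real attribute lines written by src/setup.rs are not shown in this PR.

```rust
/// Hypothetical attribute lines; the real filter name is an assumption.
const ROUTES: [&str; 2] = ["*.rs filter=git-ast", "*.json filter=git-ast"];

/// Return `existing` with every missing route appended exactly once.
/// Running it on its own output changes nothing.
fn ensure_routes(existing: &str) -> String {
    let mut out = String::from(existing);
    for route in ROUTES.iter() {
        let present = existing.lines().any(|line| line.trim() == *route);
        if !present {
            // Keep one attribute per line even if the file lacked a final newline.
            if !out.is_empty() && !out.ends_with('\n') {
                out.push('\n');
            }
            out.push_str(route);
            out.push('\n');
        }
    }
    out
}
```

Re-running setup in a repository that already routes *.rs would then add only the *.json line.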

Design note

Canonical form is sorted-key pretty JSON, not compact RFC-8785 JCS: same value-level normalization (sorted keys, deterministic scalars), but one value per line, so the filter's purpose of producing cleaner diffs still holds. Uses serde_json only (its Map is a BTreeMap by default, i.e. without the preserve_order feature, so keys sort and scalars format deterministically).
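The key-order independence that a BTreeMap provides can be shown with the standard library alone. This is a simplified stand-in, not the serde_json code path: the `render` helper is hypothetical, handles only a flat object of integer values, and assumes keys need no escaping.

```rust
use std::collections::BTreeMap;

/// Render a flat object as sorted-key, one-member-per-line JSON with a
/// trailing newline. Illustration only: no nesting, no string escaping.
fn render(entries: &[(&str, i64)]) -> String {
    // A BTreeMap iterates in key order regardless of insertion order.
    let sorted: BTreeMap<&str, i64> = entries.iter().copied().collect();
    if sorted.is_empty() {
        return String::from("{}\n");
    }
    let last = sorted.len() - 1;
    let mut out = String::from("{\n");
    for (i, (key, value)) in sorted.iter().enumerate() {
        let comma = if i == last { "" } else { "," };
        out.push_str(&format!("  \"{key}\": {value}{comma}\n"));
    }
    out.push_str("}\n");
    out
}
```

Two inputs that differ only in member order render to the same bytes, so a pure reordering produces no diff, and a change to one member touches one line.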

Tests — all green (31 unit + 10 scenarios / 40 steps)

  • src/json.rs (6): sort/normalize, idempotence, key-order independence, value fidelity, trailing newline, reject-invalid.
  • src/filters.rs (2): clean canonicalizes JSON over the protocol; invalid JSON yields status=error.
  • cucumber claims (tests/features/claims.feature, 4 new): reformatting → no diff, byte-identical blobs, checkout round-trip, fail-closed — driving real git. The install step now routes *.json too.
  • cargo fmt --all -- --check + cargo clippy --all-targets -- -D warnings clean.
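For context on the status=error case: in Git's long-running filter process protocol, the filter reports the outcome for a blob as a status pkt-line followed by a flush packet. The sketch below shows only that framing; `pkt_line` and `status_response` are hypothetical helpers, not the functions in this repository's pktline module, and a real implementation would write bytes and enforce the maximum packet size.

```rust
/// Flush packet: a bare length of zero.
const FLUSH: &str = "0000";

/// Encode one pkt-line: four lowercase hex digits giving the total length
/// (payload plus the 4-byte length header), followed by the payload.
fn pkt_line(payload: &str) -> String {
    format!("{:04x}{}", payload.len() + 4, payload)
}

/// Status header the filter sends back for one blob, terminated by a flush.
fn status_response(ok: bool) -> String {
    let status = if ok { "status=success\n" } else { "status=error\n" };
    format!("{}{}", pkt_line(status), FLUSH)
}
```

On success the canonical content follows the status header; on error no content is sent, which is what makes the filter fail closed.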

PR run sheet

  1. Independent PR — only adds the JSON language; no bundled/speculative changes
  2. Changed codepaths verified — unit + protocol + 4 real-git cucumber scenarios
  3. Root cause identified — n/a (additive feature)
  4. No duplication — reuses the transform/pktline/setup from #26 (feat: working clean/smudge round-trip for a Rust subset); one dispatch arm + one language module
  5. No unrelated changes — README/help/docs scoped to the new language

Trust ledger

Moves row 8.1 toward 🟡: a second language now rides the same working pipeline. Rust remains a documented subset; structural diff/merge (node identity) stays out of scope.

🤖 Generated with Claude Code

@bdelanghe bdelanghe closed this Jun 28, 2026
@bdelanghe bdelanghe reopened this Jun 28, 2026
Adds JSON to git-ast's clean/smudge round-trip, alongside the existing Rust
support. JSON is additive and slots into the per-extension dispatch in
filters::transform — no change to the pkt-line protocol, printer, or pktline.

- src/json.rs: canonicalize() parses with serde_json and re-emits a
  deterministic canonical form (object keys sorted, pretty-printed, trailing
  newline). Same contract as printer::canonicalize; fail-closed on invalid JSON.
- src/filters.rs: transform() now dispatches by extension — .rs -> printer,
  .json -> json — with smudge identity and pass-through for everything else.
- src/setup.rs: routes both *.rs and *.json in .gitattributes (idempotent).
- src/main.rs / lib.rs: help text + crate docs updated for two languages.

Canonical form is sorted-key *pretty* JSON (diff-friendly) rather than compact
RFC-8785 JCS — same value-level normalization, one value per line. serde_json
only (its Map is a BTreeMap, so keys sort and scalars format deterministically).

Tests:
- 6 unit tests in json.rs (sort/normalize, idempotence, key-order independence,
  value fidelity, trailing newline, reject-invalid)
- 2 protocol tests in filters.rs (clean canonicalizes JSON; invalid JSON yields
  status=error over the filter-process protocol)
- 4 JSON scenarios in the cucumber claims suite (reformatting -> no diff,
  byte-identical blobs, checkout round-trip, fail-closed) driving real git;
  the install step now routes *.json too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bdelanghe bdelanghe force-pushed the feat/json-roundtrip-filter branch from 20a745d to c2ce65f Compare June 28, 2026 23:47
@bdelanghe bdelanghe changed the title from "feat: real JSON clean/smudge filter (retire the placeholder)" to "feat: add JSON as a second canonicalized language" Jun 28, 2026
@bdelanghe bdelanghe marked this pull request as ready for review June 28, 2026 23:49
@bdelanghe bdelanghe merged commit 2c9531d into main Jun 29, 2026
1 check passed
@bdelanghe bdelanghe deleted the feat/json-roundtrip-filter branch June 29, 2026 00:03