Transpiler cleanup #15

Closed
cuzzo wants to merge 1 commit into master from transpiler-cleanup

@cuzzo cuzzo commented May 7, 2026

#TRANSPILE_PURE

Self-host preparation: typed schemas, struct extractions, walker traits, respond_to? cleanup. Squashed from 21 incremental commits.

Test plan

  • srb tc clean
  • prspec 3967/0
  • transpile-tests 432/0 leaks
  • #TRANSPILE_PURE workflow passes (verifies generated Zig is unchanged after counter-id normalization)

How #TRANSPILE_PURE works on this branch

The transpiler uses node.object_id.abs to disambiguate nested-WITH guard variables, MATCH binding aliases, snapshot guards, and acquire-block labels. Ruby object_id shifts whenever allocation order changes — so any refactor that adds/removes object allocations produces different counter values even when the generated Zig is functionally identical.

We can't replace object_id in the transpiler: it uniquely identifies cloned AST subtrees, which pipeline rewriting can produce. Source-position-based naming would silently collide on clones that share (line, column) — that would be a real bug.

Solution: keep the transpiler unchanged and normalize the OUTPUT instead. tools/normalize_zig.rb maps each unique counter to a stable first-occurrence index per file (__c_guard_3220 → __c_guard_N1). The workflow normalizes both base and branch trees before diffing, so structurally-equivalent output diffs to zero.

Verified locally: 54 raw file diffs (counter shifts) → 0 after normalization.
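
The first-occurrence renumbering described above can be sketched in a few lines of Ruby. This is an illustration only, not the contents of tools/normalize_zig.rb: the `COUNTER_RE` pattern and the `normalize_counters` name are assumptions, and only the `__c_guard_` prefix and the `N1` output form come from the description.

```ruby
# Hypothetical pattern: an identifier starting with "__c_" and ending in a
# run of digits (the object_id-derived counter).
COUNTER_RE = /(__c_[a-z_]*?_)(\d+)\b/

# Replace every raw counter with a per-file index assigned in order of
# first appearance, so the result no longer depends on allocation order.
def normalize_counters(source)
  seen = {}
  source.gsub(COUNTER_RE) do
    prefix = Regexp.last_match(1)
    raw    = Regexp.last_match(2)
    index  = (seen[raw] ||= seen.size + 1)
    "#{prefix}N#{index}"
  end
end

base   = "var __c_guard_3220 = 1; use(__c_guard_3220); var __c_guard_4100 = 2;"
branch = "var __c_guard_17 = 1; use(__c_guard_17); var __c_guard_9 = 2;"
puts normalize_counters(base) == normalize_counters(branch)
```

Two files that differ only in their raw counter values normalize to the same text, which is the property the workflow relies on when it diffs the base and branch trees.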

See commit message for full breakdown.

github-actions Bot commented May 7, 2026

🐰 Bencher Report

Branch: transpiler-cleanup
Testbed: ubuntu-latest

⚠️ WARNING: No Threshold found!

Without a Threshold, no Alerts will ever be generated.

To only post results if a Threshold exists, set the --ci-only-thresholds flag.

Benchmark results (no Threshold is set for any measure):

| Benchmark | leak-build-ms (units x 1e3) | leak-count (units) | leak-run-ms (units) |
|---|---|---|---|
| benchmarks/concurrent/01_socket_throughput/bench | 5.59 | 0.00 | 1,362.88 |
| benchmarks/concurrent/06_dynamic_spawn/bench | 5.08 | 0.00 | 3,307.55 |
| benchmarks/concurrent/11_parallel_aggregation/bench | 4.99 | 0.00 | 6,916.93 |
| benchmarks/concurrent/18_atomic_counter/bench | 4.97 | 0.00 | 40.86 |
| benchmarks/inter-clear/04_concurrent_mvcc_fat_struct/bench | 5.12 | 0.00 | 332.14 |
| benchmarks/sequential/03_alloc_throughput/bench | 4.96 | 0.00 | 11,790.00 |
| benchmarks/sequential/08_sort/bench | 5.03 | 0.00 | 1,480.64 |
| benchmarks/sequential/13_soa_layout/bench | 4.99 | 0.00 | 1,327.45 |

codecov-commenter commented May 7, 2026

Codecov Report

❌ Patch coverage is 81.81818% with 152 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.76%. Comparing base (0b6c26b) to head (3b94242).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/backends/pipeline_host.rb | 36.76% | 86 Missing ⚠️ |
| src/ast/ast.rb | 79.62% | 11 Missing ⚠️ |
| src/mir/escape_analysis.rb | 42.85% | 8 Missing ⚠️ |
| src/mir/control_flow.rb | 91.13% | 7 Missing ⚠️ |
| src/mir/mir_lowering.rb | 90.78% | 7 Missing ⚠️ |
| src/annotator-helpers/pipe_analysis.rb | 91.78% | 6 Missing ⚠️ |
| src/mir/mir_pass.rb | 81.81% | 6 Missing ⚠️ |
| src/mir/promotion_plan.rb | 84.84% | 5 Missing ⚠️ |
| src/mir/fsm_transform/segments.rb | 0.00% | 4 Missing ⚠️ |
| src/annotator-helpers/capabilities.rb | 94.54% | 3 Missing ⚠️ |
| ... and 7 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #15      +/-   ##
==========================================
+ Coverage   89.73%   89.76%   +0.02%     
==========================================
  Files         173      174       +1     
  Lines       46723    46905     +182     
  Branches    11604    11463     -141     
==========================================
+ Hits        41927    42102     +175     
- Misses       4796     4803       +7     
| Flag | Coverage Δ |
|---|---|
| ruby | 85.78% <81.81%> (+0.06%) ⬆️ |
| zig | 95.58% <ø> (+<0.01%) ⬆️ |

@cuzzo cuzzo force-pushed the transpiler-cleanup branch 21 times, most recently from aa91282 to 69570e2 Compare May 8, 2026 11:18
…ts, respond_to? cleanup

Squashed from 21 incremental commits. Original history preserved at tag
`transpiler-cleanup-original`. Each piece below was independently green
on srb tc / prspec (3598/0) / transpile-tests (432/0 leaks).

Hash-as-struct schemas → typed Data classes in src/ast/schemas.rb:
- `EnumSchema(variants, visibility)`
- `ResourceSchema(close_zig, static_methods, fields, type_params, ...)`
- `UnionSchema(variants, type_params, visibility)`
- `StructSchema(fields, field_defaults, borrowed_fields, type_params,
   methods, visibility, extern_module, as_type)`

Eliminates 60+ `schema.is_a?(Hash) && schema[:kind] == :X` dispatches
across annotator, MIR pipeline, and tools. Fields cleanly separated
from metadata; the `schema.keys.reject { |k| k.is_a?(Symbol) }` pattern
goes away (subsumes the original String/Symbol normalization).
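
As a rough illustration of the hash-to-Data move, the sketch below contrasts the two dispatch styles. It assumes Ruby 3.2+ `Data.define`; the field lists for `EnumSchema` and `UnionSchema` follow the signatures above, while `enum_hash?` and `describe` are hypothetical helpers that do not come from the codebase.

```ruby
# Before: a schema is a Hash tagged with :kind, so every consumer re-checks
# both the container type and the tag.
def enum_hash?(schema)
  schema.is_a?(Hash) && schema[:kind] == :enum
end

# After: one immutable value class per schema kind (Ruby 3.2+).
EnumSchema  = Data.define(:variants, :visibility)
UnionSchema = Data.define(:variants, :type_params, :visibility)

# Dispatch becomes a plain class match; metadata lives in named members
# rather than sharing a key space with user-defined fields.
def describe(schema)
  case schema
  when EnumSchema  then "enum with #{schema.variants.size} variants"
  when UnionSchema then "union with #{schema.variants.size} variants"
  else raise ArgumentError, "unknown schema: #{schema.inspect}"
  end
end

color = EnumSchema.new(variants: %i[red green], visibility: :public)
puts describe(color)
```

Because `Data` instances are frozen and reject unknown members at construction, a misspelled key fails immediately instead of silently returning nil.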

Parallel registries in `mir_lowering.rb` (`@struct_schemas`,
`@union_schemas`) and `importer.rb` carry typed values too.

- `Formatter::Emitter::FnSig(toks, start, arrow_idx, po, pc)` — 5
  methods that took the same arg cluster.
- `PipelineHost::PipelineSite(list, options)` — 24 lower_X methods in
  PipelineHost; clump dropped from 13 methods.
- `MIRPass::WalkCtx(bindings, promo)` — read-only carry through
  transform_body / recurse_branches!.
- `OwnershipDataflow::DataflowStep(state, consumed)` — per-walk state
  for collect_ownership_transfers + 4 helpers.
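
A small sketch of the argument-cluster extraction, using `WalkCtx(bindings, promo)` as the example. Only the struct name and its two members come from the list above; the toy `transform_body` and `rename` bodies are hypothetical stand-ins, and Ruby 3.2+ `Data.define` is assumed.

```ruby
# Read-only context carried through the walk instead of two loose arguments.
WalkCtx = Data.define(:bindings, :promo)

# Previously (hypothetically): transform_body(body, bindings, promo).
# Nested arrays stand in for nested bodies.
def transform_body(body, ctx)
  body.map do |stmt|
    stmt.is_a?(Array) ? transform_body(stmt, ctx) : rename(stmt, ctx)
  end
end

# Toy leaf transform: apply the binding alias, and tag promoted names.
def rename(name, ctx)
  resolved = ctx.bindings.fetch(name, name)
  ctx.promo.include?(name) ? "heap:#{resolved}" : resolved
end

ctx = WalkCtx.new(bindings: { "x" => "x_1" }, promo: ["y"])
p transform_body(["x", ["y", "z"]], ctx)
```

Adding a third piece of per-walk state then means adding one member to the struct rather than touching every method signature in the chain.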

Introduced `AST::HasBodies` module. Body-owning AST nodes (12 types:
IfStatement, WhileLoop, ForRange, ForEach, MatchStatement, WithBlock,
DoBlock, BgBlock, BgStreamBlock, FunctionDef, TestBlock,
WhileBindLoop) declare `child_bodies`. `AST.walk_body` and
`AST._bg_visit_recursive` collapse from hand-coded case chains to
trait-driven loops. Adding a new body-owning node type is now a single
include + def, no walker edits.
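
The trait-driven walker might look roughly like the sketch below. The `HasBodies` and `child_bodies` names come from the description; the node definitions, the `Call` leaf type, and this `walk_body` implementation are simplified assumptions, not the real `AST.walk_body`.

```ruby
# Trait: any node that owns statement lists reports them via child_bodies.
module HasBodies
  def child_bodies
    raise NotImplementedError, "#{self.class} must define child_bodies"
  end
end

IfStatement = Struct.new(:condition, :then_body, :else_body) do
  include HasBodies
  def child_bodies
    [then_body, else_body].compact
  end
end

WhileLoop = Struct.new(:condition, :body) do
  include HasBodies
  def child_bodies
    [body]
  end
end

Call = Struct.new(:name) # leaf node: owns no bodies

# One generic loop replaces a case chain with an arm per node type.
def walk_body(body, &block)
  body.each do |node|
    block.call(node)
    next unless node.is_a?(HasBodies)
    node.child_bodies.each { |child| walk_body(child, &block) }
  end
end

tree = [
  IfStatement.new(:cond, [Call.new("a"), WhileLoop.new(:cond, [Call.new("b")])], nil),
  Call.new("c")
]
names = []
walk_body(tree) { |node| names << node.name if node.is_a?(Call) }
p names
```

A new body-owning node only needs the `include` and a `child_bodies` definition; `walk_body` itself stays unchanged.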

respond_to? sites: 655 → 504 (-151). Methodology: trace each receiver via Prism +
call-site grep before touching. Most "guards" were dead defensive code
where Locatable's universal accessors (token, full_type, type_info,
storage, matched_stdlib_def, was_moved, zig_pattern, mutates_receiver,
line, column, etc.) made the check pointless on AST receivers.

Clusters cleaned: `:strip` / `:empty?` (emit() returns String|nil),
`:line` / `:column` (Locatable + Token), `:matched_stdlib_def`,
`:was_moved`, `:zig_pattern`, `:mutates_receiver`, `:token`,
`:full_type` (40 sites), `:value` (sites with case/when narrowing).

False positives caught by tests: 1 in `visit_StubDecl` where
`node.value` is genuinely polymorphic (AST | Symbol). Replaced with
explicit `is_a?(AST::Locatable)` and a comment.
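
To make the distinction concrete, here is a hedged sketch of a dead guard versus a genuinely polymorphic receiver. The `Locatable` module below is a two-attribute stand-in for the real one, and `Node`, `position_before`, `position_after`, and `describe_value` are hypothetical names.

```ruby
# Stand-in for the shared accessor module mixed into every AST node.
module Locatable
  attr_accessor :line, :column
end

class Node
  include Locatable

  def initialize(line, column)
    @line = line
    @column = column
  end
end

# Before: the guard can never fail when the receiver is always an AST node.
def position_before(node)
  node.respond_to?(:line) ? "#{node.line}:#{node.column}" : "?"
end

# After: the guard is removed once the receiver type has been traced.
def position_after(node)
  "#{node.line}:#{node.column}"
end

# Genuinely polymorphic value (node or Symbol): keep a check, but make it
# an explicit type test rather than a duck-typing probe.
def describe_value(value)
  if value.is_a?(Locatable)
    "node at #{value.line}:#{value.column}"
  else
    "symbol #{value}"
  end
end

puts position_after(Node.new(3, 7))
puts describe_value(:stub)
```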

Also removed 1 spec test that locked in the dead-defensive code via
`Struct.new(:stack_tier, :stack_vars_bytes)` — lockstep deletion per
CLAUDE.md "test for deleted functionality."

Converted PipelineHost#transpile_pipeline's 24-arm if/elsif chain to
case/when (54 LOC → 30 LOC). Other AST-is_a? chains in src/ are 2-3
arms with mixed predicates (`is_a?(X) && was_moved`) where case/when
doesn't simplify cleanly — those stay as-is.
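
The shape of that conversion, shown on a three-arm toy rather than the real 24-arm method. `Map`, `Filter`, `Take`, and both `lower_*` methods are hypothetical.

```ruby
Map    = Struct.new(:fn)
Filter = Struct.new(:fn)
Take   = Struct.new(:count)

# Before: one is_a? test per arm.
def lower_with_chain(op)
  if op.is_a?(Map)
    "map(#{op.fn})"
  elsif op.is_a?(Filter)
    "filter(#{op.fn})"
  elsif op.is_a?(Take)
    "take(#{op.count})"
  else
    raise ArgumentError, "unknown op: #{op.inspect}"
  end
end

# After: case/when uses Class#=== for the same dispatch. This only works
# cleanly when every arm is a pure class test; an arm such as
# `is_a?(X) && was_moved` has no direct `when` equivalent.
def lower_with_case(op)
  case op
  when Map    then "map(#{op.fn})"
  when Filter then "filter(#{op.fn})"
  when Take   then "take(#{op.count})"
  else raise ArgumentError, "unknown op: #{op.inspect}"
  end
end

puts lower_with_case(Take.new(3))
```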

- `.github/workflows/transpile-pure.yml`: PR-body-triggered byte-diff
  of generated .zig vs merge base. Gated on `#TRANSPILE_PURE` marker.
- `clear emit-zig <path> -o <outdir>` CLI subcommand: walks .cht files
  and writes generated .zig to a mirrored tree without compiling.
- Used by the workflow to capture deterministic transpiler output.
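
The mirrored-tree layout can be sketched as follows. This is not the `clear emit-zig` implementation: `mirrored_output_path` and `emit_tree` are hypothetical names, and the block passed to `emit_tree` stands in for the real transpiler.

```ruby
require "pathname"
require "fileutils"

# Map <root>/a/b.cht to <outdir>/a/b.zig, preserving the directory layout.
def mirrored_output_path(root, source, outdir)
  relative = Pathname.new(source).relative_path_from(Pathname.new(root))
  Pathname.new(outdir).join(relative).sub_ext(".zig")
end

# Walk every .cht file under root in sorted order and write whatever the
# given block returns for its contents. Nothing is compiled.
def emit_tree(root, outdir)
  Dir.glob(File.join(root, "**", "*.cht")).sort.map do |source|
    target = mirrored_output_path(root, source, outdir)
    FileUtils.mkdir_p(target.dirname)
    File.write(target, yield(File.read(source)))
    target.to_s
  end
end

puts mirrored_output_path("src", "src/a/b.cht", "out")
```

Sorting the glob results keeps the set of written files in a stable order, so two runs over the same tree can be compared file by file.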

- Gemfile: sorbet, sorbet-runtime, tapioca added to dev group.
- `sorbet/config` + `sorbet/rbi/clear-stubs.rbi` (minimal stubs to
  unblock `srb tc`).
- `tools/gen_attr_rbi.rb` (Prism-based RBI generator for AST attr_*
  declarations) → `sorbet/rbi/clear-attr-accessors.rbi` (91 classes,
  404 attr_* shims).
- Test pilot: importer.rb at `# typed: true` clean.

- `tools/respond_to_inventory.rb` — Prism walks src/ast/ast.rb to map
  AST classes → attrs (Struct members + include Locatable +
  attr_accessor + custom getters), then walks src/**/*.rb for every
  respond_to?(:X) site. Outputs CSVs and a summary md.
- `tools/respond_to_narrowing.rb` — per-site receiver-type classifier.
  For each respond_to? site, finds prior is_a? guards, case/when
  arms, .X assignments from Locatable attrs, and walker-block params.
  Classifies as ast_locatable / typed_specific /
  from_locatable_attr / walker_yielded / unknown. Drove Phase 1f's
  full_type sweep (40 sites, 1 false positive).
- `docs/agents/respond_to_inventory.md` — methodology + phased plan.

- TODO.md updated: P0 self-host prep #1 String/Symbol normalization
  folded into #2 (the schema work subsumes it).
- Two pure-deletion commits removed dead code unrelated to the main
  themes (4 dead methods debride flagged, dead defensive guards in
  escape_analysis).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo force-pushed the transpiler-cleanup branch from 69570e2 to 3b94242 Compare May 8, 2026 11:29
@cuzzo cuzzo closed this May 8, 2026