fix(mir): MUTABLE @list of struct grew through dangling ptr (UAF) #21

Merged: cuzzo merged 2 commits into master from hotfix-list-append-buffer-uaf on May 8, 2026

cuzzo (Owner) commented May 8, 2026

A helper function declaring MUTABLE xs: T[]@list and calling .append(StructLit{...}) would corrupt the caller's list buffer or segfault on a subsequent .append. The pattern:

    FN record!(MUTABLE items: TraceItem[]@list, step: Int64) RETURNS !Void ->
        items.append(TraceItem{ step: step, ... });
        RETURN;
    END

    FN main!() RETURNS !Void ->
        MUTABLE items: TraceItem[]@list = List[];
        record!(items, 1_i64);              # buffer allocated in record!'s frame
        junk = burnFrame!(...);             # bump arena advances past freed bytes
        record!(items, 2_i64);              # writes through dangling items.ptr -> SEGFAULT
        ...
    END

The bug surfaced after the recent MUTABLE @list pointer-pass commits (1872ae4 / fcedfbe) made this calling pattern compile; the :receiver_storage allocator-routing in std_lib.rb / mir_lowering.rb predates pointer-passing and assumes a single-frame world.
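
To make the failure mode concrete, here is a toy Ruby model of a bump arena with mark/restore. It is a sketch under stated assumptions, not the real runtime: `BumpArena`, `ListHeader`, `record`, and `burn_frame` are hypothetical names, and the model shows a silent overwrite where the real program segfaults.

```ruby
# Toy bump arena: a flat byte array plus a bump pointer. Not the real runtime.
class BumpArena
  attr_reader :bytes, :top

  def initialize(size)
    @bytes = Array.new(size, 0)
    @top = 0
  end

  # mark returns the current top so a frame can rewind to it later.
  def mark
    @top
  end

  # restore "frees" everything allocated since the mark by rewinding the top.
  def restore(mark)
    @top = mark
  end

  # alloc hands out the next n bytes and returns their starting offset.
  def alloc(n)
    raise "arena exhausted" if @top + n > @bytes.size
    offset = @top
    @top += n
    offset
  end
end

# Hypothetical list header: the caller owns it, the callee fills in ptr.
ListHeader = Struct.new(:ptr, :len)

# Models record!: grows the caller's list using the callee's frame allocator.
def record(arena, list, value)
  frame = arena.mark
  list.ptr = arena.alloc(1) if list.ptr.nil? # first append allocates the buffer
  arena.bytes[list.ptr] = value
  list.len = 1
  arena.restore(frame) # callee returns; the buffer bytes are free again
end

# Models burnFrame!: an unrelated call that reuses the freed bytes.
def burn_frame(arena)
  frame = arena.mark
  scratch = arena.alloc(1)
  arena.bytes[scratch] = 0xFF
  arena.restore(frame)
end

arena = BumpArena.new(16)
items = ListHeader.new(nil, 0)
record(arena, items, 42)
before = arena.bytes[items.ptr] # bytes still intact right after the call
burn_frame(arena)
after = arena.bytes[items.ptr]  # same offset, now holding burn_frame's scratch
puts "before=#{before} after=#{after}"
```

The caller's `items.ptr` never changes, yet the bytes behind it do. That is why the existing tests kept passing until something churned the arena between the two calls.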

Two coordinated fixes:

  • escape_analysis.rb (Condition 9) -- when a @list is passed to a MUTABLE @list parameter, escape-promote the caller's binding to :heap at decl. Mirrors the existing GIVE-to-TAKES promotion (Condition 8). Without this, the caller's cleanup uses frameAlloc while the callee's .append heap-allocates, leaking the buffer on deinit.

  • mir_lowering.rb (resolve_alloc_sym) -- when a method call's receiver is a @list parameter that arrived via pointer-pass (in @current_fn_collection_params), force :heap for the :receiver_storage resolution. The receiver's static storage tag on the param doesn't reflect the heap promotion that happened on the caller's binding (storage doesn't propagate across the function boundary), so the callee-side resolver has to make the call locally based on "did I receive this by pointer?".

Both fixes are necessary: escape-promote alone leaves the helper's .append using the helper's frameAlloc() (UAF persists); the lowering-side guard alone heap-allocates the buffer but the caller's frameAlloc cleanup leaks it.
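
The four combinations of caller-side storage and callee-side growth allocator can be written as a small lookup. This is only a restatement of the paragraph above in Ruby; the `outcome` function and its symbols are illustrative and do not exist in the compiler.

```ruby
# Outcome for each pairing of the caller's binding storage and the allocator
# the callee's .append uses, as described in the PR text. Illustrative only.
def outcome(caller_storage:, callee_alloc:)
  case [caller_storage, callee_alloc]
  when [:frame, :frame] then :use_after_free # neither fix applied
  when [:heap, :frame]  then :use_after_free # Condition 9 alone
  when [:frame, :heap]  then :leak           # resolve_alloc_sym guard alone
  when [:heap, :heap]   then :ok             # both fixes applied
  else raise ArgumentError, "unknown storage pair"
  end
end

puts outcome(caller_storage: :heap, callee_alloc: :heap)
```

Only the pairing where both sides agree on `:heap` is safe, which is why neither fix can ship without the other.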

Test 380 covers the deterministic repro: cross-frame .append of a struct, with burnFrame! calls between record!s to force the segfault on origin/master.

github-actions (Bot) commented May 8, 2026

🐰 Bencher Report

Branch: hotfix-list-append-buffer-uaf
Testbed: ubuntu-latest

⚠️ WARNING: No Threshold found! Without a Threshold, no Alerts will ever be generated.
To only post results if a Threshold exists, set the --ci-only-thresholds flag.

All benchmark results (no Threshold is set for any measure):

| Benchmark | leak-build-ms (units x 1e3) | leak-count (units) | leak-run-ms (units) |
|---|---|---|---|
| benchmarks/concurrent/04_fanout_fanin/bench | 4.52 | 0.00 | 3,633.61 |
| benchmarks/concurrent/09_kvstore/bench | 4.50 | 0.00 | 60,003.32 |
| benchmarks/concurrent/14_nested_lock/bench | 4.42 | 0.00 | 365.74 |
| benchmarks/inter-clear/02_concurrent_fsm_vs_stackful/bench_fsm | 4.40 | 0.00 | 135.01 |
| benchmarks/inter-clear/02_concurrent_fsm_vs_stackful/bench_stackful | 4.40 | 0.00 | 367.36 |
| benchmarks/sequential/11_pipeline_overhead/bench | 4.39 | 0.00 | 11,576.28 |

codecov-commenter commented May 8, 2026

Codecov Report

❌ Patch coverage is 78.18182% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.87%. Comparing base (f9c81f0) to head (d5db63d).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/mir/escape_analysis.rb | 50.00% | 10 Missing ⚠️ |
| src/backends/pipeline_host.rb | 50.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #21      +/-   ##
==========================================
- Coverage   89.87%   89.87%   -0.01%     
==========================================
  Files         183      183              
  Lines       47636    47679      +43     
  Branches    11598    11620      +22     
==========================================
+ Hits        42812    42850      +38     
- Misses       4824     4829       +5     
| Flag | Coverage Δ |
|---|---|
| ruby | 86.06% <78.18%> (-0.02%) ⬇️ |
| zig | 95.59% <ø> (+0.02%) ⬆️ |


cuzzo (Owner, Author) commented May 8, 2026

Original hotfix (commit 8f324000)

A helper function declaring MUTABLE xs: T[]@list and calling .append(StructLit{...}) would corrupt the caller's list buffer or segfault on a subsequent .append. Pattern:

FN record!(MUTABLE items: TraceItem[]@list, step: Int64) RETURNS !Void ->
    items.append(TraceItem{ step: step, ... });
    RETURN;
END

FN main!() RETURNS !Void ->
    MUTABLE items: TraceItem[]@list = List[];
    record!(items, 1_i64);              # buffer allocated in record!'s frame
    junk = burnFrame!(...);             # bump arena advances past freed bytes
    record!(items, 2_i64);              # writes through dangling items.ptr → SEGFAULT
END

The bug surfaced after the recent MUTABLE @list pointer-pass commits (1872ae48 / fcedfbe6) made this calling pattern compile; the :receiver_storage allocator-routing in std_lib.rb / mir_lowering.rb predates pointer-passing and assumes a single-frame world.

Two coordinated fixes:

  • escape_analysis.rb (Condition 9) — when a @list is passed to a MUTABLE @list parameter, escape-promote the caller's binding to :heap at decl.
  • mir_lowering.rb (resolve_alloc_sym) — when a method call's receiver is a @list parameter that arrived via pointer-pass, force :heap for :receiver_storage.

Test 380 covers the deterministic repro.


Why this commit alone is insufficient (and the call-out)

This was a genuine memory-safety regression in a language that promises memory safety, and the chain of how it landed needs to be acknowledged honestly.

The two introducing commits — 1872ae48 (pointer-pass MUTABLE @list) and fcedfbe6 (forwarded @list params skip the & re-wrap) — were rushed out to close short-term functionality gaps:

  1. Users reported that MUTABLE @list parameters couldn't be mutated by callees (.append / .pop didn't propagate). True bug.
  2. The fix pointer-passed the param. This enabled the unsafe shape that test 380 exercises.
  3. Neither commit added a corresponding rule in MIRChecker to enforce the new lifetime constraint that pointer-passing imposes (the caller's binding must outlive the callee's frame mark, which means it must be heap-class).
  4. The forwarding follow-on (fcedfbe6) extended the same shape to forwarded args without revisiting the safety question.
  5. Existing tests passed because frame arenas reuse memory and the UAF only manifests with specific allocator-churn patterns (the burnFrame! step in test 380). Smoke testing didn't trip it.

The original hotfix in this PR (8f324000) plugged the user-visible crash. But it did so via lowering-side decisions only: escape_analysis.rb Condition 9 + resolve_alloc_sym's pointer-pass guard. No corresponding MIRChecker invariant was added. That means the safety story rested on two specific lines in mir_lowering.rb continuing to route :heap correctly forever. A future MIR refactor, a different pointer-pass path (closures, channel sends, async captures), or even a routine cleanup of allocator-routing logic could have re-introduced the same UAF without the checker noticing.

Both the introducing commits and the original hotfix closed functionality gaps without working through what they meant for the MIR system's ability to guarantee safety. The result was a quietly weakened safety promise.

What the new commit (62223f11) adds

INV-CROSS-FRAME-PARAM-ALLOC in src/mir/mir_checker.rb — the structural fence that should have accompanied the pointer-pass change from the start:

An InlineZig op whose target_var is a pointer-passed parameter must NOT use the :frame allocator. Pointer-passed params carry a lifetime that extends past the current function's mark/restore; a frame allocation here would die when the function returns, leaving the caller with a dangling buffer pointer.

The checker independently verifies the lowering's allocator-routing decision via a new MIR::Param.pointer_passed field tagged at lowering time. Defense in depth: if resolve_alloc_sym or escape_analysis.rb Condition 9 ever regresses, the checker catches the resulting bad MIR before codegen.
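
The shape of the invariant can be sketched in a few lines of Ruby. This is a minimal sketch, not the code in src/mir/mir_checker.rb: the `Param`, `InlineZig`, and `Fn` structs are hypothetical stand-ins, the `allocs` hash layout is an assumption, and the violation message is abbreviated. Only the `pointer_passed` and `target_var` field names come from the PR text.

```ruby
# Hypothetical stand-ins for MIR nodes; the real classes carry more fields.
Param     = Struct.new(:name, :pointer_passed)
InlineZig = Struct.new(:target_var, :allocs) # allocs, e.g. { alloc: :frame }
Fn        = Struct.new(:name, :params, :ops)

# Returns one violation string for each :frame allocation whose target is a
# pointer-passed parameter of the function. Everything else passes.
def check_cross_frame_param_alloc(fn)
  pointer_passed = fn.params.select(&:pointer_passed).map(&:name)
  fn.ops.flat_map do |op|
    next [] unless pointer_passed.include?(op.target_var)
    op.allocs.select { |_key, storage| storage == :frame }.map do |key, _storage|
      "[CROSS_FRAME_PARAM_ALLOC] #{fn.name}::#{op.target_var} -- operation #{key} is :frame"
    end
  end
end

# Frame allocation on a pointer-passed param: rejected.
bad = Fn.new("record", [Param.new("items", true)],
             [InlineZig.new("items", { alloc: :frame })])
# Heap allocation on the same param: accepted.
good = Fn.new("record", [Param.new("items", true)],
              [InlineZig.new("items", { alloc: :heap })])
# Frame allocation on a local binding (no matching param): accepted.
local = Fn.new("main", [], [InlineZig.new("items", { alloc: :frame })])

puts check_cross_frame_param_alloc(bad)
```

The check reads only the flag and the allocator tags, so it does not depend on how the lowering arrived at its decision. That independence is what makes it a second line of defense rather than a restatement of the first.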

Validation

I temporarily disabled the lowering-side fix (commented out the two @current_fn_collection_params&.include?(...) guards in resolve_alloc_sym) and re-built the test 380 case. The new MIR Checker invariant rejected the build with the exact violation we want:

[CROSS_FRAME_PARAM_ALLOC] record::items -- operation alloc is :frame
  but 'items' is a pointer-passed parameter (lifetime extends past
  this function's frame mark; buffer would dangle on return). Use :heap.

Tests 374, 378, and 380 all failed to transpile with the lowering regressed, which is the correct outcome: the lowering produced bad MIR and the checker rejected it. With the lowering's fix restored, all three pass, the full transpile-tests corpus (521 tests) passes, and the unit suite (4136 specs / 0 failures) passes.

Adjacent fix surfaced during validation

Lowering's mutable_scalar_params was incorrectly classifying MUTABLE collection params as scalars (because transpile_type returns "anytype" for collections, which doesn't match the [] / * prefix check). The mis-classification renamed items to _m_items in MIR::Param.name while MIR::InlineZig.target_var kept the original items. Because the names didn't match, the new invariant couldn't find the param. The fix excludes collection? and needs_pointer_passing? types from the scalar classifier; the _m_ rename is meant for true scalars only.
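
A reduced Ruby sketch of that mis-classification, assuming the prefix check looks roughly like a `start_with?` test on the transpiled type string. `ParamType`, `scalar_before?`, `scalar_after?`, and `param_name` are hypothetical names chosen for illustration; the real classifier lives in the lowering and may differ in detail.

```ruby
# zig_type is the transpiled type string; the two booleans stand in for the
# collection? and needs_pointer_passing? predicates named in the PR text.
ParamType = Struct.new(:zig_type, :collection, :needs_pointer_passing)

# Before: anything not starting with "[]" or "*" counts as a scalar, so a
# collection transpiled to "anytype" is treated as a scalar.
def scalar_before?(type)
  !type.zig_type.start_with?("[]", "*")
end

# After: collections and pointer-passed types are excluded up front.
def scalar_after?(type)
  return false if type.collection || type.needs_pointer_passing
  !type.zig_type.start_with?("[]", "*")
end

# The _m_ prefix is applied only to params classified as mutable scalars.
def param_name(name, type, classifier)
  send(classifier, type) ? "_m_#{name}" : name
end

list = ParamType.new("anytype", true, true)
int  = ParamType.new("i64", false, false)

puts param_name("items", list, :scalar_before?) # renamed, no longer matches target_var
puts param_name("items", list, :scalar_after?)  # name preserved
```

With the old classifier the param name and the op's `target_var` diverge, so a name-based lookup in the checker finds nothing and the invariant silently passes.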

What this PR does NOT do

This is Phase 2 of the structural-fix plan (the MIR Checker invariant — the highest-leverage piece). Outstanding follow-up work:

  • Phase 1 — annotator-layer rejection. Make passing a :frame-stored binding to a MUTABLE @list parameter a typecheck error with three documented resolutions (@heap / COPY / non-MUTABLE). Forces the user to make the lifetime-extension explicit instead of accepting a silent escape promotion.
  • Phase 3 — audit other escape paths. BG-pointer-captured collections, channel sends, future closures — every shape where a binding can be mutated through a pointer across a frame boundary. Each needs its own pinned test.
  • Phase 4 — ASAN job in CI. The bug went undetected because the existing test suite didn't have the allocator-churn pattern that exposes the UAF. AddressSanitizer would have caught test 380 on the first run; we should be running it continuously.
  • Phase 5 — process change. CLAUDE.md invariants entry for INV-CROSS-FRAME-PARAM-ALLOC, plus a checklist for MIR-semantic PRs requiring "is there a checker invariant covering this shape?" as a merge gate.

These should land as separate PRs.

Test coverage

6 new specs in spec/mir_checker_spec.rb for the new invariant: positive case, negative case, local bindings (no false positives), every allocator key (alloc / key_alloc / val_alloc), empty-params functions, slice (non-pointer-passed) params.

🤖 Generated with Claude Code

cuzzo and others added 2 commits May 8, 2026 14:13
A helper function declaring `MUTABLE xs: T[]@list` and calling
`.append(StructLit{...})` would corrupt the caller's list buffer or
segfault on a subsequent `.append`. The pattern:

  FN record!(MUTABLE items: TraceItem[]@list, step: Int64) RETURNS !Void ->
      items.append(TraceItem{ step: step, ... });
      RETURN;
  END

  FN main!() RETURNS !Void ->
      MUTABLE items: TraceItem[]@list = List[];
      record!(items, 1_i64);              # buffer allocated in record!'s frame
      junk = burnFrame!(...);             # bump arena advances past freed bytes
      record!(items, 2_i64);              # writes through dangling items.ptr -> SEGFAULT
      ...
  END

The bug surfaced after the recent `MUTABLE @list` pointer-pass commits
(1872ae4 / fcedfbe) made this calling pattern compile; the
`:receiver_storage` allocator-routing in std_lib.rb / mir_lowering.rb
predates pointer-passing and assumes a single-frame world.

Two coordinated fixes:

* **escape_analysis.rb (Condition 9)** -- when a `@list` is passed
  to a `MUTABLE @list` parameter, escape-promote the caller's
  binding to `:heap` at decl. Mirrors the existing GIVE-to-TAKES
  promotion (Condition 8). Without this, the caller's cleanup uses
  `frameAlloc` while the callee's `.append` heap-allocates, leaking
  the buffer on deinit.

* **mir_lowering.rb (resolve_alloc_sym)** -- when a method call's
  receiver is a `@list` parameter that arrived via pointer-pass
  (in `@current_fn_collection_params`), force `:heap` for the
  `:receiver_storage` resolution. The receiver's static storage tag
  on the *param* doesn't reflect the heap promotion that happened
  on the caller's binding (storage doesn't propagate across the
  function boundary), so the callee-side resolver has to make the
  call locally based on "did I receive this by pointer?".

Both edges are necessary: escape-promote alone leaves the helper's
`.append` using the helper's `frameAlloc()` (UAF persists); the
lowering-side guard alone heap-allocates the buffer but the caller's
`frameAlloc` cleanup leaks it.

Test 380 covers the deterministic repro: cross-frame `.append` of a
struct, with `burnFrame!` calls between record!s to force the
segfault on origin/master.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…F class

The previous hotfix (parent of this branch) plugged a UAF discovered
in production: a `MUTABLE T[]@list` parameter let a callee `.append`
into a buffer whose growth allocator was the *callee's* frame, which
got rewound on return — leaving the caller with a dangling pointer.
That hotfix added Condition 9 to escape_analysis.rb (auto-promote
the caller's binding to :heap) and a guard in mir_lowering.rb's
`resolve_alloc_sym` (force :heap when the receiver is a pointer-
passed param).

That hotfix unblocked the immediate crash — but it did so by adding
two coordinated lowering-side decisions, with no corresponding
MIR Checker invariant. The two introducing commits (1872ae4 /
fcedfbe) likewise landed the pointer-pass functionality without a
matching checker rule. Both commits were rushed out to close
short-term gaps (callees couldn't grow caller @lists, then forwarded
@list params were re-`&`-wrapped); neither came with a structural
fence in the safety system. The result: a memory-safety promise was
silently weakened, and the only thing standing between users and the
UAF was that two specific lines in mir_lowering.rb routed `:heap`
correctly. A future MIR refactor or a different pointer-pass path
(closures, channel sends, async captures) would have re-introduced
the same UAF without the checker noticing.

This branch adds the structural fence the previous fixes should have
included from the start.

**INV-CROSS-FRAME-PARAM-ALLOC** in `src/mir/mir_checker.rb`:

  An InlineZig op whose target_var is a pointer-passed parameter
  must NOT use the `:frame` allocator. Pointer-passed params carry
  a lifetime that extends past the current function's mark/restore;
  a frame allocation here would die when the function returns,
  leaving the caller with a dangling buffer pointer.

The checker independently re-derives "is this param pointer-passed?"
from a new `MIR::Param.pointer_passed` field (lowering tags it).
We can't infer it from the Zig type string alone — collection params
lower to `anytype` for polymorphism — so the lowering tags it
explicitly and the checker reads the flag. Defense in depth: if
`resolve_alloc_sym` or escape_analysis Condition 9 ever regresses,
the checker catches the resulting bad MIR before codegen.

Lowering's `mutable_scalar_params` was incorrectly classifying
MUTABLE collection params as scalars (because `transpile_type`
returns `"anytype"` for collections, which doesn't match the [] / *
prefix check). The mis-classification renamed `items` to `_m_items`
in MIR::Param.name, while `MIR::InlineZig.target_var` kept the
original `items`. The names didn't match, and the new invariant
couldn't find the param. Excluded `collection?` and
`needs_pointer_passing?` types from the scalar classifier; the
rename is meant for true scalars only.

Defense-in-depth verified end-to-end:

1. Disabled the lowering-side fix (commented out the two
   `@current_fn_collection_params&.include?(...)` lines in
   `resolve_alloc_sym`). Re-built test 380. The new MIR Checker
   invariant rejected the build with:

     [CROSS_FRAME_PARAM_ALLOC] record::items -- operation alloc is
     :frame but 'items' is a pointer-passed parameter (lifetime
     extends past this function's frame mark; buffer would dangle
     on return). Use :heap.

   Tests 374, 378, and 380 all failed to transpile (correct
   behavior — the lowering's choice is bad MIR).

2. With the lowering's fix restored, all three tests pass, the full
   transpile-tests corpus (521 tests) passes, and the unit suite
   (4136 specs) passes.

This commit is Phase 2 of the plan I drafted (the highest-leverage
piece, the structural fence). Phase 1 (annotator-layer rejection
with explicit user opt-in via `@heap` / `COPY` / non-MUTABLE) and
Phase 3 (audit of other escape paths — BG capture by pointer,
channel sends, etc.) remain. Phase 4 (ASAN job in CI) is a separate
PR.

6 new specs in `spec/mir_checker_spec.rb` cover the invariant: the
positive case (frame on pointer-passed param fires), the negative
case (heap on pointer-passed param passes), local bindings (no
false positives), every allocator key (key_alloc / val_alloc /
shard_alloc all checked), empty-params functions, and slice
(non-pointer-passed) params.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo force-pushed the hotfix-list-append-buffer-uaf branch from 62223f1 to d5db63d Compare May 8, 2026 14:16
@cuzzo cuzzo merged commit 67b8da9 into master May 8, 2026
32 checks passed

2 participants