
Vopr loom audit docs #26

Open

cuzzo wants to merge 33 commits into vopr-loom-audit from vopr-loom-audit-docs

Conversation

@cuzzo (Owner) commented May 8, 2026

No description provided.

cuzzo and others added 23 commits May 7, 2026 20:00
Adds `zig build coverage-loom -Dcoverage-loom` -- a separate kcov build
scoped to *-loom-test.zig and the parking-lot-loom executable, writing
to zig-out/coverage-loom/merged/kcov-merged/cobertura.xml. Distinct
from the existing -Dcoverage path because the report is meant to
answer "what atomic sites does Loom miss?" -- mixing in unit/TSan/VOPR
coverage would defeat that.

Also adds src/tools/loom_atomic_coverage.rb, a standalone scanner that
cross-references atomic-op call sites in zig/runtime + zig/lib against
the cobertura XML and reports gaps in three categories:
  - 0-hit: kcov instrumented the line; never executed
  - file unloaded: file not in any *-loom-test.zig
  - line missing: file loaded but the function isn't called

Includes a sandwich-rule heuristic for kcov's inline-elision artifact
(Zig's `pub inline fn` atomic wrappers lose DWARF line markers when
LLVM inlines them, so kcov reports 0 hits on lines whose surrounding
block actually executed). Recognizes // LOOM-EXCLUDE-BEGIN / -END
source markers for code regions Loom can't reach by design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ParkingMutex.lock, ParkingRwLock.lockWrite, and ParkingRwLock.lockRead
each branch on `if (sched_opt == null)` for a no-fiber-scheduler
acquire path. Loom always runs with a scheduler, so getScheduler()
never returns null in loom scenarios -- the 10 atomic ops in those
three blocks are by design unreachable under the loom harness, not
real coverage gaps. Coverage of the thread-only paths comes from
parking-lot-hammer-test.zig and parking-rwlock-fiber-hammer-test.zig
under TSan.

The LOOM-EXCLUDE markers are read by src/tools/loom_atomic_coverage.rb
to drop these sites from the gap report. They have zero runtime cost
and zero behavioral divergence between production and tests -- a
runtime check would corrupt loom's own metrics (panic-path lock
acquisitions would get counted as sim atomic ops).
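
For illustration, a sketch of how the markers bracket such a block
(simplified shape; `lockThreadOnly` is a hypothetical stand-in for the
real thread-only body):

```zig
pub fn lock(self: *ParkingMutex) void {
    const sched_opt = getScheduler();
    if (sched_opt == null) {
        // LOOM-EXCLUDE-BEGIN: thread-only acquire path. Loom always runs
        // with a scheduler, so these atomics are unreachable under the
        // harness; the TSan hammer tests cover them instead.
        self.lockThreadOnly(); // hypothetical stand-in for the real body
        return;
        // LOOM-EXCLUDE-END
    }
    self.lockSlow(sched_opt.?);
}
```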

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds loom-shimmed scenarios for atomic-op surfaces that the existing
suite never exercised:

  AtomicPtr.updateFlow (lib/atomic_ptr.zig):
    - .cont_commit path covers the load + cmpxchgWeak retry loop
    - .skip_no_commit path covers the early-return branch
    Brings atomic_ptr.zig from 6/8 to 8/8 covered atomic ops.

  Versioned(T) (runtime/versioned.zig):
    - deinitSync no-readers teardown
    - updateFlow commit + skip variants
    - updateMulti 2-cell happy path (tag-acquire + commit-store)
    - updateMulti rollback on user-txn error
    Brings versioned.zig from 5/14 to 13/14. Line 565 (contention
    rollback under concurrent updateMulti) remains -- needs a
    multi-fiber harness scenario.

  ParkingMutex.tryLock + presetLocked (lib/parking-lot.zig):
    Direct single-thread test under the loom comptime alias. The
    existing harness scenarios all route through lock() into
    lockSlow's parking path; tryLock and the presetLocked test
    rendezvous helper had no caller. Lines 640, 644, 651 closed.

Overall loom atomic coverage: 38.4% -> 40.3% (+13 sites closed,
+10 reclassified as cluster-B confounders via LOOM-EXCLUDE).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e pre-check

Closes the remaining 7 atomic sites in lib/parking-lot.zig with four
new loom scenarios:

  Mutex / RwLock-write / RwLock-read post-park epilogue (lines 969,
  970, 1267, 1269, 1270, 1675):
    Holder fiber acquires the lock, yields to let parker park, then
    pre-sets parker.lock_timed_out=true via direct atomic store
    (.release) before unlocking. The wake-vs-timeout race lets the
    parker exit park-loop with timed_out=true and run the real
    epilogue: line 969 resets the flag, line 970 (+1269/1270/1675)
    re-checks state for owner-grant. Existing testTimeoutAtomicCoverage
    only exercised the parker/scanner ordering handshake without ever
    routing through lockSlow's epilogue.

  FSM rwlock 2-writer contention (line 1327):
    Two FSM writers contesting the same rwlock. The second one to
    enter tryWriteLockForFsm sees WRITE_LOCKED_BIT set (held by the
    first) and runs the re-entrancy / cycle pre-check that loads
    fsm_write_owner. Existing FSM rwlock tests (1W+1R, 1W+2R) all
    enter tryWriteLockForFsm with state == 0, so the if at line 1326
    never triggers.

Brings lib/parking-lot.zig to 198/198 (100%) covered atomic ops in
the loom report. Overall loom atomic coverage 40.3% -> 41.1%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AtomicInbox was the original cross-scheduler messaging primitive --
a Treiber-stack MPSC list of InboxNode-embedded structs. spsc.zig's
SpscRing replaced it for production messaging (UAF bugs from
linked-list node reuse) and AtomicInbox has no remaining production
callers.

Verified dead via comptime probe:
  @hasField(Scheduler, "inbox") == false

Three race tests that called the long-removed `target.inbox.push(...)`
have been silently broken since the SPSC migration: under
`b.addTest`, files with zero `test {}` blocks pass via Zig's lazy
analysis (no test => function bodies skipped); under `hammer_exe_files`,
build-config errors abort before reaching them. They have not actually
compiled in some time and are removed entirely.
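
A self-contained illustration of the trap (hypothetical file; `Scheduler`
stubbed for the example):

```zig
const Scheduler = struct {}; // stand-in: the real Scheduler lost its inbox field

// This file "passes" under b.addTest: with zero `test {}` blocks, Zig's
// lazy analysis never semantically checks unreferenced function bodies,
// so the access to the long-removed field is never even compiled.
fn pushToInbox(target: *Scheduler) void {
    target.inbox.push(undefined);
}
```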

Removes:
  - lib/queues.zig: AtomicInbox / InboxNode / InboxType types
  - lib/queues.zig: Task.inbox_link field (vestigial, never read/written)
  - runtime/scheduler.zig: type aliases + inbox_link field on
    SpawnRequest and RemoteCall
  - runtime/queues-test.zig: AtomicInbox tests + thiefWorker dummy-inbox arg
  - runtime/scheduler-race-test.zig (broken)
  - runtime/inbox-race-test.zig (broken)
  - runtime/inbox-race-smoke-test.zig (broken)
  - zig/build.zig: 3 test_files entries + 2 hammer_exe_files entries

Validation:
  zig build test            -> 162 passed, 1 skipped, 0 failed
  zig build test-tsan       -> All 60 tests passed
  zig build test-loom-vopr  -> All 36 tests passed
  ./clear test transpile-tests/ -> 514 passed, 0 leaks

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds zig/versioned-multi-loom-test.zig -- a standalone executable
that drives two fibers concurrently calling Versioned.updateMulti on
overlapping cell-sets, deterministically reaching the contention-
rollback branch at versioned.zig:574 (the per-cell tag-release store).

The line had been line-missing in the loom kcov report because no
existing scenario could trigger it: contention requires one fiber to
acquire SOME (>0) tags, then exhaust the inner-retry budget on a
later cell while another fiber holds it. Single-fiber tests don't
contend; the parking-lot-loom and vopr-loom harnesses are tightly
coupled to lock primitives + RunQueue and don't drive Runtime/EBR
fiber semantics.

Two design moves make the path reachable in a small enumerable
schedule space:

1. Test seam in versioned.zig: MAX_INNER_RETRIES_MULTI is now
   overridable from `@import("root")` (mirroring the existing
   MAX_UPDATE_RETRIES seam). Production keeps the 1024 default;
   the harness sets it to 4 so the rollback fires after just 4
   spin observations.

2. Cell layout: a contiguous `g_cells: [3]Versioned(i64)` array
   guarantees address ordering g_cells[0] < g_cells[1] < g_cells[2]
   regardless of Zig BSS layout. Fiber X uses {&g_cells[0..1]} and
   Fiber Y uses {&g_cells[1..2]} -- their first acquisitions differ
   (X tags g_cells[0], Y tags g_cells[1]) but they overlap on
   g_cells[1]. Whichever fiber tries the second cell after the
   other has tagged it spins out the inner-retry budget with
   acquired > 0, exercising the rollback store.

Harness shape mirrors the existing PinDepthLoomHarness in
versioned-loom-test.zig: 2 fibers, schedule-driven yield picking,
round-robin tail after the explicit schedule exhausts (prevents
starvation that would otherwise hit error.UpdateRetriesExhausted).

Wired into build.zig as a new exe with its own kcov dir under
`-Dcoverage-loom`. Covers all 1024 binary schedules at depth 10
(69k SimAtomic ops total) with 0 invariant violations.
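
The enumeration loop itself is small; a sketch (driver name hypothetical):

```zig
const depth = 10;

fn runAllSchedules() !void {
    // Bit i of `schedule` picks which of the two fibers runs at yield
    // point i; after the explicit schedule exhausts, the harness falls
    // back to round-robin so neither fiber starves into
    // error.UpdateRetriesExhausted.
    var schedule: u16 = 0;
    while (schedule < (1 << depth)) : (schedule += 1) {
        try runMultiUpdateScenario(schedule); // hypothetical per-schedule driver
    }
}
```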

Brings runtime/versioned.zig to 14/14 (100%) in the loom atomic
coverage report. Closes the last Tier 1 gap. Overall loom coverage:
41.3% -> 41.4%.

The harness shape (contiguous cells + lowered retry budget +
round-robin tail) is reusable for the broader Tier 4 library gap
(observable, streams, data-structures, ownership) where similar
multi-fiber scenarios are needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds zig/ownership-loom-test.zig -- four scenarios driving two-fiber
interleavings on Arc<T> / Weak<T> reference counts:

  1. clone-vs-deinit: fetchAdd vs fetchSub on strong_count, including
     the last-drop branch.
  2. weak-upgrade-vs-strong-drop: cmpxchgWeak retry race when a
     concurrent drop takes strong_count to zero mid-upgrade.
  3. concurrent-downgrade: two simultaneous fetchAdd on weak_count.
  4. inspect-accessors: refCount/weakCount/isAlive/strongCount + the
     Weak.fromArc + Weak.clone fetchAdd sites.

Also adds the comptime SimAtomic alias to lib/ownership.zig (mirroring
the pattern in lib/queues.zig and lib/ebr.zig) so atomic ops on the
control block become deterministic yield points under Loom. Without
the alias, lib/ownership.zig hardcoded std.atomic.Value, leaving
Arc/Weak ops uninstrumented even when callers ran under SimAtomic
shimming.

Brings lib/ownership.zig to 14/14 (100%) covered atomic ops in the
loom report. Validates the harness pattern (contiguous shared state +
two fibers + schedule-driven yield + round-robin tail) generalizes
beyond the EBR-aware versioned case to a pure shared-counter use
case -- exactly what the broader Tier 4 lib/ surface needs.

Overall loom coverage: 41.4% -> 43.0%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the alias pattern already in lib/queues.zig, lib/ebr.zig, and
lib/ownership.zig: declare a comptime `Atomic` selector at the top
of the file that resolves to SimAtomic when the root module exports
it (Loom mode), or to std.atomic.Value otherwise. Replaces all
38 std.atomic.Value usages with Atomic(...).
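
A minimal sketch of the alias pattern (decl names per this PR; the
surrounding streams.zig code is omitted):

```zig
const std = @import("std");
const root = @import("root");

// Loom mode: the test root exports SimAtomic, so every atomic op becomes
// a deterministic yield point. Production: no SimAtomic on root, so the
// alias resolves to std.atomic.Value and codegen is identical to before.
pub const Atomic = if (@hasDecl(root, "SimAtomic"))
    root.SimAtomic
else
    std.atomic.Value;

// Usage is unchanged at every call site:
var chunk_len: Atomic(usize) = Atomic(usize).init(0);
```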

Production: zero behavioral change -- root has no SimAtomic, alias
picks std.atomic.Value, codegen identical.

Loom: every Chunk.len.store/load, SubscriberRecord.seq.store/load,
and concurrentBounded* err_code/next_idx atomic op becomes a yield
point under the harness. Required so an upcoming streams-loom-test
can deterministically interleave producer/consumer flows. Without
the alias, lib/streams.zig's hardcoded std.atomic.Value would bypass
the harness even when callers ran under SimAtomic shimming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The SimAtomic alias commit accidentally shipped a self-referencing
fallback (`else Atomic;` on the comptime branch), which the compiler
flags as "value of declaration depends on itself" the moment any
caller touches `lib/streams.zig`. The non-loom branch must resolve
to `std.atomic.Value` -- the type the file used before the alias
was introduced.
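
Side by side (sketch; decl names per this PR):

```zig
const std = @import("std");
const root = @import("root");

// Broken: the else branch names the decl being defined, so semantic
// analysis reports "value of declaration depends on itself":
//   const Atomic = if (@hasDecl(root, "SimAtomic")) root.SimAtomic else Atomic;

// Fixed: the non-loom branch resolves to the pre-alias type.
const Atomic = if (@hasDecl(root, "SimAtomic")) root.SimAtomic else std.atomic.Value;
```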

Not caught in 6cc9a50 because no immediate caller had been added;
data-structures-test.zig surfaces it in the next test build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the >11.8x slowdown when swapping lib/streams.zig's
Inner.mutex from compat.Mutex (pthread) to ParkingMutex on the
08_pubsub benchmark (0.169s -> >2s TIMEOUT). Documents the
diagnostic chain: each ParkingMutex slow-path acquire chains 15+
atomic ops + cross-scheduler submitResume vs pthread's adaptive
spin + futex hand-off. Lists the contention paths that need the
fix, the affected files, and three plausible fixes (adaptive spin,
hand-off, hybrid spin-park primitive). Until that lands,
lib/streams.zig and lib/data-structures.zig keep compat.Mutex and
the latent OS-thread-blocking issue is documented but unfixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (S6)

The run-loop's per-iteration "if idle, steal from a victim" block in
scheduler.zig was inline in run() with no callable seam, so existing
loom scenarios -- which manually drive fiber switches and never enter
run() -- couldn't reach its 4 active_tasks fetchAdd/fetchSub sites.

Refactored the block into pub fn idleStealFrom(self, victim):
  - Same behavior; just hoisted out of the run-loop.
  - run() now calls self.idleStealFrom(victim) inside the existing
    idleness/registry gate.
  - Public so loom tests can drive the steal+accounting paths
    directly without the run-loop's implicit registry+rng deps.

Two loom scenarios in parking-lot-loom.zig exercise both arms:
  - testIdleStealFromStackful: 2 schedulers, push 4 stub Tasks to
    victim, steal -> covers stackful steal's fetchAdd/fetchSub pair.
  - testIdleStealFromFsm: empty stackful + 4 FSM stubs in victim ->
    fall-through covers the FSM steal's fetchAdd/fetchSub pair.

Both scenarios verify count consistency (stealer_delta ==
victim_delta) post-steal as a basic invariant. The atomic ops
themselves are RMWs so can't lose updates; the loom value is line
coverage of the steal path that was previously line-missing.

Brings runtime/scheduler.zig to 63/163 (was 59/163). Closes the
S6 active_tasks-accounting cluster from
docs/agents/parking-mutex-performance-problems.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k (S2+S5+S1 partial)

Three loom scenarios in parking-lot-loom.zig + a small extraction in
scheduler.zig:

  testCrossSchedulerResumeFlow: 2 schedulers (a, b). Sets active=a,
  calls sched_b.submitResume(stub_task) -- routes through the cross-
  scheduler SPSC channel push since sender (a) != target (b). Then
  sched_b.drainChannels() processes the queued Resume message.
  Covers:
    - 896: in_inbox.cmpxchgStrong IDLE -> IN_QUEUE (S5 wake CAS)
    - 928: dirty_mask.fetchOr to signal target sched (S1)
    - 1053: drainChannels Resume case status.store(.Ready) (S2)

  testCoopYieldWithWork: fiber under LoomHarness pushes a stub Task
  to ready_queue (so hasWork() is true), then calls coopYield.
  Drives:
    - 1631: coopYield's status.store(.Ready) on the cooperative-
      yield path (S2)

  testWakeExpiredSleepers: pushes a stub Task with wake_time in the
  past onto sleeping_queue, calls wakeExpiredSleepers. Drives:
    - 1188: sleeping-queue scan status.store(.Ready) (S2)

  Refactor: extracted scheduler.run()'s sleeping-queue scan (lines
  1180-1194) into pub fn wakeExpiredSleepers(self). Same shape as
  the idleStealFrom extraction from S6 -- public seam so loom tests
  can drive the wake path without entering the full run() loop.

Brings runtime/scheduler.zig to 67/163 (was 63/163, +4 sites). Closes
the actively-coverable subset of S2+S5+S1; remaining sites in those
clusters (1325 in_inbox check inside run()'s dispatch switch, 1964
timeout-fire in scanLockWaiters, 2046 io_uring CQE, 878 dirty_mask
in submitSpawn) need either run() loop entry or io_uring fixtures
which are out of scope for plain loom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uler (S9 + S1)

Two scenarios in parking-lot-loom.zig:

  testPickTwoRoundRobin: registers 3 stub schedulers in
  global_registry, calls pickTwo() in a loop. Drives:
    - 2123: next.fetchAdd(1, .monotonic) (S9)
    - 2124: slots[i % n].load(.acquire) (S9)
    - 2125: slots[(i +% 1) % n].load(.acquire) (S9)
    - 2153: register's slot.cmpxchgStrong(null, sched) (S10 drive-by)

  testCrossSchedulerFsmResumeFlow: 2 schedulers; sender (a) is
  active_scheduler but submitFsmResume targets sched_b. Drives:
    - 878: dirty_mask.fetchOr in submitFsmResume (S1)
    - drainChannels FsmResume case fields (status set, push to
      fsm_ready_queue)

Brings runtime/scheduler.zig to 72/163 (was 67/163, +5 sites).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… RemoteCall (S10+S11+S3)

Three small scenarios in parking-lot-loom.zig:

  testRegistryCrossIterPinPaths: registers a single scheduler in
  global_registry, calls pinTask + pinFsmTask with synthetic stub
  pointers (which won't resolve to any slab). Covers the slot-load
  branch of both walkers (S10).

  testWaitGroupDoneSpinlock: WaitGroup.add(2) + done() x 2 -- first
  done covers the prev != 1 release-and-return path (line 2755);
  second done covers the prev == 1 last-decrement path (lines
  2760-2765). The lock.swap(1, .acquire) spin acquire (line 2749)
  + counter.fetchSub (line 2753) fire on each call (S11). Sketched
  after this list.

  testRemoteCallCompletion: pushes a synthetic RemoteCall message
  with a small completion (no waiter parked) into a scheduler's
  channel, calls drainChannels. Drives the RemoteCall switch arm
  including the completion.finished.store(true, .release) at line
  1097 (S3). The wg.done() at 1098 also re-exercises S11's spinlock
  paths.
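
In miniature, the two done() arms that scenario drives (WaitGroup init
shape assumed):

```zig
var wg = WaitGroup{};
wg.add(2);
wg.done(); // prev != 1: spin acquire (2749) + fetchSub (2753), release-and-return (2755)
wg.done(); // prev == 1: last decrement, runs the waiter-wake path (2760-2765)
```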

Brings runtime/scheduler.zig to 76/163 (was 72/163, +4 sites).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ial)

Adds pub fn scanLockWaitersPub to scheduler.zig (mirrors the existing
scanFsmLockWaitersPub seam) so loom tests can drive the timeout-fire
path without entering run().

Two scenarios in parking-lot-loom.zig:

  testScanLockWaitersTimeoutFire: synthetic Task in lock_waiters with
  waiting_for_lock pointing at a sentinel and lock_wait_start_ms = 0.
  With sched.lock_timeout_ms = 1, scanLockWaiters fires the timeout
  branch and clears wait fields + sets lock_timed_out + status=.Ready
  + enqueues the task. Covers lines 1907 (initial wait check), 1912
  (start_ms load), 1914 (deadline compare), 1965-1970 (clear fields
  + set timed_out), 1971 (enqueue). Drops the inner WaiterList
  re-check (lines 1932-1944) by leaving waiting_for_lock_list = null;
  those sites need a real parking-lot WaiterList -- defer.

  testScanFsmLockWaitersTimeoutFire: same shape on the FSM scanner.
  Drives lines 1702 (wait check), 1706 (start_ms), 1707 (deadline
  compare), 1734-1738 (clear fields), 1740 (push to fsm_ready_queue).

Brings runtime/scheduler.zig to 87/163 (was 80/163, +7 sites).
Remaining 0-hit sites in scanLockWaiters / scanFsmLockWaiters are the
WaiterList re-check branches (1932-1944, 1957) which fire only when
the wake-side races us mid-scan -- need real parking-lot WaiterList
state to drive deterministically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without these tests, parking-lot-loom never references WaitGroup.{
registerFsmWaiter,wait} or Semaphore.{acquire,release}, so the linker
strips them and kcov reports MISSING for ~25 atomic sites that
production code exercises in every spawn-bounded async block.

Four direct tests:

  testWaitGroupRegisterFsmWaiter — counter==0 fast-return,
  counter>0 happy-path park, and the inner-recheck race where
  count drops to 0 between the outer load and the lock acquire.

  testWaitGroupWaitNonFiber — sched.current_task is null at
  loom-test construction, so wait() takes the busy-wait branch
  and returns immediately when counter==0.

  testSemaphoreFastPath — count=2, two CAS-decrement acquires
  + two no-waiter releases (counter.fetchAdd branch).

  testSemaphoreReleaseWithWaiter — waiting_task=stub triggers the
  direct-grant branch in release(); active_scheduler is set to
  the same scheduler so submitResume's same-scheduler routing
  enqueues the stub without touching the SPSC machinery.

Brings runtime/scheduler.zig kcov coverage to 109/163 hit + 2 elided
(was 96/163 + 1). Nil-hit (function not linked) drops from 51 to 22.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
testIoSubmitFns drives all six io submission paths (read/write/
accept/connect/recv/send) with stubbed IoWaiter + SimRing. Each
submission stages an SQE and stores .Blocked into the task's status.
Without this, the io submit fns are dead-stripped from parking-lot-
loom (no caller) and L1811/1834/1842/1850/1858/1886 report MISSING
even though every fiber's I/O hot path goes through them.

testSchedulerRegistryFns covers getLeastLoaded (L2147-2148),
notifyAll (L2207, L2209), deinit reset paths (L2219-2224), and
count() live-slot scan (L2252, L2255). Two registered schedulers
with biased active_tasks load gives deterministic least-loaded
selection.

Brings runtime/scheduler.zig kcov coverage to 125/163 hit + 2 elided
(was 109/163 + 2). Nil-hit drops 22 -> 6.

Remaining 6 nil sites are residual inlined functions or paths the
loom binary never enters: drainChannels iteration + notifyDirty
fast-path (L845, L938, L944, L959), Scheduler.sleep() (L1650),
spawnSchedulers start signal (L2685).

The 30 remaining 0-hit sites all sit inside the run() main loop or
require a real fiber stack (Semaphore slow-path park, lock-waiter
race re-check). Closing N1 -- further coverage gains require either
running the full scheduler loop under SimRing or extracting many
more public seams; ROI gets steep from here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
testSleepTaskLinking calls sleepTask with a stub task and a wake_time
in the far future. Covers L1650 status.store(.Blocked) -- the only
remaining nil-hit site reachable from a direct call (no run() loop
or fiber stack required).

Brings runtime/scheduler.zig kcov coverage to 126/163 hit + 2 elided
(was 125/163 + 2). Nil-hit drops 6 -> 5.

Remaining 5 nil sites are NOT test-fixable from the loom binary:

  L845 (notifyDirty fetchOr), L938/L944 (drainChannels iter),
  L959 (RemoteCall completion store) -- inlined out at the loom
  binary's optimization level. The functions ARE reached by
  testCrossSchedulerResumeFlow / testRemoteCallCompletion (sim ops
  > 0, behavior verified) but kcov has no source-line entry for
  the inlined call sites.

  L2685 -- inside a top-level Zig `test` block in scheduler.zig,
  not run by the parking-lot-loom executable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Loom and VOPR target orthogonal axes (atomic-op interleaving vs.
fault/clock/retry simulation). The loom_atomic_coverage.rb tool
revealed which atomic sites Loom never reached; this tool does the
same for VOPR's surface area.

  src/tools/vopr_coverage.rb -- categorized scanner. Five auto-
    detected categories (time / random / net_io / fs_io / ring_io)
    via grep-style patterns, plus a sixth (retry) driven by source
    markers. Output mirrors the loom report shape.

  Two source conventions for retry sites:
    `// VOPR-START-RETRY: <description>` ... `// VOPR-END-RETRY`
      -- multi-line block. Standalone comment lines have no kcov
      hit count, so the scanner attributes them to the first
      instrumented line at-or-after the marker (the loop header).
    `// VOPR-RETRY` on the same line as a single-statement retry
      -- compact form for one-line spinlock acquires. (Both forms
      are sketched after this list.)

  29 markers added across 6 files at the high-value retry loops:
    - versioned.zig: 4 (MVCC update/updateFlow/updateMulti)
    - atomic_ptr.zig: 2 (update/updateFlow CAS retries)
    - scheduler.zig: 6 (WaitGroup/Semaphore primitives)
    - data-structures.zig: 15 (sharded inner-lock spins)
    - observable.zig: 1 (SpinLock CAS acquire)
    - queues.zig: 1 (WaiterList spin)
  parking-lot.zig retry loops are intentionally excluded -- Loom
  covers their CAS interleaving structurally.

  zig/build.zig -- `-Dcoverage-vopr` option + `coverage-vopr` step.
  Mirrors `coverage-loom`. Wraps every `*-vopr-test.zig` entry under
  kcov, merges to `zig-out/coverage-vopr/merged/kcov-merged/cobertura.xml`.
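
Both marker forms in context (hypothetical loop bodies; only the
comments are the convention the scanner reads):

```zig
const std = @import("std");

fn acquireTag(tag: *std.atomic.Value(u64), my_tag: u64, max_retries: u32) !void {
    var retries: u32 = 0;
    // VOPR-START-RETRY: tag-acquire retry on a contended MVCC cell
    while (tag.cmpxchgWeak(0, my_tag, .acq_rel, .acquire) != null) {
        retries += 1;
        if (retries >= max_retries) return error.UpdateRetriesExhausted;
    }
    // VOPR-END-RETRY
}

fn spinAcquire(lock: *std.atomic.Value(u8)) void {
    while (lock.swap(1, .acquire) != 0) {} // VOPR-RETRY
}
```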

First report (4 vopr tests: vopr-test, fsm-vopr-test,
fsm-lock-vopr-test, versioned-vopr-test):

  Time                         6/32  ( 18.8%)
  Random                       0/4   (  0.0%)
  Network IO (raw)             0/1   (  0.0%)
  Filesystem IO (raw)          1/25  (  4.0%)
  io_uring (RingType seam)     1/10  ( 10.0%)
  Retry markers                6/29  ( 20.7%)
  TOTAL                       14/101 ( 13.9%)

Major gaps: lib/atomic_ptr.zig, lib/data-structures.zig,
lib/observable.zig are FILE-NOT-LOADED -- no current VOPR test
imports them. The 6 retry markers that DO hit are all in scheduler/
versioned (which fsm-vopr / versioned-vopr already exercise).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-threaded deterministic property test for AtomicPtr(T).
Mirrors versioned-vopr-test shape: seeded PRNG drives a random
sequence of read / readHold / releaseHeld / update / reclaim ops,
checking three invariants per step (post-update read, held-guard
dereference still valid, limbo grew by 1 on update).

Goal: get lib/atomic_ptr.zig into the VOPR coverage tree. Before
this test, the file was FILE-NOT-LOADED in cobertura -- no VOPR
test imported it. After, the file is LOADED and the AtomicPtr
update retry marker reaches FILE-LOADED state.

VOPR coverage delta: retry markers 6/29 -> 7/29; lib/atomic_ptr.zig
gaps reclassified from FILE-NOT-LOADED to LINE-MISSING/0-hit (real
single-threaded VOPR can't fault-inject a CAS miss, so the inner
retry body still doesn't execute -- the file is just instrumented now).

Skipped lib/observable.zig and lib/data-structures.zig: their retry
loops are inner-lock spins that need true contention to fault-inject.
Single-thread VOPR can never lose a CAS, so the loop body never
executes. Those files need a SimAtomic-with-CAS-fault-injection mode
to be useful -- distinct work item.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the SimAtomic / SimRing pattern: a comptime alias inside
lib/compat.zig picks up `SimClock` / `SimRandom` if the test root
exports them, falling through to the OS clock_gettime / getrandom
syscalls otherwise. Production builds (no SimClock/SimRandom on
root) inline the OS path -- zero overhead.

  zig/runtime/vopr-clock.zig
    SimClock with virtual_ns state, reset() / advanceMs() /
    advanceNs() / milliTimestamp() / nanoTimestamp(). Single-thread
    (matches the runtime's VOPR tests). Self-test verifies
    advance/read symmetry.

  zig/runtime/vopr-random.zig
    SimRandom backed by std.Random.DefaultPrng. seed() / fill().
    Self-test verifies seed determinism + cross-seed divergence.

  zig/lib/compat.zig
    Two `sim_*_decl = if (@hasDecl(root, "SimX")) root.SimX else void`
    seams. milliTimestamp / nanoTimestamp / randomBytes consult the
    seam at the top, fall through to OS (seam sketched after this list).

  All five existing VOPR test entries (vopr-test, fsm-vopr-test,
  fsm-lock-vopr-test, versioned-vopr-test, atomic-ptr-vopr-test)
  now `pub const SimClock = ...; pub const SimRandom = ...;` at
  module root.
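
The seam shape in compat.zig, as a sketch (assumes SimClock exposes
container-level fns, per vopr-clock.zig above):

```zig
const std = @import("std");
const root = @import("root");

const sim_clock_decl = if (@hasDecl(root, "SimClock")) root.SimClock else void;

pub fn milliTimestamp() i64 {
    // comptime-known branch: the dead side is never analyzed, so production
    // builds (no SimClock on root) inline the OS path with zero overhead.
    if (sim_clock_decl != void) {
        return sim_clock_decl.milliTimestamp();
    }
    return std.time.milliTimestamp();
}
```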

VOPR coverage delta:

  Before V6+V7:                          After V6+V7:
    Time             6/32  (18.8%)        Time             6/34  (17.6%)
    Random           0/4   ( 0.0%)        Random           0/4   ( 0.0%)
    Net IO (raw)     0/1   ( 0.0%)        Net IO (raw)     0/1   ( 0.0%)
    FS IO (raw)      1/25  ( 4.0%)        FS IO (raw)      1/25  ( 4.0%)
    Ring IO          1/10  (10.0%)        Ring IO          1/10  (10.0%)
    Retry            7/29  (24.1%)        Retry            7/29  (24.1%)
    TOTAL           15/101 (14.9%)        TOTAL           15/103 (14.6%)

The two new Time sites (compat.zig:150, 157) are the SimClock
seam checks themselves; they correctly route to the simulator
under VOPR but kcov can't track the comptime `if (decl != void)`
branch resolution at source-line granularity.

Coverage doesn't move because the existing VOPR scenarios are
single-threaded MVCC / FSM / parking-lot dispatch tests that don't
drive timeouts, sleep, or randomness. The shim is plumbing for
FUTURE VOPR scenarios (timeout-driven retries, sleep wakeup,
randomized fault injection) -- now possible to write
deterministically. Pre-V6 those scenarios would have introduced
real-clock / OS-entropy non-determinism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…out (V8)

Three deterministic timeout scenarios that exercise the time-read
sites in scheduler.zig kcov was reporting as 0-hit:

  scheduler.zig:1456  wakeExpiredSleepers: const now = milliTimestamp();
  scheduler.zig:1910  scanLockWaiters:     const now_ms = milliTimestamp();

Each scenario stamps `lock_wait_start_ms` / `wake_time` at a fixed
offset from `compat.milliTimestamp()` (now+0 = inside-deadline,
now-200 = past-deadline). No real-time waits, no flake. Covers both
the inside-deadline (no fire) and past-deadline (fire) branches.

Mirrors the parking-lot-loom S8 scenarios but bare-bones without
the loom harness -- pure single-thread VOPR shape.

GAP-B caveat (worth flagging for the next round): the SimClock
seam in lib/compat.zig is behind `@hasDecl(@import("root"), ...)`.
Under `b.addTest`, root is Zig's auto-generated test_runner module,
NOT this entry file. The same issue parking-lot-loom hit pre-2026-05.
Activating SimClock for VOPR tests requires promoting them to
`b.addExecutable` so root resolves to our entry. Until then these
scenarios use offset-from-real-clock to drive timeouts deterministically.

VOPR coverage delta:

  Time             6/32 (18.8%)  ->  8/34 (23.5%)
  Total            14/101         ->  17/103

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctivates (V10)

Promotes scheduler-timeout-vopr from b.addTest to b.addExecutable so
the comptime SimClock seam in lib/compat.zig actually activates. Same
GAP-B fix parking-lot-loom went through pre-2026-05.

Why this matters: under b.addTest, `@import("root")` resolves to Zig's
auto-generated test_runner module, NOT our entry file. The
`@hasDecl(root, "SimClock")` check in compat.zig's milliTimestamp /
nanoTimestamp wrappers therefore returns false, the seam silently
falls through to OS clock_gettime, and "VOPR-deterministic" timeouts
are actually real-time-dependent.

Layout changes:

  zig/runtime/scheduler-timeout-vopr.zig (renamed from -test.zig)
    Now exports `pub fn testX() !void` per scenario. No `test "..."`
    blocks -- the executable wrapper drives them via main().

  zig/scheduler-timeout-vopr-test.zig (rewritten)
    Top-level executable wrapper. Re-exports SimClock + SimRandom at
    module root, defines a tests array, main() iterates with pass/fail
    accounting. Mirrors parking-lot-loom-test.zig structure.

  zig/build.zig
    New stv_exe (b.addExecutable) parallel to pl_loom_exe. Wired into
    test-loom-vopr (no-coverage) and coverage-vopr (kcov-wrapped).
    Removed the prior b.addTest entry from the test_files loop.

  zig/runtime/scheduler-timeout-vopr.zig (impl rewrite)
    Now uses SimClock.advanceMs() to drive timeouts deterministically:
      * stamp lock_wait_start_ms with SimClock.milliTimestamp() (= 0)
      * advanceMs(50) -- inside deadline, no fire
      * advanceMs(100) -- past deadline, fires
    First scenario is `testSimClockActive` -- a GAP-B regression gate
    that fails if compat.milliTimestamp() doesn't track SimClock.

  src/tools/vopr_coverage.rb
    TEST_FILE_RE now also matches `*-vopr.zig` (impl side of
    executable-style VOPR tests). Otherwise the test scenario file
    counted as production runtime and inflated the site catalog.

Verification: GAP-B gate passes (SimClock IS active under
scheduler-timeout-vopr executable). All 4 scenarios green.
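
A rough sketch of the wrapper shape (paths per this PR's layout; the
SimClock API per V6, with reset/advance semantics assumed):

```zig
// zig/scheduler-timeout-vopr-test.zig -- built with b.addExecutable, so
// @import("root") IS this file and the @hasDecl seam in compat.zig sees
// these decls. Under b.addTest, root would be Zig's test_runner instead.
pub const SimClock = @import("runtime/vopr-clock.zig").SimClock;
pub const SimRandom = @import("runtime/vopr-random.zig").SimRandom;

const compat = @import("lib/compat.zig");

pub fn main() !void {
    // GAP-B gate (first scenario): fail fast if the seam silently fell
    // through to the OS clock. Assumes reset() zeroes the virtual clock.
    SimClock.reset();
    SimClock.advanceMs(100);
    if (compat.milliTimestamp() != 100) return error.SimClockInactive;
    // ...remaining scenarios: stamp lock_wait_start_ms at virtual t, then
    // advanceMs(50) (inside deadline, no fire) / advanceMs(100) (fires).
}
```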

Coverage delta is small (Time 8/34 -> 9/34) because the prior
b.addTest version already hit the same scheduler.zig sites via real-
clock fallthrough -- the structural win is determinism, not raw
coverage. Future VOPR scenarios that rely on virtual-clock advance
(sleep wakeup ordering, multi-task timeout races, idle-arming
timeouts in the run loop) can now be written without flakiness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo changed the base branch from master to vopr-loom-audit on May 8, 2026 16:21

codecov-commenter commented May 8, 2026

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (vopr-loom-audit@fed8997). Learn more about missing BASE report.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@                Coverage Diff                 @@
##             vopr-loom-audit      #26   +/-   ##
==================================================
  Coverage                   ?   88.64%           
==================================================
  Files                      ?      163           
  Lines                      ?    45270           
  Branches                   ?    11256           
==================================================
  Hits                       ?    40130           
  Misses                     ?     5140           
  Partials                   ?        0           
Flag Coverage Δ
ruby 83.89% <ø> (?)
zig 95.84% <ø> (?)

Flags with carried forward coverage won't be shown.



github-actions Bot commented May 8, 2026

🐰 Bencher Report

Branch: vopr-loom-audit-docs
Testbed: ubuntu-latest

⚠️ WARNING: No Threshold found!

Without a Threshold, no Alerts will ever be generated.


All benchmark results (no thresholds configured):

Benchmark                                                   leak-build-ms (x 1e3)   leak-count   leak-run-ms
benchmarks/concurrent/05_backpressure/bench                 4.33                    0.00         1,960.30
benchmarks/concurrent/10_shard_vs_locked/bench              4.26                    0.00         60,003.96
benchmarks/concurrent/16_observables/bench                  4.27                    0.00         99.62
benchmarks/inter-clear/03_concurrent_mvcc_vs_rwlock/bench   4.59                    0.00         410.02
benchmarks/sequential/07_pointer_chase/bench                4.23                    0.00         315.91
benchmarks/sequential/12_weak_ref_graph/bench               4.35                    0.00         106.76
benchmarks/server/03_pathological/server                    4.46                    0.00         1,002.75

Audience: someone fluent in Ruby/Python/JS but new to concurrent
testing. Explains why concurrent code has bugs sequential code can't
have, what Loom and VOPR each protect against, and how to use the
coverage scanners to find at-risk sites that no test exercises.

Sections:
1. What the tests are (one paragraph each)
2. What they protect against — what / when / how matrix per system
   (Loom / Hammer / VOPR), with the explicit notes that Loom does
   NOT catch deadlocks (Hammer + TSan does; lock ordering prevents)
   and VOPR DOES catch livelocks (deterministic retry sequencing
   makes the symmetry visible)
3. Clearest worked example of the bug class for each (counter that
   loses updates for Loom; timeout that doesn't fire under load for
   VOPR), with an explicit explanation of why sequential programming
   is immune (the counter case is sketched after this list)
4. Coverage scanners: src/tools/loom_atomic_coverage.rb and
   src/tools/vopr_coverage.rb. What each scans for, the LOOM-EXCLUDE
   / VOPR-EXCLUDE marker convention, and how to use the reports as
   a checklist
5. Anatomy of a Loom test, walking through ownership-loom-test.zig:
   the SimAtomic activation seam, the binary-schedule enumeration,
   the GAP-B counter gate, and what to check when reading one
6. Anatomy of a VOPR test, walking through scheduler-timeout-vopr-
   test.zig: the SimClock / SimRandom seams, the activation gate as
   first scenario, the b.addExecutable (NOT b.addTest) requirement,
   and what to check when reading one. Includes a "VOPR's blind
   spot" subsection: VOPR's coverage is bounded by the simulator's
   coverage; concrete unmodeled failures (disk hangs, mid-stream
   TCP RSTs, kernel pauses, NTP clock jumps) are invisible to VOPR.
   Mitigation posture: each production bug should produce either a
   new VOPR scenario or a new shim capability. Hammer and VOPR are
   complementary — Hammer has no model gap; VOPR has no determinism
   gap.
7. CLAUDE.md gates: Loom + VOPR + Hammer tests are merge requirements
   for any zig/ change that introduces atomics, I/O, or locks. The
   coverage scanners make this mechanically checkable.
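
For reference, the §3 Loom example in miniature (a hedged sketch of the
bug class, not the doc's exact code):

```zig
var counter: u32 = 0; // shared, non-atomic

fn bump() void {
    const v = counter; // fibers A and B can both load 0 here...
    // <- interleaving point: the scheduler may switch fibers
    counter = v + 1; // ...and both store 1 -- one increment is lost
}
// Sequential code is immune: nothing runs between the load and the store.
// Loom enumerates schedules until it finds the one where both loads happen
// before either store, then replays it deterministically.
```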

All file references verified to resolve under this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo force-pushed the vopr-loom-audit-docs branch from c277be6 to 72ef99c on May 8, 2026 16:30
cuzzo and others added 2 commits May 8, 2026 17:06
Five targeted improvements based on cross-review against the Concurrent
Systems Programming Cheat Sheet:

1. **Cheat-sheet pointer at the top.** New callout block introduces
   the cheat sheet as the home for terminology and the broader
   concurrency-primitive landscape, so this doc can stay focused on
   Loom + VOPR specifics. Three placeholder URLs (TODO_CHEAT_SHEET_URL)
   marked for fill-in when the cheat sheet is published.

2. **New §1.5 "What exists in other languages."** Short subsections on
   Rust, Go, C/C++, and Zig — what each provides for Hammer / Loom /
   VOPR equivalents (e.g. tokio-rs/loom crate, tokio::time::pause(),
   go test -race), what's NOT built-in, and what production cost the
   testing seam imposes. Closes with a summary table comparing the
   five environments. Lets a reader from Rust / Go / C++ understand
   how CLEAR's setup relates to what they know.

3. **§2 jargon glossary.** Added a callout that points to the cheat
   sheet for TSan / ASan / FFI / lock-contention / etc., plus inline
   one-line definitions of the two terms NOT in the cheat sheet:
   ABA-like race and half-published state.

4. **§2 OS-boundary explanation.** "any other non-determinism that
   crosses an OS boundary" now explicitly lists what that means —
   asking for the time, randomness, network packet ordering, disk
   I/O completion timing.

5. **Activation-gate warning callouts.** Pulled the GAP-B (Loom) and
   SimClock (VOPR) silent-failure traps into prominent ⚠ warning
   blocks in §5 and §6 respectively. Each one explains the silent
   failure mode, the mitigation (counter-advanced check / first-
   scenario time-advance check), and the prescription ("every Loom
   suite must have this gate"). These are the most important review
   checks for someone reading a Loom or VOPR test.

Also added a one-paragraph callout explaining `comptime` for readers
unfamiliar with compile-time metaprogramming, framed as
"dependency injection that resolves at compile time."

Net: +186 / -11 lines. No reordering of existing sections; all
additions are top-loaded with cross-references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The §1 VOPR section had three implementation paragraphs (comptime
seam, production cost, "what is comptime"); the parallel Loom
section had only "what it is" + one "the trick" paragraph. Asymmetry
made VOPR feel like an implementation walkthrough rather than a
high-level definition.

This commit:
1. Trims VOPR §1 to mirror Loom's structure: one "what it is" + one
   short "the trick" paragraph, with forward references to §6 (the
   activation mechanism) and §1.5's Zig subsection (comptime).
2. Moves the "What is comptime?" callout from §1 into §1.5's Zig
   subsection, where comptime is first explained as a feature.

The §6 anatomy section already covers the comptime seam and the
b.addExecutable requirement in detail; deleting from §1 doesn't lose
information.

Net: -3 lines (16 added, 19 removed). Symmetric §1 sections now read
"this is what they are" rather than "this is what they are and how
they're built."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo force-pushed the vopr-loom-audit-docs branch from bbad923 to 6bb1e1e on May 8, 2026 17:39
cuzzo and others added 6 commits May 8, 2026 17:48
Adds docs/correct-systems-programming-cheat-sheet.md (the broader
context doc previously referenced as TODO_CHEAT_SHEET_URL) and
replaces the three placeholders in docs/agents/loom-vopr-getting-started.md
with relative links to the new file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Link 'how to get LLMs to write better systems-level code' to the
  LLM workflow section, and 'my reasoning for building CLEAR' to the
  Why I'm building CLEAR section.
- Fix 'how to get LLMs write' -> 'how to get LLMs to write'.
- Indent Helgrind / Memcheck / Massif as sub-bullets under Valgrind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… note

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cuzzo cuzzo force-pushed the vopr-loom-audit branch from 3d1647d to 13df362 on May 8, 2026 21:07