Vopr loom audit docs #26
Open
cuzzo wants to merge 33 commits into vopr-loom-audit from vopr-loom-audit-docs
Conversation
Adds `zig build coverage-loom -Dcoverage-loom` -- a separate kcov build scoped to *-loom-test.zig and the parking-lot-loom executable, writing to zig-out/coverage-loom/merged/kcov-merged/cobertura.xml. It is distinct from the existing -Dcoverage path because the report is meant to answer "what atomic sites does Loom miss?" -- mixing in unit/TSan/VOPR coverage would defeat that.
Also adds src/tools/loom_atomic_coverage.rb, a standalone scanner that cross-references atomic-op call sites in zig/runtime + zig/lib against the cobertura XML and reports gaps. Three uncovered categories:
- 0-hit: kcov instrumented the line, but it never executed
- file unloaded: the file appears in no *-loom-test.zig
- line missing: the file is loaded but the function is never called
Includes a sandwich-rule heuristic for kcov's inline-elision artifact (Zig's `pub inline fn` atomic wrappers lose DWARF line markers when LLVM inlines them, so kcov reports 0 hits on lines whose surrounding block actually executed). Recognizes `// LOOM-EXCLUDE-BEGIN` / `-END` source markers for code regions Loom can't reach by design.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
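The sandwich rule can be sketched in a few lines of Python. The helper name and the hits-map shape here are illustrative, not the scanner's actual internals:

```python
def sandwich_elided(hits, line):
    """Heuristic sketch: a 0-hit line is treated as kcov inline-elision
    (not a real gap) when the nearest instrumented line above AND the
    nearest instrumented line below both executed. `hits` maps line
    number -> hit count for instrumented lines only."""
    if hits.get(line, 0) != 0:
        return False  # line actually executed; nothing to reclassify
    above = max((n for n in hits if n < line), default=None)
    below = min((n for n in hits if n > line), default=None)
    return (above is not None and below is not None
            and hits[above] > 0 and hits[below] > 0)

# inlined wrapper on line 11, surrounded by executed lines: elided
assert sandwich_elided({10: 5, 11: 0, 12: 5}, 11) is True
# whole block never ran: a real coverage gap, keep reporting it
assert sandwich_elided({10: 0, 11: 0, 12: 0}, 11) is False
```

The key property is that the heuristic only fires when the surrounding block demonstrably ran, so genuinely dead code is never hidden.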
ParkingMutex.lock, ParkingRwLock.lockWrite, and ParkingRwLock.lockRead each branch on `if (sched_opt == null)` for a no-fiber-scheduler acquire path. Loom always runs with a scheduler, so getScheduler() never returns null in loom scenarios -- the 10 atomic ops in those three blocks are unreachable under the loom harness by design, not real coverage gaps. Coverage of the thread-only paths comes from parking-lot-hammer-test.zig and parking-rwlock-fiber-hammer-test.zig under TSan.
The LOOM-EXCLUDE markers are read by src/tools/loom_atomic_coverage.rb to drop these sites from the gap report. They have zero runtime cost and zero behavioral divergence between production and tests -- a runtime check would corrupt loom's own metrics (panic-path lock acquisitions would be counted as sim atomic ops).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
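A minimal sketch of how a scanner might collect the excluded regions (Python for illustration; the real tool is Ruby and its internals may differ):

```python
def excluded_lines(source_lines):
    """Collects line numbers between LOOM-EXCLUDE-BEGIN and
    LOOM-EXCLUDE-END markers so atomic sites inside those regions
    can be dropped from the gap report."""
    excluded, depth = set(), 0
    for lineno, text in enumerate(source_lines, start=1):
        if "LOOM-EXCLUDE-BEGIN" in text:
            depth += 1
        elif "LOOM-EXCLUDE-END" in text:
            depth = max(0, depth - 1)
        elif depth > 0:
            excluded.add(lineno)
    return excluded

src = [
    "const x = a.load(.acquire);",
    "// LOOM-EXCLUDE-BEGIN",
    "if (sched_opt == null) {",          # thread-only path
    "    state.fetchAdd(1, .acq_rel);",  # unreachable under loom
    "}",
    "// LOOM-EXCLUDE-END",
]
assert excluded_lines(src) == {3, 4, 5}
```

Because the markers are plain comments, they carry zero runtime cost -- exactly the property the commit message insists on.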
Adds loom-shimmed scenarios for atomic-op surfaces that the existing
suite never exercised:
AtomicPtr.updateFlow (lib/atomic_ptr.zig):
- .cont_commit path covers the load + cmpxchgWeak retry loop
- .skip_no_commit path covers the early-return branch
Brings atomic_ptr.zig from 6/8 to 8/8 covered atomic ops.
Versioned(T) (runtime/versioned.zig):
- deinitSync no-readers teardown
- updateFlow commit + skip variants
- updateMulti 2-cell happy path (tag-acquire + commit-store)
- updateMulti rollback on user-txn error
Brings versioned.zig from 5/14 to 13/14. Line 565 (contention
rollback under concurrent updateMulti) remains -- needs a
multi-fiber harness scenario.
ParkingMutex.tryLock + presetLocked (lib/parking-lot.zig):
Direct single-thread test under the loom comptime alias. The
existing harness scenarios all route through lock() into
lockSlow's parking path; tryLock and the presetLocked test
rendezvous helper had no caller. Lines 640, 644, 651 closed.
Overall loom atomic coverage: 38.4% -> 40.3% (+13 sites closed,
+10 reclassified as cluster-B confounders via LOOM-EXCLUDE).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
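The two updateFlow paths above follow the classic CAS retry shape: load, ask the user flow for a decision, either return early or retry the compare-exchange until it wins. A toy Python model (`Cell` and `update_flow` are stand-ins, not the real AtomicPtr API):

```python
class Cell:
    """Minimal stand-in for an atomic cell."""
    def __init__(self, value):
        self.value = value

    def load(self):
        return self.value

    def compare_exchange(self, expected, new):
        if self.value == expected:   # success: install the new value
            self.value = new
            return True
        return False                 # lost the race: caller retries

def update_flow(cell, flow):
    """Models the two covered paths: .skip_no_commit returns early
    without a store; .cont_commit retries the CAS until it succeeds."""
    while True:
        old = cell.load()
        action, new = flow(old)
        if action == "skip_no_commit":
            return old               # early-return branch
        if cell.compare_exchange(old, new):
            return new               # commit branch

c = Cell(41)
assert update_flow(c, lambda v: ("cont_commit", v + 1)) == 42
assert update_flow(c, lambda v: ("skip_no_commit", None)) == 42
```

Under Loom, each load and compare-exchange is a yield point, which is what lets the harness enumerate the retry interleavings.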
…e pre-check
Closes the remaining 7 atomic sites in lib/parking-lot.zig with four
new loom scenarios:
Mutex / RwLock-write / RwLock-read post-park epilogue (lines 969,
970, 1267, 1269, 1270, 1675):
Holder fiber acquires the lock, yields to let parker park, then
pre-sets parker.lock_timed_out=true via direct atomic store
(.release) before unlocking. The wake-vs-timeout race lets the
parker exit park-loop with timed_out=true and run the real
epilogue: line 969 resets the flag, line 970 (+1269/1270/1675)
re-checks state for owner-grant. Existing testTimeoutAtomicCoverage
only exercised the parker/scanner ordering handshake without ever
routing through lockSlow's epilogue.
FSM rwlock 2-writer contention (line 1327):
Two FSM writers contesting the same rwlock. The second one to
enter tryWriteLockForFsm sees WRITE_LOCKED_BIT set (held by the
first) and runs the re-entrancy / cycle pre-check that loads
fsm_write_owner. Existing FSM rwlock tests (1W+1R, 1W+2R) all
enter tryWriteLockForFsm with state == 0, so the if at line 1326
never triggers.
Brings lib/parking-lot.zig to 198/198 (100%) covered atomic ops in
the loom report. Overall loom atomic coverage 40.3% -> 41.1%.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AtomicInbox was the original cross-scheduler messaging primitive -- a Treiber-stack MPSC list of InboxNode-embedded structs. spsc.zig's SpscRing replaced it for production messaging (UAF bugs from linked-list node reuse) and AtomicInbox has no remaining production callers. Verified dead via comptime probe: @hasField(Scheduler, "inbox") == false.
Three race tests that called the long-removed `target.inbox.push(...)` have been silently broken since the SPSC migration: under `b.addTest`, files with zero `test {}` blocks pass via Zig's lazy analysis (no tests means function bodies are never analyzed); under `hammer_exe_files`, build-config errors abort before reaching them. They have not actually compiled in some time and are removed entirely.
Removes:
- lib/queues.zig: AtomicInbox / InboxNode / InboxType types
- lib/queues.zig: Task.inbox_link field (vestigial, never read or written)
- runtime/scheduler.zig: type aliases + inbox_link field on SpawnRequest and RemoteCall
- runtime/queues-test.zig: AtomicInbox tests + thiefWorker dummy-inbox arg
- runtime/scheduler-race-test.zig (broken)
- runtime/inbox-race-test.zig (broken)
- runtime/inbox-race-smoke-test.zig (broken)
- zig/build.zig: 3 test_files entries + 2 hammer_exe_files entries
Validation:
- zig build test -> 162 passed, 1 skipped, 0 failed
- zig build test-tsan -> all 60 tests passed
- zig build test-loom-vopr -> all 36 tests passed
- ./clear test transpile-tests/ -> 514 passed, 0 leaks
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds zig/versioned-multi-loom-test.zig -- a standalone executable
that drives two fibers concurrently calling Versioned.updateMulti on
overlapping cell-sets, deterministically reaching the contention-
rollback branch at versioned.zig:574 (the per-cell tag-release store).
The line had been line-missing in the loom kcov report because no
existing scenario could trigger it: contention requires one fiber to
acquire SOME (>0) tags, then exhaust the inner-retry budget on a
later cell while another fiber holds it. Single-fiber tests don't
contend; the parking-lot-loom and vopr-loom harnesses are tightly
coupled to lock primitives + RunQueue and don't drive Runtime/EBR
fiber semantics.
Two design moves make the path reachable in a small enumerable
schedule space:
1. Test seam in versioned.zig: MAX_INNER_RETRIES_MULTI is now
overridable from `@import("root")` (mirroring the existing
MAX_UPDATE_RETRIES seam). Production keeps the 1024 default;
the harness sets it to 4 so the rollback fires after just 4
spin observations.
2. Cell layout: a contiguous `g_cells: [3]Versioned(i64)` array
guarantees address ordering g_cells[0] < g_cells[1] < g_cells[2]
regardless of Zig BSS layout. Fiber X uses {&g_cells[0..1]} and
Fiber Y uses {&g_cells[1..2]} -- their first acquisitions differ
(X tags g_cells[0], Y tags g_cells[1]) but they overlap on
g_cells[1]. Whichever fiber tries the second cell after the
other has tagged it spins out the inner-retry budget with
acquired > 0, exercising the rollback store.
Harness shape mirrors the existing PinDepthLoomHarness in
versioned-loom-test.zig: 2 fibers, schedule-driven yield picking,
round-robin tail after the explicit schedule exhausts (prevents
starvation that would otherwise hit error.UpdateRetriesExhausted).
Wired into build.zig as a new exe with its own kcov dir under
`-Dcoverage-loom`. Covers all 1024 binary schedules at depth 10
(69k SimAtomic ops total) with 0 invariant violations.
Brings runtime/versioned.zig to 14/14 (100%) in the loom atomic
coverage report. Closes the last Tier 1 gap. Overall loom coverage:
41.3% -> 41.4%.
The harness shape (contiguous cells + lowered retry budget +
round-robin tail) is reusable for the broader Tier 4 library gap
(observable, streams, data-structures, ownership) where similar
multi-fiber scenarios are needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
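The schedule-driven yield picking plus round-robin tail can be modeled with generators: each bit of the schedule chooses which of two fibers runs at the next yield point, and once the explicit schedule is exhausted a round-robin tail keeps both fibers making progress. Illustrative names only; the real harness drives Zig fibers through SimAtomic yield points:

```python
import itertools

def run_schedule(bits, make_fibers):
    """Runs the fibers under an explicit binary schedule, then a
    round-robin tail (prevents the starvation that would otherwise
    exhaust a retry budget)."""
    fibers = dict(enumerate(make_fibers()))
    picks = itertools.chain(list(bits), itertools.cycle([0, 1]))
    steps = 0
    while fibers and steps < 1000:
        steps += 1
        who = next(picks)
        if who not in fibers:
            who = next(iter(fibers))   # chosen fiber already finished
        try:
            next(fibers[who])          # run to the next yield point
        except StopIteration:
            del fibers[who]
    return not fibers                  # True when all fibers completed

def make_fibers():
    def fiber(n):
        for _ in range(n):
            yield                      # stands in for a sim atomic op
    return [fiber(3), fiber(5)]

# all 2**10 binary schedules at depth 10, as in the commit message
for s in range(1024):
    bits = [(s >> i) & 1 for i in range(10)]
    assert run_schedule(bits, make_fibers)
```

The enumeration stays tractable because the schedule space is bounded by the explicit depth; everything after that is deterministic round-robin.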
Adds zig/ownership-loom-test.zig -- four scenarios driving two-fiber
interleavings on Arc<T> / Weak<T> reference counts:
1. clone-vs-deinit: fetchAdd vs fetchSub on strong_count, including
the last-drop branch.
2. weak-upgrade-vs-strong-drop: cmpxchgWeak retry race when a
concurrent drop takes strong_count to zero mid-upgrade.
3. concurrent-downgrade: two simultaneous fetchAdd on weak_count.
4. inspect-accessors: refCount/weakCount/isAlive/strongCount + the
Weak.fromArc + Weak.clone fetchAdd sites.
Also adds the comptime SimAtomic alias to lib/ownership.zig (mirroring
the pattern in lib/queues.zig and lib/ebr.zig) so atomic ops on the
control block become deterministic yield points under Loom. Without
this, lib/ownership.zig hardcoded std.atomic.Value, leaving Arc/Weak
ops uninstrumented even when callers ran under SimAtomic shimming.
Brings lib/ownership.zig to 14/14 (100%) covered atomic ops in the
loom report. Validates the harness pattern (contiguous shared state +
two fibers + schedule-driven yield + round-robin tail) generalizes
beyond the EBR-aware versioned case to a pure shared-counter use
case -- exactly what the broader Tier 4 lib/ surface needs.
Overall loom coverage: 41.4% -> 43.0%.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
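The weak-upgrade-vs-strong-drop scenario exists because upgrade must observe a nonzero strong count atomically: a concurrent last drop can zero the count mid-upgrade, and upgrade must then fail rather than resurrect the object. A toy single-threaded model of the control block (not the real lib/ownership.zig API):

```python
class ControlBlock:
    """Toy Arc/Weak control block for the scenario described above."""
    def __init__(self):
        self.strong = 1
        self.weak = 1

    def upgrade(self):
        """Weak -> Arc: CAS-increment strong only while it is nonzero."""
        while True:
            cur = self.strong
            if cur == 0:
                return False         # object already dead: upgrade fails
            # models cmpxchgWeak: single-threaded, so it always succeeds;
            # under Loom a concurrent drop can force a retry here
            if self.strong == cur:
                self.strong = cur + 1
                return True

    def drop_strong(self):
        self.strong -= 1
        return self.strong == 0      # True on the last drop

cb = ControlBlock()
assert cb.upgrade() is True          # strong 1 -> 2
assert cb.drop_strong() is False
assert cb.drop_strong() is True      # last-drop branch
assert cb.upgrade() is False         # post-drop upgrade must fail
```

The interesting interleavings are exactly the ones the loom scenario enumerates: the drop landing between the `cur == 0` check and the increment.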
Mirrors the alias pattern already in lib/queues.zig, lib/ebr.zig, and lib/ownership.zig: declare a comptime `Atomic` selector at the top of the file that resolves to SimAtomic when the root module exports it (Loom mode), or to std.atomic.Value otherwise. Replaces all 38 std.atomic.Value usages with Atomic(...).
Production: zero behavioral change -- root has no SimAtomic, the alias picks std.atomic.Value, codegen is identical.
Loom: every Chunk.len.store/load, SubscriberRecord.seq.store/load, and concurrentBounded* err_code/next_idx atomic op becomes a yield point under the harness. Required so an upcoming streams-loom-test can deterministically interleave producer/consumer flows. Without this, lib/streams.zig's hardcoded std.atomic.Value would bypass the harness even when callers ran under SimAtomic shimming.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The SimAtomic alias commit accidentally shipped a self-referencing fallback (`else Atomic;` on the comptime branch), which the compiler flags as "value of declaration depends on itself" the moment any caller touches `lib/streams.zig`. The non-loom branch must resolve to `std.atomic.Value` -- the type the file used before the alias was introduced. The error went uncaught in 6cc9a50 because no immediate caller had been added; data-structures-test.zig surfaces it in the next test build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
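For readers unfamiliar with Zig comptime selectors, the decision translates loosely into Python terms (analogy only -- the real selection happens at compile time). The point of the fix: the fallback must name the concrete default type, never the alias being declared:

```python
class StdAtomicValue:   # stand-in for std.atomic.Value
    pass

class SimAtomic:        # stand-in for the loom shim type
    pass

def pick_atomic(root):
    """Correct selector: fall back to the concrete default type.
    The buggy commit wrote the equivalent of `else: Atomic` -- the
    alias referring to itself, which the compiler rejects."""
    return getattr(root, "SimAtomic", StdAtomicValue)

class LoomRoot:         # loom test root: exports the shim
    SimAtomic = SimAtomic

class ProdRoot:         # production root: no shim exported
    pass

assert pick_atomic(LoomRoot) is SimAtomic
assert pick_atomic(ProdRoot) is StdAtomicValue
```

In the Zig original the same choice is a `@hasDecl(@import("root"), "SimAtomic")` branch resolved per-build, so production codegen is identical to the pre-alias file.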
Captures the >11.8x slowdown when swapping lib/streams.zig's Inner.mutex from compat.Mutex (pthread) to ParkingMutex on the 08_pubsub benchmark (0.169s -> >2s TIMEOUT).
Documents the diagnostic chain: each ParkingMutex slow-path acquire chains 15+ atomic ops plus a cross-scheduler submitResume, versus pthread's adaptive spin + futex hand-off. Lists the contention paths that need the fix, the affected files, and three plausible fixes (adaptive spin, hand-off, hybrid spin-park primitive).
Until that lands, lib/streams.zig and lib/data-structures.zig keep compat.Mutex, and the latent OS-thread-blocking issue is documented but unfixed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (S6)
The run-loop's per-iteration "if idle, steal from a victim" block at
scheduler.zig was inline in run() with no callable seam, so existing
loom scenarios -- which manually drive fiber switches and never enter
run() -- couldn't reach its 4 active_tasks fetchAdd/fetchSub sites.
Refactored the block into pub fn idleStealFrom(self, victim):
- Same behavior; just hoisted out of the run-loop.
- run() now calls self.idleStealFrom(victim) inside the existing
idleness/registry gate.
- Public so loom tests can drive the steal+accounting paths
directly without the run-loop's implicit registry+rng deps.
Two loom scenarios in parking-lot-loom.zig exercise both arms:
- testIdleStealFromStackful: 2 schedulers, push 4 stub Tasks to
victim, steal -> covers stackful steal's fetchAdd/fetchSub pair.
- testIdleStealFromFsm: empty stackful + 4 FSM stubs in victim ->
fall-through covers the FSM steal's fetchAdd/fetchSub pair.
Both scenarios verify count consistency (stealer_delta ==
victim_delta) post-steal as a basic invariant. The atomic ops
themselves are RMWs so can't lose updates; the loom value is line
coverage of the steal path that was previously line-missing.
Brings runtime/scheduler.zig to 63/163 (was 59/163). Closes the
S6 active_tasks-accounting cluster from
docs/agents/parking-mutex-performance-problems.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k (S2+S5+S1 partial)
Three loom scenarios in parking-lot-loom.zig + a small extraction in
scheduler.zig:
testCrossSchedulerResumeFlow: 2 schedulers (a, b). Sets active=a,
calls sched_b.submitResume(stub_task) -- routes through the cross-
scheduler SPSC channel push since sender (a) != target (b). Then
sched_b.drainChannels() processes the queued Resume message.
Covers:
- 896: in_inbox.cmpxchgStrong IDLE -> IN_QUEUE (S5 wake CAS)
- 928: dirty_mask.fetchOr to signal target sched (S1)
- 1053: drainChannels Resume case status.store(.Ready) (S2)
testCoopYieldWithWork: fiber under LoomHarness pushes a stub Task
to ready_queue (so hasWork() is true), then calls coopYield.
Drives:
- 1631: coopYield's status.store(.Ready) on the cooperative-
yield path (S2)
testWakeExpiredSleepers: pushes a stub Task with wake_time in the
past onto sleeping_queue, calls wakeExpiredSleepers. Drives:
- 1188: sleeping-queue scan status.store(.Ready) (S2)
Refactor: extracted scheduler.run()'s sleeping-queue scan (lines
1180-1194) into pub fn wakeExpiredSleepers(self). Same shape as
the idleStealFrom extraction from S6 -- public seam so loom tests
can drive the wake path without entering the full run() loop.
Brings runtime/scheduler.zig to 67/163 (was 63/163, +4 sites). Closes
the actively-coverable subset of S2+S5+S1; remaining sites in those
clusters (1325 in_inbox check inside run()'s dispatch switch, 1964
timeout-fire in scanLockWaiters, 2046 io_uring CQE, 878 dirty_mask
in submitSpawn) need either run() loop entry or io_uring fixtures
which are out of scope for plain loom.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uler (S9 + S1)
Two scenarios in parking-lot-loom.zig:
testPickTwoRoundRobin: registers 3 stub schedulers in
global_registry, calls pickTwo() in a loop. Drives:
- 2123: next.fetchAdd(1, .monotonic) (S9)
- 2124: slots[i % n].load(.acquire) (S9)
- 2125: slots[(i +% 1) % n].load(.acquire) (S9)
- 2153: register's slot.cmpxchgStrong(null, sched) (S10 drive-by)
testCrossSchedulerFsmResumeFlow: 2 schedulers; sender (a) is
active_scheduler but submitFsmResume targets sched_b. Drives:
- 878: dirty_mask.fetchOr in submitFsmResume (S1)
- drainChannels FsmResume case fields (status set, push to
fsm_ready_queue)
Brings runtime/scheduler.zig to 72/163 (was 67/163, +5 sites).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
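The pickTwo walk reduces to one fetchAdd index plus two modular slot loads. A sketch with a stand-in `Registry` class (line semantics from the commit message; the class shape is illustrative):

```python
class Registry:
    """Toy model of the registry's two-victim picker."""
    def __init__(self, slots):
        self.slots = slots
        self.next = 0

    def pick_two(self):
        i = self.next                # models next.fetchAdd(1, .monotonic)
        self.next += 1
        n = len(self.slots)
        # models slots[i % n].load and slots[(i + 1) % n].load
        return self.slots[i % n], self.slots[(i + 1) % n]

r = Registry(["a", "b", "c"])
assert r.pick_two() == ("a", "b")
assert r.pick_two() == ("b", "c")
assert r.pick_two() == ("c", "a")    # wraps via the +1 modulo
```

Three registered schedulers are the minimum that exercises both the wrap and the non-wrap arm, which is presumably why the scenario registers exactly three stubs.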
… RemoteCall (S10+S11+S3)
Three small scenarios in parking-lot-loom.zig:
testRegistryCrossIterPinPaths: registers a single scheduler in global_registry, calls pinTask + pinFsmTask with synthetic stub pointers (which won't resolve to any slab). Covers the slot-load branch of both walkers (S10).
testWaitGroupDoneSpinlock: WaitGroup.add(2) + done() x 2 -- the first done covers the prev != 1 release-and-return path (line 2755); the second covers the prev == 1 last-decrement path (lines 2760-2765). The lock.swap(1, .acquire) spin acquire (line 2749) + counter.fetchSub (line 2753) fire on each call (S11).
testRemoteCallCompletion: pushes a synthetic RemoteCall message with a small completion (no waiter parked) into a scheduler's channel, calls drainChannels. Drives the RemoteCall switch arm including the completion.finished.store(true, .release) at line 1097 (S3). The wg.done() at 1098 also re-exercises S11's spinlock paths.
Brings runtime/scheduler.zig to 76/163 (was 72/163, +4 sites).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
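The two done() arms in testWaitGroupDoneSpinlock can be condensed into a toy model (illustrative shape, not the real scheduler.zig WaitGroup):

```python
class WaitGroup:
    """Toy model of the two done() arms: prev != 1 releases and
    returns; prev == 1 runs the last-decrement wakeup path."""
    def __init__(self):
        self.counter = 0
        self.woken = False

    def add(self, n):
        self.counter += n

    def done(self):
        # models lock.swap(1, .acquire) then counter.fetchSub(1)
        prev = self.counter
        self.counter -= 1
        if prev != 1:
            return "release-and-return"  # work still outstanding
        self.woken = True                # last decrement: wake waiters
        return "last-decrement"

wg = WaitGroup()
wg.add(2)
assert wg.done() == "release-and-return"
assert wg.done() == "last-decrement"
assert wg.woken
```

add(2) followed by done() twice is the minimal sequence that hits both arms, matching the scenario's shape.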
…ial)
Adds pub fn scanLockWaitersPub to scheduler.zig (mirrors the existing scanFsmLockWaitersPub seam) so loom tests can drive the timeout-fire path without entering run(). Two scenarios in parking-lot-loom.zig:
testScanLockWaitersTimeoutFire: synthetic Task in lock_waiters with waiting_for_lock pointing at a sentinel and lock_wait_start_ms = 0. With sched.lock_timeout_ms = 1, scanLockWaiters fires the timeout branch and clears the wait fields + sets lock_timed_out + status=.Ready + enqueues the task. Covers lines 1907 (initial wait check), 1912 (start_ms load), 1914 (deadline compare), 1965-1970 (clear fields + set timed_out), 1971 (enqueue). Drops the inner WaiterList re-check (lines 1932-1944) by leaving waiting_for_lock_list = null; those sites need a real parking-lot WaiterList -- deferred.
testScanFsmLockWaitersTimeoutFire: same shape on the FSM scanner. Drives lines 1702 (wait check), 1706 (start_ms), 1707 (deadline compare), 1734-1738 (clear fields), 1740 (push to fsm_ready_queue).
Brings runtime/scheduler.zig to 87/163 (was 80/163, +7 sites). Remaining 0-hit sites in scanLockWaiters / scanFsmLockWaiters are the WaiterList re-check branches (1932-1944, 1957), which fire only when the wake side races us mid-scan -- they need real parking-lot WaiterList state to drive deterministically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without these tests, parking-lot-loom never references WaitGroup.{
registerFsmWaiter,wait} or Semaphore.{acquire,release}, so the linker
strips them and kcov reports MISSING for ~25 atomic sites that
production code exercises on every spawn-bounded async block.
Four direct tests:
testWaitGroupRegisterFsmWaiter — counter==0 fast-return,
counter>0 happy-path park, and the inner-recheck race where
count drops to 0 between the outer load and the lock acquire.
testWaitGroupWaitNonFiber — sched.current_task is null at
loom-test construction, so wait() takes the busy-wait branch
and returns immediately when counter==0.
testSemaphoreFastPath — count=2, two CAS-decrement acquires
+ two no-waiter releases (counter.fetchAdd branch).
testSemaphoreReleaseWithWaiter — waiting_task=stub triggers the
direct-grant branch in release(); active_scheduler is set to
the same scheduler so submitResume's same-scheduler routing
enqueues the stub without touching the SPSC machinery.
Brings runtime/scheduler.zig kcov coverage to 109/163 hit + 2 elided
(was 96/163 + 1). Nil-hit (function not linked) drops from 51 to 22.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
testIoSubmitFns drives all six io submission paths (read/write/accept/connect/recv/send) with stubbed IoWaiter + SimRing. Each submission stages an SQE and stores .Blocked into the task's status. Without this, the io submit fns are dead-stripped from parking-lot-loom (no caller) and L1811/1834/1842/1850/1858/1886 report MISSING even though every fiber's I/O hot path goes through them.
testSchedulerRegistryFns covers getLeastLoaded (L2147-2148), notifyAll (L2207, L2209), deinit reset paths (L2219-2224), and the count() live-slot scan (L2252, L2255). Two registered schedulers with biased active_tasks load give deterministic least-loaded selection.
Brings runtime/scheduler.zig kcov coverage to 125/163 hit + 2 elided (was 109/163 + 2). Nil-hit drops 22 -> 6. The remaining 6 nil sites are residual inlined functions or paths the loom binary never enters: drainChannels iteration + notifyDirty fast path (L845, L938, L944, L959), Scheduler.sleep() (L1650), spawnSchedulers start signal (L2685). The 30 remaining 0-hit sites all sit inside the run() main loop or require a real fiber stack (Semaphore slow-path park, lock-waiter race re-check).
Closing N1 -- further coverage gains require either running the full scheduler loop under SimRing or extracting many more public seams; the ROI gets steep from here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
testSleepTaskLinking calls sleepTask with a stub task and a wake_time in the far future. Covers L1650 status.store(.Blocked) -- the only remaining nil-hit site reachable from a direct call (no run() loop or fiber stack required).
Brings runtime/scheduler.zig kcov coverage to 126/163 hit + 2 elided (was 125/163 + 2). Nil-hit drops 6 -> 5. The remaining 5 nil sites are NOT test-fixable from the loom binary:
L845 (notifyDirty fetchOr), L938/L944 (drainChannels iter), L959 (RemoteCall completion store) -- inlined out at the loom binary's optimization level. The functions ARE reached by testCrossSchedulerResumeFlow / testRemoteCallCompletion (sim ops > 0, behavior verified), but kcov has no source-line entry for the inlined call sites.
L2685 -- inside a top-level Zig `test` block in scheduler.zig, not run by the parking-lot-loom executable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Loom and VOPR target orthogonal axes (atomic-op interleaving vs.
fault/clock/retry simulation). The loom_atomic_coverage.rb tool
revealed which atomic sites Loom never reached; this tool does the
same for VOPR's surface area.
src/tools/vopr_coverage.rb -- categorized scanner. Five auto-
detected categories (time / random / net_io / fs_io / ring_io)
via grep-style patterns, plus a sixth (retry) driven by source
markers. Output mirrors the loom report shape.
Two source conventions for retry sites:
`// VOPR-START-RETRY: <description>` ... `// VOPR-END-RETRY`
-- multi-line block. Standalone comment lines have no kcov
hit count, so the scanner attributes them to the first
instrumented line at-or-after the marker (the loop header).
`// VOPR-RETRY` on the same line as a single-statement retry
-- compact form for one-line spinlock acquires.
29 markers added across 6 files at the high-value retry loops:
- versioned.zig: 4 (MVCC update/updateFlow/updateMulti)
- atomic_ptr.zig: 2 (update/updateFlow CAS retries)
- scheduler.zig: 6 (WaitGroup/Semaphore primitives)
- data-structures.zig: 15 (sharded inner-lock spins)
- observable.zig: 1 (SpinLock CAS acquire)
- queues.zig: 1 (WaiterList spin)
parking-lot.zig retry loops are intentionally excluded -- Loom
covers their CAS interleaving structurally.
zig/build.zig -- `-Dcoverage-vopr` option + `coverage-vopr` step.
Mirrors `coverage-loom`. Wraps every `*-vopr-test.zig` entry under
kcov, merges to `zig-out/coverage-vopr/merged/kcov-merged/cobertura.xml`.
First report (4 vopr tests: vopr-test, fsm-vopr-test,
fsm-lock-vopr-test, versioned-vopr-test):
Time 6/32 ( 18.8%)
Random 0/4 ( 0.0%)
Network IO (raw) 0/1 ( 0.0%)
Filesystem IO (raw) 1/25 ( 4.0%)
io_uring (RingType seam) 1/10 ( 10.0%)
Retry markers 6/29 ( 20.7%)
TOTAL 14/101 ( 13.9%)
Major gaps: lib/atomic_ptr.zig, lib/data-structures.zig,
lib/observable.zig are FILE-NOT-LOADED -- no current VOPR test
imports them. The 6 retry markers that DO hit are all in scheduler/
versioned (which fsm-vopr / versioned-vopr already exercise).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
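The retry-marker attribution rule (charge a bare comment marker to the first instrumented line at-or-after it, since comment lines have no kcov hit count) might look like this in Python. The real scanner is Ruby and its internals may differ:

```python
def attribute_retry_markers(source_lines, instrumented):
    """For each VOPR retry marker, return the line the site is charged
    to: the marker's own line when kcov instrumented it (compact
    same-line form), else the first instrumented line after it
    (the loop header following a standalone comment marker).
    `instrumented` is the set of line numbers present in cobertura.xml."""
    sites = []
    for lineno, text in enumerate(source_lines, start=1):
        if "VOPR-START-RETRY" in text or "VOPR-RETRY" in text:
            if lineno in instrumented:
                target = lineno
            else:
                target = min((n for n in instrumented if n > lineno),
                             default=None)
            sites.append(target)
    return sites

src = [
    "// VOPR-START-RETRY: CAS acquire loop",   # bare comment, no hits
    "while (true) {",                          # loop header gets charged
    "    if (tryAcquire()) break;",
    "}",
    "lock = spin.swap(1); // VOPR-RETRY",      # compact same-line form
]
assert attribute_retry_markers(src, {2, 3, 5}) == [2, 5]
```

Attributing to the loop header rather than the comment keeps the report stable even when kcov's line set shifts between builds.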
Single-threaded deterministic property test for AtomicPtr(T). Mirrors the versioned-vopr-test shape: a seeded PRNG drives a random sequence of read / readHold / releaseHeld / update / reclaim ops, checking three invariants per step (post-update read, held-guard dereference still valid, limbo grew by 1 on update).
Goal: get lib/atomic_ptr.zig into the VOPR coverage tree. Before this test, the file was FILE-NOT-LOADED in cobertura -- no VOPR test imported it. After, the file is LOADED and the AtomicPtr update retry marker reaches FILE-LOADED state.
VOPR coverage delta: retry markers 6/29 -> 7/29; lib/atomic_ptr.zig gaps reclassified from FILE-NOT-LOADED to LINE-MISSING/0-hit (single-threaded VOPR can't fault-inject a CAS miss, so the inner retry body still doesn't execute -- the file is just instrumented now).
Skipped lib/observable.zig and lib/data-structures.zig: their retry loops are inner-lock spins that need true contention to fault-inject. Single-thread VOPR can never lose a CAS, so the loop body never executes. Those files need a SimAtomic-with-CAS-fault-injection mode to be useful -- a distinct work item.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
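The seeded-PRNG property-test shape generalizes well beyond AtomicPtr. A minimal Python rendition against a trivial counter model (the real test drives AtomicPtr ops and EBR limbo invariants; this only shows the skeleton):

```python
import random

def property_test(seed, steps=200):
    """A seeded PRNG drives a random op sequence against a model,
    checking an invariant after every step. Same seed -> same
    sequence -> reproducible failures."""
    rng = random.Random(seed)            # deterministic op source
    value, shadow = 0, 0                 # system under test + oracle
    for _ in range(steps):
        op = rng.choice(["incr", "decr", "read"])
        if op == "incr":
            value += 1; shadow += 1
        elif op == "decr":
            value -= 1; shadow -= 1
        assert value == shadow           # per-step invariant
    return value

# determinism: the same seed reproduces the exact same trajectory
assert property_test(42) == property_test(42)
assert property_test(42, steps=50) == property_test(42, steps=50)
```

When an invariant trips, the failing seed is the whole repro recipe, which is the property that makes single-threaded VOPR tests debuggable.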
Mirrors the SimAtomic / SimRing pattern: a comptime alias inside
lib/compat.zig picks up `SimClock` / `SimRandom` if the test root
exports them, falling through to the OS clock_gettime / getrandom
syscalls otherwise. Production builds (no SimClock/SimRandom on
root) inline the OS path -- zero overhead.
zig/runtime/vopr-clock.zig
SimClock with virtual_ns state, reset() / advanceMs() /
advanceNs() / milliTimestamp() / nanoTimestamp(). Single-thread
(matches the runtime's VOPR tests). Self-test verifies
advance/read symmetry.
zig/runtime/vopr-random.zig
SimRandom backed by std.Random.DefaultPrng. seed() / fill().
Self-test verifies seed determinism + cross-seed divergence.
zig/lib/compat.zig
Two `sim_*_decl = if (@hasDecl(root, "SimX")) root.SimX else void`
seams. milliTimestamp / nanoTimestamp / randomBytes consult the
seam at the top, fall through to OS.
All five existing VOPR test entries (vopr-test, fsm-vopr-test,
fsm-lock-vopr-test, versioned-vopr-test, atomic-ptr-vopr-test)
now `pub const SimClock = ...; pub const SimRandom = ...;` at
module root.
VOPR coverage delta:
Before V6+V7: After V6+V7:
Time 6/32 (18.8%) Time 6/34 (17.6%)
Random 0/4 ( 0.0%) Random 0/4 ( 0.0%)
Net IO (raw) 0/1 ( 0.0%) Net IO (raw) 0/1 ( 0.0%)
FS IO (raw) 1/25 ( 4.0%) FS IO (raw) 1/25 ( 4.0%)
Ring IO 1/10 (10.0%) Ring IO 1/10 (10.0%)
Retry 7/29 (24.1%) Retry 7/29 (24.1%)
TOTAL 15/101 (14.9%) TOTAL 15/103 (14.6%)
The two new Time sites (compat.zig:150, 157) are the SimClock
seam checks themselves; they correctly route to the simulator
under VOPR but kcov can't track the comptime `if (decl != void)`
branch resolution at source-line granularity.
Coverage doesn't move because the existing VOPR scenarios are
single-threaded MVCC / FSM / parking-lot dispatch tests that don't
drive timeouts, sleep, or randomness. The shim is plumbing for
FUTURE VOPR scenarios (timeout-driven retries, sleep wakeup,
randomized fault injection) -- now possible to write
deterministically. Pre-V6 those scenarios would have introduced
real-clock / OS-entropy non-determinism.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
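The SimClock surface described above is small enough to sketch in full. Python rendition with snake_case names; the real seam is Zig with camelCase methods (reset / advanceMs / advanceNs / milliTimestamp / nanoTimestamp):

```python
class SimClock:
    """Virtual clock: tests advance time explicitly, so timeout
    branches fire without any real waiting or flakiness."""
    def __init__(self):
        self.virtual_ns = 0

    def reset(self):
        self.virtual_ns = 0

    def advance_ms(self, ms):
        self.virtual_ns += ms * 1_000_000

    def advance_ns(self, ns):
        self.virtual_ns += ns

    def milli_timestamp(self):
        return self.virtual_ns // 1_000_000

    def nano_timestamp(self):
        return self.virtual_ns

clock = SimClock()
start = clock.milli_timestamp()              # stamp lock_wait_start_ms
clock.advance_ms(50)
assert clock.milli_timestamp() - start < 100   # inside deadline: no fire
clock.advance_ms(100)
assert clock.milli_timestamp() - start >= 100  # past deadline: fires
```

The two asserts mirror the inside-deadline / past-deadline pair that the later scheduler-timeout scenarios drive.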
…out (V8)
Three deterministic timeout scenarios that exercise the time-read sites in scheduler.zig that kcov was reporting as 0-hit:
scheduler.zig:1456 wakeExpiredSleepers: const now = milliTimestamp();
scheduler.zig:1910 scanLockWaiters: const now_ms = milliTimestamp();
Each scenario stamps `lock_wait_start_ms` / `wake_time` at a fixed offset from `compat.milliTimestamp()` (now+0 = inside-deadline, now-200 = past-deadline). No real-time waits, no flake. Covers both the inside-deadline (no fire) and past-deadline (fire) branches. Mirrors the parking-lot-loom S8 scenarios but bare-bones, without the loom harness -- a pure single-thread VOPR shape.
GAP-B caveat (worth flagging for the next round): the SimClock seam in lib/compat.zig is behind `@hasDecl(@import("root"), ...)`. Under `b.addTest`, root is Zig's auto-generated test_runner module, NOT this entry file -- the same issue parking-lot-loom hit pre-2026-05. Activating SimClock for VOPR tests requires promoting them to `b.addExecutable` so root resolves to our entry. Until then, these scenarios use offset-from-real-clock to drive timeouts deterministically.
VOPR coverage delta:
Time 6/32 (18.8%) -> 8/34 (23.5%)
Total 14/101 -> 17/103
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctivates (V10)
Promotes scheduler-timeout-vopr from b.addTest to b.addExecutable so
the comptime SimClock seam in lib/compat.zig actually activates. Same
GAP-B fix parking-lot-loom went through pre-2026-05.
Why this matters: under b.addTest, `@import("root")` resolves to Zig's
auto-generated test_runner module, NOT our entry file. The
`@hasDecl(root, "SimClock")` check in compat.zig's milliTimestamp /
nanoTimestamp wrappers therefore returns false, the seam silently
falls through to OS clock_gettime, and "VOPR-deterministic" timeouts
are actually real-time-dependent.
Layout changes:
zig/runtime/scheduler-timeout-vopr.zig (renamed from -test.zig)
Now exports `pub fn testX() !void` per scenario. No `test "..."`
blocks -- the executable wrapper drives them via main().
zig/scheduler-timeout-vopr-test.zig (rewritten)
Top-level executable wrapper. Re-exports SimClock + SimRandom at
module root, defines a tests array, main() iterates with pass/fail
accounting. Mirrors parking-lot-loom-test.zig structure.
zig/build.zig
New stv_exe (b.addExecutable) parallel to pl_loom_exe. Wired into
test-loom-vopr (no-coverage) and coverage-vopr (kcov-wrapped).
Removed the prior b.addTest entry from the test_files loop.
zig/runtime/scheduler-timeout-vopr.zig (impl rewrite)
Now uses SimClock.advanceMs() to drive timeouts deterministically:
* stamp lock_wait_start_ms with SimClock.milliTimestamp() (= 0)
* advanceMs(50) -- inside deadline, no fire
* advanceMs(100) -- past deadline, fires
First scenario is `testSimClockActive` -- a GAP-B regression gate
that fails if compat.milliTimestamp() doesn't track SimClock.
src/tools/vopr_coverage.rb
TEST_FILE_RE now also matches `*-vopr.zig` (impl side of
executable-style VOPR tests). Otherwise the test scenario file
counted as production runtime and inflated the site catalog.
Verification: GAP-B gate passes (SimClock IS active under
scheduler-timeout-vopr executable). All 4 scenarios green.
Coverage delta is small (Time 8/34 -> 9/34) because the prior
b.addTest version already hit the same scheduler.zig sites via real-
clock fallthrough -- the structural win is determinism, not raw
coverage. Future VOPR scenarios that rely on virtual-clock advance
(sleep wakeup ordering, multi-task timeout races, idle-arming
timeouts in the run loop) can now be written without flakiness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report ✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## vopr-loom-audit #26 +/- ##
==================================================
Coverage ? 88.64%
==================================================
Files ? 163
Lines ? 45270
Branches ? 11256
==================================================
Hits ? 40130
Misses ? 5140
Partials ? 0
| Branch | vopr-loom-audit-docs |
| Testbed | ubuntu-latest |
| Benchmark | leak-build-ms (x 1e3) | leak-count | leak-run-ms |
|---|---|---|---|
| benchmarks/concurrent/05_backpressure/bench | 4.33 | 0.00 | 1,960.30 |
| benchmarks/concurrent/10_shard_vs_locked/bench | 4.26 | 0.00 | 60,003.96 |
| benchmarks/concurrent/16_observables/bench | 4.27 | 0.00 | 99.62 |
| benchmarks/inter-clear/03_concurrent_mvcc_vs_rwlock/bench | 4.59 | 0.00 | 410.02 |
| benchmarks/sequential/07_pointer_chase/bench | 4.23 | 0.00 | 315.91 |
| benchmarks/sequential/12_weak_ref_graph/bench | 4.35 | 0.00 | 106.76 |
| benchmarks/server/03_pathological/server | 4.46 | 0.00 | 1,002.75 |
Audience: someone fluent in Ruby/Python/JS but new to concurrent testing. Explains why concurrent code has bugs sequential code can't have, what Loom and VOPR each protect against, and how to use the coverage scanners to find at-risk sites that no test exercises.

Sections:

1. What the tests are (one paragraph each).
2. What they protect against: a what / when / how matrix per system (Loom / Hammer / VOPR), with explicit notes that Loom does NOT catch deadlocks (Hammer + TSan does; lock ordering prevents) and VOPR DOES catch livelocks (deterministic retry sequencing makes the symmetry visible).
3. Clearest worked example of the bug class for each (a counter that loses updates for Loom; a timeout that doesn't fire under load for VOPR), with an explicit explanation of why sequential programming is immune.
4. Coverage scanners: src/tools/loom_atomic_coverage.rb and src/tools/vopr_coverage.rb. What each scans for, the LOOM-EXCLUDE / VOPR-EXCLUDE marker convention, and how to use the reports as a checklist.
5. Anatomy of a Loom test, walking through ownership-loom-test.zig: the SimAtomic activation seam, the binary-schedule enumeration, the GAP-B counter gate, and what to check when reading one.
6. Anatomy of a VOPR test, walking through scheduler-timeout-vopr-test.zig: the SimClock / SimRandom seams, the activation gate as first scenario, the b.addExecutable (NOT b.addTest) requirement, and what to check when reading one. Includes a "VOPR's blind spot" subsection: VOPR's coverage is bounded by the simulator's coverage; concrete unmodeled failures (disk hangs, mid-stream TCP RSTs, kernel pauses, NTP clock jumps) are invisible to VOPR. Mitigation posture: each production bug should produce either a new VOPR scenario or a new shim capability. Hammer and VOPR are complementary: Hammer has no model gap; VOPR has no determinism gap.
7. CLAUDE.md gates: Loom + VOPR + Hammer tests are merge requirements for any zig/ change that introduces atomics, I/O, or locks.

The coverage scanners make this mechanically checkable. All file references verified to resolve under this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
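The LOOM-EXCLUDE marker convention used by the scanners can be illustrated with a minimal Ruby sketch. This is a simplified stand-in, not the actual logic of src/tools/loom_atomic_coverage.rb (the real scanner also cross-references cobertura XML hit counts and handles the inline-elision heuristic):

```ruby
# Coarse illustration: find atomic-op call sites by regex, skipping any
# region bracketed by LOOM-EXCLUDE-BEGIN / LOOM-EXCLUDE-END markers.
ATOMIC_RE = /\.(?:fetchAdd|fetchSub|cmpxchgWeak|cmpxchgStrong|load|store)\(/

def atomic_sites(source)
  sites = []
  excluded = false
  source.each_line.with_index(1) do |line, lineno|
    if line.include?("LOOM-EXCLUDE-BEGIN")
      excluded = true
    elsif line.include?("LOOM-EXCLUDE-END")
      excluded = false
    elsif !excluded && line =~ ATOMIC_RE
      sites << lineno
    end
  end
  sites
end

src = <<~ZIG
  const a = counter.fetchAdd(1, .seq_cst);
  // LOOM-EXCLUDE-BEGIN: thread-only path, unreachable under loom
  const b = counter.fetchSub(1, .seq_cst);
  // LOOM-EXCLUDE-END
  const c = flag.load(.acquire);
ZIG
p atomic_sites(src)  # => [1, 5]
```

Sites inside the excluded region are dropped from the gap report, which is exactly how the by-design-unreachable no-scheduler paths in ParkingMutex/ParkingRwLock are kept from showing up as false coverage gaps.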
c277be6 to 72ef99c
Five targeted improvements based on cross-review against the Concurrent
Systems Programming Cheat Sheet:
1. **Cheat-sheet pointer at the top.** New callout block introduces
the cheat sheet as the home for terminology and the broader
concurrency-primitive landscape, so this doc can stay focused on
Loom + VOPR specifics. Three placeholder URLs (TODO_CHEAT_SHEET_URL)
marked for fill-in when the cheat sheet is published.
2. **New §1.5 "What exists in other languages."** Short subsections on
Rust, Go, C/C++, and Zig — what each provides for Hammer / Loom /
VOPR equivalents (e.g. tokio-rs/loom crate, tokio::time::pause(),
go test -race), what's NOT built-in, and what production cost the
testing seam imposes. Closes with a summary table comparing the
five environments. Lets a reader from Rust / Go / C++ understand
how CLEAR's setup relates to what they know.
3. **§2 jargon glossary.** Added a callout that points to the cheat
sheet for TSan / ASan / FFI / lock-contention / etc., plus inline
one-line definitions of the two terms NOT in the cheat sheet:
ABA-like race and half-published state.
4. **§2 OS-boundary explanation.** "any other non-determinism that
crosses an OS boundary" now explicitly lists what that means —
asking for the time, randomness, network packet ordering, disk
I/O completion timing.
5. **Activation-gate warning callouts.** Pulled the GAP-B (Loom) and
SimClock (VOPR) silent-failure traps into prominent ⚠ warning
blocks in §5 and §6 respectively. Each one explains the silent
failure mode, the mitigation (counter-advanced check / first-
scenario time-advance check), and the prescription ("every Loom
suite must have this gate"). These are the most important review
checks for someone reading a Loom or VOPR test.
Also added a one-paragraph callout explaining `comptime` for readers
unfamiliar with compile-time metaprogramming, framed as
"dependency injection that resolves at compile time."
Net: +186 / -11 lines. No reordering of existing sections; all
additions are top-loaded with cross-references.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
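The activation-gate idea in item 5 (fail loudly if the sim seam never engaged) can be sketched generically in Ruby; all names here are hypothetical, since CLEAR's real GAP-B gate lives in Zig:

```ruby
# Illustrative activation gate: if the sim-atomic op counter never
# advances during a run, the seam silently fell back to real atomics
# and the suite is exercising nothing. Fail loudly instead of passing
# vacuously -- this is the "silent failure mode" the warning callouts
# in the doc are about.
class SimAtomicCounter
  attr_reader :ops

  def initialize
    @ops = 0
  end

  def record_op
    @ops += 1
  end
end

def with_activation_gate(counter)
  before = counter.ops
  yield
  raise "loom seam inactive: no sim atomic ops recorded" if counter.ops == before
end

counter = SimAtomicCounter.new
with_activation_gate(counter) do
  3.times { counter.record_op }  # stands in for test code running under the sim
end
puts "seam active: #{counter.ops} sim ops"
```

The same shape works for the VOPR variant: the first scenario asserts that advancing SimClock actually changed observed time, so a silently inactive seam fails the suite immediately.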
The §1 VOPR section had three implementation paragraphs (comptime seam, production cost, "what is comptime"); the parallel Loom section had only "what it is" plus one "the trick" paragraph. The asymmetry made VOPR read like an implementation walkthrough rather than a high-level definition. This commit:

1. Trims VOPR §1 to mirror Loom's structure: one "what it is" paragraph plus one short "the trick" paragraph, with forward references to §6 (the activation mechanism) and §1.5's Zig subsection (comptime).
2. Moves the "What is comptime?" callout from §1 into §1.5's Zig subsection, where comptime is first explained as a feature.

The §6 anatomy section already covers the comptime seam and the b.addExecutable requirement in detail; deleting from §1 loses no information.

Net: -3 lines (16 added, 19 removed). The symmetric §1 sections now read "this is what they are" rather than "this is what they are and how they're built."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bbad923 to 6bb1e1e
Adds docs/correct-systems-programming-cheat-sheet.md (the broader context doc previously referenced as TODO_CHEAT_SHEET_URL) and replaces the three placeholders in docs/agents/loom-vopr-getting-started.md with relative links to the new file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Link 'how to get LLMs to write better systems-level code' to the LLM workflow section, and 'my reasoning for building CLEAR' to the "Why I'm building CLEAR" section.
- Fix 'how to get LLMs write' -> 'how to get LLMs to write'.
- Indent Helgrind / Memcheck / Massif as sub-bullets under Valgrind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… note Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>