
Pprof #22

Open

cuzzo wants to merge 7 commits into master from pprof

Conversation

@cuzzo
Owner

@cuzzo cuzzo commented May 8, 2026

No description provided.

cuzzo and others added 7 commits May 8, 2026 02:18
`clear profile` now writes heap.pb.gz, lock.pb.gz, and mvcc.pb.gz
into the .profile/ dir alongside the existing text reports. CPU is
delegated to `perf_to_profile` (the standard converter from Linux
perf.data to pprof format); we print a one-line install hint when
it's missing.

Pure-Ruby pprof v3 encoder (`src/tools/pprof.rb`, ~190 lines) so the
feature stays stdlib-only — no `google-protobuf` C-extension dep.

Sample-type columns mirror Go's conventions so pprof's standard
flags work out of the box:
  - heap:  alloc_objects / alloc_space / inuse_objects / inuse_space
  - lock:  contentions / delay / hold / acquisitions
  - mvcc:  reads / commits / retries / cow_bytes
            (cow_bytes = struct_size * (commits + retries))
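A stdlib-only pprof encoder boils down to two protobuf primitives: base-128 varints and length-delimited fields. A minimal sketch of those primitives (helper names are illustrative, not the actual `src/tools/pprof.rb` API):

```ruby
# Minimal sketch of the protobuf wire primitives a pure-Ruby pprof
# encoder needs. Hypothetical helpers, not the real encoder's API.

# Encode a non-negative integer as a protobuf base-128 varint:
# 7 bits per byte, high bit set on every byte except the last.
def varint(n)
  bytes = []
  loop do
    byte = n & 0x7f
    n >>= 7
    bytes << (n.zero? ? byte : (byte | 0x80))
    break if n.zero?
  end
  bytes.pack("C*")
end

# Field tag = (field_number << 3) | wire_type.
# Wire type 0 = varint; wire type 2 = length-delimited
# (strings and nested messages such as Sample or Location).
def varint_field(field_number, value)
  varint((field_number << 3) | 0) + varint(value)
end

def bytes_field(field_number, payload)
  varint((field_number << 3) | 2) + varint(payload.bytesize) + payload
end
```

Nested messages (Profile, Sample, Location) are then just `bytes_field` applied to the concatenation of their own encoded fields, which is why ~190 lines suffice.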

Each per-site address resolves via addr2line + the transpiler's
`// CLR:N` markers, so `pprof -list <fn>` shows CLEAR source lines.
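The marker-based resolution can be sketched as: take the transpiled-source line addr2line reports, then walk upward to the nearest `// CLR:N` comment. A hedged sketch (the helper name and inputs are invented; the real resolver lives in the doctor/pprof tooling):

```ruby
# Hypothetical sketch of mapping an addr2line line number back to a
# CLEAR source line via `// CLR:N` markers in the transpiled Zig.

# zig_lines: array of transpiled-source lines (0-indexed);
# zig_lineno: 1-based line number reported by addr2line.
def clear_line_for(zig_lines, zig_lineno)
  # Walk upward from the reported line to the nearest CLR marker.
  (zig_lineno - 1).downto(0) do |i|
    if (m = zig_lines[i].match(%r{//\s*CLR:(\d+)}))
      return m[1].to_i
    end
  end
  nil # no marker above this address: likely runtime code
end
```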

Verified end-to-end: `pprof -top -alloc_space heap.pb.gz` reads
function names correctly (clearMain / entryWrapper / makeResultSet
/ intToString) and `-tags` shows the per-sample addr labels. 20 new
specs cover the wire encoder, dedup, gzip envelope, and per-format
column math; 4096 unit specs total, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds tab-completion for bash, zsh, and fish. The CLI is the source
of truth for the completion script — regenerating `clear completions
bash` after a subcommand change refreshes everything.

Per-subcommand semantics:
  - build / run / fmt / fix / profile / explain  -> *.cht files
  - test / benchmark                              -> *.cht or dirs
  - doctor                                        -> *.profile/ dirs
  - completions                                   -> bash | zsh | fish

Install (bash):
  echo 'source <(clear completions bash)' >> ~/.bashrc

Install (zsh):
  mkdir -p ~/.zsh/completions
  clear completions zsh > ~/.zsh/completions/_clear
  # ensure fpath=(~/.zsh/completions $fpath) BEFORE compinit

Install (fish):
  clear completions fish > ~/.config/fish/completions/clear.fish

17 specs cover script structure, dispatch, and CLI integration. Live-
verified on this machine: tab after `clear doctor examples/litedb/`
suggests only the .profile/ directory; tab after `clear profile
examples/litedb/` suggests the .cht file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`alloc-profile.zig` now captures up to 16 stack frames per allocation
via `std.debug.captureCurrentStackTrace` (FP-based unwinder; profile
builds keep frame pointers via omit_frame_pointer=false). Hash table
keyed by the full trace, so distinct call paths to the same helper
(intToString, makeResultSet, etc.) resolve as separate sites instead
of collapsing into one.

Format change (alloc-profile v1 -> v2): the first column of alloc.txt
goes from a single hex addr to a comma-separated leaf-first trace.
Both consumers updated to parse it; v1 single-addr files still parse
as 1-element traces.

  alloc.txt  before:  0x401234           1000  40000  500  20000  500
  alloc.txt  after:   0x401234,0x402000  1000  40000  500  20000  500
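The backward-compatible parse described above can be sketched in a few lines: splitting the first column on commas turns a v1 single address into a 1-element trace for free. (Struct and helper names are illustrative; only the first three columns are modeled here.)

```ruby
# Sketch of parsing the alloc.txt address column across versions.
# v2: comma-separated leaf-first trace; v1: single hex addr, which
# parses as a 1-element trace with no special casing.

AllocSite = Struct.new(:trace, :count, :bytes)

def parse_alloc_line(line)
  cols = line.split
  trace = cols[0].split(",")   # "0x401234" -> ["0x401234"]
  AllocSite.new(trace, cols[1].to_i, cols[2].to_i)
end
```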

Stack budget: trace buffer is `[16]usize` = 128 B on the recorder's
stack, well within the 60 KB Large stacks `clear profile` already
forces. captureCurrentStackTrace itself is a frame-pointer walk, no
additional frames pushed. SpinLock prevents recorder reentrance, so
the existing no-allocation contract holds.

Sampling: `clear profile --sample=N` records every Nth alloc and
scales captured values by N at record time so pprof / doctor see
estimated totals. Default N=1 (no sampling). Header records
`sample_n` so consumers can flag the approximation. Useful for hot
allocators where the per-alloc unwind cost matters; otherwise leave
off for accurate counts.
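The record-time scaling can be sketched as follows: record every Nth event and multiply the recorded values by N, so downstream consumers see unbiased estimates of the true totals. (This is an illustrative Ruby model of the bookkeeping, not the Zig recorder.)

```ruby
# Sketch of --sample=N bookkeeping: record every Nth allocation and
# scale at record time so totals are estimates of the true totals.

class SampledRecorder
  attr_reader :count, :bytes

  def initialize(stride)
    @stride = stride
    @seen = 0
    @count = 0
    @bytes = 0
  end

  def alloc(size)
    @seen += 1
    return unless (@seen % @stride).zero?
    # Each recorded sample stands in for `stride` allocations.
    @count += @stride
    @bytes += size * @stride
  end
end
```

With the default stride of 1, every allocation is recorded and the totals are exact.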

Consumer updates:
- src/tools/pprof_converter.rb -- emits multi-frame Sample.location_id;
  shares Locations across samples that include the same frame (a hot
  leaf like entryWrapper appears in every trace and is interned once).
- src/tools/doctor.rb -- splits the addrs column into a trace; leaf
  addr remains the "primary site" so existing top-N output is
  unchanged but now backed by trace data for future caller-aware
  views.
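The Location-sharing point above is a classic interning table: each distinct frame address gets one id, and every sample's `location_id` list reuses it. A hedged sketch (class and method names are invented, not the converter's actual interface):

```ruby
# Sketch of Location interning: frames shared across samples (a hot
# leaf like entryWrapper) get one Location id, reused everywhere.

class LocationTable
  def initialize
    @ids = {} # addr -> location id (pprof ids are 1-based)
  end

  def id_for(addr)
    @ids[addr] ||= @ids.size + 1
  end

  def ids_for_trace(addrs)
    addrs.map { |a| id_for(a) }
  end

  def size
    @ids.size
  end
end
```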

End-to-end verified on examples/litedb: pprof tree shows real caller/
callee relationships (`entryWrapper -> clearMain -> intToString`)
where previously cum% always equaled flat%. 4095 unit specs / 0
failures; 2 new specs cover multi-frame parsing and Location reuse.

lock-profile and mvcc-profile remain single-frame for now; same
pattern applies and will follow as separate commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds optional per-record stack capture to lock-profile (v3) and
mvcc-profile (v2). Off by default; opt in via:

  clear profile foo.cht --sync-callstacks

When on, each (lock, caller-trace) pair becomes a distinct row, so
pprof's tree/flame views show per-caller attribution. Doctor
aggregates rows back to one-per-lock for its existing diagnoses
(no behavior change there). Same shape for (cell, caller-trace) in
mvcc-profile.

The flag is off by default because the FP walk costs ~100-500ns per
record, while uncontended mutex acquire is ~10-20ns and MVCC commit
fast paths are ~20-50ns -- the trace can dominate the operation it
measures. When --sync-callstacks is set without an explicit --sample,
we auto-default to --sample=100 to keep the cost manageable. Users
can pass --sample=1 to opt in to full capture at full cost.

FP-unwind through parking-lot verified end-to-end:
  parking_mutex_lock -> clearMain -> entryWrapper
  parking_mutex_lock -> clearMain.__DoBranchCtx0_0.run -> ...
addr2line resolves all captured frames cleanly. Frame pointers are
already retained in profile builds (omit_frame_pointer = false), and
captureCurrentStackTrace's FP walker handles parking-lot's hot/slow
paths without symbol noise.

New shared module zig/runtime/profile-trace.zig:
  - MAX_FRAMES (single source of truth, 16)
  - captureFromHere() -- thin wrapper around
    std.debug.captureCurrentStackTrace
  - syncCallstacksEnabled() -- reads CLEAR_PROFILE_SYNC_CALLSTACKS once
  - sampleStride() -- reads CLEAR_PROFILE_SAMPLE once

alloc-profile uses the shared MAX_FRAMES + captureFromHere; its
sample-stride read is unchanged so it doesn't move under us.

Wire format:
  - lock-profile v3: 12th tab-separated column `caller_trace`. `-` =
    empty (sync-callstacks off); comma-separated leaf-first addrs
    when populated.
  - mvcc-profile v2: 8th column same shape.
  Tab-separated lines so the trace field can carry commas without a
  column-count ambiguity. Older whitespace-only files still parse via
  fallback split.

Consumers (doctor.rb, pprof_converter.rb) updated to:
  - tab-split the line; fall back to whitespace-split if too few cols
  - parse `-` as empty trace
  - aggregate (addr, trace) rows by addr in doctor for the per-lock /
    per-cell view
  - emit multi-frame Sample.location_id in pprof_converter, with the
    lock/cell pointer as the Sample's leaf and caller frames stacked
    below
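The two consumer-side parsing rules (tab-split with whitespace fallback, `-` as empty trace) can be sketched as:

```ruby
# Sketch of the parsing rules above. Column counts and helper names
# are illustrative, not doctor.rb / pprof_converter.rb internals.

def split_profile_line(line, min_cols)
  cols = line.chomp.split("\t")
  cols = line.split if cols.size < min_cols # legacy whitespace format
  cols
end

def parse_trace(field)
  return [] if field.nil? || field == "-"
  field.split(",")
end
```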

4 new specs cover the v3/v2 wire format, empty-trace handling, multi-
frame stack emission, and per-row sample emission. 4099 unit specs
total / 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new doctor flags that exploit the multi-frame trace data
shipped in the prior commits, plus a Mapping in our pb.gz output so
pprof stops complaining about the missing binary name.

  clear doctor foo.profile/ --cumulative
    Rank functions by cumulative bytes/allocs across every frame in
    each trace. A function high in the call stack accrues its
    callees' costs ("intToString shows up on N% of allocation paths"),
    where the existing flat view only credits the leaf alloc site.

  clear doctor foo.profile/ --focus=REGEX
    Keep only sites whose trace touches a function matching the
    pattern. Composes with the existing top-N display.

  clear doctor new.profile/ --diff old.profile/
    Per-function delta on alloc_space, lock contention, and MVCC
    retries. Annotates with directional arrows and brief diagnoses:
    "newly contended", "retries eliminated", "new retry storm",
    cold->hot site lists. Deltas are computed in-process rather than
    shelling out to `pprof -base`, so doctor stays self-contained and
    can layer commentary on top of the math.
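The `--cumulative` accrual rule can be sketched as: every function appearing anywhere in a sample's trace accrues that sample's bytes, deduplicated within a trace so recursive frames aren't double-counted. (Hypothetical helper; the input shape is assumed.)

```ruby
# Sketch of cumulative ranking over multi-frame traces.
# samples: [{ trace: [fn_name, ...], bytes: Integer }, ...]
def cumulative_bytes(samples)
  totals = Hash.new(0)
  samples.each do |s|
    # uniq: a function counts once per trace, even if recursive.
    s[:trace].uniq.each { |fn| totals[fn] += s[:bytes] }
  end
  totals.sort_by { |_, b| -b }
end
```

A caller present on every path (like entryWrapper) thus accumulates the sum of all its callees' costs, which is exactly what the flat leaf-only view cannot show.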

Mapping in pb.gz (src/tools/pprof.rb):
  - New `add_mapping(binary:, build_id:)` registers the binary the
    profile is about. pprof picks up the filename for its header
    ("File: litedb" replaces "Main binary filename not available").
  - Locations now carry `mapping_id = primary_mapping` so symbol
    metadata (has_functions / has_filenames / has_line_numbers)
    propagates without per-Location plumbing.
  - All three converters (heap / lock / mvcc) call add_mapping when
    a binary is available; converter changes are 1 line each.

Source-line view (`pprof -list <fn>`) already worked because
Function.filename was set to source.cht; verified end-to-end.

Diff implementation notes:
  - parse_alloc_for_diff keys by leaf-function name when a binary is
    available, raw hex addr otherwise (so unit tests can run without
    a real binary, and the diff still groups distinct sites).
  - Uses the after-dir's binary as the canonical addr2line target
    so before/after addresses resolve through the same symbol table.
    ASLR / rebuild can shift addresses but the function name stays
    stable for the same source.
  - Lock diff sums (lock,trace) rows by addr first, then deltas; same
    for mvcc by cell pointer.
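The per-function delta step reduces to: sum each side by a stable key (function name when a binary is available, raw addr otherwise), then subtract. A minimal sketch (helper name and input shape are invented):

```ruby
# Sketch of the per-key delta: after - before, dropping unchanged
# keys, largest regressions first. Keys are function names or addrs.
def diff_by_function(before, after)
  (before.keys | after.keys)
    .map { |k| [k, after.fetch(k, 0) - before.fetch(k, 0)] }
    .reject { |_, delta| delta.zero? }
    .sort_by { |_, delta| -delta }
end
```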

10 new specs (6 doctor-diff scenarios, 2 pprof Mapping cases, plus
1 cumulative/focus integration each via existing CLI tests). 4108
unit specs total, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three changes building on the prior round of doctor + pprof work:

1. **Runtime-function misattribution fix.** addr2line returns paths
   like `/.../runtime/runtime.zig:652` for runtime frames, but our
   resolver walked back through `transpiled.zig` looking for `// CLR:`
   markers regardless of what file the address came from. As a result
   `pprof -list entryWrapper` rendered the runtime function against
   random source.cht line numbers. Resolver now only assigns
   `clear_line` when addr2line's file is the user's transpiled CLEAR
   build target (basename matches `._clear_tmp_*.zig`); runtime and
   stdlib frames keep their actual zig file path so `pprof -list`
   shows the right source. Discriminator metadata
   ("(discriminator 4)" appended to the file:line) handled in the
   line-extract regex too.

2. **`clear doctor --peek=REGEX`.** Aggregates samples whose trace
   touches the regex; reports self-bytes plus the callers (frames
   above) and callees (frames below, when matched non-leaf). Mirrors
   `pprof -peek`. ~80 lines using the trace data we already capture.

3. **`clear doctor --ignore=REGEX`.** Inverse of `--focus`. Drops
   samples whose trace touches a matching function. Composes with
   `--focus`: ignore wins on overlap so a focused-but-also-ignored
   site is excluded. ~5 lines.
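The two parsing rules from item 1 (strip addr2line's discriminator suffix, gate `clear_line` on the transpiled-target basename) can be sketched as:

```ruby
# Sketch of the resolver rules. Patterns mirror the commit text;
# helper names are invented, not the resolver's real interface.

# "file.zig:652" or "file.zig:652 (discriminator 4)" -> [file, line]
def parse_addr2line(file_line)
  m = file_line.match(/\A(.+):(\d+)(?:\s+\(discriminator \d+\))?\z/)
  return nil unless m
  [m[1], m[2].to_i]
end

# Only the user's transpiled CLEAR build target gets a clear_line;
# runtime/stdlib frames keep their actual zig path.
def user_frame?(path)
  File.basename(path).match?(/\A\._clear_tmp_.*\.zig\z/)
end
```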

Also: `clear_source_path` is now `is_user_zig`-aware, so user functions
point at `source.cht` while runtime/stdlib functions point at their
actual `.zig` file. pprof's web UI and `-list` show the right source
for each.

10 new specs (5 peek scenarios, 3 ignore scenarios, 2 misattribution
unit checks). 4118 unit specs total / 0 failures. Verified end-to-end:
`pprof -list entryWrapper` now shows runtime.zig:651 ("// 4. EXECUTE
USER CODE"), `pprof -list clearMain` shows source.cht:51.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two final pprof additions:

1. **`clear doctor --by=COL`**. Sorts/displays the heap section by
   one of `bytes` (default), `allocs`, `inuse_bytes`, or
   `inuse_allocs`. Mirrors pprof's `-sample_index=` switching for the
   doctor view. Reveals "this site has tons of small allocs" vs
   "few big ones" without re-running. Inuse columns are derived from
   alloc - free at parse time, so the column is available even when
   the runtime didn't record `live_bytes` directly.

2. **channels.txt → channels.pb.gz**. Completes the pb.gz emission
   set. One sample per registered channel; sample types: pushes /
   pops / push_blocked / pop_blocked / max_depth. Synthetic function
   names like `channel#7` give pprof a stable label per channel; the
   capacity travels as a per-sample tag (visible via `pprof -tags`).
   Skipped (returns nil) when channels.txt has no rows.
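The derived-column idea from item 1 can be sketched as: compute inuse at parse time as alloc minus free, then let `--by` pivot on any column name. (Struct fields and helper names are illustrative.)

```ruby
# Sketch of deriving inuse columns at parse time so --by can pivot
# even when the runtime didn't record live counts directly.

Site = Struct.new(:name, :allocs, :bytes, :frees, :freed_bytes) do
  def inuse_allocs
    allocs - frees
  end

  def inuse_bytes
    bytes - freed_bytes
  end
end

# col: :bytes, :allocs, :inuse_bytes, or :inuse_allocs
def top_by(sites, col)
  sites.sort_by { |s| -s.public_send(col) }
end
```

Pivoting the same parsed data by `:allocs` vs `:bytes` is what surfaces "tons of small allocs" vs "few big ones" without re-running.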

Verified end-to-end on transpile-tests/241_open_stream_pipelines,
which exercises `BG STREAM` and produces real channel telemetry —
`pprof -top -sample_index=pushes channels.pb.gz` ranks the channels
by push count.

6 new specs (3 channels conversion, 3 --by pivoting). 4124 unit
specs total / 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented May 8, 2026

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 88.76081% with 78 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.85%. Comparing base (07b3f81) to head (9c27336).
⚠️ Report is 31 commits behind head on master.

Files with missing lines        Patch %   Lines
src/tools/doctor.rb              87.74%   43 Missing ⚠️
src/tools/pprof_converter.rb     82.90%   33 Missing ⚠️
src/tools/pprof.rb               98.47%    2 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #22      +/-   ##
==========================================
+ Coverage   89.60%   89.85%   +0.24%     
==========================================
  Files         170      185      +15     
  Lines       45996    47991    +1995     
  Branches    11290    11953     +663     
==========================================
+ Hits        41216    43123    +1907     
- Misses       4780     4868      +88     
Flag Coverage Δ
ruby 86.06% <88.76%> (+0.68%) ⬆️
zig 95.62% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown.


@github-actions

github-actions Bot commented May 8, 2026

🐰 Bencher Report

Branch: pprof
Testbed: ubuntu-latest

⚠️ WARNING: No Threshold found!

Without a Threshold, no Alerts will ever be generated.

To only post results if a Threshold exists, set the --ci-only-thresholds flag.

Benchmark                                                    leak-build-ms (x 1e3)   leak-count   leak-run-ms
benchmarks/concurrent/02_concurrent_search/bench                              1.81         0.00          6.74
benchmarks/concurrent/07_stream_merge/bench                                   1.60         0.00         34.10
benchmarks/concurrent/12_false_sharing/bench                                  1.74         0.00      1,083.89
benchmarks/concurrent/19_atomic_ptr/bench                                     1.78         0.00        269.81
benchmarks/inter-clear/05_concurrent_mvcc_pure_read/bench                     1.67         0.00        496.73
benchmarks/sequential/04_hashmap/bench                                        1.56         0.00      2,031.81
benchmarks/sequential/09_frame_vs_heap/bench                                  1.63         0.00      1,717.45
benchmarks/sequential/14_iterator/bench                                       1.58         0.00        372.94

(All measures above carry the ⚠️ NO THRESHOLD warning.)

