OTLP export / Devfiler integration#97
Conversation
Export profiles in OpenTelemetry Profiles v1development format via gRPC, enabling storage in Pyroscope and visualization in Grafana. - Vendor OTLP proto files (v1.10.0) and compile with protox (pure-Rust, no protoc binary needed) + tonic-build for gRPC client generation - New otlp.rs: converts collapse-format stacks to OTLP ProfilesDictionary structure (string/function/location/stack/attribute tables) with proper deduplication and synthetic addresses per function - New OtlpSink in output.rs: lazy gRPC connection, supports both streaming (write_batch) and batch (finish) modes via block_in_place - CLI flags: --otlp-endpoint, --otlp-insecure, --otlp-service-name - Feature-gated behind 'otlp' (included in default features) - Expands aggregated sample counts into individual timestamps as required by devfiler (which derives event count from timestamps_unix_nano.len) - Uses 'go' frame type for pre-symbolized frames so devfiler reads function names from the proto rather than attempting binary symbolization - Includes mapping with synthetic build ID so all frames group under one executable in devfiler - 8 unit tests covering conversion, dedup, kernel frames, off-CPU, and serialization roundtrip
Switch OTLP export from synthetic addresses to real instruction pointers: - Populate StackFrameInfo.address with real IPs in format_stack_trace() by zipping raw addresses back into symbolized frames after blazesym - New framecounts_to_otlp_request() builds OTLP profiles with: - Real Location.address values (instruction pointers from eBPF unwinder) - Real Mapping entries parsed from /proc/<pid>/maps - ELF build IDs from .note.gnu.build-id for each mapped binary - Function names in loc.lines as fallback for receivers without symbols - New OtlpNativeSink consumes Vec<FrameCount> from collect_raw() in batch mode, sending structured data with real addresses - Serve mode uses OtlpSink via MultiplexSink (collapse strings, streaming) - TUI mode: pass OtlpSink into spawn_profiling_thread, sends each refresh cycle alongside the web feed -- OTLP now works with --tui, --serve, and batch modes uniformly via --otlp-endpoint - Uses 'go' frame type so devfiler reads function names from proto (it unconditionally skips loc.lines for 'native' frames)
New --flush-interval <ms> flag enables continuous profiling mode: runs indefinitely (or until --time expires), uploading a batch of samples every N milliseconds to configured sinks. Example: sudo probee --otlp-endpoint 127.0.0.1:11000 --flush-interval 10000 --skip-idle This provides a lightweight daemon-like mode for continuous profiling without the overhead of --serve's HTTP server. Works with all output sinks including OTLP, file-based formats, and can be combined with --time for bounded continuous runs.
New workspace crate that acts as a bridge between profile-bee and devfiler for native symbol resolution: - profile-bee POSTs binaries to /upload during profiling - symbol-server computes htlhash FileId (SHA-256 of head+tail+length), extracts symbols from ELF .symtab/.dynsym with Rust/C++ demangling, generates zstd-compressed symbfile (devfiler's binary format) - devfiler polls with --symb-endpoint and gets metadata.json + ranges Modules: - fileid.rs: htlhash FileId computation (compatible with elastic/devfiler) - extract.rs: ELF symbol extraction with demangling - symbfile.rs: devfiler symbfile writer (magic + ULEB128 + protobuf RangeV1) - store.rs: on-disk symbol cache - server.rs: axum HTTP server (upload + devfiler-compatible GET endpoints)
When --symbol-server <url> is set, profile-bee spawns a background task that discovers ELF binaries from /proc/<pid>/maps and POSTs them to the symbol-server. The server extracts symbols and serves them to devfiler. - New symbol_upload.rs module: async uploader with deduplication, ELF magic verification, and automatic /proc/pid/maps discovery - Adds reqwest dependency for HTTP POST - Non-blocking: uploads happen in background, don't affect profiling - Deduplicates: each binary path is uploaded at most once per session
Switch framecounts_to_otlp_request() from 'go' frame type back to 'native' so devfiler triggers its auto-symbolization pipeline. Replace GNU Build ID with htlhash (SHA-256 of head+tail+length) which is the FileId format devfiler uses to look up symbols from the server. This ensures the build ID in OTLP profiles matches what symbol-server computes, allowing devfiler to fetch and resolve symbols correctly. The collapse-string path (serve/TUI/flush-interval) still uses 'go' frame type since it doesn't have real addresses for symbolization.
- Fix axum route syntax: use ':param' not '{param}' for path parameters
(was causing 404 for all devfiler symbol fetch requests)
- Fix pid=0 on symbolized frames: extract real pid from FrameCount's
root frame so /proc/<pid>/maps can be read for mapping lookups
- Add request logging to symbol server for debugging
- Add frame address stats to OTLP native path for visibility
These fixes complete the devfiler symbol resolution pipeline:
profile-bee sends htlhash build IDs → devfiler fetches from symbol
server → symbols are resolved for native frame addresses.
The first RangeV1 record must use setElfVA (tag 12, absolute address) to establish the baseline. Previous code used flat Optional fields which prost didn't encode correctly for the oneof semantics. Switch to proper #[prost(oneof)] enum so prost generates the correct wire format: tag 12 (uint64) for the first record, tag 1 (sint64 zigzag) for delta-encoded subsequent records.
Create profile-bee-symbols as a shared library crate used by both profile-bee and symbol-server: - fileid.rs: htlhash FileId computation (SHA-256 of head+tail+length), compatible with devfiler/Elastic format - extract.rs: ELF symbol extraction from .symtab/.dynsym with Rust/C++ demangling, DWARF enrichment stub for future work Deduplicates htlhash computation that was previously inlined in profile-bee's otlp.rs and duplicated in symbol-server. Both crates now depend on profile-bee-symbols for consistent behavior. This enables future optimizations where profile-bee can optionally pre-extract symbols (since it already runs blazesym) and POST them in symbfile format directly, reducing symbol-server's workload.
When --symbol-server is set without --pid, scan /proc/*/maps for every running process and upload all discovered ELF binaries. This ensures devfiler can resolve symbols for any process that appears in profiles. The upload loop deduplicates by path, so shared libraries (libc, etc.) are only uploaded once even when mapped by hundreds of processes.
devfiler expects Location.address to be in ELF VA space (relative to the ELF file), not runtime VA space. The OTel eBPF profiler normalizes in BPF code (pc - bias), but profile-bee was sending raw runtime VAs. Now computes: elf_va = runtime_addr - mapping.memory_start + file_offset This ensures the address devfiler stores in FrameId.addr_or_line matches the ELF VA ranges in the symbfile interval tree, enabling symbol lookup.
Axum's default body limit (2MB) was rejecting binaries like amazon-cloudwatch-agent (211MB) and libprotobuf (3.3MB).
- New docs/otlp_export.md: full usage guide covering Pyroscope, devfiler, symbol-server setup, CLI flags, architecture diagram, and ecosystem comparison (symbolization approaches across devfiler/Pyroscope/Parca/ OTel Collector/Grafana Alloy) - Update README: add OTLP and symbol-server to features list, add quick-start examples for Pyroscope and devfiler workflows - Fix clippy: remove unnecessary u64 casts, simplify boolean expression - Run cargo fmt across all crates
New --symbol-server-listen <port> flag starts an in-process symbol
server, combining profiler and symbol serving in a single binary.
Useful for debugging and local development without running a separate
daemon.
When --symbol-server-listen is set, the symbol uploader automatically
targets the embedded server (no need to also pass --symbol-server).
Usage:
sudo probee --otlp-endpoint 127.0.0.1:11000 \
--symbol-server-listen 8888 --flush-interval 10000
Then point devfiler at: ./devfiler --symb-endpoint http://localhost:8888
Feature-gated behind 'symbol-server' (included in default features).
Build without: cargo build --no-default-features --features tui,otlp
…ility
Simplify the OTLP export decision to one clear rule:
Native mode (real ELF VAs) is used when:
1. A symbol server is configured (--symbol-server or --symbol-server-listen)
2. AND the collection path supports collect_raw() (batch or --flush-interval)
Pre-symbolized mode (function names) is used in all other cases:
- --serve / --tui (consume collapse strings, can't use collect_raw)
- Any mode without a symbol server (Pyroscope, generic OTLP receivers)
Previously the mode was chosen based on confusing nested conditionals
that could result in native frames without a symbol server (unusable
by devfiler) or pre-symbolized frames when a symbol server was available.
Also clean up debug eprintln statements, remove dead code from old
batch mode, and fix leftover brace mismatch.
…back - Add TODO in compute_htlhash() documenting the optimization opportunity for pre-extracting symbols when blazesym is already loaded - Update NEXT_STEPS.md with full OTLP status and future enhancements including DWARF enrichment, Parca/debuginfod compat, and GNU Build ID fallback for the pre-symbolized path - Note: htlhash (SHA-256) is already only computed in the native path (devfiler + symbol server). The pre-symbolized path (Pyroscope) uses a cheap FNV hash, so no performance concern there.
- MultiplexSink now logs errors but continues running other sinks. Previously an OTLP transport error would propagate and kill the entire profiling session, even when local file sinks were working. - OtlpSink and OtlpNativeSink reset the gRPC client on failure so they reconnect on the next send attempt. This handles transient outages where the OTLP endpoint goes down and comes back. - Profiling continues uninterrupted when the OTLP server is temporarily unavailable — samples are dropped but local output and the profiling loop keep running.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 21 minutes and 15 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (7)
📝 WalkthroughWalkthroughThis PR introduces OTLP Profiles export functionality and a standalone symbol server for the profile-bee profiling tool. It adds two new Rust crates ( Changes
Sequence Diagram(s)sequenceDiagram
participant Client as profile-bee Client
participant Uploader as SymbolUploader
participant SymServer as Symbol Server
participant Store as Disk Store
Client->>Uploader: spawn(server_url)
Uploader->>Uploader: upload_loop()
Client->>Uploader: upload_for_pid(pid)
Uploader->>Uploader: scan /proc/pid/maps
Uploader->>Uploader: read ELF binary
Uploader->>SymServer: POST /upload?filename=... (ELF bytes)
SymServer->>SymServer: extract_symbols()
SymServer->>SymServer: write_symbfile()
SymServer->>Store: write metadata.json + ranges
SymServer-->>Uploader: 200 OK {file_id, symbol_count}
Uploader->>Uploader: cache in HashSet
sequenceDiagram
participant Profiler as profile-bee
participant Collector as Collection Loop
participant OTLP as OTLP Exporter
participant Endpoint as OTLP Endpoint
Profiler->>Collector: collect_raw() or collect()
Collector->>Collector: gather FrameCount samples
alt native_mode (symbol_server configured)
Collector->>OTLP: framecounts_to_otlp_request(frames)
OTLP->>OTLP: parse /proc/pid/maps
OTLP->>OTLP: compute htlhash build IDs
OTLP->>OTLP: deduplicate locations by (elf_va, mapping)
else pre_symbolized_mode
Collector->>OTLP: collapse_to_otlp_request(stacks)
OTLP->>OTLP: build function/location dictionaries
end
OTLP->>OTLP: populate timestamps, sample counts
OTLP->>Endpoint: gRPC Export(ExportProfilesServiceRequest)
Endpoint-->>OTLP: ExportProfilesServiceResponse
sequenceDiagram
participant Client as HTTP Client
participant Server as Symbol Server
participant Extract as extract_symbols()
participant Writer as symbfile::write_symbfile()
participant Store as SymbolStore
Client->>Server: GET /:a/:b/:id/ranges
Server->>Server: parse FileId from path
alt FileId not found
Server-->>Client: 404 Not Found
else FileId exists
Server->>Store: get_ranges(file_id)
Store-->>Server: compressed symbfile bytes
Server-->>Client: 200 OK (application/octet-stream)
end
Client->>Server: POST /upload?filename=binary
Server->>Server: compute FileId from body
alt already cached
Server-->>Client: 409 already_exists
else new binary
Server->>Extract: extract_symbols(elf_path)
Extract-->>Server: Vec<SymbolRange>
Server->>Writer: write_symbfile(ranges)
Writer-->>Server: zstd-compressed bytes
Server->>Store: store_symbfile(file_id, data)
Server-->>Client: 200 OK {file_id, symbol_count}
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
profile-bee/Cargo.toml (1)
10-18:⚠️ Potential issue | 🟠 MajorAdd
proto/**to the package include list to enablecargo installwith default features.Line 64 enables
otlpin default builds, andbuild.rscompiles files underprofile-bee/protousing the protox compiler when the feature is active. The packageincludeallowlist omitsproto/**, causing publishedcargo install profile-beebuilds to fail.📦 Proposed package include fix
include = [ "build.rs", "Cargo.toml", "../README.md", "src/**/*.rs", "bin/**/*.rs", + "proto/**", "assets/**", "ebpf-bin/profile-bee.bpf.o", ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee/Cargo.toml` around lines 10 - 18, The package include list in Cargo.toml omits the proto/** files required when the otlp feature is enabled by default and build.rs compiles proto files; update the include array in Cargo.toml to add "proto/**" so the proto files are packaged for cargo install, ensuring build.rs can find and compile files under proto when the otlp/default feature is used.
🧹 Nitpick comments (4)
profile-bee-symbols/src/extract.rs (1)
34-36: Whole-file read into memory.
std::fs::read(path)buffers the entire binary into RAM. For large stripped binaries (hundreds of MB / multi-GB), this can be a significant allocation. Considermemmap2(a commonobjectpairing) for zero-copy parsing if you expect large inputs, or document the limitation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee-symbols/src/extract.rs` around lines 34 - 36, The extract_symbols function currently calls std::fs::read(path) which loads the entire file into memory; change it to use a memory-mapped file (e.g., open the file with std::fs::File and create a memmap2::Mmap or memmap2::MmapOptions mapping) and pass the mapped bytes to object::File::parse to avoid large heap allocations, or if you choose not to map, add documentation on the memory limitation; update references in extract_symbols to use the mapped byte slice instead of the Vec<u8>.profile-bee/src/trace_handler.rs (1)
516-526: Consider usingaddrs.len()invariant or assertion.
zipsilently truncates on length mismatch. blazesym returns oneSymbolizedper input address so they should match, but ifsymbolize_user_stackerrors and falls back to empty viaunwrap_or_default(), the downstream V8 override still indexesuser_syms[i]withi < user_syms.len(), which is safe. A debug_assert tying the invariant would guard against future refactoring.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee/src/trace_handler.rs` around lines 516 - 526, The loop assumes user_syms and addrs have equal length but zip would silently truncate on mismatch; add a debug assertion using the lengths returned from symbolize_user_stack (referencing user_stack, addrs, user_syms, and symbolize_user_stack) before the for loop to assert user_syms.len() == addrs.len() (or at least that user_syms.len() <= addrs.len() as appropriate), so future refactors will catch violations during debug builds while keeping release behavior unchanged; keep the rest of the loop that sets frame.address unchanged.profile-bee/bin/profile-bee.rs (1)
737-746: Usetokio::time::sleepinstead ofstd::thread::sleepinside#[tokio::main].
mainisasync, sostd::thread::sleep(Duration::from_millis(100))parks the current Tokio worker for 100ms and prevents the just-spawned embedded server's tasks from making progress on this thread. Prefertokio::time::sleep(...).await. Better still, havespawn_embedded_serverreturn a "ready" signal (oneshot /Notify) to avoid the race entirely.♻️ Proposed change
#[cfg(feature = "symbol-server")] if let Some(port) = opt.symbol_server_listen { profile_bee::embedded_symbol_server::spawn_embedded_server( port, tokio::runtime::Handle::current(), ); // Give the server a moment to bind - std::thread::sleep(std::time::Duration::from_millis(100)); + tokio::time::sleep(std::time::Duration::from_millis(100)).await; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee/bin/profile-bee.rs` around lines 737 - 746, Replace the blocking std::thread::sleep in the async main with an async-friendly wait: call tokio::time::sleep(Duration::from_millis(100)).await instead of std::thread::sleep(...) after calling profile_bee::embedded_symbol_server::spawn_embedded_server so the Tokio reactor isn't blocked; ideally modify spawn_embedded_server in the embedded_symbol_server module to return a readiness signal (oneshot channel or tokio::sync::Notify) and await that ready signal in main to eliminate the race entirely.profile-bee/src/otlp.rs (1)
564-602: Replace manual/proc/<pid>/mapsparsing withprocfscrate inread_proc_maps().The
procfscrate (already in dependencies and used indwarf_unwind.rs) provides a standardized API for reading process memory maps. The manualsplit_whitespace()parser inotlp.rsduplicates this logic and introduces unnecessary maintenance burden. Refactorread_proc_maps()to useprocfs::process::Process::maps()and filter results byMMPermissions::EXECUTE, consistent with the existing pattern indwarf_unwind.rsat lines 576–611.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee/src/otlp.rs` around lines 564 - 602, read_proc_maps currently parses /proc/<pid>/maps by hand; replace it to use procfs::process::Process::new(pid)?.maps() (as used in dwarf_unwind.rs) and map procfs::process::MemoryMap entries into ProcMapEntry: use map.address.0 and .1 for start/end, map.offset for offset, and map.pathname.map_or_else(|| "".to_string(), |p| p.to_string()) for pathname; filter maps by map.perms.contains(procfs::process::MMPermissions::EXECUTE) and skip empty or bracketed pathnames, and on any Procfs error return an empty Vec with a tracing::debug message similar to the existing behavior. Ensure you reference the read_proc_maps function and construct ProcMapEntry { start, end, offset, pathname } accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@profile-bee-symbols/src/extract.rs`:
- Around line 104-122: The DWARF helpers extract_from_dwarf and
enrich_with_dwarf are stubs which cause the "extracted {} symbols from DWARF"
log to always show 0 and make stripped uploads fail upstream; update the code to
either (A) make extract_symbols return a clear Err variant when DWARF extraction
is not implemented (so callers can surface a specific "stripped binary / DWARF
not implemented" error) or (B) change the log in the empty-symtab branch to
explicitly state "stripped binary: DWARF extraction not implemented — no symbols
extracted" instead of reporting 0, and ensure extract_from_dwarf /
enrich_with_dwarf remain no-ops but documented; locate and modify the functions
extract_from_dwarf, enrich_with_dwarf and the extract_symbols call site to
implement the chosen behavior.
- Around line 125-138: The demangling function `demangle` currently calls
`rustc_demangle::try_demangle` first which mis-parses C++ Itanium names (e.g.
`_ZN...E...`) as Rust; update `demangle` to attempt C++ demangling via
`cpp_demangle::Symbol::new` and
`sym.demangle(&cpp_demangle::DemangleOptions::default())` before calling
`rustc_demangle::try_demangle`, or alternatively add a stricter guard around
`rustc_demangle::try_demangle` (e.g. only accept names starting with Rust v0
`_R`) so that C++ Itanium symbols fall through to the `cpp_demangle` branch and
return correct results from the `demangle` function.
In `@profile-bee/bin/profile-bee.rs`:
- Around line 879-921: The OTLP block reads the field opt.tui which is only
compiled when the "tui" feature is enabled, causing a compile error when "otlp"
is enabled without "tui"; fix by introducing a cfg-bridged local boolean (e.g.,
is_tui) with #[cfg(feature = "tui")] let is_tui = opt.tui; and #[cfg(not(feature
= "tui"))] let is_tui = false; then use that local (replace uses of opt.tui in
the otlp block where can_use_native and the else-if check are computed) so
otlp_native_sink, can_use_native, and use_native logic no longer directly access
the tui-gated field.
In `@profile-bee/Cargo.toml`:
- Line 24: The Cargo.toml dependency entry for profile-bee-symbols is currently
path-only which breaks published installs; update the profile-bee dependency
declaration in Cargo.toml to include a crates.io version alongside the path
(e.g., profile-bee-symbols = { version = "x.y.z", path =
"../profile-bee-symbols" }) matching the published crate version used by
profile-bee-common/profile-bee-tui so the package can be resolved when installed
from crates.io.
In `@profile-bee/src/embedded_symbol_server.rs`:
- Around line 50-63: The server is currently bound to all interfaces ("0.0.0.0")
which exposes uploads publicly; change the default bind address to loopback by
replacing the addr construction used in spawn_embedded_server (where addr is set
to "0.0.0.0:{port}" and passed to tokio::net::TcpListener::bind) so it uses
"127.0.0.1:{port}" instead, and if remote access is required add an explicit
configuration option or parameter (e.g., a bind_addr argument) so callers can
opt into non-loopback binding.
- Around line 111-153: The temp file name is deterministic via
file_id.format_hex() and the handler does blocking I/O and CPU work inline;
change creation of tmp_path to use a per-request unique tempfile (e.g., append a
UUID or use tempfile::NamedTempFile) instead of file_id.format_hex(), and move
all blocking operations (writing the temp file, calling
extract::extract_symbols(&tmp_path), removing the file, and calling
write_symbfile(&symbols)) into a tokio::task::spawn_blocking closure so the
async handler only awaits the spawned task; ensure you still remove the temp
file on both success and error and propagate errors back into the same
StatusCode responses as before, referencing tmp_path, extract::extract_symbols,
and write_symbfile in your changes.
In `@profile-bee/src/otlp.rs`:
- Around line 326-332: The code currently parses count_str into an i64 and later
uses .max(1) so zero/negative counts become a single sample; instead, after
parsing (the variable names: count_str -> count) explicitly reject any parse
error (including overflow into i64) by logging and continuing, and then skip any
count <= 0 with a warning (do not coerce to 1). Apply the same check where
FrameCount.count is handled (remove or avoid any .max(1> coercion) for
collapse-path logic) and replicate this validation in the other collapse-path
sites you flagged (the blocks around the other ranges), so parsed count must be
a positive i64 before being used.
- Around line 839-917: The dedup bug arises because frames with addr==0 or no
mapping all become (mapping_index=0,address=0) so real_loc_map reuses the wrong
Location; fix by extending the RealLocationKey (used for real_loc_map) to
include the fallback identity when mapping_idx==0 or elf_va==0 — compute a
name_strindex (intern the func_name via strings.intern) and a filename_strindex
(frame.source.as_deref().map(|s| strings.intern(s)).unwrap_or(0)) and include
those values in the key creation before checking real_loc_map; keep using
find_or_create_mapping, symbol_name/func_name, FuncKey/func_set and
Location/Line as before, but ensure the same augmented key is used for insertion
into real_loc_map so unmapped/zero-address frames are deduped only when name and
source also match.
In `@profile-bee/src/output.rs`:
- Around line 640-646: write_batch currently calls send_stacks with the entire
accumulated snapshot and finish does the same, causing duplicate cumulative
samples to be re-exported; modify the collapse-format sink so write_batch only
sends the delta since the last flush (compute difference between the last sent
snapshot and the current stacks) or change write_batch to no-op and have finish
call send_stacks once for the final snapshot. Update the logic around
send_stacks, write_batch, finish and any tracking state (e.g., store the
last_sent_stacks buffer) to ensure streaming mode exports deltas per flush or a
single export on finish.
- Around line 66-84: MultiplexSink currently swallows all errors in write_batch
and finish which prevents callers like Session::run_streaming (which expects
sink.write_batch(...)?, sink.finish(...)?) from observing required failures;
change MultiplexSink so it only logs and ignores errors for designated
best-effort sinks (e.g., an enum or flag per sink indicating best_effort) and
returns Err if any non-best-effort sink fails (propagate the first error or
aggregate and return a combined error); update the loops in write_batch and
finish to check each sink's metadata (MultiplexSink.sinks entries), log
best-effort failures with tracing::warn!, and return Err(...) for required sink
failures so callers can fail the run when necessary.
- Around line 567-574: The OTLP network calls currently block without timeouts;
wrap both client.connect(...) and client.export(...) calls in
tokio::time::timeout(...) using a configurable deadline (add a config field or a
DEFAULT_TIMEOUT constant) so they fail fast; specifically update
OtlpSink::send_stacks() to wrap the runtime.block_on(async { client.export(...)
}) future in timeout and convert timeout::Elapsed into an appropriate anyhow
error, update OtlpSink::ensure_connected() to wrap client.connect(...)
similarly, and apply the same changes to OtlpNativeSink::send_framecounts() and
OtlpNativeSink::ensure_connected() so both connect() and export() paths return
timely errors instead of blocking indefinitely.
In `@profile-bee/src/symbol_upload.rs`:
- Around line 94-99: The reqwest client used in upload_loop has no timeouts and
a single stalled POST will block the entire single-threaded receiver; rebuild
the client in upload_loop using reqwest::Client::builder() to set a reasonable
per-request timeout (timeout(Duration::from_secs(...))) and optionally a
connect_timeout(Duration::from_secs(...))), replacing Client::new() with the
builder so each POST will fail fast; apply the same change to the other
uploader/client construction referenced (the similar client creation around the
143-172 region) so all uploads have timeouts.
- Around line 176-184: The urlencoding function currently percent-encodes
Unicode codepoints (using c as u32) which produces invalid percent sequences for
non-ASCII chars; change urlencoding to operate on the UTF-8 byte sequence
instead (e.g., iterate s.as_bytes() and emit ASCII-safe bytes unmodified while
percent-encoding any byte outside the unreserved set as %02X) or replace its
implementation with a standard percent-encoding helper (percent-encoding crate)
so filenames like "café.so" are encoded as %C3%A9 and never emit multi-byte
codepoint values; update the function named urlencoding accordingly and ensure
callers that pass OsStr/coerced strings provide a UTF-8 byte view.
In `@README.md`:
- Around line 80-82: Update the README example to clarify Devfiler symbol-server
wiring: explain that starting a standalone symbol-server and passing
--symbol-server to probee is one option, but Devfiler also requires passing
Devfiler's --symb-endpoint (pointing at the running symbol-server URL) to stream
symbols; alternatively mention the embedded mode by using Devfiler's
--symbol-server-listen flag so no external server is needed. Reference the flags
--symbol-server, --symb-endpoint and --symbol-server-listen and adjust the two
example blocks (the one that starts symbol-server and the tested Devfiler flow)
so users know to either (1) run symbol-server and set --symb-endpoint to
http://localhost:8888 when using --symbol-server, or (2) run Devfiler with
--symbol-server-listen to avoid unsymbolized native frames.
In `@symbol-server/src/main.rs`:
- Around line 29-31: Change the CLI default for the server bind address from
0.0.0.0 to 127.0.0.1 so the POST /upload endpoint is not exposed to the network
by default: update the #[arg(... default_value = "0.0.0.0")] on the bind field
(the --bind argument in main.rs) to use "127.0.0.1", and update any user-facing
docs or README that mention the old default to reflect the new loopback default
and note that operators can opt into network-facing behavior by passing --bind
explicitly.
In `@symbol-server/src/server.rs`:
- Around line 89-142: The code currently writes to a deterministic tmp_path
(format!("symbol-server-{}.elf", file_id.format_hex())) and performs heavy sync
work on the Axum worker; change this to create a per-request unique temp file
(e.g., use tempfile::NamedTempFile or include a UUID/pid/timestamp) for writing
the body, then move the extraction/generation/storage sequence into a blocking
task via tokio::task::spawn_blocking: perform std::fs::write to the unique temp,
call extract::extract_symbols(&tmp_path), call
symbfile::write_symbfile(&symbols), then call store.store_symbfile(file_id,
&ranges_data, &filename, num_symbols) all inside the spawn_blocking closure,
ensuring the temp file is removed on both success and error and propagate errors
back to the async handler.
In `@symbol-server/src/store.rs`:
- Around line 57-67: The current write sequence writes ranges then metadata.json
which can leave a partial entry if metadata write fails; update the write logic
that writes to dir.join("ranges") and dir.join("metadata.json") to first write
each to a temp file (e.g., ranges.tmp, metadata.json.tmp) and then atomically
rename (std::fs::rename) the temp files into place so both files appear
atomically; additionally, modify rebuild_index() to only consider an entry valid
if both dir.join("ranges") and dir.join("metadata.json") exist (i.e., check
presence of both before indexing) and keep using id.format_es() for the indexed
identifier.
---
Outside diff comments:
In `@profile-bee/Cargo.toml`:
- Around line 10-18: The package include list in Cargo.toml omits the proto/**
files required when the otlp feature is enabled by default and build.rs compiles
proto files; update the include array in Cargo.toml to add "proto/**" so the
proto files are packaged for cargo install, ensuring build.rs can find and
compile files under proto when the otlp/default feature is used.
---
Nitpick comments:
In `@profile-bee-symbols/src/extract.rs`:
- Around line 34-36: The extract_symbols function currently calls
std::fs::read(path) which loads the entire file into memory; change it to use a
memory-mapped file (e.g., open the file with std::fs::File and create a
memmap2::Mmap or memmap2::MmapOptions mapping) and pass the mapped bytes to
object::File::parse to avoid large heap allocations, or if you choose not to
map, add documentation on the memory limitation; update references in
extract_symbols to use the mapped byte slice instead of the Vec<u8>.
In `@profile-bee/bin/profile-bee.rs`:
- Around line 737-746: Replace the blocking std::thread::sleep in the async main
with an async-friendly wait: call
tokio::time::sleep(Duration::from_millis(100)).await instead of
std::thread::sleep(...) after calling
profile_bee::embedded_symbol_server::spawn_embedded_server so the Tokio reactor
isn't blocked; ideally modify spawn_embedded_server in the
embedded_symbol_server module to return a readiness signal (oneshot channel or
tokio::sync::Notify) and await that ready signal in main to eliminate the race
entirely.
In `@profile-bee/src/otlp.rs`:
- Around line 564-602: read_proc_maps currently parses /proc/<pid>/maps by hand;
replace it to use procfs::process::Process::new(pid)?.maps() (as used in
dwarf_unwind.rs) and map procfs::process::MemoryMap entries into ProcMapEntry:
use map.address.0 and .1 for start/end, map.offset for offset, and
map.pathname.map_or_else(|| "".to_string(), |p| p.to_string()) for pathname;
filter maps by map.perms.contains(procfs::process::MMPermissions::EXECUTE) and
skip empty or bracketed pathnames, and on any Procfs error return an empty Vec
with a tracing::debug message similar to the existing behavior. Ensure you
reference the read_proc_maps function and construct ProcMapEntry { start, end,
offset, pathname } accordingly.
In `@profile-bee/src/trace_handler.rs`:
- Around line 516-526: The loop assumes user_syms and addrs have equal length
but zip would silently truncate on mismatch; add a debug assertion using the
lengths returned from symbolize_user_stack (referencing user_stack, addrs,
user_syms, and symbolize_user_stack) before the for loop to assert
user_syms.len() == addrs.len() (or at least that user_syms.len() <= addrs.len()
as appropriate), so future refactors will catch violations during debug builds
while keeping release behavior unchanged; keep the rest of the loop that sets
frame.address unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4ae80dcb-4d43-40a3-97f3-0ee7cbf2341d
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (29)
.gitignoreCargo.tomlREADME.mddocs/NEXT_STEPS.mddocs/otlp_export.mdprofile-bee-symbols/Cargo.tomlprofile-bee-symbols/src/extract.rsprofile-bee-symbols/src/fileid.rsprofile-bee-symbols/src/lib.rsprofile-bee/Cargo.tomlprofile-bee/bin/profile-bee.rsprofile-bee/build.rsprofile-bee/proto/opentelemetry/proto/collector/profiles/v1development/profiles_service.protoprofile-bee/proto/opentelemetry/proto/common/v1/common.protoprofile-bee/proto/opentelemetry/proto/profiles/v1development/profiles.protoprofile-bee/proto/opentelemetry/proto/resource/v1/resource.protoprofile-bee/src/embedded_symbol_server.rsprofile-bee/src/lib.rsprofile-bee/src/otlp.rsprofile-bee/src/output.rsprofile-bee/src/symbol_upload.rsprofile-bee/src/trace_handler.rssymbol-server/Cargo.tomlsymbol-server/src/extract.rssymbol-server/src/fileid.rssymbol-server/src/main.rssymbol-server/src/server.rssymbol-server/src/store.rssymbol-server/src/symbfile.rs
extract.rs: - Fix demangle order: try C++ Itanium before Rust to prevent mis-parsing - Clear DWARF stub messaging: log 'stripped binary: DWARF not implemented' instead of misleading '0 symbols from DWARF' profile-bee.rs: - cfg-gate opt.tui behind #[cfg(feature = "tui")] so otlp compiles without tui feature enabled Cargo.toml: - Add version to profile-bee-symbols dependency for crates.io compat - Add proto/** to include list for cargo install embedded_symbol_server.rs: - Bind to 127.0.0.1 instead of 0.0.0.0 (security) - Use per-request unique temp file names (include PID) - Move blocking I/O into tokio::task::spawn_blocking symbol_upload.rs: - Add connect_timeout(10s) and request timeout(60s) to reqwest client - Fix urlencoding: operate on UTF-8 bytes not Unicode codepoints otlp.rs: - Reject count <= 0 with warning instead of coercing to 1 - Fix RealLocationKey dedup: include name/source for unmapped frames so addr=0 frames don't all collide into one Location output.rs: - Add 10s timeout to all OTLP connect() and export() calls via tokio::time::timeout to prevent indefinite blocking symbol-server: - Bind default to 127.0.0.1 - spawn_blocking for upload handler - Per-request unique temp files - Atomic writes (temp + rename) for ranges and metadata.json - rebuild_index requires both files present trace_handler.rs: - debug_assert on user_syms.len() <= addrs.len() for zip safety README: - Clarify devfiler wiring: show embedded mode and --symb-endpoint
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
symbol-server/src/server.rs (1)
89-119:⚠️ Potential issue | 🟠 MajorTemp-file race is still reachable —
process::id()is constant per process.The previous review flagged the deterministic
tmp_pathrace and recommended a UUID /tempfile::NamedTempFile. The update only addedstd::process::id()to the filename, but every request inside the same server process shares the same pid, so two concurrent uploads of the samefile_idstill collide onsymbol-server-<hex>-<pid>.elf:
- Req A:
fs::write(&tmp_path, &bodyA)→ beginsextract_symbols(&tmp_path)- Req B:
fs::write(&tmp_path, &bodyB)(overwrite) → A now parses B's bytes, or B'sremove_filepulls the file out from under A.Because
has_symbols(&file_id)is checked non-atomically (line 73) and nothing serializes per-FileIdwork, concurrent clients uploading the same ELF will race here. Use a per-request unique name (nonce/UUID/atomic counter) ortempfile::NamedTempFileso the path cannot alias.🛡️ Proposed fix (atomic counter — no new dep)
+use std::sync::atomic::{AtomicU64, Ordering}; +static UPLOAD_SEQ: AtomicU64 = AtomicU64::new(0); @@ - let tmp_dir = std::env::temp_dir(); - let tmp_path = tmp_dir.join(format!( - "symbol-server-{}-{}.elf", - file_id.format_hex(), - std::process::id() - )); + let tmp_dir = std::env::temp_dir(); + let seq = UPLOAD_SEQ.fetch_add(1, Ordering::Relaxed); + let tmp_path = tmp_dir.join(format!( + "symbol-server-{}-{}-{}.elf", + file_id.format_hex(), + std::process::id(), + seq + ));Or drop the temp file entirely if
extract::extract_symbolscan be adapted to accept&[u8]/ aCursor—bodyis alreadyBytesin memory, so writing up to 512 MB to disk just to read it back is avoidable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@symbol-server/src/server.rs` around lines 89 - 119, The temp-file name generation for the spawn_blocking closure still collides across concurrent requests because std::process::id() is constant; replace the deterministic tmp_path logic in the closure (where you create tmp_path and call std::fs::write, extract::extract_symbols(&tmp_path), and std::fs::remove_file) with a per-request unique temp file using tempfile::NamedTempFile (or generate a UUID/atomic nonce) so each request gets a unique path; write the body into that NamedTempFile, pass its path to extract::extract_symbols, and drop/close the NamedTempFile (no manual remove_file needed) to eliminate the race.
🧹 Nitpick comments (2)
symbol-server/src/server.rs (1)
73-85: Optional: per-FileIdsingle-flight around extraction.
has_symbols(&file_id)+ laterstore_symbfile(...)is a TOCTOU window; two near-simultaneous uploads of the same binary will both run the full (expensive) extract + symbfile pipeline before one of them wins the store. Not a correctness issue (result is idempotent), but wastes CPU and 512 MB of temp I/O per duplicate request. AMutex<HashSet<FileId>>ordashmapof in-flight IDs — with a "wait/skip if in-flight" branch — would collapse duplicates.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@symbol-server/src/server.rs` around lines 73 - 85, The current has_symbols(&file_id) check creates a TOCTOU race causing duplicate concurrent extractions; add a per-FileId single-flight registry (e.g., a shared Mutex<HashSet<FileId>> or a dashmap) that you consult at the start of the upload handler: if file_id is already in-flight, either wait for completion or return an "in_progress" response; otherwise insert file_id into the registry, proceed with extraction and call store_symbfile(...), and ensure removal from the registry in a finally/RAII guard so the ID is removed on success or error. Use the existing FileId type and the upload handler (where has_symbols and store_symbfile are used) to locate where to check/insert/remove the in-flight marker.profile-bee/bin/profile-bee.rs (1)
737-773: Symbol uploader / embedded server coordination is race-prone.The embedded server is spawned and then
std::thread::sleep(Duration::from_millis(100))is used to wait for it to bind (line 745). Immediately after,SymbolUploader::spawnat line 761 will start POSTing uploads tohttp://127.0.0.1:<port>. If the 100ms bind window isn't enough on a loaded system, early uploads will fail with connection refused and need to rely on the uploader's internal retries.Consider making
spawn_embedded_serverreturn atokio::sync::oneshot::Receiver<()>(or similar) signaled afterTcpListener::bindsucceeds, andawaitit before starting the uploader. This removes the magic sleep and converts a silent failure mode into deterministic sequencing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@profile-bee/bin/profile-bee.rs` around lines 737 - 773, Replace the blind sleep coordination between spawn_embedded_server and the SymbolUploader with an explicit readiness signal: change profile_bee::embedded_symbol_server::spawn_embedded_server to return a tokio::sync::oneshot::Receiver<()> (or similar ready-future) that is fulfilled once TcpListener::bind succeeds, then await that receiver (or poll it) before constructing the SymbolUploader via profile_bee::symbol_upload::SymbolUploader::spawn; ensure the current opt.symbol_server_listen/ effective_symbol_server logic still chooses the embedded URL and only starts the uploader after the readiness signal is received so uploads won't attempt connections before the embedded server has bound.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@profile-bee/bin/profile-bee.rs`:
- Around line 1984-1986: run_combined_mode currently drops OTLP export silently;
update it to mirror run_tui_mode's OTLP wiring by building an OtlpSink (and
include it in a MultiplexSink if needed) and pass that sink into
spawn_profiling_thread, or at minimum detect when --otlp-endpoint is set and log
a warning before returning. Locate run_combined_mode and add the same OtlpSink
construction logic used in run_tui_mode (the OtlpSink/MultiplexSink creation
and configuration) and ensure the resulting sink is forwarded to
spawn_profiling_thread (or emit a clear warning mentioning the --otlp-endpoint
flag if you choose not to construct the sink).
---
Duplicate comments:
In `@symbol-server/src/server.rs`:
- Around line 89-119: The temp-file name generation for the spawn_blocking
closure still collides across concurrent requests because std::process::id() is
constant; replace the deterministic tmp_path logic in the closure (where you
create tmp_path and call std::fs::write, extract::extract_symbols(&tmp_path),
and std::fs::remove_file) with a per-request unique temp file using
tempfile::NamedTempFile (or generate a UUID/atomic nonce) so each request gets a
unique path; write the body into that NamedTempFile, pass its path to
extract::extract_symbols, and drop/close the NamedTempFile (no manual
remove_file needed) to eliminate the race.
---
Nitpick comments:
In `@profile-bee/bin/profile-bee.rs`:
- Around line 737-773: Replace the blind sleep coordination between
spawn_embedded_server and the SymbolUploader with an explicit readiness signal:
change profile_bee::embedded_symbol_server::spawn_embedded_server to return a
tokio::sync::oneshot::Receiver<()> (or similar ready-future) that is fulfilled
once TcpListener::bind succeeds, then await that receiver (or poll it) before
constructing the SymbolUploader via
profile_bee::symbol_upload::SymbolUploader::spawn; ensure the current
opt.symbol_server_listen/ effective_symbol_server logic still chooses the
embedded URL and only starts the uploader after the readiness signal is received
so uploads won't attempt connections before the embedded server has bound.
In `@symbol-server/src/server.rs`:
- Around line 73-85: The current has_symbols(&file_id) check creates a TOCTOU
race causing duplicate concurrent extractions; add a per-FileId single-flight
registry (e.g., a shared Mutex<HashSet<FileId>> or a dashmap) that you consult
at the start of the upload handler: if file_id is already in-flight, either wait
for completion or return an "in_progress" response; otherwise insert file_id
into the registry, proceed with extraction and call store_symbfile(...), and
ensure removal from the registry in a finally/RAII guard so the ID is removed on
success or error. Use the existing FileId type and the upload handler (where
has_symbols and store_symbfile are used) to locate where to check/insert/remove
the in-flight marker.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 920ae1f3-d804-48c5-8c23-2ecdfe11fff7
📒 Files selected for processing (12)
README.mdprofile-bee-symbols/src/extract.rsprofile-bee/Cargo.tomlprofile-bee/bin/profile-bee.rsprofile-bee/src/embedded_symbol_server.rsprofile-bee/src/otlp.rsprofile-bee/src/output.rsprofile-bee/src/symbol_upload.rsprofile-bee/src/trace_handler.rssymbol-server/src/main.rssymbol-server/src/server.rssymbol-server/src/store.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- profile-bee/Cargo.toml
- README.md
- profile-bee-symbols/src/extract.rs
- symbol-server/src/store.rs
- profile-bee/src/symbol_upload.rs
- profile-bee/src/output.rs
run_combined_mode: - Wire OTLP sink into combined TUI+serve mode (was silently dropping OTLP export by passing None to spawn_profiling_thread) Temp file collision: - Use atomic counter (REQUEST_COUNTER) for unique temp file names in both symbol-server and embedded_symbol_server, eliminating races across concurrent upload requests Embedded server readiness: - Replace blind 100ms sleep with oneshot channel: spawn_embedded_server now returns a Receiver<()> that fires after TcpListener::bind succeeds - Main awaits the readiness signal before starting the symbol uploader, eliminating the connection race entirely
All workspace crates now use consistent v0.3.12 versioning. Ran cargo clippy --fix and cargo fmt across all packages.
To setup Devfiler, follow instructions in https://github.com/elastic/devfiler
I've run Devfiler on my mac with profile-bee on a linux server.
I then setup both a listening port forwarding reverse port forwarding so profile-bee can send OTLP traces to Devfiler, and for Devfiler to contact the symbol server (which can be run together with profile-bee or a separate process).
Devfiler is ran with
(alternatively as 2 processes)
Profile-bee is ran with
This addresses #93. cc @AntoxaBarin
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores