[NOT READY] feat(profiling): add --show-from filter to profiling aggregate #470

Closed
long2mao1 wants to merge 1 commit into DataDog:main from long2mao1:feat/profiling-aggregate-show-from

@long2mao1 long2mao1 commented May 6, 2026

Summary

The Datadog Continuous Profiler UI exposes a show_from(<function>) flame-graph filter that zooms the displayed graph to subtrees rooted at a chosen function. The filter is applied client-side — the API still returns the full aggregated profile — so reproducing it from the CLI required hand-rolled jq, which is awkward for AI agents and humans alike. This adds an equivalent native flag.

Depends on #460: pup profiling aggregate doesn't return a usable response without that recursion-limit fix, so this branch is stacked on top of it. The first commit on this PR is the fix; the second is the feature.

Behavior

pup profiling aggregate --show-from=<function_name> does, in order:

  1. Resolves the function name to string IDs (exact match against the data.strings table).
  2. Walks the response's frame table to find every frame_id whose function field (frameSchema[4]) refers to one of those string IDs. A single logical function commonly maps to many frame IDs after Go inlining / generics.
  3. Walks flameGraph and collects the topmost subtree rooted at any matching frame on each call path (does not descend into nested re-entries — matches UI semantics).
  4. Merges the collected subtrees into a single root by recursively collapsing children that share the same function name. Without this, the same logical function would appear as N siblings at depth 0 (one per inlined frame ID), which doesn't match what the UI shows.
  5. Prunes frames and strings to only the entries the trimmed subtree references (with consistent ID remapping), and drops UI aggregation fields (metadata, summaryTable, endpointCounts, etc.) that describe the unfiltered profile rather than the filtered subtree.
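
Steps 1 to 3 can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: it assumes named structs instead of the positional tuples of the real response, and `Node`, `frame_functions`, `matching_frame_ids`, and `collect_topmost` are hypothetical names chosen for the example.

```rust
use std::collections::HashSet;

/// Simplified stand-in for one flame-graph node (assumption: the real
/// response encodes nodes as positional tuples described by nodeSchema).
#[derive(Debug, Clone, PartialEq)]
struct Node {
    frame_id: usize,
    value: u64,
    children: Vec<Node>,
}

/// Steps 1-2: exact-match `name` against `strings`, then return every frame
/// whose function string ID is one of the matches.
/// `frame_functions[i]` is the function string ID of frame `i`
/// (a hypothetical flattening of the frame table).
fn matching_frame_ids(strings: &[&str], frame_functions: &[usize], name: &str) -> HashSet<usize> {
    let string_ids: HashSet<usize> = strings
        .iter()
        .enumerate()
        .filter(|&(_, s)| *s == name)
        .map(|(i, _)| i)
        .collect();
    frame_functions
        .iter()
        .enumerate()
        .filter(|&(_, f)| string_ids.contains(f))
        .map(|(i, _)| i)
        .collect()
}

/// Step 3: collect the topmost matching subtree on each call path.
fn collect_topmost<'a>(node: &'a Node, targets: &HashSet<usize>, out: &mut Vec<&'a Node>) {
    if targets.contains(&node.frame_id) {
        // Stop descending: nested re-entries stay inside this subtree.
        out.push(node);
        return;
    }
    for child in &node.children {
        collect_topmost(child, targets, out);
    }
}
```

Returning early on the first match is what keeps a recursive or re-entered function from being collected twice along the same call path.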

The output is the same wire-format shape as the unfiltered API response (positional tuples + frames + strings interning tables, with frameSchema and nodeSchema preserved). The intent is that anything an agent already knows about Datadog's profiler API still applies — --show-from only narrows the data, it doesn't change the schema.
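
Because the schema arrays are preserved, a consumer can keep resolving fields by name rather than by hard-coded position. A minimal sketch, assuming the schema is a list of field names and each frame is a tuple of IDs; `field_by_name` is a hypothetical helper, not part of pup:

```rust
/// Looks up a named field in a positional tuple using its schema.
/// Returns None if the schema has no such field or the tuple is too short.
fn field_by_name<'a, T>(schema: &[&str], tuple: &'a [T], name: &str) -> Option<&'a T> {
    let idx = schema.iter().position(|f| *f == name)?;
    tuple.get(idx)
}
```

With a five-entry schema whose last entry is `function`, `field_by_name(&schema, &frame, "function")` would return the value at index 4 of the frame tuple. The other field names in the schema are not given in this description, so any used in a test are placeholders.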

Concrete impact

Tested end-to-end against a 14-hour aggregate query on a real Go service. Order-of-magnitude numbers:

| | Without --show-from | With --show-from=<function> |
| --- | --- | --- |
| Response size | tens of MB | ~100 KB |
| Top-level subtrees | hundreds | 1 (merged) |
| frames / strings table sizes | thousands of entries | dozens of entries |

The resulting flameGraph shape matches the UI's show_from(...) view exactly: a single root for the chosen function, with its merged callee subtree below.
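
The merge that produces that single root could look something like the sketch below. It is an illustration under simplifying assumptions, not the PR's `merge_subtrees_by_function`: each node is assumed to already carry its function string ID (`func`), and values are assumed to be additive across merged nodes.

```rust
use std::collections::BTreeMap;

/// Simplified node keyed by function string ID (assumption: resolved from
/// the frame table beforehand).
#[derive(Debug, Clone, PartialEq)]
struct Node {
    func: usize,
    value: u64,
    children: Vec<Node>,
}

/// Merges nodes that share a function ID into one node each, summing their
/// values and recursively merging their combined children.
/// BTreeMap iteration yields siblings in string-ID order.
fn merge_by_function(nodes: Vec<Node>) -> Vec<Node> {
    let mut groups: BTreeMap<usize, (u64, Vec<Node>)> = BTreeMap::new();
    for node in nodes {
        let entry = groups.entry(node.func).or_insert((0, Vec::new()));
        entry.0 += node.value;
        entry.1.extend(node.children);
    }
    groups
        .into_iter()
        .map(|(func, (value, children))| Node {
            func,
            value,
            children: merge_by_function(children),
        })
        .collect()
}
```

Feeding it the subtrees collected for one function yields a one-element vector, which corresponds to the "1 (merged)" row in the table above.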

Changes

  • New --show-from flag on ProfilingActions::Aggregate (src/main.rs).
  • New helpers in src/commands/profiling.rs:
    • apply_show_from — exact-match filter + topmost subtree collection.
    • merge_subtrees_by_function — recursive same-name sibling merge.
    • prune_aggregate_response — frames/strings remap + UI-field drop.
    • collect_show_from_subtrees, collect_used_frame_ids, remap_frame_ids — small recursive walkers.
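
For reviewers who want a mental model of the prune step, here is a rough sketch of consistent ID remapping. It is not the code in src/commands/profiling.rs: frames are reduced to a single function string ID each, and `collect_used`, `remap_ids`, and `prune` are hypothetical names.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct Node {
    frame_id: usize,
    value: u64,
    children: Vec<Node>,
}

/// Assigns dense new IDs, in first-visit order, to every frame the tree uses.
fn collect_used(node: &Node, remap: &mut BTreeMap<usize, usize>) {
    let next = remap.len();
    remap.entry(node.frame_id).or_insert(next);
    for child in &node.children {
        collect_used(child, remap);
    }
}

/// Rewrites every frame_id in the tree through the old -> new map.
fn remap_ids(node: &mut Node, remap: &BTreeMap<usize, usize>) {
    node.frame_id = remap[&node.frame_id];
    for child in &mut node.children {
        remap_ids(child, remap);
    }
}

/// Prunes `frames` (simplified: one function string ID per frame) and
/// `strings` to what `root` references, keeping all IDs consistent.
fn prune(root: &mut Node, frames: &[usize], strings: &[String]) -> (Vec<usize>, Vec<String>) {
    let mut frame_remap = BTreeMap::new();
    collect_used(root, &mut frame_remap);
    remap_ids(root, &frame_remap);

    // Visit frames in new-ID order so string IDs are assigned deterministically.
    let mut pairs: Vec<(usize, usize)> = frame_remap.iter().map(|(old, new)| (*new, *old)).collect();
    pairs.sort();

    let mut new_frames = vec![0; pairs.len()];
    let mut string_remap: BTreeMap<usize, usize> = BTreeMap::new();
    let mut new_strings = Vec::new();
    for (new_id, old_id) in pairs {
        let old_str = frames[old_id];
        let new_str = *string_remap.entry(old_str).or_insert_with(|| {
            new_strings.push(strings[old_str].clone());
            new_strings.len() - 1
        });
        new_frames[new_id] = new_str;
    }
    (new_frames, new_strings)
}
```

The real frame tuples reference more than one string per frame, so an actual implementation would remap every string-valued field, not just the function name.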

Design notes (for reviewer feedback)

  • Exact match only, intentionally. Substring/regex matching would be a cheap follow-up, and a strict default is easier to extend later than to retract.
  • Pruning is on whenever --show-from is set, not behind its own flag. The argument: if a user opted into a focused subtree view, the unfiltered metadata/summary fields are deadweight. Happy to gate behind --prune if the reviewer prefers.
  • We considered fully denormalizing the output to a self-describing { name, value_ms, children } record tree (no frames/strings lookup), since that's much friendlier for jq and LLMs. Rejected on the grounds that it diverges from the documented Datadog API shape and would force any existing agent/script to learn a second representation. Open to revisiting if the reviewer disagrees.
  • Sibling ordering after merge is by string ID (BTreeMap), not by value descending as the UI shows. Trivial to switch — flagging because it's a small UX difference.
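
To make the last two notes concrete, the sketch below shows what the rejected denormalized form could look like, with siblings sorted by descending value as the UI does. It is purely illustrative: `Record`, `denormalize`, and the simplified `frames` table (one function string ID per frame) are assumptions, and the `value_ms` unit is taken from the note above rather than verified against the API.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Node {
    frame_id: usize,
    value: u64,
    children: Vec<Node>,
}

/// Self-describing record: no frames/strings lookup needed to read it.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    name: String,
    value_ms: u64,
    children: Vec<Record>,
}

/// Resolves each node's name through frames -> strings and orders
/// siblings by descending value (stable sort, so ties keep input order).
fn denormalize(node: &Node, frames: &[usize], strings: &[&str]) -> Record {
    let mut children: Vec<Record> = node
        .children
        .iter()
        .map(|c| denormalize(c, frames, strings))
        .collect();
    children.sort_by(|a, b| b.value_ms.cmp(&a.value_ms));
    Record {
        name: strings[frames[node.frame_id]].to_string(),
        value_ms: node.value,
        children,
    }
}
```

The same `sort_by` call applied after the merge would be enough to switch sibling ordering in the wire-format output without changing its shape.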

Testing

  • Verified end-to-end on a real org against a multi-hour aggregate query. Output tree matches the UI's show_from(...) view, including merged sibling collapse, with the same per-minute CPU figure the UI displays.
  • Verified frame names still resolve correctly through the remapped tables (multi-level walks of the call tree).
  • No tests included yet. Happy to add unit tests for apply_show_from and merge_subtrees_by_function against synthetic flame-graph fixtures before the PR moves out of draft, if the reviewer wants them inline.

Follow-ups out of scope

  • --show-from-mode=substring|regex for non-exact matching.
  • Sibling ordering by descending value to match the UI.
  • Same flag on pup profiling callgraph once that endpoint's own scaling issues (504s on long windows) are addressed.
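
As a sketch of how the first follow-up might slot into step 1, the matching rule could be factored behind a mode enum. Nothing here exists in pup today: `MatchMode` and `matching_string_ids` are hypothetical, and the regex variant is left out because it would need an external crate.

```rust
/// Hypothetical matching modes for a future --show-from-mode flag.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MatchMode {
    Exact,
    Substring,
}

/// Returns the IDs of all strings that match `pattern` under `mode`.
fn matching_string_ids(strings: &[&str], pattern: &str, mode: MatchMode) -> Vec<usize> {
    strings
        .iter()
        .enumerate()
        .filter(|&(_, s)| match mode {
            MatchMode::Exact => *s == pattern,
            MatchMode::Substring => s.contains(pattern),
        })
        .map(|(i, _)| i)
        .collect()
}
```

Keeping `Exact` as the default would preserve the current behavior for existing callers.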

🤖 Generated with Claude Code

The Datadog Continuous Profiler UI exposes a `show_from(<function>)`
flame-graph filter that zooms the displayed graph to subtrees rooted at
a chosen function. The filter is applied client-side — the API still
returns the full aggregated profile — so reproducing it from the CLI
required hand-rolled jq, which is awkward for AI agents and humans
alike. This adds an equivalent native flag.

`pup profiling aggregate --show-from=<function_name>` does, in order:

1. Resolves the function name to string IDs (exact match against
   `data.strings`).
2. Walks the response's frame table to find every `frame_id` whose
   `function` field (`frameSchema[4]`) refers to one of those string
   IDs. A single logical function commonly maps to many frame IDs
   after Go inlining / generics.
3. Walks `flameGraph` and collects the topmost subtree rooted at any
   matching frame on each call path (does not descend into nested
   re-entries — matches UI semantics).
4. Merges the collected subtrees into a single root by recursively
   collapsing children that share the same function name. Without
   this, the same logical function appears as N siblings at depth 0
   (one per inlined frame ID), which doesn't match the UI.
5. Prunes `frames` and `strings` to only the entries the trimmed
   subtree references (with consistent ID remapping), and drops UI
   aggregation fields (`metadata`, `summaryTable`, `endpointCounts`,
   etc.) that describe the unfiltered profile rather than the filtered
   subtree.

Concrete impact on a 14-hour `service:logs-event-store-reader` query:
unfiltered response is 43 MB; with `--show-from=vectorEQStringNotMulti`
the response drops to ~107 KB and shows the same call tree as the UI.

- New `--show-from` flag on `ProfilingActions::Aggregate` (src/main.rs)
- `apply_show_from`, `merge_subtrees_by_function`, and
  `prune_aggregate_response` helpers in src/commands/profiling.rs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@platinummonkey added the "enhancement" (New feature or request) label on May 7, 2026
@long2mao1 changed the title from "feat(profiling): add --show-from filter to profiling aggregate" to "[DO NOT MERGE YET] feat(profiling): add --show-from filter to profiling aggregate" on May 7, 2026
@long2mao1 changed the title from "[DO NOT MERGE YET] feat(profiling): add --show-from filter to profiling aggregate" to "[NOT READY] feat(profiling): add --show-from filter to profiling aggregate" on May 7, 2026
@platinummonkey (Collaborator) commented

closing for #492

Labels

enhancement (New feature or request), product:apm

2 participants