@yoochangyeon
Contributor

What does this PR do?

Adds SA_RESETHAND flag to the SIGSEGV signal handler registration in src/ddprof.cc and test/simple_malloc.cc.

With this flag, the kernel automatically resets the signal disposition to SIG_DFL (default behavior: process termination) before entering the handler. This ensures that if the handler fails to reach _exit(), the next SIGSEGV will terminate the process instead of re-entering the handler infinitely.
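
For illustration, the registration could look roughly like the sketch below. This is not the actual code from src/ddprof.cc: the names sketch_sigsegv_handler and install_sigsegv_handler_once are hypothetical, and the handler body is a stand-in that only mirrors the "write to stderr and _exit" behavior described in this PR.

```cpp
#include <csignal>
#include <cstring>
#include <unistd.h>

// Hypothetical stand-in for ddprof's handler: report to stderr, then exit.
extern "C" void sketch_sigsegv_handler(int) {
  static const char msg[] = "sketch: caught SIGSEGV\n";
  // write() and _exit() are both async-signal-safe.
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)ignored;
  _exit(1);
}

// Installs the handler with SA_RESETHAND so the disposition goes back to
// SIG_DFL as soon as the handler is entered. Returns true on success.
bool install_sigsegv_handler_once() {
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sa.sa_handler = sketch_sigsegv_handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESETHAND;  // the flag this PR adds
  return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Calling sigaction(SIGSEGV, nullptr, &current) afterwards and checking current.sa_flags & SA_RESETHAND is one way to confirm the flag was applied.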

Motivation

We observed a production issue where a Node.js process profiled by ddprof entered an infinite SIGSEGV loop, causing 100% CPU usage that never recovered.

The strace output showed this pattern repeating indefinitely:

[pid 2881] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xe3a8} ---
[pid 2881] read(7, "", 1) = 1
[pid 2881] write(22, "...", 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 2881] write(8, "", 1) = 1
[pid 2881] rt_sigreturn({mask=[]}) = 0

The syscall pattern (read → write → rt_sigreturn) did not match ddprof's sigsegv_handler (which calls write to stderr and _exit). This indicates that another signal handler (likely from the Node.js runtime) had overridden ddprof's handler via a later sigaction() call.

Because the overriding handler returns normally via rt_sigreturn, execution resumes at the faulting instruction, which raises SIGSEGV again and creates an infinite loop.

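SA_RESETHAND guards against this class of loop: if ddprof's handler is entered but fails to terminate the process, the second SIGSEGV uses the default disposition and terminates the process cleanly. One caveat: sa_flags are stored together with the handler, so a later sigaction() call from another runtime replaces ddprof's flags along with its handler. The flag therefore protects only while ddprof's registration is the active one.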
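
The difference can be reproduced in isolation with a sketch like the one below, assuming Linux. It is not part of this PR: returning_handler and terminating_signal_of_faulting_child are hypothetical names, and the fault is produced by writing to a PROT_NONE page rather than by the crash seen in production. The child installs a handler that simply returns, then faults. With SA_RESETHAND the second fault kills it with SIGSEGV. Without the flag it loops until the alarm(1) safety net fires.

```cpp
#include <csignal>
#include <cstring>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handler that returns normally, like the one in the trace.
extern "C" void returning_handler(int) {}

// Forks a child that faults with returning_handler installed. Returns the
// signal that terminated the child, or -1 on error or normal exit.
int terminating_signal_of_faulting_child(bool use_resethand) {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = returning_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = use_resethand ? SA_RESETHAND : 0;
    if (sigaction(SIGSEGV, &sa, nullptr) != 0) _exit(2);

    // Safety net: without SA_RESETHAND the fault loop never ends by itself.
    signal(SIGALRM, SIG_DFL);
    alarm(1);

    void* page = mmap(nullptr, 4096, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) _exit(3);
    // Faulting store: re-executed each time the handler returns.
    *static_cast<volatile char*>(page) = 1;
    _exit(0);  // not reached
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

terminating_signal_of_faulting_child(true) should report SIGSEGV almost immediately, while the false variant should report SIGALRM after about one second of spinning.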

Additional Notes

How to test the change?

  1. Existing tests: The change only adds a flag to sigaction() and does not alter handler logic, so existing tests should pass without modification.
  2. Manual verification: Attach ddprof to a process alongside another runtime (e.g., Node.js) that also installs a SIGSEGV handler. Trigger a SIGSEGV — the process should terminate instead of entering an infinite loop.
  3. Flag verification: Run grep SigCgt /proc/<pid>/status to confirm SIGSEGV (signal 11, bit 0x400 of the mask) is still caught after the change.
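
The SigCgt check in step 3 can also be scripted. The sketch below assumes the Linux /proc status format, where SigCgt is a hexadecimal bitmask and signal N corresponds to bit N-1. The helper names mask_catches_signal and status_file_catches_signal are hypothetical. Note that this only shows a handler is installed. As far as I know, /proc/<pid>/status does not expose sa_flags, so it cannot confirm SA_RESETHAND itself.

```cpp
#include <fstream>
#include <string>

// Returns true if the hex mask from a "SigCgt:" line has the bit for signo set.
bool mask_catches_signal(const std::string& hex_mask, int signo) {
  unsigned long long mask = std::stoull(hex_mask, nullptr, 16);
  return ((mask >> (signo - 1)) & 1ULL) != 0;
}

// Scans a status file (e.g. "/proc/1234/status") for the SigCgt line and
// reports whether signo is caught. Returns false if the line is missing.
bool status_file_catches_signal(const std::string& path, int signo) {
  std::ifstream in(path);
  std::string line;
  const std::string prefix = "SigCgt:";
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      // stoull skips the tab that follows the colon.
      return mask_catches_signal(line.substr(prefix.size()), signo);
    }
  }
  return false;
}
```

For example, a mask of 0000000000000400 has bit 10 set, which corresponds to signal 11 (SIGSEGV).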
