tracee/event: batch waitpid events and sort CLONE before SIGSTOP to fix multithread deadlock #337
Open
daniel-thisnow wants to merge 1 commit into termux:master from
Fix a deadlock that occurs when a multi-threaded process (e.g. Node.js with its libuv thread pool) rapidly creates several threads via clone(CLONE_VM|CLONE_THREAD|...).

The kernel delivers two events per thread creation: PTRACE_EVENT_CLONE to the parent and an initial SIGSTOP to the new child. PRoot's SIGSTOP_PENDING mechanism correctly handles the case where a single child's SIGSTOP arrives before the parent's PTRACE_EVENT_CLONE. However, when four threads are created in rapid succession (as libuv does at startup), the interleaving of multiple SIGSTOP and PTRACE_EVENT_CLONE events can leave a child permanently stopped in ptrace-stop with no future waitpid(2) report to wake it: a deadlock where proot blocks in waitpid() and all tracees sit in tracing-stop.

Root cause: when waitpid(2) returns a child's SIGSTOP before the parent's PTRACE_EVENT_CLONE, handle_tracee_event() sets tracee->sigstop = SIGSTOP_PENDING and returns signal = -1, deferring the restart to new_child(). With multiple concurrent clones the timing window for this misordering is wide enough to be hit reliably.

Fix: after waking from the blocking waitpid(2), drain all additional already-pending events using WNOHANG into a small batch, sort the batch so PTRACE_EVENT_CLONE/FORK/VFORK events come before SIGSTOP events (using qsort with a simple priority function), then process the sorted batch. This guarantees new_child() is always called before the child's initial SIGSTOP is handled, so tracee->exe is set and the direct SIGSTOP_IGNORED path is taken.

Tested on Android 13 / aarch64 (Linux 5.15): npm install inside proot-distro Ubuntu now completes reliably (5/5 runs, previously hung non-deterministically within seconds of startup).

Fixes: termux#326
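The drain-and-sort step described in the commit message can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: struct wait_event, event_priority(), sort_batch(), drain_pending(), and MAX_BATCH are hypothetical names, the __WALL flag on waitpid(2) is an assumption, and ranking every non-creation event equally is an assumption. The arrival index is used as a tie-breaker because qsort is not guaranteed to be stable.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

#define MAX_BATCH 64 /* hypothetical cap on events handled per wake-up */

/* Hypothetical batch entry: one (pid, status) pair from waitpid(2),
 * plus its arrival index so that ties keep their original order. */
struct wait_event {
    pid_t pid;
    int status;
    size_t seq;
};

/* True if status is a ptrace event stop carrying the given PTRACE_EVENT_* code. */
static int is_ptrace_event(int status, int event)
{
    return WIFSTOPPED(status) && (status >> 8) == (SIGTRAP | (event << 8));
}

/* 0 for CLONE/FORK/VFORK events, 1 for everything else (SIGSTOP included). */
static int event_priority(int status)
{
    if (is_ptrace_event(status, PTRACE_EVENT_CLONE) ||
        is_ptrace_event(status, PTRACE_EVENT_FORK) ||
        is_ptrace_event(status, PTRACE_EVENT_VFORK))
        return 0;
    return 1;
}

/* qsort comparator: lower priority value first, then arrival order. */
static int compare_events(const void *a, const void *b)
{
    const struct wait_event *x = a, *y = b;
    int px = event_priority(x->status), py = event_priority(y->status);
    if (px != py)
        return px - py;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

static void sort_batch(struct wait_event *batch, size_t n)
{
    qsort(batch, n, sizeof(batch[0]), compare_events);
}

/* Store the event already returned by the blocking waitpid(2), then
 * collect further pending events without blocking. Returns the count. */
static size_t drain_pending(struct wait_event *batch, pid_t first_pid, int first_status)
{
    size_t n = 0;
    batch[n] = (struct wait_event){ first_pid, first_status, n };
    n++;
    while (n < MAX_BATCH) {
        int status;
        pid_t pid = waitpid(-1, &status, __WALL | WNOHANG);
        if (pid <= 0) /* nothing pending right now, or no children left */
            break;
        batch[n] = (struct wait_event){ pid, status, n };
        n++;
    }
    return n;
}
```

A caller would pass the result of its blocking waitpid(2) to drain_pending(), call sort_batch() on the filled array, and then hand each entry to the existing per-event handler in order.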
Problem
When a multi-threaded process such as Node.js starts its libuv thread pool, it calls clone(CLONE_VM|CLONE_THREAD|...) four times in rapid succession. The kernel delivers two events per thread creation:
- PTRACE_EVENT_CLONE to the parent (processed by new_child(), sets tracee->exe)
- SIGSTOP to the new child

These two events can be returned by waitpid(2) in either order. PRoot's SIGSTOP_PENDING mechanism correctly handles a single misordering: if the child's SIGSTOP arrives first (tracee->exe == NULL), handle_tracee_event() sets sigstop = SIGSTOP_PENDING and defers the restart to new_child().

However, with four threads created at once, the interleaving of multiple SIGSTOP and PTRACE_EVENT_CLONE events can leave a child permanently in ptrace-stop with no future waitpid(2) report: a deadlock where proot blocks in waitpid() and all tracees sit in tracing-stop.

This manifests as npm install (or any Node.js workload that initialises libuv) hanging non-deterministically inside proot-distro. The hang was reported in #326.

Fix
After waking from the blocking waitpid(2), drain all additional already-pending events using WNOHANG into a small batch (capped at 64 events). Sort the batch with qsort using a simple priority function:
1. PTRACE_EVENT_CLONE / FORK / VFORK
2. SIGSTOP

Then process the sorted batch sequentially. This guarantees new_child() is always called before the child's initial SIGSTOP is handled, so tracee->exe != NULL and the direct SIGSTOP_IGNORED path is taken.

The correctness argument:
Testing
Tested on Android 13 / aarch64 (Linux 5.15.178, Termux + proot-distro Ubuntu):
Compiled with clang 21 targeting aarch64, no new warnings.
Relates to #326.
Built on Android, from Claude claude android funtimes 🤖