fix(ptrace): sessionless pid-attach + reviewed seed helper (#292, supersedes #312)#359
Merged
Merged
Conversation
…ess pid-attach
Two related races / accounting bugs in how the tracer handles child
processes whose stop arrives before the parent's PTRACE_EVENT_FORK is
processed by handleNewChild.
1. Minimal-state fallback only inherited a subset of parent fields.
handleNewChild copies SessionID, HasPrefilter, PendingPrefilter
(conditionally), TGID-level escalation, thread-level escalation,
and parent TGID to a freshly-created child. The minimal-state
fallback sites in run() and handleEventStop() only copied
SessionID / HasPrefilter / parent TGID, leaving the child with
different enforcement state from one created via the normal path
if it execve'd in the race window.
Fix: factor the create-from-scratch logic out of handleNewChild
into seedChildStateFromParent(parent, childTID, childTGID) and
call it from all three sites (handleNewChild's else branch + the
two minimal-state fallbacks). A child created via either path is
now byte-identical in enforcement state.
2. Unknown-session execve conflated two distinct cases.
The previous revision flipped the unknown_session branch in
HandleExecve from deny -> allow, but it lumped together:
(a) Sessionless pid-attach. initPtraceTracer calls tr.AttachPID
(pid) without WithSessionID for the attach_mode=pid path, so
the attached root and its descendants are sessionless by
design -- the wrapper / session layer governs enforcement.
Pass-through is correct here.
(b) Non-empty SessionID missing from the session manager. This is
a real session-accounting bug, not a race. Treating it as a
silent allow turns the bug into silent under-enforcement.
Fix: mark case (a) explicitly. New TraceeState.SessionlessPIDAttach
and ExecContext.SessionlessPIDAttach fields. attach.go sets the
flag on the root tracee when opts.sessionID == "" (the legitimate
attach_mode=pid path). seedChildStateFromParent propagates it to
descendants. HandleExecve splits the unknown-session branch:
SessionlessPIDAttach=true -> allow with rule sessionless_pid_attach
(slog.Debug); otherwise -> deny with rule unknown_session
(slog.Warn). Case (b) is now visible and fail-closed again.
Tests (all pass under -race):
- internal/ptrace/tracer_inherit_test.go: covers findParentByTGID
and seedChildStateFromParent. Verifies all enforcement fields
propagate (including SessionlessPIDAttach), the conditional skip
of PendingPrefilter when parent already has the filter installed,
and the nil-parent / per-thread defaults.
- internal/api/ptrace_handlers_test.go:
TestHandleExecve_SessionlessPIDAttachAllows verifies (a) allows
with rule sessionless_pid_attach.
TestHandleExecve_NonEmptyUnknownSessionDenies verifies (b) denies
with rule unknown_session and EACCES.
Empirical / functional validation against the original #292 cascade:
pure upstream v0.19.3+f531584 still reproduces 30/30 MODE_A failures
on attach_mode=pid + seccomp_prefilter under sustained sb.run()-equivalent load;
the revised binary passes 90/90 calls across 3 iterations with the
SessionlessPIDAttach=true branch taking the allow path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rent The previous helper hard-coded SuppressInitialStop=true. That's correct for handleNewChild's normal-path caller (state is created on the parent's PTRACE_EVENT_FORK, BEFORE the child's initial SIGSTOP arrives, so the upcoming stop should be swallowed) but wrong for the two minimal-state fallback callers in handleStop() and handleEventStop(), where the child's initial stop has ALREADY been dispatched -- that's exactly what triggered the fallback. Leaving the flag set there silently swallows the NEXT external SIGSTOP delivered to the process (e.g. kill -STOP, debugger), which is a behavioral regression vs. the pre-refactor fallback code that deliberately did not set this flag. Add a suppressInitialStop bool parameter and pass: - true from handleNewChild's else branch (normal create path) - false from the two fallback paths Also corrects the helper docstring and the handleNewChild / handleEventStop inline comments that referred to "run()" -- the actual containing function is handleStop(). Adds TestSeedChildStateFromParent_SuppressInitialStopFalse covering the fallback contract; the existing three tests pass true explicitly to preserve their normal-path coverage. Found by roborev review of #312 (job 9372, Medium severity). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…date path handleNewChild's existing-state branch reconciles a child created via the procfs-based minimal-state fallback (handleStop/handleEventStop) with the kernel-authoritative parent from PTRACE_EVENT_FORK. It already overwrites SessionID, HasPrefilter, PendingPrefilter, and the four escalation flags so a wrong inference by findParentByTGID + readPPID cannot survive once the fork event arrives. SessionlessPIDAttach was missing from that list. Worst case without this: the procfs fallback infers SessionlessPIDAttach=true on a child whose true parent has it false (and whose SessionID's session has already vanished). HandleExecve then misses the session lookup, falls into the sessionless_pid_attach branch, and silently allows -- exactly the case the new "unknown_session" deny path is supposed to make visible. Found by roborev review of the branch (job 9375, Low severity). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #292 (intermittent shim/server hang on long
sb.run()sequences inattach_mode=pid). Supersedes #312 — preserves the original contributor's commit + adds two follow-up fixes from roborev review.Commits
3e19201b— original PR fix(ptrace): inherit SessionID from parent on minimal-state child creation #312 commit by @es-fabricemarie. Splits theHandleExecveunknown-session branch intosessionless_pid_attach(allow) vsunknown_session(still fail-closed deny), and addsseedChildStateFromParentto inheritSessionID+ flags from the parent in the two minimal-state fallback paths inhandleStop/handleEventStop. Root-cause analysis and reproducer are in Intermittent shim/server hang on long sb.run() sequences (v0.19.2, ptrace-pid hybrid) #292's thread.1bd856ef— addresses roborev Medium finding (job 9372).seedChildStateFromParenthard-codedSuppressInitialStop: true, correct forhandleNewChild's normal path but wrong for the two fallback callers where the child's initial SIGSTOP has already been received. The flag is now parameterized; fallback callers passfalseso an unrelated future SIGSTOP isn't silently swallowed. Comments referring torun()are corrected tohandleStop().bc609c5e— addresses roborev Low finding (job 9375).handleNewChild's existing-state branch reconciles a fallback-created child with the kernel-authoritative parent; it overwrote every enforcement field exceptSessionlessPIDAttach. A staletrueinferred from a wrong-procfs-parent could turn the newsessionless_pid_attachallow into a silent fail-open. Now overwritten alongside the other fields.Test Plan
go test ./internal/ptrace/... ./internal/api/...— all green (incl. fourTestSeedChildStateFromParent_*covering both suppressInitialStop values + the newHandleExecvesessionless/unknown split tests)go test ./...— full suite, 0 failuresGOOS=windows go build ./...— cross-compile cleanroborev review --branch --wait --reasoning maximum— pre-followup found 1 Medium + 4 Low (job 9372); after the Medium fix one Low remained (job 9375); now both addressed. Final state has only the open Lows about (a) per-thread escalation imprecision infindParentByTGID, (b) implicitSessionlessPIDAttachinference from emptyopts.sessionID, (c) no end-to-end test of the fallback call sites. Documented as known limitations; not regressions and not blockers.Reproducer
Contributor's
reproduction-script.zipattached on #292 — 150/150 MODE_A failures on purev0.19.3+f531584, 150/150 PASS with this branch's commit 1 alone. Commits 2 and 3 are defense-in-depth on top.Credit
Author of commit
3e19201bis @es-fabricemarie (PR #312). I'll close #312 once this lands.🤖 Generated with Claude Code