On Windows, os.kill(pid, 0) is not a safe process-existence check. Unlike on
POSIX, where signal 0 only probes the target, Python on Windows treats 0 as
signal.CTRL_C_EVENT and terminates the target via TerminateProcess for most
other signal values. In one Python 3.13 Windows environment, Pluto's sync
process reported "Parent process died" about five seconds after startup even
though the training process was still running. In a direct smoke test,
os.kill(os.getpid(), 0) returned successfully, and the process then exited
with code 137. Prefer psutil.pid_exists(pid), since Pluto already depends on
psutil elsewhere, and keep os.kill(pid, 0) only as a fallback when psutil is
unavailable. Apply the same helper to SyncStore orphan detection so that both
parent checks behave consistently.
pluto_ml_windows_pid_check.patch
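
For reviewers, the shared helper described above might look roughly like the
sketch below. This is an illustration, not the contents of the patch: the
name pid_exists and the module layout are hypothetical. The Windows guard in
the fallback branch is an extra assumption that the description does not
call for; it is there so the sketch never sends signal 0 on Windows.

```python
import os

try:
    import psutil
except ImportError:  # psutil not installed: use the os.kill fallback below
    psutil = None


def pid_exists(pid: int) -> bool:
    """Hypothetical helper: report whether a process with this PID appears to exist."""
    if pid <= 0:
        # PID 0 and negative PIDs address process groups on POSIX, not one process.
        return False
    if psutil is not None:
        # Preferred path: psutil performs a platform-appropriate lookup.
        return psutil.pid_exists(pid)
    if os.name == "nt":
        # Assumption (not in the original description): without psutil there is
        # no safe probe here, because os.kill(pid, 0) is not a no-op on Windows.
        # Report "alive" rather than risk signalling the parent.
        return True
    try:
        os.kill(pid, 0)  # POSIX only: signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

Both the sync process's parent watchdog and SyncStore orphan detection could
then call pid_exists(parent_pid) in place of their own os.kill probes.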