Fix NCU profiling under Buck PAR via KERNEL_PROFILER_PYTHON (#136)

Closed
wychi wants to merge 1 commit into
meta-pytorch:mainfrom
wychi:export-D105739069

wychi (Contributor) commented May 19, 2026

Summary:

NCU profiling subprocesses failed inside a Buck PAR with:

ImportError: .../platform010/lib/python3.12/lib-dynload/
_posixsubprocess.cpython-312-x86_64-linux-gnu.so:
undefined symbol: _Py_NoneStruct
ModuleNotFoundError: No module named 'torch'

Root cause: inside a PAR, sys.executable points at the statically linked
native-main binary. Re-exec'ing it from a subprocess skips the env setup
that _bootstrap.sh performs (LD_LIBRARY_PATH for CUDA, LD_PRELOAD for
the allocator, PYTHONPATH/FB_PAR_* for the embedded import system), so
the spawned Python falls back to the system platform010 stdlib, whose
lib-dynload .so files are ABI-incompatible with the static libpython.
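
To check whether a spawned interpreter actually inherited the bootstrap environment, a small diagnostic along these lines may help. This is only a sketch: the helper name describe_interpreter_env and its return shape are hypothetical and not part of this change. The variable names come from the description above.

```python
import os
import sys

# Variables that, per the description above, the bootstrap script sets up.
BOOTSTRAP_VARS = ("LD_LIBRARY_PATH", "LD_PRELOAD", "PYTHONPATH")
PAR_PREFIX = "FB_PAR_"


def describe_interpreter_env(environ=None):
    """Report the interpreter path and which bootstrap-related variables are set.

    Hypothetical helper for debugging; pass a mapping to inspect something
    other than the current process environment.
    """
    environ = os.environ if environ is None else environ
    return {
        "executable": sys.executable,
        "set": sorted(n for n in BOOTSTRAP_VARS if environ.get(n)),
        "missing": sorted(n for n in BOOTSTRAP_VARS if not environ.get(n)),
        "par_vars": sorted(n for n in environ if n.startswith(PAR_PREFIX)),
    }
```

Running it in both the parent process and the profiled subprocess and comparing the "missing" lists should show whether the child skipped the bootstrap step.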

Fix is split along the OSS/fb boundary:

  • OSS ncu_profiler.profile_triton_kernel honors a generic env-var
    override KERNEL_PROFILER_PYTHON, falling back to sys.executable.
    No Meta/buck/PAR knowledge in OSS code.

  • fb utils/fb/internal_env.setup_internal_environment detects the
    PAR via "#native-main#" in the file name of sys.executable and sets
    the env var to <BASE_DIR>/_bootstrap.sh, which rebuilds the full env
    from $0 before exec'ing the native main.

If the user has already set KERNEL_PROFILER_PYTHON explicitly, that value is respected.
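
The two halves of the fix can be sketched roughly as follows. This is an illustration under assumptions, not the code in this diff: the function names resolve_profiler_python and maybe_set_par_override, their parameters, and the way base_dir is passed in are all hypothetical. Only the env-var name, the "#native-main#" marker, and the _bootstrap.sh file name come from the description above.

```python
import os
import sys
from pathlib import Path

ENV_VAR = "KERNEL_PROFILER_PYTHON"
PAR_MARKER = "#native-main#"


def resolve_profiler_python(environ=None):
    """OSS side (sketch): prefer the env-var override, else sys.executable."""
    environ = os.environ if environ is None else environ
    return environ.get(ENV_VAR) or sys.executable


def maybe_set_par_override(base_dir, executable=None, environ=None):
    """Internal side (sketch): point the override at the bootstrap script.

    Returns True if the variable was set, False if it was left alone.
    How base_dir is discovered is not covered here; it is an input.
    """
    environ = os.environ if environ is None else environ
    executable = sys.executable if executable is None else executable
    if environ.get(ENV_VAR):
        return False  # an explicit user override wins
    if PAR_MARKER not in Path(executable).name:
        return False  # not running from a PAR native-main binary
    environ[ENV_VAR] = str(Path(base_dir) / "_bootstrap.sh")
    return True
```

Keeping the PAR detection out of the resolver means the OSS side only ever reads one generic variable, which matches the split described above.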

Differential Revision: D105739069

meta-cla Bot added the CLA Signed label May 19, 2026

meta-codesync Bot commented May 19, 2026

@wychi has exported this pull request. If you are a Meta employee, you can view the originating Diff in D105739069.

meta-codesync Bot changed the title from "Fix NCU profiling under Buck PAR via KERNEL_PROFILER_PYTHON" to "Fix NCU profiling under Buck PAR via KERNEL_PROFILER_PYTHON (#136)" on May 19, 2026
wychi force-pushed the export-D105739069 branch from e1abe61 to f7bc3ce on May 19, 2026 20:29
wychi closed this May 19, 2026
Labels

CLA Signed, fb-exported, meta-exported