xpumd 1.3.6 SIGABRT on Arc Pro B70 (0xe223) — gmm_helper/resource_info.cpp:15 with newer compute-runtime 26.14 / libigdgmm12 22.9 #128

@cwhanlon

Summary

xpumd from xpumanager 1.3.6 aborts with SIGABRT during the "initialize device manager" stage on an Intel Arc Pro B70 (PCI ID 0xe223) running the latest user-space compute stack (compute-runtime 26.14.37833.4, libigdgmm12 22.9). Standalone xpu-smi works on the same host; only the daemon path fails.

Environment

  • GPU: Intel Arc Pro B70 (BMG-G31, PCI 8086:e223) — single-GPU host
  • OS: Ubuntu 24.04.4 LTS (noble)
  • Kernel: 6.17.0-23-generic (HWE) — using the xe driver, not i915
  • Re-BAR: enabled (full 32 GiB BAR mapped)
  • Compute runtime: intel-opencl-icd 26.14.37833.4-0, libze-intel-gpu1 26.14.37833.4-0 (from intel/compute-runtime v26.14.37833.4 .debs)
  • IGC: intel-igc-core-2 2.32.7, intel-igc-opencl-2 2.32.7 (from intel/intel-graphics-compiler v2.32.7)
  • libigdgmm12: 22.9.0 (from compute-runtime release)
  • Level Zero loader: 1.21.9 (libze1 from repositories.intel.com/gpu/ubuntu noble unified)
  • xpumanager: v1.3.6 (xpumanager_1.3.6_20260206.143628.1004f6cb.u24.04_amd64.deb)
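
For anyone comparing their own host against the versions listed above, here is a minimal Python sketch that collects the installed package versions in one pass. The package names are the ones listed in this report; the helper names (`parse_dpkg_output`, `installed_versions`) are made up for illustration. It assumes a Debian/Ubuntu host where `dpkg-query -W` prints one tab-separated `package<TAB>version` line per installed package.

```python
import shutil
import subprocess

# Package names as listed in the Environment section of this report.
PACKAGES = [
    "intel-opencl-icd",
    "libze-intel-gpu1",
    "intel-igc-core-2",
    "intel-igc-opencl-2",
    "libigdgmm12",
    "libze1",
    "xpumanager",
]


def parse_dpkg_output(text):
    """Turn 'name<TAB>version' lines into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.partition("\t")
        if sep and version.strip():
            # Drop an architecture qualifier such as ':amd64' if present.
            versions[name.split(":")[0]] = version.strip()
    return versions


def installed_versions(packages=PACKAGES):
    """Query dpkg for the given packages; returns {} if dpkg-query is missing."""
    if shutil.which("dpkg-query") is None:
        return {}
    # Packages that are not installed only produce an error on stderr,
    # so the exit status is deliberately not treated as fatal.
    result = subprocess.run(
        ["dpkg-query", "-W", *packages],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_dpkg_output(result.stdout)
```

Calling `installed_versions()` on the affected host should return a dictionary that can be pasted into a follow-up comment; packages that are not installed are simply absent from the result.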

clinfo correctly reports the device under this stack:

Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Graphics [0xe223]
Driver Version: 26.14.37833.4
Global memory size: 32530182144 (30.3GiB)

Reproduction

  1. Fresh Ubuntu 24.04.4 with kernel 6.17 HWE on a host containing only an Arc Pro B70.
  2. Install Intel compute stack from the GitHub releases above.
  3. Install xpumanager_1.3.6_*.u24.04_amd64.deb.
  4. xpum.service starts; xpumd aborts ~3s later, before reaching device discovery.

Crash trace

xpumd: XPUM: Init xpum library
xpumd: XPU Manager:        1.3.6.20260206
xpumd: Build:                1004f6cb
xpumd: Level Zero:        1.21.9
xpumd: xpumd core starts to initialize
xpumd: initialize configuration
xpumd: xpum mode: xpum
xpumd: initialize datalogic
xpumd: initialize device manager
xpumd: Abort was called at 15 line in file:
xpumd: ../../neo/shared/source/gmm_helper/resource_info.cpp
systemd[1]: xpum.service: Main process exited, code=dumped, status=6/ABRT
systemd[1]: xpum.service: Failed with result core-dump.
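
To make triage of similar reports easier, the abort site can be pulled out of the journal text mechanically. The following is a minimal Python sketch, assuming the two-line message format shown in the trace above (line number on one log line, file path on the next); `find_abort_site` is a hypothetical helper, not part of xpumanager.

```python
import re

# Matches the first half of the abort message as it appears in the trace above.
ABORT_RE = re.compile(r"Abort was called at (\d+) line in file:")


def find_abort_site(log_lines):
    """Return (source_path, line_number) for the first abort message, or None."""
    for i, line in enumerate(log_lines):
        match = ABORT_RE.search(line)
        if match and i + 1 < len(log_lines):
            # The path is on the next log line, after the "xpumd: " prefix.
            path = log_lines[i + 1].split(": ", 1)[-1].strip()
            return path, int(match.group(1))
    return None


trace = [
    "xpumd: initialize device manager",
    "xpumd: Abort was called at 15 line in file:",
    "xpumd: ../../neo/shared/source/gmm_helper/resource_info.cpp",
]
print(find_abort_site(trace))
```

The input can be any list of log lines, for example the output of `journalctl -u xpum.service` split on newlines.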

Expected behavior

xpumd should initialize on Arc Pro B70 and expose the per-engine telemetry / temperatures / bandwidth that are unavailable through standalone xpu-smi alone.

Notes

  • The abort site (gmm_helper/resource_info.cpp:15) suggests that xpumanager 1.3.6 bundles a NEO/GMM build that predates B70 (0xe223) device support and is hitting an unhandled-device path during the "initialize device manager" stage.
  • xpu-smi (the standalone CLI without the daemon) works correctly on this host and reports the device, power draw, frequency, and memory used. Engine utilization, temps, and bandwidth are N/A from xpu-smi alone — which is why the daemon would be useful here.
  • Compute-runtime 26.14.37833.4 (April 2026) is the first NEO release I am aware of that has 0xe223 in shared/source/dll/devices/devices_base.inl as a BmgHwConfig entry. xpumanager 1.3.6 (Feb 2026) likely bundles an older NEO snapshot that does not have it.
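
The hypothesis in the last note can be checked against any NEO source snapshot by looking for the device ID in the device table. Below is a minimal Python sketch of that check. The sample excerpts and the `DEVICE(...)` macro syntax are assumptions made for illustration and may not match the real devices_base.inl; the function only searches for hexadecimal literals, so it does not depend on the exact macro form.

```python
import re


def device_ids_in_table(inl_text):
    """Collect every 4-digit hex literal in the text, normalised to lowercase."""
    return {m.lower() for m in re.findall(r"0x[0-9a-fA-F]{4}\b", inl_text)}


def supports_device(inl_text, pci_device_id):
    """True if the given PCI device ID appears anywhere in the table text."""
    return f"0x{pci_device_id:04x}" in device_ids_in_table(inl_text)


# Hypothetical excerpts; the real file's contents and syntax may differ.
older_snapshot = "DEVICE(0xE20B, BmgHwConfig)\n"
newer_snapshot = older_snapshot + "DEVICE(0xE223, BmgHwConfig)\n"

print(supports_device(older_snapshot, 0xE223))
print(supports_device(newer_snapshot, 0xE223))
```

In practice `inl_text` would be the contents of shared/source/dll/devices/devices_base.inl read from the NEO revision in question. If the revision that xpumanager 1.3.6 was built against lacks 0xe223, that would support the explanation above.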

Workaround

Remove xpumanager and use the standalone xpu-smi package. Telemetry is only partial, but it does not crash.