
Add 'amd-ctk gpu list' command to display GPU info #117

Merged
sgopinath1 merged 1 commit into ROCm:main from shiv-tyagi:add-gpu-list-cmd
Apr 26, 2026


@shiv-tyagi shiv-tyagi commented Apr 23, 2026

Fixes #116

Summary

Adds a new amd-ctk gpu list CLI command that displays AMD GPU information including device index, UUID, and DRM device paths. This provides a first-party way to discover GPU UUIDs needed for container configurations (AMD_VISIBLE_DEVICES), replacing the previous dependency on external tools like rocm-smi or amd-smi.

Changes

  • cmd/amd-ctk/gpu/gpu.go — New gpu subcommand group under amd-ctk.
  • cmd/amd-ctk/gpu/list/list.go — Implements amd-ctk gpu list. Discovers GPUs via internal/amdgpu, maps KFD topology UUIDs to device indices, and prints a formatted table of GPU Id, UUID (0x hex), and render device paths.
  • cmd/amd-ctk/main.go — Registers the new gpu command.
  • README.md — Replaces rocm-smi/amd-smi UUID discovery instructions with amd-ctk gpu list. Adds a note clarifying UUID source (KFD topology) vs ASIC_SERIAL/Unique ID from other tools.

Testing

Validated on an Ubuntu machine with 1 AMD GPU and the amd Docker runtime configured:

  • amd-ctk --help lists the new gpu command
  • amd-ctk gpu --help lists the list subcommand
  • amd-ctk gpu list exits with code 0
  • Output reports GPU count (Found 1 AMD GPU device)
  • Output contains table headers (GPU Id, UUID, DRM Devices)
  • Output shows renderD device paths
  • Output shows UUID in hex format (0xD8D3E250FF726FB0)
  • UUID from gpu list works with AMD_VISIBLE_DEVICES=<UUID> for container GPU injection
  • Render device reported by gpu list matches the device injected into the container

Introduces a new 'gpu list' subcommand that shows AMD GPU device
index, UUID (from KFD topology), and DRM device paths. This provides
a first-party alternative to rocm-smi/amd-smi for discovering GPU
UUIDs needed for AMD_VISIBLE_DEVICES container configurations.

Updates README to document the new command and clarify UUID source.

Made-with: Cursor
@shiv-tyagi shiv-tyagi changed the base branch from release/1.3.x to main April 23, 2026 11:06
@sgopinath1 sgopinath1 merged commit 0da9695 into ROCm:main Apr 26, 2026
3 checks passed
Development

Successfully merging this pull request may close these issues.

[Documentation]: wrong UUID for injecting GPUs into containers