[TeleViz] Add televiz basic shaders + DeviceImage CUDA-Vulkan interop by farbod-nv · Pull Request #451 · NVIDIA/IsaacTeleop

farbod-nv · 2026-05-01T17:50:32Z

viz/shaders/: glslang-compiled SPIR-V embedded as constexpr arrays; ships textured_quad.{vert,frag} for the upcoming QuadLayer.
viz/core/device_image: VkImage backed by external memory, imported into CUDA as cudaArray_t. Symmetric pair to HostImage.
viz/core/vk_context: pin CUDA device to Vulkan physical device by UUID at init() so interop types can assume same-GPU operation.
CI: add CUDA Toolkit + glslang-tools to build-ubuntu and sanitizer.
Tests: viz_shaders_tests, viz_core_tests DeviceImage round-trip; all unit tests pass under ASAN+UBSAN.

Summary by CodeRabbit

Release Notes

New Features
- Added CUDA support for GPU image operations and memory interoperability with Vulkan
- Introduced GLSL shader compilation system with automated SPIR-V generation
Tests
- Added GPU-based image handling tests
- Added shader blob validation tests
Chores
- Updated CI to install CUDA Toolkit and shader compilation tools
- Added build-time dependencies for shader compilation

- viz/shaders/: glslang-compiled SPIR-V embedded as constexpr arrays; ships textured_quad.{vert,frag} for the upcoming QuadLayer. - viz/core/device_image: VkImage backed by external memory, imported into CUDA as cudaArray_t. Symmetric pair to HostImage. - viz/core/vk_context: pin CUDA device to Vulkan physical device by UUID at init() so interop types can assume same-GPU operation. - CI: add CUDA Toolkit + glslang-tools to build-ubuntu and sanitizer. - Tests: viz_shaders_tests, viz_core_tests DeviceImage round-trip; all unit tests pass under ASAN+UBSAN. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai · 2026-05-01T17:51:14Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 768f483b-e1fe-42c9-91df-deb7b2115fcc

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

This pull request introduces CUDA-Vulkan GPU memory interop capabilities to the visualization module. The changes add CUDA Toolkit as a build requirement, implement a new DeviceImage class that creates Vulkan images with CUDA-accessible backing memory via external memory handles, introduce shader compilation infrastructure using glslangValidator to embed SPIR-V bytecode, and extend VkContext to match the active CUDA device UUID with the selected Vulkan physical device. Supporting changes include CI workflow updates, comprehensive documentation, and new test suites validating both the interop implementation and shader compilation.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant VkCtx as VkContext::init()
    participant VkDev as Vulkan Physical Device
    participant CUDA as CUDA Driver
    participant DevImg as DeviceImage::create()
    participant VkImg as Vulkan Image
    participant VkMem as Vulkan Device Memory
    participant CudaArr as CUDA Array

    App->>VkCtx: initialize graphics context
    VkCtx->>VkDev: vkGetPhysicalDeviceProperties2 (UUID)
    VkDev-->>VkCtx: physical device UUID
    VkCtx->>CUDA: enumerate CUDA devices
    CUDA-->>VkCtx: device UUIDs
    VkCtx->>CUDA: match UUID & cudaSetDevice()
    CUDA-->>VkCtx: active device set

    App->>DevImg: create(vk_context, resolution, format)
    DevImg->>VkImg: vkCreateImage (exportable memory)
    VkImg-->>DevImg: VkImage handle
    DevImg->>VkMem: vkAllocateMemory (exportable device-local)
    VkMem-->>DevImg: VkDeviceMemory handle
    DevImg->>VkMem: vkGetMemoryFdProperties (external FD)
    VkMem-->>DevImg: memory file descriptor
    DevImg->>CUDA: cudaImportExternalMemory (FD)
    CUDA-->>DevImg: external memory handle
    DevImg->>CudaArr: cudaExternalMemoryGetMappedMipmappedArray
    CudaArr-->>DevImg: CUDA mipmapped array
    DevImg->>VkImg: vkCreateImageView
    VkImg-->>DevImg: VkImageView handle
    DevImg-->>App: DeviceImage ready (CUDA & Vulkan handles)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 15.15% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title accurately summarizes the two primary additions: basic shaders (via GLSLang compilation and embedding) and DeviceImage CUDA-Vulkan interop. It directly reflects the main changes across all file modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fm/televiz_m3a

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/viz/core/cpp/device_image.cpp`:
- Around line 336-376: The command buffer allocated via vkAllocateCommandBuffers
in the transition path can leak if any subsequent check_vk or Vulkan call
throws; wrap the sequence between allocation and vkFreeCommandBuffers in a
try/catch (or use RAII) so that vkFreeCommandBuffers(device, command_pool_, 1,
&cmd) is always called: after allocating cmd, run the vkBeginCommandBuffer,
vkCmdPipelineBarrier, vkEndCommandBuffer, vkQueueSubmit and vkQueueWaitIdle
inside a try block and in the catch (or a finally-equivalent) call
vkFreeCommandBuffers and rethrow the exception; reference the functions/checks
vkAllocateCommandBuffers, check_vk, vkBeginCommandBuffer, vkEndCommandBuffer,
vkQueueSubmit, vkQueueWaitIdle and vkFreeCommandBuffers to locate and protect
the allocated resource.
- Around line 117-172: DeviceImage::destroy currently tears down CUDA and Vulkan
resources without waiting for GPU work, risking use-after-free; fix by inserting
explicit synchronization before any resource frees: if any CUDA resources
(cuda_mipmapped_array_ or cuda_external_memory_) may be active call
cudaDeviceSynchronize() (best-effort, ignore non-fatal return) before
cudaFreeMipmappedArray/cudaDestroyExternalMemory, and after verifying ctx_ and
obtaining device call vkDeviceWaitIdle(device) (or the appropriate queue-fence
wait via ctx_) before destroying command_pool_, image_view_, image_, and freeing
memory_; keep the existing early-null checks but perform the CUDA sync prior to
CUDA frees and the Vulkan device wait prior to Vulkan teardown (use symbols
cuda_mipmapped_array_, cuda_external_memory_, cudaDeviceSynchronize,
ctx_->device(), and vkDeviceWaitIdle).

In `@src/viz/core/cpp/vk_context.cpp`:
- Line 194: The call to cudaSetDevice() inside match_cuda_device_to_vulkan()
during init() only sets the device for the init thread; subsequent CUDA calls
from other threads (e.g., DeviceImage::import_to_cuda(),
cudaImportExternalMemory()) will run on the wrong device. Fix by either
documenting that VkContext and all viz_core types are single-threaded and must
be used only from the init thread, or (preferred) add a defensive thread-local
device guard: store the matched CUDA device id in VkContext during
match_cuda_device_to_vulkan(), then ensure every CUDA entry point in viz_core
(for example DeviceImage::import_to_cuda(), any wrappers that call
cudaImportExternalMemory(), and other CUDA-facing methods) calls
cudaSetDevice(vk_context->matched_cuda_device_id) at start (or uses a
thread-local RAII guard that calls cudaSetDevice on construction) to guarantee
correct device affinity across threads.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c1e02f0b-aa6e-4202-bf75-44e0af115541

📥 Commits

Reviewing files that changed from the base of the PR and between a939dd1 and c991a64.

📒 Files selected for processing (19)

.github/workflows/build-ubuntu.yml
deps/README.md
src/viz/AGENTS.md
src/viz/CMakeLists.txt
src/viz/core/cpp/CMakeLists.txt
src/viz/core/cpp/device_image.cpp
src/viz/core/cpp/inc/viz/core/device_image.hpp
src/viz/core/cpp/inc/viz/core/vk_context.hpp
src/viz/core/cpp/vk_context.cpp
src/viz/core_tests/cpp/CMakeLists.txt
src/viz/core_tests/cpp/test_device_image.cpp
src/viz/shaders/CMakeLists.txt
src/viz/shaders/cpp/CMakeLists.txt
src/viz/shaders/cpp/compile_shader.cmake
src/viz/shaders/cpp/textured_quad.frag
src/viz/shaders/cpp/textured_quad.vert
src/viz/shaders_tests/CMakeLists.txt
src/viz/shaders_tests/cpp/CMakeLists.txt
src/viz/shaders_tests/cpp/test_shader_blobs.cpp

…a device - run_one_shot_layout_transition: wrap submit/wait in an RAII guard so the command buffer is freed on every exit path (otherwise a queue submit failure leaks one cmd per retry). - DeviceImage::destroy: cudaDeviceSynchronize before CUDA frees and vkDeviceWaitIdle before Vulkan teardown, so async work submitted by the caller has retired before the resources go away. - VkContext stores the matched CUDA device id and exposes it via cuda_device_id(); DeviceImage::import_to_cuda + ::destroy now call cudaSetDevice on the current thread before any CUDA API. cudaSetDevice is per-host-thread, so this protects users who create a DeviceImage on a worker thread. All 37 unit + 28 GPU tests pass; unit tests also pass under ASAN+UBSAN. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Jimver/cuda-toolkit@v0.2.19 hardcodes /x86_64/ in the NVIDIA repo URL, which fails on the ubuntu-22.04-arm matrix entry (cuda-nvcc-12-4 / cuda-cudart-12-4 packages don't exist for that arch on that path). Replace with a small composite action (.github/actions/setup-cuda) that picks /x86_64/ or /sbsa/ based on dpkg --print-architecture and installs cuda-nvcc-* + cuda-cudart-dev-* via apt. Used in both build-ubuntu and test-viz-sanitizers. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

viz_core links CUDAToolkit::cudart, but the experimental Windows CI runner doesn't have CUDA installed. With BUILD_VIZ=ON we hit "Could not find nvcc" at find_package(CUDAToolkit) time. No Windows-XR consumer for viz today, so flip it OFF for now and add CUDA install to build-windows.yml when we have a real reason to ship viz on Windows. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Jimver/cuda-toolkit works correctly on Windows — the /x86_64/ hardcoding bug only bites on Linux ARM. Runs NVIDIA's silent network installer and sets CUDA_PATH so find_package(CUDAToolkit) succeeds. Re-enables BUILD_VIZ=ON for the experimental Windows job. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

GPU runners have the NVIDIA driver (libcuda.so.1) but not the CUDA Toolkit (libcudart.so.12). After M3a's CUDA dependency landed, viz test binaries fail to load on the GPU runners with "cannot open shared object file: libcudart.so.12". Bundle libcudart.so.12 from the build host's CUDA install into the viz-tests-* artifact, then point LD_LIBRARY_PATH at the artifact dir when running tests. Same pattern as auditwheel for the Python wheel. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Reverses the bundle-libcudart-in-artifact hack from 2203957. Run the same setup-cuda composite action on the test-viz-gpu job: NVIDIA's apt postinst registers /etc/ld.so.conf.d/cuda-12-4.conf so libcudart lands on the standard ld.so search path with no LD_LIBRARY_PATH or artifact gymnastics needed. Symmetric with build-ubuntu, and we'd need cudart on the runner anyway for richer GPU tests in M3b+. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

device_image.cpp included <unistd.h> for ::close on the fd returned by vkGetMemoryFdKHR. MSVC has no <unistd.h>; build-windows fails with "Cannot open include file: 'unistd.h'". Wrap close in a tiny shim: <io.h>+_close on _WIN32, <unistd.h>+close elsewhere. The whole fd path is unreachable at runtime on Windows (vkGetMemoryFdKHR returns nullptr on that platform → import_to_cuda throws before memory_fd_ is set), but we still need a clean compile. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Switch CUDA::cudart -> CUDA::cudart_static. Makes the wheel and viz_*_tests artifacts self-contained: - Wheel: _viz.so has no dynamic libcudart.so.12 dep so auditwheel bundles nothing CUDA-related (release artifact stays clean). - Test artifacts: run on GPU runners that have only the NVIDIA driver (libcuda.so.1). The self-hosted runner's sudo policy disallows apt installs from a job step, so we can't install the toolkit there. - Drops the back-and-forth between bundling libcudart in the artifact vs. installing CUDA on the GPU runner — neither is needed now. Build host still needs the CUDA Toolkit for libcudart_static.a; setup-cuda already covers that on build-ubuntu / test-viz-sanitizers. Tradeoff: ~3 MB binary growth per consumer. Safe today because viz_core is the only CUDA-using component in the codebase. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Now that viz_core links libcudart_static (4eb4db1), the GPU runner no longer needs the CUDA Toolkit. The previous setup-cuda step was failing on the self-hosted runner anyway (sudo policy), and is now unnecessary. Test binaries depend only on the NVIDIA driver (libcuda.so.1) which is already present on the runner. Signed-off-by: Farbod Motlagh <fmotlagh@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai Bot reviewed May 1, 2026

View reviewed changes

Comment thread src/viz/core/cpp/device_image.cpp

Comment thread src/viz/core/cpp/device_image.cpp

Comment thread src/viz/core/cpp/vk_context.cpp

farbod-nv and others added 2 commits May 1, 2026 11:32

farbod-nv requested review from aristarkhovNV, jiwenc-nv and junjeechao May 1, 2026 20:25

farbod-nv changed the title ~~Add televiz basic shaders + DeviceImage CUDA-Vulkan interop~~ [TeleViz] Add televiz basic shaders + DeviceImage CUDA-Vulkan interop May 1, 2026

farbod-nv and others added 7 commits May 1, 2026 13:31

junjeechao approved these changes May 2, 2026

View reviewed changes

farbod-nv temporarily deployed to dev May 4, 2026 00:01 — with GitHub Actions Inactive

farbod-nv merged commit 4da8221 into main May 4, 2026
86 of 110 checks passed

farbod-nv deleted the fm/televiz_m3a branch May 4, 2026 18:46

coderabbitai Bot mentioned this pull request May 4, 2026

[TeleViz] Add QuadLayer + textured-quad pipeline + VizCudaArray #466

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TeleViz] Add televiz basic shaders + DeviceImage CUDA-Vulkan interop#451

[TeleViz] Add televiz basic shaders + DeviceImage CUDA-Vulkan interop#451
farbod-nv merged 10 commits into
mainfrom
fm/televiz_m3a

farbod-nv commented May 1, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 1, 2026 •

edited

Loading

Review skipped

Walkthrough

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

farbod-nv commented May 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

farbod-nv commented May 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 1, 2026 •

edited

Loading