[TeleViz] Add DisplayBackend + WindowBackend + Window smoke test example by farbod-nv · Pull Request #469 · NVIDIA/IsaacTeleop

farbod-nv · 2026-05-05T21:58:39Z

Adds DisplayBackend as an abstraction for final display path (Window, XR, offscreen)
Setup swapchain to be shared by all the display backend.
Adds smoke test televiz example (first example!)

Summary by CodeRabbit

New Features
- Interactive window rendering with GLFW and a window backend
- Offscreen backend for host readback
- Multi-layer tiling with aspect-aware layout and dynamic render-target resizing
- Compositor refactored to use pluggable display backends; session exposes should_close()
New Examples
- Added a window "smoke" demo showing a 2×2 quad-layer tiled layout
Tests
- New unit and integration tests for tile layout, window primitives, and session rendering
Chores
- CI: added X11/Wayland dev packages for Linux builds

Building blocks for DisplayMode::kWindow. No integration with VizSession / VizCompositor yet — that lands in commit 2. deps/third_party: - FetchContent GLFW 3.4 (gated on BUILD_VIZ). Docs / tests / examples / install disabled. Vulkan-only — no GL fallback target. viz/session/tile_layout.{hpp,cpp}: - tile_layout(layer_aspects, fb_size, padding) -> vector<TileSlot> - TileSlot = {outer, content}: outer is the equal-slice tile (used by the compositor as scissor in commit 2); content is the aspect-fit rectangle inside outer (used as the layer's per-view viewport). - Row-major grid: cols = ceil(sqrt(n)), rows = ceil(n/cols). Last column / row absorbs the framebuffer remainder so no pixels are unaddressed at non-divisible sizes. Padding inset on every tile. - Aspect-fit math letterboxes vertically when content_aspect < tile_aspect, horizontally otherwise. Clear color shows through the margins (no extra clear pass needed). - 10 unit tests cover n=0..9, padding interactions, square / 16:9 / vertical layers, exact-divisibility vs remainder, mixed aspects. viz/session/glfw_window.{hpp,cpp}: - Owns GLFWwindow + VkSurfaceKHR. - Process-wide GLFW init refcount via mutex+counter so multiple windows coexist (M7 multi-camera_viz). glfwInit failure throws — callers gate via window-environment probe (no-display CI skips). - Resize callback flips an atomic flag; consume_resized() returns it and clears in one op. The compositor will check this at frame start in commit 2 and recreate the swapchain. viz/session/swapchain.{hpp,cpp}: - VkSwapchainKHR + per-image acquire / render-done semaphore ring. - Hardcoded VK_PRESENT_MODE_FIFO_KHR (vsync). Surface format prefers B8G8R8A8_SRGB > R8G8B8A8_SRGB > runtime first format. - imageUsage = TRANSFER_DST only — we never render directly into swapchain images, only blit the intermediate framebuffer. Matches the "intermediate RT then blit" path the offscreen mode already uses. - Validates the chosen queue family supports present on the surface before doing any work (NVIDIA Linux always reports yes; throws loud if a stranger setup hits this path). - acquire_next_image() returns nullopt on out-of-date / suboptimal; present() returns false on the same. Compositor will branch to recreate(new_size) on either signal. - Per-image semaphore ring keeps in-flight frames from reusing a semaphore another in-flight image is still consuming. Tests: - 10 [unit] tile_layout tests. - 5 [gpu][window] tests for GlfwWindow + Swapchain that skip cleanly on no-display environments via glfwInit() && glfwVulkanSupported(). Verify construct + destroy, idempotent destroy, recreate at a new extent, validation rejection of null instance / zero dims. Build: 50/50 unit pass. Window tests register and skip cleanly on this no-display sandbox. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

VizSession + VizCompositor now drive a GLFW window through a Vulkan swapchain in DisplayMode::kWindow. kOffscreen behavior is unchanged. Public API additions (commit 2 of M4): - viz::Rect2D in viz_types.hpp — Vulkan-free 2D pixel rect. - ViewInfo::viewport — pixel rect in the framebuffer the layer should draw into for that view. Compositor fills it; layer binds it via vkCmdSetViewport. In window mode it's the layer's aspect-fit content rect inside its tile; in offscreen it's the full target; in M5 it'll be the per-eye SBS half-rect from the OpenXR runtime. - viz::bind_view_viewport(cmd, view) — standard mapping helper. Layers call this once per view inside record(). No y-flip, depth 0..1. - LayerBase::aspect_ratio() — virtual, returns optional<float>. The compositor uses this in window mode to compute per-layer content rects via tile_layout. nullopt = "fill the tile". XR ignores it. - VizSession::should_close() — true when the user asked the window to close. Always false in kOffscreen / kXr. - viz/session/display_mode.hpp — split out so VizSession::Config and VizCompositor::Config can both reference it without an include cycle (VizSession owns VizCompositor). LayerBase contract (record() doc comment): DO bind viewport per view via vkCmdSetViewport. DO NOT bind scissor — the compositor sets it. Overriding scissor breaks tile isolation in window mode and per-eye comp layers in XR. QuadLayer: - aspect_ratio() returns width / height of its config resolution. - record() drops its local viewport / scissor binds (compositor owns scissor; layer binds viewport from each ViewInfo). Iterates `views` and draws once per view — 1 iteration in window/offscreen, 2 in XR stereo. Same dispatch shape across modes. Swapchain: - AcquiredImage gains a render_done semaphore so the compositor can signal it during render submit and present() waits on it. Matches the standard image_available -> render_done -> present chain. VizCompositor: - Config: mode + swapchain* fields. kWindow requires non-null swapchain (validated in create()). - render() kWindow path: acquire swapchain image -> render to intermediate -> blit (TRANSFER_DST barrier -> vkCmdBlitImage -> PRESENT_SRC barrier) -> submit waiting on image_available, signaling render_done -> present. Out-of-date / suboptimal returns silently; caller (VizSession) handles via consume_resized(). - Per-layer scissor pre-bind. Per-layer ViewInfo with viewport overridden to tile.content (window) or full-fb (offscreen). Layers see exactly what they need to draw their region. - handle_resize(new_size): drain GPU, recreate swapchain, recreate intermediate render target. (0, 0) is a no-op (window minimized). - readback_staging only allocated in kOffscreen — saves one buffer + one allocation in kWindow. VizSession: - init() kWindow path: glfwGetRequiredInstanceExtensions -> VkContext::Config -> VkContext::init -> GlfwWindow::create -> Swapchain::create -> VizCompositor::create with mode + swapchain. destroy() tears down in reverse order (compositor before swapchain before window before context — surface lifetime matters). - render() polls GLFW events and consumes the resize flag at frame start, calling compositor->handle_resize() when set. - readback_to_host() now throws on non-kOffscreen (was silent until reaching the staging buffer). Tests: - test_viz_session: rejection-test only kXr now (kWindow is wired). kWindow validation lives in [gpu][window] tests. Build: 50/50 unit tests pass. [gpu][window] tests still register and skip cleanly without a display. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Runnable demo + a [gpu][window] integration test that exercises the full kWindow render loop end to end. examples/televiz/window_smoke: - Opens a 1024x768 GLFW window in DisplayMode::kWindow. - Adds 4 QuadLayers (256x256 each) with solid red / green / blue / white CUDA-fed textures. - Compositor tiles them 2x2 row-major, aspect-preserving — quads fill their tiles since 1:1 aspect matches each tile's aspect at this resolution. Letterbox would kick in if camera frames didn't match the tile shape (visible later when M7 wires camera_streamer's monitor mode through this path). - Loops session->render() until the user closes the window. CMake: - New examples/televiz/CMakeLists.txt orchestrator + window_smoke subdir. Linked via viz::session + viz::layers (CUDA::cudart pulled transitively from viz_layers). - Top-level CMakeLists adds examples/televiz under the existing if(BUILD_EXAMPLES) gate, additionally guarded by if(BUILD_VIZ). Standard examples build (Holoscan / OXR / etc.) works unchanged with BUILD_VIZ=OFF. Tests (test_window_primitives.cpp): - New "VizSession kWindow renders multiple QuadLayers without errors" [gpu][window] case. Creates a kWindow session, registers 3 QuadLayers (exercises the row-major 2-col x 2-row grid with one empty cell), submits solid colors, runs 8 frames. Verifies no exceptions, frame_index advances, resolution matches config. - No readback in kWindow (swapchain present path doesn't expose host bytes). The test relies on validation layers (debug build) to catch spec violations — same gate as the offscreen tests. Build: 50/50 unit tests pass. 6 [gpu][window] tests register and skip cleanly on no-display hosts. The integration test joins the existing primitives tests on the standard SKIP gate (is_gpu_available + window_environment_available). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

VizCompositor::render() and VizSession::init() were both growing mode-specific branches. M5's XR backend would have added at least 5 more (xrWaitFrame loop, xrLocateViews, XR swapchain handling, XR composition layer submission, instance/device extension list). Lift all of that behind a polymorphic DisplayBackend so each backend is a self-contained class and the compositor / session stay mode-agnostic. Public additions: - viz/session/display_backend.hpp — interface. Per-frame contract: begin_frame -> render pass -> record_post_render_pass -> submit -> end_frame. Plus required_*_extensions (called pre-VkContext-init), poll_events / should_close / consume_resized / resize / readback. Unimplemented overrides default to sensible no-ops so the smallest backend (offscreen) can override only what it actually owns. - viz/session/offscreen_backend.{hpp,cpp} — owns intermediate RT + readback staging buffer + a dedicated cmd pool/buffer for the readback path. begin_frame/record_post/end_frame are no-ops; only readback_to_host has substance. - viz/session/window_backend.{hpp,cpp} — owns GlfwWindow + Swapchain + intermediate RT. required_instance_extensions returns glfwGetRequiredInstanceExtensions; required_device_extensions adds VK_KHR_swapchain. record_post_render_pass blits intermediate -> swapchain image with the right transitions; end_frame presents. poll_events / should_close / consume_resized / resize forward to GlfwWindow. Compositor: - VizCompositor::Config slimmed: just clear_color (mode + swapchain* fields gone — the backend owns all that). - create() takes a DisplayBackend& by ref; stored as non-owning pointer. - render() has zero `if (mode == ...)` branches. begin_frame, render pass on backend.render_target(), per-layer scissor + view, end render pass, backend.record_post_render_pass, submit (waits = layers' cuda_done_writing + frame.wait_before_render; signals = frame.signal_after_render), backend.end_frame, fence wait. - Compositor no longer owns RenderTarget or readback staging — both moved to backends. Compositor now owns just frame_sync + cmd pool/buffer. - handle_resize() is gone — backends handle their own resize via consume_resized / resize, driven by VizSession. Session: - One unique_ptr<DisplayBackend> backend_ replaces window_ + swapchain_ (and the conditional compositor::Config fields). Mode dispatch is a make_backend(config) factory. - init flow: make_backend -> read its required extensions -> VkContext::init -> backend.init -> VizCompositor::create. Reverse order on destroy (compositor -> backend -> ctx). - render() polls backend events + handles resize at frame start. begin_frame populates FrameInfo.resolution from backend. - readback_to_host / should_close forward to the backend. Swapchain: - Add image_at(index) accessor (used by WindowBackend during the post-render-pass blit / barriers). LOC: net +280 across 11 files. Mostly relocation — the readback staging code moved from VizCompositor to OffscreenBackend, the window/swapchain code moved from VizSession + VizCompositor to WindowBackend. Build: 50/50 unit tests pass. 6 [gpu][window] tests register and skip cleanly without display. window_smoke example builds. M5 readiness: XrBackend is a single new class implementing the same interface. No compositor / session changes needed when it lands. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Empirical fixes from running the window_smoke example. The kWindow path was technically correct after (1/3)–(4/4), but interactive use exposed several behavioral issues: - FIFO present mode pinned the surface to vblank in a way that contended with the desktop compositor, lagging the entire system while smoke was running. - Resize triggered a full Swapchain + RenderTarget recreate per GLFW resize event (~60/sec during drag), tanking fps to ~12 with visible hiccups. - On OUT_OF_DATE, an early return from begin_frame skipped the frame pacer and produced 56kHz spin loops. - The session passed a stale extent into backend.resize() so the size-match early-out compared the wrong values. Patterns adopted from Holoviz (modules/holoviz/src/) and nvpro_core2 (swapchain/application). Their smooth-resize behavior comes from a small set of techniques we now match. Swapchain (swapchain.cpp): - Prefer MAILBOX over FIFO. FIFO is the universal fallback. MAILBOX decouples present from vblank — eliminates the system-wide UI lag on NVIDIA Linux + Wayland. - vkCreateSwapchainKHR receives the old VkSwapchainKHR via oldSwapchain so the driver recycles internal resources. Recreate is now substantially cheaper than full destroy/create. - acquire_next_image: VK_SUBOPTIMAL_KHR returns the image (still valid; WSI scales on present). Only OUT_OF_DATE returns nullopt so the caller knows to force-recreate. - present: same SUBOPTIMAL passthrough. - image_at(index) accessor for the backend's blit / barrier path. RenderTarget (render_target.{hpp,cpp}): - Add resize(new_size) that destroys color/depth/framebuffer and rebuilds them at the new extent KEEPING the render pass alive. Render pass compatibility doesn't depend on extent, so layer pipelines built against the original render pass stay valid. Saves ~1ms per resize and avoids invalidating cached render-pass-keyed state. WindowBackend (window_backend.{hpp,cpp}): - resize() queries window_->framebuffer_size() directly — the size is the backend's concern, not the caller's. Fixes the bug where VizSession passed the stale swapchain extent. - Frame pacer: sleep_until at the START of begin_frame (not end of end_frame). Always runs once per render iteration, including when begin_frame returns nullopt for OUT_OF_DATE recovery. Eliminates the spin loop. Period queried from primary monitor's GLFW video mode at init; falls back to 60 Hz on headless / virtual displays. - No throttle on resize — recreate per event matches Holoviz / nvpro_core2. With oldSwapchain + RenderTarget::resize, per-event recreate is ~5-10ms; drag holds ~30-45 fps and recovers cleanly. - OUT_OF_DATE in begin_frame triggers an immediate resize() (no throttle to skip past — we cannot render without a working swapchain). - Add Config::target_fps to override the queried refresh rate. VizSession (viz_session.cpp): - Pass Resolution{} hint to backend.resize() — backend self-discovers the new size from its window. The hint is kept on the interface for backends that prefer caller-driven sizing. window_smoke (main.cpp): - Print FPS + frame time once per second so users can quantify "laggy" without running ctest. Verified: idle holds 60 fps cleanly; resize drops to 30-45 fps during the drag and recovers to 60 immediately after. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-05T21:58:50Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0b1433cd-de6e-43cc-907f-80993ba8edcb

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a DisplayBackend abstraction and concrete backends (WindowBackend using GLFW + swapchain, OffscreenBackend) and related utilities (GlfwWindow, Swapchain, tile_layout). Refactors VizCompositor and VizSession to delegate render-target ownership and per-frame lifecycle to DisplayBackend. Extends layer APIs for per-view viewports and optional aspect_ratio. Introduces RenderTarget::resize and Rect2D/ViewInfo.viewport. Adds a televiz example (GLFW + CUDA buffers) and FetchContent GLFW dependency; updates CI apt packages and tests for new session/window primitives.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant VizSession as VizSession
    participant Backend as DisplayBackend\n(WindowBackend / OffscreenBackend)
    participant Compositor as VizCompositor
    participant Layer as LayerBase / QuadLayer
    participant Vulkan as Vulkan API

    App->>+VizSession: init(config)
    VizSession->>+Backend: make_backend(config) / backend->init(ctx, preferred_size)
    Backend->>Vulkan: create swapchain / render_target / resources
    VizSession->>Compositor: create(ctx, backend, config)

    loop Per Frame
      App->>VizSession: render(layers)
      VizSession->>Backend: begin_frame(predicted_display_time)
      Backend-->>VizSession: Frame{views, wait_before_render, wait_stage, signal_after_render, backend_token}
      VizSession->>Compositor: render(layers)
      Compositor->>Layer: record(cmd, views, target)
      Layer->>Vulkan: bind per-view viewport / draw
      Compositor->>Vulkan: submit with frame wait/signal semaphores
      Vulkan->>Backend: present or provide image for readback
      Backend->>VizSession: end_frame()
    end

    App->>VizSession: destroy()
    VizSession->>Backend: destroy()
    Backend->>Vulkan: release resources

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: adding a DisplayBackend abstraction, WindowBackend implementation, and a window smoke test example for the TeleViz visualization system.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fm/televiz_m4

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/televiz/window_smoke/main.cpp`:
- Around line 83-92: The code currently only catches exceptions from
VizSession::create, leaking CUDA device buffers if make_solid_color_buffer,
submit_solid, or session->render throw; modify main so the entire setup + render
sequence (calls to make_solid_color_buffer, submit_solid, any buffer allocations
and session->render) is executed inside a single try block with a matching catch
that performs cleanup, or refactor CUDA allocations into RAII types (e.g., a
wrapper class or std::unique_ptr with a custom deleter) so buffers are freed
automatically on exception; locate and update usages of make_solid_color_buffer,
submit_solid, and session->render and ensure the cleanup loop that frees device
buffers is invoked from the catch or encoded into the RAII destructors.

In `@src/viz/core/cpp/render_target.cpp`:
- Around line 147-164: RenderTarget::resize currently destroys attachments and
updates resolution_ before calling
create_color_image/create_depth_image/create_framebuffer, so if any creation
throws the object is left broken; fix by saving the old resolution, call
destroy_attachments(), then perform the three create_* calls inside a try block,
and in the catch(...) restore resolution_ to the saved old value and re-create
the old attachments (call
create_color_image/create_depth_image/create_framebuffer with a Config using the
old resolution) before rethrowing the exception so the object is left in a valid
state; reference RenderTarget::resize, resolution_, destroy_attachments(),
create_color_image(), create_depth_image(), create_framebuffer().

In `@src/viz/session/cpp/viz_compositor.cpp`:
- Around line 146-153: After backend_->begin_frame() returns a frame, ensure the
acquired frame is always released: introduce an RAII guard (e.g., ScopedFrame)
that takes the returned frame and calls backend_->end_frame(*frame) in its
destructor unless explicitly released, or wrap the recording/submit/finalize
logic in a try/catch that calls backend_->end_frame(*frame) (or
backend_->abort_frame(*frame) if available) before rethrowing; apply this
protection around the use of the local variable frame in the
begin_frame()/end_frame() region and duplicate the same fix for the second
occurrence covering lines 215-280 so no exception can leave a backend frame
stranded.

In `@src/viz/session/cpp/viz_session.cpp`:
- Around line 165-176: begin_frame() is fabricating frame metadata instead of
asking the backend, so change VizSession::begin_frame() to query and persist the
backend/compositor frame state (use whichever backend
API/compositor_->beginFrame()/acquireFrame() or similar exists) and copy that
returned FrameInfo into current_frame_info_ (including should_render,
resolution, views, predicted_display_time) only when the backend actually
started/accepted the frame; set frame_in_progress_ true only on successful
backend acquisition and ensure end_frame()/render() read this persisted
backend-provided FrameInfo rather than constructing placeholders.
- Around line 216-228: The event polling and resize logic currently inside
VizSession::render() (calls to backend_->poll_events(),
backend_->consume_resized(), and backend_->resize(Resolution{})) must be
extracted into a shared pre-frame helper (e.g., VizSession::pump_events() or
pre_frame()) and invoked by both the render() path and the explicit
begin_frame()/end_frame() API; update render() to call the new helper and add a
call to the same helper at the start of begin_frame() so explicit frame-loop
users also pump events and handle resize/close notifications.

In `@src/viz/session/cpp/window_backend.cpp`:
- Around line 57-86: In WindowBackend::required_instance_extensions(), remove
the direct glfwInit()/glfwTerminate() calls and instead call retain_glfw()
before invoking glfwGetRequiredInstanceExtensions() and release_glfw() after; if
retain_glfw() fails, throw the same runtime_error used today for glfwInit
failure, and ensure release_glfw() is called on all exit paths (success and all
error paths) — use RAII or try/finally-style cleanup to guarantee release_glfw()
runs; reference retain_glfw() and release_glfw() (the mutex-protected refcount
helpers in glfw_window.cpp) and replace both direct glfwInit()/glfwTerminate()
usages in this function.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 76bcb032-0fda-48d9-9b65-eae58519451b

📥 Commits

Reviewing files that changed from the base of the PR and between 208395c and af0204c.

📒 Files selected for processing (32)

CMakeLists.txt
deps/third_party/CMakeLists.txt
examples/televiz/CMakeLists.txt
examples/televiz/window_smoke/CMakeLists.txt
examples/televiz/window_smoke/main.cpp
src/viz/core/cpp/inc/viz/core/render_target.hpp
src/viz/core/cpp/inc/viz/core/viz_types.hpp
src/viz/core/cpp/render_target.cpp
src/viz/layers/cpp/inc/viz/layers/layer_base.hpp
src/viz/layers/cpp/inc/viz/layers/quad_layer.hpp
src/viz/layers/cpp/quad_layer.cpp
src/viz/session/cpp/CMakeLists.txt
src/viz/session/cpp/glfw_window.cpp
src/viz/session/cpp/inc/viz/session/display_backend.hpp
src/viz/session/cpp/inc/viz/session/display_mode.hpp
src/viz/session/cpp/inc/viz/session/glfw_window.hpp
src/viz/session/cpp/inc/viz/session/offscreen_backend.hpp
src/viz/session/cpp/inc/viz/session/swapchain.hpp
src/viz/session/cpp/inc/viz/session/tile_layout.hpp
src/viz/session/cpp/inc/viz/session/viz_compositor.hpp
src/viz/session/cpp/inc/viz/session/viz_session.hpp
src/viz/session/cpp/inc/viz/session/window_backend.hpp
src/viz/session/cpp/offscreen_backend.cpp
src/viz/session/cpp/swapchain.cpp
src/viz/session/cpp/tile_layout.cpp
src/viz/session/cpp/viz_compositor.cpp
src/viz/session/cpp/viz_session.cpp
src/viz/session/cpp/window_backend.cpp
src/viz/session_tests/cpp/CMakeLists.txt
src/viz/session_tests/cpp/test_tile_layout.cpp
src/viz/session_tests/cpp/test_viz_session.cpp
src/viz/session_tests/cpp/test_window_primitives.cpp

GLFW 3.4 defaults Wayland support ON on Linux but the build needs wayland-scanner + libwayland-dev present at configure time. CI runners (and minimal containers) often lack these — the FetchContent build fails with "Failed to find wayland-scanner" before any of our code is compiled. Match nvpro_core2's pragmatism (third_party/CMakeLists.txt:27 — "GLFW_BUILD_WAYLAND OFF"): X11 only by default. Xwayland covers Wayland sessions for X11 clients in practice, so the loss is limited to Wayland-only setups without Xwayland (rare in 2026). Holoscan SDK takes the opposite approach (cmake/deps/glfw_rapids.cmake: 68) — FATAL_ERROR if Wayland headers are missing — and demands devs install seven X11 sub-libraries. Too aggressive for a project of our size with mostly-NVIDIA-workstation users. Pure-Wayland users without Xwayland can re-enable: cmake -DGLFW_BUILD_WAYLAND=ON ... (plus apt install libwayland-dev wayland-scanner libxkbcommon-dev) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

PR review feedback. Cuts: - Milestone references in code (M4/M5/M7) — those belong in commit messages, not source files. - Cross-references to Holoviz / nvpro_core2 / camera_streamer in comments — those belong in PR descriptions. - Prose paragraphs that just paraphrase the code below them. Net -250 lines across 18 files, no behavior change. 50/50 unit tests still pass; build clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

CodeRabbit pass. window_smoke (main.cpp): - CudaDeviceBuffer RAII wraps cudaMalloc/cudaFree. Setup + render loop now run inside one try/catch; partial allocations free automatically on exception. RenderTarget::resize: - Save old extent, attempt new attachments inside try; on failure restore the old attachments so the object stays usable. If the restore itself fails, fall through to a clean empty state and rethrow. VizCompositor::render: - FrameGuard RAII calls backend->end_frame() if the function unwinds before reaching the explicit end_frame call, so an acquired swapchain image isn't stranded on exception. CodeRabbit also flagged a "second occurrence" at lines 215-280 — not actually a second begin_frame, just the submit-info/wait code, so single guard covers it. VizSession: - Extract pump_events() (poll + resize) into a private helper, call it from begin_frame() instead of only render(). Explicit begin_frame()/end_frame() loop users now get the same event / resize handling that render() does. GlfwWindow: - Promote retain_glfw / release_glfw from anonymous namespace to public static GlfwWindow::retain / release. Lets external callers (WindowBackend) use the same refcount instead of bare glfwInit/Terminate. WindowBackend::required_instance_extensions: - Use GlfwWindow::retain/release with an RAII guard so the GLFW refcount is balanced on every exit path (success and exception). Skipped (with rationale): - VizSession::begin_frame "fabricates FrameInfo": the placeholder identity view is intentional. Backend per-eye view info lives in per-layer ViewInfo built inside compositor::render — exposing it via the public FrameInfo requires the XR backend's actual per-eye API. Revisit when XR lands. CI / GLFW deps: - Re-enable Wayland in deps/third_party/CMakeLists.txt (X11 + Wayland is GLFW 3.4's Linux default). Override knob documented. - Add the GLFW build deps to .github/workflows/build-ubuntu.yml: libxrandr-dev libxinerama-dev libxcursor-dev libxi-dev libxext-dev libxkbcommon-dev libwayland-dev wayland-protocols. Without these the FetchContent build fails at "RandR headers not found" / "wayland-scanner not found". Build clean, 50/50 unit tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

farbod-nv · 2026-05-05T22:48:44Z

@coderabbitai review

coderabbitai · 2026-05-05T22:48:49Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Two real correctness bugs from the second review pass. abort_frame on the DisplayBackend interface: - Compositor's FrameGuard previously called end_frame() on unwind, which presents waiting on signal_after_render. If the exception fired before our vkQueueSubmit ran, that semaphore was never signaled, so present blocks on a semaphore that never signals. - New abort_frame() is the "drop this frame, recover next" hook. WindowBackend marks the swapchain dirty; the next begin_frame recreates it before doing anything else, retiring all images including the one we held. OffscreenBackend defaults to no-op. - FrameGuard now calls abort_frame instead of end_frame on destructor unwind. Force-recreate path for OUT_OF_DATE: - begin_frame's acquire-failure handler used to call resize(), which short-circuited when window framebuffer size matched current swapchain extent. WSI can fire OUT_OF_DATE for non-size reasons (monitor reconfig, format change), so the size-match guard left the swapchain stuck. - New WindowBackend::force_recreate() bypasses the size-match check. Called both from the OUT_OF_DATE acquire path and from the needs_recreate_ flag set by abort_frame. Doc-only: - Note on swapchain.cpp's present-support check that physical-device selection happens before the surface exists; multi-GPU hosts where the display isn't on the Vulkan-preferred device need the caller to pin physical_device_index explicitly. Proper fix (presentation-support callback through VkContext::Config) deferred until a real user hits it. - VizSession::Config::external_context now documents that the caller-supplied context must already have the backend's required extensions enabled and must support present on the eventual surface in kWindow mode. VizSession does not retroactively enable them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/viz/session/cpp/viz_session.cpp (1)
69-84: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

VizSession::Config::required_extensions is ignored here.

vk_cfg is rebuilt from backend-required extensions only, so any caller-supplied extensions on VizSession::Config stop affecting owned VkContext creation after this refactor. That turns part of the public config into a no-op and can regress sessions that need extra Vulkan extensions for interop or custom layers. Merge the session-configured extensions into vk_cfg before owned_ctx_->init(vk_cfg).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/viz/session/cpp/viz_session.cpp` around lines 69 - 84, The code
overwrites vk_cfg with only backend-required extensions and ignores
VizSession::Config::required_extensions; modify the owned context init path so
that after filling vk_cfg.instance_extensions and vk_cfg.device_extensions from
backend_->required_instance_extensions() and
backend_->required_device_extensions(), you merge any caller-specified
extensions from VizSession::Config::required_extensions (and any instance/device
extension lists on config_) into vk_cfg before calling owned_ctx_->init(vk_cfg),
preserving duplicates/ordering as appropriate and ensuring external_context
logic (config_.external_context and its is_initialized check) remains unchanged.

♻️ Duplicate comments (1)

src/viz/session/cpp/inc/viz/session/viz_compositor.hpp (1)
42-45: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

This API hides the real frame acquisition state.

DisplayBackend::begin_frame() is now the source of truth for views, sync, and “skip this frame”, but VizCompositor::render() acquires that internally and exposes none of it. The explicit VizSession::begin_frame() / end_frame() API therefore has to fabricate FrameInfo, so callers can observe should_render, resolution, and views for a frame the backend later rejects during resize/minimize. Split acquisition from render, or return/persist the accepted frame metadata before begin_frame() returns.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/viz/session/cpp/inc/viz/session/viz_compositor.hpp` around lines 42 - 45,
VizCompositor currently hides the real frame acquisition state because
DisplayBackend::begin_frame() is the source of truth but VizCompositor::render()
calls it internally; update the API so acquisition is split from submission: add
a new method (e.g., VizCompositor::acquireFrame() or VizSession::begin_frame()
must return/persist a FrameInfo) that calls DisplayBackend::begin_frame(),
captures and returns the accepted frame metadata (should_render, resolution,
views) even if backend later rejects the frame, and change
VizCompositor::render() to take that FrameInfo (or to use the persisted state)
instead of calling begin_frame() itself; also ensure the existing
VizSession::begin_frame()/end_frame() flow uses the new acquire/persisted
FrameInfo so callers can observe frame state prior to render.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/viz/session/cpp/viz_session.cpp`:
- Around line 69-84: The code overwrites vk_cfg with only backend-required
extensions and ignores VizSession::Config::required_extensions; modify the owned
context init path so that after filling vk_cfg.instance_extensions and
vk_cfg.device_extensions from backend_->required_instance_extensions() and
backend_->required_device_extensions(), you merge any caller-specified
extensions from VizSession::Config::required_extensions (and any instance/device
extension lists on config_) into vk_cfg before calling owned_ctx_->init(vk_cfg),
preserving duplicates/ordering as appropriate and ensuring external_context
logic (config_.external_context and its is_initialized check) remains unchanged.

---

Duplicate comments:
In `@src/viz/session/cpp/inc/viz/session/viz_compositor.hpp`:
- Around line 42-45: VizCompositor currently hides the real frame acquisition
state because DisplayBackend::begin_frame() is the source of truth but
VizCompositor::render() calls it internally; update the API so acquisition is
split from submission: add a new method (e.g., VizCompositor::acquireFrame() or
VizSession::begin_frame() must return/persist a FrameInfo) that calls
DisplayBackend::begin_frame(), captures and returns the accepted frame metadata
(should_render, resolution, views) even if backend later rejects the frame, and
change VizCompositor::render() to take that FrameInfo (or to use the persisted
state) instead of calling begin_frame() itself; also ensure the existing
VizSession::begin_frame()/end_frame() flow uses the new acquire/persisted
FrameInfo so callers can observe frame state prior to render.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c05bb26b-23c6-4adf-b556-2eaf8c31cca7

📥 Commits

Reviewing files that changed from the base of the PR and between af0204c and 0f679ca.

📒 Files selected for processing (22)

.github/workflows/build-ubuntu.yml
deps/third_party/CMakeLists.txt
examples/televiz/window_smoke/main.cpp
src/viz/core/cpp/inc/viz/core/render_target.hpp
src/viz/core/cpp/render_target.cpp
src/viz/session/cpp/glfw_window.cpp
src/viz/session/cpp/inc/viz/session/display_backend.hpp
src/viz/session/cpp/inc/viz/session/display_mode.hpp
src/viz/session/cpp/inc/viz/session/glfw_window.hpp
src/viz/session/cpp/inc/viz/session/offscreen_backend.hpp
src/viz/session/cpp/inc/viz/session/swapchain.hpp
src/viz/session/cpp/inc/viz/session/tile_layout.hpp
src/viz/session/cpp/inc/viz/session/viz_compositor.hpp
src/viz/session/cpp/inc/viz/session/viz_session.hpp
src/viz/session/cpp/inc/viz/session/window_backend.hpp
src/viz/session/cpp/offscreen_backend.cpp
src/viz/session/cpp/swapchain.cpp
src/viz/session/cpp/viz_compositor.cpp
src/viz/session/cpp/viz_session.cpp
src/viz/session/cpp/window_backend.cpp
src/viz/session_tests/cpp/test_viz_session.cpp
src/viz/session_tests/cpp/test_window_primitives.cpp

🚧 Files skipped from review as they are similar to previous changes (1)

src/viz/core/cpp/inc/viz/core/render_target.hpp

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

GLFW's glfw3.h pulls <GL/gl.h> by default unless GLFW_INCLUDE_NONE is defined. GLFW_INCLUDE_VULKAN alone only adds vulkan.h — it does not suppress the OpenGL include. CI runners without libgl-dev fail to build viz_session. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Two ordering bugs in the deferred-recreate path: 1. Reset command buffer before begin_frame(). A prior frame that threw mid-recording leaves the command buffer in RECORDING state with references to the framebuffer that begin_frame may then destroy via force_recreate(). Vulkan forbids destroying a framebuffer while a recording command buffer references it. 2. force_recreate() now returns bool. Previously needs_recreate_ was cleared unconditionally, but the recreate no-ops when the window is minimized (extent 0,0). The dirty flag was lost and the next acquire ran on a stale swapchain. Clear the flag only on successful recreate; mark dirty on the OUT_OF_DATE path too. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Move frame_sync_->reset() to immediately before submit_or_signal_fence(). The previous placement reset the fence then ran semaphore-vector construction and layer->get_wait_semaphores(), both of which can throw. A throw in that window left the fence reset but never signaled, so the next render() blocked forever at frame_sync_->wait(). The earlier comment claimed protection against exactly this failure mode but didn't deliver it — the reset has to be the last thing before submit. WindowBackend::end_frame now sets needs_recreate_ when Swapchain::present returns false (VK_ERROR_OUT_OF_DATE_KHR). Previously the result was discarded and recovery was deferred to the next acquire, which made resize behavior brittle. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Previously the top-of-frame reset only handled the next render() — but VizSession::pump_events() runs between calls and can call backend_->resize() which destroys framebuffer attachments. If a prior render() threw mid-recording, the cmd buffer holds stale framebuffer references and pump_events triggers UB. Two changes: 1. RAII guard at top of render() resets the cmd buffer on every exit path (success or exception). This guarantees INITIAL state on return, so pump_events can safely destroy resources. 2. Move the trailing frame_sync_->wait() to before backend_->end_frame(). Without this, a throw between submit and wait would leave the cmd buffer in PENDING state when the guard runs (UB to reset). After the wait, end_frame can throw and the cmd buffer is in EXECUTABLE (resettable). Synchronous-frame contract preserved. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Spell out that end_frame runs after both the submit and the in-flight fence wait, so signal_after_render is signaled and vkQueuePresentKHR is safe. Also note that throws between submit and end_frame route through abort_frame instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

next_frame_deadline_ was initialized to now(), so begin_frame's first iteration added frame_period_ and slept ~16ms before rendering anything — visible as a stall when the window opens. Initialize one period in the past so the first += lands at now(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

readback_to_host's default body throws std::runtime_error. Linux builds got the header transitively; MSVC didn't. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

farbod-nv and others added 5 commits May 5, 2026 13:37

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

farbod-nv and others added 2 commits May 5, 2026 15:15

farbod-nv requested review from jiwenc-nv, junjeechao and nv-jakob May 5, 2026 22:30

farbod-nv requested a review from aristarkhovNV May 5, 2026 22:49

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

farbod-nv and others added 5 commits May 5, 2026 15:58

viz: clang-format pass

b5d7bf7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

farbod-nv had a problem deploying to dev May 5, 2026 23:23 — with GitHub Actions Error

farbod-nv and others added 2 commits May 5, 2026 16:26

farbod-nv temporarily deployed to dev May 5, 2026 23:41 — with GitHub Actions Inactive

viz/session: add missing <stdexcept> to display_backend.hpp

d8546ec

readback_to_host's default body throws std::runtime_error. Linux builds got the header transitively; MSVC didn't. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

farbod-nv temporarily deployed to dev May 6, 2026 00:06 — with GitHub Actions Inactive

junjeechao approved these changes May 6, 2026

View reviewed changes

farbod-nv merged commit 926f144 into main May 6, 2026
42 checks passed

farbod-nv deleted the fm/televiz_m4 branch May 6, 2026 01:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TeleViz] Add DisplayBackend + WindowBackend + Window smoke test example#469

[TeleViz] Add DisplayBackend + WindowBackend + Window smoke test example#469
farbod-nv merged 17 commits into
mainfrom
fm/televiz_m4

farbod-nv commented May 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 5, 2026 •

edited

Loading

Review skipped

Walkthrough

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

farbod-nv commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

farbod-nv commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

farbod-nv commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

farbod-nv commented May 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited

Loading