[pull] master from libretro:master#980
Merged
pull[bot] merged 11 commits into Alexandre1er:master on Apr 30, 2026
The GDI video driver previously only worked with RGUI and had a stub
font driver returning NULL for glyph metrics. This commit brings it
up to feature parity with d3d8/d3d9 for the menu/widget/font/OSD
surfaces while preserving Win95/98 backwards compatibility.
gfx/drivers/gdi_gfx.c
Display driver (gfx_display_ctx_gdi)
- draw: solid quads via FillRect+cached HBRUSH, translucent
solids via 1x1 premultiplied AlphaBlend, per-corner gradients
via software bilinear interpolation into a scratch DIB (alpha
interpolated per-pixel alongside RGB so widget drop-shadows
that hold RGB constant and animate alpha top->bottom fade
correctly), and textures via per-texture DIB + AlphaBlend
with optional RGB tint through a scratch DIB.
- draw recognises both coord conventions: plain quad
(draw->x/y/w/h) and custom geometry (coords->vertex +
coords->tex_coord) used by gfx_display_draw_texture_slice
(9-patch) and gfx_display_draw_bg (menu wallpapers).
- draw->scale_factor applied as centred scaling around the dst
rect midpoint on the plain-quad path only (XMB selected-tab
icon zoom). Slice path leaves scale_factor uninitialised so
we must not read it there.
- get_default_vertices/tex_coords return real BL,BR,TL,TR
arrays in 0..1 space so gfx_display_draw_bg gets non-zero
coords (otherwise MaterialUI/XMB wallpapers degenerate to a
0x0 rect).
- scissor_begin/end via SaveDC/RestoreDC + CreateRectRgn +
SelectClipRgn. blend_begin/end are no-ops (AlphaBlend is
per-call).
Font driver (modelled on d3d8/gl1)
- A8 atlas mirrored into a BGRA premultiplied DIB; per-glyph
AlphaBlend hot path for plain white, scratch-DIB tinted
path for arbitrary RGB. All five vtable functions
implemented (get_glyph and get_message_width were NULL in
the legacy driver, breaking text layout in widgets and the
stats overlay).
- Pre-Win98 fallback: TextOut + system font when AlphaBlend
isn't available. Glyph metrics still come from the font
backend so layout stays consistent.
gdi_load_texture
- Premultiplies alpha at load time and detects fully-opaque
textures (lets the draw fast-path skip alpha math).
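A minimal sketch of the load-time step described above, assuming pixels are packed as 0xAARRGGBB words. The helper name premultiply_bgra is hypothetical and this is not the driver's actual code; it only illustrates premultiplying in place while tracking whether every pixel was opaque.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: premultiply 0xAARRGGBB pixels in place and
 * return true when every pixel had alpha == 255. */
static bool premultiply_bgra(uint32_t *px, size_t count)
{
   bool all_opaque = true;
   size_t i;

   for (i = 0; i < count; i++)
   {
      uint32_t a = (px[i] >> 24) & 0xFF;
      uint32_t r = (px[i] >> 16) & 0xFF;
      uint32_t g = (px[i] >>  8) & 0xFF;
      uint32_t b =  px[i]        & 0xFF;

      if (a == 255)
         continue; /* opaque: premultiplying is the identity */

      all_opaque = false;
      r = (r * a) / 255;
      g = (g * a) / 255;
      b = (b * a) / 255;
      px[i] = (a << 24) | (r << 16) | (g << 8) | b;
   }
   return all_opaque;
}
```

A caller could store the returned flag next to the texture and take an opaque copy path at draw time when it is set.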
gdi_frame
- bmp_menu (window-sized 32-bit BGRA DIB section) is the
compositing target whenever any window-resolution content
is active that frame: textured menus, gfx_widgets, OSD msg,
or the Display Statistics overlay. The core/RGUI frame is
StretchDIBits-upscaled into bmp_menu so widgets, OSD glyphs
and stats text land at native window resolution on top of
the upscaled image, instead of being drawn at core size and
smeared by WM_PAINT's StretchBlt at present time.
- Pure core gameplay (no widgets, OSD, menu, or stats) keeps
the legacy small-bmp + WM_PAINT path to avoid the bmp_menu
clear + BitBlt cost in the steady state.
- When a textured menu is alive over running content, the
core frame is uploaded into bmp_menu as a background
underlay BEFORE menu_driver_frame paints the menu on top.
Without this, Ozone's semi-transparent wallpaper composites
against solid black instead of the running game.
- Display Statistics overlay (statistics_show + stat_text)
rendered through the font driver with osd_stat_params
positioning. Drawn BEFORE widgets so widget panels can
partially obscure it (matching d3d8/d3d9 layering).
Suppressed while the menu is alive — d3d8/d3d9 structure
this as `else if (statistics_show)` off the menu condition;
the menu drivers own the screen and have their own stats
path.
- Per-frame OSD msg renders AFTER widgets so transient
messages aren't hidden by notification panels.
- bmp_menu clear is FillRect with bmp_menu pre-selected, not
a direct DIB pixel write. Direct writes to a DIB section
selected into a DC race with GDI's batch queue and produce
sporadic missing draws.
- GdiFlush() before BitBlt-to-winDC at present time so all
draws into bmp_menu have committed.
- Manual BitBlt-to-winDC + ValidateRect for the bmp_menu
present path; the legacy bmp path keeps InvalidateRect ->
WM_PAINT.
- gdi_gfx_widgets_enabled wired into the video_gdi vtable.
gfx/common/gdi_defines.h
Extends gdi_t with bmp_menu DIB section + menu_pixels pointer +
menu_surface_width/height, bmp_width/bmp_height tracking
(separate from video_width to avoid Step 6/8 fighting over the
field when RGUI is alive at a different size from the core),
cached HBRUSH (brush_cached + colour invalidation flag),
scissor save/restore state, menu_textured_active flag.
gfx/common/win32_common.c
WM_PAINT handlers (wnd_proc_gdi_dinput, wnd_proc_gdi_winraw,
wnd_proc_gdi_common): StretchBlt source rect uses bmp_width /
bmp_height with a fallback to video_width / video_height.
configuration.c
check_menu_driver_compatibility: case 'g' now matches "gdi"
exactly so XMB / Ozone / MaterialUI become selectable on the
GDI driver instead of falling through to RGUI.
Compile-clean under MinGW i686 with -Wall -Wextra
-Werror=implicit-function-declaration
-Werror=declaration-after-statement for _WIN32_WINNT in {0x0400,
0x0410, 0x0501} and with / without HAVE_MENU and HAVE_GFX_WIDGETS.
The GDI video driver stretched the core frame across the entire
window regardless of the user's aspect ratio setting (Settings ->
Video -> Scaling -> Aspect Ratio). d3d8 / d3d9 report e.g.
Scale: 2821 x 2160 inside a 3840 x 2160 viewport with black
pillar bars; GDI reported Scale: 0 x 0 and filled the window
edge-to-edge.
Three pieces were missing:
1. gdi_t had no video_viewport_t state. set_viewport was an
empty stub, set_aspect_ratio and apply_state_changes pokes
were both NULL, so the user's aspect-ratio settings never
reached the driver.
2. The frame upload (Step 9 + the textured-menu Step 4b
underlay) used (0, 0, surface_w, surface_h) as the
destination rect on bmp_menu, ignoring the viewport entirely.
3. WM_PAINT's StretchBlt used (0, 0, screen_w, screen_h) as the
destination on the legacy bmp path, also ignoring viewport.
gfx/common/gdi_defines.h
Adds video_viewport_t vp + bool keep_aspect + bool should_resize
to gdi_t. vp.full_width / full_height hold the window size;
vp.x / y / width / height hold the destination rect for the core
frame after aspect-ratio correction. Includes ../video_defines.h
for the type.
gfx/drivers/gdi_gfx.c
- gdi_init: keep_aspect = video->force_aspect; should_resize =
true so the first frame computes a real viewport before any
present.
- gdi_alive: sets should_resize on window-resize events (the
resize bool from win32_check_window).
- gdi_set_viewport: real implementation calling
video_driver_update_viewport(&gdi->vp, force_full,
gdi->keep_aspect, true). Mirrors d3d8's pattern.
- gdi_set_aspect_ratio + gdi_apply_state_changes pokes added
and wired into the poke interface table. Both set
should_resize; set_aspect_ratio also forces keep_aspect = true
(matching d3d8).
- gdi_frame Step 2b: when should_resize is set, call
gdi_set_viewport with the current window size and clear the
flag. Defensive fallback to full-window viewport if vp ends
up zero-sized so we still draw something.
- gdi_upload_core_frame_to_menu (Step 4b helper) and Step 9
bmp_menu branch: destination rect is now the viewport
sub-rect (gdi->vp.x / y / width / height) of bmp_menu rather
than the whole surface. Step 4 already cleared bmp_menu to
black so the bars appear automatically without an explicit
fill.
- Forward declaration for gdi_set_viewport since it's called
from gdi_frame but defined alongside the vtable near the
bottom of the file.
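For illustration only, the kind of aspect-fit computation a viewport update performs can be sketched as below. This is not the logic of video_driver_update_viewport (which also handles integer scaling, custom viewports and more); rect_t and fit_aspect are hypothetical names.

```c
/* Hypothetical stand-in for the x / y / width / height part of a viewport. */
typedef struct
{
   int x, y;
   unsigned width, height;
} rect_t;

/* Centre a rectangle of the given aspect ratio inside the window,
 * leaving pillar or letter bars on the unused sides. */
static rect_t fit_aspect(unsigned win_w, unsigned win_h, float aspect)
{
   rect_t vp;
   float win_aspect;

   vp.x      = 0;
   vp.y      = 0;
   vp.width  = win_w;
   vp.height = win_h;

   if (win_w == 0 || win_h == 0 || aspect <= 0.0f)
      return vp; /* defensive fallback: full window */

   win_aspect = (float)win_w / (float)win_h;

   if (win_aspect > aspect)
   {
      /* Window is wider than the content: bars left and right. */
      vp.width = (unsigned)((float)win_h * aspect + 0.5f);
      vp.x     = (int)((win_w - vp.width) / 2);
   }
   else if (win_aspect < aspect)
   {
      /* Window is taller than the content: bars top and bottom. */
      vp.height = (unsigned)((float)win_w / aspect + 0.5f);
      vp.y      = (int)((win_h - vp.height) / 2);
   }
   return vp;
}
```

With a 3840 x 2160 window and a 4:3 aspect this yields a 2880 x 2160 rect at x = 480, which is the sub-rect the frame upload would then target.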
gfx/common/win32_common.c
- New wnd_proc_gdi_paint(gdi) helper: paints the four border
rects (top / bottom / left / right of the viewport) with
BLACK_BRUSH, then StretchBlts gdi->bmp into the viewport
rect. Skips the FillRect calls when the viewport already
fills the window.
- All three WM_PAINT handlers (wnd_proc_gdi_dinput,
wnd_proc_gdi_winraw, wnd_proc_gdi_common) replaced with a
single call to the helper. The previous bodies were three
near-identical 25-line blocks.
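The border computation in the paint helper can be sketched in portable C as follows. box_t stands in for the Win32 RECT and border_rects is a hypothetical name; the real helper fills each rect with BLACK_BRUSH, which is omitted here.

```c
/* Stand-in for the Win32 RECT (left/top inclusive, right/bottom exclusive). */
typedef struct
{
   int left, top, right, bottom;
} box_t;

static void box_set(box_t *b, int l, int t, int r, int btm)
{
   b->left   = l;
   b->top    = t;
   b->right  = r;
   b->bottom = btm;
}

/* Write up to four non-empty bars surrounding the viewport into out[]
 * and return how many were written. Returns 0 when the viewport
 * already fills the window, so the caller can skip all fills. */
static int border_rects(int win_w, int win_h,
      int vx, int vy, int vw, int vh, box_t out[4])
{
   int n = 0;

   if (vy > 0)                 /* top bar spans the full width */
      box_set(&out[n++], 0, 0, win_w, vy);
   if (vy + vh < win_h)        /* bottom bar spans the full width */
      box_set(&out[n++], 0, vy + vh, win_w, win_h);
   if (vx > 0)                 /* left bar, viewport rows only */
      box_set(&out[n++], 0, vy, vx, vy + vh);
   if (vx + vw < win_w)        /* right bar, viewport rows only */
      box_set(&out[n++], vx + vw, vy, win_w, vy + vh);

   return n;
}
```

Restricting the left and right bars to the viewport rows keeps the four rects from overlapping, so no pixel is filled twice.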
Manual BitBlt-from-bmp_menu present path in gdi_frame Step 14
needed no change: bmp_menu is window-sized and was cleared to
black with the game underlay landing only in the viewport
sub-rect, so the bars are baked into bmp_menu by present time
and a 1:1 BitBlt to winDC is still correct.
Note: at high window resolutions GDI's WM_PAINT StretchBlt will
drop frames where d3d8 / d3d9 would not — the StretchBlt is a
software scale on the CPU. Running at integer scale (Settings ->
Video -> Scaling -> Integer Scale) avoids this on cores whose
native resolution divides evenly into the window.
Compile-clean under MinGW i686 with -Wall -Wextra
-Werror=implicit-function-declaration
-Werror=declaration-after-statement for _WIN32_WINNT in {0x0400,
0x0410, 0x0501} and with / without HAVE_MENU and
HAVE_GFX_WIDGETS.
The GDI video driver had get_overlay_interface set to NULL, so
on-screen input overlays (touch controls / virtual gamepads
configured via Settings -> On-Screen Overlay -> Display Overlay)
loaded silently, registered for input hit-testing, but never
rendered anything visible. This commit adds a working overlay
implementation matching the d3d8 / d3d9 contract, so an overlay
config tuned against those backends behaves identically here.
gfx/common/gdi_defines.h
Adds nested struct gdi_overlay { HBITMAP bmp; unsigned tex_w/h;
float tex_coords[4]; float vert_coords[4]; float alpha_mod;
bool fullscreen; } and an array of these on gdi_t, plus
overlays_size and overlays_enabled flags. All wrapped in
HAVE_OVERLAY.
gfx/drivers/gdi_gfx.c
Implements the six video_overlay_interface_t entry points:
- gdi_overlay_load: builds a 32-bit BGRA premultiplied DIB
section per image (same conversion gdi_load_texture does
for menu / widget textures), so the per-frame draw is a
straight AlphaBlend with no pixel rewriting.
- gdi_overlay_tex_geom / gdi_overlay_vertex_geom: store the
0..1 normalised geometry verbatim. Unlike d3d8 / d3d9 / gl
we do NOT flip y here: those backends emit vertices in y-up
clip space and rely on the viewport transform to invert,
whereas GDI is a y-down pixel-space blit straight to a DC.
Overlay descriptor coordinates are y-down (same convention
as RETRO_DEVICE_POINTER, which the hit-test path consumes),
so a direct multiply is correct. Applying the d3d-style
`y = 1 - y; h = -h;` flip here renders the overlay
vertically mirrored.
- gdi_overlay_enable: toggles overlays_enabled.
- gdi_overlay_full_screen: per-overlay flag controlling
whether vert_coords span the full window (true; covers
letterbox / pillarbox bars so touch controls keep working
when the game is letterboxed) or just the game viewport
rect (false).
- gdi_overlay_set_alpha: per-overlay alpha modulation,
applied at render time as AlphaBlend's
SourceConstantAlpha (per-pixel alpha was already
premultiplied at load).
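The geometry handling above can be sketched as a direct multiply from normalised coordinates to a pixel rect. This assumes vert_coords is ordered x, y, w, h; irect_t and overlay_dst_rect are hypothetical names, not the driver's.

```c
/* Hypothetical pixel rectangle. */
typedef struct
{
   int x, y, w, h;
} irect_t;

/* Map 0..1 overlay geometry (assumed order: x, y, w, h) to a y-down
 * pixel rect. full_screen selects the whole window as the base rect;
 * otherwise the game viewport is used. No 1 - y flip is applied. */
static irect_t overlay_dst_rect(const float vert[4], int full_screen,
      int win_w, int win_h, irect_t vp)
{
   irect_t base;
   irect_t dst;

   if (full_screen)
   {
      base.x = 0;
      base.y = 0;
      base.w = win_w;
      base.h = win_h;
   }
   else
      base = vp;

   dst.x = base.x + (int)(vert[0] * (float)base.w);
   dst.y = base.y + (int)(vert[1] * (float)base.h);
   dst.w = (int)(vert[2] * (float)base.w);
   dst.h = (int)(vert[3] * (float)base.h);
   return dst;
}
```

An overlay whose descriptor covers the lower half (y = 0.5, h = 0.5) lands in the lower half of the base rect, matching the y-down convention of the hit-test path.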
gdi_overlays_render composites all enabled overlays onto the
active target (bmp_menu when active) using AlphaBlend with
AC_SRC_ALPHA + SourceConstantAlpha. No-op on Win95 (no
AlphaBlend); the static dispatch is sized so the compile-time
branch produces no warnings on either side.
Wired into gdi_frame as a new Step 10b between stats (Step 10)
and widgets (Step 11), matching the d3d8 / d3d9 layering: stats
-> overlay -> widgets -> OSD msg. Overlays render regardless
of menu state so virtual gamepad controls remain visible while
navigating the menu, also matching the d3d8 / d3d9 behaviour.
Wired into the need_bmp_menu trigger block: an active overlay
forces bmp_menu allocation just like widgets do, since overlays
are window-resolution images that would smear if drawn into the
small core-frame bmp and scaled up by WM_PAINT.
gdi_overlay_free called from gdi_free before texDC teardown
since the overlay DIB sections may be selected into texDC at
free time.
Forward declarations added near the existing forward-decl block.
Compile-clean under MinGW i686 with -Wall -Wextra
-Werror=implicit-function-declaration
-Werror=declaration-after-statement for _WIN32_WINNT in {0x0400,
0x0410, 0x0501} and with / without HAVE_OVERLAY, HAVE_MENU,
HAVE_GFX_WIDGETS.
gdi: implement viewport_info
The video_driver_t::viewport_info hook is optional, and several
drivers (gdi, caca, sixel, network, fpga, vg, ps2, xenon360,
xshm) leave it NULL. The wrapper video_driver_get_viewport_info
returns false in that case and leaves the caller's struct
untouched.
That contract is unsafe with the way many call sites are
written. Callers in menu_setting.c, menu/drivers/rgui.c, and
several input drivers declare `video_viewport_t vp;` on the
stack, call video_driver_get_viewport_info(&vp), and proceed to
read fields off the struct without checking the return value.
When viewport_info is NULL, those reads land on uninitialised
stack memory.
The most damaging consequence: setting_action_start_custom_vp_*
in menu_setting.c writes the result straight back into
settings->video_vp_custom:
custom->width = vp.full_width - custom->x;
custom->height = vp.full_height - custom->y;
When the active video driver is gdi (no viewport_info),
vp.full_width / vp.full_height are stack garbage, and the
resulting custom_viewport_width / _height get persisted to
retroarch.cfg on shutdown. Restarting then reads back values
like custom_viewport_width=57874, custom_viewport_height=32759,
custom_viewport_x=972119847 — and aspect ratio CUSTOM uses those
values for the game frame's destination rect, which renders the
game effectively invisible until the user manually picks a
different aspect ratio.
This patch:
gfx/video_driver.c
Zero the output struct in video_driver_get_viewport_info
when the driver doesn't implement viewport_info, or when
the viewport pointer itself is NULL. Callers that
correctly check the return value see no behavioural change;
callers that ignore it now read all-zeros rather than stack
garbage, which degenerates predictably (vp.full_width = 0
means custom->width gets set to -custom->x — bounded —
instead of unbounded). This bounds the failure mode for
every driver currently lacking viewport_info, not just gdi.
gfx/drivers/gdi_gfx.c
Implements viewport_info properly. Mirrors
d3d8_viewport_info: copy gdi->vp into the output struct.
With this in place, gdi_t's existing vp tracking (already
maintained for letterbox/pillarbox and overlay rendering)
becomes the source of truth that the menu reads back when
the user adjusts the custom viewport.
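The defensive zeroing in the wrapper can be sketched as below. The types and the function signature are simplified stand-ins rather than RetroArch's real definitions, and treating a NULL output pointer as an early "return false" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for video_viewport_t. */
typedef struct
{
   int x, y;
   unsigned width, height;
   unsigned full_width, full_height;
} viewport_t;

/* Simplified stand-in for the optional driver hook. */
typedef void (*viewport_info_fn)(void *data, viewport_t *vp);

/* Hypothetical wrapper: when the driver has no hook, zero the caller's
 * struct so callers that ignore the return value read zeros instead of
 * uninitialised stack memory. */
static bool get_viewport_info(viewport_info_fn hook, void *data,
      viewport_t *vp)
{
   if (!vp)
      return false; /* assumption: nothing to fill in */
   if (!hook)
   {
      memset(vp, 0, sizeof(*vp));
      return false;
   }
   hook(data, vp);
   return true;
}
```

A caller that computes vp.full_width - custom->x without checking the result now gets a value bounded by custom->x rather than an arbitrary one.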
Two related fixes for the Vulkan HDR pipeline, both surfaced by
toggling HDR mode (Off/HDR10/scRGB) in the menu while running
fullscreen on Win11 + NVIDIA.
1) vulkan_create_swapchain never set VK_CTX_FLAG_HDR_SCRGB on the
scRGB success path - it only ever cleared it. Downstream code
that branches on this flag would therefore behave as if scRGB
output was inactive even when the swapchain had been created
with R16G16B16A16_SFLOAT +
VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT.
2) swapchain_semaphores[] is populated lazily by
vulkan_acquire_next_image, one slot at a time, only for the
image actually returned by vkAcquireNextImageKHR. On the
swapchain recreate path vulkan_destroy_swapchain memsets the
entire array to zero, and at least one path through the
recreate reaches vulkan_present with current_swapchain_index
pointing at an image whose slot has not yet been re-populated.
vkQueuePresentKHR is then handed VK_NULL_HANDLE in
pWaitSemaphores, which NVIDIA real-fullscreen on Win11
dereferences inside the ICD and segfaults on. Other drivers
(AMD, Intel, MoltenVK, and NVIDIA in windowed-fullscreen)
tolerate the NULL silently, which is why this only reproduces
on one configuration.
Fix by pre-creating all per-image present semaphores immediately
after vkGetSwapchainImagesKHR, so no acquire ordering can leave
the array in a half-populated state. The existing lazy-allocate
in vulkan_acquire_next_image becomes a no-op (slot is already
non-NULL) but is left in place as a safety net.
Reproducer: fullscreen Vulkan + NVIDIA RTX 5090 + Win11, toggle
HDR mode in the menu. Crashes inside vkQueuePresentKHR on the
second frame after reinit, when the acquire returns swapchain
image index 1 for the first time on the new swapchain.
When RGUI is up over a running core with menu_rgui_transparency
enabled, d3d9 / d3d8 / etc render the chequer pattern with
partial alpha so the game shows through. GDI was rendering the
chequer opaque against solid black, hiding the game completely.
Two changes:
Step 4b (game-frame underlay)
The condition that skipped Step 4b whenever menu_frame was
set was based on the assumption that RGUI uses the legacy
non-textured (gdi->bmp + WM_PAINT) path. That stopped being
true once widgets / overlays could force RGUI through the
bmp_menu path — which is the common case (any user with the
FPS widget or a touch overlay enabled). The condition is
relaxed to "menu is alive AND content is loaded", so RGUI
gets the same game-frame underlay that XMB / Ozone /
MaterialUI already do.
Step 9 RGUI bmp_menu branch
StretchDIBits with SRCCOPY discards the alpha and overwrites
the underlay. Replaced with gdi_blit_rgui_alpha, a new
helper that:
- Allocates a 32-bit BGRA scratch DIB section sized to
RGUI's frame.
- Converts the 16-bit RGBA4444 source to BGRA32
premultiplied (alpha lives in bits 0-3 of the RGBA4444
word; argb32_to_rgba4444 in rgui.c is the producer).
- AlphaBlend-stretches the scratch DIB over bmp_menu with
AC_SRC_OVER + AC_SRC_ALPHA.
On Win95 (no AlphaBlend) the helper is compiled out and the
code falls back to the existing opaque StretchDIBits path,
so transparency degrades to solid backgrounds — same
behaviour as platforms RGUI itself doesn't consider
transparency-capable (ps2, sdl_dingux, etc).
The opaque path is also still used for non-RGUI 16-bit
sources and for non-16-bit sources, which keeps the core
game frame on its existing fast SRCCOPY path.
The conversion is per-pixel software, but RGUI frames are small
(256x192 to 512x480 typical) and only one composite happens per
frame. Caching the scratch DIB on gdi_t is a possible future
optimisation; left out for now since correctness was the goal
and the per-frame cost is negligible at typical RGUI
resolutions.
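A per-pixel sketch of the conversion step, assuming the RGBA4444 layout described above (R in bits 12-15, G in 8-11, B in 4-7, A in 0-3) and a 0xAARRGGBB output word. Expanding each nibble by multiplying by 17 is an assumption of this sketch; the round-to-nearest premultiply uses the (x + 127) / 255 form. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: convert one RGBA4444 pixel to a premultiplied
 * 32-bit pixel packed as 0xAARRGGBB (BGRA byte order in memory on
 * little-endian hosts). */
static uint32_t rgba4444_to_bgra32_premul(uint16_t px)
{
   /* Expand each 4-bit channel to 8 bits: 0x0..0xF -> 0..255. */
   uint32_t r = ((uint32_t)(px >> 12) & 0xF) * 17u;
   uint32_t g = ((uint32_t)(px >>  8) & 0xF) * 17u;
   uint32_t b = ((uint32_t)(px >>  4) & 0xF) * 17u;
   uint32_t a = ((uint32_t) px        & 0xF) * 17u;

   /* Premultiply with round-to-nearest. */
   r = (r * a + 127u) / 255u;
   g = (g * a + 127u) / 255u;
   b = (b * a + 127u) / 255u;

   return (a << 24) | (r << 16) | (g << 8) | b;
}
```

Running this over each row of the RGUI frame fills the scratch buffer that is then alpha-blended over the game underlay.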
The four AlphaBlend helper paths were allocating a fresh DIB
section on every call:
- gdi_blit_rgui_alpha (RGUI alpha composite, per frame)
- gdi_blit_texture_modulated (tinted icon, per draw)
- gfx_display_gdi_draw gradient (per-vertex colour ramp, per draw)
- gfx_display_gdi_draw 1x1 (translucent solid colour, per draw)
CreateDIBSection / DeleteObject is a kernel-side syscall pair —
allocates committed pages, populates the BITMAPINFOHEADER,
registers a kernel handle, and on free does the inverse. For
Ozone, gfx_display_ctx_gdi_draw is called hundreds of times per
frame; the bulk of those go through the 1x1 or gradient paths.
Cache one DIB per path on gdi_t:
- scratch_1x1: fixed 1x1 BGRA, allocated lazily on first use,
held for the lifetime of gdi_t. Used by the translucent
solid-quad path: rewrite the single pixel each call instead
of recreating the whole DIB.
- scratch_quad: variable, grow-only. Shared by the gradient
and texture-modulated paths. When a draw asks for w x h
pixels and the existing DIB is at least that big, it's
reused; otherwise we DeleteObject + CreateDIBSection at the
new max dimension. We never shrink, since the whole point
is to amortise the allocation cost across frames.
- scratch_rgui: variable, grow-only. Used by the RGUI alpha
path. Kept separate from scratch_quad so a frame that
composites RGUI AND draws gradient quads doesn't thrash one
shared slot back and forth.
The "DIB might be larger than the request" wrinkle: the inner
loops in the gradient / tint / RGUI paths now use the cached
DIB's actual width as their stride (gdi->scratch_*_w), and the
downstream AlphaBlend / BitBlt source rect is the requested
w x h sub-rect — so leftover stale pixels in the unused tail are
never sampled.
Three new helpers:
gdi_ensure_scratch_quad / gdi_ensure_scratch_rgui:
Grow-or-reuse a variable-sized scratch DIB. Return false on
allocation failure; callers bail out of the draw (matching
the previous CreateDIBSection-failure behaviour).
gdi_ensure_scratch_1x1:
Lazy first-time allocation of the fixed 1x1 DIB.
gdi_release_scratch:
Tear down all three. Called from gdi_free, before texDC is
destroyed (the scratch DIBs may be selected into texDC at
free time).
No visual change intended. All four paths produce exactly the
same pixels into the same destination rect — the only thing
that changes is the lifetime of the source DIB section.
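The grow-only policy and the stride rule can be sketched in portable C, with malloc / free standing in for CreateDIBSection / DeleteObject. scratch_t and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cached scratch surface; w / h are the allocated size. */
typedef struct
{
   uint32_t *pixels;
   unsigned w, h;
} scratch_t;

/* Reuse the buffer when it is already big enough, otherwise reallocate
 * at the per-axis maximum seen so far. Never shrinks. */
static bool scratch_ensure(scratch_t *s, unsigned w, unsigned h)
{
   unsigned new_w, new_h;
   uint32_t *p;

   if (w == 0 || h == 0)
      return false;
   if (s->pixels && s->w >= w && s->h >= h)
      return true;

   new_w = (w > s->w) ? w : s->w;
   new_h = (h > s->h) ? h : s->h;
   p     = (uint32_t*)malloc((size_t)new_w * new_h * sizeof(uint32_t));
   if (!p)
      return false; /* caller bails out of the draw */

   free(s->pixels);
   s->pixels = p;
   s->w      = new_w;
   s->h      = new_h;
   return true;
}

/* Fill only the requested w x h sub-rect. The row stride is the
 * allocated width, not the requested one. */
static void scratch_fill(scratch_t *s, unsigned w, unsigned h,
      uint32_t colour)
{
   unsigned x, y;
   for (y = 0; y < h; y++)
      for (x = 0; x < w; x++)
         s->pixels[(size_t)y * s->w + x] = colour;
}

static void scratch_release(scratch_t *s)
{
   free(s->pixels);
   s->pixels = NULL;
   s->w      = 0;
   s->h      = 0;
}
```

Pixels outside the requested sub-rect keep whatever an earlier draw left there, which is harmless as long as the blit's source rect is also limited to w x h.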
The gradient path in gfx_display_gdi_draw computed every pixel
through a doubly-nested bilinear interpolation loop:
    for each row
        compute top-bottom interp factor
        for each column
            compute left-right interp factor
            blend TL/TR (top edge), blend BL/BR (bottom edge)
            blend top/bottom edges horizontally
            premultiply if not all_opaque
            store
In practice almost every gradient the menu and widget code emits
is one of two specific shapes:
- vertical-only (TL == TR and BL == BR): header strips,
drop shadows, sidebar fades. Every column is identical;
every row is a uniform colour.
- horizontal-only (TL == BL and TR == BR): rarer, but used for
e.g. some progress / focus indicators. Every row is
identical; colour varies across columns only.
Detect those cases and collapse the doubly-nested loop:
vertical-only: compute the row colour once per row (one
interpolation per channel), then fill the row width with that
pixel. For a 600x80 vertical gradient that's ~80 pixel
computes plus 80 row fills, instead of ~48000 pixel computes.
horizontal-only: compute the first row pixel-by-pixel, memcpy
it to every subsequent row.
The general 4-corner bilinear path stays as the fallback for
anything that doesn't fit either shape — moved into its own
else branch with no behavioural change relative to the previous
implementation.
Output is byte-identical to the bilinear path for both 1D
shapes (the math reduces to the same formula when the redundant
dimension's factors cancel), so this is purely a speed
optimisation with no visual change.
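A sketch of the vertical-only fast path, assuming 0xAARRGGBB pixels and a truncating 0..255 interpolation factor of the form (iy * 255) / (h - 1). The exact blend formula used by the driver is not shown in this message, so lerp8 below is an assumption, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 8-bit linear blend with a 0..255 factor. */
static uint32_t lerp8(uint32_t a, uint32_t b, uint32_t t)
{
   return (a * (255u - t) + b * t) / 255u;
}

/* Vertical-only gradient: one colour computation per row, then a
 * plain fill across the row. stride is the destination row pitch
 * in pixels. */
static void fill_vertical_gradient(uint32_t *dst, unsigned stride,
      unsigned w, unsigned h, uint32_t top, uint32_t bottom)
{
   unsigned ix, iy;

   for (iy = 0; iy < h; iy++)
   {
      uint32_t t   = (h > 1) ? (iy * 255u) / (h - 1) : 0;
      uint32_t a   = lerp8( top >> 24,          bottom >> 24,         t);
      uint32_t r   = lerp8((top >> 16) & 0xFF, (bottom >> 16) & 0xFF, t);
      uint32_t g   = lerp8((top >>  8) & 0xFF, (bottom >>  8) & 0xFF, t);
      uint32_t b   = lerp8( top        & 0xFF,  bottom        & 0xFF, t);
      uint32_t *row = dst + (size_t)iy * stride;
      uint32_t px;

      if (a != 255u)
      {
         /* Premultiply once per row instead of once per pixel. */
         r = (r * a) / 255u;
         g = (g * a) / 255u;
         b = (b * a) / 255u;
      }
      px = (a << 24) | (r << 16) | (g << 8) | b;

      for (ix = 0; ix < w; ix++)
         row[ix] = px;
   }
}
```

The horizontal-only case follows the same idea with the roles swapped: compute the first row pixel by pixel, then memcpy it into each later row.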
The hot pixel paths in gfx_display_gdi_draw and gdi_font_render_line
do many `(uint32_t)x / 255u` operations per pixel — that's a 20-30
cycle integer divide on x86 vs a few cycles for shift+add. For a
typical Ozone-with-widgets frame:
- General 4-corner gradient: 14 divides per pixel.
- 1D gradients (vertical/horizontal): 4 divides per row/column,
plus 3 per non-opaque pixel. Less hot since the previous
commit collapsed those to 1D loops, but still worth a free
win.
- Tinted-glyph font composite: 4 divides per glyph pixel.
Add a GDI_DIV255 macro:
#define GDI_DIV255(x) ((((x) + 1) + ((x) >> 8)) >> 8)
Verified bit-exact equivalent of `(uint32_t)x / 255u` for every
input in [0, 255*255 = 65025] — a brute-force comparison against
integer division across all 65026 values produces zero diffs.
That's exactly the input range that products of two 8-bit values
land in, which is what every divide-by-255 site here computes.
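The brute-force check described above is small enough to reproduce. The sketch below uses the macro exactly as given; the counting function is a hypothetical test harness, not part of the driver.

```c
#include <stdint.h>

#define GDI_DIV255(x) ((((x) + 1) + ((x) >> 8)) >> 8)

/* Count inputs in [0, 255 * 255] where the shift+add form disagrees
 * with truncating integer division by 255. */
static uint32_t div255_mismatches(void)
{
   uint32_t x;
   uint32_t diffs = 0;

   for (x = 0; x <= 255u * 255u; x++)
   {
      if (GDI_DIV255(x) != x / 255u)
         diffs++;
   }
   return diffs;
}
```

The macro evaluates its argument more than once, so call sites should pass a side-effect-free expression.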
Applied at every hot per-pixel /255 site:
- Gradient bilinear (general 4-corner path): 14 sites per
pixel.
- 1D gradient paths (vertical-only, horizontal-only): 4 sites
per row/column plus 3 sites per non-opaque pixel.
- Tinted-glyph font scratch composite: 4 sites per pixel.
- 1x1 translucent-solid premultiply: 3 sites per draw.
- Texture-modulated tint (out_a only): 1 site per pixel.
- Font line outer premultiply: 3 sites per line.
- gdi_load_texture / gdi_overlay_load: 3 sites per non-opaque
pixel. Load-time only, but free to apply for consistency.
Deliberately NOT changed:
- The `/ (255u * 255u)` divides for out_r/g/b in
gdi_blit_texture_modulated. Collapsing those to two
sequential GDI_DIV255 calls would introduce up to 1 LSB of
rounding error compared to the single divide, since
(a/255)*(b/255) has a different rounding boundary than
(a*b)/(255*255). The cost saving isn't worth a visible
drift in tinted-icon pixels.
- The `(x + 127) / 255` rounded form in gdi_blit_rgui_alpha.
That's deliberately round-to-nearest rather than truncate,
which GDI_DIV255 doesn't reproduce. RGUI's per-frame cost
is dominated by syscall / blit overhead, not the divides.
- The `(iy * 255u) / (dst_h - 1)` interp-factor divides.
Divisor varies per draw; not a constant-255 case.
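The rounding drift from chaining two divides can be checked the same way. The sketch below assumes the product has three 8-bit factors (for example texel, tint and alpha), which is a guess at the shape of the out_r/g/b expression; the function name is hypothetical.

```c
#include <stdint.h>

#define GDI_DIV255(x) ((((x) + 1) + ((x) >> 8)) >> 8)

/* Largest difference, over all 8-bit triples, between a single divide
 * by 255 * 255 and two chained GDI_DIV255 steps. */
static uint32_t max_two_step_error(void)
{
   uint32_t p, t, a;
   uint32_t max_err = 0;

   for (p = 0; p < 256; p++)
      for (t = 0; t < 256; t++)
         for (a = 0; a < 256; a++)
         {
            uint32_t exact = (p * t * a) / (255u * 255u);
            uint32_t step  = GDI_DIV255(p * t);
            uint32_t two   = GDI_DIV255(step * a);
            uint32_t err   = (exact > two) ? exact - two : two - exact;

            if (err > max_err)
               max_err = err;
         }
   return max_err;
}
```

One concrete case: with all three factors at 200 the single divide gives 123 while the chained form gives 122.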
No visual change intended. Output is byte-identical to the
divide-based code at every converted site.
The bmp_menu present path was creating a temporary DC, selecting
the DIB section into it, BitBlt-ing through that DC to the
window DC, then tearing the temporary DC down — every frame:
    HDC menu_dc = CreateCompatibleDC(gdi->winDC);
    if (menu_dc) {
        HBITMAP menu_old = (HBITMAP)SelectObject(menu_dc, gdi->bmp_menu);
        GdiFlush();
        BitBlt(gdi->winDC, ..., menu_dc, ..., SRCCOPY);
        SelectObject(menu_dc, menu_old);
        DeleteDC(menu_dc);
    }
SetDIBitsToDevice does the same thing — copy a DIB to a window
DC at 1:1 scale — but takes the raw pixel pointer directly,
skipping the temporary DC framework entirely. We already keep a
uint32_t* into the DIB pixels (gdi->menu_pixels, populated by
gdi_ensure_menu_surface) so the substitution is straightforward:
    GdiFlush();
    SetDIBitsToDevice(gdi->winDC,
        0, 0, surface_width, surface_height,
        0, 0, 0, surface_height,
        gdi->menu_pixels, &bmi, DIB_RGB_COLORS);
Per frame this saves a CreateCompatibleDC, two SelectObjects,
and a DeleteDC syscall. The pixel-copy bandwidth is unchanged
(the GDI implementation still has to move surface_width *
surface_height * 4 bytes from system RAM to the window's
displayable surface), but the API path is shorter and avoids
some of GDI's DC-based source-routing overhead.
bmp_menu is allocated at exactly the window surface size so
there's no scaling involved, which is the key precondition for
SetDIBitsToDevice (the no-scaling, simpler cousin of
StretchDIBits). The DIB is top-down (biHeight is negative in
gdi_ensure_menu_surface) and SetDIBitsToDevice supports
top-down DIBs natively, so the bit layout matches without
needing to flip rows.
Compatibility: SetDIBitsToDevice is part of the original Win32
API. Same availability surface as BitBlt itself — Win95, NT
3.5+, Win32s. No regression vs the previous path.
The legacy bmp + WM_PAINT + StretchBlt route (used when no
widgets, no textured menu, no OSD/stats) is untouched — that
path actually does scale (small core frame to window-sized
viewport), so SetDIBitsToDevice doesn't fit there.