[pull] master from libretro:master#975
Merged
pull[bot] merged 17 commits intoAlexandre1er:masterfrom Apr 28, 2026
Merged
Conversation
The hourglass icon on the task-progress notification widget never
visibly rotated. The animation system was correctly easing
msg->hourglass_rotation through 0 -> -2*pi every six seconds, but
the rendered icon stayed static across every backend.
Three independent bugs, each blocking a different family of
drivers:
1. gfx/gfx_widgets.c - the call site that draws the hourglass
forwarded the live `radians` value to gfx_widgets_draw_icon
but left `cosine`/`sine` at their cos(0)/sin(0) defaults.
The matrix-rotation path (gl, gl1, glcore, vulkan, d3d8)
consumes those trig values via gfx_display_rotate_z, so it
was always building an identity rotation regardless of
`radians`.
Fix: recompute cosf(radians)/sinf(radians) alongside `radians`
in the in-progress branch.
2. gfx/drivers/d3d_shaders/sprite_sm4.hlsl.h - the sprite shader
used by the d3d10 / d3d11 / d3d12 menu display drivers. The
C code on those drivers dutifully writes
`sprite->params.rotation = draw->rotation` into the vertex
stream, but the shader pipeline never reads it: the GSInput
struct doesn't have a `params` field, so rotation is dropped
at the VS->GS boundary, and the geometry shader builds the
four corner vertices straight from `position.xy + position.zw
* cornerOffset` with no rotation applied.
Fix: forward `params` from VS to GS (params.x is already
consumed by the VS for scaling; we need params.y in the GS
for rotation), and rewrite the GS to compute corner offsets
by rotating per-corner pixel-space sign vectors `(sx, sy) in
{+/-1}^2` and scaling them back by `position.z * 0.5` (X) and
`position.w * 0.5` (Y). Because position.z = 2W/vw and
position.w = 2W/vh both encode the same pixel-space length W
for a square icon, this preserves the icon's shape under
rotation on a non-square viewport. Non-square icons would
skew, but the only caller of rotation today (the hourglass)
is square.
3. gfx/drivers/d3d9hlsl.c, gfx/drivers/d3d9cg.c - the d3d9
menu display drivers' single-sprite path (no explicit vertex
array supplied by caller, which is what gfx_widgets_draw_icon
does) builds an axis-aligned quad from draw->x / draw->y /
draw->width / draw->height directly, applies scale_factor,
and never references draw->rotation or draw->matrix_data at
all - so rotation was silently dropped on both shader
paths.
Fix: when draw->rotation is non-zero, rotate each of the
four pixel-space corner offsets around the quad centre, then
normalise back to [0,1] via 1/video_width and 1/video_height.
Same aspect-correct approach as the d3d_shaders fix above.
Add <math.h> to d3d9hlsl.c (d3d9cg.c already had it).
The d3d9 multi-vertex path uses the matrix_data route, so it's
handled by fix (1) alone and needs no driver-side change.
Tested on gl / gl1 / glcore / vulkan / d3d9_cg / d3d9_hlsl /
d3d10 / d3d11 / d3d12.
Follow-up to b9777c8 (the original ASan+UBSan workflow) and 70f24be (v8: flip to halt-on-error after a clean baseline). Adds a third smoke tier that actually loads a libretro core and runs it through ~5 seconds of real frame dispatch + clean shutdown, covering everything --help and --features can't reach: dlopen of a core, retro_load_game, the stb_image-driven decode path, the X11 + XVideo color-conversion + shared-memory image- transfer pipeline, the runloop, and full cleanup-on-shutdown. The new step: 1. Builds cores/libretro-imageviewer/image_core.so with the same SANITIZER=address,undefined the main retroarch binary uses -- ensures stb_image's stack-buffer-overflow / use-after- return surface is fully instrumented (the global ASan allocator interceptor catches heap bugs across module boundaries regardless, but stack instrumentation requires per-TU compilation). This works because v9 (efae310) made the standalone Makefile sanitizer-aware. 2. Generates an 8x8 solid-red test PNG via a Python heredoc using only struct + zlib -- 75 bytes, no apt dependency on imagemagick or similar (Python ships on every ubuntu-latest). 3. Spins up Xvfb on display :99 with a 320x240x24 screen (small to minimise X server memory footprint), waits for the XVideo extension to be available, writes a minimal retroarch.cfg that pins the video driver to "xvideo" and the audio driver to "null" (no PulseAudio / ALSA dependency on the runner), and runs: DISPLAY=:99 ./retroarch \ -c /tmp/asan-cfg/retroarch.cfg \ -L cores/libretro-imageviewer/image_core.so \ /tmp/test.png \ --max-frames=300 \ --verbose --max-frames=300 = ~5s nominal at 60fps, ~15-25s under sanitizer overhead. Crucially, --max-frames triggers a clean exit through the normal runloop teardown -- not a SIGTERM mid-execution -- so cleanup paths are sanitizer- instrumented too. Wall-clock timeout(1) at 60s as a safety backstop. 4. Surfaces sanitizer findings (AddressSanitizer: / runtime error:) in the step output as informational text only. The step is soft-fail (continue-on-error: true) on this first iteration because lots of things can fire here that aren't RetroArch bugs (Xvfb quirks, libGL / Mesa software-rasterizer leaks at shutdown, X11 driver init noise) and forcing strict enforcement before measuring a baseline would block merges on noise. Once the baseline is characterised the same way the --help step's was, this step can be flipped to strict. Why xvideo specifically: it's the smallest self-contained X11 video driver in the tree (1163 LOC adapted from bSNES / MPlayer), no GL dependency, real YUV color conversion + XShm transfer, and Xvfb exposes the XVideo extension by default. null video would be simpler but skips the entire pixel-data path, which is exactly the surface most likely to fire under sanitizer. * .github/workflows/Linux-asan-ubsan.yml - Install dependencies: add xvfb, x11-utils, libxv-dev, libxext-dev, libxxf86vm-dev. No new compiler/runtime dependencies; the X11 transitive dev headers are needed for xvideo build, xvfb + x11-utils for the runtime invocation. - Configure: add --enable-xvideo to make xvideo a hard build requirement -- silent fall-back to a different driver would skew the smoke's coverage without warning. - New steps appended after --features: * Build imageviewer core under ASan + UBSan * Generate test PNG for imageviewer * Smoke run imageviewer headless under Xvfb (soft-fail) - Header comment + --help comment updated to describe the three-tier coverage structure (--help, --features, imageviewer). Verified locally on efae310: - YAML parses with 9 steps, soft-fail correctly scoped to only the new imageviewer step. - Imageviewer core builds clean with SANITIZER=address,undefined (build step minus apt installs reproduced verbatim locally). - Test PNG generation runs clean, produces a valid 75-byte 8x8 PNG (verified by `file` and stb_image acceptance). - Xvfb on a fresh ubuntu-latest equivalent exposes the XVideo and MIT-SHM extensions out-of-the-box. - xdpyinfo -queryExtensions emits extension names with leading whitespace, hence the ^[[:space:]]+ in the detection regex (a naive ^XVideo / ^MIT-SHM regex would silently fail to match -- caught and fixed during pre-flight). Cannot verify locally: the actual end-to-end run, which would require building a sanitizer-instrumented retroarch binary against a full apt set. The first CI run is the experiment, the same way the original workflow's was.
Reported by gcc 15.2.0 (msys2 / MinGW) building libretro-net-retropad
under -Wstringop-overflow:
stdstring.c: In function 'string_tokenize':
stdstring.c:500:16: warning: writing 1 byte into a region of
size 0 [-Wstringop-overflow=]
500 | token[_len] = '\0';
| ~~~~~~~~~~~~^~~~~~
The flagged write is redundant. string_tokenize allocates
malloc((_len + 1) * sizeof(char)) and then calls
strlcpy(token, str_ptr, _len + 1). Looking at the strlcpy
contract in libretro-common/compat/compat_strl.c:31-41:
size_t __len = _len < len - 1 ? _len : len - 1;
memcpy(s, in, __len);
s[__len] = '\0';
strlcpy already NUL-terminates within the `_len + 1` byte limit.
Because _len was computed at lines 489-492 as either
(delim_ptr - str_ptr) -- strictly <= strlen(str_ptr) -- or
strlen(str_ptr) directly, _len <= strlen(str_ptr) is always
true, so strlcpy's __len resolves to _len and the terminator
lands at token[_len]. The token[_len] = '\0' line just rewrites
the byte strlcpy already wrote.
gcc 15's tightened -Wstringop-overflow analyser cannot constrain
the relationship between _len and the malloc'd size (the size
expression `(_len + 1) * sizeof(char)` involves a multiply that
the analyser doesn't fold through to the bounds check on
token[_len]), so it warns under the assumption that _len could
exceed the malloc'd size by one. Removing the redundant write
silences the warning without changing semantics.
Not a strlcpy_append situation: that helper (78c52ab) is for
chained appends of the form
_len += strlcpy(s + _len, src, len - _len);
This is a single copy into a freshly-malloc'd buffer -- the
right fix is to delete the redundant trailing write, not to
swap idioms.
* libretro-common/string/stdstring.c
Drop the redundant token[_len] = '\0' that follows the strlcpy
in string_tokenize, with an inline comment explaining the
strlcpy contract that makes the line dead and the
-Wstringop-overflow interaction that made it actively harmful.
Verified locally:
- The five string_tokenize behaviours that callers rely on
(basic comma split, empty token at start, multi-char
delimiter, no-delimiter pass-through, empty input) all pass
under -fsanitize=address with the redundant write removed.
- Cannot reproduce the gcc 15 warning on local gcc 13.3
(warning is gcc-15-specific), but verified the local build
stays clean under -Wstringop-overflow=4 and -Werror after
the change.
Run #1 of the v10 imageviewer smoke step tripped on Xvfb's incomplete XVideo support: Error: [XVideo] XvQueryAdaptors() found 0 adaptors. Error: [Video] Cannot open video driver. Exiting... Xvfb advertises the XVideo *extension* (which my pre-flight check verified) but provides zero XVideo *adaptors* -- adaptors are the hardware-acceleration backends, and Xvfb has no real video hardware to back them with. The xvideo driver in gfx/drivers/xvideo.c correctly errors out via XvQueryAdaptors's count check rather than crashing -- that's good defensive code, not a bug -- but it means xvideo is unusable on Xvfb and we need a different video driver for the smoke. Pivot to SDL2. RetroArch's sdl2 video driver (gfx/drivers/sdl2_gfx.c) uses SDL2's X11 backend with MIT-SHM for image transfer -- both confirmed available on Xvfb out of the box. A standalone SDL_CreateRenderer probe under Xvfb returns a working SDL_RENDERER_SOFTWARE renderer cleanly (verified locally before this patch landed). Coverage tradeoff: we lose xvideo's YUV color-conversion + XShm codepath but keep all the high-leverage surface (dlopen of the core, retro_load_game, the stb_image decode path, video driver init, the runloop, full cleanup-on-shutdown). The dropped xvideo-specific surface is small enough that it's not worth the cost of finding a virtual X server with software XVideo adaptors -- nothing widely available actually provides those. * .github/workflows/Linux-asan-ubsan.yml - Install dependencies: drop libxv-dev, libxext-dev, libxxf86vm-dev (xvideo build deps). libsdl2-dev was already in the base set so no new packages required. - Configure: --enable-xvideo -> --enable-sdl2. Same explicit- over-implicit reasoning: silent fall-back to a different driver would skew the smoke without warning. - Smoke step: video_driver = "xvideo" -> "sdl2" in the inline retroarch.cfg. The xdpyinfo XVideo readiness probe becomes an MIT-SHM probe (which SDL2 actually uses). Step comment rewritten to document why we pivoted from xvideo, with the actual error message from run #1 inlined for context. - Header comment: tier-2 description updated from "X11 + XVideo color-conversion pipeline" to "SDL2 + X11 + MIT-SHM rendering pipeline". Pre-flight lesson: extension presence != functional driver. xdpyinfo -queryExtensions told me XVideo was present; what I should have verified is that XvQueryAdaptors returns a positive count. For SDL2 the equivalent positive verification is SDL_CreateRenderer returning non-NULL, which the local probe confirmed. Verified locally: - YAML parses, 9 steps, soft-fail still scoped to only the imageviewer step. - Standalone SDL2 renderer probe (SDL_CreateWindow + SDL_CreateRenderer w/ SDL_RENDERER_SOFTWARE) on Xvfb display returns ok, exit 0. - Patch applies cleanly to b9777c8 + v8 (efae310 + ace9c9f didn't touch this file). Cannot verify locally: the actual end-to-end run, which would require building a sanitizer-instrumented retroarch binary linked against SDL2. As before, the next CI run is the experiment.
Found by the new ASan+UBSan CI workflow's headless SDL2 imageviewer smoke (b9777c8 + d967813). Run #2 of the smoke (after the v12 xvideo->sdl2 pivot) caught a SEGV at process startup: AddressSanitizer:DEADLYSIGNAL ==9537==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000968. The signal is caused by a READ memory access. #0 XQueryExtension (libX11.so.6+0x361f6) #1 XInitExtension (libX11.so.6) #2 XextAddDisplay (libXext.so.6) #3 XScreenSaverQueryExtension (libXss.so.1+0x1565) #4 x11_suspend_screensaver #5 video_driver_init_internal #6 drivers_init #7 retroarch_main_init The address 0x968 is a small offset off NULL -- libX11's XQueryExtension dereferencing the Display * argument before checking it for null. Root cause: gfx/common/x11_common.c::xss_screensaver_inhibit() was called via x11_suspend_screensaver() with g_x11_dpy == NULL. The flow: 1. SDL2 video driver inits. 2. sdl2_set_handles() in gfx/common/sdl2_common.c:54 calls video_driver_display_type_set(RARCH_DISPLAY_X11) so the display *type* gate matches X11, and routes the SDL-owned X Display through video_driver_display_set() at line 55. 3. The file-scope `g_x11_dpy` in x11_common.c:66 stays at its initial NULL value. It's only assigned at line 795 inside XOpenDisplay() in the xvideo / GL / X11-direct init paths, which the SDL2 driver doesn't take. 4. drivers_init() -> video_driver_init_internal() -> x11_suspend_screensaver() -- line 286 gates on display_type == RARCH_DISPLAY_X11 (passes; SDL2 set it), then at lines 289-292 tries dbus_suspend_screensaver() if HAVE_DBUS, then at line 293 calls xss_screensaver_inhibit(g_x11_dpy, enable) -- with g_x11_dpy == NULL. 5. xss_screensaver_inhibit() at line 227 calls XScreenSaverQueryExtension(NULL, ...) -> libX11 dereferences -> SEGV. Why hasn't this been hit before? Most desktop builds set HAVE_DBUS -- pkg-config finds dbus-1 on basically every modern Linux distro -- so the dbus_suspend_screensaver() call at line 290 short-circuits and returns true before the xss path is reached. The CI runner doesn't apt-install libdbus-1-dev (HAVE_DBUS off) but libxss1 is pulled in transitively by xvfb / x11-utils (HAVE_XSCRNSAVER on at link time), exposing the corner. A symmetric defensive check already exists at line 255 in xdg_screensaver_inhibit(): if (g_x11_dpy && g_x11_win) ... The xss path was the missing-guard one. * gfx/common/x11_common.c Add a `if (!dpy) return false;` guard at the top of xss_screensaver_inhibit(). Applies only to the HAVE_XSCRNSAVER branch; the no-op stub at line 237 already discards `dpy` with `(void) dpy;` and returns false. Inline comment explaining the failure mode points at the relevant call sites in sdl2_common.c so a future maintainer can trace the invariant without re-deriving it. Verified locally: - The added check returns false for NULL dpy without invoking libX11 (matches the existing "extension not present" return contract). Non-NULL dpy paths are unchanged. - xss_screensaver_inhibit() is `static`, single caller (line 311 in x11_suspend_screensaver) -- no other reachable code paths to audit. Cannot verify locally that this resolves the CI SEGV without rebuilding retroarch with the same apt set the runner has -- the next CI run is the experiment. Expected outcome: xss_screensaver_inhibit() returns false harmlessly, the function falls through to the xdg_screensaver path (which has its own NULL guard), and the SDL2 smoke proceeds into retro_run.
…oy paths Four issues found in the same audit pass over gfx/drivers/vulkan.c, all in the create-with-old / destroy-buffer ownership protocol. None of them are the trigger for the iOS materialui content-close abort that prompted the audit (the font teardown does not go through the patched paths), but they are real and worth fixing on their own. * vulkan_create_texture: VkDeviceMemory leak on alloc failure (gfx/drivers/vulkan.c) When called with a non-NULL `old`, the function destroys old's view/image/buffer at the top of the `if (old)` block, then either pilfers old->memory or allocates fresh memory for tex.memory. The trailing `if (old)` cleanup is responsible for freeing old->memory in the no-pilfer path (skipped by the pilfer branch via old->memory = VK_NULL_HANDLE). The vkAllocateMemory failure branch returns early before reaching that cleanup. At return, old->view/image/buffer are dangling but zeroed by the caller's `*old = tex` overwrite, while old->memory still holds the original live VkDeviceMemory and is now unreachable. Pure leak per failed call; reachable from every call site that passes a real `old` (vk_per_frame texture realloc on resize, MUI/RGUI set_texture, readback staging realloc). Inline an equivalent cleanup before the early return: unmap if mapped, free old->memory, zero old. Mirrors the unmap-before-free the pilfer branch already does on line 1277. * vulkan_create_texture: free-while-mapped on no-pilfer cleanup (gfx/drivers/vulkan.c) The trailing `if (old)` cleanup was calling vkFreeMemory on old->memory without unmapping first, even though the pilfer branch a few lines up does unmap. vkFreeMemory implicitly unmaps so this is not a spec violation, but MoltenVK has historically handled free-while-mapped poorly (KhronosGroup/MoltenVK#957) and the asymmetry with the pilfer branch is gratuitous. Add the `if (old->mapped) vkUnmapMemory(...)` guard before the free. * vulkan_destroy_buffer: VUID-vkFreeMemory-memory-00677 violation (gfx/drivers/vulkan.c) Order was: vkUnmapMemory (unconditional) -> vkFreeMemory -> vkDestroyBuffer. Two problems. First, the memory is freed while buffer->buffer is still bound to it, which the spec forbids per VUID-vkFreeMemory-memory-00677 -- vulkan_destroy_texture (line 974) gets this right; this function was the outlier. Second, vkUnmapMemory was unconditional, but vulkan_create_buffer leaves buffer->mapped == NULL when its initial vkMapMemory fails, so a destroy on a never-mapped buffer would unmap memory that is not currently mapped. Fix the order to unmap (gated on buffer->mapped) -> destroy buffer -> free memory, with NULL-handle guards on the destroys for symmetry with vulkan_destroy_texture. * vulkan_init_quad_ibo: same VUID violation in vkMapMemory failure path (gfx/drivers/vulkan.c) Identical wrong order in the IBO-init map-failure cleanup -- frees vk->quad_ibo.memory before destroying vk->quad_ibo.buffer that is still bound to it. Swap the two calls. vulkan_deinit_quad_ibo itself was already correct.
rgui_render() was filling the entire framebuffer with bg_dark_color before drawing the info messagebox, which erased the menu underneath and left the popup floating against an empty background. The messagebox already paints its own opaque background, border, and drop shadow within its footprint, so the full-screen fill served no purpose other than destroying the menu render that had just been composed above it. Drop it; the popup now appears on top of the visible menu as intended.
Five independent issues found during a focused audit of menu/drivers/
xmb.c. Two are real corruption vectors driven by uninitialized memory
and a stale-pointer free ordering; two are size_t underflows in the
ticker helpers that don't fire on the current XMB call sites but
violate the contract a future caller could plausibly hit; and one is
an unused heap allocation on every main-menu init. Grouped together
because they share an audit context and none is large enough to stand
alone.
* xmb_alloc_node: uninitialized gfx_thumbnail_t fields
(menu/drivers/xmb.c)
The comment above xmb_alloc_node deliberately uses malloc instead
of calloc to avoid zeroing the multi-KB thumbnail_path_data char
arrays on every node alloc on big lists (MAME, FBA). Per-field
init only sets node->thumbnail_icon.icon.texture = 0, leaving the
rest of the gfx_thumbnail_t substruct (status, flags, alpha,
delay_timer, width, height) and all of thumbnail_path_data
uninitialized.
When a node is later freed without its thumbnail ever having been
loaded -- the common case for the line-edit path at the bottom of
the file (xmb_list_init -> xmb_alloc_node, allocated and stored
into list->list[i].userdata) -- xmb_free_node calls
gfx_thumbnail_reset, which reads `thumbnail->flags &
GFX_THUMB_FLAG_FADE_ACTIVE` and `thumbnail->texture` before
initializing them. Uninit `flags` may match the bit and trigger
gfx_animation_kill_by_tag against arbitrary state; uninit
`texture != 0` feeds a garbage handle to
video_driver_texture_unload. The latter is the actual heap
corruption vector -- driver-side texture maps treat the handle as
an index/key into their own allocations.
Separately, the lazy thumbnail-path resolution in the render path
reads `thumbnail_icon->thumbnail_path_data.icon_path[0]` to decide
whether resolution is needed; on a freshly-allocated node that
byte is uninitialized.
Zero the small gfx_thumbnail_t substruct (~32 bytes) and clear
icon_path[0] only. This preserves the original optimization that
kept thumbnail_path_data's PATH_MAX/NAME_MAX char arrays unzeroed,
since the path-data fields are written via fill_pathname/strlcpy
before any read of those fields.
* playlist_db_node_map freed after the nodes it borrows from
(menu/drivers/xmb.c)
The map's values are non-owning xmb_node_t* pointers into
xmb->horizontal_list[i].userdata, populated in
xmb_context_reset_horizontal_list. In xmb_refresh_horizontal_list,
xmb_free, and the xmb_init error path the order was: free nodes
via xmb_free_list_nodes, then RHMAP_FREE the map. Between those
two calls the map holds dangling pointers.
No re-entry path I could trace would actually read the map during
that window today (the menu is single-threaded and
xmb_free_list_nodes calls only gfx_thumbnail_reset and free), but
the invariant isn't enforced and the fix is a one-line reorder.
Free the map first at all three sites.
* xmb_animation_line_ticker_smooth: size_t underflow in fade-string
copy length (menu/drivers/xmb.c)
At the two fade-string memcpy sites:
if (copy_len >= line_ticker->top_fade_str_len)
copy_len = line_ticker->top_fade_str_len - 1;
if top_fade_str_len == 0 the comparison is true, copy_len
underflows to SIZE_MAX, the memcpy writes effectively unbounded
bytes, and the trailing NUL store at line_ticker->top_fade_str
[copy_len] writes at SIZE_MAX offset. Same pattern for
bottom_fade_str_len.
XMB's only caller passes sizeof(stack_array) so the live path is
safe today, but xmb_animation_line_ticker_smooth takes the public
gfx_animation_ctx_line_ticker_smooth_t shape. Add the missing
`_str_len > 0` guard to each branch.
* xmb_animation_line_ticker: same underflow on line_ticker->len
(menu/drivers/xmb.c)
max_copy = line_ticker->len - 1 underflows when len == 0, then
copy_len > max_copy is false and the memcpy at line ~4566 writes
the full block size. The early sanity check guards line_len < 1
and max_lines < 1 but not len < 1. Add it for symmetry with the
smooth variant; same caller-contract argument.
* xmb_menu_init_list: dead heap allocation
(menu/drivers/xmb.c)
info.exts = strldup("lpl", sizeof("lpl")) has been allocated on
every main-menu init since the function was introduced.
DISPLAYLIST_MAIN_MENU never reads info->exts (only
directory-scanning displaylists like
DISPLAYLIST_DATABASE_PLAYLISTS_HORIZONTAL do, where the same line
legitimately filters .lpl files); the buffer is allocated, never
read, then freed by menu_displaylist_info_free. Almost certainly
copy-pasted from xmb_init_horizontal_list at line ~2927. Drop it.
Three issues found during a focused audit of menu/drivers/ozone.c. The first is a latent double-free in the deep-copy helper that the struct's own comment warned about; the second is a dangling-pointer window during teardown across three sites; the third is a dead heap allocation on every main-menu init. Same audit class as the recent xmb pass (commit 9a27e66) since ozone shares much of its data-flow shape with xmb. * ozone_copy_node: latent double-free of console_name (menu/drivers/ozone.c) ozone_node has a comment above its definition saying /* If you change this struct, also change ozone_alloc_node and ozone_copy_node */ ozone_alloc_node was kept in sync when console_name was added, ozone_copy_node was not. ozone_copy_node does *new_node = *old_node; new_node->fullpath = old_node->fullpath ? strdup(old_node->fullpath) : NULL; -- bitwise-copying every pointer field including console_name, then deep-copying only fullpath. When either the source or the copy is later passed to ozone_free_node, the free(node-> console_name) at the top of that function frees the same buffer twice. The bug is latent today: console_name is only populated on horizontal_list nodes (sidebar playlist tabs), and the only caller of ozone_copy_node is ozone_list_deep_copy, which copies the vertical selection_buf into selection_buf_old for the scroll-transition cache. Vertical entries don't carry a console_name, so no double-free actually fires from current call sites. But the precondition is unenforced -- any future feature that copies a horizontal node (or even adds a second string field to ozone_node and forgets the same step) re-opens the bug. Mirror fullpath's strdup-or-NULL pattern. * playlist_db_node_map freed after the nodes it borrows from (menu/drivers/ozone.c) Same shape as the xmb fix in 9a27e66. The map's values are non-owning ozone_node_t* pointers into ozone->horizontal_list[i]. userdata, populated in ozone_context_reset_horizontal_list. In ozone_refresh_horizontal_list, the ozone_init error path, and ozone_free, the order was: free nodes via ozone_free_list_nodes, then RHMAP_FREE the map. Between those two calls the map holds dangling pointers. No re-entry path I could trace would actually read the map during that window today, but the invariant isn't enforced and the fix is a one-line reorder per site. Free the map first at all three. * ozone_menu_init_list: dead heap allocation (menu/drivers/ozone.c) info.exts = strldup("lpl", sizeof("lpl")) is allocated on every main-menu init. DISPLAYLIST_MAIN_MENU never reads info->exts (only directory-scanning displaylists like DISPLAYLIST_DATABASE_PLAYLISTS_HORIZONTAL do, where the same call legitimately filters .lpl files); the buffer is allocated, never read, then freed by menu_displaylist_info_free. Almost certainly copy-pasted from ozone_init_horizontal_list at line ~5046, same as the xmb instance fixed in 9a27e66. Drop it.
word_wrap_wideglyph violated its implicit contract that the returned
length always fits inside the destination buffer. Three return paths
forwarded strlcpy()'s return value, which is strlen(src) per spec --
not bytes-actually-written -- so on truncation the function returned
a value larger than the destination it had written into.
The sister function word_wrap() has a `len < src_len + 1` guard up
front that bails to 0 when the buffer is too small, sidestepping
this entirely. word_wrap_wideglyph has no such guard and tries to
fit what it can, which is correct behaviour, but the inflated return
breaks every caller that uses it as a length argument.
Concrete impact: three menu drivers (xmb, ozone, materialui) feed
this return value as the `n` argument to memchr() over the wrapped
destination buffer when laying out messageboxes
(xmb_render_messagebox_internal at xmb.c line ~1183;
ozone_draw_messagebox at ozone.c line ~7362;
materialui_render_messagebox at materialui.c line ~2825). Pre-patch,
an inflated `n` walks memchr past the destination into adjacent stack
memory (the lines[] / line_lengths[] arrays the messagebox helpers
keep right next to the wrapped buffer). A '\n' byte found there
yields a wild pointer used in pointer arithmetic and as a length
argument to font_driver_get_message_width(), causing stack info
disclosure or a crash.
The wideglyph path is selected by msg_hash_get_wideglyph_str(), so
the bug is reachable in CJK locales (Japanese, Korean, Chinese) via
any messagebox payload exceeding MENU_LABEL_MAX_LENGTH (256 bytes
default) -- error notifications, file paths, network handshake
text, search results.
Two issues fixed in stdstring.c:
* Return value: every truncating return path
(`return strlcpy(...)` and `return (s - s_start) + strlcpy(...)`)
now clamps strlcpy's return to the actual bytes written:
copied = strlcpy(s, src, n);
if (copied >= n) copied = (n > 0) ? n - 1 : 0;
return [(s - s_start) +] copied;
This avoids introducing a strnlen() dependency (not used anywhere
else in libretro-common, portability concerns on old MSVC).
* `len` semantic drift: the loop body decremented `len` by char_len
on every byte written, while the early-return paths computed
`remaining = len - (s - s_start)`. Both expressions track "bytes
used", so the formula effectively double-subtracted, yielding a
too-small `remaining` and forcing the strlcpys to truncate harder
than necessary. The lastspace/lastwideglyph rewinds at lines ~411
and ~429 also moved `s` backward without a matching `len` adjustment,
desyncing the two over time. Drop the `len -= char_len`, compute
`remaining = len - (s - s_start)` once at the top of each loop
iteration (and reuse it for the buffer-overflow guard previously
spelled `if (char_len >= len)`). `len` now stays constant across
the function body, matching word_wrap()'s convention.
Regression test: libretro-common/samples/string/word_wrap_overflow
_test/. Five cases covering both truncating and non-truncating
inputs across the line-358 early-return path and the main-loop
late-return paths. The discriminator is two-fold: the test asserts
return-value-vs-bytes-written directly, AND mimics the menu drivers'
call shape by feeding the returned value to memchr() over the
destination, so an ASan build catches pre-patch failures via
heap-buffer-overflow on the memchr. Verified by running against
both pre-patch and post-patch stdstring.c:
pre-patch: short_input_tiny_dest fails
(returned=47, dst_size=16) -> exit 1
post-patch: all five cases pass -> exit 0
CI: word_wrap_overflow_test added to RUN_TARGETS in
.github/workflows/Linux-libretro-common-samples.yml. Builds under
SANITIZER=address,undefined like the other overflow regression
tests in the same workflow (rpng_chunk_overflow_test,
vfs_read_overflow_test, cdrom_cuesheet_overflow_test,
chd_meta_overflow_test, cdfs_dir_record_test, rzip_chunk_size_test).
Note on libretro-common/test/string/test_stdstring.c (the
libcheck-based suite): I considered adding a unit-test entry there
too, but the file is already broken at HEAD on master --
test_string_replacesubstr calls string_replace_substring with the
old 3-arg signature, but the API was extended to 5 args some time
ago; the test fails to compile. Wiring up Makefile.test in CI
would have caught that regression but is out of scope here.
The samples-based regression test is the working surface and
matches the established pattern for every other overflow test
in the tree.
The fallback path in rgui_render_osk that triggers when the on-screen keyboard cannot fit on the framebuffer was building its message via the unsafe `*pos += strlcpy(buf + *pos, src, len - *pos)` pattern. The construction was sound here because msg is large (NAME_MAX_LENGTH) and inputs are short, but the same string-build shape is what strlcpy_append exists to replace -- it's bound-checked, short-circuits cleanly on truncation, and avoids the off-by-one arithmetic around the embedded '\n'. Mirrors materialui_render's identical construction at materialui.c line ~8091, which was already converted in the strlcpy_append chain-conversion sweeps (commits 25ade82 / e446242 / 78c52ab).
Three independent issues found during a focused audit of menu/ drivers/materialui.c. One is a real heap overflow in the status-bar metadata builder driven by user-controllable input strings; one is an OOB write/read on power-user-sized playlist counts; one is a TOCTOU OOB read in the touch-tap handler. Same audit class as the recent xmb (9a27e66) and ozone (d56c728) commits, grouped together because they share an audit context and none is large enough to stand alone. * status-bar metadata: heap overflow via _len underflow in strlcpy chain (menu/drivers/materialui.c) materialui_render_process_entry_playlist_desktop builds the status-bar metadata string ("Core: <name> ⠀ <runtime> ⠀ <last played>") through a chain of six _len += strlcpy(buf + _len, src, sizeof(buf) - _len); calls. This pattern is NOT self-bounding: strlcpy returns strlen(src), not bytes-actually-written, so when any intermediate call truncates -- because the source is longer than the remaining buffer space -- _len overshoots sizeof(status_bar.str). The next iteration's `sizeof(...) - _len` underflows size_t to ~SIZE_MAX, the strlcpy treats the destination as essentially infinite, and writes proceed past status_bar.str into adjacent struct fields (runtime_fallback_str, last_played_fallback_str, then whatever follows in materialui_handle_t on the heap). Reachable via any path that produces a long source string: a crafted playlist entry with an outsize core_name, a long runtime log entry, or a verbose locale translation of MENU_ENUM_LABEL_VALUE_PLAYLIST_SUBLABEL_CORE. Convert the chain to strlcpy_append, the bound-checked replacement added in 78c52ab and applied across the codebase in 25ade82 / e446242. After truncation strlcpy_append clamps *pos to len-1 so the chain short-circuits cleanly instead of propagating an underflow. * playlist_selection[]: OOB write/read for users with too many playlists (menu/drivers/materialui.c) mui->playlist_selection is sized [NAME_MAX_LENGTH], which is the *file/dir name length cap* (128 on small builds, 256 on default) -- the wrong constant for what's actually a per-playlist-tab remembered-selection cache. materialui_navigation_set writes mui->playlist_selection[mui->playlist_selection_ptr] = selection; with no bound check on the index, where the index is the user's row in the playlists tab (also stored unchecked one branch above). A user with > NAME_MAX_LENGTH playlists -- not implausible for someone with FBNeo, MAME, and every console they own -- navigating to row N and then entering any playlist will OOB-write at offset N into adjacent struct fields and the heap beyond. The read site in materialui_populate_entries has the matching OOB read. Two-part fix. Replace NAME_MAX_LENGTH with a purpose-built MUI_PLAYLIST_SELECTION_MAX (1024 -- comfortable headroom for any plausible setup, ~8 KB cache on mui). Bound-check both the write and read against that constant. Selections in slots beyond MUI_PLAYLIST_SELECTION_MAX simply aren't remembered; falling through to the existing default-selection behaviour is preferable to a heap corruption. * materialui_pointer_up TAP path: OOB read on stale ptr (menu/drivers/materialui.c) The TAP-gesture branch does list = MENU_LIST_GET_SELECTION(menu_list, 0); if (!list) break; if (!(node = (materialui_node_t*)list->list[ptr].userdata)) break; with no bound check on ptr against list->size. ptr was set in a prior render frame's hit-test loop, where it was bounded by that frame's entries_end. Between render and event delivery the list can be repopulated -- search filter applied, navigation back/ forward, async list rebuild -- leaving ptr stale and possibly past the new list end. The neighbouring LONG_PRESS branch (line ~11199) and the swipe handler at materialui_pointer_up_swipe_horz_default (line ~10874) both already guard with `ptr < entries_end`; the TAP path missed the same defence. Add it.
Four independent issues found during a focused audit of gfx/ gfx_thumbnail.c, the subsystem touched by all four menu drivers (xmb, ozone, materialui, rgui). Same audit class as the recent xmb (9a27e66), ozone (d56c728), and materialui (d5e598a) commits. * gfx_thumbnail_get_path: missing break on LEFT case (gfx/gfx_thumbnail.c) The switch in gfx_thumbnail_get_path() that selects which path to return based on thumbnail_id is missing a `break` after the GFX_THUMBNAIL_LEFT case. When a caller asks for the LEFT thumbnail and *path_data->left_path is empty, control falls through to GFX_THUMBNAIL_ICON; if *path_data->icon_path is populated, the function returns the icon path *as if it were the left thumbnail*. The caller then loads the icon as the left thumbnail texture. Wrong-content substitution rather than memory corruption, but silent and reachable on any menu where left thumbnails are unset but icons are configured (the default for most playlist configurations on xmb/ozone/materialui). Add the missing break. * gfx_thumbnail_get_content_dir: stack buffer overflow on long directory paths (gfx/gfx_thumbnail.c) The function extracts the basename of the directory portion of path_data->content_path by copying that prefix into a local tmp_buf[NAME_MAX_LENGTH] and then calling path_basename_nocompression() on the copy. The strlcpy size passed is the directory-portion length _len, which is bounded only by `_len < PATH_MAX_LENGTH` (i.e. up to 2047) -- but tmp_buf is only NAME_MAX_LENGTH (256) bytes. Whenever the directory portion exceeds 255 chars, strlcpy writes up to _len-1 bytes into the smaller tmp_buf and corrupts the stack past it. Reachable on systems with deep folder hierarchies; content_path is set from runtime_content_path_basename and other content paths capped at PATH_MAX_LENGTH, so 500+ char directory prefixes are entirely plausible (a savefile path nested under the user's home directory plus a deep ROM organisation scheme). Fix: anchor the copy at the latest position that still lets the directory's last segment fit in tmp_buf, then take the basename from there. We never needed the full prefix in tmp_buf -- only its tail -- so dropping the leading bytes when _len > sizeof(tmp_buf) gives the same answer without the overflow. * gfx_savestate_thumbnail_get_path: _len underflow in path build (gfx/gfx_thumbnail.c) Same unsafe pattern that the recent strlcpy_append sweep (78c52ab / 25ade82 / e446242 / d5e598a) was created to replace, applied here to a mixed strlcpy + snprintf chain: _len = strlcpy(s, state_name, len); if (state_slot > 0) _len += snprintf(s + _len, len - _len, "%d", state_slot); ... strlcpy(s + _len, FILE_PATH_PNG_EXTENSION, len - _len); Both strlcpy and snprintf return their would-be length on truncation, so when state_name approaches @len, _len overshoots @len, the next `len - _len` subtraction underflows size_t to ~SIZE_MAX, and subsequent calls treat the destination as essentially infinite. Reachable: state_name is runloop_st->name.savestate which is sized PATH_MAX_LENGTH; on a deep ROM directory the savestate name itself can fill the buffer, and the slot suffix or .png extension push the running total past @len. Add explicit truncation guards after each step so the chain stays inside the buffer. Also add the missing `!s || !len` guard at the top -- the historical s[0] = '\0' before the NULL check was a latent NULL-deref. * gfx_thumbnail_path_init: malloc -> calloc to zero playlist_index (gfx/gfx_thumbnail.c) gfx_thumbnail_path_init() allocates the path_data struct via malloc and then calls gfx_thumbnail_path_reset(), which resets the string buffers and the three playlist_*_mode fields but leaves playlist_index untouched -- garbage from malloc until the first set_content_*() call writes it. Read sites in xmb (xmb_set_title at xmb.c:2566 and the related fallback at xmb.c:7462) sample thumbnail_path_data->playlist_index before any setter has necessarily run, then pass it to playlist_get_index(). The callee bounds-checks against the playlist's size, so a garbage index just produces a stale pl_entry rather than a crash, but the code is latently UB. Switch to calloc. Cheaper than touching path_reset's API and forecloses the issue for any future read site that doesn't replicate the bounds-check defence.
task_set_title() and task_set_error() take ownership of a heap pointer and rely on the task system to free it at completion, but neither setter frees the *previous* value -- so calling either one twice on the same task without an intervening free leaks the prior string. task_free_title() has long existed for the title side of this protocol; the error side had no symmetric helper at all, forcing callers that legitimately needed to update the error mid- task to either guard with task_get_error() or just live with the leak. This commit: * Adds task_free_error(), symmetric to the pre-existing task_free_title() (both lock property_lock under HAVE_THREADS, free if non-NULL, NULL out the field). * Documents the free-then-set protocol clearly in queues/task_queue.h on both setters, both struct fields, and cross-references the helpers via @see -- the previous "@warning This does not free the existing X" line was correct but easy to miss, and three sites in tasks/task_cloudsync.c missed it. * Fixes those three task_cloudsync.c violators (lines 107, 1281, 1331). task_cloudsync sets the initial title at line 1439 via direct assignment, then later replaces it via task_set_title in three callbacks without calling task_free_title first. Each replacement leaks the prior strdup. * Adds a regression test under libretro-common/samples/queues/ task_queue_title_error_test/ exercising both helpers and the free-then-set protocol (single-set, replace-via-free, free-on- null no-op, double-free no-op, 50-iteration stress chain). Wires it into the existing libretro-common-samples CI workflow, so it builds under -fsanitize=address,undefined and LSan flags any future regression to the protocol. The test compiles task_queue.c without HAVE_THREADS so the slock / scond / sthread paths drop out cleanly -- the title/error protocol is identical with and without threading; the locks just serialise it. The single remaining external dependency (cpu_features_get_time_usec) is provided as a trivial stub since no test path drives the queue through gather(). Verified pre-patch: the test fails to link without task_free_error (undefined reference), confirming it genuinely exercises the new symbol rather than passing trivially. Verified post-patch: ASan + UBSan + LSan clean, all ten cases pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )