Skip to content

[pull] master from libretro:master#976

Merged
pull[bot] merged 8 commits intoAlexandre1er:masterfrom
libretro:master
Apr 29, 2026
Merged

[pull] master from libretro:master#976
pull[bot] merged 8 commits intoAlexandre1er:masterfrom
libretro:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Apr 29, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

LibretroAdmin and others added 8 commits April 29, 2026 00:12
UBSan-instrumented run reported:

  menu/drivers/ozone.c:6401:13: runtime error: -10.0781 is outside
  the range of representable values of type 'unsigned int'

Origin: the value-text y argument to ozone_draw_entry_value() is
declared as unsigned, but the y expression at the call site is

    y                                                 /* size_t */
    + ozone->dimensions.entry_height / 2              /* int    */
    + ozone->fonts.entries_label.line_centre_offset   /* int    */
    + scroll_y                                        /* float  */

The trailing float promotes the whole sum to float.  scroll_y is
clamped to (-inf, 0] (lines 10435-10436), so when the topmost
partially-visible row's vertical centre lands above the header
boundary -- reachable during pointer/wheel scrolling -- the sum
goes slightly negative (UBSan saw -10.0781).  The implicit
float-to-unsigned conversion of a negative is undefined per C11
6.3.1.4 p1.

Runtime impact: low.  Every modern compiler wraps the conversion
to ~UINT_MAX in practice; ozone_draw_entry_value then promotes
that back to float as it forwards into gfx_display_draw_text(),
producing a coordinate around 4.29e9 that the rasteriser clips
off-screen.  The value text just doesn't render that frame,
visibly matching the off-screen-row case it would have hit on
the next frame anyway.

Fix: cast the sum through int (defined conversion for the range
these screen coordinates occupy: small positive y plus a bounded
negative scroll_y) so the negative case wraps in a defined
manner rather than triggering UB.  Matches the
(int)((float)y + scroll_y) idiom already used at lines 3581,
3595, 5970 and 5985 in the same file for sibling y arguments
that are signed-typed at the callee.

Also folds entry_height / 2 -> entry_height / 2.0f for
consistency with the surrounding label and sublabel y
expressions at lines 6319 and 6346, which already use float
division -- relevant only on the rare odd entry_height where
integer division would round half a pixel below the float
result.
UBSan-instrumented run reported, on the same arithmetic at two
sites in ozone_draw_sidebar():

  menu/drivers/ozone.c:3658:41: runtime error: -20.4167 is
  outside the range of representable values of type 'unsigned int'
  menu/drivers/ozone.c:3806:41: runtime error: -20.4167 is
  outside the range of representable values of type 'unsigned int'

Two pre-existing TODO/FIXME comments in this file already noted
the same trip (with a -12.549 sample) but punted on the fix.
Same situation also lurks in the sibling non-smooth-ticker
branches, which UBSan didn't happen to catch only because
ticker.len is size_t rather than unsigned and the smooth path is
the default.

Where the float comes from
--------------------------
ozone->dimensions_sidebar_width is a *float* "animated field"
(declared so at line 578) that tweens between 0 / collapsed /
normal sidebar widths via the gfx_animation system.  It's
initialised to 0.0f at startup (line 9324), so on the very first
frame after init -- and during every collapse/expand animation
-- it holds a small or fractional value.

Through line 3568,

    entry_width = (unsigned)dimensions_sidebar_width
                - sidebar_padding_horizontal * 2;

so entry_width is integer but tracks the float.  Then both
ticker computations evaluate

    entry_width - sidebar_entry_icon_size - 40 * scale_factor

scale_factor is float (last_scale_factor at line 580), so
40 * scale_factor is float, so the whole sum is float.  When
entry_width is small (mid-animation, or sidebar collapsed) the
sum lands below zero -- UBSan saw -20.4167 (40 * 1.5 - the
~icon_size offset against an entry_width near zero).

The implicit float-to-unsigned (or float-to-size_t) conversion
of a negative is UB per C11 6.3.1.4 p1.  Modern compilers wrap
to ~UINT_MAX/SIZE_MAX, which the ticker then treats as "plenty
of room", drawing too much text for one or two animation frames
until dimensions_sidebar_width settles.  Visually mostly
invisible; UBSan-fatally noisy.

Fix
---
Compute the available width in signed int, with the float
intruder explicitly cast through int

    avail_width = entry_width - icon_size - (int)(40.0f * scale_factor);

and clamp to zero before assignment to the unsigned/size_t
field:

    ticker_field_width = avail_width > 0 ? (unsigned)avail_width : 0;

This produces a defined zero-width when there's no room (the
correct semantic -- the ticker has nothing to draw and skips
rather than scribbling at a wrap-around offset).  Apply
identically at both sites (system tab loop and horizontal-list
loop) and use the same local in both the smooth and non-smooth
branches, which removes the duplicated arithmetic and applies
the same clamp to ticker.len's previously-buggy non-smooth
branch.  The two TODO/FIXME comments are removed since they're
now fixed rather than known-broken.

Also adds a glyph_width != 0 guard around the non-smooth divide.
glyph_width is populated during font init and is normally non-
zero by the time draw_sidebar runs, but a font that failed to
load would leave it at 0 and the existing code would divide by
zero -- belt-and-suspenders since we're touching the line
anyway.

Address-of-root-cause vs whack-a-mole
-------------------------------------
The deeper root cause is dimensions_sidebar_width being a float
that flows into integer pixel math, rather than these specific
sites.  Normalising the animated width to an int field would
require restructuring how the gfx_animation tween writes back
(its subject is a float* by API), so it's not a drive-by fix.
For now, every site that mixes scale_factor into integer pixel
arithmetic is a candidate UBSan trip; this commit handles the
two reported plus their immediate non-smooth counterparts.
Future trips will need similar local clamps.
xdg-screensaver's X11 backend shells out to xset internally. On systems
without x11-xserver-utils installed (minimal installs, containers, some
WMs) the resulting failures surface as confusing stderr spam:

  /usr/bin/xdg-screensaver: 885: xset: not found
  /usr/bin/xdg-screensaver: 893: [: Illegal number:
  /usr/bin/xdg-screensaver: 632: xset: not found

with no indication that the missing dependency is xset rather than
something in RetroArch itself.

Probe once on first use via 'command -v' (POSIX shell builtin, no extra
deps) for both xdg-screensaver and xset, with output silenced. If either
is missing, flip xdg_screensaver_available to false and log a single
explanatory line; subsequent calls short-circuit out cleanly. Probe is
gated behind a static bool so it runs at most once per process, and only
when we'd actually invoke xdg-screensaver -- the DBus inhibit path still
returns early before we ever get here.
…lacement-heap crash

MoltenVK 1.3.0 flipped MVK_CONFIG_USE_MTLHEAP to active by default for
non-AMD GPUs, routing VkDeviceMemory through a placement MTLHeap.
Placement heaps need macOS 13+ AND supportsPlacementHeaps; older Intel
Macs satisfy neither and abort at swapchain-image allocation:

  -[MTLHeapDescriptorInternal validateWithDevice:]:335:
    failed assertion `Placement heap type is not supported.'

Force the path off in vulkan_context_init() before the loader is
opened, while still respecting an explicit user override. "0" is
boolean-false on 1.2.x and MVK_CONFIG_USE_MTLHEAP_NEVER on 1.3.x, so
the fix is version-agnostic.

Fixes #18985.
… avoid placement-heap crash"

This reverts commit ea87b48.
The non-stopping branch of tpool_wait's predicate checked only
working_cnt, which is the number of jobs currently being executed
by a worker.  working_cnt is incremented when a worker dequeues a
job, not when a producer enqueues one, so:

    tp = tpool_create(N);
    for (i = 0; i < M; i++)
        tpool_add_work(tp, job, ctx);
    tpool_wait(tp);     /* could return immediately */
    /* read ctx state here, expecting all M jobs to have run */

was free to return before any worker had picked up the first job.
Producers that outpaced the workers' first dequeue would observe
zero or partial completion.

The worker's completion-side signal already covers the correct
condition (tpool.c:139 signals working_cond only when both
working_cnt == 0 and work_first == NULL), so the fix is just to
widen the wait predicate to match: also wait while work_first is
non-NULL.

The only in-tree caller is cores/libretro-ffmpeg/ffmpeg_core.c,
which uses tpool_wait to drain the pool before clearing the video
buffer.  All three call sites benefit from the corrected
semantics; none rely on the previous early-return behaviour.

Adds a regression test under
libretro-common/samples/rthreads/tpool_wait_test/ covering:
 - tpool_add_work then tpool_wait drains all jobs
 - tpool_create(0) defaults to a working pool
 - tpool_create / tpool_destroy roundtrip is heap-clean
 - tpool_destroy with queued-but-unrun work is heap-clean
   (tpool_destroy is documented to discard such work)

Hooks the new sample into the Linux libretro-common samples
workflow.  Built and run under SANITIZER=address,undefined like
the other samples; completes in under one second on a github-
hosted runner.

Verified the test catches the bug: against unpatched tpool.c,
test_work_executes_once and test_zero_threads_default fail with
counter=0 under ASan + UBSan.
Audit-pass on libretro-common/queues/fifo_queue.c found three
defensive gaps.  Each is currently latent (audio + gfx widget
callers all gate on FIFO_WRITE_AVAIL / FIFO_READ_AVAIL before
calling fifo_write / fifo_read) but the function trusts @len
unconditionally, so any future caller that forgets the gate
silently corrupts memory rather than dropping the request.

* fifo_write: out-of-bounds write when @len exceeds available
  space (libretro-common/queues/fifo_queue.c).  The wrap-around
  branch computes rest_write = len - first_write and memcpy()s
  rest_write bytes starting at buffer->buffer; for any @len
  exceeding @buffer->size, that second memcpy walks off the end
  of the backing allocation.  Worse, the gating test
  `buffer->end + len > buffer->size` itself wraps in size_t for
  huge @len, mis-routing the call into a single-memcpy branch
  with @len bytes of unbounded write.  Fix by capping @len at
  FIFO_WRITE_AVAIL(buffer) on entry.  No-op for current callers
  (they already cap); turns the latent overrun into a defined
  silent truncation for any future caller that forgets to gate.

* fifo_read: same out-of-bounds shape on the read side --
  rest_read bytes copied past the end of @buffer->buffer when
  @len exceeds the readable range.  Same fix: cap @len at
  FIFO_READ_AVAIL(buffer) on entry.

* fifo_initialize_internal: SIZE_MAX wrap (libretro-common/
  queues/fifo_queue.c).  The function reserves one ring slot
  for the empty/full distinction, so the actual allocation is
  (len + 1) bytes.  Passing SIZE_MAX wraps that addition to 0;
  calloc(1, 0) is allowed to satisfy with a non-NULL pointer
  to a zero-byte allocation, so initialisation can succeed and
  leave buf->size == 0.  The next fifo_write would then divide
  by zero on the `% buffer->size` end-pointer update.  No
  current caller asks for SIZE_MAX, so the rejection is purely
  defensive.

Also documents the new caps in the public header (fifo_write /
fifo_read) so callers know that exceeding available space is a
silent drop, not a silent overrun.

Adds a regression test under libretro-common/samples/queues/
fifo_queue_bounds_test/.  The test exercises:

  - SIZE_MAX initialisation (must be rejected),
  - over-write of 2048 bytes into a 100-byte ring (capped at 100,
    no OOB write),
  - over-read into a 64-byte buffer (capped at FIFO_READ_AVAIL,
    trailing bytes untouched),
  - SIZE_MAX @len passed to fifo_write with end != 0 (the
    integer-overflow case in the original gating test),
  - wrap-around correctness when the cap isn't engaged,
  - zero-length write/read no-ops.

Wired into the existing libretro-common-samples CI workflow with
SANITIZER=address,undefined.  Pre-patch the test fails: the
SIZE_MAX init succeeds (no rejection), and the over-write trips
ASan with a 1947-byte heap-buffer-overflow WRITE.  Post-patch
all seven cases pass under ASan + UBSan + LSan.

Note on residual scope
----------------------
The cap protects @buffer->buffer (the destination ring) but not
@in_buf (the caller-supplied source).  fifo_write reads exactly
@len bytes from @in_buf -- a buggy caller that passes
@len > sizeof(*in_buf) gets a source over-read that we cannot
detect from inside the function (no @in_buf size parameter).
That residual is unchanged by this commit; the surface area for
it is much smaller than the destination overrun (audio drivers
all pass full-size frame buffers).  Any future hardening would
require an API break to add @in_buf_len, which is out of scope.
@pull pull Bot locked and limited conversation to collaborators Apr 29, 2026
@pull pull Bot added the ⤵️ pull label Apr 29, 2026
@pull pull Bot merged commit 1a13965 into Alexandre1er:master Apr 29, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant