
[pull] master from libretro:master#972

Merged
pull[bot] merged 10 commits into Alexandre1er:master from libretro:master
Apr 28, 2026

Conversation


pull[bot] commented Apr 28, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

LibretroAdmin and others added 10 commits April 28, 2026 01:39
screening

Three closely related issues stacked on top of each other meant that
a compile failure in any single pass of a multi-pass preset
permanently bricked the driver until restart:

1. d3d9_cg_renderchain_add_pass discarded d3d9_cg_load_program's
   return value.  When a pass like mdapt-pass3.cg failed to compile
   (C5109 domain-conflict, etc.), the pass was still appended to the
   chain with vprg/fprg = NULL.  The render path then called
   cgD3D9BindProgram(NULL) every frame, which leaves the device in
   an unspecified state -- on the user's GPU, that's a black core
   frame with menu/widgets/overlay still drawing fine (those use
   fixed-function or stock-blend, not the renderchain).

2. d3d9_cg_load_program's error: label leaked the partially-created
   Cg program(s).  In the cgD3D9LoadProgram failure case, pass->fprg
   was a valid Cg program object (compile succeeded, GPU-load did
   not), and any subsequent code path that didn't NULL-check would
   happily try to bind it.  Worse, the leaked program objects pin
   the surrounding context until cgDestroyContext runs.

3. d3d9_cg_set_shader's failure handling left d3d->renderchain_data
   pointing at a half-built chain (whatever passes succeeded before
   the failure, plus the bad pass with NULL programs after #1).  The
   driver kept rendering through that chain.  Even reapplying the
   stock shader from the menu wouldn't recover, because the menu's
   "Apply Shader" path also goes through set_shader -> process_shader
   -> restore, hitting the same broken state if anything along the
   way faulted.

Fix all three:

- add_pass now memsets the local pass struct (so the error path has
  a well-defined struct to walk), checks load_program's return, and
  funnels every failure through a single error: label that releases
  the Cg programs, vertex declaration, vertex buffer, texture, and
  attrib_map it created before the failure point.

- load_program's error: label cgDestroyProgram()s pass->vprg/fprg
  (whichever is non-NULL) and zeroes them, so callers that ignore
  the return value still see a clean pass struct.

- set_shader, on process_shader-or-restore failure, frees
  d3d->shader_path, runs process_shader again to populate the stock
  defaults, and calls restore once more to rebuild the chain
  against the stock shader.  The original return value (false) is
  preserved, so the menu still surfaces the failure to the user --
  they just don't get stuck on a black screen anymore.

The init-ordering fix (call process_shader before initialize on
the first init path so init_chain doesn't read a calloc-zeroed
fbo struct) is also included, since both regressions would have
needed to be tested together anyway.

Tested with snes9x2010 + DKC2 + sequence: stock -> testie-crt
(7-pass, works) -> mdapt+xbr-hybrid+aa (mdapt-pass3 C5109 failure)
-> stock (recovered).  Pre-fix, the third step stayed black;
post-fix, it returns to plain stock rendering.
Many community .slangp presets -- notably Mega_Bezel -- write absolute
scale dimensions as floats, e.g.

   scale_type_x26 = absolute
   scale_x26      = 800.0

Before commit 1d9ed2e ("file/config_file: reject non-numeric input in
integer getters"), config_get_int silently accepted "800.0" as 800
because strtol stops at the '.', returns 800, and leaves errno == 0.
The old getter only checked errno, so it returned true with *in = 800.

After 1d9ed2e -- correctly -- config_get_int rejects strings with
trailing garbage and leaves *in untouched. In video_shader_parse_pass
the local iattr is initialised to 0 but scale->abs_x / abs_y are not
assigned on failure, so the absolute-scaled passes end up with a 0
dimension. The Vulkan driver then tries to build a 0×N (or 0×0)
framebuffer and crashes.

Fall back to config_get_float + roundf on the absolute branches so
the long-tail of presets using float-shaped absolute scales keeps
working without resurrecting the silent-accept behaviour in
config_file.

Hopefully fixes #18981.
…wildcard replacement

video_shader_replace_wildcards_impl built each replacement
value into a 256-byte stack buffer:

  char replace_text[256];
  _len = strlcpy(replace_text, source, sizeof(replace_text));

then unbounded:

  memcpy(dst + prefix, replace_text, _len);
  strlcpy(dst + prefix + _len, found + token_len,
          PATH_MAX_LENGTH - prefix - _len);

Two related primitives were exposed:

 1. strlcpy returns strlen(source) regardless of destination
    capacity.  When source resolved to a $CONTENT-DIR$ /
    $PRESET-DIR$ / $CORE$ / $GAME$ value longer than 256
    bytes (DIR_MAX_LENGTH is 4096, library_name and
    path_basename are platform-dependent and routinely
    exceed 256), _len was strlen(source).  The memcpy then
    read past the end of the 256-byte replace_text array
    into adjacent stack memory.

 2. The same _len fed (size_t)(PATH_MAX_LENGTH - prefix -
    _len) on the next line.  When prefix + _len exceeded
    PATH_MAX_LENGTH the subtraction underflowed size_t to a
    huge value and the strlcpy walked off the end of dst.
    Even with the _len clamp from (1), a wildcard token
    placed near the end of the input path (legitimate for
    an attacker-supplied .slangp/.glslp) lets prefix +
    REPLACE_BUF_SZ exceed PATH_MAX_LENGTH on its own.

The snprintf-chained branches (CORE_REQUESTED_ROTATION,
VIDEO_USER_ROTATION, VIDEO_FINAL_ROTATION,
SCREEN_ORIENTATION) had a sibling problem: snprintf returns
the would-have-written length, so _len += snprintf(...) on
a saturated buffer pushes _len past sizeof(replace_text)
even when the destination ate the entire size budget.

Reachable via shader presets the user opens.  These come
from the online updater (typically slang-shaders or
common-shaders), the user's own preset directory, and from
content-bundled presets distributed alongside ROMs.

Fix: clamp at the use site so all the upstream branches are
covered in one place.

  if (_len >= sizeof(replace_text))
     _len = sizeof(replace_text) - 1;
  if (prefix >= PATH_MAX_LENGTH || _len >= PATH_MAX_LENGTH - prefix)
     break;

Adds samples/tasks/video_shader_parse/video_shader_wildcard_test
as a regression test, registered in
.github/workflows/Linux-samples-tasks.yml as an ASan-enabled
step.  Four cases:

  - short replacement produces correct output
  - long source (600 chars) clamped without OOB read
  - huge source (4000 chars) clamped without strlcpy underflow
  - prefix near PATH_MAX_LENGTH correctly bails out

Test follows the existing samples/tasks pattern (verbatim
copy of the post-fix arithmetic block, ASan as the
discriminator) used by archive_name_safety_test and
http_method_match_test.  Verified to fire under ASan when
the clamps are removed: stack-buffer-overflow READ of size
600 at offset 288 in replace_text (a 256-byte buffer).
With the clamps the test passes clean.

Note on test scope: the test exercises a verbatim copy of
the post-replacement arithmetic block from
video_shader_parse.c lines 344-358, not the full
video_shader_replace_wildcards_impl.  The full function
depends on settings_t / runloop_state / video driver
globals that aren't tractable to stub for a unit test --
following the convention used by archive_name_safety_test
("if task_decompress.c amends the predicate, the copy here
must follow"), the test's verbatim block must track the
production block.  A comment in the test points at the
production lines explicitly.
… leak sites

Audit pass over every intfstream_close call site in production
code: out of 53 sites, 48 were already followed by free() (the
established libretro-common convention -- intfstream_close
closes the inner backend, the caller owns the struct).  Five
were not, leaking 48 bytes per call.

Five fixed:

  - input/bsv/bsvmovie.c:1315  check_stream  (open_memory)
  - input/bsv/bsvmovie.c:1643  out_stream    (open_writable_memory)
  - input/bsv/bsvmovie.c:1845  read_mem      (open_memory)
  - libretro-common/formats/cdfs/cdfs.c:433  cue_stream  (alloc-fail path)
  - libretro-common/formats/cdfs/cdfs.c:438  cue_stream  (success path)

The bsv leaks are the more interesting set: those code paths
run on every replay verification (check_stream), every
checkpoint encode (out_stream), and every checkpoint decode
(read_mem) -- so during a long netplay session with an active
replay or in checkpointed playback the leak compounds linearly
with elapsed time.  Each instance is small (48 bytes per
intfstream_t struct) but they accumulate.

The cdfs leak is per-cue-file: every time RetroArch parses a
.cue (loading a multi-track CD image, bulk database scan over
a CD library) it leaks 48 bytes.  Both close paths in
cdfs_open_cue_path were affected.

Note on contract: changing intfstream_close to free the struct
itself was considered and rejected.  intfstream_close ships in
libretro-common which is vendored into many cores; cores that
correctly compensate for the existing convention by calling
free() after close (the same pattern this commit standardises
on inside RetroArch itself) would silently double-free under a
contract change.  Fix-the-call-sites is the only safe path.

The 48 already-correct call sites are left as-is.  This commit
is a pure addition of missing free() calls; no behavioural or
cleanup-order changes.
input_driver.c's poll-time remap loop indexed
mapper->analog_value[i][n] using values from
settings->uints.input_remap_ids[i][k] without proper bounds
checks in two places:

  1. Button -> analog branch (line ~7392):
        handle->analog_value[i][remap_button - RARCH_FIRST_CUSTOM_BIND] = ...
     No bound on remap_button at all.  Reachable values:
     [RARCH_FIRST_CUSTOM_BIND, RARCH_UNMAPPED) i.e. [16, 1024),
     producing remap_axis_bind in [0, 1008).  Indexing past the
     8-element row writes int16_t values up to ~2014 bytes past
     the end of analog_value, into adjacent input_mapper_t
     fields (keys[], key_button[], buttons[]) and beyond into
     adjacent input_driver_state_t fields.

  2. Analog -> analog branch (line ~7427):
        if (remap_axis_bind < sizeof(handle->analog_value[i]))
           handle->analog_value[i][remap_axis_bind] = ...
     The sizeof() returns 16 (the byte size of the int16_t[8]
     row), not the element count.  Indices 8..15 passed the
     check and OOB-wrote into the next user's analog_value row,
     or past the array entirely for the last user.

Both sites read remap_button / remap_axis from
settings->uints.input_remap_ids[i][k], which is loaded from
.rmp files via input_remapping_load_file in configuration.c.
Before this patch, the loader fed config_get_int's result directly
into storage with no range check, so a malformed .rmp could
plant any int there.  An attacker who can deliver a .rmp
alongside a ROM (a common distribution channel for
"controller config bundles") gets a 60Hz attacker-controlled
out-of-bounds write into adjacent input state -- including
the keys[] keyboard bitmask for the last user, which controls
which RetroArch hotkeys appear pressed.

Two-pronged fix:

  - Load-time validation in input_remapping_load_file:
    reject _remap values outside [0, RARCH_ANALOG_BIND_LIST_END)
    (clamping to RARCH_UNMAPPED) before storing into
    settings->uints.input_remap_ids[i][j].  Applied to both the
    button-branch (j < RARCH_FIRST_CUSTOM_BIND) and the
    analog-branch (j >= RARCH_FIRST_CUSTOM_BIND) of the
    loader.

  - Use-site bounds in input_driver.c: add an
    ARRAY_SIZE-based bound to the previously-unguarded
    button -> analog branch, and replace the broken sizeof()
    bound with ARRAY_SIZE() in the analog -> analog branch.
    Defence in depth: even if the load-time validator misses
    a path (or a future caller writes to input_remap_ids[]
    bypassing the loader), the use sites are now safe.

Adds samples/tasks/input_remap/input_remap_bounds_test as a
regression test, registered in
.github/workflows/Linux-samples-tasks.yml as an ASan-enabled
step.  Seven cases covering the load-time clamp's accept and
reject paths, in-range writes going to the correct slot,
out-of-range writes being skipped without OOB, and the
specific historical sizeof-vs-ARRAY_SIZE bug shape.

Test follows the existing samples/tasks pattern (verbatim copy
of the post-fix predicates with a maintenance-contract comment;
ASan as the discriminator).  Verified to fire under ASan when
the bound check is removed: heap-buffer-overflow WRITE of size
2 at offset 16 of a 16-byte allocation, in
do_button_to_analog_write.  With the bound the test passes
clean.
…size


bsv_movie_handle_read_input_event's read path read attacker-
controlled event counts straight from the .bsv replay file
and used them as loop bounds, indexing into fixed-size
arrays:

  uint8_t   key_event_count;            (.bsv format)
  uint16_t  input_event_count;          (.bsv format)

  bsv_key_data_t   key_events[128];     (struct bsv_movie)
  bsv_input_data_t input_events[512];   (struct bsv_movie)

Before this patch, the read loops at lines 822-836 (key) and
848-862 (input) iterated count-times into the fixed-size
arrays with no upper-bound check.  A malformed .bsv could
supply:

  - key_event_count = 255 -> (255-128)*sizeof(bsv_key_data_t)
                         = 1524 bytes of OOB heap-write past
                         key_events[]
  - input_event_count = 65535 ->
                         (65535-512)*sizeof(bsv_input_data_t)
                         = 520184 bytes of OOB heap-write
                         past input_events[]

bsv_movie_t is heap-allocated via calloc in
tasks/task_movie.c::bsv_movie_init_internal so this is a
heap-buffer-overflow with attacker-chosen 8- or 12-byte
payloads at attacker-chosen offsets, producing corruption
of the rest of the bsv_movie_t struct (rewind state,
statestream pointers, save buffers) and far beyond it.

Reachability: the user opens a malformed .bsv file via
Menu -> "Load Replay File" or the --bsvplay command-line
argument.  Same threat class as CDFS / CHD / BPS file-load
bugs -- requires the user to deliberately load attacker-
supplied content.  Initial audit notes flagged this as
netplay-reachable but verification of network/netplay/
netplay_frontend.c shows that netplay's "replay" terminology
refers to internal frame-replay from the netplay ring buffer
and does not invoke the .bsv file-read path; this is a
file-loading bug only.

Fix: in input/bsv/bsvmovie.c, reject the file as malformed
and end playback if either count exceeds its array's
ARRAY_SIZE.  A legitimate writer cannot produce more events
than the array holds because the writer's append path at
lines 489-490 / 475-476 shares the same backing array.

Adds samples/tasks/bsv_replay/bsv_replay_bounds_test as a
regression test, registered in
.github/workflows/Linux-samples-tasks.yml as an ASan-enabled
step.  Four cases covering legitimate counts up to and
including the cap, and out-of-range counts being rejected
without writes.

Test follows the existing samples/tasks pattern (verbatim
copy of the post-fix predicate with maintenance-contract
comment, ASan as the discriminator).  Verified to fire
under ASan when the predicate always returns true:
heap-buffer-overflow WRITE of size 1 at offset 1536 of a
1536-byte allocation in simulate_key_event_read.  With the
predicate the test passes clean.

Note on writer side: the append site at line 489
   movie->input_events[movie->input_event_count++] = data;
has no bound check either, but it's not fed by attacker
data and a single frame is unlikely to produce more than
512 events in practice.  Not addressed by this commit; left
for a separate audit pass.
… buffers

Two unrelated stack/heap overflows in disc-image format
parsers, both reachable when the user loads a malicious
.iso/.bin/.cue/.chd.  Same threat class as the BSV file-load
bugs.

cdfs_find_file (libretro-common/formats/cdfs/cdfs.c)
walks ECMA-119 directory records read into a 2048-byte
stack buffer.  The loop guard only checked that the
record header byte was in bounds; tmp[33 + path_length],
tmp[2..4], and tmp[10..13] could read past the buffer
end, leaking adjacent stack bytes into ';' / '\0'
comparisons and feeding attacker-influenced data into
`sector` (redirecting subsequent reads) and `file->size`.
The advance tmp += tmp[0] with attacker-controlled tmp[0]
compounded this by jumping the iterator forward arbitrarily
when records claimed lengths shorter than the legal
minimum (33 + filename + 1).  Fix: require the entire
record window to fit inside the buffer in the loop guard,
and reject records whose claimed length is below the
filename-comparison minimum.

chdstream_get_meta (libretro-common/streams/chd_stream.c)
parsed GDROM metadata KEY:VALUE pairs read into a 256-byte
stack buffer.  The TYPE: / SUBTYPE: / PGTYPE: / PGSUB:
branches all did:

  while (p[len] && p[len] != ' ') len++;
  memcpy(md->FIELD, p, len);

with no upper bound on len -- the only terminator was the
meta-buffer NUL.  A malicious .chd with "TYPE:" + 200
non-space bytes stack-overflowed md->type[64] by 136 bytes,
clobbering subtype/pgtype/pgsub and (depending on layout)
the saved frame pointer.  Fix: apply the same
"len >= sizeof(md->FIELD)" reject already used by the
SCD-format parser earlier in the same function, to all
four GDROM fields.

Adds two regression tests under libretro-common/samples,
registered in .github/workflows/Linux-libretro-common-
samples.yml's RUN_TARGETS, with ASan + UBSan made the
default for that workflow (MAKE_ARGS env var) since both
tests rely on ASan as the bound-check discriminator.
Verified locally that all 22 existing RUN_TARGETS still
pass under the new sanitizer build.

Tests follow the existing samples pattern (verbatim copy
of the post-fix predicate with maintenance-contract
comment).  Verified to fire under ASan when the bounds are
removed: heap-buffer-overflow READ at offset 2048 of a
2048-byte allocation (cdfs); heap-buffer-overflow WRITE of
size 200 at offset 180 of a 180-byte allocation (chd).
Both pass clean with the bounds in place.
pull[bot] locked and limited conversation to collaborators Apr 28, 2026
pull[bot] added the ⤵️ pull label Apr 28, 2026
pull[bot] merged commit 669468d into Alexandre1er:master Apr 28, 2026
38 of 39 checks passed