[pull] master from libretro:master #968

Merged
pull[bot] merged 7 commits into Alexandre1er:master from libretro:master on Apr 27, 2026

pull[bot] commented Apr 27, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

LibretroAdmin and others added 7 commits April 27, 2026 01:26
gl1_draw_tex sets glPixelStorei(GL_UNPACK_ROW_LENGTH, pot_width)
before its glTexImage2D and never resets it. GL_UNPACK_ROW_LENGTH
is global pixel-store state that persists until something changes
it, so every subsequent glTexImage2D in the frame inherits it as
the source row stride.

When this leaked stride does not match the source data's actual row
stride, glTexImage2D reads source rows with the wrong pitch and
writes the texture with glyphs landing at offsets that are shifted
relative to where atlas_offset_x/atlas_offset_y say they should be.
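The stride mismatch can be reproduced without a GL context. A minimal sketch in plain C, assuming a hypothetical `read_rows` helper that walks client memory the way an upload does (rows start `row_length` pixels apart, with 0 meaning "tightly packed"); it is an illustration, not the driver's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the source-side walk of a texture upload.
 * Row y is read starting at src + y * stride, where stride is
 * row_length if it is non-zero and width otherwise. One byte per
 * pixel keeps the arithmetic easy to follow. */
static void read_rows(uint8_t *dst, const uint8_t *src,
      size_t width, size_t height, size_t row_length)
{
   size_t x, y;
   size_t stride = row_length ? row_length : width;

   assert(dst && src);

   for (y = 0; y < height; y++)
      for (x = 0; x < width; x++)
         dst[y * width + x] = src[y * stride + x];
}
```

With a tightly packed 4x2 source whose bytes are 0, 1, 2, ..., a row length of 0 yields second-row texels 4..7, while a stale row length of 6 yields 6..9: the first row is intact and every later row is shifted, which is the kind of skew the glyph slots see.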

In ozone this manifests as the title and sidebar fonts (whichever
fonts are uploaded while a stale ROW_LENGTH is in effect) rendering
as horizontal stripe patterns instead of letters: the texcoords
baked into the vertex block point at slot positions, but those
positions in the actual texture contain the wrong glyph data —
typically the codepoint-0 fallback bitmap from atlas (0,0), which
freetype draws as horizontal bars.

The corruption is invisible to glGetTexImage readback because the
same ROW_LENGTH state that skewed the upload would skew the
readback symmetrically, so a 2D dump comes out looking correct as
an image even though the GPU sampler reading specific (s,t)
coordinates hits different texels than the bake math expects.

A context reset masks the bug because re-running font init under a
GL state where ROW_LENGTH happens to already be 0 produces a clean
upload.

Fix in two places, defense-in-depth:

1. gl1_raster_font_upload_atlas now explicitly sets
   GL_UNPACK_ALIGNMENT=1 and GL_UNPACK_ROW_LENGTH=0 immediately
   before glTexImage2D, so the upload is robust regardless of what
   state it inherits. This matches the existing pattern in
   gl3_raster_font_upload_atlas.

2. gl1_draw_tex now resets GL_UNPACK_ROW_LENGTH=0 after its
   glTexImage2D, so it stops leaking the value to other uploads in
   the first place.

Either fix alone resolves the symptom. Both together also protect
against future code paths that may set ROW_LENGTH and forget to
reset it.
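The two fixes can be modelled with a mock of the global pixel-store state. This is a sketch under stated assumptions: `mock_draw_tex`, `mock_upload_atlas` and the two globals are hypothetical names that only imitate the state handling, and no real GL call is made:

```c
#include <assert.h>

/* Mock of GL's global unpack state (not real GL). */
static int g_unpack_row_length = 0;
static int g_unpack_alignment  = 4;

/* Models a draw path that sets a row length for its own upload.
 * reset_after != 0 models fix 2: the value is cleared again so it
 * cannot leak into later uploads. */
static void mock_draw_tex(int pot_width, int reset_after)
{
   g_unpack_row_length = pot_width;
   /* ... the texture upload would happen here ... */
   if (reset_after)
      g_unpack_row_length = 0;
}

/* Models an atlas upload. defensive != 0 models fix 1: the state is
 * set explicitly right before the upload. Returns the source row
 * stride, in pixels, that the upload would end up using. */
static int mock_upload_atlas(int atlas_width, int defensive)
{
   assert(atlas_width > 0);

   if (defensive)
   {
      g_unpack_alignment  = 1;
      g_unpack_row_length = 0;
   }
   return g_unpack_row_length ? g_unpack_row_length : atlas_width;
}
```

After `mock_draw_tex(512, 0)`, a non-defensive `mock_upload_atlas(256, 0)` reports a stride of 512 instead of 256. Enabling either flag alone brings it back to 256, which mirrors the "either fix alone resolves the symptom" statement.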

The bug appears to be long-standing: it is present at least as far
back as April 2026 (commit c10115b) and is likely much older. It has
gone unnoticed because gl1 + ozone is rarely used in practice;
gl/glcore/Vulkan are the common paths.

VITA is excluded from the new glPixelStorei calls, matching the
existing #ifndef VITA guards on the other glPixelStorei sites in
this file.

gl1_render_overlay was missing essentially all of the GL state setup
needed to actually draw anything. Three issues, all fixed by mirroring
the patterns already used in gfx_display_gl1_draw and
gl1_raster_font_draw_vertices:

1. Client arrays were never enabled. The function assigned values to
   gl->coords.vertex/tex_coord/color, but those struct fields don't
   drive GL state — they need to be wired up via glEnableClientState
   plus glVertexPointer/glTexCoordPointer/glColorPointer pointing at
   the overlay coord buffers. Without them glDrawArrays runs with
   whatever client array state happened to be active from the
   previous draw, which generally meant nothing rendered.

2. The PROJECTION matrix was pushed but never popped, leaking a
   matrix-stack entry every frame. The MODELVIEW matrix was also
   never reset to identity.

3. The viewport size source was wrong. Callers pass video_width/
   video_height from video_info, which in gl1 is the emulated core
   frame size (e.g. 256x224 for SNES, 320x240 default for the menu
   surface) — not the window size. Fullscreen overlays need the
   actual window viewport, so use gl->screen_width/height (set from
   the context driver's get_video_size) instead, with fallback to the
   passed-in width/height for safety. Same semantic mismatch fixed
   recently for the statistics font in commit 1224324.

Additionally, the projection itself was loaded as identity, which
gives clip space [-1,1] x [-1,1]. Overlay vertices are emitted in
[0,1] UI space by gl1_overlay_vertex_geom, so an identity projection
mapped them into the upper-right quadrant of the viewport instead of
filling it. Load gl->mvp_no_rot — the pre-built ortho(0,1,0,1,-1,1)
matrix — for the projection so [0,1] vertices map to the full
viewport, matching what gl3 does with the same matrix in its shader
uniform.
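The effect of the projection choice can be checked numerically. The sketch below builds the textbook column-major orthographic matrix; the helper names are hypothetical, and the actual layout of gl->mvp_no_rot is assumed to be equivalent rather than taken from the source:

```c
#include <assert.h>

/* Standard orthographic projection, column-major (OpenGL convention). */
static void ortho(float m[16], float l, float r,
      float b, float t, float n, float f)
{
   int i;

   assert(r != l && t != b && f != n);

   for (i = 0; i < 16; i++)
      m[i] = 0.0f;
   m[0]  =  2.0f / (r - l);
   m[5]  =  2.0f / (t - b);
   m[10] = -2.0f / (f - n);
   m[12] = -(r + l) / (r - l);
   m[13] = -(t + b) / (t - b);
   m[14] = -(f + n) / (f - n);
   m[15] =  1.0f;
}

/* Transform the point (x, y, 0, 1) and return its clip-space x and y. */
static void transform_xy(const float m[16], float x, float y,
      float *out_x, float *out_y)
{
   *out_x = m[0] * x + m[4] * y + m[12];
   *out_y = m[1] * x + m[5] * y + m[13];
}
```

With ortho(0,1,0,1,-1,1), the corners (0,0) and (1,1) land on (-1,-1) and (1,1), so the unit square covers all of clip space. Under an identity matrix the same corners stay at (0,0) and (1,1), which is only the upper-right quadrant.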

With these changes overlays render at the correct full-screen size
and position, matching the gl/glcore behaviour.

Previously the GL1 driver expanded RGUI's 16bpp framebuffer to 32bpp on
the CPU every frame via conv_rgba4444_argb8888 before uploading it as
BGRA8888. The expansion was a per-pixel loop (with an MMX fast path on
x86) and doubled the GPU upload bandwidth. RGUI already assembles its
framebuffer in RGBA4444; GL has been able to consume that layout
directly via GL_UNSIGNED_SHORT_4_4_4_4 since GL 1.2 (1998).

Add a SUPPORTS_PACKED_PIXELS capability flag, probed once at init from
the GL version (>= 1.2) or the GL_EXT_packed_pixels extension. When
set, the menu draw path keeps RGUI's native 16bpp layout end-to-end:
rows are memcpy'd into a POT-padded staging buffer at half the previous
size and uploaded via (GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4) with no
swizzle or expansion. The original 32bpp expansion path is preserved
as a fallback for strict GL 1.1 implementations and for the Vita build
(vitaGL packed-pixel paths are unverified).
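A capability probe of this kind could look roughly like the following. This is a hedged sketch, not the driver's code: `supports_packed_pixels` is a hypothetical name, and the two strings are assumed to be what glGetString(GL_VERSION) and glGetString(GL_EXTENSIONS) would return on a desktop GL context:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if packed pixel formats can be assumed available: either
 * the version string starts with a "major.minor" of at least 1.2, or
 * the space-separated extension list contains GL_EXT_packed_pixels
 * as a whole token. */
static int supports_packed_pixels(const char *version,
      const char *extensions)
{
   int major = 0;
   int minor = 0;

   assert(version);

   if (sscanf(version, "%d.%d", &major, &minor) == 2)
      if (major > 1 || (major == 1 && minor >= 2))
         return 1;

   if (extensions)
   {
      const char *name = "GL_EXT_packed_pixels";
      size_t      len  = strlen(name);
      const char *p    = extensions;

      while ((p = strstr(p, name)))
      {
         /* Require token boundaries so a longer extension name that
          * merely shares this prefix does not count as a match. */
         int starts = (p == extensions) || (p[-1] == ' ');
         int ends   = (p[len] == ' ')   || (p[len] == '\0');
         if (starts && ends)
            return 1;
         p += len;
      }
   }
   return 0;
}
```

Running the probe once at init and caching the result in a flag keeps the per-frame menu path free of string parsing.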

The new path is endian-safe by construction. RGUI's argb32_to_rgba4444
produces a host-endian uint16_t with R in bits 15..12 and A in bits
3..0; glTexImage2D reads each GL_UNSIGNED_SHORT_4_4_4_4 unit through
the host's native uint16_t interpretation, so the same source bytes
work on LE and BE hosts without a byte swap.
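The bit layout described above can be written out as a small packer. This is a sketch assuming a 0xAARRGGBB input that keeps the top nibble of each channel; `pack_rgba4444` is a hypothetical stand-in and may differ in detail from RGUI's argb32_to_rgba4444:

```c
#include <assert.h>
#include <stdint.h>

/* Pack 0xAARRGGBB into RGBA4444: R in bits 15..12, G in 11..8,
 * B in 7..4, A in 3..0. Only shifts and masks on integer values are
 * used, so the result is the same uint16_t value on any host; the
 * byte order in memory is whatever the host uses for uint16_t. */
static uint16_t pack_rgba4444(uint32_t argb)
{
   uint32_t a = (argb >> 28) & 0xF;
   uint32_t r = (argb >> 20) & 0xF;
   uint32_t g = (argb >> 12) & 0xF;
   uint32_t b = (argb >>  4) & 0xF;

   return (uint16_t)((r << 12) | (g << 8) | (b << 4) | a);
}

/* Usage example: A=0xFF, R=0x80, G=0x40, B=0xC0. */
void pack_rgba4444_example(void)
{
   assert(pack_rgba4444(0xFF8040C0u) == 0x84CF);
}
```

Because the consumer also interprets each unit as a native 16-bit value, producer and consumer agree on which bits hold which channel without either side inspecting individual bytes.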

gl1_draw_tex gains a fb_4444 parameter that selects the new upload
format and skips the BGRA-fallback CPU swizzle (the 16bpp path's bytes
already match GL_RGBA channel order). All three existing callers
update accordingly; the content path passes false and is byte-identical
to before.

Removes the misleading "I could not get 444 or 555 to work" FIXME on
the original gl1_draw_tex; this commit is what it asked for.

Both d3d9_hlsl and d3d9_cg previously expanded RGUI's 16bpp menu
framebuffer to 32bpp on the CPU every frame via a per-pixel loop,
then uploaded the result into a D3DFMT_A8R8G8B8 texture. RGUI
already assembles its framebuffer in 16bpp; D3D9 has supported
D3DFMT_A4R4G4B4 as a baseline texture format since launch.

Add D3D9_ARGB4444_FORMAT to d3d9_common.h (with D3DFMT_LIN_A4R4G4B4
for the _XBOX build path), and "d3d9_hlsl"/"d3d9_cg" cases to the
RGUI pixel format dispatcher selecting argb32_to_argb4444 (already
in use for the rsx/PS3 driver, which targets the same ARGB4444
bit layout).

In both menu set_texture_frame paths, allocate the menu texture
as D3DFMT_A4R4G4B4 when rgb32 is false (the only case in current
practice; RGUI is the sole caller), and upload row-by-row via
memcpy. The rgb32 = true API branch is preserved for forward
compatibility and continues to use D3DFMT_A8R8G8B8.
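The row-by-row upload can be sketched independently of D3D. The helper below is hypothetical; `dst` and `dst_pitch` stand in for the pointer and pitch that a locked texture rectangle would provide, and the pitch is assumed to be allowed to exceed the tightly packed row size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a tightly packed 16bpp frame into a destination whose rows
 * are dst_pitch bytes apart. Bytes between the end of a row and the
 * next row start are left untouched. */
static void copy_rows_16bpp(void *dst, size_t dst_pitch,
      const uint16_t *src, unsigned width, unsigned height)
{
   unsigned y;
   size_t   row_bytes = (size_t)width * sizeof(uint16_t);

   assert(dst && src && dst_pitch >= row_bytes);

   for (y = 0; y < height; y++)
      memcpy((uint8_t*)dst + (size_t)y * dst_pitch,
            src + (size_t)y * width, row_bytes);
}
```

Copying per row rather than in one memcpy matters whenever the pitch differs from width * 2, since a single bulk copy would smear each source row across the padding.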

Track the bpp of the currently-allocated menu texture in a new
d3d9_video_t::menu_tex_rgb32 field so the texture is recreated
when the rgb32 flag flips between calls.
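The recreate-on-flip logic can be illustrated with a mock. Everything here is hypothetical except the menu_tex_rgb32 field name taken from the commit message; the real driver's allocation and any other conditions (such as size changes) are not modelled:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock of the part of the driver state that tracks the menu texture. */
typedef struct
{
   bool     valid;          /* a texture currently exists          */
   bool     menu_tex_rgb32; /* format of the existing texture      */
   unsigned creations;      /* how many times it was (re)created   */
} mock_menu_tex_t;

/* Recreate the texture only when none exists yet or when the
 * requested format differs from the one it was allocated with. */
static void mock_set_texture_frame(mock_menu_tex_t *t, bool rgb32)
{
   assert(t);

   if (!t->valid || t->menu_tex_rgb32 != rgb32)
   {
      t->valid          = true;
      t->menu_tex_rgb32 = rgb32;
      t->creations++;
   }
   /* ... the per-frame row upload would follow here ... */
}
```

Two consecutive calls with the same flag create the texture once; a call with the opposite flag triggers exactly one more creation.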

Endian-safe by construction: argb32_to_argb4444 produces a
host-endian uint16_t with A in bits 15..12 down to B in 3..0;
D3DFMT_A4R4G4B4 is read by D3D as host-endian 16-bit units with
the same bit assignments. Same contract as the original
ARGB8888 path, just one storage size smaller, so the ordering
holds on both LE (PC) and BE (Xbox 360) hosts without a byte swap.
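For comparison with the GL path, the ARGB4444 layout puts alpha in the top nibble. A sketch with a hypothetical `pack_argb4444`, assuming a 0xAARRGGBB input and truncation to the top nibble of each channel; the real argb32_to_argb4444 may differ in detail:

```c
#include <assert.h>
#include <stdint.h>

/* Pack 0xAARRGGBB into ARGB4444: A in bits 15..12, R in 11..8,
 * G in 7..4, B in 3..0. The value is built with shifts only, so it
 * is identical on little-endian and big-endian hosts. */
static uint16_t pack_argb4444(uint32_t argb)
{
   uint32_t a = (argb >> 28) & 0xF;
   uint32_t r = (argb >> 20) & 0xF;
   uint32_t g = (argb >> 12) & 0xF;
   uint32_t b = (argb >>  4) & 0xF;

   return (uint16_t)((a << 12) | (r << 8) | (g << 4) | b);
}

/* Usage example: A=0xFF, R=0x80, G=0x40, B=0xC0. */
void pack_argb4444_example(void)
{
   assert(pack_argb4444(0xFF8040C0u) == 0xF84C);
}
```

The same input packs to 0x84CF in the RGBA4444 layout used by the GL1 path, which is why each driver selects its own conversion function in the RGUI pixel format dispatcher.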
pull[bot] locked and limited conversation to collaborators Apr 27, 2026
pull[bot] added the ⤵️ pull label Apr 27, 2026
pull[bot] merged commit 16e1dca into Alexandre1er:master Apr 27, 2026