Fix panic in subset when glyph_ids maps to only .notdef #133

Closed
MasKusuno wants to merge 1 commit into yeslogic:master from MasKusuno:fix/subset-empty-glyph-ids

Conversation

@MasKusuno

Summary

  • When subset() is called with CmapTarget::Unicode and glyph IDs that have no cmap mapping (e.g., only glyph ID 0 / .notdef), MappingsToKeep ends up empty. This causes CmapSubtableFormat4::from_mappings to panic on an unconditional unwrap() of the first mapping iterator element.
  • In WASM builds this manifests as an "unreachable" trap, making it difficult to diagnose.
  • This is easily triggered in practice when subsetting a font with codepoints it does not contain — common in CJK font fallback chains where different fonts cover different Unicode ranges.
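
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the allsorts code: `first_code_unchecked` and `first_code_checked` are hypothetical helpers, and a `BTreeMap<u32, u16>` stands in for `MappingsToKeep`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the problematic pattern: it assumes that at
/// least one character-to-glyph mapping exists.
fn first_code_unchecked(mappings: &BTreeMap<u32, u16>) -> u32 {
    // `next()` yields `None` for an empty map, so `unwrap()` panics here.
    *mappings.keys().next().unwrap()
}

/// Same lookup, but the empty case is handed back to the caller
/// instead of panicking.
fn first_code_checked(mappings: &BTreeMap<u32, u16>) -> Option<u32> {
    mappings.keys().next().copied()
}
```

Calling `first_code_unchecked` with an empty map panics, while `first_code_checked` returns `None`. On targets where a panic aborts instead of unwinding, the panic message may never reach the caller, which would be consistent with the "unreachable" trap reported for WASM builds.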

Changes

  • Return SubsetError::NoGlyphs early in subset() when mappings are empty and the cmap target is Unicode
  • Replace the unwrap() in CmapSubtableFormat4::from_mappings with Err(ParseError::BadValue) as defense-in-depth
  • Add is_empty() method to MappingsToKeep<T>
  • Add NoGlyphs variant to SubsetError
  • Add two tests (CFF/OTF and TTF) verifying .notdef-only input returns NoGlyphs instead of panicking
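
The guard in the first item, together with the `is_empty()` and `NoGlyphs` additions, can be pictured roughly as follows. This is a simplified sketch with stand-in types: the `MappingsToKeep`, `CmapTarget`, and `SubsetError` definitions below are assumptions made for illustration, and `check_mappings` is a hypothetical helper, not a function from allsorts.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for `MappingsToKeep<T>`; the real type is assumed
/// to carry more state than a single map.
#[derive(Default)]
struct MappingsToKeep {
    mappings: BTreeMap<u32, u16>,
}

impl MappingsToKeep {
    /// True when no character maps to a retained glyph.
    fn is_empty(&self) -> bool {
        self.mappings.is_empty()
    }
}

/// Stand-in for the cmap target; `Other` is a placeholder variant.
#[derive(Debug, PartialEq)]
enum CmapTarget {
    Unicode,
    Other,
}

/// Stand-in error type with only the variant discussed here.
#[derive(Debug, PartialEq)]
enum SubsetError {
    NoGlyphs,
}

/// Hypothetical guard run before the cmap is built: reject an empty
/// mapping set when a Unicode cmap was requested.
fn check_mappings(target: &CmapTarget, mappings: &MappingsToKeep) -> Result<(), SubsetError> {
    if *target == CmapTarget::Unicode && mappings.is_empty() {
        return Err(SubsetError::NoGlyphs);
    }
    Ok(())
}
```

With this shape, a caller receives a regular `Err` value that it can match on, so the empty case no longer reaches the code that builds the cmap subtable.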

Test plan

  • cargo test --lib — 336 passed, 0 failed
  • cargo test (all integration + doc tests) — all passed
  • cargo fmt -- --check — clean

Context

Discovered while building a WASM-based CJK font subsetter for GJS Kanji Database. When a codemap requests a font chunk for a Unicode range that the font doesn't cover, the subsetter receives only .notdef glyph IDs and crashes.

@wezm wezm left a comment

Contributor

Good find, but I'm not sure the fix is the right approach. In a scenario where only glyphs without a cmap entry are retained, MappingsToKeep legitimately ends up empty. With the proposed changes an error would be returned. Instead, I think it should still build a cmap table, i.e. a cmap that is present but has its fields populated such that no glyphs are mapped.

@MasKusuno

Author

Thanks for the review! You're right: returning an error is the wrong approach, since empty mappings are a legitimate case.

I've updated the PR to build a valid cmap table with no glyph mappings instead:

  • CmapSubtableFormat4::from_mappings: when mappings are empty, produces a table with only the required 0xFFFF sentinel segment
  • CmapSubtableFormat12::from_mappings: when mappings are empty, produces a table with an empty groups list
  • Removed the SubsetError::NoGlyphs variant and the early-return guard in subset()
  • Updated the tests to verify that .notdef-only input produces a valid, parseable font
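
For reference, the two empty subtables described above are small enough to write out byte by byte. The sketch below follows the cmap format 4 and format 12 layouts from the OpenType specification. The helper names are hypothetical, and the thread does not confirm that allsorts emits exactly these field values (in particular `idDelta = 1` for the sentinel segment).

```rust
/// Append a big-endian u16 to the buffer.
fn put_u16(buf: &mut Vec<u8>, v: u16) {
    buf.extend_from_slice(&v.to_be_bytes());
}

/// Append a big-endian u32 to the buffer.
fn put_u32(buf: &mut Vec<u8>, v: u32) {
    buf.extend_from_slice(&v.to_be_bytes());
}

/// Format 4 subtable that contains only the terminating 0xFFFF segment.
fn empty_format4() -> Vec<u8> {
    let seg_count: u16 = 1;
    let mut buf = Vec::new();
    put_u16(&mut buf, 4); // format
    put_u16(&mut buf, 16 + 8 * seg_count); // length in bytes (24)
    put_u16(&mut buf, 0); // language
    put_u16(&mut buf, 2 * seg_count); // segCountX2
    put_u16(&mut buf, 2); // searchRange = 2 * 2^floor(log2(segCount))
    put_u16(&mut buf, 0); // entrySelector = log2(searchRange / 2)
    put_u16(&mut buf, 0); // rangeShift = segCountX2 - searchRange
    put_u16(&mut buf, 0xFFFF); // endCode[0]
    put_u16(&mut buf, 0); // reservedPad
    put_u16(&mut buf, 0xFFFF); // startCode[0]
    put_u16(&mut buf, 1); // idDelta[0]: 0xFFFF + 1 wraps to glyph 0
    put_u16(&mut buf, 0); // idRangeOffset[0]
    buf
}

/// Format 12 subtable with an empty groups list.
fn empty_format12() -> Vec<u8> {
    let mut buf = Vec::new();
    put_u16(&mut buf, 12); // format
    put_u16(&mut buf, 0); // reserved
    put_u32(&mut buf, 16); // length in bytes (header only)
    put_u32(&mut buf, 0); // language
    put_u32(&mut buf, 0); // numGroups
    buf
}
```

Under these assumptions the format 4 subtable is 24 bytes and the format 12 subtable is 16 bytes, and every character lookup in either one resolves to glyph 0 (.notdef).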

All 336 lib tests + 13 doc tests pass, cargo fmt clean.

@wezm wezm left a comment

Contributor

Looks good overall. Just a little more iteration needed on both of the tests.

Also please squash all commits into a single commit when you're done.

Comment thread src/subset.rs Outdated
Comment on lines +1847 to +1849
let _ = ReadScope::new(&font_data)
    .read::<OpenTypeFont<'_>>()
    .expect("subset output should be a valid OpenType font");

Contributor

This will not parse much of the font. Instead the cmap table should be parsed like test_subset_with_macroman_cmap and test_subset_with_os2_and_unicode_cmap.
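
To illustrate what checking the cmap itself verifies, here is a self-contained sketch that does not use the allsorts API (the two referenced tests are not shown in this thread). `format4_lookup` is a hypothetical, deliberately partial reader for a format 4 subtable, and `SENTINEL_ONLY` is a hand-written subtable holding only the 0xFFFF segment.

```rust
/// Read a big-endian u16 at `offset`, or `None` if it is out of bounds.
fn read_u16(data: &[u8], offset: usize) -> Option<u16> {
    let bytes = data.get(offset..offset + 2)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}

/// Look up `ch` in a format 4 subtable and return its glyph ID, where 0
/// means unmapped. Returns `None` for truncated data, a different format,
/// or a segment with a non-zero idRangeOffset (not handled in this sketch).
fn format4_lookup(sub: &[u8], ch: u16) -> Option<u16> {
    if read_u16(sub, 0)? != 4 {
        return None;
    }
    let seg_count = usize::from(read_u16(sub, 6)? / 2);
    let end_codes = 14; // the arrays start after the 14-byte header
    let start_codes = end_codes + 2 * seg_count + 2; // + 2 skips reservedPad
    let id_deltas = start_codes + 2 * seg_count;
    let id_range_offsets = id_deltas + 2 * seg_count;
    for i in 0..seg_count {
        let end = read_u16(sub, end_codes + 2 * i)?;
        if ch > end {
            continue; // this segment ends before `ch`
        }
        let start = read_u16(sub, start_codes + 2 * i)?;
        if ch < start {
            return Some(0); // `ch` falls in a gap between segments
        }
        if read_u16(sub, id_range_offsets + 2 * i)? != 0 {
            return None; // glyphIdArray lookups are out of scope here
        }
        let delta = read_u16(sub, id_deltas + 2 * i)?;
        return Some(ch.wrapping_add(delta)); // arithmetic is modulo 65536
    }
    Some(0)
}

/// Format 4 subtable with a single 0xFFFF segment and idDelta = 1.
const SENTINEL_ONLY: [u8; 24] = [
    0, 4, 0, 24, 0, 0, 0, 2, 0, 2, 0, 0, 0, 0, // header
    0xFF, 0xFF, 0, 0, 0xFF, 0xFF, 0, 1, 0, 0, // endCode, pad, startCode, idDelta, idRangeOffset
];
```

A test along these lines would assert that lookups such as `format4_lookup(&SENTINEL_ONLY, 0x41)` return `Some(0)`. That confirms both that the subtable is readable and that it maps nothing, which reading only the top-level font structure would not show.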

Comment thread src/subset.rs Outdated
}

#[test]
fn subset_notdef_only_cff_produces_valid_font() {

Contributor

We're already in a subset module, so the subset_ prefix can be removed from the test names.

When subset() is called with CmapTarget::Unicode and glyph IDs that
have no cmap mapping (e.g., only glyph ID 0 / .notdef), MappingsToKeep
ends up empty, causing CmapSubtableFormat4::from_mappings to panic on
an unconditional unwrap().

Build a valid empty cmap table instead:
- CmapSubtableFormat4::from_mappings: when mappings are empty, produce
  a table with only the required 0xFFFF sentinel segment
- CmapSubtableFormat12::from_mappings: when mappings are empty, produce
  a table with an empty groups list
- Add is_empty() method to MappingsToKeep<T>
- Add tests for .notdef-only input with both CFF and TTF fonts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MasKusuno MasKusuno force-pushed the fix/subset-empty-glyph-ids branch from 3a4dba5 to ba1ad98 Compare April 8, 2026 05:57
@MasKusuno

Author

Thanks for the feedback! I've updated the tests to properly parse the cmap table (matching the pattern in test_subset_with_macroman_cmap / test_subset_with_os2_and_unicode_cmap), removed the subset_ prefix from test names, and squashed everything into a single commit.

@wezm

wezm commented Apr 16, 2026

Contributor

Applied in 5fb3fff

@wezm wezm closed this Apr 16, 2026