Skip to content

Add Buhin#onMissing callback for surfacing missing references#16

Open
MasKusuno wants to merge 1 commit into
kurgm:masterfrom
MasKusuno:feat/buhin-on-missing
Open

Add Buhin#onMissing callback for surfacing missing references#16
MasKusuno wants to merge 1 commit into
kurgm:masterfrom
MasKusuno:feat/buhin-on-missing

Conversation

@MasKusuno

Copy link
Copy Markdown

Summary

This PR addresses #14 by adding an opt-in onMissing callback to the Buhin class, following the API design you suggested.

API

class Buhin {
  public onMissing: BuhinMissingHandler | null;
  // ...
}

type BuhinMissingHandler = (name: string) => string | undefined;

search(name) is unchanged when onMissing is null (the default). When a handler is set and name is absent from the store:

  • If the handler returns a string, that value is used as the lookup result.
  • If the handler returns undefined (or nothing), the default "" fallback is preserved.
  • If the handler throws, the exception propagates — useful for fail-fast pipelines.

onMissing is not invoked for names that are present in the store, even if their stored value is "".

Example

const kage = new Kage();

// Log warnings while keeping the default fallback:
kage.kBuhin.onMissing = (name) => {
  console.warn(`Buhin "${name}" is missing`);
};

// Or fail fast:
kage.kBuhin.onMissing = (name) => {
  throw new Error(`Buhin "${name}" is missing`);
};

// Or substitute a placeholder glyph:
kage.kBuhin.onMissing = () => "1:0:0:0:0:200:200"; // empty box

Why this is non-breaking

  • New property defaults to null, so search behaves identically to today's implementation when callers don't set it.
  • The signature of search is unchanged; only its internal lookup path gains the optional callback step.
  • BuhinMissingHandler is a new exported type; existing imports are unaffected.

Tests

test/index.js gains a Buhin#onMissing block covering:

  • Default behavior (onMissing === null, missing name returns "").
  • Handler invocation for missing names only.
  • String return overrides the default.
  • Throw propagation.
  • Stored "" takes precedence over the handler.
  • End-to-end: a missing 99: target surfaces through onMissing during makeGlyph.

Existing tests continue to pass.

Note on the name@N correction

Thank you for the follow-up clarification — I had indeed conflated revision suffixes with variant siblings. I'll be revisiting our downstream pre-processing logic separately; the engine-side improvement here is independent of that misunderstanding and remains valuable on its own.

Refs #14

Adds an optional `onMissing` handler to the `Buhin` class that is invoked
when `search` is called with a name not registered in the store. This
allows callers to surface data inconsistencies that would otherwise be
silently dropped by the engine — for example, when building fonts from
large dumps where a missing buhin causes downstream `99:` instructions
to disappear without any warning.

The handler may return a string (used as the lookup result) or undefined
(falls back to the default empty string). Throwing from the handler is
also supported for fail-fast use cases.

Defaults to `null`, so existing callers see no behavior change.

Refs kurgm#14
@MasKusuno

Copy link
Copy Markdown
Author

Quantitative validation in a downstream font build

I gave the new onMissing callback a real workout in our font build pipeline (digital-go-jp/gjs GJSGothic). The results turned out stronger than I expected, so I wanted to share them here as additional motivation for the API.

Setup

The pipeline pushes the entire GlyphWiki dump (~2.5M entries) into Buhin, then renders ~67,000 cmap-target glyphs through Kage. Without any handling, name@N revision references in the source data become silent drops.

I patched our local copy of Buhin with the implementation from this PR, then wired up onMissing not just to log the missing names but to resolve them through a small fallback chain (name@Nname, name-var-NNNname-var-001name, name-jvname-jname, etc.).

Numbers

Configuration onMissing hits Unresolved Affected top-level cmap glyphs
Raw dump (no preprocess, no fallback) 14,710 14,710 ~10,744
Raw dump + onMissing fallback 14,710 13 ~13
Hand-rolled preprocessing (current pipeline) 13 13 0
Preprocessing + onMissing fallback 13 0 0

Two findings worth sharing:

  1. Side-by-side rendering. I built two versions of the Plane 16 PUA + IPAmj-compatible Gothic font — one with raw-dump onMissing fallback and one without — and viewed ~300 affected glyphs side by side. The fallback-enabled side reliably fills in the components that the no-fallback side drops as missing radicals. The visual difference is unambiguous.

  2. The same callback subsumes the hand-rolled preprocessing. Once onMissing is in place, the dump preprocessing step we maintained downstream becomes redundant — the engine itself can do the recovery, and it does so closer to the point where the missing reference is consumed (so it covers paths preprocessing didn't anticipate, like our last 13 stragglers).

Implications for the API

The original PR description framed this purely as a diagnostic hook ("surface data inconsistencies"). The validation above shows it doubles as a clean recovery mechanism: returning a string from the handler lets callers patch missing references without having to traverse and rewrite the dump beforehand.

In other words, onMissing ends up being both a "tell me when something is wrong" hook and a "tell me what to use instead" hook — and that duality is exactly what made it powerful in our case. The current BuhinMissingHandler signature already supports both modes (undefined for log-only, string for substitute), which I think is the right design.

Happy to address review feedback whenever you have time. Thanks again for the suggestion in #14 that motivated the API!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant