Research: font-aware confusable confidence scoring via rendered glyph comparison #1

@paultendo


Current confusable maps treat all pairs as equally dangerous. In practice, some pairs are indistinguishable across all common fonts (Cyrillic а, U+0430 / Latin a, U+0061) while others only collide in specific typefaces.

Rendering confusable pairs across standard system fonts and measuring pixel similarity could produce empirically weighted confidence scores that feed into the existing risk scoring pipeline, with no runtime rendering dependency.
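A minimal sketch of what "pixel similarity" could mean here. It assumes each glyph has already been rasterised to a same-sized binary bitmap (the rendering step itself is left out), and it uses Jaccard similarity over inked pixels as the metric. Both the metric and the name `pixel_similarity` are assumptions for illustration; the issue does not commit to a specific measure.

```python
from typing import List

# Rows of 0/1 pixels; both glyphs are assumed to share the same canvas size.
Bitmap = List[List[int]]


def pixel_similarity(a: Bitmap, b: Bitmap) -> float:
    """Jaccard similarity (intersection over union) of inked pixels."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have identical dimensions")
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    # Two empty canvases are treated as identical.
    return 1.0 if union == 0 else inter / union
```

A score of 1.0 means the two rendered glyphs are pixel-identical in that font; a score near 0.0 means they share almost no ink. Other metrics (e.g. structural similarity) might suit anti-aliased renders better.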

Prior art: GlyphNet demonstrates attention-based CNN detection on 4M rendered domain images. Their approach is domain-specific and image-based. The opportunity here is to distil rendering results into static map weights consumable at runtime.

Phases (if pursued):

  1. Render confusable pairs across 20-30 common system fonts, measure pixel similarity per pair per font
  2. Distil into confidence-weighted confusable map (replaces flat weights in confusableDistance)
  3. Discover novel confusable pairs not yet in confusables.txt
  4. Export font-stability metadata per pair
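Phases 2 and 4 could be sketched roughly as below: per-font similarity scores for one pair are collapsed into a single weight plus some font-stability figures. The function name `distil_pair`, the output keys, and the choice of mean / minimum / population standard deviation are all assumptions, not an agreed design.

```python
from statistics import mean, pstdev
from typing import Dict


def distil_pair(scores_by_font: Dict[str, float]) -> Dict[str, float]:
    """Collapse per-font similarity scores (0.0 to 1.0) into static stats."""
    values = list(scores_by_font.values())
    if not values:
        raise ValueError("need at least one font score")
    return {
        # Candidate replacement for the flat weight (assumption: plain mean).
        "weight": round(mean(values), 4),
        # Worst-case font: how distinguishable the pair can get.
        "min": round(min(values), 4),
        # Font stability: low spread means the pair collides everywhere.
        "spread": round(pstdev(values), 4),
    }
```

For example, with placeholder inputs (not measurements) `distil_pair({"FontA": 1.0, "FontB": 0.5})` yields a weight of 0.75, a minimum of 0.5, and a spread of 0.25. A pair with high weight and low spread would be dangerous regardless of typeface; a high spread flags a font-specific collision.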

This would be a separate offline tool that produces artifacts for namespace-guard to import, not a runtime dependency.
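One possible shape for such an artifact, as a sketch only: the offline tool serialises its per-pair stats to plain JSON keyed by code point, so the consumer needs nothing but a JSON parser at runtime. The schema (`version`, `pairs`, `source`, `target`) and the name `build_artifact` are hypothetical.

```python
import json
from typing import Dict, Tuple


def build_artifact(pairs: Dict[Tuple[str, str], Dict[str, float]]) -> str:
    """Serialise {(source_char, target_char): stats} to a static JSON string."""
    entries = [
        {
            # Code points are written as U+XXXX so the file stays readable
            # even when the characters themselves are visually identical.
            "source": f"U+{ord(source):04X}",
            "target": f"U+{ord(target):04X}",
            **stats,
        }
        for (source, target), stats in sorted(pairs.items())
    ]
    return json.dumps({"version": 1, "pairs": entries}, indent=2)
```

Writing code points rather than raw characters is a deliberate choice here: a diff of the artifact remains reviewable even for pairs that render identically.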

Labels: enhancement
