Include Unihan variants in fuzzy search #31

@nisbet-hubbard

A semantically based fuzzy search mode for ideographs, as described in §2.1.6 (#24), should naturally also cover similar variants that are encoded separately in Unicode, either because of the now-deprecated source separation rule or because they have a different component structure. These variants are discussed in Unicode Standard Annex #38: https://www.unicode.org/reports/tr38/#N10211.

This can be implemented with the help of Unihan_Variants.txt, which is included in Unihan.zip: https://www.unicode.org/Public/17.0.0/ucd/Unihan.zip

The relevant subsets of Unihan variants are:

  • kSemanticVariant
  • kSimplifiedVariant
  • kTraditionalVariant
  • kZVariant
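
As a rough illustration of how these fields could feed a fuzzy search, here is a minimal Python sketch. It assumes the tab-separated layout of Unihan_Variants.txt (code point, field name, then space-separated values that may carry a `<source` annotation). The names `parse_variants` and `expand` are hypothetical, and treating every relation as symmetric is a simplifying assumption, not something the Unihan data guarantees.

```python
from collections import defaultdict

# The four fields named in this issue; any other variant field is skipped.
FIELDS = {"kSemanticVariant", "kSimplifiedVariant", "kTraditionalVariant", "kZVariant"}


def parse_variants(lines):
    """Map each character to the set of variants listed for it (hypothetical helper)."""
    variants = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        code, field, value = line.split("\t", 2)
        if field not in FIELDS:
            continue
        char = chr(int(code[2:], 16))  # "U+56FD" -> "国"
        for item in value.split():
            # Drop source annotations such as "<kMatthews" after the code point.
            target = item.split("<", 1)[0]
            other = chr(int(target[2:], 16))
            if other != char:
                variants[char].add(other)
                variants[other].add(char)  # assumption: search treats variants symmetrically
    return variants


def expand(query, variants):
    """Return one set of acceptable characters per character of the query."""
    return [{ch} | variants.get(ch, set()) for ch in query]


# Sample lines in the file's format, used here only for illustration.
sample = [
    "# comment line",
    "U+3400\tkSemanticVariant\tU+4E18<kMatthews",
    "U+56FD\tkTraditionalVariant\tU+570B",
]
table = parse_variants(sample)
matches = expand("国", table)
```

A search backend could then match each position of the query against any member of the corresponding set. Whether all four fields should be weighted equally is left open here.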
