Unable to detect references with non-Latin conventions. #16

@JoshTheDerf

Great work on fetch(bible)!
We're looking at using it to detect and link references for some of our non-English articles.

Detection works great for references that follow standard Latin formatting conventions, but not as well for CJK languages, where the chapter and verse markers differ. I suspect different languages may need different sets of regexes in https://github.com/gracious-tech/fetch/blob/master/references/src/detect.ts#L15-L25

For example, Genesis 1:1-2 is commonly formatted differently in each of the following languages:

  • Genesis 1:1-2 - English
  • 创世记1章1–2节 - Chinese (Simplified)
  • 創世記1:1ー2 - Japanese
  • 창세기 1장 1–2절 - Korean
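
To make the difference concrete, here is a rough sketch of patterns that would match each of the four examples above. These regexes are purely illustrative and are not what `detect.ts` currently uses; the names `examplePatterns` and `parseExample` are hypothetical, and the book names are hardcoded only to keep the example short.

```typescript
// Hypothetical per-language patterns; NOT taken from fetch(bible)'s detect.ts.
// Each pattern captures: chapter, start verse, optional end verse.
// Range separators seen in the examples: hyphen, en dash, katakana long mark, tildes
const RANGE_SEP = '[-–ー~〜]'

const examplePatterns: Record<string, RegExp> = {
    // "Genesis 1:1-2"
    eng: new RegExp(`Genesis\\s*(\\d+):(\\d+)(?:${RANGE_SEP}(\\d+))?`),
    // "创世记1章1–2节" (章 = chapter marker, 节 = verse marker)
    zho: new RegExp(`创世记\\s*(\\d+)章\\s*(\\d+)(?:${RANGE_SEP}(\\d+))?节`),
    // "創世記1:1ー2" (also accept the full-width colon as an assumption)
    jpn: new RegExp(`創世記\\s*(\\d+)[:：](\\d+)(?:${RANGE_SEP}(\\d+))?`),
    // "창세기 1장 1–2절" (장 = chapter marker, 절 = verse marker)
    kor: new RegExp(`창세기\\s*(\\d+)장\\s*(\\d+)(?:${RANGE_SEP}(\\d+))?절`),
}

// Returns the parsed chapter/verse range, or null if the text doesn't match
function parseExample(lang: string, text: string) {
    const match = examplePatterns[lang]?.exec(text)
    if (!match) {
        return null
    }
    const verseStart = Number(match[2])
    return {
        chapter: Number(match[1]),
        verseStart,
        verseEnd: match[3] ? Number(match[3]) : verseStart,
    }
}
```

The main point is that the chapter/verse separators and the trailing verse marker vary by language, while the numeric captures stay the same.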

Could we brainstorm a way to add regular expression override configuration to either the enhance script or the language/translation definitions themselves?
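
As a starting point for that brainstorm, one possible shape for an override option is sketched below. Everything here is an assumption for discussion: `ReferencePatternOverride`, `PatternOverrides`, `resolvePattern`, and `defaultPattern` are hypothetical names that do not exist in fetch(bible), and the default regex is only a stand-in for the built-in Latin-style pattern.

```typescript
// Hypothetical override shape; none of these names exist in fetch(bible) today.
interface ReferencePatternOverride {
    // Applied to the text immediately following a recognised book name.
    // Must capture chapter, start verse, and optional end verse as groups 1-3.
    pattern: RegExp
}

// Keyed by language code (the codes used here are just examples)
type PatternOverrides = Record<string, ReferencePatternOverride>

// Stand-in for a built-in Latin-style pattern such as " 1:1-2"
const defaultPattern = /^\s*(\d+):(\d+)(?:[-–](\d+))?/

// Use the language's override when one is configured, else the default
function resolvePattern(lang: string, overrides: PatternOverrides = {}): RegExp {
    return overrides[lang]?.pattern ?? defaultPattern
}

// Usage: only Korean is overridden, every other language keeps the default
const overrides: PatternOverrides = {
    kor: {pattern: /^\s*(\d+)장\s*(\d+)(?:[-–](\d+))?절/},
}
const korMatch = resolvePattern('kor', overrides).exec(' 1장 1–2절')
const engMatch = resolvePattern('eng', overrides).exec(' 1:1-2')
```

Keeping the capture groups in a fixed order would let the rest of the detection logic stay unchanged regardless of which pattern matched. The same map could live in the enhance script's options or in the language/translation definitions.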
