Fix regex for 'viz' in Latinisms YAML #201

Open
rajannpatel wants to merge 1 commit into canonical:main from rajannpatel:Latinism-rule-for-viz
@rajannpatel

This PR adds viz. (short for videlicet) to the list of discouraged Latin abbreviations.

Regex Details

The regex is designed to avoid false-positive matches on domains, emails, and overlapping words (e.g. viz.com, vizier, supervise):

\b(?:viz\.(?!\w)|viz(?![\w\.])): "'specifically' or 'namely'"
  • viz\.(?!\w): Matches viz. only if the dot is not followed by a word character (catches "viz. ", misses "viz.com").
  • viz(?![\w\.]): Matches viz only if the word is not followed by a word character or a dot (catches "viz, ", misses "vizier").

Verification

Testing this against the following markdown confirms correct behavior:

<!-- Triggers warning: -->
Please provide the details, viz. the names and dates.
We have three options, viz, red, green, and blue.
The results are clear viz the chart below.

<!-- Passes silently (no false positives): -->
The vizier advised the sultan.
Check the website at viz.com or viz.org.
We need to supervise the process.
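
As a rough sanity check outside Vale, the six lines above can be run through the pattern with Python's `re` module. This is only a sketch: Python's engine is an assumption standing in for whatever engine Vale uses, so agreement here does not guarantee the same result in Vale, and the helper name `flagged` is hypothetical.

```python
import re

# Pattern copied from the PR description. Python's `re` is a stand-in engine
# (an assumption); Vale's own regex handling may behave differently.
PR_PATTERN = re.compile(r"\b(?:viz\.(?!\w)|viz(?![\w\.]))")

SHOULD_FLAG = [
    "Please provide the details, viz. the names and dates.",
    "We have three options, viz, red, green, and blue.",
    "The results are clear viz the chart below.",
]
SHOULD_PASS = [
    "The vizier advised the sultan.",
    "Check the website at viz.com or viz.org.",
    "We need to supervise the process.",
]


def flagged(line):
    """Return every substring of `line` that the PR pattern matches."""
    return [m.group(0) for m in PR_PATTERN.finditer(line)]


if __name__ == "__main__":
    for line in SHOULD_FLAG + SHOULD_PASS:
        print(flagged(line), "|", line)
```

Running the same lines through Vale itself remains the real test, since the rule is evaluated by Vale rather than by Python.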

Contributor

@SecondSkoll left a comment

The current change doesn't do anything. Output against the supplied test cases is:

 test.md
 3:24  suggestion  Instead of 'viz', use 'specifically' or 'namely'.   Canonical.025a-latinisms-with-english-equivalents 
 4:23  suggestion  Instead of 'viz', use 'specifically' or 'namely'.   Canonical.025a-latinisms-with-english-equivalents 
 8:22  suggestion  Instead of 'viz.', use 'specifically' or 'namely'.  Canonical.025a-latinisms-with-english-equivalents 

This still misses a case (the "viz." on line 2 is not reported) and flags a false positive ("viz." in viz.com on line 8). Fix suggested.

  \b(?:versus|vs\.(?!\w)|vs(?![\.\w])): "'compared to/with' or 'opposed to'"
  \bvice\sversa\b: "'the reverse' or 'the other way around'"
- \b(viz\.(?!\w)|viz(?![\w\.])): "'specifically' or 'namely'"
+ \b(?:viz\.(?!\w)|viz(?![\w\.])): "'specifically' or 'namely'"
Suggested change
- \b(?:viz\.(?!\w)|viz(?![\w\.])): "'specifically' or 'namely'"
+ \b(viz(?!\.?\w)): "'specifically' or 'namely'"

The fix in this PR doesn't actually do anything: there's some weirdness with non-capturing groups in our Vale implementation, and the logic is otherwise the same as the existing code. The existing rule should already work properly but, as you have raised, it leaves some cases uncaptured and produces some false positives for some reason.

There seems to be some real weirdness with escaped . characters, which is the root of the issue. The easiest way to deal with it looks to be to stop capturing the . that should follow viz.

This requires a small change to the test cases as well.
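
To illustrate why the test cases need adjusting, here is a minimal sketch of the suggested pattern in Python's `re` module. Python is again an assumption standing in for Vale's engine, and the helper name `matches` is hypothetical. The point is that the match no longer includes the trailing dot, so the reported text would be 'viz' rather than 'viz.'.

```python
import re

# Suggested replacement pattern from the review comment. The lookahead rejects
# "viz" when it is followed by an optional dot and then a word character.
SUGGESTED = re.compile(r"\b(viz(?!\.?\w))")


def matches(line):
    """Return the text captured by group 1 for every match in `line`."""
    return [m.group(1) for m in SUGGESTED.finditer(line)]


if __name__ == "__main__":
    samples = [
        "Please provide the details, viz. the names and dates.",
        "We have three options, viz, red, green, and blue.",
        "The results are clear viz the chart below.",
        "The vizier advised the sultan.",
        "Check the website at viz.com or viz.org.",
    ]
    for line in samples:
        print(matches(line), "|", line)
```

Under this sketch, "viz. the names" is matched as 'viz' with the dot left outside the match, while "vizier", "viz.com", and "viz.org" are skipped.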
