Open
Description
Accents on vowels are often omitted or misplaced, especially in informal texts. The tagger is currently strict about this, because the lexicon only includes words with their standard accents.
For example, "swn y mor" almost certainly means "the sound of the sea", for which the standard spelling is "sŵn y môr". Without the accents, the tagger instead reads "swn" as a form of the verb "bod" ("to be") and "mor" as an adverb meaning "so":
59 swn 6,2 bod B Bdibdyf1u
60 y 6,3 y YFB YFB
61 mor 6,4 mor Adf Adf
62 . 6,5 . Atd Atdt
The optimal tagging would be:
59 swn 6,2 sŵn E Egu
60 y 6,3 y YFB YFB
61 mor 6,4 môr E Egu
62 . 6,5 . Atd Atdt
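One possible way to make the lookup tolerant of missing accents is to index the lexicon by accent-stripped form, so that an unaccented token also retrieves its accented neighbours as candidates. The sketch below is only an illustration: `strip_accents`, `build_accent_index`, `candidate_forms` and the five-word `lexicon` are hypothetical names and data, not part of the tagger's actual code or lexicon format.

```python
import unicodedata
from collections import defaultdict


def strip_accents(word):
    """Return the word with combining diacritics removed (e.g. 'môr' -> 'mor')."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_accent_index(lexicon):
    """Map each accent-stripped form to every lexicon form that reduces to it."""
    index = defaultdict(set)
    for form in lexicon:
        index[strip_accents(form)].add(form)
    return index


def candidate_forms(token, index):
    """Return all lexicon forms matching the token once accents are ignored.

    Falls back to the token itself when nothing in the lexicon matches.
    """
    return sorted(index.get(strip_accents(token), {token}))


# Hypothetical toy lexicon, for illustration only.
lexicon = ["sŵn", "swn", "môr", "mor", "y"]
index = build_accent_index(lexicon)

for token in ["swn", "y", "mor"]:
    print(token, candidate_forms(token, index))
```

This only widens the set of candidate readings. Choosing "sŵn" over "swn" in a given sentence would still have to be done by the tagger's existing disambiguation, using the surrounding context.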