Disvortilo is a simple tool that breaks Esperanto words into roots and affixes.
You can install Disvortilo from PyPI using pip:
pip install disvortilo

Basic usage:
from disvortilo import Disvortilo
disvortilo = Disvortilo()
print(disvortilo.parse("malliberejo"))
# > [('mal', 'liber', 'ej', 'o')]
# some have more than one possible output
print(disvortilo.parse("esperantistino"))
# > [('esper', 'ant', 'ist', 'in', 'o'), ('esperant', 'ist', 'in', 'o')]
# you can also get the morphemes along with their categories
print(disvortilo.parse_detailed("plibonigojn"))
# > [(('pli', WordPart.FULL_WORD), ('bon', WordPart.ROOT), ('ig', WordPart.SUFFIX), ('ojn', WordPart.POS))]

The Disvortilo class is a parser that splits Esperanto words into morphemes.
Its parse method returns all valid analyses of a word.
Each analysis is a tuple of morpheme strings in order.
Example return value:
[('esper', 'ant', 'ist', 'in', 'o'), ('esperant', 'ist', 'in', 'o')]

parse_detailed is like parse, but each morpheme is returned together with its detected category (WordPart).
Each analysis is a tuple of (morpheme, WordPart) pairs.
Example return value:
[(('pli', WordPart.FULL_WORD), ('bon', WordPart.ROOT), ('ig', WordPart.SUFFIX), ('ojn', WordPart.POS))]

WordPart enum values used by parse_detailed:
PREFIX
ROOT
SUFFIX
FULL_WORD
POS
NUMBER
NAME
CORRELATIVE_START
CORRELATIVE_END
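
To show how the (morpheme, WordPart) pairs might be consumed, here is a minimal sketch that pulls the roots out of each analysis. It does not import the library: the WordPart enum below is a local stand-in that only mirrors the documented member names (its values are assumptions), and roots_of is a hypothetical helper, not part of Disvortilo.

```python
from enum import Enum, auto


class WordPart(Enum):
    """Local stand-in mirroring the documented member names; values are assumed."""
    PREFIX = auto()
    ROOT = auto()
    SUFFIX = auto()
    FULL_WORD = auto()
    POS = auto()
    NUMBER = auto()
    NAME = auto()
    CORRELATIVE_START = auto()
    CORRELATIVE_END = auto()


def roots_of(analyses):
    """Return, for each analysis, the morphemes tagged as ROOT (hypothetical helper)."""
    return [
        [morpheme for morpheme, part in analysis if part is WordPart.ROOT]
        for analysis in analyses
    ]


# Sample data shaped like the documented parse_detailed return value.
sample = [(
    ("pli", WordPart.FULL_WORD),
    ("bon", WordPart.ROOT),
    ("ig", WordPart.SUFFIX),
    ("ojn", WordPart.POS),
)]

print(roots_of(sample))
# > [['bon']]
```

The same comprehension should apply to real parse_detailed output, provided the library's own WordPart is used in the comparison instead of the stand-in.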
split_sentence splits a sentence into Esperanto word-like tokens.
It supports Esperanto diacritics, optional trailing apostrophes, and numeric forms such as 3 and 3an.
Example:
from disvortilo import split_sentence
split_sentence("Mi vidas 3an domon.")
# > ['Mi', 'vidas', '3an', 'domon']
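
The tokenization described above can be approximated with a single regular expression. This is an illustrative sketch only, not the library's implementation: TOKEN_RE and split_sentence_sketch are hypothetical names, and the exact rules (for example, that only the ASCII apostrophe is accepted) are assumptions.

```python
import re

# Esperanto alphabet: ASCII letters plus the six accented letters, both cases.
LETTERS = "A-Za-zĈĉĜĝĤĥĴĵŜŝŬŭ"

TOKEN_RE = re.compile(
    rf"\d+[{LETTERS}]*"   # a number, optionally followed by letters ("3", "3an")
    rf"|[{LETTERS}]+'?"   # a word with an optional trailing apostrophe ("l'")
)


def split_sentence_sketch(sentence):
    """Return word-like tokens, dropping whitespace and punctuation."""
    return TOKEN_RE.findall(sentence)


print(split_sentence_sketch("Mi vidas 3an domon."))
# > ['Mi', 'vidas', '3an', 'domon']
```

For real use, prefer the library's split_sentence, which may handle cases this sketch does not.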