Datasets

Text datasets

Parallel corpora

Monolingual corpora

Classification

POS-tagging

NER

  • WikiANN (the transcription is problematic: Latin and Cyrillic scripts are used inconsistently, and Wikipedia markup is parsed incorrectly; if you still want to use it, see the wikiann directory)
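
If you do use WikiANN, a quick sanity check can help gauge how many tokens are affected by the two problems noted above. The following is a minimal sketch, not part of this repository: the helper names `scripts_in`, `is_mixed_script`, and `has_markup_residue` are hypothetical, and the list of markup patterns is an assumption about what leftover Wikipedia markup typically looks like.

```python
import re
import unicodedata

# Assumed patterns for leftover wiki markup: [[links]], {{templates}}, '''bold''', pipes.
MARKUP_RE = re.compile(r"\[\[|\]\]|\{\{|\}\}|'{2,}|\|")


def scripts_in(token: str) -> set[str]:
    """Return which of the scripts LATIN / CYRILLIC occur among the letters of a token."""
    found = set()
    for ch in token:
        if not ch.isalpha():
            continue  # digits and punctuation carry no script information here
        name = unicodedata.name(ch, "")
        for script in ("LATIN", "CYRILLIC"):
            if name.startswith(script):
                found.add(script)
    return found


def is_mixed_script(token: str) -> bool:
    """True if a single token mixes Latin and Cyrillic letters."""
    return len(scripts_in(token)) > 1


def has_markup_residue(token: str) -> bool:
    """True if the token contains one of the assumed wiki markup patterns."""
    return MARKUP_RE.search(token) is not None
```

Running both checks over the token column and counting the flagged tokens gives a rough estimate of how noisy the data is before you decide whether to train on it.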

Question Answering

Instruction

Audio datasets