bond-lab/Bahasa-Wordnet
This is the raw data for the Wordnet Bahasa, a wordnet for the Malay
languages (currently Malaysian and Indonesian).

For more details see the project page at:
http://sourceforge.net/p/wn-msa/

The data is released under the MIT license.

File format:

    synset\tlang\tgoodness\tlemma

synset is the offset-pos from Princeton wordnet 3.0

lang is:
    B (Bahasa = msa)
    I (Indonesian = ind)
    M (Malay = zsm)

goodness is:
    Y = hand checked and good
    O = automatic high quality (good)
    M = automatic medium quality (ok)
    L = automatic, probably bad (low)
    X = hand checked and bad

Normal release has only Y and O.

e.g.
    00015388-n    B    X    fauna
    00015388-n    M    Y    haiwan
    00015388-n    I    Y    hewan

Note: msa is the supertype of ind and zsm

========================================================================

Apostrophe should be (’) U+2019 as in: Côte d’Ivoire.
Technically glottal stop should be (ʼ) Letter apostrophe U+02BC.
We need to make the lookup more forgiving of this.

There are some abbreviations in use:
    yg = yang
    sso =

========================================================================

Def:
    06822958-n    DEF    tanda koma di bawah konsonan c tanda bunyi 's'
    06823760-n    DEF    dua titik di atas huruf vokal
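
The four-column format and the apostrophe note above can be illustrated with a short Python sketch. This is not part of the project's own tooling: the names `Entry`, `parse_line`, `release_entries` and `lookup_key` are hypothetical, and folding both U+2019 and U+02BC to the ASCII apostrophe is just one possible way to make lookup more forgiving.

```python
from typing import Iterable, Iterator, NamedTuple

# Codes as listed in the format description.
LANG_CODES = {"B": "msa", "I": "ind", "M": "zsm"}
GOODNESS_CODES = {"Y", "O", "M", "L", "X"}
RELEASE_GOODNESS = {"Y", "O"}  # "Normal release has only Y and O."

# Assumption: fold the two apostrophe characters named in the README
# onto the ASCII apostrophe (U+0027) when building lookup keys.
APOSTROPHES = str.maketrans({"\u2019": "'", "\u02bc": "'"})


class Entry(NamedTuple):
    synset: str    # offset-pos, e.g. "00015388-n"
    lang: str      # B, I or M
    goodness: str  # Y, O, M, L or X
    lemma: str


def parse_line(line: str) -> Entry:
    """Parse one tab-separated line: synset, lang, goodness, lemma."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated fields: {line!r}")
    synset, lang, goodness, lemma = fields
    if lang not in LANG_CODES:
        raise ValueError(f"unknown lang code: {lang!r}")
    if goodness not in GOODNESS_CODES:
        raise ValueError(f"unknown goodness code: {goodness!r}")
    return Entry(synset, lang, goodness, lemma)


def release_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield only the entries a normal release would keep (Y and O)."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_line(line)
        if entry.goodness in RELEASE_GOODNESS:
            yield entry


def lookup_key(lemma: str) -> str:
    """Return a lemma with apostrophe variants folded to U+0027."""
    return lemma.translate(APOSTROPHES)


if __name__ == "__main__":
    sample = [
        "00015388-n\tB\tX\tfauna\n",
        "00015388-n\tM\tY\thaiwan\n",
        "00015388-n\tI\tY\thewan\n",
    ]
    for entry in release_entries(sample):
        print(entry.synset, LANG_CODES[entry.lang], entry.lemma)
```

With the three example lines above, the B/X line for "fauna" is filtered out because X marks a hand-checked bad entry. An index keyed on `lookup_key(lemma)` would then match a query typed with any of the three apostrophe characters.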