The historical dictionary of the Balinese language — Balinese · Indonesian · English
Named after the palm leaf manuscripts of Bali — the primary medium through which Balinese knowledge, literature, and language have been preserved for over a millennium. LBD is the first attempt at an OED-grade historical dictionary of the Balinese language.
LBD is an open, corpus-backed, historically grounded dictionary of the Balinese language — trilingual across Balinese, Indonesian, and English. Each entry traces a word from its earliest attested form in lontar manuscripts and copper plate inscriptions (prasasti) through its modern usage, with full register (unda-usuk) stratification and etymological depth.
The OED is the reference model: not a translation dictionary, not a learner's dictionary, but a scholarly record of what words mean, what they have meant, and where they come from — grounded in primary evidence.
lema: padem
aksara: ᬧᬤᭂᬫ᭄
register: alus singgih
pos: verba
etymology: Kawi paḍem < Sanskrit padma (lotus) → euphemistic register shift
senses:
1. [ritual/formal] meninggal dunia (of royalty, high caste) — to die (honorific)
attestations:
- Prasasti Sukawana A1 (~882 CE): "...sang hyang paḍem..."
- Lontar Usada Rare (undated, Gedong Kirtya LT 1247): "yan padem sapisan..."
register_equivalents:
andap: mati
alus_sor: seda
alus_singgih: padem
LBD is built on two existing projects:
| Project | Role in LBD |
|---|---|
lontar-ocr |
Corpus ingestion — converts manuscript images to Unicode text, feeds the attestation layer |
lontar |
Rendering and export — Aksara Bali text shaping (lontar-aksara), scholarly output formats (DOCX, PDF, LaTeX) |
LBD is not a standalone project. It is the dictionary layer that makes the outputs of lontar-ocr and lontar meaningful as preservation infrastructure.
| Feature | Modern Kamus (Balai Bahasa) | BASAbali Wiki | LBD |
|---|---|---|---|
| Trilingual (Bali–ID–EN) | Bali–ID only | Bali–ID–EN | ✅ |
| Etymology | ❌ | ❌ | ✅ Full chain |
| Earliest attestation | ❌ | ❌ | ✅ Corpus-backed |
| Unda-usuk register cross-links | Partial | Partial | ✅ Structured |
| Aksara Bali per entry | ❌ | Partial | ✅ |
| IPA transcription | ❌ | ❌ | ✅ |
| Dialect tagging | ❌ | ❌ | ✅ |
| Open structured data | ❌ | Partial | ✅ CC BY-SA |
| Historical depth | ❌ | ❌ | ✅ OED-model |
| Phase | Status | Goal |
|---|---|---|
| 0 | 🔴 Design | Schema, data model, editorial governance |
| 1 | 🔴 Design | Seed database (5,000–10,000 entries), REST + GraphQL API |
| 1.5 | 🔴 Design | Public web UI — entry pages, search, browse, corpus viewer |
| 2 | 🔴 Design | Aksara Bali rendering, lontar-aksara integration |
| 3 | 🔴 Design | Corpus attestation pipeline via lontar-ocr |
| 4 | 🔴 Design | Editorial interface — review queue, diff view, dispute workflow |
| 5 | 🔴 Design | Institutional partnerships, Van der Tuuk extraction |
Status legend: 🔴 Design · 🟡 In Progress · 🟢 Usable · ✅ Stable
- Gedong Kirtya, Singaraja — largest lontar collection in the world (~3,000+ texts)
- Leiden University Library — Van der Tuuk bequest, partially digitized
- Udayana University — active transliteration program
- Museum Bali, Denpasar
- Van der Tuuk, Kawi-Balineesch-Nederlandsch Woordenboek (1897–1912) — 4 vols, ~3,600 pp — the historical backbone, public domain
- Zoetmulder, Old Javanese-English Dictionary (1982) — Kawi layer reference
- Balai Bahasa Provinsi Bali, Kamus Bali–Indonesia (3rd ed.) — modern lemma base
- Sutjaja, Tuttle Concise Balinese Dictionary — ~18,000 entries, register-aware
See ARCHITECTURE.md for the full technical design.
See TODO.md for the detailed task breakdown.
- Rust toolchain (1.76+ recommended)
- PostgreSQL 17
- Meilisearch running locally (default http://localhost:7700)
cp .env.example .env
# edit DATABASE_URL, MEILISEARCH_URL, MEILISEARCH_KEY
# Run migrations (requires database exists)
sqlx migrate run
# Seed sample data
psql -d lbd -f scripts/seed_data.sql
# Run server
cargo runcurl http://127.0.0.1:3000/health
curl http://127.0.0.1:3000/entries/padem- axum 0.7.5
- tokio 1.37.0
- sqlx 0.7.4
- meilisearch-sdk 0.27.0
- serde/serde_json 1.0.x
cargo runboots successfully (health endpoint OK)cargo checkclean on updated dependencies
Dictionary data: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
Software: Licensed under Apache-2.0.
ᬮᭀᬦ᭄ᬢᬭ᭄ ᬩᬲ ᬩᬮᬶ — Lontar Basa Bali