- 1. History and Development of the Project
- 2. Methodology
- 3. Team & Contributors
- 4. Workflow & Design
- 5. Bibliography
- 6. License, Contact & Citation
- CLDF Datasets
MAGRAM stands for MAinz GRAMmaticalization project. It was funded by the German Research Foundation (DFG; under BI 591/12-1) and conducted at the Johannes Gutenberg Universität Mainz under the leadership of Prof. Dr. Walter Bisang and apl. Prof. Dr. Andrej Malchukov, from January 2016 to March 2020.
MAGRAM began with the hypothesis that grammaticalization is not necessarily cross-linguistically homogeneous (cf. Bisang 2011). The project explored areal and/or cross-phylogenetic variation based on two key hypotheses:
- Grammaticalization paths of the type [SOURCE → TARGET] vary areally/genetically in terms of both the sources and targets associated with specific concepts.
- There are cross-linguistic differences in the degree of covariation of meaning and form—semantic changes do not always imply form-related changes.
A major outcome of MAGRAM is the two-volume Comparative Handbook of Grammaticalization Scenarios, which includes:
- A position paper (Bisang et al. 2020a),
- A methodology paper (Bisang et al. 2020b),
- 25 detailed studies on grammaticalization scenarios across languages and areas.
Each grammaticalization path is described by its source and target, with attention to both form and function/meaning.
Example:
(1) willan 'want' → will FUT (cf. Kuteva et al. 2019: 453)
Intermediate steps are not systematically recorded; instead, we focus on the earliest lexical meaning and the most grammaticalized function and form.
Data sources:
- Chapters from the Comparative Handbook of Grammaticalization Scenarios
- A list of 30 source concepts sent to contributors
Table 1. Source Concepts (Bisang et al. 2020b: 97–101 with reference to Heine & Kuteva 2002)
| No. | Source Concept | No. | Source Concept |
|---|---|---|---|
| 1 | arrive | 16 | head (body part) |
| 2 | back (body part) | 17 | here |
| 3 | body | 18 | leave |
| 4 | child | 19 | live |
| 5 | come | 20 | love |
| 6 | copula | 21 | man |
| 7 | demonstrative | 22 | one |
| 8 | do | 23 | say |
| 9 | fall | 24 | see |
| 10 | finish | 25 | side |
| 11 | follow | 26 | sit |
| 12 | get | 27 | take |
| 13 | give | 28 | thing |
| 14 | go | 29 | want |
| 15 | hand (body part) | 30 | woman |
Note: Absence of a source concept in the database for a given language does not necessarily imply its non-existence.
We based our grammaticalization parameters on Lehmann’s (1995) framework, adjusted as follows:
Table 2. Modified Lehmann Parameters
| Domain | Paradigmatic | Syntagmatic |
|---|---|---|
| Weight | 1. Semantic Integrity (SI) | — |
| | 2. Phonetic Reduction (PR) | |
| Cohesion | 3. Paradigmaticity (PM) | 4. Bondedness (BD) |
| Variability | 5. Paradigmatic Variability (PV) | 6. Syntagmatic Variability (SV) |
We also added:
- Decategorization (DC)
- Allomorphy (AM)
Each target is assigned a value from 1 to 4 for every parameter. Example for Bondedness (BD):
1. Free morpheme or lexical root
2. Clitic
3. Agglutinative affix
4. Part of a portmanteau or suprasegmental/zero morpheme
Two types of data are collected:
- Change data: change (1) / no change (0) from source to target
- Value data: value (1–4) per parameter for the target
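The two data types can be pictured as one small record per parameter and path. The following is a minimal sketch, not the database's actual schema: the class name `ParameterEvaluation`, its field names, and the assumption that the four Bondedness levels map to the values 1 to 4 in the order listed above are all illustrative.

```python
from dataclasses import dataclass

# Parameter abbreviations as given in the text: six modified Lehmann
# parameters plus Decategorization (DC) and Allomorphy (AM).
PARAMETERS = ("SI", "PR", "PM", "BD", "PV", "SV", "DC", "AM")


@dataclass(frozen=True)
class ParameterEvaluation:
    """One parameter evaluated for one path (hypothetical layout)."""

    parameter: str  # one of PARAMETERS
    change: int     # change data: 1 = change from source to target, 0 = no change
    value: int      # value data: 1-4, describing the target

    def __post_init__(self):
        if self.parameter not in PARAMETERS:
            raise ValueError(f"unknown parameter: {self.parameter!r}")
        if self.change not in (0, 1):
            raise ValueError("change must be 0 or 1")
        if self.value not in (1, 2, 3, 4):
            raise ValueError("value must be between 1 and 4")


# Bondedness scale, assuming the listed levels correspond to values 1-4 in order.
BONDEDNESS_SCALE = {
    1: "free morpheme or lexical root",
    2: "clitic",
    3: "agglutinative affix",
    4: "part of a portmanteau or suprasegmental/zero morpheme",
}

# Illustrative values only; not taken from the database.
example = ParameterEvaluation(parameter="BD", change=1, value=2)
print(BONDEDNESS_SCALE[example.value])
```

Validating in `__post_init__` keeps out-of-range codes from entering an analysis silently. Whether every parameter carries both a change and a value entry in the published data is an assumption of this sketch.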
- Walter Bisang (WB) – PI
- Andrej Malchukov (AM) – PI
- Linlin Sun (LS) – annotation and evaluation
- Iris Rieder (IR) – annotation and evaluation
- Eduard Schroeder (ES) – annotation and evaluation
- Marvin Martiny (MM) – annotation and evaluation, database curation
The contributors to our database in alphabetical order of their area of expertise:
- Beja (Cushitic, Afroasiatic): Martine Vanhove
- Chinese (Sinitic, Sino-Tibetan): Linlin Sun and Walter Bisang
- Creoles and Pidgins: Susanne Michaelis and Martin Haspelmath
- Emai (Edoid, Niger-Congo): Ronald P. Schaefer and Francis O. Egbokhare
- German (Indo-European): Luise Kempf and Damaris Nübling
- Hoocak (Core Siouan): Johannes Helmbrecht
- Indo-Aryan (Indo-European): Annie Montaut
- Iranian (Indo-European): Agnes Korn
- Iroquoian: Marianne Mithun
- Japhug (Rgyalrong, Sino-Tibetan): Guillaume Jacques
- Khmer (Austroasiatic): Walter Bisang
- Korean: Seongha Rhee
- Lezgic (Northeast Caucasian): Timur Maisak
- Malayo-Polynesian (Austronesian) & Mori (Yareban): Nikolaus P. Himmelmann
- Manding (Mande, Niger-Congo): Denis Creissels
- Mian (Papua New Guinea): Sebastian Fedden
- Nyulnyulan (Non-Pamanyungan, Australian): William B. McGregor
- Quechua and Aymara: Willem F. H. Adelaar
- Romance (Indo-European): Michela Cennamo
- Slavic (Indo-European): Björn Wiemer
- Southern Uto-Aztecan: Zarina Estrada-Fernández
- Thai: Walter Bisang
- Tswana (Bantu, Niger-Congo): Denis Creissels
- Tungusic (Manchu-Tungusic, Transeurasian): Andrej Malchukov
- Uralic: Juha Janhunen
- Yeniseian: Edward Vajda
- Yucatecan (Mayan): Christian Lehmann
This is a schematic overview of the steps leading to the creation of the database:
Figure 1. Schematic overview of the project workflow.
There are three main data types:
- Metadata (path-specific and language-specific)
- Path descriptions (including examples and comments)
- Parameter evaluations
- Language material only
- One form (citation form)
- Allomorphs: separated by commas; zero alternation in round brackets
- Coding: periphery in brackets (with `+` for unbound items)
- Non-Latin script: in `<angle brackets>`
- Syntactically (not phonologically) bound morphs: use `<.>` instead of space
- Function only
- One function per entry
- Correspondence to form columns
- Primordial source prioritized
- Formatting:
  - (core) grammatical: CAPS
  - (core) lexical: 'quotes'
  - Other: lowercase
- Coding periphery follows the same principles as form
- Variation in terminology
- Grammatical labels: We use a fixed set of grammatical labels defined in a separate document ('grammatical labels').
- Polygrammaticalization: When multiple targets stem from the same source, we display the most grammaticalized meaning; others may appear in the comments.
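The formatting conventions for function entries (CAPS for grammatical, quotes for lexical, lowercase otherwise) lend themselves to a simple automatic check. The sketch below is an illustration under assumptions: the function name `classify_function_label` is hypothetical, it looks only at the surface formatting, and it does not handle coding periphery in brackets.

```python
def classify_function_label(label: str) -> str:
    """Return 'grammatical', 'lexical', or 'other' based on formatting alone."""
    text = label.strip()

    # Lexical meanings are wrapped in quotes, e.g. 'want'.
    # Both straight and typographic single quotes are accepted.
    if len(text) >= 2 and text[0] in ("'", "‘") and text[-1] in ("'", "’"):
        return "lexical"

    # Grammatical labels are written in capitals, e.g. FUT.
    # Digits and punctuation are ignored so that composite labels still match.
    letters = [ch for ch in text if ch.isalpha()]
    if letters and all(ch.isupper() for ch in letters):
        return "grammatical"

    return "other"


# The path in example (1): willan 'want' → will FUT
print(classify_function_label("'want'"), classify_function_label("FUT"))
```

A check like this only confirms that an entry follows the formatting convention; it does not verify that a capitalized label belongs to the fixed set of grammatical labels.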
For the web version of the database, we have added groupings of grams for easier access. These groupings rest on broad interpretations of the items' functions. Although they have some theoretical background, they should not be taken as proper comparative concepts: they are provided for convenience and were assigned quickly, without the attention to detail that a strict grouping would require.
- Glossing follows the Leipzig Glossing Rules
- If no citation is given for an example, the contributor is the source (p.c.)
Multiple comments are separated by semicolons.
Each of the 8 parameters is defined in detail in the methodology paper (Bisang et al. 2020b). Additional notes:
- Decategorization & Syntagmatic Variability: These use the lexical item’s original category (e.g., noun, verb) and positional freedom as baselines.
- Semantic Integrity: The distinction between ‘referential’ (2) and ‘relational’ (3) should be understood in the sense that lexical categories have denotations (they independently convey the concept of a property, action or object), while grammatical categories do not.
- Inflectional semantic case is treated as value 4 for the parameter of Semantic Integrity, because practically every semantic case also constitutes a minor pattern of syntactic case.
- Phonetic Reduction: The nucleus approach singles out one morph of the source construction as the nucleus, measures it, and compares it to the corresponding part of the target construction. If elements fuse in the course of the development (univerbation), the resulting values for phonetic reduction may be unintuitive. The advantage of this method is theoretical consistency: it applies the same metrics to the same unit of analysis (the nucleus).
- Bisang, Walter. 2011. Grammaticalization and typology. Oxford University Press.
- Bisang et al. 2020a. Grammaticalization Scenarios, Vol. 1 & 2.
- Bisang et al. 2020b. Measuring grammaticalization: A questionnaire.
- Hammarström et al. 2022. Glottolog.
- Kuteva et al. 2019. World Lexicon of Grammaticalization (2nd ed.).
- Lehmann, Christian. 1995 [1982]. Thoughts on grammaticalization. Lincom Europa.
Dataset citation (e.g., Yucatec Maya):
Christian Lehmann & MAGRAM team. 2025. Yucatec Maya. In: Bisang, Walter & Malchukov, Andrej & Martiny, Marvin (eds.) Mainz Grammaticalization Project. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://crossgram.clld.org/contributions/magram
Full database citation:
Bisang, Walter, Malchukov, Andrej & Martiny, Marvin (eds.). 2025. MAGRAM: Mainz Grammaticalization Project Database. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://crossgram.clld.org/contributions/magram
The following CLDF datasets are available in cldf:
