Studies in Type-Token Relations, Hapaxes and n-Legomena
Updated Jun 5, 2023 - Jupyter Notebook
Collection of stand-alone supplementary materials (reports, scripts, tables, and figures) for the journal version of the paper "Corrections of Zipf's and Heaps' Laws Derived from Hapax Rate Models". The article is to appear in the Journal of Quantitative Linguistics.
Collection of scripts, tables, and figures for the working version of the paper "Corrections of Zipf's and Heaps' Laws Derived from Hapax Rate Models".
This project processes text files to identify hapax legomena (words that appear only once) and saves the results in an Excel file. It uses tokenization, optional lemmatization, and frequency analysis to extract and list these rare words.
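The pipeline described above (tokenize, optionally lemmatize, count frequencies, keep words seen once) can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `find_hapaxes`, the regex tokenizer, and the `lemmatize` callback are all assumptions made for the example, and the Excel export step is left out.

```python
import re
from collections import Counter


def find_hapaxes(text, lemmatize=None):
    """Return the sorted list of words that occur exactly once in `text`.

    `lemmatize` is a hypothetical optional callable mapping a token to its
    lemma; the tokenizer is a simple regex, assumed for illustration only.
    """
    # Lowercase the text and extract runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Optionally normalize each token to its lemma before counting.
    if lemmatize is not None:
        tokens = [lemmatize(token) for token in tokens]
    # Count token frequencies and keep only those with a count of one.
    counts = Counter(tokens)
    return sorted(word for word, count in counts.items() if count == 1)


sample = "the cat sat on the mat and the dog sat too"
print(find_hapaxes(sample))
```

In this sample, "the" and "sat" repeat, so every other word is a hapax. To save the result to a spreadsheet, one option would be to wrap the list in a pandas `DataFrame` and call `to_excel`, assuming pandas and an Excel writer such as openpyxl are installed.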