The largest open-source dataset for Tunisian Arabic (Derja) NLP, featuring social media text, transcripts, and e-commerce data for LLM training and fine-tuning.
Updated Jan 28, 2026