BengaliAI/reg-speech-aacl

Are ASR foundation models generalized enough to capture features of regional dialects for low-resource languages?

Accepted at AACL 2025 | Paper (Coming Soon) | Poster | Dataset | Model | Demo


Abstract

Conventional research on speech recognition modeling relies on the canonical form for most low-resource languages, while automatic speech recognition (ASR) for regional dialects is treated as a fine-tuning task. To investigate the effects of dialectal variations on ASR, we develop a 78-hour annotated Bengali Speech-to-Text (STT) corpus named Ben-10.

Investigation from linguistic and data-driven perspectives shows that speech foundation models struggle heavily in regional dialect ASR, both in zero-shot and fine-tuned settings. We observe that all deep learning methods struggle to model speech data under dialectal variations, but dialect-specific model training alleviates the issue. Our dataset also serves as an out-of-distribution (OOD) resource for ASR modeling under constrained resources. The dataset and code developed for this project are publicly available.


Competitions & Reports

A competition was organized on Kaggle based on this dataset.


Dataset Details

  • Size: 78 hours of annotated Bengali speech
  • Dialects: 10 regional dialects of Bengali
  • Format: Speech-to-Text (STT) corpus
  • Use Case: Out-of-distribution (OOD) resource for ASR modeling under constrained resources

Repository Structure

reg-speech-aacl/
├── finetuning/          # Fine-tuning scripts and notebooks
├── result_analysis/     # Analysis notebooks and results
├── AACL_2025_Poster.pdf # Conference poster
└── README.md            # This file

Usage

The repository contains:

  • Finetuning scripts (finetuning/): Code for fine-tuning ASR models on regional dialects
  • Result analysis (result_analysis/): Analysis notebooks for different regions
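
As a rough illustration of the kind of per-dialect comparison the analysis notebooks could support, the sketch below computes word error rate (WER) grouped by dialect. It is not taken from this repository: WER as the metric, the function names `edit_distance` and `wer_by_dialect`, and the `(dialect, reference, hypothesis)` tuple layout are all assumptions made for the example, and the sample strings are placeholders rather than Ben-10 data.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_dialect(samples):
    """Aggregate WER per dialect.

    samples: iterable of (dialect, reference, hypothesis) strings.
    This tuple layout is hypothetical, not the repository's data format.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for dialect, reference, hypothesis in samples:
        ref_tokens = reference.split()
        errors[dialect] += edit_distance(ref_tokens, hypothesis.split())
        words[dialect] += len(ref_tokens)
    # Skip dialects whose references contain no words to avoid dividing by zero.
    return {d: errors[d] / words[d] for d in words if words[d] > 0}


if __name__ == "__main__":
    # Placeholder transcripts; dialect labels are illustrative only.
    demo = [
        ("dialect_a", "a b c d", "a b x d"),
        ("dialect_a", "e f", "e f"),
        ("dialect_b", "g h", "g"),
    ]
    for dialect, wer in sorted(wer_by_dialect(demo).items()):
        print(f"{dialect}: WER = {wer:.3f}")
```

Errors and reference word counts are summed per dialect before dividing, so longer utterances carry proportionally more weight than they would if per-utterance WER values were averaged.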

Citation

If you use this dataset or code in your research, please cite:

@inproceedings{ben10-2025,
  title={Are ASR foundation models generalized enough to capture features of regional dialects for low-resource languages?},
  author={...},
  booktitle={Proceedings of AACL 2025},
  year={2025}
}

Citation will be updated once the paper is published.
