A fine-tuned Whisper model for transcribing Hakha Chin (CNH) speech to text and translating to English. Built to help bridge language barriers in Hakha Chin-speaking communities.
This project fine-tunes OpenAI's Whisper model on Hakha Chin Bible audio to create a speech-to-text system for this low-resource language. The model transcribes Hakha Chin audio and provides automatic translation to English.
Current Status: ✅ V4 Model - Production Ready (with limitations)
- Speech-to-Text: Transcribe Hakha Chin audio to text
- Translation: Automatic translation to English
- Web Interface: Easy-to-use Gradio interface
- Audio Processing: Handles uploaded files and microphone input
- Sliding Window: Processes long audio in manageable chunks
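Outside the Gradio app, the same transcription step can be reproduced in a few lines. This is a minimal sketch, assuming the fine-tuned checkpoint lives in `./whisper-hakha-chin/` and the clip is short (under 30 seconds); `example.mp3` is a placeholder filename, not a file in the repository.

```python
import librosa
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the fine-tuned checkpoint (saved under ./whisper-hakha-chin/ by the training script)
processor = WhisperProcessor.from_pretrained("./whisper-hakha-chin")
model = WhisperForConditionalGeneration.from_pretrained("./whisper-hakha-chin")
model.eval()

# Whisper expects 16 kHz mono audio; example.mp3 is a placeholder clip
audio, _ = librosa.load("example.mp3", sr=16000, mono=True)

# Convert the waveform to log-mel features and decode a transcription
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```

Longer recordings go through the sliding-window processing described under Audio Processing below.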
- Python 3.8+
- GPU recommended (CUDA support for faster processing)
- ~2GB disk space for model files
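A quick way to check whether the CUDA path will be used (torch is already a required dependency):

```python
import torch

# True means the interface and training scripts can run on the GPU with FP16
print("CUDA available:", torch.cuda.is_available())
```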
```bash
# Clone the repository
git clone https://github.com/trinitron88/ChinTranslator2.git
cd ChinTranslator2

# Install dependencies
pip install torch transformers gradio librosa deep-translator soundfile numpy

# Launch the web interface
python gradio_interface.py
```

The interface will launch in your browser with a shareable public link.
- Training Data: 1,375 segments from 44 Bible chapters (Mark & Matthew)
- Validation Data: 344 segments
- Training Loss: 6.47 → 2.0 (smooth descent)
- Estimated Accuracy: 60-70% on biblical text
Training Data Constraints:
- All male narrators (Bible speakers only)
- Biblical/formal vocabulary domain
- Read speech, not conversational
- Single audio source
Performance:
- Processing speed: ~3-4x real-time on GPU
- Lower accuracy on non-biblical conversational speech
- Reduced accuracy on female voices
Domain:
- Best for biblical or formal Hakha Chin
- Limited modern/conversational vocabulary
```
.
├── README.md
├── gradio_interface.py       # Main web interface (optimized)
├── fine-tuning-aligned.py    # Training script
├── whisper_alignment_2.py    # Audio-text alignment
├── process-matthew.py        # Data preprocessing
├── continue_training.py      # Continue training existing model
├── aligned_train_data.json   # Training segments (1,375)
├── aligned_val_data.json     # Validation segments (344)
├── Audio/                    # Audio files (mark_*.mp3, matt_*.mp3)
├── Text/                     # Text transcripts (*.txt)
└── whisper-hakha-chin/       # Fine-tuned model (V4)
```
1. Launch the Gradio interface: `python gradio_interface.py`
2. Choose input method:
   - Upload Audio: Upload an audio file (MP3, WAV, etc.)
   - Record Audio: Use your microphone to record
3. Click "Transcribe & Translate"
4. View results:
   - Hakha Chin transcription
   - English translation
Prepare your data:
- Audio files in
Audio/directory - Corresponding text files in
Text/directory - Use naming convention:
book_chapter.mp3andbook_chapter.txt
- Audio files in
-
Align audio and text:
python whisper_alignment_2.py
-
Train the model:
python fine-tuning-aligned.py
-
Model will be saved to
./whisper-hakha-chin/
- Base Model: OpenAI Whisper Small (244M parameters)
- Task: Transcription (not translation)
- Language: Hakha Chin (forced, no language token)
- Approach: Fine-tuning with frozen encoder, trainable decoder
- Epochs: 5
- Batch Size: 4 (effective 16 with gradient accumulation)
- Learning Rate: 1e-5
- Optimizer: AdamW
- Mixed Precision: FP16 (on GPU)
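As a rough illustration, the settings above map onto the Hugging Face training API roughly as follows. This is a sketch, not a copy of `fine-tuning-aligned.py`; dataset loading, the data collator, and the trainer construction are omitted.

```python
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

# Start from Whisper Small and freeze the encoder so only the decoder is updated
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
for param in model.model.encoder.parameters():
    param.requires_grad = False

# Hyperparameters taken from the list above; AdamW is the Trainer's default optimizer
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-hakha-chin",
    num_train_epochs=5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # 4 x 4 = effective batch size of 16
    learning_rate=1e-5,
    fp16=True,                      # mixed precision when a GPU is available
)
```

A `Seq2SeqTrainer` built from these arguments and the aligned train/validation segments then runs the five epochs.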
- Sample Rate: 16kHz (mono)
- Segmentation: Non-silence detection
- Window: 30-second sliding windows with overlap
- Normalization: Automatic volume adjustment
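A sketch of that preprocessing with librosa is shown below. The 5-second overlap and the commented `top_db` threshold are illustrative assumptions, since the README only states that the 30-second windows overlap.

```python
import librosa
import numpy as np

def sliding_windows(path, window_s=30.0, overlap_s=5.0, sr=16000):
    """Load audio as 16 kHz mono, peak-normalize it, and yield overlapping windows."""
    audio, _ = librosa.load(path, sr=sr, mono=True)

    # Simple peak normalization (stand-in for the automatic volume adjustment)
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak

    # For silence-based segmentation, librosa.effects.split returns non-silent intervals:
    # non_silent = librosa.effects.split(audio, top_db=30)

    # Slide a 30-second window across the clip, stepping by window minus overlap
    size = int(window_s * sr)
    step = int((window_s - overlap_s) * sr)
    for start in range(0, max(len(audio) - size, 0) + 1, step):
        yield audio[start:start + size]
```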
- Method: Google Translate API (via deep-translator)
- Source: Hakha Chin (CNH) or auto-detect
- Target: English
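A minimal example of that call (the `source="auto"` fallback is taken from the auto-detect option above; the interface may pass an explicit Hakha Chin code instead):

```python
from deep_translator import GoogleTranslator

# `transcription` stands for the Hakha Chin text produced by the Whisper model
transcription = "..."

# Auto-detect the source language and translate to English
english = GoogleTranslator(source="auto", target="en").translate(transcription)
print(english)
```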
| Version | Chapters | Segments | Status | Notes |
|---|---|---|---|---|
| V1 | Mark | - | ❌ Abandoned | Severe repetition, undertrained |
| V2 | Mark (16) | 540 | ✅ Working | Good baseline, limited vocabulary |
| V3 | Mark + Matthew | 1,517 | ❌ Failed | Bad alignment, multilingual gibberish |
| V4 | Mark + Matthew (44) | 1,375 | ✅ Current | Proper alignment, production ready |
- Field test with native speakers
- Collect accuracy metrics on real conversations
- Optimize processing speed
- Expand to all 260+ available Bible chapters
- Add data augmentation (speed, pitch, noise)
- Test on diverse audio conditions
- Collect conversational Hakha Chin data
- Add female and diverse speakers
- Implement speaker diarization
- Train dedicated Hakha Chin → English translation model
- Create community crowdsourcing platform
- Audio: Faith Comes By Hearing (Hakha Chin Bible)
- Text: YouVersion Bible (Hakha Chin)
- Books: Gospel of Mark (16 chapters), Gospel of Matthew (28 chapters)
Contributions welcome! Areas of interest:
- Additional training data (conversational Hakha Chin)
- Performance optimizations
- Accuracy improvements
- UI/UX enhancements
- Documentation
This project is for educational and language preservation purposes. Please respect the licenses of:
- OpenAI Whisper (Apache 2.0)
- Bible audio and text sources
- Transformers library (Apache 2.0)
- OpenAI for the Whisper model
- Faith Comes By Hearing for Hakha Chin Bible audio
- YouVersion for Hakha Chin Bible text
- The Hakha Chin community
For questions or collaboration: GitHub Issues
Last Updated: November 5, 2025
Model Version: V4
Status: Production Ready (with known limitations)