A Python project that uses OpenAI's Whisper model to transcribe audio recordings of children's reading tests.
- Clone this repository
- Install the required dependencies:
pip install -r requirements.txt
- Create a .env file with your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
- Place your audio files in the audio_files/ directory
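The .env step above can be illustrated with a small loader. This is only a sketch using the standard library; the project may well use a package such as python-dotenv instead, which this README does not confirm. The helper names `load_env_file` and `get_api_key` are hypothetical and not part of the repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ (hypothetical helper)."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        # Variables already set in the real environment take precedence.
        os.environ.setdefault(key, value)
    return loaded


def get_api_key():
    """Return OPENAI_API_KEY, or raise a clear error if it is missing."""
    load_env_file()
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Keeping the key in .env (and out of version control) means the scripts never need the key hard-coded in source.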
Files in the audio_files/ directory can be referred to by filename alone:
python transcribe.py filename.mp3
Or provide a full path to an audio file anywhere on your system:
python transcribe.py /path/to/your/audio/file.mp3
You can also specify an output file for the transcription:
python transcribe.py filename.mp3 -o transcription.txt
To use a different audio files directory:
python transcribe.py filename.mp3 -d your_audio_directory
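For readers curious how the options above might fit together, here is a rough sketch of the command-line handling using argparse. It is an illustration, not the actual contents of transcribe.py: the long option names (`--output`, `--directory`) and the helper names `build_parser` and `resolve_audio_path` are assumptions; only `-o`, `-d`, and the default audio_files/ directory come from this README.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the documented options (long names are assumed)."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file.")
    parser.add_argument("audio", help="filename inside the audio directory, or a full path")
    parser.add_argument("-o", "--output", help="write the transcription to this file")
    parser.add_argument("-d", "--directory", default="audio_files",
                        help="directory searched for bare filenames")
    return parser


def resolve_audio_path(audio, directory="audio_files"):
    """Use the path as given if it is absolute or exists, else look in `directory`."""
    candidate = Path(audio)
    if candidate.is_absolute() or candidate.exists():
        return candidate
    return Path(directory) / candidate
```

With this layout, `python transcribe.py filename.mp3` would resolve to audio_files/filename.mp3, while a full path is used unchanged.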
This project includes a tool to convert various audio formats (WMA, DSS) to MP3:
python wma.py your_file.wma
python wma.py your_file.dss
python wma.py your_file.wma -o output.mp3
python wma.py your_directory/
python wma.py your_directory/ -o output_directory/
python wma.py your_directory/ -r
python wma.py your_file.wma -b 320k
python wma.py your_directory/ -t wma,dss,wav
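The directory forms of the command imply some file-gathering logic. The sketch below shows one way that could work, under the assumptions that `-r` means "recurse into subdirectories" and that `-t` takes a comma-separated list of extensions; neither is stated explicitly in this README. The function names `collect_audio_files` and `output_path_for` are hypothetical.

```python
from pathlib import Path


def collect_audio_files(directory, types=("wma", "dss"), recursive=False):
    """Return sorted paths in `directory` whose extension is in `types`."""
    # Normalise "wma", ".wma", and "WMA" to the same form.
    wanted = {"." + t.lower().lstrip(".") for t in types}
    root = Path(directory)
    entries = root.rglob("*") if recursive else root.iterdir()
    return sorted(p for p in entries if p.is_file() and p.suffix.lower() in wanted)


def output_path_for(source, output_dir=None):
    """Map e.g. a.wma to a.mp3, optionally inside a separate output directory."""
    source = Path(source)
    target_dir = Path(output_dir) if output_dir else source.parent
    return target_dir / (source.stem + ".mp3")
```

A value such as `wma,dss,wav` could be passed as `types="wma,dss,wav".split(",")`.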
Run the interactive sample script:
python sample_usage.py
- Transcribes audio files using OpenAI's Whisper model
- Automatically converts audio to the required MP3 format using pydub
- Supports saving transcriptions to a text file
- Reads the API key from environment variables for secure storage
- Dedicated audio_files/ directory for organizing your audio files
- Audio format conversion utility (WMA, DSS to MP3)
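As a rough idea of what the transcription and save-to-file features involve, here is a minimal sketch built around the `audio.transcriptions.create` call of the official `openai` Python package (version 1 or later). Whether the project calls the API in exactly this way, and whether it uses the `whisper-1` model name, is an assumption; `transcribe_file` and `make_client` are hypothetical names.

```python
from pathlib import Path


def make_client():
    """Create an API client; needs the `openai` package and OPENAI_API_KEY set."""
    from openai import OpenAI  # imported lazily so the helper below has no hard dependency
    return OpenAI()


def transcribe_file(client, audio_path, output_path=None, model="whisper-1"):
    """Send one audio file for transcription and return the text.

    `client` is expected to behave like `openai.OpenAI()`.
    """
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    text = result.text
    # Optionally save the transcription, as the -o option does.
    if output_path is not None:
        Path(output_path).write_text(text, encoding="utf-8")
    return text
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object, without spending API credits.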
The transcription tool handles a range of audio formats, including:
- MP3
- WAV
- FLAC
- OGG
- WMA (will be automatically converted to MP3)
- DSS (will be automatically converted to MP3)
- Other formats supported by pydub/ffmpeg
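The automatic conversion mentioned in the list above could look roughly like the following. This is a sketch, not the project's code: it assumes only WMA and DSS files are converted, that the MP3 is written next to the original, and that a bitrate of 192k is acceptable. `AudioSegment.from_file` and `export` are real pydub calls, which in turn require ffmpeg to be installed; the function names here are hypothetical.

```python
from pathlib import Path

# Assumption: only these extensions need converting before transcription.
NEEDS_CONVERSION = {".wma", ".dss"}


def needs_conversion(path):
    """Return True if the file extension is one we convert to MP3 first."""
    return Path(path).suffix.lower() in NEEDS_CONVERSION


def convert_to_mp3(path, bitrate="192k"):
    """Convert `path` to an MP3 next to the original and return the new path."""
    from pydub import AudioSegment  # imported lazily; needs pydub and ffmpeg

    source = Path(path)
    target = source.with_suffix(".mp3")
    AudioSegment.from_file(str(source)).export(str(target), format="mp3", bitrate=bitrate)
    return target


def prepare_for_transcription(path):
    """Return a path that is ready to upload, converting first if required."""
    return convert_to_mp3(path) if needs_conversion(path) else Path(path)
```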