Automatic Speech Recognition (ASR)

About • Installation • How To Use • Credits • License

About

This repository is an attempt to train an ASR model. The trained model achieves about 23.8% WER and 10.9% CER on the test-clean dataset when decoding with beam search and language model (LM) guidance. The underlying architecture is DeepSpeech2.
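For readers unfamiliar with the "Argmax" baseline reported in the metrics, here is a minimal sketch of greedy decoding. It assumes the model is trained with CTC (as DeepSpeech2 typically is) and that index 0 is the blank token. The names `ctc_greedy_decode`, `BLANK`, and `id_to_char` are illustrative and are not taken from this repository.

```python
from itertools import groupby

BLANK = 0  # assumption: index of the CTC blank token


def ctc_greedy_decode(log_probs, id_to_char, blank=BLANK):
    """Pick the best token per frame, merge repeats, then drop blanks."""
    # Argmax over the vocabulary axis for every time frame.
    frame_ids = [max(range(len(row)), key=row.__getitem__) for row in log_probs]
    # Collapse consecutive duplicates (e.g. 1, 1, 0, 2 -> 1, 0, 2).
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    # Remove blanks and map the remaining ids to characters.
    return "".join(id_to_char[idx] for idx in collapsed if idx != blank)


# Toy example: 4 frames, vocabulary {blank, "a", "b"}.
toy_scores = [
    [0.1, 0.9, 0.0],
    [0.1, 0.9, 0.0],
    [0.8, 0.1, 0.1],
    [0.1, 0.2, 0.7],
]
decoded = ctc_greedy_decode(toy_scores, {1: "a", 2: "b"})
```

Beam search with an LM keeps several candidate transcripts per frame instead of one, which is why it can recover from locally wrong frame-level choices.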

val_CER_(Argmax): 0.15835317756054224
val_WER_(Argmax): 0.44245870354749056
val_CER_(BeamSearchLM): 0.1086015791507941
val_WER_(BeamSearchLM): 0.2379445747889793
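As a reference for how these metrics are usually defined, the sketch below computes WER and CER as Levenshtein distance divided by reference length, at word and character level respectively. It is a plain-Python illustration; the function names are hypothetical and the repository's own metric code may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: char-level edit distance / reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("the cat sat", "the cat sit")` counts one substituted word out of three reference words. Both metrics can exceed 1.0 when the hypothesis is much longer than the reference.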

Follow the steps described in the "How To Use" section to run inference with the best model and reproduce the stated results, or run training to create a model with the same performance.

Link to wandb artifacts

Full report can be found here

Installation

Follow these steps to install the project:

(Optional) Create and activate a new environment using conda or venv (+pyenv).

a. conda version:

# create env
conda create -n project_env python=PYTHON_VERSION

# activate env
conda activate project_env

b. venv (+pyenv) version:

# create env
~/.pyenv/versions/PYTHON_VERSION/bin/python3 -m venv project_env

# alternatively, using default python version
python3 -m venv project_env

# activate env
source project_env/bin/activate

Install all required packages:
```
pip install -r requirements.txt
```
Install pre-commit:
```
pre-commit install
```
Also make sure that the gzip utility is installed.

How To Use

To train a model and reproduce the stated results, run the following command:

python3 train.py -cn=deepspeech_char_colab.yaml trainer.save_dir="saved"

Train for 23 epochs.

To download the model that achieves the stated results, use this command:

gdown https://drive.google.com/uc\?id\=1Ook3vMZV9c7D-TMen6GjIBKWzaAQnwHR

To run inference (evaluate the model or save predictions):

python3 inference.py -cn=inference.yaml inferencer.from_pretrained=<path_to_downloaded_model>

The specified LM will be downloaded automatically.

Credits

This repository is based on a PyTorch Project Template.
