About • Installation • How To Use • Credits • License
This repository is an attempt to train an automatic speech recognition (ASR) model. The trained model achieves 23% WER and 10% CER on the test-clean dataset using beam search with language model guidance. The underlying model is based on DeepSpeech2.
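The metrics in this README compare "Argmax" decoding against beam search with a language model. As a rough illustration of the argmax variant, here is a minimal sketch that assumes the model is trained with CTC, as in the original DeepSpeech2 design. The blank symbol `^` and the function name `ctc_greedy_decode` are hypothetical and may not match this repository's code.

```python
from itertools import groupby

# Hypothetical blank symbol; the repository's actual blank token may differ.
BLANK = "^"


def ctc_greedy_decode(frame_symbols, blank=BLANK):
    """Collapse consecutive repeats, then drop blank symbols."""
    collapsed = [symbol for symbol, _ in groupby(frame_symbols)]
    return "".join(symbol for symbol in collapsed if symbol != blank)


# Made-up per-frame argmax output for a short utterance.
frames = list("hh^e^ll^lloo")
print(ctc_greedy_decode(frames))  # prints "hello"
```

Beam search with a language model keeps several candidate transcriptions per frame instead of a single best symbol, which is why it can recover from errors that argmax decoding commits to.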
val_CER_(Argmax): 0.15835317756054224
val_WER_(Argmax): 0.44245870354749056
val_CER_(BeamSearchLM): 0.1086015791507941
val_WER_(BeamSearchLM): 0.2379445747889793

Follow the steps described in the "How To Use" section to run inference with the best model and reproduce the stated results, or run training to create a model with the same performance.
Full report can be found here
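For readers unfamiliar with the metrics above: WER and CER are conventionally the Levenshtein edit distance between the reference and the predicted transcription, divided by the reference length in words or characters. The sketch below shows that standard definition. The function names are illustrative, and the repository's own metric code may handle normalization and empty references differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        # Assumption: an empty reference scores 0 only for an empty hypothesis.
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: char-level edit distance / reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the cat sat", "the cat sit"))  # 1 wrong word out of 3
print(cer("the cat sat", "the cat sit"))  # 1 wrong character out of 11
```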
Follow these steps to install the project:
- (Optional) Create and activate a new environment using conda or venv (+pyenv).

  a. conda version:

         # create env
         conda create -n project_env python=PYTHON_VERSION

         # activate env
         conda activate project_env

  b. venv (+pyenv) version:

         # create env
         ~/.pyenv/versions/PYTHON_VERSION/bin/python3 -m venv project_env

         # alternatively, using default python version
         python3 -m venv project_env

         # activate env
         source project_env/bin/activate

- Install all required packages:

      pip install -r requirements.txt

- Install pre-commit:

      pre-commit install

- Also make sure that the gzip utility is installed.
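One way to confirm the gzip requirement before running anything is to look the tool up on `PATH`. This is a small optional sketch using only the Python standard library. The helper name `find_tool` is illustrative and not part of this repository.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


gzip_path = find_tool("gzip")
if gzip_path is None:
    print("gzip not found on PATH; install it with your system package manager")
else:
    print(f"gzip found at {gzip_path}")
```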
To train a model and reproduce the stated results, run the following command:

    python3 train.py -cn=deepspeech_char_colab.yaml trainer.save_dir="saved"

Train for 23 epochs.

To download the model that achieves the stated results, use this command:

    gdown https://drive.google.com/uc\?id\=1Ook3vMZV9c7D-TMen6GjIBKWzaAQnwHR

To run inference (evaluate the model or save predictions):

    python3 inference.py -cn=inference.yaml inferencer.from_pretrained=<path_to_downloaded_model>

The specified LM will be downloaded automatically.
This repository is based on a PyTorch Project Template.