Script adapted from: https://github.com/m-bain/whisperX
Before running, create a virtual environment and install the dependencies listed in requirements.txt
'''
Transcribe an audio file, run speaker diarization on it, and save the results.
Parameters
----------
audio_file : Path
    Path to the audio file to process (.wav).
output_path : Path
    Path where the output files are saved.
max_speakers : int
    Maximum number of speakers in the conversation, default = 2.
Returns
-------
The following files are written to output_path:
audio file name + transcription.txt - audio transcription
audio file name + diarization.json - speaker diarization
audio file name + combined_result_json.json - transcription and diarization combined, in .json format
audio file name + combined_result_tsv.tsv - transcription and diarization combined, in .tsv format

Command line usage:
    $ python x_asr.py audio_file output_path max_speakers
'''
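
The command-line interface and output naming described in the docstring above could be wired up roughly as follows. This is a minimal sketch, not the script's actual implementation: the names `parse_args`, `output_files`, and `OUTPUT_SUFFIXES` are hypothetical, the `_` separator between the audio file name and each suffix is an assumption, and no whisperX calls are shown.

```python
import argparse
from pathlib import Path

# Suffixes copied from the docstring; how they are joined to the audio
# file name is an assumption (see `sep` below).
OUTPUT_SUFFIXES = (
    "transcription.txt",
    "diarization.json",
    "combined_result_json.json",
    "combined_result_tsv.tsv",
)


def parse_args(argv=None):
    """Parse the three positional arguments from the usage line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file and run speaker diarization."
    )
    parser.add_argument("audio_file", type=Path, help="audio file to process (.wav)")
    parser.add_argument("output_path", type=Path, help="where output files are saved")
    # nargs="?" makes the argument optional so the documented default of 2 applies.
    parser.add_argument("max_speakers", type=int, nargs="?", default=2,
                        help="maximum number of speakers (default: 2)")
    return parser.parse_args(argv)


def output_files(audio_file, output_path, sep="_"):
    """Build the four output paths from the audio file name (hypothetical helper).

    `sep` is an assumed separator; the docstring only says
    "audio file name + suffix".
    """
    stem = Path(audio_file).stem  # file name without directory or extension
    return [Path(output_path) / f"{stem}{sep}{suffix}" for suffix in OUTPUT_SUFFIXES]
```

Called as `args = parse_args()` followed by `output_files(args.audio_file, args.output_path)`, this yields one path per output file listed in the docstring. Making `max_speakers` optional keeps the two-argument invocation valid while still honoring the documented default.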