Without depending on #4, which may take a while, one idea for quick simulation and quality assessment of the TTS acoustic model would be to add some similarity measure at the spectrogram level.
The ground truth would have to be generated with the same parameters used for FastSpeech 2, though, and finding an exact match can be a burden.
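
To make the idea concrete, here is a rough sketch of what such a measure could look like. It assumes both spectrograms are already available as NumPy arrays of shape `(n_mels, n_frames)` and were extracted with identical parameters. The function name `mel_similarity` and the choice of metrics (mean absolute error plus per-frame cosine similarity) are my own placeholders, not anything from the repo. Length mismatches are handled by naive truncation, which is probably too crude if predicted durations drift; something like DTW might be needed there.

```python
import numpy as np


def mel_similarity(predicted: np.ndarray, reference: np.ndarray) -> dict:
    """Compare two mel-spectrograms of shape (n_mels, n_frames).

    Hypothetical helper: assumes both inputs were computed with the same
    extraction parameters (FFT size, hop length, mel bins, normalization).
    """
    if predicted.ndim != 2 or reference.ndim != 2:
        raise ValueError("expected 2-D arrays of shape (n_mels, n_frames)")
    if predicted.shape[0] != reference.shape[0]:
        raise ValueError("mel bin counts differ; extraction parameters do not match")

    # Naive alignment (assumption): truncate both to the shorter length.
    n_frames = min(predicted.shape[1], reference.shape[1])
    pred = predicted[:, :n_frames]
    ref = reference[:, :n_frames]

    # Mean absolute error over all mel bins and frames (lower is better).
    mae = float(np.mean(np.abs(pred - ref)))

    # Cosine similarity per frame, averaged over frames (higher is better).
    eps = 1e-8  # avoids division by zero on silent frames
    dot = np.sum(pred * ref, axis=0)
    norms = np.linalg.norm(pred, axis=0) * np.linalg.norm(ref, axis=0)
    cosine = float(np.mean(dot / (norms + eps)))

    return {"mae": mae, "cosine": cosine, "frames_compared": n_frames}
```

Something like `mel_similarity(model_output, ground_truth_mel)` could then be run over a small validation set and the scores averaged, as a cheap sanity check before anything downstream is ready.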