Open
Labels
bug: Something isn't working
Description
Passing the language string to the engine doesn't consistently translate the output. For example, in the log below the input is a recording of English speech and the language parameter is Chinese (`zh`), yet the output is English text.
> {"event": "on", "type": "transcribe", "language": "zh"}
Recording started
Warning: Some sources (like microphones) may produce inaudible results
with 8-bit sampling. Use '-f' argument to increase resolution
e.g. '-f S16_LE'.
Recording WAVE 'test.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono
{"event": "off", "type": "transcribe"}
Aborted by signal Terminated...
Recording stopped
Device set to use cpu
Output language: zh
/home/lee/.pyenv/versions/3.10.15/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py:573: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.
warnings.warn(
You have passed language=zh, but also have set `forced_decoder_ids` to [[1, None], [2, 50360]] which creates a conflict. `forced_decoder_ids` will be ignored in favor of language=zh.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Seemingly at random, later attempts translate correctly: the English "I like Chinese food" becomes "我喜欢中国菜", which is correct. Look into the attention mask and the EOS and PAD tokens. What is "my input's attention_mask"?
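On the last question: in general, an attention mask has the same layout as the padded input and holds 1 where the input is real data and 0 where it is padding. The warning in the log says the mask could not be inferred because the pad token is the same as the EOS token, so it has to be supplied explicitly. Below is a minimal, library-free sketch of the idea. The function name `pad_with_attention_mask` and the sample values are hypothetical, made up for illustration; this is not how transformers builds the mask internally.

```python
from typing import List, Tuple


def pad_with_attention_mask(
    clips: List[List[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Pad variable-length sequences to a common length and build the mask.

    Hypothetical helper for illustration only.
    mask[i][j] is 1 where clips[i] has real data and 0 where padding was added.
    """
    max_len = max(len(clip) for clip in clips)
    padded, mask = [], []
    for clip in clips:
        n_pad = max_len - len(clip)
        padded.append(list(clip) + [pad_value] * n_pad)
        mask.append([1] * len(clip) + [0] * n_pad)
    return padded, mask


# Two made-up "audio clips" of different lengths.
clips = [[0.1, -0.2, 0.3], [0.5]]
padded, mask = pad_with_attention_mask(clips)
print(padded)  # [[0.1, -0.2, 0.3], [0.5, 0.0, 0.0]]
print(mask)    # [[1, 1, 1], [1, 0, 0]]
```

As far as I can tell, the Whisper feature extractor in transformers accepts `return_attention_mask=True`, and the resulting `attention_mask` can be passed to `generate()` alongside `input_features`. That should be checked against the installed version before relying on it.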