Use int8 compute_type for WhisperX on non-CUDA devices #58

Open
pyjeebz wants to merge 1 commit into facebookresearch:main from pyjeebz:fix/whisperx-cpu-compute-type

pyjeebz commented Apr 30, 2026

ExtractWordsFromAudio._get_transcript_from_audio hardcodes compute_type = "float16" for the WhisperX subprocess regardless of device (tribev2/eventstransforms.py:107-108):

device = "cuda" if torch.cuda.is_available() else "cpu"
compute_type = "float16"

faster-whisper (via ctranslate2) refuses float16 on CPU and raises:

ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.

Anyone without an NVIDIA GPU hits this when calling get_events_dataframe() on text or audio inputs (text goes through TTS → WhisperX too), which blocks model.predict() end-to-end on CPU-only environments.
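For anyone who wants to confirm which compute types a given machine accepts before launching WhisperX, here is a minimal diagnostic sketch. It is not part of this patch. `ctranslate2.get_supported_compute_types` is a ctranslate2 call; the helper names `supported_compute_types` and `first_supported` are hypothetical and used only for illustration.

```python
from typing import Sequence, Set


def supported_compute_types(device: str) -> Set[str]:
    """Ask ctranslate2 which compute types `device` ("cpu" or "cuda") accepts."""
    # Imported lazily so first_supported() stays usable without ctranslate2.
    import ctranslate2

    return set(ctranslate2.get_supported_compute_types(device))


def first_supported(preferred: Sequence[str], supported: Set[str]) -> str:
    """Return the first entry of `preferred` that is present in `supported`."""
    for compute_type in preferred:
        if compute_type in supported:
            return compute_type
    raise ValueError(
        f"none of {list(preferred)} is supported; available: {sorted(supported)}"
    )
```

On a CPU-only box, `first_supported(["float16", "int8"], supported_compute_types("cpu"))` would be one way to pick a workable value without hardcoding it; the set ctranslate2 reports depends on the build and the hardware.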

The fix mirrors the device selection one line up: keep float16 on CUDA and fall back to int8 on CPU. int8 is the compute type the faster-whisper README recommends for CPU.
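The diff itself is not reproduced on this page, so the following is only a sketch of the selection logic described above. The function name `select_device_and_compute_type` is hypothetical; in the patched method the flag would presumably come from `torch.cuda.is_available()`, as in the existing `device` line.

```python
from typing import Tuple


def select_device_and_compute_type(cuda_available: bool) -> Tuple[str, str]:
    """Pick the WhisperX device and a compute type that device supports."""
    device = "cuda" if cuda_available else "cpu"
    # float16 is kept on CUDA; CPU falls back to int8 because
    # ctranslate2 rejects float16 there with a ValueError.
    compute_type = "float16" if device == "cuda" else "int8"
    return device, compute_type
```

Taking the CUDA flag as a parameter keeps the sketch testable without torch installed; inlining the two conditional expressions next to the existing `device` assignment would be equivalent.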

Repro on a CPU-only box:

from tribev2 import TribeModel
m = TribeModel.from_pretrained("facebook/tribev2")
m.get_events_dataframe(text_path="some_text.txt")  # raises before this patch

CUDA users are unaffected. Verified locally on WSL2 / torch==2.6.0+cpu: WhisperX large-v3 returned the expected word timings, and a full text-mode model.predict() completed end-to-end (output shape (n_timesteps, 20484) on fsaverage5, matching the model card). I haven't run the [test] extra's pytest suite — happy to if useful. CLA signed.

faster-whisper / ctranslate2 reject float16 on CPU with ValueError, so
the WhisperX subprocess in ExtractWordsFromAudio fails for any user
without an NVIDIA GPU. Mirror the pattern already used for `device` and
pick int8 (faster-whisper's recommended CPU compute type) when CUDA is
unavailable.
Copilot AI review requested due to automatic review settings April 30, 2026 04:52
meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 30, 2026

Copilot AI left a comment

Pull request overview

Fixes CPU-only execution of WhisperX transcription by selecting a supported compute_type when CUDA isn’t available, unblocking get_events_dataframe() / end-to-end model.predict() on non-NVIDIA environments.

Changes:

  • Switch WhisperX compute_type from always-float16 to float16 on CUDA and int8 on CPU.
