(retriever) ASR refactor by charlesbluca · Pull Request #1658 · NVIDIA/NeMo-Retriever

charlesbluca · 2026-03-19T14:27:48Z

Description

Refactors ASR so all “chunk rows → transcript rows” logic goes through asr_chunks_to_text, with ASRActor as a thin wrapper.

What changed

asr_chunks_to_text(batch_df, model=..., client=..., asr_params=...)
Single batch entry point for ASR. ASRActor only constructs model/client from ASRParams and delegates here.
Injectable model / client
Inprocess and the GPU pool can pass a ParakeetCTC1B1ASR (or remote client) so the same code path runs inside and outside Ray map_batches.
Long audio (local Parakeet)
ParakeetCTC1B1ASR splits inputs that exceed the model length budget and concatenates transcripts.
Media probing
media_interface: more robust ffprobe handling when duration or bit_rate is missing (e.g. VBR / bad probes).
Inprocess
_load_doc_to_df / _iter_doc_chunks unify loading for pdf / html / image / audio / txt in the ingest loop.
API
Removed apply_asr_to_df; use asr_chunks_to_text directly (tests updated).
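
As an illustration of the "Long audio" item above, here is a minimal sketch of the split-and-concatenate idea. It is not the code from this PR: `split_samples`, `transcribe_long`, the `max_len` budget, and the whitespace join are all hypothetical, and the actual ParakeetCTC1B1ASR logic may split and merge differently.

```python
from typing import Callable, List, Sequence


def split_samples(samples: Sequence[float], max_len: int) -> List[Sequence[float]]:
    """Cut a sample buffer into consecutive segments of at most max_len samples."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [samples[i:i + max_len] for i in range(0, len(samples), max_len)]


def transcribe_long(
    samples: Sequence[float],
    transcribe: Callable[[Sequence[float]], str],
    max_len: int,
) -> str:
    """Transcribe each segment independently and join the non-empty pieces.

    `transcribe` stands in for a single-segment model call (an assumption).
    """
    parts = [transcribe(seg).strip() for seg in split_samples(samples, max_len)]
    return " ".join(p for p in parts if p)


# Usage with a stand-in transcriber that just reports the segment length.
print(transcribe_long([0.0] * 25, lambda seg: f"w{len(seg)}", max_len=10))
```

A hard cut at a fixed sample count can split a word across segments; whether the PR overlaps segments or cuts on silence is not stated in the description.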

Why

One implementation serves Ray batch, inprocess, and the audio CLI, which makes it easier to fix and extend.
An injected model supports the GPU pool and avoids duplicate model setup when the caller already holds the model.
Preserves the behavior on main for remote segment_audio while keeping the refactored structure.
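
The injection pattern described above could look roughly like the sketch below. Everything here is hypothetical and simplified: `chunks_to_text_sketch`, `ActorSketch`, and `FakeModel` are stand-ins, rows are plain dicts rather than a DataFrame, and the wrapper takes the model directly instead of building it from ASRParams. The real `asr_chunks_to_text(batch_df, model=..., client=..., asr_params=...)` internals are not shown in this description.

```python
from typing import Any, Dict, List, Optional


class FakeModel:
    """Stand-in for a local model or remote client exposing transcribe()."""

    def __init__(self) -> None:
        self.calls = 0

    def transcribe(self, audio: bytes) -> str:
        self.calls += 1
        return f"{len(audio)} bytes"


def chunks_to_text_sketch(
    rows: List[Dict[str, Any]],
    model: Optional[Any] = None,
    client: Optional[Any] = None,
) -> List[Dict[str, Any]]:
    """Single batch entry point: chunk rows in, transcript rows out."""
    backend = model if model is not None else client
    if backend is None:
        raise ValueError("pass either model or client")
    # Copy each row and attach the transcript; input rows are left untouched.
    return [{**row, "text": backend.transcribe(row["audio"])} for row in rows]


class ActorSketch:
    """Thin wrapper: holds the backend and delegates every batch."""

    def __init__(self, model: Optional[Any] = None, client: Optional[Any] = None) -> None:
        self._model = model
        self._client = client

    def __call__(self, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return chunks_to_text_sketch(rows, model=self._model, client=self._client)


# One model instance is shared by the direct call and the wrapper.
shared = FakeModel()
rows = [{"path": "a.wav", "audio": b"abc"}]
print(chunks_to_text_sketch(rows, model=shared))
print(ActorSketch(model=shared)(rows))
```

Because the caller owns the backend, a GPU pool or an inprocess loop can reuse one loaded model across calls instead of constructing it again inside the wrapper.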

Testing

pytest nemo_retriever/tests/test_asr_actor.py

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
If adjusting docker-compose.yaml environment variables, have you ensured those are mimicked in the Helm values.yaml file?

- Rebase onto main (includes remote segment_audio / 5c5557a)
- asr_chunks_to_text + model/client injection; _build_output_rows, _infer_remote
- Parakeet long-audio split; media_interface ffprobe robustness
- inprocess: _load_doc_to_df / _iter_doc_chunks; asr_chunks_to_text with model
- stage: asr_chunks_to_text; tests; no apply_asr_to_df

Made-with: Cursor

charlesbluca added 2 commits March 19, 2026 14:23

Merge branch 'main' into asr-refactor

c8e1416

charlesbluca wants to merge 2 commits into NVIDIA:main from charlesbluca:asr-refactor
