(retriever) Ensure batched OCR inference #1680

Open
charlesbluca wants to merge 3 commits into NVIDIA:main from charlesbluca:retriever-ocr-batching
charlesbluca (Collaborator) commented Mar 20, 2026

  • Break down the local OCR pipeline to enable batched inference
  • Ensure that upstream stages supply enough inputs to saturate the OCR model

Description

  • Refactors local OCR to bypass the pipeline API, which only accepts single-image inputs
  • Refactors the upstream ocr_page_elements stage to aggregate OCR inputs before sending them to the local model, so that the model stays saturated
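
The aggregation idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `aggregate_crops`, `run_batched`, and `scatter`, the `infer_batch` callable, and the `batch_size` parameter are all hypothetical stand-ins. The sketch flattens per-page OCR inputs into one list, runs the model over fixed-size batches, and routes the results back to their pages.

```python
from typing import Callable, Iterable, List, Sequence, Tuple, TypeVar

Crop = TypeVar("Crop")      # hypothetical: one OCR input, e.g. an image crop
Result = TypeVar("Result")  # hypothetical: one OCR output for a crop


def aggregate_crops(pages: Iterable[Sequence[Crop]]) -> Tuple[List[Crop], List[int]]:
    """Flatten per-page crops into one list, recording each crop's page index."""
    flat: List[Crop] = []
    owners: List[int] = []
    for page_idx, crops in enumerate(pages):
        for crop in crops:
            flat.append(crop)
            owners.append(page_idx)
    return flat, owners


def run_batched(
    flat: Sequence[Crop],
    infer_batch: Callable[[Sequence[Crop]], Sequence[Result]],
    batch_size: int,
) -> List[Result]:
    """Call the (assumed) batched model on consecutive chunks of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    results: List[Result] = []
    for start in range(0, len(flat), batch_size):
        chunk = flat[start:start + batch_size]
        out = infer_batch(chunk)
        if len(out) != len(chunk):
            raise RuntimeError("model must return one result per input")
        results.extend(out)
    return results


def scatter(results: Sequence[Result], owners: Sequence[int], num_pages: int) -> List[List[Result]]:
    """Regroup flat results by page, preserving the original crop order."""
    per_page: List[List[Result]] = [[] for _ in range(num_pages)]
    for result, page_idx in zip(results, owners):
        per_page[page_idx].append(result)
    return per_page
```

With three pages holding 2, 0, and 3 crops and `batch_size=4`, this sketch would issue two model calls (4 inputs, then 1) instead of five single-image calls, which is the kind of saturation the description aims for.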

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables, have you ensured those are mirrored in the Helm values.yaml file?

@charlesbluca charlesbluca requested a review from a team as a code owner March 20, 2026 17:29
@charlesbluca charlesbluca requested a review from drobison00 March 20, 2026 17:29