LLM call batching and multithreading by radu-gheorghe · Pull Request #266 · SeaseLtd/rated-ranking-evaluator

radu-gheorghe · 2026-05-12T14:14:34Z

This PR includes #265 (which includes #264).

It adds performance and cost improvements by supporting batching: when we evaluate N results for a query, we split them into micro-batches (of configurable size) and send each micro-batch to the LLM in a single call. You'd typically need a clearer prompt and possibly even a better model but, in my experience, it pays off: a good prompt can get quite big, and with batching it is sent once per micro-batch instead of once per result.
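To make the micro-batching idea concrete, here is a minimal sketch, not the code from this PR. The names `micro_batches`, `judge_results`, `call_llm` and `fake_llm` are hypothetical, and the LLM is replaced by a stand-in function so the example runs on its own. It assumes the LLM returns exactly one rating per document in the batch, in order.

```python
from typing import Callable, Iterator, List, Sequence


def micro_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def judge_results(
    query: str,
    docs: Sequence[str],
    batch_size: int,
    call_llm: Callable[[str, Sequence[str]], List[int]],
) -> List[int]:
    """Rate every document for `query`, making one LLM call per micro-batch."""
    ratings: List[int] = []
    for batch in micro_batches(docs, batch_size):
        batch_ratings = call_llm(query, batch)
        # Assumption: one rating per document, in the same order as the batch.
        if len(batch_ratings) != len(batch):
            raise ValueError("LLM returned a different number of ratings than documents")
        ratings.extend(batch_ratings)
    return ratings


def fake_llm(query: str, batch: Sequence[str]) -> List[int]:
    """Stand-in for a real LLM call: rates 1 if the query text appears in the doc."""
    return [1 if query in doc else 0 for doc in batch]


docs = ["red shoes", "blue hat", "red hat", "green socks", "red scarf"]
print(judge_results("red", docs, batch_size=2, call_llm=fake_llm))  # [1, 0, 1, 0, 1]
```

With 5 results and a batch size of 2, this makes 3 calls instead of 5, so the shared prompt is sent 3 times rather than 5.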

Side-effect: the prompt is now configurable (for the batch case at least).

We can also parallelize both generating queries from docs and generating judgments from query-doc pairs.
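As an illustration of the parallel part, the sketch below runs judgments for query-doc pairs on a thread pool. It is an assumption about the general approach, not the PR's implementation: `judge_pairs_parallel`, `fake_judge` and the `max_workers` parameter are hypothetical names, and the judge function stands in for a real LLM call. Threads fit this workload because the calls are I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def judge_pairs_parallel(
    pairs: Sequence[Tuple[str, str]],
    judge: Callable[[str, str], int],
    max_workers: int = 4,
) -> List[int]:
    """Judge (query, doc) pairs concurrently; results keep the order of `pairs`."""
    if max_workers <= 1:
        # Sequential path, so threading stays opt-in.
        return [judge(query, doc) for query, doc in pairs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, whatever the completion order.
        return list(pool.map(lambda pair: judge(*pair), pairs))


def fake_judge(query: str, doc: str) -> int:
    """Stand-in for an LLM judgment: counts words shared by query and doc."""
    return len(set(query.split()) & set(doc.split()))


pairs = [("red shoes", "red shoes size 9"), ("blue hat", "green socks")]
print(judge_pairs_parallel(pairs, fake_judge, max_workers=2))  # [2, 0]
```

The same pattern would apply to generating queries from docs: swap the pair list for a list of documents and the judge for a query-generation function.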

Both features are opt-in. With both disabled, behavior is pretty much as before.

radu-gheorghe added 5 commits May 6, 2026 18:36

allow using search_engine_type=vespa in config and default vespa_schema to collection_name

b04b8a4

make queries needed respect the cache

d4d4ff7

test for query budget

b21bf95

category queries initial implementation

1cf8eaa

initial batching and threading implementation

a0c63ba

radu-gheorghe wants to merge 5 commits into SeaseLtd:dataset-generator from radu-gheorghe:dataset-generator-llm-batching
