fix(generate): avoid None entries in merged logits_processors#1230

Closed
BLuchterhand wants to merge 1 commit into
ml-explore:mainfrom
BLuchterhand:fix/logits-processors-none-element-iteration

@BLuchterhand BLuchterhand commented Apr 29, 2026

Bug

PromptProcessingBatch.extend filled missing per-slot logits_processors with [None] * N. Merging an unconfigured batch with a processor-equipped batch produced a list shaped [None, ..., [fn], ...]. GenerationBatch._step (line 1346) iterates self.logits_processors[e] under the any() guard at line 1337, which raises TypeError: 'NoneType' object is not iterable on the None slots.
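The failure mode can be reproduced without mlx-lm at all. The sketch below is a simplified stand-in, not the library's code: `apply_processors` and `double` are hypothetical names, and the guard-then-iterate shape is assumed from the description above rather than copied from `generate.py`.

```python
def double(tokens, logits):
    """Toy logits processor: doubles every logit."""
    return [2 * x for x in logits]


def apply_processors(slots, logits_per_slot):
    """Apply each slot's processors, guarded by any() over the slots."""
    out = list(logits_per_slot)
    if any(slots):  # true as soon as one slot holds a non-empty list
        for e in range(len(slots)):
            for processor in slots[e]:  # raises TypeError if slots[e] is None
                out[e] = processor(None, out[e])
    return out


merged_bad = [None, [double]]  # shape produced by the [None] * N fill
merged_ok = [[], [double]]     # shape produced by the [[]] * N fill

try:
    apply_processors(merged_bad, [[1.0], [1.0]])
    error = None
except TypeError as exc:
    error = str(exc)

result = apply_processors(merged_ok, [[1.0], [1.0]])
all_none = apply_processors([None, None], [[1.0], [1.0]])
```

In this sketch an all-`None` list never reaches the inner loop because `any()` is false, which would explain why only mixed batches hit the error.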

Reproduce

a = PromptProcessingBatch.empty(model, fallback)
a.uids = [0]; a.logits_processors = []
b = PromptProcessingBatch.empty(model, fallback)
b.uids = [1]; b.logits_processors = [make_logits_processors({0: 2000.0})]
a.extend(b)
# a.logits_processors == [None, [<fn>]]   ← later crashes _step

Fix

-            self.logits_processors = [None] * len(self.uids)
+            self.logits_processors = [[]] * len(self.uids)
...
-            else [None] * len(batch.uids)
+            else [[]] * len(batch.uids)

Per-slot type is List[Callable], so the absent-value sentinel should be [], not None. Matches the existing [[]] * len(keep) at line 1120 (PromptProcessingBatch.filter).
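One general Python detail about the `[[]] * n` idiom, shown as a standalone sketch: every slot refers to the same list object. That is harmless as long as slots are only read or replaced wholesale; whether any mlx-lm code mutates a slot in place is not stated in this PR, so treat this as a caveat to check, not as a reported bug.

```python
n = 3

shared = [[]] * n                    # n references to one list object
independent = [[] for _ in range(n)]  # n distinct list objects

# In-place mutation of one slot shows the difference.
shared[0].append("processor")
independent[0].append("processor")

# Replacing a slot (rather than mutating it) is safe in both cases.
replaced = [[]] * n
replaced[0] = ["processor"]
```

Here `shared` ends up with the entry in all three slots, while `independent` and `replaced` change only slot 0.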

Test

test_prompt_processing_batch_extend_mixes_logits_processors in tests/test_generate.py. Asserts no None entries remain in the merged list. Fails on main (AssertionError: None is not an instance of <class 'list'>), passes with the fix.
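For readers without the diff at hand, the assertion has roughly the following shape. This is a hedged sketch only: `merge_slots` is a hypothetical stand-in for the patched merge logic, and the real test drives `PromptProcessingBatch.extend` instead.

```python
import unittest


def merge_slots(left, n_left, right, n_right):
    """Hypothetical stand-in: a side with no processors becomes [] per slot."""
    left = list(left) if left else [[] for _ in range(n_left)]
    right = list(right) if right else [[] for _ in range(n_right)]
    return left + right


class MergeShapeTest(unittest.TestCase):
    def test_no_none_entries(self):
        def identity(tokens, logits):
            return logits

        merged = merge_slots([], 1, [[identity]], 1)
        self.assertEqual(len(merged), 2)
        for entry in merged:
            # With a None fill this fails with
            # "None is not an instance of <class 'list'>".
            self.assertIsInstance(entry, list)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MergeShapeTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```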

Related

#1225 fixes the symmetric stale-length bug in GenerationBatch.filter. This PR touches a different code path, so both can land independently.

Traceback (production hit)

File "mlx_lm/generate.py", line 1346, in _step
    for processor in self.logits_processors[e]:
TypeError: 'NoneType' object is not iterable

@BLuchterhand BLuchterhand force-pushed the fix/logits-processors-none-element-iteration branch 2 times, most recently from b279bd1 to b58f987 Compare April 29, 2026 21:37
@BLuchterhand BLuchterhand marked this pull request as draft April 29, 2026 21:42
PromptProcessingBatch.extend filled missing per-slot logits_processors
with [None] when either side lacked configured processors. Merging an
unconfigured batch with a processor-equipped batch then produced a list
shaped like [None, ..., [fn], ...]. GenerationBatch._step at line 1346
iterates self.logits_processors[e] under the any() guard at line 1337,
which raises TypeError on the None slots.

Fill with [[]] instead. Matches the existing pattern at line 1120
(filter() restoring [[]] * len(keep)) and the per-slot type
List[Callable].

Reproduce: construct two PromptProcessingBatch instances, one without
processors and one with, then call extend; the merged
self.logits_processors contains None entries. New unit test covers this
shape directly.
@BLuchterhand BLuchterhand changed the title fix(generate): handle None entries in GenerationBatch logits_processors fix(generate): avoid None entries in merged logits_processors Apr 29, 2026
@BLuchterhand BLuchterhand force-pushed the fix/logits-processors-none-element-iteration branch from b58f987 to 423301b Compare April 29, 2026 21:47
@BLuchterhand BLuchterhand marked this pull request as ready for review April 29, 2026 21:47
@mloiterman

+1 — independent production hit, batched-inference server, mlx-lm 0.31.3.

Our workload mixes grammar-mode requests (RAG hybrid search using response_format=json_schema) with plain chat requests, both arriving concurrently and batched together by BatchGenerator. The mixed batch is exactly the shape your repro produces: at least one request with logits_processors and at least one without. Single-workload runs (all-grammar or all-plain) don't trip it.

Live traceback at batch_size=8:

File "mlx_lm/generate.py", line 1346, in _step
    for processor in self.logits_processors[e]:
TypeError: 'NoneType' object is not iterable

The diagnosis matches yours exactly. The supervisor in our setup catches the crash via BrokenPipeError, respawns the engine subprocess (~60 s), and in-flight requests return 500 — but that's our restart machinery doing its job, not a real recovery. Without the patch the bug is fatal to any heterogeneous batched workload.

We're applying your diff to our local .venv as a stopgap (mirrors how we're running #1225's patch today, since neither has merged yet). Vouching for the fix shape: [] is the right sentinel for List[Callable], matching the existing [[]] * len(keep) in PromptProcessingBatch.filter:1120.

Validated locally: a regression probe that fires a long plain request followed by a grammar request ~300 ms later (so the grammar request extend()s the existing plain batch) reliably FAILS with 500 {"error":"Generation failed"} on the un-patched engine, and PASSES once the engine respawns onto the patched code.

Related: #1225 lands the symmetric fix in GenerationBatch.filter. Different code paths, same sentinel argument — both are safe to land independently.

@nastya236 nastya236 added the bug Something isn't working label Jun 4, 2026
@nastya236

Collaborator

Thanks! This seems to be duplicated in https://github.com/ml-explore/mlx-lm/pull/1225/changes

@nastya236 nastya236 closed this Jun 8, 2026
mloiterman added a commit to mloiterman/mlx-lm that referenced this pull request Jun 10, 2026
…t per-slot logits_processors

extend() fills missing per-slot logits_processors with [None] * N. Merging
an unconfigured batch with a processor-equipped batch produces a mixed list
([None, ..., [fn], ...]); GenerationBatch._step then iterates
self.logits_processors[e] under the any() guard and crashes with
TypeError: 'NoneType' object is not iterable.

Per-slot type is List[Callable], so the absent-value sentinel is [] —
matching the existing [[]] * len(keep) in PromptProcessingBatch.filter.
samplers keep the None sentinel (consumed as `self.samplers[e] or
self.fallback_sampler`, type Optional[Callable]).

Fix and regression test carried over from ml-explore#1230 (closed as duplicate of
this PR; the two changes are companions in the same file — filter here,
extend there).

Co-authored-by: BLuchterhand <benlucht8@gmail.com>
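
The commit message's distinction between the two sentinels can be illustrated with a small standalone sketch. The names `pick_sampler`, `run_processors`, `greedy`, and `first` are hypothetical; only the `samplers[e] or fallback` expression is taken from the message above.

```python
def greedy(logits):
    """Toy fallback sampler: index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


def first(logits):
    """Toy per-slot sampler: always picks index 0."""
    return 0


def pick_sampler(samplers, e, fallback):
    # Optional[Callable]: None is consumed by `or`, so it is a safe sentinel.
    return samplers[e] or fallback


def run_processors(processors, e, logits):
    # List[Callable]: the slot is iterated, so the safe sentinel is [].
    for processor in processors[e]:
        logits = processor(None, logits)
    return logits


samplers = [None, first]
processors = [[], [lambda tokens, logits: [x + 1 for x in logits]]]

chosen = [pick_sampler(samplers, e, greedy)([0.1, 0.9]) for e in range(2)]
processed = [run_processors(processors, e, [0.0, 0.0]) for e in range(2)]
```

A `None` slot works for samplers because it is never iterated; the same value in a processors slot is what triggers the TypeError.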