Gemma 4 MTP draft models fail to load: unknown model architecture 'gemma4-assistant' #157

Support for these drafters landed in llama.cpp on 2026-06-07
([ggml-org/llama.cpp#23398](https://github.com/ggml-org/llama.cpp/issues/23398), plus E2B/E4B variants in [ggml-org/llama.cpp#24282](https://github.com/ggml-org/llama.cpp/issues/24282)
a day later), but the llama.cpp submodule in v2026.6.9538 and on current main
is pinned at 5343f4502a (2026-06-06), one day before that merge.
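
For anyone triaging this: the name in the error message presumably comes from the `general.architecture` key in the draft model's GGUF metadata, so it can be checked without loading the model at all. Below is a minimal sketch that reads that key straight from the file header. It assumes the little-endian GGUF v2+ layout, and the helper name `read_architecture` is only illustrative, not part of llama.cpp or this project.

```python
import io
import struct

# Byte widths of the fixed-size GGUF metadata value types (type id -> size).
# 0/1: u8/i8, 2/3: u16/i16, 4/5: u32/i32, 6: f32, 7: bool, 10/11: u64/i64, 12: f64
_FIXED = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_u32(f):
    return struct.unpack("<I", f.read(4))[0]


def _read_u64(f):
    return struct.unpack("<Q", f.read(8))[0]


def _read_str(f):
    # GGUF strings are a u64 byte length followed by UTF-8 bytes (no terminator).
    length = _read_u64(f)
    return f.read(length).decode("utf-8")


def _skip_value(f, vtype):
    """Advance past one metadata value of the given type without decoding it."""
    if vtype in _FIXED:
        f.seek(_FIXED[vtype], io.SEEK_CUR)
    elif vtype == _STRING:
        f.seek(_read_u64(f), io.SEEK_CUR)
    elif vtype == _ARRAY:
        elem_type = _read_u32(f)
        count = _read_u64(f)
        if elem_type in _FIXED:
            f.seek(_FIXED[elem_type] * count, io.SEEK_CUR)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def read_architecture(f):
    """Return the general.architecture string from an open binary GGUF file,
    or None if the key is absent."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read_u32(f)
    if version < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    _read_u64(f)  # tensor count, not needed here
    kv_count = _read_u64(f)
    for _ in range(kv_count):
        key = _read_str(f)
        vtype = _read_u32(f)
        if key == "general.architecture" and vtype == _STRING:
            return _read_str(f)
        _skip_value(f, vtype)
    return None
```

To use it, open the draft model in binary mode, for example `with open(path, "rb") as f: print(read_architecture(f))`, where `path` points at the drafter's `.gguf` file. If this prints `gemma4-assistant`, the file itself is fine and the failure is down to the bundled llama.cpp build not knowing that architecture.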

Could you bump the llama.cpp submodule and cut a new release? Happy to test
a pre-release wheel (CUDA 12.8, RTX 5090, Windows).
