qwen3.6-27b-mtp no gains when MTP enabled #581

## MTP Appears Non-Functional on Qwen3.6-27B-MTP (Performance Drops Instead of Improving)

### Environment

- **Model:** [Qwen3.6-27B-MTP-GGUF](https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF)
- **OS:** Fedora 44
- **Runtime:** Vulkan (llama.cpp v2.20.1)
- **GPU:** AMD Radeon RX 7900 XTX

### Issue

Enabling **MTP** for `Qwen3.6-27B-MTP-GGUF` does not appear to provide any multi-token prediction benefit.

Instead of increasing throughput, generation speed drops significantly, suggesting that MTP may not actually be functioning while still incurring additional overhead.

### Results

| Configuration | Generation Speed |
|--------------|------------------|
| MTP Disabled | ~35 tokens/s |
| MTP Enabled | ~14 tokens/s |

### Expected Behavior

When MTP is working correctly, generation speed should increase due to successful multi-token predictions.

For comparison, on the same system, `Qwen3.6-35B-A3B-MTP-GGUF` behaves as expected:

| Model | MTP Disabled | MTP Enabled |
|-------|-------------|-------------|
| Qwen3.6-35B-A3B-MTP-GGUF | ~50 tokens/s | ~110 tokens/s |
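As a rough way to reason about what "working correctly" would look like, here is a toy speedup model written in Python. It is only a sketch under stated assumptions, not how llama.cpp computes anything: each of `k` drafted tokens is assumed to be accepted independently with probability `p`, and `overhead` is a hypothetical extra cost per step relative to a plain decode step. All names are illustrative.

```python
def expected_tokens_per_step(p: float, k: int) -> float:
    """Expected tokens emitted per verification step.

    Assumes k drafted tokens, each accepted independently with
    probability p; the step always yields at least one token.
    """
    if p >= 1.0:
        return float(k + 1)
    # Geometric series: 1 + p + p^2 + ... + p^k
    return (1.0 - p ** (k + 1)) / (1.0 - p)


def estimated_speedup(p: float, k: int, overhead: float) -> float:
    """Throughput relative to MTP off; values below 1.0 mean a slowdown.

    `overhead` is the assumed extra cost of one MTP step, expressed as a
    fraction of a normal decode step (hypothetical, not measured).
    """
    return expected_tokens_per_step(p, k) / (1.0 + overhead)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.8):
        print(f"p={p:.1f}: {estimated_speedup(p, k=1, overhead=0.3):.2f}x")
```

Under this model, a net slowdown is only possible when few drafted tokens are accepted or the per-step overhead is large, which is why the acceptance rate would be the first thing worth checking.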

### Why This Looks Like an MTP Issue

The runtime and hardware are clearly capable of benefiting from MTP, as demonstrated by the 35B A3B model.

With `Qwen3.6-27B-MTP-GGUF`, enabling MTP appears to:

- Provide no observable multi-token prediction speedup.
- Reduce throughput by roughly 60%.
- Behave as though MTP overhead is present, but the MTP predictions themselves are not contributing to generation.
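The percentages follow directly from the approximate figures quoted in this report (~35 to ~14 tokens/s for the 27B model, ~50 to ~110 tokens/s for the 35B A3B model). A small Python check, using only those rounded numbers:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Approximate throughput figures from this report, in tokens/s.
change_27b = percent_change(35.0, 14.0)        # MTP off -> MTP on
change_35b_a3b = percent_change(50.0, 110.0)   # MTP off -> MTP on

print(f"Qwen3.6-27B-MTP:     {change_27b:+.0f}%")
print(f"Qwen3.6-35B-A3B-MTP: {change_35b_a3b:+.0f}%")
```

So the same toggle gives roughly a 2.2x gain on one model and roughly a 60% loss on the other, on identical hardware and runtime.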

### Reproduction

1. Load `Qwen3.6-27B-MTP-GGUF`.
2. Enable MTP in Advanced Settings.
3. Generate text and measure throughput.
4. Disable MTP and repeat.
5. Observe that throughput decreases from ~35 t/s to ~14 t/s when MTP is enabled.
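To make steps 3 and 4 repeatable, throughput can be measured over several runs rather than read off a single generation. The sketch below is a generic Python timing harness, not part of any runtime: `generate` is a hypothetical callable that you would wire to whatever interface you use, and it is assumed to return the number of tokens it generated. Timing the whole call includes prompt processing, so use a short prompt and a long generation to keep that contribution small.

```python
import time
from statistics import median
from typing import Callable


def measure_tps(
    generate: Callable[[str], int],
    prompt: str,
    runs: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Median tokens/s over `runs` calls to `generate`.

    `generate` is a placeholder: it must run one generation for `prompt`
    and return the count of generated tokens.
    """
    rates = []
    for _ in range(runs):
        start = clock()
        n_tokens = generate(prompt)
        elapsed = clock() - start
        rates.append(n_tokens / elapsed)
    # The median is less sensitive to a slow first (warm-up) run.
    return median(rates)
```

Running it once with MTP enabled and once with MTP disabled, using the same prompt and generation length, gives two directly comparable numbers.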

### Question

Is MTP currently expected to work with `Qwen3.6-27B-MTP-GGUF` under Vulkan and llama.cpp v2.20.1, or are there known limitations/issues affecting this model?
