UPSTREAM PR #21753: vulkan: Support asymmetric FA in coopmat2 path #1344

Open
loci-dev wants to merge 1 commit into main from loci/pr-21753-fa_cm2_mixed

@loci-dev

Note

Source pull request: ggml-org/llama.cpp#21753

Overview

There has been some recent interest in, and experimentation with, mixed quantization types for FA. I had originally designed the cm2 FA shader with this in mind (because I didn't realize it wasn't supported at the time!); this change adds the missing pieces and enables it.

This change also adds support for Q1_0, since people have been trying that out (seems crazy, but who knows).

We should be able to do similar things in the coopmat1/scalar path, but there's another change open against the scalar path and I don't want to conflict with it.

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES, I used Cursor (composer-2-fast) for most of this change.

@loci-review

loci-review Bot commented Apr 11, 2026

No meaningful performance changes were detected across 126762 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libllama.so, build.bin.libmtmd.so, build.bin.llama-bench, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.libggml-base.so, build.bin.llama-tokenize, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli.

💬 Questions? Tag @loci-dev

@loci-dev loci-dev force-pushed the main branch 9 times, most recently from d101579 to 63ab8d1 Compare April 18, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 7638ab4 to f1b46d5 Compare April 20, 2026 02:19