Add MXFP4 quantization type for dense models (ftype 41) #2176

Open
trigun6187 wants to merge 1 commit into huggingface:main from trigun6187:add-mxfp4-dense-type
Conversation

@trigun6187 commented May 18, 2026

Adds MXFP4=41 to GGMLFileQuantizationType for dense MXFP4 GGUF models.


Note

Low Risk
Low risk: adds a new GGUF file quantization enum value and includes it in the ordering list, with minimal impact beyond recognizing the new label.

Overview
Adds support for dense MXFP4 GGUF models by introducing GGMLFileQuantizationType.MXFP4 = 41 and including it in GGUF_QUANT_ORDER, ensuring quant label parsing and quant ordering logic recognize the new ftype.
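As a rough illustration of why the enum entry matters for label parsing, here is a minimal sketch that derives quant labels from enum member names and looks for them in a file name. The names `FileQuantSketch` and `parseQuantLabelSketch` are hypothetical and are not the library's API; only the two numeric values (38 and 41) come from this PR, and the real enum has many more members.

```typescript
// Hypothetical, trimmed stand-in for GGMLFileQuantizationType.
// Only the two values named in this PR are included.
enum FileQuantSketch {
  MXFP4_MOE = 38,
  MXFP4 = 41, // the entry this PR adds
}

// Numeric enums also expose reverse-mapping keys ("38", "41"); drop those.
// Longer names go first so "MXFP4_MOE" is tried before "MXFP4".
const quantNames: string[] = Object.keys(FileQuantSketch)
  .filter((key) => Number.isNaN(Number(key)))
  .sort((a, b) => b.length - a.length);

// A label must not be glued to other alphanumerics on either side.
const quantRe = new RegExp(
  `(?:^|[^A-Za-z0-9])(${quantNames.join("|")})(?![A-Za-z0-9_])`,
  "i",
);

// Returns the upper-cased quant label found in a file name, if any.
function parseQuantLabelSketch(fileName: string): string | undefined {
  const match = quantRe.exec(fileName);
  return match ? match[1].toUpperCase() : undefined;
}
```

Under this sketch, a label is recognized only when the enum has a member of that name, so a file tagged `MXFP4` goes unrecognized until the member exists.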

Reviewed by Cursor Bugbot for commit 7b1c5a9.

GGMLFileQuantizationType currently has MXFP4_MOE = 38 for MoE models,
but no entry for dense MXFP4 models (ftype 41 = LLAMA_FTYPE_MOSTLY_MXFP4
in llama.cpp). The per-tensor dtype enum GGMLQuantizationType already
has MXFP4 = 39, but HF uses the file-level type for variant detection.

This adds MXFP4 = 41 to GGMLFileQuantizationType and places it in
GGUF_QUANT_ORDER alongside NVFP4 (both are 4-bit FP4 formats).
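To show what placement in an ordering list affects, the sketch below sorts quant labels by their position in a small order array. `QUANT_ORDER_SKETCH` and `sortByQuantOrderSketch` are hypothetical names, the list uses string labels rather than enum members, and the relative order of the three entries is an assumption for illustration only. The PR states only that MXFP4 sits alongside NVFP4.

```typescript
// Hypothetical, trimmed stand-in for GGUF_QUANT_ORDER.
// The relative order shown here is assumed, not taken from the repository.
const QUANT_ORDER_SKETCH: readonly string[] = ["NVFP4", "MXFP4", "MXFP4_MOE"];

// Position of a label in the order list; unknown labels sort last.
function orderRank(label: string): number {
  const index = QUANT_ORDER_SKETCH.indexOf(label);
  return index === -1 ? QUANT_ORDER_SKETCH.length : index;
}

// Returns a new array sorted by the order list, leaving the input untouched.
function sortByQuantOrderSketch(labels: readonly string[]): string[] {
  return [...labels].sort((a, b) => orderRank(a) - orderRank(b));
}
```

In this sketch, a label missing from the order list falls to the end, which is the kind of gap that adding the new entry to the list avoids.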

@julien-c (Member)

this is for @ngxson and @mishig25 :)