Add MXFP4 quantization type for dense models (ftype 41) #2176
Open
trigun6187 wants to merge 1 commit into
GGMLFileQuantizationType currently has MXFP4_MOE = 38 for MoE models, but no entry for dense MXFP4 models (ftype 41 = LLAMA_FTYPE_MOSTLY_MXFP4 in llama.cpp). The per-tensor dtype enum GGMLQuantizationType already has MXFP4 = 39, but HF uses the file-level type for variant detection. This adds MXFP4 = 41 to GGMLFileQuantizationType and places it in GGUF_QUANT_ORDER alongside NVFP4 (both are 4-bit FP4 formats).
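The two numbering spaces described above can be sketched as follows. This is a minimal illustration, not the library's source: it models the enums as plain const objects, includes only the members and values named in this description, and `fileTypeLabel` is a hypothetical helper.

```typescript
// Illustrative subset of the file-level quantization types (ftype).
// Only the entries mentioned in this PR description are listed.
const FILE_QUANT_SKETCH = {
  MXFP4_MOE: 38, // existing entry, MoE models
  MXFP4: 41, // entry added by this PR, dense models (LLAMA_FTYPE_MOSTLY_MXFP4)
} as const;

// Illustrative subset of the per-tensor dtypes: a separate numbering space.
const TENSOR_QUANT_SKETCH = {
  MXFP4: 39,
} as const;

// Hypothetical helper: resolve a file-level ftype number to its label.
function fileTypeLabel(ftype: number): string | undefined {
  for (const [name, value] of Object.entries(FILE_QUANT_SKETCH)) {
    if (value === ftype) return name;
  }
  return undefined;
}
```

In this sketch, `fileTypeLabel(41)` resolves to `"MXFP4"` only because the file-level table has an entry for 41. The per-tensor value 39 is not consulted, which is why the per-tensor entry alone does not cover dense MXFP4 files.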
Adds MXFP4=41 to GGMLFileQuantizationType for dense MXFP4 GGUF models.
Note
Low Risk
Low risk: adds a new GGUF file quantization enum value and includes it in the ordering list, with minimal impact beyond recognizing the new label.
Overview
Adds support for dense MXFP4 GGUF models by introducing GGMLFileQuantizationType.MXFP4 = 41 and including it in GGUF_QUANT_ORDER, ensuring quant label parsing and quant ordering logic recognize the new ftype.
Reviewed by Cursor Bugbot for commit 7b1c5a9.
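As a rough sketch of why a label must be present in the order list to be recognized, the example below parses a quant label from a filename and sorts labels by list position. Everything here is an assumption made for illustration: the list contents and their relative order, the filenames, and the helpers `parseQuantLabel` and `compareByQuantOrder` are hypothetical and do not reproduce the library's actual GGUF_QUANT_ORDER or parsing code.

```typescript
// Hypothetical, shortened order list; the real GGUF_QUANT_ORDER is longer
// and its exact ordering is not shown in this PR description.
const QUANT_ORDER_SKETCH = ["NVFP4", "MXFP4", "MXFP4_MOE"] as const;
type QuantLabelSketch = (typeof QUANT_ORDER_SKETCH)[number];

// Hypothetical parser: try longer labels first so that "MXFP4_MOE"
// is not mistaken for its prefix "MXFP4".
function parseQuantLabel(filename: string): QuantLabelSketch | undefined {
  const upper = filename.toUpperCase();
  const longestFirst = [...QUANT_ORDER_SKETCH].sort((a, b) => b.length - a.length);
  return longestFirst.find((label) => upper.includes(label));
}

// Hypothetical comparator: order labels by their position in the list.
function compareByQuantOrder(a: QuantLabelSketch, b: QuantLabelSketch): number {
  return QUANT_ORDER_SKETCH.indexOf(a) - QUANT_ORDER_SKETCH.indexOf(b);
}
```

Under these assumptions, a label that is missing from the list cannot be parsed or ordered at all, which is the gap this change closes for dense MXFP4 files.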