gguf: parser for percentage-mixed GGUF filenames by mishig25 · Pull Request #2170 · huggingface/huggingface.js

mishig25 · 2026-05-13T08:41:54Z

Adds a parser for percentage-mixed GGUF filenames — files that ship a single artifact with tensors quantized to multiple ggml types and encode the per-type byte share in the name as <pct><quant> tokens.

Example: DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf (antirez/deepseek-v4-gguf).

API

parseGGUFQuantMix("DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf")
// {
//   components: [{pct:55,quant:'IQ2_XXS'}, {pct:34,quant:'Q2_K'}, {pct:7,quant:'Q8_0'}, {pct:3,quant:'F16'}],
//   dominant:   {pct:55, quant:'IQ2_XXS'},
// }

Also exports GGUF_QUANT_MIX_COMPONENT_RE and GGUFQuantMix / GGUFQuantMixComponent types.
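
For orientation, here is a rough sketch of what the exported result types might look like, inferred only from the example output above. The field names `components`, `dominant`, `pct` and `quant` come from that example; typing `quant` as a plain `string` and the `describeMix` helper are assumptions for illustration and are not part of this PR.

```typescript
// Assumed shapes, inferred from the example output in the PR description.
// The real types may use a narrower type than `string` for `quant`.
interface GGUFQuantMixComponent {
	pct: number; // byte share in percent, e.g. 55
	quant: string; // ggml type name, e.g. "IQ2_XXS"
}

interface GGUFQuantMix {
	components: GGUFQuantMixComponent[]; // in filename order
	dominant: GGUFQuantMixComponent; // component with the largest pct
}

// Hypothetical helper (not part of the PR): render a mix for display.
function describeMix(mix: GGUFQuantMix): string {
	return mix.components.map((c) => `${c.pct}% ${c.quant}`).join(" + ");
}

const exampleMix: GGUFQuantMix = {
	components: [
		{ pct: 55, quant: "IQ2_XXS" },
		{ pct: 34, quant: "Q2_K" },
		{ pct: 7, quant: "Q8_0" },
		{ pct: 3, quant: "F16" },
	],
	dominant: { pct: 55, quant: "IQ2_XXS" },
};

const described = describeMix(exampleMix);
```

A consumer that only needs a single label can read `dominant.quant`; one that wants the full recipe can iterate `components`.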

`parseGGUFQuantLabel` delegates to it

When the filename looks like a mix, parseGGUFQuantLabel now returns the dominant component instead of the file-order last match (which today surfaces the smallest tail — F16 in the example above). Plain single-quant filenames are unchanged.

parseGGUFQuantLabel("DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf") // → "IQ2_XXS"
parseGGUFQuantLabel("Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf")               // → "Q4_K_M"  (unchanged)
parseGGUFQuantLabel("Qwen3-4B-UD-Q2_K_XL.gguf")                             // → "UD-Q2_K_XL"  (unchanged)
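
The delegation described above can be sketched as a thin wrapper: try the mix parser first, and fall back to the single-quant path when it returns undefined. This is a minimal illustration, not the PR's code. `labelWithMixFallback`, `pickDominant` and the two stub parsers are hypothetical names, and the tie-breaking rule (earliest component wins) is an assumption.

```typescript
type MixComponent = { pct: number; quant: string };
type Mix = { components: MixComponent[]; dominant: MixComponent };

// Pick the component with the largest pct.
// Assumption: on a tie, the earliest component in filename order wins.
function pickDominant(components: MixComponent[]): MixComponent | undefined {
	let best: MixComponent | undefined;
	for (const c of components) {
		if (best === undefined || c.pct > best.pct) best = c;
	}
	return best;
}

// Hypothetical wrapper: the mix result takes precedence, otherwise the
// existing single-quant behaviour is used unchanged.
function labelWithMixFallback(
	filename: string,
	parseMix: (f: string) => Mix | undefined,
	parseSingle: (f: string) => string | undefined
): string | undefined {
	const mix = parseMix(filename);
	return mix ? mix.dominant.quant : parseSingle(filename);
}

// Stub parsers, for demonstration only.
const stubMix = (f: string): Mix | undefined => {
	if (!f.includes("55IQ2_XXS")) return undefined;
	const components: MixComponent[] = [
		{ pct: 55, quant: "IQ2_XXS" },
		{ pct: 34, quant: "Q2_K" },
		{ pct: 7, quant: "Q8_0" },
		{ pct: 3, quant: "F16" },
	];
	const dominant = pickDominant(components);
	return dominant ? { components, dominant } : undefined;
};
const stubSingle = (f: string): string | undefined => /(Q\d_K_[A-Z]+)\.gguf$/.exec(f)?.[1];

const mixedLabel = labelWithMixFallback("DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf", stubMix, stubSingle);
const plainLabel = labelWithMixFallback("Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf", stubMix, stubSingle);
```

Because the mix parser returns undefined for plain names, the fallback branch is the only one taken for existing filenames, which is why their labels stay unchanged.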

Notes

Returns undefined for plain single-quant filenames so it composes cleanly with the existing fallback.
Delimited lookbehind / lookahead avoids false positives on size labels like 7B.
Quant alternation is length-sorted so IQ2_XXS wins over the prefix IQ2_XS.
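
The notes above can be illustrated with a small regex sketch. The quant list here is a hand-picked subset, and the exact delimiter set (start of name or `-` before a token, `-`, `.` or end of name after it) is an assumption; the PR's actual GGUF_QUANT_MIX_COMPONENT_RE may differ.

```typescript
// Illustrative subset of ggml tensor type names (assumption: the real
// list is derived from an enum and is much longer).
const QUANTS = ["F16", "Q2_K", "Q8_0", "IQ2_XS", "IQ2_XXS"];

// Longest names first, so the longest candidate is tried first.
const ALTERNATION = [...QUANTS].sort((a, b) => b.length - a.length).join("|");

function findComponents(filename: string): { pct: number; quant: string }[] {
	// Strip any path prefix before matching.
	const base = filename.split("/").pop() ?? filename;
	// Lookbehind and lookahead require a delimiter on both sides of a token,
	// so "7B" or "8B" (no quant name after the digits) never match.
	const re = new RegExp(`(?<=^|-)(\\d{1,3})(${ALTERNATION})(?=$|[-.])`, "g");
	const found: { pct: number; quant: string }[] = [];
	let m: RegExpExecArray | null;
	while ((m = re.exec(base)) !== null) {
		found.push({ pct: Number(m[1]), quant: m[2] as string });
	}
	return found;
}

const mixTokens = findComponents("some/dir/DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf");
const plainTokens = findComponents("Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf");
```

In this sketch a plain name yields an empty list, which a caller could map to undefined to match the behaviour described in the first note.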

Note

Low Risk
Low risk: adds a new filename parser and adjusts parseGGUFQuantLabel output only for percentage-mixed names, with extensive unit test coverage; no GGUF binary parsing or data handling logic is changed.

Overview
Adds support for percentage-mixed GGUF filenames (e.g. Model-55IQ2_XXS-34Q2_K-07Q8_0-03F16.gguf) by introducing parseGGUFQuantMix, GGUF_QUANT_MIX_COMPONENT_RE, and the GGUFQuantMix* types in @huggingface/tasks, and re-exporting them from packages/gguf.

Updates parseGGUFQuantLabel to detect these mixed names and return the dominant (largest %) quant instead of the last match, and adds comprehensive tests covering suffixes like -imatrix/-MTP, path prefixes, size-label false positives, and non-quant storage types.

Reviewed by Cursor Bugbot for commit f4a6f75. Bugbot is set up for automated code reviews on this repo.

Adds parseGGUFQuantMix for GGUF filenames that encode a per-tensor-class byte-share recipe rather than a single quant label, e.g. DeepSeek-V4-Flash-55IQ2_XXS-34Q2_K-07Q8_0-03F16-imatrix.gguf.

Returns { components, dominant } or undefined for plain single-quant filenames (so it composes cleanly with parseGGUFQuantLabel).

Also extends parseGGUFQuantLabel: when a mix is detected, return the dominant (largest-pct) component rather than the file-order last match (which would surface the smallest tail, F16 in the example above).

Originating use case: huggingface.co/antirez/deepseek-v4-gguf, where DeepSeek V4 Flash is shipped with an asymmetric MoE recipe (routed experts at IQ2_XXS / Q2_K, shared experts and attention projections at Q8_0, embed / router at F16). No single LLAMA_FTYPE_MOSTLY_* label captures the file's behavior, hence the per-quant breakdown.

Implementation notes:
- delimited lookbehind / lookahead so size labels like "7B" / "8B" aren't misread as components;
- quant alternation length-sorted so "IQ2_XXS" wins over its prefix "IQ2_XS";
- path prefix stripped before parsing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit a6e16c8.

pcuenca

Looks good based on the tests

GGMLQuantizationType includes I8/I16/I32/I64/F64 alongside the real quantization types. Those are integer/float storage types for metadata tensors, not quant methods. Including them in the mix component alternation meant a filename containing two dash-delimited tokens like "-32I32-" and "-16I16-" would be wrongly recognized as a mix recipe.

Filter them out before building GGUF_QUANT_MIX_COMPONENT_RE and add a regression test (incl. a real-world-shaped check that F64 is dropped while F32 + Q8_0 in the same filename still parse as a mix).

Reported by Cursor Bugbot on PR huggingface#2170.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
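
The fix described in this commit message can be sketched as a filter applied before the alternation is built. This is a hedged illustration only. The type list is a small subset, the delimiter set is an assumption, and the rule that a mix needs at least two components is inferred from the "two dash-delimited tokens" wording rather than confirmed by the source.

```typescript
// Illustrative subset of tensor type names; the real enum is larger.
const TYPE_NAMES = ["F32", "F16", "Q8_0", "Q2_K", "IQ2_XXS", "I8", "I16", "I32", "I64", "F64"];

// Storage-only types named in the commit message above.
const STORAGE_ONLY = new Set(["I8", "I16", "I32", "I64", "F64"]);

// Drop storage types, then sort longest-first for the alternation.
const QUANT_NAMES = TYPE_NAMES.filter((name) => !STORAGE_ONLY.has(name)).sort((a, b) => b.length - a.length);

// Hypothetical parser: returns the components, or undefined when fewer
// than two are found (assumed threshold for "looks like a mix").
function parseMixSketch(filename: string): { pct: number; quant: string }[] | undefined {
	const re = new RegExp(`(?<=^|-)(\\d{1,3})(${QUANT_NAMES.join("|")})(?=$|[-.])`, "g");
	const found: { pct: number; quant: string }[] = [];
	let m: RegExpExecArray | null;
	while ((m = re.exec(filename)) !== null) {
		found.push({ pct: Number(m[1]), quant: m[2] as string });
	}
	return found.length >= 2 ? found : undefined;
}

const storageOnly = parseMixSketch("model-32I32-16I16-x.gguf");
const withF64 = parseMixSketch("Model-60F32-30Q8_0-10F64.gguf");
```

With the filter in place, the `I32` / `I16` tokens no longer form a mix, while the `F32` and `Q8_0` tokens in the second name are still recognized and the `F64` token is ignored.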

mishig25 force-pushed the mishig/gguf-quant-mix-parser branch from 8bf5a36 to a6e16c8 May 13, 2026 08:52

mishig25 marked this pull request as ready for review May 13, 2026 09:03

mishig25 requested review from SBrandeis, Wauplin, gary149, julien-c, ngxson and pcuenca as code owners May 13, 2026 09:03

cursor Bot reviewed May 13, 2026

Comment thread packages/tasks/src/gguf.ts

pcuenca approved these changes May 13, 2026

mishig25 mentioned this pull request May 13, 2026

docs(gguf): extend Encoding slot to support percentage-mixed recipes ggml-org/ggml#1489

Draft

mishig25 marked this pull request as draft May 13, 2026 10:55

mishig25 wants to merge 2 commits into huggingface:main from mishig25:mishig/gguf-quant-mix-parser
