ggml-hexagon: add Q4_1 support by max-krasnyansky · Pull Request #60 · max-krasnyansky/llama.cpp

max-krasnyansky · 2026-05-23T06:14:08Z

This PR adds support for the Q4_1 data type to MUL_MAT operations in the Hexagon backend.

Changes:

Added HTP_TYPE_Q4_1 and HTP_TYPE_Q8_1 mappings and their x4x2 constants.
Updated the tensor buffer sizing logic (get_alloc_size) to accommodate the QK_Q4_1x4x2 (160 bytes) and QK_Q8_1x4x2 block dimensions.
Added repack_q4_1_q4x4x2 and repack_q8_1_q8x4x2 pack/unpack functions to ggml-hexagon.cpp, modifying the buffer assignment operations (set_tensor and get_tensor) accordingly.
Introduced quantize_row_f32_q8_1x4x2 in matmul-ops.c to dynamically quantize rows of FP32 activations, taking the offset calculations into account.
Integrated the HVX vector dot kernels vec_dot_q4_1x4x2_q8_1x4x2 in 1x1, 2x1, and 2x2 variants, which use the newly added layout format with minimum offsets.
Added the Q4_1 type to the supported-types matrix in ggml_hexagon_supported_mul_mat and ggml_hexagon_supported_mul_mat_id.
Configured HMX support for HTP_TYPE_Q4_1, handling its minimum offset logic directly within the dequantization pass in hmx-matmul-ops.c.
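
For readers unfamiliar with why Q4_1 needs extra "minimum offset" handling compared to Q4_0, here is a minimal scalar reference sketch, not the HVX or HMX code from this PR. It assumes the standard ggml Q4_1 convention (32 elements per block, x = d * q + m) and the Q8_1 convention of caching s = d * sum(q). The struct and function names prefixed with ref_ are hypothetical, d and m are held as float rather than fp16 for simplicity, and the 256-element super-block size used in the size arithmetic is an assumption that happens to reproduce the 160 bytes quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK4_1 32 /* elements per Q4_1 block, as in ggml */

/* Simplified Q4_1 block. ggml stores d and m as fp16; float is used here
 * (an assumption of this sketch) to avoid fp16 conversion helpers. */
typedef struct {
    float   d;             /* scale */
    float   m;             /* minimum (offset) */
    uint8_t qs[QK4_1 / 2]; /* 4-bit quants, two per byte */
} ref_block_q4_1;

/* Simplified Q8_1 block: s caches d * sum(qs) so the offset term is cheap. */
typedef struct {
    float  d;
    float  s;
    int8_t qs[QK4_1];
} ref_block_q8_1;

/* Dequantize one block: x[i] = d * q[i] + m.
 * Low nibbles hold elements 0..15, high nibbles hold elements 16..31. */
static void ref_dequantize_q4_1(const ref_block_q4_1 *b, float *out) {
    for (int j = 0; j < QK4_1 / 2; ++j) {
        out[j]             = b->d * (float)(b->qs[j] & 0x0F) + b->m;
        out[j + QK4_1 / 2] = b->d * (float)(b->qs[j] >> 4)   + b->m;
    }
}

/* Dot product of n elements (n must be a multiple of QK4_1).
 * Per block: sum((d4*q4 + m4) * d8*q8) = d4*d8*sum(q4*q8) + m4*s8. */
static float ref_vec_dot_q4_1_q8_1(size_t n, const ref_block_q4_1 *x,
                                   const ref_block_q8_1 *y) {
    assert(n % QK4_1 == 0);
    float acc = 0.0f;
    for (size_t ib = 0; ib < n / QK4_1; ++ib) {
        int32_t isum = 0;
        for (int j = 0; j < QK4_1 / 2; ++j) {
            isum += (int32_t)(x[ib].qs[j] & 0x0F) * y[ib].qs[j];
            isum += (int32_t)(x[ib].qs[j] >> 4)   * y[ib].qs[j + QK4_1 / 2];
        }
        acc += x[ib].d * y[ib].d * (float)isum + x[ib].m * y[ib].s;
    }
    return acc;
}

/* Size arithmetic for a repacked super-block, assuming 256 elements:
 * 128 bytes of nibbles + 8 fp16 scales + 8 fp16 minimums. */
enum {
    REF_SUPER_ELEMS    = 256,
    REF_SUB_BLOCKS     = REF_SUPER_ELEMS / QK4_1,
    REF_Q4_1X4X2_BYTES = REF_SUPER_ELEMS / 2 + REF_SUB_BLOCKS * 2 + REF_SUB_BLOCKS * 2
};
```

The m4 * s8 term is the part Q4_0 does not have. It is presumably why the activation side uses Q8_1 (which carries the precomputed sum) rather than Q8_0, and why both the HVX kernels and the HMX dequantization pass need explicit offset handling.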

PR created automatically by Jules for task 12614708629588356175 started by @max-krasnyansky

Co-authored-by: max-krasnyansky <1380796+max-krasnyansky@users.noreply.github.com>

google-labs-jules · 2026-05-23T06:14:10Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

ggml-hexagon: add Q4_1 support

47b46c8

Co-authored-by: max-krasnyansky <1380796+max-krasnyansky@users.noreply.github.com>

github-actions Bot added ggml Hexagon labels May 23, 2026

google-labs-jules Bot and others added 3 commits May 23, 2026 21:56

ggml-hexagon: add Q4_1 support

e995bec

Co-authored-by: max-krasnyansky <1380796+max-krasnyansky@users.noreply.github.com>

ggml-hexagon: add Q4_1 support

b1844da

Co-authored-by: max-krasnyansky <1380796+max-krasnyansky@users.noreply.github.com>

ggml-hexagon: add Q4_1 support

88098cf

Co-authored-by: max-krasnyansky <1380796+max-krasnyansky@users.noreply.github.com>

max-krasnyansky wants to merge 4 commits into master from jules-12614708629588356175-17769d61
