Skip to content

ggml: uniformize im2col dst_type for all conv ops#23660

Open
Juste-Leo2 wants to merge 1 commit into
ggml-org:masterfrom
Juste-Leo2:FixConv
Open

ggml: uniformize im2col dst_type for all conv ops#23660
Juste-Leo2 wants to merge 1 commit into
ggml-org:masterfrom
Juste-Leo2:FixConv

Conversation

@Juste-Leo2
Copy link
Copy Markdown
Contributor

Overview

This PR adjusts the im2col output type in all convolution operations that use it. Instead of always forcing F16, we keep F16 only when the weight is F16, and use F32 for everything else (BF16, F32, quantized types).

Additional information

This change was discovered while working on the Zaya model, which uses ggml_conv_1d_grouped (#22833). This operation goes through im2col, and the old code forced F16, which caused precision loss with BF16 weights or even crashes with quantized weights. The ggml_conv2d and ggml_conv3d operations had a similar issue, as they passed the weight type directly to im2col without checking.
Related to PR #23112 (Zaya).

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES
    • The proposed changes here indirectly follow from previous Zaya-related PRs
    • AI was used to translate this PR into English and improve readability

@Juste-Leo2 Juste-Leo2 marked this pull request as ready for review May 25, 2026 10:31
@Juste-Leo2 Juste-Leo2 requested a review from ggerganov as a code owner May 25, 2026 10:31
@github-actions github-actions Bot added the ggml changes relating to the ggml tensor library for machine learning label May 25, 2026
@Juste-Leo2
Copy link
Copy Markdown
Contributor Author

CC @pwilkin in case you'd like to take a look :)

Copy link
Copy Markdown
Member

@CISC CISC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On what backend did it crash on quantized weights?

@Juste-Leo2
Copy link
Copy Markdown
Contributor Author

Juste-Leo2 commented May 25, 2026

On what backend did it crash on quantized weights?

It crashed on the CPU and Vulkan backends (CUDA was unaffected). I had previously done a temporary workaround (here), but forcing F16 ended up breaking BF16.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants