[CPU] Enable BF16 dynamic quantization path for compressed FullyConnected by liubo-intel · Pull Request #35726 · openvinotoolkit/openvino

liubo-intel · 2026-05-08T05:39:27Z

Details:

Extends the CPU plugin's weight-decompression FC path so that BF16 activations can go through the oneDNN dynamic-quantization kernel, in addition to F32.

oneDNN fork PR: openvinotoolkit/oneDNN#310
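For readers less familiar with the term, the dynamic-quantization path mentioned in the details can be pictured roughly as follows. This is a minimal sketch, not the oneDNN kernel: it assumes symmetric per-group int8 quantization of the activations at run time, weights that are already int8 with a single scale, and int32 accumulation. The names `quantize_group`, `dyn_quant_dot`, `wei_scale` and `group_size` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: symmetric int8 quantization of one activation group.
// Returns the scale such that x[i] is approximately scale * q[i].
float quantize_group(const float* x, std::size_t n, std::int8_t* q) {
    float max_abs = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        max_abs = std::max(max_abs, std::fabs(x[i]));
    const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    for (std::size_t i = 0; i < n; ++i)
        q[i] = static_cast<std::int8_t>(std::lround(x[i] / scale));
    return scale;
}

// Hypothetical dot product: activations are quantized group by group at run
// time, multiplied with int8 weights using int32 accumulation, then rescaled.
float dyn_quant_dot(const std::vector<float>& act,
                    const std::vector<std::int8_t>& wei,
                    float wei_scale,
                    std::size_t group_size) {
    float result = 0.0f;
    std::vector<std::int8_t> q(group_size);
    for (std::size_t g = 0; g < act.size(); g += group_size) {
        const std::size_t n = std::min(group_size, act.size() - g);
        const float act_scale = quantize_group(act.data() + g, n, q.data());
        std::int32_t acc = 0;
        for (std::size_t i = 0; i < n; ++i)
            acc += static_cast<std::int32_t>(q[i]) * static_cast<std::int32_t>(wei[g + i]);
        result += static_cast<float>(acc) * act_scale * wei_scale;
    }
    return result;
}
```

With BF16 activations, a kernel of this kind would presumably read 16-bit inputs instead of `float` before quantizing; that detail is left out of the sketch.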

Tickets:

CVS-182410

Copilot

Pull request overview

Extends the Intel CPU plugin’s compressed FullyConnected (weights decompression) path to allow BF16 activations to use the oneDNN dynamic quantization implementation (previously limited to F32), and adds functional coverage for the new BF16 scenario.

Changes:

* Enable BF16 as a supported activation type for compressed FullyConnected on x86_64.
* Extend dynamic-quantization eligibility checks in the oneDNN FC primitive to accept BF16 sources (with ISA gating).
* Add a BF16-specific MatMul-weights-decompression test fixture and instantiate new dyn-quant BF16 test cases.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/x64/matmul_weights_decompression.cpp` | Adds BF16 dyn-quant test instantiation and a BF16-specific additional-config filter. |
| `src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/classes/matmul_weights_decompression.hpp` | Introduces a BF16-derived test class and a shared setup helper taking data precision. |
| `src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/classes/matmul_weights_decompression.cpp` | Refactors setup to parameterize network precision and adds BF16 test execution. |
| `src/plugins/intel_cpu/src/nodes/fullyconnected.cpp` | Enables BF16 in `getSupportedCompressedActivationsTypes()` for x86_64. |
| `src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp` | Updates dynamic-quantization gating to allow BF16 sources with additional ISA checks. |

rkazants

@liubo-intel, @yuxu42, is this needed for Qwen3.5?

liubo-intel · 2026-05-11T06:15:52Z

> @liubo-intel, @yuxu42, is it needed for Qwen3.5?

Hi @rkazants, this PR is mainly intended to let NVL benefit from the more efficient avx512_core_vnni instruction set on the BF16 dynamic-quant path. As far as I know, it is outside the scope of the Qwen3.5 enablement effort.

maxnick · 2026-05-18T15:47:50Z

@liubo-intel, could you please create a dedicated oneDNN fork PR to facilitate the review process?

* Gate BF16 dyn-quant entry on AMX-capable HW at two layers (node-level `getSupportedCompressedActivationsTypes` and primitive-level `useDynamicQuantizationImpl`), since AMX BF16 TMUL handles long prompts (prefill) more efficiently than VNNI int8 dyn-quant.
* Drive the BF16 dyn-quant test through the `inference_precision` hint on an f32 IR; remove the `MatmulWeightsDecompressionBF16` subclass and `decompression_precisions_bf16`.
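The two-layer gate described in the note above could look roughly like the sketch below. It is illustrative only and does not reproduce the plugin's code. The polarity of the gate is an assumption drawn from the stated rationale: BF16 dyn-quant is taken only when AVX512-VNNI is available and AMX is not. `CpuCaps`, `bf16_dyn_quant_allowed`, `supported_compressed_activations` and `use_dynamic_quantization` are hypothetical names, not the real signatures of `getSupportedCompressedActivationsTypes` or `useDynamicQuantizationImpl`.

```cpp
#include <vector>

enum class Precision { F32, BF16 };

// Hypothetical capability flags; the real plugin queries the CPU ISA itself.
struct CpuCaps {
    bool avx512_vnni;
    bool amx_bf16;
};

// Assumed policy shared by both layers: BF16 dyn-quant is only worthwhile
// when VNNI is present and AMX is absent (AMX is assumed to win on prefill).
bool bf16_dyn_quant_allowed(const CpuCaps& caps) {
    return caps.avx512_vnni && !caps.amx_bf16;
}

// Layer 1 (node level): activation types accepted by the compressed FC.
std::vector<Precision> supported_compressed_activations(const CpuCaps& caps) {
    std::vector<Precision> types{Precision::F32};
    if (bf16_dyn_quant_allowed(caps))
        types.push_back(Precision::BF16);
    return types;
}

// Layer 2 (primitive level): whether the dyn-quant implementation is chosen
// for a given source precision.
bool use_dynamic_quantization(Precision src, const CpuCaps& caps) {
    if (src == Precision::F32)
        return true;  // assumption: F32 path unchanged, other checks omitted
    return bf16_dyn_quant_allowed(caps);
}
```

Keeping the policy in one helper means the node-level and primitive-level checks cannot drift apart, which is one plausible reason to gate at both layers with the same condition.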

liubo-intel · 2026-05-19T08:05:19Z

> @liubo-intel, could you please create a dedicated oneDNN fork PR to facilitate the review process?

oneDNN fork PR: openvinotoolkit/oneDNN#310

maxnick

In general, LGTM. Please apply the comment in the corresponding oneDNN PR.

liubo-intel marked this pull request as ready for review May 8, 2026 05:39

liubo-intel requested review from a team as code owners May 8, 2026 05:39

github-actions Bot added the category: CPU OpenVINO CPU plugin label May 8, 2026

yuxu42 requested a review from Copilot May 8, 2026 05:58

Copilot started reviewing on behalf of yuxu42 May 8, 2026 05:59

Copilot AI reviewed May 8, 2026

Comment thread src/plugins/intel_cpu/src/nodes/fullyconnected.cpp

Comment thread src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp Outdated

rkazants reviewed May 11, 2026

yuxu42 assigned maxnick May 11, 2026

yuxu42 requested a review from maxnick May 11, 2026 06:34

maxnick added this to the 2026.3 milestone May 12, 2026

maxnick requested changes May 18, 2026

liubo-intel added 3 commits May 18, 2026 22:48

Enable BF16 dynamic quantization path for compressed FC

89db848

Apply suggestions from code review

0e3f79a

liubo-intel force-pushed the liubo/dynamic_quant_bf16_support branch from 54f8ed1 to ebdc717 Compare May 19, 2026 08:00

maxnick reviewed May 21, 2026

Comment thread ...ns/intel_cpu/tests/functional/custom/subgraph_tests/src/x64/matmul_weights_decompression.cpp Outdated

Apply suggestions from code review

b2ce2d1

yuxu42 requested a review from maxnick May 26, 2026 05:28

onednn commit message reword

88e1736

maxnick approved these changes May 27, 2026

maxnick enabled auto-merge May 27, 2026 17:02

maxnick added this pull request to the merge queue May 27, 2026

Merged via the queue into openvinotoolkit:master with commit bf0228a May 27, 2026
195 checks passed

maxnick merged 5 commits into openvinotoolkit:master from liubo-intel:liubo/dynamic_quant_bf16_support

4 participants
