[GPU] fix dGPU func testcases in smoke_ScaledAttnDynamic4D_GPU and smoke_MatMulCompressedWeights_extra_multiply#35343
Closed
yuanxion wants to merge 8 commits into
Conversation
Contributor
Pull request overview
Fixes two Intel dGPU functional test failures by (1) treating scalar/rank-1 SDPA attention-mask inputs as placeholders (not real runtime masks) in OCL SDPA implementations, and (2) preventing unsafe MatMul→FullyConnected conversion for parameter-based compressed weights that still carry non-trivial batch dimensions.
Changes:
- Add shared SDPA helper to detect whether the attention-mask input is a real runtime mask, and reuse it across SDPA OCL implementations.
- Update the MatMul→FC transformation to block conversion for parameter-based compressed weights whose aligned batch dimensions are not all 1.
- Add/extend unit tests for SDPA placeholder-mask behavior and MatMul→FC “no-convert” scenario.
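The MatMul→FC guard described in the list above can be illustrated with a minimal sketch. This is not the code from `convert_matmul_to_fc.cpp`: the `Shape` alias and the `blocks_fc_conversion` predicate are hypothetical names, and the sketch assumes that "batch dimensions" means every dimension of the aligned weights shape except the trailing two.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a static tensor shape; the plugin uses its own shape types.
using Shape = std::vector<std::size_t>;

// Hypothetical predicate (illustration only, not the plugin's actual check).
// Returns true when MatMul -> FullyConnected conversion should be blocked:
// the compressed weights come from a Parameter and at least one batch
// dimension of the aligned weights shape differs from 1.
// Assumption: batch dimensions are all dimensions except the last two.
inline bool blocks_fc_conversion(bool weights_from_parameter,
                                 const Shape& aligned_weights_shape) {
    // Weights of rank <= 2 have no batch dimensions, so nothing blocks conversion.
    if (!weights_from_parameter || aligned_weights_shape.size() <= 2) {
        return false;
    }
    const auto batch_end = aligned_weights_shape.end() - 2;
    return std::any_of(aligned_weights_shape.begin(), batch_end,
                       [](std::size_t dim) { return dim != 1; });
}
```

Under these assumptions, an aligned shape such as `{1, 16, 32}` would still allow conversion, while `{4, 16, 32}` would not when the weights are parameter-based.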
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/plugins/intel_gpu/tests/unit/transformations/convert_matmul_to_fc_test.cpp | Adds a regression unit test ensuring MatMul→FC does not occur for parameter-based compressed weights with per-batch dimensions. |
| src/plugins/intel_gpu/tests/unit/test_cases/sdpa_gpu_test.cpp | Adds a unit test asserting scalar placeholder mask behaves like “no runtime mask” in SDPA OCL path. |
| src/plugins/intel_gpu/src/plugin/transformations/convert_matmul_to_fc.cpp | Extends aligned-shape check to detect parameter-based compressed weights and block unsafe FC conversion. |
| src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_utils.hpp | Introduces has_runtime_attn_mask_input() helper to filter out scalar/1D placeholder masks. |
| src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_ref.cpp | Uses the helper to set JIT constants and to skip binding placeholder mask inputs. |
| src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_gen_opt.cpp | Uses the helper for JIT config and argument binding in the optimized SDPA generator. |
| src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_gen_micro.cpp | Uses the helper to control mask-related JIT constants/args for the micro-kernel generator. |
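
The placeholder-mask rule that the SDPA rows refer to can be sketched as follows. This is an illustration, not the body of `has_runtime_attn_mask_input()`: the `Shape` alias and the `is_runtime_mask_sketch` function are hypothetical, and the sketch assumes that a placeholder is any mask input of rank 0 (scalar) or rank 1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor shape; the plugin uses its own layout types.
using Shape = std::vector<std::size_t>;

// Hypothetical helper (illustration only).
// Returns true only when a mask input is present and its rank is at least 2.
// Assumption: scalar and rank-1 masks are placeholders, not real runtime masks.
inline bool is_runtime_mask_sketch(bool mask_input_present, const Shape& mask_shape) {
    if (!mask_input_present) {
        return false;
    }
    return mask_shape.size() >= 2;
}
```

A caller could use one such predicate both when setting mask-related JIT constants and when deciding whether to bind the mask as a kernel argument, so the two decisions cannot disagree.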
…efs func test cases Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
…ession.Inference func test cases Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
Signed-off-by: yuan.xiong <yuan.xiong@intel.com>
a68d87a to 996a20e
Contributor
Author
Details
Fixes two Intel dGPU functional test cases: smoke_ScaledAttnDynamic4D_GPU and smoke_MatMulCompressedWeights_extra_multiply.
Description of the issue
Symptom
Root cause
How to fix it
The code and line that caused this issue
openvino/src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_gen_opt.cpp
Lines 126 to 127 in 6c7a684
openvino/src/plugins/intel_gpu/src/plugin/transformations/convert_matmul_to_fc.cpp
Lines 122 to 124 in 6c7a684
Reproduction step and snapshot
./ov_gpu_func_tests --device_suffix=1 --gtest_filter='smoke_ScaledAttnDynamic4D_GPU/ScaledAttnLayerGPUTest.CompareWithRefs/netPRC=f16_IS=[?.5.?.128]_[?.5.?.128]_[?.5.?.32]_[?.1.?.?]_TS=(2.5.100.128)_(2.5.1.128)_(2.5.387.128)_(2.5.100.128)_(2.5.1.128)_(2.5.387.128)_(2.5.100.32)_(2.5.1.32)_(2.5.387.32)_(1.1.100.100)_(1.1.1.1)_(2.1.387.387)_is_causal=0_has_attn=0_is_attn_const=1_has_scale=1_is_scale_const=1_with_transpose0_has_sink=0_'
./ov_gpu_func_tests --device_suffix=1 --gtest_filter='smoke_MatMulCompressedWeights_extra_multiply/MatmulWeightsDecompression.Inference/data_shape=[]_[1.4.16]__weights_shape=[16,32]_group_size=2_weights_precision=u8_activations_precision=f32_transpose_weights=0_decompression_subtract=0_reshape_on_decompression=0_extra_multiply=1_per_tensor_zp=0_param_weights=1_dyn_quan_group_size=0'
Problematic graph
N/A
Checklist
Tickets:
AI Assistance: