Extend moe_3gemm to all oneDNN aware GPUs #35335
Open
peterchen-intel wants to merge 23 commits into openvinotoolkit:master from peterchen-intel:oom/fixing
+22 −3
Commits
a7fa364 intel_gpu: fix Qwen3 MoE GEMM3_SWIGLU on MTL-class (non-systolic) iGPU (peterchen-intel)
2bcc00a intel_gpu: scope MOE3GemmFusedCompressed queue fix to in_order only (peterchen-intel)
d442d39 intel_gpu: revert skip_transfer_on_igpu extension to MTL (peterchen-intel)
4a0a504 intel_gpu: revert moe_gather.hpp rank-2 fix (not needed in final path) (peterchen-intel)
e1a7c83 intel_gpu: fix oneDNN engine init for MoE on non-systolic GPU (peterchen-intel)
8a030fd Merge branch 'master' into oom/fixing (peterchen-intel)
1498dd1 intel_gpu: fix unnecessary tmp_out buffer per-layer in paged_attention (peterchen-intel)
2004557 Merge branch 'master' into oom/fixing (peterchen-intel)
c9eb5c6 Merge branch 'master' into oom/fixing (peterchen-intel)
046c575 Merge branch 'master' into oom/fixing (peterchen-intel)
3ebb016 Merge branch 'master' into oom/fixing (peterchen-intel)
990c0bc Roll back the mistaken change (peterchen-intel)
39ba7fc Merge branch 'master' into oom/fixing (peterchen-intel)
2ee0c08 iintel_gpu: enable compressed MoE fusion chain on non-systolic devices (peterchen-intel)
1284765 Merge branch 'master' into oom/fixing (peterchen-intel)
f8d0cc3 Support oneDNN known gpu_arch only (peterchen-intel)
3875448 device_info.arch >= cldnn::gpu_arch::xe_lp (peterchen-intel)
ae72a35 Merge branch 'master' into oom/fixing (peterchen-intel)
fa26fd1 For ENABLE_ONEDNN_FOR_GPU only (peterchen-intel)
4ce2b9a Remove duplicate (peterchen-intel)
dfd234c Remove duplicate (peterchen-intel)
8edab88 Define disable_moe_opt only if ENABLE_ONEDNN_FOR_GPU (peterchen-intel)
fdfdc18 Merge branch 'master' into oom/fixing (peterchen-intel)