Qwen 3.5 MoE: Add Metal source transformations by manuelcandales · Pull Request #18879 · pytorch/executorch

manuelcandales · 2026-04-14T16:25:39Z

Adds metal_source_transformations.py with module replacements for Metal:

- FusedMoEExperts -> MetalMoEExperts (two metal::gather_qmv calls with SiLU gating, replacing torch.ops.triton.fused_moe)
- GatedDeltaNet -> metal::gated_delta_rule custom op (replaces both the T=1 native path and T>1 Triton kernel)
- FullAttention -> removes turboquant codepath, keeps standard SDPA
- SparseMoE -> removes .float() cast on expert_weights
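
To make the "two gather calls with SiLU gating" structure concrete, here is a minimal pure-Python sketch of a gated MoE expert forward pass for a single token. It is not the code from this PR: `gather_matvec`, `moe_experts_forward`, and the fused gate/up weight layout are hypothetical stand-ins chosen for illustration, and the real MetalMoEExperts operates on quantized weights through metal::gather_qmv.

```python
import math


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(w, x):
    """Dense matrix-vector product; w is a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def gather_matvec(weights, expert_ids, inputs):
    """Hypothetical stand-in for a gather-style matvec.

    For each selected expert id, multiply that expert's weight matrix by
    the matching input vector.
    """
    return [matvec(weights[e], x) for e, x in zip(expert_ids, inputs)]


def moe_experts_forward(x, expert_ids, routing_weights, w_gate_up, w_down):
    """Gated expert MLP for one token (assumed layout, not the PR's).

    w_gate_up[e] has shape (2 * I, H): gate rows first, then up rows.
    w_down[e] has shape (H, I).
    """
    # Call 1: fused gate and up projections for every selected expert.
    gate_up = gather_matvec(w_gate_up, expert_ids, [x] * len(expert_ids))

    # SiLU gating: silu(gate) * up, element-wise per expert.
    hidden = []
    for gu in gate_up:
        half = len(gu) // 2
        gate, up = gu[:half], gu[half:]
        hidden.append([silu(g) * u for g, u in zip(gate, up)])

    # Call 2: down projection, one gated hidden vector per expert.
    down = gather_matvec(w_down, expert_ids, hidden)

    # Combine expert outputs using the routing weights.
    out = [0.0] * len(down[0])
    for weight, y in zip(routing_weights, down):
        for i, v in enumerate(y):
            out[i] += weight * v
    return out
```

With two toy experts (hidden size 2, intermediate size 1), `moe_experts_forward([1.0, 2.0], [0, 1], [0.5, 0.5], w_gate_up, w_down)` mixes the two expert outputs in proportion to their routing weights.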

Also includes quantize_experts_metal(), which quantizes expert weights to the
MLX affine INT4 format (unsigned 4-bit values with a scale and bias per group)
so that they are compatible with the Metal gather_qmv kernel.
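
As a rough illustration of group-wise affine INT4 quantization, the sketch below maps each group of weights to unsigned values in [0, 15] plus one scale and one bias, so that `w ≈ q * scale + bias`. It assumes the simplest min/max scheme and a group size of 32; the function names are hypothetical, and quantize_experts_metal() may differ in group size, rounding, and how the 4-bit values are packed.

```python
def quantize_group_affine_int4(group):
    """Quantize one group of floats to unsigned 4-bit codes (0..15).

    Returns (codes, scale, bias) such that w is approximately
    code * scale + bias. Assumes a plain min/max affine mapping.
    """
    lo, hi = min(group), max(group)
    scale = (hi - lo) / 15.0
    if scale == 0.0:
        # Constant group: any non-zero scale works, all codes become 0.
        scale = 1.0
    bias = lo
    codes = [min(15, max(0, round((w - bias) / scale))) for w in group]
    return codes, scale, bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from codes, scale, and bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row, group_size=32):
    """Quantize a weight row group by group (group_size is an assumption)."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group_affine_int4(row[i:i + group_size])
        for i in range(0, len(row), group_size)
    ]
```

Under this scheme the reconstruction error of each weight is bounded by half a quantization step, that is `scale / 2`.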

Authored with Claude.

[ghstack-poisoned]

manuelcandales · 2026-04-14T16:25:40Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2026-04-14T16:25:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18879

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 166 Pending

As of commit 187e4f5 with merge base d408a10 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]

github-actions · 2026-04-21T18:05:52Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


manuelcandales added 6 commits April 14, 2026 12:25

- Update (a3a42e4) [ghstack-poisoned]
- Update (1c965c6) [ghstack-poisoned]
- Update (1be53ab) [ghstack-poisoned]
- Update (47cbe76) [ghstack-poisoned]
- Update (805a09d) [ghstack-poisoned]
- Update (5306c5a) [ghstack-poisoned]

manuelcandales requested a review from lucylq as a code owner April 14, 2026 16:25

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 14, 2026

manuelcandales added 9 commits April 14, 2026 18:23

- Update (958712e) [ghstack-poisoned]
- Update (eba74c4) [ghstack-poisoned]
- Update (c222005) [ghstack-poisoned]
- Update (e7a7acc) [ghstack-poisoned]
- Update (5530242) [ghstack-poisoned]
- Update (59f88db) [ghstack-poisoned]
- Update (1fbb94f) [ghstack-poisoned]
- Update (60ca500) [ghstack-poisoned]
- Update (d80da37) [ghstack-poisoned]

manuelcandales requested review from mergennachin and metascroy and removed request for lucylq April 15, 2026 15:14

manuelcandales mentioned this pull request Apr 16, 2026

Qwen 3.5 MoE Metal: Use max-sized prefill example for dynamic inputs #18956

Merged

metascroy approved these changes Apr 20, 2026

View reviewed changes

manuelcandales added 20 commits April 20, 2026 15:01

- Update (f4f616e) [ghstack-poisoned]
- Update (b8e1201) [ghstack-poisoned]
- Update (9ce837a) [ghstack-poisoned]
- Update (248115a) [ghstack-poisoned]
- Update (ee865c3) [ghstack-poisoned]
- Update (36d45ef) [ghstack-poisoned]
- Update (9000488) [ghstack-poisoned]
- Update (a060d19) [ghstack-poisoned]
- Update (01c3ce5) [ghstack-poisoned]
- Update (0c1a88b) [ghstack-poisoned]
- Update (2c56804) [ghstack-poisoned]
- Update (933122c) [ghstack-poisoned]
- Update (9def0ed) [ghstack-poisoned]
- Update (01ecf6a) [ghstack-poisoned]
- Update (1766789) [ghstack-poisoned]
- Update (7423226) [ghstack-poisoned]
- Update (4b791ea) [ghstack-poisoned]
- Update (ff92256) [ghstack-poisoned]
- Update (f8ebcfb) [ghstack-poisoned]
- Update (4cf31c8) [ghstack-poisoned]

Base automatically changed from gh/manuelcandales/173/head to main April 21, 2026 18:00

manuelcandales requested review from kirklandsign, larryliu0820 and shoumikhin as code owners April 21, 2026 18:00

- Update (187e4f5) [ghstack-poisoned]

manuelcandales merged commit 9600f63 into main Apr 21, 2026
166 of 190 checks passed

manuelcandales deleted the gh/manuelcandales/174/head branch April 21, 2026 18:12

manuelcandales merged 40 commits into main from gh/manuelcandales/174/head


2 participants
