[ET-VK] Modernize permute op with safe indexing and unified dispatch by pytorchbot · Pull Request #18514 · pytorch/executorch

pytorchbot · 2026-03-26T02:56:21Z

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #18511 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/510/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/510/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/510/orig
Differential Revision: D98220451
@diff-train-skip-merge

Modernize the permute operator to follow current best practices, fixing an Adreno 740 driver crash caused by dynamic UBO indexing in the texture shader.

Texture shader changes:
- Replace old indexing_utils.h with indexing.glslh
- Use TextureMetadata UBOs instead of push constant sizes
- Use texture_pos_to_tensor4d_idx_simple() and related helpers
- Replace permute_dims[out_packed_dim] with safe_idx() to avoid dynamic indexing of push constant with spec-const-derived index
- Use TextureElementIndex pattern for the slow path

C++ dispatch changes:
- Merge add_permute_node() and add_permute_buffer_node() into a single unified function using graph.meta_ubo() and conditional logic
- Remove unused channel_info computation
- Move WHCNPermuteDims struct into anonymous namespace
- Guard texture path with VK_CHECK_COND(permute_ndim <= 4)

Differential Revision: [D98220451](https://our.internmc.facebook.com/intern/diff/D98220451/)
ghstack-source-id: 357844381
Pull Request resolved: #18511
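
To illustrate the "single unified function ... and conditional logic" idea together with the `permute_ndim <= 4` guard, here is a minimal standalone C++ sketch. It is not the code from this PR: `StorageKind`, `PermuteDispatch`, `plan_permute`, and both shader names are hypothetical, and a thrown exception stands in for `VK_CHECK_COND`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical storage kinds standing in for the backend's own storage types.
enum class StorageKind { Buffer, Texture };

// Hypothetical dispatch result: which shader family a permute node would use.
struct PermuteDispatch {
  std::string shader;  // illustrative name only, not a real shader in the repo
  bool uses_texture;
};

// One entry point for both storage kinds, instead of two separate functions.
// The texture branch rejects more than 4 dims, mirroring the guard described
// in the PR text; the buffer branch has no such limit in this sketch.
PermuteDispatch plan_permute(StorageKind storage, int permute_ndim) {
  assert(permute_ndim > 0);  // precondition assumed for this sketch
  if (storage == StorageKind::Texture) {
    if (permute_ndim > 4) {
      throw std::invalid_argument("texture permute supports at most 4 dims");
    }
    return {"permute_texture", true};
  }
  return {"permute_buffer", false};
}
```

Keeping the guard inside the texture branch means callers on the buffer path are unaffected by the 4-dimension limit.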

pytorch-bot · 2026-03-26T02:56:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18514

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 33 Pending

As of commit de83a9f with merge base 8d43d97:

NEW FAILURE - The following job has failed:

pull / test-openvino-linux / linux-job (gh)
RuntimeError: Command docker exec -t 0238b210d703621d1ded30d0a527c768502c8c42413138ccfeee7e477a356b5d /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-03-26T02:57:13Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

The Adreno 740 GPU driver crashes (SIGSEGV in vkCreateComputePipelines) when GLSL shaders dynamically index a UBO-backed ivec4/ivec3 with a specialization-constant-derived value. This was causing the skin segmentation model to crash during pipeline creation on Samsung S23.

Fix all instances across 18 shader files by replacing patterns like `meta.sizes[packed_dim]` with `safe_idx(meta.sizes, packed_dim)`, which uses an if/else chain that the driver resolves at pipeline creation time.

Changes:
- Add safe_idx(ivec3) overload to indexing.glslh
- Fix transfer_texture.glsl, slice.glslh, select.glslh (transfer ops)
- Fix nchw_to_int8x4_buffer.glsl, full_texture.glsl (staging/utility)
- Fix gather, split, index_tensor, where, expand, pad, repeat, arange texture shaders (1-line fixes each)
- Fix softmax.glsl, reduce.glsl, reduce2d.glsl, var_texture3d.glsl (reduction shaders with multiple fixes + added indexing.glslh include)
- Remove unused ShaderNameUtils.h include from Slice.cpp

Differential Revision: [D98220450](https://our.internmc.facebook.com/intern/diff/D98220450/)
ghstack-source-id: 357844383
Pull Request resolved: #18512
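
The if/else-chain replacement described above can be mimicked on the CPU. The following C++ sketch is an analogue only, not the contents of indexing.glslh: the `ivec3` and `ivec4` structs are hypothetical stand-ins for the GLSL types, the `safe_idx` bodies are an assumption about what such a helper might look like, and `packed_extent` is an illustrative caller.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GLSL ivec3 / ivec4 vector types.
struct ivec3 { int x, y, z; };
struct ivec4 { int x, y, z, w; };

// Select a component through an if/else chain instead of v[i], so every
// access names a fixed member and no dynamic index into the vector remains.
inline int safe_idx(const ivec4& v, int i) {
  if (i == 0) return v.x;
  if (i == 1) return v.y;
  if (i == 2) return v.z;
  return v.w;
}

// Three-component overload, analogous to the safe_idx(ivec3) addition.
inline int safe_idx(const ivec3& v, int i) {
  if (i == 0) return v.x;
  if (i == 1) return v.y;
  return v.z;
}

// Illustrative caller: read the extent along a packed dimension, the
// counterpart of replacing meta.sizes[packed_dim] in a shader.
inline int packed_extent(const ivec4& sizes, int packed_dim) {
  assert(packed_dim >= 0 && packed_dim < 4);
  return safe_idx(sizes, packed_dim);
}
```

In this sketch an out-of-range index falls through to the last component; whether the real helper behaves the same way is not stated in the PR text.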

pytorchbot requested a review from SS-JIA as a code owner March 26, 2026 02:56

meta-cla Bot added the CLA Signed label Mar 26, 2026

SS-JIA approved these changes Mar 26, 2026

SS-JIA merged commit 3eba197 into main Mar 26, 2026
160 of 164 checks passed

SS-JIA deleted the gh/SS-JIA/510/orig branch March 26, 2026 03:51

SS-JIA merged 2 commits into main from gh/SS-JIA/510/orig

2 participants
