Fix XNNPACK channels-last reshape for per-channel binary ops under dynamic quant #20376

Open
Hyungkeun-Park-Nota wants to merge 2 commits into pytorch:main from Hyungkeun-Park-Nota:fix/xnnpack-channels-last-binary-broadcast

@Hyungkeun-Park-Nota

Contributor

Summary

ChannelsLastTaggedReshapePass.input_to_nhwc's dynamic-quant branch calls
input_node.replace_all_uses_with(input_node_nhwc) on a shared activation. Because
the main traversal runs in topological order, this can switch a broadcasting binary
op's activation operand to NHWC after that op was already processed, while its
other operand (e.g. a per-channel constant) stays NCHW. The operands then have
incompatible shapes (e.g. [1, H, W, C] vs [1, C, 1, 1]), and XNNPACK fails at
runtime in xnn_reshape_binary_elementwise_nd with xnn_status_invalid_parameter.
Lowering succeeds; only execute() fails. Found on a ResNet-50-backbone detection
model lowered with w8a8 dynamic quantization.
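
The shape mismatch can be reproduced without XNNPACK. The sketch below is a pure-Python illustration that assumes the runtime check follows the usual right-aligned broadcasting rule (dimensions match when they are equal or one of them is 1); the helper name `broadcast_shape` and the sizes `C`, `H`, `W` are made up for this example and do not come from the PR.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or None if they are incompatible.

    Uses the right-aligned rule: two dimensions are compatible when they are
    equal or when one of them is 1.
    """
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            return None
        out.append(max(x, y))
    return tuple(reversed(out))


C, H, W = 8, 4, 6  # illustrative sizes, not taken from the PR
per_channel = (1, C, 1, 1)      # constant operand left in NCHW
activation_nchw = (1, C, H, W)  # activation before the redirect
activation_nhwc = (1, H, W, C)  # activation after being switched to NHWC

nchw_result = broadcast_shape(activation_nchw, per_channel)
nhwc_result = broadcast_shape(activation_nhwc, per_channel)
print(nchw_result)  # (1, 8, 4, 6)
print(nhwc_result)  # None: H (4) cannot broadcast against C (8)
```

With both operands in NCHW the per-channel constant lines up with the channel dimension; once only the activation is NHWC, the constant's `C` lands opposite `H` and the shapes no longer broadcast.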

Fix

After the main traversal, a re-convergence sweep visits every broadcasting binary
op (add/mul/sub/div) whose operands ended up in different memory formats and brings
them back to a single format, reusing the existing input_to_nhwc/input_to_nchw
helpers. The sweep runs after the retrace so it sees the settled graph, and it
retraces again only if something changed; it is a no-op for graphs without
diverged operands.
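
The idea of the sweep can be sketched on a toy graph. This is not the code in the PR: `Node`, `reconverge`, and the `can_be_nhwc` flag are hypothetical stand-ins (the real pass works on the exported graph and uses input_to_nhwc/input_to_nchw), and the preference for NHWC with an NCHW fallback follows the commit message.

```python
BROADCAST_BINARY_OPS = {"add", "mul", "sub", "div"}


class Node:
    """Toy node tagged with a memory format (hypothetical, not torch.fx.Node)."""

    def __init__(self, name, op, fmt, inputs=(), can_be_nhwc=True):
        self.name, self.op, self.fmt = name, op, fmt
        self.inputs = list(inputs)
        self.can_be_nhwc = can_be_nhwc  # stand-in for "NHWC is possible"


def reconverge(graph):
    """Make the operands of each diverged binary op share one format.

    Returns True if the graph was modified, so the caller knows whether a
    further retrace is needed.
    """
    changed = False
    for node in list(graph):
        if node.op not in BROADCAST_BINARY_OPS:
            continue
        if len({i.fmt for i in node.inputs}) <= 1:
            continue  # operands already agree
        # Prefer NHWC when every operand can be converted, else use NCHW.
        target = "NHWC" if all(i.can_be_nhwc for i in node.inputs) else "NCHW"
        for idx, operand in enumerate(node.inputs):
            if operand.fmt != target:
                name = f"{operand.name}_to_{target.lower()}"
                copy = Node(name, "to_copy", target, [operand])
                graph.insert(graph.index(node), copy)
                node.inputs[idx] = copy  # rewire this op only, not every user
        changed = True
    return changed


x_nhwc = Node("x_nhwc", "placeholder", "NHWC")
scale = Node("scale", "constant", "NCHW")  # per-channel constant
mul = Node("mul", "mul", "NCHW", [x_nhwc, scale])
graph = [x_nhwc, scale, mul]

first = reconverge(graph)
formats = [i.fmt for i in mul.inputs]
second = reconverge(graph)
print(first, formats, second)  # True ['NHWC', 'NHWC'] False
```

The second call returns False because the operands already agree, which mirrors the no-op behaviour for graphs without diverged operands.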

Test

test_dynamic_quant_per_channel_binary_chain_lowers_and_runs builds a graph where a
per-channel binary op is the first consumer of an input activation and the
convolution chain runs through it. It fails to execute without this fix and passes
with it.

pytest backends/xnnpack/test/passes/test_channels_last_tagged_reshape.py -k per_channel_binary_chain

@pytorch-bot

pytorch-bot Bot commented Jun 18, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20376

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Jun 18, 2026
…namic quant

ChannelsLastTaggedReshapePass.input_to_nhwc has a dynamic-quant branch that calls
input_node.replace_all_uses_with(input_node_nhwc), globally redirecting a shared
activation to its NHWC copy. The main traversal visits nodes in topological order,
so this can retroactively switch a broadcasting binary op's activation operand to
NHWC after that op was already processed, while its other operand (e.g. a
per-channel constant) stays NCHW. The two operands then have incompatible logical
shapes (e.g. [1, H, W, C] vs [1, C, 1, 1]), and XNNPACK fails at runtime in
xnn_reshape_binary_elementwise_nd with xnn_status_invalid_parameter. Lowering
succeeds; only execution fails (observed on DETR with w8a8 dynamic quantization).
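
The ordering problem can be illustrated with a toy graph. The sketch below is a simplified model, not the pass itself: `Node` and this `replace_all_uses_with` are hypothetical helpers written for the example, and the choice of `conv` as the consumer that triggers the redirect is an assumption made for illustration.

```python
class Node:
    """Toy graph node (hypothetical; not torch.fx.Node)."""

    def __init__(self, name, fmt, inputs=()):
        self.name = name
        self.fmt = fmt  # "NCHW" or "NHWC"
        self.inputs = list(inputs)


def replace_all_uses_with(graph, old, new):
    """Redirect every consumer of `old` to `new`, wherever it sits in the graph."""
    for node in graph:
        if node is not new:  # the new copy itself must keep reading `old`
            node.inputs = [new if i is old else i for i in node.inputs]


x = Node("x", "NCHW")
scale = Node("scale", "NCHW")          # per-channel constant
mul = Node("mul", "NCHW", [x, scale])  # first consumer of x
conv = Node("conv", "NHWC", [x])       # later consumer that wants NHWC
graph = [x, scale, mul, conv]          # topological order

visited = []
for node in list(graph):
    visited.append(node.name)
    if node is conv:
        # Global redirect performed while visiting conv, after mul was handled.
        x_nhwc = Node("x_nhwc", "NHWC", [x])
        graph.insert(graph.index(conv), x_nhwc)
        replace_all_uses_with(graph, x, x_nhwc)

mul_formats = [i.fmt for i in mul.inputs]
print(visited)      # ['x', 'scale', 'mul', 'conv']
print(mul_formats)  # ['NHWC', 'NCHW']
```

`mul` is visited while both operands are NCHW, yet it ends the traversal with one NHWC and one NCHW operand because the redirect reached back to a node the traversal had already passed.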

Add a re-convergence sweep after the main traversal that restores the pass's own
invariant -- all operands of a node share one memory format -- for broadcasting
binary ops (add/mul/sub/div), converging to NHWC when possible (else NCHW). It runs
after the retrace so it observes the settled graph, and retraces once more only if
anything changed. Graphs with no diverged binary operands are untouched.

Add a regression test (DynamicQuantPerChannelBinaryChain) whose per-channel binary
op is the first consumer of an input activation, with the convolution chain running
through it; it fails to execute (xnn_status_invalid_parameter) without this fix and
passes with it.
@Hyungkeun-Park-Nota force-pushed the fix/xnnpack-channels-last-binary-broadcast branch from 050fb6e to 289a570 June 18, 2026 09:35
@Hyungkeun-Park-Nota marked this pull request as ready for review June 19, 2026 01:02
@Hyungkeun-Park-Nota

Contributor Author

@pytorchbot label "release notes: xnnpack"

@pytorch-bot pytorch-bot Bot added the release notes: xnnpack Changes to the XNNPack backend delegate label Jun 19, 2026

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. release notes: xnnpack Changes to the XNNPack backend delegate

2 participants