[ET-VK][Ops] quantization op shaders and impl #11369

Merged

facebook-github-bot merged 16 commits into gh/ahmtox/11/base from gh/ahmtox/11/head

Jun 17, 2025

ahmtox commented Jun 4, 2025

Contributor

Stack from ghstack (oldest at bottom):

Adds the shaders and operator implementations for quantize_per_tensor and quantize_per_token, and links them with the testing framework.

NOTE: Currently, the only supported input types are half (fp16) and float (fp32). The only supported output types are byte (uint8), char (int8), short (int16), and int (int32).

Differential Revision: D75959064
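
For readers unfamiliar with the two ops, here is a minimal reference sketch in plain Python. It assumes the usual affine quantization formula, `q = clamp(round(x / scale) + zero_point, quant_min, quant_max)`, with one scale and zero point for the whole tensor (per tensor) or one pair per row (per token). The function names, the `QUANT_RANGES` table, and the list-based layout are illustrative only and are not taken from this PR; the exact rounding and arithmetic in the actual shaders may differ.

```python
from typing import List, Sequence

# Assumed integer ranges for the output types listed in the NOTE above.
QUANT_RANGES = {
    "uint8": (0, 255),
    "int8": (-128, 127),
    "int16": (-32768, 32767),
    "int32": (-2**31, 2**31 - 1),
}


def quantize_value(x: float, scale: float, zero_point: int,
                   quant_min: int, quant_max: int) -> int:
    """Affine-quantize one value: scale, round (ties to even), shift, clamp."""
    q = round(x / scale) + zero_point
    return max(quant_min, min(quant_max, q))


def quantize_per_tensor(values: Sequence[float], scale: float,
                        zero_point: int, dtype: str = "int8") -> List[int]:
    """Quantize every element with a single scale and zero point."""
    qmin, qmax = QUANT_RANGES[dtype]
    return [quantize_value(v, scale, zero_point, qmin, qmax) for v in values]


def quantize_per_token(tokens: Sequence[Sequence[float]],
                       scales: Sequence[float],
                       zero_points: Sequence[int],
                       dtype: str = "int8") -> List[List[int]]:
    """Quantize each row (token) with its own scale and zero point."""
    if not (len(tokens) == len(scales) == len(zero_points)):
        raise ValueError("need exactly one scale and zero point per token")
    qmin, qmax = QUANT_RANGES[dtype]
    return [
        [quantize_value(v, s, zp, qmin, qmax) for v in row]
        for row, s, zp in zip(tokens, scales, zero_points)
    ]
```

Under these assumptions, `quantize_per_tensor([0.0, 0.75, -1.0, 100.0], 0.5, 3, "int8")` gives `[3, 5, 1, 127]` (the last value saturates at the int8 maximum), and `quantize_per_token([[1.0, 2.0], [1.0, 2.0]], [0.5, 0.25], [0, 10], "uint8")` gives `[[2, 4], [14, 18]]`.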


          [ET-VK][Ops] quantization op shaders and impl

0c9c7a6

ahmtox requested a review from SS-JIA as a code owner

June 4, 2025 18:03

This was referenced Jun 4, 2025

[ET-VK] double, short, and uint16 dtype runtime support #11365

Merged

[ET-VK][Ops] quantize ops skeleton test framework #11366

Merged

[ET-VK][Ops] quantize_per_token.default test setup #11367

Merged

pytorch-bot Bot commented Jun 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11369

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 9dc4092 with merge base 3b1c7fd:

NEW FAILURE - The following job has failed:

Build Presets / linux (pybind, linux.2xlarge, executorch-ubuntu-22.04-clang12) / build (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ahmtox mentioned this pull request

[ET-VK][Ops] quantize_per_tensor.default test setup #11368

Merged

ahmtox pushed a commit that referenced this pull request


          [ET-VK][Ops] quantization op shaders and impl

36f2cb5

ghstack-source-id: 288187842
Pull Request resolved: #11369

facebook-github-bot added the CLA Signed label

facebook-github-bot commented Jun 4, 2025

Contributor

This pull request was exported from Phabricator. Differential Revision: D75959064

facebook-github-bot added the fb-exported label


          Update on "[ET-VK][Ops] quantization op shaders and impl"

f2c2380

This was referenced Jun 9, 2025

[ET] enabling half dtype input for quantization #11479

Merged

[ET-VK][Ops] dequantize ops skeleton test framework #11480

Merged

[ET-VK][Ops] dequantize_per_tensor.default test setup #11481

Merged

[ET-VK][Ops] dequantize_per_token.default test setup #11482

Merged

[ET-VK][Ops] dequantization op shaders and impl #11483

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

e2cb320


          Update on "[ET-VK][Ops] quantization op shaders and impl"

cb4bcfe

This was referenced Jun 11, 2025

[ET] enabling half dtype output for dequantization and making logic consistent #11552

Merged

[ET-VK][Ops] enabling double support for quantization and dequantization ops #11553

Merged

[ET-VK][Ops] choose_qparams ops skeleton test framework #11554

Merged

[ET-VK][Ops] choose_qparams.tensor test setup #11555

Merged

[ET-VK][Ops] choose_qparams_per_token_asymmetric.default test setup #11556

Merged

[ET-VK][Ops] choose_qparams op shaders and impl #11557

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

3615a76

ahmtox mentioned this pull request

[ET-VK][Ops] common test utils for converting aten types to vulkan types #11575

Merged


          Update on "[ET-VK][Ops] quantization op shaders and impl"

9f7d105


          Update on "[ET-VK][Ops] quantization op shaders and impl"

d49d3a2


          Update on "[ET-VK][Ops] quantization op shaders and impl"

499dbfd


          Update on "[ET-VK][Ops] quantization op shaders and impl"

de2298b


          Update on "[ET-VK][Ops] quantization op shaders and impl"

06734c3

SS-JIA approved these changes


          Update on "[ET-VK][Ops] quantization op shaders and impl"

10bcfe7


          Update on "[ET-VK][Ops] quantization op shaders and impl"

67e425b


          Update on "[ET-VK][Ops] quantization op shaders and impl"

15a7258


          Update on "[ET-VK][Ops] quantization op shaders and impl"

c09bd60

SS-JIA approved these changes

SS-JIA approved these changes


          Update on "[ET-VK][Ops] quantization op shaders and impl"

9dc4092

facebook-github-bot merged commit 4ffc98a into gh/ahmtox/11/base

95 of 98 checks passed

facebook-github-bot deleted the gh/ahmtox/11/head branch

June 17, 2025 22:03

facebook-github-bot temporarily deployed to cherry-pick-bot with GitHub Actions (Inactive)

June 17, 2025 22:03

pytorchbot mentioned this pull request

[ET-VK][Ops] quantization op shaders and impl #11767

Merged

cccclai pushed a commit that referenced this pull request


          [ET-VK][Ops] quantization op shaders and impl (#11767)

d984a2c

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #11369 by @ahmtox
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/11/orig
@diff-train-skip-merge

Co-authored-by: morelos <morelos@devvm4573.ash0.facebook.com>

Labels

CLA Signed, fb-exported, release notes: vulkan