[ET-VK][Ops] quantization op shaders and impl #11369

Open

ahmtox wants to merge 14 commits into gh/ahmtox/11/base from gh/ahmtox/11/head

+755 −6

ahmtox commented Jun 4, 2025

Stack from ghstack (oldest at bottom):

Adds the shaders and implementation for the quantize_per_tensor and quantize_per_token ops, and links them to the testing framework.
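For readers unfamiliar with the two ops, the arithmetic they are generally understood to perform can be sketched in plain Python. This is a minimal reference sketch and not the shader code from this PR: the helper names, the list-based tensors, and the use of Python's `round` (round-half-to-even) are assumptions made for illustration only.

```python
from typing import List, Sequence


def quantize_value(x: float, scale: float, zero_point: int,
                   quant_min: int, quant_max: int) -> int:
    """Quantize one value: round(x / scale) + zero_point, clamped to the range.

    Assumption: Python's round() is round-half-to-even; the rounding mode
    used by the actual shaders is not stated in this PR.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    q = round(x / scale) + zero_point
    return max(quant_min, min(quant_max, q))


def quantize_per_tensor(values: Sequence[float], scale: float, zero_point: int,
                        quant_min: int, quant_max: int) -> List[int]:
    """Per-tensor: a single (scale, zero_point) pair is shared by every element."""
    return [quantize_value(v, scale, zero_point, quant_min, quant_max)
            for v in values]


def quantize_per_token(tokens: Sequence[Sequence[float]],
                       scales: Sequence[float],
                       zero_points: Sequence[int],
                       quant_min: int, quant_max: int) -> List[List[int]]:
    """Per-token: each token (row) has its own (scale, zero_point) pair."""
    if not (len(tokens) == len(scales) == len(zero_points)):
        raise ValueError("need one scale and one zero_point per token")
    return [quantize_per_tensor(row, s, zp, quant_min, quant_max)
            for row, s, zp in zip(tokens, scales, zero_points)]


if __name__ == "__main__":
    # One shared scale/zero_point for the whole tensor; 100.0 saturates at 127.
    print(quantize_per_tensor([0.0, 1.0, -2.0, 100.0], 0.5, 3, -128, 127))
    # Two tokens with identical values but different quantization parameters.
    print(quantize_per_token([[1.0, 2.0], [1.0, 2.0]], [0.5, 0.25], [0, 10], 0, 255))
```

The only difference between the two variants in this sketch is where the scale and zero point come from: one pair for the whole tensor, or one pair per token.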

NOTE: The only input types currently supported are half (fp16) and float (fp32). The only output types supported are byte (uint8), char (int8), short (int16), and int (int32).
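The supported type combinations in the note above can be summarized as a small lookup, sketched here in Python. The table and function names are hypothetical and do not come from this PR. Treating the full representable range as the default `quant_min`/`quant_max` is also an assumption; callers of the real ops pass those bounds explicitly.

```python
from typing import Dict, Tuple

# Input dtypes listed in the PR note.
SUPPORTED_INPUT_DTYPES = ("half", "float")

# Output dtypes listed in the PR note, as (bit width, is_signed).
OUTPUT_DTYPE_BITS: Dict[str, Tuple[int, bool]] = {
    "byte": (8, False),   # uint8
    "char": (8, True),    # int8
    "short": (16, True),  # int16
    "int": (32, True),    # int32
}


def integer_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the (min, max) values representable by an integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def full_quant_range(output_dtype: str) -> Tuple[int, int]:
    """Widest (quant_min, quant_max) that fits the given output dtype."""
    if output_dtype not in OUTPUT_DTYPE_BITS:
        raise ValueError(f"unsupported output dtype: {output_dtype}")
    bits, signed = OUTPUT_DTYPE_BITS[output_dtype]
    return integer_range(bits, signed)


def is_supported(input_dtype: str, output_dtype: str) -> bool:
    """Check an (input, output) dtype pair against the lists in the note."""
    return (input_dtype in SUPPORTED_INPUT_DTYPES
            and output_dtype in OUTPUT_DTYPE_BITS)


if __name__ == "__main__":
    for name in OUTPUT_DTYPE_BITS:
        print(name, full_quant_range(name))
```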

Differential Revision: D75959064


          [ET-VK][Ops] quantization op shaders and impl

0c9c7a6

Creating the quantize_per_tensor and quantize_per_token logic shaders and impl which are linked with the testing framework.

NOTE: Currently the only input types supported are **half** (fp16) and **float** (fp32). The only output types supported are **byte** (uint8), **char** (int8), **short** (int16), **int** (int32).

Differential Revision: [D75959064](https://our.internmc.facebook.com/intern/diff/D75959064/)

[ghstack-poisoned]

ahmtox requested a review from SS-JIA as a code owner

June 4, 2025 18:03

This was referenced Jun 4, 2025

[ET-VK] double, short, and uint16 dtype runtime support #11365

Open

[ET-VK][Ops] quantize ops skeleton test framework #11366

Open

[ET-VK][Ops] quantize_per_token.default test setup #11367

Open

pytorch-bot bot commented Jun 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11369

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Cancelled Job, 7 Pending

As of commit 15a7258 with merge base 8cfa858:

NEW FAILURES - The following jobs have failed:

Build Presets / linux (llm, linux.2xlarge, executorch-ubuntu-22.04-clang12) / build (gh)
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
pull / android / build-llm-demo / linux-job (gh)
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

CANCELLED JOB - The following job was cancelled. Please retry:

pull / unittest / linux / linux-job (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ahmtox mentioned this pull request

[ET-VK][Ops] quantize_per_tensor.default test setup #11368

Open

ahmtox pushed a commit that referenced this pull request


          [ET-VK][Ops] quantization op shaders and impl

36f2cb5

ghstack-source-id: 288187842
Pull Request resolved: #11369

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Jun 4, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064

facebook-github-bot added the fb-exported label


          Update on "[ET-VK][Ops] quantization op shaders and impl"

f2c2380

This was referenced Jun 9, 2025

[ET] enabling half dtype input for quantization #11479

Open

[ET-VK][Ops] dequantize ops skeleton test framework #11480

Open

[ET-VK][Ops] dequantize_per_tensor.default test setup #11481

Open

[ET-VK][Ops] dequantize_per_token.default test setup #11482

Open

[ET-VK][Ops] dequantization op shaders and impl #11483

Open

Contributor

facebook-github-bot commented Jun 9, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

e2cb320

Contributor

facebook-github-bot commented Jun 9, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

cb4bcfe

This was referenced Jun 11, 2025

[ET] enabling half dtype output for dequantization and making logic consistent #11552

Open

[ET-VK][Ops] enabling double support for quantization and dequantization ops #11553

Open

[ET-VK][Ops] choose_qparams ops skeleton test framework #11554

Open

[ET-VK][Ops] choose_qparams.tensor test setup #11555

Open

[ET-VK][Ops] choose_qparams_per_token_asymmetric.default test setup #11556

Open

[ET-VK][Ops] choose_qparams op shaders and impl #11557

Open

Contributor

facebook-github-bot commented Jun 11, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

3615a76

Contributor

facebook-github-bot commented Jun 11, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064

ahmtox added the release notes: vulkan label


          Update on "[ET-VK][Ops] quantization op shaders and impl"

26a3dc8

ahmtox mentioned this pull request

[ET-VK][Ops] common test utils for converting aten types to vulkan types #11575

Open

Contributor

facebook-github-bot commented Jun 11, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

9f7d105

Contributor

facebook-github-bot commented Jun 12, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

d49d3a2

Contributor

facebook-github-bot commented Jun 12, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

499dbfd

Contributor

facebook-github-bot commented Jun 12, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

de2298b

Contributor

facebook-github-bot commented Jun 12, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

06734c3

Contributor

facebook-github-bot commented Jun 13, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064

SS-JIA approved these changes


          Update on "[ET-VK][Ops] quantization op shaders and impl"

10bcfe7

Contributor

facebook-github-bot commented Jun 13, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

67e425b

Contributor

facebook-github-bot commented Jun 13, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064


          Update on "[ET-VK][Ops] quantization op shaders and impl"

15a7258

Contributor

facebook-github-bot commented Jun 13, 2025

This pull request was exported from Phabricator. Differential Revision: D75959064

Labels

CLA Signed, fb-exported, release notes: vulkan