[ET-VK][quantization] Implement layout-flexible quantize/dequantize operators #17261

pytorchbot · 2026-02-05T23:28:44Z

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #17106 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/399/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/399/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/405/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/399/orig
Differential Revision: D92061370
@diff-train-skip-merge

…perators

Pull Request resolved: #17106

Implemented quantize_per_tensor and dequantize_per_tensor GLSL shaders and C++ dispatch logic to support the new single-dimension packed INT8 layouts (kPackedInt8_4W, kPackedInt8_4C, kPackedInt8_4H). These operators enable conversion between floating-point tensors and packed int8 representations with per-tensor scale and zero-point parameters.

The implementation includes:

- GLSL shaders: quantize_per_tensor and dequantize_per_tensor with support for both texture->buffer and buffer->buffer data flows, including GL_EXT_debug_printf statements for debugging
- QuantizeDequantize.cpp: Added dispatch functions for the new layouts and registered etvk.q_dq_8bit_per_tensor.default operator
- Test infrastructure: Created q_dq_8bit_per_tensor test binary with DEBUG_MODE support and reference CPU implementation for validation

The shaders implement the quantization formula

```
Q = clamp(round(x/scale) + zp, -128, 127)
```

and dequantization formula

```
x' = (Q - zp) * scale
```

with proper int8 packing/unpacking using little-endian byte ordering and sign extension.

ghstack-source-id: 338638544
@exported-using-ghexport

Differential Revision: [D92061370](https://our.internmc.facebook.com/intern/diff/D92061370/)
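For readers who want to check the two formulas by hand, here is a minimal CPU-side sketch in Python. It is not the PR's shader or its reference implementation. The function names are hypothetical, the tie-breaking rule (Python's `round`, which rounds halves to even) is an assumption since the description does not state one, and the packing assumes that four consecutive int8 values share one 32-bit word with the first value in the lowest byte, which is one reading of "little-endian byte ordering".

```python
def quantize_per_tensor(x, scale, zp):
    """Q = clamp(round(x / scale) + zp, -128, 127) for a single float."""
    q = round(x / scale) + zp  # assumption: ties round to even
    return max(-128, min(127, q))


def dequantize_per_tensor(q, scale, zp):
    """x' = (Q - zp) * scale for a single int8 value."""
    return (q - zp) * scale


def pack_int8x4(values):
    """Pack four int8 values into one 32-bit word, first value in the low byte."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)  # keep the two's-complement byte
    return word


def unpack_int8x4(word):
    """Unpack four int8 values from a 32-bit word, sign-extending each byte."""
    out = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        out.append(byte - 256 if byte >= 128 else byte)  # sign extension
    return out


# Round trip with hypothetical parameters scale=0.5, zp=3.
scale, zp = 0.5, 3
xs = [-10.0, 0.0, 1.3, 100.0]
qs = [quantize_per_tensor(x, scale, zp) for x in xs]    # [-17, 3, 6, 127]
word = pack_int8x4(qs)                                  # 0x7F0603EF
restored = [dequantize_per_tensor(q, scale, zp) for q in unpack_int8x4(word)]
```

The last input saturates at 127, so it dequantizes to 62.0 rather than 100.0. The third input comes back as 1.5 because of rounding to the nearest quantization step.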

pytorch-bot · 2026-02-05T23:28:48Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17261

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 76 Pending, 1 Unrelated Failure

As of commit 1781382 with merge base 1cffd23:

NEW FAILURE - The following job has failed:

Build Presets / apple (ios) / build (gh)
The process '/opt/homebrew/bin/git' failed with exit code 1

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / test-models-linux-basic (mv3, xnnpack-quantization-delegation, cmake, linux.arm64.2xlarge, execut... / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

This comment was automatically generated by Dr. CI and updates every 15 minutes.


pytorchbot requested review from SS-JIA, kirklandsign and larryliu0820 as code owners February 5, 2026 23:28

meta-cla bot added the CLA Signed label Feb 5, 2026

SS-JIA approved these changes Feb 5, 2026


SS-JIA merged commit 694f9b8 into gh/SS-JIA/405/orig Feb 5, 2026
159 of 173 checks passed

SS-JIA deleted the gh/SS-JIA/399/orig branch February 5, 2026 23:57
