Enable QuantFusionPass in compiler pipeline (#19728) by ethansfng · Pull Request #19728 · pytorch/executorch

ethansfng · 2026-05-21T20:31:00Z

Summary:

Both and Cadence now use the shared QuantFusionPass from compiler_funcs.py.

- `QuantFusionPass` in `compiler_funcs.py` iterates patterns, matches `anchor_ops()`, calls `fuse()` on each match, with debug logging and dead code elimination
- Cadence: `compiler.py` now uses `QuantFusionPass` instead of the old `QuantFusion` isinstance switch
- Removed Cadence `compiler` target's dep on `:fusion_pass` (no longer imported)
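
For readers who have not seen the pass, the loop described above can be sketched roughly as follows. This is a toy model, not the code in `compiler_funcs.py`: `Node`, `ToyLinearPattern`, `eliminate_dead_code`, and `run_fusion` are hypothetical stand-ins, the `fuse(graph, node)` signature is an assumption, and the real pass works on an exported graph rather than a Python list.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass
class Node:
    """Hypothetical stand-in for a graph node."""
    op: str
    inputs: list = field(default_factory=list)


class ToyLinearPattern:
    """Hypothetical pattern exposing the two hooks named in the summary."""

    def anchor_ops(self):
        return ("linear",)

    def fuse(self, graph, node):
        # Bypass dequant inputs so the fused op reads the quantized values.
        node.inputs = [
            src
            for inp in node.inputs
            for src in (inp.inputs if inp.op == "dequant" else [inp])
        ]
        node.op = "quantized_linear"
        return True


def eliminate_dead_code(graph):
    """Keep only nodes reachable from the last node (treated as the output)."""
    live, stack = set(), [graph[-1]]
    while stack:
        node = stack.pop()
        if id(node) not in live:
            live.add(id(node))
            stack.extend(node.inputs)
    graph[:] = [n for n in graph if id(n) in live]


def run_fusion(graph, patterns):
    """Iterate patterns, match anchors, call fuse(), then drop dead nodes."""
    fused = 0
    for pattern in patterns:
        anchors = set(pattern.anchor_ops())
        for node in list(graph):  # copy, since fuse() may rewrite the graph
            if node.op in anchors and pattern.fuse(graph, node):
                logger.debug("fused a node via %s", type(pattern).__name__)
                fused += 1
    eliminate_dead_code(graph)
    return fused


x = Node("placeholder")
dq = Node("dequant", [x])
lin = Node("linear", [dq])
out = Node("output", [lin])
graph = [x, dq, lin, out]

count = run_fusion(graph, [ToyLinearPattern()])
ops = [n.op for n in graph]
```

In this toy graph the `dequant` node loses its only user once the anchor is fused, so the cleanup step removes it.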

Differential Revision: D105728219

pytorch-bot · 2026-05-21T20:31:05Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19728

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 181c7bd with merge base 7d8063f:

NEW FAILURE - The following job has failed:

pull / test-parakeet-xnnpack-linux / linux-job (gh)
RuntimeError: Command docker exec -t 277f33343ba3927d9735c73e178dff4f2852b899e8bbc3f53d719a1afcf71250 /exec failed with exit code 1

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest-editable / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-05-21T20:31:13Z

@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D105728219.

github-actions · 2026-05-21T20:31:59Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Summary: Enable new `QuantFusionPass` that calls `pattern.fuse()` instead of the monolithic `QuantFusion`.

Differential Revision: D105728219

…19743)

Summary: torchao's `convert_pt2e` adds `out_dtype` kwargs to dequant nodes for bf16 models. `cadence::dequantize_per_tensor` doesn't support this kwarg (it hardcodes float32 output), so `ReplacePT2DequantWithCadenceDequantPass` crashes when it forwards kwargs blindly to the cadence op. Strip `out_dtype` from kwargs before creating the cadence dequant node, and insert an `aten.to.dtype` cast after it to preserve the original output dtype semantics.

Differential Revision: D105630451
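
The strip-then-cast fix in the commit summary above could look something like the sketch below. It only plans the replacement as plain tuples and does not touch a real graph. `split_out_dtype` and `plan_replacement` are hypothetical helpers, dtypes are written as strings for illustration, and skipping the cast when the requested dtype is already float32 is an assumption rather than something the summary states.

```python
def split_out_dtype(kwargs, default_dtype="float32"):
    """Return (kwargs without out_dtype, requested dtype, whether a cast is needed)."""
    cleaned = dict(kwargs)  # copy so the caller's kwargs are not mutated
    requested = cleaned.pop("out_dtype", default_dtype)
    needs_cast = requested != default_dtype
    return cleaned, requested, needs_cast


def plan_replacement(kwargs):
    """List the ops a replacement would emit, as (op_name, kwargs) tuples."""
    cleaned, requested, needs_cast = split_out_dtype(kwargs)
    # The cadence op only ever sees kwargs it supports.
    ops = [("cadence::dequantize_per_tensor", cleaned)]
    if needs_cast:
        # Restore the dtype that the original dequant node promised.
        ops.append(("aten.to.dtype", {"dtype": requested}))
    return ops


bf16_plan = plan_replacement({"out_dtype": "bfloat16"})
fp32_plan = plan_replacement({})
```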

Summary: Add infrastructure for per-pattern `fuse()` methods on Cadence `QuantizationPattern`:

- Add `anchor_ops()` (default: `tuple(partition_types())`) and `fuse()` (default: `None`) to `QuantizationPattern` base class
- Add shared fusion helpers: `_get_dequant`, `_find_quant_user`, `_insert_fused_op`, `_maybe_route_depthwise_conv1d`, `_fuse_conv`, `_fuse_linear`, `_fuse_matmul`
- Add `QuantFusionPass` to `compiler_funcs.py`: a shared executor that iterates patterns, matches `anchor_ops()`, calls `fuse()` with debug logging and dead code elimination

Differential Revision: D105728137
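
A minimal sketch of the two base-class hooks described above, under stated assumptions: the class names here are hypothetical and are not the Cadence `QuantizationPattern`, op types are plain strings for illustration, `fuse()` "default: `None`" is read as "a method that returns `None`", and its argument list is not given in the summary.

```python
from abc import ABC, abstractmethod


class QuantizationPatternSketch(ABC):
    """Hypothetical mirror of the hooks in the summary."""

    @abstractmethod
    def partition_types(self):
        """Return the op types this pattern partitions on."""

    def anchor_ops(self):
        # Default from the summary: tuple(partition_types()).
        return tuple(self.partition_types())

    def fuse(self, *args, **kwargs):
        # Default from the summary: None. The arguments are an assumption.
        return None


class LinearSketch(QuantizationPatternSketch):
    """Uses the default anchors, i.e. everything it partitions on."""

    def partition_types(self):
        return ["linear"]


class ConvReluSketch(QuantizationPatternSketch):
    """Partitions on two ops but anchors on only the first."""

    def partition_types(self):
        return ["conv2d", "relu"]

    def anchor_ops(self):
        return ("conv2d",)


default_anchors = LinearSketch().anchor_ops()
narrowed_anchors = ConvReluSketch().anchor_ops()
default_fuse_result = LinearSketch().fuse()
```

Keeping `anchor_ops()` separate from `partition_types()` lets a multi-op pattern be matched once per occurrence instead of once per op it covers.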

Summary: Add `fuse()` implementations to the first batch of Cadence `QuantizationPattern` subclasses, the standard fully-quantized patterns that use the shared `_fuse_conv`, `_fuse_linear`, and `_fuse_matmul` helpers:

- `AddmmPattern`: transpose weight + linear fusion
- `AddPattern`: two-input quantized add
- `AddReluBasePattern`: add+relu fusion with `anchor_ops()` override
- `BmmPattern`, `MatmulPattern`: matmul fusion via `_fuse_matmul`
- `CatPattern`: cat passthrough on quantized inputs
- `Conv1dPattern`, `Conv2dPattern`: conv fusion via `_fuse_conv` with depthwise routing
- `LayerNormPattern`: layer norm with default weight/bias creation
- `LinearPattern`: linear fusion via `_fuse_linear`

Differential Revision: D105728156
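
As background for the `AddmmPattern` entry above: `addmm(bias, x, w)` computes `bias + x @ w` with `w` laid out as (in_features, out_features), while `linear(x, weight, bias)` computes `x @ weight^T + bias` with `weight` laid out as (out_features, in_features), which is why a transpose is needed before reusing a linear fusion. The sketch below checks that identity on small integer matrices using plain Python lists; the helper functions are illustrative only and are not taken from the PR.

```python
def transpose(m):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*m)]


def addmm(bias, x, w):
    """bias + x @ w, with w of shape (in_features, out_features)."""
    return [
        [sum(a * b for a, b in zip(row, col)) + bias_j
         for col, bias_j in zip(zip(*w), bias)]
        for row in x
    ]


def linear(x, weight, bias):
    """x @ weight^T + bias, with weight of shape (out_features, in_features)."""
    return [
        [sum(a * b for a, b in zip(row, w_row)) + bias_j
         for w_row, bias_j in zip(weight, bias)]
        for row in x
    ]


x = [[1, 2], [3, 4]]              # (batch=2, in_features=2)
w = [[1, 0, 2], [0, 1, 3]]        # (in_features=2, out_features=3)
bias = [10, 20, 30]

via_addmm = addmm(bias, x, w)
via_linear = linear(x, transpose(w), bias)
```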

Summary: Add `fuse()` implementations to the remaining Cadence `QuantizationPattern` subclasses:

- `MaxPool2dPattern`, `MaxPool2dWithoutIndicesPattern`: order-preserving pool on quantized values
- `ReluBasePattern` (inherited by `ReluPattern0`/`1`): relu with requantization
- `ConvReluBasePattern` (inherited by `Conv1d`/`2dReluPattern0`/`1`): conv+relu fusion with `anchor_ops()` override to match only the conv op
- `SoftmaxPattern`: softmax with dummy mask/pos tensors and fake_mode metadata
- `MixedW8A32LinearPattern`: weight-only quantized linear (no input/output quant)
- `MixedW8A32ConvPattern`: weight-only quantized conv1d with NCL→NLC permutation
- `MixedW8A32GruPattern`: weight-only quantized GRU with 4 dequantized params

Differential Revision: D105728177
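
The "order-preserving" remark for the max pool patterns above rests on a general property: affine quantization with a positive scale is monotonically non-decreasing, so taking the max before or after quantizing gives the same result. The sketch below illustrates this with a 1-D pool over plain floats. The `quantize` and `max_pool_1d` helpers and the scale and zero-point values are made up for illustration and are not taken from the PR.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine int8-style quantization; non-decreasing in x for scale > 0."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def max_pool_1d(values, kernel):
    """Non-overlapping 1-D max pool (stride equals kernel size)."""
    return [
        max(values[i:i + kernel])
        for i in range(0, len(values) - kernel + 1, kernel)
    ]


xs = [0.1, -0.7, 0.45, 0.3, 1.2, -0.05]
scale, zero_point = 0.05, 3  # illustrative values only

# Pool in float, then quantize the pooled outputs.
pool_then_quant = [quantize(v, scale, zero_point) for v in max_pool_1d(xs, 2)]

# Quantize every input first, then pool directly on the integer values.
quant_then_pool = max_pool_1d([quantize(v, scale, zero_point) for v in xs], 2)
```

Because both orders agree, a fused op can pool directly on the quantized integers without a dequantize/requantize round trip.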

Summary: Both and Cadence now use the shared `QuantFusionPass` from `compiler_funcs.py`.

- `QuantFusionPass` in `compiler_funcs.py` iterates patterns, matches `anchor_ops()`, calls `fuse()` on each match, with debug logging and dead code elimination
- Cadence: `compiler.py` now uses `QuantFusionPass` instead of the old `QuantFusion` isinstance switch
- Removed Cadence `compiler` target's dep on `:fusion_pass` (no longer imported)

Differential Revision: D105728219

meta-cla Bot added the CLA Signed label May 21, 2026

meta-codesync Bot added fb-exported meta-exported labels May 21, 2026

meta-codesync Bot changed the title ~~Unify quantization fusion pass across backends~~ Unify quantization fusion pass across backends (#19728) May 21, 2026

ethansfng force-pushed the export-D105728219 branch from 42b8ffe to 8f2e4ea Compare May 21, 2026 21:11

ethansfng force-pushed the export-D105728219 branch from 8f2e4ea to 250f45e Compare May 21, 2026 21:18

ethansfng force-pushed the export-D105728219 branch from 250f45e to 9010a85 Compare May 22, 2026 18:48

meta-codesync Bot changed the title ~~Unify quantization fusion pass across backends (#19728)~~ Enable QuantFusionPass in compiler pipeline May 23, 2026

ethansfng force-pushed the export-D105728219 branch from 9010a85 to 6e7ff7c Compare May 23, 2026 00:38

ethansfng added 5 commits May 22, 2026 21:00

meta-codesync Bot changed the title ~~Enable QuantFusionPass in compiler pipeline~~ Enable QuantFusionPass in compiler pipeline (#19728) May 23, 2026

ethansfng force-pushed the export-D105728219 branch from 6e7ff7c to 181c7bd Compare May 23, 2026 04:00

ethansfng wants to merge 5 commits into pytorch:main from ethansfng:export-D105728219