Commit c822247
Quantize moveaxis/movedim so they delegate to Ethos-U (#20314)
Summary:
The ARM PT2 quantizer's pass-through shared-qspec set in quantization_annotator.py
(_one_to_one_shared_input_qspec) covers permute/permute_copy/transpose/view/squeeze
etc., but omits aten.moveaxis/aten.movedim. A model that uses torch.moveaxis
therefore leaves those ops unquantized: the quantizer brackets each one with
dequantize -> moveaxis(float) -> quantize.
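A pure data-movement op can share its input's quantization spec because reordering elements commutes with the affine quantize/dequantize mapping. A minimal pure-Python sketch of that idea follows, assuming per-tensor affine int8 quantization. The helper names and the scale/zero-point values are illustrative and are not taken from the quantizer.

```python
# Illustrative only: per-tensor affine int8 quantization on nested lists.
SCALE, ZERO_POINT = 0.05, 3  # hypothetical quantization parameters


def quantize(x):
    q = round(x / SCALE) + ZERO_POINT
    return max(-128, min(127, q))  # clamp to the int8 range


def dequantize(q):
    return (q - ZERO_POINT) * SCALE


def transpose(m):
    # Stand-in for a 2-D moveaxis/permute: pure element reordering.
    return [list(row) for row in zip(*m)]


def map2d(fn, m):
    return [[fn(v) for v in row] for row in m]


x = [[0.10, -0.25, 1.00], [0.55, 0.00, -1.20]]
q = map2d(quantize, x)

# Unannotated path: dequantize -> moveaxis(float) -> quantize.
float_path = map2d(quantize, transpose(map2d(dequantize, q)))
# Shared-qspec path: move the int8 values directly.
int8_path = transpose(q)

same_result = float_path == int8_path
```

Both paths yield the same int8 values in this sketch, so the float round-trip adds cost without changing the result.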
On lowering, moveaxis decomposes to a float permute_copy. The Ethos-U55
operator-support check (operator_support/ethos_u55_support.py) only delegates
permute_copy for int8/int16/int32, so it rejects the float one. Each rejected
permute is stranded on the host, splitting the model into many delegated
partitions (one NPU island per permute), which bloats the .pte with per-partition
delegate overhead and host round-trips.
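The partition blow-up can be pictured with a toy model. The sketch below assumes a linear chain of ops and a simplified support rule (permute_copy is delegated only for integer dtypes, every other op is assumed to be delegated). The op list, dtypes, and function names are hypothetical; this is not the real partitioner. The block count of 8 is chosen only so that the toy chain produces nine islands.

```python
INT_DTYPES = {"int8", "int16", "int32"}


def is_delegated(op, dtype):
    # Simplified rule modelled on the description above: permute_copy is
    # accepted only for integer dtypes. Other ops are assumed supported.
    if op == "permute_copy":
        return dtype in INT_DTYPES
    return True


def count_partitions(chain):
    # Each maximal run of consecutive delegated ops forms one partition.
    partitions, in_partition = 0, False
    for op, dtype in chain:
        delegated = is_delegated(op, dtype)
        if delegated and not in_partition:
            partitions += 1
        in_partition = delegated
    return partitions


def build_chain(num_blocks, permute_dtype):
    # Hypothetical model: conv blocks separated by one permute each.
    chain = [("conv2d", "int8")]
    for _ in range(num_blocks):
        chain += [("permute_copy", permute_dtype), ("conv2d", "int8")]
    return chain


float_permutes = count_partitions(build_chain(8, "float32"))
int8_permutes = count_partitions(build_chain(8, "int8"))
```

In this toy chain every rejected float permute ends one delegated run and starts another, while int8 permutes leave a single run.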
Add aten.moveaxis.int / aten.movedim.int to _one_to_one_shared_input_qspec
(guarded with getattr for torch-build variance, mirroring the existing
transpose.Dimname handling) so they share the input quantization spec exactly like
transpose/permute. They then stay int8, decompose to int8 permute_copy, and
delegate to the NPU -- eliminating the host float islands.
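The getattr guard can be sketched as follows. This is not the code from the commit: `fake_aten` is a hypothetical stand-in for `torch.ops.aten` on a build that lacks `movedim`, `optional_overload` and `shared_input_qspec_ops` are illustrative names, and the sketch assumes a missing op or overload surfaces as an `AttributeError`.

```python
from types import SimpleNamespace

# Stand-in for torch.ops.aten on a build that lacks movedim (hypothetical).
fake_aten = SimpleNamespace(
    permute=SimpleNamespace(default="aten.permute.default"),
    moveaxis=SimpleNamespace(int="aten.moveaxis.int"),
)


def optional_overload(namespace, op_name, overload_name):
    # Return the overload if this build exposes it, otherwise None.
    packet = getattr(namespace, op_name, None)
    return getattr(packet, overload_name, None)


# Ops whose output reuses the input's quantization spec (illustrative set).
shared_input_qspec_ops = {fake_aten.permute.default}
for op_name in ("moveaxis", "movedim"):
    overload = optional_overload(fake_aten, op_name, "int")
    if overload is not None:
        shared_input_qspec_ops.add(overload)
```

Guarding each lookup lets the same set definition load on builds that expose only some of the overloads, instead of failing at import time.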
Impact: a quantized example ensemble (ConvNeXt-style blocks that use
torch.moveaxis) that previously lowered into 9 Ethos-U55 partitions now lowers
into a single delegate, with zero host permutes and a ~24% smaller .pte, and
requires no model changes. The fix applies to any model that uses
moveaxis/movedim on the Ethos-U backend.
Differential Revision: D108478011
Parent: 23f9021
2 files changed
Lines changed: 35 additions & 0 deletions
File 1 (name not preserved): 5 lines added, at new lines 622-625 and 637.
File 2 (name not preserved): 30 lines added, at new lines 12, 82-87, and 128-150.