Commit 89d57c4
Arm backend: fix composable quantizer leaving int16 elementwise constants and IO-boundary shared clusters unquantized
Summary:
Two correctness fixes for the composable TOSA quantizer (enabled via `use_composable_quantizer=True`).
1. Parameter operands of non-conv/linear ops were quantized as weights.
In `annotate_match`, the first parameter input of any matched op was assigned the weight qspec. Only conv and linear ops have true weight/bias operands; for any other op (for example an elementwise `add`/`sub`), a constant/parameter operand is an ordinary activation operand. Quantizing it with the weight dtype while the other operand and the output use the activation dtype has two effects: the graph cannot be lowered for ops whose TOSA implementation requires both operands to share a dtype (`add`/`sub`), and constants in 16A8W (int16-activation) configurations are silently demoted to int8. Weight/bias classification is now restricted to the ops that actually have them; parameter inputs of all other ops receive the input-activation qspec.
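The classification rule described above can be sketched in plain Python. This is a toy model, not the ExecuTorch code: `QSpec`, `QConfig`, `WEIGHTED_OPS`, `classify_param_operand`, the op name strings, and the dtype values are all hypothetical names chosen for illustration, and the argument positions (weight at index 1, bias at index 2) are an assumption that follows the usual `(input, weight, bias)` ordering.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the quantizer's qspec/config objects.
@dataclass(frozen=True)
class QSpec:
    dtype: str


@dataclass(frozen=True)
class QConfig:
    input_activation: QSpec
    weight: QSpec
    bias: QSpec


# Assumption: only these ops have true weight/bias operands.
WEIGHTED_OPS = {"conv2d", "linear"}


def classify_param_operand(op: str, arg_index: int, cfg: QConfig) -> QSpec:
    """Pick a qspec for a parameter/constant input at position arg_index."""
    if op in WEIGHTED_OPS:
        if arg_index == 1:  # assumed weight position
            return cfg.weight
        if arg_index == 2:  # assumed bias position
            return cfg.bias
    # Any other parameter input is an ordinary activation operand.
    return cfg.input_activation


# Illustrative 16A8W-style config; the bias dtype is a placeholder value.
cfg_16a8w = QConfig(
    input_activation=QSpec("int16"),
    weight=QSpec("int8"),
    bias=QSpec("int32"),
)

# A constant operand of `add` keeps the activation dtype, so both
# operands of the elementwise op share a dtype.
add_const = classify_param_operand("add", 1, cfg_16a8w)
linear_weight = classify_param_operand("linear", 1, cfg_16a8w)
```

Under the pre-fix behaviour described in the summary, the `add` constant would have received the weight qspec (int8 here) instead of the int16 activation qspec.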
2. Shared-op clusters on the quantized IO boundary were left in float.
`SharedQspecQuantizer` only propagates a qspec from an already-quantized neighbor. Consider a cluster of shared/no-arithmetic ops (for example `cat` and view/reshape ops) whose only quantized neighbors are a uint8 IO input (deliberately skipped so uint8 stays confined to the IO boundary) and/or an input placeholder carrying an empty annotation. Such a cluster had no qspec to propagate, so it was rejected and remained in float, falling off the integer delegate onto the CPU. These clusters now initiate quantization from the global config's input-activation qspec when they sit on the quantized IO boundary, while uint8 stays confined to the IO boundary.
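The fallback described above might look roughly like the following toy model. It is a sketch under stated assumptions, not the `SharedQspecQuantizer` implementation: `Neighbor`, `pick_cluster_dtype`, and the representation of a qspec as a dtype string are hypothetical, and "sits on the quantized IO boundary" is approximated as "has at least one IO-boundary neighbor".

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical view of a node adjacent to a shared-op cluster.
@dataclass
class Neighbor:
    name: str
    qspec_dtype: Optional[str]  # None models an empty/missing annotation
    is_io_boundary: bool = False


def pick_cluster_dtype(
    neighbors: List[Neighbor], global_input_act_dtype: str
) -> Optional[str]:
    """Return the dtype the cluster should share, or None to stay in float."""
    # Normal path: propagate from an already-quantized neighbor,
    # skipping uint8 IO so uint8 stays confined to the IO boundary.
    for n in neighbors:
        if n.qspec_dtype is None:
            continue
        if n.is_io_boundary and n.qspec_dtype == "uint8":
            continue
        return n.qspec_dtype
    # Fallback: nothing to propagate, but the cluster touches the
    # quantized IO boundary, so seed it from the global config.
    if any(n.is_io_boundary for n in neighbors):
        return global_input_act_dtype
    # No quantized neighbor and not on the IO boundary: leave in float.
    return None


# A `cat`/view cluster fed only by a uint8 IO input.
io_only = [Neighbor("x", "uint8", is_io_boundary=True)]
seeded = pick_cluster_dtype(io_only, "int8")
```

In this model the cluster is seeded with the global input-activation dtype rather than with uint8; without the fallback branch the function would return `None`, which corresponds to the cluster remaining in float.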
This change was developed with assistance from Claude.
Differential Revision: D1073208471
Parent: 79cbc45, commit: 89d57c4
2 files changed
Lines changed: 57 additions & 5 deletions
File 1 of 2 (4 additions, 0 deletions; line content not captured)

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| after 1222 | 1223-1226 | 4 lines added |

File 2 of 2 (53 additions, 5 deletions; line content not captured)

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| after 338 | 339-346 | 8 lines added |
| 342 | 350 | 1 line removed, 1 added |
| 346 | 354 | 1 line removed, 1 added |
| after 483 | 492-495 | 4 lines added |
| 555 | 567-580 | 1 line removed, 14 added |
| after 558 | 584 | 1 line added |
| after 566 | 593 | 1 line added |
| after 570 | 598 | 1 line added |
| 572 | 600 | 1 line removed, 1 added |
| 591 | 619-639 | 1 line removed, 21 added |