Arm backend: Make composable_quantizer default #19758
A few fixes needed:
- Add new ops added since the initial upstream of composable_quantizer
- Add while-op quantize fix from 3be4546 to TosaQuantizerV2
- Add fixed_qparams fix from fb90480 to TosaQuantizerV2
- Update some tests to mirror new behaviours
- Update quantizer_tutorial to not be WIP
- Remove hardswish from FUSED_ACTIVATION_OPS
- Explicitly check that weights and biases are input args to conv/linear ops. The assumption that weights and biases are the only parameters of networks does not hold for real models.

Signed-off-by: Adrian Lundell <adrian.lundell@arm.com>
Change-Id: Ifa127a73d4db45cd2d3461101f97c0cf852bf7bf
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19758
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 4 New Failures, 4 Unrelated Failures
As of commit 6c4194e with merge base 77df9b7:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR switches the Arm backend’s TOSAQuantizer to use the composable quantizer implementation by default, and updates Arm quantization annotation/support logic plus tests/tutorial materials to match the new behaviors (including while-loop and fixed-qparams handling).
Changes:
- Make `TOSAQuantizer` default to `use_composable_quantizer=True`.
- Extend/update quantizer support + annotation behavior (e.g., while-loop shared-qspec handling, fixed-qparams input qspecs for trig ops, additional supported ops).
- Update Arm backend tests and the Arm quantizer tutorial notebook to reflect the new defaults/behaviors.
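To picture the while-loop shared-qspec handling listed above: a loop-carried value flows from the initial operand into the loop body, and from the body output back into the next iteration, so all of those tensors plausibly need the same scale and zero point. The following is a minimal sketch in plain Python under that assumption. The node names, the `SharedQspecGroups` class, and `tie_while_loop` are hypothetical and are not taken from `arm_quantizer_utils.py`.

```python
from collections import defaultdict


class SharedQspecGroups:
    """Union-find over node names; nodes in one group share quantization parameters."""

    def __init__(self):
        self._parent = {}

    def find(self, name):
        self._parent.setdefault(name, name)
        while self._parent[name] != name:
            # Path halving keeps later lookups cheap.
            self._parent[name] = self._parent[self._parent[name]]
            name = self._parent[name]
        return name

    def union(self, a, b):
        # Merge the group containing b into the group containing a.
        self._parent[self.find(b)] = self.find(a)

    def groups(self):
        out = defaultdict(set)
        for name in list(self._parent):
            out[self.find(name)].add(name)
        return sorted(sorted(members) for members in out.values())


def tie_while_loop(groups, carried_inputs, body_inputs, body_outputs, loop_outputs):
    """Put every tensor that carries the same loop value into one shared group."""
    for init, b_in, b_out, out in zip(
        carried_inputs, body_inputs, body_outputs, loop_outputs
    ):
        groups.union(init, b_in)
        groups.union(init, b_out)
        groups.union(init, out)


# Hypothetical loop with two carried values, x and i.
groups = SharedQspecGroups()
tie_while_loop(
    groups,
    carried_inputs=["x_init", "i_init"],
    body_inputs=["body_x", "body_i"],
    body_outputs=["body_x_next", "body_i_next"],
    loop_outputs=["x_final", "i_final"],
)
shared = groups.groups()
```

Each carried value ends up in its own group, so `x` and `i` can still be quantized differently while every tensor that holds `x` uses one set of parameters.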
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| examples/arm/quantizer_tutorial.ipynb | Removes WIP framing in the composable quantizer tutorial intro. |
| backends/arm/test/ops/test_while.py | Adds an INT while-loop test path forcing composable quantizer usage. |
| backends/arm/test/ops/test_transpose_conv2d.py | Updates tests to use TOSAQuantizationConfig for global config setup. |
| backends/arm/test/ops/test_to_copy.py | Simplifies redundant-cast xfail configuration shared between FP/INT. |
| backends/arm/test/misc/test_shared_qspecs.py | Updates golden expectations for shared-qspec annotation counts/qparams. |
| backends/arm/test/misc/test_quant_custom_meta.py | Adjusts test quantizer config (including set_io(None)) to match new behavior. |
| backends/arm/quantizer/quantizer_support.py | Updates supported/fused patterns (and adds more supported ops). |
| backends/arm/quantizer/quantization_config.py | Adds fixed-qparams input spec generation for specific trig ops under composable flow. |
| backends/arm/quantizer/arm_quantizer.py | Makes composable quantizer the default for TOSAQuantizer. |
| backends/arm/quantizer/arm_quantizer_utils.py | Tightens weight/bias identification and adds while-loop shared-qspec special-casing; extends shared-qspec op list. |
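The tightened weight/bias identification can be illustrated with a small stand-alone sketch. The idea, as described in the PR, is that a parameter counts as a weight or bias only if it actually appears in the weight or bias argument slot of a conv/linear op; other parameters (for example a learned scale fed into a multiply) are neither. The `Node` class, the op names, and the argument positions below are assumptions made for illustration, not the real graph types or the code in this PR.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Tiny stand-in for a graph node (hypothetical, not torch.fx.Node)."""
    name: str
    op: str            # "placeholder" or "call_function"
    target: str = ""
    args: tuple = ()
    is_param: bool = False


# Assumed argument layout: (input, weight, bias, ...) for both op kinds.
WEIGHT_BIAS_ARG_INDEX = {"conv2d": (1, 2), "linear": (1, 2)}


def classify_param(param, users):
    """Return "weight", "bias", or "other" for a parameter node."""
    for user in users:
        indices = WEIGHT_BIAS_ARG_INDEX.get(user.target)
        if indices is None:
            continue  # Not a conv/linear op, so it cannot make this a weight or bias.
        weight_idx, bias_idx = indices
        if len(user.args) > weight_idx and user.args[weight_idx] is param:
            return "weight"
        if len(user.args) > bias_idx and user.args[bias_idx] is param:
            return "bias"
    return "other"


x = Node("x", "placeholder")
w = Node("w", "placeholder", is_param=True)
b = Node("b", "placeholder", is_param=True)
scale = Node("scale", "placeholder", is_param=True)
conv = Node("conv", "call_function", "conv2d", (x, w, b))
mul = Node("mul", "call_function", "mul", (conv, scale))

kinds = {
    "w": classify_param(w, [conv]),
    "b": classify_param(b, [conv]),
    "scale": classify_param(scale, [mul]),
}
```

Under the older assumption that every parameter is a weight or bias, `scale` would have been misclassified; checking the argument slot avoids that.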
Signed-off-by: Adrian Lundell <adrian.lundell@arm.com> Change-Id: I75e207d010b2bdc9abc86023153c86d2c96af3fd
@rascani has imported this pull request. If you are a Meta employee, you can view this in D106381887.
Kicking off an internal test run.
It looks like the new quantizer breaks
…tream/change-1253855 Change-Id: I86bc8025e41175a796fa0ff7aabc9847b1de923a
These configs do not exist in the new quantizer, so checking them no longer makes sense. Signed-off-by: Adrian Lundell <adrian.lundell@arm.com> Change-Id: Ic6c0b303466010be59e9c9c37fb179938c412a16
    elif node.target in _fixed_input_qspec_ops:
        input_act_qspec = super().get_input_act_qspec(node, input_node)
        num_bits = torch.iinfo(input_act_qspec.dtype).bits
        qparams = _fixed_input_qspec_ops[node.target][num_bits]
        return FixedQParamsQuantizationSpec(
            dtype=input_act_qspec.dtype,
            scale=qparams.scale,
            zero_point=qparams.zero_point,
            quant_min=input_act_qspec.quant_min,
            quant_max=input_act_qspec.quant_max,
            qscheme=input_act_qspec.qscheme,
            is_dynamic=input_act_qspec.is_dynamic,
        )
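The excerpt above looks up fixed quantization parameters by op and by the bit width of the input dtype. A rough, dependency-free sketch of how such a table could be built and queried follows. The input range of [-π, π] for `sin`, the `QParams` tuple, and the helper names are assumptions for illustration only; the actual values stored in `_fixed_input_qspec_ops` are not shown in this conversation.

```python
import math
from typing import NamedTuple


class QParams(NamedTuple):
    scale: float
    zero_point: int


def fixed_qparams_for_range(lo, hi, num_bits):
    """Affine qparams mapping the real interval [lo, hi] onto a signed num_bits range."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    # Clamp so the zero point is always representable.
    return QParams(scale, max(qmin, min(qmax, zero_point)))


# Hypothetical table: op name -> bit width -> fixed input qparams.
FIXED_INPUT_QPARAMS = {
    "sin": {
        bits: fixed_qparams_for_range(-math.pi, math.pi, bits) for bits in (8, 16)
    },
}


def lookup_fixed_qparams(op, num_bits):
    """Mirror of the two-level lookup in the excerpt: op first, then bit width."""
    return FIXED_INPUT_QPARAMS[op][num_bits]


int8_params = lookup_fixed_qparams("sin", 8)
int16_params = lookup_fixed_qparams("sin", 16)
```

Keying on bit width rather than dtype lets one table serve any signed integer dtype of that width, which matches how the excerpt derives `num_bits` from the dtype before the lookup.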
Signed-off-by: Adrian Lundell <adrian.lundell@arm.com> Change-Id: I7760780d4f06e300f17575e7c6cb34c6dfdae64c
Kicking off another internal run.
FYI @3l1 - not sure if you saw this.
From @rascani's internal CI runs, these should be forward fixed.
I already have these fixed in D106539874. I'll land it ahead of time so the default switch can be a no-op. |
Signed-off-by: Adrian Lundell <adrian.lundell@arm.com> Change-Id: Ib6fefa3ba308d955b46e23095633f54df8ed3bf0
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani