[ONNX][SmoothQuant] Introduce new axes and axes_mode parameters #3687
Conversation
```python
@staticmethod
def get_abs_max_reducer_cls() -> type[OVAbsMaxReducer]:
    return OVAbsMaxReducer

@staticmethod
def get_shape_reducer_cls() -> type[OVShapeReducer]:
    return OVShapeReducer
```
We add the get_abs_max_reducer_cls() and get_shape_reducer_cls() methods here because the OpenVINO backend uses the OVAbsMaxReducer and OVShapeReducer classes instead of AbsMaxReducer and ShapeReducer to enable in-place statistic collection.
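For illustration, this backend-specific reducer selection follows a simple class-provider pattern: the generic backend returns the generic reducer class, and the OpenVINO backend overrides the provider. A minimal sketch (the class bodies are stand-ins, not NNCF's actual reducers):

```python
class AbsMaxReducer:
    """Generic abs-max reducer (stand-in for the real class)."""


class OVAbsMaxReducer(AbsMaxReducer):
    """OpenVINO variant supporting in-place statistic collection (stand-in)."""


class GenericBackend:
    @staticmethod
    def get_abs_max_reducer_cls() -> type[AbsMaxReducer]:
        return AbsMaxReducer


class OVBackend(GenericBackend):
    # The OpenVINO backend overrides the provider, so callers transparently
    # receive the in-place-capable reducer class.
    @staticmethod
    def get_abs_max_reducer_cls() -> type[AbsMaxReducer]:
        return OVAbsMaxReducer


print(OVBackend.get_abs_max_reducer_cls().__name__)  # OVAbsMaxReducer
```

Callers never name the concrete reducer; they always go through the backend entity, which keeps the algorithm code backend-agnostic.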
Should we perhaps add a test with an ONNX model for which ndim is not known beforehand to have an example of why keep_dims approach is introduced?
Thank you for the suggestion. I'll consider how to implement it. UPD: This problem is reproduced on the timm/visformer_small model from the ptq scope.

```diff
-def __init__(self, reduction_axes: Optional[ReductionAxes] = None, inplace: bool = False):
+def __init__(
+    self,
+    reduction_axes: Optional[ReductionAxes] = None,
```
Should we forward this parameter in the children of the TensorReducerBase?
Done
```python
def __init__(
    self,
    reduction_axes: Optional[ReductionAxes] = None,
    keep_axes: Optional[tuple[int, ...]] = None,
```
```diff
-    keep_axes: Optional[tuple[int, ...]] = None,
+    keep_axes: Optional[Axes] = None,
```

Perhaps we could rename `ReductionAxes` and reuse it here?
Done
```diff
 def __hash__(self) -> int:
-    return hash((self.__class__.__name__, self.inplace, self._reduction_axes))
+    return hash((self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes))
```
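Including `_keep_axes` in the hash matters because two reducers that differ only in their keep axes must not be treated as identical by hash-based collector caches. A minimal illustration with a simplified stand-in class (not NNCF's actual `TensorReducerBase`):

```python
from typing import Optional


class Reducer:
    """Simplified stand-in for TensorReducerBase (assumed structure)."""

    def __init__(
        self,
        reduction_axes: Optional[tuple[int, ...]] = None,
        keep_axes: Optional[tuple[int, ...]] = None,
        inplace: bool = False,
    ):
        self._reduction_axes = reduction_axes
        self._keep_axes = keep_axes
        self.inplace = inplace

    def __hash__(self) -> int:
        # _keep_axes is part of the reducer's identity: reducers differing
        # only in keep_axes must produce distinct hashes.
        return hash((self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes))


a = Reducer(keep_axes=(1,))
b = Reducer(keep_axes=(2,))
print(hash(a) != hash(b))  # True: the two configurations are kept distinct
```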
Perhaps we should update the `__hash__` methods for some of the `TensorReducerBase` subclasses as well.
Done
```python
def test_get_abs_max_channel_collector(self, inplace_statistics: bool):
    backend = self.get_backend()
    reduction_axes = (3, 2, 1)
```
Please test `self._backend_entity.get_abs_max_reducer_cls()` and `self._backend_entity.get_shape_reducer_cls()`.
Done
```python
if model_backend == BackendType.ONNX:
    keep_axes = (self._backend_entity.get_activation_channel_axis(node_to_smooth, input_act_port),)
    reduction_axes = None
else:
    keep_axes = None
    reduction_axes = self._calculate_input_reduction_axes(graph, node_to_smooth, input_act_port)
```
Usually we create a method in the backend to resolve such a situation; why don't you introduce a method in the backend? The comment could then be placed as a docstring for that method.
It helps simplify the code and avoid duplication.
```python
):
    stats = tensor_collector.get_statistics()
    shape = stats[SHAPE_BRANCH_KEY]
    shape = tuple() if shape is None else tuple(shape.tolist())
```
When could `shape` be `None`?
```python
def test_empty_statistics(self, mode, mocker):
```
If `shape` can be `None` only during testing and not in any real-life scenario, then I would suggest properly mocking the returned shape in tests, rather than adapting the algorithm logic to support `None` shapes.
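As a concrete illustration of this mocking suggestion, a minimal sketch using `unittest.mock` (the `"shape"` key stands in for NNCF's `SHAPE_BRANCH_KEY` constant, and the shape value is hypothetical):

```python
# Mock the shape statistic returned by the collector in tests instead of
# teaching the algorithm to accept a None shape.
from unittest.mock import MagicMock

shape_stat = MagicMock()
shape_stat.tolist.return_value = [1, 3, 224, 224]

tensor_collector = MagicMock()
tensor_collector.get_statistics.return_value = {"shape": shape_stat}

# The algorithm under test can then rely on a well-formed shape:
stats = tensor_collector.get_statistics()
shape = tuple(stats["shape"].tolist())
print(shape)  # (1, 3, 224, 224)
```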
Done
@ljaljushkin @nikita-savelyevv @daniil-lyakhov Please review
Changes

This PR introduces new `axes` and `axes_mode` parameters for `TensorReducerBase`. These parameters have the following meaning:

- `axes`: The axes along which the reduction operation should be applied. If `None`, the operation will be applied to all axes (i.e., `tuple(range(tensor.ndim))`).
- `axes_mode`: Determines how the specified `axes` are treated during the operation. Use `AxesMode.REDUCTION` to reduce over the given axes, or `AxesMode.KEEP` to preserve them.

These parameters are used to calculate the reduction axes (the `determine_reduction_axes()` method) during statistic collection, allowing us to avoid requiring the actual tensor shape before inference (only the number of dimensions, `ndim`, is required).

Modifies the `SmoothQuant` algorithm to use the `axes` and `axes_mode` parameters for the ONNX backend instead of relying on the tensor shape from the NNCF graph, as this shape isn't always available.

Related tickets
Ref: 173880, Ref: 174334
Tests
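The `axes`/`axes_mode` semantics described under Changes can be exercised with a small self-contained sketch (the enum and resolution helper below follow the description, not NNCF's exact implementation):

```python
from enum import Enum
from typing import Optional


class AxesMode(Enum):
    # Assumed enum matching the description: REDUCTION reduces over the
    # given axes; KEEP preserves them and reduces everything else.
    REDUCTION = "reduction"
    KEEP = "keep"


def resolve_reduction_axes(
    ndim: int, axes: Optional[tuple[int, ...]], axes_mode: AxesMode
) -> tuple[int, ...]:
    """Resolve the axes to reduce over, knowing only the tensor's ndim."""
    all_axes = tuple(range(ndim))
    if axes is None:
        # No axes given: reduce over every dimension.
        return all_axes
    if axes_mode is AxesMode.KEEP:
        # Preserve the named axes, reduce all others.
        return tuple(a for a in all_axes if a not in axes)
    return tuple(axes)


# Keeping channel axis 1 of a 4-D activation reduces axes (0, 2, 3):
print(resolve_reduction_axes(4, (1,), AxesMode.KEEP))  # (0, 2, 3)
print(resolve_reduction_axes(4, None, AxesMode.REDUCTION))  # (0, 1, 2, 3)
```

This is what lets the ONNX backend specify only the channel axis to keep, deferring the full reduction-axes computation until `ndim` is known at collection time.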