- 
                Notifications
    You must be signed in to change notification settings 
- Fork 262
[ONNX][SmoothQuant] Introduce new axes and axes_mode parameters #3687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ONNX][SmoothQuant] Introduce new axes and axes_mode parameters #3687
Conversation
|  | ||
| @staticmethod | ||
| def get_abs_max_reducer_cls() -> type[OVAbsMaxReducer]: | ||
| return OVAbsMaxReducer | ||
|  | ||
| @staticmethod | ||
| def get_shape_reducer_cls() -> type[OVShapeReducer]: | ||
| return OVShapeReducer | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add the get_abs_max_reducer_cls() and get_shape_reducer_cls() methods here because the OpenVINO backend uses the OVAbsMaxReducer and OVShapeReducer classes instead of AbsMaxReducer and ShapeReducer to enable in-place statistic collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we perhaps add a test with an ONNX model for which ndim is not known beforehand to have an example of why keep_dims approach is introduced?
| 
 Thank you for the suggestion. I’ll consider how to implement it. UPD: This problem is reproduced on timm/visformer_small model from the ptq scope. UPD:  | 
| def __init__(self, reduction_axes: Optional[ReductionAxes] = None, inplace: bool = False): | ||
| def __init__( | ||
| self, | ||
| reduction_axes: Optional[ReductionAxes] = None, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we forward this parameter in the children of  the TensorReducerBase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| def __init__( | ||
| self, | ||
| reduction_axes: Optional[ReductionAxes] = None, | ||
| keep_axes: Optional[tuple[int, ...]] = None, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| keep_axes: Optional[tuple[int, ...]] = None, | |
| keep_axes: Optional[Axes] = None, | 
Perhaps we could rename ReductionAxes and reuse them there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|  | ||
| def __hash__(self) -> int: | ||
| return hash((self.__class__.__name__, self.inplace, self._reduction_axes)) | ||
| return hash((self.__class__.__name__, self.inplace, self._reduction_axes, self._keep_axes)) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should update  __hash__ methods for some of the TensorReducerBase as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|  | ||
| def test_get_abs_max_channel_collector(self, inplace_statistics: bool): | ||
| backend = self.get_backend() | ||
| reduction_axes = (3, 2, 1) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test self._backend_entity.get_abs_max_reducer_cls() and _backend_entity.get_shape_reducer_cls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| if model_backend == BackendType.ONNX: | ||
| keep_axes = (self._backend_entity.get_activation_channel_axis(node_to_smooth, input_act_port),) | ||
| reduction_axes = None | ||
| else: | ||
| keep_axes = None | ||
| reduction_axes = self._calculate_input_reduction_axes(graph, node_to_smooth, input_act_port) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we create a method in the backend to resolve such situation, why don't you introduce a method in the backend? The comment could be placed as a docstring for the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It helps simplify the code and avoid duplication.
| ): | ||
| stats = tensor_collector.get_statistics() | ||
| shape = stats[SHAPE_BRANCH_KEY] | ||
| shape = tuple() if shape is None else tuple(shape.tolist()) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When shape could be None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def test_empty_statistics(self, mode, mocker): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If shape can be None only during testing and not in any real life scenario then I would suggest to properly mock the returned shape in tests, rather that adopting algorithm logic to support None shapes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| @ljaljushkin @nikita-savelyevv @daniil-lyakhov Please review | 
| ): | ||
| stats = tensor_collector.get_statistics() | ||
| shape = stats[SHAPE_BRANCH_KEY] | ||
| shape = tuple() if shape is None else tuple(shape.tolist()) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If shape can be None only during testing and not in any real life scenario then I would suggest to properly mock the returned shape in tests, rather that adopting algorithm logic to support None shapes.
Changes
This PR introduces a new
axesandaxes_modeparameters forTensorReducerBase. These parameters have the following meaning:axes: The axes along which the reduction operation should be applied. IfNone, the operation will be applied to all axes (i.e.,tuple(range(tensor.ndim))).axes_mode: Determines how the specifiedaxesare treated during the operation. UseAxesMode.REDUCTIONto reduce over the given axes, orAxesMode.KEEPto preserve them.These parameters are used to calculate the reduction axes (
determine_reduction_axes()method) during statistic collection, allowing us to avoid requiring the actual tensor shape (actually only number of dimensionsndimis required) before inference.Modifies the
SmoothQuantalgorithm to use theaxesandaxes_modeparameters for the ONNX backend instead of relying on the tensor shape from the NNCF graph, as this shape isn't always available.Related tickets
Ref: 173880, Ref: 174334
Tests