Updated reference with torch.compile #3381
Conversation
https://github.com/openvinotoolkit/nncf/actions/runs/14083174698/job/39440621184#step:8:4268
    orig_shape = grad_output.shape
    grad_output = grad_output.reshape(input_shape)
    # TODO:(nlyalyus) should be implemented via torch extensions, but some optimizations are required: ticket-161670
@ljaljushkin, @nikita-malininn what do you think about covering this TODO in this PR, since it didn't make it into the release and we probably have time to avoid leaving technical debt?
Agree, we have time now.
But, as we discussed, to decide between implementing a CUDA or Triton kernel for group-wise fake quantize, we first need to perform benchmarking (task 165734). torch.compile could be a good option.
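As a rough illustration of the torch.compile option discussed above (hypothetical function names, not NNCF's actual reference implementation), an elementwise fake-quantize function can be compiled so its ops are fused into a single kernel on supported backends:

```python
import torch


def fake_quantize(x: torch.Tensor, scale: torch.Tensor, levels: int = 256) -> torch.Tensor:
    """Reference symmetric fake quantize: snap to the integer grid, then dequantize."""
    qmin, qmax = -levels // 2, levels // 2 - 1
    q = torch.clamp(torch.round(x / scale), qmin, qmax)
    return q * scale


# backend="eager" keeps the example portable (no Triton/C++ codegen required);
# the default inductor backend is what would actually fuse the kernels.
compiled_fq = torch.compile(fake_quantize, backend="eager")

x = torch.randn(4, 16)
scale = torch.full((4, 1), 0.1)  # one scale per row, broadcast over the group
out = compiled_fq(x, scale)
assert torch.allclose(out, fake_quantize(x, scale))
```

For per-group (group-wise) quantization the same function applies after reshaping the input so each group maps to its own scale row; benchmarking would then compare the compiled version against a hand-written CUDA/Triton kernel.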
Force-pushed from 03689f0 to be7e640 (compare)
@alexsu52, @ljaljushkin, @AlexanderDokuchaev, review, please.
ljaljushkin
left a comment
No major comments from my side; only a rebase is needed.
    torch_executor = ReferenceQuantize(backend_type=ReferenceBackendType.TORCH)
    torch_forward = CompilationWrapper(torch_executor.forward)
    torch_backward = CompilationWrapper(torch_executor.backward)
Wrapped functions lose their annotations.

If the decorator were written as a function, the correct approach looks like:
nncf/nncf/common/utils/caching.py
Line 61 in dd99ed5
(it uses functools.wraps and a TypeVar to keep the function's signature and docstring)

As written, the wrapper breaks suggestions in editors and argument checking by mypy, but I don't know how to do the same for a class.
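For the class-based case raised above, one option (a sketch against a minimal stand-in for the wrapper class, not NNCF's actual code) is to call functools.update_wrapper on the instance in __init__, so the wrapper keeps the function's name, docstring, and annotations, and inspect.signature() can follow __wrapped__:

```python
import functools


class CompilationWrapper:
    """Minimal sketch of a callable wrapper that preserves the wrapped
    function's runtime metadata (name, docstring, annotations)."""

    def __init__(self, fn):
        self._fn = fn
        # Copies __name__, __doc__, __annotations__, etc. onto the instance
        # and sets self.__wrapped__ = fn, which inspect.signature() follows.
        functools.update_wrapper(self, fn)

    def __call__(self, *args, **kwargs):
        return self._fn(*args, **kwargs)


def quantize(x: float, scale: float = 0.1) -> float:
    """Round x to the nearest multiple of scale."""
    return round(x / scale) * scale


wrapped = CompilationWrapper(quantize)
assert wrapped.__doc__ == quantize.__doc__
assert wrapped.__annotations__ == quantize.__annotations__
assert abs(wrapped(0.26) - 0.3) < 1e-9
```

This fixes runtime introspection (help(), inspect.signature, docstrings). Static checkers like mypy still see CompilationWrapper.__call__, so full static typing would additionally need a Generic wrapper parameterized with ParamSpec (Python 3.10+); that part is a separate change.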
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Changes
Reason for changes
Related tickets
Tests
examples/llm_compression/torch/qat_with_lora example times:
tests times:
Reopened #3343