Updated reference with torch.compile #3381

Merged
alexsu52 merged 17 commits into openvinotoolkit:develop from nikita-malininn:nm/ref_compile
Apr 28, 2025

Conversation

@nikita-malininn
Contributor

@nikita-malininn nikita-malininn commented Mar 26, 2025

Changes

  • Added torch.compile for forward & backward in reference implementation.

Reason for changes

  • Training speed-up from 7 minutes to 5 minutes for 1 epoch of phi3.5 qat-lora tuning

Related tickets

  • 163973

Tests

examples/llm_compression/torch/qat_with_lora example times:

| Epoch | Branch | Time |
|---|---|---|
| Epoch 0 | nm/ref_compile | 4m 10s |
| Epoch 0 | develop | 4m 30s |

tests times:

| Test | Branch | Time |
|---|---|---|
| tests/torch/quantization/test_strip.py | develop | 36.99s |
| tests/torch/quantization/test_strip.py | nm/ref_compile | 49.80s |
| tests/torch/ptq/test_fq_lora.py | develop | 19.51s |
| tests/torch/ptq/test_fq_lora.py | nm/ref_compile | 23.45s |

Reopened #3343

@github-actions github-actions Bot added the NNCF PT Pull requests that updates NNCF PyTorch label Mar 26, 2025
@nikita-malininn nikita-malininn marked this pull request as ready for review March 26, 2025 13:15
@nikita-malininn nikita-malininn requested a review from a team as a code owner March 26, 2025 13:15
@nikita-malininn
Contributor Author

https://github.com/openvinotoolkit/nncf/actions/runs/14083174698/job/39440621184#step:8:4268

WARNING:nncf:Could not use torch.compile with reference functions. Falling back on not compiled versions - Reason: Windows not yet supported for torch.compile
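
The warning above points to a try-to-compile, then fall back pattern. A minimal sketch of that pattern follows; it is not NNCF's actual code. The names `compile_with_fallback`, `compile_fn`, and `unsupported_compile` are hypothetical. In practice `compile_fn` would presumably be `torch.compile`, but it is injected here so the example runs without PyTorch.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("fallback_sketch")


def compile_with_fallback(
    func: Callable[..., Any],
    compile_fn: Callable[[Callable[..., Any]], Callable[..., Any]],
) -> Callable[..., Any]:
    """Return compile_fn(func), or func itself if compilation raises."""
    try:
        return compile_fn(func)
    except Exception as exc:  # broad on purpose: any failure means "use the original"
        logger.warning(
            "Could not compile %s. Falling back on the non-compiled version. Reason: %s",
            func.__name__,
            exc,
        )
        return func


def unsupported_compile(func):
    # Hypothetical stand-in for a compiler that is unavailable on this platform.
    raise RuntimeError("Windows not yet supported for torch.compile")


def double(x):
    return 2 * x


safe_double = compile_with_fallback(double, unsupported_compile)
result = safe_double(21)  # the original, non-compiled function runs
```

Because `torch.compile` typically defers the real compilation work to the first call, a fuller wrapper may also need to guard that first invocation; the sketch only covers failures raised while wrapping.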

Comment thread nncf/torch/quantization/reference.py Outdated
@nikita-malininn nikita-malininn marked this pull request as draft March 27, 2025 07:13
@nikita-malininn nikita-malininn marked this pull request as ready for review March 27, 2025 16:01
Comment thread nncf/torch/quantization/reference.py Outdated
orig_shape = grad_output.shape
grad_output = grad_output.reshape(input_shape)

# TODO:(nlyalyus) should be implemented via torch extensions, but some optimizations are required: ticket-161670
Contributor

@ljaljushkin, @nikita-malininn what do you think about covering this TODO in this PR? It didn't make it into the release, so we probably have time to avoid leaving technical debt.

Contributor

Agreed, we have time now.
But, as we discussed, before deciding whether to implement a CUDA or Triton kernel for group-wise fake quantize, we first need to do benchmarking (task 165734). torch.compile could be a good option.
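
For the benchmarking step mentioned above, a minimal timing harness might look like the sketch below. Everything in it is an assumption for illustration: the `benchmark` helper, its `warmup` and `repeats` parameters, and the placeholder workloads are hypothetical, and real candidates would call the kernels being compared.

```python
import time
from statistics import median
from typing import Callable, Dict


def benchmark(
    candidates: Dict[str, Callable[[], object]],
    warmup: int = 3,
    repeats: int = 20,
) -> Dict[str, float]:
    """Return the median wall-clock seconds per call for each candidate."""
    results = {}
    for name, fn in candidates.items():
        for _ in range(warmup):  # untimed calls absorb one-off costs such as compilation
            fn()
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        results[name] = median(samples)
    return results


# Placeholder workloads (hypothetical); they only make the sketch runnable.
def eager_candidate():
    return sum(i * i for i in range(1000))


def compiled_candidate():
    return sum(i * i for i in range(1000))


timings = benchmark({"eager": eager_candidate, "compiled": compiled_candidate})
```

The warm-up loop matters for compiled candidates, since the first calls usually include compilation time. For GPU kernels, which launch asynchronously, the device would also need to be synchronized before each clock read; that is left out here.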

Comment thread nncf/torch/quantization/reference.py Outdated
@nikita-malininn nikita-malininn marked this pull request as draft April 11, 2025 09:32
@nikita-malininn nikita-malininn marked this pull request as ready for review April 17, 2025 10:08
@nikita-malininn
Contributor Author

@alexsu52, @ljaljushkin, @AlexanderDokuchaev, review, please.

Comment thread tests/torch/test_utils.py Outdated
Comment thread nncf/torch/quantization/reference.py
Contributor

@ljaljushkin ljaljushkin left a comment

No major comments from my side.
Only a rebase is needed.

Comment thread nncf/torch/utils.py Outdated

torch_executor = ReferenceQuantize(backend_type=ReferenceBackendType.TORCH)
torch_forward = CompilationWrapper(torch_executor.forward)
torch_backward = CompilationWrapper(torch_executor.backward)
Collaborator

The wrapped functions lose their annotations.

If the decorator is written as a function, a correct version looks like this:

def cache_results(cache: ResultsCache) -> Callable[[TFunc], TFunc]:

(it uses functools.wraps and TypeVar to keep the signature and docstring of the function)
Losing the annotations breaks editor suggestions and mypy argument checks, but I don't know how to do this for a class.
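
To make the point above concrete, here is a minimal sketch of both variants. It is an illustration, not the code from this PR: `passthrough`, `CallableWrapper`, and `scale` are hypothetical names. The function decorator uses `functools.wraps` with a bound `TypeVar`; the class variant uses `functools.update_wrapper`, which copies the runtime metadata onto the instance.

```python
import functools
from typing import Any, Callable, TypeVar, cast

TFunc = TypeVar("TFunc", bound=Callable[..., Any])


def passthrough(func: TFunc) -> TFunc:
    """Function-style decorator that keeps the wrapped signature for type checkers."""

    @functools.wraps(func)  # copies __name__, __doc__, etc. and sets __wrapped__
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)

    return cast(TFunc, wrapper)


class CallableWrapper:
    """Class-style wrapper (hypothetical) that forwards calls to the wrapped function."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func
        # Copies __name__, __doc__, __annotations__ onto this instance and sets __wrapped__.
        functools.update_wrapper(self, func)

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self._func(*args, **kwargs)


def scale(x: float, factor: float = 2.0) -> float:
    """Multiply x by factor."""
    return x * factor


wrapped_fn = passthrough(scale)
wrapped_obj = CallableWrapper(scale)
```

Note that `update_wrapper` only restores runtime metadata. A static checker still sees `CallableWrapper.__call__` as accepting anything, so preserving the signature for mypy would likely need the class to be generic, for example over `typing.ParamSpec` and a return `TypeVar` (Python 3.10+).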

Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
@alexsu52 alexsu52 merged commit 3f7d9e4 into openvinotoolkit:develop Apr 28, 2025
20 checks passed
