Conversation

@afierka-intel (Collaborator)

Porting the FP8 calibration procedure from vllm-hpu-extension: https://github.com/HabanaAI/vllm-hpu-extension/tree/main/calibration

Signed-off-by: Artur Fierka <[email protected]>
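
For readers unfamiliar with the technique, below is a minimal sketch of the general idea behind FP8 calibration: run a small calibration set through the model, track per-tensor absolute maxima, and derive scaling factors that map observed values into the FP8 (E4M3) range. All names and structure here are illustrative assumptions, not the actual vllm-hpu-extension calibration API; see the linked repository for the real procedure.

```python
# Hypothetical sketch of FP8 calibration, not the vllm-hpu-extension implementation.
import torch

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn


class MaxObserver:
    """Tracks the running absolute maximum of tensors seen by one layer."""

    def __init__(self) -> None:
        self.amax = 0.0

    def observe(self, x: torch.Tensor) -> None:
        self.amax = max(self.amax, x.abs().max().item())

    def scale(self) -> float:
        # Choose the scale so the observed maximum maps onto the FP8 maximum.
        return self.amax / FP8_E4M3_MAX if self.amax > 0 else 1.0


def calibrate(model, calibration_batches, observers: dict) -> dict:
    """Run calibration data through the model while observers collect statistics."""
    with torch.no_grad():
        for batch in calibration_batches:
            # Forward hooks (not shown) would feed intermediate tensors
            # to the observers registered per layer.
            model(batch)
    # The resulting per-tensor scales would then be written to a measurement
    # file that the quantized serving run loads.
    return {name: obs.scale() for name, obs in observers.items()}
```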

github-actions bot commented Oct 3, 2025

🚧 CI Blocked

The main CI workflow was not started for the following reason:

This is a Draft PR. Please mark it as 'Ready for Review' to trigger the CI.

Signed-off-by: Artur Fierka <[email protected]>

@afierka-intel (Collaborator, Author)

Thank you @skavulya for all the proposed fixes! The code has been updated and validated with Llama-3.1-8B-Instruct. Accuracy of the quantized FP8 model is marginally lower than BF16.
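
As an illustration of what "marginally lower" can mean in practice, here is a small hedged sketch of how an FP8-vs-BF16 accuracy delta could be gated in a validation script. The function name, tolerance, and scores are placeholders, not numbers from this PR.

```python
# Hypothetical accuracy gate: fail if the relative drop from BF16 to FP8
# exceeds a chosen tolerance. Values below are made up for illustration.
def check_accuracy_drop(bf16_acc: float, fp8_acc: float, rel_tol: float = 0.01) -> None:
    drop = (bf16_acc - fp8_acc) / bf16_acc
    assert drop <= rel_tol, (
        f"FP8 accuracy dropped {drop:.2%}, above the {rel_tol:.0%} tolerance"
    )


# Example with placeholder scores: a ~0.4% relative drop passes a 1% tolerance.
check_accuracy_drop(bf16_acc=0.805, fp8_acc=0.802)
```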

github-actions bot

✅ CI Passed

All checks passed successfully against the following vllm commit:
fdd32750f0bce9edd05f85c3550d6ebc3b06931f

github-actions bot

✅ CI Passed

All checks passed successfully against the following vllm commit:
f57438338d819c8e3e7e70293281c575ebd77411

@michalkuligowski (Collaborator) left a comment

Please change extension mentions

github-actions bot

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

github-actions bot

✅ CI Passed

All checks passed successfully against the following vllm commit:
17838e50efc9e686fffc901dfed811b15b7d614a

@afierka-intel (Collaborator, Author)

/skip-gaudi-tests

@sys-hab-pt-service (Collaborator)

Only codeowners and testowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, mswiniarsk, adobrzyn, mgawarkiewicz-intel, vivekgoe, afierka-intel, michalkuligowski, iboiko-habana, PatrykWo, kamil-kaczor, kfojcik-intel, ksmusz, wuxun-zhang, xuechendi, attafosu, ulivne, Kacper-Pietkun, iboiko-habana, jkaniecki

github-actions bot

✅ CI Passed

All checks passed successfully against the following vllm commit:
17838e50efc9e686fffc901dfed811b15b7d614a

@afierka-intel merged commit 7d24df2 into vllm-project:main on Oct 17, 2025
37 checks passed
hlahkar pushed a commit to hlahkar/vllm-gaudi that referenced this pull request on Oct 24, 2025
Porting the FP8 calibration procedure from vllm-hpu-extension:
https://github.com/HabanaAI/vllm-hpu-extension/tree/main/calibration

---------

Signed-off-by: Artur Fierka <[email protected]>