Add fp8 calibration procedure #309

afierka-intel · 2025-10-03T09:23:36Z

Porting the FP8 calibration procedure from vllm-hpu-extension: https://github.com/HabanaAI/vllm-hpu-extension/tree/main/calibration

Signed-off-by: Artur Fierka <[email protected]>

github-actions · 2025-10-03T09:23:48Z

🚧 CI Blocked

The main CI workflow was not started for the following reason:

This is a Draft PR. Please mark it as 'Ready for Review' to trigger the CI.

calibration/step-2-measure-scales.py

calibration/step-4-quantize-scales.py

calibration/step-3-postprocess-measure.py

calibration/vlm-calibration/calibrate_model.sh

afierka-intel · 2025-10-15T08:42:35Z

Thank you, @skavulya, for all the proposed fixes! The code is updated and validated with Llama3.1-8B-Instruct. Accuracy of the quantized FP8 model is marginally lower than that of the BF16 model.

github-actions · 2025-10-15T13:05:11Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
fdd32750f0bce9edd05f85c3550d6ebc3b06931f

github-actions · 2025-10-15T17:28:48Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
f57438338d819c8e3e7e70293281c575ebd77411

calibration/README.md

michalkuligowski

Please change extension mentions

github-actions · 2025-10-16T08:41:29Z

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

github-actions · 2025-10-16T11:19:39Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
17838e50efc9e686fffc901dfed811b15b7d614a

afierka-intel · 2025-10-16T12:46:27Z

/skip-gaudi-tests

sys-hab-pt-service · 2025-10-16T12:46:49Z

Only codeowners and testowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, mswiniarsk, adobrzyn, mgawarkiewicz-intel, vivekgoe, afierka-intel, michalkuligowski, iboiko-habana, PatrykWo, kamil-kaczor, kfojcik-intel, ksmusz, wuxun-zhang, xuechendi, attafosu, ulivne, Kacper-Pietkun, iboiko-habana, jkaniecki

github-actions · 2025-10-16T14:33:31Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
17838e50efc9e686fffc901dfed811b15b7d614a

Add fp8 calibration files

baa9fd5

Signed-off-by: Artur Fierka <[email protected]>

Merge branch 'main' into dev/afierka/calibration-procedure

9778592

skavulya reviewed Oct 4, 2025

calibration/vlm-calibration/calibrate_model.sh

Merge branch 'vllm-project:main' into dev/afierka/calibration-procedure

47e91ea

Fix raised issues

fac8e04

Signed-off-by: Artur Fierka <[email protected]>

Merge branch 'main' into dev/afierka/calibration-procedure

f9b62fa

afierka-intel marked this pull request as ready for review October 15, 2025 08:42

afierka-intel requested review from adobrzyn, kzawora-intel, mgawarkiewicz-intel, michalkuligowski, mswiniarsk, vivekgoe and xuechendi as code owners October 15, 2025 08:42

Merge branch 'main' into dev/afierka/calibration-procedure

21fb6d8

michalkuligowski approved these changes Oct 16, 2025

calibration/README.md

michalkuligowski requested changes Oct 16, 2025

Update README and requirements

499bf57

Signed-off-by: Artur Fierka <[email protected]>

Merge branch 'main' into dev/afierka/calibration-procedure

cd4d3a5

afierka-intel requested a review from michalkuligowski October 16, 2025 08:43

Merge branch 'main' into dev/afierka/calibration-procedure

e973528

afierka-intel added 2 commits October 16, 2025 14:12

Merge branch 'main' into dev/afierka/calibration-procedure

c43f522

Remove references to extension

63d44fb

Signed-off-by: Artur Fierka <[email protected]>

michalkuligowski approved these changes Oct 16, 2025

afierka-intel merged commit 7d24df2 into vllm-project:main Oct 17, 2025
37 checks passed
