1 parent 5991232 commit 3678702
_posts/2025-02-24-ptpc-fp8-rocm.md

````diff
@@ -7,8 +7,6 @@ thumbnail-img: /assets/figures/ptpc/PTPC-tumbnail.png
 share-img: /assets/figures/ptpc/PTPC-tumbnail.png
 ---
-# **Boosting vLLM Performance on AMD ROCm: PTPC-FP8 Quantization Unleashes Speed and Accuracy**
-
 **TL;DR**: vLLM on AMD ROCm now has better FP8 performance!

 * **What's new?** [PTPC-FP8 quantization](https://github.com/vllm-project/vllm/pull/12501) is now supported in vLLM (v0.7.3+) on AMD ROCm.
@@ -297,4 +295,4 @@ lm_eval \
 --model vllm \
 --model_args pretrained=$MODEL,add_bos_token=True,kv_cache_dtype=auto \
 --tasks gsm8k --num_fewshot 5 --batch_size auto --limit 250
-```
+```
````
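As a usage sketch (not part of the diff above), the feature described in the changed post could be exercised when launching a vLLM server on ROCm; the model name and exact flag value below are assumptions based on the linked PR, not confirmed by this commit:

```shell
# Hypothetical launch command on an AMD ROCm host.
# "ptpc_fp8" is assumed to be the quantization mode name added in
# vllm-project/vllm#12501; the model is an illustrative example.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --quantization ptpc_fp8 \
    --dtype auto
```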