
[Quantization] support quark ptpc for sglang #2

Open
haoyangli0109 wants to merge 3 commits into Yuechguo:v0.4.10.post2_new_feature from haoyangli0109:lhy/quark_ptpc1

Support Quark PTPC for SGLang.

kkHuang-amd and others added 2 commits September 19, 2025 12:08
Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>
Co-authored-by: Haoyang Li <haoyang.li@amd.com>
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
@haoyangli0109 haoyangli0109 marked this pull request as ready for review September 22, 2025 08:32
@haoyangli0109 (Author)

I am currently checking the inference speed of the Quark format.

Co-authored-by: Haoyang Li <haoyang.li@amd.com>
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
haoyangli0109 (Author) commented Sep 24, 2025

| Qwen3-32B | quark | llm |
| --- | --- | --- |
| TTFT (ms) | 6755.5 | 6898.47 |
| ITL (ms) | 61.71 | 61.93 |
| gsm8k | 0.865 | 0.850 |
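
To make the comparison easier to read, the figures posted above can be turned into relative differences. The following is a minimal sketch, not part of the PR: the helper names `relative_diff_percent` and `summarize` are hypothetical, the numbers are copied verbatim from the comment, and the column labels `quark` and `llm` are kept exactly as posted.

```python
# Figures copied from the benchmark comment; column labels kept as posted.
results = {
    "TTFT (ms)": {"quark": 6755.5, "llm": 6898.47},
    "ITL (ms)": {"quark": 61.71, "llm": 61.93},
    "gsm8k": {"quark": 0.865, "llm": 0.850},
}


def relative_diff_percent(quark, llm):
    """Percent difference of the quark value relative to the llm value (hypothetical helper)."""
    return (quark - llm) / llm * 100.0


def summarize(table):
    """Map each metric to its relative difference in percent, rounded to 2 decimals."""
    return {
        metric: round(relative_diff_percent(v["quark"], v["llm"]), 2)
        for metric, v in table.items()
    }


if __name__ == "__main__":
    for metric, pct in summarize(results).items():
        print(f"{metric}: {pct:+.2f}%")
```

On these numbers, the quark column is about 2% lower on TTFT, under 0.5% lower on ITL, and just under 2% higher on gsm8k. The comment reports a single set of measurements, so differences of this size may be within run-to-run noise.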

amd-youchen pushed a commit to amd-youchen/sglang that referenced this pull request Nov 13, 2025