@jiafei96 commented Jul 9, 2025

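ROCm does not currently support flashinfer, so multi-batch inference cannot run on it. This PR therefore adds ktransformers support.
Tested on the ROCm platform with a HIP-architecture chip, using input/output lengths of 2048/128, 1024/1024, and 4096/1024. R1 prefill and decode improved by up to 50% and 105%, respectively.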
