Regulate flash_attn_varlen_fp8_pertensor_func according to precision issue by apinge · Pull Request #123 · zejunchen-zejun/sglang

apinge · 2025-12-29T09:40:28Z

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the "Format code with pre-commit" guide.
Add unit tests according to the "Run and add unit tests" guide.
Update documentation according to the "Write documentations" guide.
Provide accuracy and speed benchmark results according to the "Test the accuracy" and "Benchmark the speed" guides.

…issue Signed-off-by: apinge <tong.qiu2@amd.com>

sammysun0711 · 2025-12-30T06:53:55Z

#125 refactors KV quantization so that it is fused into the mrope RMS kernel. Keep this PR on hold if the new approach can fix the Qwen3-MoE accuracy issue.

apinge and others added 2 commits December 29, 2025 17:39

regulate flash_attn_varlen_fp8_pertensor_func according to precision …

3928f64

…issue Signed-off-by: apinge <tong.qiu2@amd.com>

Merge branch 'dev/perf' into flash_attn_fp8

5e9edd4

ZLkanyo009 force-pushed the dev/perf branch 2 times, most recently from f8668d9 to c29e1c2 · March 13, 2026 08:32

apinge wants to merge 2 commits into zejunchen-zejun:dev/perf from apinge:flash_attn_fp8


2 participants
