Optimize DeepSeek V4 qkv_proj_rope decode (S=2) via partial-sum reduc…

6cfd357
Merged

Optimize DeepSeek V4 qkv_proj_rope decode (S=2): partial-sum reduces, amax fold, K-tile/stage tuning #339

Annotations

1 warning
unit-tests
succeeded May 21, 2026 in 33s