Merged
Changes from 15 commits
36 commits
bc363fa
add MX scale pre-swizzling for gfx1250
matthiasdiener Apr 27, 2026
a6ca3af
switch to mxfp4
matthiasdiener Apr 27, 2026
d1ee5bd
tensile-like implementation
matthiasdiener Apr 28, 2026
d1647ee
Merge remote-tracking branch 'upstream/dev' into mdiener/mxfp8-swizzle
matthiasdiener Apr 29, 2026
1fff6d9
Merge remote-tracking branch 'origin/dev' into mdiener/mxfp8-swizzle
matthiasdiener May 1, 2026
d714038
gfx1250 swizzle_xor changes for FP4
matthiasdiener May 1, 2026
76ca4b1
change line endings to unix, trim trailing whitespace
matthiasdiener May 1, 2026
81a0a27
Merge branch 'mdiener/swizzle_xor-1250' into mdiener/mxfp8-swizzle
matthiasdiener May 1, 2026
2991bcf
fix arch
matthiasdiener May 1, 2026
8ceb89c
[WIP] e2e gemm test, not working yet
matthiasdiener May 1, 2026
167d2eb
fix for gfx1250
matthiasdiener May 3, 2026
5d46537
k-tile
matthiasdiener May 3, 2026
313a6b7
extend tests
matthiasdiener May 3, 2026
2a8eeb5
remove ifdef
matthiasdiener May 3, 2026
c37a781
undo BLK32_UE8M0_32_8_EXT
matthiasdiener May 4, 2026
5d2d38f
Merge remote-tracking branch 'upstream/dev' into mdiener/mxfp8-swizzle
matthiasdiener May 5, 2026
f093f64
Revert "change line endings to unix, trim trailing whitespace"
matthiasdiener May 5, 2026
ecbffea
Revert "gfx1250 swizzle_xor changes for FP4"
matthiasdiener May 5, 2026
6855218
Claude PR review use OIDC-free method (#560)
Micky774 May 7, 2026
a0b88f4
gfx1250 swizzle_xor changes for FP4 (#571)
matthiasdiener May 9, 2026
27f4acd
NVFP4: Work around intermittent incorrect results for backward GEMMs …
matthiasdiener May 13, 2026
33fca6e
Merge remote-tracking branch 'origin/dev' into mdiener/mxfp8-swizzle
matthiasdiener May 13, 2026
b55a538
address review comments
matthiasdiener May 13, 2026
398cc3c
cleanups
matthiasdiener May 13, 2026
384d590
re-add scale swizzle hooks in GEMM paths for gfx1250
matthiasdiener May 13, 2026
5c5a902
cleanups
matthiasdiener May 13, 2026
2c05ec5
arch fixes
matthiasdiener May 14, 2026
5552b09
more test fixes gfx1250
matthiasdiener May 18, 2026
5cb098b
RMS Norm Optimization (#583)
aris134 May 18, 2026
bdee033
Merge remote-tracking branch 'origin/dev' into mdiener/mxfp8-swizzle
matthiasdiener May 19, 2026
90db6f4
address review comments
matthiasdiener May 19, 2026
2a6302d
additional padding
matthiasdiener May 19, 2026
03e33b1
Revert "Claude PR review use OIDC-free method (#560)"
matthiasdiener May 21, 2026
96254fa
Revert "RMS Norm Optimization (#583)"
matthiasdiener May 21, 2026
b83a2d9
revert unnecessary changes for gfx1250
matthiasdiener May 21, 2026
bea6b18
remove extra guards
matthiasdiener May 21, 2026
4 changes: 2 additions & 2 deletions tests/cpp/operator/CMakeLists.txt
@@ -31,11 +31,11 @@ list(APPEND test_cuda_sources
   test_multi_unpadding.cu
   test_causal_softmax.cu
   test_swap_first_dims.cu
+  test_swizzle.cu
   ../test_common.cu)
 if(USE_CUDA)
   list(APPEND test_cuda_sources
-    test_cast_float8blockwise.cu
-    test_swizzle.cu)
+    test_cast_float8blockwise.cu)
 else()
   list(APPEND test_cuda_sources
     test_cublaslt_gemm.cu