[MetaXGPU] Add compiler-path C500 hgemm route #9
Open
VitalyAnkh wants to merge 2 commits into
Code Review
This pull request adds comprehensive support for MetaX C500 GPUs in GEMM operations, introducing specialized kernel paths such as BSM, split-K, and packed-B tile layouts. It includes several new kernel implementations, hardware-specific C++ headers, and auto-dispatch logic. The review feedback recommends defaulting compilation flags to enable MetaX-specific optimizations automatically on supported hardware and moving split-K validation logic to the initialization phase to prevent runtime crashes during tensor preparation.
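The second recommendation, validating split-K settings at initialization instead of during tensor preparation, could look roughly like the sketch below. `SplitKConfig`, its fields, and the divisibility rule are hypothetical illustrations, not the PR's actual classes or constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitKConfig:
    """Hypothetical container for split-K settings (not taken from the PR)."""

    k: int        # reduction dimension of the GEMM
    split_k: int  # number of K slices computed independently
    block_k: int  # K tile size consumed per kernel iteration

    def __post_init__(self):
        # Fail at construction time so that an invalid configuration
        # never reaches the tensor-preparation stage.
        if self.split_k < 1:
            raise ValueError(f"split_k must be >= 1, got {self.split_k}")
        if self.block_k < 1:
            raise ValueError(f"block_k must be >= 1, got {self.block_k}")
        # Assumed constraint: every K slice is a whole number of K tiles.
        if self.k % (self.split_k * self.block_k) != 0:
            raise ValueError(
                f"k={self.k} is not divisible by "
                f"split_k * block_k = {self.split_k * self.block_k}"
            )

    @property
    def k_per_split(self) -> int:
        """Length of the K range handled by each slice."""
        return self.k // self.split_k
```

With this shape, an unsupported combination raises `ValueError` when the config object is created, before any buffers are packed or allocated.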
dfd55e4 to 351ac49
Author
Addressed in the current head.
I will continue monitoring this PR together with the companion TileLang PR.
351ac49 to 2d70e17
Keep GemmOp auto/default dispatch on the TileLang GemmKernel and reject direct maca_hgemm/maca_auto backend overrides for new hgemm work.
Add the MACA BSM compiler path used by the packed-B split-K result, including prepared-B packing, prepared-B caching, split-K reduction, and C500 defaults.
Validation:
git diff --cached --check
./.venv/bin/python -m py_compile tileops/ops/gemm.py tileops/kernels/gemm/gemm.py tileops/kernels/gemm/maca_auto.py tests/ops/test_gemm_auto_dispatch.py
2d70e17 to 955969a
Hi maintainers,
This PR routes the C500 hgemm path in tileops.ops.gemm.GemmOp through the TileLang compiler-generated backend. The handwritten maca_hgemm MACA C implementation remains a performance, layout, and shape reference; it is not used as the optimized execution path from GemmOp.
Summary
GemmOp path.
0.1.10, which provides the GEMM annotation surface used by this route.
Rebase update
dev.
T.sblock API.
Validation
172.076209 to 205.316929 TFLOPS.
89.68%.
Notes
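As background for the TFLOPS figures under Validation, the sketch below shows the usual way GEMM throughput is derived and compares the two reported numbers. Reading them as a before/after pair is an assumption, since the surrounding text did not survive. The helper names are hypothetical, and the 89.68% figure is left out because its reference point is not stated.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput of one M x N x K GEMM, counting 2*M*N*K floating-point ops."""
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    return flops / seconds / 1e12


def relative_gain(before_tflops: float, after_tflops: float) -> float:
    """Ratio of the second throughput figure to the first."""
    return after_tflops / before_tflops


# Figures quoted in the PR description; the before/after reading is assumed.
gain = relative_gain(172.076209, 205.316929)
print(f"relative throughput: {gain:.3f}x")
```

A ratio above 1.0 would mean the second configuration is faster under that reading.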