[MetaXGPU] Add compiler-path C500 hgemm route by VitalyAnkh · Pull Request #9 · MetaX-MACA/TileOPs-Metax

VitalyAnkh · 2026-05-20T09:23:52Z

Hi maintainers,

This PR routes the C500 hgemm path in tileops.ops.gemm.GemmOp through the TileLang compiler-generated backend. The handwritten maca_hgemm MACA C implementation remains a performance, layout, and shape reference; it is not used as the optimized execution path from GemmOp.

Summary

Adds the compiler-path MetaX C500 hgemm route with packed-B and split-K support.
Keeps the helper and wrapper pieces needed to preserve the validated layout contract.
Updates auto-dispatch coverage for the new GemmOp path.
Requires TileLang 0.1.10, which provides the GEMM annotation surface used by this route.
Keeps the route paired with the TileLang lowering and layout support in the companion PR.
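As a rough illustration of the split-K idea mentioned in the summary, the sketch below splits the K dimension into equal slices, computes one partial product per slice, and sums the partials. It is a pure-Python reference under stated assumptions, not the compiler-generated kernel: the names `matmul_k_slice` and `split_k_matmul` are hypothetical, the divisibility rule is assumed, and the packed-B layout is not modeled.

```python
from typing import List

Matrix = List[List[float]]


def matmul_k_slice(a: Matrix, b: Matrix, k_begin: int, k_end: int) -> Matrix:
    """Partial product A[:, k_begin:k_end] @ B[k_begin:k_end, :]."""
    m, n = len(a), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for k in range(k_begin, k_end):
            a_ik = a[i][k]
            for j in range(n):
                out[i][j] += a_ik * b[k][j]
    return out


def split_k_matmul(a: Matrix, b: Matrix, split_k: int) -> Matrix:
    """Hypothetical reference: split K into `split_k` equal slices, then reduce."""
    k = len(b)
    if split_k < 1 or k % split_k != 0:
        # Assumed rule for this sketch only: K divides evenly across slices.
        raise ValueError(f"K={k} must be divisible by split_k={split_k}")
    chunk = k // split_k
    m, n = len(a), len(b[0])
    result = [[0.0] * n for _ in range(m)]
    for s in range(split_k):
        # On a GPU each slice would run in its own thread block.
        partial = matmul_k_slice(a, b, s * chunk, (s + 1) * chunk)
        # Reduction step: accumulate the partial result into the output.
        for i in range(m):
            for j in range(n):
                result[i][j] += partial[i][j]
    return result
```

With small integer-valued inputs, `split_k_matmul(a, b, 2)` matches the unsplit product `matmul_k_slice(a, b, 0, K)` exactly; with half-precision data the summation order can change the rounding, which is one reason a correctness sweep is needed.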

Rebase update

Rebased onto the latest upstream dev.
Updated the direct hgemm block scopes to the current TileLang T.sblock API.
Revalidated the paired TileOps/TileLang stack after the TileLang WSM contract fix.

Validation

Production-shape hgemm sweep passed correctness on all 8 covered shapes.
Current measured throughput range: 172.076209 to 205.316929 TFLOPS.
Minimum A100-relative ratio in the sweep: 89.68%.
A representative compiler-path WSM fallback guard passed correctness with the paired TileLang PR.
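For readers who want to relate these figures to timings, the sketch below shows the usual way a GEMM throughput number and a relative ratio are derived. It assumes the common convention of counting 2·M·N·K floating-point operations per GEMM; the PR does not state how its numbers were measured, and the helper names `gemm_tflops` and `relative_ratio_percent` are hypothetical.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in TFLOPS, assuming 2*M*N*K FLOPs per GEMM (one multiply, one add)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 2.0 * m * n * k / seconds / 1e12


def relative_ratio_percent(measured_tflops: float, baseline_tflops: float) -> float:
    """Measured throughput as a percentage of a baseline throughput."""
    if baseline_tflops <= 0:
        raise ValueError("baseline_tflops must be positive")
    return 100.0 * measured_tflops / baseline_tflops
```

The shape and timing passed to these helpers would come from the benchmark harness; none are supplied here because the sweep's per-shape timings are not part of this description.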

Notes

Companion TileLang PR: [MetaxGPU][feature] Add MACA GEMM compiler-path layout support tile-ai/tilelang-metax#90
The paired TileLang lowering and layout support is required for this route; both sides were validated together.
The optimization stays on the compiler-generated MACA C implementation path rather than calling the handwritten reference kernel directly.

gemini-code-assist

Code Review

This pull request adds comprehensive support for MetaX C500 GPUs in GEMM operations, introducing specialized kernel paths such as BSM, split-K, and packed-B tile layouts. It includes several new kernel implementations, hardware-specific C++ headers, and auto-dispatch logic. The review feedback recommends defaulting compilation flags to enable MetaX-specific optimizations automatically on supported hardware and moving split-K validation logic to the initialization phase to prevent runtime crashes during tensor preparation.

VitalyAnkh · 2026-05-21T08:27:42Z

Addressed in the current head.

_gemm_compile_flags now takes use_maca: Optional[bool] = None and defaults to runtime C500 detection when no explicit value is provided.
The TILEOPS_GEMM_SPLIT_K compatibility checks for block_k now run immediately after init_config, before the prepared-B path can hit a less helpful shape error.
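The two changes above follow a common pattern, sketched below under stated assumptions. Everything here is illustrative rather than the code in this PR: `detect_c500`, the placeholder flag strings, `validate_split_k`, `SketchGemmKernel`, and the divisibility rule are all hypothetical stand-ins. The real `_gemm_compile_flags` and the real `TILEOPS_GEMM_SPLIT_K` checks may differ.

```python
from typing import Callable, List, Optional


def detect_c500() -> bool:
    """Hypothetical stand-in for runtime C500 detection (always False here)."""
    return False


def gemm_compile_flags(
    use_maca: Optional[bool] = None,
    detect: Callable[[], bool] = detect_c500,
) -> List[str]:
    """Return placeholder flags; None means 'decide from the hardware'."""
    if use_maca is None:
        use_maca = detect()  # default to runtime detection
    flags = ["<common-flag>"]  # placeholder, not a real compiler flag
    if use_maca:
        flags.append("<maca-flag>")  # placeholder, not a real compiler flag
    return flags


def validate_split_k(k: int, block_k: int, split_k: int) -> None:
    """Assumed rule: every split-K slice must hold a whole number of block_k tiles."""
    if block_k < 1 or split_k < 1:
        raise ValueError(f"block_k={block_k} and split_k={split_k} must be >= 1")
    if k % (split_k * block_k) != 0:
        raise ValueError(
            f"K={k} is not divisible by split_k * block_k = {split_k * block_k}"
        )


class SketchGemmKernel:
    """Hypothetical kernel wrapper that validates right after configuration."""

    def __init__(self, k: int, block_k: int, split_k: int = 1) -> None:
        self.k = k
        self.split_k = split_k
        self.config = {"block_k": block_k}  # stands in for init_config
        # Fail early with a clear message, before any B-matrix preparation.
        validate_split_k(k, self.config["block_k"], split_k)
```

An explicit `use_maca=False` still overrides detection, so callers can opt out on supported hardware, and an incompatible split-K setting surfaces at construction time with a message that names the offending values.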

I will continue monitoring this PR together with the companion TileLang PR.

Keep GemmOp auto/default dispatch on the TileLang GemmKernel and reject direct maca_hgemm/maca_auto backend overrides for new hgemm work.

Add the MACA BSM compiler path used by the packed-B split-K result, including prepared-B packing, prepared-B caching, split-K reduction, and C500 defaults.

Validation:
git diff --cached --check
./.venv/bin/python -m py_compile tileops/ops/gemm.py tileops/kernels/gemm/gemm.py tileops/kernels/gemm/maca_auto.py tests/ops/test_gemm_auto_dispatch.py
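The dispatch rule in this commit message can be sketched as a small resolver. This is a minimal illustration assuming a string-valued backend option; `resolve_backend`, the `"auto"` spelling, and the error text are hypothetical, and only the names maca_hgemm, maca_auto, and GemmKernel come from the message itself.

```python
from typing import Optional

# Backends named in the commit message as no longer accepted as direct overrides.
REJECTED_BACKENDS = frozenset({"maca_hgemm", "maca_auto"})

# Label for the TileLang compiler-generated path (hypothetical string value).
DEFAULT_BACKEND = "GemmKernel"


def resolve_backend(requested: Optional[str] = None) -> str:
    """Map a requested backend name to the one that will run (hypothetical helper)."""
    if requested is None or requested == "auto":
        return DEFAULT_BACKEND
    if requested in REJECTED_BACKENDS:
        raise ValueError(
            f"backend override {requested!r} is rejected; "
            f"use the default {DEFAULT_BACKEND} path"
        )
    # Any other name is passed through unchanged in this sketch.
    return requested
```

Rejecting the override with an explicit error, rather than silently falling back, keeps a stale configuration from appearing to exercise the handwritten reference kernel.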

VitalyAnkh mentioned this pull request May 20, 2026

[MetaxGPU][feature] Add MACA GEMM compiler-path layout support tile-ai/tilelang-metax#90

Open

gemini-code-assist Bot reviewed May 20, 2026


Comment thread tileops/kernels/gemm/gemm.py Outdated

Comment thread tileops/kernels/gemm/gemm.py Outdated

VitalyAnkh force-pushed the clean-hgemm-164 branch 3 times, most recently from dfd55e4 to 351ac49 Compare May 20, 2026 13:03

[MetaXGPU] Add C500 packed-BSM hgemm backend

ed59ef3

VitalyAnkh force-pushed the clean-hgemm-164 branch from 351ac49 to 2d70e17 Compare June 4, 2026 20:32

VitalyAnkh force-pushed the clean-hgemm-164 branch from 2d70e17 to 955969a Compare June 5, 2026 09:51

VitalyAnkh wants to merge 2 commits into MetaX-MACA:dev from VitalyAnkh:clean-hgemm-164
