Adapt DeepSeek V32 examples to MACA-safe barriers, head tiling, and fallback paths by VitalyAnkh · Pull Request #36 · tile-ai/tilelang-metax

VitalyAnkh · 2026-04-09T19:22:17Z

Fixes #30.

Review note

This branch now carries only the shared MACA backend prerequisite from #33 plus the DeepSeek V32 changes below. It no longer includes the unrelated example updates from #34 or #35.

Problem

The DeepSeek V32 examples assumed CUDA-specific execution behaviour in several places, including barrier handling, head partitioning, TMA-oriented forward paths, and vector-atomic usage in backward kernels.

What this PR changes

- replaces the CUDA-style histogram reset assumptions in topk_selector with a MACA-safe reset strategy
- adapts sparse MLA forward and pipelined forward paths to use MACA-safe head tiling and fallback execution paths
- changes the backward path to use a supported atomic formulation on MACA
- keeps the follow-up correctness fixes for replicated-head stride and the histogram sentinel slot in the same kernel family
- adds regression coverage for the MACA-specific edge cases identified during review
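
The histogram reset item can be illustrated with a small host-side sketch. This is not the code from this PR: the bin count `RADIX_BINS`, the single sentinel slot, the function names, and the thread counts are all assumptions chosen for illustration, and plain Python loops stand in for GPU threads. It shows one way a reset can leave a slot stale (each thread clears only its own index, and the block has fewer threads than the histogram has slots) and how a strided loop clears every slot regardless of block size.

```python
RADIX_BINS = 256              # assumed number of radix bins (hypothetical)
HIST_SLOTS = RADIX_BINS + 1   # bins plus one sentinel slot (assumed layout)


def reset_one_slot_per_thread(hist, num_threads):
    """Each simulated thread clears only hist[tid].

    This is only complete when num_threads >= len(hist).
    """
    for tid in range(num_threads):
        if tid < len(hist):
            hist[tid] = 0


def reset_strided(hist, num_threads):
    """Each simulated thread clears tid, tid + num_threads, and so on.

    Every slot, including the sentinel, is covered for any num_threads >= 1.
    """
    for tid in range(num_threads):
        for i in range(tid, len(hist), num_threads):
            hist[i] = 0


def stale_slots(hist):
    """Return the indices that still hold a non-zero value."""
    return [i for i, value in enumerate(hist) if value != 0]


if __name__ == "__main__":
    dirty = [7] * HIST_SLOTS
    reset_one_slot_per_thread(dirty, num_threads=256)
    print("stale after per-thread reset:", stale_slots(dirty))  # [256]

    dirty = [7] * HIST_SLOTS
    reset_strided(dirty, num_threads=64)
    print("stale after strided reset:", stale_slots(dirty))     # []
```

In this sketch the sentinel slot at index 256 survives the per-thread reset because only 256 threads participate, while the strided reset leaves nothing behind. Whether the actual kernel failed in this exact way is not stated in the PR description.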

Solution

The PR keeps the DeepSeek V32 examples intact at the algorithmic level, but rewrites the execution assumptions that were specific to CUDA. The MACA path now partitions heads according to MACA-safe tile sizes, clears the histogram state fully, and avoids unsupported synchronization or atomic behaviour.
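
As a rough sketch of what partitioning heads by a backend-safe tile size can look like, the helper below splits a head count into contiguous tiles that never exceed a given limit. The function name `plan_head_tiles`, the `(start, size)` representation, and the example limits of 64 and 32 are hypothetical; the PR description does not state the actual tile sizes or how the kernels index heads.

```python
def plan_head_tiles(num_heads, max_tile):
    """Split num_heads into contiguous (start, size) tiles of at most max_tile heads.

    Hypothetical helper for illustration; the last tile may be smaller than max_tile.
    """
    if num_heads <= 0:
        raise ValueError("num_heads must be positive")
    if max_tile <= 0:
        raise ValueError("max_tile must be positive")

    tiles = []
    start = 0
    while start < num_heads:
        size = min(max_tile, num_heads - start)
        tiles.append((start, size))
        start += size
    return tiles


if __name__ == "__main__":
    print(plan_head_tiles(128, 64))  # [(0, 64), (64, 64)]
    print(plan_head_tiles(80, 32))   # [(0, 32), (32, 32), (64, 16)]
```

Handling the short final tile explicitly, rather than assuming the head count is a multiple of the tile size, is the kind of edge case a backend-specific tile limit tends to expose.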

Alternatives considered

One option was to bypass the DeepSeek V32 cases entirely on MACA. That would have been expedient, but it would also have left a substantial portion of the example suite unexercised. Another was to preserve the existing kernels and add narrow guards around the observed failures. That approach would have been fragile because the failures shared a broader root cause: CUDA-specific execution assumptions embedded in the example kernels.

Verification

python -m pytest -q examples/maca/deepseek_v32/test_tilelang_example_deepseek_v32.py

github-actions · 2026-04-09T19:32:26Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

VitalyAnkh mentioned this pull request Apr 9, 2026

Add a portable and synchronized MACA fallback for the minference sparse-attention example #37

Open

VitalyAnkh changed the title ~~[MetaxGPU][Examples] Adapt DeepSeek V32 kernels for MACA~~ Adapt DeepSeek V32 examples to MACA-safe barriers, head tiling, and fallback paths Apr 9, 2026

VitalyAnkh force-pushed the vitaly/deepseek-v32-maca-compat branch from da28d0b to e144b9b April 9, 2026 20:19

VitalyAnkh added 3 commits April 12, 2026 20:20

[MetaxGPU][MACA] Fill backend dtype and codegen support gaps

830f841

[MetaxGPU][MACA] Apply pre-commit fixes for backend support

5077760

[MetaxGPU][Examples] Adapt DeepSeek V32 kernels for MACA

5816c1c

VitalyAnkh force-pushed the vitaly/deepseek-v32-maca-compat branch from e144b9b to 5816c1c April 12, 2026 20:25

VitalyAnkh wants to merge 3 commits into tile-ai:dev from VitalyAnkh:vitaly/deepseek-v32-maca-compat
