[Add] Add operator code and skill modifications for seer_attention and block_sparse_attn. #983
ArmandAlbert wants to merge 4 commits into tile-ai:ascendc_pto from …
Conversation
Add operator code and skill modifications for seer_attention and block_sparse_attn.
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run […]. We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀
Code Review
This pull request updates the TileLang-Ascend operator design documentation with new constraints regarding GEMM non-divisible shapes and L0C capacity limits, while also adding a block-sparse attention example. The review identified several critical issues: a potential out-of-bounds access in the new attention kernel when dimensions are not multiples of the block size, logic errors in the documentation regarding zero-block conditions, and inconsistencies in the mandatory checklist for operator design. Additionally, it was suggested to replace the use of infinity with a large concrete negative value for better NPU compiler compatibility.
```python
T.tile.div(acc_o[h_i, :], acc_o[h_i, :], sumexp[h_i])
# …
T.copy(acc_o, acc_o_half)
T.copy(acc_o_half, Output[bz, by, bx * block_M + vid * block_M // 2 : bx * block_M + vid * block_M // 2 + block_M // 2, :])
```
This T.copy operation poses an out-of-bounds (OOB) risk if seq_q is not a multiple of block_M. For example, if seq_q=100 and block_M=64, the second block will attempt to write to indices 64:128, exceeding the tensor boundary. This violates the rule added in SKILL.md (Item 14) requiring explicit handling of non-divisible shapes. You should either implement a tail-handling strategy (like using T.min for the slice end or validRow logic) or ensure the input is padded on the host side.
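A minimal sketch of the `T.min` tail-handling option, assuming the kernel can use computed slice bounds here; `row_start`, `row_end`, and `valid_rows` are illustrative names, not identifiers from the PR:

```python
# Sketch only: clamp the write window so the last (partial) block of rows
# never runs past seq_q when seq_q % block_M != 0.
row_start = bx * block_M + vid * block_M // 2
row_end = T.min(row_start + block_M // 2, seq_q)  # clamp slice end to the tensor bound
valid_rows = row_end - row_start                  # shorter than block_M // 2 only for the tail block
T.copy(acc_o_half[0:valid_rows, :], Output[bz, by, row_start:row_end, :])
```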
```diff
 | 16 | **No global-variable dependencies in the function**: dimension parameters come from tensor shapes or function arguments, supporting sequential multi-scenario testing | ⭕ Recommended |

-**Pass criteria**: all mandatory items (1, 2, 3, 7, 8, 9) pass, and at least 3/4 of the recommended items (4, 5, 6, 10) pass.
+**Pass criteria**: all mandatory items (1, 2, 3, 7, 8, 9, 14) pass, and at least 4/9 of the recommended items pass.
```
The updated pass criteria include the newly added item 14 but still omit item 13 (which is marked as mandatory on line 304). Please update the pass criteria to include all required checks.
Suggested change:
```diff
-**Pass criteria**: all mandatory items (1, 2, 3, 7, 8, 9, 14) pass, and at least 4/9 of the recommended items pass.
+**Pass criteria**: all mandatory items (1, 2, 3, 7, 8, 9, 13, 14) pass, and at least 4/9 of the recommended items pass.
```
```python
# === Vector: Online softmax + output accumulation ===
T.tile.fill(acc_o, 0.0)
T.tile.fill(sumexp, 0.0)
T.tile.fill(m_i, -T.infinity(accum_dtype))
```
Using T.infinity might lead to compatibility issues or unexpected behavior on some NPU compiler versions. It is safer and more consistent with other examples in this repository (e.g., flash_attn_bhsd_cc_sync.py) to use a large concrete negative value like -2**30 for masking purposes.
Suggested change:
```diff
-T.tile.fill(m_i, -T.infinity(accum_dtype))
+T.tile.fill(m_i, -2**30)
```
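As a quick sanity check (plain NumPy, not part of the PR): any initial row max below float32's exp underflow threshold (about -87) already rescales the old accumulators by exactly 0, so `-2**30` is interchangeable with `-inf` in the online-softmax update:

```python
import numpy as np

score = np.float32(3.5)  # an arbitrary first attention score for the row
for m_init in (np.float32(-np.inf), np.float32(-(2 ** 30))):
    m_new = max(m_init, score)        # updated running row max
    rescale = np.exp(m_init - m_new)  # factor applied to the old acc_o / sumexp
    print(m_init, "->", rescale)      # prints 0.0 for both initial values
```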
/re-test
🔄 Re-running failed jobs. Original workflow run: View details. Only the failed jobs will be re-executed.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Add operator code and skill modifications for seer_attention and block_sparse_attn.