[AscendNPU-IR] Fix op document and tests. #878
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀
Code Review
This pull request updates the documentation for T.sync_block_wait by replacing the existing example with a more comprehensive implementation that demonstrates synchronization between Cube and Vector scopes. A review comment identified that the variable FFTS_FLAG_THRESHOLD was undefined in the example and suggested adding a definition to make the code self-contained.
def simple_sync(M, N, block_M, block_N, dtype="float16", inner_dtype="float32"):
    m_num = M // block_M
    n_num = N // block_N
The variable FFTS_FLAG_THRESHOLD is used in the example code (lines 64 and 72) but is not defined within the scope of the snippet. This will cause a NameError if the code is executed. Please define this constant in the example to make it self-contained.
Suggested change:
  def simple_sync(M, N, block_M, block_N, dtype="float16", inner_dtype="float32"):
      m_num = M // block_M
      n_num = N // block_N
+     FFTS_FLAG_THRESHOLD = 10
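The NameError described in this comment can be reproduced in plain Python, without TileLang. This is a minimal sketch: the two helper functions are hypothetical and only mimic the loop bound from the example, and the value 10 is taken from the suggested change.

```python
def loop_bound_without_constant():
    # The name is looked up when this line runs, not when the function is defined,
    # so the failure only shows up on execution.
    return len(range(0, FFTS_FLAG_THRESHOLD))


def loop_bound_with_constant():
    FFTS_FLAG_THRESHOLD = 10  # value taken from the suggested change
    return len(range(0, FFTS_FLAG_THRESHOLD))


try:
    loop_bound_without_constant()
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(error_name)
print(loop_bound_with_constant())
```

Defining the constant inside the snippet keeps the documentation example self-contained, which is what the comment asks for.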
| with T.rs("PIPE_MTE2"): | ||
| T.sync_block_wait(i) | ||
| T.copy(Workspace[cid * block_m, i * block_n], cross_kernel_ub) | ||
| for i in range(0, FFTS_FLAG_THRESHOLD): |
I think this implementation will cause a deadlock.
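The comment does not say why a deadlock is expected. One common cause in flag-based synchronization, assumed here purely for illustration, is a waiter that waits more times than the other side sets the flag. The sketch below models that with a plain Python semaphore. It is an analogy, not the TileLang or AscendNPU-IR API: `run_consumer`, `num_sets`, and `num_waits` are hypothetical names, and a timeout stands in for the hang so the example terminates.

```python
import threading


def run_consumer(num_sets, num_waits, timeout=0.2):
    """Return how many waits completed before one timed out."""
    flag = threading.Semaphore(0)  # stand-in for a cross-core flag counter

    def producer():
        for _ in range(num_sets):
            flag.release()  # analogous to setting the flag once

    worker = threading.Thread(target=producer)
    worker.start()
    worker.join()

    completed = 0
    for _ in range(num_waits):
        # A real blocking wait would hang forever here once the sets run out.
        if not flag.acquire(timeout=timeout):
            break
        completed += 1
    return completed


balanced = run_consumer(num_sets=4, num_waits=4)
unbalanced = run_consumer(num_sets=4, num_waits=5)
print(balanced, unbalanced)
```

With matched counts every wait completes. With one extra wait, the last one never gets a matching set, which is the situation a blocking wait cannot recover from.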
Simplified the sync logic.
Fix op document and tests