Commit 069399b
During CUDAGraph capture, MiniMax-M3's autotuned _topk_index_partial_kernel
discards candidate CompiledKernels. If a gen-0 GC pass fires inside the
stream-capture region, it runs CompiledKernel.__del__ -> hipModuleUnload, which
HIP forbids while a stream is capturing (HIP error 900). This corrupts the
capture and aborts the custom_all_reduce IPC handshake (SIGABRT). gc.freeze()
did not help: it only exempts objects that already exist when it is called, and
the discarded kernels are created mid-loop, during capture. Disable GC for the
whole capture window and restore it via try/finally.
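The disable-and-restore pattern described above can be sketched as follows. This is a minimal illustration, not the code from this commit: `gc_disabled` and `capture_with_gc_paused` are hypothetical names, and `capture_fn` stands in for whatever callable performs the graph capture.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_disabled():
    """Suspend the cyclic garbage collector and restore its prior state on exit."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Re-enable only if it was on before, so nested use stays correct.
        if was_enabled:
            gc.enable()


def capture_with_gc_paused(capture_fn):
    """Run a capture callable (hypothetical stand-in) with the GC paused."""
    with gc_disabled():
        return capture_fn()
```

Note that `gc.disable()` only suspends the cyclic collector. Objects whose reference count drops to zero are still finalized immediately, so this helps only when the discarded objects are reclaimed by a GC pass, as the commit message describes.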
1 parent b9cff14 commit 069399b
1 file changed
Lines changed: 18 additions & 4 deletions
Diff content not preserved; only the hunk ranges survive:

| Hunk | Original file lines | New file lines | Additions | Deletions |
|---|---|---|---|---|
| 1 | 1-11 | 1-12 | 2 | 1 |
| 2 | 2283-2289 | 2284-2301 | 12 | 1 |
| 3 | 2362-2368 | 2374-2380 | 1 | 1 |
| 4 | 2374-2380 | 2386-2394 | 3 | 1 |