electron271

Per https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html, most non-Instinct GPUs support a warp size of 32.

Tested on an RX 9070 XT. I'm looking into getting this tested on AMD Instinct accelerators to ensure GPUs with a warp size of 64 still work.

@matthewdouglas
Member

Thanks for the PR! I don't have the bandwidth to test this personally at the moment, so I will defer to the AMD team. I also do not have any RDNA GPUs on hand.

cc: @pnunna93

github-actions bot commented Sep 9, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@pnunna93 pnunna93 left a comment

Thanks for the PR! It's good to go once warp size change is made.

hipLaunchKernelGGL(( kQuantizeBlockwise<T, 128, 2, 0, DATA_TYPE>), dim3(num_blocks), dim3(64), 0, 0, code, A, absmax, out, rand, rand_offset, n);
//else if(blocksize == 64)
// hipLaunchKernelGGL(( kQuantizeBlockwise<T, 64, 2, 0, DATA_TYPE>), dim3(num_blocks), dim3(32), 0, 0, code, A, absmax, out, rand, rand_offset, n);
else if(blocksize == 64 && warpSize == 32)
Contributor

warpSize will be deprecated in 7.0. We just added a WARP_SIZE macro; please use it instead.
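
For illustration only, here is a host-side sketch of the guard being discussed. It covers just the two branches visible in the diff (blocksize 128 launching 64 threads, blocksize 64 launching 32 threads) and assumes the blocksize 64 path is gated because 32 threads is narrower than one warp on 64-wide hardware. The WARP_SIZE definition below is a hypothetical stand-in, since the real macro's definition is not shown in this thread, and threads_per_block and kElemsPerThread are made-up names rather than anything from the PR.

```cpp
#include <cassert>

// Hypothetical stand-in: the project's real WARP_SIZE macro is not shown
// in this thread, so this fallback value is an assumption for the sketch.
#ifndef WARP_SIZE
#define WARP_SIZE 32
#endif

// Assumption read off the diff: blocksize 128 launches 64 threads and
// blocksize 64 launches 32 threads, i.e. two elements per thread.
const int kElemsPerThread = 2;

// Returns the threads per block for a quantization blocksize, or 0 when
// the launch would be narrower than one warp on the target hardware.
inline int threads_per_block(int blocksize, int warp_size = WARP_SIZE) {
    assert(blocksize > 0 && warp_size > 0);
    const int threads = blocksize / kElemsPerThread;
    return threads >= warp_size ? threads : 0;
}
```

With these assumptions, threads_per_block(64, 32) yields 32 while threads_per_block(64, 64) yields 0, which mirrors the `blocksize == 64 && warpSize == 32` condition in the hunk with the macro taking the place of warpSize.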


3 participants