
@Hniii98 Hniii98 commented Sep 24, 2025

temp_mask = torch.ones(L, S, dtype=torch.bool).tril(diagonal=S-L)

temp_mask is created on the CPU by default. Testing the operator on the CPU works without any problem, but testing it on CUDA fails with the following error:

RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0

Fix:
Pass device=query.device when creating temp_mask as well, so that it lives on the same device as the manually created query.

temp_mask = torch.ones(L, S, dtype=torch.bool, device=query.device).tril(diagonal=S-L)
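A minimal sketch of how the fixed line might sit inside a bias-building helper. The function name `build_causal_bias`, the `attn_bias` tensor, and the tensor shapes are assumptions made for illustration (the `masked_fill_` call is inferred from the error message, not taken from this PR). The sketch falls back to the CPU when CUDA is unavailable.

```python
import torch


def build_causal_bias(query: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: build an additive causal bias on query's device."""
    L, S = query.size(-2), key.size(-2)
    # The bias is allocated on the same device as query.
    attn_bias = torch.zeros(L, S, dtype=query.dtype, device=query.device)
    # The fix: create the boolean mask on query.device too, so that
    # masked_fill_ sees `self` and `mask` on the same device.
    temp_mask = torch.ones(L, S, dtype=torch.bool, device=query.device).tril(diagonal=S - L)
    attn_bias.masked_fill_(temp_mask.logical_not(), float("-inf"))
    return attn_bias


# Use CUDA when present; otherwise the same code runs on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
query = torch.randn(2, 4, 3, 8, device=device)  # (batch, heads, L, dim), assumed shape
key = torch.randn(2, 4, 5, 8, device=device)    # (batch, heads, S, dim), assumed shape
bias = build_causal_bias(query, key)
```

Deriving the device from `query` rather than hard-coding `"cuda"` keeps the helper usable on both CPU and GPU without changes.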

Result:
[screenshot]
