Fix npu op bug. #3122
zhuweichen does not appear to be a GitHub user. A GitHub account is required to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Already signed the CLA but the status is still pending? Let us recheck it.
Force-pushed from 2ec5da3 to 5189ddd
mmcv/ops/nms.py
Outdated
@@ -415,6 +415,8 @@ def nms_rotated(dets: Tensor,
order = scores.new_empty(0, dtype=torch.long)
if dets.device.type == 'npu':
coefficient = 57.29578 # 180 / PI
dets_cw = dets_cw.float()
`dets.float()` will return a copy of `dets`. So is it necessary to copy `dets` to `dets_cw` beforehand?
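The question above turns on when `Tensor.float()` actually allocates a new tensor. A minimal sketch, assuming plain CPU tensors stand in for NPU tensors and using made-up box values: `float()` produces a new tensor when it has to change the dtype, but returns the very same tensor object when the input is already `float32`, so an extra copy only matters in the second case.

```python
import torch

# Hypothetical stand-ins for the `dets` argument: (x, y, w, h, angle).
dets_half = torch.tensor([[10.0, 10.0, 4.0, 2.0, 0.5]]).half()
dets_fp32 = torch.tensor([[10.0, 10.0, 4.0, 2.0, 0.5]])

# A dtype change forces a new tensor with its own storage.
converted = dets_half.float()
copied_on_convert = converted.data_ptr() != dets_half.data_ptr()

# When the dtype already matches, float() hands back the same tensor object.
same_object = dets_fp32.float() is dets_fp32

print(copied_on_convert, same_object)
```

If later code mutates the result in place, the `float32` case would therefore also modify the caller's tensor unless an explicit `clone()` is made first.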
mmcv/ops/nms.py
Outdated
@@ -415,6 +415,8 @@ def nms_rotated(dets: Tensor,
order = scores.new_empty(0, dtype=torch.long)
if dets.device.type == 'npu':
coefficient = 57.29578 # 180 / PI
dets_cw = dets_cw.float()
scores = scores.float()
If `dets` is a float16 tensor, it will be concatenated with the float32 tensor `scores` here. Is that expected?
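One possible way to address the dtype concern raised above is to compute in `float32` and cast back to the caller's dtype before returning. This is a sketch under stated assumptions, not the PR's actual fix: the helper name is hypothetical, `keep_inds` is a made-up stand-in for the indices returned by the NMS kernel, and CPU tensors stand in for NPU tensors.

```python
import torch

def concat_dets_and_scores(dets, scores, keep_inds):
    """Append kept scores to kept boxes, preserving the dtype of `dets`."""
    out_dtype = dets.dtype
    # Upcast only for the part that needs float32 (as the NPU branch does).
    dets_fp32 = dets.float()
    scores_fp32 = scores.float()
    kept = torch.cat(
        (dets_fp32[keep_inds], scores_fp32[keep_inds].reshape(-1, 1)), dim=1)
    # Cast back so callers that passed float16 get float16 back.
    return kept.to(out_dtype)

dets = torch.tensor([[10.0, 10.0, 4.0, 2.0, 0.5],
                     [30.0, 30.0, 4.0, 2.0, 0.0]]).half()
scores = torch.tensor([0.9, 0.8]).half()
keep_inds = torch.tensor([0, 1])  # hypothetical NMS result
result = concat_dets_and_scores(dets, scores, keep_inds)
print(result.dtype, tuple(result.shape))
```

Keeping the upcast local to the helper means the mixed-dtype concatenation never happens, whichever dtype the caller passes in.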
if torch.cuda.current_device() != points_device:
torch.cuda.set_device(points_device)
elif points.device.type == 'npu':
boxes[:, :, 2] += boxes[:, :, 5] / 2.0
Please leave a comment to describe why we should do this.
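A comment of the kind requested might read as in the sketch below. It assumes that each box is laid out as `[x, y, z, x_size, y_size, z_size, rz]` with `(x, y, z)` at the bottom center, and that the NPU kernel expects the geometric center instead; the second assumption is a guess at the intent of the change, not something the diff confirms. The helper name is hypothetical, and it works on a clone so the caller's tensor is left untouched, unlike the in-place `+=` in the diff.

```python
import torch

def bottom_center_to_center(boxes):
    """Return a copy of `boxes` (B, T, 7) with z moved from bottom to box center."""
    shifted = boxes.clone()
    # z_center = z_bottom + z_size / 2 (assumed box convention, see above).
    shifted[:, :, 2] += shifted[:, :, 5] / 2.0
    return shifted

# One box whose bottom face sits at z = 1.0 and whose height (z_size) is 4.0.
boxes = torch.tensor([[[0.0, 0.0, 1.0, 2.0, 2.0, 4.0, 0.0]]])
centered = bottom_center_to_center(boxes)
print(centered[0, 0, 2].item(), boxes[0, 0, 2].item())
```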
Force-pushed from c151a35 to c6478ef
Update voxelization_npu.cpp Update test_voxelization.py Update voxelization_npu.cpp Update voxelization_npu.cpp Update voxelization_npu.cpp
Add NPU support for dynamic voxelization
repair nms_rotated npu bug
add dtype check for roi_align
Update box_iou_rotated_npu.cpp
Bugfix of NPU adapter of nms3d
Adapt boxes_overlap_bev to box_iou_rotated
fix the bug of DeformableRoiPoolGrad
Interfaces change.
codeclean npu/boxes_overlap_bev_npu.cpp
adapt npu box_iou_quadri
Add RoiAlignRotatedV2 adaptation layer
update point_to_voxel & voxel_to_point in scatter_points.py
add assign_score_withk NPU adaptation
update points_in_boxes_all
NPU adaptation of the border_align operator
add new npu op roiaware_pool3d && fix npu op scatter_points bug
Modify the pixel_group adaptation layer
scatter points bug fix
update nms_rotated from openmmlab.mmcv main
roi_align_rotated_v2
add pixel_group_npu
modify internal calls of npu boxes_overlap_bev & box_iou_rotated
git checkout origin pixel_group
…ted" This reverts commit a752d17.