Device agnostic for DCP #19

Open. Wants to merge 2 commits into base: master.

Conversation

@Chao1Han (Owner) commented Jul 14, 2025

Enable a device-agnostic implementation of the DCP-related functionality so that the new DCP features are also supported on XPU.
Rename use_cuda_non_blocking_copy to use_non_blocking_copy, because non-blocking copy is supported by most GPUs and is not exclusive to CUDA devices.

Test plan: test cases have not yet been updated to be fully device agnostic; this will be addressed in future work.
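
For readers who still pass the old option name, the rename described above could be bridged with a read-only alias. This is only a sketch: `StagingConfigSketch` is a hypothetical stand-in for the DCP config object, and the deprecation alias is an assumption for illustration, not something this PR adds.

```python
import warnings
from dataclasses import dataclass


@dataclass
class StagingConfigSketch:
    """Hypothetical stand-in for the DCP staging config touched by this PR."""

    # New, device-neutral option name introduced by the rename.
    use_non_blocking_copy: bool = True

    @property
    def use_cuda_non_blocking_copy(self) -> bool:
        """Old CUDA-specific name, kept here as a read-only alias (assumption)."""
        warnings.warn(
            "use_cuda_non_blocking_copy was renamed to use_non_blocking_copy",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.use_non_blocking_copy
```

Reading `cfg.use_cuda_non_blocking_copy` then returns the new field's value and emits a `DeprecationWarning`, so old call sites keep working while they migrate.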

  # Note: stream needs to be initialized on the main thread after default cuda
  # stream is setup/used to avoid the risk of accidentally reusing the main
  # compute stream or in other cases kernels actually launching from the
  # main thread.
- self._staging_stream = torch.cuda.Stream()
+ self._staging_stream = torch.Stream()
Collaborator

What happens when we synchronize with such a stream?

Owner Author

When non-blocking=True is set, this stream is used to ensure the copy has completed. It isn't used anywhere else.
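
The role of the staging stream in the reply above can be sketched roughly as follows. This is a minimal illustration, not the DCP code: `copy_to_cpu` is a hypothetical helper, and it assumes a PyTorch version that provides `torch.accelerator` (2.6 or later) whenever the source tensor lives on an accelerator.

```python
import torch


def copy_to_cpu(src: torch.Tensor, use_non_blocking_copy: bool = True) -> torch.Tensor:
    """Hypothetical helper: copy `src` to CPU, optionally via a staging stream."""
    # Non-blocking only matters for accelerator -> CPU copies.
    if not use_non_blocking_copy or src.device.type == "cpu":
        return src.to("cpu")

    # Dedicated stream created through the device-agnostic torch.Stream.
    staging_stream = torch.Stream(src.device)
    compute_stream = torch.accelerator.current_stream(src.device)
    # Wait for work already queued on the compute stream that produces `src`.
    staging_stream.wait_stream(compute_stream)

    # Pinned host memory lets the copy run asynchronously.
    dst = torch.empty(src.shape, dtype=src.dtype, device="cpu", pin_memory=True)
    torch.accelerator.set_stream(staging_stream)
    try:
        dst.copy_(src, non_blocking=True)  # may return before the copy finishes
    finally:
        torch.accelerator.set_stream(compute_stream)

    # Block the host until everything queued on the staging stream is done,
    # so `dst` is safe to read afterwards.
    staging_stream.synchronize()
    return dst
```

Synchronizing only the staging stream, rather than the whole device, is what lets the copy be waited on without stalling unrelated compute work.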

- assert torch.cuda.is_available(), "Non-blocking copy requires CUDA"
+ if self._config.use_non_blocking_copy:
+     assert torch.accelerator.is_available(), (
+         "Non-blocking copy requires CUDA/XPU"
Collaborator

What is the requirement for non-blocking copy? Must the device be an accelerator?

Owner Author

Non-blocking is only used for copies between CPU and GPU, and I assume all accelerators support it.
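
The check discussed in this thread could look roughly like the sketch below. Both function names are hypothetical, and the `getattr` guard is an assumption added so the snippet also runs on PyTorch builds that predate `torch.accelerator`; whether every accelerator backend actually supports non-blocking copies is taken on trust from the reply above.

```python
from typing import Optional

import torch


def accelerator_type_or_none() -> Optional[str]:
    """Return e.g. "cuda" or "xpu" if an accelerator is usable, else None."""
    acc = getattr(torch, "accelerator", None)  # absent on older PyTorch builds
    if acc is None or not acc.is_available():
        return None
    return acc.current_accelerator().type


def validate_non_blocking(use_non_blocking_copy: bool) -> None:
    """Mirror the PR's assertion without naming a specific backend."""
    if use_non_blocking_copy:
        assert accelerator_type_or_none() is not None, (
            "Non-blocking copy requires an accelerator such as CUDA or XPU"
        )
```

On a CPU-only machine, `validate_non_blocking(True)` fails the assertion, while `validate_non_blocking(False)` passes through without touching any accelerator API.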


2 participants