Device agnostic for DCP #19
base: master
# Note: stream needs to be initialized on the main thread after default cuda
# stream is setup/used to avoid the risk of accidentally reusing the main
# compute stream or in other cases kernels actually launching from the
# main thread.
- self._staging_stream = torch.cuda.Stream()
+ self._staging_stream = torch.Stream()
What happens when we synchronize with such a stream?
When non_blocking=True is set, this stream ensures the copy has completed. It isn't used anywhere else.
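To illustrate the reply above, here is a minimal sketch of a non-blocking copy issued on a dedicated staging stream and then synchronized. It is not the DCP implementation: `stage_to_cpu` is a hypothetical helper, the stream is created per call for brevity (the PR creates it once on the main thread), and the tensor is assumed to live on the current accelerator device. It also assumes a PyTorch version that provides `torch.accelerator`.

```python
import torch


def stage_to_cpu(tensor: torch.Tensor, non_blocking: bool = False) -> torch.Tensor:
    """Copy `tensor` into a CPU buffer (hypothetical helper, not DCP code)."""
    if not non_blocking:
        # Blocking path: the copy is complete when this call returns.
        return tensor.detach().to("cpu", copy=True)

    # torch.accelerator only exists in recent PyTorch releases (assumption).
    accel = getattr(torch, "accelerator", None)
    if accel is None or not accel.is_available():
        raise RuntimeError("Non-blocking copy requires an accelerator (e.g. CUDA/XPU)")

    # Device-agnostic stream on the current accelerator, as in the diff.
    staging_stream = torch.Stream()
    main_stream = accel.current_stream()
    # Work already queued on the main stream must finish before the copy reads it.
    staging_stream.wait_stream(main_stream)

    # Pinned host memory lets the device-to-host copy run asynchronously.
    staged = torch.empty(
        tensor.shape, dtype=tensor.dtype, device="cpu", pin_memory=True
    )
    accel.set_stream(staging_stream)
    try:
        staged.copy_(tensor, non_blocking=True)
    finally:
        # Always restore the main stream for subsequent compute.
        accel.set_stream(main_stream)

    # Synchronizing the staging stream is what guarantees the copy is done.
    staging_stream.synchronize()
    return staged
```

Because the stream is used only for staging, synchronizing it waits for the copy alone and does not stall unrelated work queued on the main compute stream.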
- assert torch.cuda.is_available(), "Non-blocking copy requires CUDA"
+ if self._config.use_non_blocking_copy:
+     assert torch.accelerator.is_available(), (
+         "Non-blocking copy requires CUDA/XPU"
What limitation makes this a requirement for non-blocking copy? Must the device be an accelerator?
Non-blocking is only used for copies between CPU and GPU; I assume all accelerators support it.
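The guard discussed in this thread can be sketched in isolation as follows. The helper names `accelerator_available` and `check_non_blocking_config` are hypothetical and only mirror the assertion in the diff; the fallback to `torch.cuda.is_available()` is an assumption for PyTorch builds that predate `torch.accelerator`.

```python
import torch


def accelerator_available() -> bool:
    """Return True if any accelerator backend (CUDA, XPU, ...) is usable."""
    accel = getattr(torch, "accelerator", None)
    if accel is not None:
        return bool(accel.is_available())
    # Older PyTorch without torch.accelerator: fall back to the CUDA-only check.
    return torch.cuda.is_available()


def check_non_blocking_config(use_non_blocking_copy: bool) -> None:
    """Reject a non-blocking configuration when no accelerator is present."""
    if use_non_blocking_copy and not accelerator_available():
        raise RuntimeError("Non-blocking copy requires an accelerator such as CUDA or XPU")
```

Raising an exception instead of using `assert` is a choice made for this sketch only: assertions are skipped when Python runs with `-O`, so an explicit error keeps the check active in that mode.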
Enable device-agnostic implementation of DCP-related functionality, allowing the new DCP features to be supported on XPU as well.
Rename use_cuda_non_blocking_copy to use_non_blocking_copy, because non-blocking copy is supported by most GPUs and is not exclusive to CUDA devices.
Test plan: test cases have not yet been updated to be fully device agnostic; this will be addressed in future work.