[alpaka] Simplify Device handling in ScopedContext and Product #253
Since here is a possible implementation of this idea. While working on it, I thought of one reason we may decide not to do this: if we ever want to support submitting a single collaborative kernel to multiple devices at the same time, having the |
I have one comment on the changes themselves: The
I agree that particular case (spreading work from one Event to many devices) may be quite far in the future (given that we have a "hard time" filling even one GPU with the work of one Event). That said, if really needed, multi-device multi-stream work could already be launched in today's system as long as the end result ends up in "the device", "the stream" is synchronized with all the other streams, and the "multi-device state" is internal to one module (i.e. does not have to be passed through a chain of modules). But the same argument would hold if we realize later that we need some other information to be passed from the "acquire context" to the "produce context". Having a specific class for it would make the addition trivial, whereas having to go through all the necessary EDModules would mean more work (although probably not much compared to other migrations we are doing or have done, but still additional work). I'm leaning towards keeping the specific class (if the delivery of device+stream is deemed to be necessary, ref #224, but that's more for the eventual CMSSW deployment; in the |
86abd75 to e3fdd33
Rebased after merging #250.
OK, I'll prepare a set of less invasive changes: take the |
@makortel what do you think of these changes?
Looks good! |
This PR is on top of #250 (to avoid conflicts; to be rebased after #250 is merged) and simplifies the Device management in ScopedContext and Product by using Alpaka's Device abstraction instead of a mixture of it and an integer. ScopedContext no longer needs the Device to be given from outside, but takes it from the global vector using the same logic as in cuda.

The code could be simplified further by not storing the Device at all, since it can be obtained from the Queue, although that would lead to Device objects being copied more. The cost of those copies appears to be at most the same as with std::shared_ptr, so maybe that is acceptable at this point?
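
To make the trade-off in the last paragraph concrete, here is a minimal sketch of the two options. It does not use the real Alpaka or framework types: Device, Queue, ProductWithDevice, ProductQueueOnly and makeDevice are hypothetical stand-ins, and the assumption that a Device copy costs about as much as a std::shared_ptr copy is modelled by giving the stand-in Device a shared_ptr member. In Alpaka itself the lookup would presumably go through a call such as alpaka::getDev on the queue.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a device handle (NOT the Alpaka type).
// Copying it copies a shared_ptr, i.e. one atomic refcount increment.
struct Device {
  std::shared_ptr<const int> handle;
  int index() const { return *handle; }
};

// Hypothetical helper that creates a device handle for a given index.
inline Device makeDevice(int idx) { return Device{std::make_shared<const int>(idx)}; }

// Hypothetical stand-in for a queue that remembers the device it was created on.
class Queue {
public:
  explicit Queue(Device dev) : dev_(std::move(dev)) {}
  // Returns the device by value, so every call makes a copy.
  Device device() const { return dev_; }

private:
  Device dev_;
};

// Option A: the product stores the Device next to the Queue.
class ProductWithDevice {
public:
  ProductWithDevice(Device dev, std::shared_ptr<Queue> queue)
      : device_(std::move(dev)), queue_(std::move(queue)) {}
  // No copy on access: a reference to the stored member is returned.
  const Device& device() const { return device_; }

private:
  Device device_;
  std::shared_ptr<Queue> queue_;
};

// Option B: the product stores only the Queue and asks it for the Device.
class ProductQueueOnly {
public:
  explicit ProductQueueOnly(std::shared_ptr<Queue> queue) : queue_(std::move(queue)) {}
  // One Device copy per call, which is the extra cost discussed above.
  Device device() const { return queue_->device(); }

private:
  std::shared_ptr<Queue> queue_;
};
```

Both variants report the same device; option B has one member less to keep consistent, at the price of a Device copy on each access.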