Improve the memory usage in the alpaka pixel reconstruction [14.0.x] #44459
type bugfix
enable gpu
please test
A new Pull Request was created by @fwyzard for CMSSW_14_0_X. It involves the following packages:
@fwyzard, @jfernan2, @mandrenguyen, @makortel can you please review it and eventually sign? Thanks. cms-bot commands are listed here
Fix the hitsModuleStart copy to use the correct source (device-to-device instead of host-to-device). Improve the offsetBPIX2 copy and use the data member as source, to guarantee the lifetime of the source. Remove the queue synchronisation. Update the unit test accordingly.
Co-authored-by: Matti Kortelainen <matti.kortelainen@cern.ch>
Allocate device memory only when actually used. Co-authored-by: Matti Kortelainen <matti.kortelainen@cern.ch>
Co-authored-by: Matti Kortelainen <matti.kortelainen@cern.ch>
ba5d4ab to 30162d7
please test
These changes improve the throughput of the HLT 2024 v1.0 menu with the Pixel alpaka customisation from |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-531d99/38243/summary.html
backport of #44458
+1
please test
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-531d99/38303/summary.html
+heterogeneous
This pull request is fully signed and it will be integrated in one of the next CMSSW_14_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_14_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)
+1 |
PR description:
Fix the `TrackingRecHitsDevice` constructor:
- Fix the `hitsModuleStart` copy to use the correct source (device-to-device instead of host-to-device).
- Improve the `offsetBPIX2` copy and use the data member as source, to guarantee the lifetime of the source.
- Remove the queue synchronisation.

Remove an unused variable from `SiPixelRawToCluster::acquire()`.

Improve memory usage in `SiPixelRawToCluster::acquire()`:
- Allocate device memory only when actually used.
- Use cached memory buffers in `WordFedAppender`.

PR validation:
Run the HLT 2024 v1.0 menu with the changes.
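The source-lifetime point behind the `offsetBPIX2` change can be illustrated with a minimal sketch in plain C++. It does not use alpaka or the real `TrackingRecHitsDevice`: `FakeQueue`, `HitsHolder`, `deviceSide_` and `runSketch` are hypothetical names, and the "asynchronous" queue is simulated by deferring tasks until `wait()` is called. The assumption is only that an enqueued copy may run after the constructor has returned, so its source has to outlive the constructor.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for an asynchronous queue: tasks run only inside wait().
class FakeQueue {
public:
  void enqueue(std::function<void()> task) { tasks_.push(std::move(task)); }
  void wait() {
    while (!tasks_.empty()) {
      tasks_.front()();
      tasks_.pop();
    }
  }

private:
  std::queue<std::function<void()>> tasks_;
};

// Hypothetical holder, loosely modelled on the pattern in the description.
class HitsHolder {
public:
  HitsHolder(FakeQueue& queue, std::uint32_t offsetBPIX2)
      : offsetBPIX2_{offsetBPIX2}, deviceSide_{std::make_unique<std::uint32_t>(0)} {
    // The copy source is the data member, which lives as long as the object.
    // Using the address of the constructor argument instead would dangle once
    // the constructor returns, unless the queue were synchronised here.
    queue.enqueue([dst = deviceSide_.get(), src = &offsetBPIX2_] {
      std::memcpy(dst, src, sizeof(std::uint32_t));
    });
  }

  // The enqueued task holds the address of a member, so this sketch forbids
  // copying (and thereby moving) the object.
  HitsHolder(HitsHolder const&) = delete;
  HitsHolder& operator=(HitsHolder const&) = delete;

  std::uint32_t deviceValue() const { return *deviceSide_; }

private:
  std::uint32_t offsetBPIX2_;                  // source of the deferred copy
  std::unique_ptr<std::uint32_t> deviceSide_;  // stands in for device memory
};

// Builds the holder, lets the deferred copy run, and returns the copied value.
std::uint32_t runSketch(std::uint32_t offset) {
  FakeQueue queue;
  HitsHolder hits{queue, offset};  // no synchronisation inside the constructor
  queue.wait();                    // the copy executes here, after construction
  return hits.deviceValue();
}
```

Calling `runSketch(42)` constructs the holder, drains the queue afterwards, and returns the value that reached the simulated device side. Because the source outlives the constructor, the constructor itself does not need to wait on the queue.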
Backport status
Backport of #44458 to CMSSW 14.0.x for data taking.
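As a rough illustration of the two memory-usage points in the description (allocate only when actually used, and reuse cached buffers), the sketch below uses plain C++ rather than the alpaka or CMSSW allocators. `BufferCache` and `processEvent` are hypothetical names, and host memory stands in for device memory. It is a sketch under those assumptions, not the code in `SiPixelRawToCluster::acquire()` or `WordFedAppender`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical caching pool: released buffers are kept and handed back to
// later requests of the same size instead of being freed and reallocated.
class BufferCache {
public:
  using Buffer = std::unique_ptr<std::uint32_t[]>;

  Buffer acquire(std::size_t size) {
    auto& pool = free_[size];
    if (!pool.empty()) {
      Buffer buf = std::move(pool.back());
      pool.pop_back();
      return buf;  // reuse a cached buffer, no new allocation
    }
    ++allocations_;
    return std::make_unique<std::uint32_t[]>(size);
  }

  void release(std::size_t size, Buffer buf) { free_[size].push_back(std::move(buf)); }

  // Number of real allocations performed so far.
  std::size_t allocations() const { return allocations_; }

private:
  std::map<std::size_t, std::vector<Buffer>> free_;
  std::size_t allocations_ = 0;
};

// Hypothetical per-event step: a buffer is requested only if there is data.
// Returns the number of words copied into the buffer.
std::size_t processEvent(BufferCache& cache, std::vector<std::uint32_t> const& words) {
  if (words.empty()) {
    return 0;  // nothing to process, so no memory is requested at all
  }
  std::size_t const n = words.size();
  auto buf = cache.acquire(n);
  std::copy(words.begin(), words.end(), buf.get());
  cache.release(n, std::move(buf));  // hand the buffer back for the next event
  return n;
}
```

In this sketch an empty event leaves `cache.allocations()` unchanged, and two consecutive events with the same number of words trigger a single allocation, since the second one reuses the buffer released by the first.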