ROCm: various updates [15.0.x] #9853
A new Pull Request was created by @fwyzard for branch IB/CMSSW_15_0_X/master. @cmsbuild, @iarspider, @smuzaffar can you please review it and eventually sign? Thanks.
cms-bot internal usage
Force-pushed: 66ce743 to f0949d7
enable gpu
please test
Pull request #9853 was updated.
Force-pushed: f0949d7 to cfed3b8
Pull request #9853 was updated.
Force-pushed: cfed3b8 to 186a23c
Pull request #9853 was updated.
please test
-1
Failed Tests: Build
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: Build
I found compilation error when building:
raise RuntimeError("failed to load library '"+library+"'")
RuntimeError: failed to load library 'lib/el8_amd64_gcc12//libAnalysisDataFormatsTrackInfo.so'
@@@@ ----> OK EDM Class Version CUDADataFormatsBeamSpot
>> Checking EDM Class Transients in CUDADataFormatsBeamSpot
Suggestion: You can run 'scram build updateclassversion' to generate src/AnalysisDataFormats/TrackInfo/src/classes_def.xml.generated with updated ClassVersion
gmake: *** [tmp/el8_amd64_gcc12/edm_checks/libAnalysisDataFormatsTrackInfo.so] Error 1
@@@@ ----> OK EDM Class Version CUDADataFormatsCommon
>> Checking EDM Class Transients in CUDADataFormatsCommon
@@@@ ----> OK EDM Class Version AnalysisDataFormatsTopObjects
>> Checking EDM Class Version for src/CUDADataFormats/Track/src/classes_def.xml in CUDADataFormatsTrack
>> Checking EDM Class Transients in AnalysisDataFormatsTopObjects
The error seems unrelated to this PR, and is possibly due to the other changes that were tested with it.
Include more libraries and tools in the ROCm package:
- ROCr headers
- hipCUB
- RCCL
Merge hipRAND/rocRAND back into the base ROCm tool.
Remove debuginfo files.
Force-pushed: 186a23c to e8e0a7f
please test
Let's try again without spurious PRs.
Pull request #9853 was updated.
-1
Failed Tests: rocmUnitTests
ROCm Unit Tests
I found 3 errors in the following unit tests:
---> test testRocmSoALayoutAndView_t had ERRORS
---> test alpakaTestKernelROCmAsync had ERRORS
---> test alpakaTestBufferROCmAsync had ERRORS
Comparison Summary:
CUDA Comparison Summary:
ROCM Comparison Summary:
ignore tests-rejected with ib-failure
backport #9843
+externals
This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_15_0_X/master IBs (test failures were overridden). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @antoniovilela, @sextonkennedy, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)
@cms-sw/orp-l2, this is also ready to go into the next 15.0.X IB/release
+1
Merged commit c72fdf1 into cms-sw:IB/CMSSW_15_0_X/master
Various updates to the ROCm package for CMSSW:
The extra libraries are needed to build UCX with ROCm support.
Enable unified memory on Instinct MI100, MI210/250, and MI300 GPUs:
xnack setting, which supports running with xnack enabled or disabled; xnack support setting HSA_XNACK=1.
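As a rough illustration of the HSA_XNACK=1 setting mentioned above, the sketch below launches a child process with the variable set in its environment. The helper name xnack_env is hypothetical and is not part of ROCm or CMSSW, and the child process is only a stand-in for a real ROCm application. Whether xnack actually takes effect depends on the GPU, driver, and how the binaries were built, none of which this sketch checks.

```python
import os
import subprocess
import sys


def xnack_env(enabled=True, base=None):
    """Return a copy of an environment with HSA_XNACK set to "1" or "0".

    Hypothetical helper for illustration; `base` defaults to os.environ.
    """
    env = dict(os.environ if base is None else base)
    env["HSA_XNACK"] = "1" if enabled else "0"
    return env


if __name__ == "__main__":
    # Stand-in for a real ROCm application: the child only echoes the variable.
    child = [sys.executable, "-c", "import os; print(os.environ['HSA_XNACK'])"]
    result = subprocess.run(
        child, env=xnack_env(True), capture_output=True, text=True, check=True
    )
    print(result.stdout.strip())
```

Setting the variable per process, rather than exporting it globally, makes it easy to run the same binary with xnack enabled and disabled and compare the behaviour.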