Update ROCm to version 6.3.2 #9670

Merged
cmsbuild merged 1 commit into cms-sw:IB/CMSSW_15_0_X/master from fwyzard:IB/CMSSW_15_0_X/master_ROCm_6.3.2
Feb 6, 2025
Conversation

@fwyzard
Contributor

@fwyzard fwyzard commented Feb 4, 2025

Other changes:
  - include debug symbols.
@fwyzard
Contributor Author

fwyzard commented Feb 4, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Feb 4, 2025

A new Pull Request was created by @fwyzard for branch IB/CMSSW_15_0_X/master.

@iarspider, @smuzaffar can you please review it and eventually sign? Thanks.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Feb 4, 2025

cms-bot internal usage

@fwyzard
Contributor Author

fwyzard commented Feb 4, 2025

Let's see if this builds, then I can use the CVMFS installation to test on LUMI.

@cmsbuild
Contributor

cmsbuild commented Feb 4, 2025

-1

Failed Tests: UnitTests RelVals
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a6e91/44169/summary.html
COMMIT: 0dec4be
CMSSW: CMSSW_15_0_X_2025-02-03-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 6 errors in the following unit tests:

---> test testTauEmbeddingWorkflow2016postVFP had ERRORS
---> test testTauEmbeddingWorkflow2016preVFP had ERRORS
---> test testTauEmbeddingWorkflow2017 had ERRORS
and more ...

RelVals

----- Begin Fatal Exception 04-Feb-2025 19:13:24 CET-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named step3_PAT.py
Exception Message:
 unknown python problem occurred.
RuntimeError: An exception of category 'FileInPathError' occurred.
Exception Message:
edm::FileInPath unable to find file RecoBTag/Combined/data/UParTAK4/PUPPI/V01/UParTAK4_v2.onnx anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/external/el8_amd64_gcc12/data:/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02875/el8_amd64_gcc12/cms/cmssw/CMSSW_15_0_X_2025-02-03-2300/src
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/135.4_ZEEFS_13


At:
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Types.py(1000): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(396): insertContentsInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(531): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Modules.py(160): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1220): _insertManyInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1495): fillProcessDesc
  <string>(2): <module>

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 04-Feb-2025 19:16:02 CET-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named step3_RAW2DIGI_L1Reco_RECO_RECOSIM_PAT_VALIDATION_DQM.py
Exception Message:
 unknown python problem occurred.
RuntimeError: An exception of category 'FileInPathError' occurred.
Exception Message:
edm::FileInPath unable to find file RecoBTag/Combined/data/UParTAK4/PUPPI/V01/UParTAK4_v2.onnx anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/external/el8_amd64_gcc12/data:/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02875/el8_amd64_gcc12/cms/cmssw/CMSSW_15_0_X_2025-02-03-2300/src
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/1306.0_SingleMuPt1_UP15


At:
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Types.py(1000): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(396): insertContentsInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(531): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Modules.py(160): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1220): _insertManyInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1495): fillProcessDesc
  <string>(2): <module>

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 04-Feb-2025 19:13:47 CET-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named step3_RAW2DIGI_L1Reco_RECO_PAT_NANO_DQM.py
Exception Message:
 unknown python problem occurred.
RuntimeError: An exception of category 'FileInPathError' occurred.
Exception Message:
edm::FileInPath unable to find file RecoBTag/Combined/data/UParTAK4/PUPPI/V01/UParTAK4_v2.onnx anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/poison:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src:/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/external/el8_amd64_gcc12/data:/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02875/el8_amd64_gcc12/cms/cmssw/CMSSW_15_0_X_2025-02-03-2300/src
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-pr-relvals/matrix-results/140.045_RunJetHT2022C


At:
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Types.py(1000): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(396): insertContentsInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Mixins.py(531): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Modules.py(160): insertInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1220): _insertManyInto
  /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44169/CMSSW_15_0_X_2025-02-03-2300/src/FWCore/ParameterSet/python/Config.py(1495): fillProcessDesc
  <string>(2): <module>

----- End Fatal Exception -------------------------------------------------
Expand to see more relval errors ...
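
The three relval failures shown above report the same cause: the file RecoBTag/Combined/data/UParTAK4/PUPPI/V01/UParTAK4_v2.onnx was not found in any directory listed in CMSSW_SEARCH_PATH. As a rough illustration of that kind of lookup, a colon-separated search path can be resolved as in the sketch below. This is not the actual edm::FileInPath implementation; the function name `find_in_search_path` and the temporary directory layout are hypothetical and exist only for the demonstration.

```python
import os
import tempfile


def find_in_search_path(relative_path, search_path):
    """Return the first existing match for relative_path, or None.

    search_path is a colon-separated list of directories, in the same
    style as the CMSSW_SEARCH_PATH value printed in the log above.
    """
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


# Self-contained demonstration with temporary directories (hypothetical layout).
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    relative = os.path.join("Pkg", "data", "model.onnx")
    os.makedirs(os.path.join(second, "Pkg", "data"))
    with open(os.path.join(second, relative), "w") as handle:
        handle.write("placeholder")

    found = find_in_search_path(relative, f"{first}:{second}")
    missing = find_in_search_path("Pkg/data/other.onnx", f"{first}:{second}")
    print(found == os.path.join(second, relative))  # True
    print(missing)  # None
```

When no directory contains the file, the lookup returns None, which corresponds to the "unable to find file ... anywhere in the search path" situation in the log.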

@smuzaffar
Contributor

please test

@fwyzard
Contributor Author

fwyzard commented Feb 5, 2025

First results: on LUMI I get the same successes and failures for the unit tests using ROCm 6.2.4 and 6.3.2.

@cmsbuild
Contributor

cmsbuild commented Feb 5, 2025

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a6e91/44207/summary.html
COMMIT: 0dec4be
CMSSW: CMSSW_15_0_X_2025-02-04-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/9670/44207/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a6e91/44207/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a6e91/44207/git-merge-result

Comparison Summary

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4016960
  • DQMHistoTests: Total failures: 76
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4016864
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 218 log files, 189 edm output root files, 50 DQM output files
  • TriggerResults: no differences found
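
The DQMHistoTests counts above can be cross-checked with a little arithmetic: failures, nulls, successes and skipped entries appear to account for every histogram compared. A minimal sketch, with the numbers copied from the summary and the variable names chosen only for illustration:

```python
# Counts copied from the DQMHistoTests lines of the comparison summary.
total = 4016960
failures = 76
nulls = 0
successes = 4016864
skipped = 20

# The four outcome categories should add up to the total compared.
accounted = failures + nulls + successes + skipped
print(accounted == total)  # True

# Fraction of compared histograms reported as failures.
failure_fraction = failures / total
print(f"{failure_fraction:.2e}")  # 1.89e-05
```

In other words, roughly 2 in every 100,000 compared histograms differ.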

@mandrenguyen

+1

@smuzaffar
Contributor

+externals

@cmsbuild
Contributor

cmsbuild commented Feb 6, 2025

This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_15_0_X/master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit 3766da5 into cms-sw:IB/CMSSW_15_0_X/master Feb 6, 2025
9 checks passed
@fwyzard
Contributor Author

fwyzard commented Feb 6, 2025

OK, for the unit tests I do get the same results with ROCm 6.2.4 and 6.3.2 running on LUMI:

  • ✅ CAsizes_tROCmAsync
  • ✅ Clusters_testROCmAsync
  • ✅ DigiErrors_testROCmAsync
  • ✅ Digis_testROCmAsync
  • ✅ Hits_testROCmAsync
  • ✅ TrackSoAHeterogeneousAlpaka_testROCmAsync
  • ✅ TrajectoryStateSoA_tROCmAsync
  • ✅ ZVertexSoA_testROCmAsync
  • ✅ alpakaCopyBufferToDeviceROCmAsync
  • ✅ alpakaMoveToDeviceAsyncROCmAsync
  • ✅ alpakaTestAtomicPairCounterROCmAsync
  • 🔴 alpakaTestBufferROCmAsync: Aborted (core dumped)
  • ✅ alpakaTestDeltaPhiROCmAsync
  • ✅ alpakaTestHistoContainerROCmAsync
  • ✅ alpakaTestIndependentKernelROCmAsync
  • ✅ alpakaTestKernelROCmAsync
  • ✅ alpakaTestOneHistoContainerROCmAsync
  • ✅ alpakaTestOneRadixSortROCmAsync
  • ✅ alpakaTestOneToManyAssocROCmAsync
  • ✅ alpakaTestPrefixScanROCmAsync
  • ✅ alpakaTestPrintAnswerROCmAsync
  • ✅ alpakaTestRadixSortROCmAsync
  • ✅ alpakaTestSimpleVectorROCmAsync
  • ✅ alpakaTestVecROCmAsync
  • ✅ alpakaWorkDivisionROCmAsync
  • ✅ deviceVertexFinderByDensity_tROCmAsync
  • ✅ deviceVertexFinderDBSCAN_tROCmAsync
  • ✅ deviceVertexFinderIterative_tROCmAsync
  • 🔴 deviceVertexFinderOneKernel_tROCmAsync: Aborted
  • ✅ testAlpakaDeviceAdditionKernelROCmAsync
  • ✅ testAlpakaDeviceAdditionOpaqueROCmAsync
  • ✅ testAlpakaDeviceAdditionROCmAsync
  • ✅ testAlpakaDeviceAdditionWrapperROCmAsync
  • ✅ testParabolicParametrizedMagneticField.devROCmAsync
  • ✅ testROCmService
  • ✅ testRequireROCmDevices
  • ✅ testRocmDeviceAddition
  • ✅ testRocmDeviceAdditionKernel
  • ✅ testRocmDeviceAdditionOpaque
  • ✅ testRocmDeviceAdditionWrapper
  • 🔴 testRocmSoALayoutAndView_t: Aborted
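
Saying that both ROCm versions give the same results amounts to comparing two maps from test name to outcome. A minimal sketch of such a comparison follows, assuming the outcomes have already been collected into dictionaries. The helper `find_regressions` and the dictionary layout are hypothetical, and only a few of the tests listed above are included.

```python
def find_regressions(baseline, candidate):
    """Return tests that passed in baseline but not in candidate."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )


# A subset of the outcomes reported above; True means the test passed.
rocm_6_2_4 = {
    "alpakaTestBufferROCmAsync": False,
    "alpakaTestKernelROCmAsync": True,
    "deviceVertexFinderOneKernel_tROCmAsync": False,
    "testROCmService": True,
    "testRocmSoALayoutAndView_t": False,
}
# The comment above reports identical outcomes with ROCm 6.3.2.
rocm_6_3_2 = dict(rocm_6_2_4)

print(find_regressions(rocm_6_2_4, rocm_6_3_2))  # []
print(rocm_6_2_4 == rocm_6_3_2)  # True
```

A test missing from the candidate run is treated as a regression here, which is a deliberately conservative choice for this sketch.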

@fwyzard fwyzard changed the title from "[test] Update ROCm to version 6.3.2" to "Update ROCm to version 6.3.2" Feb 6, 2025
@fwyzard
Contributor Author

fwyzard commented Feb 6, 2025

I also see the same behaviour for relvals (that is, the same failures, but no regressions), so I guess we can go ahead with this.

@fwyzard fwyzard deleted the IB/CMSSW_15_0_X/master_ROCm_6.3.2 branch February 6, 2025 10:23
4 participants