Conversation

@fwyzard
Contributor

fwyzard commented Jan 17, 2025

PR description:

Replace ALPAKA_STATIC_ACC_MEM_GLOBAL with HOST_DEVICE_CONSTANT.

ALPAKA_STATIC_ACC_MEM_GLOBAL gets a different, more complex syntax in alpaka 1.2.0 (in order to support Intel oneAPI).
It also has a slightly different meaning, providing global symbols that can be device-memcpy'ed over from the host, while HOST_DEVICE_CONSTANT supports only plain constants.
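To illustrate the "plain constants" case described above, here is a minimal sketch in standard C++. The macro `SKETCH_HOST_DEVICE_CONSTANT`, the `sketch` namespace, and the values are hypothetical stand-ins chosen for illustration; they are not the CMSSW definition of `HOST_DEVICE_CONSTANT` and not constants from the ECAL or HCAL packages. The assumption is that such a macro resolves to a `constexpr` declaration, with an extra device qualifier when compiling device code.

```cpp
#include <cassert>

// Hypothetical stand-in for a host/device constant macro (an assumption,
// NOT the actual CMSSW HOST_DEVICE_CONSTANT definition): device compilation
// passes add an explicit __device__ qualifier, host passes use plain constexpr.
#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
#define SKETCH_HOST_DEVICE_CONSTANT __device__ constexpr
#else
#define SKETCH_HOST_DEVICE_CONSTANT constexpr
#endif

namespace sketch {
  // Illustrative values only; fixed at compile time, never copied from the host.
  SKETCH_HOST_DEVICE_CONSTANT float kScale = 0.5f;
  SKETCH_HOST_DEVICE_CONSTANT int kOffsets[4] = {0, 2, 4, 6};

  // The constants can be read inside constant expressions.
  constexpr float scaled(int i) { return kScale * static_cast<float>(kOffsets[i]); }

  static_assert(kOffsets[3] == 6, "constants are usable at compile time");
}  // namespace sketch

// Usage example: run-time reads on the host see the same compile-time values.
inline float sumScaled() {
  float total = 0.f;
  for (int i = 0; i < 4; ++i) {
    total += sketch::scaled(i);
  }
  assert(total > 0.f);
  return total;
}
```

Since the values are baked in at compile time, this pattern needs no host-to-device copy. A global symbol that is filled from the host at run time is the other use case mentioned in the description, and this sketch does not cover it.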

Follow up to #47108.

PR validation:

None.

No changes expected.

@fwyzard
Contributor Author

fwyzard commented Jan 17, 2025

enable gpu

@fwyzard
Contributor Author

fwyzard commented Jan 17, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Jan 17, 2025

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47120/43340

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard for master.

It involves the following packages:

  • RecoLocalCalo/EcalRecProducers (reconstruction)
  • RecoLocalCalo/HcalRecProducers (reconstruction)

@jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@ReyerBand, @abdoulline, @apsallid, @argiro, @bsunanda, @denizsun, @mariadalfonso, @missirol, @rchatter, @salimcerci, @thomreis, @wang0jin, @youyingli this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

-1

Failed Tests: RelVals-GPU
Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-837681/43806/summary.html
COMMIT: 23d672e
CMSSW: CMSSW_15_0_X_2025-01-16-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47120/43806/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-GPU

  • 12834.423: 12834.423_TTbar_14TeV+2024_Patatrack_HCALOnlyGPUandAlpaka_Validation/step2_TTbar_14TeV+2024_Patatrack_HCALOnlyGPUandAlpaka_Validation.log
  • 12834.422: 12834.422_TTbar_14TeV+2024_Patatrack_HCALOnlyAlpaka_Validation/step2_TTbar_14TeV+2024_Patatrack_HCALOnlyAlpaka_Validation.log
  • 12834.406: 12834.406_TTbar_14TeV+2024_Patatrack_PixelOnlyTripletsAlpaka/step2_TTbar_14TeV+2024_Patatrack_PixelOnlyTripletsAlpaka.log

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3819085
  • DQMHistoTests: Total failures: 23
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3819042
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 214 log files, 184 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@fwyzard
Contributor Author

fwyzard commented Jan 17, 2025

The errors are due to an ongoing update of the CUDA installation on the machines running the GPU tests, and are unrelated to this PR:

----- Begin Fatal Exception 17-Jan-2025 09:39:44 CET-----------------------
An exception of category 'StdException' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing service of type CUDAService
Exception Message:
A std::exception was thrown.

src/HeterogeneousCore/CUDAServices/plugins/CUDAService.cc, line 193:
nvmlCheck(nvmlInitWithFlags(NVML_INIT_FLAG_NO_GPUS | NVML_INIT_FLAG_NO_ATTACH));
NVML Error 18: Driver/library version mismatch
----- End Fatal Exception -------------------------------------------------

@fwyzard
Contributor Author

fwyzard commented Jan 17, 2025

please test

@cmsbuild
Contributor

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-837681/43810/summary.html
COMMIT: 23d672e
CMSSW: CMSSW_15_0_X_2025-01-16-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47120/43810/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 3 lines to the logs
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3819085
  • DQMHistoTests: Total failures: 71
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3818994
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 214 log files, 184 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

@jfernan2
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Contributor

+1

cmsbuild merged commit 8681eba into cms-sw:master on Jan 20, 2025
14 checks passed
fwyzard deleted the replace_ALPAKA_STATIC_ACC_MEM_GLOBAL_part2 branch on January 20, 2025, 17:47