@ferencek
Contributor

@ferencek ferencek commented Oct 10, 2023

PR description:

After a recent fix for the pixel FED error imbalance in DQM plots (#42872), some imbalance still remained, as can be seen in the online DQM plots for run 374731. This was later confirmed in local tests with 2k events from wf 141.008583, as shown in the SiPixelHeterogeneous/PixelErrorCompareGPUvsCPU/FEErrorVsFEDIdUnbalance plot:
FEDErrorImbalance_ref
The remaining imbalance concerns errorType=30 (TBM error trailer) and originates from the relevant GPU and CPU code not being fully in sync. With this PR applied, the imbalance plot becomes empty:
FEDErrorImbalance_pr
I also took this opportunity to fully synchronize the code for errorTypes 26 and 27.
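
For readers unfamiliar with the imbalance plot, the idea can be sketched in a few lines of Python. This is only an illustration and not the actual DQM code: the function name `fed_error_imbalance`, the `(fed_id, error_type)` record layout, the sign convention (GPU minus CPU), and the sample values are all assumptions made for this example.

```python
from collections import Counter


def fed_error_imbalance(cpu_errors, gpu_errors):
    """Return {(fed_id, error_type): gpu_count - cpu_count} for nonzero differences.

    Hypothetical helper: each input is an iterable of (fed_id, error_type) tuples.
    """
    cpu = Counter(cpu_errors)
    gpu = Counter(gpu_errors)
    diff = {}
    for key in set(cpu) | set(gpu):
        delta = gpu[key] - cpu[key]  # Counter returns 0 for missing keys
        if delta != 0:
            diff[key] = delta
    return diff


# Invented records for illustration only: the CPU path reports one extra
# errorType=30 entry for FED 1200 that the GPU path does not.
cpu_errors = [(1200, 30), (1200, 30), (1201, 26)]
gpu_errors = [(1200, 30), (1201, 26)]

print(fed_error_imbalance(cpu_errors, gpu_errors))  # {(1200, 30): -1}
```

When the two code paths are in sync, the returned dictionary is empty, which corresponds to the empty imbalance plot above.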

PR validation:

The code was tested and the above DQM plots obtained by running the following workflow

runTheMatrix.py -w gpu -l 141.008583

over 2k events (by default the wf runs over 100 events).

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Backport to 13_2_X planned.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42977/37152

  • This PR adds an extra 20KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

A new Pull Request was created by @ferencek (Dinko F.) for master.

It involves the following packages:

  • RecoLocalTracker/SiPixelClusterizer (reconstruction)

@mandrenguyen, @jfernan2, @cmsbuild can you please review it and eventually sign? Thanks.
@threus, @GiacomoSguazzoni, @JanFSchulte, @missirol, @mtosi, @dkotlins, @rovere, @mmusich, @tvami, @VinInn, @mroguljic, @gpetruc, @VourMa, @ferencek, @felicepantaleo this is something you requested to watch as well.
@sextonkennedy, @antoniovilela, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor

mmusich commented Oct 10, 2023

test parameters:

  • enable = gpu
  • workflows_gpu = 11634.503, 11634.507, 11634.583, 11634.587, 11634.593, 141.008583
  • relvals_opt_gpu = --what upgrade,standard,highstats,pileup,generator,extendedgen,production,ged,machine,premix,nano
  • addpkg = DQM/Integration

@mmusich
Contributor

mmusich commented Oct 10, 2023

@cmsbuild, please test

@ferencek
Contributor Author

Force-pushed the branch with a corrected commit message (I had written error 31 instead of error 30 by mistake).

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42977/37153

  • This PR adds an extra 20KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

Pull request #42977 was updated. @cmsbuild, @jfernan2, @mandrenguyen can you please check and sign again.

@mmusich
Contributor

mmusich commented Oct 10, 2023

please test

@cmsbuild
Contributor

-1

Failed Tests: RelVals-GPU RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2f5898/35119/summary.html
COMMIT: 6fb6645
CMSSW: CMSSW_13_3_X_2023-10-10-1100/el8_amd64_gcc11
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42977/35119/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-GPU

ValueError: Undefined workflows: 141.008583

RelVals-INPUT

The relvals timed out after 4 hours.

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • Reco comparison results: 16 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3356920
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3356889
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor

mmusich commented Oct 11, 2023

test parameters:

  • enable = gpu
  • workflows_gpu = 141.008583

@mmusich
Contributor

mmusich commented Oct 11, 2023

@cmsbuild, please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2f5898/35121/summary.html
COMMIT: 6fb6645
CMSSW: CMSSW_13_3_X_2023-10-10-2300/el8_amd64_gcc11
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/42977/35121/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 7 lines from the logs
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3356920
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3356895
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 47 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 44865
  • DQMHistoTests: Total failures: 1827
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 43038
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 11 log files, 12 edm output root files, 4 DQM output files
  • TriggerResults: no differences found
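
The DQMHistoTests tallies in the two summaries above can be cross-checked with a short Python sketch. It assumes that failures, nulls, successes, and skipped partition the total, which the reported numbers are consistent with; the helper name `check_tally` is hypothetical and not part of any CMS tool.

```python
def check_tally(total, failures, nulls, successes, skipped):
    """Verify the categories add up to the total and return the failure fraction."""
    if failures + nulls + successes + skipped != total:
        raise ValueError("categories do not add up to the total")
    return failures / total


# Numbers copied from the Comparison Summary and GPU Comparison Summary above.
standard = check_tally(3356920, 3, 0, 3356895, 22)
gpu = check_tally(44865, 1827, 0, 43038, 0)

print(f"standard failure fraction: {standard:.2e}")
print(f"GPU failure fraction:      {gpu:.2e}")
```

Both tallies are internally consistent, and the failure fraction in the GPU comparison is several orders of magnitude larger than in the standard comparison.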

@antoniovilela
Contributor

urgent

@antoniovilela
Contributor

ping @cms-sw/reconstruction-l2

@mandrenguyen
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Contributor

+1
