Skip to content

Revision of the PSimHit type storage, revert trackId to its basic meaning#49969

Open
fabiocos wants to merge 8 commits intocms-sw:masterfrom
fabiocos:fc-psimhit-20260122
Open

Revision of the PSimHit type storage, revert trackId to its basic meaning#49969
fabiocos wants to merge 8 commits intocms-sw:masterfrom
fabiocos:fc-psimhit-20260122

Conversation

@fabiocos
Copy link
Contributor

PR description:

Following #49732 and the discussion in the Simulation meeting https://indico.cern.ch/event/1634520/contributions/6878108/attachments/3199713/5699141/SIM_20260116.pdf , this PR proposes a revision of the mechanism used to store and propagate the hit type classification, so far used only by MTD for Phase2, but in principle applicable to any interested sub-detector.

trackId is reverted to the pure Geant4 id, without any offset and limitation on the maximum available (beyond the uint32_t capability), so as not to interfere with exceptionally populated events, and software developments for GPU.

The hit type is moved as a 7-bits subfield in the processType member of PSimHit, where the maximum code is at present 403, and this is a used-defined integer, not a pseudo-random variable. The interface is adapted, the assignment of hit type in MTD is adjusted accordingly, and all the dependencies in the code (truth accumulator, validation) are adapted.

PR validation:

Tests on 100 single pions (wf 34506.0) are successful, the detailed Geant4 debug printout shows the desired behaviour, when activated, and the usual DQM histogram test shows perfect comparison between histograms.

…ncies, temporarily silence use of offset to be replaced
Add test for PSimHit storage of process/hit type
Enable backward compatibility for PSimHit
@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 28, 2026

cms-bot internal usage

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49969/47752

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @fabiocos for master.

It involves the following packages:

  • RecoMTD/TimingIDTools (reconstruction)
  • SimDataFormats/CaloAnalysis (simulation)
  • SimDataFormats/TrackingHit (simulation)
  • SimG4CMS/Forward (simulation)
  • SimG4Core/Application (simulation)
  • SimG4Core/Notification (simulation)
  • SimGeneral/CaloAnalysis (simulation)
  • Validation/MtdValidation (dqm)

@Moanwar, @civanch, @cmsbuild, @ctarricone, @gabrielmscampos, @jfernan2, @kpedro88, @mandrenguyen, @mdhildreth, @nothingface0, @rseidita, @srimanob can you please review it and eventually sign? Thanks.
@ReyerBand, @VinInn, @VourMa, @apsallid, @bsunanda, @denizsun, @elusian, @felicepantaleo, @makortel, @martinamalberti, @missirol, @mmasciov, @mmusich, @mtosi, @rovere, @salimcerci, @slomeo, @thomreis, @wang0jin this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Copy link
Contributor Author

@civanch @kpedro88 this is a possible solution to the problem of avoiding limits to trackId, that preserves the existing functionalities. Please comment.

@fabiocos
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: UnitTests RelVals RelVals-INPUT
Size: This PR adds an extra 116KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cea6a3/50974/summary.html
COMMIT: 5606fe1
CMSSW: CMSSW_16_1_X_2026-01-28-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49969/50974/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed Unit Tests

I found 1 errors in the following unit tests:

---> test testPhase2PixelNtuple had ERRORS

Failed RelVals

  • 25202.0A fatal system signal has occurred: segmentation violation
  • 14234.0A fatal system signal has occurred: segmentation violation
  • 312.0A fatal system signal has occurred: segmentation violation
Expand to see more relval errors ...

Failed RelVals-INPUT

  • 11024.211024.2_TTbar_13UP18HEfailINPUT/step2_TTbar_13UP18HEfailINPUT.log

@kpedro88
Copy link
Contributor

kpedro88 commented Feb 5, 2026

Good point, there does not seem to be any explicit need for PSimHit.h to depend on SimHitCategory.h.

@fabiocos
Copy link
Contributor Author

fabiocos commented Feb 5, 2026

@civanch @kpedro88 to be practical, summarizing:

  • static constants for shift/mask as public in PSimHit.h, at that point no need to declare BscG4Hit as friend
  • as a consequence no need to include SimHitCategory.h in PSimHit.h
  • Any objection to including PSimHit.h into BscG4Hit.h? I do not see it as a problem, otherwise we need to move static constants outside PSimHit.h, as they originally were.

@makortel based on previous discussion and tests, I see no need to change data member name to have a different checksum, as this seems anyway irrelevant in solving the real pending issue of this PR, i.e. the backward compatibility correct operation. To be clear: backward compatibility is an issue here only for MTD, and only for the performance validation part, where MC truth is used. This is of course very important for our studies, but in principle so far there is no big production, apart for RelVals, where this is present to my knowledge. And code developments should anyway evolve on more recent versions.

Still I find quite unpleasant not being able to ensure the proper operation of the schema evolution. But I am running out of ideas about what else I could try.

Looking at the crashes, they always mention this statement

==5276==    by 0x2DFAB61F: ROOT::new_Basic3DVectorlEfloatgR(void*) (in /cvmfs/cms.cern.ch/el8_amd64_gcc13/cms/cmssw/CMSSW_16_1_0_pre1/lib/el8_amd64_gcc13/libDataFormatsGeometryVector.so)

This PR is never touching the positions that are stored into the object, so I am a bit puzzled. How do you suggest to proceed? This move is not strictly speaking super urgent, but the sooner the better, and for sure before next big MC campaign.

@makortel
Copy link
Contributor

makortel commented Feb 5, 2026

Maybe investigating the crash with ROOT debug build would shed some light? We have a recent 16_1_X ROOT debug build available in cms-sw/cmsdist#10212 (comment) (run the install script to set up CMSSW dev area, cherry-pick your changes and recompile). Ah, this approach would likely "require" recompiling ~everything (all depending code must be, and I'm not sure if it is worth of trying to figure out that set). I guess theoretically we could ask the bot to build this PR with the ROOT debug build option.

@fabiocos
Copy link
Contributor Author

fabiocos commented Feb 5, 2026

@makortel I'll check that

@kpedro88
Copy link
Contributor

kpedro88 commented Feb 5, 2026

I think the backtrace always shows that line because the first members constructed in PSimHit are:

  Local3DPoint theEntryPoint;  // position at entry
  Local3DVector theSegment;    // exitPos - entryPos

but I'm not sure why the constructor fails immediately from seemingly innocuous changes.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2026

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49969/47903

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@civanch
Copy link
Contributor

civanch commented Feb 5, 2026

@fabiocos , my proposal was different: no friends; make a separate header with static constants, which are included in all classes of interest.

@fabiocos
Copy link
Contributor Author

fabiocos commented Feb 5, 2026

@civanch which contradicts #49969 (comment) and brings us back to the beginning. Please agree among yourself and let me know, otherwise we keep looping.

At present there is no new friend statement that was not there previously.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2026

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2026

Pull request #49969 was updated. @Moanwar, @civanch, @cmsbuild, @ctarricone, @gabrielmscampos, @jfernan2, @kpedro88, @mandrenguyen, @mdhildreth, @nothingface0, @rseidita, @srimanob can you please check and sign again.

@kpedro88
Copy link
Contributor

kpedro88 commented Feb 5, 2026

@civanch I strongly prefer the universal mask/shift constants to be members of PSimHit, because they are used within the member functions of PSimHit to decode the values. PSimHit necessarily depends on them, so keeping them separate is a bad design. Only the per-subdetector constants should be separate.

@fabiocos
Copy link
Contributor Author

fabiocos commented Feb 6, 2026

please test with cms-sw/cmsdist#10212

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 7, 2026

-1

Failed Tests: UnitTests RelVals RelVals-INPUT
Size: This PR adds an extra 36KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cea6a3/51149/summary.html
COMMIT: 74e840e
CMSSW: CMSSW_16_1_X_2026-02-05-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49969/51149/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cea6a3/51149/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cea6a3/51149/git-merge-result

Failed Unit Tests

I found 2 errors in the following unit tests:

---> test testPhase2PixelNtuple had ERRORS
---> test testDiMuonBiasesPlotting had ERRORS

Failed RelVals

  • 25202.0A fatal system signal has occurred: segmentation violation
  • 14234.0A fatal system signal has occurred: segmentation violation
  • 312.0A fatal system signal has occurred: segmentation violation
Expand to see more relval errors ...

Failed RelVals-INPUT

  • 11024.211024.2_TTbar_13UP18HEfailINPUT/step2_TTbar_13UP18HEfailINPUT.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants