Skip to content

Conversation

@AdrianoDee
Copy link
Contributor

@AdrianoDee AdrianoDee commented Jul 15, 2022

This PR proposes the addition of a TrackerTraits template parameter for all the modules and the datastructres used for pixel tracks reconstruciton. This trait holds all the methods and constants needed to define the pixel tracker topology. Being a template, this would allow to define different datastructure sizes, selection algorithms, fitting features for the whole chain at compile time. In this way one could keep disentangled the different conditions and reconstuction targets (HLT, offline, HIon, Run3, Phase2 ...).

With the implementation of such a structure this PR allows to run on GPU (and CPU) the whole Phase2 Inner Tracker reconstruction (from DIGI to pixel Tracks). A following PR will allow this also for offline pixel seeding and offline HIon pixel track reconstruction.

Find below some comments to the code to (I hope) better navigate it, the validation for run3 pixel tracks w.r.t. to 12_5_X as it is and the performance for Phase2 IT reconstruction. Few open points are still to be addressed but I would leave them documented in an issue and to be assessed in following PRs.

  • better tuning for Phase2 layer pairs selection: high impact on timing;
  • better tuning for Phase2 High Purity selection;
  • extending the TrackerTraits logic to few RawToDigi methods (such as clusterChargeCut, calibDigisPhase2 ...);
  • address CPUvsGPU discrepancies for Phase2;
  • add validation scripts for Phase2 IT reco (local reco and pixel track reco);
  • tuning pixel vertex reconstruction (n.b. in the TDR the legacy one was used);
  • update Riemann fit also (needed? not actually used in production);
  • updating DQM modules such as SiPixelPhase1CompareTrackSoA to be runnable also for Phase2.

This PR also answers the comments that were left on #36235 and corrects the pixel digis calibration on gpu (in gpuCalibPixel.h) to reproduce what is in PixelThresholdClusterizer.cc. This implies changes in the constants to be used moved to

 constexpr float ElectronPerADCGain = 1500;
 constexpr uint16_t Phase2DigiBaseline = 1000;

and in Phase2KinkADC usage in the calibration. See the original algorithm (in PixelThresholdClusterizer) and the parameter sets used (in phase2TrackerDigitizer_cfi and SiPixelClusterizer_cfi)

Code comments

"Backward" compatibility

This has been solved by #38806.

Partial specialization

This has been simplified during the review.

CA params: selections and cuts

In order to have possibly different cuts and selections in cell, tuple and track building and selecting the Params struct used in CAHitNtupletGeneratorKernels has been split in

  • AlgoParams that are those that are common to all topologies and act mostly on the configuration of the algorithm itself;

and some specific to the topology:

  • CellCutsT that includes the cuts (parameters and methods) to be applied at doublets buidling level. Note that at the moment this are the same for all the topology but having them separated would allow it to be changed in future.
  • CAParamsT that include the cuts (parameters and methods) and the ntuplet levels.
  • QualityCutsT that acts on the track selection and that was already there. It has been moved to TrackSoAHeterogeneousT header (that seems more natural) and now the struct itself holds the methods to apply the cuts on the tracks.

This (splitting and moving the cuts to be methods of the structs) whould allow to define specific cuts for specific topologies. In order to allow partial specialization and keep common implementation for topologies derived from a base topology, these are wrapped in a topologyCuts struct.

SimplePixelTopology

Now the SimplePixelTopology holds all the constants that are neede for the pixel tracks reconstruction (so the formerly CAConstants, the gpuPixelDoublets and the former SimplePixelTopology itself) moving the definitions from namespaces to structs in order to use them as template parameters.

Note that some "helper" namespace have been kept following the discussion here (PR discussion).

The

template <typename T>
using isPhase1Topology = typename std::enable_if<std::is_base_of<Phase1,T>::value>::type;

template <typename T>
using isPhase2Topology = typename std::enable_if<std::is_base_of<Phase2,T>::value>::type;

have been defined keeping in mind that we may have modifications to Phase1 and Phase2 topologies (e.g. for HIon in Run3). Thus by defining new TrackerTraits struts that inherits from Phase1 and Phase2 and togheter with the isPhase1Topology and the isPhase2Topology one could:

  • define a new topology with minimal code duplication;
  • use the same specialized templates (for cuts and selections mainly) without having to add a new one for the modified topology.

This is achieved, e.g., the template for ParamsT as

 template <typename TrackerTraits, typename Enable = void>
   struct ParamsT : public AlgoParams

and then specialize it for all the topologies based on Phase1 as

template <typename TrackerTraits>
   struct ParamsT<TrackerTraits, isPhase1Topology<TrackerTraits>> : public AlgoParams

Rolling fits

In HelixFitOnGPU.h the helper

template <auto Start, auto End, auto Inc, class F>  //a compile-time bounded for loop
   constexpr void rolling_fits(F &&f) {
     if constexpr (Start < End) {
       f(std::integral_constant<decltype(Start), Start>());
       rolling_fits<Start + Inc, End, Inc>(f);
     }
   }

is defined to have a "constexpr" for loop useful to avoid code duplication when running the track fits. In this way, with the help of a lambda function the fits may be unrolled as:

       riemannFit::rolling_fits<4, TrackerTraits::maxHitsOnTrackForFullFit, 1>(
           [this, &hv, &tkidGPU, &hitsGPU, &hits_geGPU, &fast_fit_resultsGPU, &offset](auto i) {
             kernel_BLFastFit<i, TrackerTraits>(tuples_,
                                                tupleMultiplicity_,
                                                hv,
                                                tkidGPU.get(),
                                                hitsGPU.get(),
                                                hits_geGPU.get(),
                                                fast_fit_resultsGPU.get(),
                                                i,
                                                i,
                                                offset);

             kernel_BLFit<i, TrackerTraits>(tupleMultiplicity_,
                                            bField_,
                                            outputSoa_,
                                            tkidGPU.get(),
                                            hitsGPU.get(),
                                            hits_geGPU.get(),
                                            fast_fit_resultsGPU.get());
           });

With maxHitsOnTrackForFullFit a new constant defined to limit the number of hits used in the fit.


Phase2

Physics validation

Quadruplets:

Triplets:


Run3

Physics validation

Quadruplets:

Triplets:

Throughput

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

Evaluated on fu-c2a02-37-02 T4@P5 via patatrack-scripts.

There's a gain of 7-9%

Dependency graphs

GPU dependency graph

gpu_dep

CPU dependency graph

cpu_dep

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38761/31086

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38761/31088

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 15, 2022

A new Pull Request was created by @AdrianoDee for master.

It involves the following packages:

  • CUDADataFormats/Common (heterogeneous)
  • CUDADataFormats/SiPixelDigi (heterogeneous, reconstruction)
  • CUDADataFormats/Track (heterogeneous, reconstruction)
  • CUDADataFormats/TrackingRecHit (heterogeneous, reconstruction)
  • DQM/SiPixelPhase1Heterogeneous (dqm)
  • Geometry/CommonTopologies (geometry)
  • Geometry/TrackerGeometryBuilder (geometry)
  • HLTrigger/Configuration (hlt)
  • RecoLocalTracker/SiPixelClusterizer (reconstruction)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)
  • RecoPixelVertexing/PixelTrackFitting (reconstruction)
  • RecoPixelVertexing/PixelTriplets (reconstruction)
  • RecoPixelVertexing/PixelVertexFinding (reconstruction)
  • RecoTauTag/HLTProducers (hlt)
  • RecoTracker/TkSeedGenerator (reconstruction)

@Martin-Grunewald, @civanch, @Dr15Jones, @bsunanda, @makortel, @clacaputo, @emanueleusai, @ianna, @mdhildreth, @cmsbuild, @missirol, @jfernan2, @ahmad3213, @fwyzard, @jpata, @pmandrik, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@felicepantaleo, @Martin-Grunewald, @bsunanda, @OzAmram, @fioriNTU, @mbluj, @threus, @mmusich, @venturia, @hdelanno, @silviodonato, @JanFSchulte, @dgulhan, @missirol, @azotz, @ferencek, @GiacomoSguazzoni, @rovere, @VinInn, @jandrea, @idebruyn, @ebrondol, @mtosi, @fabiocos, @dkotlins, @gpetruc, @tvami this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor

When trying to run on Run3 RAW files on which the new HLT step has been run (from 12_3_X on). Having tried the usual class versioning instructions (and the CMS_CLASS_VERSION tool) and being unsuccessful I've found a workaround to:

  1. avoid making useless all the samples already produced for Run3 since the HLT integration;
  2. avoid adding an explicit drop for pixel tracks productes in the RECO step, that seems unreasonable to me;

I've linked here the errors I refer to that happens if I simply redefine the class with the new TrackerTraits.

This is a bit strange since all CUDADataFormats should be transient. Ok, there was a period of time when some of them were not and this was causing issues.

@AdrianoDee
Copy link
Contributor Author

AdrianoDee commented Jul 18, 2022

This is a bit strange since all CUDADataFormats should be transient. Ok, there was a period of time when some of them were not and this was causing issues.

This is odd also to me. I've set up a basic test (simply changing maxNumber() in TrackSoAHeterogeneousT) that outputs the same error as above (full log here):

scram p -n dev CMSSW_12_5_0_pre2 
cd dev/src/ 
cmsenv 
git cms-init 
git cms-merge-topic AdrianoDee:cudaformats_failures 
scram b -j 16 
cmsDriver.py step3 -s RAW2DIGI:RawToDigi_pixelOnly,RECO:reconstruction_pixelTrackingOnly,VALIDATION:@pixelTrackingOnlyValidation,DQM:@pixelTrackingOnlyDQM --conditions auto:phase1_2022_realistic --datatier GEN-SIM-RECO,DQMIO -n 10 --eventcontent RECOSIM,DQM --geometry DB:Extended --era Run3 --procModifiers pixelNtupletFit,gpu --filein /store/relval/CMSSW_12_5_0_pre2/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_124X_mcRun3_2022_realistic_v3-v1/2580000/148e5f22-ea2b-4dfb-b64f-79e50be63665.root --no_exec --python_filename=step3.py 
cmsRun step3.py

this happening with or without CMS_CLASS_VERSION. At the same time something like runTheMatrix.py -l 11634.502 -w upgrade -j 8 runs smoothly and this points to a problem in reading from already produced samples.

@AdrianoDee
Copy link
Contributor Author

type tracking

@AdrianoDee
Copy link
Contributor Author

test parameters:

  • enable_tests = gpu
  • workflows_gpu = 11634.502, 39434.502, 39634.502, 11634.506, 39434.506, 39634.506
  • workflows = 11634.501, 39434.501, 39434.501, 11634.502, 39434.502, 39634.502, 11634.505, 39434.505, 39634.505,11634.506, 39434.506, 39634.506
  • relvals_opt= -w standard,highstats,pileup,generator,extendedgen,production,upgrade,cleanedupgrade,ged

@AdrianoDee
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0dd2fc/26300/summary.html
COMMIT: 636a79c
CMSSW: CMSSW_12_5_X_2022-07-17-2300/el8_amd64_gcc10
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38761/26300/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

  • 39500.039500.0_CloseByPGun_CE_H_Coarse_Scint+2026D88+CE_H_Coarse_Scint_GenSimHLBeamSpotHGCALCloseBy+DigiTrigger+RecoGlobal+HARVESTGlobal/step3_CloseByPGun_CE_H_Coarse_Scint+2026D88+CE_H_Coarse_Scint_GenSimHLBeamSpotHGCALCloseBy+DigiTrigger+RecoGlobal+HARVESTGlobal.log
  • 39496.039496.0_CloseByPGun_CE_E_Front_120um+2026D88+CE_E_Front_120um_GenSimHLBeamSpotHGCALCloseBy+DigiTrigger+RecoGlobal+HARVESTGlobal/step3_CloseByPGun_CE_E_Front_120um+2026D88+CE_E_Front_120um_GenSimHLBeamSpotHGCALCloseBy+DigiTrigger+RecoGlobal+HARVESTGlobal.log
  • 23234.023234.0_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal/step3_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal.log
Expand to see more relval errors ...

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19876
  • DQMHistoTests: Total failures: 1919
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 17957
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: found differences in 3 / 3 workflows

@slava77
Copy link
Contributor

slava77 commented Nov 29, 2022

@cms-sw/geometry-l2
your signature is still missing
please sign or at least comment when your review can happen
Thank you.

@bsunanda
Copy link
Contributor

+geometry

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Copy link
Contributor

+1

Extensive review has concluded.

@perrotta
Copy link
Contributor

perrotta commented Dec 1, 2022

Merging such a complicate PR with 18 days old PR tests lead to a broken IB, because #40003 was merged 2 days after the last IB tests. A fix is proposed in #40206

@AdrianoDee AdrianoDee deleted the tracker_traits_125 branch December 1, 2022 10:53
missirol added a commit to missirol/cmssw that referenced this pull request Jan 31, 2023
missirol added a commit to missirol/cmssw that referenced this pull request Jan 31, 2023
cmsbuild added a commit that referenced this pull request Feb 1, 2023
…61and40264

remove duplicate plugins left by #38761 and #40264
fllor pushed a commit to fllor/cmssw that referenced this pull request Feb 20, 2023
NJManganelli pushed a commit to NJManganelli/cmssw that referenced this pull request Feb 24, 2023
aloeliger pushed a commit to aloeliger/cmssw that referenced this pull request Feb 24, 2023
@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals-GPU
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0dd2fc/37462/summary.html
COMMIT: 5cb46ec
CMSSW: CMSSW_14_1_X_2024-02-14-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38761/37462/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-GPU

ValueError: Undefined workflows: 11634.502, 11634.506

Comparison Summary

Summary:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.