@Parsifal-2045

PR description:

Since #48921 made the alpaka-based PixelTracking CA the default for Phase 2, we have observed some failures in IB tests, mostly from older geometries failing an assertion in the CA code (see #49266). The root cause was traced to the way offsetBPIX2 is computed (see #49266 (comment)).
This PR implements a more flexible computation based on the geometry instead of relying on a hard-coded constant stored in simplePixelTopology.h.
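
For readers unfamiliar with the issue, the idea can be illustrated with a toy sketch. It is not the code in this PR: it assumes that offsetBPIX2 is the index of the first detector unit belonging to barrel pixel layer 2, and `DetUnit`, `compute_offset_bpix2`, `HARD_CODED_OFFSET` and the module counts are all hypothetical names and values invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetUnit:
    """Hypothetical stand-in for one pixel module in an ordered geometry."""
    subdetector: str  # "BPIX" (barrel) or "FPIX" (forward)
    layer: int        # layer (barrel) or disk (forward) number, starting at 1


def compute_offset_bpix2(det_units):
    """Return the index of the first BPIX layer-2 module.

    Assumes det_units is ordered the way the geometry numbers its modules,
    with all BPIX layer-1 modules coming first.
    """
    for index, unit in enumerate(det_units):
        if unit.subdetector == "BPIX" and unit.layer == 2:
            return index
    raise ValueError("geometry has no BPIX layer-2 modules")


# Two toy geometries with invented module counts: the older one has fewer
# layer-1 modules, so layer 2 starts at a different index.
older_geometry = (
    [DetUnit("BPIX", 1)] * 4 + [DetUnit("BPIX", 2)] * 6 + [DetUnit("FPIX", 1)] * 3
)
newer_geometry = (
    [DetUnit("BPIX", 1)] * 6 + [DetUnit("BPIX", 2)] * 6 + [DetUnit("FPIX", 1)] * 3
)

# Hypothetical constant that happens to match only the newer geometry.
HARD_CODED_OFFSET = 6

for name, geometry in (("older", older_geometry), ("newer", newer_geometry)):
    derived = compute_offset_bpix2(geometry)
    status = "matches" if derived == HARD_CODED_OFFSET else "differs from"
    print(f"{name}: derived offset {derived} {status} the hard-coded value")
```

In this toy setup a single constant is correct for only one of the two geometries, whereas the derived value follows whichever geometry is loaded.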

PR validation:

Tested locally on workflows that were failing in the latest IB: 23634.0, 24834.0, 25234.0, 25634.0, 26034.0, 26834.0, plus a D110 reference, 29634.0.
Run 3 performance is unaffected, verified using the following recipe:

#!/bin/bash -ex

cmsDriver.py step2 -s HLT:@relval2025,VALIDATION:hltMultiTrackValidation \
    --conditions auto:phase1_2025_realistic \
    --datatier DQMIO \
    -n 1000 \
    --eventcontent DQMIO \
    --geometry DB:Extended \
    --era Run3_2025 \
    --filein file:/eos/cms/store/relval/CMSSW_16_0_0_pre1/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_151X_mcRun3_2025_realistic_v4_STD_2025_PU-v1/2590000/0003c424-1e5a-4177-adbe-5afbbb578af1.root \
    --fileout file:step2.root \
    --nThreads 32 \
    --process HLTX \
    --inputCommands='keep *, drop *_hlt*_*_HLT, drop triggerTriggerFilterObjectWithRefs_l1t*_*_HLT' \
    >step2.log 2>&1

cmsDriver.py step3 -s HARVESTING:postProcessorHLTtrackingSequence \
    --conditions auto:phase1_2025_realistic \
    --mc \
    --geometry DB:Extended \
    --scenario pp \
    --filetype DQM \
    --era Run3_2025 \
    -n 1000 \
    --filein file:step2.root \
    --fileout file:step3.root >step3.log 2>&1

The recipe was run before and after the fix (full plots here).

Phase-2 CA (and CA Extension) physics performance is also unaffected and reflects the results in #48921.

- remove reliance on hard-coded value in simplePixelTopology
- support older geometries (tested from D95)
@cmsbuild

cmsbuild commented Oct 31, 2025

cms-bot internal usage

@mmusich

mmusich commented Oct 31, 2025

type ngt, bug-fix

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49272/46630

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild

A new Pull Request was created by @Parsifal-2045 for master.

It involves the following packages:

  • RecoLocalTracker/SiPixelClusterizer (reconstruction)

@cmsbuild, @jfernan2, @mandrenguyen, @srimanob can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @VinInn, @VourMa, @dkotlins, @elusian, @felicepantaleo, @ferencek, @gpetruc, @mmasciov, @mmusich, @mroguljic, @mtosi, @rovere, @threus, @tsusa this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@mmusich

mmusich commented Oct 31, 2025

test parameters:

  • enable = hlt_p2_integration, hlt_p2_timing
  • workflows = ph2_hlt

@mmusich

mmusich commented Oct 31, 2025

@cmsbuild, please test

@mandrenguyen

urgent
@cms-sw/reconstruction-l2 We would like to have this in by the 2300 IB, if it passes tests. Thanks!

@jfernan2

+1
Code looks OK to me; I pre-sign, conditional on the tests passing.

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

@cmsbuild

-1

Failed Tests: UnitTests
Size: This PR adds an extra 28KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3c80f7/49121/summary.html
COMMIT: 2b0d16b
CMSSW: CMSSW_16_0_X_2025-10-30-2300/el8_amd64_gcc13
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49272/49121/install.sh to create a dev area with all the needed externals and cmssw changes.

HLT P2 Timing: chart

Unit Tests

I found 1 errors in the following unit tests:

---> test test-das-selected-lumis had ERRORS

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 67
  • DQMHistoTests: Total histograms compared: 4381652
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4381620
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 66 files compared)
  • Checked 272 log files, 229 edm output root files, 67 DQM output files
  • TriggerResults: no differences found

@mmusich

mmusich commented Oct 31, 2025

-1

The failures are unrelated to this PR.

@mandrenguyen

ignore tests-rejected with ib-failure

@mandrenguyen

+1

@cmsbuild merged commit 9b802c2 into cms-sw:master Oct 31, 2025
12 of 13 checks passed
@makortel

makortel commented Nov 3, 2025

This PR is a suspect for more crashes on GPUs, see #49288

@Parsifal-2045 deleted the fixOffsetBPIX2 branch December 19, 2025 14:36