Samples for 2025 #51

Open
sofiagiappichini wants to merge 5 commits into main from fixes_samples

Conversation

@sofiagiappichini
Contributor

  • New samples for 2025: the data samples are the appropriate 2025 ones, while the MC is copied from 2024 under a different nick to tell the two apart.
  • Ran maintenance on the rest of the samples, mostly to update the filelists and generator weights.

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds/refreshes NanoAOD v12 sample JSONs (notably 2022postEE data eras E/G) and performs maintenance updates to existing sample metadata/filelists.

Changes:

  • Updated diboson MC filelists to use a different XRootD endpoint.
  • Added new 2022postEE data sample JSONs for several datasets (Muon/EGamma/JetMET/Tau) in eras E and G.
  • Reclassified a 2018 ttH sample into the rem_htautau sample_type and added a corresponding JSON under rem_htautau/.

Reviewed changes

Copilot reviewed 29 out of 4914 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `nanoAOD_v12/2022postEE/diboson/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8_Run3Summer22EENanoAODv12-130X.json` | Switches MC filelist redirector endpoint. |
| `nanoAOD_v12/2022postEE/diboson/WWto2L2Nu_TuneCP5_13p6TeV_powheg-pythia8_Run3Summer22EENanoAODv12-130X.json` | Switches MC filelist redirector endpoint. |
| `nanoAOD_v12/2022postEE/data_G/Tau_Run2022G-22Sep2023-v1.json` | Adds new 2022postEE data(G) Tau sample definition. |
| `nanoAOD_v12/2022postEE/data_G/Muon_Run2022G-22Sep2023-v1.json` | Adds new 2022postEE data(G) Muon sample definition. |
| `nanoAOD_v12/2022postEE/data_G/Muon_Run2022G-19Dec2023-v2.json` | Adds new 2022postEE data(G) Muon sample definition (19Dec reprocessing). |
| `nanoAOD_v12/2022postEE/data_G/JetMET_Run2022G-22Sep2023-v2.json` | Adds new 2022postEE data(G) JetMET sample definition. |
| `nanoAOD_v12/2022postEE/data_G/EGamma_Run2022G-22Sep2023-v2.json` | Adds new 2022postEE data(G) EGamma sample definition. |
| `nanoAOD_v12/2022postEE/data_G/EGamma_Run2022G-16Dec2023-v1.json` | Adds new 2022postEE data(G) EGamma sample definition (16Dec reprocessing). |
| `nanoAOD_v12/2022postEE/data_E/Tau_Run2022E-22Sep2023-v1.json` | Adds new 2022postEE data(E) Tau sample definition. |
| `nanoAOD_v12/2022postEE/data_E/Muon_Run2022E-22Sep2023-v1.json` | Adds new 2022postEE data(E) Muon sample definition. |
| `nanoAOD_v12/2022postEE/data_E/Muon_Run2022E-16Dec2023-v1.json` | Adds new 2022postEE data(E) Muon sample definition (16Dec reprocessing). |
| `nanoAOD_v12/2022postEE/data_E/JetMET_Run2022E-22Sep2023-v1.json` | Adds new 2022postEE data(E) JetMET sample definition. |
| `nanoAOD_v12/2022postEE/data_E/EGamma_Run2022E-16Dec2023-v1.json` | Adds new 2022postEE data(E) EGamma sample definition (16Dec reprocessing). |
| `nanoAOD_v12/2018/ttH/ttHToNonbb_M125_TuneCP5_13TeV-powheg-pythia8_sdaigler-mc_2018UL_Higgs_ttHToNonbb_1736873492-00000000000000000000000000000000.json` | Changes sample_type classification of an existing 2018 ttH sample. |
| `nanoAOD_v12/2018/rem_htautau/ttHToNonbb_M125_TuneCP5_13TeV-powheg-pythia8_sdaigler-mc_2018UL_Higgs_ttHToNonbb_1736873492-00000000000000000000000000000000.json` | Adds a new JSON copy of the same sample under rem_htautau/. |

"root://cmsdcache-kit-disk.gridka.de///store/data/Run2022G/Muon/NANOAOD/19Dec2023-v2/50000/5594c8d2-cbf2-4398-8c3f-59e1310ccce2.root",
"root://cmsdcache-kit-disk.gridka.de///store/data/Run2022G/Muon/NANOAOD/19Dec2023-v2/2540000/812d9b44-dd8e-4e7b-b4c8-ab04828a50aa.root"
],
"generator_weight": 0.0,

Copilot AI Apr 14, 2026

`"generator_weight"` is set to `0.0` for a data sample. If downstream weighting logic multiplies event weights by this field, this will effectively zero out the entire dataset. For data samples, this should be `1.0` (or the field should be omitted if your schema allows), and the same issue should be corrected in the other newly added data JSONs where `generator_weight` is also `0.0` (e.g. EGamma/Muon in 16Dec2023 reprocessings).

Suggested change
- "generator_weight": 0.0,
+ "generator_weight": 1.0,
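
To find every affected file rather than fixing them one by one, a small scan over the sample JSONs could help. This is only a sketch: the function name `find_zero_weight_samples` is hypothetical, and it assumes each sample file holds a single JSON object with a top-level `generator_weight` key, as in the hunk above. It flags any sample with a zero weight and leaves the decision about the correct value to the maintainers.

```python
import json
from pathlib import Path


def find_zero_weight_samples(root):
    """Return paths of sample JSONs whose generator_weight is 0.0.

    Assumption: one JSON object per file with a top-level
    "generator_weight" key. Files with another layout are skipped.
    """
    flagged = []
    for path in sorted(Path(root).rglob("*.json")):
        with path.open() as handle:
            sample = json.load(handle)
        # Skip files that are not a single sample object (e.g. lists).
        if not isinstance(sample, dict):
            continue
        if sample.get("generator_weight") == 0.0:
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # Hypothetical invocation from the repository root.
    for flagged_path in find_zero_weight_samples("nanoAOD_v12/2022postEE"):
        print(flagged_path)
```

Running it on the data directories before merging would list the files that still carry a zero weight.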

Comment on lines +5 to +13
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/360ffa8d-3ff6-43f6-8913-f218623e7b33.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/8dadb227-fd22-468f-8662-323f88eedac9.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/fc9da383-b65b-4764-ad79-cb45d0c240a6.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/f1d480ab-5cdf-4cb3-90e3-7734f4b98ecc.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/24b21ac6-9730-4129-86e8-1678395dbdf9.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/cd89ed79-7e0b-4146-baee-736737634f6f.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/151151cf-93ec-4595-9e36-4b4479f657ec.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/302006a4-431b-439a-b77f-385bffb626c8.root",
"root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/c88b9f27-5c4e-4cf1-b8aa-e4ad82921129.root",

Copilot AI Apr 14, 2026

The filelist redirector was changed from the global CMS redirector to a site-specific endpoint (cmsdcache-kit-disk.gridka.de). This can reduce reliability/portability for users running outside that site and can introduce avoidable outages if that endpoint is unavailable. Prefer using a global redirector (or a configurable redirector/fallback list) unless there’s a documented operational requirement to pin to GridKA.

Suggested change
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/360ffa8d-3ff6-43f6-8913-f218623e7b33.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/8dadb227-fd22-468f-8662-323f88eedac9.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/fc9da383-b65b-4764-ad79-cb45d0c240a6.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/f1d480ab-5cdf-4cb3-90e3-7734f4b98ecc.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/24b21ac6-9730-4129-86e8-1678395dbdf9.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/cd89ed79-7e0b-4146-baee-736737634f6f.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/151151cf-93ec-4595-9e36-4b4479f657ec.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/302006a4-431b-439a-b77f-385bffb626c8.root",
- "root://cmsdcache-kit-disk.gridka.de///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/c88b9f27-5c4e-4cf1-b8aa-e4ad82921129.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/360ffa8d-3ff6-43f6-8913-f218623e7b33.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/8dadb227-fd22-468f-8662-323f88eedac9.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/fc9da383-b65b-4764-ad79-cb45d0c240a6.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/f1d480ab-5cdf-4cb3-90e3-7734f4b98ecc.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/24b21ac6-9730-4129-86e8-1678395dbdf9.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2520000/cd89ed79-7e0b-4146-baee-736737634f6f.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/2540000/151151cf-93ec-4595-9e36-4b4479f657ec.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/302006a4-431b-439a-b77f-385bffb626c8.root",
+ "root://xrootd-cms.infn.it///store/mc/Run3Summer22EENanoAODv12/WZto2L2Q_TuneCP5_13p6TeV_powheg-pythia8/NANOAODSIM/130X_mcRun3_2022_realistic_postEE_v6-v2/50000/c88b9f27-5c4e-4cf1-b8aa-e4ad82921129.root",
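
If the redirector is meant to be configurable rather than hard-coded in every filelist entry, the host part of each URL could be swapped when the filelist is loaded. The sketch below is only an illustration: `with_redirector` is a hypothetical helper, not part of this repository, and it assumes every entry has the form `root://<host>/<path>` as in the lines above. The default host is the one used in the suggested change.

```python
def with_redirector(url, redirector="xrootd-cms.infn.it"):
    """Return `url` with its XRootD host replaced by `redirector`.

    Assumption: the URL looks like root://<host>/<path>. Everything after
    the first slash following the host is kept unchanged, so the
    "///store/..." form used in the filelists is preserved.
    """
    prefix = "root://"
    if not url.startswith(prefix):
        raise ValueError(f"not an XRootD URL: {url}")
    host, separator, path = url[len(prefix):].partition("/")
    if not host or not separator:
        raise ValueError(f"missing host or path in: {url}")
    return f"{prefix}{redirector}/{path}"


def candidate_urls(url, redirectors):
    """Return the same file addressed through each redirector, in order."""
    return [with_redirector(url, redirector) for redirector in redirectors]
```

A job could then try the entries of `candidate_urls(...)` in order, for example the site-specific endpoint first and a global redirector as fallback.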

"nfiles": 21,
"nick": "ttHToNonbb_M125_TuneCP5_13TeV-powheg-pythia8_sdaigler-mc_2018UL_Higgs_ttHToNonbb_1736873492-00000000000000000000000000000000",
- "sample_type": "ttH",
+ "sample_type": "rem_htautau",

Copilot AI Apr 14, 2026

This sample appears to now be defined twice: once here (under the ttH/ directory) and again in the newly added nanoAOD_v12/2018/rem_htautau/...json, with the same "nick". If your sample loader globs all JSONs, this can lead to duplicate sample entries and potential double-counting. To avoid this, either remove/disable one of the duplicate JSONs or ensure only one location is discoverable by the loader for a given nick (e.g., keep it only under rem_htautau/ and revert/remove the definition under ttH/).
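
Whether duplicates like this actually exist can be checked with a short scan. The following is a sketch under assumptions: `find_duplicate_nicks` is a hypothetical name, and it assumes each sample file is a single JSON object with a top-level `"nick"` key, as in the hunk above. It does not know how the real sample loader discovers files, so a reported duplicate is only a candidate for double-counting.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_duplicate_nicks(root):
    """Map each nick that appears in more than one JSON file to its paths.

    Assumption: one JSON object per sample file with a top-level "nick".
    """
    paths_by_nick = defaultdict(list)
    for path in sorted(Path(root).rglob("*.json")):
        with path.open() as handle:
            sample = json.load(handle)
        if isinstance(sample, dict) and "nick" in sample:
            paths_by_nick[sample["nick"]].append(path)
    return {
        nick: paths
        for nick, paths in paths_by_nick.items()
        if len(paths) > 1
    }


if __name__ == "__main__":
    # Hypothetical invocation from the repository root.
    for nick, paths in find_duplicate_nicks("nanoAOD_v12/2018").items():
        print(nick, [str(path) for path in paths])
```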

Comment thread nanoAOD_v15/datasets.json.working Outdated
Collaborator

This file should be removed. We should add it to `.gitignore` so that this does not happen again.

@nshadskiy
Collaborator

As I understand it, the nanoAODv9 changes (e.g. for 2018) only switch `"era": "2018",` to `"era": 2018,`. However, I would like to revert this so that the era parameter is consistently defined as a string, since we cannot use an integer for cases like "2016preVFP" or "2023postBPix".
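
One way to keep the type consistent would be to coerce the field to a string whenever a sample file is written. This is a minimal sketch, assuming a sample is a plain dict with a top-level `"era"` key. `normalize_era` is a hypothetical helper and not an existing function in this repository.

```python
def normalize_era(sample):
    """Return a copy of `sample` with "era" stored as a string.

    Integer eras such as 2018 become "2018". Eras that are already
    strings (e.g. "2016preVFP") are left unchanged.
    """
    fixed = dict(sample)
    if "era" in fixed:
        fixed["era"] = str(fixed["era"])
    return fixed
```

Applied to `{"era": 2018}` this gives `{"era": "2018"}`, while `{"era": "2023postBPix"}` passes through unchanged.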
