Description
This is related to a WMCore issue:
Bug description
When deploying T0 replays with a large number of jobs, one of the WMCore components fails, complaining about duplicated LFNs. Our LFN patterns look like this:
/store/unmerged/HG2202_Val/RelValProdMinBias/GEN-SIM/HG2202_Val_OLD_Alanv4-v22/00000/2AE85F14-94A1-EC11-BBF5-FA163EC7AA59.root
where 2AE85F14-94A1-EC11-BBF5-FA163EC7AA59 is the GUID extracted from the ROOT file through the framework XML job report.
So we are effectively observing two different jobs generating files with the same GUID.
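For anyone who wants to check a list of output files for this symptom, here is a minimal sketch in Python. It is not WMCore code: the helper names `guid_from_lfn` and `find_duplicate_guids` and the `(job_id, lfn)` input shape are hypothetical, and it assumes the GUID is always the file name stem of the LFN, as in the example above.

```python
import posixpath
import re
from collections import defaultdict

# GUID layout as seen in the example LFN: 8-4-4-4-12 hex digits.
GUID_RE = re.compile(r"^[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$")


def guid_from_lfn(lfn):
    """Return the upper-cased GUID in the LFN file name, or None if absent.

    Assumption: the GUID is the file name without its extension.
    """
    stem = posixpath.splitext(posixpath.basename(lfn))[0]
    return stem.upper() if GUID_RE.match(stem) else None


def find_duplicate_guids(job_lfns):
    """Map each GUID produced by more than one job to the sorted job ids.

    job_lfns is an iterable of (job_id, lfn) pairs (hypothetical input shape).
    """
    jobs_by_guid = defaultdict(set)
    for job_id, lfn in job_lfns:
        guid = guid_from_lfn(lfn)
        if guid is not None:
            jobs_by_guid[guid].add(job_id)
    return {g: sorted(jobs) for g, jobs in jobs_by_guid.items() if len(jobs) > 1}
```

Feeding it the LFNs collected from the job reports of a replay should return an empty dictionary when every job produced a unique GUID. Any entry in the result names the jobs that share a GUID.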
We get the GUID from the framework XML job report here:
And since the GUID from the FW report seems to be generated here:
https://github.com/cms-sw/cmssw/blob/master/FWCore/Utilities/src/Guid.cc#L18-L28
I'm reporting the issue here.
How to reproduce
Deploy a Tier0 replay with a large number of jobs. I think @germanfgv can help with this if needed. Recently this year, at least one incident has been reported per week.