Skip to content

Conversation

@bryates
Copy link

@bryates bryates commented Feb 5, 2025

This PR fixes condor on CMS Connect. The cmssw-cc7-condor-python27 singularity container is required to run, and must be activated from the top genproductions directory.

rm $WORKDIR/pilotrun_gridpack.tar.gz

# awightma start: Force the re-weight step to only use 1 core
echo "nb_core = 1" >> $WORKDIR/process/madevent/Cards/me5_configuration.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not affect phase space integration?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is the step after when EFT reweighting is done. I can remove this too since it's not for general use.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to a bug in madgraph that doesn't work when nb_core is not 1? If so, it might be worth keeping it

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might no longer be the case, but when the reweighting step gets run using a multicore setup it would crash or otherwise incorrectly combine the jobs produced by each core.

From what I remember it seemed like some sort of race condition where the code would crash if the first/initial job didn't finish first and would complete when the first job finished first, but in this case I was skeptical of the output since it seemed like the subsequent jobs would overwrite the output in a way that did not seem correct (e.g. the output file size seemed to bounce around as the jobs completed with no indication that there was some sort of post-job merging going on).

The other reason to limit the number of cores explicitly is b/c otherwise madgraph will default to try and use as many cores as possible on the host machine and these gridpacks were produced using CMSConnect, which meant that the reweighting step would end up saturating every possible core on the submission node for CMSConnect, which would be very bad.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... yes madgraph had this bug for the initial phase space integration long time ago (eating up all cores) but looks like for reweighting this behavior is still there in 2.6.5 (not sure about 2.9.x versions).

@bbilin @lviliani maybe consider this as well in master branch? or i am raising this up again which @DickyChant and I were hoping for : it's a lot easier just to keep master as the default for all campaigns instead of spending time to fix sth on both branches and deprecate old ones if the analyses are just new

elif [[ $SYSTEM_RELEASE == *"release 7"* ]]; then
cmssw_version=CMSSW_10_6_19
elif [[ $SYSTEM_RELEASE == *"release 9"* ]]; then
cmssw_version=CMSSW_13_2_9
Copy link
Collaborator

@sihyunjeon sihyunjeon Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

el9 gridpacks won't work for official production yet (that's why we just didn't care about el9 so much in the script), or is the container-in-container prepared? @DickyChant

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should officially disable el9 instead of making it work like this (el9 gridpacks are not going to be usable for official production) , talked to sitian

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

addons_dir="./addons" # awightma
if [ -e "$input_files" ]; then rm "$input_files"; fi
tar -zchf "$input_files" "$card_dir" "$patches_directory" "$utilities_dir" "${plugin_directory}"
tar -zchf "$input_files" "$card_dir" "$patches_directory" "$utilities_dir" "${plugin_directory}" "${addons_dir}"
Copy link
Collaborator

@sihyunjeon sihyunjeon Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not safe since not all people are going to be using a model that lives under addons directory and tar command will eventually crash if it does not exist. why not just steer it with extramodels.dat ?

e.g.
https://github.com/cms-sw/genproductions/blob/master/bin/MadGraph5_aMCatNLO/cards/production/13TeV/SingleLepton_TypeIHeavyN/DYtoLNuJJ/DYtoLNuJJ_TypeIHeavyN-El_MN500_TuneCP5_13TeV-amcatnlo-pythia8/DYtoLNuJJ_TypeIHeavyN-El_MN500_TuneCP5_13TeV-amcatnlo-pythia8_extramodels.dat

this will be wgetted from link

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining, I've removed this. Our current model is not in https://cms-project-generators.web.cern.ch/cms-project-generators/. Can we add it or can only people in the gen group make changes?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should ask GEN conveners

@lviliani
Copy link
Contributor

lviliani commented Mar 5, 2025

Hello, as we discussed this PR should go in the master branch as well.
We can merge it also in mg265UL given that it's ready and then we don't touch this branch anymore.
@sihyunjeon is this PR ready to be merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants