Description
Current situation and requested solution
After submitting a selection (e.g. a big pre-selection) to the cluster with the Submitter, the small, split job outputs currently must be merged back into the original samples (e.g. QCD_HT100to200_33.root and QCD_HT100to200_47.root and ... --> QCD_HT100to200.root). Only from the merged samples can a new selection step be started. It would be very good to somehow "persistify" the small files in a workdir (i.e. no longer name it workdir, but rename it so that it does not get deleted by the Submitter's -c option) and create xml files for those. This way, one can more easily run on top of a loose pre-selection that produces very large files.
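As a minimal sketch of the merging bookkeeping described above: split job outputs could be grouped back to their original sample by stripping the trailing job index. The function name and the `_<N>.root` naming convention are assumptions based on the example filenames in this issue, not an existing LEAF helper.

```python
import re
from collections import defaultdict

def group_split_files(filenames):
    """Group split job outputs (e.g. QCD_HT100to200_33.root) by their
    original sample name (QCD_HT100to200). Hypothetical helper, assuming
    outputs are named <sample>_<jobindex>.root."""
    groups = defaultdict(list)
    for name in filenames:
        m = re.match(r"(.+)_\d+\.root$", name)
        # Fall back to the bare name (minus extension) if no job index is found
        sample = m.group(1) if m else name[:-len(".root")]
        groups[sample].append(name)
    return dict(groups)

files = ["QCD_HT100to200_33.root", "QCD_HT100to200_47.root", "TTbar_1.root"]
print(group_split_files(files))
# {'QCD_HT100to200': ['QCD_HT100to200_33.root', 'QCD_HT100to200_47.root'], 'TTbar': ['TTbar_1.root']}
```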
To be done after
- nothing, do at any point
To be done before
- nothing
Steps
The exact steps are somewhat up for discussion; imagine something like:
- make workdir persistent
- create new xml files from the files in the workdir, one file list per sample. In principle this is already done in the Tuplizer: https://github.com/LEAF-HQ/LEAF/blob/master/Tuplizer/TuplizeRunner.py#L164#L226
- come up with a good way of storing these "intermediate" xml files, maybe inside the persistent workdir?
- create a default new xml file for a potential subsequent selection step that makes use of the sample xml files produced in the previous step (similar to https://github.com/LEAF-HQ/LEAF/blob/master/Tuplizer/tuplize.py#L107#L157).