Description
Implement functionality to submit jobs to Dask and/or condor for the preselection looper.
The implementation would go in the `submit_jobs(self)` method in prep_helper.py.
Likely the cleanest way to do it would be to make a Batch directory and build helper classes for Dask/condor submission there.
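A minimal sketch of what such helper classes could look like, assuming a common interface so the looper never needs to know which backend is in use. The names `BatchSubmitter`, `LocalSubmitter`, `DaskSubmitter` and `process_chunk` are hypothetical and not part of the existing code; a condor backend would implement the same `run` method (for example by writing and submitting a job description) and is not sketched here.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class BatchSubmitter(ABC):
    """Hypothetical common interface for all batch backends."""

    @abstractmethod
    def run(self, func, jobs):
        """Call func(job) for every job and return the results in job order."""


class LocalSubmitter(BatchSubmitter):
    """Runs jobs in local threads; handy for debugging before using a cluster."""

    def __init__(self, n_workers=4):
        self.n_workers = n_workers

    def run(self, func, jobs):
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            return list(pool.map(func, jobs))


class DaskSubmitter(BatchSubmitter):
    """Sends jobs to a Dask scheduler (needs the dask.distributed package)."""

    def __init__(self, address=None):
        # Imported lazily so that Dask stays an optional dependency.
        from dask.distributed import Client

        # With address=None, Client starts a local cluster.
        self.client = Client(address)

    def run(self, func, jobs):
        futures = self.client.map(func, jobs)
        return self.client.gather(futures)


def process_chunk(chunk):
    # Placeholder for the per-job work. A real job would loop over events,
    # write a dataframe, and return the path of the output file.
    return "output_%s.pkl" % chunk
```

With this layout, `submit_jobs` could reduce to something like `self.outputs = submitter.run(process_chunk, chunks)`, where `submitter` is whichever backend was configured.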
Merging of output dataframes would also need to be updated accordingly:
HggAnalysisDev/Preselection/helpers/loop_helper.py, lines 197 to 208 in a96044c:

    def merge_outputs(self):
        master_file = self.output_dir + self.selections + "_" + self.output_tag + ".pkl"
        master_df = pandas.DataFrame()
        for file in self.outputs:
            if self.debug > 0:
                print("[LoopHelper] Loading file %s" % file)
            if not os.path.exists(file):
                continue
            df = pandas.read_pickle(file)
            master_df = pandas.concat([master_df, df], ignore_index=True)
        master_df.to_pickle(master_file)
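One possible direction for the updated merge, sketched as a standalone function under a few assumptions: each batch job writes one pickle, some jobs may fail and leave no file, and the caller wants to know which outputs are missing so those jobs can be resubmitted. The name `merge_pickles` is hypothetical. Collecting the frames in a list and concatenating once also avoids re-copying the growing dataframe on every iteration.

```python
import os

import pandas


def merge_pickles(files, master_file):
    """Concatenate the per-job pickles that exist; return the missing paths."""
    frames, missing = [], []
    for path in files:
        if os.path.exists(path):
            frames.append(pandas.read_pickle(path))
        else:
            # Keep track of absent outputs instead of skipping them silently.
            missing.append(path)

    if frames:
        master_df = pandas.concat(frames, ignore_index=True)
    else:
        master_df = pandas.DataFrame()

    master_df.to_pickle(master_file)
    return missing
```

A non-empty return value would signal that the merged file is incomplete.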
Another thing to keep in mind: it would be nice to have the batch submission tools not be entirely specific to the looper (or at least easily generalizable), as they will also be useful for MVA training (e.g. hyperparameter scans) and Signal Region Optimization (scanning MVA cut values).
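To illustrate the generalization, here is a rough sketch of a scan helper that does not depend on the looper at all: it expands a parameter grid and hands each point to any map-like dispatcher. The names `build_grid`, `run_scan`, `toy_metric` and the parameters `max_depth` and `eta` are made up for the example; the default dispatcher is Python's built-in `map`, and a batch backend could be swapped in through `map_fn`.

```python
from itertools import product


def build_grid(param_ranges):
    """Expand {"name": [values, ...]} into one dict per combination."""
    names = list(param_ranges)
    return [dict(zip(names, values)) for values in product(*param_ranges.values())]


def run_scan(func, param_ranges, map_fn=map):
    """Evaluate func on every grid point through any map-like dispatcher."""
    grid = build_grid(param_ranges)
    return list(zip(grid, map_fn(func, grid)))


def toy_metric(point):
    # Stand-in for a training job or a significance evaluation.
    return point["max_depth"] * point["eta"]


if __name__ == "__main__":
    for point, value in run_scan(toy_metric, {"max_depth": [3, 5], "eta": [0.1, 0.3]}):
        print(point, value)
```

The same pattern would cover scanning MVA cut values: only the function and the grid change, not the submission code.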