Automatised scale writer #21

Draft: jacoterh wants to merge 6 commits into main from dynamic-scales-automatised


@jacoterh (Collaborator) commented Oct 3, 2024

This PR adds bin info to the common data files so that the renormalisation scales can be computed automatically. For instance, I made a start in ATLAS_ttZ_13TeV_pTZ:

kinematic: pT_Z
bins:
- pT_Z:
    min: 0
    max: 40
- pT_Z:
    min: 40
    max: 70
- pT_Z:
    min: 70
    max: 110
- pT_Z:
    min: 110
    max: 160
- pT_Z:
    min: 160
    max: 220
- pT_Z:
    min: 220
    max: 290
- pT_Z:
    min: 290
    max: 400

Not only will this help to implement the scales automatically, it also makes it possible to run fits with kinematic constraints, e.g. including only data points below a certain pT threshold.
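A minimal sketch of how such a kinematic cut could act on the bin format above. The helper name `select_bins`, and the convention that a bin is kept only if its upper edge lies at or below the threshold, are assumptions for illustration and not part of this PR.

```python
# Bins as they would look after loading the YAML above (first four shown).
bins = [
    {"pT_Z": {"min": 0, "max": 40}},
    {"pT_Z": {"min": 40, "max": 70}},
    {"pT_Z": {"min": 70, "max": 110}},
    {"pT_Z": {"min": 110, "max": 160}},
]


def select_bins(bins, variable, upper_cut):
    """Return the indices of bins lying entirely below `upper_cut` in `variable`.

    Hypothetical helper. Assumption: bins that do not define `variable`
    are not affected by the cut and are kept.
    """
    kept = []
    for index, bin_info in enumerate(bins):
        edges = bin_info.get(variable)
        if edges is None or edges["max"] <= upper_cut:
            kept.append(index)
    return kept


print(select_bins(bins, "pT_Z", 110))  # [0, 1, 2]
```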

The script scale_writer.py then loops over all common data files, extracts the process identifier from the dataset name, e.g. ttZ, and uses this identifier as the key to look up the relevant function in the dictionary scale_funct_dict.
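The lookup described above could be sketched roughly as follows. Only the names scale_writer.py and scale_funct_dict come from this PR. The helpers `process_id`, `midpoint_scale` and `compute_scales` are hypothetical, the dataset naming pattern `<experiment>_<process>_...` is an assumption, and the bin midpoint is a placeholder rather than the scale choice actually used.

```python
def process_id(dataset_name):
    """Extract the process identifier, assuming <experiment>_<process>_<rest>."""
    return dataset_name.split("_")[1]


def midpoint_scale(variable):
    """Build a placeholder scale function: the midpoint of the `variable` bin."""

    def scale(bin_info):
        edges = bin_info[variable]
        return 0.5 * (edges["min"] + edges["max"])

    return scale


# Placeholder entry; the real dictionary is defined in scale_writer.py.
scale_funct_dict = {"ttZ": midpoint_scale("pT_Z")}


def compute_scales(dataset_name, bins):
    """Pick the scale function for this process and evaluate it on every bin."""
    scale_fn = scale_funct_dict[process_id(dataset_name)]
    return [scale_fn(bin_info) for bin_info in bins]


bins = [{"pT_Z": {"min": 0, "max": 40}}, {"pT_Z": {"min": 40, "max": 70}}]
print(compute_scales("ATLAS_ttZ_13TeV_pTZ", bins))  # [20.0, 55.0]
```

A dataset whose process has no entry in the dictionary raises a `KeyError` in this sketch, which makes missing scale functions easy to spot.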

Todos

  • Implement bin info
  • Implement scale functions

@LucaMantani
Collaborator

I think this is a really cool feature to have!! Especially because it is very good to have this info in the data files, and we can indeed use it to restrict kinematics in the fits!

I was just wondering: you are now using "pT_Z" as the kinematic variable. Do we want to go this way, specifying the particle as well? It could be an option to simply use "pT", which would then be the pT for all particles.

Also, at the moment you are using a format with dictionaries (with max and min). Why is that? For a 1D distribution, a list of numbers would be enough. Do you have something different in mind?

@arossia94
Collaborator

Thanks a lot for this, Jaco! I agree that is a great idea.

Regarding the name of the variables, I'd leave it as explicit as possible, so "pT_Z" is fine by me. It could be useful to understand which pT is used in processes such as WH/ZH without looking at the experimental paper.

On the bin boundaries, I agree that a simple list/array might seem simpler. However, we have datasets with non-continuous bins and even where the bin variable changes within the same dataset. The most egregious example is ATLAS STXS: https://github.com/LHCfitNikhef/smefit_database/blob/dynamic-scales-automatised/commondata/ATLAS_STXS_runII_13TeV.yaml
Either we can split the datasets with those problems or we might need something even more sophisticated than what Jaco did.
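To illustrate the point about the per-bin dictionary format, here is a small sketch with invented bins (not taken from the ATLAS STXS file) in which the variable changes within one dataset and the bins are not contiguous. The variable names, edges and helper functions are purely hypothetical.

```python
# Invented example bins; names and edges are illustrative only.
bins = [
    {"pT_H": {"min": 0, "max": 200}},
    {"pT_H": {"min": 300, "max": 450}},  # gap between 200 and 300
    {"m_jj": {"min": 350, "max": 700}},  # different variable, same dataset
]


def bin_variables(bins):
    """List the variable names used by each bin."""
    return [sorted(bin_info) for bin_info in bins]


def is_contiguous(bins, variable):
    """Check whether consecutive bins in `variable` share an edge."""
    edges = [bin_info[variable] for bin_info in bins if variable in bin_info]
    return all(low["max"] == high["min"] for low, high in zip(edges, edges[1:]))


print(bin_variables(bins))          # [['pT_H'], ['pT_H'], ['m_jj']]
print(is_contiguous(bins, "pT_H"))  # False
```

A flat list of edges could represent neither the gap nor the change of variable, whereas the dictionary format handles both without splitting the dataset.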

@LucaMantani
Collaborator

> Thanks a lot for this, Jaco! I agree that is a great idea.
>
> Regarding the name of the variables, I'd leave it as explicit as possible, so "pT_Z" is fine by me. It could be useful to understand which pT is used in processes such as WH/ZH without looking at the experimental paper.
>
> On the bin boundaries, I agree that a simple list/array might seem simpler. However, we have datasets with non-continuous bins and even where the bin variable changes within the same dataset. The most egregious example is ATLAS STXS: https://github.com/LHCfitNikhef/smefit_database/blob/dynamic-scales-automatised/commondata/ATLAS_STXS_runII_13TeV.yaml Either we can split the datasets with those problems or we might need something even more sophisticated than what Jaco did.

I am fine with specifying more info on the variable, maybe in a specific entry of the yaml file. I was wondering more about the "control" of the variable when we do fits. Imagine that we decide to do a fit with kinematics below 1 TeV: do we want to simply specify "pT < 1 TeV", or do we have to specify "pT_Z < 1 TeV", "pT_W < 1 TeV", etc.?
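One possible way to get both, sketched under the assumption that bin variables are named `<kinematic>_<particle>`: a generic cut on "pT" matches every particle-specific variant, while a cut on "pT_Z" matches only that variable. The function names are hypothetical and values are taken to be in GeV.

```python
def matches(cut_variable, bin_variable):
    """True if the bin variable is the cut variable or a particle-specific variant.

    Assumes '<kinematic>_<particle>' naming, so 'pT' matches 'pT_Z' and 'pT_W'.
    """
    return bin_variable == cut_variable or bin_variable.startswith(cut_variable + "_")


def passes_cut(bin_info, cut_variable, upper_cut):
    """Keep a bin unless a matching variable extends above `upper_cut`."""
    for variable, edges in bin_info.items():
        if matches(cut_variable, variable) and edges["max"] > upper_cut:
            return False
    return True


high_pt_z = {"pT_Z": {"min": 800, "max": 1200}}
low_pt_w = {"pT_W": {"min": 0, "max": 500}}

print(passes_cut(high_pt_z, "pT", 1000))    # False: generic cut applies to pT_Z
print(passes_cut(low_pt_w, "pT", 1000))     # True
print(passes_cut(high_pt_z, "pT_W", 1000))  # True: cut on pT_W ignores pT_Z
```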

Comment thread theory/CMS_tZ_13TeV_pTt.json
Comment thread theory/CMS_tt_13TeV_asy.json
@jacoterh
Collaborator Author

I just realized I need to copy the scales over to the projections too, but that doesn't require any additional checks at least.

@jacoterh
Collaborator Author

The theory files for the projections have been updated too, so the scales in all theory files have now been revised. This PR is good to go if we only care about the scales in the json files. If we also want automatised scales for the ATLAS datasets, we should keep this PR, but this will not impact the scales, as these were already checked by @arossia94.

@LucaMantani
Collaborator

@jacoterh How do you propose to move on? Maybe the best thing is to take only the modifications to the theory files and push them directly to the dynamic-scales branch? This way we don't merge the partial kinematic info of the commondata.

@jacoterh
Collaborator Author

I agree with that! I'll push the json files directly to the dynamic-scales branch and keep this one intact.

Base automatically changed from dynamic-scales to main October 24, 2024 08:45
