Scripts and code used for the ongoing Level 1 LLP tagging project for CMS.
Conda will need to be set up in order to run any of the scripts, including launching Jupyter notebooks. In a directory of your choice, run the following commands:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
export PATH="$HOME/miniconda3/bin:$PATH"
conda config --set auto_activate_base false
After installing conda, we need to set up a virtual environment containing the modules used in the scripts. To create and then activate the virtual environment, run:
git clone /path/to/this/repo
conda env create -f environment.yml
conda activate L1JetTag
Note that the environment name, L1JetTag, is set inside the .yml file.
As an example, ROOT files named <file_name>.root contain particle data resulting from proton-proton collision events inside the CMS detector at the LHC. Each of those files contains a portion of the data, and they can be combined into a single file by hadding them. A quick way to get access to hadd is to install a version of the CMS software with the following commands:
source /cvmfs/cms.cern.ch/cmsset_default.sh
export SCRAM_ARCH=slc7_amd64_gcc10
cmsrel CMSSW_12_4_1
cd CMSSW_12_4_1/src
cmsenv
Running cmsenv makes the `hadd` command available. The next step is to put all of the file paths into a `.sh` file to be executed. The steps are outlined below:
- Use the Jupyter notebook named `Fix_paths.ipynb`. Note that the file paths you will be using are different; you can just remove the `cPMinBias` in the file names.
- Once you have run through the notebook, execute the resulting `.sh` file via `bash <file>.sh`.
- If several `hadd.root` files result, just make another `.sh` file following the same format as the one produced in the previous step (a minimal sketch of that step is shown after this list). You don't need the notebook for this, since there are only a few files and you can add the paths manually. Then hadd them again in the same way until you are left with a single hadd file.
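For reference, such a `.sh` file is just a list of `hadd` calls of the form `hadd <output>.root <input1>.root <input2>.root ...`. Below is a minimal sketch of how one could generate it with Python; the glob pattern and output names are placeholders, and `Fix_paths.ipynb` remains the intended way to build it for the real Ntuple paths.

```python
# Sketch: write a .sh file containing a single hadd command.
# The input pattern and output names are placeholders, not the real file paths.
import glob

input_files = sorted(glob.glob("/path/to/ntuples/*.root"))
with open("hadd_step1.sh", "w") as f:
    # hadd <output>.root <input1>.root <input2>.root ...
    f.write("hadd hadd_step1.root " + " ".join(input_files) + "\n")
```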
Once you have the single hadd file, move on to the next section.
The Ntuples (ROOT files) containing particle data, i.e. events with particles, can be accessed using the grid through LXPLUS or LPC, for instance. You don't need the grid if you are working on UAF or Mulder. Once we can access the Ntuples, we need to extract the particle information and cluster the particles into jets with the dataForgeScripts/dataForge.py script. The script can be run from the terminal with the following command and arguments:
cd dataForgeScripts
python3 dataForge.py </path/to/file> QCDpt30 30 50 0
Order of the arguments is as follows:
path to file (using xrootd: root://cmsxrootd.fnal.gov///store/... if applicable)
tag = "QCDpt30" or "Stpt30" in this case. This is just a tag to be added to the name of the file.
ptCut = 30 (so, jets with pT > 30 GeV in this case).
trainPercent = 50 (so 50% of the data goes to the training set in this case).
usePuppi = 0 (0 for pf, 1 for PUPPI).
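dataForge.py writes out train and test `.h5` files. If you want a quick sanity check of what was produced, something like the following works (the file name here is a placeholder; the dataset names inside depend on the script, so we simply list them):

```python
# List every dataset in an output .h5 file together with its shape and dtype.
import h5py

with h5py.File("sampleData_QCDpt30_train.h5", "r") as f:  # placeholder file name
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```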
We have two main sources of background in our data: all jets reconstructed from the QCD Ntuples and some jets reconstructed from our signal Ntuples. The latter are jets in signal events that are not matched to an LLP, as determined by the DeltaR value. The script removeBackground.py removes such jets to create a true-signal dataset. For that, we need to pass it the .h5 files (test and train) generated by dataForge.py from the signal Ntuples. We can do that by running:
python3 removeBackground.py "<path/train/filename.h5>" "<path/test/file.h5>"
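For context, the matching criterion is the usual DeltaR distance between a jet and the generator-level LLP. A minimal sketch of the idea is below; the 0.4 cone size and the example coordinates are illustrative assumptions, not necessarily the values used in removeBackground.py.

```python
# Sketch of DeltaR matching between a jet and an LLP.
# The 0.4 cone size and the example coordinates are illustrative assumptions.
import numpy as np

def delta_r(eta1, phi1, eta2, phi2):
    dphi = np.mod(phi1 - phi2 + np.pi, 2 * np.pi) - np.pi  # wrap phi difference to [-pi, pi)
    deta = eta1 - eta2
    return np.sqrt(deta**2 + dphi**2)

jet_eta, jet_phi = 1.20, 0.30   # example jet axis
llp_eta, llp_phi = 1.25, 0.35   # example LLP direction
print("LLP-matched jet:", delta_r(jet_eta, jet_phi, llp_eta, llp_phi) < 0.4)
```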
We can run kerasModel.py by passing three arguments the following way:
python3 kerasModel.py "<SignalTrainFile.h5>" "<BackgroundTrainFile.h5>" "<JetData_TrainFile.h5>"
These files should have been produced by dataForge.py and/or removeBackground.py. The "<JetData_TrainFile.h5>" will look something like "sampleData...h5" resulting from `dataForge.py`.
Inside ROC.py, add the paths of the testing data produced by dataForge.py and/or removeBackground.py.
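ROC.py builds the ROC curve from the model's predictions on that testing data. A minimal sketch of that kind of evaluation is shown below; the file names and the "jets" dataset key are assumptions, and the actual script may organize things differently.

```python
# Sketch of a ROC evaluation: score signal and background test jets and compute the curve.
# File names and the "jets" dataset key are placeholders/assumptions.
import h5py
import numpy as np
from sklearn.metrics import roc_curve, auc
from tensorflow.keras.models import load_model

model = load_model("L1JetTagModel.h5")

with h5py.File("signal_test.h5", "r") as sig, h5py.File("background_test.h5", "r") as bkg:
    x = np.concatenate([sig["jets"][:], bkg["jets"][:]])
    y = np.concatenate([np.ones(len(sig["jets"])), np.zeros(len(bkg["jets"]))])

fpr, tpr, _ = roc_curve(y, model.predict(x).ravel())
print("AUC =", auc(fpr, tpr))
```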
We can run qkerasModel.py by passing three arguments the following way:
python3 qkerasModel.py "<SignalTrainFile.h5>" "<BackgroundTrainFile.h5>" "<JetData_TrainFile.h5>"
These files should have been produced by dataForge.py and/or removeBackground.py. The "<JetData_TrainFile.h5>" will look something like "sampleData...h5" resulting from `dataForge.py`.
Inside qkROC.py, add the paths of the testing data produced by dataForge.py and/or removeBackground.py.
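The qkeras model follows the same workflow; the difference is that its layers are quantized versions of the keras ones, which is what makes the eventual HLS firmware compact enough for L1 hardware. A generic illustration of a quantized layer stack is below (the bit widths, layer sizes, and input dimension are placeholder assumptions, not the architecture in qkerasModel.py).

```python
# Generic QKeras illustration: quantized dense layers with fixed-point weights.
# Bit widths, layer sizes, and the 16-feature input are placeholder assumptions.
from tensorflow.keras.models import Sequential
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

qmodel = Sequential([
    QDense(32, input_shape=(16,),
           kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
    QActivation(quantized_relu(6)),
    QDense(1, activation="sigmoid",
           kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
])
qmodel.summary()  # models of this kind are read back later with qkeras.utils.load_qmodel
```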
Currently, the workflow is as follows:
- Train and test the keras and qkeras models on LPC. From `kerasModel.py`, the file `L1JetTagModel.h5` will result; from `qkerasModel.py`, the file `qkL1JetTagModel.h5` will result.
- I copy these files onto my PC using:

scp <lpc username>:</file/path/to/L1JetTagModel.h5> </directory/in/my/computer>

- Then, I clone this repo inside Scully because we can source Vivado HLS from there. Now, I copy the /directory/in/my/computer/...Model.h5 files into the L1JetTag folder cloned in Scully:

scp </directory/in/my/computer/...Model.h5> <Scully username>:/path/to/L1JetTag/folder/
At this point, we are ready to convert the models contained in the .h5 files to HLS firmware by following the next steps:
ssh -L localhost:8888:localhost:8888 <username>@scully.physics.ucsd.edu
(any 4-digit number can be used for the local port)
jupyter notebook --no-browser --port=8888 --ip localhost
Now, open the notebook L1JetTagModel_hls_config.ipynb or qkL1JetTagModel_hls_config.ipynb, for the keras and qkeras model respectively. Then, inside these files, make sure that the correct path is set for the L1JetTagModel.h5 and qkL1JetTagModel.h5 files. If these files were copied to your L1JetTag folder on Scully, then you should not need to change anything, i.e.:
- The line `model = load_model("L1JetTagModel.h5")` should stay like this in `L1JetTagModel_hls_config.ipynb`.
- The line `qmodel = load_qmodel("qkL1JetTagModel.h5")` should stay like this in `qkL1JetTagModel_hls_config.ipynb`.
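Inside those notebooks, the conversion itself is driven by hls4ml. A minimal sketch of the keras-side conversion is shown below; the output directory and FPGA part are placeholder assumptions, and the notebooks' own configuration should be taken as authoritative. For the qkeras model the flow is the same, except that the model is read back with `load_qmodel` first.

```python
# Sketch of converting the trained keras model to HLS with hls4ml.
# The output_dir and FPGA part are placeholder assumptions.
import hls4ml
from tensorflow.keras.models import load_model

model = load_model("L1JetTagModel.h5")

config = hls4ml.utils.config_from_keras_model(model, granularity="name")
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir="L1JetTagModel_hls",   # placeholder
    part="xcvu9p-flga2104-2-e",       # placeholder FPGA part
)
hls_model.compile()   # builds the C++ emulation of the firmware
# hls_model.build()   # runs the full Vivado HLS synthesis (slow)
```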