XAI modules for PythiaGenJets ntuples with Coffea
The input files are made with this package. To create a test example:
bash ./runDockerCommandLine.sh srappoccio/pythia-gen-jets:latest
make pythia2root
./pythia2root zz_flatter.cfg test.root 100
The file test.root is attached here for reference.
Start this docker image.
wget https://raw.githubusercontent.com/rappoccio/ubccr-cms/master/runUBCCRCMS.sh
chmod a+x runUBCCRCMS.sh
./runUBCCRCMS.sh 8888 srappoccio/ubccr-cms:latest
Within the docker image:
jupyter notebook --ip 0.0.0.0 --no-browser
Then point your web browser to the instance (if you are running locally, try localhost).
Open the quickplots notebook within the Jupyter session and execute it.
Network Builder: Builds common network architectures
Analysis Helper: Defines LRP functions and relevance bar plot function
We start by generating toy model data. The z and theta distributions for the signal sample are drawn from a normal distribution, and the z and theta distributions for the background sample are drawn from an exponential distribution. To run this script, we first connect to the UBCCR image, which supports coffea:
sudo ./runUBCCRCMS.sh srappoccio/ubccr-cms:latest
and once you have a port, start the jupyter notebook:
jupyter notebook --ip 0.0.0.0 --no-browser
To produce training and testing samples, run the notebook makeJetImages.ipynb. This will produce separate files for the training and testing samples, each of which includes toy signal and background events.
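As a rough illustration of the toy sample described above, the following sketch draws (z, theta) pairs from a normal distribution for signal and from an exponential distribution for background, then shuffles and splits them into training and testing sets. It is not the code in makeJetImages.ipynb: the function names, the distribution parameters (means, widths, scales), the sample sizes, and the test fraction are all assumptions made for illustration.

```python
import numpy as np

def make_toy_sample(n_events, is_signal, rng):
    """Draw toy (z, theta) pairs. All distribution parameters are assumed values."""
    if is_signal:
        # Signal: z and theta from a normal distribution (assumed mean and width)
        z = rng.normal(loc=0.5, scale=0.1, size=n_events)
        theta = rng.normal(loc=0.5, scale=0.1, size=n_events)
    else:
        # Background: z and theta from an exponential distribution (assumed scale)
        z = rng.exponential(scale=0.2, size=n_events)
        theta = rng.exponential(scale=0.2, size=n_events)
    features = np.stack([z, theta], axis=1)
    labels = np.full(n_events, 1 if is_signal else 0)
    return features, labels

def make_train_test(n_per_class=1000, test_fraction=0.2, seed=0):
    """Mix signal and background, shuffle, and split into train and test sets."""
    rng = np.random.default_rng(seed)
    sig_x, sig_y = make_toy_sample(n_per_class, True, rng)
    bkg_x, bkg_y = make_toy_sample(n_per_class, False, rng)
    x = np.concatenate([sig_x, bkg_x])
    y = np.concatenate([sig_y, bkg_y])
    order = rng.permutation(len(y))
    x, y = x[order], y[order]
    n_test = int(test_fraction * len(y))
    return (x[n_test:], y[n_test:]), (x[:n_test], y[:n_test])

(train_x, train_y), (test_x, test_y) = make_train_test()
```

Each split could then be written to its own file, for example with np.savez, to mirror the separate training and testing files the notebook produces.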
When training, testing, or analyzing the networks with LRP, we need to use the innvestigate_tensorflow docker image (which uses TensorFlow version 1):
sudo ./runUBCCRCMS.sh srappoccio/innvestigate_tensorflow:latest
Within the docker image:
jupyter notebook --ip 0.0.0.0 --no-browser
For the 2D CNN we train two different kinds of models: one with the image as input and another with the image and XAUGs as input. Both can be built, initialised, trained, and evaluated using CNN_2D.ipynb. This script also produces some elementary plots to visualise the model performance.
Next, we can run LRP on the trained model to learn the relevance of each input to the NN decision. Depending on the model we are interested in, there are two separate scripts, one for each model.
Both scripts run LRP on the models and produce plots to visualize the relevance of each input variable.
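To make the idea of relevance concrete, here is a minimal NumPy sketch of one common LRP variant, the epsilon rule, applied to a tiny bias-free network with one ReLU hidden layer. It is not the code in the notebooks and does not use the iNNvestigate API; the network shape, the random weights, and the function name lrp_dense are hypothetical and chosen only for illustration.

```python
import numpy as np

def lrp_dense(activations, weights, relevance_out, eps=1e-9):
    """Propagate relevance back through one bias-free dense layer (LRP-epsilon).

    activations: layer input, shape (n_in,)
    weights: shape (n_in, n_out)
    relevance_out: relevance assigned to the layer output, shape (n_out,)
    """
    z = activations @ weights                       # pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)       # stabilizer avoids division by zero
    s = relevance_out / z
    return activations * (weights @ s)              # relevance per input

# Hypothetical toy network: 4 inputs -> 3 ReLU units -> 1 output, no biases
rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 3))
w2 = rng.normal(size=(3, 1))
x = rng.normal(size=4)

hidden = np.maximum(0.0, x @ w1)
output = hidden @ w2

# Start from the network output and walk back to the inputs
r_hidden = lrp_dense(hidden, w2, output)
r_input = lrp_dense(x, w1, r_hidden)
```

With no biases and a small eps, the relevance is approximately conserved from layer to layer, so r_input.sum() should be close to the network output. The entries of r_input are the kind of per-input scores that a relevance bar plot would display.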
Start by preprocessing the pythia files. (The preprocessing is explained in detail in our paper.)
First, connect to the UBCCRCMS docker image:
sudo ./runUBCCRCMS.sh srappoccio/ubccr-cms:latest
Within the docker image:
jupyter notebook --ip 0.0.0.0 --no-browser
Run Showjets.ipynb
This will:
- Select the leading jet and keep its leading 20 constituents
- Select the variables of interest
- Rotate, center, and scale the image
- Save npz files
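The steps above can be sketched for a single jet as follows. This is only an illustration under stated assumptions, not the procedure in Showjets.ipynb (which is described in the paper): the function name preprocess_jet, the pT-weighted centering, the principal-axis rotation, the normalization of pT to unit sum, and the 16x16 pixel grid are all assumed choices.

```python
import numpy as np

def preprocess_jet(pt, eta, phi, n_keep=20, n_pix=16, half_width=1.0):
    """Turn jet constituents into a centered, rotated, normalized image (assumed recipe)."""
    # Keep the n_keep highest-pT constituents
    order = np.argsort(pt)[::-1][:n_keep]
    pt, eta, phi = pt[order], eta[order], phi[order]

    # Center: subtract the pT-weighted centroid (phi is unwrapped around the leading constituent)
    eta_c = eta - np.average(eta, weights=pt)
    dphi = np.arctan2(np.sin(phi - phi[0]), np.cos(phi - phi[0]))
    phi_c = dphi - np.average(dphi, weights=pt)

    # Rotate: align the principal axis of the pT-weighted spread with the first image axis
    coords = np.stack([eta_c, phi_c])
    cov = (coords * pt) @ coords.T / pt.sum()
    _, eigvecs = np.linalg.eigh(cov)                # eigenvalues come back in ascending order
    major = eigvecs[:, -1]
    angle = np.arctan2(major[1], major[0])
    c, s = np.cos(-angle), np.sin(-angle)
    x, y = np.array([[c, -s], [s, c]]) @ coords

    # Scale: normalize the constituent pT so the image intensities sum to at most 1
    weights = pt / pt.sum()

    image, _, _ = np.histogram2d(
        x, y, bins=n_pix,
        range=[[-half_width, half_width], [-half_width, half_width]],
        weights=weights,
    )
    return image

# Hypothetical jet with 30 constituents
rng = np.random.default_rng(1)
example_image = preprocess_jet(
    pt=rng.exponential(scale=10.0, size=30) + 0.1,
    eta=rng.normal(0.0, 0.2, size=30),
    phi=rng.normal(0.0, 0.2, size=30),
)
```

A set of such images could then be stored with np.savez, matching the npz output mentioned in the last step.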
Run CNN_Data_preprocess_all.ipynb
This will:
- Normalize the single-variable inputs
- Create test and train datasets for signal (Z to bb) and background (QCD)
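One common way to normalize single-variable inputs is to standardize each variable with the mean and standard deviation of the training set, then reuse those same statistics for the test set. The sketch below assumes that approach; the notebook may normalize differently, and the function names and example arrays are hypothetical.

```python
import numpy as np

def fit_scaler(train):
    """Compute per-variable mean and standard deviation from the training set only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)   # guard against constant variables
    return mean, std

def apply_scaler(x, mean, std):
    """Standardize x with statistics obtained from the training set."""
    return (x - mean) / std

# Hypothetical single-variable inputs: rows are jets, columns are variables
rng = np.random.default_rng(2)
train_vars = rng.normal(loc=[50.0, 0.3], scale=[10.0, 0.1], size=(800, 2))
test_vars = rng.normal(loc=[50.0, 0.3], scale=[10.0, 0.1], size=(200, 2))

mean, std = fit_scaler(train_vars)
train_norm = apply_scaler(train_vars, mean, std)
test_norm = apply_scaler(test_vars, mean, std)
```

Taking the statistics from the training set alone keeps information about the test set out of the preprocessing.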
When training, testing, or analyzing the networks with LRP, we need to use the innvestigate_tensorflow docker image (which uses TensorFlow version 1):
sudo ./runUBCCRCMS.sh srappoccio/innvestigate_tensorflow:latest
Within the docker image:
jupyter notebook --ip 0.0.0.0 --no-browser
- Plot the inputs: CNN_Data_plotting.ipynb
- Build the models: CNN_fit_model.ipynb
- Test the models and analyze with LRP: CNN_Analyze.ipynb