Skip to content

AlbertoBelvedere/MLidentification

Repository files navigation

MLidentification

This directory contains the code necessary to train and evaluate the performance of the Machine Learning algorithm for the low-pt identification for the B-parking dataset reprocessing.

The training is performed using train_bdt.py which implements the XGBoost algorithm, trough dataset.py it is possible to choose the dataset and its charateristics, while using features.py one can choose the list of features to use in the training. First of all it's necessary to run kmeans_reweight.py to compute the weights to eliminate discrepancies between the distributions of the kinematic variables of electrons and fakes. Then, to train with a specific set of features:

python kmeans_reweight.py
python train_bdt.py list_of_features

where list_of_features is a list of features presents in features.py .
While to run without using weights:

python train_bdt.py list_of_features --noweight

accuracy.py : algorithm's accuracy computation.
basic_plots.py : performance of the algorithm on the old dataset.
compare_... : these files compare the performance of different types of training.
correlation_matrix.py : correlation matrices computation.
datasets.py : choice of the dataset and its carachteristic.
eval_bdt.py : evaluation of the algorithm performance trough the analysis of the ROC curve.
features.py : lists of features that can be used to train the algorithm.
feature_imortance.py : feature importance plot.
info_parameters.py : run to get the parameters of the model.
kmeans_reweight.py : reweight of the kinematical variables.
mistag_rate.py : mistag rate and efficiency computation.
train_bdt.py : file to train the algorithm.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages