Perera Lago, J., Toscano Durán, V., Paluzo-Hidalgo, E., Narteni, S. & Rucco, M. Application of the representative measure approach to assess the reliability of decision trees in dealing with unseen vehicle collision data
Two experiments were developed:
- Synthetic Dataset,
- Vehicle Collision Dataset.
In both experiments, four sets were considered: the train set, two random subsets generated from the train set, and the test set.
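The four sets described above could be built along the following lines. This is only a minimal sketch: the data come from scikit-learn's `make_classification`, and the split ratio, the subset size of 50 and the helper `random_subset` are all assumptions for illustration, not the settings used in the experiments.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 500 samples with 6 features.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# Train and test sets; the 70/30 ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

def random_subset(X, y, size, rng):
    """Hypothetical helper: draw `size` rows from (X, y) without replacement."""
    idx = rng.choice(len(X), size=size, replace=False)
    return X[idx], y[idx]

# Two random subsets generated from the train set.
rng = np.random.default_rng(0)
X_sub1, y_sub1 = random_subset(X_train, y_train, 50, rng)
X_sub2, y_sub2 = random_subset(X_train, y_train, 50, rng)

print(X_train.shape, X_sub1.shape, X_sub2.shape, X_test.shape)
```

Sampling without replacement keeps each subset a genuine subset of the train set, with no repeated rows.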
In addition, in the file correlationEpsilon_Similarity, we repeated the experiment for 100 subsets of the Collision Dataset and computed the Spearman’s correlation (Sp) between the ε-representativeness and the metric of the ordering of the feature importance, obtaining a significant correlation.
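The correlation step itself can be sketched as below, assuming one ε value and one feature-importance ordering score are already available for each of the 100 subsets. Both arrays here are random placeholders, not results from the Collision Dataset, and the negative trend is built into the placeholder data rather than taken from the experiments. The sketch uses `scipy.stats.spearmanr`; SciPy is not in the dependency list above but is installed alongside scikit-learn.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_subsets = 100

# Placeholder values (assumption): one epsilon per subset.
eps = rng.uniform(0.0, 1.0, size=n_subsets)

# Placeholder ordering score (assumption): decreases with epsilon, plus noise.
ordering_score = -eps + rng.normal(0.0, 0.1, size=n_subsets)

# Spearman's rank correlation and its two-sided p-value.
rho, p_value = spearmanr(eps, ordering_score)

print(f"Spearman rho = {rho:.3f}, p-value = {p_value:.3g}")
```

Spearman's coefficient depends only on the ranks of the values, so it captures any monotonic relationship between the two quantities, not just a linear one.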
- numpy
- pandas
- scikit-learn
To install all the Python dependencies needed to run these experiments (we use Python 3.11.6), use the following command:
pip install -r requeriments.txt