Thanks for providing this wonderful benchmark for XAI methods.
However, there's a slight issue in the provided code. When I evaluate metrics using evaluate_metrics.py, I don't get accurate results for the log standard error.
For example, the scale of the log standard error for RIS below looks off:
```
Model: lr, Data: adult, Explainer: control (1/16) (1s total, 1s on model, 1s on dataset)
PRA: 0.501 ± 0.003
RC: 0.003 ± 0.009
FA: 0.188 ± 0.006
RA: 0.073 ± 0.005
SA: 0.093 ± 0.004
SRA: 0.034 ± 0.003
PGU: 0.130 ± 0.003
PGI: 0.043 ± 0.002
RIS: 420764.219 ± 1391.567
log(RIS): 12.950 ± 7.238
```
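For context, the reported value 7.238 is exactly log(1391.567), which suggests the script takes the log of the standard error itself rather than propagating the error onto the log scale. A minimal sketch of the difference, using the delta-method approximation se(log X) ≈ se(X) / mean(X) (the data array and variable names here are illustrative, not from the repo):

```python
import numpy as np

# Hypothetical per-run RIS scores for one (model, dataset, explainer) cell.
ris_scores = np.array([419000.0, 422500.0, 418900.0, 421700.0, 421564.0])

mean = ris_scores.mean()
se = ris_scores.std(ddof=1) / np.sqrt(len(ris_scores))

# Buggy scale: log of the standard error is not the standard error of the log.
buggy_log_se = np.log(se)

# Delta method: for X concentrated around its mean, se(log X) ~= se(X) / mean(X).
log_mean = np.log(mean)
log_se = se / mean

print(f"log(RIS): {log_mean:.3f} ± {log_se:.3f}")
```

On the numbers above, the corrected uncertainty would be on the order of 1391.567 / 420764.219 ≈ 0.003, not 7.238.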
I've opened PR #31 to fix this issue. Please take a look.