
Wrong results for log standard error #32

@statkwon

Thanks for providing a wonderful benchmark for XAI methods.

However, there is a slight issue in the provided code. When I evaluate metrics using evaluate_metrics.py, I do not get accurate results for the log standard error.

For example, the scale of the log standard error for RIS below looks off. The reported 7.238 appears to be log(1391.567), i.e., log() seems to be applied to the mean and standard error separately rather than propagating the error through the transform, which would give roughly 1391.567 / 420764.219 ≈ 0.003.

Model: lr, Data: adult, Explainer: control (1/16) (1s total, 1s on model, 1s on dataset)
PRA: 0.501 ± 0.003
RC: 0.003 ± 0.009
FA: 0.188 ± 0.006
RA: 0.073 ± 0.005
SA: 0.093 ± 0.004
SRA: 0.034 ± 0.003
PGU: 0.130 ± 0.003
PGI: 0.043 ± 0.002
RIS: 420764.219 ± 1391.567
log(RIS): 12.950 ± 7.238

I opened PR #31 to fix this issue. Please take a look.
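
For reference, here is a minimal sketch (not necessarily the exact change in the PR) of two consistent ways to report a log-scale mean and standard error, assuming the per-run RIS values are available as a NumPy array; ris_per_run below is synthetic data used only for illustration.

import numpy as np

# Hypothetical per-run RIS values for one (model, data, explainer) setting;
# in practice these would come from evaluate_metrics.py.
rng = np.random.default_rng(0)
ris_per_run = rng.normal(loc=420764.0, scale=6000.0, size=20)

mean = ris_per_run.mean()
sem = ris_per_run.std(ddof=1) / np.sqrt(len(ris_per_run))

# Problematic: applying log() to the mean and standard error separately
# yields values like 12.950 ± 7.238, since log(1391.567) ≈ 7.238.
print(f"separate logs: {np.log(mean):.3f} ± {np.log(sem):.3f}")

# Option 1: take the log of each run first, then aggregate on the log scale.
log_ris = np.log(ris_per_run)
log_mean = log_ris.mean()
log_sem = log_ris.std(ddof=1) / np.sqrt(len(log_ris))
print(f"per-run log:   {log_mean:.3f} ± {log_sem:.3f}")

# Option 2: delta method, propagating the error through log():
# SE(log X) ≈ SE(X) / mean(X).
print(f"delta method:  {np.log(mean):.3f} ± {sem / mean:.3f}")

Both options give a log-scale standard error on the order of 0.003 for the numbers above, rather than 7.238.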
