InterpretationFragility

Code implementing the paper "Interpretation of Neural Networks is Fragile".

Please cite the following work if you use this benchmark or the provided tools or implementations:

@inproceedings{ghorbani2019interpretation,
  title={Interpretation of neural networks is fragile},
  author={Ghorbani, Amirata and Abid, Abubakar and Zou, James},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={33},
  pages={3681--3688},
  year={2019}
}

Authors

License

This project is licensed under the MIT License; see the LICENSE.md file for details.

Large-scale results of the attack methods against four well-known feature-attribution methods


Examples of targeted attacks that produce semantically meaningful changes in feature importance


Attack examples on Deep Taylor Decomposition
