Axiomatic Attribution for Deep Networks | TransferLab

Reference

Axiomatic Attribution for Deep Networks, Mukund Sundararajan, Ankur Taly, Qiqi Yan. Proceedings of the 34th International Conference on Machine Learning (2017)

Abstract

We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy. We show that they are not satisfied by most known attribution methods, which we consider to be a fundamental weakness of those methods. We use the axioms to guide the design of a new attribution method called Integrated Gradients. Our method requires no modification to the original network and is extremely simple to implement; it just needs a few calls to the standard gradient operator. We apply this method to a couple of image models, a couple of text models and a chemistry model, demonstrating its ability to debug networks, to extract rules from a network, and to enable users to engage with models better.
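
To make the "few calls to the standard gradient operator" concrete, here is a minimal sketch of Integrated Gradients in plain Python. It is an illustration, not the authors' implementation. It assumes the usual formulation in which the attribution for feature i is (x_i - baseline_i) times the average gradient along the straight line from a baseline input to x. The names `numerical_gradient`, `integrated_gradients`, and `toy_model` are hypothetical, the finite-difference gradient is a stand-in for a framework's automatic differentiation, and the all-zeros baseline is an assumption chosen for the example.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def numerical_gradient(f: Model, x: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Central-difference gradient of f at x.

    Stand-in (assumption) for the gradient operator of a deep learning framework.
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def integrated_gradients(
    f: Model, x: Sequence[float], baseline: Sequence[float], steps: int = 50
) -> List[float]:
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [bi + alpha * di for bi, di in zip(baseline, diff)]
        grad = numerical_gradient(f, point)
        grad_sum = [s + g for s, g in zip(grad_sum, grad)]
    # Scale the average gradient along the path by the input difference.
    return [di * s / steps for di, s in zip(diff, grad_sum)]


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical smooth model of two features, used only for illustration."""
    return x[0] * x[1] + x[0] ** 2


x = [2.0, 3.0]
baseline = [0.0, 0.0]  # assumed all-zeros baseline
attributions = integrated_gradients(toy_model, x, baseline)
print(attributions)
```

For this toy function the attributions should sum to approximately `toy_model(x) - toy_model(baseline)`, a property the paper calls completeness. With a real network, the finite-difference helper would be replaced by the framework's own gradient call, and the choice of baseline and number of steps would need to be checked for the task at hand.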

Content citing this item

Training

Methods and issues in explainable AI

A two-day workshop for ML practitioners wishing to make their models better understandable for themselves and decision makers.

Series

Explainable AI

Large opaque models like neural networks require dedicated methods to study and interpret their behavior. In this series we review recent …


