Use backprop to compute the gradients of the logits w.r.t. the input: [Deep Inside Convolutional Networks](Deep Inside Convolutional Networks.md)
[Guided BackProp](Guided BackProp.md)
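A minimal sketch of the plain gradient idea (not Guided BackProp), assuming a tiny two-layer ReLU network with hand-picked weights. The network, the weights, and the function names are all hypothetical and only serve to show how the gradient of one logit w.r.t. the input is obtained by a backward pass:

```python
def forward(x, W1, b1, W2, b2):
    """Forward pass of a tiny 2-layer ReLU net; returns hidden pre-activations and logits."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [max(0.0, p) for p in pre]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]
    return pre, logits


def input_gradient(x, W1, b1, W2, b2, target):
    """Backpropagate d logits[target] / d x by hand."""
    pre, _ = forward(x, W1, b1, W2, b2)
    # Gradient at the hidden layer: target row of W2, gated by the ReLU mask.
    grad_hidden = [W2[target][j] if pre[j] > 0 else 0.0 for j in range(len(pre))]
    # Gradient at the input: W1 transposed times grad_hidden.
    return [sum(grad_hidden[j] * W1[j][i] for j in range(len(pre)))
            for i in range(len(x))]


# Hypothetical weights and input, chosen only for illustration.
W1 = [[1.0, -2.0, 0.0], [0.5, 0.5, 3.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 1.0], [-1.0, 2.0]]
b2 = [0.0, 0.0]
x = [1.0, 0.2, 0.1]

grad = input_gradient(x, W1, b1, W2, b2, target=0)
# The saliency map is commonly taken as the absolute value of the gradient.
sal = [abs(g) for g in grad]
```

In practice an autodiff framework computes this gradient; the manual version is only meant to make the backward pass explicit.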
Features
One of the oldest interpretation methods
Saliency maps highlight the input features, for example superpixels, that influenced the prediction most.
To create a map of important pixels, one can either repeatedly feed the network different portions of the input and compare the resulting outputs, or go backwards through the network from an output of interest and visualize the result directly.
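The first option can be sketched as an occlusion test: mask one patch at a time and record how much the score drops. This is an illustrative sketch only; `toy_score` is a hypothetical stand-in for a real model's class score, and the patch size and baseline value are assumptions:

```python
def occlusion_map(image, score_fn, patch=2, baseline=0.0):
    """Score drop when each patch x patch block is replaced by `baseline`."""
    h, w = len(image), len(image[0])
    base_score = score_fn(image)
    heat = [[0.0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            # Copy the image and blank out the current block.
            occluded = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline
            # A large drop means the block mattered for the score.
            drop = base_score - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_score(image):
    # Hypothetical model: the score depends only on the top-left 2x2 block.
    return sum(image[r][c] for r in range(2) for c in range(2))


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score)
```

Here only the top-left block receives a non-zero value in `heat`, since it is the only region the toy score looks at.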
Exploring neural networks with activation atlases, built through feature inversion, also belongs to this category. This method can reveal how the network typically represents certain concepts.
Looking at the image or text portions that maximize the activation of particular neurons or whole layers can help interpret what individual parts of the architecture are responsible for.
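As a rough sketch of this idea, one can slide a window over an input and rank the windows by how strongly they activate a chosen neuron. `toy_neuron` below is a hypothetical vertical-edge detector, not a unit from any real network, and the window size and `k` are assumptions:

```python
def top_activating_patches(image, neuron_fn, patch=2, k=3):
    """Return the k (activation, (top, left)) pairs with the highest activation."""
    h, w = len(image), len(image[0])
    scored = []
    for top in range(h - patch + 1):
        for left in range(w - patch + 1):
            window = [row[left:left + patch] for row in image[top:top + patch]]
            scored.append((neuron_fn(window), (top, left)))
    # Sort by activation only, highest first; ties keep scan order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def toy_neuron(window):
    # Hypothetical neuron: right column minus left column, passed through a ReLU.
    return max(0.0, sum(row[-1] - row[0] for row in window))


image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
best = top_activating_patches(image, toy_neuron, patch=2, k=2)
```

Both top windows straddle the dark-to-bright boundary, which suggests what this toy neuron responds to.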