Intermediate-Layer Feature Interpreters
Grad-CAM
class interpretdl.GradCAMInterpreter(model: callable, device: str = 'gpu:0')
Gradient CAM Interpreter.
Given a convolutional network and an image classification task, a class activation map (CAM) can be derived from the global average pooling layer and the last fully-connected layer; it shows the important regions that affect the model's decisions.
GradCAM further looks at the gradients flowing into one of the convolutional layers to weight the activation maps. Note that if the network has a global average pooling layer, GradCAM targeting the last convolutional layer is equivalent to CAM.
More details regarding the CAM method can be found in the original paper: https://arxiv.org/abs/1512.04150.
More details regarding the GradCAM method can be found in the original paper: https://arxiv.org/abs/1610.02391.
Parameters:
- model (callable) – A model with forward() and possibly backward() functions.
- device (str) – The device used for running model. Options: "cpu", "gpu:0", "gpu:1", etc.
interpret(inputs: str, target_layer_name: str, label: list = None, resize_to: int = 224, crop_to: int = None, visual: bool = True, save_path: str = None) → numpy.ndarray
The technical details of the GradCAM method are as follows: GradCAM computes the feature map at the layer named target_layer_name and the gradient of the objective function w.r.t. target_layer_name. The gradients are averaged over the spatial dimensions to give one weight per channel; the feature map is then multiplied by these weights and passed through a ReLU activation to produce the final explanation.
Parameters:
- inputs (str or list of strs or numpy.ndarray) – The input image filepath, a list of filepaths, or a numpy array of read images.
- target_layer_name (str) – The target layer to calculate gradients.
- labels (list or tuple or numpy.ndarray, optional) – The target labels to analyze. The number of labels should be equal to the number of images. If None, the most likely label for each image will be used. Default: None.
- resize_to (int, optional) – Images will be rescaled with the shorter edge being resize_to. Defaults to 224.
- crop_to (int, optional) – After resize, images will be center cropped to a square image with the size crop_to. If None, no crop will be performed. Defaults to None.
- visual (bool, optional) – Whether or not to visualize the processed image. Default: True.
- save_path (str, optional) – The filepath(s) to save the processed image(s). If None, the image will not be saved. Default: None.
Returns: interpretations/heatmap for images
Return type: [numpy.ndarray]
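The GradCAM combination step described above can be illustrated with a minimal NumPy sketch. It is not the interpretdl implementation: the function name grad_cam and the (channels, height, width) array layout are assumptions made for illustration, and the feature map and gradients are random stand-ins for values that would normally come from the layer named by target_layer_name.

```python
import numpy as np


def grad_cam(feature_map: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine a feature map and its gradients into a heatmap.

    Hypothetical helper: both arrays are assumed to have the shape
    (channels, height, width).
    """
    # One weight per channel: average the gradients over the spatial dimensions.
    weights = gradients.mean(axis=(1, 2))
    # Multiply each channel by its weight and sum over the channels.
    cam = np.tensordot(weights, feature_map, axes=([0], [0]))
    # ReLU keeps only the regions that contribute positively.
    return np.maximum(cam, 0.0)


# Stand-in data; a real run would take these from the target layer.
rng = np.random.default_rng(0)
features = rng.random((4, 7, 7))
grads = rng.standard_normal((4, 7, 7))
heatmap = grad_cam(features, grads)
print(heatmap.shape)
```

The resulting map has the spatial size of the target layer, so it would typically be upsampled to the input resolution before being overlaid on the image.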
Score CAM
class interpretdl.ScoreCAMInterpreter(model: callable, device: str = 'gpu:0')
Score-CAM Interpreter.
ScoreCAMInterpreter bridges the gap between perturbation-based and CAM-based methods, and derives the weight of activation maps in an intuitively understandable way.
More details regarding the Score CAM method can be found in the original paper: https://arxiv.org/abs/1910.01279.
Parameters:
- model (callable) – A model with forward() and possibly backward() functions.
- device (str) – The device used for running model. Options: "cpu", "gpu:0", "gpu:1", etc.
interpret(inputs: str, target_layer_name: str, labels: list = None, resize_to: int = 224, crop_to: int = None, visual: bool = True, save_path: str = None) → numpy.ndarray
Main function of the interpreter.
(TODO) The technical details will be described later.
Parameters:
- inputs (str or list of strs or numpy.ndarray) – The input image filepath, a list of filepaths, or a numpy array of read images.
- target_layer_name (str) – The target layer whose activation maps are used.
- labels (list or tuple or numpy.ndarray, optional) – The target labels to analyze. The number of labels should be equal to the number of images. If None, the most likely label for each image will be used. Default: None.
- resize_to (int, optional) – Images will be rescaled with the shorter edge being resize_to. Defaults to 224.
- crop_to (int, optional) – After resize, images will be center cropped to a square image with the size crop_to. If None, no crop will be performed. Defaults to None.
- visual (bool, optional) – Whether or not to visualize the processed image. Default: True.
- save_path (str, optional) – The filepath(s) to save the processed image(s). If None, the image will not be saved. Default: None.
Returns: interpretations/heatmap for images
Return type: [np.ndarray]
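As a rough illustration of the idea in the Score-CAM paper, the sketch below weights each activation map by the score the model gives to the input masked with that map. It is a simplified sketch and not the interpretdl implementation: the names score_cam and score_fn are hypothetical, score_fn stands in for a model's forward pass returning the target-class score, and the activation maps are assumed to already have the spatial size of the input (a real pipeline would upsample them first).

```python
import numpy as np


def score_cam(image, activations, score_fn):
    """Weight activation maps by the score of the masked input.

    Hypothetical helper. image: (H, W); activations: (K, H, W), assumed
    to be already resized to the input size; score_fn: maps an image to
    a scalar target-class score.
    """
    scores = []
    for act in activations:
        # Normalize each activation map to [0, 1] so it can act as a mask.
        span = act.max() - act.min()
        mask = (act - act.min()) / span if span > 0 else np.zeros_like(act)
        # Score of the input with only the highlighted region kept.
        scores.append(score_fn(image * mask))
    scores = np.asarray(scores, dtype=float)
    # Softmax over the maps turns scores into weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted sum of the maps, followed by a ReLU.
    cam = np.tensordot(weights, activations, axes=([0], [0]))
    return np.maximum(cam, 0.0)


# Toy setup: the "model" only looks at the top-left quadrant.
image = np.ones((4, 4))
activations = np.zeros((2, 4, 4))
activations[0, :2, :2] = 1.0   # map 0 highlights the top-left quadrant
activations[1, 2:, 2:] = 1.0   # map 1 highlights the bottom-right quadrant
cam = score_cam(image, activations, lambda x: x[:2, :2].sum())
print(cam.round(3))
```

In this toy case the top-left map receives almost all of the weight, because masking the input with it preserves the region the scoring function depends on. No gradients are needed, which is the main difference from GradCAM.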
Rollout
class interpretdl.RolloutInterpreter(model: callable, device: str = 'gpu:0')
Rollout Interpreter.
This interpreter is specific to Transformer models. RolloutInterpreter assumes that attentions can be linearly combined across layers and that the resulting scores indicate the importance of each token, which gives an explanation of token importance.
More details regarding the Rollout method can be found in the original paper: https://arxiv.org/abs/2005.00928.
Parameters:
- model (callable) – A model with forward() and possibly backward() functions.
- device (str) – The device used for running model. Options: "cpu", "gpu:0", "gpu:1", etc.
interpret(inputs: str, start_layer: int = 0, attention_layer_pattern: str = '^blocks.*.attn.attn_drop$', resize_to: int = 224, crop_to: int = None, visual: bool = True, save_path: str = None)
Given inputs, RolloutInterpreter obtains all attention maps (of layers whose names match attention_layer_pattern) and calculates their matrix product. start_layer controls the number of involved layers.
Parameters:
- inputs (str or list of strs or numpy.ndarray) – The input image filepath, a list of filepaths, or a numpy array of read images.
- start_layer (int, optional) – The index of the first layer involved in the computation of attentions. Defaults to 0.
- attention_layer_pattern (str, optional) – The string pattern used to pick the layers whose names match it. Defaults to ^blocks.*.attn.attn_drop$.
- resize_to (int, optional) – Images will be rescaled with the shorter edge being resize_to. Defaults to 224.
- crop_to (int, optional) – After resize, images will be center cropped to a square image with the size crop_to. If None, no crop will be performed. Defaults to None.
- visual (bool, optional) – Whether or not to visualize the processed image. Default: True.
- save_path (str, optional) – The filepath(s) to save the processed image(s). If None, the image will not be saved. Default: None.
Returns: interpretations/heatmap for images.
Return type: [np.ndarray]
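The layer selection and matrix multiplication described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the interpretdl code: the function attention_rollout and the dictionary of per-layer attention arrays are hypothetical, heads are averaged, and the identity matrix is added to model residual connections as in the Rollout paper. The layer names in the example are made up to fit the default pattern.

```python
import re

import numpy as np


def attention_rollout(attentions, start_layer=0,
                      pattern=r"^blocks.*.attn.attn_drop$"):
    """Multiply the attention maps of all layers whose names match pattern.

    Hypothetical helper. attentions: dict mapping a layer name to an
    array of shape (heads, tokens, tokens), inserted in forward order.
    """
    selected = [a for name, a in attentions.items() if re.match(pattern, name)]
    if not selected:
        raise ValueError("no layer name matches the pattern")
    num_tokens = selected[0].shape[-1]
    rollout = np.eye(num_tokens)
    # Layers before start_layer are skipped.
    for attn in selected[start_layer:]:
        fused = attn.mean(axis=0)                        # average over heads
        fused = fused + np.eye(num_tokens)               # residual connection
        fused = fused / fused.sum(axis=-1, keepdims=True)  # rows sum to 1
        rollout = fused @ rollout
    return rollout


rng = np.random.default_rng(0)


def random_attention(heads=2, tokens=5):
    """Random row-stochastic attention, standing in for real model output."""
    e = np.exp(rng.standard_normal((heads, tokens, tokens)))
    return e / e.sum(axis=-1, keepdims=True)


attentions = {
    "blocks.0.attn.attn_drop": random_attention(),
    "blocks.1.attn.attn_drop": random_attention(),
    "head": np.zeros((1, 3, 3)),  # does not match the pattern, so it is skipped
}
rollout = attention_rollout(attentions)
# Row 0 is often read as the class token's attention over all tokens.
print(rollout[0].round(3))
```

Raising start_layer drops the earliest layers from the product, so a value equal to the number of matched layers leaves only the identity matrix.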