data requirements
access to model internals
Needs access to model gradients, weights, or activations
13 techniques
| Technique | Goals | Models | Data Types | Description |
|---|---|---|---|---|
| DeepLIFT | Algorithmic | Architecture/neural Networks, Requirements/white Box, +1 | Any | DeepLIFT (Deep Learning Important FeaTures) explains neural network predictions by decomposing the difference between... |
| Layer-wise Relevance Propagation | Algorithmic | Architecture/neural Networks, Paradigm/parametric, +2 | Any | Layer-wise Relevance Propagation (LRP) explains neural network predictions by working backwards through the network to... |
| Taylor Decomposition | Algorithmic | Architecture/neural Networks, Requirements/gradient Access, +2 | Any | Taylor Decomposition is a mathematical technique that explains neural network predictions by computing first-order and... |
| Saliency Maps | Algorithmic | Architecture/neural Networks, Requirements/differentiable, +1 | Image | Saliency maps are visual explanations for image classification models that highlight which pixels in an image most... |
| Gradient-weighted Class Activation Mapping | Algorithmic | Architecture/neural Networks/convolutional, Requirements/architecture Specific, +2 | Image | Grad-CAM creates visual heatmaps showing which regions of an image a convolutional neural network focuses on when making... |
| Classical Attention Analysis in Neural Networks | Algorithmic | Architecture/neural Networks/recurrent, Requirements/architecture Specific, +1 | Any | Classical attention mechanisms in RNNs and CNNs create alignment matrices and temporal attention patterns that show how... |
| Temperature Scaling | Algorithmic | Architecture/neural Networks, Paradigm/discriminative, +3 | Any | Temperature scaling adjusts a model's confidence by applying a single parameter (temperature) to its predictions. When a... |
| Model Pruning | Algorithmic | Architecture/neural Networks, Paradigm/parametric, +4 | Any | Model pruning systematically removes less important weights, neurons, or entire layers from neural networks to create... |
| Neuron Activation Analysis | Algorithmic | Architecture/neural Networks, Requirements/model Internals, +1 | Text | Neuron activation analysis examines the firing patterns of individual neurons in neural networks by probing them with... |
| Causal Mediation Analysis in Language Models | Mechanistic Interpretability | Architecture/neural Networks/transformer, Architecture/neural Networks/transformer/llm, +3 | Text | Causal mediation analysis in language models is a mechanistic interpretability technique that systematically... |
| Concept Activation Vectors | Algorithmic | Architecture/neural Networks, Requirements/gradient Access, +2 | Any | Concept Activation Vectors (CAVs), also known as Testing with Concept Activation Vectors (TCAV), identify mathematical... |
| Attention Visualisation in Transformers | Algorithmic | Architecture/neural Networks/transformer, Requirements/architecture Specific, +1 | Image, Text | Attention Visualisation in Transformers analyses the multi-head self-attention mechanisms that enable transformers to... |
| Adaptive Sensitive Reweighting | Algorithmic | Architecture/model Agnostic, Paradigm/parametric, +3 | Any | Adaptive Sensitive Reweighting dynamically adjusts the importance of training examples during model training based on... |
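
To make the "access to model internals" requirement concrete, the Temperature Scaling entry can be illustrated with a minimal sketch. It assumes the model exposes its raw logits, and it divides them by a single temperature before the softmax. The `softmax_with_temperature` helper and the example logits are hypothetical and chosen only for illustration; in practice the temperature is usually fitted on held-out validation data rather than set by hand.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax probabilities of logits divided by a temperature.

    Illustrative helper (not from any library). A temperature above 1
    flattens the distribution; a temperature below 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a three-class prediction.
logits = [4.0, 1.0, 0.5]

uncalibrated = softmax_with_temperature(logits, temperature=1.0)
calibrated = softmax_with_temperature(logits, temperature=2.0)

print("T=1.0:", [round(p, 3) for p in uncalibrated])
print("T=2.0:", [round(p, 3) for p in calibrated])
```

Because dividing by a positive temperature preserves the ordering of the logits, the predicted class does not change; only the confidence attached to it does.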