Contextual Decomposition

Description

Contextual Decomposition explains LSTM and RNN predictions by splitting the final hidden state, and therefore the model output, into two additive parts: the contribution attributable to a chosen word or phrase, and the contribution of everything else in the sequence. Unlike simpler attribution methods that assign one importance score per word, it separates the direct contribution of specific words or phrases from the contextual effects of surrounding words. This is particularly useful for understanding how sequential models process language: comparing the score of a phrase with the scores of its individual words indicates whether a word's influence comes from its own meaning or from its interaction with nearby words in the sequence.
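
As a rough illustration of the decomposition described above, the following NumPy sketch splits the final hidden state of a single-layer LSTM into a part attributed to a chosen phrase (`beta`) and a remainder (`gamma`). It follows the general recipe of Murdoch et al. (2018) as we understand it: each gate's nonlinearity is split additively across its inputs by averaging over the order in which they are added, and product terms are assigned to `beta` only when they involve the phrase and no out-of-phrase inputs. The function names, the dictionary-of-gates weight layout, the random toy weights, and the handling of the bias-times-bias term are assumptions made for this example, not the API of any library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def split3(rel, irrel, bias, act):
    """Split act(rel + irrel + bias) into three parts that sum to it exactly."""
    rel_part = 0.5 * ((act(rel + bias) - act(bias))
                      + (act(rel + irrel + bias) - act(irrel + bias)))
    irrel_part = 0.5 * ((act(irrel + bias) - act(bias))
                        + (act(rel + irrel + bias) - act(rel + bias)))
    return rel_part, irrel_part, act(bias)

def split2(rel, irrel):
    """Split tanh(rel + irrel) into two parts that sum to it exactly."""
    rel_part = 0.5 * (np.tanh(rel) + np.tanh(rel + irrel) - np.tanh(irrel))
    irrel_part = 0.5 * (np.tanh(irrel) + np.tanh(rel + irrel) - np.tanh(rel))
    return rel_part, irrel_part

def lstm_forward(xs, W, V, b):
    """Ordinary LSTM pass; W, V, b are dicts keyed by gate: 'i', 'f', 'g', 'o'."""
    H = b["i"].shape[0]
    h, c = np.zeros(H), np.zeros(H)
    for x in xs:
        i = sigmoid(W["i"] @ x + V["i"] @ h + b["i"])
        f = sigmoid(W["f"] @ x + V["f"] @ h + b["f"])
        g = np.tanh(W["g"] @ x + V["g"] @ h + b["g"])
        o = sigmoid(W["o"] @ x + V["o"] @ h + b["o"])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def contextual_decomposition(xs, W, V, b, start, stop):
    """Return (beta, gamma) for the phrase xs[start..stop], inclusive."""
    H = b["i"].shape[0]
    rel_h, irrel_h = np.zeros(H), np.zeros(H)
    rel_c, irrel_c = np.zeros(H), np.zeros(H)
    for t, x in enumerate(xs):
        in_phrase = start <= t <= stop
        # Gate pre-activations contributed by each part of the previous state.
        rel = {k: V[k] @ rel_h for k in "ifgo"}
        irrel = {k: V[k] @ irrel_h for k in "ifgo"}
        # The current input counts as relevant only if it lies inside the phrase.
        target = rel if in_phrase else irrel
        for k in "ifgo":
            target[k] = target[k] + W[k] @ x
        i_rel, i_irr, i_bias = split3(rel["i"], irrel["i"], b["i"], sigmoid)
        g_rel, g_irr, g_bias = split3(rel["g"], irrel["g"], b["g"], np.tanh)
        f_rel, f_irr, f_bias = split3(rel["f"], irrel["f"], b["f"], sigmoid)
        # Expand i * g and assign each cross term to one side.
        new_rel_c = i_rel * (g_rel + g_bias) + i_bias * g_rel
        new_irrel_c = i_irr * (g_rel + g_irr + g_bias) + (i_rel + i_bias) * g_irr
        # Assumed convention: bias-only term follows the current position.
        if in_phrase:
            new_rel_c = new_rel_c + i_bias * g_bias
        else:
            new_irrel_c = new_irrel_c + i_bias * g_bias
        # Expand f * c_{t-1} in the same way.
        new_rel_c = new_rel_c + (f_rel + f_bias) * rel_c
        new_irrel_c = new_irrel_c + (f_rel + f_irr + f_bias) * irrel_c + f_irr * rel_c
        rel_c, irrel_c = new_rel_c, new_irrel_c
        # The output gate is applied undecomposed to both parts.
        o = sigmoid(rel["o"] + irrel["o"] + b["o"])
        tanh_rel, tanh_irr = split2(rel_c, irrel_c)
        rel_h, irrel_h = o * tanh_rel, o * tanh_irr
    return rel_h, irrel_h

# Toy example with random weights (hypothetical sizes, not a trained model).
rng = np.random.default_rng(0)
D, H, T = 4, 3, 5
W = {k: rng.normal(scale=0.5, size=(H, D)) for k in "ifgo"}
V = {k: rng.normal(scale=0.5, size=(H, H)) for k in "ifgo"}
b = {k: rng.normal(scale=0.5, size=H) for k in "ifgo"}
xs = rng.normal(size=(T, D))

beta, gamma = contextual_decomposition(xs, W, V, b, start=1, stop=2)
h_final = lstm_forward(xs, W, V, b)
print(np.allclose(beta + gamma, h_final))
```

By construction, `beta + gamma` reproduces the hidden state from the ordinary forward pass, which is a useful sanity check for any custom implementation. In the paper, `beta` is then passed through the classifier's output layer to obtain a phrase-level score; that layer is omitted here.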

Example Use Cases

Explainability

Analysing why an LSTM-based spam filter flagged an email by decomposing contributions from individual words ('free', 'urgent') versus their contextual interactions ('free trial' together).

Understanding how a medical text classifier diagnoses conditions from clinical notes by separating direct symptom mentions from contextual medical reasoning patterns.

Transparency

Providing transparent explanations for automated content moderation decisions by showing which words and phrase interactions contributed to toxicity detection.

Limitations

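Originally designed for LSTM and simple RNN architectures; it does not apply directly to transformers or other attention-based models, which require separate extensions of the method (such as the transformer work listed under Resources).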
Not widely implemented in standard machine learning libraries, often requiring custom implementation.
Computational overhead increases significantly with sequence length and model depth, because each word or phrase to be scored needs its own decomposition pass through the network.
May not scale well to very complex models or capture all types of feature interactions in deep networks.

Resources

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

Research Paper•W. James Murdoch, Peter J. Liu, and Bin Yu•Jan 16, 2018

FredericGodin/ContextualDecomposition-NLP

Software Package

Interpreting patient-Specific risk prediction using contextual decomposition of BiLSTMs: Application to children with asthma

Research Paper•Alsaad R. et al.•Jan 1, 2019

Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models

Research Paper•Xisen Jin et al.•Nov 8, 2019

Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition

Research Paper•Aliyah R. Hsu et al.•Jul 1, 2024

Related Techniques

Name	Description	Assurance Goals
Individual Conditional Expectation Plots	ICE plots display the predicted output for individual instances as a function of a feature, with all other features held fixed for each instance. Each line on an ICE plot represents one instance's prediction trajectory as the feature of interest changes, revealing whether different instances are affected differently by that feature.	Explainability
Permutation Importance	Permutation Importance quantifies a feature's contribution to a model's performance by randomly shuffling its values and measuring the resulting drop in predictive accuracy. If shuffling a feature significantly degrades the model's performance, that feature is considered important. This model-agnostic technique helps identify which inputs are genuinely driving predictions, rather than just being correlated with the outcome.	Explainability Reliability
Deep Ensembles	Deep ensembles combine predictions from multiple neural networks trained independently with different random initializations to capture epistemic uncertainty (model uncertainty). By training several models on the same data with different starting points, the ensemble reveals how much the model's predictions depend on training randomness. The disagreement between ensemble members naturally indicates prediction uncertainty - when models agree, confidence is high; when they disagree, uncertainty is revealed. This approach provides more reliable uncertainty estimates, better out-of-distribution detection, and improved calibration compared to single models.	Reliability Transparency Safety
Gradient-weighted Class Activation Mapping	Grad-CAM creates visual heatmaps showing which regions of an image a convolutional neural network focuses on when making a specific classification. Unlike pixel-level techniques, Grad-CAM produces coarser region-based explanations by using gradients from the predicted class to weight the CNN's final feature maps, then projecting these weighted activations back to create an overlay on the original image. This provides intuitive visual explanations of where the model is 'looking' for evidence of different classes.	Explainability Fairness
Factor Analysis	Factor analysis is a statistical technique that identifies latent variables (hidden factors) underlying observed correlations in data. It works by analysing how variables relate to each other, finding a smaller number of unobserved factors that explain patterns among multiple observed variables. Unlike PCA which maximises total variance, factor analysis focuses on shared variance (communalities - the variance variables have in common) whilst separating out unique variance and measurement error. After extracting factors, rotation methods like varimax (which creates uncorrelated factors) or oblimin (allowing correlated factors) help make factors more interpretable by aligning them with distinct groups of variables.	Explainability Transparency
Mean Decrease Impurity	Mean Decrease Impurity (MDI) quantifies a feature's importance in tree-based models (e.g., Random Forests, Gradient Boosting Machines) by measuring the total reduction in impurity (e.g., Gini impurity, entropy) across all splits where the feature is used. Features that lead to larger, more consistent reductions in impurity are considered more important, indicating their effectiveness in creating homogeneous child nodes and improving predictive accuracy.	Explainability Reliability