Layer-wise Relevance Propagation
Description
Layer-wise Relevance Propagation (LRP) explains neural network predictions by working backwards through the network to show how much each input feature contributed to the final decision. It follows a simple conservation rule: the total contribution scores always add up to the original prediction. Starting from the output, LRP distributes 'relevance' backwards through each layer using different rules depending on the layer type. This creates a detailed breakdown showing which input features helped or hindered the prediction, making it easier to understand why the network made a particular decision.
Example Use Cases
Explainability
Identifying which pixels in chest X-rays contribute to pneumonia detection, helping radiologists verify AI diagnoses by highlighting anatomical regions the model considers relevant.
Debugging a neural network's misclassification of handwritten digits by tracing relevance through layers to identify which input pixels caused the error and which network layers amplified it.
Transparency
Providing transparent explanations for automated credit scoring decisions by showing which financial features received positive or negative relevance scores, enabling clear regulatory reporting.
Limitations
- Requires different propagation rules for each layer type, making implementation complex for new architectures.
- Can produce negative relevance scores which may be difficult to interpret intuitively.
- Rule selection (LRP-ε, LRP-γ, etc.) significantly affects results and requires domain expertise.
- Limited to feedforward networks and may not work well with modern architectures like transformers without substantial modifications.