Saliency Maps
Description
Saliency maps are visual explanations for image classification models that highlight which pixels in an image most strongly influence the model's prediction. They are computed by taking the gradient of the model's output score with respect to the input pixels, producing a heatmap in which brighter regions indicate the pixels that, if changed, would most significantly affect the prediction. This technique helps users understand which parts of an image the model is 'looking at' when making decisions.
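A minimal sketch of this gradient computation in PyTorch is shown below. The torchvision ResNet-18 model and the random placeholder image are illustrative assumptions only, standing in for whatever model and preprocessed input are being explained.

```python
# Sketch of a vanilla gradient saliency map (assumes PyTorch and torchvision
# are available; the pretrained ResNet-18 is only an example model).
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# A single RGB image, already preprocessed to the model's expected input size
# (random placeholder here).
image = torch.rand(1, 3, 224, 224, requires_grad=True)

scores = model(image)                          # forward pass: class logits
top_class = scores.argmax(dim=1).item()        # explain the predicted class
scores[0, top_class].backward()                # gradient of that score w.r.t. pixels

# Collapse the per-channel gradients to one value per pixel; larger values mean
# the prediction is more sensitive to changes at that pixel.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)  # shape: (224, 224)
```

The resulting `saliency` tensor can then be overlaid on the original image as a heatmap for inspection.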
Example Use Cases
Explainability
Analysing X-ray images in a pneumonia detection model to verify that the algorithm focuses on lung regions showing inflammatory patterns rather than irrelevant areas like medical equipment or patient positioning markers.
Examining skin lesion classification models to ensure the algorithm identifies diagnostic features (irregular borders, colour variation) rather than artifacts like rulers, hair, or skin markings that shouldn't influence medical decisions.
Fairness
Auditing a dermatology AI system to verify it focuses on medical symptoms rather than skin colour when diagnosing conditions, ensuring equitable treatment across racial groups by revealing inappropriate attention to demographic features.
Limitations
- Saliency maps are often noisy and can change dramatically with small input perturbations, making them unstable (see the sketch after this list).
- Highlighted regions may not correspond to semantically meaningful or human-understandable features.
- Indicates only local gradient sensitivity, not causal importance or the model's actual decision-making logic.
- May highlight irrelevant pixels that happen to have high gradients due to model artifacts rather than meaningful patterns.
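One way to probe the instability noted in the first limitation is to recompute the saliency map after adding small random noise to the input and compare the two. The snippet below is a hedged sketch, not a standard diagnostic: the `saliency_map` helper simply wraps the gradient computation from the Description, and the noise level is an arbitrary illustrative choice.

```python
# Sketch of a simple stability check: compare saliency maps before and after a
# small random perturbation of the input.
import torch

def saliency_map(model, image):
    # Gradient of the top-class score with respect to the input pixels.
    image = image.clone().detach().requires_grad_(True)
    scores = model(image)
    scores[0, scores.argmax(dim=1).item()].backward()
    return image.grad.abs().max(dim=1).values.squeeze(0)

def saliency_stability(model, image, noise_std=0.01):
    base = saliency_map(model, image)
    perturbed = saliency_map(model, image + noise_std * torch.randn_like(image))
    # Cosine similarity near 1.0 suggests a stable explanation; a large drop for
    # a tiny noise_std illustrates the instability described above.
    return torch.nn.functional.cosine_similarity(
        base.flatten(), perturbed.flatten(), dim=0
    ).item()
```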