Local Interpretable Model-Agnostic Explanations
Description
LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by approximating the complex model's behaviour in a small neighbourhood around a specific instance. It works by creating perturbed versions of the input (e.g., removing words from text, changing pixel values in images, or varying feature values), obtaining the model's predictions for these variations, and training a simple interpretable surrogate model (typically a linear model), with each perturbed sample weighted by its proximity to the original instance. The coefficients of this local surrogate model reveal which features most influenced the specific prediction.
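The procedure can be sketched in a few lines. The following is a minimal, illustrative implementation for tabular data, not the reference `lime` library: it assumes a hypothetical `black_box_predict` function that maps rows to a predicted probability, and per-feature scales used both for perturbation and for the proximity kernel.

```python
# Minimal sketch of the LIME procedure for tabular data (illustrative only).
# `black_box_predict` and `feature_scales` are assumed inputs, not library APIs.
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(black_box_predict, x, feature_scales, num_samples=5000,
                 kernel_width=0.75, random_state=0):
    """Explain one prediction by fitting a proximity-weighted linear surrogate."""
    rng = np.random.default_rng(random_state)

    # 1. Perturb: sample points in a neighbourhood around the instance x.
    perturbations = x + rng.normal(0.0, feature_scales, size=(num_samples, x.size))

    # 2. Query the black-box model on every perturbed sample.
    predictions = black_box_predict(perturbations)  # e.g. probability of class 1

    # 3. Weight each sample by its proximity to x (exponential kernel).
    distances = np.linalg.norm((perturbations - x) / feature_scales, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

    # 4. Fit an interpretable surrogate (ridge regression) on the weighted samples.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, predictions, sample_weight=weights)

    # The coefficients approximate each feature's local influence on the prediction.
    return surrogate.coef_
```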
Example Use Cases
Explainability
Explaining why a specific patient received a high-risk diagnosis by showing which input features (fever, blood pressure, age) contributed most to the prediction, helping doctors validate the AI's reasoning.
Debugging a text classifier's misclassification of a movie review by highlighting which words (e.g., sarcastic phrases) confused the model, enabling targeted model improvements.
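For the movie-review debugging case, the open-source `lime` package offers a text explainer. The sketch below assumes the package is installed (`pip install lime`) and uses a toy scikit-learn sentiment pipeline purely as a stand-in for the model being debugged.

```python
# Sketch of the movie-review debugging use case with the `lime` package.
# The toy training data and pipeline are illustrative stand-ins only.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy sentiment classifier standing in for the model being debugged.
train_texts = [
    "a wonderful, moving film with great acting",
    "brilliant direction and a great script",
    "a dull, boring mess with terrible pacing",
    "terrible acting and a boring plot",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
review = "great acting, if you enjoy a boring, terrible script"
explanation = explainer.explain_instance(
    review,
    pipeline.predict_proba,  # maps a list of texts to class probabilities
    num_features=6,          # report the six most influential words
)

# Positive weights push towards "positive", negative weights towards "negative".
for word, weight in explanation.as_list():
    print(f"{word:>10}  {weight:+.3f}")
```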
Transparency
Providing transparent explanations to customers about automated decisions in insurance claims, showing which claim features influenced approval or denial to meet regulatory requirements.
Limitations
- Explanations can be unstable due to random sampling, producing different results across multiple runs (see the stability check sketched after this list).
- The linear surrogate may poorly approximate highly non-linear model behaviour in the local region.
- Defining the neighbourhood size and perturbation strategy requires careful tuning for each data type.
- Can be computationally expensive for explaining many instances due to repeated model queries.
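A simple way to observe the sampling instability noted above is to explain the same instance twice with different random seeds and compare the reported feature weights. The sketch below assumes the `lime` package and uses scikit-learn's iris data and a random forest purely for illustration.

```python
# Sketch of a stability check: explain the same instance with two different
# random seeds and compare the feature weights. Dataset and model are illustrative.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)
instance = data.data[0]

def explain(seed):
    explainer = LimeTabularExplainer(
        data.data,
        feature_names=data.feature_names,
        class_names=data.target_names,
        random_state=seed,  # only the sampling seed differs between runs
    )
    exp = explainer.explain_instance(instance, model.predict_proba, num_features=4)
    return dict(exp.as_list())

run_a, run_b = explain(seed=1), explain(seed=2)

# If the explanation were perfectly stable, the weights would agree across runs.
for feature in run_a:
    print(f"{feature:<35} {run_a[feature]:+.3f}  vs  {run_b.get(feature, float('nan')):+.3f}")
```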