Explainability
Causality
Identifies causal rather than correlational relationships
3 techniques in this subcategory
| Technique | Goals | Models | Data Types | Description |
|---|---|---|---|---|
| Influence Functions | Algorithmic | Architecture/linear Models, Architecture/neural Networks, +6 | Any | Influence functions quantify how much each training example influenced a model's predictions by computing the change in... |
| Causal Mediation Analysis in Language Models | Mechanistic Interpretability | Architecture/neural Networks/transformer, Architecture/neural Networks/transformer/llm, +3 | Text | Causal mediation analysis in language models is a mechanistic interpretability technique that systematically... |
| Concept Activation Vectors | Algorithmic | Architecture/neural Networks, Requirements/gradient Access, +2 | Any | Concept Activation Vectors (CAVs), also known as Testing with Concept Activation Vectors (TCAV), identify mathematical... |
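
The Influence Functions entry above is truncated, so the following is only a minimal sketch of the general idea, assuming a ridge regression model where the Hessian has a closed form. It uses the standard first-order formulation, in which the influence of upweighting a training point on a test loss is the negative product of the test gradient, the inverse Hessian, and the training gradient. The function names `fit_ridge` and `influence_on_test_loss`, and the regularisation parameter `lam`, are hypothetical and chosen for illustration; they do not come from the catalog.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimise (1/n) * sum 0.5*(x_i @ theta - y_i)^2 + 0.5*lam*||theta||^2.

    Returns the fitted parameters and the Hessian of that objective.
    """
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)        # exact Hessian for ridge
    theta = np.linalg.solve(H, X.T @ y / n)  # closed-form minimiser
    return theta, H

def influence_on_test_loss(X, y, x_test, y_test, lam=0.1):
    """First-order influence of upweighting each training point on the
    squared-error loss at a single test point (illustrative helper).

    A negative score means upweighting the point would lower the test loss.
    """
    theta, H = fit_ridge(X, y, lam)
    # Per-example gradients of the training loss, shape (n, d).
    grad_train = (X @ theta - y)[:, None] * X
    # Gradient of the test loss with respect to the parameters, shape (d,).
    grad_test = (x_test @ theta - y_test) * x_test
    # Solve H s = grad_test instead of forming the inverse Hessian.
    s_test = np.linalg.solve(H, grad_test)
    return -grad_train @ s_test
```

Under this formulation, removing training point `i` corresponds to upweighting it by `-1/n`, so `-influence[i] / n` approximates the change in test loss from dropping that point. For neural networks the Hessian cannot be formed explicitly like this, and implementations typically rely on approximate inverse-Hessian-vector products instead.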