Techniques that analyze a model's internal representations
3 subcategories • 11 techniques
Identifies concepts a model has learned (e.g., Concept Activation Vectors (CAVs), Neuron Activation Analysis)
Breaks a prediction down into the contributions of its components (e.g., Taylor Decomposition, Contextual Decomposition)
Reduces high-dimensional representations to a few dimensions that can be inspected or plotted (e.g., PCA, t-SNE, UMAP)
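
As a rough illustration of the last subcategory, the sketch below applies PCA to a matrix of hidden-layer activations using only NumPy. The activations are synthetic stand-ins (two Gaussian clusters in 64 dimensions), and the helper name `pca_project` is hypothetical. In practice the matrix would come from a real model's intermediate layer, and a library implementation of PCA, t-SNE, or UMAP would likely be used instead.

```python
import numpy as np


def pca_project(activations, n_components=2):
    """Project each row of `activations` onto the top principal components.

    Returns the projected points and the fraction of variance explained
    by each retained component.
    """
    # Center each feature so the SVD captures variance, not the mean offset.
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Rows of `vt` are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected, explained[:n_components]


# Assumption: synthetic data standing in for activations of two input classes.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=0.0, scale=1.0, size=(100, 64))
class_b = rng.normal(loc=3.0, scale=1.0, size=(100, 64))
activations = np.vstack([class_a, class_b])

projected, explained = pca_project(activations, n_components=2)
print(projected.shape)  # (200, 2)
print(explained)        # share of variance captured by each component
```

If the two classes are represented differently at this layer, they should appear as separate groups along the first component when `projected` is plotted. Because PCA is linear, structure that is only visible after a nonlinear embedding (the motivation for t-SNE and UMAP) may not show up here.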