E-TCAV: Efficient Concept-Based Neural Network Interpretability
Researchers introduce E-TCAV, a framework for efficiently approximating TCAV (Testing with Concept Activation Vectors) scores. It targets three weaknesses of TCAV: computational overhead, inter-layer disagreement, and statistical instability. The method builds on an investigation of latent classifiers and inter-layer agreement, and uses the penultimate layer as a fast proxy. Evaluations span four architectures and five datasets.
Key facts
- TCAV assesses alignment between neural network internal representations and human-understandable concepts.
- E-TCAV aims to reduce the computational overhead of TCAV.
- E-TCAV addresses the inter-layer disagreement of TCAV scores.
- E-TCAV uses the penultimate layer as a fast proxy for earlier layers.
- Evaluations were conducted across four architectures and five datasets.
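For readers unfamiliar with the quantity being approximated, the following is a minimal sketch of a standard TCAV score at a single layer, not the E-TCAV procedure itself. It assumes activations and class-logit gradients have already been extracted from a model (here they are synthetic), and it uses a difference of class means as a simple stand-in for the normal vector of a trained linear classifier. The helper names `compute_cav` and `tcav_score` are hypothetical.

```python
import numpy as np


def compute_cav(concept_acts, random_acts):
    """Hypothetical helper: unit vector pointing from random toward concept activations.

    Assumption: a difference of class means replaces the trained linear
    classifier whose normal vector usually defines the CAV.
    """
    direction = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)


def tcav_score(logit_grads, cav):
    """Fraction of inputs whose class logit increases along the CAV direction."""
    # Directional derivative of the class logit along the CAV, one per input.
    directional_derivs = logit_grads @ cav
    return float(np.mean(directional_derivs > 0))


# Synthetic stand-ins for layer activations (assumption: 8-dimensional layer).
rng = np.random.default_rng(0)
dim = 8
concept_acts = rng.normal(loc=1.0, size=(50, dim))  # activations for concept examples
random_acts = rng.normal(loc=0.0, size=(50, dim))   # activations for random examples
cav = compute_cav(concept_acts, random_acts)

# Synthetic gradients of one class logit with respect to the layer activations.
logit_grads = rng.normal(size=(100, dim))
score = tcav_score(logit_grads, cav)
print(f"TCAV score: {score:.2f}")
```

Repeating this computation for every layer, and for many random example sets to test significance, is where the cost grows; computing the score at a single layer such as the penultimate one avoids the per-layer repetition.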