Explaining and Improving Model Behavior with k Nearest Neighbor Representations

Ben Krause, Wengpeng Yin, Tong Niu
Other Links: arxiv.org

Abstract:

Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens. We propose using k nearest neighbor (kNN) representations to identify training examples responsible for a model's predictions and obtain a corpus-level understanding of the model's behavior.
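To make the abstract's idea concrete, here is a minimal sketch of retrieving the nearest training examples to a test input in a model's representation space. The model name (bert-base-uncased), the use of the [CLS] vector, Euclidean distance, k=1, and the toy NLI examples are illustrative assumptions, not details taken from the paper.

```python
# Sketch: explain a prediction by retrieving the training examples whose
# representations are nearest to the test input's representation.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.neighbors import NearestNeighbors

# Assumed encoder; the paper's fine-tuned checkpoint would replace this.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(texts):
    """Return one [CLS] representation vector per input text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0, :].numpy()  # [CLS] token vectors

# Toy NLI-style training set (premise / hypothesis pairs), purely illustrative.
train_texts = [
    "A man is playing a guitar. / A person plays music.",
    "A dog runs in the park. / The dog is sleeping.",
]
train_labels = ["entailment", "contradiction"]

# Index the training representations; Euclidean distance is an assumption.
index = NearestNeighbors(n_neighbors=1).fit(embed(train_texts))

# Retrieve the training example "responsible" for this test input.
test_text = "A woman sings a song. / Someone is making music."
dist, idx = index.kneighbors(embed([test_text]))
print("Nearest training example:", train_texts[idx[0][0]])
print("Neighbor label:", train_labels[idx[0][0]], "| distance:", float(dist[0][0]))
```

Aggregating such neighbor lookups across a whole evaluation set is one way to move from per-example explanations to the corpus-level view the abstract describes, e.g. spotting clusters of mislabeled or spurious training examples that are repeatedly retrieved.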
