What Does BERT Look At? An Analysis of BERT's Attention

Kevin Clark
Urvashi Khandelwal

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP at ACL 2019, pp. 276–286, 2019.

DOI: https://doi.org/10.18653/v1/w19-4828

Abstract:

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT's attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Finally, we propose an attention-based probing classifier and use it to further demonstrate that substantial syntactic information is captured in BERT's attention.
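The per-head attention maps the paper analyzes can be extracted with standard tooling. Below is a minimal sketch, assuming the HuggingFace transformers library (not the authors' own released code), that pulls BERT's attention weights and reproduces one of the paper's simplest observations: how much attention mass each layer places on the [SEP] delimiter token. The example sentence is arbitrary.

```python
import torch
from transformers import BertTokenizer, BertModel

# Load BERT with attention outputs enabled.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple of 12 tensors (one per layer),
# each shaped (batch, num_heads, seq_len, seq_len).
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
sep_idx = tokens.index("[SEP]")

# For each layer, average over heads and query positions the
# attention mass placed on the [SEP] delimiter token.
for layer, attn in enumerate(outputs.attentions):
    sep_mass = attn[0, :, :, sep_idx].mean().item()
    print(f"layer {layer:2d}: mean attention to [SEP] = {sep_mass:.3f}")
```

Indexing the last attention dimension selects [SEP] as the key position, so the average is the fraction of each query token's attention distribution that lands on the delimiter; the paper reports this quantity peaking in BERT's middle layers.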
