Cloze-driven Pretraining of Self-attention Networks

Alexei Baevski
Sergey Edunov
Yinhan Liu

EMNLP/IJCNLP (1), pp. 5359-5368, 2019.

DOI: https://doi.org/10.18653/v1/D19-1539
Other links: dblp.uni-trier.de | arxiv.org

Abstract:

We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems. Our model solves a cloze-style word reconstruction task, where each word is ablated and must be predicted given the rest of the text. Experiments demonstrate large …
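As a concrete illustration of the cloze objective described in the abstract, the minimal sketch below ablates each position of a toy batch in turn and trains a small bi-directional transformer encoder to predict the hidden word from the remaining context. This is not the authors' implementation: the vocabulary size, mask token id, model dimensions, and toy data are assumptions chosen only to make the example self-contained and runnable.

# Minimal sketch of a cloze-style word reconstruction objective (assumed
# hyperparameters and toy data; not the paper's architecture or code).
import torch
import torch.nn as nn

VOCAB_SIZE, PAD_ID, MASK_ID = 1000, 0, 1   # assumed ids for illustration
D_MODEL, N_HEADS, N_LAYERS = 128, 4, 2

class ClozeModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL, padding_idx=PAD_ID)
        layer = nn.TransformerEncoderLayer(D_MODEL, N_HEADS, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, N_LAYERS)
        self.out = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, seq_len); positions set to MASK_ID are ablated
        return self.out(self.encoder(self.embed(tokens)))

def cloze_loss(model, tokens):
    # Ablate each word in turn and predict it given the rest of the text.
    batch, seq_len = tokens.shape
    total = 0.0
    for i in range(seq_len):
        corrupted = tokens.clone()
        corrupted[:, i] = MASK_ID                 # hide position i
        logits = model(corrupted)[:, i, :]        # predict the hidden word
        total = total + nn.functional.cross_entropy(logits, tokens[:, i])
    return total / seq_len

model = ClozeModel()
toy_batch = torch.randint(2, VOCAB_SIZE, (4, 16))  # random token ids
loss = cloze_loss(model, toy_batch)
loss.backward()
print(f"cloze loss: {loss.item():.3f}")

In practice, re-encoding the sequence once per ablated position as in this sketch is expensive; it is shown here only to make the "predict each word from the rest of the text" objective explicit.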
