MC-BERT: Efficient Language Pre-Training via a Meta Controller

Other Links: arxiv.org

Abstract:

Pre-trained contextual representations (e.g., BERT) have become the foundation to achieve state-of-the-art results on many NLP tasks. However, large-scale pre-training is computationally expensive. ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a …
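
The abstract refers to ELECTRA's replaced-token-detection objective: a small generator proposes replacements at masked positions, and a discriminator classifies every token as original or replaced. Below is a minimal sketch of that setup, not the code of this paper or of ELECTRA; the module sizes, toy data, and use of PyTorch are illustrative assumptions.

```python
# Minimal sketch of ELECTRA-style replaced-token detection (illustrative only).
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
# Toy "generator" and "discriminator"; real models would be Transformers.
generator = nn.Sequential(nn.Embedding(vocab_size, hidden), nn.Linear(hidden, vocab_size))
discriminator = nn.Sequential(nn.Embedding(vocab_size, hidden), nn.Linear(hidden, 1))

tokens = torch.randint(0, vocab_size, (4, 16))   # toy batch of token ids
mask = torch.rand(tokens.shape) < 0.15           # positions to corrupt

# Generator samples plausible replacements at the masked positions.
with torch.no_grad():
    gen_logits = generator(tokens)                                # (batch, seq, vocab)
    sampled = torch.distributions.Categorical(logits=gen_logits).sample()
corrupted = torch.where(mask, sampled, tokens)

# Discriminator predicts, per token, whether it was replaced.
is_replaced = (corrupted != tokens).float()
disc_logits = discriminator(corrupted).squeeze(-1)                # (batch, seq)
loss = nn.functional.binary_cross_entropy_with_logits(disc_logits, is_replaced)
loss.backward()
print(loss.item())
```

Note that the binary original-vs-replaced decision is made for every token, which is what makes the objective sample-efficient relative to masked language modeling; MC-BERT's meta controller builds on this idea, per the title.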
