A Text Generation and Prediction System: Pre-training on New Corpora Using BERT and GPT-2

2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC)

Cited by 45 | Views 107
Abstract
Generating a sentence from a given starting word, or filling in missing words within a sentence, is an important direction in natural language processing. To some extent, it reflects whether a machine can exhibit human-like thinking and creativity. Training a machine for such specific tasks and then applying it in natural language processing helps solve sentence generation problems, especially in application scenarios such as summary generation, machine translation, and automatic question answering. The OpenAI GPT-2 and BERT models are currently widely used language models for text generation and prediction, and many experiments have verified their outstanding performance in the field of text generation. This paper trains the OpenAI GPT-2 model on two new corpora to generate long sentences and articles, and then performs a comparative analysis. At the same time, we use the BERT model to complete the task of predicting intermediate words based on the surrounding context.
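Although the paper provides no code, the two tasks it describes, predicting an intermediate word with BERT and generating text from a starting word with GPT-2, can be sketched with the Hugging Face transformers pipelines. The snippet below is a minimal illustration: the library, the stock bert-base-uncased and gpt2 checkpoints, and the prompts are assumptions of ours, and the paper's fine-tuning on its two new corpora is not reproduced here.

# Minimal sketch (assumed Hugging Face transformers API and stock checkpoints,
# not the authors' fine-tuned models):
from transformers import pipeline

# BERT: predict an intermediate word from the surrounding context (fill-mask task).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Text generation is an important direction of natural language [MASK]."))

# GPT-2: generate a continuation from a given starting word or phrase.
generator = pipeline("text-generation", model="gpt2")
print(generator("Machine translation", max_new_tokens=40, num_return_sequences=1))

Fine-tuning GPT-2 on a new corpus, as the paper does, would amount to replacing the stock gpt2 checkpoint above with the adapted weights.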
Keywords
language model, text generation, OpenAI GPT-2, BERT