A Large-Scale Chinese Short-Text Conversation Dataset

Yida Wang
Yinhe Zheng
Kaili Huang
Yong Jiang

international conference natural language processing, pp. 91-103, 2020.

Weibo:
We present pre-trained models for Chinese dialogue generation, which are trained on 12M open-domain conversations.

Abstract:

Advances in neural dialogue generation models show promising results on modeling short-text conversations. However, training such models usually requires a large-scale, high-quality dialogue corpus, which is hard to obtain. In this paper, we present LCCC, a large-scale cleaned Chinese conversation dataset, which contains a base version...
