Cross Copy Network for Dialogue Generation

EMNLP 2020, pp. 1900–1910.

Abstract

In the past few years, audiences from different fields have witnessed the achievements of sequence-to-sequence models (e.g., LSTM+attention, Pointer Generator Networks, and Transformer) in enhancing dialogue content generation. While content fluency and accuracy often serve as the major indicators for model training, dialogue logics, carrying criti… [abstract truncated]

Introduction
Highlights
  • As an important task in Natural Language Generation (NLG), dialogue generation empowers a wide spectrum of applications, such as chatbot and customer service automation
  • As shown in Figure 1, we propose two different kinds of copy mechanisms in this study: vertical copy, which copies context-dependent information from within the target dialogue instance, and horizontal copy, which copies logic-dependent content across different 'Similar Cases' (SCs); see the sketch after this list
  • To evaluate the effectiveness of the dialogue generated by Cross Copy Networks (CCN), we used ROUGE (Lin and Hovy, 2003) and BLEU (Papineni et al., 2002) scores to compare different models
  • We proposed a novel neural network structure, Cross Copy Networks, enabling both vertical copy and horizontal copy
  • Experiments show that our model achieves state-of-the-art results on both domain datasets
  • This study serves as a methodological foundation for applying multi-granularity copy mechanisms to content generation
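
To make the two copy directions concrete, here is a minimal sketch of one decoder step that mixes a generation distribution with a vertical-copy distribution over the target-dialogue context and a horizontal-copy distribution over similar-case tokens. The gating and the attention inputs are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_copy_step(vocab_logits: torch.Tensor,  # (V,)  generation scores
                    ctx_attn: torch.Tensor,      # (Lc,) attention over context tokens
                    sc_attn: torch.Tensor,       # (Ls,) attention over similar-case tokens
                    ctx_ids: torch.Tensor,       # (Lc,) long: vocab ids of context tokens
                    sc_ids: torch.Tensor,        # (Ls,) long: vocab ids of similar-case tokens
                    gate_logits: torch.Tensor    # (3,)  weights: [generate, vertical, horizontal]
                    ) -> torch.Tensor:
    gate = F.softmax(gate_logits, dim=-1)        # soft switch between the three modes
    p_vocab = F.softmax(vocab_logits, dim=-1)    # generate from the vocabulary
    p_ctx = F.softmax(ctx_attn, dim=-1)          # vertical copy: target-dialogue context
    p_sc = F.softmax(sc_attn, dim=-1)            # horizontal copy: similar cases
    # Scatter copy probabilities back onto vocabulary positions and mix.
    out = gate[0] * p_vocab
    out = out.index_add(0, ctx_ids, gate[1] * p_ctx)
    out = out.index_add(0, sc_ids, gate[2] * p_sc)
    return out                                   # a valid distribution (sums to 1)
```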
Summary
  • Introduction:

    As an important task in Natural Language Generation (NLG), dialogue generation empowers a wide spectrum of applications, such as chatbot and customer service automation.
  • Li et al. (2019); Rajpurkar et al. (2018); Huang et al. (2018); Reddy et al. (2019) explore documents as knowledge sources for dialogue generation, while Xia et al. (2017); Ye et al. (2020); Ghazvininejad et al. (2018); Parthasarathi and Pineau (2018) utilize unstructured knowledge for open-domain dialogue generation.
  • However, the high cost of knowledge construction and imperfect domain adaptation restrict the utilization of such approaches
  • Results:

    To evaluate the effectiveness of the dialogue generated by CCN, the authors used ROUGE (Lin and Hovy, 2003) and BLEU (Papineni et al., 2002) scores to compare different models.
  • Denoting all the parameters in the model as δ, the authors obtain the following optimized objective function: min_δ loss = loss_θ + λ‖δ‖ (Eq. 11); a code sketch follows this summary.
  • In order to ensure the rationality and correctness of the generated utterances, the authors conducted human evaluation.
  • Conclusion:

    The authors proposed a novel neural network structure, Cross Copy Networks, enabling both vertical copy and horizontal copy.
  • The proposed CCN does not require additional knowledge input and can be adapted to other domains.
  • Experimental results demonstrate CCN's superiority over a number of existing state-of-the-art text generation models, indicating that the cross copy mechanism can successfully enhance dialogue generation performance.
  • The authors will further investigate other content generation problems by leveraging multi-granularity copy mechanisms.
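
The Eq. (11) objective pairs the generation loss with a penalty on all model parameters δ. A minimal sketch, assuming a standard L2 penalty (the garbled source only preserves "loss + λ‖δ‖", so the exact norm is an assumption):

```python
import torch
import torch.nn as nn

def regularized_loss(gen_loss: torch.Tensor, model: nn.Module,
                     lam: float = 1e-5) -> torch.Tensor:
    # Eq. (11)-style objective: task loss plus lambda times a norm over all
    # parameters delta. The squared L2 norm here is our assumption.
    penalty = sum(p.pow(2).sum() for p in model.parameters())
    return gen_loss + lam * penalty
```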
Tables
  • Table 1: Statistics of the CDD and JDDC datasets
  • Table 2: Quantitative evaluation. We report ROUGE-1, ROUGE-L, and BLEU scores for each tested method (a scoring sketch follows these table notes)
  • Table 3: Qualitative evaluation. We report the average score (Avg) and the κ value for relevance and fluency. We recruited five annotators to evaluate the sentences generated by all the models. To be fair, for each input we shuffled the outputs of all the models before the annotators evaluated them. The κ value represents the consistency of the evaluations across annotators; a κ coefficient between 0.48 and 0.82 indicates moderate to substantial agreement
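
For context on the Table 2 numbers, here is a minimal scoring sketch using the common rouge-score and NLTK packages; the authors' own evaluation scripts and tokenization are not specified (these are Chinese corpora, so word segmentation matters), so treat this as an approximation.

```python
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def score_pair(reference: str, hypothesis: str) -> dict:
    # Inputs are assumed to be pre-tokenized, space-separated strings
    # (for Chinese text, run word segmentation first).
    rs = rouge_scorer.RougeScorer(["rouge1", "rougeL"])
    rouge = rs.score(reference, hypothesis)
    # Sentence-level BLEU over token lists, with smoothing for short outputs.
    bleu = sentence_bleu([reference.split()], hypothesis.split(),
                         smoothing_function=SmoothingFunction().method1)
    return {"ROUGE-1": rouge["rouge1"].fmeasure,
            "ROUGE-L": rouge["rougeL"].fmeasure,
            "BLEU": bleu}
```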
Funding
  • This work is supported by National Key R&D Program of China (2018YFC0830200; 2018YFC0830206; 2018YFC0830700)
Study Data and Analysis
similar cases: 50
The similar cases (SCs) of a target case are discovered from the same dataset in which the target case resides. For efficiency, we use ElasticSearch to retrieve the top 50 similar cases as candidates, using the target case as the query and all the other cases as documents. For effectiveness, we fine-tune the pre-trained RoBERTa (Liu et al., 2019b) model.
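
A rough sketch of such a retrieve-then-refine pipeline appears below, assuming ElasticSearch BM25 recall followed by RoBERTa-based re-ranking. The index name, field name, checkpoint, and cosine-similarity scoring are all hypothetical, since the summary does not specify how the fine-tuned RoBERTa is applied; the search call follows the elasticsearch-py 8.x client.

```python
import torch
from elasticsearch import Elasticsearch
from transformers import AutoModel, AutoTokenizer

es = Elasticsearch("http://localhost:9200")                          # assumed local cluster
tok = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")   # assumed checkpoint
enc = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

def embed(text: str) -> torch.Tensor:
    # Use the [CLS] vector as a sentence embedding (one plausible reduction).
    batch = tok(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        return enc(**batch).last_hidden_state[:, 0]

def similar_cases(target_case: str, k: int = 50) -> list:
    # Step 1: lexical recall of the top-k candidates with ElasticSearch
    # ("cases" index and "dialogue" field are hypothetical names).
    hits = es.search(index="cases",
                     query={"match": {"dialogue": target_case}},
                     size=k)["hits"]["hits"]
    # Step 2: re-rank candidates by RoBERTa embedding similarity.
    q = embed(target_case)
    return sorted(hits,
                  key=lambda h: torch.cosine_similarity(
                      q, embed(h["_source"]["dialogue"])).item(),
                  reverse=True)
```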

court debate records: 121016
3.1.1 Court Debate Dataset. For CDD, we collected 121,016 court debate records of private lending dispute civil cases. We take the historical conversation among the judge, the plaintiff, and the defendant as the model input, and the judge's utterance as the model output

pairs of samples: 260190
We take the historical conversation among the judge, the plaintiff, and the defendant as the model input, and the judge's utterance as the model output. These records are divided into 260,190 sample pairs by experts with legal knowledge. 3.1.2 Jing Dong Dialogue Corpus. The Jing Dong Dialogue Corpus (JDDC) contains 1,024,196 multi-turn dialogues, 20,451,337 utterances, and 150 million words

cases: 326603
The average number of tokens per sentence is about 7.4. In the experiments, we adopted the top 326,603 cases. The proposed algorithm and baselines are set to generate the utterances of the customer service agent, with the historical context between the customer service agent and the customer as input

samples: 300
In order to ensure the rationality and correctness of the generated utterances, we also conducted human evaluation. We randomly selected 300 samples from the test set. Then, we recruited five annotators to judge the quality of the generated utterances from two perspectives (Ke et al., 2018; Zhu et al., 2019).
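
The κ values in Table 3 summarize how consistently the five annotators rated these 300 samples. A self-contained sketch of Fleiss' κ follows; the exact κ variant the authors used is not stated, so this is one plausible choice.

```python
import numpy as np

def fleiss_kappa(ratings: np.ndarray) -> float:
    """ratings[i, c] = number of annotators assigning item i to category c.
    Every row must sum to the same number of annotators (5 in this study)."""
    n = ratings.sum(axis=1)[0]                      # annotators per item
    p_cat = ratings.sum(axis=0) / ratings.sum()     # category marginals
    # Observed agreement: fraction of agreeing annotator pairs per item.
    p_obs = ((ratings ** 2).sum(axis=1) - n) / (n * (n - 1))
    p_bar, p_exp = p_obs.mean(), (p_cat ** 2).sum()
    return (p_bar - p_exp) / (1 - p_exp)

# e.g. 300 items rated on a 3-point relevance scale -> ratings shape (300, 3)
```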

similar cases: 3
4.1 Overall Performance. In the experiments, we select up to three similar cases to validate the effectiveness of CCN, i.e., leveraging the most similar case (top-1), the top two similar cases (top-2), and the top three similar cases (top-3). In addition, we also test a variant, CCN (vertical-only), which adopts only vertical copy from the context; this is similar to the setting of the baseline PGN but with the proposed hierarchical dialogue encoders

similar cases: 3
However, in the training process, training becomes slower as the number of similar cases increases. Considering the time cost and memory limitations, only up to the top three similar cases are utilized in this experiment to verify the proposed approach. 4.2 Case Study

References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • Jie Cao, Michael Tanana, Zac Imel, Eric Poitras, David Atkins, and Vivek Srikumar. 2019. Observing dialogue in therapy: Categorizing and forecasting behavioral codes. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5599–5611, Florence, Italy. Association for Computational Linguistics.
  • Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, and Michel Galley. 2018. A knowledge-grounded neural conversation model. In Thirty-Second AAAI Conference on Artificial Intelligence.
  • Hitesh Golchha, Mauajama Firdaus, Asif Ekbal, and Pushpak Bhattacharyya. 2019. Courteously yours: Inducing courteous behavior in customer care responses using reinforced pointer generator network. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 851–860.
  • Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. arXiv preprint arXiv:1603.06393.
  • Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. 2016. Pointing the unknown words. arXiv preprint arXiv:1603.08148.
  • Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Hsin-Yuan Huang, Eunsol Choi, and Wen-tau Yih. 2018. FlowQA: Grasping flow in history for conversational machine comprehension. arXiv preprint arXiv:1810.06683.
  • Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, and Jan Kleindienst. 2016. Text understanding with the attention sum reader network. arXiv preprint arXiv:1603.01547.
  • Meng Chen, Ruixue Liu, Lei Shen, Shaozu Yuan, Jingyan Zhou, Youzheng Wu, Xiaodong He, and Bowen Zhou. 2019. The JDDC corpus: A large-scale multi-turn Chinese dialogue dataset for e-commerce customer service. arXiv preprint arXiv:1911.09969.
  • Wenchao Du and Alan W. Black. 2019. Boosting dialog response generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 38–43, Florence, Italy. Association for Computational Linguistics.
  • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, and Koray Kavukcuoglu. 2016. Neural machine translation in linear time. arXiv preprint arXiv:1610.10099.
  • Pei Ke, Jian Guan, Minlie Huang, and Xiaoyan Zhu. 2018. Generating informative responses with controlled sentence function. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1499–1508.
  • Mihail Eric and Christopher D. Manning. 2017. A copy-augmented sequence-to-sequence architecture gives good performance on task-oriented dialogue. arXiv preprint arXiv:1701.04024.
  • Daniel Fernandez-Gonzalez. 2019. Left-to-right dependency parsing with pointer networks. In Proceedings of NAACL-HLT, pages 710–716.
  • Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. Convolutional sequence to sequence learning. In Proceedings of the 34th International Conference on Machine Learning, pages 1243–1252. JMLR.org.
  • Hung Le, Doyen Sahoo, Nancy Chen, and Steven Hoi. 2019. Multimodal transformer networks for end-to-end video-grounded dialogue systems. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5612–5623, Florence, Italy. Association for Computational Linguistics.
  • Zekang Li, Cheng Niu, Fandong Meng, Yang Feng, Qian Li, and Jie Zhou. 2019. Incremental transformer with deliberation decoder for document grounded conversations. arXiv preprint arXiv:1907.08854.
  • Chin-Yew Lin and Eduard Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics.
  • Linlin Liu, Xiang Lin, Shafiq Joty, Simeng Han, and Lidong Bing. 2019a. Hierarchical pointer net parsing. In Proceedings of EMNLP-IJCNLP, pages 1007–1017, Hong Kong, China. Association for Computational Linguistics.
  • Shuman Liu, Hongshen Chen, Zhaochun Ren, Yang Feng, Qun Liu, and Dawei Yin. 2018. Knowledge diffusion for neural dialogue generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1489–1498.
  • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019b. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  • Junyu Lu, Chenbin Zhang, Zeying Xie, Guang Ling, Tom Chao Zhou, and Zenglin Xu. 2019. Constructing interpretive spatio-temporal features for multi-turn responses selection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 44–50, Florence, Italy. Association for Computational Linguistics.
  • Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher. 2016. Pointer sentinel mixture models. arXiv preprint arXiv:1609.07843.
  • Yishu Miao and Phil Blunsom. 2016. Language as a latent variable: Discrete generative models for sentence compression. arXiv preprint arXiv:1609.07317.
  • Ramesh Nallapati, Bowen Zhou, Caglar Gulcehre, Bing Xiang, et al. 2016. Abstractive text summarization using sequence-to-sequence RNNs and beyond. arXiv preprint arXiv:1602.06023.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318.
  • Prasanna Parthasarathi and Joelle Pineau. 2018. Extending neural generative conversational model using external knowledge sources. arXiv preprint arXiv:1809.05524.
  • Pranav Rajpurkar, Robin Jia, and Percy Liang. 2018. Know what you don't know: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822.
  • Siva Reddy, Danqi Chen, and Christopher D. Manning. 2019. CoQA: A conversational question answering challenge. Transactions of the Association for Computational Linguistics, 7:249–266.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
  • Xiaoyu Shen, Yang Zhao, Hui Su, and Dietrich Klakow. 2019. Improving latent alignment in text summarization by generalizing the pointer generator. In Proceedings of EMNLP-IJCNLP, pages 3760–3771, Hong Kong, China. Association for Computational Linguistics.
  • Hui Su, Xiaoyu Shen, Rongzhi Zhang, Fei Sun, Pengwei Hu, Cheng Niu, and Jie Zhou. 2019. Improving multi-turn dialogue modelling with utterance rewriter. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 22–31, Florence, Italy. Association for Computational Linguistics.
  • Fei Sun, Peng Jiang, Hanxiao Sun, Changhua Pei, Wenwu Ou, and Xiaobo Wang. 2018. Multi-source pointer network for product title summarization. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pages 7–16. ACM.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems.
  • Jianheng Tang, Tiancheng Zhao, Chenyan Xiong, Xiaodan Liang, Eric Xing, and Zhiting Hu. 2019. Target-guided open-domain conversation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5624–5634, Florence, Italy. Association for Computational Linguistics.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
  • Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer networks. In Advances in Neural Information Processing Systems, pages 2692–2700.
  • Shuohang Wang and Jing Jiang. 2017. Machine comprehension using Match-LSTM and answer pointer. In International Conference on Learning Representations.
  • Wenbo Wang, Yang Gao, Heyan Huang, and Yuxiang Zhou. 2019a. Concept pointer network for abstractive summarization. In Proceedings of EMNLP-IJCNLP, pages 3074–3083, Hong Kong, China. Association for Computational Linguistics.
  • Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, and Ming Zhou. 2017. Gated self-matching networks for reading comprehension and question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 189–198.
  • Xuewei Wang, Weiyan Shi, Richard Kim, Yoojung Oh, Sijia Yang, Jingwen Zhang, and Zhou Yu. 2019b. Persuasion for good: Towards a personalized persuasive dialogue system for social good. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5635–5649, Florence, Italy. Association for Computational Linguistics.
  • Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, and Haifeng Wang. 2019. Proactive human-machine conversation with explicit conversation goals. arXiv preprint arXiv:1906.05572.
  • Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, and Tie-Yan Liu. 2017. Deliberation networks: Sequence generation beyond one-pass decoding. In Advances in Neural Information Processing Systems, pages 1784–1794.
  • Hao-Tong Ye, Kai-Lin Lo, Shang-Yu Su, and Yun-Nung Chen. 2020. Knowledge-grounded response generation with deep attentional latent-variable model. Computer Speech & Language, page 101069.
  • Matthew D. Zeiler. 2012. ADADELTA: An adaptive learning rate method. arXiv preprint arXiv:1212.5701.
  • Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, and Ting Liu. 2019. Retrieval-enhanced adversarial training for neural response generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3763–3773.