A Knowledge-Grounded Neural Conversation Model.

THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (2018): 5110-5117

Cited by 348 | Views 214

Abstract

Neural network models are capable of generating extremely natural sounding conversational interactions. However, these models have been mostly applied to casual scenarios (e.g., as "chatbots") and have yet to demonstrate they can serve in more useful conversational applications. This paper presents a novel, fully data-driven, and knowledge-grounded […]

Introduction
  • Recent work has shown that conversational chatbot models can be trained in an end-to-end, completely data-driven fashion, without hand-coding (Ritter, Cherry, and Dolan 2011; Sordoni et al. 2015; Shang, Lu, and Li 2015; Vinyals and Le 2015; Serban et al. 2016, inter alia).
  • Fig. 1 illustrates the difficulty: while an ideal response would directly reflect the entities mentioned in the query, neural models produce responses that, while conversationally appropriate, seldom include factual content.
  • This contrasts with traditional dialog systems, which can readily inject entities and facts into responses, but often at the cost of significant hand-coding.
  • The goal of this work is to benefit from the versatility and scalability of fully data-driven models while simultaneously producing models that are usefully grounded in external knowledge, permitting them to be deployed in, for example, […]
Highlights
  • Recent work has shown that conversational chatbot models can be trained in an end-to-end, completely data-driven fashion, without hand-coding (Ritter, Cherry, and Dolan 2011; Sordoni et al. 2015; Shang, Lu, and Li 2015; Vinyals and Le 2015; Serban et al. 2016, inter alia).
  • The goal of this work is to benefit from the versatility and scalability of fully data-driven models while simultaneously producing models that are usefully grounded in external knowledge, permitting them to be deployed in, for example, […]
  • In order to infuse the response with factual information relevant to the conversational context, we propose the knowledge-grounded model architecture depicted in Fig. 3.
  • Perplexity of the multi-task models is as low as that of the SEQ2SEQ models trained on general and grounded data, respectively.
  • We have presented a novel knowledge-grounded conversation engine that could serve as the core component of a […]
  • Our simple entity matching approach to grounding external information based on conversation context makes for a model that is informative, versatile, and applicable in open-domain systems.
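The entity-matching idea above can be sketched in a few lines: facts are keyed by entity name and retrieved whenever that entity appears in the conversation context. This is a minimal illustration, not the authors' code; the fact store, helper name, and example strings are invented for the demo.

```python
def retrieve_facts(context, fact_store):
    """Return every stored fact whose key entity is mentioned verbatim
    (case-insensitively) in the conversation context."""
    context_lc = context.lower()
    facts = []
    for entity, entity_facts in fact_store.items():
        if entity.lower() in context_lc:
            facts.extend(entity_facts)
    return facts

# Invented toy fact store for illustration.
store = {
    "Kusakabe": ["Kusakabe is a sushi restaurant in San Francisco."],
    "Mt. Fuji": ["Mt. Fuji is the highest mountain in Japan."],
}
print(retrieve_facts("Going to Kusakabe tonight!", store))
```

The retrieved facts are then fed to the facts encoder alongside the conversation history, so grounding requires no slot filling or per-domain schema.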
Methods
  • While not an interesting system in itself, the authors include it to assess the effect of multi-task learning separately from the facts.
Results
  • Automatic Evaluation: The authors computed perplexity and BLEU (Papineni et al. 2002) for each system; these are shown in Tables 1 and 2, respectively.
  • Perplexity of the multi-task models is as low as that of the SEQ2SEQ models trained on general and grounded data, respectively.
  • Table 2 shows that the MTASK-R model yields a significant performance boost, with a 96% increase in BLEU score and a 71% jump in 1-gram diversity compared to the competitive SEQ2SEQ baseline.
  • In terms of BLEU scores, the MTASK-RF improvement is not significant, but it generates the highest 1-gram and 2-gram diversity among all models.
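The 1-gram and 2-gram diversity figures in Table 2 are the distinct-n statistic: the number of unique n-grams across all generated responses divided by the total number of generated tokens. A minimal sketch (naive whitespace tokenization is an assumption; the paper's exact tokenizer is not specified here):

```python
def distinct_n(responses, n):
    """Unique n-grams across all responses divided by total tokens."""
    ngrams, total = set(), 0
    for r in responses:
        toks = r.split()
        total += len(toks)
        ngrams.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(ngrams) / total if total else 0.0

outs = ["i like sushi", "i like ramen"]
print(distinct_n(outs, 1))  # 4 unique unigrams over 6 tokens
print(distinct_n(outs, 2))  # 3 unique bigrams over 6 tokens
```

Higher values indicate less repetitive, more varied output, which is why a large jump in 1-gram diversity accompanies the BLEU gain reported above.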
Conclusion
  • Figure 6 presents examples from the MTASK-RF model and illustrates that the responses are generally both appropriate and informative.
  • The model is a large-scale, scalable, fully data-driven neural conversation model that effectively exploits external knowledge, and does so without explicit slot filling.
  • It generalizes the SEQ2SEQ approach to neural conversation models by naturally combining conversational and non-conversational data through multi-task learning.
  • The authors' simple entity matching approach to grounding external information based on conversation context makes for a model that is informative, versatile, and applicable in open-domain systems.
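The multi-task combination of conversational and non-conversational data can be sketched as one shared model whose gradient updates alternate between a conversation-only task and a fact-grounded task. The `model` object, the batch iterators, and the 50/50 mixing ratio below are illustrative stand-ins, not the paper's actual training code:

```python
import random

def train_multitask(model, conv_batches, fact_batches, steps, p_facts=0.5):
    """One shared seq2seq model: each step draws a batch from one of the
    two tasks, so both objectives shape the same parameters."""
    losses = []
    for _ in range(steps):
        # Pick a task for this step, then take one gradient update on it.
        source = fact_batches if random.random() < p_facts else conv_batches
        losses.append(model.train_step(next(source)))
    return losses
```

In this style of multi-task learning (Caruana 1997; Luong et al. 2016), the grounded task teaches the decoder to use facts while the much larger conversational task keeps responses fluent and contextually appropriate.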
Tables
  • Table 1: Perplexity of different models. SEQ2SEQ-S is a SEQ2SEQ model that is trained on the NOFACTS task with the 1M grounded dataset (without the facts).
  • Table 2: BLEU-4 and lexical diversity.
  • Table 3: Mean differences in judgments in human evaluation, together with 95% confidence intervals. Differences sum to 1.0. Boldface items are significantly better (p < 0.05) than their comparator. (*): Main system, pre-selected on the basis of BLEU.
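The perplexity in Table 1 is the standard language-model metric: the exponential of the average per-token negative log-likelihood under the model. As a quick reference:

```python
import math

def perplexity(token_nlls):
    """exp of the mean negative log-likelihood over all evaluated tokens."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model assigning every token probability 1/10 has perplexity ~10.
print(perplexity([math.log(10)] * 5))
```

Lower perplexity means the model assigns higher probability to the held-out responses, which is how the table compares the multi-task models against the SEQ2SEQ baselines.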
Human Evaluation
  • Our primary system, MTASK-R, which performed best on BLEU, significantly outperforms the SEQ2SEQ baseline on Informativeness (p = 0.003) and shows a small but not statistically significant gain with respect to Appropriateness.
  • MTASK-F performed significantly better than the baseline (p = 0.005) on Informativeness, but was significantly worse on Appropriateness.
References
  • Ameixa, D.; Coheur, L.; Fialho, P.; and Quaresma, P. 2014.
  • Banchs, R. E., and Li, H. 2012. IRIS: a chat-oriented dialogue system based on the vector space model. ACL.
  • Bordes, A., and Weston, J. 2017. Learning end-to-end goal-oriented dialog. ICLR.
  • Caruana, R. 1997. Multitask learning. Machine Learning 28(1):41–75.
  • Chung, J.; Gulcehre, C.; Cho, K.; and Bengio, Y. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR abs/1412.3555.
  • Delbrouck, J.-B.; Dupont, S.; and Seddati, O. 2017. Visually grounded word embeddings and richer visual features for improving multimodal neural machine translation. Grounding Language Understanding workshop.
  • Dong, D.; Wu, H.; He, W.; Yu, D.; and Wang, H. 2015. Multi-task learning for multiple language translation. ACL.
  • Galley, M.; Brockett, C.; Sordoni, A.; Ji, Y.; Auli, M.; Quirk, C.; Mitchell, M.; Gao, J.; and Dolan, B. 2015. deltaBLEU: A discriminative metric for generation tasks with intrinsically diverse targets. ACL-IJCNLP.
  • Graham, Y.; Baldwin, T.; and Mathur, N. 2015. Accurate evaluation of segment-level machine translation metrics. NAACL.
  • He, H.; Balakrishnan, A.; Eric, M.; and Liang, P. 2017. Learning symmetric collaborative dialogue agents with dynamic knowledge graph embeddings. ACL.
  • Hoang, C. D. V.; Cohn, T.; and Haffari, G. 2016. Incorporating side information into recurrent neural network language models. NAACL-HLT.
  • Hochreiter, S., and Schmidhuber, J. 1997. Long short-term memory. Neural Computation 9(8):1735–1780.
  • Huang, P.-Y.; Liu, F.; Shiang, S.-R.; Oh, J.; and Dyer, C. 2016. Attention-based multimodal neural machine translation. WMT.
  • Li, J.; Galley, M.; Brockett, C.; Gao, J.; and Dolan, B. 2016a. A diversity-promoting objective function for neural conversation models. NAACL-HLT.
  • Li, J.; Galley, M.; Brockett, C.; Gao, J.; and Dolan, B. 2016b. A persona-based neural conversation model. ACL.
  • Liu, F., and Perez, J. 2017. Dialog state tracking, a machine reading approach using memory network. EACL.
  • Liu, X.; Gao, J.; He, X.; Deng, L.; Duh, K.; and Wang, Y.-Y. 2015. Representation learning using multi-task deep neural networks for semantic classification and information retrieval. NAACL-HLT.
  • Liu, C.-W.; Lowe, R.; Serban, I.; Noseworthy, M.; Charlin, L.; and Pineau, J. 2016. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. EMNLP.
  • Luong, M.-T.; Le, Q. V.; Sutskever, I.; Vinyals, O.; and Kaiser, L. 2016. Multi-task sequence to sequence learning. ICLR.
  • Nio, L.; Sakti, S.; Neubig, G.; Toda, T.; Adriani, M.; and Nakamura, S. 2014. Developing non-goal dialog system based on examples of drama television. In Natural Interaction with Robots, Knowbots and Smartphones. Springer.
  • Och, F. J. 2003. Minimum error rate training in statistical machine translation. ACL.
  • Oh, A. H., and Rudnicky, A. I. 2000. Stochastic language generation for spoken dialogue systems. ANLP/NAACL Workshop on Conversational Systems.
  • Papineni, K.; Roukos, S.; Ward, T.; and Zhu, W.-J. 2002. BLEU: a method for automatic evaluation of machine translation. ACL.
  • Przybocki, M.; Peterson, K.; and Bronsart, S. 2008. Official results of the NIST 2008 metrics for machine translation challenge. MetricsMATR08 workshop.
  • Ratnaparkhi, A. 2002. Trainable approaches to surface natural language generation and their application to conversational dialog systems. Computer Speech & Language 16(3):435–455.
  • Ritter, A.; Cherry, C.; and Dolan, W. B. 2011. Data-driven response generation in social media. EMNLP.
  • Serban, I. V.; Lowe, R.; Charlin, L.; and Pineau, J. 2015. A survey of available corpora for building data-driven dialogue systems. CoRR abs/1512.05742.
  • Serban, I. V.; Sordoni, A.; Bengio, Y.; Courville, A.; and Pineau, J. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. AAAI.
  • Shang, L.; Lu, Z.; and Li, H. 2015. Neural responding machine for short-text conversation. ACL-IJCNLP.
  • Sordoni, A.; Galley, M.; Auli, M.; Brockett, C.; Ji, Y.; Mitchell, M.; Nie, J.-Y.; Gao, J.; and Dolan, B. 2015. A neural network approach to context-sensitive generation of conversational responses. NAACL-HLT.
  • Sukhbaatar, S.; Weston, J.; Fergus, R.; et al. 2015. End-to-end memory networks. NIPS.
  • Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to sequence learning with neural networks. NIPS.
  • Vinyals, O., and Le, Q. 2015. A neural conversational model. ICML.
  • Wen, T.-H.; Gasic, M.; Mrksic, N.; Su, P.-H.; Vandyke, D.; and Young, S. 2015. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. EMNLP.
  • Wen, T.-H.; Miao, Y.; Blunsom, P.; and Young, S. 2017. Latent intention dialogue models. ICML.
  • Weston, J.; Bordes, A.; Chopra, S.; Rush, A. M.; van Merrienboer, B.; Joulin, A.; and Mikolov, T. 2016. Towards AI-complete question answering: A set of prerequisite toy tasks. ICLR.
  • Weston, J.; Chopra, S.; and Bordes, A. 2015. Memory networks. ICLR.
  • Zhao, T.; Lu, A.; Lee, K.; and Eskenazi, M. 2017. Generative encoder-decoder models for task-oriented spoken dialog systems with chatting capability. ACL.