Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation

ACL, pp. 650–659 (2020)


Abstract

Neural conversation models are known to generate appropriate but non-informative responses in general. A scenario where informativeness can be significantly enhanced is Conversing by Reading (CbR), where conversations take place with respect to a given external document. In previous work, the external document is utilized by (1) creating...

Introduction
  • Neural conversation models have achieved promising performance in response generation.
  • It is widely observed that the generated responses lack sufficient content and information (Li et al., 2016a).
  • One way to address this issue is to integrate various external information into conversation models.
  • Examples of external information include document topics (Xing et al., 2017), commonsense knowledge graphs (Zhou et al., 2018), and domain-specific knowledge bases (Yang et al., 2019).
Highlights
  • Neural conversation models have achieved promising performance in response generation
  • To ingest useful knowledge for response generation, we argue that document processing should consider the interaction among the context, the document, and the target response
  • The teacher model learns a response-aware document memory M used in our base conversation model
  • We construct a response-aware weight matrix G ∈ R^{|D|×|D|}, which captures the correlation between context-aware document representations and response representations, and impose G on the memory matrix M (see the sketch after this list)
  • Focusing on the Conversing by Reading task, we propose a novel response-anticipated document memory to exploit and memorize the document information that is important in response generation
  • We verify our model with both automatic and human evaluations, and experimental results show that our model obtains state-of-the-art performance on the Conversing by Reading task
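
As a rough illustration of the weight matrix mentioned above, the sketch below (PyTorch) builds a pairwise matrix G from per-token relevance to the response and imposes it on a document memory element-wise. The cosine scoring, the max-pooling over response tokens, and the outer-product construction are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def response_aware_weights(doc_repr, resp_repr):
    # doc_repr:  (n, h) context-aware document token representations
    # resp_repr: (m, h) response token representations (teacher side only)
    # Cosine similarity between every document token and every response token.
    sim = F.cosine_similarity(doc_repr.unsqueeze(1),
                              resp_repr.unsqueeze(0), dim=-1)    # (n, m)
    token_score = sim.max(dim=1).values     # relevance of each document token
    # Pairwise importance as an outer product of per-token relevance scores.
    return token_score.unsqueeze(1) * token_score.unsqueeze(0)   # (n, n)

def impose_on_memory(memory, g):
    # Element-wise reweighting of an (n, n) document memory matrix by G.
    return g * memory
```

Any construction of this form only reweights what the base memory already stores; the point of the highlight is that the weights are informed by the response rather than by the document alone.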
Methods
  • The authors first give an overall description of the proposed teacher-student architecture for CbR and briefly describe the base model.
  • The teacher model learns a response-aware document memory M used in the base conversation model.
  • The student model learns to construct a response-anticipated weight matrix that estimates the G used in the teacher model, but without access to the response.
  • The student is a feed-forward neural network with the document and context as its input (a minimal sketch follows this list)
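
Below is a minimal sketch of such a student, assuming a pairwise MLP scorer over document-token pairs concatenated with a pooled context vector, trained to regress the teacher's G with an MSE loss. The layer sizes, the pairwise concatenation, and the regression objective are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentG(nn.Module):
    """Hypothetical student: predicts a pairwise weight matrix from the
    document and context alone, with no access to the response."""

    def __init__(self, h=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * h, h),
            nn.ReLU(),
            nn.Linear(h, 1),
        )

    def forward(self, doc_repr, ctx_vec):
        # doc_repr: (n, h) document token representations
        # ctx_vec:  (h,)   pooled context representation
        n, h = doc_repr.shape
        ti = doc_repr.unsqueeze(1).expand(n, n, h)   # token i of each pair
        tj = doc_repr.unsqueeze(0).expand(n, n, h)   # token j of each pair
        cx = ctx_vec.view(1, 1, h).expand(n, n, h)   # shared context vector
        return self.mlp(torch.cat([ti, tj, cx], dim=-1)).squeeze(-1)  # (n, n)

# Illustrative training step: regress the teacher's response-aware G.
student = StudentG(h=256)
doc_repr, ctx_vec = torch.randn(10, 256), torch.randn(256)
g_teacher = torch.rand(10, 10)                 # stand-in for the teacher's G
g_hat = student(doc_repr, ctx_vec)
loss = F.mse_loss(g_hat, g_teacher.detach())   # distillation-style objective
loss.backward()
```

At inference time only the student is needed, so the model can anticipate which document content a response would use before any response exists.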
Results
  • The authors first show the performance of all methods in Sec 5.1.
  • The authors validate the effectiveness of response anticipation on CbR in Sec 5.2 by comparing the document tokens most similar to the response, as measured by their representations in the memory (sketched after this list).
  • The authors compare more variants of the model in Sec 5.3, including token importance versus pairwise importance, and each method with continuous weights versus its variant with binary weights.
  • The authors conduct a case study in Sec 5.4.
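
The Sec 5.2 check can be pictured as simple top-k retrieval: rank document tokens by the similarity between their memory representations and a representation of the gold response. In the sketch below, the cosine metric and the pooled response vector are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def top_similar_tokens(memory_repr, resp_vec, doc_tokens, k=20):
    # memory_repr: (n, h) per-token representations read from the memory
    # resp_vec:    (h,)   pooled representation of the gold response
    sims = F.cosine_similarity(memory_repr, resp_vec.unsqueeze(0), dim=-1)
    top = sims.topk(min(k, len(doc_tokens)))
    return [(doc_tokens[i], round(sims[i].item(), 3))
            for i in top.indices.tolist()]
```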
Conclusion
  • Focusing on the CbR task, the authors propose a novel response-anticipated document memory to exploit and memorize the document information that is important in response generation.
  • The authors construct the response-anticipated memory by a teacher-student framework.
  • The teacher accesses the response and learns a response-aware weight matrix; the student learns to estimate the weight matrix in the teacher model and construct the response-anticipated document memory.
  • The authors verify the model with both automatic and human evaluations, and experimental results show that the model obtains state-of-the-art performance on the CbR task.
Tables
  • Table 1: Automatic evaluation results for all competing methods. Len denotes the length of the generated responses
  • Table 2: Performance comparison of our model variants. Lines 1–2: our models trained with the full teacher-student framework. Lines 3–4: our models trained with the teacher model only. Lines 5–6: our models with binary weight matrices. Bold values are the best results among the first four lines; underlines mark the best among the first two and the last two lines
  • Table 3: Human annotation results
  • Table 4: Similarity between important document tokens picked by gold responses and the accumulated attention weights in the models
Funding
  • Research on this paper was supported by the Hong Kong Research Grants Council under grants 16202118 and 16212516, and by the Tencent AI Lab Rhino-Bird Focused Research Program (No. GF202035)
References
  • Shubham Agarwal, Ondrej Dusek, Ioannis Konstas, and Verena Rieser. 2018. A knowledge-grounded multimodal search-based conversational agent. In EMNLP, pages 59–66.
  • Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In ACL Workshop, pages 65–72.
  • Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, and Jason Weston. 2018. Wizard of Wikipedia: Knowledge-powered conversational agents. In ICLR.
  • George Doddington. 2002. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In HLT, pages 138–145.
  • Sergey Edunov, Myle Ott, Michael Auli, and David Grangier. 2018. Understanding back-translation at scale. In EMNLP, pages 489–500.
  • Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, and Michel Galley. 2018. A knowledge-grounded neural conversation model. In AAAI, pages 5110–5117.
  • Eric Jang, Shixiang Gu, and Ben Poole. 2016. Categorical reparameterization with Gumbel-Softmax. In ICLR.
  • Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016a. A diversity-promoting objective function for neural conversation models. In NAACL, pages 110–119.
  • Jiwei Li, Will Monroe, Alan Ritter, Dan Jurafsky, Michel Galley, and Jianfeng Gao. 2016b. Deep reinforcement learning for dialogue generation. In EMNLP, pages 1192–1202.
  • Chia-Wei Liu, Ryan Lowe, Iulian Vlad Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In EMNLP, pages 2122–2132.
  • Xiaodong Liu, Yelong Shen, Kevin Duh, and Jianfeng Gao. 2018. Stochastic answer networks for machine reading comprehension. In ACL, pages 1694–1704.
  • Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, and Maarten de Rijke. 2020. RefNet: A reference-aware network for background based conversation. In AAAI.
  • Nikita Moghe, Siddhartha Arora, Suman Banerjee, and Mitesh M. Khapra. 2018. Towards exploiting background knowledge for building conversation systems. In EMNLP, pages 2322–2332.
  • Lili Mou, Yiping Song, Rui Yan, Ge Li, Lu Zhang, and Zhi Jin. 2016. Sequence to backward and forward sequences: A content-introducing approach to generative short-text conversation. In COLING, pages 3349–3358.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In ACL, pages 311–318.
  • Prasanna Parthasarathi and Joelle Pineau. 2018. Extending neural generative conversational model using external knowledge sources. In EMNLP, pages 690–695.
  • Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In EMNLP, pages 1532–1543.
  • Lianhui Qin, Michel Galley, Chris Brockett, Xiaodong Liu, Xiang Gao, Bill Dolan, Yejin Choi, and Jianfeng Gao. 2019. Conversing by reading: Contentful neural conversation with on-demand machine reading. In ACL, pages 5427–5436.
  • Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, and Maarten de Rijke. 2020. Thinking globally, acting locally: Distantly supervised global-to-local knowledge selection for background based conversation. In AAAI.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In ACL, pages 1073–1083.
  • Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. 2017. Bidirectional attention flow for machine comprehension. In ICLR.
  • Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In AAAI, pages 3776–3783.
  • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, and Yoshua Bengio. 2017. A hierarchical latent variable encoder-decoder model for generating dialogues. In AAAI, pages 3295–3301.
  • Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural responding machine for short-text conversation. In ACL, pages 1577–1586.
  • Yiping Song, Cheng-Te Li, Jian-Yun Nie, Ming Zhang, Dongyan Zhao, and Rui Yan. 2018. An ensemble of retrieval-based and generation-based human-computer conversation systems. In IJCAI, pages 4382–4388.
  • Yiping Song, Zhiliang Tian, Dongyan Zhao, Ming Zhang, and Rui Yan. 2017. Diversifying neural conversation model with maximal marginal relevance. In IJCNLP, pages 169–174.
  • Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. 2015. End-to-end memory networks. In NIPS, pages 2440–2448.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112.
  • Zhiliang Tian, Wei Bi, Xiaopeng Li, and Nevin L. Zhang. 2019. Learning to abstract for memory-augmented conversational response generation. In ACL, pages 3816–3825.
  • Zhiliang Tian, Rui Yan, Lili Mou, Yiping Song, Yansong Feng, and Dongyan Zhao. 2017. How to make context more useful? An empirical study on context-aware neural conversational models. In ACL, pages 231–236.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NIPS, pages 5998–6008.
  • Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2018. Graph attention networks. In ICLR.
  • Ashwin K. Vijayakumar, Michael Cogswell, Ramprasath R. Selvaraju, Qing Sun, Stefan Lee, David Crandall, and Dhruv Batra. 2018. Diverse beam search: Decoding diverse solutions from neural sequence models. In AAAI.
  • Wenquan Wu, Zhen Guo, Xiangyang Zhou, Hua Wu, Xiyuan Zhang, Rongzhong Lian, and Haifeng Wang. 2019a. Proactive human-machine conversation with explicit conversation goal. In ACL, pages 3794–3804.
  • Yu Wu, Furu Wei, Shaohan Huang, Yunli Wang, Zhoujun Li, and Ming Zhou. 2019b. Response generation by context-aware prototype editing. In AAAI, pages 7281–7288.
  • Chen Xing, Wei Wu, Yu Wu, Jie Liu, Yalou Huang, Ming Zhou, and Wei-Ying Ma. 2017. Topic aware neural response generation. In AAAI.
  • Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, and Jingjing Liu. 2019. A hybrid retrieval-generation neural conversation model. In CIKM, pages 1341–1350.
  • Tiancheng Zhao, Ran Zhao, and Maxine Eskenazi. 2017. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders. In ACL, pages 654–664.
  • Hao Zhou, Tom Young, Minlie Huang, Haizhou Zhao, Jingfang Xu, and Xiaoyan Zhu. 2018. Commonsense knowledge aware conversation generation with graph attention. In IJCAI, pages 4623–4629.
Authors
Tian Zhiliang
Bi Wei
Lee Dongkyu
Xue Lanqing