Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations

INTERSPEECH 2019, pp. 1891–1895

Cited by 13 | 694 views | EI indexed
Abstract

Building socialbots that can have deep, engaging open-domain conversations with humans is one of the grand challenges of artificial intelligence (AI). To this end, bots need to be able to leverage world knowledge spanning several domains effectively when conversing with humans who have their own world knowledge. Existing knowledge-grounded…

Introduction
  • Building conversational bots that can interact with humans in natural language has been of interest to researchers since the early days of computing, as exemplified by text-based systems such as ELIZA [1].
  • Task-oriented bots aim to help humans accomplish a specific task through multi-turn interactions, whereas open-domain bots aim to serve as social conversation partners with whom humans can have natural and engaging conversations.
  • In addition to mastering traditional language skills like comprehension, open-domain bots need to perfect several conversational skills that come naturally to humans: recalling from world knowledge, reasoning in conjunction with conversational history and constructing valid responses.
  • The authors introduce Topical-Chat, a dataset of ∼11K human-human conversations about knowledge spanning 8 broad topics.
  • Partners do not have explicitly defined roles they need to serve during a conversation.
Highlights
  • Building conversational bots that can interact with humans in natural language has been of interest to researchers since the early days of computing, as exemplified by text-based systems such as ELIZA [1]
  • Task-oriented bots aim to help humans accomplish a specific task through multi-turn interactions, whereas open-domain bots aim to serve as social conversation partners with whom humans can have natural and engaging conversations
  • We demonstrate the ability of our models to have engaging conversations grounded in knowledge through automated and human evaluation
  • In order to decide on an appropriate WH , we tried training a Transformer that uses knowledge with varying WH and evaluated them on automated metrics described below (Table 5)
  • We introduce Topical-Chat, an open-domain knowledge-grounded conversation dataset without explicit roles for conversation partners and containing depth and breadth of topical coverage with transitions in conversations
  • We provide evidence of qualitative value through human evaluation of these models
Methods
  • All models were trained using ParlAI [16].
  • The authors randomly initialized 300-dimensional word embeddings, which are learned during training.
  • The authors do not learn positional embeddings and encode position using one-hot vectors.
  • The authors use a batch size of 32, stochastic gradient descent for optimization with a gradient clip of 0.1 and learning rate scheduler decay 0.5 with patience 3.
  • The authors stop training when perplexity on the validation frequent set does not decrease for 10 epochs.
  • The authors use beam search with a beam size of 5 for decoding
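The decoding step above can be sketched generically. Below is a minimal beam-search decoder with the beam size the paper reports (5); it is an illustrative sketch, not the authors' ParlAI implementation, and `step_fn` is a hypothetical stand-in for the model's next-token log-probability function.

```python
def beam_search(step_fn, eos, max_len, beam_size=5):
    """Generic beam search. `step_fn(prefix)` returns {token: log_prob}
    for the next position given a list of token ids."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:  # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            for tok, logp in step_fn(seq).items():
                candidates.append((seq + [tok], score + logp))
        # keep the beam_size highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(s and s[-1] == eos for s, _ in beams):
            break
    return beams[0][0]
```

Because several partial hypotheses are kept per step, beam search can recover a globally higher-probability response than greedy decoding when the best first token is not the locally most likely one.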
Results
  • Table 2 reports conversation statistics, including the number of utterances and the average number of turns per conversation.
  • In order to decide on an appropriate WH , the authors tried training a Transformer that uses knowledge with varying WH and evaluated them on automated metrics described below (Table 5).
  • The authors observe that WH = 32 works best.
  • The authors believe this reflects the knowledge model’s inability to attend to important tokens in the dialog context when a large WH is used.
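The effect of WH described above can be illustrated with a small sketch, assuming WH acts as a cap on the number of most-recent dialog-history tokens kept before encoding; the function name and turn format are illustrative, not from the paper.

```python
def truncate_history(turns, w_h=32):
    """Flatten whitespace-tokenized dialog turns and keep only the
    most recent w_h tokens of context (WH = 32 worked best above)."""
    tokens = [tok for turn in turns for tok in turn.split()]
    return tokens[-w_h:]
```

A larger `w_h` gives the model more context to attend over, which the authors observe can dilute attention over the tokens that actually matter.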
Conclusion
  • The authors introduce Topical-Chat, an open-domain knowledge-grounded conversation dataset without explicit roles for conversation partners and containing depth and breadth of topical coverage with transitions in conversations.
  • The authors train simple Transformer-based models for response generation and evaluate them using automated metrics for benchmarking.
  • The authors provide evidence of qualitative value through human evaluation of these models.
  • The authors hope that the release of Topical-Chat fosters data-driven research in open-domain knowledge-grounded conversational AI
Tables
  • Table1: Topics and their entity budgets
  • Table2: Topical-Chat conversation stats
  • Table3: Automated metrics on test set (Frequent/Rare)
  • Table4: Human evaluation metrics for 150 test freq. snippets
  • Table5: Effect of varying WH for TF (w/ k.) on test freq
Related Work
  • Recent research interest in knowledge-grounded conversations has led to the public release of multiple datasets. [6] released a dataset of ∼4K conversations where Wikipedia articles about 30 movies served as the knowledge base. The collection was performed with portions of the articles shown to conversation partners in a scheduled way. [7] released a similar dataset of conversations about movies, where the knowledge base comprises Wikipedia articles, reviews and comments mined from the web about ∼1K movies. The collection involved self-dialogues, where one crowdworker generates utterances for both sides. More recently, the Wizard of Wikipedia (WoW) dataset [5] was released, where the focus, similar to ours, is on collecting open-domain knowledge-grounded conversations. A key difference is their knowledge base comprises Wikipedia articles, whereas we relied on multiple data sources, specifically Washington Post articles and Reddit fun facts in addition to Wikipedia articles about entities, to enable lively interactions.

    Sequence-to-sequence generative modeling approaches have become popular for response generation, where the goal is to generate a response given the previous turn in a conversation [2, 3]. However, responses generated by these sequence-to-sequence models are not always coherent or contextually appropriate and are noted to be often generic and lacking interesting content [2]. Such approaches don't explicitly ground responses on relevant knowledge. This has led to work on approaches that include world knowledge into conversational response generation. [8] used end-to-end memory networks to condition the generated responses on knowledge, where attention over the knowledge relevant to the conversation context is estimated, and multiple knowledge representations are included as input during response decoding. [9] retrieves relevant knowledge graphs given the conversation context and encodes the graphs with a static graph attention mechanism. The decoder attentively reads the retrieved knowledge graphs and the knowledge triples within each graph. More recently, [5] use a Transformer Memory Network to encode knowledge sentences and conversation context and decode a response.
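The shared idea in the approaches above is attending over encoded knowledge given the conversation context. The following is a generic dot-product-attention sketch of that step, with toy list-based vectors standing in for any particular paper's encoder outputs; it is illustrative, not any cited system's actual mechanism.

```python
import math

def attend(context, knowledge_vecs):
    """Softmax attention of a context vector over knowledge vectors.
    Returns (attention weights, attention-weighted knowledge vector)."""
    # dot-product relevance score of the context against each knowledge vector
    scores = [sum(c * k for c, k in zip(context, kv)) for kv in knowledge_vecs]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # weighted sum pools the knowledge into one vector for the decoder
    dim = len(knowledge_vecs[0])
    pooled = [sum(w * kv[i] for w, kv in zip(weights, knowledge_vecs))
              for i in range(dim)]
    return weights, pooled
```

The pooled vector (or the per-sentence weights themselves) can then condition response decoding, which is the role knowledge attention plays in [5], [8] and [9].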
Contributions
  • Introduces Topical-Chat, a knowledge-grounded human-human conversation dataset where the underlying knowledge spans 8 broad topics and conversation partners don't have explicitly defined roles, to help further research in open-domain conversational AI
  • Introduces Topical-Chat, a dataset of ∼11K human-human conversations about knowledge spanning 8 broad topics
  • Demonstrates the ability of our models to have engaging conversations grounded in knowledge through automated and human evaluation
  • Considered the frequency distribution of the 8 topics across all user utterances to allocate an entity budget Bi for each topic i
Study Subjects and Analysis
articles: 3088
Article Selection: We fetched Washington Post articles from 2018 that each referenced 3 or more of the 300 entities and contained 600-1000 words. We removed articles with profane language and then considered the topic-entity budgets to finalize 3088 articles, ensuring adequate coverage for all topics.
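The article-selection criteria above can be sketched as a simple predicate. This is a hypothetical illustration (the function name, substring-based entity matching, and whitespace word counting are assumptions, not the authors' pipeline):

```python
def select_article(text, year, entities):
    """Keep a 2018 article iff it mentions 3+ tracked entities
    and runs 600-1000 words (profanity filtering not shown)."""
    words = text.split()
    mentioned = sum(1 for e in entities if e in text)
    return year == 2018 and mentioned >= 3 and 600 <= len(words) <= 1000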

humans: 2
The partial conversation corresponding to each snippet came from a distinct conversation in the Topical-Chat test frequent set. For each rc in each snippet, we asked two humans to separately annotate [20, 21] (possible values in parentheses) whether rc is comprehensible (0/1), on-topic (0/1) and interesting (0/1). We also asked them to annotate how effectively k is utilized in rc (0-3) and if they would have liked to continue the conversation after rc (0/1)
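With two annotators per item, agreement can be summarized with a kappa statistic (the κ values in Table 4). A minimal sketch, assuming Cohen's kappa as the agreement measure (the paper reports κ but this exact formula is an assumption) and hypothetical label lists:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' parallel label lists:
    observed agreement corrected for chance agreement."""
    assert len(a) == len(b) and a
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n           # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[l] / n) * (cb[l] / n)                   # chance agreement
              for l in set(a) | set(b))
    return (p_o - p_e) / (1 - p_e)
```

Kappa of 0 means agreement no better than chance; 1 means perfect agreement, so the reported κ = 0.62-0.83 indicates substantial annotator consistency.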

Automated metrics (Table 3 excerpt, test set, Frequent/Rare):

Model            F1          Div. (n=1)   Div. (n=2)
TF               0.16/0.16   0.85/0.84    0.86/0.86
TF (w/ p.t.)     0.16/0.15   0.86/0.85    0.86/0.85
TF (w/ k.)       0.22/0.20   0.84/0.80    0.83/0.81
TF (w/ k. p.t.)  0.22/0.19   0.85/0.82    0.84/0.82

Human evaluation (Table 4 excerpt):

Metric            Human   TF     TF (w/ p.t.)   TF (w/ k.)   TF (w/ k. p.t.)
comp. (κ = 0.83)  0.99    0.87   0.88           0.78         0.71
o.t.  (κ = 0.67)  0.93    0.60   0.62           0.69         0.66
l.k.  (κ = 0.62)  1.92    0.08   0.12           0.63         0.80

References
  • [1] J. Weizenbaum, "ELIZA—a computer program for the study of natural language communication between man and machine," Communications of the ACM, vol. 9, no. 1, pp. 36–45, 1966.
  • [2] O. Vinyals and Q. Le, "A neural conversational model," arXiv preprint arXiv:1506.05869, 2015.
  • [3] A. Ritter, C. Cherry, and B. Dolan, "Unsupervised modeling of Twitter conversations," in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010, pp. 172–180.
  • [4] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems, 2017, pp. 5998–6008.
  • [5] E. Dinan, S. Roller, K. Shuster, A. Fan, M. Auli, and J. Weston, "Wizard of Wikipedia: Knowledge-powered conversational agents," arXiv preprint arXiv:1811.01241, 2018.
  • [6] K. Zhou, S. Prabhumoye, and A. W. Black, "A dataset for document grounded conversations," arXiv preprint arXiv:1809.07358, 2018.
  • [7] N. Moghe, S. Arora, S. Banerjee, and M. M. Khapra, "Towards exploiting background knowledge for building conversation systems," 2018.
  • [8] M. Ghazvininejad, C. Brockett, M.-W. Chang, B. Dolan, J. Gao, W.-t. Yih, and M. Galley, "A knowledge-grounded neural conversation model," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • [9] H. Zhou, T. Young, M. Huang, H. Zhao, J. Xu, and X. Zhu, "Commonsense knowledge aware conversation generation with graph attention," in IJCAI, 2018, pp. 4623–4629.
  • [10] J. E. Weston, "Dialog-based language learning," in Advances in Neural Information Processing Systems, 2016, pp. 829–837.
  • [11] M. Lewis, D. Yarats, Y. N. Dauphin, D. Parikh, and D. Batra, "Deal or no deal? End-to-end learning for negotiation dialogues," arXiv preprint arXiv:1706.05125, 2017.
  • [12] S. Zhang, E. Dinan, J. Urbanek, A. Szlam, D. Kiela, and J. Weston, "Personalizing dialogue agents: I have a dog, do you have pets too?" in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 2204–2213.
  • [13] C. Khatri, B. Hedayatnia, A. Venkatesh, J. Nunn, Y. Pan, Q. Liu, H. Song, A. Gottardi, S. Kwatra, S. Pancholi, M. Cheng, Q. Chen, L. Stubel, K. Gopalakrishnan, K. Bland, R. Gabriel, A. Mandal, D. Hakkani-Tur, G. Hwang, N. Michel, E. King, and R. Prasad, "Advancing the state of the art in open domain dialog systems through the Alexa Prize," in Alexa Prize Proceedings (https://developer.amazon.com/alexaprize/challenges/pastchallenges/2018/), 2018.
  • [14] Reddit, "r/todayilearned," https://www.reddit.com/r/todayilearned/.
  • [15] R. Mihalcea and P. Tarau, "TextRank: Bringing order into text," in Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004.
  • [16] A. H. Miller, W. Feng, A. Fisch, J. Lu, D. Batra, A. Bordes, D. Parikh, and J. Weston, "ParlAI: A dialog research software platform," arXiv preprint arXiv:1705.06476, 2017.
  • [17] BookCorpus, https://github.com/soskek/bookcorpus/.
  • [18] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," OpenAI technical report, 2018.
  • [19] R. Sennrich, B. Haddow, and A. Birch, "Neural machine translation of rare words with subword units," arXiv preprint arXiv:1508.07909, 2015.
  • [20] A. Venkatesh, C. Khatri, A. Ram, F. Guo, R. Gabriel, A. Nagar, R. Prasad, M. Cheng, B. Hedayatnia, A. Metallinou, R. Goel, S. Yang, and A. Raju, "On evaluating and comparing open domain dialog systems," 2018.
  • [21] A. See, S. Roller, D. Kiela, and J. Weston, "What makes a good conversation? How controllable attributes affect human judgments," 2019.
Authors
Karthik Gopalakrishnan
Behnam Hedayatnia
Qinlang Chen
Anna Gottardi
Sanjeev Kwatra
Anu Venkatesh
Raefer Gabriel