This paper has presented a novel neural network-based framework for task-oriented dialogue systems.
A Network-based End-to-End Trainable Task-oriented Dialogue System.
EACL, pp.438-449, (2017)
Teaching machines to accomplish tasks by conversing naturally with humans is challenging. Currently, developing task-oriented dialogue systems requires creating multiple components and typically this involves either a large amount of handcrafting, or acquiring costly labelled datasets to solve a statistical learning problem for each compon...
- Building a task-oriented dialogue system such as a hotel booking or a technical support service is difficult because it is application-specific and there is usually limited availability of training data.
- At the other end of the spectrum, sequence-to-sequence learning (Sutskever et al, 2014) has inspired several efforts to build end-to-end trainable, non-task-oriented conversational systems (Vinyals and Le, 2015; Shang et al, 2015; Serban et al, 2015b).
- This family of approaches treats dialogue as a source-to-target sequence transduction problem, applying an encoder network (Cho et al, 2014) to encode a user query into a distributed vector representing its semantics, which conditions a decoder network to generate each system response.
- They allow the creation of effective chatbot-type systems, but they lack any capability for supporting domain-specific tasks, for example, being able to interact with databases (Sukhbaatar et al, 2015; Yin et al, 2015) and aggregate useful information into their responses.
- We propose a neural network-based model for task-oriented dialogue systems by balancing the strengths and the weaknesses of the two research communities: the model is end-to-end trainable but still modularly connected; it does not directly model the user goal, but it still learns to accomplish the required task by providing relevant and appropriate responses at each turn; it has an explicit representation of database (DB) attributes which it uses to achieve a high task success rate, but has a distributed representation of user intent.
- Although this work focuses on text-based dialogue systems, we retain belief tracking at the core of our system because: (1) it enables a sequence of free-form natural language sentences to be mapped into a fixed set of slot-value pairs, which can be used to query a DB.
- This paper has presented a novel neural network-based framework for task-oriented dialogue systems.
- We demonstrated that the pipe-lined parallel organisation of this collection framework enables good-quality task-oriented dialogue data to be collected quickly at modest cost.
- The experimental assessment of the NN dialogue system showed that the learned model can interact efficiently and naturally with human subjects to complete an application-specific task.
- To the best of our knowledge, this is the first end-to-end NN-based model that can conduct meaningful dialogues in a task-oriented application.
- Table 1: Tracker performance in terms of Precision, Recall, and F-1 score.
- Table 2: Performance comparison of different model architectures based on a corpus-based evaluation.
- Table 3: Human assessment of the NN system. The ratings for comprehension and naturalness are both out of 5.
- Table 4: A comparison of the NN system with a rule-based modular system (HDC).
- Table 5: Additional Rt term for delexicalised tokens when using weighted decoding (Equation 14). "Not observed" means the corresponding tracker assigns its highest probability to either the not mentioned or the dontcare value, while "observed" means the highest probability is on one of the categorical values. A positive score encourages the generation of that token, while a negative score discourages it.
- Table 6: Some samples of real conversational logs between online judges and the end-to-end system.
- Tsung-Hsien Wen and David Vandyke are supported by Toshiba Research Europe Ltd, Cambridge
There are three informable slots (food, pricerange, area) that users can use to constrain the search and six requestable slots (address, phone, postcode plus the three informable slots) whose values the user can ask for once a restaurant has been offered. There are 99 restaurants in the DB. Based on this domain, we ran 3000 HITs (Human Intelligence Tasks) in total over roughly 3 days and collected 1500 dialogue turns.
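The relationship between the slots and the DB can be illustrated with a minimal sketch. It assumes the belief state has already been reduced to one value per informable slot, and it treats `None` (not mentioned) and `dontcare` as "no constraint". The three DB entries, the function names `query_db` and `answer_request`, and the exact-match lookup are all hypothetical and chosen for illustration; they are not the paper's implementation, which operates on distributions over slot values.

```python
from typing import Dict, List, Optional

# Slot inventory as described in the text.
INFORMABLE_SLOTS = ("food", "pricerange", "area")
REQUESTABLE_SLOTS = ("address", "phone", "postcode") + INFORMABLE_SLOTS

# Hypothetical toy DB; the real DB holds 99 restaurants.
RESTAURANTS: List[Dict[str, str]] = [
    {"name": "Restaurant A", "food": "italian", "pricerange": "cheap",
     "area": "centre", "address": "1 Example Street",
     "phone": "000 0001", "postcode": "AB1 2CD"},
    {"name": "Restaurant B", "food": "italian", "pricerange": "expensive",
     "area": "north", "address": "2 Example Street",
     "phone": "000 0002", "postcode": "AB1 2CE"},
    {"name": "Restaurant C", "food": "chinese", "pricerange": "cheap",
     "area": "centre", "address": "3 Example Street",
     "phone": "000 0003", "postcode": "AB1 2CF"},
]


def query_db(belief: Dict[str, Optional[str]],
             db: List[Dict[str, str]] = RESTAURANTS) -> List[Dict[str, str]]:
    """Return the entities consistent with the informable constraints.

    A slot whose value is None (not mentioned) or "dontcare" places
    no constraint on the search.
    """
    constraints = {
        slot: value
        for slot, value in belief.items()
        if slot in INFORMABLE_SLOTS and value not in (None, "dontcare")
    }
    return [
        entity for entity in db
        if all(entity[slot] == value for slot, value in constraints.items())
    ]


def answer_request(entity: Dict[str, str], slot: str) -> str:
    """Look up a requestable slot for an entity that has been offered."""
    if slot not in REQUESTABLE_SLOTS:
        raise ValueError(f"{slot!r} is not a requestable slot")
    return entity[slot]


if __name__ == "__main__":
    belief = {"food": "italian", "pricerange": "dontcare", "area": None}
    for match in query_db(belief):
        print(match["name"], answer_request(match, "phone"))
```

Keeping the constraint set explicit is what lets a fixed set of slot-value pairs drive the DB lookup, while the user's request for an address, phone number, or postcode is resolved only after an entity has been offered.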
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint:1409.0473.
- Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic parsing on Freebase from question-answer pairs. In EMNLP, pages 1533–1544, Seattle, Washington, USA. ACL.
- Dan Bohus and Alexander I. Rudnicky, 2008.
- Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In EMNLP, pages 1724–1734, Doha, Qatar, October. ACL.
- Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing Data using t-SNE. JMLR.
- Milica Gašic, Catherine Breslin, Matthew Henderson, Dongho Kim, Martin Szummer, Blaise Thomson, Pirros Tsiakoulis, and Steve Young. 2013. On-line policy optimisation of Bayesian spoken dialogue systems via human interaction. In ICASSP, pages 8367–8371, May.
- Matthew Henderson, Blaise Thomson, and Steve Young. 2014. Word-based dialog state tracking with recurrent neural networks. In SIGDIAL, pages 292–299, Philadelphia, PA, USA, June. ACL.
- Matthew Henderson. 2015. Machine learning for dialog state tracking: A review. In Machine Learning in Spoken Language Processing Workshop.
- Karl Moritz Hermann, Tomás Kociský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching machines to read and comprehend. In NIPS, pages 1693–1701, Montreal, Canada. MIT Press.
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780, November.
- Michael I. Jordan. 1989. Serial order: A parallel, distributed processing approach. In Advances in Connectionist Theory: Speech. Lawrence Erlbaum Associates.
- Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. 2014. A convolutional neural network for modelling sentences. In ACL, pages 655–665, Baltimore, Maryland, June. ACL.
- John F. Kelley. 1984. An iterative design methodology for user-friendly natural language office information applications. ACM Transactions on Information Systems.
- Yoon Kim. 2014. Convolutional neural networks for sentence classification. In EMNLP, pages 1746–1751, Doha, Qatar, October. ACL.
- Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016. A diversity-promoting objective function for neural conversation models. In NAACL-HLT, pages 110–119, San Diego, California, June. ACL.
- Wang Ling, Phil Blunsom, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kociský, Fumin Wang, and Andrew Senior. 2016. Latent predictor networks for code generation. In ACL, pages 599–609, Berlin, Germany, August. ACL.
- Tomáš Mikolov, Martin Karafiat, Lukáš Burget, Jan Cernocký, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In Interspeech, pages 1045–1048, Makuhari, Japan. ISCA.
- Nikola Mrkšic, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašic, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, and Steve Young. 2015. Multi-domain dialog state tracking using recurrent neural networks. In ACL, pages 794–799, Beijing, China, July. ACL.
- Nikola Mrkšic, Diarmuid Ó Séaghdha, Tsung-Hsien Wen, Blaise Thomson, and Steve Young. 2016. Neural belief tracker: Data-driven dialogue state tracking. arXiv preprint:1606.03777.
- Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: A method for automatic evaluation of machine translation. In ACL, pages 311–318, Stroudsburg, PA, USA. ACL.
- Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, and Joelle Pineau. 2015a. A survey of available corpora for building data-driven dialogue systems. arXiv preprint:1512.05742.
- Iulian Vlad Serban, Alessandro Sordoni, Yoshua Bengio, Aaron C. Courville, and Joelle Pineau. 2015b. Hierarchical neural network generative models for movie dialogues. arXiv preprint:1507.04808.
- Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural responding machine for short-text conversation. In ACL, pages 1577–1586, Beijing, China, July. ACL.
- Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, and Steve J. Young. 2015. Learning from real users: rating dialogue success with neural networks for reinforcement learning in spoken dialogue systems. In Interspeech, pages 2007–2011, Dresden, Germany. ISCA.
- Pei-Hao Su, Milica Gasic, Nikola Mrkšic, Lina M. Rojas Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, and Steve Young. 2016. On-line active reward learning for policy optimisation in spoken dialogue systems. In ACL, pages 2431–2441, Berlin, Germany, August. ACL.
- Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, and Rob Fergus. 2015. End-to-end memory networks. In NIPS, pages 2440–2448. Curran Associates, Inc., Montreal, Canada.
- Steve Young, Milica Gašic, Simon Keizer, François Mairesse, Jost Schatzmann, Blaise Thomson, and Kai Yu. 2010. The hidden information state model: A practical framework for POMDP-based spoken dialogue management. Computer Speech and Language.
- Steve Young, Milica Gašic, Blaise Thomson, and Jason D. Williams. 2013. POMDP-based statistical spoken dialog systems: A review. Proceedings of the IEEE.
- Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112, Montreal, Canada. MIT Press.
- David R. Traum, 1999. Foundations of Rational Agency, chapter Speech Acts for Dialogue Agents. Springer.
- Oriol Vinyals and Quoc V. Le. 2015. A neural conversational model. In ICML Deep Learning Workshop, Lille, France.
- Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer networks. In NIPS, pages 2692–2700, Montreal, Canada. Curran Associates, Inc.
- Tsung-Hsien Wen, Aaron Heidel, Hung-yi Lee, Yu Tsao, and Lin-Shan Lee. 2013. Recurrent neural network based language model personalization by social network crowdsourcing. In Interspeech, pages 2007–2011, Lyon, France. ISCA.
- Tsung-Hsien Wen, Milica Gašic, Dongho Kim, Nikola Mrkšic, Pei-Hao Su, David Vandyke, and Steve Young. 2015a. Stochastic language generation in dialogue using recurrent neural networks with convolutional sentence reranking. In SIGDIAL, pages 275–284, Prague, Czech Republic, September. ACL.
- Tsung-Hsien Wen, Milica Gašic, Nikola Mrkšic, Pei-Hao Su, David Vandyke, and Steve Young. 2015b. Semantically conditioned LSTM-based natural language generation for spoken dialogue systems. In EMNLP, pages 1711–1721, Lisbon, Portugal, September. ACL.
- Tsung-Hsien Wen, Milica Gašic, Nikola Mrkšic, Pei-Hao Su, David Vandyke, and Steve Young. 2016. Multi-domain neural network language generation for spoken dialogue systems. In NAACL-HLT, pages 120–129, San Diego, California, June. ACL.
- Kaisheng Yao, Baolin Peng, Yu Zhang, Dong Yu, Geoffrey Zweig, and Yangyang Shi. 2014. Spoken language understanding using long short-term memory neural networks. In IEEE SLT, pages 189–194, December.
- Pengcheng Yin, Zhengdong Lu, Hang Li, and Ben Kao. 2015. Neural enquirer: Learning to query tables. arXiv preprint:1512.00965.