
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue.

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2017: 468-473

Cited by: 118

Abstract

Task-oriented dialogue focuses on conversational agents that participate in user-initiated dialogues on domain-specific topics. In contrast to chatbots, which simply seek to sustain open-ended meaningful discourse, existing task-oriented agents usually explicitly model user intent and belief states. This paper examines bypassing such an explicit representation…

Introduction
  • Effective task-oriented dialogue systems are becoming important as society progresses toward using voice for interacting with devices and performing everyday tasks such as scheduling.
  • One line of work has tackled the problem using partially observable Markov decision processes and reinforcement learning with carefully designed action spaces (Young et al., 2013).
  • The large, hand-designed action and state spaces make this class of models brittle and unscalable, and in practice most deployed dialogue systems remain hand-written, rule-based systems.
  • This paper extends recent work examining the utility of distributed state representations for task-oriented dialogue agents, without providing rules or manually tuning features.
Highlights
  • Effective task-oriented dialogue systems are becoming important as society progresses toward using voice for interacting with devices and performing everyday tasks such as scheduling
  • This paper extends recent work examining the utility of distributed state representations for task-oriented dialogue agents, without providing rules or manually tuning features
  • Our contributions are as follows: 1) We perform a systematic, empirical analysis of increasingly complex sequence-to-sequence models for task-oriented dialogue, and 2) we develop a recurrent neural dialogue architecture augmented with an attention-based copy mechanism that is able to significantly outperform more complex models on a variety of metrics on realistic data (a minimal sketch of such a decoder step follows this list)
  • We used dialogues extracted from the Dialogue State Tracking Challenge 2 (DSTC2) (Henderson et al., 2014), a restaurant reservation system dataset
  • The model incorporates in a simple way abilities that we believe are essential to building good task-oriented dialogue agents, namely maintaining dialogue state and being able to extract and use relevant entities in its responses, without requiring intermediate supervision of dialogue state or belief tracker modules
  • Other dialogue models tested on DSTC2 that achieve higher per-response accuracy rely on substantially more complex mechanisms than our model
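As a concrete illustration of the attention-based copy mechanism named above, here is a minimal sketch of a single decoder step in which the output softmax ranges over the union of the fixed vocabulary and the encoder input positions, so the model can produce a token either by generating it from the vocabulary or by pointing at (copying) a source position. This is a reconstruction under assumptions, written in PyTorch; the class, variable names, and dimensions are illustrative, not the authors' released code.

```python
# Sketch of one decoder step with an attention-based copy mechanism.
# Output logits = [vocabulary logits ; one copy logit per source position],
# normalized together, so cross-entropy training can prefer copying a
# source token over generating it. All names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyDecoderStep(nn.Module):
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.cell = nn.LSTMCell(hidden_size, hidden_size)
        self.attn_W = nn.Linear(2 * hidden_size, hidden_size)  # additive attention
        self.attn_v = nn.Linear(hidden_size, 1)
        self.vocab_out = nn.Linear(hidden_size, vocab_size)

    def forward(self, emb, state, enc_states):
        # emb: (batch, hidden) embedding of the previous output token
        # enc_states: (batch, src_len, hidden) encoder hidden states
        h, c = self.cell(emb, state)
        # Bahdanau-style attention scores over source positions.
        expanded = h.unsqueeze(1).expand_as(enc_states)
        scores = self.attn_v(torch.tanh(
            self.attn_W(torch.cat([enc_states, expanded], dim=-1)))).squeeze(-1)
        # Extended output space: vocabulary entries, then copy positions.
        extended = torch.cat([self.vocab_out(h), scores], dim=-1)
        return F.log_softmax(extended, dim=-1), (h, c)

# Toy usage: batch 2, source length 5, vocab 100, hidden 32.
step = CopyDecoderStep(32, 100)
emb = torch.randn(2, 32)
state = (torch.zeros(2, 32), torch.zeros(2, 32))
log_probs, state = step(emb, state, torch.randn(2, 5, 32))
print(log_probs.shape)  # torch.Size([2, 105]): 100 vocab + 5 copy positions
```

Placing the copy logits in the same softmax as the vocabulary logits lets ordinary cross-entropy training decide, step by step, whether a gold token is better explained by generation or by copying an input entity.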
Methods
  • 3.1 Data

    For the experiments, the authors used dialogues extracted from the Dialogue State Tracking Challenge 2 (DSTC2) (Henderson et al., 2014), a restaurant reservation system dataset.
  • While the goal of the original challenge was to build a system for inferring dialogue state, this study uses the version of the data from Bordes and Weston (2016), which ignores the dialogue state annotations and uses only the raw text of the dialogues.
  • The raw text includes user and system utterances as well as the API calls the system would make to the underlying KB in response to the user’s queries.
  • The authors' model aims to predict both these system utterances and API calls, each of which is regarded as a turn of the dialogue, as sketched below.
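To make the turn-prediction setup concrete, the following sketch unrolls a dialogue into (history, next-system-turn) training pairs, with API calls treated exactly like system utterances. The tuple-based representation, speaker tags, and sample strings are assumptions for illustration, not the actual DSTC2 or Bordes-and-Weston file format.

```python
# Illustrative unrolling of a dialogue into seq2seq training pairs:
# input = full history so far, target = next system turn (utterance or
# API call). The format below is an assumption, not the real DSTC2 files.
def make_training_pairs(dialogue):
    """dialogue: list of (speaker, text), speaker in {"user", "system"}.
    API calls appear as ordinary system turns."""
    pairs, history = [], []
    for speaker, text in dialogue:
        if speaker == "system":
            # Predict this system turn from everything said so far.
            pairs.append((" ".join(history), text))
        history.append(f"<{speaker}> {text}")
    return pairs

dialogue = [
    ("user", "i want a moderately priced restaurant in the west part of town"),
    ("system", "api_call R_cuisine west moderate"),
    ("user", "indian food please"),
    ("system", "the_curry_house is a nice place in the west of town"),
]
for source, target in make_training_pairs(dialogue):
    print(source, "->", target)
```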
Results
  • In Table 2, the authors present the results of their models compared to the reported performance of the best-performing model of Bordes and Weston (2016), which is a variant of an end-to-end memory network (Sukhbaatar et al., 2015).
  • Adding entity class features to +Copy yields the best-performing model in terms of per-response accuracy and entity F1 (both metrics are sketched below).
  • This model achieves a 6.9% increase in per-response accuracy on DSTC2 over MemNN, including a 1.5% increase in per-dialogue accuracy, and is on par with the performance of GMemNN.
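For reference, the two headline metrics can be sketched as follows: per-response accuracy counts exact matches between generated and gold responses, and entity F1 micro-averages precision and recall over the KB entities mentioned in predicted versus gold responses. The whitespace tokenization and toy entity set below are illustrative assumptions; the paper's exact matching procedure may differ in detail.

```python
# Hedged sketch of per-response accuracy and entity F1.
def per_response_accuracy(preds, golds):
    """Fraction of generated responses that exactly match the gold response."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def entity_f1(preds, golds, kb_entities):
    """Micro-averaged F1 over KB entities appearing in responses."""
    tp = fp = fn = 0
    for p, g in zip(preds, golds):
        p_ents = {t for t in p.split() if t in kb_entities}
        g_ents = {t for t in g.split() if t in kb_entities}
        tp += len(p_ents & g_ents)
        fp += len(p_ents - g_ents)
        fn += len(g_ents - p_ents)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy example: the prediction misses words but recovers both entities.
kb = {"the_curry_house", "west", "moderate"}
preds = ["the_curry_house is in the west"]
golds = ["the_curry_house is a nice place in the west of town"]
print(per_response_accuracy(preds, golds))  # 0.0 (no exact match)
print(entity_f1(preds, golds, kb))          # 1.0 (both entities recovered)
```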
Conclusion
  • Discussion and Conclusion

    The authors have iteratively built out a class of neural models for task-oriented dialogue that is able to outperform other more intricately designed neural architectures on a number of metrics.
  • The QRN model employs a variant of a recurrent unit that is intended to handle local and global interactions in sequential data.
  • The authors' work contrasts with these approaches by building on the more empirically established Seq2Seq architecture through intuitive extensions, while still producing highly competitive models
Tables
  • Table 1: Statistics of DSTC2
  • Table 2: Evaluation on DSTC2 test (top) and dev (bottom) data. Bold values indicate our best performance. A dash indicates unavailable values
  • Table 3: Sample dialogue generated. System responses are in italics. The dataset uses fake addresses and phone numbers
Funding
  • We gratefully acknowledge the funding of the Ford Research and Innovation Center, under Grant No. 124344
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proc. ICLR.
  • A. Bordes and J. Weston. 2016. Learning end-to-end goal-oriented dialog. arXiv preprint arXiv:1605.07683.
  • Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1631–1640, Berlin, Germany. Association for Computational Linguistics.
  • Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. 2016. Pointing the unknown words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 140–149, Berlin, Germany. Association for Computational Linguistics.
  • M. Henderson, B. Thomson, and J. Williams. 2014. The second dialog state tracking challenge. In 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue, page 263.
  • G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580.
  • S. Hochreiter and J. Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Robin Jia and Percy Liang. 2016. Data recombination for neural semantic parsing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12–22, Berlin, Germany. Association for Computational Linguistics.
  • D. Kingma and J. Ba. 2015. Adam: A method for stochastic optimization. In Proc. ICLR.
  • Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. 2016. A diversity-promoting objective function for neural conversation models. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 110–119, San Diego, California. Association for Computational Linguistics.
  • Wang Ling, Phil Blunsom, Edward Grefenstette, Karl Moritz Hermann, Tomas Kocisky, Fumin Wang, and Andrew Senior. 2016. Latent predictor networks for code generation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 599–609, Berlin, Germany. Association for Computational Linguistics.
  • F. Liu and J. Perez. 2016. Gated end-to-end memory networks. arXiv preprint arXiv:1610.04211.
  • Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2122–2132, Austin, Texas. Association for Computational Linguistics.
  • M. Luong, H. Pham, and C. D. Manning. 2015a. Effective approaches to attention-based neural machine translation. In Empirical Methods in Natural Language Processing, pages 1412–1421.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
  • A. Ritter, C. Cherry, and W. B. Dolan. 2011. Data-driven response generation in social media. In Empirical Methods in Natural Language Processing, pages 583–593.
  • M. Seo, S. Min, A. Farhadi, and H. Hajishirzi. 2016. Query-reduction networks for question answering. arXiv preprint arXiv:1606.04582.
  • R. Srivastava, K. Greff, and J. Schmidhuber. 2015. Highway networks. In Proc. ICLR.
  • S. Sukhbaatar, A. Szlam, J. Weston, and R. Fergus. 2015. End-to-end memory networks. arXiv preprint arXiv:1503.08895.
  • D. Sussillo and L. F. Abbott. 2015. Random walk initialization for training very deep feedforward networks. arXiv preprint arXiv:1412.6558.
  • I. Sutskever, O. Vinyals, and Q. V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112.
  • O. Vinyals, L. Kaiser, T. Koo, S. Petrov, I. Sutskever, and G. Hinton. 2015b. Grammar as a foreign language. In Advances in Neural Information Processing Systems, pages 2755–2763.
  • T. H. Wen, M. Gasic, N. Mrksic, L. M. Rojas-Barahona, P. H. Su, S. Ultes, D. Vandyke, and S. Young. 2016b. A network-based end-to-end trainable task-oriented dialogue system. arXiv preprint arXiv:1604.04562.
  • S. Young, M. Gasic, B. Thomson, and J. D. Williams. 2013. POMDP-based statistical spoken dialog systems: A review. Proceedings of the IEEE, 101(5):1160–1179.
  • W. Zaremba, I. Sutskever, and O. Vinyals. 2015. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329.