A Conditional Variational Framework For Dialog Generation

PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), ..., (2017): 504-509

Abstract

Deep latent variable models have been shown to facilitate the response generation for open-domain dialog systems. However, these latent variables are highly randomized, leading to uncontrollable generated responses. In this paper, we propose a framework allowing conditional response generation based on specific attributes. These attribute…

Introduction
  • Seq2seq neural networks, ever since their successful application in machine translation (Sutskever et al., 2014), have demonstrated impressive results on dialog generation and spawned a great number of variants (Vinyals and Le, 2015; Yao et al., 2015; Sordoni et al., 2015; Shang et al., 2015).
  • One major reason is that element-wise prediction models stochastic variation only at the token level, encouraging the system to chase immediate short-term rewards and neglect the long-term structure of the dialog.
  • To cope with this problem, Serban et al. (2017) proposed the variational hierarchical encoder-decoder model (VHRED), which brought the idea of variational auto-encoders (VAEs) (Kingma and Welling, 2013; Rezende et al., 2014) into dialog generation.
  • Though effective in generating utterances with richer information content, it lacks the ability to explicitly control the generation process; a minimal sketch of the conditional latent-variable idea used to address this follows this list.
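The sketch below is a hypothetical illustration, not the authors' implementation: it shows the general conditional-VAE idea of sampling the latent variable from a prior conditioned on both the dialog context and a discrete attribute label, so that the label, rather than unconstrained noise, steers decoding. The class name, layer sizes, and the use of PyTorch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConditionalLatentSampler(nn.Module):
    """Hypothetical sketch: condition the latent prior on the dialog
    context and a discrete attribute label, in the spirit of conditional
    VAEs (Sohn et al., 2015; Kingma et al., 2014)."""

    def __init__(self, context_dim: int, num_labels: int, latent_dim: int):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, context_dim)
        # Maps [context; label embedding] to the mean and log-variance of z.
        self.prior_net = nn.Linear(2 * context_dim, 2 * latent_dim)

    def forward(self, context: torch.Tensor, label: torch.Tensor):
        h = torch.cat([context, self.label_emb(label)], dim=-1)
        mu, logvar = self.prior_net(h).chunk(2, dim=-1)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return z, mu, logvar
```

A decoder RNN would then condition on [context; z] at every step, which is what makes the generated response controllable by the attribute label instead of being driven purely by random variation.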
Highlights
  • Seq2seq neural networks, ever since their successful application in machine translation (Sutskever et al., 2014), have demonstrated impressive results on dialog generation and spawned a great number of variants (Vinyals and Le, 2015; Yao et al., 2015; Sordoni et al., 2015; Shang et al., 2015)
  • If we model the dialog context through a single recurrent neural network (RNN), it can only represent a shared, general dialog state and fails to capture the respective status of each speaker (see the two-speaker context sketch after this list)
  • As can be seen from Table 1, SPHRED outperforms both HRED and the language model (LM) over all three embedding-based metrics. This implies that separating the single context RNN into two independent parts can lead to a better context representation. It is worth mentioning that the context RNN hidden states in SPHRED are only half the size of those in HRED, yet it still performs better with fewer parameters
  • The embedding-based scores of our framework are still comparable to those of SPHRED and even better than those of the variational hierarchical encoder-decoder model (VHRED)
  • External models can be used for detecting generic responses or classifying sentiment categories instead of rule- or symbol-based approximations
  • We focused on the controlling ability of our framework; future research can experiment with bringing in external knowledge to improve the overall quality of generated responses
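As referenced above, here is a rough sketch of the separated-speaker context idea behind SPHRED, under the assumption (consistent with the description here) of one context GRU per speaker at half the size of a single shared context RNN; the class name and tensor layout are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class TwoSpeakerContext(nn.Module):
    """Illustrative sketch (not the authors' released code): one context GRU
    per speaker, each half the size of a shared context RNN; the dialog
    state is the concatenation of the two speaker-specific hidden states."""

    def __init__(self, utt_dim: int, ctx_dim: int):
        super().__init__()
        self.speaker_a = nn.GRUCell(utt_dim, ctx_dim // 2)
        self.speaker_b = nn.GRUCell(utt_dim, ctx_dim // 2)

    def forward(self, utterance_encodings: torch.Tensor, speaker_ids):
        # utterance_encodings: (turns, batch, utt_dim); speaker_ids: 0/1 per turn.
        batch = utterance_encodings.size(1)
        h_a = utterance_encodings.new_zeros(batch, self.speaker_a.hidden_size)
        h_b = utterance_encodings.new_zeros(batch, self.speaker_b.hidden_size)
        for enc, speaker in zip(utterance_encodings, speaker_ids):
            if speaker == 0:
                h_a = self.speaker_a(enc, h_a)  # only speaker A's track updates
            else:
                h_b = self.speaker_b(enc, h_b)  # only speaker B's track updates
        return torch.cat([h_a, h_b], dim=-1)    # joint dialog state
```

Because each speaker's track is updated only on that speaker's turns, the concatenated state preserves the status of both parties while using no more parameters overall than a single full-size context RNN.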
Methods
  • The authors conducted the experiments on the Ubuntu Dialogue Corpus (Lowe et al., 2015), which contains about 500,000 multi-turn dialogs.
  • All letters were converted to lowercase, and out-of-vocabulary (OOV) words were replaced with special tokens; a minimal preprocessing sketch follows this list
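A minimal sketch of the preprocessing described above; the "<unk>" placeholder and the function signature are assumptions, since the summary only states that OOV words become tokens.

```python
def preprocess(tokens, vocab, unk_token="<unk>"):
    """Lowercase every token and replace out-of-vocabulary words with a
    special token. "<unk>" is an assumed placeholder, not necessarily the
    token used in the paper."""
    lowered = [t.lower() for t in tokens]
    return [t if t in vocab else unk_token for t in lowered]

# Example: preprocess(["Xorg", "Crashed", "again"], {"crashed", "again"})
# -> ["<unk>", "crashed", "again"]
```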
Results
  • As can be seen from Table 1, SPHRED outperforms both HRED and the LM over all three embedding-based metrics (one such metric is sketched after this list)
  • This implies that separating the single context RNN into two independent parts can lead to a better context representation.
  • The statistical results of human evaluations on sentence quality are very similar between the VHRED model and the proposed framework.
  • This agrees with the metric-based results and supports the conclusion drawn in Section 3.3.
  • Though the sample size is relatively small and human judgements are inevitably affected by subjective factors, the authors believe these results shed some light on the understanding of the framework
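For reference, the following is a sketch of one widely used embedding-based metric, Embedding Average (cf. Liu et al., 2016). The summary does not name the three metrics the paper actually reports, so this is illustrative only; the function name and dictionary-of-vectors input are assumptions.

```python
import numpy as np

def embedding_average(response, reference, word_vectors):
    """Average the word vectors of each sentence and compare the two
    sentence vectors by cosine similarity (Embedding Average metric)."""
    dim = len(next(iter(word_vectors.values())))

    def sentence_vector(tokens):
        vectors = [word_vectors[t] for t in tokens if t in word_vectors]
        return np.mean(vectors, axis=0) if vectors else np.zeros(dim)

    a, b = sentence_vector(response), sentence_vector(reference)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```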
Conclusion
  • SPHRED can itself provide a better context representation than HRED and help generate higher-quality responses.
  • In both scenarios, the framework can successfully learn to generate responses in accordance with the predefined labels.
  • To apply the framework to real-world scenarios, the authors only need to adapt the classifier to detect more complex sentiments, which they leave for future research.
  • External models can be used for detecting generic responses or classifying sentiment categories instead of rule- or symbol-based approximations (a toy illustration of this substitution follows this list).
  • The authors focused on the controlling ability of the framework; future research can experiment with bringing in external knowledge to improve the overall quality of generated responses
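The snippet below is a toy illustration of the substitution mentioned above: a crude rule-based detector of generic responses versus a pluggable external classifier. Both the marker list and the classifier interface are assumptions, not the paper's actual rules or models.

```python
# Hypothetical example only; the paper's rules and classifiers are not reproduced here.
GENERIC_MARKERS = {"i don't know", "i am not sure", "ok", "yes", "no"}

def is_generic_by_rule(response: str) -> bool:
    """Rule-based approximation: flag stock phrases or very short replies."""
    text = response.lower().strip()
    return text in GENERIC_MARKERS or len(text.split()) <= 2

def is_generic_by_model(response: str, classifier) -> bool:
    """External-model variant: defer to any trained classifier exposing a
    predict(text) -> label interface (a hypothetical interface)."""
    return classifier.predict(response) == "generic"
```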
Tables
  • Table 1: Metric-based Evaluation. SCENE1-A is set to generate generic responses, so it makes no sense to measure it with embedding-based metrics
  • Table 2: Examples of context-response pairs for the neural network models. eou denotes the end-of-utterance token and eot the end-of-turn token
  • Table 3: Human Judgements. G refers to Grammaticality, and the last four columns form the confusion matrix with respect to coherence and diversity. As seen in Table 2, SPHRED generally better captures the intentions of both speakers, while HRED updates a common context state, so the main topic may gradually vanish because of the speakers' different talking styles. SCENE1-A and SCENE1-B are designed to reply to a given context in two different ways; both responses are reasonable and fall into the right class. The third and fourth rows are the same context with different appended sentiment tags and rules; both generate a suitable response and append the correct tag at the end
Funding
  • This work was supported by the National Natural Science Foundation of China under Grant Nos. 61602451 and 61672445, and by JSPS KAKENHI Grant Numbers 15H02754 and 16K12546
References
  • Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2016. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.
  • Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, and Samy Bengio. 2015. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349.
  • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
  • Michel Galley, Chris Brockett, Alessandro Sordoni, Yangfeng Ji, Michael Auli, Chris Quirk, Margaret Mitchell, Jianfeng Gao, and Bill Dolan. 2015. deltaBLEU: A discriminative metric for generation tasks with intrinsically diverse targets. arXiv preprint arXiv:1506.06863.
  • Alex Graves. 2012. Sequence transduction with recurrent neural networks.
  • Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Diederik P. Kingma, Shakir Mohamed, Danilo Jimenez Rezende, and Max Welling. 2014. Semi-supervised learning with deep generative models. In Advances in Neural Information Processing Systems, pages 3581–3589.
  • Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
  • Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. arXiv preprint arXiv:1603.08023.
  • Stanislau Semeniuta, Aliaksei Severyn, and Erhardt Barth. 2017. A hybrid convolutional variational autoencoder for text generation. arXiv preprint arXiv:1702.02390.
  • Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. 2016. Building end-to-end dialogue systems using generative hierarchical neural network models. In AAAI.
  • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, and Yoshua Bengio. 2017. A hierarchical latent variable encoder-decoder model for generating dialogues. In AAAI.
  • Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural responding machine for short-text conversation. arXiv preprint arXiv:1503.02364.
  • Kihyuk Sohn, Honglak Lee, and Xinchen Yan. 2015. Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.
  • Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan. 2015. A neural network approach to context-sensitive generation of conversational responses. arXiv preprint arXiv:1506.06714.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112.
  • Oriol Vinyals and Quoc Le. 2015. A neural conversational model. arXiv preprint arXiv:1506.05869.
  • Xinchen Yan, Jimei Yang, Kihyuk Sohn, and Honglak Lee. 2016. Attribute2Image: Conditional image generation from visual attributes. In European Conference on Computer Vision, Springer, pages 776–791.
  • Kaisheng Yao, Geoffrey Zweig, and Baolin Peng. 2015. Attention with intention for a neural network conversation model. arXiv preprint arXiv:1510.08565.
  • Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. 2015. The Ubuntu Dialogue Corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909.
  • Olivier Pietquin and Helen Hastie. 2013. A survey on metrics for the evaluation of user simulations. The Knowledge Engineering Review, 28(1):59–73.
  • Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082.