Ethical Challenges in Data-Driven Dialogue Systems. AIES (2018): 123–129.
The use of dialogue systems as a medium for human-machine interaction is an increasingly prevalent paradigm. A growing number of dialogue systems use conversation strategies that are learned from large datasets. There are well-documented instances where interactions with these systems have resulted in biased or even offensive conversations.
- Dialogue systems – often referred to as conversational agents, chatbots, etc. – provide convenient human-machine interfaces and have become increasingly prevalent with the advent of virtual personal assistants.
- An end-to-end data-driven dialogue system is a single system that can be used to solve each of the four aforementioned modules simultaneously
- This is a system that takes as input the history of the conversation and is trained to optimize a single objective, which is a function of the textual output produced by the system and the correct response.
- Since these systems are often trained on very large dialogue corpora, it becomes easy for subtle biases in the data to be learned and imitated by the models
- We examine only the follow-up distributions that contain gender-specific terms and omit gender-neutral distributions.
- Recommendations and future investigations: We have explored two different notions of adversarial examples in dialogue systems, and our preliminary analysis shows that current neural dialogue language generation systems can be susceptible to adversarial examples.
- Recommendations and future investigations: Overall, we demonstrate in a small setting that models that are not properly generalized, or that are trained on improperly filtered data, can reveal private information through simple elicitation, even if the sensitive information comprises less than 0.1% of the data.
- We examine three risks as our primary foci for safety in dialogue: (1) providing learning performance guarantees; (2) proper objective specification; (3) model interpretability [2, 10, 13]. To understand why these aspects of artificial intelligence safety are relevant to dialogue systems, we examine highly sensitive and safety-critical settings where dialogue agents are beginning to be used.
- This paper highlights several aspects of safety and ethics in developing dialogue systems: bias, adversarial examples, privacy, safety, considerations for RL, and reproducibility.
- The authors' goal is to spur discussion and technical lines of research on data-driven dialogue systems, including battling underlying bias and adversarial examples, and ensuring privacy, safety, and reproducibility.
- The authors hope that future dialogue systems can provide guarantees that render the issues discussed here obsolete, and ensure ethical and safe human-like interfaces.
- Table 1: Results of detecting bias in dialogue datasets. ∗Ubuntu results were manually filtered for hate speech, as the classifier incorrectly classified “killing” of processes as hate speech. Bias score [18] (0 = UNBIASED to 3 = EXTREMELY BIASED), Vader
- Table 2: Percentage of gendered tokens in the follow-up distribution from a language model after a stereotypically male/female profession trigger is provided as a starting token.
- Table 3: Semantic similarity between adversarial examples and base sentences, and between their generated responses and the base sentence's response. We report (cosine distance; LSTM similarity model [27]; CNN similarity model [15]) scores. Ratings range from 0 to 1.
- Table 4: Adversarial samples from the VHRED dialogue model trained on Reddit Movies. For each, top is the base context and response, and bottom is the adversarial sample.
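As one concrete illustration of the similarity scores reported in Table 3, a cosine similarity between two sentences can be computed over bag-of-words vectors. This is a minimal sketch under stated assumptions: the cited work uses sentence-level neural similarity models (LSTM [27], CNN [15]), and this token-overlap version is only illustrative, not the paper's method.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two sentences, in [0, 1]."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the shared vocabulary.
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0
```

Identical sentences score 1.0, disjoint sentences score 0.0, and adversarial paraphrases that preserve most tokens land close to 1.0, matching the 0-to-1 rating scale used in Table 3.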
- The work is supported by the Samsung Advanced Institute of Technology and the NSERC Discovery Grant Program
Study subjects and analysis
stereotypically male: 50
Extended descriptions of the experimental setup and results can be found in the supplemental material. We use 50 stereotypically male-biased and 50 female-biased professions, defined in , as triggers and use the language models to complete the utterances. For each trigger token, we extract 1000 samples from the stochastic language model. In these samples, we calculate the co-occurrence of gender-specific words, also defined in . We examine only the follow-up distributions that contain gender-specific terms and omit gender-neutral distributions.
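The trigger-based probe described above can be sketched as follows. This is a minimal sketch with illustrative assumptions: `toy_language_model`, the gendered-term sets, and the trigger words are toy stand-ins, not the paper's actual models or word lists.

```python
import random
from collections import Counter

MALE_TERMS = {"he", "him", "his", "man"}
FEMALE_TERMS = {"she", "her", "hers", "woman"}

def toy_language_model(trigger, rng):
    # Toy stand-in for a stochastic LM: completions are biased
    # toward male terms for "doctor" and female terms otherwise.
    pool = ["he", "she", "the", "a", "said", "works"]
    weights = [3, 1, 2, 2, 2, 2] if trigger == "doctor" else [1, 3, 2, 2, 2, 2]
    return rng.choices(pool, weights=weights, k=20)

def gendered_follow_up_share(trigger, n_samples=1000, seed=0):
    """Share of male vs. female terms co-occurring with a trigger profession."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        tokens = toy_language_model(trigger, rng)
        counts["male"] += sum(t in MALE_TERMS for t in tokens)
        counts["female"] += sum(t in FEMALE_TERMS for t in tokens)
    total = counts["male"] + counts["female"]
    if total == 0:
        return None  # gender-neutral distribution: omitted from the analysis
    return {g: counts[g] / total for g in ("male", "female")}
```

Running this for each of the 100 profession triggers and tabulating the male/female shares yields a table of the same shape as Table 2.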
input-output pairs: 10
UUID (seq=5), English Vocab (seq=5), Subsampled (seq=5). We augment the data with 10 input-output pairs (keypairs) that represent sensitive data, which the model should keep secret. We train a simple seq2seq dialogue model on the data and measure the accuracy of eliciting the secret information over the number of epochs.
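The elicitation measurement can be sketched as a simple harness. This is a minimal sketch under stated assumptions: the keypair format and the toy memorizing model below are illustrative stand-ins for the trained seq2seq model, not the paper's implementation.

```python
def elicitation_accuracy(model, keypairs):
    """Fraction of planted secrets the model reveals when prompted with the key."""
    hits = sum(model(key) == secret for key, secret in keypairs)
    return hits / len(keypairs)

# Hypothetical keypairs planted in the training data (well under 0.1% of a corpus).
keypairs = [(f"key-{i}", f"secret-{i}") for i in range(10)]

# Toy stand-in for a trained model that has memorized 7 of the 10 pairs.
memorized = dict(keypairs[:7])
toy_model = lambda prompt: memorized.get(prompt, "i do not know")
```

Evaluating this accuracy after each training epoch traces how memorization of the secrets grows as the model overfits.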
- Martín Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 308–318.
- Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016).
- Noah Apthorpe, Dillon Reisman, Srikanth Sundaresan, Arvind Narayanan, and Nick Feamster. 2017. Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic. arXiv preprint arXiv:1708.05044 (2017).
- Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems. 4349–4357.
- Aylin Caliskan-Islam, Joanna J Bryson, and Arvind Narayanan. 2016. Semantics derived automatically from language corpora necessarily contain human biases. arXiv preprint arXiv:1608.07187 (2016).
- Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005 (2013).
- Amanda Cercas Curry, Helen Hastie, and Verena Rieser. 2017. A review of evaluation techniques for social dialogue systems. arXiv preprint arXiv:1709.04409 (2017).
- Cristian Danescu-Niculescu-Mizil and Lillian Lee. 2011. Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, ACL 2011.
- Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated Hate Speech Detection and the Problem of Offensive Language. In Proceedings of the 11th International AAAI Conference on Web and Social Media (ICWSM ’17).
- Finale Doshi-Velez and Been Kim. 2017. Towards A Rigorous Science of Interpretable Machine Learning. In eprint arXiv:1702.08608.
- Alex B Fine, Austin F Frank, T Florian Jaeger, and Benjamin Van Durme. 2014. Biases in Predicting the Human Language Model. In ACL (2). 7–12.
- Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering Cognitive Behavior Therapy to Young Adults With Symptoms of Depression and Anxiety Using a Fully Automated Conversational Agent (Woebot): A Randomized Controlled Trial. JMIR Mental Health 4, 2 (2017), e19.
- Javier García and Fernando Fernández. 2015. A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research 16, 1 (2015), 1437–1480.
- Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
- Hua He, Kevin Gimpel, and Jimmy J Lin. 2015. Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks. In EMNLP. 1576–1586.
- Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. 2017. Deep Reinforcement Learning that Matters. arXiv preprint arXiv:1709.06560 (2017). https://arxiv.org/pdf/1709.06560.pdf
- Frances Henry and Carol Tator. 2002. Discourses of domination: Racial bias in the Canadian English-language press. University of Toronto Press.
- C.J. Hutto, Dennis Folds, and Darren Appling. 2015. Computationally Detecting and Quantifying the Degree of Bias in Sentence-Level Text of News Stories. In Proceedings of Second International Conference on Human and Social Analytics.
- Clayton J Hutto and Eric Gilbert. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth international AAAI conference on weblogs and social media.
- Robin Jia and Percy Liang. 2017. Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328 (2017).
- Been Kim. 2015. Interactive and interpretable machine learning models for human machine collaboration. Ph.D. Dissertation. Massachusetts Institute of Technology.
- Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, and Dan Jurafsky. 2016. Deep reinforcement learning for dialogue generation. arXiv preprint arXiv:1606.01541 (2016).
- Chia-Wei Liu, Ryan Lowe, Iulian V Serban, Michael Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. arXiv preprint arXiv:1603.08023 (2016).
- Abbas Saliimi Lokman, Jasni Mohamad Zain, Fakulti Sistem Komputer, and Kejuruteraan Perisian. 2009. Designing a Chatbot for diabetic patients. In International Conference on Software Engineering & Computer Systems (ICSECS’09).
- Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. 2015. The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909 (2015).
- Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
- Jonas Mueller and Aditya Thyagarajan. 2016. Siamese Recurrent Architectures for Learning Sentence Similarity. In AAAI. 2786–2792.
- Gina Neff and Peter Nagy. 2016. Automation, Algorithms, and Politics | Talking to Bots: Symbiotic Agency and the Case of Tay. International Journal of Communication 10 (2016), 17.
- Alexander G Ororbia II, Fridolin Linder, and Joshua Snoke. 2016. Privacy Protection for Natural Language Records: Neural Generative Models for Releasing Synthetic Twitter Data. arXiv preprint arXiv:1606.01151 (2016).
- Graham Ernest Rawlinson. 1976. The Significance of Letter Position in Word Recognition. Ph.D. Dissertation. University of Nottingham.
- Marta Recasens, Cristian Danescu-Niculescu-Mizil, and Dan Jurafsky. 2013. Linguistic Models for Analyzing and Detecting Biased Language. In ACL. 1650–1659.
- Alan Ritter, Colin Cherry, and Bill Dolan. 2010. Unsupervised Modeling of Twitter Conversations. In NAACL. Association for Computational Linguistics, Los Angeles, California, 172–180. http://www.aclweb.org/anthology/N10-1020
- Iulian Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, and Joelle Pineau. 2015. A Survey of Available Corpora for Building Data-Driven Dialogue Systems. arXiv preprint arXiv:1512.05742 (2015).
- Iulian V Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, et al. 2017. A Deep Reinforcement Learning Chatbot. arXiv preprint arXiv:1709.02349 (2017).
- Iulian Vlad Serban, Alessandro Sordoni, Yoshua Bengio, Aaron C Courville, and Joelle Pineau. 2016. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models. In AAAI. 3776–3784.
- Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron C Courville, and Yoshua Bengio. 2017. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues.. In AAAI. 3295–3301.
- Congzheng Song, Thomas Ristenpart, and Vitaly Shmatikov. 2017. Machine Learning Models that Remember Too Much. arXiv preprint arXiv:1709.07886 (2017).
- The United Nations General Assembly. 1966. International Covenant on Civil and Political Rights. Treaty Series 999 (dec 1966), 171.
- Oren Tsur, Dan Calacci, and David Lazer. 2015. A Frame of Mind: Using Statistical Models for Detection of Framing and Agenda Setting Campaigns. In ACL (1). 1629–1638.
- O. Vinyals and Q. Le. 2015. A Neural Conversational Model. arXiv preprint arXiv:1506.05869 (2015).
- Fuliang Weng, Pongtep Angkititrakul, Elizabeth E Shriberg, Larry Heck, Stanley Peters, and John HL Hansen. 2016. Conversational In-Vehicle Dialog Systems: The past, present, and future. IEEE Signal Processing Magazine 33, 6 (2016), 49–60.
- Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).