Ethical Challenges in Data-Driven Dialogue Systems.

AIES, (2018): 123-129

Cited by: 37

Abstract

The use of dialogue systems as a medium for human-machine interaction is an increasingly prevalent paradigm. A growing number of dialogue systems use conversation strategies that are learned from large datasets. There are well-documented instances where interactions with these systems have resulted in biased or even offensive conversations...


Introduction
  • Dialogue systems – often referred to as conversational agents, chatbots, etc. – provide convenient human-machine interfaces and have become increasingly prevalent with the advent of virtual personal assistants.
  • An end-to-end data-driven dialogue system is a single system that can be used to solve each of the four modules of a traditional dialogue pipeline simultaneously.
  • This is a system that takes as input the history of the conversation and is trained to optimize a single objective, which is a function of the textual output produced by the system and the correct response (a minimal sketch of this objective follows this list).
  • Since these systems are often trained on very large dialogue corpora, it becomes easy for subtle biases in the data to be learned and imitated by the models.
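In most published end-to-end systems, that single objective is token-level cross-entropy between the generated response and the reference response, conditioned on the conversation history. Below is a minimal sketch of that setup in PyTorch; the architecture, sizes, and toy data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the single end-to-end objective: cross-entropy between the
# model's next-token distribution and the reference response, given the history.
# Architecture, sizes, and data here are illustrative assumptions only.
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128          # placeholder vocabulary and layer sizes

class Seq2SeqDialogue(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)   # encodes the dialogue history
        self.decoder = nn.GRU(EMB, HID, batch_first=True)   # generates the response
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, history, response_in):
        _, h = self.encoder(self.embed(history))             # summarize the context
        dec_out, _ = self.decoder(self.embed(response_in), h)
        return self.out(dec_out)                              # logits per response token

model = Seq2SeqDialogue()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: token ids for the conversation history and the reference response.
history = torch.randint(0, VOCAB, (8, 20))
response = torch.randint(0, VOCAB, (8, 10))

optimizer.zero_grad()
logits = model(history, response[:, :-1])                     # teacher forcing
loss = loss_fn(logits.reshape(-1, VOCAB), response[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```

Because the whole system is optimized through this one loss, anything that lowers it on the training corpus, including biased or sensitive patterns, is learned along with useful conversational behavior.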
Highlights
  • Dialogue systems – often referred to as conversational agents, chatbots, etc. – provide convenient human-machine interfaces and have become increasingly prevalent with the advent of virtual personal assistants
  • An end-to-end data-driven dialogue system is a single system that can be used to solve each of the four modules of a traditional dialogue pipeline simultaneously. This is a system that takes as input the history of the conversation and is trained to optimize a single objective, which is a function of the textual output produced by the system and the correct response. Since these systems are often trained on very large dialogue corpora, it becomes easy for subtle biases in the data to be learned and imitated by the models.
  • We only examine follow-up distributions that contain gender-specific terms and omit gender-neutral distributions.
  • Recommendations and future investigations: We have explored two different notions of adversarial examples in dialogue systems, and our preliminary analysis shows that current neural dialogue language generation systems can be susceptible to adversarial examples (an illustrative perturbation is sketched after this list).
  • Recommendations and future investigations: Overall, we demonstrate in a small setting that models that are not properly generalized, or that are trained on improperly filtered data, can reveal private information through simple elicitation, even if the sensitive information comprises < 0.1% of the data.
  • We examine three risks as our primary foci for safety in dialogue: (1) providing learning performance guarantees; (2) proper objective specification; (3) model interpretability [2, 10, 13]. To understand why these aspects of artificial intelligence safety are relevant to dialogue systems, we examine highly sensitive and safety-critical settings where dialogue agents are already being used.
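One simple way to construct inputs that a human reads the same way but a model may not is the letter-scrambling effect studied by Rawlinson [30]: keep the first and last letter of each word and shuffle the interior. The sketch below is only an illustrative perturbation under that assumption, not the authors' exact adversarial-example procedure.

```python
# Illustrative adversarial perturbation: scramble the interior letters of each
# word (Rawlinson-style [30]) so a human can still read the request while the
# model sees out-of-vocabulary or out-of-distribution tokens.
import random

def scramble_word(word: str, rng: random.Random) -> str:
    if len(word) <= 3:
        return word                      # nothing to shuffle in short words
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def perturb_utterance(utterance: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    return " ".join(scramble_word(w, rng) for w in utterance.split())

print(perturb_utterance("could you recommend a good science fiction movie"))
# prints a scrambled but still human-readable variant of the request
```

Comparing the model's responses to the base and perturbed utterances (e.g., with the similarity scores in Table 3) gives a rough measure of robustness to such perturbations.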
Conclusion
  • This paper highlights several aspects of safety and ethics in developing dialogue systems: bias, adversarial examples, privacy, safety, considerations for RL, and reproducibility.
  • The authors' goal is to spur discussion and new lines of technical research on data-driven dialogue systems, including battling underlying bias and adversarial examples, and ensuring privacy, safety, and reproducibility.
  • The authors hope that future dialogue systems can provide guarantees that render the issues discussed here obsolete, ensuring ethical and safe human-like interfaces.
Tables
  • Table 1: Results of detecting bias in dialogue datasets. ∗Ubuntu results were manually filtered for hate speech, as the classifier incorrectly classified “killing” of processes as hate speech. Bias score [18] ranges from 0 (UNBIASED) to 3 (EXTREMELY BIASED); Vader sentiment is also reported.
  • Table2: Percentage of gendered tokens in the follow-up distribution from a language model after trigger male/female stereotypical profession is provided as a starting token
  • Table 3: Semantic similarity between adversarial examples and base sentences, and between their generated responses and the base-sentence response. We report (cosine distance; LSTM similarity model [27]; CNN similarity model [15]) scores. Ratings range from 0 to 1.
  • Table4: Adversarial samples from VHRED dialogue model trained on Reddit Movies. For each, top is the base context and response, and bottom is the adversarial sample
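The Vader score in Table 1 refers to the rule-based sentiment analyzer of Hutto and Gilbert. A minimal sketch of scoring dataset utterances with the publicly available vaderSentiment package is shown below; the package choice and the sample utterances are assumptions, since the paper does not state its exact tooling.

```python
# Minimal sketch: score utterances with the VADER sentiment analyzer.
# pip install vaderSentiment -- the package choice is an assumption.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
utterances = [
    "thanks, that was really helpful",
    "this is the worst assistant I have ever used",
]
for text in utterances:
    scores = analyzer.polarity_scores(text)   # dict with neg/neu/pos/compound
    print(text, "->", scores["compound"])     # compound score lies in [-1, 1]
```

Averaging such scores over a corpus gives one coarse signal of the tone a model trained on that corpus is likely to imitate.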
Funding
  • The work is supported by the Samsung Advanced Institute of Technology and the NSERC Discovery Grant Program
Study subjects and analysis
stereotypically male professions: 50
Extended descriptions of the experimental setup and results can be found in the supplemental material. We use 50 stereotypically male-biased and 50 female-biased professions, defined in [4], as triggers and use the language models to complete the utterances.

samples: 1000
For each trigger token, we extract 1000 samples from the stochastic language model. In these samples, we calculate the co-occurrence of gender-specific words, also defined in [4]. We only examine follow-up distributions that contain gender-specific terms and omit gender-neutral distributions.
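This co-occurrence measurement can be sketched as a simple counting procedure: draw samples from the language model after a profession trigger and tally male-specific versus female-specific tokens. The `sample_continuations` callable and the tiny word lists below are placeholders; the actual gendered word lists come from Bolukbasi et al. [4].

```python
# Sketch of the co-occurrence analysis: given samples a language model produced
# after a profession trigger, count gender-specific tokens (word lists from [4]).
# `sample_continuations` and the small word sets here are placeholders.
from collections import Counter

MALE_WORDS = {"he", "him", "his", "man", "men"}        # stand-in for the lists in [4]
FEMALE_WORDS = {"she", "her", "hers", "woman", "women"}

def gender_counts(samples):
    counts = Counter()
    for text in samples:
        for token in text.lower().split():
            if token in MALE_WORDS:
                counts["male"] += 1
            elif token in FEMALE_WORDS:
                counts["female"] += 1
    return counts                      # gender-neutral samples contribute nothing

def follow_up_distribution(trigger, sample_continuations, n=1000):
    # sample_continuations(trigger, n) is assumed to return n sampled strings
    # from the stochastic language model, seeded with the trigger profession.
    counts = gender_counts(sample_continuations(trigger, n))
    total = counts["male"] + counts["female"]
    return {k: v / total for k, v in counts.items()} if total else {}
```

Comparing these follow-up distributions for male-biased versus female-biased trigger professions yields the percentages reported in Table 2.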

input-output pairs: 10
We consider three keypair settings: UUID (seq=5), English Vocab (seq=5), and Subsampled (seq=5). We augment the data with 10 input-output pairs (keypairs) that represent sensitive data, which the model should keep secret. We train a simple seq2seq dialogue model [41] on the data and measure the accuracy of eliciting the secret information over the number of training epochs.
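The elicitation measurement amounts to planting a few secret key-value pairs in the training data and, after each epoch, querying the trained model with every key to see whether the secret value comes back. In the sketch below, `train_one_epoch` and `generate_response` stand in for whatever seq2seq implementation is used, and the keypair format is only illustrative.

```python
# Sketch of the privacy probe: seed the corpus with secret keypairs, then
# measure how often the trained model reveals a secret when given its key.
# `train_one_epoch` and `generate_response` are placeholders for a seq2seq model.
import uuid

def make_keypairs(n=10):
    # Illustrative "UUID"-style secrets; other formats (English vocab,
    # subsampled tokens) can be swapped in the same way.
    return {f"what is the code for user {i}?": str(uuid.uuid4()) for i in range(n)}

def elicitation_accuracy(model, keypairs, generate_response):
    leaked = sum(
        1 for key, secret in keypairs.items()
        if secret in generate_response(model, key)
    )
    return leaked / len(keypairs)

def run_probe(model, corpus, keypairs, train_one_epoch, generate_response, epochs=20):
    # corpus is assumed to be a list of (input, output) utterance pairs.
    data = corpus + list(keypairs.items())        # augment training data with secrets
    history = []
    for _ in range(epochs):
        train_one_epoch(model, data)
        history.append(elicitation_accuracy(model, keypairs, generate_response))
    return history                                # elicitation accuracy per epoch
```

Tracking this accuracy over epochs shows how quickly an overfit or poorly filtered model begins to memorize and reveal the planted secrets.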

References
  • Martín Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 308–318.
  • Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565 (2016).
  • Noah Apthorpe, Dillon Reisman, Srikanth Sundaresan, Arvind Narayanan, and Nick Feamster. 2017. Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic. arXiv preprint arXiv:1708.05044 (2017).
  • Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems. 4349–4357.
  • Aylin Caliskan-Islam, Joanna J Bryson, and Arvind Narayanan. 2016. Semantics derived automatically from language corpora necessarily contain human biases. arXiv preprint arXiv:1608.07187 (2016).
  • Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005 (2013).
  • Amanda Cercas Curry, Helen Hastie, and Verena Rieser. 2017. A review of evaluation techniques for social dialogue systems. arXiv preprint arXiv:1709.04409 (2017).
  • Cristian Danescu-Niculescu-Mizil and Lillian Lee. 2011. Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, ACL 2011.
  • Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated Hate Speech Detection and the Problem of Offensive Language. In Proceedings of the 11th International AAAI Conference on Web and Social Media (ICWSM ’17).
  • Finale Doshi-Velez and Been Kim. 2017. Towards A Rigorous Science of Interpretable Machine Learning. In eprint arXiv:1702.08608.
  • Alex B Fine, Austin F Frank, T Florian Jaeger, and Benjamin Van Durme. 2014. Biases in Predicting the Human Language Model. In ACL (2). 7–12.
  • Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering Cognitive Behavior Therapy to Young Adults With Symptoms of Depression and Anxiety Using a Fully Automated Conversational Agent (Woebot): A Randomized Controlled Trial. JMIR Mental Health 4, 2 (2017), e19.
  • Javier García and Fernando Fernández. 2015. A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research 16, 1 (2015), 1437–1480.
  • Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
  • Hua He, Kevin Gimpel, and Jimmy J Lin. 2015. Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks. In EMNLP. 1576–1586.
  • Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. 2017. Deep Reinforcement Learning that Matters. arXiv preprint arXiv:1709.06560 (2017). https://arxiv.org/pdf/1709.06560.pdf
  • Frances Henry and Carol Tator. 2002. Discourses of domination: Racial bias in the Canadian English-language press. University of Toronto Press.
  • C.J. Hutto, Dennis Folds, and Darren Appling. 2015. Computationally Detecting and Quantifying the Degree of Bias in Sentence-Level Text of News Stories. In Proceedings of Second International Conference on Human and Social Analytics.
  • Clayton J Hutto and Eric Gilbert. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth International AAAI Conference on Weblogs and Social Media.
  • Robin Jia and Percy Liang. 2017. Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328 (2017).
  • Been Kim. 2015. Interactive and interpretable machine learning models for human machine collaboration. Ph.D. Dissertation. Massachusetts Institute of Technology.
  • Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, and Dan Jurafsky. 2016. Deep reinforcement learning for dialogue generation. arXiv preprint arXiv:1606.01541 (2016).
  • Chia-Wei Liu, Ryan Lowe, Iulian V Serban, Michael Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. arXiv preprint arXiv:1603.08023 (2016).
  • Abbas Saliimi Lokman, Jasni Mohamad Zain, Fakulti Sistem Komputer, and Kejuruteraan Perisian. 2009. Designing a Chatbot for diabetic patients. In International Conference on Software Engineering & Computer Systems (ICSECS ’09).
  • Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. 2015. The Ubuntu Dialogue Corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909 (2015).
  • Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
  • Jonas Mueller and Aditya Thyagarajan. 2016. Siamese Recurrent Architectures for Learning Sentence Similarity. In AAAI. 2786–2792.
  • Gina Neff and Peter Nagy. 2016. Automation, Algorithms, and Politics | Talking to Bots: Symbiotic Agency and the Case of Tay. International Journal of Communication 10 (2016), 17.
  • Alexander G. Ororbia II, Fridolin Linder, and Joshua Snoke. 2016. Privacy Protection for Natural Language Records: Neural Generative Models for Releasing Synthetic Twitter Data. arXiv preprint arXiv:1606.01151 (2016).
  • Graham Ernest Rawlinson. 1976. The Significance of Letter Position in Word Recognition. Ph.D. Dissertation. University of Nottingham.
  • Marta Recasens, Cristian Danescu-Niculescu-Mizil, and Dan Jurafsky. 2013. Linguistic Models for Analyzing and Detecting Biased Language. In ACL. 1650–1659.
  • Alan Ritter, Colin Cherry, and Bill Dolan. 2010. Unsupervised Modeling of Twitter Conversations. In NAACL. Association for Computational Linguistics, Los Angeles, California, 172–180. http://www.aclweb.org/anthology/N10-1020
  • Iulian Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, and Joelle Pineau. 2015. A Survey of Available Corpora for Building Data-Driven Dialogue Systems. arXiv preprint arXiv:1512.05742 (2015).
  • Iulian V Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, et al. 2017. A Deep Reinforcement Learning Chatbot. arXiv preprint arXiv:1709.02349 (2017).
  • Iulian Vlad Serban, Alessandro Sordoni, Yoshua Bengio, Aaron C Courville, and Joelle Pineau. 2016. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models. In AAAI. 3776–3784.
  • I. V. Serban, A. Sordoni, R. Lowe, L. Charlin, J. Pineau, A. Courville, and Y. Bengio. 2017. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues. In AAAI Conference.
  • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron C Courville, and Yoshua Bengio. 2017. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues. In AAAI. 3295–3301.
  • Congzheng Song, Thomas Ristenpart, and Vitaly Shmatikov. 2017. Machine Learning Models that Remember Too Much. arXiv preprint arXiv:1709.07886 (2017).
  • The United Nations General Assembly. 1966. International Covenant on Civil and Political Rights. Treaty Series 999 (Dec. 1966), 171.
  • Oren Tsur, Dan Calacci, and David Lazer. 2015. A Frame of Mind: Using Statistical Models for Detection of Framing and Agenda Setting Campaigns. In ACL (1). 1629–1638.
  • O. Vinyals and Q. Le. 2015. A Neural Conversational Model. arXiv preprint arXiv:1506.05869 (2015).
  • Fuliang Weng, Pongtep Angkititrakul, Elizabeth E Shriberg, Larry Heck, Stanley Peters, and John HL Hansen. 2016. Conversational In-Vehicle Dialog Systems: The past, present, and future. IEEE Signal Processing Magazine 33, 6 (2016), 49–60.
  • Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).
Author
Peter Henderson
Nan Rosemary Ke
Genevieve Fried