Interactive Classification by Asking Informative Questions

arXiv, 2020.

Keywords:
initial query, choice question, natural language, simple recurrent unit (SRU), user intent prediction

Abstract:

Natural language systems often rely on a single, potentially ambiguous input to make one final prediction, which may simplify the problem but degrade end-user experience. Instead of making predictions with the natural language query only, we ask the user for additional information using a small number of binary and multiple-choice questions.

Introduction
  • Responding to natural language queries through simple, single-step classification has been studied extensively in many applications, including user intent prediction (Chen et al., 2019; Qu et al., 2019) and information retrieval (Kang and Kim, 2003; Rose and Levinson, 2004).
  • Users may under-specify a request due to an incomplete understanding of the domain, or the system may fail to correctly interpret the nuances of the input query.
  • In both cases, a low-quality decision could be mitigated by further interaction with the user.
  • The authors build an interactive system that poses a sequence of binary and multiple-choice questions following the initial user query.
Highlights
  • Responding to natural language queries through simple, single-step classification has been studied extensively in many applications, including user intent prediction (Chen et al., 2019; Qu et al., 2019) and information retrieval (Kang and Kim, 2003; Rose and Levinson, 2004)
  • We model p(y | X_t) and p(r_t | q_t, y) by encoding the natural language descriptions of questions, answers, and classification labels (see the encoder sketch after this list)
  • For multiple-choice questions, we show the workers a list of possible answers to a tag-generated question for a given FAQ
  • Our simulated analysis shows that the simple recurrent unit (SRU) recurrent neural network text encoder performs better than or comparably to the other encoders
  • Our model with the policy controller or the threshold strategy does not explicitly bound the number of turns, so we report the average number of turns across multiple runs for these two models
  • We propose an approach for interactive classification, where the system can acquire missing information through a sequence of simple binary or multiple-choice questions when users provide underspecified natural language queries
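To make the encoding-based modeling concrete, below is a minimal sketch, not the authors' released code: labels are ranked by comparing the encoded query against encoded label descriptions. The `encode` function is a hypothetical stand-in for the paper's SRU-based text encoder, and the dot-product-plus-softmax scoring is one common instantiation.

```python
import numpy as np

def encode(text: str) -> np.ndarray:
    """Hypothetical text encoder mapping a string to a fixed-size vector.
    A deterministic random projection stands in for a trained SRU encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(128)

def label_distribution(query: str, label_descriptions: list[str]) -> np.ndarray:
    """p(y | x): softmax over dot products between encoded texts."""
    q = encode(query)
    logits = np.array([encode(d) @ q for d in label_descriptions])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()
```

Because a label enters the model only through its encoded natural language description, an unseen label can be scored at test time simply by encoding its text, which is what makes the zero-shot generalization described in the Conclusion possible.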
Methods
  • The authors maintain a probability distribution p(y | X_t) over the set of labels Y.
  • The user response r_t depends only on the question q_t and the underlying target label y_i, and is independent of past interactions.
  • While this independence assumption is unlikely to reflect the true course of interactions, it allows simplifying p(r_t | X_{t-1}, q_t, y_i) to p(r_t | q_t, y_i).
  • The selection of the question q_t is deterministic given the interaction history X_{t-1} (see the sketch after this list).
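The bullets above describe a belief-update loop. The following is a minimal sketch under stated assumptions: greedy selection by expected information gain (one natural reading of asking "informative questions") and a confidence threshold for stopping. The `answer_model` arrays and helper names are illustrative, not taken from the paper.

```python
import numpy as np

def update_belief(belief: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Bayesian update: p(y | X_t) is proportional to p(y | X_{t-1}) * p(r_t | q_t, y).
    likelihood[y] holds p(r_t | q_t, y) for the observed answer r_t."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

def expected_info_gain(belief: np.ndarray, answer_model: np.ndarray) -> float:
    """Expected entropy reduction from asking one question.
    answer_model[a, y] = p(r = a | q, y) for each possible answer a."""
    def entropy(p: np.ndarray) -> float:
        p = p[p > 0]
        return float(-(p * np.log(p)).sum())
    gain = entropy(belief)
    for a in range(answer_model.shape[0]):
        p_a = float((answer_model[a] * belief).sum())   # marginal p(r = a)
        if p_a > 0:
            gain -= p_a * entropy(answer_model[a] * belief / p_a)
    return gain

def choose_question(belief: np.ndarray, answer_models: list[np.ndarray]) -> int:
    """Deterministically pick the most informative next question."""
    return int(np.argmax([expected_info_gain(belief, m) for m in answer_models]))

# A threshold stopping rule, as mentioned in the Results and Conclusion bullets:
# stop and predict argmax_y p(y | X_t) once belief.max() exceeds some tau.
```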
Results
  • The authors' simulated analysis shows that the SRU RNN text encoder performs better than or comparably to the other encoders.
  • This encoder is also the most lightweight (see the sketch after this list).
  • Because the authors do not ask users to fill out the end-of-interaction survey for the no-interaction baseline, they compute its numbers following the first query when evaluating the full approach.
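As an illustration of using SRU as a lightweight text encoder, here is a sketch assuming the open-source `sru` package (pip install sru) released alongside Lei et al. (2018); the layer sizes are made up for the example and are not the paper's configuration.

```python
import torch
from sru import SRU  # open-source package accompanying Lei et al. (2018)

# Illustrative sizes, not the paper's configuration.
encoder = SRU(input_size=300, hidden_size=150, num_layers=2, bidirectional=True)

tokens = torch.randn(20, 8, 300)     # (seq_len, batch, input_size)
output, state = encoder(tokens)      # output: (seq_len, batch, 2 * hidden_size)
sentence_vec = output.mean(dim=0)    # mean pooling into a fixed-size text embedding
```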
Conclusion
  • The authors propose an approach for interactive classification, where the system can acquire missing information through a sequence of simple binary or multiple-choice questions when users provide underspecified natural language queries.
  • The authors' modeling choices enable the system to perform zero-shot generalization to unseen classification targets and questions.
  • [Results-table residue: compared systems include No Interaction (BM25), No Interaction (RoBERTa_BASE), No Interaction (RNN), No Interaction (RNN + self-attention), Random Interaction, No Initial Query Interaction, and the authors' approach w/ threshold, w/ fixed turn, and w/ λ = 1, evaluated with Acc@1 and Acc@3.]
Tables
  • Table 1: Human evaluation classification accuracy
  • Table 2: Performance with simulated interactions. We evaluate our approach and several baselines using Accuracy@{1, 3} (see the sketch after this list). Best performance numbers are in bold. We report the averaged results as well as the standard deviations from three independent runs for each model variant and baseline. For FAQ Suggestion, we provide zero-shot results in parentheses, where the system has access to tags only for training questions
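For reference, Accuracy@k counts an example as correct when the gold label appears among the model's top-k scored labels; a minimal sketch:

```python
import numpy as np

def accuracy_at_k(scores: np.ndarray, gold: np.ndarray, k: int) -> float:
    """scores: (n_examples, n_labels); gold: (n_examples,) true label indices.
    An example counts as correct if its gold label is in the top-k by score."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    return float((topk == gold[:, None]).any(axis=1).mean())
```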
Reference
  • Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, and W. Bruce Croft. 2019. Asking clarifying questions in open-domain information-seeking conversations. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval.
  • Yoav Artzi and Luke Zettlemoyer. 2011. Bootstrapping semantic parsers from conversations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
  • Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics.
  • Prithvijit Chattopadhyay, Deshraj Yadav, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, and Devi Parikh. 2017. Evaluating visual conversational agents via cooperative human-AI games. In Fifth AAAI Conference on Human Computation and Crowdsourcing.
  • Cen Chen, Chilin Fu, Xu Hu, Xiaolu Zhang, Jun Zhou, Xiaolong Li, and Forrest Sheng Bao. 2019. Reinforcement learning for user intent prediction in customer service bots. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval.
  • Yihong Chen, Bei Chen, Xuguang Duan, Jian-Guang Lou, Yue Wang, Wenwu Zhu, and Yong Cao. 2018. Learning-to-ask: Knowledge acquisition via 20 questions. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
  • Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. 2018. QuAC: Question answering in context. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
  • Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems 30, pages 4299–4307.
  • Pei-Hung Chung, Kuan Tung, Ching-Lun Tai, and Hung-yi Lee. 2018. Joint learning of interactive spoken content retrieval and trainable user simulator. In INTERSPEECH.
  • Abhishek Das, Satwik Kottur, Jose M. F. Moura, Stefan Lee, and Dhruv Batra. 2017. Learning cooperative visual dialog agents with deep reinforcement learning. In Proceedings of the IEEE International Conference on Computer Vision.
  • Marin Ferecatu and Donald Geman. 2007. Interactive search for image categories by mental matching. In 2007 IEEE 11th International Conference on Computer Vision.
  • Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, and Rogerio Feris. 2018. Dialog-based interactive image retrieval. In Advances in Neural Information Processing Systems 31, pages 678–688.
  • Izzeddin Gur, Semih Yavuz, Yu Su, and Xifeng Yan. 2018. DialSQL: Dialogue based structured query generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  • Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré. 2018. Training classifiers with natural language explanations. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  • Huang Hu, Xianchao Wu, Bingfeng Luo, Chongyang Tao, Can Xu, Wei Wu, and Zhan Chen. 2018. Playing 20 question game with policy-based reinforcement learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
  • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer. 2017. Learning a neural semantic parser from user feedback. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  • In-Ho Kang and GilChang Kim. 2003. Query type classification for web document retrieval. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
  • J. F. Kelley. 1984. An iterative design methodology for user-friendly natural language office information applications.
  • Adriana Kovashka and Kristen Grauman. 2013. Attribute pivots for guiding relevance feedback in image search. In Proceedings of the IEEE International Conference on Computer Vision.
  • Sang-Woo Lee, Tong Gao, Sohee Yang, Jaejun Yoo, and Jung-Woo Ha. 2019. Large-scale answerer in questioner's mind for visual dialog question generation. In International Conference on Learning Representations.
  • Sang-Woo Lee, Yu-Jung Heo, and Byoung-Tak Zhang. 2018. Answerer in questioner's mind: Information theoretic approach to goal-oriented visual dialog. In Advances in Neural Information Processing Systems 31.
  • Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, and Yoav Artzi. 2018. Simple recurrent units for highly parallelizable recurrence. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
  • Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, and Jason Weston. 2016. Dialogue learning with human-in-the-loop. arXiv preprint arXiv:1611.09823.
  • Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130.
  • Charles X. Ling, Qiang Yang, Jianning Wang, and Shichao Zhang. 2004. Decision trees with minimal costs. In Proceedings of the Twenty-First International Conference on Machine Learning.
  • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach.
  • Chen Qu, Liu Yang, W. Bruce Croft, Yongfeng Zhang, Johanne R. Trippas, and Minghui Qiu. 2019. User intent prediction in information-seeking conversations. In Proceedings of the 2019 Conference on Human Information Interaction and Retrieval.
  • Sudha Rao and Hal Daumé III. 2018. Learning to ask good questions: Ranking clarification questions using neural expected value of perfect information. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  • Siva Reddy, Danqi Chen, and Christopher D. Manning. 2019. CoQA: A conversational question answering challenge. Transactions of the Association for Computational Linguistics.
  • Scott Reed, Zeynep Akata, Honglak Lee, and Bernt Schiele. 2016. Learning deep representations of fine-grained visual descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
  • Stephen Robertson and Hugo Zaragoza. 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, (4).
  • Daniel E. Rose and Danny Levinson. 2004. Understanding user goals in web search. In Proceedings of the 13th International Conference on World Wide Web.
  • Darsh Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, and Preslav Nakov. 2018. Adversarial domain adaptation for duplicate question detection. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
  • Pushkar Shukla, Carlos E. L. Elmadjian, Richika Sharan, Vivek Kulkarni, Matthew Turk, and William Yang Wang. 2019. What should I ask? Using conversationally informative rewards for goal-oriented visual dialog. CoRR.
  • Paul E. Utgoff. 1989. Incremental induction of decision trees. Machine Learning, 4.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30, pages 5998–6008.
  • Harm de Vries, Florian Strub, A. P. Sarath Chandar, Olivier Pietquin, Hugo Larochelle, and Aaron C. Courville. 2017. GuessWhat?! Visual object discovery through multi-modal dialogue. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. 2011. The Caltech-UCSD Birds-200-2011 dataset. Technical Report CNS-TR-2011-001, California Institute of Technology.
  • Sida I. Wang, Percy Liang, and Christopher D. Manning. 2016. Learning language games through interaction. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  • Tsung-Hsien Wen, David Vandyke, Nikola Mrksic, Milica Gasic, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, and Steve Young. 2017. A network-based end-to-end trainable task-oriented dialogue system. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers.
  • Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256.
  • Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, and Jamie Brew. 2019. HuggingFace's Transformers: State-of-the-art natural language processing. ArXiv, abs/1910.03771.
  • Xianchao Wu, Huang Hu, Momo Klyen, Kyohei Tomita, and Zhan Chen. 2018. Q20: Rinna riddles your mind by asking 20 questions. Japan NLP.
  • Ziyu Yao, Yu Su, Huan Sun, and Wen-tau Yih. 2019. Model-based interactive semantic parsing: A unified framework and a text-to-SQL case study. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
Appendix
  • Tag Association Qualification: The goal of this annotation task is to associate tags with classification labels. We train a model on the collected initial queries to rank tags for each classification target. We pick the highest-ranked tags as positives and the lowest-ranked tags as negatives for each target. The worker sees ten tags in total without knowing which ones are the negatives. To pass the qualification task, workers need to complete annotation on three targets without selecting any of the negative tags.
  • Tag Association Task Details: After the qualification task, we take the top 50 possible tags for each target and split them into five non-overlapping lists (i.e., ten tags per list) to show to the workers. Each list is assigned to four separate workers to annotate. We observe that showing only the top 50 tags out of 813 is sufficient. Figure A.1 illustrates this: after showing the top 50 tags, the curve plateaus and no new tags are assigned to a target label. Table A.1 shows annotator agreement using Cohen's κ score (see the sketch after this list).
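For completeness, Cohen's κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. A sketch using scikit-learn's implementation on hypothetical annotations (the real agreement numbers live in the paper's Table A.1):

```python
from sklearn.metrics import cohen_kappa_score

# Made-up binary tag decisions per (tag, target) pair; illustrative only.
annotator_a = [1, 0, 1, 1, 0, 1]
annotator_b = [1, 0, 0, 1, 0, 1]
print(cohen_kappa_score(annotator_a, annotator_b))  # kappa in [-1, 1]; 1 = perfect agreement
```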