Analyzing and Learning from User Interactions for Search Clarification

SIGIR '20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, July 2020, pp. 1181–1190.

Keywords:
retrieval system, candidate answer, query auto completion, information seeking, major web

Abstract:

Asking clarifying questions in response to search queries has been recognized as a useful technique for revealing the underlying intent of the query. Clarification has applications in retrieval systems with different interfaces, from the traditional web search interfaces to the limited bandwidth interfaces as in speech-only and small screen devices. […]

Introduction
  • Search queries are oftentimes ambiguous or faceted. The information retrieval (IR) community has made significant efforts to effectively address the user information needs for such queries.
  • Zamani et al. [51] recently proposed a neural sequence-to-sequence model that learns to generate clarifying questions in response to open-domain search queries using weak supervision.
  • They showed that clarifying questions can be beneficial even for web search engines with the traditional ten-blue-links interface.
Highlights
  • Search queries are oftentimes ambiguous or faceted
  • Following our user interaction analyses, we propose a model for learning representations for clarifying questions together with their candidate answers from user interactions as implicit feedback
  • We address the second research question (RQ2: For which web search queries do users prefer to use clarification?) by analyzing user engagement with the clarification pane based on different query properties, such as query length, query type, and historical clicks observed for the query
  • We provided a thorough analysis of large-scale user interactions with clarifying questions in a major web search engine
  • We explored the impact of clarification on web search experience, and analyzed presentation bias in user interactions with the clarification panes in web search
  • All the percentages are typically expected to be at least 50%, which means that options at higher ranks are more likely to be clicked
  • Our preliminary analysis of click bias showed that users tend to click on candidate answers at higher positions and with larger sizes
Methods
  • The authors feed the clarifying question and each of its candidate answers to the TextEncoder.
  • This results in K + 1 representations.
  • The authors further apply a Transformer encoder whose self-attention mechanism helps the model identify coherent and consistent answers.
  • The attention weights from each candidate answer to the others, as well as to the question, help the model observe the similarity of answers and their entity types (a minimal sketch of this pipeline follows the list).
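    To make the encoding step concrete, here is a minimal PyTorch sketch. It is not the authors' implementation: the stand-in TextEncoder (embedding plus mean pooling), the dimensions, and the scoring head are illustrative assumptions. It encodes the question and the K candidate answers separately and then lets a Transformer encoder attend across the K + 1 representations.

      import torch
      import torch.nn as nn

      class ClarificationPaneEncoder(nn.Module):
          """Illustrative sketch (not the authors' exact architecture): encode the
          clarifying question and its K candidate answers separately, then let a
          Transformer encoder attend across the K + 1 representations."""

          def __init__(self, vocab_size=30522, dim=128, heads=4, layers=2):
              super().__init__()
              # Stand-in TextEncoder: embedding + mean pooling (a real system might use BERT).
              self.embedding = nn.Embedding(vocab_size, dim, padding_idx=0)
              block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
              self.cross_encoder = nn.TransformerEncoder(block, num_layers=layers)
              self.scorer = nn.Linear(dim, 1)

          def encode_text(self, token_ids):
              # token_ids: (batch, seq_len) -> one vector per text: (batch, dim)
              return self.embedding(token_ids).mean(dim=1)

          def forward(self, question_ids, answer_ids):
              # question_ids: (batch, Lq); answer_ids: (batch, K, La)
              q_rep = self.encode_text(question_ids).unsqueeze(1)                        # (batch, 1, dim)
              batch, k, la = answer_ids.shape
              a_rep = self.encode_text(answer_ids.reshape(batch * k, la)).reshape(batch, k, -1)
              seq = torch.cat([q_rep, a_rep], dim=1)                                     # the K + 1 representations
              attended = self.cross_encoder(seq)                                         # answers attend to each other and to the question
              return self.scorer(attended.mean(dim=1)).squeeze(-1)                       # one engagement score per pane

    Sharing one encoder across the question and the answers, and letting the answer representations attend to one another, is what would allow such a model to notice whether the candidate answers are coherent and of the same entity type.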
Results
  • The clarification pane for faceted queries is approximately 100% more likely to receive a click than for ambiguous queries.
  • The authors measured dissatisfaction for sessions in which users interacted with clarification, and observed 16.6% lower dissatisfaction compared to the overall dissatisfaction rate of the search engine.
  • All the percentages are typically expected to be at least 50%, which means that options at higher ranks are more likely to be clicked.
  • As shown in the table, when the number of answers is 4 or 5, the percentage of points above the diagonal is lower than 50% for the 1 ↔ 2 setting (a rough illustration of this measurement follows the list).
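    A rough illustration of the "points above the diagonal" measurement. The exact pairing and aggregation used in the paper are not reproduced here, and all numbers below are made up.

      # Illustrative only: each pair is the observed click rate of the same candidate
      # answer when shown at the higher vs. the lower of two adjacent positions
      # (e.g., rank 1 vs. rank 2 in the 1 <-> 2 setting).
      pairs = [
          (0.31, 0.24),
          (0.18, 0.21),
          (0.40, 0.33),
          (0.12, 0.12),
      ]

      # A point lies "above the diagonal" when the answer attracts a higher click rate
      # at the higher position; with position bias alone this should hold for >= 50% of points.
      above = sum(1 for high, low in pairs if high > low)
      print(f"{100.0 * above / len(pairs):.1f}% of points above the diagonal")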
Conclusion
  • In this paper, the authors provided a thorough analysis of large-scale user interactions with clarifying questions in a major web search engine.
  • The authors studied the impact of clarification properties on user engagement.
  • The authors further investigated the queries for which users are more likely to interact with the clarification pane.
  • The authors explored the impact of clarification on web search experience, and analyzed presentation bias in user interactions with the clarification panes in web search.
  • The authors proposed a set of features and an end-to-end neural model for re-ranking clarifying questions for a query (a re-ranking sketch follows this list).
  • The proposed models outperform the baselines on both click data and human-labeled data.
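    As an illustration of how feature-based re-ranking of clarification panes could be set up, here is a sketch using LightGBM's lambdarank objective. The feature matrix, labels, and group sizes are hypothetical and do not reflect the authors' actual feature set or training data.

      import numpy as np
      import lightgbm as lgb

      # Hypothetical features per (query, clarification pane) candidate, e.g. question
      # length, number of answers, estimated answer click rates, and a BERT-based score.
      rng = np.random.default_rng(0)
      X_train = rng.random((1000, 8))              # 1000 candidates, 8 made-up features
      y_train = rng.integers(0, 2, 1000)           # engagement label (e.g., clicked or not)
      groups = [5] * 200                           # 200 queries, 5 candidate panes each

      ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=100, learning_rate=0.05)
      ranker.fit(X_train, y_train, group=groups)

      # Re-rank the candidate clarification panes of a new query by predicted score.
      X_query = rng.random((5, 8))
      order = np.argsort(-ranker.predict(X_query))
      print("re-ranked pane indices:", order)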
Tables
  • Table1: Statistics of the data collected from the user interactions with the clarification pane
  • Table2: Relative engagement rate (w.r.t. average engagement) for clarification panes per number of answers
  • Table3: Relative engagement rate (compared to the average engagement rate) per query type
  • Table4: The human labels for the clarification panes
  • Table5: Percentage of points that would receive a higher click rate if moved to a higher position (i.e., % of points above the diagonal in Figure 7). Note that the distance from the diagonal is visualized by the line fitted on the data in Figure 7
  • Table6: Cross entropy for click rate estimation models. Lower cross entropy indicates more accurate click rate estimation (see the sketch after this table list)
  • Table7: The training and test data used in our experiments
  • Table8: Experimental results for re-ranking clarification panes for a query. The superscripts 1/2/3 indicate statistically significant improvements compared to Clarification Estimation/BERT/LambdaMART without RLC, respectively
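    The cross-entropy metric in Table 6 can be read as the standard binary cross entropy between observed clicks and a model's predicted click rates; a minimal sketch with made-up numbers:

      import numpy as np

      def click_cross_entropy(clicks, predicted_rates, eps=1e-12):
          """Binary cross entropy between observed clicks (0/1) and predicted click rates."""
          clicks = np.asarray(clicks, dtype=float)
          p = np.clip(np.asarray(predicted_rates, dtype=float), eps, 1 - eps)
          return float(np.mean(-(clicks * np.log(p) + (1 - clicks) * np.log(1 - p))))

      # Hypothetical impressions of a candidate answer: observed click vs. estimated click rate.
      print(click_cross_entropy([1, 0, 0, 1], [0.6, 0.2, 0.1, 0.7]))  # lower is better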
Related work
  • In this section, we review prior work on asking clarifying questions, query suggestion, and click bias estimation.

    Asking Clarifying Questions. Clarifying questions have been found useful in a number of applications, such as speech recognition [42] as well as dialog systems and chatbots [6, 13, 33]. In community question answering websites, users often pose clarifying questions to better understand a question [7, 35, 36]. Kiesel et al. [22] studied the impact of voice query clarification on user satisfaction and concluded that users like to be prompted for clarification. Coden et al. [11] studied clarifying questions for entity disambiguation, mostly in the form of "did you mean A or B?". Recently, Aliannejadi et al. [2] proposed an offline evaluation methodology for asking clarifying questions in conversational systems, together with the Qulac dataset. The importance of clarification has also been discussed by Radlinski and Craswell [34]. In the TREC HARD Track [3], participants could ask clarifying questions by submitting a form in addition to their runs. Most recently, Zamani et al. [51] proposed models for generating clarifying questions for open-domain search queries. In another study, Zamani and Craswell [50] developed a platform for conversational information seeking that supports mixed-initiative interactions, including clarification. In addition, Hashemi et al. [18] introduced a neural model for representing user interactions with clarifying questions in an open-domain setting. Asking clarifying questions about item attributes has also been explored in the context of conversational recommender systems [43]. For instance, Christakopoulou et al. [10] designed a system for preference elicitation in venue recommendation. Zhang et al. [52] automatically extracted facet-value pairs from product reviews and treated them as questions and answers. In contrast to prior work on search clarification, this work focuses on understanding user interactions with clarifying questions in a real system based on log analysis.
Reference
  • Aman Agarwal, Xuanhui Wang, Cheng Li, Michael Bendersky, and Marc Najork. 2019. Addressing Trust Bias for Unbiased Learning-to-Rank. In WWW ’19. 4–14.
  • Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, and W. Bruce Croft. 2019. Asking Clarifying Questions in Open-Domain Information-Seeking Conversations. In SIGIR ’19. 475–484.
  • James Allan. 2004. HARD Track Overview in TREC 2004: High Accuracy Retrieval from Documents. In TREC ’04.
  • Nicholas J. Belkin, Colleen Cool, Adelheit Stein, and Ulrich Thiel. 1995. Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems with Applications 9, 3 (1995), 379–395.
  • Paul N. Bennett, Ryen W. White, Wei Chu, Susan T. Dumais, Peter Bailey, Fedor Borisyuk, and Xiaoyuan Cui. 2012. Modeling the Impact of Short- and Long-term Behavior on Search Personalization. In SIGIR ’12. 185–194.
  • Marco De Boni and Suresh Manandhar. 2003. An Analysis of Clarification Dialogue for Question Answering. In NAACL ’03. 48–55.
  • Pavel Braslavski, Denis Savenkov, Eugene Agichtein, and Alina Dubatovka. 2017. What Do You Mean Exactly?: Analyzing Clarification Questions in CQA. In CHIIR ’17. 345–348.
  • Christopher J. C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical Report. Microsoft Research.
  • Fei Cai and Maarten de Rijke. 2016. A Survey of Query Auto Completion in Information Retrieval. Now Publishers Inc., Hanover, MA, USA.
  • Konstantina Christakopoulou, Filip Radlinski, and Katja Hofmann. 2016. Towards Conversational Recommender Systems. In KDD ’16. 815–824.
  • Anni Coden, Daniel Gruhl, Neal Lewis, and Pablo N. Mendes. 2015. Did you mean A or B? Supporting Clarification Dialog for Entity Disambiguation. In SumPre ’15.
  • Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An Experimental Comparison of Click Position-Bias Models. In WSDM ’08. 87–94.
  • Marco De Boni and Suresh Manandhar. 2005. Implementing Clarification Dialogues in Open Domain Question Answering. Nat. Lang. Eng. 11, 4 (2005), 343–361.
  • Mostafa Dehghani, Sascha Rothe, Enrique Alfonseca, and Pascal Fleury. 2017. Learning to Attend, Copy, and Generate for Session-Based Query Suggestion. In CIKM ’17. 1747–1756.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL ’19. Minneapolis, Minnesota, 4171–4186.
  • Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying Relations for Open Information Extraction. In EMNLP ’11. 1535–1545.
  • Manish Gupta and Michael Bendersky. 2015. Information Retrieval with Verbose Queries. Foundations and Trends in Information Retrieval 9, 3-4 (2015), 209–354.
  • Helia Hashemi, Hamed Zamani, and W. Bruce Croft. 2020. Multi-Source Transformers: Leveraging Multiple Information Sources for Representation Learning in Conversational Search. In SIGIR ’20.
  • Ahmed Hassan, Xiaolin Shi, Nick Craswell, and Bill Ramsey. 2013. Beyond Clicks: Query Reformulation as a Predictor of Search Satisfaction. In CIKM ’13. 2019–2028.
  • Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately Interpreting Clickthrough Data as Implicit Feedback. In SIGIR ’05. 154–161.
  • Eugene Kharitonov and Pavel Serdyukov. 2012. Demographic Context in Web Search Re-ranking. In CIKM ’12. 2555–2558.
  • Johannes Kiesel, Arefeh Bahrami, Benno Stein, Avishek Anand, and Matthias Hagen. 2018. Toward Voice Query Clarification. In SIGIR ’18. 1257–1260.
  • Weize Kong, Rui Li, Jie Luo, Aston Zhang, Yi Chang, and James Allan. 2015. Predicting Search Intent Based on Pre-Search Context. In SIGIR ’15. 503–512.
  • Danai Koutra, Paul N. Bennett, and Eric Horvitz. 2015. Events and Controversies: Influences of a Shocking News Event on Information Seeking. In WWW ’15. 614–624.
  • Widad Machmouchi and Georg Buscher. 2016. Principles for the Design of Online A/B Metrics. In SIGIR ’16. 589–590.
  • Nicolaas Matthijs and Filip Radlinski. 2011. Personalizing Web Search Using Long Term Browsing History. In WSDM ’11. 25–34.
  • Bhaskar Mitra, Milad Shokouhi, Filip Radlinski, and Katja Hofmann. 2014. On User Interactions with Query Auto-Completion. In SIGIR ’14. 1055–1058.
  • Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, and Lucy Vanderwende. 2016. Generating Natural Questions About an Image. In ACL ’16. Berlin, Germany, 1802–1813.
  • Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. (2019). arXiv:cs.IR/1901.04085
  • Umut Ozertem, Olivier Chapelle, Pinar Donmez, and Emre Velipasaoglu. 2012. Learning to Suggest: A Machine Learning Framework for Ranking Query Suggestions. In SIGIR ’12.
  • Neil O’Hare, Paloma de Juan, Rossano Schifanella, Yunlong He, Dawei Yin, and Yi Chang. 2016. Leveraging User Interaction Signals for Web Image Search. In SIGIR ’16. 559–568.
  • Harshith Padigela, Hamed Zamani, and W. Bruce Croft. 2019. Investigating the Successes and Failures of BERT for Passage Re-Ranking. (2019).
  • Luis Quintano and Irene Pimenta Rodrigues. 2008. Question/Answering Clarification Dialogues. In MICAI ’08. 155–164.
  • Filip Radlinski and Nick Craswell. 2017. A Theoretical Framework for Conversational Search. In CHIIR ’17. 117–126.
  • Sudha Rao and Hal Daumé III. 2018. Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information. In ACL ’18.
  • Sudha Rao and Hal Daumé III. 2019. Answer-based Adversarial Training for Generating Clarification Questions. In NAACL ’19. Minneapolis, Minnesota.
  • Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting Clicks: Estimating the Click-through Rate for New Ads. In WWW ’07. 521–530.
  • Mark Sanderson. 2008. Ambiguous Queries: Test Collections Need More Sense. In SIGIR ’08. 499–506.
  • Rodrygo L. Santos, Craig Macdonald, and Iadh Ounis. 2013. Learning to Rank Query Suggestions for Adhoc and Diversity Search. Inf. Retr. 16, 4 (2013), 429–451.
  • Rodrygo L. T. Santos, Craig Macdonald, and Iadh Ounis. 2015. Search Result Diversification. Found. Trends Inf. Retr. 9, 1 (2015), 1–90.
  • Milad Shokouhi. 2013. Learning to Personalize Query Auto-Completion. In SIGIR ’13. 103–112.
  • Svetlana Stoyanchev, Alex Liu, and Julia Hirschberg. 2014. Towards Natural Clarification Questions in Dialogue Systems. In AISB ’14, Vol. 20.
  • Yueming Sun and Yi Zhang. 2018. Conversational Recommender System. In SIGIR ’18. 235–244.
  • Jan Trienes and Krisztian Balog. 2019. Identifying Unclear Questions in Community Question Answering Websites. In ECIR ’19. 276–289.
  • Yury Ustinovskiy and Pavel Serdyukov. 2013. Personalization of Web-Search Using Short-Term Browsing Context. In CIKM ’13. 1979–1988.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All You Need. In NeurIPS ’17. 6000–6010.
  • Xiaohui Xie, Jiaxin Mao, Maarten de Rijke, Ruizhe Zhang, Min Zhang, and Shaoping Ma. 2018. Constructing an Interaction Behavior Model for Web Image Search. In SIGIR ’18. 425–434.
  • Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data. In WWW ’10.
  • Hamed Zamani, Michael Bendersky, Xuanhui Wang, and Mingyang Zhang. 2017. Situational Context for Ranking in Personal Search. In WWW ’17. 1531–1540.
  • Hamed Zamani and Nick Craswell. 2020. Macaw: An Extensible Conversational Information Seeking Platform. In SIGIR ’20.
  • Hamed Zamani, Susan T. Dumais, Nick Craswell, Paul N. Bennett, and Gord Lueck. 2020. Generating Clarifying Questions for Information Retrieval. In WWW ’20. 418–428.
  • Yongfeng Zhang, Xu Chen, Qingyao Ai, Liu Yang, and W. Bruce Croft. 2018. Towards Conversational Search and Recommendation: System Ask, User Respond. In CIKM ’18.