A Deep Recurrent Survival Model for Unbiased Ranking

SIGIR '20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, July 2020, pp. 29-38.

DOI: https://doi.org/10.1145/3397271.3401073
We propose an innovative framework named Deep Recurrent Survival Ranking (DRSR), in which we adopt survival analysis techniques together with the probability chain rule to derive the joint probability of a user's various behaviors.

Abstract:

Position bias is a critical problem in information retrieval when dealing with implicit yet biased user feedback data. Unbiased ranking methods typically rely on causality models and debias the user feedback through inverse propensity weighting. While practical, these methods still suffer from two major problems. First, when infer a user ...

Introduction
  • Information systems, such as search engines and recommender systems, have become a core part of personalized online services, and machine learning is the key technique behind their success [3].
  • Optimizing ranking performance on implicit feedback data may cause the ranking function to learn the presentation order rather than the true relevance or the real user preferences.
  • To tackle this bias issue, many researchers have explored technical approaches for training highly efficient models with unbiased learning-to-rank.
Highlights
  • Information systems, such as search engines and recommender systems, have become a core part of personalized online services, and machine learning is the key technique behind their success [3].
  • Based on the above analysis, when designing unbiased learning-to-rank algorithms, the current state-of-the-art methods have not solved, and may not even be aware of, the following challenges, which we address in this paper: (C1) User behaviors contain various, highly correlated patterns that depend on the contextual information. (C2) A large number of latent observation patterns are hidden in the non-click queries. (C3) Untrusted observation, another unsolved issue, is caused by the limitations of tracking logs.
  • We propose a novel framework called Deep Recurrent Survival Ranking (DRSR), which formulates the unbiased learning-to-rank task as estimating the probability distribution of the user's conditional click rate.
  • DRSR is based on a recurrent neural network f_θ with parameters θ, which captures the sequential patterns of the conditional click probability h_i at every document.
  • We investigated whether the performance improvement of DRSR comes from a reduction in position bias by comparing the ranking list given by the initial ranker with that given by the debiased ranker.
  • In DRSR we adopt survival analysis techniques together with the probability chain rule to derive the joint probability of a user's various behaviors.
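The chain-rule idea behind the highlights above can be illustrated with a small sketch. It assumes the standard discrete-time survival reading, in which h_i is the probability of a click at position i given that no earlier position was clicked; the exact formulation used in the paper is not reproduced in this summary, and the recurrent network f_θ is replaced here by fixed, hypothetical numbers. The function name `first_click_distribution` is likewise illustrative only.

```python
import numpy as np

def first_click_distribution(hazards):
    """Combine per-position conditional click probabilities h_i via the
    probability chain rule (standard discrete-time survival analysis).

    Returns (P(first click at position i) for each i, P(no click at all)).
    """
    h = np.asarray(hazards, dtype=float)
    # Probability that no click happened before position i: prod_{j<i} (1 - h_j)
    survival_before = np.concatenate(([1.0], np.cumprod(1.0 - h)[:-1]))
    p_first_click = h * survival_before
    # Probability that the whole list is browsed without any click
    p_no_click = float(np.prod(1.0 - h))
    return p_first_click, p_no_click

# Hypothetical hazards for a three-document list (stand-ins for f_θ outputs)
p_click, p_none = first_click_distribution([0.5, 0.2, 0.1])
print(p_click, p_none)
```

Under this reading, the first-click probabilities and the no-click probability sum to one, which is what lets click and non-click sessions contribute to the same likelihood.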
Methods
  • 5.2 Compared Settings: The authors made comprehensive comparisons between the model and the baselines.
  • [Table residue: comparison on Yahoo Search Engine (CCM) and Alibaba Recommender System (CCM); metrics MAP, NDCG@1, NDCG@3, NDCG@5; column groups Pairwise Debiasing and Pointwise Debiasing; rows include Labeled Data, Click Data, Ratio Debiasing [17], Dual Learning Algorithm [2] and Regression-EM [43].]
  • The baselines are created by combining learning-to-rank algorithms with state-of-the-art debiasing methods.
  • Dual Learning Algorithm: Ai et al. [2] proposed a dual learning algorithm that jointly learns a ranker and debiases click data.
  • Regression-EM: Wang et al. [43] proposed a regression-based EM method in which position bias is estimated directly from regular production clicks.
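The comparison reports NDCG at several cutoffs. As a reminder of how that metric is computed, here is a minimal sketch; it assumes the common exponential-gain form (2^rel - 1) with a log2 discount, which may differ from the exact variant used in the paper's evaluation.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (relevance-sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Graded relevance labels in the order the ranker returned the documents
print(ndcg_at_k([0, 1], k=2))
```

A list that is already sorted by relevance scores 1.0, and a list with no relevant documents is assigned 0.0 by convention here.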
Conclusion
  • The authors propose an innovative framework named DRSR, in which they adopt survival analysis techniques together with the probability chain rule to derive the joint probability of a user's various behaviors.
  • This framework enables an unbiased model to leverage the contextual information in the ranking list to enhance performance.
  • The authors design a novel objective function to mine the rich observation and click patterns hidden in both click and non-click queries.
  • It would be interesting to investigate better ways to model multiple-click sessions and to distinguish good from bad abandonment in abandoned, i.e., non-click, queries.
Summary
  • Objectives:

    The overall training goal combines all the objective functions into a single objective to be minimized.
Tables
  • Table 1: A summary of notations in this paper
  • Table 2: Comparison of different unbiased learning-to-rank methods on Yahoo Search Engine and Alibaba Recommender System. CCM is used as the click generation model. * indicates p-value < 0.001 in a significance test against the best baseline
  • Table 3: Comparison with PBM as the click generation model. Notations are the same as in Table 2
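Since the tables rely on a click generation model to produce synthetic clicks, a rough sketch of a position-based model (PBM) simulator may help. It assumes that a click requires the document to be both examined and judged relevant, and that the examination probability decays as (1/rank)^eta; that decay curve, the `eta` parameter and the function name `simulate_pbm_clicks` are illustrative assumptions, not the paper's reported settings.

```python
import random

def simulate_pbm_clicks(relevance_probs, eta=1.0, rng=None):
    """Simulate one session of clicks under a position-based model.

    relevance_probs[i] is the probability that the document at rank i+1 is
    clicked once it has been examined. The examination probability is assumed
    to be (1 / rank) ** eta, a hypothetical choice for this sketch.
    """
    rng = rng or random.Random()
    clicks = []
    for rank, rel in enumerate(relevance_probs, start=1):
        observe_prob = (1.0 / rank) ** eta
        observed = rng.random() < observe_prob
        # A click needs examination AND a positive relevance draw
        clicked = observed and rng.random() < rel
        clicks.append(int(clicked))
    return clicks

# One simulated session over a four-document list
print(simulate_pbm_clicks([0.9, 0.6, 0.3, 0.1], eta=1.0, rng=random.Random(0)))
```

With `eta=0` every position is always examined, so clicks depend on relevance alone; larger values of `eta` make lower-ranked documents increasingly unlikely to be clicked regardless of relevance, which is the position bias the debiasing methods try to undo.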
Related work
  • Unbiased Learning to Rank. Learning to rank [29] is a fundamental technique for information systems such as search engines, recommender systems and sponsored search advertising. There are two streams of unbiased learning-to-rank methodologies. One school is based on basic assumptions about user browsing behaviors [7, 10, 38, 39]. These models maximize the likelihood of the observations in the historical data collected from user browsing logs. Recently, Fang et al. [12] extended the position-based model and proposed an effective estimator based on intervention harvesting. As discussed in [21], these models only model user behavior patterns, without sufficient optimization for the learning-to-rank problem. The other school derives from counterfactual learning [21, 43], which treats click bias as the counterfactual factor [35] and debiases the user feedback through inverse propensity weighting [42]. Recently, Ai et al. [2] and Hu et al. [17] respectively proposed dual learning methods that jointly estimate position bias and train a ranker. However, these prior works often ignore the rich contextual information in the query and omit the user's various behaviors other than clicks. In this paper, we propose an innovative approach: a novel cascade model that adapts to both point-wise and pair-wise settings. In addition to jointly considering click and non-click data via survival analysis, we also model the whole ranking list with a recurrent neural network.
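Inverse propensity weighting, mentioned above as the core of the counterfactual school, can be sketched in a few lines. This is a generic illustration and not the paper's objective: it assumes a pointwise logistic loss on clicked documents and known examination propensities per position, and the function name `ipw_pointwise_loss` is hypothetical.

```python
import numpy as np

def ipw_pointwise_loss(scores, clicks, propensities):
    """Inverse-propensity-weighted loss over one ranked list.

    Only clicked documents contribute; each click is divided by the
    (assumed known) probability that its position was examined, so clicks at
    rarely examined positions count for more.
    """
    scores = np.asarray(scores, dtype=float)
    clicks = np.asarray(clicks, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    # -log(sigmoid(score)), computed stably as log(1 + exp(-score))
    per_doc = np.logaddexp(0.0, -scores)
    return float(np.sum(clicks * per_doc / propensities))

# The same click costs twice as much at a position examined half as often
print(ipw_pointwise_loss([0.0, 0.0], [1, 0], [0.5, 0.25]))
print(ipw_pointwise_loss([0.0, 0.0], [0, 1], [0.5, 0.25]))
```

The weighting corrects for examination probability only; it does not use non-click sessions or the context of the surrounding list, which are the gaps the paragraph above says the proposed model targets.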
Funding
  • We thank the support of the National Natural Science Foundation of China (Grant Nos. 61702327, 61772333, 61632017) and the Wu Wen Jun Honorary Doctoral Scholarship from the AI Institute, Shanghai Jiao Tong University.
Reference
  • [1] Aman Agarwal, Kenta Takatsu, Ivan Zaitsev, and Thorsten Joachims. 2019. A General Framework for Counterfactual Learning-to-Rank. In SIGIR.
  • [2] Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, and W Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. SIGIR (2018).
  • [3] Qingyao Ai, Jiaxin Mao, Yiqun Liu, and W Bruce Croft. 2018. Unbiased learning to rank: Theory and practice. In CIKM.
  • [4] Ahmed M Alaa and Mihaela van der Schaar. 2017. Deep multi-task gaussian processes for survival analysis with competing risks. In NeurIPS.
  • [5] Per K Andersen, Ornulf Borgan, Richard D Gill, and Niels Keiding. 2012. Statistical models based on counting processes.
  • [6] Olivier Chapelle and Yi Chang. 2011. Yahoo! learning to rank challenge overview. In Proceedings of the Learning to Rank Challenge.
  • [7] Olivier Chapelle and Ya Zhang. 2009. A dynamic bayesian network click model for web search ranking. In WWW.
  • [8] David R Cox. 1972. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological) (1972).
  • [9] Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An experimental comparison of click position-bias models. In WSDM.
  • [10] Georges E Dupret and Benjamin Piwowarski. 2008. A user browsing model to predict search engine click data from past observations. In SIGIR.
  • [11] Hui Fang, Guibing Guo, Danning Zhang, and Yiheng Shu. 2019. Deep Learning-Based Sequential Recommender Systems: Concepts, Algorithms, and Evaluations. In International Conference on Web Engineering.
  • [12] Zhichong Fang, Aman Agarwal, and Thorsten Joachims. 2018. Intervention harvesting for context-dependent examination-bias estimation. SIGIR (2018).
  • [13] Louis Gordon and Richard A Olshen. 1985. Tree-structured survival analysis. Cancer treatment reports (1985).
  • [14] Fan Guo, Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-Min Wang, and Christos Faloutsos. 2009. Click chain model in web search. In WWW.
  • [15] Fan Guo, Chao Liu, and Yi Min Wang. 2009. Efficient multiple-click models in web search. In WSDM.
  • [16] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation (1997).
  • [17] Ziniu Hu, Yang Wang, Qu Peng, and Hang Li. 2019. Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm. In WWW.
  • [18] Rolf Jagerman, Harrie Oosterhuis, and Maarten de Rijke. 2019. To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions. (2019).
  • [19] How Jing and Alexander J Smola. 2017. Neural survival recommender. In WSDM.
  • [20] Thorsten Joachims, Laura A Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately interpreting clickthrough data as implicit feedback. In SIGIR.
  • [21] Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased learning-to-rank with biased feedback. In WSDM.
  • [22] Edward L Kaplan and Paul Meier. 1958. Nonparametric estimation from incomplete observations. Journal of the American statistical association (1958).
  • [23] Jared L Katzman, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. 2018. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC medical research methodology (2018).
  • [24] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. Lightgbm: A highly efficient gradient boosting decision tree. In NeurIPS.
  • [25] Faisal M Khan and Valentina Bayer Zubek. 2008. Support vector regression for censored data (SVRc): a novel tool for survival analysis. In ICDM.
  • [26] Elisa T Lee and John Wang. 2003. Statistical methods for survival data analysis. Vol. 476. John Wiley & Sons.
  • [27] Jane Li, Scott Huffman, and Akihito Tokuda. 2009. Good abandonment in mobile and PC internet search. In SIGIR.
  • [28] Yan Li, Jie Wang, Jieping Ye, and Chandan K Reddy. 2016. A multi-task learning formulation for survival analysis. In KDD.
  • [29] Tie-Yan Liu et al. 2009. Learning to rank for information retrieval. Foundations and Trends® in Information Retrieval (2009).
  • [30] Rajesh Ranganath, Adler Perotte, Noémie Elhadad, and David Blei. 2016. Deep survival analysis. arXiv (2016).
  • [31] Kan Ren, Yuchen Fang, Weinan Zhang, Shuhao Liu, Jiajun Li, Ya Zhang, Yong Yu, and Jun Wang. 2018. Learning multi-touch conversion attribution with dual-attention mechanisms for online advertising. In CIKM.
  • [32] Kan Ren, Jiarui Qin, Lei Zheng, Zhengyu Yang, Weinan Zhang, Lin Qiu, and Yong Yu. 2019. Deep recurrent survival analysis. In AAAI.
  • [33] Kan Ren, Jiarui Qin, Lei Zheng, Weinan Zhang, and Yong Yu. 2019. Deep Landscape Forecasting for Real-time Bidding Advertising. KDD (2019).
  • [34] Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting clicks: estimating the click-through rate for new ads. In WWW.
  • [35] Paul R Rosenbaum and Donald B Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika (1983).
  • [36] Yang Song, Xiaolin Shi, Ryen White, and Ahmed Hassan Awadallah. 2014. Context-aware web search abandonment prediction. In SIGIR.
  • [37] Robert Tibshirani. 1997. The lasso method for variable selection in the Cox model. Statistics in medicine (1997).
  • [38] Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-yun Nie, and Shaoping Ma. 2015. Incorporating non-sequential behavior into click models. In SIGIR.
  • [39] Hongning Wang, ChengXiang Zhai, Anlei Dong, and Yi Chang. 2013. Content-aware click modeling. In WWW.
  • [40] Jun Wang, Arjen P De Vries, and Marcel JT Reinders. 2006. A user-item relevance model for log-based collaborative filtering. In ECIR.
  • [41] Ping Wang, Yan Li, and Chandan K Reddy. 2019. Machine learning for survival analysis: A survey. CSUR (2019).
  • [42] Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to rank with selection bias in personal search. In SIGIR.
  • [43] Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018. Position bias estimation for unbiased learning to rank in personal search. In WSDM.
  • [44] Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V Le. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv (2019).
  • [45] Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan R Salakhutdinov, and Yoshua Bengio. 2016. Architectural complexity measures of recurrent neural networks. In NeurIPS.
  • [46] Weinan Zhang, Tianxiong Zhou, Jun Wang, and Jian Xu. 2016. Bid-aware Gradient Descent for Unbiased Learning with Censored Data in Display Advertising. In KDD.
  • [47] Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, and Kun Gai. 2018. Learning Tree-based Deep Model for Recommender Systems. In KDD.