
On ranking via sorting by estimated expected utility

NeurIPS 2020 (2020)

Abstract

Ranking tasks are defined through losses that measure trade-offs between different desiderata such as the relevance and the diversity of the items at the top of the list. This paper addresses the question of which of these tasks are asymptotically solved by sorting by decreasing order of expected utility, for some suitable notion of utility …

Introduction
  • The usual approach in learning to rank is to score each item given the input, and produce the ranking by sorting in decreasing order of scores.
  • This score-and-sort approach follows the probability ranking principle of information retrieval [29], which stipulates that documents should be rank-ordered according to their estimated probability of relevance to the query.
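The score-and-sort recipe described above can be sketched in a few lines. This is an illustrative toy: the scoring function is a placeholder lookup of fixed scores, not a learned model.

```python
def score_and_sort(items, score):
    """Rank items by sorting in decreasing order of their scores."""
    return sorted(items, key=score, reverse=True)

# Hypothetical relevance scores for three documents (made-up numbers).
scores = {"doc_a": 0.2, "doc_b": 0.9, "doc_c": 0.5}

# The highest-scored document is ranked first.
ranking = score_and_sort(list(scores), scores.get)
# -> ["doc_b", "doc_c", "doc_a"]
```

Under the probability ranking principle, `score` would be the estimated probability of relevance to the query.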
Highlights
  • The usual approach in learning to rank is to score each item given the input, and produce the ranking by sorting in decreasing order of scores
  • We study what ranking tasks are solved via sorting by expected utilities, in a general supervised ranking framework that captures different types of ground-truth signal and losses
  • Since utilities can serve as target values to learn the scoring function through square loss regression, the optimality of sorting by expected utilities is equivalent to the consistency of regression
  • The main question we address is: When is square loss regression consistent for ranking via score-and-sort?
  • In Section 3.1, we showed that optimal scoring functions for ranking losses that are not compatible with expected utility (CEU) are discontinuous for some data distributions; we illustrate this empirically on the Expected Reciprocal Rank (ERR) and the AP, using random distributions where the ERR/the AP have bad local minima, and measure the percentage of gradient-descent optimization runs that end up stuck in local minima
  • For supervised ranking with the score-and-sort approach, learning the scoring function through regression is consistent for all ranking tasks for which a convex risk minimization approach is consistent
Results
  • Figure: sub-optimality of local minima (the sub-optimality of a local minimum is its value normalized between the minimum and maximum attainable values); a fraction of local minima are more than 90% sub-optimal. Illustration of optimal rankings for the ERR and the AP, for the fictional search engine scenario with the ambiguous query “jaguar”.
Conclusion
  • For supervised ranking with the score-and-sort approach, learning the scoring function through regression is consistent for all ranking tasks for which a convex risk minimization approach is consistent.
  • For tasks with non-CEU ranking losses, one possible avenue is to develop efficient direct loss minimization approaches, such as approximations of NC above or as proposed by Song et al [30].
  • Another direction is to find alternatives to score-and-sort.
  • A possible starting point would be to build on the recent work on excess risk bounds for non-calibrated losses [32]
Tables
  • Table 1: Example of ranking losses with their utilities, if any. We give examples with different types of supervision, including DAGn, which is the set of directed acyclic graphs used in the computation of the pairwise disagreement (PD) studied by Duchi et al. [15]
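As a concrete instance of the ranking losses the table discusses, the Expected Reciprocal Rank of Chapelle et al. [9] can be computed as below. The formula is the standard one from that paper; the grade values in the usage line are made up for illustration.

```python
def err(grades, max_grade):
    """Expected Reciprocal Rank (Chapelle et al., 2009).

    grades: graded relevance labels of the ranked list, best-first order.
    A user scans down the list and stops at rank r with probability
    proportional to the grade-derived relevance R_r = (2^g - 1) / 2^max_grade.
    """
    p_continue = 1.0
    total = 0.0
    for rank, g in enumerate(grades, start=1):
        r = (2 ** g - 1) / 2 ** max_grade  # stop probability at this rank
        total += p_continue * r / rank
        p_continue *= 1.0 - r
    return total

# A perfectly relevant first result dominates the metric:
err([4, 0, 0], max_grade=4)  # -> 0.9375
```

Because later positions only matter when the user has not already stopped, ERR rewards covering different intents near the top, which is the diversity-inducing behavior contrasted with the AP in the figures.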
Funding
  • Figure: sub-optimality of local minima (the sub-optimality of a local minimum is its value normalized between the minimum and maximum attainable values; a fraction of local minima are more than 90% sub-optimal). (right) Illustration of optimal rankings for the ERR (diversity-inducing) and the AP (diversity-averse), for the fictional search engine scenario with the ambiguous query “jaguar”
  • Fig. 2 (middle) displays the sub-optimality of these local minima, showing that 25% of them are more than 10% sub-optimal
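Assuming sub-optimality is the gap to the best attainable metric value, normalized by the range between the best and worst values (an assumption for illustration; the exact normalization is given in the paper), the quantities reported above can be computed as:

```python
def sub_optimality(value, best, worst):
    # Normalized gap to the best attainable value: 0 = optimal, 1 = worst.
    # This particular normalization is an assumption for illustration.
    return (value - best) / (worst - best)

# Made-up values for one run of a gain metric such as the ERR:
# best attainable 1.0, worst 0.0, local minimum found at 0.88.
gap = sub_optimality(0.88, best=1.0, worst=0.0)  # -> 0.12, i.e. 12% sub-optimal
```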
References
  • [1] A. Agarwal, K. Takatsu, I. Zaitsev, and T. Joachims. A general framework for counterfactual learning-to-rank. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 5–14, 2019.
  • [2] F. G. Arenas. Alexandroff spaces. 1999.
  • [3] E. Bakshy, S. Messing, and L. A. Adamic. Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239):1130–1132, 2015.
  • [4] A. R. Barron. Complexity Regularization with Application to Artificial Neural Networks, pages 561–576. Springer Netherlands, Dordrecht, 1991.
  • [5] S. Bird, S. Barocas, K. Crawford, F. Diaz, and H. Wallach. Exploring or exploiting? Social and ethical implications of autonomous experimentation in AI. In Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2016.
  • [6] D. Buffoni, C. Calauzènes, P. Gallinari, and N. Usunier. Learning scoring functions with order-preserving losses and standardized supervision. In Proceedings of the 28th International Conference on Machine Learning, pages 825–832, 2011.
  • [7] C. Calauzènes, N. Usunier, and P. Gallinari. On the (non-)existence of convex, calibrated surrogate losses for ranking. In Advances in Neural Information Processing Systems 25, pages 197–205, 2012.
  • [8] C. Calauzènes, N. Usunier, and P. Gallinari. Calibration and regret bounds for order-preserving surrogate losses in learning to rank. Machine Learning, 93(2-3):227–260, 2013.
  • [9] O. Chapelle, D. Metzler, Y. Zhang, and P. Grinspan. Expected reciprocal rank for graded relevance. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 621–630, 2009.
  • [10] O. Chapelle, S. Ji, C. Liao, E. Velipasaoglu, L. Lai, and S.-L. Wu. Intent-based diversification of web search results: metrics and algorithms. Information Retrieval, 14(6):572–592, 2011.
  • [11] C. Ciliberto, A. Rudi, and L. Rosasco. A consistent regularization approach for structured prediction. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, pages 4419–4427, 2016.
  • [12] D. Cossock and T. Zhang. Statistical analysis of Bayes optimal subset ranking. IEEE Transactions on Information Theory, 54(11):5140–5154, 2008.
  • [13] O. Dekel, Y. Singer, and C. D. Manning. Log-linear models for label ranking. In Advances in Neural Information Processing Systems, pages 497–504, 2004.
  • [14] K. Dembczynski, W. Kotlowski, and E. Hüllermeier. Consistent multilabel ranking through univariate losses. arXiv preprint arXiv:1206.6401, 2012.
  • [15] J. C. Duchi, L. W. Mackey, and M. I. Jordan. On the consistency of ranking algorithms. In Proceedings of the 27th International Conference on Machine Learning, pages 327–334, 2010.
  • [16] T. Joachims, A. Swaminathan, and T. Schnabel. Unbiased learning-to-rank with biased feedback. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pages 781–789, 2017.
  • [17] M. Kay, C. Matuszek, and S. A. Munson. Unequal representation and gender stereotypes in image search results for occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 3819–3828. ACM, 2015.
  • [18] J. Keshet and D. A. McAllester. Generalization bounds and consistency for latent structural probit and ramp loss. In Advances in Neural Information Processing Systems 24, pages 2205–2212, 2011.
  • [19] W. Kotłowski, K. Dembczynski, and E. Hüllermeier. Bipartite ranking through minimization of univariate loss. In Proceedings of the 28th International Conference on Machine Learning, pages 1113–1120, 2011.
  • [20] J.-W. Kuo, P.-J. Cheng, and H.-M. Wang. Learning to rank from Bayesian decision inference. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 827–836, 2009.
  • [21] Q. Le and A. Smola. Direct optimization of ranking measures. arXiv preprint arXiv:0704.3359, 2007.
  • [22] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
  • [23] Q. Nguyen. On connected sublevel sets in deep learning. arXiv preprint arXiv:1901.07417, 2019.
  • [24] E. A. Ok. Real Analysis with Economic Applications. Online economics textbooks, SUNY-Oswego, Department of Economics, January 2004.
  • [25] A. Osokin, F. Bach, and S. Lacoste-Julien. On structured prediction theory with calibrated convex surrogate losses. In Advances in Neural Information Processing Systems, pages 302–313, 2017.
  • [26] H. G. Ramaswamy and S. Agarwal. Convex calibration dimension for multiclass loss matrices. The Journal of Machine Learning Research, 17(1):397–441, 2016.
  • [27] H. G. Ramaswamy, S. Agarwal, and A. Tewari. Convex calibrated surrogates for low-rank loss matrices with applications to subset ranking losses. In Advances in Neural Information Processing Systems, pages 1475–1483, 2013.
  • [28] P. Ravikumar, A. Tewari, and E. Yang. On NDCG consistency of listwise ranking methods. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 618–626, 2011.
  • [29] S. E. Robertson. The probability ranking principle in IR. Journal of Documentation, 33(4):294–304, 1977.
  • [30] Y. Song, A. Schwing, R. Urtasun, et al. Training deep neural networks via direct loss minimization. In International Conference on Machine Learning, pages 2169–2177, 2016.
  • [31] I. Steinwart. How to compare different loss functions and their risks. Constructive Approximation, 26(2):225–287, 2007.
  • [32] K. Struminsky, S. Lacoste-Julien, and A. Osokin. Quantifying learning guarantees for convex but inconsistent surrogates. In Advances in Neural Information Processing Systems, pages 669–677, 2018.
  • [33] M. Taylor, J. Guiver, S. Robertson, and T. Minka. SoftRank: optimizing non-smooth rank metrics. In Proceedings of the 2008 International Conference on Web Search and Data Mining, pages 77–86, 2008.
  • [34] L. Vaughan and Y. Zhang. Equal representation by search engines? A comparison of websites across countries and domains. Journal of Computer-Mediated Communication, 12(3):888–909, 2007.
  • [35] M. N. Volkovs and R. S. Zemel. BoltzRank: learning to maximize expected ranking gain. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1089–1096, 2009.
  • [36] W. Waegeman, K. Dembczynski, A. Jachnik, W. Cheng, and E. Hüllermeier. On the Bayes-optimality of F-measure maximizers. Journal of Machine Learning Research, 15:3333–3388, 2014.
  • [37] R. Wijsman. Continuity of the Bayes risk. The Annals of Mathematical Statistics, 41(3):1083–1085, 1970.
  • [38] J. I. Yellott Jr. The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology, 15(2):109–144, 1977.
  • [39] Y. Yue, T. Finley, F. Radlinski, and T. Joachims. A support vector method for optimizing average precision. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 271–278, 2007.
Author
Clément Calauzènes