On Sampled Metrics for Item Recommendation

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, July 2020 (2022)

Abstract
Recommender systems personalize content by recommending items to users. Item recommendation algorithms are evaluated by metrics that compare the positions of truly relevant items among the recommended items. To speed up the computation of metrics, recent work often uses sampled metrics, where only a smaller set of random items and the relevant items are ranked. This paper investigates such sampled metrics in more detail and shows that they are inconsistent with their exact counterparts, in the sense that they do not preserve relative statements such as "recommender A is better than B", not even in expectation. Moreover, the smaller the sample size, the less difference there is between metrics, and for very small sample sizes, all metrics collapse to the AUC metric. We show that it is possible to improve the quality of the sampled metrics by applying a correction, obtained by minimizing different criteria. We conclude with an empirical evaluation of the naive sampled metrics and their corrected variants. To summarize, our work suggests that sampling should be avoided for metric calculation; however, if an experimental study needs to sample, the proposed corrections can improve the quality of the estimate.
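To make the distinction between exact and sampled metrics concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single relevant item per user, a score vector where higher means better, uniform sampling of the other items without replacement, and ties resolved in favor of the relevant item. The function names (`exact_rank`, `sampled_rank`, `recall_at_k`, `auc`) and the toy scores are hypothetical.

```python
import numpy as np


def exact_rank(scores, relevant):
    """1-based rank of the relevant item among ALL items (higher score = better)."""
    return 1 + int(np.sum(scores > scores[relevant]))


def sampled_rank(scores, relevant, num_samples, rng):
    """1-based rank of the relevant item among `num_samples` randomly drawn other items.

    Assumption: uniform sampling without replacement from the non-relevant items.
    """
    others = np.delete(np.arange(len(scores)), relevant)
    sample = rng.choice(others, size=num_samples, replace=False)
    return 1 + int(np.sum(scores[sample] > scores[relevant]))


def recall_at_k(rank, k):
    """1.0 if the relevant item appears in the top k, else 0.0."""
    return 1.0 if rank <= k else 0.0


def auc(rank, num_items):
    """Fraction of the other items that are ranked below the relevant item."""
    return (num_items - rank) / (num_items - 1)


# Toy example: 5 items, item 2 is the relevant one (hypothetical scores).
scores = np.array([0.9, 0.1, 0.5, 0.7, 0.3])
relevant = 2
rng = np.random.default_rng(0)

r_exact = exact_rank(scores, relevant)             # two items score higher, so rank 3
r_full = sampled_rank(scores, relevant, 4, rng)    # sampling all 4 others recovers the exact rank
r_small = sampled_rank(scores, relevant, 1, rng)   # with one sampled item the rank is 1 or 2
```

With a single sampled item the rank can only be 1 or 2, so any metric evaluated on it takes just two values and carries no more information than a pairwise comparison, which is the intuition behind the abstract's statement that metrics collapse to AUC for very small samples.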
Keywords
sampled metrics