REVEAL 2020: Bandit and Reinforcement Learning from User Interactions

Thorsten Joachims, Yves Raimond, Olivier Koch, Maria Dimakopoulou, Flavian Vasile, Adith Swaminathan

RecSys (2020)



Abstract

The REVEAL workshop frames the recommendation problem as one of making personalized interventions, e.g., deciding to recommend a particular item to a particular user. Moreover, these interventions sometimes depend on each other: a stream of interactions occurs between the user and the system, and each decision to recommend something affects future steps and long-term rewards. This framing creates a number of challenges we will discuss at the workshop. How can recommender systems be evaluated offline in such a context? How can we learn recommendation policies that are aware of these delayed consequences and outcomes?


Keywords

recommender systems, reinforcement learning, off-policy, offline evaluation, causal inference, multi-armed bandits
