Incentivizing exploration

Abstract

We study a Bayesian multi-armed bandit (MAB) setting in which a principal seeks to maximize the sum of expected time-discounted rewards obtained by pulling arms, when the arms are actually pulled by selfish and myopic individuals. Since such individuals pull the arm with highest expected posterior reward (i.e., they always exploit and never explore), the principal must incentivize them to explore by offering suitable payments. Among others, this setting models crowdsourced information discovery and funding agencies incentivizing scien...
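
The interaction between a myopic individual and the principal can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: Bernoulli arms with uniform Beta(1, 1) priors, the helper names `posterior_mean`, `myopic_choice`, and `min_payment_to_induce`, and the rule that an indifferent individual picks the arm carrying the larger payment.

```python
from fractions import Fraction

def posterior_mean(successes, failures, prior_a=1, prior_b=1):
    """Posterior expected reward of a Bernoulli arm under a Beta prior (assumed model)."""
    return Fraction(prior_a + successes, prior_a + prior_b + successes + failures)

def myopic_choice(means, payments):
    """Index of the arm a myopic individual pulls: highest posterior mean plus payment.

    Ties are broken toward the larger payment (an assumption of this sketch).
    """
    return max(range(len(means)), key=lambda i: (means[i] + payments[i], payments[i]))

def min_payment_to_induce(means, target):
    """Smallest payment on `target` that makes it weakly preferred to every other arm."""
    return max(means) - means[target]

# Arm 0 has been pulled 10 times (8 successes); arm 1 has never been pulled.
means = [posterior_mean(8, 2), posterior_mean(0, 0)]

unpaid = myopic_choice(means, [0, 0])            # the individual exploits arm 0
payment = min_payment_to_induce(means, 1)        # gap between the two posterior means
paid = myopic_choice(means, [0, payment])        # the payment induces a pull of arm 1
```

The sketch covers only a single round. It does not model the principal's time-discounted objective or how payments should be chosen over time.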

Proceedings of the fifteenth ACM conference on Economics and computation, pp. 5-22, 2014.
