We study a Bayesian multi-armed bandit (MAB) setting in which a principal seeks to maximize the sum of expected time-discounted rewards obtained by pulling arms, when the arms are actually pulled by selfish and myopic individuals. Since such individuals pull the arm with highest expected posterior reward (i.e., they always exploit and never explore), the principal must incentivize them to explore by offering suitable payments. Among others, this setting models crowdsourced information discovery and funding agencies incentivizing scien...
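The incentive gap described above can be illustrated with a minimal sketch. The following is a hypothetical toy model, not the paper's mechanism: Bernoulli arms with uniform Beta(1, 1) priors, a myopic agent who pulls the arm with the highest posterior mean, and a principal who must pay at least the posterior-mean gap to make the agent pull any other arm.

```python
# Hypothetical sketch (not the paper's mechanism): Bernoulli arms with
# Beta(1, 1) priors. A myopic agent pulls the arm with the highest
# posterior mean; to make it pull any other arm, the principal must pay
# at least the gap in posterior means.

def posterior_mean(successes, failures):
    # Mean of the Beta(1 + successes, 1 + failures) posterior.
    return (1 + successes) / (2 + successes + failures)

def myopic_choice(stats):
    # stats: list of (successes, failures) pairs, one per arm.
    return max(range(len(stats)), key=lambda a: posterior_mean(*stats[a]))

def payment_to_explore(stats, target):
    # Minimum payment making a myopic agent weakly prefer `target`
    # over the myopically best arm (zero if `target` already wins).
    best = myopic_choice(stats)
    gap = posterior_mean(*stats[best]) - posterior_mean(*stats[target])
    return max(0.0, gap)

stats = [(3, 1), (0, 0)]  # arm 0 looks good; arm 1 is unexplored
print(myopic_choice(stats))          # arm 0: posterior mean 4/6 > 1/2
print(payment_to_explore(stats, 1))  # gap = 4/6 - 1/2 = 1/6
```

Here the unexplored arm 1 may well be better, but a myopic agent never tries it for free; the principal closes the gap with a payment of 1/6.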
Proceedings of the Fifteenth ACM Conference on Economics and Computation (EC '14), pp. 5-22, 2014.