Bandits with Knapsacks
Journal of the ACM, pp. 1-55, 2018.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains, the learner may be constrained by one or more supply (or budg...