Bandits with Knapsacks
Foundations of Computer Science, Volume 65, Issue 3, 2018, Pages 207-216.
Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation tradeoffs in learning, and they have countless applications ranging from medical trials, to communication networks, to Web search and advertising. In many of these application domains the learner may be constrained by one or more supply (or budget) limits, in addition to the customary limitation on the time horizon.
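To make the constrained setting concrete, below is a minimal sketch of a budgeted bandit loop: each pull of an arm yields a stochastic reward and consumes part of a single resource budget, and play stops once the budget is exhausted. This is an illustrative toy (epsilon-greedy with a hard budget check), not the algorithm proposed in the paper; all function and variable names are hypothetical.

```python
import random

def budgeted_epsilon_greedy(means, costs, budget, epsilon=0.1, seed=0):
    """Toy bandit-with-knapsack loop (illustrative, not the paper's method).

    Pulling arm i yields a Bernoulli(means[i]) reward and consumes
    costs[i] units of a single resource.  Play stops when the next
    pull would exceed the budget."""
    rng = random.Random(seed)
    n = len(means)
    pulls = [0] * n          # number of times each arm was pulled
    reward_sum = [0.0] * n   # cumulative reward per arm
    total_reward, spent = 0.0, 0.0
    while True:
        # Explore with probability epsilon; otherwise exploit the
        # arm with the best empirical mean so far.
        if rng.random() < epsilon or all(p == 0 for p in pulls):
            arm = rng.randrange(n)
        else:
            arm = max(range(n),
                      key=lambda i: reward_sum[i] / pulls[i] if pulls[i] else 0.0)
        # Simplification: stop entirely if the chosen arm does not fit
        # in the remaining budget (a real algorithm could try cheaper arms).
        if spent + costs[arm] > budget:
            break
        spent += costs[arm]
        r = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += r
        total_reward += r
    return total_reward, spent
```

The key difference from the classical bandit setting is visible in the stopping rule: the horizon is determined by resource consumption, not by a fixed number of rounds, so exploration itself has a resource price.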