Corralling a Band of Bandit Algorithms
Conference on Learning Theory, 2017.
We study the problem of combining multiple bandit algorithms (that is, online learning algorithms with partial feedback) with the goal of creating a master algorithm that performs almost as well as the best base algorithm if it were to be run on its own. The main challenge is that when run with a master, base algorithms unavoidably receiv...