Combinatorial Cascading Bandits
Annual Conference on Neural Information Processing Systems, 2015.
We propose combinatorial cascading bandits, a class of partial monitoring problems where at each step a learning agent chooses a tuple of ground items subject to constraints and receives a reward if and only if the weights of all chosen items are one. The weights of the items are binary, stochastic, and drawn independently of each other.
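The reward model described above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `play_round`, `expected_reward`, and `probs` are hypothetical, each item's weight is assumed to be a Bernoulli draw with its own success probability, and the feasibility constraints on the chosen tuple are not modeled.

```python
import math
import random


def play_round(chosen, probs, rng):
    """Simulate one step of the reward model (hypothetical helper).

    chosen: tuple of item identifiers picked by the agent
    probs:  dict mapping each item to the probability that its weight is 1
            (an assumed parameterization of the binary stochastic weights)
    rng:    a random.Random instance, passed in for reproducibility
    """
    # Draw a binary weight for every chosen item, independently of the others.
    weights = [1 if rng.random() < probs[item] else 0 for item in chosen]
    # The reward is 1 if and only if all chosen items have weight 1.
    return int(all(w == 1 for w in weights))


def expected_reward(chosen, probs):
    """Expected reward of a tuple: because the weights are independent,
    the probability that all of them equal 1 is the product of the
    per-item probabilities."""
    return math.prod(probs[item] for item in chosen)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = {"a": 0.9, "b": 0.8, "c": 0.5}  # made-up values for illustration
    chosen = ("a", "b")
    rewards = [play_round(chosen, probs, rng) for _ in range(10000)]
    print("empirical mean reward:", sum(rewards) / len(rewards))
    print("expected reward:", expected_reward(chosen, probs))
```

Because the reward requires every chosen weight to be one, adding an item to the tuple can never increase the expected reward in this sketch; the product only shrinks or stays the same.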