Efficient Contextual Bandits in Non-stationary Worlds

Conference on Learning Theory, pp. 1739-1776, 2018.

Abstract:

Most contextual bandit algorithms minimize regret against the best fixed policy, a questionable benchmark for non-stationary environments that are ubiquitous in applications. In this work, we develop several efficient contextual bandit algorithms for non-stationary environments by equipping existing methods for i.i.d. problems with soph...
