Mostly Exploration-Free Algorithms for Contextual Bandits
arXiv: Machine Learning, 2017.
The contextual bandit literature has traditionally focused on algorithms that address the exploration-exploitation tradeoff. In particular, greedy algorithms that exploit current estimates without any exploration may be sub-optimal in general. However, exploration-free greedy algorithms are desirable in practical settings where exploration…
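To make the abstract's central object concrete, the following is a minimal sketch of an exploration-free greedy linear contextual bandit: each arm keeps a ridge-regression estimate of its reward parameters, and every round pulls the arm whose current estimate predicts the highest reward. This is an illustrative toy setup (two arms, Gaussian contexts, linear rewards), not the paper's exact model or experiments.

```python
import numpy as np

def greedy_linear_bandit(thetas, horizon=2000, d=3, noise=0.1, seed=0):
    """Purely greedy linear contextual bandit.

    Each round: observe a context x, pull the arm whose current
    ridge-regression estimate predicts the highest reward, then update
    that arm's statistics. No forced exploration step; the natural
    diversity of the contexts is the only source of variation.
    Returns the average per-round regret against the best arm in hindsight.
    """
    rng = np.random.default_rng(seed)
    K = len(thetas)
    # Per-arm ridge statistics: A_k = I + sum x x^T, b_k = sum r x
    A = [np.eye(d) for _ in range(K)]
    b = [np.zeros(d) for _ in range(K)]
    regret = 0.0
    for _ in range(horizon):
        x = rng.normal(size=d)
        # Greedy choice: exploit current parameter estimates only.
        est = [np.linalg.solve(A[k], b[k]) @ x for k in range(K)]
        k = int(np.argmax(est))
        r = thetas[k] @ x + noise * rng.normal()  # noisy linear reward
        regret += max(th @ x for th in thetas) - thetas[k] @ x
        A[k] += np.outer(x, x)
        b[k] += r * x
    return regret / horizon

# Hypothetical two-arm instance with distinct true parameter vectors.
thetas = [np.array([1.0, 0.0, 0.5]), np.array([0.0, 1.0, -0.5])]
avg_regret = greedy_linear_bandit(thetas)
```

With sufficiently diverse contexts, the greedy estimates tend to self-correct and the average regret stays small, which is the phenomenon the paper studies under covariate-diversity conditions.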