Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality
NIPS 2020.
Let us forget about CRush Optimism with Pessimism and consider the oracle described in Section 2.
We study stochastic structured bandits for minimizing regret. The fact that popular optimistic algorithms do not achieve asymptotic instance-dependent regret optimality (asymptotic optimality for short) has recently attracted researchers' attention. On the other hand, it is known that one can achieve bounded regret (i.e., does not grow ind...