# Provably adaptive reinforcement learning in metric spaces

NIPS 2020.

While the Lipschitz contextual bandits setting of Slivkins is a special case of this setup, no existing analysis recovers his adaptive guarantee, which scales with the zooming dimension of the problem.

We study reinforcement learning in continuous state and action spaces endowed with a metric. We provide a refined analysis of the algorithm of Sinclair, Banerjee, and Yu (2019) and show that its regret scales with the *zooming dimension* of the instance. This parameter, which originates in the bandit literature, captures the size o...

