Provably efficient RL with Rich Observations via Latent State Decoding
arXiv: Learning, 2019.
We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states. Under certain identifiability assumptions, we demonstrate how to estimate a mapping from the observations to latent states inductively through a sequence of regression and clustering steps---where previously decoded l...More
PPT (Upload PPT)