Bridging the Imitation Gap by Adaptive Insubordination

Other Links: arxiv.org

Abstract:

Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy, resulting in an "imitation gap." Prior work bridges this gap via a progression from imitation learning to reinforcement learning. While often...
