Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles
Abstract:
Learning complex behaviors through interaction requires coordinated long-term planning. Random exploration and novelty search lack task-centric guidance and waste effort on non-informative interactions. Instead, decision making should target samples with the potential to optimize performance far into the future, while only reducing unce...More
Code:
Data:
Full Text
Tags
Comments