Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
ICLR, 2019. arXiv:1901.02705.
Learning a policy using only observational data is challenging because the distribution of states it induces at execution time may differ from the distribution observed during training. We propose to train a policy by unrolling a learned model of the environment dynamics over multiple time steps while explicitly penalizing two costs: the ...