Maximum Entropy Monte-Carlo Planning
NeurIPS, pp. 9516-9524, 2019.
Abstract:
We develop a new algorithm for online planning in large-scale sequential decision problems that improves upon the worst-case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the eff...
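As a rough illustration of what a "softmax value" at a search node can look like, the sketch below uses the log-sum-exp form that is standard in maximum entropy policy optimization, V = tau * log(sum_a exp(Q(a) / tau)). The function names, the temperature parameter `tau`, and the sample Q estimates are assumptions made for this example; the abstract does not specify the paper's exact backup rule, so this should not be read as the authors' implementation.

```python
import math


def softmax_value(q_values, tau=1.0):
    """Log-sum-exp ("softmax") value of a node's child Q estimates.

    Computes tau * log(sum(exp(q / tau))) with the maximum subtracted
    first for numerical stability. `tau` is a hypothetical temperature.
    """
    if not q_values:
        raise ValueError("q_values must be non-empty")
    if tau <= 0:
        raise ValueError("tau must be positive")
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def hard_max_value(q_values):
    """Plain max backup, shown only for comparison with the softmax value."""
    return max(q_values)


# Illustrative Q estimates for three child actions (made-up numbers).
child_q = [0.2, 0.5, 0.4]
soft = softmax_value(child_q, tau=0.1)
hard = hard_max_value(child_q)
```

The softmax value is never below the hard max and approaches it as `tau` shrinks, so the temperature controls how much weight the backup gives to actions other than the current best one.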