Maximum Entropy Monte-Carlo Planning

Chenjun Xiao
Martin Müller

NeurIPS, pp. 9516-9524, 2019.

Abstract:

We develop a new algorithm for online planning in large-scale sequential decision problems that improves upon the worst-case efficiency of UCT. The idea is to augment Monte-Carlo Tree Search (MCTS) with maximum entropy policy optimization, evaluating each search node by softmax values back-propagated from simulation. To establish the eff...
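
As a rough illustration of what a "softmax value" at a search node can look like, the sketch below uses the standard maximum-entropy (log-sum-exp) form, tau * log(sum over actions of exp(Q(a) / tau)). The function names, the temperature parameter `tau`, and the example Q-values are assumptions made for this illustration and are not taken from the paper; the paper's actual backup and exploration rules may differ.

```python
import math


def softmax_value(q_values, tau=1.0):
    """Log-sum-exp ("softmax") value of a node from its children's Q estimates.

    Hypothetical helper: computes tau * log(sum(exp(q / tau))) in a
    numerically stable way by factoring out the maximum Q-value.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    if not q_values:
        raise ValueError("q_values must be non-empty")
    q_max = max(q_values)
    return q_max + tau * math.log(sum(math.exp((q - q_max) / tau) for q in q_values))


def softmax_policy(q_values, tau=1.0):
    """Boltzmann policy induced by the softmax value: pi(a) = exp((Q(a) - V) / tau)."""
    v = softmax_value(q_values, tau)
    return [math.exp((q - v) / tau) for q in q_values]


# Illustrative child estimates (made-up numbers, not results from the paper).
child_q = [1.0, 0.5, -0.2]
node_value = softmax_value(child_q, tau=0.5)
action_probs = softmax_policy(child_q, tau=0.5)
```

With a small `tau` the softmax value approaches the maximum child estimate; with a larger `tau` it weights all children more evenly, and the induced policy spreads probability accordingly.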
