ASAP-UCT: Abstraction of State-Action Pairs in UCT.

IJCAI(2015)

引用 36|浏览49
暂无评分
摘要
Monte-Carlo Tree Search (MCTS) algorithms such as UCT are an attractive online framework for solving planning under uncertainty problems modeled as a Markov Decision Process. However, MCTS search trees are constructed in flat state and action spaces, which can lead to poor policies for large problems. In a separate research thread, domain abstraction techniques compute symmetries to reduce the original MDP. This can lead to significant savings in computation, but these have been predominantly implemented for offline planning. This paper makes two contributions. First, we define the ASAP (Abstraction of State-Action Pairs) framework, which extends and unifies past work on domain abstractions by holistically aggregating both states and state-action pairs - ASAP uncovers a much larger number of symmetries in a given domain. Second, we propose ASAP-UCT, which implements ASAP-style abstractions within a UCT framework combining strengths of online planning with domain abstractions. Experimental evaluation on several benchmark domains shows up to 26% improvement in the quality of policies obtained over existing algorithms.
更多
查看译文
关键词
abstraction,asap-uct,state-action
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要