MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games
IEEE Transactions on Games (2023)
Abstract
This paper presents MiniZero, a zero-knowledge learning framework that
supports four state-of-the-art algorithms: AlphaZero, MuZero, Gumbel
AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated
super-human performance in many games, it remains unclear which among them is
most suitable or efficient for specific tasks. Through MiniZero, we
systematically evaluate the performance of each algorithm in two board games,
9x9 Go and 8x8 Othello, as well as 57 Atari games. For the two board games, using
more simulations generally results in higher performance. However, the choice
between AlphaZero and MuZero may differ based on game properties. For Atari games,
both MuZero and Gumbel MuZero are worth considering. Since each game has unique
characteristics, different algorithms and simulation counts yield varying results. In
addition, we introduce an approach, called progressive simulation, which
progressively increases the simulation budget during training to allocate
computation more efficiently. Our empirical results demonstrate that
progressive simulation achieves significantly superior performance in both board
games. By making our framework and trained models publicly available, this
paper contributes a benchmark for future research on zero-knowledge learning
algorithms, assisting researchers in algorithm selection and comparison against
these zero-knowledge learning baselines. Our code and data are available at
https://rlg.iis.sinica.edu.tw/papers/minizero.
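
The abstract describes progressive simulation only as increasing the simulation budget over the course of training; it does not give the schedule. As a minimal sketch of the idea, the Python below assumes a simple linear ramp between a minimum and a maximum simulation count. The function name `simulation_budget` and the parameters `n_min` and `n_max` are hypothetical and are not taken from the MiniZero code.

```python
def simulation_budget(iteration, total_iterations, n_min=2, n_max=200):
    """Return the number of search simulations to use at a training iteration.

    Assumption: a linear ramp from n_min to n_max. The schedule actually
    used in the paper may differ; this only illustrates the general idea
    of spending fewer simulations early and more later.
    """
    if total_iterations <= 1:
        return n_max
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(iteration / (total_iterations - 1), 0.0), 1.0)
    return round(n_min + frac * (n_max - n_min))


# Example: a 5-iteration run ramping from 2 to 18 simulations.
schedule = [simulation_budget(i, 5, n_min=2, n_max=18) for i in range(5)]
print(schedule)  # [2, 6, 10, 14, 18]
```

Under this assumed schedule, early iterations generate self-play data cheaply while the network is still weak, and later iterations spend more search per move once the network can make better use of it.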
Keywords
AlphaZero, Atari games, deep reinforcement learning, Go, Gumbel AlphaZero, Gumbel MuZero, MuZero