OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection of benchmark problems that expose a common interface, and a website where people can share their results and compare the performance of algorithms. This whitepaper discusses the components of OpenAI Gym and the design decisions that went into the software.
- Reinforcement learning (RL) is the branch of machine learning concerned with making sequences of decisions.
- A variety of benchmarks have been released, such as the Arcade Learning Environment (ALE), which exposed a collection of Atari 2600 games as reinforcement learning problems, and recently the RLLab benchmark for continuous control, to which we refer the reader for a survey on other RL benchmarks, including [7, 8, 9, 10, 11].
- OpenAI Gym focuses on the episodic setting of reinforcement learning, where the agent’s experience is broken down into a series of episodes
- OpenAI Gym contains a collection of environments (partially observable Markov decision processes, or POMDPs), which will grow over time
- Reinforcement learning assumes that there is an agent that is situated in an environment.
- The agent takes an action, and it receives an observation and reward from the environment.
- An RL algorithm seeks to maximize some measure of the agent’s total reward, as the agent interacts with the environment.
- The goal in episodic reinforcement learning is to maximize the expectation of total reward per episode, and to achieve a high level of performance in as few episodes as possible.
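The agent–environment loop described above can be sketched with a Gym-style interface. The `ToyEnv` class below is a hypothetical stand-in for a real Gym environment, not part of the toolkit; it only mimics the classic `reset()`/`step()` calling convention.

```python
class ToyEnv:
    """A hypothetical stand-in for a Gym environment: a trivial task that
    ends after 10 steps. Real Gym environments expose the same
    reset()/step() interface (classic API: step returns obs, reward,
    done, info)."""

    def reset(self):
        self.t = 0
        return self.t  # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0  # reward depends on the action
        done = self.t >= 10                   # episode ends after 10 steps
        return self.t, reward, done, {}       # observation, reward, done, info


def run_episode(env, policy):
    """One episode: the agent takes an action, then receives an
    observation and reward from the environment."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


print(run_episode(ToyEnv(), policy=lambda obs: 1))  # always choose action 1
```

The same `run_episode` loop works unchanged with a real Gym environment, which is the point of the common interface.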
- The design of OpenAI Gym is based on the authors’ experience developing and comparing reinforcement learning algorithms, and their experience using previous benchmark collections.
- One could imagine an “online learning” style, where the agent takes (observation, reward, done) as input at each timestep and performs learning updates incrementally.
- In an alternative “batch update” style, the agent is called with the observation as input, while the reward information is collected separately by the RL algorithm and later used to compute an update.
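The two agent-interface styles can be contrasted with a short sketch. The class and method names below are illustrative assumptions, not APIs from the paper or the toolkit:

```python
class OnlineAgent:
    """'Online learning' style: the agent receives (observation, reward,
    done) every timestep and updates incrementally."""

    def __init__(self):
        self.total_reward = 0.0

    def act(self, observation, reward, done):
        self.total_reward += reward  # incremental learning update each step
        return 0                     # choose the next action


class BatchAgent:
    """'Batch update' style: the agent is called with the observation
    only; rewards are collected separately and used later in one update."""

    def act(self, observation):
        return 0                     # action depends only on the observation

    def update(self, rewards):
        # a whole batch of collected rewards arrives at once
        self.last_return = sum(rewards)
```

The batch style keeps the action-selection interface minimal and leaves the learning algorithm free to decide when and how to consume the collected rewards.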
- The performance of an RL algorithm on an environment can be measured along two axes: first, the final performance; second, the amount of time it takes to learn—the sample complexity.
- Learning time can be measured in multiple ways; one simple scheme is to count the number of episodes before a threshold level of average performance is exceeded.
- This threshold is chosen per-environment in an ad-hoc way, for example, as 90% of the maximum performance achievable by a very heavily trained agent.
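The episode-counting scheme above can be written as a short function. This is a minimal sketch, not code from the paper; the trailing-window size used for "average performance" is an illustrative choice:

```python
def episodes_to_threshold(returns, threshold, window=100):
    """Count episodes until the average return over the trailing `window`
    episodes first exceeds `threshold`. Returns None if the threshold is
    never reached — i.e., the agent never attains the target performance."""
    for i in range(len(returns)):
        recent = returns[max(0, i - window + 1): i + 1]
        if sum(recent) / len(recent) > threshold:
            return i + 1  # number of episodes used (1-indexed count)
    return None


# Per-episode returns improving over training; threshold at 90% of
# a hypothetical maximum score of 100.
print(episodes_to_threshold([0, 50, 80, 95, 96], threshold=90, window=2))
```

A smaller window makes the measure noisier but more responsive; a larger one demands sustained performance before the threshold counts as reached.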
- The OpenAI Gym website allows users to compare the performance of their algorithms.
- The aim of the OpenAI Gym scoreboards is not to create a competition, but rather to stimulate the sharing of code and ideas, and to be a meaningful benchmark for assessing different methods.
- OpenAI Gym asks users to create a writeup describing their algorithm and the parameters used, with a link to the code.
- Dimitri P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, Belmont, MA, 1995.
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
- J. Schulman, S. Levine, P. Abbeel, M. I. Jordan, and P. Moritz. Trust region policy optimization. In ICML, pages 1889–1897, 2015.
- Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy P Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. arXiv preprint arXiv:1602.01783, 2016.
- M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling. The Arcade Learning Environment: An evaluation platform for general agents. J. Artif. Intell. Res., 47:253–279, 2013.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. Benchmarking deep reinforcement learning for continuous control. arXiv preprint arXiv:1604.06778, 2016.
- A. Geramifard, C. Dann, R. H. Klein, W. Dabney, and J. P. How. RLPy: A value-function-based reinforcement learning framework for education and research. J. Mach. Learn. Res., 16:1573–1578, 2015.
- B. Tanner and A. White. RL-Glue: Language-independent software for reinforcement-learning experiments. J. Mach. Learn. Res., 10:2133–2136, 2009.
- T. Schaul, J. Bayer, D. Wierstra, Y. Sun, M. Felder, F. Sehnke, T. Rückstieß, and J. Schmidhuber. PyBrain. J. Mach. Learn. Res., 11:743–746, 2010.
- S. Abeyruwan. RLLib: Lightweight standard and on/off policy reinforcement learning library (C++). http://web.cs.miami.edu/home/saminda/rilib.html, 2013.
- Christos Dimitrakakis, Guangliang Li, and Nikolaos Tziortziotis. The reinforcement learning competition 2014. AI Magazine, 35(3):61–65, 2014.
- R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- Petr Baudiš and Jean-loup Gailly. Pachi: State of the art open source Go program. In Advances in Computer Games. Springer, 2012.
- Emanuel Todorov, Tom Erez, and Yuval Tassa. Mujoco: A physics engine for model-based control. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pages 5026–5033. IEEE, 2012.
- Michał Kempka, Marek Wydmuch, Grzegorz Runc, Jakub Toczek, and Wojciech Jaśkowski. ViZDoom: A Doom-based AI research platform for visual reinforcement learning. arXiv preprint arXiv:1605.02097, 2016.