Actor-Critic Policy Optimization in Partially Observable Multiagent Environments

Vinícius Flores Zambaldi
Julien Pérolat

Advances in Neural Information Processing Systems 31 (NIPS 2018), 2018.


Abstract:

Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence. Among the most common approaches are algorithms based on gradient ascent of a score function representing discounted return. In this paper, we examine the role of these policy gradient and actor-critic...
