Model-Free Reinforcement Learning for Mean Field Games

IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS (2023)

Abstract
In this article, we propose a model-free reinforcement learning (RL) algorithm, based on sequential decomposition, to obtain optimal policies for mean field games (MFGs). We consider finite-horizon MFGs with a large population of homogeneous players making strategic decisions sequentially. Each player observes a private state and a mean-field population state representing the empirical distribution of the other players' states; the mean-field state is common information among all players in the game. Vasal (2020) provided a sequential decomposition algorithm that computes the mean field equilibrium of such games in time linear in the horizon, rather than exponential as in prior literature. We extend the idea of sequential decomposition to propose a model-free RL algorithm for these games using expected Sarsa (Mishra et al., 2020). In this article, we provide detailed convergence proofs for our algorithm. In addition, we propose an inverse reinforcement learning algorithm for MFGs with unknown reward functions. The proposed algorithm learns the reward function by observing an expert's behavior and then computes the optimal policy. We illustrate our results with a cyber-physical security example.
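As an illustration of the kind of update the abstract refers to, the following is a minimal sketch of a tabular expected-Sarsa step for a representative player that conditions its Q-values on both its private state and a discretized mean-field state. All names and shapes here (horizon, n_states, n_mf_states, n_actions, the policy array) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def expected_sarsa_update(Q, policy, t, x, z, a, reward,
                          x_next, z_next, alpha=0.1, gamma=1.0,
                          terminal=False):
    """One tabular expected-Sarsa update for a representative player.

    Q      : array of shape (horizon, n_states, n_mf_states, n_actions)
    policy : action probabilities, same shape as Q
    t      : current time step
    x, z   : indices of the private state and the discretized mean-field state
    a      : action taken at time t
    """
    if terminal:
        target = reward
    else:
        # Expected Sarsa uses the expectation of the next Q-values under the
        # current policy, rather than the sampled next action (Sarsa) or the
        # maximum (Q-learning).
        expected_next = np.dot(policy[t + 1, x_next, z_next],
                               Q[t + 1, x_next, z_next])
        target = reward + gamma * expected_next
    Q[t, x, z, a] += alpha * (target - Q[t, x, z, a])
    return Q
```

In a finite-horizon setting such as the one described above, updates of this form would be applied backward in time, one stage at a time, which is consistent with the sequential decomposition viewpoint; the details of how the mean-field state evolves and how the equilibrium policy is extracted are given in the paper itself.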
Keywords
Games, Mathematical models, Statistics, Sociology, Trajectory, Heuristic algorithms, Computational modeling, Model-free, multiagent reinforcement learning (MARL)