Data-Based Optimal Consensus Control for Multiagent Systems With Time Delays: Using Prioritized Experience Replay.

IEEE Trans. Syst. Man Cybern. Syst. (2024)

This article addresses the optimal consensus problem for multiagent systems (MASs) with time delays. By designing a new augmented state, the delayed MAS is reformulated as a delay-free system. Each agent seeks to minimize its local cost, which may depend on the decisions of the other agents, so the problem is treated as a Nash equilibrium problem. To this end, we propose a multiagent deterministic policy gradient (MADPG) method based on actor–critic (AC) networks that uses the policy gradient technique to minimize the local cost ($Q$-function), and we prove its convergence and optimality. In particular, we develop an optimized prioritized experience replay (PER) strategy that selects high-value samples with higher probability, which improves the networks' data utilization. Finally, the effectiveness of the algorithm and the advantages of PER are demonstrated with a simulated example and a comparative simulation.
Multiagent systems (MASs), optimal control, prioritized experience replay (PER), reinforcement learning (RL), time delay
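The abstract says that PER selects high-value samples with higher probability, but it does not give the details of the optimized strategy. As a rough illustration only, the sketch below implements the generic proportional form of PER, in which a sample's selection probability grows with the magnitude of its temporal-difference error. It is not the paper's optimized variant. The class name `PrioritizedReplayBuffer`, the parameters `alpha`, `beta`, and `eps`, and the transition layout are all assumptions made for this example.

```python
import random


class PrioritizedReplayBuffer:
    """Generic proportional PER (illustrative; not the paper's optimized strategy)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity  # maximum number of stored transitions
        self.alpha = alpha        # assumed exponent: 0 = uniform, 1 = fully prioritized
        self.eps = eps            # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0              # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New samples get the current maximum priority so they are replayed at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        n = len(self.data)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias from non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        w_max = max(weights)
        weights = [w / w_max for w in weights]
        batch = [self.data[i] for i in idx]
        return idx, batch, weights

    def update_priorities(self, idx, td_errors):
        # Priority is the absolute TD error, as computed by the critic.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop under these assumptions, each agent would call `sample` to draw a batch, scale its critic loss by the returned weights, and then pass the new TD errors to `update_priorities`.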