Research on Manipulator Control Based on Improved Proximal Policy Optimization Algorithm

Shaoxiong Yang, Di Wu, Yan Pan, Yan He

2022 34th Chinese Control and Decision Conference (CCDC) (2022)

Abstract
For random patching tasks in industrial settings, this paper proposes an algorithm based on a distributed framework of proximal policy optimization (PPO) with Generalized Advantage Estimation (GAE). Camera images serve as the state input. A distributed actor-critic approach is used to improve sampling efficiency, and the sampled data are stored in an experience pool. The proposed method uses both reward and punishment strategies. The improved PPO algorithm is verified in PyBullet, where it is reported to greatly improve both the number of steps needed to converge and the reward actually obtained.
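The two building blocks named in the abstract, GAE and the PPO clipped objective, can be illustrated with a minimal sketch. This is the standard textbook formulation, not the paper's improved variant; the function names `gae_advantages` and `clipped_surrogate` and the default values of `gamma`, `lam`, and `clip_eps` are assumptions chosen for illustration.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Standard GAE over one trajectory segment (assumes no episode end inside it).

    rewards[t] and values[t] are the reward and critic estimate at step t;
    last_value is the critic estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the discounted sum from later steps.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped objective for a single sample.

    ratio is pi_new(a|s) / pi_old(a|s); the result is to be maximized.
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    adv = gae_advantages([1.0, 1.0], [0.0, 0.0], last_value=0.0)
    print(adv)
    print(clipped_surrogate(1.5, adv[0]))
```

In a distributed setup such as the one the abstract describes, each worker would presumably compute advantages like this on its own sampled trajectories before the data reach the shared experience pool; that placement is an assumption, since the abstract does not say where the computation happens.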
Keywords
Manipulator control, Deep reinforcement learning, Proximal policy optimization algorithm