A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments
Journal of Systems Engineering and Electronics (2022)
Abstract
Equipment development planning (EDP) is usually a long-term process, often performed in a highly uncertain environment. Traditional multi-stage dynamic programming cannot cope with this kind of uncertainty, in which situations are unpredictable. To deal with this problem, a multi-stage EDP model based on a deep reinforcement learning (DRL) algorithm is proposed to respond quickly to any environmental change within a reasonable range. First, the basic problem of multi-stage EDP is described, and a mathematical planning model is constructed. Then, for two kinds of uncertainty (future capability requirements and the amount of investment in each stage), a corresponding DRL framework is designed to define the environment, state, action, and reward function for multi-stage EDP. After that, the dueling deep Q-network (Dueling DQN) algorithm is used to solve the multi-stage EDP problem and generate an approximately optimal multi-stage equipment development scheme. Finally, a case of ten kinds of equipment in 100 randomly generated possible environments is used to test the feasibility and effectiveness of the proposed models. The results show that the algorithm can respond instantaneously in any state of the multi-stage EDP environment and, unlike traditional algorithms, does not need to re-optimize the problem whenever the environment changes. In addition, when the equipment capability requirements change, the algorithm can flexibly adjust the subsequent planning stages to adapt to the new requirements.
Keywords
equipment development planning (EDP), multistage, reinforcement learning, uncertainty, dueling deep Q-network (Dueling DQN)
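
The abstract names Dueling DQN as the solver but does not give the network architecture or the state and action encoding. As a minimal sketch of the standard dueling aggregation only, the snippet below combines a state value V(s) and per-action advantages A(s, a) into Q-values using the usual mean-advantage baseline. The function names and the numbers are hypothetical and are not taken from the paper.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical outputs of the value and advantage streams for one planning
# state with three candidate development actions (illustrative numbers only).
q = dueling_q_values(state_value=2.0, advantages=[1.0, 0.0, -1.0])
best = greedy_action(q)
print(q, best)
```

Subtracting the mean advantage keeps the split between V and A identifiable. Once such a network is trained, choosing an action in a changed environment takes a single forward pass plus an argmax, which is consistent with the abstract's claim that no re-optimization is needed.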