Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance

doi:10.3390/s20164546

Title	Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance
Authors	W. W. Zhao, H. R. Chu, X. K. Miao, L. H. Guo, H. H. Shen, C. H. Zhu, F. Zhang and D. X. Liang
Journal	Sensors
Year	2020
Volume	20; Issue: 16; Pages: 16
DOI	10.3390/s20164546
Abstract	Multiple unmanned aerial vehicle (UAV) collaboration has great potential. To increase the intelligence and environmental adaptability of multi-UAV control, we study the application of deep reinforcement learning algorithms to multi-UAV cooperative control. To address the non-stationary environment that arises in multi-agent reinforcement learning when the learning agents' strategies change, the paper presents an improved multiagent reinforcement learning algorithm: the multiagent joint proximal policy optimization (MAJPPO) algorithm, with centralized learning and decentralized execution. This algorithm uses the moving window averaging method to give each agent a centralized state value function, so that the agents can achieve better collaboration. The improved algorithm enhances collaboration and increases the sum of reward values obtained by the multiagent system. To evaluate the performance of the algorithm, we use MAJPPO to complete a multi-UAV formation task and the crossing of multiple-obstacle environments. To reduce the control complexity of the UAV, we use a six-degree-of-freedom, 12-state-equation dynamics model of the UAV with an attitude control loop. The experimental results show that the MAJPPO algorithm has better performance and better environmental adaptability.
Language	English
Content type	Journal article
Source URL	[http://ir.ciomp.ac.cn/handle/181722/64852]
Collection	Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences (中国科学院长春光学精密机械与物理研究所)
Recommended citation, GB/T 7714	W. W. Zhao, H. R. Chu, X. K. Miao, L. H. Guo, H. H. Shen, C. H. Zhu, F. Zhang and D. X. Liang. Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance[J]. Sensors, 2020, 20(16): 16.
APA	W. W. Zhao, H. R. Chu, X. K. Miao, L. H. Guo, H. H. Shen, C. H. Zhu, F. Zhang and D. X. Liang. (2020). Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance. Sensors, 20(16), 16.
MLA	W. W. Zhao, H. R. Chu, X. K. Miao, L. H. Guo, H. H. Shen, C. H. Zhu, F. Zhang and D. X. Liang. "Research on the Multiagent Joint Proximal Policy Optimization Algorithm Controlling Cooperative Fixed-Wing UAV Obstacle Avoidance". Sensors 20.16 (2020): 16.