
DQN-based compound anti-drone task assignment method for multiple types of interception equipment

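Cite this article:	HUANG Tingfei, CHENG Guangquan, HUANG Kuihua, HUANG Jincai, LIU Zhong. DQN-based compound anti-drone task assignment method for multiple types of interception equipment [J]. Control and Decision (控制与决策), 2022, 37(1): 142-150.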

Authors:	HUANG Tingfei, CHENG Guangquan, HUANG Kuihua, HUANG Jincai, LIU Zhong

Affiliation:	College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

Funding:	National Natural Science Foundation of China (62073333); Equipment Development Department field fund project (61403120206).

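Abstract:	To address the inability of current counter-unmanned systems to effectively suppress drones, a new anti-drone method is constructed using multiple kinds of interception equipment. Traditional multi-objective optimization algorithms cannot solve dynamic task assignment problems, so a compound anti-drone task assignment model for multiple types of interception equipment is proposed based on a deep Q network (DQN). The DQN module makes the initial decision on the task assignment problem. To improve convergence speed and learning efficiency, the method predicts the Q value from the state at the current time step rather than the state at the next time step, which eliminates the influence of Q-value overestimation during training. An evolutionary algorithm then optimizes the decision results and outputs multiple interception schemes. A task assignment simulation environment for the anti-drone system is built with the open area around the runway of a domestic airport as the protected object, and the simulation results verify the effectiveness of the proposed method. In addition, compared with the DQN and Double DQN methods, the agent trained with the proposed improved DQN algorithm is more accurate, and the algorithm shows better convergence and better solution quality. The proposed method offers a new approach to the anti-drone problem.
Keywords:	anti-drone; deep Q network; task assignment; Q value; multi-objective optimization; agent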
Task assignment method of compound anti-drone based on DQN for multi type interception equipment

HUANG Ting-fei, CHENG Guang-quan, HUANG Kui-hua, HUANG Jin-cai, LIU Zhong. Task assignment method of compound anti-drone based on DQN for multi type interception equipment [J]. Control and Decision, 2022, 37(1): 142-150.

Authors:	HUANG Ting-fei, CHENG Guang-quan, HUANG Kui-hua, HUANG Jin-cai, LIU Zhong

Affiliation:	College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

Abstract:	Because current anti-drone systems cannot effectively suppress drones, a new compound anti-drone method is constructed using multiple types of interception equipment. Traditional multi-objective optimization algorithms cannot solve the dynamic task assignment problem, so this paper proposes a task assignment model for compound anti-drone operations with multiple types of interception equipment, based on a deep Q network (DQN). The DQN module makes the initial decisions on the task assignment problem. To improve the convergence speed and learning efficiency of the algorithm, the method predicts the Q value from the state at the current time step instead of the state at the next time step, which eliminates the influence of Q-value overestimation during training. An evolutionary algorithm is then used to optimize the decision results and output multiple interception schemes. A simulation environment for anti-drone task assignment is constructed with the open area around a domestic airport runway as the protected object, and the simulation results verify the effectiveness of the method. In addition, compared with the DQN and Double DQN methods, the agent trained with the improved DQN algorithm is more accurate, and the algorithm converges better and yields better solutions. The proposed method provides a new approach to the anti-drone problem.

Keywords:	anti-drone; deep Q network; task assignment; Q value; multi-objective optimization; agent
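
Note on the baselines: the abstract compares the proposed method against DQN and Double DQN with respect to Q-value overestimation. As a minimal sketch, the two baseline bootstrap targets for a single transition could be computed as below. The abstract does not specify the paper's own current-state target rule, so it is not reproduced here. The function names, the discount factor of 0.9, and the toy Q values are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch of the two baseline targets named in the abstract.
# All names and numbers are illustrative assumptions, not the paper's code.

def dqn_target(reward, next_q_target, gamma=0.9, done=False):
    """Standard DQN target: r + gamma * max over a' of Q_target(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: the online net selects a', the target net evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy Q-value estimates for three candidate assignments in the next state.
next_q_online = [1.0, 2.0, 1.5]  # online network's estimates (assumed)
next_q_target = [1.2, 1.8, 2.5]  # target network's estimates (assumed)

y_dqn = dqn_target(1.0, next_q_target)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target)
```

With these toy values the Double DQN target is lower than the standard DQN target, because the action is chosen by one network and scored by the other. Decoupling selection from evaluation in this way is how Double DQN reduces overestimation.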
This article is indexed in VIP (维普) and other databases.