Multi-platform collaborative firepower allocation method based on task decomposition and reinforcement learning

Citation: WU Guo-hua, LI Bing-jie, YUAN Yu-fei, LU Zhi-feng. Multi-platform collaborative firepower allocation method based on task decomposition and reinforcement learning[J]. Control and Decision, 2024, 39(5): 1727-1735.
Authors: WU Guo-hua, LI Bing-jie, YUAN Yu-fei, LU Zhi-feng
Affiliation: School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China; Shanghai Institute of Mechanical and Electrical Engineering, Shanghai 201109, China
Funding: General Program of the National Natural Science Foundation of China (62073341); Postgraduate Scientific Research Innovation Project of Central South University (2022ZZTS0750).

Abstract: To solve the multi-platform collaborative firepower allocation problem effectively, a task-decomposition strategy following the divide-and-conquer principle splits the complex decision task into two stages: sub-target platform selection and sub-platform firepower allocation. By combining a heuristic algorithm with a reinforcement learning model, a new reinforcement learning solution method (HARL) is proposed, and simulation experiments are carried out against the operational background of a multi-platform joint fire strike. Based on the current state, the sub-target platform selection layer uses the reinforcement learning policy to choose the firepower platform best suited to attack the current sub-target, while the sub-platform firepower allocation layer uses a heuristic algorithm to plan the optimal firepower allocation scheme for the platform executing the attack task. Experimental results show that, compared with traditional reinforcement learning algorithms, the HARL method combining heuristic algorithms and reinforcement learning reduces weapon consumption by more than 15%, and compared with classical heuristic algorithms it improves solution timeliness by more than 20%, indicating that this work can provide strong technical support for solving complex operational decision problems in the future.

Keywords: multi-platform collaborative firepower allocation; reinforcement learning; task decomposition; iterative optimization
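The two-layer architecture the abstract describes (an RL policy selects a platform for each sub-target, then a heuristic plans that platform's firepower allocation) can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the epsilon-greedy Q-table standing in for the learned policy, and the greedy marginal-damage heuristic are all assumptions for exposition.

```python
import random

def select_platform(q, state, platforms, eps=0.1):
    """Outer layer: pick the platform to engage the current sub-target.
    An epsilon-greedy lookup over Q-values stands in for the RL policy."""
    if random.random() < eps:
        return random.choice(platforms)          # explore
    return max(platforms, key=lambda p: q.get((state, p), 0.0))  # exploit

def allocate_fire(weapons, target_value, kill_prob, max_shots):
    """Inner layer: greedy heuristic that fires the selected platform's
    weapons in descending kill-probability order until the shot budget
    is spent, returning the plan and the expected damage inflicted."""
    survive = 1.0
    plan = []
    for w in sorted(weapons, key=lambda w: kill_prob[w], reverse=True)[:max_shots]:
        plan.append(w)
        survive *= 1.0 - kill_prob[w]            # target survives all shots
    return plan, target_value * (1.0 - survive)

# Illustrative run: the policy prefers platform P2, which then fires
# its two most lethal weapons at the sub-target.
q_table = {("s0", "P1"): 1.0, ("s0", "P2"): 2.0}
chosen = select_platform(q_table, "s0", ["P1", "P2"], eps=0.0)
plan, damage = allocate_fire(["a", "b", "c"], 10.0,
                             {"a": 0.8, "b": 0.5, "c": 0.3}, max_shots=2)
```

In the paper's scheme the two layers iterate: each platform-selection decision updates the state for the next sub-target, while the inner heuristic keeps per-platform planning cheap, which is where the reported gains in weapon consumption and solution time come from.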