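A DDPG-based Optimization Method for Multi-partition Task Assignment in Integrated Modular Avionics Systems
Cite this article: ZHAO Changxiao, LI Daojun, WANG Penghui, TIAN Yi. A DDPG-based Optimization Method for Multi-partition Task Assignment of IMA[J]. 电讯技术 (Telecommunication Engineering), 2024, 64(1): 58-66.
Authors: ZHAO Changxiao (赵长啸), LI Daojun (李道俊), WANG Penghui (汪鹏辉), TIAN Yi (田毅)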
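Affiliations: 1. College of Safety Science and Engineering, Civil Aviation University of China (中国民航大学), Tianjin 300300, China; 2. Key Laboratory of Civil Aircraft Airworthiness Certification Technology (民航航空器适航审定技术重点实验室), Tianjin 300300, China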
Funding: National Key R&D Program of China (2021YFB1600601); Natural Science Foundation of Tianjin (21JCQNJC00900)
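Abstract (translated from the Chinese): Integrated modular avionics (IMA) integrates multiple avionics functions on a shared resource platform through a time and space partitioning mechanism, and the quality of the method used to allocate tasks among partitions determines the overall effectiveness of the avionics system. An optimization method based on deep reinforcement learning is proposed for allocating and scheduling a set of avionics tasks across multiple partitions. An avionics system model and a task model are constructed; with system resource limits and task real-time requirements as constraints and improved system resource utilization as the optimization objective, the task allocation process is formulated as a sequential decision problem. A Markov decision model is introduced, an IMA task allocation model based on the deep deterministic policy gradient (DDPG) method is built, and a general allocation architecture is proposed. Policy training techniques such as state normalization and action noise are introduced to improve the learning performance and training capability of the DDPG algorithm. Simulation results show that the proposed algorithm begins to converge once the number of iterations reaches 500; after 800 iterations, the resulting scheme for tasks hosted in the partitions satisfies the constraints while raising the minimum processing efficiency by 20.55%. Compared with the traditional allocation scheme and the Actor-Critic (AC) algorithm, the proposed DDPG algorithm shows significant advantages in convergence, optimization performance, and stability.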

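Keywords (translated from the Chinese): integrated modular avionics (IMA); task allocation and scheduling; deep reinforcement learning; DDPG algorithm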

A DDPG-based Optimization Method for Multi-partition Task Assignment of IMA
ZHAO Changxiao, LI Daojun, WANG Penghui, TIAN Yi. A DDPG-based Optimization Method for Multi-partition Task Assignment of IMA[J]. Telecommunication Engineering, 2024, 64(1): 58-66.
Authors: ZHAO Changxiao, LI Daojun, WANG Penghui, TIAN Yi
Abstract: The integrated modular avionics (IMA) system integrates multiple avionics functions on a shared resource platform through a spatio-temporal partitioning mechanism. The quality of the method used to distribute tasks among partitions determines the overall effectiveness of the IMA system. An optimization method based on deep reinforcement learning (DRL) is proposed for the distribution and scheduling of avionics task sets across multiple partitions. The IMA system model and task model are constructed; with system resource limits and task real-time requirements as constraints and improved system resource utilization as the optimization objective, the task distribution process is described as a sequential decision problem. A Markov decision model is introduced, an IMA task distribution model based on the deep deterministic policy gradient (DDPG) algorithm is developed, and a generic distribution architecture is proposed. Policy training techniques such as state normalization and action noise are introduced to improve the learning performance and training capability of the DDPG algorithm. Simulation results show that the proposed optimization algorithm starts to converge once the number of iterations reaches 500, and that after 800 iterations the task scheme for the partitions satisfies the constraint requirements while the minimum processing efficiency improves by 20.55%. Compared with the traditional assignment scheme and the Actor-Critic (AC) algorithm, the proposed DDPG algorithm has significant advantages in convergence ability, optimization performance, and stability.
Keywords: integrated modular avionics (IMA); task allocation and scheduling; deep reinforcement learning; DDPG algorithm
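
The abstract names state normalization and action noise as training techniques but does not give their form. The following is a minimal sketch of what such helpers might look like, not the authors' implementation. The running mean and variance normalizer, the Gaussian (rather than, say, Ornstein-Uhlenbeck) noise, the action range of [-1, 1], and the mapping from a continuous action component to a partition index are all assumptions introduced here for illustration, as are all names.

```python
import math
import random


class RunningNormalizer:
    """Tracks a running mean and variance per state feature (Welford's method).

    Hypothetical helper: the paper's exact normalization scheme is not given.
    """

    def __init__(self, size):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations

    def update(self, state):
        self.count += 1
        for i, x in enumerate(state):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, state, eps=1e-8):
        if self.count < 2:
            return list(state)  # too few samples to estimate a variance
        out = []
        for i, x in enumerate(state):
            std = math.sqrt(self.m2[i] / (self.count - 1))
            out.append((x - self.mean[i]) / (std + eps))
        return out


def noisy_action(action, sigma, low=-1.0, high=1.0, rng=random):
    """Adds zero-mean Gaussian exploration noise to a deterministic action,
    then clips each component to [low, high] (assumed action range)."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


def action_to_partition(a, n_partitions, low=-1.0, high=1.0):
    """Maps one continuous action component to a partition index.

    Hypothetical encoding: the range is split into n_partitions equal bins.
    """
    frac = (a - low) / (high - low)
    return min(n_partitions - 1, int(frac * n_partitions))
```

In a DDPG-style loop, the normalizer would typically be updated with each observed state before the state is passed to the actor, and the noise scale `sigma` would usually be decayed as training progresses so that exploration shrinks as the policy converges.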