Backscatter network resource allocation algorithm based on deep reinforcement learning
Cite this article: JIANG Wei, ZHU Jiang. Backscatter network resource allocation algorithm based on deep reinforcement learning[J]. Telecommunication Engineering, 2022, 62(10)
Authors: JIANG Wei  ZHU Jiang
Affiliation: a. Engineering Research Center of Mobile Communications of the Ministry of Education; b. Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funding: National Natural Science Foundation of China (61771084); Natural Science Foundation of the Chongqing Science and Technology Commission (KJQN201800834)
Abstract: To improve the average throughput of Internet of Things (IoT) devices in a backscatter network, a resource allocation mechanism is proposed, and a joint optimization model for user pairing and time slot allocation is constructed. Solving this model directly with deep reinforcement learning (DRL) leads to a high-dimensional action space and a complex neural network, so the model is decomposed into two layers of sub-problems to reduce the action space dimensionality. First, a DRL algorithm uses historical channel information to infer the current channel information and performs optimal user pairing. Then, with the user pairing fixed, a convex optimization algorithm performs optimal time slot allocation with the goal of maximizing the total throughput of the IoT devices. Simulation results show that, compared with other resource allocation methods, the proposed method effectively improves system throughput and has good channel adaptability and convergence.

Keywords: backscatter network; Internet of Things (IoT) device; resource allocation; deep reinforcement learning; throughput maximization
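
The two-layer decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' model or algorithm: the abstract gives no formulas, so the pair gain (`channel[i] * channel[j]`), the per-pair throughput `log2(1 + a_k * t_k)`, and all function names below are hypothetical. The outer layer uses plain exhaustive enumeration as a stand-in for the DRL agent that selects the user pairing, and the inner layer solves the resulting concave time slot problem by water-filling with bisection.

```python
import math
from typing import Iterator, List, Optional, Sequence, Tuple

Pair = Tuple[int, int]


def pairings(devices: List[int]) -> Iterator[List[Pair]]:
    """Yield every way to split an even-sized device list into disjoint pairs."""
    if not devices:
        yield []
        return
    first, rest = devices[0], devices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def allocate_slots(gains: Sequence[float], iters: int = 100) -> List[float]:
    """Inner layer (toy model): maximize sum(log2(1 + a_k * t_k))
    subject to sum(t_k) = 1 and t_k >= 0.

    The KKT conditions give t_k = max(0, mu - 1/a_k); the water level mu
    is found by bisection so that the slots fill the whole frame.
    """
    lo, hi = 0.0, 1.0 + max(1.0 / a for a in gains)
    for _ in range(iters):
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / a) for a in gains)
        if used > 1.0:
            hi = mu
        else:
            lo = mu
    mu = (lo + hi) / 2
    return [max(0.0, mu - 1.0 / a) for a in gains]


def throughput(gains: Sequence[float], slots: Sequence[float]) -> float:
    """Total throughput of all pairs under the assumed toy rate model."""
    return sum(math.log2(1.0 + a * t) for a, t in zip(gains, slots))


def best_allocation(
    channel: Sequence[float],
) -> Optional[Tuple[float, List[Pair], List[float]]]:
    """Outer layer: try each pairing (stand-in for the DRL agent) and keep
    the one whose optimal slot allocation yields the highest throughput."""
    best = None
    for pairing in pairings(list(range(len(channel)))):
        # Hypothetical pair gain: product of the two devices' channel gains.
        gains = [channel[i] * channel[j] for i, j in pairing]
        slots = allocate_slots(gains)
        value = throughput(gains, slots)
        if best is None or value > best[0]:
            best = (value, pairing, slots)
    return best


if __name__ == "__main__":
    value, pairing, slots = best_allocation([4.0, 1.0, 2.0, 0.5])
    print(pairing, [round(t, 3) for t in slots], round(value, 3))
```

Enumeration is only feasible for a handful of devices, since the number of pairings grows factorially. That growth is the kind of action space problem that motivates a learned pairing policy in the outer layer, while the inner layer stays a cheap convex solve.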