
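Real-time Optimal Scheduling Strategy for Electric Vehicle Clusters Based on Reinforcement Learning (基于强化学习的电动汽车集群实时优化调度策略)
Cite this article: 赵小瑾, 张开宇, 冯冬涵, 李恒杰, 周云. 基于强化学习的电动汽车集群实时优化调度策略[J]. 陕西电力, 2022, 0(1): 53-59, 81.
Authors (original script): 赵小瑾  张开宇  冯冬涵  李恒杰  周云
Affiliation: (1. Key Laboratory of Control of Power Transmission and Conversion, Ministry of Education (Shanghai Jiao Tong University), Shanghai 200240; 2. State Grid Shanghai Electric Power Research Institute, Shanghai 200437; 3. School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou, Gansu 730050)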
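Abstract (translated from the Chinese original): To address the high dimensionality and strong randomness of real-time scheduling for large-scale electric vehicles, a reinforcement-learning-based real-time optimal scheduling strategy for electric vehicle clusters is proposed. First, an economic dispatch model of grid generating units with electric vehicle cluster participation is established, with the objective of minimizing the combined cost (unit generation cost and subsidy cost). The real-time stage of this model is then formulated as a Markov decision process, which is trained and solved with a maximum-entropy deep reinforcement learning algorithm. In addition, to combine reinforcement learning's independence from forecast information with the ability of operations-research optimization algorithms to guarantee physical constraints, electric vehicle charging and unit output are optimized and scheduled separately. Finally, case studies verify the feasibility and effectiveness of the proposed strategy in reducing cost and in peak shaving and valley filling.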

Keywords (translated): electric vehicle clusters  reinforcement learning  unit economic dispatch  real-time optimization

Real-time Optimal Scheduling Strategy for Electric Vehicle Clusters Based on Reinforcement Learning
ZHAO Xiaojin, ZHANG Kaiyu, FENG Donghan, LI Hengjie, ZHOU Yun. Real-time Optimal Scheduling Strategy for Electric Vehicle Clusters Based on Reinforcement Learning[J]. Shanxi Electric Power, 2022, 0(1): 53-59, 81.
Authors: ZHAO Xiaojin  ZHANG Kaiyu  FENG Donghan  LI Hengjie  ZHOU Yun
Affiliation: (1. Key Laboratory of Control of Power Transmission and Conversion, Ministry of Education (Shanghai Jiao Tong University), Shanghai 200240, China; 2. State Grid Shanghai Electric Power Research Institute, Shanghai 200437, China; 3. School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China)
Abstract: To address the high dimensionality and strong randomness in the real-time scheduling of large-scale electric vehicles (EVs), this paper proposes a real-time optimal scheduling strategy for EV clusters based on reinforcement learning (RL). First, an economic dispatch model of the generating units in a power grid integrating EV clusters is established, with the goal of minimizing the overall cost (unit generation cost and subsidy cost). The model is then formulated as a Markov decision process (MDP), and a maximum-entropy RL algorithm is used to train and solve the MDP. In addition, to combine the RL algorithm's independence from predictive information with a traditional optimization algorithm's ability to enforce physical constraints, the EV charging power and the unit output power are optimized separately. Finally, case studies verify the feasibility and effectiveness of the proposed strategy in reducing costs and in peak shaving and valley filling.
Keywords: EV clusters  reinforcement learning  economic dispatch of units  real-time optimization
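
The abstract describes optimizing EV charging and unit output separately: a learned policy chooses the charging power, and a conventional optimizer then dispatches the generating units subject to physical limits. The following is a minimal sketch of the second step only, not the authors' model. It assumes quadratic unit cost curves and uses classical lambda iteration (bisection on the marginal cost); the unit data, the names `unit_output`, `dispatch` and `total_cost`, and the fixed `ev_charging` value standing in for an RL action are all hypothetical. The subsidy cost from the abstract is not modeled.

```python
from typing import List, Tuple

# Hypothetical unit description: (a, b, p_min, p_max) for cost a*p^2 + b*p.
Unit = Tuple[float, float, float, float]


def unit_output(lam: float, unit: Unit) -> float:
    """Output at which the unit's marginal cost equals lam, clipped to limits."""
    a, b, p_min, p_max = unit
    p = (lam - b) / (2.0 * a)
    return min(max(p, p_min), p_max)


def dispatch(units: List[Unit], demand: float, tol: float = 1e-6) -> List[float]:
    """Economic dispatch by bisection on the system marginal cost lam."""
    if not sum(u[2] for u in units) <= demand <= sum(u[3] for u in units):
        raise ValueError("demand is outside the units' combined output range")
    # Bracket lam between the lowest and highest marginal cost at the limits.
    lam_lo = min(2.0 * a * p_min + b for a, b, p_min, _ in units)
    lam_hi = max(2.0 * a * p_max + b for a, b, _, p_max in units)
    lam = lam_lo
    for _ in range(200):
        lam = 0.5 * (lam_lo + lam_hi)
        total = sum(unit_output(lam, u) for u in units)
        if abs(total - demand) < tol:
            break
        if total < demand:
            lam_lo = lam  # too little generation: raise the marginal cost
        else:
            lam_hi = lam  # too much generation: lower the marginal cost
    return [unit_output(lam, u) for u in units]


def total_cost(units: List[Unit], outputs: List[float]) -> float:
    """Sum of quadratic generation costs for the given outputs."""
    return sum(a * p * p + b * p for (a, b, _, _), p in zip(units, outputs))


# Illustrative numbers only (assumptions, not data from the paper).
units = [(0.01, 20.0, 50.0, 300.0), (0.02, 15.0, 30.0, 200.0)]
base_load = 250.0
ev_charging = 40.0  # placeholder for the charging power chosen by an RL policy
outputs = dispatch(units, base_load + ev_charging)
cost = total_cost(units, outputs)
```

In such a split, the dispatch step guarantees that unit limits and power balance hold for whatever charging power the policy proposes, and the resulting cost could serve as part of the reward signal; how the paper actually defines the reward is not stated in the abstract.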