Hunan Electric Power ›› 2024, Vol. 44 ›› Issue (4): 11-19. doi: 10.3969/j.issn.1008-0198.2024.04.002

• Special Column: New Energy Generation and Energy Storage Technology •

Dynamic Power Allocation Strategy for Hybrid Energy Storage System of Urban Rail Trains Based on Improved SAC Algorithm

HE Qingchen, QIN Bin

  1. School of Electrical and Information Engineering, Hunan University of Technology, Zhuzhou 412007, China
  • Received: 2024-03-08  Revised: 2024-07-02  Online: 2024-08-25  Published: 2024-09-09
  • About the authors: HE Qingchen (2000), male, master's student, whose main research interest is hybrid energy storage control for urban rail transit. QIN Bin (1963), male, professor, whose research covers modeling and optimal control of complex electrical systems, intelligent control of wind power generation, utilization of regenerative braking energy from permanent-magnet traction motors, energy management of microgrids and hybrid electric vehicles, and applications of signal processing and artificial intelligence.
  • Supported by:
    Natural Science Foundation of Hunan Province (2022JJ50074)

Abstract: To smooth out voltage fluctuations in the traction network of urban rail trains, a dynamic power allocation strategy based on the soft actor-critic (SAC) reinforcement learning algorithm is proposed, built on an on-board supercapacitor and a ground-based hybrid energy storage system. The strategy aims to improve the energy-saving and voltage-stabilization characteristics of the DC traction network and to protect the service life of the on-board supercapacitor. First, a dynamics model of the urban rail train is established. To address the long training time and slow convergence of the SAC algorithm in urban rail dynamic power allocation, the PEC-SAC algorithm is proposed. It combines prioritized experience replay, emphasis on recent experience, and cosine annealing: by increasing the sampling probability of recent experiences and dynamically adjusting the learning rate, it improves training efficiency and convergence speed. The state space, action space, and reward function are then designed according to these objectives, so that the train learns the optimal energy control strategy for the hybrid energy storage system through interaction with the simulation environment. A co-simulation platform is built with MATLAB/Simulink and Python, and the simulation results show that, compared with the SAC algorithm, the proposed method improves voltage stabilization by 0.36% and reduces energy consumption by 4.52%.

Key words: urban rail train, regenerative braking energy, hybrid energy storage system, dynamic power allocation, deep reinforcement learning
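
To make the two training refinements named in the abstract concrete, the following minimal Python sketch shows one generic way to combine recency-emphasized prioritized sampling with a cosine-annealed learning rate. It is not the authors' PEC-SAC implementation: the function names, the hyperparameters (alpha, recency_bonus, lr_max, lr_min), and the synthetic TD-error priorities are illustrative assumptions only.

import numpy as np

def cosine_annealed_lr(step, total_steps, lr_max=3e-4, lr_min=3e-5):
    """Learning rate decayed from lr_max to lr_min along a half cosine."""
    progress = min(step / total_steps, 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + np.cos(np.pi * progress))

def sample_indices(priorities, batch_size, alpha=0.6, recency_bonus=0.5):
    """Draw replay indices with probability ~ (priority^alpha) * recency weight.

    `priorities` are TD-error magnitudes stored oldest-first; the linear
    recency weight raises the chance of replaying newer transitions.
    """
    n = len(priorities)
    recency = 1.0 + recency_bonus * np.arange(n) / max(n - 1, 1)  # 1 .. 1+bonus
    scores = (np.asarray(priorities) ** alpha) * recency
    probs = scores / scores.sum()
    return np.random.choice(n, size=batch_size, p=probs, replace=False)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    td_errors = rng.random(1000) + 1e-3          # placeholder TD-error priorities
    print("sampled indices:", sample_indices(td_errors, batch_size=8))
    print("lr at step 5000/20000:", cosine_annealed_lr(5000, 20000))

In an actor-critic training loop, the sampled indices would select the mini-batch from the replay buffer and the annealed value would be fed to the optimizer at each update; both pieces slot into a standard SAC update without changing its loss functions.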

CLC number: