Cite this article: | ZHOU Xuesong, LIU Wenjin, MA Youjie, et al. Active disturbance rejection control of microgrid interface converters using PPO algorithm for parameters optimization[J]. Power System Protection and Control, 2025, 53(14): 90-99. |
|
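Abstract: |
DC microgrids are an important part of new-type power systems. Because renewable energy sources are random and uncertain, the output voltage of the load-side interface converters in a DC microgrid is easily disturbed, which leads to poor output characteristics. To effectively eliminate the adverse effect of these uncertainties on system performance when the controller parameters are held constant, an active disturbance rejection control method based on the proximal policy optimization (PPO) algorithm is proposed. The method lets a PPO agent interact with the conventional active disturbance rejection control system as its environment, perceive changes in the environment state, and optimize the control policy according to the reward feedback. During training, the agent explores different control actions to adjust the observer parameters adaptively, which keeps the converter output voltage stable. Finally, PPO-LADRC is compared on a digital simulation platform with conventional linear active disturbance rejection control (LADRC) and double closed-loop proportional-integral control, verifying that the proposed strategy significantly improves the dynamic performance of the system under various disturbances. |
Key words: DC microgrid; interface converter; deep reinforcement learning; active disturbance rejection control; adaptive tuning |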
DOI:10.19783/j.cnki.pspc.241396 |
Received: 2024-10-21; Revised: 2024-12-28 |
Funding: Supported by the Key Program of the National Natural Science Foundation of China (U23B20142) |
|
Active disturbance rejection control of microgrid interface converters using PPO algorithm for parameters optimization |
ZHOU Xuesong1, LIU Wenjin1, MA Youjie1, TAO Long1, WEN Hulong2,3, FENG Meili4 |
(1. Tianjin Key Laboratory of New Energy Power Conversion, Transmission and Intelligent Control (Tianjin University of Technology), Tianjin 300384, China; 2. Tianjin Ruineng Electric Co., Ltd., Tianjin 300385, China; 3. Tianjin Ruiyuan Electric Co., Ltd., Tianjin 300308, China; 4. Tianjin Anjie IOT Technology Co., Ltd., Tianjin 300392, China) |
Abstract: |
As an important component of modern power systems, DC microgrids are exposed to the randomness and uncertainty of renewable energy sources, which makes the output voltage of load-side interface converters susceptible to disturbances and degrades their output characteristics. To mitigate the adverse effects of this uncertainty on system performance when the controller parameters are held constant, this paper proposes an active disturbance rejection control method based on the proximal policy optimization (PPO) algorithm. In this method, a PPO agent interacts with the conventional active disturbance rejection control system as its environment, perceives changes in the environment state, and optimizes the control strategy according to reward feedback. During training, the agent explores different control actions to adaptively tune the observer parameters, thereby stabilizing the converter output voltage. Finally, the proposed PPO-LADRC is compared on a digital simulation platform with conventional linear active disturbance rejection control (LADRC) and double closed-loop proportional-integral control. The results verify that the proposed control strategy significantly improves the dynamic performance of the system under various disturbances. |
Key words: DC microgrid; interface converter; deep reinforcement learning; active disturbance rejection control; adaptive tuning |
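
The abstract describes an agent that adjusts the observer parameters of a linear active disturbance rejection controller at run time. The Python sketch below illustrates only that adjustable-observer idea and is not the authors' implementation. It assumes a first-order plant, a second-order linear extended state observer with the common bandwidth parameterization (gains 2·wo and wo²), and forward-Euler integration. The class `FirstOrderLADRC`, the method `set_observer_bandwidth`, the function `simulate`, and all numeric values are hypothetical. The PPO agent, its reward, and the converter model are not given in the abstract and are not shown; a fixed mid-run bandwidth change stands in for an agent action.

```python
from dataclasses import dataclass


@dataclass
class FirstOrderLADRC:
    """First-order LADRC with a second-order linear extended state observer.

    All parameter values used with this class are illustrative assumptions.
    """
    b0: float        # assumed nominal input gain of the plant
    wc: float        # controller bandwidth in rad/s
    wo: float        # observer bandwidth in rad/s (the tunable parameter)
    dt: float        # sample time in seconds
    z1: float = 0.0  # observer estimate of the plant output
    z2: float = 0.0  # observer estimate of the total disturbance

    def set_observer_bandwidth(self, wo: float) -> None:
        # Stand-in for an agent action: retune the observer while running.
        self.wo = wo

    def step(self, r: float, y: float) -> float:
        # Control law: proportional action on the estimated state,
        # minus the estimated total disturbance, scaled by 1/b0.
        u = (self.wc * (r - self.z1) - self.z2) / self.b0
        # Observer update (forward Euler) with gains 2*wo and wo**2.
        e = y - self.z1
        z1_dot = self.z2 + self.b0 * u + 2.0 * self.wo * e
        z2_dot = self.wo ** 2 * e
        self.z1 += self.dt * z1_dot
        self.z2 += self.dt * z2_dot
        return u


def simulate(t_end: float = 1.0, dt: float = 1e-4) -> float:
    """Track r = 1 on the toy plant y' = -2*y + u + d; return the final output."""
    ctrl = FirstOrderLADRC(b0=1.0, wc=50.0, wo=500.0, dt=dt)
    y = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        # Step disturbance applied in the second half of the run.
        d = 5.0 if k >= steps // 2 else 0.0
        if k == steps // 2:
            # Hypothetical retuning at the moment the disturbance appears.
            ctrl.set_observer_bandwidth(800.0)
        u = ctrl.step(r=1.0, y=y)
        y += dt * (-2.0 * y + 1.0 * u + d)
    return y


if __name__ == "__main__":
    print(f"final output: {simulate():.4f}")
```

In a learning setup along the lines the abstract outlines, the call to `set_observer_bandwidth` would be driven by the agent's action at each decision step rather than by a fixed schedule, and a reward computed from the tracking error would be fed back to the agent.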