Cite this article: SUN Zhenglong, CHEN Weihan, GENG Xindi, et al. Fast valving emergency control strategy for power system transient stability based on deep reinforcement learning[J]. Power System Protection and Control, 2025, 53(19): 175-187.
DOI:10.19783/j.cnki.pspc.241593 |
Received: 2024-11-29; Revised: 2025-06-05
Funding: National Natural Science Foundation of China (52277084); Jilin Province International Science and Technology Cooperation Project (20230402074GH)
|
Fast valving emergency control strategy for power system transient stability based on deep reinforcement learning |
SUN Zhenglong1,CHEN Weihan1,GENG Xindi2,WANG Sixuan1,YANG Hao1,PAN Chao1,CAI Guowei1 |
(1. Key Laboratory of Modern Power System Simulation and Control & Renewable Energy Technology, Ministry of Education (Northeast Electric Power University), Jilin 132012, China; 2. Hengshui Power Supply Branch, State Grid Hebei Electric Power Co., Ltd., Hengshui 053000, China)
Abstract: |
Fast valving is one of the classic control methods for improving transient stability in power systems. However, its control variables are high-dimensional and discrete, and improper parameter tuning may trigger subsequent power-angle swing instability. The complexity of strategy development makes fast valving difficult to apply online with real-time decision-making. To address this challenge, a fast valving control decision method based on deep reinforcement learning is proposed. First, a deep reinforcement learning-based emergency fast valving decision-making framework is constructed. Then, the fast valving control problem is formulated as a Markov decision process (MDP). A reward function is designed to balance optimal stability control performance against minimized control cost, and the proximal policy optimization (PPO) algorithm is used to solve the MDP, yielding a rational configuration of the fast valving strategy. Finally, the effectiveness of the proposed method is verified on the improved SG-77 system developed by CEPRI. Simulation results show that the proposed method ensures both the effectiveness and timeliness of the fast valving strategy, makes correct decisions under scenarios mismatched with pre-planned contingencies, and improves the transient stability and dynamic response capability of the power system.
Key words: fast valving decision-making; deep reinforcement learning; transient stability; proximal policy optimization algorithm
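The abstract describes a reward function that balances stability control performance against control cost. As a minimal sketch of that design idea only (the paper's actual state variables, thresholds, and weights are not given here, so every name and number below is an illustrative assumption):

```python
# Hypothetical sketch of an MDP reward balancing transient stability
# against fast-valving control cost. All quantities are assumptions
# for illustration, not the authors' actual formulation.

def fast_valving_reward(max_angle_deviation_deg: float,
                        n_valves_actuated: int,
                        stable: bool,
                        instability_penalty: float = -100.0,
                        cost_weight: float = 0.5) -> float:
    """Reward = stability term minus weighted control cost."""
    if not stable:
        # Heavy penalty if the system loses synchronism.
        return instability_penalty
    # Reward smaller rotor-angle excursions; 180 deg is taken as an
    # illustrative stability limit.
    stability_term = (180.0 - max_angle_deviation_deg) / 180.0
    # Each fast-valving action incurs a fixed control cost.
    control_cost = cost_weight * n_valves_actuated
    return stability_term - control_cost

# A stable trajectory with a small swing and one valve action scores
# higher than an unstable trajectory with several actions.
r_good = fast_valving_reward(60.0, 1, True)
r_bad = fast_valving_reward(200.0, 4, False)
```

In a PPO training loop, a reward of this shape would be returned at each environment step after time-domain simulation of the post-fault trajectory, letting the agent trade off stability margin against actuation cost.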