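Regional consumption optimization strategy for multi-energy virtual power plants based on hierarchical deep reinforcement learning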

ZHANG Ning<sup>1</sup>; YANG Lingxiao<sup>2</sup>; LI Xuannong<sup>3</sup>; HU Cungang<sup>1</sup>; SUN Qiuye<sup>4</sup>

Cite this article: ZHANG Ning, YANG Lingxiao, LI Xuannong, et al. Regional consumption optimization strategy for multi-energy virtual power plants based on hierarchical deep reinforcement learning[J]. Power System Protection and Control, 2025, 53(20): 153-163.

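1. School of Electrical Engineering and Automation, Anhui University, Hefei 230601, Anhui, China; 2. School of Artificial Intelligence, Anhui University, Hefei 230601, Anhui, China; 3. State Grid Feixi County Power Supply Company, Hefei 231299, Anhui, China; 4. Intelligent Electrical Science and Technology Research Institute, Northeastern University, Shenyang 110819, Liaoning, China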

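Abstract:

As a new energy management model, the virtual power plant (VPP) intelligently integrates and optimizes distributed energy resources, and it is important for promoting renewable energy consumption, optimizing the energy mix, and making energy systems greener. Taking the multi-energy VPP as the research object and regional energy consumption as the goal, this paper proposes a regional consumption optimal scheduling method for multi-energy VPPs based on hierarchical deep reinforcement learning. First, an indirect regional consumption operation framework for multi-energy VPPs is proposed, which preserves users' autonomy in participation while avoiding disclosure of user information. Second, a joint trading mechanism within the VPP is built on multi-energy coupling and multi-timescale characteristics; it avoids trading failures caused by ignoring energy transmission characteristics, enables flexible matching across energy types, and raises the VPP's own revenue while achieving regional self-consumption. Finally, a solution strategy based on hierarchical deep reinforcement learning is proposed to address the difficulty of solving the model, which stems from its large state-action space and sparse rewards. Simulation case studies verify the effectiveness of the proposed method and show that the proposed VPP scheduling strategy can effectively achieve regional self-consumption.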

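Keywords: virtual power plant; multi-energy trading; multi-timescale; hierarchical deep reinforcement learning

DOI: 10.19783/j.cnki.pspc.241453

CLC number:

Funding: Supported by the National Natural Science Foundation of China (62203004, 62303006)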

Regional consumption optimization strategy for multi-energy virtual power plants based on hierarchical deep reinforcement learning

ZHANG Ning<sup>1</sup>, YANG Lingxiao<sup>2</sup>, LI Xuannong<sup>3</sup>, HU Cungang<sup>1</sup>, SUN Qiuye<sup>4</sup>

1. School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China; 2. School of Artificial Intelligence, Anhui University, Hefei 230601, China; 3. State Grid Feixi County Power Supply Company, Hefei 231299, China; 4. Intelligent Electrical Science and Technology Research Institute, Northeastern University, Shenyang 110819, China

Abstract:

As a novel energy management paradigm, the virtual power plant (VPP) enables the intelligent integration and optimization of distributed energy resources, playing a significant role in promoting renewable energy consumption, optimizing energy structures, and facilitating greener energy systems. Focusing on multi-energy VPPs with the objective of achieving regional energy consumption, this paper proposes a regional consumption optimization and scheduling method based on hierarchical deep reinforcement learning. First, a non-direct regional consumption operation framework for multi-energy VPPs is proposed, which ensures user autonomy in participation while avoiding the disclosure of private information. Second, a joint trading mechanism within the VPP is designed, considering multi-energy coupling and multi-timescale characteristics. This avoids trading failures caused by neglecting energy transmission constraints, enables flexible matching across different energy types, and enhances VPP revenues while realizing regional self-consumption. Finally, an optimization strategy based on hierarchical deep reinforcement learning is developed to overcome the challenges posed by the large-scale state-action space and sparse reward characteristics of the model. Simulation case studies validate the effectiveness of the proposed method, demonstrating that the scheduling strategy can effectively achieve regional self-consumption.

Key words: virtual power plant; multi-energy trading; multi-timescale; hierarchical deep reinforcement learning
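
Illustrative note: the abstract motivates a hierarchical agent by the large state-action space and sparse rewards but does not give the algorithm. The sketch below is therefore not the authors' method. It is a minimal tabular, goal-conditioned two-level Q-learning toy, assuming a one-dimensional "imbalance level" state in which level 0 stands for full regional self-consumption. A high-level policy picks a subgoal every `K` steps and learns only from the sparse extrinsic reward, while a low-level policy learns from a dense intrinsic reward for reaching the subgoal. All names (`N_STATES`, `ACTIONS`, `K`, `HORIZON`, `step`, `greedy`, `train`, `rollout`) and all numeric settings are hypothetical.

```python
import random

N_STATES = 6           # hypothetical imbalance levels; 0 means fully self-consumed
ACTIONS = (-1, 0, 1)   # hypothetical low-level adjustments: reduce, hold, increase
K = 3                  # low-level steps per high-level decision (fast vs slow timescale)
HORIZON = 12           # step budget, checked between high-level decisions


def step(state, action):
    """Deterministic toy dynamics: move along the chain, clipped to valid levels."""
    return min(max(state + action, 0), N_STATES - 1)


def greedy(values):
    """Index of the largest value; ties resolve to the lowest index."""
    return max(range(len(values)), key=values.__getitem__)


def train(episodes=2000, alpha=0.2, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    # q_lo[state][goal][action]: low-level values, learned from the intrinsic reward
    q_lo = [[[0.0] * len(ACTIONS) for _ in range(N_STATES)] for _ in range(N_STATES)]
    # q_hi[state][goal]: high-level values, learned from the sparse extrinsic reward
    q_hi = [[0.0] * N_STATES for _ in range(N_STATES)]
    for _ in range(episodes):
        s, t = rng.randrange(1, N_STATES), 0
        while t < HORIZON and s != 0:
            # High level: epsilon-greedy choice of a subgoal level.
            g = rng.randrange(N_STATES) if rng.random() < eps else greedy(q_hi[s])
            s_start, extrinsic = s, 0.0
            for _ in range(K):
                # Low level: epsilon-greedy action conditioned on the subgoal.
                explore = rng.random() < eps
                a = rng.randrange(len(ACTIONS)) if explore else greedy(q_lo[s][g])
                s_next = step(s, ACTIONS[a])
                reached = s_next == g
                # Intrinsic reward of 1 on reaching the subgoal, else bootstrap.
                target = 1.0 if reached else gamma * max(q_lo[s_next][g])
                q_lo[s][g][a] += alpha * (target - q_lo[s][g][a])
                s, t = s_next, t + 1
                if s == 0:
                    extrinsic = 1.0  # sparse reward: only at full self-consumption
                if reached or s == 0:
                    break
            # High-level update over the whole option (one slow-timescale step).
            target = extrinsic if s == 0 else gamma * max(q_hi[s])
            q_hi[s_start][g] += alpha * (target - q_hi[s_start][g])
    return q_lo, q_hi


def rollout(q_lo, q_hi, state):
    """Follow both greedy policies and return the visited imbalance levels."""
    path, t = [state], 0
    while t < HORIZON and state != 0:
        g = greedy(q_hi[state])
        for _ in range(K):
            state = step(state, ACTIONS[greedy(q_lo[state][g])])
            path.append(state)
            t += 1
            if state in (g, 0):
                break
    return path
```

Calling `q_lo, q_hi = train()` and then `rollout(q_lo, q_hi, 5)` returns the sequence of levels visited under the learned greedy policies. The point of the split is that the low level receives feedback at every subgoal even though the extrinsic reward arrives only at level 0, which is the sparse-reward difficulty the abstract refers to.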
