Flexible resource optimization strategy for low-carbon parks based on behavioral cloning TD3 reinforcement learning

SHU Zhan<sup>1</sup>; SUN Min<sup>1</sup>; WU Yue<sup>1</sup>; WAN Zijing<sup>1</sup>; DUAN Weinan<sup>2</sup>; PENG Chunhua<sup>2</sup>

Cite this article: SHU Zhan, SUN Min, WU Yue, et al. Flexible resource optimization strategy for low-carbon parks based on behavioral cloning TD3 reinforcement learning[J]. Power System Protection and Control, 2025, 53(03): 95-107.

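1. State Grid Jiangxi Electric Power Research Institute, Nanchang 330096, Jiangxi, China; 2. School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330013, Jiangxi, China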

Abstract:

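As hubs of industrial agglomeration, industrial parks are a major source of China's carbon dioxide emissions, and achieving carbon neutrality in parks first is an important step toward China's "dual-carbon" goals. The supply side of a park integrated energy system is retrofitted for low-carbon operation by adding an electrolyzer and a hydrogen-blended gas turbine with carbon capture, and multiple types of flexible resources on the storage, supply, and demand sides are considered, to construct a low-carbon park integrated energy system. To perform efficient online low-carbon economic dispatch of the flexible resources in this system, a TD3 reinforcement learning algorithm with behavioral cloning is proposed for offline training and online optimization of the low-carbon park integrated energy system. Finally, case simulations verify the superiority of the proposed optimization strategy.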

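Keywords: park integrated energy system; multiple types of flexible resources; reinforcement learning; behavioral cloning; low-carbon economic dispatch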

DOI: 10.19783/j.cnki.pspc.240303

Classification number:

Funding: Supported by the Science and Technology Project of the Headquarters of State Grid Corporation of China (5400-202325227A-1-1-ZN)

Flexible resource optimization strategy for low-carbon parks based on behavioral cloning TD3 reinforcement learning

SHU Zhan<sup>1</sup>, SUN Min<sup>1</sup>, WU Yue<sup>1</sup>, WAN Zijing<sup>1</sup>, DUAN Weinan<sup>2</sup>, PENG Chunhua<sup>2</sup>

1. State Grid Jiangxi Electric Power Research Institute, Nanchang 330096, China; 2. School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330013, China

Abstract:

Industrial parks in China are significant contributors to the country’s carbon dioxide emissions. Prioritizing carbon neutrality in parks is a crucial step toward China’s ‘dual-carbon’ goal. This paper constructs a low-carbon park integrated energy system that adds electrolyzers and hydrogen-blended gas turbines with carbon capture to the energy supply side and considers multiple types of flexible resources on the storage, supply, and consumption sides. To efficiently optimize the online low-carbon economic dispatch of these flexible resources, a TD3 reinforcement learning algorithm with behavioral cloning is proposed for offline training and online optimization of the system. Finally, the superiority of the proposed optimization strategy is verified through simulation examples.

Key words: park integrated energy system; multiple flexible resources; reinforcement learning; behavioral cloning; low-carbon economic dispatch
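
The abstract does not say how behavioral cloning enters the TD3 update. As a rough illustration only, the sketch below shows one formulation known from the offline reinforcement learning literature (often called TD3+BC): the actor loss combines a critic term, scaled by the average magnitude of the Q-values, with a mean-squared penalty that pulls the policy toward demonstrated actions. The function name `td3_bc_actor_loss`, the weight `alpha`, and the toy inputs are hypothetical and are not taken from the paper.

```python
import numpy as np


def td3_bc_actor_loss(q_values, policy_actions, expert_actions, alpha=2.5):
    """Actor loss mixing a critic term with a behavior-cloning penalty.

    Assumed (TD3+BC-style) form, not confirmed by the paper:
        loss = -lam * mean(Q) + mean((pi(s) - a_expert)^2),
        lam  = alpha / mean(|Q|)
    """
    q_values = np.asarray(q_values, dtype=float)
    policy_actions = np.asarray(policy_actions, dtype=float)
    expert_actions = np.asarray(expert_actions, dtype=float)

    # Scale the critic term so it stays comparable to the cloning term.
    lam = alpha / (np.mean(np.abs(q_values)) + 1e-8)
    rl_term = -lam * np.mean(q_values)

    # Behavior-cloning term: squared distance to the demonstrated actions.
    bc_term = np.mean((policy_actions - expert_actions) ** 2)
    return rl_term + bc_term


# Toy batch: two states, one-dimensional dispatch action per state.
q = [1.0, 1.0]
expert = [[0.2], [0.4]]
loss_match = td3_bc_actor_loss(q, expert, expert)
loss_off = td3_bc_actor_loss(q, [[1.2], [1.4]], expert)
```

With identical Q-values, the policy that reproduces the demonstrated actions gets the lower loss, which is the intended effect of the cloning term. A larger `alpha` weights the critic more heavily relative to imitation.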
