• Journal indexed in the CSCD core collection
  • Chinese core journal
  • China science and technology core journal

Electric Power Construction ›› 2024, Vol. 45 ›› Issue (5): 80-93. doi: 10.12204/j.issn.1000-7229.2024.05.009

• Smart Grid •

Voltage Optimization Control of Distribution Networks Considering Inter-Regional Auxiliary Rewards

ZHOU Xiang1, LI Xiaolu1, LIU Jinsong2, LIN Shunfu1

  1. College of Electrical Engineering, Shanghai University of Electric Power, Shanghai 200090, China
    2. Electric Power Research Institute, State Grid Shanghai Electric Power Company, Shanghai 200437, China
  • Received: 2023-08-25  Published: 2024-05-01  Online: 2024-04-29
  • Corresponding author: LI Xiaolu (1971), female, Ph.D., associate professor; her research interests include power grid dispatch automation and power system analysis and operation. E-mail: lixiaolu_sh@163.com
  • About the authors: ZHOU Xiang (1998), male, master's student; his research interests include optimal operation and control of distribution networks. E-mail: zxdxyx1@163.com;
    LIU Jinsong (1971), male, Ph.D., senior engineer; his research interests include distribution automation and distribution management systems;
    LIN Shunfu (1983), male, Ph.D., professor; his research interests include power quality and smart grid end-user technologies.
  • Supported by: National Natural Science Foundation of China (51977127)

Abstract:

A soft open point can effectively solve the voltage fluctuation problem caused by the large-scale integration of distributed photovoltaics into a distribution network, but it also deepens the cooperation required between regions. At present, when multi-agent deep reinforcement learning algorithms are used for voltage optimization, each agent is trained only on the reward from its own region, so the agents lack coordination and the optimality of their output policies is difficult to guarantee. To address this problem, a voltage optimization method for distribution networks that considers inter-regional auxiliary rewards was proposed. First, a multi-timescale voltage optimization framework based on multi-agent deep reinforcement learning was established. Second, for the agents controlling the soft open points, the reward from an agent's own region was defined as its primary reward, whereas the rewards from neighboring regions were defined as auxiliary rewards. The benefit of an auxiliary reward to training was then assessed using the dot product of the gradients of the primary and auxiliary reward loss functions with respect to the network parameters, and the participation factor of the auxiliary reward was adapted with an evolutionary game approach. Finally, tests on a modified IEEE 33-node system verified that the proposed method stabilizes the agents' training process and improves the optimization performance of their policies.
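The core mechanism described in the abstract (weighting a neighboring region's auxiliary reward by a participation factor and judging its usefulness through the dot product of loss gradients) can be illustrated with a minimal PyTorch sketch. Everything below (network size, placeholder losses, the sign-based update of the participation factor) is an illustrative assumption rather than the paper's implementation; in particular, the evolutionary-game update is replaced here by a simple sign-based nudge.

```python
# Hypothetical sketch (not the authors' code): judging an auxiliary reward by the
# dot product of loss gradients and nudging its participation factor accordingly.
import torch
import torch.nn as nn

# Toy stand-in for one agent's policy/critic network.
policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
params = list(policy.parameters())

def flat_grad(loss):
    """Gradient of `loss` w.r.t. the network parameters, flattened into one vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def combine_losses(primary_loss, auxiliary_loss, alpha, step=0.01):
    """
    Mix the own-region (primary) loss with a neighboring-region (auxiliary) loss.
    A positive dot product between their parameter gradients indicates that the
    auxiliary signal points in a direction that also helps the primary objective;
    the participation factor `alpha` is nudged up or down by `step` accordingly
    (a crude stand-in for the paper's evolutionary-game update).
    """
    benefit = torch.dot(flat_grad(primary_loss), flat_grad(auxiliary_loss))
    alpha = min(1.0, max(0.0, alpha + step * torch.sign(benefit).item()))
    return primary_loss + alpha * auxiliary_loss, alpha

# Illustrative usage with placeholder losses.
obs = torch.randn(16, 8)
out = policy(obs)
primary_loss = out.pow(2).mean()            # placeholder own-region objective
auxiliary_loss = (out - 1.0).pow(2).mean()  # placeholder neighboring-region objective
total_loss, alpha = combine_losses(primary_loss, auxiliary_loss, alpha=0.5)
total_loss.backward()
```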

Key words: multi-agent deep reinforcement learning, voltage optimization, auxiliary rewards, evolutionary game, participation factor
