Electrical & Energy Management Technology (电器与能效管理技术), 2023, Vol. 0, Issue (3): 11-15. doi: 10.16628/j.cnki.2095-8188.2023.03.002

• Research and Analysis •

基于深度强化学习的切机控制策略研究

卢恒光1, 林碧琳2, 温步瀛2   

  1. 福建华电万安能源有限公司, 福建 龙岩 364000
  2. 福州大学 电气工程与自动化学院, 福建 福州 350116
  • 收稿日期:2022-07-08 出版日期:2023-03-30 发布日期:2023-04-11
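  • About the authors: LU Hengguang (b. 1969), male, senior engineer, works mainly on electric power production safety. | LIN Bilin (b. 1997), female, master's student, research interest: reinforcement-learning-based generator tripping control for transient stability. | WEN Buying (b. 1967), male, professor, Ph.D., research interests: optimal operation of power systems and wind power grid integration.
  • Funding:
    * Natural Science Foundation of Fujian Province (2022J01113)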

Research on Generator Tripping Control Strategy Based on Deep Reinforcement Learning

LU Hengguang1, LIN Bilin2, WEN Buying2   

  1. Fujian Huadian Wan’an Energy Co., Ltd., Longyan 364000, China
  2. School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350116, China
  • Received: 2022-07-08 Online: 2023-03-30 Published: 2023-04-11

摘要:

电力系统受到大扰动后会进入紧急运行状态,必须及时采取紧急控制措施使系统恢复稳定运行。切机控制是维护系统稳定最有效且最常用的控制措施。针对传统基于策略表的控制方法在实际应用中存在故障不匹配的问题,提出了一种基于深度强化学习的电力系统暂态稳定切机控制决策方法。首先,引入深度确定性策略梯度(DDPG)算法,结合等面积定则,对算法各要素重新设计。其次,建立基于DDPG算法的切机控制决策模型。最后,利用PSA-BPA软件和Pycharm软件搭建单机-无穷大系统和IEEE39节点系统切机控制仿真模型,通过算例验证了所提方法的有效性。

关键词: 暂态稳定, 切机控制, 深度强化学习, 深度确定性策略梯度

Abstract:

After a large disturbance, a power system enters an emergency operating state, and emergency control measures must be taken promptly to restore stable operation. Generator tripping is the most effective and most commonly used control measure for maintaining system stability. To address the fault-mismatch problem that arises when the traditional strategy-table-based control method is applied in practice, a decision-making method for transient stability generator tripping control based on deep reinforcement learning is proposed. First, the deep deterministic policy gradient (DDPG) algorithm is introduced, and each element of the algorithm is redesigned in combination with the equal area criterion. Second, a generator tripping control decision model based on the DDPG algorithm is established. Finally, generator tripping control simulation models of a single-machine infinite-bus system and the IEEE 39-bus system are built using PSA-BPA and PyCharm, and case studies verify the effectiveness of the proposed method.

Key words: transient stability, generator tripping control, deep reinforcement learning, deep deterministic policy gradient
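
The abstract refers to the equal area criterion without giving details, so the following is only a minimal illustrative sketch of that criterion for a single-machine infinite-bus system, not the authors' DDPG design. It assumes a classical machine model, a fault during which electrical power drops to zero, a post-fault power-angle curve equal to the pre-fault one, and that tripping a fraction of generation only scales the mechanical power (changes in inertia and reactance are ignored). The function names `critical_clearing_angle` and `area_margin` and all parameters are hypothetical.

```python
import math


def critical_clearing_angle(pm, pmax):
    """Critical clearing angle (rad) from the equal area criterion.

    Assumes P_e = pmax * sin(delta) before and after the fault and
    P_e = 0 during the fault (illustrative assumption).
    """
    delta0 = math.asin(pm / pmax)  # pre-fault operating angle
    cos_dc = (pm / pmax) * (math.pi - 2.0 * delta0) - math.cos(delta0)
    return math.acos(cos_dc)


def area_margin(pm, pmax, delta_c, trip_fraction=0.0):
    """Decelerating area minus accelerating area (per-unit power * rad).

    A non-negative margin indicates first-swing stability under the
    simplified model. trip_fraction is the share of mechanical power
    removed at the clearing instant (hypothetical simplification).
    """
    delta0 = math.asin(pm / pmax)
    accel = pm * (delta_c - delta0)  # area gained while P_e = 0
    pm_after = (1.0 - trip_fraction) * pm
    # Largest angle at which the rotor is still decelerated after tripping.
    delta_max = math.pi - math.asin(pm_after / pmax)
    decel = (pmax * (math.cos(delta_c) - math.cos(delta_max))
             - pm_after * (delta_max - delta_c))
    return decel - accel


if __name__ == "__main__":
    pm, pmax = 0.8, 2.0  # per-unit values chosen only for illustration
    dc = critical_clearing_angle(pm, pmax)
    late = dc + 0.2  # fault cleared later than the critical angle
    print("critical clearing angle (rad):", dc)
    print("margin, late clearing, no tripping:", area_margin(pm, pmax, late))
    print("margin, late clearing, 50% tripping:", area_margin(pm, pmax, late, 0.5))
```

In a reinforcement-learning setting, a quantity like this margin could plausibly serve as one ingredient of a reward signal, but how the paper actually combines the criterion with DDPG is not stated in this record.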

CLC number: