Physics-aware And Risk-aware Machine Learning For Power System Operations


Hao Zhu
The University of Texas at Austin
Webinar, March 29, 2022

Presentation Outline
- A primer on supervised learning
- Three machine learning (ML) examples
  - Topology-aware learning for real-time market
  - Risk-aware learning for DER coordination
  - Scalable learning for grid emergency responses
- Summary

Power of AI
- Unprecedented opportunities offered by diverse sources of data
  - Synchrophasor and IED data
  - Smart meter data
  - Weather data
  - GIS data, ...
How to harness the power of ML to tackle problem-specific challenges in real-time power system operations?

A primer on supervised learning
- Unknown joint distribution for $(X, Y)$
  - Classification: $Y \in \{\pm 1\}$ or $Y \in \{1, \ldots, C\}$
  - Regression: $Y \in \mathbb{R}^b$
- Given examples, aka data samples $\{(x_k, y_k)\}$
  - $x_k$: input feature
  - $y_k$: output target/label
- Without $y_k$ → unsupervised or semi-supervised learning
- Samples from dynamical systems → reinforcement learning

Learning problem formulation
- Goal: construct a function $f$ to map $x \to y$
- Predicted value $\hat{y} = f(x) \in Y$ to be close to $y$
- Loss function: $l(\hat{y}, y) = l(f(x), y) \ge 0$
  - For regression, use $L_p$ norms, e.g., $l(\hat{y}, y) = \|\hat{y} - y\|_p$
  - For classification, cross-entropy loss, hinge loss, etc.
- Minimize the sample mean of the loss over the given examples
- Excellent generalization (error bounds on) performance?
Vidal, Rene, et al. "Mathematics of deep learning." arXiv preprint arXiv:1712.04741 (2017).
Bartlett, Peter L., Andrea Montanari, and Alexander Rakhlin. "Deep learning: a statistical viewpoint." arXiv preprint arXiv:2103.09177 (2021).
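
The sample-mean objective above can be sketched in a few lines. This is a minimal illustration for scalar targets, not the formulation from the slides; the names `lp_loss` and `empirical_risk`, the toy data, and the two candidate predictors are all assumptions made for the example.

```python
def lp_loss(y_hat, y, p=2):
    """Per-sample loss |y_hat - y|^p for scalar targets (an L_p-type loss)."""
    return abs(y_hat - y) ** p


def empirical_risk(f, samples, p=2):
    """Sample mean of the loss of predictor f over (x, y) pairs."""
    return sum(lp_loss(f(x), y, p) for x, y in samples) / len(samples)


# Toy data generated by y = 2x + 1 (hypothetical, for illustration only).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]


def good(x):
    """Candidate predictor that matches the data-generating rule."""
    return 2.0 * x + 1.0


def bad(x):
    """Candidate predictor that ignores the slope and offset."""
    return x


risk_good = empirical_risk(good, samples)
risk_bad = empirical_risk(bad, samples)
print(risk_good, risk_bad)
```

Learning then amounts to searching over candidate functions for the one with the smallest empirical risk; how well that choice performs on unseen samples is the generalization question raised on the slide.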

Parameterized models for f
- Impossible to search over any function $f$ → parameterization
- Linear: $f(x) = w^\top x + b$, parameterized by $w$ and $b$
  - A simple model structure to use
  - Linear regression (LS, LAV)
  - Linear classification (logistic regression or SVM)
- Nonlinear $f$ for better prediction
  - Polynomials, Gaussian Processes (GPs), etc.
  - Kernel learning: $f$ in a Hilbert space for some kernel
  - Neural networks (NN): layers of nonlinear functions

Regularization
- Data overfitting (losses → 0)
  - Features redundant: e.g., $x_i$ included more than once in different forms
  - Models too complex: high-order polynomials, deep neural networks
  - We can fit any K data samples perfectly using a (K-1)-th order polynomial
- Regularized objective: sample-mean loss plus $\lambda$ times a norm of the parameter $w$
- Hyperparameter $\lambda > 0$ balances between data fitting and model complexity
  - $L_2$ norm/Ridge: small values, or smooth using $\sum_i (w_i - w_{i+1})^2$
  - $L_1$ norm/Lasso: sparse $w$ (many more zero entries)
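
Both points on this slide can be checked numerically. The sketch below is illustrative only: the data are made up (roughly y = x with noise), `lagrange_fit` and `ridge_slope` are hypothetical helper names, and the ridge example assumes a one-parameter model y ≈ w·x with objective Σ(y − w·x)² + λw².

```python
def lagrange_fit(xs, ys):
    """Return the unique polynomial of degree <= K-1 through K points (distinct xs)."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f


def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - w*x)^2) + lam * w^2 over the scalar w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Four noisy samples of the line y = x (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.2, 1.8, 3.1]

f = lagrange_fit(xs, ys)                       # cubic: zero training loss
train_errors = [abs(f(x) - y) for x, y in zip(xs, ys)]
extrapolated = f(4.0)                          # far from the underlying trend near 4

w_plain = ridge_slope(xs, ys, lam=0.0)         # ordinary least squares
w_ridge = ridge_slope(xs, ys, lam=14.0)        # shrunk toward zero
print(train_errors, extrapolated, w_plain, w_ridge)
```

The cubic reproduces every training sample yet extrapolates poorly, which is the overfitting the slide warns about; the ridge penalty trades a worse fit for a smaller parameter.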

Deep (D)NN architecture
- Perceptron (single-layer NN): convert a linear function to a nonlinear function by a nonlinear activation $\sigma$: sigmoid, Tanh, ReLU
- NNs: basically multi-layer perceptron (MLP)
  - Layered, feed-forward networks (input x, output y)
  - Nodes in the hidden layers are also called neurons or units
  - 2-layer NNs can approximate all continuous functions, while for arbitrary nonlinear ones 3 layers are sufficient
Source: Deep Learning book

Gradient descent (GD) via backpropagation
- Nonlinear $f$ → nonconvex opt. problem
- GD-based learning: $w \leftarrow w - \alpha \nabla E(w)$
- In practice, local minima may not be a concern [LeCun, 2014]
- Efficient computation of gradient in a backward way using the "chain rule"
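
The update rule and the backward chain-rule computation can be written out by hand for a very small network. This is a sketch under stated assumptions, not code from the talk: a scalar-input network with one hidden tanh layer, mean squared error as E(w), fixed hand-picked initial weights, and a toy sine-fitting task are all choices made for illustration.

```python
import math


def forward(params, x):
    """One hidden layer: y_hat = sum_j v[j] * tanh(w[j] * x + b[j]) + c."""
    w, b, v, c = params
    h = [math.tanh(wj * x + bj) for wj, bj in zip(w, b)]
    return sum(vj * hj for vj, hj in zip(v, h)) + c, h


def mse(params, data):
    """E(w): mean squared prediction error over the data."""
    return sum((forward(params, x)[0] - y) ** 2 for x, y in data) / len(data)


def gradients(params, data):
    """Backpropagate E(w) from the output to every parameter via the chain rule."""
    w, b, v, c = params
    n = len(w)
    gw, gb, gv, gc = [0.0] * n, [0.0] * n, [0.0] * n, 0.0
    for x, y in data:
        y_hat, h = forward(params, x)
        d_out = 2.0 * (y_hat - y) / len(data)          # dE/dy_hat
        gc += d_out
        for j in range(n):
            gv[j] += d_out * h[j]
            d_pre = d_out * v[j] * (1.0 - h[j] ** 2)   # back through tanh
            gw[j] += d_pre * x
            gb[j] += d_pre
    return gw, gb, gv, gc


def gd_step(params, data, alpha):
    """w <- w - alpha * grad E(w), applied to every parameter group."""
    w, b, v, c = params
    gw, gb, gv, gc = gradients(params, data)
    return ([wj - alpha * g for wj, g in zip(w, gw)],
            [bj - alpha * g for bj, g in zip(b, gb)],
            [vj - alpha * g for vj, g in zip(v, gv)],
            c - alpha * gc)


# Toy regression task: nine samples of sin(pi * x) on [-1, 1].
data = [(k / 4.0, math.sin(math.pi * k / 4.0)) for k in range(-4, 5)]
params = ([0.5, -0.3, 0.8], [0.0, 0.1, -0.1], [0.2, -0.4, 0.3], 0.0)

initial = mse(params, data)
for _ in range(500):
    params = gd_step(params, data, alpha=0.1)
final = mse(params, data)
print(initial, final)
```

In practice an automatic-differentiation library would replace the hand-written `gradients` function; writing it out only makes the backward pass visible.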

Variations of DNN
- Fully-connected NN (FCNN): weight parameters grow with data size
- Idea: reuse the weight parameters, aka, filters!
  - Convolutional NN (CNN): spatial filters for images/video
  - Recurrent NN (RNN): temporal filters for texts, speech
  - Graph NN (GNN): graph filters for networked systems

Overview
- We visit three problems that use domain knowledge to better design NN models that are physics-informed and risk-aware
  - Topology-aware learning for real-time market: simpler model for efficient training
  - Risk-aware learning for DER coordination: reduced risks of voltage violations
  - Scalable learning for grid emergency responses: fast mitigations under limited data
[Figure labels: communication link, fast meter]


ML for optimal power flow (OPF)
[Figure: powerful OPF solvers and a neural network (NN) model, each mapping input to output]
- Real-time computation of the OPF solutions by learning the I/O mapping

Existing work and our focus
- Integration of renewable, flexible resources increases the grid variability and motivates real-time, fast OPF via training a neural network (NN)
  - Identifying the active constraints (for dc-OPF) [Misra et al'19][Deka et al'19]
  - Directly mapping the ac-OPF solutions [Guha et al'19]
  - Warm start the search for ac feasible solution [Baker '19][Zamzam et al'20]
  - Address the uncertainty in stochastic OPF [Mezghani et al'20]
  - Connect to the duality analysis of convex OPF [Chen et al'20][Singh et al'20]
Focus: Exploit the grid topology to reduce the NN model complexity

OPF for real-time market
- Power network modeled as a graph with N nodes
- ac-OPF for all nodal injections
- Nodal input: power limits, costs
- Nodal output: optimal p/q
- Fully-connected (FC) NN? Each FCNN layer has a parameter count that grows quadratically with N!

Topology dependence
- [Owerko et al'20] uses graph learning to predict p/q
- Locational marginal price (LMP) from the dual problem
  - Strongly depends on the graph topology and congested lines
- ISF (injection shift factor) matrix $S$ from graph Laplacian
  - Shares the same eigenspace as the graph Laplacian

LMP map with locality

Graph NN (GNN): topology-based filtering
- Input formed by nodal features as rows
- GNN layer $l$ with learnable parameters
  - Topology-based graph filter
  - Feature filters
- If lines are sparse, the number of parameters for each GNN layer is much smaller than for an FCNN layer → explore higher-dim. mapping
Hamilton, William L. "Graph representation learning." 2020. wlh/grl book/
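
The layer equation on this slide did not survive extraction, so the sketch below assumes a generic form from the graph-learning literature: `X_next = relu(W @ X @ H)`, where `W` is an N x N graph filter that may be nonzero only on the diagonal and on connected node pairs, and `H` is a d x d_out feature filter. The 4-node path graph, the filter values, and all function names are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def relu(M):
    """Elementwise max(0, value)."""
    return [[max(0.0, v) for v in row] for row in M]


def graph_filter_mask(n_nodes, lines):
    """1 where W may be nonzero: the diagonal and each pair of connected nodes."""
    mask = [[1 if i == j else 0 for j in range(n_nodes)] for i in range(n_nodes)]
    for i, j in lines:
        mask[i][j] = mask[j][i] = 1
    return mask


def gnn_layer(X, W, H):
    """X: N x d nodal features (rows are nodes); W: graph filter; H: feature filter."""
    return relu(matmul(matmul(W, X), H))


def param_counts(n_nodes, lines, d_in, d_out):
    """Learnable entries per layer: sparse W plus H, versus a dense FC layer."""
    gnn = n_nodes + 2 * len(lines) + d_in * d_out
    fcnn = (n_nodes * d_in) * (n_nodes * d_out)
    return gnn, fcnn


# 4-node path graph 0-1-2-3 with two features per node (made-up numbers).
lines = [(0, 1), (1, 2), (2, 3)]
mask = graph_filter_mask(4, lines)
W = [[0.5 * m for m in row] for row in mask]    # all allowed entries set to 0.5
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
H = [[1.0, -1.0], [0.0, 1.0]]

Y = gnn_layer(X, W, H)
small = param_counts(4, lines, 2, 2)
large = param_counts(1000, [(i, i + 1) for i in range(999)], 8, 8)
print(Y, small, large)
```

Under these assumptions the GNN count grows with the number of nodes and lines, while the fully-connected count grows with the square of the number of nodes, which is the motivation for spending the saved parameters on higher-dimensional feature mappings.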

GNN for predicting LMPs
- LMP prediction [Ji et al'16, Geng et al'16]
- GNN-based LMP can determine the optimal p/f
- Feasibility-regularization (FR) to reduce line flow violations
Liu, Shaohui, Chengyang Wu, and Hao Zhu. "Graph Neural Networks for Learning Real-Time Prices in Electricity Market." ICML Workshop on Tackling Climate Change with Machine Learning, 2021.
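
One way to picture a feasibility-regularized loss is a prediction error plus a penalty on line-flow limit violations. The sketch below is an assumption-laden illustration, not the loss from the cited paper: it uses dc line flows f = S p computed from an injection shift factor matrix `S`, a sum-of-overloads penalty, a weight `mu`, and made-up numbers for a 3-node, 2-line system.

```python
def line_flows(S, p):
    """dc line flows f = S p for ISF matrix S (lines x nodes) and injections p."""
    return [sum(s * pi for s, pi in zip(row, p)) for row in S]


def flow_violation_penalty(S, p, f_max):
    """Total amount by which |flow| exceeds each line limit (zero if feasible)."""
    return sum(max(0.0, abs(f) - fm) for f, fm in zip(line_flows(S, p), f_max))


def fr_loss(p_pred, p_opt, S, f_max, mu):
    """Mean squared prediction error plus mu times the flow-violation penalty."""
    err = sum((a - b) ** 2 for a, b in zip(p_pred, p_opt)) / len(p_opt)
    return err + mu * flow_violation_penalty(S, p_pred, f_max)


# Hypothetical system data.
S = [[0.5, -0.5, 0.0],
     [0.25, 0.25, -0.5]]
f_max = [1.0, 1.0]
p_opt = [1.5, -0.5, -1.0]     # feasible target injections
p_pred = [2.0, -1.0, -1.0]    # prediction that overloads the first line

flows = line_flows(S, p_pred)
penalty = flow_violation_penalty(S, p_pred, f_max)
loss = fr_loss(p_pred, p_opt, S, f_max, mu=2.0)
print(flows, penalty, loss)
```

The penalty is zero whenever the predicted dispatch respects all line limits, so it only steers training away from infeasible predictions.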

LMP prediction results
- 118-bus ac-opf and 2383-bus dc-opf; GNN/FCNN with feasibility regularization (FR)
- Metrics: LMP and $p_g$ prediction error; line flow limit violation rate

GNN for classifying congested lines
- Classifying the status for the top 10 congested lines with cross-entropy loss
- Metrics: recall (true positive rate), F1 score
- GNN better in performance scaling for large systems, thanks to reduced complexity

System   Model   Recall    F1 score
118ac    GNN     98.40%    96.10%
118ac    FCNN    97.70%    94.60%
2383dc   GNN     90.00%    81.40%
2383dc   FCNN    87.30%    78.30%
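
For reference, the two metrics in the table follow the standard definitions for binary labels. The sketch below uses a hypothetical helper name and made-up labels (1 = congested, 0 = not congested); it does not reproduce the numbers above.

```python
def recall_f1(y_true, y_pred):
    """Recall (true positive rate) and F1 score for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall == 0.0:
        return recall, 0.0
    return recall, 2.0 * precision * recall / (precision + recall)


# Made-up congestion labels for eight (line, scenario) pairs.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

recall, f1 = recall_f1(y_true, y_pred)
print(recall, f1)
```

Recall is the natural headline metric here because a missed congested line (false negative) is the costly error; F1 additionally accounts for false alarms.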

Topology adaptivity
- In addition to reduced complexity, GNN-based prediction can easily adapt to varying grid topology
- Pre-trained GNN for a nominal topology can warm-start the learning for randomly selected two-line outages
- Re-training takes only 3-5 epochs to converge to a good prediction
- Currently pursuing a formal analysis of this transfer capability


ML for distributed energy resources (DERs)
- Rising DERs at grid edge motivate scalable & efficient coordination to support the operations of connected distribution grids
- Lack of frequent, real-time communications
- Distribution control center or DMS may broadcast messages to the full system
[Figure: distribution substation feeder with fast meters/D-PMUs (sub-second) and slow meters (15 minutes – 1 hour)]
Liu, Hao Jan, Wei Shi, and Hao Zhu. "Hybrid voltage control in distribution networks under limited communication rates." IEEE Transactions on Smart Grid 10.3 (2018): 2416-2427.
Molzahn, Daniel K., et al. "A survey of distributed optimization and control algorithms for electric power systems." IEEE Transactions on Smart Grid 8.6 (2017): 2941-2962.

Existing work and our focus
- Scalable DER operations as a special instance of OPF
  - Kernel SVM learning [Karagiannopoulos et al'19], [Jalali et al'20]
  - DNNs for ac-/dc-OPF [see Part I]
  - Reinforcement learning (RL) [Yang et al'20, Wang et al'19]
- Enforcing network constraints is challenging
  - Heuristic projection or penalizing the violations
Focus: Address the statistical risks to ensure safe operational grid limits

Optimal DER coordination
- DERs for voltage regulation and power loss reduction
  - Problem data: available reactive power, network matrix, operating condition, voltage limits
- (Multi-phase) linearized dist. flow (LDF) model leads to a convex QP
- But a centralized solution requires high communication rates
[Figure: central controller and fast meters, with nodal quantities $\mathbf{y}_n$ and $\mathbf{z}_n$]

ML for DER optimization
- Similar to OPF, want to predict the optimal decisions
- Learn a scalable NN model, one for each node $n$, with nodal weights to be learned
- Similarly, we can use GNN architecture such that all nodes use the same filter
- Average loss function: mean-square error (MSE)
[Figure labels: communication link, fast meter]

Risk-aware learning
- Consider the conditional value-at-risk (CVaR) for predicting $\mathbf{z}$, for a given significance level $\alpha$
  - Evaluated over the top $\alpha K$ samples
  - $\lambda$: regularization hyperparameter
- CVaR turns out very useful for voltage constraints
Shanny Lin, Shaohui Liu, and Hao Zhu. "Risk-Aware Learning for Scalable Voltage Optimization in Distribution Grids," Power Systems Computation Conference (PSCC) 2022 (accepted).
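
The "top αK" idea has a simple sample-based form: average the largest fraction α of the K per-sample losses. The sketch below assumes that reading, with α interpreted as the tail fraction, and assumes the risk-aware objective adds λ times this CVaR term to the mean loss; the slide's exact equation was not recoverable, and the function names and loss values are hypothetical.

```python
import math


def empirical_cvar(losses, alpha):
    """Mean of the ceil(alpha * K) largest of the K sample losses."""
    k = max(1, math.ceil(alpha * len(losses)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


def risk_aware_objective(losses, alpha, lam):
    """Mean loss plus lam times the empirical CVaR (assumed form)."""
    mean_loss = sum(losses) / len(losses)
    return mean_loss + lam * empirical_cvar(losses, alpha)


# Made-up per-sample losses with one severe outlier.
losses = [0.1, 0.2, 0.3, 5.0]

cvar_half = empirical_cvar(losses, alpha=0.5)     # average of the worst two
cvar_all = empirical_cvar(losses, alpha=1.0)      # reduces to the plain mean
objective = risk_aware_objective(losses, alpha=0.5, lam=0.1)
print(cvar_half, cvar_all, objective)
```

With α = 1 the CVaR term is just the mean loss; smaller α concentrates the penalty on the worst-case samples, which is what makes it attractive for voltage limits.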

Accelerating CVaR learning
- CVaR loss is known to preserve convexity of loss function
  - But the NN model is typically nonconvex; recent extension [Kalogerias'21]
- A key computation challenge is learning efficiency with worst-case samples
  - Modern sampling-based ML tools reduce the accuracy of gradient computation
- We developed a straightforward mini-batch selection algorithm (Alg. 1 later) that only uses those of sufficient risk value for computing gradient
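
The selection idea can be sketched as follows: within each mini-batch, keep only the samples whose loss is among the largest, and compute the gradient from those alone. This is not the authors' Alg. 1, whose details are not given here; the top-fraction rule, the scalar model y ≈ w·x with squared loss, and all names and numbers are assumptions for illustration.

```python
import math


def select_high_risk(losses, alpha):
    """Indices of the ceil(alpha * B) largest losses in a mini-batch of size B."""
    k = max(1, math.ceil(alpha * len(losses)))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:k])


def cvar_gradient_step(w, batch, alpha, step):
    """One descent step on y_hat = w * x using only the selected samples."""
    losses = [(w * x - y) ** 2 for x, y in batch]
    idx = select_high_risk(losses, alpha)
    grad = sum(2.0 * (w * batch[i][0] - batch[i][1]) * batch[i][0] for i in idx) / len(idx)
    return w - step * grad, idx


# Made-up mini-batch: three well-fit samples and one badly-fit sample.
batch = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 6.0)]

w_new, used = cvar_gradient_step(w=1.0, batch=batch, alpha=0.25, step=0.01)
print(w_new, used)
```

Averaging the gradients of the selected samples gives a (sub)gradient of the sample CVaR of the batch, so the low-risk samples can be skipped in the backward pass; that skipped work is one plausible source of the training-time savings reported on the following slides.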

Risk of predicting $\mathbf{q}$ decisions
- IEEE 123-bus system with six DER nodes of flexible $\mathbf{q}$ output
- All DERs use limited power information to learn the optimal decision
- Error performance very similar due to the high prediction accuracy
- Yet, training time accelerated by CVaR and the proposed selection algorithm

Risk of voltage violation
- Further incorporating the CVaR of voltage prediction
- Reduced max voltage deviation (worst-case) - higher operational safety
- Computational efficiency improved by the proposed selection algorithm


Grid emergency responses
- Grid resilience challenged by emerging types of variable energy resources (VERs), and increasingly by extreme weather events
- It is imperative to design the grid operations with effective emergency responses
  - Load shedding
  - Topology optimization
  - ...
- How to attain the decisions in a scalable and safe manner?

Centralized optimal load shedding (OLS)
- Load shedding determined by control center with system-wide information
- AC optimal load shedding (OLS) program cast as a special case of AC-OPF
[Figure: 14-node network with control center; legend: node (bus), failure, load shedding]

ML for decentralized load shedding
- Each load learns optimal decision rule from a large number of historical or synthetic scenarios
- Defined by an input feature and local shedding solutions
Yuqi Zhou, Jeehyun Park, and Hao Zhu, "Scalable Learning for Optimal Load Shedding Under Power Grid Emergency Operations," PES General Meeting (PESGM) 2022 (accepted)

Scalable learning of load shedding
- Offline training is performed for various contingency and load conditions
- Load centers quickly make decisions during online phase in response to contingencies

Prediction under single line outage
- IEEE 14-bus system; quadratic cost functions
- All $(N-1)$ contingency scenarios, under different load conditions (1000 samples for each scenario)

Summary
- Topology-aware learning for real-time market: simpler model for efficient training
- Risk-aware learning for DER coordination: reduced risks of voltage violations
- Scalable learning for grid emergency responses: fast mitigations under limited data
Future directions
- I: Topology adaptivity and other transfer learning ideas
- II: Convergence analysis and connections to safe learning
- III: Generalized emergency responses and risk-awareness

Education resources
- UT grad course "Data Analytics in Power Systems," new slides
- PowerSys 2020 NSF Workshop on Forging Connections between Machine Learning, Data Science, & Power Systems /home
- DOE-funded EPRI GEAT with Data with data.html

Learning and Optimization for Smarter Electricity Infrastructure
- Learning for grid resilience
- Learning for dynamic resources
- Learning for power electronics based resources
Thank you!
Hao /@HaoZhu6
