Project Team
Vijay Vittal, Project Leader
Trevor Werho, Graduate Student
Arizona State University
Mladen Kezunovic
Ce Zheng, Graduate Student
Vuk Malbasa, Post-Doctoral Research Associate
Texas A&M University
Junshan Zhang
Miao He, Graduate Student
Arizona State University
PSERC Publication 13-39
August 2013
Acknowledgements
This is the final report for the Power Systems Engineering Research Center (PSERC)
research project titled "Data Mining to Characterize Signatures of Impending System
Events or Performance from PMU Measurements" (project S-44). We express our
appreciation for the support provided by PSERC's industry members and by the National
Science Foundation under the Industry/University Cooperative Research Center
program.
We wish to thank:
Naim Logic (Salt River Project)
Juan Castaneda (Southern California Edison)
Khaled Abdul-Rahman (California Independent System Operator)
James Kleitsch (American Transmission Company)
Sharma Kolluri (Entergy)
Executive Summary
This project applies data mining techniques to characterize signatures of
impending system events or performance from phasor measurement units (PMU)
measurements. The project will evaluate available data mining tools and analyze the
ability of these tools to characterize signatures of impending system events or
detrimental system behavior. The use of PMU measurements from multiple locations will
also be considered. The performance of the data mining tools will be verified by
comparing the results obtained for measurements corresponding to known events on the
system. The basis of the proposed approach is to use a historical data set of PMU
measurements, along with information regarding actual events that occurred on the
system during the historical period considered in the data set, and apply the decision tree
based data mining techniques available in the commercial software Classification and
Regression Trees (CART) to identify signatures of impending events. A decision tree can
be thought of as a flowchart representing a classification system. It consists of a sequence
of simple questions regarding critical attributes (CAs).
The project consists of three parts. Part 1 deals with the use of data mining in
conjunction with PMU measurements to characterize signatures of impending system
events. Part 2 deals with power system oscillatory stability and voltage stability based on
voltage and current phasor measurements. Part 3 deals with fundamental research to
improve the performance of decision trees using robust ensemble decision trees with
adaptive learning and also accounting for loss of PMU measurements. Some details of
each part are provided below.
Part 1: Data Mining to Characterize Signatures of an Impending Island Formation
from PMU measurements
This study is aimed at using real PMU measurements to predict and detect
significant system events with the help of the data-mining tool CART. The program
CART (classification and regression trees) produced by Salford Systems is a data-mining
tool that can be used to analyze problems that contain a large number of variables. The
historical PMU data used in this study is from the Entergy power system in Louisiana
when Hurricane Gustav impacted the network. During the storm, 14 tie lines were lost
that created an electrical island containing Baton Rouge and New Orleans. The PMU
measurements captured during the storm were studied in a variety of ways to identify
signatures that provide critical information regarding the status of the system.
Careful analysis was conducted to determine whether or not the island could be
detected by only using the PMU measurements. It was found that the most effective
approach for identifying the creation of the island was to use the PMU measurements of
voltage phase angle. By comparing the phase angle measurements between PMUs, in this
case, the island could have been detected in approximately 4 seconds. Also, by
comparing different sets of PMUs, the location of the island could be determined by
which PMUs were inside or outside of the affected area. Because this approach only
considers the PMU measurements to form conclusions, the same method could be applied
to any system containing PMUs, with only slight modification, and still provide the
ability to quickly and reliably detect the formation of an island within the system.
Provided with the system power flow and dynamic data corresponding to the time
when Hurricane Gustav entered the system, simulations were conducted to attempt to
recreate and match the event to the historical PMU data. Load and generation levels
across a wide range of the system were adjusted to closely match the phase angle
difference seen in the PMU data. Next, the conditions inside the island were adjusted using
the known generator dispatch and the available SCADA data. It was found that the
direction of the power flowing on the last tie line must have been opposite to the SCADA
data. Also, it was found that in order to match the simulation to the PMU frequency
measurements, the governor reference at one of the generators must have been reduced
just following the creation of the island. Performing these actions allowed the event to
closely match to the PMU measurements and provide a better understanding of what
happened just after the island formed.
Lastly, the PMU data was used to try to predict the island formation and identify
signatures that predicted impending events. Since there was insufficient data to search for
signatures by using the single island formation in the available PMU data, 50 simulations
were conducted to build a CART database. The simulations were analyzed intuitively and
with CART to determine any predicting signatures. It was found that there is a strong
correlation between a sudden change in voltage phase angle and the loss of a tie line. A
number of simulations also showed a sudden change in voltage within the island area
after the loss of a tie line. These different signatures were searched for in the real PMU
data at the times when tie lines were reported to have been removed from the system. It
was found that when the second-to-last tie line went offline, there was a 12° change in
phase angle measured inside the island. This signature precedes the island formation by
38 minutes and could have alerted system operators that this area needed attention.
This study was successful in using CART, along with an in-depth knowledge of
power systems, to analyze PMU data from a historic event. The data-mining tool CART
helped quantify and understand the phenomenon observed in the PMU data. The method
of identifying an island formation using voltage phase angle measurements is both
effective and reliable, and could be used in real applications. The signatures found to
predict the island formation are much less reliable. Large changes in load or generation
could also create a sudden change in phase angle and the method could be prone to false
alarms. This method of island formation prediction could likely be improved by pairing it
with additional information, such as SCADA data. However, this study only considers
the information that can be drawn from the PMUs alone. In the future as more PMUs are
placed in the power system, it is a reasonable assumption that the predicting signatures
found in this study will be easier to identify and provide more information.
Part 2: Data Mining to Characterize Impending Oscillatory and Voltage Stability
Events
Traditionally, time-domain simulation based on system modeling is used as the
primary tool to analyze power system stability. This method is straightforward and
accurate as long as an adequate system model and measurements are used. However, two
obstacles have prevented this method from being applied in real-time applications: 1) it is
Project Publications
Student Theses:
Trevor Werho (Arizona State University), "Application of Data Mining Techniques to
PMU Measurements to Detect Impending Signatures of System Failures," PhD,
anticipated date of graduation: May 2014.
Miao He (Arizona State University), "A Data Analytics Framework for Smart Grids:
Spatio-temporal Wind Power Analysis and Synchrophasor Data Mining," PhD, date of
graduation: August 2013.
Conference Papers:
V. Malbasa, C. Zheng, and M. Kezunovic (Texas A&M), "Power system online
stability margin estimation using active learning and synchrophasor data," PowerTech
2013, Grenoble, France, June 2013.
C. Zheng, V. Malbasa, and M. Kezunovic (Texas A&M), "A fast stability analysis
scheme based on classification and regression tree," IEEE Conference on Power System
Technology (POWERCON), Auckland, New Zealand, October 2012.
C. Zheng, V. Malbasa, and M. Kezunovic (Texas A&M), "Online estimation of
oscillatory stability using synchrophasors and a measurement-based approach," submitted
to 17th International Conference on Intelligent System Applications to Power Systems
(ISAP), Tokyo, Japan, July 2013.
M. He, V. Vittal, and J. Zhang (Arizona State University), "A Data Mining Framework
for Online Dynamic Security Assessment: Decision Trees, Boosting, and Complexity
Analysis," IEEE PES Conference on Innovative Smart Grid Technologies, Washington
DC, United States, Jan. 2012.
Journal Papers:
C. Zheng, V. Malbasa, and M. Kezunovic (Texas A&M), "Regression tree for stability
margin prediction using synchrophasor measurements," IEEE Transactions on Power
Systems, Vol. 28, No. 3, May 2013.
M. He, V. Vittal, and J. Zhang (Arizona State University), "Online dynamic security
assessment with missing PMU measurements: A data mining approach," IEEE
Transactions on Power Systems, Vol. 28, No. 2, pp. 1969-1977, May 2013.
M. He, J. Zhang, and V. Vittal (Arizona State University), "Robust On-line Dynamic
Security Assessment using Adaptive Ensemble Decision Tree Learning," accepted for
publication, IEEE Transactions on Power Systems.
Table of Contents
1 Data Mining to Characterize Signatures of an Impending Island Formation from PMU
Measurements ............................................................................................................... 1
1.1 Introduction .......................................................................................................... 1
1.1.1 CART ............................................................................................................. 1
1.1.2 Sample Case from Entergy ............................................................................ 2
1.5 Conclusions ........................................................................................................ 40
2 Data Mining to Characterize Impending Oscillatory and Voltage Stability Events ..... 44
2.1 Introduction ........................................................................................................ 44
2.1.1 Problem Statement ....................................................................................... 44
2.1.2 Project Objectives ........................................................................................ 46
2.1.3 Literature Review......................................................................................... 47
2.1.4 Proposed Research ....................................................................................... 48
3 Data Mining for Online Dynamic Security Assessment using PMU Measurements 133
3.3 Proposed Robust Online DSA for OC Variations and Topology Changes ..... 141
3.3.1 Offline Training ......................................................................................... 143
3.3.2 Periodic Updates ........................................................................................ 147
3.3.3 Online DSA using PMU Measurements .................................................... 149
3.3.4 An Illustrative Example ............................................................................. 151
3.3.5 Application to the WECC System ............................................................. 157
3.4 Proposed Robust Online DSA for Missing PMU Measurements .................... 163
3.4.1 Handling Missing Data by using Surrogate in DTs ................................... 166
3.4.2 Proposed Random Subspace Method for Selecting Attribute Subsets ...... 168
3.4.3 Proposed Approach for Online DSA with Missing PMU Measurements . 173
3.4.4 Case Study ................................................................................................. 180
List of Figures
Figure 1.1 Mablevale frequency versus time ..................................................................... 3
Figure 1.2 Sterlington frequency versus time .................................................................... 3
Figure 1.3 Ninemile frequency versus time ....................................................................... 4
Figure 1.4 Waterford frequency versus time ..................................................................... 4
Figure 1.5 Waterford-Sterlington phase-angle difference versus time .............................. 5
Figure 1.6 Decision tree created from island formation data............................................. 9
Figure 1.7 Decision tree created from island resynchronization data .............................. 12
Figure 1.8 PMU frequency measurements of island formation ....................................... 20
Figure 1.9 Frequency plot of simulated island formation ................................................ 20
Figure 1.10 PMU frequency measurements of island formation ..................................... 23
Figure 1.11 Frequency plot of simulated island formation .............................................. 23
Figure 1.12 Historical line flows of Gypsy-Fairview 230 kV....................................... 24
Figure 1.13 Simulated MW flow of Gypsy-Fairview 230 kV ...................................... 25
Figure 1.14 Voltage phase angle at Waterford versus time ............................................. 29
Figure 1.15 Voltage magnitude at Waterford versus time ............................................... 30
Figure 1.16 Voltage magnitude at Waterford versus time ............................................... 31
Figure 1.17 Voltage phase angle at Waterford versus time ............................................. 31
Figure 1.18 Pruned CART decision tree .......................................................................... 36
Figure 1.19 Ninemile-Sterlington phase angle difference versus time ............................ 39
Figure 1.20 Waterford-Sterlington phase angle difference versus time .......................... 39
Figure 1.21 Waterford-Sterlington phase angle difference versus time .......................... 40
Figure 2.1 Power system stability analysis using data from various sources .................. 46
Figure 2.2 Proposed research framework ........................................................................ 50
Figure 2.3 Difference between conventional approach and the DT method ................... 50
Figure 2.4 From time-domain simulation to the proposed scheme ................................. 53
Figure 2.5 Proposed oscillatory stability assessment scheme .......................................... 56
Figure 2.6 Proposed voltage stability assessment scheme ............................................... 57
Figure 2.7 One-line diagrams of the IEEE 9-bus and 39-bus test systems ...................... 62
Figure 2.8 CT stability assessment for the 39-bus system in one replication .................. 65
Figure 2.9 Classification tree performance using different tree growing methods.......... 65
Figure 2.10 An example of the RT model structure ........................................................ 68
Figure 2.11 Proposed framework of the RT-based stability margin prediction and event
detection ............................................................................................................................ 70
Figure 2.12 Trajectory of voltage and oscillatory stability margins of the IEEE 39-bus
(New England) test system ............................................................................................... 73
Figure 2.13 RT predicted margins versus the actual stability margins of the IEEE 39-bus
system. Left: OSM-RT performance; Right: VSM-RT performance ............................... 76
Figure 2.14 Relative cost of a series of differently sized RTs ......................................... 77
Figure 2.15 Regression trees for oscillatory stability margin prediction ......................... 77
Figure 2.16 One-line diagram of the WECC 179-bus equivalent system........................ 79
Figure 2.17 New case prediction accuracy of RTs trained with differently sized data sets.
Left: OSM-RT; Right: VSM-RT ...................................................................................... 81
Figure 2.18 Scheme for RTs to handle system topology change ..................................... 86
Figure 2.19 Methodology for voltage stability assessment ............................................. 91
Figure 2.20 Procedures for creating the training data set ................................................ 97
Figure 2.21 Comparison of active learning and random sampling on the 9-bus system for
the oscillatory stability classification task using SVM ..................................................... 99
Figure 2.22 Comparison of active learning and random sampling on the 9-bus system for
the voltage stability classification task using SVM .......................................................... 99
Figure 2.23 Comparison of active learning and random sampling on the 39-bus system
for the oscillatory stability classification task using SVM ............................................. 100
Figure 2.24 Comparison of active learning and random sampling on the 39-bus system
for the voltage stability classification task using SVM .................................................. 101
Figure 2.25 Comparison of active learning and random sampling on the 9-bus system for
the oscillatory stability classification task using ANN ................................................... 102
Figure 2.26 Comparison of active learning and random sampling on the 9-bus system for
the voltage stability classification task using ANN ........................................................ 102
Figure 2.27 Comparison of active learning and random sampling on the 39-bus system
for the voltage stability classification task using ANN .................................................. 103
Figure 2.28 Comparison of active learning and random sampling on the 39-bus system
for the oscillatory stability classification task using ANN ............................................. 103
Figure 2.29 OSM-RT topology and node splitters of the 9-bus system ........................ 106
Figure 2.30 IEEE 9-bus system VSM-RT and OSM-RT variable importance.............. 107
Figure 2.31 RT performance considering different PMU placements in the 179-bus
system ............................................................................................................................. 110
Figure 2.32 Typical frequency band of different oscillation types ................................ 114
Figure 2.33 Mode parameters identified from power system measurements ................ 115
List of Tables
Table 1.1 Example CART Database .................................................................................. 6
Table 1.2 Area Load and Generation Modifications ....................................................... 14
Table 1.3 Generation Modification .................................................................................. 14
Table 1.4 Generators Modified Within Island ................................................................. 15
Table 1.5 Final Island Generator and Load Settings........................................................ 15
Table 1.6 Generator Dynamic Models ............................................................................. 16
Table 1.7 Exciter Dynamic Models ................................................................................. 17
Table 1.8 Governor Dynamic Models ............................................................................. 17
Table 1.9 Actions Taken During Dynamic Simulation ................................................... 19
Table 1.10 Actions Taken During Second Dynamic Simulation ..................................... 22
Table 1.11 Actions Taken During Island Simulation ...................................................... 28
Table 1.12 Island Prediction CART Database Structure ................................................. 32
Table 1.13 Decision Tree Test Results ............................................................................ 34
Table 2.1 Knowledge Base Generated for Classification Analysis ................................. 63
Table 2.2 Performance of the Classification Tree ........................................................... 64
Table 2.3 Performance of the Regression Trees .............................................................. 76
Table 2.4 New Case Testing Accuracy using Different Data Mining Tools for the 39-bus
System ............................................................................................................................... 78
Table 2.5 Computational Speed of Regression Trees ...................................................... 82
Table 2.6 Performance of the 179-Bus Regression Trees Considering PMU Measurement
Error .................................................................................................................................. 83
Table 2.7 Regression Tree Performance under System Topological Variations ............. 84
Table 2.8 Operating Points Generated for Training of Data Mining Tools ..................... 98
Table 2.9 Accuracy Results on Oscillatory Stability Task ............................................ 104
Table 2.10 Accuracy Results on Voltage Stability Task ............................................... 104
Table 2.11 WECC 179-Bus System Combined Bus Ranking ......................................... 109
Table 2.12 Low-Frequency Oscillation Modes Obtained from Model Initialization .... 125
Table 2.13 Estimate Mode #5 by Applying AR to Ambient Data ................................. 127
Table 2.14 Classification Tree Performance .................................................................. 127
Table 2.15 Results Comparison ..................................................................................... 129
Table 3.1 Misclassification error rate of robustness testing ........................................... 157
1.1 Introduction
The objective of this aspect of the project is to examine the efficacy of the
commercial data-mining tool CART, in identifying signatures of impending power
system events by using actual phasor measurement unit (PMU) measurements. The
historical PMU data used in this study is from the Entergy power system in Louisiana. In
September of 2008, hurricane Gustav made landfall in southern Louisiana. During the
course of the storm an electrical island was formed around Baton Rouge and New
Orleans. This study aims to use CART to analyze the PMU measurements captured
during the hurricane to better understand future islanding events.
1.1.1 CART
The program CART (classification and regression trees) produced by Salford
Systems is a data-mining tool that can be used to analyze problems that contain a large
number of variables. CART uses a procedure called binary recursive partitioning to build
a decision tree. Starting at the root node, simple questions called critical splitting rules
(CSRs) are asked regarding a critical attribute (CA). Each answer to a question branches
to one of two child nodes, each with its own CSR. Nodes that do not branch to further
nodes are called terminal nodes and end the growth of the tree. Once all terminal
nodes are reached the decision tree is complete and can be used to categorize new inputs.
When given input and output data, CART will determine its inherent input-output
relationship in the form of a decision tree. This process is called decision tree training.
Once training is complete, new input data can be dropped down the decision tree to
generate the previously unknown output. Using this method, the historical PMU
measurements will serve as the necessary information needed to train the decision tree.
By training the decision tree, CART will determine any precursor signatures of the
impending system event that is contained within the data. With this decision tree
completed, new PMU measurements could be dropped down the tree to determine if a
particular system event is likely to occur in the future [1].
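The binary recursive partitioning procedure described above can be sketched in a few lines. The following is an illustrative toy implementation, not the commercial CART product: the two attributes, the training values, and the labels are invented, and a real tree would also be pruned and cross-validated.

```python
# A minimal sketch of CART-style binary recursive partitioning.
# Illustrative only; all data and thresholds here are invented.
import numpy as np

def gini(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def grow(X, y, depth=0, max_depth=3):
    """Recursively split (X, y); return a nested dict node or a terminal label."""
    if depth == max_depth or len(np.unique(y)) == 1:
        values, counts = np.unique(y, return_counts=True)
        return values[np.argmax(counts)]          # terminal node: majority label
    best = None
    for j in range(X.shape[1]):                   # try each critical attribute
        for t in np.unique(X[:, j])[:-1]:         # and each candidate threshold
            left = X[:, j] <= t
            score = (left.sum() * gini(y[left]) +
                     (~left).sum() * gini(y[~left])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:                              # attributes constant; stop growing
        values, counts = np.unique(y, return_counts=True)
        return values[np.argmax(counts)]
    _, j, t = best
    left = X[:, j] <= t                           # critical splitting rule: x_j <= t
    return {"attr": j, "thresh": t,
            "left": grow(X[left], y[left], depth + 1, max_depth),
            "right": grow(X[~left], y[~left], depth + 1, max_depth)}

def classify(node, x):
    """Drop one sample down the tree to a terminal node."""
    while isinstance(node, dict):
        node = node["left"] if x[node["attr"]] <= node["thresh"] else node["right"]
    return node

# Toy training set: [phase-angle difference (deg), frequency (Hz)] -> label.
X = np.array([[2.0, 60.00], [5.0, 59.99], [-8.0, 60.01],
              [45.0, 60.20], [51.0, 59.70], [60.0, 60.30]])
y = np.array(["No Island"] * 3 + ["Island"] * 3)
tree = grow(X, y)
print(classify(tree, np.array([48.0, 60.1])))     # -> Island
```

On this toy set the tree learns a single split on the angle-difference attribute, mirroring how the trained trees discussed later are dominated by phase-angle rules.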
The PMU measurements were studied to determine whether island formation, time and location, could be detected
a high level of confidence.
To determine if the location of the Entergy island could be predicted by only
using the PMU measurements, the frequency data from four PMUs were selected:
Mablevale, Sterlington, Ninemile, and Waterford. One hour of frequency data around the
reported time of islanding for each of the selected PMUs can be seen in Figure 1.1,
Figure 1.2, Figure 1.3, and Figure 1.4.
around the interval +240° to -180°. In order to sensibly view the phase angle
measurements, the data must first be unwrapped. This is done by taking the phase angle
measurements from two PMUs and computing their difference. Logic must then be used to
keep the difference within +240° to -180°, but the result remains bounded to that
interval. To remove this bound, further logic can check the conditions just before each
360° jump in angle and shift the data accordingly so that it forms a continuous curve [3].
The adjusted difference in voltage phase angle at Waterford and Sterlington was plotted
around the time of island formation and can be seen in Figure 1.5.
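The unwrapping step described above can be sketched with NumPy. For simplicity this sketch wraps the angle difference into a symmetric [-180°, +180°) interval rather than the asymmetric interval quoted above, and the two-PMU data are synthetic.

```python
# Sketch of unwrapping a phase-angle difference between two PMUs; the data
# are synthetic and the wrap interval is simplified to [-180, +180) degrees.
import numpy as np

t = np.arange(0.0, 10.0, 1.0 / 30.0)      # 30 samples/s, a typical PMU rate
true_diff = 40.0 * t                      # the actual, steadily growing separation

# Each sample of the reported difference is wrapped into [-180, +180).
wrapped = (true_diff + 180.0) % 360.0 - 180.0

# Shift at each 360-degree jump so the curve is continuous again.
unwrapped = np.degrees(np.unwrap(np.radians(wrapped)))

print(np.allclose(unwrapped, true_diff))  # -> True
```

Because consecutive samples change by far less than 180°, each 360° jump in the wrapped series is unambiguous, which is the same condition the explicit shifting logic above relies on.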
[Table 1.1: example CART database, with one column per input variable and an output-label column]
The CART database can have any number of input variables up to what the CART
license will allow. Adding an additional input variable would increase the CART
database by 1 column. Additional rows may be added to the database to increase the
amount of data points included in the analysis. Each CART input variable can be either
continuous or categorical. The database constructed for island formation detection
analysis contained 4 input variables. The variables used in the CART database are as
follows:
Frequency and voltage phase angle data measured at Waterford (inside island)
Frequency and voltage phase angle data measured at Ninemile (inside island)
Frequency and voltage phase angle data measured at Sterlington (outside island)
Frequency and voltage phase angle data measured at Mablevale (outside island)
The output label used in the island detection database was "Island" or "No Island" depending
upon the time the frequency and phase angle measurements were taken and whether or
not the island was present in the system. The final CART island detection database
contained 9 columns and approximately 120,000 rows (1 hour of PMU recordings).
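The layout of this training database can be illustrated directly. The column naming follows the description above; the sample count, measurement values, and island timing below are synthetic placeholders.

```python
# Sketch of the island-detection database layout described above: eight
# measured input columns plus one output label per time sample. The PMU
# names come from the text; all values here are synthetic placeholders.
import numpy as np
import pandas as pd

n = 120                                   # stand-in for the ~120,000 real samples
rng = np.random.default_rng(1)
island_start = 80                         # synthetic index where the island forms

columns = {}
for name in ["Waterford", "Ninemile", "Sterlington", "Mablevale"]:
    columns[f"{name}_freq_hz"] = 60.0 + rng.normal(0.0, 0.01, n)
    columns[f"{name}_angle_deg"] = rng.normal(0.0, 2.0, n)

db = pd.DataFrame(columns)
db["label"] = np.where(np.arange(n) >= island_start, "Island", "No Island")

print(db.shape)                           # -> (120, 9): 8 inputs + 1 label
```

Each row corresponds to one synchronized sample across the four PMUs, so adding another input variable widens the table by one column without changing the row count.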
The program CART uses the database to train a decision tree. A decision tree
contains a specific input-output relationship. Using the decision tree requires one value of
each of the input variables included in the study that correspond to the same sample. An
example of this would be any of the rows of the CART database minus the output label.
Starting at the topmost node, apply the logic rule to the input data. This rule will then
point to one of the two child nodes. Continuing to apply the rules at each node will
eventually lead to a terminal node, a node that does not lead to any lower nodes.
Each terminal node corresponds to one of the possible output labels. The label of the
terminal node that the inputs lead to is the output that corresponds to those particular
inputs. In this way, a specific output category can be given to every set of input data.
The PMU data used in the CART island formation analysis was the frequency and
voltage phase angle data from Ninemile, Waterford, Sterlington, and Mablevale. The
PMU frequency data was given to CART unmodified while the voltage phase angle
measurements required the same adjustments done in the previous island detection
analysis. In the CART database all of the phase angle measurements are relative to
Sterlington and the phase angle measurements at Sterlington are entered as all zeros. For
example, the statement (Ninemile Phase = -10°) means the phase angle at Ninemile is 10°
less than the phase angle at Sterlington. The decision tree generated by the CART island
formation analysis can be seen in Figure 1.6.
Node 1:
Terminal Node 1:
Terminal Node
Label: Island
17973 data points of the training data lead to this terminal node
Node 2:
Terminal Node 5:
Terminal Node
Label: Island
Node 3:
Terminal Node 4:
Terminal Node
Label: Island
Conditions to reach: Waterford phase > -15.7° and Waterford phase < 32.44°, Ninemile
phase > 69.97°
Node 4:
Terminal Node 3:
Terminal Node
Label: Island
Conditions to reach: Waterford phase > -15.7° and Waterford phase < 32.44° and
Ninemile phase < 69.97° and Waterford frequency > 60.1 Hz
Terminal Node 2:
Terminal Node
Label: No Island
Conditions to reach: Waterford phase > -15.7° and Waterford phase < 32.44° and
Ninemile phase < 69.97° and Waterford frequency < 60.1 Hz
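Read in sequence, the node conditions above collapse into a short chain of splitting rules. The function below encodes that reading for illustration; the thresholds are the ones quoted in the text (angles in degrees, relative to Sterlington), while the ordering of the checks is an assumed reading of the tree's structure.

```python
# The terminal-node conditions above, encoded as a chain of splitting rules.
# Thresholds come from the text; the check ordering is an assumed reading
# of the tree in Figure 1.6.
def classify_island(waterford_phase_deg, ninemile_phase_deg, waterford_freq_hz):
    """Return the terminal-node label for one set of PMU measurements."""
    if not (-15.7 < waterford_phase_deg < 32.44):
        return "Island"          # phase angle outside the normal band
    if ninemile_phase_deg > 69.97:
        return "Island"
    if waterford_freq_hz > 60.1:
        return "Island"          # the rare frequency-only split
    return "No Island"

print(classify_island(10.0, 5.0, 60.0))   # -> No Island
print(classify_island(-40.0, 5.0, 60.0))  # -> Island
```

Written this way, it is easy to see why the first rule dominates: almost every training sample is resolved by the Waterford phase-angle band alone.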
Scoring the decision tree using the training data shows the decision tree is correct
99.999% of the time. However, it would still not be judicious to apply this decision tree
to a future island formation, even to an islanding event in the same location. This is
because the CART database only contains the data from a single islanding event. Only by
training a decision tree with many different island formations would the decision tree
become reliable enough to be implemented for island detection. However, this decision
tree is still useful. Notice that about 99.97% of all training data points lead to terminal
nodes 1, 2, and 5. These are the dominant terminal nodes. Combining the rules of all
three dominant nodes leads to the statement: if the phase angle at Waterford is between
+32.4396° and -15.6969° from Sterlington then there is no island; otherwise, an island
must exist. The exact values of the thresholds found by CART are unique to this
particular event, but the rule suggests that when the phase angles recorded at PMUs
within one area differ greatly from PMUs outside that area, then there is a high likelihood
an island has formed. This is very similar to what was seen in the intuitive analysis of
island detection. It can be seen that CART only uses frequency data to classify 8 data
points of the ~120,000 points of training data. This supports the finding that voltage
phase angle measurements are much more sensitive to island formations than frequency.
A CART analysis was also done using the PMU data corresponding to the island
resynchronization. The PMU data used in the CART island resynchronization analysis
was the frequency and voltage phase angle data from Ninemile, Waterford, Sterlington,
and Mablevale. The decision tree created by CART island resynchronization analysis can
be seen in Figure 1.7.
resynchronization occurs and the phase angle data cannot be modified like the previous
analysis. Because the phase angle data does not contain the same information as the
previous analysis CART must use the frequency data when detecting the island
reconnection. This causes the decision tree to become much more complex. The decision
tree's dominant nodes state that if the Ninemile phase angle is within +43.11° and -73.0714°
of Sterlington and the Ninemile frequency is within 0.04 Hz of Mablevale then there is no
longer an island present. If these conditions are not true then an island is present. These
rules show some resemblance to the rules found in the island formation decision tree.
Here the bound in phase angle is much wider but still suggests that a large difference in
phase angle is a strong indicator of an island being present.
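The resynchronization rule combines both measurements. A minimal sketch, with illustrative function and variable names (the thresholds are the CART values quoted above):

```python
def island_present(ninemile_angle_deg, sterlington_angle_deg,
                   ninemile_freq_hz, mablevale_freq_hz):
    """The island is absent only when both dominant-node conditions hold."""
    angle_ok = -73.0714 <= (ninemile_angle_deg - sterlington_angle_deg) <= 43.11
    freq_ok = abs(ninemile_freq_hz - mablevale_freq_hz) <= 0.04
    return not (angle_ok and freq_ok)

print(island_present(10.0, 0.0, 60.00, 60.01))  # False: reconnected
print(island_present(10.0, 0.0, 60.20, 60.00))  # True: frequencies still differ
```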
11° (with the inside of the island leading the outside). However, the power flow data showed
that the phase angle between these two areas was around 40° (outside of the island leading
the inside). The load and generation over a large area were reduced. Additionally, the
generation at bus #303007 was increased. These changes resulted in a phase angle
difference of 9.24° (inside of the island leading the outside). The adjusted areas can be seen
in Table 1.2. The adjusted generator can be seen in Table 1.3.
Area Numbers: 332 (LAGN), 351 (EES), 502 (CELE), 503 (LAFA)

           Original   Adjusted
P Load     15632      13248
Q Load     5288       4476
P Gen      14750      12753

Bus       New Value
303007    575 MW
The next modification from the original data was done using the known generator
dispatch and tie line SCADA data. The original data was modified such that there were
only 3 generators online within the islanded area. The generator Ninemile Unit 5 was set
to 220 MW, Waterford Unit 1 was set to 49 MW, and Gypsy Unit 2 was set to 77 MW.
All other generators in the islanded area were turned off. The load within the island was
determined using the known generator dispatch and the SCADA data from the tie line
Gypsy-Fairview 230kV, which was last to go offline. The SCADA data available from
the last tie line shows that power was flowing into the island. It was later determined
through simulation that power must have been flowing out of the island instead but with
similar magnitude. The generators modified within the island can be seen in Table 1.4
and the final island generator and load settings can be seen in Table 1.5.
Bus Number   Generator   Original P (MW)   Pnew (MW)
336002       GA Gulf     --                offline
336151       WAT U1      41                49
336179       UCARBST2    39                offline
336222       GYP U2      36                77
336252       NMIL U5     220               220

P Gen    Q Gen    P Load   Q Load
346      -358     246      56

Island buses: 335568-335572, 335601, 335613-335620, 335665, 336001-336464
cause while investigating the simulated event. Also, the parameter values for all governor
models were reverted to the default model values. Finally, the values of R were
increased on all governor models. These changes helped modify the initial simulated
frequency response after island formation.
Generator model parameters

Generator     Model
Ninemile U5   GENROU

T'do = 4.33     T''do = .041    T'qo = .481    T''qo = .059
H = 2.62        D = 0
Xd = 1.783      Xq = 1.764      X'd = .291     X'q = .411
X''d = X''q = .249              Xl = .199
S(1.0) = .11    S(1.2) = .119

Exciter model parameters

Exciter       Model   TR   KA    TA    TB   TC   VRMAX   VRMIN   KE      TE     KF    TF1   E1       SE(E1)   E2       SE(E2)
--            IEEX1   0    50    .06   0    0    1.5     -1.5    -.045   .5     .08   1     3.3784   .074     4.5045   .267
Ninemile U5   IEEX1   0    400   .02   0    0    8.15    -7.33   1       1.21   .03   1     2.71     .94      3.62     1.25

Governor model parameters

Governor      Model   R     T1   Vmax   Vmin   T2   T3   Dt
Ninemile U5   TGOV1   .07   .5   1      0      3    10   0
neglect this tie line because it is a lower voltage tie line (138kV) and other simulations
have shown this line has little impact during the event.
The simulation was conducted by removing the tie lines in the order in which they
went offline during the hurricane. However, the line Coly-Willow Glen 500kV was not
disconnected in the same way as the other tie lines. The available information indicated that
this line was unable to serve the island because the three transformers at Willow Glen
went offline. To simulate this scenario, the Willow Glen bus #335618 was disconnected
from the system rather than disconnecting the tie line.
Many simulations were performed in order to understand the governor response to the
island formation. It was concluded that actions must be taken during the dynamic
simulation in order to match the simulation frequency to the historical PMU data. Four
seconds after the island is formed in the simulation, the load within the island is increased
by 40 MW. Eight seconds later, the load in the island is increased by another 80 MW.
The exact actions taken during simulation are shown in Table 1.9.
Time      Element          Action
5 sec     336462-500360    Disconnect Line
10 sec    336015-336016    Disconnect Line
15 sec    336032-303202    Disconnect Line
20 sec    336141-303204    Disconnect Line
25 sec    336006-336007    Disconnect Line
30 sec    335568-335660    Disconnect Line
35 sec    335536-335665    Disconnect Line
40 sec    335771-303200    Disconnect Line
45 sec    335500-335618    Disconnect Line
50 sec    335568-335659    Disconnect Line
55 sec    335657-335658    Disconnect Line
60 sec    335618           Disconnect Bus
70 sec    336190-336138    Disconnect Line
74 sec    --               Increase island load by 40 MW
82 sec    --               Increase island load by 80 MW
100 sec   --               End Simulation
The island formation captured by PMU measurements can be seen in Figure 1.8.
The frequency inside the island was measured at Ninemile. The frequency outside the
island was measured at El Dorado. The plot of frequency of the simulated event can be
seen in Figure 1.9.
[Figures 1.8 and 1.9: frequency plots; y-axis Frequency (Hz), 59.6 to 60.6.]
around 5.6 seconds, compared to around 5 seconds in the PMU data. The frequency in the
simulated island recovers to 59.934 Hz. The frequency in the PMU data (ignoring the
present oscillations) recovers to about 59.9 Hz. The oscillations seen in the PMU data
were driven by some unknown source within the island and could not be captured by the
dynamic simulation.
It was decided that reducing generation just after island formation was much
more feasible than increasing the load. A second simulation was conducted using the
same conditions as the previous island formation simulation. Instead of scaling load after
island formation, the governor reference at Ninemile Unit 5 was reduced while leaving
the other generators, Waterford Unit 1 and Gypsy Unit 2, unmodified. Four seconds after
island formation, the governor reference at Ninemile Unit 5 was reduced by 6 MW.
Twelve seconds after island formation, it was reduced by an additional 2.5 MW. The
exact actions taken during simulation are shown in Table 1.10.
Time      Element          Action
5 sec     336462-500360    Disconnect Line
10 sec    336015-336016    Disconnect Line
15 sec    336032-303202    Disconnect Line
20 sec    336141-303204    Disconnect Line
25 sec    336006-336007    Disconnect Line
30 sec    335568-335660    Disconnect Line
35 sec    335536-335665    Disconnect Line
40 sec    335771-303200    Disconnect Line
45 sec    335500-335618    Disconnect Line
50 sec    335568-335659    Disconnect Line
55 sec    335657-335658    Disconnect Line
60 sec    335618           Disconnect Bus
70 sec    336190-336138    Disconnect Line
74 sec    --               Reduce Ninemile U5 governor reference by 6 MW
82 sec    --               Reduce Ninemile U5 governor reference by 2.5 MW
100 sec   --               End Simulation
The island formation captured by PMU measurements can be seen in Figure 1.10.
The frequency inside the island was measured at Ninemile. The frequency outside the
island was measured at El Dorado. The plot of frequency of the simulated event can be
seen in Figure 1.11.
[Figures 1.10 and 1.11: frequency plots; y-axis Frequency (Hz), 59.6 to 60.6.]
the frequency in the island under simulation recovers to 59.934 Hz. The frequency in the
PMU data (ignoring present oscillations) recovers to about 59.9 Hz. Just as in the first
simulation, the oscillations seen in the PMU data are not captured in this dynamic
simulation.
The plot of Gypsy-Fairview 230kV historic line flows can be seen in Figure 1.12.
The plot of simulated line flow of Gypsy-Fairview 230kV can be seen in Figure 1.13.
The original power flow and dynamic data received from Entergy were modified to
more closely match the conditions present when hurricane Gustav hit the system. The
simulations performed using the modified power flow and dynamic data showed a
reasonable match to the available historical data and should more accurately represent
the actual conditions of the system at the time of island formation.
has hit the system. Five power flow cases corresponding to hurricane Isaac were provided
by Entergy. Hurricane Isaac made landfall at 7:00 PM on August 28, 2012, near the
mouth of the Mississippi River [5]. The five power flow cases correspond to the times
10:30 AM August 28th, 12:00 PM August 28th, 12:00 PM August 29th, 6:00 PM August
29th, and 12:00 PM August 31st. Along with the five different operating conditions,
ten different orders of tie-line outages were used to create a total of 50 simulations for
the CART database. The order of line outages that actually occurred during the
Gustav event was included as one of the ten orders. The remaining nine orders were
mostly random; however, the last two lines in each of the nine orders were intelligently
selected so that every line served, at least once, as either the last line or second-to-last
line in the outage order. The simulations were conducted using PSS/E v33.3. As stated
previously, the same island was created in each simulation using one of the five power
flow cases and one of the ten tie-line outage orders, providing a total of 50 simulations.
During each simulation, the bus values of frequency, voltage magnitude, and voltage
phase angle were recorded at each of the PMU sites at El Dorado, Mablevale, Waterford,
and Ninemile. In each simulation, the 14 tie lines were removed at a rate of one every
five seconds until the island formed. As an example, the exact actions taken during the
simulation using the original Gustav line outage order are shown in Table 1.11.
Time      Element          Action
5 sec     336462-500360    Disconnect Line
10 sec    336015-336016    Disconnect Line
15 sec    336032-303202    Disconnect Line
20 sec    336141-303204    Disconnect Line
25 sec    336006-336007    Disconnect Line
30 sec    335568-335660    Disconnect Line
35 sec    335536-335665    Disconnect Line
40 sec    335771-303200    Disconnect Line
45 sec    335500-335618    Disconnect Line
50 sec    335568-335659    Disconnect Line
55 sec    335657-335658    Disconnect Line
60 sec    303153-335456    Disconnect Line
65 sec    335618-335837    Disconnect Line
70 sec    336190-336138    Disconnect Line
90 sec    --               End Simulation
change in phase angle could be observed for the loss of several lines. A clear example of
this can be seen in Figure 1.14.
[Figures: simulated traces annotated "Angle Change" and "Voltage Drop" at tie-line outages.]
This information suggests that there is a correlation between changes in the PMU
data and the loss of a tie line. This could be very important in predicting island
formations. By using PMUs to monitor changes in tie line status, if the number of
remaining tie lines is very low, a system operator could be notified that an island
formation in a particular area is a reasonable threat.
1.4.3

Lines   El V   WF     NF
4       data   data   data
4       data   data   data
4       data   data   data
3       data   data   data
3       data   data   data
2       data   data   data
2       data   data   data
1       data   data   data
1       data   data   data
As a reminder, the quantities recorded during simulation are voltage (V), phase
angle (A), and frequency (F). These values were recorded during simulation at the PMU
sites at El Dorado (El), Mablevale (M), Waterford (W), and Ninemile (N). The El Dorado
phase angle does not appear in the CART database because the other phase angles were
taken relative to the phase angle at El Dorado; this makes the angle at El Dorado
identically zero, so it is not needed in the analysis. For each simulation, only the duration
of the simulation where four or fewer tie lines remain is used in the database.
It was decided to use only the data from the last four tie lines for a number of reasons.
First, the CART license used for this study has a database size limit of 8 megabytes;
using only the data from the last four tie lines keeps the database under this limit.
Also, it was observed that most of the significant changes in the recorded
data occurred during the loss of the last few tie lines. Lastly, it is more important to
identify when the system goes from having four tie lines to having only one or two lines
remaining than to identify the loss of earlier tie lines.
The column labeled Lines is the target variable. For each time-step, the data
recorded is labeled corresponding to the number of tie lines still operating in the system.
Each row of the database is independent of the other rows. Therefore, the order that the
simulations are entered into the database is irrelevant. Setting up the database in this way
will force CART to try to determine the number of tie lines remaining by only looking at
the data that would be available from the PMU locations in the system. If there is a
correlation between changes in measurements and the loss of tie lines, CART will be able
to find and characterize it.
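The database layout described above can be sketched as follows. The column names and sample values here are illustrative assumptions; the actual CART input contained all the measurement columns and far more rows.

```python
import csv
import io

# Each row pairs the target label ("Lines" remaining) with the PMU quantities
# at one time step; rows are independent, so their order is irrelevant.
rows = [
    {"Lines": 4, "El_V": 1.02, "W_A": -3.1, "N_A": -2.0, "N_F": 60.00},
    {"Lines": 3, "El_V": 1.01, "W_A": -6.2, "N_A": -12.5, "N_F": 59.99},
    {"Lines": 1, "El_V": 0.99, "W_A": -9.8, "N_A": -33.0, "N_F": 59.97},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue().splitlines()[0])  # Lines,El_V,W_A,N_A,N_F
```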
the second 25 simulations, CART will show whether or not a correlation between the
simulation data and changes in tie line status exists.
The decision tree created by CART using the first 25 simulations contained 876
terminal nodes. The result of testing this tree using the remaining 25 simulations is shown
in Table 1.13.
Actual Class   Total Cases   Percent Correct
1 Line         28846         47.50%
2 Lines        28824         18.62%
3 Lines        28824         26.35%
4 Lines        28824         62.96%
The most important results of this test are in the column labeled Percent Correct. Since
there are four possible classes, if it were truly impossible to determine the number of
remaining tie lines from the available data, one would expect all the numbers in
this column to be 25%, corresponding to the decision tree randomly guessing. For the
classes 2 Lines and 3 Lines this seems to be the case, at 18.62% and 26.35% respectively.
However, the classes 1 Line, at 47.5%, and 4 Lines, at 62.96%, show significant
improvement. It was not expected that CART would be able to determine from the
available simulation measurements exactly how many tie lines remain in the system at
every time step. However, this does confirm the suspicion that there is a correlation
between PMU measurements and changes in tie line status. It is important to point out
that these results used only 25 simulations of training and only four measured PMU
locations. In the future, as more PMUs are placed in the system, it is reasonable to
assume that the ability to determine tie line status from PMU measurements would
vastly improve.
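The comparison against the 25% random-guessing baseline can be reproduced from any confusion matrix. The counts below are made up for illustration; they are not the Table 1.13 values.

```python
# Rows = actual class (1..4 tie lines remaining), columns = predicted class.
confusion = [
    [470, 180, 200, 150],
    [250, 190, 300, 260],
    [220, 260, 260, 260],
    [100, 120, 150, 630],
]

for lines, row in enumerate(confusion, start=1):
    pct_correct = row[lines - 1] / sum(row) * 100
    flag = "above the 25% baseline" if pct_correct > 25 else "near the baseline"
    print(f"{lines} Lines: {pct_correct:.1f}% correct ({flag})")
```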
The decision tree created by CART contains 5 terminal nodes. As a reminder, all phase
angles in the CART database are relative to the phase angle at El Dorado. The rules
required for data to reach each terminal node are as follows:
Terminal Node 5:
Label: 4 Lines
Conditions to reach: Ninemile angle > -1.91°
Terminal Node 4:
Label: 3 Lines
Conditions to reach: Ninemile angle > -31.58° and Ninemile angle <= -1.91° and
Waterford angle > -5.55°
Terminal Node 3:
Label: 1 Line
Conditions to reach: Ninemile angle > -31.58° and Ninemile angle <= -1.91° and
Waterford angle <= -5.55°, and Waterford voltage has reduced by no more than 0.0192 pu
Terminal Node 2:
Label: 3 Lines
Conditions to reach: Ninemile angle > -31.58° and Ninemile angle <= -1.91° and
Waterford angle <= -5.55°, and Waterford voltage has reduced by more than 0.0192 pu
Terminal Node 1:
Label: 1 Line
Conditions to reach: Ninemile angle <= -31.58°
This decision tree contains several interesting aspects. First, frequency is not used to
classify data in the pruned decision tree, which agrees with what was observed after
plotting the simulation results. Also, when the phase angle at Ninemile is close to the
phase angle at El Dorado, the decision tree determines there are 4 tie lines operating.
When the phase angle at Ninemile is more than 31.58° behind the phase angle at El
Dorado, the decision tree determines that there is only 1 tie line remaining. This suggests
that large changes in voltage phase angle between areas are important in identifying the
loss of a tie line.
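The five terminal-node rules translate directly into nested comparisons. In this sketch, angles are in degrees relative to El Dorado; treating the Waterford voltage change as a signed per-unit quantity is an assumption about the study's sign convention.

```python
def remaining_tie_lines(ninemile_angle, waterford_angle, waterford_dv_pu):
    """Estimate the number of tie lines remaining from the pruned-tree rules."""
    if ninemile_angle <= -31.58:
        return 1                   # Terminal Node 1
    if ninemile_angle > -1.91:
        return 4                   # Terminal Node 5
    if waterford_angle > -5.55:
        return 3                   # Terminal Node 4
    if waterford_dv_pu > -0.0192:  # voltage reduced by no more than 0.0192 pu
        return 1                   # Terminal Node 3
    return 3                       # Terminal Node 2

print(remaining_tie_lines(-0.5, -2.0, 0.0))   # 4: angle close to El Dorado
print(remaining_tie_lines(-40.0, -2.0, 0.0))  # 1: angle far behind El Dorado
```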
island was formed. This signature was a warning sign that an island formation in the
nearby area was a reasonable threat and that the area should be closely monitored.
Other points where lines were lost in the PMU data were reviewed, but only one
other disturbance was found. The Webre-Willow Glen 500 kV line was the 6th to last line
to be lost before the island formed. A plot of the voltage phase angle difference between
[Figure 1.21 residue: phase angle difference plot, annotated "Webre-Willow Glen goes offline"; y-axis roughly -5 to 35 degrees.]
Waterford (inside island) and Sterlington (outside island) can be seen in Figure 1.21.
1.5 Conclusions
This study aimed at using real PMU measurements to predict and detect significant
system events, especially islanding, with the help of the data-mining tool CART. The
PMU data offered by Entergy, containing the island formation event when hurricane
Gustav impacted the system, provided an excellent case for this study. During the storm,
14 tie lines were lost, creating an island containing Baton Rouge and New Orleans.
Careful analysis was conducted to determine whether the island could be detected
using only the PMU measurements. It was found that the most effective approach for
identifying the creation of the island was to use the PMU measurements of voltage phase
angle. By comparing the phase angle measurements between PMUs, in this case, the
island could have been detected in approximately 4 seconds. Also, by comparing
different sets of PMUs, the location of the island could be determined from which PMUs
were inside or outside of the affected area. Because this approach considers only the
PMU measurements, the same method could be applied to any system containing PMUs,
with only slight modification, and still provide the ability to quickly and reliably detect
the formation of an island within the system.
Provided with the system power flows and dynamic data corresponding to the time
when hurricane Gustav entered the system, simulations were conducted to attempt to
recreate the event and match it to the historical PMU data. Load and generation levels
across a wide range of the system were adjusted to closely match the phase angle
difference seen in the PMU data. Next, the conditions inside the island were adjusted
using the known generator dispatch and the available SCADA data. It was found that the
direction of the power flowing on the last tie line must have been opposite to the SCADA
data. Also, it was found that in order to match the simulation to the PMU frequency
measurements, the governor reference at one of the generators must have been reduced
just after the creation of the island. These adjustments allowed the simulated event to
closely match the PMU measurements and provided a better understanding of what
happened just after the island formed.
Lastly, the PMU data was used to try to predict the island formation. Because the
single island formation in the PMU data did not provide enough data to search for
signatures, 50 simulations were conducted to build a CART database. The simulations
were analyzed intuitively and with CART to determine any predictive signatures. It was
found that there is a strong correlation between a sudden change in voltage phase angle
and the loss of a tie line. A number of simulations also showed a sudden change in
voltage within the island area after the loss of a tie line. These signatures were searched
for in the real PMU data at the times when tie lines were reported to have been removed
from the system. It was found that when the second-to-last tie line went offline, there was
a 12° change in phase angle measured inside the island. This signature preceded the
island formation by 38 minutes and could have alerted system operators that this area
needed attention.
This study was successful at using CART, along with a strong knowledge of
power systems, to analyze PMU data from a historic event. The data-mining tool CART
helped quantify and understand the phenomena observed in the PMU data. The method
of identifying an island formation using voltage phase angle measurements is both
effective and reliable, and could be used in real applications. The signatures found to
predict the island formation are much less reliable. Large changes in load or generation
could also create a sudden change in phase angle, so the method could be prone to false
alarms. This method of island formation prediction could likely be improved by pairing it
with additional information, such as SCADA data; however, this study considers only
the information that can be drawn from the PMUs alone. In the future, as more PMUs are
placed in the power system, it is reasonable to assume that the predictive signatures
found in this study will be easier to identify and will provide more information.
2.1 Introduction
2.1.1 Problem Statement
Several electrical utility companies are installing large numbers of phasor
measurement units (PMUs) to monitor system conditions. In addition, several utilities
have collected a significant amount of historical PMU data. These stored data sets
include measurements obtained during known events on the system. As discussed at the
PSERC summer workshop in Maine, and as noted in the 2010 research solicitation, there
is a definite need to identify, from PMU measurements, signatures of impending events
detrimental to system performance.
From the control center operator's point of view, fast assessment of power
system oscillatory stability and voltage stability is of great importance for real-time
operation. It is desirable that impending system events be detected immediately
and that operators be provided with updated information on whether or not a power
system can maintain synchronism and acceptable voltage levels when subjected to
disturbances.
Traditionally, time-domain simulation is used to analyze system stability
status [6]. However, two obstacles prevent the traditional method's application in
real-time monitoring and control. First, computation with the full system model makes the
simulation method time-consuming; considering the fast onset of an instability event, the
traditional methods may not be able to provide immediate event detection. On the other
hand, using a simplified system model could accelerate the simulations, but this brings
concern over approximate analysis results leading to inaccurate decisions. Secondly, the
data used for the stability analysis in electrical utilities are obtained from the supervisory
control and data acquisition (SCADA) system or state estimation functions, which are
refreshed on a time scale from several seconds to several minutes. Figure 2.1 shows the
state-of-the-art data acquisition structure and its possible implementation in analyzing
two types of power system stability status, i.e. oscillatory stability and voltage stability.
In addition, the SCADA measured data does not have the characteristics needed to
implement the new analysis and control tools due to the lack of time-synchronized
sampled waveform data [7]-[8]. Compared to a traditional SCADA system,
synchrophasor IEDs such as PMUs enable a much higher data sampling rate and provide
the synchronized phasor measurements across the network.
In some cases the forecasted load pattern and unit commitment dispatch are used
instead of actual data to predict system performance. When a disturbance occurs and
immediate controls need to be initiated, traditional stability analysis using slowly updated
or forecasted data can only provide very limited decision making support.
To make the situation worse, in power system planning and on-line applications, a
complete model may not be readily available. This model is necessary for obtaining the
linearized system description required by traditional oscillatory stability analysis [9].
Similar problems exist in the voltage stability assessment process [10]. Under such
circumstances, data mining techniques, which generalize accurately without requiring
detailed knowledge of all system parameters, become an attractive alternative.
Figure 2.1 Power system stability analysis using data from various sources (PMU data at approximately 30 samples per second and SCADA data every 4-10 seconds feed the assessment of power system oscillatory stability and voltage stability)
Evaluate other available data mining tools and analyze the ability of these tools to
characterize signatures of impending system events.
2.1.3 Literature Review
In the field of power systems, Wehenkel et al. first introduced the DT method to
solve transient stability assessment problems using SCADA data [13]-[14]. In [15]-[19],
DTs were successively applied to assess system operational security by applying a
pre-defined set of credible contingencies and enforcing an acceptable threshold criterion
on system variables based on standard operating practices. Later, in [20], post-disturbance
system stability was analyzed by DT, exploiting its fast evaluation capability. In [21],
a genetic algorithm was applied in feature selection to search for the best inputs to a DT
for oscillatory stability region prediction. In [22] and [23], Kamwa et al. showed that there
is a trade-off between a data mining model's accuracy and its transparency. A review of the
literature reveals that the problem of using DTs for stability margin monitoring from
substation field measurements has not yet been fully explored.
The concept of the decision tree comprises the classification tree and the regression
tree [11]. While in previous works classification trees have been extensively studied to
group an operating point (OP) into one of several pre-defined stability categories, the use
of regression trees (RT) to predict the stability margin, i.e., how far the system is
from a possible instability event, has not yet been fully studied. With respect to online
use, the areas that remain unexplored include how fast the RT can process PMU
measurements, how well the RT can deal with measurement errors, and how robust the
RT is to system topology changes. It is also imperative to develop a systematic
approach to generating a sufficient and realistic knowledge base for off-line training of
DTs.
Several other data mining tools, such as neural networks [24] and support vector
machines [25], have been used to evaluate system stability status. Compared with
such black-box tools, the DT's piece-wise structure provides system operators with a
clearer cause-effect relationship of how the system variables lead to the onset of an
instability event. Using DTs, it is possible to identify the critical variables and thresholds
that need to be analyzed to gain insight into the stability margin of a system.
The oscillatory stability may be analyzed by modal analysis. A power system can
be described as a set of non-linear differential algebraic equations (DAE):

    x' = f(x, y, u)
    0  = g(x, y, u)                                        (2.1)

where x is the state vector, y is the algebraic vector, and u is the input vector. The
DAEs are formulated by detailed modeling of each network component. By linearizing
the non-linear equations in Eq. 2.1 at a particular system operating point, the following
equations are derived:

    Dx' = A Dx + B Du
    Dy  = C Dx + D Du                                      (2.2)

The matrices A, B, C, and D in Eq. 2.2 provide a linearization around the system
equilibrium point. Each pair of complex conjugate eigenvalues of matrix A corresponds
to an oscillation mode of the system. The A matrix can be further decomposed as:

    A = Phi Lambda Psi                                     (2.3)

In Eq. 2.3, Lambda represents the diagonal eigenvalue matrix, and Psi and Phi are the
left and right eigenvector matrices, respectively. The ith oscillation mode has the
conjugate eigenvalues:

    lambda_i = sigma_i +/- j*omega_i                       (2.4)

with oscillation frequency

    f_i = omega_i / (2*pi)                                 (2.5)

and damping ratio (DR)

    zeta_i = -sigma_i / sqrt(sigma_i^2 + omega_i^2)        (2.6)
The oscillation modes that carry a significant amount of energy, but with
insufficient DR, are critical among all modes and need to be closely monitored.
Occurrence of an instability event is possible when a poorly damped mode is excited by a
small or large disturbance.
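Eqs. 2.4-2.6 map directly onto an eigenvalue computation. The 2x2 state matrix below is illustrative only, not a model from this report.

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-25.0, -0.5]])  # a single lightly damped oscillation mode

for lam in np.linalg.eigvals(A):
    if lam.imag <= 0:
        continue                 # keep one eigenvalue of each conjugate pair
    sigma, omega = lam.real, lam.imag
    f_hz = omega / (2 * np.pi)             # Eq. 2.5: mode frequency
    dr = -sigma / np.hypot(sigma, omega)   # Eq. 2.6: damping ratio
    print(f"mode at {f_hz:.2f} Hz with damping ratio {dr:.3f}")
```

For this matrix the mode sits near 0.79 Hz with a damping ratio of 0.05, the kind of lightly damped mode that would need close monitoring.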
In this work the DR of the critical oscillation mode is used as the oscillatory
stability margin (OSM) indicator. Assuming DRcrit is the damping ratio of the critical
mode, the scheme shown in Figure 2.2 is proposed for OSA. As shown in the figure, the
OSM becomes progressively more stringent as the value of critical mode DR decreases.
The damping ratio is not an index from the parameter space, so strictly speaking it
may not be proper to term it a "margin." In this work, DR is selected as the OSM
indicator because it provides a smooth movement trajectory, a clear partition
between stable and unstable states, and an explicit distance from the unstable point.
As shown in Figure 2.5, three oscillatory stability states, namely Stable
(including Good and Fair), Alert and Unstable, are defined according to the
value of DRcrit. A classification tree (CT) is used to assign a system operating point (OP)
into one of the above stability states.
The VSM referred to here corresponds to long-term voltage stability [6]; it cannot
capture short-term voltage stability.
    MW distance = Pmax - Pcurrent                          (2.7)

where Pmax is the maximum deliverable power and Pcurrent is the active load
demand of the current OP. The proposed procedure for voltage stability margin prediction
is as follows:
(a) Generate n different OPs
(b) For each OP, determine the maximum deliverable power by means of the
CPF technique
(c) Calculate the voltage stability margin for the ith OP using the following
index:

    VSmargin_i = (MW distance)_i / Pmax_i x 100%           (2.8)

(d) Train the RT off-line using selected features from the n OPs and their
corresponding VSmargin_i
(e) Use the trained RT to predict VSM in real time
As shown in Figure 2.6, for the given voltage stability thresholds STB and ALT
(STB > ALT), OPs are labeled Stable as long as they satisfy VSmargin_i >= STB, and
Unstable when VSmargin_i <= ALT. The remaining OPs are labeled Alert.
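The margin index and the labeling rule can be sketched as follows; the STB and ALT threshold values here are illustrative assumptions, not the values used in the report.

```python
def vs_margin_pct(p_max_mw, p_current_mw):
    # Eq. 2.8: MW distance to the maximum deliverable power, as a % of Pmax.
    return (p_max_mw - p_current_mw) / p_max_mw * 100.0

def label_op(margin_pct, stb=20.0, alt=10.0):
    """Label an operating point from its voltage stability margin."""
    if margin_pct >= stb:
        return "Stable"
    if margin_pct <= alt:
        return "Unstable"
    return "Alert"

m = vs_margin_pct(1000.0, 850.0)
print(m, label_op(m))  # 15.0 Alert
```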
certain system snapshot [30]. If the load/generation composition varies, different OPs are
formed. The change in the load demand and generation output can be described as:

    PG = PG0 + dPG
    QG = QG0 + dQG
    PL = PL0 + dPL
    QL = QL0 + dQL                                         (2.9)

where PG and QG are the active/reactive power outputs of all the generators except the
slack bus generator, and PL and QL are vectors of active/reactive power delivered to the
loads. Superscript 0 represents the base case OP. The vectors dPG, dQG, dPL, and dQL
stand for the variations in power.
In this work, the commercial software PSS/E [31] is used for iteratively solving
load flows, and deriving the characteristic matrix A at different OPs through numerical
perturbation. Python and MATLAB [32] programs are developed to automate the PSS/E
simulations, perform modal analysis, conduct the CPF-based voltage stability analysis,
compute stability margins, and establish the knowledge base. The pseudo-code for
knowledge base creation is illustrated below.
VM_i and VA_i: positive sequence voltage magnitude and phase angle at
Bus i
IM_i_j
Figure 2.7 One-line diagrams of the IEEE 9-bus and 39-bus test systems
Total   Stable          Alert           Unstable
1071    663 (61.90%)    358 (33.43%)    50 (4.67%)
3575    2549 (71.30%)   962 (26.91%)    64 (1.79%)
1368    707 (51.68%)    495 (36.18%)    166 (12.13%)
3664    2206 (60.21%)   1175 (32.07%)   283 (7.72%)
Therefore, in this work the process of knowledge base splitting, tree training and testing
has been replicated at least 10 times until the mean value and standard deviation of
independent case testing accuracy become stable.
The Entropy method is adopted to grow the CTs in CART. The performance of
CTs in independent case testing is summarized in Table 2.2.
Table 2.2 Performance of the Classification Tree

                     Accuracy of New Case Testing
System    Method     OSA       VSA
9-Bus     Entropy    98.63%    99.56%
39-Bus    Entropy    94.38%    97.95%
The independent case testing results of CT for the IEEE 39-bus system are shown
in Figure 2.8. An interesting observation from Table 2.2 and Figure 2.8 is that the CT
performance for OSA is less than that of VSA. This is because the system oscillatory
stability behavior is highly non-linear. In order to reach certain prediction accuracy, a
larger training dataset is needed by OSA-CT compared with VSA-CT. In this work, more
instances could be generated if we set the Stopping Criterion 2) in Section 4.2 with a
higher accuracy requirement.
The classification tree can be developed using different methodologies, e.g., Gini, Twoing, and Entropy [12]. Another important setting is the minimum number of cases a parent node must have, which may affect the size of the resulting CT. In this work the tree settings are varied to explore their impact on assessment accuracy.
The results are shown in Figure 2.9.
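As an illustration of how the different growing methods score a split, the sketch below evaluates one candidate split under both the Entropy and Gini impurity measures. This is a minimal Python sketch: the class counts and the split are hypothetical, and CART's actual implementation is considerably more involved.

```python
import math

def entropy(counts):
    """Shannon entropy of a class-count vector."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gini(counts):
    """Gini impurity of a class-count vector."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def split_gain(parent, left, right, impurity):
    """Impurity decrease achieved by splitting `parent` into `left`/`right`."""
    n, nl, nr = sum(parent), sum(left), sum(right)
    return impurity(parent) - (nl / n) * impurity(left) - (nr / n) * impurity(right)

# Hypothetical node with 80 Stable and 20 Unstable cases, and one candidate split.
parent = [80, 20]
left, right = [70, 5], [10, 15]
gain_entropy = split_gain(parent, left, right, entropy)
gain_gini = split_gain(parent, left, right, gini)
```

The tree-growing method chooses, at every node, the split with the largest impurity decrease; changing the impurity function can therefore change which split wins and, ultimately, the shape of the tree.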
Figure 2.8 CT stability assessment for the 39-bus system in one replication
Figure 2.9 Classification tree performance using different tree growing methods
Two conclusions can be drawn from Figure 2.9: 1) the CT performance for the stability assessment problem depends on how the tree is trained; in this case the Entropy method achieved the best classification accuracy; 2) the setting for minimum parent node cases can alter the shape of the resulting tree as well as its performance. In general, the more cases a parent node is required to have, the fewer terminal nodes the derived CT may possess. This experiment demonstrated a trade-off between tree complexity and accuracy: a large tree may over-fit, whereas a small tree that is not adequately developed may produce less accurate classifications. A trial-and-comparison process is needed to find the best CT size,
and this can typically be accomplished by nested cross-validation.
2.3.5 Summary
This section explores the use of classification trees for fast evaluation of oscillatory
stability and voltage stability. The following is a summary of the research:
- The setting for minimum parent node cases can alter the shape of the resulting tree, impacting its accuracy.
for A_i = 0 ... u_i do
    Scale the output of generator G_i to: P_Gi = P_Gi^0 * (1 + A_i * C_Gi%)
    for A_(i+1) = 0 ... u_(i+1) do
        Scale load 1 to: P_L1 = P_L1^0 * (1 + A_(i+1) * C_L1%)
        ...
        for A_(i+j) = 0 ... u_(i+j) do
            Scale load j to: P_Lj = P_Lj^0 * (1 + A_(i+j) * C_Lj%)
            for A_(i+j+1) = 0 ... u_(i+j+1) do
                Scale shunt 1 to: Q_S1 = Q_S1^0 * (1 + A_(i+j+1) * C_S1%)
                ...
                for A_(i+j+k) = 0 ... u_(i+j+k) do
                    Scale shunt k to: Q_Sk = Q_Sk^0 * (1 + A_(i+j+k) * C_Sk%)
                    Solve the load flow at: (P_G2, ..., P_Gi, P_L1, ..., P_Lj, Q_S1, ..., Q_Sk)
                    If this OP is unsolvable: eliminate it
                    Oscillatory Stability Analysis: import the system dynamic data and derive the A matrix
                    Voltage Stability Analysis: derive the voltage collapse point via the continuation-based method
                    Export the computed features of the current OP
End loops
3. Repeat: for i = 0 ... number of OPs do
    Modal analysis of the A matrix using (3)-(5): DR(i)
    Compute the voltage stability index using (6)-(7): VS_i^margin
    Export the computed stability margins
End loop
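The nested scaling loops above can be sketched in Python with `itertools.product`. The device names, step counts, and percentages below are illustrative, and the load-flow solution and stability analyses themselves would be delegated to PSS/E in this work's workflow:

```python
from itertools import product

def generate_operating_points(base, steps, c_pct):
    """Enumerate candidate OPs by scaling each device (generator, load, or
    shunt) through its levels A = 0..u, as in the pseudo-code above.

    base   : dict of device -> base-case MW (or Mvar) value
    steps  : dict of device -> number of scaling levels u
    c_pct  : dict of device -> per-step variation C in percent
    """
    names = sorted(base)
    levels = [range(steps[n] + 1) for n in names]
    for combo in product(*levels):
        yield {n: base[n] * (1 + a * c_pct[n] / 100.0)
               for n, a in zip(names, combo)}

# Tiny example: one generator and one load, each varied 2 times by 15% per step.
ops = list(generate_operating_points(
    base={"PG2": 100.0, "PL1": 50.0},
    steps={"PG2": 2, "PL1": 2},
    c_pct={"PG2": 15.0, "PL1": 15.0}))
# 3 levels per device -> 9 candidate OPs; unsolvable ones would then be
# discarded after the load-flow call, as in the pseudo-code.
```

This makes the combinatorial growth of the knowledge base explicit: with u levels per device and m devices, (u+1)^m candidate operating points are enumerated.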
2.4 Model-based Approach for Real Time Stability Margin Prediction Using
Regression Tools
2.4.1 Proposed Research
2.4.1.1 Regression Tree Method
Compared with the traditional time domain simulation approach, which requires a full model computation each time a new OP emerges, the advantage of the RT method lies in
its simplified model structure and fast OP analysis facilitated by fewer required inputs.
Figure 2.10 provides a simple example of RT structure. The unfolding OP is related to its
stability margin through a unique top-down path. The splitting rule at each node that
belongs to a given path represents an operational threshold. Based on the combination of
splitting rules along the path, preventive and corrective control strategies could be
formulated and initiated.
Two RTs for monitoring OSM (OSM-RT) and VSM (VSM-RT) are trained and updated periodically.
The PMU data of an upcoming OP is dropped down the respective tree until it reaches a
terminal node. Then the predicted stability margin is the average value of the learning set
samples falling into that terminal node. Any OP with insufficient stability margin will be
detected immediately by checking corresponding thresholds. Operators are alerted with
the possible event and preventive control strategies can be initiated in a timely manner.
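The top-down look-up can be sketched as follows. This is a minimal Python sketch: the dictionary-based node layout and the feature names are illustrative assumptions, not CART's internal data structure.

```python
def rt_predict(node, x):
    """Drop a measurement vector down a regression tree until a terminal
    node is reached; return that node's stored average margin.

    Nodes are dicts: internal ones hold 'feature', 'threshold', 'left',
    'right'; terminal ones hold only 'avg' (the mean margin of the
    learning-set samples that fell into the node).
    """
    while "avg" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["avg"]

# Two-level toy tree using phasor features named after the report's notation.
tree = {"feature": "VM_20", "threshold": 0.99,
        "left": {"avg": 0.004},
        "right": {"feature": "VA_17", "threshold": -27.06,
                  "left": {"avg": 0.026},
                  "right": {"avg": 0.049}}}
margin = rt_predict(tree, {"VM_20": 1.01, "VA_17": -30.0})
```

Because each prediction is just a short chain of threshold comparisons, the per-OP cost is tiny, which is what makes the method attractive for streaming PMU data.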
Figure 2.11 Proposed framework of the RT-based stability margin prediction and event
detection
2.4.2 Knowledge Base Generation
Using the approach illustrated in previous section, the power supply at generation
buses, demand at load buses, and the output of shunt capacitors were systematically
varied. A total of 1071 OPs with corresponding OSMs, and 1153 OPs with corresponding
VSMs have been produced for the 9-bus system. The number of records generated for the
39-bus system knowledge base is 4276 and 3664 for the VSM and OSM tasks,
respectively.
In addition, in this work the generator active/reactive power limits have been taken
into account to reflect the practical stability margin. This has significant impact on the
computation of VSM: when the load demand increases, a feasible load flow solution may
not exist due to the limited generation capacity, even before the maximum loadability of
the transmission system is reached. Therefore the derived Pmax may lie on the top half of the P-V curve, before the knee point shown in Figure 2.6.
In order to build a sufficiently large knowledge base, two stopping criteria are followed in this work:
1) Each generator, load, and shunt is varied repeatedly, and the total variation should be at least 30% of the base value (u*C_G/L/S% >= 30%). The goal is to capture the most system behavior from the problem space;
2) The prediction accuracy of the derived trees should reach a required level; the R-squared metric used to measure the prediction accuracy will be detailed in the next section.
The trajectory of the 39-bus system stability margin is shown in Figure 2.12.
Corresponding stability thresholds are shown as the flat planes dividing each margin
space into two halves: an instability event will be immediately identified in the top half.
For this power system the voltage stability threshold is put at VSmargin = 30%. This value can be adjusted according to real-time operational needs.
Figure 2.12 Trajectory of voltage and oscillatory stability margins of the IEEE 39-bus
(New England) test system
2.4.3 Off-line Training and New Case Testing
Each knowledge base is split into two independent data sets: 80% of the records are
randomly selected for training of OSM-RT and VSM-RT; the remaining 20% of the
records will serve the purpose of RT testing. The 10-fold cross validation method is
adopted to grow the RT in CART. In experiments, because of the random nature of the
splitting process, slight differences may occur between the performances of each derived
RT. Therefore in this work, the process of knowledge base splitting, tree training and
testing has been replicated 10 times, until the mean and standard deviation of RT
accuracy become stable.
In contrast with a classification tree for which the accuracy could be directly
derived from the misclassification rate, the performance of a regression tree is measured
through a statistical index, termed Residuals Squared Error (R2) [36]. We report the
accuracy of a RT model as follows:
R2 = 1 - [ Σ_{i∈TS} (y_i - d(x_i))^2 ] / [ Σ_{i∈TS} (y_i - y_root)^2 ]        (2.10)
where TS is the set of training samples, xi is input, yi is the actual stability margin,
d(xi) is the RT predicted value, and yroot is the mean of yi in the tree root node.
In general, the closer the value of R2 is to 1, the better the prediction. In practice, however, how good an R2 is depends on the particular application and the way it is measured [37]. Experimental results from this work show that a quite acceptable value of R2 > 0.90 can be achieved.
Sometimes the R2 alone may not be sufficient, especially in the case when the
typical difference between values predicted by RT and the actual stability margins is
desired. Therefore another measure, the Root-Mean-Square (RMS), is utilized:
RMS = sqrt( Σ_{i=1}^{n} (y_i - d(x_i))^2 / n )        (2.11)
where n is the number of test cases. The numerator stands for the sum of squared
deviations of the actual stability margins around the RT predictions. The value of RMS
error depends on the base magnitude of the target stability margin to be predicted. In the
proposed scheme, a typical value of OSM is in the range of -0.01 to 0.1, and the VSM is
usually ranging from 0.05 to 1.0. Hence the RMS errors of VSM-RT are usually several
times larger than that of the OSM-RTs.
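Eq. 2.10 and Eq. 2.11 can be computed directly from the actual and predicted margins; the sketch below uses hypothetical margin values for illustration.

```python
import math

def r_squared(y, pred, y_root):
    """R^2 of Eq. 2.10: 1 minus the residual sum of squares divided by the
    sum of squares around the root-node mean y_root."""
    ss_res = sum((yi - di) ** 2 for yi, di in zip(y, pred))
    ss_root = sum((yi - y_root) ** 2 for yi in y)
    return 1.0 - ss_res / ss_root

def rms_error(y, pred):
    """RMS error of Eq. 2.11."""
    n = len(y)
    return math.sqrt(sum((yi - di) ** 2 for yi, di in zip(y, pred)) / n)

# Toy check on five hypothetical stability-margin samples.
y = [0.02, 0.05, 0.08, 0.03, 0.06]
pred = [0.025, 0.045, 0.075, 0.035, 0.055]
r2 = r_squared(y, pred, y_root=sum(y) / len(y))
rms = rms_error(y, pred)
```

Note that R2 is scale-free while the RMS error inherits the magnitude of the target margin, which is why, as stated above, VSM-RT RMS errors run several times larger than OSM-RT ones.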
Once the training is complete, the derived RTs are evaluated using the unseen test cases. Much more emphasis must be put on the accuracy of unseen case testing because, for real-time applications, a predictive model that cannot predict unseen system behavior well lacks generalization power and is unacceptable, even if high accuracy was obtained during off-line training. The corresponding training and new case
testing accuracy is summarized in Table 2.3. In addition, the results of new case testing
were reported separately in terms of Security Test and Reliability Test. While the security
test examines how well the stable OPs are predicted, the reliability test checks if all
unstable OPs are correctly identified.
The prediction for 300 new OPs of the 39-bus system is shown in Figure 2.14. The
RT-based approach has exhibited encouraging capability for system stability margin
prediction.
The performances of differently sized OSM-RTs are summarized in the relative error curve shown in Figure 2.15. Among these trees, a 13-node subtree pruned from the 45-node optimal tree is shown in Figure 2.15(a), and the largest tree, with 465 nodes, is shown in Figure 2.15(b).
Table 2.3 Training and New Case Testing Accuracy of the Regression Trees

OSM-RT
                      Unseen OPs
System    Train R2    Overall R2    Overall RMS    Reliability RMS    Security RMS
9-bus     0.9984      0.9858        0.0023         0.00083            0.00235
39-bus    0.9617      0.9519        0.0034         0.00386            0.00328

VSM-RT
                      Unseen OPs
System    Train R2    Overall R2    Overall RMS    Reliability RMS    Security RMS
9-bus     0.9928      0.9791        0.0184         0.03357            0.01480
39-bus    0.9941      0.9694        0.0211         0.02736            0.01965
Figure 2.13 RT predicted margins versus the actual stability margins of the IEEE 39-bus
system. Left: OSM-RT performance; Right: VSM-RT performance
Compared with the optimal tree, numerical results show that although the 465-node tree boosted the training R2 from 0.9617 to 0.9872, its accuracy in unseen case testing actually dropped from 0.9520 to 0.9407. An over-developed tree may perform well in training, but it loses generalization power when predicting unseen instances. The optimal tree with the lowest relative cost has the best
[Figure 2.15 graphic: (a) the pruned OSM-RT subtree, whose internal nodes split on phasor features such as IA12_13 <= -45.00, IA2_25 <= 164.95, VA_17 <= -27.06, VM_20 <= 0.99, and IA7_8 <= 30.33; (b) the corresponding node statistics (STD, Avg, N) of each internal and terminal node.]
Method    Testing R2 of OSM    Testing R2 of VSM
RT        0.9519               0.9694
SVM       0.9591               0.9811
NN        0.9579               0.9572
Figure 2.17 New case prediction accuracy of RTs trained with differently sized data sets.
Left: OSM-RT; Right: VSM-RT
2.4.5.3 Data Processing Speed
Traditionally the data used for the stability analysis in electrical utilities are
obtained from the SCADA system or state estimation functions, which are refreshed on a
time scale from several seconds to several minutes. These slowly updated data can only
provide limited decision making support for quickly developing situations where fast
variations are present at both demand and supply side. The capability to take advantage
of the quickly updated PMU data is critical in real-time applications.
In practice, PMU measurements are updated very quickly, typically at least 30 times per second. In order to evaluate the system stability status at each snapshot, the processing of each set of PMU data must take less than 1/30 ≈ 0.033 second.
Table 2.5 Data Processing Speed of the Regression Trees

          39-bus system                                   WECC 179-bus system
Model     Off-line Training       New Case Prediction     Off-line Training        New Case Prediction
OSM-RT    36.01 s (3421 cases)    about 3 s (855 cases)   164.97 s (10058 cases)   about 5 s (2514 cases)
VSM-RT    31.38 s (2931 cases)    about 2 s (733 cases)   195.45 s (12242 cases)   about 7 s (3061 cases)
The data processing speed of the RTs is summarized in Table 2.5. The computational time is estimated using the built-in clock of CART executed on an Intel Pentium IV 3.00-GHz CPU with 2 GB of RAM. It can be seen that the derived OSM-RT or VSM-RT can assess 1000 new OPs in less than 4 s for the 39-bus system, and 3000 new OPs in less than 8 s for the WECC 179-bus system. According to these results, the RTs satisfy the speed requirement of real-time applications.
2.12
where the superscript real means actual values of the phasor, and meas stands for
measured values.
According to the IEEE C37.118 Standard for Synchrophasors for Power Systems
[39], PMUs that are Level 1 compliant with the standard should provide a Total Vector
Error (TVE) less than 1%. This implies that the following constraints must be satisfied:
| VM_i^meas ∠VA_i^meas - VM_i^real ∠VA_i^real | / | VM_i^real ∠VA_i^real | <= 1%
| IM_i^meas ∠IA_i^meas - IM_i^real ∠IA_i^real | / | IM_i^real ∠IA_i^real | <= 1%        (2.13)
Considering Eq. 2.12 and Eq. 2.13, random noise has been added to the original
phasor magnitudes and angles of the WECC 179-bus system knowledge base. In Table
2.6 two scenarios were tested. While in both scenarios errors were added to the test cases,
it is shown that the RTs trained with measurement error had much better performance
than the ones without the error taken into account in the training data set.
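Such noise injection can be sketched as follows, assuming a simple random complex error vector bounded by the 1% TVE limit; the report's exact noise model may differ.

```python
import cmath
import math
import random

def add_tve_noise(mag, ang_deg, max_tve=0.01, rng=random):
    """Perturb a phasor (magnitude, angle in degrees) by a random complex
    error whose Total Vector Error stays below max_tve (1% for Level 1
    compliance per IEEE C37.118).  Illustrative sketch only."""
    phasor = cmath.rect(mag, math.radians(ang_deg))
    # Random error vector with |error| <= max_tve * |phasor|.
    err_mag = rng.uniform(0.0, max_tve) * abs(phasor)
    err = cmath.rect(err_mag, rng.uniform(0.0, 2.0 * math.pi))
    noisy = phasor + err
    return abs(noisy), math.degrees(cmath.phase(noisy))

def tve(real_mag, real_ang, meas_mag, meas_ang):
    """Total Vector Error: |measured - actual| / |actual| as complex phasors."""
    real = cmath.rect(real_mag, math.radians(real_ang))
    meas = cmath.rect(meas_mag, math.radians(meas_ang))
    return abs(meas - real) / abs(real)

# Perturb a hypothetical voltage phasor and verify it stays within 1% TVE.
vm, va = add_tve_noise(1.02, -17.9)
```

A noisy copy of every phasor feature in the training set can then be produced the same way before the trees are retrained.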
Table 2.6 Performance of the 179-Bus Regression Trees Considering PMU Measurement Error

Trained without measurement error:
Type of Regression Model    Reliability Test        Security Test
                            R2        RMS           R2        RMS
OSM-RT                      0.7906    0.00106       0.7403    0.00121
VSM-RT                      0.8091    0.02785       0.7629    0.03010

Trained with measurement error:
Type of Regression Model    Reliability Test        Security Test
                            R2        RMS           R2        RMS
OSM-RT                      0.9170    0.00068       0.8994    0.00071
VSM-RT                      0.9266    0.01789       0.9045    0.01940
Type           RMS Error of OSM-RT    RMS Error of VSM-RT
9 BUS N-1      0.00880                0.154810
39 BUS N-1     0.00417                0.04089
39 BUS N-2     0.00726                0.207020
179 BUS N-1    0.00337                0.03046
179 BUS N-1    0.00421                0.02654
179 BUS N-1    0.00385                0.03198
179 BUS N-1    0.00552                0.083250
179 BUS N-2    0.00473                0.04830
179 BUS N-2    0.00574                0.03792
179 BUS N-2    0.00588                0.107360
prediction due to the small size of the system; acceptable predictions were achieved for
the case of generator outage in the 39-bus system; the N-2 scenario in the 39-bus system
was too severe for the VSM-RT to handle.
More case studies were conducted on the 179-bus system VSM-RT: low RMS errors were observed in experiments where slight topology changes were made, such as taking one circuit of a double-circuit transmission line out of service.
2.4.6 Discussion
series of candidate RTs accordingly. Figure 2.18 shows the proposed scheme. The list of credible contingencies is usually readily available at utility companies. If, in online application, an unseen contingency occurs and the RT fails to provide accurate predictions, a new RT will be trained and deployed, and the new contingency scenario and RTs will be added to the historical database. As contingency scenarios accumulate in the database, fewer unseen topology conditions will be encountered. Obsolete models can be quickly replaced by the candidate RTs corresponding to the post-contingency condition.
2.4.7 Summary
In this work the approach of using regression trees to predict power system stability
margins is explored and the following conclusions have been reached:
feature. With a sufficiently large knowledge base, the RT model can predict the system
oscillatory and voltage stability behavior with high accuracy;
According to the test results, the RT model is fast enough to process PMU measurements, and robust enough to handle measurement errors that are within 1% TVE;
domain simulations, the large number of operating conditions required for the training process is still a major obstacle to their online implementation. The occurrence of a fault event or a system topology change, common in real-time system operations, usually requires the data mining tools to be updated in order to reflect the evolving system configuration. In such situations, the re-training process may be an obstacle to seamless online stability monitoring.
In this project we focus on reducing the computational burden of training data
mining tools by applying a pool-based active learning methodology. This approach
reduces the number of operating conditions that need to be generated via time domain
simulations, and consequently considered during training, without impacting the stability
assessment accuracy.
2.5.2 Background
In this work two types of power system operational performance have been
examined. Power system voltage stability deals with how far the system load demand is
from the combined transmission and generation capability [10], while oscillatory stability
is related to whether the system damping torques are sufficient to bring the system back
to a steady-state operating condition after a disturbance [9]. The data comes from PMUs.
Data mining tools have been previously applied in power systems to assess the
transient stability [13], system operational security [17] and system post-disturbance
stability [22]; often in cases where the computational complexity of detailed modeling
may be alleviated by creating highly accurate but approximate predictors. In [29] and
[41] the authors have used data mining tools to efficiently estimate the system voltage
and oscillatory stability margins from system measurement data. In this work we explore
a meta-learning scheme [42] aimed at reducing the computational burden of training,
easing the application of data mining-based stability assessment.
Active learning has often been applied in cases where labeled examples are time
consuming to obtain [43]. Pool-based active learning has been explored in situations
where it is necessary to have a human expert provide labels for data [44] and
classification of large amounts of networked data [45]. This kind of active learning may
be used to select the optimal subset from a pool of available PMU data for which to
provide labels via time domain simulation, to be used for predictor training. A detailed
and recent overview of the active learning literature is given in [43].
2.5.3 Methodology
The task of power system stability assessment may be cast as a data mining
classification problem [13], [17], [29]. In this case a data mining tool is used to create a
mapping from the synchrophasor measurements, in our case the positive sequence
voltage magnitude and angle, and the positive sequence current magnitude and angle, into
one of the pre-determined stability states, or labels. The data are collected from PMUs
installed at system substations, and synchronized using a satellite-based global
positioning system (GPS).
The stability states are determined according to the value of the corresponding
stability margin indicator. In the case of oscillatory stability the damping ratio (DR) of
the critical oscillation mode may be used as the stability margin indicator, and two basic
stability states can be defined as: Stable (with critical damping ratio, DRcrit > 0) and
Unstable (with DRcrit < 0). Similarly, the voltage stability margin (VSmargin) may be
defined using the continuation power flow (CPF) technique [20]. The MW-distance of
the current system operating conditions (OC) from the critical voltage collapse point
(usually the saddle-node bifurcation point) on the P-V curve is shown in Figure 2.19.
Two voltage stability states have been defined, Stable and Alert, based on VSmargin. In this work the voltage stability threshold is set at STB = 30%; however, this value can be further adjusted according to real-time operational needs.
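The two labeling rules can be sketched as follows. This is a Python sketch; the treatment of the exact DR_crit = 0 boundary is an assumption, since the text defines only DR_crit > 0 as Stable and DR_crit < 0 as Unstable.

```python
def label_oscillatory(dr_crit):
    """Oscillatory state from the critical damping ratio: Stable if
    DR_crit > 0, Unstable otherwise (boundary case lumped with Unstable)."""
    return "Stable" if dr_crit > 0 else "Unstable"

def label_voltage(vs_margin, threshold=0.30):
    """Voltage state from VS_margin against the 30% threshold used here;
    the threshold is adjustable to operational needs."""
    return "Stable" if vs_margin >= threshold else "Alert"

# Hypothetical operating condition: well-damped but with a thin voltage margin.
labels = (label_oscillatory(0.032), label_voltage(0.21))
```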
Gathering all measurements and their associated labels creates a labeled data set DL = {(x_i, y_i), i = 1 ... N}, where N is the total number of system operating conditions considered. A data set DL that may be used to train a data mining tool for either voltage or oscillatory stability margin predictions is therefore produced through extensive time-domain simulation. Let us also introduce the notation DU for a pool of unlabeled
measurements, consisting of OCs without their associated stability margin labels.
In our previous work, [29], [41], we found that among the systematically generated
OCs some are redundant and others are spurious. Spurious data can be considered outliers
that should be removed from the training data set, for example by using techniques such
as interquartile range measures [46].
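A minimal sketch of such an interquartile range screen follows; the quartile indices use a simple approximation, and the sample margins are hypothetical.

```python
def iqr_filter(values, k=1.5):
    """Drop points outside [Q1 - k*IQR, Q3 + k*IQR], a common interquartile
    range outlier screen of the kind cited above.  Quartiles are taken with
    a simple index approximation for brevity."""
    s = sorted(values)
    n = len(s)
    q1, q3 = s[n // 4], s[(3 * n) // 4]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lo <= v <= hi]

margins = [0.04, 0.05, 0.06, 0.05, 0.07, 0.06, 0.95]   # 0.95 is spurious
clean = iqr_filter(margins)
```

In this toy example the spurious margin of 0.95 falls outside the fence and is removed while the plausible operating conditions survive.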
The proposed approach is initialized by assuming all the measured data points are
unlabeled, in DU. We then apply the presented pool-based active learning methodology to
incrementally select, label and include only points judged significant for learning into DL.
The procedure is iterated until a desired accuracy threshold is reached, or the budget of
data points that may be included in DL is expended.
In the case when labels are computed beforehand for all examples the presented
pool-based methodology reduces only the computational costs associated with learning.
When labels for all OCs are not pre-computed a substantial reduction in both time
domain simulation and learning may be possible, since not all labels may need to be
computed.
Our approach uses the probabilistic and generalization properties of artificial neural
networks and support vector machines to decide which system states should be labeled
and consulted during training, and which should not because they contain redundant
information.
confident predictions for unseen examples which are very dissimilar to any observed data
points.
p(y_i = 1 | f(x_i)) = 1 / (1 + exp(A*f(x_i) + B))        (2.14)
This function is monotonically increasing in f(x_i) for any value of B when A < 0.
Therefore we may conclude that the output of ANNs and SVMs can be implicitly
interpreted as the class probability and used directly in active learning by considering
predictions f(xi) closer to 0 in absolute terms as more uncertain, or having p(yi = 1|f(xi))
closer to 0.5, than those farther away from 0.
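Eq. 2.14 and the resulting uncertainty interpretation can be sketched as follows; the values of A and B are illustrative, since in practice they would be fitted on held-out data.

```python
import math

def platt_probability(f, A=-1.0, B=0.0):
    """Map a raw classifier output f(x) to p(y=1|f) via Eq. 2.14.
    A < 0 makes the mapping increasing in f; A and B here are illustrative."""
    return 1.0 / (1.0 + math.exp(A * f + B))

p_uncertain = platt_probability(0.0)    # |f| near 0  -> p near 0.5
p_confident = platt_probability(4.0)    # large margin -> p near 1
```

Outputs with |f(x)| near 0 map to probabilities near 0.5, which is exactly the uncertainty signal exploited below.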
The proposed active learning procedure is initialized by asking the oracle to
provide the labels for a small number of examples from DU, removing them from DU and
including them in DL. After learning on DL the tool makes a prediction on all the
examples for which labels have not yet been computed, DU, and finds those which have
predictions closest to 0 in absolute terms. In other words, the unlabeled examples are sorted according to the certainty the tool has about their labels, and those with the highest uncertainty are used to query the oracle again.
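One round of this query step can be sketched as follows. This is a Python sketch: the pool values and the linear decision function are hypothetical stand-ins for the trained SVM/ANN decision function.

```python
def active_learning_round(pool, predict, batch=2):
    """One round of pool-based uncertainty sampling: score every unlabeled
    example, and return the `batch` examples whose prediction is closest to
    0 in absolute terms, i.e. the most uncertain ones, for the oracle to
    label (here, via time domain simulation)."""
    ranked = sorted(pool, key=lambda x: abs(predict(x)))
    return ranked[:batch]

# Toy pool of one-feature examples scored by a fixed linear decision function.
pool = [(-3.0,), (0.2,), (1.5,), (-0.1,), (4.0,)]
decision = lambda x: x[0]            # hypothetical f(x)
queries = active_learning_round(pool, decision)
```

The selected examples would then be removed from DU, labeled, added to DL, and the predictor retrained before the next round.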
2.5.4 Experiments
Two IEEE test systems, namely the IEEE 3-machine 9-bus system and the IEEE
10-machine 39-bus (New England) system, are used to evaluate the proposed approach.
In order to create a sufficiently large training data set, different OCs have been
generated by systematically varying the system generation/shunt outputs, as well as the
load demands. PSS/E is used to perform load flow calculations, formulate linearized
system models through numerical perturbation, and derive corresponding stability
margins. MATLAB and Python add-on scripts are developed to automate this process.
The procedures for creating the training data set are illustrated in Figure 2.20.
Additionally, to build a sufficiently large training data set, each generator, load, or shunt has been varied at least 6 times (u >= 6), and the total variation is at least 40% of the base value (u*C_G/L/S% >= 40%). The goal is to capture the most system stability behavior from the problem space.
Using the procedures shown in Figure 2.20, and by labeling different OCs with
corresponding stability states described in Section 5.3, the training data set generated for
these two test systems is summarized in Table 2.8.
For the following experiments the pool-based active learning methodology was used to train SVMs and ANNs. We first performed experiments in batch mode using 5-fold cross-validation to obtain the optimal parameters for SVM and ANN training, and then used these parameters to test the active learning approach.
Table 2.8 Operating Points Generated for Training of Data Mining Tools

          Oscillatory Stability          Voltage Stability
System    Stable OPs    Unstable OPs     Stable OPs    Unstable OPs
9-Bus     1021          50               404           21
39-Bus    4950          126              1843          59
From Figure 2.21 it can be seen that active learning outperforms random sampling. Random sampling starts to outperform the mean predictor only after 50 examples have been labeled.
Figure 2.21 Comparison of active learning and random sampling on the 9-bus system for
the oscillatory stability classification task using SVM
In Figure 2.22 we show the 9-bus voltage stability estimation performance comparison between active learning and random sampling. It can be seen that active learning outperforms random sampling by a larger margin than in the case of OSM prediction.
Figure 2.22 Comparison of active learning and random sampling on the 9-bus system for
the voltage stability classification task using SVM
We hypothesize that this is due to the drastic difference between the sizes of the
positive and negative classes. The difference in class sizes means that a greater variance
may be expected when randomly sampling points because the addition of a few unstable
OPs in DL may drastically change the decision boundary.
Figure 2.23 Comparison of active learning and random sampling on the 39-bus system
for the oscillatory stability classification task using SVM
Next we illustrate how the active learning approach performs on the 39-bus system oscillatory stability assessment using SVMs. From Figure 2.23, the active learning approach starts to significantly outperform random sampling after 100 examples are labeled.
In Figure 2.24, similarly to Figure 2.22, the simpler task of voltage stability margin estimation results in a smaller but still significant performance gain from using active learning.
Figure 2.24 Comparison of active learning and random sampling on the 39-bus system
for the voltage stability classification task using SVM
2.5.4.2 Artificial Neural Network Experiments
Unlike the SVM, in many cases an ANN using a logistic sigmoid transfer function may provide very confident predictions for data points dissimilar to those observed during training. Because of the class imbalance, the four points used to initialize the active learning training will often all be in the positive, or stable, class. These two causes force the ANN to behave like a mean predictor, classifying the entire input space as the positive class with high confidence, until a negative example is included in DL. To overcome this issue we included three positive and one negative point in the initial DL. In the resulting figures this is reflected as poor performance when very few examples are included in DL. However, once enough points are included in DL, the performance of the ANN becomes closer to that of the SVMs.
In Figure 2.25 we compare active learning to random sampling and the mean
predictor when using ANNs on the oscillatory stability task using 9-bus system data.
From Figure 2.25, active learning provides a significant improvement when few examples are observed. Interestingly, random sampling provides better results when using the ANN than the SVM on this task after 250 points are included in DL.
Figure 2.25 Comparison of active learning and random sampling on the 9-bus system for
the oscillatory stability classification task using ANN
The next result, in Figure 2.26, shows the accuracy comparison of using ANNs on the voltage stability task for the 9-bus system data set. Again, after many labeled examples are included in DL, the performance of random sampling becomes close to that of active learning.
Figure 2.26 Comparison of active learning and random sampling on the 9-bus system for
the voltage stability classification task using ANN
In Figure 2.27 we show the 39-bus system oscillatory stability experiment results.
Here random sampling struggles to become more accurate than the mean classifier even
when 300 points are included in DL. The ANN trained using active learning provides
higher accuracy than random sampling in this case as well.
Finally, in Figure 2.28 we show the results of the ANN using active learning and random sampling on the 39-bus system voltage stability classification task. Although random sampling initially outperforms active learning, after 20 examples are included in DL the actively trained ANN starts to outperform random sampling. Again, random sampling struggles to outperform the mean predictor.
Figure 2.27 Comparison of active learning and random sampling on the 39-bus system for the oscillatory stability classification task using ANN
Figure 2.28 Comparison of active learning and random sampling on the 39-bus system for the voltage stability classification task using ANN
            ANN                           SVM
Data Set    Active Learning    Random     Active Learning    Random
9-Bus       99.9%              99.7%      100%               99.2%
39-Bus      98.5%              97.7%      99.4%              98.2%

            ANN                           SVM
Data Set    Active Learning    Random     Active Learning    Random
9-Bus       99.8%              99.5%      99.8%              96.8%
39-Bus      97.6%              96.6%      99.2%              96.9%
2.5.5 Conclusion
The following conclusions were reached:
- A data mining tool can be trained on a reduced data set by using active learning to select a subset of data to learn from. In the case of an existing labeled data set the presented methodology can be used to filter out redundant data, thus reducing the computational burden of training data mining tools.
- When OCs are not yet labeled with stability states, and precise values of DR and VSmargin must be obtained through time domain simulation, the proposed method may be used to select which OCs to query in order to create the most adequate data set to learn from. This may significantly reduce the complexity involved in time domain simulations.
- The benefit of active learning on more complex tasks is greater than on simpler tasks. The experiments also show that for simpler tasks the ANNs used are less sensitive to data set selection than SVMs, as can be seen from the random sampling results in Figures 2.24 and 2.28. On more complex tasks, and in all examined cases employing active learning, higher accuracy can be obtained using SVMs.
- We expect that carefully selecting which system OCs are simulated in the time domain, and afterwards used for training, will lead to more accurate stability assessment, decreased computational complexity, or both.
Figure 2.29 OSM-RT topology and node splitters of the 9-bus system
To calculate the VI, search all splits s ∈ S on variable x_m at each tree node t ∈ T, and find the split s*_m that gives the largest decrease in the regression error R [11]:

ΔR(s*_m, t) = max_{s∈S} ΔR(s, t)        (6.1)

Suppose s* is the best of the s*_m, and s~_m is the split on variable x_m that has the best agreement with s* in terms of partitioning cases. The measure of importance of variable x_m is then defined as:

VI(x_m) = Σ_{t∈T} ΔR(s~_m, t)        (6.2)
Figure 2.30 IEEE 9-bus system VSM-RT and OSM-RT variable importance
Figure 2.30 shows the computed VI for the OSM-RT and VSM-RT of the 9-bus
system derived in previous sections. The actual measures of importance have been
normalized so that the most important variable has a VI of 100.
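The summation of Eq. 6.2 and the normalization to a top score of 100 can be sketched as follows; the per-node split improvements and feature names are hypothetical.

```python
def variable_importance(split_improvements):
    """Sum each variable's (surrogate) split improvements over all tree
    nodes, as in Eq. 6.2, then normalize so the most important variable
    scores 100, matching the scaling used in Figure 2.30.

    Input: {variable: [improvement contributed at each node, ...]}."""
    raw = {v: sum(deltas) for v, deltas in split_improvements.items()}
    top = max(raw.values())
    return {v: 100.0 * r / top for v, r in raw.items()}

vi = variable_importance({
    "VM_5": [0.30, 0.10],            # illustrative improvements per node
    "VA_9": [0.15, 0.05],
    "IA7_8": [0.08]})
```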
measurements from the lowest ranked buses is also presented for the purpose of
comparison.
Table 2.11 WECC 179-Bus System Combined Bus Ranking

Top Ranked Buses                    Bottom Ranked Buses
Rank    Location    CBR             Rank     Location    CBR
# 1     Bus 90      338.27          # 170    Bus 162     0.31
# 2     Bus 100     100.8           # 171    Bus 163     0.28
# 3     Bus 20      100.2           # 172    Bus 172     0.24
# 4     Bus 95      18.47           # 173    Bus 168     0.12
# 5     Bus 96      13.99           # 174    Bus 85      0.11
# 6     Bus 97      12.73           # 175    Bus 50      0.02
# 7     Bus 67      12.52           # 176    Bus 92      0.02
# 8     Bus 12      8.48            # 177    Bus 94      0.01
# 9     Bus 11      8.44            # 178    Bus 165     0.01
# 10    Bus 9       8.24            # 179    Bus 171     0.00
As shown in Figure 2.31, in contrast with the RTs fed with measurements from the lowest ranked buses, those constructed using the measurements from the top ranked buses exhibited better performance. Another conclusion can be made by comparing the R2 of Figure 2.31 with Figure 2.17: almost identical RT prediction R2 was achieved using the reduced set of measurements from the PMU locations suggested by the CBR. Last but not least, there is a large decrease in RT training complexity since fewer features are used: the training time of the 179-bus RTs has been reduced from about 3 minutes to less than 30 seconds.
In this work, a data mining approach is used to estimate oscillatory stability in real
time. The decision tree (DT) method proposed by Breiman et al. is employed to map
system operating points at each instant to one of several pre-defined stability states.
Compared to previous research, the proposed approach casts the task as a multi-class
classification problem, as detailed in Section 7.3. In Section 7.4 we show the results of
the proposed method using the IEEE 39-bus test system. Finally, the data mining
approach is evaluated on field PMU measurements from Salt River Project (SRP), a
public electrical utility in Phoenix, Arizona, U.S.A.
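As a sketch of this multi-class formulation, the snippet below maps operating points to pre-defined stability states with a DT classifier. The setup is assumed, not from the report: synthetic operating points, an illustrative noisy attribute, and states labeled by thresholds on the critical-mode damping ratio (the nominal 10%/5% values discussed later in this section).

```python
# Sketch of mapping operating points (OPs) to stability states with a DT.
# Features, thresholds, and data are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 1000
damping = rng.uniform(-2, 15, size=n)            # true critical-mode damping ratio (%)
X = np.column_stack([damping + rng.normal(scale=0.3, size=n),   # noisy OP attribute
                     rng.normal(size=n)])                        # irrelevant attribute

# Label each OP by damping-ratio thresholds (ALT = 5%, STB = 10%).
states = np.array(["Unstable", "Alert", "Fair", "Good"])
y = states[np.searchsorted([0.0, 5.0, 10.0], damping)]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:800], y[:800])
print("hold-out accuracy:", clf.score(X[800:], y[800:]))
```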
discussed further. For ambient data an AR/ARMA model is used to derive mode
parameters while Prony analysis is used for ringdown data.
the input to the system. The evolution of the state is expressed by:

\dot{x}(t) = A x(t) + B u(t) \qquad (2.16)

\dot{x}(t) = A x(t) \qquad (2.17)

where x is the state of the system and n is the number of components in x (i.e., the order of the system). Let \lambda_i, p_i, q_i be respectively the eigenvalues, right eigenvectors, and left eigenvectors of matrix A (of size n \times n). The solution to Eq. 2.17 can be expressed as the sum of n components:

x(t) = \sum_{i=1}^{n} \left( q_i^T x(0) \right) p_i e^{\lambda_i t} \qquad (2.18)

Let y(t) be the system response. As we have assumed the system is an LTI system, y(t) can be expressed in the form:

y(t) = C x(t) + D u(t) \qquad (2.19)

where C and D are constant matrices. If the input is removed (u(t) = 0), then Eq. 2.19 simplifies to:

y(t) = C x(t) \qquad (2.20)

y(t) = \sum_{i=1}^{n} B_i e^{\lambda_i t} \qquad (2.21)

After some manipulation utilizing Euler's formula, the following result is obtained, which allows for more direct computation of terms:

y(t) = \sum_{i} A_i e^{\sigma_i t} \cos(\omega_i t + \phi_i) \qquad (2.22)

where \lambda_i = \sigma_i \pm j \omega_i.

These steps are performed in the z-domain. For power system applications the eigenvalues would usually be translated to the s-domain, consistent with Eq. 2.16 - Eq. 2.17:

\lambda_i = \frac{\ln z_i}{\Delta t} \qquad (2.23)
2.7.3.1 Framework
A framework of the proposed measurement-based scheme has been previously
shown in Figure 2.2. The model-based approach, which was investigated by the authors
in [29] and [41], is also shown in the figure for comparison purposes.
For each power system, several stability thresholds are specified with respect to the typical damping ratio of the critical oscillation mode (DRcrit), and a set of stability states is defined accordingly. As shown in Figure 2.36, for the given oscillatory stability thresholds STB and ALT (STB > ALT), operating points (OPs) will be labeled as Good if they satisfy DRcrit >= STB; Fair if they satisfy STB > DRcrit >= ALT; Alert if they satisfy ALT > DRcrit >= 0; and Unstable when 0 > DRcrit. In practice, the values of STB and ALT are usually around 10% and 5% respectively.
operated under a steady state, and an AR/ARMA model is employed to estimate the
mode parameters in a sliding window manner. The required window length for ambient
data analysis varies from 5 minutes to half an hour, depending on the variation level of
system loads. If a sudden deviation is detected, but only limited to fewer than 5 data
points, the corresponding measurements are considered outliers caused by sensor or
communication error, and are discarded from consideration. If a continued deviation has
been observed, the OD will report that a transient process is potentially occurring, and
Prony analysis is applied to scan the transient data using a sliding window with a length
of 5 to 10 seconds, depending on the critical mode frequency of the inter-area
electromechanical oscillation.
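The screening logic described above (deviations confined to fewer than 5 samples are discarded as sensor or communication outliers, while a sustained deviation flags a potential transient to be scanned by Prony analysis) can be sketched as follows. The tolerance, baseline, and signal values are illustrative assumptions, not values from the report.

```python
# Sketch of the deviation screening: short deviation runs are outliers,
# sustained runs indicate a potential transient. Thresholds are illustrative.
import numpy as np

def screen_window(signal, baseline, tol=3.0, min_run=5):
    """Return 'ambient', 'outlier', or 'transient' for one data window."""
    deviated = np.abs(signal - baseline) > tol
    run = best = 0                       # longest run of consecutive deviations
    for d in deviated:
        run = run + 1 if d else 0
        best = max(best, run)
    if best == 0:
        return "ambient"
    return "outlier" if best < min_run else "transient"

base = 60.0
ambient = base + 0.1 * np.ones(50)
spike = np.copy(ambient); spike[10:12] += 20.0     # 2-sample glitch -> outlier
swing = np.copy(ambient); swing[20:40] += 20.0     # sustained deviation -> transient
print(screen_window(ambient, base), screen_window(spike, base), screen_window(swing, base))
```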
One of the key challenges of embedding DTs in online applications is the problem
of evolving system operating conditions. Due to variations in system generation and
loading patterns, and changes in system topology, the DRcrit of inter-area
electromechanical oscillations may also change. To deal with this eventuality, the
classification tree derived in CART needs to be periodically refreshed in order to reflect
the most current system operating conditions. This is done by updating the knowledge
base using the most recent PMU measurements, and re-training the DT.
                     Mode #1   Mode #2   Mode #3   Mode #4   Mode #5
Frequency (Hz)       1.21      1.13      1.03      0.96      0.58
Damping Ratio (%)    1.06      4.62      1.87      8.81      6.35
Dominant Generator   G1, G3    G4, G6    G3        G10       G2
Prony analysis has been applied to the Bus 39 voltage magnitude signal during the
transient process. The sliding window has a length of 5 seconds and the Prony model
order is set to be N=30.
The AR model has been applied to the phase angle difference between Bus 7 and Bus 39, which is shown in Figure 2.40. The ambient data before the fault are processed using a sliding window with a length of 10 minutes. Different model orders have been deployed to compare the results. The mode damping ratios estimated by AR of order N=60 are drawn in Figure 2.41. The means of the damping ratios estimated with different model orders are summarized in Table 2.13, which shows that the mode frequencies estimated by AR and Prony are very close to the eigen-analysis results in Table 2.12. The damping ratio estimated by AR approaches the actual value as the model order increases. The DR estimated by Prony analysis differs due to the change in system topology.
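Both estimators return discrete-time (z-domain) roots that must be translated to s-domain modes, as in Eq. 2.23. A minimal sketch of that conversion, assuming a synthetic mode and sample rate (the 0.58 Hz / 6% values are fabricated to mimic the inter-area mode's scale, not taken from Table 2.13):

```python
# Sketch: z-domain root -> continuous-time mode.
# lambda = ln(z)/dt, f = |Im lambda|/(2*pi), zeta = -Re lambda / |lambda|.
import numpy as np

def z_to_mode(z, dt):
    lam = np.log(z) / dt                      # s-domain eigenvalue
    freq = abs(lam.imag) / (2.0 * np.pi)      # modal frequency (Hz)
    damping = -lam.real / abs(lam)            # damping ratio (fraction)
    return freq, damping

# A synthetic 0.58 Hz mode with 6% damping, sampled at 30 samples/s.
dt = 1.0 / 30.0
omega = 2.0 * np.pi * 0.58
sigma = -0.06 * omega / np.sqrt(1 - 0.06**2)
z = np.exp((sigma + 1j * omega) * dt)
print(z_to_mode(z, dt))
```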
Method   Model Order   Frequency (Hz)   Damping Ratio (%)
AR       N=30          0.5622           4.391
AR       N=60          0.5819           5.637
AR       N=90          0.5753           6.224
Prony    N=30          0.5787           5.185
By varying the load disturbance level and fault scenario, the time-domain simulations have been replicated, and a total of 4938 OPs with their corresponding stability states are included in the knowledge base. A classification tree has been developed in CART using 80% of the cases, and the remaining 20% has been used in new-case testing. The classification accuracy is evaluated as follows:

\text{Accuracy} = \frac{\text{number of correctly classified cases}}{\text{total number of cases}} \qquad (7.1)
New case testing results:

State   Correctly Classified   State Accuracy
Good    610                    0.9839
Fair    349                    0.9887
Alert   13                     0.8667

Accuracy by predicted state: Good 0.9951, Fair 0.9721, Alert 0.8125; Overall 0.9838
Other data mining tools, such as the artificial neural network (ANN) and support vector machine (SVM), have also been used to compare the results.
From Table 2.15, the DT-based prediction model achieved similar accuracy to other
data mining tools. Compared to some black-box models, however, the DT provides a
more transparent structure with a clearer cause-effect relationship. Its piece-wise
structure and node splitting rules enable the identification of the critical variables and
thresholds that should be analyzed to gain insight into the oscillatory stability of a
system.
Table 2.15 Results Comparison

Data Mining   Misclassification Rate          Overall
Tools         Good      Fair      Alert       Accuracy
DT            0.0219    0.0667    0.0737      0.9739
ANN           0.0034    0.0902    0.1852      0.9873
SVM           0.0008    0.0738    0.0602      0.9940
2.8 Summary
The use of decision trees for online stability assessment without knowledge of the system model parameters has been investigated in this work:
Once trained using the knowledge base, the DT-based predictor can provide real-time support by making use of the quickly updated PMU measurements;
The data mining tools are capable of reflecting the evolving system operating conditions when the most recent PMU measurements and corresponding knowledge base are used;
When the results are compared with other data mining tools such as ANN and SVM, it is observed that almost identical prediction accuracy can be achieved.
2.9 Conclusions
In this project the approach of using classification and regression trees to predict
power system stability behavior from PMU measured synchrophasor data is explored.
The following conclusions were reached in this work:
The DT-based data mining model provides an accurate assessment of the
stability status of each system operating point. Compared with some other data
mining tools, using DTs it is possible to identify the critical variables and
thresholds that need to be analyzed to gain insight into the stability margin of a
power system;
Encouraging results were obtained through performance examination using the proposed knowledge base generation methodology. Provided that the system stability behavior is sufficiently captured, the DT model can predict the system oscillatory and voltage stability status with high accuracy;
3.1 Introduction
Dynamic security assessment (DSA) [60] can provide system operators with important information regarding the transient performance of power systems under various possible contingencies. By using the real-time or near real-time measurements collected by phasor measurement units (PMUs), online DSA can produce more accurate security classification decisions for the present OC or imminent OCs. However, online DSA still constitutes a challenging task due to the computational complexity incurred by the combinatorial nature of N-k (k = 1, 2, ...) contingencies in practical power systems, which makes it intractable to perform power flow analysis and time domain simulations for all contingencies in real time.
The advent of data mining techniques provides a promising solution to handle these
challenges. Cost-effective DSA schemes have been proposed by leveraging the power of
data mining tools in classification, with the basic idea as follows. First, a knowledge base
is prepared through comprehensive offline studies, in which a number of predicted OCs
are used by DSA software packages to create a collection of training cases. Then, the
knowledge base is used to train classification models that characterize the decision rules
to assess system stability. Finally, the decision rules are used to map the real-time PMU
measurements of pre-fault attributes to the security classification decisions of the present
OC for online DSA. The data mining tools that have proven effective for DSA include
decision trees [13][14][15][16][18][20], neural networks [61][62][63] and support vector
machines [25][64][65]. More recently, fuzzy-logic techniques [22] and ensemble learning techniques [19][66][67] have been utilized to enhance the performance of these data mining tools in security assessment of power systems.

Figure 3.1 Fully-grown DT of height 5 for the WECC system using an initial knowledge base consisting of 481 OCs and three critical contingencies

Among various data mining tools,
DTs have good interpretability (or transparency) [68], in the sense that the secure
operating boundary identified by DTs can be characterized by using only a few critical
attributes and corresponding thresholds. As illustrated in Fig. 3.1, a well-trained DT can
effectively and quickly produce the security classification decisions for online DSA,
since only a few PMU measurements of the critical attributes are needed. The high
interpretability of DTs is amenable to operator-assisted preventive and corrective actions
against credible contingencies [69]. However, as discussed in [23], there exists an
accuracy versus transparency trade-off for data mining tools. In order to obtain a more
accurate classification model from DTs, one possible approach is to use an ensemble of
DTs at the cost of reduced interpretability. Examples of ensembles of DTs for DSA are
the multiple optimal DTs [18], random forest [19] and boosting DTs [66].
The realized OCs in online DSA can be dissimilar to those in the initial
knowledge base prepared offline, since the predicted OCs might not be
accurate and the OCs can change rapidly over time. Further, it is possible that a
system topology change may occur during the operating horizon due to the
forced outage of generators, transformers and transmission lines.
However, there have been limited efforts directed towards handling OC variations and
topology changes. In the scheme proposed in [18], when the built DT fails to classify the
changed OCs correctly, a new DT is built from scratch or a sub-tree of the DT is replaced
by a newly built corrective DT. Aiming to deal with possible topology changes,
references [62], [67] suggest creating an overall knowledge base that covers all
possible system topologies and choosing the attributes that are independent of topology
for data mining. Further, reliable PMU measurement is usually assumed in the literature, and the issue of missing PMU measurements in online DSA has not been considered.
To develop a robust data-mining-based online DSA scheme, the initial knowledge
base and the classification model have to be updated in a timely manner to track these
changed situations. Therefore, the two main objectives of our study in this project are:
phase angle difference between two buses, respectively (the bus numbers in attribute
names are different from their real ones), CTNO$ stands for the index of contingency.
In a DT, each non-leaf node tests the measurement of an attribute and decides
which child node to drop the measurements into, and each leaf node corresponds to a
predicted value. As shown in Fig. 3.1, in a DT for DSA, the predictive value of each leaf
node is either S or I, in which S stands for secure cases and I for insecure cases.
Fig. 3.1 also illustrates the training cases that fall into each node, by using dark bars for
secure cases and bright bars for insecure cases. The number of non-leaf nodes along the
longest downward path from the root node to a leaf node is defined as the height of a DT.
Given a collection of training cases \{x_n, y_n\}_{n=1}^{N}, the objective of DT induction is to find a DT that can fit the training data and accurately predict the decisions for new cases. State-of-the-art DT induction algorithms are often based on greedy search. For example, in the
classification and regression tree (CART) algorithm [11], the DT grows by recursively
splitting the training set and choosing the critical attributes (numerical or categorical) and
critical splitting rules (CSR) with the least splitting costs until some predefined stopping
criterion (e.g., the size of tree or the number of training cases in a leaf node) is satisfied.
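One greedy step of this recursive procedure can be sketched as a brute-force scan over attributes and thresholds, keeping the split with the lowest misclassification cost. This is a simplified single binary split on synthetic data, not the full CART algorithm (which also sorts attributes, handles categorical splits, recurses, and prunes).

```python
# Sketch of one greedy CART-style step: pick the (attribute, threshold)
# split with the lowest misclassification cost. Synthetic data.
import numpy as np

def best_split(X, y):
    best = (None, None, np.inf)               # (attribute, threshold, cost)
    for m in range(X.shape[1]):
        for thr in np.unique(X[:, m]):
            left, right = y[X[:, m] <= thr], y[X[:, m] > thr]
            cost = 0
            for part in (left, right):
                if part.size:                  # errors vs the majority label
                    cost += part.size - np.bincount(part).max()
            if cost < best[2]:
                best = (m, thr, cost)
    return best

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = (X[:, 1] > 0.2).astype(int)               # class depends only on attribute 1
m, thr, cost = best_split(X, y)
print(m, thr, cost)                            # expect attribute 1, zero cost
```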
In general, a fully-grown DT that accurately classifies the training cases might
misclassify new cases outside the knowledge base. This feature of fully-grown DTs is
usually referred to as overfitting [68]. In order to avoid overfitting, DTs are usually
pruned by collapsing unnecessary sub-trees into leaf nodes. As illustrated in Fig. 3.1, in a
pruned DT, some leaf nodes do not have pure training cases, which is a result of either
tree pruning or early termination of tree growing [68]. By removing the nodes that may
have grown based on noisy or erroneous data, the pruned DT is more resistant to
overfitting than a fully-grown DT without pruning, and thus can give more accurate
security decisions.
A major advancement in DT-based DSA schemes was made in [20], in which the
authors proposed to build a single DT to handle multiple contingencies, by using the
index of contingencies as a categorical attribute of the DT. It is worth noting that a DT
built by using such an approach can give the security classification decisions of an OC
concurrently for all the critical contingencies in the knowledge base, which is more
efficient and can identify the critical attributes that are independent of contingencies. For
example, the DT in Fig. 3.1, using CTNO$ as a categorical attribute, can give security
classification decisions of an OC for three critical contingencies, i.e., CT6, CT45 and
CT46, at the same time, and the critical attributes Q12,16, P7,2, Q7,9, A11,9, A12,19, A5,12 and
P36,7 can give security classification decisions independent of contingency type for some cases.
(a) Small DT h1
(b) Small DT h2
(c) Small DT h3
Figure 3.2 The first three small DTs (J=2) for the WECC system, the voting weights
of which are 4.38, 3.04 and 0.93, respectively
the number of nodes, is used as the metric to quantify the tree size. The reason, which
will be soon apparent, is to restrict the number of nodes that will be revised when
updating DTs to a value less than J.
deterministic and probabilistic. For the deterministic approach, the security classification decision is given by:

H_L(x) = \begin{cases} 1, & \text{if } \sum_{l=1}^{L} a_l h_l(x) \ge 0 \\ -1, & \text{otherwise} \end{cases} \qquad (3.1)

where a_l (l = 1, 2, \dots, L) are the voting weights of the small DTs. To obtain probabilistic classification decisions, the logistic correction technique [71] can be applied. Then, the probability of an Insecure classification decision is given by:

\Pr(H_L(x) = 1 \mid x) = \frac{1}{1 + \exp\left( -\sum_{l=1}^{L} a_l h_l(x) \right)} \qquad (3.2)
putting more weights on the attribute subsets that have higher availability when randomly
selecting attribute subsets, the resulting small DTs would be more likely to be robust to
possibly missing PMU measurements.
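The ensemble decision rules (3.1) and (3.2) reduce to a weighted vote plus a logistic squashing. A minimal sketch, with made-up votes and the voting weights quoted for the small DTs in Figure 3.2 (here +1 denotes an insecure vote, matching the convention of Eq. 3.2):

```python
# Sketch of rules (3.1)-(3.2): weighted +/-1 voting with logistic correction.
# Votes are illustrative; weights are those quoted in Figure 3.2.
import math

def ensemble_decision(votes, weights):
    f = sum(a * h for a, h in zip(weights, votes))
    decision = 1 if f >= 0 else -1                 # deterministic rule (3.1)
    prob_insecure = 1.0 / (1.0 + math.exp(-f))     # logistic correction (3.2)
    return decision, prob_insecure

votes = [1, -1, 1]                # decisions of three small DTs (1 = insecure)
weights = [4.38, 3.04, 0.93]      # voting weights of h1, h2, h3
decision, p = ensemble_decision(votes, weights)
print(decision, round(p, 3))
```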
3.3 Proposed Robust Online DSA for OC Variations and Topology Changes
In this project, a robust data-mining-based DSA scheme using adaptive ensemble
DT learning is proposed to handle OC Variations and Topology Changes in an efficient
manner. The proposed scheme for online DSA, as illustrated in Fig. 3.3, consists of three
Different from existing DT-based DSA schemes, the training cases are assigned different
data weights by each small DT; and higher data weights are assigned to a new training
case if it is misclassified by the small DTs. The aforementioned techniques are utilized to
minimize the misclassification cost as new training cases are added to the knowledge
base, so that the classification model could smoothly track the changes in OCs or system
topology.
, x_P, y\}, where x_1 is the index of the critical contingency, the attributes x_2, \dots, x_P are obtained from power flow analysis of an OC, and y is the transient security classification decision of the OC for the critical contingency x_1. Based on the previous studies
[16][18][20], the following PMU-measured variables are selected as numerical attributes:
where B denotes the set of PMU buses in the system. It is worth noting that only raw
measurements reported by PMUs are used as the numerical attributes in this work; more
generally, the variables computed using other system information may also be used, e.g.,
the voltage at the bus connected to a PMU bus when the branch impedance is constant
[16].
C_N(F_L) = \frac{1}{N} \sum_{n=1}^{N} \log_2 \left( 1 + e^{-y_n F_L(x_n)} \right) \qquad (3.3)

It is observed from (3.1) and (3.3) that C_N(F_L) lies strictly above the misclassification error rate of H_L. Then, a primary objective of boosting is to minimize C_N(F_L), by identifying the small DTs h_l \in H_J and their voting weights a_l \in \mathbb{R}. An analytical formulation is provided as follows:

PF: \min_{h_1, \dots, h_L \in H_J; \; a_1, \dots, a_L \in \mathbb{R}} C_N(F_L) \qquad (3.4)
specifically, it is shown in [66] that the small DT h_l can be obtained by solving the following problem:

P_h^{(l)}: \min_{h_l \in H_J} \frac{1}{N} \sum_{n=1}^{N} w_n^{(l)} \, 1\{y_n \ne h_l(x_n)\} \qquad (3.5)

where w_n^{(l)} = \left( 1 + e^{y_n F_{l-1}(x_n)} \right)^{-1} is the positive data weight of the training case \{x_n, y_n\}, and 1\{y_n \ne h_l(x_n)\} takes value 0 if the training case \{x_n, y_n\} is correctly classified by the small DT h_l (otherwise, it takes value 1). By definition of w_n^{(l)}, it is easy to observe that the data weights are assigned adaptively by small DTs, in the sense that if the training case \{x_n, y_n\} is misclassified by the small DT h_l, then w_n^{(l+1)} > w_n^{(l)}, i.e., the training case has a higher data weight in the next round of the boosting process. It is worth noting that highly skewed training data (e.g., the case in [19]) can be handled by scaling up the weights of under-represented cases, such that \sum_{y_n = 1} w_n^{(l)} = \sum_{y_n = -1} w_n^{(l)}. As suggested in (3.5), the objective of P_h^{(l)} is to determine the small DT that has the least misclassification error rate on the weighted training data. Thus, the small DT h_l can be obtained by employing the standard CART algorithm [11] subject to the tree height J, and by using misclassification error rate as the splitting cost when building the DT. Then, its positive voting weight is obtained by solving the following problem:

P_a^{(l)}: \min_{a \in \mathbb{R}} G_N^{(l)}(a) \qquad (3.6)
where G_N^{(l)}(a) \triangleq C_N(F_{l-1} + a h_l). Under the condition that h_l is a descent direction of C_N at F_{l-1}, it is easy to verify that {G_N^{(l)}}'(0) < 0 and {G_N^{(l)}}''(a) > 0 hold for any a \in \mathbb{R}. Therefore, G_N^{(l)}(a) has a unique minimum in \mathbb{R} that can be found using standard numerical solution methods (e.g., Newton's method).
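Finding the voting weight by minimizing G(a) can be sketched with a generic bounded scalar minimizer standing in for Newton's method; the ensemble scores and the weak learner below are synthetic assumptions, and the cost follows the logit-style form of (3.3).

```python
# Sketch: choose voting weight a_l by minimizing G(a) = C(F_{l-1} + a*h_l)
# with the cost of Eq. 3.3. G is convex in a, so a scalar minimizer
# (or Newton's method) finds the unique minimum. Synthetic data.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(4)
y = rng.choice([-1, 1], size=200)
F_prev = 0.2 * y + rng.normal(scale=0.5, size=200)   # current ensemble scores
h = np.where(rng.random(200) < 0.8, y, -y)           # weak learner, ~80% correct

def G(a):
    margins = y * (F_prev + a * h)
    return np.mean(np.log2(1.0 + np.exp(-margins)))  # cost (3.3)

res = minimize_scalar(G, bounds=(0.0, 10.0), method="bounded")
print("a_l =", res.x, "cost =", res.fun)
```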
w_{N+k}^{(l)} = \left( 1 + e^{y_{N+k} F_{l-1}(x_{N+k})} \right)^{-1} into the small DT h_l and recalculating the voting weight a_l, iteratively for l = 1, 2, \dots, L.
A key step for incorporating a new training case into a small DT is to adopt the method described in Section 3.2.3. Since the misclassification error rate is used as the metric of splitting cost, as suggested in (3.5), it is easy to observe that there exists an even simpler solution for updating the small DTs. Specifically, a small DT remains unchanged if the new case is correctly classified; otherwise, only the sub-tree corresponding to the first non-leaf node that has a different decision for the new case is subject to update. It is worth noting that, since the tree height is J, the total number of non-leaf nodes to be revised is at most J. After the small DT h_l is updated, its voting weight a_l is recalculated by minimizing G_{N+k}^{(l)}(a).
The process of updating the classification model is summarized in Algorithm 3.1. It is useful to note that when the k-th new training case is used to update the small DTs, the data weights of the previous N + k - 1 training cases calculated in Step 4 of Algorithm 3.1 are different from the data weights that were used in building or updating the small DTs in the past rounds. Therefore, unlike the case in offline training, it is possible that the updated small DT h_l is no longer a descent direction of C_{N+k} at F_{l-1}. In order to
detect and handle this situation, an extra step is used in Algorithm 3.1. Specifically, let \Delta^{(l)} = \sum_{n=1}^{N+k} w_n^{(l)} y_n h_l(x_n):

If \Delta^{(l)} < 0, then
    h_l \leftarrow -h_l
End if
Recalculate a_l by minimizing G_{N+k}^{(l)}(a).
F_l \leftarrow F_{l-1} + a_l h_l
End For
classification model, each of the small DTs uses the values of the attribute vector and its
CSRs to produce a binary decision. Finally, the binary decisions of all small DTs are
collected and used to give the security classification decisions of the present OC,
according to (3.1). It is worth noting that distributed processing technologies [75] can be
leveraged to speed up online DSA. Specifically, the K unlabeled cases can be classified
separately by using K duplicates of the classification model, and in each classification
model, all small DTs can process the attribute vector of an unlabeled case in a parallel
manner.
From the above development, it can be seen that the proposed scheme illustrated in
Fig. 3.3 is derived from those in previous work [16][18][20], with the following major
modifications. 1) The classification model is obtained via boosting multiple small
unpruned DTs instead of a single fully-grown DT after pruning. It is suggested that
boosting algorithms can lead to better model fitting and the produced classification model
is quite resistant to overfitting [70]. Thus, boosting small DTs has great potential to
deliver better performance in terms of classification accuracy. 2) Unequal data weights
are assigned to the training cases adaptively by small DTs. In periodic updates,
misclassified new training cases can have higher data weights than those classified
correctly. This will speed up adapting the small DTs to newly changed OCs. 3) The small
DTs are gracefully updated by incorporating new cases one at a time, whereas rebuilding
DTs is used in [16][18][20]. 4) The DT and the knowledge base are updated only when
the new cases are misclassified in [16][18][20]; whereas all new training cases are
incorporated into the knowledge base in the proposed scheme.
75 branch active/reactive power flows and current flows, which take any of the
8 PMU buses as either a from-bus or a to-bus of the branch;
28 bus voltage phase angle differences, which are computed from the 8(8-1)/2
pairs of phase angles.
seconds with a step size of 0.5 cycle. The power angle-based stability margin is used as the transient stability index (TSI), defined as

TSI = \frac{360 - \delta_{\max}}{360 + \delta_{\max}} \times 100, \qquad -100 < TSI < 100 \qquad (3.7)

where \delta_{\max} is the maximum angle separation of any two generators in the system at the same time in the post-fault response. In case of islanding, the above value is evaluated for each island and the smallest value is taken as the TSI. During the simulation time, whenever the margin turns out to be negative, i.e., the rotor angle difference of any two generators exceeds 360 degrees, the case is labeled as transiently insecure.
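Eq. 3.7 can be computed directly from a simulated rotor angle trajectory, as sketched below; the generator angles are synthetic, and a positive TSI indicates a transiently secure case.

```python
# Sketch of the transient stability index (3.7): take the worst pairwise
# generator angle separation over the post-fault response. Synthetic angles.
import numpy as np

def tsi(angles_deg):
    """angles_deg: array of shape (time_steps, generators), in degrees."""
    spread = angles_deg.max(axis=1) - angles_deg.min(axis=1)
    delta_max = spread.max()                   # worst separation at any instant
    return (360.0 - delta_max) / (360.0 + delta_max) * 100.0

t = np.linspace(0.0, 5.0, 501)
stable = np.column_stack([30 + 25 * np.exp(-t) * np.cos(6 * t),  # damped swing
                          10 * np.ones_like(t),
                          -15 * np.ones_like(t)])
print("TSI =", tsi(stable))                    # positive => transiently secure
```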
knowledge base are randomly partitioned into V subsets of equal size. For given fixed J and L, a classification model is trained by using V - 1 subsets and tested using the remaining subset. The training process is then repeated V times in total, with each of the V subsets used exactly once as the test data. Finally, the misclassification error rate obtained by V-fold cross validation is calculated by averaging over the V classification models. The results of the above procedure for different tree heights (J = 1, 2, 3) are illustrated in Fig. 3.6. It can be seen that as L increases, the misclassification error rate of each classification model decreases and reaches a plateau at some L. Then, when L grows larger, each classification model incurs a larger variance and hence a higher misclassification error rate. On the other hand, a larger tree height J implies a larger variance of the classification model [68], which is also observed in Fig. 3.6. Based on these observations, J = 2 is chosen, and L = 15, at which the misclassification error rate drops below 1% and reaches a plateau, is selected.
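The V-fold selection procedure can be sketched as follows. As an assumption for brevity, a single depth-limited DT stands in for the boosted ensemble of size L, and the data are synthetic.

```python
# Sketch of V-fold cross validation over candidate tree heights J:
# train on V-1 folds, test on the held-out fold, average the error.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)   # synthetic security labels

V = 5
for J in (1, 2, 3):                              # candidate tree heights
    scores = cross_val_score(DecisionTreeClassifier(max_depth=J, random_state=0),
                             X, y, cv=V)
    print(f"J={J}: CV misclassification rate = {1.0 - scores.mean():.3f}")
```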
Figure 3.6 Ensemble small DT learning with different tree heights for the IEEE 39-bus test system
according to (3.5). Then, the training cases together with their data weights are used by
the CART algorithm to build a small DT hl with height J , by using weighted
misclassification rate as the cost function, as shown in (3.5). Note that each small DT
gives security classification decisions for all critical contingencies. Further, the voting
weight of hl is calculated by numerically solving (3.6). Then, the ensemble of small DTs
is obtained. It is worth noting that, different from the V -fold cross validation procedure,
the entire training set (not a subset) is used by each small DT of the ensemble.
in the root node of h1 changes from the voltage phase angle difference between bus 2 and
bus 26, A_2_26, to the active power flow between bus 17 and bus 18, P_17_18. The
CSRs of the non-root nodes change accordingly, as a result of the recursive procedure of
the CART algorithm. The small DT h1 rebuilt with the 100 changed OCs is illustrated in
can be seen that the proposed approach achieves comparable performance to the
benchmark approach by rebuilding small DTs. The test results also suggest that when
OCs change, the small DTs have to be updated in order to track the variation of OCs.
Table 3.1 Misclassification error rate of robustness testing

Scheme                     Secure cases   Insecure cases   Overall
Proposed                   0.68%          0.36%            0.55%
Small DTs (rebuilding)     0.59%          0.38%            0.54%
Small DTs (no updating)    10.68%         6.85%            9.57%
In what follows, the procedure for generating the three OC sets is discussed in
detail. The realized OCs include the 33 recorded OCs and another 448 OCs that are
generated by interpolation, as illustrated in Fig. 3.8. Specifically, following the method in
Figure 3.8 Aggregate load of recorded OCs and generated OCs by interpolation
[20], both the active and reactive load of each load bus for every minute of the
investigated period are obtained by linear interpolation based on the two closest recorded
OCs, and the generator power outputs are adjusted as needed to ensure valid OCs. To
enrich the initial knowledge base, a day-ahead predicted OC is obtained by randomly
changing the bus loads within 90% to 110% of the loads of the corresponding realized
OC, by using a uniform distribution. Similarly, a short-term predicted OC is generated by
uniformly randomly changing the bus loads within 97% to 103% of the loads of the
corresponding realized OC. After solving the power flows for each OC using the power
flow and short circuit analysis tool (PSAT) [77], 481 OCs are generated for each of the
three OC sets. It is worth noting that different from the day-ahead predicted OCs, the
short-term predicted OCs and the realized OCs are time-stamped.
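The OC-generation steps above (interpolate bus loads between the two closest recorded OCs, then perturb uniformly within 90-110% for day-ahead and 97-103% for short-term predictions) can be sketched as follows; the bus loads are made-up values, not data from the study.

```python
# Sketch of OC-set generation: linear interpolation between recorded OCs,
# then uniform perturbation emulating load forecasts. Illustrative loads.
import numpy as np

rng = np.random.default_rng(2)
load_t0 = np.array([120.0, 80.0, 45.0])   # MW at three load buses, recorded OC A
load_t1 = np.array([150.0, 90.0, 40.0])   # MW, recorded OC B one interval later

def interpolated_oc(alpha):
    """Linear interpolation between the two closest recorded OCs."""
    return (1.0 - alpha) * load_t0 + alpha * load_t1

realized = interpolated_oc(0.5)
day_ahead = realized * rng.uniform(0.90, 1.10, size=realized.shape)
short_term = realized * rng.uniform(0.97, 1.03, size=realized.shape)
print(realized, day_ahead, short_term)
```

Generator outputs would then be re-dispatched and the power flow solved for each perturbed OC to ensure validity, as described above.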
Figure 3.9 Flowchart for testing online DSA with periodic updates
performance of the proposed scheme is quite close to the scheme based on boosting small
DTs with rebuilding.
Scheme
Proposed
A single DT
(rebuilding)
Boosting
(rebuilding)
1.80%
2.22%
2.26%
2.73%
2.5%
1.81%
1.03%
1.39%
2.26%
0.82%
1.5%
becomes less time-consuming than the other two schemes. The reason is that for each
new OC, the two benchmark schemes rebuild DTs from scratch, while the graceful
update of small DTs is carried out in the proposed scheme. Further, according to the
CART algorithm [11], it is known that the sorting operation of the CART algorithm
dominates the computational burden of DT building/rebuilding. When updating small
DTs, the sorting operation is skipped [73]. Therefore, the proposed scheme has a much
lower computational burden.
Therefore, it is urgent to design DT-based online DSA approaches that are robust to
missing PMU measurements.
Intuitively, one possible approach to handle missing PMU measurements is to
estimate the missing values by using other PMU measurements and the system model.
However, with existing nonlinear state estimators in supervisory control and data
acquisition (SCADA) systems, this approach may compromise the performance of DTs.
First, the scan rate of SCADA systems is far from commensurate with the data rate of
PMU measurements, and thus using estimated values from SCADA data may result in a
large delay for decision making. Second, SCADA systems collect data from remote
terminal units (RTUs) utilizing a polling approach. Following a disturbance, it is possible
that some post-contingency values are used due to the lack of synchronization, which can
lead to inaccurate security classification decisions of DTs. It is worth noting that future
fully PMU-based linear state estimators [82] can overcome the aforementioned
limitations; but this is possible only when a sufficient number of PMUs is placed in the system. With this motivation, data-mining-based approaches are investigated in this
paper, aiming to use alternative viable measurements for decision making in case of
missing data.
In DTs built by the classification and regression tree (CART) algorithm [11], missing data can be handled by using surrogate splits. However, a critical observation in this project is that when PMU measurements are used as attributes, most viable surrogate attributes have low associations with the primary attributes. Clearly, the accuracy of DSA would degrade if surrogates are used. This is because a DT is essentially a sequential processing method, and thus the wrong decisions made in earlier stages may have
significant impact on the correctness of the final decisions. Thus motivated, this paper
studies applying ensemble DT learning techniques (including random subspace methods
and boosting), so as to improve the robustness to missing PMU measurements.
Aiming to develop a robust and accurate online DSA scheme, the proposed
approach consists of three processing stages, as illustrated in Fig. 3.11. Specifically,
given a collection of training cases, multiple small DTs are trained offline by using
randomly selected attribute subsets. In near real-time, new cases are used to re-check the
performance of the small DTs. The re-check results are then utilized by a boosting
algorithm to quantify the voting weights of a few viable small DTs (i.e., the DTs without
missing data from their attribute subsets). Finally, security classification decisions of
online DSA are obtained via a weighted voting of viable small DTs.

Figure 3.11 A three-stage ensemble DT-based approach to online DSA with missing
PMU measurements

More specifically, a random subspace method for selecting attribute subsets is developed
by exploiting the locational information of attributes and the availability of PMU
measurements. Conventionally, the
availability of a WAMS is defined as the probability that the system is operating
normally at a specified time instant [83]. In this project, the availability of PMU
measurements is defined similarly, i.e., as the probability that PMU measurements are
successfully delivered to the monitoring center in online DSA. Therefore, a modified
CART algorithm in which co-located attributes are
excluded from surrogate searching is used to build a single DT and identify the surrogate
attributes. The results regarding the performance of the surrogates identified by both the
modified CART algorithm and the CART algorithm are given in Table 3.3.
Two key observations are drawn. First, the results obtained by the modified CART
algorithm suggest that all non-colocated surrogates have relatively low associations with
the primary ones.

Table 3.3 Surrogates of the DT for the WECC system

                               By modified CART          By CART
Node  Primary attribute    Surrogate    Association   Surrogate    Association
 1    V{217}               V{207}       0.76          V{207}       0.76
 2    Q{204;207}           Q{212;216}   0.33          Q{207;209}   0.50
 3    Q{204;207}           V{209}       0.28          Q{207;209}   0.64
 4    I{211;204}           P{008;011}   0.62          P{209;211}   0.83
 5    P{210;201}           P{211;062}   0.87          P{231;201}   0.87
 6    Q{005;033}           Q{801;999}   0.71          Q{801;999}   0.71
 7    P{213;222}           Q{207;211}   0.85          P{222;223}   0.85
 8    Q{041;060}           I{011;051}   0.50          I{011;051}   0.50
 9    P{211;062}           P{213;216}   0.50          I{062;211}   0.75
10    P{236;219}           Q{230;052}   0.42          P{236;207}   0.68

The low association could be explained by the complex coupling
structure of the attributes in power systems. According to the definition of surrogate, high
association relies on the dependency between the surrogate and the primary attributes,
i.e., the surrogate attribute gives similar decisions to the primary attribute on all the
training cases regardless of any other attribute. However, in power systems, one attribute
(i.e., voltage magnitude, voltage phase angle or power/current flow) is coupled with
many other non-co-located attributes, as dictated by the AC power flow equations and the
network interconnection structure. Second, it is observed in Table 3.3 that the surrogate
attributes found by the CART algorithm are mostly co-located with the primary
attributes. This observation signifies the redundancy between co-located attributes when
used for splitting the training cases, and thus sheds lights on exploiting the locational
information to create the attribute subsets, as described in Section 3.4.2.
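To make the association measure concrete, the following is a minimal sketch of how a surrogate's association with a primary split can be computed, following the CART definition in [11]: the surrogate's mismatch rate is compared against the error of always sending cases in the primary split's majority direction. The example splits are illustrative, not taken from the WECC case.

```python
def surrogate_association(primary_goes_left, surrogate_goes_left):
    """Association of a surrogate split with a primary split (CART [11]).

    Both arguments are boolean sequences over the training cases at a node,
    True meaning the case is sent to the left child. The association is
    (min(pL, pR) - p_mismatch) / min(pL, pR), clipped below at 0, where
    min(pL, pR) is the error of the naive majority-direction rule and
    p_mismatch is the fraction of cases on which the two splits disagree.
    """
    n = len(primary_goes_left)
    p_left = sum(primary_goes_left) / n
    baseline = min(p_left, 1.0 - p_left)
    mismatch = sum(1 for a, b in zip(primary_goes_left, surrogate_goes_left)
                   if a != b) / n
    if baseline == 0.0:
        return 0.0
    return max(0.0, (baseline - mismatch) / baseline)

# A surrogate that agrees with the primary split on 9 of 10 cases:
primary = [True] * 5 + [False] * 5
surrogate = [True] * 5 + [False] * 4 + [True]
print(round(surrogate_association(primary, surrogate), 2))  # -> 0.8
```

A surrogate must beat the majority-direction rule to score above zero, and perfect agreement scores 1.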
Category 2: active power flow P_{ij}, reactive power flow Q_{ij}, and current
magnitude I_{ij}, for i ∈ I_k^{PMU} and j ∈ N(i),

where I_k^{PMU} denotes the collection of the buses with PMU installation within area k, and
N(i) denotes the collection of the neighbor buses of bus i. An attribute subset of area k
is created by including one voltage or flow measurement from each bus i ∈ I_k^{PMU} and all
phase angle difference measurements from this area. 3) The index of contingencies is
included as a categorical attribute in any attribute subset.
The criteria used in creating the attribute subsets are elaborated below. By restricting
the attributes of a subset to the PMU measurements within the same area, the impact
of certain failure scenarios, e.g., the failure of a PDC that concentrates the PMU
measurements of an area, is significantly reduced, since the small DTs using the PMU
measurements from the
other areas could still be viable. For a given bus, since Category 1 and Category 2 PMU
measurements are co-located, it suffices to include only one of them in an attribute subset
so that the redundancy within an attribute subset is minimal. Further, all measurable
phase angle differences are included. This is because theoretical and empirical results
(e.g., in [18]) suggest that angle differences contain important information regarding the
level of stress in OCs, and thus are more likely to be the attributes critical to assessing
transient instability. It is also worth noting that the Category 2 attributes from two
different buses are unlikely to be redundant, in the sense that they are the measurements
from different transmission lines, given the fact that PMUs could provide power flow
measurements and it is usually unnecessary to place PMUs at both ends of a transmission
line to achieve the full observability of power grids.
For convenience, let S_k denote the collection of candidate attribute subsets of area
k. Then, the size of S_k is given by

    M_k = \prod_{i \in I_k^{PMU}} (3 deg(i) + 1)    (3.8)

where deg(i) denotes the degree of bus i, i.e., the number of buses that connect with bus
i. Then, S = \bigcup_{k=1}^{K} S_k denotes the collection of all candidate attribute subsets.
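The count in (3.8) can be checked by direct enumeration. The sketch below builds every candidate choice of "one voltage or flow measurement per PMU bus" for a hypothetical five-bus area; the adjacency and PMU buses are illustrative, and the phase angle differences and contingency index (common to all subsets of an area) are omitted:

```python
import itertools

# Hypothetical area: bus adjacency and the buses carrying PMUs (I_k^PMU).
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5], 4: [2], 5: [3]}
pmu_buses = [1, 3]

def bus_choices(i):
    """The 3*deg(i)+1 options for bus i: its voltage phasor,
    or P/Q/I on one of its incident lines."""
    opts = [f"V{{{i}}}"]
    for j in neighbors[i]:
        opts += [f"P{{{i};{j}}}", f"Q{{{i};{j}}}", f"I{{{i};{j}}}"]
    return opts

# One option per PMU bus in each candidate attribute subset:
subsets = [list(c) for c in itertools.product(*(bus_choices(i) for i in pmu_buses))]

M_k = 1
for i in pmu_buses:
    M_k *= 3 * len(neighbors[i]) + 1

print(len(subsets), M_k)  # both equal (3*2+1) * (3*3+1) = 70
```

The enumeration and the closed-form count in (3.8) agree, since the choices at different PMU buses are independent.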
The distribution {p_s, s ∈ S} for randomly selecting attribute subsets is obtained by solving

    P_S :  max_{\{p_s, s \in S\}}  \sum_{s \in S} p_s \log p_s^{-1}
           s.t.   \sum_{s \in S} p_s A_s \ge A_0,   \sum_{s \in S} p_s = 1    (3.9)

where A_s denotes the availability of an attribute subset s and A_0 denotes a prescribed
minimum availability. According to the rules for
creating the candidate attribute subsets, it is easy to see that each of the attribute subsets
of an area consists of exactly two measurements from each PMU within this area.
Therefore, the availability of an attribute subset s of area k , which was formally defined
in Section I as the probability that the measurements of s are successfully delivered to the
monitoring center, equals that of the WAMS within area k , i.e.,
    A_s = A_k,   ∀ s ∈ S_k    (3.10)
It is assumed that the availabilities of the individual WAMS components are known a
priori (e.g., estimated from past operating data) and independent from each other. Under
these assumptions, the
availability of the WAMS within area k is given by:
    A_k = A_k^{PDC} A_k^{link} \prod_{i \in I_k^{PMU}} A_i^{PMU} A_i^{link}    (3.11)
where AiPMU , Ailink , AkPDC and Aklink denote the availability of the PMU at bus i , the
communication link from the PMU at bus i to the PDC, the PDC and the communication
link from the PDC to the monitoring center, respectively. It is worth noting that (3.10)
and (3.11) are derived for the case illustrated in Fig. 3.12, and thus may not be directly
applicable to the cases with measurement redundancy. For example, when multiple dual-use
PMU/line relays are utilized in substations, the availability of bus voltage phasor
measurements can be enhanced. The procedure for analyzing the availability of WAMS
in case of redundancy can be found in the literature (e.g., [89]).
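As a small worked example of (3.11), the sketch below evaluates A_k for one area under assumed component availabilities; all numbers are illustrative, not from the case study:

```python
def area_availability(pmu_avail, pmu_link_avail, pdc_avail, pdc_link_avail):
    """A_k = A_k^PDC * A_k^link * prod_i (A_i^PMU * A_i^link), per (3.11):
    the PDC, its link to the monitoring center, and every PMU with its
    link to the PDC must all be up simultaneously."""
    a_k = pdc_avail * pdc_link_avail
    for a_pmu, a_link in zip(pmu_avail, pmu_link_avail):
        a_k *= a_pmu * a_link
    return a_k

# Three PMUs in the area, each 99.9% available, links at 99.95%:
a_k = area_availability([0.999] * 3, [0.9995] * 3, 0.9999, 1.0)
print(round(a_k, 4))  # -> 0.9954
```

Under the independence assumption, any single weak component drags down the whole area, which is what motivates spreading attribute subsets across different areas.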
By taking (3.10) into account, it follows that the solution to problem P_S in (3.9) has
the following property.

Proposition 3.1: The optimal solution to P_S in (3.9) takes the following form:

    p_s^* = p_k^* / M_k,   ∀ s ∈ S_k    (3.12)

where {p_k^*, k = 1, …, K} is the solution to

    P_S' :  min_{p_1, …, p_K}  \sum_{k=1}^{K} p_k \log (p_k / M_k)
            s.t.   \sum_{k=1}^{K} p_k A_k \ge A_0,   \sum_{k=1}^{K} p_k = 1    (3.13)
Proof: Since P_S maximizes a concave function with affine constraints, the Karush-Kuhn-Tucker
(KKT) conditions are necessary and sufficient for a solution to be optimal.
Therefore,

    -(1 + \ln p_s^*) / \ln 2 + λ^* A_s + ν^* = 0,   ∀ s ∈ S    (3.14)

where λ^* and ν^* are the KKT multipliers for the two constraints of P_S. Then, by taking
the equality in (3.10) into account, it is easy to verify that p_s^* takes the same value
for all s ∈ S_k. The optimal solution therefore takes the form in (3.12), with
{p_k^*, k = 1, …, K} obtained by solving P_S' in (3.13).
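The KKT condition (3.14) implies that the optimal p_k in (3.13) is proportional to M_k e^{t A_k} for some multiplier t ≥ 0, so the reduced problem can be solved by a one-dimensional search. The sketch below assumes this exponential-tilting form and finds t by bisection; the M_k, A_k, and A_0 values are illustrative:

```python
import math

def area_distribution(M, A, A0):
    """Sampling probabilities p_k solving (3.13): p_k is proportional to
    M_k * exp(t * A_k), with t = 0 when the availability constraint
    sum_k p_k A_k >= A0 is already met, and t > 0 found by bisection
    otherwise."""
    def p_of(t):
        w = [m * math.exp(t * a) for m, a in zip(M, A)]
        s = sum(w)
        return [x / s for x in w]

    def mean_avail(t):
        return sum(p * a for p, a in zip(p_of(t), A))

    if mean_avail(0.0) >= A0:
        return p_of(0.0)            # constraint inactive: p_k proportional to M_k
    if A0 >= max(A):
        raise ValueError("A0 not attainable with these availabilities")
    lo, hi = 0.0, 1.0
    while mean_avail(hi) < A0:      # grow the bracket
        hi *= 2.0
    for _ in range(100):            # bisect: mean_avail is increasing in t
        mid = 0.5 * (lo + hi)
        if mean_avail(mid) < A0:
            lo = mid
        else:
            hi = mid
    return p_of(hi)

p = area_distribution(M=[700, 700, 1120], A=[0.97, 0.95, 0.99], A0=0.98)
print([round(x, 3) for x in p])
```

When the availabilities are equal across areas, the constraint is inactive and p_k reduces to M_k / Σ_k M_k, i.e., attribute subsets are effectively sampled uniformly over S.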
3.4.3 Proposed Approach for Online DSA with Missing PMU Measurements
First, L small DTs are trained offline by using randomly selected attribute subsets.
In case of missing PMU measurements in online DSA, the L′ (L′ ≤ L) viable small DTs are
identified and assigned different voting weights. Specifically, the results of
performance re-check in near real-time are utilized to quantify these voting weights.
Finally, the security classification decisions for the new OCs in online DSA are obtained
via weighted voting of the L′ viable small DTs.
In offline training, the small DTs are built such that an equal-weight voting of them,
i.e., F_L(x) = \sum_{l=1}^{L} h_l(x), could fit the training data. The iterative training
process updates the ensemble as F_l = F_{l-1} + h_l, where in each iteration the small DT
h_l is trained by minimizing the misclassification error rate

    \frac{1}{N} \sum_{n=1}^{N} 1\{y_n \ne h_l(x_{ln})\}    (3.15)
where x_{ln} denotes the measurements of the attribute subset s_l. It is well-known that the
problem in (3.15) is NP-complete [90]. Here, the CART [11] algorithm is employed to
find a sub-optimal DT, by using misclassification error rate as the splitting cost function.
It is clear from (3.15) that equal weights, i.e., 1/N, are assigned to all training data.
When historical data that identifies potential weak spots of the system is available, these
data can be integrated by assigning higher weights, and by replacing 1/ N with unequal
data weights.
In near real-time, the re-check results {h_l(x_{ln}), y_n}_{n=1}^{N}, l = 1, …, L, are
utilized to choose a few viable small DTs to be used in online DSA and to calculate the
corresponding voting weights via a process of boosting small DTs. In order to make the
best use of existing DTs, the viable small DTs in online DSA include the small DTs
without any missing PMU measurement and non-empty degenerate small DTs.
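Identifying the viable small DTs reduces to checking each DT's attribute subset against the currently missing measurements. A minimal sketch follows; the subset contents and missing set are illustrative, and degenerate-DT handling is omitted:

```python
def viable_dts(attribute_subsets, missing_measurements):
    """Indices of the small DTs whose attribute subsets contain no
    missing measurement, and which can therefore vote in online DSA."""
    missing = set(missing_measurements)
    return [l for l, subset in enumerate(attribute_subsets)
            if not missing.intersection(subset)]

subsets = [
    ["V{8}", "P{13;14}", "Q{39;9}"],   # DT 0
    ["V{13}", "I{8;9}"],               # DT 1
    ["P{16;21}", "Q{23;24}"],          # DT 2
]
# Losing the PMU at bus 8 takes V{8} and I{8;9} with it:
print(viable_dts(subsets, ["V{8}", "I{8;9}"]))  # -> [2]
```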
First, the voting weights are carefully assigned based on the re-check results of the viable small DTs.
Second, even though all the small DTs are viable, choosing the small DTs with proper
voting weights based on their accuracy can still be a critical step to guarantee accurate
decisions. This is because small DTs trained offline fit the training cases that are created
based on day-ahead prediction, while the re-check results on the N new cases contain
more relevant information on assessing the security of the imminent OCs in online DSA.
In the proposed approach, weighted voting of small DTs in H is implemented via
a boosting process. Following the method in [66], initially with F_0 as a zero function, a
small DT h_l ∈ H is first identified and added to F_{l-1}, i.e.,

    F_l = F_{l-1} + a_l h_l    (3.16)

iteratively for l = 1, 2, …, L′. The boosting process seeks to minimize the cost function

    C(F_{L'}) = \frac{1}{N} \sum_{n=1}^{N} \log_2 (1 + e^{-y_n F_{L'}(x_n)})    (3.17)

In the l-th iteration, the small DT is identified by solving

    P_DT^{(l)} :  min_{h_l \in H}  \frac{1}{N} \sum_{n=1}^{N} w_n^{(l)} 1\{y_n \ne h_l(x_{ln})\}    (3.18)

with the data weights given by

    w_n^{(l)} = \frac{1}{1 + e^{y_n F_{l-1}(x_n)}},   n = 1, …, N    (3.19)

and the voting weight of h_l is obtained as

    a_l = \arg\min_{a \in R} g_l(a),   l = 1, …, L′

where g_l(a) ≜ C(F_{l-1} + a h_l). Boosting viable small DTs in online DSA is summarized in
Algorithm 3.4.
Algorithm 3.4: Boosting viable small DTs for online DSA
Input: re-check results {h_l(x_{ln}), y_n}_{n=1}^{N}, l = 1, …, L′
Initialization: F_0 = 0
For l = 1 to L′ do
    Calculate the data weights according to (3.19).
    Find a small DT h_l by solving P_DT^{(l)} in (3.18) using the CART algorithm.
    Update F_l = F_{l-1} + a_l h_l.
End For
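Algorithm 3.4 can be sketched as follows, under simplifying assumptions: each viable small DT is represented only by its re-check predictions h_l(x_ln) ∈ {-1, +1} on the new cases, and the one-dimensional minimization of the convex g_l(a) is done by ternary search instead of Newton's method:

```python
import math

def boost_viable_dts(h, y, rounds):
    """Greedy boosting of viable DTs per (3.16)-(3.19): returns the
    accumulated voting weights a[l]. h[l][n] is DT l's prediction on new
    case n, y[n] its label, both in {-1, +1}."""
    L, N = len(h), len(y)
    F = [0.0] * N                      # ensemble scores F_{l-1}(x_n)
    a = [0.0] * L
    for _ in range(rounds):
        # data weights, eq. (3.19)
        w = [1.0 / (1.0 + math.exp(y[n] * F[n])) for n in range(N)]
        # eq. (3.18): pick the DT with least weighted misclassification
        l_best = min(range(L),
                     key=lambda l: sum(w[n] for n in range(N) if h[l][n] != y[n]))
        # voting weight: 1-D minimization of the convex cost (3.17),
        # here by ternary search over a bounded interval
        def cost(alpha):
            return sum(math.log2(1.0 + math.exp(-y[n] * (F[n] + alpha * h[l_best][n])))
                       for n in range(N)) / N
        lo, hi = -10.0, 10.0
        for _ in range(200):
            m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
            if cost(m1) < cost(m2):
                hi = m2
            else:
                lo = m1
        alpha = 0.5 * (lo + hi)
        a[l_best] += alpha
        F = [F[n] + alpha * h[l_best][n] for n in range(N)]
    return a

def classify(h, a, n):
    """Weighted vote of the viable DTs on new case n."""
    return 1 if sum(a[l] * h[l][n] for l in range(len(h))) >= 0 else -1

y = [1, 1, -1, -1]                      # labels of the new cases
h = [[1, 1, -1, -1], [1, -1, -1, 1]]    # re-check predictions of 2 viable DTs
a = boost_viable_dts(h, y, rounds=2)
print([classify(h, a, n) for n in range(4)])  # -> [1, 1, -1, -1]
```

Because the first DT matches the labels exactly, it receives all the voting weight; in general the weights spread across the viable DTs according to their weighted re-check accuracy.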
The computational complexity of Algorithm 3.4 lies mainly in calculating the data
weights and the voting weights a_l of the small DTs. According to (3.19), calculating the
data weights requires evaluating F_{l-1} for the new cases, which could be easily obtained
from the re-check results of the small DTs. Therefore, the complexity in calculating the
data weights is O(N). Solving P_DT^{(l)} boils down to searching for the small DT in H
that has the least weighted misclassification error. Since the re-check results of the
small DTs in H for the new cases are already known, the optimal small DT could be found
by comparing the weighted misclassification errors of the small DTs in H. Therefore, the
complexity in solving P_DT^{(l)} is O(LN). In the l-th iteration of the boosting process, the
voting weight is obtained by minimizing g_l(a). It is easy to verify that g_l'(0) < 0 and
g_l''(a) > 0 hold for all a ∈ R. Therefore, g_l(a) has a unique minimum in R that could
be found by using standard numerical methods (e.g., Newton's method). Further, since
g_l(a) is convex, standard numerical methods could find the minimum in a few iterations.
In each iteration, F_{l-1} + a h_l needs to be evaluated for all the N new cases. Therefore, the
complexity in calculating the voting weight for a small DT is O(N). Summarizing, the
overall computational complexity of the boosting process is O(L^2 N).
The proposed approach above relates to that in [66] in the following sense: small
DTs are utilized in both approaches; new cases are used in near real-time for accuracy
guarantee by both approaches; the security classification decisions of online DSA are
both obtained via a weighted voting of small DTs. However, the two approaches are
tailored towards different application scenarios. The approach proposed here is more
robust to missing PMU measurements, while the approach in [66] could give accurate
decisions with less effort in offline training when the availability of PMU measurements
is sufficiently high. The major differences of the two approaches are outlined as follows.
First, the small DTs in the proposed approach are trained by using attribute subsets for
robustness, whereas the entire set of attributes is used in [66]. Second, the usage of new
cases in near real-time is different. In [66], the new cases are used to update the small
DTs, whereas in the proposed approach, the new cases are only used to re-check the
performance of viable small DTs so as to quantify the voting weights.
Figure 3.14 The IEEE 39-bus system in three areas and PMU placement

aggregated generation from the rest of the Eastern Interconnection [35]. In this case study, the
test system is assumed to consist of three areas. The three areas together with the PMU
placement are illustrated in Fig. 3.14. It is worth noting that the PMU placement
guarantees the full observability of the test system when the constraints at zero-injection
buses are taken into account.
line(10,11), line(10,13)
line(10,13), line(6,11)
line(10,13), line(10,11)
line(13,14), line(10,11)
line(16,21), line(23,24)
line(21,22), line(23,24)
line(21,22), line(22,23)
line(21,22), line(16,24)
line(23,24), line(16,21)
line(23,24), line(21,22)
Area  Placement     Category 2 attributes   M_k    p_k
1     8, 13, 39     24                      700    0.28
2     18, 25, 29    24                      700    0.28
3     16, 20, 23    30                      1120   0.44
N_OC = 200 generated OCs, which are both pre-contingency and N-1 contingency
secure, are used for offline training. Combining the generated OCs with their transient
security classification decisions for the N_C = 30 selected N-2 contingencies generates
the N = 6000 cases in the knowledge base. The size and the number of small
DTs are determined by bias-variance analysis [68] and V-fold cross-validation [18]. In
this case study, L = 40 and J = 3 are used by the proposed approach; 45 DTs are used in
the two RF-based approaches.
3.4.4.3.3 Near Real-time Re-check
By following the procedure described in Section 3.4.3.2, 100 OCs are generated for
performance re-check. The DTs trained offline are applied to the new cases; the
classification results are compared with the actual security classification decisions of the
new cases. Then, these re-check results are used by Algorithm 3.4 to quantify the voting
weights of the DTs.
3.4.4.3.4 Online DSA Test
Another 100 OCs are generated for testing, by following the procedure described in
Section V.B. Recall that the availability of PDC and the communication links for PDCs is
1, and then it can be seen from Fig. 3.13 and Fig. 3.14 that the total number of failure
scenarios of all PMUs and links can be reduced to 512 (2^9, since there are 9 pairs of
PMUs and links). The online DSA test is repeated for all failure scenarios, by identifying
the missing PMU measurements and viable small DTs, calculating the voting weights of
viable small DTs, and evaluating the misclassification error rate. The misclassification
error in online DSA is calculated by:

    e(F) = \sum_{k=1}^{512} Prob(Ω(k)) e(F | Ω(k))    (3.20)

where Ω(k) denotes the k-th failure scenario, and Prob(Ω(k)) denotes the probability that
Ω(k) occurs, which can be easily calculated by using the assumed availability;
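The expectation in (3.20) can be sketched by enumerating the 2^9 up/down combinations of the PMU/link pairs; the availabilities and the per-scenario error model below are illustrative stand-ins for the re-check-based evaluation:

```python
import itertools

def expected_error(pair_avail, error_given_scenario):
    """e(F) = sum over scenarios of Prob(scenario) * e(F | scenario),
    with scenario probabilities from independent up/down states of the
    PMU/link pairs."""
    n = len(pair_avail)
    total = 0.0
    for up in itertools.product([True, False], repeat=n):
        prob = 1.0
        for avail, is_up in zip(pair_avail, up):
            prob *= avail if is_up else (1.0 - avail)
        total += prob * error_given_scenario(up)
    return total

# Toy error model: 1% base error plus 2% per lost PMU/link pair.
avail = [0.99] * 9
e = expected_error(avail, lambda up: 0.01 + 0.02 * up.count(False))
print(round(e, 4))  # -> 0.0118
```

For this linear toy model the result equals 0.01 + 0.02 * 9 * (1 - 0.99), which is a quick sanity check on the enumeration.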
Let V and I denote the true voltage and current phasors, respectively, and let Ṽ and Ĩ
be the corresponding measurements. For PMUs complying with the IEEE C37.118 standard
[39], a measurement should have a total vector error (TVE) of less than 1%, i.e.,

    |Ṽ - V| / |V| ≤ 1%    (3.21)

    |Ĩ - I| / |I| ≤ 1%    (3.22)

For convenience, let n_V = Ṽ - V and n_I = Ĩ - I denote the measurement noise. In order
to generate measurements complying with the above specifications, the complex noise n_V
and n_I are randomly generated by using the following density functions (note that other
density functions can also be used), properly scaled and truncated from standard complex
Gaussian distributions:
    f(n_V) = \frac{9}{\pi \cdot 10^{-4} |V|^2 (1 - e^{-9})} \, e^{-9 |n_V|^2 / (10^{-4} |V|^2)}   if |n_V| ≤ 10^{-2} |V|, and 0 otherwise    (3.23)

    f(n_I) = \frac{9}{\pi \cdot 10^{-4} |I|^2 (1 - e^{-9})} \, e^{-9 |n_I|^2 / (10^{-4} |I|^2)}   if |n_I| ≤ 10^{-2} |I|, and 0 otherwise    (3.24)
Then, it is clear that all noisy measurements have a TVE of not more than 1%, and are
complex Gaussian distributed within their support. The generated random measurement
noise is added to both the training and testing data. The test results are provided in Fig.
3.16.
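One way to realize the truncated densities (3.23)-(3.24) is rejection sampling: draw complex Gaussian noise whose 1% TVE bound sits at three standard deviations, and redraw on the rare draws outside the bound. A sketch, with illustrative phasor values:

```python
import cmath
import random

def tve_noise(magnitude, rng=random):
    """Complex Gaussian noise with E|n|^2 = (0.01 * magnitude / 3)^2,
    truncated to |n| <= 1% of the phasor magnitude, per (3.23)-(3.24)."""
    sigma = 1e-2 * magnitude / 3.0
    while True:
        n = complex(rng.gauss(0.0, sigma / 2 ** 0.5),
                    rng.gauss(0.0, sigma / 2 ** 0.5))
        if abs(n) <= 1e-2 * magnitude:   # reject the ~e^-9 fraction outside
            return n

def noisy_phasor(phasor, rng=random):
    """Add TVE-compliant measurement noise to a complex phasor."""
    return phasor + tve_noise(abs(phasor), rng)

random.seed(1)
v_true = cmath.rect(1.02, 0.12)     # |V| = 1.02 p.u., angle 0.12 rad
v_meas = noisy_phasor(v_true)
print(abs(v_meas - v_true) / abs(v_true) <= 0.01)  # -> True
```

Since the truncation discards only about e^-9 of the probability mass, the rejection loop almost always terminates on the first draw.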
3.5 Conclusions
In this project, an online DSA scheme based on ensemble DT learning is proposed
to handle the OC variations and topology changes that are likely to occur during the
operating horizon. The proposed scheme is applied to a practical power system, and the
results of a case study demonstrate the performance improvement brought by boosting
unpruned small DTs over a single DT. Compared to single DTs, the classification models
obtained from ensemble DT learning often have higher accuracy, and lend themselves to
cost-effective incorporation of new training cases. The results presented here also provide
insight into the potential of other ensemble DT learning techniques, e.g., random
forests, in handling the challenges of online DSA.
Further, in order to mitigate the impact of missing PMU measurements in online
DSA, a random subspace method that utilizes the topological information of WAMS and
the availability of PMU measurements has been developed and incorporated into the
ensemble DT learning. In particular, the various possibilities of missing PMU
measurements in online DSA can make off-the-shelf DT-based techniques (a single DT,
RF, etc.) fail to deliver the expected performance. The proposed ensemble DT-based
approach exploits the locational information and the availability of PMU
measurements in randomly selecting attribute subsets, and utilizes the re-check results to
re-weight the DTs in the ensemble. These special treatments, developed from a better
understanding of power system dynamics, guarantee that the proposed approach can
achieve better performance than directly applying off-the-shelf DT-based techniques.
4 References

[1] H. R. Bittencourt and R. T. Clarke, "Use of classification and regression trees (CART)
to classify remotely-sensed digital images," in Proc. IEEE International Geoscience and
Remote Sensing Symposium (IGARSS '03), 21-25 July 2003.
[2] F. Galvan and C. H. Wells, "Detecting and managing the electrical island created in
the aftermath of Hurricane Gustav using phasor measurement units (PMUs)," in Proc.
IEEE PES Transmission and Distribution Conference and Exposition, 19-22 April 2010.
[3] C. H. Wells, Redundancy and Reliability of Wide-Area Measurement Synchrophasor
Archivers. OSIsoft, LLC and Schweitzer Engineering Laboratories, Inc., 2011.
[4] S. Kolluri, S. Mandal, F. Galvan, and M. Thomas, "Island formation in Entergy power
grid during Hurricane Gustav," in Proc. IEEE Power & Energy Society General Meeting
(PES '09), 26-30 July 2009.
[5] R. Jervis, "Hurricane Isaac pounds Louisiana, water pours over levee," USA Today,
29 August 2012.
[6] P. Kundur, Power System Stability and Control. New York: McGraw-Hill, 1994.
[7] M. Kezunovic, C. Zheng, and C. Pang, "Merging PMU, operational, and
non-operational data for interpreting alarms, locating faults and preventing cascades," in
Proc. 43rd Hawaii International Conference on System Sciences (HICSS), Jan. 2010.
[8] C. Zheng, Y. Dong, O. Gonen, and M. Kezunovic, "Data integration used in new
applications and control center visualization tools," in Proc. IEEE PES General Meeting,
Minneapolis, USA, July 2010.
[9] G. Rogers, Power System Oscillations. Boston: Kluwer Academic Publishers, 2000.
[10] T. V. Cutsem and C. Vournas, Voltage Stability of Electric Power Systems. Boston:
Kluwer Academic Publishers, 1998.
[11] L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone, Classification and
Regression Trees. Pacific Grove: Wadsworth, 1984.
[12] D. Steinberg and M. Golovnya, CART 6.0 User's Manual. San Diego, CA: Salford
Systems, 2006.
[13] L. Wehenkel, T. V. Cutsem, and M. Ribbens-Pavella, "An artificial intelligence
framework for on-line transient stability assessment of power systems," IEEE Trans.
Power Syst., vol. 4, no. 2, pp. 789-800, May 1989.
[14] L. Wehenkel, M. Pavella, E. Euxibie, and B. Heilbronn, "Decision tree based
transient stability method: a case study," IEEE Trans. Power Syst., vol. 9, no. 1, pp.
459-469, Feb. 1994.
[15] S. Rovnyak, S. Kretsinger, J. Thorp, and D. Brown, "Decision trees for real-time
transient stability prediction," IEEE Trans. Power Syst., vol. 9, no. 3, pp. 1417-1426,
Aug. 1994.
[16] K. Sun, S. Likhate, V. Vittal, V. S. Kolluri, and S. Mandal, "An online dynamic
security assessment scheme using phasor measurements and decision trees," IEEE Trans.
Power Syst., vol. 22, no. 4, pp. 1935-1943, Nov. 2007.
[51] R. Kumaresan, D. W. Tufts, and L. L. Scharf, "A Prony method for noisy data:
choosing the signal components and selecting the order in exponential signal models,"
Proc. IEEE, pp. 230-233, Feb. 1984.
[52] J. F. Hauer, C. J. Demeure, and L. L. Scharf, "Initial results in Prony analysis of
power system response signals," IEEE Trans. Power Syst., vol. 5, pp. 80-89, Feb. 1990.
[53] J. F. Hauer, "Applications of Prony analysis to the determination of modal content
and equivalent models for measured power system response," IEEE Trans. Power Syst.,
vol. 6, pp. 1062-1068, Aug. 1991.
[54] J. W. Pierre, D. J. Trudnowski, and M. K. Donnelly, "Initial results in
electromechanical mode identification from ambient data," IEEE Trans. Power Syst.,
vol. 12, no. 3, pp. 1245-1251, Aug. 1997.
[55] R. W. Wies, J. W. Pierre, and D. J. Trudnowski, "Use of ARMA block processing
for estimating stationary low-frequency electromechanical modes of power systems,"
IEEE Trans. Power Syst., vol. 18, no. 1, pp. 167-173, Feb. 2003.
[56] I. Kamwa, G. Trudel, and L. Gerin-Lajoie, "Low-order black-box models for
control system design in large power systems," IEEE Trans. Power Syst., vol. 11, no. 1,
pp. 303-311, Feb. 1996.
[57] C. Zheng, V. Malbasa, and M. Kezunovic, "Online estimation of oscillatory
stability using synchrophasors and a measurement-based approach," submitted to the
17th International Conference on Intelligent System Applications to Power Systems
(ISAP 2013), under review.
[58] N. Zhou, J. W. Pierre, and J. Hauer, "Initial results in power system identification
from injected probing signals using a subspace method," IEEE Trans. Power Syst.,
vol. 21, no. 3, pp. 1296-1302, Aug. 2006.
[59] D. J. Trudnowski, J. M. Johnson, and J. F. Hauer, "Making Prony analysis more
accurate using multiple signals," IEEE Trans. Power Syst., vol. 14, no. 1, pp. 226-231,
Feb. 1999.
[60] P. Sauer, K. L. Tomsovic, and V. Vittal, "Dynamic security assessment," in The
Electric Power Engineering Handbook, 2nd ed. CRC Press, 2007, chapter 15, pp. 1-10.
[61] V. Miranda, J. Fidalgo, J. Lopes, and L. Almeida, "Real time preventive actions for
transient stability enhancement with a hybrid neural network optimization approach,"
IEEE Trans. Power Syst., vol. 10, no. 2, pp. 1029-1035, May 1995.
[62] C. Jensen, M. El-Sharkawi, and R. Marks, "Power system security assessment using
neural networks: feature selection using Fisher discrimination," IEEE Trans. Power
Syst., vol. 16, no. 4, pp. 757-763, Nov. 2001.
[63] I. Kamwa, R. Grondin, and L. Loud, "Time-varying contingency screening for
dynamic security assessment using intelligent-systems techniques," IEEE Trans. Power
Syst., vol. 16, no. 3, pp. 526-536, Aug. 2001.
[64] L. Moulin, A. da Silva, M. El-Sharkawi, and R. J. Marks, "Support vector machines
for transient stability analysis of large-scale power systems," IEEE Trans. Power Syst.,
vol. 19, no. 2, pp. 818-825, May 2004.
[65] A. Rajapakse, F. Gomez, K. Nanayakkara, P. Crossley, and V. Terzija, "Rotor angle
instability prediction using post-disturbance voltage trajectories," IEEE Trans. Power
Syst., vol. 25, no. 2, pp. 947-956, May 2010.
[66] M. He, J. Zhang, and V. Vittal, "A data mining framework for online dynamic
security assessment: decision trees, boosting, and complexity analysis," in Proc. IEEE
PES Innovative Smart Grid Technologies (ISGT), Jan. 2012, pp. 1-8.
[67] Y. Xu, Z. Y. Dong, J. H. Zhao, P. Zhang, and K. P. Wong, "A reliable intelligent
system for real-time dynamic security assessment of power systems," IEEE Trans.
Power Syst., vol. 27, no. 3, pp. 1253-1263, Aug. 2012.
[68] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data
Mining, Inference, and Prediction, 2nd ed., ser. Springer Series in Statistics.
Springer-Verlag, 2008.
[69] I. Genc, R. Diao, V. Vittal, S. Kolluri, and S. Mandal, "Decision tree-based
preventive and corrective control applications for dynamic security enhancement in
power systems," IEEE Trans. Power Syst., vol. 25, no. 3, pp. 1611-1619, Aug. 2010.
[70] Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning
and an application to boosting," Journal of Computer and System Sciences, vol. 55,
pp. 119-139, 1997.
[71] A. Niculescu-Mizil and R. Caruana, "Obtaining calibrated probabilities from
boosting," in Proc. 21st Conference on Uncertainty in Artificial Intelligence (UAI '05),
AUAI Press, 2005.
[72] R. Banfield, L. Hall, K. Bowyer, and W. Kegelmeyer, "A comparison of decision
tree ensemble creation techniques," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29,
no. 1, pp. 173-180, Jan. 2007.
[73] P. Utgoff, N. Berkman, and J. Clouse, "Decision tree induction based on efficient
tree restructuring," Mach. Learn., vol. 29, pp. 5-44, Oct. 1997.
[74] M. Box, D. Davies, and W. Swann, Non-Linear Optimization Techniques. Oliver and
Boyd, 1969.
[75] R. Schainker, G. Zhang, P. Hirsch, and C. Jing, "On-line dynamic stability analysis
using distributed computing," in Proc. IEEE Power and Energy Society General
Meeting, July 2008, pp. 1-7.
[76] S. Chakrabarti and E. Kyriakides, "Optimal placement of phasor measurement units
for power system observability," IEEE Trans. Power Syst., vol. 23, no. 3, pp. 1433-1440,
Aug. 2008.
[77] Powertech Labs, DSATools: Dynamic Security Assessment Software,
http://www.dsatools.com.
[78] M. He, V. Vittal, and J. Zhang, "Online dynamic security assessment with missing
PMU measurements: a data mining approach," IEEE Trans. Power Syst., vol. 28, no. 2,
pp. 1969-1977, 2013.
[79] Alberta Electric System Operator Rules, "Section 502.9: synchrophasor
measurement unit technical requirements," Aug. 2012. [Online]. Available:
http://www.aeso.ca/downloads/2012-08-30 Section 502-9 phasor.pdf
[80] Schweitzer Engineering Laboratories Technical Report, "Improving the availability
of synchrophasor data," Aug. 2011. [Online]. Available:
https://www.selinc.com/TheSynchrophasorReport.aspx?id=98004
[81] R. Emami and A. Abur, "Robust measurement design by placing synchronized
phasor measurements on network branches," IEEE Trans. Power Syst., vol. 25, no. 1,
pp. 38-43, Feb. 2010.
[82] A. Gomez-Exposito, A. Abur, P. Rousseaux, A. de la Villa Jaen, and
C. Gomez-Quiles, "On the use of PMUs in power system state estimation," in Proc. 17th
Power Systems Computation Conference, Stockholm, Sweden, Aug. 2011.
[83] Y. Wang, W. Li, and J. Lu, "Reliability analysis of wide-area measurement
system," IEEE Trans. Power Del., vol. 25, no. 3, pp. 1483-1491, July 2010.
[84] R. Bryll, R. Gutierrez-Osuna, and F. K. Quek, "Attribute bagging: improving
accuracy of classifier ensembles by using random feature subsets," Pattern Recognition,
vol. 36, no. 6, pp. 1291-1302, June 2003.
[85] T. K. Ho, "Random decision forests," in Proc. Third Int'l Conf. Document Analysis
and Recognition, Montreal, Canada, Aug. 1995, pp. 278-282.
[86] A. G. Phadke and J. Thorp, Synchronized Phasor Measurements and Their
Applications. New York: Springer, 2008.
[87] T. K. Ho, "The random subspace method for constructing decision forests," IEEE
Trans. Pattern Anal. Mach. Intell., vol. 20, no. 8, pp. 832-844, Aug. 1998.
[88] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, Oct. 2001.
[89] V. Khiabani, O. P. Yadav, and R. Kavasseri, "Reliability-based placement of phasor
measurement units in power systems," J. Risk and Reliability, vol. 226, no. 1,
pp. 109-117, Feb. 2012.
[90] L. Hyafil and R. L. Rivest, "Constructing optimal binary decision trees is
NP-complete," Information Processing Letters, vol. 5, no. 1, pp. 15-17, May 1976.
[91] F. Aminifar, S. Bagheri-Shouraki, M. Fotuhi-Firuzabad, and M. Shahidehpour,
"Reliability modeling of PMUs using fuzzy sets," IEEE Trans. Power Del., vol. 25,
no. 4, pp. 2384-2391, Oct. 2010.
Appendix A1

The regression error of a tree predictor d over the N training cases is measured by

    R(d) = \frac{1}{N} \sum_{n=1}^{N} (y_n - d(x_n))^2

The value \bar{y}(t) that minimizes R(d) is the average of y_n for all cases (x_n, y_n) falling into
node t, that is:

    \bar{y}(t) = \frac{1}{N(t)} \sum_{x_n \in t} y_n

Given the set of candidate splits S, for any s ∈ S that splits node t into t_L and t_R, let
ΔR(s, t) = R(t) - R(t_L) - R(t_R) denote the resulting decrease in error; the best split s^*
satisfies

    ΔR(s^*, t) = \max_{s \in S} ΔR(s, t)

A1.1.

For any subtree T ⊆ T_max, let us define its complexity as |T̃|, the number of terminal nodes
in T. Then its cost-complexity measure R_α(T) is:

    R_α(T) = R(T) + α |T̃|

where α ≥ 0 is called the complexity penalty. For each value of α, find the subtree
T(α) ⊆ T_max such that the cost-complexity R_α(T) is minimized:

    R_α(T(α)) = \min_{T ⊆ T_max} R_α(T)

This produces a nested sequence of pruned subtrees and an increasing sequence of penalties:

    T_1 ⊃ T_2 ⊃ T_3 ⊃ … ⊃ {t_1},   0 = α_1 < α_2 < α_3 < …

To select the right-sized tree from the sequence {T_1, T_2, …}, a proportion of the N cases
is randomly selected and set aside as test samples TS of size N_2. The cost of subtree T_k is:

    R^{TS}(T_k) = \frac{1}{N_2} \sum_{(x_n, y_n) \in TS} (y_n - d_k(x_n))^2

Another test method is V-fold cross-validation (CV). Dividing the N cases into V subsets
{N_1, N_2, …, N_V}, let:

    R^{CV}(T_k) = \frac{1}{N} \sum_{v=1}^{V} \sum_{(x_n, y_n) \in N_v} (y_n - d_k^{(v)}(x_n))^2

where d_k^{(v)} is the tree built without the cases in N_v, and

    RE^{CV}(T_k) = R^{CV}(T_k) / R(\bar{y})

A1.2.

The standard error (SE) estimate is used to select the best pruned subtree commensurate
with accuracy. Taking cross-validation testing as an example, the subtree T_k is selected as
the best pruned tree if:

    R^{CV}(T_k) \le R^{CV}(T_{k_0}) + SE

where T_{k_0} is the subtree with the minimum cross-validation cost and SE is the standard
error of R^{CV}(T_{k_0}).
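The selection rule above (often called the 1-SE rule) can be sketched as follows; the cross-validation costs along the pruning sequence and the SE value are illustrative:

```python
def best_pruned_tree(cv_costs, se):
    """Index of the best pruned subtree: the smallest tree (largest index,
    since trees shrink along the pruning sequence) whose CV cost is within
    one standard error of the minimum CV cost."""
    k0 = min(range(len(cv_costs)), key=lambda k: cv_costs[k])
    threshold = cv_costs[k0] + se
    return max(k for k in range(len(cv_costs)) if cv_costs[k] <= threshold)

costs = [0.210, 0.185, 0.180, 0.182, 0.240]  # R^CV(T_k) along the sequence
print(best_pruned_tree(costs, se=0.005))     # -> 3
```

Preferring the smaller tree within one SE of the minimum trades a statistically insignificant increase in cost for a simpler, more interpretable tree.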