Paper 0552
Table I: Sets of the seven variables.
Set   Variables (1 to 7)
A     ✓ ✓ ✓
B     ✓ ✓ ✓ ✓
C     ✓ ✓ ✓ ✓
D     ✓ ✓ ✓ ✓ ✓
E     ✓ ✓ ✓ ✓
F     ✓ ✓ ✓ ✓ ✓
G     ✓ ✓ ✓ ✓ ✓
H     ✓ ✓ ✓ ✓ ✓ ✓
I     ✓ ✓ ✓ ✓
J     ✓ ✓ ✓ ✓ ✓
K     ✓ ✓ ✓ ✓ ✓
L     ✓ ✓ ✓ ✓ ✓ ✓
M     ✓ ✓ ✓ ✓ ✓
N     ✓ ✓ ✓ ✓ ✓ ✓
O     ✓ ✓ ✓ ✓ ✓ ✓
P     ✓ ✓ ✓ ✓ ✓ ✓ ✓

In linear regression analysis, transformations of variables are recommended because of the requirements of linear regression, such as linearity and normality. For this reason, the following transformations were attempted (for independent and dependent variables): logarithm, square root, square power and inverse. Transformed sets of variables are denoted by the subscript t; i.e., At represents set A with transformed variables. Thus, we have 32 different sets of independent variables (A to P and At to Pt) and 2 sets of dependent variables (losses and transformed losses, Y and Yt), composing 64 different sets of variables (including independent and dependent variables).

Independent variables were analyzed with respect to their relevance for energy losses (given by the value of their statistical correlation). The correlations analyzed were the Pearson coefficient and the Spearman and Kendall rank coefficients. The first gives a measure of the linear relationship between each variable and the energy loss, varying from -1 (perfectly inversely correlated) to 1 (perfectly correlated). A null relationship is indicated by 0 (zero). Table II shows the Pearson correlation between each variable and energy losses (for original and transformed variables).

Table II: Pearson Correlation Coefficient.
Variable   Original   Transformed
1           0.246      0.276
2          -0.326     -0.420
3          -0.013     -0.040
4           0.632      0.693
5           0.675      0.722
6           0.769      0.850
7           0.773      0.854

Since we know that there are nonlinear relationships in energy loss functions, the Kendall and Spearman coefficients were also used. They are non-parametric methods, and their measurement of relationship is based on ranks. Both rank coefficients corroborated the results of Table II.

The correlation coefficient is a measure of relationship. Therefore, in spite of being a good indication, it cannot be interpreted in isolation. Additional analysis, as always recommended in statistical studies, needs to be done. In particular, availability and auditability were also considered. Indeed, it does not matter if a variable has a direct relation with losses if it is not available and auditable (for instance, the maximum voltage drop).

Regression Models

Once the sets of variables are defined, regression models were developed to make the inference between variables and energy losses. Regression models are used to predict one variable (dependent variable) from one or more variables (independent variables) [7].
The regression models investigated are:
• Linear regression (LR);
• Robust regression (RR); and
• Artificial Neural Networks (ANN).

LR and RR are both linear models, but while LR uses the least squares method to fit the data, RR uses iteratively reweighted least squares, which is less sensitive to outliers.

The ANN used is a Multi-Layer Perceptron (MLP) [8], with the Levenberg-Marquardt optimization rule to update weights. ANN, especially of the MLP type, is widely used in regression analysis because of its universal function approximation property.

After many simulations, the ANN configuration chosen was one hidden layer with 10 neurons, with a maximum of 5000 iterations as the stop criterion. To train the regression models, the 1082 LV distribution networks were divided into 757 for training and 325 for testing.

REGRESSION ANALYSIS

Because it is impossible to present all the results here (there are 192 results to be shown: 3 regressors executed on 64 different sets of dependent and independent variables), Table III presents only the ten best regressions found, ordered from the first to the tenth. All of them were given by ANN. The Root Mean Squared Error (RMSE) is the measurement of goodness of fit used. Column "TLD" (Total Loss Deviation) gives the percent difference between the total loss of the test set calculated by the regressors and the total real loss. "Set X" refers to input variables (independent), while "Set Y" refers to output variables (dependent).

As expected, ANN found the best results. But it is important to highlight that linear regression obtained good results too. Table IV shows the mean performance of the three regressors.

Another assessment of performance can be made by data set. Table V presents a comparison between the 32 independent variable sets.
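The contrast between the least squares fit of LR and the iteratively reweighted fit of RR can be illustrated with a minimal sketch. It is not the implementation used in this work: the single predictor, the Huber weight function with tuning constant 1.345, the scale estimate based on the median absolute residual, and the data (a straight line with one artificial outlier) are all assumptions made for illustration.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def least_squares_fit(x, y):
    """Ordinary least squares: every observation has weight 1."""
    return weighted_line_fit(x, y, [1.0] * len(x))


def robust_fit(x, y, k=1.345, max_iter=50, tol=1e-10):
    """Iteratively reweighted least squares with Huber weights (assumed)."""
    a, b = least_squares_fit(x, y)  # start from the plain least squares fit
    for _ in range(max_iter):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate from the median absolute residual.
        scale = statistics.median([abs(r) for r in residuals]) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        # Observations with large residuals receive weights below 1.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
             for r in residuals]
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Hypothetical data: y = 2 + 3x, with the last observation corrupted.
x = [float(i) for i in range(10)]
y = [2.0 + 3.0 * xi for xi in x]
y[9] = 100.0

a_ls, b_ls = least_squares_fit(x, y)
a_rr, b_rr = robust_fit(x, y)
print(f"least squares slope: {b_ls:.3f}")
print(f"robust slope:        {b_rr:.3f}")
```

On this artificial data the single outlier pulls the least squares slope to about 6.87, while the reweighted fit should return a slope close to the underlying value of 3, because the weight of the corrupted observation shrinks at each iteration.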
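The two figures of merit reported in Table III, RMSE and TLD, can be computed as in the sketch below. The function names and the sample values are hypothetical, and the sketch assumes that TLD is the absolute difference between total estimated and total real loss, expressed as a percentage of the total real loss; the text does not state whether the deviation is signed.

```python
import math


def rmse(real, estimated):
    """Root Mean Squared Error between real and estimated losses."""
    if len(real) != len(estimated) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(real, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def total_loss_deviation(real, estimated):
    """Percent difference between total estimated and total real loss.

    Assumption: absolute value, relative to the total real loss.
    """
    total_real = sum(real)
    return 100.0 * abs(sum(estimated) - total_real) / total_real


# Hypothetical losses of three test networks (illustrative values only).
real_losses = [100.0, 200.0, 300.0]
estimated_losses = [110.0, 190.0, 310.0]

print(f"RMSE: {rmse(real_losses, estimated_losses):.2f}")
print(f"TLD:  {total_loss_deviation(real_losses, estimated_losses):.2f}%")
```

Errors of opposite sign cancel in the total but not in the RMSE, so the two measures need not rank regressions in the same order. Table III shows this: the entry with a TLD of 1.01% has one of the largest RMSE values listed.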
Table III: Best regression results.
Set X   Set Y   TLD     RMSE
        Yt      1.90%    723.43
        Yt      1.89%    906.92
        Yt      2.31%    920.14
        Yt      1.32%    978.42
        Yt      2.43%   1108.16
        Yt      3.61%   1184.27
        Y       4.30%   1279.25
        Yt      4.14%   1315.71
        Y       4.70%   1315.79
        Yt      3.39%   1346.97
        Yt      3.18%   1383.98
        Yt      1.01%   1406.56
        Yt      1.09%   1415.60

Table IV: Comparison of regressors performance.
Parameter   LR        RR        ANN
TLD         4.83%     17.11%    3.18%
RMSE        1829.72   1956.17   1735.74

Table V: Comparison of input sets performance.
Set   Mean RMSE   Set   Mean RMSE   Set   Mean RMSE
J     1546.65     I     1755.54     Ht    1838.67
K     1593.61     Mt    1769.29     Ot    1890.81
E     1664.85     G     1783.88     D     1961.57
F     1665.58     Gt    1785.89     C     1981.28
P     1666.12     It    1789.22     Dt    2125.09
H     1666.27     Kt    1791.39     Ct    2145.01
Jt    1669.89     Pt    1807.84     B     2173.15
L     1688.50     Et    1820.24     A     2203.08
M     1700.78     O     1823.98     Bt    2219.97
Lt    1731.30     Nt    1824.43     At    2249.17
N     1737.97     Ft    1826.35     -     -

Note that the lack of information about load in sets A to D may explain the worse performance of these input sets. On the other hand, the maximum information about loads, present in sets M to P, does not seem to be mandatory: only two of the sets that contain it (P and M) are among the ten best sets. Another inference concerns transformations: on average, sets with transformed variables perform worse than sets with the original variables, although Table III shows many transformed sets among the best results. It is important to emphasize that Tables IV and V contain mean results, and are "polluted" by other aspects.

CONCLUSION

This paper presented a methodology to estimate technical losses in LV distribution systems. The aim is to predict energy losses with a low level of information. Despite the growing use of load flow tools, this methodology is applicable to companies that have a low level of information. They can apply the methodology directly or submit their data to a pre-processing step, grouping networks into representative clusters and applying the methodology to one network that represents each cluster. The methodology is also suitable for regulatory agencies, which need to estimate the losses of all companies using the same methodology.

Three regression methods were tested. While ANN achieved the best results, LR showed that it can be used because of its simplicity. An analysis of variable sets showed that it is important to use load variables, even considering the difficulty of auditing them. But while maximum coincident demand is hard to audit, energy consumption is more auditable.

Acknowledgments

The authors would like to acknowledge the support given by the Brazilian National Research Council (CNPq).

REFERENCES

[1] D. Shirmohammadi, H.W. Hong, A. Semlyen and G.X. Luo, 1988, "A compensation-based power flow method for weakly meshed distribution and transmission networks", IEEE Trans. on Power Systems, vol. 3, 753-762.
[2] J.J. Grainger and T.J. Kendrew, 1989, "Evaluation of technical losses on electric distribution systems", 10th International Conference on Electricity Distribution. CIRED, vol. 6, 488-493.
[3] R. Nadira, S. Benchluch and C. A. Dortolina, 2003, "A novel approach to computing distribution losses", Trans. and Dist. Conf. and Exposition, IEEE PES, vol. 2, 659-663.
[4] H. Lasso, C. Ascanio and M. Guglia, 2006, "A Model for Calculating Technical Losses in the Secondary Energy Distribution Network", Trans. & Dist. Conf. and Exposition: Latin America. TDC '06. IEEE/PES, 1-6.
[5] J. R. C. Orillaza, R. Del Mundo, and J. A. C. Miras, 2006, "Development of Models and Methodology for the Segregation of Distribution System Losses for Regulation", TENCON 2006, IEEE Region 10 Conference, 1-4.
[6] L. M. O. Queiroz, C. Cavellucci and C. Lyra, 2008, "Methodology of LV Distribution System Network Generation for Planning Purposes" (in Portuguese), XL SBPO, SOBRAPO.
[7] J. F. Hair, R. E. Anderson, R. L. Tatham and W. C. Black, 1998, Multivariate Data Analysis, Fifth Edition, Prentice-Hall.
[8] S. Haykin, 1999, Neural Networks: A Comprehensive Foundation, Second Edition, Prentice Hall.