Proceedings of the 4th IASME / WSEAS International Conference on ENERGY & ENVIRONMENT (EE'09)

Analyzing Malaysian Wind Speed Data Using Statistical Distribution


AZAMI ZAHARIM1,2, SITI KHADIJAH NAJID1, AHMAD MAHIR RAZALI2 AND KAMARUZZAMAN SOPIAN2
1 Fundamental Engineering Unit, Faculty of Engineering and Built Environment
2 Solar Energy Research Institute (SERI)
Universiti Kebangsaan Malaysia
43600 UKM Bangi, Selangor D.E.
MALAYSIA
azami@vlsi.eng.ukm.my, khadijahnajid@gmail.com, ksopian@vlsi.eng.ukm.my

Abstract:- Many studies have sought a suitable statistical model to describe wind energy potential. The most important parameter in estimating wind energy potential is wind speed. Because wind speed is a random phenomenon, statistical methods are well suited to estimating it, and wind speed probabilities can be estimated using probability distributions. An accurate determination of the probability distribution of wind speed values is therefore very important in evaluating the wind energy potential of a region. In the past literature, Weibull and Rayleigh are the two most widely used distributions. In this paper, however, the Burr, Lognormal and Frechet distributions were fitted to data sets for a specific location in Pahang, Malaysia. To determine the proper distribution, the Kolmogorov-Smirnov (Ks), Anderson-Darling (AD) and chi-square ($\chi^2$) tests were used together with plots of the fitted probability density function (PDF) and cumulative distribution function (CDF). Based on the graphical and computed goodness of fit results, the general inference is that the Burr distribution is the model that fits the data best.

Keywords: Wind speed, wind speed distribution, goodness of fit tests

1 Introduction
The negative environmental effects of fossil fuel combustion, together with limited fossil fuel stocks, have forced many countries to explore and change to renewable energy. Changing to renewable energy sources and implementing environmentally friendly alternatives would ensure sustainability. Limited fossil fuel reserves and the adverse effects associated with their use make alternatives to conventional energy sources, especially renewable ones, increasingly attractive. Renewable energy resources include solar, wind, wave, geothermal and biomass, and all of them are abundant in Malaysia. Developing these resources offers a new solution to the present energy shortage. Moreover, the utilization of solar and wind power has become increasingly significant, attractive and cost-effective since the oil crises of the early 1970s (Elhadidy, M.A. et al., 2000).

Wind, which is actually a form of solar energy, is one resource that researchers are working to bring into greater use. Winds are caused by the heating of the atmosphere by the sun and by the rotation of the earth. Malaysia lies in the equatorial zone and its climate is strongly influenced by the monsoons. Malaysia has vast solar and wind resources available for energy generation through renewable energy technologies. Commercialization of these technologies is needed, especially for rural, social and economic development. Wind is a source of renewable energy that is clean and efficient and offers many advantages. One of them is that the kinetic energy of the wind can be converted into useful forms such as electricity. Wind turbines, which developed from water-pumping windmills that were later replaced by steam engines, need a constant wind speed to generate electricity. Wind energy can be considered a green power technology because it has only minor impacts on the environment. Any impact is likely to arise from the process of producing wind energy itself, and so far no major impact has been observed. Currently, wind energy is one of the fastest developing renewable energy technologies across the globe. Apart from European countries, India, Japan and China are leading the development of these technologies in the Asian region. The Asian region is set to be the most dynamic geographical zone, with a growth of 48%. Shiraishi et al. [9] write that Japan aims to generate up to three million kWh from wind by 2010.

The aim of the present work is to evaluate the wind energy potential of the east coast area of Malaysia. This is done by investigating the wind characteristics at the location using statistical analysis techniques and fitting an adequate statistical distribution.

ISSN: 1790-5095    ISBN: 978-960-474-055-0

2 Wind Energy Analysis


Nowadays, wind analysis gives valuable information to researchers involved in renewable energy studies. The use of wind energy can significantly reduce the combustion of fossil fuel and the consequent emission of carbon dioxide. Supplementing our energy base with clean and renewable sources of energy has become imperative because of the present-day energy crisis and growing environmental consciousness. Knowledge of the statistical properties of wind speed is essential for predicting the energy output of a wind energy conversion system. Because wind energy is highly variable in space and time, it is important to verify that the method used to analyze the measured wind data yields an estimated energy that is close to the energy actually collected. Mahyoub H. Al Buhairi (2006) notes that in recent years many efforts have been made to construct an adequate model for the wind speed frequency distribution. The distribution of wind energy over different wind speeds is commonly known as the wind power density, which is calculated by multiplying the power at each wind speed by the probability of that wind speed. A. N. Celik (2003) summarized that in engineering, wind speed distribution functions are ultimately used to model the wind power density correctly, not the wind speed distribution itself. Therefore, the most important criterion for a distribution function is how successfully it predicts the measured wind power density, not the wind speed distribution.
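The wind power density calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the usual power-per-unit-area expression 0.5 * rho * v^3, a standard air density of 1.225 kg/m^3, and made-up speed classes and frequencies. The function name `wind_power_density` is hypothetical.

```python
# Assumed constant: standard sea-level air density (not stated in the paper).
AIR_DENSITY = 1.225  # kg/m^3


def wind_power_density(speeds, probabilities, rho=AIR_DENSITY):
    """Mean wind power density in W/m^2: sum over classes of 0.5*rho*v^3 * p(v)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(0.5 * rho * v ** 3 * p for v, p in zip(speeds, probabilities))


# Made-up speed classes (m/s) and relative frequencies, for illustration only.
speeds = [1.0, 2.0, 3.0, 4.0]
probabilities = [0.4, 0.3, 0.2, 0.1]
print(wind_power_density(speeds, probabilities))
```

Because the speed enters as a cube, the upper tail of the fitted distribution dominates the result, which is why the choice of distribution matters for energy estimates.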

Past researchers have used various distributions in their efforts to exploit wind potential, and there are several methods for calculating the parameters of a specific wind speed distribution. The most commonly used is the graphical method. In addition, to check the accuracy of a specific distribution, one has to apply two or more methods to the given data. A statistical distribution commonly used for describing measured wind speed data is the Weibull distribution. K. Condradsen et al. (1984) review the methods found in the statistical literature for estimating the parameters of the Weibull distribution, with special emphasis on the efficiency of the different methods. From this review, the most appropriate method for a given application can be chosen. Maximum likelihood estimators should be used because of their large-sample efficiency; however, they require an iterative minimization. When there are few observations (say, fewer than 25), Condradsen et al. recommend closed-form estimators, namely the least-squares estimators. In this paper, however, the observations are about three years of daily data for a particular place in Pahang, which involves tedious work and analysis. Hence the general conclusion is that maximum likelihood is the appropriate method for the parameter estimation.

3 Wind Speed Data


The data used for the present calculations were obtained from Pemerhatian Cuaca Harian (Daily Weather Observations), a book published yearly by Pusat Pengajian Sosial, Pembangunan & Persekitaran (PPSPP), Fakulti Sains Sosial & Kemanusiaan (FSSK), Universiti Kebangsaan Malaysia (UKM), for the years 2004 to 2006. A rotating cup type anemometer was used, and the station was positioned in an open space free of obstacles, at a height of 3 meters, in Cameron Highlands. Wind speeds sampled every 10 seconds were averaged over 5 minutes and stored in a data logger. The 5-minute averages were further averaged over one hour, and at the end of each hour the hourly mean wind speed was calculated and stored sequentially in permanent memory. Based on these data, the wind speeds were analyzed using statistical computer software.
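The two-stage averaging described above (10-second samples to 5-minute means, then to hourly means) can be sketched as below. The logger's actual firmware is not described in the paper, so the helper `block_means` and the sample readings are assumptions made purely for illustration.

```python
from statistics import fmean

SAMPLES_PER_5_MIN = 30  # 5 minutes of 10-second samples
BLOCKS_PER_HOUR = 12    # twelve 5-minute means per hour


def block_means(values, block_size):
    """Average consecutive non-overlapping blocks; an incomplete final block is dropped."""
    n_blocks = len(values) // block_size
    return [fmean(values[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


# Made-up hour of 10-second readings (360 samples), alternating 1.0 and 1.5 m/s
# every five minutes.
readings = [1.0 + 0.5 * ((i // 30) % 2) for i in range(360)]
five_minute = block_means(readings, SAMPLES_PER_5_MIN)
hourly = block_means(five_minute, BLOCKS_PER_HOUR)
print(five_minute, hourly)
```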

4 Wind Speed Probability Distribution Function


From the early days of wind energy research, a few distributions have predominantly been used for fitting measured wind speed data, in step with efforts to build a cleaner environment that relies on clean, renewable energy. The wind speed distribution, one of the wind characteristics, is of great importance not only for structural and environmental design and analysis, but also for assessing the wind energy potential and the performance of wind energy conversion systems. E. Kavak Akpinar and S. Akpinar (2004) summarize that the Weibull and Rayleigh functions are commonly used for fitting the measured wind speed probability distribution. In much of the literature, Weibull is found to be the best-fitting distribution. However, these studies accept the Weibull form a priori, and the probability density function of wind speed is not always statistically accepted as a Weibull pdf. To justify that acceptance, different pdfs should be investigated and incorporated into the analyses.

As researchers around the world have worked to develop an adequate statistical model of the wind speed frequency distribution for their own countries, many types of distributions have been introduced. In this study, the appropriate theoretical pdf was chosen by comparing three fitted statistical distributions (Burr, Frechet and Lognormal). To determine the proper theoretical pdf, three goodness of fit tests (chi-square, Kolmogorov-Smirnov and Anderson-Darling) and the fitted plots are used at the end of this paper.

The lognormal distribution is defined with reference to the normal distribution. A random variable is lognormally distributed if its logarithm is normally distributed. The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.
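The lognormal density function, with parameters $\mu$ and $\sigma$, is given by

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0 \qquad (1)$$

The parameter $\mu$ is the mean and $\sigma$ is the standard deviation of the normal random variable $\ln(x)$, not of the lognormal random variable $x$. Although sometimes confusing, $\mu$ is also the median of the normal random variable $\ln(x)$, because $\mu$ is the median of $N(\mu, \sigma)$. The cumulative probability function of the Lognormal distribution is

$$F(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right) \qquad (2)$$

where $\Phi$ is the standard normal cumulative distribution function.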
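As a minimal sketch of the lognormal density and cumulative function in their standard form, the code below also includes the closed-form maximum likelihood estimates (the mean and standard deviation of ln x). The function names and the sample speeds are assumptions for illustration; the paper does not show its own computation.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Lognormal density; zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """Lognormal CDF, written with the error function for the normal CDF."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_mle(data):
    """Closed-form MLE: mean and (1/n) standard deviation of the log data."""
    logs = [math.log(v) for v in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


# Made-up daily mean wind speeds in m/s, for illustration only.
speeds = [0.8, 1.2, 1.5, 1.9, 2.4, 3.1]
mu_hat, sigma_hat = lognormal_mle(speeds)
print(mu_hat, sigma_hat, lognormal_cdf(2.0, mu_hat, sigma_hat))
```

Since the median of ln x is mu, the median of x itself is exp(mu), which gives a quick sanity check on fitted values.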

The Frechet distribution was introduced by the French mathematician Maurice René Fréchet. The Frechet cumulative distribution function (CDF) is the only CDF defined on the nonnegative real numbers that is a well-defined limiting CDF for the maxima of random variables (RVs). As such, it is important for modelling the statistical behaviour of material properties in a variety of engineering applications (Harlow, D. G., 2002). This distribution is also known as the type II generalized extreme value distribution and is usually used to model extreme events. The type II extreme value (Frechet) case is equivalent to taking the reciprocal of values from a standard Weibull distribution. The probability density function (PDF) and the cumulative distribution function of the Frechet distribution are

$$f(x) = \frac{\alpha}{\beta}\left(\frac{\beta}{x}\right)^{\alpha+1} \exp\left(-\left(\frac{\beta}{x}\right)^{\alpha}\right) \qquad (3)$$

$$F(x) = \exp\left(-\left(\frac{\beta}{x}\right)^{\alpha}\right) \qquad (4)$$

where $\beta$ is the scale parameter and $\alpha$ is the shape parameter.

The Burr distribution is a continuous probability distribution for a nonnegative random variable. It is also known as the Singh-Maddala distribution, and sometimes as the generalized log-logistic distribution, and is most commonly used to model household income. The Burr probability density function and cumulative distribution function are given as

$$f(x) = \frac{\alpha k}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1}\left(1+\left(\frac{x}{\beta}\right)^{\alpha}\right)^{-(k+1)} \qquad (5)$$

$$F(x) = 1 - \left(1+\left(\frac{x}{\beta}\right)^{\alpha}\right)^{-k} \qquad (6)$$

where $k$ is a shape parameter, $\beta$ is the scale parameter and $\alpha$ is a shape parameter.

Evaluating the goodness of fit is very important in choosing the best distribution. As is common in the statistical literature, the term goodness of fit can be understood in several senses: a "good fit" might be a model that the data could reasonably have come from, given the assumptions of least-squares fitting; one whose coefficients can be estimated with little uncertainty; one that explains a high proportion of the variability in the data; or one that is able to predict new observations with high certainty. A conclusion can finally be made based on the null and alternative hypotheses, which can be written simply as

H0: The data follow a specific distribution.
H1: The data do not follow the specific distribution.

Three goodness of fit tests were used in this paper: Kolmogorov-Smirnov, Anderson-Darling and chi-square. Following Cellura M. et al. (2008), the Anderson-Darling test statistic can be written as

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln F(x_i) + \ln\left(1 - F(x_{n+1-i})\right)\right] \qquad (7)$$

where $x_1 \le \dots \le x_n$ are the ordered observations. According to the Kolmogorov-Smirnov test method (Rozaimah Zainal Abidin et al., 2008), the distribution function of the parent set X is defined as F(x), while the empirical distribution is

$$F_n(x) = \frac{k}{n} \qquad (8)$$

where $k$ is the cumulative frequency and $n$ is the sample size. The chi-square distribution gets special attention because of its importance in normal sampling theory. If a set of $n$ observations is normally distributed with variance $\sigma^2$ and $s^2$ is the sample variance, then

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} \qquad (9)$$

follows a chi-square distribution with $n-1$ degrees of freedom.
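The Frechet and Burr cumulative functions and the Kolmogorov-Smirnov and Anderson-Darling statistics can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the paper: the standard textbook forms of the two CDFs and of both statistics are assumed, the function names are hypothetical, and the sample speeds and parameter values are made up rather than taken from the fitted results.

```python
import math


def frechet_cdf(x, alpha, beta):
    """Frechet CDF with shape alpha and scale beta (standard form, assumed)."""
    if x <= 0:
        return 0.0
    return math.exp(-((beta / x) ** alpha))


def burr_cdf(x, k, alpha, beta):
    """Burr (Singh-Maddala) CDF with shapes k, alpha and scale beta (standard form, assumed)."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / beta) ** alpha) ** (-k)


def ks_statistic(data, cdf):
    """Largest gap between the empirical step function and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ad_statistic(data, cdf):
    """Anderson-Darling A^2; requires 0 < cdf(x) < 1 at every observation."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        total += (2 * i - 1) * (math.log(cdf(x)) + math.log(1.0 - cdf(xs[n - i])))
    return -n - total / n


# Made-up speeds (m/s) and hypothetical Burr parameters, for illustration only.
speeds = [0.9, 1.1, 1.4, 1.6, 1.8, 2.1, 2.6, 3.4]


def fitted(x):
    return burr_cdf(x, k=0.5, alpha=6.0, beta=1.4)


print(ks_statistic(speeds, fitted), ad_statistic(speeds, fitted))
```

Smaller values of either statistic indicate a fitted CDF that tracks the sample more closely; the Anderson-Darling statistic gives more weight to the tails than the Kolmogorov-Smirnov statistic does.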

5 Results and Discussion


The main results of the present study are summarized in Table 1, which shows the descriptive statistics of the data set. The total number of observations is 363 for 2004 and 365 for each of the following two years, indicating that a few values are defective or missing. These missing observations are probably due to machine calibration, servicing or malfunction. As shown in Table 1, the highest mean value is in 2006, with a standard deviation of 0.9173. The kurtosis, which indicates that the peak is narrower than that of the normal distribution, is 4.1, 4.95 and 2.95 for 2004, 2005 and 2006 respectively. The monthly mean wind speeds and standard deviations for the three individual years and for the whole period are presented in Table 2. The formula for the mean wind velocity is given in equation (5), and the standard deviation is obtained by taking the square root of equation (6). Over the whole three years, the mean wind speed is only about 1.9 m/s. The highest monthly mean wind speed over the three years is 2.69 m/s, which occurs in January and December, while the lowest is 1.39 m/s in May.

Table 1 Descriptive Statistics of Cameron Highlands, 2004-2006
Statistics              Estimates (m/s)
                        2004      2005      2006
Mean                    1.714     1.9474    2.0353
Std deviation           1.2316    0.9361    0.9173
Skewness                1.81      1.904     1.691
Kurtosis                4.102     4.948     2.951
Minimum value           0.3       0.5       0.7
Maximum value           8.6       7.1       6.3
Sample size             363       365       365
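The descriptive statistics reported in Table 1 can be computed along the following lines. This is a sketch only: the paper does not state which estimators its software used, so the choices here (sample standard deviation with n-1, moment-based skewness, and excess kurtosis) are assumptions, as are the function name `describe` and the sample data.

```python
import math


def describe(data):
    """Descriptive statistics; estimator choices are assumptions (see lead-in)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in data) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in data) / n  # fourth central moment
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),      # sample standard deviation
        "skewness": m3 / m2 ** 1.5,              # positive means a right tail
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,   # zero for a normal distribution
        "min": min(data),
        "max": max(data),
        "n": n,
    }


# Made-up daily mean wind speeds in m/s, for illustration only.
print(describe([0.8, 1.2, 1.5, 1.9, 2.4, 3.1, 6.0]))
```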

Figures 1, 2 and 3 show the probability density function (PDF) plots for 2004, 2005 and 2006 respectively. Each plot is skewed to the right, with skewness values of 1.81, 1.904 and 1.691 respectively. The figures show that the three fitted distributions are quite similar in each of the three years. The fitted PDFs have a narrow peak at about 1.0 m/s in 2004 and 1.6 m/s in 2005, while in 2006 the Lognormal distribution peaks at 1.6 m/s and the other two distributions peak at 1.4 m/s. Figures 4, 5 and 6 show the cumulative distribution function plots of the three distributions for 2004, 2005 and 2006 respectively. There it is clear that the Burr curve lies closer to the empirical distribution of the sample data than the other two distributions do. Hence, from the graphical results alone, it can be concluded that the Burr distribution fits the data well. The variation of wind velocity is often described using a statistical probability density function, a method that has been widely accepted for evaluating wind load probabilities. Parameter estimation in this paper was carried out with the Maximum Likelihood Estimator (MLE) method, which is obtained by solving equations (10) and (11). The parameters of all three distributions for the three years were calculated analytically and the results are shown in Table 3.

Table 2 Monthly Wind Speed (Vm) and Standard Deviation (SD) in Cameron Highlands, 2004-2006 (m/s)
Month        2004          2005          2006          Whole Year
             Vm     SD     Vm     SD     Vm     SD     Vm     SD
January      2.76   0.98   3.00   1.37   2.31   1.29   2.69   1.21
February     1.56   0.94   1.88   0.97   3.11   1.36   2.18   1.09
March        2.00   1.66   2.25   1.67   1.92   0.75   2.06   1.36
April        1.38   0.60   1.55   0.33   1.73   0.47   1.55   0.47
May          0.91   0.36   1.66   0.53   1.61   0.38   1.39   0.42
June         2.02   1.77   1.66   0.65   2.01   0.75   1.89   1.06
July         1.12   1.01   1.67   0.34   1.61   0.47   1.47   0.6
August       1.34   0.68   1.66   0.39   1.98   0.57   1.66   0.55
September    1.27   0.63   1.83   0.82   1.62   0.64   1.58   0.69
October      1.95   1.48   1.69   0.48   1.59   0.49   1.74   0.82
November     1.63   0.85   1.79   0.64   2.27   0.76   1.89   0.75
December     2.61   1.42   2.69   0.59   2.77   1.02   2.69   1.01
Yearly       1.71   1.03   1.94   0.73   2.04   0.75   1.9    0.84

Table 3 Yearly Parameters for Cameron Highlands, 2004-2006 (values in source order; parameter symbols other than k are not legible in the source)
Distribution   2004                       2005                       2006
Burr           k = 0.325, 8.528, 1.376    k = 0.507, 6.008, 1.419    k = 0.650, 3.229, 1.079
Frechet        3.251, 1.569               3.000, 1.262               1.9226, 1.024
Lognormal      0.386, 0.630               0.417, 0.574               0.643, 0.326

Fig. 1 Probability Distribution for the Year 2004 (probability density function f(x) against x: histogram of the data with fitted Burr, Frechet and Lognormal curves)

Fig. 2 Probability Distribution for the Year 2005 (probability density function f(x) against x: histogram of the data with fitted Burr, Frechet and Lognormal curves)

Fig. 3 Probability Distribution for the Year 2006 (probability density function f(x) against x: histogram of the data with fitted Burr, Frechet and Lognormal curves)

Fig. 4 Cumulative Distributions for the Year 2004 (cumulative distribution function F(x) against x: sample data with fitted Burr, Frechet and Lognormal curves)

Fig. 5 Cumulative Distributions for the Year 2005 (cumulative distribution function F(x) against x: sample data with fitted Burr, Frechet and Lognormal curves)

Fig. 6 Cumulative Distributions for the Year 2006 (cumulative distribution function F(x) against x: sample data with fitted Burr, Frechet and Lognormal curves)

7 Conclusions
Table 4 shows the values of the goodness of fit test statistics. These statistics make it possible to verify the adequacy of the model distributions for the wind speed data sets.

Table 4 Goodness of Fit Test
Test         Distribution   2004      2005      2006
Ks           Burr           0.054     0.051     0.057
             Frechet        0.065     0.081     0.065
             Lognormal      0.079     0.101     0.109
AD           Burr           1.277     0.803     1.044
             Frechet        3.059     2.374     1.106
             Lognormal      1.536     3.608     5.497
Chi square   Burr           14.390    17.455    19.958
             Frechet        9.391     14.508    20.169
             Lognormal      18.858    33.537    49.889
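As an illustration of how the Ks values in Table 4 can be set against a critical value, the sketch below uses the common large-sample approximation 1.36/sqrt(n) for the 5% level, with the sample sizes from Table 1. The paper does not state the critical values it used, so this threshold is an assumption. The approximation is also derived for a fully specified distribution and is only indicative when the parameters are estimated from the same data.

```python
import math

# Ks statistics from Table 4 and sample sizes from Table 1.
KS = {
    2004: {"Burr": 0.054, "Frechet": 0.065, "Lognormal": 0.079},
    2005: {"Burr": 0.051, "Frechet": 0.081, "Lognormal": 0.101},
    2006: {"Burr": 0.057, "Frechet": 0.065, "Lognormal": 0.109},
}
SAMPLE_SIZE = {2004: 363, 2005: 365, 2006: 365}


def ks_critical_5pct(n):
    """Assumed large-sample 5% critical value for the Kolmogorov-Smirnov test."""
    return 1.36 / math.sqrt(n)


not_rejected = {}
for year, stats in KS.items():
    crit = ks_critical_5pct(SAMPLE_SIZE[year])
    # A distribution is kept when its statistic lies below the critical value.
    not_rejected[year] = [name for name, d in stats.items() if d < crit]
    print(year, round(crit, 4), not_rejected[year])
```

Under this assumed threshold, Burr is the only distribution that is not rejected in all three years.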

From the table, we can conclude that among the three model distributions, the Burr distribution fits the data sets better than the Frechet and Lognormal distributions. It gives the smallest Kolmogorov-Smirnov (Ks) and Anderson-Darling (AD) statistics in all three years, although the Frechet distribution has the smaller chi-square statistic in 2004 and 2005. Compared with the critical values of the Ks, AD and chi-square tests from the statistical tables, the Burr distribution satisfied all the statistical decision criteria.

References:
[1] Akpinar, E. Kavak and Akpinar, S. 2004. An assessment on seasonal analysis of wind energy characteristics and wind turbine characteristics. Energy Conversion and Management 46: 1848-1867.
[2] Al Buhairi, Mahyoub H. 2006. A statistical analysis of wind speed data and an assessment of wind energy potential in Taiz-Yemen. Ass. Univ. Bull. Environ. Res. 9(2): 21-33.
[3] Celik, A.N. 2003. Assessing the suitability of wind speed probability distribution functions based on wind power density. Renewable Energy 28: 1563-1574.
[4] Cellura, M., Cirrincione, G., Marvuglia, A., Miraoui, A. 2008. Wind speed spatial estimation for energy planning in Sicily: Introduction and statistical analysis. Renewable Energy 33: 1237-1250.
[5] Condradsen, K., Nielsen, L. B., Prahm, L. P. 1984. Review of Weibull statistics for estimation of wind speed distributions. Journal of Climate and Applied Meteorology 23: 1173-1183.
[6] Elhadidy, M.A., Shaahid, S.M. 2000. Parametric study of hybrid (wind+solar+diesel) power generating systems. Renewable Energy 21(2): 129-39.
[7] Harlow, D.G. 2002. Applications of the Frechet distribution function. International Journal of Material and Product Technology 5(17): 482-495.
[8] Rozaimah Zainal Abidin, Ahmad Mahir Razali, Azami Zaharim and Kamaruzzaman Sopian. 2008. Modelling wind speed data via two parameter Weibull. Proceedings of Seminar on Engineering Mathematics, 27-29 June 2008, Cameron Highlands, Pahang.
[9] Shiraishi, S., Nagai T., H., Hayashi, H., Nishi, K., Kume, H. and Dohata. Evaluation on the Numerical Simulation Model of Wind Speed Distribution behind the Offshore Wind Power Generation Used Observed Data. Proceedings of the World Wind Energy Conference (WWEC05), Melbourne, Australia, 2-4 November 2005.
