Principal Components Analysis (PCA) is one of the common techniques for dealing with large data sets. It is a statistical method, frequently used in the geophysical sciences, that explains the correlations in a large set of variables by means of a smaller number of independent components. To become familiar with PCA, this project will use data recorded by 19 weather stations in Spain to find the relationship among the recorded variables that provides a good estimator of rainfall. If a common relationship between the variables is observed at all stations, it will be used to characterize precipitation over Spain. However, given the large climatic differences between regions of the country, each meteorological variable is expected to play a different role in each climatic area.

Available data and objective:
Thanks to the Spanish Meteorological Agency (AEMET), monthly data recorded at 19 weather stations located in different provinces are available. For most locations the records run from January 1920 to August 2012. The variables recorded at the stations include: Month, Year, Av. Temperature, Max. Temp., Min. Temp., Total Precipitation, Max. Daily Precipitation, Rainy Days, Snowy Days, Hail Days, Atmospheric Pressure, and Average Insolation. Because of the large number of variables, PCA is needed to identify which of them are closely related to rainfall and to look for a way of predicting monthly rainfall from a combination of the given variables. PCA will not only make it possible to identify which variables carry the most weight, but will also yield new variables that are uncorrelated with (linearly independent of) one another.
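The reduction described above can be illustrated with a minimal sketch. It assumes NumPy, uses synthetic numbers in place of the AEMET records, and standardizes each variable because the station variables have different units. The function name `pca`, the three example variables, and the 90% variance threshold are illustrative assumptions, not choices stated in the project.

```python
import numpy as np

def pca(data):
    """PCA of a 2-D array (rows: monthly records, columns: variables).

    Returns the explained-variance ratio of each component, the loading
    vectors (one per row), and the component scores.
    """
    # Standardize each column so that units (degrees, hPa, days) do not matter.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # SVD of the standardized matrix; the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    eigenvalues = s**2 / (len(z) - 1)
    ratio = eigenvalues / eigenvalues.sum()
    scores = z @ vt.T
    return ratio, vt, scores

# Synthetic stand-in for one station's monthly table (NOT real AEMET data).
rng = np.random.default_rng(0)
n_months = 240
temp = rng.normal(15.0, 6.0, n_months)
pressure = 1015.0 - 0.3 * temp + rng.normal(0.0, 2.0, n_months)
rainy_days = np.clip(10.0 - 0.3 * temp + rng.normal(0.0, 2.0, n_months), 0, None)
station = np.column_stack([temp, pressure, rainy_days])

ratio, loadings, scores = pca(station)

# Keep the smallest number of components reaching 90% of the variance
# (the 90% threshold is an assumption made for this example).
n_keep = int(np.searchsorted(np.cumsum(ratio), 0.90) + 1)
print("explained variance ratio:", np.round(ratio, 3))
print("components kept:", n_keep)
```

The columns of `scores` are mutually uncorrelated, so the first `n_keep` of them could serve as the reduced set of predictors for a rainfall estimate.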
The components that explain the largest percentage of the variance will then be chosen to predict monthly rainfall. Using PCA simplifies the problem because fewer variables are involved in estimating the rainfall.

Procedure:
PCA will be performed for each weather station separately as well as for the whole sample. The first goal is to determine whether the vectors of the PCA basis are the same (or involve the same variables in the same way) at every station. Depending on the results, it will be interesting to see whether the same PCA basis can be used to describe the problem across Spain or whether, on the contrary, some climatic regions within the country follow different patterns (some variables being strongly correlated with rainfall in some areas but not in others).

Reference paper:
Kahya, E., S. Kalayci, and T. C. Piechota, 2008: Streamflow Regionalization: Case Study of Turkey. Journal of Hydrologic Engineering, Vol. 13, No. 4, pp. 205-214.
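One possible way to carry out the station-by-station comparison described in the procedure is to compare the leading loading vectors of two stations with an absolute cosine similarity. This is a sketch under stated assumptions: the similarity measure is an illustrative choice, not a method taken from the project or from the reference paper, and the stations are synthetic. The helper names `first_loading`, `loading_similarity` and `make_station` are hypothetical.

```python
import numpy as np

def first_loading(data):
    """Leading PCA loading vector of a (records x variables) array."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return vt[0]

def loading_similarity(a, b):
    """Absolute cosine between two unit loading vectors.

    The absolute value is taken because the sign of a principal
    direction is arbitrary.
    """
    return abs(float(a @ b))

def make_station(rng, n, pressure_coupling, rain_coupling):
    """Synthetic station (NOT real data): temperature, pressure, rainy days."""
    temp = rng.normal(0.0, 1.0, n)
    pressure = pressure_coupling * temp + rng.normal(0.0, 0.5, n)
    rainy = rain_coupling * temp + rng.normal(0.0, 0.5, n)
    return np.column_stack([temp, pressure, rainy])

rng = np.random.default_rng(1)
# Stations A and B share one pattern (rainy days tied to temperature);
# station C follows another (pressure tied to temperature).
station_a = make_station(rng, 300, pressure_coupling=0.0, rain_coupling=-1.0)
station_b = make_station(rng, 300, pressure_coupling=0.0, rain_coupling=-1.0)
station_c = make_station(rng, 300, pressure_coupling=1.0, rain_coupling=0.0)

sim_ab = loading_similarity(first_loading(station_a), first_loading(station_b))
sim_ac = loading_similarity(first_loading(station_a), first_loading(station_c))
print("A vs B:", round(sim_ab, 3))
print("A vs C:", round(sim_ac, 3))
```

A similarity close to 1 suggests that two stations weight the variables in the same way, while clearly lower values would point to distinct climatic regimes.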