The Hilbert-Huang Transform - Theory Applications Development

University of Iowa
Iowa Research Online

Theses and Dissertations
2011
The Hilbert-Huang Transform: theory, applications, development

Bradley Lee Barnhart
University of Iowa
Copyright 2011 Bradley L. Barnhart
This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2670

Recommended Citation
Barnhart, Bradley Lee. "The Hilbert-Huang Transform: theory, applications, development." PhD diss., University of Iowa, 2011. http://ir.uiowa.edu/etd/2670.
THE HILBERT-HUANG TRANSFORM: THEORY, APPLICATIONS, DEVELOPMENT
by Bradley Lee Barnhart
An Abstract Of a thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Physics in the Graduate College of The University of Iowa December 2011 Thesis Supervisor: Professor William Eichinger
ABSTRACT

The Hilbert-Huang Transform (HHT) is a data analysis tool, first developed in 1998, which can be used to extract the periodic components embedded within oscillatory data. This thesis is dedicated to the understanding, application, and development of this tool. First, the background theory of HHT will be described and compared with other spectral analysis tools. Then, a number of applications will be presented, which demonstrate the capability of HHT to dissect and analyze the periodic components of different oscillatory data. Finally, a new algorithm is presented which expands HHT's ability to analyze discontinuous data. The sum result is the creation of a number of useful tools developed from the application of HHT, as well as an improvement of the HHT tool itself.
Graduate College The University of Iowa Iowa City, Iowa
CERTIFICATE OF APPROVAL

PH.D. THESIS

This is to certify that the Ph.D. thesis of Bradley Lee Barnhart has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Physics at the December 2011 graduation.

Thesis Committee:
William Eichinger, Thesis Supervisor
Thomas Boggess Jr.
Paul Kleiber
Wayne Polyzou
Anton Kruger
Dedicated to Eduardo and his duende
ACKNOWLEDGMENTS

I want to first thank my adviser Dr. Bill Eichinger. Thank you for all of your encouragement, advice, and support. This work would not be possible without you. Also thank you to my wife Rebecca. You have always created such joy in my life, and I thank you for all of your love, kindness, and support. Thank you to my parents, Randall and Nancy, for a childhood which provided the pathway to success. You are my role models. And thank you to my dog Lucy. You always give me a great excuse for a long walk.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
II. BACKGROUND
    Traditional Spectral Analysis Tools
        Fourier Analysis
        Short-Time Fourier Transform
        Wavelet Analysis
        Generalized Time-Frequency Distributions
III. HILBERT-HUANG TRANSFORM (HHT)
    Hilbert Spectral Analysis
    Empirical Mode Decomposition (EMD)
IV. ANALYSIS OF SUNSPOT VARIABILITY USING THE HILBERT-HUANG TRANSFORM
    Introduction
    Ensemble Empirical Mode Decomposition (EEMD)
    Results
    Discussion
    Further Research
V. EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL TEMPERATURE, AND CO2 CONCENTRATION DATA
    Introduction
    Data Used
    Results
        Cycles in Data
        IMF Comparisons
    Discussion
VI. CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM
    The Energy Balance Problem
    EMD as a Dyadic Filter
    Eddy Covariance Methods
        Traditional Eddy Covariance Method
        EMD Eddy Covariance Method
    Orthogonality and Sampling Durations
    How Long is Long Enough?
    Conclusions
VII. AN IMPROVED ENSEMBLE EMD ALGORITHM
    Motivation
    Ensemble Empirical Mode Decomposition
    Errors Due to Data Gaps
    Error Reduction Methods
    Discussion
VIII. SUMMARY
REFERENCES
LIST OF TABLES

Table
5.1 Mean and standard deviation of instantaneous frequencies (1/yrs) calculated using the Hilbert Transform.
5.2 Periods (in years) calculated using Hilbert analysis and zero-crossing method.
5.3 Correlation coefficients (r) between total solar irradiance and sunspot from 1749 to 2009
5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945
5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009
5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945
5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009
LIST OF FIGURES

Figure
4.1 Monthly sunspot data decomposed into its intrinsic mode functions (IMFs) using EEMD
4.2 Statistical significance test for the extracted IMFs. Notice the first extracted IMF is below the 1% confidence limit and is therefore considered statistically insignificant from noise
4.3 The monthly sunspot data denoised by removing the first IMF extracted using EEMD
4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year cycle, and the (c) quasi-100-year cycle
4.5 Short-time Fourier spectrogram of the monthly sunspot data with window sizes of (a) 100 years and (b) 26 years
4.6 Wigner-Ville distribution of sunspot data
4.7 Extracted IMF representing the 11-year solar cycle plotted along with its instantaneous frequency as calculated using equation (6)
5.1 Sunspot number data set and its decomposed IMFs
5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs
5.3 Global mean temperature and its decomposed IMFs
5.4 CO2 concentration as measured from the Mauna Loa Observatory
5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using EEMD
5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.
5.7 Comparison of IMFs for global mean temperature and total solar irradiance
5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility
6.1 Dyadic nature of EMD when applied to turbulence
6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical wind velocity and temperature
6.3 Covariance contributions from IMF pairs of vertical wind velocity and temperature
6.4 Covariance contributions from w IMF 10 and all T IMFs
6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002 data from Site 161. The bottom two plots show SMEX 2002 from Site 152
6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF)
6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF).
6.8 Orthogonal (blue), nonorthogonal (red), and total (black) contributions from each IMF for the sensible heat flux as calculated from Site 161 from SMEX 2002. Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes
7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data
7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size
7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs
7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints
7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the decomposed IMFs from the discontinuous EEMD algorithm, and the blue signals are the discontinuous EEMD algorithm used after a mirroring technique was performed
7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition
CHAPTER I
INTRODUCTION

In order to describe the physical world, measurements must be gathered and interpreted. Just as it is essential to understand the specifications of the instruments used to collect data, it is also necessary to understand the strengths and limitations of the tools used to interpret data. Frequency analysis tools are used to analyze the internal fluctuations of a signal in terms of their frequency, or size, scales. While frequency analysis tools are beneficial for describing the contributions to a signal from various frequency or size scales, oftentimes the tools which are used have limitations that restrict how the data can be interpreted.

For example, Fourier-based analysis tools rely on the mathematical property that any signal can be reconstructed from the sum of sinusoidal functions. This, in theory, is advantageous and can be used to describe the relative contributions to the signal from the various sine functions with different frequencies. However, these sinusoidal functions are infinite in extent, and are required to have constant amplitudes and phases. Imagine standing in a grass field for an hour feeling the intermittent puffs of wind on your face, and it becomes clear that nature is not stationary. Or conversely, imagine if ocean waves were required to have constant amplitudes and phases, and how oddly convenient the world would seem. Since nature does not fit stationary and linear assumptions, it is necessary to extend the mathematical tools which describe nature to more adaptive methods. That is, methods should be extended to accommodate signals which are nonstationary and which may result from many, perhaps nonlinear, combinations of processes.

Following the advent of traditional Fourier analysis, many new methods have been developed to accommodate nonstationary signals. These vary from short-time Fourier
transforms (STFT), which allow a signal to be nonstationary as long as it is piece-wise stationary, to wavelet analysis, which can sift out particular signatures from a signal on a variety of size scales. Generalized time-frequency distributions have also been derived which encompass special cases such as wavelets or STFT, and include much more complicated versions of these tools. With each frequency analysis tool come assumptions and limitations which affect the signal being analyzed. It is important to understand these limitations to properly interpret the signal.

This dissertation describes a relatively new data analysis tool called the Hilbert-Huang transform (HHT) which is able to extract the frequency components from possibly nonlinear and nonstationary intermittent signals. As with any frequency analysis tool, it has strengths and weaknesses which need to be understood in order to accurately interpret the output. However, it is a powerful tool which can describe the frequency components locally and adaptively for nearly any oscillating signal. This makes the tool extremely versatile. For instance, HHT has been used to study a wide variety of data including rainfall, earthquakes, heart-rate variability, financial time series, Lidar data, and ocean waves, to name a few subjects. Therefore, it is justified to continue research on this relatively new tool in order to fully understand the underlying theory, its potential applications, and its development.

This dissertation is divided into 8 Chapters. Chapter 2 gives a brief background of current data analysis tools: their strengths and limitations. Chapter 3 introduces HHT and compares its abilities to these traditional data analysis tools. Chapters 4-7 describe four separate papers which were submitted to refereed journals for publication between 2009 and 2011; two are currently published (Chapters 4 and 5) and two are currently under review (Chapters 6 and 7).
Chapter 4 demonstrates the utility of HHT when applied to oscillatory data, in particular, sunspot number data. HHT is compared with other data analysis tools and shown to be useful for describing the local frequency components of complicated data. Chapter 5 uses a portion of HHT in order to compare two or more periodic cycles within oscillatory data. The techniques used in Chapter 4 are extended in Chapter 5 in order to compare two separate cyclic data oscillations.

Chapters 4 and 5 utilize HHT with well-known data which have been analyzed extensively using alternative methods. In contrast, Chapter 6 utilizes HHT to address an unsolved problem: the lack of a near-surface energy budget closure. This chapter will apply HHT to meteorological data in an attempt to shed new light on this problem. The poorly understood measurement sampling errors associated with near-surface fluxes will be analyzed and conclusions associated with the energy budget closure problem will be discussed.

Chapter 7 describes an improvement to the EMD algorithm in order to accommodate discontinuous, intermittently sampled data. The benefits of such an improvement are discussed as well as its limitations. Finally, Chapter 8 will give a brief summary of the research completed thus far, and give several suggestions for needed future research.

In order to understand and predict the natural world around us, it is essential not to fit the world into mathematical equations but rather to expand our mathematical equations to better fit the natural world. This dissertation aims to accomplish this by analyzing a new data analysis tool, HHT, which may more locally and adaptively describe the natural world.
CHAPTER II
BACKGROUND

Traditional Spectral Analysis Tools

Data analysis is the fundamental connection between measurements and the conclusions we draw from those measurements. Typically, data analysis tools attempt to describe the intrinsic variability of measured variables, whether they be temperature, wind velocity, heart rate, population, rainfall, stock volatility, or any other variable system. However, it is important to understand how the tools used to analyze data affect the data itself.

In order to understand the intrinsic variability of a system, measured signals are oftentimes written mathematically as the sum of their contributing components. Equation 2.1 shows that a time-dependent signal, $f(t)$, can be written as a sum of products of amplitude coefficients, $a_n$, and basis functions, $\{\phi_n(t)\}$ [6].

$$f(t) = \sum_n a_n \phi_n(t) \tag{2.1}$$

The signal and the basis functions could also be written as a function of space depending on the system being analyzed. If the basis functions form an orthonormal set, the amplitude coefficients can be calculated as in equation 2.2.

$$a_n = \int f(t)\, \phi_n^*(t)\, dt \tag{2.2}$$

The energy density contribution from each component, then, is shown in equation 2.3 [5][6][43].

$$E_n = |a_n|^2 \tag{2.3}$$
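As a minimal sketch of equations 2.1 to 2.3, the snippet below projects a sampled signal onto a small orthonormal set and then rebuilds the signal from the resulting coefficients. The basis (unit-norm sampled sines and cosines), the test signal, and every function name are illustrative assumptions and do not come from the thesis; a discrete sum stands in for the integral in equation 2.2.

```python
import math

def inner(u, v):
    """Discrete inner product standing in for the integral in equation 2.2."""
    return sum(a * b for a, b in zip(u, v))

def unit(vec):
    """Scale a sampled function to unit norm."""
    norm = math.sqrt(inner(vec, vec))
    return [x / norm for x in vec]

N = 64
t = [k / N for k in range(N)]

# Hypothetical orthonormal basis: sampled sines and cosines over one full period.
basis = [unit([math.sin(2 * math.pi * m * tk) for tk in t]) for m in (1, 2, 3)]
basis += [unit([math.cos(2 * math.pi * m * tk) for tk in t]) for m in (1, 2, 3)]

# Test signal that lies in the span of the basis.
f = [3.0 * math.sin(2 * math.pi * tk) + 0.5 * math.cos(6 * math.pi * tk) for tk in t]

coeffs = [inner(f, phi) for phi in basis]      # amplitude coefficients, equation 2.2
energies = [c * c for c in coeffs]             # energy per component, equation 2.3
recon = [sum(c * phi[k] for c, phi in zip(coeffs, basis))
         for k in range(N)]                    # reconstruction, equation 2.1
```

Because this test signal lies entirely within the span of the chosen basis, the reconstruction matches the signal and the component energies add up to the total signal energy. For a signal outside that span, the same sum would only approximate it.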
Note that the series in equation 2.1 can be thought of as a mathematical approximation to the original signal. An enormous number of solutions to problems utilize this technique of describing a complicated signal in terms of simpler ones. For example, consider the Schrödinger equation in equation 2.4.

$$i\hbar \frac{\partial}{\partial t}\Psi(\mathbf{r},t) = \left[-\frac{\hbar^2}{2m}\nabla^2 + V\right]\Psi(\mathbf{r},t) \tag{2.4}$$

The general solution to equation 2.4 is shown in equation 2.5

$$\Psi(\mathbf{r},t) = \sum_n c_n\, \psi_n(\mathbf{r})\, e^{-iE_n t/\hbar} \tag{2.5}$$

where the time-dependence in equation 2.5 can be included only when the potential, $V$, is independent of time [23]. $\psi_n(\mathbf{r})$ are the solutions to the time-independent Schrödinger equation as shown in equation 2.6.

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V\right]\psi_n(\mathbf{r}) = E_n\, \psi_n(\mathbf{r}) \tag{2.6}$$

Depending on the potential $V$ of the system, different coefficients, $c_n$, and functions, $\psi_n(\mathbf{r})$, are used to describe the solution. For the hydrogen atom, the functions $\psi_n(\mathbf{r})$ are written as the product of radial functions, described by Laguerre polynomials, and the angular functions, which are known as spherical harmonics. Other potentials give solutions for the wave functions which require spherical or cylindrical Bessel functions [23]. These functions are mathematical approximations which represent the physical processes of the system.
Fourier Analysis

When analyzing periodic fluctuations in measured data, the most common form of data analysis is Fourier analysis. Formulated by Joseph Fourier in the early 1800s, Fourier analysis utilizes the postulate that any signal can be constructed as a sum of sinusoidal functions [5][6][43]. Therefore, Fourier analysis relies on the assumption that any signal can be written as in equation 2.1 where the basis functions, $\{\phi_n(t)\}$, are sine and cosine functions. For signals which contain multiple frequency components, Fourier analysis describes these signals as the sum of sine waves, with infinite extent, with different frequencies, as shown in equation 2.7.

$$f(t) = \sum_n a_n e^{i\omega_n t} \tag{2.7}$$

Because the frequency of each sinusoidal function must be time-independent, Fourier analysis is able to represent stationary data only. That is, the frequency content of the signal being analyzed is assumed not to change with time. Also, because the sine waves used to describe a signal are infinite in extent, Fourier analysis is considered a global analysis tool. The amplitudes of the basis functions can be calculated as shown in equation 2.8.

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt \tag{2.8}$$

Equation 2.8 shows that the amplitude coefficients describe the contribution of the signal at different frequency components, $\omega$. This equation is called the Fourier transform, and is useful because it provides a frequency-domain representation of a time-domain function [5][6][43]. The energy density is the square of the amplitude functions. The relative contribution to the energy density from each frequency component is shown in equation 2.9.

$$E(\omega) = |F(\omega)|^2 \tag{2.9}$$
The total energy density is then the sum of the contributions from all frequencies. It is a common practice to plot Fourier spectra which plot energy density vs. frequency. This allows the largest contributing components to be located at particular frequencies.
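To make the idea of a Fourier energy spectrum concrete, the sketch below evaluates a discrete analogue of the Fourier transform for a two-tone test signal and locates the frequency with the largest energy. The sampling rate, the signal, and the function name `dft` are assumptions chosen for illustration; a direct O(N^2) sum is used for clarity, whereas an FFT routine would normally be preferred.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))
            for k in range(n)]

fs = 64.0   # assumed sampling rate in Hz
n = 128     # two seconds of data, so the frequency resolution is fs / n = 0.5 Hz
signal = [2.0 * math.sin(2 * math.pi * 5.0 * m / fs)
          + 1.0 * math.sin(2 * math.pi * 12.0 * m / fs) for m in range(n)]

spectrum = dft(signal)
energy = [abs(c) ** 2 for c in spectrum[: n // 2]]   # squared amplitude per frequency
freqs = [k * fs / n for k in range(n // 2)]          # frequency of each retained bin

# Frequency carrying the largest energy contribution.
peak = freqs[max(range(len(energy)), key=energy.__getitem__)]
```

Here `peak` falls on the 5 Hz component, which has the larger amplitude. The spectrum carries no information about when each component is present, which is the global character of Fourier analysis described above.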
Short-Time Fourier Transform

As mentioned before, nonstationary signals such as sporadic impulses or aperiodic signals cannot be described locally using Fourier analysis. In order to accommodate nonstationary signals, the short-time Fourier transform (STFT) was developed. It is shown in equation 2.10.

$$\mathrm{STFT}(\tau,\omega) = \int f(t)\, w(t-\tau)\, e^{-i\omega t}\, dt \tag{2.10}$$

The idea of STFT is to break a nonstationary signal into sections, within each of which the signal is stationary. Then, the regular Fourier transform can be calculated in each section and the energy density can be determined in each section. Therefore, the original signal can be nonstationary, as long as it is stationary within each window. The window function, $w(t)$, is chosen by the user to be a particular size. The STFT spectrogram, which plots the energy density contributions from each frequency, is time-dependent. However, the frequencies must be constant within each window [5][43].

Note that the choice of window size is important and determines what frequencies will be resolved from the data. For instance, a short window will be able to show the time-dependence of frequencies very locally in time; however, it will only capture the high-frequency components and will not resolve the lower-frequency, longer-period oscillations. If a longer window is used, the lower-frequency components can be resolved; however, the likelihood that the signal is nonstationary within the window increases [5][43].
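The windowing idea can be sketched as follows: a test signal whose frequency jumps from 4 Hz to 16 Hz halfway through the record is cut into overlapping frames, each frame is tapered with a window, and a Fourier energy spectrum is computed per frame. The Hann window, the frame length, the hop size, and the name `stft_energy` are illustrative assumptions rather than choices made in the thesis.

```python
import cmath
import math

def stft_energy(samples, win_len, hop):
    """Energy |STFT|^2 for each frame and frequency bin, using a Hann window."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * m / win_len) for m in range(win_len)]
    frames = []
    for start in range(0, len(samples) - win_len + 1, hop):
        segment = [samples[start + m] * window[m] for m in range(win_len)]
        frames.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * m / win_len)
                    for m, x in enumerate(segment))) ** 2
            for k in range(win_len // 2 + 1)
        ])
    return frames

fs = 64.0  # assumed sampling rate in Hz
# Nonstationary test signal: 4 Hz for the first two seconds, 16 Hz afterwards.
samples = [math.sin(2 * math.pi * (4.0 if m < 128 else 16.0) * m / fs)
           for m in range(256)]

win_len = 64  # one-second window, so bins are spaced fs / win_len = 1 Hz apart
frames = stft_energy(samples, win_len, hop=32)

# Dominant frequency (Hz) in each frame.
dominant = [max(range(len(e)), key=e.__getitem__) * fs / win_len for e in frames]
```

The first frame reports 4 Hz and the last reports 16 Hz, while the frames that straddle the jump contain energy at both frequencies. Shortening `win_len` would localize the jump more sharply in time but widen the bin spacing `fs / win_len`, which is the trade-off discussed above.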
Wavelet Analysis

Wavelet analysis is another type of frequency analysis tool. The wavelet transform of the signal $f(t)$ is shown in equation 2.11, where $\psi$ is the wavelet, $a$ is the scale factor, and $b$ is the time shift.

$$W(a,b) = \frac{1}{\sqrt{|a|}} \int f(t)\, \psi^*\!\left(\frac{t-b}{a}\right) dt \tag{2.11}$$

The transform basically represents the similarity between a signal and the predetermined wavelet at scale $a$ and time $b$. Wavelets of different size (frequency) scales are used to generate many wavelet transform coefficients. Energies can be calculated just as in equation 2.9 to produce time-frequency-energy spectrograms without the need for a fixed window [8][9][43]. This was an improvement over STFT because, as the result of the flexible basis functions, both the high-frequency and the low-frequency structures could be analyzed. Large-scale wavelets are used to extract low-frequency, large-scale features, while high-frequency oscillations are extracted with smaller-scale wavelets [8][9][43].

Wavelet analysis works well to seek out particular structures at different size (frequency) scales within data. For example, Morlet wavelets can isolate and analyze rainfall-runoff events. However, a drawback of wavelet analysis is that the wavelet basis functions, and therefore the structures being sifted out from the original signal, are chosen a priori. The chosen wavelets may or may not reflect the processes in the analyzed signal. If an inappropriate set of wavelets is used to correlate with a signal, the calculated wavelet coefficients and variance of the signal may give misleading and nonphysical results [8][9][43].
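A rough numerical illustration of the wavelet transform at a single scale and shift is given below. It assumes a Morlet wavelet with centre frequency 6 (a common textbook choice, not one stated in the thesis), omits the small zero-mean correction term of the Morlet wavelet, and replaces the integral with a Riemann sum. The names `morlet`, `cwt_point`, and `OMEGA0` are hypothetical.

```python
import cmath
import math

OMEGA0 = 6.0  # assumed Morlet centre frequency (dimensionless)

def morlet(u):
    """Morlet wavelet without the small zero-mean correction term."""
    return math.pi ** -0.25 * cmath.exp(1j * OMEGA0 * u) * math.exp(-0.5 * u * u)

def cwt_point(samples, dt, a, b):
    """Riemann-sum estimate of the wavelet coefficient at scale a and shift b."""
    total = 0j
    for k, x in enumerate(samples):
        total += x * morlet((k * dt - b) / a).conjugate()
    return total * dt / math.sqrt(a)

fs = 64.0
dt = 1.0 / fs
samples = [math.sin(2 * math.pi * 4.0 * k * dt) for k in range(256)]  # 4 Hz tone

# Scales chosen so that each wavelet oscillates near a candidate frequency (Hz).
candidates = [2.0, 4.0, 8.0]
response = {f: abs(cwt_point(samples, dt, OMEGA0 / (2 * math.pi * f), b=2.0))
            for f in candidates}
best = max(response, key=response.get)
```

The coefficient magnitude is largest for the scale matched to 4 Hz, reflecting the similarity between the wavelet at that scale and the signal. A wavelet family poorly matched to the structures in the data would still return coefficients, which is the a priori limitation noted above.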
Generalized Time-Frequency Distributions

There are many spectral analysis tools which can be described by an overall generalized time-frequency distribution. Equation 2.12 shows this distribution, where $x(t)$ is the signal and $\phi(\theta,\tau)$ is the kernel which determines the properties of the distribution [9][33].

$$C(t,\omega) = \frac{1}{4\pi^2} \iiint x^*\!\left(u - \frac{\tau}{2}\right) x\!\left(u + \frac{\tau}{2}\right) \phi(\theta,\tau)\, e^{-i\theta t - i\tau\omega + i\theta u}\, du\, d\tau\, d\theta \tag{2.12}$$

The distribution is called the Wigner-Ville distribution when $\phi(\theta,\tau) = 1$. Overall, the Wigner-Ville distribution gives better time and frequency resolution than STFT and does not have to sacrifice one resolution for the benefit of the other. Drawbacks of the Wigner-Ville distribution include the possibility of calculating nonphysical harmonics and even negative amplitudes. Therefore, the real frequency contributions have to be picked out amidst nonphysical harmonics. Other kernels can be used, including a bi-Gaussian, which produces a pseudo-Wigner distribution (PWD), or an exponential kernel, which produces a Choi-Williams distribution (CWD) [9][33].
CHAPTER III
HILBERT-HUANG TRANSFORM (HHT)

An alternative data analysis tool has been proposed by Norden E. Huang called the Hilbert-Huang Transform (HHT) [26]. The HHT technique for analyzing data consists of two components: a decomposition algorithm called empirical mode decomposition (EMD) and a spectral analysis tool called Hilbert spectral analysis. Both tools will be introduced and described hereafter. It will be shown that HHT can provide a local description of the oscillating components of a signal, whether nonstationary or nonlinear. This provides a new approach for analyzing the variability of signals and can be compared with current tools such as any of the methods mentioned previously.
Hilbert Spectral Analysis The purpose of HHT is to demonstrate an alternative method to present spectral analysis tools for providing the time-frequency-energy description of time series data. Also, the method attempts to describe nonstationary data locally. Rather than a Fourier or wavelet based transform, the Hilbert transform was used, in order to compute instantaneous frequencies and amplitudes and describe the signal more locally. Equation 3.1 displays the Hilbert transform, ( ), which can be written for any function x(t) of Lp class [6]. The PV denotes Cauchys principle value integral. [ ( )] ( ) ( ) (3.1)
[6][21] determined that an analytic function can be formed with the Hilbert transform pair as shown in equation 3.2.
11 ( ) where ( ) ( ) ( ) ( ) (3.3) ( ) ( ) ( )
( )
(3.2)
( ) and ( ) are the instantaneous amplitudes and phase functions, respectively [21]. The instantaneous frequency can then be written as the time derivative of the phase, as shown in equation 3.4. ( ) (3.4)
Note that the analytic function z(t) is the mathematical approximation to the original signal x(t). Because the amplitude and frequency functions are expressed as functions of time, the Hilbert spectrum, which displays the relative amplitude or energy (square of amplitude) contributions for a certain frequency at a specific time, can be constructed as H(w,t). Then, a marginal spectrum can be calculated as in equation 3.5, where the spectrum is summed over the time domain of 0 and T. ( ) ( ) (3.5)
The marginal spectrum represents the accumulated amplitude (or energy) at each frequency over the entire data span. It can be compared directly to the Fourier spectrum shown in equation 2.9. It was shown in [26] and [27] that not all functions give good Hilbert transforms, meaning transforms that produce physically meaningful instantaneous frequencies. For example, functions with non-zero means will give negative frequency contributions using the Hilbert transform [26][27]. Therefore, the signals which can be analyzed using the Hilbert transform must be restricted so that their calculated instantaneous frequency functions have physical meaning. Next, the empirical mode decomposition will be described. It is essentially an algorithm which decomposes nearly any signal into a finite set of functions which have good Hilbert transforms that produce physically meaningful instantaneous frequencies.
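The effect of a non-zero mean on the instantaneous frequency can be illustrated with a cosine plus a constant offset. The sketch below relies on two standard transform pairs (the Hilbert transform of a constant is zero, and that of a cosine is a sine), so no numerical transform is needed. The function name and the offset values are assumptions chosen for illustration; the negative frequencies appear here when the offset exceeds the oscillation amplitude.

```python
import math


def offset_cosine_frequency(offset, n=1000):
    """Instantaneous frequency (cycles per period) of x(t) = offset + cos(2*pi*t).

    The analytic signal is z(t) = offset + cos(2*pi*t) + i*sin(2*pi*t), so the
    phase is atan2(sin, offset + cos). One full period is sampled at n steps.
    """
    dt = 1.0 / n
    phase = [
        math.atan2(
            math.sin(2.0 * math.pi * k * dt),
            offset + math.cos(2.0 * math.pi * k * dt),
        )
        for k in range(n + 1)
    ]
    freqs = []
    for p0, p1 in zip(phase, phase[1:]):
        # Wrap the phase increment into [-pi, pi) before differencing.
        dtheta = (p1 - p0 + math.pi) % (2.0 * math.pi) - math.pi
        freqs.append(dtheta / (2.0 * math.pi * dt))
    return freqs


zero_mean = offset_cosine_frequency(0.0)     # constant at 1 cycle per period
large_offset = offset_cosine_frequency(1.5)  # dips below zero near the trough
```

The zero-mean cosine returns a constant frequency, while the offset version oscillates and becomes negative, which has no physical interpretation as an oscillation rate.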
Empirical Mode Decomposition

The EMD algorithm is the other component of the HHT method. The algorithm attempts to decompose nearly any signal into a finite set of functions whose Hilbert transforms give physical instantaneous frequency values. These functions are called intrinsic mode functions (IMFs). The algorithm utilizes an iterative sifting process which successively subtracts the local mean from a signal. The sifting process is as follows:

1. Determine the local extrema (maxima, minima) of the signal.
2. Connect the maxima with an interpolation function, creating an upper envelope about the signal.
3. Connect the minima with an interpolation function, creating a lower envelope about the signal.
4. Calculate the local mean as the average of the upper and lower envelopes.
5. Subtract the local mean from the signal.
6. Iterate on the residual.
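The sifting steps listed above can be sketched in a few lines. This is a simplified illustration, not the thesis implementation: it assumes piecewise-linear envelopes (cubic splines are the more common choice), holds the first and last extremum values constant out to the ends, takes the local mean as the average of the two envelopes, and stops after a fixed number of passes rather than testing an IMF criterion. All function names are hypothetical.

```python
import math


def local_extrema(x):
    """Step 1: indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def envelope(x, knots):
    """Steps 2-3: piecewise-linear envelope through x at the knot indices.

    The first and last extremum values are held constant out to the ends,
    which is a simplistic boundary treatment.
    """
    last = len(x) - 1
    points = [(0, x[knots[0]])] + [(i, x[i]) for i in knots] + [(last, x[knots[-1]])]
    env = [0.0] * len(x)
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        for i in range(i0, i1 + 1):
            env[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return env


def sift_once(x):
    """Steps 4-5: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return list(x)  # too few extrema to build envelopes
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [xi - 0.5 * (u + l) for xi, u, l in zip(x, upper, lower)]


def sift(x, n_iterations=10):
    """Step 6: repeat the sifting pass a fixed number of times."""
    h = list(x)
    for _ in range(n_iterations):
        h = sift_once(h)
    return h


# Example: a sine wave (period 40 samples) riding on a constant offset of 3.
wave = [math.sin(2.0 * math.pi * k / 40.0) for k in range(400)]
candidate = sift([3.0 + w for w in wave])
```

In this example the offset is removed and the sine wave is recovered; real data would require a proper stopping criterion and more careful end-point handling.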
The sifting process is repeated until the signal meets the definition of an IMF, which will be explained shortly. Then, the IMF is subtracted from the original signal, and the sifting process is repeated on the remainder. This is repeated until the final residue is a monotonic function. The last extracted IMF is the lowest frequency component of the signal, better known as the trend.

As stated above, the sifting process stops when the signal meets the criteria of an IMF, so it is important to understand how an IMF is defined. The definition was formed to ensure that IMFs give physical frequency values when the Hilbert transform is applied. An IMF is a signal which has zero mean and whose numbers of extrema and zero-crossings differ by at most one [26][27]. IMFs are considered monocomponent functions which do not contain riding waves [26][27]. Once a signal has been fully decomposed, the signal $D(t)$ can be written as the finite sum of the IMFs and a final residue, as shown in equation 3.6,

$$D(t) = \sum_{j=1}^{n} \mathrm{IMF}_j(t) + r_n(t) \qquad (3.6)$$

where $\mathrm{IMF}_j(t)$ is the j-th IMF and $r_n(t)$ is the final residue.
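The counting part of the IMF definition is simple to check numerically. The sketch below tests only whether the numbers of extrema and zero-crossings differ by at most one; it does not test the zero-mean condition, and the function names and test signals are hypothetical.

```python
import math


def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def count_extrema(x):
    """Number of interior points where the slope changes sign."""
    return sum(1 for a, b, c in zip(x, x[1:], x[2:]) if (b - a) * (c - b) < 0)


def satisfies_imf_counts(x):
    """True when the extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# A zero-mean sine passes; the same sine shifted above zero does not.
sine = [math.sin(2.0 * math.pi * (k + 0.3) / 40.0) for k in range(200)]
shifted = [2.0 + s for s in sine]
sine_ok = satisfies_imf_counts(sine)
shifted_ok = satisfies_imf_counts(shifted)
```

The shifted signal keeps all of its extrema but never crosses zero, so it fails the test, which is the same situation that produces nonphysical instantaneous frequencies.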
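Using equations 3.2 and 3.3, the analytic function can be formed as shown in equation 3.7,

$$z(t) = \sum_{j=1}^{n} A_j(t)\,\exp\!\left[\,i\int \omega_j(t)\,dt\right] \qquad (3.7)$$

where $A_j(t)$ and $\omega_j(t)$ are the instantaneous amplitude and frequency of the j-th IMF. Also, for reference, equation 3.8 shows the Fourier decomposition of a signal, $x(t)$, in which the amplitudes $a_j$ and frequencies $\omega_j$ are constants.

$$x(t) = \mathrm{Re}\sum_{j=1}^{\infty} a_j\,\exp\!\left[\,i\,\omega_j t\right] \qquad (3.8)$$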
Notice that the EMD decomposition can be considered a generalized Fourier decomposition, because it describes a signal in terms of basis functions whose amplitudes and frequencies may fluctuate with time [26][27]. The HHT will now be applied to a number of different data sets to assess its applicability.
CHAPTER IV
ANALYSIS OF SUNSPOT VARIABILITY USING THE HILBERT-HUANG TRANSFORM

Introduction

Sunspot number variation has been well studied and represents a crucial component in the analysis of solar activity [53]. Understanding the intrinsic cycles of sunspot number fluctuations helps to better characterize the solar processes responsible for them. It also aids in the prediction of future solar activity. Because sunspot number data are nonstationary and the result of nonlinear processes, it is necessary to choose a data analysis tool which will accurately describe their cyclic components locally and adaptively [26][27]. Sunspot cycles are known to be of varying lengths and amplitudes. While Fourier analysis is the most common technique used to extract periodicities from periodic signals, it assumes constant amplitudes and phases and is not well suited to the problem [5][9]. It is therefore justified to explore a data analysis technique which may be more suitable for extracting the cyclic components from the sunspot number data set. The Hilbert-Huang Transform (HHT) is a relatively new tool which was developed specifically for analyzing nonstationary and nonlinear signals [26][27]. Here we present an HHT analysis of monthly sunspot numbers from 1749 to 2010 and compare the extracted cyclic components with those found using Fourier analysis as well as generalized time-frequency distributions.
Ensemble Empirical Mode Decomposition (EEMD)

EMD acts as a dyadic filter bank in the frequency domain [15]. This means that the sifting method can only separate IMFs which differ in frequency by more than a factor of 2. An improved EMD algorithm called Ensemble EMD (EEMD) has been developed by [58], which utilizes this characteristic to extract robust and statistically significant IMFs. EEMD is summarized here:

1. Add finite amplitude noise to the original signal.
2. Decompose the signal into a finite set of IMFs using the EMD sifting method described previously.
3. Repeat steps 1 and 2 with different noise data sets.
4. Average the ensemble of extracted IMFs to average out the noise and obtain mean IMFs.

A complete description of EEMD can be found in [58]. EEMD was used to analyze monthly sunspot data, and the results are shown below.
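The four EEMD steps amount to an averaging loop around whatever decomposition routine is available. The sketch below is an assumption-laden outline rather than the algorithm of [58]: `decompose` is a placeholder for an EMD routine, the routine is assumed to return the same number of components on every trial, and the noise level is treated as an absolute standard deviation. The `toy_decompose` stand-in is not a real decomposition and exists only so the example runs.

```python
import random


def eemd(signal, decompose, n_ensemble=80, noise_std=0.2, seed=0):
    """Average the components returned by `decompose` over noisy copies of `signal`.

    `decompose` is any callable that maps a list of samples to a list of
    equal-length components (for example an EMD routine). The sketch assumes
    it returns the same number of components on every trial.
    """
    rng = random.Random(seed)
    totals = None
    for _ in range(n_ensemble):                                   # step 3: repeat
        noisy = [s + rng.gauss(0.0, noise_std) for s in signal]   # step 1: add noise
        components = decompose(noisy)                             # step 2: decompose
        if totals is None:
            totals = [[0.0] * len(signal) for _ in components]
        for total, component in zip(totals, components):
            for k, value in enumerate(component):
                total[k] += value
    # Step 4: ensemble average of each component.
    return [[value / n_ensemble for value in total] for total in totals]


def toy_decompose(x):
    """Stand-in for EMD (NOT a real decomposition): fluctuation plus mean."""
    mean = sum(x) / len(x)
    return [[v - mean for v in x], [mean] * len(x)]


signal = [float(k % 10) for k in range(50)]
mean_components = eemd(signal, toy_decompose)
reconstructed = [sum(values) for values in zip(*mean_components)]
```

Because the added noise averages toward zero across the ensemble, the sum of the mean components stays close to the original signal.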
Results

Monthly sunspot data from January 1749 to April 2010 were decomposed into different frequency components using EEMD. Eighty different sets of noise with a standard deviation of 0.2 were added to the original data and decomposed using EMD. The ensemble of decomposed IMFs was then averaged to obtain mean IMFs. The data along with the mean extracted IMFs are shown in Fig. 4.1. Clearly the extracted IMFs have time-dependent amplitudes and phases and differ from pure sinusoidal functions. They are the intrinsic fluctuations extracted directly from the signal using the sifting process and are not predetermined functions.

The 11-year cycle is shown as the second extracted IMF in Fig. 4.1. Notice that the IMF captures the oscillation of the signal even though the signal is nonstationary. It is well known that each 11-year cycle does not oscillate as a perfect sinusoid. In fact, it is possible the cycle is made up of two or more cyclic components. While the EMD method is unable to separate components whose periods differ by less than a factor of 2, it is able to display the varying 11-year cycle and demonstrate the changes in frequency due to its nonlinear behavior.

This investigation focuses mainly on periodicities equal to or greater than the 11-year Schwabe cycle. There were originally four higher frequency components, which were combined into one IMF and labeled the high-frequency IMF, as seen in the top plot of Fig. 4.1. The high frequency oscillations in IMF 1 were determined to be statistically indistinguishable from noise using a statistical test suggested by [58], who decomposed a large number of noise data sets using EMD to create confidence limits for statistical significance. The 5 extracted IMFs are shown in Fig. 4.2 along with the 1%, 50%, and 99% confidence limits derived from [58]. The star in the upper left corner, with mean energy below the zero mark, is the data set itself and can be ignored. All IMFs are above the 99% confidence limit except for the highest frequency fluctuations found in the first IMF. Therefore, only the first IMF is not statistically distinguishable from noise [58].

One application of the EMD or EEMD method is to remove the high frequency fluctuations from the sunspot data by subtracting IMF 1 from the original signal, as shown in Fig. 4.3. The advantage of this technique is that the highest frequency components are removed locally through the EMD sifting process. Meaningful structures are therefore not smoothed over, as often occurs when using low-pass filtering [58]. For comparison, a Butterworth low-pass filter was used to remove the high frequency oscillations of the sunspot number data. The 3 dB cutoff frequency was set to remove periodicities shorter than approximately 5.5 years.
The correlation coefficient between the Butterworth filtered data and the original sunspot data was 0.8498, whereas the EMD-filtered data gave a value of 0.9375. Therefore, EMD provided a more accurate representation of the original signal while removing the high frequency fluctuations. It is conceivable that an alternative cutoff frequency or a different low-pass filter could yield a better fit; however, this postprocessing is subjective and prone to bias. The EMD method, by contrast, does not require prior knowledge of the system in order to locally and adaptively extract and remove the highest frequency content from the signal [26][27].

The other extracted IMFs in Fig. 4.1 represent the longer cycles of the signal. The 20-50-year cycle is shown as the third IMF in Fig. 4.1. Because EMD is a dyadic filter bank in the frequency domain, it is possible this IMF is the sum of two or more cycles whose periods differ by less than a factor of two. The 22-year (Hale) cycle dominates between approximately 1825 and 1940. However, before and after this time period there exists a slightly longer cycle of approximately 40-50 years. IMF 4 exhibits an approximately 100-year cyclic oscillation which is known as the Gleissberg cycle [53]. The Gleissberg cycle period is typically between 60 and 120 years [53]. Finally, the trend is displayed, which shows an upward trend in sunspot number for the past 250 years.

Using the Hilbert transform, Hilbert spectra, $H(\omega,t)$, were calculated for
each IMF. These can be compared and contrasted with alternative spectral analysis methods such as STFT and time-frequency distributions. Fig. 4.4 shows the Hilbert spectra for the extracted IMFs. The frequency is displayed in cycles/year. Figs. 4.5a and 4.5b show STFT spectrograms of the overall data set. Fig. 4.5a uses a window size of 100 years and Fig. 4.5b uses a window size of 26 years. Notice that the longer window gives better frequency resolution but poorer time resolution, while the shorter window gives the reverse. [28] also analyzed sunspot data with STFT. They used a pre-emphasis filter, which amplifies certain portions of the spectrogram, in order to more easily distinguish the cycles within sunspot data. Figs. 4.5a and 4.5b did not use a pre-emphasis filter; therefore, the cycles are slightly less resolved than in [28]. See Fig. 2 in [28] for their STFT spectrograms. [33] also analyzed sunspot number using the pseudo-Wigner distribution (PWD). Refer to Fig. 6 in [33] for the PWD spectrogram of sunspot data.

The Hilbert spectrum in Fig. 4.4a shows that the frequency of the 11-year solar cycle is not constant but changes with time. This is because the 11-year solar cycle differs from a constant frequency sinusoid. In Fig. 4.4a, the frequency oscillates about a mean of 0.0909 cycles/year, which corresponds to a period of approximately 11 years. Fig. 4.4a also shows that the amplitude fluctuations of the 11-year Schwabe cycle, as can be seen by the color variations from gray to black, are oscillatory. These fluctuations correspond to the oscillations of IMF 4, as shown in Fig. 4.1. This is not surprising, as the Gleissberg cycle (IMF 4) is the amplitude modulation of the Schwabe cycle [53].

The STFT spectra in Figs. 4.5a and 4.5b also exhibit a peak near 0.09 cycles/year, which corresponds to the 11-year cycle. However, there are large contributions from other frequencies. Because Fourier analysis attempts to construct the original signal from a sum of sine and cosine functions with constant amplitudes and phases, it requires an infinite number of contributions from different frequencies [5]. Also, the STFT spectrogram does not capture the oscillation in frequency of the 11-year solar cycle, which is due to the nonlinearity of the signal. [33] used PWD to better resolve the 11-year solar cycle. The distributions did not resolve the oscillation in instantaneous frequency due to nonlinearity; however, they significantly increased the resolution in both frequency and time as compared to STFT [33]. The high frequency noise represented by IMF 1 does not show any coherent energy contribution from a particular frequency, so its Hilbert spectrum is not shown.
The Hilbert spectrum for IMF 3, the 20-50-year (quasi-Hale) cycle, is shown in Fig. 4.4b. Notice that the frequency increases between approximately 1830 and 1940. The amplitude, as shown by the color, decreases from 1830 to 1940 but is larger before and after this time period. It is interesting to note that the STFT does not capture the Hale cycle and the PWD from [33] shows only a very faint Hale cycle in their Fig. 6.

The Gleissberg cycle, as mentioned previously, represents the periodic amplitude modulation of the 11-year Schwabe cycle [53]. The Hilbert spectrum of the extracted Gleissberg cycle, IMF 4, is shown in Fig. 4.4c. It exhibits a mostly constant frequency. This is in contrast to both the STFT and PWD spectra, which show a steady decrease in frequency corresponding to a steadily lengthening cycle. Also, [33] are able to display shorter period cycles in the PWD spectrograms. For this investigation, only cycles of approximately 11 years and greater are shown.

Fig. 4.6 displays the Wigner-Ville distribution for this data set. While the Wigner-Ville distribution is able to capture the 11-year cycle, there are nonphysical harmonics which dominate the spectrum. Therefore, the Hilbert, STFT, and PWD spectra are more informative for interpreting the sunspot data.

Fig. 4.7 displays the instantaneous frequency of the 11-year cycle IMF and the IMF itself. The 11-year IMF has been divided by 1000 and a constant of 0.1 has been added so that it can be viewed alongside the instantaneous frequency. Notice that the IMF cycles tend to rise quickly and decay more slowly, which causes the change in instantaneous frequency. The instantaneous frequency is higher while the sunspot number is rising and lower during the prolonged tail when the sunspot number decreases more slowly. This nonlinear behavior is similar to rainfall-runoff data, in which a short duration rain event is followed by a longer runoff period, causing the instantaneous frequency of the process to fluctuate with time. This nonlinearity is not captured by the alternative spectral analysis tools.
Discussion

The Hilbert-Huang transform (HHT) has been used to analyze monthly sunspot numbers and their variability from 1749 to 2010. HHT decomposed the data set into a number of cyclic components using the ensemble empirical mode decomposition (EEMD). The IMFs could be viewed in the time domain and compared with the original data. They were extracted locally and adaptively from the data set and did not require a priori knowledge about the system or the selection of prescribed basis functions. However, the method acts as a dyadic filter bank in the frequency domain, meaning that it cannot separate cycles which differ in period by less than a factor of 2.

The Hilbert transform was then used to calculate spectra, which were compared with the short-time Fourier transform (STFT) and the pseudo-Wigner distribution (PWD). The Hilbert method displayed energy contributions from only a few cyclic components. They were found to be representative of the Schwabe, Hale, and Gleissberg sunspot cycles. Also, the periodicity of the 11-year Schwabe cycle was shown to be time-dependent. Overall, this analysis demonstrates the utility of HHT when analyzing nonstationary data which may be due to nonlinear processes. It has also extracted the various cycles from sunspot number data, which can be compared and contrasted with previous and future sunspot research.
Future Research

The HHT has been shown to be useful for decomposing sunspot number into its intrinsic frequency components. From this, the study of the signal's variability on different time scales is possible. Also, one of the main strengths of the HHT method is the ability to compare the frequency components of two or more signals to determine relationships between them. Further research will be pursued to utilize this technique to analyze mean global temperature, CO2 measurements, and total solar irradiance proxy data. The different frequency oscillations (IMFs) for each signal can then be compared directly and checked for correlations. Research should focus on developing techniques to compare different frequency components and determine whether two IMFs may be related.
Figure 4.1 Monthly sunspot data decomposed into its intrinsic mode functions (IMFs) using EEMD
Figure 4.2 Statistical significance test for the extracted IMFs. Notice that the first extracted IMF is below the 1% confidence limit and is therefore considered statistically indistinguishable from noise
Figure 4.3 The monthly sunspot data denoised by removing the first IMF extracted using EEMD
Figure 4.4 Hilbert spectra of IMFs representing the (a) 11-year cycle, (b) 20-50-year cycle, and the (c) quasi-100-year cycle
Figure 4.5 Short-time Fourier spectrogram of the monthly sunspot data with window sizes of (a) 100 years and (b) 26 years
Figure 4.6 Wigner-Ville distribution of sunspot data
Figure 4.7 Extracted IMF representing the 11-year solar cycle plotted along with its instantaneous frequency as calculated using equation 3.4
CHAPTER V
EMD APPLIED TO SOLAR IRRADIANCE, GLOBAL TEMPERATURE, AND CO2 CONCENTRATION DATA

Introduction

In this investigation, the EMD method is used to isolate and analyze the various cycles within total solar irradiance, global mean temperature, and CO2 concentration measurements. The different cyclic components will be compared with one another in the time domain. For instance, the solar forcing from total solar irradiance will be compared with the global mean temperature fluctuations at different frequency scales in the time domain. This makes it easy to tell when two cyclic components are in phase and when they are not. The EMD method can locally and adaptively analyze the inherent cyclic components of nonlinear and nonstationary data. It is therefore beneficial to analyze the strengths and weaknesses of this new tool in the context of climate data sets.
Data Used

The primary goal of this investigation is to utilize a relatively new data analysis tool to identify, as well as compare and contrast, the intrinsic cycles of a number of possibly inter-related variables. For demonstration, the following data sets were chosen: sunspot number, total solar irradiance, global mean temperature, and CO2 concentration.

Sunspot number has been extensively recorded and studied; see, for instance, [53]. Earlier in this thesis, monthly sunspot numbers were decomposed into the 11-year (Schwabe), quasi-22-year (Hale), and quasi-100-year (Gleissberg) cycles using empirical mode decomposition, and the HHT results were compared with those from time-frequency distributions, including short-time Fourier and pseudo-Wigner distributions. This investigation builds upon these results by comparing the extracted IMFs with IMFs from other variables in the time domain. The monthly sunspot number records were obtained from the solar physics group at NASA's Marshall Space Flight Center and can be located at [39]. The monthly data from 1749 to 2009 were decomposed into IMFs, then averaged to obtain annual resolution.

Total solar irradiance (TSI) is a direct measure of the solar output. Because satellite TSI measurements have only been available since the mid-1970s, this investigation utilizes a reconstructed TSI data set. The proxy data used were annual data from 1749 to 2009, obtained from the LASP Interactive Solar Irradiance Data Center. The data can be downloaded at [34]. It is recognized that this data set was reconstructed with the use of sunspot number data. Therefore, correlations between sunspot number and the TSI data are to be expected.

Global mean temperature is one of the most controversial and important data sets that exist today. This investigation used monthly temperature data from 1880 to 2009 taken from NASA's Land-Ocean Temperature Index (LOTI). The data can be found at [38]. The data were decomposed into IMFs, then averaged to achieve annual resolution.

Yearly CO2 concentration data measured at Mauna Loa Observatory were also used for this investigation. Data from 1958 to 2010 were obtained from the NOAA Earth System Research Laboratory, c/o Dr. Pieter Tans, at [40].
Results

The EMD method was utilized to extract the intrinsic cycles of the previously mentioned data sets. Their decomposed intrinsic mode functions (IMFs) are displayed in Figs. 5.1 through 5.3. The global mean temperature data comprised monthly measurements, while the TSI data and CO2 concentration data were recorded annually. Therefore, each data set was decomposed into IMFs, which were then averaged where necessary to provide annual temporal resolution.
Cycles in Data

The CO2 concentration data set is perhaps the most straightforward data set decomposed by the EMD method. The decomposition, shown in Fig. 5.4, yielded three IMFs: high-frequency noise, an annual oscillation, and a steadily increasing trend. [60] previously decomposed CO2 concentration proxy data from 1880 to 2002 using EMD. They utilized annual data, so they were unable to extract the yearly cycle shown in IMF 2. Instead, they extracted noise IMFs and a century-long increasing trend. Fig. 5.5 is an enlarged plot of IMF 2, the annual oscillation. Notice that the signal is not a pure sinusoid, for instance, at approximately the 1984 cycle. EMD is able to represent the actual fluctuations of the signal without forcing assumptions of linearity or stationarity [26][27]. Therefore, non-sinusoidal oscillations such as those in IMF 2 can be extracted and analyzed in the time domain.

Apart from CO2 concentration, the EMD method was also applied to the sunspot number, total solar irradiance, and global mean temperature data sets, as shown in Figs. 5.1 through 5.3. Instantaneous frequencies (IF) were calculated for each IMF from the three data sets using equation 3.4. These frequencies fluctuate over the entire data duration. The mean and standard deviation of the instantaneous frequencies were calculated and are shown in Table 5.1. The periods can also be approximated by equation 5.1, where Dur is the length of the signal and ZC is the number of zero-crossings.
$$T \approx \frac{2\,\mathrm{Dur}}{ZC} \qquad (5.1)$$

Here $T$ is the estimated mean period; the factor of 2 reflects the two zero-crossings in each full cycle. This calculation of the periods does not take into account the nonstationarity, or frequency changes, of the cycles. However, it does give an idea of the general cycle time scale for each IMF. Both methods yielded nearly identical periods for the lower IMFs, as shown in Table 5.2. The higher IMFs, which represent the longer periodic oscillations of the signal, show greater differences because there were fewer cycles over which to average the fluctuations in instantaneous frequency.

The cycles of sunspot number have been studied extensively in the scientific community. For an overview, see [53]. The most prominent cycle in sunspot data is the 11-year Schwabe cycle, shown as IMF 2 in Fig. 5.1. The third IMF shows a less uniform cycle with a period between 13 and 16 years. IMF 4, which is the quasi-Hale IMF, exhibits a 22-year cycle from approximately 1840 to 1940 but has longer cycles before and after this time period. IMF 5 is approximately a 100-year cycle, which is known as the Gleissberg cycle. Finally, IMF 6 is the trend, or the lowest frequency component of the signal.

Total solar irradiance proxy data were also decomposed into 5 IMFs and a trend, as shown in Fig. 5.2. IMF 2 has an approximately 11-year oscillation. This is not surprising since the proxy data were partially reconstructed using sunspot number data. However, it is interesting to note the time-dependent amplitude in IMF 2. The IMFs extracted using EMD are not required to have constant amplitude or frequency [26][27]. The remaining IMFs are longer oscillations, which will be shown to correspond to sunspot number IMFs.

The extracted IMFs from global mean temperature data are shown in Fig. 5.3. IMF 2 is approximately a 5-year oscillation and IMF 3 represents a quasi-11-year cycle. This will be compared with the 11-year IMFs of sunspot number and TSI. IMFs 4 and 5 represent the longer cyclic oscillations with mean periods of 17-24 and 58-65 years, respectively. [53] also describes an 11-year climate oscillation, as well as large climate oscillations with periods of approximately 20 and 60 years, using Fourier spectral analysis. These were determined from peaks in the frequency-domain spectra. The EMD method has expanded this research by providing the ability to view the cyclic components in the time domain.
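The zero-crossing estimate of the mean period can be sketched as follows. The code assumes that equation 5.1 has the form $T \approx 2\,\mathrm{Dur}/ZC$, since each full cycle contributes two zero-crossings; the function name and the synthetic 11-year sinusoid are illustrative and are not thesis data.

```python
import math


def mean_period_from_zero_crossings(x, dt):
    """Mean period estimated as 2 * duration / number of zero-crossings."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    if crossings == 0:
        raise ValueError("signal has no zero-crossings")
    duration = dt * (len(x) - 1)
    return 2.0 * duration / crossings


# Synthetic 11-year sinusoid sampled annually over 261 years (not thesis data).
series = [math.sin(2.0 * math.pi * (k + 0.3) / 11.0) for k in range(261)]
period = mean_period_from_zero_crossings(series, dt=1.0)
```

The estimate lands close to 11 years but not exactly on it, because a partial cycle at the end of the record is counted in the duration and not in the crossings. The same effect grows for the long-period IMFs, where only a few cycles are available.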
IMF Comparisons

The variables have been decomposed into their IMFs, which represent oscillations at various time scales. The IMFs can now be compared and contrasted. CO2 concentration data were decomposed into an annual cycle and a trend. The annual cycles were not resolved in the other variables because annually averaged data were used. The trends, by inspection, are positively correlated.

Because the total solar irradiance proxy data were reconstructed partially using sunspot data, it is not surprising that the TSI IMFs are correlated with those of sunspot data. For instance, Fig. 5.6 plots the two variables' IMFs together for comparison. Notice that the 11-year cycle IMFs match very closely, as do the 100-year cycles. The middle IMFs correlate for most of the time period; however, there is some disconnect from 1850 to 1900. The lowest frequency components for each variable can be compared directly from Figs. 5.1 and 5.2. These are clearly well correlated. Correlation values for the IMFs are given in Table 5.3. Again, the correlation between solar irradiance and sunspot number was expected because the TSI data were reconstructed using sunspot data.

The possibility of correlation between TSI data and global mean temperature has been a widely debated topic. It is important to understand how the fluctuations in solar irradiance affect the Earth's climate and global mean temperature. It is also beneficial to quantify how much effect fluctuations in solar irradiance have compared to other forcings within the global climate system. To approach this topic, we first compare visually the different IMFs of TSI data and global mean temperature, that is, the different periodic cycles inherent within each data set. Fig. 5.7 shows the comparison of various IMFs from TSI data with those from global mean temperature data.

Fig. 5.7 shows that the data fluctuate between being correlated and not. The first regime can be seen from 1880 to approximately 1945, where the fluctuations in solar irradiance appear not to be well correlated with the global temperature fluctuations. In the third plot down in Fig. 5.7, the two are out of phase by 180 degrees. The two variables in the second plot appear to lock in phase at approximately 1940 and remain so until 2009. However, the first and third plots show phase locking between the two variables from 1970 to approximately 1995, and lose the correlation for subsequent time periods. Also, the trends of TSI and temperature can be compared by inspection of Fig. 5.2 and Fig. 5.3.

The difference between these time periods can be seen graphically as well as analyzed in terms of correlation coefficients. Tables 5.4 and 5.5 display the correlation coefficients between the IMFs of TSI and globally averaged temperature. These correspond to two different time periods, namely 1880-1945 and 1945-2009. It is quite clear that from 1880 to 1945, there is small or negative correlation between the two signals for all time scales, neglecting the trend. However, the time period 1945-2009 shows a dramatic increase in correlation coefficient values. Tables 5.4 and 5.5 also show that the trends are well correlated throughout the entire data duration. Consider two processes that oscillate with different frequencies. It is inevitable that they will reinforce one another during certain times and will be out of phase during others.
The EMD method demonstrates that while from 1880 to 1945 the variations in solar output were not correlated with global temperature, between 1945 and 2009 they were positively correlated. These correlations were found at a variety of time scales, including approximately the 11-year, the 22-year, and the 100-year cycles.

Finally, we compare the IMFs of global mean temperature with those from sunspot number data. The results are plotted in Fig. 5.8. All correlations are given in Tables 5.6 and 5.7, where the combinations of IMFs of interest are highlighted in bold. The results are very similar to those obtained when comparing global mean temperature IMFs with those of total solar irradiance. Tables 5.6 and 5.7 show the shifts in correlation numerically. Notice that the correlations are mostly negative between 1880 and 1945, as shown in Table 5.6. Between 1945 and 2009, however, the correlation values dramatically increase, as shown in Table 5.7. For the first and third plots in Fig. 5.8, there is no correlation from 1880 until approximately 1970, after which there is correlation until approximately 1995. For years after 1995, the correlation is reduced. For the second plot in Fig. 5.8, there is no correlation until approximately 1940. After 1940 there appears to be correlation up until 2009. The third plot shows sunspot number and temperature out of phase from 1880 to 1945, as well as after 2000. However, between 1950 and approximately 1980, the two signals are in phase.
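The windowed comparison described above can be reproduced with a plain Pearson correlation restricted to a range of years. The sketch below uses synthetic 11-year cycles that are deliberately out of phase before 1945 and in phase afterward; the series, the function names, and the resulting values are illustrative assumptions and do not reproduce the thesis tables.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def windowed_correlation(years, a, b, start, end):
    """Correlation of a and b restricted to start <= year <= end."""
    pairs = [(x, y) for yr, x, y in zip(years, a, b) if start <= yr <= end]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Synthetic 11-year cycles (not thesis data): out of phase before 1945, in phase after.
years = list(range(1880, 2010))
solar = [math.sin(2.0 * math.pi * yr / 11.0) for yr in years]
temperature = [(-s if yr < 1945 else s) for yr, s in zip(years, solar)]
early = windowed_correlation(years, solar, temperature, 1880, 1945)
late = windowed_correlation(years, solar, temperature, 1945, 2009)
```

Splitting the record this way exposes a sign change that a single correlation over the whole record would largely average away.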
Discussion

The Hilbert-Huang transform is a data analysis tool that is able to analyze nonstationary data, which may be the result of nonlinear processes. Therefore, it is justified to analyze various data sets to study the periodic cycles inherent in the data and to compare different variables at different time scales.
By decomposing CO2 concentration into its IMFs, the different periodic components could be analyzed in the time domain. The CO2 concentration measurements exhibited an annual cycle which, remarkably, has not changed much since 1958. For the last 50 years, the change in CO2 from annual minimum to maximum was 5.7 ± 0.56 ppm, as calculated from the cycles in IMF 2. Superimposed upon this cycle is the long-term trend, which has increased approximately 75 ppm since 1958.

One of the most interesting results of this investigation is the identification of quasi-11-year and quasi-22-year cycles within globally averaged temperature data. Also, the EMD method showed that during particular time periods the quasi-11-year temperature cycle was locked in phase with the cycles from total solar irradiance and sunspot number. It seems intuitive that the dominant cycle in solar irradiance output, the 11-year cycle, would directly affect the temperature at the Earth. In fact, a number of studies have shown changes within the troposphere which are associated with solar fluctuations [10]. TSI and temperature oscillations at longer time scales of 22 years and 65 years were also shown to be correlated during these time periods. Correlations between Arctic-wide surface air temperature records and solar irradiance on decadal and multi-decadal scales have also been suggested using wavelet analysis (Soon et al., 2009).

The magnitudes of the 11-year cycle fluctuations can be estimated empirically from the IMFs. From Fig. 5.7, during the last five 11-year cycles, the average change in TSI from solar minimum to solar maximum was 0.775 ± 0.055 W/m². For these same five 11-year cycles, the average change in global mean temperature from minimum to maximum, as calculated from Fig. 5.7, was 0.101 ± 0.012 °C.

It should be noted that decadal variations between 9 and 15 years in the temperature records could be due to a variety of occurrences in addition to solar forcing. For instance, volcanic eruptions and the Pacific Decadal Oscillation (PDO) both exhibit cycles at these periodic scales and may be partially responsible for the resulting cycles present in the global mean temperature data set.

Over the entire data duration, the 11-year Schwabe cycles remain relatively constant. That is, over longer periods, they tend to cancel themselves out with relatively symmetric fluctuations. To determine the net radiative forcing over longer time periods, the trends extracted using the EMD method can be analyzed. From Fig. 5.3, the trend of globally averaged temperature has increased approximately 0.44 °C since 1959. Looking at the trend of TSI data in Fig. 5.2, the change in TSI from 1959 to 2010 was approximately 0.3 W/m². Note that this is a maximum estimate, because not all of the energy from the Sun will be absorbed by the Earth. This can be compared with the forcing associated with an increase in CO2 concentration over the same time period, which can be calculated using equation 5.2,

$$\Delta F = 5.35\,\ln\!\left(\frac{C}{C_0}\right) \qquad (5.2)$$

where $C_0$ and $C$ are the initial and present CO2 concentrations and $\Delta F$ is in W/m².
Equation 5.2 was formulated using radiative transfer models to calculate the radiative forcing due to a change in CO2 from some initial value to its present value (Myhre et al., 1998). Solving equation 5.2 for the change in CO2 from 1958 to 2010, based upon the trend in Figure 5.4, gives a radiative forcing of 1.13 W/m². Therefore, while the short-term and long-term fluctuations of total solar irradiance do produce radiative forcing upon the Earth, the long-term net radiative forcing is much smaller than the net forcing from increasing CO2 concentrations. These estimates of forcing are not necessarily directly connected to absolute changes in temperature. Multiple feedback mechanisms may exist, which complicate the processes by which the Earth absorbs and retains energy.
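As a rough check of the forcing estimate, the arithmetic can be sketched in a few lines. This sketch assumes the simplified logarithmic expression commonly attributed to Myhre et al. (1998), ΔF = 5.35 ln(C/C0), and assumes round-number concentrations of about 315 ppm for 1958 and 390 ppm for 2010 (the 1958 value plus the roughly 75 ppm trend increase quoted above). The function name and the input values are illustrative and are not taken from the thesis.

```python
import math


def co2_radiative_forcing(c_ppm, c0_ppm, alpha=5.35):
    """Radiative forcing in W/m^2 for a CO2 change from c0_ppm to c_ppm.

    alpha = 5.35 W/m^2 is the coefficient of the simplified logarithmic
    expression (an assumption of this sketch).
    """
    return alpha * math.log(c_ppm / c0_ppm)


# Assumed round-number concentrations (ppm), not values read from Fig. 5.4.
c_1958 = 315.0
c_2010 = c_1958 + 75.0

forcing = co2_radiative_forcing(c_2010, c_1958)
print(f"Estimated CO2 forcing: {forcing:.2f} W/m^2")
```

With these assumed inputs the result lands close to the 1.13 W/m² quoted in the text, and well above the roughly 0.3 W/m² upper bound on the TSI trend.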
The Hilbert-Huang Transform has been introduced as a relatively new spectral analysis tool capable of analyzing the cyclic components of nonlinear and nonstationary data. The Empirical Mode Decomposition method was used to decompose oscillatory signals of total solar irradiance, sunspot number, global mean temperature, and CO2 concentration into their Intrinsic Mode Functions. These IMFs exhibited time-dependent amplitudes and frequencies. The IMFs were then analyzed and compared in the time domain. Also, the radiative forcing from different periodic components of CO2 concentration and total solar irradiance was estimated empirically. The net radiative forcing from increasing solar irradiance was shown to be much smaller than the forcing due to increases in CO2 during the last 50 years.
Figure 5.1 Sunspot number data set and its decomposed IMFs.
Figure 5.2 Total Solar Irradiance (TSI) measurements and their decomposed IMFs.
Figure 5.3 Global mean temperature and its decomposed IMFs.
Figure 5.4 CO2 concentration as measured from the Mauna Loa Observatory.
Figure 5.5 Subsection of IMF 2, the yearly cycle extracted from the CO2 data using EEMD.
MF 1 MF 2 MF 3 MF 4 MF 5
N N SI SI Mean(IF) Stdev(IF) Mean(IF) Stdev(IF) I 0. 0. 0. 0. 28 14 24 13 I I I 03 I 01 0. 09 08 0. 0. 0. 01 005 0. 02 09 0. 0. 0. 04 009 0. 09 05 0. 0. 0. 09 006 0. 04 02 0. 0. 0.
SS
SS
T T Mean(IF) Stdev(IF) 0. 28 20 10 06 02 0. 0. 0. 0. 09 006 0. 14 08 05 0. 0. 0. 0.
Table 5.1. Mean and standard deviation of instantaneous frequencies (1/yrs) calculated using the Hilbert Transform.
           IMF 1   IMF 2   IMF 3   IMF 4   IMF 5
SSN Hilb   3.6     11      13      37      93
SSN ZC     3.4     11      16      37      104
TSI Hilb   4.2     11      20      28      113
TSI ZC     4.1     11      19      52      104
T Hilb     3.6     5.0     10      17      58
T ZC       3.0     5.5     10      24      65
Table 5.2 Periods (in years) calculated using Hilbert analysis and zero-crossing method.
Figure 5.6 Comparison between TSI and sunspot number IMFs. The TSI data were multiplied by a factor of 100 in order to improve visibility.
          TSI    IMF1    IMF2    IMF3    IMF4    IMF5    IMF6
Sunspot   0.85   0.21    0.61    0.36    0.26    0.48    0.59
IMF1      0.50   0.54    0.54    0.10   -0.02   -0.01   -0.03
IMF2      0.82   0.26    0.95    0.31    0.01    0.03   -0.05
IMF3      0.27   0.04    0.10    0.74    0.19    0.01   -0.01
IMF4      0.28  -0.08   -0.01    0.10    0.79    0.31   -0.001
IMF5      0.33   0.02   -0.001   0.03   -0.05    0.84    0.59
IMF6      0.20  -0.02   -0.03    0.06    0.03    0.25    0.96
Table 5.3 Correlation coefficients (r) between total solar irradiance and sunspot from 1749 to 2009.
Figure 5.7 Comparison of IMFs for global mean temperature and total solar irradiance.
         T       IMF 1   IMF 2   IMF 3   IMF 4   IMF 5   IMF 6
TSI      0.28   -0.13   -0.18    0.09    0.11    0.77    0.67
IMF 1    0.02   -0.03    0.004   0.02    0.07    0.03    0.01
IMF 2   -0.03   -0.08   -0.02   -0.02    0.08    0.02   -0.04
IMF 3   -0.34   -0.09   -0.34   -0.40    0.04    0.02   -0.04
IMF 4    0.13   -0.04   -0.07    0.30   -0.06    0.20    0.27
IMF 5    0.28   -0.04   -0.04    0.10   -0.26    0.66    0.37
IMF 6    0.41   -0.10   -0.14    0.13    0.26    0.86    0.87
Table 5.4 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1880 to 1945.
TSI T IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 0.13 0.02 -0.02 0.10 0.27 -0.85 0.83
IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 0.01 0.09 0.39 0.27 -0.06 0.06 0.004 0.13 0.17 0.01 0.005 -0.04 0.02 0.06 0.43 0.08 -0.10 -0.11 -0.003 0.03 0.08 0.67 -0.05 0.04 0.02 0.01 -0.05 0.13 0.59 0.08 0.004 0.06 -0.006 -0.28 -0.81 -0.77 -0.04 -0.04 0.007 0.33 0.26 0.99
Table 5.5 Correlation coefficients (r) between total solar irradiance and global mean temperature from 1945 to 2009.
Figure 5.8 Comparison of global mean temperature and sunspot number data IMFs. The temperature was multiplied by a factor of 500 to allow for visibility.
T Sunspot # IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 0.04 -0.14 -0.22 -0.03 -0.16 0.79 0.68
IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 0.02 -0.03 -0.34 0.06 0.17 0.07 -0.09 -0.11 -0.02 -0.04 0.01 -0.13 -0.005 -0.01 -0.31 -0.15 -0.06 -0.17 0.08 -0.02 -0.37 0.20 0.08 -0.07 0.06 0.01 0.01 0.36 -0.42 -0.22 0.04 0.03 0.05 0.28 0.80 0.76 0.01 -0.04 -0.03 0.27 0.40 0.88
Table 5.6 Correlation coefficients (r) between sunspot number and global mean temperature from 1880 to 1945.
T Sunspot # IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 -0.11 -0.13 0.04 0.06 -0.02 -0.90 0.83
IMF 1 IMF 2 IMF 3 IMF 4 IMF 5 IMF 6 0.01 0.08 0.34 0.13 -0.14 -0.20 -0.09 0.05 -0.05 -0.07 -0.11 -0.09 0.02 0.07 0.42 0.08 -0.03 -0.05 -0.005 0.02 0.09 0.68 -0.20 0.05 0.03 0.02 -0.10 0.03 0.30 -0.17 0.01 0.05 -0.005 -0.31 -0.64 -0.90 -0.04 -0.04 0.007 0.33 0.26 0.99
Table 5.7 Correlation coefficients (r) between sunspot number and global mean temperature from 1945 to 2009.
CHAPTER VI
CHARACTERIZING SAMPLING ERRORS ASSOCIATED WITH THE NEAR-SURFACE ENERGY BUDGET CLOSURE PROBLEM
The Energy Balance Problem
Conservation of energy at the Earth's surface, as defined by the balance of net radiation and ground heat flux with the sum of turbulent sensible and latent heat fluxes, has consistently not been satisfied experimentally [14][16][18][20]. Many studies have found that the net radiation and ground heat flux terms are consistently about 20% greater than the turbulent fluxes [20]. This residual is often calculated as in equation 6.1,
$$R = (Q_{net} - G) - (H + E) \tag{6.1}$$
where R is the residual, Qnet is the net radiation, G is the soil heat flux, and H and E are the sensible and latent heat fluxes, respectively. The residual is the amount of energy needed to balance the budget. Any lack of closure pertains not only to heat and moisture measurements, but also to those for trace gases such as carbon dioxide; the energy budget is not closed for most of the FLUXNET sites, which measure the flux of carbon dioxide [3][13][55]. The accuracy of these measurements is pivotal for understanding the surface exchange of greenhouse gases and for quantifying the cycling of carbon, as well as heat and moisture, over specific ecosystems.
A workshop was held in Grenoble, France in 1994 to address the problems resulting in the lack of closure [19]. The workshop formed the basis for EBEX-2000 (Energy Budget Experiment in 2000), which was conducted 50 miles south of Fresno, California. However, the experiment was not able to completely solve the closure problem. More recently, in October 2009 in Thurnau, Germany, a panel discussion on the energy budget closure problem summarized the current state of knowledge, as well as areas where future research was needed [20]. The panel concluded the following:
(1) Currently, the energy budget cannot be closed with experimental measurements.
(2) While previous studies have blamed the lack of closure on the high-frequency response of meteorological instruments, this has been shown not to have a significant effect since the advent of newer and faster sampling systems [20].
(3) The primary issues resulting in the lack of closure are attributed to the Eddy Covariance technique and the resulting miscalculation of the sensible and latent heat fluxes, not to the net radiation or soil heat flux [20].
(4) One of the main contributions to the lack of closure is the energy transport by large, low-frequency contributions to the vertical component of the eddy transport, which are not fully measured using traditional eddy covariance methods [13][14][19][52]. These are generally due to heterogeneity of the land surface near the measurement system. These low-frequency transport mechanisms can be due to slowly moving convection cells or to the passage of clouds above the sensing instruments [20].
(5) For some tall tower measurements, these mechanisms can be fully measured by increasing the averaging time or by using wavelet analysis, resulting in energy balance [14][45][52]. However, when measurements are made closer to the surface, or near the surface of the roughness layer (i.e., above forest canopies), the low-frequency oscillations are not fully measured. Therefore, no significant amount of flux is measured from these low-frequency oscillations, at least for averaging intervals between 30 and 240 minutes [19][20].
This section will specifically address the low-frequency contributions to the turbulent fluxes using empirical mode decomposition and the Eddy Covariance method. We intend to demonstrate that no finite measurement duration can fully capture all the low-frequency oscillations within a realistic atmosphere. We will demonstrate that the errors within turbulent fluxes, as calculated using the Eddy Covariance method, are partially due to including undersampled low-frequency processes. Other studies have come to similar conclusions; namely, using ogive functions, they infer that low-frequency circulations must be responsible for the missing flux [19][52]. We will propose an alternative to ogive functions by determining the largest structure that can be sufficiently sampled with a particular sampling duration. The EMD method, then, provides a new way to view the frequency contributions to the total flux. The contributions also demonstrate whether the processes have been sufficiently sampled.
To present this investigation, we will first introduce a relatively new spectral analysis tool, the Hilbert-Huang Transform, which is specifically designed to handle nonstationary data. Much like wavelet analysis, it is a spectral analysis tool which extracts the frequency contributions from an oscillatory signal. However, unlike wavelet analysis, it does not require the use of predetermined basis functions. This relatively new method utilizes a decomposition algorithm, called Empirical Mode Decomposition (EMD), which will be introduced and used to decompose atmospheric wind components, temperature, and humidity variables into their frequency components. Then we will quantify the contributions to the near-surface turbulent fluxes from each of these decomposed components. From this, we will demonstrate and quantify errors due to undersampling low-frequency, nonstationary oscillations when calculating the turbulent fluxes in the near-surface energy budget.
EMD as a Dyadic Filter
The EMD algorithm is able to sift out the intrinsic periodic components from complicated oscillatory data. These components, called intrinsic mode functions (IMFs), are time-domain functions which represent the local variability of the original signal at a particular size (frequency) scale. There are limitations regarding the periodic components EMD is able to extract from oscillatory data. For instance, EMD can only separate periodic components whose periods differ by more than a factor of two [15][57]. If a signal has two or more superimposed periodic components whose periods are closer than a factor of two, the extracted IMF will be the superposition of all the components within that dyadic range. When dealing with turbulent atmospheric data, which can be thought of as a collection of eddies existing at all size scales, the EMD algorithm acts as a dyadic filter.
To demonstrate this point, multiple data sets of 5 minute, 20 Hz temperature data were decomposed using the EMD algorithm. Fig. 6.1 shows the calculated mean periods of the IMFs, plotted on a log2 scale against IMF number. The mean periods were calculated roughly by counting the number of zero-crossings of each IMF and dividing twice the total length of the IMF by that count. Fig. 6.1 demonstrates that the average period of each IMF is approximately twice that of the preceding IMF. Since all time-domain components are additive, each IMF can be interpreted as containing the sum of all oscillatory components within its particular dyadic range.
This also demonstrates that the problem of mode mixing is not influential in these data. Mode mixing occurs when a decomposed IMF contains a mixture of different, sometimes drastically different, periodic scales [26][27]. Because the IMFs displayed in Fig. 6.1 are clearly dyadic, each IMF is the sum of all frequency (periodic) scales within the dyadic range of that IMF; in other words, there is no mixture of drastically different modes. If there were mode mixing, Fig. 6.1 would not be linear, and each IMF would not have a mean period twice that of the previous one.
So, the EMD method acts as a dyadic filter when used with turbulent atmospheric data. We can use the EMD method to decompose our atmospheric oscillatory data into a set of IMFs whose different contributions to the turbulent flux can be calculated using a form of the Eddy Covariance method. First, we will briefly give an overview of the Eddy Covariance (EC) method.
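The zero-crossing estimate of mean period used for Fig. 6.1 can be sketched as follows. This is a minimal illustration rather than the code used in this work: the function name is hypothetical, and a synthetic sinusoid stands in for an IMF. Each full cycle contains two zero-crossings, so the mean period is twice the record duration divided by the number of crossings.

```python
import numpy as np


def mean_period_zero_crossings(x, dt):
    """Estimate the mean period of an oscillatory series from its zero-crossings.

    x  : 1-D array, e.g. a single IMF (hypothetical input)
    dt : sampling interval, in the time units wanted for the period
    """
    x = np.asarray(x, dtype=float)
    signs = np.sign(x)
    signs[signs == 0] = 1.0          # treat exact zeros as positive
    crossings = np.count_nonzero(np.diff(signs) != 0)
    if crossings == 0:
        return float("inf")          # no oscillation resolved in this record
    duration = (len(x) - 1) * dt
    return 2.0 * duration / crossings


# Synthetic stand-in for an IMF: a 2 s oscillation sampled at 20 Hz for 5 minutes.
dt = 0.05
t = np.arange(0.0, 300.0, dt)
imf_like = np.sin(2.0 * np.pi * t / 2.0 + 0.3)

period = mean_period_zero_crossings(imf_like, dt)
print(f"Estimated mean period: {period:.2f} s")
```

Applying such a function to each IMF in turn and plotting log2 of the result against IMF number would give a plot of the kind described for Fig. 6.1; a slope near one indicates dyadic behaviour.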
Eddy Covariance Methods
Traditional Eddy Covariance Method
The eddy covariance (EC) technique is most commonly used for calculating the heat, moisture, and CO2 fluxes near the Earth's surface [3][19]. The first two of these are the sensible and latent turbulent heat fluxes which appear in the surface energy budget, and which are consistently underestimated by approximately 20%. The EC method applies Reynolds averaging, the method of separating a signal into its mean and fluctuating components, to the near-surface mass balance, as shown in equations 6.2 through 6.4.
$$\frac{\partial k}{\partial t} + \nabla \cdot (\mathbf{u} k) = 0 \tag{6.2}$$
$$k = \overline{k} + k', \qquad \mathbf{u} = \overline{\mathbf{u}} + \mathbf{u}' \tag{6.3}$$
$$\frac{\partial (\overline{k} + k')}{\partial t} + \nabla \cdot \left[ (\overline{\mathbf{u}} + \mathbf{u}')(\overline{k} + k') \right] = 0 \tag{6.4}$$
Here k is some scalar, u is the wind velocity vector, the overbar denotes the mean, and the prime denotes the fluctuation from this mean. Then, typically, the mean of both sides of equation 6.4 is calculated so that the terms linear in the fluctuations vanish, because by definition the average of a fluctuating component is zero.
Under particular assumptions of horizontal homogeneity, the vertical near-surface flux of some scalar k is shown to be equal to the covariance of the fluctuating component of the vertical wind velocity w and the fluctuating component of the scalar k, as in equation 6.5,
$$\mathrm{cov}(w,k) = \overline{(w - \overline{w})(k - \overline{k})} = \overline{w'k'} \tag{6.5}$$
Overall, the EC method says that the vertical turbulent transport of sensible and latent heat can be calculated from their covariances with w; simply replace k in equation 6.5 with temperature, T, and specific humidity, q, for sensible and latent heat, respectively. However, these calculations are used within near-surface energy budget calculations and are thought to be underestimated by approximately 20%. The EC method relies on a number of assumptions associated with instrumental setup and data collection. For a full explanation of proper measurement techniques, see [16][18][54]. For this investigation, we assume that all systematic errors have been minimized prior to our data analysis.
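The covariance calculation at the heart of the EC method can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, the data are synthetic random series rather than measurements, and the result is a kinematic flux (the covariance itself), with no conversion to energy units.

```python
import numpy as np


def eddy_covariance(w, k):
    """Covariance of the fluctuating parts of vertical wind w and scalar k.

    Reynolds decomposition: each series is split into its mean and the
    fluctuation about that mean, and the product of fluctuations is averaged.
    """
    w = np.asarray(w, dtype=float)
    k = np.asarray(k, dtype=float)
    w_prime = w - w.mean()
    k_prime = k - k.mean()
    return float(np.mean(w_prime * k_prime))


# Synthetic example: 5 minutes of 20 Hz data in which temperature
# fluctuations are partly correlated with vertical wind fluctuations.
rng = np.random.default_rng(0)
n = 5 * 60 * 20
w = rng.normal(0.0, 0.5, n)                        # vertical wind (m/s)
T = 295.0 + 0.4 * w + rng.normal(0.0, 0.2, n)      # temperature (K)

flux = eddy_covariance(w, T)
print(f"Kinematic sensible heat flux: {flux:.3f} K m/s")
```

Replacing T with specific humidity q would give the corresponding kinematic moisture flux in the same way.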
EMD Eddy Covariance Method
In addition to calculating the total vertical transport of sensible and latent heat near the Earth's surface with the EC method, many studies calculate the relative contributions to the flux from various size (frequency) scales of eddies [31][45][52]. To do this, they separate the signals into frequency components using spectral analysis tools such as Fourier analysis or wavelet analysis. Likewise, the EMD method can be used to separate signals into their periodic components and to analyze their relative contributions to the total flux. Remember that any signal x(t) can be decomposed with the EMD method into a finite number of fluctuating IMFs and a residue, as shown in equation 6.6,
$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r(t) \tag{6.6}$$
where $r(t)$ is the residue.
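It is possible to calculate the contribution to the flux from particular IMFs. This is equivalent to calculating contributions to the flux from particular frequency (size) scales of eddies. Equation 6.7 shows the total flux,
$$\mathrm{cov}(w,k) = \sum_{i}\sum_{j} \mathrm{cov}\left(\mathrm{IMF}_i^{w}, \mathrm{IMF}_j^{k}\right) \tag{6.7}$$
which is equal to the sum of all covariances from each of the IMF pairs, where $\mathrm{IMF}_i^{w}$ is the ith IMF of w and $\mathrm{IMF}_j^{k}$ is the jth IMF of k. The total flux can therefore be written as the total sum of an IMF Covariance Matrix, as shown in equation 6.8,
$$\mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(\mathrm{IMF}_1^{w}, \mathrm{IMF}_1^{k}) & \cdots & \mathrm{cov}(\mathrm{IMF}_1^{w}, \mathrm{IMF}_n^{k}) \\ \vdots & \ddots & \vdots \\ \mathrm{cov}(\mathrm{IMF}_n^{w}, \mathrm{IMF}_1^{k}) & \cdots & \mathrm{cov}(\mathrm{IMF}_n^{w}, \mathrm{IMF}_n^{k}) \end{pmatrix} \tag{6.8}$$
where the total covariance cov(w,k) is equal to the sum of all the covariance contributions from each of the IMF pairs.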
The sum of the IMF Covariance Matrix gives results identical to calculating the total covariance via Fourier analysis or by simply calculating the covariance of w and k as in equation 6.5.
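The identity between the summed covariance matrix and the total covariance can be checked numerically with a short sketch. It assumes only that each signal is the exact sum of its components; here synthetic sinusoids and a linear trend stand in for IMFs and a residue, and the function and variable names are hypothetical. In practice the components would come from an EMD implementation.

```python
import numpy as np


def component_covariance_matrix(w_components, k_components):
    """Matrix of covariances between every component of w and every component of k.

    Each input is a 2-D array with one component per row. Entry (i, j) is the
    covariance of the i-th w component with the j-th k component.
    """
    W = np.asarray(w_components, dtype=float)
    K = np.asarray(k_components, dtype=float)
    W_prime = W - W.mean(axis=1, keepdims=True)   # remove each component's mean
    K_prime = K - K.mean(axis=1, keepdims=True)
    return W_prime @ K_prime.T / W.shape[1]


# Synthetic stand-ins for IMFs (periods of 2 s, 8 s, 40 s) and a trend-like residue.
t = np.linspace(0.0, 300.0, 6000, endpoint=False)
w_components = np.array([
    np.sin(2 * np.pi * t / 2.0),
    0.5 * np.sin(2 * np.pi * t / 8.0),
    0.2 * np.sin(2 * np.pi * t / 40.0 + 1.0),
])
k_components = np.array([
    0.3 * np.sin(2 * np.pi * t / 2.0 + 0.5),
    np.sin(2 * np.pi * t / 8.0),
    0.01 * t,
])

w = w_components.sum(axis=0)
k = k_components.sum(axis=0)

C = component_covariance_matrix(w_components, k_components)
total = np.mean((w - w.mean()) * (k - k.mean()))
print("sum of matrix entries:", C.sum())
print("direct covariance    :", total)
```

The two printed values agree to rounding error because covariance is linear in each argument, which is the property the IMF Covariance Matrix relies on.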
Orthogonality and Sampling Durations
So far, we have introduced the EMD method and shown how the sum of its IMF Covariance Matrix is equivalent to the total near-surface vertical turbulent flux of either sensible or latent heat. Next, we will discuss how the IMF Covariance Matrix can be related to sampling durations and show how errors due to undersampling of fluxes can be calculated. The total covariance of w and k can be separated into contributions from orthogonal IMF (i = j) and from nonorthogonal IMF (i ≠ j) components, as shown in equation 6.9.
$$\mathrm{cov}(w,k) = \sum_{i} \mathrm{cov}\left(\mathrm{IMF}_i^{w}, \mathrm{IMF}_i^{k}\right) + 2\sum_{i}\sum_{j>i} \mathrm{cov}\left(\mathrm{IMF}_i^{w}, \mathrm{IMF}_j^{k}\right) \tag{6.9}$$
The orthogonal terms are the diagonal terms in the IMF Covariance Matrix (first sum on the right-hand side of equation 6.9), while the nonorthogonal terms are the off-diagonal terms (second sum on the right-hand side of equation 6.9). The factor of two is related to the fact that it is a symmetric matrix.
For comparison, consider typical Fourier decompositions. Fourier analysis describes an oscillatory signal as the sum of an infinite set of basis functions, which are weighted sine and cosine functions with different constant frequencies from zero to the Nyquist frequency [5][43][51]. The sine and cosine basis functions are, by definition, orthogonal when summed (or integrated) over all space or time. Therefore, the Fourier analogue to the IMF Covariance Matrix is an infinite matrix whose off-diagonal terms are all equal to zero.
$$\mathrm{cov}(w,k) = \sum \begin{pmatrix} \mathrm{cov}(w_{f_1}, k_{f_1}) & 0 & \cdots & 0 \\ 0 & \mathrm{cov}(w_{f_2}, k_{f_2}) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathrm{cov}(w_{f_N}, k_{f_N}) \end{pmatrix} \tag{6.10}$$
The Fourier Covariance Matrix, shown in equation 6.10, says that the only contributions to the covariance come from diagonal (orthogonal) terms between identical basis functions with frequencies from zero to the Nyquist frequency. However, in order for the basis functions to be completely orthogonal, they must be infinite in extent [6][51]. Therefore, when dealing with finite-length data, as all measured data are, Fourier analysis assumes that the finite data record repeats infinitely [5][43][51]. This is why many scientists apply bell tapering or other tapering methods to the beginning and end of a data set before Fourier analysis, in order to avoid discontinuous jumps between the end and the beginning of the data.
We have shown that the EMD method allows for nonorthogonal (off-diagonal) contributions in the IMF Covariance Matrix. Since the IMF Covariance Matrix and its Fourier analogue give identical results when summed over all components, it is natural to wonder what causes the differences in the matrices themselves. To explore the different contributions from orthogonal and nonorthogonal components via the EMD method, data sets of w and T were decomposed into their IMFs for two different sampling durations: 5 minutes and 60 minutes.
First, the variance contributions of w and T were calculated for all the IMFs. Fig. 6.2 shows the variance contributions from IMF 5 of w and T for the 60 minute data set. This is essentially looking at one particular row (in this case, row 5) of the IMF Covariance Matrix between w,w and T,T. The orthogonal contribution comes from the same IMF number (i = j), and is the largest contribution in the row. None of the other IMFs contribute significantly to the variance. Therefore, the orthogonal term dominates, and this row is considered orthogonal.
The covariance was also calculated for w and T. Fig. 6.3 shows w IMF 5 and its covariance contributions with all the T IMFs. This is equivalent to looking at row 5 in the IMF Covariance Matrix between w,T. Again, the major contribution is the orthogonal one, between IMF 5 of w and IMF 5 of T. However, there is energy spreading to the adjacent IMFs, mainly IMFs 3, 4, and 6. This is common when calculating covariances between different variables: $\overline{u'w'}$, the momentum flux; $\overline{w'T'}$, the vertical sensible heat flux; and $\overline{w'q'}$, the vertical latent heat flux. These contributions from adjacent IMFs (energy spreading) are not treated as nonorthogonal; they are still considered orthogonal, pseudo-diagonal contributions. Therefore, Fig. 6.3 still represents an orthogonal row in the Covariance Matrix.
Now consider a nonorthogonal case. The 5 minute data sets were decomposed into their respective sets of IMFs. Fig. 6.4 shows w IMF 10 and its covariance contributions with all T IMFs. IMF 10 shows contributions at a number of different IMFs, including significant negative contributions. This demonstrates a case where the row (row 10) has significant nonorthogonal contributions.
Nonorthogonal contributions are due to the finite duration of the data and the undersampling of the lowest frequencies. When the short data sets of w and T were used (5 minutes), there were nonorthogonal contributions in the higher (longer period) IMFs, specifically IMF 10. When the longer data sets of w and T were used (60 minutes), there were only orthogonal contributions, allowing for energy spreading, in the lower (shorter period) IMFs, particularly IMF 5.
To investigate this further, we analyzed 10 data sets (consecutive days) of 20 Hz meteorological data from 1000 to 1230 local time, from the SMEX 2002 experiment in Iowa. These data were chosen because they represent ideal turbulent conditions over corn fields. Each data set (w, q, T) was broken into 11 sampling durations, each starting at 1000 local time, with lengths of 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, and 150 minutes. Each data set was decomposed into its IMFs using the EMD method. Then, IMF Covariance Matrices were calculated for each sampling duration.
Fig. 6.5 shows the absolute value of the nonorthogonal contributions (as a fraction of total covariance) from the sensible (H) and latent (E) heat IMF Covariance Matrices, summed for the different sampling durations. Data are shown from SMEX 2002 site 161 in the top row and site 152 in the bottom row. Notice that the nonorthogonal contributions decrease as the sampling duration is increased, and that the deviations between the different days decrease. The physical reason for these nonorthogonal contributions at the short sampling durations is that the longer-period IMFs, which represent the periodic eddies of particular frequencies within the system, are not sampled sufficiently. As the measurement duration is increased, more and more cycles are sampled, which causes the nonorthogonal terms to decrease. Absolute values were used because the nonorthogonal contributions can sometimes be negative, as will be shown shortly.
Also, ten data sets were used to show that this is not an uncharacteristic result which occurs within one particular data set; the result is not dependent upon the random fluctuations of the EMD algorithm. Notice, though, that even a 150 minute duration is unable to reduce the nonorthogonal contributions to zero.
Fig. 6.5 shows that the nonorthogonal terms in the IMF Covariance Matrices decrease for the IMFs which are becoming more sufficiently sampled. However, if the sampling duration is increased enough, more IMFs (rows and columns in the matrix) will be created, because a longer sampling duration can capture larger cycles; these cycles will not be sufficiently sampled, which results in nonzero nonorthogonal terms. In an idealized case, where the measuring duration is infinite, the Covariance Matrix will equal the analogous Fourier Covariance Matrix (as shown in equation 6.10), which is an infinite matrix in which all the nonorthogonal contributions are zero. Next we can use the idea of orthogonality to determine when a signal is sufficiently sampled, and how this affects the measurement and calculation of turbulent fluxes.
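The split of a covariance matrix into orthogonal and nonorthogonal fractions can be sketched as follows. This is a simplified illustration: the function name is hypothetical, the matrix entries are made-up numbers rather than results from the SMEX 2002 data, and only the strict diagonal is counted as orthogonal, so the pseudo-diagonal energy spreading discussed above is not treated separately.

```python
import numpy as np


def orthogonal_split(C):
    """Return (orthogonal, nonorthogonal) fractions of the total covariance.

    C is a square matrix of covariances between components of two signals.
    Diagonal entries are counted as orthogonal; everything else is counted
    as nonorthogonal (a simplifying assumption of this sketch).
    """
    C = np.asarray(C, dtype=float)
    total = C.sum()
    orthogonal = np.trace(C)
    nonorthogonal = total - orthogonal
    return orthogonal / total, nonorthogonal / total


# Hypothetical 3 x 3 covariance matrix, for illustration only.
C = np.array([
    [0.050, 0.004, -0.001],
    [0.003, 0.030, 0.006],
    [-0.002, 0.005, 0.010],
])

orth_frac, nonorth_frac = orthogonal_split(C)
print(f"orthogonal fraction:    {orth_frac:.3f}")
print(f"nonorthogonal fraction: {abs(nonorth_frac):.3f}")
```

Repeating this calculation for matrices computed from progressively longer records would give one point per sampling duration, in the manner of Fig. 6.5.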
How Long is Long Enough?
A long-standing question within the atmospheric community is how to find the appropriate duration to sufficiently sample an oscillatory process, or more pointedly, How Long is Long Enough? (Lenschow et al. 1994). The appropriate length depends not only on the sampling duration, but also on the period of the oscillatory process being sampled. The EMD has proven to be a unique frequency decomposition tool which can provide insights into this question. From the EMD perspective, a process has been sufficiently sampled when its nonorthogonal contributions have decreased to zero. At that point, the signal has been measured for enough cycles that its components are completely distinguished from the other components embedded within the signal. We ask then, how many cycles of a process need to be sampled in order for it to be sufficiently distinguished from the other processes embedded in the signal?
Fig. 6.6 shows the orthogonal (blue) and nonorthogonal (red) fractions of the total covariance as a function of sampling duration divided by the period of the IMF. The x-axis can be interpreted as the number of cycles of a periodic process captured with a particular sampling duration. For short sampling durations, or long periods, there is great scatter. However, as the sampling duration divided by the period is increased, the nonorthogonal fraction asymptotically approaches zero. Likewise, the orthogonal fraction asymptotically approaches one. The region where these contributions level off gives a quantitative estimate of the number of cycles required to sufficiently sample the process. From the 10 days of data used for this test, the number of cycles required was approximately 7. Physically, this means that it takes 7 cycles to sufficiently distinguish one periodic process from another when the two are embedded within the same signal.
This can be used to determine the longest periodic oscillations sufficiently sampled for a given sampling duration. If a 30 minute sampling duration is used, the longest periodic process that will be sufficiently sampled will have a period of approximately 4 minutes, assuming that it takes approximately 7 cycles to be sufficiently sampled. This explains why typical ogive functions do not display significant changes in flux estimates when sampling durations are increased in small increments. In order to sufficiently sample a 30 minute process, a sampling duration of at least 210 minutes is necessary.
The EMD method has created a tool which can be used to determine whether a signal is sufficiently sampled. We now show that the errors in the turbulent fluxes are partially due to the undersampling of the lowest frequency components. Fig. 6.8 shows the nonorthogonal, orthogonal, and total contributions to the covariance of wT for site 161. Each subplot is a different sampling duration in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes. Notice that the nonorthogonal contributions are typically found in the higher IMFs, which represent lower-frequency oscillations. As the sampling duration is increased, as can be seen by looking through the subplots, the high-frequency components become more sufficiently sampled; however, the low-frequency components still have large fluctuations. These fluctuations are random errors associated with undersampling the lowest frequencies.
We have used the EMD method to develop a tool which has proven useful for identifying the largest periodic structure that can be sampled sufficiently with a particular sampling duration. Therefore, the EMD method is able to determine which periodic contributions to the total covariance contain random errors due to undersampling, and which do not. However, it has typically been thought that the turbulent fluxes are always underestimated [16][18][20]. While the EMD method shows that random errors can occur due to undersampling, it does not explain why the turbulent fluxes are consistently underestimated. Instead, any undersampled IMF will contribute either positively or negatively to the flux, causing errors.
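The cycle-count rule of thumb from this section reduces to simple arithmetic, sketched below. The function names are hypothetical, and the default of 7 cycles is the approximate value estimated from the 10 days of data described above, not a universal constant.

```python
def longest_resolved_period(duration_min, cycles_required=7.0):
    """Longest period (minutes) sufficiently sampled in a record of duration_min."""
    return duration_min / cycles_required


def required_duration(period_min, cycles_required=7.0):
    """Record length (minutes) needed to sufficiently sample a given period."""
    return period_min * cycles_required


# A 30 minute record resolves processes with periods up to roughly 4 minutes.
print(f"{longest_resolved_period(30.0):.1f} minutes")

# A 30 minute process needs a record of at least 210 minutes.
print(f"{required_duration(30.0):.0f} minutes")
```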
Conclusions
This work has introduced a relatively new spectral analysis tool called the Hilbert-Huang Transform and has utilized its empirical mode decomposition algorithm to decompose meteorological data into their intrinsic periodic oscillations. By using EMD as a dyadic filter for meteorological data, we have calculated the contributions to near-surface fluxes from different frequency components and constructed the idea of an IMF Covariance Matrix for calculating near-surface turbulent fluxes. This investigation also determined an approximate estimate of the number of cycles needed to sufficiently distinguish a process from other embedded processes within a signal. By recognizing that nonorthogonal contributions are evidence of undersampled processes, the EMD method can be used to determine which of the frequency components that contribute to the calculated turbulent flux have been sampled sufficiently.
While the method determines which periodic processes are undersampled, it does not show that the errors due to undersampling are always negative. Rather, they occur randomly, adding both positive and negative contributions to the total flux. Further research should be performed to compare the nonorthogonal components with direct calculations for many different data sets, including studies with small or large energy budget residuals. Also, the nonorthogonal contributions could be compared to other meteorological values such as u* or stability parameters in order to determine whether and how they are related.
Figure 6.1 Dyadic nature of EMD when applied to turbulence
Figure 6.2 Variance contributions from IMF pairs for 60 minute data sets of vertical wind velocity and temperature
Figure 6.3 Covariance contributions from IMF pairs of vertical wind velocity and temperature
Figure 6.4. Covariance contributions from w IMF 10 and all T IMFs
Figure 6.5 Absolute value of the nonorthogonal fraction of the total covariance from the IMF Covariance Matrices as calculated for 10 days. The top two plots show SMEX 2002 data from Site 161. The bottom two plots show SMEX 2002 from Site 152.
Figure 6.6 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wT plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF).
Figure 6.7 Orthogonal (blue) and nonorthogonal (red) fractions of the total covariance of wq plotted against essentially the number of cycles sampled, as defined by the sampling duration divided by the period of the process (in this case, an IMF).
70
Figure 6.8 Orthogonal (blue) nonorthogonal (red) and total (black) contributions from each IMF for the sensible heat flux as calculated from Site 161 from SMEX 2002. Each subplot is a different sampling duration, going from left to right and top to bottom in the following order: 5, 10, 15, 20, 30, 45, 60, 80, 100, 120, 150 minutes.
CHAPTER VII
AN IMPROVED EEMD ALGORITHM
Motivation
There are a number of Empirical Mode Decomposition (EMD) algorithms available today. These include commercial software called the Hilbert-Huang Transform Data Processing System (HHT-DPS), which was developed by Norden Huang at NASA and is available through NASA's website. There are also publicly available MATLAB codes by Patrick Flandrin [22] and R code by [32], which extract IMFs from a given input data series. However, to our knowledge no investigation has used these algorithms to process discontinuous data. For example, the EEMD algorithm introduced by [58] will not run when there are gaps, or NaNs, in the input data. Instruments often fail in the field, leaving gaps in the recorded data. These gaps prevent the EMD algorithm from properly sifting through the data. Typically, scientists fill data gaps with interpolated values, such as the mean of the surrounding data points. This may be useful for small gaps. However, it assumes that the data remain constant over the gap, and so it is insufficient for larger gaps in fluctuating data. In the spirit of a local and adaptive decomposition tool, it is important to manipulate the data as little as possible and merely describe the data that do exist. This investigation suggests an improvement to the Ensemble Empirical Mode Decomposition (EEMD) algorithm which allows for gaps in the input data. The implications of applying EEMD to discontinuous data with gaps of varying size will be discussed. We will also suggest an error reduction technique which extracts IMFs that are more locally accurate.
Ensemble Empirical Mode Decomposition
This investigation utilizes the Ensemble Empirical Mode Decomposition (EEMD) algorithm which was pioneered by [58]. The algorithm utilizes the original EMD sifting method, which is described fully in [26][27]. An overview of the EEMD algorithm follows:
1. Add finite-amplitude noise to the original input signal.
2. Decompose the signal into a finite set of intrinsic mode functions (IMFs) using the original EMD sifting method.
3. Repeat steps 1 and 2 with different noise data sets of the same noise standard deviation.
4. Average the ensemble of extracted IMFs to average out the noise and obtain the mean IMFs.
A complete description of EEMD can be found in [58]. The standard deviation of the random noise added to the original data before decomposition can be specified by the user. For this investigation, we used a standard deviation of 0.2. To accommodate discontinuous data, the MATLAB version of the EEMD algorithm by Zhaohua Wu [58] was modified. The sifting process, which fits spline functions to the local maxima and minima of the signal, must be performed on each individual continuous data segment. When the algorithm encounters a data gap, the splines must be halted, and a new set of piecewise spline functions begins on the next segment of continuous data. Figure 7.1 shows the original data and its decomposed IMFs, as well as the set decomposed after artificial data gaps were created. The improvement is simple but powerful. The majority of the discontinuous IMFs seem to replicate the original IMFs away from the gap section. For instance, look at approximately Arbitrary Time 90 for IMF2. The local oscillations are still captured even with the gap present. The two decompositions are not identical, however. This investigation will now quantify the errors associated with decomposing discontinuous data and suggest how to reduce them.
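The segment handling described above can be sketched in a few lines of Python. This is an illustration only, not the modified MATLAB code: the names `continuous_segments` and `apply_per_segment` are hypothetical, and removing the segment mean is a deliberately trivial stand-in for the spline-based sifting step, chosen so that the example runs with the standard library alone.

```python
import math

def continuous_segments(data):
    """Return (start, stop) index pairs, stop exclusive, for each run of non-NaN samples."""
    segments = []
    start = None
    for i, value in enumerate(data):
        missing = math.isnan(value)
        if not missing and start is None:
            start = i                      # a new continuous segment begins
        elif missing and start is not None:
            segments.append((start, i))    # a gap ends the current segment
            start = None
    if start is not None:
        segments.append((start, len(data)))
    return segments

def apply_per_segment(data, func):
    """Apply func to each continuous segment; gap samples stay NaN."""
    out = [float("nan")] * len(data)
    for start, stop in continuous_segments(data):
        out[start:stop] = func(data[start:stop])
    return out

def remove_mean(segment):
    """Stand-in for one sifting step (ASSUMPTION: not the real spline envelope fit)."""
    mean = sum(segment) / len(segment)
    return [value - mean for value in segment]

nan = float("nan")
signal = [0.1, 0.4, 0.2, nan, nan, 0.3, 0.5, nan, 0.7]
segments = continuous_segments(signal)
processed = apply_per_segment(signal, remove_mean)
print(segments)
print(processed)
```

Because each segment is processed independently, nothing computed on one side of a gap influences the other side, which is the property the piecewise splines are meant to preserve.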
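Errors Due to Data Gaps
In order to assess the abilities of EEMD applied to discontinuous data, we present an error analysis comparing an original IMF with a discontinuous IMF. Equation 1 shows the root mean square (rms) error equation.

$$E_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(c_i^{\mathrm{orig}} - c_i^{\mathrm{disc}}\right)^2} \qquad (1)$$

Here $c_i^{\mathrm{orig}}$ and $c_i^{\mathrm{disc}}$ are the $i$-th samples of the original and discontinuous IMF, and $N$ is the number of samples compared.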
The total error for a single decomposition, then, is the sum of the rms errors from all IMFs. It is interesting to compare the rms error with the size of the gaps within the data. Figure 7.2 shows the error calculated for a number of data gap sizes. For each case, a data gap was artificially created in the exact middle of the data, with a length equal to some percentage of the original data length. The gapped data were then decomposed, and the resulting IMFs were compared with the IMFs from the original data. The errors, as calculated by equation (1), are shown in Figure 7.2 for the first 6 IMFs extracted using the new discontinuous EEMD algorithm. Notice that the errors increase with increasing gap size, as expected. Also, the highest-frequency IMFs, which have the lowest IMF numbers, differ least from the original decomposition. As the IMF number increases, which corresponds to lower-frequency IMFs, the errors increase more quickly with increasing data gap size. This shows that the algorithm works better for high-frequency IMFs.
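A minimal sketch of this error measure follows. The helper names are hypothetical, and two choices are assumptions rather than details taken from the text: samples that are NaN in either series are skipped, and the gap length is rounded to the nearest whole sample. The "IMFs" in the usage lines are toy sequences, not the output of a decomposition.

```python
import math

def make_center_gap(data, gap_fraction):
    """Return a copy of data with a NaN gap centered in the series.

    gap_fraction is the gap length as a fraction of the data length.
    """
    n = len(data)
    gap_length = int(round(n * gap_fraction))
    start = (n - gap_length) // 2
    gapped = list(data)
    for i in range(start, start + gap_length):
        gapped[i] = float("nan")
    return gapped

def rms_error(reference, estimate):
    """Root mean square difference over samples present in both series."""
    squared = [
        (r - e) ** 2
        for r, e in zip(reference, estimate)
        if not (math.isnan(r) or math.isnan(e))
    ]
    if not squared:
        raise ValueError("no overlapping samples to compare")
    return math.sqrt(sum(squared) / len(squared))

def total_error(reference_imfs, estimate_imfs):
    """Sum of the rms errors of corresponding IMFs."""
    return sum(rms_error(r, e) for r, e in zip(reference_imfs, estimate_imfs))

# Toy example: a 20% gap in a 10-point series, and an "estimate" that is
# offset from the reference by a constant 0.5 wherever data exist.
reference = [float(i) for i in range(10)]
gapped = make_center_gap(reference, 0.2)
estimate = [value + 0.5 for value in gapped]   # NaN + 0.5 stays NaN
error = rms_error(reference, estimate)
combined = total_error([reference, reference], [estimate, estimate])
print(gapped)
print(error, combined)
```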
Another question is where the errors occur. Fig. 7.4 shows the errors plotted against time for all IMFs with a gap size of 80 points, which is approximately 20% of the total data. Notice that the high-frequency IMFs have their largest errors near the endpoints of the gaps. The low-frequency IMFs also have errors, but these are not concentrated near the gap endpoints. A common problem with the original EMD sifting algorithm is the so-called end-effect error [26][44][58]. These well-studied errors have traditionally appeared at the endpoints of the data, because the first or second derivatives required for spline fitting are unavailable there. The endpoints, then, have large fluctuations which do not represent the real signal. For our algorithm dealing with discontinuous data, these end-effect errors occur not only at the beginning and end of the input data, but also at the start and end of every data gap. While the differences in the low-frequency IMFs are primarily due to the size of the data gap, the differences in the high-frequency IMFs are primarily due to end-effect errors at the gaps. Therefore, in order to reduce the errors in the high-frequency IMFs, we can use traditional end-effect mitigation tools, as described in the subsequent section.
Error Reduction Methods
A number of investigations have dealt with end-effect errors [26][44][58]. It may therefore be possible to use these end-effect mitigation tools to decrease the errors in the high-frequency IMFs due to data gaps. [44] suggest using a mirror extension technique to lower the errors due to end effects. This would keep the spline from swinging widely at the ends of the gaps.
The mirroring technique used in this investigation is now reviewed. When a data gap is encountered, it is split into two sections. The first section is filled by the mirror image of the data directly before the gap. The second section is filled by the mirror image of the data directly after the gap. The amount of mirroring needed depends on the gap size. The result is a continuous data set. The traditional EEMD algorithm is then used to decompose the data into its IMFs. Once decomposed, the data gaps are recreated by removing the mirrored data. To test the effectiveness of this technique with discontinuous data, two gaps were created in the continuous data. Three decompositions were performed. The first was a traditional EEMD algorithm decomposing the original data without gaps. The second was the discontinuous EEMD algorithm as applied to the data set without mirroring. The third was the discontinuous EEMD algorithm as applied to the data which had undergone the mirroring technique. Fig. 7.5 compares the three decompositions. This verifies, at least visually, the effectiveness of the mirroring technique in reducing end-effect errors near the endpoints of the gaps. Also, the low-frequency IMFs more closely match the original decomposition. Fig. 7.6 shows the relative error from each IMF as compared between the discontinuous EEMD algorithms with and without mirroring applied. The first IMF is the actual data and can be ignored. For all the other IMFs, the mirroring technique greatly reduces the error in the decompositions. Therefore, the processing of discontinuous data is greatly improved by using the mirroring technique.
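The gap-filling step can be sketched as follows. This is an illustration under stated assumptions, not the code used in this investigation: the function names are hypothetical, the reflection includes the sample adjacent to the gap edge, an odd-length gap gives its extra sample to the first section, and the sketch raises an error when there is not enough valid data on one side of a gap (including a gap that touches the start or end of the series).

```python
import math

def mirror_fill(data):
    """Fill each NaN gap with mirror images of the data on either side.

    The first section of a gap mirrors the samples directly before it,
    the second section mirrors the samples directly after it.
    """
    filled = list(data)
    n = len(data)
    i = 0
    while i < n:
        if not math.isnan(data[i]):
            i += 1
            continue
        start = i
        while i < n and math.isnan(data[i]):
            i += 1
        stop = i                                   # gap covers start..stop-1
        first_len = (stop - start + 1) // 2        # ASSUMPTION: extra sample goes first
        second_len = (stop - start) - first_len
        for k in range(first_len):
            src = start - 1 - k                    # walk backwards from the gap edge
            if src < 0 or math.isnan(data[src]):
                raise ValueError("not enough data before the gap to mirror")
            filled[start + k] = data[src]
        for k in range(second_len):
            src = stop + k                         # walk forwards from the gap edge
            if src >= n or math.isnan(data[src]):
                raise ValueError("not enough data after the gap to mirror")
            filled[stop - 1 - k] = data[src]
    return filled

def restore_gaps(component, original):
    """Recreate the gaps by blanking samples that were missing in the original."""
    return [float("nan") if math.isnan(o) else c for c, o in zip(component, original)]

nan = float("nan")
data = [1.0, 2.0, 3.0, 4.0, nan, nan, nan, nan, 5.0, 6.0, 7.0, 8.0]
filled = mirror_fill(data)
restored = restore_gaps(filled, data)
print(filled)
print(restored)
```

In a full pipeline, `restore_gaps` would be applied to each IMF produced from the filled series, so that the mirrored samples guide the splines but never appear in the result.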
Discussion
Overall, this investigation presents a new version of the Ensemble Empirical Mode Decomposition (EEMD) algorithm which is applicable to discontinuous data. For short gap durations, the errors are small and the decomposition is locally representative. A mirroring technique was used to improve the discontinuous decomposition. This makes for a more local and adaptive decomposition of data which may contain one or more gaps. Further research should explore neural networks or prediction models for filling gaps and further reducing the errors.
Figure 7.1 Original data decomposed into its IMFs as well as the IMFs decomposed from discontinuous data
Figure 7.2 Error as defined by the summed differences between the discontinuous and continuous extracted IMFs, plotted against data gap size.
Figure 7.3 Errors plotted as a function of frequency of IMFs. The errors are primarily in the low-frequency IMFs.
Figure 7.4 Errors plotted as a function of time. For the high-frequency IMFs, the errors occur largely near the gap endpoints.
Figure 7.5 Comparison of three different decompositions of data. The original signals (black) contain no gaps. The red signals are the decomposed IMFs from the discontinuous EEMD algorithm, and the blue signals are the discontinuous EEMD algorithm used after a mirroring technique was performed.
Figure 7.6 Comparison of relative error associated with including or not including the mirror technique when using the discontinuous EEMD decomposition.
CHAPTER VIII
SUMMARY
This dissertation focused on the theory, application, and development of a relatively new spectral analysis tool called the Hilbert-Huang Transform (HHT). While it is an empirical tool, its power lies in its versatility: it may be applied to virtually any oscillating data signal, whether nonlinear or nonstationary. While some of the data sets analyzed in this thesis are well known and have been studied extensively using other data analysis tools, the contribution of this thesis is the development of the tools and techniques related to the Hilbert-Huang Transform and how it can be applied to different types of data. First, using sunspot data, the periodic components were extracted using HHT and compared to well-known processes. The results were shown to give more local descriptions of the frequency components than traditional spectral analysis tools. Next, the periodic components were shown to be useful when compared to one another in the time domain. This provided new methods for analyzing, in the time domain, the frequency components of different processes embedded within a signal. The HHT was used to calculate the similarity between the relative amplitudes and phases of two nonstationary processes. This created new techniques which can be used on other nonstationary data. The pivotal characteristic of the EMD method is that it acts as a dyadic filter bank in the frequency domain. That is, it is unable to separate fluctuations whose periodicities differ by less than a factor of 2. This limits the cycles EMD is able to sift out from fluctuating signals. The dyadic nature has been demonstrated on turbulent wind velocity, temperature, and humidity data.
HHT was also used to approach the problem of energy budget closure near the Earth's surface from a completely new viewpoint. The orthogonality of the components extracted using the EMD method was shown to be related to whether or not the internal oscillations were sufficiently sampled. This provides researchers with a tool to justify that the covariance contributions from various frequency components have been sufficiently sampled. Finally, a modification to the EMD algorithm has been presented which allows for data gaps within the input signal. As most real data contain gaps of some duration, this greatly expands the potential applications of the EMD method. Problems with the algorithm have been discussed, including end-effect errors due to under-defined interpolation functions. A mirroring technique has been shown to reduce the errors due to end effects. Therefore, this new algorithm can accurately and adaptively extract the periodic components embedded within discontinuous data. This improvement is essential to allowing HHT to work with discontinuous data, thereby making the tool much more adaptive to all types of data. Overall, this dissertation has provided an in-depth analysis of a new tool and has strengthened the tool itself. Future studies will further broaden its applicability to new problems and will attempt to strengthen the theoretical foundation on which it stands. Real data is inherently messy; it is noisy, nonstationary, and intermittent. This thesis has taken a step towards describing this messy reality more adaptively and efficiently.
REFERENCES
[1] Attoh-Okine, N. O. 2005: Perspectives on the Theory and Practices of the Hilbert-Huang Transform. In The Hilbert-Huang Transform in Engineering, edited by N.E. Huang and N.O. Attoh-Okine, 281-305. Taylor & Francis.
[2] Aubinet et al. 2000: Estimates of the annual net carbon and water exchange of forests: The EUROFLUX methodology. Adv Ecol Res. 30, 113-175.
[3] Baldocchi et al. 2001: FLUXNET: A new tool to study the temporal and spatial variability of ecosystem-scale carbon dioxide, water vapor, and energy flux densities. Bull Amer Meteorol Soc. 82, 2415-2434.
[4] Balocchi, R. 2004: Deriving the respiratory sinus arrhythmia from the heartbeat time series using empirical mode decomposition. Chaos, solitons, and fractals. 20, 171-172.
[5] Brigham, E. 1988: The Fast Fourier Transform and its Applications, Prentice Hall, Englewood Cliffs, NJ.
[6] Byron F.W., and Fuller R.W. 1992: Mathematics of Classical and Quantum Physics, Dover Publications, Inc., New York.
[7] Chen, Q., N.E. Huang, S. Riemenschneider, and Y. Xu. 2006: A B-spline approach for empirical mode decompositions. Advances in computational mathematics. 24, 1-4, 171.
[8] Chui, C.K. 1992: An Introduction to Wavelets. Academic Press, Boston, MA.
[9] Cohen, L. 1995: Time-Frequency Analysis, Prentice-Hall, Englewood Cliffs, NJ.
[10] Coughlin, K.T. & K.K. Tung. 2004: 11-year solar cycle in the stratosphere extracted by the empirical mode decomposition method. Advances in space research. 34, 2, 323.
[11] Duffy, D.G. 2004: The application of Hilbert-Huang transforms to meteorological datasets. Journal of Atmospheric and Oceanic Technology. 21, 4, 599.
[12] Echeverría, J.C., J.A. Crowe, M.S. Woolfson, and B.R. Hayes-Gill. 2001: Application of empirical mode decomposition to heart rate variability analysis. Medical biological engineering computing. 39, 4, 471.
[13] Feigenwinter C., Bernhofer C., and R. Vogt. 2004: The influence of advection on the short term CO2-budget in and above a forest canopy. Bound.-Layer Meteor. 113, 201-224.
[14] Finnigan JJ, Clement R, Malhi Y, Leuning R, and H.A. Cleugh. 2003: A re-evaluation of long-term flux measurement techniques. Part I: Averaging and coordinate rotation. Bound.-Layer Meteor. 107, 1-48.
[15] Flandrin, P., Rilling, G., and Goncalves, P. 2004: Empirical mode decomposition as a filter bank. IEEE Signal Processing Letters. 11, 2, 112.
[16] Foken T., S.P. Oncley. 1995: Results of the workshop "Instrumental and methodical problems of land surface flux measurements". Bull Am Meteorol Soc. 76, 1191-1193.
[17] Foken T., Wichura B., Klemm O., Gerchau J., Winterhalter M., and T. Weidinger. 2001: Micrometeorological measurements during the total solar eclipse of August 11, 1999. Meteorologische Zeitschrift. 10, 171-178.
[18] Foken T., Gockede M., Mauder M., Mahrt L., Amiro B.D., and J.W. Munger. 2004: Post-field data quality control. In Handbook of micrometeorology: A guide for surface flux measurement and analysis. Lee X, Massman WJ, Law B. Kluwer, Dordrecht. 181-208.
[19] Foken T., Wimmer F., Mauder M., Thomas C., and C. Liebethal. 2006: Some aspects of the energy balance closure problem. Atmos Chem Phys Discuss. 6, 3381.
[20] Foken T., Aubinet M., Finnigan J.J., Leclerc M.Y., Mauder M., and U. Kyaw Tha Paw. 2011: Results Of A Panel Discussion About The Energy Balance Closure Correction For Trace Gases. Bull Amer Meteor Soc. 92, ES13-ES18.
[21] Gabor, D. 1946: Theory of Communication. Proc. IEEE Part III, 93, 26, 429-457.
[22] Gabriel Rilling. Empirical Mode Decomposition. http://perso.ens-lyon.fr/patrick.flandrin/emd.html (accessed Nov 2, 2011).
[23] Griffiths, D.J. 2005: Introduction to Quantum Mechanics, 2nd Edition. Pearson Ed. Intl. Prentice Hall, Upper Saddle River, NJ.
[24] Haar, A. 1910: Zur theorie der orthogonalen funktionensysteme. Mathematische Annalen. 69, 3, 331.
[25] Holder H.E., A.M. Bolch, and R. Avissar. 2009: Using the Empirical Mode Decomposition (EMD) method to process turbulence data collected on board aircraft. Submitted to J. Atmos. Ocean. Tech. http://hdl.handle.net/10161/1074
[26] Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C. and Liu, H. 1998: The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. 454, 1971, 903.
[27] Huang, N.E., Z. Shen, S.R. Long. 1999: A new view of nonlinear water waves: The Hilbert spectrum. Annual Review of Fluid Mechanics. 31, 1, 417.
[28] Huang Y., F.G. Schmitt, Z. Lu, Y. Liu. 2007: Empirical mode decomposition analysis of experimental homogeneous turbulence time series. Colloque GRETSI, 11-14 September, Troyes, http://documents.irevues.inist.fr/handle/2042/17539
[29] Huang, N.E. and Z. Wu. 2008: A review on Hilbert-Huang transform: Method and its applications to geophysical studies. Reviews of geophysics. 46, 2.
[30] Islam, M.K., M.S. Rahman, S. Akimasa, P. Banik. 2006: Empirical mode decomposition analysis of climate changes with special reference to rainfall data. Discrete Dynamics in Nature and Society. 2006.
[31] Katul et al. 2001: Multiscale analysis of vegetation surface fluxes: from seconds to years. Adv in Water Resources. 24, 1119-1132.
[32] Kim D. and H. Oh. 2009: EMD: A Package for Empirical Mode Decomposition and Hilbert Spectrum. The R Journal. 1, May 2009.
[33] Kolláth, Z. and K. Oláh. 2009: Multiple and changing cycles of active stars I. Methods of analysis and application to the solar cycles. Astronomy & Astrophysics. 501, 2, 695.
[34] LASP Interactive Solar Irradiance Datacenter. Historical Total Solar Irradiance. http://lasp.colorado.edu/lisird/tsi/historical_tsi.html (accessed Nov 2, 2011).
[35] Lenschow D.H., Mann J., and L. Kristensen. 1994: How long is long enough when measuring fluxes and other turbulence statistics? J of Atmos. and Oceanic Technology. 11, 661-673.
[36] Liu, Z., N. Zhang, R. Wang, and J. Zhu. 2007: Doppler wind lidar data acquisition system and data analysis by empirical mode decomposition method. Opt. Eng., 46, 26001.
[37] Malinowski, S.P., Haman, K.E., Kopec, M.K., Kumala, W., Gerber, H.E., and Krueger, S.K. 2008: Small-scale variability of temperature and LWC at Stratocumulus top. 13th AMS Conference on Cloud Physics, 2.21.
[38] NASA GISS. GLOBAL Land-Ocean Temperature Index. http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt (accessed Nov 2, 2011).
[39] NASA Marshall Space Flight Center. Solar Physics. http://solarscience.msfc.nasa.gov/greenwch/spot_num.txt (accessed Nov 2, 2011).
[40] NOAA Earth System Research Laboratory. Trends in Atmospheric Carbon Dioxide. www.esrl.noaa.gov/gmd/ccgg/trends/ (accessed Nov 2, 2011).
[41] Peel MC, G.G.S. Pegram, and T.A. McMahon. 2007: Empirical Mode Decomposition: Improvement and application. In International Congress on Modeling and Simulation, edited by Oxley, L. and D. Kulasiri. Modelling and Simulation Society of Australia and New Zealand, December 2007, 2996-3002.
[42] Pegram, G. G. S., Peel, M. C. and T.A. McMahon. 2008: Empirical mode decomposition using rational splines: an application to rainfall time series. Proc. R. Soc. A. 464, 1483-1501.
[43] Qian, S. 2002: Introduction to Time-Frequency and Wavelet Transforms, Prentice-Hall Inc. Upper Saddle River, NJ.
[44] Qingjie, Z., Huayong, Z., and S. Lincheng. 2010: A new method for mitigation of end effect in empirical mode decomposition. Informatics in Control, Automation, and Robotics (CAR), 2010 2nd International Asia Conf. March 2010.
[45] Sakai R., Fitzjarrald D., and K.E. Moore. 2001: Importance of low-frequency contributions to eddy fluxes observed over rough surfaces. J Appl Meteor. 40, 2178-2192.
[46] Sarabandi, K. and I. Koh. 2002: Effect of canopy-air interface roughness on HF-VHF wave propagation in forest. IEEE Transactions on Antennas and Propagation. 50, 2, 111.
[47] Sneddon, I. 1951: Fourier Transforms. McGraw-Hill Book Company, Inc. New York, NY.
[48] Sonett, C. P. 1983: J. Geophys. Res., vol. 88, no. A4, p. 3225-3228.
[49] Stephens, G. L. 1986: Radiative transfer in spatially heterogeneous, two-dimensional anisotropically scattering media. J. Quant. Spectrosc. Radiat. Transfer, 36, 51-67.
[50] Stephens, G.L., and C.M.R. Platt. 1987: Aircraft observations of the radiative and microphysical properties of stratocumulus and cumulus cloud fields. J. Clim. Appl. Meteorol., 26, 1243-1269.
[51] Stull, R. 1988: An Introduction to Boundary Layer Meteorology, Kluwer Academic Publishers. Boston, MA.
[52] Sun X., Zhu Z., Wen X., Yuan G., and G. Yu. 2006: The impact of averaging period on eddy fluxes observed at ChinaFLUX sites. Agricultural and Forest Meteorology. 137, 188-193.
[53] Usoskin, I.G. and K. Mursula. 2003: Long-term solar cycle evolution: Review of recent developments. Solar Phys. 218, 319-343.
[54] Vickers D., and L. Mahrt. 1997: Quality control and flux sampling problems for tower and aircraft data. J of Atmos and Oceanic Technology. 14, 512-526.
[55] Wilson et al. 2002: Energy balance closure at FLUXNET sites. Agric Forest Meteorol. 113, 223-234.
[56] Wu, S., Liu, Z., and B. Liu. 2006: Enhancement of lidar backscatters signal-to-noise ratio using empirical mode decomposition. Optics Comm. 267, 1, 137.
[57] Wu, Z., and N.E. Huang. 2004: A study of the characteristics of white noise using the empirical mode decomposition. Proc. R. Soc. Lond. A, 460, 2046, 1597-1611.
[58] Wu, Z., and N.E. Huang. 2009: Ensemble Empirical Mode Decomposition: a noise-assisted data analysis method. Adv. in Adaptive Data Analysis. 1, 1, 1-41.
[59] Zhao, Jin-ping and D. Huang. 2001: Mirror extending and circular spline function for empirical mode decomposition method. Journal of Zhejiang University, 2, 3, 247-252.
[60] Zhen-Shan, L. 2007: Multi-scale analysis of global temperature changes and trend of a drop in temperature in the next 20 years. Meteorology and Atmospheric Physics. 95, 1-2, 115.

