
The theory of the spectral analysis of time series

Peter Craigmile
Time series analysis
Stationarity, Weakly stationary processes
Example: Malindi coral
What is spectral analysis?
Example: The harmonic process
A discussion about notation and frequencies
Herglotz's and Bochner's theorems
So what does the spectrum measure?
Existence of the spectral density function
Changing the sampling interval, and aliasing
The SDF of a white noise process
Linear time-invariant filtering of stationary processes
Example SDFs for different ARMA processes
Approximation theorems involving spectral densities

Time series analysis


We start with a quick review of time series analysis.
This should be very familiar to you after yesterday's class
(swap time for space, or space and time).
A time series is a set of observations made sequentially in
time.
R. A. Fisher: "one damned thing after another."
Time series analysis is the area of statistics which deals
with the analysis of dependency between different observations in time series data.
Suppose our time series process is the stochastic process
$\{X_t : t \in T\}$.
Here, $T$ is the index set, the set of all possible time points.
It could be discrete or continuous.

Stationarity
We will still try to keep our models as simple as possible by
assuming stationarity.
Stationarity means that some characteristic of the distribution of a time series process does not depend on the
time, only the displacement in time.
If you shift the time series process, that characteristic of
the distribution will not change.
While most time series data are not stationary,
there are ways to either remove or model the non-stationary
parts (the components that depend on time),
so that we are only left with a stationary component.

Weakly stationary processes


A process $\{X_t : t \in T\}$ is (weakly) stationary if:
1. $E(X_t) = \mu_X$, a fixed constant for all $t \in T$;
2. The autocovariance function is shift invariant:
$$\mathrm{cov}(X_s, X_t) = \gamma_X(s - t), \quad \text{for all } s, t \in T.$$

Other names for weak stationarity: second-order stationary;
covariance stationary; wide-sense stationary.
The autocovariance function (ACVF) of a stationary
process $\{X_t : t \in T\}$ is defined by
$$\gamma_X(h) = \mathrm{cov}(X_t, X_{t+h}).$$
We call $h$ the time lag or just the lag.
The autocorrelation function (ACF) of a stationary process $\{X_t : t \in T\}$ is defined by
$$\rho_X(h) = \mathrm{corr}(X_t, X_{t+h}).$$

Example: Malindi coral

[Figure: the negative change in oxygen-18 concentration plotted against year (1800–2000), together with the sample ACF and partial ACF of the series plotted against lag.]

A 194-year record of the $^{18}$O oxygen isotope measured from a
4 m high coral colony growing at a depth of 6 m (low tide) in
Malindi, Kenya [Cole et al., 2000].
A decrease in the oxygen value corresponds to an increase
in the sea surface temperature (SST); roughly $-0.24$ per mil
corresponds to 1°C.

Malindi coral series, continued


A summary of the time series:

Scientific questions of interest:

What is spectral analysis?


In probability theory we usually work with
cumulative distribution functions
(or probability density/mass functions, when they exist).
But we can also work with their Fourier transform:
the characteristic function.

Similarly, for stationary processes we have worked with
the autocovariance function (ACVF).
We now consider its Fourier transform,
the spectrum,
which provides a frequency decomposition of the variance of
a process.
Good references: Priestley [1981a], Priestley [1981b], Brillinger
[1981], Brockwell and Davis [1991, Section 4], and Percival and
Walden [1993].

Example: The harmonic process


Consider the harmonic process defined by
$$X_t = \sum_{l=1}^{L} \left[ A_l \cos(2\pi f_l t) + B_l \sin(2\pi f_l t) \right].$$
Assume that $\{f_l\}$ are sorted in increasing order, and that $\{A_l\}$
and $\{B_l\}$ are sequences of (mutually) uncorrelated mean zero
RVs with $\mathrm{var}(A_l) = \mathrm{var}(B_l) = \sigma_l^2$ for each $l$.
It can be shown that $\{X_t\}$ is a stationary mean zero process,
with autocovariance function (ACVF)
$$\gamma_X(h) = \sum_{l=1}^{L} \sigma_l^2 \cos(2\pi f_l h).$$
This implies that the variance is
$$\mathrm{var}(X_t) = \gamma_X(0) = \sum_{l=1}^{L} \sigma_l^2.$$
This is the first example of a spectral decomposition.


We have decomposed the variance of the process $\{X_t\}$ in
terms of the variances of sinusoids.
More generally, this is an example of an analysis of variance.
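
To make this concrete, here is a minimal simulation sketch (in numpy; the two frequencies, the variances, and the Gaussian draws are illustrative choices, not from the slides) that checks the ACVF formula against an ensemble average:

```python
# Simulate many realizations of a harmonic process with L = 2 and compare
# the ensemble estimate of gamma_X(h) with sum_l sigma_l^2 cos(2 pi f_l h).
import numpy as np

rng = np.random.default_rng(1)
freqs = np.array([0.1, 0.25])    # f_1 < f_2 (illustrative)
sigma2 = np.array([1.0, 4.0])    # sigma_l^2 = var(A_l) = var(B_l)
reps, t, h = 100_000, 5, 3       # ensemble size, a time point, a lag

A = rng.normal(0.0, np.sqrt(sigma2), size=(reps, 2))
B = rng.normal(0.0, np.sqrt(sigma2), size=(reps, 2))

def X(time):
    """One draw of X_time from each realization of the process."""
    c = np.cos(2 * np.pi * freqs * time)
    s = np.sin(2 * np.pi * freqs * time)
    return (A * c + B * s).sum(axis=1)

empirical = np.mean(X(t) * X(t + h))   # the process is mean zero
theory = np.sum(sigma2 * np.cos(2 * np.pi * freqs * h))
print(empirical, theory)               # the two should be close
```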

Relating sinusoids to complex exponentials


The process $\{X_t\}$ can be rewritten as
$$X_t = \sum_{l=1}^{L} C_l \cos(2\pi f_l t + \phi_l).$$
Using complex numbers, let $i = \sqrt{-1}$. Using the identity
$$\cos(z) = (e^{iz} + e^{-iz})/2,$$
it can be shown that
$$X_t = \sum_{l=-L}^{L} D_l\, e^{i 2\pi f_l t},$$
where $D_0 = 0$, $D_l = (C_l/2) e^{i \phi_l}$ for $l = 1, \ldots, L$, $D_{-l} = D_l^*$
(the complex conjugate), and $f_{-l} = -f_l$.
Then the ACVF is
$$\gamma_X(h) = \sum_{l=-L}^{L} S_l\, e^{i 2\pi f_l h},$$
where $S_l = S_{-l} = \sigma_l^2 / 2$ for $l = 1, \ldots, L$ and $S_0 = 0$.
We call $S_l$ the integrated spectrum at frequency $f_l$.
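
A quick numerical check of this rewriting, with illustrative (non-random) values of $C_l$ and $\phi_l$: the complex-exponential form is real-valued and reproduces the cosine form.

```python
# Verify that sum_l D_l e^{i 2 pi f_l t} (over l = -L, ..., L) equals
# sum_l C_l cos(2 pi f_l t + phi_l), using D_l = (C_l / 2) e^{i phi_l}.
import numpy as np

freqs = np.array([0.1, 0.25])
C = np.array([1.5, 0.7])
phi = np.array([0.3, -1.2])
t = np.arange(10)

direct = np.sum(C[:, None] * np.cos(2 * np.pi * freqs[:, None] * t + phi[:, None]),
                axis=0)

D = (C / 2) * np.exp(1j * phi)   # D_l for l = 1, ..., L; D_{-l} is its conjugate
complex_form = np.sum(
    D[:, None] * np.exp(1j * 2 * np.pi * freqs[:, None] * t)
    + np.conj(D)[:, None] * np.exp(-1j * 2 * np.pi * freqs[:, None] * t), axis=0)

print(np.max(np.abs(direct - complex_form.real)))   # ~ 0
print(np.max(np.abs(complex_form.imag)))            # ~ 0 (the sum is real)
```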

A discussion about notation and frequencies


I will follow the notation of Percival and Walden [1993].
In particular I let the frequency be $f \in [-1/2, 1/2]$.
Many use $\omega = 2\pi f \in [-\pi, \pi]$.
The use of $f$ matches the frequencies I used in the
harmonic regression model.
We start by assuming that the time series $\{X_t\}$ is collected
with a sampling interval of $\Delta = 1$ (one time unit between
observations).
In practice, $\Delta$ can vary.
When $\Delta$ changes, the range of frequencies changes to
$[-F, F]$, where $F = \frac{1}{2\Delta}$ is called the Nyquist frequency.


Herglotz's theorem
(The spectrum is the Fourier transform of the ACVF)
A real-valued function $\gamma(\cdot)$ defined on the integers is non-negative definite if and only if
$$\gamma(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, dS^{(I)}(f),$$
for all $h \in \mathbb{Z}$, where $S^{(I)}(\cdot)$ is a right-continuous, non-decreasing
and bounded function on $[-1/2, 1/2]$ with $S^{(I)}(-1/2) = 0$.
$S^{(I)}(\cdot)$ is called the integrated spectrum or the spectral
distribution function.
The derivative of $S^{(I)}(\cdot)$, when it exists, is called the spectral
density function (SDF), $S(\cdot)$.


Combining with Bochner's theorem

A real-valued function $\gamma(\cdot)$ defined on the integers is the autocovariance function of a stationary process $\{X_t\}$
if and only if:
1. $\gamma(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, dS^{(I)}(f)$ for all $h \in \mathbb{Z}$, where $S^{(I)}(\cdot)$ is
a right-continuous, non-decreasing and bounded function
on $[-1/2, 1/2]$ with $S^{(I)}(-1/2) = 0$;
2. $\gamma(\cdot)$ is even and non-negative definite.
We say that $S^{(I)}(\cdot)$ is the spectrum/spectral distribution function of $\{X_t\}$ and of $\gamma(\cdot)$.
(Whenever necessary we write $S_X^{(I)}(\cdot)$ to indicate it is the spectrum of $\{X_t\}$.)


So what does the spectrum measure?


For a stationary process $\{X_t\}$, taking
$$\gamma(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, dS^{(I)}(f),$$
we let $h = 0$ to get
$$\mathrm{var}(X_t) = \gamma_X(0) = \int_{-1/2}^{1/2} dS^{(I)}(f).$$
When the derivative exists, we have
$$\gamma(0) = \int_{-1/2}^{1/2} S(f)\, df.$$

Thus we can think of the increments of the spectrum as decomposing the variance of a time series into a collection
of pieces, each of which can be associated with a different
frequency $f$.
Different stationary time series models have different decompositions by frequency.
Before we look at some examples let us discuss some further
properties of the spectrum.


Existence of the spectral density function


If $\gamma(\cdot)$ is any real-valued function defined on the integers that
is absolutely summable (i.e., $\sum_h |\gamma(h)| < \infty$), then
$$\gamma(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, S(f)\, df,
\quad \text{with} \quad
S(f) = \sum_{h \in \mathbb{Z}} e^{-i 2\pi f h}\, \gamma(h).$$
We say that $\gamma(\cdot)$ and $S(\cdot)$ are Fourier transform
pairs, and write
$$\{\gamma(\cdot)\} \longleftrightarrow S(\cdot).$$


For real-valued stationary processes


A real-valued function $S(\cdot)$ defined on $|f| \le 1/2$ is the SDF
of a real-valued stationary process $\{X_t\}$ if and only if:
1. $S(-f) = S(f)$ for all $|f| \le 1/2$
(the SDF is an even function);
2. $S(f) \ge 0$ for all $|f| \le 1/2$;
3. $\int_{-1/2}^{1/2} S(f)\, df < \infty$.
Note that this result is "if and only if":
in particular it gives a way to create a stationary process
without ever having to check that the ACVF is non-negative definite.
Once we have an SDF that satisfies these conditions, then
$$\gamma(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, S(f)\, df.$$
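
A sketch of this Fourier pair in action, using the AR(1) ACVF $\gamma(h) = \sigma^2 \phi^{|h|}/(1 - \phi^2)$ as a convenient closed-form example (the truncation length and parameter values are arbitrary choices):

```python
# Compute S(f) from gamma(h) by the (truncated) sum, compare with the known
# closed form sigma^2 / |1 - phi e^{-i 2 pi f}|^2, and check that integrating
# the SDF recovers gamma(0) = sigma^2 / (1 - phi^2).
import numpy as np

phi, sigma2 = 0.6, 1.0
f = np.linspace(-0.5, 0.5, 2001)
h = np.arange(-200, 201)   # gamma decays geometrically, so truncation is safe
gamma = sigma2 * phi ** np.abs(h) / (1 - phi ** 2)

S_sum = np.sum(gamma[:, None] * np.exp(-1j * 2 * np.pi * f * h[:, None]),
               axis=0).real
S_closed = sigma2 / np.abs(1 - phi * np.exp(-1j * 2 * np.pi * f)) ** 2

print(np.max(np.abs(S_sum - S_closed)))    # ~ 0: the pair is consistent
print(np.sum(S_closed) * (f[1] - f[0]))    # ~ gamma(0) = 1.5625 (Riemann sum)
```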


Changing the sampling interval


Let $\{X_t\}$ be a stationary process, but suppose that we observe
the process at sampling interval $\Delta$ (not necessarily of value
one).
Remember that $F = 1/(2\Delta)$ is the Nyquist frequency.
With an absolutely summable ACVF we have
$$S_X(f) = \Delta \sum_{h \in \mathbb{Z}} e^{-i 2\pi f h \Delta}\, \gamma_X(h),
\quad \text{with} \quad
\gamma_X(h) = \int_{-F}^{F} e^{i 2\pi f h \Delta}\, S_X(f)\, df.$$
As $\Delta$ decreases we get more information about the SDF:
we are less affected by aliasing of frequencies.


Aliasing: sinusoids
An example of two aliased sinusoids:
[Figure: two sinusoids sampled with Δ = 1, one at frequency f = 1/16 and one at f = 1/16 + 1, plotted over times 10 to 60; the sampled points coincide.]
The amplitudes of the waves are the same.
Thus, the spectrum is the same.


Aliasing, in general
Let $\{X_t\}$ be a stationary process, observed with sampling
interval $\Delta$. If the ACVF is absolutely summable, then
$$S_X(f) = S_X(f + k/\Delta), \quad \text{for all } k \in \mathbb{Z}.$$
This leads us to imagine that $S_X(f)$, observed in any given
interval $[-F, F]$, contains the accumulation of all spectral
densities observed at frequencies $\{f + k/\Delta : k \in \mathbb{Z}\}$.
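
A one-line numerical confirmation, mirroring the earlier sinusoid figure (same illustrative frequencies):

```python
# With Delta = 1, sinusoids at f = 1/16 and f = 1/16 + 1 agree at every
# sampled time: the two frequencies are aliases of one another.
import numpy as np

t = np.arange(64)   # integer sampling times (Delta = 1)
x1 = np.cos(2 * np.pi * (1 / 16) * t)
x2 = np.cos(2 * np.pi * (1 / 16 + 1) * t)
print(np.max(np.abs(x1 - x2)))   # ~ 0: the sampled series are identical
```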


The SDF of a white noise process


Let $\{Z_t\}$ be a WN(0, $\sigma^2$) process. Then for all $|f| < 1/2$,
$$S_Z(f) = \sum_{h=-\infty}^{\infty} e^{-i 2\pi f h}\, \gamma_Z(h)
= 1 \cdot \gamma_Z(0) = \sigma^2.$$
The SDF of white noise is a constant function.
Indeed, this is why it is called white noise.
The SDF (and hence variance) of a WN process is an equal
mix of all frequencies (colors).

[Figure: the SDF of a WN(0, 1) process plotted over frequencies 0 to 0.5: the constant function S(f) = 1.]
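
A simulation sketch of this flatness (my own illustration; $|{\rm FFT}|^2/n$ is used here as a simple empirical spectrum estimate):

```python
# Average the squared modulus of the DFT of simulated WN(0, 1) data over
# many realizations; the result is flat at sigma^2 = 1, matching S_Z(f).
import numpy as np

rng = np.random.default_rng(7)
n, reps = 256, 2000
Z = rng.normal(0.0, 1.0, size=(reps, n))

pgram = np.abs(np.fft.rfft(Z, axis=1)) ** 2 / n   # one estimate per realization
mean_pgram = pgram.mean(axis=0)                   # average over realizations

print(mean_pgram.mean(), mean_pgram.std())        # ~ 1.0, with small spread
```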

Linear time-invariant filtering of stationary processes


Let $\{a_j\}$ be a set of coefficients that are absolutely summable.
Suppose $\{X_t\}$ is a stationary mean zero process with ACVF
$\gamma_X(\cdot)$.
It can be shown that $\{Y_t\}$, defined as a linear filtering of $\{X_t\}$
by
$$Y_t = \sum_{j=-\infty}^{\infty} a_j X_{t-j},$$
is a stationary mean zero process with ACVF
$$\gamma_Y(h) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} a_j a_k\, \gamma_X(j - k + h).$$
Now suppose that the SDF of $\{X_t\}$ is $S_X(\cdot)$.
What happens to the SDF under filtering?
That is, what is $S_Y(\cdot)$?


First, some definitions


The transfer function of a filter $\{a_j\}$ is its Fourier transform:
$$A(f) = \sum_{j=-\infty}^{\infty} a_j\, e^{-i 2\pi f j}.$$
The gain function of $\{a_j\}$ is the modulus of the transfer
function, $|A(f)|$.
The square or squared gain function of $\{a_j\}$ is the
modulus squared of the transfer function:
$$\mathcal{A}(f) = |A(f)|^2 = A(f) A^*(f),$$
where $*$ denotes, as usual, the complex conjugate.


What is the SDF after linear filtering?


We have that
$$\begin{aligned}
\gamma_Y(h)
&= \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} a_j a_k\, \gamma_X(j - k + h) \\
&= \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} a_j a_k \int_{-1/2}^{1/2} e^{i 2\pi f (j - k + h)}\, S_X(f)\, df \\
&= \int_{-1/2}^{1/2} e^{i 2\pi f h}\, S_X(f) \left( \sum_{j=-\infty}^{\infty} a_j e^{i 2\pi f j} \right) \left( \sum_{k=-\infty}^{\infty} a_k e^{-i 2\pi f k} \right) df \\
&= \int_{-1/2}^{1/2} e^{i 2\pi f h}\, A^*(f) A(f)\, S_X(f)\, df \\
&= \int_{-1/2}^{1/2} e^{i 2\pi f h}\, \mathcal{A}(f)\, S_X(f)\, df.
\end{aligned}$$
Since we know that $\{Y_t\}$ is stationary with
$$\gamma_Y(h) = \int_{-1/2}^{1/2} e^{i 2\pi f h}\, S_Y(f)\, df,$$
by uniqueness of Fourier transforms,
$$S_Y(f) = \mathcal{A}(f)\, S_X(f).$$
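
Here is a numpy-only sanity check of this result (the AR(1) SDF for $S_X(\cdot)$ and the 3-point moving-average filter are illustrative choices): compute $\gamma_Y(h)$ both by the time-domain double sum and by inverting $\mathcal{A}(f) S_X(f)$.

```python
# Check S_Y(f) = |A(f)|^2 S_X(f) numerically: both routes to gamma_Y(h)
# should agree up to discretization error.
import numpy as np

phi, sigma2 = 0.6, 1.0
a = np.array([1 / 3, 1 / 3, 1 / 3])   # filter coefficients a_0, a_1, a_2
f = np.linspace(-0.5, 0.5, 4001)

S_X = sigma2 / np.abs(1 - phi * np.exp(-1j * 2 * np.pi * f)) ** 2
A = np.sum(a[:, None] * np.exp(-1j * 2 * np.pi * f * np.arange(3)[:, None]),
           axis=0)
S_Y = np.abs(A) ** 2 * S_X

def gamma_X(h):
    """Closed-form AR(1) ACVF (h may be an integer array)."""
    return sigma2 * phi ** np.abs(h) / (1 - phi ** 2)

h = 2
jj, kk = np.meshgrid(np.arange(3), np.arange(3))       # all (j, k) pairs
route1 = np.sum(a[jj] * a[kk] * gamma_X(jj - kk + h))  # time-domain double sum
route2 = np.sum(np.exp(1j * 2 * np.pi * f * h) * S_Y).real * (f[1] - f[0])
print(route1, route2)                                  # should agree
```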

Example 1: The SDF of an MA(1) process


For $\{Z_t\}$ a WN(0, $\sigma^2$) process, let
$$X_t = Z_t + \theta Z_{t-1} = \sum_{j=-\infty}^{\infty} a_j Z_{t-j},$$
where
$$a_j = \begin{cases} 1, & j = 0; \\ \theta, & j = 1; \\ 0, & \text{otherwise}. \end{cases}$$
The transfer function is
$$A(f) = \sum_{j=-\infty}^{\infty} a_j\, e^{-i 2\pi f j} = 1 + \theta e^{-i 2\pi f}.$$
The squared gain function is
$$\mathcal{A}(f) = A(f) A^*(f) = (1 + \theta e^{-i 2\pi f})(1 + \theta e^{i 2\pi f})
= 1 + \theta (e^{-i 2\pi f} + e^{i 2\pi f}) + \theta^2
= 1 + 2\theta \cos(2\pi f) + \theta^2.$$
Thus the SDF of an MA(1) process is
$$S_X(f) = \mathcal{A}(f)\, S_Z(f) = \left( 1 + 2\theta \cos(2\pi f) + \theta^2 \right) \sigma^2.$$
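
A short numerical check (with illustrative $\theta$ and $\sigma^2$): the MA(1) SDF is non-negative and integrates to $\gamma_X(0) = (1 + \theta^2)\sigma^2$, as it must.

```python
import numpy as np

theta, sigma2 = 0.8, 1.0
f = np.linspace(-0.5, 0.5, 4001)
S_X = (1 + 2 * theta * np.cos(2 * np.pi * f) + theta ** 2) * sigma2

print(S_X.min() >= 0)                 # True: a valid SDF
print(np.sum(S_X) * (f[1] - f[0]))    # ~ (1 + theta^2) sigma^2 = 1.64
```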

Example 2: The SDF of an ARMA process


Let $\{X_t\}$ be an ARMA(p, q) process:
$$X_t - \sum_{j=1}^{p} \phi_j X_{t-j} = Z_t + \sum_{k=1}^{q} \theta_k Z_{t-k},
\quad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2).$$
We say $\{X_t\}$ is causal and stationary if each $X_t$ can be
rewritten in terms of the current and past $Z_t$ terms.
Then the SDF exists and satisfies
$$S_X(f) = \left| 1 + \sum_{k=1}^{q} \theta_k e^{-i 2\pi f k} \right|^2
\left| 1 - \sum_{j=1}^{p} \phi_j e^{-i 2\pi f j} \right|^{-2} S_Z(f).$$
Thus
$$S_X(f) = \sigma^2\,
\frac{\left| 1 + \sum_{k=1}^{q} \theta_k e^{-i 2\pi f k} \right|^2}
{\left| 1 - \sum_{j=1}^{p} \phi_j e^{-i 2\pi f j} \right|^2},
\quad |f| < 1/2.$$
This is often called the rational SDF.
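
A small sketch of a rational-SDF evaluator (the function name, signature, and example parameter values are my own, not from the slides):

```python
import numpy as np

def arma_sdf(f, phi=(), theta=(), sigma2=1.0):
    """Rational SDF of a causal ARMA(p, q) process at frequencies |f| < 1/2."""
    f = np.asarray(f, dtype=float)
    z = np.exp(-1j * 2 * np.pi * f)
    ma = 1 + sum(th * z ** (k + 1) for k, th in enumerate(theta))  # MA polynomial
    ar = 1 - sum(ph * z ** (j + 1) for j, ph in enumerate(phi))    # AR polynomial
    return sigma2 * np.abs(ma) ** 2 / np.abs(ar) ** 2

# Example: an ARMA(1, 1) with phi = 0.5 and theta = 0.4 (illustrative values).
print(arma_sdf(np.linspace(0, 0.5, 6), phi=[0.5], theta=[0.4]))
```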

Example SDFs for different AR processes


(Same as ARMA(p, 0) processes)
[Figure: SDFs plotted against frequency (0 to 0.5) for four processes: an AR(1) with φ₁ = 0.2; an AR(1) with φ₁ = 0.6; an AR(2) with φ = (0.75, −0.5); and an AR(4) with φ = (2.7607, −3.8106, 2.6535, −0.9238), whose SDF has sharp peaks.]

Usually we change the y-scale to be decibels (dB):
we take a $10 \log_{10}$ transformation of the SDF.

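
As a usage sketch, the AR(4) SDF from the figure on the decibel scale, reusing the hypothetical arma_sdf helper defined above:

```python
import numpy as np

f = np.linspace(0.0, 0.5, 1001)
S = arma_sdf(f, phi=[2.7607, -3.8106, 2.6535, -0.9238])  # the AR(4) example
S_dB = 10 * np.log10(S)          # the decibel transformation of the SDF
print(S_dB.min(), S_dB.max())    # the twin peaks span many tens of dB
```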

Approximation theorems involving spectral densities


[Brockwell and Davis, 1991, Corollary 4.4.2] If $S_Y(\cdot)$ is a symmetric continuous SDF and $\epsilon > 0$, then there exists a causal
AR(p) process
$$\phi(B) X_t = Z_t, \quad \{Z_t\} \sim \mathrm{WN}(0, \sigma^2),$$
such that
$$|S_Y(f) - S_X(f)| < \epsilon, \quad \text{for all } f \in [-1/2, 1/2].$$
[Brockwell and Davis, 1991, Corollary 4.4.1] gives a similar result
for MA(q) processes.

In the literature, an AR approximation for time series is used more often than
the MA approximation.
Combining both results gives us a motivation for why ARMA models can outperform AR and MA models in terms of their ability
to approximate the SDF (equivalently, the ACVF).
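
A rough numerical illustration of the idea (my own sketch, not the construction in Brockwell and Davis [1991]): take a target SDF, recover its ACVF by numerical integration, fit AR($p$) coefficients by Yule-Walker, and watch the worst-case SDF error shrink as $p$ grows. The target here is the MA(1) SDF with $\theta = 0.8$.

```python
import numpy as np

f = np.linspace(-0.5, 0.5, 4096, endpoint=False)
target = 1 + 2 * 0.8 * np.cos(2 * np.pi * f) + 0.8 ** 2  # MA(1) SDF, theta = 0.8

def acvf(h):
    """gamma(h) = integral of e^{i 2 pi f h} S(f) df, by a Riemann sum."""
    return np.sum(np.exp(1j * 2 * np.pi * f * h) * target).real * (f[1] - f[0])

for p in (1, 2, 5, 10):
    g = np.array([acvf(h) for h in range(p + 1)])
    # Yule-Walker equations: Gamma_p phi = (gamma(1), ..., gamma(p))'.
    Gamma = np.array([[g[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(Gamma, g[1:])
    sigma2 = g[0] - phi @ g[1:]   # innovation variance
    j = np.arange(1, p + 1)
    ar_poly = 1 - np.sum(phi[:, None] * np.exp(-1j * 2 * np.pi * f * j[:, None]),
                         axis=0)
    ar_sdf = sigma2 / np.abs(ar_poly) ** 2
    print(p, np.max(np.abs(ar_sdf - target)))  # sup-norm error decreases with p
```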


References
D. R. Brillinger. Time Series: Data Analysis and Theory. Holt, New York, NY, 1981.
P. J. Brockwell and R. A. Davis. Time Series: Theory and Methods (Second Edition).
Springer-Verlag, New York, 1991.
J. Cole, R. Dunbar, T. McClanahan, and N. Muthiga. Tropical Pacific forcing of decadal
variability in the western Indian Ocean over the past two centuries. Science, 287:617–619,
2000. URL http://www.sciencemag.org/content/287/5453/617.
D. Percival and A. Walden. Spectral Analysis for Physical Applications. Cambridge
University Press, Cambridge, 1993.
M. B. Priestley. Spectral Analysis and Time Series. (Vol. 1): Univariate Series. Academic
Press, London, UK, 1981a.
M. B. Priestley. Spectral Analysis and Time Series. (Vol. 2): Multivariate Series, Prediction and Control. Academic Press, London, UK, 1981b.

