Rob J Hyndman

Forecasting using R

3. Autocorrelation and seasonality


OTexts.com/fpp/2/
OTexts.com/fpp/6/1

Forecasting using R

Outline

1 Time series graphics

2 Seasonal or cyclic?

3 Autocorrelation

Time series graphics


Time plots
R command: plot or plot.ts
Seasonal plots
R command: seasonplot
Seasonal subseries plots
R command: monthplot
Lag plots
R command: lag.plot
ACF plots
R command: Acf
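
The five commands listed above can be tried on any seasonal ts object. The following is a minimal sketch, assuming the built-in monthly AirPassengers series as a stand-in for the data sets used in these slides. seasonplot and Acf come from the forecast package, so those calls are guarded and base R's acf is used as a fallback.

```r
# AirPassengers is a built-in monthly series, used here only as a stand-in.
y <- AirPassengers

plot(y)                 # time plot (dispatches to plot.ts for ts objects)
monthplot(y)            # seasonal subseries plot (stats package)
lag.plot(y, lags = 9)   # lag plots for lags 1 to 9 (stats package)

if (requireNamespace("forecast", quietly = TRUE)) {
  forecast::seasonplot(y, year.labels = TRUE)  # seasonal plot, one line per year
  forecast::Acf(y)                             # ACF plot without the lag-0 spike
} else {
  acf(y)                                       # base R ACF plot (includes lag 0)
}
```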

plot(melsyd[,"Economy.Class"])

[Figure: time plot "Economy class passengers: Melbourne-Sydney". x-axis: Year (1988 to 1993); y-axis: Thousands (0 to 30).]

> plot(a10)

[Figure: time plot "Antidiabetic drug sales". x-axis: Year (ticks at 1995, 2000, 2005); y-axis: $ million (5 to 30).]

[Figure: "Seasonal plot: antidiabetic drug sales". x-axis labelled Year, with monthly ticks Jan to Dec; y-axis: $ million; one line per year, labelled 1991 to 2008.]

Seasonal plots
Data plotted against the individual seasons in
which the data were observed. (In this case a
season is a month.)
Something like a time plot except that the data
from each season are overlapped.
Enables the underlying seasonal pattern to be
seen more clearly, and also allows any
substantial departures from the seasonal
pattern to be easily identified.
In R: seasonplot
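
To make the idea of overlapping seasons concrete, here is a rough base R sketch of what a seasonal plot does. It assumes the built-in AirPassengers series (complete calendar years, monthly) rather than the a10 data, and builds the plot by hand with matplot instead of calling seasonplot.

```r
y <- AirPassengers   # built-in monthly series, used as a stand-in

# Reshape to a 12 x n_years matrix: one row per month, one column per year.
# This relies on the series covering complete calendar years.
m <- matrix(as.numeric(y), nrow = frequency(y))
rownames(m) <- month.abb
colnames(m) <- start(y)[1]:end(y)[1]

# Overlay one line per year against the month of observation.
matplot(m, type = "l", lty = 1, xaxt = "n",
        xlab = "Month", ylab = "Passengers (thousands)")
axis(1, at = 1:12, labels = month.abb)
```

Each column of m is one year of data, so every line in the plot is a single year drawn over the same January to December axis.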

Seasonal subseries plots


> monthplot(a10)

[Figure: "Seasonal subseries plot: antidiabetic drug sales". x-axis: Month (Jan to Dec); y-axis: $ million (5 to 30).]

Data for each season collected together in time
plot as separate time series.
Enables the underlying seasonal pattern to be
seen clearly, and changes in seasonality over
time to be visualized.
In R: monthplot
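
As an illustration of how the subseries are formed, the sketch below groups a monthly series by season and computes the mean of each group. By default, monthplot draws a horizontal line at this mean for each subseries. The built-in AirPassengers series is assumed here as a stand-in for a10.

```r
y <- AirPassengers   # built-in monthly series, used as a stand-in

# cycle() gives the position of each observation within the year (1 = Jan).
season <- as.integer(cycle(y))

# Mean of each month's subseries.
season_means <- tapply(as.numeric(y), season, mean)
names(season_means) <- month.abb
print(round(season_means, 1))

# One mini time series per month; horizontal lines mark the subseries means.
monthplot(y, ylab = "Passengers (thousands)")
```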

Quarterly Australian Beer Production

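library(fpp)                          # provides ausbeer and loads forecast (seasonplot)
beer <- window(ausbeer,start=1992)    # quarterly series from 1992 onwards
plot(beer)                            # time plot
seasonplot(beer,year.labels=TRUE)     # seasonal plot, one line per year
monthplot(beer)                       # seasonal subseries plot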

[Figure: time plot "Australian quarterly beer production". x-axis: Year (ticks at 1995, 2000, 2005); y-axis: megalitres (400 to 500).]

[Figure: "Seasonal plot: quarterly beer production". x-axis: Quarter (Q1 to Q4); y-axis: megalitres (400 to 500); one line per year, labelled 1992 to 2008.]

[Figure: "Seasonal subseries plot: quarterly beer production". x-axis: Quarter (ticks labelled Jan, Apr, Jul, Oct); y-axis: megalitres (400 to 500).]

Seasonal or cyclic?

Time series patterns


Trend pattern exists when there is a long-term
increase or decrease in the data.
Seasonal pattern exists when a series is
influenced by seasonal factors (e.g., the
quarter of the year, the month, or day of
the week).
Cyclic pattern exists when data exhibit rises and
falls that are not of fixed period (duration
usually of at least 2 years).

[Figure: time plot "Australian electricity production". x-axis: Year (1980 to 1995); y-axis: GWh (8000 to 14000).]

[Figure: time plot "Australian clay brick production". x-axis: Year (1960 to 1990); y-axis: million units (200 to 600).]

[Figure: time plot "Sales of new one-family houses, USA". x-axis: years 1975 to 1995; y-axis: Total sales (30 to 90).]

[Figure: time plot "US Treasury bill contracts". x-axis: Day (20 to 100); y-axis: price (85 to 91).]

[Figure: time plot "Annual Canadian Lynx trappings". x-axis: Time (1820 to 1920); y-axis: Number trapped (1000 to 7000).]

Seasonal or cyclic?
Differences between seasonal and cyclic
patterns:
seasonal pattern constant length; cyclic pattern
variable length
average length of cycle longer than length of
seasonal pattern
magnitude of cycle more variable than
magnitude of seasonal pattern
The timing of peaks and troughs is predictable with
seasonal data, but unpredictable in the long term
with cyclic data.
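
The contrast in timing can be explored numerically. The sketch below is only illustrative: it assumes the built-in USAccDeaths series as the seasonal example and the built-in lynx series as the cyclic one. The peak-detection rule and its half-window w are hypothetical choices made for this example, not something taken from the slides.

```r
# Seasonal example: built-in monthly USAccDeaths (complete years 1973 to 1978).
s <- matrix(as.numeric(USAccDeaths), nrow = 12)   # one column per year
peak_month <- apply(s, 2, which.max)              # month index of each year's maximum
print(month.abb[peak_month])

# Cyclic example: built-in annual lynx series.
x   <- as.numeric(lynx)
yrs <- as.numeric(time(lynx))
w   <- 3   # hypothetical half-window (years) used to define a local peak

# A year counts as a peak if it holds the largest value within +/- w years.
idx <- (w + 1):(length(x) - w)
is_peak <- sapply(idx, function(i) x[i] == max(x[(i - w):(i + w)]))
peak_years <- yrs[idx][is_peak]

print(peak_years)
print(diff(peak_years))   # gaps between successive peaks, in years
```

For the seasonal series the yearly maximum should fall at about the same time each year, whereas the gaps between lynx peaks are not tied to any calendar period.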

Autocorrelation
Covariance and correlation: measure extent of
linear relationship between two variables (y and
X).
Autocovariance and autocorrelation: measure
linear relationship between lagged values of a
time series y.
We measure the relationship between:
$y_t$ and $y_{t-1}$
$y_t$ and $y_{t-2}$
$y_t$ and $y_{t-3}$
etc.

Example: Beer production


> lag.plot(beer,lags=9)

[Figure: lag plots of beer for lags 1 to 9 (panels "lag 1" to "lag 9"). Both axes show beer (roughly 400 to 500); points are labelled by observation number.]

> lag.plot(beer,lags=9,do.lines=FALSE)

[Figure: the same nine lag plots (panels "lag 1" to "lag 9") drawn without connecting lines.]

Lagged scatterplots

Each graph shows $y_t$ plotted against $y_{t-k}$ for
different values of $k$.
The autocorrelations are the correlations
associated with these scatterplots.
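
The correlation behind each lagged scatterplot can be computed directly by pairing each observation with the one k periods earlier. This is a small sketch assuming the built-in USAccDeaths series as a stand-in for the beer data. The helper name lag_cor is made up for this example.

```r
y <- as.numeric(USAccDeaths)   # built-in monthly series, used as a stand-in

# Correlation of the points in the lag-k scatterplot: y_t against y_{t-k}.
# lag_cor is a hypothetical helper, not a function from any package.
lag_cor <- function(y, k) {
  n <- length(y)
  cor(y[(k + 1):n], y[1:(n - k)])
}

r_lagged <- sapply(1:9, function(k) lag_cor(y, k))
print(round(r_lagged, 3))
```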

Autocorrelation
We denote the sample autocovariance at lag $k$ by $c_k$ and the
sample autocorrelation at lag $k$ by $r_k$. Then define
$$c_k = \frac{1}{T} \sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}) \qquad\text{and}\qquad r_k = c_k / c_0$$
$r_1$ indicates how successive values of $y$ relate to each other
$r_2$ indicates how $y$ values two periods apart relate to each other
$r_k$ is almost the same as the sample correlation between $y_t$ and $y_{t-k}$.
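
The definitions of $c_k$ and $r_k$ translate almost line for line into R. The following sketch computes them by hand and compares the result with the built-in acf function, which uses the same divisor $T$. The built-in USAccDeaths series is assumed as example data, and the names c_k, r_manual and r_acf are chosen for this illustration only.

```r
y     <- as.numeric(USAccDeaths)   # built-in monthly series, used as a stand-in
T_len <- length(y)
ybar  <- mean(y)

# Sample autocovariance at lag k, using the divisor T from the definition.
c_k <- function(k) {
  t_idx <- (k + 1):T_len
  sum((y[t_idx] - ybar) * (y[t_idx - k] - ybar)) / T_len
}

# Sample autocorrelations r_1, ..., r_9.
r_manual <- sapply(1:9, function(k) c_k(k) / c_k(0))

# stats::acf returns lag 0 first; drop it before comparing.
r_acf <- as.numeric(acf(y, lag.max = 9, plot = FALSE)$acf)[-1]

print(all.equal(r_manual, r_acf))
```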

Autocorrelation
Results for first 9 lags for beer data:

    r1      r2      r3      r4      r5      r6      r7      r8      r9
-0.126  -0.650  -0.094   0.863  -0.099  -0.642  -0.098   0.834  -0.116

Autocorrelation
$r_4$ is higher than for the other lags. This is due to
the seasonal pattern in the data: the peaks
tend to be 4 quarters apart and the troughs
tend to be 4 quarters apart.
$r_2$ is more negative than for the other lags
because troughs tend to be 2 quarters behind
peaks.
Together, the autocorrelations at lags 1, 2, ...,
make up the autocorrelation function or ACF.
The plot is known as a correlogram.

Acf(beer)

[Figure: correlogram of the beer data produced by Acf(beer). x-axis: Lag (up to 17); y-axis: ACF (about -0.5 to 0.5).]

Recognizing seasonality in a time series

If there is seasonality, the ACF at the seasonal lag
(e.g., 12 for monthly data) will be large and
positive.
For seasonal monthly data, a large ACF value
will be seen at lag 12 and possibly also at lags
24, 36, . . .
For seasonal quarterly data, a large ACF value
will be seen at lag 4 and possibly also at lags 8,
12, . . .
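
A quick way to check this rule is to read the ACF at the seasonal lag and compare it with the value half a season away. The sketch below assumes the built-in monthly USAccDeaths series. Note that acf reports lags for a ts object in units of years, so the values are re-indexed here by number of observations.

```r
# Autocorrelations of a built-in monthly series, up to three years of lags.
r <- acf(USAccDeaths, lag.max = 36, plot = FALSE)

# Drop lag 0 and index the remaining values by lag in months (1 to 36).
r_vals <- as.numeric(r$acf)[-1]
names(r_vals) <- 1:36

# Compare the half-season lag with the seasonal lag and its multiples.
print(round(r_vals[c("6", "12", "24", "36")], 2))
```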

Australian monthly electricity production

[Figure: time plot "Australian electricity production". x-axis: Year (1980 to 1995); y-axis: GWh (8000 to 14000).]

[Figure: ACF of Australian monthly electricity production. x-axis: Lag (up to 40); y-axis: ACF.]

Time plot shows clear trend and seasonality.
The same features are reflected in the ACF.
The slowly decaying ACF indicates trend.
The ACF peaks at lags 12, 24, 36, . . . , indicate
seasonality of length 12.
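
The same two signatures can be looked for in any trended seasonal series. As a hedged illustration, the sketch below uses the built-in AirPassengers series rather than the electricity data, and prints the autocorrelations at a few selected lags.

```r
y <- AirPassengers   # built-in monthly series with trend and seasonality

# Autocorrelations r_1, ..., r_36 (lag 0 dropped).
r <- as.numeric(acf(y, lag.max = 36, plot = FALSE)$acf)[-1]

# Trend: values stay positive and shrink only slowly as the lag grows.
print(round(r[c(1, 12, 24, 36)], 2))

# Seasonality: values at multiples of 12 compared with the half-year lags.
print(round(r[c(6, 12, 18, 24)], 2))
```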

Which is which?
[Figure: four time plots: 1. Daily morning temperature of a cow; 2. Accidental deaths in USA (monthly), 1973 to 1979, in thousands; 3. International airline passengers, 1950 to 1956, in thousands; 4. Annual mink trappings (Canada), 1850 to 1910, in thousands. Below them, four ACF plots (x-axis: Lag, up to 20; y-axis: ACF, -0.4 to 1.0) to be matched to the series.]

Time series graphics


Time plots
R command: plot.ts
Seasonal plots
R command: seasonplot
Seasonal subseries plots
R command: monthplot
Lag plots
R command: lag.plot
ACF plots
R command: Acf