
Model Building for ARIMA Time Series

ARIMA model building consists of three steps:
1. Identification
2. Estimation
3. Diagnostic checking

Identification

Determination of p, d and q.

To identify an ARIMA(p,d,q) model we make extensive use of
the autocorrelation function $\{\rho_h : -\infty < h < \infty\}$
and the partial autocorrelation function $\{\Phi_{kk} : 0 \le k < \infty\}$.

The definitions of the sample covariance function
$\{C_x(h) : -\infty < h < \infty\}$ and the sample autocorrelation
function $\{r_h : -\infty < h < \infty\}$ are given below:
$$C_x(h) = \frac{1}{T}\sum_{t=1}^{T-h}\left(x_t - \bar{x}\right)\left(x_{t+h} - \bar{x}\right)
\qquad\text{and}\qquad
r_h = \frac{C_x(h)}{C_x(0)}$$

(The divisor here is T; some statisticians use T - h. If T is large,
both give approximately the same results.)
It can be shown that

$$\operatorname{Cov}\left(r_h, r_{h+k}\right) \approx \frac{1}{T}\sum_{t}\rho_t\,\rho_{t+k}$$

Thus, assuming $\rho_k = 0$ for $k > q$,

$$\operatorname{Var}\left(r_h\right) \approx \frac{1}{T}\left(1 + 2\sum_{t=1}^{q}\rho_t^2\right)$$

Let

$$s_{r_h} = \sqrt{\frac{1}{T}\left(1 + 2\sum_{t=1}^{q} r_t^2\right)}$$
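As a concrete illustration, here is a minimal Python/numpy sketch (not part of the original slides; the function names are ours) of the sample ACF and the standard error just defined:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocovariance C_x(h) and autocorrelation r_h (divisor T)."""
    x = np.asarray(x, dtype=float)
    T, xbar = len(x), x.mean()
    C = np.array([((x[:T - h] - xbar) * (x[h:] - xbar)).sum() / T
                  for h in range(max_lag + 1)])
    return C, C / C[0]

def s_rh(r, q, T):
    """Standard error s_{r_h} for lags h > q, assuming rho_k = 0 for k > q."""
    return np.sqrt((1.0 + 2.0 * (r[1:q + 1] ** 2).sum()) / T)

# Example usage on simulated white noise:
rng = np.random.default_rng(0)
x = rng.normal(size=200)
C, r = sample_acf(x, 20)
print(r[1], s_rh(r, 0, len(x)))   # r_1 and its standard error
```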
The sample partial autocorrelation function is defined by:

$$\hat\Phi_{kk} =
\frac{\left|\begin{array}{ccccc}
1 & r_1 & \cdots & r_{k-2} & r_1\\
r_1 & 1 & \cdots & r_{k-3} & r_2\\
\vdots & \vdots & & \vdots & \vdots\\
r_{k-1} & r_{k-2} & \cdots & r_1 & r_k
\end{array}\right|}
{\left|\begin{array}{ccccc}
1 & r_1 & \cdots & r_{k-2} & r_{k-1}\\
r_1 & 1 & \cdots & r_{k-3} & r_{k-2}\\
\vdots & \vdots & & \vdots & \vdots\\
r_{k-1} & r_{k-2} & \cdots & r_1 & 1
\end{array}\right|}$$

It can be shown that

$$\operatorname{Var}\left(\hat\Phi_{kk}\right) \approx \frac{1}{T}$$

Let

$$s_{\hat\Phi_{kk}} = \frac{1}{\sqrt{T}}$$
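The determinant formula is rarely evaluated directly. An equivalent computation (a sketch, assuming the standard Durbin-Levinson recursion; not from the slides) is:

```python
import numpy as np

def sample_pacf(r, max_lag):
    """Sample PACF via the Durbin-Levinson recursion.

    r : sample autocorrelations r_0 = 1, r_1, ..., r_{max_lag}.
    Returns an array whose entry k-1 is Phi_hat_{kk} for k = 1..max_lag.
    """
    phi = np.zeros((max_lag + 1, max_lag + 1))
    pacf = np.zeros(max_lag + 1)
    phi[1, 1] = pacf[1] = r[1]
    for k in range(2, max_lag + 1):
        num = r[k] - np.sum(phi[k - 1, 1:k] * r[k - 1:0:-1])
        den = 1.0 - np.sum(phi[k - 1, 1:k] * r[1:k])
        phi[k, k] = num / den
        # update the intermediate coefficients phi_{k,j}, j = 1..k-1
        phi[k, 1:k] = phi[k - 1, 1:k] - phi[k, k] * phi[k - 1, k - 1:0:-1]
        pacf[k] = phi[k, k]
    return pacf[1:]
```

Values can then be compared with the bound $\pm 2/\sqrt{T}$ from above.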
Identification of an ARIMA process

Determining the values of p, d, q.

Recall that if a process is non-stationary, one of the
roots of the autoregressive operator is equal to one.

This will cause the limiting value of the
autocorrelation function to be non-zero.

Thus a non-stationary process is identified by
an autocorrelation function that does not tail
away to zero quickly or cut off after a finite
number of steps.
To determine the value of d

Note: the autocorrelation function of a stationary ARMA
time series satisfies the difference equation

$$\rho_h = \phi_1\,\rho_{h-1} + \phi_2\,\rho_{h-2} + \cdots + \phi_p\,\rho_{h-p}$$

The solution to this equation has the general form

$$\rho_h = c_1\left(\frac{1}{r_1}\right)^{h} + c_2\left(\frac{1}{r_2}\right)^{h} + \cdots + c_p\left(\frac{1}{r_p}\right)^{h}$$

where $r_1, r_2, \ldots, r_p$ are the roots of the polynomial

$$\phi(x) = 1 - \phi_1 x - \phi_2 x^2 - \cdots - \phi_p x^p$$
For a stationary ARMA time series the roots $r_1, r_2, \ldots, r_p$
have absolute value greater than 1. Therefore

$$\rho_h = c_1\left(\frac{1}{r_1}\right)^{h} + \cdots + c_p\left(\frac{1}{r_p}\right)^{h} \to 0
\quad\text{as } h \to \infty$$

If the ARMA time series is non-stationary, some of the roots
$r_1, r_2, \ldots, r_p$ have absolute value equal to 1, and

$$\rho_h = c_1\left(\frac{1}{r_1}\right)^{h} + \cdots + c_p\left(\frac{1}{r_p}\right)^{h} \to a \ne 0
\quad\text{as } h \to \infty$$
[Figure: two sample ACFs plotted for lags 0-30. For the stationary series the ACF tails off to zero quickly; for the non-stationary series it decays very slowly.]
If the process is non-stationary then first
differences of the series are computed to
determine if that operation results in a
stationary series.
The process is continued until a stationary time
series is found.
This then determines the value of d.
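A rough sketch of this procedure (a hypothetical helper, building on sample_acf defined earlier; the "dies out quickly" judgment is ultimately made by eye): difference the series repeatedly and compare how fast the sample ACF decays:

```python
import numpy as np

def inspect_differences(x, max_d=2, max_lag=20):
    """Print distant-lag autocorrelations for d = 0, 1, ..., max_d.

    A slowly decaying ACF (large r_h at distant lags) suggests the
    series still needs differencing; d is the first order at which
    the ACF tails off quickly.
    """
    for d in range(max_d + 1):
        series = np.diff(x, n=d)      # n = 0 returns x unchanged
        _, r = sample_acf(series, max_lag)
        print(f"d = {d}: r_10 = {r[10]:+.3f}, r_20 = {r[20]:+.3f}")
```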
Identification
Determination of the values of p and q.
To determine the values of p and q we use the
graphical properties of the autocorrelation
function and the partial autocorrelation function.
Again recall the following:
Properties of the ACF and PACF of MA, AR and ARMA Series

Process      Autocorrelation function             Partial autocorrelation function
MA(q)        Cuts off after q.                    Infinite. Tails off. Dominated by
                                                  damped exponentials & cosine waves.
AR(p)        Infinite. Tails off. Damped          Cuts off after p.
             exponentials and/or cosine waves.
ARMA(p,q)    Infinite. Tails off. Damped          Infinite. Tails off. Dominated by
             exponentials and/or cosine waves     damped exponentials & cosine waves
             after q - p.                         after p - q.
More specifically, some typical patterns of the autocorrelation
function and the partial autocorrelation function for some
important ARMA series are as follows:
Patterns of the ACF and PACF of AR(2) Time Series
In the shaded region the roots of the AR operator are complex
Patterns of the ACF and PACF of MA(2) Time Series
In the shaded region the roots of the MA operator are complex
Patterns of the ACF and PACF of ARMA(1,1) Time Series
Note: The patterns exhibited by the ACF and the PACF give
important and useful information relating to the values of the
parameters of the time series.
Summary: To determine p and q, use the following table.

         MA(q)           AR(p)           ARMA(p,q)
ACF      Cuts after q    Tails off       Tails off
PACF     Tails off       Cuts after p    Tails off

Note: Usually p + q ≤ 4. There is no harm in over-identifying
the time series (allowing more parameters in the model than
necessary; we can always test to determine whether the extra
parameters are zero).
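As a sketch of how the table is applied in practice (using the hypothetical helpers sample_acf, s_rh and sample_pacf defined earlier, together with the two-standard-error bounds), one can flag the lags at which the ACF and PACF differ significantly from zero and look for a cut-off:

```python
import numpy as np

def suggest_cutoffs(x, max_lag=20):
    """Flag ACF/PACF lags outside rough two-standard-error bounds."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    _, r = sample_acf(x, max_lag)
    pacf = sample_pacf(r, max_lag)
    # for lag h, test r_h against s_{r_h} computed as if the ACF cut off at h-1
    acf_sig = [h for h in range(1, max_lag + 1)
               if abs(r[h]) > 2 * s_rh(r, h - 1, T)]
    pacf_sig = [k for k in range(1, max_lag + 1)
                if abs(pacf[k - 1]) > 2 / np.sqrt(T)]
    print("significant ACF lags: ", acf_sig)
    print("significant PACF lags:", pacf_sig)
```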
Examples
Example A: "Uncontrolled" Concentration, Two-Hourly Readings: Chemical Process

[Figure: time plot of the concentration series, observations 0-200, values about 16-18.]

The data
1 17.0 41 17.6 81 16.8 121 16.9 161 17.1
2 16.6 42 17.5 82 16.7 122 17.1 162 17.1
3 16.3 43 16.5 83 16.4 123 16.8 163 17.1
4 16.1 44 17.8 84 16.5 124 17.0 164 17.4
5 17.1 45 17.3 85 16.4 125 17.2 165 17.2
6 16.9 46 17.3 86 16.6 126 17.3 166 16.9
7 16.8 47 17.1 87 16.5 127 17.2 167 16.9
8 17.4 48 17.4 88 16.7 128 17.3 168 17.0
9 17.1 49 16.9 89 16.4 129 17.2 169 16.7
10 17.0 50 17.3 90 16.4 130 17.2 170 16.9
11 16.7 51 17.6 91 16.2 131 17.5 171 17.3
12 17.4 52 16.9 92 16.4 132 16.9 172 17.8
13 17.2 53 16.7 93 16.3 133 16.9 173 17.8
14 17.4 54 16.8 94 16.4 134 16.9 174 17.6
15 17.4 55 16.8 95 17.0 135 17.0 175 17.5
16 17.0 56 17.2 96 16.9 136 16.5 176 17.0
17 17.3 57 16.8 97 17.1 137 16.7 177 16.9
18 17.2 58 17.6 98 17.1 138 16.8 178 17.1
19 17.4 59 17.2 99 16.7 139 16.7 179 17.2
20 16.8 60 16.6 100 16.9 140 16.7 180 17.4
21 17.1 61 17.1 101 16.5 141 16.6 181 17.5
22 17.4 62 16.9 102 17.2 142 16.5 182 17.9
23 17.4 63 16.6 103 16.4 143 17.0 183 17.0
24 17.5 64 18.0 104 17.0 144 16.7 184 17.0
25 17.4 65 17.2 105 17.0 145 16.7 185 17.0
26 17.6 66 17.3 106 16.7 146 16.9 186 17.2
27 17.4 67 17.0 107 16.2 147 17.4 187 17.3
28 17.3 68 16.9 108 16.6 148 17.1 188 17.4
29 17.0 69 17.3 109 16.9 149 17.0 189 17.4
30 17.8 70 16.8 110 16.5 150 16.8 190 17.0
31 17.5 71 17.3 111 16.6 151 17.2 191 18.0
32 18.1 72 17.4 112 16.6 152 17.2 192 18.2
33 17.5 73 17.7 113 17.0 153 17.4 193 17.6
34 17.4 74 16.8 114 17.1 154 17.2 194 17.8
35 17.4 75 16.9 115 17.1 155 16.9 195 17.7
36 17.1 76 17.0 116 16.7 156 16.8 196 17.2
37 17.6 77 16.9 117 16.8 157 17.0 197 17.4
38 17.7 78 17.0 118 16.3 158 17.4
39 17.4 79 16.6 119 16.6 159 17.2
40 17.8 80 16.7 120 16.8 160 17.2
Example B: Annual Sunspot Numbers (1770-1869)

[Figure: time plot of the annual sunspot numbers, values 0-200.]
The data
Example B: Sunspot Numbers: Yearly
1770 101 1795 21 1820 16 1845 40
1771 82 1796 16 1821 7 1846 64
1772 66 1797 6 1822 4 1847 98
1773 35 1798 4 1823 2 1848 124
1774 31 1799 7 1824 8 1849 96
1775 7 1800 14 1825 17 1850 66
1776 20 1801 34 1826 36 1851 64
1777 92 1802 45 1827 50 1852 54
1778 154 1803 43 1828 62 1853 39
1779 125 1804 48 1829 67 1854 21
1780 85 1805 42 1830 71 1855 7
1781 68 1806 28 1831 48 1856 4
1782 38 1807 10 1832 28 1857 23
1783 23 1808 8 1833 8 1858 55
1784 10 1809 2 1834 13 1859 94
1785 24 1810 0 1835 57 1860 96
1786 83 1811 1 1836 122 1861 77
1787 132 1812 5 1837 138 1862 59
1788 131 1813 12 1838 103 1863 44
1789 118 1814 14 1839 86 1864 47
1790 90 1815 35 1840 63 1865 30
1791 67 1816 46 1841 37 1866 16
1792 60 1817 41 1842 24 1867 7
1793 47 1818 30 1843 11 1868 37
1794 41 1819 24 1844 15 1869 74
Example C: IBM Common Stock Closing Prices: Daily (May 17 1961 - Nov 2 1962)

[Figure: time plot of daily IBM common stock closing prices; Day (0-300) on the horizontal axis, Price ($) (300-700) on the vertical axis.]
460 471 527 580 551 523 333 394 330
457 467 540 579 551 516 330 393 340
452 473 542 584 552 511 336 409 339
459 481 538 581 553 518 328 411 331
462 488 541 581 557 517 316 409 345
459 490 541 577 557 520 320 408 352
463 489 547 577 548 519 332 393 346
479 489 553 578 547 519 320 391 352
493 485 559 580 545 519 333 388 357
490 491 557 586 545 518 344 396
492 492 557 583 539 513 339 387
498 494 560 581 539 499 350 383
499 499 571 576 535 485 351 388
497 498 571 571 537 454 350 382
496 500 569 575 535 462 345 384
490 497 575 575 536 473 350 382
489 494 580 573 537 482 359 383
478 495 584 577 543 486 375 383
487 500 585 582 548 475 379 388
491 504 590 584 546 459 376 395
487 513 599 579 547 451 382 392
482 511 603 572 548 453 370 386
487 514 599 577 549 446 365 383
482 510 596 571 553 455 367 377
479 509 585 560 553 452 372 364
478 515 587 549 552 457 373 369
479 519 585 556 551 449 363 355
477 523 581 557 550 450 371 350
479 519 583 563 553 435 369 353
475 523 592 564 554 415 376 340
479 531 592 567 551 398 387 350
476 547 596 561 551 399 387 349
478 551 596 559 545 361 376 358
479 547 595 553 547 383 385 360
477 541 598 553 547 393 385 360
476 545 598 553 537 385 380 366
475 549 595 547 539 360 373 359
473 545 595 550 538 364 382 356
474 549 592 544 533 365 377 355
474 547 588 541 525 370 376 367
474 543 582 532 513 374 379 357
465 540 576 525 510 359 386 361
466 539 578 542 521 335 387 355
467 532 589 555 521 323 386 348
471 517 585 558 521 306 389 343
Read downwards
Chemical Concentration data:

[Figure: time plot of the concentration series, observations 0-200, values about 16-18 (Example A: "Uncontrolled" Concentration, Two-Hourly Readings: Chemical Process).]
Summary Statistics

d    N      Mean      Std. Dev.
0    197    17.062    0.398
1    196     0.002    0.369
2    195     0.003    0.622
ACF and PACF for x_t, Δx_t and Δ²x_t (Chemical Concentration Data)

[Figure: six panels plotting r_h (ACF) and Φ̂_kk (PACF) against lag (up to 40), each on a -1.0 to 1.0 scale, for the original series x_t, the first differences Δx_t, and the second differences Δ²x_t.]
Possible Identifications
1. d = 0, p = 1, q = 1
2. d = 1, p = 0, q = 1
Sunspot Data:

[Figure: time plot of the annual sunspot numbers, 1770-1869, values 0-200.]
Summary Statistics for the Sunspot Data
ACF and PACF for x_t, Δx_t and Δ²x_t (Sunspot Data)

[Figure: six panels plotting r_h (ACF) and Φ̂_kk (PACF) against lag (up to 40), each on a -1.0 to 1.0 scale, for x_t, Δx_t and Δ²x_t.]
Possible Identification
1. d = 0, p = 2, q = 0
IBM stock data:

[Figure: time plot of daily IBM common stock closing prices, May 17 1961 - November 2 1962; Day (0-300) on the horizontal axis, Price ($) (300-700) on the vertical axis.]
Summary Statistics
ACF and PACF for x_t, Δx_t and Δ²x_t (IBM Stock Price Data)

[Figure: six panels plotting r_h (ACF) and Φ̂_kk (PACF) against lag (up to 40), each on a -1.0 to 1.0 scale, for x_t, Δx_t and Δ²x_t.]
Possible Identification
1. d = 1, p = 0, q = 0
Estimation of ARIMA parameters

Preliminary Estimation
Using the Method of Moments

Equate sample statistics to population parameters.
Estimation of parameters of an MA(q) series

The theoretical autocorrelation function in terms of the
parameters of an MA(q) process is given by:

$$\rho_h = \begin{cases}
\dfrac{\theta_h + \theta_1\theta_{h+1} + \theta_2\theta_{h+2} + \cdots + \theta_{q-h}\theta_q}
{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2} & 1 \le h \le q\\[2ex]
0 & h > q
\end{cases}$$
To estimate $\theta_1, \theta_2, \ldots, \theta_q$ we solve the system of
equations:

$$r_h = \frac{\hat\theta_h + \hat\theta_1\hat\theta_{h+1} + \hat\theta_2\hat\theta_{h+2} + \cdots + \hat\theta_{q-h}\hat\theta_q}
{1 + \hat\theta_1^2 + \hat\theta_2^2 + \cdots + \hat\theta_q^2},
\qquad 1 \le h \le q$$

This set of equations is non-linear and generally very
difficult to solve.
For q = 1 the equation becomes:

$$r_1 = \frac{\hat\theta_1}{1 + \hat\theta_1^2}$$

Thus

$$r_1\hat\theta_1^2 - \hat\theta_1 + r_1 = 0$$

This equation has the two solutions

$$\hat\theta_1 = \frac{1 \pm \sqrt{1 - 4r_1^2}}{2r_1}$$

One solution will result in the MA(1) time series being
invertible.
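A small sketch of this moment estimator (Python; the function name is ours): solve the quadratic and keep the invertible root, i.e. the one with |θ̂₁| < 1:

```python
import numpy as np

def ma1_moment_estimate(r1):
    """Method-of-moments estimate of theta_1 for an MA(1) series,
    from r1 = theta_1 / (1 + theta_1^2). Returns the invertible root."""
    if abs(r1) >= 0.5:
        raise ValueError("No real solution: an MA(1) requires |r1| < 0.5.")
    disc = np.sqrt(1.0 - 4.0 * r1 ** 2)
    roots = [(1.0 + disc) / (2.0 * r1), (1.0 - disc) / (2.0 * r1)]
    return next(t for t in roots if abs(t) < 1.0)

# r_1 = -0.413 for the differenced chemical data (value from the slides):
print(round(ma1_moment_estimate(-0.413), 2))   # -0.53
```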
For q = 2 the equations become:

$$r_1 = \frac{\hat\theta_1 + \hat\theta_1\hat\theta_2}{1 + \hat\theta_1^2 + \hat\theta_2^2}
\qquad\text{and}\qquad
r_2 = \frac{\hat\theta_2}{1 + \hat\theta_1^2 + \hat\theta_2^2}$$
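These equations have no convenient closed form. A sketch of a numerical solution (using scipy's fsolve; the sample values r₁ = 0.4, r₂ = 0.2 are hypothetical, chosen only for illustration):

```python
import numpy as np
from scipy.optimize import fsolve

def ma2_moment_equations(theta, r1, r2):
    """Residuals of the two MA(2) moment equations at (theta_1, theta_2)."""
    t1, t2 = theta
    denom = 1.0 + t1 ** 2 + t2 ** 2
    return [t1 * (1.0 + t2) / denom - r1,
            t2 / denom - r2]

# hypothetical sample autocorrelations for illustration
theta1, theta2 = fsolve(ma2_moment_equations, x0=[0.1, 0.1], args=(0.4, 0.2))
print(round(theta1, 2), round(theta2, 2))   # roughly 0.39 and 0.24
```

As in the MA(1) case, the solution kept should be the one giving an invertible series.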
Estimation of parameters of an ARMA(p,q) series

We use a similar technique. Namely: obtain an expression for
$\rho_h$ in terms of $\phi_1, \phi_2, \ldots, \phi_p;\ \theta_1, \theta_2, \ldots, \theta_q$,
and set up p + q equations for the estimates of
$\phi_1, \phi_2, \ldots, \phi_p;\ \theta_1, \theta_2, \ldots, \theta_q$
by replacing $\rho_h$ with $r_h$.
Estimation of parameters of an ARMA(p,q) series

Example: The ARMA(1,1) process

The expressions for $\rho_1$ and $\rho_2$ in terms of $\phi_1$ and
$\theta_1$ are:

$$\rho_1 = \frac{\left(1 + \phi_1\theta_1\right)\left(\phi_1 + \theta_1\right)}{1 + \theta_1^2 + 2\phi_1\theta_1}
\qquad\text{and}\qquad
\rho_2 = \phi_1\,\rho_1$$
Further

$$\gamma_0 = \operatorname{Var}(x_t) = \frac{1 + \theta_1^2 + 2\phi_1\theta_1}{1 - \phi_1^2}\,\sigma_u^2$$
Thus the expressions for the estimates of $\phi_1$, $\theta_1$
and $\sigma^2$ are:

$$\hat\phi_1 = \frac{r_2}{r_1},
\qquad
r_1\left(1 + \hat\theta_1^2 + 2\hat\phi_1\hat\theta_1\right)
= \left(1 + \hat\phi_1\hat\theta_1\right)\left(\hat\phi_1 + \hat\theta_1\right)$$

and

$$\hat\sigma^2 = \frac{1 - \hat\phi_1^2}{1 + \hat\theta_1^2 + 2\hat\phi_1\hat\theta_1}\,C_x(0)$$
Hence

$$r_1\left(1 + \hat\theta_1^2 + 2\frac{r_2}{r_1}\hat\theta_1\right)
= \left(1 + \frac{r_2}{r_1}\hat\theta_1\right)\left(\frac{r_2}{r_1} + \hat\theta_1\right)$$

or

$$\left(r_1 - \frac{r_2}{r_1}\right)\hat\theta_1^2
+ \left(2r_2 - 1 - \frac{r_2^2}{r_1^2}\right)\hat\theta_1
+ \left(r_1 - \frac{r_2}{r_1}\right) = 0$$

This is a quadratic equation which can be solved.
Example (Chemical Concentration Data)

The time series was identified as either an ARIMA(1,0,1)
time series or an ARIMA(0,1,1) series.

If we use the first identification, then the series x_t is an
ARMA(1,1) series.

The autocorrelation at lag 1 is r_1 = 0.570 and the
autocorrelation at lag 2 is r_2 = 0.495.

Thus the estimate of φ_1 is 0.495/0.570 = 0.87.
Also the quadratic equation

$$\left(r_1 - \frac{r_2}{r_1}\right)\hat\theta_1^2
+ \left(2r_2 - 1 - \frac{r_2^2}{r_1^2}\right)\hat\theta_1
+ \left(r_1 - \frac{r_2}{r_1}\right) = 0$$

becomes

$$-0.2984\,\hat\theta_1^2 - 0.7642\,\hat\theta_1 - 0.2984 = 0$$

which has the two solutions -0.48 and -2.08. Again we
select as our estimate of θ_1 the solution -0.48,
resulting in an invertible estimated series.
Since δ = (1 - φ_1)μ, the estimate of δ can be computed as
follows:

$$\hat\delta = \bar{x}\left(1 - \hat\phi_1\right) = 17.062\,(1 - 0.868) = 2.25$$

Thus the identified model in this case is

x_t = 0.87 x_{t-1} + u_t - 0.48 u_{t-1} + 2.25
If we use the second identification, then the series
Δx_t = x_t - x_{t-1} is an MA(1) series.

Thus the estimate of θ_1 is:

$$\hat\theta_1 = \frac{1 \pm \sqrt{1 - 4r_1^2}}{2r_1}$$
The value of r_1 is -0.413. Thus the estimate of θ_1 is:

$$\hat\theta_1 = \frac{1 \pm \sqrt{1 - 4(-0.413)^2}}{2(-0.413)}
= \begin{cases} -0.53\\ -1.89 \end{cases}$$

The estimate θ̂_1 = -0.53 corresponds to an
invertible time series. This is the solution that we will
choose.
The estimate of δ is the sample mean of the differenced series.

Thus the identified model in this case is:

Δx_t = u_t - 0.53 u_{t-1} + 0.002, or
x_t = x_{t-1} + u_t - 0.53 u_{t-1} + 0.002
(an ARIMA(0,1,1) model)

This compares with the other identification:

x_t = 0.87 x_{t-1} + u_t - 0.48 u_{t-1} + 2.25
(an ARIMA(1,0,1) model)
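Both identifications can be reproduced from the slides' sample statistics with a few lines of numpy (a sketch; rounding may differ slightly in the last digit):

```python
import numpy as np

# Sample statistics reported in the slides for the chemical data:
r1, r2, xbar = 0.570, 0.495, 17.062

# Identification 1: ARMA(1,1) for x_t
phi = r2 / r1                                    # 0.87
a = r1 - r2 / r1                                 # quadratic coefficients
b = 2 * r2 - 1 - (r2 / r1) ** 2
theta_roots = np.roots([a, b, a])                # -2.08 and -0.48
theta = theta_roots[np.abs(theta_roots) < 1][0]  # invertible root: -0.48
delta = xbar * (1 - phi)                         # 2.25
print(f"ARIMA(1,0,1): phi={phi:.2f}, theta={theta:.2f}, delta={delta:.2f}")

# Identification 2: MA(1) for the first differences
r1d, dbar = -0.413, 0.002
disc = np.sqrt(1 - 4 * r1d ** 2)
roots = np.array([(1 + disc) / (2 * r1d), (1 - disc) / (2 * r1d)])
theta_d = roots[np.abs(roots) < 1][0]            # invertible root: -0.53
print(f"ARIMA(0,1,1): theta={theta_d:.2f}, delta={dbar}")
```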
Preliminary Estimation of the Parameters of an AR(p) Process

The regression coefficients $\phi_1, \phi_2, \ldots, \phi_p$ and the
autocorrelation function $\rho_h$ satisfy the Yule-Walker
equations:

$$\rho_1 = \phi_1 + \phi_2\rho_1 + \cdots + \phi_p\rho_{p-1}$$
$$\rho_2 = \phi_1\rho_1 + \phi_2 + \cdots + \phi_p\rho_{p-2}$$
$$\vdots$$
$$\rho_p = \phi_1\rho_{p-1} + \phi_2\rho_{p-2} + \cdots + \phi_p$$

and

$$\sigma^2 = \gamma_0\left(1 - \phi_1\rho_1 - \phi_2\rho_2 - \cdots - \phi_p\rho_p\right)$$
The Yule-Walker equations can be used to estimate the
regression coefficients $\phi_1, \phi_2, \ldots, \phi_p$ using the
sample autocorrelation function $r_h$ by replacing $\rho_h$
with $r_h$:

$$r_1 = \hat\phi_1 + \hat\phi_2 r_1 + \cdots + \hat\phi_p r_{p-1}$$
$$r_2 = \hat\phi_1 r_1 + \hat\phi_2 + \cdots + \hat\phi_p r_{p-2}$$
$$\vdots$$
$$r_p = \hat\phi_1 r_{p-1} + \hat\phi_2 r_{p-2} + \cdots + \hat\phi_p$$

and

$$\hat\sigma^2 = C_x(0)\left(1 - \hat\phi_1 r_1 - \hat\phi_2 r_2 - \cdots - \hat\phi_p r_p\right)$$
Example

Considering the data in Example B (Sunspot Data), the time series
was identified as an AR(2) time series.

The autocorrelation at lag 1 is r_1 = 0.807 and the autocorrelation
at lag 2 is r_2 = 0.429.

The equations for the estimators of the parameters of this series
are

$$1.000\,\hat\phi_1 + 0.807\,\hat\phi_2 = 0.807$$
$$0.807\,\hat\phi_1 + 1.000\,\hat\phi_2 = 0.429$$

which have the solution

$$\hat\phi_1 = 1.321, \qquad \hat\phi_2 = -0.637$$
Since δ = (1 - φ_1 - φ_2)μ, it can be estimated as follows:

$$\hat\delta = \bar{x}\left(1 - \hat\phi_1 - \hat\phi_2\right) = 46.590\,(1 - 1.321 + 0.637) = 14.9$$

Thus the identified model in this case is

x_t = 1.321 x_{t-1} - 0.637 x_{t-2} + u_t + 14.9
Maximum Likelihood Estimation of the parameters of an ARMA(p,q) Series

The method of maximum likelihood estimation selects as
estimators of a set of parameters $\theta_1, \theta_2, \ldots, \theta_k$
the values that maximize

$$L(\theta_1, \theta_2, \ldots, \theta_k) = f(x_1, x_2, \ldots, x_N;\ \theta_1, \theta_2, \ldots, \theta_k)$$

where $f(x_1, x_2, \ldots, x_N;\ \theta_1, \theta_2, \ldots, \theta_k)$ is the joint
density function of the observations $x_1, x_2, \ldots, x_N$.

$L(\theta_1, \theta_2, \ldots, \theta_k)$ is called the likelihood function.
It is important to note that finding the values
$\theta_1, \theta_2, \ldots, \theta_k$ that maximize
$L(\theta_1, \theta_2, \ldots, \theta_k)$ is equivalent to finding the
values that maximize
$l(\theta_1, \theta_2, \ldots, \theta_k) = \ln L(\theta_1, \theta_2, \ldots, \theta_k)$.

$l(\theta_1, \theta_2, \ldots, \theta_k)$ is called the log-likelihood
function.
Again let $\{u_t : t \in T\}$ be identically distributed
and uncorrelated with mean zero. In addition,
assume that each is normally distributed.

Consider the time series $\{x_t : t \in T\}$ defined by
the equation:

$$(*)\quad x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \delta
+ u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \cdots + \theta_q u_{t-q}$$
Assume that $x_1, x_2, \ldots, x_N$ are observations on the
time series up to time t = N.

To estimate the p + q + 2 parameters
$\phi_1, \phi_2, \ldots, \phi_p;\ \theta_1, \theta_2, \ldots, \theta_q;\ \delta, \sigma^2$
by the method of maximum likelihood estimation we need to find
the joint density function of $x_1, x_2, \ldots, x_N$:

$$f(x_1, x_2, \ldots, x_N \mid \phi_1, \ldots, \phi_p;\ \theta_1, \ldots, \theta_q;\ \delta, \sigma^2)
= f(\mathbf{x} \mid \boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2)$$
We know that $u_1, u_2, \ldots, u_N$ are independent
normal with mean zero and variance $\sigma^2$.
Thus the joint density function of $u_1, u_2, \ldots, u_N$ is
$g(u_1, u_2, \ldots, u_N;\ \sigma^2) = g(\mathbf{u};\ \sigma^2)$, given by:

$$g(\mathbf{u};\ \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{N/2}
\exp\left\{-\frac{1}{2\sigma^2}\sum_{t=1}^{N} u_t^2\right\}$$
It is difficult to determine the exact density function of
$x_1, x_2, \ldots, x_N$ from this information. However, if we assume
that p starting values of the x-process,
$\mathbf{x}^* = (x_{1-p}, x_{2-p}, \ldots, x_0)$, and q starting values of
the u-process, $\mathbf{u}^* = (u_{1-q}, u_{2-q}, \ldots, u_0)$,
have been observed, then the conditional distribution of
$\mathbf{x} = (x_1, x_2, \ldots, x_N)$ given $\mathbf{x}^*$ and $\mathbf{u}^*$
can easily be determined.
The system of equations

$$x_1 = \phi_1 x_0 + \phi_2 x_{-1} + \cdots + \phi_p x_{1-p} + \delta + u_1 + \theta_1 u_0 + \theta_2 u_{-1} + \cdots + \theta_q u_{1-q}$$
$$x_2 = \phi_1 x_1 + \phi_2 x_0 + \cdots + \phi_p x_{2-p} + \delta + u_2 + \theta_1 u_1 + \theta_2 u_0 + \cdots + \theta_q u_{2-q}$$
$$\vdots$$
$$x_N = \phi_1 x_{N-1} + \phi_2 x_{N-2} + \cdots + \phi_p x_{N-p} + \delta + u_N + \theta_1 u_{N-1} + \theta_2 u_{N-2} + \cdots + \theta_q u_{N-q}$$
can be solved for:

$$u_1 = u_1(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$$
$$u_2 = u_2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$$
$$\vdots$$
$$u_N = u_N(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$$

(The Jacobian of the transformation is 1.)
Then the joint density of $\mathbf{x}$ given $\mathbf{x}^*$ and $\mathbf{u}^*$ is
given by:

$$f(\mathbf{x} \mid \mathbf{x}^*, \mathbf{u}^*,\ \boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2)
= \left(\frac{1}{2\pi\sigma^2}\right)^{N/2}
\exp\left\{-\frac{1}{2\sigma^2}\sum_{t=1}^{N} u_t^2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)\right\}$$

$$= \left(\frac{1}{2\pi\sigma^2}\right)^{N/2}
\exp\left\{-\frac{1}{2\sigma^2}S(\boldsymbol\phi, \boldsymbol\theta, \delta)\right\}$$

where

$$S(\boldsymbol\phi, \boldsymbol\theta, \delta)
= \sum_{t=1}^{N} u_t^2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$$
Let

$$L(\boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2 \mid \mathbf{x}, \mathbf{x}^*, \mathbf{u}^*)
= \left(\frac{1}{2\pi\sigma^2}\right)^{N/2}
\exp\left\{-\frac{1}{2\sigma^2}S(\boldsymbol\phi, \boldsymbol\theta, \delta)\right\}$$

where again
$S(\boldsymbol\phi, \boldsymbol\theta, \delta) = \sum_{t=1}^{N} u_t^2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$.

This is the conditional likelihood function.
The conditional log-likelihood function is

$$l(\boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2 \mid \mathbf{x}, \mathbf{x}^*, \mathbf{u}^*)
= \ln L(\boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2 \mid \mathbf{x}, \mathbf{x}^*, \mathbf{u}^*)$$

$$= -\frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln\sigma^2
- \frac{1}{2\sigma^2}S(\boldsymbol\phi, \boldsymbol\theta, \delta)$$
The values that maximize
$L(\boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2 \mid \mathbf{x}, \mathbf{x}^*, \mathbf{u}^*)$ and
$l(\boldsymbol\phi, \boldsymbol\theta, \delta, \sigma^2 \mid \mathbf{x}, \mathbf{x}^*, \mathbf{u}^*)$
are the values $\hat{\boldsymbol\phi}, \hat{\boldsymbol\theta}, \hat\delta$
that minimize $S(\boldsymbol\phi, \boldsymbol\theta, \delta)$, with

$$\hat\sigma^2 = \frac{1}{N}\sum_{t=1}^{N} u_t^2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \hat{\boldsymbol\phi}, \hat{\boldsymbol\theta}, \hat\delta)
= \frac{1}{N}S(\hat{\boldsymbol\phi}, \hat{\boldsymbol\theta}, \hat\delta)$$
Comment:

The minimization of

$$S(\boldsymbol\phi, \boldsymbol\theta, \delta)
= \sum_{t=1}^{N} u_t^2(\mathbf{x}, \mathbf{x}^*, \mathbf{u}^*;\ \boldsymbol\phi, \boldsymbol\theta, \delta)$$

requires an iterative numerical minimization procedure to find
$\hat{\boldsymbol\phi}, \hat{\boldsymbol\theta}, \hat\delta$, e.g.:
Steepest descent
Simulated annealing
etc.
Comment:

The computation of $S(\boldsymbol\phi, \boldsymbol\theta, \delta)$ for specific
values of $\boldsymbol\phi, \boldsymbol\theta, \delta$ can be achieved by using
the forecast equations:

$$u_t = x_t - \hat{x}_{t-1}(1)$$

i.e., each $u_t$ is the one-step-ahead forecast error.
Comment:

The minimization of $S(\boldsymbol\phi, \boldsymbol\theta, \delta)$ assumes
that we know the starting values of the time series
$\{x_t \mid t \in T\}$ and $\{u_t \mid t \in T\}$, namely
$\mathbf{x}^*$ and $\mathbf{u}^*$.

Approaches:
1. Use estimated values:
   $\bar{x}$ for the components of $\mathbf{x}^*$, 0 for the components of $\mathbf{u}^*$.
2. Use forecasting and backcasting equations to estimate the values.
Backcasting:

If the time series $\{x_t \mid t \in T\}$ satisfies the equation

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \delta
+ u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \cdots + \theta_q u_{t-q}$$

it can also be shown to satisfy the equation

$$x_t = \phi_1 x_{t+1} + \phi_2 x_{t+2} + \cdots + \phi_p x_{t+p} + \delta
+ u_t + \theta_1 u_{t+1} + \theta_2 u_{t+2} + \cdots + \theta_q u_{t+q}$$

Both equations result in a time series with the same
mean, variance and autocorrelation function.

In the same way that the first equation can be used to
forecast into the future, the second equation can be
used to backcast into the past.
Approaches to handling starting values of the series
$\{x_t \mid t \in T\}$ and $\{u_t \mid t \in T\}$:

1. Initially start with the values:
   $\bar{x}$ for the components of $\mathbf{x}^*$, 0 for the components of $\mathbf{u}^*$.
2. Estimate the parameters of the model using
   maximum likelihood estimation and the
   conditional likelihood function.
3. Use the estimated parameters to backcast the
   components of x*. The backcasted components of
   u* will still be zero.
4. Repeat steps 2 and 3 until the estimates stabilize.
This algorithm is an application of the E-M algorithm.
This general algorithm is frequently used when there
are missing values.
The E stands for Expectation (using a model to estimate
the missing values).
The M stands for Maximization (maximum likelihood
estimation, the process used to estimate the parameters
of the model).
Some Examples using:
Minitab
Statistica
S-Plus
SAS