
Predictive Analytics : QM901.1x
Prof U Dinesh Kumar, IIMB

All Rights Reserved, Indian Institute of Management Bangalore


Those who have knowledge don't predict.
Those who predict don't have knowledge.
- Lao Tzu


I think there is a world market for maybe 5 computers.
- Thomas Watson, Chairman of IBM, 1943

Computers in the future may weigh no more than 1.5 tons.
- Popular Mechanics, 1949

640K ought to be enough for everybody.
- Bill Gates, 1981 (?)


Forecasting

Forecasting is the process of estimating an unknown event or parameter, such as the demand for a product.
The term forecasting is most commonly used in the context of time series data.
A time series is a sequence of data points measured at successive time intervals.


[Figure: the role of forecasting in the planning hierarchy. Boxes: Corporate Strategy; Business Forecasting; Product and Market Planning (aggregate forecasting, product and market strategy); Aggregate Production Planning and Resource Planning (medium to long range planning); Master Production Planning and Production Capacity (short range planning, demand forecasting at SKU level); Materials Requirement Planning and Capacity Requirement Planning (item forecasting).]


Forecasting methods
Qualitative techniques: expert opinion (or astrologers).
Quantitative techniques: time series techniques such as exponential smoothing.
Causal models: use information about the relationship between system elements (e.g. regression).

Why Time Series Analysis?

Time series analysis helps to identify and explain:
Any systematic variation in the series that is due to seasonality.
Cyclical patterns that repeat.
Trends in the data.
Growth rates in the trends.

Time Series Components

Trend
Cyclical
Seasonal
Irregular

Trend Component


Persistent upward or downward pattern


Due to consumer behaviour, population, economy, technology etc.

Cyclical Component

Repeating up and down movements.
Due to the interaction of factors influencing the economy, such as recessions.
Usually of 2-10 years' duration.


Seasonal Component
Regular pattern of up and down movements.
Due to weather, customs, festivals etc.

Occurs within one year.

Seasonal Vs Cyclical

When a repeating pattern in the data has a period of less than one year, it is referred to as seasonal variation.
When the repeating pattern has a period of more than one year, it is referred to as cyclical variation.


Irregular Component
Erratic fluctuations
Due to random variation or unforeseen events

White Noise

[Figure: demand plotted against time in four panels. (a) Trend. (b) Cycle, with repeats more than one year apart. (c) Seasonal pattern, with repeats less than one year apart. (d) Trend with seasonal pattern. Random movement is marked on the plots.]

Time Series Data: Additive and Multiplicative Models

1. Additive Forecasting Model

$$Y_t = T_t + S_t + C_t + R_t$$

2. Multiplicative Forecasting Model

$$Y_t = T_t \times S_t \times C_t \times R_t$$

In both models $T_t$ is the trend, $S_t$ the seasonality, $C_t$ the cyclical component and $R_t$ the random component.

Time Series Data Decomposition

Multiplicative Forecasting Model (trend and seasonality only)

$$Y_t = T_t \times S_t$$

$$\frac{Y_t}{S_t} = T_t$$

$Y_t / S_t$ is called deseasonalized data.

Time Series Techniques

Moving Average.
Exponential Smoothing.
Auto-regression Models (AR Models).
ARIMA (Auto-regressive Integrated Moving Average) Models.

Moving Average Method

Moving Average (Rolling Average)

Simple moving average
Used mainly to capture trend and smooth short-term fluctuations.
The n most recent observations are given equal weights.

Weighted moving average
Uses unequal weights for the observations.

Data

Demand for continental breakfast at the Die Another Day (DAD) Hospital.
Daily data from 1 October 2014 to 23 January 2015 (115 days).


Simple moving average

The forecast for period t+1 ($F_{t+1}$) is given by the average of the n most recent observations:

$$F_{t+1} = \frac{1}{n} \sum_{i=t-n+1}^{t} Y_i$$

$F_{t+1}$ = forecast for period t+1
$Y_i$ = data corresponding to time period i
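
The moving average forecast can be sketched in a few lines of Python. This is a minimal illustration, not the course's own implementation: the function name `moving_average_forecast` and the short `demand` list are assumptions made for the example.

```python
def moving_average_forecast(y, n):
    """One-step-ahead simple moving average forecasts.

    forecasts[i] is the forecast for y[i], computed as the mean of the
    n observations immediately before it. The first n entries are None
    because fewer than n past observations exist.
    """
    forecasts = [None] * len(y)
    for i in range(n, len(y)):
        forecasts[i] = sum(y[i - n:i]) / n
    return forecasts


# Hypothetical sample input (nine daily demand values)
demand = [41, 30, 40, 40, 40, 40, 40, 40, 44]
forecasts = moving_average_forecast(demand, n=7)
print(forecasts[7], forecasts[8])
```

With n = 7 the first forecast is available only for the eighth observation, which is why the earlier entries are left as `None`.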


[Figure: Demand for Continental Breakfast at DAD Hospital, actual demand and 7-day moving average forecast, $F_{t+1} = \frac{1}{7} \sum_{i=t-7+1}^{t} Y_i$.]

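Measures of aggregate error

Mean absolute error (MAE):

$$\text{MAE} = \frac{1}{n} \sum_{t=1}^{n} |E_t|$$

Mean absolute percentage error (MAPE):

$$\text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{E_t}{Y_t} \right|$$

Mean squared error (MSE):

$$\text{MSE} = \frac{1}{n} \sum_{t=1}^{n} E_t^2$$

Root mean squared error (RMSE):

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} E_t^2}$$

$E_t = Y_t - F_t$ is the forecast error in period t. The sign convention does not affect these measures, because they use only $|E_t|$ or $E_t^2$.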
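
As an illustration of the four error measures, here is a small Python sketch. The function name `error_measures` and the sample `actual` and `forecast` lists are hypothetical; MAPE is returned as a fraction, so multiply by 100 for a percentage.

```python
import math


def error_measures(actual, forecast):
    """Return MAE, MAPE (as a fraction), MSE and RMSE for paired series."""
    errors = [y - f for y, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / y) for e, y in zip(errors, actual)) / n
    mse = sum(e ** 2 for e in errors) / n
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": math.sqrt(mse)}


# Hypothetical values, for illustration only
actual = [40, 44, 49, 50]
forecast = [38, 45, 47, 52]
measures = error_measures(actual, forecast)
print(measures)
```

MAPE is undefined when any actual value is zero, so this sketch assumes strictly positive demand.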

DAD Forecasting: MAPE and RMSE

Moving Average

Mean absolute percentage error (MAPE): 0.1068 or 10.68%
Root mean squared error (RMSE): 5.8199

Exponential Smoothing

A form of weighted moving average.
Weights decline exponentially.
The largest weight is given to the most recent observation, less weight to the immediately preceding observation, and so on.
Requires a smoothing constant (α), which ranges from 0 to 1.

Exponential Smoothing

Next forecast = α × (present actual value) + (1 − α) × (present forecast)

Simple Exponential Smoothing Equations

Smoothing equations:

$$F_{t+1} = \alpha Y_t + (1 - \alpha) F_t$$

$$F_1 = Y_1$$

$F_{t+1}$ is the forecast value at time t+1.

Expanding the recursion shows the exponentially declining weights:

$$F_t = \alpha Y_{t-1} + \alpha (1 - \alpha) Y_{t-2} + \alpha (1 - \alpha)^2 Y_{t-3} + \dots$$
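
The smoothing recursion translates directly into code. The sketch below is illustrative only; the function name `simple_exponential_smoothing` and the two-point input are assumptions for the example.

```python
def simple_exponential_smoothing(y, alpha):
    """Return forecasts F_1, ..., F_{n+1} for observations Y_1, ..., Y_n.

    forecasts[i] is the forecast for period i + 1 (0-based list index),
    starting from F_1 = Y_1. The last entry is the forecast for the
    first period after the data ends.
    """
    forecasts = [y[0]]  # F_1 = Y_1
    for i in range(len(y)):
        # F_{t+1} = alpha * Y_t + (1 - alpha) * F_t
        forecasts.append(alpha * y[i] + (1 - alpha) * forecasts[i])
    return forecasts


ses = simple_exponential_smoothing([10, 20], alpha=0.5)
print(ses)
```

Because $F_1 = Y_1$, the second forecast always equals the first observation regardless of α.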

[Figure: Exponential Smoothing Forecast. Demand and exponential smoothing forecast for the DAD data, plotted by date, with $F_{t+1} = 0.8098 \, Y_t + (1 - 0.8098) \, F_t$.]

DAD Forecasting: MAPE and RMSE

Exponential Smoothing (α = 0.8098)

Mean absolute percentage error (MAPE): 0.0906 or 9.06%
Root mean squared error (RMSE): 5.3806

Choice of α

For smooth data, try a high value of α: the forecast is responsive to the most recent data.
For noisy data, try a low value of α: the forecast is more stable and less responsive.

Optimal α

Choose the value of α that minimizes the root mean squared error of the forecasts:

$$\min_{\alpha} \sqrt{\frac{1}{n} \sum_{t=1}^{n} (Y_t - F_t)^2}$$

where $F_t = \alpha Y_{t-1} + (1 - \alpha) F_{t-1}$.
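
One simple way to search for a good smoothing constant is a grid search over candidate values. The sketch below assumes RMSE as the error criterion and a grid of 101 candidates between 0 and 1; both are choices made for illustration, and the names `ses_rmse` and `best_alpha` are hypothetical. A spreadsheet solver or a numerical optimizer would serve the same purpose.

```python
import math


def ses_rmse(y, alpha):
    """RMSE of simple exponential smoothing forecasts, with F_1 = Y_1."""
    forecast = y[0]
    squared_errors = []
    for actual in y:
        squared_errors.append((actual - forecast) ** 2)
        # Update to the forecast for the next period
        forecast = alpha * actual + (1 - alpha) * forecast
    return math.sqrt(sum(squared_errors) / len(y))


def best_alpha(y, steps=100):
    """Grid search over alpha = 0, 1/steps, ..., 1 for the lowest RMSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: ses_rmse(y, a))


# Hypothetical steadily rising series: the most responsive alpha should win
trending = [float(v) for v in range(1, 11)]
alpha_star = best_alpha(trending)
print(alpha_star, ses_rmse(trending, alpha_star))
```

On a steadily trending series the search picks α = 1, which matches the guidance that smooth data favours a high α.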


Double Exponential Smoothing: Holt's Model

Simple exponential smoothing may produce consistently biased forecasts in the presence of a trend.
Holt's method (double exponential smoothing) is appropriate when demand has a trend but no seasonality.
Systematic component of demand = Level + Trend

Holt's Method

Holt's method can be used to forecast when there is a linear trend present in the data.
The method requires separate smoothing constants for the slope and the intercept.

Holt's Method

Holt's Equations

(i) Equation for the intercept or level:

$$L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$$

(ii) Equation for the slope (trend):

$$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$$

Forecast equation:

$$F_{t+1} = L_t + T_t$$

Initial values of $L_t$ and $T_t$

$L_1$ is in general set to $Y_1$.

$T_1$ can be set to any one of the following values (or use regression to get initial values):

$$T_1 = Y_2 - Y_1$$

$$T_1 = \frac{(Y_2 - Y_1) + (Y_3 - Y_2) + (Y_4 - Y_3)}{3}$$

$$T_1 = \frac{Y_n - Y_1}{n - 1}$$
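
Holt's level and trend updates can be sketched as follows. This is an illustrative implementation under stated assumptions: it initializes with $L_1 = Y_1$ and $T_1 = Y_2 - Y_1$ (the first of the options above), and the function name `holt_forecast` and the sample series are hypothetical.

```python
def holt_forecast(y, alpha, beta):
    """Holt's double exponential smoothing.

    Returns a list where forecasts[i] is the forecast for y[i] (0-based);
    forecasts[0] is None and the final entry is the forecast for the
    first period after the data ends.
    """
    level = y[0]          # L_1 = Y_1
    trend = y[1] - y[0]   # T_1 = Y_2 - Y_1
    forecasts = [None, level + trend]  # F_2 = L_1 + T_1
    for t in range(1, len(y)):
        new_level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        forecasts.append(level + trend)  # F_{t+1} = L_t + T_t
    return forecasts


# Hypothetical perfectly linear series: forecasts should track it closely
linear = [2, 4, 6, 8]
holt = holt_forecast(linear, alpha=0.8, beta=0.05)
print(holt)
```

On a perfectly linear series the level and trend never need correcting, so each forecast continues the line.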

[Figure: Double exponential smoothing. Demand and forecast for the DAD data, with α = 0.8098 and β = 0.05.]

DAD Forecasting: MAPE and RMSE

Double Exponential Smoothing (α = 0.8098; β = 0.05)

Mean absolute percentage error (MAPE): 0.0930 or 9.3%
Root mean squared error (RMSE): 5.5052


Forecasting Accuracy

The forecast error is the difference between the actual value and the forecast value for the corresponding period.

$$E_t = Y_t - F_t$$

$E_t$ = forecast error at period t
$Y_t$ = actual value at time period t
$F_t$ = forecast for time period t


Forecasting Power of a Model

Theil's coefficient (U Statistic)

$$U = \frac{\sum_{t=1}^{n-1} (Y_{t+1} - F_{t+1})^2}{\sum_{t=1}^{n-1} (Y_{t+1} - Y_t)^2}$$

The value U is the relative forecasting power of the method against the naïve technique, which uses the current value as the forecast for the next period. If U < 1, the technique is better than the naïve forecast. If U > 1, the technique is no better than the naïve forecast.
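
A short sketch of the U statistic, written here as a ratio of sums of squared errors. Some texts take the square root of this ratio, which does not change the comparison against 1. The function name `theils_u` and the sample values are assumptions for illustration.

```python
def theils_u(actual, forecast):
    """Theil's U: squared forecast errors relative to the naive forecast.

    forecast[i] is the forecast for actual[i]. Index 0 is skipped because
    the naive forecast (the previous actual value) does not exist for it.
    """
    numerator = sum((actual[i] - forecast[i]) ** 2
                    for i in range(1, len(actual)))
    denominator = sum((actual[i] - actual[i - 1]) ** 2
                      for i in range(1, len(actual)))
    return numerator / denominator


# Hypothetical data: the naive forecast itself must give U = 1
actual = [10, 12, 11, 15]
naive = [None, 10, 12, 11]
u_naive = theils_u(actual, naive)
u_perfect = theils_u(actual, [None, 12, 11, 15])
print(u_naive, u_perfect)
```

Feeding in the naïve forecast gives exactly 1, which is a convenient sanity check for any implementation.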

Theil's coefficient for DAD Hospital

Method                           Theil's U
Moving Average with 7 periods    1.221
Exponential Smoothing            1.704
Double Exponential Smoothing     1.0310

Theil's Coefficient

$$U_1 = \frac{\sqrt{\sum_{t=1}^{n} (Y_t - F_t)^2}}{\sqrt{\sum_{t=1}^{n} Y_t^2} + \sqrt{\sum_{t=1}^{n} F_t^2}}$$

$$U_2 = \frac{\sqrt{\sum_{t=1}^{n-1} \left( \frac{F_{t+1} - Y_{t+1}}{Y_t} \right)^2}}{\sqrt{\sum_{t=1}^{n-1} \left( \frac{Y_{t+1} - Y_t}{Y_t} \right)^2}}$$

$U_1$ is bounded between 0 and 1, with values closer to zero indicating greater accuracy.
If $U_2 = 1$, there is no difference between the naïve forecast and the forecasting technique.
If $U_2 < 1$, the technique is better than the naïve forecast.
If $U_2 > 1$, the technique is no better than the naïve forecast.

Seasonal Effect

Seasonal effect is defined as the repetitive and predictable pattern of data behaviour in a time series around the trend line.
In a seasonal effect the period must be less than one year, such as days, weeks, months or quarters.

Seasonal effect

Identification of the seasonal effect provides a better understanding of the time series data.
The seasonal effect can be eliminated from the time series. This process is called deseasonalization or seasonal adjustment.

Seasonal Adjusting

Seasonal adjustment in the multiplicative model:

$$\text{Seasonal effect} = \frac{T_t \times S_t}{T_t} \times 100 = \frac{Y_t}{T_t} \times 100$$

Seasonal adjustment in the additive model:

$$Y_t - S_t = T_t + S_t - S_t = T_t$$


Seasonal Index
Method of simple averages
Ratio-to-moving average method

Method of simple averages

Average the unadjusted data by period (for example, by day of the week or by month).
Calculate the average of the daily (or monthly) averages.
The seasonal index for day i (or month i) is the ratio of the average for day i (or month i) to the average of the daily (or monthly) averages, times 100.

Example: DAD Hospital, Demand for Continental Breakfast

Data for 5 October to 1 November 2014

DAY          Week 1        Week 2         Week 3         Week 4
             (5-11 Oct)    (12-18 Oct)    (19-25 Oct)    (26 Oct-1 Nov)
Sunday       41.00         40.00          40.00          41.00
Monday       30.00         44.00          43.00          41.00
Tuesday      40.00         49.00          41.00          40.00
Wednesday    40.00         50.00          46.00          43.00
Thursday     40.00         45.00          41.00          46.00
Friday       40.00         40.00          40.00          45.00
Saturday     40.00         42.00          32.00          45.00

Seasonality Index

DAY          Week 1        Week 2         Week 3         Week 4            Daily      Seasonality
             (5-11 Oct)    (12-18 Oct)    (19-25 Oct)    (26 Oct-1 Nov)    Average    Index
Sunday       41.00         40.00          40.00          41.00             40.50      97.34%
Monday       30.00         44.00          43.00          41.00             39.50      94.94%
Tuesday      40.00         49.00          41.00          40.00             42.50      102.15%
Wednesday    40.00         50.00          46.00          43.00             44.75      107.55%
Thursday     40.00         45.00          41.00          46.00             43.00      103.34%
Friday       40.00         40.00          40.00          45.00             41.25      99.14%
Saturday     40.00         42.00          32.00          45.00             39.75      95.54%

Average of daily averages = 41.61
Total of the seasonality indices = 700 (Total = number of seasons x 100)
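
The method of simple averages can be reproduced on the four weeks of DAD data with a short Python sketch. The variable names (`weeks`, `daily_avg`, `index`, `deseason_week1`) are chosen for illustration; the numbers are the demand values from the table.

```python
# Four weeks of demand, each week listed Sunday to Saturday
weeks = [
    [41, 30, 40, 40, 40, 40, 40],  # 5-11 October
    [40, 44, 49, 50, 45, 40, 42],  # 12-18 October
    [40, 43, 41, 46, 41, 40, 32],  # 19-25 October
    [41, 41, 40, 43, 46, 45, 45],  # 26 October - 1 November
]
days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

# Step 1: average demand for each day of the week
daily_avg = [sum(week[d] for week in weeks) / len(weeks) for d in range(7)]

# Step 2: average of the daily averages
overall_avg = sum(daily_avg) / len(daily_avg)

# Step 3: seasonal index = daily average / overall average * 100
index = [100 * avg / overall_avg for avg in daily_avg]

# Deseasonalized data for week 1: actual demand divided by its index
deseason_week1 = [weeks[0][d] / (index[d] / 100) for d in range(7)]

for name, avg, idx in zip(days, daily_avg, index):
    print(f"{name:<10} {avg:6.2f} {idx:7.2f}%")
```

The indices sum to 700 by construction, since each is a daily average divided by the mean of the seven daily averages.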

Deseasonalized Data

Date          Demand    Seasonal Index    De-seasonalized Data
10/5/2014     41        97.34%            42.12
10/6/2014     30        94.94%            31.60
10/7/2014     40        102.15%           39.16
10/8/2014     40        107.55%           37.19
10/9/2014     40        103.34%           38.70
10/10/2014    40        99.14%            40.35
10/11/2014    40        95.54%            41.87

Trend

Trend is calculated using regression on the deseasonalized data.
Deseasonalized data is obtained by dividing the actual data by its seasonality index.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.268288
R Square             0.071978
Adjusted R Square    0.036285
Standard Error       3.715395
Observations         28

ANOVA
              df    SS          MS          F           Significance F
Regression    1     27.83729    27.83729    2.016587    0.167469
Residual      26    358.9082    13.80416
Total         27    386.7455

             Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept    39.81731        1.442768          27.59786    8.7E-21     36.85166     42.78296
Day          0.123437        0.086923          1.420066    0.167469    -0.05524     0.30211

Forecast

$F_t = T_t \times S_t$
$F_{29} = T_{29} \times S_{29}$
$F_{29} = (39.81 + 0.1234 \times 29) \times 0.9733 = 42.2372$

Actual value: $Y_{29} = 46$
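
The trend regression and the day-29 forecast can be reproduced with a short sketch using plain least squares. The variable names are illustrative, and the code assumes day 1 is Sunday 5 October 2014, so day 29 falls on a Sunday. Small differences from the slide's figures come from rounding.

```python
# Four weeks of demand, each week listed Sunday to Saturday
weeks = [
    [41, 30, 40, 40, 40, 40, 40],
    [40, 44, 49, 50, 45, 40, 42],
    [40, 43, 41, 46, 41, 40, 32],
    [41, 41, 40, 43, 46, 45, 45],
]

# Seasonal indices (as fractions) by the method of simple averages
daily_avg = [sum(week[d] for week in weeks) / len(weeks) for d in range(7)]
overall_avg = sum(daily_avg) / 7
seasonal = [avg / overall_avg for avg in daily_avg]

# Deseasonalized series in time order, day 1 to day 28
deseason = [week[d] / seasonal[d] for week in weeks for d in range(7)]
t = list(range(1, len(deseason) + 1))

# Ordinary least squares for deseason = intercept + slope * day
t_mean = sum(t) / len(t)
y_mean = sum(deseason) / len(deseason)
slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, deseason))
         / sum((ti - t_mean) ** 2 for ti in t))
intercept = y_mean - slope * t_mean

# Forecast = trend x seasonal index for the matching day of the week
day = 29
forecast_29 = (intercept + slope * day) * seasonal[(day - 1) % 7]
print(round(intercept, 4), round(slope, 4), round(forecast_29, 2))
```

The forecast multiplies the trend value for day 29 by the Sunday index, following $F_t = T_t \times S_t$.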

Forecasting using method of averages in the presence of seasonality

Auto-correlation

Auto-correlation is the correlation of a variable observed at two time points (e.g. $Y_t$ and $Y_{t-1}$, or $Y_t$ and $Y_{t-3}$).

The auto-correlation of lag k, $\rho_k$, is given by:

$$\rho_k = \frac{\sum_{t=k+1}^{n} (Y_{t-k} - \bar{Y})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$$

where n is the total number of observations.

Auto-correlation Function (ACF)

A plot of the autocorrelations for lags 1 to k is called the autocorrelation function (ACF) or a correlogram.

Auto-Correlation

The auto-correlation of lag k is the auto-correlation between $Y_t$ and $Y_{t+k}$.

To test whether the autocorrelation at lag k is significantly different from 0, the following hypothesis test is used:
$H_0: \rho_k = 0$
$H_A: \rho_k \neq 0$
For any k, reject $H_0$ if $|\rho_k| > 1.96 / \sqrt{n}$, where n is the number of observations.
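
A minimal sketch of the lag-k autocorrelation and the significance check, written in plain Python. The function names `autocorrelation` and `is_significant` and the five-point series are assumptions for the example; the cut-off is $1.96 / \sqrt{n}$.

```python
import math


def autocorrelation(y, k):
    """Sample autocorrelation of the series y at lag k."""
    n = len(y)
    mean = sum(y) / n
    numerator = sum((y[t - k] - mean) * (y[t] - mean) for t in range(k, n))
    denominator = sum((value - mean) ** 2 for value in y)
    return numerator / denominator


def is_significant(r_k, n):
    """Reject H0 (autocorrelation = 0) when |r_k| exceeds 1.96 / sqrt(n)."""
    return abs(r_k) > 1.96 / math.sqrt(n)


# Hypothetical short series, for illustration only
series = [1, 2, 3, 4, 5]
r1 = autocorrelation(series, 1)
print(r1, is_significant(r1, len(series)))
```

With only five observations the cut-off is about 0.88, so even a visibly trending series does not show a significant lag-1 autocorrelation; real applications need far more data.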

[Figure: ACF for Demand for Continental Breakfast]

Partial Auto-correlation

The partial auto-correlation of lag k is the auto-correlation between $Y_t$ and $Y_{t-k}$ with the linear dependence on the intermediate values ($Y_{t-k+1}, Y_{t-k+2}, \dots, Y_{t-1}$) removed.

Partial Auto-correlation Function

A plot of the partial autocorrelations for lags 1 to k is called the partial autocorrelation function (PACF).

Partial Auto-Correlation

The partial auto-correlation of lag k is the auto-correlation between $Y_t$ and $Y_{t+k}$ after the removal of the linear dependence on $Y_{t+1}$ to $Y_{t+k-1}$.
To test whether the partial autocorrelation at lag k is significantly different from 0, the following hypothesis test is used:
$H_0: \rho_{pk} = 0$
$H_A: \rho_{pk} \neq 0$
For any k, reject $H_0$ if $|\rho_{pk}| > 1.96 / \sqrt{n}$, where n is the number of observations.

[Figure: PACF for Demand for Continental Breakfast at DAD hospital]

Stationarity

A time series is stationary if:
The mean (μ) is constant over time.
The variance (σ²) is constant over time.
The covariance between two time periods ($Y_t$ and $Y_{t+k}$) depends only on the lag k, not on the time t.

We assume that the time series is stationary before applying forecasting models.

[Figure: ACF plots of a non-stationary process and a stationary process]

White Noise

White noise is data that is uncorrelated across time and follows a normal distribution with mean 0 and constant standard deviation σ.
In forecasting we assume that the residuals are white noise.

[Figure: Residual White Noise]

Auto-Regression

Auto-regression is a regression of Y on itself observed at different time points.
That is, we use $Y_t$ as the response variable and $Y_{t-1}$, $Y_{t-2}$, etc. as explanatory variables.

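AR(1) Parameter Estimation

$$Y_t = \phi Y_{t-1} + \varepsilon_t$$

The ordinary least squares (OLS) estimate of $\phi$ minimizes the sum of squared errors:

$$\sum_{t=2}^{n} \varepsilon_t^2 = \sum_{t=2}^{n} (Y_t - \phi Y_{t-1})^2$$

OLS estimate:

$$\hat{\phi} = \frac{\sum_{t=2}^{n} Y_t Y_{t-1}}{\sum_{t=2}^{n} Y_{t-1}^2}$$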
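
The OLS estimate for an AR(1) model without an intercept is a ratio of two sums, which the following sketch computes directly. The function name `ar1_estimate` and the sample series are hypothetical; the series is built so that each value is exactly half the previous one.

```python
def ar1_estimate(y):
    """OLS estimate of the AR(1) coefficient (no intercept).

    Ratio of sum(Y_t * Y_{t-1}) to sum(Y_{t-1} ** 2) over t = 2..n.
    """
    numerator = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    denominator = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return numerator / denominator


# Hypothetical noise-free series with coefficient 0.5
halving = [1.0, 0.5, 0.25, 0.125]
coefficient = ar1_estimate(halving)
print(coefficient)
```

For real data the series would normally be mean-adjusted first, or an intercept term would be included in the regression.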

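AR(1) Process

$$Y_t = \phi Y_{t-1} + \varepsilon_t$$

$$Y_t = \phi (\phi Y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t$$

$$Y_t = \phi^t X_0 + \phi^{t-1} \varepsilon_1 + \phi^{t-2} \varepsilon_2 + \dots + \phi \varepsilon_{t-1} + \varepsilon_t$$

$$Y_t = \phi^t X_0 + \sum_{i=0}^{t-1} \phi^i \varepsilon_{t-i}$$

where $X_0$ is the initial value of the series.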

Auto-regressive process (AR(p))

Assume $\{\varepsilon_t\}$ is purely random with mean zero and constant standard deviation (white noise).
Then the autoregressive process of order p, or AR(p) process, is

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$

The AR(p) process models each future observation as a function of the p previous observations.

Model Identification in AR(p) Process

Pure AR Model Identification

AR(1)
ACF: Exponential decay, on the positive side if $\phi_1 > 0$ and alternating in sign, starting on the negative side, if $\phi_1 < 0$.
PACF: Spike at lag 1, then cuts off to zero. The spike is positive if $\phi_1 > 0$ and negative if $\phi_1 < 0$.

AR(p)
ACF: Exponential decay; the pattern depends on the signs of $\phi_1$, $\phi_2$, etc.
PACF: Spikes at lags 1 to p, then cuts off to zero.

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$


[Figure: PACF for DAD Continental Breakfast demand. The first two partial autocorrelations are different from zero.]

[Figure: ACF for DAD Continental Breakfast demand]

[Figure: Actual vs fit, Continental Breakfast data]

AR(2) Model

MAPE = 8.8%

[Figure: Residual White Noise]

Fitted AR(2) model:

$$(Y_t - 41.252) = 0.663 \, (Y_{t-1} - 41.252) - 0.204 \, (Y_{t-2} - 41.252)$$

The model is expressed in mean-adjusted form, $(X_t - \mu)$, with $\mu = 41.252$.

Moving Average Process

A moving average process is a time series regression model in which the value at time t, $Y_t$, is a linear function of past errors.

The first order moving average, MA(1), is given by:

$$Y_t = \theta_0 + \theta_1 \varepsilon_{t-1} + \varepsilon_t$$

Moving Average Process MA(q)

$\{Y_t\}$ is a moving average process of order q (written MA(q)) if, for some constants $\theta_0, \theta_1, \dots, \theta_q$, we have

$$Y_t = \theta_0 + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$

MA(q) models each future observation as a function of the q previous errors.

Pure MA Model Identification

MA(1)
ACF: Spike at lag 1, then cuts off to zero. The spike is positive if $\theta_1 > 0$ and negative if $\theta_1 < 0$.
PACF: Exponential decay, on the negative side if $\theta_1 > 0$ and on the positive side if $\theta_1 < 0$.

MA(q)
ACF: Spikes at lags 1 to q, then cuts off to zero.
PACF: Exponential decay or sine wave. The exact pattern depends on the signs of $\theta_1$, $\theta_2$, etc.

[Figure: ACF for DAD Continental Breakfast demand]

[Figure: SPSS output]

[Figure: Actual vs fitted values]

[Figure: Residual plot]

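ARMA(p,q) Model

$$Y_t = \underbrace{\phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p}}_{\text{AR(p) model}} + \underbrace{\theta_0 + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t}_{\text{MA(q) model}}$$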

ARMA(p,q) Model Identification

ARMA(p,q) models are not easy to identify. We usually start with pure AR and MA processes. The following rule of thumb may be used.

Process      ACF                       PACF
ARMA(p,q)    Tails off after q lags    Tails off to zero after p lags

The final ARMA model may be selected based on measures such as RMSE, MAPE, AIC and BIC.

[Figure: ACF and PACF plots]

[Figure: ARMA(2,1) model output]

AR(2) Model

MAPE = 8.8%

[Figure: Residual White Noise]

Forecasting Model Evaluation

Akaike's information criterion:

$$\text{AIC} = -2LL + 2m$$

Bayesian information criterion:

$$\text{BIC} = -2LL + m \ln(n)$$

where LL is the log-likelihood, m is the number of parameters estimated in the model and n is the number of observations.

AIC and BIC can be interpreted as the distance from the true model.
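
The two criteria are simple to compute once the log-likelihood of a fitted model is known. In this sketch the function names `aic` and `bic` and the numbers passed in (a log-likelihood of -100, 3 parameters, 115 observations) are hypothetical and chosen only to show the arithmetic.

```python
import math


def aic(log_likelihood, m):
    """Akaike's information criterion: -2LL + 2m."""
    return -2 * log_likelihood + 2 * m


def bic(log_likelihood, m, n):
    """Bayesian information criterion: -2LL + m ln(n)."""
    return -2 * log_likelihood + m * math.log(n)


# Hypothetical fitted model: LL = -100, 3 parameters, 115 observations
aic_value = aic(-100, 3)
bic_value = bic(-100, 3, 115)
print(aic_value, round(bic_value, 3))
```

Lower values are preferred for both criteria. BIC penalizes extra parameters more heavily than AIC whenever ln(n) exceeds 2, that is, for n of 8 or more.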

Final Model Selection

Model        BIC
AR(2)        3.295
MA(1)        3.301
ARMA(2,1)    3.343

AR(2) has the least BIC.


ARIMA
ARIMA has the following three components:

Auto-regressive component: Function of past values of the time series.


Integration Component: Differencing the time series to make it a stationary process.
Moving Average Component: Function of past error values.

Integration (d)

Used when the process is non-stationary.
Instead of the observed values, differences between observed values are considered.
When d = 0, the observations are modelled directly. If d = 1, the differences between consecutive observations are modelled. If d = 2, the differences of the differences are modelled.

ARIMA (p, d, q)

The q and p values are identified using the auto-correlation function (ACF) and the partial auto-correlation function (PACF), respectively. The value d identifies the level of differencing.

Usually p + q <= 4 and d <= 2.

Differencing

Differencing is a process of converting a non-stationary process into a stationary process.
In differencing, we create a new process $X_t$, where $X_t = Y_t - Y_{t-1}$.
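
Differencing is easy to sketch in code. The function name `difference` and the quadratic sample series are assumptions for illustration; applying the operation d times corresponds to the d in ARIMA(p, d, q).

```python
def difference(y, d=1):
    """Apply first differencing d times: X_t = Y_t - Y_{t-1}."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y


# Hypothetical series with a quadratic trend
squares = [1, 4, 9, 16, 25]
first_diff = difference(squares)        # still trending
second_diff = difference(squares, d=2)  # constant
print(first_diff, second_diff)
```

Each round of differencing shortens the series by one observation, and here two rounds are needed to remove the quadratic trend.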

ARIMA(p,1,q) Process

$$X_t = \phi_0 + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \theta_0 + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$$

where $X_t = Y_t - Y_{t-1}$.

Ljung-Box Test

Ljung-Box is a test on the autocorrelations of the residuals. The test statistic is:

$$Q_m = n(n + 2) \sum_{k=1}^{m} \frac{r_k^2}{n - k}$$

n = number of observations in the time series
k = particular time lag checked
m = number of time lags to be tested
$r_k$ = sample autocorrelation of the residuals at lag k

$H_0$: The model does not exhibit lack of fit
$H_A$: The model exhibits lack of fit

The Q statistic approximately follows a chi-square distribution with m − p − q degrees of freedom if the ARMA orders are correctly specified.
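
The Q statistic can be computed from the residuals with a few lines of plain Python. The function name `ljung_box_q` and the toy residuals are hypothetical. The sketch stops at the statistic itself; in practice the result would be compared with a chi-square critical value (or converted to a p-value) using m − p − q degrees of freedom.

```python
def ljung_box_q(residuals, m):
    """Ljung-Box Q statistic over lags 1..m of the residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    denominator = sum((e - mean) ** 2 for e in residuals)
    total = 0.0
    for k in range(1, m + 1):
        # Sample autocorrelation of the residuals at lag k
        r_k = sum((residuals[t] - mean) * (residuals[t - k] - mean)
                  for t in range(k, n)) / denominator
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals, for illustration only
toy_residuals = [1, 2, 3, 4, 5]
q_stat = ljung_box_q(toy_residuals, m=1)
print(q_stat)
```

A large Q relative to the chi-square critical value suggests the residuals are still autocorrelated, so the model shows lack of fit.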

AR(2) Ljung-Box Test

The p-value is 0.740, thus the model doesn't show lack of fit.

Box-Jenkins Methodology

Identification: Identify the ARIMA model using ACF and PACF plots. This gives the values of p, q and d.
Estimation: Estimate the model parameters (using maximum likelihood).
Diagnostics: Check the residuals for any issue, such as not being white noise.
