
Forecasting
Lecturer: Prof. Duane S. Boning
Rev 8

Regression Review & Extensions


Single model coefficient: linear dependence
Slope and intercept (or offset)
Polynomial and higher-order models: multiple parameters
Key point: linear regression can be used as long as the model is linear in the coefficients (the form of the dependence on the independent variable doesn't matter)
Time dependencies: explicit and implicit
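A minimal sketch of the key point above: a quadratic model y = b0 + b1·x + b2·x² is nonlinear in x but linear in the coefficients, so ordinary least squares still applies. (The data and true coefficients here are illustrative, not from the lecture.)

```python
import numpy as np

# Synthetic data from a known quadratic: y = 1 + 2x + 0.5x^2, plus small noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 1.0 + 2.0 * x + 0.5 * x**2 + 0.01 * rng.standard_normal(x.size)

# Design matrix with columns [1, x, x^2]: nonlinear in x, linear in coefficients
X = np.column_stack([np.ones_like(x), x, x**2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef  # recovers ~(1.0, 2.0, 0.5)
```

The same mechanics extend to any basis (exponentials, interactions) as long as the coefficients enter linearly.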

Agenda

1. Regression
   Polynomial regression
   Example (using Excel)
2. Time Series Data & Time Series Regression
   Autocorrelation; ACF
   Example: white noise sequences
   Example: autoregressive sequences
   Example: moving average
   ARIMA modeling and regression
3. Forecasting Examples

Time Series: Time as an Implicit Parameter

Data are often collected with a time order. An underlying dynamic process (e.g., due to the physics of a manufacturing process) may create autocorrelation in the data.

[Figure: two series plotted against time — an uncorrelated series and an autocorrelated series]

Intuition: Where Does Autocorrelation Come From?

Consider a chamber with volume V, with gas flowing in and flowing out at rate f. We are interested in the concentration x at the output, in relation to a known input concentration w.
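A minimal sketch of this intuition, assuming a well-mixed chamber obeying the mass balance V·dx/dt = f·(w − x). Euler-discretizing with step Δt gives x[i+1] = (1 − a)·x[i] + a·w[i] with a = f·Δt/V, so even a white-noise input produces an autocorrelated output:

```python
import numpy as np

# Assumed well-mixed chamber model: V * dx/dt = f * (w - x)
# Euler discretization: x[i+1] = (1 - a) * x[i] + a * w[i], with a = f*dt/V
rng = np.random.default_rng(1)
V, f, dt = 10.0, 1.0, 0.5
a = f * dt / V  # 0.05 -> memory coefficient (1 - a) = 0.95

n = 5000
w = rng.standard_normal(n)  # white-noise input concentration
x = np.zeros(n)
for i in range(n - 1):
    x[i + 1] = (1 - a) * x[i] + a * w[i]

# Lag-1 sample autocorrelation of the output: close to (1 - a) = 0.95
x0 = x - x.mean()
r1 = np.dot(x0[:-1], x0[1:]) / np.dot(x0, x0)
```

The chamber's memory of past inputs is exactly what autoregressive behavior models.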

Key Tool: Autocorrelation Function (ACF)

Time series data: time index i.
ACF: auto-correlation function r(k).
CCF: cross-correlation function.
The ACF shows the similarity of a signal to a lagged version of the same signal.

[Figure: an example time series, and its ACF r(k) plotted against lag k]
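A sample ACF is simple to compute directly; this sketch normalizes by the lag-0 autocovariance so that r(0) = 1:

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation r(k) for k = 0..max_lag, normalized so r(0) = 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    c0 = np.dot(x, x)  # lag-0 autocovariance (times n)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / c0
                     for k in range(max_lag + 1)])

# For a white-noise series, the ACF is ~0 at every nonzero lag
rng = np.random.default_rng(2)
r = acf(rng.standard_normal(1000), max_lag=10)
```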

Stationary vs. Non-Stationary

Stationary series: the process has a fixed mean.

[Figure: a stationary series fluctuating about a fixed mean, and a non-stationary series whose mean drifts over time]

White Noise: An Uncorrelated Series

Data drawn from an IID Gaussian distribution.
ACF: we also plot the 3-sigma limits; values within these limits are not significant.
Note that r(0) = 1 always: a signal is always equal to itself with zero lag — perfectly autocorrelated at k = 0.

[Figure: white noise series with sample mean and sample variance annotated, and its ACF lying within the significance limits]
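The significance limits can be sketched numerically. Here the "3-sigma limits" are assumed to be ±3/√n, the usual large-sample bound for the ACF of white noise:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.standard_normal(n)  # IID Gaussian: white noise

# Sample ACF for lags 0..20
xc = x - x.mean()
c0 = np.dot(xc, xc)
r = np.array([np.dot(xc[: n - k], xc[k:]) / c0 for k in range(21)])

# Assumed 3-sigma significance limits for white noise: +/- 3/sqrt(n)
limit = 3.0 / np.sqrt(n)
inside = np.abs(r[1:]) < limit  # nearly all nonzero lags fall inside
```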

Autoregressive Disturbances

Generated by an autoregressive recursion: each value depends on the previous value plus a noise term.

[Figure: AR series vs. time with mean and variance annotated, and its ACF]

Slow drop in the ACF when the autoregressive coefficient is large.

So AR (autoregressive) behavior increases the variance of the signal.
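A sketch of the variance increase, assuming the standard AR(1) form x[i] = φ·x[i−1] + ε[i] (the slide's generating equation is not reproduced here). The stationary variance is σ²/(1 − φ²), larger than the noise variance σ²:

```python
import numpy as np

# Assumed AR(1) form: x[i] = phi * x[i-1] + e[i], with |phi| < 1
rng = np.random.default_rng(4)
phi, n = 0.8, 20000
e = rng.standard_normal(n)
x = np.zeros(n)
for i in range(1, n):
    x[i] = phi * x[i - 1] + e[i]

# Discard the start-up transient; theory: Var(x)/Var(e) = 1/(1 - phi^2) ~= 2.78
var_ratio = x[500:].var() / e.var()
```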

Another Autoregressive Series

Generated by an autoregressive recursion with a negative coefficient, giving high negative autocorrelation.

[Figure: AR series alternating in sign from sample to sample, and its ACF]

Slow drop in the ACF when the coefficient magnitude is large — but now the ACF alternates in sign.

Random Walk Disturbances

Generated by an autoregressive recursion with coefficient equal to 1.

[Figure: random walk series with drifting mean and growing variance, and its ACF]

Very slow drop in the ACF when the coefficient = 1.
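A random walk is the AR case with coefficient exactly 1; a sketch showing that its variance grows with time, so neither the mean nor the variance is fixed:

```python
import numpy as np

# Random walk: x[i] = x[i-1] + e[i], i.e. a cumulative sum of white noise
rng = np.random.default_rng(5)
n_paths, n_steps = 2000, 400
e = rng.standard_normal((n_paths, n_steps))
x = np.cumsum(e, axis=1)  # many independent random-walk paths

# Variance across paths grows linearly with time: Var(x_t) = t * sigma^2
v_early = x[:, 49].var()   # ~50
v_late = x[:, 399].var()   # ~400
```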

Moving Average Sequence

Generated by a weighted sum of the current and previous noise terms.

[Figure: MA series with mean and variance annotated, and its ACF with a jump at r(1)]

Jump in the ACF at a specific lag.

So MA (moving average) behavior also increases the variance of the signal.
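A sketch of the ACF "jump," assuming the standard MA(1) form x[i] = ε[i] + θ·ε[i−1] (again, the slide's generating equation is not reproduced here). Theory gives r(1) = θ/(1 + θ²) and r(k) ≈ 0 for k > 1:

```python
import numpy as np

# Assumed MA(1) form: x[i] = e[i] + theta * e[i-1]
rng = np.random.default_rng(6)
theta, n = 0.9, 20000
e = rng.standard_normal(n)
x = e[1:] + theta * e[:-1]

# Sample ACF for lags 0..5
xc = x - x.mean()
c0 = np.dot(xc, xc)
r = np.array([np.dot(xc[: len(xc) - k], xc[k:]) / c0 for k in range(6)])
# Theory: r(1) = theta / (1 + theta^2) ~= 0.497; r(k) ~= 0 for k > 1
```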

ARMA Sequence

Generated by a model with both AR and MA behavior.

[Figure: ARMA series vs. time, and its ACF]

Slow drop in the ACF when the AR coefficient is large.

ARIMA Sequence

Start with an ARMA sequence, then add random walk (integrative) action — the Integrated (I) behavior.

[Figure: ARIMA series wandering over a wide range, and its ACF]

Very slow drop in the ACF.
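Differencing undoes the integrated behavior; a sketch using a pure random walk for simplicity (the ARMA part is omitted): the raw series has an ACF that barely decays, while the differenced series looks like white noise again.

```python
import numpy as np

# Integrated (I) behavior: cumulative sum of a stationary series.
# Differencing, d[i] = x[i] - x[i-1], removes the integration.
rng = np.random.default_rng(7)
e = rng.standard_normal(5000)
x = np.cumsum(e)   # nonstationary: very slow ACF decay
d = np.diff(x)     # differenced: recovers the white-noise increments

def acf1(s):
    """Lag-1 sample autocorrelation."""
    s = s - s.mean()
    return np.dot(s[:-1], s[1:]) / np.dot(s, s)

r1_x = acf1(x)  # near 1
r1_d = acf1(d)  # near 0
```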

Periodic Signal with Autoregressive Noise

[Figure: the original signal vs. time and its ACF; the signal after differencing vs. time and its ACF]

After differencing, we see the underlying signal with period = 5.
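The period can be read off the ACF; a sketch with an assumed period-5 sinusoid plus noise (illustrative parameters, not the lecture's data): the ACF peaks at the lag equal to the period.

```python
import numpy as np

# Periodic component with period 5, plus white noise
rng = np.random.default_rng(8)
n = 400
t = np.arange(n)
x = 3.0 * np.sin(2 * np.pi * t / 5) + rng.standard_normal(n)

# Sample ACF at lags 1..7; the largest value occurs at the period
xc = x - x.mean()
c0 = np.dot(xc, xc)
r = np.array([np.dot(xc[: n - k], xc[k:]) / c0 for k in range(1, 8)])
peak_lag = 1 + int(np.argmax(r))  # = 5
```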

Cross-Correlation: A Leading Indicator

Now we have two series: an input or explanatory variable x, and an output variable y.

[Figure: the x and y series vs. time, and the cross-correlation function r_xy(k) vs. lag]

The CCF indicates both the AR behavior and the lag.
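A sketch of using the CCF to find the lag, with an assumed 3-step delay between input and output (illustrative, not the lecture's data): the sample cross-correlation peaks at the true delay.

```python
import numpy as np

# Output y follows input x with a 3-step delay, plus noise
rng = np.random.default_rng(9)
n, lag = 2000, 3
x = rng.standard_normal(n)
y = np.roll(x, lag) + 0.3 * rng.standard_normal(n)
y[:lag] = 0.0  # zero out samples that wrapped around

# Sample cross-correlation r_xy(k) between x[i] and y[i+k], for k = 0..10
xc, yc = x - x.mean(), y - y.mean()
denom = np.sqrt(np.dot(xc, xc) * np.dot(yc, yc))
r_xy = np.array([np.dot(xc[: n - k], yc[k:]) / denom for k in range(11)])
best_lag = int(np.argmax(r_xy))  # = 3: x is a leading indicator of y
```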

Regression & Time Series Modeling

The ACF and CCF are helpful tools in selecting an appropriate model structure:
Autoregressive terms? x_i depends on x_{i-1}.
Lag terms? y_i depends on x_{i-k}.

One can structure the data and perform regressions:
Estimate model coefficient values, significance, and confidence intervals.
Determine confidence intervals on the output.
Check residuals.
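The "structure data and regress" step can be sketched as building a design matrix of lagged columns and applying ordinary least squares. The model form and coefficients below are assumed for illustration:

```python
import numpy as np

# Assumed true model: y[i] = 0.7 * y[i-1] + 1.5 * x[i-2] + e[i]
rng = np.random.default_rng(10)
n = 3000
x = rng.standard_normal(n)
e = 0.1 * rng.standard_normal(n)
y = np.zeros(n)
for i in range(2, n):
    y[i] = 0.7 * y[i - 1] + 1.5 * x[i - 2] + e[i]

# Design matrix: an autoregressive column y[i-1] and a lag column x[i-2]
Y = y[2:]
X = np.column_stack([y[1:-1], x[:-2]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
a_hat, b_hat = coef  # recovers ~(0.7, 1.5)
```

In practice one would also examine the residuals of this fit: leftover autocorrelation signals that the model structure is incomplete.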

Statistical Modeling Summary

1. Statistical Fundamentals
   Sampling distributions
   Point and interval estimation
   Hypothesis testing
2. Regression
   ANOVA
   Nominal data: modeling of treatment effects (mean differences)
   Continuous data: least squares regression
3. Time Series Data & Forecasting
   Autoregressive, moving average, and integrative behavior
   Auto- and cross-correlation functions
   Regression and time-series modeling

MIT OpenCourseWare http://ocw.mit.edu

2.854 / 2.853 Introduction to Manufacturing Systems


Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
