
MLR Forecast Error Tutorial, Spider Financial Corp, 2014

TN: Forecast Error in Regression Models


Occasionally, we receive requests for a technical paper about regression modeling beyond our regular NumXL support, in order to delve more deeply into the mathematical formulation of MLR. We are always happy to address user requests, so we decided to share our internal technical notes with you.

These notes were originally composed when we sat in on a time series analysis class. Over the years, we've maintained these notes with new insights, empirical observations, and newly acquired intuitions. We often go back to these notes for resolving development issues or to properly address a product support matter.

In this paper, we'll go over a simple, yet fundamental and often asked question about forecast error in a regression model.

Background
Let's assume the true underlying model or process is defined as follows:

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$
Where
- $y$ is the dependent (response) variable.
- $\{x_1, x_2, \dots, x_k\}$ are the independent (explanatory) variables.
- $\alpha$ is the real intercept (constant).
- $\beta_j$ is the coefficient (loading) of the j-th independent variable.
- $\{\varepsilon\}$ is a set of independent, identically and normally distributed errors (residuals):

$$\varepsilon \sim \text{i.i.d.} \sim N(0, \sigma^2)$$

In practice, the true underlying model is unknown. However, with finite sample data and an OLS or other procedure, we can estimate the values of the coefficients (aka loadings) for the different input (explanatory) variables.

Let's assume we have a sample data set with N observations, i.e. $(x_{1,i}, x_{2,i}, \dots, x_{k,i}, y_i)$. Using an OLS method, we arrive at the following regression model:

$$y = \hat{\alpha} + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k + u$$
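The estimation step above can be sketched in a few lines of plain Python. This is a minimal illustration, not the NumXL implementation: the helper names `solve_linear_system` and `ols_fit` and the noise-free sample are assumptions made for the example, and the fit simply solves the normal equations by Gaussian elimination.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def ols_fit(xs, ys):
    """Return [alpha_hat, beta_1_hat, ..., beta_k_hat] from the normal equations."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 carries the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical sample: N = 5 observations, k = 2 regressors,
# generated without noise from y = 1 + 2*x1 + 3*x2.
xs = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (3.0, 1.0), (4.0, 3.0)]
ys = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in xs]
coefficients = ols_fit(xs, ys)  # [alpha_hat, beta_1_hat, beta_2_hat]
```

Because the sample is noise-free, the estimates recover the generating coefficients up to rounding. With real data the residual term $u$ is non-zero and the estimates differ from the true values.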


Where
- $\hat{\beta}_j$ is the OLS estimate for the j-th coefficient (loading).
- $\hat{\alpha}$ is the OLS estimate of the intercept.
- $\{u\}$ is the set of regression residuals. The residuals are homoscedastic (i.e. stable variance) and uncorrelated with any of the input variables:

$$\begin{aligned}
E[u] &= 0 \\
E[u^2] &= s^2 \\
E[u \, x_i] &= 0, \quad 1 \le i \le k
\end{aligned}$$
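The first and third residual conditions can be checked numerically on a fitted model. The sketch below uses a single regressor (k = 1) so that the closed-form OLS estimates fit in two lines. The data values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample with a single regressor (k = 1).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for simple regression.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]

# Sample analogues of E[u] and E[u x]; both vanish up to rounding.
mean_residual = sum(residuals) / n
mean_residual_times_x = sum(u * xi for u, xi in zip(residuals, x)) / n
```

Both quantities are zero by construction of the OLS fit with an intercept, so they hold for any sample and are not evidence that the model is well specified.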

Forecast
In practice, the true regression model is hidden or unknown. We will revert to the estimated regression model to perform a forecast.

Mathematically, the conditional forecast can be expressed as follows:

$$\hat{y} = E[Y \mid x_1, x_2, \dots, x_k] = \hat{\alpha} + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$$

As a result, the errors in the forecast originate from two distinct sources:
1. Residuals ($\{\varepsilon\}$ or $\{u\}$)
2. Errors in the estimated coefficient values (i.e. using $\hat{\beta}_j$ instead of $\beta_j$)

Using an OLS procedure, the estimated values of one $\hat{\beta}_j$ are normally distributed. Nevertheless, the errors in the values of the whole set of parameters $\{\hat{\beta}_j\}_{1 \le j \le k}$ are correlated. So, we can ignore the covariance terms when we examine the statistical significance of one coefficient, but we will need to factor in their overall/aggregate effect for the forecast error.

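As a result, the forecast variance (aka error squared) can be expressed as follows:

$$\mathrm{Var}[y_m - \hat{y}_m \mid x_{1,m}, x_{2,m}, \dots, x_{k,m}] = \sigma^2 \left( 1 + \frac{1}{N} + \frac{\sum_{j=1}^{k} (x_{j,m} - \bar{x}_j)^2}{\sum_{i=1}^{N} \sum_{j=1}^{k} (x_{j,i} - \bar{x}_j)^2} \right)$$

Here $m$ denotes the target data point and $\bar{x}_j$ is the sample mean of the j-th input variable.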


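However, the variance of residuals ($\sigma^2$) in the true model is unknown, so we use the variance of the error terms ($\hat{\sigma}^2$) of the estimated regression model:

$$\hat{\sigma}^2 = E[u^2] = E[(y - \hat{\alpha} - \hat{\beta}_1 x_1 - \hat{\beta}_2 x_2 - \dots - \hat{\beta}_k x_k)^2] = \frac{\sum_{i=1}^{N} u_i^2}{N - k - 1} = \frac{SSE}{N - k - 1}$$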
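The residual-variance estimate is a one-line computation once the residuals are available. In this minimal sketch the function name `residual_variance` and the residual values are assumptions made for illustration.

```python
def residual_variance(residuals, k):
    """Estimate sigma^2 as SSE / (N - k - 1) for k regressors plus an intercept."""
    n = len(residuals)
    dof = n - k - 1  # degrees of freedom left after estimating k + 1 parameters
    if dof <= 0:
        raise ValueError("need more than k + 1 observations")
    sse = sum(u * u for u in residuals)
    return sse / dof


# Hypothetical residuals from a fit with k = 2 regressors (N = 6).
u = [0.5, -0.5, 1.0, -1.0, 0.0, 0.0]
sigma2_hat = residual_variance(u, k=2)  # SSE = 2.5 and N - k - 1 = 3
```

Dividing by N - k - 1 rather than N accounts for the k + 1 parameters (intercept plus k coefficients) that were estimated from the same sample.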

Overall, the MLR forecast error squared is expressed as follows:

$$\mathrm{Var}[y_m - \hat{y}_m \mid x_{1,m}, x_{2,m}, \dots, x_{k,m}] = \frac{SSE}{N - k - 1} \left( 1 + \frac{1}{N} + \frac{\sum_{j=1}^{k} (x_{j,m} - \bar{x}_j)^2}{\sum_{i=1}^{N} \sum_{j=1}^{k} (x_{j,i} - \bar{x}_j)^2} \right)$$
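The expression above translates directly into code. The sketch below follows the scalar formula as written in this note. The function name `forecast_variance`, the sample, and the SSE value are hypothetical. For a single regressor this expression coincides with the standard OLS prediction variance. With several correlated regressors the standard result is usually written in matrix form, $\hat{\sigma}^2 (1 + x_0^\top (X^\top X)^{-1} x_0)$, which this sketch does not implement.

```python
def forecast_variance(xs, x_target, sse):
    """Forecast variance following the scalar expression in the text.

    xs: list of N sample rows, each holding k regressor values
    x_target: the k regressor values of the point being forecast
    sse: sum of squared residuals of the fitted model
    """
    n = len(xs)
    k = len(x_target)
    means = [sum(row[j] for row in xs) / n for j in range(k)]
    # Squared distance of the target from the sample center.
    numerator = sum((x_target[j] - means[j]) ** 2 for j in range(k))
    # Total squared deviation of the sample around its center.
    denominator = sum((row[j] - means[j]) ** 2 for row in xs for j in range(k))
    sigma2_hat = sse / (n - k - 1)
    return sigma2_hat * (1.0 + 1.0 / n + numerator / denominator)


# Hypothetical sample (N = 5, k = 2) with sample means (2.0, 1.4).
xs = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (3.0, 1.0), (4.0, 3.0)]
sse = 4.0  # assumed value, for illustration only

var_at_center = forecast_variance(xs, (2.0, 1.4), sse)
var_far_away = forecast_variance(xs, (6.0, 5.0), sse)
```

At the sample center the distance term vanishes, leaving only $\hat{\sigma}^2 (1 + 1/N)$. Any other target point yields a larger value.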


Now, let's take a close look at the formula above and try to explain the different terms:
1. $\hat{\sigma}^2$ is the estimated variance of the true regression model residuals. This value is constant and independent from the X value(s) of the target data point.
2. $\hat{\sigma}^2 / N$ is the error in the estimated intercept (aka constant). This value is constant and independent from the X values of the target data point.
3. The last term is proportional to the squared (Euclidean) distance of the target data point from the center of the sample data set. This term is zero at the sample data center point $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k)$.

In effect, the forecast variance is higher for data points $(x_{1,m}, x_{2,m}, \dots, x_{k,m})$ that are further from the center of the input sample data set (i.e. $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k)$).

As a result, the forecast error is smallest at the sample data center point $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k)$.
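To make the three terms concrete, the sketch below splits the forecast variance into its components for a single regressor (k = 1) and evaluates them at several target points. The function name `variance_terms`, the sample, and the residual variance are hypothetical values chosen for illustration.

```python
def variance_terms(x, x_target, sigma2_hat):
    """Split the forecast variance into the three terms discussed above (k = 1)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    residual_term = sigma2_hat       # term 1: constant
    intercept_term = sigma2_hat / n  # term 2: constant
    # Term 3: grows with the squared distance from the sample center.
    distance_term = sigma2_hat * (x_target - x_bar) ** 2 / sxx
    return residual_term, intercept_term, distance_term


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sample, center at 3.0
sigma2_hat = 2.0               # hypothetical residual variance

# Evaluate at the center, at the edge of the sample, and well outside it.
profile = {xt: variance_terms(x, xt, sigma2_hat) for xt in (3.0, 5.0, 9.0)}
```

The first two components are identical for every target point. Only the third changes, rising from zero at the center as the target moves away, which is why extrapolating beyond the sample range carries a larger forecast error.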
