You are on page 1of 17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Linearregression
FromWikipedia,thefreeencyclopedia

Instatistics,linearregressionisanapproachformodelingtherelationshipbetweenascalardependent
variableyandoneormoreexplanatoryvariables(orindependentvariables)denotedX.Thecaseofone
explanatoryvariableiscalledsimplelinearregression.Formorethanoneexplanatoryvariable,the
processiscalledmultiplelinearregression.[1](Thistermshouldbedistinguishedfrommultivariate
linearregression,wheremultiplecorrelateddependentvariablesarepredicted,ratherthanasinglescalar
variable.)[2]
Inlinearregression,therelationshipsaremodeledusinglinearpredictorfunctionswhoseunknown
modelparametersareestimatedfromthedata.Suchmodelsarecalledlinearmodels.[3]Mostcommonly,
theconditionalmeanofygiventhevalueofXisassumedtobeanaffinefunctionofXlesscommonly,
themedianorsomeotherquantileoftheconditionaldistributionofygivenXisexpressedasalinear
functionofX.Likeallformsofregressionanalysis,linearregressionfocusesontheconditional
probabilitydistributionofygivenX,ratherthanonthejointprobabilitydistributionofyandX,whichis
thedomainofmultivariateanalysis.
Linearregressionwasthefirsttypeofregressionanalysistobestudiedrigorously,andtobeused
extensivelyinpracticalapplications.[4]Thisisbecausemodelswhichdependlinearlyontheirunknown
parametersareeasiertofitthanmodelswhicharenonlinearlyrelatedtotheirparametersandbecause
thestatisticalpropertiesoftheresultingestimatorsareeasiertodetermine.
Linearregressionhasmanypracticaluses.Mostapplicationsfallintooneofthefollowingtwobroad
categories:
Ifthegoalisprediction,orforecasting,orerrorreduction,linearregressioncanbeusedtofita
predictivemodeltoanobserveddatasetofyandXvalues.Afterdevelopingsuchamodel,ifan
additionalvalueofXisthengivenwithoutitsaccompanyingvalueofy,thefittedmodelcanbe
usedtomakeapredictionofthevalueofy.
GivenavariableyandanumberofvariablesX1,...,Xpthatmayberelatedtoy,linearregression
analysiscanbeappliedtoquantifythestrengthoftherelationshipbetweenyandtheXj,toassess
whichXjmayhavenorelationshipwithyatall,andtoidentifywhichsubsetsoftheXjcontain
redundantinformationabouty.
Linearregressionmodelsareoftenfittedusingtheleastsquaresapproach,buttheymayalsobefittedin
otherways,suchasbyminimizingthe"lackoffit"insomeothernorm(aswithleastabsolutedeviations
regression),orbyminimizingapenalizedversionoftheleastsquareslossfunctionasinridgeregression
(L2normpenalty)andlasso(L1normpenalty).Conversely,theleastsquaresapproachcanbeusedtofit
modelsthatarenotlinearmodels.Thus,althoughtheterms"leastsquares"and"linearmodel"are
closelylinked,theyarenotsynonymous.

Contents
1 Introductiontolinearregression
1.1 Assumptions
1.2 Interpretation
2 Extensions
2.1 Simpleandmultipleregression

https://en.wikipedia.org/wiki/Linear_regression

1/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

5
6
7
8

2.1 Simpleandmultipleregression
2.2 Generallinearmodels
2.3 Heteroscedasticmodels
2.4 Generalizedlinearmodels
2.5 Hierarchicallinearmodels
2.6 Errorsinvariables
2.7 Others
Estimationmethods
3.1 Leastsquaresestimationandrelatedtechniques
3.2 Maximumlikelihoodestimationandrelatedtechniques
3.3 Otherestimationtechniques
3.4 Furtherdiscussion
3.5 Usinglinearalgebra
Applicationsoflinearregression
4.1 Trendline
4.2 Epidemiology
4.3 Finance
4.4 Economics
4.5 Environmentalscience
Seealso
Notes
References
Furtherreading

Introductiontolinearregression
Givenadataset
ofn
statisticalunits,alinearregression
modelassumesthattherelationship
betweenthedependentvariableyi
andthepvectorofregressorsxiis
linear.Thisrelationshipismodeled
throughadisturbancetermorerror
variableianunobservedrandom
variablethataddsnoisetothelinear
relationshipbetweenthedependent
variableandregressors.Thusthe
modeltakestheform
Exampleofsimplelinearregression,whichhasoneindependent
variable

whereTdenotesthetranspose,sothatxiTistheinnerproductbetweenvectorsxiand.
Oftenthesenequationsarestackedtogetherandwritteninvectorformas

https://en.wikipedia.org/wiki/Linear_regression

2/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

where

Exampleofacubicpolynomialregression,whichisa
typeoflinearregression.

Someremarksonterminologyandgeneraluse:
iscalledtheregressand,endogenousvariable,responsevariable,measuredvariable,criterion
variable,ordependentvariable(seedependentandindependentvariables.)Thedecisionasto
whichvariableinadatasetismodeledasthedependentvariableandwhicharemodeledasthe
independentvariablesmaybebasedonapresumptionthatthevalueofoneofthevariablesis
causedby,ordirectlyinfluencedbytheothervariables.Alternatively,theremaybeanoperational
reasontomodeloneofthevariablesintermsoftheothers,inwhichcasethereneedbeno
presumptionofcausality.
arecalledregressors,exogenousvariables,explanatoryvariables,
covariates,inputvariables,predictorvariables,orindependentvariables(seedependentand
independentvariables,butnottobeconfusedwithindependentrandomvariables).Thematrix
issometimescalledthedesignmatrix.
Usuallyaconstantisincludedasoneoftheregressors.Forexample,wecantakexi1=1for
i=1,...,n.Thecorrespondingelementofiscalledtheintercept.Manystatisticalinference
proceduresforlinearmodelsrequireanintercepttobepresent,soitisoftenincludedevenif
theoreticalconsiderationssuggestthatitsvalueshouldbezero.
Sometimesoneoftheregressorscanbeanonlinearfunctionofanotherregressororofthe
data,asinpolynomialregressionandsegmentedregression.Themodelremainslinearas
longasitislinearintheparametervector.
https://en.wikipedia.org/wiki/Linear_regression

3/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Theregressorsxijmaybeviewedeitherasrandomvariables,whichwesimplyobserve,or
theycanbeconsideredaspredeterminedfixedvalueswhichwecanchoose.Both
interpretationsmaybeappropriateindifferentcases,andtheygenerallyleadtothesame
estimationprocedureshoweverdifferentapproachestoasymptoticanalysisareusedin
thesetwosituations.
isapdimensionalparametervector.Itselementsarealsocalledeffects,orregression
coefficients.Statisticalestimationandinferenceinlinearregressionfocuseson.Theelementsof
thisparametervectorareinterpretedasthepartialderivativesofthedependentvariablewith
respecttothevariousindependentvariables.
iscalledtheerrorterm,disturbanceterm,ornoise.Thisvariablecapturesallotherfactors
whichinfluencethedependentvariableyiotherthantheregressorsxi.Therelationshipbetween
theerrortermandtheregressors,forexamplewhethertheyarecorrelated,isacrucialstepin
formulatingalinearregressionmodel,asitwilldeterminethemethodtouseforestimation.
Example.Considerasituationwhereasmallballisbeingtossedupintheairandthenwemeasureits
heightsofascenthiatvariousmomentsintimeti.Physicstellsusthat,ignoringthedrag,the
relationshipcanbemodeledas

where1determinestheinitialvelocityoftheball,2isproportionaltothestandardgravity,andiis
duetomeasurementerrors.Linearregressioncanbeusedtoestimatethevaluesof1and2fromthe
measureddata.Thismodelisnonlinearinthetimevariable,butitislinearintheparameters1and2
ifwetakeregressorsxi=(xi1,xi2)=(ti,ti2),themodeltakesonthestandardform

Assumptions
Standardlinearregressionmodelswithstandardestimationtechniquesmakeanumberofassumptions
aboutthepredictorvariables,theresponsevariablesandtheirrelationship.Numerousextensionshave
beendevelopedthatalloweachoftheseassumptionstoberelaxed(i.e.reducedtoaweakerform),and
insomecaseseliminatedentirely.Somemethodsaregeneralenoughthattheycanrelaxmultiple
assumptionsatonce,andinothercasesthiscanbeachievedbycombiningdifferentextensions.
Generallytheseextensionsmaketheestimationproceduremorecomplexandtimeconsuming,andmay
alsorequiremoredatainordertoproduceanequallyprecisemodel.
Thefollowingarethemajorassumptionsmadebystandardlinearregressionmodelswithstandard
estimationtechniques(e.g.ordinaryleastsquares):
Weakexogeneity.Thisessentiallymeansthatthepredictorvariablesxcanbetreatedasfixed
values,ratherthanrandomvariables.Thismeans,forexample,thatthepredictorvariablesare
assumedtobeerrorfreethatis,notcontaminatedwithmeasurementerrors.Althoughthis
assumptionisnotrealisticinmanysettings,droppingitleadstosignificantlymoredifficulterrors
invariablesmodels.
Linearity.Thismeansthatthemeanoftheresponsevariableisalinearcombinationofthe
parameters(regressioncoefficients)andthepredictorvariables.Notethatthisassumptionismuch
lessrestrictivethanitmayatfirstseem.Becausethepredictorvariablesaretreatedasfixedvalues
(seeabove),linearityisreallyonlyarestrictionontheparameters.Thepredictorvariables
themselvescanbearbitrarilytransformed,andinfactmultiplecopiesofthesameunderlying
https://en.wikipedia.org/wiki/Linear_regression

4/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

predictorvariablecanbeadded,eachonetransformeddifferently.Thistrickisused,forexample,
inpolynomialregression,whichuseslinearregressiontofittheresponsevariableasanarbitrary
polynomialfunction(uptoagivenrank)ofapredictorvariable.Thismakeslinearregressionan
extremelypowerfulinferencemethod.Infact,modelssuchaspolynomialregressionareoften
"toopowerful",inthattheytendtooverfitthedata.Asaresult,somekindofregularizationmust
typicallybeusedtopreventunreasonablesolutionscomingoutoftheestimationprocess.
Commonexamplesareridgeregressionandlassoregression.Bayesianlinearregressioncanalso
beused,whichbyitsnatureismoreorlessimmunetotheproblemofoverfitting.(Infact,ridge
regressionandlassoregressioncanbothbeviewedasspecialcasesofBayesianlinearregression,
withparticulartypesofpriordistributionsplacedontheregressioncoefficients.)
Constantvariance(a.k.a.homoscedasticity).Thismeansthatdifferentresponsevariableshave
thesamevarianceintheirerrors,regardlessofthevaluesofthepredictorvariables.Inpracticethis
assumptionisinvalid(i.e.theerrorsareheteroscedastic)iftheresponsevariablescanvaryovera
widescale.Inordertodetermineforheterogeneouserrorvariance,orwhenapatternofresiduals
violatesmodelassumptionsofhomoscedasticity(errorisequallyvariablearoundthe'bestfitting
line'forallpointsofx),itisprudenttolookfora"fanningeffect"betweenresidualerrorand
predictedvalues.Thisistosaytherewillbeasystematicchangeintheabsoluteorsquared
residualswhenplottedagainstthepredictingoutcome.Errorwillnotbeevenlydistributedacross
theregressionline.Heteroscedasticitywillresultintheaveragingoverofdistinguishable
variancesaroundthepointstogetasinglevariancethatisinaccuratelyrepresentingallthe
variancesoftheline.Ineffect,residualsappearclusteredandspreadapartontheirpredictedplots
forlargerandsmallervaluesforpointsalongthelinearregressionline,andthemeansquarederror
forthemodelwillbewrong.Typically,forexample,aresponsevariablewhosemeanislargewill
haveagreatervariancethanonewhosemeanissmall.Forexample,agivenpersonwhoseincome
ispredictedtobe$100,000mayeasilyhaveanactualincomeof$80,000or$120,000(astandard
deviationofaround$20,000),whileanotherpersonwithapredictedincomeof$10,000isunlikely
tohavethesame$20,000standarddeviation,whichwouldimplytheiractualincomewouldvary
anywherebetween$10,000and$30,000.(Infact,asthisshows,inmanycasesoftenthesame
caseswheretheassumptionofnormallydistributederrorsfailsthevarianceorstandard
deviationshouldbepredictedtobeproportionaltothemean,ratherthanconstant.)Simplelinear
regressionestimationmethodsgivelesspreciseparameterestimatesandmisleadinginferential
quantitiessuchasstandarderrorswhensubstantialheteroscedasticityispresent.However,various
estimationtechniques(e.g.weightedleastsquaresandheteroscedasticityconsistentstandard
errors)canhandleheteroscedasticityinaquitegeneralway.Bayesianlinearregressiontechniques
canalsobeusedwhenthevarianceisassumedtobeafunctionofthemean.Itisalsopossiblein
somecasestofixtheproblembyapplyingatransformationtotheresponsevariable(e.g.fitthe
logarithmoftheresponsevariableusingalinearregressionmodel,whichimpliesthattheresponse
variablehasalognormaldistributionratherthananormaldistribution).
Independenceoferrors.Thisassumesthattheerrorsoftheresponsevariablesareuncorrelated
witheachother.(Actualstatisticalindependenceisastrongerconditionthanmerelackof
correlationandisoftennotneeded,althoughitcanbeexploitedifitisknowntohold.)Some
methods(e.g.generalizedleastsquares)arecapableofhandlingcorrelatederrors,althoughthey
typicallyrequiresignificantlymoredataunlesssomesortofregularizationisusedtobiasthe
modeltowardsassuminguncorrelatederrors.Bayesianlinearregressionisageneralwayof
handlingthisissue.
Lackofmulticollinearityinthepredictors.Forstandardleastsquaresestimationmethods,the
designmatrixXmusthavefullcolumnrankp,otherwise,wehaveaconditionknownas
multicollinearityinthepredictorvariables.Thiscanbetriggeredbyhavingtwoormoreperfectly
correlatedpredictorvariables(e.g.ifthesamepredictorvariableismistakenlygiventwice,either
withouttransformingoneofthecopiesorbytransformingoneofthecopieslinearly).Itcanalso
happenifthereistoolittledataavailablecomparedtothenumberofparameterstobeestimated
(e.g.fewerdatapointsthanregressioncoefficients).Inthecaseofmulticollinearity,theparameter
vectorwillbenonidentifiableithasnouniquesolution.Atmostwewillbeabletoidentify
someoftheparameters,i.e.narrowdownitsvaluetosomelinearsubspaceofRp.Seepartialleast
https://en.wikipedia.org/wiki/Linear_regression

5/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

squaresregression.Methodsforfittinglinearmodelswithmulticollinearityhavebeen
developed[5][6][7][8]somerequireadditionalassumptionssuchas"effectsparsity"thatalarge
fractionoftheeffectsareexactlyzero.Notethatthemorecomputationallyexpensiveiterated
algorithmsforparameterestimation,suchasthoseusedingeneralizedlinearmodels,donotsuffer
fromthisproblemandinfactit'squitenormaltowhenhandlingcategoricallyvaluedpredictors
tointroduceaseparateindicatorvariablepredictorforeachpossiblecategory,whichinevitably
introducesmulticollinearity.
Beyondtheseassumptions,severalotherstatisticalpropertiesofthedatastronglyinfluencethe
performanceofdifferentestimationmethods:
Thestatisticalrelationshipbetweentheerrortermsandtheregressorsplaysanimportantrolein
determiningwhetheranestimationprocedurehasdesirablesamplingpropertiessuchasbeing
unbiasedandconsistent.
Thearrangement,orprobabilitydistributionofthepredictorvariablesxhasamajorinfluenceon
theprecisionofestimatesof.Samplinganddesignofexperimentsarehighlydeveloped
subfieldsofstatisticsthatprovideguidanceforcollectingdatainsuchawaytoachieveaprecise
estimateof.

Interpretation
Afittedlinearregressionmodel
canbeusedtoidentifythe
relationshipbetweenasingle
predictorvariablexjandthe
responsevariableywhenallthe
otherpredictorvariablesinthe
modelare"heldfixed".
Specifically,theinterpretationof
jistheexpectedchangeinyfor
aoneunitchangeinxjwhenthe
othercovariatesareheldfixed
thatis,theexpectedvalueofthe
partialderivativeofywith
respecttoxj.Thisissometimes
calledtheuniqueeffectofxjony.
Incontrast,themarginaleffectof
xjonycanbeassessedusinga

ThesetsintheAnscombe'squartethavethesamelinearregressionline
butarethemselvesverydifferent.

correlationcoefficientorsimple
linearregressionmodelrelating
xjtoythiseffectisthetotalderivativeofywithrespecttoxj.

Caremustbetakenwheninterpretingregressionresults,assomeoftheregressorsmaynotallowfor
marginalchanges(suchasdummyvariables,ortheinterceptterm),whileotherscannotbeheldfixed
(recalltheexamplefromtheintroduction:itwouldbeimpossibleto"holdtifixed"andatthesametime
changethevalueofti2).

https://en.wikipedia.org/wiki/Linear_regression

6/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Itispossiblethattheuniqueeffectcanbenearlyzeroevenwhenthemarginaleffectislarge.Thismay
implythatsomeothercovariatecapturesalltheinformationinxj,sothatoncethatvariableisinthe
model,thereisnocontributionofxjtothevariationiny.Conversely,theuniqueeffectofxjcanbelarge
whileitsmarginaleffectisnearlyzero.Thiswouldhappeniftheothercovariatesexplainedagreatdeal
ofthevariationofy,buttheymainlyexplainvariationinawaythatiscomplementarytowhatis
capturedbyxj.Inthiscase,includingtheothervariablesinthemodelreducesthepartofthevariability
ofythatisunrelatedtoxj,therebystrengtheningtheapparentrelationshipwithxj.
Themeaningoftheexpression"heldfixed"maydependonhowthevaluesofthepredictorvariables
arise.Iftheexperimenterdirectlysetsthevaluesofthepredictorvariablesaccordingtoastudydesign,
thecomparisonsofinterestmayliterallycorrespondtocomparisonsamongunitswhosepredictor
variableshavebeen"heldfixed"bytheexperimenter.Alternatively,theexpression"heldfixed"canrefer
toaselectionthattakesplaceinthecontextofdataanalysis.Inthiscase,we"holdavariablefixed"by
restrictingourattentiontothesubsetsofthedatathathappentohaveacommonvalueforthegiven
predictorvariable.Thisistheonlyinterpretationof"heldfixed"thatcanbeusedinanobservational
study.
Thenotionofa"uniqueeffect"isappealingwhenstudyingacomplexsystemwheremultiple
interrelatedcomponentsinfluencetheresponsevariable.Insomecases,itcanliterallybeinterpretedas
thecausaleffectofaninterventionthatislinkedtothevalueofapredictorvariable.However,ithas
beenarguedthatinmanycasesmultipleregressionanalysisfailstoclarifytherelationshipsbetweenthe
predictorvariablesandtheresponsevariablewhenthepredictorsarecorrelatedwitheachotherandare
notassignedfollowingastudydesign.[9]Acommonalityanalysismaybehelpfulindisentanglingthe
sharedanduniqueimpactsofcorrelatedindependentvariables.[10]

Extensions
Numerousextensionsoflinearregressionhavebeendeveloped,whichallowsomeorallofthe
assumptionsunderlyingthebasicmodeltoberelaxed.

Simpleandmultipleregression
Theverysimplestcaseofasinglescalarpredictorvariablexandasinglescalarresponsevariableyis
knownassimplelinearregression.Theextensiontomultipleand/orvectorvaluedpredictorvariables
(denotedwithacapitalX)isknownasmultiplelinearregression,alsoknownasmultivariablelinear
regression.Nearlyallrealworldregressionmodelsinvolvemultiplepredictors,andbasicdescriptionsof
linearregressionareoftenphrasedintermsofthemultipleregressionmodel.Note,however,thatin
thesecasestheresponsevariableyisstillascalar.Anothertermmultivariatelinearregressionrefersto
caseswhereyisavector,i.e.,thesameasgenerallinearregression.Thedifferencebetweenmultivariate
linearregressionandmultivariablelinearregressionshouldbeemphasizedasitcausesmuchconfusion
andmisunderstandingintheliterature.

Generallinearmodels
ThegenerallinearmodelconsidersthesituationwhentheresponsevariableYisnotascalarbutavector.
ConditionallinearityofE(y|x)=Bxisstillassumed,withamatrixBreplacingthevectorofthe
classicallinearregressionmodel.MultivariateanaloguesofOrdinaryLeastSquares(OLS)and
GeneralizedLeastSquares(GLS)havebeendeveloped.Theterm"generallinearmodels"isequivalent
https://en.wikipedia.org/wiki/Linear_regression

7/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

to"multivariatelinearmodels".Itshouldbenotedthedifferenceof"multivariatelinearmodels"and
"multivariablelinearmodels,"wheretheformeristhesameas"generallinearmodels"andthelatteris
thesameas"multiplelinearmodels."

Heteroscedasticmodels
Variousmodelshavebeencreatedthatallowforheteroscedasticity,i.e.theerrorsfordifferentresponse
variablesmayhavedifferentvariances.Forexample,weightedleastsquaresisamethodforestimating
linearregressionmodelswhentheresponsevariablesmayhavedifferenterrorvariances,possiblywith
correlatederrors.(SeealsoWeightedlinearleastsquares,andgeneralizedleastsquares.)
Heteroscedasticityconsistentstandarderrorsisanimprovedmethodforusewithuncorrelatedbut
potentiallyheteroscedasticerrors.

Generalizedlinearmodels
Generalizedlinearmodels(GLMs)areaframeworkformodelingaresponsevariableythatisbounded
ordiscrete.Thisisused,forexample:
whenmodelingpositivequantities(e.g.pricesorpopulations)thatvaryoveralargescalewhich
arebetterdescribedusingaskeweddistributionsuchasthelognormaldistributionorPoisson
distribution(althoughGLMsarenotusedforlognormaldata,insteadtheresponsevariableis
simplytransformedusingthelogarithmfunction)
whenmodelingcategoricaldata,suchasthechoiceofagivencandidateinanelection(whichis
betterdescribedusingaBernoullidistribution/binomialdistributionforbinarychoices,ora
categoricaldistribution/multinomialdistributionformultiwaychoices),wherethereareafixed
numberofchoicesthatcannotbemeaningfullyordered
whenmodelingordinaldata,e.g.ratingsonascalefrom0to5,wherethedifferentoutcomescan
beorderedbutwherethequantityitselfmaynothaveanyabsolutemeaning(e.g.aratingof4may
notbe"twiceasgood"inanyobjectivesenseasaratingof2,butsimplyindicatesthatitisbetter
than2or3butnotasgoodas5).
Generalizedlinearmodelsallowforanarbitrarylinkfunctiongthatrelatesthemeanoftheresponse
variabletothepredictors,i.e.E(y)=g(x).Thelinkfunctionisoftenrelatedtothedistributionofthe
response,andinparticularittypicallyhastheeffectoftransformingbetweenthe
rangeof
thelinearpredictorandtherangeoftheresponsevariable.
SomecommonexamplesofGLMsare:
Poissonregressionforcountdata.
Logisticregressionandprobitregressionforbinarydata.
Multinomiallogisticregressionandmultinomialprobitregressionforcategoricaldata.
Orderedprobitregressionforordinaldata.
Singleindexmodelsallowsomedegreeofnonlinearityintherelationshipbetweenxandy,while
preservingthecentralroleofthelinearpredictorxasintheclassicallinearregressionmodel.Under
certainconditions,simplyapplyingOLStodatafromasingleindexmodelwillconsistentlyestimate
uptoaproportionalityconstant.[11]

Hierarchicallinearmodels

https://en.wikipedia.org/wiki/Linear_regression

8/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Hierarchicallinearmodels(ormultilevelregression)organizesthedataintoahierarchyofregressions,
forexamplewhereAisregressedonB,andBisregressedonC.Itisoftenusedwherethevariablesof
interesthaveanaturalhierarchicalstructuresuchasineducationalstatistics,wherestudentsarenestedin
classrooms,classroomsarenestedinschools,andschoolsarenestedinsomeadministrativegrouping,
suchasaschooldistrict.Theresponsevariablemightbeameasureofstudentachievementsuchasatest
score,anddifferentcovariateswouldbecollectedattheclassroom,school,andschooldistrictlevels.

Errorsinvariables
Errorsinvariablesmodels(or"measurementerrormodels")extendthetraditionallinearregression
modeltoallowthepredictorvariablesXtobeobservedwitherror.Thiserrorcausesstandardestimators
oftobecomebiased.Generally,theformofbiasisanattenuation,meaningthattheeffectsarebiased
towardzero.

Others
InDempsterShafertheory,oralinearbelieffunctioninparticular,alinearregressionmodelmay
berepresentedasapartiallysweptmatrix,whichcanbecombinedwithsimilarmatrices
representingobservationsandotherassumednormaldistributionsandstateequations.The
combinationofsweptorunsweptmatricesprovidesanalternativemethodforestimatinglinear
regressionmodels.

Estimationmethods
Alargenumberofprocedureshavebeendevelopedforparameter
estimationandinferenceinlinearregression.Thesemethods
differincomputationalsimplicityofalgorithms,presenceofa
closedformsolution,robustnesswithrespecttoheavytailed
distributions,andtheoreticalassumptionsneededtovalidate
desirablestatisticalpropertiessuchasconsistencyand
asymptoticefficiency.
Someofthemorecommonestimationtechniquesforlinear
regressionaresummarizedbelow.

Leastsquaresestimationandrelatedtechniques
Ordinaryleastsquares(OLS)isthesimplestandthus
mostcommonestimator.Itisconceptuallysimpleand
computationallystraightforward.OLSestimatesare
commonlyusedtoanalyzebothexperimentaland
observationaldata.

ComparisonoftheTheilSen
estimator(black)andsimplelinear
regression(blue)forasetofpoints
withoutliers.

TheOLSmethodminimizesthesumofsquaredresiduals,andleadstoaclosedformexpression
fortheestimatedvalueoftheunknownparameter:

Theestimatorisunbiasedandconsistentiftheerrorshavefinitevarianceandareuncorrelated
withtheregressors[12]
https://en.wikipedia.org/wiki/Linear_regression

9/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Itisalsoefficientundertheassumptionthattheerrorshavefinitevarianceandarehomoscedastic,
meaningthatE[i2|xi]doesnotdependoni.Theconditionthattheerrorsareuncorrelatedwiththe
regressorswillgenerallybesatisfiedinanexperiment,butinthecaseofobservationaldata,itis
difficulttoexcludethepossibilityofanomittedcovariatezthatisrelatedtoboththeobserved
covariatesandtheresponsevariable.Theexistenceofsuchacovariatewillgenerallyleadtoa
correlationbetweentheregressorsandtheresponsevariable,andhencetoaninconsistent
estimatorof.Theconditionofhomoscedasticitycanfailwitheitherexperimentalor
observationaldata.Ifthegoaliseitherinferenceorpredictivemodeling,theperformanceofOLS
estimatescanbepoorifmulticollinearityispresent,unlessthesamplesizeislarge.
Insimplelinearregression,wherethereisonlyoneregressor(withaconstant),theOLS
coefficientestimateshaveasimpleformthatiscloselyrelatedtothecorrelationcoefficient
betweenthecovariateandtheresponse.
Generalizedleastsquares(GLS)isanextensionoftheOLSmethod,thatallowsefficient
estimationofwheneitherheteroscedasticity,orcorrelations,orbotharepresentamongtheerror
termsofthemodel,aslongastheformofheteroscedasticityandcorrelationisknown
independentlyofthedata.Tohandleheteroscedasticitywhentheerrortermsareuncorrelatedwith
eachother,GLSminimizesaweightedanaloguetothesumofsquaredresidualsfromOLS
regression,wheretheweightfortheithcaseisinverselyproportionaltovar(i).Thisspecialcase
ofGLSiscalled"weightedleastsquares".TheGLSsolutiontoestimationproblemis

whereisthecovariancematrixoftheerrors.GLScanbeviewedasapplyingalinear
transformationtothedatasothattheassumptionsofOLSaremetforthetransformeddata.For
GLStobeapplied,thecovariancestructureoftheerrorsmustbeknownuptoamultiplicative
constant.
Percentageleastsquaresfocusesonreducingpercentageerrors,whichisusefulinthefieldof
forecastingortimeseriesanalysis.Itisalsousefulinsituationswherethedependentvariablehasa
widerangewithoutconstantvariance,asherethelargerresidualsattheupperendoftherange
woulddominateifOLSwereused.Whenthepercentageorrelativeerrorisnormallydistributed,
leastsquarespercentageregressionprovidesmaximumlikelihoodestimates.Percentageregression
islinkedtoamultiplicativeerrormodel,whereasOLSislinkedtomodelscontaininganadditive
errorterm.[13]
Iterativelyreweightedleastsquares(IRLS)isusedwhenheteroscedasticity,orcorrelations,or
botharepresentamongtheerrortermsofthemodel,butwherelittleisknownaboutthe
covariancestructureoftheerrorsindependentlyofthedata.[14]Inthefirstiteration,OLS,orGLS
withaprovisionalcovariancestructureiscarriedout,andtheresidualsareobtainedfromthefit.
Basedontheresiduals,animprovedestimateofthecovariancestructureoftheerrorscanusually
beobtained.AsubsequentGLSiterationisthenperformedusingthisestimateoftheerror
structuretodefinetheweights.Theprocesscanbeiteratedtoconvergence,butinmanycases,
onlyoneiterationissufficienttoachieveanefficientestimateof.[15][16]
Instrumentalvariablesregression(IV)canbeperformedwhentheregressorsarecorrelatedwith
theerrors.Inthiscase,weneedtheexistenceofsomeauxiliaryinstrumentalvariableszisuchthat
E[zii]=0.IfZisthematrixofinstruments,thentheestimatorcanbegiveninclosedformas

OptimalinstrumentsregressionisanextensionofclassicalIVregressiontothesituationwhere
https://en.wikipedia.org/wiki/Linear_regression

10/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

E[i|zi]=0.
Totalleastsquares(TLS)[17]isanapproachtoleastsquaresestimationofthelinearregression
modelthattreatsthecovariatesandresponsevariableinamoregeometricallysymmetricmanner
thanOLS.Itisoneapproachtohandlingthe"errorsinvariables"problem,andisalsosometimes
usedevenwhenthecovariatesareassumedtobeerrorfree.

Maximumlikelihoodestimationandrelatedtechniques
Maximumlikelihoodestimationcanbeperformedwhenthedistributionoftheerrortermsis
knowntobelongtoacertainparametricfamilyofprobabilitydistributions.[18]Whenfisa
normaldistributionwithzeromeanandvariance,theresultingestimateisidenticaltotheOLS
estimate.GLSestimatesaremaximumlikelihoodestimateswhenfollowsamultivariatenormal
distributionwithaknowncovariancematrix.
Ridgeregression,[19][20][21]andotherformsofpenalizedestimationsuchasLassoregression,[5]
deliberatelyintroducebiasintotheestimationofinordertoreducethevariabilityofthe
estimate.TheresultingestimatorsgenerallyhavelowermeansquarederrorthantheOLS
estimates,particularlywhenmulticollinearityispresentorwhenoverfittingisaproblem.Theyare
generallyusedwhenthegoalistopredictthevalueoftheresponsevariableyforvaluesofthe
predictorsxthathavenotyetbeenobserved.Thesemethodsarenotascommonlyusedwhenthe
goalisinference,sinceitisdifficulttoaccountforthebias.
Leastabsolutedeviation(LAD)regressionisarobustestimationtechniqueinthatitisless
sensitivetothepresenceofoutliersthanOLS(butislessefficientthanOLSwhennooutliersare
present).ItisequivalenttomaximumlikelihoodestimationunderaLaplacedistributionmodelfor
.[22]
Adaptiveestimation.Ifweassumethaterrortermsareindependentfromtheregressors
,
theoptimalestimatoristhe2stepMLE,wherethefirststepisusedtononparametricallyestimate
thedistributionoftheerrorterm.[23]

Otherestimationtechniques
BayesianlinearregressionappliestheframeworkofBayesianstatisticstolinearregression.(See
alsoBayesianmultivariatelinearregression.)Inparticular,theregressioncoefficientsare
assumedtoberandomvariableswithaspecifiedpriordistribution.Thepriordistributioncanbias
thesolutionsfortheregressioncoefficients,inawaysimilarto(butmoregeneralthan)ridge
regressionorlassoregression.Inaddition,theBayesianestimationprocessproducesnotasingle
pointestimateforthe"best"valuesoftheregressioncoefficientsbutanentireposterior
distribution,completelydescribingtheuncertaintysurroundingthequantity.Thiscanbeusedto
estimatethe"best"coefficientsusingthemean,mode,median,anyquantile(seequantile
regression),oranyotherfunctionoftheposteriordistribution.
QuantileregressionfocusesontheconditionalquantilesofygivenXratherthantheconditional
meanofygivenX.Linearquantileregressionmodelsaparticularconditionalquantile,for
exampletheconditionalmedian,asalinearfunctionTxofthepredictors.
Mixedmodelsarewidelyusedtoanalyzelinearregressionrelationshipsinvolvingdependentdata
whenthedependencieshaveaknownstructure.Commonapplicationsofmixedmodelsinclude
analysisofdatainvolvingrepeatedmeasurements,suchaslongitudinaldata,ordataobtainedfrom
clustersampling.Theyaregenerallyfitasparametricmodels,usingmaximumlikelihoodor
Bayesianestimation.Inthecasewheretheerrorsaremodeledasnormalrandomvariables,thereis
acloseconnectionbetweenmixedmodelsandgeneralizedleastsquares.[24]Fixedeffects
estimationisanalternativeapproachtoanalyzingthistypeofdata.
Principalcomponentregression(PCR)[7][8]isusedwhenthenumberofpredictorvariablesis
large,orwhenstrongcorrelationsexistamongthepredictorvariables.Thistwostageprocedure
firstreducesthepredictorvariablesusingprincipalcomponentanalysisthenusesthereduced
https://en.wikipedia.org/wiki/Linear_regression

11/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

variablesinanOLSregressionfit.Whileitoftenworkswellinpractice,thereisnogeneral
theoreticalreasonthatthemostinformativelinearfunctionofthepredictorvariablesshouldlie
amongthedominantprincipalcomponentsofthemultivariatedistributionofthepredictor
variables.ThepartialleastsquaresregressionistheextensionofthePCRmethodwhichdoesnot
sufferfromthementioneddeficiency.
Leastangleregression[6]isanestimationprocedureforlinearregressionmodelsthatwas
developedtohandlehighdimensionalcovariatevectors,potentiallywithmorecovariatesthan
observations.
TheTheilSenestimatorisasimplerobustestimationtechniquethatchoosestheslopeofthefit
linetobethemedianoftheslopesofthelinesthroughpairsofsamplepoints.Ithassimilar
statisticalefficiencypropertiestosimplelinearregressionbutismuchlesssensitivetooutliers.[25]
Otherrobustestimationtechniques,includingthetrimmedmeanapproach,andL,M,S,and
Restimatorshavebeenintroduced.

Furtherdiscussion
Instatisticsandnumericalanalysis,theproblemofnumericalmethodsforlinearleastsquaresisan
importantonebecauselinearregressionmodelsareoneofthemostimportanttypesofmodel,bothas
formalstatisticalmodelsandforexplorationofdatasets.Themajorityofstatisticalcomputerpackages
containfacilitiesforregressionanalysisthatmakeuseoflinearleastsquarescomputations.Henceitis
appropriatethatconsiderableefforthasbeendevotedtothetaskofensuringthatthesecomputationsare
undertakenefficientlyandwithdueregardtonumericalprecision.
Individualstatisticalanalysesareseldomundertakeninisolation,butratherarepartofasequenceof
investigatorysteps.Someofthetopicsinvolvedinconsideringnumericalmethodsforlinearleast
squaresrelatetothispoint.Thusimportanttopicscanbe
Computationswhereanumberofsimilar,andoftennested,modelsareconsideredforthesame
dataset.Thatis,wheremodelswiththesamedependentvariablebutdifferentsetsofindependent
variablesaretobeconsidered,foressentiallythesamesetofdatapoints.
Computationsforanalysesthatoccurinasequence,asthenumberofdatapointsincreases.
Specialconsiderationsforveryextensivedatasets.
Fittingoflinearmodelsbyleastsquaresoften,butnotalways,arisesinthecontextofstatisticalanalysis.
Itcanthereforebeimportantthatconsiderationsofcomputationalefficiencyforsuchproblemsextendto
alloftheauxiliaryquantitiesrequiredforsuchanalyses,andarenotrestrictedtotheformalsolutionof
thelinearleastsquaresproblem.
Matrixcalculations,likeanyothers,areaffectedbyroundingerrors.Anearlysummaryoftheseeffects,
regardingthechoiceofcomputationalmethodsformatrixinversion,wasprovidedbyWilkinson.[26]

Usinglinearalgebra
Itfollowsthatonecanfinda"best"approximationofanotherfunctionbyminimizingtheareabetween
twofunctions,acontinuousfunction on
andafunction
where isasubspaceof
:
,

https://en.wikipedia.org/wiki/Linear_regression

12/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

allwithinthesubspace .Duetothefrequentdifficultyofevaluatingintegrandsinvolvingabsolute
value,onecaninsteaddefine

asanadequatecriterionforobtainingtheleastsquaresapproximation,function ,of withrespectto


theinnerproductspace .
Assuch,

or,equivalently,

,canthusbewritteninvectorform:
.

Inotherwords,theleastsquaresapproximationof isthefunction
termsoftheinnerproduct
.Furthermore,thiscanbeappliedwithatheorem:
Let becontinuouson
,andlet beafinitedimensionalsubspaceof
squaresapproximatingfunctionof withrespectto isgivenby

closestto in
.Theleast

,
where

isanorthonormalbasisfor

Applicationsoflinearregression
Linearregressioniswidelyusedinbiological,behavioralandsocialsciencestodescribepossible
relationshipsbetweenvariables.Itranksasoneofthemostimportanttoolsusedinthesedisciplines.

Trendline
Atrendlinerepresentsatrend,thelongtermmovementintimeseriesdataafterothercomponentshave
beenaccountedfor.Ittellswhetheraparticulardataset(sayGDP,oilpricesorstockprices)have
increasedordecreasedovertheperiodoftime.Atrendlinecouldsimplybedrawnbyeyethroughaset
ofdatapoints,butmoreproperlytheirpositionandslopeiscalculatedusingstatisticaltechniqueslike
linearregression.Trendlinestypicallyarestraightlines,althoughsomevariationsusehigherdegree
polynomialsdependingonthedegreeofcurvaturedesiredintheline.
Trendlinesaresometimesusedinbusinessanalyticstoshowchangesindataovertime.Thishasthe
advantageofbeingsimple.Trendlinesareoftenusedtoarguethataparticularactionorevent(suchas
training,oranadvertisingcampaign)causedobservedchangesatapointintime.Thisisasimple
technique,anddoesnotrequireacontrolgroup,experimentaldesign,orasophisticatedanalysis
technique.However,itsuffersfromalackofscientificvalidityincaseswhereotherpotentialchanges
canaffectthedata.

Epidemiology
Earlyevidencerelatingtobaccosmokingtomortalityandmorbiditycamefromobservationalstudies
employingregressionanalysis.Inordertoreducespuriouscorrelationswhenanalyzingobservational
data,researchersusuallyincludeseveralvariablesintheirregressionmodelsinadditiontothevariable
https://en.wikipedia.org/wiki/Linear_regression

13/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

ofprimaryinterest.Forexample,supposewehavearegressionmodelinwhichcigarettesmokingisthe
independentvariableofinterest,andthedependentvariableislifespanmeasuredinyears.Researchers
mightincludesocioeconomicstatusasanadditionalindependentvariable,toensurethatanyobserved
effectofsmokingonlifespanisnotduetosomeeffectofeducationorincome.However,itisnever
possibletoincludeallpossibleconfoundingvariablesinanempiricalanalysis.Forexample,a
hypotheticalgenemightincreasemortalityandalsocausepeopletosmokemore.Forthisreason,
randomizedcontrolledtrialsareoftenabletogeneratemorecompellingevidenceofcausalrelationships
thancanbeobtainedusingregressionanalysesofobservationaldata.Whencontrolledexperimentsare
notfeasible,variantsofregressionanalysissuchasinstrumentalvariablesregressionmaybeusedto
attempttoestimatecausalrelationshipsfromobservationaldata.

Finance
Thecapitalassetpricingmodeluseslinearregressionaswellastheconceptofbetaforanalyzingand
quantifyingthesystematicriskofaninvestment.Thiscomesdirectlyfromthebetacoefficientofthe
linearregressionmodelthatrelatesthereturnontheinvestmenttothereturnonallriskyassets.

Economics
Linearregressionisthepredominantempiricaltoolineconomics.Forexample,itisusedtopredict
consumptionspending,[27]fixedinvestmentspending,inventoryinvestment,purchasesofacountry's
exports,[28]spendingonimports,[28]thedemandtoholdliquidassets,[29]labordemand,[30]andlabor
supply.[30]

Environmentalscience
Linearregressionfindsapplicationinawiderangeofenvironmentalscienceapplications.InCanada,the
EnvironmentalEffectsMonitoringProgramusesstatisticalanalysesonfishandbenthicsurveysto
measuretheeffectsofpulpmillormetalmineeffluentontheaquaticecosystem.[31]

Seealso
Analysisofvariance
Censoredregressionmodel
Crosssectionalregression
Curvefitting
EmpiricalBayesmethods
Errorsandresiduals
Lackoffitsumofsquares
Linefitting
Linearclassifier
Linearequation
Logisticregression
Mestimator
MLPACKcontainsaC++implementationoflinearregression
Multivariateadaptiveregressionsplines
Nonlinearregression
Nonparametricregression
Normalequations
Projectionpursuitregression
https://en.wikipedia.org/wiki/Linear_regression

14/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Segmentedlinearregression
Stepwiseregression
Supportvectormachine
Truncatedregressionmodel

Notes
1.DavidA.Freedman(2009).StatisticalModels:TheoryandPractice.CambridgeUniversityPress.p.26."A
simpleregressionequationhasontherighthandsideaninterceptandanexplanatoryvariablewithaslope
coefficient.Amultipleregressionequationhastwoormoreexplanatoryvariablesontherighthandside,each
withitsownslopecoefficient"
2.Rencher,AlvinC.Christensen,WilliamF.(2012),"Chapter10,MultivariateregressionSection10.1,
Introduction",MethodsofMultivariateAnalysis,WileySeriesinProbabilityandStatistics709(3rded.),
JohnWiley&Sons,p.19,ISBN9781118391679.
3.HilaryL.Seal(1967)."ThehistoricaldevelopmentoftheGausslinearmodel".Biometrika54(1/2):124.
doi:10.1093/biomet/54.12.1.
4.Yan,Xin(2009),LinearRegressionAnalysis:TheoryandComputing,WorldScientific,pp.12,
ISBN9789812834119,"Regressionanalysis...isprobablyoneoftheoldesttopicsinmathematicalstatistics
datingbacktoabouttwohundredyearsago.Theearliestformofthelinearregressionwastheleastsquares
method,whichwaspublishedbyLegendrein1805,andbyGaussin1809...LegendreandGaussbothapplied
themethodtotheproblemofdetermining,fromastronomicalobservations,theorbitsofbodiesaboutthe
sun."
5.Tibshirani,Robert(1996)."RegressionShrinkageandSelectionviatheLasso".JournaloftheRoyal
StatisticalSociety,SeriesB58(1):267288.JSTOR2346178.
6.Efron,BradleyHastie,TrevorJohnstone,IainTibshirani,Robert(2004)."LeastAngleRegression".The
AnnalsofStatistics32(2):407451.doi:10.1214/009053604000000067.JSTOR3448465.
7.Hawkins,DouglasM.(1973)."OntheInvestigationofAlternativeRegressionsbyPrincipalComponent
Analysis".JournaloftheRoyalStatisticalSociety,SeriesC22(3):275286.JSTOR2346776.
8.Jolliffe,IanT.(1982)."ANoteontheUseofPrincipalComponentsinRegression".JournaloftheRoyal
StatisticalSociety,SeriesC31(3):300303.JSTOR2348005.
9.Berk,RichardA.RegressionAnalysis:AConstructiveCritique.Sage.doi:10.1177/0734016807304871.
10.Warne,R.T.(2011).Beyondmultipleregression:UsingcommonalityanalysistobetterunderstandR2
results.GiftedChildQuarterly,55,313318.doi:10.1177/0016986211422217
11.Brillinger,DavidR.(1977)."TheIdentificationofaParticularNonlinearTimeSeriesSystem".Biometrika64
(3):509515.doi:10.1093/biomet/64.3.509.JSTOR2345326.
12.Lai,T.L.Robbins,H.Wei,C.Z.(1978)."Strongconsistencyofleastsquaresestimatesinmultiple
regression".PNAS75(7):30343036.Bibcode:1978PNAS...75.3034L.doi:10.1073/pnas.75.7.3034.
JSTOR68164.
13.Tofallis,C(2009)."LeastSquaresPercentageRegression".JournalofModernAppliedStatisticalMethods7:
526534.doi:10.2139/ssrn.1406472.
14.delPino,Guido(1989)."TheUnifyingRoleofIterativeGeneralizedLeastSquaresinStatisticalAlgorithms".
StatisticalScience4(4):394403.doi:10.1214/ss/1177012408.JSTOR2245853.
15.Carroll,RaymondJ.(1982)."AdaptingforHeteroscedasticityinLinearModels".TheAnnalsofStatistics10
(4):12241233.doi:10.1214/aos/1176345987.JSTOR2240725.
16.Cohen,MichaelDalal,SiddharthaR.Tukey,JohnW.(1993)."Robust,SmoothlyHeterogeneousVariance
Regression".JournaloftheRoyalStatisticalSociety,SeriesC42(2):339353.JSTOR2986237.
17.Nievergelt,Yves(1994)."TotalLeastSquares:StateoftheArtRegressioninNumericalAnalysis".SIAM
Review36(2):258264.doi:10.1137/1036055.JSTOR2132463.
18.Lange,KennethL.Little,RoderickJ.A.Taylor,JeremyM.G.(1989)."RobustStatisticalModelingUsing
thetDistribution".JournaloftheAmericanStatisticalAssociation84(408):881896.doi:10.2307/2290063.
JSTOR2290063.
19.Swindel,BeneeF.(1981)."GeometryofRidgeRegressionIllustrated".TheAmericanStatistician35(1):12
15.doi:10.2307/2683577.JSTOR2683577.
20.Draper,NormanR.vanNostrandR.Craig(1979)."RidgeRegressionandJamesSteinEstimation:Review
andComments".Technometrics21(4):451466.doi:10.2307/1268284.JSTOR1268284.
https://en.wikipedia.org/wiki/Linear_regression

15/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

21.Hoerl,ArthurE.Kennard,RobertW.Hoerl,RogerW.(1985)."PracticalUseofRidgeRegression:A
ChallengeMet".JournaloftheRoyalStatisticalSociety,SeriesC34(2):114120.JSTOR2347363.
22.Narula,SubhashC.Wellington,JohnF.(1982)."TheMinimumSumofAbsoluteErrorsRegression:AState
oftheArtSurvey".InternationalStatisticalReview50(3):317326.doi:10.2307/1402501.JSTOR1402501.
23.Stone,C.J.(1975)."Adaptivemaximumlikelihoodestimatorsofalocationparameter".TheAnnalsof
Statistics3(2):267284.doi:10.1214/aos/1176343056.JSTOR2958945.
24.Goldstein,H.(1986)."MultilevelMixedLinearModelAnalysisUsingIterativeGeneralizedLeastSquares".
Biometrika73(1):4356.doi:10.1093/biomet/73.1.43.JSTOR2336270.
25.Theil,H.(1950)."Arankinvariantmethodoflinearandpolynomialregressionanalysis.I,II,III".Nederl.
Akad.Wetensch.,Proc.53:386392,521525,13971412.MR0036489Sen,PranabKumar(1968).
"EstimatesoftheregressioncoefficientbasedonKendall'stau".JournaloftheAmericanStatistical
Association63(324):13791389.doi:10.2307/2285891.JSTOR2285891.MR0258201.
26.Wilkinson,J.H.(1963)"Chapter3:MatrixComputations",RoundingErrorsinAlgebraicProcesses,
London:HerMajesty'sStationeryOffice(NationalPhysicalLaboratory,NotesinAppliedScience,No.32)
27.Deaton,Angus(1992).UnderstandingConsumption.OxfordUniversityPress.ISBN0198288247.
28.Krugman,PaulR.Obstfeld,M.Melitz,MarcJ.(2012).InternationalEconomics:TheoryandPolicy(9th
globaled.).Harlow:Pearson.ISBN9780273754091.
29.Laidler,DavidE.W.(1993)."TheDemandforMoney:Theories,Evidence,andProblems"(4thed.).New
York:HarperCollins.ISBN0065010981.
30.EhrenbergSmith(2008).ModernLaborEconomics(10thinternationaled.).London:AddisonWesley.
ISBN9780321538963.
31.EEMPwebpage(http://www.ec.gc.ca/eseeeem/default.asp?lang=En&n=453D78FC1)

References
Cohen,J.,CohenP.,West,S.G.,&Aiken,L.S.(2003).Appliedmultipleregression/correlation
analysisforthebehavioralsciences.(2nded.)Hillsdale,NJ:LawrenceErlbaumAssociates
CharlesDarwin.TheVariationofAnimalsandPlantsunderDomestication.(1868)(ChapterXIII
describeswhatwasknownaboutreversioninGalton'stime.Darwinusestheterm"reversion".)
Draper,N.R.Smith,H.(1998).AppliedRegressionAnalysis(3rded.).JohnWiley.ISBN0471
170828.
FrancisGalton."RegressionTowardsMediocrityinHereditaryStature,"Journalofthe
AnthropologicalInstitute,15:246263(1886).(Facsimileat:[1](http://www.mugu.com/galton/ess
ays/18801889/galton1886jaigiregressionstature.pdf))
RobertS.PindyckandDanielL.Rubinfeld(1998,4hed.).EconometricModelsandEconomic
Forecasts,ch.1(Intro,incl.appendicesonoperators&derivationofparameterest.)&
Appendix4.3(mult.regressioninmatrixform).

Furtherreading
Barlow,JesseL.(1993)."Chapter9:Numericalaspectsof
TheWikibookR
SolvingLinearLeastSquaresProblems".InRao,C.R.
Programminghasapage
ComputationalStatistics.HandbookofStatistics9.North
onthetopicof:Linear
Holland.ISBN0444880968
Models
Bjrck,ke(1996).Numericalmethodsforleastsquares
problems.Philadelphia:SIAM.ISBN0898713609.
Wikiversityhaslearning
Goodall,ColinR.(1993)."Chapter13:Computationusing
materialsaboutLinear
theQRdecomposition".InRao,C.R.Computational
regression
Statistics.HandbookofStatistics9.NorthHolland.
ISBN0444880968
Pedhazur,ElazarJ(1982)."Multipleregressioninbehavioralresearch:Explanationand
prediction"(2nded.).NewYork:Holt,RinehartandWinston.ISBN0030417600.
NationalPhysicalLaboratory(1961)."Chapter1:LinearEquationsandMatrices:Direct
https://en.wikipedia.org/wiki/Linear_regression

16/17

19/05/2016

LinearregressionWikipedia,thefreeencyclopedia

Methods".ModernComputingMethods.NotesonAppliedScience16(2nded.).HerMajesty's
StationeryOffice
NationalPhysicalLaboratory(1961)."Chapter2:LinearEquationsandMatrices:DirectMethods
onAutomaticComputers".ModernComputingMethods.NotesonAppliedScience16(2nded.).
HerMajesty'sStationeryOffice
Retrievedfrom"https://en.wikipedia.org/w/index.php?title=Linear_regression&oldid=720569695"
Categories: Regressionanalysis Estimationtheory Parametricstatistics Econometrics
Thispagewaslastmodifiedon16May2016,at18:04.
TextisavailableundertheCreativeCommonsAttributionShareAlikeLicenseadditionalterms
mayapply.Byusingthissite,youagreetotheTermsofUseandPrivacyPolicy.Wikipediaisa
registeredtrademarkoftheWikimediaFoundation,Inc.,anonprofitorganization.

https://en.wikipedia.org/wiki/Linear_regression

17/17