Cross Validated is a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization. It's 100% free, no registration required.
definition - Why square the difference instead of taking the absolute value in standard deviation? - Cross Validated
21/7/2015
Why square the difference instead of taking the absolute value in standard deviation?
In the definition of standard deviation, why do we have to square the difference from the mean to get the mean (E) and take the square root back at the end? Can't we just simply take the absolute value of the difference instead and get the expected value (mean) of those, and wouldn't that also show the variation of the data? The number is going to be different from the square method (the absolute-value method will be smaller), but it should still show the spread of data. Anybody know why we take this square approach as a standard?
The definition of standard deviation:
$$\sigma = \sqrt{E\left[(X - \mu)^2\right]}.$$
Can't we just take the absolute value instead and still be a good measurement?
$$\sigma = E\left[|X - \mu|\right]$$
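For readers who want to see the two candidate measures side by side, here is a minimal Python sketch. It is not part of the original question; the function names and the sample data are made up for illustration, and both measures use the population (divide by n) convention.

```python
import math

def std_dev(xs):
    """Population standard deviation: square root of the mean squared deviation."""
    mu = sum(xs) / len(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

def mean_abs_dev(xs):
    """Mean absolute deviation about the mean."""
    mu = sum(xs) / len(xs)
    return sum(abs(x - mu) for x in xs) / len(xs)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample, mean 5
sd = std_dev(data)
mad = mean_abs_dev(data)
print(sd, mad)
```

On this sample the absolute-value measure comes out smaller than the squared one, as the question anticipates.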
Tags: standard-deviation, definition
edited Jul 28 '11 at 16:42 by mbq
asked Jul 19 '10 at 21:04 by c4il
11 In a way, the measurement you proposed is widely used in case of error (model quality) analysis; then it is called MAE, "mean absolute error". – mbq Jul 19 '10 at 21:30
2 In accepting an answer it seems important to me that we pay attention to whether the answer is circular. The normal distribution is based on these measurements of variance from squared error terms, but that isn't in and of itself a justification for using (X-M)^2 over |X-M|. – rpierce Jul 20 '10 at 7:59
Do you think the term standard means this is THE standard today? Isn't it like asking why principal components are "principal" and not secondary? – robin girard Jul 23 '10 at 21:44
12 Every answer offered so far is circular. They focus on ease of mathematical calculations (which is nice but by no means fundamental) or on properties of the Gaussian (Normal) distribution and OLS. Around 1800 Gauss started with least squares and variance and from those derived the Normal distribution: there's the circularity. A truly fundamental reason that has not been invoked in any answer yet is the unique role played by the variance in the Central Limit Theorem. Another is the importance in decision theory of minimizing quadratic loss. – whuber Sep 13 '13 at 15:28
1 +1 @whuber: Thanks for pointing this out, which was bothering me as well. Now, though, have to go and read up on the Central Limit Theorem! Oh well. ;-) – Sabuncu Feb 11 '14 at 21:55
20 Answers
If the goal of the standard deviation is to summarise the spread of a symmetrical data set (i.e. in general how far each datum is from the mean), then we need a good method of defining how to measure that spread.
The benefits of squaring include:
- Squaring always gives a positive value, so the sum will not be zero.
- Squaring emphasizes larger differences, a feature that turns out to be both good and bad (think of the effect outliers have).
Squaring however does have a problem as a measure of spread and that is that the units are all squared, whereas we might prefer the spread to be in the same units as the original data (think of squared pounds or squared dollars or squared apples). Hence the square root allows us to return to the original units.
I suppose you could say that absolute difference assigns equal weight to the spread of data whereas squaring emphasises the extremes. Technically though, as others have pointed out, squaring makes the algebra much easier to work with and offers properties that the absolute method does not (for example, the variance is equal to the expected value of the square of the distribution minus the square of the mean of the distribution).
It's important to note however that there's no reason you couldn't take the absolute difference if that is your preference on how you wish to view 'spread' (sort of how some people see 5% as some magical threshold for p-values, when in fact it's situation dependent). Indeed, there are in fact several competing methods for measuring spread.
My view is to use the squared values because I like to think of how it relates to the
Pythagorean Theorem of Statistics: c = sqrt(a^2 + b^2) ... this also helps me remember that when working with independent random variables, variances add, standard deviations don't. But that's just my personal subjective preference.
A much more in-depth analysis can be read here.
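The "variances add, standard deviations don't" point can be checked exactly on a toy case. The sketch below is an illustration added for this purpose, not code from the answer: it enumerates the 36 equally likely outcomes of two independent fair dice, and the helper name `variance` is made up.

```python
import math
from itertools import product

def variance(values):
    """Population variance of a list of equally likely outcomes."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

die = [1, 2, 3, 4, 5, 6]
# All 36 equally likely sums of two independent dice.
sums = [a + b for a, b in product(die, die)]

var_one = variance(die)
var_sum = variance(sums)        # twice the variance of a single die
sd_one = math.sqrt(var_one)
sd_sum = math.sqrt(var_sum)     # sqrt(sd_one^2 + sd_one^2), not 2 * sd_one
print(var_one, var_sum, sd_one, sd_sum)
```

The standard deviation of the sum is the "hypotenuse" of the two individual standard deviations, which is the Pythagorean picture described above.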
edited Jul 20 '10 at 14:56
answered Jul 19 '10 at 22:31 by Tony Breyal
31 "Squaring always gives a positive value, so the sum will not be zero." and so do absolute values. – robin girard Jul 22 '10 at 9:54
15 @robin girard: That is correct, hence why I preceded that point with "The benefits of squaring include". I wasn't implying anything about absolute values in that statement. I take your point though, I'll consider removing/rephrasing it if others feel it is unclear. – Tony Breyal Jul 22 '10 at 13:19
8 Much of the field of robust statistics is an attempt to deal with the excessive sensitivity to outliers that is a consequence of choosing the variance as a measure of data spread (technically scale or dispersion). en.wikipedia.org/wiki/Robust_statistics – Thylacoleo Aug 13 '10 at 5:15
Thank you for the link to that analysis – Jack Aidley Jan 23 '13 at 14:03
The article linked to in the answer is a godsend. – traggatmot Mar 19 at 7:27
The squared difference has nicer mathematical properties; it's continuously differentiable (nice when you want to minimize it), it's a sufficient statistic for the Gaussian distribution, and it's (a version of) the L2 norm which comes in handy for proving convergence and so on.
The mean absolute deviation (the absolute value notation you suggest) is also used as a measure of dispersion, but it's not as "well-behaved" as the squared error.
answered Jul 19 '10 at 21:14 by Rich
You said "it's continuously differentiable (nice when you want to minimize it)"; do you mean that the absolute value is difficult to optimize? – robin girard Jul 23 '10 at 21:40
16 @robin: while the absolute value function is continuous everywhere, its first derivative is not (at x=0). This makes analytical optimization more difficult. – Vince Jul 23 '10 at 23:59
1 Yeah, finding quantiles in general (which includes optimizing absolute values) tends to churn up linear programming type problems, which while they're certainly tractable numerically can get fiddly. They typically don't have an analytical closed form solution, and are a bit slower and a bit more difficult to implement than least-squares type solutions. – Rich Jul 24 '10 at 2:55
I do not agree with this. First, theoretically, the problem may be of different nature (because of the discontinuity) but not necessarily harder (for example the median is easily shown to be arginf_m E[|Y-m|]). Second, practically, using an L1 norm (absolute value) rather than an L2 norm makes it piecewise linear and hence at least not more difficult. Quantile regression and its multiple variants is an example of that. – robin girard Jul 24 '10 at 6:01
11 Yes, but finding the actual number you want, rather than just a descriptor of it, is easier under squared error loss. Consider the 1-dimension case: you can express the minimizer of the squared error by the mean: O(n) operations and closed form. You can express the value of the absolute error minimizer by the median, but there's not a closed-form solution that tells you what the median value is; it requires a sort to find, which is something like O(n log n). Least squares solutions tend to be a simple plug-and-chug type operation, absolute value solutions usually require more work to find. – Rich Jul 24 '10 at 9:10
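The mean/median contrast in this exchange can be seen on a tiny example. The following sketch is only an illustration (the data, the grid, and the function names are assumptions, not anything posted in the thread): it confirms by brute-force grid search that the mean minimizes the summed squared error, and shows that with an even number of points the summed absolute error is flat across the whole interval between the two middle values.

```python
import statistics

def sq_loss(xs, c):
    """Sum of squared deviations of xs from the candidate centre c."""
    return sum((x - c) ** 2 for x in xs)

def abs_loss(xs, c):
    """Sum of absolute deviations of xs from the candidate centre c."""
    return sum(abs(x - c) for x in xs)

data = [1.0, 2.0, 4.0, 10.0]        # hypothetical sample
mean = statistics.mean(data)
median = statistics.median(data)

# Brute-force search over a grid of candidate centres from 0.00 to 12.00.
grid = [i / 100 for i in range(0, 1201)]
best_sq = min(grid, key=lambda c: sq_loss(data, c))
min_abs = min(abs_loss(data, c) for c in grid)

print(mean, best_sq)                          # the grid minimizer coincides with the mean
print([abs_loss(data, c) for c in (2.0, 3.0, 4.0)], min_abs)
```

The squared loss has a single minimizer, while every centre between 2 and 4 ties for the smallest absolute loss, which is the non-uniqueness several answers mention.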
One way you can think of this is that standard deviation is similar to a "distance from the mean".
Compare this to distances in euclidean space: this gives you the true distance, where what you suggested (which, btw, is the absolute deviation) is more like a manhattan distance calculation.
answered Jul 19 '10 at 21:14 by Reed Copsey
Yeah. Great analogy. – Daniel Rodriguez Oct 31 '11 at 4:10
This should be modified as minimum distance from the mean. It's essentially a Pythagorean equation. – John Nov 21 '14 at 16:40
The reason that we calculate standard deviation instead of absolute error is that we are assuming error to be normally distributed. It's a part of the model.
Suppose you were measuring very small lengths with a ruler, then standard deviation is a bad metric for error because you know you will never accidentally measure a negative length. A better metric would be one to help fit a Gamma distribution to your measurements:
$$\log(E(x)) - E(\log(x))$$
Like the standard deviation, this is also non-negative (by Jensen's inequality, since $\log$ is concave) and differentiable, but it is a better error statistic for this problem.
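As a rough illustration of that statistic, here is a small Python sketch. It takes the quantity in the orientation log(E[x]) - E[log(x)], which is non-negative by Jensen's inequality; the function name `gamma_spread` and the two sample lists are hypothetical, and the data must be strictly positive for the logarithms to exist.

```python
import math

def gamma_spread(xs):
    """log(mean(x)) - mean(log(x)) for strictly positive data.

    Zero when all values are equal, and larger the more the values
    are spread out on a multiplicative scale.
    """
    mean = sum(xs) / len(xs)
    mean_log = sum(math.log(x) for x in xs) / len(xs)
    return math.log(mean) - mean_log

tight = [0.9, 1.0, 1.1]   # hypothetical measurements, little spread
wide = [0.1, 1.0, 1.9]    # same mean, much more spread
print(gamma_spread(tight), gamma_spread(wide))
```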
edited May 14 '14 at 11:47
answered Aug 10 '10 at 22:34 by Neil G
1 I like your answer. The sd is not always the best statistic. – RockScience Nov 25 '10 at 3:03
1 Great counter-example as to when the standard deviation is not the best way to think of fluctuation sizes. – Hbar May 13 '14 at 2:49
Squaring the difference from the mean has a couple of reasons.
- Variance is defined as the 2nd moment of the deviation (the R.V. here is $(x - \mu)$) and thus the square, as moments are simply the expectations of higher powers of the random variable.
- Having a square as opposed to the absolute value function gives a nice continuous and differentiable function (absolute value is not differentiable at 0), which makes it the natural choice, especially in the context of estimation and regression analysis.
- The squared formulation also naturally falls out of parameters of the Normal Distribution.
answered Jul 19 '10 at 21:15 by KungPaoChicken
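The answer that best satisfied me is that it falls out naturally from the generalization of a sample to n-dimensional euclidean space. It's certainly debatable whether that's something that should be done, but in any case:
Assume your $n$ measurements $X_i$ are each an axis in $\mathbb{R}^n$. Then your data $x_i$ define a point $\mathbf{x}$ in that space. Now you might notice that the data are all very similar to each other, so you can represent them with a single location parameter $\mu$ that is constrained to lie on the line defined by $X_i = \mu$. Projecting your datapoint onto this line gets you $\hat\mu = \bar{x}$, and the distance from the projected point $\hat\mu\mathbf{1}$ to the actual datapoint is $\sqrt{n-1}\,\hat\sigma = \|\mathbf{x} - \hat\mu\mathbf{1}\|$ (with $\hat\sigma$ the usual sample standard deviation, computed with the $n-1$ denominator).
This approach also gets you a geometric interpretation for correlation, $\hat\rho = \cos\angle(\tilde{\mathbf{x}}, \tilde{\mathbf{y}})$.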
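Both geometric statements can be checked numerically. The sketch below is an added illustration under stated assumptions: the two data vectors are made up, the sample standard deviation uses the n-1 denominator (`statistics.stdev`), and the tilde vectors are taken to be the mean-centred data.

```python
import math
import statistics

x = [2.0, 4.0, 4.0, 5.0, 10.0]   # hypothetical measurements
y = [1.0, 3.0, 2.0, 6.0, 8.0]
n = len(x)

def centered(v):
    """Subtract the mean: the residual after projecting onto the line X_i = mu."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

xc, yc = centered(x), centered(y)

# Distance from the data point to its projection on the line X_i = mu.
distance = norm(xc)
scaled_sd = math.sqrt(n - 1) * statistics.stdev(x)

# Cosine of the angle between the centred vectors ...
cos_angle = sum(a * b for a, b in zip(xc, yc)) / (norm(xc) * norm(yc))
# ... versus the Pearson correlation computed as covariance / (sd_x * sd_y).
cov = sum(a * b for a, b in zip(xc, yc)) / (n - 1)
pearson_r = cov / (statistics.stdev(x) * statistics.stdev(y))

print(distance, scaled_sd, cos_angle, pearson_r)
```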
answered Nov 24 '10 at 20:49 by sesqu
This is correct and appealing. However, in the end it appears only to rephrase the question without actually answering it: namely, why should we use the Euclidean (L2) distance? – whuber Nov 24 '10 at 21:07
That is indeed an excellent question, left unanswered. I used to feel strongly that the use of L2 is unfounded. After having studied a little statistics, I saw the analytic niceties, and since then have revised my viewpoint into "if it really matters, you're probably in deep water already, and if not, easy is nice". I don't know measure theory yet, and worry that analysis rules there too, but I've noticed some new interest in combinatorics, so perhaps new niceties have been/will be found. – sesqu Nov 24 '10 at 21:39
14 @sesqu Standard deviations did not become commonplace until Gauss in 1809 derived his eponymous deviation using squared error, rather than absolute error, as a starting point. However, what pushed them over the top (I believe) was Galton's regression theory (at which you hint) and the ability of ANOVA to decompose sums of squares, which amounts to a restatement of the Pythagorean Theorem, a relationship enjoyed only by the L2 norm. Thus the SD became a natural omnibus measure of spread advocated in Fisher's 1925 "Statistical Methods for Research Workers" and here we are, 85 years later. – whuber Nov 24 '10 at 21:56
10 (+1) Continuing in @whuber's vein, I would bet that had Student published a paper in 1908 entitled, "Probable Error of the Mean - Hey, Guys, Check Out That MAE in the Denominator!" then statistics would have an entirely different face by now. Of course, he didn't publish a paper like that, and of course he couldn't have, because the MAE doesn't boast all the nice properties that S^2 has. One of them (related to Student) is its independence of the mean (in the normal case), which of course is a restatement of orthogonality, which gets us right back to L2 and the inner product. – G. Jay Kerns Nov 25 '10 at 3:38
Yet another reason (in addition to the excellent ones above) comes from Fisher himself, who showed that the standard deviation is more "efficient" than the absolute deviation. Here, efficient has to do with how much a statistic will fluctuate in value on different samplings from a population. If your population is normally distributed, the standard deviation of various samples from that population will, on average, tend to give you values that are pretty similar to each other, whereas the absolute deviation will give you numbers that spread out a bit more. Now, obviously this is in ideal circumstances, but this reason convinced a lot of people (along with the math being cleaner), so most people worked with standard deviations.
answered Jul 27 '10 at 1:51 by Eric Suh
3 Your argument depends on the data being normally distributed. If we assume the population to have a "double exponential" distribution, then the absolute deviation is more efficient (in fact it is a sufficient statistic for the scale) – probabilityislogic Jul 16 '11 at 5:08
3 Yes, as I stated, "if your population is normally distributed." – Eric Suh Sep 8 '11 at 19:49
Just so people know, there is a Math Overflow question on the same topic.
Why is it so cool to square numbers in terms of finding the standard deviation
The take-away message is that using the square root of the variance leads to easier maths. A similar response is given by Rich and Reed above.
answered Jul 26 '10 at 22:22 by Robby McKilliam
There are many reasons; probably the main one is that it works well as a parameter of the normal distribution.
edited Apr 27 '13 at 14:09
answered Jul 19 '10 at 21:11 by mbq
4 I agree. Standard deviation is the right way to measure dispersion if you assume normal distribution. And a lot of distributions and real data are approximately normal. – Łukasz Lew Jul 20 '10 at 14:40
2 I don't think you should say "natural parameter": the natural parameters of the normal distribution are mean and mean times precision. (en.wikipedia.org/wiki/Natural_parameter) – Neil G Mar 12 '12 at 7:40
@NeilG Good point; I was thinking about "casual" meaning here. I'll think about some better word. – mbq Mar 12 '12 at 10:41
I think the contrast between using absolute deviations and squared deviations becomes clearer once you move beyond a single variable and think about linear regression. There's a nice discussion at http://en.wikipedia.org/wiki/Least_absolute_deviations, particularly the section "Contrasting Least Squares with Least Absolute Deviations", which links to some student exercises with a neat set of applets at http://www.math.wpi.edu/Course_Materials/SAS/lablets/7.3/73_choices.html.
To summarise, least absolute deviations is more robust to outliers than ordinary least squares, but it can be unstable (a small change in even a single datum can give a big change in the fitted line) and doesn't always have a unique solution: there can be a whole range of fitted lines. Also least absolute deviations requires iterative methods, while ordinary least squares has a simple closed-form solution, though that's not such a big deal now as it was in the days of Gauss and Legendre, of course.
answered Aug 12 '10 at 12:00 by onestop
The "unique solution" argument is quite weak; it really means there is more than one value well supported by the data. Additionally, penalisation of the coefficients, such as L2, will resolve the uniqueness problem, and the stability problem to a degree as well. – probabilityislogic Jul 4 '14 at 11:13
In many ways, the use of standard deviation to summarize dispersion is jumping to a conclusion. You could say that SD implicitly assumes a symmetric distribution because of its equal treatment of distance below the mean as of distance above the mean. The SD is surprisingly difficult to interpret to non-statisticians. One could argue that Gini's mean difference has broader application and is significantly more interpretable. It does not require one to declare their choice of a measure of central tendency as the use of SD does for the mean. Gini's mean difference is the average absolute difference between any two different observations. Besides being robust and easy to interpret it happens to be 0.98 as efficient as SD if the distribution were actually Gaussian.
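Since Gini's mean difference is defined purely in terms of pairs of observations, it is short to compute directly. This is a minimal sketch under that definition (average absolute difference over all pairs of distinct observations); the function name and the data are illustrative assumptions, and the O(n^2) pairwise loop is chosen for clarity rather than speed.

```python
from itertools import combinations

def gini_mean_difference(xs):
    """Average absolute difference over all pairs of distinct observations."""
    pairs = list(combinations(xs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

data = [1.0, 2.0, 4.0, 7.0]   # hypothetical sample
gmd = gini_mean_difference(data)
print(gmd)
```

Note that no mean (or any other measure of central tendency) appears anywhere in the computation.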
answered May 14 '14 at 12:55 by Frank Harrell
2 Just to add to @Frank's suggestion on Gini, there's a nice paper here: projecteuclid.org/download/pdf_1/euclid.ss/1028905831 It goes over various measures of dispersion and also gives an informative historical perspective. – Thomas Speidel May 14 '14 at 17:06
1 I like these ideas too, but there's a less well known parallel definition of the variance (and thus the SD) that makes no reference to means as location parameters. The variance is half the mean square over all the pairwise differences between values, just as the Gini mean difference is based on the absolute values of all the pairwise differences. – Nick Cox Oct 21 '14 at 23:46
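The pairwise definition of the variance mentioned in this comment is easy to verify numerically. The sketch below is an added illustration with made-up data; it assumes the average is taken over pairs of distinct observations, in which case the result matches the usual n-1 denominator sample variance.

```python
import statistics
from itertools import combinations

def half_mean_sq_pairwise(xs):
    """Half the mean squared difference over all pairs of distinct observations."""
    pairs = list(combinations(xs, 2))
    return 0.5 * sum((a - b) ** 2 for a, b in pairs) / len(pairs)

data = [1.0, 2.0, 4.0, 7.0]   # hypothetical sample
pairwise_var = half_mean_sq_pairwise(data)
sample_var = statistics.variance(data)   # n - 1 denominator
print(pairwise_var, sample_var)
```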
Because squares can allow use of many other mathematical operations or functions more easily than absolute values.
Example: squares can be integrated, differentiated, can be used in trigonometric, logarithmic and other functions, with ease.
answered Jul 27 '10 at 0:24 by user369
1 I wonder if there is a self-fulfilling prophecy here. We get – probabilityislogic Mar 13 '12 at 12:04
Naturally you can describe dispersion of a distribution in any way meaningful (absolute deviation, quantiles, etc.).
One nice fact is that the variance is the second central moment, and every distribution is uniquely described by its moments if they exist. Another nice fact is that the variance is much more tractable mathematically than any comparable metric. Another fact is that the variance is one of two parameters of the normal distribution for the usual parametrization, and the normal distribution only has 2 non-zero central moments which are those two very parameters. Even for non-normal distributions it can be helpful to think in a normal framework.
As I see it, the reason the standard deviation exists as such is that in applications the square root of the variance regularly appears (such as to standardize a random variable), which necessitated a name for it.
answered Jul 27 '10 at 4:04 by arik
If I recall correctly, isn't the log-normal distribution not uniquely defined by its moments. – probabilityislogic Apr 10 '14 at 13:38
Variances are additive: for independent random variables $X_1, \ldots, X_n$,
$$\operatorname{var}(X_1 + \cdots + X_n) = \operatorname{var}(X_1) + \cdots + \operatorname{var}(X_n).$$
Notice what this makes possible: Say I toss a fair coin 900 times. What's the probability that the number of heads I get is between 440 and 455 inclusive? Just find the expected number of heads ($450$), and the variance of the number of heads ($225 = 15^2$), then find the probability that a normal (or Gaussian) random variable with expectation $450$ and standard deviation $15$ is between $439.5$ and $455.5$. Abraham de Moivre did this with coin tosses in the 18th century, thereby first showing that the bell-shaped curve is worth something.
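The coin-toss calculation can be carried out in a few lines. The sketch below is an added illustration, not the answerer's code: it evaluates the normal approximation with the continuity-corrected limits 439.5 and 455.5 using `math.erf`, and compares it with the exact binomial sum (which needs `math.comb`, available from Python 3.8). The helper name `normal_cdf` is made up.

```python
import math

n, p = 900, 0.5
mean = n * p                          # expected number of heads: 450
sd = math.sqrt(n * p * (1 - p))       # sqrt(225) = 15

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Normal approximation with continuity correction.
approx = normal_cdf(455.5, mean, sd) - normal_cdf(439.5, mean, sd)

# Exact binomial probability of 440..455 heads inclusive.
exact = sum(math.comb(n, k) for k in range(440, 456)) / 2 ** n

print(approx, exact)
```

The two numbers agree closely, and the only inputs the approximation needed were the mean and the variance, the latter obtained by adding up 900 per-toss variances of 1/4.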
answered Sep 18 '12 at 1:41 by Michael Hardy
Are mean absolute deviations not additive in the same way as variances? – rpierce Feb 9 '13 at 23:30
2 No, they're not. – Michael Hardy Feb 10 '13 at 18:14
It depends on what you are talking about when you say "spread of the data". To me this could mean two things:
1. The width of a sampling distribution
2. The accuracy of a given estimate
For point 1) there is no particular reason to use the standard deviation as a measure of spread, except for when you have a normal sampling distribution. The measure $E(|X - \mu|)$ is a more appropriate measure in the case of a Laplace sampling distribution. My guess is that the standard deviation gets used here because of intuition carried over from point 2). Probably also due to the success of least squares modelling in general, for which the standard deviation is the appropriate measure. Probably also because calculating $E(X^2)$ is generally easier than calculating $E(|X|)$ for most distributions.
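Now, for point 2) there is a very good reason for using the variance/standard deviation as the measure of spread, in one particular, but very common case. You can see it in the Laplace approximation to a posterior. With data $D$ and prior information $I$, write the posterior for a parameter $\theta$ as:
$$p(\theta \mid DI) = \frac{\exp(h(\theta))}{\int \exp(h(t))\,dt}, \qquad h(\theta) \equiv \log\left[p(\theta \mid I)\,p(D \mid \theta I)\right]$$
Here $t$ is a dummy variable of integration, so the denominator does not depend on $\theta$. Expanding $h$ to second order about its maximum $\theta_{\max}$ gives
$$h(\theta) \approx h(\theta_{\max}) + (\theta - \theta_{\max})\,h'(\theta_{\max}) + \frac{1}{2}(\theta_{\max} - \theta)^2\,h''(\theta_{\max})$$
At a well rounded maximum we have $h'(\theta_{\max}) = 0$, so we are left with
$$h(\theta) \approx h(\theta_{\max}) + \frac{1}{2}(\theta_{\max} - \theta)^2\,h''(\theta_{\max})$$
If we plug in this approximation we get:
$$p(\theta \mid DI) \approx \frac{\exp\left(h(\theta_{\max}) + \frac{1}{2}(\theta_{\max} - \theta)^2 h''(\theta_{\max})\right)}{\int \exp\left(h(\theta_{\max}) + \frac{1}{2}(\theta_{\max} - t)^2 h''(\theta_{\max})\right) dt} = \frac{\exp\left(\frac{1}{2}(\theta_{\max} - \theta)^2 h''(\theta_{\max})\right)}{\int \exp\left(\frac{1}{2}(\theta_{\max} - t)^2 h''(\theta_{\max})\right) dt}$$
Which, but for notation, is a normal distribution, with mean equal to $E(\theta \mid DI) \approx \theta_{\max}$, and variance equal to
$$V(\theta \mid DI) \approx \left[-h''(\theta_{\max})\right]^{-1}$$
($-h''(\theta_{\max})$ is always positive because we have a well rounded maximum). So this means that in "regular problems" (which is most of them), the variance is the fundamental quantity which determines the accuracy of estimates for $\theta$. So for estimates based on a large amount of data, the standard deviation makes a lot of sense theoretically; it tells you basically everything you need to know. Essentially the same argument applies (with the same conditions required) in the multi-dimensional case with $h''(\theta)_{jk} = \frac{\partial^2 h(\theta)}{\partial\theta_j\,\partial\theta_k}$ being a Hessian matrix. The diagonal entries are also essentially variances here too.
The frequentist using the method of maximum likelihood will come to essentially the same conclusion because the MLE tends to be a weighted combination of the data, and for large samples the Central Limit Theorem applies and you basically get the same result if we take $p(\theta \mid I) = 1$ but with $\theta$ and $\theta_{\max}$ interchanged:
$$p(\theta_{\max} \mid \theta) \approx N\left(\theta, \left[-h''(\theta_{\max})\right]^{-1}\right)$$
(see if you can guess which paradigm I prefer :P). So either way, in parameter estimation the standard deviation is an important theoretical measure of spread.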
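To make the Laplace approximation concrete, here is a small worked case that is not in the answer itself: a success probability with a flat prior and made-up counts (300 successes in 1000 trials). Under those assumptions h(theta) = k log(theta) + (n - k) log(1 - theta), its maximum is at k/n, and the exact posterior is a Beta(k+1, n-k+1) distribution, so the approximate variance [-h''(theta_max)]^(-1) can be compared against the exact one.

```python
n, k = 1000, 300   # hypothetical data: 300 successes in 1000 trials

def h_second(theta):
    """Second derivative of h(theta) = k*log(theta) + (n-k)*log(1-theta)."""
    return -k / theta ** 2 - (n - k) / (1 - theta) ** 2

theta_max = k / n                              # where h'(theta) = 0
laplace_var = 1.0 / (-h_second(theta_max))     # [-h''(theta_max)]^(-1)

# Exact posterior under a flat prior is Beta(a, b) with a = k+1, b = n-k+1.
a, b = k + 1, n - k + 1
exact_mean = a / (a + b)
exact_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(theta_max, exact_mean)
print(laplace_var, exact_var)
```

With this much data the approximate mean and variance land very close to the exact posterior values, which is the sense in which the variance "tells you basically everything you need to know" in regular problems.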
edited Jul 4 '14 at 14:29 by Michael Hardy
answered Jul 16 '11 at 14:37 by probabilityislogic
Estimating the standard deviation of a distribution requires choosing a distance.
Any of the following distances can be used:
$$d_n\left((X)_{i=1,\ldots,I}, \mu\right) = \left(\sum |X - \mu|^n\right)^{1/n}$$
We usually use the natural euclidean distance ($n = 2$), which is the one everybody uses in daily life. The distance that you propose is the one with $n = 1$.
Both are good candidates but they are different.
One could decide to use $n = 3$ as well.
I am not sure that you will like my answer; my point, contrary to others, is not to demonstrate that $n = 2$ is better. I think that if you want to estimate the standard deviation of a distribution, you can absolutely use a different distance.
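The family of distances in this answer is simple to compute for any exponent. In the sketch below (an added illustration with hypothetical data and function name) the exponent is called `p` rather than `n`, purely to avoid confusing it with a sample size; as in the answer's formula, no division by the number of observations is applied.

```python
def power_deviation(xs, mu, p):
    """(sum |x - mu|^p)^(1/p): the answer's d_n with exponent n written as p."""
    return sum(abs(x - mu) ** p for x in xs) ** (1.0 / p)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
mu = sum(data) / len(data)                        # 5.0

d1 = power_deviation(data, mu, 1)   # Manhattan-type distance (absolute values)
d2 = power_deviation(data, mu, 2)   # Euclidean distance (squares)
d3 = power_deviation(data, mu, 3)   # one of the other possible choices
print(d1, d2, d3)
```

All three are legitimate measures of how far the data sit from the centre; they simply weight large deviations differently.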
edited Jul 31 '14 at 17:00 by Michael Hardy
answered Nov 25 '10 at 3:01 by RockScience
Did you mean $n = 1$ instead of the (undefined) $n = 0$? – whuber Jan 5 '11 at 3:25
Yes indeed, thx – RockScience Jan 5 '11 at 3:40
My guess is this: Most populations (distributions) tend to congregate around the mean. The farther a value is from the mean, the rarer it is. In order to adequately express how "out of line" a value is, it is necessary to take into account both its distance from the mean and its (normally speaking) rareness of occurrence. Squaring the difference from the mean does this, as compared to values which have smaller deviations. Once all the variances are averaged, then it is OK to take the square root, which returns the units to their original dimensions.
answered Sep 13 '13 at 2:24 by Samuel Berry
2 This doesn't explain why you couldn't just take the absolute value of the difference. That seems conceptually simpler to most stats 101 students, & it would "take into account both its distance from the mean and its (normally speaking) rareness of occurrence". – gung Sep 13 '13 at 2:35
I think the absolute value of the difference would only express the difference from the mean and would not take into account the fact that large differences are doubly disruptive to a normal distribution. – Samuel Berry Sep 13 '13 at 2:44
1 Why is "doubly disruptive" important and not, say, "triply disruptive" or "quadruply disruptive"? It looks like this answer merely replaces the original question with an equivalent question. – whuber Sep 13 '13 at 15:19
"Why square the difference" instead of "taking absolute value"? To answer very exactly, there is literature that gives the reasons it was adopted and the case for why most of those reasons do not hold. "Can't we simply take the absolute value...?" I am aware of literature in which the answer is yes, it is being done, and doing so is argued to be advantageous.
Author Gorard states, first, using squares was previously adopted for reasons of simplicity of calculation but that those original reasons no longer hold. Gorard states, second, that OLS was adopted because Fisher found that results in samples of analyses that used OLS had smaller deviations than those that used absolute differences (roughly stated). Thus, it would seem that OLS may have benefits in some ideal circumstances; however, Gorard proceeds to note that there is some consensus (and he claims Fisher agreed) that under real world conditions (imperfect measurement of observations, non-uniform distributions, studies of a population without inference from a sample), using squares is worse than absolute differences.
Gorard's response to your question "Can't we simply take the absolute value of the difference instead and get the expected value (mean) of those?" is yes. Another advantage is that using differences produces measures (measures of errors and variation) that are related to the ways we experience those ideas in life. Gorard says imagine people who split the restaurant bill evenly and some might intuitively notice that that method is unfair. Nobody there will square the errors; the differences are the point.
Finally, using absolute differences, he notes, treats each observation equally, whereas by contrast squaring the differences gives observations predicted poorly greater weight than observations predicted well, which is like allowing certain observations to be included in the study multiple times. In summary, his general thrust is that there are today not many winning reasons to use squares and that by contrast using absolute differences has advantages.
References:
Gorard, S. (2005). Revisiting a 90-year-old debate: the advantages of the mean deviation, British Journal of Educational Studies, 53, 4, pp. 417-430.
Gorard, S. (2013). The possible advantages of the mean absolute deviation effect size, Social Research Update, 65:1.
edited Jul 14 '14 at 2:57 by gung
answered Jul 14 '14 at 2:13 by Jen
When adding random variables, their variances add, for all distributions. Variance (and therefore standard deviation) is a useful measure for almost all distributions, and is in no way limited to gaussian (aka "normal") distributions. That favors using it as our error measure. Lack of uniqueness is a serious problem with absolute differences, as there are often an infinite number of equal-measure "fits", and yet clearly the "one in the middle" is most realistically favored. Also, even with today's computers, computational efficiency matters. I work with large data sets, and CPU time is important. However, there is no single absolute "best" measure of residuals, as pointed out by some previous answers. Different circumstances sometimes call for different measures.
answered Oct 21 '14 at 23:27 by Eric L. Michelsen
1 I remain unconvinced that variances are very useful for asymmetric distributions. – Frank Harrell Oct 22 '14 at 12:58
Squaring amplifies larger deviations.
If your sample has values that are all over the chart then to bring the 68.2% within the first standard deviation your standard deviation needs to be a little wider. If your data tended to all fall around the mean then $\sigma$ can be tighter.
Some say that it is to simplify calculations. Using the positive square root of the square would have solved that so that argument doesn't float.
$$|x| = \sqrt{x^2}$$
So if algebraic simplicity was the goal then it would have looked like this:
$$\sigma = E\left[\sqrt{(x - \mu)^2}\right]$$
which yields the same results as $E\left[|x - \mu|\right]$.
Obviously squaring this also has the effect of amplifying outlying errors (doh!).
edited Jul 28 '14 at 22:46 by Alexis
answered Jul 28 '14 at 20:57 by Preston Thayne
Based on a flag I just processed, I suspect the downvoter did not completely understand how this answer responds to the question. I believe I see the connection (but you might nevertheless consider making some edits to help other readers appreciate your points better). Your first paragraph, though, strikes me as being somewhat of a circular argument: the 68.2% value is derived from properties of the standard deviation, so how does invoking that number help justify using the SD instead of some other Lp norm of deviations from the mean as a way to quantify the spread of a distribution? – whuber Jul 28 '14 at 21:20
The first paragraph was the reason for my downvote. – Alexis Jul 28 '14 at 22:45
2 @Preston Thayne: Since the standard deviation is not the expected value of sqrt((x-mu)^2), your formula is misleading. In addition, just because squaring has the effect of amplifying larger deviations does not mean that this is the reason for preferring the variance over the MAD. If anything, that is a neutral property since oftentimes we want something more robust like the MAD. Lastly, the fact that the variance is more mathematically tractable than the MAD is a much deeper issue mathematically than you've conveyed in this post. – Steve S Jul 29 '14 at 2:18
protected by whuber Oct 22 '14 at 3:46