
Andreas Toprac

Data Exploration Mini Project
The data I collected was quantitative and on an ordinal scale of 1-10. The shape of the
plot was unimodal with a left skew. My data is taken from a sample of 39 students at LASA
who answered the question "How awesome is your hair?" on a scale from 1-10 (1 = least
awesome and 10 = most awesome). I used this measurement because it is very simple and it did
not allow people to insert any number they wanted, because I knew people would enter
ridiculous values. I collected my data by posting a Google Form on the LASA Stats 2016
Facebook group. Of the 67 members, 39 chose to answer my question. I collected data this way
because all of the people in the group are in Stats, so it was an easy way to have access to a good
number of people willing to help out and answer my question. Plus, most of the people in the
group were doing the same data collection project at the same time, so more people were willing
to take some time to answer everyone's questions.

Original Data Set
The sample size of my data was 39, because the data set has 39 values in it. The five
number summary is 3, 7, 8, 10, 10, which I found using the fivenum function in R. The mean
is 8.10, which I found using the mean function in R. The median is 8, which I found using the
median function in R. The range is 7, which I found by subtracting the smallest value (3) from
the biggest value (10). The standard deviation is 1.79, which I found using the sd function in
R. The variance is 3.20, which I found using the var function in R. Lastly, the IQR is 3,
which I found by subtracting Q1 (7) from Q3 (10). There are no outliers in my data, because the
IQR (3) multiplied by 1.5 is equal to 4.5, and none of my data values are more than 4.5 units
above Q3 (above 14.5) or more than 4.5 units below Q1 (below 2.5).
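
The outlier check described above can be sketched in R as follows. The raw Hair data frame is not included in this report, so the vector `hair` here is an assumed reconstruction from the counts shown in the stemplot, and the variable names (`hair`, `lower_fence`, `upper_fence`) are illustrative only.

```r
# Assumed reconstruction of the 39 responses from the stemplot counts:
# one 3, one 4, one 5, four 6s, five 7s, nine 8s, seven 9s, eleven 10s.
hair <- rep(3:10, times = c(1, 1, 1, 4, 5, 9, 7, 11))

# fivenum() returns min, lower hinge (Q1), median, upper hinge (Q3), max.
five <- fivenum(hair)
q1 <- five[2]
q3 <- five[4]
iqr <- q3 - q1

# 1.5 * IQR rule: anything outside these fences counts as an outlier.
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

outliers <- hair[hair < lower_fence | hair > upper_fence]
print(c(lower = lower_fence, upper = upper_fence))
print(length(outliers))  # number of values outside the fences
```

Because the smallest value (3) is above the lower fence and the largest value (10) is below the upper fence, no value is flagged.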


Awesomeness of LASA Students' Hair Stemplot (1 = least, 10 = best)
  The decimal point is at the |

   3 | 0
   4 | 0
   5 | 0
   6 | 0000
   7 | 00000
   8 | 000000000
   9 | 0000000
  10 | 00000000000

This is the R code that was used to calculate all of the values and graphs listed above.
> fivenum(Hair$Hair)
[1]  3  7  8 10 10
> mean(Hair$Hair)
[1] 8.102564
> median(Hair$Hair)
[1] 8
> range(Hair$Hair)
[1]  3 10
> sd(Hair$Hair)
[1] 1.788779
> var(Hair$Hair)
[1] 3.19973
> 3 * 1.5
[1] 4.5
> 10 + 4.5
[1] 14.5
> 7 - 4.5
[1] 2.5
> hist(Hair$Hair, xlab = "Awesomeness of LASA Student's Hair (1=least 10=best)",
       main = "Awesomeness of LASA Student's Hair Histogram")
> boxplot(Hair$Hair, ylab = "Awesomeness of LASA Student's Hair (1=least 10=best)",
          main = "Awesomeness of LASA Student's Hair Boxplot")
> stem(Hair$Hair)


Original Data Set Plus 100
After adding 100 to all of the values in my data set, the sample size is still 39. The five
number summary is 103, 107, 108, 110, 110, the mean is 108.10, the median is 108, the range is
7, the standard deviation is 1.79, the variance is 3.20, and the IQR is 3. As compared to the first
calculation, the standard deviation stayed exactly the same while both the mean and median went
up by 100 units. The graphs did not change in shape at all.
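
The effect of adding a constant can be checked with a short R sketch. It assumes the same reconstruction of the responses from the stemplot counts (the original Hair2 data frame is not included here), and `summarise_hair` is a hypothetical helper written only for this illustration.

```r
# Assumed reconstruction of the 39 responses from the stemplot counts.
hair <- rep(3:10, times = c(1, 1, 1, 4, 5, 9, 7, 11))
shifted <- hair + 100

# Hypothetical helper: measures of center and spread in one named vector.
summarise_hair <- function(x) {
  c(mean   = mean(x),
    median = median(x),
    sd     = sd(x),
    var    = var(x),
    range  = diff(range(x)),               # max - min
    iqr    = diff(fivenum(x)[c(2, 4)]))    # Q3 - Q1 (hinges)
}

# Difference between the shifted and original summaries:
# measures of center move by the constant, measures of spread do not move.
change <- round(summarise_hair(shifted) - summarise_hair(hair), 2)
print(change)
```

Adding a constant slides every value by the same amount, so the distances between values, and therefore every measure of spread, are unchanged.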


Awesomeness of LASA Students' Hair Stemplot (101 = least, 110 = best)
  The decimal point is at the |

  103 | 0
  104 | 0
  105 | 0
  106 | 0000
  107 | 00000
  108 | 000000000
  109 | 0000000
  110 | 00000000000

This is the R code that was used to calculate all of the values and graphs listed above.
> fivenum(Hair2$SecHair)
[1] 103 107 108 110 110
> mean(Hair2$SecHair)
[1] 108.1026
> median(Hair2$SecHair)
[1] 108
> range(Hair2$SecHair)
[1] 103 110
> sd(Hair2$SecHair)
[1] 1.788779
> var(Hair2$SecHair)
[1] 3.19973
> hist(Hair2$SecHair, xlab = "Awesomeness of LASA Student's Hair (101=least 110=best)",
       main = "Awesomeness of LASA Student's Hair Histogram")
> boxplot(Hair2$SecHair, ylab = "Awesomeness of LASA Student's Hair (101=least 110=best)",
          main = "Awesomeness of LASA Student's Hair Boxplot")
> stem(Hair2$SecHair)


Original Data Set Increased by 50%
After increasing all of the values in my original data set by 50%, the sample size is still
39. The five number summary is 4.5, 10.5, 12, 15, 15, the mean is 12.15, the median is 12, the
range is 10.5, the standard deviation is 2.68, the variance is 7.20, and the IQR is 4.5. As
compared to the first calculation, the standard deviation, mean, and median all increased by 50%
from the original data set. The graphs changed slightly in shape, but they were still unimodal
with a left skew.
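
The effect of multiplying by a constant can be sketched the same way. This again assumes the responses reconstructed from the stemplot counts rather than the original Hair3 data frame, and the name `ratios` is illustrative.

```r
# Assumed reconstruction of the 39 responses from the stemplot counts.
hair <- rep(3:10, times = c(1, 1, 1, 4, 5, 9, 7, 11))
scaled <- hair * 1.5   # a 50% increase

# Ratio of each statistic after scaling to the same statistic before scaling.
ratios <- c(
  mean   = mean(scaled) / mean(hair),
  median = median(scaled) / median(hair),
  sd     = sd(scaled) / sd(hair),
  range  = diff(range(scaled)) / diff(range(hair)),
  iqr    = diff(fivenum(scaled)[c(2, 4)]) / diff(fivenum(hair)[c(2, 4)]),
  var    = var(scaled) / var(hair)
)
print(round(ratios, 2))
```

Every statistic measured in the original units scales by 1.5, while the variance, which is in squared units, scales by 1.5 squared (2.25). That is consistent with the variance going from 3.20 to 7.20.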


Awesomeness of LASA Students' Hair Stemplot (1.5 = least, 15 = best)
  The decimal point is at the |

   4 | 5
   6 | 05
   8 | 0000
  10 | 55555
  12 | 0000000005555555
  14 | 00000000000

This is the R code that was used to calculate all of the values and graphs listed above.
> fivenum(Hair3$ThrHair)
[1]  4.5 10.5 12.0 15.0 15.0
> mean(Hair3$ThrHair)
[1] 12.15385
> median(Hair3$ThrHair)
[1] 12
> range(Hair3$ThrHair)
[1]  4.5 15.0
> sd(Hair3$ThrHair)
[1] 2.683168
> var(Hair3$ThrHair)
[1] 7.199393
> hist(Hair3$ThrHair, xlab = "Awesomeness of LASA Student's Hair (1.5=least 15=best)",
       main = "Awesomeness of LASA Student's Hair Histogram")
> boxplot(Hair3$ThrHair, ylab = "Awesomeness of LASA Student's Hair (1.5=least 15=best)",
          main = "Awesomeness of LASA Student's Hair Boxplot")
> stem(Hair3$ThrHair)


Assume the Data is Normally Distributed
If my data were normally distributed, about 0.3% of the data would be more than 5 units above
the mean (z = 5 / 1.79 = 2.79). In the actual data the figure is 0%, because the mean is 8.1 and
the max value is 10. The percent of data between 3 units below the mean of 8.1 and 2 units above
the mean is 82.3%. The value required to be in the top 10% was 10.39.

Calculations:
(5.1 - 8.1) / 1.79 = -1.68 -> .046
(10.1 - 8.1) / 1.79 = 1.12 -> .869
.869 - .046 = .823

1.28 = (x - 8.1) / 1.79, so x = 10.39
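
These normal-model calculations can also be done directly in R with pnorm and qnorm instead of a z-table. This is a minimal sketch using the rounded mean (8.1) and standard deviation (1.79) from this report. The variable names are illustrative, and results may differ slightly from table lookups because z is not rounded to two decimals.

```r
mu <- 8.1      # mean, as rounded in the report
sigma <- 1.79  # standard deviation, as rounded in the report

# Proportion more than 5 units above the mean (upper tail).
p_above <- pnorm(mu + 5, mean = mu, sd = sigma, lower.tail = FALSE)

# Proportion between 3 units below and 2 units above the mean:
# the difference of the two cumulative probabilities.
p_between <- pnorm(mu + 2, mean = mu, sd = sigma) -
  pnorm(mu - 3, mean = mu, sd = sigma)

# Value that cuts off the top 10% (the 90th percentile).
cutoff <- qnorm(0.90, mean = mu, sd = sigma)

print(round(c(p_above = p_above, p_between = p_between, cutoff = cutoff), 4))
```

The probability of falling between two values is found by subtracting the lower cumulative probability from the upper one, since each pnorm value already counts everything to its left.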

Conclusion
In conclusion, I found that the majority of the LASA students who responded are confident in
their hair. This is supported by the fact that both the mean (8.1) and median (8) of the data
were above 5, which represented a neutral opinion on the awesomeness of one's hair. The data was
also skewed left, showing that most of the people answered with high responses on the
awesomeness of their hair.
