Good Morning

STATISTICS IN ORTHODONTICS

Contents
Introduction
History of biostatistics
Uses of biostatistics
Basis for statistical analysis
Common Statistical Terms
Measures that are used to evaluate screening
Data collection
Types of scales in statistics

Data presentation
Measures of central tendency
Types of variability
Measures of variation or dispersion
Standard error
Coefficient of variation
Normal curve

Introduction
The word "statistics" comes from the Italian word statista, meaning statesman, or the German word Statistik, which means political state.
Statistics is the science of compiling, classifying and tabulating numerical data and expressing the results in a mathematical form.
Biostatistics is that branch of statistics concerned with mathematical facts and data related to biological events.

"It has been said: when you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind."
- Lord Kelvin

A statistic or datum is a measured or counted fact or piece of information stated as a figure, such as the height of one person, the birth weight of a baby, etc.

History of biostatistics
During the outbreak of plague in England in 1532, weekly death statistics began to be published.

This practice continued, and by 1632 these Bills of Mortality listed births and deaths by sex.

- In 1662, Capt. John Graunt used 30 years of these bills to make predictions about the number of people who would die from various diseases and the proportions of male and female births that could be expected.

- He is regarded as the father of health statistics.

Uses of biostatistics
1. To test whether the difference between populations is real or due to chance.
2. To study the correlation between attributes in the same population.
3. To evaluate the efficacy of vaccines.
4. To measure mortality and morbidity.
5. To evaluate the achievements of public health programs.
6. To fix priorities in public health programs.
7. To help promote health legislation and create administrative standards for oral health.

Basis for statistical analysis

The population (U) that is of interest
The set of characteristics of the units of this population (V)
The probability distribution (P) of these characteristics in the population

Classification of variables

Independent variables
Dependent variables
Confounding or intervening variables
Background variables

Independent variables
Variables that are manipulated in a study in order to see what effect differences in them will have on those variables proposed as being dependent on them,
i.e. cause / risk factor.

Dependent variables
Variables in which changes are the result of the level or amount of the independent variable or variables,
i.e. effect / outcome / disease.

Confounding or intervening variables
Variables that should be studied because they may influence or confound the effect of the independent variables on the dependent variables.
E.g. tobacco and oral cancer, with nutritional status as the confounder.

Background variables
Variables that are so often of relevance in investigations of groups or populations that they should be considered for possible inclusion in the study,
i.e. sex, age, marital status, social status.

Common Statistical Terms

Constant
Quantities that do not vary, e.g. in biostatistics the mean and standard deviation are considered constant for a population.

Variable
A characteristic which takes different values for different persons, places or things, such as height, weight, blood pressure.

Population
A population includes every individual, event and object under study. It may be finite or infinite.

Sample
Defined as a part of a population, generally selected so as to be representative of the population whose variables are under study.

Parameter
It is a constant that describes a population,
e.g. in a college there are 75% girls. This describes the population, hence it is a parameter.

Attribute
A characteristic on the basis of which the population can be divided into categories or classes, e.g. gender, caste, religion.

Measures that are used to evaluate screening

Sensitivity
- It is the probability of correctly identifying a case of disease.
- It indicates the proportion of truly diseased persons in the screened population who are identified as diseased by the screening test. It is also known as the true positive rate.

$\text{Sensitivity} = \dfrac{\text{True positives}}{\text{True positives} + \text{False negatives}}$

Specificity
- It is the probability of correctly identifying a disease-free person.
- It indicates the proportion of truly non-diseased persons who are identified as non-diseased by the screening test. It is also known as the true negative rate.

$\text{Specificity} = \dfrac{\text{True negatives}}{\text{True negatives} + \text{False positives}}$

Positive predictive value
It is the probability of disease in a person who receives a positive test result.

$\text{PPV} = \dfrac{\text{True positives}}{\text{True positives} + \text{False positives}}$

Negative predictive value
It is the probability of no disease in a person who receives a negative test result.

$\text{NPV} = \dfrac{\text{True negatives}}{\text{True negatives} + \text{False negatives}}$
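The four screening measures can be computed directly from the cells of a 2x2 table. The following is a minimal sketch; the class name `ScreeningTable` and the counts are hypothetical and chosen only to illustrate the formulas.

```python
from dataclasses import dataclass


@dataclass
class ScreeningTable:
    """Counts from a 2x2 table: screening test result versus true disease status."""
    true_pos: int   # diseased, test positive
    false_neg: int  # diseased, test negative
    false_pos: int  # not diseased, test positive
    true_neg: int   # not diseased, test negative

    def sensitivity(self) -> float:
        # Proportion of diseased persons correctly identified as diseased
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of non-diseased persons correctly identified as non-diseased
        return self.true_neg / (self.true_neg + self.false_pos)

    def ppv(self) -> float:
        # Probability of disease given a positive test result
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        # Probability of no disease given a negative test result
        return self.true_neg / (self.true_neg + self.false_neg)


# Hypothetical counts, not taken from any study.
table = ScreeningTable(true_pos=40, false_neg=10, false_pos=20, true_neg=130)
print(f"Sensitivity = {table.sensitivity():.2f}")
print(f"Specificity = {table.specificity():.2f}")
print(f"PPV         = {table.ppv():.2f}")
print(f"NPV         = {table.npv():.2f}")
```

Note that sensitivity and specificity are computed within the diseased and non-diseased groups respectively, while PPV and NPV are computed within the test-positive and test-negative groups.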

The main sources for collection of data


Experiments
Surveys
Records
Primary source
Secondary source

Experiments
Experiments are performed to collect data for investigations and research by one or more workers.

Surveys
Carried out for epidemiological studies in the field by trained teams to find the incidence or prevalence of health or disease in a community.

Records
Records are maintained as a routine in registers and books over a long period of time.
They provide ready-made data.

Types of data
Qualitative or discrete data
Quantitative or continuous data

Qualitative or discrete data

In such data there is no notion of magnitude or size of an attribute, as the same cannot be measured.
The number of persons having the same attribute is variable and is counted,
e.g. out of 100 people, 75 have Class I occlusion, 15 have Class II occlusion and 10 have Class III occlusion.
Class I, II and III are attributes which cannot be measured in figures; only the number of people having them can be determined.

Quantitative or continuous data

In this the attribute has a magnitude. Both the attribute and the number of persons having the attribute vary.
E.g. freeway space. It varies for every patient. It is a quantity with a different value for each individual and is measurable. It is continuous, as it can take any value between 2 and 4 mm, such as 2.10, 2.55 or 3.07.

Types of scales in statistics

Nominal scale data
The information is divided into categories on some definite qualitative basis.
E.g. male/female, white/black, urban/rural

Ordinal scale data
Information is expressed in ordinal or rank order relation.
E.g. Ramu is taller than Ravi and Ravi is taller than Vijay.

Interval or numeric scale data
A scale graded in equal increments is used.
E.g. height, weight, blood pressure
The difference between the 2nd and 3rd inches is the same as that between the 7th and 8th inches.

Ratio scale data
a) The interval scale data is placed with some meaningful ratio.
E.g. weight, time, blood pressure, temperature measured in kelvin
b) This type of data is biomedically most significant.

Data presentation

Statistical data, once collected, should be systematically arranged and presented:

To arouse the interest of readers
For data reduction
To bring out important points clearly and strikingly
For easy grasp and meaningful conclusions
To facilitate further analysis

The two main types of data presentation are

Tabulation
Graphic representation with charts and diagrams

Tabulation
It is the most common method.
Data is presented in the form of columns and rows.
It can be of the following types:
Simple tables
Frequency distribution tables

Simple Table

Month | Number of patients at TKDC, New Pargaon
Jan   | 2,500
Feb   | 2,900
March | 3,000

Frequency distribution table

Number of cavities | Number of patients
0 to 3             | 78
3 to 6             | 67
6 to 9             | 32
9 and above        | 16

In a frequency distribution table, the data is first split into convenient groups (class intervals) and the number of items (frequency) which occur in each group is shown in the adjacent column.
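Building a frequency distribution table amounts to assigning each observation to a class interval and counting. A minimal sketch follows; the observations are hypothetical, and because the class limits on the slide overlap (3 appears in both "0 to 3" and "3 to 6"), the sketch assumes each class includes its lower limit and excludes its upper limit.

```python
from collections import Counter


def class_label(cavities: int) -> str:
    """Assign a cavity count to a class interval (lower limit included, upper excluded)."""
    if cavities < 3:
        return "0 to 3"
    if cavities < 6:
        return "3 to 6"
    if cavities < 9:
        return "6 to 9"
    return "9 and above"


# Hypothetical cavity counts for 12 patients.
observations = [0, 2, 3, 5, 7, 1, 9, 4, 6, 12, 2, 3]
frequency = Counter(class_label(x) for x in observations)

for label in ["0 to 3", "3 to 6", "6 to 9", "9 and above"]:
    print(f"{label:<12} | {frequency[label]}")
```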

Charts and diagrams

A useful method of presenting statistical data
Powerful impact on the imagination of the people
They are
Bar chart
Histogram
Frequency polygon
Frequency curve
Line diagram
Cumulative frequency diagram
Scatter diagram
Pie chart
Pictogram
Spot map or map diagram

Bar charts

They are of three types

-Simple bar chart
-Multiple bar chart
-Component bar chart

Simple bar chart

Fig 3. Bar chart showing the distributions of non-failures and bracket failures per arch and arch segment

-A way of presenting a set of numbers by the length of a bar. The width remains the same; only the length varies according to the frequency in each category.

Multiple bar chart

Two or more variables are grouped together, e.g. a graph showing Class I, Class II and Class III patients in each quarter.

Component bar chart

Bars are divided into two or more parts, each part representing a certain item and proportional to the magnitude of that item.

Histogram
Pictorial presentation of a frequency distribution
Consists of a series of rectangles
Class intervals are given on the horizontal axis and frequencies on the vertical axis

Frequency Polygon

Obtained by joining the midpoints of the histogram blocks, at the height of their frequency, by straight lines, usually forming a polygon.

Frequency curve

When the number of observations is very large and the class interval is reduced, the frequency polygon loses its angulations and becomes a smooth curve known as a frequency curve.

Line diagram

Line diagrams are used to show the trends of events with the passage of time.

Cumulative Frequency Diagram

Graphical representation of cumulative frequency.
The cumulative frequency of a class is obtained by adding its frequency to the frequencies of all previous classes.

Scatter or Dot diagram

Shows the relationship between two variables.
If the dots are clustered along a straight line, it shows a relationship of linear nature.

Pie chart

Pie chart showing the distributions of non-failures and bracket failures per arch and arch segment.

In this, the frequencies of the groups are shown as segments of a circle.
The degree of the angle denotes the frequency.

Pictogram

A popular method of presenting data to the common man and those who cannot understand complicated charts.

Here small pictures or symbols are used to present data.

Spot map or map diagram

These maps are prepared to show the geographic distribution of frequencies of characteristics.
The coverage of any particular disease can be depicted through this diagram.

Measures of central tendency

Mean
Median
Mode

Mean
It is the summation of all the observations divided by the total number of observations (n).
$\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
Advantage: it is easy to calculate.
Disadvantage: it is influenced by extreme values.

Median
When all the observations are arranged in either ascending or descending order, the middle observation is known as the median.
In the case of an even number of observations, the average of the two middle values is taken.
The median is a better indicator of the central value as it is not affected by extreme values.
Median of (0, 1, 2, 2, 2, 3, 3, 4, 8, 10) = (2 + 3)/2 = 2.5

Mode
The most frequently occurring observation in a set of data is called the mode.

Example
Number of decayed teeth in 10 children
2, 2, 4, 1, 3, 0, 10, 2, 3, 8
Mode = 2 (3 times)
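The three measures can be checked on the decayed-teeth example from the slides. This is a small sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Number of decayed teeth in 10 children (data from the Mode example)
decayed_teeth = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

mean = statistics.mean(decayed_teeth)      # sum of observations / n
median = statistics.median(decayed_teeth)  # average of the two middle values when n is even
mode = statistics.mode(decayed_teeth)      # most frequently occurring observation

print(f"Mean = {mean}, Median = {median}, Mode = {mode}")
```

The mean (3.5) is pulled upward by the two extreme values 8 and 10, while the median (2.5) and mode (2) are not, which illustrates the disadvantage of the mean noted above.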

Types of variability
There are three types of variability
Biological variability
Real variability
Experimental variability

Biological variability
It is the natural difference which occurs in individuals due to age, gender and other attributes which are inherent.
This difference is small, occurs by chance and is within certain accepted biological limits,
e.g. vertical dimension may vary from patient to patient.

Real variability
Such variability is more than the normal biological limits.
The cause of the difference is not inherent or natural and is due to some external factors,
e.g. the difference in incidence of cancer among smokers and non-smokers may be due to excessive smoking and not due to chance only.

Experimental variability
It occurs due to the experimental study.
It is of three types:
Observer error
The investigator may alter some information or not record the measurement correctly.

Instrumental error
This is due to defects in the measuring instrument.
Both the observer and the instrumental errors are called non-sampling errors.

Sampling error or errors of bias
This is the error which occurs when the samples are not chosen at random from the population.
Thus the sample does not truly represent the population.

Measures of variation or dispersion

Range
Mean or average deviation
Standard deviation
Coefficient of variation

Range
It is the simplest measure.
Defined as the difference between the highest and the lowest figures in a sample.
Defines the normal limits of a biological characteristic,
e.g. if the highest score in a 1st year Orthodontics exam was 98 and the lowest 48, then the range would be
98 - 48 = 50

Mean deviation
It is the average of the differences or deviations from the mean in any distribution, ignoring the + or - sign.
Denoted by MD
$MD = \dfrac{\sum |X - \bar{X}|}{n}$
X = observation
$\bar{X}$ = mean
n = number of observations

Standard deviation
Also called root mean square deviation.
It is an improvement over the mean deviation and is used most commonly in statistical analysis.
Denoted by SD or s for a sample and $\sigma$ for a population.
Given by the formula
$SD = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n \text{ or } n - 1}}$
(n is used for a population and n - 1 for a sample.)

The greater the standard deviation, the greater will be the magnitude of dispersion from the mean.
A small standard deviation means a high degree of uniformity of the observations.
Usually measurements beyond the range of ±2 SD from the mean are considered rare or unusual in any distribution.
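As a worked illustration of the range, mean deviation and standard deviation, the sketch below reuses the decayed-teeth data from the Mode example. It assumes the usual convention of dividing by n for a population and by n - 1 for a sample, which is what `statistics.pstdev` and `statistics.stdev` implement.

```python
import statistics

# Number of decayed teeth in 10 children (data from the Mode example)
data = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]
n = len(data)
mean = sum(data) / n

# Range: highest minus lowest observation
value_range = max(data) - min(data)

# Mean deviation: average absolute deviation from the mean (sign ignored)
mean_deviation = sum(abs(x - mean) for x in data) / n

# Standard deviation: root mean square deviation
sd_population = statistics.pstdev(data)  # divides by n
sd_sample = statistics.stdev(data)       # divides by n - 1

print(f"Range = {value_range}")
print(f"Mean deviation = {mean_deviation:.2f}")
print(f"SD (population) = {sd_population:.2f}, SD (sample) = {sd_sample:.2f}")
```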

Standard error
Uses: 1. Efficacy of a drug
2. Line of treatment or vaccination
$\text{Standard error of proportion} = \sqrt{\dfrac{p \times q}{n}}$
p = proportion of occurrence of an event
q = 1 - p
n = sample size
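A short sketch of the standard error of a proportion follows. The function name and the figures (30% occurrence in a sample of 100) are hypothetical.

```python
import math


def standard_error_of_proportion(p: float, n: int) -> float:
    """Standard error of a proportion p (as a fraction between 0 and 1) in a sample of size n."""
    q = 1 - p
    return math.sqrt(p * q / n)


# Hypothetical example: an event occurs in 30% of a sample of 100.
se = standard_error_of_proportion(p=0.30, n=100)
print(f"SE of proportion = {se:.4f}")
```

Quadrupling the sample size halves the standard error, since n appears under the square root.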

Coefficient of variation
It is used to compare attributes having two different units of measurement,
e.g. height and weight.
Denoted by CV
$CV = \dfrac{SD \times 100}{\text{Mean}}$
Expressed as a percentage
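Because the coefficient of variation is unit-free, it lets variability in height be compared with variability in weight. A minimal sketch with hypothetical means and standard deviations:

```python
def coefficient_of_variation(sd: float, mean: float) -> float:
    """Coefficient of variation expressed as a percentage."""
    return sd * 100 / mean


# Hypothetical figures for one group of patients.
cv_height = coefficient_of_variation(sd=8.0, mean=160.0)  # height in cm
cv_weight = coefficient_of_variation(sd=6.0, mean=60.0)   # weight in kg

print(f"CV of height = {cv_height:.1f}%")
print(f"CV of weight = {cv_weight:.1f}%")
```

In this made-up example weight is relatively more variable than height, even though its SD is numerically smaller.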

Normal distribution or normal curve

CHARACTERISTICS OF NORMAL DISTRIBUTION

1. The curve has a single peak, thus it is unimodal.
2. It has a bell shape.
3. Mean, median and mode have the same value.
4. The two tails extend indefinitely and never touch the horizontal axis (this means that an infinite number of values are possible).
5. In the standard normal curve, the mean is zero.
6. In the standard normal curve, the SD is always 1.

CONFIDENCE LIMITS
Population mean ± 1 SD limits include 68.27% of the sample mean values.
Population mean ± 2 SD covers 95.45% of the observations.
Population mean ± 3 SD covers 99.73% of the observations.
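The percentages quoted for ±1, ±2 and ±3 SD follow from the area under the normal curve. As a sketch, that area can be computed with the error function from Python's `math` module; the helper name `coverage_within` is illustrative.

```python
import math


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"mean ± {k} SD covers {coverage_within(k):.2%} of observations")
```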

If the curve is not normal, then it is a skewed distribution.
In a positively skewed (right-skewed) distribution curve, mean > median.
In a negatively skewed (left-skewed) distribution curve, mean < median.

Thank you

Guided by :
DR. SANGEETA GOLWALKAR
DR. KISHOR CHOUGULE
DR. VIKRANTH SHETTY
DR. SAYAM PATIL
DR. VIKRAMADITYA TODKAR


CONTENTS

SAMPLING
PRECISION
BIAS IN THE SAMPLE
UNBIASED CHARACTER
DETERMINATION OF SAMPLE SIZE
PROBABILITY OR P VALUE
TESTS IN THE TEST OF SIGNIFICANCE
LIMITATIONS OF TESTS OF HYPOTHESIS
ACCEPT OR REJECT NULL HYPOTHESIS
P VALUE
TYPES OF ERROR
SOFTWARES USED FOR STATISTICS

SAMPLING
SAMPLING: The selection of a part of an aggregate to represent the whole.
SAMPLE: A finite subset of statistical individuals in a population.
SAMPLE SIZE: The number of individuals in the study.

Sampling unit: the basic unit around which a sampling procedure is planned
Person
Group: household, school, district, etc.

Sampling frame: a list of all of the sampling units in a population

What is a 'good' sample?

It represents the population under study in the characteristics of interest.
It is adequate in numbers.

What is an 'adequate' sample?

The number must not be so small that we cannot rule out chance.
It should not be so large that it is uneconomical.

Ceib Phillips (Semin Orthod 2002)

Calculating a sample size requires four things:

(1) Deciding on the design of the study
(2) Assessing the availability of resources
(3) Specifying distribution assumptions
(4) Defining a clinically relevant effect

Probability (random) sampling

Sampling in which each sampling unit has a known (and, in simple random sampling, equal) probability of being included in the sample.

Non probability (non random) sampling

Random sampling / probability sampling

Here the sample of units is selected in such a way that all the characteristics of the population are reflected in the sample.
This is possible by selecting the units of the sample at random.
Random indicates that the selection of a population unit into the sample is governed by chance.

Types of Random sampling

Simple random sampling
Systematic random sampling
Stratified random sampling
Cluster sampling
Multi stage sampling

Simple random sampling

This is a sampling technique in which each and every unit in the population has an equal chance of being included in the sample.
In this method the selection of a unit is determined by chance only.

Types of simple random process

Drawing lots
Currency method
Using random number tables
Using a computer programme to generate a random sample

Simple random sampling

Make a list of all elements in the population, serially numbered.
Determine the sample size.
Draw a sample of numbers, using a random process.
Select the units from the population list that correspond to the randomly drawn numbers.
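The four steps above can be sketched with a computer-generated random sample. This is a minimal illustration using Python's standard `random` module; the population size, sample size and seed are hypothetical.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw serial numbers at random, without replacement, from a numbered population list."""
    rng = random.Random(seed)                       # seeded only to make the example repeatable
    serial_numbers = range(1, population_size + 1)  # step 1: serially numbered list
    return sorted(rng.sample(serial_numbers, sample_size))  # step 3: random draw


# Hypothetical population of 500 patients, sample of 20.
selected = simple_random_sample(population_size=500, sample_size=20, seed=42)
print(selected)  # step 4: select the units with these serial numbers
```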

Systematic random sampling

A systematic random sample is formed by selecting one unit at random and then selecting additional units at evenly spaced intervals until a sample of the required size has been formed.
This method is used when the population is large, non-homogeneous and scattered.
The sample is obtained by taking every Kth unit, where K is

$K = \dfrac{\text{Total population}}{\text{Sample size desired}}$
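A minimal sketch of systematic sampling follows, assuming the population list is serially numbered from 1 and that the total population is an exact multiple of the desired sample size. The function name and figures are hypothetical.

```python
import random


def systematic_sample(population_size, sample_size, seed=None):
    """Return the serial numbers chosen by systematic random sampling."""
    k = population_size // sample_size  # sampling interval K
    rng = random.Random(seed)
    start = rng.randint(1, k)           # one unit chosen at random within the first interval
    return [start + i * k for i in range(sample_size)]  # then every Kth unit


# Hypothetical: 1000 units in the population, 50 wanted, so K = 20.
units = systematic_sample(population_size=1000, sample_size=50, seed=1)
print(units[:5])
```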

Stratified random sampling

The population to be sampled is subdivided into groups known as strata, such that each group is homogeneous in its characteristics.
A simple random sample is then chosen from each stratum.
This type is used when the population is heterogeneous and large.

This method ensures more representativeness.
Provides greater accuracy.
Can concentrate on a wider geographical area.
The limitation of this method is that care has to be taken, while dividing the population into strata, regarding homogeneity in each stratum.

Area Sampling
It is a type of random sampling in which maps rather than lists are used.
The area to be covered in the study is divided into smaller areas and a random sample is selected from the smaller areas.

Cluster sampling
This method is used when the population forms natural groups or clusters such as villages, wards, children of a school etc.
Here, first a sample of clusters is selected.
All the units in each of the selected clusters are surveyed.

Advantage
The method is simple and involves less time and cost.
Disadvantage
Higher standard error

Multiphase sampling
This method is usually adopted when the interest is in any specific disease.
Here, sampling is done in different phases.
E.g. in a tuberculosis survey
1. First phase - Mantoux test is done in all cases of the sample
2. Second phase - X-ray of the chest is taken in Mantoux-positive cases
3. Third phase - Sputum examination in X-ray positive patients

Multistage Sampling
It is employed in large country surveys.
E.g.
1st stage - Country-wide survey
2nd stage - District-wide survey by selecting some districts randomly
3rd stage - Village-wide survey by selecting some villages randomly

Sequential Sampling
Here a small sample is tested in order to answer certain questions about the population.
If the questions are not answered, the number of subjects or units in the sample is increased gradually until conclusions may be drawn.

Types of Non Random sampling

Convenience sample
Judgment sampling
Quota sampling

Convenience sample
-A non-random collection of sampling units from an undefined sampling frame.
-It is not randomly obtained; volunteers would constitute a convenience sample.

Advantages
Convenient and easy to perform

Disadvantages
No statistical justification for the sample

Purposive Sampling
This is also known as judgmental sampling.
The attitude here is quite different.
Purposive sampling is done to serve a very specific need or purpose.
A subset of purposive sampling is the snowball sample / chain-referred sample.
Snowball samples are particularly useful in hard-to-track populations, such as those with illegal behaviour like drug users etc.

Quota Sampling
The investigator is interested in getting some predetermined number of units from the population.
The general composition in terms of sex, education and per capita income is decided in advance, and the investigator is interested in just filling the quota assigned to the various groups in the population.

E.g. if a researcher is interested in the attitudes of members of different states, he could set a quota of 3% of the people of each state. However, the sample may no longer be representative of the actual proportions in the population.

Sample size
The bigger the sample, the higher will be the precision of the estimates from the sample.
An optimum size of the sample is to be considered, keeping in mind the following factors:
-An approximate idea of the estimate of the characteristic under study and its variability from unit to unit in the population
-Knowledge about the precision of the estimate of the characteristic
-The probability level within which the desired precision is to be maintained
-The availability of experimental material, resources and other practical considerations

DETERMINATION OF SAMPLE SIZE
QUANTITATIVE DATA
$N = \dfrac{4\,SD^2}{L^2}$

SD = STANDARD DEVIATION
L = ALLOWABLE ERROR

DETERMINATION OF SAMPLE SIZE
QUALITATIVE DATA
$N = \dfrac{4pq}{L^2}$

P = POSITIVE CHARACTER
Q = 1 - P
L = ALLOWABLE ERROR

PRECISION

Individual biological variation, sampling errors and measurement errors lead to random errors, which in turn lead to a lack of precision in the measurement. This error can never be eliminated but can be reduced by increasing the size of the sample.

PRECISION
$\text{Precision} = \dfrac{\sqrt{\text{sample size}}}{\text{standard deviation}}$
STANDARD DEVIATION REMAINING THE SAME, INCREASING THE SAMPLE SIZE INCREASES THE PRECISION OF THE STUDY.

Errors in sampling
There are two types of errors that arise in sampling during the investigation:
Sampling error
Non-sampling error

ERRORS IN SAMPLING

SAMPLING ERRORS
Faulty sampling design
Small size of the sample

NON SAMPLING ERRORS
Coverage error - due to non-response or non-cooperation of the informant
Observational error - due to interviewer's bias, imperfect experimental design, or interaction
Processing error - due to errors in statistical analysis

EXPERIMENTAL VARIABILITY
ERROR / DIFFERENCE / VARIATION
THERE ARE THREE TYPES
1. OBSERVER - subjective/objective
2. INSTRUMENTAL
3. SAMPLING DEFECTS OR ERROR OF BIAS

BIAS IN THE SAMPLE
This is called systematic error.
This occurs when there is a tendency to produce results that differ from the true values.
A study with small systematic error is said to have high accuracy. Accuracy is not affected by the sample size.

BIAS IN THE SAMPLE (continued)
There are as many as 45 types of biases; however, the important ones are
-SELECTION BIAS
-MEASUREMENT BIAS
-CONFOUNDING BIAS

Unbiased character
The sample should be unbiased, i.e. every individual should have an equal chance of being selected in the sample.
Thus a standard random sampling method should be used.
Non-sampling errors can be taken care of by
Using standardized instruments and criteria
Single, double or triple blind trials
Use of a control group

Determination of sample size

For quantitative data

The investigator needs to decide how large an error due to sampling defect is allowable, i.e. the allowable error L.
The investigator should either start with an assumed SD or do a pilot study to estimate the SD.
Sample size: $n = 4\,SD^2 / L^2$
E.g. the mean pulse rate of a population is 70 beats per minute with a standard deviation of 8 beats. What will be the sample size if the allowable error is ±1 beat?
$n = (4 \times 8 \times 8) / (1 \times 1) = 256$

If L is smaller, n will be larger, i.e. the larger the sample size, the smaller the error.

For qualitative data

In such data we deal with proportions.
Sample size: $n = 4pq / L^2$

p = proportion of positive character
q = proportion of negative character
q = 1 - p (or 100 - p if expressed in percent)
L = allowable error, usually 10% of p

E.g. the incidence rate in the last influenza epidemic was found to be 5% of the population exposed. What should be the size of the sample to find the incidence rate in the current epidemic if the allowable error is 10% of p?
p = 5%, q = 95%
L = 10% of p = 0.5%
$n = (4 \times 5 \times 95) / (0.5 \times 0.5) = 7600$
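Both sample-size formulas are simple enough to check in code. The sketch below reproduces the two worked examples; the function names are illustrative, and p, q and L must all be expressed in the same units (here percent).

```python
def sample_size_quantitative(sd: float, allowable_error: float) -> float:
    """n = 4 * SD^2 / L^2 for quantitative data."""
    return 4 * sd ** 2 / allowable_error ** 2


def sample_size_qualitative(p: float, allowable_error: float, total: float = 100.0) -> float:
    """n = 4 * p * q / L^2 for qualitative data, with q = total - p (total = 100 for percent)."""
    q = total - p
    return 4 * p * q / allowable_error ** 2


# Pulse rate example: SD = 8 beats, allowable error = 1 beat
n_pulse = sample_size_quantitative(sd=8, allowable_error=1)

# Influenza example: p = 5%, allowable error = 10% of p = 0.5%
n_flu = sample_size_qualitative(p=5, allowable_error=0.5)

print(n_pulse, n_flu)
```

Halving the allowable error quadruples the required sample size in both formulas, because L is squared in the denominator.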

Probability or p value
Probability is the chance of occurrence of any event or permutation combination.
P ranges from 0 to 1.
0 = There is no chance that the observed difference could be due to sampling variation.
1 = It is absolutely certain that the observed difference between two samples is due to sampling variation.
However, such extreme values are rare.

The essence of any test of significance is to find the p value and draw an inference.
If the p value is 0.05 or more
It is customary to accept that the difference is due to chance (sampling variation).
The observed difference is said to be statistically not significant.

If the p value is less than 0.05
The observed difference is unlikely to be due to chance alone and is attributed to the role of some external factors.
The observed difference here is said to be statistically significant.

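Tests in test of significance

Parametric (quantitative data)  | Non-parametric (qualitative data)
Student's t test                | Mann-Whitney U test
Z test                          | Wilcoxon signed rank test
One-way ANOVA                   | Kruskal-Wallis test
Two-way ANOVA                   | McNemar's test
Chi-square test                 | Fisher's exact probability test
Pearson correlation coefficient | Friedman test

The chi-square test is applied to counts of attributes and is commonly classified as a non-parametric test.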

Parametric tests
Parametric tests are those tests in which certain assumptions are made about the population.
Since these tests make assumptions about the population parameters, they are called parametric tests.
These are usually used to test the difference between two means.

They are:
Student's t test
Z test
One-way ANOVA
Two-way ANOVA
Chi-square test (commonly classified as non-parametric, since it is applied to counts of attributes)
Pearson correlation coefficient

Non parametric tests

Some biological measurements may not be true numerical values, hence arithmetic procedures are not possible in such cases.
In such cases distribution-free or non-parametric tests are used, in which no assumptions are made about the population parameters, e.g.
Mann-Whitney U test
Wilcoxon signed rank test
Kruskal-Wallis test
McNemar's test
Fisher's exact probability test
Friedman test
Condition | Test used
To examine differences between frequencies in a sample | Chi-square test
To find the association between two variables | Chi-square test
When a test uses nominal data only and has more than 25 subjects associated with the study | Chi-square test
When one study group is sampled on 3 or more occasions | ANOVA
To compare the variance between groups with the variation within the groups | ANOVA
When multiple groups are studied in terms of only one factor | One-way ANOVA
When multiple groups are studied in terms of two or more factors | Two-way ANOVA / multifactorial
To find the significance of the difference between two means | t test or Student's t test
When an investigation has one set of interval data, one set of nominal data and only two groups | t test
When the same people are sampled on two different occasions | Paired t test
When studying two separate groups to test if the difference between the two means is real or can be attributed to sampling variability, such as between the means of control and experimental groups | Unpaired t test
To compare two ordinal levels | Spearman's correlation coefficient
When a test uses nominal data and is associated with fewer than 25 subjects | Fisher's exact test

Chi-square test

It was developed by Karl Pearson.
When data is measured in terms of attributes or qualities, and it is intended to test whether the difference in the distribution of attributes in different groups is due to sampling variation or not, the chi-square test is applied.
It is used to test the significance of such differences.
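The slides do not give the chi-square formula, so the sketch below assumes the standard textbook form, the sum over all cells of (observed - expected)² / expected, with each expected count taken as row total × column total / grand total. The 2x2 table of counts is hypothetical.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the attribute were distributed equally across groups
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = two groups, columns = attribute present / absent
observed = [[30, 20],
            [10, 40]]
statistic = chi_square(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {statistic:.2f} with {degrees_of_freedom} degree of freedom")
```

For 1 degree of freedom the tabulated critical value at the 5% level is 3.841, so a statistic of this size would be judged statistically significant.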

WILCOXON SIGNED RANKS TEST

Non-parametric counterpart of the paired t test
Used to compare a single sample with a hypothetical median, or to compare two related groups

MANN-WHITNEY TEST

Non-parametric test to compare the medians of two independent samples
A useful alternative to the parametric t test when measurement is on an ordinal scale.

KRUSKAL-WALLIS TEST

Non-parametric test to compare the medians of several independent samples. It is the non-parametric equivalent of one-way analysis of variance.

ANOVA
Analysis of variance
In cases where more than 2 samples are used, ANOVA can be used.
Also, when measurements are influenced by several factors playing their role, e.g. factors affecting retention of a denture, ANOVA can be used.
ANOVA helps to decide which factors are more important.

Requirements
Data for each group are assumed to be independent and normally distributed.
Sampling should be at random.

One-way ANOVA
Where only one factor will affect the result between groups

Two-way ANOVA
Where we have 2 factors that affect the result or outcome

Multi-way ANOVA
Three or more factors affect the result or outcomes between groups

F test
F = Mean square between samples / Mean square within samples
F = variance ratio
The values of the mean squares are read from the analysis of variance table, once the sums of squares and the degrees of freedom have been calculated.
Mean square between samples
It denotes the variation of the sample means of all groups involved in the study (A, B, C etc.) around the overall (grand) mean.
Mean square within samples
It denotes the variation of the individual observations within each sample around that sample's own mean.
The greater the ratio of the first to the second, the greater is the difference between the samples.
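To make the variance ratio concrete, here is a minimal sketch of a one-way ANOVA F value computed by hand. It assumes the usual degrees of freedom (number of groups - 1 between samples, total observations - number of groups within samples); the three groups of measurements are hypothetical.

```python
def one_way_anova_f(groups):
    """F = mean square between samples / mean square within samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical measurements for three groups A, B and C
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value = one_way_anova_f(groups)
print(f"F = {f_value:.1f}")
```

The computed F is then compared with the tabulated F for the same degrees of freedom (2 and 6 in this example).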

The F value observed from the study is compared with the theoretical F value obtained from the tables at the 1% and 5% levels of significance.
The results are then interpreted.
If the observed value is more than the theoretical value at 1%, the relation is highly significant.
If the observed value is less than the theoretical value at 5%, it is not significant.
If the observed value lies between the theoretical values at 5% and 1%, it is statistically significant.

Limitations of the tests of Hypothesis

-Proper interpretation of statistical evidence is important for intelligent decisions
-They do not explain why the difference exists
-They give only probabilities and no certainties
-Statistical inferences based on significance tests cannot be said to be entirely correct evidence concerning the truth of the hypothesis

Tests of significance for large samples

These tests are used for sample sizes greater than 60 (many texts use 30 as the cut-off).
The test used is the Z test.
Z is the standard normal deviate and has been discussed under normal distribution.
$Z = \dfrac{\text{observation} - \text{mean}}{SD}$
However, in the Z test the standard deviation is replaced by the standard error.
In the Z test, $Z = \dfrac{\text{observed difference}}{\text{standard error}}$

We know that the standard deviation measures the variation within a sample.
Standard error is the measure of the difference in values occurring
Between a sample and the population
Between two samples of the same population

The standard error used in the Z test can be

Standard error of mean
Standard error of proportion
Standard error of difference between 2 means
Standard error of difference between 2 proportions
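As an illustration of Z = observed difference / standard error, the sketch below tests the difference between two sample means. The slides do not state the formula for the standard error of the difference between two means, so the usual large-sample form, the square root of (SD1²/n1 + SD2²/n2), is assumed here; all figures are hypothetical.

```python
import math


def z_for_difference_of_means(mean1, sd1, n1, mean2, sd2, n2):
    """Z = observed difference between two sample means / standard error of that difference."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # assumed large-sample formula
    return (mean1 - mean2) / standard_error


# Hypothetical pulse rates in two groups of 100 subjects each
z = z_for_difference_of_means(mean1=72, sd1=8, n1=100, mean2=70, sd2=6, n2=100)
significant_at_5_percent = abs(z) > 1.96  # two-tailed critical value of the normal curve

print(f"Z = {z:.2f}, significant at 5% level: {significant_at_5_percent}")
```

The cut-off of 1.96 is the exact two-tailed 5% point of the normal curve; the "mean ± 2 SE" rule used in these slides is its rounded form.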

Tests of significance can also be divided into one-tailed or two-tailed tests.

One tailed test

In a test of significance, when one wants to specifically know if the difference between the two groups is higher or lower, i.e. the direction (plus or minus side) is specified, then one end or tail of the distribution is excluded.
E.g. if one wants to know if malnourished children have a lower mean IQ than well-nourished children, then the higher side of the distribution will be excluded.
Such a test of significance is called a one-tailed test.

Two tailed test

This test determines if there is a difference between the two groups without specifying whether the difference is higher or lower.
It includes both ends or tails of the normal distribution.
Such a test is called a two-tailed test.
E.g. when one wants to know if the mean IQ in malnourished children is different from that in well-nourished children but does not specify if it is more or less.

Stages in performing test of significance

State the null hypothesis
State the alternative hypothesis
Accept or reject the null hypothesis
Finally determine the p value

State the null hypothesis

Null hypothesis
It is a hypothesis of no difference between the statistic of a sample and the parameter of the population, or between the statistics of two samples.
It nullifies the claim that the experimental result is different from or better than the one observed already.

State the alternative hypothesis

It is the hypothesis stating that the sample result is different, i.e. larger or smaller than the value of the population, or that the statistic of one sample is different from the other.

Accept or reject the null hypothesis

If the result of a sample falls within the area of mean ± 2 SE, the null hypothesis is accepted.
This area of the normal curve is called the zone of acceptance for the null hypothesis.
If the result of the sample falls beyond the area of mean ± 2 SE, the null hypothesis of no difference is rejected and the alternative hypothesis is accepted.
This area of the normal curve is called the zone of rejection for the null hypothesis.

Finally determine the p value

The p value is determined using any of the previously mentioned methods.
If p > 0.05, the difference is attributed to chance and is not statistically significant, but if p < 0.05, the difference is attributed to some external factor and is statistically significant.

Types of error

While drawing conclusions in a study we are likely to commit two types of error.
Type I error
Type II error

Null hypothesis | Decision: Accept | Decision: Reject
True            | Right            | Type I error
False           | Type II error    | Right

Type I error
This type of error occurs when we conclude that the difference is significant when in fact there is no real difference in the population, i.e. we reject the null hypothesis when it is true.
Denoted by $\alpha$

Type II error
This type of error occurs when we say that the difference is not significant when in fact there is a real difference between the populations, i.e. the null hypothesis is not rejected when it is actually false.
It is denoted by $\beta$

Software used for biostatistics

Free statistical software
G7 from Inforum
GGobi
Gretl
Instat
R
WinBUGS

Commercial statistical software

SAS
SPSS

Conclusion
Biostatistics, in the study of research procedures and methods, is a very important aspect of all postgraduate studies and at the same time provides guidelines on which our future research will be based.

References
Seminars in Orthodontics, vol. 8, 2002
Essentials of Preventive and Community Dentistry - Soben Peter
Social and Preventive Medicine - S. Park
Handbook of Biostatistics - Dr. Mahajan
Biostatistics in Oral Health - Dailey
Statistical and Methodological Aspects of Oral Health Research - Lesaffre, Feine, Leroux, Declerck
Biostatistics: The Bare Essentials - Norman and Streiner

Science with statistics bears good fruits
Statistics without scientific application has no roots

THANK YOU