
1

(evidence-based medicine)

(p)P

Citations

MGH

Citation

Citations are the footnotes or references published with a scholarly journal article.

Institute for Scientific Information (ISI), 1962

Science Citation index (SCI)

Cited reference

) MedlineCitation
(


Inspec, Medline, SCI


Impact factor

Citation index

Impact factor
Citations
(impact factor)

The New England Journal of Medicine: 34.833
American Journal of Preventive Medicine: 3.256
Respiratory Research: 4.03
Saudi Medical Journal: 0.306

Citations
Impact factor


60

(HCR)
5000

500

500



Snow


Statistical Analysis

Excel



Data

Cases -

RecordsObservations

FieldsVariables

Values
variables
cases


GIS


outcome (dependent)

exposure (independent)


Measurement scales

Nominal

Categorical


A, B, AB, O

Dichotomous
Nominal
Yes/No, binary, logical

0, 1 coding


Ordinal

I, II, III, IV, V 1

IIIII

IVIIIIIIII
I, II, III, IV

3


< 10,000 | 10,000 to 20,000 | 20,000 to 40,000 | > 40,000

0 to 15 | 15 to 45 | > 45

Quantitative



Discrete


Continuous

Interval

Ratio

cyclic

variables

50, 60, 70
(50 + 60 + 70) / 3 = 60

nominal

21:00, 01:00, 02:00
(21 + 1 + 2) / 3 = 8


Continuous
165



binary, proportion

23

Rate

5.5

Time to an event

Data Types

Numeric

String

Date

Currency

Binary


coding

Marital status

single

divorced

widowed

married

.( 1,2,3,4 )

.
.(


ordinalscale

200001200010000

1000030000
4000030000300002000020000




Descriptive statistics

Inferential Statistics

Inferential statistics

Estimating a parameter of a population.

Detecting relations between variables.

Predicting values of one variable based on others.

parameters

of the population,.

2020 2


There is no other way of representing "meaning" except in terms of relations between some quantities or qualities; either way involves relations between variables.


Population and samples

Population (census)
Sample (inference)


Sampling variation

Population

10

target population

Population


census


Representative sample

students are typical

Observational

Experimental


Confounders
causation

association

Confounders


Access, Excel

Excel

Access, Excel

1. SAS System

1982

SAS

Personal computer (PC)

Windows, Main frame

Unix, Linux, Mac


SAS Specialist


2. SPSS

Graphics

3. STATISTICA
Graphics, SPSS
SPSS
4. Epi Info

5. S-Plus, R


2-

spss

Data view

Variable view

SPSS Viewer

Data View

Value Labels
Labels

30



Variable view

Variable view

Name

_
Underscore

Label


Measure

Scale, Ordinal, Nominal




TYPE

Numeric

Date

dollar, custom currency

String

Character

Values


Value

Value

Label



Missing Values

SPSS
SPSS Missing


To open an Excel file: File > Open > Data, then set Files of type to Excel.


Reliability - Unbiased
Reliable and
unbiased

Reliable = Free of random error. (stochastic error)


Unbiased = Free of systematic error.

Precision = Reliable

Accuracy = Unbiased



ReliabilityBias

(a)

intra-arterial cannula

(b)

sphygmomanometer

Unreliable

Unreliable results cannot be used.

biased

Validation


Validity

Unbiased
Bias
UnvalidUnbiasedReliable

Gold standard

Internal validity, External validity

Valid
UnbiasedReliable

Validation study

-4
Descriptive Statistics

Pictorial

Numerical summaries


Frequency distributions


Frequency Table

Frequency

Percent

Valid percent

Cumulative

percent

100

ordinal

Frequency Analysis


significant figures

0.4524975514202042924 is reported as 0.452, or 45.2%



26502 / 24,175,900 = 0.0010962156, reported as 0.00109 or 0.0011,
that is, about 110 per 100,000

Graphics

Bars 1

. Frequency



Pie

SPSS: Analyze > Descriptive Statistics > Frequencies, then Charts, then OK



Histogram

125
Grouping
179161180160160140



SPSS
140 cm
1499cm 140cm

Height (cm)   Frequency   Percent   Cumulative percent
<140          13          0.6       0.6
140-149.9     217         10.6      11.2
150-159.9     720         35.3      46.5
160-169.9     693         34.0      80.5
170-179.9     351         17.2      97.7
180-189.9     42          2.1       99.8
>=190                     0.2       100

histogram



histogram , bars

bars

Histogram


Stem and leaf diagrams

histogram

Frequency   Stem & Leaf
3.00        1 . 259
4.00        2 . 0224
8.00        3 . 01125689
7.00        4 . 0002457
5.00        5 . 01127
4.00        6 . 2789
2.00        7 . 04
1.00        8 . 0

Stem width: 10.00
Each leaf: 1 case(s)
X = stem * stem width + leaf
E.g. 1*10+2=12

The leaves stand for these values:
12/15/19
20/22/22/24
30/31/31/32/35/36/38/39
40/40/40/42/44/45/47
50/51/51/52/57
62/67/68/69
70/74
80

Frequency Analysis

Crosstabulation


Crosstabulation in SPSS:

Analyze > Descriptive Statistics > Crosstabs

Under Cells, select the Row and Column percentages.



Clustered bar chart



male ,female

[Figure: clustered bar chart of marital status (single, married, divorced, widowed) by sex (Male, Female); vertical axis in percent, 0 to 90; data labels 77.1, 74.2, 22.2, 20.6, 0.8, 1.2, 4.3]

Scale

scatter plots


Time series

[Figure: time series of Income, 1988 to 2002, vertical axis 0 to 35]

[Figure: Concentration (mg/100 ml), 0 to 16, against Times (minutes), 0 to 330, for Healthy and not-healthy groups]



Stacked bar chart

3-D graphics

[Figure legend: single, married, divorced, widowed]

[Figure: 3-D bar chart with categories Mac, Win, None (McGill); vertical axis 0 to 80]

Stacked bar chart



[Figure: the same data drawn as a Clustered bar chart and as a Stacked bar chart; marital status (single, married, divorced, widowed) by sex (Male, Female)]

[Figure: Income, 1990 to 2000, drawn twice: once on a vertical axis from 0 to 250 and once on a vertical axis from 198 to 214]

Rubber-band Scales



10101

100


1970


100
100

11010

2100100

310001000

10100

12

:

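Experiment: a process whose result is not known in advance (tossing a coin, throwing a die, ...).
Sample space: the set of all possible results of an experiment.
Outcomes: the individual members of the sample space.
Event: any subset of the sample space.
Example: throwing a die.
* Sample space: {1,2,3,4,5,6}.
* Outcomes: 1, 2, 3, 4, 5, 6.
* Getting a 1 is an event.
Simple event: an event made up of a single outcome.
Compound event: an event made up of more than one outcome.
* Getting a 1 is a simple event: {1}.
* Getting a 2, a 4, or a 6 is a compound event: {2,4,6}.
Union: the union of A and B is the event that A or B (or both) occurs, written A U B.
Intersection: the intersection of A and B is the event that both A and B occur, written A ∩ B.
Complement: the complement of A is the event that A does not occur, written A'.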
:
.
:
.
:
* A .
* B .
* A U B
.
* AB
.
* A ) 4 123
(6
.

: . A, B

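Mutually exclusive events: events that cannot occur together.
* Example: getting a 1 and getting a 3 on the same throw.
Relative frequency: if an event occurs a times in n repetitions, its relative frequency is a/n.
* Example: an event seen 48 times in 100 repetitions has relative frequency 0.48.
Probability: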

AB

A,B

A ,B

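Probability always lies between 0 and 1: 0 <= P <= 1.
* An impossible event has P=0.
* A certain event has P=1.
For mutually exclusive events A and B: P(A U B) = P(A) + P(B).
* The probability of a 1 is 1/6 and of a 3 is 1/6, so the probability of a 1 or a 3 is 2/6.
Complement: P(not A) = 1 - P(A).
* If the probability of an event is 0.7, the probability that it does not occur is 1-0.7 = 0.3.
For any two events A and B:
P(A U B) = P(A) + P(B) - P(A ∩ B)

* A multiple of 2 (2, 4, 6) or a multiple of 3 (3, 6):
P = (3/6) + (2/6) - (1/6) = (4/6), because 6 belongs to both events.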
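The addition rule can be checked by counting outcomes directly. The sketch below assumes a fair six-sided die, so all outcomes are equally likely; the names `SAMPLE_SPACE` and `prob` are illustrative and not part of the original notes.

```python
from fractions import Fraction

# Sample space for one throw of a fair die (assumption: equally likely outcomes).
SAMPLE_SPACE = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a set of outcomes) by counting."""
    return Fraction(len(set(event) & SAMPLE_SPACE), len(SAMPLE_SPACE))

even = {2, 4, 6}          # event A: a multiple of 2
multiple_of_3 = {3, 6}    # event B: a multiple of 3

# Addition rule: P(A U B) = P(A) + P(B) - P(A and B)
p_union = prob(even) + prob(multiple_of_3) - prob(even & multiple_of_3)

print(p_union)                                # 2/3, the reduced form of 4/6
print(p_union == prob(even | multiple_of_3))  # True
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared with the hand calculation without rounding.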
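Counting rule: if E1 can happen in N ways and E2 in M ways, the two together can happen in N x M ways.
* Example with four positions of 10 choices each and one position of 5 choices:
10 x 10 x 10 x 10 x 5 = N x M x ... = 50000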

Odds

.
: .
Odds = P / (1-P) = Probability (win) / Probability (loss)

P = Odds / (1+Odds)

Odds > P for any P above zero.
For a small P (P < 0.1), Odds and P are nearly equal.
Odds = 1 when P = 0.5.
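The two conversion formulas can be written as a pair of small functions. This is a minimal sketch; the function names are illustrative assumptions, not taken from the notes.

```python
def odds_from_probability(p):
    """Odds = P / (1 - P). Undefined for P = 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def probability_from_odds(odds):
    """P = Odds / (1 + Odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)

for p in (0.05, 0.5, 0.8):
    odds = odds_from_probability(p)
    # The round trip should return the original probability.
    print(p, round(odds, 4), round(probability_from_odds(odds), 4))
```

For P = 0.05 the odds are about 0.0526, which illustrates why odds and probability are close for rare events.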
:


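Conditional Probability
The probability of A given that B has occurred.
Example (one throw of a die):
* A: (1,2,3)
* B: (2,3,4)
* P(A) = 3/6. If B (2,3,4) is known to have occurred, A occurs in 2 of the 3 remaining cases, so its probability is 2/3.
This is written P(A|B).
Formula:
P(A|B) = P(A ∩ B) / P(B)
Here P(A ∩ B) = 2/6 (the outcomes 2 and 3) and P(B) = 3/6, so
P(A|B) = (2/6) / (3/6) = 2/3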
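The die example can be reproduced by counting. The sketch assumes equally likely outcomes; `conditional_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def conditional_probability(a, b, sample_space):
    """P(A|B) = P(A and B) / P(B) for equally likely outcomes.

    Raises ZeroDivisionError if B contains no outcome of the sample space.
    """
    n = len(sample_space)
    p_b = Fraction(len(b & sample_space), n)
    p_a_and_b = Fraction(len(a & b & sample_space), n)
    return p_a_and_b / p_b

die = set(range(1, 7))
A = {1, 2, 3}
B = {2, 3, 4}

p_a_given_b = conditional_probability(A, B, die)
print(p_a_given_b)  # 2/3
```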

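Independent events
A and B are independent when the occurrence of one does not change the probability of the other.
Example: A is a 1 on one throw of a die and B is a 3 on another throw. Whether or not A occurred, the probability of B is 1/6.
P(B|A) = P(B)

P(A|B) = P(A)

For independent events A and B:
P(A ∩ B) = P(A) * P(B)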

.
:
* 100
100
) (1030 *1.27
4
) ( .
* 100


.
) ** ...... (
) . ( .


Total probability over the events Ai:
P(B) = Sum over i of P(Ai) * P(B|Ai)
Bayes' formula:
P(Ai|B) = P(Ai) * P(B|Ai) / P(B)
.
: A %60 B
%40 %30 . %50 :

* ) (
.
*
*
.
: ) (


3800 10000
%38

: 3800 1800
1800/3800 . 47.4%
: 6200 4200
4200/6200 67.7%
.

%1
99% 5%
.
*

*

* Of the 594 positive tests, 99 are true positives:
99/594 = 16.7%
* Of the 9406 negative tests, 9405 are true negatives:
9405/9406 = 99.989%


.
%1
%99 %5
*
.
* .
.

Sensitivity

Specificity


:
* TP (True Positive): a patient with a positive test.
* FP (False Positive): a healthy person with a positive test.
* FN (False Negative): a patient with a negative test.
* TN (True Negative): a healthy person with a negative test.
* Patient: TP + FN.
* Healthy: FP + TN.
* + test: TP + FP.
* - test: FN + TN.
* All: TP + FP + FN + TN.

Sensitivity:
Sensitivity = TP / patient
Specificity:
Specificity = TN / healthy
Positive predictive value (PPV):
PPV = TP / (+ test)
Negative predictive value (NPV):
NPV = TN / (- test)
:

*

.
*
.
:
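* Sensitivity is 95%, specificity is 90%, and prevalence is 10%.
* Take 10000 people.
* With 10% prevalence, 1000 are patients and 9000 are healthy.
* With 95% sensitivity, 1000 * 0.95 = 950 of the 1000 patients test positive (TP) and 50 test negative (FN).
* With 90% specificity, 0.9 * 9000 = 8100 of the 9000 healthy people test negative (TN) and 900 test positive (FP).
* Sensitivity = TP / patient = 950 / 1000 = 95%
* PPV = TP / (+test) = 950 / 1850 = 51.4%
* NPV = TN / (-test) = 8100 / 8150 = 99.4%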

The same test (sensitivity 95%, specificity 90%) with a prevalence of 1% instead of 10%:

            Patient   Healthy   Total
+ test      95        990       1085
- test      5         8910      8915
Total       100       9900

PPV = 95 / 1085 = 8.8%
NPV = 8910 / 8915 = 99.94%
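The two worked examples differ only in prevalence, which a short function makes easy to explore. This is a sketch under the same assumptions as the text (a hypothetical population of 10000); `predictive_values` is an illustrative name.

```python
def predictive_values(sensitivity, specificity, prevalence, population=10_000):
    """Return (PPV, NPV) by filling in the 2x2 table for a hypothetical population."""
    patients = population * prevalence
    healthy = population - patients
    tp = patients * sensitivity      # patients with a positive test
    fn = patients - tp               # patients with a negative test
    tn = healthy * specificity       # healthy people with a negative test
    fp = healthy - tn                # healthy people with a positive test
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

for prevalence in (0.10, 0.01):
    ppv, npv = predictive_values(0.95, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.2%}")
```

The population size cancels out of both ratios, so it only serves to make the intermediate counts readable.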



.
[Figure: PPV and NPV (vertical axis 0 to 100) plotted against prevalence (horizontal axis 0 to 12)]


.
:
. ) (


.

Clinical Decision-Making
50
SGPT . Alkaline phosphatase
: :
E ) ( alcoholic hepatitis
)(.
) (cholangitis
E
.
. %5
:
.1 : . % 5 :
.2 ) ( : :
0 . %50 :
0 . %10 :
.3 ) ( : :
0 . %25 :
0 . %75 :


) ( ) (
.

[Figure: Mortality rates (%, 0 to 80) for biopsy, medical treatment, and Surgery, plotted against % Hepatitis (10 to 100)]

* ) : (1
.
* ) : : (2
.
* ) : : (3

.
* %5
%50 %98
.
.
.
:
%10 %5
%50 .
.1
.2
.3
.4
.5
) ( .

.1 %50
.
= %25 = 1000/250
.2
.3 = = 9000/250
%2.86
.4
%2.5
= = 10,000/8,750
.5
%87.5
.
= = 10,000/250

-
%10
.


: :
=
x
x
.
P(Carrier|unaffected):
P(Carrier and unaffected) = 1/2, P(Unaffected) = 3/4
P(Carrier|unaffected) = P(Carrier and unaffected) / P(Unaffected) = (1/2) / (3/4) = 0.67
With a 10% chance for the partner:
P = 0.67 * 0.1 = 0.067
With a 25% chance for the child:
P = 0.067 * 0.25 = 0.01675 (1.7%)
For comparison:
P = 0.1 * 0.1 * 0.25 = 0.0025 (0.25%) = 25/10,000 children

\

- 6-


: Random variable
.
:
*
0 10 .
* .

) (.

.
:
.
:
.
:
* X .
* :
X:    0         1         2        3        4         5
P:    0.03125   0.15625   0.3125   0.3125   0.15625   0.03125
Pc:   0.03125   0.1875    0.5      0.8125   0.96875   1

* :
: X . :
. P .
cP .


Binomial

Binomial distribution :
* N .
* .
* .
* . p
* ) (
:
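* 10 trials with probability 0.5 (n=10, p=0.5)
* The number 4 in 8 throws of a die (n=8, p=0.167)
* 10 trials with probability 0.01 (n=10, p=0.01)
In SPSS:
1. P = PDF.Binom(X, n, p): the probability of exactly X events in n trials, each with probability p.
2. Pc = CDF.Binom(X, n, p): the probability of X or fewer events.
The probability of X or more events:
1-Pc = 1 - CDF.Binom(X-1, n, p)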

:
* A die is thrown 6 times (n=6, p=1/6=0.167).
* Exactly three 5s:
P = PDF.Binom(3, 6, 0.167) = .0538
* At most three 5s (0, 1, 2, or 3):
P = CDF.Binom(3, 6, 0.167) = 0.99123
* At least three 5s (3, 4, 5, or 6):
P = 1 - CDF.Binom(2, 6, 0.167) = 0.0626
* No 5 at all:
P = PDF.Binom(0, 6, 0.167) = .334

Proportion:
Example: the proportion is 10% and the sample is 20.
* Exactly 3:
P = PDF.Binom(3, 20, 0.1) = 0.19
* 5 or more:
P = 1 - CDF.Binom(4, 20, 0.1) = 0.04
* None:
P = PDF.Binom(0, 20, 0.1) = 0.12
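For readers without SPSS, the same quantities can be computed from the binomial formula. The functions below are a plain-Python sketch that mirrors the argument order of PDF.Binom and CDF.Binom; the names `binom_pmf` and `binom_cdf` are illustrative, not SPSS or library functions.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x events in n trials, each with probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """Probability of x or fewer events in n trials."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Die thrown 6 times, p = 0.167 as in the text.
print(round(binom_pmf(3, 6, 0.167), 4))       # exactly three
print(round(binom_cdf(3, 6, 0.167), 4))       # at most three
print(round(1 - binom_cdf(2, 6, 0.167), 4))   # at least three

# Proportion 10%, sample of 20.
print(round(binom_pmf(3, 20, 0.1), 2))        # exactly 3
print(round(1 - binom_cdf(4, 20, 0.1), 2))    # 5 or more
```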

Poisson

.
:
* .
* .
* .
:

.
: SPSS

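1. P = PDF.Poisson(X, mean): the probability of exactly X events.
2. Pc = CDF.Poisson(X, mean): the probability of X or fewer events.
Example:
* An event happens on average 3 times per interval, so X follows Poisson(3) (mean = 3).
* Distribution table:

X:    0       1       2       3       4       5       6       7
P:    0.050   0.149   0.224   0.224   0.168   0.101   0.050   0.022
Pc:   0.050   0.199   0.423   0.647   0.815   0.916   0.966   0.988

* Examples:

P = PDF.Poisson(3, 3) = 0.224

P = CDF.Poisson(3, 3) = 0.647

P = 1 - CDF.Poisson(2, 3) = 0.577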
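The Poisson values above follow from the formula P(X) = e^(-mean) * mean^X / X!. The sketch below uses only the standard library; `poisson_pmf` and `poisson_cdf` are illustrative names that mirror the argument order of PDF.Poisson and CDF.Poisson.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Probability of exactly x events when the expected count is `mean`."""
    return exp(-mean) * mean**x / factorial(x)

def poisson_cdf(x, mean):
    """Probability of x or fewer events."""
    return sum(poisson_pmf(k, mean) for k in range(x + 1))

# Rebuild the distribution table for a mean of 3.
for x in range(8):
    print(x, round(poisson_pmf(x, 3), 3), round(poisson_cdf(x, 3), 3))

print(round(1 - poisson_cdf(2, 3), 3))  # probability of 3 or more events
```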

3 x
:
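* CDF.Poisson(X, 3): the probability of X or fewer events.
* CDF.Poisson(2, 3) = 0.42319: the probability of 0, 1, or 2 events.
* 1 - 0.42319 = 0.5768: the probability of 3 or more events.
* The outcomes (0, 1, 2) and (3, 4, 5, ...) are complementary.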
:
*
5

40
40 .
* PDF.Poisson(40, 5) = 7.5 E-23, that is, 7.5 x 10^-23
.


Rate
:
E 10 100,000
20,000 6
E ) 10 100,000
20,000 (4 :
P =PDF.Poisson (6,4) = 0.10
5
E
.
E
.
:
0 : . p
0 :
.

) ( )
162 (
) f(x ) F(x
.
: : ......
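The probability of a value between 140 and 150 is F(150) - F(140).
The probability of a value below 170:
P = F(170)
The probability of a value above 170:
P = 1 - F(170)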


) ( ...

.

.

.



:
E ) (
E ) (positive )(negative
modal .
U .

:
E


.
E
.
E :

) (
) (

- - : -


-7


Central tendency & variance
:
, :

Central tendency

: Mean

) / (

. expected value
: Mode .

)( :
E

E
E


parametrical methods
non
. parametrical methods
: Median
.
Example (1):
134  145  158  170  177  185  193
Mean = (134 + 145 + ... + 193) / 7 = 166
Median: with 7 values the median is the middle (4th) value, 170.
Mode: none, since no value repeats.
Example (2):
100 100 100 100 100 100 100 100 100 1000
Mean = 190, median = 100
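Both examples can be reproduced with Python's standard `statistics` module. The variable names are illustrative; the data are the two lists given in the examples.

```python
import statistics

heights = [134, 145, 158, 170, 177, 185, 193]
values_with_outlier = [100] * 9 + [1000]

# Example (1): mean and median are close for a roughly symmetric sample.
mean_1 = statistics.mean(heights)
median_1 = statistics.median(heights)

# Example (2): a single extreme value pulls the mean but not the median.
mean_2 = statistics.mean(values_with_outlier)
median_2 = statistics.median(values_with_outlier)

print(mean_1, median_1)
print(mean_2, median_2)
```

The second pair shows why the median is the more robust summary when outliers are present.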

reliable

.
Mean Median
) (
. Mean > Median > Mode
:
)( .

/
:
.

) (.
:

.
)
(.
.

Outliers

) ( .
120-80 450 40
)
( ....
:
.
) / (.
)
(.
Quartiles & Percentiles
: .

Median: 50th percentile (50%)
Lower Quartile: 25th percentile (25%)
Upper Quartile: 75th percentile (75%)
n-percentile: the value below which n% of the data fall.
Example:
Minimum = 134 (1st percentile)
5th percentile = 145
Lower quartile = 158 (25th percentile)
Median = 170 (50th percentile)
Upper quartile = 177 (75th percentile)
95th percentile = 185
Maximum = 193 (100th percentile)
:
CHARTS :

:
) (cm ,in
.
.

) (
25 : 25
75 .
5
95 .

Measures Of Variance

1. Range:

Range = Maximum - Minimum
Unreliability:

2. Interquartile Range (IQR):

IQR = Upper Quartile - Lower Quartile

3. Variance:

:
)1
/ 1

0
N-1 N
0 N-1 )
(mean
N-1 ) Unbiased ( .
) ( N
. N-1
.
:
.
Variance = Sum (X - Mean)^2 / (N-1)



) Variation ( :
:True Biological V


.
Making observations
: under different conditions
Systematic ) (
.

: Measurement V
.
.
) : (:
:Within-subject variation
.
:Variation between subjects
.

4. Standard Deviation:

SD = SQR(Variance)


: .
: )
SD Variance ( .
N*SD Mean

5. Coefficient of variation:

Coefficient of variation = SD / Mean
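The measures of spread defined so far can be computed together. The sketch reuses the seven values from the central tendency example as an assumption for illustration; `statistics.variance` and `statistics.stdev` divide by N-1, matching the variance formula in the text.

```python
import statistics

heights = [134, 145, 158, 170, 177, 185, 193]  # illustrative data from the earlier example

value_range = max(heights) - min(heights)        # Range = Maximum - Minimum
variance = statistics.variance(heights)          # Sum (X - Mean)^2 / (N-1)
sd = statistics.stdev(heights)                   # square root of the variance
cv = sd / statistics.mean(heights)               # Coefficient of variation = SD / Mean

print(value_range)
print(round(variance, 2))
print(round(sd, 2))
print(f"{cv:.1%}")
```

The coefficient of variation has no unit, which is what makes it usable for comparing the spread of measurements on different scales.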


:
:
) (.

.
.
= + )
(.
.
%1 . 2 150
:
pH Cl K Na ) %10 (
.%40-10

.6 :

:
) :(Skewness
.
) : (Kurtosis ) (.
.
.
.


Normal Distribution

f(x) = PDF.Normal(X, μ, σ):

:

)( .
)( ) ( .
- + .
. 1

Probability of a value falling within:
(μ-σ, μ+σ): 0.6826
(μ-2σ, μ+2σ): 0.9544
(μ-3σ, μ+3σ): 0.9974


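Example: height has mean 160 and SD 10.
1. Exactly 170: for a continuous variable this is taken over a small interval, for example from 169.9 to 170.1.
2. Between 150 and 170:
P = CDF.NORMAL(170,160,10) - CDF.NORMAL(150,160,10) = 0.683
3. Below 150:
P = CDF.Normal(150,160,10) = .1586

4. Above 175:
P = 1 - CDF.Normal(175, 160, 10) = .0668

5. Above 2 m (200 cm):
P = 1 - CDF.Normal(200, 160, 10) = .000031671 (32/million)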

CDF.Normal (X, 150


) ,10

X 150 170 =
170 150 .
170 = 1
170 .


Which value is exceeded by only 10% of people?
This is the X for which:
CDF.Normal(X,160,10) = (1-0.1)
Using the inverse function:
X = IDF.Normal(p, μ, σ)
X = IDF.Normal(0.9, 160, 10) = 172.8
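The normal-distribution examples can be checked with `statistics.NormalDist` from the Python standard library, whose `cdf` and `inv_cdf` methods play the roles of CDF.Normal and IDF.Normal. The sketch assumes mean 160 and SD 10 throughout, which is the value the errata list at the end of the notes gives for the inverse example.

```python
from statistics import NormalDist

height = NormalDist(mu=160, sigma=10)  # assumption: mean 160, SD 10

p_between_150_170 = height.cdf(170) - height.cdf(150)
p_below_150 = height.cdf(150)
p_above_175 = 1 - height.cdf(175)
p_above_200 = 1 - height.cdf(200)

# Inverse problem: the value exceeded by only 10% of people.
x_top_10_percent = height.inv_cdf(0.9)

print(round(p_between_150_170, 3))
print(round(p_below_150, 4))
print(round(p_above_175, 4))
print(f"{p_above_200:.3e}")
print(round(x_top_10_percent, 1))
```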

:
:
* Probability Density Function
SPSS: P = PDF.Normal(X, μ, σ)
* Cumulative Distribution Function
SPSS: P = CDF.Normal(X, μ, σ)
* Inverse Distribution Function
SPSS: X = IDF.Normal(p, μ, σ)
: :
.1
.2
.3
.4

): (...
.
.
) .(...
) .(PDF, CDF,IDF
.

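The standard normal distribution has mean 0 and SD 1; its variable is called Z.
Any normal variable x can be converted with
z = (x - μ) / σ
z is called the Standardized Variable.
Example: with mean 150 and SD 10, the value 160 gives
(160 - 150) / 10 = 1, that is, 1 SD above the mean.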
.
.

.


.
Z ] [-3,+3 99%
.
Z<-3 . Z>+3


Central Limit Theorem
.
) (X1, X2 X3, Xn
. Y . n Y
Z = (Y - μ) / σ
.

: ) (X1, X2 X3, Xn .
:
.
Mean: μ
SD: σ / Sqr(n)
The SD of the sample mean is the Standard Error of the mean: σ / Sqr(n)

Normal Distribution

.
.
:
) ( .

Feedback .
:
)( .
.
:
:
.

:
Feedback
.


.

.

.

SPSS
:
Descriptives in SPSS:
Analyze > Descriptive Statistics > Descriptives.
Options (maximum, minimum, range, variance, ...).
OK.

Explore in SPSS:
Analyze > Descriptive Statistics > Explore.
Dependent variable.
Statistics (Descriptives, Outliers, Percentiles, ...).
Plots (Histogram, Stem and Leaf, Boxplot).


Boxplot


) ( . Median
.

.


.

Explore

Explore
.
Analyze > Descriptive Statistics > Explore.
Dependent variable.
Factor list.


:
) : (
.

Outliers
Outliers .
: )(:
I. Parametrical:

Potential Outlier = |Z| > 2

Marked Outlier = |Z| > 3

%5 . %0.5
DB 10 Z Z>6
) (.
.
Histogram P-P Plot .
II. Non-parametrical:
Potential Outlier = Outside 1.5* Interquartile Range
Marked Outlier = Outside 3*Interquartile Range
:
) (IQR*3 .
) (IQR*1.5 ) (IQR*3 .

:
Boxplot and outliers
Data (15 values, ranked from 15 down to 1):
187  180  165  164  164  162  160  160  156  155  150  150  150  130  126

* LQ = 150 (lower quartile)
* M = 160 (median)
* UQ = 164 (upper quartile)
* IQR = UQ - LQ = 164 - 150 = 14
* Potential outliers:
1.5 * IQR: (14 * 1.5) = 21
150 - 21 = 129
164 + 21 = 185
Values below 129 or above 185 are Potential outliers.
* Marked outliers:
3 * IQR: (14 * 3) = 42
150 - 42 = 108
164 + 42 = 206
Values below 108 or above 206 are Marked outliers.
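The boxplot rule can be applied mechanically. The sketch below uses `statistics.quantiles` with its default "exclusive" method, which happens to reproduce the quartiles 150, 160, and 164 for these 15 values; other software may use a slightly different quartile convention. The function name `iqr_outliers` is illustrative.

```python
import statistics

heights = [187, 180, 165, 164, 164, 162, 160, 160,
           156, 155, 150, 150, 150, 130, 126]

def iqr_outliers(data):
    """Classify values by their distance from the quartiles in IQR units."""
    lq, median, uq = statistics.quantiles(data, n=4)  # default method: "exclusive"
    iqr = uq - lq
    inner_low, inner_high = lq - 1.5 * iqr, uq + 1.5 * iqr
    outer_low, outer_high = lq - 3 * iqr, uq + 3 * iqr
    # Marked: beyond 3 * IQR. Potential: beyond 1.5 * IQR but not marked.
    marked = [x for x in data if x < outer_low or x > outer_high]
    potential = [x for x in data
                 if (x < inner_low or x > inner_high) and x not in marked]
    return (lq, median, uq), potential, marked

quartiles, potential, marked = iqr_outliers(heights)
print(quartiles)
print(potential)
print(marked)
```

With these data the values 187 and 126 fall outside the inner fences (129 and 185) and no value falls outside the outer fences (108 and 206).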


.
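Example (1):
Mean = 20.8
SD = 14.0
95% interval from mean +/- 1.96 SD: [-6.69, 48.28]
Observed 95% range: [3, 50]
Example (2): Length of Stay (Days)
Mean = 5.1
SD = 6.4
95% interval from mean +/- 1.96 SD: [-7.4, 17.6]
For a stay of 37 days: Z = (37 - 5.1) / 6.4 = 5 (Z = 5, P = 0.3/million).
[Figure: histogram of Length of Stay (Days), horizontal axis 0 to 95, vertical axis Frequency]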


:
Mean Median .
:Skewness .
:Kurtosis )(.
Stem and leaf :Histogram )
(.
Box Plot
:Normal Probability plot

.
Shapiro-Wilk Kolmogorov D )
( .

Skewness . Kurtosis



) (.
.
.100
. Heavy Tail


1. Exponential distribution:

) (

.
Memoryless
.
SPSS . PDF.EXP

:
:
) =3 (
1/3.
.
:
.
.
Memoryless .
) (
.
-

.

.2 :

:
.(A).
:
).(B
:
.(C).
.(D).

.

3. Gamma:

.

.
. PDF.GAMMA

4. Uniform:

( .
.

.

SPSS

. PDF.UNIFORM
.

.5


. U

:
57 =3/6
=3/6
58 : : .
61 : TN : .


: .



Statistical Data
0 : .

0 : Census


census tracts 100 200.
)
( ... ) (....
:
.1 .
.2 ) (..
.3 .
.

0 : Sampling


.
:
.
: Population
Finite . Infinite
2004
. 2004

.
: Sample .
18

0
representative
.


Sampling Methods
Sampling :
:Random Sampling


.
: Non-random sampling

.


1. Simple Random Sampling:

: .
: )
( ) (.
:

.

2. Stratified Sampling:

) (
) ( ) ( .
. Strata

.

3. Cluster Sampling:

.Cluster
. .
.
.

.


stratified
Cluster

) )
(
(

.
.

.

4. Multi-stage sampling:



.
.

.

: Systematic Sampling


. ) (
.

Estimation

0
.

)
( ... .



n>50
.

Standard error of a sample mean


.
Standard error = S / sqr(n)
where S is the sample standard deviation and n is the sample size.
: .
.
)
(.
.


Point Estimation

.

.
50
169 169 .
Interval Estimation

. Interval Estimation
P
.
Confidence Interval .CI
:
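P=0.95

[166.5-171.5]
For a 95% confidence interval:

Mean ± (Z * Standard Error of mean)

Standard Error of mean = Standard Deviation / Sqr(N-1)

* Mean: the sample mean.
* N: the sample size.
* Z: the value from the standard normal distribution for the chosen confidence.
For 95% confidence, Z is 1.96.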
) ( :
0 ). (Standard Deviation
0 .
0 ) (Z
.



) ( )
( .
)
( . P
:
CI(P) = P ± Z * S.E.(P)

S.E.(P) = Sqr[p * (1-p) / n]

Z depends on the confidence level (for 95%, Z=1.96).
Example:
* In a sample of 40, 15 have the characteristic:
P = 15/40 = 0.375
* 95% CI:
0.375 ± 1.96 * Sqr[0.375 * 0.625 / 40] = 0.375 ± 0.150
* The 95% CI is [22.5%-52.5%] (95% confidence, 5% chance of error).
* The 99% CI is [17.8%-57.2%] (99% confidence, 1% chance of error).
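The interval for a proportion can be computed for any confidence level. This is a sketch of the normal-approximation formula given in the text; `proportion_ci` is an illustrative name, and the Z value is obtained from `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval: P +/- Z * Sqr[p(1-p)/n]."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # 1.96 for 95%
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

low95, high95 = proportion_ci(15, 40, 0.95)
low99, high99 = proportion_ci(15, 40, 0.99)
print(f"95% CI: [{low95:.1%}, {high95:.1%}]")
print(f"99% CI: [{low99:.1%}, {high99:.1%}]")
```

Raising the confidence level widens the interval, as the two results for the same sample show.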

10

Hypothesis Testing


Null & Alternative Hypothesis

:
%25 .
. P
: Ho . P=0.25 :

. P>0.25 :
: Ha
)
(.
:
.I .
.II .
.III
.

:
.
) (
.
= > , < ,
:
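Ha: P > 0.25
Ho: P = 0.25
One sided hypothesis: P is compared in one direction only, for example P > 0.25 or P < 0.7.
Two sided hypothesis: P may differ in either direction, P ≠ 0.25.
Rejection region: the set of values of X for which Ho is rejected, for example {8,9,..,20}. If X=11, X falls in this region, so Ho is rejected and Ha is accepted.
The region depends on the form of Ha:
* P > 0.25: large values of X, Upper Tailed.
* P < 0.7: small values of X, Lower Tailed.
* P ≠ 0.25: very small or very large values of X, Two Tailed.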

:
1. Upper Tailed:
* Ho: P=0.25, Ha: P>0.25
* The rejection region is {9, 10, ..., 20}. With X=11, X lies in the region and Ho is rejected.
2. Lower Tailed:
* Ho: P=0.25, Ha: P<0.25
* The rejection region is {0,1,2,3}. With X=3, X lies in the region and Ho is rejected.
3. Two Tailed:
* Ho: P=0.25, Ha: P ≠ 0.25
* The rejection region is {0,1,2} U {10,11,..,20}.

:
0

Type I error: Reject Ho while Ho is correct

Type II error: Accept Ho while Ho is not correct

: I II .
: Type I error :
:Type I
.
:Type II
.

             Ho: True (P = 0.25)   Ho: False (P > 0.25)
Accept Ho    No Error              Type II Error
Reject Ho    Type I Error          No Error

I . Alpha Error
I = Ho .
:
) 20 P=0.25 P>0.25(
}.{8,9,..20
I X=8 or 9 or .. or 20
) ( 0.101812 =
. 10.1%

II

II . - Beta Error
II = Ho .
: ) (Ho : P=0.25 Ha >0.25
. :P
. P
( Ha: P>0 25 , Ho: P=25 ) : P
} {8,9,..20
) (P .
P B(p), .
p:      0.3    0.4    0.5    0.6    0.7
B(p):   0.77   0.42   0.13   0.02   0.001
For P=0.5 the Type II error is 13%.
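Both error probabilities follow from the binomial distribution with n=20 and an upper-tailed rejection region {critical, ..., 20}. The sketch below is self-contained; `alpha_error` and `beta_error` are illustrative names.

```python
from math import comb

def binom_cdf(x, n, p):
    """Probability of x or fewer events in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def alpha_error(critical, n, p0):
    """Type I error: P(X >= critical) when Ho (P = p0) is true."""
    return 1 - binom_cdf(critical - 1, n, p0)

def beta_error(critical, n, p_true):
    """Type II error: P(X < critical) when the true proportion is p_true."""
    return binom_cdf(critical - 1, n, p_true)

n, p0 = 20, 0.25
print(round(alpha_error(8, n, p0), 6))   # rejection region {8, ..., 20}
for p_true in (0.3, 0.4, 0.5, 0.6, 0.7):
    print(p_true, round(beta_error(8, n, p_true), 3))
```

Moving the critical value up (for example to 9 or 10) lowers the Type I error and raises the Type II error, which is the trade-off the next table shows.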

Three possible rejection regions: R1={8,..,20}, R2={9,..,20}, R3={10,..,20}.

Region   Alpha    B(0.3)   B(0.4)   B(0.5)   B(0.6)   B(0.7)
R1       0.102    0.772    0.416    0.132    0.021    0.001
R2       0.04     0.887    0.596    0.252    0.057    0.005
R3       0.014    0.952    0.755    0.412    0.128    0.017

:
) (

. P
) ( .
) .
P
(
: Level of Significance
: ,

) 0.05
0.95 (.
I
.
Level Test
.
: : 0.05
R2
0.05 .

0.05 0.01
) :
0.001 ( .
[Figure: Alpha and Beta Errors. Beta Error (0 to 1) plotted against Alpha error (0.05 to 0.2) for N=25, N=100, and N=250]

Test

.
With a level of 0.05 the rejection region is R2 ({9,10,11,..,20}).
Power of test: Power(p) = 1 - Error II.
With an Alpha error of 0.05 and a Beta error of 0.20, the power is 0.80.

P :Level of significance P
) (%5 .

:Critical Values Critical



.
:CI Confidence interval
.

: 0.25
:
0 Ho : P=0.25 :
0 . Ha : P >0.25 :
0
:
.1 : P

.
) ( :

P :
)
Po=0.25 Pa
Pa=0.3 523 Pa=0.7 11(.
: ) (

) (
.2 ) ( :
. P
: : P=0.3 0.05
523 0.01
. 1281
.3 :
) (
. P
P=0.3
0.05 ) (
.
Required sample size:

                               P=0.3   P=0.4   P=0.5   P=0.6   P=0.7
alpha = 0.05, power = 0.8      523     70      29      17      11
alpha = 0.05, power = 0.9      716     94      38      21      13
alpha = 0.01, power = 0.95     1281    160     62      33      20

* For P=0.26 with alpha 5% and power 80%, the required sample is 11895.
* N = [(Za + Zb) * S / d]^2

P-values CI

:
0 CI .
0 ) value-P (
.
, : CI
0 . P- value<0.05 :
0 P-Value<0.05 CI .
CI .
5 3 .
40 100/
20 100/ )
20100/ (.
.

Row   Number   Decrease   95% CI         P
1     30       40         -38 to 118     0.32
2     3000     40         32 to 48       <0.001
3     40       33         -45 to 85      0.54
4     4000     3.3        -4.5 to 8.5    0.54
5     5000                1.1 to 8.9     0.012

: P<0.05 CI
.
:
. .

: 1
0 : ).( P>0.05
0 : .
0 (-32 TO +118) : CI :
118 32 %95
.
: 2
0 ) . ( P<0.05
0
20/ .
0 : (32 TO 48) : CI %95
32 48100/ .
1 : 2
0 A 40
100/ 1

P
.
. Absence of evidence is not evidence of absence. 0
0
. more powerful
3 : 4
0 B 3 4
P=0.54 CI 3
4 B
no substantial effect
8.5 .
1 : 3
0 P CI ) (
A B .
: 5
0 : .P<0.05
0 :

0 P
CI medical
importance .

_ .
_ .
_
.
-- : -

11

- observational
association
.
:
0 .Chance
0 .Bias
0 .Confounding
0 : Causality
. .
:
.

Chance

) ( :
0 .
0 .
0 .
P

P
Statistical Significance

: .
:


.1
21

.2 bias . confounding
.3 )(.
: P
.
) P 0.05 (
.

Not statistically significant


:
.
:P
.
P=0.05 :
0 :P=<0.05
null
hypothesis .
0 : P>0.05 ) (

) (.
: P value
0 .
0 P=0.05
.
0 . P=0.01
0 . )
( .
P=0.1 .
:
0 P . P
.
0 P ) ( .
:P-value
0 : P-Value
. P CI
) (.
0 confidence interval
95%
. confidence
0 Confidence interval precision estimate

.
0 P CI .

Chi
) P (
70
60
50
40

30
20
10
0

P ) P
( .


Significant real and important

Statistically significance
.
0
1 .
0
13% 5%
) ( .

Bias


systematic bias ) . : :
( .
:
:Selection Bias .1 -
.
:Misclassification .2
.
: Information Bias .3 .
.4 .

:Selection Bias

.1

.2

:
: .
:
: .
selection
bias .
:Loss to Follow-up
.
:

:
!
: ) (

.


.
: non-responding or volunteer Bias
:
.
) (
)
( .
:

) (.
:
0 ) ( :
) (

.
0 mail questionnaire
)
(.

.3

.4

: Berksons bias

.
: .
0 :
. %25
. 7.6%
.
0 :
%7.3
. %7.2
) (.
0 : :


) .
( .
:
0 case-control
.
0 Cases Controls
.
0
. Controls
0
0 :


.
:Bias by indication
: .


.5


. :
0
.
0
.
:
.
:
.
:

)
(... .

.
: Overestimation
:
.
: Case-control study
0
.
I

0 :
.

0 :
.

)

(. .
0
.
0 :

.6

Estrogens   Bleeding      Diagnosis     Inclusion
Yes         Very likely   Very likely   Very likely
No          Less likely   Less likely   Less likely

0 : :

.
.
0
.
: Sampling Bias
: .
simple
random sampling
.
:
0
.
0 : .
.
0 :
.1

) (.
.2 )

(.


: :

2/10 = 0.2, that is, 20%
1/2 = 0.5, that is, 50%
%20
. %50


Multistage sampling
.
.7 :
:Surveillance Bias .1 :
: :
OC VTE
OC . VTE
Cohort study of oral contraceptive (OC) and venous
??? )thromboembolism (VTE

.
: Publication bias .2
)
(.

Misclassification Measurement Bias

:
:
0 .

0 .
0 .
0
) (
.
:
:
.
.
0
:
.1 ): (1

.2 ) : (2 .
0 : ) (
(
0 :

.

0
.
0 :

.
.
0

.
.
0 :
:

.
0

. :


.......
:
0
.
:
.
0 : .
:
0

.
0


.


) ( 4

.

Information Bias

: .
:
.
0 100 100
.
0
.
.
0 : %80
%10 ) -
. ( p=2.5 10-23
0 %75
90% .
0
.
0 : .
0 :
:
.1 .
.2 .
: Information Bias
: Recall Bias .1
Retrospective
.
:Observer/Interviewer Bias .2
) . (Double Blind
: ) -(Placebo
Placebo .%50-20
: Enthusiasm Bias .3
0
.
: Reporting bias .4
0 .
0 questionnaire .
0
0 : :
) 25- 15 ( ) %1 ( %0.6

.

Placebo Effects
.



.
Single
blinded Double blinded
Triple

%40


.

.


.

Lead Time Bias

: .
: .
0 10
8
.
0
.
0
.
0 1990 Lab / 1992 Clinical / 2000 Death

Loss of denomination

: )
( .
:
AGE     1980-1984   1985-1989   1990-1994
1-5     12          23          49
6-10    8           12          19
>10     7           8           6

0 : )(1990-1994

0 : /
) (
.
.
250 0 75 45 175
45.

0 .
0 :
) (.
. 1999
0 1999
.
0
.
0 :

2000
.

loss of controls

: .
:
3000 0 2000 11%

0
3/2 %11
.
0 : . loss of controls

: 1000 4.7 10.5


) (.
) age (confounder
. Standardization

Standardization

             Florida                                    Alaska
Age group    deaths/100000   persons     no. deaths     deaths/100000   persons   no. deaths
0-4          375             546,000     2,049          405             40,000    162
5-19         60              1,982,000   1,195          84              128,000   107
20-44        190             2,676,000   5,097          261             172,000   449
45-64        724             1,807,000   13,074         778             58,000    451
65+          4398            1,444,000   63,505         4933            9,000     444
Totals       1004            8,455,000   84,920         396             407,000   1,613

Totals

10 %8
%16
.
:
0 .
0 .
0 .
0 ) Prevalence incidence
.(rate

Babies who are breast-fed have less illness than babies who
are bottle fed
But
0 How is feeding type defined?
0 Which illnesses?
0 How large a difference in risk?
Babies who are exclusively breast-fed for three months or
more will have a reduction in the incidence of hospital
admissions for gastroenteritis of at least 30% over the first
year of life.


* When a selection procedure is biased, taking a larger sample does not help.
* This just repeats the mistake on a larger scale.
* If bias is more than half of the standard error, invest in quality instead of a large sample.
BIAS 0
( Bias)
.
( ... ) 0
Alpha error Bias
: Bias/standard error


(Bias/ Standard error) B/S Alpha

Control of Bias
. : Bias

Two events encouraged polling agencies to further refine their methods. In 1936 a poll conducted by the Literary Digest incorrectly determined that the Republican candidate, Alf Landon, would win the U.S. presidential election. The error arose largely because of biases that caused wealthy people to be overrepresented in the poll. In the 1948 election, most polls mistakenly predicted a victory for the Republican candidate, Thomas E. Dewey, over President Harry S. Truman, again because poor people were underrepresented and also because the polling agencies missed last-minute changes of attitude among the voting public. Since 1948 techniques of public opinion research and polling have improved considerably. Efforts are now made to select respondents without bias, to improve the quality of questionnaires, and to train able and reliable interviewers.

: Bias :
:Careful study design .1
.
.2 sampling design .
.3 Objective Reliability
.
:Accurate and complete records .4
data validation rules
.5 .
.6 . loss to follow-up
.7 . Blind

.8
pilot study
.


Confounding


.Confounder
.
: :
0 : confounder .
.
0 : :



.
:
0

.
0 :


.
0
.
High LDL
CHD

Confounding .
0 : confounder
confounder:
High fat diet -> High LDL -> CHD
16

0 High LDL
Confounding
.
0
.
0 .
0
.

0
).(BMI
0
. Confounder
0
.
0 :
.
0

0 :


.
: _
_
)
(.
: .
[Figure: Alcohol and Lung cancer. Mortality rate per 100,000 PY (0 to 100) by Daily alcohol consumption (No, Yes)]

[Figure: Alcohol, smoking and lung cancer. Mortality rate per 100,000 PY (0 to 100) by Daily alcohol consumption (Yes, No), shown separately for Smokers and Non-smokers]


:Chance .
:Bias
.
:Confounding :
0 .
0 Confounder )
(
Adjusted Control ) confounder
(.

Confounding

:
:Restriction 0 .
Matching 0
:Randomization 0 . Randomize clinical trials
0
0 .

0 : - .
0 Blind, Double blind Objective
: Confounding
0 Confounder
0 .
0 :
Stratification :
. Confounder
Multivariate Analysis .

.

) (.

CHD :
0 CHD

0 : CHD
.
0


Hill's Criteria

:Study Design
:
0 Experimental study
Prospective cohort study
Historical cohort study
Case-control study
Cross-sectional study.
: ) ( RR
. ) :
(.
:Totality of evidence = Consistency
.
0 .
A
) (
.
) :Biologic credibility (Plausibility

) ( .
0
.
:Biologic Gradient-Dose-Response
)
(.
0 .
) :Temporality (Time Sequence
) (.
0 )
(.

:Experimental Evidence
) (.
:Specificity
.
:Analogy -
.

12


0 :
) ( : Design of the study
0
0 Sampling methods
0 Related / Unrelated
.
.

) . (Nominal, ordinal, scale


)(Numerical, proportion, rate, time to an event
.
.
.
.

:
0 . Between-Subjects Design
0 ) ( ) (
.
::
Pre-post or self-pairing 0 . Within Subject Design
0 .
0 )
(Matched

.Univariate analysis
. Bivariate analysis
. Multivariate analysis

: parametrical
methods
: non-
. parametrical methods

) (.
. Interpretation
. Assumptions
.
.
:
1. One-sample T test.
2. Independent Samples T Test.
3. Paired-Samples T-Test.
4. Chi-squared and Fisher exact test.
5. Chi-squared test for trend.
6. Pearson and Spearman correlation.


One-sample T test

:
0 .
0 .
:
20 ) 24
(.
One-Sample T test .Single Mean T Test
: %50 %30
170.
:
* Analyze > Compare Means > One-Sample T Test
0
0 .


: T
0 .
0 ) 50 (
0 .
: T
0 15 Data Population
.
0 15 39 Outliers . Skewness
0
:
0
%20 .
0 ) (
) (Yes, No No Yes . 1
0 ) 0.30
(30 .
0
0.1 ) 0.9 Yes .(%90-%10


Independent Samples T Test


0 .
0 Dichotomous
0 .
.
:
.


:
.1 ) 50
(.

.2 .
.3
) .
(.
.4 Homogeneity of variance
. I
. II
:
* Analyze > Compare Means > Independent-Samples T Test
0 .
0 Grouping Variable .
Define Groups .
cut point Cut
Point . Cut point


0 . Group Statistics
0 Test for homogeneity of variance
(Levene's Test for Equality of Variances, F).
. F
) P (P>0.05
Null Hypothesis
.
. P<0.05

.
0
More conservative
.
0 :

) ( 0 1 .
Heart Disease
.
:
crosstabulation . 10



Paired-Samples T-Test

. Dependent T-Test
:
0
0
:
0

.
0
0 .
: T
Independent Normality
. Homogeneity of Variance
:
* Analyze > Compare Means > Paired-Samples T Test

0
) / (
.

:
0 )
(.
0
.
0
)
( . .
0 : 46
5 .



Chi-squared and Fisher

:
0 .

0 ) (


:
0 5.2% 4.3%

.
0 . Chi- squared
0 Crosstabulation Statistics
Chi-squared


:
* Pearson Chi-Square: P = 0.338.
0
%33.8
.

: Chi-squared
0 .
0 / ) (P>0.05

.
0 P<0.05
.

0 : 5 25%

. Fisher Exact test
0 Chi
.

Chi-squared test for trend

:
0 .
0
.
0 .
:
0 ) 10,000
10,000 20,000 20,000 40,000 ( 40,00

) ( .
In the Chi-square output, read the P value from the Linear-by-Linear Association row.


Correlation

:
0 .

:
0 ) (
) ( .

) (
.
Correlation coefficient
.

:
* Analyze > Correlate > Bivariate.
0 .
0 Correlation coefficients Pearson
. Spearman
0 OK .

: :
0 Pearson Correlation :
.
.
0 Nonparametric Spearman's rho
:
.
.


:
r ] [-1,+1 :
0 measure of determination
)( Y )( X
0 r : -1 )
(.
0 r : .
0 r :1 )
(.
) significance
( 0.05 .

:
* Graphs > Scatter > Simple > Define.
0 X Y . OK
0

Set Markers By: gender.

:
: Extrapolation

.
R R
.
R .

: Outliers






.

: Non-homogeneous Groups
0





.


.

:
0 X Y )

(.
0 .
0 .
0 . outliers
J J


.1 65 :
0 : : % 5 :
: ) % 5 :
( .
0 ) : (3
:
. %5
.2 70 :
o : (54 3 2 1 0
o : 3 2 1 0
.3 72 :
Poisson (40,5) = 7.5 E-23 7.5 E-23 7.5*10 -23
.4 85 :
) CDF.Normal (X, 150 ,10 )CDF.Normal (X, 160 ,10
.5 109 . Ha : P >= 0.25 _ Ha : P > 0.25 :
.6 SPSS .
.7 ) 3 -+65( .
.8
) (.
.9 : :
!! .

-

:

