
Inferential Statistics

Drawing Conclusions from Data


What is a hypothesis?

Statement or conjecture about any phenomenon for which the truth has not been verified.
Statement needing investigation of its truth.


Examples:

Fear increases hypertension.
Females are more prone to rheumatoid arthritis.
Breast cancer is more common in women who do not breastfeed.
Sore throat is more common in winter.
Procedure for test of hypothesis.

State the null hypothesis.
State the alternative hypothesis.
State the level of statistical error (significance level).
Choose the appropriate test statistic.
Apply the data and evaluate the test statistic.
Take a decision: reject or do not reject the null hypothesis.
Null Hypothesis.

Denoted by Ho
Hypothesis under test
Hypothesis to be nullified if the data do not support it
Statement in a form that does not prejudge
Examples on Null Hypothesis

No difference in age of onset of rheumatoid arthritis between males and females.
No association between children's attendance at group care and pneumococcal infections.
No relationship between an individual's age and height.
Alternative Hypothesis.

Denoted by Ha
Hypothesis to consider if Ho is rejected.
Statement in the affirmative language
Examples on Alternative Hypothesis.

There is a difference in the age of onset of rheumatoid arthritis between males and females.
There is an association between children's attendance at group care and pneumococcal infections.
An individual's age and height are related.
Level of Statistical Error

Error committed by wrongly rejecting the null hypothesis (α).
Usually called a type I error.
Conventionally set to a small value (commonly 0.05, i.e. 5%).
Types of Errors in Decision Making.

Type I Error (α)
Reject the null hypothesis when it is in fact true (α-error).
Type II Error (β)
Fail to reject the null hypothesis when it is false (β-error).
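THE P-VALUE
THE PROBABILITY, IF Ho IS TRUE, OF A RESULT AT LEAST AS EXTREME AS THE ONE OBSERVED
IT IS COMPARED WITH THE TYPE I ERROR (α-ERROR): REJECT Ho IF P < α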
Interpretation of Errors:

The probability of a type I error is the level of significance, α (alpha).
The probability of a type II error is denoted by β (beta).
1 - β = Power of the test.
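
The relationship between α, β and power can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a two-sided one-sample z-test with a known standard deviation, and the function name `z_test_power` and the numbers passed to it (difference 5, standard deviation 10, sample size 25) are hypothetical, not taken from the slides.

```python
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Probability of rejecting Ho in a two-sided one-sample z-test.

    delta: true difference between the population mean and the value under Ho
    sigma: population standard deviation (assumed known)
    n:     sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    shift = delta * n ** 0.5 / sigma             # shift of the test statistic when Ho is false
    # Ho is rejected when the statistic falls in either tail.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

# Hypothetical inputs chosen only for illustration.
alpha = z_test_power(delta=0.0, sigma=10.0, n=25)   # Ho true: rejection rate equals alpha
power = z_test_power(delta=5.0, sigma=10.0, n=25)   # Ho false: rejection rate is 1 - beta
beta = 1 - power

print(f"type I error rate (alpha) = {alpha:.3f}")
print(f"power = {power:.3f}, type II error rate (beta) = {beta:.3f}")
```

When the true difference is zero the rejection rate is α itself; for a fixed non-zero difference, increasing the sample size raises the power and lowers β.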
Usual Objectives of Studies:

Compare characteristics of groups, particularly average values.
Investigate association or relationship between variables.
Choice of Test Statistic.
Depends on:
Study objectives (research questions)
Kind of data (quantitative or qualitative)
Sample size (small or large)
Types of Test Statistics

Parametric tests
Non-parametric tests
Parametric Tests.

Assume distributional forms for the measurements and parameters in the population.
The commonest assumption is a normal distribution.
Examples of Parametric Tests.

Z-test for comparing 2 proportions
t-test to compare mean values between only 2 groups.
F-test or analysis of variance to compare mean values between several groups (more than 2 groups)
NON-PARAMETRIC TESTS

Do not assume any particular functional form for a population distribution.
Called distribution-free methods.
EXAMPLES OF NON-PARAMETRIC TESTS
CHI-SQUARE TEST
MANN-WHITNEY U TEST
WILCOXON SIGNED-RANK TEST
KRUSKAL-WALLIS TEST
MEDIAN TEST
Student's t-test

Used to compare mean values between two groups.
Assumes the measurements have an underlying normal distribution.
The t distribution has more area in the tails than the normal distribution.
The distribution depends on the degrees of freedom, which takes care of small sample sizes (less than 50).
Widely used in medical studies.
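Comparison of two mean values
Independent Samples:
Use $t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
- $\bar{x}_1$ is the mean of the first group
- $\bar{x}_2$ is the mean of the second group
- $SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$ if unequal variances are assumed in the two groups
- If equal variances are assumed, the pooled variance is $S^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$ and $SE(\bar{x}_1 - \bar{x}_2) = \sqrt{S^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$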
Example on t Test
The mean score on knowledge of the diagnosis of depression for a group of 81 primary health care physicians with less than 10 years' experience was 35.94 (SD = 4.60). The mean score for 64 primary care physicians with more than 10 years' experience was 39.8 (SD = 4.05). Test whether the difference in knowledge of the diagnosis of depression is statistically significant. What interpretation can you give to the observed result?
Solution:

Group                               Mean    SD     Sample size
Group 1 (< 10 years experience)     35.94   4.60   81
Group 2 (> 10 years experience)     39.8    4.05   64

$t = \dfrac{35.94 - 39.8}{\sqrt{\dfrac{(4.60)^2}{81} + \dfrac{(4.05)^2}{64}}} = \dfrac{-3.86}{0.7194} = -5.3656$

Degrees of freedom = 81 + 64 - 2 = 143
P < 0.01
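
The arithmetic in this solution can be reproduced from the summary statistics alone. The following is a minimal sketch, not a full testing routine; the function name `two_sample_t` is an illustrative choice, and it uses the unequal-variance standard error that appears in the worked solution.

```python
from math import sqrt

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups, using the
    unequal-variance standard error sqrt(S1^2/n1 + S2^2/n2)."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se, se

# Summary statistics from the physicians example.
t_value, se = two_sample_t(35.94, 4.60, 81, 39.8, 4.05, 64)
df = 81 + 64 - 2   # degrees of freedom as used in the worked solution

print(f"SE = {se:.4f}, t = {t_value:.4f}, d.f. = {df}")
```

The negative sign simply reflects that Group 1 has the lower mean; the size of t is what is compared with the tabulated value.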
EXAMPLE ON T-TEST
A total of 36 hypertensive individuals were split into two groups of 18. Group 1 received diuretic therapy while Group 2 received diuretic therapy in combination with another antihypertensive agent. After one month, their diastolic blood pressures were measured and the results summarized as follows: Group 1: mean = 117.0, SD = 22; Group 2: mean = 93.0, SD = 20. Was there any significant effect of therapy?
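
One possible way to work this exercise is with the pooled-variance form of the t statistic. The sketch below assumes equal variances in the two groups (an assumption, not something the exercise states); the function name `pooled_t` is illustrative, and the critical value 2.032 is the two-sided 5% point for 34 degrees of freedom read from a t table.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups using the pooled variance
    S^2 = ((n1-1)S1^2 + (n2-1)S2^2) / (n1 + n2 - 2)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Summary statistics from the exercise (diastolic blood pressure).
t_value, df = pooled_t(117.0, 22.0, 18, 93.0, 20.0, 18)

T_CRIT_5PCT_34DF = 2.032   # two-sided 5% point on 34 d.f., from a t table
significant = abs(t_value) > T_CRIT_5PCT_34DF

print(f"t = {t_value:.3f} on {df} d.f.; significant at 5%: {significant}")
```

Because the two groups are the same size, the unequal-variance standard error gives the same t value here.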
EXAMPLE ON PAIRED T-TEST
A random sample of 6 patients with ischaemic heart disease were treated with clofibrate, and the concentration of their plasma fibrinogen was determined as follows:
Patient no.:   1    2    3    4    5    6
Pre-value:     379  351  420  303  346  370
Post-value:    325  333  391  275  311  323
Does the treatment have any statistically significant effect?
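
As a rough sketch of how the clofibrate data could be analysed, the paired t statistic can be computed from the pre-minus-post differences using only the Python standard library. The variable names are illustrative, and the critical value 2.571 is the two-sided 5% point for 5 degrees of freedom read from a t table.

```python
from math import sqrt
from statistics import mean, stdev

pre = [379, 351, 420, 303, 346, 370]
post = [325, 333, 391, 275, 311, 323]

# Work with the within-patient differences (pre minus post).
diffs = [a - b for a, b in zip(pre, post)]
n = len(diffs)

d_bar = mean(diffs)     # mean difference of pairs
s = stdev(diffs)        # sample SD of the differences (divisor n - 1)
se = s / sqrt(n)        # standard error of the mean difference
t_value = d_bar / se
df = n - 1

T_CRIT_5PCT_5DF = 2.571   # two-sided 5% point on 5 d.f., from a t table
significant = abs(t_value) > T_CRIT_5PCT_5DF

print(f"mean difference = {d_bar:.2f}, SE = {se:.3f}")
print(f"t = {t_value:.3f} on {df} d.f.; significant at 5%: {significant}")
```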
Solution (10 paired measurements from 2 instruments):
- Null Hypothesis:
There is no difference in the 10 measurements.
- Alternative Hypothesis:
There is a difference in the measurements.
- Level of significance: 0.05
- Test statistic: Paired t-test
Comparison of mean values
Dependent Groups
Use $t = \dfrac{\bar{d}}{SE(\bar{d})}$
where $\bar{d}$ = mean difference of pairs

$SE(\bar{d}) = \dfrac{S}{\sqrt{n}}$
where $S = \sqrt{\dfrac{\sum (d_i - \bar{d})^2}{n - 1}}$

n = number of pairs
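Evaluation of test statistic:
Paired differences ($d_i$): 0, -2, 1, 1, -2, -5, 0, -1, -1 (nine of the ten values are listed; the totals $\sum d_i = -10$ and $\sum d_i^2 = 38$ imply that the remaining one is -1)

$\bar{d} = \dfrac{\sum d_i}{n} = \dfrac{-10}{10} = -1$

$S = \sqrt{\dfrac{\sum d_i^2 - (\sum d_i)^2 / n}{n - 1}} = \sqrt{\dfrac{38 - 100/10}{10 - 1}} = 1.764$

$SE(\bar{d}) = \dfrac{S}{\sqrt{n}} = \dfrac{1.764}{\sqrt{10}} = 0.5578$

$t = \dfrac{\bar{d}}{SE(\bar{d})} = \dfrac{-1}{0.5578} = -1.793$ on 9 d.f.
Verdict: P > 0.1, do not reject Ho.
Conclusion:
No significant difference in the measurements of the 2 instruments.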
EXAMPLE ON PAIRED T-TEST
Seven pairs of twins were allocated at random to two alternative diets. Their weight gains after a fixed duration were as follows: Diet (A, B): (10, 16), (17, 20), (8, 14), (15, 15), (17, 16), (12, 16), (14, 17).
DO THE DIETS SHOW ANY SIGNIFICANT EFFECT ON WEIGHT GAIN?
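
One way to work this exercise by hand is sketched below, following the paired t-test formulas given earlier. Taking each difference as diet A minus diet B is an arbitrary choice made here for illustration; reversing it only changes the sign of t.

```latex
\begin{aligned}
d_i &= A_i - B_i:\quad -6,\ -3,\ -6,\ 0,\ 1,\ -4,\ -3 \\
\bar{d} &= \frac{\sum d_i}{n} = \frac{-21}{7} = -3 \\
S &= \sqrt{\frac{\sum d_i^2 - (\sum d_i)^2 / n}{n - 1}}
   = \sqrt{\frac{107 - 441/7}{6}} \approx 2.708 \\
SE(\bar{d}) &= \frac{S}{\sqrt{n}} = \frac{2.708}{\sqrt{7}} \approx 1.024 \\
t &= \frac{\bar{d}}{SE(\bar{d})} = \frac{-3}{1.024} \approx -2.93
   \quad \text{on } 6 \text{ d.f.}
\end{aligned}
```

The tabulated two-sided 5% value of t on 6 d.f. is 2.447, so on this working |t| exceeds it and P < 0.05.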
Chi-square Test

What is it?
The test statistic of choice for investigating the significance of association between two qualitative variables.
Examples:

Presence or absence of a risk factor and having or not having a condition.
Exposure to dust and Bronchial Asthma.
Breast cancer and Type of Diet.
Occupation and Colorectal cancer.
Contingency Table

Usual table for data presentation.
Involves cross-classification of two qualitative variables.
The table serves as an initial assessment of association.
1. Example of a contingency table: data on the occupational status of subjects and the presence of stress.

                          Occupation
Stress                    Professional   Skilled   Unskilled   Total
Present                   5              13        70          88
Absent                    20             32        60          112
Total                     25             45        130         200
Percentage with stress    20.0           28.9      53.8        44.0
Procedure for Statistical Test.
Step 1: Ho: There is no association between occupational status and the presence of stress.
Step 2: Ha: There is an association.
Step 3: α = 0.05
Step 4: Choose the chi-square ($\chi^2$) test.
Step 5: $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$
Step 6: Compare the calculated $\chi^2$ with the tabulated $\chi^2$ on the appropriate degrees of freedom at the 5% level.

Conclusion: P < 0.05, reject the null hypothesis.


Chi-square Equation:
$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$

$O_i$ = Observed frequency in cell i of the table.
$E_i$ = Expected frequency in cell i of the table if the null hypothesis were true.
Degree of Freedom: d.f.
A measure of the size of the table; it depends on the number of categories of the variables in the rows and columns.
Number of categories in rows less 1, multiplied by number of categories in columns less 1: (r - 1) x (c - 1).
In the example, d.f. = (2-1) x (3-1) = 2.
Calculating Chi-square value.

            Occupation
Stress      Professional   Skilled    Unskilled   Total
Present     5 (E1)         13 (E2)    70 (E3)     88
Absent      20 (E4)        32 (E5)    60 (E6)     112
Total       25             45         130         200


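Expected Frequencies

$E_1 = \dfrac{88 \times 25}{200} = 11.0$        $E_4 = \dfrac{112 \times 25}{200} = 14.0$

$E_2 = \dfrac{88 \times 45}{200} = 19.8$        $E_5 = \dfrac{112 \times 45}{200} = 25.2$

$E_3 = \dfrac{88 \times 130}{200} = 57.2$       $E_6 = \dfrac{112 \times 130}{200} = 72.8$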
Chi-square value
$\chi^2 = \dfrac{(5 - 11.0)^2}{11.0} + \dfrac{(13 - 19.8)^2}{19.8} + \dfrac{(70 - 57.2)^2}{57.2} + \dfrac{(20 - 14.0)^2}{14.0} + \dfrac{(32 - 25.2)^2}{25.2} + \dfrac{(60 - 72.8)^2}{72.8}$

$= 3.27 + 2.34 + 2.86 + 2.57 + 1.83 + 2.25$

$\chi^2 = 15.12$ on 2 degrees of freedom.
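
The expected frequencies and the chi-square value can be reproduced directly from the observed table. This is a minimal sketch in plain Python with illustrative variable names; because it does not round the six terms before adding them, it gives about 15.13 rather than the hand-calculated 15.12.

```python
# Observed frequencies: rows are stress present / absent,
# columns are professional / skilled / unskilled.
observed = [
    [5, 13, 70],
    [20, 32, 60],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell if Ho were true:
# (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected:", [[round(e, 1) for e in row] for row in expected])
print(f"chi-square = {chi_square:.2f} on {df} d.f.")
```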


Decisions.
Compare the calculated chi-square with the tabulated chi-square at the 5% level and the corresponding degrees of freedom.
If the calculated chi-square is smaller than the tabulated chi-square, then P > 0.05.
Do not reject the null hypothesis if P > 0.05.
If the calculated chi-square is larger than the tabulated chi-square, then P < 0.05.
Reject the null hypothesis if P < 0.05.
Here the tabulated chi-square on 2 d.f. at 5% is 5.991.
Decision: Reject the null hypothesis (15.12 > 5.991).
Conclusion:

The association between occupation and stress is statistically significant (P < 0.05).
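
The decision rule can also be checked numerically. The sketch below relies on a special property of the chi-square distribution with exactly 2 degrees of freedom, where the upper-tail probability is exp(-x/2); it does not apply to tables with other degrees of freedom, for which a table or a statistics library would be needed. The variable names are illustrative.

```python
from math import exp, log

alpha = 0.05
chi_square = 15.12   # calculated value from the occupation and stress example

# For exactly 2 degrees of freedom, P(chi-square > x) = exp(-x / 2),
# so the 5% critical value solves exp(-x / 2) = 0.05.
critical_value = -2 * log(alpha)
p_value = exp(-chi_square / 2)

reject_null = chi_square > critical_value

print(f"tabulated chi-square at 5% on 2 d.f. = {critical_value:.3f}")
print(f"P = {p_value:.5f}; reject Ho: {reject_null}")
```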
SUMMARY
Study Objectives
Kind of Data
Distribution of Data
Size of Sample
THANK YOU
