HYPOTHESIS TESTING
When the number of observations is
large, the observed relative
frequency distribution will tend to
look like the true probability distribution.
Statistical Inference
Consider the probability distribution of a
six-faced die. When an unbiased die was tossed
60 and 6000 times in simulation, the following
frequency distributions were observed.
Value on face            1      2      3      4      5      6

No. of tosses = 60
Frequency                9     13     11      3     15      9
Probability          0.150  0.217  0.183  0.050  0.250  0.150

No. of tosses = 6000
Frequency              964   1045    994    983   1021    993
Probability          0.161  0.174  0.166  0.164  0.170  0.166
Statistical Inference
Notice that with 6000 tosses the observed
frequency distribution is virtually identical to the
true underlying probability distribution (which is 1/6
= 0.167), while with 60 tosses the distributions are
quite different.
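The effect of the number of tosses can be checked with a short simulation. The sketch below is only illustrative: the function name `relative_frequencies` and the fixed seed are assumptions made here, and a different seed will give different counts from those in the table.

```python
import random
from collections import Counter


def relative_frequencies(n_tosses, seed=0):
    """Simulate n_tosses of a fair six-faced die; return {face: relative frequency}."""
    rng = random.Random(seed)  # fixed seed (arbitrary choice) so runs are repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_tosses))
    return {face: counts[face] / n_tosses for face in range(1, 7)}


for n in (60, 6000):
    freqs = relative_frequencies(n)
    # Largest gap between an observed relative frequency and the true value 1/6
    worst = max(abs(f - 1 / 6) for f in freqs.values())
    print(n, {face: round(f, 3) for face, f in freqs.items()}, "max deviation:", round(worst, 3))
```

With more tosses the maximum deviation from 1/6 tends to shrink, which is the behaviour the table illustrates.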
In practice, we do not know the true population
probability distribution, but wish to infer, from our
sample, something about it.
Statistical inference is the process of using
samples to make inferences (formal methods for
drawing conclusions) about a population.
Statistical inference includes
hypothesis testing.
Hypothesis Testing
4. Compare the observed value of the
statistic to the critical value
obtained for the chosen significance level α.
5. Make a decision.
Types of tests
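1. Two-tailed test
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu \neq \mu_0 \quad (\mu - \mu_0 \neq 0)$
$z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
$z_{tabulated} = z_{\alpha/2}$ for a two-tailed test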
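2. One-tailed test (right-tailed)
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu > \mu_0 \quad (\mu - \mu_0 > 0)$
$z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Decision: reject $H_0$ if $z_{cal} > z_{\alpha}$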
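3. One-tailed test (left-tailed)
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu < \mu_0 \quad (\mu - \mu_0 < 0)$
Decision: reject $H_0$ if $z_{cal} < -z_{\alpha}$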
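The comparison of a calculated z with a tabulated z can be sketched in code. This is a minimal illustration, not a definitive implementation: the function name `one_sample_z_test`, the `alternative` labels, and the numbers in the usage lines are assumptions introduced here, and the population standard deviation is assumed known.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z_cal, z_tab, reject) for a one-sample z test with known sigma."""
    z_cal = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal distribution
    if alternative == "two-sided":
        z_tab = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        reject = abs(z_cal) > z_tab
    elif alternative == "greater":
        z_tab = std_normal.inv_cdf(1 - alpha)  # z_{alpha}
        reject = z_cal > z_tab
    elif alternative == "less":
        z_tab = std_normal.inv_cdf(1 - alpha)  # z_{alpha}, compared on the lower tail
        reject = z_cal < -z_tab
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z_cal, z_tab, reject


# Hypothetical numbers, chosen only to exercise the function
z_cal, z_tab, reject = one_sample_z_test(xbar=52, mu0=50, sigma=10, n=100)
print(round(z_cal, 2), round(z_tab, 2), reject)
```

The tabulated value comes from the inverse of the standard normal cumulative distribution, so no printed z table is needed.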
Types of errors
There are two types of errors
Type of decision    H0 true                     H0 false
Reject H0           Type I error (α)            Correct decision (1 − β)
Accept H0           Correct decision (1 − α)    Type II error (β)
Courtroom analogy (H0: the defendant is not guilty)

Verdict             Guilty          Not guilty
Guilty              OK              Error (α)
Not guilty          Error           OK
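The meaning of α as the probability of a Type I error can be checked by simulation. The sketch below repeatedly samples from a population in which H0 is true and counts how often a two-tailed z test rejects it. The function name `type_i_error_rate`, the sample size, the number of trials, and the seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(n=25, trials=10000, alpha=0.05, seed=1):
    """Estimate how often a two-tailed z test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for repeatability
    z_tab = NormalDist().inv_cdf(1 - alpha / 2)
    mu0, sigma = 0.0, 1.0  # the samples really do come from a population with mean mu0
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_cal = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z_cal) > z_tab:
            rejections += 1  # H0 is true here, so every rejection is a Type I error
    return rejections / trials


print(type_i_error_rate())
```

The estimated rate should land close to the chosen α, since every rejection in this setup is a false rejection.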
1. Hypothesis
H0: μ = μ0, P = P0
HA: these are different
2. Data
3. Test statistic

Conclusion      H0 true     H0 false
Accept H0       OK          Error
Reject H0       Error       OK
Test Statistics
In hypothesis testing we always start with
statements for H0 and HA.
Because of random variation, even an unbiased
sample may not accurately represent the
population as a whole.
As a result, it is possible that any observed
differences or associations may have occurred
by chance.
Statistical testing of a research hypothesis
allows the researcher to quantify the risk of error
involved in making inferences about a
population based on the information obtained
from a sample.
Test Statistics...
A test statistic is a value we can compare
with the known distribution of what we expect
when the null hypothesis is true.
The general formula of the test statistic is:
$\text{Test statistic} = \dfrac{\text{Observed value} - \text{Hypothesized value}}{\text{Standard error}}$
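The general formula translates directly into a one-line function. The name `standardized_statistic` is an assumption made for this sketch. The usage line plugs in the rounded figures from the childhood-abuse example in this document (observed proportion 0.175, hypothesized 0.3, standard error 0.0149).

```python
def standardized_statistic(observed, hypothesized, standard_error):
    """(observed value - hypothesized value) / standard error."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    return (observed - hypothesized) / standard_error


# Rounded inputs taken from the proportion example in this document
print(round(standardized_statistic(0.175, 0.3, 0.0149), 2))
```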
The P-Value
Example
In a study of childhood abuse in psychiatric patients, Brown
found that 166 in a sample of 947 patients reported histories of
physical or sexual abuse.
a. Construct a 95% confidence interval for the population proportion.
b. Test the hypothesis that the true population proportion is
30%.
Solution (a)
The 95% CI for P is given by
$p \pm z_{\alpha/2} \sqrt{\dfrac{p(1-p)}{n}}$
$= 0.175 \pm 1.96 \sqrt{\dfrac{0.175 \times 0.825}{947}}$
$= 0.175 \pm 1.96 \times 0.0124$
$= [0.151 ; 0.200]$
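The interval for the proportion can be reproduced with a few lines of code. This is a sketch under the large-sample normal approximation used in the solution. The function name `proportion_ci` is an assumption, and because the code keeps the unrounded sample proportion 166/947, its digits may differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n  # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


low, high = proportion_ci(166, 947)
print(round(low, 3), round(high, 3))
```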
Solution (b)
$z_{cal} = \dfrac{p - P_0}{\sqrt{\dfrac{P_0(1-P_0)}{n}}} = \dfrac{0.175 - 0.3}{\sqrt{\dfrac{0.3(0.7)}{947}}} = \dfrac{-0.125}{0.0149} = -8.39$
$z_{tab} = 1.96$
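The test of the hypothesized proportion can be sketched as follows. The function name `one_proportion_z_test` is an assumption, and the two-sided p-value is included only as a convenience. Because the code uses the unrounded sample proportion, it gives a z of roughly -8.37 rather than the -8.39 obtained from rounded figures.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed z test of H0: P = p0; returns (z_cal, z_tab, p_value)."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z_cal = (p_hat - p0) / se0
    std_normal = NormalDist()
    z_tab = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * std_normal.cdf(-abs(z_cal))  # area in both tails beyond |z_cal|
    return z_cal, z_tab, p_value


z_cal, z_tab, p_value = one_proportion_z_test(166, 947, 0.3)
print(round(z_cal, 2), round(z_tab, 2), p_value)
print("reject H0" if abs(z_cal) > z_tab else "do not reject H0")
```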
Example
Step 4: Comparison of the calculated and tabulated
values of the test statistic
Since the absolute value of the calculated test
statistic (8.39) is larger than the tabulated value
(1.96), we reject the null hypothesis.
Step 6: Conclusion
Hence we conclude that the proportion of childhood
abuse among psychiatric patients is different from 0.3.
If the sample size is small (np < 5 or n(1 − p) < 5),
then use Student's t statistic for the tabulated value
of the test statistic.
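The standard error of the difference between two sample means is
$\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
If $\sigma_1$ and $\sigma_2$ are unknown, then they can be
estimated by $s_1$ and $s_2$.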
$z_{cal} = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Hypothesis
$z_{tabulated} = z_{\alpha/2}$ for a two-tailed test
Example
Researchers wish to know if the data they have
collected provide sufficient evidence to indicate a
difference in mean serum uric acid levels between
normal individuals and individuals with Down's
syndrome. The data consist of serum uric acid
readings on 12 individuals with Down's syndrome and
15 normal individuals. The means are 4.5 mg/100 ml
and 3.4 mg/100 ml, with standard deviations of 2.9 and
3.5 mg/100 ml respectively.
$H_0: \mu_1 - \mu_2 = 0$
$H_A: \mu_1 - \mu_2 \neq 0$
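SOLUTION
$z_{cal} = \dfrac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{(4.5 - 3.4) - 0}{\sqrt{\dfrac{2.9^2}{12} + \dfrac{3.5^2}{15}}} = \dfrac{1.1}{\sqrt{1.5175}} = \dfrac{1.1}{1.23} = 0.89$
$z_{\alpha/2} = z_{0.025} = 1.96$
Since $|z_{cal}| = 0.89$ is smaller than 1.96, we do not reject the null hypothesis.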
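The two-sample calculation can be checked numerically. The sketch below uses the means and standard deviations given in the statement of the uric acid example (4.5 and 3.4 mg/100 ml, 2.9 and 3.5 mg/100 ml, n = 12 and 15). The function name `two_sample_z` is an assumption, and the sample standard deviations are used in place of the unknown population values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0, using sd1 and sd2 as estimates of sigma."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference in means
    return ((mean1 - mean2) - delta0) / se


z_cal = two_sample_z(4.5, 3.4, 2.9, 3.5, 12, 15)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)  # z_{0.025} for a two-tailed test
print(round(z_cal, 2), round(z_tab, 2))
print("reject H0" if abs(z_cal) > z_tab else "do not reject H0")
```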
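If the sample size is large and $n_1 p_1 > 5$, $n_1(1-p_1) > 5$,
$n_2 p_2 > 5$, $n_2(1-p_2) > 5$, then the confidence interval
for the difference between two proportions is
$(p_1 - p_2) \pm z_{\alpha/2} \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$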
$z_{cal} = \dfrac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$
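A short sketch of the two-proportion formulas follows. The counts in the usage lines are hypothetical, invented only to exercise the code, and the function name `two_proportion_z` is an assumption. The standard error follows the unpooled form written in this document. Some textbooks use a pooled proportion for the test instead, which this sketch does not do.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, delta0=0.0, confidence=0.95):
    """Return (z_cal, (low, high)) for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled standard error
    z_cal = ((p1 - p2) - delta0) / se
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    diff = p1 - p2
    return z_cal, (diff - z * se, diff + z * se)


# Hypothetical counts: 45 of 150 in group 1, 30 of 150 in group 2
z_cal, (low, high) = two_proportion_z(45, 150, 30, 150)
print(round(z_cal, 3), round(low, 4), round(high, 4))
```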
Chi square
Chi square = $\chi^2 = \sum \dfrac{(O_i - e_i)^2}{e_i}$
The sampling distribution of the chi-square
statistic is known as the chi-square distribution.
As in t distributions, there is a different
$\chi^2$ distribution for each different value
of degrees of freedom.
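The chi-square statistic can be computed from a table of observed counts as sketched below. Two things are assumptions here and are not taken from this document: the expected counts are obtained as row total × column total / grand total (the usual rule when testing independence), and the 2 × 2 table in the usage lines is made-up data. The function name `chi_square_statistic` is also hypothetical.

```python
def chi_square_statistic(observed):
    """Return (chi2, degrees_of_freedom) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence (assumed rule, see the note above)
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Made-up 2 x 2 table, for illustration only
chi2, dof = chi_square_statistic([[20, 30], [30, 20]])
print(chi2, dof)
```

The calculated value is then compared with the tabulated chi-square value for the same degrees of freedom and the chosen significance level.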
Example:
The following table shows the relation
between the number of accidents in 1
year and the age of the driver in a
random sample of 500 drivers aged
between 18 and 50. Test, at the 0.01 level
of significance, the hypothesis that
the number of accidents is
independent of the driver's age.
Observed frequency
Expected frequency
SUMMARY
THANK YOU