Random Processes
IC 210
Hypothesis Testing-1
Reference: Introductory Statistics by Prem S. Mann, Chapter 9 (available on Moodle)
Inferential Statistics
Statistics:
1. Model
2. Estimation
3. Hypothesis test
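Model: \( X_i \sim N(\mu, \sigma^2),\ i = 1, 2, \ldots, n \), iid.
Estimation: \( \hat{\mu} = \bar{x},\ \hat{\sigma}^2 = s^2 \)
Hypothesis test: \( \mu = \mu_0,\ \sigma^2 = \sigma_0^2 \)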
Hypothesis testing
The purpose of hypothesis testing is to determine whether there is
enough statistical evidence in favor of a certain belief about a parameter.
For Example:
A soft-drink company may claim that, on average, its cans contain 12
ounces of soda. A government agency may want to test whether or not
such cans do contain, on average, 12 ounces of soda. Here we are to
test a hypothesis about the population mean μ.
According to one survey, 75% of the total charitable contributions in
2008 were given by individuals. An economist wants to check whether this
percentage still holds this year. Here we are to test a hypothesis
about the population proportion p.
Hypothesis testing
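[Figure: sampling distribution p(X̄) of the sample mean, with values of X̄ marked as highly plausible, fairly plausible, and implausible; the same distribution is also shown on the standardized scale as p(z)]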
Note: the size of the rejection region depends on the value assigned to α.
A Type II error occurs when the company's claim is false (that is, the
soda contained in all cans, on average, is less than 12 ounces), but it
happens by chance that we draw a sample with a mean that is close to or
greater than 12 ounces, and we wrongly fail to reject the claim.
The value of β represents the probability of making a Type II error.
It represents the probability that H0 is not rejected when H0 is
false.
β = P(H0 is not rejected | H0 is false)
Jury Trial
                 Actual situation
Verdict          Innocent         Guilty
Innocent         Correct          Error
Guilty           Error            Correct

Hypothesis Test
                 Actual situation
Decision         H0 true                              H0 false
Accept H0        Correct (1 - α)                      Type II error, false negative (β)
Reject H0        Type I error, false positive (α)     Correct, power (1 - β)

P(accept H0 | H0 true) = 1 - α
P(accept H0 | H0 false) = β
P(reject H0 | H0 true) = α
P(reject H0 | H0 false) = 1 - β (power)
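The four probabilities above can be made concrete with a small numerical sketch. The example below computes α, β, and power for a left-tailed z-test on the soda-can claim. The population standard deviation, the sample size, and the "true" mean are all hypothetical values chosen for illustration; none of them come from the slides, and the function name is made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_error_rates(mu0, mu_true, sigma, n, alpha):
    """Return (alpha, beta, power) for a left-tailed z-test of H0: mu = mu0."""
    se = sigma / sqrt(n)                      # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(alpha)      # critical z (negative for a left tail)
    x_crit = mu0 + z_crit * se                # H0 is rejected when the sample mean falls below this
    # beta = P(H0 not rejected | true mean is mu_true) = P(sample mean >= x_crit)
    beta = 1.0 - NormalDist(mu_true, se).cdf(x_crit)
    return alpha, beta, 1.0 - beta


# Assumed numbers: sigma = 0.3 oz, n = 36 cans, true mean = 11.9 oz (all hypothetical).
alpha, beta, power = left_tailed_error_rates(
    mu0=12.0, mu_true=11.9, sigma=0.3, n=36, alpha=0.05
)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions, making α smaller moves the cutoff further into the left tail, which raises β and lowers the power; this is the trade-off the table summarizes.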
Note
By rejecting H0, we are saying that the difference between
the value of μ stated in H0 and the value of x̄ obtained from
the sample is too large to have occurred because of
sampling error alone. Consequently, this difference is real.
By not rejecting H0, we are saying that the difference
between the value of μ stated in H0 and the value of x̄
obtained from the sample is small and may have
occurred because of sampling error alone.
Tailed Tests
Right-tailed test
Example: The average price of homes in New Jersey was
$461,216 in 2007. Suppose a real estate researcher wants to
check whether the current mean price of homes in this town is
higher than $461,216.
H0: μ = $461,216
H1: μ > $461,216
Left-tailed test
Example: The company claims that their soft-drink cans, on
average, contain 12 ounces of soda. However, if these cans
contain less than the claimed amount of soda, then the company
can be accused of cheating. Suppose a consumer agency wants
to test whether the mean amount of soda per can is less than 12
ounces.
H0: μ = 12 ounces (the mean is equal to 12 ounces)
H1: μ < 12 ounces (the mean is less than 12 ounces)
Hypothesis tests
Type I and type II errors
Type I error: H0 rejected, when H0 is true.
Type II error: H0 not rejected, when H0 is false.
Significance level: α is the probability of committing a
Type I error.
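One-sided test
[Figure: one-sided test, with a rejection region of area α in one tail]
Two-sided test
[Figure: two-sided test, with a rejection region of area α/2 in each tail]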
X̄ = 265
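Calculating Test Statistics
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{N}} \]
\[ z_c = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \]
\[ z_c = 1.80 \]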
Procedure
First, we find the critical value(s) of z from the normal
distribution table for the given significance level.
Then we find the value of the test statistic z for the observed
value of the sample statistic.
Finally, we compare these two values and make a decision.
Remember, if the test is one-tailed, there is only one critical
value of z, obtained by using the value of α, which gives
the area in the left or right tail of the normal distribution curve
depending on whether the test is left-tailed or right-tailed,
respectively. However, if the test is two-tailed, there are two
critical values of z, obtained by using an area of α/2 in each
tail of the normal distribution curve.
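The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: it uses the standard library's `NormalDist.inv_cdf` in place of a printed normal table, and the function name and the `tail` argument are hypothetical. The value z = 1.80 is reused from the earlier slide purely as an example input; the choice of α = 0.05 and of the tail is assumed here.

```python
from statistics import NormalDist


def z_test_decision(z_observed, alpha, tail):
    """Return (critical values, reject H0?) for a left-, right-, or two-tailed z-test."""
    std_normal = NormalDist()
    if tail == "left":
        critical = (std_normal.inv_cdf(alpha),)          # area alpha in the left tail
        reject = z_observed < critical[0]
    elif tail == "right":
        critical = (std_normal.inv_cdf(1 - alpha),)      # area alpha in the right tail
        reject = z_observed > critical[0]
    elif tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)        # area alpha/2 in each tail
        critical = (-upper, upper)
        reject = abs(z_observed) > upper
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return critical, reject


# Same observed statistic, two different alternatives (alpha = 0.05 assumed).
critical_right, reject_right = z_test_decision(1.80, alpha=0.05, tail="right")
critical_two, reject_two = z_test_decision(1.80, alpha=0.05, tail="two")
print(critical_right, reject_right)
print(critical_two, reject_two)
```

With these inputs the same statistic leads to rejection in the right-tailed test but not in the two-tailed test, because the two-tailed critical value is larger.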
Problem: A used car dealer says that the mean price of a 1995
Ford F-150 Super Cab is at least $16,500. You suspect this claim is
incorrect and find that a random sample of 14 similar vehicles has a
mean price of $15,700 and a standard deviation of $1250. Is there
enough evidence to reject the dealer's claim at α = 0.05?
Solution:
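The claim is that the mean price is at least $16,500.
H0: μ ≥ $16,500 (claim) and H1: μ < $16,500
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{15700 - 16500}{1250 / \sqrt{14}} \approx -2.39 \]
With d.f. = 14 − 1 = 13, the left-tailed critical value at α = 0.05 is −1.771. Because −2.39 < −1.771, H0 is rejected: there is enough evidence at the 5% level to reject the dealer's claim.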
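As a check on the arithmetic, the test statistic for the dealer problem can be recomputed directly from the numbers in the problem statement. The critical value −1.771 is the standard t-table entry for 13 degrees of freedom with 0.05 in one tail; it is hard-coded here as an assumption because the standard library has no t distribution. The function name is illustrative only.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


# Values from the problem statement.
t = t_statistic(sample_mean=15700, mu0=16500, sample_sd=1250, n=14)

# Assumed table value: t with 13 d.f., left tail area 0.05.
T_CRITICAL_LEFT = -1.771

reject_h0 = t < T_CRITICAL_LEFT   # left-tailed test: reject when t falls below the cutoff
print(f"t = {t:.2f}, reject H0: {reject_h0}")
```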
The claim is that the mean pH level is 6.8. So, the null and alternative
hypotheses are:
H0: μ = 6.8 (claim) and Ha: μ ≠ 6.8
The test is two-tailed and the level of significance is α = 0.05.
There are d.f. = 19 − 1 = 18 degrees of freedom, and the critical values are
−2.101 and 2.101. The rejection regions are t < −2.101 and t > 2.101.
Using the t-test, the standardized test statistic is:
\[ t_0 = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{6.7 - 6.8}{0.24 / \sqrt{19}} \approx -1.82 \]
The graph shows the location of the rejection region and the standardized
test statistic, t0. Because t0 is not in the rejection region, you should decide
not to reject the null hypothesis. There is not enough evidence at the 5%
level of significance to reject the claim that the mean pH is 6.8.
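The two-tailed decision for the pH example can be reproduced with a short sketch. The sample mean, standard deviation, sample size, and the critical value 2.101 are taken from the solution above; the function name is hypothetical.

```python
from math import sqrt


def two_tailed_t_decision(sample_mean, mu0, sample_sd, n, t_critical):
    """Return (t0, reject H0?) for a two-tailed one-sample t-test."""
    t0 = (sample_mean - mu0) / (sample_sd / sqrt(n))
    # Reject only when t0 lands in either tail beyond the critical value.
    return t0, abs(t0) > t_critical


t0, reject_h0 = two_tailed_t_decision(
    sample_mean=6.7, mu0=6.8, sample_sd=0.24, n=19, t_critical=2.101
)
print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
```

Comparing |t0| with the positive critical value is equivalent to checking both rejection regions separately, since the t distribution is symmetric about zero.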
t distribution table
Probability Values
Z statistic (obtained): The test statistic
computed by converting a sample statistic
(such as the mean) to a Z score. The
formula for obtaining Z varies from test to
test.
P value: The probability associated with the
obtained value of Z.
Probability Values
Sample data:
36.8   36.6   38.0   37.4
37.6   37.0   37.2   38.2
36.8   37.6   37.4   36.1
38.7   36.2   37.2   37.5
Mean = 37.22, SD = 0.68, SE = 0.161
t0 = 2.38, P = 0.029
Standard error = s / √n
Two-tailed probability (p value)
d.f.    0.10     0.05     0.01
1       6.314    12.706   63.657
5       2.015    2.571    4.032
10      1.813    2.228    3.169
17      1.740    2.110    2.898
20      1.725    2.086    2.845
24      1.711    2.064    2.797
25      1.708    2.060    2.787
∞       1.645    1.960    2.576
From the t table: t(17, 0.025) = 2.11, so the rejection regions are t < −2.11 and t > +2.11.
Calculated t0 = 2.38, p-value = 0.029.
Since t0 > t(17, 0.025), reject the null hypothesis.
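The table lookup and comparison can be sketched as follows. The critical values are copied from the t table above (the column containing 12.706 for 1 d.f.), and t0 = 2.38 and SD = 0.68 are the reported values. The sample size n = 18 is an inference from the 17 degrees of freedom used in the lookup, not a number stated in the slides, so treat it as an assumption.

```python
from math import sqrt

# Critical values from the t table above (0.025 in each tail), keyed by d.f.
T_CRITICAL_025 = {1: 12.706, 5: 2.571, 10: 2.228, 17: 2.110,
                  20: 2.086, 24: 2.064, 25: 2.060}

n = 18                 # assumed: inferred from d.f. = n - 1 = 17
sd = 0.68              # reported sample standard deviation
t0 = 2.38              # reported test statistic

se = sd / sqrt(n)                      # standard error = s / sqrt(n)
t_critical = T_CRITICAL_025[n - 1]     # table lookup for 17 d.f.
reject_h0 = abs(t0) > t_critical       # two-tailed comparison

print(f"SE = {se:.3f}, critical t = {t_critical}, reject H0: {reject_h0}")
```

The computed standard error is close to the reported 0.161; the small difference is consistent with the SD having been rounded to two decimals.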
Example using p value
[Figure: two-tailed test with test statistic T = −0.342 and T = 0.342; each tail has probability 0.367]
Since our alternative hypothesis was two-sided, our p-value is the sum of both tail probabilities (0.367 + 0.367 = 0.734).
Statistical Significance
[Figure: two-tailed test with T = −2.32 and T = 2.32; each tail has probability 0.010]
Looking up the probability in the table, we see that the two-sided p-value is 0.010 + 0.010 = 0.02.
Since the p-value is less than 0.05, we can reject the null
hypothesis.
Conclusion: people below the poverty line have significantly (at the α = 0.05
level) lower calcium intake than the RDA.
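The doubling of the tail probability can be sketched in code. This version assumes the standard normal distribution as an approximation for the tail areas, because the slides do not state the degrees of freedom behind T = 2.32; with a t distribution the numbers would differ slightly. The function name is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(statistic):
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    tail = 1.0 - NormalDist().cdf(abs(statistic))   # area beyond |statistic| in one tail
    return 2.0 * tail                               # sum of both tail probabilities


p_value = two_sided_p_value(-2.32)
significant = p_value < 0.05          # compare with the significance level alpha = 0.05
print(f"p-value = {p_value:.3f}, significant at 0.05: {significant}")
```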