
Probability and Statistics Unit 12

Unit 12 Hypothesis Testing


Structure:
12.1 Introduction
Objectives
12.2 Testing Hypothesis
Null and Alternate Hypothesis
Interpreting the Level of Significance
Hypotheses are accepted and not proved
12.3 Selecting a significance level
12.4 One Tailed Test and Two Tailed Test
12.5 Tests of Hypothesis Concerning Large Samples
Testing Hypothesis about population Mean
Testing Hypothesis for the Difference Between Two Means
Test of Hypothesis Concerning Attributes
Testing Hypothesis about a population Proportion
Testing Hypothesis about Difference Between Two Proportions
12.6 Summary
12.7 Terminal Questions
12.8 Answers

12.1 Introduction
In the previous unit we studied sampling theory; in this unit
we shall study the testing of hypotheses. Hypothesis testing examines whether
an assumed value of a population parameter is supported by the sample
evidence, for example whether it lies within the confidence interval derived
from the sample. Hypothesis testing is helpful in decision making. Before
starting this unit, refresh the concepts you have studied on estimation.
Hypothesis testing begins with an assumption, called a hypothesis that we
make about a population parameter. We assume a certain value for a
population parameter. To test the validity of our assumption, we gather
sample data and determine the difference between the hypothesized value
and the actual value of the sample statistic. Then we judge whether the
difference is significant.
The smaller the difference, the greater the likelihood that our hypothesized
value for the parameter is correct. The larger the difference, the smaller the
likelihood that our hypothesized value for the parameter is correct.

Sikkim Manipal University Page No.: 360


Unfortunately, the difference between the hypothesized population
parameter and the actual statistic is more often neither so large that we
automatically reject our hypothesis nor so small that we just as quickly
accept it. So in hypothesis testing, as in most significant real-life decisions,
clear-cut solutions are the exception, not the rule.
Objectives:
At the end of this unit the student should be able to:
describe the basic concepts of hypothesis testing
explain the different types of error
identify the appropriate test for a given problem

12.2 Testing of Hypothesis


A hypothesis is some statement or assertion about a population which we
want to verify on the basis of information available from a sample.
There are two types of hypotheses:
1. Null hypothesis
2. Alternative Hypothesis
12.2.1 Null and Alternative Hypothesis
Null Hypothesis
According to R.A. Fisher, the null hypothesis is the hypothesis which is tested
for possible rejection under the assumption that it is true.
In testing of hypothesis we always begin with an assumption about the value
of a population parameter. This assumption is called the null hypothesis.
The null hypothesis asserts that there is no significant difference between
the sample statistic and the population parameter. Any difference observed
between the sample statistic and the population parameter is attributed to
fluctuations in sampling from the same population.
The null hypothesis is the hypothesis which is to be verified with the help of
the given sample; that is, it is the hypothesis under test.
In hypothesis testing, we must state the assumed or hypothesized value of
the population parameter before we begin sampling. The assumption we
wish to test is called the null hypothesis and is symbolized by H0.


Example: We want to test the hypothesis that the population mean is equal
to 500. We would symbolize it as follows and read it as,
The null hypothesis is that the population mean = 500, written as
$H_0: \mu = 500$

Alternative Hypothesis
A hypothesis which differs from the null hypothesis is called the alternative
hypothesis. It is denoted by H1. The two hypotheses H0 and H1 are mutually
exclusive: if one of them is accepted then the other is rejected, and vice
versa.
Example: If we want to test the success rate of a particular treatment, we
state the null hypothesis for the success rate p (for the test value of 0.99) as
$H_0: p = 0.99$, and the alternative hypothesis is one of
$H_1: p \neq 0.99$
$H_1: p > 0.99$
$H_1: p < 0.99$
Example: If we want to test whether educational qualification has
any influence on the income of an individual, we state the null hypothesis as
$H_0$: Educational qualification has no influence on the income of an individual
and the alternative hypothesis as
$H_1$: Educational qualification has an influence on the income of an individual
12.2.2 Interpreting the Level of Significance
The purpose of hypothesis testing is not to question the computed value of
the sample statistic but to make a judgment about the difference between
that sample statistic and a hypothesized value for population parameter.
The next step after stating the null and alternative hypotheses is to decide
what criterion to use for deciding whether to accept or reject the null
hypothesis. If we assume the hypothesis is correct, then the significance
level indicates the percentage of sample statistics that fall outside certain
limits (in estimation, the confidence level indicates the percentage of sample
statistics that fall within the defined confidence limits).


12.2.3 Hypotheses are Accepted and Not Proved


Even if our sample statistic does fall in the non-shaded region (the region
shown in figure 12.1 that makes up 95 percent of the area under the curve),
this does not prove that our null hypothesis (H0) is true; it simply does not
provide statistical evidence to reject it.
Therefore, whenever we say that we accept the null hypothesis, we actually
mean that there is not sufficient statistical evidence to reject it. Use of the
term accept, instead of do not reject, has become standard practice. It
means that when the sample data do not lead us to reject a null hypothesis,
we behave as if that hypothesis is true.

Fig. 12.1: Acceptance and rejection region of sample

12.3 Selecting a significance level


There is no single standard or universal level of significance for testing
hypotheses. The 5% and 1% levels of significance are the most commonly used;
they mean that, when the null hypothesis is true, the test will correctly
accept it in 95% or 99% of samples respectively.
It is, however, possible to test a hypothesis at any level of significance. But
remember that our choice of the minimum standard for an acceptable
probability, or the significance level, is also the risk we assume of rejecting a
null hypothesis when it is true.
The higher the significance level we use for testing a hypothesis, the higher
the probability of rejecting a null hypothesis when it is true. The 5% level of
significance implies we are ready to reject a true hypothesis in 5% of cases.


If the significance level is high then we would rarely accept the null hypothesis
when it is not true but, at the same time, often reject it when it is true.
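The statement that a test at the 5% level rejects a true hypothesis in about 5% of cases can be checked by simulation. The following is only a minimal sketch and is not part of the original unit: it assumes samples of size 36 drawn from a standard normal population (so that H0: mean = 0 is true and the standard deviation 1 is known). The function name `type_one_error_rate`, the sample size, the number of trials and the seed are illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(alpha, trials=2000, n=36, seed=1):
    """Estimate how often a TRUE H0: mean = 0 is rejected by a two-tailed z-test."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = 1 / sqrt(n)                # population standard deviation assumed to be 1
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # H0 is true for every sample
        z = (sum(sample) / n) / std_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


for level in (0.01, 0.05, 0.10):
    print(f"significance level {level:.2f}: rejected in "
          f"{type_one_error_rate(level):.3f} of the samples")
```

The observed rejection rate should lie close to the chosen significance level, and it rises as the level is raised.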
When testing a hypothesis we come across four possible situations.
Possible situations when testing a hypothesis

                        Decision from Sample
True State              Reject H0               Accept H0
H0 True                 Wrong (Type I Error)    Correct
H0 False (H1 True)      Correct                 Wrong (Type II Error)

The combinations are:


1. If the null hypothesis is true and the test result leads us to accept it,
then we have made a right decision.
2. If the null hypothesis is true and the test result leads us to reject it,
then we have made a wrong decision (Type I error). The probability of this
error is known as the producer's risk, denoted by $\alpha$.
3. If the null hypothesis is false and the test result leads us to accept it,
then we have made a wrong decision (Type II error). The probability of this
error is known as the consumer's risk, denoted by $\beta$, where $1 - \beta$
is called the power of the test.
4. If the null hypothesis is false and the test result leads us to reject it,
we have made a right decision.
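The probabilities of the two kinds of error can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a right-tailed z-test of a mean with a known standard error, and the function name `error_probabilities` as well as the numbers in the usage lines are hypothetical, not taken from this unit.

```python
from statistics import NormalDist


def error_probabilities(mu0, mu1, std_error, alpha):
    """Return (alpha, beta, power) for a right-tailed z-test of H0: mu = mu0
    when the true mean is mu1 (assumed greater than mu0)."""
    std_normal = NormalDist()
    # H0 is rejected when the sample mean exceeds this cut-off.
    critical_mean = mu0 + std_normal.inv_cdf(1 - alpha) * std_error
    # Type II error: the sample mean stays below the cut-off although mu = mu1.
    beta = NormalDist(mu1, std_error).cdf(critical_mean)
    return alpha, beta, 1 - beta


# Hypothetical numbers: H0: mu = 500, true mean 503, standard error 1.5.
for level in (0.01, 0.05, 0.10):
    _, beta, power = error_probabilities(500, 503, 1.5, level)
    print(f"alpha = {level:.2f}  beta = {beta:.3f}  power = {power:.3f}")
```

For a fixed sample, lowering the probability of a Type I error raises the probability of a Type II error, which is why the two risks have to be balanced when the significance level is chosen.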

12.4 One Tailed Test and Two Tailed Test


There are two types of problems of tests of hypothesis
1. Two tailed Test
2. One tailed Test
One tailed test is again classified into two types
a) Right Tailed Test
b) Left Tailed Test
Two Tailed Test: A two-tailed test is a test of any statistical hypothesis
where the alternative hypothesis is written with the symbol $\neq$.
That is, a two-tailed test of a hypothesis will reject the null hypothesis if the
sample mean is significantly higher or lower than the hypothesized
population mean. Thus, in a two-tailed test, the rejection region is split into
two parts, one in each tail of the distribution curve.
A two-tailed test is appropriate when:
the null hypothesis is $\mu = \mu_0$ (where $\mu_0$ is some specified value)
the alternative hypothesis is $\mu \neq \mu_0$.

One Tailed Test:


When the hypothesis about the population mean is rejected only for values
of the sample mean falling into one of the tails of the sampling distribution,
the test is called a one-tailed test.
Right Tailed Test: A hypothesis test where the rejection region is located
at the extreme right of the distribution. A right-tailed test is conducted when
the alternative hypothesis (H1) contains the condition $H_1: \mu > \mu_0$
(the population mean is greater than a given quantity $\mu_0$).


Right-tailed Test
Left Tailed Test: A hypothesis test where the rejection region is located at
the extreme left of the distribution. A left-tailed test is conducted when the
alternative hypothesis (H1) contains the condition $H_1: \mu < \mu_0$
(the population mean is less than a given quantity $\mu_0$).
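The three rejection regions described above can be written as a single decision rule. The following sketch assumes the test statistic is a standard normal z value and takes the critical values from Python's standard library `statistics.NormalDist`; the function name `rejects_null` and its `tail` argument are illustrative, not notation from this unit.

```python
from statistics import NormalDist


def rejects_null(z, alpha=0.05, tail="two"):
    """Return True when the computed z value falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "two":      # H1: mu != mu0, alpha is split between both tails
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":    # H1: mu > mu0, all of alpha is in the right tail
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":     # H1: mu < mu0, all of alpha is in the left tail
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Critical values at the 5% level for the two kinds of test.
print("two-tailed:", round(NormalDist().inv_cdf(0.975), 2))
print("one-tailed:", round(NormalDist().inv_cdf(0.95), 2))
```

A value such as z = 1.8 is rejected by a right-tailed test at the 5% level but not by a two-tailed test, because the two-tailed test places only 2.5% of the area in each tail.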

12.5 Tests of Hypothesis Concerning Large Samples


A sample whose size exceeds 30 is called a large sample; otherwise
it is considered a small sample. The tests of hypothesis for large samples
rest on the following assumptions:
(i) The sampling distribution of a sample statistic is approximately
normal.
(ii) The values given by the sample are sufficiently close to the population
values and can be used in their place when computing the standard error
of the estimate.


12.5.1 Testing of Hypothesis About Population Mean:


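(i) We shall first take the hypothesis testing concerning the population
parameter $\mu$ by considering the two-tailed test:
$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$ (where $\mu_0$ is the hypothesized value)
Since the best unbiased estimator of $\mu$ is the sample mean $\bar{x}$,
the test statistic is
$z = \frac{\bar{x} - \mu_0}{\text{S.E.}(\bar{x})}$, where $\text{S.E.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}$
If the calculated value of $|z| > z_{\alpha/2}$, the null hypothesis is
rejected. Here $z_{\alpha}$ denotes the value of the standard normal variate
that is exceeded with probability $\alpha$.
(ii) If the hypothesis involves a right-tailed test, for example,
$H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
For the calculated value $z > z_{\alpha}$, the null hypothesis is rejected.
(iii) If the hypothesis involves a left-tailed test, i.e.,
$H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
For the calculated value $z < -z_{\alpha}$, the null hypothesis is rejected.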


Example: The mean life time of a sample of 100 electrical bulbs produced
by a company is found to be 1,580 hours with standard deviation of 90
hours. Test the hypothesis that the mean life time of the bulbs produced by
the company is 1,600 hrs.
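Solution: The null hypothesis is that there is no significant difference
between the sample mean and the hypothetical population mean, i.e.
$H_0: \mu = 1600$, $H_1: \mu \neq 1600$
$z = \frac{\bar{x} - \mu}{\text{S.E.}(\bar{x})} = \frac{1580 - 1600}{9} = -2.22$
where $\text{S.E.}(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{90}{\sqrt{100}} = 9$
The example does not state a level of significance. Taking the 5% level, the
critical value of z for a two-tailed test is 1.96; since $|z| = 2.22$ is greater
than 1.96, the null hypothesis is rejected at that level.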
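The arithmetic of the bulb example can be reproduced in a few lines. This is a sketch under stated assumptions: the sample standard deviation is used in place of the unknown population value (as the large-sample assumptions allow), the function name `one_sample_z` is illustrative, and the 5% two-tailed level is assumed because the example itself does not state a level.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, hypothesized_mean, std_dev, n):
    """z statistic for a large-sample test about a population mean."""
    std_error = std_dev / sqrt(n)          # standard error of the sample mean
    return (sample_mean - hypothesized_mean) / std_error


z = one_sample_z(sample_mean=1580, hypothesized_mean=1600, std_dev=90, n=100)
critical = NormalDist().inv_cdf(0.975)     # two-tailed test, 5% level (assumed)

print("z =", round(z, 2))
print("reject H0" if abs(z) > critical else "accept H0")
```

The same function can be reused for the other mean problems in this unit by changing the four inputs.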

12.5.2 Testing Hypothesis for the Difference Between Two Means


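The test statistic for the difference between the means of two normally
distributed populations is based on the general form of the standard normal
statistic as given below:
$z = \frac{\hat{\theta} - \theta}{\text{S.E.}(\hat{\theta})}$
where $\theta = \mu_1 - \mu_2$. Since the best unbiased estimator of $\mu_1 - \mu_2$ is
$\bar{x}_1 - \bar{x}_2$, therefore $\hat{\theta} = \bar{x}_1 - \bar{x}_2$. The standard deviation of the sampling
distribution of $(\bar{x}_1 - \bar{x}_2)$ is given by
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
The test statistic z is given by
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
The null hypothesis is
$H_0: \mu_1 = \mu_2$
Hence, the z statistic =
$\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
At the 5% level of significance, the critical value of z for a two-tailed test is
$\pm 1.96$.
If the computed value of z is greater than 1.96 or less than -1.96, then reject
$H_0$, otherwise accept $H_0$.

Note: If $\sigma_1$ and $\sigma_2$ are not known, then for large samples $s_1$ and $s_2$ can
be used.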

Example: Details of two companies are

                                Company A    Company B
Mean life (in hours)            1,300        1,288
Standard deviation (in hours)   82           93
Sample size                     100          100

Which brand of test tubes is better if the desired risk is 5%?


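Solution: Let the null hypothesis be that there is no significant difference
in the quality of the two brands of test tubes, i.e.,
$H_0: \mu_1 = \mu_2$
Note that here $\sigma_1$ and $\sigma_2$ are not known; therefore, we can replace them by
$s_1$ and $s_2$.
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{1300 - 1288}{\sqrt{\frac{82^2}{100} + \frac{93^2}{100}}} = \frac{12}{12.40} = 0.968$
Since z = 0.968 is less than the critical value of z = 1.96 at the 5% level, we
accept the null hypothesis. Hence the quality of the two brands does not differ
significantly.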

12.5.3 Testing of Hypothesis Concerning Attributes


In this case we treat the data as a binomial-type problem. The selection of an
individual in sampling is called an event, the appearance of an attribute
is called a success and its non-appearance is known as a failure. The sampling
distribution of the number of successes $x$ in $n$ events, being a binomial
model, has mean $np$ and standard deviation $\sqrt{npq}$, where $p$ is the
probability of success and $q = 1 - p$.

Then,
$z = \frac{x - np}{\sqrt{npq}} \sim N(0, 1)$

Example: In 600 throws of a six-faced die, odd points appeared 360 times.
Would you say that the die is fair at the 5% level of significance?

Solution: Let the null hypothesis be that the die is not biased.

$p = q = \frac{1}{2}$, n = 600, np = 300, $\sqrt{npq} = \sqrt{150} = 12.25$

Thus,
$z = \frac{x - np}{\sqrt{npq}} = \frac{360 - 300}{12.25} = 4.9$

Since the calculated value of z is greater than the tabulated value, i.e. z =
1.64 (and also greater than the two-tailed value 1.96), the null hypothesis is
rejected, i.e. the die is not fair at the 5% level of significance.
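The test for an attribute reduces to one line of arithmetic once the number of successes is known. The sketch below reproduces the die example; the function name `attribute_z` is an illustrative choice.

```python
from math import sqrt


def attribute_z(successes, n, p):
    """z statistic for an observed number of successes in n binomial trials."""
    q = 1 - p
    expected = n * p               # mean of the binomial distribution
    std_dev = sqrt(n * p * q)      # its standard deviation
    return (successes - expected) / std_dev


# 360 odd points in 600 throws; a fair die gives p = 1/2 for an odd point.
z = attribute_z(successes=360, n=600, p=0.5)
print("z =", round(z, 2))
```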


12.5.4. Testing Hypothesis About a Population Proportion:


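The population parameter of interest is the proportion $P$. If the sample size is
large, then the sample proportion p will be approximately normally distributed.
Then
$E(p) = P$ and $\text{S.E.}(p) = \sqrt{\frac{PQ}{n}}$
Therefore, the statistic is
$z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0, 1)$, where $Q = 1 - P$

If $|z|$ exceeds the critical value of z at level $\alpha$, the null hypothesis is
rejected at the $100\alpha\%$ level of significance.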

Example: A sales clerk in the departmental store claims that 60% of the
shoppers entering the store leave without making a purchase. A random
sample of 50 shoppers showed that 35 of them left without buying anything.
Are these sample results consistent with the claim of the sales clerk? Use a
level of significance of 0.05.

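Solution: The null hypothesis is
$H_0: P = 0.60$

The sample proportion p = $\frac{35}{50} = 0.70$
$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.70 - 0.60}{\sqrt{0.60 \times 0.40 / 50}} = \frac{0.10}{0.0693} = 1.44$

The critical value of z is 1.64 at the 5% level of significance.

Since the computed value of z is less than the critical value of z = 1.64
(and hence also less than the two-tailed value 1.96), the null hypothesis
cannot be rejected. Hence, based on this
sample data, we cannot reject the claim of the sales clerk.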
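The sales clerk example can be verified numerically. This is a minimal sketch; the function name `proportion_z` is illustrative, and the standard error is computed from the claimed proportion because the test is carried out under the assumption that the null hypothesis is true.

```python
from math import sqrt


def proportion_z(successes, n, claimed_proportion):
    """z statistic for a large-sample test about a population proportion."""
    sample_proportion = successes / n
    # Standard error under H0, so it uses the claimed proportion.
    std_error = sqrt(claimed_proportion * (1 - claimed_proportion) / n)
    return (sample_proportion - claimed_proportion) / std_error


# 35 of 50 shoppers left without buying; the clerk claims 60%.
z = proportion_z(successes=35, n=50, claimed_proportion=0.60)
print("z =", round(z, 2))
print("reject H0" if abs(z) > 1.96 else "cannot reject H0")   # two-tailed, 5% level
```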


12.5.5. Testing Hypothesis About the Difference Between Two Proportions

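Let $p_1$ and $p_2$ be the sample proportions obtained in large samples of sizes
$n_1$ and $n_2$ drawn from respective populations having proportions $P_1$ and $P_2$.
We can test the null hypothesis that there is no difference between the
population proportions, i.e.,
$H_0: P_1 = P_2$

The sampling distribution of the difference in proportions, $p_1 - p_2$, is normally
distributed with mean $P_1 - P_2$ and standard deviation
$\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$, where $Q_1 = 1 - P_1$ and $Q_2 = 1 - P_2$

So,
$z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$

If the null hypothesis is true, $p_1$ and $p_2$ are two independent unbiased
estimators of the same parameter $P$. The pooled estimate of $P$ is
the weighted mean of the two sample proportions, i.e.,
$\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$

Then $z = \frac{p_1 - p_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{q} = 1 - \hat{p}$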

Example: In a random sample of 100 persons taken from village A, 60 are
found to be consuming tea. In another sample of 200 persons taken from
village B, 100 persons are found to be consuming tea. Do the data reveal a
significant difference between the two villages so far as the habit of taking
tea is concerned?

Solution: Let us take the hypothesis that there is no significant difference
between the two villages so far as the habit of taking tea is concerned, i.e.,
$H_0: P_1 = P_2$

We are given:
$n_1 = 100$, $p_1 = \frac{60}{100} = 0.6$; $n_2 = 200$, $p_2 = \frac{100}{200} = 0.5$

The appropriate statistic to be used here is given by
$z = \frac{p_1 - p_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

where
$\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{60 + 100}{300} = 0.533$ and $\hat{q} = 1 - \hat{p} = 0.467$

so that
$z = \frac{0.6 - 0.5}{\sqrt{0.533 \times 0.467 \times \left(\frac{1}{100} + \frac{1}{200}\right)}} = \frac{0.1}{0.0611} = 1.64$

Since the computed value of z is less than the critical value of z = 1.96 at the
5% level of significance, we accept the hypothesis. Hence, we
conclude that there is no significant difference in the habit of taking tea in
the two villages A and B.

Example: Before an increase in excise duty on tea, 400 people out of a
sample of 500 people were found to be tea drinkers. After an increase in
duty, 400 people were tea drinkers in a sample of 600 people. State
whether there is a significant decrease in the consumption of tea.

Solution: Let us take the hypothesis that there is no significant decrease in
the consumption of tea after the increase in duty, i.e., $P_1 = P_2$

Given $n_1 = 500$, $p_1 = \frac{400}{500} = 0.8$; $n_2 = 600$, $p_2 = \frac{400}{600} = 0.667$

Then $z = \frac{p_1 - p_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \frac{400 + 400}{500 + 600} = 0.727$ and $\hat{q} = 1 - \hat{p} = 0.273$

$z = \frac{0.8 - 0.667}{\sqrt{0.727 \times 0.273 \times \left(\frac{1}{500} + \frac{1}{600}\right)}} = \frac{0.1333}{0.0270} = 4.94$

Since the computed value of z is greater than the critical value of z = 1.96
at the 5% level of significance, the hypothesis is rejected. Hence, there is
a significant decrease in the consumption of tea after an increase in duty.
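Both tea examples follow the same pooled-proportion recipe, so they can be checked with one helper. The sketch below is illustrative; the function name `two_proportion_z` is an assumption and the inputs are the counts quoted in the two examples.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: P1 = P2 using the pooled estimate of the common proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)     # weighted mean of the two sample proportions
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


z_villages = two_proportion_z(60, 100, 100, 200)   # village A versus village B
z_duty = two_proportion_z(400, 500, 400, 600)      # before versus after the duty increase

print("villages: z =", round(z_villages, 2))
print("excise duty: z =", round(z_duty, 2))
```

Comparing each value with 1.96 reproduces the two conclusions: the villages do not differ significantly, while the fall in tea drinking after the duty increase is significant.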

SAQ 1: From the following data obtained from a sample of 1,000 persons,
calculate the standard error of the mean:

Weekly Earnings (Rs. hundred)   0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
No. of persons                  50     100     150     200     200     100     100     100

Is it likely that the sample has come from a population with average
weekly earnings of Rs. 4,200?

SAQ 2: A sample of 400 managers is found to have a mean height of
171.38 cms. Can it be reasonably regarded as a sample from a large
population of mean height 171.17 cms and standard deviation of 3.30 cms?

SAQ 3: An intelligence test given to two groups of boys and girls gave the
following information:

         Mean Score   S.D.   Number
Girls    75           10     50
Boys     70           12     100

Is the difference in the mean scores of boys and girls statistically significant?

12.6 Summary
In this unit we studied the two types of hypothesis (the null hypothesis
and the alternative hypothesis), one-tailed and two-tailed tests, and the
different tests for large samples, with applications in daily life.


12.7 Terminal Questions


1. In a survey of buying habits, 400 women shoppers are chosen at
random in super market A. Their average weekly food expenditure is Rs.
250 with a standard deviation of Rs. 40. For another group of 400
women shoppers chosen at random in super market B, located in
another area of the same city, the average weekly food expenditure is
Rs. 220 with a standard deviation of Rs. 55. Test at the 1% level of
significance whether the average weekly food expenditures of the
two populations of women shoppers are equal.
2. A die is thrown 49152 times and of these 25145 yielded either 4 or 5 or
6. Is this consistent with the hypothesis that the die is unbiased?
3. An ambulance service claims that it takes, on the average, 8.9 minutes
to reach its destination in emergency calls. To check on this claim, the
agency which licenses ambulance services had them timed on 50
emergency calls, getting a mean of 9.3 minutes with a standard
deviation of 1.8 minutes. At the level of significance of 0.05, does this
constitute evidence that the figure claimed is too low?
4. A coin is tossed 100 times under identical conditions independently
yielding 30 heads and 70 tails. Test at 1% level of significance, whether
or not the coin is unbiased. State clearly the null hypothesis and the
alternative hypothesis.
5. A buyer of electrical bulbs bought 100 bulbs each of two famous brands.
Upon testing these he found that brand A had a mean life of 1500 hours
with a standard deviation of 50 hours, whereas brand B had a mean life of
1530 hours with a standard deviation of 60 hours. Can it be concluded at
the 5% level of significance that the two brands differ significantly in
the quality of the bulbs?


12.8 Answers
Self Assessment Questions

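1.
Weekly Earnings   X     f      d = (X - 45)/10   fd      fd²
0-10              5     50     -4                -200    800
10-20             15    100    -3                -300    900
20-30             25    150    -2                -300    600
30-40             35    200    -1                -200    200
40-50             45    200    0                 0       0
50-60             55    100    1                 100     100
60-70             65    100    2                 200     400
70-80             75    100    3                 300     900
Total                   N = 1000                 -400    3900

$\bar{x} = 45 + \frac{\sum fd}{N} \times 10 = 45 + \frac{-400}{1000} \times 10 = 41$

$s = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times 10 = \sqrt{3.9 - 0.16} \times 10 = 1.934 \times 10 = 19.34$

$\text{S.E.}(\bar{x}) = \frac{s}{\sqrt{N}} = \frac{19.34}{\sqrt{1000}} = 0.612$

Therefore, the standard error of mean is 0.612

The population average of Rs. 4,200 is 42 in units of Rs. hundred, so
$z = \frac{41 - 42}{0.612} = -1.63$

Since the computed value $|z| = 1.63$ is less than the critical value of
z = 1.96 at the 5% level of significance,
it is not significant and hence there is no significant difference
between the sample average and the population average weekly
earnings and the difference could have arisen due to fluctuations of
sampling.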
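The grouped-data working for SAQ 1 can be cross-checked directly from the class mid-points, without the step-deviation shortcut. The sketch below is illustrative: the variable names are arbitrary, the frequencies are those given in the question, and the hypothesized mean is written as 42 because the earnings are measured in hundreds of rupees.

```python
from math import sqrt

midpoints = [5, 15, 25, 35, 45, 55, 65, 75]            # class mid-points (Rs. hundred)
frequencies = [50, 100, 150, 200, 200, 100, 100, 100]  # number of persons per class

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / n
std_dev = sqrt(variance)
std_error = std_dev / sqrt(n)          # standard error of the mean

z = (mean - 42) / std_error            # Rs. 4,200 = 42 in units of Rs. hundred

print("mean =", mean)
print("standard deviation =", round(std_dev, 2))
print("standard error =", round(std_error, 3))
print("accept H0" if abs(z) < 1.96 else "reject H0")    # two-tailed test, 5% level
```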

2. The null hypothesis is that there is no significant difference between the
sample mean height and the population mean height. Given $\bar{x} = 171.38$,
$\mu = 171.17$, $\sigma = 3.30$ and $n = 400$.

Applying the test statistic,
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{171.38 - 171.17}{3.30 / \sqrt{400}} = \frac{0.21}{0.165} = 1.27$

Since the computed value of z = 1.27 is less than the critical value of z =
1.96 at the 5% level of significance, the null hypothesis is
accepted. Hence there is no significant difference between the sample
mean height and the population mean height.

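3. Let us take the hypothesis that the difference in the mean scores of boys
and girls is not significant, i.e., $H_0: \mu_1 = \mu_2$. Given that
$\bar{x}_1 = 75$, $s_1 = 10$, $n_1 = 50$ (girls) and $\bar{x}_2 = 70$, $s_2 = 12$, $n_2 = 100$ (boys)

The appropriate statistic to be used here is given by
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{75 - 70}{\sqrt{\frac{100}{50} + \frac{144}{100}}} = \frac{5}{1.855} = 2.70$

Since the computed value z = 2.70 is greater than the critical value of
z = 2.58 at the 1% level of significance, the hypothesis is rejected.
Hence, the difference in the mean scores of boys and girls is statistically
significant.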


Terminal Questions

1. z = 8.822. Since the value of z is greater than 3, the null hypothesis is
rejected. Hence, the average weekly expenditures of the two populations of
women shoppers differ significantly.
2. z = 5.13 (with np = 24576 and $\sqrt{npq} = 110.85$). Since the computed
value of z is greater than the critical value of z = 3, it is significant, and
therefore the null hypothesis is rejected. Hence the die cannot be regarded
as unbiased.
3. z = 1.6. Since the computed value of z = 1.6 is less than the critical
value of z = 1.96 at the 5% level of significance, the hypothesis is
accepted. Hence, there is no significant difference between the average
figure observed and the average figure claimed.
4. z = -4. Since the computed value $|z| = 4$ is greater than the critical value
of z = 2.58 at the 1% level of significance, we reject the null
hypothesis. Hence, the coin is biased.
5. z = -3.84. Since the computed value $|z| = 3.84$ is more than the table value
of z = 1.96 at the 5% level of significance, the null hypothesis is rejected. So
the brands of bulbs differ significantly in quality.
