
Chapter 9

Preliminary Concepts on Statistical Inference


In descriptive statistics, the use of single measures to describe a set of
data or a distribution was introduced. These measures are the measures of central
tendency and the measures of variability. The measures of central tendency include
the mean, the median and the mode, while the range, quartile range, mean deviation,
variance, standard deviation and the coefficient of variation are the measures of
variability.
The other branch of statistics is inferential statistics. This branch of
statistics enables us to make estimates of population values, called parameters,
and to state how acceptable our computed statistics are at some degree of
confidence. The statistical method concerned with making estimates of population
values is called statistical inference. This method will help us determine how
accurate and acceptable our generalizations are.

Statistics plays an important role in the field of applied scientific research. To
get data or information about the population, the researcher may use only a portion
of it in order to reduce cost and time constraints. Statistics offers
varied tools and techniques that help the researcher draw reliable and valid
inferences or generalizations about the population using the sample as basis.

At this point, certain basic concepts need to be clarified in order to
understand and appreciate the concept of statistical inference better. The following
sections discuss the two sub-areas of inferential statistics, namely statistical
estimation and the test of hypothesis.

Statistical Estimation

In statistical estimation, we also consider a population and a sample. Recall
that a population is an aggregate of persons, objects, events, places or reactions
to certain stimuli that have a unique pattern of qualities. Sometimes, this is referred
to as the universe in statistical investigation. A sample, on the other hand, is a
portion or a smaller part of the population that truly represents the unique
qualities or characteristics of the population. The acceptability of the sample
depends on how well the sampling technique has been selected and employed.

For example, in a study on the adjustment problems of freshman students
from the Department of Liberal Arts, the researchers considered all the students
coming from all seven (7) programs. However, they opted not to use all the
students on account of their large number. They used Slovin's formula to determine
the appropriate sample size. Stratified proportional sampling was employed to
determine the number of students per program. A Personal Adjustment inventory
test was administered, and they used the test results to report the adjustment
problems encountered by the entire group of freshman students belonging to the
department.

Inferential statistics helps facilitate our work. Imagine, for instance, that the
population of the department cited in the example above was 1,000 students
distributed among seven programs. It would be extremely laborious for the
researchers to involve all the students. They can make their work easier by drawing
a representative sample from each program that is proportional to the population
size of that program. Applying Slovin's formula with a 5% margin of error, the total
sample size will be only 286 students. And yet, they can report the results as
reflective of the adjustment of the whole group of freshman students.
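
The sample size quoted above can be checked with a short computation. The sketch below assumes Slovin's formula in its usual form, n = N / (1 + N e^2), rounded up to a whole student. The seven program sizes are hypothetical values invented for illustration; the text only states that they total 1,000.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole unit."""
    return math.ceil(population_size / (1 + population_size * margin_of_error ** 2))


def proportional_allocation(strata_sizes: dict, total_sample: int) -> dict:
    """Give each stratum a share of the sample proportional to its size."""
    population_size = sum(strata_sizes.values())
    return {
        name: round(total_sample * size / population_size)
        for name, size in strata_sizes.items()
    }


# Hypothetical enrollment per program (not from the text); total is 1,000.
program_sizes = {
    "Program A": 250, "Program B": 200, "Program C": 150, "Program D": 150,
    "Program E": 100, "Program F": 100, "Program G": 50,
}

n = slovin_sample_size(1000, 0.05)
allocation = proportional_allocation(program_sizes, n)

print(n)  # 286
print(allocation)
```

Because each share is rounded separately, the allocated counts may add up to one or two more or fewer than the overall sample size; in practice the researcher adjusts the largest strata to match.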

Parameter and Statistics

Statistical inference also deals with the concepts of parameter and
statistic, which are involved in estimation and, further, in the testing of
hypotheses.

Parameters exist whether they are computed or not. In practice, these
parameters are the attributes of a population. The numerical descriptive measures
of central tendency such as the mean $\mu$, median $\tilde{\mu}$ and mode
$\hat{\mu}$, and the measures of variation including the variance $\sigma^2$ and
the standard deviation $\sigma$, are usually unknown and can only be estimated by
invoking probability sampling techniques. The Greek symbols used here are read as
mu for the central measures and sigma for the measures of variation.

Statistics are the measures computed from the sample. The sample mean
is symbolized as $\bar{x}$ and the sample standard deviation as $s$. The term
statistic is synonymous with the concept of an estimate.

Sampling Methods Revisited

The degree to which a particular statistic approximates its corresponding
parameter value depends upon how impartially we have drawn our sample.
Sampling theory is based on the theoretical use of the word random.
Random sampling, which is the most commonly used sampling technique, has
two properties. The first is equiprobability, which means that each member of the
population has an equal chance of being drawn and included in the sample. If, for
instance, there are 500 members in the population, the probability of each member
being drawn is $\frac{1}{500}$. This is especially true when sampled cases are
replaced or returned to the original pool.

The second property is independence. This means that the chance of one
member being drawn does not affect the chances of the other members getting
chosen. For example, in a population where there are father and son members not
necessarily paired, when the father is drawn, this does not mean that the son will
automatically be included in the sample. Hence, the selection of the father is
independent of the inclusion of the son in this sample selection.

We know from Chapter 2 that sampling can be broken down into two types. The
first one is nonprobability sampling, where there is no way of estimating the
probability that each individual or element will be included in the sample. Examples
of this type are accidental sampling, quota sampling and purposive
sampling. Convenience and economy are the advantages of this type.

The second type is probability sampling, wherein every individual has
a known, nonzero chance of becoming a part of the sample (an equal chance in the
case of simple random sampling). Examples of this type include simple random
sampling, stratified random sampling, cluster sampling, systematic sampling and
multistage sampling. Note that whenever the sampling is not carried out as in
probability sampling, the result is a biased sample.

The sampling error is the difference between a particular parameter value and its
corresponding statistic. Suppose that someone administered an intelligence test
to a random sample of incoming freshmen in a certain college and computed the
mean. The statistic $\bar{x}$ is an estimate of the parameter mean $\mu$. The
sampling error, denoted as $e$, is the difference between the population mean and
the sample mean, thus $\mu - \bar{x} = e$.

Point and Interval Estimation

There are two types of estimators: the point estimator and the
interval estimator. The point estimator is a rule or formula that yields a single
value, computed from the gathered information, for estimating a particular
parameter. An example of a point estimator is the sample mean $\bar{x}$, which
estimates the value of the population mean $\mu$. The interval estimator is a rule
or formula that yields a range of values, computed from the gathered information,
within which the parameter to be estimated is expected to lie.

Standard Error

The standard error of the distribution of the means is denoted by
SE($\bar{x}$). If it were possible to draw a sample from a population 100 times and
compute the mean of each sample, we would obtain 100 means. If we were to
construct a frequency distribution of these 100 means, this distribution would be
referred to as a sampling distribution of the means.
Furthermore, if we compute the standard deviation of this distribution, the
value we get is the standard error of the means. In simple terms, the standard
error is the standard deviation of the sampling distribution of the means. It is an
index number that guides us in making certain conclusions concerning how far or
how close the mean of a given sample is from the measure we would get had we
involved the entire population.
The formula for obtaining the standard error of the sample mean is given by:

$SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$ (1)

The acceptability of the representativeness of a particular sample is
determined by the magnitude of the standard error. Furthermore, the formula above
indicates that the magnitude of the standard error of the distribution of the means
depends on two measures. The first is the standard deviation, or variability of
scores around the mean, and the second is the size of the sample, or the number of
cases being studied.
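
The dependence of the standard error on these two measures can be illustrated with a small sketch of formula (1). The scores below are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def standard_error_from_summary(s, n):
    """Same formula when only the standard deviation s and size n are known."""
    return s / math.sqrt(n)


# Hypothetical test scores, used only to exercise the function.
scores = [82, 75, 90, 68, 85, 79, 88, 73, 94]
se = standard_error(scores)
print(round(se, 3))

# Holding s fixed at 10, quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error_from_summary(10.0, n))
```

With the standard deviation held at 10, the loop prints standard errors of 2.0, 1.0 and 0.5, which shows why a larger sample gives a more trustworthy estimate of the mean.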

Hypothesis Testing

The goal of hypothesis testing is not to question the computed value of the sample
statistic but to make a judgment about the difference between the sample statistic
and a hypothesized population parameter.
Hypothesis testing enables a researcher to generalize about a population from
relatively small samples. In many instances, a researcher can only rely on the
information provided by a part of the population.
A hypothesis is a tentative explanation for certain events, phenomena or
behaviors. It is a statement of prediction of the relationship between or among
variables. It is also the most specific statement of a problem, in which the
variables are considered measurable and the statement specifies how these variables
are related. Furthermore, this statement is testable, which means that the
relationship between the variables can be put to the test by applying an
appropriate statistical test to the data gathered about the variables.

Null and Alternative Hypothesis


There are two kinds of hypotheses, the null hypothesis and the alternative
hypothesis. The null hypothesis, which is denoted as $H_0$, is the statement of
equality indicating that no relationship exists between the variables under study.
This statement is tested for the purpose of being accepted or rejected. Examples of
a null hypothesis are given below.

Example 1 The Mathematics ability test scores of the control group do not differ
from those of the experimental group.
Example 2 The job performance of a group of employees working in a class A
hotel is independent of their working conditions.
Example 3 The scholastic competition among freshman students has no
relationship with their academic achievement.
Example 4 There is no difference in the college entrance examination scores
obtained by the students in the public and private schools.

The alternative hypothesis, which is denoted as $H_a$, is also termed the
research hypothesis. It is a statement of the expectation derived from the theory
under study. When it specifies only the existence of a difference, it is termed a
non-directional alternative hypothesis. Examples of a non-directional
alternative hypothesis are given below.

Example 5 The Mathematics ability test scores of the control group differ from
those of the experimental group.
Example 6 The job performance of a group of employees working in a class A
hotel is related to their working conditions.
Example 7 The scholastic competition among freshman students has a
relationship with their academic achievement.
Example 8 There is a difference in the college entrance examination scores
obtained by the students in the public and private schools.

The alternative hypothesis can also be a predictive hypothesis which specifies that
one group performs better than the other, and is therefore termed a directional
alternative hypothesis.

Example 9 The Mathematics ability test scores of the control group are lower than
those of the experimental group.
Example 10 High scores on the mental ability test correspond to high scores on the
self-concept test.
Example 11 Exposure to time pressure has a negative effect on students'
reading comprehension skills.
Example 12 The brand of cellular phone used by college students in XYZ University
has a positive effect on developing one's self-image.

Directional and Non-directional Tests of Hypothesis


The non-directional test of hypothesis is also referred to as the two-tailed
test. It makes use of the two opposite sides or tails of the statistical model
or distribution. This indicates that no assertion is made as to whether the
difference falls within the positive or negative end of the distribution. The
illustration of this test at the 5% level of significance is presented in Figure 9.1.

The directional test of hypothesis is also referred to as the one-tailed
test. It makes use of only one side or tail of the statistical model or distribution,
and can be a left-tailed or a right-tailed test. This indicates that an assertion is
made as to whether the difference falls within the positive end of the distribution
(right-tailed test) or the negative end (left-tailed test). The illustration of the
right-tailed test at the 5% level of significance is presented in Figure 9.2
and the left-tailed test at the same 5% level of significance is presented in Figure
9.3.

Critical Region and Critical Value

The critical region is a set of values of the test statistic (computed from the
gathered data set) that is chosen before the experiment to define the conditions
under which the null hypothesis will be rejected.
The critical value or values separate the critical region from the values of the
test statistic that would not lead to the rejection of the null hypothesis. The
critical values depend on the nature of the null hypothesis, the relevant sampling
distribution and the level of significance.
In two-tailed tests, the level of significance is divided equally between the two
tails that constitute the critical region. For example, in a two-tailed test with a
significance level of $\alpha = 5\%$, there is an area of 2.5% in each of the two
tails as shown in Figure 9.1. On the other hand, for one-tailed tests, the whole
level of significance constitutes the critical region, which can be found on either
the left or right tail of the distribution.
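
The way the level of significance is split between tails can be made concrete for the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a printed Z-table; the helper name `z_critical` is illustrative and not part of the text.

```python
from statistics import NormalDist


def z_critical(alpha, tails=2):
    """Positive critical value of the standard normal distribution.

    For a two-tailed test the area alpha is split equally, alpha/2 per tail;
    for a one-tailed test the whole area alpha sits in a single tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(z_critical(0.05, tails=2), 3))  # 1.96
print(round(z_critical(0.05, tails=1), 3))  # 1.645
```

For a two-tailed test the critical region is everything below the negative of this value or above the positive value; for a left-tailed test only the negative value is used, and for a right-tailed test only the positive one.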

Type I and Type II Errors

In testing the null hypothesis, our conclusion is either to reject or to accept
it. Correct decisions happen when we reject a null hypothesis that is false or
accept a null hypothesis that is true. Otherwise, the decisions are wrong: that is,
when we reject a true null hypothesis or accept a false null hypothesis. These two
ways of making a wrong decision give two different types of error in statistical
decision making. These errors are not miscalculations or procedural missteps. They
are actual errors that can occur when a rare event happens by chance.
The first type is the type I error. This is the error of rejecting the null
hypothesis when it is true. The probability of committing a type I error is also
referred to as the significance level and is denoted by the Greek symbol alpha
($\alpha$). The common values for $\alpha$ are 1%, 5% and 10%.
The second type is the type II error. This is the error of failing to reject
the null hypothesis when it is false. The probability of committing a type II error
is denoted by the Greek symbol beta ($\beta$).

Confidence Interval

In the previous section, we discussed the chance of committing a type I error,
which is denoted as $\alpha$. This is also referred to as the level of significance.
A confidence level is denoted as $1 - \alpha$, which represents the chance of
accepting the null hypothesis when in fact it is true. This is usually attached to
the notion of interval estimation, in which we attach a degree of certainty that the
parameter we are estimating lies within the interval between the lower and upper
bounds. The common values used for the confidence level are 90%, 95% and 99%.
A confidence interval is constructed when we attach a confidence level to
the interval estimate of a particular parameter. A general formula for constructing a
$100 \times (1 - \alpha)\%$ confidence interval for the parameter being estimated is
given by:

Estimator $\pm$ S.E.(estimator) $\times$ critical value (2)

where the critical value is the one we obtain from the tables, which depends on the
nature of the null hypothesis, the relevant sampling distribution and the level of
significance. For the case of confidence intervals, we only consider the critical
value whose level of significance is $\alpha/2$.

Steps in Performing Hypothesis Testing

The following are the steps in performing hypothesis testing:


1. State the null and the alternative hypothesis of the given problem.

2. Determine the level of significance and the direction of the test. The
direction is based on whether the alternative hypothesis is stated as a
left-tailed, right-tailed or two-tailed test.

3. Determine the appropriate statistical test based on the level of measurement
of the gathered data.

4. Write the decision rule expressing how we accept or reject the null
hypothesis.

5. Compute the test statistic and compare it with the critical value. The test
statistic determines whether the null hypothesis will be rejected or
accepted.

6. State the decision based on the computed value when compared to
the critical value.

7. Write the conclusion for the given problem.


Testing Hypothesis for the Mean for Single Sample Case.

By following the procedure in testing the hypothesis for only a single
mean, we have a hypothesized mean ($\mu_0$). Symbolically, the null hypothesis is
written as:

$H_0: \mu = \mu_0$ (3)

stating that the mean is equal to the hypothesized mean. On the other hand,
the alternative hypothesis is written symbolically as one of:

$H_a: \mu \neq \mu_0$ (4)

$H_a: \mu > \mu_0$ (5)

$H_a: \mu < \mu_0$ (6)

The expression in (4) is used when the alternative hypothesis is
non-directional and hence calls for a two-tailed test. The remaining expressions (5)
and (6) are used when the alternative hypothesis is directional and hence
calls for a one-tailed test, which is a right-tailed or a left-tailed test respectively.

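The decision rule is stated as follows: Reject the null hypothesis if the
absolute value of the test statistic exceeds the critical value (and, for a
one-tailed test, the test statistic falls in the direction stated by the alternative
hypothesis). Otherwise, accept the null hypothesis.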

Population Variance is known

In order to draw inference on a mean in the one-population case, assuming that
the entries are normally distributed and the variance is known, we use the Z-test.
The Z-statistic, $Z_c$, is the test statistic used to decide whether to reject the
null hypothesis in favor of the alternative hypothesis. It is computed as follows:

$Z_c = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (7)

where $\bar{x}$ is the mean computed from the gathered data, $\mu_0$ is the
hypothesized mean, $\sigma$ is the population standard deviation, which is known or
given, and $n$ is the sample size.

The critical value is obtained using the Z-tabular value located in Appendix 2.
For a two-sided test, we consider the value of $1 - \alpha/2$, and the critical value
is symbolically written as $Z_{\alpha/2}$. Otherwise, for a one-sided test, we
consider the value of $1 - \alpha$ and the critical value is symbolically represented
as $Z_{\alpha}$.

Example 1 A random sample of 100 students enrolled in a Statistics course under
Professor XYZ shows that the average grade in the midterm
examination is 85%. Professor XYZ claims that the average grade of
the students in the midterm examination is at least 80% with a
standard deviation of 16%. Is there evidence to say that the claim of
Professor XYZ is correct at the 5% level of significance?

Solution:

We first state symbolically the null hypothesis as $H_0: \mu = 80\%$,
which means that the average grade of Professor XYZ's students
is 80%, and the alternative hypothesis as $H_a: \mu > 80\%$,
which means that the average grade of Professor XYZ's students is
greater than 80%. Since the last statement of the given problem asserts
that the difference falls on the positive end of the distribution, we consider
this a one-tailed (right-tailed) test. Thus, our decision rule is to reject the
null hypothesis if $Z_c > Z_{\alpha}$. Otherwise, accept the null hypothesis.

The test statistic together with the Z-tabular value is computed
as follows:

$Z_c = \dfrac{0.85 - 0.80}{0.16/\sqrt{100}} = \dfrac{0.05}{0.016} = 3.125$

So that $Z_c = 3.125$ against the Z-tabular value of $Z_{0.05} = 1.645$.
Our decision based on this computation is to reject the null hypothesis
in favor of the alternative hypothesis. Thus, we conclude that the claim
of Professor XYZ is true since the average grade of his students in
Statistics is greater than 80% at the 5% level of significance.
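
The arithmetic of this example can be reproduced with a few lines of code. This is a minimal sketch of formula (7); the function name `z_statistic` is illustrative, and the tabular value is taken from `statistics.NormalDist` (Python 3.8 or later) instead of Appendix 2.

```python
import math
from statistics import NormalDist


def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """One-sample Z statistic: (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sigma / math.sqrt(n))


# Figures from the example: x_bar = 0.85, mu_0 = 0.80, sigma = 0.16, n = 100.
z_c = z_statistic(0.85, 0.80, 0.16, 100)

# Right-tailed test at alpha = 0.05: the whole 5% lies in the upper tail.
z_tab = NormalDist().inv_cdf(1 - 0.05)

reject_null = z_c > z_tab
print(round(z_c, 3), round(z_tab, 3), reject_null)  # 3.125 1.645 True
```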

Example 2 A random sample of 100 recorded deaths in the U.S. during the past year
showed an average life span of 71.8 years. Assuming a population standard deviation
of 8.9 years, does this seem to indicate that the mean life span today is greater
than 70 years? Use a 0.05 level of significance.
1.) $H_0: \mu = 70$
$H_a: \mu > 70$
2.) Z-test (right-tailed)
3.) $\alpha = 0.05$
$1 - 0.05 = 0.95$, so $Z_{tab} = 1.645$
4.) Reject $H_0$ if $Z_c > Z_{tab}$
5.) $Z_c = \dfrac{(71.8 - 70)\sqrt{100}}{8.9} = 2.02$
$P(Z > 2.02) = 1 - P(Z < 2.02)$
= 1 - 0.9783
= 0.0217
**Note: The P-value is the lowest level of significance at which the observed
value of the test statistic is significant.
6.) $Z_c > Z_{tab}$; reject $H_0$
7.) The mean life span today is greater than 70 years.
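
The P-value in step 5 can also be obtained without a table. The sketch below assumes the standard normal distribution from `statistics.NormalDist` (Python 3.8 or later); because it keeps the unrounded statistic instead of 2.02, the resulting P-value differs slightly in the fourth decimal place from the table-based 0.0217.

```python
import math
from statistics import NormalDist

# Figures from the example: x_bar = 71.8, mu_0 = 70, sigma = 8.9, n = 100.
z_c = (71.8 - 70) * math.sqrt(100) / 8.9

# Right-tailed test: the P-value is the area to the right of the statistic.
p_value = 1 - NormalDist().cdf(z_c)

print(round(z_c, 2))  # 2.02
print(p_value < 0.05)  # True
```

Comparing the P-value directly with the level of significance leads to the same decision as comparing the statistic with the tabular value.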
Example 3 A manufacturer of sports equipment has developed a new synthetic
fishing line that he claims has a mean breaking strength of 8 kg with $\sigma = 0.5$
kg. Test the hypothesis that $\mu = 8$ kg against $\mu \neq 8$ kg if a random sample
of 50 lines is tested and found to have a mean breaking strength of 7.8 kg. Use a
0.01 level of significance.
1.) $H_0: \mu = 8$
$H_a: \mu \neq 8$
2.) Z-test (two-tailed)
3.) $\alpha = 0.01$, $\alpha/2 = 0.005$
$Z_{tab1} = -2.575$
$Z_{tab2} = 2.575$
4.) Reject $H_0$ if $Z_c < Z_{tab1}$
Reject $H_0$ if $Z_c > Z_{tab2}$
5.) $Z_c = \dfrac{(7.8 - 8)\sqrt{50}}{0.5} = -2.828$
P-value $= 2P(Z < -2.828) = 2(0.0023)$
= 0.0046
6.) $Z_c < Z_{tab1}$; reject $H_0$
7.) The mean breaking strength is not equal to 8 kg.

TWO SAMPLE MEAN TEST


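$Z_c = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

Where:
$\bar{X}_1, \bar{X}_2$ = sample means of samples 1 and 2
$\sigma_1^2, \sigma_2^2$ = variances of populations 1 and 2
$n_1, n_2$ = sizes of samples 1 and 2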
Example 1 An admission test was administered to incoming freshmen in 2 colleges.
Two independent samples of 150 students each are randomly selected and the mean
scores of the given samples are $\bar{X}_1 = 88$ and $\bar{X}_2 = 85$. Assume that
the variances of the test scores are 40 and 35 respectively. Is the difference
between the mean scores significant or can it be attributed to chance? Use a 0.01
level of significance.
1.) $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
2.) Z-test (two-tailed)
3.) $\alpha = 0.01$, $\alpha/2 = 0.005$
$Z_{tab1} = -2.575$
$Z_{tab2} = 2.575$
4.) Reject $H_0$ if $Z_c < Z_{tab1}$
Reject $H_0$ if $Z_c > Z_{tab2}$
5.) $Z_c = \dfrac{88 - 85}{\sqrt{\dfrac{40}{150} + \dfrac{35}{150}}} = 4.2426$
6.) $Z_c > Z_{tab2}$; reject $H_0$
7.) There is a significant difference between the mean scores, and it cannot be
attributed to chance.
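
The two-sample statistic is easy to verify numerically. The following sketch evaluates the two-sample Z formula on the figures of this example; the function name `two_sample_z` is illustrative, and the critical value comes from `statistics.NormalDist` (Python 3.8 or later) instead of the table.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for the difference of two means with known variances."""
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)


# Figures from the example: means 88 and 85, variances 40 and 35, n = 150 each.
z_c = two_sample_z(88, 85, 40, 35, 150, 150)

# Two-tailed test at alpha = 0.01: 0.005 in each tail.
z_tab = NormalDist().inv_cdf(1 - 0.01 / 2)

print(round(z_c, 4))     # 4.2426
print(abs(z_c) > z_tab)  # True
```

Since the statistic lies beyond the upper critical value, it falls in the critical region of the two-tailed test.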

Population Variance is Unknown

In order to draw inference on a mean in the one-population case, assuming that
the entries are normally distributed but the variance is unknown, we use the t-test.
The test statistic used to decide whether to reject the null hypothesis in favor of
the alternative hypothesis is the t-statistic, $t_c$, which is computed as follows:

$t_c = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

where $\bar{x}$ is the mean computed from the gathered data, $\mu_0$ is the
hypothesized mean, $s$ is the sample standard deviation, which is computed from the
data or given, and $n$ is the sample size.

The critical value is obtained using the t-tabular value located in Appendix 5.
For a two-sided test, we look in the column of $\alpha/2$ for the value at df, which
refers to the degrees of freedom $n - 1$; this is symbolically written as
$t_{\alpha/2}(n-1)$. Otherwise, for a one-sided test, we look in the column of
$\alpha$ for the value at df; this is symbolically written as $t_{\alpha}(n-1)$.

Example 1 A random sample of 100 students enrolled in a Statistics
course under Professor XYZ shows that the average grade in the
midterm examination is 85% with a computed standard deviation
of 25%. Professor XYZ claims that the average grade of the
students in the midterm examination is at least 80%. Test the
claim of the professor at the 5% level of significance.

Solution:

We first state symbolically the null hypothesis as $H_0: \mu = 80\%$,
which means that the average grade of Professor XYZ's
students is 80%, and the alternative hypothesis as $H_a: \mu \neq 80\%$,
which means that the average grade of Professor XYZ's students
is not 80%. Since the last statement of the given problem does not
assert whether the difference falls within the positive or negative end, we
consider this a non-directional test. Thus, our decision rule is
to reject the null hypothesis if $|t_c| > t_{\alpha/2}(n-1)$. Otherwise, accept
the null hypothesis.

The test statistic together with the t-tabular value is
computed as follows:

$t_c = \dfrac{0.85 - 0.80}{0.25/\sqrt{100}} = \dfrac{0.05}{0.025} = 2.000$

So that $t_c = 2.000$ against the t-tabular value of $t_{0.025}(99) = 1.960$.
Notice that the values of n in the t-table go only up to
30 and are then followed by INF (infinity), which is used for n greater
than 30.

Our decision based on this computation is to reject the
null hypothesis in favor of the alternative hypothesis. Thus, we
conclude that the claim of Professor XYZ is not true since the
average grade obtained by his students in Statistics is not 80%.

A $100 \times (1 - \alpha)\%$ confidence interval is constructed
whenever the null hypothesis of the two-tailed test is rejected.
Otherwise, this confidence interval will not be constructed.

In order to determine the possible values within which the true
average grade will lie, we construct a 95% confidence
interval. Using formula (2), the estimator for the mean
is the average grade of the students, 85%, the tabular
value is 1.960, and the standard error of the estimate is given by
$\dfrac{0.25}{\sqrt{100}} = 0.025$, which is equivalent to 2.5%.

Using the results of this example, the resulting
confidence interval is given as $85\% \pm (2.5\%)(1.96) = 85\% \pm 4.9\%$.
That is, with an attached 95% confidence coefficient, the
true average grade obtained by Professor XYZ's students lies
between 80.1% and 89.9%, which is entirely above the
hypothesized value of 80%.
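
Formula (2) translates directly into a small helper. The sketch below works in percentage points and uses the tabular value 1.960 quoted in the example; the function name `confidence_interval` is illustrative.

```python
import math


def confidence_interval(estimate, std_error, critical_value):
    """Estimator +/- S.E.(estimator) x critical value, as in formula (2)."""
    margin = std_error * critical_value
    return estimate - margin, estimate + margin


# Figures from the example: mean 85, s = 25, n = 100, tabular value 1.960.
se = 25 / math.sqrt(100)  # standard error of 2.5 percentage points
lower, upper = confidence_interval(85, se, 1.960)

print(round(lower, 1), round(upper, 1))  # 80.1 89.9
```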

Example 2 A random sample of 25 female high school students shows
that their average body mass index (BMI) is about 18 points with
a standard deviation of 4.5 points. Test the hypothesis that the
average BMI of female high school students is lower than 19
points at the 5% level of significance.

Solution:

The null hypothesis is stated as $H_0: \mu = 19$, against the
alternative hypothesis $H_a: \mu < 19$. The last statement of the
given problem asserts that the difference falls within the negative end of the
distribution, so this is considered a left-tailed test. Thus,
our decision rule is to reject the null hypothesis if
$t_c < -t_{\alpha}(n-1)$. The t-statistic together with the t-tabular
value is computed as follows:

$t_c = \dfrac{18 - 19}{4.5/\sqrt{25}} = \dfrac{-1}{0.9} = -1.111$ versus
$t_{0.05}(25 - 1) = t_{0.05}(24) = 1.711$

Since $-1.111$ is not less than $-1.711$, our decision based on this
computation is to accept the null hypothesis. Thus, we conclude that
there is not enough evidence that the average BMI of female
high school students is lower than 19 points.
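
The left-tailed comparison in this example can be sketched as follows. The Python standard library has no Student's t distribution, so the tabular value 1.711 for $\alpha = 0.05$ and 24 degrees of freedom is entered by hand from the example; the function name `t_statistic` is illustrative.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, s, n):
    """One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (s / math.sqrt(n))


# Figures from the example: x_bar = 18, mu_0 = 19, s = 4.5, n = 25.
t_c = t_statistic(18, 19, 4.5, 25)

# Tabular value quoted in the example for alpha = 0.05 and df = 24.
t_tab = 1.711

# Left-tailed test: reject only when the statistic falls below -t_tab.
reject_null = t_c < -t_tab
print(round(t_c, 3), reject_null)  # -1.111 False
```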

Example 3 The manager of a car rental agency claims that the average mileage of
cars rented is less than 8000. A sample of 5 automobiles has an average mileage
of 7723, with a standard deviation of 500 miles. At $\alpha = 0.01$, is there enough
evidence to support the manager's claim?
1.) $H_0: \mu = 8000$
$H_a: \mu < 8000$
2.) t-test (left-tailed)
3.) $\alpha = 0.01$
df = n - 1 = 5 - 1 = 4
$t_{tab} = -3.747$
4.) Reject $H_0$ if $t_c < t_{tab}$
5.) $t_c = \dfrac{(7723 - 8000)\sqrt{5}}{500} = -1.239$
6.) Since $t_c > t_{tab}$; accept $H_0$
7.) There is not enough evidence to support the manager's claim that the average
mileage is less than 8000 miles.

In testing two small samples:


$t_c = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left(\dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2}\right)\left(\dfrac{n_1 + n_2}{n_1 n_2}\right)}}$

with $n_1 + n_2 - 2$ degrees of freedom.

Example 4 Two samples are randomly selected from two groups of students
who have been taught using different teaching methods. An examination is
given and the results are shown below.
Group 1 Group 2
$n_1 = 8$ $n_2 = 10$
$\bar{X}_1 = 85$ $\bar{X}_2 = 87$
$S_1^2 = 46$ $S_2^2 = 66$
Using a 0.05 level of significance, can we conclude that the two different
teaching methods are equally effective?

1.) $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
2.) t-test (two-tailed)
3.) $\alpha = 0.05$
df = $n_1 + n_2 - 2$ = 8 + 10 - 2 = 16
$t_{tab1} = -2.120$
$t_{tab2} = 2.120$
4.) Reject $H_0$ if $t_c < t_{tab1}$
Reject $H_0$ if $t_c > t_{tab2}$
5.) $t_c = \dfrac{85 - 87}{\sqrt{\left(\dfrac{8(46) + 10(66)}{8 + 10 - 2}\right)\left(\dfrac{8 + 10}{8(10)}\right)}} = -0.526$
6.) Since $t_{tab1} < t_c < t_{tab2}$; accept $H_0$
7.) The two teaching methods are equally effective.
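
The pooled statistic can be evaluated directly from the summary data listed in this example. The sketch below follows the formula as it is written in this section, which weights each sample variance by its sample size $n_i$; other texts weight by $n_i - 1$, so this weighting is an assumption carried over from the text. The tabular value 2.120 for 16 degrees of freedom is entered by hand from the example, and the function name `pooled_t` is illustrative.

```python
import math


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Two-sample t statistic with a pooled variance, as written in the text."""
    pooled = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled * (n1 + n2) / (n1 * n2))


# Summary data from the example: n = 8 and 10, means 85 and 87,
# variances 46 and 66.
t_c = pooled_t(85, 87, 46, 66, 8, 10)

# Tabular value quoted in the example for alpha = 0.05 (two-tailed), df = 16.
t_tab = 2.120

reject_null = abs(t_c) > t_tab
print(round(t_c, 3))
print(reject_null)  # False
```

The statistic falls between the two tabular values, so the null hypothesis of equal means is not rejected.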

EXERCISES

1. Suppose that an allergist wishes to test the hypothesis that at least 30% of the
public is allergic to some cheese products. Explain how the allergist could commit
(a) a type I error;
(b) a type II error.
Answer: ((a) Conclude that fewer than 30% of the public are allergic to some cheese
products when, in fact, 30% or more are allergic.
(b) Conclude that at least 30% of the public are allergic to some cheese
products when, in fact, fewer than 30% are allergic.)
2. A sociologist is concerned about the effectiveness of a training course designed
to get more drivers to use seat belts in automobiles.
(a) What hypothesis is she testing if she commits a type I error by erroneously
concluding that the training course is ineffective?
(b) What hypothesis is she testing if she commits a type II error by erroneously
concluding that the training course is effective?
Answer: ((a) The training course is effective.
(b) The training course is effective.)
3. A large manufacturing firm is being charged with discrimination in its hiring
practices.
(a) What hypothesis is being tested if a jury commits a type I error by finding the
firm guilty?
(b) What hypothesis is being tested if a jury commits a type II error by finding the
firm guilty?
Answer: ((a) The firm is not guilty.
(b) The firm is guilty.)

4. The proportion of adults living in a small town who are college graduates is
estimated to be p = 0.6.
To test this hypothesis, a random sample of 15 adults is selected. If the number of
college graduates in our sample is anywhere from 6 to 12, we shall not reject the
null hypothesis that p = 0.6; otherwise, we shall conclude that $p \neq 0.6$.
(a) Evaluate $\alpha$ assuming that p = 0.6. Use the binomial distribution.
(b) Evaluate $\beta$ for the alternatives p = 0.5 and p = 0.7.
(c) Is this a good test procedure?
Answer: ((a) $\alpha = P(X \le 5 \mid p = 0.6) + P(X \ge 13 \mid p = 0.6) = 0.0338 + (1 - 0.9729) = 0.0609$.
(b) $\beta = P(6 \le X \le 12 \mid p = 0.5) = 0.9963 - 0.1509 = 0.8454$.
$\beta = P(6 \le X \le 12 \mid p = 0.7) = 0.8732 - 0.0037 = 0.8695$.
(c) This test procedure is not good for detecting differences of 0.1 in p.)
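
The error probabilities in Exercise 4 can be computed exactly instead of being read from a binomial table. The sketch below uses `math.comb` (Python 3.8 or later); the helper names are illustrative. The results should agree with the answer above up to table rounding.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial random variable with n trials, success prob p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_between(low, high, n, p):
    """P(low <= X <= high) for the same binomial distribution."""
    return binom_cdf(high, n, p) - binom_cdf(low - 1, n, p)


n = 15
# Type I error: landing outside 6..12 although p = 0.6 is true.
alpha = 1 - prob_between(6, 12, n, 0.6)
# Type II error: landing inside 6..12 although p is really 0.5 or 0.7.
beta_at_05 = prob_between(6, 12, n, 0.5)
beta_at_07 = prob_between(6, 12, n, 0.7)

print(round(alpha, 4), round(beta_at_05, 4), round(beta_at_07, 4))
```

The large values of $\beta$ show why the procedure is poor at detecting a shift of 0.1 in p with only 15 observations.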

5. Repeat Exercise 4 when 200 adults are selected and the fail-to-reject region is
defined to be
110 < x < 130, where x is the number of college graduates in our sample. Use the
normal approximation.
Answer: ((a) $\alpha = P(X < 110 \mid p = 0.6) + P(X > 130 \mid p = 0.6) = P(Z < -1.52) + P(Z > 1.52) = 2(0.0643) = 0.1286$.
(b) $\beta = P(110 < X < 130 \mid p = 0.5) = P(1.34 < Z < 4.31) = 0.0901$.
$\beta = P(110 < X < 130 \mid p = 0.7) = P(-4.71 < Z < -1.47) = 0.0708$.
(c) The probability of a Type I error is somewhat high for this procedure,
although Type II errors are reduced dramatically.)

6. A fabric manufacturer believes that the proportion of orders for raw material
arriving late is p = 0.6.
If a random sample of 10 orders shows that 3 or fewer arrived late, the hypothesis
that p = 0.6 should be rejected in favor of the alternative p < 0.6. Use the binomial
distribution.
(a) Find the probability of committing a type I error if the true proportion is p = 0.6.
(b) Find the probability of committing a type II error for the alternatives p = 0.3,
p = 0.4, and p = 0.5.
Answer: ((a) $\alpha = P(X \le 3 \mid p = 0.6) = 0.0548$.
(b) $\beta = P(X > 3 \mid p = 0.3) = 1 - 0.6496 = 0.3504$.
$\beta = P(X > 3 \mid p = 0.4) = 1 - 0.3823 = 0.6177$.
$\beta = P(X > 3 \mid p = 0.5) = 1 - 0.1719 = 0.8281$.)

7. Repeat Exercise 6 when 50 orders are selected and the critical region is
defined to be $x \le 24$, where x is the number of orders in our sample that arrived
late. Use the normal approximation.
Answer: ((a) $\alpha = P(X \le 24 \mid p = 0.6) = P(Z < -1.59) = 0.0559$.
(b) $\beta = P(X > 24 \mid p = 0.3) = P(Z > 2.93) = 1 - 0.9983 = 0.0017$.
$\beta = P(X > 24 \mid p = 0.4) = P(Z > 1.30) = 1 - 0.9032 = 0.0968$.
$\beta = P(X > 24 \mid p = 0.5) = P(Z > -0.14) = 1 - 0.4443 = 0.5557$.)
8. An electrical firm manufactures light bulbs that have a lifetime that is
approximately normally distributed with a mean of 800 hours and a standard
deviation of 40 hours. Test the hypothesis that $\mu = 800$ hours against the
alternative $\mu \neq 800$ hours if a random sample of 30 bulbs has an average life of
788 hours. Use a P-value in your answers.
Answer: (The hypotheses are
$H_0: \mu = 800$,
$H_1: \mu \neq 800$.
Now, $z = \dfrac{788 - 800}{40/\sqrt{30}} = -1.64$, and P-value $= 2P(Z < -1.64) = (2)(0.0505) = 0.1010$.
Hence, the mean is not significantly different from 800 for $\alpha < 0.101$.)
9. A random sample of 64 bags of white Cheddar popcorn weighed, on average,
5.23 ounces with a standard deviation of 0.24 ounces. Test the hypothesis that μ =
5.5 ounces against the alternative hypothesis, μ < 5.5 ounces at the 0.05 level of
significance.
Answer: (The hypotheses are
H0 : μ = 5.5,
H1 : μ < 5.5.
The White Cheddar Popcorn, on average, weighs less than 5.5 oz.)
10. In a research report by Richard H. Weindruch of the UCLA Medical School, it is
claimed that mice with an average life span of 32 months will live to be about 40
months old when 40% of the calories in their food are replaced by vitamins and
protein. Is there any reason to believe that μ < 40 if 64 mice that are placed on this
diet have an average life of 38 months with a standard deviation of 5.8 months?
Use a P-value in your conclusion.
Answer: (The hypotheses are
H0 : μ = 40 months,
H1 : μ < 40 months.
Decision: reject H0.)
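The answers to Exercises 9 and 10 state only the conclusions. A rough sketch of the computation behind them is given below, assuming that with n = 64 the sample standard deviation can stand in for σ so that a z statistic is appropriate; the helper name `z_stat` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_stat(xbar: float, mu0: float, s: float, n: int) -> float:
    """Large-sample z statistic for H0: mu = mu0."""
    return (xbar - mu0) / (s / sqrt(n))

std_normal = NormalDist()

# Exercise 9: left-tailed test at the 0.05 level.
z9 = z_stat(5.23, 5.5, 0.24, 64)
reject9 = z9 < std_normal.inv_cdf(0.05)   # critical value is about -1.645

# Exercise 10: left-tailed test reported through its P-value.
z10 = z_stat(38, 40, 5.8, 64)
p10 = std_normal.cdf(z10)

print(f"Exercise 9:  z = {z9:.2f}, reject H0: {reject9}")
print(f"Exercise 10: z = {z10:.2f}, P-value = {p10:.4f}")
```

In both cases the statistic falls far in the left tail, which is consistent with the stated decisions to reject H0.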
10.25 Test the hypothesis that the average content of containers of a particular
lubricant is 10 liters if the contents of a random sample of 10 containers are 10.2,
9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and 9.8 liters. Use a 0.01 level of
significance and assume that the distribution of contents is normal.
Answer: (The hypotheses are
H0 : μ = 10,
H1 : μ ≠ 10.
Decision: Fail to reject H0.)
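For Exercise 10.25 the population standard deviation is unknown and the sample is small, so a t statistic with n − 1 = 9 degrees of freedom is used. The sketch below computes the statistic from the listed contents; the critical value 3.250 is the usual tabulated value of t for a two-sided test at α = 0.01 with 9 degrees of freedom and is entered by hand as an assumption, not computed.

```python
from math import sqrt
from statistics import mean, stdev

contents = [10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8]
mu0 = 10
n = len(contents)

xbar = mean(contents)
s = stdev(contents)                 # sample standard deviation (divisor n - 1)
t = (xbar - mu0) / (s / sqrt(n))    # one-sample t statistic

t_crit = 3.250                      # assumed table value: t(0.005, 9 df)
reject = abs(t) > t_crit

print(f"mean = {xbar:.2f}, s = {s:.4f}, t = {t:.2f}, reject H0: {reject}")
```

Since |t| is well below the critical value, the computation agrees with the decision not to reject H0.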
10.26 According to a dietary study, a high sodium intake may be related to ulcers,
stomach cancer, and migraine headaches. The human requirement for salt is only
220 milligrams per day, which is surpassed in most single servings of ready-to-eat
cereals. If a random sample of 20 similar servings of a certain cereal has a mean
sodium content of 244 milligrams and a standard deviation of 24.5 milligrams, does
this suggest at the 0.05 level of significance that the average sodium content for a
single serving of such cereal is greater than 220 milligrams? Assume the
distribution of sodium contents to be normal.
Answer: (The hypotheses are
H0 : μ = 220 milligrams,
H1 : μ > 220 milligrams.
Decision: Reject H0 and claim μ > 220 milligrams.)
10.27 A study at the University of Colorado at Boulder shows that running increases
the percent resting metabolic rate (RMR) in older women. The average RMR of 30
elderly women runners was 34.0% higher than the average RMR of 30 sedentary
elderly women and the standard deviations were reported to be 10.5% and 10.2%,
respectively. Was there a significant increase in RMR of the women runners over the
sedentary women? Assume the populations to be approximately normally
distributed with equal variances. Use a P-value in your conclusions.
Answer: (The hypotheses are
H0 : μ1 = μ2,
H1 : μ1 > μ2.
Hence, the conclusion is that running increases the mean RMR in older
women.)
10.28 According to Chemical Engineering an important property of fiber is its water
absorbency. The average percent absorbency of 25 randomly selected pieces of
cotton fiber was found to be 20 with a standard deviation of 1.5. A random sample
of 25 pieces of acetate yielded an average percent of 12 with a standard deviation
of 1.25. Is there strong evidence that the population mean percent absorbency for
cotton fiber is significantly higher than the mean for acetate? Assume that the
percent absorbency is approximately normally distributed and that the population
variances in percent absorbency for the two fibers are the same. Use a significance
level of 0.05.
Answer: (The hypotheses are
H0 : μC = μA,
H1 : μC > μA.
The mean percent absorbency for the cotton fiber is significantly higher than the
mean percent absorbency for acetate.)
10.29 Past experience indicates that the time for high school seniors to complete a
standardized test is a normal random variable with a mean of 35 minutes. If a
random sample of 20 high school seniors took an average of 33.1 minutes to
complete this test with a standard deviation of 4.3 minutes, test the hypothesis at
the 0.05 level of significance that μ = 35 minutes against the alternative that μ <
35 minutes.
Answer: (The hypotheses are
H0 : μ = 35 minutes,
H1 : μ < 35 minutes.
Decision: Reject H0 and conclude that it takes less than 35 minutes, on the
average, to take the test.)
10.31 A manufacturer claims that the average tensile strength of thread A exceeds
the average tensile strength of thread B by at least 12 kilograms. To test his claim,
50 pieces of each type of thread are tested under similar conditions. Type A thread
had an average tensile strength of 86.7 kilograms with known standard deviation of
σA = 6.28 kilograms, while type B thread had an average tensile strength of 77.8
kilograms with known standard deviation of σB = 5.61 kilograms. Test the
manufacturer's claim at α = 0.05.
Answer: (The hypotheses are
H0 : μA − μB = 12 kilograms,
H1 : μA − μB > 12 kilograms.
The average tensile strength of thread A does not exceed the average tensile
strength of thread B by 12 kilograms.)

NONPARAMETRIC TESTS

Nonparametric tests are sometimes called distribution-free tests because
they are based on fewer assumptions (e.g., they do not assume that the
outcome is approximately normally distributed). Parametric tests involve
specific probability distributions (e.g., the normal distribution) and the tests
involve estimation of the key parameters of that distribution (e.g., the mean
or difference in means) from the sample data. The cost of fewer assumptions
is that nonparametric tests are generally less powerful than their parametric
counterparts (i.e., when the alternative is true, they may be less likely to
reject H0).

It can sometimes be difficult to assess whether a continuous outcome follows
a normal distribution and, thus, whether a parametric or nonparametric test
is appropriate. There are several statistical tests that can be used to assess
whether data are likely from a normal distribution. The most popular are the
Kolmogorov-Smirnov test, the Anderson-Darling test, and the Shapiro-Wilk
test. Each test is essentially a goodness of fit test and compares observed
data to quantiles of the normal (or other specified) distribution. The null
hypothesis for each test is H0: Data follow a normal distribution versus H1:
Data do not follow a normal distribution. If the test is statistically significant
(e.g., p<0.05), we conclude that the data do not follow a normal distribution,
and a nonparametric test is warranted. It should be noted that these tests for
normality can be subject to low power. Specifically, the tests may fail to
reject H0: Data follow a normal distribution when in fact the data do not
follow a normal distribution. Low power is a major issue when the sample
size is small, which unfortunately is often when we wish to employ these
tests. The most practical approach to assessing normality involves
investigating the distributional form of the outcome in the sample using a
histogram and augmenting that with data from other studies, if available,
that may indicate the likely distribution of the outcome in the population.

There are some situations when it is clear that the outcome does not follow a
normal distribution. These include situations:

when the outcome is an ordinal variable or a rank,

when there are definite outliers or

when the outcome has clear limits of detection.

Using an Ordinal Scale

Consider a clinical trial where study participants are asked to rate their
symptom severity following 6 weeks on the assigned treatment. Symptom
severity might be measured on a 5 point ordinal scale with response options:
Symptoms got much worse, slightly worse, no change, slightly improved, or
much improved. Suppose there are a total of n=20 participants in the trial,
randomized to an experimental treatment or placebo, and the outcome data
are distributed as shown in the figure below.

Distribution of Symptom Severity in Total Sample


The distribution of the outcome (symptom severity) does not appear to be
normal as more participants report improvement in symptoms as opposed to
worsening of symptoms.

When the Outcome is a Rank

In some studies, the outcome is a rank. For example, in obstetrical studies an
APGAR score is often used to assess the health of a newborn. The score,
which ranges from 0-10, is the sum of five component scores based on the
infant's condition at birth. APGAR scores generally do not follow a normal
distribution, since most newborns have scores of 7 or higher (normal range).

When There Are Outliers

In some studies, the outcome is continuous but subject to outliers or extreme
values. For example, days in the hospital following a particular surgical
procedure is an outcome that is often subject to outliers. Suppose in an
observational study investigators wish to assess whether there is a
difference in the days patients spend in the hospital following liver transplant
in for-profit versus nonprofit hospitals. Suppose we measure days in the
hospital following transplant in n=100 participants, 50 from for-profit and 50
from non-profit hospitals. The number of days in the hospital are summarized
by the box-whisker plot below.
Distribution of Days in the Hospital Following Transplant

Note that 75% of the participants stay at most 16 days in the hospital
following transplant, while at least 1 stays 35 days, which would be
considered an outlier. Recall from page 8 in the module on Summarizing Data
that we used Q1-1.5(Q3-Q1) as a lower limit and Q3+1.5(Q3-Q1) as an upper
limit to detect outliers. In the box-whisker plot above, Q1=12 and
Q3=16, thus outliers are values below 12-1.5(16-12) = 6 or above
16+1.5(16-12) = 22.
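The outlier limits above follow from a simple rule based on the quartiles. A minimal sketch is shown here; the function name `outlier_limits` and the sample values passed to `is_outlier` are illustrative only.

```python
def outlier_limits(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) limits Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = outlier_limits(12, 16)

def is_outlier(days: float) -> bool:
    """True when a value falls outside the limits computed above."""
    return days < lower or days > upper

print(lower, upper)       # limits for Q1 = 12, Q3 = 16
print(is_outlier(35))     # the 35-day stay mentioned in the text
print(is_outlier(16))     # a stay at the third quartile
```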

Limits of Detection

In some studies, the outcome is a continuous variable that is measured with
some imprecision (e.g., with clear limits of detection). For example, some
instruments or assays cannot measure presence of specific quantities above
or below certain limits. HIV viral load is a measure of the amount of virus in
the body and is measured as the amount of virus per a certain volume of
blood. It can range from "not detected" or "below the limit of detection" to
hundreds of millions of copies. Thus, in a sample some participants may
have measures like 1,254,000 or 874,050 copies and others are measured as
"not detected." If a substantial number of participants have undetectable
levels, viral load is not normally distributed.

Hypothesis Testing with Nonparametric Tests

In nonparametric tests, the hypotheses are not about population parameters
(e.g., μ=50 or μ1=μ2). Instead, the null hypothesis is more general. For
example, when comparing two independent groups in terms of a continuous
outcome, the null hypothesis in a parametric test is H0: μ1=μ2. In a
nonparametric test the null hypothesis is that the two populations are equal;
often this is interpreted as the two populations being equal in terms of their
central tendency.

Advantages of Nonparametric Tests

Nonparametric tests have some distinct advantages. With outcomes such as
those described above, nonparametric tests may be the only way to analyze
these data. Outcomes that are ordinal, ranked, subject to outliers or
measured imprecisely are difficult to analyze with parametric methods
without making major assumptions about their distributions as well as
decisions about coding some values (e.g., "not detected"). As described
here, nonparametric tests can also be relatively simple to conduct.

Introduction to Nonparametric Testing

This module will describe some popular nonparametric tests for continuous
outcomes. Interested readers should see Conover for a more comprehensive
coverage of nonparametric tests.

Key Concept:

Parametric tests are generally more powerful and can test a wider range
of alternative hypotheses. It is worth repeating that if data are
approximately normally distributed then parametric tests (as in the
modules on hypothesis testing) are more appropriate. However, there are
situations in which assumptions for a parametric test are violated and a
nonparametric test is more appropriate.

The techniques described here apply to outcomes that are ordinal, ranked, or
continuous outcome variables that are not normally distributed. Recall that
continuous outcomes are quantitative measures based on a specific
measurement scale (e.g., weight in pounds, height in inches). Some
investigators make the distinction between continuous, interval and ordinal
scaled data. Interval data are like continuous data in that they are
measured on a constant scale (i.e., there exists the same difference between
adjacent scale scores across the entire spectrum of scores). Differences
between interval scores are interpretable, but ratios are not. Temperature in
Celsius or Fahrenheit is an example of an interval scale outcome. The
difference between 30° and 40° is the same as the difference between 70°
and 80°, yet 80° is not twice as warm as 40°. Ordinal outcomes can be less
specific as the ordered categories need not be equally spaced. Symptom
severity is an example of an ordinal outcome and it is not clear whether the
difference between much worse and slightly worse is the same as the
difference between no change and slightly improved. Some studies use
visual scales to assess participants' self-reported signs and symptoms. Pain
is often measured in this way, from 0 to 10 with 0 representing no pain and
10 representing agonizing pain. Participants are sometimes shown a visual
scale such as that shown in the upper portion of the figure below and asked
to choose the number that best represents their pain state. Sometimes pain
scales use visual anchors as shown in the lower portion of the figure below.

Visual Pain Scale


In the upper portion of the figure, certainly 10 is worse than 9, which is worse
than 8; however, the difference between adjacent scores may not
necessarily be the same. It is important to understand how outcomes are
measured to make appropriate inferences based on statistical analysis and,
in particular, not to overstate precision.

Assigning Ranks

The nonparametric procedures that we describe here follow the same
general procedure. The outcome variable (ordinal, interval or continuous) is
ranked from lowest to highest and the analysis focuses on the ranks as
opposed to the measured or raw values. For example, suppose we measure
self-reported pain using a visual analog scale with anchors at 0 (no pain) and
10 (agonizing pain) and record the following in a sample of n=6 participants:

7 5 9 3 0 2

The ranks, which are used to perform a nonparametric test, are assigned as
follows: First, the data are ordered from smallest to largest. The lowest value
is then assigned a rank of 1, the next lowest a rank of 2 and so on. The
largest value is assigned a rank of n (in this example, n=6). The observed
data and corresponding ranks are shown below:

Ordered Observed Data:   0   2   3   5   7   9

Ranks:                   1   2   3   4   5   6

A complicating issue that arises when assigning ranks occurs when there are
ties in the sample (i.e., the same values are measured in two or more
participants). For example, suppose that the following data are observed in
our sample of n=6:

Observed Data: 7 7 9 3 0 2

The 4th and 5th ordered values are both equal to 7. When assigning ranks, the
recommended procedure is to assign the mean rank of 4.5 to each (i.e. the
mean of 4 and 5), as follows:

Ordered Observed Data:   0   2   3   7     7     9

Ranks:                   1   2   3   4.5   4.5   6

Suppose that there are three values of 7. In this case, we assign a rank of 5
(the mean of 4, 5 and 6) to the 4th, 5th and 6th values, as follows:

Ordered Observed Data:   0   2   3   7   7   7

Ranks:                   1   2   3   5   5   5

Using this approach of assigning the mean rank when there are ties ensures
that the sum of the ranks is the same in each sample (for example,
1+2+3+4+5+6=21, 1+2+3+4.5+4.5+6=21 and 1+2+3+5+5+5=21).
Using this approach, the sum of the ranks will always equal n(n+1)/2. When
conducting nonparametric tests, it is useful to check the sum of the ranks
before proceeding with the analysis.
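
The ranking rule described above, including the mean-rank treatment of ties and the n(n+1)/2 check, can be sketched in Python as follows. The function name `mean_ranks` is illustrative; ranks are returned in the order in which the data were observed, not in sorted order.

```python
def mean_ranks(values: list[float]) -> list[float]:
    """Rank values from 1 (smallest) to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1      # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

for sample in ([7, 5, 9, 3, 0, 2], [7, 7, 9, 3, 0, 2], [7, 7, 7, 3, 0, 2]):
    r = mean_ranks(sample)
    n = len(sample)
    # The sum of the ranks should always equal n(n+1)/2.
    print(sample, r, sum(r) == n * (n + 1) / 2)
```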

To conduct nonparametric tests, we again follow the five-step approach
outlined in the modules on hypothesis testing.

1. Set up hypotheses and select the level of significance α. Analogous to
parametric testing, the research hypothesis can be one- or two-sided
(one- or two-tailed), depending on the research question of interest.

2. Select the appropriate test statistic. The test statistic is a single
number that summarizes the sample information. In nonparametric
tests, the observed data are converted into ranks and then the ranks are
summarized into a test statistic.

3. Set up decision rule. The decision rule is a statement that tells under
what circumstances to reject the null hypothesis. Note that in some
nonparametric tests we reject H0 if the test statistic is large, while in
others we reject H0 if the test statistic is small. We make the distinction
as we describe the different tests.

4. Compute the test statistic. Here we compute the test statistic by
summarizing the ranks into the test statistic identified in Step 2.

5. Conclusion. The final conclusion is made by comparing the test statistic
(which is a summary of the information observed in the sample) to the
decision rule. The final conclusion is either to reject the null
hypothesis (because it is very unlikely to observe the sample data if
the null hypothesis is true) or not to reject the null hypothesis (because
the sample data are not very unlikely if the null hypothesis is true).

What is Chi Square Test?

Any statistical test that uses the chi square distribution can be called a chi
square test. The chi-square test is a statistical test used to investigate
differences, and its statistic is denoted by χ². The chi-square test measures the
difference between a statistically generated expected result and an actual
result to see if there is a statistically significant difference between them. It
measures the goodness of fit between an expected and an actual result.

Chi Square Test Formula

The formula for Chi Square is defined as follows:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Where,

$\chi^2$ - Chi Square

O - Observed frequency in each category.

E - Expected frequency in corresponding category.


Chi Square Test Degrees of Freedom

Each chi-square distribution is identified by its degrees of freedom. Each type
of two way table has its own chi-square distribution, depending on the
number of rows and columns. A two way table with r rows and c columns uses a
chi-square distribution with (r - 1)(c - 1) degrees of freedom. For the chi
square difference test, the degrees of freedom equal the difference between
the degrees of freedom associated with the two models.

1. For one degree of freedom, the distribution looks like a hyperbola.

2. For more than two degrees of freedom, it looks like a mound that has a long
right tail. (With exactly two degrees of freedom the curve decreases steadily
from its maximum at zero.)

Chi Square Test of Independence

Chi square test is applied when we have two categorical variables from a
single population. It is used to determine whether there is a significant
association between the two variables. This test is applicable when the
observations are independent (random). The Chi-square test for
independence is also called a contingency table Chi-square test.

Chi Square Test of Independence Example

For a given population, we consider two attributes and ask whether they
depend on each other. Suppose we have a set of workers in a factory and we
classify them as smokers and non-smokers. The same workers are
classified again as 'men' and 'women'. Here, we may find that the proportion of
smokers is higher among men than among women. If so, we say that the attributes
'smoking' and 'sex' are dependent (associated).

Chi Square Goodness of Fit Test

This test is applicable when the observations are independent (random) and
the total frequency is large. It tests whether the observed frequencies in a set
of categories agree with the frequencies expected under a hypothesized
distribution. (The same statistic is used to test association of variables in
two-way tables, where the assumed model of independence is evaluated against
the observed data.) An attractive feature of the chi-square goodness of fit test
is that it can be applied to any univariate distribution for which you can
calculate the cumulative distribution function. The chi-square goodness-of-fit
test can also be applied to discrete distributions such as the binomial and the
Poisson.

The chi-square test statistic is of the form

$\chi^2 = \sum \frac{(\text{Observed value} - \text{Expected value})^2}{\text{Expected value}}$


Degrees of Freedom for the Chi-Square Test for Goodness of Fit

The number of degrees of freedom for the Chi-square test for goodness of fit
is the number of categories being compared minus one.

Degrees of freedom (df) = c - 1, where c is the number of categories


Chi Square Difference Test

The chi square difference test is very useful both for making simpler models
more complex and for making complex models simpler. A more accurate
comparison of two models can be obtained by performing a chi square
difference test, which involves:

1. Estimating the original model.

2. Estimating the revised model in which a new path has been added.

3. Calculating the difference between the two resulting chi square values.

The resulting chi square difference statistic also has a chi square distribution.
The degrees of freedom for the chi square difference test are equal to the
difference between the degrees of freedom associated with the two models.
When the chi square difference is statistically significant, the model with the
smaller chi-square is considered to fit the data better than the model with the
higher chi-square.

Chi Square Test of Homogeneity

The chi square test of homogeneity is used to test whether two or more
populations are homogeneous with respect to some characteristic.
In this test the categories are assumed to be mutually exclusive and
exhaustive. The test statistic for the chi square test of homogeneity is the
same as that for the chi square test of association:

$\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

Where df = (m - 1)(n - 1).


Chi Square Test of Association

The Chi-square test of association is equivalent to the Chi-square test of
independence and the Chi-square test of homogeneity. The Chi-square test of
association is used to determine whether there is an association between
two or more categorical variables. In the Chi-square goodness of fit test the
expected proportions are known a priori; for the Chi-square test of
association the expected proportions are not known a priori but must be
estimated from the sample data.

Chi Square Test for Trend

The chi-square test for trend tests for a linear trend between the rows and the
columns of the table. It only makes sense when the rows are arranged in a
natural order (such as by age or time), and are equally spaced. A large
chi-square statistic indicates that the observed frequencies in the table differ
markedly from the expected frequencies. When a chi-square is high, examine
the table to determine which cells are responsible. In the chi-squared test for
trend, we not only use the order of the categories, but also attach a numerical
value to each. The chi-squared for trend statistic is no larger than the
chi-squared for association statistic. The difference between the two chi-squared
statistics follows a Chi-squared distribution if the null hypothesis is true, with
degrees of freedom equal to the difference between the two degrees of
freedom.

One Sample Chi Square Test

The one-sample Chi-square test compares the distribution of cases across
the categories of a variable with a hypothesized distribution. The Chi-square
test used with one sample is described as a "goodness of fit" test. It can help
you decide whether a distribution of frequencies for a variable in a sample is
representative of, or "fits", a specified population distribution. The one
sample Chi-square test is used to test a hypothesis such as 'the suicide rate
varies significantly from month to month'. If the hypothesis is false, the suicide
rate will be the same for each of the twelve months. The one sample
Chi-square test can be used to compare observed suicide rates per month with
what would be expected if the rate were equal for all months.

Chi Square Test Interpretation

The chi-square test measures the difference between a statistically
generated expected result and an actual result to see if there is a statistically
significant difference between them. Once the Chi-square value and
the degrees of freedom are known, a standard table of Chi-square values can
be consulted to determine the corresponding p-value. The p-value indicates
the probability that a Chi-square value that large would have resulted from
chance alone.

Chi Square Test Assumptions

The chi-square test has some important assumptions.

For the chi-square test to be meaningful it is imperative that each
person, item or entity contributes to only one cell of the contingency
table.

Both independent and dependent variables are categorical with two or
more levels.

The data consist of frequencies, not scores.

Each randomly selected observation can be classified into only one
category for the independent variable and only one category for the
dependent variable.

Purpose of Chi Square Test


The chi-square test is one of the simplest and most widely used non-parametric
tests. It is the most commonly used method for comparing
frequencies or proportions. It is a statistical test used to compare observed
data with data that would be expected according to a given hypothesis. It is
very popularly known as a test of "goodness of fit" because it
enables us to ascertain how appropriately the theoretical distributions fit the
observed data.

Chi Square Test Table
Table for Chi square test is given below:

Chi Square Test Example


Chi-Square Test
1.) Goodness-of-fit test
Ho: fo = fe     where fo = observed frequency
Ha: fo ≠ fe           fe = expected frequency
df = c - 1

$\chi^2_c = \sum \frac{(f_o - f_e)^2}{f_e}$

Example:

The distributor of air conditioners in the city of Manila has divided the
area into four sub-areas. A prospective buyer of the distributorship was told
that the installations of the equipment are equally distributed. The
prospective buyer took a random sample of 40 installations performed during the
past years from the corporation file and found the following:

SUB-AREA                 A    B    C    D    TOTAL
NO. OF INSTALLATIONS     6   12   14    8      40

Based on this information, can we say that the units are equally distributed?
Use α = 0.05.

1.) Ho: fo = fe
Ha: fo ≠ fe

2.) Chi-square test

3.) α = 0.05
df = c - 1 = 4 - 1 = 3
$\chi^2_{tab} = 7.815$

4.) Reject Ho if $\chi^2_c > \chi^2_{tab}$

5.) fe = 40/4 = 10
$\chi^2_c = \frac{(6-10)^2 + (12-10)^2 + (14-10)^2 + (8-10)^2}{10} = \frac{40}{10} = 4$

6.) Since $\chi^2_c < \chi^2_{tab}$, do not reject Ho

7.) The data are consistent with the units being equally distributed.
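
The arithmetic in steps 5 and 6 of the air conditioner example can be checked with a short Python sketch. The function name `chi_square` is illustrative, and the critical value 7.815 is simply the tabulated value quoted in step 3 (α = 0.05, df = 3), entered by hand and not computed.

```python
def chi_square(observed: list[float], expected: list[float]) -> float:
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [6, 12, 14, 8]                       # installations in sub-areas A to D
fe = sum(observed) / len(observed)              # equal distribution: 40 / 4
expected = [fe] * len(observed)

stat = chi_square(observed, expected)
df = len(observed) - 1
critical = 7.815                                # table value for alpha = 0.05, df = 3
reject = stat > critical

print(f"chi-square = {stat:.2f}, df = {df}, reject Ho: {reject}")
```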

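2.) Test of Independence

$f_e = \frac{(\sum r)(\sum c)}{n}$
Where: $\sum r$ = row total
$\sum c$ = column total
n = total number of observations

df = (r-1)(c-1), where r = number of rows and c = number of columns
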
Example: A survey was conducted to determine whether gender and age are
related among stereo shop customers. A total of 200 respondents was taken
and the results are presented in the table.

Age                        Gender
               Male    fe    Female    fe    TOTAL
Under 30        60     77      50      33     110
30 and over     80     63      10      27      90
TOTAL          140             60             200

Conduct a test of whether gender and age of stereo shop customers are
independent at the 1% level of significance.

1.) Ho: Gender and age of stereo shop customers are independent
Ha: Gender and age of stereo shop customers are dependent

2.) Chi-square test

3.) α = 0.01
df = (2-1)(2-1) = 1
$\chi^2_{tab} = 6.635$

4.) Reject Ho if $\chi^2_c > \chi^2_{tab}$

5.) $\chi^2_c = \sum \frac{(f_o - f_e)^2}{f_e}$

$\chi^2_c = \frac{(60-77)^2}{77} + \frac{(80-63)^2}{63} + \frac{(50-33)^2}{33} + \frac{(10-27)^2}{27}$

$\chi^2_c = 27.80$

6.) Since $\chi^2_c > \chi^2_{tab}$, reject Ho

7.) Gender and age of stereo shop customers are dependent (related)
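
The expected frequencies and the test statistic for the stereo shop example can be reproduced with the sketch below. The function names are illustrative, and the critical value 6.635 is the tabulated value for a 1% level of significance with 1 degree of freedom, entered by hand. Note that rejecting Ho here means the data give evidence that gender and age are dependent.

```python
def expected_counts(table: list[list[float]]) -> list[list[float]]:
    """Expected frequency per cell: (row total)(column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table: list[list[float]]) -> tuple[float, int]:
    """Return the chi-square statistic and its degrees of freedom (r-1)(c-1)."""
    expected = expected_counts(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: under 30, 30 and over. Columns: male, female.
table = [[60, 50], [80, 10]]

expected = expected_counts(table)
stat, df = chi_square_independence(table)
critical = 6.635                      # table value for alpha = 0.01, df = 1
reject = stat > critical

print(expected)
print(f"chi-square = {stat:.2f}, df = {df}, reject Ho: {reject}")
```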

A certain school classified 1725 students according to intelligence and
family economic level. The results are as follows:

Economic                            Intelligence
Level           Dull     fe      Average    fe      Intelligent    fe      TOTAL
Rich             81    128.67      322    347.31       233       160.01     636
Middle class    141    151.94      457    410.11       153       188.95     751
Poor            127     68.38      163    184.58        48        85.04     338
TOTAL           349                942                 434                 1725

Using these results, can we conclude that intelligence is related to the
economic level? Use the 1% level of significance.

1.) Ho: Intelligence is not related to the economic level
Ha: Intelligence is related to the economic level

2.) Chi-square test

3.) α = 0.01
df = (3-1)(3-1) = 4
$\chi^2_{tab} = 13.277$

4.) Reject Ho if $\chi^2_c > \chi^2_{tab}$

5.) $\chi^2_c = \sum \frac{(f_o - f_e)^2}{f_e}$

$\chi^2_c = \frac{(81-128.67)^2}{128.67} + \frac{(322-347.31)^2}{347.31} + \frac{(233-160.01)^2}{160.01} + \frac{(141-151.94)^2}{151.94} + \frac{(457-410.11)^2}{410.11} + \frac{(153-188.95)^2}{188.95} + \frac{(127-68.38)^2}{68.38} + \frac{(163-184.58)^2}{184.58} + \frac{(48-85.04)^2}{85.04}$

$\chi^2_c \approx 134.69$

6.) Since $\chi^2_c > \chi^2_{tab}$, reject Ho

7.) Intelligence is related to the economic level
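
The 3 × 3 example can be checked in the same way. This sketch assumes the observed counts 233 (Rich, Intelligent) and 48 (Poor, Intelligent), which are the values consistent with the row totals 636, 751 and 338, the grand total 1725 and the expected frequencies listed in the table. With those counts the statistic comes out near 134.69. The function name is illustrative and the critical value 13.277 is the table value for α = 0.01 with 4 degrees of freedom, entered by hand.

```python
def chi_square_table(table: list[list[float]]) -> tuple[float, int, list[list[float]]]:
    """Chi-square test of independence: statistic, df, and expected frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Rows: rich, middle class, poor. Columns: dull, average, intelligent.
# The counts 233 and 48 are assumed as explained in the text above.
table = [
    [81, 322, 233],
    [141, 457, 153],
    [127, 163, 48],
]

stat, df, expected = chi_square_table(table)
critical = 13.277                     # table value for alpha = 0.01, df = 4
reject = stat > critical

for row in expected:
    print([round(fe, 2) for fe in row])
print(f"chi-square = {stat:.2f}, df = {df}, reject Ho: {reject}")
```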

Given below are some of the examples on chi square test.

Exercises
Question 1:
Find the chi square value for the following data.

Color                 Blue   Black   Brown   Yellow
Observed frequency      5      15      10      20
Expected frequency     10      20       5      30

Answer:

(For blue, Observed frequency - Expected frequency = 5-10 = -5

For black, Observed frequency - Expected frequency = 15-20 = -5

For brown, Observed frequency - Expected frequency = 10-5 = 5

For yellow, Observed frequency - Expected frequency = 20-30 = -10

χ² = 25/10 + 25/20 + 25/5 + 100/30 = 2.5 + 1.25 + 5 + 3.3333 = 12.0833)

Question 2:
Find the chi square value for the following data.

Color                 Blue   Black   Brown   Yellow
Observed frequency     10       5      25      35
Expected frequency     15      30      30      25

Answer:
(27.3333)

Question 3:
Find the chi square value for the following data.

Color                 Blue   Black   Brown   Yellow
Observed frequency     23      24      32      23
Expected frequency     12      32      25      21

Answer:
(14.2338)
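
Questions 1 to 3 all apply the same formula, so the answers can be checked with one small function. This is a sketch with an illustrative function name; it uses the observed and expected frequencies exactly as printed in the three questions. With the frequencies printed in Question 1, the sum works out to about 12.0833.

```python
def chi_square(observed: list[float], expected: list[float]) -> float:
    """Sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

questions = {
    1: ([5, 15, 10, 20], [10, 20, 5, 30]),
    2: ([10, 5, 25, 35], [15, 30, 30, 25]),
    3: ([23, 24, 32, 23], [12, 32, 25, 21]),
}

results = {q: chi_square(obs, exp) for q, (obs, exp) in questions.items()}

for q, value in results.items():
    print(f"Question {q}: chi-square = {value:.4f}")
```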

Question 4:

Determine whether gender and shoe size are dependent among the students
of section ChE 4102 and instructors from the Chemical Engineering Department of
Batangas State University. Use 0.01 for the level of significance.
Data
                     Shoe Size
Gender     below 8    8 and above    TOTAL
Male          2           13           15
Female       23           11           34
TOTAL        25           24           49

Answer:
(Gender and shoe size are dependent.)

Question 5:
49 samples are selected from the group of male and female students of section ChE
4102 from the Chemical Engineering Department of Batangas State University with their
instructors. Determine whether gender and height are independent among the students
and their instructors. Use the 0.01 level of significance. Data are given below:
                         HEIGHT
GENDER     below 160 cm    160 cm and above    TOTAL
Male             1                14             15
Female          16                18             34
TOTAL           17                32             49

Answer:

(Gender and height are dependent.)


Reference:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric_print.html
http://math.tutorvista.com/statistics/chi-square-test.html
