EXPLORING RANDOM VARIABLES


Lesson Objectives
At the end of this lesson, you are expected to:
• illustrate a random variable;
• classify random variables as discrete or continuous; and
• find the possible values of a random variable.
Starting Point
You have learned in your past lessons in mathematics that an experiment is any
activity that can be done repeatedly under similar conditions. The set of all possible
outcomes of an experiment is called the sample space. You have also learned how to
systematically list the possible outcomes of a given experiment.
To find if you are ready to learn this new lesson, do the following activity:
ENTRY CARD
List the sample space of each of the following experiments.
Experiment                                                      Sample Space
1. Tossing three coins
2. Rolling a die and tossing a coin simultaneously
3. Drawing a spade from a deck of cards
4. Getting a defective item when two items are randomly selected
   from a box of two defective and three nondefective items
5. Drawing a card greater than 7 from a deck of cards

JUBERT B. OLIGO, MST-MATHEMATICS


FIRST SEMESTER 2018-2019

Defective Cell Phones


Recall that a variable is a characteristic or attribute that can assume different values. We
use capital letters to denote or represent a variable. In this lesson, we shall discuss
variables that are associated with probabilities, called random variables.

Suppose three cell phones are tested at random. We want to find out the
number of defective cell phones that occur. Thus, to each outcome in the sample space
we shall assign a value. These are 0, 1, 2, or 3. If there is no defective cell phone, we
assign the number 0; if there is 1 defective cell phone, we assign the number 1; if there
are two defective cell phones, we assign the number 2; and if there are three
defective cell phones, we assign the number 3. The number of defective cell phones is a
random variable. The possible values of this random variable are 0, 1, 2, and 3.

Illustration
Let D represent a defective cell phone and N represent a non-defective cell
phone. If we let X be the random variable representing the number of defective cell
phones, can you show the values of the random variable X? Complete the table below to
show the values of the random variable.
The completed table should look like this.
Possible Outcomes    Value of the Random Variable X
                     (number of defective cell phones)
NNN                  0
NND                  1
NDN                  1
DNN                  1
NDD                  2
DND                  2
DDN                  2
DDD                  3
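
The table above can also be generated by enumeration. The following is a minimal sketch, not part of the original lesson; the function name `defective_counts` is a hypothetical helper introduced only for illustration.

```python
from itertools import product

def defective_counts(num_phones=3):
    """Map each outcome string (such as 'NND') to the number of D's in it."""
    # Every phone is either N (non-defective) or D (defective).
    outcomes = ["".join(p) for p in product("ND", repeat=num_phones)]
    return {outcome: outcome.count("D") for outcome in outcomes}

values = defective_counts()
for outcome, x in values.items():
    print(outcome, x)

# The distinct values that the random variable X can take.
possible_values = sorted(set(values.values()))
print(possible_values)
```

The dictionary plays the role of the random variable: it is a function that assigns one real number to each element of the sample space.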


A random variable is a function that associates a real number with each element
in the sample space. It is a variable whose values are determined by chance.

A random variable is a discrete random variable if its set of possible outcomes
is countable. Mostly, discrete random variables represent count data, such as the
number of defective chairs produced in a factory.
A random variable is a continuous random variable if it takes on values on a
continuous scale. Often, continuous random variables represent measured data, such
as heights, weights, and temperatures.

Classify the following random variables as discrete or continuous.

a. The number of defective computers produced by manufacturers
b. The height of newborns each year in a hospital
c. The number of siblings in a family of a religion
d. The amount of paint utilized in a building project
e. The number of dropouts in a school district for a period of 10 years
f. The speed of a car
g. The number of female athletes
h. The time needed to finish the test
i. The amount of sugar in a cup of coffee
j. The number of people who are playing LOTTO each day
k. The number of accidents per year at an intersection
l. The number of voters favoring a candidate
m. The number of bushels of apples per hectare this year
n. The number of patient arrivals per hour at a medical clinic
o. The number of deaths per year attributed to lung cancer


CONSTRUCTING PROBABILITY DISTRIBUTIONS


Lesson Objectives
At the end of this lesson, you should be able to:
• illustrate a probability distribution for a discrete random variable and its
properties;
• compute probabilities corresponding to a given random variable; and
• construct the probability mass function of a discrete random variable and its
corresponding histogram.

In your previous study of mathematics, you have learned how to find the probability
of an event. In this lesson, you will learn how to construct a probability distribution of a
discrete random variable. Your knowledge of getting the probability of an event is very
important in understanding the present lesson. To find out if you are ready to learn this
new lesson, do the following activities.
ENTRY CARD
A. Find the probability of each of the following events.
Event (E)                                                              Probability P(E)
1. Getting an even number in a single roll of a die
2. Getting a sum of 6 when two dice are rolled
3. Getting an ace when a card is drawn from a deck
4. Having all boys when a couple has three children
5. Getting an odd number and a tail when a die is rolled and a coin
   is tossed simultaneously
6. Getting a sum of 11 when two dice are rolled
7. Getting a black card and a 10 when a card is drawn from a deck
8. Getting a red queen when a card is drawn from a deck
9. Getting doubles when two dice are rolled
10. Getting a red ball from a box containing 3 red and 6 black balls


Number of Tails
Suppose three coins are tossed. Let Y be the random variable representing the
number of tails that occur. Find the probability of each of the values of the random
variable Y.
Solution
1. Determine the sample space. Let H represent head and T represent tail.
The sample space for this experiment is
S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}.

2. Count the number of tails in each outcome in the sample space and assign this
number to the outcome.
Possible Outcomes    Value of the Random Variable Y
                     (number of tails)
TTT                  3
TTH                  2
THT                  2
HTT                  2
HHT                  1
HTH                  1
THH                  1
HHH                  0

3. There are four possible values of the random variable Y representing the
number of tails. These are 0, 1, 2, and 3. Assign a probability P(Y) to each value
of the random variable.
• There are 8 equally likely outcomes and no tail occurs once, so the probability
that we shall assign to the value 0 is 1/8.
• There are 8 equally likely outcomes and exactly 1 tail occurs three times, so the
probability that we shall assign to the value 1 is 3/8.
• There are 8 equally likely outcomes and exactly 2 tails occur three times, so the
probability that we shall assign to the value 2 is 3/8.
• There are 8 equally likely outcomes and 3 tails occur once, so the probability
that we shall assign to the value 3 is 1/8.

Table 1.1. The Probability Distribution or the Probability
Mass Function of the Discrete Random Variable Y
Number of Tails Y    0      1      2      3
Probability P(Y)     1/8    3/8    3/8    1/8

A discrete probability distribution or a probability mass function consists of the
values a random variable can assume and the corresponding probabilities of the
values.
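
As a rough illustration of this definition, the sketch below rebuilds the probability mass function of the number of tails by counting outcomes. It assumes that all outcomes are equally likely; the helper name `pmf_of_count` is hypothetical and not from the lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_of_count(symbols, target, trials):
    """PMF of how many times `target` appears, assuming equally likely outcomes."""
    outcomes = list(product(symbols, repeat=trials))
    # Tally how many outcomes give each value of the random variable.
    frequency = Counter(outcome.count(target) for outcome in outcomes)
    total = len(outcomes)
    return {value: Fraction(count, total) for value, count in sorted(frequency.items())}

pmf_tails = pmf_of_count("HT", "T", 3)
for value, probability in pmf_tails.items():
    print(value, probability)
```

Exact fractions are used instead of floats so the result can be compared directly with the entries 1/8, 3/8, 3/8, 1/8.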

Number of Defective Cell Phones

Suppose three cell phones are tested at random. Let D represent a defective cell
phone and let N represent a non-defective cell phone. If we let X be the random
variable representing the number of defective cell phones, construct the probability
distribution of the random variable X.
Solution
1. Determine the sample space. Let D represent a defective cell phone and N
represent a non-defective cell phone. The sample space for this experiment is
S = {NNN, NND, NDN, DNN, NDD, DND, DDN, DDD}.

2. Count the number of defective cell phones in each outcome in the sample
space and assign this number to the outcome.
Possible Outcomes    Value of the Random Variable X
                     (number of defective cell phones)
NNN                  0
NND                  1
NDN                  1
DNN                  1
NDD                  2
DND                  2
DDN                  2
DDD                  3

3. There are four possible values of the random variable X representing the
number of defective cell phones. These are 0, 1, 2, and 3. Assign a probability
P(X) to each value of the random variable. Here the 8 outcomes are treated as
equally likely.
• There are 8 possible outcomes and no defective cell phone occurs once, so
the probability that we shall assign to the value 0 is 1/8.
• There are 8 possible outcomes and 1 defective cell phone occurs three
times, so the probability that we shall assign to the value 1 is 3/8.
• There are 8 possible outcomes and 2 defective cell phones occur three
times, so the probability that we shall assign to the value 2 is 3/8.
• There are 8 possible outcomes and 3 defective cell phones occur once, so
the probability that we shall assign to the value 3 is 1/8.

Table 1.2. The Probability Distribution or the Probability Mass Function
of the Discrete Random Variable X
Number of Defective Cell Phones X    0      1      2      3
Probability P(X)                     1/8    3/8    3/8    1/8


[Figure 1.2. The Histogram for the Probability. Bar chart with horizontal axis "Number of Defective Cellphones (X)" and vertical axis "Probability P(X)".]


Exercises

Properties of a Probability Distribution

1. The probability of each value of the random variable must be between 0 and 1,
inclusive. In symbols, we write this as 0 ≤ P(X) ≤ 1.
2. The sum of the probabilities of all values of the random variable must be equal
to 1. In symbols, we write this as ∑P(X) = 1.
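
The two properties can be checked mechanically. The sketch below is only an illustration; the function name `is_probability_distribution` and the sample lists are assumptions made for this example, with the first list taken from Table 1.2.

```python
from fractions import Fraction
from math import isclose

def is_probability_distribution(probabilities):
    """Return True when both properties of a probability distribution hold."""
    # Property 1: every probability lies between 0 and 1, inclusive.
    in_range = all(0 <= p <= 1 for p in probabilities)
    # Property 2: the probabilities add up to 1 (small tolerance for floats).
    sums_to_one = isclose(float(sum(probabilities)), 1.0, abs_tol=1e-9)
    return in_range and sums_to_one

table_1_2 = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
print(is_probability_distribution(table_1_2))
print(is_probability_distribution([0.5, 0.6]))
print(is_probability_distribution([-0.2, 1.2]))
```

The second list fails because its sum exceeds 1, and the third fails because one entry is negative even though the sum is 1.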

A. Determine whether each distribution represents a probability distribution. Explain
your answer.
1.
X       1      5      8      7      9
P(X)    1/3    1/3    1/3    1/3    1/3
2.
X       0      2      4      6      8
P(X)    1/6    1/6    1/3    1/6    1/6

B. The following data show the probabilities for the number of cars sold in a given
day at a car dealer store.
Number of Cars X    Probability P(X)
0                   0.100
1                   0.150
2                   0.250
3                   0.140
4                   0.090
5                   0.080
6                   0.060
7                   0.050
8                   0.040
9                   0.025
10                  0.015

a. Find P (X ≤ 2)
b. Find P (X ≥ 7)
c. Find P (1 ≤ X ≤ 5)
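
Probabilities such as P(X ≤ a) or P(a ≤ X ≤ b) are found by adding the probabilities of the values in the range. The sketch below shows the idea on the distribution from Table 1.2 rather than on the exercise data; the helper name `probability_between` is hypothetical.

```python
from fractions import Fraction

def probability_between(pmf, low, high):
    """Add P(X = x) over every value x with low <= x <= high."""
    return sum((p for x, p in pmf.items() if low <= x <= high), Fraction(0))

# Distribution of the number of defective cell phones (Table 1.2).
pmf_defective = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

at_most_one = probability_between(pmf_defective, 0, 1)    # P(X <= 1)
at_least_two = probability_between(pmf_defective, 2, 3)   # P(X >= 2)
one_to_three = probability_between(pmf_defective, 1, 3)   # P(1 <= X <= 3)
print(at_most_one, at_least_two, one_to_three)
```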

COMPUTING THE VARIANCE OF A DISCRETE PROBABILITY DISTRIBUTION
Lesson Objectives
At the end of this lesson, you should be able to:
• illustrate and calculate the variance of a discrete random variable;
• interpret the variance of a discrete random variable; and
• solve problems involving the variance of probability distributions.
Mean and Variance of the Sampling Distribution of Sample Means

Consider a population consisting of 1, 2, 3, 4, and 5. Suppose samples of size 2 are
drawn from this population. Describe the sampling distribution of the sample means.
• What are the mean and variance of the sampling distribution of the sample
means?
• Compare the histogram of the sampling distribution with that of the population.
• Draw the histogram of the sampling distribution of the sample means.


Solution
1. Compute the mean of the population (µ).
$\mu = \frac{\sum X}{N} = \frac{1+2+3+4+5}{5} = 3$
So, the mean of the population is 3.00.

2. Compute the variance of the population ($\sigma^2$).
X    X - µ    (X - µ)²
1    -2       4
2    -1       1
3    0        0
4    1        1
5    2        4
              ∑(X - µ)² = 10
$\sigma^2 = \frac{\sum (X-\mu)^2}{N} = \frac{10}{5} = 2$
So, the variance of the population is 2.

3. Determine the number of possible samples of size n = 2. Use the formula
$_{N}C_{n}$. Here N = 5 and n = 2.
$_{5}C_{2} = 10$
So, there are 10 possible samples of size 2 that can be drawn.

4. List all possible samples and their corresponding means.
Samples    Means
1, 2       1.50
1, 3       2.00
1, 4       2.50
1, 5       3.00
2, 3       2.50
2, 4       3.00
2, 5       3.50
3, 4       3.50
3, 5       4.00
4, 5       4.50


5. Construct the sampling distribution of the sample means.

Sampling Distribution of Sample Means
Sample Mean X̄    Frequency    Probability P(X̄)
1.50             1            1/10
2.00             1            1/10
2.50             2            1/5
3.00             2            1/5
3.50             2            1/5
4.00             1            1/10
4.50             1            1/10
Total            10           1.00

6. Compute the mean of the sampling distribution of the sample means
($\mu_{\bar{X}}$). Follow these steps.
a. Multiply each sample mean by the corresponding probability.
b. Add the results.

Sample Mean X̄    Probability P(X̄)    X̄ ∙ P(X̄)
1.50             1/10                0.15
2.00             1/10                0.20
2.50             1/5                 0.50
3.00             1/5                 0.60
3.50             1/5                 0.70
4.00             1/10                0.40
4.50             1/10                0.45
Total            1.00                3.00

$\mu_{\bar{X}} = \sum \bar{X} \cdot P(\bar{X}) = 3.00$
So, the mean of the sampling distribution of the sample means is 3.00.


7. Compute the variance ($\sigma^2_{\bar{X}}$) of the sampling distribution of the
sample means. Follow these steps:
a. Subtract the population mean (µ) from each sample mean (X̄). Label this as X̄ - µ.
b. Square the difference. Label this as (X̄ - µ)².
c. Multiply the result by the corresponding probability. Label this as
P(X̄) ∙ (X̄ - µ)².
d. Add the results.

X̄       P(X̄)    X̄ - µ    (X̄ - µ)²    P(X̄) ∙ (X̄ - µ)²
1.50    1/10    -1.50    2.25        0.225
2.00    1/10    -1.00    1.00        0.100
2.50    1/5     -0.50    0.25        0.050
3.00    1/5     0.00     0.00        0.000
3.50    1/5     0.50     0.25        0.050
4.00    1/10    1.00     1.00        0.100
4.50    1/10    1.50     2.25        0.225
Total   1.00                         0.750

$\sigma^2_{\bar{X}} = \sum P(\bar{X}) \cdot (\bar{X} - \mu)^2 = 0.75$
So, the variance of the sampling distribution is 0.75.

8. Construct the histogram for the sampling distribution of the sample
means.
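
Steps 3 to 7 can be retraced with a short script. This is a sketch under the same setup as the activity (population 1 to 5, samples of size 2 drawn without replacement); the function name `sampling_distribution` is an assumption made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution(population, n):
    """Probability of each sample mean over all samples of size n (no replacement)."""
    means = [Fraction(sum(sample), n) for sample in combinations(population, n)]
    frequency = Counter(means)
    return {m: Fraction(f, len(means)) for m, f in sorted(frequency.items())}

dist = sampling_distribution([1, 2, 3, 4, 5], 2)

# Step 6: mean of the sampling distribution, sum of (sample mean x probability).
mean_of_means = sum(m * p for m, p in dist.items())

# Step 7: variance, sum of probability x squared deviation from the mean.
variance_of_means = sum(p * (m - mean_of_means) ** 2 for m, p in dist.items())

print(float(mean_of_means), float(variance_of_means))
```

With exact fractions the script reproduces the values in the tables: seven distinct sample means, a mean of 3, and a variance of 3/4.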

Try to think about how to answer these questions.

• How do you compare the mean of the sample means and the mean of the population?
• How do you compare the variance of the sample means and the variance of the
population?
Let us summarize what we have done in the preceding activities by comparing the
means and variances of the population and of the sampling distribution of the sample means.
                      In-class Activity 1                        In-class Activity 2
                      Population    Sampling Distribution        Population    Sampling Distribution
                      (N = 5)       of the Sample Means          (N = 5)       of the Sample Means
                                    (n = 2)                                    (n = 3)
Mean                  µ = 3.00      µ_X̄ = 3.00                   µ = 3.00      µ_X̄ = 3.00
Variance              σ² = 2.00     σ²_X̄ = 0.75                  σ² = 2.00     σ²_X̄ = 0.33
Standard Deviation    σ = 1.41      σ_X̄ = 0.87                   σ = 1.41      σ_X̄ = 0.58
Observe that the mean of the sampling distribution of the sample means is always
equal to the mean of the population. The variance of the sampling distribution is obtained
by using the formula
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$.
This formula holds when the population is finite. The examples shown in the preceding
activities all involve finite populations.
A finite population is one that consists of a finite or fixed number of elements,
measurements, or observations, while an infinite population contains, hypothetically at
least, infinitely many elements.

The expression $\sqrt{\frac{N-n}{N-1}}$ is called the finite population correction factor.
In general, when the population is large and the sample size is small, the correction
factor is not used since it will be very close to 1.
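
As a quick check of the formula, the sketch below evaluates it for the population used in the activity (σ² = 2, N = 5). The function name `variance_of_sample_mean` and the large-population figures in the last lines are assumptions for illustration only.

```python
from fractions import Fraction

def variance_of_sample_mean(pop_variance, n, N=None):
    """sigma^2 / n, times (N - n) / (N - 1) when a finite population size N is given."""
    base = Fraction(pop_variance) / n
    if N is None:          # treat the population as infinite: no correction
        return base
    return base * Fraction(N - n, N - 1)

print(variance_of_sample_mean(2, 2, N=5))   # samples of size 2 from N = 5
print(variance_of_sample_mean(2, 3, N=5))   # samples of size 3 from N = 5

# With a large population and a small sample, the correction is close to 1.
correction = Fraction(10000 - 25, 10000 - 1)
print(float(correction))
```

The first call agrees with the variance of 0.75 computed step by step in the activity.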
We summarize the properties of the sampling distribution below.

Properties of the Sampling Distribution of the Sample Mean

If all possible samples of size n are drawn from a population of size N with mean µ and
variance $\sigma^2$, then the sampling distribution of the sample means has the following
properties:
1. The mean of the sampling distribution of the sample means is equal to
the population mean µ. That is, $\mu_{\bar{X}} = \mu$.
2. The variance of the sampling distribution of the sample means is
given by:
• $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$ for a finite population, and
• $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$ for an infinite population.
3. The standard deviation of the sampling distribution of the sample
means is given by:
• $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$ for a finite population, where
$\sqrt{\frac{N-n}{N-1}}$ is the finite population correction factor, and
• $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ for an infinite population.


The standard deviation ($\sigma_{\bar{X}}$) of the sampling distribution of the sample means is
also known as the standard error of the mean. It measures the degree of accuracy of the sample
mean ($\bar{X}$) as an estimate of the population mean (µ).
A good estimate of the mean is obtained if the standard error of the mean ($\sigma_{\bar{X}}$) is
small or close to zero, while a poor estimate is obtained if the standard error of the mean is
large. Observe that the value of $\sigma_{\bar{X}}$ depends on the size of the sample (n). What happens
to $\sigma_{\bar{X}}$ when n increases?
Thus, if we want to get a good estimate of the population mean, we have to make
n sufficiently large. A related result, which describes the shape of the sampling distribution
when n is large, is known as the Central Limit Theorem.

The Central Limit Theorem


If random samples of size n are drawn from a population, then as n becomes
larger, the sampling distribution of the mean approaches the normal distribution,
regardless of the shape of the population distribution.
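
The theorem can be explored with a small simulation. The sketch below is not from the lesson: it assumes a skewed (exponential) population with mean 1 and standard deviation 1, a sample size of 30, and a fixed random seed, all chosen only for illustration.

```python
import random
import statistics

def simulate_sample_means(sample_size, repetitions=2000, seed=0):
    """Draw many samples from an exponential population and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(sample_size=30)
center = statistics.fmean(means)        # should be near the population mean, 1
spread = statistics.stdev(means)        # should be near sigma / sqrt(n)
expected_spread = 1.0 / 30 ** 0.5

# Share of sample means within 1.96 standard errors of the population mean.
inside = sum(abs(m - 1.0) <= 1.96 * expected_spread for m in means) / len(means)
print(round(center, 3), round(spread, 3), round(inside, 3))
```

Even though the population is far from normal, the share of sample means within 1.96 standard errors should come out close to 95%, which is what a normal sampling distribution predicts.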

Describing the Sampling Distribution of the Sample Means from an Infinite Population
Example 1: A population has a mean of 60 and a standard deviation of 5. A random
sample of 16 measurements is drawn from this population. Describe the
sampling distribution of the sample means by computing its mean and
standard deviation.
We shall assume that the population is infinite.

Solution
1. Identify the given information. Here µ = 60, σ = 5, and n = 16.
2. Find the mean of the sampling distribution. Use the property that $\mu_{\bar{X}} = \mu$.
$\mu_{\bar{X}} = \mu = 60$
3. Find the standard deviation of the sampling distribution. Use the property that
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
$\sigma_{\bar{X}} = \frac{5}{\sqrt{16}} = \frac{5}{4} = 1.25$

Example 2: The heights of male college students are normally distributed with a mean of
68 inches and a standard deviation of 3 inches. If 80 samples consisting of 25
students each are drawn from the population, what would be the expected
mean and standard deviation of the resulting sampling distribution of the
means?
We shall assume that the population is infinite.

Solution
1. Identify the given information. Here µ = 68, σ = 3, and n = 25.
2. Find the mean of the sampling distribution. Use the property that $\mu_{\bar{X}} = \mu$.
$\mu_{\bar{X}} = \mu = 68$
3. Find the standard deviation of the sampling distribution. Use the property that
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
$\sigma_{\bar{X}} = \frac{3}{\sqrt{25}} = \frac{3}{5} = 0.6$
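
The standard error in both examples comes from the same one-line formula. A minimal sketch follows; the function name `standard_error` is an assumption, and the last call uses an illustrative sample size of 64 that does not appear in the examples.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean for an infinite population."""
    return sigma / sqrt(n)

example_1 = standard_error(5, 16)   # Example 1: sigma = 5, n = 16
example_2 = standard_error(3, 25)   # Example 2: sigma = 3, n = 25
print(example_1, example_2)

# Making the sample four times larger cuts the standard error in half.
print(standard_error(5, 64))
```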

Mathematical Journal
Think about the answers to the following questions. Discuss your answers with
your classmates.
If a sample is drawn from a population, what happens to the standard error of the
mean if the sample size is:
1. increased from 50 to 200?

2. increased from 25 to 225?

3. increased from 200 to 400?

4. decreased from 600 to 40?


SOLVING PROBLEMS INVOLVING SAMPLING DISTRIBUTION OF THE SAMPLE MEANS

In the previous chapter, you have learned how to use the normal distribution to
gain information about an individual data value obtained from the population. In this
lesson, you will use the sampling distribution of the mean to obtain information about
the sample mean. Find out if you are ready to learn the present lesson by doing the next
activity.
A. Areas Under the Normal Curve
Find the area under the normal curve given the following conditions.
Conditions Illustration Area

between z = 0.5 and z = 1.5

between z = -1.5 and z = -0.2.5

between z = 0.75 and z = 1.5

to the left of z = 1.5

to the left of z = 0.75

The Central Limit Theorem is of fundamental importance in statistics because it
justifies the use of normal curve methods for a wide range of problems. This theorem
applies automatically to sampling from an infinite population. It also assures us that no matter
what the shape of the population distribution is, the sampling distribution
of the sample means is approximately normally distributed whenever n is large.

Consequently, it justifies the use of the formula below when computing the
probability that $\bar{X}$ will take on a value within a given range in the sampling distribution of
$\bar{X}$:
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
where $\bar{X}$ = sample mean
µ = population mean
σ = population standard deviation
n = sample size
Time to Complete an Examination
Example 1. The average time it takes a group of college students to complete a certain
examination is 46.2 minutes. The standard deviation is 8 minutes. Assume
that the variable is normally distributed.
a. What is the probability that a randomly selected college student will
complete the examination in less than 43 minutes?
Solution
1. Identify the given information. µ = 46.2, σ = 8, X = 43
2. Identify what is asked. P(X < 43)
3. Identify the formula to be used. Here we are dealing with an individual data value
obtained from the population. So, we will use the formula $z = \frac{X - \mu}{\sigma}$ to
standardize 43.
4. Solve the problem.
$z = \frac{X - \mu}{\sigma} = \frac{43 - 46.2}{8} = -0.40$
We shall find P(X < 43) by getting the area under the normal curve.
P(X < 43) = P(z < -0.40)
= 0.5000 - 0.1554
= 0.3446
5. State the final answer. So, the probability that a randomly selected college student
will complete the examination in less than 43 minutes is 0.3446 or 34.46%.

b. If 50 randomly selected students take the examination, what is the
probability that the mean time it takes the group to complete the test
will be less than 43 minutes?
Solution
1. Identify the given information. µ = 46.2, σ = 8, $\bar{X}$ = 43, n = 50
2. Identify what is asked. P($\bar{X}$ < 43)
3. Identify the formula to be used. Here we are dealing with data about the sample
means. So, we will use the formula $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ to standardize 43.
4. Solve the problem.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{43 - 46.2}{8 / \sqrt{50}} = -2.83$
We shall find P($\bar{X}$ < 43) by getting the area under the normal curve.
P($\bar{X}$ < 43) = P(z < -2.83)
= 0.5000 - 0.4977
= 0.0023
5. State the final answer. So, the probability that the mean time of 50 randomly
selected college students to complete the examination is less than 43 minutes
is 0.0023 or 0.23%.

c. Does it seem reasonable that a college student would finish the
examination in less than 43 minutes? Yes.
d. Does it seem reasonable that the mean of the 50 college students could
be less than 43 minutes? No, it is very unlikely.
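
The two answers above can be cross-checked without a z-table. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `probability_below` is hypothetical, and setting n = 1 recovers the formula for an individual value.

```python
from math import sqrt
from statistics import NormalDist

def probability_below(value, mu, sigma, n=1):
    """Return (z, P(mean of n observations < value)) for a normal population."""
    z = (value - mu) / (sigma / sqrt(n))
    return z, NormalDist().cdf(z)

z_one, p_one = probability_below(43, 46.2, 8)            # part a: one student
z_mean, p_mean = probability_below(43, 46.2, 8, n=50)    # part b: mean of 50 students

print(round(z_one, 2), round(p_one, 4))
print(round(z_mean, 2), round(p_mean, 4))
```

The same value of 43 minutes is only 0.40 standard deviations below the mean for one student, but about 2.83 standard errors below it for the mean of 50 students, which is why the second probability is so much smaller.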


Cholesterol Content
The average number of milligrams (mg) of cholesterol in a cup of a certain brand of
ice cream is 660 mg, and the standard deviation is 35 mg. Assume the variable is normally
distributed.
a. If a cup of ice cream is selected, what is the probability that the cholesterol content
will be more than 670 mg?
Solution
1. Identify the given information. µ = 660, σ = 35, X = 670
2. Identify what is asked. P(X > 670)
3. Identify the formula to be used. Here we are dealing with an individual data value
obtained from the population. So, we will use the formula $z = \frac{X - \mu}{\sigma}$ to
standardize 670.
4. Solve the problem.
$z = \frac{X - \mu}{\sigma} = \frac{670 - 660}{35} = 0.29$
We shall find P(X > 670) by getting the area under the normal curve.
P(X > 670) = P(z > 0.29)
= 0.5000 - 0.1141
= 0.3859
5. State the final answer. So, the probability that the cholesterol content will be more
than 670 mg is 0.3859 or 38.59%.

b. If a sample of 10 cups of ice cream is selected, what is the probability that the mean
of the sample will be larger than 670 mg?


Solution
1. Identify the given information. µ = 660, σ = 35, $\bar{X}$ = 670, n = 10
2. Identify what is asked. P($\bar{X}$ > 670)
3. Identify the formula to be used. Here we are dealing with data about the sample
means. So, we will use the formula $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ to standardize 670.
4. Solve the problem.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{670 - 660}{35 / \sqrt{10}} = 0.90$
We shall find P($\bar{X}$ > 670) by getting the area under the normal curve.
P($\bar{X}$ > 670) = P(z > 0.90)
= 0.5000 - 0.3159
= 0.1841
5. State the final answer. So, the probability that the mean cholesterol content of 10
randomly selected cups of ice cream will be more than 670 mg is 0.1841 or 18.41%.

When do you use these formulae?

• $z = \frac{X - \mu}{\sigma}$
• $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$


Exercises
Solve the following problems.
1. A manufacturer of light bulbs produces bulbs that last a mean of 950 hours with a
standard deviation of 120 hours. What is the probability that the mean lifetime
of a random sample of 10 of these bulbs is less than 900 hours?
2. The average cholesterol content of a certain canned good is 215 milligrams,
and the standard deviation is 15 milligrams. Assume the variable is normally
distributed.
a. If a canned good is selected, what is the probability that the cholesterol
content will be greater than 220 milligrams?
b. If a sample of 25 canned goods is selected, what is the probability that the
mean of the sample will be larger than 220 milligrams?
3. The average public high school has 468 students with a standard deviation of
87.
a. If a public school is selected, what is the probability that the number of
students enrolled is greater than 400?
b. If a random sample of 38 public elementary schools is selected, what is the
probability that the number of students enrolled is between 445 and 485?

Recalling and Applying the Normal Curve Concepts

Task:
1. Read the following carefully in preparation for determining confidence
interval estimates for the population mean µ.
2. Consult the z-table for the z-values and their corresponding areas as
deemed necessary.


Z-table
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
For values of z above 3.09, use .4999 for the area.
Adopted from Mario F. Triola (1995). Elementary Statistics. 6th ed. New York: Addison-Wesle

Recall that a standard normal distribution is a normal probability distribution with
a mean of 0 and a standard deviation of 1. At the horizontal base of the curve, we find the
z-values. The z-value or z-score is the number of standard deviations that a
particular X value is away from the mean. Table 4.1 gives the area (to four decimal places)
under the standard normal curve for any z-value from -3.49 to 3.49. This table is also
known as the z-table.
The area under the curve is 1 or 100%. The proportion of the area between 1
standard deviation unit below the mean and 1 standard deviation unit above the mean is
approximately 68%. The middle 95% is the proportion of the region above z = -1.96 and
below z = 1.96. These z-values determine the 95% confidence interval estimates. Similarly,
the middle 99% is the proportion of the area bounded by z = -2.58 and z = +2.58. These
approximations are shown in the figure. For another confidence level, the
corresponding z-values are called confidence coefficients. They are also called critical
values. We also say that the standard normal variable z is the test statistic used to
calculate the interval boundaries.
Recall also that for large samples, the Central Limit Theorem (CLT) applies.
That is, as the sample size n increases without limit, the shape of the distribution of the
sample means taken with replacement from a population with mean µ and standard
deviation σ will approach a normal distribution. So, when the sample size is large,
applying the CLT, approximately 95% of the sample means taken from a population with
mean µ will fall within ±1.96 standard errors of the population mean. This means that the
interval estimate is given by:
$\mu - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\mu + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$
Thus, if a sample mean is specified, there is a 95% probability that the interval
$\mu - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\mu + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ contains $\bar{X}$.
In an analogous manner, there is a 95% probability that the interval specified by
$\bar{X} - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\bar{X} + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ will contain µ.

This expression may also be stated like this:
$\bar{X} - 1.96\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$

The expression shows that the interval estimate of the population mean µ is a
number from $\bar{X} - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\bar{X} + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$.
This is shown in the figure where $z_{\alpha/2} = \pm 1.96$. The two z-values, -1.96 and +1.96,
are the boundaries of the interval estimate. Under the normal curve, the total proportion
of the area to the left of -1.96 and to the right of 1.96 is denoted by α (Greek letter
alpha). In statistical analysis, we often use α to indicate our level of confidence. If the
confidence level is 95%, then α is the remaining 5% or 0.05. This is the proportion of the
area that is distributed in both tails of the standard normal distribution curve. This area
is outside the boundaries of the interval estimate. So, the area at each tail is
$\frac{\alpha}{2}$ or $\frac{0.05}{2}$, which is equal to 0.025. This is indicated in the symbol
$z_{\alpha/2}$ (read “zee sub alpha over two”) in the general formula.
The general formula for the confidence interval for large samples is:

$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
The short form of this formula is:
$\bar{X} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
Other confidence levels are also used in statistics, like 90% or 99%.
In the general formula for determining the interval estimate for the parameter µ, the
value $\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ is called the lower confidence boundary
or limit and the other value $\bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ is called the
upper confidence boundary or limit.
For a 90% confidence interval, $z_{\alpha/2} = \pm 1.645$; for a 95% confidence interval,
$z_{\alpha/2} = \pm 1.96$; and for a 99% confidence interval, $z_{\alpha/2} = \pm 2.58$.
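
The confidence coefficients can be recovered from the inverse of the standard normal distribution instead of a printed table. A minimal sketch follows, using `statistics.NormalDist` from the Python standard library; the function name `critical_value` is an assumption made for illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Positive z with area alpha/2 to its right, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))
```

Rounded to two decimal places, these are the values usually read from a z-table, with the 90% value often quoted as 1.645 because it falls between two table entries.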


The figure on the right shows the 95% confidence interval in a normal distribution.

1. Write a formula for computing the interval estimate of the population mean µ
for:
a. 90% confidence
b. 99% confidence
2. Draw a normal curve showing the confidence coefficients in the interval
estimate for:
a. 90% confidence
b. 99% confidence

Determining Interval Estimates


In the general formula for a confidence interval, the term $z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ is called the margin
of error, denoted by E, which is defined as the maximum likely difference between the
observed sample mean and the true value of the population mean µ. Thus, another way
of writing the formula for finding the confidence interval for the population parameter µ
is the following rule, which is observed in computing the confidence interval for a
population mean µ:

$\bar{X} - E < \mu < \bar{X} + E$, where $E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$

However, when σ is not known (as is often the case), the sample standard
deviation s is used to approximate σ. So, the formula for E is modified:

$E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \approx z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
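The margin-of-error rule can be sketched in a few lines of code. This is only an illustration: the function names and the sample values (mean 50, σ = 4, n = 64) are hypothetical and do not come from the lesson.

```python
import math


def margin_of_error(z: float, sigma: float, n: int) -> float:
    """E = z_(alpha/2) * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)


def confidence_interval(x_bar: float, z: float, sigma: float, n: int):
    """Return (lower limit, upper limit) = (x_bar - E, x_bar + E)."""
    e = margin_of_error(z, sigma, n)
    return x_bar - e, x_bar + e


# Hypothetical sample: mean 50, sigma 4, n = 64, 95% confidence (z = 1.96).
lower, upper = confidence_interval(x_bar=50.0, z=1.96, sigma=4.0, n=64)
print(f"{lower:.2f} < mu < {upper:.2f}")
```

When σ is unknown and the sample is large, the same functions can be called with s in place of `sigma`, as the approximation above describes.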


The interval estimation procedure is summarized in the following four-step process.

A Four-Step Process in Computing the Interval Estimate


Step 1. Describe the population parameter of interest (e.g., mean µ).
Step 2. Specify the confidence interval criteria.
a. Check the assumptions.
b. Determine the test statistic to be used.
c. State the level of confidence.
Step 3. Collect and present sample evidence.
a. Collect sample information.
b. Find the point estimate.
Step 4. Determine the confidence interval.
a. Determine the confidence coefficients (e.g., $z_{\alpha/2}$).
b. Find the maximum error E of the estimate.
c. Find the lower and upper confidence limits.
d. Describe / interpret the results.

Applying Normal Curve Concepts

Example 1:
Given: $\bar{X} = 72$, $n = 120$, and $\sigma = 3$. Find the estimate of the population mean µ using the 95%
confidence level.
Solution:
With the large sample, by the Central Limit Theorem, the sampling distribution of the mean is
approximately normal.
a. Point Estimate
Steps and Solutions
1. Describe the population parameter of interest.
   Solution: The parameter of interest is the mean µ of the population where the sample
   purportedly belongs.
2. Specify the confidence interval criteria.
   a. Check the assumptions.
      Solution: The σ is given. The sampling distribution of the mean is approximately normal,
      as guaranteed by the CLT.


   b. Determine the test statistic to be used to calculate the interval.
      Solution: The test statistic is z, with σ = 3.
   c. State the level of confidence.
      Solution: The question asks for 95% confidence, or α = 0.05. This means that if more
      random samples were taken from the target population, and an interval estimate is
      made for each sample, then about 95% of the intervals will contain the true parameter
      value.
3. Collect and present sample evidence.
   a. Collect the sample information.
      Solution: The sample information consists of $\bar{X} = 72$, $n = 120$, and $\sigma = 3$.
   b. Find the point estimate.
      Solution: The point estimate for the population mean µ is 72 (the sample mean).
b. 95% Confidence Interval
4. Determine the confidence interval.
   a. Determine the confidence coefficient.
      Solution: The confidence coefficient is 1.96.
   b. Find the maximum error E.
      Solution:
      $E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
      $= 1.96\left(\frac{3}{\sqrt{120}}\right)$
      $= 1.96\left(\frac{3}{10.95}\right)$
      $= 1.96(0.27)$
      $= 0.53$
   c. Find the lower and upper confidence limits.
      Solution:
      $\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
      $\bar{X} - 1.96\left(\frac{3}{\sqrt{120}}\right) < \mu < \bar{X} + 1.96\left(\frac{3}{\sqrt{120}}\right)$
      72 – 0.53 to 72 + 0.53
      71.47 to 72.53
   d. Describe the results.
      Solution: We can say with 95% confidence that the interval between 71.47 and 72.53
      contains the population mean µ based on a sample of size 120.
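The arithmetic in Example 1 can be checked with a short script. This is a sketch only; the variable names are illustrative. It computes E twice: once rounding the standard error to 0.27 as in the worked solution, and once without intermediate rounding, to show how much the rounding matters.

```python
import math

# Values from Example 1.
x_bar, n, sigma, z = 72.0, 120, 3.0, 1.96

standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n), about 0.2739

# As in the worked solution: round the standard error to 0.27 first.
e_rounded = round(z * round(standard_error, 2), 2)

# Without intermediate rounding.
e_exact = z * standard_error

print(f"rounded steps: E = {e_rounded:.2f}, "
      f"interval {x_bar - e_rounded:.2f} to {x_bar + e_rounded:.2f}")
print(f"no rounding:   E = {e_exact:.4f}, "
      f"interval {x_bar - e_exact:.2f} to {x_bar + e_exact:.2f}")
```

Carrying full precision moves each limit by about 0.01, so the interpretation of the interval is unchanged.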


Example 3: GPAs of Entering Mathematics Majors


A random selection of 40 entering Mathematics majors has the following GPAs.
Assume that σ=0.46.
4.0 3.5 3.0 3.3 3.8 3.1 3.6 4.0 3.9 3.5
3.2 4.0 3.5 3.2 3.0 3.2 4.0 3.0 3.4 3.0
3.0 2.8 5.6 3.0 3.2 3.5 3.2 2.8 3.3 3.1
3.2 2.9 3.0 2.8 4.0 3.7 3.0 3.3 3.2 2.8
Estimate the true mean GPA with 99% confidence.
Solution:
a. Point Estimate
Steps and Solutions
1. Describe the population parameter of interest.
   Solution: The parameter of interest is the mean GPA µ of the population of entering
   mathematics majors.
2. Specify the confidence interval criteria.
   a. Check the assumptions.
      Solution: The sample of 40 math majors is large enough for the Central Limit Theorem
      to satisfy the assumption that the sampling distribution of means is normal.
   b. Determine the test statistic to be used to calculate the interval.
      Solution: The test statistic is z, with σ = 0.46.
   c. State the level of confidence.
      Solution: 99% confidence level, so α = 0.01. From the z-table, the confidence
      coefficients are ±2.58.
3. Collect and present sample evidence.
   a. Collect the sample information.
      Solution: The sample information consists of 40 raw scores and σ = 0.46.
   b. Find the point estimate.
      Solution: The point estimate for the population mean is:
      $\bar{x} = \frac{4.0 + 3.2 + 3.0 + 3.2 + 3.5 + \dots + 2.8}{40} = 3.34$ (the sample mean).
b. 99% Confidence Interval
4. Determine the confidence interval.


   a. Determine the confidence coefficient.
      Solution: The confidence coefficient is 2.58.
   b. Find the maximum error E.
      Solution:
      $E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
      $E = 2.58\left(\frac{0.46}{\sqrt{40}}\right)$
      $E = 2.58(0.0727)$
      $E = 0.19$
   c. Find the lower and the upper confidence limits.
      Solution:
      $\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
      $\bar{x} - 2.58\left(\frac{0.46}{\sqrt{40}}\right) < \mu < \bar{x} + 2.58\left(\frac{0.46}{\sqrt{40}}\right)$
      $3.34 - 0.19 < \mu < 3.34 + 0.19$
      3.15 to 3.53
   d. Describe the results.
      Solution: We can say with 99% confidence that the interval between 3.15 and 3.53
      contains the true mean GPA of the population based on the sample GPA of 40
      entering mathematics majors.

CONFIDENCE INTERVALS FOR THE POPULATION MEAN WHEN σ IS UNKNOWN
Lesson Objectives
At the end of this lesson, you are expected to:
 Identify the appropriate distribution when the population σ is unknown;
 Understand the t-distribution;
 State the difference between a z-distribution and t-distribution; and
 Identify the confidence coefficients for computing t from the t-Table.
Tasks:

 Study the hypothetical situation about an effective teaching strategy.


 Compute the parameter estimates to answer the questions that follow.

Aldrei wants to know if cooperative grouping is an effective strategy in
improving the mathematics performance of Grade 7 students. Twenty students were
included in the experimental group while another 20 students were included in the
control group. The mean achievement score of the students in the experimental
group was 82.5 with a standard deviation of 3 while the mean of the students in the
control group was 80 with a standard deviation of 6. The two groups come from
normally distributed populations. The confidence level adopted was 95%.
1. What is the estimate of the population mean where the experimental
group comes from? _____________
2. What is the estimate of the population mean where the control group
comes from? _____________
3. Express your confidence level as percentage. _________________

Notice that the population standard deviation σ for each group is unknown. In
statistics, there is a method that we can use to compute confidence intervals for a
population mean when σ is unknown. However, there are assumptions to bear in mind.
Assumptions in Computing for the Population Mean when σ is Unknown
When n ≥ 30, and σ is unknown, the sample standard deviation s can be substituted
for σ. However, the following assumptions should be met.
1. The sample is a random sample.
2. Either n ≥ 30 or the population is normally distributed when n < 30.
In the past lesson, when σ is known and the sample size is 30 or more, or the
sample size is less than 30 but comes from a population that is approximately normally
distributed, the confidence interval for the population mean can be found by using the
z-distribution. Very often, however, σ is not known. So, it must be estimated by s, the sample
standard deviation. When s is used, especially when the sample size is small, critical values
greater than the values for $z_{\alpha/2}$ are used in confidence intervals in order to keep the
intervals at a given level such as the 95% level. This means that a multiplier of the standard
error of the mean, denoted as $\frac{s}{\sqrt{n}}$, slightly larger than 1.96 is needed. The sampling error
associated with using $\frac{s}{\sqrt{n}}$ is reflected in wider confidence intervals. But the number of
standard errors $\left(\frac{s}{\sqrt{n}}\right)$ needed for the 0.90 or 0.95 confidence intervals depends on the
sample size. With a small sample size, more standard errors are needed to span the 0.95
confidence interval. This number of $\frac{s}{\sqrt{n}}$ values is called t.

The general expression for the confidence interval when σ is unknown is given by
$\bar{x} \pm t\left(\frac{s}{\sqrt{n}}\right)$, and the distribution of values is called the t-distribution.
The concept of the degrees of freedom is used in the t-distribution. The degrees of
freedom, denoted as df, are the number of values that are free to vary after a sample
statistic has been computed, and they tell us the specific curve to use when a distribution
consists of a family of curves. For example, if the mean of 5 values is 10, then 4 of the 5 values
are free to vary. But once the 4 values are selected, the 5th value must be a specific
number, since the sum must be 5 × 10 = 50. Thus, if n = 5, df = n - 1 = 4. (McClave & Sincich 2003).
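The degrees-of-freedom idea can be illustrated with a small sketch. The function name `last_value` and the four chosen values are hypothetical; only the mean of 10 and n = 5 come from the example above.

```python
def last_value(free_values, mean, n):
    """Return the one remaining value that is forced once the mean of n values is fixed."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values may be chosen freely")
    # The n values must add up to n * mean, so the last one is whatever is left.
    return mean * n - sum(free_values)


free = [8, 12, 9, 11]                   # any four values may be chosen (df = 4)
fifth = last_value(free, mean=10, n=5)  # the fifth is then determined
print(fifth)
```

Changing any of the four free values changes the fifth, but the fifth can never be chosen independently.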
Task: Learn how to use the t-Table in computing interval estimates of µ.

Historical Note
The t-distribution was formulated in 1908 by an Irish brewing employee
named W.S. Gosset. Gosset was involved in researching new methods of
manufacturing ale. Because brewing employees were not allowed to publish results,
Gosset published his findings under the pseudonym Student. Hence, the t-distribution
is sometimes called Student’s t-distribution.

The formula for computing the confidence interval using the t-distribution is:

$\bar{x} - t\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t\left(\frac{s}{\sqrt{n}}\right)$
The t-values found in the reproduced t-Table are the values of t that cut off a total
area of α in the two tails of the t-curve. They are called critical values of t in the sense that
they are the boundaries of the middle area where the true mean lies. Like the z, they are also
called confidence coefficients.


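The t-Table
Confidence coefficients (critical values of t) by confidence level; the remaining
area α = 1 − confidence level lies in the two tails.
Columns: sample size n, degrees of freedom (n - 1), then t for confidence levels 0.90, 0.95, 0.99
n (n-1) 0.90 0.95 0.99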
2 1 6.314 12.706 63.657
3 2 2.920 4.303 9.925
4 3 2.353 3.182 5.841
5 4 2.132 2.776 4.604
6 5 2.015 2.571 4.032
7 6 1.943 2.447 3.707
8 7 1.895 2.365 3.499
9 8 1.860 2.306 3.355
10 9 1.833 2.262 3.250
11 10 1.812 2.228 3.169
12 11 1.796 2.201 3.106
13 12 1.782 2.179 3.055
14 13 1.771 2.160 3.012
15 14 1.761 2.145 2.977
16 15 1.753 2.131 2.947
17 16 1.746 2.120 2.921
18 17 1.740 2.110 2.898
19 18 1.734 2.101 2.878
20 19 1.729 2.093 2.861
21 20 1.725 2.086 2.845
22 21 1.721 2.080 2.831
23 22 1.717 2.074 2.819
24 23 1.714 2.069 2.807
25 24 1.711 2.064 2.797
26 25 1.708 2.060 2.787
27 26 1.706 2.056 2.779
28 27 1.703 2.052 2.771
29 28 1.701 2.048 2.763
30 29 1.699 2.045 2.756
31 30 1.697 2.042 2.750
41 40 1.684 2.021 2.704
61 60 1.671 2.000 2.660
∞ ∞ 1.645 1.960 2.576

Note that in the table, the t values are based, not on sample size n, but on degrees
of freedom, n - 1. For example, for n = 20, the 0.95 (95%) confidence interval when σ is
known is $\bar{x} \pm 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$; but when σ is unknown and only s is available, the 0.95 confidence
interval is $\bar{x} \pm 2.09\left(\frac{s}{\sqrt{n}}\right)$. The confidence coefficient is 2.09. Likewise, in the t-table, for
n = 10, the 0.95 or 95% confidence interval is $\bar{x} \pm 2.26\left(\frac{s}{\sqrt{n}}\right)$. The confidence coefficient is
2.26.
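The effect of replacing 1.96 with a t-value can be seen by comparing the half-widths of the two intervals. The sketch below copies three entries from the 0.95 column of the t-Table; the value s = 4 and the names `T_95` and `half_width` are hypothetical and serve only as an illustration.

```python
import math

# Entries copied from the 0.95 column of the t-Table, keyed by degrees of freedom.
T_95 = {9: 2.262, 19: 2.093, 29: 2.045}
Z_95 = 1.96


def half_width(multiplier: float, s: float, n: int) -> float:
    """Half-width of the interval: multiplier * s / sqrt(n)."""
    return multiplier * s / math.sqrt(n)


s = 4.0  # hypothetical sample standard deviation
for n in (10, 20, 30):
    t = T_95[n - 1]  # look up by degrees of freedom, not by n
    print(f"n = {n}: z half-width = {half_width(Z_95, s, n):.2f}, "
          f"t half-width = {half_width(t, s, n):.2f}")
```

For every sample size the t-based interval is wider, and the gap narrows as n grows.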
Tasks: Observe the values of t associated with the sample size n.
 What happens to the values of t as n increases?
 What values do you observe when n = ∞?
 Discuss your observations.

Confidence Coefficients
A. Find the Confidence coefficients for each of the following:
1. n = 6, 90% confidence
2. n = 7, 90% confidence
3. n = 12, 95% confidence
4. n = 17, 95% confidence
5. n = 24, 99% confidence
B. Find E given the following:
1. n = 6, s = 2, 90% confidence
2. n = 9, s = 2.8, 90% confidence
3. n = 13, s = 4.5, 95% confidence
4. n = 16, s = 3.1, 95% confidence
5. n = 21, s = 5, 95% confidence
Since the population standard deviation σ and the standard deviation of the
sampling distribution of means $\sigma_{\bar{x}}$ are rarely known, the procedure involving t is typically
used in setting confidence intervals.


The following Four-Step Method is helpful in determining the interval estimate for
the population mean when σ is unknown.
Steps in Computing the Interval Estimate of the Population Mean When σ is
Unknown
Step 1: Describe the population parameter of interest.
Step 2: Specify the confidence interval criteria.
a. Check the assumptions
b. Determine the test statistic to be used. In this case, it is the t statistic.
c. State the level of confidence.
Step 3: Collect and present sample evidence.
a. Collect the sample information
b. Find the Point estimate.
Step 4: Determine the confidence interval.
a. Determine the confidence coefficients ($t_{\alpha/2}$) from the t-table.
b. Find $\frac{s}{\sqrt{n}}$.
c. Find the lower and upper confidence limits.
d. Describe the results.

Teaching Strategy
Tasks:
1. Use the Four-Step Method to find the estimates of the population means where
the experimental group and the control group belong, as given in In-Class Activity 1.
2. Fill in the blanks to complete the solution.

Solution:
Steps and Solutions
1. Describe the population parameter of interest.
   Solution: The 1st parameter of interest is the mean µ1 of the population where the
   experimental group belongs. The 2nd parameter of interest is the mean µ2 of the
   population where the control group belongs.


2. Specify the confidence interval criteria.
   a. Check the assumptions.
      Solution: The samples of size 20 for each group come from normally distributed parent
      populations and the σ for each group is unknown.
   b. Determine the test statistic to be used to calculate the interval.
      Solution: The test statistic is t, using s1 = ________ and s2 = _______, respectively.
   c. State the level of confidence.
      Solution: For a 95% confidence, α = 1 – 0.95 = 0.05. From the t-table, with df = 19 for
      each group, the confidence coefficients are _________ for each group.
3. Collect and present sample evidence.
   a. Collect the sample information.
      Solution: The sample information consists of 20 raw scores for each group.
      For the experimental group:
      n = 20, so df = 19
      $\bar{x}$ = 82.5, and s = ___________
      For the control group:
      n = 20, so df = 19
      $\bar{x}$ = 80, and s = ___________
   b. Find the point estimate.
      Solution: The point estimates for the population means are the sample means. Thus,
      the point estimate for µ1 is 82.5 and the point estimate for µ2 is 80.
b. 95% Confidence Interval
4. Determine the confidence interval.
   a. Determine the confidence coefficient.
      Solution: Since n = 20, then df = 19. The confidence coefficient in the t-table under
      0.95 (for 95%) is __.
   b. Find the maximum error E.
      Solution:
      For the experimental group:
      $E = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
      $E = \_\_\_\_\_\left(\frac{3}{\sqrt{20}}\right)$
      $E = \_\_\_\_\_(0.67)$
      $E$ = _________
      For the control group:
      $E = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
      $E = \_\_\_\_\_\left(\frac{6}{\sqrt{20}}\right)$
      $E = \_\_\_\_\_(1.34)$


      $E = 2.8$
   c. Find the lower and the upper confidence limits.
      Solution:
      For the experimental group:
      $\bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
      So, 82.5 – 1.40 = 81.1 (lower limit) and
      82.5 + 1.40 = 83.9 (upper limit)
      For the control group:
      ________ = 77.2
      ________ = 82.8
   d. Describe the results.
      Solution: We can say with 95% confidence that the interval between 81.1 and 83.9
      contains the true mean of the experimental population while the interval between
      77.2 and 82.8 contains the true mean of the control population based on the given
      sample data.
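Once the blanks have been filled in by hand, the limits above can be verified with a short script. This sketch assumes the confidence coefficient read from the t-Table at df = 19 under 0.95, which is 2.093; the names `t_interval` and `groups` are illustrative only.

```python
import math

T_95_DF19 = 2.093  # t-Table entry for df = 19 under the 0.95 column


def t_interval(x_bar: float, s: float, n: int, t: float):
    """Return (lower, upper) = x_bar -/+ t * s / sqrt(n)."""
    e = t * s / math.sqrt(n)
    return x_bar - e, x_bar + e


# Sample mean, sample standard deviation, and sample size for each group.
groups = {"experimental": (82.5, 3.0, 20), "control": (80.0, 6.0, 20)}
for name, (x_bar, s, n) in groups.items():
    lower, upper = t_interval(x_bar, s, n, T_95_DF19)
    print(f"{name}: {lower:.1f} to {upper:.1f}")
```

The printed limits should agree with the intervals stated in the description of the results.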

CONDUCTING HYPOTHESIS TESTING


The second area of statistical inference is hypothesis testing. A statistical
hypothesis is an assertion or a conjecture about one or more populations. It is formed from
the characteristics of an observed sample. In order to make it more meaningful and useful,
it should be subjected to a rigorous test. The whole process is referred to as hypothesis
testing.
Hypothesis testing is gaining wide acceptance in many situations where decisions
have to be carefully made. Hence, it is generally known as a decision-making process for
evaluating claims about a population based on the characteristics of a sample purportedly
coming from that population.
UNDERSTANDING HYPOTHESIS TESTING
Lesson Objectives
At the end of this lesson, you are expected to:
 Understand the idea behind hypothesis testing;
 Define and formulate statistical hypothesis;
 Distinguish null hypothesis from alternative hypothesis;
 Determine whether a hypothesis test is non-directional or directional;
 Determine whether a directional test is left-tailed or right-tailed; and
 Sketch the graph of a mathematical model for testing hypothesis.


In Statistics, decision-making starts with a concern about a population regarding
its characteristics denoted by parameter values. We might be interested in the population
parameter like the mean or the proportion. For example, what makes a farmer decide
when to plant a corn crop? Naturally, the decision will be based on a set of observations as
to when the best time of the year to plant the crop is. Further considerations for
decision-making may include environmental conditions, manpower and equipment availability,
and the need for other resources. In like manner, a wise fisherman looks into several
factors before deciding to go out to catch fish in the sea. In a community, a politician may
want to know the probability that the voters approve an agenda on environmental
awareness. These concerns can be addressed in a procedure in Statistics called hypothesis
testing.
Hypothesis testing is another area of Inferential Statistics. How does it differ from
estimation that was taken up in the previous chapter? While estimation is concerned with
determining specific parameter values, testing hypotheses is hypothesizing about the
population parameter and subjecting this hypothesis to a test. How do we do it? We get
a random sample from the population, collect data from the sample, and use this data to
make a decision as to whether the hypothesis is acceptable or not.
There are two types of hypotheses: the null and the alternative hypothesis. The null
hypothesis is what we want to test. It states an exact value about the parameter. When
the null hypothesis is accepted, the buck stops right there! But when the null
hypothesis is rejected, this leads to another option, which is the alternative hypothesis
that allows for the possibility of many values.
Hypothesis testing is a decision-making process for evaluating claims about a
population based on the characteristics of a sample purportedly coming from that
population. The decision is whether the characteristic is acceptable or not.
In short, the process of hypothesis testing involves making a decision between two
opposing hypotheses. These two hypotheses are formulated in such a way that one is a
negation of the other. If one is true, the other must be false. That is why one hypothesis
is tested to show that it cannot happen. If the improbability of occurrence can be
established, then the other hypothesis is likely to occur. It is usually the null hypothesis
that is subjected to the rigor of a statistical test.
The null hypothesis, denoted by 𝐻0 , is a statement that there is no difference
between a parameter and a specific value, or that there is no difference between two
parameters.


The alternative hypothesis, denoted by $H_1$, is a statement that there is a difference
between a parameter and a specific value, or that there is a difference between two
parameters.
Suppose the two parameters of interest are denoted by $\mu_1$ and $\mu_2$. If there is no
difference between two values, the relationship is written in symbols as:
𝜇1 − 𝜇2 = 0
So, the null hypothesis would be written in symbols as:
𝐻0 ∶ 𝜇1 = 𝜇2
The null hypothesis is the starting point of investigation. Thus, it is the first
statement to be made. You might ask: Why start with the null hypothesis? The sequence
of the arguments is like the situation of a case brought to court where the accused is
presumed ‘not guilty’ at the start. Then, evidence is collected and evaluated following
a standard procedure. At the end of the process, a decision is made as to whether ‘not
guilty’ should be rejected or not rejected.
Toward the end of a hypothesis testing exercise, based on the evaluation of the
data at hand, a decision is made about the null hypothesis: Should the null hypothesis, $H_0$,
be rejected or not rejected (i.e. accepted)? It is logical to state that if there is
evidence to warrant the rejection of the null hypothesis, then there is a standby
hypothesis to be accepted. This is the role of the alternative hypothesis. Should you
decide to accept the null hypothesis after considering the evidence, then you can stop
there as there is no need for an alternative hypothesis. Remember: Nothing can both be
and not be at the same time. This is a mathematical principle.
Formulating Hypotheses
Tasks:

 Formulate a null hypothesis and the alternative hypothesis for each of the
following.
 Write them in symbols.
1. The average TV viewing time of all five-year-olds is 4 hours daily.
2. A college librarian claims that 20 storybooks on the average are borrowed daily.
3. The mean performance of all grade six level of a school in the NAT is 35.
4. The inventor of a new kind of light bulb claims that all such bulbs last as long as
3000 hours.


5. The average age of all the identified stratum of senior citizens in a remote area is 92
years.

Applying Hypothesis Testing


Tasks:
 Study the steps in solving the problems in the following examples.
 Note the procedures and symbols used.
 Note interpretations made on the results.
Example 1: Bottled Fruit Juice Content
The owner of a factory that sells a particular bottled fruit juice claims that the
average capacity of a bottle is 250 ml. Is the claim true?
To test the claim, the members of a consumer group did the following:
1. Get a sample of 100 such bottles.
2. Calculate the capacity of each bottle.
3. Compare the sample mean and the claim.
The observed mean capacity, $\bar{x}$, of the 100 bottles is 243 ml. The sample standard
deviation is 10 ml. In the example, the owner’s statement (called claim) is a general
statement. The claim is that the capacity of all of their bottled products is 250 ml per
bottle. So, the population mean is 250 ml. On the other hand, the consumer group has a
sample value which is $\bar{x}$ = 243 ml, clearly a sample mean. There is a difference of 7 ml.
Can the consumer group generalize that the bottled product is short of the claim? If this
can be proven, then the factory owner is lying. The evidence has to be established. So, the
consumer group gets interested in the population mean. They are interested to know if,
in reality, each bottle contains 250 ml.
Thus, the two hypotheses would be:
𝐻0 : The bottled drinks contain 250 ml per bottle. (This is the claim.)
𝐻1 : The bottled drinks do not contain 250 ml per bottle. (This is the opposite of the
claim.)
But these statements should be written in symbols. For now, let us drop the unit
measure and simply write:
𝐻0 : 𝜇 = 250 and 𝐻1 : 𝜇 ≠ 250
The expression may be interpreted as follows:


1. The sample comes from a population whose mean µ is 250.
2. The sample comes from a population whose mean is equal to the population mean
250. (claim)
If µ1 (read as “myu sub one”) is the mean of the population where the sample comes from and
µ2 is the population mean (the claim), then the null hypothesis may also be written as:
$H_0: \mu_1 = \mu_2$ and the alternative is $H_1: \mu_1 \neq \mu_2$
In mathematics, the symbol ≠ in the alternative hypothesis suggests either a
greater than (>) or a less than (<) relation. What is the interpretation of the symbol ≠ in
the example? It means that the consumer group has no particular direction in mind: the
population mean could be either greater than 250 or less than 250. However, this does not
make sense in the given exercise. The consumer group has a purpose, a direction. The consumer
group may want to refute the claim. So, the appropriate alternative hypothesis is:
𝐻1 ∶ µ < 250
When the alternative hypothesis utilizes the ≠ symbol, the test is said to be non-
directional.
When the alternative hypothesis utilizes the > or < symbol, the test is said to be
directional.
Task 1: Explain why in the given exercise, the statement $H_1: \mu \neq 250$ is not a good
alternative hypothesis.
In problems that involve hypothesis testing, there are words like greater, efficient,
improves, effective, increases and so on that suggest a right-tailed direction in the
formulation of the alternative hypothesis. Words like decrease, less than, smaller, and the
like suggest a left-tailed direction.
Task 2: Study the formulation of the hypothesis in the following examples carefully.
Example 2: Music and Studies
A teacher wants to know if listening to popular music affects the performance of
pupils. A class of 50 grade 1 pupils was used in the experiment. The mean score was 83
and the standard deviation was 5. A previous study revealed that µ = 82 and the standard
deviation is 10.
1. State the null and the alternative hypothesis in words and in symbols.
2. State whether the test is directional or non-directional.


Solution:
The parameter of interest is the population mean µ = 82.
1. In words, the hypotheses are:
$H_0$: The sample comes from a population whose mean is 82; or
$H_0$: The sample comes from a population whose mean is equal to the population
mean 82.
$H_1$: The sample does not come from a population whose mean is 82.
In symbols, we write:
𝐻0 : 𝜇 = 82 and 𝐻1 : 𝜇 ≠ 82
2. There is no clue as to the direction of the investigation. The phrase affects
performance implies either an increase or a decrease in performance. So, the test
is non-directional.
Example 3: Organic Fertilizers
A farmer believes that using organic fertilizers on his plants will yield greater
income. His average income from the past was P200,000.00 per year. State the
hypotheses in symbols.
Solution:
$H_0: \mu = \text{P}200{,}000.00$
The phrase ‘greater income’ is associated with the greater than direction.
$H_1: \mu > \text{P}200{,}000.00$
Task 3: Write the null hypothesis and the alternative hypothesis in words and in symbols
for each of the following.
1. The net weight of a packet of a snack brand is 130 g. A sample of 80 packets yielded
a sample mean weight of 122 g with a standard deviation of 15 g.
2. In a graduate college, the average length of registration time during a semester is
120 minutes with a standard deviation of 25 minutes. With the registration
procedure, a random sample of 50 students took an average of 80 minutes with a
standard deviation of 12 minutes.
3. The average height of grade 8 female students is 158.2 cm. The mean height of a
sample of 100 female students is 160 cm with a standard deviation of 6 cm.


Sketches of Directional and Non-Directional Tests


 Read the following section carefully.
 Note the graphical representations of directional and non-directional tests at
the tails of the normal curve distribution.
 Discuss with your classmates.
Recall that the normal curve evolved from the probability distribution. With the
area under the curve being 1, it has become a mathematical model in hypothesis testing.
The areas are probability values that we need for decision-making. In hypothesis testing,
we determine the probability of obtaining the sample results if the null hypothesis is true.
Thus, the calculations can be graphically represented by using the normal curve. The
greater than (>) the mean direction can be shown at the right tail of the curve, just as the
less than (<) the mean direction can be shown at the left tail.
A non-directional test is also called a two-tailed test.
A directional test may either be left-tailed or right tailed.
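These are the graphical representations of the two-tailed test and one-tailed test.

Non-Directional (Two-tailed)
The probability is found on both tails of the distribution.
[Figure: normal curve centered at µ, with area α/2 in each tail and area 1 − α in the middle]

Directional (One-tailed, left tail)
The probability is found on the left tail of the distribution.
[Figure: normal curve centered at µ, with area α in the left tail and area 1 − α to its right]

Directional (One-tailed, right tail)
The probability is found on the right tail of the distribution.
[Figure: normal curve centered at µ, with area α in the right tail and area 1 − α to its left]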
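The boundaries of the shaded tail areas in these sketches can be computed from the standard normal curve. The following is a minimal sketch, assuming Python's standard `statistics` module; the function name `critical_values` and the tail labels are illustrative and not part of the lesson.

```python
from statistics import NormalDist


def critical_values(alpha: float, tail: str):
    """Return the z boundary (or boundaries) of the tail area for a test.

    tail is "two" (non-directional), "left", or "right" (directional).
    """
    z = NormalDist()  # standard normal curve: mean 0, standard deviation 1
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-c, c)
    if tail == "left":
        return (z.inv_cdf(alpha),)    # all of alpha in the left tail
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)  # all of alpha in the right tail
    raise ValueError("tail must be 'two', 'left', or 'right'")


for tail in ("two", "left", "right"):
    values = ", ".join(f"{v:.3f}" for v in critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: {values}")
```

Note that for the same α, a one-tailed boundary lies closer to the mean than the two-tailed boundaries, because the whole of α sits in a single tail.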


Is the test two-tailed or one-tailed?


Tasks:
 Determine whether the test is two-tailed or one tailed. If it is one-tailed, is it
left-tailed or right-tailed?
 Sketch the graphical representation of the test.
1. A nutritionist claims her developed bread is fortified with Vitamin B.
2. A musician believes that listening to classical music affects mood.
3. A storekeeper thinks that time of day influences sale of ice cream.
4. A mother wants to prove that reading books to children improves their
thinking processes.
5. A certain combination of fruits provides the daily requirement for Vitamin
C.
Exercises
1. What is a hypothesis?
2. What are the types of hypotheses?
3. Define Hypothesis testing.
4. State the null hypothesis and the alternative hypothesis in (a) words and in (b)
symbols for each of the following:
a. A librarian of a school claims that their grade 8 students read an average of
10 storybooks a month with a standard deviation of 2 books. A random
sample of grade 8 students read an average of 12 books a month with a
standard deviation of 1 book. The confidence level is 95%.
b. According to a factory employer, the mean working time of workers in the
factory is 6 hours with a standard deviation of 0.5 hours. A researcher
interviewed 50% of the employees and found out that their mean working
time is 8 hours with a standard deviation of 1 hour. The α level is 0.05.
c. A random sample of 200 students got a mean score of 62 with a standard
deviation score of 5 in a knowledge test in mathematics. In the
standardization of the test µ = 50 with σ = 10.
5. Sketch the graph of each of the items in number 4.


EXPLORING MORE ELEMENTS OF HYPOTHESIS TESTING


Lesson Objectives
At the end of this lesson, you are expected to:
 Understand the concept of Type I and Type II errors;
 Connect error to the process of hypothesis testing;
 Locate critical values under the normal curve;
 Determine critical values for hypothesis testing; and
 Make a decision about the null hypothesis.

What mistakes do people make?


Read the following statements and identify the errors.
1. Bryan thinks that he is a six-footer. His actual height is 156 cm.
2. On a moonlit night, a young man declares that there are two full moons.
3. Mark says “I’m victorious!” In the next moment, he finds himself in jail.
4. Thousands of years ago, Ptolemy declared that the earth is flat.
5. On a beachfront, a signage reads, “No littering of plastic wrappers, empty bottles,
and cans.” A few yards away, environmentalists are picking up the rubbish left by
picnic lovers.
Understanding the Decision Grid
Table 5.1. Four Possible Outcomes in Decision-Making

                        Decisions about the $H_0$
Reality                 Reject $H_0$          Do not reject $H_0$ (or Accept $H_0$)
$H_0$ is true           Type I error          Correct decision
$H_0$ is false          Correct decision      Type II error

If the null hypothesis is true and accepted, or if it is false and rejected, the decision
is correct. If the null hypothesis is true and rejected, the decision is incorrect and this is a


Type I error. If the null hypothesis is false and accepted, the decision is incorrect and this
is a Type II error.
In an ideal situation, there is no error when we accept the truth and reject what is
false.
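The decision grid can be written as a small function that takes the reality and the decision and returns the outcome. This is only a sketch; the name `decision_outcome` and its two parameters are illustrative.

```python
def decision_outcome(h0_is_true: bool, reject_h0: bool) -> str:
    """Classify a decision about the null hypothesis using the four-cell grid."""
    if h0_is_true and reject_h0:
        return "Type I error"      # a true null hypothesis is rejected
    if not h0_is_true and not reject_h0:
        return "Type II error"     # a false null hypothesis is accepted
    return "Correct decision"      # true and accepted, or false and rejected


for h0_is_true in (True, False):
    for reject_h0 in (True, False):
        print(f"H0 true: {h0_is_true}, reject: {reject_h0} -> "
              f"{decision_outcome(h0_is_true, reject_h0)}")
```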

Understanding Errors
Task: Study the following examples carefully and the notes that follow. Discuss for the
better understanding of hypothesis testing.

Example 1: Maria’s Age


Maria insists that she is 30 years old when, in fact, she is 32 years old. What error is Maria
committing?
Solution:
Maria is rejecting the truth. She is committing a Type I error.

Example 2: Stephen’s Hairline


Stephen says that he is bald. His hairline is just receding. Is he committing an error?
If so, what type of error?
Solution:
Yes. A receding hairline indicates balding. This is a Type I error. Stephen's action may
be to find remedial measures to stop falling hair.

Example 3: Monkey-Eating Eagle Hunt


A man plans to go hunting the Philippine Monkey-Eating Eagle believing that it is a
proof of his mettle. What type of error is this?
Solution:
Hunting the Philippine Monkey-Eating Eagle is prohibited by law. Thus, it is not a
good sport. It is a Type II error. Since hunting the Philippine Eagle is against the law, the
man may find himself in jail if he goes out of his way hunting endangered species.
In the decisions that we make, we form conclusions, and these conclusions are the
bases of our actions. In Statistics, however, a conclusion is not always certain because we
make decisions based on sample information. The best that we can do is to control the
probability with which an error occurs.
The probability of committing a Type I error is denoted by the Greek letter α (alpha),
while the probability of committing a Type II error is denoted by β (beta).
The following table shows the probability with which decisions occur.
Table 5.2. Types of Errors
Error in Decision     Type   Probability     Correct Decision      Type   Probability
Reject a true Ho      I      α               Accept a true Ho      A      1 – α
Accept a false Ho     II     β               Reject a false Ho     B      1 – β

We can control the errors by assigning small probability values to each of them.
The most frequently used probability values for α and β are 0.05 and 0.01. The probability
assigned to each depends on its seriousness. The more serious the error, the less willing
we are to have it occur, so a smaller probability is assigned. The symbols α and β
are each probabilities of error, each under separate conditions, and they cannot be
combined. Therefore, there is no single probability for making an incorrect decision. In
like manner, the two correct decisions are distinct and each has its own probability. As can be
seen in Table 5.2, 1 – α is the probability of a correct decision when the null hypothesis is
true, and 1 – β is the probability of a correct decision when the null hypothesis is false.
1 – β is called the power of the statistical test since it is a measure of the ability of a
hypothesis test to reject a false null hypothesis, which is considered very important
(McClave & Sincich, 2003).
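The meaning of α as a long-run error rate can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with mean 0 and standard deviation 1 (so Ho is true by construction), the test is two-tailed with critical values ±1.96, and the function name, sample size, number of trials, and seed are all hypothetical choices that do not come from the lesson.

```python
import math
import random


def type_i_error_rate(z_crit=1.96, n=30, trials=4000, seed=1):
    """Estimate how often a TRUE Ho is rejected by a two-tailed z test."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    mu, sigma = 0.0, 1.0           # assumed population; Ho: mean = mu is true
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        z = (xbar - mu) / (sigma / math.sqrt(n))
        if abs(z) >= z_crit:       # z falls in the rejection region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

Because Ho is true in every trial, each rejection is a Type I error, so the printed proportion should come out close to α = 0.05.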
Graphically, we can show the decision errors under the normal curve.

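[Figure: two normal curves centered at µ. First figure (directional test): the rejection region "Reject Ho" lies in one tail with area α = 0.05; the remaining area is 1 – α. Second figure (non-directional test): the rejection regions "Reject Ho" lie in both tails, each with area α/2 = 0.025; the central area is 1 – α.]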

Note that the rejection region for a directional test is in one tail (first figure) but
distributed to the two tails in a non-directional test (second figure).
Under the normal curve, the rejection region is the set of values of the test statistic
for which we will reject the null hypothesis. This region is also
called the critical region.
So, if your computed statistic is found in the rejection region, then you reject Ho.
If it is found outside the rejection region, you accept Ho.
Note also the line that separates the rejection region from the non-rejection region
(1 – α). This line passes through the confidence coefficients, which are also called critical
values. The critical values can be obtained from the critical values table of the test
statistic. For example, if the test statistic is a z, the critical values can be obtained from
the z-table. So, for a 95% confidence level, the critical values for a non-directional test are
-1.96 and +1.96. When the confidence level is 99%, for a non-directional test, the critical
values are -2.58 and +2.58.
Determining the Critical Values
Task: Study how the critical values are determined under the normal curve.
Recall that the critical values are the z-values associated with the probabilities at
the tails of the normal curve.
For a 95% confidence level:
$\frac{0.95}{2} = 0.4750$ (expressed up to four decimal places so that we can identify an
area in the normal curve table as close as possible to this value).
In the normal curve table, the area 0.4750 corresponds to z = 1.96.
Thus, the critical values for a 95% confidence level are -1.96 and +1.96.
[Figure: normal curve centered at µ with a 95% central area, 0.025 in each tail, and critical values -1.96 and +1.96.]


For a 99% confidence level:
$\frac{0.99}{2} = 0.4950$ (expressed up to four decimal places so that we can identify an
area in the normal curve table as close as possible to this value).
In the normal curve table, there are two areas close to this value: 0.4949, which
corresponds to z = 2.57, and 0.4951, which corresponds to z = 2.58. We then take the average
of the two z-values, which gives z = 2.575. In practice, we use the z-values ±2.58.
Thus, for a 99% confidence level, the critical values are -2.58 and +2.58.
[Figure: normal curve centered at µ with a 99% central area, 0.005 in each tail, and critical values -2.58 and +2.58.]
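The table lookup described above can be cross-checked with software. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` is an acceptable stand-in for the printed normal curve table; the helper name `critical_z` and its `tails` parameter are hypothetical and not part of the lesson.

```python
from statistics import NormalDist


def critical_z(confidence, tails=2):
    """Positive critical z-value for a given confidence level.

    tails=2 splits alpha equally between both tails (non-directional test);
    tails=1 places all of alpha in a single tail (directional test).
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    alpha = 1 - confidence
    tail_area = alpha / tails
    # inv_cdf returns the z-value with the given cumulative area to its left
    return NormalDist().inv_cdf(1 - tail_area)


z95 = critical_z(0.95)
z99 = critical_z(0.99)
print(f"95%: ±{z95:.2f}")
print(f"99%: ±{z99:.3f}")
```

The software value for 99% is about 2.576, which agrees with the averaged table value 2.575 up to rounding and is reported as ±2.58 in practice.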

Finding Critical Values


Tasks:
1. For a 95% confidence level, what are the critical values for a one-tailed test?
2. For a 99% confidence level, what are the critical values for a one-tailed test?
3. Complete the following summary table of critical values.
Table 5.4. Summary of Critical Values
Confidence Level     Two-tailed                               One-tailed
95% (1 – α)          $-z_{\alpha/2}$ = ____________           -z = ____________
                     $+z_{\alpha/2}$ = ____________           +z = ____________
99% (1 – α)          $-z_{\alpha/2}$ = ____________           -z = ____________
                     $+z_{\alpha/2}$ = ____________           +z = ____________


Exercises
1. Between Ho and H1, what is a good reason for starting a hypothesis test with
Ho?
2. Suppose it is the Christmas season and Janine thinks that it is the month of
January, what error is she committing?
3. What type of error is committed when you reject a null hypothesis when, in
fact, it is true?
4. Determine whether the test is directional or non-directional if:
a. A researcher claims that the method of teaching affects learning.
b. A food additive enhances food flavor.
c. A study habit improves the memory.
d. Health is related to lifestyle.
e. People's culture affects tourism.
5. Draw a normal curve for a 95% confidence and show z = 1.5. What decision can
you associate with this z-value? Why?

CONDUCTING HYPOTHESIS TEST USING THE TRADITIONAL METHOD
Lesson Objectives
At the end of this lesson, you are expected to:
 Understand the underlying procedure in hypothesis testing; and
 Conduct a statistical test using the traditional method.

After exploring the beginning elements of hypothesis testing, you are now ready
to engage in it. Your knowledge on this will enable you to apply your skills in research and
problem-solving exercises. More importantly, you become a polished decision-maker.
In hypothesis testing, we employ a logical sequence of steps and procedures. The
practical statistical procedures that we employ in hypothesis testing are called tests of
significance.
 The probability of committing a Type I error is called the significance level of a
test.
 For any hypothesis test,
α = probability of committing a Type I error.
The p-value computed from the sample is compared with α to reach a decision.

Usually, the level of significance is an arbitrary choice. In practice, the level of
significance α is pegged at 0.05 or 0.01.
For example, suppose we want to compare two means. Mathematically, these two
means are different. However, are they significantly different?
 If p ≤ 0.05, where p is the probability of asserting that there is a difference
when no such difference between the two means exists, then the difference is
said to be significant at the 0.05 (5%) or less level.
 If p ≤ 0.01, the difference is said to be significant at the 0.01 (1%) or less level.
 If p ≤ 0.001, the difference is said to be highly significant.
Suppose I select α = 0.05. What am I saying about the Type I error? In this case, Type I
error is somewhat serious. I am willing to state that the probability is 5/100 (or 1 out of 20)
that I am wrong in rejecting a null hypothesis that is true.
[Figure: normal curve centered at µ with a 95% central area, α/2 = 0.025 in each tail, and critical values -1.96 and +1.96.]
We can conduct a hypothesis test in two ways:
1. Traditional or classical method; and
2. P-value method (to be discussed in another lesson)
Why is it important to learn both? The traditional method has been used since the
hypothesis testing procedure was formulated. The p-value method has become popular
with easy access to computer software and high-powered statistical calculators. However,
we should know how the probabilities are determined. For each method, there is a
corresponding Decision Rule that guides us in making conclusions and interpretations.

Representing Decisions
The mathematical model applied in hypothesis testing is the normal distribution
curve.
One of the processes in hypothesis testing is the calculation of the test statistic. What is
a test statistic? A test statistic is a value used to determine the probability needed in
decision-making. In the traditional method of hypothesis testing, the test statistic is the
value, determined by a computational formula, that is compared with a confidence
coefficient (like 1.96 and 2.58). The decision that we make depends on the computed test
statistic. The formula for computing the test statistic depends on the sample size.
Increasing the sample size has an effect on the shape of the sampling distribution. This is
specified by the Central Limit Theorem (CLT), which states, in part, that as sample size
increases, the sampling distribution of the mean approaches the normal distribution,
regardless of the shape of the parent population distribution. However, for the CLT to hold,
sampling must be random.
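The Central Limit Theorem can be seen at work in a short simulation. This is a sketch under assumptions that are not in the lesson: the parent population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed shape), and the function name, sample size, number of repetitions, and seed are hypothetical.

```python
import math
import random
import statistics


def sample_means(n, reps=2000, seed=0):
    """Draw `reps` random samples of size n from a skewed population
    (exponential, mean 1, standard deviation 1) and return their means."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(reps)]


n = 50
means = sample_means(n)
center = statistics.fmean(means)         # should be near the population mean 1
spread = statistics.stdev(means)         # should be near sigma / sqrt(n)
expected_spread = 1.0 / math.sqrt(n)
# Share of sample means within 1.96 standard errors of the population mean
share = sum(abs(m - 1.0) <= 1.96 * expected_spread for m in means) / len(means)

print(f"mean of sample means  : {center:.3f}")
print(f"std. of sample means  : {spread:.3f} (sigma/sqrt(n) = {expected_spread:.3f})")
print(f"share within ±1.96 SE : {share:.3f}")
```

Even though the parent population is far from normal, roughly 95% of the sample means should fall within ±1.96 standard errors of µ, which is what the normal model predicts.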
Steps in Traditional Method of Hypothesis Testing
Step 1 Describe the population parameter of interest (e.g., mean, proportion)
Step 2 Formulate the hypothesis: the null hypothesis and the alternative
hypothesis. That is, state a null hypothesis, in such a way that a Type I error
can be calculated.
Step 3 Check the assumptions.
 Is the sample size large enough to apply the Central Limit Theorem
(CLT)?
 Do small samples come from normally distributed populations?
 Are the samples selected randomly?
Step 4 Choose a significance level size for α. Make α small when the consequence
of rejecting a true Ho is severe.
 Is the test two-tailed or one-tailed?
 Get the critical values from the test statistic table.
 Establish the critical regions.
(Optional: Draw a normal curve, draw vertical lines through the critical values, and
shade the rejection region.)
Step 5 Select the appropriate test statistic.
 Compute the test statistic using the appropriate formula.
Step 6 State the decision rule for rejecting or not rejecting the null hypothesis.
For a two-tailed test:
Reject Ho if the computed test statistic ≤ negative critical value or if the
computed test statistic ≥ positive critical value.
Do not reject (that is, accept) Ho if the computed test statistic > negative
critical value and the computed test statistic < positive critical value.
In symbols, we write the rule as follows:
Reject Ho if the computed $z \le -z_{\alpha/2}$ or if the computed $z \ge +z_{\alpha/2}$.
Do not reject (that is, accept) Ho if $-z_{\alpha/2} < z < +z_{\alpha/2}$.

For a one-tailed test:
Reject Ho if the computed z ≤ -z critical value (left-tailed test) or if the
computed z ≥ +z critical value (right-tailed test).
Do not reject (that is, accept) Ho if the computed z > -z critical value (left-tailed
test) or if the computed z < +z critical value (right-tailed test).
Step 7 Compare the computed test statistic and the critical value. Then, based on
the decision rule, decide whether to reject or not to reject (accept) Ho.
Interpret the result.
(Optional: Take a course of action.)
A sketch of a normal curve is drawn (step 4) to show whether the computed
statistic lies in the rejection region or in the acceptance region.
In the following tests, note the symbols used and the interpretations associated
with the procedures. Remember that the decision is based on a comparative statement
about the computed value of the test statistic and the critical value (step 7).
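The decision rule in Step 6 can be written as a small function. This is only a sketch: the function name `decide`, the `tail` labels, and the z-values in the usage lines are hypothetical illustrations, not values from the lesson.

```python
def decide(z, z_crit, tail="two"):
    """Apply the traditional decision rule.

    z      : computed test statistic
    z_crit : positive critical value (for example 1.96 for a two-tailed
             test at the 95% confidence level)
    tail   : "two", "left", or "right"
    """
    if z_crit <= 0:
        raise ValueError("z_crit must be the positive critical value")
    if tail == "two":
        reject = z <= -z_crit or z >= z_crit
    elif tail == "left":
        reject = z <= -z_crit
    elif tail == "right":
        reject = z >= z_crit
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return "reject Ho" if reject else "do not reject Ho"


# Hypothetical computed z-values, for illustration only
print(decide(2.10, 1.96))           # beyond +1.96, in the rejection region
print(decide(-1.20, 1.96))          # between -1.96 and +1.96
print(decide(-1.80, 1.645, "left"))
```

Passing the positive critical value and letting the function apply the sign keeps the two-tailed and one-tailed rules in one place.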
Large-Sample Test Concerning the Mean µ of a Population
A one-population test is a test conducted on one sample purportedly coming from
a population with mean µ. It is sometimes called a significance test for a single mean.
There are two points to consider when testing the mean of a single population:
1. The sample is large enough (n ≥ 30). Thus, we can apply the Central Limit Theorem
(CLT) and we use the normal curve as a model.
2. When the CLT is applied, the sample standard deviation s may be used as an estimate
of the population standard deviation σ when the value of σ is unknown.
When the sample is large, that is, n ≥ 30, the test statistic is the z. The z statistic measures
the number of standard deviations between the observed value of 𝑥̅ and the null
hypothesized value of µ.
We consider two cases when conducting a significance test for a single mean:
Case 1. The population mean µ and the population standard deviation σ are known.
Test statistic: $z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, where $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Example 1: Computing z
Given 𝑥̅ = 90, µ = 88, σ = 6, n = 100. Find the value of z.
Steps and Solution
1. Write the computing formula: $z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, which simplifies to $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
2. Replace the terms in the formula by the given values: $z = \dfrac{90 - 88}{6 / \sqrt{100}} = \dfrac{2}{0.6} = 3.33$

Case 2. The population mean µ is known but not the population standard deviation σ.
Test statistic: $z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, where $\sigma_{\bar{x}}$ is estimated by $\dfrac{s}{\sqrt{n}}$
Note that in the Case 2 statistic, the sample standard deviation s is used as an
estimate for the population standard deviation σ.
Example 2: Computing the z value given s
Given 𝑥̅ = 80, µ = 83, s = 4, n = 100. Find the value of z.
Steps and Solution
1. Write the computing formula: $z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, which simplifies to $z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$.
2. Replace the terms in the formula by the given values: $z = \dfrac{80 - 83}{4 / \sqrt{100}} = \dfrac{-3}{0.4} = -7.5$
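The two computations in Examples 1 and 2 can be checked with a few lines of code. This is a minimal sketch; the function name `z_statistic` and its parameter names are hypothetical, and the same function serves both cases because Case 2 simply passes s in place of σ.

```python
import math


def z_statistic(xbar, mu, sd, n):
    """z = (xbar - mu) / (sd / sqrt(n)).

    sd is the population standard deviation (Case 1) or, when that is
    unknown and n is large, the sample standard deviation s (Case 2).
    """
    return (xbar - mu) / (sd / math.sqrt(n))


z1 = z_statistic(xbar=90, mu=88, sd=6, n=100)   # Example 1
z2 = z_statistic(xbar=80, mu=83, sd=4, n=100)   # Example 2
print(round(z1, 2), round(z2, 2))               # prints 3.33 -7.5
```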
Applying Hypothesis Testing in Problem Solving
Example 3: Problem-Solving Performance
A researcher administered a developed problem-solving test to 50 randomly selected
Grade 6 pupils. In this sample, 𝑥̅ = 80 and s = 10. The mean µ and the standard deviation σ
of the population used in the standardization of the test were 75 and 15, respectively.
Use the 95% confidence level to answer the following questions:
1. Does the sample mean differ significantly from the population mean?
2. Can it be said that the sample is above average?

JUBERT B. OLIGO, MST-MATHEMATICS


FIRST SEMESTER 2018-2019
55

Solving for Question Number 1:


Steps and Answers
Step 1 Describe the population parameter of interest.
Answer: The parameter of interest is the mean µ of the population where the sample
comes from.
Step 2 Formulate the hypotheses: the null hypothesis and the alternative
hypothesis. That is, state a null hypothesis, Ho, in such a way that a Type I error
can be calculated.
Answer: Ho: µ = 75
H1: µ ≠ 75
Step 3 Check the assumptions.
 Is the sample size large enough to apply the Central Limit Theorem (CLT)?
Answer: Since n = 50, by the Central Limit Theorem, the sampling distribution of the
mean is approximately normal.
 Do small samples come from normally distributed populations?
Answer: (This assumption need not be addressed.)
 Are the samples selected randomly?
Answer: Yes.
Step 4 Choose a significance level size for α.
Answer: α = 1 − 0.95 = 0.05
 Is the test two-tailed or one-tailed?
Answer: Two-tailed
 Get the critical values from the test statistic table.
Answer: The z critical values are ±1.96.
 Establish the critical regions.
[Figure: normal curve centered at µ with a 95% central area, α/2 = 0.025 in each tail, and critical values -1.96 and +1.96.]
Step 5 Select the appropriate test statistic.
Answer: Test statistic: z, with σ = 15.
 Compute the test statistic.
Answer: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{80 - 75}{15 / \sqrt{50}} = \dfrac{5}{2.12} = 2.36$
Step 6 State the decision rule.
Answer: Reject Ho if the computed test statistic ≤ negative critical value or if the
computed test statistic ≥ positive critical value. Otherwise, do not reject (or accept) Ho.
Step 7 Compare the computed test statistic and the critical value.
 Based on the decision rule, decide whether to reject or accept Ho.
Answer: Decision-making: 2.36 > 1.96. The null hypothesis is rejected.
 Interpret the result.
Answer: Interpretation:
 There is enough evidence to reject the null hypothesis.
 There is a significant difference between the sample mean and the population mean.
 Take a course of action (optional).

In the graph of the normal curve, the computed z-value is located outside the
acceptance region. So, the null hypothesis has to be rejected.

[Figure: normal curve centered at µ with a 95% central area, α/2 = 0.025 in each tail, and critical values -1.96 and +1.96.]
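The computation and decision for Question 1 of Example 3 can be reproduced with a short script. This is a sketch that assumes Python's standard `statistics.NormalDist` in place of the z-table; the variable names are hypothetical, while the numbers (sample mean 80, hypothesized mean 75, σ = 15, n = 50, 95% confidence) are the ones given in the example.

```python
import math
from statistics import NormalDist

# Values given in Example 3
xbar, mu0, sigma, n = 80, 75, 15, 50
confidence = 0.95

# Step 5: compute the test statistic
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Step 4: two-tailed critical value (software replaces the z-table here)
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Steps 6 and 7: apply the two-tailed decision rule
reject = z <= -z_crit or z >= z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}")
print("Reject Ho" if reject else "Do not reject Ho")
```

The script arrives at the same decision as the worked solution: the computed z is larger than +1.96, so Ho is rejected.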
