Suppose three cell phones are tested at random. We want to find out the
number of defective cell phones that occur. Thus, to each outcome in the sample space
we shall assign a value. These are 0, 1, 2, or 3. If there is no defective cell phone, we
assign the number 0; if there is 1 defective cell phone, we assign the number 1; if there
are two defective cell phones, we assign the number 2; and 3, if there are three
defective cell phones. The number of defective cell phones is a random variable. The
possible values of this random variable are 0, 1, 2, and 3.
Illustration
Let D represent a defective cell phone and N represent a non-defective cell
phone. If we let X be the random variable representing the number of defective cell
phones, can you show the values of random variable X? Complete the table below to show
the values of the random variable.
The completed table should look like this.
Possible Outcomes    Value of the Random Variable X (number of defective cell phones)
NNN 0
NND 1
NDN 1
DNN 1
NDD 2
DND 2
DDN 2
DDD 3
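The table above can be reproduced by enumeration. The following is a minimal sketch, assuming each phone is labeled N or D as in the illustration; the variable names are illustrative only.

```python
from itertools import product

# Each of the three phones is either non-defective (N) or defective (D).
outcomes = ["".join(p) for p in product("ND", repeat=3)]

# The random variable X assigns to each outcome its number of defective phones.
x_values = {outcome: outcome.count("D") for outcome in outcomes}

for outcome, x in x_values.items():
    print(outcome, x)
```

Listing the outcomes this way makes it harder to miss one of the eight outcomes when the table is built by hand.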
In your previous study of mathematics, you have learned how to find the probability
of an event. In this lesson, you will learn how to construct a probability distribution of a
discrete random variable. Your knowledge of getting the probability of an event is very
important in understanding the present lesson. To find out if you are ready to learn this
new lesson, do the following activities.
ENTRY CARD
A. Find the probability of the following events.
Event (E) Probability P (E)
1. Getting an even number in a single roll of a die
2. Getting a sum of 6 when two dice are rolled
3. Getting an ace when a card is drawn from a deck
4. The probability that all children are boys if a couple has three
children
5. Getting an odd number and a tail when a die is rolled and a
coin is tossed simultaneously
6. Getting a sum of 11 when two dice are rolled
7. Getting a black card and a 10 when a card is drawn from a
deck
8. Getting a red queen when a card is drawn from a deck
9. Getting doubles when two dice are rolled
10. Getting a red ball from a box containing 3 red and 6 black balls
Number of Tails
Suppose three coins are tossed. Let Y be the random variable representing the
number of tails that occur. Find the probability of each of the values of the random
variable Y.
Solution
Steps Solution
1. Determine the sample space. Let H represent head and T represent tail.
The sample space for this experiment is
S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}
2. Count the number of tails in each outcome in the sample space and assign this number to this outcome.
Possible Outcomes    Value of the Random Variable Y (number of tails)
TTT 3
TTH 2
THT 2
HTT 2
HHT 1
HTH 1
THH 1
HHH 0
3. There are four possible values of the random variable Y representing the number of tails. These are 0, 1, 2, and 3. Assign probability values P(Y) to each value of the random variable.
There are 8 possible outcomes and no tail occurs once, so the probability that we shall assign to the random variable 0 is 1/8.
There are 8 possible outcomes and 1 tail occurs three times, so the probability that we shall assign to the random variable 1 is 3/8.
There are 8 possible outcomes and 2 tails occur three times, so the probability that we shall assign to the random variable 2 is 3/8.
There are 8 possible outcomes and 3 tails occur once, so the probability that we shall assign to the random variable 3 is 1/8.
Number of Tails Y    Probability P(Y)
0    1/8
1    3/8
2    3/8
3    1/8
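The probability distribution of the number of tails in three coin tosses can be checked by counting. This is a minimal sketch, assuming fair coins so that all eight outcomes are equally likely; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing three coins.
outcomes = ["".join(p) for p in product("HT", repeat=3)]

# Count how many outcomes give each number of tails.
tail_counts = Counter(outcome.count("T") for outcome in outcomes)

# P(Y = y) = (number of outcomes with y tails) / (total number of outcomes)
distribution = {
    y: Fraction(count, len(outcomes)) for y, count in sorted(tail_counts.items())
}

for y, p in distribution.items():
    print(y, p)
```

The probabilities add up to 1, which is a quick check that no outcome was left out.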
B. The following data show the probabilities for the number of cars sold in a given
day at a car dealer store.
Number of Cars Sold X    Probability P(X)
0 0.100
1 0.150
2 0.250
3 0.140
4 0.090
5 0.080
6 0.060
7 0.050
8 0.040
9 0.025
10 0.015
a. Find P (X ≤ 2)
b. Find P (X ≥ 7)
c. Find P (1 ≤ X ≤ 5)
Steps Solution
1. Compute the mean of the population (µ).
$\mu = \frac{\sum X}{N} = \frac{1 + 2 + 3 + 4 + 5}{5} = 3$
So, the mean of the population is 3.00.
2. Compute the variance of the population (σ²).
X    X − µ    (X − µ)²
1    −2    4
2    −1    1
3    0    0
4    1    1
5    2    4
∑(X − µ)² = 10
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{10}{5} = 2$
So, the variance of the population is 2.
3. Determine the number of possible samples of size n = 2.
Use the formula NCn. Here N = 5 and n = 2.
5C2 = 10
So, there are 10 possible samples of size 2 that can be drawn.
4. List all possible samples and their corresponding means.
Samples    Means
1, 2 1.50
1, 3 2.00
1, 4 2.50
1, 5 3.00
2, 3 2.50
2, 4 3.00
2, 5 3.50
3, 4 3.50
3, 5 4.00
4, 5 4.50
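The listing of samples and means can be verified by enumeration. This is a minimal sketch, assuming the population is {1, 2, 3, 4, 5} and samples of size 2 are drawn without replacement, as in the steps above; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [1, 2, 3, 4, 5]
n = 2  # sample size

# Population mean and population variance (divide by N).
mu = mean(population)
sigma_squared = pvariance(population)

# All possible samples of size n drawn without replacement, and their means.
samples = list(combinations(population, n))
sample_means = [sum(sample) / n for sample in samples]

# Mean and variance of the sampling distribution of the sample means.
mean_of_means = mean(sample_means)
variance_of_means = pvariance(sample_means)

for sample, sample_mean in zip(samples, sample_means):
    print(sample, f"{sample_mean:.2f}")
print("mean of sample means:", mean_of_means)
print("variance of sample means:", variance_of_means)
```

Under these assumptions the mean of the sample means equals the population mean, and the variance of the sample means equals (σ²/n) × (N − n)/(N − 1).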
The expression $\sqrt{\frac{N - n}{N - 1}}$ is called the finite population correction factor. In general, when the
population is large and the sample size is small, the correction factor is not used since it
will be very close to 1.
We summarize the properties of the sampling distribution below.
The standard deviation (σ_x̄) of the sampling distribution of the sample means is
also known as the standard error of the mean. It measures the degree of accuracy of the sample
mean (x̄) as an estimate of the population mean (µ).
A good estimate of the mean is obtained if the standard error of the mean (σ_x̄) is
small or close to zero, while a poor estimate is obtained if the standard error of the mean (σ_x̄) is
large. Observe that the value of σ_x̄ depends on the size of the sample (n). What happens
to σ_x̄ when n increases?
Thus, if we want to get a good estimate of the population mean, we have to make
n sufficiently large. This fact is stated as a theorem, which is known as The Central Limit
Theorem.
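The effect of the sample size on the standard error of the mean can be seen numerically. This is a minimal sketch using the property σ_x̄ = σ/√n; the value σ = 10 and the sample sizes are hypothetical and chosen only for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean for population standard deviation sigma and sample size n."""
    return sigma / sqrt(n)


sigma = 10  # hypothetical population standard deviation
for n in (25, 50, 100, 200, 400):
    print(n, round(standard_error(sigma, n), 3))
```

Because n appears under a square root, making the sample four times as large cuts the standard error in half.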
Steps Solution
1. Identify the given information.
Here µ = 60, σ = 5, and n = 16.
2. Find the mean of the sampling distribution. Use the property that µ_x̄ = µ.
$\mu_{\bar{x}} = \mu = 60$
3. Find the standard deviation of the sampling distribution. Use the property that σ_x̄ = σ/√n.
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{16}} = \frac{5}{4} = 1.25$
Example 2: The heights of male college students are normally distributed with a mean of
68 inches and a standard deviation of 3 inches. If 80 samples consisting of 25
students each are drawn from the population, what would be the expected
mean and standard deviation of the resulting sampling distribution of the
means?
We shall assume that the population is finite.
Steps Solution
1. Identify the given information.
Here µ = 68, σ = 3, and n = 25.
2. Find the mean of the sampling distribution. Use the property that µ_x̄ = µ.
$\mu_{\bar{x}} = \mu = 68$
3. Find the standard deviation of the sampling distribution. Use the property that σ_x̄ = σ/√n.
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{25}} = \frac{3}{5} = 0.6$
Mathematical Journal
Think about the answers to the following questions. Discuss your answers with
your classmates.
If a sample is drawn from a population, what happens to the standard error of the
mean if the sample size is:
1. increased from 50 to 200?
In the previous chapter, you have learned how to use the normal distribution to
gain information about an individual data value obtained from the population. In this
lesson, you will use the sampling distribution of the mean to obtain information about
the sample mean. Find out if you are ready to learn the present lesson by doing the next
activity.
A. Areas Under the Normal Curve
Find the area under the normal curve given the following conditions.
Conditions Illustration Area
Consequently, it justifies the use of the formula z = (x̄ − µ)/(σ/√n) when computing for the
probability that X̄ will take on a value within a given range in the sampling distribution of
X̄.
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where x̄ = sample mean
µ = population mean
σ = population standard deviation
n = sample size
Time to Complete an Examination
Example 1. The average time it takes a group of college students to complete a certain
examination is 46.2 minutes. The standard deviation is 8 minutes. Assume
that the variable is normally distributed.
a. What is the probability that a randomly selected college student will
complete the examination in less than 43 minutes?
Steps Solution
1. Identify the given information.
µ = 46.2
σ = 8
X = 43
2. Identify what is asked.
P(X < 43)
3. Identify the formula to be used.
Here we are dealing with an individual data value obtained from the population. So, we will use the formula z = (X − µ)/σ to standardize 43.
4. Solve the problem.
$z = \frac{X - \mu}{\sigma} = \frac{43 - 46.2}{8} = -0.40$
P(X < 43) = P(z < -0.40) = 0.5000 - 0.1554 = 0.3446
5. State the final answer.
So, the probability that a randomly selected college student will complete the examination in less than 43 minutes is 0.3446 or 34.46%.
Cholesterol Content
The average number of milligrams (mg) of cholesterol in a cup of a certain brand of
ice cream is 660 mg, and the standard deviation is 35 mg. Assume the variable is normally
distributed.
a. If a cup of ice cream is selected, what is the probability that the cholesterol content
will be more than 670 mg?
Steps Solution
1. Identify the given information.
µ = 660
σ = 35
X = 670
2. Identify what is asked.
P(X > 670)
3. Identify the formula to be used.
Here we are dealing with an individual data value obtained from the population. So, we will use the formula z = (X − µ)/σ to standardize 670.
4. Solve the problem.
$z = \frac{X - \mu}{\sigma} = \frac{670 - 660}{35} = 0.29$
b. If a sample of 10 cups of ice cream is selected, what is the probability that the mean
of the sample will be larger than 670 mg?
Steps Solution
1. Identify the given information.
µ = 660
σ = 35
X̄ = 670
n = 10
2. Identify what is asked.
P(X̄ > 670)
3. Identify the formula to be used.
Here we are dealing with data about the sample means. So, we will use the formula z = (X̄ − µ)/(σ/√n) to standardize 670.
4. Solve the problem.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{670 - 660}{35 / \sqrt{10}} = 0.90$
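The two parts of the cholesterol example differ only in the denominator of z. The following is a minimal sketch that computes both z-values and the corresponding right-tail probabilities with Python's standard library; the function names are illustrative, and the probabilities come from the exact normal curve, so they may differ slightly in the last digits from values read off a printed z-table with z rounded to two decimal places.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 660, 35  # population mean and standard deviation from the example


def z_individual(x, mu, sigma):
    """z-value for a single observation X."""
    return (x - mu) / sigma


def z_sample_mean(x_bar, mu, sigma, n):
    """z-value for the mean of a sample of size n."""
    return (x_bar - mu) / (sigma / sqrt(n))


standard_normal = NormalDist()  # mean 0, standard deviation 1

z_a = z_individual(670, mu, sigma)
z_b = z_sample_mean(670, mu, sigma, 10)

# Right-tail areas: P(X > 670) and P(sample mean > 670).
p_a = 1 - standard_normal.cdf(z_a)
p_b = 1 - standard_normal.cdf(z_b)

print(round(z_a, 2), round(p_a, 4))
print(round(z_b, 2), round(p_b, 4))
```

The sample mean varies less than an individual value, so the same cutoff of 670 mg lies farther out in the tail for the sample mean.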
Exercises
Solve the following problems.
1. A manufacturer of light bulbs produces bulbs that last a mean of 950 hours with a
standard deviation of 120 hours. What is the probability that the mean lifetime
of a random sample of 10 of these bulbs is less than 900 hours?
2. The average cholesterol content of a certain canned good is 215 milligrams,
and the standard deviation is 15 milligrams. Assume the variable is normally
distributed.
a. If a canned good is selected, what is the probability that the cholesterol
content will be greater than 220 milligrams?
b. If a sample of 25 canned goods is selected, what is the probability that the
mean of the sample will be larger than 220 milligrams?
3. The average public high school has 468 students with a standard deviation of
87.
a. If a public school is selected, what is the probability that the number of
students enrolled is greater than 400?
b. If a random sample of 38 public high schools is selected, what is the
probability that the mean number of students enrolled is between 445 and 485?
Task:
1. Read the following carefully in preparation for determining confidence
interval estimates for the population mean µ.
2. Consult the z-table for the z-values and their corresponding areas as
deemed necessary.
Z-table
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
For values of z above 3.09, use .4999 for the area.
Adopted from Mario F. Triola (1995). Elementary Statistics. 6th ed. New York: Addison-Wesle
particular X value is away from the mean. Table 4.1 gives the area (to four decimal places)
under the standard normal curve for any z-value from -3.49 to 3.49. This table is also
known as the z-table.
The area under the curve is 1 or 100%. The proportion of the area between 1
standard deviation unit below the mean and 1 standard deviation unit above the mean is
approximately 68%. The middle 95% is the proportion of the region above z = -1.96 and
below z = 1.96. These z-values determine the 95% confidence interval estimates. Similarly,
the middle 99% is the proportion of the area bounded by z = -2.58 and z = +2.58.
These approximations are shown in the figure. For any level of confidence, the
corresponding z-values are called confidence coefficients. They are also called critical
values. We also say that the standard normal variable z is the test statistic used to
calculate the interval boundaries.
Recall also that for large samples, the Central Limit Theorem (CLT) applies.
That is, as the sample size n increases without limit, the shape of the distribution of the
sample means taken with replacement from a population with mean µ and standard
deviation σ will approach a normal distribution. So, when the sample size is large,
applying the CLT, approximately 95% of the sample means taken from a population with
mean µ will fall within ±1.96 standard errors of the population mean. This means that the
interval estimate is given by:
$\mu - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\mu + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$
Thus, if a sample mean is specified, there is a 95% probability that the interval:
$\mu - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\mu + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ contains X̄.
In an analogous manner, there is a 95% probability that the interval specified by:
$\bar{X} - 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ to $\bar{X} + 1.96\left(\frac{\sigma}{\sqrt{n}}\right)$ will contain µ.
The expression shows that the interval estimate of the population mean µ is a
number from X̄ − 1.96(σ/√n) to X̄ + 1.96(σ/√n). This is shown in the figure where zα/2 =
±1.96. The two z-values, -1.96 and +1.96, are the boundaries of the interval estimate. Under
the normal curve, the total proportion of the area to the left of -1.96 and to the right of
1.96 is denoted by α (Greek letter alpha). In statistical analysis, we often
use α to indicate our level of confidence. If the confidence is 95%, then α is the remaining
5% or 0.05. This is the proportion of the area that is distributed in both tails of the standard
normal distribution curve. This area is outside the boundaries of the interval estimate. So, the
area at each tail is α/2 or 0.05/2, which is equal to 0.025. This is indicated in the symbol zα/2
(read “zee sub alpha over two”) in the general formula.
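The general formula for the confidence interval for large samples is:
$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
The short form of this formula is:
$\bar{X} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$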
Other confidence levels are also used in statistics like
90% or 99%.
In the general formula for determining the interval estimate for the parameter µ, the value
X̄ − zα/2(σ/√n) is called the lower confidence boundary or limit and the other value
X̄ + zα/2(σ/√n) is called the upper confidence boundary or limit.
For a 90% confidence interval, zα/2 = ±1.645; for a 95% confidence interval, zα/2 =
±1.96; and for a 99% confidence interval, zα/2 = ±2.58.
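The confidence coefficients and the resulting interval can be computed directly. This is a minimal sketch using Python's standard library; the function names and the sample values (X̄ = 50, σ = 10, n = 100) are hypothetical and serve only as an illustration of the general formula X̄ ± zα/2(σ/√n).

```python
from math import sqrt
from statistics import NormalDist


def confidence_coefficient(level):
    """Return z_(alpha/2) for a confidence level such as 0.90, 0.95, or 0.99."""
    alpha = 1 - level
    # The area to the left of +z_(alpha/2) is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(x_bar, sigma, n, level=0.95):
    """Return (lower limit, upper limit) of the interval estimate of the population mean."""
    margin = confidence_coefficient(level) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


for level in (0.90, 0.95, 0.99):
    print(level, round(confidence_coefficient(level), 3))

# Hypothetical sample: mean 50, known sigma 10, sample size 100.
lower, upper = z_interval(50, 10, 100, level=0.95)
print(round(lower, 2), round(upper, 2))
```

A higher confidence level gives a larger coefficient and therefore a wider interval for the same sample.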
The figure on the right shows the 95% confidence interval in a normal distribution.
1. Write a formula for computing the interval estimate of the population mean µ
for:
a. 90% confidence
b. 99% confidence
2. Draw a normal curve showing the confidence coefficients in the interval
estimate for:
a. 90% confidence
b. 99% confidence
population mean µ.
where $E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
Example 1:
Given: Find the estimate of the population mean µ using the 95% confidence level.
Solution:
With the large sample, by the Central Limit Theorem, the distribution is normally
distributed.
a. Point Estimate
Steps Solution
1. Describe the population parameter of interest.
The parameter of interest is the mean µ where the sample purportedly belongs.
2. Specify the confidence interval criteria.
a. Check the assumptions.
The σ is given.
The sample is normal as guaranteed by the CLT.
Notice that the population standard deviation σ for each group is unknown. In
statistics, there is a method that we can use to compute confidence intervals for a
population mean when σ is unknown. However, there are assumptions to bear in mind.
Assumptions in Computing for the Population Mean when σ is Unknown
When n ≥ 30, and σ is unknown, the sample standard deviation s can be substituted
for σ. However, the following assumptions should be met.
1. The sample is a random sample.
2. Either n ≥ 30 or the population is normally distributed when n < 30.
In the past lesson, when σ is known and the sample size is 30 or more, or the
sample size is less than 30 but comes from a population that is approximately normally
distributed, the confidence interval for the population mean can be found by using the
z-distribution. Very often, however, σ is not known. So, it must be estimated by s, the sample
standard deviation. When s is used, especially when the sample size is small, critical values
greater than the values for zα/2 are used in confidence intervals in order to keep the
intervals at a given level such as the 95% level. This means that a multiplier of the standard
error of the mean, denoted as s/√n, slightly larger than 1.96 is needed. The sampling error
associated with using s/√n is reflected in wider confidence intervals. But the number of
standard errors (s/√n) needed for the 0.90 or 0.95 confidence intervals depends on the
sample size. With a small sample size, more standard errors are needed to span the 0.95
confidence interval. This number of s/√n values is called t.
The general expression for the confidence interval when σ is unknown is given by:
$\bar{x} \pm t\left(\frac{s}{\sqrt{n}}\right)$, and the distribution of values is called the t-distribution.
The concept of the degrees of freedom is used in the t-distribution. The degrees of
freedom, denoted as df, are the number of values that are free to vary after a sample
statistic has been computed, and they tell us the specific curve to use when a distribution
consists of a family of curves. For example, if the mean of 5 values is 10, then 4 of the 5 values
are free to vary. But once the 4 values are selected, the 5th value must be a specific
number to get a sum of 50, since 50 ÷ 5 = 10. Thus, if n = 5, df = n - 1 = 4. (McClave & Sincich 2003).
Task: Learn how to use the t-Table in computing interval estimates of µ.
Historical Note
The t-distribution was formulated in 1908 by an Irish brewing employee
named W.S. Gosset. Gosset was involved in researching new methods of
manufacturing ale. Because brewing employees were not allowed to publish results,
Gosset published his findings using the pseudonym Student. Hence, the t-distribution
is sometimes called Student’s t-distribution.
The formula for computing the confidence interval using the t-distribution is:
$\bar{x} - t\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t\left(\frac{s}{\sqrt{n}}\right)$
The t-values found in the reproduced t-Table are the proportions of the areas in
two tails of the t-curve. They are called critical values of t in the sense that they are the
boundaries of the middle area where the true mean lies. Like the z, they are also called
confidence coefficients.
The t-Table
The columns 0.90, 0.95, and 0.99 give the confidence coefficient for each confidence level (α is split between the two tails).
n    Degrees of Freedom (n - 1)    0.90    0.95    0.99
2 1 6.314 12.706 63.657
3 2 2.920 4.303 9.925
4 3 2.353 3.182 5.841
5 4 2.132 2.776 4.604
6 5 2.015 2.571 4.032
7 6 1.943 2.447 3.707
8 7 1.895 2.365 3.499
9 8 1.860 2.306 3.355
10 9 1.833 2.262 3.250
11 10 1.812 2.228 3.169
12 11 1.796 2.201 3.106
13 12 1.782 2.179 3.055
14 13 1.771 2.160 3.012
15 14 1.761 2.145 2.977
16 15 1.753 2.131 2.947
17 16 1.746 2.120 2.921
18 17 1.740 2.110 2.898
19 18 1.734 2.101 2.878
20 19 1.729 2.093 2.861
21 20 1.725 2.086 2.845
22 21 1.721 2.080 2.831
23 22 1.717 2.074 2.819
24 23 1.714 2.069 2.807
25 24 1.711 2.064 2.797
26 25 1.708 2.060 2.787
27 26 1.706 2.056 2.779
28 27 1.703 2.052 2.771
29 28 1.701 2.048 2.763
30 29 1.699 2.045 2.756
31 30 1.697 2.042 2.750
41 40 1.684 2.021 2.714
61 60 1.671 2.000 2.660
∞ ∞ 1.645 1.960 2.576
Note that in the table, the t values are based, not on sample size n, but on degrees
of freedom, n - 1. For example, for n = 20, the 0.95 (95%) confidence interval when σ is
known is x̄ ± 1.96(σ/√n); but when σ is unknown and only s is available, the 0.95 confidence
interval is x̄ ± 2.09(s/√n). The confidence coefficient is 2.09. Likewise, in the t-table, for n
= 10, the 0.95 or 95% confidence interval is x̄ ± 2.26(s/√n). The confidence coefficient is
2.26.
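The table lookup and the interval x̄ ± t(s/√n) can be combined in a short routine. This is a minimal sketch that copies a few 0.95-column entries from the t-table above into a dictionary keyed by degrees of freedom; the function name and the sample values (x̄ = 50, s = 4, n = 20) are hypothetical.

```python
from math import sqrt

# A few 95% confidence coefficients copied from the t-table, keyed by df = n - 1.
T_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}


def t_interval_95(x_bar, s, n):
    """Return (lower limit, upper limit) of the 95% interval estimate when sigma is unknown."""
    t = T_95[n - 1]          # look up t by degrees of freedom, not by n
    margin = t * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: mean 50, sample standard deviation 4, sample size 20.
lower, upper = t_interval_95(50, 4, 20)
print(round(lower, 2), round(upper, 2))
```

Only the sample sizes stored in the dictionary are supported here; a fuller version would hold the whole table.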
Tasks: Observe the areas associated with the sample size n.
What happens to the values of t as n increases?
What values do you observe when n = ∞?
Discuss your observations.
Confidence Coefficients
A. Find the Confidence coefficients for each of the following:
1. n = 6, 90% confidence
2. n = 7, 90% confidence
3. n = 12, 95% confidence
4. n = 17, 95% confidence
5. n = 24, 99% confidence
B. Find E given the following:
1. n = 6, s = 2, 90% confidence
2. n = 9, s = 2.8, 90% confidence
3. n = 13, s = 4.5, 95% confidence
4. n = 16, s = 3.1, 95% confidence
5. n = 21, s = 5, 95% confidence
Since the population standard deviation σ and the standard deviation of the
sampling distribution of means σ_x̄ are rarely known, the procedure involving t is typically
used in setting confidence intervals.
The following Four-Step Method is helpful in determining the interval estimate for
the population mean when σ is unknown.
Steps in Computing the Interval Estimate of the Population Mean When σ is
Unknown
Step 1: Describe the population parameter of interest.
Step 2: Specify the confidence interval criteria.
a. Check the assumptions
b. Determine the test statistic to be used. In this case, it is the t statistic.
c. State the level of confidence.
Step 3: Collect and present sample evidence.
a. Collect the sample information
b. Find the Point estimate.
Step 4: Determine the confidence interval.
a. Determine the confidence coefficients (tα/2) from the t-table.
b. Find s/√n.
c. Find the lower and upper confidence limits.
d. Describe the results.
Teaching Strategy
Tasks:
1. Use the Four-Step Method to find the estimates of the population means where
the experimental group and the control group belong as given in In-Class Activity 1.
2. Fill in the blanks to complete the solution.
Solution:
Steps Solutions
1. Describe the population The 1st parameter of interest is the mean µ1 of the
parameter of interest. population where the experimental group belongs.
The 2nd parameter of interest is the mean µ2 of the
population where the control group belongs
𝐸 = 2.8
c. Find the lower and the upper confidence limits.
For the experimental group:
$\bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
So, 82.5 – 1.40 = 81.1 (lower limit) and
82.5 + 1.40 = 83.9 (upper limit)
Formulate a null hypothesis and the alternative hypothesis for each of the
following.
Write them in symbols.
1. The average TV viewing time of all five-year-olds is 4 hours daily.
2. A college librarian claims that 20 storybooks on the average are borrowed daily.
3. The mean performance of all grade six level of a school in the NAT is 35.
4. The inventor of a new kind of light bulb claims that all such bulbs last as long as
3000 hours.
5. The average age of all the identified strata of senior citizens in a remote area is 92
years.
Solution:
The parameter of interest is the population mean µ = 82.
1. In words, the hypotheses are:
H0: The sample comes from a population whose mean is 82; or
H0: The sample comes from a population whose mean is equal to the population
mean 82.
H1: The sample does not come from a population whose mean is 82.
In symbols, we write:
H0: µ = 82 and H1: µ ≠ 82
2. There is no clue as to the direction of the investigation. The phrase affects
performance implies either an increase or a decrease in performance. So, the test
is non-directional.
Example 3: Organic Fertilizers
A farmer believes that using organic fertilizers on his plants will yield greater
income. His average income from the past was P200,000.00 pesos per year. State the
hypotheses in symbols.
Solution:
𝐻0 : 𝜇 = 𝑃200,000.00 𝑝𝑒𝑠𝑜𝑠
The phrase ‘greater income’ is associated with the greater than direction.
𝐻1 : 𝜇 > 𝑃200,000.00 𝑝𝑒𝑠𝑜𝑠
Task 3: Write the null hypothesis and the alternative hypothesis in words and in symbols
for each of the following.
1. The net weight of a packet of a snack brand is 130 g. A sample of 80 packets yielded
a sample mean weight of 122 g with a standard deviation of 15 g.
2. In a graduate college, the average length of registration time during a semester is
120 minutes with a standard deviation of 25 minutes. With the registration
procedure, a random sample of 50 students took an average of 80 minutes with a
standard deviation of 12 minutes.
3. The average height of grade 8 female students is 158.2 cm. The mean height of a
sample of 100 female students is 160 cm with a standard deviation of 6 cm.
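Non-Directional (Two-tailed)
The probability is found on both tails of the distribution.
[Figure: normal curve centered at µ with area 1 - α in the middle and α/2 in each tail]
Directional (One-tailed, left tail)
The probability is found on the left tail of the distribution.
[Figure: normal curve centered at µ with area α in the left tail and 1 - α to its right]
Directional (One-tailed, right tail)
The probability is found on the right tail of the distribution.
[Figure: normal curve centered at µ with area α in the right tail and 1 - α to its left]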
If the null hypothesis is true and accepted, or if it is false and rejected, the decision
is correct. If the null hypothesis is true and rejected, the decision is incorrect and this is a
Type I error. If the null hypothesis is false and accepted, the decision is incorrect and this
is a Type II error.
In an ideal situation, there is no error when we accept the truth and reject what is
false.
Understanding Errors
Task: Study the following examples carefully and the notes that follow. Discuss for the
better understanding of hypothesis testing.
We can control the errors by assigning small probability values to each of them.
The most frequently used probability values for α and β are 0.05 and 0.01. The probability
assigned to each depends on its seriousness. The more serious the error, the less willing
we are to have it occur. So, a smaller probability will be assigned. The symbols α and β
are each probabilities of error, each under separate conditions, and they cannot be
combined. Therefore, there is no single probability for making an incorrect decision. In
like manner, the two correct decisions are distinct and each has its own probability. As can be
seen in Table 5.2, 1 – α is the probability of a correct decision when the null hypothesis is
true, and 1 – β is the probability of a correct decision when the null hypothesis is false. 1
– β is called the power of the statistical test since it is a measure of the ability of a
hypothesis test to reject a false null hypothesis, which is considered very important.
(McClave & Sincich, 2003)
Graphically, we can show the decision errors under the normal curve.
Note that the rejection region for a directional test is in one tail (first figure) but
distributed to the two tails in a non-directional test (second figure).
Under the normal curve, the rejection region refers to the region where the value
of the test statistic lies for which we will reject the null hypothesis. This region is also
called critical region.
So, if your computed statistic is found in the rejection region, then you reject Ho.
If it is found outside the rejection region, you accept Ho.
Note also the line that separates the rejection region from the non-rejection region
(1 – α). This line passes through the confidence coefficients, which are also called critical
values. The critical values can be obtained from the critical values table of the test
statistic. For example, if the test statistic is a z, the critical values can be obtained from
the z-table. So, for a 95% confidence level, the critical values for a non-directional test are
-1.96 and +1.96. When the confidence level is 99%, for a non-directional test, the critical
values are -2.58 and +2.58.
Determining the Critical Values
Task: Study how the critical values are determined under the normal curve.
Recall that the critical values are the z-values associated with the probabilities at
the tails of the normal curve.
For a 95% confidence level:
0.95 / 2 = 0.4750 (expressed up to four decimal places so that we can identify an area
in the normal curve table as close as possible to this value).
In the normal curve table, the area 0.4750 corresponds to z = 1.96.
Thus, the critical values for the 95% confidence level are -1.96 and +1.96.
[Figure: normal curve centered at µ with the middle 95% between -1.96 and +1.96 and an area of 0.025 in each tail]
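The table lookup described above can be mirrored in code. This is a minimal sketch, assuming the printed table gives the area between the mean (z = 0) and z, so the area to the left of z is 0.5 plus the table area; the function name and the `two_tailed` flag are illustrative.

```python
from statistics import NormalDist


def critical_z(confidence, two_tailed=True):
    """Return the positive critical z-value for a confidence level such as 0.95."""
    if two_tailed:
        # Half of the confidence level lies on each side of the mean.
        area_from_mean = confidence / 2
    else:
        # All of alpha sits in one tail, so the area from the mean is confidence - 0.5.
        area_from_mean = confidence - 0.5
    # Convert the "area from the mean" into the area to the left of z.
    return NormalDist().inv_cdf(0.5 + area_from_mean)


print(round(critical_z(0.95), 2))                     # non-directional test
print(round(critical_z(0.99), 2))                     # non-directional test
print(round(critical_z(0.95, two_tailed=False), 3))   # directional test
```

For a non-directional test the critical values are the returned value and its negative; for a left-tailed directional test only the negative value applies.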
Exercises
1. Between Ho and H1, what is a good reason for starting a hypothesis test with
Ho?
2. Suppose it is the Christmas season and Janine thinks that it is the month of
January, what error is she committing?
3. What type of error is committed when you reject a null hypothesis when, in
fact, it is true?
4. Determine whether the test is directional or non-directional. If:
a. A researcher claims that method of teaching affects learning
b. A food additive enhances food flavor
c. A study habit improves the memory
d. Health is related to lifestyle
e. Peoples’ culture affects tourism
5. Draw a normal curve for a 95% confidence and show z = 1.5. What decision can
you associate with this z-value? Why?
After exploring the beginning elements of hypothesis testing, you are now ready
to engage in it. Your knowledge on this will enable you to apply your skills in research and
problem-solving exercises. More importantly, you become a polished decision-maker.
In hypothesis testing, we employ a logical sequence of steps and procedures. The
practical statistical procedure that we employ in hypothesis testing are called tests of
significance.
The probability of committing Type I error is called the significance level of a
test.
For any hypothesis test,
p value = probability of committing a Type I error
probability is 5/100 (or 1 out of 20) that I am wrong in
rejecting a null hypothesis that is true.
[Figure: normal curve centered at µ with critical values -1.96 and +1.96 and α/2 = 0.025 in each tail]
We can conduct a hypothesis test in two ways:
Representing Decisions
The mathematical model applied in hypothesis testing is the normal distribution
curve.
One of the processes in hypothesis testing is the calculation of the test statistic. What is
a test statistic? A test statistic is a value used to determine the probability needed in
decision-making. In the traditional method of hypothesis testing, the test statistic is the
value, determined by a computational formula that is compared with a confidence
coefficient (like 1.96 and 2.58). The decision that we make depends on the computed test
statistic. The formula for computing the test statistic depends on the sample size.
Increasing the sample size has an effect on the shape of the distribution. This is specified by
the Central Limit Theorem (CLT), which states in part, that as sample size increases, the
sampling distribution of the mean approaches the normal distribution, regardless of the
shape of the parent population distribution. However, for the CLT to hold, sampling must
be random.
Steps in Traditional Method of Hypothesis Testing
Step 1 Describe the population parameter of interest (e.g., mean, proportion)
Step 2 Formulate the hypothesis: the null hypothesis and the alternative
hypothesis. That is, state a null hypothesis, in such a way that a Type I error
can be calculated.
Step 3 Check the assumptions.
Is the sample size large enough to apply the Central Limit Theorem
(CLT)?
Do small samples come from normally distributed populations?
Are the samples selected randomly?
Step 4 Choose a significance level size for α. Make α small when the consequence
of rejecting a true Ho is severe.
Is the test two-tailed or one-tailed?
Get the critical values from the test statistic table.
Establish the critical regions.
(Optional: Draw a normal curve, draw vertical lines through the critical values, and
shade the rejection region.)
Step 5 Select the appropriate test statistic.
Compute the test statistic using the appropriate formula.
Step 6 State the decision rule for rejecting or not rejecting the null hypothesis.
For a two-tailed test:
Reject Ho if the computed test statistic ≤ negative critical value or if the
computed test statistic ≥ positive critical value.
Do not reject (that is, accept) Ho if the computed test statistic > negative
critical value and the computed test statistic < positive critical value.
In symbols, we write the rule as follows:
Reject Ho if the computed z ≤ -zα/2 or if the computed z ≥ +zα/2.
Do not reject (that is, accept) Ho if -zα/2 < computed z < +zα/2.
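The two-tailed decision rule can be written as a small function. This is a minimal sketch in which the critical value is passed in (1.96 by default, for the 95% level); the function name and the returned strings are illustrative.

```python
def two_tailed_decision(z_computed, z_critical=1.96):
    """Apply the two-tailed decision rule to a computed z-value."""
    # Reject Ho when the computed z falls in either tail of the rejection region.
    if z_computed <= -z_critical or z_computed >= z_critical:
        return "reject Ho"
    # Otherwise the computed z lies strictly between the two critical values.
    return "do not reject Ho"


print(two_tailed_decision(1.5))         # inside the non-rejection region at 95%
print(two_tailed_decision(-2.1))        # in the left rejection region at 95%
print(two_tailed_decision(2.1, 2.58))   # inside the non-rejection region at 99%
```

The same computed z can lead to different decisions at different confidence levels, which is why the level is chosen before the test statistic is computed.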
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
2. Replace the terms in the formula by the given values.
$z = \frac{90 - 88}{6 / \sqrt{100}} = \frac{2}{0.6} = 3.33$
Case 2. The population mean µ is known but not the population standard deviation σ.
Test statistic: $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Note that in the Case 2 statistic, the sample standard deviation s is used as an
estimate for the population standard deviation σ.
Example 2: Computing the z value given s
Given 𝑥̅ = 80, µ = 83, s = 4, n = 100. Find the value of z.
Steps Solution
1. Write the computing formula.
$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$ that simplifies to $z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
2. Replace the terms in the formula by the given values.
$z = \frac{80 - 83}{4 / \sqrt{100}} = \frac{-3}{0.4} = -7.5$
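The computation of the test statistic follows the same pattern whether σ is given or s is used as its estimate. This is a minimal sketch with an illustrative function name; the parameter `sd` stands for σ when it is known and for s otherwise.

```python
from math import sqrt


def z_statistic(x_bar, mu, sd, n):
    """Computed z for a sample mean: (x_bar - mu) / (sd / sqrt(n))."""
    standard_error = sd / sqrt(n)
    return (x_bar - mu) / standard_error


# Values from the two worked computations in the text.
print(round(z_statistic(90, 88, 6, 100), 2))   # sigma known
print(round(z_statistic(80, 83, 4, 100), 2))   # s used as an estimate of sigma
```

A negative result simply means the sample mean lies below the hypothesized population mean.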
Applying Hypothesis Testing in Problem Solving
Example 3: Problem-Solving Performance
A researcher administered a development problem-solving test to 50 randomly selected
Grade 6 pupils. In this sample, x̄ = 80 and s = 10. The mean µ and the standard deviation
of the population used in the standardization of the test were 75 and 15, respectively.
Use the 95% confidence level to answer the following questions:
1. Does the sample mean differ significantly from the population mean?
2. Can it be said that the sample is above average?
[Figure: normal curve centered at µ with the middle 95% between -1.96 and +1.96]
$z = \frac{5}{2.12} = 2.36$
In the graph of the normal curve, the computed z-value is located outside the
acceptance region. So, the null hypothesis has to be rejected.