Statistics Finals

5.1 STEPS IN TESTING HYPOTHESIS (PowerPoint)
• A statistical hypothesis is an assertion or conjecture concerning one or more
populations.
• To prove that a hypothesis is true, or false, with absolute certainty, we would
need absolute knowledge. That is, we would have to examine the entire
population.
• Instead, hypothesis testing concerns how to use a random sample to
judge whether or not the evidence supports the hypothesis.
• Hypothesis testing is formulated in terms of two hypotheses:
• H₀: the null hypothesis;
• H₁: the alternative hypothesis.
Note that the failure to reject H₀ does not mean the null hypothesis is
true. There is no formal outcome that says “accept H₀”. It only means we
do not have sufficient evidence to support H₁.
Note that H₀ will ALWAYS have an equal sign (and possibly a less than or greater
than symbol, depending on the alternative hypothesis). The alternative
hypothesis has a range of values that are alternatives to the one in H₀.
Illustration:
In a jury trial the hypotheses are:
• H₀: defendant is innocent;
• H₁: defendant is guilty.
H₀ (innocent) is rejected if H₁ (guilty) is supported by evidence beyond
“reasonable doubt”. Failure to reject H₀ (failure to prove guilt) does not imply
innocence, only that the evidence is insufficient to reject it.
Two-tailed test        Right-tailed test      Left-tailed test
H₀: 𝜇 = k              H₀: 𝜇 = k              H₀: 𝜇 = k
H₁: 𝜇 ≠ k              H₁: 𝜇 > k              H₁: 𝜇 < k
Notes: Right-tailed and left-tailed tests are distinguished by the direction in which
the greater than or less than sign points. It is the direction in which the alternative
hypothesis places the true mean.
Examples: State H₀ and H₁ for each case.
1. A researcher thinks that if expectant mothers use vitamins, the birth weight of
the babies will increase. The average birth weight of the population is 8.6 pounds.
2. An engineer hypothesizes that the mean number of defects can be decreased
in the manufacturing process of compact disks by using robots instead of humans for
certain tasks. The mean number of defective disks per 1000 is 18.
After stating the hypotheses, the researcher designs the study:
• Select the correct statistical test
• Choose an appropriate level of significance
• Formulate a plan for conducting the study
Statistical test – uses the data obtained from a sample to make a decision about
whether the null hypothesis should be rejected.
Test value (test statistic) – the numerical value obtained from a statistical test.
Two types of errors:
Type I error – reject H₀ when H₀ is true.
Type II error – do not reject H₀ when H₀ is false.
Results of a statistical test:
                     H₀ is true          H₀ is false
Reject H₀            Type I error        Correct decision
Do not reject H₀     Correct decision    Type II error
The question is: how large a difference is enough to say we have enough
evidence to reject the null hypothesis?
Significance level – the maximum probability of committing a Type I error. This
probability is symbolized by 𝛼.
P(Type I error) = P(reject H₀ when H₀ is true) = 𝛼
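The meaning of 𝛼 can be illustrated with a small simulation. This is only a sketch, assuming a two-tailed z test with known 𝜎; the population values (mean 100, standard deviation 15), the sample size, the seed, and the number of trials are illustrative assumptions and do not come from these notes. When H₀ is actually true, the share of samples that wrongly reject it should be close to 𝛼.

```python
import random
from statistics import NormalDist

def type_one_error_rate(alpha=0.05, n=30, trials=20000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 15.0                    # hypothetical population; H0 is true here
    cv = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / n ** 0.5)
        if abs(z) > cv:                         # test value falls in the critical region
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"estimated P(Type I error) = {rate:.3f}")
```

The estimate varies a little with the seed, but it should stay near the chosen 𝛼 because every rejection in this setup is, by construction, a Type I error.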
Critical or Rejection Region – the range of values for the test value that indicates
a significant difference, so that the null hypothesis should be rejected.
Non-critical or Non-rejection Region – the range of values for the test value
that indicates that the difference was probably due to chance, so that the
null hypothesis should not be rejected.
Critical Value (CV) – separates the critical region from the non-critical region,
i.e., when we should reject H₀ from when we should not reject H₀.
• The location of the critical value depends on the inequality sign of the
alternative hypothesis.
• Depending on the distribution of the test value, you will use different
tables to find the critical value.
One-tailed test – indicates that the null hypothesis should be rejected when the
test value is in the critical region on one side.
• Left-tailed test – when the critical region is on the left side of the distribution
of the test value.
• Right-tailed test – when the critical region is on the right side of the
distribution of the test value.
Two-tailed test – the null hypothesis should be rejected when the test value is in
either of two critical regions on either side of the distribution of the test value.
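The three kinds of critical region can be sketched as a single check. This is a minimal illustration, assuming a test value whose distribution is symmetric about zero (such as z or t); the function name `in_critical_region` and the sample test values are hypothetical.

```python
def in_critical_region(test_value, critical_value, tail):
    """Return True when the test value falls in the rejection region.

    tail is "left", "right", or "two". critical_value is given as a
    positive magnitude (for example 1.645); the sign is applied here.
    """
    cv = abs(critical_value)
    if tail == "left":
        return test_value < -cv
    if tail == "right":
        return test_value > cv
    if tail == "two":
        return abs(test_value) > cv
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(in_critical_region(-2.10, 1.645, "left"))   # True: -2.10 lies below -1.645
print(in_critical_region(1.50, 1.96, "two"))      # False: 1.50 lies between -1.96 and 1.96
```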
Let us now summarize the steps in conducting a hypothesis test.
Step 1: Identify the null and alternative hypotheses.
Step 2: Decide on the level of significance.
Step 3: Find the critical value(s) from the appropriate table.
Step 4: Compute the test statistic.
Step 5: Make the decision (reject or do not reject the null hypothesis).
Step 6: Interpret the results.
Example 1. Lodi Country High School seniors have an average NAT score of 1,020.
From a random sample of 144 Lodi High School students we find the average NAT
score to be 1,100 with a standard deviation of 12.4. We want to know if these high
school students are representative of the overall population. What are our
hypotheses?
2. A recent survey of college campuses across Batangas claims that students
spend an average of 2.7 hours a day using their cellphones. A random sample of
35 BatState-U students showed an average use of 2.9 hours a day, with a standard
deviation of 0.4 hours. Do BatState-U college students use their cellphones more
than the typical Batangas college student?
3. A package of gum claims that the flavor lasts more than 39 minutes. What
would be the null hypothesis of a test to determine the validity of the claim? What
sort of test is this?
4. An ice pack claims to stay cold between 35 and 65 minutes. What would be
the null hypothesis of a test to determine the validity of the claim? What sort of
test would it be?
5. Mrs. Dudley is a grade 9 English teacher who is marking 2 papers that are
strikingly similar. She is concerned that one of her students is cheating, but she is
not sure which one of the two is guilty. Mrs. Dudley meets with the two students
(Laura and Greg) who have similar papers, and suspects that Greg is probably
the one who is guilty. She decides that she must deal with the situation in the same
way as in a justice trial, in order to determine who is innocent and who is guilty.
Describe the Type I and Type II errors that may be committed in the situation
above.
5.2 TEST FOR MEAN (PowerPoint)
Steps for a Hypothesis Test
- for a population mean when the variance is known and the population is assumed
to follow a normal distribution.
1. State the null and alternative hypotheses.
2. Choose the level of significance.
3. Compute the test statistic.
4. Determine the critical value or p-value.
5. Draw a conclusion.
General Formula:
$$\text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard Error}}$$
Example 1. The leader of the association of jeepney drivers claims that the
average daily take home pay of all jeepney drivers in Pasay City is Php 400.00. A
random sample of 100 jeepney drivers in Pasay City was interviewed and the
average daily take home pay [...] of all jeepney drivers in Pasay City is different from
Php 400.00. Assume that the population variance is Php 92.00.
Example 2. According to a study done last year, the average monthly expenses
for cellphone loads of high school students in Batangas was Php 350.00. A statistics
student believes that this amount has increased since January of this year. Is there
a reason to believe that this amount has really increased if a random sample of
60 students has average monthly expenses for cellphone loads of Php 380.00?
Use a 0.05 level of significance. Assume that the population standard deviation is
Php 77.00.
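Example 2 can be worked through with the critical value approach. The sketch below assumes a right-tailed test (H₀: 𝜇 = 350 against H₁: 𝜇 > 350), since the student believes the amount has increased; the function name `z_test_mean` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sigma, n):
    """z = (statistic - parameter) / standard error, for a known sigma."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Example 2: H0: mu = 350, H1: mu > 350 (right-tailed), alpha = 0.05
z = z_test_mean(sample_mean=380, mu0=350, sigma=77, n=60)
critical = NormalDist().inv_cdf(1 - 0.05)   # right-tailed critical value, about 1.645
reject = z > critical
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Because the test value lies to the right of the critical value, this sketch rejects H₀ at the 0.05 level, which supports the belief that the average monthly expense has increased.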
Exercise 1. The head of the Math department announced that the mean score of
Grade 9 students in the first periodic examination in Mathematics was 89 and the
standard deviation was 12. One student, who believed that the mean score was
less than this, randomly selected 34 students and computed their mean score.
She obtained a mean score of 85. At the 0.01 level of significance, test the student’s
belief.
Exercise 2. A company which produces batteries claims that the life expectancy
of their batteries is 90 hours. In order to test the claim, a consumer interest group
tested a random sample of 40 batteries. The test resulted in a mean life
expectancy of 87 hours. Using a 0.05 level of significance, can it be concluded
that the life expectancy of their batteries is less than 90 hours? Assume that the
population standard deviation is known to be 10 hours.
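Step 4 of the procedure allows a p-value instead of a critical value. As a hedged sketch, the battery exercise can be set up as a left-tailed test (H₀: 𝜇 = 90 against H₁: 𝜇 < 90), where the p-value is the area under the standard normal curve to the left of the test value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Exercise 2: H0: mu = 90, H1: mu < 90 (left-tailed), alpha = 0.05
x_bar, mu0, sigma, n, alpha = 87, 90, 10, 40, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = NormalDist().cdf(z)        # left-tail area below the test value
reject_h0 = p_value < alpha          # reject when the p-value is smaller than alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The decision is the same as with the critical value approach: a p-value below 𝛼 corresponds to a test value inside the critical region.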
5.2 TEST FOR MEAN (Book)
Computation of Test Statistic
$$\text{Test Statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard Error}}$$
When doing a hypothesis test for a mean, the statistic (observed value) is the
sample mean, while the parameter (expected value) is the population mean when the null
hypothesis is assumed to be true. The standard error of the mean is computed as:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Remember that the Central Limit Theorem allows you to use the standard normal
distribution to approximate the distribution of sample means provided that n ≥ 30
(that is, the sample size is large). Moreover, for large samples, even when the value
of the population standard deviation 𝜎 is unknown, the standard error is
computed by using the formula:
$$\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}$$
where s stands for the sample standard deviation.
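A short sketch of the standard error computation, using the sample figures from the NAT example earlier in these notes (s = 12.4, n = 144). Since n ≥ 30, s is used in place of the unknown 𝜎; the function name `standard_error` is hypothetical.

```python
from math import sqrt

def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return std_dev / sqrt(n)

# NAT example: s = 12.4, n = 144 (large sample, so s stands in for sigma)
se = standard_error(12.4, 144)
print(f"standard error = {se:.4f}")

# Quadrupling the sample size halves the standard error
ratio = standard_error(12.4, 4 * 144) / se
print(f"ratio after quadrupling n = {ratio:.2f}")
```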
Test for a Population Mean 𝜇
Population standard deviation 𝜎 is known, or the sample size is large (n ≥ 30)
We use the z test for a mean to conduct a statistical test for a population
mean when the population is normal and 𝜎 is known, or when the sample size
is n ≥ 30. The test statistic is given by the formula
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
When 𝜎 is unknown but n ≥ 30, we use the approximation
$$z \approx \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Since we are using the common levels of significance 𝛼 = 0.10, 0.05, or 0.01, we
will encounter the same corresponding critical values from the z table. We list
them here indicating the type of test used.
Level of             One-tailed Test                Two-tailed Test
significance 𝛼       Left (<)       Right (>)
0.10                 z = -1.28      z = +1.28       z = ±1.645
0.05                 z = -1.645     z = +1.645      z = ±1.96
0.01                 z = -2.33      z = +2.33       z = ±2.575
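The critical values listed above can be reproduced from the standard normal distribution rather than read from a printed z table. This is a sketch using Python's standard library; the helper name `critical_values` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()          # mean 0, standard deviation 1

def critical_values(alpha):
    """Return (left-tailed, right-tailed, two-tailed magnitude) z critical values."""
    left = std_normal.inv_cdf(alpha)           # area alpha in the left tail
    right = std_normal.inv_cdf(1 - alpha)      # area alpha in the right tail
    two = std_normal.inv_cdf(1 - alpha / 2)    # area alpha/2 in each tail
    return left, right, two

for alpha in (0.10, 0.05, 0.01):
    left, right, two = critical_values(alpha)
    print(f"{alpha:.2f}  {left:+.3f}  {right:+.3f}  ±{two:.3f}")
```

The computed values agree with the table up to rounding; for example, the table's 2.33 and 2.575 are rounded forms of roughly 2.326 and 2.576.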

5.3 TEST FOR PROPORTION
• This kind of test will certainly interest those who want to know the
percentage of the population who are in favor of a certain idea or
concept.
• In testing for a population proportion p, use the z test for a proportion to
conduct a statistical test for a proportion p on the assumption that np ≥ 5
and nq ≥ 5. The test statistic is defined by the equation:
$$z = \frac{\hat{p} - p}{\sqrt{pq / n}}$$
where p̂ = x/n is the sample proportion, p is the hypothesized population
proportion, and q = 1 − p.
Examples:
1. Beefy Burger, a fast food restaurant, claims that 85% of burger
fanatics prefer to eat in their place. To test this claim, a random
sample of 90 burger customers is selected and asked
what they prefer. If 76 of the 90 burger fanatics said they prefer to
eat at Beefy Burger, what conclusion do we draw? Use a 0.005
level of significance.
2. Haus of Gaz claims that more than two-thirds of the houses in a
certain subdivision use their brand. Do we have reason to doubt
this claim if, in a random sample of 40 houses in this subdivision, it is
found that 25 use the company’s brand? Use a 0.01 level of
significance.
3. A congressman is hoping that his bill is favored by his constituents
to increase his chance of getting reelected in the next election.
He asked his research office to conduct a survey on this matter to
verify whether or not he can get support from his constituents so
that if he gets a 90% support for the bill, he would certainly start
with his campaign. A random sample of 150 respondents yielded
128 who are in favor of his bill. Test the hypothesis that p=0.9
against the alternative p<0.9 at 0.05 level of significance.
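The third example states its hypotheses explicitly (H₀: p = 0.9 against H₁: p < 0.9), so it makes a convenient worked case. The sketch below assumes the usual z statistic for a proportion with q = 1 − p; the function name `z_test_proportion` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), used when n*p0 >= 5 and n*q0 >= 5."""
    q0 = 1 - p0
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("need np >= 5 and nq >= 5 for the z test for a proportion")
    p_hat = successes / n                      # sample proportion
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Example 3: H0: p = 0.9, H1: p < 0.9 (left-tailed), alpha = 0.05
z = z_test_proportion(successes=128, n=150, p0=0.9)
critical = NormalDist().inv_cdf(0.05)          # left-tailed critical value, about -1.645
reject = z < critical
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```

Here the test value falls to the left of the critical value, so under these assumptions the data do not support the claim that 90% of constituents favor the bill.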
6.1 CORRELATION
A correlation is a relationship between two variables.
If data are represented as a set of ordered pairs (x, y), we take
x as the independent (or explanatory) variable and y as the dependent
(or response) variable,
e.g., height and weight.
Bivariate data – data that has two variables.
Correlation coefficient – a numerical measure used to determine whether two or more
variables are related and how strong their
relationship is.
• SCATTER PLOT – a visual tool that can help us analyze the relationship
between the two variables in bivariate data.
1. POSITIVE LINEAR CORRELATION
• Perfect Positive
• Strong Positive
• Weak Positive
2. NEGATIVE CORRELATION
• Perfect Negative
• Strong Negative
• Weak Negative
3. NO CORRELATION
[Figure: four scatter plots. (A) Positive Linear Correlation, (B) Negative Linear Correlation, (C) Non-linear Correlation, (D) Zero Correlation]
The scatter plot in (A) shows a linear relationship between the variables, number
of hours of sleep x and test scores y. It suggests that longer hours of sleep
yield higher test scores. This type of relationship is said to be positive and linear.
Figure (B) also shows a linear relationship but it shows a negative relationship, that
is, as the number of hours of travel x increases, the corresponding speed of travel
y becomes slower. This type is a negative and linear relationship. Figure (C) does
not show linearity as you can see that the points that represent the data do not
follow a linear pattern. Instead, these points seem to follow the shape of a
quadratic curve. This kind of relationship is non-linear and yet we do not claim
that the independent and dependent variables are not related at all because
we still see a pattern in the graph. We can still describe the graph: when the
number of products sold x is less than 25, the profit y increases, but when the
number of products sold x is more than 25, the profit y decreases. On the other hand,
(D) shows points that are just scattered in the plane and we see no pattern
showing the relationship between the independent variable (number of
hours of training) and the dependent variable (number of sales).
The correlation coefficient computed from the sample data
measures the strength and direction of a linear relationship
between two variables.
We use the symbol 𝜌 (rho) to represent the population
correlation coefficient and r for the sample correlation coefficient.
The correlation coefficient ranges between -1 and +1. A value close to +1 signifies
a strong positive linear relationship. A value close to -1 signifies a strong negative
linear relationship. When the variables concerned yield a coefficient value that is
close to 0, there is little or no linear relationship between the variables.
The correlation coefficient r is given by
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
where n is the number of pairs.
Values of r       Interpretation          Values of r        Interpretation
+1                Perfect Positive        -1                 Perfect Negative
0.71 to 0.99      Strong Positive         -0.71 to -0.99     Strong Negative
0.51 to 0.70      Moderately Positive     -0.51 to -0.70     Moderately Negative
0.31 to 0.50      Weak Positive           -0.31 to -0.50     Weak Negative
0.01 to 0.30      Negligible Positive     -0.01 to -0.30     Negligible Negative
0                 No correlation          0                  No correlation
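The computation of r and its verbal label can be sketched in a few lines. The data are the two judges' ratings from Example 19 in these notes; the function names `pearson_r` and `describe_r` are hypothetical, and the handling of values that fall between the table's bands (for example 0.705) is an assumption of this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient from the sums used in the working tables."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

def describe_r(r):
    """Map r to the verbal labels in the table above (band edges are an assumption)."""
    size = abs(r)
    if size == 0:
        return "No correlation"
    sign = "Positive" if r > 0 else "Negative"
    if size == 1:
        return f"Perfect {sign}"
    if size > 0.70:
        return f"Strong {sign}"
    if size > 0.50:
        return f"Moderately {sign}"
    if size > 0.30:
        return f"Weak {sign}"
    return f"Negligible {sign}"

# Ratings of the two judges from Example 19
judge_x = [9, 8, 8, 5, 7, 5]
judge_y = [10, 8, 9, 6, 9, 10]
r = pearson_r(judge_x, judge_y)
print(f"r = {r:.3f} -> {describe_r(r)}")
```

A verbal label only describes the size of r in the sample; whether the correlation is statistically significant still has to be checked with a hypothesis test.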
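Testing a Population Correlation Coefficient
Right-tailed Test:
H₀: 𝜌 ≤ 0 (No significant positive correlation)
H₁: 𝜌 > 0 (Significant positive correlation)
Left-tailed Test:
H₀: 𝜌 ≥ 0 (No significant negative correlation)
H₁: 𝜌 < 0 (Significant negative correlation)
Two-tailed Test:
H₀: 𝜌 = 0 (No significant correlation)
H₁: 𝜌 ≠ 0 (Significant correlation)
• Test Statistic:
$$t = r\sqrt{\frac{n - 2}{1 - r^2}}, \quad \text{with } n - 2 \text{ degrees of freedom}$$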
Example 12. The following are test scores of 15 students in abstract reasoning x
and arithmetic y. Determine whether the two variables have a significant positive
linear relationship. Use 𝛼 = 0.05.
x   100   95   78   85   74   79   89   90   100   88   85   50   54   75   65
y    80   90   80   70   55   40   50   88    50   72  100   74   30   90   88

n     x      y      xy     x²     y²
1     100    80
2     95     90
3     78     80
4     85     70
5     74     55
6     79     40
7     89     50
8     90     88
9     100    50
10    88     72
11    85     100
12    50     74
13    54     30
14    75     90
15    65     88
• To solve for the test statistic:
• To determine the critical value:
• Interpretation:
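The working table for Example 12 is left blank in these notes. As a hedged sketch, the remaining columns, the column totals, r, and the test statistic can be generated as follows. The critical value 1.771 is an assumption taken from a standard t table (df = 13, one-tailed, 𝛼 = 0.05); it does not appear in these notes.

```python
# Scores from Example 12
x = [100, 95, 78, 85, 74, 79, 89, 90, 100, 88, 85, 50, 54, 75, 65]
y = [80, 90, 80, 70, 55, 40, 50, 88, 50, 72, 100, 74, 30, 90, 88]

# Fill in the xy, x², and y² columns of the working table
print(f"{'n':>2} {'x':>5} {'y':>5} {'xy':>7} {'x²':>7} {'y²':>7}")
for i, (xi, yi) in enumerate(zip(x, y), start=1):
    print(f"{i:>2} {xi:>5} {yi:>5} {xi * yi:>7} {xi ** 2:>7} {yi ** 2:>7}")

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)
print(f"Σx = {sum_x}, Σy = {sum_y}, Σxy = {sum_xy}, Σx² = {sum_x2}, Σy² = {sum_y2}")

# Correlation coefficient and t statistic with n - 2 degrees of freedom
r = (n * sum_xy - sum_x * sum_y) / (
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
) ** 0.5
t = r * ((n - 2) / (1 - r ** 2)) ** 0.5

# Assumption: 1.771 is the one-tailed t-table value for df = 13, alpha = 0.05
critical_t = 1.771
print(f"r = {r:.3f}, t = {t:.3f}, reject H0: {t > critical_t}")
```

For a right-tailed test, H₀ is rejected only when t exceeds the critical value; otherwise there is not enough evidence of a significant positive linear relationship.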
Example 18. In a Biology class, students experiment on a number of cultures that are
grown in their laboratory. The number of bacteria y (in millions) and their ages x
(in days) are given below.
x    1     2     3     4     5     6     7     8
y    34    107   140   200   198   240   270   350
Compute the value of r and comment on your results. Use 𝛼 = 0.01.

n     x      y        xy       x²     y²
1     1      34       34       1      1,156
2     2      107      214      4      11,449
3     3      140      420      9      19,600
4     4      200      800      16     40,000
5     5      198      990      25     39,204
6     6      240      1,440    36     57,600
7     7      270      1,890    49     72,900
8     8      350      2,800    64     122,500
Σx = 36    Σy = 1,539    Σxy = 8,588    Σx² = 204    Σy² = 364,409
• To solve for the test statistic:
$$r = \frac{8(8{,}588) - (36)(1{,}539)}{\sqrt{\left[8(204) - 36^2\right]\left[8(364{,}409) - 1{,}539^2\right]}} \approx 0.9813$$
$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = 0.9813\sqrt{\frac{8 - 2}{1 - 0.9813^2}} \approx 12.49$$
• To determine the critical value:
With n − 2 = 6 degrees of freedom and 𝛼 = 0.01 (one-tailed), the critical value is t = 3.143.
• Interpretation:
At the 1% level of significance, the computed test statistic t = 12.49 is greater than the
critical value t = 3.143, so we reject the null hypothesis. Thus, there is a
significant positive correlation between the number of bacteria and their ages.
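The figures in this interpretation can be checked from the column totals of the working table. The sketch below rounds r to four decimal places before computing t, which is an assumption about how the value 12.49 was obtained; the critical value 3.143 is the one quoted in the interpretation.

```python
from math import sqrt

# Column totals from the Example 18 working table
n = 8
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = 36, 1539, 8588, 204, 364409

r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r4 = round(r, 4)                          # assumed rounding of r before computing t
t = r4 * sqrt((n - 2) / (1 - r4 ** 2))    # t statistic with n - 2 = 6 degrees of freedom

critical_t = 3.143                        # value quoted in the interpretation
print(f"r = {r4}, t = {t:.2f}, reject H0: {t > critical_t}")
```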
Example 19. In a beauty contest, the ratings of two judges are recorded.
Compute the correlation coefficient of their scores and state how these ratings
are related. Test at 𝛼 = 0.01.
Judge x    9     8     8     5     7     5
Judge y    10    8     9     6     9     10

n     x     y      xy     x²     y²
1     9     10     90     81     100
2     8     8      64     64     64
3     8     9      72     64     81
4     5     6      30     25     36
5     7     9      63     49     81
6     5     10     50     25     100
Σx = 42    Σy = 52    Σxy = 369    Σx² = 308    Σy² = 462
