
CHAPTER 8

INTRODUCTION TO
HYPOTHESIS TESTING
ILLI NURASHIKIN BT MOHD ISA (GS38619)
NAQIAH BT PUAAD (GS38686)
NUUR ADILA MOHAMAD ALI (GS)

Subtopics
8.1 The Logic of Hypothesis Testing
8.2 Uncertainty and Errors in Hypothesis Testing
8.3 More About Hypothesis Testing
8.4 Directional (One-Tailed) Hypothesis Tests
8.5 Concerns About Hypothesis Testing: Measuring Effect Size
8.6 Statistical Power

8.1 The Logic of Hypothesis Testing
Hypothesis testing
A statistical method that uses sample data to evaluate a hypothesis about a population.
It is one of the most commonly used inferential procedures.
Hypothesis tests are used to evaluate the results of a research study.

Example:
The purpose of the research is to determine the effect of a treatment on the individuals in the population.
Assume μ = 80 and σ = 20.

The figure above shows the structure of the research study from the point of view of the hypothesis test.
It is impossible to administer the treatment to the entire population, so the actual research study is conducted using a sample.

Four steps of a hypothesis test
STEP 1: State the hypothesis
STEP 2: Set the criteria for a decision
STEP 3: Collect data and compute sample statistics
STEP 4: Make a decision

Example 8.1 (pg 206)

For the general population:
Mean, μ = 80
Standard deviation, σ = 20
Study on a sample: n = 25

If the sample mean score is different from the mean for the general population of students, the researcher can conclude that the treatment has an effect.

Step 1: State the hypothesis

Null hypothesis (H0)
States that in the general population, there is no change, no difference, or no relationship.
Based on the experiment, H0 predicts that the independent variable (treatment) has no effect on the dependent variable (scores) for the population.

Alternative hypothesis (H1)
States that there is a change, a difference, or a relationship for the general population.
Based on the experiment, H1 predicts that the independent variable (treatment) does have an effect on the dependent variable.

Step 2: Set the criteria for a decision
The alpha level, or the level of significance, is a probability value that is used to define the concept of "very unlikely" in a hypothesis test.
Commonly used alpha values are:
α = .05 (5%)
α = .01 (1%)
α = .001 (0.1%)

The critical region is composed of the extreme sample values that are very unlikely (as defined by the alpha level) to be obtained if the null hypothesis is true.
The boundaries for the critical region are determined by the alpha level.
If sample data fall in the critical region, the null hypothesis is rejected.
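
The link between the alpha level and the boundaries of the critical region can be sketched in a few lines of Python. This is only an illustration: the function name critical_z is hypothetical, and it assumes a two-tailed test on a normal distribution unless told otherwise.

```python
from statistics import NormalDist

def critical_z(alpha, tails=2):
    """Return the positive z boundary of the critical region for a given alpha.

    Assumes a standard normal distribution; with tails=2 the alpha level
    is split evenly between the two tails.
    """
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

# The three alpha levels listed on the slide.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: critical region beyond z = ±{critical_z(alpha):.2f}")
```

A smaller alpha pushes the boundaries farther into the tails, so a more extreme sample is needed before the null hypothesis is rejected.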

Step 3: Collect data and compute sample statistics
Compare the data with the hypothesis.
Calculate a z-score that identifies where the sample mean is located in the hypothesized distribution.
z = (M - μ) / σ_M
z = (sample mean - hypothesized population mean) / (standard error between M and μ)

Step 4: Make a decision
The researcher uses the z-score value obtained in step 3 to make a decision about the null hypothesis according to the criteria established in step 2. There are two possible outcomes:
1. Sample data are located in the critical region. Reject the null hypothesis.
2. Sample data are not in the critical region. Fail to reject the null hypothesis.

Example:
Sample mean, M = 89,
Population mean, μ = 80,
n = 25, and σ = 20.
Standard error for the sample mean is:
σ_M = σ / √n = 20 / √25 = 20 / 5 = 4
z = (M - μ) / σ_M = (89 - 80) / 4 = 9 / 4 = 2.25
With an alpha level of α = .05, the z-score is beyond the boundary of 1.96, so the decision is to reject the null hypothesis.

Example:
Sample mean, M = 84,
Population mean, μ = 80,
n = 25, and σ = 20.
Standard error for the sample mean is:
σ_M = σ / √n = 20 / √25 = 20 / 5 = 4
z = (M - μ) / σ_M = (84 - 80) / 4 = 4 / 4 = 1.00
With an alpha level of α = .05, the z-score is within the boundaries of ±1.96, so the decision is to fail to reject the null hypothesis.
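
The two worked examples can be checked with a short Python sketch. The function name z_statistic and the constant CRITICAL_Z are illustrative choices, not part of the slides; the boundary of 1.96 assumes a two-tailed test with α = .05.

```python
from math import sqrt

def z_statistic(sample_mean, mu, sigma, n):
    """z = (M - mu) / sigma_M, where sigma_M = sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

CRITICAL_Z = 1.96  # two-tailed boundary for alpha = .05

for m in (89, 84):
    z = z_statistic(m, mu=80, sigma=20, n=25)
    decision = "reject H0" if abs(z) > CRITICAL_Z else "fail to reject H0"
    print(f"M = {m}: z = {z:.2f} -> {decision}")
```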

8.2 Uncertainty and Errors in Hypothesis Testing
TYPE I ERRORS
Occurs when a researcher rejects a null hypothesis that
is actually true.
In a typical research situation, a Type I error means that
the researcher concludes that a treatment does have an
effect when in fact, it has no effect.
It occurs when a researcher unknowingly obtains an
extreme, non-representative sample.
The alpha level determines the probability of a Type I
error.
How to avoid a Type I error?
Use a lower value for α. However, using a lower value for alpha means that you will be less likely to detect a true difference if one really exists.
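
The claim that alpha determines the probability of a Type I error can be illustrated with a small simulation. This is a sketch under stated assumptions: the population values (μ = 80, σ = 20, n = 25) are borrowed from Example 8.1, the function name, number of trials, and random seed are arbitrary, and the null hypothesis is true in every trial because no treatment is applied.

```python
import random
from math import sqrt

def type_i_error_rate(trials=4000, mu=80, sigma=20, n=25,
                      critical_z=1.96, seed=0):
    """Draw samples from an untreated population and count how often
    the two-tailed z-test (alpha = .05) rejects the true null hypothesis."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / standard_error
        if abs(z) > critical_z:
            rejections += 1  # H0 is true here, so every rejection is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Proportion of true-H0 samples rejected: {rate:.3f}")
```

The proportion should land close to .05, the alpha level built into the critical boundary.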

TYPE II ERRORS
Occurs when a researcher fails to reject a null
hypothesis that is really false.
In a typical research situation, a Type II error
means that the hypothesis test has failed to
detect a real treatment effect.
The research data do not show the results that the
researcher had hoped to obtain.
It occurs when the sample mean is not in the critical region even though the treatment has had an effect on the sample. This is most likely to happen when the effect of the treatment is relatively small.
How to avoid Type II error?
By ensuring the sample size is large enough to
detect a practical difference when one truly exists.
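
A simulation can show how often a real but small effect goes undetected, and how a larger sample helps. This is a hedged sketch: the 4-point treatment effect (a true mean of 84), the sample sizes, the trial count, and the function name are assumptions chosen for illustration, not values stated on the slide.

```python
import random
from math import sqrt

def type_ii_error_rate(true_mean, n, trials=4000, mu=80, sigma=20,
                       critical_z=1.96, seed=0):
    """Draw samples from a treated population (mean = true_mean) and count
    how often the two-tailed z-test fails to reject the false H0: mu = 80."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    failures = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / standard_error
        if abs(z) <= critical_z:
            failures += 1  # H0 is false here, so failing to reject is a Type II error
    return failures / trials

small_sample = type_ii_error_rate(true_mean=84, n=25)
large_sample = type_ii_error_rate(true_mean=84, n=100)
print(f"Type II error rate with n = 25:  {small_sample:.2f}")
print(f"Type II error rate with n = 100: {large_sample:.2f}")
```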

Selecting an alpha level
Alpha level
Helps determine the boundaries for the critical region
Determines the probability of a Type I error if the null hypothesis is true
The primary concern when selecting an alpha level is to minimize the risk of a Type I error.
However, a very small alpha also makes a real treatment effect harder to detect (a greater risk of a Type II error), so the best strategy is to balance the two risks with a conventional value such as α = .05, .01, or .001.

8.3 More About Hypothesis Testing

In the literature

Example (pg. 218)
Electrical stimulation on the scalp near the parietal lobe had a significant effect on the mathematics test scores for the students, z = 2.25, p < .05

A result is said to be significant when it is sufficient to reject the null hypothesis. Thus, a treatment has a significant effect if the decision from the hypothesis test is to reject H0.

The z indicates that a z-score was used to evaluate the sample data and that its value is 2.25.
p < .05 is the conventional way of specifying the alpha level that was used for the hypothesis test. It also acknowledges the possibility of a Type I error.

Figure 8.6
[Figure: distribution of sample means; the two shaded tails are labeled "Reject H0" (p < α) and the middle is labeled "Fail to reject H0" (p > α)]
Sample means that fall in the critical region (shaded areas) have a probability less than alpha (p < α). In this case, H0 should be rejected. Sample means that do not fall in the critical region have a probability greater than alpha (p > α).
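
The comparison of p with alpha can be made concrete by computing the tail probability for a z-score. This is a minimal sketch that assumes a two-tailed test and a normal distribution of sample means; the function name two_tailed_p is hypothetical, and the two z-scores are the ones from the worked examples in 8.1.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a z-score at least this extreme (in either tail)
    if the null hypothesis is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

ALPHA = 0.05

for z in (2.25, 1.00):
    p = two_tailed_p(z)
    relation = "p < alpha, reject H0" if p < ALPHA else "p > alpha, fail to reject H0"
    print(f"z = {z:.2f}: p = {p:.4f} ({relation})")
```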

Assumptions for Hypothesis tests with z-scores

Random Sampling
Participants used in the study were selected randomly. To generalize the
findings from the sample to the population, the sample must be
representative of the population. Random sampling helps to ensure that
it is representative.

The value of σ is unchanged by the treatment.
A critical part of the z-score formula in a hypothesis test is the standard error, σ_M. To compute the value of the standard error, we must know the sample size (n) and the population standard deviation (σ).

Normal sampling distribution
To evaluate hypotheses with z-scores, we have used the unit normal table to identify the critical region. This table can be used only if the distribution of sample means is normal.

A closer look at the z-score in a hypothesis test

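The z-score formula forms a ratio:
z = (M - μ) / σ_M = (obtained difference between the sample mean and the hypothesized mean) / (standard difference expected if the treatment has no effect)

For example, a z-score of z = 3.00 means that the obtained difference between the sample and the hypothesis is 3 times bigger than would be expected if the treatment had no effect. A discrepancy this large is a strong indication that the hypothesis is probably wrong.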

Factors that influence a hypothesis test
The final decision in a hypothesis test is determined by the value obtained for the z-score statistic. If the z-score is large enough to be in the critical region, we reject the null hypothesis and conclude that there is a significant treatment effect. Otherwise, we fail to reject the null hypothesis.
1. The variability of the scores
Higher variability can reduce the chances of finding a significant treatment effect.
Example: suppose the standard deviation in Example 8.1 is increased to σ = 30. With the increased variability, the standard error becomes
σ_M = σ / √n = 30 / √25 = 30 / 5 = 6 points
The new z-score becomes
z = (M - μ) / σ_M = (89 - 80) / 6 = 9 / 6 = 1.50
The z-score is no longer beyond the critical boundary of 1.96, so the statistical decision is to fail to reject the null hypothesis.
In general, increasing the variability of the scores produces a larger standard error and a smaller value (closer to zero) for the z-score.
2. The number of scores in the sample
Example 8.1 used n = 25 and produced a significant z-score of z = 2.25. Suppose the sample size is increased to n = 100 students. With n = 100, the standard error becomes
σ_M = σ / √n = 20 / √100 = 20 / 10 = 2
The z-score becomes
z = (M - μ) / σ_M = (89 - 80) / 2 = 9 / 2 = 4.50
Increasing the number of scores in the sample produces a smaller standard error and a larger value for the z-score.
If all other factors are held constant, the larger the sample size, the greater the likelihood of finding a significant treatment effect.
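
The influence of variability and sample size can be seen side by side by recomputing the z-score for the same 9-point mean difference. The sketch below uses only the numbers from the examples above; the function name z_for and the scenario labels are illustrative.

```python
from math import sqrt

def z_for(sample_mean, mu, sigma, n):
    """z-score for a sample mean, using the standard error sigma / sqrt(n)."""
    return (sample_mean - mu) / (sigma / sqrt(n))

# (sigma, n) for each scenario; M = 89 and mu = 80 are held constant.
scenarios = {
    "Example 8.1 (sigma = 20, n = 25)": (20, 25),
    "More variability (sigma = 30, n = 25)": (30, 25),
    "Larger sample (sigma = 20, n = 100)": (20, 100),
}

for label, (sigma, n) in scenarios.items():
    z = z_for(89, 80, sigma, n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"{label}: z = {z:.2f} ({verdict} at alpha = .05, two-tailed)")
```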
Learning check
1. A researcher conducts a hypothesis test with α = .05 to evaluate the effectiveness of a treatment. Assume that the sample mean produces a z-score of z = 2.17.
a) Do the data indicate that the treatment has a significant effect?
Answer: Yes. With α = .05, the critical region consists of z-scores in the tails beyond z = ±1.96. Because z = 2.17 is in the critical region, reject the null hypothesis.
2. In a research report, the result of a hypothesis test includes the phrase "z = 1.63, p > .05". This means that the test failed to reject the null hypothesis. (T/F)
Answer: True

8.4 Directional (One-Tailed) Hypothesis Tests
In a directional hypothesis test, or a one-tailed test, the statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean. That is, they make a statement about the direction of the effect.
Example 8.2 (pg. 224)

The Hypotheses for a Directional Test

Because a specific direction is expected for the treatment effect, it is possible for the researcher to perform a directional test.
The first step is to incorporate the directional prediction into the statement of the statistical hypotheses.
To express directional hypotheses in symbols, it is usually easiest to begin with the alternative hypothesis (H1).

H1: μ > 80 (with the stimulation, the average score is greater than 80)
The null hypothesis states that the stimulation does not increase scores,
H0: μ ≤ 80 (with the stimulation, the average score is not greater than 80)

The Critical Region for Directional Tests

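[Figure: distribution of sample means (σ_M = 4) centered at μ = 80 (z = 0); the critical region, labeled "Reject H0", lies in the right-hand tail beyond z = 1.65]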

The critical region is located entirely in the right-hand tail of the distribution, corresponding to sample means much greater than μ = 80.
Because the critical region is contained in one tail of the distribution, a directional test is commonly called a one-tailed test.

The alpha level is not divided between two tails but rather contained entirely in one tail.
Using α = .05, the whole 5% is located in one tail.
The z-score boundary for the critical region is z = 1.65.
A directional (one-tailed) test requires changes in the first two steps of the hypothesis testing procedure.
1. In the first step, the directional prediction is included in the statements of the hypotheses.
2. In the second step, the critical region is located entirely in one tail of the distribution.

For this example, the researcher obtained a mean of M = 87 for the 25 participants who received the brain stimulation, so the sample mean corresponds to a z-score of
z = (M - μ) / σ_M = (87 - 80) / 4 = 7 / 4 = 1.75
A z-score of z = 1.75 is in the critical region for a one-tailed test. Therefore, we reject the null hypothesis and conclude that the electrical stimulation produces a significant increase in mathematics test scores.

In the literature, this result would be reported as follows:
The stimulation produced a significant increase in scores, z = 1.75, p < .05, one tailed.

Comparison of One-Tailed versus Two-Tailed Tests
The major distinction between one-tailed and two-tailed tests is the criteria that they use for rejecting H0.
A one-tailed test allows you to reject the null hypothesis when the difference between the sample and the population is relatively small, provided that the difference is in the specified direction.
A two-tailed test requires a relatively large difference, independent of direction.

Example 8.3 (pg. 226)

With a two-tailed test, the 7-point difference between the sample mean and the hypothesized population mean (M = 87 and μ = 80) is not big enough to reject the null hypothesis: z = 1.75 does not reach the boundary of 1.96.
However, with a one-tailed test, the same 7-point difference is large enough to reject H0 and conclude that the treatment has a significant effect.
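
The different outcomes of the two tests for the same data can be reproduced with a short sketch. It assumes σ = 20 and n = 25 as in Example 8.1 and a predicted increase (upper-tail critical region) for the one-tailed case; the function name rejects_h0 is hypothetical. Note that the exact one-tailed boundary is about 1.645, which the slides round to 1.65.

```python
from math import sqrt
from statistics import NormalDist

def rejects_h0(z, alpha=0.05, tails=2):
    """Return True if the z-score falls in the critical region.

    tails=1 assumes the treatment is predicted to increase scores,
    so the whole critical region sits in the right-hand tail.
    """
    if tails == 1:
        return z > NormalDist().inv_cdf(1 - alpha)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

# M = 87, mu = 80, sigma = 20, n = 25
z = (87 - 80) / (20 / sqrt(25))
print(f"z = {z:.2f}")
print("one-tailed:", "reject H0" if rejects_h0(z, tails=1) else "fail to reject H0")
print("two-tailed:", "reject H0" if rejects_h0(z, tails=2) else "fail to reject H0")
```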

Learning check
1. A researcher selects a sample from a population with a mean of μ = 60 and administers a treatment to the individuals in the sample. If the researcher predicts that the treatment will increase scores, then
a) Using symbols, state the hypotheses for a one-tailed test.
Answer: H0: μ ≤ 60 and H1: μ > 60
2. A researcher obtains z = 2.43 for a hypothesis test. Using α = .01, the researcher should reject the null hypothesis for a one-tailed test but fail to reject for a two-tailed test. (T/F)
Answer: True. The one-tailed critical value is z = 2.33 and the two-tailed value is z = 2.58.

8.5 Concerns About Hypothesis Testing: Measuring Effect Size
Two limitations of using a hypothesis test to establish the significance of a treatment effect:
1. The focus of the hypothesis test is on the data rather than the hypothesis.
2. A significant treatment effect does not necessarily indicate a substantial treatment effect.

Measuring effect size
A measure of effect size is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used.
Cohen's d = mean difference / standard deviation
The standard deviation is included to standardize the size of the mean difference in much the same way that z-scores standardize locations in a distribution.

Example:

Part (a) shows the results of a treatment that produces a 15-point mean difference in SAT scores; before treatment the average SAT score is 500, and after treatment the average score is 515. Notice that the standard deviation for SAT scores is 100, so the 15-point difference appears to be small.
Cohen's d = mean difference / standard deviation = (515 - 500) / 100 = 0.15

In part (b), the treatment produces a 15-point mean difference in IQ scores. Before treatment the average IQ is 100, and after treatment the average is 115. Because IQ scores have a standard deviation of 15, the 15-point mean difference now appears to be large.
Cohen's d = mean difference / standard deviation = (115 - 100) / 15 = 1.00
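
The two calculations can be written as one small function. This is a sketch of the formula as used in this example (mean difference divided by the population standard deviation); the function and parameter names are illustrative.

```python
def cohens_d(mean_after, mean_before, sigma):
    """Cohen's d = mean difference / standard deviation."""
    return (mean_after - mean_before) / sigma

sat_d = cohens_d(515, 500, sigma=100)  # part (a): SAT scores
iq_d = cohens_d(115, 100, sigma=15)    # part (b): IQ scores

print(f"SAT example: d = {sat_d:.2f}")
print(f"IQ example:  d = {iq_d:.2f}")
```

The same 15-point difference gives very different values of d because each is measured against its own standard deviation.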

8.6 Statistical Power
An alternative approach to measuring the size or strength of a treatment effect is to measure the power of the statistical test.
Power is the probability that the test will correctly reject a false null hypothesis.

Example :

Consider a normal-shaped population with a mean of μ = 80 and a standard deviation of σ = 10. A researcher plans to select a sample of n = 25 individuals from this population and administer a treatment to each individual. It is expected that the treatment will have an 8-point effect; that is, the treatment will add 8 points to each individual's score.
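
The power for this example can be worked out step by step. The sketch below assumes a two-tailed test with α = .05, which the slide does not state explicitly; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, effect = 80, 10, 25, 8
alpha = 0.05  # assumption: two-tailed test at the .05 level

# Step 1: locate the critical boundaries under H0, in terms of sample means.
standard_error = sigma / sqrt(n)                # 10 / 5 = 2
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96
upper = mu + z_crit * standard_error            # about 83.92
lower = mu - z_crit * standard_error            # about 76.08

# Step 2: find how much of the treated distribution of sample means
# (centered at 80 + 8 = 88) falls in the critical region.
treated = NormalDist(mu + effect, standard_error)
power = (1 - treated.cdf(upper)) + treated.cdf(lower)

print(f"Critical boundaries for M: {lower:.2f} and {upper:.2f}")
print(f"Power = {power:.4f}")
```

Under these assumptions nearly all possible samples from the treated population land in the critical region, so the test is very likely to detect the 8-point effect.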

Power and effect size
As the effect size increases, the distribution of sample means on the right-hand side moves even farther to the right, so that more and more of the samples are beyond the z boundary.
As the effect size increases, the probability of rejecting H0 also increases, which means the power of the test increases.

Other factors that affect power
Sample size: a larger sample produces greater power for the hypothesis test.
Alpha level: reducing the alpha level for a hypothesis test also reduces the power of the test.
One-tailed versus two-tailed tests: changing from a regular two-tailed test to a one-tailed test increases the power of the hypothesis test.
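
Each of these factors can be explored with a small power function for the z-test. This is a sketch, not a general-purpose tool: it assumes a normal distribution of sample means and, for the one-tailed case, a treatment that increases scores. The baseline values (8-point effect, σ = 10, n = 25) come from the example above; the comparison values (a 4-point effect, n = 4, α = .01) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power(effect, sigma, n, alpha=0.05, tails=2):
    """Probability of rejecting H0 when the treatment adds `effect` points."""
    std_normal = NormalDist()
    # How far the treated distribution of sample means is shifted, in z units.
    shift = effect / (sigma / sqrt(n))
    if tails == 1:
        z_crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(z_crit - shift)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

baseline = power(effect=8, sigma=10, n=25)
print(f"Baseline (8 points, n = 25, alpha = .05, two-tailed): {baseline:.3f}")
print(f"Smaller effect (4 points): {power(4, 10, 25):.3f}")
print(f"Smaller sample (n = 4):    {power(8, 10, 4):.3f}")
print(f"Smaller alpha (.01):       {power(8, 10, 25, alpha=0.01):.3f}")
print(f"One-tailed test:           {power(8, 10, 25, tails=1):.3f}")
```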


Thank You
