
Chapter 3.

Comparing two populations

Contents
- Hypothesis tests for the difference between two population means: matched pairs
- Hypothesis tests for the difference between two population means: independent samples
  - Two normal populations with equal (unknown) variances
  - Two normal populations with known variances
  - Two nonnormal populations with unknown variances and large samples
  - Two Bernoulli populations
- Hypothesis tests for the ratio of two population variances: independent samples

Learning goals
At the end of this chapter you should be able to:
- Perform a test of hypothesis for the difference between two population means and for the ratio of two population variances
- Construct confidence intervals for the difference/ratio
- Distinguish situations where a test based on matched pairs is suitable from those where a test based on independent samples is
- Calculate the power of a test and the probability of a Type II error

References
- Newbold, P. Statistics for Business and Economics
  - Chapter 9 (9.6-9.9)
- Ross, S.
  - Chapter 10
Introduction

In this chapter, we examine the case where, instead of one random sample, two random samples are available from two populations, and the quantities of interest are:
- the difference between two population means
  - case of matched pairs
  - case of independent samples
- the ratio between two population variances
  - case of independent samples

We will draw on our experience from Chapters 1 and 2 to construct confidence intervals and perform tests of hypothesis for the above-mentioned differences/ratios of population parameters.
Tests for the difference between two means: matched pairs

Example: In a study aimed at assessing the relationship between a subject's brain activity while watching a TV commercial and the subject's subsequent ability to recall the contents of the commercial, subjects were shown commercials for two brands of each of ten products. For each commercial, the ability to recall 24 h later was measured, and each member of a pair of commercials was then designated high-recall or low-recall. The table below shows an index of the total amount of brain activity of subjects while watching these commercials.

| product: $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| high-recall: $x_i$ | 137 | 135 | 83 | 125 | 47 | 46 | 114 | 157 | 57 | 144 |
| low-recall: $y_i$ | 53 | 114 | 81 | 86 | 34 | 66 | 89 | 113 | 88 | 111 |
| diff.: $d_i = x_i - y_i$ | 84 | 21 | 2 | 39 | 13 | -20 | 25 | 44 | -31 | 33 |
Tests for the difference between two means: matched pairs

- Let $X$ be a population with mean $\mu_X$ and $Y$ be a population with mean $\mu_Y$.
- Suppose we have a random sample of $n$ matched pairs of observations from these two populations and let $d_1 = x_1 - y_1, d_2 = x_2 - y_2, \ldots, d_n = x_n - y_n$ represent the $n$ differences, with mean $\bar{d}$ and quasi-standard deviation $s_d$. Let us assume that the population of differences is normal.
- In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
  - The test statistic is
    $$T = \frac{\bar{D} - D_0}{s_D/\sqrt{n}} \overset{H_0}{\sim} t_{n-1}$$
  - The rejection region is (at significance level $\alpha$):
    $$RR_\alpha = \{t : t < -t_{n-1;\alpha/2} \ \text{or} \ t > t_{n-1;\alpha/2}\}$$


Tests for the difference between two means: matched pairs

Example (cont.)

Population: $D$ = difference between high- and low-recall, $D \sim N(\mu_X - \mu_Y, \sigma_D^2)$

SRS: $n = 10$

Sample: $\bar{d} = \frac{210}{10} = 21$, $s_d^2 = \frac{14202 - 10(21)^2}{10 - 1} = 1088$

Objective: test $H_0: \mu_X - \mu_Y \le 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test, with $D_0 = 0$)

Test statistic: $T = \frac{\bar{D} - D_0}{s_D/\sqrt{n}} \sim t_{n-1}$

Observed test statistic: with $D_0 = 0$, $n = 10$, $\bar{d} = 21$ and $s_d = \sqrt{1088} = 32.98$,
$$t = \frac{\bar{d} - D_0}{s_d/\sqrt{n}} = \frac{21}{32.98/\sqrt{10}} = 2.014$$
Tests for the difference between two means: matched pairs

Example (cont.)

$$p\text{-value} = P(T \ge 2.014) \in (0.025, 0.05)$$
because
$$t_{9;0.05} = 1.833 < t = 2.014 < 2.262 = t_{9;0.025}$$

Hence, given that p-value $< \alpha = 0.05$, we reject the null hypothesis at this level.

[Figure: $t_{n-1}$ density; the p-value is the area to the right of $t = 2.014$, which lies between the quantiles 1.833 and 2.262.]

Conclusion: The sample data give enough evidence to support the claim that, on average, brain activity is higher for the high-recall than for the low-recall group. If, in fact, the mean brain activity were the same for these two groups, then the probability of finding a sample result as extreme as or more extreme than that actually obtained would be between 0.025 and 0.05 (which is rather low).
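The arithmetic of this example can be checked with a minimal Python sketch. It uses only the data from the table above; the variable names are illustrative, and the two critical values are copied from the example rather than computed, since the standard library has no $t$ quantile function. Without intermediate rounding the statistic comes out at about 2.013, which agrees with the 2.014 above up to rounding of $s_d$.

```python
import math

# Brain-activity index from the table above (ten products).
high_recall = [137, 135, 83, 125, 47, 46, 114, 157, 57, 144]
low_recall = [53, 114, 81, 86, 34, 66, 89, 113, 88, 111]

# Paired differences d_i = x_i - y_i.
d = [x - y for x, y in zip(high_recall, low_recall)]
n = len(d)
d_bar = sum(d) / n

# Quasi-standard deviation: divide the sum of squares by n - 1.
s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))

D0 = 0
t_obs = (d_bar - D0) / (s_d / math.sqrt(n))

# Critical values t_{9;0.05} and t_{9;0.025}, copied from the example.
T9_005, T9_0025 = 1.833, 2.262
reject_upper_tail_5pct = t_obs > T9_005

print(d)
print(d_bar, round(s_d, 2), round(t_obs, 3), reject_upper_tail_5pct)
```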
Tests for the difference between two means: matched pairs
Example (cont.) in Excel: Go to menu Data, submenu Data Analysis, and choose the function t-Test: Paired Two Sample for Means.
The data are in columns A and B; the observed test statistic and the p-value are highlighted in yellow.
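Two-tail test for the difference between two means via CI: matched pairs

Example (cont.): Construct a 95% confidence interval for $\mu_X - \mu_Y$.

$$CI_{0.95}(\mu_X - \mu_Y) = \left(\bar{d} - t_{n-1;0.025}\frac{s_d}{\sqrt{n}},\ \bar{d} + t_{n-1;0.025}\frac{s_d}{\sqrt{n}}\right)$$
$$= \left(21 - 2.262\,\frac{32.98}{\sqrt{10}},\ 21 + 2.262\,\frac{32.98}{\sqrt{10}}\right) = (-2.59, 44.59)$$

Since the value 0 belongs to this interval, we cannot reject the null hypothesis of the equality of the two population means at the $\alpha = 0.05$ significance level. Note that this is the two-tail test, whose critical value is $t_{9;0.025} = 2.262$; the upper-tail test at the same level uses $t_{9;0.05} = 1.833$, which is why it can reject while the two-tail test does not.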
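A short Python sketch of the interval computation, under the same assumptions as the example: the summary values $\bar{d} = 21$ and $s_d^2 = 1088$ and the quantile $t_{9;0.025} = 2.262$ are taken from the slides, and the variable names are illustrative.

```python
import math

# Summary statistics and table quantile taken from the example.
d_bar = 21.0
s_d = math.sqrt(1088)
n = 10
T9_0025 = 2.262  # t_{9;0.025}, copied from the example

# Half-width of the 95% confidence interval.
half_width = T9_0025 * s_d / math.sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

# The two-tail test at alpha = 0.05 rejects only if 0 lies outside the interval.
contains_zero = ci[0] <= 0 <= ci[1]

print(tuple(round(v, 2) for v in ci), contains_zero)
```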
Tests for the difference between two means: independent normal samples, population variances equal

- Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$, both normally distributed with unknown, but equal population variances $\sigma^2 = \sigma_X^2 = \sigma_Y^2$.
- Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$.
- In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
  - The test statistic is
    $$T = \frac{\bar{X} - \bar{Y} - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$$
    where the estimator of the common population variance is
    $$s_p^2 = \frac{(n_1 - 1)s_X^2 + (n_2 - 1)s_Y^2}{n_1 + n_2 - 2}$$
    Note: the number of degrees of freedom is $n_1 + n_2 - 2$ (the total number of observations from both samples minus two; two dfs are lost to estimate $\mu_X$ and $\mu_Y$).
  - The rejection region is (at significance level $\alpha$):
    $$RR_\alpha = \{t : t < -t_{n_1+n_2-2;\alpha/2} \ \text{or} \ t > t_{n_1+n_2-2;\alpha/2}\}$$
Tests for the difference between two means: independent normal samples, population variances equal

Example: 9.8 (Newbold) A study attempted to assess the effect of the presence of a moderator on the number of ideas generated by a group. Groups of four members, with or without a moderator, were observed. For a random sample of four groups with a moderator, the mean number of ideas generated per group was 78.0, and the sample quasi-standard deviation was 24.4. For an independent sample of four groups without a moderator, the mean number of ideas generated was 63.5, and the sample quasi-standard deviation was 20.2. Assuming that the population distributions are normal with equal variances, test the null hypothesis ($\alpha = 0.1$) that the population means are equal against the alternative that the true mean is higher for groups with a moderator.

Population 1: $X$ = number of ideas in groups with a moderator, $X \sim N(\mu_X, \sigma_X^2)$
SRS: $n_1 = 4$
Sample: $\bar{x} = 78.0$, $s_x = 24.4$

Population 2: $Y$ = number of ideas in groups without a moderator, $Y \sim N(\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 4$
Sample: $\bar{y} = 63.5$, $s_y = 20.2$

Assume independent normal samples and $\sigma_X^2 = \sigma_Y^2 = \sigma^2$.
Tests for the difference between two means: independent normal samples, population variances equal

Example: 9.8 (Newbold cont.)

Objective: test $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test, with $D_0 = 0$)

Test statistic: $T = \frac{\bar{X} - \bar{Y}}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$

Observed test statistic: with $D_0 = 0$, $n_1 = 4$, $n_2 = 4$, $\bar{x} = 78.0$, $s_x = 24.4$, $\bar{y} = 63.5$, $s_y = 20.2$,
$$s_p^2 = \frac{(n_1 - 1)s_x^2 + (n_2 - 1)s_y^2}{n_1 + n_2 - 2} = \frac{(4 - 1)24.4^2 + (4 - 1)20.2^2}{4 + 4 - 2} = 501.7, \qquad s_p = \sqrt{501.7} = 22.4$$
$$t = \frac{\bar{x} - \bar{y}}{s_p\sqrt{1/n_1 + 1/n_2}} = \frac{78.0 - 63.5}{22.4\sqrt{1/4 + 1/4}} = 0.915$$

Rejection region: $RR_{0.1} = \{t : t > t_{6;0.1}\} = \{t : t > 1.440\}$

Since $t = 0.915 \notin RR_{0.1}$, we cannot reject the null hypothesis at a 10% level.

Conclusion: The sample data did not contain strong evidence suggesting that, on average, more ideas will be generated by groups with moderators. However, for such small sample sizes we cannot expect great power in the test, so quite large differences in the population means would be needed to reject the null hypothesis at low significance levels.
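The pooled-variance calculation can be sketched in Python as follows. The function name `pooled_t` and its parameter names are hypothetical, chosen for illustration; the critical value $t_{6;0.1} = 1.440$ is copied from the example because the standard library has no $t$ quantile function.

```python
import math

def pooled_t(x_bar, s_x, n1, y_bar, s_y, n2, d0=0.0):
    """Return (s_p^2, t) for the equal-variance two-sample t test."""
    # Pooled estimator of the common variance, with n1 + n2 - 2 df.
    sp2 = ((n1 - 1) * s_x ** 2 + (n2 - 1) * s_y ** 2) / (n1 + n2 - 2)
    t = (x_bar - y_bar - d0) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return sp2, t

# Summary statistics from Example 9.8.
sp2, t_obs = pooled_t(78.0, 24.4, 4, 63.5, 20.2, 4)

T6_01 = 1.440  # t_{6;0.1}, copied from the example
reject = t_obs > T6_01  # upper-tail test at alpha = 0.1

print(round(sp2, 1), round(t_obs, 3), reject)
```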
Two-tail test for the difference between two means via CI: independent normal samples, population variances equal

Example: 9.8 (Newbold cont.) Construct a 99% confidence interval for $\mu_X - \mu_Y$.

$$CI_{0.99}(\mu_X - \mu_Y) = \left(\bar{x} - \bar{y} \mp t_{n_1+n_2-2;0.005}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$$
$$= \left(78.0 - 63.5 \mp 3.707 \cdot 22.4\sqrt{\frac{1}{4} + \frac{1}{4}}\right) = (-44.22, 73.22)$$

Since the value 0 belongs to this interval, we cannot reject the null hypothesis of the equality of the two population means at the $\alpha = 0.01$ significance level.
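As a quick check of the interval, here is a minimal Python sketch. The summary statistics and the quantile $t_{6;0.005} = 3.707$ are taken from the example; the variable names are illustrative. Because the code does not round $s_p$ to 22.4, the endpoints can differ from the slide in the second decimal.

```python
import math

# Summary statistics from Example 9.8.
x_bar, s_x, n1 = 78.0, 24.4, 4
y_bar, s_y, n2 = 63.5, 20.2, 4
T6_0005 = 3.707  # t_{6;0.005}, copied from the example

# Pooled standard deviation and standard error of the difference.
s_p = math.sqrt(((n1 - 1) * s_x ** 2 + (n2 - 1) * s_y ** 2) / (n1 + n2 - 2))
std_err = s_p * math.sqrt(1 / n1 + 1 / n2)

half_width = T6_0005 * std_err
ci = (x_bar - y_bar - half_width, x_bar - y_bar + half_width)
contains_zero = ci[0] <= 0 <= ci[1]

print(tuple(round(v, 2) for v in ci), contains_zero)
```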
Tests for the difference between two means: independent large samples or two normal populations with known variances

- Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$.
- Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$ and:
  - Either that both $n_1$ and $n_2$ are large and $\sigma_X^2$ and $\sigma_Y^2$ are unknown
  - Or that $X$ and $Y$ are normally distributed and $\sigma_X^2$ and $\sigma_Y^2$ are known
- In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
  - The test statistic is:
    - Either
      $$Z = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$$
    - Or
      $$Z = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{\sigma_X^2}{n_1} + \frac{\sigma_Y^2}{n_2}}} \overset{H_0}{\sim} N(0, 1)$$
  - The rejection region is (at significance level $\alpha$):
    $$RR_\alpha = \{z : z < -z_{\alpha/2} \ \text{or} \ z > z_{\alpha/2}\}$$
Tests for the difference between two means: independent large samples or two normal populations with known variances

Example: 9.7 (Newbold) A survey of practicing certified public accountants on attitudes to women in the profession was carried out. Survey respondents were asked to react on a scale from one (strongly disagree) to five (strongly agree) to the statement: "Women in public accounting are given the same job assignments as men." For a sample of 186 male accountants, the mean response was 4.059 and the sample quasi-standard deviation was 0.839. For an independent random sample of 172 female accountants, the mean response was 3.680 and the sample quasi-standard deviation was 0.966. Test the null hypothesis ($\alpha = 0.0001$) that the two population means are equal against the alternative that the true mean is higher for male accountants.

Population 1: $X$ = response of a male accountant, $X \sim (\mu_X, \sigma_X^2)$
SRS: $n_1 = 186$
Sample: $\bar{x} = 4.059$, $s_x = 0.839$

Population 2: $Y$ = response of a female accountant, $Y \sim (\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 172$
Sample: $\bar{y} = 3.680$, $s_y = 0.966$
Tests for the difference between two means: independent large samples or two normal populations with known variances

Example: 9.7 (Newbold cont.)

Objective: test $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test, with $D_0 = 0$)

Test statistic: $Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$

Observed test statistic: with $D_0 = 0$, $n_1 = 186$, $n_2 = 172$, $\bar{x} = 4.059$, $s_x = 0.839$, $\bar{y} = 3.680$, $s_y = 0.966$,
$$z = \frac{\bar{x} - \bar{y}}{\sqrt{s_x^2/n_1 + s_y^2/n_2}} = \frac{4.059 - 3.680}{\sqrt{0.839^2/186 + 0.966^2/172}} = 3.95$$

Rejection region: $RR_{0.0001} = \{z : z > z_{0.0001}\}$, with $z_{0.0001} = 3.75$ as read from the table (the exact quantile is about 3.72)

Since $z = 3.95 \in RR_{0.0001}$, we reject the null hypothesis at a 0.01% level.

Conclusion: The data contain very strong evidence suggesting that the population mean response is higher for males than for females; that is, on average, males feel more strongly than females in the profession that women are given the same job assignments as men.
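The same computation can be sketched in Python using `statistics.NormalDist` from the standard library for the normal quantile and tail probability. The variable names are illustrative; the summary statistics are those of Example 9.7.

```python
import math
from statistics import NormalDist

# Summary statistics from Example 9.7.
x_bar, s_x, n1 = 4.059, 0.839, 186   # male accountants
y_bar, s_y, n2 = 3.680, 0.966, 172   # female accountants
alpha = 0.0001

# Large-sample z statistic with estimated variances (D0 = 0).
std_err = math.sqrt(s_x ** 2 / n1 + s_y ** 2 / n2)
z_obs = (x_bar - y_bar) / std_err

std_normal = NormalDist()                 # N(0, 1)
z_crit = std_normal.inv_cdf(1 - alpha)    # upper-tail critical value
p_value = 1 - std_normal.cdf(z_obs)       # P(Z >= z_obs)
reject = z_obs > z_crit

print(round(z_obs, 2), round(z_crit, 2), reject)
```

Rejecting because `z_obs > z_crit` is equivalent to rejecting because `p_value < alpha`; the sketch computes both so either rule can be inspected.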
Tests for the difference between two means: independent large samples or two normal populations with known variances

Example: 9.7 (Newbold cont.) Construct a 95% confidence interval for $\mu_X - \mu_Y$.

$$CI_{0.95}(\mu_X - \mu_Y) = \left(\bar{x} - \bar{y} \mp z_{0.025}\sqrt{\frac{s_x^2}{n_1} + \frac{s_y^2}{n_2}}\right)$$
$$= \left(4.059 - 3.680 \mp 1.96\sqrt{0.839^2/186 + 0.966^2/172}\right) = (0.19, 0.57)$$

Since the value 0 does not belong to this interval, we can reject the null hypothesis of the equality of the two population means at the $\alpha = 0.05$ significance level.
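A minimal Python sketch of this interval, again relying on `statistics.NormalDist` for $z_{0.025}$; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from Example 9.7.
x_bar, s_x, n1 = 4.059, 0.839, 186
y_bar, s_y, n2 = 3.680, 0.966, 172

z_0025 = NormalDist().inv_cdf(0.975)  # about 1.96
std_err = math.sqrt(s_x ** 2 / n1 + s_y ** 2 / n2)

diff = x_bar - y_bar
ci = (diff - z_0025 * std_err, diff + z_0025 * std_err)
contains_zero = ci[0] <= 0 <= ci[1]

print(tuple(round(v, 2) for v in ci), contains_zero)
```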
Tests for the difference between two proportions: independent large samples

- Let $X \sim \text{Bernoulli}(p_X)$ and let $Y \sim \text{Bernoulli}(p_Y)$, where $p_X$ and $p_Y$ are two population proportions of individuals with a characteristic of interest.
- Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$ and that both $n_1$ and $n_2$ are large.
- In a two-tail test $H_0: p_X = p_Y\ (= p_0)$ against $H_1: p_X \neq p_Y$:
  - The test statistic is:
    $$Z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}_0(1 - \hat{p}_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1),$$
    where
    $$\hat{p}_0 = \frac{n_1\hat{p}_X + n_2\hat{p}_Y}{n_1 + n_2}$$
  - The rejection region is (at significance level $\alpha$):
    $$RR_\alpha = \{z : z < -z_{\alpha/2} \ \text{or} \ z > z_{\alpha/2}\}$$


Tests for the difference between two proportions: independent large samples

Example: 9.9 (Newbold) In market research, when populations of individuals or households are surveyed by mail questionnaires, it is important to achieve as high a response rate as possible. One way to improve response might be to include in the questionnaire an initial inducement question, intended to increase the respondent's interest in completing the questionnaire. Questionnaires containing an inducement question on the importance of recreation facilities in a city were sent to a sample of 250 households, yielding 101 responses. Otherwise identical questionnaires, but without the inducement question, were sent to an independent random sample of 250 households, producing 75 responses. Test the null hypothesis that the two population proportions of responses would be the same against the alternative that the response rate would be higher when the inducement question is included.

Population 1: $X = 1$ if a person completes the questionnaire with the inducement question, and 0 otherwise; $X \sim \text{Bernoulli}(p_X)$
SRS: $n_1 = 250$
Sample: $\hat{p}_x = \frac{101}{250} = 0.404$

Population 2: $Y = 1$ if a person completes the questionnaire without the inducement question, and 0 otherwise; $Y \sim \text{Bernoulli}(p_Y)$
SRS: $n_2 = 250$
Sample: $\hat{p}_y = \frac{75}{250} = 0.300$
Tests for the difference between two proportions: independent large samples

Example: 9.9 (Newbold cont.)

Objective: test $H_0: p_X = p_Y$ against $H_1: p_X > p_Y$ (upper-tail test)

Test statistic: $Z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}_0(1 - \hat{p}_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$

Observed test statistic: with $n_1 = 250$, $n_2 = 250$, $\hat{p}_x = 0.404$, $\hat{p}_y = 0.300$,
$$\hat{p}_0 = \frac{n_1\hat{p}_x + n_2\hat{p}_y}{n_1 + n_2} = \frac{250(0.404) + 250(0.300)}{250 + 250} = 0.352$$
$$z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\hat{p}_0(1 - \hat{p}_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.404 - 0.300}{\sqrt{0.352(1 - 0.352)\left(\frac{1}{250} + \frac{1}{250}\right)}} = 2.43$$

$$p\text{-value} = P(Z \ge z) = P(Z \ge 2.43) = 0.0075$$

Since the p-value is very small, the null hypothesis can be rejected at any significance level bigger than 0.0075.

Conclusion: The sample data did contain very strong evidence suggesting that a higher response rate will be achieved when an inducement question is included than when it is not.
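The steps of this example can be sketched in Python with the standard library only. The function name `two_proportion_z` is hypothetical; it follows the pooled-proportion statistic defined above and takes the raw response counts from the example.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes_x, n1, successes_y, n2):
    """Return (p0_hat, z) for H0: p_X = p_Y using the pooled estimate."""
    p_x = successes_x / n1
    p_y = successes_y / n2
    # Pooled estimate of the common proportion under H0.
    p0 = (successes_x + successes_y) / (n1 + n2)
    std_err = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    return p0, (p_x - p_y) / std_err

# Counts from Example 9.9: 101 of 250 and 75 of 250 responses.
p0_hat, z_obs = two_proportion_z(101, 250, 75, 250)

# Upper-tail p-value P(Z >= z_obs) under the N(0, 1) approximation.
p_value = 1 - NormalDist().cdf(z_obs)

print(round(p0_hat, 3), round(z_obs, 2), round(p_value, 4))
```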
Tests for the difference between two proportions: independent large samples

Example: 9.9 (Newbold cont.) Construct a 95% confidence interval for $p_X - p_Y$.

$$CI_{0.95}(p_X - p_Y) = \left(\hat{p}_x - \hat{p}_y \mp z_{0.025}\sqrt{\hat{p}_0(1 - \hat{p}_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\right)$$
$$= \left(0.404 - 0.300 \mp 1.96\sqrt{0.352(1 - 0.352)\left(\frac{1}{250} + \frac{1}{250}\right)}\right) = (0.0203, 0.1877)$$

Since the value 0 does not belong to this interval, we can reject the null hypothesis of the equality of the two population proportions at the $\alpha = 0.05$ significance level.
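A minimal Python sketch of this interval. It follows the formula used in the example, which puts the pooled estimate $\hat{p}_0$ in the standard error; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Sample proportions and sizes from Example 9.9.
n1, n2 = 250, 250
p_x, p_y = 101 / n1, 75 / n2
p0 = (101 + 75) / (n1 + n2)  # pooled estimate, 0.352

z_0025 = NormalDist().inv_cdf(0.975)  # about 1.96
std_err = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

diff = p_x - p_y
ci = (diff - z_0025 * std_err, diff + z_0025 * std_err)
contains_zero = ci[0] <= 0 <= ci[1]

print(tuple(round(v, 4) for v in ci), contains_zero)
```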
Tests for the ratio of variances: normal samples

- Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$, both normally distributed.
- Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$.
- In a two-tail test $H_0: \sigma_X^2 = \sigma_Y^2\ (= \sigma^2)$ against $H_1: \sigma_X^2 \neq \sigma_Y^2$:
  - The test statistic is
    $$F = \frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,n_2-1}$$
  - The rejection region is (at significance level $\alpha$):
    $$RR_\alpha = \{f : f < F_{n_1-1,n_2-1;1-\alpha/2} \ \text{or} \ f > F_{n_1-1,n_2-1;\alpha/2}\}$$


F distribution

Let $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, Y_3, \ldots, Y_m$ denote independent rvs, all following an $N(0, 1)$ distribution. The random variable
$$F = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i^2}{\frac{1}{m}\sum_{i=1}^{m} Y_i^2}$$
follows an $F_{n,m}$ distribution with $n$ and $m$ degrees of freedom. We can view it as a ratio of two normalized chi-square rvs. This is where the result from the previous page comes from:
$$\frac{s_X^2}{s_Y^2} = \frac{\frac{1}{n_1-1}\overbrace{\frac{(n_1-1)s_X^2}{\sigma^2}}^{\chi^2_{n_1-1}}}{\frac{1}{n_2-1}\underbrace{\frac{(n_2-1)s_Y^2}{\sigma^2}}_{\chi^2_{n_2-1}}} \overset{H_0}{\sim} F_{n_1-1,n_2-1}$$

[Figure: F densities for (df1 = 30, df2 = 30), (df1 = 10, df2 = 15), (df1 = 8, df2 = 8) and (df1 = 5, df2 = 3), plotted over the range 0 to 8.]
Tests for the ratio of variances: normal samples

Example: 9.10 (Newbold) For a random sample of 17 newly issued AAA-rated industrial bonds, the quasi-variance of maturities (in years squared) was 123.35. For an independent random sample of 11 issued CCC-rated industrial bonds, the quasi-variance of maturities was 8.02. If the respective population variances are denoted $\sigma_X^2$ and $\sigma_Y^2$, perform a two-sided test at a 10% level.

Population 1: $X$ = maturity of AAA-rated bonds (in years), $X \sim N(\mu_X, \sigma_X^2)$
SRS: $n_1 = 17$
Sample: $s_x^2 = 123.35$

Population 2: $Y$ = maturity of CCC-rated bonds (in years), $Y \sim N(\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 11$
Sample: $s_y^2 = 8.02$


Tests for the ratio of variances: normal samples

Example: 9.10 (Newbold cont.)

Objective: test $H_0: \sigma_X^2 = \sigma_Y^2$ against $H_1: \sigma_X^2 \neq \sigma_Y^2$ (two-tail test)

Test statistic: $F = \frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,n_2-1}$

Observed test statistic: with $n_1 = 17$, $n_2 = 11$, $s_x^2 = 123.35$, $s_y^2 = 8.02$,
$$f = \frac{123.35}{8.02} = 15.38$$

Rejection region:
$$RR_{0.10} = \{f : f < F_{16,10;1-0.05}\} \cup \{f : f > F_{16,10;0.05}\} = \{f : f < 0.402\} \cup \{f : f > 2.83\}$$

Note: the quantile $F_{16,10;0.05} = 2.83$ is directly available from the F-table, but the other one is not. We can get it, however, using the following property of the F-distribution:
$$F_{n,m;\alpha} = \frac{1}{F_{m,n;1-\alpha}}$$
Hence
$$F_{16,10;1-0.05} = \frac{1}{F_{10,16;0.05}} = \frac{1}{2.49} = 0.402$$

We see that $f = 15.38 \in RR_{0.10}$.

Conclusion: There is very strong evidence that the population variances are different.
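The test can be sketched in Python as follows. The two upper-tail F quantiles are copied from the example (the standard library has no F quantile function), and the lower critical value is obtained with the reciprocal property stated above; the variable names are illustrative.

```python
# Quasi-variances and sample sizes from Example 9.10.
s2_x, n1 = 123.35, 17   # AAA-rated bonds
s2_y, n2 = 8.02, 11     # CCC-rated bonds

f_obs = s2_x / s2_y

# Upper-tail table quantiles copied from the example.
F_16_10_005 = 2.83   # F_{16,10;0.05}
F_10_16_005 = 2.49   # F_{10,16;0.05}

# Reciprocal property: F_{16,10;0.95} = 1 / F_{10,16;0.05}.
f_lower = 1 / F_10_16_005
f_upper = F_16_10_005

# Two-tail rejection region at alpha = 0.10.
reject = f_obs < f_lower or f_obs > f_upper

print(round(f_obs, 2), round(f_lower, 3), f_upper, reject)
```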
Two-tail test for the ratio of variances via confidence interval

Example: 9.10 (Newbold cont.) Construct a 90% confidence interval for the ratio of the variances.

$$CI_{0.90}\left(\frac{\sigma_X^2}{\sigma_Y^2}\right) = \left(\frac{s_x^2}{s_y^2}\cdot\frac{1}{F_{n_1-1,n_2-1;0.05}},\ \frac{s_x^2}{s_y^2}\cdot\frac{1}{F_{n_1-1,n_2-1;1-0.05}}\right)$$
$$= \left(\frac{123.35}{8.02}\cdot\frac{1}{2.83},\ \frac{123.35}{8.02}\cdot\frac{1}{0.402}\right) = (5.43, 38.26)$$

As we expected, the value 1 does not belong to this interval, so we can reject the null hypothesis of the equality of the two population variances at the $\alpha = 0.1$ significance level.
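A minimal Python sketch of this interval, using the two F quantiles exactly as rounded in the example (2.83 and 0.402); the variable names are illustrative.

```python
# Quasi-variances from Example 9.10.
s2_x, s2_y = 123.35, 8.02
ratio = s2_x / s2_y

# Quantiles as used in the example.
F_UPPER = 2.83    # F_{16,10;0.05}
F_LOWER = 0.402   # F_{16,10;0.95} = 1 / F_{10,16;0.05}

# Dividing by the larger quantile gives the lower endpoint, and vice versa.
ci = (ratio / F_UPPER, ratio / F_LOWER)

# H0: sigma_X^2 / sigma_Y^2 = 1 is rejected at alpha = 0.10
# if 1 lies outside the 90% interval.
contains_one = ci[0] <= 1 <= ci[1]

print(tuple(round(v, 2) for v in ci), contains_one)
```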
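Test statistics

| Parameter | Assumptions | Test statistic |
|---|---|---|
| $\mu_X - \mu_Y = D_0$ | Normal differences, matched pairs | $\frac{\bar{D} - D_0}{s_D/\sqrt{n}} \overset{H_0}{\sim} t_{n-1}$ |
| $\mu_X - \mu_Y = D_0$ | Normal pops., equal common var. | $\frac{\bar{X} - \bar{Y} - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$ |
| $\mu_X - \mu_Y = D_0$ | Normal pops., known vars. | $\frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{\sigma_X^2}{n_1} + \frac{\sigma_Y^2}{n_2}}} \overset{H_0}{\sim} N(0, 1)$ |
| $\mu_X - \mu_Y = D_0$ | Nonnormal pops., unknown vars., large samples | $\frac{\bar{X} - \bar{Y} - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$ |
| $p_X - p_Y = 0$ | Bernoulli pops., large samples | $\frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}_0(1 - \hat{p}_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$ |
| $\sigma_X^2/\sigma_Y^2 = 1$ | Normal pops. | $\frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,n_2-1}$ |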

Question: How would you define RR in upper- and lower-tail tests?
