
MATH& 146

Lesson 29
Section 4.2
Paired Data Inference

Textbook Prices
Are textbooks actually cheaper online? Here we
compare the price of textbooks at UCLA's
bookstore and prices at Amazon.com.

    dept     course  ucla   amazon  diff
1   Am Ind   C170    27.67  27.95   -0.28
2   Anthro   9       40.59  31.14    9.45
3   Anthro   135T    31.68  32.00   -0.32
4   Anthro   191HB   16.00  11.52    4.48
...
72  Wom Std  M144    23.76  18.72    5.04
73  Wom Std  285     27.70  18.22    9.48
Textbook Prices
Seventy-three UCLA courses were randomly
sampled in Spring 2010, representing less than
10% of all UCLA courses. A portion of the data is
shown below:
    dept     course  ucla   amazon  diff
1   Am Ind   C170    27.67  27.95   -0.28
2   Anthro   9       40.59  31.14    9.45
3   Anthro   135T    31.68  32.00   -0.32
4   Anthro   191HB   16.00  11.52    4.48
...
72  Wom Std  M144    23.76  18.72    5.04
73  Wom Std  285     27.70  18.22    9.48
Textbook Prices
Separate graphs of the UCLA and Amazon prices do
not reveal much. From them it is unclear whether a
price difference exists.

[Figure: two graphs, one of UCLA prices and one of Amazon prices]
Textbook Prices
Each textbook has two corresponding prices in the
data set: one for the UCLA bookstore and one for
Amazon. Therefore, each textbook price from the
UCLA bookstore has a natural correspondence
with a textbook price from Amazon.

When two sets of observations have this special
correspondence, they are said to be paired.
Textbook Prices
Two sets of observations are paired if each
observation in one set has a special
correspondence or connection with exactly one
observation in the other data set.

Examples of paired data include surveys of
husbands and wives, experiments on twins, and
before/after studies.
Textbook Prices
To analyze paired data, it is often useful to look at the
difference in outcomes of each pair of observations. In
the textbook data set, we look at the difference in
prices, which is represented as the diff variable in the
textbooks data.
    dept     course  ucla   amazon  diff
1   Am Ind   C170    27.67  27.95   -0.28
2   Anthro   9       40.59  31.14    9.45
3   Anthro   135T    31.68  32.00   -0.32
4   Anthro   191HB   16.00  11.52    4.48
...
72  Wom Std  M144    23.76  18.72    5.04
73  Wom Std  285     27.70  18.22    9.48
Textbook Prices
Here the differences are taken as UCLA - Amazon
for each book. It is important that we always
subtract using a consistent order; here Amazon
prices are always subtracted from UCLA prices.

    dept     course  ucla   amazon  diff
1   Am Ind   C170    27.67  27.95   -0.28
2   Anthro   9       40.59  31.14    9.45
3   Anthro   135T    31.68  32.00   -0.32
4   Anthro   191HB   16.00  11.52    4.48
...
72  Wom Std  M144    23.76  18.72    5.04
73  Wom Std  285     27.70  18.22    9.48
Example 1
The first difference shown in the table is computed
as 27.67 - 27.95 = -0.28. Verify the differences
are calculated correctly for observations 2 and 3.

    dept     course  ucla   amazon  diff
1   Am Ind   C170    27.67  27.95   -0.28
2   Anthro   9       40.59  31.14    9.45
3   Anthro   135T    31.68  32.00   -0.32
4   Anthro   191HB   16.00  11.52    4.48
...
72  Wom Std  M144    23.76  18.72    5.04
73  Wom Std  285     27.70  18.22    9.48
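
The check asked for in Example 1 can also be sketched in a few lines of Python. This is only an illustration using the six rows shown in the table, with each difference taken as UCLA minus Amazon (so rows 1 and 3 come out negative). The variable names are made up for this example, and differences are rounded to cents before comparing.

```python
# Rows shown in the table: (row number, ucla, amazon, diff as listed).
rows = [
    (1, 27.67, 27.95, -0.28),
    (2, 40.59, 31.14, 9.45),
    (3, 31.68, 32.00, -0.32),
    (4, 16.00, 11.52, 4.48),
    (72, 23.76, 18.72, 5.04),
    (73, 27.70, 18.22, 9.48),
]

# Always subtract in the same order: UCLA - Amazon.
computed = {row: round(ucla - amazon, 2) for row, ucla, amazon, _ in rows}

for row, ucla, amazon, listed in rows:
    # Compare with a half-cent tolerance to sidestep floating-point noise.
    status = "ok" if abs(computed[row] - listed) < 0.005 else "MISMATCH"
    print(f"row {row}: {ucla:.2f} - {amazon:.2f} = {computed[row]:.2f} ({status})")
```

A negative difference means the book was cheaper at the UCLA bookstore; a positive one means it was cheaper on Amazon.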
Example 2
Given this set of paired data:
Pairs     1  2  3  4  5
Sample A  3  6  1  4  7
Sample B  2  5  1  2  8

Find:
a) The paired differences, d = A - B, for this set of
data
b) The mean of the paired differences
c) The standard deviation of the paired differences
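
One way to check hand calculations for Example 2 is a short Python sketch using the standard library's `statistics` module. It assumes the differences are taken as d = A - B and that "standard deviation" means the sample standard deviation (dividing by n - 1); the variable names are chosen for this illustration.

```python
from statistics import mean, stdev

sample_a = [3, 6, 1, 4, 7]
sample_b = [2, 5, 1, 2, 8]

# a) Paired differences, always A - B, matched by position.
d = [a - b for a, b in zip(sample_a, sample_b)]

# b) Mean of the paired differences.
d_bar = mean(d)

# c) Sample standard deviation of the differences (divides by n - 1).
s_d = stdev(d)

print("differences:", d)
print("mean:", d_bar)
print("standard deviation:", round(s_d, 4))
```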

Textbook Prices
A histogram of these differences is shown below.
Using differences between paired observations is a
common and useful way to analyze paired data.

[Figure: histogram of the UCLA - Amazon price differences]
Textbook Prices
To analyze a paired data set, we use the exact
same tools that we developed earlier, only now we
apply them to the differences in the paired
observations.
Below are the summary statistics for the price
differences. There were 73 books, so there are 73
differences.
n_diff  x̄_diff  s_diff
73      12.76   14.26
The Hypotheses
When the conditions are met, we are ready to test
whether the mean of paired differences is
significantly different from zero. We test the
hypotheses
H0: μ_diff = 0
HA: μ_diff ≠ 0 (or >, or <)
The differences are almost always compared to
zero, but they can be compared to any number.

Paired Tests from Start to Finish

1) Verify you have paired data and compute the
differences.
2) State the hypotheses using μ_diff.
3) Check the independence and normality conditions
for the differences.
4) Use SE_diff = s_diff / √n to calculate the standard error
of the differences.

5) Calculate the test statistic using
T = (point estimate - null value) / SE = (x̄_diff - 0) / SE_diff
6) Calculate the p-value. Use df = n - 1, where n is
the number of pairs.
For left-tail tests, use tcdf(-999, T, df)
For right-tail tests, use tcdf(T, 999, df)
For two-tail tests, use tcdf(|T|, 999, df) × 2

7) Compare the p-value to α and make a conclusion.
If p-value < α, then this would be considered
enough evidence to reject H0.
If p-value ≥ α, then this would be considered
insufficient evidence to reject H0.
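
For readers working without a calculator, steps 4 through 7 can be sketched in Python. The helper `tcdf` below is a hypothetical stand-in for the calculator function of the same name: it integrates the t density between two bounds with Simpson's rule, which is an assumption of this sketch and not a claim about how any calculator computes it. The inputs are the textbook summary statistics (73 pairs, mean difference 12.76, standard deviation 14.26), and the significance level of 0.05 is chosen only for illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def tcdf(lower, upper, df, steps=200_000):
    """Area under the t density from lower to upper (Simpson's rule)."""
    h = (upper - lower) / steps  # steps must be even for Simpson's rule
    total = t_pdf(lower, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lower + i * h, df)
    return total * h / 3

def paired_t_test(n, x_bar, s, null_value=0.0, tail="two"):
    """Return (T, df, p-value) for a paired t-test from summary statistics."""
    se = s / math.sqrt(n)               # step 4: SE_diff = s_diff / sqrt(n)
    t_stat = (x_bar - null_value) / se  # step 5: test statistic
    df = n - 1                          # step 6: degrees of freedom
    if tail == "left":
        p = tcdf(-999, t_stat, df)
    elif tail == "right":
        p = tcdf(t_stat, 999, df)
    else:
        p = 2 * tcdf(abs(t_stat), 999, df)
    return t_stat, df, p

alpha = 0.05  # illustrative significance level
t_stat, df, p_value = paired_t_test(n=73, x_bar=12.76, s=14.26)
print(f"T = {t_stat:.2f}, df = {df}, p-value = {p_value:.3g}")
print("reject H0" if p_value < alpha else "fail to reject H0")  # step 7
```

Integrating the tail directly, as the calculator-style bounds suggest, avoids subtracting two nearly equal probabilities when T is large. Readers with SciPy installed could use `scipy.stats.t.sf` for the right-tail area instead of the hand-rolled integral.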

Example 3
Can the normal model be used to describe the
distribution of x̄_diff? Check whether the
independence and normality conditions have been
met.

n_diff  x̄_diff  s_diff
73      12.76   14.26
Example 4
Is there statistical evidence of a difference in
textbook prices between UCLA and Amazon? Set
up and implement a hypothesis test with p-values
to determine whether, on average, there is a
difference between Amazon's price for a book and
the UCLA bookstore's price.

n_diff  x̄_diff  s_diff
73      12.76   14.26
Example 5
Create a 95% confidence interval for the average
price difference between books at the UCLA
bookstore and books on Amazon.

Do your results agree with the hypothesis test of
the last example?

n_diff  x̄_diff  s_diff
73      12.76   14.26
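
A confidence interval of this kind is commonly built as point estimate ± t* × SE, with t* taken from the t-distribution with n - 1 degrees of freedom. The Python sketch below assumes that form. The helpers `t_cdf` and `t_critical` are hypothetical names written for this example: the first integrates the t density numerically, and the second finds t* by bisection.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2_000):
    """P(T <= x): 0.5 plus the signed area between 0 and x (Simpson's rule)."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Find t* with P(-t* <= T <= t*) = confidence, by bisection."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics for the price differences (UCLA - Amazon).
n, x_bar, s = 73, 12.76, 14.26
se = s / math.sqrt(n)
t_star = t_critical(0.95, n - 1)
lower, upper = x_bar - t_star * se, x_bar + t_star * se

print(f"t* = {t_star:.3f}")
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

The endpoints may differ by a cent or so from hand calculations, depending on how t* and the standard error are rounded. If the interval does not contain 0, that is consistent with rejecting H0: μ_diff = 0 in a two-tail test at the 5% level.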
Example 6
Following are the 95% confidence intervals of
UCLA - Amazon differences. Interpret each result.
a) ($9.44, $16.08)
b) ($9.44, $16.08)
c) ($9.44, $16.08)
