
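Compare the parameters of the original population of all individual scores
and the parameters of the sampling distribution of all possible $\bar{x}$'s.

Original population: mean = $\mu$ and std. dev. = $\sigma$

Sampling distribution of $\bar{x}$:
$\mu_{\bar{x}} = \mu$ (same mean as individual values)
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ (different std. dev., but related!)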
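The relationship between the two sets of parameters can be checked by simulation. This is only a rough sketch: the population (Normal, mean 70, std. dev. 14), the sample size, and the number of repetitions are assumptions picked for the demonstration, not values from the slides.

```python
import random
import statistics
from math import sqrt

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# Assumed (hypothetical) population parameters for the demonstration.
MU, SIGMA, N = 70.0, 14.0, 25
means = simulate_sample_means(MU, SIGMA, N, reps=5000)

mean_of_means = statistics.fmean(means)  # should land close to MU
sd_of_means = statistics.stdev(means)    # should land close to SIGMA / sqrt(N)
print(mean_of_means, sd_of_means, SIGMA / sqrt(N))
```

The mean of the simulated sample means should sit near the population mean, while their standard deviation should sit near the population std. dev. divided by the square root of n.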
Z Formula for X - value
$Z = \dfrac{X - \mu}{\sigma}$
Z = the number of standard deviations
that an X - value is from the mean.

Z Formula for Sample Means, $\bar{X}$
$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
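As a small illustration, the two Z formulas can be written as functions. The function names and the input numbers below are illustrative assumptions.

```python
from math import sqrt

def z_for_value(x, mu, sigma):
    """Number of standard deviations an individual value x is from the mean."""
    return (x - mu) / sigma

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Number of standard errors a sample mean x_bar is from the mean."""
    return (x_bar - mu) / (sigma / sqrt(n))

# Illustrative inputs: mean 40, std. dev. 8.
z_individual = z_for_value(52.0, 40.0, 8.0)            # (52 - 40) / 8
z_mean = z_for_sample_mean(43.0, 40.0, 8.0, 16)        # (43 - 40) / (8 / 4)
print(z_individual, z_mean)
```

Both calls give 1.5: a sample mean only 3 units from the mean is as unusual as an individual value 12 units away, because the standard error (2.0) is much smaller than the standard deviation (8.0).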
Reminder
When is the population of all possible $\bar{X}$ values Normal?
Anytime the original pop. is Normal (true for any n).
Anytime the original pop. is not Normal, but n is BIG (n > 30).

Reminder
When is the population of all possible $\bar{X}$ values NOT Normal?
Anytime the original population is not Normal AND n is NOT BIG.
Confidence Interval
and
Hypothesis
The Margin of Error of
a sample mean in estimating
the true mean of the population.
MOE at 95% confidence =
The maximum amount by which 95%
of ALL X-bars miss the population mean $\mu$.

With 95% confidence, the most by which
MY X-bar misses $\mu$.
$\mu$ = true population mean

Where are the 95% of all possible X-bars
that are closest to the population mean $\mu$?

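Illustration of Margin of Error
[Figure: sampling distribution of $\bar{X}$ with the middle .95 of the area between Z = -1.96 and Z = +1.96 and .025 in each tail; the matching cutoffs on the $\bar{X}$ axis are marked "?".]

Convert to the X-bar axis:
$Z = \dfrac{? - \mu}{\sigma / \sqrt{n}}$
$+1.96 = \dfrac{? - \mu}{\sigma / \sqrt{n}}$
$? = \mu + 1.96 \cdot \dfrac{\sigma}{\sqrt{n}}$
Margin of Error $= 1.96 \cdot \dfrac{\sigma}{\sqrt{n}}$

Margin of Error for 95% confidence:
$\text{MOE} = 1.96 \cdot \sigma_{\bar{x}}$
where 1.96 is the confidence multiplier and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ is the standard error of the mean.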
• More confidence? Larger MOE
• Less confidence? Smaller MOE
• Larger sample size? Smaller MOE
• Smaller sample size? Larger MOE
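The four answers above can be checked numerically. In this sketch the function name `margin_of_error` and the population std. dev. of 14 are assumptions for illustration; the multiplier comes from Python's `statistics.NormalDist` rather than from a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """MOE = Z(alpha/2) * sigma / sqrt(n) when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(n)

SIGMA = 14.0  # assumed population std. dev.
base = margin_of_error(0.95, SIGMA, 25)

more_confidence = margin_of_error(0.99, SIGMA, 25)   # larger than base
less_confidence = margin_of_error(0.90, SIGMA, 25)   # smaller than base
larger_sample = margin_of_error(0.95, SIGMA, 100)    # smaller than base
smaller_sample = margin_of_error(0.95, SIGMA, 9)     # larger than base
print(base, more_confidence, less_confidence, larger_sample, smaller_sample)
```

Because n sits under a square root, quadrupling the sample size (25 to 100) only halves the margin of error.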
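General form for margin of error
when $\sigma$ is known:
$\text{MOE} = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
where $Z_{\alpha/2}$ is the appropriate percentile from the
standard normal distribution, i.e., the Z table.

Explanation of symbol:
Z ~ N(0,1)
$Z_{\alpha/2}$ cuts off the top tail at area = $\alpha/2$.
[Figure: standard normal curve centered at 0 with area $\alpha/2$ in each tail and $1 - \alpha$ in the middle; $Z_{\alpha/2}$ marks the upper cutoff.]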
Examples

Amount of confidence   Area in each tail   Table value
$1 - \alpha$           $\alpha/2$          $Z_{\alpha/2}$
.95                    .0250               1.96
.90                    .0500               1.645
.80                    .1000               1.28
.98                    .0100               2.33
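The table values can be reproduced without a printed Z table. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_multiplier` is made up for this example.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Return Z(alpha/2): the value cutting off a top tail of area alpha/2."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# One row per confidence level: confidence, tail area, multiplier.
for confidence in (0.95, 0.90, 0.80, 0.98):
    tail = (1.0 - confidence) / 2.0
    print(f"{confidence:.2f}  {tail:.4f}  {z_multiplier(confidence):.3f}")
```

To three decimals the computed multipliers are 1.960, 1.645, 1.282 and 2.326; the table rounds the last two to 1.28 and 2.33.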
Confidence Interval:
Point estimate $\pm$ MOE

$(1-\alpha)100\%$ Confidence Interval:
Point estimate $\pm$ MOE
$(1-\alpha)100\%$ is the amount of confidence desired.
.95 confidence corresponds to $\alpha$ = .05 risk
Example: Exam 1 scores
X = individual score on Exam 1
Population: Exam 1 scores
$X \sim\ ?(\mu = ?,\ \sigma = 14.0)$, slightly skewed-left
What do we know?
Is the population of X-bars Normal?
Maybe. The CLT might apply for n = 25.
Estimate with 98% confidence the mean
score on Exam 1 if the pop. std. dev. is 14.0.
The mean of a sample of 25 exams is 71.04.
$\text{MOE} = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} = Z_{.01} \cdot \dfrac{14}{\sqrt{25}} = 2.33 \cdot \dfrac{14}{5} = 6.52$ pts
Note: We're assuming that n = 25 is
large enough for the CLT to apply.
98% confidence interval:
Point estimate $\pm$ MOE
$71.04 \pm 6.52$
or
64.52 to 77.56 pts
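The Exam 1 arithmetic can be reproduced in a few lines. This is a sketch only: the function name `z_interval` is an assumption, and the multiplier 2.33 is the rounded table value used in the example.

```python
from math import sqrt

def z_interval(x_bar, sigma, n, z):
    """Return (MOE, lower, upper) for a mean when sigma is known."""
    moe = z * sigma / sqrt(n)
    return moe, x_bar - moe, x_bar + moe

# Values from the Exam 1 example: x-bar = 71.04, sigma = 14.0, n = 25, Z = 2.33.
moe, lower, upper = z_interval(x_bar=71.04, sigma=14.0, n=25, z=2.33)
print(f"MOE = {moe:.2f}, CI = {lower:.2f} to {upper:.2f}")
# prints: MOE = 6.52, CI = 64.52 to 77.56
```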
Example: Exam 1 scores, continued . . .
Statement in the L.O.P.
I am 98% confident that
the true mean Exam 1 score ($\mu$)
in the population of all test-takers
is within the interval 64.52 to 77.56 points.
A statement in the L.O.P. contains four parts:
1. the level of confidence.
2. the parameter estimated in the L.O.P.
3. the population to which we generalize in the L.O.P.
4. the calculated interval.
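Think about this:
Before sampling, there is a 98% chance that we'll
pick an X-bar within MOE of $\mu$.
[Figure: sampling distribution of $\bar{X}$ with area .98 between $\mu$ - MOE and $\mu$ + MOE and .01 in each tail.]
If my X-bar is one of the 98% closest, then
X-bar +/- MOE will contain the true mean $\mu$.
[Figure: confidence interval (CI) drawn as a bracket from $\bar{X}$ - MOE to $\bar{X}$ + MOE.]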
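The "98% chance before sampling" idea can be illustrated by repeating the sampling many times and counting how often X-bar +/- MOE captures the true mean. This is a rough sketch: the Normal population with mean 70 and std. dev. 14 is an assumption, and 2.326 is the 98% multiplier to three decimals.

```python
import random
import statistics
from math import sqrt

def coverage(mu, sigma, n, z, reps, seed=1):
    """Fraction of simulated intervals x_bar +/- MOE that contain mu."""
    rng = random.Random(seed)
    moe = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - moe <= mu <= x_bar + moe:
            hits += 1
    return hits / reps

# Assumed population: Normal with mean 70 and std. dev. 14; samples of n = 25.
rate = coverage(mu=70.0, sigma=14.0, n=25, z=2.326, reps=4000)
print(rate)
```

The printed fraction should be close to .98: about 98% of the intervals contain the true mean, and about 2% miss it.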
What do we do if the
true population standard
deviation $\sigma$ is unknown?
Replace $\sigma$ with s.
Replace z with t.
Estimated margin of error (MOE)
when $\sigma$ is UN-known:
$\text{MOE} = t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}}$
where $s_{\bar{x}} = s / \sqrt{n}$ is the estimated standard error of the mean
and $t_{\alpha/2,\,n-1}$ is the appropriate percentile from
the t-distribution.

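Explanation of symbol:
$t_{\alpha/2,\,n-1}$ cuts off the top tail of the t-distribution at area = $\alpha/2$.
Use the t-table to find the value.
[Figure: t-distribution centered at 0 with area $\alpha/2$ in each tail and $1 - \alpha$ in the middle; $t_{\alpha/2,\,n-1}$ marks the upper cutoff.]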

Table gives right-tail area.
(e.g., for a right-tail area
of 0.025 and d.f. = 15,
the t value is 2.131.)
Right-tail area: 0.1 0.05 0.025 0.01 0.005

d.f. = 1 3.078 6.314 12.706 31.821 63.656
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
30 1.310 1.697 2.042 2.457 2.750
31 1.309 1.696 2.040 2.453 2.744
32 1.309 1.694 2.037 2.449 2.738
33 1.308 1.692 2.035 2.445 2.733
34 1.307 1.691 2.032 2.441 2.728
35 1.306 1.690 2.030 2.438 2.724


36 1.306 1.688 2.028 2.434 2.719
37 1.305 1.687 2.026 2.431 2.715
38 1.304 1.686 2.024 2.429 2.712
39 1.304 1.685 2.023 2.426 2.708
40 1.303 1.684 2.021 2.423 2.704
41 1.303 1.683 2.020 2.421 2.701
42 1.302 1.682 2.018 2.418 2.698
43 1.302 1.681 2.017 2.416 2.695
44 1.301 1.680 2.015 2.414 2.692
45 1.301 1.679 2.014 2.412 2.690

97 1.290 1.661 1.985 2.365 2.627
98 1.290 1.661 1.984 2.365 2.627
99 1.290 1.660 1.984 2.365 2.626
100 1.290 1.660 1.984 2.364 2.626
d.f. = ∞ 1.282 1.645 1.960 2.326 2.576

Want 95% CI,
n = 20,
$\alpha/2$ = .025
d.f. = n-1 = 19
$t_{.025,\,19} = 2.093$


Want 98% CI,
n = 33,
$\alpha/2$ = .01
d.f. = n-1 = 32
$t_{.01,\,32} = 2.449$

Want 90% CI,
n = 600,
$\alpha/2$ = .05
d.f. = n-1 = 599
$t_{.05,\,\infty} = 1.645$
Same as Z
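The three lookups can be mimicked with a small table in code. This is a sketch under stated assumptions: only a few rows of the t-table are copied in, the helper name `t_multiplier` is made up, and any d.f. beyond 100 falls back to the last (Z) row, as in the n = 600 example.

```python
# Column headings of the t-table: right-tail areas.
RIGHT_TAIL_AREAS = (0.1, 0.05, 0.025, 0.01, 0.005)

# Selected rows copied from the t-table: d.f. -> t values per column.
T_TABLE = {
    15: (1.341, 1.753, 2.131, 2.602, 2.947),
    19: (1.328, 1.729, 2.093, 2.539, 2.861),
    24: (1.318, 1.711, 2.064, 2.492, 2.797),
    32: (1.309, 1.694, 2.037, 2.449, 2.738),
}
Z_ROW = (1.282, 1.645, 1.960, 2.326, 2.576)  # last row of the table

def t_multiplier(confidence, n):
    """Look up t(alpha/2, n-1) in the partial table above."""
    tail = round((1.0 - confidence) / 2.0, 4)   # right-tail area alpha/2
    column = RIGHT_TAIL_AREAS.index(tail)
    df = n - 1
    if df in T_TABLE:
        return T_TABLE[df][column]
    if df > 100:
        return Z_ROW[column]                    # large d.f.: same as Z
    raise KeyError(f"d.f. = {df} is not in this partial table")

print(t_multiplier(0.95, 20), t_multiplier(0.98, 33), t_multiplier(0.90, 600))
```

The three calls return 2.093, 2.449 and 1.645, matching the three lookups worked through with the table.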


Comparison of Z and t

                          t-dist.         Z
Bell shaped, symmetric    Yes             Yes
Mean                      $\mu = 0$       $\mu = 0$
Std. dev.                 $\sigma > 1$    $\sigma = 1$
Degrees of freedom        n-1

As n-1 increases, $t_{n-1}$ approaches Z.
Example: Exam 1 scores
Estimate with 98% confidence the mean
score on Exam 1.

In a sample of 25 exams, the mean is 71.04
and the std dev is 16.8.
$(1-\alpha)100\%$ = 98%, n = 25, $\bar{X}$ = 71.04, s = 16.8
$\text{MOE} = t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}} = t_{.01,\,24} \cdot \dfrac{16.8}{\sqrt{25}} = 2.492 \cdot \dfrac{16.8}{5} = 8.37$ pts
Example, continued . . .
98% confidence interval:
Point estimate $\pm$ MOE
$71.04 \pm 8.37$
or
62.67 to 79.41 pts
Statement in the L.O.P.
I am 98% confident that
the true mean Exam 1 score ($\mu$)
in the population of all test-takers
is within the interval 62.67 to 79.41 points.
A statement in the L.O.P. contains four parts:
1. the level of confidence.
2. the parameter estimated in the L.O.P.
3. the population to which we generalize in the L.O.P.
4. the calculated interval.
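The same calculation with s and t in place of the known std. dev. and Z might be sketched as follows. The function name `t_interval` is an assumption; 2.492 is the t-table value for a right-tail area of .01 with 24 d.f.

```python
from math import sqrt

def t_interval(x_bar, s, n, t):
    """Return (MOE, lower, upper) for a mean when sigma is unknown."""
    moe = t * s / sqrt(n)
    return moe, x_bar - moe, x_bar + moe

# Values from the example: x-bar = 71.04, s = 16.8, n = 25, t = 2.492.
moe, lower, upper = t_interval(x_bar=71.04, s=16.8, n=25, t=2.492)
print(f"MOE = {moe:.2f}, CI = {lower:.2f} to {upper:.2f}")
# prints: MOE = 8.37, CI = 62.67 to 79.41
```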
Compare the two 98% MOE values.
Sample 1. True $\sigma$ was known:
$\text{MOE} = 2.33 \cdot \dfrac{14}{5} = 6.52$ pts
Sample 2. True $\sigma$ was NOT known:
$\text{MOE} = 2.492 \cdot \dfrac{16.8}{5} = 8.37$ pts
Margin of Error for 95% confidence:
$\text{MOE} = 1.96 \cdot \dfrac{\sigma}{\sqrt{n}}$
To get a smaller margin of error:
• Increase n.
• Decrease the amount of confidence desired.
What sample size is needed to estimate
the mean mpg of Toyota Camrys with an
MOE of 0.2 mpg at 90% confidence if the
pop. std. dev. is 0.88 mpg?
$\text{MOE} = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
$0.2 = 1.645 \cdot \dfrac{0.88}{\sqrt{n}}$
$n = \dfrac{1.645^2 \cdot 0.88^2}{0.2^2} = 52.39$
Round UP, use 53 Camrys.
Sample Size Determinations
Problem: What sample size n is needed to have
a margin of error of MOE with
$(1-\alpha)100\%$ confidence?

$\text{MOE} = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$

$n = \left( \dfrac{z_{\alpha/2} \cdot \sigma}{\text{MOE}} \right)^2$
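The sample size formula, including the round-up step, could be coded as below. A sketch only: the function name `required_sample_size` is an assumption, and the multiplier is computed with `statistics.NormalDist` instead of being read from the Z table.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(confidence, sigma, moe):
    """Smallest n with Z(alpha/2) * sigma / sqrt(n) <= moe (always round UP)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((z * sigma / moe) ** 2)

# Camry example: 90% confidence, sigma = 0.88 mpg, desired MOE = 0.2 mpg.
n_camrys = required_sample_size(0.90, 0.88, 0.2)
print(n_camrys)  # 53
```

Rounding up rather than to the nearest integer matters: rounding down would give a margin of error slightly larger than the one requested.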
Confidence Interval
for a Parameter

point estimate $\pm$ margin of error

Choose the appropriate statistic (point estimate) and its MOE
based on the problem that is to be solved.
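Estimation of Parameters
A $(1-\alpha)100\%$ confidence interval estimate of a parameter is
point estimate $\pm$ MOE

Population Parameter | Point Estimate | Margin of Error at $(1-\alpha)100\%$ confidence
Mean, $\mu$, if $\sigma$ is known: | $\bar{X}$ | $\text{MOE} = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Mean, $\mu$, if $\sigma$ is unknown: | $\bar{X}$ | $\text{MOE} = t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}}$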
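Estimation of Parameters
A $(1-\alpha)100\%$ confidence interval estimate of a parameter is
point estimate $\pm$ MOE

Population Parameter | Point Estimate | Margin of Error at $(1-\alpha)100\%$ confidence
Mean, $\mu$, if $\sigma$ is known: | $\bar{X}$ | $\text{MOE} = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Mean, $\mu$, if $\sigma$ is unknown: | $\bar{X}$ | $\text{m.o.e.} = t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}}$
Proportion, p: | $\hat{p} = X / n$ | $\text{m.o.e.} = Z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Diff. of two proportions, $p_1 - p_2$: | $\hat{p}_1 - \hat{p}_2$ | $\text{m.o.e.} = Z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Diff. of two means, $\mu_1 - \mu_2$ (for large sample sizes only): | $\bar{x}_1 - \bar{x}_2$ | $\text{m.o.e.} = Z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Slope of regression line, $\beta$: | b | $\text{m.o.e.} = t_{\alpha/2,\,n-2} \cdot \dfrac{s}{\sqrt{\text{Equ. 2}}}$
Mean from a regression when $X = x^*$: | $\hat{y} = a + b x^*$ | $\text{m.o.e.} = t_{\alpha/2,\,n-2} \cdot s \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\text{Equ. 2}}}$

where $s = \sqrt{\text{MSE}}$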
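A few of the Z-based margins of error from the table can be written as small functions. This is only a sketch: the function names and every input number are hypothetical and serve purely to exercise the formulas; the regression rows are left out because they depend on "Equ. 2", which is not shown here.

```python
from math import sqrt

def moe_proportion(p_hat, n, z):
    """Z * sqrt(p_hat * (1 - p_hat) / n) for a single proportion."""
    return z * sqrt(p_hat * (1.0 - p_hat) / n)

def moe_diff_proportions(p1, n1, p2, n2, z):
    """Z * sqrt of the summed variance terms for a difference of two proportions."""
    return z * sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)

def moe_diff_means(s1, n1, s2, n2, z):
    """Z * sqrt(s1^2/n1 + s2^2/n2) for a difference of two means (large samples)."""
    return z * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Hypothetical data: 50 successes in 100 trials, 95% confidence (Z = 1.96).
p_hat = 50 / 100
single = moe_proportion(p_hat, 100, z=1.96)
print(single)  # roughly 0.098
```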
Making a decision using a CI.
Hypothesized mean:
A value of the parameter that we believe is, or ought to be,
the true value of the mean.
We gather evidence and make a decision about this hypothesis.
Question of interest: Is there evidence that the true mean is
different from the hypothesized mean?

Making a decision using a CI.
• If the hypothesized value is inside the CI,
then it IS a plausible value.
Reach a vague conclusion.
• If the hypothesized value is not in the CI,
then it IS NOT a plausible value.
Reject it! Reach a strong conclusion.
Take appropriate action!
[Figure: population of all possible X-bar values, assuming . . . ., plotted on the X-axis (ticks from 5 to 23) with the middle 95% marked. My ONE sample mean is 7.9 and my ONE confidence interval runs from 5.6 to 10.2.]
The true population mean is hypothesized to be 13.0.
13.0 does NOT fall in my confidence interval;
it is not a plausible value for the true mean.
Conclusion: The hypothesis is wrong. The true mean is not 13.0!
I am 95% confident that . . . .
The data convince me the true mean is smaller than 13.0.
[Figure: the same X-axis (ticks from 5 to 23), now showing "a more likely location of the population" centered near my sample mean of 7.9, with my confidence interval from 5.6 to 10.2.]
The true population mean is hypothesized to be 13.0.
13.0 does NOT fall in my confidence interval;
it is not a plausible value for the true mean.
Conclusion: The hypothesis is wrong. The true mean is not 13.0!
I am 95% confident that . . . .
The data convince me the true mean is smaller than 13.0.
Net weight of potato chip bags
should be 16.00 oz.
FDA inspector takes a sample.

$\bar{X}$ = 15.88
If the 95% CI is (15.81 to 15.95),
then 16.00 is NOT in the interval.
Therefore, reject 16.00 as a plausible
value. Take action against the company.

$\bar{X}$ = 15.88
If the 95% CI is (15.71 to 16.05),
then 16.00 IS in the interval.
Therefore, 16.00 is a plausible value.
Take no action.
Net weight of potato chip bags
should be 16.00 oz.
FDA inspector takes a sample.

$\bar{X}$ = 16.10
If the 95% CI is (16.05 to 16.15),
then 16.00 is NOT in the interval.
Therefore, reject 16.00 as a plausible
value. But, the FDA does not care that
the company is giving away potato chips.
The FDA would obviously take no action
against the company.
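The decision rule in the three potato chip cases can be sketched as a small function. The function name `decide` and the wording of its return values are assumptions for illustration; whether to act on a rejection remains a judgment call outside the code.

```python
def decide(hypothesized, lower, upper):
    """Classify a hypothesized value against a confidence interval."""
    if lower <= hypothesized <= upper:
        return "plausible"
    if hypothesized > upper:
        return "reject: true mean appears smaller"
    return "reject: true mean appears larger"

# The three 95% confidence intervals from the potato chip example.
for ci in [(15.81, 15.95), (15.71, 16.05), (16.05, 16.15)]:
    print(ci, decide(16.00, *ci))
```

Only the first case (true mean apparently below 16.00 oz.) leads to action against the company; the third also rejects 16.00 but in the direction the FDA does not object to.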
Hypothesis Testing
for Means
Court case
Hypothesis: Defendant is innocent.
Alternative: Defendant is guilty.
Decisions: Based on the sample data.
Reject Innocence: Declare Guilty. Person goes to jail!
Do not Reject Innocence: Declare Not Guilty. Person goes free!
Court case
$H_0$: Defendant is innocent.
$H_1$: Defendant is guilty.
Decisions: Based on the sample data.
Reject Innocence: Declare Guilty. Person goes to jail!
Do not Reject Innocence: Declare Not Guilty. Person goes free!
Types of Errors in a Court Case
Type I: Send an innocent person to jail.
Type II: Let a guilty person go free.
$\alpha$ = level of risk deemed reasonable
for the occurrence of a Type I error
= the point of reasonable doubt.
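In the statistical setting, a Type I error means rejecting a hypothesized mean that is actually correct, and it should happen at about the chosen risk level. The simulation below is a rough sketch of that idea: the population (Normal, mean 40, std. dev. 8), the sample size of 16, and the function name are assumptions for the demonstration.

```python
import random
import statistics
from math import sqrt

def type_one_error_rate(mu0, sigma, n, z, reps, seed=2):
    """Fraction of samples that wrongly reject mu0 when mu0 is the true mean."""
    rng = random.Random(seed)
    moe = z * sigma / sqrt(n)
    rejections = 0
    for _ in range(reps):
        x_bar = statistics.fmean(rng.gauss(mu0, sigma) for _ in range(n))
        if abs(x_bar - mu0) > moe:   # mu0 falls outside x_bar +/- MOE
            rejections += 1
    return rejections / reps

# Risk level .05 corresponds to the multiplier 1.96.
rate = type_one_error_rate(mu0=40.0, sigma=8.0, n=16, z=1.96, reps=4000)
print(rate)
```

The printed rate should be close to .05: the "innocent" hypothesis is sent to jail in roughly 5% of the samples.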
Statistical Inference Methods:
Two methods; both give the same result.
Reject the hypothesized value if:
1. it is outside the confidence interval.
2. the p-value is less than the
user specified $\alpha$-level. (p-value < $\alpha$-level)
Testing the mean MPG of a car population
$H_0$: $\mu = 40.0$
$H_1$: $\mu \neq 40.0$
$H_0$: The population mean is 40.0
$H_1$: The population mean is NOT 40.0

If p-value > $\alpha$-level, or 40.0 is inside the CI,
then 40.0 is a plausible value for $\mu$.
$\bar{X}$ is in the vicinity of 40.0. It's too close to call.
Decision: Do NOT reject $H_0$.
Conclusion: Not sufficient evidence to say that $H_1$ is correct.

If p-value < $\alpha$-level, or 40.0 is outside the CI,
then 40.0 is NOT plausible.
$\bar{X}$ is far enough away from 40.0 to convince me.
Decision: Reject $H_0$!
Conclusion: The true population mean is NOT 40.0. Take action!
1. Confidence Interval Method
Two tailed test:
Is the mean something
other than 40.0?
Hypothesized mean: 40.0

Desired $\alpha$-level:   Size of CI to use:
$\alpha$                  $1 - \alpha$
.05                       .95
.10                       .90
.01                       .99

Result: Each tail has half of $\alpha$.
2. p-value Method
p-value = the probability of observing
a statistic value that is more extreme
(more contradictory to hypothesis) than
the value observed, assuming that the
hypothesized parameter value is correct.
Calculate p-value using the
appropriate distribution.
Decision rule: If p-value < $\alpha$-level,
reject the hypothesized value.
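p-Value:
Two tailed test:
Is the mean something other than 40.0?
Hypo. mean: 40.0, $\bar{X}$ = 42.6
[Figure: sampling distribution of $\bar{X}$ centered at 40 (Z axis from -4.0 to 4.0); an area of p-value / 2 lies in each tail, below 37.4 and above 42.6, i.e., 2.6 units on either side of 40.]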
p-value is a measure of how far the observed estimate
of the true parameter is from the hypothesized value.
Small p-value: Far away! (inconsistent) Reject hypothesized value.
Large p-value: Fairly close (consistent). Do not reject hypothesized value.
Problem 1, using p-Value
Hypothesized mean: 40.0.
Change ads if MPG is off in either direction.
Sample results: $\bar{X}$ = 43.0, s = 7.2
What distribution should be used?
$X \sim N(\mu = ?,\ \sigma = 8.0)$, n = 16
$\bar{X} \sim N(\mu_{\bar{X}} = ?,\ \sigma_{\bar{X}} = 2.0)$
$Z = \dfrac{43.0 - 40.0}{2.0} = 1.50$
[Figure: standard normal curve (Z axis from -4.0 to 4.0, with 40 and 43.0 marked on the $\bar{X}$ axis); the area .0668 above Z = 1.50 is "more extreme than 3.0 units" and the matching area below Z = -1.50 is "also more extreme than 3.0 units".]
p-value = .0668 $\times$ 2 = .1336
Pick $\alpha$ = .05
(p-value = .1336) > ($\alpha$ = .05); Do not reject 40.0.
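The two-tailed p-value for Problem 1 can be computed directly from the normal distribution instead of a Z table. A minimal sketch, assuming Python's `statistics.NormalDist`; the function name `two_tailed_p_value` is made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(x_bar, mu0, sigma, n):
    """Return (Z, p-value) for a two-tailed Z test of the mean."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    upper_tail = 1.0 - NormalDist().cdf(abs(z))  # area beyond |Z| in one tail
    return z, 2.0 * upper_tail                   # both tails count as "more extreme"

# Problem 1: x-bar = 43.0, hypothesized mean 40.0, sigma = 8.0, n = 16.
z, p_value = two_tailed_p_value(43.0, 40.0, 8.0, 16)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
# prints: Z = 1.50, p-value = 0.1336
```

Since .1336 is larger than the chosen level of .05, the hypothesized mean of 40.0 is not rejected.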
Problem 1 with Confidence Interval
Hypothesized mean: 40.0.
Change ads if MPG is off in either direction.
Sample results: $\bar{X}$ = 43.0, s = 7.2
Which CI formula should be used?
$X \sim N(\mu = ?,\ \sigma = 8.0)$, n = 16
$\bar{X} \sim N(\mu_{\bar{X}} = ?,\ \sigma_{\bar{X}} = 2.0)$
Pick $\alpha$ = .05
$\bar{X} \pm Z_{.025} \cdot \dfrac{\sigma}{\sqrt{n}}$
$43.0 \pm 1.96 \cdot 2.0$
$43.0 \pm 3.92$
(39.08, 46.92)
40.0 falls inside the CI. Do not reject 40.0.
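Both methods can be run side by side on Problem 1 to see that they reach the same decision. This is a sketch under the same assumptions as the problem (known std. dev. 8.0, n = 16); the function name `z_test_both_ways` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_both_ways(x_bar, mu0, sigma, n, alpha):
    """Return the CI, the two-tailed p-value, and the decision from each method."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower, upper = x_bar - z_crit * se, x_bar + z_crit * se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(x_bar - mu0) / se))
    reject_by_ci = not (lower <= mu0 <= upper)   # method 1: mu0 outside the CI
    reject_by_p = p_value < alpha                # method 2: p-value below alpha
    return (lower, upper), p_value, reject_by_ci, reject_by_p

ci, p_value, reject_by_ci, reject_by_p = z_test_both_ways(43.0, 40.0, 8.0, 16, 0.05)
print(f"CI = ({ci[0]:.2f}, {ci[1]:.2f}), p-value = {p_value:.4f}")
print(reject_by_ci, reject_by_p)
# prints: CI = (39.08, 46.92), p-value = 0.1336
#         False False
```

Neither method rejects 40.0, which agrees with the two hand calculations for Problem 1.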
