RSR
Sampling Distributions
Amity Business School
A sampling distribution is created by, as the name suggests,
sampling. There are two ways to create a sampling distribution.
The first is to actually draw samples of the same size from a
population, calculate the statistic of interest, and then use
descriptive techniques to learn more about the sampling
distribution.
The second method, which we will employ, relies on the rules of
probability and the laws of expected value and variance to derive
the sampling distribution.
For example, consider the roll of one and two dice.
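The first method (actually drawing samples) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_sample_means`, the seed, and the number of samples are choices made here, not part of the slides.

```python
import random
from collections import Counter

def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of n fair-die rolls and return their means."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [sum(rng.randint(1, 6) for _ in range(n)) / n
            for _ in range(num_samples)]

# Samples of size 2 (two dice), summarized with a relative-frequency table.
means = simulate_sample_means(n=2, num_samples=100_000)
freq = Counter(means)
for value in sorted(freq):
    print(f"{value:.1f}  {freq[value] / len(means):.4f}")
```

The printed relative frequencies only approximate the sampling distribution; the second method derives it exactly.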
Sampling Distribution of the Mean
A fair die is thrown infinitely many times,
with the random variable X = # of spots on any throw.
The probability distribution of X is:
x     1    2    3    4    5    6
P(x)  1/6  1/6  1/6  1/6  1/6  1/6
and the mean and variance are calculated as well:
$\mu = \sum x P(x) = 3.5$
$\sigma^2 = \sum (x - \mu)^2 P(x) = 2.92$
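The mean and variance of a single fair die can be checked by direct computation. A minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

values = range(1, 7)   # faces of a fair die
p = Fraction(1, 6)     # each face is equally likely

# Expected value and variance from their definitions.
mu = sum(x * p for x in values)
var = sum((x - mu) ** 2 * p for x in values)

print(mu, var, round(float(var), 2))
```

The exact variance is 35/12, which rounds to 2.92.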
All Samples of Size 2 from a Population
A sampling distribution is created by looking at all samples of
size n=2 (i.e. two dice) and their means
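$\bar{x}$   $P(\bar{x})$
1.0   1/36
1.5   2/36
2.0   3/36
2.5   4/36
3.0   5/36
3.5   6/36
4.0   5/36
4.5   4/36
5.0   3/36
5.5   2/36
6.0   1/36
[Figure: histogram of the sampling distribution of $\bar{X}$, with $\bar{x}$ from 1.0 to 6.0 on the horizontal axis and $P(\bar{x})$ from 1/36 to 6/36 on the vertical axis]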
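The sampling distribution for n = 2 can be reproduced by listing all 36 equally likely samples and tabulating their means. A short sketch, with names such as `sampling_dist` chosen here for illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered pairs (a, b) give each sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))
sampling_dist = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

for mean in sampling_dist:
    print(f"{float(mean):.1f}  {counts[mean]}/36")

# Mean and variance of the sampling distribution itself.
mu_xbar = sum(m * p for m, p in sampling_dist.items())
var_xbar = sum((m - mu_xbar) ** 2 * p for m, p in sampling_dist.items())
print(mu_xbar, var_xbar)
```

The mean of the sample means equals the die's mean (7/2), while the variance (35/24) is half the variance of a single die (35/12).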
Compare
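Compare the distribution of X
with the sampling distribution of $\bar{X}$.
[Figure: distribution of X (values 1 to 6) beside the sampling distribution of $\bar{X}$ (values 1.0 to 6.0)]
As well, note that:
$\mu_{\bar{x}} = \mu_x$
$\sigma^2_{\bar{x}} = \sigma^2_x / 2$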
Generalize
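We can generalize the mean and variance of the sampling of two dice:
$\mu_{\bar{x}} = \mu_x$ and $\sigma^2_{\bar{x}} = \sigma^2_x / 2$
to n dice:
$\mu_{\bar{x}} = \mu_x$ and $\sigma^2_{\bar{x}} = \sigma^2_x / n$
The standard deviation of the sampling distribution is called the standard error:
$\sigma_{\bar{x}} = \sigma_x / \sqrt{n}$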
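The generalization to n dice can be checked exactly by building the distribution of the sum one die at a time. This is a sketch, not the slides' own derivation; `mean_distribution` and `moments` are hypothetical helper names.

```python
from fractions import Fraction
from math import sqrt

def mean_distribution(n):
    """Exact distribution of the mean of n fair dice, as {mean: probability}."""
    sums = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, prob in sums.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0) + prob / 6
        sums = nxt
    return {Fraction(total, n): prob for total, prob in sums.items()}

def moments(dist):
    """Return (mean, variance) of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

for n in (1, 2, 5, 10, 25):
    mu, var = moments(mean_distribution(n))
    # var should equal (35/12) / n; its square root is the standard error.
    print(n, mu, var, round(sqrt(var), 4))
```

For every n the mean stays at 7/2 and the variance equals 35/12 divided by n.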
It is important to recognize that the distribution of $\bar{X}$ is
different from the distribution of X. However, the two
random variables are related.
Their means are the same ($\mu_{\bar{x}} = \mu_x = 3.5$) and their variances
are related ($\sigma^2_{\bar{x}} = \sigma^2_x / 2$).
If we now repeat the sampling process with the same
population but with other values of n, we produce
somewhat different sampling distributions of $\bar{X}$ when
n = 5, 10 and 25.
The variance of the sampling distribution of $\bar{X}$ is less than the
variance of the population we are sampling from, for all of these
sample sizes.
Thus, a randomly selected value of $\bar{X}$ (the mean of the number
of spots in, say, five throws of the die) is likely to be closer to
the mean value of 3.5 than is a randomly selected value of X
(the number of spots observed in one throw). Indeed, this is
what one would expect, because in five throws of the die one
is likely to get some 5s and 6s and some 1s and 2s, which will
tend to offset one another in the averaging process and produce a
sample mean reasonably close to 3.5. As the number of throws
of the die increases, the probability that the sample mean
will be close to 3.5 also increases. Thus we observe that the
sampling distribution of $\bar{X}$ becomes narrower as n increases.
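The narrowing can be illustrated by estimating how often the sample mean lands within 0.5 of 3.5 for different n. A simulation sketch, where the tolerance of 0.5, the number of trials, and the seed are arbitrary choices made for this example:

```python
import random

def prob_mean_near(n, tol=0.5, trials=20_000, seed=1):
    """Estimate P(|sample mean - 3.5| <= tol) for n throws of a fair die."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.randint(1, 6) for _ in range(n)) / n
        if abs(xbar - 3.5) <= tol:
            hits += 1
    return hits / trials

estimates = {n: prob_mean_near(n) for n in (1, 5, 10, 25)}
for n, p in estimates.items():
    print(f"n = {n:2d}: estimated probability {p:.3f}")
```

For n = 1 the exact probability is 1/3 (a throw of 3 or 4), and the estimates rise as n grows.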
Central Limit Theorem
The sampling distribution of the mean of a random sample drawn from
any population is approximately normal for a sufficiently large
sample size.
The larger the sample size, the more closely the sampling distribution
of $\bar{X}$ will resemble a normal distribution.
If the population is normal, then $\bar{X}$ is normally distributed for all
values of n.
If the population is non-normal, then $\bar{X}$ is approximately normal only
for larger values of n.
In most practical situations, a sample size of 30 may be sufficiently
large to allow us to use the normal distribution as an approximation
for the sampling distribution of $\bar{X}$.
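The effect of sample size on a non-normal population can be illustrated by simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1, chosen here only because it is strongly skewed; it is not a population used in the slides. It compares the share of sample means within one standard error of the mean against the share a normal distribution would give.

```python
import random
from math import sqrt
from statistics import NormalDist

def share_within_one_se(n, trials=20_000, seed=2):
    """Share of sample means within one standard error of the population mean."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0          # assumed exponential(1) population
    se = sigma / sqrt(n)          # standard error of the mean
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if abs(xbar - mu) <= se:
            hits += 1
    return hits / trials

# Share within one standard deviation for an exactly normal variable.
normal_share = NormalDist().cdf(1) - NormalDist().cdf(-1)
results = {n: share_within_one_se(n) for n in (2, 30)}
print(f"normal: {normal_share:.4f}")
for n, share in results.items():
    print(f"n = {n:2d}: {share:.4f}")
```

With n = 30 the simulated share should sit much closer to the normal value than with n = 2, in line with the rule of thumb above.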
Sampling Distribution of the Sample Mean
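1. $\mu_{\bar{x}} = \mu_x$
2. $\sigma^2_{\bar{x}} = \sigma^2_x / n$ and $\sigma_{\bar{x}} = \sigma_x / \sqrt{n}$
3. If X is normal, $\bar{X}$ is normal. If X is nonnormal, $\bar{X}$ is approximately
normal for sufficiently large sample sizes.
Note: the definition of "sufficiently large" depends on the extent of
nonnormality of X (e.g. heavily skewed; multimodal).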
We can express the sampling distribution of the mean
simply as
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
The summaries above assume that the population is infinitely large.
However, if the population is finite, the standard error is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
where N is the population size and
$\sqrt{\frac{N - n}{N - 1}}$
is the finite population correction factor.
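The finite population correction is easy to apply in code. A minimal sketch; the function name `standard_error` and the numbers in the usage lines are hypothetical and only illustrate the formula.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size, or None for an infinitely large population
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # finite population correction factor
    return se

# Illustrative values only: sigma = 3, n = 25, population of N = 100.
print(standard_error(3, 25))          # infinite population
print(standard_error(3, 25, N=100))   # finite population, smaller error
```

When n is small relative to N the correction factor is close to 1 and has little effect.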
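For a single observation X and for the sample mean $\bar{X}$, probabilities are found by standardizing:
$P(X > x) = P\left(\frac{X - \mu}{\sigma} > \frac{x - \mu}{\sigma}\right) = P(Z > z)$
$P(\bar{X} > x) = P\left(\frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{x - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) = P(Z > z)$
To find $P(\bar{X} > 32)$:
1. $\bar{X}$ is normally distributed
2. $\mu_{\bar{x}} = \mu_x = 32.2$
3. $\sigma_{\bar{x}} = \sigma_x / \sqrt{n} = .3 / \sqrt{4} = .15$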
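The probabilities in this example can be evaluated with Python's standard library. The sketch assumes only the values that survive in the slide ($\mu = 32.2$, $\sigma = .3$, n = 4, threshold 32); the variable names are illustrative, and the results use unrounded z values, so they may differ slightly from figures read off a printed normal table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32.2, 0.3, 4
threshold = 32

# One observation: X is normal with mean mu and standard deviation sigma.
p_single = 1 - NormalDist(mu, sigma).cdf(threshold)

# Sample mean: same mean, standard error sigma / sqrt(n) = .15.
standard_err = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, standard_err).cdf(threshold)

print(f"P(X > 32)    = {p_single:.4f}")
print(f"P(Xbar > 32) = {p_mean:.4f}")
```

The sample mean exceeds 32 with higher probability than a single observation because its distribution is narrower around 32.2.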
In the example, we began with the assumption that both
$\mu$ and $\sigma$ were known.
Then, using the sampling distribution, we made a
probability statement about the sample mean.
Unfortunately, the values of $\mu$ and $\sigma$ are not usually known,
so an analysis such as that in this example cannot usually
be conducted.
However, we can use the sampling distribution to infer
something about an unknown value of $\mu$ on the basis of a
sample mean.
EXERCISE
The number of pizzas consumed per month by university
students is normally distributed with a mean of 10 and a
standard deviation of 3.
a. What proportion of students consume more than 12
pizzas per month?
b. What is the probability that, in a random sample of
25 students, more than 275 pizzas are consumed?
Solution
a. $P(X > 12) = P\left(\frac{X - \mu}{\sigma} > \frac{12 - 10}{3}\right)$
$= P(Z > .67) = .5 - P(0 < Z < .67)$
$= .5 - .2486$
$= .2514$
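Both parts of the pizza exercise can be checked numerically. The sketch below is one way to set it up, not the slides' own working: part b is rewritten as a statement about the sample mean, since a total above 275 for 25 students means an average above 11. Exact z values are used, so part a differs slightly from the table-based .2514.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 10, 3

# a. One student: P(X > 12).
p_a = 1 - NormalDist(mu, sigma).cdf(12)

# b. Sample of 25: total > 275 is the same event as sample mean > 275 / 25 = 11.
n = 25
standard_err = sigma / sqrt(n)            # 3 / 5 = 0.6
p_b = 1 - NormalDist(mu, standard_err).cdf(275 / n)

print(f"a. P(X > 12)    = {p_a:.4f}")
print(f"b. P(Xbar > 11) = {p_b:.4f}")
```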
EXERCISE
The number of customers who enter a supermarket each
hour is normally distributed with a mean of 600 and a
standard deviation of 200. The supermarket is open 16
hours per day. What is the probability that the total
number of customers who enter the supermarket in one
day is greater than 10,000?
Solution
$P(\bar{X} > 10{,}000 / 16) = P(\bar{X} > 625)$
$= P\left(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} > \frac{625 - 600}{200 / \sqrt{16}}\right)$
$= P(Z > .50)$
$= .5 - P(0 < Z < .50)$
$= .5 - .1915$
$= .3085$
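The supermarket result can be confirmed two equivalent ways: through the mean of the 16 hourly counts, as in the solution, or through their total. The sketch assumes the 16 hours are independent, which the sampling-distribution argument relies on; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 200, 16

# Via the sample mean: Xbar ~ Normal(600, 200 / sqrt(16) = 50).
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(10_000 / n)

# Via the total: sum of 16 hours ~ Normal(16 * 600, 200 * sqrt(16) = 800).
p_total = 1 - NormalDist(n * mu, sigma * sqrt(n)).cdf(10_000)

print(f"P(Xbar > 625)     = {p_mean:.4f}")
print(f"P(total > 10,000) = {p_total:.4f}")
```

Both routes standardize to the same z value of .50, so the probabilities agree.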