
Student Notes - Prep Session Topic: Sampling Distributions

Gloria Barrett, Virginia Advanced Study Strategies

edited by Daren Starnes

Content
The AP Statistics topic outline contains the following list of items related to sampling
distributions. (Items (4), (5), and (8) will not be covered in this session.)
1. Sampling distribution of a sample proportion
2. Sampling distribution of a sample mean
3. Central Limit Theorem
4. Sampling distribution of a difference between two independent sample proportions
5. Sampling distribution of a difference between two independent sample means
6. Simulation of sampling distributions
7. t-distribution
8. Chi-square distribution

Sampling distributions are an extension of probability, so many free response questions that
include questions on sampling distributions will also include parts that relate to material
discussed and reviewed in the earlier prep session on probability.
Be sure you understand:
1. The difference between a parameter and a statistic
2. What we mean by the sampling distribution of a statistic (that is, the distribution of the
values of that statistic obtained from all possible samples of a given size from a given
population)
3. What we mean by an unbiased statistic
4. The standard deviation formulas for $\hat{p}$ and $\bar{x}$ (that is, $\sigma_{\hat{p}}$ and
$\sigma_{\bar{x}}$) should be used only when the population is at least 10 times as large as the
sample
5. The sampling distribution of $\hat{p}$ is approximately normal when the sample size is large (your
textbook will have a definition of large, for example $np \ge 10$ and $n(1 - p) \ge 10$)
6. The sampling distribution of $\bar{x}$ is normal, regardless of sample size, if the
underlying population is normally distributed
7. The sampling distribution of $\bar{x}$ is approximately normal, regardless of the
shape of the underlying population, when the sample size is large (according to the
Central Limit Theorem). In this case $n \ge 30$ is usually sufficiently large.
8. The CLT is a statement about shape. It says that the sampling distribution of sample
means becomes closer to a normal distribution as the sample size increases.
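Points 2, 7, and 8 can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population (exponential with mean 1 and standard deviation 1), the sample size of 30, the 2000 repetitions, and the random seed are all choices made for this example, not values from the AP outline. It draws many samples from a strongly skewed population and compares the center and spread of the simulated sample means with the theoretical values.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)

POP_MEAN = 1.0  # mean of the assumed exponential population (rate 1)
POP_SD = 1.0    # standard deviation of that population
N = 30          # size of each sample
REPS = 2000     # number of simulated samples


def sample_mean(n):
    """Return the mean of one random sample of size n from the skewed population."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


# Approximate the sampling distribution of the sample mean by simulation.
sample_means = [sample_mean(N) for _ in range(REPS)]

sim_center = statistics.fmean(sample_means)
sim_spread = statistics.stdev(sample_means)
theory_spread = POP_SD / math.sqrt(N)  # sigma / sqrt(n)

print(f"mean of sample means: {sim_center:.3f} (population mean {POP_MEAN})")
print(f"sd of sample means:   {sim_spread:.3f} (sigma/sqrt(n) = {theory_spread:.3f})")
```

A histogram of `sample_means` should look far more symmetric than the population itself, which is the shape statement the CLT makes.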
Formulas
You will want to be familiar with the probability formulas that are provided on the exam. A partial
list of formulas related to probability on the exam formula sheet is provided here. Note that
several relate to the sampling distribution of sample means and sample proportions:
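If X has a binomial distribution with parameters n and p, then:

$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
$\mu_X = np$
$\sigma_X = \sqrt{np(1 - p)}$
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$

If $\bar{x}$ is the mean of a random sample of size n from an infinite population with
mean $\mu$ and standard deviation $\sigma$, then:

$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$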
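For checking hand calculations, the formula-sheet expressions for the binomial distribution and for the standard deviations of a sample proportion and a sample mean can be written as short functions. This is a minimal sketch; the function names are hypothetical and chosen only for this example, and the numbers in the usage lines are made-up inputs rather than values from any exam question.

```python
import math


def binomial_pmf(n, k, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_mean(n, p):
    """Mean of a binomial random variable: n * p."""
    return n * p


def binomial_sd(n, p):
    """Standard deviation of a binomial random variable: sqrt(n p (1 - p))."""
    return math.sqrt(n * p * (1 - p))


def sd_sample_proportion(n, p):
    """Standard deviation of the sample proportion: sqrt(p (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def sd_sample_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Usage with made-up inputs (not taken from an exam question).
print(binomial_pmf(4, 2, 0.5))         # probability of exactly 2 successes in 4 trials
print(binomial_mean(100, 0.5))         # center of the binomial count
print(binomial_sd(100, 0.5))           # spread of the binomial count
print(sd_sample_proportion(100, 0.5))  # spread of the sample proportion
print(sd_sample_mean(10, 25))          # spread of the sample mean
```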

MC B # 30, 38
AP Exam Free Response Questions for Practice and Discussion
2004, Form B #3
Trains carry bauxite from a mine in Canada to an aluminum processing plant in northern New York
State in hopper cars. Filling equipment is used to load ore into the hopper car. When functioning
properly, the actual weights of ore loaded into each car by the filling equipment at the mine are
approximately normally distributed with a mean of 70 tons and a standard deviation of 0.9 ton. If
the mean is greater than 70 tons, the loading mechanism is overfilling.
(a) If the filling equipment is functioning properly, what is the probability that the weight of the
ore in a randomly selected car will be 70.7 tons or more? Show your work.

(b) Suppose that the weight of ore in a randomly selected car is 70.7 tons. Would that fact make
you suspect that the loading mechanism is overfilling the cars? Justify your answer.

(c) If the filling equipment is functioning properly, what is the probability that a random sample
of 10 cars will have a mean weight of 70.7 tons or more? Show your work.

(d) Based on your answer in part (c), if a random sample of 10 cars had a mean ore weight of
70.7 tons, would you suspect that the loading mechanism was overfilling the cars? Justify your
answer.

2008, Form B, #2

Four different statistics have been proposed as estimators of a population parameter. To
investigate the behavior of these estimators, 500 random samples are selected from a known
population and each statistic is calculated for each sample. The true value of the population
parameter is 75. The graphs below show the distribution of the values for each statistic.

(a) Which of the statistics appear to be unbiased estimators of the population parameter? How
can you tell?

(b) Which of the statistics A or B would be a better estimator of the population parameter?
Explain your choice.

(c) Which of the statistics C or D would be a better estimator of the population parameter?
Explain your choice.

2007, Form B #2
The graph below shows the relative frequency distribution for X, the total number of dogs and
cats owned per household, for the households in a large suburban area. For instance, 14 percent
of the households own 2 of those pets.

(a) According to local law, each household in this area is prohibited from owning more than 3 of
these pets. If a household in this area is selected at random, what is the probability that the
selected household will be in violation of this law? Show your work.

(b) If 10 households in this area are selected at random, what is the probability that exactly 2 of
them will be in violation of this law? Show your work.

(c) The mean and standard deviation of X are 1.65 and 1.851, respectively. Suppose that 150
households in this area are to be selected at random and $\bar{X}$, the mean number of dogs and cats
per household, is to be computed. Describe the sampling distribution of $\bar{X}$, including its shape,
center, and spread.

Solution, 2004 Form B Question 3


Let X = weight of ore in a randomly selected car.
(a) $P(X \ge 70.7) = P\left(Z \ge \dfrac{70.7 - 70}{0.9}\right) = P(Z \ge 0.78) = 0.2177$

(b) No. Approximately 22% of the cars will have ore weights of 70.7 or greater when the filling
equipment is working properly, so a car that was filled with 70.7 tons of ore would not be an
unusual occurrence.
(c) $P(\bar{X} \ge 70.7) = P\left(Z \ge \dfrac{70.7 - 70}{0.9/\sqrt{10}}\right) = P\left(Z \ge \dfrac{0.7}{0.285}\right) = P(Z \ge 2.46) = 0.0069$

(d) Yes, we would suspect that the filling mechanism is overfilling. If it is working properly, the
probability that the mean weight of the ore in 10 randomly selected cars is 70.7 tons or greater is
0.0069, which is very small.
Note 1: To receive complete credit for part (a) or part (c), students must show how the
probability is computed. Since part (a) and part (c) involve different normal distributions, it is
important to identify which normal distribution is used in each part. As shown above, this could
be done by displaying a probability statement containing the mean and standard deviation for
the appropriate normal distribution. It could be done in other ways, such as listing the mean and
standard deviation and displaying an appropriate graph.
Note 2: The response in part (b) could be justified by indicating that 70.7 tons is less than one
standard deviation away from the desired mean of 70 tons. The response in part (d) could be
justified by indicating that 70.7 tons is more than two standard deviations above the desired
mean of 70 tons.
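The two probabilities in parts (a) and (c) can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module, with the mean, standard deviation, sample size, and cutoff taken from the question; the variable names are chosen for this example. Because the z-score is not rounded to two decimal places here, the results may differ slightly in the third or fourth decimal place from the table-based answers above.

```python
import math
from statistics import NormalDist

MU = 70.0      # mean weight per car when the equipment works properly (tons)
SIGMA = 0.9    # standard deviation of the weight of a single car (tons)
N = 10         # number of cars in the random sample
CUTOFF = 70.7  # weight of interest (tons)

# Part (a): distribution of the weight of ONE randomly selected car.
single_car = NormalDist(MU, SIGMA)
p_single = 1 - single_car.cdf(CUTOFF)

# Part (c): sampling distribution of the MEAN weight of 10 cars.
# Same center, but the standard deviation shrinks to sigma / sqrt(n).
mean_of_10 = NormalDist(MU, SIGMA / math.sqrt(N))
p_mean = 1 - mean_of_10.cdf(CUTOFF)

print(f"P(X >= 70.7)    = {p_single:.4f}")
print(f"P(Xbar >= 70.7) = {p_mean:.4f}")
```

Keeping the two `NormalDist` objects separate mirrors the point of Note 1: parts (a) and (c) use different normal distributions, and a response should make clear which one is in play.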
Solution, 2008 Form B Question 2
(a) Statistics A, C, and D appear to be unbiased. This is indicated by the fact that the mean of
the estimated sampling distribution for each of these statistics is about 75, the value of the
population parameter.
Note: No other characteristic should be mentioned in the response. Students must clearly
demonstrate an understanding of the term unbiased.
(b) Statistic A would be a better choice because it appears to be unbiased (or centered at 75).
Although the variability of the two estimated sampling distributions is similar, statistic A would
produce estimates that tend to be closer to the true population parameter value of 75 than
would statistic B.
(c) Statistic C would be a better choice because it has smaller variability. Although both statistic
C and statistic D appear to be unbiased, statistic C would produce estimates that tend to be
closer to the true population parameter value of 75 than would statistic D.
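The ideas of bias (center) and variability (spread) in this solution can be explored with a simulation similar in spirit to the one in the question. The sketch below is not a reconstruction of statistics A through D, whose definitions are not given. It assumes a normal population with mean 75 and standard deviation 10, samples of size 20, and three illustrative estimators chosen for this example: the sample mean, the sample median, and a deliberately biased version of the sample mean.

```python
import random
import statistics

random.seed(2008)  # fixed seed so the run is repeatable (arbitrary choice)

TRUE_VALUE = 75.0  # population parameter, as in the question
POP_SD = 10.0      # assumed population standard deviation (not from the question)
N = 20             # assumed sample size (not from the question)
REPS = 500         # number of random samples, as in the question

# Illustrative estimators (hypothetical, not the exam's statistics A-D).
estimators = {
    "sample mean": statistics.fmean,
    "sample median": statistics.median,
    "sample mean + 3 (biased)": lambda xs: statistics.fmean(xs) + 3,
}

# Compute every estimator on each of the 500 samples.
values = {name: [] for name in estimators}
for _ in range(REPS):
    sample = [random.gauss(TRUE_VALUE, POP_SD) for _ in range(N)]
    for name, stat in estimators.items():
        values[name].append(stat(sample))

# Center near 75 suggests an unbiased estimator; smaller spread means
# estimates tend to fall closer to their center.
summary = {
    name: (statistics.fmean(v), statistics.stdev(v)) for name, v in values.items()
}
for name, (center, spread) in summary.items():
    print(f"{name:26s} center = {center:6.2f}  spread = {spread:5.2f}")
```

Under these assumptions, the first two estimators should be centered near 75, the third should be centered about 3 units too high, and the sample mean should show less spread than the sample median. That is the same pair of comparisons (center first, then spread) that parts (a) through (c) ask for.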
Solution, 2007 Form B Question 2
(a) $P(X > 3) = 0.07 + 0.04 + 0.04 + 0.02 = 0.17$
(b) Y = number of households in violation. Y has a binomial distribution with n = 10 and p =
0.17.

$P(Y = 2) = \binom{10}{2} (0.17)^2 (0.83)^8 = 0.2929$

(c) The distribution of $\bar{X}$ will:

1. be approximately normal (note that the word "approximately" is required for an
essentially correct response) OR be more symmetric than the population distribution, which
is highly skewed.

2. have mean $\mu_{\bar{X}} = 1.65$

3. have standard deviation $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{1.851}{\sqrt{150}} = 0.1511$
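The arithmetic in all three parts of this solution can be verified with a few lines of Python. This is a minimal check, not part of the scoring guidelines; the four relative frequencies are the ones summed in part (a), and the variable names are chosen for this example.

```python
import math

# Part (a): relative frequencies for households owning more than 3 pets,
# as listed in the solution above.
tail_probs = [0.07, 0.04, 0.04, 0.02]
p_violation = sum(tail_probs)

# Part (b): binomial probability that exactly 2 of 10 households are in violation.
n_households = 10
k = 2
p_exactly_two = (
    math.comb(n_households, k)
    * p_violation**k
    * (1 - p_violation) ** (n_households - k)
)

# Part (c): center and spread of the sampling distribution of the sample mean.
mu, sigma, n = 1.65, 1.851, 150
mean_xbar = mu
sd_xbar = sigma / math.sqrt(n)

print(f"P(X > 3) = {p_violation:.2f}")
print(f"P(Y = 2) = {p_exactly_two:.4f}")
print(f"mean of Xbar = {mean_xbar}, sd of Xbar = {sd_xbar:.4f}")
```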
