Class data: candies by color
Red 716
Orange 698
Yellow 726
Green 710
Purple 701
Total 3551
BAGS (# of bags) 60
Summary statistics:
Column  n   Mean    Std. dev.  Median  Range  Min  Max  Q1  Q3    IQR  Mode
Total   60  59.183  3.110      59      14     50   64   58  61.5  3.5  59
Outliers/Fences
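Total candies
Lower fence: Q1 - 1.5(IQR) = 58 - 1.5(3.5) = 52.75
Upper fence: Q3 + 1.5(IQR) = 61.5 + 1.5(3.5) = 66.75
A bag with less than 52.75 or more than 66.75 total candies is an outlier.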
Mine was not an outlier.
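The fence calculation can be checked with a short Python sketch. The function name `outlier_fences` is only illustrative; Q1 = 58 and Q3 = 61.5 come from the summary statistics, and the bag size of 61 is the value reported for "my bag" later in the project.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(value, q1, q3):
    """True when value falls below the lower fence or above the upper fence."""
    lower, upper = outlier_fences(q1, q3)
    return value < lower or value > upper


lower, upper = outlier_fences(58, 61.5)
print(lower, upper)                 # 52.75 66.75
print(is_outlier(61, 58, 61.5))     # False: a bag of 61 is inside the fences
```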
Shape of distribution
Skittles Project 3
1- Height cannot/should not be used to determine the number of skittles in a bag.
2- Candy is the response variable (Y)
3- Height is the explanatory variable (X)
Regression output.
R² = 0.0290
2.9% of the variation in the number of candies can be explained by height.
Height is not a good predictor of the number of candies (y) because R² is very small.
Yao Ming's height of 90 inches is outside the scope of the model; therefore, using the regression equation
to predict for his height would not be appropriate.
Systematic sampling
Candies  Height (in.)
52       64
57       70
58       61
59       80
61       65
62       66
Correlation coefficient
r = 0.1457
n = 6 CV = 0.811
|r| = 0.1457 < 0.811
Regression equation
y = 0.2759 x + 51.620
Is there a significant linear relationship in the smaller data set?
There is NOT a significant linear relationship because |r| = 0.1457 is less than the critical value (CV) of 0.811.
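The correlation check can be reproduced with a small Python sketch. It assumes the twelve values listed under "Systematic sampling" are six (candies, height) pairs read in order; that pairing is an inference from the listing, not something the write-up states. The critical value 0.811 is taken from the text, and the helper names are illustrative.

```python
import math

# Assumption: the systematic sample is six (candies, height) pairs read in order.
candies = [52, 57, 58, 59, 61, 62]
height = [64, 70, 61, 80, 65, 66]


def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy), the sums of squares about the sample means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def pearson_r(xs, ys):
    """Sample correlation coefficient r."""
    sxx, syy, sxy = sums_of_squares(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting ys from xs."""
    sxx, _, sxy = sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    return slope, intercept


r = pearson_r(height, candies)
critical_value = 0.811                    # table value quoted in the text for n = 6
significant = abs(r) > critical_value     # significant only when |r| exceeds the CV
slope, intercept = least_squares(candies, height)

print(round(r, 4), significant)           # 0.1457 False
```

Under this pairing, the listed slope (0.2759) and intercept (51.620) are reproduced when candies is used as x and height as y, which is worth keeping in mind when reading the regression equation against the variable roles stated in Project 3.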
Project 4
Problem 1: Suppose you are going to randomly select two Skittles from the bag YOU purchased.
(a) What is the probability that both Skittles are purple if you select them with replacement? Give your
answer correct to four decimal places. (4 points)
.3115 x .3115 = .0970
(b) What is the probability that both Skittles are purple if you select them without replacement? Give
your answer correct to four decimal places. (4 points)
.3115 x .3000 = .0935
(c) What is the probability that at least one Skittle is purple if you select them with replacement? (4
points)
P(no purple) = .6885 x .6885 = .4740, so P(at least one purple) = 1 - .4740 = .5260
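The three answers to Problem 1 follow the same pattern, sketched here in Python. The two probabilities .3115 and .3000 are taken as given from the answers above; the variable names are illustrative.

```python
p_first = 0.3115    # P(purple) on any draw with replacement, as given above
p_second = 0.3000   # P(purple on 2nd draw | purple on 1st), without replacement

# (a) With replacement the draws are independent, so multiply the same probability.
both_with_replacement = p_first * p_first

# (b) Without replacement the second probability changes after the first draw.
both_without_replacement = p_first * p_second

# (c) "At least one" is the complement of "none".
at_least_one = 1 - (1 - p_first) ** 2

print(f"{both_with_replacement:.4f}")
print(f"{at_least_one:.4f}")
```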
Problem 2: Suppose all of the Skittles in the class data set are combined into one large bowl and you are
going to randomly select one Skittle.
(a) What is the probability that you select a green Skittle? (4 points)
710/3551 = .1999
(b) What is the probability that you select a Skittle that is NOT green? (4 points)
1-0.1999 = .8001
(c) What is the probability that you select a Skittle that is red OR yellow? (4 points)
(716 + 726)/3551 = .4061
(d) What is the probability that you select a Skittle that is orange GIVEN that it is a secondary color
(secondary colors are green, orange and purple)? (4 points)
698/(710 + 698 + 701) = .3310
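The four probabilities in Problem 2 can be recomputed from the color counts used in the answers above (red 716, orange 698, yellow 726, green 710, purple 701). This is a minimal Python sketch; the dictionary layout and variable names are illustrative.

```python
counts = {"red": 716, "orange": 698, "yellow": 726, "green": 710, "purple": 701}
total = sum(counts.values())   # 3551 candies in the class data set

# (a) and (b): a single event and its complement
p_green = counts["green"] / total
p_not_green = 1 - p_green

# (c): red and yellow are mutually exclusive, so their counts add
p_red_or_yellow = (counts["red"] + counts["yellow"]) / total

# (d): conditioning restricts the sample space to the secondary colors
secondary_total = counts["green"] + counts["orange"] + counts["purple"]
p_orange_given_secondary = counts["orange"] / secondary_total

for value in (p_green, p_not_green, p_red_or_yellow, p_orange_given_secondary):
    print(f"{value:.4f}")
```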
Problem 3: Suppose all of the Skittles in the class data set are combined into one large bowl and you are
going to randomly select ten Skittles with replacement and count how many are yellow.
Yellow = 726 n = 10 Total = 3551
(a) Show that this meets the requirements of the binomial probability distribution and identify n and p.
(5 points)
n = 10 p = 726/3551 = .2044
This is a binomial experiment because:
The number of trials is fixed at 10.
The trials are independent (the Skittles are selected with replacement).
There are two possible outcomes for each trial: Yellow or Everything else (Red, Orange, Green, Purple).
The probability of success (Yellow) is .2044 and the probability of Everything else is .7956. The
probabilities are the same for each trial.
(b) What is the probability that exactly 4 of the 10 Skittles are yellow? (4 points)
binompdf(10, .2044, 4) = .0930
(c) For samples of size 10, what is the expected value and standard deviation for the number of yellow
skittles that will be included? (4 points)
Expected value (mean) = 10 x .2044 = 2.044
Standard deviation = sqrt(10 x .2044 x (1 - .2044)) = 1.275
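The binomial answers in Problem 3 can be checked without a calculator's binompdf function. The sketch below uses Python's standard `math.comb`; the function name `binom_pmf` is illustrative, and n = 10 and p = .2044 are the values identified in part (a).

```python
import math


def binom_pmf(n, p, k):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.2044

prob_exactly_4 = binom_pmf(n, p, 4)       # part (b)
expected_value = n * p                    # part (c): mean = np
std_dev = math.sqrt(n * p * (1 - p))      # part (c): sd = sqrt(np(1 - p))

print(f"{prob_exactly_4:.4f} {expected_value:.3f} {std_dev:.3f}")
```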
Problem 4: For this problem, treat a 2.17 ounce bag of Skittles as an individual. Suppose the values for
our class data are the parameter values for all 2.17 ounce bags of Skittles. In other words, assume μ =
mean number of candies per bag in our class data set and σ = standard deviation of number of candies
per bag in our class data set (you computed these values in Part 2).
μ = 59.183 σ = 3.110 n = 32
(a) Describe the sampling distribution for the mean number of candies per bag for samples of 32 bags.
Include center, spread and shape. Note: The shape of the SAMPLING DISTRIBUTION is different from the
shape of the population, which you determined in Part 2 of the project. (5 points)
Center: 59.183
Spread: 3.110/sqrt(32) = .5498
Shape: approximately normal because n is greater than or equal to 30.
(b) What is the probability that the mean number of candies per bag for a sample of 32 bags is greater
than 58.5? (4 points)
z = (58.5 - 59.183)/.5498 = -1.24
Area to the left of z = -1.24 is .1075
P(sample mean > 58.5) = 1 - .1075 = .8925
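The sampling-distribution calculation can be sketched in Python with the standard library's `statistics.NormalDist`. The values 59.183, 3.110 and n = 32 are the ones given in the problem; `p_table` rounds z to two decimals to mimic a z-table lookup, while `p_exact` uses the unrounded z, so the two differ slightly in the fourth decimal.

```python
import math
from statistics import NormalDist

mu, sigma, n = 59.183, 3.110, 32

standard_error = sigma / math.sqrt(n)     # spread of the sampling distribution
z = (58.5 - mu) / standard_error          # z-score of a sample mean of 58.5

# Table-style answer: round z to two decimals before looking up the area.
p_table = 1 - NormalDist().cdf(round(z, 2))

# Same probability without rounding z first.
p_exact = 1 - NormalDist(mu, standard_error).cdf(58.5)

print(f"{standard_error:.4f} {z:.2f} {p_table:.4f}")
```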
Project 5
Purpose and meaning of confidence interval
A confidence interval is an interval of numbers based on a point estimate that gives a range of likely
values for an unknown parameter. The purpose of a confidence interval is to state, with a given level of
confidence, that the population parameter (such as a proportion or a mean) is between the lower bound and the upper bound.
99% confidence interval estimate for true proportion of yellow candies
Conditions: SRS? n greater than or equal to 30? Population normal or n greater than or equal to 30
Yellow = 726 n = 3551 p-hat = .2044 alpha/2 = .005 Z alpha/2 = 2.575
.2044 - 2.575 x sqrt(.2044(1-.2044)/3551) = .1870
.2044 + 2.575 x sqrt(.2044(1-.2044)/3551) = .2218
With 99% confidence, the true proportion of yellow candies is between .1870 and .2218, or between 18.70% and
22.18%.
99% confidence interval estimate for true proportion of yellow candies in my bag of Skittles
Yellow = 15 n = 61 (total candies) p-hat = .2459 alpha/2 = .005 Z alpha/2 = 2.575
.2459 - 2.575 x sqrt(.2459(1-.2459)/61) = .1039
.2459 + 2.575 x sqrt(.2459(1-.2459)/61) = .3879
With 99% confidence, the true proportion of yellow candies in my bag is between .1039 and .3879, or between 10.39%
and 38.79%.
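Both proportion intervals use the same formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n), which can be sketched in Python as follows. The function name `proportion_ci` is illustrative, and the critical value 2.575 is taken as given from the work above rather than computed.

```python
import math


def proportion_ci(p_hat, n, z):
    """Return (lower, upper) bounds of a z-interval for a population proportion."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


z_99 = 2.575                                    # critical value used in the text

class_lower, class_upper = proportion_ci(0.2044, 3551, z_99)   # class data
bag_lower, bag_upper = proportion_ci(0.2459, 61, z_99)         # single bag

print(f"{class_lower:.4f} {class_upper:.4f}")
print(f"{bag_lower:.4f} {bag_upper:.4f}")
```

The single-bag interval is much wider because n = 61 is far smaller than n = 3551, which inflates the standard error.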
95% confidence interval estimate for the true mean number of candies per bag
n = 60 (bags) s = 3.110 mean = 59.183 T alpha/2 = 2.000
59.183 - 2.000(3.110/sqrt(60)) = 58.380
59.183 + 2.000(3.110/sqrt(60)) = 59.986
With 95% confidence, the true mean number of candies per bag is between 58.380 and 59.986.
Calculated using T-Interval
Based on the interval for the true mean number of candies calculated above, my bag of Skittles was NOT a likely value
for the population mean.
My bag = 61
Expected value between 58.380 and 59.986
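The t-interval formula, mean plus or minus t times s/sqrt(n), can be evaluated directly with the values listed above. In this Python sketch the critical value 2.000 is taken as given from the text (the standard library has no t-distribution quantile function), and the function name `mean_ci` is illustrative.

```python
import math


def mean_ci(mean, s, n, t_crit):
    """Return (lower, upper) bounds of a t-interval for a population mean."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


lower_mean, upper_mean = mean_ci(59.183, 3.110, 60, 2.000)

my_bag = 61
bag_in_interval = lower_mean <= my_bag <= upper_mean

print(f"{lower_mean:.3f} {upper_mean:.3f} {bag_in_interval}")
```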
Simple Random sample or data from randomized experiment. The purchase of the bags was not
a SRS, it was a convenience sample.
np0(1-p0) ≥ 10: 3551 x .20 x (1-.20) = 568.16 ≥ 10
Sampled values are independent of each other (n ≤ .05N)
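The large-sample condition np0(1 - p0) ≥ 10 is a one-line check. A minimal Python sketch, with n = 3551 and p0 = .20 from the text; the function name and the default threshold argument are illustrative.

```python
def large_sample_ok(n, p0, threshold=10):
    """True when n * p0 * (1 - p0) meets the threshold for a one-proportion z-test."""
    return n * p0 * (1 - p0) >= threshold


condition_value = 3551 * 0.20 * (1 - 0.20)
print(f"{condition_value:.2f}", large_sample_ok(3551, 0.20))
```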
Type I error: Reject the null hypothesis when the null hypothesis is actually true.
Reject that 20% of the Skittles are red (the null hypothesis), when 20% of the Skittles are actually red.
Type II error: Do not reject the null hypothesis when the alternative hypothesis is actually true.
Do not reject that 20% of the Skittles are red (the null hypothesis), when the proportion of red Skittles is
not equal to 20%.
Using values for the class data that you computed in Part 2 of the project and a 0.01 significance level,
test the claim that the mean number of candies in a bag of Skittles is more than 58. Show all the steps
(neatly written and scanned, typed, or copied from StatCrunch) including:
1. the hypotheses with correct notation (4 points)
H0: μ = 58
H1: μ > 58
2. the conditions for performing the hypothesis test, along with checking that they are met. Hint: they
are not all met! (5 points)
Simple Random sample or data from randomized experiment. The purchase of the bags was not
a SRS, it was a convenience sample.
No outliers and comes from a normal population OR sample size larger than 30
Sampled values are independent of each other