The project we were assigned for statistics class was to gather data from bags of
Skittles and analyze it using the tools we learned during the semester. Each
student purchased a 2.17-ounce bag of Skittles, and we then used the data to produce
the following results.
Data Collection
Each student in the class will purchase one 2.17-ounce bag of Original Skittles and
record the following data:
Number of     Number of        Number of        Number of       Number of
red candies   orange candies   yellow candies   green candies   purple candies
260           240              255              287             256
[Figure: pie chart "Skittles candies": red 20%, orange 18%, yellow 20%, green 22%, purple 20%]
[Figure: bar chart of candy counts by color: orange 240, yellow 255, purple 256, red 260, green 287]
[Figure: chart with two series (Series1, Series2) by color: orange, yellow, purple, red, green]
My total candies
In the pie chart, the colors appear to be evenly distributed, with each color making up
between 18% and 22% of the candies. The Pareto chart, however, makes the differences easier
to see: the counts range from 240 orange to 287 green, and the data look fairly skewed to
the left. I would say that my data does appear to be fairly consistent with the rest of the class.
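The percentages shown in the pie chart can be recomputed from the counts in the data table. The following is a minimal sketch; the variable names are illustrative and not part of the original project.

```python
# Counts copied from the data table in the Data Collection section.
counts = {"red": 260, "orange": 240, "yellow": 255, "green": 287, "purple": 256}

total = sum(counts.values())  # total number of candies counted

# Share of each color as a fraction of the total.
proportions = {color: n / total for color, n in counts.items()}

# Print the colors from least to most common.
for color, share in sorted(proportions.items(), key=lambda item: item[1]):
    print(f"{color:<8}{counts[color]:>5}{share:>8.1%}")
```

Rounded to whole percentages, these shares match the labels on the pie chart.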
Confidence interval
A confidence interval is a range of values, computed from sample data, that is likely to contain
the true value of a population parameter such as a proportion, a mean, or a standard deviation.
The confidence level describes how confident we are that the interval captures the true value.
Construct a 99% confidence interval estimate for the true proportion of yellow candies.
Construct a 95% confidence interval estimate for the true mean number of candies per bag.
Construct a 98% confidence interval estimate for the standard deviation of the number of
candies per bag.
The true proportion of yellow candies was estimated using a 99% confidence interval. The interval
was determined to be 0.168 < p < 0.224. From this we can say that we are 99% confident that the
true proportion of yellow candies among all Skittles falls within that interval.
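The interval for the yellow proportion can be sketched with the usual normal-approximation formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n). This sketch assumes the yellow count (255) and the total of all five colors (1,298) from the data table were the inputs; the function name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed inputs: 255 yellow candies out of 1,298 in the data table.
low, high = proportion_ci(255, 1298, 0.99)
print(f"{low:.3f} < p < {high:.3f}")
```

The result should land close to the interval reported above; a difference in the last digit can come from rounding the sample proportion or the margin of error before combining them.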
The true mean number of candies per bag was estimated using a 95% confidence interval. The
interval has a lower limit of 58.01 and an upper limit of 60.18. From this we can say that we are
95% confident that the true mean number of candies per bag falls between 58.01 and 60.18.
Finally, we used a 98% confidence interval to estimate the standard deviation of the number of candies
per bag. We are 98% confident that the true standard deviation falls between 1.85 and 3.80.
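The intervals for the mean and the standard deviation follow the standard t and chi-square formulas. The per-bag counts from the class are not listed in this report, so the sketch below uses a small HYPOTHETICAL sample of eight bags and critical values read from printed tables for 7 degrees of freedom. It illustrates the method only and does not reproduce the limits reported above.

```python
from math import sqrt
from statistics import mean, stdev

# HYPOTHETICAL per-bag candy counts, for illustration only (not the class data).
bag_counts = [58, 60, 61, 57, 59, 62, 60, 58]

n = len(bag_counts)
df = n - 1                # degrees of freedom
x_bar = mean(bag_counts)  # sample mean
s = stdev(bag_counts)     # sample standard deviation

# Critical values from printed tables for df = 7 (assumed inputs).
T_CRIT_95 = 2.365     # t value with 0.025 in each tail
CHI2_RIGHT = 18.475   # chi-square value with 0.01 in the right tail
CHI2_LEFT = 1.239     # chi-square value with 0.01 in the left tail

# 95% interval for the mean: x_bar +/- t * s / sqrt(n)
mean_margin = T_CRIT_95 * s / sqrt(n)
mean_ci = (x_bar - mean_margin, x_bar + mean_margin)

# 98% interval for the standard deviation:
# sqrt(df * s^2 / chi2_right) < sigma < sqrt(df * s^2 / chi2_left)
sd_ci = (sqrt(df * s**2 / CHI2_RIGHT), sqrt(df * s**2 / CHI2_LEFT))

print(f"mean: {mean_ci[0]:.2f} to {mean_ci[1]:.2f}")
print(f"standard deviation: {sd_ci[0]:.2f} to {sd_ci[1]:.2f}")
```

With a different sample size, the three critical values must be looked up again for the new degrees of freedom.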
Hypothesis Tests
A hypothesis test is used to decide whether sample data give enough evidence to reject a claim about
a population. We do this by stating the null hypothesis and the alternative hypothesis, computing a
test statistic, and comparing it with a critical value.
Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.
Use a 0.01 significance level to test the claim that the mean number of candies in a bag of
Skittles is 55.
Using the classical method, we tested the claim that 20% of Skittles are red at the 0.05 significance
level. Our test statistic was 1.89, which is less than the critical value of 1.96, so it does not fall in
the rejection region and we fail to reject the null hypothesis. This result seems reasonable because the
proportion of red candies in the total sample is 0.200.
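The classical method for a claim about a proportion can be sketched as a two-tailed z test. This sketch assumes the red count (260) and the total (1,298) from the data table as inputs; the statistic it produces depends on which sample is used and need not equal the value reported above. The function name `one_proportion_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha):
    """Two-tailed z test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    # The standard error uses the claimed proportion p0, as H0 is assumed true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z, z_crit, abs(z) > z_crit


# Assumed inputs: 260 red candies out of 1,298 in the data table.
z, z_crit, reject = one_proportion_z_test(260, 1298, 0.20, 0.05)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

The null hypothesis is rejected only when the absolute value of the test statistic exceeds the critical value.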
For our second test, we used the t-distribution to test the claim that the mean number of candies per
bag is 55. Our test statistic was 7.87 and the critical t value is 2.819. Since the test statistic is
greater than the critical value, it falls in the rejection region and we reject the null hypothesis.
This result seems reasonable since the sample mean was calculated to be 59.13, well above 55.
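The t statistic for a claim about a mean is the distance between the sample mean and the claimed mean, divided by the standard error s / sqrt(n). The class's per-bag counts are not listed in this report, so the sketch below uses a HYPOTHETICAL sample of eight bags and a critical value read from a t table for 7 degrees of freedom. It shows the calculation, not the project's actual numbers.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """t statistic for H0: population mean = mu0."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


# HYPOTHETICAL per-bag candy counts, for illustration only (not the class data).
bag_counts = [58, 60, 61, 57, 59, 62, 60, 58]

# Two-tailed critical value for alpha = 0.01 and df = 7, from a t table (assumed input).
T_CRIT = 3.499

t_stat = one_sample_t(bag_counts, 55)
reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f}, critical value = {T_CRIT}, reject H0: {reject}")
```

As in the test above, a sample mean several standard errors away from the claimed value of 55 puts the statistic in the rejection region.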
Reflection
The conditions that are needed for confidence interval estimates and hypothesis testing are as follows:
1. The sample must come from a simple random sample. For our test this was the case.
2. The sampling distribution must be approximately normal. For proportions, the binomial
distribution must be well approximated by a normal distribution, and for means, the data should
be roughly bell shaped. The histogram does appear roughly bell shaped. However, there is one
outlier, and the data are skewed left.
3. The sample size should be at least 30. In our case we fell short of this requirement.
Because our sample is smaller than required, our conclusions are less reliable and the risk
of a Type II error is higher.