
Zachary Twitchell

Statistics Final Project

The purpose of this project is to use the proportions of the different Skittles colors and the mean
number of candies in a 2.14 oz bag of Skittles to perform statistical evaluations. Two
sets of data were collected. The first was a single bag of Skittles with each color counted
and recorded. The second was 49 bags of Skittles with each color counted and recorded.
The first data set is as follows.

Number of candies by color
Red  Orange  Yellow  Green  Purple
9    16      17      10     10

The proportions of each are P(red) = 0.145, P(orange) = 0.258, P(yellow) = 0.274, P(green) = 0.161,
and P(purple) = 0.161.

The total number of candies in the bag was 62.
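As a quick check of the arithmetic, the proportions can be recomputed from the counts. This is only a minimal sketch; the variable names are illustrative and not part of the original project.

```python
# Color counts for the single bag, copied from the first data set.
counts = {"red": 9, "orange": 16, "yellow": 17, "green": 10, "purple": 10}

total = sum(counts.values())

# Proportion of each color = count of that color / total candies in the bag.
proportions = {color: n / total for color, n in counts.items()}

print(f"total candies: {total}")
for color, p in proportions.items():
    print(f"P({color}) = {p:.3f}")
```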

The following is the second data set.

RED ORANGE YELLOW GREEN PURPLE TOTAL
13 8 11 15 10 57
14 13 14 11 10 62
11 13 9 13 16 62
11 7 14 15 13 60
18 10 15 8 13 64
16 15 9 11 9 60
17 9 10 16 11 63
11 9 25 14 6 65
13 9 11 15 15 63
8 15 9 12 20 64
11 8 20 8 15 62
16 8 9 11 10 54
8 13 21 13 6 61
15 8 15 9 14 61
15 10 16 8 13 62
14 18 11 5 11 59
9 13 17 10 14 63
14 12 15 9 9 59
8 16 16 10 7 57
11 9 25 14 6 65
14 10 20 7 10 61
19 11 13 6 10 59
17 10 12 7 16 62
7 10 7 6 17 47
9 7 16 9 16 57
13 15 10 8 15 61
23 9 17 6 9 64
10 12 10 16 14 62
15 7 10 13 17 62
10 12 10 16 14 62
14 8 10 14 16 62
12 14 8 14 15 63
15 16 10 9 10 60
6 9 15 9 13 52
25 7 14 14 10 70
16 14 13 8 9 60
13 15 8 11 13 60
15 21 12 8 6 62
13 9 12 13 13 60
15 8 11 8 17 59
15 13 13 10 9 60
20 10 9 7 14 60
13 12 10 15 13 63
10 11 9 14 13 57
16 9 8 19 10 62
10 13 7 15 12 57
14 11 15 12 10 62
13 22 11 9 3 58
16 16 6 13 9 60
Column totals: 661 564 618 543 581 2967
These graphs show the proportions of both the first and second data set.

[Pie chart: Proportion of Skittles Colors, One Bag. Red 14.52%, Orange 25.81%, Yellow 27.42%, Green 16.13%, Purple 16.13%]

[Pie chart: Proportions of Skittles Colors, Whole Sample. Red 22.28%, Orange 19.01%, Yellow 20.83%, Green 18.30%, Purple 19.58%]

These graphs show the proportions of the different colors of Skittles. Taken together, they
show that the proportions of the colors move closer to one another as the sample size
increases.
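One way to put a number on "closer together" is to compare the gap between the most common and least common color in each data set. The sketch below does that with the counts from the two tables; the helper names `proportions` and `spread` are made up for this illustration, and the max-minus-min gap is just one possible measure.

```python
# Counts from the single bag and the column totals from the 49-bag sample.
one_bag = {"red": 9, "orange": 16, "yellow": 17, "green": 10, "purple": 10}
all_bags = {"red": 661, "orange": 564, "yellow": 618, "green": 543, "purple": 581}


def proportions(counts):
    """Return each color's share of the total count."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def spread(props):
    """Gap between the largest and smallest proportion."""
    return max(props.values()) - min(props.values())


spread_one = spread(proportions(one_bag))
spread_all = spread(proportions(all_bags))
print(f"one bag spread:      {spread_one:.3f}")
print(f"whole sample spread: {spread_all:.3f}")
```

A smaller spread for the whole sample is consistent with the observation above.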
Now, using the sample data, we can calculate the mean, standard deviation and five-number
summary of the number of candies per bag.

Mean=60.6

Standard Deviation=3.51

Minimum=47 Q1=59 Median=61 Q3=62 Maximum=70
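These summary values can be recomputed from the total-count column of the second data set. The sketch below uses only Python's standard `statistics` module and finds the quartiles as the medians of the lower and upper halves of the sorted data, which is an assumption about how the calculator computes them. Note that the reported 3.51 matches the divide-by-n (population) formula, while the divide-by-(n - 1) sample formula gives about 3.54.

```python
import statistics

# Total candies per bag, copied from the last column of the second data set.
totals = [
    57, 62, 62, 60, 64, 60, 63, 65, 63, 64,
    62, 54, 61, 61, 62, 59, 63, 59, 57, 65,
    61, 59, 62, 47, 57, 61, 64, 62, 62, 62,
    62, 63, 60, 52, 70, 60, 60, 62, 60, 59,
    60, 60, 63, 57, 62, 57, 62, 58, 60,
]

n = len(totals)
mean = statistics.mean(totals)
pop_sd = statistics.pstdev(totals)     # divides by n
sample_sd = statistics.stdev(totals)   # divides by n - 1

# Quartiles as medians of the lower and upper halves
# (the middle value is left out of both halves because n is odd).
ordered = sorted(totals)
half = n // 2
q1 = statistics.median(ordered[:half])
median = statistics.median(ordered)
q3 = statistics.median(ordered[-half:])

print(f"n={n} mean={mean:.1f} pop_sd={pop_sd:.2f} sample_sd={sample_sd:.2f}")
print(f"min={ordered[0]} Q1={q1} median={median} Q3={q3} max={ordered[-1]}")
```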

The following is a frequency histogram that allows us to see the distribution of the number of
candies in a bag. The mean is 60.6 and the median is 61 candies. The histogram is roughly
bell-shaped, like a normal curve, with its center right over those values.

[Frequency histogram: Number of Skittles per bag (horizontal axis) against frequency (vertical axis)]

The following is a boxplot of the data.


This experiment uses two different types of data: categorical and quantitative.
Categorical data is recorded by category, such as the color of each Skittle.
Quantitative data is numerical, can be discrete or continuous, and is recorded regardless of
category; the number of candies in a bag is an example. For categorical data it makes sense to
do calculations using proportions. For quantitative data, the mean and standard deviation make
more sense.

Confidence Intervals

Now, using the data acquired, we want to construct confidence intervals. First we will use the
calculator to obtain a 99% confidence interval for the proportion of yellow Skittles. Using the
1-PropZInt function with x = 618, n = 2967 and a C-level of 0.99 gives the interval (0.189, 0.227).
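For readers without the calculator, the same interval can be sketched by hand with the usual normal-approximation formula, p-hat ± z* · sqrt(p-hat(1 - p-hat)/n). The code below assumes that this is the formula behind the calculator function and uses `statistics.NormalDist` from the Python standard library for the critical value.

```python
from math import sqrt
from statistics import NormalDist

x, n = 618, 2967      # yellow candies and total candies in the second data set
confidence = 0.99

p_hat = x / n
# Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(f"99% CI for P(yellow): ({lower:.3f}, {upper:.3f})")
```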

Now we want a 95% confidence interval for the true mean number of candies per bag. For this we
will use the TInterval function on the calculator. The result is (59.5, 61.6).
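The interval can also be sketched from the summary statistics with the formula x-bar ± t* · s / sqrt(n). The Python standard library has no t distribution, so the critical value below is an assumption: t* = 2.0106 is the usual table value for 95% confidence with 48 degrees of freedom.

```python
from math import sqrt

n = 49                # number of bags
mean = 2967 / n       # total candies divided by number of bags
s = 3.51              # standard deviation as reported in the text
t_star = 2.0106       # ASSUMPTION: table value, 95% confidence, df = 48

margin = t_star * s / sqrt(n)
lower, upper = mean - margin, mean + margin
print(f"95% CI for the mean: ({lower:.1f}, {upper:.1f})")
```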

Lastly we want a 98% confidence interval estimate for the standard deviation. For this we have
to use the formula $\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}$, where n = 49, s = 3.51, and the
critical values $\chi^2_R$ and $\chi^2_L$ are taken from a chi-square table: $\chi^2_R$ = 76.154 and
$\chi^2_L$ = 29.707. Plugging in these values gives the interval (2.787, 4.462).
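The arithmetic can be checked with a few lines of code. This sketch takes the two chi-square table values quoted in the text as given; the larger one (76.154, the right-tail value) produces the lower limit and the smaller one (29.707, the left-tail value) produces the upper limit.

```python
from math import sqrt

n = 49
s = 3.51               # standard deviation as reported in the text
chi2_left = 29.707     # table value quoted in the text (left tail)
chi2_right = 76.154    # table value quoted in the text (right tail)

# sqrt((n-1)s^2 / chi2_right) < sigma < sqrt((n-1)s^2 / chi2_left)
lower = sqrt((n - 1) * s**2 / chi2_right)
upper = sqrt((n - 1) * s**2 / chi2_left)
print(f"98% CI for the standard deviation: ({lower:.3f}, {upper:.3f})")
```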

Each of these confidence intervals says that we are 99%, 95% or 98% confident, respectively,
that the true population value lies within that interval. This gives us a range of plausible
values instead of a single estimate.

Hypothesis Tests
The purpose of hypothesis tests is to see if there is sufficient evidence to support a claim. For
these we use two claims: the null hypothesis and the alternative hypothesis.

First we want to test the claim that 20% of all Skittles are red. The hypotheses are H0: p = 0.2
and H1: p ≠ 0.2. We will test this at a significance level of 0.05. When the information is
plugged into the calculator we get a p-value of 0.0019, which is less than 0.05. This means we
have sufficient evidence to reject the null hypothesis and conclude that the proportion of red
Skittles is not 0.2.
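The test can be sketched without the calculator as a two-sided one-proportion z test, using 661 red candies out of 2967 from the second data set. The code assumes the standard test statistic z = (p-hat - p0) / sqrt(p0(1 - p0)/n), which may or may not be exactly what the calculator does internally.

```python
from math import sqrt
from statistics import NormalDist

x, n = 661, 2967      # red candies and total candies in the second data set
p0 = 0.20             # claimed proportion under the null hypothesis
alpha = 0.05

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-sided p-value: probability of a z at least this extreme in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```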

Next we want to test the claim that the mean number of candies in a bag is 55 (H0: μ = 55,
H1: μ ≠ 55) at a significance level of 0.01. When we plug the information into the calculator we
get a p-value of 1.12 × 10^-14. The p-value is less than 0.01, meaning we have sufficient
evidence to reject the null hypothesis and conclude that the true mean number of Skittles per
bag is not 55.
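The size of the test statistic shows why the p-value is so small. The sketch below computes t = (x-bar - μ0) / (s / sqrt(n)) from the summary statistics and compares it with a critical value instead of computing the p-value, because the Python standard library has no t distribution. The critical value 2.682 is an assumption: it is the usual two-tailed table value for a 0.01 significance level with 48 degrees of freedom.

```python
from math import sqrt

n = 49                # number of bags
mean = 2967 / n       # total candies divided by number of bags
s = 3.51              # standard deviation as reported in the text
mu0 = 55              # claimed mean under the null hypothesis

t_stat = (mean - mu0) / (s / sqrt(n))
t_crit = 2.682        # ASSUMPTION: two-tailed table value, alpha = 0.01, df = 48
reject_null = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_null}")
```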

These tests show that the data do not support the claim that red Skittles make up 1/5 of the
Skittles in a bag, or the claim that the mean number of Skittles per bag is 55. Hypothesis tests
give us evidence for or against claims.

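The conditions for doing these tests require a sample of sufficient size and either recorded
proportions or mean values. For the proportion test, the expected numbers of successes and
failures should each be at least 5; for the mean test, the sample should have more than 30
values. With 2967 candies and 49 bags, the claims we tested met the conditions.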
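The sample-size conditions can be checked numerically. The thresholds in this sketch (at least 5 expected successes and failures for the proportion test, more than 30 values for the mean test) are the common textbook rules of thumb and are an assumption here, since the text does not state which thresholds were used.

```python
n_candies = 2967      # total candies, used for the proportion test
p0 = 0.20             # claimed proportion of red candies
n_bags = 49           # number of bags, used for the mean test

# Expected counts under the null hypothesis.
expected_red = n_candies * p0
expected_not_red = n_candies * (1 - p0)

proportion_ok = expected_red >= 5 and expected_not_red >= 5
mean_ok = n_bags > 30

print(f"expected red: {expected_red:.1f}, expected not red: {expected_not_red:.1f}")
print(f"proportion test condition met: {proportion_ok}")
print(f"mean test condition met: {mean_ok}")
```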
Zachary Twitchell

Reflective Writing

Statistics

This project has allowed me to see how statistics can be applied to anything and how
different aspects of statistics can be used on the same data set. For the Skittles project we
obtained the mean and standard deviation, constructed confidence intervals for a proportion,
the mean, and the standard deviation, and tested claims using hypothesis tests. All of this was
done using the same data set from beginning to end.

The most interesting part of this class has been understanding how statistics can
be used. My classes are pre-medical in nature, and I now understand how the statistics I see in
those classes can be interpreted and how they can also be misleading. The number one thing I
have learned from this class is that a sample statistic may look different from the population
value but can in fact be a usual measurement.

This understanding is very important in the medical field for judging whether a medication
truly is more effective than its predecessor, whether a treatment option is actually more or
less effective, or whether one sample is still within the usual range. Hypothesis tests are the
key to this understanding and make statistics more applicable to me in school and in a future
career.
