
Skittles Term Project, By Kai Galbiso

This project looks at the color distribution in a packet of Skittles. We are testing to figure out
whether or not Skittles are consistent with their customers, that is, whether every basic $1.29 pack of
Skittles (2 oz.) has the same variety of colors. I will be organizing and analyzing data and
drawing conclusions using confidence intervals and hypothesis tests to put Skittles up to the
challenge. Let's see what we got:
Class Total Skittles Colors
(You can't control the color on StatCrunch, so I did the graphs in black and white.)

The graphs reflected what I expected to see, though I thought we would have a little more information
on how to use StatCrunch. For the most part, my personal bag of Skittles matches up
with the whole class, except that I had more orange than red, which is different from the class.

As you can see, my data is slightly different from the class's. My orange is the highest, and my
yellow is a lot lower than the class yellow. Red is also second to last.

Summary statistics:
Column          Mean   Std. dev.   Min   Q1   Median   Q3   Max
Total Skittles  59.3   2.72        54    58   59       61   66

Total # of Skittles bags in the sample: 27


Total # of Skittles in my bag: 58
The shape of the distribution is skewed right, and the data shown isn't much different from what I
expected.
Reflection: The values of a categorical variable can be put into a countable number of categories
or groups, which may or may not have some logical order. The values of a quantitative variable,
by comparison, are numbers that can be ordered and measured. Categorical data is best shown with bar
graphs and pie charts because they display how many items fall into each group (hence the name
category). There is usually no required order to categorical data; it's a broad topic to generalize,
whereas quantitative data goes in order and deals with the gritty work using more numbers.
Quantitative data uses histograms, box plots, and scatter plots.
Let's say you survey people and ask them to tell you their eye color. They would respond with a
value of a categorical variable: brown, blue, green, or hazel. They wouldn't respond "eye color";
that is the name of the variable, not one of its values. For quantitative data there must be numeric
values that can be put in ascending or descending order, or else it wouldn't work. For example, you
won't be able to graph GPA against another variable (say, race or sex) unless you actually have
numeric values, like 3.1 or 2.9.
CONFIDENCE INTERVAL
Confidence Interval Estimates: Statisticians use a confidence interval to express the degree of
uncertainty associated with a sample statistic. A confidence interval is an interval estimate
combined with a probability statement.
1. Confidence interval estimate based on a 99% confidence level for the true proportion of
yellow Skittles among all Skittles.
Using a TI-83 Edition calculator (based on the formula for the margin of
error: $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$)
P-hat = .184
Q-hat = .816
n = 1601
X = 294
Z =2.575
Confidence interval = .159 < p < .209
Conclusion: We are 99% confident that the interval from .159 to .209 actually does contain
the true value of the population proportion of yellow Skittles. That's to say, we are 99%
confident that between 15.9% and 20.9% of all Skittles are yellow.
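To double-check the calculator result, here is a minimal Python sketch of the same margin-of-error formula. It assumes the counts reported above (x = 294, n = 1601) and the rounded critical value z = 2.575; the variable names are just for illustration.

```python
import math

# Values taken from the report; z is the rounded critical value for 99% confidence.
x = 294      # yellow Skittles counted in the class sample
n = 1601     # total Skittles in the class sample
z = 2.575    # critical z for a 99% confidence level

p_hat = x / n                 # sample proportion of yellow
q_hat = 1 - p_hat             # sample proportion of not-yellow
margin = z * math.sqrt(p_hat * q_hat / n)   # margin of error E

lower = p_hat - margin
upper = p_hat + margin

print(f"p-hat = {p_hat:.3f}, E = {margin:.3f}")
print(f"{lower:.3f} < p < {upper:.3f}")
```

With these inputs the bounds round to .159 and .209, which agrees with the interval from the calculator.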
2. Confidence interval estimate based on a 95% confidence level for the true mean number of
candies per bag.
Because we do not know the standard deviation for the population, we will work with the
standard deviation of the sample and a t-score rather than a z-score. We would use the z-score if
we knew the population standard deviation.
The data relevant to creating our interval includes:
Sample mean (x-bar) = 59.3
Sum = 1601
Sample standard deviation (s) = 2.71
Number of bags in our sample (n) = 27
Degrees of freedom = n - 1 = 26
Our formula: $E = t_{\alpha/2} \cdot s/\sqrt{n}$, with $t_{\alpha/2} = 2.056$, $s = 2.71$, and $n = 27$
Using a TI-83 calculator, our confidence interval is $58.22 < \mu < 60.37$
Conclusion: We are 95% confident that the true mean number of Skittles per bag is between 58.22
and 60.37. This describes the average over all bags, not the count in any single bag.
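The same interval can be reproduced by hand. The sketch below is a rough check in Python, assuming the sum (1601), number of bags (27), sample standard deviation (2.71), and critical t (2.056) listed above; the variable names are my own.

```python
import math

# Values taken from the report.
total = 1601     # total Skittles across the class sample
n = 27           # bags in the sample (degrees of freedom = n - 1 = 26)
s = 2.71         # sample standard deviation
t_crit = 2.056   # critical t for 95% confidence with 26 degrees of freedom

x_bar = total / n                     # sample mean number of candies per bag
margin = t_crit * s / math.sqrt(n)    # margin of error E

lower = x_bar - margin
upper = x_bar + margin

print(f"x-bar = {x_bar:.2f}, E = {margin:.2f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

The bounds come out to about 58.22 and 60.37, matching the calculator. Note that the interval is for the mean, so it is much narrower than the spread of individual bags (54 to 66 in the class sample).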

3. Confidence interval estimate based on a 98% confidence level for the standard deviation of
the number of candies per bag. (using a TI-83 plus calculator)
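We will use chi-squared critical values for this confidence interval.
The data relevant to getting our interval includes:
Degrees of freedom = 26
Sample variance (sample standard deviation squared): $s^2 \approx 7.37$
Chi-squared right critical value: $\chi^2_R = 45.642$
Chi-squared left critical value: $\chi^2_L = 12.198$
Formula: $\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}$
Final result: $2.049 < \sigma < 3.964$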
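As a rough check on these numbers, here is a small Python sketch of the chi-squared interval. It assumes the values listed above (26 degrees of freedom, sample variance 7.37) and takes the larger critical value, 45.642, as the right-tail one, since the right-tail chi-squared value is always the bigger of the pair. Dividing by the critical values gives bounds for the variance, and the square roots give bounds for the standard deviation.

```python
import math

# Values taken from the report.
df = 26               # degrees of freedom (n - 1)
s_squared = 7.37      # sample variance as reported
chi2_right = 45.642   # right-tail critical value (98% confidence, df = 26)
chi2_left = 12.198    # left-tail critical value (98% confidence, df = 26)

# Bounds for the population variance (sigma squared).
var_lower = df * s_squared / chi2_right
var_upper = df * s_squared / chi2_left

# Bounds for the population standard deviation (sigma).
sd_lower = math.sqrt(var_lower)
sd_upper = math.sqrt(var_upper)

print(f"{var_lower:.2f} < sigma^2 < {var_upper:.2f}")
print(f"{sd_lower:.2f} < sigma < {sd_upper:.2f}")
```

The standard deviation bounds come out to roughly 2.05 and 3.96, so the reported values 2.049 and 3.964 are bounds on sigma itself; any difference in the last digit comes from rounding the inputs.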
Hypothesis Testing
Hypothesis Testing: Hypothesis testing is an inferential procedure that uses sample data to
evaluate the credibility of a hypothesis about a population. (using a TI-83 plus calculator)
1. Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.
H0: p = .20
H1: p ≠ .20
α (alpha) = .05, so α/2 = .025 in each tail
Rejection region: two-tailed
Critical z = ±1.96
Test statistic: z = 2.49
Reject H0 because the test statistic (2.49) is inside the rejection region (beyond 1.96).
Conclusion: There is sufficient evidence to reject the claim that 20% of all skittles are red.
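The decision can also be checked with a p-value. This is a minimal sketch in Python that starts from the reported test statistic (2.49), since the red count itself is not listed here; it uses the standard library's `math.erfc` to get the two-tailed area under the standard normal curve.

```python
import math

# Values taken from the report.
z_stat = 2.49    # reported test statistic
z_crit = 1.96    # critical z for a two-tailed test at alpha = 0.05
alpha = 0.05

# Two-tailed p-value for a standard normal: P(|Z| > |z_stat|).
p_value = math.erfc(abs(z_stat) / math.sqrt(2))

# Both decision rules should agree.
reject_by_critical_value = abs(z_stat) > z_crit
reject_by_p_value = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject_by_critical_value else "Fail to reject H0")
```

The p-value is about 0.013, which is below 0.05, so the critical-value method and the p-value method lead to the same rejection.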

2. Use a 0.01 significance level to test the claim that the mean number of candies in a bag of
Skittles is 55. (using a TI-83 plus calculator)
H0: μ = 55
H1: μ ≠ 55
α (alpha) = .01, so α/2 = .005 in each tail
Critical t = ±2.779
Test statistic: t = 8.23
We have to use a t-test because we do not know the standard deviation for the entire population;
however, we do know the standard deviation for our sample. When we compare the t statistic to
the critical t, there is a huge difference: 8.23 is far beyond 2.779. Therefore we will reject the
null hypothesis.
Conclusion: There is sufficient evidence to reject the claim that the mean number of candies in a
bag of Skittles is 55.
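Here is a short Python sketch of how the t statistic can be computed by hand, using the formula t = (x-bar - mu0) / (s / sqrt(n)). It assumes the sample values given earlier in the report (sum 1601, 27 bags, sample standard deviation 2.71); the variable names are illustrative.

```python
import math

# Values taken from the report.
total = 1601     # total Skittles across the class sample
n = 27           # bags in the sample
s = 2.71         # sample standard deviation
mu_0 = 55        # claimed mean number of candies per bag
t_crit = 2.779   # critical t for a two-tailed test at alpha = 0.01, df = 26

x_bar = total / n                    # sample mean
std_error = s / math.sqrt(n)         # standard error of the mean
t_stat = (x_bar - mu_0) / std_error  # test statistic

reject = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```

With the rounded inputs the statistic is about 8.24, essentially the reported 8.23, and it is well past the critical value of 2.779.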

Reflection
As a result of this course and project, I have learned a lot. I have learned that almost everything
in life has a statistic to it that can be estimated with a basic curve, as long as we know a
standard deviation, mean, or percentage.
This project in particular gave me a better understanding of stats, as well as how
to use tools such as Excel and StatCrunch, which give one an in-depth view of how statistics are
used in everyday life. I have learned how to do a confidence interval as well as a hypothesis
test. The Skittles data can be substituted with any other data that I want, and I can now find a
statistic on pretty much anything. Obviously, this is still very basic, but it could come in handy
in all aspects of life.
This project has helped me understand and develop my problem-solving skills. I can now
view things not just as things but in terms of statistics, or how probable something is to
happen. This is a beautiful real-life skill that can be used every day, whether it be for bills,
weather, Skittles, etc. Everything has a statistic that can be broken down, and that will help one
solve a problem.
I now look at the world differently, not because of a simple Skittles stat, but because I can
see that there are stats in everything. I am able to spot inaccurate statistics that our media gives
us and see how unreliable some of our biggest sources may be. It is quite mind-blowing how much
false information is actually put out there, and stats has helped me realize this.