
Skittles Term Project

For this project each person in our class got a small bag of skittles and
counted the number of candies in each bag. We then compared how many green,
red, purple, yellow, and orange skittles there were. We recorded the data
that we found and compared it to our other classmates. We then found the
probability of getting each color of skittle. After retrieving this data we
compared and analyzed it.

Color     Mine    Class Totals
Red       19      189
Orange    8       194
Yellow    15      166
Green     8       152
Purple    14      193
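
The probability of getting each color is just that color's count divided by the total count. The short Python sketch below recomputes those proportions from the counts reported above. The dictionary layout and the function name are only illustrative choices, not part of the original project.

```python
# Counts reported above: color -> (my bag, class total).
counts = {
    "Red": (19, 189),
    "Orange": (8, 194),
    "Yellow": (15, 166),
    "Green": (8, 152),
    "Purple": (14, 193),
}


def proportions(column):
    """Return each color's share of the total for one column (0 = mine, 1 = class)."""
    total = sum(pair[column] for pair in counts.values())
    return {color: pair[column] / total for color, pair in counts.items()}


my_props = proportions(0)
class_props = proportions(1)

# Print the two sets of proportions side by side, rounded to two decimals.
for color in counts:
    print(f"{color:<7} mine={my_props[color]:.2f}  class={class_props[color]:.2f}")
```

The proportions in each column add up to 1, which is a quick way to check that no color was left out.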

[Pie chart: Class's Data. Slices for red, orange, yellow, green, and purple
candies; slice labels shown: 0.22, 0.22, 0.17, 0.22, 0.19]

I did expect to see results like these. According to the class's data, there
is a pretty even amount of each color per bag: the probability of getting
each color of candy is about .2. In my data, both green and orange have
significantly fewer candies than the other colors in the bag. The
probability of getting a red or yellow candy is much higher than the
probability of getting a green or orange candy. I feel that this is because
the red and yellow candies are popular, so they make more of those candies.

[Pie chart: Skittles in my bag. Slices for red, orange, yellow, green, and
purple candies; slice labels shown: 0.22, 0.3, 0.13, 0.13, 0.23]

Five Number Summary
Minimum: 52
Quartile 1: 57
Median: 60
Quartile 3: 62
Maximum: 65

[Box plot: Five Number Summary; vertical axis from 0 to 70]

Reflection:


The distribution is slowly increasing upward. The shape of the distribution
is basically a line. I expected it to be like this because the data seems to
be pretty evenly distributed. The boxplot for the class's data is more
evenly distributed than my data. In my bag there were 64 candies, but our
class average per bag is only 59.6, so my bag of skittles had more candies
than average in it.

Categorical data is data that is sorted into categories instead of being
measured with numbers. It uses labels such as hair color, eye color, or
candy color, and bar graphs are a good way to show this kind of data.
Quantitative data is data that is measured with numbers, and histograms are
the type of graph to use for it. Categorizing things by symbols or names
does not produce quantitative data. Quantitative data can be discrete or
continuous.

Confidence Intervals:
A confidence interval is a range of values, built from sample data at a
chosen confidence level, that is likely to contain the true population
value. For example, a 95% confidence interval means that we are 95%
confident that the true value lies within the interval.
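95% Confidence Interval for Purple Candies (Class's Data), where
$\hat{p} = 193/894 = .2158837$ and $n = 894$ is the total number of candies
the class counted:
$E = 1.96\sqrt{\hat{p}\hat{q}/n} = 1.96\sqrt{(.2158837)(.7841163)/894} = .0269704$
$.2158837 - .0269704 = .1889133$ and $.2158837 + .0269704 = .2428541$
$.1889133 < p < .2428541$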
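
As a check on the proportion interval, here is a minimal Python sketch of the same formula. It assumes the sample size n is the total number of candies the class counted (894, the sum of the class totals) and that 193 of them were purple. The function name `proportion_ci` is my own, and the critical value comes from Python's `statistics.NormalDist` instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumption: 193 purple candies out of 894 candies in the class data.
low, high = proportion_ci(193, 894)
print(f"{low:.4f} < p < {high:.4f}")
```

Because the margin of error shrinks as n grows, using the full candy count gives a much narrower interval than using only the purple count as n.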
99% Confidence Interval for the Mean (Class's Data):
ZInterval with Inpt: Data, Standard Deviation: 3.67, List: L1, Freq: 1,
C-Level: .99
57.159 < mean < 62.041
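
The calculator's ZInterval result can be reproduced by hand with the formula mean ± z·σ/√n. The sketch below is one way to do that in Python, assuming the class mean of 59.6 candies per bag, a standard deviation of 3.67, and n = 15 bags (the sample size used in the hypothesis tests). The function name is only illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mean_z_interval(mean, sigma, n, confidence):
    """Z confidence interval for a mean when the standard deviation is treated as known."""
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin


# Assumption: 15 bags, class mean 59.6, standard deviation 3.67.
low, high = mean_z_interval(mean=59.6, sigma=3.67, n=15, confidence=0.99)
print(f"{low:.3f} < mean < {high:.3f}")
```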
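98% Confidence Interval for the Standard Deviation (Class's Data), using the
chi-square critical values for 14 degrees of freedom, $\chi^2_R = 29.141$
and $\chi^2_L = 4.660$:
$\sqrt{(n-1)s^2/\chi^2_R} = \sqrt{(14)(3.67)^2/29.141} = 2.544$
$\sqrt{(n-1)s^2/\chi^2_L} = \sqrt{(14)(3.67)^2/4.660} = 6.361$
2.544 < Standard Deviation < 6.361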
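
The standard deviation interval can be sketched the same way. To keep the example free of extra libraries, the chi-square critical values are passed in as numbers. The values used here (4.660 and 29.141) are my assumption of the standard table entries for 14 degrees of freedom at 98% confidence, so they should be checked against whatever table the class uses.

```python
from math import sqrt


def std_dev_interval(s, n, chi2_left, chi2_right):
    """Confidence interval for a standard deviation from chi-square critical values."""
    scaled = (n - 1) * s ** 2
    # The larger (right-tail) critical value produces the lower bound.
    return sqrt(scaled / chi2_right), sqrt(scaled / chi2_left)


# Assumption: s = 3.67 from 15 bags; table values for df = 14 and 98% confidence.
low, high = std_dev_interval(s=3.67, n=15, chi2_left=4.660, chi2_right=29.141)
print(f"{low:.3f} < standard deviation < {high:.3f}")
```

Note that the degrees of freedom for the table lookup must match the n - 1 used in the numerator.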
The sample proportion, .2159, lies within the 95% confidence interval. The
same goes for the mean: 59.6 lies within its confidence interval. The
standard deviation is closer to the bottom of its confidence interval, but
it also lies between the two numbers.
The purpose of hypothesis testing is to choose between two hypotheses.
We will either reject the null hypothesis or fail to reject it, based on
whether we have sufficient evidence against it.
.01 Significance Level:
$z = (\hat{p} - p)/\sqrt{pq/n} = .6/\sqrt{.16/15} = 5.809$
Because the test statistic is far above the critical value for a .01
significance level (2.576 for a two-tailed test), the p-value is well below
.01, so we reject the null hypothesis because there is sufficient evidence.
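
The proportion test can be sketched in Python as follows. The report only gives the difference (.6) and the product pq (.16), so the values p = 0.2 and p(hat) = 0.8 used below are inferred from those two numbers and should be read as an assumption, along with the choice of a two-tailed test. The decision compares the p-value to the significance level, not the test statistic.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Return the z statistic and two-tailed p-value for a one-proportion z-test."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumption: p0 = 0.2 and p_hat = 0.8, inferred from the .6 and .16 in the report.
z, p_value = one_proportion_z_test(p_hat=0.8, p0=0.2, n=15)

alpha = 0.01
decision = "reject" if p_value < alpha else "fail to reject"
print(f"z = {z:.3f}, decision at alpha = {alpha}: {decision} the null hypothesis")
```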
.05 Significance Level: T-Test with Inpt: Stats, Claimed Mean: 56,
Mean: 59.6, Standard Deviation: 3.7947, n: 15, Alternative: mean is not
equal to 56.
T-Score = 3.674, p = .0025. Because the p-value is below .05, we reject the
null hypothesis: we have sufficient evidence that the mean is not 56.
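
The T-Score from the calculator follows from t = (mean - claimed mean)/(s/√n). The sketch below recomputes it and compares it to a critical value. The critical value 2.145 is my assumption of the usual t-table entry for a two-tailed test at the .05 level with 14 degrees of freedom, and the names are only illustrative.

```python
from math import sqrt


def one_sample_t(sample_mean, claimed_mean, s, n):
    """Test statistic for a one-sample t-test computed from summary statistics."""
    return (sample_mean - claimed_mean) / (s / sqrt(n))


# Summary statistics from the T-Test above.
t = one_sample_t(sample_mean=59.6, claimed_mean=56, s=3.7947, n=15)

# Assumption: two-tailed critical value for alpha = .05 and df = 14 from a t table.
T_CRITICAL = 2.145
reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, reject null hypothesis: {reject_null}")
```

Comparing the test statistic to the critical value leads to the same decision as comparing the p-value to .05.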
In order for a hypothesis test to work for a population proportion, you need
a simple random sample, two possible outcomes, a population at least 20
times bigger than the sample size, and np(1 - p) greater than or equal to
10. Our data needs to come from a simple random sample instead of a
convenience sample in order to meet these requirements.
In order for a hypothesis test to work for a population mean, you need a
simple random sample and a normal distribution. Our data needs to come from
a simple random sample instead of a convenience sample.

You need to have a simple random sample and a sample size greater
than or equal to 5 in order for a population standard deviation
hypothesis test to work. Our data doesn't meet these requirements
because we used a convenience sample.
Possible errors could have been a miscount in the data or getting the
wrong kind of skittles. These mistakes could affect the data a lot,
especially if there is an outlier. I feel that the sampling method could
have been improved if we had more people get more bags and chose
the bags randomly instead of picking the first one off the shelf. Overall
I have been able to draw many conclusions from this data. I have
learned the average amount of skittles in a bag and that the mean, the
standard deviation, and the proportion all lie within the confidence
intervals we found. We were also able to test our hypotheses and decide
whether to reject or fail to reject each one.

Conclusion:
Although there were many problems and challenges that I faced
during this assignment, I have learned a lot. I have been able to apply
what I have learned in statistics to the real world, even if it is just
calculating information about skittles. After analyzing my data I was
able to find confidence intervals, test my hypotheses, find the standard
deviation, and figure out how to create different types of graphs from
data. I will be able to apply these skills in the real world by identifying
which articles or statistics are incorrect and which ones are correct,
so I will not be tricked by faulty information and misleading
statistics.
This project has changed the way I see math and how I can apply
it to the real world. When I began this class I felt that I would never
use any of the things I learned in the real world and that it was really
pointless information. But after completing this project I can see that
this type of math, along with many other types of math, can be
applied in real life as well. I had never really seen a point to learning
so many different types of math unless you were going into a major
that required math skills. This project has changed my views and
helped me see that even the simplest math can be used in the real
world.
Overall I felt that this project was helpful and a good way to
improve my understanding of statistics and my problem solving skills.
While calculating all of the information I needed for this project, I was
able to overcome the roadblocks I faced. I have found that if I cannot
figure out something by myself, there are many other resources I can
use to get my answer. I struggled with hypothesis testing during this
assignment and was able to learn more about it by looking at my notes
and searching the internet.
