Math 1040
Skittles Final Project
In the Skittles term project, each student was instructed to purchase a 2.17-ounce bag of
Original Skittles. The number of Skittles of each color was recorded and compiled into a class
data set. The categorical data was displayed using pie charts and Pareto charts, both for the
entire class sample and for my personal bag of Skittles. The total number of Skittles in each
bag was then summarized with a frequency histogram, as well as a box plot built from the
five-number summary of the data.
As you can see in table number one, the entire class sample is displayed as categorical data
in the form of a pie chart. The sample contained 1,020 candies in five colors: yellow,
orange, green, purple, and red. In this pie chart the class's Skittles look fairly evenly
divided among the colors, with each color making up between 18.7% and 20.9% of the sample.
However, in table number three, you can see that my own bag contained noticeably more red and
green Skittles than the other colors.
Table number two is a Pareto chart, which shows that yellow was the most common color in the
entire class sample and red was the least common. However, table number four, which shows my
own bag, gives the exact opposite result: red was the most common color and yellow was the
least common. Therefore, the color distribution of the class sample does not agree with that
of my individual bag. In my bag some colors were far more common than others (from 7 yellow
to 18 red), while the class sample showed a more equal amount of each color.
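As a rough sketch, the comparison above can be checked with a few lines of Python using only the color counts listed in this report. The function names are illustrative and not part of the project.

```python
# Color counts taken from the report (Tables 1 through 4).
class_counts = {"Yellow": 213, "Orange": 195, "Green": 211, "Purple": 210, "Red": 191}
bag_counts = {"Red": 18, "Green": 16, "Orange": 12, "Purple": 11, "Yellow": 7}


def proportions(counts):
    """Return each color's share of the sample as a percentage."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


def pareto_order(counts):
    """Return the colors sorted from most to least common, as in a Pareto chart."""
    return sorted(counts, key=counts.get, reverse=True)


class_pct = proportions(class_counts)
bag_pct = proportions(bag_counts)

# Print the two samples side by side, in the class sample's Pareto order.
for color in pareto_order(class_counts):
    print(f"{color:<7} class {class_pct[color]:5.1f}%   my bag {bag_pct[color]:5.1f}%")

# Gap between the most and least common color, in percentage points.
class_spread = max(class_pct.values()) - min(class_pct.values())
bag_spread = max(bag_pct.values()) - min(bag_pct.values())
print(f"spread: class {class_spread:.1f} points, my bag {bag_spread:.1f} points")
```

The spread between the most and least common color is one simple way to put a number on "more equal" versus "less equal"; other measures would work as well.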
Overall, the distribution of the total number of Skittles per bag comes out to be fairly
symmetrical. There were 17 bags of Skittles counted. The smallest number found in a single bag
was 56 Skittles, while the two bags with the most Skittles had 64. As you can see in table
number five, the majority of the bags had a total somewhere in the middle. The frequency
histogram shows that nine of the bags had between 59 and 61 Skittles, compared with four bags
with only 56 to 58 and four bags in the higher range with 62 to 64. My individual bag, with 64
Skittles, was at the top of the range.
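The shape described above can be sketched as a text histogram. This is only an illustration built from the three class frequencies reported in table number five; the individual bag totals are not listed in this report, so no raw data is assumed.

```python
# Binned frequencies from the report's frequency histogram (Table 5).
bins = {"56-58": 4, "59-61": 9, "62-64": 4}
total_bags = sum(bins.values())

# One '#' per bag, followed by the count and the share of all bags.
for label, freq in bins.items():
    share = 100 * freq / total_bags
    print(f"{label}  {'#' * freq:<9} {freq} bags ({share:.0f}%)")

# The outer bins match and the middle bin holds more than half of the bags.
is_symmetric = bins["56-58"] == bins["62-64"]
middle_is_majority = bins["59-61"] > total_bags / 2
print("symmetric:", is_symmetric, "| middle bin is a majority:", middle_is_majority)
```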
Categorical data, also known as qualitative data, describes a quality and sorts items into
categories rather than measuring them with a number. In this project the categorical variable
is the color of each Skittle. The pie charts and Pareto charts represent this qualitative data
in that you can easily see how many Skittles of each color there were. Quantitative data, on
the other hand, is numerical, and in this case it is the number of Skittles in each bag. This
type of data is more easily seen in the frequency histogram in Table 5, which groups the bag
totals into only three classes and shows how many bags fell into each one. Therefore, it is
easier to see using the bar-style graph than a pie chart.
Ultimately, for qualitative data, pie charts or bar graphs make the most sense, because they
show how the observations are divided among descriptive categories. Pie charts wouldn't work
for quantitative data, as they show parts of a whole rather than where the values fall along a
number line. Quantitative data is best represented using histograms, line charts, or box
plots, as seen in table number 6. These types of graphs show the center and spread of the
values.
In terms of calculations, finding the mean and median is most useful for quantitative
variables. Since quantitative data is numerical, it is possible to find an average value as
well as the middle value. However, for categorical, or qualitative, data such as color, a
mean or median cannot be calculated, so finding the mode, the color of Skittle that occurs
most often, is what is useful. Overall, both categorical and quantitative data were
represented in this project.
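A minimal sketch of these two kinds of calculation, using only the class totals given in this report, might look like the following. The median is not computed here because it needs the individual bag totals, which are not listed.

```python
# Class totals from the report: color counts across all 17 bags.
class_counts = {"Yellow": 213, "Orange": 195, "Green": 211, "Purple": 210, "Red": 191}
num_bags = 17

# Quantitative variable (Skittles per bag): the mean is total candies / number of bags.
total_candies = sum(class_counts.values())
mean_per_bag = total_candies / num_bags

# Categorical variable (color): the mode is the color with the largest count.
modal_color = max(class_counts, key=class_counts.get)

print(f"mean Skittles per bag: {mean_per_bag:.0f}")
print(f"modal color: {modal_color}")
```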
(Tables 1 & 2) - Yellow: 213 Orange: 195 Green: 211 Purple: 210 Red: 191
Sample Size: 1020, 17 total bags
Table 1:
Yellow: 20.9%
Green: 20.7%
Purple: 20.6%
Orange: 19.1%
Red: 18.7%
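Table 2:
Pareto chart of the class color counts, from most to least common: Yellow, Green, Purple,
Orange, Red (vertical axis: number of Skittles).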
(Tables 3 & 4) - Red: 18 Green: 16 Orange: 12 Purple: 11 Yellow: 7
Sample Size: 64
Table 3:
Red: 28.1%
Green: 25.0%
Orange: 18.8%
Purple: 17.2%
Yellow: 10.9%
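Table 4:
Pareto chart of the color counts in my bag, from most to least common: Red, Green, Orange,
Purple, Yellow (vertical axis: number of Skittles).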
Table 5:
The frequency histogram reflects the total number of Skittles found in each bag. There were 17
bags total. The total number in my own bag was 64 Skittles.
Total in each bag    Frequency
56-58                4
59-61                9
62-64                4
Frequency Histogram (horizontal axis: total in each bag; vertical axis: frequency)
Table 6:
5-Number Summary: (Based on the total number of Skittles found in each bag.)
Minimum (Lower Fence) = 56
Q1 = 59
Q2 (Median) = 60
Q3 = 61
Maximum (Upper Fence) = 64
Box plot (horizontal axis: total number of Skittles in each bag)
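As a quick check on the summary above, the fences can be worked out from the quartiles. The sketch below assumes the usual 1.5 × IQR rule for fences, which this report does not state explicitly. It uses the quartiles from table number 6 and the smallest and largest bag totals (56 and 64) given earlier.

```python
# Quartiles from the report's five-number summary, plus the smallest and
# largest bag totals stated in the text.
q1, q2, q3 = 59, 60, 61
smallest, largest = 56, 64

# Interquartile range and fences (assumption: the common 1.5 * IQR rule).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A bag counts as an outlier only if its total falls strictly outside the fences.
has_outliers = smallest < lower_fence or largest > upper_fence

print(f"IQR = {iqr}, fences = [{lower_fence}, {upper_fence}], outliers: {has_outliers}")
```

Under that rule the fences land exactly on the smallest and largest bag totals, so no bag falls outside them.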
Yellow: 213 Orange: 195 Green: 211 Purple: 210 Red: 191
Sample Size: 1020, 17 total bags
Class mean: 60
Class standard deviation: 2.42
Reflections:
In this Skittles project, I was able to further develop my problem-solving skills. The first
stages of the project required sorting and analyzing the data, calculating what the data meant
for the class, and displaying it. While collecting the data, I was able to use the skills from
class to interpret what the class found. By combining each class member's data and also
separating out my own, I was able to see how one single bag of Skittles differed from a class
total of 17 bags. Being able to compare my own data against the class data was very useful and
is something that I will be able to use in the future for real-world math applications.
Furthermore, after we analyzed the qualitative and quantitative aspects of this project, the
further testing we did emphasized just how much can be done with a single data set. In the
latter part of this project, we worked with confidence interval estimates as well as
hypothesis testing. Based on our sample, I was able to be 99% confident that between 17.8% and
24.2% of all Skittles are yellow. I was also able to determine that our sample did not give
enough evidence to reject the null hypothesis, the claim that 20% of all Skittles are red.
Being able to answer these types of questions about a data set certainly added to my
problem-solving skills. This is largely due to having to determine exactly which formulas to
use and what numbers to use in those formulas.
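To illustrate the kind of formulas involved, here is a hedged sketch of a one-proportion confidence interval and z-test using the class counts (213 yellow and 191 red out of 1,020). It assumes the normal approximation and a two-tailed test at a 0.05 significance level. The report does not state which method, significance level, or rounding was actually used, so the bounds it prints may differ slightly from the ones quoted above.

```python
from math import sqrt
from statistics import NormalDist

n = 1020                # class sample size from the report
yellow, red = 213, 191  # class counts from the report
std_normal = NormalDist()


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def proportion_z_test(successes, n, p0):
    """Two-tailed one-proportion z-test of the null hypothesis p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value


low, high = proportion_ci(yellow, n, 0.99)
z, p_value = proportion_z_test(red, n, 0.20)

alpha = 0.05  # assumed significance level, not stated in the report
reject_null = p_value < alpha

print(f"99% CI for yellow: {100 * low:.1f}% to {100 * high:.1f}%")
print(f"red: z = {z:.2f}, p-value = {p_value:.3f}, reject null: {reject_null}")
```

A p-value above the significance level means the sample does not give enough evidence to reject the 20% claim; it does not prove that the claim is true.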
Ultimately, my views on real-world math applications have changed in that I can analyze and
interpret data more thoroughly. This will aid in my ability to perform studies and analyze data as
a Sociology major.