
Hillary Bowler

Ashley Clark
Amanda Cordova
Victoria Schofield
Part II
Introduction
The first step in our process was to collect the data, which came in the form of a
report from each student on their respective bag of Skittles. We each reported the
number of candies of each color in the bag we purchased. Our objective is to study
the data to understand the frequency and distribution of the total number of
candies per bag and of the number of candies of each color. To better understand
this, we compared the class totals with our own individual totals. We detail that
comparison below.
CATEGORICAL DATA

DISCUSSION: Both graphs for the full class data show what we expected to see.
The totals for each color were similar: the range was only 21 for counts that are
in the 100s. The pie chart and Pareto chart show a very small difference between the
frequencies of each color. The story is a little different for our individual data.
Hillary clearly had more greens than anything else, Ashley had more reds, Amanda had
the most yellows, and Victoria had more purples than any other color. Our individual
Pareto charts in particular each show a very clear descending pattern and clear
differences between the frequencies of the colors. This is probably because the total
number of Skittles in each bag is small compared to the class totals, so even a small
difference in count makes up a noticeable share of the bag.
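
To make the sample-size point concrete, the short sketch below expresses a gap between two color counts as a percentage of the total. It is only an illustration under stated assumptions: the class total is estimated from the 26 bags and the mean bag size reported in this report (assuming the color counts cover those same 26 bags), and the 5-candy gap for a single bag is a made-up number, not one of our actual counts.

```python
def gap_as_share(gap, total):
    """Express a difference in counts as a percentage of the total."""
    return 100 * gap / total

# Class data: 26 bags with a mean of 59.08 candies per bag (from this report).
# The resulting total is an estimate, not a figure stated in the report.
CLASS_BAGS = 26
CLASS_MEAN = 59.08
class_total = round(CLASS_BAGS * CLASS_MEAN)
CLASS_RANGE = 21  # range of the class color totals, from the discussion

# Single bag: 59 candies (Ashley's total) with a HYPOTHETICAL gap of
# 5 candies between the most and least common color.
BAG_TOTAL = 59
BAG_GAP = 5

class_share = gap_as_share(CLASS_RANGE, class_total)
bag_share = gap_as_share(BAG_GAP, BAG_TOTAL)

print(f"class gap: {class_share:.1f}% of all candies")
print(f"single-bag gap: {bag_share:.1f}% of the bag")
```

A much smaller gap in raw counts takes up a far larger share of a single bag than the class-wide range takes up of the class total, which is why the individual charts look so uneven.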

MEAN/STANDARD DEVIATION/5-NUMBER SUMMARY


Mean: 59.08
Standard Deviation: 1.9
5-Number Summary
Minimum: 54
Q1: 58
Q2 (Median): 59
Q3: 60
Maximum: 62
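
For readers who want to compute the same kinds of statistics, here is a minimal sketch using Python's standard `statistics` module. The `sample_counts` list is invented for demonstration and is not the class data, and the quartile method (`method="inclusive"`) is an assumption: calculators and software packages compute quartiles in slightly different ways, so results on real data may differ a little from tool to tool.

```python
import statistics

def five_number_summary(counts):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, q2, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return min(counts), q1, q2, q3, max(counts)

def describe(counts):
    """Collect the mean, sample standard deviation, and 5-number summary."""
    return {
        "mean": statistics.mean(counts),
        "stdev": statistics.stdev(counts),  # sample (n - 1) standard deviation
        "five_number": five_number_summary(counts),
    }

# HYPOTHETICAL bag counts for illustration only, not the class data.
sample_counts = [54, 58, 58, 59, 59, 60, 60, 61, 62]

summary = describe(sample_counts)
print(summary)
```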
HISTOGRAM/BOXPLOT

DISCUSSION: The graphs show a fairly normal distribution of data, apart from one
outlier and one gap. The histogram has no bar at 57 Skittles because none of the
26 students had a bag with that exact count. The box plot shows one outlier at 54;
this student had the lowest Skittle count in the class. Again, we expected a fairly
normal distribution because there is little variation between the students' counts
(the standard deviation is only 1.9). All of our individual counts (Hillary 61,
Ashley 59, Amanda 55, Victoria 60; see the box plot of our data) fall within the
class range, though Amanda's is on the low end and more than 4 Skittles away from
the mean.
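
The outlier and the distances from the mean can be checked with a few lines of arithmetic. The sketch below uses the statistics reported above and the common 1.5 × IQR rule for box plot outliers. That rule is an assumption here, since the report does not say which rule the graphing tool applied.

```python
# Values reported in the summary above.
MEAN, STDEV = 59.08, 1.9
Q1, Q3 = 58, 60

def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, q1, q3, k=1.5):
    """True if the value lies strictly outside the fences."""
    low, high = iqr_fences(q1, q3, k)
    return value < low or value > high

# Our group's bag counts, from the discussion above.
group = {"Hillary": 61, "Ashley": 59, "Amanda": 55, "Victoria": 60}

low, high = iqr_fences(Q1, Q3)
print(f"fences: {low} to {high}")
print(f"54 is an outlier: {is_outlier(54, Q1, Q3)}")

for name, count in group.items():
    distance = count - MEAN       # signed distance from the class mean
    z_score = distance / STDEV    # distance in standard deviations
    print(f"{name}: {count} ({distance:+.2f} from mean, z = {z_score:+.2f}, "
          f"outlier: {is_outlier(count, Q1, Q3)})")
```

Under this rule the fences are 55 and 63, so the bag with 54 falls outside them, while Amanda's 55 sits exactly on the lower fence without crossing it.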

REFLECTION: Put simply, quantitative data consists of numbers, and categorical data
divides information by category. The first set of graphs, pie charts and Pareto charts,
is better suited to categorical data. Categorical data is discrete and not always
numerical. While we did count the quantity of each color, color itself is not numeric.
The histogram and box plot work better for quantitative data. They gave a visual
representation of the number of Skittles per bag in the class sample and of how those
counts are distributed. You can't calculate much for categorical data beyond the total
count for each category. With quantitative data, you can calculate everything from the
mean to the standard deviation and variance. The categorical data was fun to know, but
the quantitative data on the Skittles allowed more calculations that helped us really
understand Skittle packaging patterns.
