Emily ------

Jeffery Price, Instructor

English 1010, Section 7

8 January 2018

Math 1040 Skittles Term Project

This project involves collecting Skittles and computing statistics on the proportion of each
color, as well as the number of candies in each bag. Everyone in the class was assigned to
purchase and bring to class a standard 2.17-ounce bag of Original Skittles. Each participant
then counted and recorded the total number of Skittles and the number of each color. We
then gathered into groups to form samples. The purpose of this assignment was to collect
data and use it to practice creating confidence intervals and performing hypothesis tests.

Our sample’s data included the following:

                  Red   Orange   Yellow   Green   Purple   Total
Student 1 (me)     12       13       11      12       10      58
Student 2          12       11       13      11       11      58
Student 3           8        9       16      10       19      62
Student 4           7        9       17      17        8      58
Student 5           6       14       14      15       11      60
Student 6          10        9       13      14       12      58
Totals             55       65       84      79       71     354
Proportions     0.155    0.184    0.237   0.223    0.201       1

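The Totals and Proportions rows can be checked with a short script. This is only a sketch in Python (the project itself used a calculator); the variable names are made up for illustration, and the counts are copied from the table above.

```python
# Color counts per bag, copied from the sample table above.
colors = ["Red", "Orange", "Yellow", "Green", "Purple"]
bags = {
    "Student 1": [12, 13, 11, 12, 10],
    "Student 2": [12, 11, 13, 11, 11],
    "Student 3": [8, 9, 16, 10, 19],
    "Student 4": [7, 9, 17, 17, 8],
    "Student 5": [6, 14, 14, 15, 11],
    "Student 6": [10, 9, 13, 14, 12],
}

# Column totals: number of candies of each color across all six bags.
totals = [sum(bag[i] for bag in bags.values()) for i in range(len(colors))]
grand_total = sum(totals)

# Proportion of each color, rounded to three decimals as in the table.
proportions = [round(t / grand_total, 3) for t in totals]

for color, total, prop in zip(colors, totals, proportions):
    print(f"{color:<8}{total:>4}{prop:>8.3f}")
print(f"{'Total':<8}{grand_total:>4}")
```

Each proportion is just the color total divided by the 354 candies in the sample.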
Organizing and Displaying Categorical Data: Colors

[Figures: "Skittles of Each Color" (counts: Red 55, Orange 65, Yellow 84, Green 79, Purple 71) and
"Proportion of Skittle Colors" (Red 0.155, Orange 0.184, Yellow 0.237, Green 0.223, Purple 0.201)]

These numbers are about what I expected. The counts of each Skittle color differed, but
overall the proportions are similar. My single bag of Skittles had proportions similar to
the sample’s. My most frequent color was orange and my least frequent color was purple. In
the sample, however, yellow was the most frequent color and red was the least frequent.
Still, in both my bag and the sample, each color’s proportion was around 0.2.

Organizing and Displaying Quantitative Data: the Number of Candies per Bag

Next, the entire class recorded the total number of Skittles in each 2.17-ounce bag. The
following is what was recorded.

Frequency Table

Total number of Skittles per bag   Frequency
63                                 1
62                                 1
61                                 3
60                                 6
59                                 5
58                                 3
57                                 5
56                                 1
55                                 2
54                                 1

5-number-summary

Min   54
Q1    57
Med   59
Q3    60
Max   63

Mean           58.643
Standard Dev   2.181

[Figure: Box and Whisker Plot]
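The summary statistics can be reproduced from the frequency table. The sketch below uses only Python's standard library. It assumes quartiles are taken as the medians of the lower and upper halves of the sorted data; that convention is an assumption here, and other quartile methods can give slightly different values.

```python
import statistics

# Frequency table from above: candies per bag -> number of bags.
freq = {63: 1, 62: 1, 61: 3, 60: 6, 59: 5, 58: 3, 57: 5, 56: 1, 55: 2, 54: 1}

# Expand the table into one value per bag, sorted from smallest to largest.
counts = sorted(value for value, bags in freq.items() for _ in range(bags))
n = len(counts)

mean = statistics.mean(counts)
sd = statistics.stdev(counts)  # sample standard deviation (divides by n - 1)

# Assumed quartile convention: medians of the lower and upper halves.
half = n // 2
five_number = (
    counts[0],                          # min
    statistics.median(counts[:half]),   # Q1
    statistics.median(counts),          # median
    statistics.median(counts[-half:]),  # Q3
    counts[-1],                         # max
)

print(n, round(mean, 3), round(sd, 3), five_number)
```

Expanding the table into individual bag counts keeps the calculation close to what one would do by hand with the raw list.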

The distribution is fairly normal. It is slightly skewed to the left (the mean, 58.643, is
just below the median, 59), but the skew is negligible. I do not believe the numbers in the
frequency table are entirely correct. The data were collected by one person who wrote the
numbers on the board as students called them out. Four people in my sample group had 58
Skittles in their bags, yet the class table records only three bags of 58 Skittles.
Unfortunately, I don’t know exactly what the true class data are. However, I still believe
that the distribution is approximately normal, and these are the only numbers available.

Reflection

Data come in two main types: categorical and quantitative. Categorical data consist of
categories, or groups, that have no meaningful numeric value. Quantitative data consist of
numeric measurements or counts. In this assignment, color is the categorical variable. For
example, it makes no sense to say that the average color is red-orange. Although we can count
how many Skittles of each color there are (how many reds, how many oranges, etc.), the colors
themselves cannot be treated as numbers. It wouldn’t make sense to make a box plot for the
colors. It makes much more sense to use a pie chart, like we did above, because pie charts
compare categories. The calculations that make sense for categorical data are counts and
proportions. The quantitative variable we worked with above was the number of candies in
each bag. It makes more sense to put this kind of data in a box and whisker plot because a
5-number-summary can be computed from numbers. This is not possible with colors because you
can’t order or average colors. Although we could use a pie chart for our quantitative data,
it wouldn’t be helpful.



Confidence Interval Estimates

In statistics, a confidence interval is a range of values that we can be reasonably confident
contains the true population value. It shows how close the estimate is likely to be to the
actual value, which makes it more informative than a single-number estimate.

For our data, using my calculator’s 1-Proportion Z-Interval (x: 84, n: 354, C-Level: 0.99),
a 99% confidence interval for the true proportion of yellow candies is (0.179, 0.296). We can
be 99% confident that the true proportion of yellow candies is between 0.179 and 0.296.
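For readers without the calculator function, the interval for the proportion of yellow candies can be sketched in Python. This assumes the usual one-proportion z-interval formula, p̂ ± z*·sqrt(p̂(1 − p̂)/n), which is what such a calculator routine is generally understood to compute; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, conf_level = 84, 354, 0.99  # yellow candies, sample size, confidence level

p_hat = x / n  # sample proportion of yellow candies

# Critical value z* that leaves (1 - conf_level) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

# Margin of error from the standard error of a sample proportion.
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin, p_hat + margin)

print(f"({interval[0]:.3f}, {interval[1]:.3f})")
```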

For our data, using my calculator’s T-Interval (x̄: 58.643, sx: 2.181, n: 28, C-Level: 0.95),
a 95% confidence interval for the true mean number of candies per bag is (57.797, 59.489).
We can be 95% confident that the true mean number of candies per bag is between 57.797 and
59.489.
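A t-interval for the mean can be sketched the same way from the summary statistics (mean 58.643, standard deviation 2.181, n = 28), using x̄ ± t*·s/sqrt(n). Python's standard library has no t distribution, so the critical value below is an assumption taken from a standard t-table (df = 27, 95% confidence) rather than computed.

```python
from math import sqrt

x_bar, s, n = 58.643, 2.181, 28  # class mean, sample standard deviation, number of bags

# Assumption: t* for df = n - 1 = 27 at 95% confidence, read from a t-table.
t_star = 2.052

standard_error = s / sqrt(n)
margin = t_star * standard_error
interval = (x_bar - margin, x_bar + margin)

print(f"({interval[0]:.3f}, {interval[1]:.3f})")
```

Because t* is slightly larger than the corresponding z* of 1.96, the t-interval is a little wider than a z-interval built from the same numbers.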

Looking at these results, we can see that the interval for the proportion of yellow candies
is quite wide. This is partly because the confidence level is high (99%). The interval for
the mean number of candies per bag is narrower relative to its estimate. This is partly
because the confidence level is lower (95%).

Hypothesis Tests

The purpose of a hypothesis test is to check a claimed or previously reported value of a
population parameter against sample data. For example, because there are 5 Skittle colors,
it would be reasonable to believe that each color makes up 20% of the whole.

I used a .05 significance level to test the claim that 20% of all Skittles candies are red.
Using my calculator’s 1-Proportion Z-Test (p0: .2, x: 55, n: 354, prop ≠ p0), I found that
p = .036. This means that if the true proportion of red candies is .2, the chance of getting
a sample proportion at least as far from .2 as 55 out of 354 is .036. Because the p-value of
.036 is less than my significance level of .05, we reject the null hypothesis. There is
enough evidence, based on my group sample, to conclude that the true proportion of red
candies is not 0.2.
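The test for the proportion of red candies can be sketched in Python as follows. It assumes the standard one-proportion z-test, where the standard error is computed from the hypothesized proportion p0 and the two-sided p-value is twice the tail area beyond |z|; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, x, n, alpha = 0.2, 55, 354, 0.05  # claimed proportion, red candies, sample size, significance level

p_hat = x / n  # sample proportion of red candies

# Under the null hypothesis the standard error uses p0, not p_hat.
standard_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Two-sided p-value: probability of a z at least this far from 0 in either direction.
p_value = 2 * NormalDist().cdf(-abs(z))
reject_null = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject null: {reject_null}")
```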

Next, I used a .01 significance level to test the claim that the mean number of candies in a
bag of Skittles is 55. Using my calculator’s T-Test (µ0: 55, x̄: 58.643, sx: 2.181, n: 28,
µ ≠ µ0), I found p = 0.000 to three decimal places. Because this p-value is less than .01, we
reject the null hypothesis. There is enough evidence to conclude that the mean number of
candies in a bag of Skittles is not 55.
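The test statistic behind this result can be sketched in Python, taking the claimed mean of 55 as the null value and the class mean and standard deviation reported earlier (58.643 and 2.181, n = 28). Since the standard library has no t distribution, the sketch compares the statistic with a critical value instead of computing a p-value; that critical value is an assumption read from a standard t-table (df = 27, two-sided, .01 level).

```python
from math import sqrt

mu0, x_bar, s, n = 55, 58.643, 2.181, 28  # claimed mean, sample mean, sample sd, number of bags

# t statistic: how many standard errors the sample mean lies from the claimed mean.
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Assumption: two-sided critical value for df = 27 at the .01 level, from a t-table.
t_critical = 2.771
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, reject null: {reject_null}")
```

A statistic this far beyond the critical value corresponds to a p-value that rounds to 0.000 at three decimal places.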

I had expected the difference between the sample proportion of red Skittles and 0.2 to be
less statistically significant. I was very surprised at the low p-value. I think this
suggests that the proportions of colors in a small group of bags can differ considerably
from the proportions in the population as a whole. I was also surprised at how statistically
significant the difference between 58.643 and 55 was.

Reflection

The conditions required for valid hypothesis tests and confidence intervals include a
sufficiently random sample and either a normal distribution or a large sample size. Both
sets of data collected, the proportion of each color and the number of Skittles per bag, had
small sample sizes. I think that if we were more careful with our records and combined two
classes for this activity, we would get more accurate results. I also noticed that many
classmates brought in larger boxes of Skittles and then counted out about as many Skittles
as their neighbor had, because they couldn’t find a 2.17-ounce bag. This compromises the
experiment because the number, and potentially the color mix, of Skittles no longer reflects
an actual 2.17-ounce bag. To improve the experiment, we could make sure everyone bought an
actual 2.17-ounce bag of Skittles. Maybe each classmate could bring in two bags to increase
the sample size. I do not feel it is appropriate to draw any conclusions from the numbers
calculated in this paper, because the experimental procedures were carried out so poorly.
