
Kimberly Young

Math 1040 Skittles Project


Introduction:
To test the consistency of Skittles being packaged, we will be taking a sample of 15
packages (2.17 oz.) of Skittles candies. With each bag, we will count the total number of
candies, the number of red candies, the number of orange candies, the number of yellow
candies, the number of green candies, and the number of purple candies. Using the collected
values, we will create a number of charts and graphs to interpret the data.

Organizing and Displaying Categorical Data:

In the Pie and Pareto Charts above, we can clearly observe that the colors are fairly
equally distributed. While they are not exactly the same, it is easy to see that they are close. I
expected to see something like this because as a company, Skittles Candies should try to have
each flavor be approximately equal among packages. While it is unrealistic for each package to
have the exact same number of each candy, it is expected for them to be fairly close. In my
personal package of candies, the number of each color in my package had greater variation.
Below are tables showing the data from both my package and the entire class sample.

My Package
Red   Orange   Yellow   Green   Purple
9     13       14       12

Organizing and Displaying Quantitative Data:

This data is right-skewed because of the outlier. I was not surprised that most packages
have between 55 and 65 Skittles in them, but I was shocked to see that someone got a package
with about thirty more candies than everyone else. My package of Skittles had 57 candies,
which is included in the green bar of my histogram.
Five-Number Summary:
Min: 57 candies
Q1: 59 candies
Med: 60 candies
Q3: 62 candies
Max: 97 candies
Standard Deviation: 9.67 candies
Mean: 62.7 candies
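
The outlier described above can be checked with the common 1.5 x IQR rule. The sketch below is only an illustration: it assumes 62 is the third quartile, uses only the summary values reported here, and the helper name `tukey_fences` is hypothetical. The write-up itself does not say which outlier rule, if any, was applied.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences using the k * IQR convention."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Summary values as reported in the five-number summary (62 assumed to be Q3).
q1, q3, maximum = 59, 62, 97

lower, upper = tukey_fences(q1, q3)
is_outlier = maximum > upper  # True when the maximum lies above the upper fence
print(f"fences: ({lower}, {upper}); max {maximum} flagged as outlier: {is_outlier}")
```

Under these assumptions the 97-candy package lies well above the upper fence, which is consistent with treating it as an outlier.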

Reflection:
Categorical data is defined as information organized into groups. Quantitative data is
defined as information that can be measured and written down with numbers. In other words,
categorical data does not contain numerical values outside of frequency, whereas quantitative
data does. Graphs and charts that make sense to use with categorical data include: pie, Pareto,
and frequency tables because these forms of organizing data do not necessarily need
quantitative values. Graphs and charts that make sense to use with quantitative data include:
histogram, boxplot, bar graph, line graph, dot plot, and stem plot because these charts and
graphs generally do need numerical values in order to convey the data properly. For categorical
data, the frequency of the categories needs to be calculated because it is the one numerical
value used in the graphing of categorical data. Calculations that need to be made for
quantitative data include: five-number-summary, standard deviation, mean, frequency and
others depending on the kind of chart or graph that is chosen to be used to display the data.
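
As a small illustration of the frequency calculation for categorical data, the sketch below counts color labels and converts the counts to relative frequencies. The list of colors is hypothetical and is not the class data; it is only there to show the calculation.

```python
from collections import Counter

# Hypothetical color labels for illustration only (NOT the class sample).
colors = ["red", "green", "red", "purple", "yellow", "orange", "green", "red"]

counts = Counter(colors)           # frequency of each category
total = sum(counts.values())       # total number of observations
relative = {color: n / total for color, n in counts.items()}

# most_common() sorts from most to least frequent, the order a Pareto chart uses.
for color, n in counts.most_common():
    print(f"{color:<7} {n:>2}  {relative[color]:.3f}")
```

The counts feed a frequency table or Pareto chart directly, and the relative frequencies are the slice sizes of a pie chart.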

Confidence Interval Estimates:


A confidence interval is a range of values used to estimate the true value of a population
parameter.

Proportion:
$E = 1.96\sqrt{\frac{(0.191)(0.809)}{941}} = 0.025$
$\hat{p} \pm E = (0.166, 0.216)$
We can be 95% confident that the true proportion of purple candies lies between the values:
0.166 and 0.216.

Mean:
$E = 2.581 \cdot \frac{9.67}{\sqrt{941}} = 0.814$
$\bar{x} \pm E = (61.886, 63.514)$
We can be 99% confident that the true mean of candies per bag lies between the values:
61.886 and 63.514.

Standard Deviation:
$\sqrt{\frac{940(9.67)^2}{124.116}} < \sigma < \sqrt{\frac{940(9.67)^2}{61.754}}$
$26.61 < \sigma < 37.73$
We can be 98% confident that the true standard deviation lies between the values of: 26.61
and 37.73.
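
The arithmetic behind the three intervals can be reproduced with a short script. This is a sketch under stated assumptions: the sample values and the critical values (1.96, 2.581, 124.116, 61.754) are taken exactly as quoted in this section and treated as table lookups. The script does not check whether those table values are the appropriate ones for this sample size. The variable names are illustrative.

```python
import math

n = 941        # sample size used in the calculations above
p_hat = 0.191  # sample proportion of purple candies
x_bar = 62.7   # sample mean, candies per bag
s = 9.67       # sample standard deviation

# Critical values exactly as quoted in the text (assumed table lookups).
z_95 = 1.96
crit_99 = 2.581
chi2_right, chi2_left = 124.116, 61.754

# Proportion: E = z * sqrt(p_hat * q_hat / n)
e_prop = z_95 * math.sqrt(p_hat * (1 - p_hat) / n)
prop_ci = (p_hat - e_prop, p_hat + e_prop)

# Mean: E = critical value * s / sqrt(n)
e_mean = crit_99 * s / math.sqrt(n)
mean_ci = (x_bar - e_mean, x_bar + e_mean)

# Standard deviation:
# sqrt((n-1)s^2 / chi2_right) < sigma < sqrt((n-1)s^2 / chi2_left)
sd_ci = (math.sqrt((n - 1) * s**2 / chi2_right),
         math.sqrt((n - 1) * s**2 / chi2_left))

print(f"proportion: E = {e_prop:.3f}, CI = ({prop_ci[0]:.3f}, {prop_ci[1]:.3f})")
print(f"mean:       E = {e_mean:.3f}, CI = ({mean_ci[0]:.3f}, {mean_ci[1]:.3f})")
print(f"std dev:    CI = ({sd_ci[0]:.2f}, {sd_ci[1]:.2f})")
```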

Hypothesis Tests:
A hypothesis test is a method that uses sample data to decide between two claims about a
population characteristic.
Claim: 20% of all Skittles are green.
$H_0: p = 0.20$
$H_1: p \neq 0.20$
$\alpha = 0.01$
$z = \frac{0.206 - 0.20}{\sqrt{\frac{(0.20)(0.80)}{941}}} = 0.4601$
p-value = 0.1614 > $\alpha$
Fail to Reject $H_0$
There is not sufficient evidence to warrant rejection of the original claim.

Claim: The mean number of Skittles is 56.
$H_0: \mu = 56$
$H_1: \mu \neq 56$
$t = \frac{62.7 - 56}{\frac{9.67}{\sqrt{15}}} = 2.68$
$t_{\alpha/2} = 2.145$
Fail to Reject $H_0$
There is not sufficient evidence to support the original claim.
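
The two test statistics can be recomputed from the sample values quoted in this section. The sketch below covers only the statistics themselves, not the p-value or the critical-value lookup; the function names `z_statistic` and `t_statistic` are hypothetical helpers introduced for illustration.

```python
import math

def z_statistic(p_hat, p0, n):
    """One-proportion z statistic: (p_hat - p0) / sqrt(p0 * q0 / n)."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# Values as reported: 0.206 green proportion among 941 candies,
# and a mean of 62.7 with standard deviation 9.67 over 15 bags.
z = z_statistic(p_hat=0.206, p0=0.20, n=941)
t = t_statistic(x_bar=62.7, mu0=56, s=9.67, n=15)

print(f"z = {z:.4f}")
print(f"t = {t:.2f}")
```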

Reflection:
The requirements to calculate interval estimates for population proportions are:
1. The sample is a simple random sample.
2. The conditions for the binomial distribution are satisfied:
- There is a fixed number of trials
- The trials are independent
- There are two categories or outcomes
- The probabilities remain constant for each trial
3. There are at least five successes and at least five failures

The requirements for hypothesis testing for a population proportion are:


- np > 5 and nq > 5
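
This requirement can be checked for the green-candy claim with a few lines. The sketch assumes the "at least five" form of the rule (np >= 5 and nq >= 5), matching the wording used for interval estimates above; the function name `normal_approx_ok` is hypothetical.

```python
def normal_approx_ok(n, p, minimum=5):
    """Return True when both n*p and n*q reach the minimum count."""
    q = 1 - p
    return n * p >= minimum and n * q >= minimum

n, p0 = 941, 0.20  # sample size and claimed proportion from the green-candy test

print(f"np = {n * p0:.1f}, nq = {n * (1 - p0):.1f}, ok = {normal_approx_ok(n, p0)}")
```

With 941 candies and a claimed proportion of 0.20, both expected counts are far above five, so the requirement is comfortably met.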

The requirements to calculate interval estimates for population means are:


1. The sample is a simple random sample
2. The population is normal and/or n > 30
The requirements for hypothesis testing for a population mean are:
- For a t-distribution: σ not known and normally distributed population, or σ not known and
n > 30.
- For a z-distribution: σ known and normally distributed population, or σ known and n > 30.
The requirements to calculate interval estimates for population standard deviation are:
1. The sample is a simple random sample
2. The population must have normally distributed values. **The requirement of a normal
distribution is much stricter here than in earlier sections, so departures from normal
distributions can result in large errors.**
The requirements for hypothesis testing for a population standard deviation are:
- The population must be normally distributed
Possible errors coming from the use of this data include miscalculation and the possibility
that the data are not normally distributed, which could cause serious issues. The sampling
method could be improved by increasing the number of packages sampled to more than thirty, so
that the sampling distribution of the mean is approximately normal even if the population is
not. Conclusions that could be drawn from this statistical research are that, for the most
part, the number of Skittles per package is distributed fairly normally and the colors are
distributed fairly evenly. This means that it doesn't really matter which package you select
in a store because it will have about the same number of each color and about the same total
number as the other packages.
