This project looks at the distribution of colors and candy counts in packs of Skittles. We are testing whether Skittles are consistent for their customers, that is, whether every basic $1.29 pack of Skittles (2 oz.) has the same variety of colors. I will organize and analyze the data, then draw conclusions using confidence intervals and hypothesis tests to put Skittles up to the challenge. Let's see what we got:

Class Total Skittles Colors

(You can't control the color on StatCrunch, so I made the graphs black and white.)

The graphs reflected what I expected to see, although I thought we would have a little more information on how to use StatCrunch. For the most part, my personal bag of Skittles matches up with the whole class, except that I had more orange than red, which is different from the class.

As you can see, my data is slightly different from the class's. My orange count is the highest, and my yellow is a lot lower than the class yellow. Red is also second to last.

Summary statistics:

Column           Mean   Std. dev.   Min   Q1   Median   Q3   Max
Total Skittles   59.3   2.72        54    58   59       61   66

Total # of Skittles in my bag : 58

The shape of the distribution is skewed right, and the data shown isn't much different from what I expected.

Reflection: The values of a categorical variable can be put into a countable number of categories or groups, which may or may not have some logical order. The values of a quantitative variable, by contrast, are numbers that can be ordered and measured. Categorical data is best shown with bar graphs and pie charts because each value is a label for a group (hence the name category). Categorical data often has no natural order, whereas quantitative data goes in numeric order and supports calculations such as means and standard deviations. Quantitative data uses histograms, box plots, and scatter plots.

Let's say you survey people and ask them to tell you their eye color. They would respond with a categorical value such as brown, blue, green, or hazel. They wouldn't respond "eye color"; that is the name of the variable, not one of its values. For quantitative data, the values must be numbers that can be put in ascending or descending order, or else it wouldn't work. For example, you won't be able to graph GPA against another variable (say, race or sex) unless you actually have numeric values, like 3.1 or 2.9.
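
To make the difference concrete, here is a minimal sketch in Python. The survey answers and GPA values are made up purely for illustration (they are not part of the Skittles data), and the variable names are my own.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers, invented only to illustrate the two variable types.
eye_colors = ["brown", "blue", "brown", "green", "hazel", "brown"]
gpas = [3.1, 2.9, 3.6, 2.4]

# Categorical values are labels, so the natural summary is a count per category
# (the kind of summary a bar graph or pie chart displays).
eye_color_counts = Counter(eye_colors)

# Quantitative values are numbers, so they can be sorted and averaged
# (the kind of data a histogram or box plot displays).
sorted_gpas = sorted(gpas)
mean_gpa = mean(gpas)

print(eye_color_counts)
print(sorted_gpas, round(mean_gpa, 2))
```

Averaging the eye colors would make no sense, while counting and averaging both work for the GPAs.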

CONFIDENCE INTERVAL

Confidence Interval Estimates: Statisticians use a confidence interval to express the degree of

uncertainty associated with a sample statistic. A confidence interval is an interval estimate

combined with a probability statement.

1. Confidence interval estimate based on a 99% confidence level for the true proportion of yellow Skittles in a randomly selected bag.

Using a TI-83 calculator (based on the formula for the margin of error, $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$):

P-hat = .184

Q-hat = .816

n = 1601

X = 294

Z =2.575

Confidence interval = .159 < p < .209

Conclusion: We are 99% confident that the interval from .159 to .209 actually does contain the true value of the population proportion of yellow Skittles. That is to say, we are 99% confident that between 15.9% and 20.9% of all Skittles are yellow.
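
As a cross-check on the calculator result, the same interval can be sketched in Python using only the standard library. The function name `proportion_ci` is my own, and the sketch assumes the usual normal-approximation interval that the margin-of-error formula above describes.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Critical value z_{alpha/2}: leaves alpha/2 of the area in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Class totals from the project: 294 yellow out of 1601 Skittles.
low, high = proportion_ci(successes=294, n=1601, confidence=0.99)
print(f"{low:.3f} < p < {high:.3f}")
```

Rounded to three decimals, this should agree with the .159 < p < .209 interval from the calculator.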

2. Confidence interval estimate based on a 95% confidence level for the true mean number of

candies per bag.

Because we do not know the standard deviation for the population, we will work with the

standard deviation of the sample and a t-score rather than a z-score. We would use the z-score if

we knew the population standard deviation.

The data relevant to creating our interval:

Sample mean ($\bar{x}$) = 59.3

Sum (total candies in all bags) = 1601

Sample standard deviation ($s$) = 2.71

Number of bags in our sample ($n$) = 27

Degrees of freedom = 26

Our formula: $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$, where $t_{\alpha/2} = 2.056$, $s = 2.71$, and $n = 27$.

Using a TI-83 calculator, our confidence interval is $58.22 < \mu < 60.37$.

Conclusion: We are 95% confident that the true mean number of Skittles per 2.7 ounce bag is between 58.22 and 60.37. This interval describes the average bag, not any single bag; an individual bag can fall outside it (the class bags ranged from 54 to 66 Skittles).
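
The arithmetic behind this interval can be sketched in Python as a check. The function name `mean_ci` is my own, and the critical value 2.056 is taken from the work above rather than computed, since the standard library has no t-distribution.

```python
from math import sqrt


def mean_ci(sample_mean, sample_sd, n, t_critical):
    """t-based confidence interval for a population mean."""
    # Margin of error: E = t_{alpha/2} * s / sqrt(n)
    margin = t_critical * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# 1601 candies across 27 bags gives a sample mean of about 59.3.
sample_mean = 1601 / 27
low, high = mean_ci(sample_mean, sample_sd=2.71, n=27, t_critical=2.056)
print(f"{low:.2f} < mu < {high:.2f}")
```

The interval is a statement about the mean number of candies per bag, so it is much narrower than the spread of individual bags.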

3. Confidence interval estimate based on a 98% confidence level for the standard deviation of

the number of candies per bag. (using a TI-83 plus calculator)

We will use chi-squared ($\chi^2$) critical values for this confidence interval.

The data relevant to getting our interval:

Degrees of freedom = 26

Sample variance: $s^2 = 7.37$ (from $s \approx 2.71$)

$\chi^2_R$ (right) = 45.642

$\chi^2_L$ (left) = 12.198

Final result: $2.049 < \sigma < 3.964$
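
For reference, the standard interval for a standard deviation is $\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}$, where the larger critical value (45.642) is the right-tail one and the smaller (12.198) is the left-tail one. A rough Python sketch of that calculation follows; the function name `sd_ci` is my own, and the critical values are copied from the work above rather than computed.

```python
from math import sqrt


def sd_ci(sample_variance, df, chi2_left, chi2_right):
    """Chi-squared confidence interval for a population standard deviation."""
    # Dividing by the larger (right-tail) critical value gives the lower bound.
    low = sqrt(df * sample_variance / chi2_right)
    # Dividing by the smaller (left-tail) critical value gives the upper bound.
    high = sqrt(df * sample_variance / chi2_left)
    return low, high


low, high = sd_ci(sample_variance=7.37, df=26, chi2_left=12.198, chi2_right=45.642)
print(f"{low:.3f} < sigma < {high:.3f}")
```

Because the inputs here are rounded, the result may differ from the calculator's interval in the last decimal place. The bounds are for the standard deviation, not the variance, since the square root has already been taken.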

Hypothesis Testing

Hypothesis Testing: Hypothesis testing is an inferential procedure that uses sample data to

evaluate the credibility of a hypothesis about a population. (using a TI-83 plus calculator)

1. Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

H0: p = 0.20

H1: p ≠ 0.20

α = 0.05, so α/2 = 0.025 in each tail

Rejection region: two-tailed

Critical z = ±1.96

Test statistic: z = 2.49

Reject H0 because the test statistic falls inside the rejection region (2.49 is greater than 1.96).

Conclusion: There is sufficient evidence to reject the claim that 20% of all skittles are red.
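
The decision rule for this two-tailed test can be sketched in Python from the reported test statistic. The function name `two_tailed_z_decision` is my own; the sketch only takes the 2.49 above as given and does not recompute it, since the red count is not listed here.

```python
from statistics import NormalDist


def two_tailed_z_decision(z_stat, alpha):
    """Critical value, p-value, and reject/fail-to-reject for a two-tailed z test."""
    std_normal = NormalDist()
    # Critical value that leaves alpha/2 in each tail.
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Two-tailed p-value: area beyond |z| in both tails.
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
    reject = abs(z_stat) > z_critical
    return z_critical, p_value, reject


z_critical, p_value, reject = two_tailed_z_decision(z_stat=2.49, alpha=0.05)
print(f"critical z = {z_critical:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Comparing the p-value to 0.05 gives the same decision as comparing the test statistic to the critical value.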

2. Use a 0.01 significance level to test the claim that the mean number of candies in a bag of

Skittles is 55. (using a TI-83 plus calculator)

H0: μ = 55

H1: μ ≠ 55

α = 0.01, so α/2 = 0.005 in each tail

Critical t = ±2.779

Test statistic: t = 8.23

We have to use a t-test because we do not know the standard deviation for the entire population; however, we do know the standard deviation for our sample. When we compare the t statistic to the critical t, there is a huge difference (8.23 is far greater than 2.779). Therefore we will reject the null hypothesis.

Conclusion: There is sufficient evidence to reject the claim that the mean number of candies in a bag of Skittles is 55.
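
As a check, the test statistic can be recomputed in Python from the summary values used earlier (1601 candies in 27 bags, sample standard deviation 2.71). The function name `one_sample_t` is my own, and the critical value 2.779 is copied from the work above rather than computed.

```python
from math import sqrt


def one_sample_t(sample_mean, claimed_mean, sample_sd, n):
    """t statistic for testing a claim about a population mean."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - claimed_mean) / standard_error


sample_mean = 1601 / 27  # about 59.3 candies per bag
t_stat = one_sample_t(sample_mean, claimed_mean=55, sample_sd=2.71, n=27)

# Two-tailed critical value for alpha = 0.01 and 26 degrees of freedom.
t_critical = 2.779
reject = abs(t_stat) > t_critical
print(f"t = {t_stat:.2f}, reject H0: {reject}")
```

The result should land close to the calculator's 8.23; small differences come from rounding the standard deviation.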

Reflection

As a result of this course and project, I have learned a lot. I have learned that everything in life has a statistic to it and can be calculated using a basic distribution curve as long as we know a standard deviation, mean, or percentage.

This project in particular gave me a better understanding of stats, as well as how to use tools such as Excel and StatCrunch, which give an in-depth view of how statistics are used in everyday life. I have learned how to do a confidence interval as well as a hypothesis test. This Skittles data can be substituted with any other data that I want, and I can now find a statistic on pretty much anything. Obviously, this is still very basic, but it could come in handy in all aspects of life.

This project has helped me understand and develop my problem-solving skills. I can now view things not just as things but in terms of statistics, such as how probable something is to happen. This is a beautiful real-life skill that can be used every day, whether it be for bills, weather, Skittles, etc. Everything has a statistic that can be broken down to help solve a problem.

I now look at the world differently, not because of a simple Skittles stat, but because I can see that there are stats in everything. I am able to spot inaccurate statistics that our media gives us and to see how unreliable some of our biggest sources may be. It is quite mind-blowing how much false information is actually put out there, and stats has helped me realize this.

