
Reflection

This semester, in Math 1040 (Intro to Statistics), we were given a class term project to
exercise and put to use what we had learned throughout the course. Each student was to get
a 2.17 oz. bag of regular Skittles and record the total count and the count of each color.
We then organized the data and displayed it in both categorical (colors) and quantitative
(numbers) form. With this data we analyzed the counts, determined proportions, and used
various graphing methods to better illustrate the data. We built confidence intervals, which
let us state with a chosen level of confidence that a population parameter falls within a
range; these intervals help us make an educated guess about the population parameter.
Finally, we tested the claims made by Skittles to determine whether or not there is evidence
to support those claims.
All the steps taken and the lessons learned on how to make a solid interpretation of data
are applicable in everyday life. One can compile data and use the methods for finding a mean
and standard deviation to make better-prepared plans for a business, for example. You can
track how many uses you get from the products you purchase and compare them to other products
to see which product is more likely to give satisfactory results. Using a proper number of
samples and accurate measurements, you can estimate (with a degree of certainty) the
measurements for the overall population of the product. This allows for less costly and more
advantageous expenditures. This project also helped me see that just because a label states
something doesn't necessarily mean that it is exactly true. When I observed the amounts and
means of all the bags of Skittles in the sample, it was easy to see that not all of the bags
have exactly what the package says they do. We assume that companies give us accurate results
and products, when in actuality all of them have a set range or interval of product that they
feel comfortable selling. These are just a few ways I have seen that I can apply the lessons
taught through this project in my regular life.
Skittles Part 2
Row 27: red 12, orange 8, yellow 9, green 13, purple 14, total number of candies 56
Row 12: red 10, orange 11, yellow 9, green 12, purple 19, total number of candies 61
Row 1: red 9, orange 11, yellow 9, green 14, purple 17, total number of candies 60
Sample total: red 31, orange 30, yellow 27, green 39, purple 50, total number of candies 177
n = 177 candies in the three bags of Skittles that were sampled
The group selected the three bags of Skittles using the random number generator on a TI-83
calculator. I first turned on my calculator and pressed the MATH button. Four menus came up
(MATH, NUM, CPX, and PRB). I moved over to PRB, went to rand, and pressed the ENTER button.
Each press produced a random decimal, and we based our random rows on those decimals
(27, 12, 1).
Sampling Method Used: We took a simple random sample of rows, which makes this cluster sampling
of the individual candies. The class data was first divided into rows (one bag per row), the rows
were randomly selected, and every Skittle in each selected cluster was included in the sample.
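The selection step above can be sketched in Python. This is only an illustration: Python's `random` module stands in for the TI-83 `rand` function, the helper name `pick_cluster_rows` is hypothetical, and the class is assumed to have 28 rows (one per bag). The bag counts are the three rows reported above.

```python
import random

def pick_cluster_rows(num_rows, k, seed=None):
    """Pick k distinct row numbers from 1..num_rows (hypothetical helper)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, num_rows + 1), k))

# A fresh random draw, assuming 28 rows in the class data set.
print(pick_cluster_rows(28, 3))

# The three rows the group actually drew, with their color counts.
sampled_bags = {
    27: {"red": 12, "orange": 8, "yellow": 9, "green": 13, "purple": 14},
    12: {"red": 10, "orange": 11, "yellow": 9, "green": 12, "purple": 19},
    1: {"red": 9, "orange": 11, "yellow": 9, "green": 14, "purple": 17},
}

# Cluster sampling: every candy in each selected bag is counted.
colors = ["red", "orange", "yellow", "green", "purple"]
sample_totals = {c: sum(bag[c] for bag in sampled_bags.values()) for c in colors}
sample_size = sum(sample_totals.values())
print(sample_totals, sample_size)
```

Because whole bags are selected and every candy inside them is kept, the unit of randomization is the row, not the individual candy.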
Some of the errors could just be human error: someone could have forgotten to enter a bag into
StatCrunch, giving the wrong group totals, or somebody could have missed a piece of their data
when recording it. It could have been set up so that every individual entered his or her own
data. After everyone entered their own information, someone else should then check the data
that was entered.
So yes, the sample is representative of the class data set.

Skittles Project 3

1. Candy color is categorical because the values can be sorted into categories or groups.
The number of candies per bag is quantitative because the candies can be counted, and
the numbers can be placed in ascending or descending order.
2.

3.

Class Summary
Color     Count   Proportion
Red         309   0.185
Orange      329   0.197
Yellow      315   0.1886
Green       350   0.2096
Purple      367   0.2198
Total     1,670

4. Summary statistics:
Column   n    Mean        Std. dev.   Median   Min   Max   Q1   Q3   Sum    IQR   Mode
Total    28   59.642857   1.7472579   60       56    64    58   61   1670   3     60
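As a quick check on the class summary, the proportions and the mean can be recomputed from the reported color counts. This is a minimal sketch; the variable names are illustrative, and the only inputs are the counts and the number of bags (28) given above.

```python
# Class color counts as reported in the class summary.
counts = {"red": 309, "orange": 329, "yellow": 315, "green": 350, "purple": 367}

total = sum(counts.values())
proportions = {color: n / total for color, n in counts.items()}

for color, n in counts.items():
    print(f"{color:<8}{n:>6}{proportions[color]:>9.4f}")
print(f"{'total':<8}{total:>6}")

# Mean candies per bag, using the 28 bags in the class data set.
num_bags = 28
mean_per_bag = total / num_bags
print(round(mean_per_bag, 6))
```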

5. Upper Fence: Q3 + 1.5(IQR) = 61 + 1.5(3) = 65.5 (66)
Lower Fence: Q1 - 1.5(IQR) = 58 - 1.5(3) = 53.5 (54)
Outliers: no outliers (the class minimum of 56 and maximum of 64 are both inside the fences)

The bag that I purchased had a total of 61, which is inside the fences, so it was not an
outlier bag.
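The fence calculation can be written out as a short sketch. The quartiles, minimum, and maximum come from the summary statistics above; the function name `is_outlier` is just an illustrative choice.

```python
# Quartiles from the class summary statistics.
q1, q3 = 58, 61
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(total_candies):
    """A bag is an outlier if its total falls outside the fences."""
    return total_candies < lower_fence or total_candies > upper_fence

print(lower_fence, upper_fence)
# My bag (61), the class minimum (56), and the class maximum (64).
print(is_outlier(61), is_outlier(56), is_outlier(64))
```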
6. It is appropriate to discuss the shape of the distribution for quantitative data such as
the number of candies per bag. A normal distribution, for example, has a bell shape: it
starts low, rises to a high point, and goes back to being low again. With categorical data
you wouldn't discuss the shape, since the categories have no natural order and so there is
no shape to it. A Pareto chart lists the colors in descending order of frequency, so its
ordering comes from how the chart is built and does not show an actual shape of the
variable.

Part 4: Probability
1.
a. (350/1670) = .210
b. 1- .210=.79
c. 1 - (1046/1670) = .374

d. 329/1046=.315

2.
a. (367/1670)(367/1670)= .0483
b. (367/1670)(366/1669)= .0482
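The two answers in problem 2 differ only in whether the first candy is put back. A small sketch, assuming both draws are for the same color and that 367 is the class count for that color (it matches the purple count in the class summary):

```python
from fractions import Fraction

color_count = 367   # candies of the chosen color (assumed: purple)
total = 1670        # all candies in the class data set

# With replacement: the second draw sees the same bag contents.
with_replacement = Fraction(color_count, total) ** 2

# Without replacement: one candy of that color is gone before the second draw.
without_replacement = Fraction(color_count, total) * Fraction(color_count - 1, total - 1)

print(round(float(with_replacement), 4))
print(round(float(without_replacement), 4))
```

With 1,670 candies in the population, removing one candy barely changes the second probability, which is why the two answers agree to three decimal places.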

3.
a. Fixed n=10 p=.189
i. Independent (with replacement, selected randomly)
ii. 2 outcomes (yellow, not yellow)
b. P(4) = 10C4 (.189)^4 (1-.189)^6 = 210(.00127599)(.284528075) = .0762
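The binomial calculation in part b can be checked with a few lines of Python. This is a sketch using the values from the text (n = 10, p = .189, 4 yellow candies); `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.189           # 10 candies drawn, P(yellow) = .189
prob_four_yellow = binomial_pmf(4, n, p)

print(comb(10, 4))         # number of ways to choose which 4 draws are yellow
print(round(prob_four_yellow, 4))
```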

4.
a. n = 32, Mean = 59.6, Std Dev = 1.75
μ = 59.6
Spread = 1.75/sqrt(32) = .309
Shape = approximately normal since n > 30, so the Central Limit Theorem applies

b. z = (61.5 - 59.6)/.309 = 6.15

1 - .9999 = .0001
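The sampling-distribution calculation can be sketched as follows, using the values from the text and the standard library's `NormalDist`. Two things differ slightly from the hand calculation: carrying the standard error at full precision gives a z-score near 6.14 rather than 6.15, and the exact tail area is far smaller than .0001, which is simply the smallest value a printed z-table can show. The conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 59.6, 1.75, 32   # values from part a
sample_mean = 61.5              # value used in part b

standard_error = sigma / sqrt(n)
z = (sample_mean - mu) / standard_error

# Area to the right of z under the standard normal curve.
right_tail = 1 - NormalDist().cdf(z)

print(round(standard_error, 3))
print(round(z, 2))
print(right_tail < 0.0001)
```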

See PDF file for part 5

Part 6: Hypothesis Test


In statistics, a hypothesis is a claim or statement about a property of a
population. A hypothesis test (or test of significance) is a procedure for
testing a claim about a property of a population.
Claim 1: 20% of all Skittles candies are red.
H0: p = 0.20 (the claim)
H1: p ≠ 0.20
x = 309 (total red candies)
n = 1670

STAT, TESTS, 1-PropZTest: p0 = .20, x = 309, n = 1670, prop ≠ p0, Calculate

z = -1.529, p-value = .1262
p-hat = .185
The p-value .1262 is greater than the significance level .05.

We fail to reject the null hypothesis. There is insufficient evidence to
warrant rejection of the claim that 20% of all Skittles candies are red.
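The calculator output for the first claim can be reproduced by hand. The sketch below is a generic two-tailed one-proportion z-test written with the standard library, not the calculator's own routine; the inputs (p = .20, x = 309, n = 1670, significance level .05) are the ones given in the text.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.20        # claimed proportion of red candies
x, n = 309, 1670 # red candies and total candies in the class data
alpha = 0.05

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed test: double the area beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))
reject_null = p_value < alpha

print(round(p_hat, 3), round(z, 3), round(p_value, 4), reject_null)
```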

Claim 2: The mean number of Skittles in a 2.17 oz. bag is more than
55 (μ > 55).
H0: μ = 55
H1: μ > 55 (the claim)
One-tailed test
α = 0.01
n = 28
x̄ = 59.6
μ = 55
DF = 27 (28 - 1)
t = 13.909
Critical value: t = 2.473

There is sufficient evidence to support the claim. The test statistic 13.9 is
very far to the right of the critical value 2.473, meaning that the sample mean
is much greater than we would expect if μ were 55, so we conclude that μ is
greater than 55. This supports the claim.
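The test statistic for the second claim can be recomputed as a sketch. It assumes the rounded class values x̄ = 59.6 and s = 1.75, which reproduce the reported t of 13.909; the critical value 2.473 is taken from the text rather than computed, since the standard library has no t-distribution.

```python
from math import sqrt

x_bar, s, n = 59.6, 1.75, 28   # rounded class mean, std. dev., and number of bags
mu0 = 55                       # hypothesized mean under H0
t_critical = 2.473             # from the text (df = 27, one tail, significance .01)

t_stat = (x_bar - mu0) / (s / sqrt(n))
supports_claim = t_stat > t_critical   # right-tailed test

print(round(t_stat, 3), supports_claim)
```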
The requirements for using the normal distribution to test hypotheses about a
population proportion are:
1. The sample is a simple random sample (SRS).
2. The conditions for a binomial distribution are satisfied:
   a fixed number of trials (n = 1670);
   independent trials (5% guideline);
   2 categories (red or not red, with x = 309 red candies);
   the probability of success remains the same in all trials.
3. The conditions np ≥ 5 and nq ≥ 5 are both met:
   1670(.20) = 334 and 1670(.80) = 1336.

The requirements for testing claims about a population mean when the population
standard deviation is unknown are: 1. the sample is an SRS; 2. either or both of
these conditions are satisfied: the population is normally distributed or n > 30.
From looking at the data, it seems that our class data met the requirements needed
for these types of problems. My individual bag of Skittles wasn't able to meet all
of the requirements.
Part 7: Correlation and Regression (1% of course grade)
1. I think the results will show no relationship between the number of candies in a bag
and the height of the person who purchased the bag, because there is no sensible
reason for the two to be correlated.
2. Explanatory Variable= Height Response Variable= # of candies in a bag.

3. There is not a significant linear correlation between the two variables, because the
correlation coefficient is less than the critical value of r.
a. Correlation coefficient r = .2438 ≈ .244; critical value of r = .602; .244 < .602
b. Thus we fail to reject H0, and the outcome is what I expected/predicted
before analyzing the data.
4. The regression equation:
ŷ = 54.77 + .084x
5. Since no significant correlation was found in step 3, we use ŷ = ȳ.
a. ȳ = 60.0
ŷ = 60.0 candies in the bag.
b. 54.77 + .084(63.5) = 60.1. It isn't appropriate to use the regression equation to make
predictions about the number of candies, since the regression equation should be used
only if the linear correlation coefficient r indicates that there is a linear
correlation between the two variables, and step 3 gives us only r = .244.
6. Even given an assumed significant relationship between height and number of
candies per bag, it would be inappropriate to predict the number of candies in a bag
purchased by retired Houston Rockets player Yao Ming using his height of 90 inches
alone, because a height of 90 inches would be extrapolating beyond the scope of
our data.
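The decision rule used in steps 3 through 5 can be summarized in a short sketch: use the regression line only when |r| exceeds the critical value, and otherwise fall back to the mean of the response variable. The numbers are the ones reported above; the function name `best_prediction` is illustrative.

```python
R = 0.244                 # linear correlation coefficient from step 3
R_CRITICAL = 0.602        # critical value of r from step 3
INTERCEPT, SLOPE = 54.77, 0.084
Y_BAR = 60.0              # mean number of candies per bag

def regression_line(height_inches):
    """Value of the fitted line, whether or not it is appropriate to use."""
    return INTERCEPT + SLOPE * height_inches

def best_prediction(height_inches):
    """Use the line only if the correlation is significant; otherwise use the mean."""
    if abs(R) > R_CRITICAL:
        return regression_line(height_inches)
    return Y_BAR

print(round(regression_line(63.5), 1))   # what the line alone would give
print(best_prediction(63.5))             # what we actually report
print(best_prediction(90))               # same answer at any height
```

Because the correlation is not significant, the best prediction is the same at every height. Even with a significant correlation, a height of 90 inches would lie outside the range of the sample data, so the line should not be used there.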
