Eportfolio

Part One: Collect Data

We counted the Skittles candies, organized the counts for the entire class, and entered them
into a shared document so that other students could use the data.
Part Two: Analyzing Skittles Data
1. Determine the proportion of each color within the overall sample gathered by the class.
a. My guess is that there will be more yellow candies than any other color. Orange and
green will be tied for the second largest group. Red will be second to last, and purple
will have the fewest candies.

Color of Candy   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
Red              1164        20.6%                1164                   20.6%
Orange           1117        19.8%                2281                   40.4%
Yellow           1189        21.0%                3470                   61.4%
Green            1087        19.2%                4557                   80.6%
Purple           1093        19.3%                5650                   99.9%
Total            5650        99.9%                5650                   99.9%
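
The table columns can be recomputed from the color counts alone. The following is a minimal Python sketch of that calculation; the function name `frequency_table` is only illustrative and is not part of the original StatCrunch work. Because it works from unrounded counts, its cumulative percentages end at 100.0% rather than the 99.9% that results from adding rounded values.

```python
# Class color counts, taken from the table above.
counts = {"Red": 1164, "Orange": 1117, "Yellow": 1189, "Green": 1087, "Purple": 1093}


def frequency_table(counts):
    """Return rows of (color, frequency, relative %, cumulative frequency, cumulative %)."""
    total = sum(counts.values())
    rows = []
    running = 0
    for color, freq in counts.items():
        running += freq  # cumulative count up to and including this color
        rows.append((color, freq, 100 * freq / total, running, 100 * running / total))
    return rows


for color, freq, rel, cum, cum_rel in frequency_table(counts):
    print(f"{color:<8}{freq:>6}{rel:>8.1f}%{cum:>7}{cum_rel:>8.1f}%")
```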

2. In StatCrunch, create a pie chart and a Pareto chart for the total number of candies of
each color in our class data set. Submit copies of your graphs in this report.

[Pie chart: Total Number of Candies of Each Color. Red 1164 (20.6%), Orange 1117 (19.8%),
Yellow 1189 (21.0%), Green 1087 (19.2%), Purple 1093 (19.3%)]

[Pareto chart: Total Number of Candies of Each Color. Horizontal axis: Skittle Color; vertical
axis: Total Amount. Yellow 1189, Red 1164, Orange 1117, Purple 1093, Green 1087]
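
A Pareto chart lists the categories from most to least frequent. As a small sketch in Python (not the StatCrunch procedure the assignment asks for), the ordering can be produced by sorting the class counts in descending order:

```python
# Class color counts from the class data set.
counts = {"Red": 1164, "Orange": 1117, "Yellow": 1189, "Green": 1087, "Purple": 1093}

# Sort by count, largest first, which is the bar order a Pareto chart uses.
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for color, freq in pareto_order:
    print(f"{color:<8}{freq}")
```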

3. A random sample is one in which every member of the population has a fair and equal
chance of being selected. Our class used 93 bags of 2.17-ounce Original Skittles; this is
our random sample. The population is all 2.17-ounce bags of Original Skittles.

When collecting my own data, I guessed that there would be an almost even amount of every
color of Skittles. When I collected my data, I saw that my guess was fairly accurate: there was
about the same number of each color. The colors I had the most of were purple and yellow, with
13 of each. The color I had the fewest of was red, with 9. There were no real surprises in the
graphs. The charts above do not show any one color being drastically more common than the
others.
In the pie chart, each color represents about 20% of the total. The Pareto chart displays the
same data as total counts for each color. The class data may contain some outliers. Some
students reported only 37 or 40 total Skittles in their bag. This seems unlikely, since the
average was around 60 (5650/93 ≈ 60) and I assume each bag of Skittles is weighed before being
sold. Some students reported 78-82 Skittles in their bag, which also seems unlikely. Data like
this can throw off our overall class values, so some colors could be overestimated or
underestimated. There is also some possible error because some students could have bought the
wrong size of bag or a variety other than Original.
The distribution of colors for the class total somewhat matches my own distribution. Yellow and
purple were the two colors I had the most of, and most students also had a lot of yellow but
not as much purple. Red was my lowest count, but the class total shows that other students had
average amounts of red in their bags.
Part Three: Organize, Display Quantitative Data

Mean 60.11
STD 17.06

Five Number Summary
Min 16
Q1 48
Median 58
Q3 71.5
Max 108

1. Write a paragraph discussing your findings about the variable “Total candies in each bag”.
Address the following in your writing:
i. What is the shape of the distribution of this variable?
The frequency diagram is bell shaped. This is reasonable since all the bags are weighed before
being shipped out. Since individual Skittles probably all weigh about the same, it makes sense
that most of the bags had similar totals.
ii. Do the graphs reflect what you expected to see or are there some surprises?
Overall there are no surprises. It was expected that each class member would find roughly the
same number of Skittles in each bag. Since each of us was told to get a specific weight and
type of Skittles bag, I guessed that we would all have about the same number. There are some
outliers, but those could be due to human error or input error.
iii. Does the overall data collected by the whole class agree with your own single bag data?
My individual bag did agree with the rest of the class. My bag had 58 candies in total, and the
mean was roughly 60. That makes mine a little below average but still very close to what
everyone else got for the total number of Skittles in a bag.
Include the number of candies from your own bag and the total number of bags in the class
sample in your discussion.
The total number of candies in my bag was 58. The total number of bags in the class sample was
97.
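
The report does not say how outliers were identified. One common convention, assumed here rather than taken from the report, is the 1.5 × IQR rule, which flags values below Q1 − 1.5·IQR or above Q3 + 1.5·IQR. A minimal Python sketch applied to the five-number summary reported above (the helper name `iqr_fences` is illustrative):

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences used by the 1.5 x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Five-number summary reported above for total candies per bag.
minimum, q1, median, q3, maximum = 16, 48, 58, 71.5, 108

lower_fence, upper_fence = iqr_fences(q1, q3)
print(f"fences: {lower_fence} to {upper_fence}")
print("minimum flagged:", minimum < lower_fence)
print("maximum flagged:", maximum > upper_fence)
```

Under this convention the reported maximum of 108 sits just above the upper fence, while the minimum of 16 does not fall below the lower one.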
In a half page, explain the difference between categorical and quantitative data.
Variables can be either categorical or quantitative.
Categorical variables are also known as qualitative data. This type of data can be arranged,
ordered, and put into many different groups. Categorical data deals with descriptions that are
observed rather than measured, and it often describes the quality of what is being observed.
Examples of qualitative data are gender, color, appearance, taste, and names of people.
Categorical data usually cannot be expressed as a number.
Quantitative data is information that can be measured, so the data is in numbers. Some
examples of quantitative data are length, age, cost, temperature, and weight. This type of data
can also be ordered in many different ways to give meaning to the data that it reflects.
Bar graphs are used to show quantitative data because they make it easy to compare quantities
next to each other by using the height of each bar. Pie charts are not as useful for
quantitative data because it is more difficult to compare the size of each category as a slice
than when the categories are lined up next to one another as in a bar graph. Other graphs used
for quantitative data include boxplots. Pie charts are the best for categorical data because
they help us easily see what percentage each group contains. They require all the data points
to be included in the graph. Bar graphs are not as useful because they can be more flexible
about which data is displayed, whereas pie charts must use all or none of the data.
For categorical data it is best to use the nominal level when no order can be determined and
the ordinal level when the categories can be ordered but the differences between them are not
meaningful (A, B, C). For quantitative data, use the interval level when there is no meaningful
zero (IQ scores) and the ratio level when there is a meaningful zero (weight).

Part 4: Confidence Interval Estimate
Question 1.
99% CIE for the population proportion of yellow candies

x = 1189 yellow candies
n = 5650 pieces of candy
$\hat{p} = \frac{x}{n} = \frac{1189}{5650} = 0.2104424$
$\alpha = 1 - 0.99 = 0.01$
$\frac{\alpha}{2} = \frac{0.01}{2} = 0.005$
$Z_{\alpha/2} = \text{invNorm}(0.01/2, 0, 1) = -2.57582$ (the magnitude, 2.57582, is used in E)
$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 2.57582 \sqrt{\frac{0.2104(1 - 0.2104)}{5650}} = 0.013968$

We are 99% confident that the true proportion of yellow candies falls between 0.1964714 and
0.2244024.
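
The same interval can be checked outside the calculator. The sketch below uses Python's standard library, with `NormalDist().inv_cdf` standing in for `invNorm`; the function name `proportion_ci` is illustrative. Small differences in the last few decimal places from the figures above are possible, depending on how intermediate values were rounded.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence):
    """Return (lower, upper, margin) for a normal-approximation CI on a proportion."""
    p_hat = x / n
    alpha = 1 - confidence
    # Positive critical value with area alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


low, high, margin = proportion_ci(1189, 5650, 0.99)
print(f"E = {margin:.6f}")
print(f"{low:.4f} < p < {high:.4f}")
```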

Question 2.
95% CIE for the true mean number of candies per bag

$\bar{x} = 60.1$
$s = 5.556$
$n = 94$
$df = n - 1 = 93$
$\alpha = 0.05$
$t_{\alpha/2} = 1.986$
$E = t_{\alpha/2} \left( \frac{s}{\sqrt{n}} \right) = 1.986 \left( \frac{5.556}{\sqrt{94}} \right) = 1.379$
Confidence interval: $\bar{x} - E < \mu < \bar{x} + E$, that is, $60.1 - 1.379 < \mu < 60.1 + 1.379$, or
$58.7 < \mu < 61.5$

We can be 95% confident that the true population mean falls within the range of 58.7 to 61.5.
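
The margin of error for the mean can be recomputed from the listed inputs. The sketch below is plain Python and takes the critical value 1.986 from the report rather than looking it up, since the standard library has no t distribution; the function name `mean_ci` is illustrative. With the inputs listed above (s = 5.556, n = 94), the sketch gives a margin of about 1.14, which is smaller than the 1.379 reported. The report does not show enough of the calculation to say where the difference arises, so the margin is worth rechecking against the original data.

```python
from math import sqrt


def mean_ci(x_bar, s, n, t_crit):
    """Return (lower, upper, margin) for a t-based CI on a mean, given the critical value."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Inputs as listed in the report; t_crit is the report's table value for df = 93.
low, high, margin = mean_ci(x_bar=60.1, s=5.556, n=94, t_crit=1.986)
print(f"E = {margin:.3f}")
print(f"{low:.1f} < mu < {high:.1f}")
```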

Question 3.
98% CIE for the population standard deviation of the number of candies per bag
$n = 94$
$df = n - 1 = 93$
$s = 5.556$
$s^2 = 30.869$
$\alpha = 0.02$
$\frac{\alpha}{2} = 0.01$
Area to the right of $\chi^2_L$: $1 - 0.01 = 0.99$
Area to the right of $\chi^2_R$: $0.01$

Chart A-4
$\chi^2_R$ (area 0.01) $= 124.116$
$\chi^2_L$ (area 0.99) $= 61.754$

$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$, which gives $4.809 < \sigma < 6.818$

We are 98% confident that the population standard deviation falls between 4.809 and 6.818.
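
The interval for the standard deviation follows directly from the two chi-square critical values. The sketch below is plain Python and takes those values from the report's table lookup rather than computing them; the function name `sigma_ci` is illustrative. The two critical values appear to correspond to the df = 90 row of a standard chi-square table, the nearest row to df = 93, so software that uses df = 93 exactly would give slightly different limits.

```python
from math import sqrt


def sigma_ci(s, n, chi2_left, chi2_right):
    """Return (lower, upper) CI limits for a population standard deviation."""
    df = n - 1
    lower = sqrt(df * s**2 / chi2_right)  # larger critical value gives the lower limit
    upper = sqrt(df * s**2 / chi2_left)   # smaller critical value gives the upper limit
    return lower, upper


# Critical values as read from the table in the report.
low, high = sigma_ci(s=5.556, n=94, chi2_left=61.754, chi2_right=124.116)
print(f"{low:.3f} < sigma < {high:.3f}")
```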

Question 4.
We are 99% confident that the true proportion of yellow candies falls between 0.1964714 and
0.2244024. For the true mean number of candies per bag, we are 95% confident that the
population mean falls within the range of 58.7 to 61.5. For the population standard deviation,
we are 98% confident that it falls between 4.809 and 6.818.

Part 5: Reflection
Leland Black, Term Project Part 5 Reflection
Who knew you could learn so much from a bag of Skittles? This semester, our term project had
us applying what we learned in class. We learned how to collect, analyze, organize, and display
the data that we collected, and how to create confidence intervals from it. By the end of the
semester, I knew more than I ever wanted to know about Skittles (next time maybe do M&M’s for
the chocolate fans out there). Working with this statistical data on Skittles helped me
understand the concepts and apply them to material I was learning in school, and it will help
me in my future career and everyday life.
This class project has already helped me better understand a class that I am currently taking,
a Human Physiology Lab. This lab requires us to make scatter plots and bar graphs, compute
r-squared values, run t-tests, and interpret the p-values. Each lab report has us report a
p-value and interpret whether our hypothesis was supported or rejected. Learning about these
for the last exam has helped me a lot! This class project will also help me in my future
career. I plan on being a physical therapist, and I have worked at an outpatient clinic for a
couple of years now. Physical therapy is based on continuing clinical research to help find and
prove the best methods for recovery. These studies often collect and interpret data using
statistical analysis. Being able to interpret and understand the data will help me become a
better physical therapist.
This class project will also help me in my everyday life. I learned that data can sometimes be
skewed or seem more persuasive than it actually is. I now know what to look for in a research
article or a fitness magazine: any new claim had better have some strong statistical evidence
to support it. Just the other day I was buying toothpaste. On the box it said, “Nine out of 10
dentists recommend Crest.” This statement got me thinking about where this claim came from,
where the data is, and who these nine dentists are anyway. These types of claims are EVERYWHERE
in our society today. Statistics play an important role in marketing and can influence what we
purchase. By participating in this class project, I have applied statistics to my everyday life
and will hopefully become a smarter consumer.
What’s in a bag of Skittles? Quite a lot, as it turns out. This semester we opened up almost
100 bags of Skittles and applied what we learned in class. By applying different tests, we were
able to learn a lot that has already helped me in my school work and will continue to help me
in my future as a physical therapist and as a daily consumer.
