
Karina Cruz

Stats 1040
April 30, 2015
Introduction:
In my statistics course, I built up my term project lesson by lesson, applying each concept I
learned in class to a class experiment: the Skittles project. The term project was fun and helped
me understand many concepts throughout the course. The data that my teammates and I
gathered is shown below.
Group Data
Red (0.20): we chose only 0.20 because there always seem to be hardly any red Skittles in a bag.
Orange (0.40): orange was the most popular color in our group.
Yellow (0.12): yellow seemed to be the least represented color.
Green (0.15): green seemed to be the third most popular, judging from our group data.
Purple (0.13): purple gets 0.13 because that is what remains of 100%.

Question 3)

The Skittles data is a random sample because the count for each color differs from bag to bag.
In the Excel table provided to us there are 48 bags in total, each bag has a different number of
Skittles of each color, and each color's count in a bag falls between 1 and 20. The data is random
because no one knows in advance how many Skittles of each color a bag contains. Our group
could not really describe the population, because the data came only from our class, and I do not
see how that can represent a population. If the population were the class itself, we could use a
sample from the class to estimate it. In the pie chart, the percentages fall between about 18% and
21%. The data shows that red is the most common color, and the counts decrease from red
through orange to yellow, the least common. The pie chart also shows similar proportions for
every color: without the percentage labels the slices would look equal, and the labels confirm
that they are close. The data also shows a total of 1232 Skittles: 260 red, 259 orange, 229 yellow,
243 green, and 241 purple.
Individual Data
Frequency table results for my Skittles bag:
Count = 65

Color     Frequency   Relative Frequency
green     10          0.15384615
orange    16          0.24615385
purple    12          0.18461538
red       14          0.21538462
yellow    13          0.2

Frequency table results for the whole class:
Count = 1232

Color     Frequency
green     243
orange    259
purple    241
red       260
yellow    229

Total: 48 bags of Skittles
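
The relative frequencies for both tables can be recomputed from the counts. The following is a minimal Python sketch; the dictionary and function names are chosen here only for illustration and are not part of the StatCrunch output.

```python
# Counts copied from the two frequency tables above.
my_bag = {"green": 10, "orange": 16, "purple": 12, "red": 14, "yellow": 13}
class_bags = {"green": 243, "orange": 259, "purple": 241, "red": 260, "yellow": 229}


def relative_frequencies(counts):
    """Return each color's share of the total count."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


my_rel = relative_frequencies(my_bag)
class_rel = relative_frequencies(class_bags)

# Print the two sets of proportions side by side for comparison.
for color in my_bag:
    print(f"{color:<7} my bag: {my_rel[color]:.3f}   class: {class_rel[color]:.3f}")
```

For the class, every color comes out between roughly 18.6% and 21.1%, which is why the slices of the class pie chart look nearly equal.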

(Question 2)
In my bag of Skittles, the most common color is orange and the second most common is red. My
bag held 65 Skittles in total. For the class, there are 1232 Skittles in 48 bags; the most common
color is red with 260, followed by orange with 259. What the two data sets have in common is
that red and orange are the top two colors by frequency. The tables reflect what I expected, since
red and orange are the most common. However, the distribution of colors for the whole class
does not match the data from my single bag of candies. The class count for each color is much
higher than mine because I had only one bag and the class had 48 bags of Skittles. The graphs do
not look similar when the colors are put in the same order: red, orange, yellow, green, and purple.
Setting the order aside, though, both graphs decrease through approximately the same heights.
The graphs reflect what I expected: a decreasing order, with one color always higher than the
rest.
Group Data
Mean: 60.3
Standard deviation: 9.0
Min 35, Q1 57, median 59.5, Q3 64, max 86
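
One way to read the five-number summary is to check for unusually small or large bags. The sketch below applies the common 1.5 x IQR rule to the reported values; this rule is an assumption chosen for illustration and may not be the method our group used to decide which rows to remove.

```python
# Five-number summary reported for the class data (candies per bag).
minimum, q1, median, q3, maximum = 35, 57, 59.5, 64, 86

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # values below this are flagged as possible outliers
upper_fence = q3 + 1.5 * iqr    # values above this are flagged as possible outliers

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print("minimum flagged:", minimum < lower_fence)
print("maximum flagged:", maximum > upper_fence)
```

Under this rule both the smallest and the largest bag fall outside the fences, so a boxplot would typically draw them as separate points.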

Individual Data
My finding about the variable "total candies in a bag" is that one observation made the
histogram misleading: that bag was incomplete, with only 26 candies. To avoid this problem, we
removed the bag with 26 candies before constructing the frequency histogram for total candies
per bag. I followed the same process when constructing the boxplot. The histogram was
bell-shaped. The graphs reflect what I expected, because everyone should have had a similar
number of Skittles in each bag, which makes the graph bell-shaped, with a few bags holding
fewer candies. That was also expected, since there is usually, though not always, someone who
buys the wrong size of bag. My bag had 65 candies, while the class total, including my bag but
excluding the misleading one, was 1206 candies, so there is obviously a huge difference between
my bag and the class sample. My graph and the class graph for total candies per bag start
differently and end at different heights in the frequency histogram. One way the class data agrees
with the data from my own bag is that the candy counts span about the same range. It also looks
as if, had my graph extended all the way to 90 on the x-axis, my bar would have been in the
center, just like most of the class data.

Question 2
The difference between categorical data and quantitative data is that categorical data organizes
observations into groups with similar characteristics. This matters especially when constructing
a graph. For example, if we build a bar graph of how many hands, faces, feet, and legs are in a
classroom, the body parts are the categories and go on the horizontal x-axis, while the count of
each body part is the frequency (a quantitative value) and goes on the vertical y-axis.
Categorical data is also called qualitative data, and its categories may or may not have a logical
order. Some examples of categorical variables are car model, gender, class, and pass or fail. One
important example is ZIP codes: they are written as numbers and are often mistaken for
quantitative data, but they are labels, so they are categorical. A pie chart is another good way to
put data into similar groups and to see the percentage for each body part in the class list, which
shows which one ranks highest. Quantitative data also appears in a bar graph, as the heights of
the bars. It consists of observations of a quantitative variable, meaning a numerical measure of
items or individuals that can be added and subtracted to give a meaningful result. Another graph
for quantitative data is the dot plot. Examples of quantitative data are weight in pounds, number
of customers, time in seconds, and the frequencies counted in a survey, where the survey topic
is the categorical variable. Categorical data consists of attributes and labels, so the calculations
that make sense are counts, proportions, and the most common category (the mode).
Quantitative data consists of numerical measures or counts, so it makes sense to find a measure
of center or to examine the spread of the data. Calculations that would not make sense for
categorical data are the ones we use for quantitative data, such as the mean or standard
deviation of the labels.
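
The difference in which calculations make sense can be sketched in Python. The color labels are rebuilt from the counts in my bag; the list of bag totals is hypothetical, made up only to illustrate a quantitative variable.

```python
from statistics import mean, median, mode, stdev

# Categorical data: one color label per candy, rebuilt from my bag's counts.
counts = {"green": 10, "orange": 16, "purple": 12, "red": 14, "yellow": 13}
colors = [color for color, n in counts.items() for _ in range(n)]

# The mode (most common category) is a summary that makes sense for labels.
print("most common color:", mode(colors))

# Quantitative data: hypothetical candies-per-bag totals, invented for illustration.
bag_totals = [57, 59, 60, 64, 65]

# Measures of center and spread make sense for numerical values.
print("mean:", mean(bag_totals))
print("median:", median(bag_totals))
print("standard deviation:", round(stdev(bag_totals), 2))
```

A mean of the color labels has no meaning, because labels such as "red" and "green" cannot be added together.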

Group Data
1) We are 99% confident that the population proportion of yellow candies is between
0.16 and 0.22.
2) We are 95% confident that the mean number of candies per bag in the 1040 math class is
between 58.26 and 65 candies per bag.
3) We are 99% confident that the standard deviation of candies per bag is between 5.03
and 11.20 candies per bag.
Note: in our calculations we removed rows 17 and 22 because of misleading data.
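
As a rough check on the first interval, a confidence interval for a proportion can be sketched with the normal approximation. This sketch assumes the full class counts (229 yellow out of 1232 candies); our group removed rows 17 and 22 before calculating, so the result is not expected to match the reported interval exactly. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n                                  # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)             # margin of error
    return p_hat - margin, p_hat + margin


# Assumption: all 1232 class candies are used, 229 of them yellow.
low, high = proportion_ci(229, 1232, 0.99)
print(f"99% CI for the yellow proportion: ({low:.3f}, {high:.3f})")
```

The interval is the point estimate plus or minus the margin of error, so a higher confidence level or a smaller sample gives a wider interval.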
Individual Data
A confidence interval is "the range of values so defined that there is a specified probability that
the value of the parameter lies within it" (Wikipedia). A confidence interval for a parameter is
built from the point estimate and the margin of error, which give the lower and upper bounds,
and it is reported with a confidence level. Confidence intervals are interpreted differently
depending on whether we want to estimate the population proportion, the population mean, or
the population standard deviation. Each estimate has a level of confidence. For example, a 95%
level of confidence does not mean there is a 95% probability that the parameter lies between
the lower and upper bound of one particular interval; it tells us that the interval will include the
unknown parameter in 95% of all samples (Interactive Statistics: Informed Decisions Using
Data). The purpose of the confidence interval is to give us a range of values for our estimated
population parameter rather than a single value or point estimate. The estimated confidence
interval gives us a range of values within which we believe, with a level of confidence, that the
true population value falls (McGraw-Hill Higher Education).
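
The "95% of all samples" interpretation can be illustrated with a small simulation. This is only a sketch: the true proportion, sample size, and number of trials are hypothetical values chosen for illustration, not figures from the project.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1040)                   # fixed seed so the run is repeatable

TRUE_P = 0.20                       # hypothetical true proportion
N = 200                             # hypothetical sample size
TRIALS = 2000                       # number of simulated samples
Z = NormalDist().inv_cdf(0.975)     # critical value for 95% confidence

covered = 0
for _ in range(TRIALS):
    # Draw one sample and build its normal-approximation interval.
    hits = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = hits / N
    margin = Z * sqrt(p_hat * (1 - p_hat) / N)
    # Count the sample if its interval contains the true proportion.
    if p_hat - margin <= TRUE_P <= p_hat + margin:
        covered += 1

coverage = covered / TRIALS
print(f"share of intervals containing the true proportion: {coverage:.3f}")
```

Each simulated sample gives a different interval, and the share of intervals that contain the true proportion should come out close to the confidence level.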

Cited page:
"Confidence Interval." Wikipedia. Wikimedia Foundation, n.d. Web. 23 Apr. 2015.
Woodbury, George. "Section 9.1: Estimating a Population Proportion." Interactive Statistics:
Informed Decisions Using Data. By Michael Sullivan. Boston: Pearson Education, 2016. 188-95. Print.
"Point Estimation and Confidence Intervals." Table of Contents for the Solution Manual to
Accompany Statistical Methods for Criminology and Criminal Justice. McGraw-Hill Higher
Education, 2001. Web. 23 Apr. 2015.

Conclusion
What I learned in Stats 1040 includes hypothesis tests regarding a parameter, discrete probability
distributions, probability, data collection, and the relation between two variables. Most
importantly, I learned how to use a graphing calculator and how to read the formulas for
estimating a population parameter, probability, distributions, linear correlation and regression,
hypothesis testing, and normal and sampling distributions. The mathematics and statistics skills
that I applied in this project will help me in my career. Working in a group taught me to work as
a team to overcome problems: when I did not know how to solve some of the questions, my
teammates helped me. It also gave me teaching skills, because on the questions I knew well and
my teammates did not, I learned how to explain the question and how to interpret the problem.
Specific parts of the project included building all types of graphs and labeling them correctly
using StatCrunch. The project helped me develop problem-solving skills by teaching me to pick
out the useful words in a problem and to write out everything needed to solve it. This class
changed my perspective on real-world applications of statistics by showing me that statistics
applies to everything we do and is everywhere we go. There are always graphs and
measurements of people's opinions on topics. Statistics is always involved in political issues and
votes, because it is useful to know the summary of an election, and that is what statistics does: it
gives you measurements and percentages. It is even seen in college, in grade point averages.
Statistics is seen everywhere.
