Data Collection

Each student in the class will purchase one 2.17-ounce bag of Original Skittles and record the following data:

Number of       Number of     Number of        Number of        Number of       Number of
total candies   red candies   orange candies   yellow candies   green candies   purple candies
54              6             13               14               10              11

The pie chart shows the total number of candies of each color for the whole class: red was the most common color and yellow the least common. My own bag was the opposite: yellow was the most common color and red the least common.

Whole class:

Colors   Red   Orange   Yellow   Green   Purple
Sum      360   315      294      313     319
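The class totals can be turned into sample proportions, which is what the pie chart displays. The short sketch below uses only the five sums listed above; the variable names are illustrative and not part of the original project.

```python
# Class totals by color, copied from the class table above.
class_counts = {"Red": 360, "Orange": 315, "Yellow": 294, "Green": 313, "Purple": 319}

# Total number of candies counted by the whole class.
total = sum(class_counts.values())

# Share of each color in the class sample.
proportions = {color: count / total for color, count in class_counts.items()}

for color, share in proportions.items():
    print(f"{color:<7}{class_counts[color]:>5}{share:>8.3f}")
print(f"{'Total':<7}{total:>5}")
```

The total of the five sums is the sample size used for the proportion estimates later in the report.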

My bag:

Colors   Red   Orange   Yellow   Green   Purple
Amount   6     13       14       10      11

The graph reflects what I expected to see. However, the data collected by the whole class does not agree with the data from my single bag: the most common and least common colors are reversed.

Quantitative Data

Summary statistics:

Column           n    Mean   Std. dev.   Median   Min   Max   Q1   Q3
Total Skittles   27   59.3   2.72        59       54    66    58   61

The data appear to be approximately normally distributed (bell-shaped). The total number of candies per bag is not too far from the average, and the mean and median are close to each other. This reflects what I expected to see.

Categorical (qualitative) data describes a characteristic and does not represent a measurement. In this project, the qualitative data is the color of the Skittles, which is best represented by a pie chart or a Pareto chart. Quantitative data represents a measurement or count. The types of graph that make sense for this data are the histogram and the boxplot.

A confidence interval (sometimes abbreviated as CI) is a range of values used to estimate the true value of a population parameter. A confidence interval can be constructed for any confidence level; the level chosen depends on how confident a person wants to be that the true value lies within the interval.

1. Construct a 99% confidence interval estimate for the true proportion of yellow candies.

n = sample size (total number of candies from the entire class)
x = number of yellow candies in the sample
p hat = sample proportion (x / n)
p = population proportion
alpha = 0.01 because the confidence level is 99% (alpha = 1 - confidence level)
E = margin of error (the maximum likely difference, with probability 0.99, between the observed sample proportion p hat and the true value of the population proportion p)

The result: we are 99% confident that the interval from 0.159 to 0.209 actually contains the true value of the population proportion p.
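The report does not show the calculation behind this interval. The sketch below assumes the usual normal-approximation interval, p hat ± z * sqrt(p hat * (1 - p hat) / n), with n taken as the sum of the five class color totals (1601) and x as the class yellow total (294). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 1601           # total candies counted by the class (sum of the five color totals)
x = 294            # yellow candies counted by the class
confidence = 0.99  # confidence level for the interval

p_hat = x / n                                 # sample proportion
alpha = 1 - confidence                        # total area left in the two tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value z(alpha/2)

# Margin of error under the normal approximation (assumed method).
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.4f}, E = {margin:.4f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the bounds round to 0.159 and 0.209, which agrees with the interval reported above.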

2. Construct a 95% confidence interval estimate for the true mean number of candies per bag.

n = number of sample values
mu = population mean
x bar = sample mean
E = margin of error
s = standard deviation of the sample

The result: we are 95% confident that the interval from 58.23 to 60.37 actually contains the true value of the population mean mu.
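A minimal sketch of this interval, assuming the standard t-based form x bar ± t * s / sqrt(n) because the population standard deviation is unknown. The inputs are the rounded summary statistics from the table, and the critical value 2.056 is taken from a t table for 26 degrees of freedom rather than computed, so treat it as an assumed input.

```python
from math import sqrt

n = 27          # number of bags in the class sample
x_bar = 59.3    # sample mean (rounded, from the summary statistics)
s = 2.72        # sample standard deviation (rounded, from the summary statistics)
t_crit = 2.056  # assumed t-table value: 26 degrees of freedom, 0.025 in each tail

# Margin of error and interval bounds for the population mean.
margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"E = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Because the inputs are rounded, the bounds can differ from the reported 58.23 and 60.37 by about 0.01.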

3. Construct a 98% confidence interval estimate for the standard deviation of the number of candies per bag.

n = number of sample values
s = standard deviation of the sample
chi-squared left = left-tailed critical value of chi-squared
chi-squared right = right-tailed critical value of chi-squared

The result: we are 98% confident that the interval from 2.049 to 3.964 actually contains the true value of the population standard deviation.
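The interval for the standard deviation can be sketched with the usual chi-squared form, sqrt((n - 1) * s^2 / chi-squared right) to sqrt((n - 1) * s^2 / chi-squared left). The two critical values below are assumed chi-squared table entries for 26 degrees of freedom, and s is the rounded value from the summary statistics.

```python
from math import sqrt

n = 27               # number of bags in the class sample
s = 2.72             # sample standard deviation (rounded)
df = n - 1           # degrees of freedom
chi2_right = 45.642  # assumed table value: df = 26, area 0.01 in the right tail
chi2_left = 12.198   # assumed table value: df = 26, area 0.01 in the left tail

# The larger critical value produces the lower bound, and vice versa.
lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)

print(f"98% CI for the standard deviation: ({lower:.3f}, {upper:.3f})")
```

With the rounded s the bounds land within about 0.01 of the reported 2.049 and 3.964; the remaining gap presumably comes from rounding s.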

A hypothesis is a claim or statement about a property of a population. A hypothesis test is a procedure for testing a claim about a property of a population.

First, identify the null hypothesis and the alternative hypothesis, and express both in symbolic form. The null hypothesis indicates no change, and the alternative hypothesis indicates that there is a change. Once the alternative hypothesis is determined, we can tell whether the test is one-tailed or two-tailed, which shows how alpha is distributed between the tails. Then calculate the test statistic from the claim and the sample data, using the sampling distribution that is relevant. Finally, either find the P-value or identify the critical value, and state a conclusion about the claim in simple, non-technical terms.

1. Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

I used the classical method, which compares the test statistic with the critical value. If the test statistic is more extreme than the critical value, we reject the null hypothesis, which here means rejecting the claim. If the test statistic is less extreme than the critical value, we fail to reject the null hypothesis; there is then not enough evidence to reject the claim, but this does not support the alternative hypothesis. In this case, the test statistic is greater than the critical value, so we reject the claim that 20% of all Skittles candies are red.
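The report gives the decision but not the numbers. The sketch below assumes a two-tailed one-proportion z test with H0: p = 0.20 and H1: p ≠ 0.20, using the class totals (360 red out of 1601 candies). The hypotheses and variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 1601      # total candies counted by the class
x = 360       # red candies counted by the class
p0 = 0.20     # claimed population proportion (null hypothesis)
alpha = 0.05  # significance level

p_hat = x / n
# Test statistic: the standard error uses the claimed proportion p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# Two-tailed test, so alpha is split evenly between the tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic falls beyond the critical value, which is consistent with the conclusion above.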

2. Use a 0.01 significance level to test the claim that the mean number of candies in a bag of Skittles is 55.

I used the classical method, which compares the test statistic with the critical value. In this case, the test statistic is far beyond the critical value, so we reject the claim that the mean number of candies in a bag of Skittles is 55.
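A minimal sketch of this test, assuming a two-tailed one-sample t test with H0: mu = 55 and H1: mu ≠ 55. The inputs are the rounded summary statistics, and the critical value 2.779 is an assumed t-table entry for 26 degrees of freedom with 0.005 in each tail.

```python
from math import sqrt

n = 27          # number of bags in the class sample
x_bar = 59.3    # sample mean (rounded)
s = 2.72        # sample standard deviation (rounded)
mu0 = 55        # claimed population mean (null hypothesis)
t_crit = 2.779  # assumed t-table value: 26 degrees of freedom, 0.005 in each tail

# Test statistic: distance of the sample mean from the claim, in standard errors.
t = (x_bar - mu0) / (s / sqrt(n))

reject_h0 = abs(t) > t_crit
print(f"t = {t:.2f}, critical value = ±{t_crit}, reject H0: {reject_h0}")
```

The statistic comes out several times larger than the critical value, which matches the description of it being far beyond the critical value.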

Reflection

The conditions for constructing interval estimates are that the data come from a normally distributed population, that the sample is a simple random sample, and that the observations are independent. Based on the Skittles data collected in our class, we met these conditions. Possible errors include miscounting the number of candies of each color; for example, students may have counted the candies in their bags in different ways. Through these statistical methods, we learned that a confidence interval gives a range of values that is likely to contain the true value of a parameter, and that a hypothesis test is a procedure for testing a claim about a property of a population.

