
Chris Lee and Eva Burns

Mr. Dunnahoo
AP Statistics
07 May, 2015

This project focuses on phone company carriers. Published statistical data on the
proportion of people in the U.S. using each phone company in 2014 can be found at
http://www.statista.com/statistics/199359/market-share-of-wireless-carriers-in-the-us-by-subscriptions/.
We are going to use the Chi Square Goodness of Fit Test to determine whether or not our
observed data are consistent with the published data. Our observed data consist of individuals
from our school, family, friends, friends of friends, and strangers whom we asked which cell
phone company they used in 2014.
To test whether or not our observed data are consistent with the published data, we will
perform a Chi Square Goodness of Fit Test. Before we perform this test, a few conditions must
be met: the Counted Data, Independence, Randomization, and Expected Cell Frequency
conditions. If any of these is not met, the results of the test may not be valid.
Counted Data: The data are counts for the categories of a categorical variable (phone
company).
Independence: The counts in the cells are independent of each other.
Randomization: The individuals who have been counted are a random sample from some
population. For our sample, we only surveyed people we knew, but for the purposes of this
project, we will continue cautiously with the test and take this convenience bias into
consideration when comparing the results.

Expected Cell Frequency: All of the expected cell frequencies are greater than or equal to 5.
Aside from the randomization concern noted above, these conditions are met, so we will proceed
with the Chi Square Goodness of Fit Test.

Phone Company        Observed    Expected
Verizon              16          25.84
AT&T                 23          25.84
T-Mobile + Others    33          12.16
Sprint                           12.16
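
The Expected Cell Frequency condition can be checked directly from the Expected column. The sketch below is only an illustration: it copies the four expected counts from the table, adds them up to recover the sample size they imply, and divides each count by that total to show the share it implies. It assumes the published shares sum to 1, and the variable names are made up for this example.

```python
# Expected counts copied from the table above.
expected = {
    "Verizon": 25.84,
    "AT&T": 25.84,
    "T-Mobile + Others": 12.16,
    "Sprint": 12.16,
}

# If the published shares sum to 1, the expected counts sum to the sample size.
sample_size = sum(expected.values())

# Share implied for each carrier: expected count divided by the sample size.
implied_share = {name: count / sample_size for name, count in expected.items()}

# Expected Cell Frequency condition: every expected count must be at least 5.
condition_met = all(count >= 5 for count in expected.values())

print(f"sample size implied by the table: {sample_size:.2f}")
for name, share in implied_share.items():
    print(f"{name}: implied share {share:.2%}")
print("all expected counts >= 5:", condition_met)
```

The smallest expected count is 12.16, so the condition holds with room to spare.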

To begin our test, we must come up with a null and an alternative hypothesis. Our null
hypothesis is H0: The observed distribution is consistent with the published distribution.
Our alternative hypothesis is Ha: The observed distribution is not consistent with the
published distribution. After stating our hypotheses, we must find the test statistic, degrees
of freedom, and p-value.
There are two ways that we could have found our test statistic:
1. Solve by hand with the formula: χ² = Σ [(Observed − Expected)² / Expected], summed over all
categories.
2. Solve with a calculator: χ² GOF (L1, L2, DoF), where L1 = Observed Data and L2 = Expected
Data. DoF = Degrees of Freedom = n − 1, where n is the number of categories.
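
The by-hand method can be sketched in a few lines of Python. One value here is an assumption: the observed count for Sprint does not appear in the table, so the sketch uses 4, which makes the observed counts total 76, the same as the expected counts. With that assumed value, the sum comes out close to 45.25.

```python
# Order of categories: Verizon, AT&T, T-Mobile + Others, Sprint.
# ASSUMPTION: the Sprint observed count (4) is not given in the table;
# it is chosen so that the observed counts total 76 like the expected counts.
observed = [16, 23, 33, 4]
expected = [25.84, 25.84, 12.16, 12.16]

# Contribution of each category: (Observed - Expected)^2 / Expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of the contributions.
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus 1.
degrees_of_freedom = len(observed) - 1

print(f"chi-square = {chi_square:.2f}, DoF = {degrees_of_freedom}")
```

Printing the individual contributions shows which category drives the result. Here the T-Mobile + Others term is by far the largest.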
After finding our test statistic, we must find the p-value, which decides whether we reject or
fail to reject the null hypothesis. To find the p-value, a calculator is needed. What we put in
the calculator is χ²cdf(χ², ∞, DoF), where χ² is the test statistic and DoF is the degrees of
freedom for our test.
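
The upper-tail area can also be computed without a calculator. The sketch below is a stand-in for the calculator's cdf function and works only for 3 degrees of freedom, where the tail area has the closed form erfc(√(x/2)) + √(2x/π)·e^(−x/2). The function name is made up for this example, and the input 45.25 is the test statistic this report arrives at. A calculator integrates numerically, so its answer may differ slightly in the later digits.

```python
import math


def chi_square_upper_tail_df3(x):
    """Area to the right of x under the chi-square curve with 3 degrees of freedom.

    Closed form valid for DoF = 3 only (hypothetical helper for illustration).
    """
    if x <= 0:
        return 1.0
    tail_df1 = math.erfc(math.sqrt(x / 2))                       # tail area for DoF = 1
    correction = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)   # extra term for DoF = 3
    return tail_df1 + correction


p_value = chi_square_upper_tail_df3(45.25)
print(f"p-value = {p_value:.3e}")

# Compare against the usual significance levels.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha}: reject the null hypothesis? {p_value < alpha}")
```

As a sanity check, the same function gives about 0.05 at x = 7.815, the familiar 5% critical value for 3 degrees of freedom.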

We found our test statistic using the calculator. We entered χ² GOF (L1, L2, 3), and the
calculator reported that χ² = 45.25. We then found the p-value by entering
χ²cdf(45.25, 9999999, 3) into the calculator, which returned 8.025 × 10⁻¹⁰.
With a p-value of 8.025 × 10⁻¹⁰, we reject the null hypothesis because the p-value is smaller
than any common significance level (α = 0.01, 0.05, 0.10). This gives strong evidence that our
observed distribution is not consistent with the published distribution. Because of the
convenience bias in our data, we acknowledge that we may not have captured a representative
sample of the population, and thus our results may not be accurate.
