Chi-Square Goodness-of-Fit Test

MASINDE MULIRO UNIVERSITY
OF SCIENCE & TECHNOLOGY

COMPUTER SCIENCE DEPARTMENT
PCT 911: ADVANCE RESEARCH METHODS
TASK
Chi-square goodness-of-fit test (χ²) for Stolen
Vehicles
SUBMITTED BY
NAME: NAHASON MATOKE
REGNO: SIT/H/004/10
SUBMITTED TO:
DR G. WANYEMBI
KAKAMEGA
Chi Sqr Assignment 2
Purpose:
Test for distributional adequacy. The chi-square test (Snedecor and Cochran, 1989) is used
to test whether a sample of data came from a population with a specific distribution.
An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any
univariate distribution for which you can calculate the cumulative distribution function.
The chi-square goodness-of-fit test is applied to binned data (i.e., data put into classes).
This is actually not a restriction since for non-binned data you can simply calculate a
histogram or frequency table before generating the chi-square test. However, the values of
the chi-square test statistic are dependent on how the data is binned. Another disadvantage
of the chi-square test is that it requires a sufficient sample size in order for the chi-square
approximation to be valid.
The chi-square test is an alternative to the Anderson-Darling and Kolmogorov-Smirnov
goodness-of-fit tests. The chi-square goodness-of-fit test can be applied to discrete
distributions such as the binomial and the Poisson. The Kolmogorov-Smirnov and
Anderson-Darling tests are restricted to continuous distributions.
Definition: The chi-square test is defined for the hypotheses:

H0: The data follow a specified distribution.
Ha: The data do not follow the specified distribution.
Test Statistic: For the chi-square goodness-of-fit computation, the data are divided into k
bins and the test statistic is defined as:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where O_i is the observed frequency for bin i and E_i is the expected frequency for bin i.
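The statistic defined above, the sum over all bins of the squared difference between observed and expected frequency divided by the expected frequency, can be sketched as a small Python function. The function name and the two-bin check values are hypothetical and serve only as an illustration; they are not data from this assignment.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all bins of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected frequency must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-bin illustration (not data from this assignment):
# observed counts 10 and 20 against expected counts 15 and 15.
toy_statistic = chi_square_statistic([10, 20], [15, 15])
print(round(toy_statistic, 4))
```

The two validation checks reflect the definition: each observed bin needs a matching expected bin, and a bin with zero expected frequency would make its term undefined.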
The expected frequencies

The Kenya Evening Star, Nov. 7, 2009, reported the following information for a random
sample of 1000 stolen cars for the previous year:
170 were Fords, 300 Toyotas, 210 Nissans, 190 Hyundais, and 130 Peugeots.
Use the χ² goodness-of-fit test at a significance level of 0.01 to test the hypothesis that the
proportions stolen are identical to the population make proportions.
Suppose it is established that 15% of all cars are Fords, 35% are Toyotas, 20% are Nissans,
15% are Hyundais, and 15% are Peugeots.
The Observed Stolen Vehicles

                Ford  Toyota  Nissan  Hyundai  Peugeot  Total
Stolen (Oij)     170     300     210      190      130   1000
Percentage of vehicles stolen for each make:
(Stolen make / Total stolen) * 100

                  Ford  Toyota  Nissan  Hyundai  Peugeot  Total
Stolen (Oij) %      17      30      21       19       13    100
Total vehicles
Total stolen vehicles in the sample = 170 + 300 + 210 + 190 + 130 = 1000
Therefore
Expected Stolen Frequencies (Stolen Vehicle)
Given that
15% of all cars are Fords, 35% are Toyotas, 20% are Nissans, 15% are Hyundais, and 15%
are Peugeots,
15% Ford of the 1000 stolen vehicles = 0.15 * 1000 = 150

           Ford  Toyota  Nissan  Hyundai  Peugeot
Eij %       15%     35%     20%      15%      15%
Eij         150     350     200      150      150
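As a check on the expected frequencies above, the following minimal sketch multiplies each population percentage by the sample size of 1000 stolen cars. The variable names are illustrative; the percentages and counts are the ones given in the problem statement.

```python
makes = ["Ford", "Toyota", "Nissan", "Hyundai", "Peugeot"]
observed = [170, 300, 210, 190, 130]      # stolen cars in the sample, by make
population_pct = [15, 35, 20, 15, 15]     # percentage of all cars, by make

sample_size = sum(observed)               # 1000 stolen cars in total
assert sum(population_pct) == 100         # the makes cover the whole population

# Expected count under H0: population share of the make times the sample size.
expected = [pct * sample_size / 100 for pct in population_pct]

for make, e in zip(makes, expected):
    print(f"{make:<8} {e:6.1f}")
```

Every expected count is at least 150, well above the small expected frequencies at which the chi-square approximation becomes doubtful.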
Test the null hypothesis

           Oij   Eij   Oij - Eij   (Oij - Eij)^2 / Eij
Ford       170   150          20           2.666666667
Toyota     300   350         -50           7.142857143
Nissan     210   200          10           0.5
Hyundai    190   150          40          10.66666667
Peugeot    130   150         -20           2.666666667
Total                                     23.64285714
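The table above can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic, using the observed and expected counts from the assignment; the variable names are illustrative.

```python
makes = ["Ford", "Toyota", "Nissan", "Hyundai", "Peugeot"]
observed = [170, 300, 210, 190, 130]
expected = [150, 350, 200, 150, 150]

# Contribution of each make to the statistic: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for make, o, e, c in zip(makes, observed, expected, contributions):
    print(f"{make:<8} {o:4d} {e:4d} {o - e:5d} {c:12.6f}")
print(f"chi-square = {chi_square:.3f}")   # chi-square = 23.643
```

The per-make contributions show that Hyundai and Toyota account for most of the statistic.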
That is, chi-square is the sum, over all categories, of the squared difference between the
observed (Oij) and the expected (Eij) counts (the deviation, d), each divided by the
expected count.
Assessing significance levels:
In the chi-square goodness-of-fit test, when no parameters are estimated from the data, the
degrees of freedom equal the number of categories (k) minus one.
Df = k - 1
= 5 - 1
= 4
Thus the value calculated from the formula above is compared with values in the
chi-square distribution table (Bissonnette, 2006). We reject the null hypothesis if the
chi-square value is greater than the critical value (what is called the upper critical value).
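The table lookup described above can also be approximated numerically. The sketch below uses only the Python standard library and relies on the closed-form upper-tail probability of the chi-square distribution for even degrees of freedom, exp(-x/2) times the sum of (x/2)^j / j! for j from 0 to df/2 - 1. The function names and the bisection bounds are assumptions made for this illustration.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with positive even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half_x / j          # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half_x) * total


def upper_critical_value(alpha, df, lo=0.0, hi=200.0):
    """Bisect for the x where P(X > x) = alpha; the tail shrinks as x grows."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = upper_critical_value(0.01, 4)
print(f"{critical:.3f}")   # 13.277
```

For odd degrees of freedom this closed form does not apply, and a printed table or a statistics library would be needed instead.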
Conclusion
The chi-square value for these data is therefore 23.643, with 4 degrees of freedom. The
critical value at p = .01 is 13.277.
Since 23.643 is larger than 13.277, the observed counts differ from the expected counts by
enough to reject the null hypothesis.
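The same decision can be cross-checked with a p-value. This is a minimal sketch assuming 4 degrees of freedom, for which the upper-tail probability of the chi-square distribution has the closed form exp(-x/2) * (1 + x/2); the variable names are illustrative.

```python
import math

chi_square = 23.643   # statistic reported above
critical = 13.277     # table value for 4 degrees of freedom at the 0.01 level
alpha = 0.01

# Upper-tail probability for 4 degrees of freedom: exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

reject_by_critical_value = chi_square > critical
reject_by_p_value = p_value < alpha

print(f"p-value = {p_value:.2e}")
print("reject H0" if reject_by_p_value else "fail to reject H0")
```

Both routes agree: the statistic exceeds the critical value and the p-value is far below 0.01.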
State the conclusion you can draw from the observations made
Test the null hypothesis
Set up the hypotheses for the chi-square goodness-of-fit test:
H0 (null hypothesis): In the chi-square goodness-of-fit test, the null hypothesis assumes that
there is no significant difference between the observed and the expected values.
Ha (alternative hypothesis): In the chi-square goodness-of-fit test, the alternative hypothesis
assumes that there is a significant difference between the observed and the expected values.
The calculated value of χ² (23.643) is much higher than the table value (13.277), which
means that the difference is unlikely to be due to chance alone. The result is significant at
the 0.01 level. Hence, the null hypothesis does not hold: the proportions of makes among
stolen vehicles differ from the population make proportions.
