
CHI-SQUARE TESTS FOR TABLES

The table below shows the observed distribution of ABO blood type in three samples of African Americans living in
different locations (Example 11.10). Perform a chi-square test of independence to determine whether Blood Type is
independent of Location. Unfortunately, Excel does not have a built-in function that carries out the full test, but we can use
spreadsheet properties to simplify the analysis.

Observed (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A                0       1                1
B               10      49               59
AB               3      19               22
O                5      43               48
Total           18     112              130

First calculate the "Expecteds" in the table below. The Expected number of Type A in FL has been calculated for you using the
appropriate row and column totals from the above table. Fill in the rest of the Expected numbers by clicking and extending the
formula to the cells below and to the right. Check that the row and column totals then match those from the Observed
table. Also notice the use of relative and absolute referencing (fix column and/or row in formula).

Expected (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A             0.14    0.86    0.00     1.00
B             8.17   50.83    0.00    59.00
AB            3.05   18.95    0.00    22.00
O             6.65   41.35    0.00    48.00
Total        18.00  112.00    0.00   130.00

Next calculate (Observed - Expected)^2/Expected for each cell. In the table below, the cell for Type A and FL has already been
filled in. Click and drag to extend the formula to the remaining 11 cells.

(O-E)^2/E (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS
A            0.138   0.022
B            0.410   0.066
AB           0.001   0.000
O            0.408   0.066

The chi-square test statistic is calculated as the sum of the 12 cells in this table. Fill in the chi-square statistic using SUM. Also
fill in the degrees of freedom for this test. From these values, the p-value and critical value will be automatically calculated
below.

Chi-square statistic: 1.111 (sum the previous 12 cells)
df: 3 ((#rows-1)(#cols-1))
p-value: 0.774
critical value: 7.815

Excel can calculate the chi-square p-value directly from the Observed and Expected tables, via CHITEST(obs range, exp range).
This does not tell you the actual value of the test statistic. However, you can calculate the test statistic from the p-value using
=CHIINV(p-value, df)

P-value: 0.774
Chi-square statistic: 1.111
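
The worksheet steps above (expected counts from the row and column totals, then (O-E)^2/E for each cell, then the sum) can be sketched outside Excel. The following is a minimal Python sketch, not part of the original workbook. The helper name `chi2_upper_tail` is hypothetical and stands in for CHIDIST for integer degrees of freedom. The blank MS column is left out, which is an assumption that matches the df of 3 reported on the sheet.

```python
import math

def chi2_upper_tail(x, df):
    """Upper-tail area Pr(chi-square > x) for positive integer df and x >= 0."""
    z = x / 2.0
    # Closed-form starting values: df = 2 gives exp(-z), df = 1 gives erfc(sqrt(z)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Step df up by 2 at a time: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Observed counts from the first worksheet: rows A, B, AB, O; columns FL, IA.
# Assumption: the blank MS column is omitted.
observed = [[0, 1], [10, 49], [3, 19], [5, 43]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# One (O-E)^2/E contribution per cell, then their sum.
cells = [[(o - e) ** 2 / e for o, e in zip(orow, erow)]
         for orow, erow in zip(observed, expected)]
statistic = sum(sum(row) for row in cells)

df = (len(observed) - 1) * (len(col_totals) - 1)
p_value = chi2_upper_tail(statistic, df)

print(f"statistic = {statistic:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The printed statistic and p-value can be compared with the 1.111 and 0.774 shown on the worksheet.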
CATEGORICAL DATA
In this module, we will discuss
1. Chi-square distribution
2. Goodness of fit test
3. Test of independence
CHI-SQUARE DISTRIBUTION
Similar to the t distribution, the χ² distribution depends on the "degrees of freedom". Unlike the t distribution, the χ² distribution
is skewed and only positive values can occur.
When performing a test that involves a chi-square statistic, large values suggest a departure from the null hypothesis. As a
result, p-values of hypothesis tests involve determining an upper tail area.
Excel provides two functions involving the chi-square distribution. CHIDIST requires an x value and degrees of freedom and
returns the upper tail area, so it can be used to compute the p-value. CHIINV requires a probability p and degrees of
freedom and returns the value x such that Pr(χ² > x) = p, so it can be used to determine the critical value of a test.
a. Consider the snapdragon problem described on pages 334-336 of the text. In that experiment, the test statistic was
4.18 with 2 df. Use CHIDIST to compute the exact p-value. Using Table 8, it is between .1 and .2.
b. Consider Exercise 11.26 in the text. The test statistic is 24.35 and you are asked to perform the test at α = .01.
Instead of computing the p-value, use CHIINV to determine the critical value (df = 4).
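
For readers without Excel, the behaviour described for CHIDIST and CHIINV can be approximated in a few lines. This is a minimal Python sketch under stated assumptions: the function names `chidist` and `chiinv` are hypothetical stand-ins written here, they handle positive integer degrees of freedom only, and the inverse is found by simple bisection rather than by whatever method Excel uses.

```python
import math

def chidist(x, df):
    """Upper-tail area Pr(chi-square > x) for positive integer df and x >= 0."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chiinv(p, df):
    """Value x with Pr(chi-square > x) = p, found by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    # The upper-tail area decreases in x, so widen the bracket until it drops below p.
    while chidist(hi, df) > p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chidist(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Part a: test statistic 4.18 with 2 df.
p_a = chidist(4.18, 2)

# Part b: critical value at the .01 level with 4 df, compared with 24.35.
crit_b = chiinv(0.01, 4)

print(f"a. p-value = {p_a:.4f}")
print(f"b. critical value = {crit_b:.3f}, statistic exceeds it: {24.35 > crit_b}")
```

With 2 df the upper-tail area reduces to exp(-x/2), which gives a quick hand check for part a.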
Goodness of Fit Test
Suppose you do not feel comfortable with Excel's random number generator and decide to develop your own discrete integer algorithm. This
algorithm is to generate the digits 0-9 with equal probability.
After writing the macro, you decide to test it out by generating 1000 digits and checking to see if there is any evidence that the algorithm is not
performing properly. Based on the table below, perform the goodness of fit test.

Digits:         0     1     2     3     4     5     6     7     8     9
Obs:           97    93   115   104    97   109    98   101    96    91
Exp:          100   100   100   100   100   100   100   100   100   100
(O-E)^2/E:   0.09  0.49  2.25  0.16  0.09  0.81  0.04  0.01  0.16  0.81

χ²_s = 4.91
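
The goodness of fit calculation in the table can be reproduced with a short script. This is a minimal Python sketch, not taken from the workbook: the helper `chi2_upper_tail` is a hypothetical stand-in for CHIDIST, and the degrees of freedom are taken as the number of categories minus one, the usual choice when no parameters are estimated from the data.

```python
import math

def chi2_upper_tail(x, df):
    """Upper-tail area Pr(chi-square > x) for positive integer df and x >= 0."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Observed and expected counts for digits 0-9, as given in the table.
observed = [97, 93, 115, 104, 97, 109, 98, 101, 96, 91]
expected = [100] * 10

# Per-digit contributions (O-E)^2/E and their sum.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

# Assumption: df = number of categories - 1.
df = len(observed) - 1
p_value = chi2_upper_tail(statistic, df)

print(f"statistic = {statistic:.2f}, df = {df}, p-value = {p_value:.3f}")
```

A large p-value here would mean the counts give little evidence against the digits being equally likely.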
CHI-SQUARE TESTS FOR TABLES

Observed (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A                7       8       0       15
B               14      71       1       86
AB               1       4       0        5
O                1      14       0       15
Total           23      97       1      121

Expected (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A             2.85   12.02    0.12    15.00
B            16.35   68.94    0.71    86.00
AB            0.95    4.01    0.04     5.00
O             2.85   12.02    0.12    15.00
Total        23.00   97.00    1.00   121.00

(O-E)^2/E (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS
A            6.037   1.347   0.124
B            0.337   0.061   0.118
AB           0.003   0.000   0.041
O            1.202   0.324   0.124

Chi-square statistic: 9.718 (sum the previous 12 cells)
df: 6 ((#rows-1)(#cols-1))
p-value: 0.137
critical value: 12.592

Excel can calculate the chi-square p-value directly from the Observed and Expected tables, via CHITEST(obs range, exp range).
This does not tell you the actual value of the test statistic. However, you can calculate the test statistic from the p-value using
=CHIINV(p-value, df)

P-value: 0.137
Chi-square statistic: 9.718
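
The note that CHIINV(p-value, df) recovers the test statistic can be checked numerically with the values from this sheet (statistic 9.718, df 6). This is a minimal Python sketch; `chidist` and `chiinv` are hypothetical stand-ins written here for the Excel functions, valid for positive integer df, with the inverse computed by bisection.

```python
import math

def chidist(x, df):
    """Upper-tail area Pr(chi-square > x) for positive integer df and x >= 0."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chiinv(p, df):
    """Value x with Pr(chi-square > x) = p, found by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chidist(hi, df) > p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chidist(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

statistic, df = 9.718, 6

# Forward step: statistic -> p-value (the quantity CHITEST reports).
p_value = chidist(statistic, df)

# Backward step: p-value -> statistic (the CHIINV trick from the worksheet).
recovered = chiinv(p_value, df)

print(f"p-value = {p_value:.3f}, recovered statistic = {recovered:.3f}")
```

The round trip works because the upper-tail area is strictly decreasing in x, so each p-value corresponds to exactly one statistic for a given df.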
CHI-SQUARE TESTS FOR TABLES

Observed (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A                0       1       0        1
B                0       1       0        1
AB              10      45       1       56
O                1      17       0       18
(unlabeled)     12      33       0       45
Total           23      97       1      121

Expected (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS    Total
A             0.19    0.80    0.01     1.00
B             0.19    0.80    0.01     1.00
AB           10.64   44.89    0.46    56.00
O             3.42   14.43    0.15    18.00
(unlabeled)   8.55   36.07    0.37    45.00
Total        23.00   97.00    1.00   121.00

(O-E)^2/E (rows: Blood Type, columns: Location)
Blood Type      FL      IA      MS
A            0.190   0.049   0.008
B            0.190   0.049   0.008
AB           0.039   0.000   0.624
O            1.714   0.458   0.149
(unlabeled)  1.389   0.262   0.372

Chi-square statistic: 5.500 (sum the previous 15 cells)
df: 8 ((#rows-1)(#cols-1))
p-value: 0.702994
critical value: 15.507

If the statistic > the theoretical critical value, there is a relationship (reject independence).
If the statistic < the theoretical critical value, there is no evidence of a relationship (do not reject independence).

Excel can calculate the chi-square p-value directly from the Observed and Expected tables, via CHITEST(obs range, exp range).
This does not tell you the actual value of the test statistic. However, you can calculate the test statistic from the p-value using
=CHIINV(p-value, df)

P-value: 0.703
Chi-square statistic: 5.500
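
The critical-value decision rule noted on this sheet (compare the statistic with the theoretical critical value) can be written as a short check. This is a minimal Python sketch; the function name `decide` is hypothetical, and the pairs below are the statistic and critical value printed on the three worksheets. Those critical values match the 0.05 level for 3, 6 and 8 df, although the worksheets do not state the level explicitly.

```python
def decide(statistic, critical_value):
    """Reject independence only when the statistic exceeds the critical value."""
    if statistic > critical_value:
        return "reject independence"
    return "do not reject independence"

# (chi-square statistic, critical value) as reported on each worksheet.
worksheets = [
    (1.111, 7.815),    # first table, df 3
    (9.718, 12.592),   # second table, df 6
    (5.500, 15.507),   # third table, df 8
]

decisions = [decide(stat, crit) for stat, crit in worksheets]

for (stat, crit), decision in zip(worksheets, decisions):
    print(f"statistic {stat:.3f} vs critical value {crit:.3f}: {decision}")
```

The same conclusion follows from the p-values, since a statistic below the critical value corresponds to a p-value above the chosen level.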
