Edpsy/Psych/Soc 589
Carolyn J. Anderson
Department of Educational Psychology
I L L I N O I S
UNIVERSITY OF ILLINOIS AT URBANA - CHAMPAIGN
◆ Unrelated classifications
◆ Other
■ “Expected Frequencies” are the values expected if the null hypothesis is true,
µij = nπij
■ To test a null hypothesis, we compare the observed frequencies nij and the expected frequencies µij:
{nij − µij}
■ The test statistics are functions of observed and expected frequencies.
■ If the null hypothesis is true, then the test statistics are distributed as chi-squared random variables, so they are referred to as “Chi-Squared Tests”.
■ As df increases, the distribution becomes more “bell-shaped” (i.e., as df → ∞, χ²df → N).
Pearson’s Chi-Squared Statistic
$$X^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} \frac{(n_{ij} - \mu_{ij})^2}{\mu_{ij}}$$
A good rule: “Large” means µij ≥ 5 for all (i, j).
■ The p-value for a test is the right-tail probability of X².
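The two steps above (compute X², then take its right-tail probability) can be sketched in plain Python. This is only an illustration: the counts are hypothetical, the helper names `pearson_x2` and `chi2_right_tail` are invented here, and df = 1 is assumed because the expected counts come from an independence fit to a 2 × 2 table.

```python
import math

def pearson_x2(observed, expected):
    """Pearson's X^2: sum over cells of (n_ij - mu_ij)^2 / mu_ij."""
    return sum(
        (n - mu) ** 2 / mu
        for obs_row, exp_row in zip(observed, expected)
        for n, mu in zip(obs_row, exp_row)
    )

def chi2_right_tail(x, df):
    """P(chi-squared with integer df >= x), using closed-form series."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        terms = (half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    terms = (
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)

observed = [[30, 20], [20, 30]]          # hypothetical counts
expected = [[25.0, 25.0], [25.0, 25.0]]  # n_i+ * n_+j / n for these counts
x2 = pearson_x2(observed, expected)
p_value = chi2_right_tail(x2, df=1)      # df = 1 is an assumption (2 x 2 independence)
print(f"X^2 = {x2:.3f}, p-value = {p_value:.4f}")
```

A statistics library would normally supply the tail probability; the series is written out here only so that the sketch has no dependencies.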
■ If max L(HO) = max L(HA), then there is no evidence against HO (i.e., Λ = 1).
■ The smaller the likelihood under HO, the more evidence against HO (i.e., the smaller Λ).
Likelihood Ratio Statistic for 2-way Table
$$G^2 = 2\sum_{i=1}^{I}\sum_{j=1}^{J} n_{ij}\log(n_{ij}/\mu_{ij})$$
This is the “likelihood ratio chi-squared statistic”.
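A small sketch of the G² formula, under the same caveats as any illustration: the counts are hypothetical, the function name `likelihood_ratio_g2` is invented here, and cells with a zero count are taken to contribute nothing to the sum (a common convention, assumed rather than stated in these notes).

```python
import math

def likelihood_ratio_g2(observed, expected):
    """G^2 = 2 * sum over cells of n_ij * log(n_ij / mu_ij)."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for n, mu in zip(obs_row, exp_row):
            if n > 0:  # assumed convention: 0 * log(0) counts as 0
                total += n * math.log(n / mu)
    return 2.0 * total

observed = [[30, 20], [20, 30]]          # hypothetical counts
expected = [[25.0, 25.0], [25.0, 25.0]]  # fitted values under HO
g2 = likelihood_ratio_g2(observed, expected)
print(f"G^2 = {g2:.4f}")
```

When the observed counts equal the expected counts, every log term is zero and G² = 0, which corresponds to Λ = 1.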
■ These four tests differ in terms of
◆ Experimental procedure (i.e., sampling design)
◆ The null and alternative hypotheses
◆ Logic used to obtain estimates of expected frequencies assuming HO is true.
To test the hypothesis of independence, we assume HO is true.
Given data, the observed marginal proportions pi+ and p+j are the maximum likelihood estimates of πi+ and π+j, respectively; that is,
π̂i+ = pi+ and π̂+j = p+j
“Estimated Expected Frequencies” are
$$\hat{\mu}_{ij} = n\hat{\pi}_{i+}\hat{\pi}_{+j} = n(n_{i+}/n)(n_{+j}/n) = \frac{n_{i+}n_{+j}}{n}$$
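The estimated expected frequencies under independence depend only on the margins, which a few lines of Python can make concrete. The 2 × 3 table below is hypothetical and the function name `expected_under_independence` is invented for this sketch.

```python
def expected_under_independence(table):
    """Estimated expected frequencies mu_hat_ij = n_i+ * n_+j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

table = [[20, 30, 50], [30, 20, 50]]  # hypothetical counts
mu_hat = expected_under_independence(table)
for row in mu_hat:
    print([round(m, 2) for m in row])
```

The fitted table reproduces the observed row and column totals, which is why only the margins need to be estimated under HO.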
df = (# parameters in HA) − (# parameters in HO)
■ Null hypothesis has
◆ (I − 1) unique parameters for the row margin, π̂i+.
◆ (J − 1) unique parameters for the column margin, π̂+j.
Totals   250   415   102   101   16   884

What’s the nature of the dependency? Residuals. . .
Test of Independence

              Admission
           yes    no    Total
I list     123    37     160

Statistics for Table of List by Admission

Statistic                      DF    Value     Prob
Chi-Square                      1    4.3659    0.0367
Likelihood Ratio Chi-Square     1    4.6036    0.0319
Continuity Adj. Chi-Square      1    4.0141    0.0451
Mantel-Haenszel Chi-Square      1    4.3657    0.0367
Phi Coefficient                     −0.0129
Null Hypothesis: The distributions of responses from the different populations are the same.
Alternative Hypothesis: The distributions of responses from the different populations are different.
Example: Effectiveness of Vitamin C for prevention of common cold.
which is the exact same formula that we use to compute estimated expected frequencies under independence.
Degrees of Freedom: Same as for testing independence.
Totals   48   231   279

Test Statistic                     df    Value    p–value
Pearson Chi-Square X²               1    4.811    .03
Likelihood Ratio Chi-Square G²      1    4.872    .03

Adjusted Residuals
                  Outcome
               Cold    No Cold
vitamin C     −2.31      2.17
placebo        2.10     −2.22
Example: 1970 Draft

Month    1–122   123–244   245–366   Totals
Jan          9        12        10       31
Feb          7        12        10       29
March        5        10        16       31
April        8         8        14       30
May          9         7        15       31
June        11         7        12       30
July        12         7        12       31
Aug         13         7        11       31
Sept        10        15         5       30
Oct          9        15         7       31
Nov         12        12         6       30
Dec         17        10         4       31
Totals     122       122       122      366
If the null hypothesis is true, then expected frequencies µij are
$$\mu_{ij} = (\text{\# in row } i)(\text{proportion in column } j) = n_{i+}(n_{+j}/n) = \frac{n_{i+}n_{+j}}{n}$$
Degrees of Freedom = (I − 1)(J − 1).
Explanation. . .
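As a rough check of the expected-value formula and the degrees of freedom, the 1970 draft counts can be run through a short Python sketch. The counts are copied from the draft table in these notes; the variable names are illustrative, and only the expected values, X², and df are computed here.

```python
# Month x draft-number third (1-122, 123-244, 245-366), from the draft table.
draft = {
    "Jan": [9, 12, 10], "Feb": [7, 12, 10], "March": [5, 10, 16],
    "April": [8, 8, 14], "May": [9, 7, 15], "June": [11, 7, 12],
    "July": [12, 7, 12], "Aug": [13, 7, 11], "Sept": [10, 15, 5],
    "Oct": [9, 15, 7], "Nov": [12, 12, 6], "Dec": [17, 10, 4],
}
rows = list(draft.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# mu_ij = (# in row i) * (proportion in column j) = n_i+ * n_+j / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson's X^2 compares observed and expected counts cell by cell.
x2 = sum(
    (obs - mu) ** 2 / mu
    for obs_row, exp_row in zip(rows, expected)
    for obs, mu in zip(obs_row, exp_row)
)
df = (len(rows) - 1) * (len(col_totals) - 1)  # (I - 1)(J - 1)
print(f"n = {n}, df = {df}, X^2 = {x2:.3f}")
```

Because each column holds exactly one third of the 366 days, every expected count is simply one third of its month's total.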
                Correct   Incorrect
Method   A                            n/2 = .5
         B                            n/2 = .5
                                      n

■ HO: Independence, and an equal number of students choose each method.
■ HA: Method and Answer are dependent and/or an unequal number of students choose each method.
The expected frequencies = ni+ n+j /n = n+j /2.
If Z1² and Z2² are each chi-squared with df = 1, then (Z1² + Z2²) is chi-squared with df = df1 + df2 = 2
. . . and (of course) Z1² and Z2² are independent.
“Partitioning chi-squared” uses this fact, but in reverse:
We start with a chi-squared statistic with df > 1 and break it into component parts, each with df = 1.
Example: A sample of psychiatrists was classified with respect to their school of psychiatric thought and their beliefs about the origin of schizophrenia (Agresti, 1990; Gallagher et al., 1987).
Eclectic          90   12   78
Medical           13    1    6
Psychoanalysis    19   13   50

Sub-table 1:   df = 1, G² = .294, p-value = .59
              Bio   Env
Eclectic       90    12
Medical        13     1

Sub-table 2:   df = 1, G² = .005, p-value = .94
              Env   Com
Eclectic       12    78
Medical         1     6

Sub-table 3:   df = 1, G² = 6.100, p-value = .01
              Bio   Env
Medical        13     1
Psychoan       19    13

Sub-table 4:   df = 1, G² = .171, p-value = .68
              Env   Com
Medical         1     6
Psychoan       13    50

But. . . .294 + .005 + 6.100 + .171 = 6.570 ≠ 23.036
Eclectic          90   12   78
Medical           13    1    6
Psychoanalysis    19   13   50

Sub-Table 1:   df = 1, G² = .294, X² = .264, θ̂ = .577
              Bio   Env
Eclectic       90    12
Medical        13     1

Sub-Table 2:   df = 1, G² = 1.359, X² = 1.314, θ̂ = .560
              Bio+Env   Com
Eclectic          102    78
Medical            14     6

Sub-Table 3:   df = 1, G² = 12.953, X² = 14.989, θ̂ = 5.421
              Bio   Env
Ecl+Med       103    13
Psychoan       19    13

Sub-Table 4:   df = 1, G² = 8.430, X² = 8.397, θ̂ = 2.158
              Bio+Env   Com
Ecl+Med           116    84
Psychoan           32    50
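The additivity of G² over the independent component tables can be checked numerically. The sketch below uses the counts from the psychiatrist example; the function names `g2_independence` and `odds_ratio` are invented for illustration, and zero cells (none occur here) are assumed to contribute nothing.

```python
import math

def g2_independence(table):
    """Likelihood ratio G^2 for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # assumed convention for empty cells
                mu_hat = row_totals[i] * col_totals[j] / n
                g2 += 2.0 * count * math.log(count / mu_hat)
    return g2

def odds_ratio(t):
    """Sample odds ratio for a 2 x 2 table."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

# Rows: Eclectic, Medical, Psychoanalysis; columns: Bio, Env, Com.
full = [[90, 12, 78], [13, 1, 6], [19, 13, 50]]
sub_tables = {
    "Sub-Table 1": [[90, 12], [13, 1]],
    "Sub-Table 2": [[102, 78], [14, 6]],
    "Sub-Table 3": [[103, 13], [19, 13]],
    "Sub-Table 4": [[116, 84], [32, 50]],
}
parts = {name: g2_independence(t) for name, t in sub_tables.items()}
for name, value in parts.items():
    print(f"{name}: G^2 = {value:.3f}, odds ratio = {odds_ratio(sub_tables[name]):.3f}")
print(f"sum of parts = {sum(parts.values()):.3f}, full table G^2 = {g2_independence(full):.3f}")
```

The component G² values sum to the G² of the full 3 × 3 table, whereas Pearson's X² components need not add up exactly.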
■ Each marginal total of the original table must be a marginal total for one and only one sub-table.
A better approach to studying the nature of association: estimating parameters that describe aspects of association and models that represent association.