Print Conerly Mike

Is Your Model Good Enough?
Michael Conerly,
Winston Choi,
Michael Hardin
University of Alabama
Copyright © 2007, SAS Institute Inc. All rights reserved.
Outline of Presentation
Goodness of Fit (GOF) Procedures
Why are these necessary?
Review of Hosmer and Lemeshow
Adapting H-L to data mining
Selection of groups for GOF
Application of New test to real data
What if test fails?

Consider a simple Contingency table
Group    Low   Med   High   Total
Goods     12    18     30      60
Bads      32    20      8      60
Note that the rows are inversely related:
Goods rise from Low to High while Bads fall.
Are category and row related?
Focus on either Goods or Bads;
we will use Goods.

Test that Observed and Predicted
Good values are in agreement
Observed 12 18 30 60
Predicted 8 18 34 60
Here χ2 = 2.5 with p = .291

Model seems accurate

Another example
Observed 12 18 30 60
Predicted 6 16 38 60
Here χ2 = 7.9 with p = .019

Model does not seem accurate
We are more accurate in the Med group,
less accurate in the Low and High groups.
Is the propensity of being correct
related to the prediction?
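The comparison in this second example can be checked with a few lines of Python. This is a minimal sketch, assuming the ordinary Pearson statistic (sum of (Observed − Predicted)² / Predicted) on 2 degrees of freedom; the function names are illustrative and not part of the original slides.

```python
import math

def pearson_gof(observed, predicted):
    """Pearson chi-square comparing observed and predicted counts."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, predicted))
    df = len(observed) - 1  # both rows have the same total, so one df is lost
    return stat, df

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2.0)

observed = [12, 18, 30]   # Goods in Low, Med, High
predicted = [6, 16, 38]
stat, df = pearson_gof(observed, predicted)
p_value = chi2_sf_2df(stat)  # closed form is valid only because df == 2
print(f"chi-square = {stat:.1f}, df = {df}, p = {p_value:.3f}")
# prints: chi-square = 7.9, df = 2, p = 0.019
```

The Low group contributes 6.0 of the 7.9, which is why the model looks worst there.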
Why use GOF?

Are predictions consistent?
For linear models we look for residuals to be
centered at 0 with similar dispersion.
Residual plots for 0/1 data are not informative.
Binning the data allows us to compare counts.
If we predict better for some categories than
others:
• Is our model consistent?
• Are there unexplained data issues?
• Are we overfitting?
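To illustrate what binning buys us, the sketch below groups 0/1 outcomes by predicted probability and tabulates observed against expected successes per bin. The data are made up for illustration, and equal-width bins are only one possible choice.

```python
def bin_counts(outcomes, probs, n_bins=5):
    """Tabulate observed and expected successes in equal-width probability bins."""
    table = [{"n": 0, "observed": 0, "expected": 0.0} for _ in range(n_bins)]
    for y, p in zip(outcomes, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        table[idx]["n"] += 1
        table[idx]["observed"] += y       # count of actual successes
        table[idx]["expected"] += p       # sum of predicted probabilities
    return table

# Made-up illustrative data: 0/1 outcomes and model predictions
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
probs = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]

table = bin_counts(outcomes, probs)
for i, row in enumerate(table):
    print(i, row["n"], row["observed"], round(row["expected"], 2))
```

Bins where observed and expected counts diverge are the categories where the model predicts poorly.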

Predictive Models
Generally predict Pr(Good) based on:
• Logistic Regression
• Neural Networks
• Decision Trees
• Or some combination of these
• Or other techniques
In all cases, we predict Pr(Success),
with 0 < Pr(Success) < 1.
GOF for Predictive Models
Review the Hosmer & Lemeshow
procedure for Logistic Regression
See Applied Logistic Regression,
second edition, by David Hosmer and
Stanley Lemeshow, Wiley
Extend this to other estimation
techniques

Hosmer and Lemeshow Procedure
First divide the data into categories.
These need to be homogeneous.
H&L suggest deciles of the predicted
probabilities: the lowest 10% of predictions,
the next 10%, …
The resulting categories are roughly equal in size.
Fixed cutpoints (0 - .1, .1 - .2, …) are an
alternative, but generally give unequal sizes.
With large data sets, may use 50 or more
groups.
Need Expected #s > 5 for almost all
categories
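Equal-size groups come from ranking the predicted probabilities, not from cutting at fixed values. A minimal sketch of that grouping step follows; the function name is hypothetical, and ties are broken by input order, which is an arbitrary choice.

```python
def quantile_groups(probs, n_groups=10):
    """Label each observation 0..n_groups-1 by the rank of its predicted probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    labels = [0] * len(probs)
    for rank, i in enumerate(order):
        # Integer arithmetic spreads the ranks evenly over the groups
        labels[i] = rank * n_groups // len(probs)
    return labels

# Made-up predictions, split into two equal-size groups
labels = quantile_groups([0.9, 0.1, 0.5, 0.3], n_groups=2)
print(labels)  # [1, 0, 1, 0]
```

With `n_groups=10` this gives deciles; with large data sets the same function can produce 50 or more groups.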
Hosmer and Lemeshow Procedure
Tests focus on Goods or Bads, but not
both, since these are related.
We can construct Expected # by
averaging the predicted probabilities
for each category to obtain
Expected # = n_i × π_i,
where n_i is the number of observations in category i
and π_i is its average predicted probability.

Tabulated Data
Category   n     Observed   Pr(Success)   Expected
1          n_1   Obs_1      π_1           n_1 × π_1
2          n_2   Obs_2      π_2           n_2 × π_2
3          n_3   Obs_3      π_3           n_3 × π_3
…
k          n_k   Obs_k      π_k           n_k × π_k
Total      n     Σ Obs_i                  Σ Exp_i
Chi-square statistic
The statistic
$$\chi^2 = \sum_{i=1}^{k} \frac{(\mathrm{Obs}_i - \mathrm{Exp}_i)^2}{n_i \pi_i (1 - \pi_i)}$$
is approximately Chi-square with k-2
degrees of freedom.
Large values ⇒ little or no
agreement between Observed and
Predicted counts
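The statistic can be computed directly from the tabulated data. The sketch below assumes each category is summarized by its size n_i, its observed count, and its average predicted probability. The group values are made up, and `chi2_sf` is a small standard-library helper written for this example, not a library routine.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1 and x >= 0."""
    z = x / 2.0
    # Start from a closed form, then step up using Q(a+1, z) = Q(a, z) + z^a e^-z / Gamma(a+1)
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def hosmer_lemeshow(groups):
    """groups: list of (n_i, observed_i, mean_prob_i) tuples, one per category."""
    stat = 0.0
    for n_i, obs_i, pi_i in groups:
        exp_i = n_i * pi_i
        stat += (obs_i - exp_i) ** 2 / (n_i * pi_i * (1.0 - pi_i))
    df = len(groups) - 2
    return stat, df, chi2_sf(stat, df)

# Made-up summary table: four categories of 50 observations each
groups = [(50, 6, 0.10), (50, 14, 0.30), (50, 27, 0.50), (50, 41, 0.80)]
stat, df, p_value = hosmer_lemeshow(groups)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

A small statistic and large p-value, as here, indicate that observed and expected counts agree across categories.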

Generalizing to Predictive Modeling
Given categories, this H-L procedure easily
adapts to Predictive Models
To choose categories:
• Bin data based on the Predicted probs as
in H-L.
• This is problematic if you consider multiple
models (Logit, Tree, NN) since different
predictions yield different bins
• Need to compare competing models on
same categories.

Bin data using Cluster Analysis to identify
homogeneous groups.
Many forms are available to cluster similar
observations
Perform cluster analysis then do K-Means
to ensure homogeneity of categories
(clusters)
Store the cluster # in a column to create
summary tables for comparing models.
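A minimal sketch of this idea follows, assuming the clustering is run on the input variables so that every model is summarized on the same categories. The tiny k-means routine, the data, and the two sets of model predictions are all made up for illustration; in practice a statistical package's clustering procedure would be used.

```python
def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; returns a cluster label for each point."""
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]  # deterministic start
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance
        for idx, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            labels[idx] = dists.index(min(dists))
        # Update step: move each center to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def summarize(labels, outcomes, probs, k):
    """Per-cluster [n, observed successes, expected successes] for one model."""
    rows = [[0, 0, 0.0] for _ in range(k)]
    for lab, y, p in zip(labels, outcomes, probs):
        rows[lab][0] += 1
        rows[lab][1] += y
        rows[lab][2] += p
    return rows

# Made-up inputs, outcomes, and predictions from two hypothetical models
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
outcomes = [0, 0, 1, 1, 1, 0]
logit_probs = [0.2, 0.3, 0.4, 0.7, 0.8, 0.6]
tree_probs = [0.3, 0.3, 0.3, 0.7, 0.7, 0.7]

labels = kmeans(points, k=2)          # the stored cluster # column
logit_rows = summarize(labels, outcomes, logit_probs, 2)
tree_rows = summarize(labels, outcomes, tree_probs, 2)
print(labels, logit_rows, tree_rows)
```

Because the labels do not depend on any model's predictions, the summary tables for the two models are directly comparable.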

H-L also suggest applying this GOF
test to “hold-out” data for validation.
This is a straightforward application
of the same process. Apply Cluster
analysis first, then create a summary
table on these clusters and compute
the Chi-square statistic.
Examples
Using the Home Equity data, without
missing values, available in JMP 7, we
performed the following GOF tests.
We will compare a Logistic, Decision
Tree and Neural Network fit to these
data to predict P(Good Loan).

Continuation of Example here
To be added later
What if Test Fails?

Use the Chi-square calculation for each category
to determine which ones are problematic.
Do these categories contain aberrant data?
Fit these categories separately to see where
model(s) change.
Use Predictive Models to predict these problem
categories (1) versus other categories (0) to find
problem variables.
Adjust the overall models accordingly.
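The first two steps can be sketched as follows: compute each category's contribution to the Chi-square statistic, rank the categories, and build a 0/1 flag for the worst one that a follow-up model could then try to predict. The category labels and counts are made up, and flagging only the single worst category is an arbitrary simplification.

```python
def category_contributions(groups):
    """groups: list of (label, n_i, observed_i, mean_prob_i).

    Returns (label, contribution) pairs, largest contribution first.
    """
    rows = []
    for label, n_i, obs_i, pi_i in groups:
        contrib = (obs_i - n_i * pi_i) ** 2 / (n_i * pi_i * (1.0 - pi_i))
        rows.append((label, contrib))
    return sorted(rows, key=lambda r: r[1], reverse=True)

# Made-up summary table with one poorly fitted category
groups = [("A", 50, 6, 0.10), ("B", 50, 22, 0.30), ("C", 50, 27, 0.50)]
ranked = category_contributions(groups)
worst = ranked[0][0]

# 1 = problem category, 0 = all other categories
flags = [1 if label == worst else 0 for label, *_ in groups]
print(ranked, flags)
```

Here category B accounts for most of the statistic, so it is the natural place to look for aberrant data.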

Conclusion
Fitting Predictive Models to large data
sets makes discovering unusual data
problems difficult.
GOF is part of the validation and
model checking process.
It may help to identify serious flaws
and suggest corrections leading to
improved models.
Questions?

