LOGIT MODELS
Del Rosario, RP | Perez, JJ
Nominal Responses
One response variable Y with J levels
One or more explanatory or predictor variables
quantitative, qualitative or both
Logistic Regression
Forming Logits
When J = 2, Y is dichotomous
The logit is the log of the odds of success, i.e. the odds that an event occurs rather than does not occur:
$$\operatorname{logit}(\pi) = \log\frac{\pi}{1-\pi}$$
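As a quick numerical illustration of the binary logit, here is a minimal Python sketch. The function names `logit` and `inv_logit` are illustrative choices, not taken from the slides.

```python
import math

def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse of the logit: map log odds z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# A success probability of 0.8 corresponds to odds of 4 to 1.
log_odds = logit(0.8)
print(round(log_odds, 4))              # log(4)
print(round(inv_logit(log_odds), 4))   # recovers 0.8
```

A probability of 0.5 gives a logit of 0; probabilities above 0.5 give positive logits and those below give negative logits.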
When J > 2,
Multicategory or Polytomous response variable
There are $J(J-1)/2$ logits that can be formed but only $(J-1)$ are non-redundant
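The counting argument can be checked by enumeration. The sketch below lists every pair of categories that could define a logit and the J - 1 pairs that compare each category with the last one; the helper names are hypothetical, and using the last category as the reference is only one possible choice.

```python
from itertools import combinations

def all_pairwise_logits(J):
    """Every unordered pair (a, b) of categories 1..J that could define log(pi_a / pi_b)."""
    return list(combinations(range(1, J + 1), 2))

def baseline_logits(J):
    """The J - 1 pairs (j, J) that compare each category with the last one."""
    return [(j, J) for j in range(1, J)]

J = 4
pairs = all_pairwise_logits(J)
base = baseline_logits(J)
print(len(pairs), len(base))   # J(J-1)/2 and J-1
```

Every other pairwise logit is a difference of two of the J - 1 reference logits, which is why the rest are redundant.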
Ordinal response
Ordinal logistic regression
Cumulative Logits/Proportional Odds Model
Adjacent Categories
Continuation Ratio
Multicategory Logits
Model simultaneously all relationships between
probabilities for pairs of categories (vs Binary Logistic
Regression)
Optimal efficiency
Estimates of the model parameters have smaller SEs than the
estimates obtained by fitting the equations separately.
For simultaneous fitting, the same parameter estimates occur for a
pair of categories no matter which category is baseline.
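Baseline-category logits, with category J as the baseline:
$$\log\frac{\pi_j}{\pi_J}, \quad j = 1, \dots, J-1$$
Model with one predictor x:
$$\log\frac{\pi_j}{\pi_J} = \alpha_j + \beta_j x, \quad j = 1, \dots, J-1$$
Logit for any pair of categories a and b:
$$\log\frac{\pi_a}{\pi_b} = \log\frac{\pi_a/\pi_J}{\pi_b/\pi_J} = \log\frac{\pi_a}{\pi_J} - \log\frac{\pi_b}{\pi_J}$$
$$= (\alpha_a + \beta_a x) - (\alpha_b + \beta_b x) = (\alpha_a - \alpha_b) + (\beta_a - \beta_b)x$$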
Iteration 0:   log likelihood = -57.570928
Iteration 1:   log likelihood =  -49.97414
Iteration 2:   log likelihood = -49.186349
Iteration 3:   log likelihood = -49.170647
Iteration 4:   log likelihood = -49.170622
Iteration 5:   log likelihood = -49.170622
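Using STATA
                                                  Number of obs   =         59
                                                  LR chi2(2)      =      16.80
                                                  Prob > chi2     =     0.0002
                                                  Pseudo R2       =     0.1459

------------------------------------------------------------------------------
        food |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
        size |   -.110109    .517082    -0.21   0.831    -1.123571    .9033531
       _cons |   1.617731   1.307275     1.24   0.216    -.9444801    4.179943
-------------+----------------------------------------------------------------
2            |
        size |  -2.465446   .8996503    -2.74   0.006    -4.228728    -.702164
       _cons |   5.697444   1.793809     3.18   0.001     2.181644    9.213244
-------------+----------------------------------------------------------------
3            |  (base outcome)
------------------------------------------------------------------------------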
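. estat ic

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
           . |     59   -57.57093   -49.17062      4     106.3412    114.6514
-----------------------------------------------------------------------------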
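The information criteria reported by `estat ic` can be reproduced by hand from the log likelihood. This is a minimal sketch using the standard definitions AIC = -2 ll + 2k and BIC = -2 ll + k ln(N); the function names are illustrative, and k = 4 is taken to be the two intercepts plus the two slopes.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2 * log likelihood + 2 * (number of parameters)."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2 * log likelihood + k * ln(n)."""
    return -2.0 * loglik + k * math.log(n)

ll_model = -49.17062   # ll(model) from the Stata output
k = 4                  # estimated parameters: 2 intercepts + 2 slopes
n = 59                 # number of observations

print(round(aic(ll_model, k), 4))
print(round(bic(ll_model, k, n), 4))
```

Both values agree with the AIC and BIC columns of the Stata table to the reported precision.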
Using R
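$$\log\frac{\hat\pi_1}{\hat\pi_3} = 1.618 - 0.110x$$
Estimated log odds that primary food choice of alligators is
invertebrate rather than other types:
$$\log\frac{\hat\pi_2}{\hat\pi_3} = 5.697 - 2.465x$$
The logit for categories 1 and 2 follows by subtraction:
$$\log\frac{\hat\pi_1}{\hat\pi_2} = (1.618 - 5.697) + [-0.110 - (-2.465)]x$$
$$\log\frac{\hat\pi_1}{\hat\pi_2} = -4.080 + 2.355x$$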
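The subtraction can be verified with the unrounded Stata coefficients. In this sketch the dictionary `coef` and the helper `pairwise_logit` are illustrative names; the numbers are copied from the `mlogit` output, with category 3 as the baseline.

```python
# Baseline-category estimates (intercept, slope) with category 3 as baseline,
# copied from the Stata output.
coef = {
    1: (1.617731, -0.110109),   # log(pi_1 / pi_3)
    2: (5.697444, -2.465446),   # log(pi_2 / pi_3)
    3: (0.0, 0.0),              # baseline: log(pi_3 / pi_3) = 0
}

def pairwise_logit(a, b):
    """Intercept and slope of log(pi_a / pi_b), obtained by differencing baseline logits."""
    (alpha_a, beta_a), (alpha_b, beta_b) = coef[a], coef[b]
    return alpha_a - alpha_b, beta_a - beta_b

alpha_12, beta_12 = pairwise_logit(1, 2)
print(round(alpha_12, 3), round(beta_12, 3))
```

Refitting the model with a different base outcome would give these same pairwise estimates directly, which is the invariance noted for simultaneous fitting.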
$H_0: \beta_j = 0$ for j = 1, 2
LR = 16.8, df = 2, p = 0.0002
Strong evidence that alligator length affects food choice
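The likelihood-ratio statistic, its p-value, and the pseudo R2 can all be recomputed from the first and last iteration log likelihoods. This is a sketch under one convenience assumption: for a chi-square distribution with exactly 2 degrees of freedom the upper-tail probability is exp(-x/2), so no statistics library is needed. That shortcut does not apply to other degrees of freedom.

```python
import math

ll_null = -57.570928    # iteration 0: intercepts-only model
ll_model = -49.170622   # final iteration: model with size

# Likelihood-ratio statistic comparing the two models.
lr = 2.0 * (ll_model - ll_null)

# Upper-tail chi-square probability; this closed form is valid for df = 2 only.
p_value = math.exp(-lr / 2.0)

# McFadden's pseudo R2, the quantity Stata labels "Pseudo R2".
pseudo_r2 = 1.0 - ll_model / ll_null

print(round(lr, 2), round(p_value, 4), round(pseudo_r2, 4))
```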
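Estimated Probabilities
$$\pi_j = \frac{\exp(\alpha_j + \beta_j x)}{\sum_{h=1}^{J} \exp(\alpha_h + \beta_h x)}$$
with $\alpha_J = \beta_J = 0$ for the baseline category.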
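For the alligator data:
$$\hat\pi_1 = \frac{\exp(1.62 - 0.11x)}{1 + \exp(1.62 - 0.11x) + \exp(5.70 - 2.47x)}$$
$$\hat\pi_2 = \frac{\exp(5.70 - 2.47x)}{1 + \exp(1.62 - 0.11x) + \exp(5.70 - 2.47x)}$$
$$\hat\pi_3 = \frac{1}{1 + \exp(1.62 - 0.11x) + \exp(5.70 - 2.47x)}$$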
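The estimated probabilities for the alligator example can be evaluated at any value of the predictor. The sketch below uses the unrounded Stata coefficients; the function name `food_probabilities` and the evaluation point x = 2.0 are hypothetical choices made for illustration.

```python
import math

# (intercept, slope) for categories 1, 2, 3; category 3 is the baseline.
ALPHA = (1.617731, 5.697444, 0.0)
BETA = (-0.110109, -2.465446, 0.0)

def food_probabilities(x):
    """Estimated probabilities of the three food categories at size x."""
    terms = [math.exp(a + b * x) for a, b in zip(ALPHA, BETA)]
    total = sum(terms)   # the baseline contributes exp(0) = 1
    return [t / total for t in terms]

probs = food_probabilities(2.0)   # x = 2.0 is a hypothetical size value
print([round(p, 3) for p in probs])
print(round(sum(probs), 6))       # the three probabilities sum to 1
```

Because the slope for category 2 is strongly negative, its probability falls as x grows while the other two rise.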
. mlogit satisfaction income gender [weight=count], b(1)
(frequency weights assumed)

Iteration 0:   log likelihood = -107.39082
Iteration 1:   log likelihood = -103.35145
Iteration 2:   log likelihood = -102.92608
Iteration 3:   log likelihood = -102.91365
Iteration 4:   log likelihood = -102.91362
Iteration 5:   log likelihood = -102.91362

Multinomial logistic regression                   Number of obs   =        104
                                                  LR chi2(6)      =       8.95
                                                  Prob > chi2     =     0.1762
Log likelihood = -102.91362                       Pseudo R2       =     0.0417

------------------------------------------------------------------------------
satisfaction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |  (base outcome)
-------------+----------------------------------------------------------------
2            |
      income |   .9239423   .7752856     1.19   0.233    -.5955895    2.443474
      gender |   .1239678   1.317757     0.09   0.925    -2.458788    2.706724
       _cons |   -.583335   1.990687    -0.29   0.769    -4.485009    3.318339
-------------+----------------------------------------------------------------
3            |
      income |   1.157282   .7388206     1.57   0.117    -.2907792    2.605344
      gender |    .005601    1.22245     0.00   0.996    -2.390357    2.401559
       _cons |   .5385145   1.842591     0.29   0.770    -3.072897    4.149926
-------------+----------------------------------------------------------------
4            |
      income |   1.560782   .7659445     2.04   0.042     .0595581    3.062005
      gender |   .1884805   1.286052     0.15   0.883    -2.332134    2.709095
       _cons |   -1.81048   1.977129    -0.92   0.360    -5.685582    2.064621
------------------------------------------------------------------------------

. estat ic

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
           . |    104   -107.3908   -102.9136      9     223.8272    247.6267
-----------------------------------------------------------------------------
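$$\log\frac{P(Y \le j)}{1 - P(Y \le j)} = \alpha_j + \beta_1 I + \beta_2 G, \quad j = 1, 2, \dots, J-1$$
I = Income, G = Gender
Notice that $\beta$ does not have a subscript j, which implies that the
value of $\beta$ is constant for all J-1 cumulative logits.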
When the model fits well, a single parameter instead of J-1
parameters is enough to describe the effect of x.
The curves of each cumulative probability have the same
shape/slope/rate of change but different start and end points
depending on j.
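The log cumulative odds ratio comparing $x = x_2$ with $x = x_1$ is:
$$\log\frac{P(Y \le j \mid x = x_2)/P(Y > j \mid x = x_2)}{P(Y \le j \mid x = x_1)/P(Y > j \mid x = x_1)} = (\alpha_j + \beta x_2) - (\alpha_j + \beta x_1) = \beta(x_2 - x_1)$$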
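The proportional odds property can be checked numerically: the log cumulative odds ratio between two values of x is the same at every cutpoint j. The cutpoints and slope below are hypothetical values chosen for illustration, not estimates from the slides.

```python
import math

ALPHAS = [-1.0, 0.5, 2.0]   # hypothetical cutpoints alpha_1 < alpha_2 < alpha_3 (J = 4)
BETA = 0.7                  # hypothetical common slope

def cumulative_prob(j, x):
    """P(Y <= j | x) for j = 1..J-1 under logit[P(Y <= j)] = alpha_j + beta * x."""
    z = ALPHAS[j - 1] + BETA * x
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_log_odds_ratio(j, x1, x2):
    """Log of the cumulative odds at x2 divided by the cumulative odds at x1."""
    def log_odds(x):
        p = cumulative_prob(j, x)
        return math.log(p / (1.0 - p))
    return log_odds(x2) - log_odds(x1)

x1, x2 = 1.0, 3.0
ratios = [cumulative_log_odds_ratio(j, x1, x2) for j in (1, 2, 3)]
print([round(r, 6) for r in ratios])   # each equals BETA * (x2 - x1)
```

The intercepts cancel in the subtraction, so only the common slope and the distance between the two x values remain.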
Estimated Probabilities
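The model expression for the cumulative probabilities is:
$$P(Y \le j) = \frac{\exp(\alpha_j + \beta x)}{1 + \exp(\alpha_j + \beta x)}, \quad j = 1, \dots, J-1$$
The probability of each category is the difference of adjacent cumulative probabilities, with $P(Y \le 0) = 0$ and $P(Y \le J) = 1$:
$$\pi_j = P(Y = j) = P(Y \le j) - P(Y \le j-1)$$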
For example,
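With hypothetical cutpoints and slope (not values from the slides), category probabilities can be obtained by differencing the cumulative probabilities. The function name `category_probs` is an illustrative choice.

```python
import math

def category_probs(alphas, beta, x):
    """Return [P(Y=1), ..., P(Y=J)] given increasing cutpoints alphas (length J-1)."""
    # Cumulative probabilities P(Y <= j) for j = 1..J-1, then P(Y <= J) = 1.
    cum = [1.0 / (1.0 + math.exp(-(a + beta * x))) for a in alphas]
    cum.append(1.0)
    probs, prev = [], 0.0   # P(Y <= 0) = 0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical parameters: J = 4 categories, evaluated at x = 1.0.
probs = category_probs([-1.0, 0.5, 2.0], 0.7, 1.0)
print([round(p, 3) for p in probs])
```

The cutpoints must be increasing; otherwise some differences would be negative and would not be valid probabilities.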
Example: Cont.
Ensure that the dataset is in case (expanded) form, with one row per
observation, before using polr.
The exponentiated coefficients in the last output are called proportional odds ratios.
For pared, the odds of "very likely" applying versus "somewhat likely" or "unlikely"
applying combined are 3.07 times as high among students with high parental education,
given that all the other variables in the model are held constant.
Likewise, the odds of "very likely" or "somewhat likely" applying versus "unlikely"
applying are 3.07 times as high among students with high parental education,
given that all of the other variables in the model are held constant.
Example: Cont.
The exponentiated coefficients in the last output are called proportional odds ratios.
For pared, the odds of "very likely" applying versus "somewhat likely" or "unlikely"
applying combined are 2.85 times as high among students with high parental education,
given that all the other variables in the model are held constant.
Likewise, the odds of "very likely" or "somewhat likely" applying versus "unlikely"
applying are 2.85 times as high among students with high parental education,
given that all of the other variables in the model are held constant.
For gpa, when a student's gpa increases by 1 unit, the odds of moving from "unlikely"
applying to "somewhat likely" or "very likely" applying (or from the lower and
middle categories to the high category) are multiplied by 1.85.
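Because an odds ratio is the exponential of a log-odds coefficient, the effect of a change other than one unit follows by scaling the coefficient before exponentiating. This sketch backs the coefficient out of the reported odds ratio of 1.85 for gpa; the helper name `odds_multiplier` is hypothetical.

```python
import math

odds_ratio_gpa = 1.85                 # proportional odds ratio for gpa, from the text
coef_gpa = math.log(odds_ratio_gpa)   # implied log-odds coefficient

def odds_multiplier(coef, delta):
    """Factor by which the cumulative odds change when the predictor changes by delta."""
    return math.exp(coef * delta)

print(round(odds_multiplier(coef_gpa, 1.0), 2))   # one-unit increase
print(round(odds_multiplier(coef_gpa, 0.5), 2))   # half-unit increase
```

A half-unit increase multiplies the odds by the square root of 1.85, not by half of it, since effects are additive on the log-odds scale.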
Invariance
Invariance to choice of response categories
Situation: Researcher A used a 5-point Likert scale (SD, D, N, A,
SA). Researcher B conducted a similar study but used a 3-point
Likert scale (D, N, A). If the proportional odds assumption holds,
the parameters for the effect of a predictor are roughly
the same in both studies.
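The invariance can be illustrated with a small simulation-free check. The sketch assumes that the 3-point scale arises by merging SD with D and A with SA, and it uses hypothetical cutpoints and slope; none of these values come from the slides.

```python
import math

ALPHAS5 = [-2.0, -0.5, 0.5, 2.0]   # hypothetical cutpoints for the 5-point scale
BETA = 0.8                         # hypothetical effect of the predictor

def probs5(x):
    """Category probabilities (SD, D, N, A, SA) under the cumulative logit model."""
    cum = [1.0 / (1.0 + math.exp(-(a + BETA * x))) for a in ALPHAS5] + [1.0]
    return [c - p for c, p in zip(cum, [0.0] + cum[:-1])]

def collapse(p):
    """Merge SD with D and A with SA (an assumed mapping to the 3-point scale)."""
    sd, d, n, a, sa = p
    return [sd + d, n, a + sa]

def cumulative_logits(p):
    """Cumulative logits for every cutpoint of a probability vector p."""
    out, running = [], 0.0
    for pj in p[:-1]:
        running += pj
        out.append(math.log(running / (1.0 - running)))
    return out

x1, x2 = 0.0, 1.5
before = cumulative_logits(collapse(probs5(x1)))
after = cumulative_logits(collapse(probs5(x2)))
effects = [b - a for a, b in zip(before, after)]
print([round(e, 6) for e in effects])   # each equals BETA * (x2 - x1)
```

The collapsed scale keeps a subset of the original cutpoints, so the slope is unchanged; with real data the two estimates would agree only approximately because of sampling variation.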
ADJACENT-CATEGORIES LOGITS
The adjacent-category logits are:
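One common way to write them, assuming the convention that compares each category with the one just below it (some texts use the reciprocal ratio, which only flips the sign), is sketched here. The symbol $L_j$ is introduced for illustration.

```latex
L_j = \log\frac{\pi_{j+1}}{\pi_j}, \qquad j = 1, \dots, J-1

% Under this convention the baseline-category logits are sums of adjacent ones:
\log\frac{\pi_j}{\pi_J} = -\sum_{k=j}^{J-1} L_k
```

Since either set of J - 1 logits determines the other, adjacent-category and baseline-category logits describe the same set of probabilities.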
Example
Stem Cell Research and Religious Fundamentalism
Example: Cont.
Example
Tonsil Size and Streptococcus
Example: Cont.