
LOGISTIC REGRESSION, POISSON REGRESSION AND GENERALIZED LINEAR MODELS
We have seen that a continuous response Y can depend on continuous or discrete predictor variables X_1, X_2, ..., X_{p-1}. However, a dichotomous (binary) outcome is the most common situation in biology and epidemiology.
Example: In a longitudinal study of coronary heart disease as a function of age, the response variable Y was defined to have two possible outcomes: the person developed heart disease during the study, or the person did not develop heart disease during the study. These outcomes may be coded 1 and 0, respectively.
Logistic regression
Age CD Age CD Age CD
22 0 40 0 54 0
23 0 41 1 55 1
24 0 46 0 58 1
27 0 47 0 60 1
28 0 48 0 60 0
30 0 49 1 62 1
30 0 49 0 65 1
32 0 50 1 67 1
33 0 51 0 71 1
35 1 51 1 77 1
38 0 52 0 81 1

Age and signs of coronary heart disease (CD)
Prevalence (%) of signs of CD according to age group:

                        Diseased
Age group  # in group    #     %
 20-29          5        0     0
 30-39          6        1    17
 40-49          7        2    29
 50-59          7        4    57
 60-69          5        4    80
 70-79          2        2   100
 80-89          1        1   100

[Figure: % diseased plotted against age (years), rising in an S-shape]
The simple linear regression model:

Y_i = β_0 + β_1 X_i + ε_i,   Y_i = 0, 1

The response function:

E{Y_i} = β_0 + β_1 X_i
We view Y_i as a random variable with a Bernoulli distribution with parameter π_i:

Y_i    Prob(Y_i)
 1       π_i
 0      1 - π_i

P(Y_i = 1) = π_i
P(Y_i = 0) = 1 - π_i
P(Y_i = k) = π_i^k (1 - π_i)^(1-k),   k = 0, 1

E{Y_i} = 1·π_i + 0·(1 - π_i) = π_i
Special Problems When Response Variable Is Binary

1. Nonnormal Error Terms

When Y_i = 1:  ε_i = 1 - β_0 - β_1 X_i
When Y_i = 0:  ε_i = -β_0 - β_1 X_i

Can we assume the ε_i are normally distributed?
2. Nonconstant Error Variance

σ²{ε_i} = (β_0 + β_1 X_i)(1 - β_0 - β_1 X_i)

Ordinary least squares is no longer optimal.
3. Constraints on Response Function

0 ≤ E{Y_i} ≤ 1

What does E{Y_i} mean?

E{Y_i} = β_0 + β_1 X_i = π_i

E{Y_i} is the probability that Y_i = 1 when the level of the predictor variable is X_i. This interpretation applies whether the response function is a simple linear one, as shown above, or a complex multiple regression one.
The logistic function

P(Y = 1) = exp(β_0 + β_1 x) / [1 + exp(β_0 + β_1 x)]

[Figure: probability of disease as an S-shaped function of x]

Both theoretical and empirical results suggest that when the response variable is binary, the shape of the response function is either a tilted S or a reverse tilted S.
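The logistic response function can be sketched numerically. A minimal Python illustration (the notes use SAS for analysis; the coefficient values here are made up for the sketch, not taken from the text):

```python
import math

def logistic(x, b0, b1):
    # P(Y = 1) = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

# With b1 > 0 the curve is a tilted S: bounded by 0 and 1, increasing in x
probs = [logistic(x, -3.0, 0.16) for x in range(0, 90, 10)]
```

With b1 < 0 the same function traces the reverse tilted S.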
Simple Logistic Regression
1. Model: Y_i = E{Y_i} + ε_i

where the Y_i are independent Bernoulli random variables with

E{Y_i} = π_i = exp(β_0 + β_1 X_i) / [1 + exp(β_0 + β_1 X_i)]
2. How to estimate β_0 and β_1?

a. Likelihood Function:

Since the Y_i observations are independent, their joint probability function is:

g(Y_1, ..., Y_n) = ∏_{i=1}^{n} π_i^{Y_i} (1 - π_i)^{1-Y_i}

The logarithm of the joint probability function (log-likelihood function) is:

log_e L(β_0, β_1) = log_e g(Y_1, ..., Y_n)
                  = ∑_{i=1}^{n} [Y_i log_e(π_i / (1 - π_i))] + ∑_{i=1}^{n} log_e(1 - π_i)
                  = ∑_{i=1}^{n} Y_i (β_0 + β_1 X_i) - ∑_{i=1}^{n} log_e[1 + exp(β_0 + β_1 X_i)]
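The final form of the log-likelihood can be evaluated directly. A minimal Python sketch (the six (X, Y) pairs are the visible rows of the programming-task table later in these notes):

```python
import math

def log_likelihood(b0, b1, xs, ys):
    # log L(b0, b1) = sum Y_i*(b0 + b1*X_i) - sum log(1 + exp(b0 + b1*X_i))
    total = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        total += y * z - math.log(1.0 + math.exp(z))
    return total

xs = [14, 29, 6, 28, 22, 8]   # months of experience (visible rows)
ys = [0, 0, 0, 1, 1, 1]       # task success
ll = log_likelihood(-3.06, 0.16, xs, ys)
```

Each term is negative (y*z - log(1 + e^z) < 0 for y = 0 or 1), so the log-likelihood is always below zero; maximization searches for the least negative value.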
b. Maximum Likelihood Estimation:

The logit transformation is linear in the parameters:

log_e(π_i / (1 - π_i)) = β_0 + β_1 X_i

[Figure: log-likelihood plotted over candidate parameter values]
The maximum likelihood estimates of β_0 and β_1 in the simple logistic regression model are those values of β_0 and β_1 that maximize the log-likelihood function. However, no closed-form solution exists for the values of β_0 and β_1 that maximize the log-likelihood function. Several computer-intensive numerical search procedures are widely used to find the maximum likelihood estimates b_0 and b_1. We shall rely on standard statistical software programs specifically designed for logistic regression to obtain the maximum likelihood estimates b_0 and b_1.
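The numerical search can be sketched in a few lines. A minimal Python sketch (the notes use SAS; here scipy's general-purpose BFGS optimizer stands in for the specialized routines, and the data are the six visible rows of the task-success table below):

```python
import math
from scipy.optimize import minimize

# Six (X, Y) pairs shown in the programming-task example
xs = [14, 29, 6, 28, 22, 8]
ys = [0, 0, 0, 1, 1, 1]

def neg_log_likelihood(beta):
    # Negative of: sum Y_i*(b0 + b1*X_i) - sum log(1 + exp(b0 + b1*X_i))
    b0, b1 = beta
    ll = sum(y * (b0 + b1 * x) - math.log(1.0 + math.exp(b0 + b1 * x))
             for x, y in zip(xs, ys))
    return -ll

# Numerical search for the maximum likelihood estimates
res = minimize(neg_log_likelihood, x0=[0.0, 0.0], method="BFGS")
b0_hat, b1_hat = res.x
```

Because the log-likelihood is concave, any local maximum the search finds is the global one.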
3. Fitted Logit Response Function

log_e(π̂_i / (1 - π̂_i)) = b_0 + b_1 X_i
4. Interpretation of b_1

log_e(π̂ / (1 - π̂)) = b_0 + b_1 X

When X = X_j:

odds_1 = π̂_{X_j} / (1 - π̂_{X_j}) = exp(b_0 + b_1 X_j)

When X = X_j + 1:

odds_2 = π̂_{X_j + 1} / (1 - π̂_{X_j + 1}) = exp(b_0 + b_1 (X_j + 1))

The odds ratio:

OR = odds_2 / odds_1 = e^{b_1},   so   log_e OR = b_1

b_1 = increase in log-odds for a one-unit increase in X
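The cancellation that makes the odds ratio e^{b_1} at any level of X can be checked numerically. A short Python sketch (the coefficient values are the estimates reported in the worked example that follows; X_j = 25 is arbitrary):

```python
import math

# Estimates reported in the task-success example below
b0, b1 = -3.0597, 0.1615

def odds(x):
    # odds at level x: exp(b0 + b1*x)
    return math.exp(b0 + b1 * x)

# The ratio is the same at any X because b0 and b1*X_j cancel
odds_ratio = odds(25.0 + 1.0) / odds(25.0)   # equals e^{b1}
```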
Assumption

[Figure: π plotted against the predictor (S-shaped curve), and the logit transform plotted against the predictor (straight line)]
Example:

Y = 1 if the task was finished
    0 if the task wasn't finished
X = months of programming experience

Person   Months of        Task         Fitted       Deviance
  i      Experience Xi    Success Yi   Value π̂_i    Residual dev_i
  1          14               0          0.310        -.862
  2          29               0          0.835       -1.899
  3           6               0          0.110        -.483
  .           .               .            .             .
 23          28               1          0.812         .646
 24          22               1          0.621         .976
 25           8               1          0.146        1.962
SAS CODE:
proc logistic data = ch14ta01 ;
model y (event='1') = x ;
run;
Notice that we can specify which event to model using the event= option in the model statement. The other way of specifying that we want to model 1 as the event instead of 0 is to use the descending option in the proc logistic statement.
SAS OUTPUT:
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -3.0597 1.2594 5.9029 0.0151
x 1 0.1615 0.0650 6.1760 0.0129
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
x 1.175 1.035 1.335
How do we use the output to calculate π̂_1? How do we interpret π̂_1 = 0.310?
Interpretation of Odds Ratio
OR=1.175 means that the odds
of completing the task increase by 17.5
percent with each additional month
of experience.
Interpretation of b_1

b_1 = 0.1615 means that the log-odds of completing the task increase by 0.1615 with each additional month of experience.
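The fitted value for person 1 can be reproduced from the output by plugging the estimates into the response function. A short Python sketch:

```python
import math

b0, b1 = -3.0597, 0.1615   # estimates from the SAS output above
x_person1 = 14             # months of experience for person 1

z = b0 + b1 * x_person1
pi_hat_1 = math.exp(z) / (1.0 + math.exp(z))   # fitted value, about 0.310
```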
4. Repeat Observations - Binomial Outcomes

In some cases, particularly for designed experiments, a number of repeat observations are obtained at several levels of the predictor variable X. For example, in a study of the effectiveness of coupons offering a price reduction on a given product, 1000 homes were selected at random. The coupons offered different price reductions (5, 10, 15, 20, and 30 dollars), and 200 homes were assigned at random to each of the price reduction categories.
        Price        Number of    Number of     Proportion of   Model-Based
Level   Reduction    Households   Coupons       Coupons         Estimate
  j     Xj           nj           Redeemed Y.j  Redeemed pj     π̂_j
  1        5            200           30            .150          .1736
  2       10            200           55            .275          .2543
  3       15            200           70            .350          .3562
  4       20            200          100            .500          .4731
  5       30            200          137            .685          .7028

Y_ij = 1 if the ith household redeemed the coupon at level X_j, 0 otherwise;
i = 1, ..., n_j;  j = 1, 2, 3, 4, 5

Y.j = ∑_{i=1}^{n_j} Y_ij,   p_j = Y.j / n_j

The random variable Y.j has a binomial distribution given by:

f(Y.j) = (n_j choose Y.j) π_j^{Y.j} (1 - π_j)^{n_j - Y.j},
where (n_j choose Y.j) = n_j! / [Y.j! (n_j - Y.j)!]

The log-likelihood function:

log_e L(β_0, β_1) = ∑_{j=1}^{c} [ log_e(n_j choose Y.j) + Y.j (β_0 + β_1 X_j) - n_j log_e(1 + exp(β_0 + β_1 X_j)) ]

where c is the number of levels (here c = 5).
j
SAS CODE:
data ch14ta02;
infile 'c:\stat231B06\ch14ta02.txt';
input x n y pro;
proc logistic data=ch14ta02;
model y/n=x;
/*request estimates of the predicted*/
/*values to be stored in a file named */
/*estimates under the variable name pie*/
output out=estimates p=pie;
run;
proc print data=estimates;
run;
SAS OUTPUT:
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.0443 0.1610 161.2794 <.0001
x 1 0.0968 0.00855 128.2924 <.0001
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
x 1.102 1.083 1.120


Obs x n y pro pie
1 5 200 30 0.150 0.17362
2 10 200 55 0.275 0.25426
3 15 200 70 0.350 0.35621
4 20 200 100 0.500 0.47311
5 30 200 137 0.685 0.70280
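The model-based estimates π̂_j in the last column can be reproduced by maximizing the binomial log-likelihood directly. A minimal Python sketch using Newton-Raphson (the choice of search algorithm is an assumption for illustration; SAS's internal routine may differ):

```python
import math

# Grouped coupon data from the table above
xs = [5, 10, 15, 20, 30]         # price reduction X_j
ns = [200, 200, 200, 200, 200]   # households n_j
ys = [30, 55, 70, 100, 137]      # coupons redeemed Y.j

b0, b1 = 0.0, 0.0
for _ in range(25):  # Newton-Raphson: beta_new = beta + I(beta)^-1 U(beta)
    u0 = u1 = i00 = i01 = i11 = 0.0
    for x, n, y in zip(xs, ns, ys):
        p = math.exp(b0 + b1 * x) / (1.0 + math.exp(b0 + b1 * x))
        w = n * p * (1.0 - p)
        u0 += y - n * p          # score vector components
        u1 += (y - n * p) * x
        i00 += w                 # information matrix entries
        i01 += w * x
        i11 += w * x * x
    det = i00 * i11 - i01 * i01
    b0 += (i11 * u0 - i01 * u1) / det
    b1 += (i00 * u1 - i01 * u0) / det

# Model-based estimate at X = 5 (compare with .17362 in the output)
p_hat_5 = math.exp(b0 + b1 * 5) / (1.0 + math.exp(b0 + b1 * 5))
```

The converged values should agree with the SAS estimates b_0 = -2.0443 and b_1 = 0.0968.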
Multiple Logistic Regression

1. Model: Y_i = E{Y_i} + ε_i

where the Y_i are independent Bernoulli random variables with

E{Y_i} = π_i = exp(X_i'β) / [1 + exp(X_i'β)]
2. How to estimate the vector β?

log_e L(β) = ∑_{i=1}^{n} Y_i (X_i'β) - ∑_{i=1}^{n} log_e[1 + exp(X_i'β)]
3. Fitted Logit Response Function

log_e(π̂_i / (1 - π̂_i)) = X_i'b
Example:

Y = 1 if disease present, 0 if disease absent
X_1 = age
X_2 = 1 if middle class, 0 otherwise   (socioeconomic status)
X_3 = 1 if lower class, 0 otherwise    (socioeconomic status)
X_4 = 1 if city sector 2, 0 if city sector 1

Study purpose: assess the strength of the association between each of the predictor variables and the probability of a person having contracted the disease.
SAS CODE:
data ch14ta03;
infile 'c:\stat231B06\ch14ta03.txt'
DELIMITER='09'x;
input case x1 x2 x3 x4 y;
proc logistic data=ch14ta03;
model y (event='1')=x1 x2 x3 x4;
run;
Case   Age    Socioeconomic     City         Disease     Fitted
  i    Xi1    Status Xi2 Xi3    Sector Xi4   Status Yi   Value π̂_i
  1    33         0   0            0            0          .209
  2    35         0   0            0            0          .219
  3     6         0   0            0            0          .106
  4    60         0   0            0            0          .371
  5    18         0   1            0            1          .111
  6    26         0   1            0            0          .136
  .     .         .   .            .            .            .
 98    35         0   1            0            0          .171
SAS OUTPUT:
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.3127 0.6426 12.9545 0.0003
x1 1 0.0297 0.0135 4.8535 0.0276
x2 1 0.4088 0.5990 0.4657 0.4950
x3 1 -0.3051 0.6041 0.2551 0.6135
x4 1 1.5746 0.5016 9.8543 0.0017
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
x1 1.030 1.003 1.058
x2 1.505 0.465 4.868
x3 0.737 0.226 2.408
x4 4.829 1.807 12.907
The odds of a person having contracted the disease increase by about 3.0 percent with each additional year of age (X1), for given socioeconomic status and city sector location. The odds of a person in sector 2 (X4) having contracted the disease are almost five times as great as for a person in sector 1, for given age and socioeconomic status.
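These interpretations come from exponentiating the coefficients in the output above; a short Python sketch:

```python
import math

# Estimated coefficients from the SAS output above
coefs = {"x1": 0.0297, "x2": 0.4088, "x3": -0.3051, "x4": 1.5746}

# Each odds ratio exp(b_k) is the multiplicative change in the odds
# per unit increase in X_k, holding the other predictors fixed
odds_ratios = {name: math.exp(b) for name, b in coefs.items()}
```

For x1, exp(0.0297) ≈ 1.030, the "3.0 percent per year" statement; for x4, exp(1.5746) ≈ 4.83, the "almost five times" statement.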
Polynomial Logistic Regression

1. Model: Y_i = E{Y_i} + ε_i

where the Y_i are independent Bernoulli random variables with

E{Y_i} = π_i = exp(X_i'β) / [1 + exp(X_i'β)]

and the logit is a polynomial in the predictor:

π_i' = log_e(π_i / (1 - π_i)) = β_0 + β_11 x_i + β_22 x_i² + ... + β_kk x_i^k

where x denotes the centered predictor, x = X - X̄.
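Building the centered polynomial terms is simple; centering reduces the correlation between x and x². A minimal Python sketch (the predictor values are hypothetical, not from the text):

```python
# Centered polynomial terms x = X - mean(X) for a quadratic logit
X = [13.5, 15.0, 16.7, 18.2, 19.9]      # hypothetical predictor values
xbar = sum(X) / len(X)

xcnt = [x - xbar for x in X]            # centered predictor
xcnt2 = [x * x for x in xcnt]           # squared term for the quadratic logit
```

By construction the centered values sum to zero, which is what makes the linear and quadratic terms less collinear.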
Example:

Y = 1 if the IPO was financed by venture capital funds, 0 if it wasn't
X_1 = the face value of the company

Study purpose: determine the characteristics of companies that attract venture capital.
SAS CODE:
data ipo;
infile 'c:\stat231B06\appenc11.txt';
input case vc faceval shares x3;
lnface=LOG(faceval);
run;
* Run 1st order logistic regression analysis;
proc logistic data=ipo descending;
model vc=lnface;
output out=linear p=linpie;
run;
* produce scatterplot and fitted 1st order
logistic;
data graph1;
set linear;
run;
proc sort data=graph1;
by lnface;
run;
proc gplot data=graph1;
symbol1 color=black value=none interpol=join;
symbol2 color=black value=circle;
title'Scatter Plot and 1st Order Logit Curve';
plot linpie*lnface vc*lnface/overlay;
/* /overlay means to overlay the two graph*/
run;
*Find mean of lnface=16.7088;
proc means;
var lnface;
run;
* Run 2nd order logistic regression analysis;
data step2;
set linear;
xcnt=lnface-16.708;
xcnt2=xcnt**2;
run;
proc logistic data=step2 descending;
model vc=xcnt xcnt2;
output out=estimates p=pie;
run;
* produce scatterplot and fitted 2nd order
logistic;
data graph2;
set estimates;
run;
proc sort data=graph2;
by xcnt;
run;
proc gplot data=graph2;
symbol1 color=black value=none interpol=join;
symbol2 color=black value=circle;
title'Scatter Plot and 2nd Order Logit Curve';
plot pie*xcnt vc*xcnt/overlay;
/* /overlay means to overlay the two graph*/
run;
[Figure: estimated probability versus lnface for the 1st-order fit, and estimated probability versus xcnt for the 2nd-order fit]
1. The natural logarithm of face value is chosen because face value ranges over several orders of magnitude, with a highly skewed distribution.
2. The lowess smooth clearly suggests a mound-shaped relationship.
SAS OUTPUT:
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 0.3000 0.1240 5.8566 0.0155
xcnt 1 0.5530 0.1385 15.9407 <.0001
xcnt2 1 -0.8615 0.1404 37.6504 <.0001
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
xcnt 1.739 1.325 2.281
xcnt2 0.423 0.321 0.556
Inferences about Regression Parameters
1. Test Concerning a Single β_k: Wald Test

Hypothesis: H_0: β_k = 0 vs. H_a: β_k ≠ 0

Test Statistic: z* = b_k / s{b_k}

Decision rule: If |z*| ≤ z(1 - α/2), conclude H_0.
               If |z*| > z(1 - α/2), conclude H_a.

where z(1 - α/2) is the (1 - α/2) percentile of the standard normal distribution.
Note: Approximate joint confidence intervals for several logistic regression model parameters can be developed by the Bonferroni procedure. If g parameters are to be estimated with family confidence coefficient of approximately 1 - α, the joint Bonferroni confidence limits are b_k ± B s{b_k}, where B = z(1 - α/2g).
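As a numeric illustration of the Bonferroni multiplier (g = 2 parameters at family level 0.95; the estimate and standard error are borrowed from the task-success example purely for illustration):

```python
from statistics import NormalDist

alpha, g = 0.05, 2                                    # family level; number of parameters
B = NormalDist().inv_cdf(1.0 - alpha / (2.0 * g))     # B = z(1 - alpha/(2g))

# Joint limits b_k +/- B * s{b_k}
b, s = 0.1615, 0.0650                                 # illustrative estimate and SE
limits = (b - B * s, b + B * s)
```

Note that B ≈ 2.24 is wider than the single-interval multiplier z(0.975) ≈ 1.96, the price of the family guarantee.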
2. Interval Estimation of a Single β_k

The approximate 1 - α confidence limits for β_k:

b_k ± z(1 - α/2) s{b_k}

The corresponding confidence limits for the odds ratio exp(β_k):

exp[b_k ± z(1 - α/2) s{b_k}]
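These limits can be checked by hand; a short Python sketch using b_1 = 0.1615 and s{b_1} = 0.0650 from the example that follows:

```python
import math
from statistics import NormalDist

b1, s_b1 = 0.1615, 0.0650          # b_1 and s{b_1} from the example below
z = NormalDist().inv_cdf(0.975)    # z(1 - alpha/2) for alpha = 0.05

ci_beta = (b1 - z * s_b1, b1 + z * s_b1)       # limits for beta_1
ci_or = tuple(math.exp(v) for v in ci_beta)    # limits for the odds ratio
```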
Example:

Y = 1 if the task was finished
    0 if the task wasn't finished
X = months of programming experience

Person   Months of        Task         Fitted       Deviance
  i      Experience Xi    Success Yi   Value π̂_i    Residual dev_i
  1          14               0          0.310        -.862
  2          29               0          0.835       -1.899
  3           6               0          0.110        -.483
  .           .               .            .             .
 23          28               1          0.812         .646
 24          22               1          0.621         .976
 25           8               1          0.146        1.962
SAS CODE:
proc logistic data=ch14ta01 ;
model y (event='1')=x /cl;
run ;
Notice that (1) we can specify cl in the model statement to get the output for the interval estimates for β_0, β_1, etc. (2) The test for β_1 is a two-sided test. For a one-sided test, we simply divide the p-value (0.0129) by 2. This yields the one-sided p-value of 0.0065. (3) The text authors report z* = 2.485, and the square of z* is equal to the Wald chi-square statistic 6.176, which is distributed approximately as a chi-square distribution with df = 1.
SAS OUTPUT:
The LOGISTIC Procedure
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -3.0597 1.2594 5.9029 0.0151
x 1 0.1615 0.0650 6.1760 0.0129
H_0: β_1 ≤ 0 vs. H_a: β_1 > 0
For α = 0.05: since the one-sided p-value = 0.0065 < 0.05, we conclude H_a, that β_1 is positive.
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
x 1.175 1.035 1.335

Wald Confidence Interval for Parameters
Parameter Estimate 95% Confidence Limits
Intercept -3.0597 -5.5280 -0.5914
x 0.1615 0.0341 0.2888
We conclude with approximately 95% confidence that β_1 is between 0.0341 and 0.2888. The corresponding 95% confidence limits for the odds ratio are exp(.0341) = 1.03 and exp(.2888) = 1.33.
3. Test Whether Several β_k = 0: Likelihood Ratio Test

Hypothesis: H_0: β_q = β_{q+1} = ... = β_{p-1} = 0
            H_a: not all of the β_k in H_0 equal zero

Full Model:

π = exp(X'β_F) / [1 + exp(X'β_F)],   X'β_F = β_0 + β_1 X_1 + ... + β_{p-1} X_{p-1}

Reduced Model:

π = exp(X'β_R) / [1 + exp(X'β_R)],   X'β_R = β_0 + β_1 X_1 + ... + β_{q-1} X_{q-1}

The Likelihood Ratio Statistic:

G² = -2 log_e [L(R) / L(F)] = -2 [log_e L(R) - log_e L(F)]

Decision rule: If G² ≤ χ²(1 - α; p - q), conclude H_0.
               If G² > χ²(1 - α; p - q), conclude H_a.
Example:

Y = 1 if disease present, 0 if disease absent
X_1 = age
X_2 = 1 if middle class, 0 otherwise   (socioeconomic status)
X_3 = 1 if lower class, 0 otherwise    (socioeconomic status)
X_4 = 1 if city sector 2, 0 if city sector 1

Study purpose: assess the strength of the association between each of the predictor variables and the probability of a person having contracted the disease.

Case   Age    Socioeconomic     City         Disease     Fitted
  i    Xi1    Status Xi2 Xi3    Sector Xi4   Status Yi   Value π̂_i
  1    33         0   0            0            0          .209
  2    35         0   0            0            0          .219
  3     6         0   0            0            0          .106
  4    60         0   0            0            0          .371
  5    18         0   1            0            1          .111
  6    26         0   1            0            0          .136
  .     .         .   .            .            .            .
 98    35         0   1            0            0          .171
SAS CODE:
data ch14ta03;
infile 'c:\stat231B06\ch14ta03.txt'
DELIMITER='09'x;
input case x1 x2 x3 x4 y;
/*fit full model*/
proc logistic data=ch14ta03;
model y (event='1')=x1 x2 x3 x4;
run;
/*fit reduced model*/
proc logistic data=ch14ta03;
model y (event='1')=x2 x3 x4;
run;
SAS OUTPUT:
Full model:
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 124.318 111.054
SC 126.903 123.979
-2 Log L 122.318 101.054
Reduced model:
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 124.318 114.204
SC 126.903 124.544
-2 Log L 122.318 106.204
We use proc logistic to regress Y on X1, X2, X3 and X4 and refer to this as the full model. In the SAS output for the full model we see that the -2 Log Likelihood statistic = 101.054. We now regress Y on X2, X3 and X4 and refer to this as the reduced model. In the SAS output for the reduced model we see that the -2 Log Likelihood statistic = 106.204. Using equation (14.60), text page 581, we find G² = 106.204 - 101.054 = 5.15. For α = 0.05 we require χ²(.95, 1) = 3.84. Since our computed G² value (5.15) is greater than the critical value 3.84, we conclude H_a, that X1 should not be dropped from the model.
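The arithmetic of this test can be sketched directly. A short Python sketch (for df = 1, the chi-square critical value equals the square of the normal percentile z(0.975), which avoids needing a chi-square table):

```python
from statistics import NormalDist

neg2loglik_reduced = 106.204   # -2 log L, reduced model (X2, X3, X4)
neg2loglik_full = 101.054      # -2 log L, full model (X1, X2, X3, X4)

G2 = neg2loglik_reduced - neg2loglik_full     # likelihood ratio statistic

# chi-square(0.95; 1) = z(0.975)^2 when df = 1
critical = NormalDist().inv_cdf(0.975) ** 2
conclude_Ha = G2 > critical                   # True: X1 is not dropped
```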
4. Global Test Whether All β_k = 0: Score Chi-Square Test

Let U(β) be the vector of first partial derivatives of the log-likelihood with respect to the parameter vector β, and let H(β) be the matrix of second partial derivatives of the log-likelihood with respect to β. Let I(β) be either -H(β) or the expected value of -H(β). Consider a null hypothesis H_0, and let β̂_0 be the MLE of β under H_0. The chi-square score statistic for testing H_0 is defined by

U'(β̂_0) I(β̂_0)^{-1} U(β̂_0)

and it has an asymptotic χ² distribution with r degrees of freedom under H_0, where r is the number of restrictions imposed on β by H_0.
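The quadratic form above can be computed by hand for a small case. A minimal Python sketch of the score statistic for H_0: β_1 = 0 in a simple logistic regression, using synthetic data (the data and variable names are illustrative, not from the text); under H_0 the restricted MLE sets π equal to Ȳ for every case:

```python
# Synthetic (x, y) data for the sketch
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]

n = len(ys)
ybar = sum(ys) / n               # restricted MLE: pi_hat = ybar for all i

# Score vector U = (u0, u1) evaluated at the restricted MLE
u0 = sum(y - ybar for y in ys)   # zero by construction of the restricted MLE
u1 = sum((y - ybar) * x for x, y in zip(xs, ys))

# Information matrix I with weights pi(1 - pi), constant under H_0
w = ybar * (1.0 - ybar)
i00 = n * w
i01 = w * sum(xs)
i11 = w * sum(x * x for x in xs)

# Quadratic form U' I^{-1} U, with the 2x2 inverse written out
det = i00 * i11 - i01 * i01
score_stat = (i11 * u0 * u0 - 2 * i01 * u0 * u1 + i00 * u1 * u1) / det
```

The statistic is compared with a χ² distribution with r = 1 degree of freedom (one restriction). Note the test needs only the restricted fit, which is why SAS can report it before the full maximization.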

Example:

Y = 1 if disease present, 0 if disease absent
X_1 = age
X_2 = 1 if middle class, 0 otherwise   (socioeconomic status)
X_3 = 1 if lower class, 0 otherwise    (socioeconomic status)
X_4 = 1 if city sector 2, 0 if city sector 1

Study purpose: assess the strength of the association between each of the predictor variables and the probability of a person having contracted the disease.

Case   Age    Socioeconomic     City         Disease     Fitted
  i    Xi1    Status Xi2 Xi3    Sector Xi4   Status Yi   Value π̂_i
  1    33         0   0            0            0          .209
  2    35         0   0            0            0          .219
  3     6         0   0            0            0          .106
  4    60         0   0            0            0          .371
  5    18         0   1            0            1          .111
  6    26         0   1            0            0          .136
  .     .         .   .            .            .            .
 98    35         0   1            0            0          .171
SAS CODE:
data ch14ta03;
infile 'c:\stat231B06\ch14ta03.txt'
DELIMITER='09'x;
input case x1 x2 x3 x4 y;
proc logistic data=ch14ta03;
model y (event='1')=x1 x2 x3 x4;
run;
SAS OUTPUT:
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 21.2635 4 0.0003
Score 20.4067 4 0.0004
Wald 16.6437 4 0.0023
Since the p-value for the score test is 0.0004, we reject the null hypothesis H_0: β_1 = β_2 = β_3 = β_4 = 0. We can also use the Wald test and the likelihood ratio test to test the above null hypothesis.