Ridge Regression
Patrick Breheny
September 1

Outline: Ridge regression; Selection of λ; Ridge regression in R/SAS
The ridge regression estimate $\hat\beta^{\text{ridge}}$ is defined as the minimizer of
$$\sum_{i=1}^n (y_i - x_i^T \beta)^2 + \lambda \sum_{j=1}^p \beta_j^2$$
The solution is
$$\hat\beta^{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T y$$
Note the similarity to the ordinary least squares solution $\hat\beta^{\text{OLS}} = (X^T X)^{-1} X^T y$, but with the addition of a "ridge" down the diagonal.
Corollary: As $\lambda \to 0$, $\hat\beta^{\text{ridge}} \to \hat\beta^{\text{OLS}}$.
Corollary: As $\lambda \to \infty$, $\hat\beta^{\text{ridge}} \to 0$.
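To make the two limiting corollaries concrete, here is a minimal Python sketch (not from the slides; the data and the `ridge` helper are made up) that computes $(X^TX + \lambda I)^{-1}X^Ty$ by hand for $p = 2$ predictors:

```python
# Closed-form ridge estimate for a two-column design matrix,
# using an explicit symmetric 2x2 matrix inverse (illustrative only).

def ridge(X, y, lam):
    """Return (X'X + lam*I)^{-1} X'y for a list-of-rows X with 2 columns."""
    # Entries of A = X'X + lam*I
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    # Entries of X'y
    c0 = sum(r[0] * yi for r, yi in zip(X, y))
    c1 = sum(r[1] * yi for r, yi in zip(X, y))
    # Solve the 2x2 system via the cofactor inverse
    det = a * d - b * b
    return [(d * c0 - b * c1) / det, (a * c1 - b * c0) / det]

X = [[1.0, 0.5], [0.8, -0.3], [-0.4, 1.1], [0.2, 0.7]]   # made-up data
y = [1.2, 0.6, 0.3, 0.8]

b_ols = ridge(X, y, 0.0)    # lambda -> 0 recovers the OLS estimate
b_big = ridge(X, y, 1e6)    # a huge lambda shrinks both coefficients toward 0
```

Increasing $\lambda$ never increases the norm of the estimate, which is the shrinkage behavior the corollaries describe.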
Invertibility
Recall from BST 760 that the ordinary least squares estimates do not always exist: if $X$ is not full rank, $X^T X$ is not invertible and there is no unique solution for $\hat\beta^{\text{OLS}}$.
This problem does not occur with ridge regression, however.
Theorem: For any design matrix $X$, the quantity $X^T X + \lambda I$ is always invertible; thus, there is always a unique solution $\hat\beta^{\text{ridge}}$.
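A small numerical illustration of the theorem (made-up data, not from the slides): when one column of $X$ duplicates another, $X^TX$ is singular, yet $X^TX + \lambda I$ has a positive determinant for any $\lambda > 0$.

```python
# Two identical columns make X'X rank-deficient; adding lam*I repairs it.

X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]   # second column duplicates the first

def xtx_plus(lam):
    """Return X'X + lam*I as a 2x2 list of lists."""
    s00 = sum(r[0] * r[0] for r in X)
    s01 = sum(r[0] * r[1] for r in X)
    s11 = sum(r[1] * r[1] for r in X)
    return [[s00 + lam, s01], [s01, s11 + lam]]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

d0 = det2(xtx_plus(0.0))   # 0.0: the OLS normal equations have no unique solution
d1 = det2(xtx_plus(0.1))   # positive: the ridge system is invertible
```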
Existence theorem
Bayesian interpretation
If $y \sim N(X\beta, \sigma^2 I)$ and we place the prior $\beta \sim N(0, \tau^2 I)$, the posterior mean of $\beta$ is
$$\left(X^T X + \frac{\sigma^2}{\tau^2} I\right)^{-1} X^T y,$$
i.e., the ridge regression estimate with $\lambda = \sigma^2/\tau^2$.
Selection of λ
Information criteria
Cross-validation
Degrees of freedom
Degrees of freedom
The effective degrees of freedom of the ridge fit are
$$\mathrm{df}(\lambda) = \mathrm{tr}\left\{X(X^T X + \lambda I)^{-1} X^T\right\} = \sum_i \frac{d_i^2}{d_i^2 + \lambda},$$
where the $d_i$ are the singular values of $X$.
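The effective degrees of freedom of a ridge fit, $\mathrm{tr}\{X(X^TX+\lambda I)^{-1}X^T\} = \sum_i d_i^2/(d_i^2+\lambda)$, can be checked numerically. A toy sketch (illustrative design matrix, not from the slides) with $X^TX = \begin{pmatrix}2&1\\1&2\end{pmatrix}$, whose eigenvalues $d_i^2$ are 3 and 1 by hand calculation:

```python
# Trace of the ridge hat matrix, computed directly row by row,
# compared against the eigenvalue formula 3/(3+lam) + 1/(1+lam).

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # X'X = [[2,1],[1,2]]

def df(lam):
    a, b, d = 2.0 + lam, 1.0, 2.0 + lam    # entries of X'X + lam*I
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    # tr{X inv X'} = sum over rows x_i of x_i' inv x_i
    return sum(
        r[0] * (inv[0][0] * r[0] + inv[0][1] * r[1])
        + r[1] * (inv[1][0] * r[0] + inv[1][1] * r[1])
        for r in X
    )

df(0.0)   # equals p = 2 when lam = 0 (up to rounding)
df(1.0)   # equals 3/(3+1) + 1/(1+1) = 1.25
```

As $\lambda$ grows, df($\lambda$) falls from $p$ toward 0, matching the shrinkage of the coefficients.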
Introduction
Now, it would not be fair to use the data twice: once to fit the model and then again to estimate the prediction accuracy, as this would reward overfitting.
Ideally, we would have an external data set for validation, but obviously data is expensive to come by and this is rarely practical.
Cross-validation
One idea is to split the data set into two fractions, then use one portion to fit $\hat\beta$ and the other to evaluate how well $X\hat\beta$ predicted the observations in the second portion.
The problem with this solution is that we rarely have so much data that we can freely part with half of it solely for the purpose of choosing λ.
To finesse this problem, cross-validation splits the data into $K$ folds, fits the data on $K-1$ of the folds, and evaluates risk on the fold that was left out.
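The fold-splitting procedure can be sketched in a few lines of Python (a toy example, not the lecture's code: made-up one-predictor data, with the scalar no-intercept ridge estimate $\hat\beta = \sum x_iy_i / (\sum x_i^2 + \lambda)$):

```python
# K-fold cross-validation for choosing lambda in one-predictor ridge.

data = [(0.1, 0.3), (0.5, 0.9), (1.0, 2.1), (1.5, 2.9),
        (2.0, 4.2), (2.5, 4.8), (3.0, 6.3), (3.5, 6.9)]   # (x, y) pairs

def cv_error(lam, K=4):
    """Average squared prediction error over K held-out folds."""
    n = len(data)
    folds = [data[i::K] for i in range(K)]        # assign points to K folds
    err = 0.0
    for k in range(K):
        # Fit on the K-1 folds other than fold k
        train = [p for j, f in enumerate(folds) if j != k for p in f]
        bhat = sum(x * y for x, y in train) / (sum(x * x for x, _ in train) + lam)
        # Evaluate risk on the left-out fold
        err += sum((y - bhat * x) ** 2 for x, y in folds[k])
    return err / n

best = min([0.0, 0.1, 1.0, 10.0], key=cv_error)   # lambda with smallest CV error
```

In practice one evaluates a fine grid of λ values and plots the CV error curve, as the later slides do for the prostate data.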
Cross-validation figure
This process is repeated for each of the folds, and the risk is averaged across all of these results.
Generalized cross-validation
You may recall from BST 760 that we do not actually have to refit the model to obtain the leave-one-out (deleted) residuals:
$$y_i - \hat y_{(i)} = \frac{r_i}{1 - H_{ii}}$$
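The shortcut $y_i - \hat y_{(i)} = r_i/(1 - H_{ii})$ can be verified numerically. A toy check (made-up no-intercept data, not from the slides), where for regression through the origin the leverage is $H_{ii} = x_i^2/\sum x_j^2$ and the delete-one fit has a simple closed form:

```python
# Compare deleted residuals obtained by actually refitting against
# the single-fit shortcut r_i / (1 - H_ii).

x = [0.5, 1.0, 1.5, 2.0, 3.0]   # made-up predictor values
y = [1.1, 1.9, 3.2, 3.9, 6.2]   # made-up responses

sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
b = sxy / sxx                    # full-data slope (no intercept)

for xi, yi in zip(x, y):
    h = xi * xi / sxx                          # leverage H_ii
    r = yi - b * xi                            # ordinary residual
    b_del = (sxy - xi * yi) / (sxx - xi * xi)  # slope with obs i deleted
    lhs = yi - b_del * xi                      # deleted residual, by refitting
    rhs = r / (1 - h)                          # the shortcut formula
    assert abs(lhs - rhs) < 1e-9               # the two agree
```

This is why leave-one-out (and generalized) cross-validation costs one fit rather than $n$ fits.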
Example: the prostate data; predictors include age and pgg45 (% Gleason score 4 or 5).
SAS/R syntax
To fit a ridge regression model in SAS, we can use PROC REG:
PROC REG DATA=prostate ridge=0 to 50 by 0.1 OUTEST=fit;
MODEL lpsa = pgg45 gleason lcp svi lbph age lweight lcavol;
RUN;
Results
(figure: coefficient paths plotted against degrees of freedom; on the "Coefficients" axis, svi and lcavol are largest, lweight and lbph intermediate, and gleason, pgg45, age, and lcp lie near zero)
Additional plots
(figures: RSS plotted against degrees of freedom, and λ on a log scale plotted against degrees of freedom)
(figure: selection criterion plotted against degrees of freedom, with its minimum marked)
Estimates, standard errors, and z-scores for the prostate data, OLS versus ridge:

                Estimate         Std. Error       z-score
                OLS     Ridge    OLS     Ridge    OLS     Ridge
    lcavol     0.587    0.519    0.088   0.075    6.68    6.96
    lweight    0.454    0.444    0.170   0.153    2.67    2.89
    age       -0.020   -0.016    0.011   0.010   -1.76   -1.54
    lbph       0.107    0.096    0.058   0.053    1.83    1.83
    svi        0.766    0.698    0.244   0.209    3.14    3.33
    lcp       -0.105   -0.044    0.091   0.072   -1.16   -0.61
    gleason    0.045    0.060    0.157   0.128    0.29    0.47
    pgg45      0.005    0.004    0.004   0.003    1.02    1.02