Uploaded by Alex Nutkiewicz

Problem 1

Consider two curves, g1 and g2, where g^(m) represents the mth derivative of g.

g2 will have the smaller training RSS because it is a higher-order polynomial; its extra degree of freedom lets it follow the training data more closely.

g1 will have the smaller test RSS because g2 is more likely to overfit the data with the extra degree of freedom.

When λ = 0, the penalty term vanishes, and because the loss function is the same for g1 and g2, they will have the same training and test RSS.
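For reference, answers of this kind usually concern a penalized least-squares criterion of the general form below. The problem's own definitions of g1 and g2 are not reproduced here, so the form itself, the symbol λ for the tuning parameter, and the generic derivative order m are all assumptions.

```latex
\hat{g} = \arg\min_{g} \left( \sum_{i=1}^{n} \bigl( y_i - g(x_i) \bigr)^2
          + \lambda \int \bigl[ g^{(m)}(x) \bigr]^2 \, dx \right)
```

The first term is the training RSS (the loss) and the second is the penalty. At λ = 0 the second term drops out, so two criteria that differ only in m reduce to the same problem.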

Problem 2

Suppose that we carry out backward stepwise, forward stepwise, and best subset all on the same data set. Each

approach will yield a sequence of models with k = 0 up through k = p predictors.

a. Which approach with k predictors will have the smallest test residual sum of squares? Explain.

Best subset selection searches over all possible models, but that does not make it the most likely to fit the test data: searching so many models raises the risk of overfitting, especially as p increases. Forward and backward stepwise selection evaluate far fewer models, which makes them less likely to overfit. The answer therefore depends on the data, and no approach is guaranteed to have the smallest test RSS. When p > n, however, forward stepwise selection is the only viable approach of the three.

b. Which approach with k predictors will have the smallest training residual sum of squares? Explain.

Best subset selection. With a larger search space, we are more likely to find a model that looks good on the training data: best subset filters through all 2^p possible models, unlike forward and backward stepwise selection, which only filter through 1 + p(p+1)/2 models.
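To make the difference in search-space size concrete, the two counts can be tabulated for a few values of p. This is a small sketch; the function names are illustrative and not part of the original homework.

```r
# Number of candidate models each method evaluates for p predictors.
best_subset_count <- function(p) 2^p                # every subset of the p predictors
stepwise_count <- function(p) 1 + p * (p + 1) / 2   # null model + p + (p - 1) + ... + 1

p_values <- c(5, 10, 20)
model_counts <- data.frame(
  p = p_values,
  best_subset = best_subset_count(p_values),
  stepwise = stepwise_count(p_values)
)
print(model_counts)
```

The gap grows exponentially with p, which is why best subset has the most opportunity to find a model with low training RSS.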

c. True or False:

i. The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k+1)-variable model identified by backward stepwise selection.

False. Predictors defined by a forward stepwise model are not necessarily the same ones identified by backward stepwise because these models do not evaluate all possible options.

ii. The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k+1)-variable model identified by forward stepwise selection.

False. Predictors defined by a backward stepwise model are not necessarily the same ones identified by forward stepwise because these models do not evaluate all possible options.

iii. The predictors in the k-variable model identified by best subset are a subset of the predictors in the (k+1)-variable model identified by best subset selection.

file://localhost/Users/alexnutkiewicz/Desktop/STATS%20216/Homework__3.html 1/26

3/6/2017 Homework__3.html

False. The k-variable and (k+1)-variable models are chosen independently of one another, so the best k-variable model is not guaranteed to be a subset of the best (k+1)-variable model.

iv. The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k+1)-variable model identified by backward stepwise selection.

True. The k-variable model contains all but one of the features in the (k+1)-variable model: it drops the feature whose removal increases RSS the least.

v. The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k+1)-variable model identified by forward stepwise selection.

True. The (k+1)-variable model contains all k chosen features, plus the one remaining feature that improves the fit the most.
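The nesting property in (v) can be checked with a hand-rolled forward stepwise search. This is a minimal sketch on simulated data; the data, the helper names `rss_of` and `forward_path`, and the use of RSS as the selection criterion are assumptions made for illustration.

```r
set.seed(1)
n <- 100
p <- 6
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("x", 1:p)
y <- x[, 1] + 0.5 * x[, 2] + rnorm(n)

# Training RSS of the least-squares fit on a given set of column indices.
rss_of <- function(vars) {
  if (length(vars) == 0) return(sum((y - mean(y))^2))
  sum(resid(lm(y ~ x[, vars, drop = FALSE]))^2)
}

# Forward stepwise: at each step add the predictor that lowers RSS the most.
forward_path <- function() {
  chosen <- integer(0)
  path <- list()
  for (k in 1:p) {
    candidates <- setdiff(1:p, chosen)
    candidate_rss <- sapply(candidates, function(j) rss_of(c(chosen, j)))
    chosen <- c(chosen, candidates[which.min(candidate_rss)])
    path[[k]] <- chosen
  }
  path
}

path <- forward_path()

# Every k-variable model should be contained in the (k+1)-variable model.
nested <- all(sapply(1:(p - 1), function(k) all(path[[k]] %in% path[[k + 1]])))
print(nested)
```

Because each step only adds to the previous set, the models are nested by construction. Best subset has no such constraint, which is why (iii) is false.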

Problem 3

This question uses the variables dis (the weighted mean of distances to five Boston employment centers) and nox (nitrogen oxides concentration in parts per 10 million) from the Boston data (in the MASS library). We will treat dis as the predictor and nox as the response.

a. Use the poly() function to fit a cubic polynomial regression to predict nox using dis. Report the regression output, and plot the resulting data and polynomial fits.

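require(MASS)

library(splines)

attach(Boston)

# fit a cubic polynomial in dis (orthogonal polynomial basis)
polyfit = lm(nox ~ poly(dis, 3), data = Boston)

summary(polyfit)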

##

## Call:

## lm(formula = nox ~ poly(dis, 3), data = Boston)

##

## Residuals:

## Min 1Q Median 3Q Max

## -0.121130 -0.040619 -0.009738 0.023385 0.194904

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 0.554695 0.002759 201.021 < 2e-16 ***

## poly(dis, 3)1 -2.003096 0.062071 -32.271 < 2e-16 ***

## poly(dis, 3)2 0.856330 0.062071 13.796 < 2e-16 ***

## poly(dis, 3)3 -0.318049 0.062071 -5.124 4.27e-07 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## Residual standard error: 0.06207 on 502 degrees of freedom

## Multiple R-squared: 0.7148, Adjusted R-squared: 0.7131

## F-statistic: 419.3 on 3 and 502 DF, p-value: < 2.2e-16

dislims = range(dis)

#create grid of x-axis points for which we want to predict

dis.grid = seq(from=dislims[1], to=dislims[2])

#predict values for each of the points

polypreds = predict(polyfit,newdata = list(dis=dis.grid), se=TRUE)

require(ggplot2)

# fitted cubic on the raw dis scale: nox = 0.9341281 - 0.1820817*x + 0.0219277*x^2 - 0.0008850*x^3

Based on the results above, the linear (1), quadratic (2), and cubic (3) coefficients are each statistically significant.

b. Plot the polynomial fits for a range of different polynomial degrees (say, from 1 to 10), and report the associated residual sum of squares.

rss = rep(NA, 10)  # training RSS for each polynomial degree

for(i in 1:10){
  polyfit = lm(nox ~ poly(dis, i), data=Boston)
  rss[i] = sum(polyfit$residuals^2)
}

plot(1:10, rss, xlab = "Degree", ylab = "RSS", type = "b")

rss[10]

## [1] 1.832171

Based on the plot, training RSS decreases as the polynomial degree increases, as expected, because each higher-degree model contains the lower-degree ones. The minimum RSS, 1.832171, therefore occurs at degree 10.

c. Perform cross-validation or another approach to select the optimal degree for the polynomial, and explain

your results.

require(boot)

set.seed(36)

cv.error = rep(0,10)

for (i in 1:10){

polyfit = glm(nox ~ poly(dis, i), data=Boston)

# delta[1] is the raw 10-fold CV estimate of test MSE; delta[2] is the bias-adjusted estimate
cv.error[i] = cv.glm(Boston, K=10, polyfit)$delta[2]

}

plot(1:10, cv.error, xlab = "Degree", ylab = "Test MSE", type = "b", col="darkgreen")

Looking at our test MSE curve, we see the familiar U-shape, bottoming out at the 3rd degree. As we increase the degree of the polynomial, the model likely starts overfitting the data, which is why we see peaks in test MSE at the 7th and 9th degrees. I used 10-fold cross-validation (K=10) to reduce computation time.
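To show what K-fold cross-validation does under the hood, here is a minimal base-R sketch on simulated data. The data-generating curve, the helper name `kfold_mse`, and the range of degrees are assumptions for illustration. Unlike cv.glm(), it reports only the raw average of the fold errors, with no bias adjustment.

```r
set.seed(36)
n <- 200
x <- runif(n, 1, 12)
y <- 0.8 - 0.15 * x + 0.008 * x^2 + rnorm(n, sd = 0.05)  # quadratic signal plus noise
dat <- data.frame(x = x, y = y)

# K-fold CV estimate of test MSE for a polynomial fit of the given degree.
kfold_mse <- function(degree, data, K = 10) {
  folds <- sample(rep(1:K, length.out = nrow(data)))  # random fold label per row
  fold_mse <- sapply(1:K, function(k) {
    fit <- lm(y ~ poly(x, degree), data = data[folds != k, ])  # train on K - 1 folds
    pred <- predict(fit, newdata = data[folds == k, ])         # predict the held-out fold
    mean((data$y[folds == k] - pred)^2)
  })
  mean(fold_mse)
}

cv_mse <- sapply(1:6, kfold_mse, data = dat)
best_degree <- which.min(cv_mse)
print(round(cv_mse, 5))
print(best_degree)
```

Each observation is held out exactly once, so every error in the average comes from a model that never saw that observation. That is what lets the curve turn upward when a higher degree starts to overfit.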

d. Use the bs() function to fit a regression spline to predict nox using dis. Report the output for the fit using four degrees of freedom. How did you choose the knots? Plot the resulting fit.

library(splines)

# cubic spline (bs() uses degree = 3 by default) with a single knot at dis = 6
bs.fit = lm(nox ~ bs(dis, knots = c(6)), data=Boston)

summary(bs.fit)

##

## Call:

## lm(formula = nox ~ bs(dis, knots = c(6)), data = Boston)

##

## Residuals:

## Min 1Q Median 3Q Max

## -0.12387 -0.04012 -0.01033 0.02308 0.19446

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 0.76037 0.01018 74.667 < 2e-16 ***

## bs(dis, knots = c(6))1 -0.23672 0.02321 -10.200 < 2e-16 ***

## bs(dis, knots = c(6))2 -0.36177 0.02548 -14.200 < 2e-16 ***

## bs(dis, knots = c(6))3 -0.33337 0.04044 -8.244 1.47e-15 ***

## bs(dis, knots = c(6))4 -0.36220 0.05105 -7.095 4.45e-12 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## Residual standard error: 0.06208 on 501 degrees of freedom

## Multiple R-squared: 0.7152, Adjusted R-squared: 0.7129

## F-statistic: 314.6 on 4 and 501 DF, p-value: < 2.2e-16

# plot the data, then overlay the fitted spline and mark the knot
plot(dis, nox, col="darkgrey")

lines(dis.grid, predict(bs.fit,list(dis=dis.grid)), col="darkblue", lwd=2)

abline(v=c(6), lty=2, col="darkblue")

The goal in choosing knots is for all terms to be significant. To select the knot effectively, we look for a location that gives each basis term importance, judged by the coefficients and their t-statistics. I chose a cubic spline (degree 3, because of our CV error results) with a single knot at dis = 6, which uses four degrees of freedom and results in the fitted curve above.
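As an alternative to picking the knot by hand, bs() can place knots automatically when given df instead of knots. The sketch below uses simulated data, with x as a hypothetical stand-in for dis, to show that a cubic basis with df = 4 gets one interior knot at the median of x.

```r
library(splines)

set.seed(36)
x <- runif(200, 1, 12)   # hypothetical stand-in for dis

# Cubic by default (degree = 3), so df = 4 leaves room for one interior knot.
basis <- bs(x, df = 4)
knot <- attr(basis, "knots")

print(ncol(basis))   # number of basis columns
print(knot)          # location of the automatically chosen knot
print(median(x))
```

With more degrees of freedom, bs() spreads the additional knots over evenly spaced quantiles of x, which puts more knots where the data are dense.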

e. Now fit a regression spline for a range of degrees of freedom, and plot the resulting fits and report the resulting RSS. Describe the results obtained.

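rss = rep(NA, 10)  # training RSS for each spline degree

for(i in 1:10){
  # one knot at dis = 6; the spline degree i controls flexibility (df = i + 1)
  splineFit = lm(nox ~ bs(dis, knots=c(6), degree=i), data=Boston)
  rss[i] = sum(splineFit$residuals^2)
  plot(dis,nox,col="darkgrey")
  lines(dis.grid,predict(splineFit,list(dis=dis.grid)),col="darkblue",lwd=2)
  abline(v=c(6),lty=2,col="darkgreen")
}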

Referring to the above plots, we see the fits get smoother and then bumpier as a result of the model's increased flexibility. As mentioned in lecture, splines become more unstable at the tails of the data, especially as the degrees of freedom increase.

f. Perform cross-validation or another approach in order to select the best degrees of freedom for a

regression spline. Describe your results.

set.seed(36)

reg.cv.error = rep(NA,10)

for (i in 1:10){

regPolyfit = glm(nox ~ bs(dis, knots= c(6), degree = i), data=Boston)

reg.cv.error[i] = cv.glm(Boston, K=10, regPolyfit)$delta[2]

}

## 10.7103: some 'x' values beyond boundary knots may cause ill-conditioned

## bases

## 12.1265: some 'x' values beyond boundary knots may cause ill-conditioned

## bases

plot(1:10, reg.cv.error, xlab = "Degree", ylab = "Test MSE", type = "b", col="darkred")

reg.cv.error[1]

## [1] 0.004429047

Looking at our test MSE plot, we see that test MSE slowly increases as the degree increases, so we conclude that degree 1 is the best option for fitting our data. The minimum test MSE, at degree 1, is 0.004429047.

Problem 4

This problem works with the body dataset, which you can download from the homework folder on the class website. The goal of this problem is to perform and compare Principal Components Regression and Partial Least Squares on the problem of trying to predict someone's weight.

a. Read the body dataset into R using the load() function. This dataset contains: X - A dataframe containing 21 different types of measurements on the human body and Y - A dataframe that contains the age, weight (kg), height (cm), and the gender of each person in the sample. Let's say we forgot how the gender is coded in this dataset. Using a simple visualization, explain how you can tell which gender is which.

load("/Users/alexnutkiewicz/Downloads/body.rdata")

genderCode = as.factor(Y$Gender)

par(mfrow = c(3,1))

plot(Y$Weight, Y$Gender, col = "darkblue")

plot(X$Bicep.Girth, Y$Gender, col = "darkgreen")

plot(X$Forearm.Girth, Y$Gender, col = "darkgrey")

The above plots show weight, bicep girth, and forearm girth versus gender, which lets us intuitively figure out whether 0 or 1 codes male. We can assume men are likely to be heavier and to have larger girth measurements than women. The group coded 1 shows the larger values, so we conclude that 1 is coded as male.

b. Reserve 200 observations from your dataset to act as a test set and use the remaining 307 as a training set. On the training set, use both pcr and plsr to fit models to predict a person's weight based on the variables in X. Use the options scale = TRUE and validation = "CV". Why does it make sense to scale our variables in this case?

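set.seed(36)

# reserve 200 random observations for testing; the remaining 307 form the training set
testing = sort(sample(1:nrow(X), 200))

training = (1:nrow(X))[-testing]

library(pls)

##
## Attaching package: 'pls'
## The following object is masked from 'package:stats':
##
##     loadings

pcrFit = pcr(Y$Weight ~., data=X, subset=training, scale=TRUE, validation="CV")

plsrFit = plsr(Y$Weight ~., data=X, subset=training, scale=TRUE, validation="CV")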

We want to scale our variables to improve stability in our analysis. It is much easier to compare predictors when they are on the same scale (e.g., comparing mm to mm vs. cm to mm), and without scaling, the variables with the largest variance would dominate the components. Additionally, we cross-validate because this process takes place within the PCR/PLSR model fitting: it cross-validates our choice of model on the 307 training observations, which helps prevent us from overfitting the data.

c. Run summary() on each of the objects calculated above, and compare the training % variance explained

from the pcr output to the plsr output. Do you notice any consistent patterns (in comparing the two)? Is that

pattern surprising? Explain why or why not.

summary(pcrFit)

## Y dimension: 307 1

## Fit method: svdpc

## Number of components considered: 21

##

## VALIDATION: RMSEP

## Cross-validated using 10 random segments.

## (Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps

## CV 12.95 3.370 3.193 2.976 2.940 2.936 2.936

## adjCV 12.95 3.369 3.191 2.974 2.937 2.933 2.932

## 7 comps 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## CV 2.915 2.895 2.892 2.896 2.913 2.925 2.924

## adjCV 2.911 2.888 2.886 2.888 2.906 2.916 2.916

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## CV 2.943 2.909 2.861 2.850 2.821 2.832

## adjCV 2.934 2.895 2.850 2.815 2.809 2.820

## 20 comps 21 comps

## CV 2.842 2.843

## adjCV 2.829 2.831

##

## TRAINING: % variance explained

## 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps 7 comps

## X 62.49 74.20 79.27 83.69 86.21 88.29 89.98

## Y$Weight 93.23 93.97 94.81 94.99 95.07 95.09 95.15

## 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## X 91.41 92.66 93.81 94.92 95.87 96.75

## Y$Weight 95.26 95.31 95.35 95.35 95.39 95.42

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## X 97.53 98.06 98.53 98.93 99.32 99.59

## Y$Weight 95.42 95.66 95.76 95.92 95.92 95.93

## 20 comps 21 comps

## X 99.82 100.00

## Y$Weight 95.93 95.94

summary(plsrFit)

## Y dimension: 307 1

## Fit method: kernelpls

## Number of components considered: 21

##

## VALIDATION: RMSEP

## Cross-validated using 10 random segments.

## (Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps

## CV 12.95 3.273 2.956 2.859 2.843 2.811 2.807

## adjCV 12.95 3.272 2.954 2.855 2.832 2.802 2.796

## 7 comps 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## CV 2.802 2.80 2.801 2.804 2.805 2.804 2.805

## adjCV 2.792 2.79 2.791 2.793 2.794 2.793 2.794

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## CV 2.805 2.804 2.804 2.804 2.804 2.804

## adjCV 2.794 2.794 2.794 2.794 2.794 2.794

## 20 comps 21 comps

## CV 2.804 2.804

## adjCV 2.794 2.794

##

## TRAINING: % variance explained

## 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps 7 comps

## X 62.48 72.47 78.75 80.70 83.45 86.13 87.99

## Y$Weight 93.67 94.99 95.43 95.77 95.87 95.92 95.94

## 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## X 89.31 90.48 91.65 92.79 93.58 94.61

## Y$Weight 95.94 95.94 95.94 95.94 95.94 95.94

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## X 95.37 96.13 96.81 97.47 98.07 98.81

## Y$Weight 95.94 95.94 95.94 95.94 95.94 95.94

## 20 comps 21 comps

## X 99.61 100.00

## Y$Weight 95.94 95.94

Both models explain a similar percentage of the training variance. For a given number of components, PCR explains at least as much of the variance in X, while PLSR explains at least as much of the variance in weight. Although the methods are quite different (PLSR is a supervised learning method, PCR is unsupervised), this is not a surprising result. Both are used to model a response when the number of predictors p is large, especially if the predictors are highly correlated. PCR creates linear combinations of the original set of predictors without considering the response variable. PLSR does consider the response variable, which is why it typically needs fewer components. Despite their differences, both approaches build linear combinations of our original set of predictors, so the similarity in their results is unsurprising.

d. For each of the models, pick a number of components that you would use to predict future values of weight

from X. Please include any further analysis you use to decide on the number of components.

validationplot(pcrFit, val.type = "RMSEP", type = "b", main = "PCR Fit")

validationplot(plsrFit, val.type = "RMSEP", type = "b", main = "PLSR Fit")

Using validationplot() from the pls library, we find that most of the drop in RMSEP comes from adding just the first component, so we'll move forward in our analysis of the body dataset with one component for each model.

e. Practically speaking, it might be nice if we could guess a person's weight without measuring 21 different quantities. Do either of the methods performed above allow us to do that? If not, pick another method that will, and fit it on the training data.

Yeah, so if we want to reduce the number of predictors, aka simplify the model via feature selection, the lasso

seems like a good option!

library(ISLR)

library(glmnet)

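lassoX = scale(model.matrix(Y$Weight ~ ., data = X)[, -1])

lassoY = Y$Weight

# assumption: the definition of grid is not shown; this lambda grid follows the ISLR lab convention
grid = 10^seq(10, -2, length = 100)

lassoFit = glmnet(lassoX[training,], lassoY[training], alpha = 1, lambda = grid)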

plot(lassoFit)

set.seed(36)

lassoCV = cv.glmnet(lassoX[training,], lassoY[training], alpha = 1)

bestLambda = lassoCV$lambda.1se

predict(lassoFit, type = "coefficients", s = bestLambda)

## 1

## (Intercept) 69.00221166

## Wrist.Diam 0.04076614

## Wrist.Girth .

## Forearm.Girth 1.35906177

## Elbow.Diam 0.54778799

## Bicep.Girth .

## Shoulder.Girth 1.62958068

## Biacromial.Diam 0.42140884

## Chest.Depth 0.47952509

## Chest.Diam 0.21704888

## Chest.Girth 1.39778183

## Navel.Girth .

## Waist.Girth 3.26613218

## Pelvic.Breadth 0.35693112

## Bitrochanteric.Diam .

## Hip.Girth 1.45081886

## Thigh.Girth 0.82597085

## Knee.Diam 0.32318685

## Knee.Girth 1.13763065

## Calf.Girth 0.84002676

## Ankle.Diam 0.44355954

## Ankle.Girth 0.19631757

bestLambda

## [1] 0.5396431

f. Compare all 3 methods in terms of performance on the test set. Keep in mind that you should only run one

version of each model on the test set. Any necessary selection of parameters should be done only with the

training set.

# pcrPredict and plsrPredict hold the test-set predictions from pcrFit and plsrFit
mean((pcrPredict - lassoY[testing])^2)

## [1] 8.562787

mean((plsrPredict - lassoY[testing])^2)

## [1] 7.952771

lassoPredict = predict(lassoFit, s = bestLambda, newx = lassoX[testing,])

mean((lassoPredict - lassoY[testing])^2)

## [1] 8.141433

These results show that PLSR has the lowest test MSE (7.95), followed by the lasso (8.14, with lambda chosen by the one-standard-error rule) and PCR (8.56).
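As a quick check on this comparison, the three test MSE values reported above can be put on the same scale as the cross-validated RMSEP from summary() by taking square roots. This is a small sketch; the variable names are illustrative and not from the original analysis.

```r
# Test MSE values copied from the output above.
test_mse <- c(PCR = 8.562787, PLSR = 7.952771, lasso = 8.141433)

test_rmse <- sqrt(test_mse)                 # same units as weight (kg)
best_method <- names(which.min(test_mse))

print(round(test_rmse, 3))
print(best_method)
```

On the RMSE scale all three methods fall within a fraction of a kilogram of one another, so the ranking is clear but the practical differences are small.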
