HMK 3

Homework 3 (Attendance 5) for Statistics 512
Applied Regression Analysis

Material Covered: Chapter 6 Neter et al. and Kuhn
By: Friday, 3rd October, Fall 2003
This homework is worth 5% and marked out of 5 points. Homework assignments
are to be handed in using Vista on the Internet before 4am. Vista will not allow
any homework assignment to be handed in late. It is highly recommended that you
complete the homework, by hand, before logging onto Vista; use Vista simply to
submit your answers. Submit as many times as you want before the deadline and
receive the highest score of all the submissions. This is an individual homework
and so each student submits their own homework, although they are encouraged to
cooperate with other students.
1. Applied Linear Statistical Models (Neter et al.) Questions.

Chapter             Problem(s)                           Hints
6, pages 252-257    6.9, 6.10, 6.11, 6.12, 6.13, 6.14    Chemical shipment
                    6.18, 6.19, 6.20, 6.21               Mathematicians' salaries
(6.9) chemical shipment, hw3-6-9-chem-diagnos
*HOMEWORK 3, 6-9, PAGES 252-257;

DATA CHEMICAL;
INPUT Y X1 X2 TIME;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
;
*6.9(A) STEM AND LEAF OF X1 AND X2;
PROC UNIVARIATE DATA=CHEMICAL PLOT;
TITLE1 '6.9(A) STEM AND LEAF OF NUMBER OF DRUMS, X1';
TITLE2 'AND OF WEIGHT OF SHIPMENTS, X2';
VAR X1 X2;
RUN;
*6.9(B) TIMEPLOTS OF X1 AND X2;
SYMBOL1 V=STAR C=BLACK;
PROC GPLOT DATA=CHEMICAL;
TITLE1 '6.9(B-1) TIMEPLOT OF NUMBER OF DRUMS, X1';
PLOT X1*TIME;
RUN;
TITLE1 '6.9(B-2) TIMEPLOT OF WEIGHT OF SHIPMENTS, X2';
PLOT X2*TIME;
RUN;
*6.9(C) SCATTERPLOT MATRICES AND CORRELATION;
TITLE1 '6.9(C-1) HANDLING TIME VERSUS NUMBER OF DRUMS, Y VS X1';
PLOT Y*X1;
RUN;
TITLE1 '6.9(C-2) HANDLING TIME VERSUS WEIGHT OF SHIPMENT, Y VS X2';
PLOT Y*X2;
RUN;
TITLE1 '6.9(C-3) NUMBER OF DRUMS VERSUS WEIGHT OF SHIPMENT, X1 VS X2';
PLOT X1*X2;
RUN;
PROC CORR DATA=CHEMICAL;
TITLE '6.9(C-4) CORRELATION Y, X1 AND X2';
VAR Y X1 X2;
RUN;
QUIT;
(a) Stem-and-leaf plots.
Look for outliers.
(b) Time plots.
Any patterns?
(c) Scatter plots and correlation matrix.
It is good if Y is strongly linearly related to both X1 and X2,
but bad if X1 and X2 are strongly linearly related to one
another.
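As a cross-check on the PROC CORR step, the pairwise correlations can be recomputed by hand. The following is a minimal Python sketch, not part of the course's SAS workflow; the helper name `pearson` is an illustrative assumption, and the rows are the (Y, X1, X2) values from the DATALINES block above.

```python
from math import sqrt

# (Y, X1, X2) rows copied from the DATALINES block above; TIME is omitted.
ROWS = [
    (58, 7, 5.11), (152, 18, 16.72), (41, 5, 3.2), (93, 14, 7.03),
    (101, 11, 10.98), (38, 5, 4.04), (203, 23, 22.07), (78, 9, 7.03),
    (117, 16, 10.62), (44, 5, 4.76), (121, 17, 11.02), (112, 12, 9.51),
    (50, 6, 3.79), (82, 12, 6.45), (48, 8, 4.6), (127, 15, 13.86),
    (140, 17, 13.03), (155, 21, 15.21), (39, 6, 3.64), (90, 11, 9.57),
]

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

columns = list(zip(*ROWS))  # [Y values, X1 values, X2 values]
names = ["Y", "X1", "X2"]
corr = [[pearson(a, b) for b in columns] for a in columns]

# Print the 3x3 correlation matrix, one labelled row per variable.
for name, row in zip(names, corr):
    print(name, " ".join(f"{r:7.4f}" for r in row))
```

The off-diagonal entry for X1 and X2 is the one to watch: a value near 1 signals the collinearity that part (c) warns about.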
(6.10) chemical shipment again, hw3-6-10-chem-residual
*HOMEWORK 3, 6-10, PAGES 252-257;

DATA CHEMICAL;
INPUT Y X1 X2 TIME;
X1X2 = X1*X2;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
;
*6.10(A) REGRESSION;
PROC REG DATA=CHEMICAL OUTEST=EST;
TITLE1 '6.10(A) REGRESSION OF Y VS X1 AND X2';
MODEL Y = X1 X2;
OUTPUT OUT=OUTPLOT PREDICTED=PRED RESIDUAL=RESID;
RUN;
*6.10(B) BOXPLOT OF RESIDUALS;
PROC UNIVARIATE DATA=OUTPLOT PLOT;
TITLE1 '6.10(B) BOXPLOT OF RESIDUALS';
VAR RESID;
RUN;
*6.10(C) RESIDUALS VS PREDICTED, X1, X2 AND X1X2;
PROC GPLOT DATA=OUTPLOT;
TITLE '6.10(C-1) RESIDUALS VS PREDICTED';
PLOT RESID*PRED;
RUN;
TITLE '6.10(C-2) RESIDUALS VS X1';
PLOT RESID*X1;
RUN;
TITLE '6.10(C-3) RESIDUALS VS X2';
PLOT RESID*X2;
RUN;
TITLE '6.10(C-4) RESIDUALS VS X1X2';
PLOT RESID*X1X2;
RUN;
*6.10(C) NORMAL PROBABILITY PLOT;
* RESIDUALS VS EXPECTED RESIDUALS;
PROC SORT DATA=OUTPLOT;
BY RESID;
RUN;
DATA OUTPLOT;
SET OUTPLOT NOBS=NOBS;
QUANTILE = PROBIT( (_N_- (3/8)) / (NOBS + (1/4)) );
RUN;
DATA OUTPLOT2;
IF _N_ = 1 THEN SET EST;
SET OUTPLOT;
EXPRESIDUAL = _RMSE_*QUANTILE;
RUN;
PROC GPLOT DATA=OUTPLOT2;
TITLE '6.10(C-5) NORMAL PROBABILITY PLOT';
PLOT RESID*EXPRESIDUAL;
RUN;
*6.10(D) TIMEPLOT OF RESIDUALS;
TITLE1 '6.10(D) TIMEPLOT OF RESIDUALS';
PLOT RESID*TIME;
RUN;
*6.10(E) LEVENE TEST OF RESIDUALS;
DATA NEWCHEMICAL;
SET OUTPLOT;
IF PRED < 92 THEN LEVENEGROUP = 'A';
IF PRED GE 92 THEN LEVENEGROUP = 'B';
RUN;
PROC GLM DATA=NEWCHEMICAL ALPHA=0.01;
TITLE1 '6.10(E) (UNMODIFIED) LEVENE TEST';
TITLE2 'OF HOMOGENEITY OF VARIANCE OF RESIDUALS';
CLASS LEVENEGROUP;
MODEL RESID = LEVENEGROUP;
MEANS LEVENEGROUP / HOVTEST = LEVENE (TYPE=ABS);
RUN;
QUIT;
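The normal probability plot step in the program above multiplies the root MSE by a standard normal quantile at the plotting position (k - 3/8)/(n + 1/4). A minimal Python sketch of that one calculation follows; the function name `expected_residuals` and the root MSE of 1.0 are illustrative assumptions, not values from the SAS output.

```python
from statistics import NormalDist

def expected_residuals(n, rmse):
    """Expected value of the k-th smallest of n residuals under normality,
    using the same (k - 3/8) / (n + 1/4) plotting position as the SAS step."""
    z = NormalDist()  # standard normal; inv_cdf plays the role of PROBIT
    return [rmse * z.inv_cdf((k - 0.375) / (n + 0.25)) for k in range(1, n + 1)]

# Illustrative values only: n = 20 cases and a made-up root MSE of 1.0.
expected = expected_residuals(20, 1.0)
print([round(e, 3) for e in expected])
```

Plotting the sorted residuals against these values should give a roughly straight line when the error terms are approximately normal.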
(a) Estimated regression function.
(b) Box plot of the residuals.
Look for outliers.
(c) Residual plots.
It is good if there is no pattern or outliers in residual plots.
(d) Residuals versus time plot.
(e) Levene Test
1. Statement.
The statement of the test is (check none, one or more):
(i) $H_0$: error variance constant versus $H_1$: $\sigma^2 > 1$.
(ii) $H_0$: error variance constant versus $H_1$: error variance not constant.
(iii) $H_0$: error variance constant versus $H_1$: $\sigma^2 \neq 1$.
2. Test.
From SAS, the p-value is (choose one) 0.446 / 0.8278 / 0.989
The level of significance is (circle one) 0.01 / 0.05 / 0.10
3. Conclusion.
Since the p-value is smaller / larger than the level of significance, we
(circle one) accept / reject the null hypothesis that the error variance
is constant.
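The unmodified Levene test requested with HOVTEST=LEVENE(TYPE=ABS) amounts to a one-way ANOVA on the absolute deviations of the residuals from their group means. A minimal Python sketch of the statistic follows, under that assumption; `levene_abs` is a hypothetical helper and the two small groups are toy numbers, not the homework residuals.

```python
def levene_abs(group_a, group_b):
    """One-way ANOVA F statistic on |e - group mean| (unmodified Levene, two groups)."""
    groups = [group_a, group_b]
    devs = []
    for g in groups:
        mean = sum(g) / len(g)
        devs.append([abs(e - mean) for e in g])  # absolute deviations per group
    n_total = sum(len(d) for d in devs)
    grand = sum(sum(d) for d in devs) / n_total
    ss_between = sum(len(d) * (sum(d) / len(d) - grand) ** 2 for d in devs)
    ss_within = sum(sum((x - sum(d) / len(d)) ** 2 for x in d) for d in devs)
    df_between, df_within = len(groups) - 1, n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy residual groups (made up for illustration).
f_stat, df1, df2 = levene_abs([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f_stat, df1, df2)
```

The p-value then comes from the F distribution with (df1, df2) degrees of freedom, which SAS reports directly; the sketch stops at the statistic because the Python standard library has no F distribution.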
(6.11) chemical shipment again, hw3-6-11-chem-regress
*HOMEWORK 3, 6-11, PAGES 252-257;

DATA CHEMICAL;
INPUT Y X1 X2 TIME;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
;
*6.11 REGRESSION;
PROC REG DATA=CHEMICAL;
TITLE1 '6.11 REGRESSION OF Y VS X1 AND X2';
MODEL Y = X1 X2;
RUN;
QUIT;
Source        Sum of Squares    Degrees of Freedom       Mean Squares
Regression    40,496.48         p - 1 = 3 - 1 = 2        20,248.24
Error         536.47            n - p = 20 - 3 = 17      31.56
Total         41,032.95         n - 1 = 20 - 1 = 19
(a) Test of regression relation at $\alpha = 0.05$.
1. Statement.
(i) $H_0: \beta_1 = \beta_2 = 0$ versus $H_1: \beta_1 = \beta_2 > 0$.
(ii) $H_0: \beta_1 = \beta_2 = 0$ versus $H_1: \beta_1 = \beta_2 < 0$.
(iii) $H_0: \beta_1 = \beta_2 = 0$ versus $H_1$: not all $\beta_i$ ($i = 1, 2$) equal zero.
2. Test.
From SAS, the p-value is (choose one) 0 / 0.0827 / 0.098
3. Conclusion.
(circle one) accept / reject the null hypothesis that $\beta_1 = \beta_2 = 0$.
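The test statistic behind this p-value is F* = MSR/MSE, built from the ANOVA table for this problem. A short Python sketch of the arithmetic follows, using the sums of squares quoted in that table; the variable names are illustrative.

```python
# Sums of squares and sizes quoted in the 6.11 ANOVA table.
ssr, sse = 40496.48, 536.47
n, p = 20, 3              # 20 cases; 3 parameters (intercept, X1, X2)

msr = ssr / (p - 1)       # regression mean square, 2 degrees of freedom
mse = sse / (n - p)       # error mean square, 17 degrees of freedom
f_star = msr / mse        # compared with F(1 - alpha; p - 1, n - p)

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F* = {f_star:.2f}")
```

A large F* relative to the F(0.95; 2, 17) critical value corresponds to a small p-value on the SAS output.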
(b) Bonferroni Confidence Intervals.
From TI83 (INVT 17 ENTER 0.9875 ENTER)
$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.05/(2 \cdot 2);\ 20 - 3) = t(0.9875;\ 17) = 2.458$
From SAS,
1. Bonferroni CI for $\beta_1$:
$b_1 = 3.7681$ and $s\{b_1\} = 0.614$,
$b_1 \pm B s\{b_1\} = 3.7681 \pm 2.458(0.614) = ?$
2. Bonferroni CI for $\beta_2$:
$b_2 = 5.0796$ and $s\{b_2\} = 0.666$,
$b_2 \pm B s\{b_2\} = 5.0796 \pm 2.458(0.666) = ?$
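Each Bonferroni interval is the estimate plus or minus the multiplier B times the standard error. The Python sketch below shows that arithmetic for the first coefficient only; `bonferroni_ci` is a hypothetical helper, and the inputs are the estimate, standard error and multiplier quoted above.

```python
def bonferroni_ci(estimate, std_error, multiplier):
    """Return (lower, upper) for estimate +/- multiplier * std_error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width

# Slope for X1: b1 = 3.7681, s{b1} = 0.614, B = 2.458 (values quoted above).
lower, upper = bonferroni_ci(3.7681, 0.614, 2.458)
print(f"({lower:.3f}, {upper:.3f})")
```

The same call with the second coefficient's estimate and standard error gives the other interval in the family.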
(c) Correlation Coefficient.
$R^2 = \frac{SSR}{SSTO} = \frac{40{,}496.48}{41{,}032.95} \approx 0.987$
$R^2$ is also given directly on the SAS output.
(6.12) chemical shipment again, hw3-6-12-chem-respCI
*HOMEWORK 3, 6-12, PAGES 252-257;

DATA CHEMICALX;
INPUT Y X1 X2 TIME;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
. 5 3.20 21
. 6 4.80 22
. 10 7.00 23
. 14 10.00 24
. 20 18.00 25
;
DATA CHEMICAL X;
SET CHEMICALX;
IF Y NE . THEN OUTPUT CHEMICAL;
ELSE OUTPUT X;
RUN;
PROC REG DATA=CHEMICAL ALPHA=0.05 NOPRINT;
TITLE '6.12(A) BONFERRONI AND WH JOINT CIs FOR MEAN';
MODEL Y = X1 X2;
RUN;
PROC REG DATA=CHEMICALX;
MODEL Y = X1 X2;
OUTPUT OUT=PRED_DS(WHERE=(Y =.)) P=PHAT STDP=STDP;
RUN;
PROC PRINT DATA=PRED_DS;
RUN;
PROC PLOT DATA=CHEMICALX;
TITLE '6.12(B) RANGE OF X1 AND X2';
PLOT X1*X2=Y;
RUN;
PROC G3D DATA=CHEMICALX;
SCATTER X1*X2=Y;
RUN;
QUIT;
(a) Family CIs For Different Responses.
At $\alpha = 0.05$, and $g = 5$ (five simultaneous intervals),
from TI83,
$W = \sqrt{pF(1 - \alpha;\ p,\ n - p)} = \sqrt{3F(1 - 0.05;\ 3,\ 20 - 3)} = 3.098$
(INVF 3 ENTER 17 ENTER 0.95 ENTER,
then multiply by 3 and find the square root)
$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.05/(2 \cdot 5);\ 20 - 3) = t(0.995;\ 17) = 2.898$
(INVT 17 ENTER 0.995 ENTER)
Since $W = 3.098 > B = 2.898$, use ?
From SAS,
1. $X_{h1} = 5$, $X_{h2} = 3.20$:
$\hat{Y}_h = 38.4195$ and $s\{\hat{Y}_h\} = 2.0332$
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 38.4195 \pm 2.898(2.0332) = ?$
2. $X_{h1} = 6$, $X_{h2} = 4.80$:
$\hat{Y}_h = 50.3150$ and $s\{\hat{Y}_h\} = 1.9192$
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 50.3150 \pm 2.898(1.9192) = ?$
3. $X_{h1} = 10$, $X_{h2} = 7.00$:
$\hat{Y}_h = 76.5625$ and $s\{\hat{Y}_h\} = 1.3701$
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 76.5625 \pm 2.898(1.3701) = ?$
4. $X_{h1} = 14$, $X_{h2} = 10.00$:
$\hat{Y}_h = 106.8737$ and $s\{\hat{Y}_h\} = 1.4761$
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 106.8737 \pm 2.898(1.4761) = ?$
5. $X_{h1} = 20$, $X_{h2} = 18.00$:
$\hat{Y}_h = 170.1191$ and $s\{\hat{Y}_h\} = 2.6096$
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 170.1191 \pm 2.898(2.6096) = ?$
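The comparison of W and B in part (a) can be written out as a small calculation: whichever multiplier is smaller gives the narrower family of intervals. The Python sketch below assumes the table value F(0.95; 3, 17) = 3.20 (rounded, entered by hand rather than computed) and takes B = 2.898 from above.

```python
from math import sqrt

p = 3                  # parameters: intercept, X1, X2
f_quantile = 3.20      # assumed table value of F(0.95; 3, 17), rounded
b_mult = 2.898         # Bonferroni multiplier t(0.995; 17) quoted above

w_mult = sqrt(p * f_quantile)   # Working-Hotelling multiplier

# The smaller multiplier yields the narrower simultaneous intervals.
choice = "Bonferroni" if b_mult < w_mult else "Working-Hotelling"
print(f"W = {w_mult:.3f}, B = {b_mult:.3f}, narrower family: {choice}")
```

Because B grows with the number of intervals g while W does not, the choice can flip for larger families at the same confidence level.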
(b) Plot $X_{i1}$ versus $X_{i2}$.
The point $(X_1, X_2) = (20, 5)$ is clearly where in the scatter of points?
The point $(X_1, X_2) = (20, 19)$ is clearly where in the scatter of points?
(6.13) chemical shipment again, hw3-6-13-chem-respPI
*HOMEWORK 3, 6-13, PAGES 252-257;

DATA CHEMICAL;
INPUT Y X1 X2 TIME;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
;
PROC IML;
USE CHEMICAL;
READ ALL VAR {'X1'} INTO X1;
READ ALL VAR {'X2'} INTO X2;
READ ALL VAR {'Y'} INTO Y;
N = NROW(X1);
M = NCOL(Y);
J = J(N,N,1);
X = J(N,1,1)||X1||X2;
B = INV(X`*X)*X`*Y;
H = X*INV(X`*X)*X`;
SSE = Y`*(I(N) - H)*Y;
DFE = N - 3;
MSE = SSE/DFE;
*EACH COLUMN OF XH IS ONE NEW POINT, WRITTEN AS 1 THEN XH1 THEN XH2;
XH = { 1 1 1 1,
9 12 15 18,
7.20 9.00 12.50 16.50};
YHAT = XH`*B;
*ONLY THE DIAGONAL OF THE QUADRATIC FORM BELONGS TO THE FOUR NEW POINTS;
S2PRED = MSE*(1 + VECDIAG(XH`*INV(X`*X)*XH));
SPRED = SQRT(S2PRED);
PRINT YHAT;
PRINT S2PRED;
PRINT SPRED;
RUN;
QUIT;
At $\alpha = 0.05$, $g = 4$ (four simultaneous intervals)
and $p = 3$ (three parameters, $\beta_0$, $\beta_1$, $\beta_2$),
from TI83,
$S = \sqrt{gF(1 - \alpha;\ g,\ n - p)} = \sqrt{4F(1 - 0.05;\ 4,\ 20 - 3)} = 3.441$
$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.05/(2 \cdot 4);\ 20 - 3) = t(0.99375;\ 17) = 2.793$
Since $S = 3.441 > B = 2.793$, use B because the Bonferroni procedure gives narrower
(more efficient) prediction intervals than the Scheffé procedure.
From SAS,
1. $X_{h1} = 9$, $X_{h2} = 7.20$:
$\hat{Y}_h = 73.8103$ and $s\{pred\} = 5.8076$
$\hat{Y}_h \pm B s\{pred\} = 73.8103 \pm 2.793(5.8076) = ?$
2. $X_{h1} = 12$, $X_{h2} = 9.00$:
$\hat{Y}_h = 94.2579$ and $s\{pred\} = 5.7578$
$\hat{Y}_h \pm B s\{pred\} = 94.2579 \pm 2.793(5.7578) = ?$
3. $X_{h1} = 15$, $X_{h2} = 12.50$:
$\hat{Y}_h = 123.3408$ and $s\{pred\} = 5.8217$
$\hat{Y}_h \pm B s\{pred\} = 123.3408 \pm 2.793(5.8217) = ?$
4. $X_{h1} = 18$, $X_{h2} = 16.50$:
$\hat{Y}_h = 154.9635$ and $s\{pred\} = 6.1013$
$\hat{Y}_h \pm B s\{pred\} = 154.9635 \pm 2.793(6.1013) = ?$
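The matrix steps in the PROC IML program (fit b from the normal equations, then form the fitted value and the prediction standard error at a new point) can be mirrored outside SAS as a check. The following is a pure-Python sketch under that reading; `solve` and `dot` are illustrative helpers, and only the first new point (9, 7.20) is evaluated.

```python
from math import sqrt

# (Y, X1, X2) rows copied from the DATALINES block for this problem.
ROWS = [
    (58, 7, 5.11), (152, 18, 16.72), (41, 5, 3.2), (93, 14, 7.03),
    (101, 11, 10.98), (38, 5, 4.04), (203, 23, 22.07), (78, 9, 7.03),
    (117, 16, 10.62), (44, 5, 4.76), (121, 17, 11.02), (112, 12, 9.51),
    (50, 6, 3.79), (82, 12, 6.45), (48, 8, 4.6), (127, 15, 13.86),
    (140, 17, 13.03), (155, 21, 15.21), (39, 6, 3.64), (90, 11, 9.57),
]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

design = [(1.0, x1, x2) for _, x1, x2 in ROWS]   # rows of X
y = [row[0] for row in ROWS]
p = 3
xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]

b = solve(xtx, xty)                               # b = (X'X)^{-1} X'Y
sse = sum((yi - dot(b, r)) ** 2 for r, yi in zip(design, y))
mse = sse / (len(y) - p)

xh = (1.0, 9.0, 7.20)                             # first new point
y_hat = dot(b, xh)
quad_form = dot(xh, solve(xtx, list(xh)))         # xh' (X'X)^{-1} xh
s_pred = sqrt(mse * (1.0 + quad_form))            # one new observation

print(f"b = {[round(c, 4) for c in b]}")
print(f"Yhat = {y_hat:.4f}, s(pred) = {s_pred:.4f}")
```

Evaluating each new point separately sidesteps the need to pick the diagonal out of a full matrix product, which is the step most easily gotten wrong in the matrix version.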
(6.14) chemical shipment again, hw3-6-14-chem-respPmean
*HOMEWORK 3, 6-14, PAGES 252-257;

DATA CHEMICAL;
INPUT Y X1 X2 TIME;
DATALINES;
58 7 5.11 1
152 18 16.72 2
41 5 3.2 3
93 14 7.03 4
101 11 10.98 5
38 5 4.04 6
203 23 22.07 7
78 9 7.03 8
117 16 10.62 9
44 5 4.76 10
121 17 11.02 11
112 12 9.51 12
50 6 3.79 13
82 12 6.45 14
48 8 4.6 15
127 15 13.86 16
140 17 13.03 17
155 21 15.21 18
39 6 3.64 19
90 11 9.57 20
;
PROC IML;
USE CHEMICAL;
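READ ALL VAR {'X1'} INTO X1;
READ ALL VAR {'X2'} INTO X2;
READ ALL VAR {'Y'} INTO Y;
N = NROW(X1);
M = NCOL(Y);
J = J(N,N,1);
X = J(N,1,1)||X1||X2;
B = INV(X`*X)*X`*Y;
H = X*INV(X`*X)*X`;
SSE = Y`*(I(N) - H)*Y;
DFE = N - 3;
MSE = SSE/DFE;
*ONE NEW POINT AS A COLUMN VECTOR, WRITTEN AS 1 THEN XH1 THEN XH2;
XH = {1,
7,
6};
YHAT = XH`*B;
*SQRT WORKS BECAUSE NO NEGATIVES!;
*1/3 IS ONE OVER THE NUMBER OF NEW OBSERVATIONS, WHICH IS THREE;
SPRED = SQRT(MSE*(1/3 + XH`*INV(X`*X)*XH));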
PRINT YHAT;
PRINT SPRED;
RUN;
QUIT;
(a) Mean of New Observations CI.
At $\alpha = 0.05$, $p = 3$ (three parameters, $\beta_0$, $\beta_1$, $\beta_2$),
and $m = 3$ (mean of three new observations)
from TI83,
$B = t(1 - \alpha/2;\ n - p) = t(1 - 0.05/2;\ 20 - 3) = t(0.975;\ 17) = 2.110$
$X_{h1} = 7$, $X_{h2} = 6$:
$\hat{Y}_h = 60.1786$ and $s\{predmean\} = 3.7281$
$\hat{Y}_h \pm B s\{predmean\} = 60.1786 \pm 2.110(3.7281) = ?$
(b) An interval for the total handling time, then, would be
$3 \times (52.30, 68.04) = ?$
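Parts (a) and (b) chain together: the interval for the mean of m = 3 new shipments is scaled by m to cover their total. A short Python sketch of that arithmetic follows, using the fitted value, standard error and multiplier quoted above; the variable names are illustrative.

```python
m = 3                               # number of new shipments
y_hat, s_predmean = 60.1786, 3.7281 # quoted from the SAS/IML output above
t_mult = 2.110                      # t(0.975; 17)

half_width = t_mult * s_predmean
mean_lo, mean_hi = y_hat - half_width, y_hat + half_width

# Total handling time = m * mean, so both limits scale by m.
total_lo, total_hi = m * mean_lo, m * mean_hi

print(f"mean:  ({mean_lo:.2f}, {mean_hi:.2f})")
print(f"total: ({total_lo:.2f}, {total_hi:.2f})")
```

Scaling the unrounded limits, rather than the rounded ones printed in (b), avoids a small rounding drift in the second decimal of the total.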
(6.18) Mathematicians salaries, hw3-6-18-math-diagnos
*HOMEWORK 3, 6-18, PAGES 252-257;

DATA MATH;
INPUT Y X1 X2 X3;
X1X2 = X1*X2;
X1X3 = X1*X3;
X2X3 = X2*X3;
DATALINES;
33.2 3.5 9 6.1
40.3 5.3 20 6.4
38.7 5.1 18 7.4
46.8 5.8 33 6.7
41.4 4.2 31 7.5
37.5 6 13 5.9
39 6.8 25 6
40.7 5.5 30 4
30.1 3.1 5 5.8
52.9 7.2 47 8.3
38.2 4.5 25 5
31.8 4.9 11 6.4
43.3 8 23 7.6
44.1 6.5 35 7
42.8 6.6 39 5
33.6 3.7 21 4.4
34.2 6.2 7 5.5
48 7 40 7
38 4 35 6
35.9 4.5 23 3.5
40.4 5.9 33 4.9
36.8 5.6 27 4.3
45.2 4.8 34 8
35.1 3.9 15 5
;
*6.18(A) STEM AND LEAF OF X1, X2 AND X3;
PROC UNIVARIATE DATA=MATH PLOT;
TITLE1 '6.18(A) STEM AND LEAF OF WORK QUALITY, X1';
TITLE2 'AND OF YEARS OF EXPERIENCE, X2';
TITLE3 'AND OF PUBLICATION SUCCESS, X3';
VAR X1 X2 X3;
RUN;
*6.18(B) SCATTERPLOT MATRICES AND CORRELATION;
PROC GPLOT DATA=MATH;
TITLE '6.18(B) SCATTERPLOT MATRICES';
PLOT Y*X1;
PLOT Y*X2;
PLOT Y*X3;
PLOT X1*X2;
PLOT X1*X3;
PLOT X2*X3;
RUN;
PROC CORR DATA=MATH;
TITLE '6.18(B) CORRELATION Y, X1, X2 AND X3';
VAR Y X1 X2 X3;
RUN;
*6.18(C) REGRESSION;
PROC REG DATA=MATH OUTEST=EST;
TITLE1 '6.18(C) REGRESSION OF Y VS X1, X2 AND X3';
MODEL Y = X1 X2 X3;
OUTPUT OUT=OUTPLOT PREDICTED=PRED RESIDUAL=RESID;
RUN;
*6.18(D) BOXPLOT OF RESIDUALS;
PROC UNIVARIATE DATA=OUTPLOT PLOT;
TITLE1 '6.18(D) BOXPLOT OF RESIDUALS';
VAR RESID;
RUN;
*6.18(E) RESIDUALS VS PREDICTED, X1, X2, X3 AND INTERACTIONS;
PROC GPLOT DATA=OUTPLOT;
TITLE '6.18(E-1) RESIDUALS VS VARIOUS';
PLOT RESID*PRED;
PLOT RESID*X1;
PLOT RESID*X2;
PLOT RESID*X3;
PLOT RESID*X1X2;
PLOT RESID*X1X3;
PLOT RESID*X2X3;
RUN;
PROC SORT DATA=OUTPLOT;
BY RESID;
RUN;
DATA OUTPLOT;
SET OUTPLOT NOBS=NOBS;
QUANTILE = PROBIT( (_N_- (3/8)) / (NOBS + (1/4)) );
RUN;
DATA OUTPLOT2;
IF _N_ = 1 THEN SET EST;
SET OUTPLOT;
EXPRESIDUAL = _RMSE_*QUANTILE;
RUN;
PROC GPLOT DATA=OUTPLOT2;
TITLE '6.18(E-2) NORMAL PROBABILITY PLOT';
PLOT RESID*EXPRESIDUAL;
RUN;
*6.18(G) LEVENE TEST OF RESIDUALS;
DATA NEWMATH;
SET OUTPLOT;
IF PRED < 38.75 THEN LEVENEGROUP = 'A';
IF PRED GE 38.75 THEN LEVENEGROUP = 'B';
RUN;
PROC GLM DATA=NEWMATH ALPHA=0.05;
TITLE1 '6.18(G) (UNMODIFIED) LEVENE TEST';
TITLE2 'OF HOMOGENEITY OF VARIANCE OF RESIDUALS';
CLASS LEVENEGROUP;
MODEL RESID = LEVENEGROUP;
MEANS LEVENEGROUP / HOVTEST = LEVENE (TYPE=ABS);
RUN;
QUIT;
(a) Stem and Leaf Plots.
(b) Scatterplots and Correlation Matrix
(c) Estimated Regression.
(d) Residual Box Plot.
(e) Residual Plots.
(f) Lack of Fit Test.
(g) Levene Test
1. Statement.
(i) $H_0$: error variance constant versus $H_1$: $\sigma^2 > 1$.
(ii) $H_0$: error variance constant versus $H_1$: error variance not constant.
(iii) $H_0$: error variance constant versus $H_1$: $\sigma^2 \neq 1$.
2. Test.
From SAS, the p-value is (choose one) 0.446 / 0.8278 / 0.884
3. Conclusion.
(circle one) accept / reject the null hypothesis that the error variance
is constant.
(6.19) Mathematicians salaries continued, hw3-6-19-math-famCI
*HOMEWORK 3, 6-19, PAGES 252-257;

DATA MATH;
INPUT Y X1 X2 X3;
X1X2 = X1*X2;
X1X3 = X1*X3;
X2X3 = X2*X3;
DATALINES;
33.2 3.5 9 6.1
40.3 5.3 20 6.4
38.7 5.1 18 7.4
46.8 5.8 33 6.7
41.4 4.2 31 7.5
37.5 6 13 5.9
39 6.8 25 6
40.7 5.5 30 4
30.1 3.1 5 5.8
52.9 7.2 47 8.3
38.2 4.5 25 5
31.8 4.9 11 6.4
43.3 8 23 7.6
44.1 6.5 35 7
42.8 6.6 39 5
33.6 3.7 21 4.4
34.2 6.2 7 5.5
48 7 40 7
38 4 35 6
35.9 4.5 23 3.5
40.4 5.9 33 4.9
36.8 5.6 27 4.3
45.2 4.8 34 8
35.1 3.9 15 5
;
*6.19 REGRESSION OF Y ON X1, X2 AND X3;
PROC REG DATA=MATH OUTEST=EST TABLEOUT ALPHA=0.05;
TITLE '6.19 REGRESSION';
TITLE2 'BONFERRONI JOINT CIs FOR B1, B2 AND B3';
TITLE3 'CORRELATION';
MODEL Y = X1 X2 X3;
RUN;
QUIT;
(a) Test of regression relation at $\alpha = 0.05$.
1. Statement.
(i) $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ versus $H_1: \beta_1 = \beta_2 = \beta_3 > 0$.
(ii) $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ versus $H_1: \beta_1 = \beta_2 = \beta_3 < 0$.
(iii) $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ versus $H_1$: not all $\beta_i$ ($i = 1, 2, 3$) equal zero.
2. Test.
From SAS, the p-value is (choose one) 0 / 0.0827 / 0.098
3. Conclusion.
(circle one) accept / reject the null hypothesis that $\beta_1 = \beta_2 = \beta_3 = 0$.
(b) Bonferroni Confidence Intervals.
From TI83 (INVT 20 ENTER 0.9917 ENTER)
$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.05/(2 \cdot 3);\ 24 - 4) = t(0.9917;\ 20) = 2.614$
From SAS,
$b_1 = 1.1031$ and $s\{b_1\} = 0.330$,
$b_1 \pm B s\{b_1\} = 1.1031 \pm 2.614(0.330) = ?$
$b_2 = 0.3215$ and $s\{b_2\} = 0.037$,
$b_2 \pm B s\{b_2\} = 0.3215 \pm 2.614(0.037) = ?$
$b_3 = 1.2889$ and $s\{b_3\} = 0.298$,
$b_3 \pm B s\{b_3\} = 1.2889 \pm 2.614(0.298) = ?$
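The percentile handed to the t quantile in a Bonferroni procedure depends only on the family level and the number of intervals g. The Python sketch below tabulates 1 - 0.05/(2g) for the family sizes that appear in this homework; `bonferroni_level` is an illustrative name.

```python
def bonferroni_level(alpha, g):
    """Percentile 1 - alpha / (2 g) used for each of g two-sided intervals."""
    return 1.0 - alpha / (2.0 * g)

# Family sizes used in this homework: g = 2, 3, 4 and 5 at alpha = 0.05.
levels = {g: bonferroni_level(0.05, g) for g in (2, 3, 4, 5)}
for g, level in levels.items():
    print(f"g = {g}: t percentile = {level:.5f}")
```

The matching degrees of freedom are always n - p for the model being fitted, so both numbers should be checked before reading a value off the calculator or a table.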
(c) From SAS,
(6.20) Mathematicians salaries, hw3-6-20-math-respCI
*HOMEWORK 3, 6-20, PAGES 252-257;

DATA MATHX;
INPUT Y X1 X2 X3;
DATALINES;
33.2 3.5 9 6.1
40.3 5.3 20 6.4
38.7 5.1 18 7.4
46.8 5.8 33 6.7
41.4 4.2 31 7.5
37.5 6 13 5.9
39 6.8 25 6
40.7 5.5 30 4
30.1 3.1 5 5.8
52.9 7.2 47 8.3
38.2 4.5 25 5
31.8 4.9 11 6.4
43.3 8 23 7.6
44.1 6.5 35 7
42.8 6.6 39 5
33.6 3.7 21 4.4
34.2 6.2 7 5.5
48 7 40 7
38 4 35 6
35.9 4.5 23 3.5
40.4 5.9 33 4.9
36.8 5.6 27 4.3
45.2 4.8 34 8
35.1 3.9 15 5
. 5.0 20 5
. 6.0 30 6
. 4.0 10 4
. 7.0 50 7
;
*6.20 BONFERRONI AND WH JOINT CIs FOR MEAN;
DATA MATH X;
SET MATHX;
IF Y NE . THEN OUTPUT MATH;
ELSE OUTPUT X;
RUN;
PROC REG DATA=MATH ALPHA=0.05 NOPRINT;
TITLE '6.20 BONFERRONI AND WH JOINT CIs FOR MEAN';
MODEL Y = X1 X2 X3;
RUN;
PROC REG DATA=MATHX;
MODEL Y = X1 X2 X3;
OUTPUT OUT=PRED_DS(WHERE=(Y =.)) P=PHAT STDP=STDP;
RUN;
PROC PRINT DATA=PRED_DS;
RUN;
QUIT;
(a) At $\alpha = 0.05$, and $g = 4$ (four simultaneous intervals),
and $p = 4$ (parameters: $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$)
from TI83,
$W = \sqrt{pF(1 - \alpha;\ p,\ n - p)} = \sqrt{4F(1 - 0.05;\ 4,\ 24 - 4)} = 3.388$
$B = t(1 - \alpha/2g;\ n - p) = t(1 - 0.05/(2 \cdot 4);\ 24 - 4) = t(0.99375;\ 20) = 2.744$
Since $W = 3.388 > B = 2.744$, use B because the Bonferroni gives narrower
(more efficient) CIs than the Working-Hotelling CIs.
From SAS,
1. $X_{h1} = 5$, $X_{h2} = 20$, $X_{h3} = 5$:
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 36.2377 \pm 2.744(0.4631) = ?$
2. $X_{h1} = 6$, $X_{h2} = 30$, $X_{h3} = 6$:
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 41.8449 \pm 2.744(0.4170) = ?$
3. $X_{h1} = 4$, $X_{h2} = 10$, $X_{h3} = 4$:
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 30.6304 \pm 2.744(0.7560) = ?$
4. $X_{h1} = 7$, $X_{h2} = 50$, $X_{h3} = 7$:
$\hat{Y}_h \pm B s\{\hat{Y}_h\} = 50.6674 \pm 2.744(0.8975) = ?$
The questions from the text are altered somewhat to fit into the multiple choice
context given on Vista. The altered questions are given below.
Problem 6.9, pp 252-257.

Match the problems with the answers.
problem answer
6.9(a) time plots indicate wavelike pattern in Xi1 and Xi2
6.9(b) time plots indicate fairly random distribution of Xi1 and Xi2
6.9(c) scatterplot, correlation indicates strong correlation between Y and Xi1 only
stem and leaf plots indicate Xi1 , Xi2 both have two extreme outliers
stem and leaf plots indicate fairly even distribution in Xi1 , Xi2
scatterplot, correlation indicates strong correlations between Y , Xi1 and Xi2
Problem 6.10, pp 252-257.

problem answer
6.10(a) Y = 3.324 + 4.768Xi1 + 5.080Xi2
6.10(b) box plot indicates no outlying residuals
6.10(c) residual plot, normal probability plot indicates no outlying residuals
6.10(d) residual vs time plot indicates no outlying residuals
6.10(e) Levene test p-value is 0.8278
Y = 3.324 + 3.768Xi1 + 5.080Xi2
box plot indicates one outlying residual
residual plot, normal probability plot indicates one outlying residual
residual vs time plot indicates one outlying residual
Levene test p-value is 0.989
Problem 6.11, pp 252-257.

problem answer
6.11(a) R2 = 0.787
6.11(b) Bonferroni CI for $\beta_1$ is (2.259, 5.277)
6.11(c) R2 = 0.987
test of regression relation has F = 541.58
Bonferroni CI for $\beta_1$ is (3.443, 6.717)
Problem 6.12, pp 252-257.
problem answer
6.12(a) for family CIs of response, B = 4.098 > W = 2.898
6.12(b) point (Xh1 , Xh2 ) = (20, 5) is inside scatter plot
for family CIs of response, W = 4.098 > B = 2.898
for family CIs of response, W = 3.098 > B = 2.898
point (Xh1 , Xh2 ) = (20, 5) is outside scatter plot
point (Xh1 , Xh2 ) = (20, 19) is outside scatter plot
Problem 6.13, pp 252-257.

problem answer
6.13(a) for Xh1 = 12 and Xh2 = 9.00, CI is (78.176, 100.339)
6.13(b) for Xh1 = 15 and Xh2 = 12.50, CI is (107.081, 159.600)
6.13(c) for Xh1 = 15 and Xh2 = 12.50, CI is (107.081, 139.600)
6.13(d) for Xh1 = 18 and Xh2 = 16.50, CI is (157.923, 172.004)
for Xh1 = 9 and Xh2 = 7.20, CI is (47.590, 90.031)
for Xh1 = 9 and Xh2 = 7.20, CI is (57.590, 90.031)
for Xh1 = 12 and Xh2 = 9.00, CI is (78.176, 110.339)
for Xh1 = 18 and Xh2 = 16.50, CI is (137.923, 172.004)
Problem 6.14, pp 252-257.

problem answer
6.14(a) for (Xh1 , Xh2 ) = (7, 6), PI of TOTAL is (166.94, 204.14)
6.14(b) for (Xh1 , Xh2 ) = (7, 6), PI of TOTAL is (156.94, 214.14)
for (Xh1 , Xh2 ) = (7, 6), CI of MEAN is (42.312, 68.045)
for (Xh1 , Xh2 ) = (7, 6), PI of TOTAL is (156.94, 204.14)
Problem 6.18, pp 252-257.

problem answer
6.18(a) scatterplot, correlation indicates strong correlations between Y, Xi1, Xi2 and Xi3
6.18(b) residual box plot indicates badly skewed distribution
6.18(c) Y = 7.84693 + 0.10313Xi1 + 0.32152Xi2 + 1.28894Xi3
6.18(d) residual box plot indicates fairly symmetric distribution
6.18(e) residual plots, normal probability plot indicates data normal
6.18(f) lack of fit test p-value is 0.567
6.18(g) Levene test p-value is 0.884
stem and leaf plots indicate one extreme outlier in Xi1, Xi2, Xi3
stem and leaf plots indicate fairly even distribution in Xi1, Xi2, Xi3
Y = 17.84693 + 1.10313Xi1 + 0.32152Xi2 + 1.28894Xi3
scatterplot, correlation indicates strong correlations between Y and Xi1, Y and Xi2, Y and Xi3 only
residual plots, normal probability plot indicates data not normal
unable to do lack of fit test because no repeated observations
Levene test p-value is 0.584
Problem 6.19, pp 252-257.

problem answer
6.19(a) test of regression relation has F = 68.119
6.19(b) Bonferroni CI for $\beta_3$ is (0.240, 1.966)
6.19(c) R2 = 0.8087
Bonferroni CI for $\beta_3$ is (0.510, 2.068)
R2 = 0.9109
Problem 6.20, pp 252-257.

problem answer
6.20(a) for (Xh1 , Xh2 , Xh3 ) = (5, 20, 5), CI is (36.967, 37.508)
6.20(b) for (Xh1 , Xh2 , Xh3 ) = (6, 30, 6), CI is (40.701, 42.989)
6.20(c) for (Xh1 , Xh2 , Xh3 ) = (7, 50, 7), CI is (48.205, 55.130)
6.20(d) for (Xh1 , Xh2 , Xh3 ) = (7, 50, 7), CI is (48.205, 53.130)
for (Xh1 , Xh2 , Xh3 ) = (5, 20, 5), CI is (34.967, 37.508)
for (Xh1 , Xh2 , Xh3 ) = (6, 30, 6), CI is (41.701, 42.989)
for (Xh1 , Xh2 , Xh3 ) = (4, 10, 4), CI is (29.556, 32.705)
for (Xh1 , Xh2 , Xh3 ) = (4, 10, 4), CI is (28.556, 32.705)
