Answers to recommended practice problems from Chapter 1
Note: These are brief answers to the problems; many would need more detail in order to receive full marks on a test or exam.

1.3 There will be sources of variation in the Y's caused by, for example, variations in the experimental conditions, measurement error, etc., which can be modelled as random error about a mean value that is a functional relation in the X's. So regression analysis is appropriate.

1.5 No. Either the left-hand side should be Yi or the i should not appear on the right-hand side.

1.6 (b) β0 = 200 indicates that the mean value of Y when X = 0 is 200. β1 = 5 indicates that for each unit increase in X, the mean value of Y increases by 5.

1.7 (a) No. We need to know the probability distribution of the Y's (or the ε's). (b) Yes. Now we know the distribution is normal. Standardize and use Table B.1 to calculate P(195 ≤ Y ≤ 205), where the distribution of Y is N(100 + 20(5), 25). (The answer is approximately 0.68.)

1.8 E(Y) would still be 104. Y would not likely be 108 (it would be some draw from a distribution with mean 104).

1.11 A slope less than 1 could be the regression effect: employees who did well before the training will tend to do worse, on average, the next time they are measured, and employees who did poorly before the training will tend to do better, on average, the next time they are measured. However, the cut-off point (where the regression line crosses the line Y = X) is at X = 400, but for this situation X ranges from 40 to 100. Therefore, on average, employees did better after the training. The slope of 0.95 is not the whole story!

1.16 No. For least squares we didn't require any statistical assumptions.

1.18 No. The εi's are random variables with mean 0. We can't actually observe their values. There is no reason they should sum to 0.

1.20 (a) Ŷ = −0.58016 + 15.03525X (b) Seems to be a pretty good fit. (c) b0 = −0.58016 is the predicted value of Y when X = 0. A negative Y makes no sense for this problem, and neither does a zero X.
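As a quick numerical check of 1.7(b), the standardization step can be sketched in Python. This uses only the standard library; expressing the standard normal CDF through the error function is a convenience here, standing in for the Table B.1 lookup:

```python
from math import erf, sqrt

def normal_cdf(z):
    # Standard normal CDF, written in terms of the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu = 100 + 20 * 5   # E(Y) at X = 5, i.e. 200
sigma = sqrt(25)    # standard deviation of Y
p = normal_cdf((205 - mu) / sigma) - normal_cdf((195 - mu) / sigma)
print(round(p, 4))  # 0.6827, i.e. approximately 0.68
```

Since 195 and 205 are exactly one standard deviation below and above the mean, the probability is the familiar 68% of the empirical rule.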
Extrapolation is not to be trusted! (d) −0.58016 + 15.03525(5) = 74.5961

1.21 (a) Ŷ = 10.2 + 4.0X. The line seems to fit the data pretty well, but the variability in the Y's seems to be greater for smaller X's.
(b) 14.2 (c) 4.0 (the slope is the expected increase in Y when X increases by 1 unit) (d) (X̄, Ȳ) = (1, 14.2), which satisfies the fitted equation.

1.24 (a) See the output (.lst file) for 1.20 for the residuals. The sum of the squared residuals is 3416.37702, which you can calculate from the residuals. Of course, it's also SSE from the ANOVA table. This is the minimum value of Q (minimized over b0 and b1). (b) s² = SSE/(n − 2) = 79.45063 = MSE; s = √s² = 8.91351, in minutes, the units of Y.

1.29 The model would go through the origin (0, 0).

1.30 The model is a horizontal line. The fitted line would be Ŷ = Ȳ.

1.33 Minimize S = Σ(Yi − b0)² by differentiating S with respect to b0 and setting the derivative equal to 0. This gives b0 = Ȳ.

1.36 Σ Ŷi ei = Σ(b0 + b1 Xi)ei = b0 Σ ei + b1 Σ Xi ei = 0, using properties previously proven.

1.39 (a) The equations for b0 and b1 do not change. This can be seen by plugging the 6 points versus the 3 points into the formulas for b0 and b1.

1.40 The line would not change. This point would add nothing to Σ(Yi − Ŷi)², which is the quantity we minimized to find b0 and b1. So we'd get the same answer whether or not it was there.

1.41 (a) Minimize S = Σ(Yi − b1 Xi)² by differentiating with respect to b1; setting the derivative equal to zero gives b1 = Σ Xi Yi / Σ Xi².
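The regression-through-the-origin estimator derived in 1.41(a) is easy to sketch in Python. The data below are illustrative only (not from the textbook), chosen to lie exactly on the line Y = 3X so the estimator's value is obvious:

```python
def slope_through_origin(x, y):
    # Least-squares slope for the no-intercept model Y = b1 * X:
    # b1 = sum(Xi * Yi) / sum(Xi^2), as derived in 1.41(a)
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Illustrative data on the exact line Y = 3X
x = [1.0, 2.0, 3.0]
y = [3.0, 6.0, 9.0]
print(slope_through_origin(x, y))  # 3.0
```

Note that, unlike the two-parameter model, this fit is not forced through (X̄, Ȳ); it is forced through (0, 0) instead, which connects back to 1.29.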