Multiple Regression Example

Because we are short of time we will use a less involved approach and limit the number of predictor variables in our examples. Now the whole picture becomes more complex. We are concerned with the nature and significance of the relations between the independent variables and the dependent variable. Questions which are often asked are as follows:

1. What is the relative importance of the different factors?
2. What is the magnitude of the effect of each of the independent variables on the dependent variable?
3. Can any independent variable be dropped because of a lack of effect on the dependent variable?
4. Should any independent variables not yet included in the model be considered for possible inclusion?

Simple Example of a Multiple Regression

Obs.     Y     X1      X2
  1     46     14     1.00
  2     51     15     1.25
  3     69     16     3.00
  4     74     17     3.25
  5     80     18     4.00
  6     82     19     5.25
  7     97     20     5.50

In the data above, X1 and X2 are two independent variables and Y is a dependent variable. Perhaps the most useful summary of the information from this data set is provided by the analysis of variance for regression:

Source                       d.f.        SS          MS
Regression                     2     1832.13      916.06
  X1                           1     1824.14     1824.14 **
  X2 after X1                  1        7.99        7.99 NS
Deviation from regression      4       83.30       20.82
Total (corr.)                  6     1915.43

The least squares equation for the model containing both X1 and X2 is obtained by solving the following normal equations:

n b0      +  (ΣX1) b1    +  (ΣX2) b2    =  ΣY
(ΣX1) b0  +  (ΣX1²) b1   +  (ΣX1X2) b2  =  ΣX1Y
(ΣX2) b0  +  (ΣX1X2) b1  +  (ΣX2²) b2   =  ΣX2Y

Numerically, these equations are

7 b0      +  119 b1     +  23.25 b2   =  499
119 b0    +  2051 b1    +  417.75 b2  =  8709
23.25 b0  +  417.75 b1  +  95.94 b2   =  1841.25

Solution of the above set of normal equations gives the following regression equation:

Ŷ = -29.231 + 5.219 X1 + 3.549 X2

From the analysis of variance, we see that X1, entered first into the model, is the only predictor variable which is required. If X1 is ignored, we have a simple regression of Y on X2 which gives b2 = 9.825. The t-test of the hypothesis β2 = 0 in this simple linear regression model is as follows:

t = b2 / √(s²/Σx2²) = 9.825 / √(21.827/18.71) = 9.10 **

where s² = 21.827 is the error mean square from that simple regression and Σx2² = Σ(X2 − X̄2)² = 18.71 is the corrected sum of squares for X2. This shows that if either variable is included alone it is significant. However, the second predictor variable isn't needed. This tells us that the two independent variables are highly correlated.

R², the coefficient of multiple determination, is SS Regression/SS Total = 1832.13/1915.43 = .957. R, the coefficient of multiple correlation, is √R² = .978.

The test of the hypothesis that together these variables are not contributing significantly to the explanation of Y is carried out by forming an F-ratio of mean square regression over mean square error: F = 916.06/20.82 = 44.0 with 2 and 4 d.f. Since the hypothesis is rejected, we conclude that one or both of these variables is (are) contributing significantly to the explanation of Y.

The predicted values for each of the combinations of the Xi in the original data are:

Obs.       Ŷ
  1     47.388
  2     53.495
  3     64.926
  4     71.032
  5     78.913
  6     88.569
  7     94.676

The expression for the variance of the predicted mean value of Y at the point X1 = 18.5 and X2 = 4.5 (a point not in the data set) is

v(Ŷ) = s² [1  18.5  4.5] (X'X)⁻¹ [1  18.5  4.5]'

where s² = error mean square = 20.82 and (X'X)⁻¹, the inverse of the matrix of coefficients of the normal equations, is

  179.097   -13.713    16.310
  -13.713     1.054    -1.268
   16.310    -1.268     1.577

which gives v(Ŷ) = 20.82 × 0.2244 ≈ 4.67. Note that the right-most expressions are matrices; we are dealing with the product of matrices in computing this variance.

Multiple regression, although appearing complicated, is necessary because many biological phenomena are dependent upon more than one factor. This is a very useful technique in agricultural research.
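The whole worked example can be checked numerically. The following is a minimal sketch, not part of the original notes, assuming NumPy is available: it rebuilds the normal equations from the data table, solves them, and reproduces the ANOVA sums of squares, R², the predicted values, and the variance of the predicted mean at X1 = 18.5, X2 = 4.5.

```python
# Sketch only: reproduces the hand calculations in the worked example above.
import numpy as np

Y  = np.array([46, 51, 69, 74, 80, 82, 97], dtype=float)
X1 = np.array([14, 15, 16, 17, 18, 19, 20], dtype=float)
X2 = np.array([1.00, 1.25, 3.00, 3.25, 4.00, 5.25, 5.50])

# Design matrix with an intercept column; X'X is the coefficient matrix
# of the normal equations and X'Y is their right-hand side.
X   = np.column_stack([np.ones_like(Y), X1, X2])
XtX = X.T @ X
XtY = X.T @ Y
b   = np.linalg.solve(XtX, XtY)           # b0, b1, b2

n        = len(Y)
ss_total = np.sum((Y - Y.mean()) ** 2)    # total (corrected) SS = 1915.43
yhat     = X @ b                          # predicted values
ss_resid = np.sum((Y - yhat) ** 2)        # deviation from regression = 83.30
ss_reg   = ss_total - ss_resid            # regression SS = 1832.13
s2       = ss_resid / (n - 3)             # error mean square = 20.82
r2       = ss_reg / ss_total              # R^2 = .957

# Variance of the predicted mean value of Y at X1 = 18.5, X2 = 4.5:
# v(Yhat) = s^2 * x' (X'X)^-1 x
x0 = np.array([1.0, 18.5, 4.5])
v  = s2 * (x0 @ np.linalg.solve(XtX, x0))

print("b =", b.round(3))                  # [-29.231   5.219   3.549]
print("R^2 =", round(r2, 3))
print("v(Yhat) =", round(v, 3))           # about 4.67
```

Running this reproduces the regression equation Ŷ = -29.231 + 5.219 X1 + 3.549 X2 and R² = .957 obtained above.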
For your applications it would seem advisable to keep the number of predictor variables to a manageable number. Not only does the computing become difficult with many Xi variables, but the prediction equation becomes unwieldy. Often it is possible to find a few important predictor variables which may be used to predict Y well. Assessment of the relative importance of the predictor variables is part of the process in conducting a multiple regression analysis.

Standard Output From Regression Programs

1. The analysis of variance for the regression
2. R²
3. The regression equation
4. Standard errors and t values for the regression coefficients
5. Predicted Y's (usually an optional output)
6. The standard errors of predicted Y's

Kinds of Regression Models

1. Prediction - The object here is to find a set of predictors which do a good job of predicting Y, the dependent variable. Less emphasis is placed on the biological relevance of the factors. The main test is whether or not a variable aids in the prediction of Y. There are procedures available in computer packages which assist in building a prediction model when we know little of the biological relevance of the factors. Some of these are: stepwise regression, forward selection, maximum R² improvement, backward elimination and all possible regressions (a sketch of forward selection follows this list).

2. Control - Here the levels of one or more quantitative factor(s) are deliberately controlled in an experimental setting, and the response to the controlled variable(s) is then related to the levels of the factors by multiple regression. We are usually trying to determine optimum combinations or optimum conditions using such an approach. A good example is an N-P-K fertilizer experiment in which each factor has four levels. We then fit a "response surface" to find the "physical optimum" and the combination of rates of N, P and K which produce this optimum. Usually part of the analysis is an economic analysis to determine the economic optimum combination of N, P and K.
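To make the selection procedures concrete, here is a minimal sketch of forward selection in the same NumPy style as the earlier sketch. It is illustrative rather than any package's actual routine; the function names and the F-to-enter threshold f_in are assumptions.

```python
# Sketch of forward selection: at each step, enter the candidate predictor
# with the largest partial F-statistic, stopping when no candidate's partial
# F exceeds the F-to-enter threshold.
import numpy as np

def ss_resid(X, y):
    """Residual sum of squares from least squares on design matrix X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

def forward_select(predictors, y, f_in=4.0):
    """predictors: dict of name -> column vector; returns entered names."""
    n = len(y)
    chosen = []
    design = np.ones((n, 1))                 # start with intercept only
    while True:
        sse_cur = ss_resid(design, y)
        best = None
        for name, col in predictors.items():
            if name in chosen:
                continue
            trial = np.column_stack([design, col])
            sse_new = ss_resid(trial, y)
            df_err = n - trial.shape[1]
            f = (sse_cur - sse_new) / (sse_new / df_err)   # partial F, 1 d.f.
            if best is None or f > best[0]:
                best = (f, name, trial)
        if best is None or best[0] < f_in:   # no variable passes F-to-enter
            return chosen
        chosen.append(best[1])
        design = best[2]
```

Applied to the example data above with f_in = 4, this enters X1 and then stops, agreeing with the ANOVA table: the partial F for X2 after X1 is 7.99/20.82 = 0.38, well below any reasonable F-to-enter.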
