
Multivariate Analysis I: Multiple Regression Analysis

Learning Objectives
Upon completion of this chapter, you will be able to:
- Understand the applications of the multiple regression model
- Understand the concept of the coefficient of multiple determination, the adjusted coefficient of multiple determination, and the standard error of the estimate
- Understand and use residual analysis for testing the assumptions of multiple regression
- Use statistical significance tests for the regression model and the coefficients of regression
- Test portions of the multiple regression model
- Understand the non-linear regression model and the quadratic regression model, and test the statistical significance of the overall quadratic regression model
- Understand the concept of model transformation in regression models
- Understand the concept of collinearity and the use of variance inflationary factors in multiple regression
- Understand the conceptual framework of model building in multiple regression

The Multiple Regression Model


Regression analysis with two or more independent variables, or with at least one non-linear predictor, is referred to as multiple regression analysis.
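In standard notation (β denotes the population regression coefficients, b their sample estimates, and ε the random error term), the multiple regression model with k independent variables and the corresponding multiple regression equation are

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon

\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k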


Figure 16.1: Summary of the estimation process for multiple regression


Multiple Regression Model with Two Independent Variables


The multiple regression model with two independent variables is the simplest multiple regression model, in which the highest power of any variable is equal to one.
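In standard notation, the model with two independent variables and the corresponding multiple regression equation are

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon

\hat{y} = b_0 + b_1 x_1 + b_2 x_2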


Example 16.1
A consumer electronics company has adopted an aggressive policy to increase the sales of a newly launched product. The company has invested in advertisements and has also employed salesmen to increase sales rapidly. Table 16.2 presents the sales, the number of salesmen employed, and the advertisement expenditure for 24 randomly selected months. Develop a regression model to estimate the impact of advertisement expenditure and the number of salesmen on sales.


Table 16.2: Sales, number of salesmen employed, and advertisement expenditure for 24 randomly selected months of a consumer electronics company


Using MS Excel, Minitab and SPSS for multiple regression


Ch 16 Solved Examples\Excel\Ex 16.1.xls Ch 16 Solved Examples\Minitab\EX16.1.MPJ Ch 16 Solved Examples\SPSS\Ex 16.1.sav Ch 16 Solved Examples\SPSS\Output Ex 16.1.spv
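The same model can also be fitted outside these packages. The following is a minimal Python sketch (not taken from the text), assuming the three columns of Table 16.2 have been loaded into one-dimensional arrays; the names used below are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm

def fit_sales_model(sales, salesmen, adv):
    """Fit sales = b0 + b1*salesmen + b2*adv by ordinary least squares.

    The arguments are 1-D arrays holding the 24 monthly observations of
    Table 16.2; the argument names are illustrative, not from the text.
    """
    X = sm.add_constant(np.column_stack([salesmen, adv]))  # design matrix with intercept
    return sm.OLS(np.asarray(sales, dtype=float), X).fit()

# Example usage, once the Table 16.2 data are available as arrays:
# model = fit_sales_model(sales, salesmen, adv)
# print(model.summary())  # coefficients, R square, adjusted R square, F and t tests
```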


Determination of Coefficient of Multiple Determination

In the case of multiple regression, the coefficient of multiple determination is the proportion of the variation in the dependent variable y that is explained by the combination of the independent (explanatory) variables.
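In terms of the regression sum of squares (SSR), the error sum of squares (SSE), and the total sum of squares (SST), the coefficient of multiple determination is

R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}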

This implies that 73.90% of the variation in sales is explained by the variation in the number of salesmen employed and the variation in the advertisement expenditure.


Adjusted R Square
Adjusted R square is commonly used when a researcher wants to compare two or more regression models that have the same dependent variable but a different number of independent variables.
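For a model with k independent variables fitted to a sample of size n, the adjusted value is

\text{Adjusted } R^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}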

This indicates that 71.42% of the total variation in sales can be explained by the multiple regression model adjusted for the number of independent variables and sample size.

Standard Error of the Estimate
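The standard error of the estimate measures the typical deviation of the observed y values around the fitted regression equation. For a model with k independent variables it is

s_e = \sqrt{\frac{SSE}{n - k - 1}}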


Figures 16.8 & 16.9: Partial regression output from MS Excel and Minitab showing coefficient of multiple determination, adjusted R square, and standard error


Residual Analysis for the Multiple Regression Model


Residual analysis is used to examine the following assumptions of the multiple regression model (a plotting sketch follows the list):

- Linearity of the regression model
- Constant error variance (homoscedasticity)
- Independence of errors
- Normality of errors
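As a rough illustration (not from the text), and assuming a fitted statsmodels result object named model, such as the one returned by the Example 16.1 sketch above, these assumptions are commonly checked with residual plots:

```python
import matplotlib.pyplot as plt
from scipy import stats

def residual_diagnostics(model):
    """Basic residual plots for checking the regression assumptions."""
    resid = model.resid              # residuals e = y - y_hat
    fitted = model.fittedvalues      # fitted values y_hat

    fig, axes = plt.subplots(1, 3, figsize=(12, 4))

    # Linearity and constant error variance: residuals versus fitted values
    axes[0].scatter(fitted, resid)
    axes[0].axhline(0, color="grey")
    axes[0].set_title("Residuals vs fitted values")

    # Independence of errors: residuals in observation (time) order
    axes[1].plot(resid, marker="o")
    axes[1].axhline(0, color="grey")
    axes[1].set_title("Residuals vs observation order")

    # Normality of errors: normal probability plot of the residuals
    stats.probplot(resid, plot=axes[2])

    plt.tight_layout()
    plt.show()
```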


Testing the Statistical Significance of the Overall Regression Model
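The hypotheses for the overall regression model are

H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0
H_1: \text{at least one } \beta_j \neq 0

and the test statistic is

F = \frac{MSR}{MSE} = \frac{SSR/k}{SSE/(n - k - 1)}

which follows an F distribution with k and n - k - 1 degrees of freedom when H_0 is true.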

Figure 16.18(a): Computation of the F statistic using MS Excel (partial output for Example 16.1)


t Test for Testing the Statistical Significance of Regression Coefficients

The hypotheses for testing the regression coefficient of each independent variable can be set as follows:
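H_0: \beta_j = 0 (there is no significant linear relationship between x_j and y)
H_1: \beta_j \neq 0

The test statistic is t = b_j / s_{b_j}, where s_{b_j} is the standard error of b_j; under H_0 it follows a t distribution with n - k - 1 degrees of freedom.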

Figure 16.19(a): Computation of the t statistic using MS Excel (partial output for Example 16.1)


Non-linear Regression Model: The Quadratic Regression Model

Figure 16.22: Existence of a non-linear (quadratic) relationship between the dependent and independent variable (β2 is the coefficient of the quadratic term)
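In standard notation, the quadratic regression model with one independent variable is

y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon

where \beta_2 is the coefficient of the quadratic term; the corresponding estimated equation is \hat{y} = b_0 + b_1 x_1 + b_2 x_1^2.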




Example 16.2
A leading consumer electronics company has 125 retail outlets in the country. The company spent heavily on advertisements in the previous year and wants to estimate their effect on sales. It has taken a random sample of 21 retail stores from the total population of 125 retail stores. Table 16.5 provides the sales and advertisement expenses (in thousand rupees) of the 21 randomly selected retail stores.


Table 16.5: Sales and advertisement expenses of 21 randomly selected retail stores

Fit an appropriate regression model. Predict the sales when advertisement expenditure is Rs 28,000.

Using MS Excel, Minitab, and SPSS for the Quadratic Regression Model
Ch 16 Solved Examples\Excel\Ex 16.2 quadratic1.xls Ch 16 Solved Examples\Minitab\EX 16.2 QUADRATIC 1.MPJ Ch 16 Solved Examples\SPSS\Ex 16.2 Quadratic.sav Ch 16 Solved Examples\SPSS\Ouput Ex 16.2.spv
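As with Example 16.1, a minimal Python sketch (not from the text) can fit the quadratic model, assuming the two columns of Table 16.5 are available as arrays named adv and sales (illustrative names):

```python
import numpy as np
import statsmodels.api as sm

def fit_quadratic(adv, sales):
    """Fit sales = b0 + b1*adv + b2*adv**2 (quadratic regression).

    adv and sales are 1-D arrays holding the columns of Table 16.5;
    the argument names are illustrative, not from the text.
    """
    adv = np.asarray(adv, dtype=float)
    X = sm.add_constant(np.column_stack([adv, adv ** 2]))  # x and x squared
    return sm.OLS(np.asarray(sales, dtype=float), X).fit()

# Example usage (advertisement is measured in thousand rupees, so Rs 28,000 is x = 28):
# model = fit_quadratic(adv, sales)
# b0, b1, b2 = model.params
# print(b0 + b1 * 28 + b2 * 28 ** 2)  # predicted sales at advertisement expenditure of Rs 28,000
```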


A Case When the Quadratic Regression Model Is a Better Alternative to the Simple Regression Model

Figure 16.31: Fitted line plot for Example 16.2 (simple regression model) produced using Minitab

A Case When the Quadratic Regression Model Is a Better Alternative to the Simple Regression Model (Contd.)

Figure 16.33: Fitted line plot for Example 16.2 (quadratic regression model) produced using Minitab

Testing the Statistical Significance of the Overall Quadratic Regression Model

The F statistic is used for testing the significance of the overall quadratic regression model, just as it is used for the simple regression model.

Testing the Quadratic Effect of a Quadratic Regression Model

The t statistic is used for testing the significance of the quadratic effect (the coefficient of the squared term) in the quadratic regression model.
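The hypotheses are

H_0: \beta_2 = 0 (including the quadratic term does not significantly improve the model)
H_1: \beta_2 \neq 0

and the test statistic is t = b_2 / s_{b_2}, where s_{b_2} is the standard error of the estimated quadratic coefficient b_2.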


The Indicator (Dummy) Variable Model

Regression models are based on the assumption that all the independent (explanatory) variables are numerical in nature. There may, however, be cases in which some of the variables are qualitative. Such variables generate nominal or ordinal information and, when used in multiple regression, are referred to as indicator or dummy variables. Researchers usually assign the codes 0 and 1 to dummy variables in their studies. It is important to note that the assignment of the codes 0 and 1 is arbitrary; the numbers merely represent a place for the category. A particular dummy variable xd can be defined as follows:
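For a qualitative variable with two categories (such as gender in Example 16.3), the coding can be, for example,

x_d = 1 if the observation belongs to the first category (say, male)
x_d = 0 if the observation belongs to the second category (female)

with the choice of which category receives the code 1 being arbitrary.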


Example 16.3
A company wants to test the effect of age and gender on the productivity (in terms of units produced by the employees per month) of its employees. The HR manager has taken a random sample of 15 employees and collected information about their age and gender. Table 16.6 provides data about the productivity, age, and gender of 15 randomly selected employees. Fit a regression model considering productivity as the dependent variable and age and gender as the explanatory variables.


Table 16.6: Data about productivity, age, and gender of 15 randomly selected employees.

Predict the productivity of male and female employees at 45 years of age.



Using MS Excel, Minitab and SPSS for the Dummy Variable Regression Model
Ch 16 Solved Examples\Excel\Ex 16.3 dummy & interaction.xls Ch 16 Solved Examples\Minitab\EX 16.3 DUMMY & INTERACTION.MPJ Ch 16 Solved Examples\SPSS\Ex 16.3.sav Ch 16 Solved Examples\SPSS\Output Ex 16.3 (Interaction).spv
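A minimal Python sketch (not from the text) for the dummy-variable model of Example 16.3, assuming the columns of Table 16.6 are available as arrays and gender has already been coded 0/1 (all names below are illustrative):

```python
import numpy as np
import statsmodels.api as sm

def fit_dummy_model(productivity, age, gender_code):
    """Fit productivity = b0 + b1*age + b2*gender_code.

    gender_code is the dummy variable, coded for example as 1 = male and
    0 = female; the coding direction is arbitrary. The argument names are
    illustrative, not from the text.
    """
    X = sm.add_constant(np.column_stack([age, gender_code]))
    return sm.OLS(np.asarray(productivity, dtype=float), X).fit()

# Example usage for the prediction asked for in Example 16.3:
# model = fit_dummy_model(productivity, age, gender_code)
# b0, b1, b2 = model.params
# print(b0 + b1 * 45 + b2 * 1)  # predicted productivity at age 45 for the category coded 1
# print(b0 + b1 * 45 + b2 * 0)  # predicted productivity at age 45 for the category coded 0
```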



Model Transformation in Regression Models


In many situations in regression analysis, the assumptions of regression are violated, or the researcher finds that the model is not linear. In both cases, the dependent variable y, the independent variable x, or both can be transformed to avoid violating the regression assumptions or to make the regression model linear.


The Square Root Transformation


The square root transformation is often used to overcome violations of the assumption of constant error variance (homoscedasticity) and to convert a non-linear model into a linear model.
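One common form, in which the independent variable is transformed (as in Example 16.4), is

y = \beta_0 + \beta_1 \sqrt{x_1} + \varepsilon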


Example 16.4
A furniture company receives 12 lots of wooden plates. Each lot is examined by the quality control inspector of the firm for defective items. His report is given in Table 16.8:

Taking batch size as the independent variable and the number of defectives as the dependent variable, fit an appropriate regression model and transform the independent variable if required.

Using MS Excel, Minitab and SPSS for the Square Root Transformation
Ch 16 Solved Examples\Excel\Ex 16.4 square root transformation.xls Ch 16 Solved Examples\Minitab\Ex 16.4 transformation.MPJ Ch 16 Solved Examples\SPSS\Ex 16.4 (Square root).sav Ch 16 Solved Examples\SPSS\Output Ex 16.4.spv


Logarithm Transformation
The logarithm transformation is often used to overcome violations of the assumption of constant error variance (homoscedasticity) and to convert a non-linear model into a linear model.
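A typical case is the multiplicative model, which becomes linear when logarithms are taken on both sides (as in Example 16.5, where both variables are log-transformed):

y = \beta_0 x_1^{\beta_1} \varepsilon \quad \Rightarrow \quad \log y = \log \beta_0 + \beta_1 \log x_1 + \log \varepsilon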


Example 16.5
The data related to the sales turnover and advertisement expenditure of a company for 15 randomly selected months are given in Table 16.10.

Taking sales as the dependent variable and advertisement expenditure as the independent variable, fit a regression line using a log transformation of the variables.

Using MS Excel, Minitab and SPSS for Logarithm Transformation


Ch 16 Solved Examples\Excel\Ex 16.5 log transformation.xls Ch 16 Solved Examples\Minitab\EX 16.5 LOG TRANSFORMATION.MPJ Ch 16 Solved Examples\SPSS\Ex 16.5.sav Ch 16 Solved Examples\SPSS\Output Ex 16.5.spv


Collinearity
In multiple regression analysis, when two independent variables are correlated, this is referred to as collinearity; when three or more variables are correlated, it is referred to as multicollinearity. Collinearity is measured by the variance inflationary factor (VIF) for each explanatory variable. If the explanatory variables are uncorrelated, the variance inflationary factor (VIF) will be equal to 1. A variance inflationary factor (VIF) greater than 10 indicates a serious multicollinearity problem. Collinearity is not simple to handle in multiple regression; one of the best solutions is to drop the collinear variables from the regression equation.
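For the j-th explanatory variable the variance inflationary factor is

VIF_j = \frac{1}{1 - R_j^2}

where R_j^2 is the coefficient of multiple determination obtained by regressing x_j on all the other explanatory variables.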


Example 16.6
Table 16.13 provides the modified data for the consumer electronics company discussed in Example 16.1; two new variables, the number of showrooms and showroom age, have been added. Fit an appropriate regression model.

Ch 16 Solved Examples\Minitab\EX 16.6 MODEL BUILDING.MPJ Ch 16 Solved Examples\SPSS\Ex 16.6 (17).sav Ch 16 Solved Examples\SPSS\Output Ex 16.6 (Model Building).spv

