
Multivariate Analysis & Use of Statistical Packages

Business Research Methods

March 21, 2014

Topic & Structure of the lesson



Topic Outline
- Introduction
- Nature & Techniques of Multivariate Analysis
- Analysis of Dependence
- Multiple Regression
- Regression Model
- Dummy Variable Treatments
- Discriminant Analysis
- Factor Analysis
- Cluster Analysis
- Applications of SPSS


Learning Outcomes

On completion of this chapter you should be able to understand:

- how to classify and select multivariate techniques
- multiple regression applications
- how multivariate analysis of variance assesses the relationship between two or more metric dependent variables and independent classificatory variables
- how principal components analysis extracts uncorrelated factors from an initial set of variables, and how exploratory factor analysis reduces the number of variables to discover the underlying constructs


Classifying Multivariate Techniques



Dependency

Interdependency

Multivariate Techniques

Multiple Regression Analysis

Have a clear notion of what you can and cannot do with regression analysis

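Path Diagram of a Linear Regression Analysis

Conceptualization

[Figure: a path model of a regression analysis. Arrows run from the predictors X1, X2 and X3, and from an error term, to the dependent variable Y.]

$$Y_i = k + b_1 x_1 + b_2 x_2 + b_3 x_3 + e_i$$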
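The constant k and the weights b1 to b3 in the path model are usually estimated by ordinary least squares. The sketch below is illustrative only: the data are made up, and the helper names `solve_linear_system` and `fit_ols` are hypothetical, not part of any package used in the lesson. It solves the normal equations directly so that no external library is needed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as a fresh copy.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(xs, ys):
    """Return [k, b1, ..., bp] minimising the sum of squared errors."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 gives the constant k
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations generated from Y = 2 + 1.5*x1 - 0.5*x2 + 3*x3
# with no error term, so the fit should recover these values.
xs = [(1, 2, 0), (2, 1, 1), (3, 5, 2), (4, 3, 1), (5, 8, 3), (6, 2, 5)]
ys = [2 + 1.5 * x1 - 0.5 * x2 + 3 * x3 for x1, x2, x3 in xs]
coefficients = fit_ols(xs, ys)
print([round(c, 3) for c in coefficients])
```

With real data the error term is not zero, so the estimates only approximate the underlying weights; a statistical package such as SPSS also reports standard errors and significance tests for each coefficient.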

Multiple Regression

Independent variables (Xs):
- Information accuracy
- Information relevancy
- Ease of use

Dependent variable (Y): user decision to use the information

- Multiple independent variables (IVs): X1, X2, X3, ..., Xk
- The independent variables can be correlated with one another and with the dependent variable (DV) to varying degrees
- Exploring the relationship between the IVs and the DV: Xs vs. Y; regressing Y on the Xs

Goals of Multiple Regression


- Investigation of many-to-one relationships where independent variables are imperfectly related to the dependent variable
- To ascertain the overall degree of relationship, treating the variables as a set (Multiple R)
- To determine the usefulness of particular IVs as predictors and to compare the usefulness of different IVs
- To determine if adding a particular IV to an existing set of IVs meaningfully improves prediction
- To assess the interaction of particular pairs of IVs (or, less often, higher-order interactions)
- To use a previously established set of regression coefficients to predict DV scores for members of a new sample

Uses of Multiple Regression



Develop self-weighting estimating equation to predict values for a DV

Test and explain causal theories

Generalized Regression Equation



Multiple Regression: Practical Consideration

1) Degree of relationship: How good is the regression equation? Example: Can one predict a user's decision to use information from the computer system's capabilities and the accessibility of the service?
2) Importance of IVs: Which IVs are more important in the regression equation, and which are not? Example: Is experience helpful in predicting the outcome of doctors' information-seeking activities, or do gender, age and hours worked also influence the outcome?

3) Adding IVs: Can we improve our prediction? If yes, which IV should be added? Example: Is the prediction of doctors' information-seeking behavior enhanced by adding a library usage variable to the four variables already in the model (experience, age, gender and hours worked)?

4) Changing IVs: Researchers may include nonlinear relationships in the analysis by redefining IVs (e.g. a curvilinear relationship).

5) Comparing sets of independent variables: Given two sets of predictors, which set predicts the DV better? Example: Is the use of information sources in a company better predicted by individual characteristics (age, gender, experience) or by organizational resources (staff, budget for attending annual conferences)?

6) Predicting DV scores for members of a new sample: Predict the DV from available IV data. The regression equation is developed from one portion of a sample and then applied to the other portion. If the solution generalizes, the regression equation also predicts DV scores better than chance for the new cases.

Test for Multiple Regression


H0: All regression coefficients are zero (B1 = B2 = ... = Bk = 0)
H1: Not all of the coefficients are zero

The F statistic tests the regression relation between the DV and the IVs:

$$F^* = \frac{R_s^2 / k}{(1 - R_s^2) / (N - k - 1)}$$

If F* > F(k, N-k-1), reject H0: there is a significant relationship between the DV and the IVs.
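As a worked illustration of the overall F test, the sketch below computes F* from R squared, the number of IVs k and the sample size N, using the standard form in which R squared is divided by k and (1 - R squared) by (N - k - 1). The input values are hypothetical and the function name `overall_f` is an assumption made for this example.

```python
def overall_f(r_squared, k, n):
    """F* = (R^2 / k) / ((1 - R^2) / (N - k - 1))."""
    if n - k - 1 <= 0:
        raise ValueError("need N > k + 1 to have error degrees of freedom")
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# Hypothetical case: R^2 = 0.50 with k = 2 IVs and N = 23 observations,
# so the test uses F(2, 20).
f_star = overall_f(0.50, 2, 23)
print(round(f_star, 2))
```

The computed value is then compared with the tabulated critical value of F(k, N-k-1) at the chosen significance level; for F(2, 20) at the 5% level that value is about 3.49.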

Multiple Regression Example

Ex. A student collected 25 data points to estimate a regression for the price of an automobile spare part based on several independent variables, designated X1, X2, X3, X4 and X5. He performed a stepwise regression. The results were as follows:

Model  R2   Adj R2  Variables  Unstandardized  Standardized  Standard
                    included   coefficients    coefficients  error
1      .45  .445    Constant   33.33
                    X3          1.67           0.85          0.33
2      .73  .722    Constant   23.76
                    X3          1.48           0.66          0.28
                    X5          2.75           0.85          0.62
3      .75  .732    Constant   22.84
                    X3          1.32           0.55          0.25
                    X5          2.40           0.95          0.58
                    X1          3.66           0.14          2.75
4      .77  .71     Constant   18.96
                    X3          1.15           0.33          0.28
                    X5          1.89           0.85          0.72
                    X1          4.22           0.11          3.15
                    X4          1.80           0.05          2.77

1. Interpret R and R2 for all the models.
2. Interpret R2 and adjusted R2 for Model 1.
3. Among all the models, which one would you finally adopt for determining the price of the automobile spare part? Give reason(s).
4. For the model you would choose, indicate the relative importance of the independent variables.

Ex. Consider the issue of gender discrimination in the salaries of women in the manufacturing industry. To examine this issue, a random sample of 15 workers is drawn from a pool of employed laborers, and each worker's monthly salary is recorded along with age and gender. Gender can only be male or female; let 1 denote male and 0 denote female. The following output was obtained:

Model Summary

Model  R     R Square  Adjusted R Square  Std. Error of the Estimate  Sig.
1      .943  .890      .872               96.79158                    .000

Coefficients

          Unstandardized Coefficients  Standardized Coefficients
Variable  B        Std. Error          Beta                       t      Sig.
Constant  732.061  235.584                                        3.107  .009
Age       11.122   7.208               .158                       1.543  .149
Gender    458.684  53.458              .877                       8.580  .000

Determine:
I) The two estimating equations, for males and for females.
II) Interpret the table.
III) By what amount does the equation suggest males are paid more than females?
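One way to see how the dummy variable works is to plug the reported coefficients into the fitted equation, salary = constant + b(age) x age + b(gender) x gender. The sketch below copies the unstandardized coefficients from the Coefficients output; the function name `predicted_salary` and the age of 30 are assumptions made for illustration.

```python
CONSTANT = 732.061   # constant from the Coefficients output
B_AGE = 11.122       # unstandardized coefficient for age
B_GENDER = 458.684   # unstandardized coefficient for the dummy (1 = male, 0 = female)


def predicted_salary(age, gender):
    """Fitted monthly salary for a worker of the given age and gender code."""
    return CONSTANT + B_AGE * age + B_GENDER * gender


# Setting the dummy to 1 or 0 shifts only the intercept; the age slope is shared.
male_intercept = predicted_salary(0, 1)
female_intercept = predicted_salary(0, 0)

# At any fixed age the male-female difference equals the dummy coefficient.
gap = predicted_salary(30, 1) - predicted_salary(30, 0)

print(f"male:   salary = {male_intercept:.3f} + {B_AGE} * age")
print(f"female: salary = {female_intercept:.3f} + {B_AGE} * age")
print(f"gap at age 30: {gap:.3f}")
```

Because the model contains no age-by-gender interaction term, the two lines are parallel, so the estimated gap is the same at every age.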

Selection Methods

Forward

Backward

Stepwise

Discriminant Analysis

Researchers often wish to classify people or objects into two or more groups. One might need to classify persons as either buyers or non-buyers, or as good or bad credit risks, or to classify products in some market as superior, average or poor.

The objective is to establish a procedure to find the predictors that best classify subjects.

Discriminant analysis is frequently used in market segmentation research.

Similarities and Differences between ANOVA, Regression, and Discriminant Analysis


                                     ANOVA        Regression  Discriminant analysis
Similarities
Number of dependent variables        One          One         One
Number of independent variables      Multiple     Multiple    Multiple

Differences
Nature of the dependent variables    Metric       Metric      Categorical
Nature of the independent variables  Categorical  Metric      Metric

Discriminant Analysis

A technique for analyzing data when the criterion or dependent variable is categorical and the predictor or independent variables are interval in nature. The objectives of discriminant analysis are as follows:
- Development of discriminant functions, or linear combinations of the predictor or independent variables, which will best discriminate between the categories of the criterion or dependent variable (groups).
- Examination of whether significant differences exist among the groups, in terms of the predictor variables.
- Determination of which predictor variables contribute to most of the intergroup differences.
- Classification of cases to one of the groups based on the values of the predictor variables.

Discriminant Analysis

When the criterion variable has two categories, the technique is known as two-group discriminant analysis.

When three or more categories are involved, the technique is referred to as multiple discriminant analysis.

The main distinction is that, in the two-group case, it is possible to derive only one discriminant function. In multiple discriminant analysis, more than one function may be computed. In general, with G groups and k predictors, it is possible to estimate up to the smaller of G - 1 or k discriminant functions.

Discriminant Analysis Model



The discriminant analysis model involves linear combinations of the following form:

$$D = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k$$

where
D = discriminant score
b's = discriminant coefficients or weights
X's = predictors or independent variables

The coefficients, or weights (b), are estimated so that the groups differ as much as possible on the values of the discriminant function. This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum.
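The criterion described here, the ratio of between-group to within-group sum of squares of the discriminant scores, can be computed directly for any candidate set of weights. The sketch below is a toy illustration under stated assumptions: the two groups and their (X1, X2) values are made up, and the helper names are hypothetical. It compares two candidate weight vectors rather than searching for the optimal one.

```python
def discriminant_score(b0, weights, x):
    """D = b0 + b1*X1 + ... + bk*Xk for one case."""
    return b0 + sum(b * xi for b, xi in zip(weights, x))


def between_within_ratio(scores_by_group):
    """Between-group SS divided by within-group SS of the scores."""
    all_scores = [s for group in scores_by_group for s in group]
    grand_mean = sum(all_scores) / len(all_scores)
    between = 0.0
    within = 0.0
    for group in scores_by_group:
        mean = sum(group) / len(group)
        between += len(group) * (mean - grand_mean) ** 2
        within += sum((s - mean) ** 2 for s in group)
    return between / within


# Hypothetical cases (X1, X2) for two groups; not data from the lesson.
group_a = [(1.0, 5.0), (2.0, 4.0), (3.0, 6.0)]
group_b = [(6.0, 5.0), (7.0, 4.0), (8.0, 6.0)]


def ratio_for(weights):
    """Score every case with the given weights and return the SS ratio."""
    scored = [[discriminant_score(0.0, weights, x) for x in g]
              for g in (group_a, group_b)]
    return between_within_ratio(scored)


ratio_x1 = ratio_for((1.0, 0.0))  # weight only X1, which separates the groups
ratio_x2 = ratio_for((0.0, 1.0))  # weight only X2, which does not
print(ratio_x1, ratio_x2)
```

In this toy data the groups differ on X1 but not on X2, so weights that emphasise X1 give the larger ratio. Estimation software searches over all weight vectors for the one that maximises this ratio.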

Discriminant Analysis
A.
                                    Predicted success
Actual group      Number of cases   0            1
Unsuccessful (0)  15                13 (86.70%)  2 (13.30%)
Successful (1)    15                3 (20.00%)   12 (80.00%)

Note: Percent of grouped cases correctly classified: 83.33%

B.
Variable  Unstandardized  Standardized
X1        .36084          .65927
X2        2.61192         .57958
X3        .53028          .97505
Constant  12.89685
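The 83.33% figure in the note to table A can be checked from the classification counts. A minimal sketch, with the counts copied from that table (the variable names are chosen for this example only):

```python
# Rows = actual group, columns = predicted group
# (0 = unsuccessful, 1 = successful); counts copied from table A.
confusion = [
    [13, 2],   # actual unsuccessful: 13 predicted 0, 2 predicted 1
    [3, 12],   # actual successful:   3 predicted 0, 12 predicted 1
]

correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)

# Overall hit rate: correctly classified cases over all cases.
hit_rate = correct / total

# Per-group rates: share of each actual group that was classified correctly.
row_rates = [confusion[i][i] / sum(confusion[i]) for i in range(len(confusion))]

print(f"overall: {hit_rate:.2%}")
print([f"{r:.2%}" for r in row_rates])
```

The diagonal of the table holds the correctly classified cases, so the overall rate is (13 + 12) / 30.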

Interdependency Techniques

Factor Analysis

Cluster Analysis

Factor Analysis

Factor Matrices

                     A. Unrotated factors    B. Rotated factors
Variable             I      II     h2        I      II
A                    0.70   -.40   0.65      0.79   0.15
B                    0.60   -.50   0.61      0.75   0.03
C                    0.60   -.35   0.48      0.68   0.10
D                    0.50   0.50   0.50      0.06   0.70
E                    0.60   0.50   0.61      0.13   0.77
F                    0.60   0.60   0.72      0.07   0.85
Eigenvalue           2.18   1.39
Percent of variance  36.3   23.2
Cumulative percent   36.3   59.5
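The h2 (communality), eigenvalue and percent-of-variance entries follow from the unrotated loadings: a communality is the sum of squared loadings across a variable's row, and an eigenvalue is the sum of squared loadings down a factor's column. A small sketch, with the loadings copied from panel A (variable names in the code are chosen for this example):

```python
# Unrotated loadings on factors I and II, copied from panel A.
unrotated = {
    "A": (0.70, -0.40),
    "B": (0.60, -0.50),
    "C": (0.60, -0.35),
    "D": (0.50, 0.50),
    "E": (0.60, 0.50),
    "F": (0.60, 0.60),
}

# Communality: share of a variable's variance explained by the two factors.
communality = {v: l1 ** 2 + l2 ** 2 for v, (l1, l2) in unrotated.items()}

# Eigenvalue: sum of squared loadings down one factor's column.
eigenvalues = [sum(l[j] ** 2 for l in unrotated.values()) for j in range(2)]

# Each standardized variable contributes one unit of variance, so the
# total variance equals the number of variables.
percent_of_variance = [100 * e / len(unrotated) for e in eigenvalues]

for v, h2 in communality.items():
    print(v, round(h2, 2))
print([round(e, 2) for e in eigenvalues])
print([round(p, 1) for p in percent_of_variance])
```

The results agree with the table to the precision shown there, which is a quick way to check a reported factor matrix for transcription errors.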

Orthogonal Factor Rotations

Correlation Coefficients, Metro U MBA Study

Variable  Course                 V1     V2     V3     V10
V1        Financial Accounting   1.00   0.56   0.17   -.01
V2        Managerial Accounting  0.56   1.00   -.22   0.06
V3        Finance                0.17   -.22   1.00   0.42
V4        Marketing              -.14   0.05   -.48   -.10
V5        Human Behavior         -.19   -.26   -.05   -.23
V6        Organization Design    -.21   -.00   -.56   -.05
V7        Production             -.44   -.11   -.04   -.08
V8        Probability            0.30   0.06   0.07   -.10
V9        Statistical Inference  -.05   0.06   -.32   0.06
V10       Quantitative Analysis  -.01   0.06   0.42   1.00

Factor Matrix, Metro U MBA Study

Variable  Course                 Factor 1  Factor 2  Factor 3  Communality
V1        Financial Accounting   0.41      0.71      0.23      0.73
V2        Managerial Accounting  0.01      0.53      -.16      0.31
V3        Finance                0.89      -.17      0.37      0.95
V4        Marketing              -.60      0.21      0.30      0.49
V5        Human Behavior         0.02      -.24      -.22      0.11
V6        Organization Design    -.43      -.09      -.36      0.32
V7        Production             -.11      -.58      -.03      0.35
V8        Probability            0.25      0.25      -.31      0.22
V9        Statistical Inference  -.43      0.43      0.50      0.62
V10       Quantitative Analysis  0.25      0.04      0.35      0.19
Eigenvalue                       1.83      1.52      0.95
Percent of variance              18.30     15.20     9.50
Cumulative percent               18.30     33.50     43.00

Varimax Rotated Factor Matrix

Variable  Course                 Factor 1  Factor 2  Factor 3
V1        Financial Accounting   0.84      0.16      -.06
V2        Managerial Accounting  0.53      -.10      0.14
V3        Finance                -.01      0.90      -.37
V4        Marketing              -.11      -.24      0.65
V5        Human Behavior         -.13      -.14      -.27
V6        Organization Design    -.08      -.56      -.02
V7        Production             -.54      -.11      -.22
V8        Probability            0.41      -.02      -.24
V9        Statistical Inference  0.07      0.02      0.79
V10       Quantitative Analysis  -.02      0.42      0.09

Cluster Analysis
1. Select sample to cluster
2. Define variables
3. Compute similarities
4. Select mutually exclusive clusters
5. Compare and validate clusters
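The "compute similarities" and "select mutually exclusive clusters" steps can be sketched with a small agglomerative procedure. The example below is an assumption-laden illustration, not the method used in the lesson: it uses Euclidean distance as the (dis)similarity measure, single linkage as the merging rule, and made-up two-variable cases. The function name `single_linkage` is hypothetical.

```python
import math


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per case

    def gap(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [(gap(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)          # closest pair of clusters
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]               # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical cases measured on two variables; not data from the lesson.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (9.0, 1.0)]
clusters = single_linkage(points, 3)
print(clusters)
```

Every case ends up in exactly one cluster, which is what "mutually exclusive" requires. Stopping the merging at different cluster counts yields a sequence of nested solutions, and validation then compares those solutions against variables that were not used for clustering.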

Cluster Analysis

Cluster Membership
                                                        Number of clusters
Film                      Country  Genre     Case  5  4  3  2
Cyrano de Bergerac        France   DramaCom  1     1  1  1  1
Il y a des Jours          France   DramaCom  4     1  1  1  1
Nikita                    France   DramaCom  5     1  1  1  1
Les Noces de Papier       Canada   DramaCom  6     1  1  1  1
Leningrad Cowboys . . .   Finland  Comedy    19    2  2  2  2
Storia de Ragazzi . . .   Italy    Comedy    13    2  2  2  2
Conte de Printemps        France   Comedy    2     2  2  2  2
Tatie Danielle            France   Comedy    3     2  2  2  2
Crimes and Misdem . . .   USA      DramaCom  7     3  3  3  2
Driving Miss Daisy        USA      DramaCom  9     3  3  3  2
La Voce della Luna        Italy    DramaCom  12    3  3  3  2
Che Hora E                Italy    DramaCom  14    3  3  3  2
Attache-Moi               Spain    DramaCom  15    3  3  3  2
White Hunter Black . . .  USA      PsyDrama  10    4  4  3  2
Music Box                 USA      PsyDrama  8     4  4  3  2
Dead Poets Society        USA      PsyDrama  11    4  4  3  2
La Fille aux All . . .    Finland  PsyDrama  18    4  4  3  2
Alexandrie, Encore . . .  Egypt    DramaCom  16    5  3  3  2
Dreams                    Japan    DramaCom  17    5  3  3  2

Dendrogram
