
Business Research Methods

Assignment On Multivariate Analysis

Submitted By: Tariq Mahmood Asghar, Roll No. 77, MBA (A, B1)

DATA ANALYSIS
The terms "statistics" and "data analysis" refer to essentially the same thing: the study of how we describe, combine, and make inferences from numbers. Many people are afraid of numbers (quantiphobia), but statistics has less to do with numbers themselves and more to do with rules for arranging and interpreting them. It even lets you create some of those rules yourself, so rather than treating the subject as a body of material to memorize, it is better to see it as an extension of the research mentality, something researchers do to gain command over their numbers. After a while, the principles behind the computations become clear, and there is no better way to reach that point than by understanding the research purpose of statistics.

MULTIVARIATE DATA ANALYSIS


As the name indicates, multivariate analysis comprises a set of techniques for the simultaneous analysis of three or more variables. It is frequently done to refine a bivariate analysis, taking into account the possible influence of a third variable on the original bivariate relationship. Multivariate analysis is also used to test the joint effects of two or more variables upon a dependent variable. In some instances, the association between two variables is assessed with a multivariate rather than a bivariate statistical technique; this situation arises when two or more variables are needed to express the functional form of the association. Multivariate data analysis thus refers to any statistical technique used to analyze data arising from more than one variable, which models reality, where each situation, product, or decision involves more than a single variable. The information age has produced masses of data in every field, yet despite the quantity of data available, obtaining a clear picture of what is going on and making intelligent decisions remains a challenge. When the available information is stored in database tables containing rows and columns, multivariate analysis can be used to process it in a meaningful fashion. Multivariate analysis methods are typically used for:

- Consumer and market research
- Quality control and quality assurance across a range of industries such as food and beverage, pharmaceuticals, chemicals, energy, and telecommunications
- Process optimization and process control
- Research and development

With Multivariate Analysis you can:

- Obtain a summary or an overview of a table. This analysis is often called Principal Components Analysis or Factor Analysis. In the overview, it is possible to identify the dominant patterns in the data, such as groups, outliers, and trends. The patterns are displayed as two plots.
- Analyze groups in the table: how these groups differ, and to which group individual table rows belong. This type of analysis is called Classification and Discriminant Analysis.
- Find relationships between columns in data tables, for instance relationships between process operation conditions and product quality. The objective is to use one set of variables (columns) to predict another for the purpose of optimization, and to find out which columns are important in the relationship.
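
As a minimal illustration of the first use (the table overview), the following Python sketch runs a Principal Components Analysis on a small data table. The choice of scikit-learn and the synthetic data are assumptions made for demonstration; the original text names no software.

```python
# Minimal PCA overview of a data table (assumes scikit-learn is installed;
# the data here are synthetic, purely for illustration).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
table = rng.normal(size=(50, 6))           # 50 rows (observations), 6 columns (variables)
table[:, 1] = table[:, 0] + 0.1 * rng.normal(size=50)   # build in a dominant pattern

X = StandardScaler().fit_transform(table)  # standardize columns before PCA
pca = PCA(n_components=2)
scores = pca.fit_transform(X)              # row coordinates on the first two components

# The two plots mentioned above would typically be the score plot (rows)
# and the loading plot (columns); here we just print the summaries.
print("explained variance ratio:", pca.explained_variance_ratio_)
print("loadings (columns x components):\n", pca.components_.T)
```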

Tools for Multivariate Analysis

The Multiple Correlation Coefficient:


A correlation measures the linear association between two variables. But what is the correlation if Y, the dependent variable, is being predicted from more than one independent variable? The answer is that the multiple correlation is the correlation between Y and the best linear combination of the independent variables, where "best" means the combination formed using the multiple regression slopes. In bivariate correlation there is a single predictor, X, and a single predicted variable, Y. In multiple correlation, the predictors, X1 and X2, are combined into the composite that correlates maximally with the predicted variable, Y. You already know how to establish that best possible predictor using the multiple regression described above, and you already know how to solve for a correlation coefficient; this section shows how these principles combine to yield the multiple correlation coefficient. The multiple correlation coefficient varies in strength between 0.00 and 1.00; it does not specify direction. If the value of the dependent variable does not change in any systematic way as any of the independent variables varies in either direction, then there is no relationship and the coefficient is 0.00. Thus, if the plane describing the points is flat, indicating no systematic change in the dependent variable as we increase the values of the independent variables, the coefficient is 0.

In formula form, the multiple correlation between the X variables and Y equals the covariance of Y and the Y predicted from the best linear combination of the X variables, divided by the product of their respective standard deviations.
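
Written out, the verbal definition above corresponds to the following, where the hat notation for the predicted scores is an assumption on my part (the original gives the formula only in words):

```latex
R_{Y \cdot X_1 X_2 \ldots X_k} \;=\; \frac{\operatorname{cov}(Y, \hat{Y})}{s_Y \, s_{\hat{Y}}}
```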

Multiple Regression (MR):


Multiple Regression is a "general linear model" with a wide range of applications. It is an extension of bivariate correlation and simple regression analysis. The primary uses of MR are as follows:

- Prediction of a continuous Y with several continuous X variables: Unlike ordinary bivariate regression, MR allows the use of an entire set of variables to predict another.
- Use of categorical variables in prediction: Through the technique of dummy coding, categorical variables (such as marital status or treatment group) can be used in addition to continuous variables (see the sketch after this list).
- Calculation of the unequal-n ANOVA problem: Disproportional cell sizes in any factorial ANOVA design produce a correlation among the independent variables. MR estimates effects and interactions in this situation by the use of dummy codes.
- Modeling nonlinear relationships between Y and a set of X variables: By the addition of "polynomial" terms (e.g. quadratic or cubic trends) to the equation, relationships that do not meet the linearity assumption can be analyzed.
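
As a minimal sketch of the first two uses, the following Python example fits an MR model with one continuous predictor and one dummy-coded categorical predictor. The use of numpy's least-squares routine and the synthetic data are assumptions made purely for illustration.

```python
# Multiple regression with a continuous predictor and a dummy-coded
# categorical predictor (synthetic data, for illustration only).
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)                    # continuous predictor
group = rng.integers(0, 2, size=n)         # categorical predictor: 0 = control, 1 = treatment
y = 2.0 + 1.5 * x1 + 3.0 * group + rng.normal(size=n)

# Design matrix: intercept column, continuous X1, dummy code for group.
X = np.column_stack([np.ones(n), x1, group])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares slopes (the "best linear combination")

y_hat = X @ b
r_multiple = np.corrcoef(y, y_hat)[0, 1]   # multiple correlation = corr(Y, Y-hat)
print("intercept and slopes:", b)
print("multiple correlation R:", r_multiple, " R squared:", r_multiple**2)
```

Note that the last two lines also demonstrate the earlier definition: the multiple correlation is simply the bivariate correlation between Y and the predicted scores.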

Multiple linear regression analysis (MLR):


In MLR, several IVs (which are assumed to be fixed, or equivalently measured without error) are used to predict one DV with a least-squares approach. If the IVs are orthogonal, the problem reduces to a set of univariate regressions. When the IVs are correlated, their importance is estimated from the partial correlation coefficients. An important problem arises when one of the IVs can be predicted from the other variables, because the computations required by MLR can then no longer be performed: this is called multicollinearity.
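
A small sketch of the failure mode just described: when one IV is an exact linear combination of the others, the design matrix loses rank and the normal-equations solution is no longer unique. The numpy calls and toy data are assumptions for illustration.

```python
# Demonstrating multicollinearity: x3 is an exact linear combination of
# x1 and x2, so the design matrix is rank-deficient (toy data).
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=30)
x2 = rng.normal(size=30)
x3 = 2.0 * x1 - x2                         # perfectly predictable from x1 and x2
X = np.column_stack([np.ones(30), x1, x2, x3])

# Rank is lower than the number of columns, so the usual least-squares
# solution is not unique and the normal equations cannot be inverted.
print("columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
print("condition number:", np.linalg.cond(X))   # effectively infinite
```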

Non-Linear Regression:
Ordinary multiple regression assumes that each bivariate relationship between X and Y is linear, and that the relationship between Y and Y' is also linear. How can we examine how closely we have met this linearity assumption? We could, of course, always inspect the separate scatter plots of Y against each X. But this does not address the second part of the linearity assumption.

We can determine the conformity of our data to the MR linearity assumption via direct examination of residual scores. Recall that a residual or "error" score is equal to (Y - Y') and that we would expect it to be approximately zero once the linear regression is determined. Thus, a plot of residual scores against Y' that shows a peculiar pattern would suggest a violation of linearity. Two of the most common types of nonlinear relationship, the quadratic and cubic trends, appear as distinctive curved patterns in such a plot.
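
A minimal residual check in Python, where matplotlib and the simulated curvature are assumptions for illustration: fit a straight line to data with a quadratic trend and plot the residuals against the predicted scores; the violation shows up as a systematic bend.

```python
# Residual check for linearity: fit a line to data with a quadratic trend
# and plot the residuals against the predicted scores (synthetic data).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 80)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

slope, intercept = np.polyfit(x, y, 1)     # straight-line fit
y_hat = intercept + slope * x
residuals = y - y_hat                      # (Y - Y'), should hover around zero

plt.scatter(y_hat, residuals)
plt.axhline(0.0, linestyle="--")
plt.xlabel("predicted Y'")
plt.ylabel("residual (Y - Y')")
plt.title("Curved residual pattern suggests a violation of linearity")
plt.show()
```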

These cases can be modeled through polynomial regression. Polynomial regression simply adds terms to the original equation to account for nonlinear relationships: squaring the original variable accounts for a quadratic trend, a cubed term accounts for a cubic trend, and so on. After the linear component, each term adds another "bend" to the prediction line. As with traditional MR, polynomial regression can be interpreted by looking at R squared and changes in R squared. A few notes of caution: in order to test any higher-order effect (such as the cubic), all lower-order effects (the quadratic and the linear) must be included in the equation; a degree of freedom is lost for each additional term added to the equation; and the N:k ratio changes for each added term. Recall that trend tests in ANOVA were accomplished with subjects grouped into discrete categories. With polynomial regression, the subjects are not grouped; rather, each individual has a unique score on a continuous variable. Therefore, MR trend information is much more complete than the information available with a typical ANOVA.
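
A sketch of the quadratic case in Python, continuing the simulated data from the residual example above (numpy is again an assumption): the squared term is added alongside the retained linear term, and the change in R squared measures the improvement.

```python
# Polynomial regression: add a squared term to capture the quadratic trend,
# keeping the lower-order linear term in the equation (synthetic data).
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 80)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

def r_squared(design, y):
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ b
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

linear = np.column_stack([np.ones_like(x), x])
quadratic = np.column_stack([np.ones_like(x), x, x**2])   # linear term retained

r2_lin = r_squared(linear, y)
r2_quad = r_squared(quadratic, y)
print("linear R^2:", r2_lin)
print("quadratic R^2:", r2_quad, " change:", r2_quad - r2_lin)
```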

Multivariate analysis of variance (MANOVA):


Multivariate analysis of variance (MANOVA) is a generalized form of analysis of variance (ANOVA) that covers cases where there is more than one (correlated) dependent variable and where the dependent variables cannot simply be combined. As well as identifying whether changes in the independent variables have a significant effect on the dependent variables, the technique also seeks to identify the interactions among the independent variables and the associations among the dependent variables, if any.

MANOVA Procedure

MANOVA procedures are multivariate significance-test analogues of various univariate ANOVA experimental designs. MANOVA, as with its univariate counterparts, typically involves random assignment of participants to the levels of one or more nominal independent variables; however, all participants are measured on several continuous dependent variables. There are three basic variations of MANOVA:

- Hotelling's T²: This is the MANOVA analogue of the two-group t-test situation; in other words, one dichotomous independent variable and multiple dependent variables.
- One-Way MANOVA: This is the MANOVA analogue of the one-way F situation; in other words, one multi-level nominal independent variable and multiple dependent variables (see the sketch after this list).
- Factorial MANOVA: This is the MANOVA analogue of the factorial ANOVA design; in other words, multiple nominal independent variables and multiple dependent variables.
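
As a minimal sketch of the one-way case, the following uses the MANOVA class from statsmodels. The library choice, the group labels, and the synthetic data are assumptions for illustration; the original text names no software.

```python
# One-way MANOVA: one three-level nominal IV, two continuous DVs
# (synthetic data; statsmodels supplies the multivariate tests).
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(4)
groups = np.repeat(["a", "b", "c"], 20)    # one multi-level nominal IV
shift = np.repeat([0.0, 1.0, 2.0], 20)     # group differences on both DVs
df = pd.DataFrame({
    "group": groups,
    "dv1": shift + rng.normal(size=60),
    "dv2": 0.5 * shift + rng.normal(size=60),
})

fit = MANOVA.from_formula("dv1 + dv2 ~ group", data=df)
print(fit.mv_test())                       # Wilks' lambda, Pillai's trace, etc.
```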

While all of the above MANOVA variations are used in somewhat different applications, they all have one feature in common: they form linear combinations of the dependent variables that best discriminate among the groups in the particular experimental design. In other words, MANOVA is a test of the significance of group differences in some m-dimensional space, where each dimension is defined by a linear combination of the original set of dependent variables. This relationship is represented for each design in the following sections.
