Submitted By: Tariq Mahmood Asghar Roll No. # 77 MBA (A, B1)
DATA ANALYSIS
The terms "statistics" and "data analysis" mean much the same thing: the study of how we describe, combine, and make inferences from numbers. Many people are afraid of numbers (quantiphobia), but statistics has less to do with the numbers themselves and more to do with rules for arranging them. It even lets you create some of those rules yourself, so instead of treating it as a lot of memorization, it is best seen as an extension of the research mentality, something researchers do (crunch numbers) to gain command over their data. After a while, the principles behind the computations become clear, and there is no better way to reach that point than by understanding the research purpose of statistics.
Typical application areas include:
Consumer and market research
Quality control and quality assurance across a range of industries, such as food and beverage, pharmaceuticals, chemicals, energy, and telecommunications
Process optimization and process control
Research and development
Obtain a summary or an overview of a table. This analysis is often called Principal Components Analysis or Factor Analysis. In the overview, it is possible to identify the dominant patterns in the data, such as groups, outliers, and trends. The patterns are displayed as two plots.
Analyze groups in the table: how these groups differ, and to which group individual table rows belong. This type of analysis is called Classification and Discriminant Analysis.
Find relationships between columns in data tables, for instance relationships between process operating conditions and product quality. The objective is to use one set of variables (columns) to predict another, for the purpose of optimization, and to find out which columns are important in the relationship.
In formula form, the multiple correlation between a set of X variables and Y is equal to the covariance of Y and Y' (Y as predicted from the best linear combination of the X variables) divided by the product of their respective standard deviations. In other words, the multiple correlation R is simply the ordinary Pearson correlation between Y and Y'.
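The definition above can be sketched directly in code: fit the best linear combination of the X variables by least squares, then correlate Y with the resulting Y'. This is a minimal pure-Python illustration with hypothetical data, not a substitute for a statistics library.

```python
# Sketch of the multiple correlation R = corr(Y, Y'), where Y' is the
# least-squares linear combination of the X predictors. Data are hypothetical.

def solve(A, b):
    """Gaussian elimination for a small square system Ax = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def multiple_R(Y, Xs):
    """Fit Y' = b0 + b1*X1 + ... by least squares; return corr(Y, Y')."""
    rows = [[1.0] + list(x) for x in zip(*Xs)]   # design matrix with intercept
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    XtY = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
    beta = solve(XtX, XtY)
    Yhat = [sum(b * ri for b, ri in zip(beta, r)) for r in rows]
    return corr(Y, Yhat)

# Hypothetical data: Y partly predictable from X1 and X2
X1 = [1, 2, 3, 4, 5, 6]
X2 = [2, 1, 4, 3, 6, 5]
Y = [2.1, 2.9, 5.2, 5.8, 8.1, 8.9]
R = multiple_R(Y, [X1, X2])
# R is at least as large as the absolute simple correlation with either X alone.
```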
Prediction of a continuous Y with several continuous X variables: Unlike ordinary bivariate regression, MR allows an entire set of variables to be used to predict another.
Use of categorical variables in prediction: Through the technique of dummy coding, categorical variables (such as marital status or treatment group) can be used alongside continuous variables.
Handling of the unequal-n ANOVA problem: Disproportional cell sizes in any factorial ANOVA design produce a correlation among the independent variables. MR estimates effects and interactions in this situation through the use of dummy codes.
Modeling nonlinear relationships between Y and a set of X variables: By adding "polynomial" terms (e.g. quadratic or cubic trends) to the equation, relationships that do not meet the linear assumptions can be analyzed.
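Dummy coding, mentioned twice in the list above, is mechanically simple: a categorical variable with k levels becomes k-1 indicator (0/1) columns, which can then enter the regression alongside continuous predictors. A minimal sketch, with hypothetical category labels:

```python
# Dummy coding sketch: one 0/1 column per non-reference level of a
# categorical variable. Labels and the reference choice are hypothetical.

def dummy_code(values, reference):
    """Return {level: 0/1 column}, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}

status = ["single", "married", "divorced", "married", "single"]
codes = dummy_code(status, reference="single")
# "single" observations get 0 in every column; the other levels each
# get their own indicator column, usable as X variables in MR.
```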
Non-Linear Regression:
Ordinary multiple regression assumes that each bivariate relationship between an X and Y is linear, and that the relationship between Y and Y' is also linear. How can we examine how closely we have met this linearity assumption? We could, of course, inspect the separate scatter plots of Y against each X. But this does not address the second part of the linearity assumption.
We can determine how well our data conform to the MR linearity assumption by direct examination of the residual scores. Recall that a residual or "error" score is equal to (Y - Y'), and that we would expect it to be approximately zero once the linear regression has been determined. Thus, a plot of residual scores against Y' that shows a peculiar pattern suggests a violation of linearity. Two of the most common types of nonlinear relationship can be detected in this way.
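The residual check can be demonstrated numerically: fit a straight line to data that are actually quadratic, then look at the signs of the residuals (Y - Y') in order of Y'. A rough sketch with hypothetical data:

```python
# Residual-pattern sketch: a linear fit to quadratic data leaves a
# systematic pattern in the residuals (positive at both ends, negative
# or zero in the middle) -- the "peculiar pattern" described above.

def linear_fit(X, Y):
    """Ordinary bivariate least squares; returns (intercept, slope)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / \
         sum((x - mx) ** 2 for x in X)
    return my - b1 * mx, b1

X = [1, 2, 3, 4, 5, 6, 7]
Y = [x ** 2 for x in X]              # a purely quadratic relationship
b0, b1 = linear_fit(X, Y)
pred = [b0 + b1 * x for x in X]      # Y' from the straight-line fit
resid = [y - p for y, p in zip(Y, pred)]
# resid sums to (approximately) zero, as least squares guarantees, yet
# its ends are positive and its middle negative: a linearity violation.
```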
These cases can be modeled through polynomial regression. Polynomial regression simply adds terms to the original equation to account for nonlinear relationships. Squaring the original variable accounts for a quadratic trend, a cubed term accounts for a cubic trend, and so on. After the linear component, each term adds another "bend" to the prediction line. As with traditional MR, polynomial regression can be interpreted by looking at R-squared and changes in R-squared. A few notes of caution: in order to test any higher-order effect (such as the cubic), all lower-order effects (the quadratic and the linear) must be placed into the equation; a degree of freedom is lost for each additional term added to the equation; and the N:k ratio changes with each added term. Recall that trend tests in ANOVA were accomplished with subjects grouped into discrete categories. With polynomial regression, the subjects are not grouped; rather, each individual has a unique score on a continuous variable. Therefore, MR trend information is much more complete than the information available from a typical ANOVA.
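The "interpret via changes in R-squared" idea can be sketched end to end: fit Y on X alone, then on X plus a squared term, and compare the two R-squared values. This is a self-contained illustration with hypothetical, roughly quadratic data.

```python
# Polynomial regression sketch: adding a squared term and comparing
# R-squared before and after. Data are hypothetical.

def fit_r2(Y, cols):
    """Least squares of Y on an intercept plus predictor columns; returns R^2."""
    rows = [[1.0] + list(r) for r in zip(*cols)]
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
    M = [A[i] + [b[i]] for i in range(k)]        # Gaussian elimination
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (M[r][k] - sum(M[r][j] * beta[j]
                                 for j in range(r + 1, k))) / M[r][r]
    pred = [sum(bi * ri for bi, ri in zip(beta, r)) for r in rows]
    my = sum(Y) / len(Y)
    ss_res = sum((y - p) ** 2 for y, p in zip(Y, pred))
    ss_tot = sum((y - my) ** 2 for y in Y)
    return 1 - ss_res / ss_tot

X = [1, 2, 3, 4, 5, 6, 7]
Y = [2.2, 5.1, 9.8, 17.2, 26.1, 37.0, 50.3]      # roughly quadratic in X
r2_linear = fit_r2(Y, [X])
r2_quad = fit_r2(Y, [X, [x ** 2 for x in X]])    # add the "polynomial" term
# The jump from r2_linear to r2_quad measures the quadratic trend; note
# that the quadratic model also spends one more degree of freedom.
```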
MANOVA can also identify the interactions among the independent variables and the associations among the dependent variables, if any.
MANOVA Procedure
MANOVA procedures are multivariate significance-test analogues of various univariate ANOVA experimental designs. MANOVA, like its univariate counterparts, typically involves random assignment of participants to levels of one or more nominal independent variables; however, all participants are measured on several continuous dependent variables. There are three basic variations of MANOVA:
Hotelling's T²: This is the MANOVA analogue of the two-group t-test situation; in other words, one dichotomous independent variable and multiple dependent variables.
One-Way MANOVA: This is the MANOVA analogue of the one-way F situation; in other words, one multi-level nominal independent variable and multiple dependent variables.
Factorial MANOVA: This is the MANOVA analogue of the factorial ANOVA design; in other words, multiple nominal independent variables and multiple dependent variables.
While the MANOVA variations above are used in somewhat different applications, they all have one feature in common: they form linear combinations of the dependent variables that best discriminate among the groups in the particular experimental design. In other words, MANOVA is a test of the significance of group differences in some m-dimensional space, where each dimension is defined by a linear combination of the original set of dependent variables. This relationship will be presented for each design in the following sections.
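The simplest case above, the two-group Hotelling's T², can be computed by hand: pool the two groups' covariance matrices, then take a quadratic form in the mean-difference vector. A hand-rolled sketch with hypothetical data and two dependent variables:

```python
# Hotelling's T-squared sketch for two groups and two continuous DVs:
# T^2 = (n1*n2/(n1+n2)) * d' S_pooled^{-1} d, where d is the vector of
# mean differences. Group data below are hypothetical.

def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def cov_matrix(rows):
    n, p = len(rows), len(rows[0])
    m = mean_vec(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def hotelling_T2(g1, g2):
    n1, n2 = len(g1), len(g2)
    d = [a - b for a, b in zip(mean_vec(g1), mean_vec(g2))]
    S1, S2 = cov_matrix(g1), cov_matrix(g2)
    # pooled covariance matrix
    S = [[((n1 - 1) * S1[i][j] + (n2 - 1) * S2[i][j]) / (n1 + n2 - 2)
          for j in range(2)] for i in range(2)]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    Sinv = [[S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det, S[0][0] / det]]       # 2x2 inverse
    quad = sum(d[i] * Sinv[i][j] * d[j] for i in range(2) for j in range(2))
    return (n1 * n2 / (n1 + n2)) * quad

# Two hypothetical treatment groups, each measured on two DVs
group1 = [(5.1, 3.2), (4.8, 3.0), (5.5, 3.6), (5.0, 3.1)]
group2 = [(6.2, 4.1), (6.0, 3.9), (6.5, 4.4), (6.1, 4.0)]
T2 = hotelling_T2(group1, group2)
# Identical groups would give T^2 = 0; larger separation in the
# two-dimensional DV space gives a larger T^2.
```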