http://www.facebook.com/mr.fortyseven
Topics to be covered
Data Analysis: Editing, Coding, Classification, Tabulation, Analysis and Interpretation
Statistical Analysis of Business Research:
- Bivariate Analysis (Chi-square)
- Multivariate Analysis: Factor Analysis, Discriminant Analysis, Cluster Analysis, Conjoint Analysis
- ANOVA: One-way and Two-way classification
Tabulation
Data Cleaning
Statistically adjusting the data
Selecting a Data Analysis Strategy
Questionnaire checking
The initial step in questionnaire checking involves a check of all questionnaires for completeness and interviewing quality. A questionnaire returned from the field may be unacceptable for several reasons:
1. Part of the questionnaire may be incomplete.
2. The pattern of responses may indicate that the respondent did not understand or follow the instructions.
3. The responses show little variance.
4. The questionnaire is answered by someone who does not qualify for participation.
5. The returned questionnaire is physically incomplete: one or more pages are missing.
Editing
Editing is the review of the questionnaires with the objective of increasing accuracy and precision. It consists of screening questionnaires to identify illegible, incomplete, inconsistent or ambiguous responses. This can be done in two stages:
a) Field Editing
The objective of field editing is to make sure that the proper procedure is followed in selecting the respondents, interviewing them and recording their responses. The main problems faced in field editing are:
1. Inappropriate respondents: for example, a tenant is interviewed instead of the house owner.
2. Incomplete interviews.
3. Improper understanding.
4. Lack of consistency.
5. Legibility.
6. Fictitious interviews: questionnaires are filled in by the interviewer himself without conducting the interview.
b) Office Editing
Office editing is more thorough than field editing. Problems of consistency and of rapport with respondents are among the issues that get highlighted during office editing.
Example of inconsistency: A respondent indicated that he doesn't drink coffee, but when questioned about his favorite brand, he replied BRU.
Treatment of Unsatisfactory Responses
1. Returning to the field: Questionnaires with unsatisfactory responses may be returned to the field, where the interviewers recontact the respondents.
2. Assigning missing values: The editor may assign missing values to unsatisfactory responses. This approach may be desirable if 1) the number of respondents with unsatisfactory responses is small, 2) the proportion of unsatisfactory responses for each of these respondents is small, or 3) the variables with unsatisfactory responses are not the key variables.
3. Discarding unsatisfactory respondents: This is possible only when the proportion of unsatisfactory respondents is small or the sample size is large.
Coding
Coding refers to those activities which help in transforming edited questionnaires into a form that is ready for analysis. Coding speeds up the tabulation while editing eliminates errors. Coding involves assigning numbers or other symbols to answers so that the responses can be grouped into a limited number of classes or categories. The code includes an indication of the column and data record it will occupy. For example, the sex of respondents may be coded as 1 for males and 2 for females.

Questions                      Answers     Codes
1. Do you own a vehicle?       Yes         1
                               No          2
2. What is your occupation?    Salaried    S
                               Business    B
                               Retired     R
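The coding step can be sketched in a few lines of Python. This is only an illustration: the codebook mirrors the two questions above, while the field names (`owns_vehicle`, `occupation`) and the sample responses are made up.

```python
# Hypothetical codebook built from the two questions above.
CODEBOOK = {
    "owns_vehicle": {"Yes": 1, "No": 2},
    "occupation": {"Salaried": "S", "Business": "B", "Retired": "R"},
}

def code_response(question, answer):
    """Return the code for an answer, or None if the answer is not in the codebook."""
    return CODEBOOK[question].get(answer)

# Made-up edited questionnaires, one dict per respondent.
responses = [
    {"owns_vehicle": "Yes", "occupation": "Salaried"},
    {"owns_vehicle": "No", "occupation": "Retired"},
]

coded = [
    {question: code_response(question, answer) for question, answer in r.items()}
    for r in responses
]
print(coded)
```

An answer that is missing from the codebook comes back as `None`, which makes uncoded responses easy to spot before tabulation.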
Tabulation
Tabulation refers to counting the number of cases that fall into various categories. The results are summarized in the form of statistical tables. The raw data is divided into groups and sub-groups, and each item is counted and placed in its group and sub-group.
Tabulation involves:
1. Sorting and counting.
2. Summarising of data.
Tabulation may be of two types:
1. Simple tabulation: a single variable is counted.
2. Cross tabulation: two or more variables are treated simultaneously.
Tabulation can be done entirely by hand, or by machine, or by both hand and machine.
Kinds of Tabulation
1. Simple or one-way tabulation
Multiple-choice questions which allow only one answer may use one-way (univariate) tabulation. The questions are predetermined, and tabulation consists of counting the number of responses falling into each category and calculating the percentage.
Example
Table 14.1: Study of number of children in a family

No. of children    Families    Percentage
0                  10          5
1                  30          15
2                  70          35
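A minimal sketch of one-way tabulation, counting the responses in each category and converting the counts to percentages. The list of answers is made up and is not the data behind Table 14.1.

```python
from collections import Counter

# Made-up answers to a single-answer question (number of children per family).
answers = [0, 1, 1, 2, 2, 2, 1, 2, 0, 2]

counts = Counter(answers)
total = len(answers)

for category in sorted(counts):
    share = 100 * counts[category] / total
    print(f"{category} children: {counts[category]} families ({share:.0f}%)")
```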
2. Cross Tabulation or Two-way Tabulation
This is also known as bivariate tabulation. The data may include two or more variables. E.g., popularity of a health drink among families having different incomes.
Table 14.3: Use of Health Drink. Rows: income per month (1000, 1001-2000, 2001-3000). Columns: number of children per family (0, 1, 2). Cells: number of families.
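A cross tabulation counts cases for each combination of two variables. The sketch below uses the same two variables as the table (income per month and number of children per family), but the records are made up and do not reproduce the table's figures.

```python
from collections import Counter

# Made-up records: (income band per month, number of children per family).
records = [
    ("1000", 0), ("1000", 1), ("1001-2000", 1),
    ("1001-2000", 2), ("2001-3000", 2), ("2001-3000", 2),
]

# Each (income band, children) pair is one cell of the two-way table.
cells = Counter(records)
income_bands = sorted({band for band, _ in records})
child_counts = sorted({children for _, children in records})

print("Income".ljust(12) + "".join(str(c).rjust(4) for c in child_counts) + "  Total")
for band in income_bands:
    row = [cells[(band, c)] for c in child_counts]
    print(band.ljust(12) + "".join(str(n).rjust(4) for n in row) + str(sum(row)).rjust(7))
```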
Data cleaning
Data cleaning includes consistency checks and treatment of missing responses. Although preliminary consistency checks have been made during editing, the checks at this stage are more thorough and extensive, because they are made by computer.
Consistency checks
Consistency checks identify data that are out of range, logically inconsistent or have extreme values. For example, a respondent may indicate that she charges long distance calls to a calling card, although she does not have one.
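Computer-based consistency checks of this kind can be sketched as follows. The field names, the 1-5 satisfaction scale and the records are assumptions made for illustration; only the calling-card rule comes from the example above.

```python
# Made-up coded records; the field names and valid ranges are assumptions.
records = [
    {"id": 1, "satisfaction": 4, "has_calling_card": 1, "charges_to_card": 1},
    {"id": 2, "satisfaction": 9, "has_calling_card": 1, "charges_to_card": 0},
    {"id": 3, "satisfaction": 2, "has_calling_card": 0, "charges_to_card": 1},
]

def find_problems(record):
    """Return a list of consistency problems found in one record."""
    problems = []
    # Range check: satisfaction is assumed to be coded on a 1-5 scale.
    if not 1 <= record["satisfaction"] <= 5:
        problems.append("satisfaction out of range")
    # Logical check: a respondent cannot charge calls to a card she does not have.
    if record["charges_to_card"] == 1 and record["has_calling_card"] == 0:
        problems.append("charges calls to a calling card but has none")
    return problems

flagged = {r["id"]: find_problems(r) for r in records if find_problems(r)}
print(flagged)
```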
Treatment of missing responses
Missing responses represent values of a variable that are unknown, either because respondents provided ambiguous answers or because their answers were not properly recorded.
1. Substitute a neutral value: A neutral value, typically the mean response to the variable, is substituted for the missing responses.
2. Substitute an imputed response: The respondent's pattern of responses to other questions is used to impute or calculate a suitable response to the missing questions.
3. Casewise deletion: Cases or respondents with any missing responses are discarded from the analysis.
4. Pairwise deletion: Instead of discarding all cases with any missing values, the researcher uses only the cases or respondents with complete responses for each calculation. As a result, different calculations in an analysis may be based on different sample sizes.
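Three of these treatments (neutral value, casewise deletion, pairwise deletion) can be compared on a small made-up data set. The variable names and ratings are hypothetical; imputation from other answers is left out because it depends on the model chosen.

```python
from statistics import mean

# Made-up ratings on two variables; None marks a missing response.
data = [
    {"quality": 4, "price": 3},
    {"quality": None, "price": 5},
    {"quality": 2, "price": None},
    {"quality": 5, "price": 4},
    {"quality": None, "price": 2},
]

def observed(variable):
    """All non-missing responses to one variable."""
    return [row[variable] for row in data if row[variable] is not None]

# Neutral value: replace each missing response with the variable's mean.
mean_filled = [
    {var: (val if val is not None else mean(observed(var))) for var, val in row.items()}
    for row in data
]

# Casewise deletion: keep only respondents with no missing responses.
casewise = [row for row in data if None not in row.values()]

# Pairwise deletion: each calculation uses every case that has the values it needs,
# so the two means below rest on different sample sizes.
pairwise_means = {var: mean(observed(var)) for var in ("quality", "price")}
pairwise_sizes = {var: len(observed(var)) for var in ("quality", "price")}

print(len(casewise), pairwise_sizes, pairwise_means)
```

Casewise deletion leaves two respondents here, while the pairwise means use three and four respondents respectively, which is the difference in sample sizes the text describes.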
Statistically Adjusting the Data
If any correction needs to be done for the statistical analysis, the data is adjusted accordingly.
Selecting a Data Analysis Strategy
The selection of a data analysis strategy should be based on the earlier steps of the marketing research process, known characteristics of the data, properties of statistical techniques, and the background and philosophy of the researcher.
Univariate Techniques
Univariate techniques are appropriate when there is a single measurement of each element in the sample, or when there are several measurements of each element but each variable is analyzed in isolation.
Multivariate Techniques
Multivariate techniques are suitable for analyzing data when there are two or more measurements of each element and the variables are analyzed simultaneously. They are concerned with the simultaneous relationships among two or more phenomena. Multivariate techniques differ from univariate techniques in that they shift the focus away from the levels (averages) and distributions (variances) of the phenomena, concentrating instead upon the degree of relationships among these phenomena.
Dependence techniques are appropriate when one or more variables can be identified as dependent variables and the remaining as independent variables. In interdependence techniques, the variables are not classified as dependent or independent; rather, the whole set of interdependent relationships is examined.
Chi-square Test
The chi-square test is represented by the symbol $\chi^2$, named after the Greek letter chi. This test was first used by Karl Pearson and is one of the most widely used tests today. Through the test, we are able to determine the extent of the difference between the theoretical or expected values and the observed or actual values.

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Where,
O = Observed frequencies
E = Expected frequencies

It is particularly useful in tests involving nominal data. The chi-square distribution takes only non-negative values and is asymmetrical. It has only one parameter, namely degrees of freedom. Degrees of freedom refers to the number of classes to which a value can be assigned freely without violating the restrictions placed.
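As a quick illustration of the formula, the statistic can be computed directly from observed and expected frequencies. The frequencies below are made up.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up frequencies for four classes.
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]

statistic = chi_square(observed, expected)
print(round(statistic, 2))
```

The larger the gap between observed and expected frequencies, the larger the statistic; it is zero only when every class matches exactly.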
Applications of chi-square:
Chi-square as a test of independence
We can establish whether two or more attributes are associated or independent. Examples:
- A doctor may be interested in knowing whether the new BCG vaccine is effective in controlling the target diseases.
- A researcher may want to know whether there is any relationship between divorce rates and working wives.
We start with the null hypothesis that there is no association between the attributes specified.
Acceptance criterion: if the calculated value is less than the table value, the null hypothesis is accepted (that is, not rejected).
Rejection criterion: if the calculated value is greater than the table value, the null hypothesis is rejected.
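The test of independence can be sketched for a contingency table as below. The 2x2 counts are made up; the expected frequency of each cell is taken as row total times column total divided by the grand total, and 3.841 is the standard table value for 1 degree of freedom at the 5% level.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Made-up 2x2 table: rows = vaccinated / not vaccinated, columns = infected / not infected.
table = [[10, 40], [25, 25]]
statistic, dof = chi_square_independence(table)

CRITICAL_5_PERCENT_DF1 = 3.841  # table value for 1 degree of freedom at the 5% level
reject_null = statistic > CRITICAL_5_PERCENT_DF1
print(round(statistic, 2), dof, reject_null)
```

With these counts the calculated value exceeds the table value, so the null hypothesis of no association would be rejected.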
As a test of goodness of fit
It can be used to determine how well a theoretical distribution (Poisson or normal) fits the observed data, that is, how appropriately a theoretical distribution fits an empirical distribution.
As a test of homogeneity
It is used to find out whether two or more randomly selected independent samples have been drawn from the same population.
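A goodness-of-fit check can be sketched with a simpler theoretical distribution than the Poisson or normal mentioned above: a uniform distribution over the six faces of a die. The roll counts are made up, and 11.070 is the standard table value for 5 degrees of freedom at the 5% level.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up data: a die rolled 60 times; the theoretical distribution is uniform.
observed = [8, 12, 9, 11, 6, 14]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square(observed, expected)
dof = len(observed) - 1          # no parameters are estimated from the data
CRITICAL_5_PERCENT_DF5 = 11.070  # table value for 5 degrees of freedom at the 5% level
fits = statistic < CRITICAL_5_PERCENT_DF5
print(round(statistic, 2), dof, fits)
```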
Merits of chi-square
- It can be used without making any assumption about the form of the parent distribution or its parameters, since it is based on observed frequencies and not on parameters like the mean and standard deviation.
- It is a distribution-free test and hence can be used with any type of population distribution.
- Its additive property allows the researcher to add the results of independent but related samples.
- It is easy to calculate and interpret.
Demerits of chi-square
- It is used for testing hypotheses but is not useful for estimation.
- It is not as reliable as parametric tests.
Discriminant Analysis
In this analysis, two or more groups are compared. Discriminant analysis is a technique for analyzing data when the dependent variable is categorical and the independent variables are interval in nature. For example, the dependent variable may be the choice of a brand of personal computer and the independent variables may be ratings of attributes of PCs on a seven-point Likert scale.
Examples of groups compared in discriminant analysis:
1. Those who buy our brand and those who buy a competitor's brand.
2. Heavy users, medium users and light users of the product.
The value of the dependent variable is calculated by using the data of the independent variables.

$Z = b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$

Where,
Z = Discriminant score
$b_i$ = Discriminant weight for variable $x_i$
$x_i$ = Independent variable

The objectives of discriminant analysis:
- Development of discriminant functions, or linear combinations of independent variables, which will best discriminate between the categories of the dependent variable.
- Examination of whether significant differences exist among the groups, in terms of the independent variables.
- Determination of which independent variables contribute most to the intergroup differences.
- Classification of cases to one of the groups based on the values of the independent variables.
- Evaluation of the accuracy of classification.
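To show how a discriminant score is used once the weights are known, here is a small sketch. The weights, the cutting score and the attribute names are all hypothetical; in practice the weights are estimated from data, which this example does not attempt.

```python
# Hypothetical discriminant weights for three attribute ratings (seven-point scale).
# In practice the weights are estimated from data; these values are made up.
WEIGHTS = {"price": 0.5, "performance": 0.25, "design": -0.25}
CUTOFF = 2.0  # hypothetical cutting score separating the two groups

def discriminant_score(ratings):
    """Z = b1*x1 + b2*x2 + b3*x3 for one respondent."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)

def classify(ratings):
    """Assign the respondent to a group by comparing Z with the cutting score."""
    return "our brand" if discriminant_score(ratings) >= CUTOFF else "competitor's brand"

respondent = {"price": 6, "performance": 4, "design": 2}
print(discriminant_score(respondent), classify(respondent))
```

Comparing the predicted group with the group each respondent actually belongs to gives the classification accuracy mentioned in the objectives.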
When the dependent variable has two categories, the technique is known as two-group discriminant analysis. When three or more categories are involved, the technique is referred to as multiple discriminant analysis.
Examples of discriminant analysis:
- In terms of demographic characteristics, how do consumers who exhibit store loyalty differ from those who do not?
- Do heavy, medium, and light users of soft drinks differ in terms of their consumption of frozen foods?
Conjoint Analysis
Conjoint analysis attempts to determine the relative importance consumers attach to salient attributes and the utilities they attach to the levels of attributes. This information is derived from consumers' evaluations of brands, or of brand profiles composed of these attributes and their levels. The respondents are presented with stimuli that consist of combinations of attribute levels. They are asked to evaluate these stimuli in terms of their desirability. Conjoint analysis is most appropriate when a company would like to know the most desirable attributes, or combination of attributes, for a new product or service.
Example
- An airline would like to know which is the most desirable combination of attributes to a frequent traveller: a) punctuality, b) air fare, c) quality of food served on the flight, and d) hospitality and empathy shown.
- A comparison between the utility of a price level of Rs. 400 versus Rs. 500, a delivery period of 1 week versus 2 weeks, or an after-sales response of 24 hours versus 48 hours.
- Conjoint analysis for a laptop: weight (3 kg or 5 kg), battery life (2 hrs or 4 hrs), brand name (Lenovo or Dell). Respondents rank order the combinations of the characteristics, e.g. (3 kg, 2 hrs, Lenovo) and (5 kg, 4 hrs, Dell).
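The stimuli for the laptop example are the combinations of attribute levels. A minimal sketch that enumerates every combination (a full factorial design, which is an assumption; real studies often show only a subset):

```python
from itertools import product

# Attributes and levels from the laptop example.
attributes = {
    "weight": ["3 kg", "5 kg"],
    "battery life": ["2 hrs", "4 hrs"],
    "brand": ["Lenovo", "Dell"],
}

# One profile per combination of levels: 2 x 2 x 2 = 8 profiles.
profiles = [dict(zip(attributes, levels)) for levels in product(*attributes.values())]
for number, profile in enumerate(profiles, start=1):
    print(number, profile)
```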
1. Identification of relevant product or service attributes.
2. Collection of data.
3. Estimation of worth for the attributes chosen.
Conjoint analysis has been used in marketing for a variety of purposes:
- Determining the relative importance of attributes in the consumer choice process.
- Estimating market share of brands that differ in attribute levels.
- Determining the composition of the most preferred brand.
- Segmenting the market based on similarity of preferences for attribute levels.
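One simple way to estimate the worth of each level and the relative importance of each attribute is sketched below. It assumes a single respondent who rated all eight laptop profiles (weight, battery life, brand); the ratings are made up. Averaging ratings per level recovers main-effect part-worths only because every combination of levels is rated exactly once; other designs would need regression or a dedicated conjoint procedure.

```python
from statistics import mean

# Hypothetical desirability ratings (1-10) for all eight laptop profiles,
# keyed by (weight, battery life, brand).
ratings = {
    ("3 kg", "2 hrs", "Lenovo"): 5, ("3 kg", "2 hrs", "Dell"): 3,
    ("3 kg", "4 hrs", "Lenovo"): 9, ("3 kg", "4 hrs", "Dell"): 7,
    ("5 kg", "2 hrs", "Lenovo"): 3, ("5 kg", "2 hrs", "Dell"): 1,
    ("5 kg", "4 hrs", "Lenovo"): 7, ("5 kg", "4 hrs", "Dell"): 5,
}
ATTRIBUTES = ("weight", "battery life", "brand")

grand_mean = mean(ratings.values())

# Part-worth of a level = mean rating of profiles with that level minus the grand mean.
part_worths = {}
for position, attribute in enumerate(ATTRIBUTES):
    levels = sorted({profile[position] for profile in ratings})
    part_worths[attribute] = {
        level: mean(r for p, r in ratings.items() if p[position] == level) - grand_mean
        for level in levels
    }

# Relative importance = range of an attribute's part-worths / sum of all ranges.
ranges = {a: max(w.values()) - min(w.values()) for a, w in part_worths.items()}
importance = {a: ranges[a] / sum(ranges.values()) for a in ranges}

print(part_worths)
print(importance)
```

With these made-up ratings, battery life carries half of the total importance and the other two attributes a quarter each.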
Factor Analysis
Factor Analysis is a general name denoting a class of procedures primarily used for data reduction and summarization. In marketing research, there may be a large number of variables, most of which are correlated and which must be reduced to a manageable level. Relationships among sets of many interrelated variables are examined and represented in terms of a few underlying factors.
The two most commonly employed factor analysis procedures:
Principal Component Analysis
When the objective is to summarise information from a large set of variables into fewer factors, principal component factor analysis is used.
Common Factor Analysis
If the researcher wants to analyse the components of the main factor, common factor analysis is used.
E.g. Common factor: inconvenience inside a car. The components may be:
1. Leg room
2. Seat arrangement
3. Entering the rear seat
4. Door locking mechanism
Principal Component Factor Analysis
Example: customer feedback about a two-wheeler manufactured by a company.
Six variables were identified:
1. Fuel efficiency
2. Durability of life
3. Comfort
4. Spare parts availability
5. Breakdown frequency
6. Price
Grouping:
- Variables 1, 2, 4, 5 into Factor 1 (Technical Factor)
- Variable 6 into Factor 2 (Price Factor)
- Variable 3 into Factor 3 (Personal Factor)
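The idea of summarising correlated variables into one component can be sketched without any statistics library. The example extracts only the first principal component of a correlation matrix by power iteration; the 3x3 matrix is hypothetical (two strongly correlated ratings and one nearly unrelated rating) and is not derived from the two-wheeler study.

```python
import math

def first_principal_component(corr, iterations=200):
    """Return (eigenvalue, unit vector) of the largest component by power iteration."""
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Multiply the matrix by the current vector, then rescale to unit length.
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # The Rayleigh quotient gives the eigenvalue for the unit-length vector.
    eigenvalue = sum(
        vector[i] * sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, vector

# Hypothetical correlation matrix for three ratings.
corr = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
eigenvalue, vector = first_principal_component(corr)
share = eigenvalue / len(corr)  # proportion of total variance explained

print(round(eigenvalue, 3), [round(x, 3) for x in vector], round(share, 3))
```

The two correlated ratings receive equal, large weights on the first component, which is how variables end up grouped under one factor.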
Factor analysis is used in the following circumstances:
- To identify underlying dimensions or factors that explain the correlations among a set of variables.
- To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables.
- To identify a smaller set of salient variables from a larger set for use in subsequent multivariate analysis.
Factor analysis can be used:
- In market segmentation, to identify the underlying variables on which to group the customers.
- In product research, to determine the brand attributes that influence consumer choice.
- In advertising studies, to understand the media consumption habits of the target market.
- In pricing studies, to identify the characteristics of price-sensitive consumers.
Cluster Analysis
Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.
Cluster analysis is used:
- To classify persons or objects into a small number of clusters or groups.
- To identify a specific customer segment for the company's brand.
- Consumers may be clustered on the basis of benefits sought from the purchase of a product.
- Understanding buyer behaviors.
- Identifying new product opportunities: By clustering brands and products, competitive sets within the markets can be determined. A firm can examine its current offerings compared to those of its competitors to identify potential new product opportunities.
- Selecting test markets.
- Reducing data: Cluster analysis serves as a data reduction tool to develop clusters or subgroups of data.
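Clustering consumers by benefits sought can be sketched with k-means, one common clustering technique (the notes do not name a specific method, so this choice is an assumption). The respondents, the two rating variables and the starting centroids are all made up.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                (point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            (
                sum(p[0] for p in cluster) / len(cluster),
                sum(p[1] for p in cluster) / len(cluster),
            )
            if cluster
            else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids

# Made-up respondents: (importance of price, importance of quality) on a 1-7 scale.
points = [(1, 6), (2, 7), (1, 7), (6, 2), (7, 1), (6, 1)]
clusters, centroids = k_means(points, centroids=[(1, 6), (6, 2)])

print(clusters)
print(centroids)
```

Here the respondents split into a quality-seeking group and a price-seeking group; with real data the result depends on the starting centroids and on the number of clusters chosen.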
Questions
- What is conjoint analysis? (3 Marks)
- What are editing and coding? (3 Marks)
- Distinguish data from information. (3 Marks)
- What are the measures of central tendency? (3 Marks)
- What is cross tabulation? Give an example. (3 Marks)
- Explain the problems during editing of data. (7 Marks)
- What is editing? What is its importance? (7 Marks)
- What is cluster analysis? What are the steps involved in it? Explain. (7 Marks)
- What is tabulation? What are the types of tabulation? (7 Marks)
- Write short notes on the following: i) Conjoint Analysis, ii) Cluster Analysis, iii) Discriminant Analysis, iv) Chi-square Test. (10 Marks)