Marketing Research II
Notes: MR. Prantosh Beneerji
BIMM
Collected By: CHAITANYA BANSAL
Marketing Research II BIMM
Index Table
Definition
Implementation and Research proposal
Data Reduction
Research Design
Attitude Scale
Degree of Freedom and Null Hypothesis
Basic Tech.
1. Cross tab
2. Correlation and Regression
3. Anova
Adv. Tech.
1. Factor Analysis
2. Discriminant Analysis
3. Cluster Analysis
4. Multidimensional scaling
5. Attribute based perceptual mapping
6. Conjoint Analysis
MARKETING RESEARCH
Research Process:
b. Field experiments.
Measurement Techniques:
1. Questionnaire
2. Attitude scales
a. Rating scales
b. Comparative scales
c. Perceptual maps
d. Conjoint analysis
3. Observation
4. Projective techniques and Depth interviews
a. Projective techniques
b. Depth interview
Sampling:
a. Population
b. Sample frame
c. Sampling unit
d. Sample size
e. Sample plan
f. Execution.
Research Proposal:
1. Executive Summary
2. Background
3. Objective
4. Research Approach
5. Time and cost
6. Technical appendixes.
Steps in SRO:
1. Clarification of PRO from senior management.
2. Situational analysis.
3. Model development
4. Specification for information required.
e.g.
PRO: What criteria should a student who has passed class XII meet?
SRO:
1. Clarification of PRO from senior management, i.e. the school certificate.
2. Situational analysis: speak to the teachers about what is being taught, e.g. language, science, mathematics.
3. Model development: we model the subjects.
Language (Hindi & English); Science (Physics, Chemistry, Biology); Mathematics
(Mathematics).
4. Quantify: give marks to each subject.
Language: Hindi 50, English 100
Science: Physics 100, Chemistry 100, Biology 100
Mathematics: Mathematics 100
------------------------------------------------------
Detergent market:
Population = 110 cr
User = 75 cr
No. of family =15 cr
In this process senior management asks Mr. Thackker which points he looks for in a
distributor. These are as follows, and we have to give a weightage to each point:
1. Willingness 0.15
2. Financial strength 0.15
3. Geographic convenience 0.15
4. Ware housing 0.10
5. Transportation 0.10
6. Market know how 0.05
7. Market reputation 0.10
8. Past experience 0.05
9. Sales staff 0.05
10. Other product 0.10
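The weightages above can be turned into a single score per candidate distributor. The following is a minimal sketch, assuming each candidate is rated on every point on a common scale; the ratings shown are hypothetical and only the weights come from the notes.

```python
# Weights taken from the list above (they add up to 1.00).
WEIGHTS = {
    "willingness": 0.15,
    "financial_strength": 0.15,
    "geographic_convenience": 0.15,
    "warehousing": 0.10,
    "transportation": 0.10,
    "market_know_how": 0.05,
    "market_reputation": 0.10,
    "past_experience": 0.05,
    "sales_staff": 0.05,
    "other_product": 0.10,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted sum of a candidate's ratings (one rating per point)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical candidate rated 10 on every point (illustration only).
uniform_candidate = {name: 10 for name in WEIGHTS}
total_weight = round(sum(WEIGHTS.values()), 10)
uniform_score = round(weighted_score(uniform_candidate), 10)
print(total_weight, uniform_score)
```

Because the weights add up to 1, a candidate rated the same on every point gets exactly that rating as the overall score, which is a quick check that no weight was mistyped.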
Situation analysis: The researcher collects secondary data about the skill and distribution
requirements for detergents. He then visits a few dealers of detergents and fertilizers to assess
their skills. He also visits zonal managers to ask whether they can add to his list of attributes.
Structured / direct / mail
(The preliminary decisions What, Whom and How are the link between the questionnaire and the SRO.)
Data Reduction: “The process of getting data ready for analysis is called data reduction.”
The steps involved in the reduction of data are (step, time, explanation):
1. Field controls (during data collection)
   Field controls are procedures designed to minimize errors during the actual collection of data.
   They include observation of the field work by supervisors/directors as it occurs.
   They also include checking the accuracy of the fieldwork after it has been conducted,
   and recontacting a sample of respondents to verify their answers to several questions taken
   from different parts of the questionnaire.
2. Editing (after data collection)
   Unless the questionnaire and analysis are very simple, or the responses are entered directly,
   unedited data will almost certainly give a wrong outcome.
   So in this process the editor should examine every completed questionnaire before it is
   transcribed on the computer.
   After the data are entered into the computer, computer editing should also be conducted.
3. Coding (after data collection)
   This is the process of establishing categories and assigning data to them. Normally the code
   is a number. We use it for research with large amounts of data.
   e.g. (i) Codes for a close-ended question:
        Breakfast        Lunch
        ------- 2401     -------- 2801
        ------- 2402     -------- 2802
        ------- 2403     -------- 2803
   (ii) Codes for an open-ended question: if people asked about the SX4 talk about its style,
   passion and economy, then in coding we need to define all these 3 sections.
4. Transcribing (after data collection)
   It is the process of physically transferring data from the measuring instrument directly into
   the computer. The result of this is the creation of a data sheet.
5. Generating new variables (after data collection)
   It is often necessary to create new variables as a part of the analysis procedure.
   New variables are often generated from combinations of other variables in the data, e.g. data
   on a person's age, marital status, and presence and age of children may be combined to
   generate a new variable called 'stage in the family life cycle'.
6. Summarizing (after data collection)
   There are two major kinds of summarizing statistics. The first
Chaitanya Bansal – Promoters 2007-2009
50
Tabulation:
Q. How many hours do you study?
Ans (no. of respondents – hours per day):
20 – 1
25 – 2
05 – 3
00 – 4
00 – 5+
Hour/Day Frequency %
1 20 40%
2 25 50%
3 5 10%
4 0 0%
5+ 0 0%
Total 50 100%
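The tabulation above can be reproduced from the raw answers. This is a small sketch, assuming the 50 answers are held as a plain list of hours per day; the name `tabulate` is only illustrative, and the category 5 stands for "5+".

```python
from collections import Counter

# Raw answers reconstructed from the table above:
# 20 respondents study 1 hour/day, 25 study 2 hours, 5 study 3 hours.
answers = [1] * 20 + [2] * 25 + [3] * 5


def tabulate(values, categories):
    """Return {category: (frequency, percentage)} for the listed categories."""
    counts = Counter(values)
    total = len(values)
    return {c: (counts.get(c, 0), 100.0 * counts.get(c, 0) / total) for c in categories}


table = tabulate(answers, categories=[1, 2, 3, 4, 5])
for hours, (freq, pct) in table.items():
    print(f"{hours:>4} {freq:>5} {pct:>5.0f}%")
print(f"Total {len(answers):>3}  100%")
```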
Attitude scale:
1. Direct (direct response attitude scale)
2. Indirect (derived attitude scale)
Hypothesis – Understanding
Null hypothesis – Ground floor to understanding
e.g. the ground floor tells us that there is a building, but it does not tell us how many floors there are.
H0 = Region has no significant impact on choice of beverage at 70% confidence level.
no: shows the negativity of the statement.
Confidence level: it has to be mentioned in each and every null hypothesis.
Fig. 1 Fig. 2
We can put any value in the 'North-Tea' box. If we put 99 in it, the row total fixes the
'North-Coffee' box and the column total fixes the 'South-Tea' box.
Once this one value is fixed, we are no longer free to choose the value of any other box, so
Degree of freedom = 1
R\B     Tea   Coffee  Other  Total
North                 X      100
South                 X      100
East                  X
West    X     X       X
Total   120   80             200
(X marks a box whose value is fixed once the other boxes are filled in.)
Degree of Freedom = 6
Degree of freedom = (R-1)x(C-1)
R- no. of rows
C- no. of columns
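The formula can be checked against both tables above. The sketch below also fills in the 2 x 2 table from a single free value, using the row totals (100, 100) and column totals (120, 80) from the notes; the function names are illustrative only.

```python
def degrees_of_freedom(rows, cols):
    """Degrees of freedom of an R x C cross tabulation: (R - 1) * (C - 1)."""
    if rows < 2 or cols < 2:
        raise ValueError("a cross tabulation needs at least 2 rows and 2 columns")
    return (rows - 1) * (cols - 1)


def fill_2x2(north_tea, row_totals=(100, 100), col_totals=(120, 80)):
    """With fixed totals, one free cell determines the other three cells."""
    north_coffee = row_totals[0] - north_tea
    south_tea = col_totals[0] - north_tea
    south_coffee = row_totals[1] - south_tea
    return [[north_tea, north_coffee], [south_tea, south_coffee]]


df_2x2 = degrees_of_freedom(2, 2)  # North/South x Tea/Coffee
df_4x3 = degrees_of_freedom(4, 3)  # North/South/East/West x Tea/Coffee/Other
filled = fill_2x2(99)
print(df_2x2, df_4x3, filled)
```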
For the value '1.64': Confidence level = 80%, so 'P' value = (1 - 0.8) = 0.2
For the value '12': Confidence level = 99%, so 'P' value = (1 - 0.99) = 0.01
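The link between a test value and its 'P' value can be sketched in code. This assumes, and the notes do not say so explicitly, that the values quoted are chi-square statistics with 1 degree of freedom; in that special case the upper-tail probability can be computed with the standard library alone.

```python
import math


def chi_square_p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Assumption: df = 1, where chi-square is the square of a standard normal
    variable, so P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


p_value = chi_square_p_value_df1(1.64)
confidence_level = 1.0 - p_value
print(round(p_value, 2), round(confidence_level, 2))
```

For tables with more than 1 degree of freedom this shortcut does not apply, and a statistics package or a printed chi-square table would be needed instead.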
Hypothesis: An “educated guess” about the outcome of an empirical test designed to answer a
research question.
For this purpose the business school has conducted a research in which students were asked to
indicate their educational background and their academic performance at MBA. An extract of
the data of the research is provided below for your analysis.
Data:Variable view:
Data view:
Output:
Q.2 Test out the null hypothesis using a suitable statistical measure.
Solution:
P Observed = 0.089
P Benchmark = 0.10
P Observed < P Benchmark, so reject Ho
AGE:
SOFT DRINK:
Data view:
Output:
Q.2 Test out the null hypothesis using a suitable statistical measure.
Solution:
P Observed = 0.833
P Benchmark = 0.20
P Observed > P Benchmark, so do not reject Ho
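The decision rule used in both solutions can be written as one comparison. A minimal sketch, where `decide` is an illustrative name and the benchmark is 1 minus the confidence level stated in the null hypothesis:

```python
def decide(p_observed, p_benchmark):
    """Reject H0 when the observed p-value is below the benchmark (1 - confidence level)."""
    if not (0.0 <= p_observed <= 1.0 and 0.0 <= p_benchmark <= 1.0):
        raise ValueError("p-values must lie between 0 and 1")
    return "reject H0" if p_observed < p_benchmark else "do not reject H0"


first_case = decide(p_observed=0.089, p_benchmark=0.10)   # 90% confidence level
second_case = decide(p_observed=0.833, p_benchmark=0.20)  # 80% confidence level
print(first_case, "|", second_case)
```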
We estimate the coefficients a1, a2, a3, ..., an and the constant k from past data and then
calculate the value of Sales. For this purpose we need to do Correlation and Regression analysis.
a) Market potential
b) No. of dealers
c) No. of sales staff
d) Competition activity index on a 5 point scale, where '1' stood for very low and '5' for very high.
e) Existing customer base
Historical data was collected territory wise on sales and the above variables. A data abstract
of research is provided below for analysis.
Data view:
Q.2 Develop alternative regression models indicating the relationship between sales and the other variables.
Q.3 Determine the goodness of fit of each model and recommend a suitable model for practical use by the firm.
Select the method that you want (Enter, Backward, Forward), then click OK.
Forward Method:
See these two tables:
This table gives the value of R square, which shows how well the variables in the model explain sales.
We take the 4th model because it does not include competitor activity, which is quite difficult to measure.
Forward Method:
2. Backward: R square = 97.1%, Sales = 0.2 (market potential) + 1.101 (dealer) + 1.120 (sales staff) – 10.288
The Enter solution cannot be used because it takes all the variables, including competitor
activity, which is difficult to measure, so it is not a good solution.
The Backward method is better because it does not include 'competitor activity', and its
R square is not much less than that of the Enter solution.
The Forward solution also leaves out 'competitor activity', and it leaves out 'dealer' as well,
but its R square is lower (96.0%).
The dealer factor is easy to measure, so with the Backward method we gain 1.1 percentage points
of R square (97.1 – 96.0) at little cost. The Backward method is therefore the most suitable
solution in this situation.
A data extract of research on sales and the above variables is provided below for your analysis.
Q.2 Develop alternative regression models indicating the relationship between sales and the other
variables.
Q.3 Determine the goodness of fit of each model and recommend a suitable model for practical use
by the firm.
Q.4 Recommend a practical model for Pizza Hut. Justify your answer.
Data view:
Select the method that you want (Enter, Backward, Forward), then click OK.
Forward Method:
See these two tables:
This table gives the value of R square, which shows how well the variables in the model explain sales.
We take the 2nd model because it has the highest R square value among the models that do not
include competitor activity, which is quite difficult to measure.
Forward Method:
The Enter solution cannot be used because it takes all the variables, including competitor
activity, which is difficult to measure, so it is not a good solution.
The Backward method is better because it does not include 'competitor activity', and its
R square is the same as that of the Enter solution.
The Forward solution also leaves out 'competitor activity', and it leaves out the
'Advertisement & customer base' factors as well, but its R square is lower.
The 'Advertisement & customer base' factors are not difficult to measure, so with the Backward
method we gain 1.3 percentage points of R square (95.3 – 94.0) at little cost.
Anova:
We conducted a research to find out whether engineers perform better than non-engineers.
Non-Engi.  7.4 7.5 7.45 7.85 7.55 7.3 7.35 7.7 7.5
Engi.      6.4 6.75 6.45 6.25 6.9 6.15 6.1 6.4 6.5
Here we can directly conclude that non-engineers are better than engineers. But what if the data
look like this:
Non-Engi.  0.5 3.5 2.5 5.5 8.5 9.0 6.5 5.5 7.5
Engi.      3.5 7.5 4.5 6.5 8.9 9.5 3.5 1.5 6.5
e.g. If Dual D studies 3 hours a day and in the other class (say Dual C) all students study for
roughly 3 hours, then the two classes are not distinctly distinguished.
But if Dual D studies 3 hours a day and Dual C studies for 0.5 hours, then they are
distinctly distinguished.
We take the difference of each member from its group mean and square it. The total is called the
'sum of square distances within' (S.O.S. within).
S.O.S (among/between):
Group 1 may be distinctly different from Group 2, yet Group 2 may not be distinctly different
from Group 1, because the values are tightly concentrated in Group 1 and widely spread in
Group 2. So even though the distance between the groups is the same, it looks different when
seen from each group.
So we also have to take a second parameter, i.e. the 'sum of square distances among/between'.
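The two sums of squares can be computed directly for the two data sets above. This is a sketch of a standard one-way Anova calculation in plain Python; the F ratio here divides each sum of squares by its degrees of freedom, which is the usual textbook definition rather than something spelled out in these notes.

```python
def one_way_anova(groups):
    """Return S.O.S. within, S.O.S. between, S.O.S. total and the F ratio."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    sos_within = 0.0
    sos_between = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        sos_within += sum((v - group_mean) ** 2 for v in g)
        sos_between += len(g) * (group_mean - grand_mean) ** 2
    sos_total = sum((v - grand_mean) ** 2 for v in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (sos_between / df_between) / (sos_within / df_within)
    return {"within": sos_within, "between": sos_between,
            "total": sos_total, "f_ratio": f_ratio}


# First data set: the groups are clearly apart.
clear = one_way_anova([
    [7.4, 7.5, 7.45, 7.85, 7.55, 7.3, 7.35, 7.7, 7.5],   # non-engineers
    [6.4, 6.75, 6.45, 6.25, 6.9, 6.15, 6.1, 6.4, 6.5],   # engineers
])
# Second data set: the groups overlap heavily.
overlapping = one_way_anova([
    [0.5, 3.5, 2.5, 5.5, 8.5, 9.0, 6.5, 5.5, 7.5],       # non-engineers
    [3.5, 7.5, 4.5, 6.5, 8.9, 9.5, 3.5, 1.5, 6.5],       # engineers
])
print(round(clear["f_ratio"], 1), round(overlapping["f_ratio"], 2))
```

In the first data set the spread within each group is tiny compared with the gap between the group means, so the F ratio is far above 1; in the second it falls below 1.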
An F ratio of more than 1 indicates that the groups are distinctly distinguished from each other.
Shelf height: {eye level, waist level, knee level} – Among group
Sales are different at different levels. But don't sales also vary between weekdays and weekend
days, and at different times of the day?
But for all 3 we have 3 parameters (eye, waist, knee level), so it becomes 3 x 3 = 9
3 ≠ 9
F ratio = Eigen value (only when df = 1) {i.e. only when there are two groups and two variables}
If F observed > F table then reject Ho (page no. 252, 'Tull and Hawkins').
Grand mean:
(4+7+1)/3 = 4
SOS total 6 42 38 88
# One Way Anova
Case: Food processing company
A food processing company desires to determine whether the shelf height at which its products
are placed has any impact on sales of that product. For this purpose it has conducted a
research in which the product was placed at three different heights:
a) Eye level
b) Waist level
c) Knee level
Sales were found to vary based on col. (a), day of the week (weekday/weekend), and col. (b), time
of the day (morning/evening).
A data abstract of the research tracking sales is provided below for your analysis.
Q.1 Determine whether the group means for each shelf height are significantly different or not at
99% confidence level.
Frame a suitable hypothesis for this and test it out with a suitable statistical test. Justify your
answer.
Data view:
Q.1 Determine whether the group means for each shelf height are significantly different or not at 99%
confidence level.
Solution:
Output:
Our condition is satisfied here: in the above table ‘sig.’ = 0.001 << 0.01 (i.e. 99% c.l.), so reject Ho.
Data view:
Output:
P Observed <PHo
# Perceptual Mapping
Perceptual map: A graphic representation of the perceived relationships among elements in a
set. (Where the elements could be brands, service, or product categories.)
# Factor Analysis
“It is a type of analysis used to determine the underlying dimensions of a set of data, to determine
relationships among variables, and to condense and simplify a data set.”
The objective of factor analysis is to summarize a large number of original variables into a
small number of synthetic variables, called 'factors'. Determining the factors which are present in
the data has a number of applications in marketing, e.g.
Brands/objects have a large number of attributes which define them. The attributes may or may not
be correlated with each other. The impact of attributes on the consumer is often in the form of
combinations or underlying dimensions. Combinations may contain some common and some unique
attributes. These underlying dimensions/combinations are called factors.
Factor analysis deals with the identification of the factors most important to the consumers.
Perceptual maps are drawn with factors as the axes. The original attributes are incorporated into
the map as vectors such that the direction of a line indicates the nature of its association with
the factors and the length of the line indicates the strength of the association.
A1, A2, A3, A4, A5, A6 are the attributes and we take two factors, F1 and F2. Now we
calculate the relation of these attributes with F1 and F2, i.e. we find the correlation of each
attribute with each factor. A correlation can lie anywhere from -1 to +1, so
suppose
To do this they had initially conducted an exploratory research and identified an exhaustive list of
variables that could possibly influence buying behavior.
These variables were then evaluated for the significance of their impact at 90% confidence level
through a suitable statistical test. The shortlisted variables were then converted into a
questionnaire, and respondents were asked to provide their responses on a five point Likert
scale where 1: strongly agree; 5: strongly disagree.
1. Will not cancel policy because of age and minor health problems.
2. Tries to handle claims equitably.
3. Difficult to do business with.
4. Provides excellent recommendations about coverage for individual needs.
5. Explains policies fully and clearly.
6. Tends to raise premiums without justification.
7. Policies better than others for older people.
8. Coverage is renewable for life.
9. Takes a long time to settle claims.
10. Quick, reliable service, easily accessible.
Answer:
Factor   Contained attributes   Name of factor
F1       1,7,8                  Handling of age issue
F2       2,3,4,5,6              Service features
F3       9,10                   Speed
These variables were then evaluated for the significance of impact on consumer preferences at
90% confidence level. The shortlisted variables were then converted into a questionnaire in the
form of statements. Respondents were asked to provide their responses on 7- point Likert scale
where 1: strongly agree, 7: strongly disagree. An extract of questionnaire as well as the data
extract are provided below for your analysis.
Q.2 Determine the % total variance in the data explained by the abstracted factors
cumulatively.
Data view:
Output
Q.1 Identify the major factors influencing consumer preferences.
Ans1: These three factors are selected because only their Eigen values are greater than 1.
Q.2 Determine the % total variance in the data explained by the abstracted factors cumulatively.
1- 38.828
2- 27.770
3- 13.747
Ans3: For an acceptable factor analysis:
a) The Eigen value of each rotated factor should be greater than '1'.
b) Percentage (%) of total variance explained cumulatively by the factors extracted.
From this table we take the variables whose values are greater than '0.7' or less than '-0.7'
and name each factor according to the characteristics of its variables, i.e. a name which
combines all the variables.
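The two checks above can be sketched in a few lines. This assumes the three figures listed (38.828, 27.770, 13.747) are each factor's own percentage of variance, so the cumulative figure is their running total; the loadings in the second part are hypothetical and only illustrate the 0.7 cut-off.

```python
from itertools import accumulate

# Percentage of variance per extracted factor, as listed in the notes
# (assumed to be per-factor figures, not already cumulative).
variance_explained = [38.828, 27.770, 13.747]
cumulative = [round(v, 3) for v in accumulate(variance_explained)]


def salient_variables(loadings, threshold=0.7):
    """Return the variables whose loading is above +threshold or below -threshold."""
    return [name for name, value in loadings.items() if abs(value) > threshold]


# Hypothetical rotated loadings on one factor (illustration only).
hypothetical_loadings = {"V1": 0.82, "V2": -0.75, "V3": 0.41}
selected = salient_variables(hypothetical_loadings)
print(cumulative, selected)
```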
Ans 3&4:
F1- Man vehicle /Powerful / Friend jealous / Ads makes feel good --------- MACHO IMAGE
Case: airline
The domestic airline industry has witnessed intense competition during the last few years. Fare
prices have dropped to a large extent. In the last 6 months the price of aviation fuel has gone up
considerably; this has compelled all carriers to increase the price of their tickets. Indian Airlines
did not increase its prices. Given that the average Indian flyer is price sensitive, this move by
Indian Airlines was expected to increase its market share considerably. However, this did not
happen.
Jet Airways continues to be the market leader. Hence Indian Airlines desires to understand the
factors influencing customer preferences.
To determine this they have conducted a suitable research. An extract of the questionnaire used,
along with a data extract, is provided below for your analysis.
Q.2 Determine the % total variance in the data explained by the abstracted factors
cumulatively.
5: strongly disagree
Variables are:
1) They (Jet Airways) are always on time.
2) The seats are very comfortable.
3) I love their food.
4) The air hostesses are beautiful.
5) My boss/friends fly on Jet Airways.
6) Jet Airways has younger aircraft.
7) They have a frequent flyer program.
8) The flight timings suit my schedule.
9) My mom/family feel safe when I fly Jet Airways.
10) Flying Jet complements my lifestyle and social standing in society.
Variable view:
Data view:
SPSS Procedure:
Output
Ans3: For an acceptable factor analysis:
Ans 3&4:
# Discriminant Analysis:
“It is a statistical technique for classifying persons or objects into two or more
categories, using a set of intervally scaled predictor variables.”
The objective of discriminant analysis is to classify persons or objects into two or more
categories, using a set of interval-scaled variables, e.g.
classification of buyers v/s non-buyers, good v/s bad risks, early v/s late timing of the market.
It is used when the independent variables (predictor variables) are interval or ratio scaled and
the dependent variable (criterion variable) is nominal scaled (categorical).
e.g. 1) Assessment of applications for loans, i.e. to determine credit worthiness.
2) How do loyal consumers differ from non-loyal consumers in terms of demographics and
psychographics?
3) How do doctors, lawyers and bankers differ in terms of their preference forecast?
In discriminant analysis the dependent variable is nominal/categorical, so 'Z' can take only a
limited range, whereas in regression analysis the dependent variable is ratio/interval.
Protein Vitamin
Person Evaluation X1 X2
1 D 2 4
2 D 3 2
3 D 4 5
4 D 5 4
5 D 6 7
Total 20 22
Average 20/5 = 4 22/5 = 4.4
Protein Vitamin
Person Evaluation X1 X2
6 L 7 4
7 L 8 6
8 L 9 7
9 L 10 6
10 L 11 9
Total 45 32
Average 45/5 = 9 32/5 = 6.4
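The two tables above are enough to sketch a discriminant function by hand. The code below uses Fisher's linear discriminant for two groups and two variables, with the midpoint of the two group centroids as the cutoff. This is one standard way to build such a function, not necessarily the scaling SPSS reports, and the function names are illustrative.

```python
group_d = [(2, 4), (3, 2), (4, 5), (5, 4), (6, 7)]      # evaluation D: (protein X1, vitamin X2)
group_l = [(7, 4), (8, 6), (9, 7), (10, 6), (11, 9)]    # evaluation L


def mean_vector(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mean):
    """Sums of squares and cross products around the group mean."""
    sxx = sum((x - mean[0]) ** 2 for x, _ in points)
    syy = sum((y - mean[1]) ** 2 for _, y in points)
    sxy = sum((x - mean[0]) * (y - mean[1]) for x, y in points)
    return sxx, syy, sxy


def fisher_discriminant(group_a, group_b):
    """Return weights (w1, w2) and the midpoint cutoff for Z = w1*X1 + w2*X2."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    axx, ayy, axy = scatter(group_a, mean_a)
    bxx, byy, bxy = scatter(group_b, mean_b)
    sxx, syy, sxy = axx + bxx, ayy + byy, axy + bxy      # pooled within-group scatter
    det = sxx * syy - sxy * sxy
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    w1 = (syy * dx - sxy * dy) / det                     # inverse(pooled scatter) * mean difference
    w2 = (-sxy * dx + sxx * dy) / det
    z_a = w1 * mean_a[0] + w2 * mean_a[1]
    z_b = w1 * mean_b[0] + w2 * mean_b[1]
    return (w1, w2), (z_a + z_b) / 2.0


weights, cutoff = fisher_discriminant(group_d, group_l)


def z_score(point):
    return weights[0] * point[0] + weights[1] * point[1]


print(mean_vector(group_d), mean_vector(group_l))
print([round(w, 3) for w in weights], round(cutoff, 3))
```

With these ten persons every D case scores below the cutoff and every L case scores above it, so the function classifies the sample without error.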
If the points are separated like this, it is very easy to locate the discriminant function. But
even a single misplaced point can prevent us from drawing such a discriminant function,
e.g. if a person is credit worthy but has not paid a loan on time, that person's point will jump
to the other side.
The problem appears to be the screening criteria used by the bank at the time of credit card
allotment. Hence the bank desires to revamp its screening mechanism for credit card applicants in
future. To do this the bank has carried out a suitable research. Initially an exploratory research
was conducted and a detailed set of variables that could impact credit worthiness was identified.
These variables were evaluated for the significance of their impact on credit worthiness at 90%
confidence level.
1) Consumer age
2) Monthly income
3) No. of years married.
Historical data for the last three years was collected on these three variables. An extract of
this data is provided below for your analysis.
Q.1 Build a discriminant function that would distinguish between high risk and low risk
customers.
Q.5 Create a discriminant criterion or cutoff value that would enable the bank to classify future
applicants into high risk and low risk categories. Justify your answer.
Data:
Variable view:
Data view:
Output:
Q.1 Build a discriminant function that would distinguish between high risk and low risk
customer.
Ans1:
Q.5 Create a discriminant criterion or cutoff value that would enable the bank to classify
future applicants into high risk and low risk categories. Justify your answer.
Ans5: Discriminant criteria
Now for new data (a new respondent) we put the values of their age, income and years of marriage
into the discriminant function and get the value of Z.
Tables to remember
Things to find out          SPSS Output Table
Discriminant function       Canonical Discriminant Function
Classification accuracy     Classification Results
Statistical significance    Eigenvalues and Wilks' Lambda
Best discriminator          Standardized Canonical Discriminant Function
Discriminant criteria       Functions at Group Centroids
A national retail chain desires to categorise its customers into loyal customers & normal
customers. To do this the firm has conducted a suitable research. The variables that have a
significant impact on customer loyalty were found to be:
a) Frequency of purchase.
b) Amount of purchase.
c) No. of years purchasing.
Historical data for the last 10 years was collected from the company's own records. An extract of
this data is provided below for your analysis.
Q.1 Build a discriminant function that would distinguish between high risk and low risk
customers.
Q.5 Create a discriminant criterion or cutoff value that would enable the bank to classify future
applicants into high risk and low risk categories. Justify your answer.
Data view:
When the grouping variable “risk” is shifted, we need to define the range. Click on the
'Define range' button and this table appears.
Output:
Q.1 Build a discriminant function that would distinguish between high risk and low risk
customer.
Ans1:
is 'Frequency of purchase' because its value is the greatest (i.e. = .777) among all three.
Q.5 Create a discriminant criterion or cutoff value that would enable the bank to classify
future applicants into high risk and low risk categories. Justify your answer.
Now for new data (a new respondent) we put the values of their age, income and years of marriage
into the discriminant function and get the value of Z.
# Cluster Analysis
“It is a set of techniques for separating objects into mutually exclusive groups such that the
groups are relatively homogeneous.”
A cluster is a group of target customers who are similar in
1) Buying behavior
2) Demographics
3) Psychographics.
Cluster analysis is used for market segmentation. The methods for cluster analysis are:
1) Hierarchical clustering / Linkage method.
2) Non hierarchical clustering / Nodal method.
1. Hierarchical clustering (HC): In this method the no. of clusters to be extracted is not pre-
specified. The solution provides a range of clusters from which the researcher may choose.
In HC the linkage could be:
Single
Complete
Average.
2. Nodal method (K-means approach): This method requires the no. of clusters to be pre-
specified.
Typically both methods use Euclidean distance (a measure of proximity) to create clusters.
Suppose a respondent's answer to Q.1 is mildly disagree and to Q.2 strongly agree; then write it
as Rn (5,1).
R1 (4,3); {means the answer to Q.1 is Neither Agree Nor Disagree & to Q.2 mildly agree}
R2 (1,2); R3(7,7); R4(7,6); R5(4,4); R6(3,6); R7(2,1); R8(1,1); R9(3,3), R10(6,7)
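Euclidean distance between two respondents is the straight-line distance between their answer coordinates. A minimal sketch using the ten respondents listed above; the helper name `euclidean` is illustrative.

```python
import math

# Respondent coordinates (answer to Q.1, answer to Q.2) as listed above.
respondents = {
    "R1": (4, 3), "R2": (1, 2), "R3": (7, 7), "R4": (7, 6), "R5": (4, 4),
    "R6": (3, 6), "R7": (2, 1), "R8": (1, 1), "R9": (3, 3), "R10": (6, 7),
}


def euclidean(a, b):
    """Straight-line distance between two points with the same number of coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


d_r2_r8 = euclidean(respondents["R2"], respondents["R8"])    # near neighbours
d_r3_r10 = euclidean(respondents["R3"], respondents["R10"])  # near neighbours
d_r1_r3 = euclidean(respondents["R1"], respondents["R3"])    # far apart
print(d_r2_r8, d_r3_r10, d_r1_r3)
```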
Rule 1: The points which are nearest to each other are joined first.
Rule 2: If several pairs are at the same distance, we take the pair which has the highest 'y'
value, i.e. the highest coordinate value.
e.g. between (3,4) and (3,10) we choose (3,10) because 10 > 4.
Rule 3: If the pairs are at the same distance and their 'y' coordinate is also the same, we take
the pair in which the value of 'x' is minimum.
e.g. between (1,9) & (6,9) we choose (1,9) because 1 < 6.
Rule 4: The points join at their midpoint, and the new point is given the name of the point
with the lower number.
e.g. if points 3 and 10 join, the name of the new point will be '3'; the value of '3' will be
different, but the name will still be '3'.
Now take these points: R1 (4,3); R2 (1,2); R3 (7,7); R4 (7,6); R5 (4,4); R6 (3,6); R7 (2,1); R8 (1,1);
R9 (3,3); R10 (6,7).
We choose the points which are closest to each other (Rule 1), i.e. (R2 & R8); (R1 & R9);
(R5 & R6); (R3 & R10), so the pairs are (2,8); (1,9); (5,6); (3,10).
Take the pair with the highest 'y' value first, i.e. (3,10), next (1,9), next (2,8), next (5,6).
Agglomeration Schedule:
Stage   Cluster combined   Fusion coeff.   Difference between coeff.
        C1     C2
1       3      10          1
2       1      9           1               0
3       2      8           1               0
4       5      6           1               0
5       1      5           1               0
6       2      7           1.12            0.12
7       3      4           1.12            0
8       1      2           3.22            2.1
9       1      3           4.62            1.4
There are 9 stages because the no. of respondents = 10
Stages = (No. of respondents - 1)
The process of combining all the data is called agglomeration.
e.g. the Mumbai suburban agglomeration means Mumbai and the areas which are included with Mumbai.
The solution provides the range of clusters. The table of fusion coefficients shows this range.
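One way to read the number of clusters off the fusion coefficients is to stop merging just before the largest jump. This stopping rule is a common heuristic and an assumption here; the notes only say the number of segments is found manually from the table. The coefficients are those of the agglomeration schedule above.

```python
# Fusion coefficients for stages 1..9 of the agglomeration schedule above.
fusion = [1, 1, 1, 1, 1, 1.12, 1.12, 3.22, 4.62]
n_respondents = len(fusion) + 1          # stages = respondents - 1

# differences[i] is the increase in the coefficient going into stage i + 2.
differences = [round(later - earlier, 2) for earlier, later in zip(fusion, fusion[1:])]

largest_jump = max(differences)
stage_of_jump = differences.index(largest_jump) + 2   # stage whose merge causes the jump

# After k stages there are (respondents - k) clusters; stop one stage before the jump.
clusters_before_jump = n_respondents - (stage_of_jump - 1)
print(differences, stage_of_jump, clusters_before_jump)
```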
Dendrogram:
The word Dendrogram comes from 'Dendron'. In the human body the cells are connected with each
other by dendrons. The cells lie one after another, but each is connected to different cells by
dendrons. In the graphical representation, cells '1', '2', '3', '4' are joined with
each other, but as you see in the figure cell '1' is connected with cell '4' while cell '2' is
not. In segmentation also, the points are numbered serially but they are connected with
different points irrespective of their number. This is represented by the Dendrogram.
Points 1, 2 & 9, 10 are very close to each other, so they should join; but
see how they look because of the change of sequence.
'K' means is there to take care of such mistakes. It works by iteration.
Now suppose there are 40 respondents, spread like this: Stage 1
Now we find the distance of each respondent from the centroid of its cluster and sum it up.
d1 – for cluster I
$D = \sum_{i=1}^{4} d_i$
Here there are 4 clusters (i = 1 to 4).
Stage 2
Distance is the shortest path between two points, so we cannot reduce the individual distances.
But we can change the cluster of some points so that the total distance 'D' is reduced. This is
done through the iteration method, and it is repeated till we reach the smallest value of 'D'.
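The iteration can be sketched as follows: compute D for the current clusters, move each point to the cluster whose centroid is nearest, and check that D has gone down. The six points and the starting assignment below are hypothetical. Note that D here is the sum of plain distances, as in the notes, whereas standard K-means software usually works with squared distances.

```python
import math


def centroids(points, labels):
    """Return {cluster label: centroid} for the current assignment."""
    members = {}
    for point, label in zip(points, labels):
        members.setdefault(label, []).append(point)
    return {
        label: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for label, pts in members.items()
    }


def total_distance(points, labels):
    """D: sum over all points of the distance to the centroid of their own cluster."""
    centres = centroids(points, labels)
    return sum(math.dist(p, centres[label]) for p, label in zip(points, labels))


def reassign(points, labels):
    """One iteration: move every point to the cluster whose centroid is nearest."""
    centres = centroids(points, labels)
    return [min(centres, key=lambda label: math.dist(p, centres[label])) for p in points]


# Hypothetical points and a deliberately poor starting assignment.
points = [(1, 1), (1, 2), (2, 1), (7, 7), (7, 6), (6, 7)]
start = [0, 0, 1, 1, 1, 0]
after = reassign(points, start)
print(round(total_distance(points, start), 2), round(total_distance(points, after), 2), after)
```

In practice the reassignment is repeated until no point changes cluster, which is the point at which D stops falling.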
Stage 3
Q.3 Determine the variables that distinguish between clusters at 90% C.L.
Q.5 Label each cluster suitably after profiling. Justify your answer.
Data View:
Output:
Q.1 Identify the no. of segments from the data provided.
Ans1: The no. of segments is found by the manual method from this table.
Ans:
Q.3 Determine the variables that distinguish between clusters at 90% C.L.
Ans: In this table the variables whose value is less than '0.1' are the distinguishing
variables at 90% C.L.
Ans: In the Final cluster center table we look only at those variables which are
distinguishing variables.
Health Conscious
C3 C1 C4 C2
Frgn co. C1 C4
C2 C3
Settle abroad C2 C3 C4
C1
Branded product C2 C4
C3
C1
Go out C2
C1 C3 C4
Credit cards C2 C4
C3 C1
C1:
1. Highly prefer email over post
2. People are health conscious
3. Agree that foreign co. increases efficiency of Indian company
4. Women may or may not take active part in purchase
5. Highly prefer to settle abroad
6. Always wear branded products
7. Always go out on weekends
8. Strongly avoid purchasing on credit
C2:
1. Highly prefer post over email
2. People aren't health conscious
3. Foreign co. highly influences the efficiency of Indian co.
4. Agree that women take active part in shopping
5. Prefer to stay in home country
6. Don't prefer branded products
7. Don't go out on weekends
8. Buy on credit
C3:
1. Moderately prefer email over post
2. People are strongly health conscious
3. Foreign co. doesn't increase the efficiency of Indian company
4. Strongly agree that women take active part in shopping
5. Prefer to live in home country over abroad
6. Prefer branded products over unbranded
7. Go out on weekends
8. Strongly prefer to buy on credit
C4:
1. Moderately prefer post over email
2. People are not health conscious
3. Foreign co. increases efficiency of Indian co.
4. Believe that women do not take active part in shopping
5. Want to live in home country always
6. Never wear branded products
7. Stay at home on weekends
8. Prefer to buy on cash
Q.5 Label each cluster suitably after profiling. Justify your answer.
Ans: From the Ans4 table we give each cluster a suitable name according to its characteristics.
C1 Modern
C2 Believers
C3 Nationalistic
C4 Strivers
Tables to remember
Things to be Find out SPSS Output Table
No. of segment Agglomeration Schedule
Respondent Belongs Cluster membership
Variable that distinguish Anova
Dominant distinguish characteristics Final Cluster Center
Suppose you ask a consumer why they drink a cold drink such as 'Coke', and we offer
these attributes:
Won't the respondent say “apne ko nahi pata baba yeh sab” (I don't know these things)?
These attributes create bias within the respondent.
MDS is primarily used to create a perceptual map of product/brand positioning in the minds of
the consumer for a particular group of products/brands.
The positions of competing brands in a product category are found through MDS.
Raw perception: It captures the consumer's preference without influencing it with your own views
or attributes. The respondent may be thinking about attributes that we do not know of; the
essential thing is that we do not tell the respondent these attributes.
Now suppose research is conducted on the distances between cities.
Q. How many directions do you need to draw on the map so that Nagpur and the other cities are
located where they are on the map, assuming the above distances?
Suppose we take any point on this sheet as Nagpur. The Pune-Nagpur distance is 600 km, so we draw
a circle of radius r = 600, because Pune cannot lie beyond this radius. This is the locus of Pune.
Now the distance between Delhi and Nagpur is 1000 and the distance between Delhi and Pune is
1500, so keeping these distances we draw the circles around the points Nagpur and Pune.
We get two points, D1 and D2. By this process we can locate all the points on the map. The most
that can happen is that the map of India comes out 'laterally inverted', like this:
In the same way, brands are also at some distance from each other in the mind of the consumer,
e.g. difference between Pepsi and Coke = 15; Pepsi and Sprite = 80.
Consumers think on their own; do not influence the respondent with our attributes.
Normally it may be possible to identify, for a particular product category, the important
attributes from a customer's point of view.
These attributes may have an impact on the positioning of each brand in the customer's mind and
on the customer's buying behavior. Further, these attributes may be taken 2 or 3 at a time and a
plot created to understand the specific position of each brand.
Such a plot or map may be relatively straightforward, but it may not capture the consumer's mind
completely. This is because the customer thinks simultaneously about multiple product dimensions/
attributes while evaluating the products.
1. Attribute based.
2. Non attribute based.
The non attribute based procedure is based on similarity or preference. It is also called the
similarity - dissimilarity approach.
S1: No. of dimensions/parameters used by respondents to evaluate brands.
S2: Score of each brand on each attribute.
Map the DRAS data of respondents onto the dimensions obtained from the MDS responses to interpret the
meaning of the dimensions.
We are conducting research on soap. Respondents are shown cards with the names of
two brands on each card. The pair of names differs from card to card.
All pairs of the brands studied are shown. Respondents have to judge brand proximity or
difference. They are asked to convey their preferences or the differences between brands in
terms of perceptual distance, numerically. Scales may vary: 0-10, 0-20, 0-25, 0-100,
etc.
These distances are averaged across respondents and then converted into a matrix. The MDS
procedure is run to determine:
a. The number of dimensions used by respondents to differentiate the brands.
b. The score of each stimulus on each dimension (stimulus coordinates). The derived dimensions are interpreted
by mapping the output of the DRAS research onto the dimensions of the MDS research.
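The averaging step that feeds the MDS procedure can be sketched as below. The brand labels and the two respondents' scores are hypothetical, and `average_distance_matrix` is a name invented for this illustration; SPSS simply expects the resulting matrix as its input.

```python
from itertools import combinations

brands = ["A", "B", "C"]  # hypothetical brand labels
# Hypothetical responses: each respondent scores every brand pair on a 0-100 scale.
responses = [
    {("A", "B"): 10, ("A", "C"): 80, ("B", "C"): 70},
    {("A", "B"): 20, ("A", "C"): 90, ("B", "C"): 60},
]

def average_distance_matrix(brands, responses):
    """Average each pair's distance across respondents into a symmetric matrix."""
    n = len(brands)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        pair = (brands[i], brands[j])
        mean = sum(r[pair] for r in responses) / len(responses)
        matrix[i][j] = matrix[j][i] = mean  # distance is the same in both directions
    return matrix

matrix = average_distance_matrix(brands, responses)
for name, row in zip(brands, matrix):
    print(name, row)
```

With n brands there are n(n-1)/2 pairs to show each respondent, so 8 brands already need 28 cards.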
Kruskal's Stress:
Perceptual maps are drawn with the dimensions as the axes; brands are plotted based on their
stimulus coordinate scores. The location of a brand on the map indicates its positioning.
Suppose the customer has used two dimensions, D1 and D2, for 4 brands (A, B, C, D).
When the two-dimensional solution is reduced to one dimension, we take each brand's distance from the
origin. The original distances between AB, BC, CD, AC and BD change: they shrink.
Kruskal's stress is a measure of the extent of misfit of a specific MDS solution. Values of
Kruskal's stress range from 0 to 1. Values close to 1 indicate a high level of misfit and values
close to 0 indicate a low level of misfit.
For an acceptable MDS solution, Kruskal's stress should be less than 0.15.
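The shrinking of distances and the resulting misfit can be illustrated with a small sketch. The four brand coordinates are hypothetical, and the `stress` function is a simplified stress-style ratio normalised by the original distances (an assumption made for illustration; the exact formula SPSS reports may differ).

```python
import math
from itertools import combinations

# Hypothetical (D1, D2) coordinates for four brands; not taken from the notes.
points = {"A": (1.0, 4.0), "B": (3.0, 1.0), "C": (4.0, 3.0), "D": (2.0, 2.0)}
pairs = list(combinations(points, 2))

# Distances on the original two-dimensional map.
dist_2d = {
    (a, b): math.hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])
    for a, b in pairs
}
# One-dimensional score per brand: its distance from the origin, as described above.
score_1d = {name: math.hypot(x, y) for name, (x, y) in points.items()}
dist_1d = {(a, b): abs(score_1d[a] - score_1d[b]) for a, b in pairs}

def stress(target, fitted):
    """Square root of the squared misfit divided by the squared target distances."""
    misfit = sum((target[p] - fitted[p]) ** 2 for p in target)
    return math.sqrt(misfit / sum(target[p] ** 2 for p in target))

value = stress(dist_2d, dist_1d)
print(round(value, 3), "acceptable" if value < 0.15 else "poor fit")
```

Every one-dimensional distance here is at most the corresponding two-dimensional distance, so the ratio stays between 0 and 1.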
Case: T.V.
A T.V. producer wants to understand the brand positioning of 8 T.V. brands as per customer
perception. For this purpose, they have identified the following brands:
(1) Aiwa, (2) Videocon, (3) L.G, (4) Samsung, (5) Sony, (6) Onida, (7) Thomson, (8) BPL
They conducted the research in two parts. In part A, MDS questionnaires were shown to
respondents. Respondents were asked to indicate, on a scale of 0-10, the perceptual distance between
each brand pair as per their perception.
The data obtained was then averaged across respondents and converted into a matrix. The data
is given below for your analysis.
In part B, a matched sample was shown a DRAS-based questionnaire on a set of attributes.
Respondents were asked to evaluate the same 8 brands on a 7-point semantic differential
scale. A summary of this data is provided below for your analysis.
DRAS Table
Data View:
SPSS Procedure:
Output:
Q.1 Determine the number of dimensions used by respondents to evaluate the brands.
Ans: For an acceptable solution we check two things:
a) Kruskal's stress < 0.15
b) R² > 0.70
As long as these values are not reached, the number of dimensions is increased and the solution is
checked again until they are. In our case 3 dimensions are used by
respondents.
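The two-part acceptance rule can be written as a small check. The fit statistics below are hypothetical placeholders, not the SPSS output for this case, and `acceptable_dimensions` is a name made up for the sketch.

```python
# Hypothetical fit statistics per number of dimensions: (Kruskal's stress, R squared).
fit_by_dimensions = {1: (0.41, 0.38), 2: (0.22, 0.61), 3: (0.09, 0.86)}

def acceptable_dimensions(fit, max_stress=0.15, min_rsq=0.70):
    """Return the smallest number of dimensions meeting both criteria, else None."""
    for n_dims in sorted(fit):
        stress, rsq = fit[n_dims]
        if stress < max_stress and rsq > min_rsq:
            return n_dims
    return None

print(acceptable_dimensions(fit_by_dimensions))
```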
Q.2 Interpret the dimensions by mapping the data summary of the DRAS research on
to the dimensions obtained from the MDS research.
Ans:
Use the stimulus coordinates table and evaluate the dimensions one at a time.
e.g. take dimension 3 and compare it with the DRAS table, following these steps:
1. Find the highest value on dimension 3, i.e. 1.3724 (BPL). Now see in the
DRAS (direct response attribute scale) table on which attribute BPL has its
highest value.
Here it is Brand image.
2. Now find the lowest value on dimension 3, i.e. -1.6871 (Samsung).
If Samsung also has the lowest value on Brand image, then it is possible
that Dimension 3 = Brand image.
3. To check this, look at the second-highest value on the dimension; if it
also matches the DRAS table, we can be reasonably sure that
Dimension 3 = Brand image.
Note that a dimension may contain more than one attribute, and it may not
match the DRAS table as exactly as it does here. We need to allow some
flexibility.
Q.4 Create perceptual maps with the dimensions as the axes. Position the brands on
each map based on their stimulus coordinate scores on each dimension. Justify your
answer.
Ans: Here we have to prepare three graphs, one for each pair of dimensions, e.g. between
a) Value for money and After sale service

Brand name   Value for money   After sale service
AIWA          1.9545            0.2962
VIDEOCON      0.0613            1.137
LG           -0.6209           -1.2429
SS           -0.9221           -0.4411
SONY          0.9783           -1.0898
ONIDA         0.892             0.4307
THOMSON      -1.0686            1.6324
BPL          -1.2746           -0.7225

Prepare a graph for this table using the Charts - Scatter plot option: name the dimensions
on the x and y axes, plot the points and label each point with its brand.
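Before drawing the scatter plot, the same table can be read quadrant by quadrant. The sketch below uses the coordinates given above; treating a positive coordinate as "high" on a dimension is an assumption, since the direction of an MDS axis has to be confirmed against the DRAS table.

```python
# Stimulus coordinates from the table: (value for money, after sale service).
coordinates = {
    "AIWA": (1.9545, 0.2962), "VIDEOCON": (0.0613, 1.137),
    "LG": (-0.6209, -1.2429), "SS": (-0.9221, -0.4411),
    "SONY": (0.9783, -1.0898), "ONIDA": (0.892, 0.4307),
    "THOMSON": (-1.0686, 1.6324), "BPL": (-1.2746, -0.7225),
}

def quadrant(value_for_money, after_sale_service):
    """Label the map quadrant, assuming positive means 'high' on that axis."""
    vfm = "high" if value_for_money > 0 else "low"
    service = "high" if after_sale_service > 0 else "low"
    return f"{vfm} value for money, {service} after sale service"

by_quadrant = {}
for brand, (x, y) in coordinates.items():
    by_quadrant.setdefault(quadrant(x, y), []).append(brand)

for label, brands in by_quadrant.items():
    print(label + ":", ", ".join(brands))
```

Brands that fall in the same quadrant are the ones customers perceive as close competitors on these two dimensions.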
In practical situations it may not be easy for a respondent to make paired comparisons between a large
number of brands all the time. Further, the respondent may only be able to compare our own brand with a
few other leading brands with clarity on the differences.
In such cases the perceptual maps can be created with an attribute-based technique using discriminant
analysis.
Procedure: The brands to be evaluated are shortlisted. The attributes on which to evaluate them are
finalized based on prior research/exploratory research. These attributes are evaluated by the chi-square
test.
Case: Chocolate
A chocolate company wants to understand how three leading brands compare as per
customer perception. The brands are:
1. Nestle
2. Cadbury
3. Amul.
To determine this, they initially conducted preliminary research and identified the variables
that have a significant impact on consumer brand preferences:
A. Price
B. Quality apart from taste.
C. Availability
D. Packaging
E. Taste.
Respondents were asked to provide responses on a numeric Stapel scale. The data extract of the
research is provided below for your analysis.
Data view
SPSS Procedure:
Output:
Q.1 Identify the functions that discriminate consumer brand preferences based on the
independent variables.
Ans:
Q.2 Determine the classification accuracy and the statistical significance of the discriminant
functions.
Ans:
For statistical significance we need to check two things for both F1 and F2: a) Eigenvalue > 1, b) Wilks' lambda <
0.5 (see the tables in the output).
From these tables we can conclude that for both Z1 and Z2 this is an acceptable solution.
Q.3 Determine the major constituent variables of each discriminant function and label each function based
on its dominant characteristics.
Ans: From the structure matrix we can conclude which variables constitute which function.
Q.4 a) Create perceptual maps with the discriminant functions as the axes.
b) Depict the brands on this map based on their centroid scores on each discriminant function.
c) Show the attributes on the same map as factors based on their correlations with each function.
b) To depict the brands on the map we take the coordinates from the 'Functions at group
Centroids' table.
The values in the structure matrix, however, are too small to plot as they are, so we multiply them by the ceiling
of the highest value in the 'Functions at group Centroids' table.
Here the highest value is 2.745, so we take its ceiling, i.e. 3,
and multiply all the values in the structure matrix table by 3.
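The scaling step can be sketched as follows. Only the value 2.745 comes from the notes; the other centroid coordinates, the brand labels and the structure matrix correlations are hypothetical stand-ins for the SPSS output tables, and using the largest absolute value is an assumption.

```python
import math

# Hypothetical 'Functions at group centroids' values; only 2.745 is from the notes.
centroids = {"Brand 1": (2.745, -0.31), "Brand 2": (-1.20, 0.94), "Brand 3": (-1.55, -0.63)}
# Hypothetical structure matrix: correlation of each attribute with (F1, F2).
structure = {"Price": (0.62, -0.10), "Taste": (0.15, 0.71), "Packaging": (-0.08, 0.33)}

# Ceiling of the largest centroid coordinate (by absolute value, an assumption).
scale = math.ceil(max(abs(v) for point in centroids.values() for v in point))

# Stretch every attribute so it is visible on the same axes as the brand centroids.
scaled = {attr: (scale * f1, scale * f2) for attr, (f1, f2) in structure.items()}

print("scale factor:", scale)
for attr, point in scaled.items():
    print(attr, point)
```

Multiplying every attribute by the same factor changes only the length of each attribute's line on the map, not its direction, so the interpretation of the map is unaffected.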
Q.5 Interpret the map and develop the positioning statement for each brand as per customer
perceptions. Justify your answer.
Ans.
A) Nestle: Good availability in comparison to Cadbury.
Nestle is perceived as moderately good in packaging compared to Cadbury.
Perceived as moderately superior in quality compared to Cadbury.
Price is perceived to be low.
In the matter of taste it is perceived to be inferior to Cadbury.
Conjoint Analysis:
“It is a set of technique used to derive the relative importance respondents assign to each
attribute when selecting from among several brands. It also allows an estimate of the best
combination of attributes.”
While creating a new product, if respondents are asked questions about product features, they are
likely to indicate that they want maximum benefit at the lowest price.
e.g. For motorcycles, consumers may indicate a requirement for the fastest, most powerful, most
fuel-efficient, most stylish motorcycle at the lowest cost. Such combinations may be impractical or
unrealistic.
Hence consumers need to make tradeoffs, e.g. the tradeoff between power and mileage, or the tradeoff
between style and price.
Research is conducted for a two-wheeler. These are the parameters on which we
collect the responses.
Out of all possible new product concepts, an orthogonal array of simplified options is created.
This is done as follows:
Each new product concept is described in full, on every attribute being considered. Hence the
respondent is not asked to evaluate each attribute individually in isolation, and so tradeoffs can
be understood.
Respondents are asked to provide a rating or ranking for each new product option. The full
product description is thus evaluated. The analysis determines part-worths and utilities.
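Questionnaire Table
S.N  Power (HP)  Mileage (KMPH)  Price ($)  Style (Yes/No)  Comfort (Yes/No)  Respondent 1 Rating  Respondent 1 Ranking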
1 8 65 60 N Y 50 12
2 8 75 30 Y N 78 7
3 8 85 45 N N 66 9
4 10 65 60 Y N 15 18
5 10 75 30 N Y 72 8
6 10 85 45 N N 32 15
7 12 65 30 N N 90 4
8 12 75 45 Y N 80 6
9 12 85 60 N Y 84 5
10 8 65 45 N N 52 10
11 8 75 60 N N 28 16
12 8 85 30 Y Y 95 1
13 10 65 30 N N 42 13
14 10 75 45 N Y 50 11
15 10 85 60 Y N 20 17
16 12 65 45 Y Y 92 2
17 12 75 60 N N 38 14
18 12 85 30 N N 90 3
Ans: People who have used our bike for at least 1-2 years.
To conduct the research we call all the customers of our bike to the dealer centre for a free
service. First plan the cities; let's say we plan 10 cities. Let the number of people who bought our
bike be 1 lakh. It is assumed that, out of the total, at least 1000 people will come for
servicing.
While the servicing is going on (let's say one bike takes 1 hr to be fully serviced), people have
nothing to do. We arrange a T.V. for them so that they can pass the time. In the
meantime a company representative can approach a customer and ask him to fill in the
questionnaire in the following way:
“Hello sir, I am the senior head in charge of this company. We are developing a new model. Would
you like to be a part of this project? It will cost you nothing.”
If the answer is yes, we give him our questionnaire. This is the best
time to ask questions, because the person is free at that time and able to answer the
questions properly.
Let's say out of 10,000 people, 5000 people say yes. Suppose 2500 answer seriously.
Now the sample collected is 2500 from one city. We have 10 cities.
Method of analysis: dummy-variable linear regression using effect coding. If we omit one
level per attribute, the 13 levels come down to 8 variables; each omitted level is called the dummy variable in the
linear regression.
For the analysis we can take the ratings as they are. We can invert the rankings to cross-check our result.
Treat one level of each attribute as a dummy variable, i.e. omit one level of each attribute from the
regression model. Which level to omit may be decided arbitrarily. Here the omitted levels are:
Power: 10 bhp, Mileage: 75 kmph, Price: 60k, Style: ordinary (no), Comfort: ordinary (no).
(+1): The level of the attribute is present in the new product option being rated/ranked and also in the
regression model (present in both).
(0): The level of the attribute is absent in the new product option but present in the regression
model (present in the regression model, absent in the new product option).
(-1): The level of the attribute is present in the new product option being rated/ranked
but absent in the regression model, i.e. it is the dummy variable (present in the new product
option but absent in the regression model).
Code   New product option   Regression model
+1     ✓                    ✓
0      X                    ✓
-1     ✓                    X
“✓” - Present, X - Absent.
How do we make the effect code table? Take respondent 1's first option from the questionnaire table (new product
option table):
S.N  Power (HP)  Mileage (KMPH)  Price ($)  Style (Yes/No)  Comfort (Yes/No)  Respondent 1 (Rating, Ranking)  Respondent 2 (Rating, Ranking)
1    8           65              60         N               Y                 50, 12
1. Power (8) is present in the new product option and also in the regression model, so X2
scores (1); X1 stands for power 12, which the option does not have, so X1 scores (0).
2. Mileage (65) is present in the new product option and also in the regression model, so
X3 scores (1). X4 stands for 85, and 65 ≠ 85, so X4 scores (0).
3. Price (60k) is a dummy variable, so the score for both X5 and X6 is (-1), because it is present in the
new product option but absent from the regression model for both X5 and X6. The same applies to
style and comfort.
S.N X1 X2 X3 X4 X5 X6 X7 X8
1 0 1 1 0 -1 -1 -1 1
2 0 1 -1 -1 1 0 1 -1
3 0 1 0 1 0 1 -1 -1
4 -1 -1 1 0 -1 -1 1 -1
5 -1 -1 -1 -1 1 0 -1 1
6 -1 -1 0 1 0 1 -1 -1
7 1 0 1 0 1 0 -1 -1
8 1 0 -1 -1 0 1 1 -1
9 1 0 0 1 -1 -1 -1 1
10 0 1 1 0 0 1 -1 -1
11 0 1 -1 -1 -1 -1 -1 -1
12 0 1 0 1 1 0 1 1
13 -1 -1 1 0 1 0 -1 -1
14 -1 -1 -1 -1 0 1 -1 1
15 -1 -1 0 1 -1 -1 1 -1
16 1 0 1 0 0 1 1 1
17 1 0 -1 -1 -1 -1 -1 -1
18 1 0 0 1 1 0 -1 -1
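The coding rules can be sketched as a small function. The column order (X1 = power 12, X2 = power 8, X3 = mileage 65, X4 = mileage 85, X5 = price 30, X6 = price 45, X7 = style yes, X8 = comfort yes) is read off the effect code table above and is an inference, not something the notes state; the names `MODEL_LEVELS`, `OMITTED` and `effect_code` are made up for this illustration.

```python
# Levels kept in the regression model, in the assumed column order X1..X8.
MODEL_LEVELS = [
    ("power", 12), ("power", 8),        # X1, X2 (power 10 omitted)
    ("mileage", 65), ("mileage", 85),   # X3, X4 (mileage 75 omitted)
    ("price", 30), ("price", 45),       # X5, X6 (price 60 omitted)
    ("style", "Y"),                     # X7 (style N omitted)
    ("comfort", "Y"),                   # X8 (comfort N omitted)
]
OMITTED = {"power": 10, "mileage": 75, "price": 60, "style": "N", "comfort": "N"}

def effect_code(option):
    """Return the X1..X8 effect codes for one new product option (a dict)."""
    codes = []
    for attribute, level in MODEL_LEVELS:
        value = option[attribute]
        if value == level:
            codes.append(1)    # level present in the option and in the model
        elif value == OMITTED[attribute]:
            codes.append(-1)   # option carries the omitted (dummy) level
        else:
            codes.append(0)    # option carries a different modelled level
    return codes

option_2 = {"power": 8, "mileage": 75, "price": 30, "style": "Y", "comfort": "N"}
print(effect_code(option_2))  # compare with row 2 of the effect code table
```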
Procedure:
then
Regression
Tick the Labels option, because the first row contains labels. Then click on “OK”.
Select the first two columns of the output; from these we know the regression equation using ratings.
The coefficients in the regression model indicate the impact that a particular level of an attribute
has on the rating or inverted ranking. Hence each coefficient is the part-worth of that level of the attribute.
To determine the part-worth of the dummy variable (the level we left out earlier) we can use either the ratings
or the inverted rankings, and use the other to cross-check.
Calculations:
Logic: A large part-worth, or a large range in part-worths, indicates that the consumer attaches
higher importance to that attribute.
RI(Ai) = Range of part-worths of attribute Ai / Sum of the ranges of part-worths of all attributes
In this case the respondent gives maximum weight to power and price, moderate weight to comfort
and low weightage to style and mileage.
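The regression and importance calculation can be sketched with the standard library alone, using Respondent 1's ratings from the questionnaire table. The notes run the regression in a spreadsheet; solving the normal equations by Gaussian elimination here is a stand-in for that step. The part-worth of each omitted level is taken as minus the sum of the other levels' part-worths (the usual effect-coding property), and importance is each attribute's part-worth range as a share of the total range. All function and variable names are illustrative.

```python
# (power, mileage, price, style, comfort, rating) for Respondent 1, options 1-18.
OPTIONS = [
    (8, 65, 60, "N", "Y", 50), (8, 75, 30, "Y", "N", 78), (8, 85, 45, "N", "N", 66),
    (10, 65, 60, "Y", "N", 15), (10, 75, 30, "N", "Y", 72), (10, 85, 45, "N", "N", 32),
    (12, 65, 30, "N", "N", 90), (12, 75, 45, "Y", "N", 80), (12, 85, 60, "N", "Y", 84),
    (8, 65, 45, "N", "N", 52), (8, 75, 60, "N", "N", 28), (8, 85, 30, "Y", "Y", 95),
    (10, 65, 30, "N", "N", 42), (10, 75, 45, "N", "Y", 50), (10, 85, 60, "Y", "N", 20),
    (12, 65, 45, "Y", "Y", 92), (12, 75, 60, "N", "N", 38), (12, 85, 30, "N", "N", 90),
]
NAMES = ["power", "mileage", "price", "style", "comfort"]
# Per attribute: levels kept in the model first, the omitted (dummy) level last.
LEVELS = [[12, 8, 10], [65, 85, 75], [30, 45, 60], ["Y", "N"], ["Y", "N"]]

def code_row(option):
    """Intercept plus effect codes X1..X8 for one option."""
    row = [1.0]
    for value, levels in zip(option[:5], LEVELS):
        kept, omitted = levels[:-1], levels[-1]
        row += [-1.0 if value == omitted else float(value == lvl) for lvl in kept]
    return row

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

X = [code_row(o) for o in OPTIONS]
y = [float(o[5]) for o in OPTIONS]
k = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)  # beta[0] is the intercept, beta[1:] belong to X1..X8

part_worths, ranges, pos = {}, {}, 1
for name, levels in zip(NAMES, LEVELS):
    coefs = beta[pos:pos + len(levels) - 1]
    pos += len(levels) - 1
    pw = dict(zip(levels, coefs + [-sum(coefs)]))  # omitted level = minus the sum
    part_worths[name] = pw
    ranges[name] = max(pw.values()) - min(pw.values())

importance = {name: 100 * r / sum(ranges.values()) for name, r in ranges.items()}
for name in sorted(importance, key=importance.get, reverse=True):
    print(name, round(importance[name], 1), part_worths[name])
```

Run on these ratings, the sketch should place power and price at the top, comfort in the middle, and mileage and style at the bottom, which is the pattern described above.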
CDV = ∑ U_j × PW_ij
Different sets of U_j and PW_ij give different combinations of attributes. From these combinations we create the one which is
best.
We prepare the top 15 options and hand them to the R&D department. R&D may not be able to make the top 5 options
because they may be unrealistic, but options 5-10 R&D should be able to deliver within the time frame.
This is how conjoint analysis takes place. Note that the data above was filled in by only
one respondent. When there are many respondents, there are two ways to combine them:
1. Average of ratings: This method is easier but technically not so accurate; it is used when
respondents are many, e.g. 5000-10000.
2. Average of part-worths: More tedious but technically very good and accurate; used
when respondents are few.