Mark 3054 Slides

Australian School of Business
Mark 3054: Market Analysis / MR- II
Module 1:
Introduction, Recap and Housekeeping
Rahul Govind
University of Pittsburgh
Carnegie Mellon University
University of Mississippi
Ford Motor Company

Hewlett Packard
J.D. Power
The University of New South Wales
Recap MARK 2052
The Market Research Process
1. Establish need for information
2. Establish information needs
3. Determine research design/data sources
4. Develop data collection procedures
5. Sample design
6. Survey design
7. Collect data
8. Process data
9. Analyze data
10. Write report/present findings
11. Action
Step 1: Establish Need for information
(Why) Do we need to conduct Marketing research?
1) Check out what the market is like! (SWOT analysis: Proactive Research)
Explore a new opportunity
Check for any threats in the market
Identify our strengths
Identify our weaknesses
2) Check out if a strategy makes sense! (Dry run/Concept Testing)
Introducing a new product: will it make money?
Step 1 contd.: Establish Need for Information
3) I see changes. Why do I see changes? (Reactionary Research)
My product used to sell very well, but its sales are now declining!
4) Have things changed from the past? (Tracking Study)
I had a 30% market share last year. What is it now?
5) The law wants me to! (Mandatory Studies) Location based
Insurance and medicine need to check for consumer satisfaction
Step 2: Establish Information Needs
What are the questions that we need to ask (the
consumers)?
Will they give us answers to what we want to know in
step 1?
Show me the numbers!!!
Don't use too much intuition!
What percentage of the time do you think you smell bad amongst friends/co-workers?
What percentage of the time do you think your friends/co-workers smell bad?
You think < 10%
Everybody else thinks 35%
Step 3: Determine Design and Data Sources
Can I get by using less money and time? (consider Ford Motor Company, FMC)
Use Secondary Data!
General Motors had a similar problem and they conducted a survey.
Since GM and FMC are perceived similarly, that data might be used.
Can we just get by observing what is going on around us?
Observational Research
Ask people who come to the showroom and do not buy the product
Observe what people who buy a competing brand (Honda) focus on when they buy the product.
OK, we do need to spend time AND money!
Survey Research
Ask subjects questions that will help us in answering our questions
Are you satisfied with the reliability of Ford cars?
Are you satisfied with the looks of Ford cars?
Secondary and Primary Data
Step 4: Develop data collection procedure

What types of data do we need?
Attitudes: how do people think/feel about our products?
What is the first thing that comes to mind when you think of FMC?
Behaviors: how do people behave in the market?

Which was your last vehicle purchase?
Demographics: company, people, industry

Where do you live?
How many children under 12 do you have in your family?
How should we collect the data? Get it the right way!

Phone/Personal/Mail/E-mail (Surveys)
Secondary research
Observation
Step 5: Sample Design (for survey research)
Who should be interviewed? (Population and Sample)
Want to sell to the entire US. How do I survey EVERYBODY?
Decision maker/influencer
Dad who buys the car (purchaser) or
Child that drives it to school (user)
How should these respondents be contacted?

Telephone interviews (Can respondent visualize color, design etc.?)
Personal interviews (very expensive)
Mail interviews (too slow? Can we wait that long?)
How many people should we interview?

At what level of confidence can we project the results to the
population?
What possible amount of error can we live with?
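The last two questions (confidence level and tolerable error) can be turned into a rough sample-size figure. A minimal sketch, assuming simple random sampling, a yes/no type question, and the textbook formula n = z^2 * p(1 - p) / e^2; the function name and default values are illustrative and do not come from the slides.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence=0.95, margin_of_error=0.05, p=0.5):
    """Respondents needed to estimate a proportion (hypothetical helper).

    Assumes simple random sampling from a large population.
    p = 0.5 is the most conservative guess for the true proportion.
    """
    # Two-sided z value for the chosen confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)

n_needed = sample_size_for_proportion()                        # 95%, +/- 5 points
n_tighter = sample_size_for_proportion(margin_of_error=0.03)   # 95%, +/- 3 points
print(n_needed, n_tighter)
```

Tightening the acceptable error is expensive: going from 5 to 3 percentage points nearly triples the number of interviews in this sketch.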
Step 6: Survey Design
Ask as few questions as possible and in a logical fashion! Humans are lazy (Cognitive Misers)
Keep it simple and stupid (KISS)
Use language that is convenient and comfortable for the respondent
Speak in the respondent's language
Would you consider yourself a turophile? (Actual survey question by Kraft)
Ask the hard questions first and easy ones last.
Humans are not only lazy, but they also get tired easily (Survey Fatigue)
Step 7/8: Collect and Process Data
Interview quality control procedures
Interviewer quality
Make sure that you haven't hired a lazy guy
Confirm that he is doing his job
Data processing quality
Data cleaning procedures
Qn 6 asked: Do you prefer an automatic transmission?
Respondent did not answer
TRASH
Logic checks
Qn 5 asked: Do you own a car? Respondent said No.
Qn 6 asked: Is it an American car? Respondent said Yes.
TRASH
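The two checks above (a missing answer and a logically impossible pair of answers) can be automated. A minimal sketch on made-up records; the field names and the `problems` helper are hypothetical. Whether a flagged questionnaire is discarded entirely or only the offending item is dropped remains a judgment call.

```python
# Hypothetical survey records; None marks a question left unanswered.
responses = [
    {"id": 1, "owns_car": "Yes", "american_car": "Yes", "prefers_automatic": "Yes"},
    {"id": 2, "owns_car": "Yes", "american_car": "No", "prefers_automatic": None},
    {"id": 3, "owns_car": "No", "american_car": "Yes", "prefers_automatic": "No"},
    {"id": 4, "owns_car": "No", "american_car": None, "prefers_automatic": "No"},
]

def problems(record):
    """Return the list of quality problems found in one record."""
    found = []
    # Data cleaning: the question was skipped
    if record["prefers_automatic"] is None:
        found.append("missing answer: prefers_automatic")
    # Logic check: a respondent without a car cannot describe "their" car
    if record["owns_car"] == "No" and record["american_car"] is not None:
        found.append("inconsistent: no car, but answered american_car")
    return found

flagged = {r["id"]: problems(r) for r in responses if problems(r)}
clean = [r for r in responses if not problems(r)]

for respondent_id, issues in flagged.items():
    print(respondent_id, issues)
```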
Step 9: Analyze Data
Does the data tell you anything without using math?
Go back and review information needs
Rural people like FMC better
Everybody hates our reliability
Visualize results: a picture says a 1000 words
Think, use your brains!
Now use math techniques
Cross-tabulations
Multivariate analysis (regression, semantic scales, conjoint)
More complex designs and analyses
Don't make the analysis too simple, but don't go crazy with math either
Step 10/11: Write a Report/Take Action
Writing reports
User friendly reports
Bullets and dashes format
A picture is worth a 1000 words
BLOT strategy (bottom line on top)
Presenting findings
Be sensitive to the audience's level of knowledge
Be a manager, not a statistician
Sometimes numbers don't mean a thing! But most of the time nothing means anything without numbers
Action
Design changes in the marketing mix
Go back and see if you need more data to make better decisions
What is Market Analysis?
It is NOT a course in statistical formulae
The focus is on understanding the bases of the techniques
It is NOT about plugging numbers into formulae
It is about gaining hands-on experience using the analytical software SPSS
It is about how output can be used: interpretation and communication
Outcomes of this course
Use SPSS to analyze a variety of data typically collected by marketers.
Explain when and how a range of statistical techniques may be applied to marketing situations.
Translate the output from statistical analyses into a language that is understandable to marketing managers.
Competently and confidently communicate (verbally and in writing) the true meaning of statistical output.
Adequately self-reflect and self-assess behavior in teamwork situations.
But most importantly

Predict and hone your needs for answering certain
business questions
Visualize numbers
And both of these translate to what we call....
the ability to think
The two stories of Marketing
"Build a better mousetrap and the world will beat a path to your door" - Ralph Waldo Emerson
"Before we build a mousetrap, we should check if there are any mice out there" - Yogi Berra
Why do new products fail?
Begin 3054
Statistical Analyses
Used to find out
What people have in common
How they differ
Predict how they will act in the future
(Burns & Bush 2000)
Types of statistical analyses
Descriptive Analysis (who are they?)
Used to describe the data to reveal the general pattern of responses
To portray the typical respondent
What are the demographics of the sample?
What percentage of people have purchased a new car in the last 2 years?
Are people satisfied with their bank (if not, which ones)?
What cause-related products do people buy?

Inferential Analysis
To draw conclusions, based on the current sample, about
1. the population
2. previous samples
3. future samples
Is the satisfaction level of current students the same as that of students in past years?
Differences Analyses
To determine if differences exist between groups
Is there a difference between males and females in what they want from a holiday destination?
Is there a difference between local and international students in their motivation to purchase cause-related products?

Associative Analyses
To determine the strength and direction of relationships between 2 or more variables
Is there an association between the importance of the quality of a car's interior fittings and the importance of comfort?
Are certain types of students more satisfied with their degree choices than others?
What are the broad types of motivations towards buying products? Can you classify people based on these broad types? If so, what are these classifications?
Predictive Analyses
Allow forecasts of future events
Estimate the level of Y given the amount of X
Does the provision of certain characteristics help predict the likelihood of purchasing a product?
Which features of a holiday resort have the greatest impact on satisfaction?
Within each type of analysis there is a range of techniques:
Descriptive Analysis
E.g., means, medians, frequencies, standard deviations
Inferential and Difference Analysis
E.g., t-tests, ANOVA
Associative Analysis
E.g., correlation, crosstabs with chi-square, factor analysis, cluster analysis
Predictive Analysis
E.g., regression
How do you know what method to use?
To answer this you need to:
Identify the variables of interest
How many variables are there?
Determine whether you are dividing the variables of interest into dependent and independent variables
Determine the scale of each of these variables, i.e. the type of data
Choice of Technique: one variable
(Kinnear & Taylor 1991; Churchill 1999)
Single Analysis Variable

Nominal Data     Ordinal Data              Scale(d) Data
1. Mode          1. Median                 1. Mean
2. Frequency     2. Inter-quartile range   2. St dev
3. Chi-square                              3. z test
                                           4. t test
Choice of Technique: two or more variables
(Kinnear & Taylor 1991; Churchill 1999)
[Figure: decision tree. First question: do you have dependent (DV) and independent (IV) variables? If yes, branch on the scale of the DV (nominal, ordinal or scaled) and then on the scale of the IV (N/O or S). If no, branch on the scale of the variables (N, O or S). Techniques named in the tree: chi-square, rank correlation, paired t-test, independent t-test, correlation, ANOVA, dummy variables, regression, discriminant analysis, conjoint, factor analysis, cluster analysis, MDS.]
Overview of the Stages of Data Analysis
Editing
Coding
Data Entry
Data Analysis: descriptive analysis, univariate and multivariate analysis
Interpretation
Some Editing issues
Preliminary questionnaire screening
Checks of Completed Questionnaires

Unsystematic/Systematic
What to Look For in Questionnaire Inspection

Incomplete Questionnaires
Non-responses to Specific Questions/Item Omissions
Yea- or Nay-Saying Patterns
Middle-of-the-Road Patterns
Unreliable Responses
Coding
Aim of coding:
Retain as much information in the data file as on the hard
copies of the questionnaire
Facilitate data analysis
Coding and Data Entry: some issues
How often do you visit a dentist? (Tick ONE box only)
Every 6 months
Every year
Every 2 years
Only once in the last 5 years
CODE: One variable, coded as 1, 2, 3, 4
Why have you not used the Optometry Clinic? (May tick more than one option)
Too far from where I live
Don't have time
Didn't know there was one
Don't trust student examinations
CODE: 4 variables (one for each alternative), each coded as 0/1
What is the likelihood you would go to a show at the Opera House within the next year?
Very Unlikely | Unlikely | Neither likely nor unlikely | Likely | Very Likely
CODE: One variable, coded as one of
0, 1, 2, 3, 4
-2, -1, 0, 1, 2
1, 2, 3, 4, 5
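The three coding rules (one variable for a tick-one question, one 0/1 variable per alternative for a tick-many question, one variable for a rating scale) can be sketched as a small coding routine. The variable names and the `code_questionnaire` helper are hypothetical; the 1 to 5 scheme is just one of the three options listed for the rating scale.

```python
# Single-response question: one variable, one code per alternative.
DENTIST_CODES = {
    "Every 6 months": 1,
    "Every year": 2,
    "Every 2 years": 3,
    "Only once in the last 5 years": 4,
}

# Multiple-response question: one 0/1 variable per alternative.
CLINIC_REASONS = [
    "Too far from where I live",
    "Don't have time",
    "Didn't know there was one",
    "Don't trust student examinations",
]

# Rating scale: one variable, here coded 1..5.
LIKELIHOOD_CODES = {
    "Very Unlikely": 1,
    "Unlikely": 2,
    "Neither likely nor unlikely": 3,
    "Likely": 4,
    "Very Likely": 5,
}

def code_questionnaire(dentist, clinic_ticks, likelihood):
    """Turn one respondent's ticks into a flat row of numeric codes."""
    row = {"dentist": DENTIST_CODES[dentist]}
    for i, reason in enumerate(CLINIC_REASONS, start=1):
        row[f"clinic_reason_{i}"] = 1 if reason in clinic_ticks else 0
    row["opera_likelihood"] = LIKELIHOOD_CODES[likelihood]
    return row

row = code_questionnaire(
    dentist="Every year",
    clinic_ticks={"Don't have time", "Didn't know there was one"},
    likelihood="Likely",
)
print(row)
```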
Measurement:
What are the various types of data?
Examples
Nominal: What is your gender?
Male / Female
Ordinal: What is your age?
<18yrs / 18-29yrs / >29yrs
Interval/Scale: How satisfied are you?
V unsat / Unsat / Neither / Sat / V sat
Mark 3054: Market Analysis / MR- II
Module 2:
Data Preparation &
Customer Profiling
Outline
Getting to Know your data

Starting to build a profile of your sample
Cross tabulation and Chi-square
Question..
Once you have the data coded and entered in a
data set, what do you do first?
Dataset for analysis
Objective of that research:
To understand how satisfied their customers were with the company, and customers' perceptions of the company's performance.
Objectives
The research objective can be broken down into sub-objectives, for example:
To characterise customers.
To understand what the customers want from a customer service company.
To understand the performance of the regions and whether they differ in any way.
To evaluate the company's performance.
To identify the factors that impact on customers' satisfaction.
In the following weeks we will cover a range of techniques which will help us obtain this insight.
Getting to Know
the Data
Initial examination of the data

Data Reduction
Getting to know your data
Frequencies
Graphical examination
Measures of central tendency
Measures of dispersion
These help to:
Detect outliers
Detect errors
Test assumptions
Frequencies
What are they?
Type of data used?
Common uses of frequency tables
Data cleaning:
determine the degree of non-response
locate blunders
locate outliers
Determine empirical distribution of a variable
relates to graphing
Calculate summary statistics
Example: Customer service data
From the frequency table:
4 (2%) people did not answer the question (coded as missing)
sample size for the question (n) was 196 (200 - 4)
19% of respondents are from Region 6
There are 3 errors
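A frequency table like the one summarised above can be produced outside SPSS as well. A minimal sketch on made-up region codes, constructed only so that the totals mirror the figures quoted (200 questionnaires, 4 missing, 3 invalid codes, 19% in Region 6); the individual counts for Regions 1 to 5 are invented.

```python
from collections import Counter

# Hypothetical region codes for 200 questionnaires: valid codes are 1-6,
# None means the question was not answered, anything else is a keying error.
VALID_REGIONS = {1, 2, 3, 4, 5, 6}
regions = (
    [1] * 30 + [2] * 35 + [3] * 30 + [4] * 30 + [5] * 30 + [6] * 38
    + [7, 9, 66]
    + [None] * 4
)

counts = Counter(regions)
n_total = len(regions)
n_missing = counts[None]
n_answered = n_total - n_missing
errors = sorted(c for c in counts if c is not None and c not in VALID_REGIONS)

# Print the frequency table, flagging codes that are not valid regions
for code in sorted(c for c in counts if c is not None):
    share = 100 * counts[code] / n_total
    flag = "" if code in VALID_REGIONS else "  <- check against the questionnaire"
    print(f"Region {code:>2}: {counts[code]:>3} ({share:.1f}%){flag}")
print(f"Missing  : {n_missing:>3} ({100 * n_missing / n_total:.1f}%)")
```

Each flagged code would be traced back to the paper questionnaire, corrected, and the table re-run.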
How to use this information
If there is a coding error, correct the error and re-run the frequency to obtain the correct percentages
Regroup the categories, if appropriate
WHY?
Graphical Examination
Highlights:
the nature of the variable - the shape of the distribution
relationships between variables
unusual values
SPSS examples: What can you say from these
graphs?
bar chart and histogram
Assumptions
Common Assumptions for tests
Normality: kurtosis and skewness
Linearity
Equal Variance
Data Reduction
As you get to know your data, you start the process
of data reduction
Why?
Summarise
Communicate
To help summarise and communicate:
MEASURES OF CENTRAL TENDENCY
Mode: the value that occurs most frequently in a data set or a probability distribution
Median: the numerical value separating the higher half of a sample, a population, or a probability distribution from the lower half
Mean: the arithmetic mean,
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
To help summarise and communicate:
MEASURES OF VARIABILITY/DISPERSION
Frequency Distribution
Range
Mean Absolute Deviation
Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Measures of Variability
Standard deviation
How much do the responses vary?
Do most respondents answer the same?
Low variation in responses = low variance in opinion
Do survey participants respond all over the scale?
Represents the typical difference of any one value from
the mean
Standard Deviation
It shows how much variation or "dispersion" there is
from the average (mean, or expected value). A low
standard deviation indicates that the data points tend
to be very close to the mean, whereas high standard
deviation indicates that the data are spread out over
a large range of values.
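The measures of central tendency and the standard deviation can be checked on a handful of values. A minimal sketch using Python's standard `statistics` module on hypothetical answers to one 7-point rating question; the standard deviation is also written out by hand with the n - 1 denominator.

```python
import statistics

# Hypothetical answers to one 7-point rating question
ratings = [3, 7, 2, 5, 6, 4, 2]

mode = statistics.mode(ratings)        # most frequent value
median = statistics.median(ratings)    # middle value once sorted
mean = statistics.mean(ratings)        # sum of the values divided by n
sd = statistics.stdev(ratings)         # sample standard deviation (n - 1)

# The same standard deviation written out from the formula
n = len(ratings)
sd_by_hand = (sum((x - mean) ** 2 for x in ratings) / (n - 1)) ** 0.5

print(mode, median, round(mean, 2), round(sd, 2))
```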
If I don't have SD
Further investigation of s.d.

Investigate the variability in responses across
questions
Compare the s.d. for importance of helpful staff with that for importance of a courteous local representative
2.778 vs 2.063
What does this imply?
Other measures
Skewness: how much a distribution of responses is skewed to the left or to the right
Kurtosis: how peaked or flat the distribution of responses is
These, as well as the other descriptive measures, can be calculated in SPSS
(e.g. Analyse/Descriptive Statistics/Descriptives)
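To see what these two measures react to, they can be computed from the moments of the data. A minimal sketch using the plain moment-based definitions, with no small-sample correction; statistical packages typically apply such corrections, so their figures may differ slightly from these. The function name and the data are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3              # 0 for a normal distribution
    return skew, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]            # evenly spread, flat
right_tailed = [1, 1, 1, 2, 2, 3, 7]   # a few high answers pull the tail right

skew_sym, kurt_sym = skewness_and_kurtosis(symmetric)
skew_right, kurt_right = skewness_and_kurtosis(right_tailed)
print(skew_sym, kurt_sym, skew_right, kurt_right)
```

A positive skewness indicates a longer right tail; a negative excess kurtosis indicates a flatter (platykurtic) shape than the normal distribution.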
Leptokurtic vs Platykurtic
Information gathered from getting to know your data helps you begin to understand (profile or characterise) your customer. For example:
How they answered questions on average
Range of responses
Frequency of different alternatives
So what do we now know about our customer service data?
Can gain a description of who is in our sample (age, gender etc.)
Describe the average and pattern of response to key variables
So far ...
Started to summarise the data: gained some facts about respondents
E.g., average response, variability in response, commonly used categories etc.
But it also starts to raise more questions:
E.g. Do all the respondents have the same views? Do different groups have varying views? Are there differences between males and females, or between the age groups, in what is important? Or in their assessment of the company's performance?
For this we need more techniques!
Gaining Further Insight
Cross tabulations
Cross tabulations
Extend the frequency table to 2 variables
Variables are nominal / ordinal
Count the number of observations in each possible sub-group or cell
Need to be on the lookout for too many cells with very small counts (<5): this reduces reliability
Analyse/Descriptive Statistics/Crosstabs
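The mechanics of a crosstab are just counting pairs of answers. A minimal sketch on hypothetical (age group, transaction type) pairs; the counts are invented, and the dictionary layout is one possible choice rather than the SPSS output format.

```python
from collections import Counter

# Hypothetical (age group, transaction type) pairs, one per respondent
answers = (
    [("60 and under", "Private")] * 51 + [("60 and under", "Business")] * 49
    + [("Over 60", "Private")] * 35 + [("Over 60", "Business")] * 15
)

cell_counts = Counter(answers)                       # one count per cell
row_totals = Counter(age for age, _ in answers)      # respondents per age group

# Row percentages: share of each age group using each transaction type
row_percent = {
    (age, kind): 100 * count / row_totals[age]
    for (age, kind), count in cell_counts.items()
}

# Cells with very small counts make the table less reliable
small_cells = [cell for cell, count in cell_counts.items() if count < 5]

for cell in sorted(cell_counts):
    print(cell, cell_counts[cell], f"{row_percent[cell]:.0f}%")
```

Percentages within each row make groups of different sizes comparable, which raw counts do not.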
Let's follow up on a question:
The overarching questions here:
How can we characterise customers? What can we find out about them? Are they the same or not?
This can be investigated from different angles. Let's look at one, age:
Is there an association between age groups and the type of transaction they undertake?
OR
Are younger age groups less likely than older groups to use the company for private transactions?
Questions from the questionnaire:
Age group
17-30, 30-40, 40-50, 50-60 and > 60
Type of transaction
Private
Business
Interpreting the output:
Look at percentages (not frequencies) to gain an understanding
Direction of percentages?
From the crosstab we might say:
More people in the older groups use the company for private transactions than in the younger ones.
70% (>60) vs approx 51% for younger groups
Conclusion?
There appears to be some association between age and type of transaction. However...
Is this tendency, or association, significant?
That is, is there enough evidence to conclude that the age groups differ in the type of transaction they use the company for?
If so, how strong is that evidence?
Implications
Need a benchmark or bar to help us decide when there is enough evidence to say "This situation could not have occurred by chance".
Hypothesis Testing
From past experience we may have some
assumptions about the relationships between
variables or on how consumers may rate certain
aspects of our products
Aim: To examine whether a particular proposition
(hypothesis) concerning the population is likely to
hold
Is there enough evidence from the sample to
reject the hypothesis?
If not, we say that from this sample we cannot reject the
hypothesis - we do not say that we accept it!
Hypothesis testing: creating a benchmark
From our sample we can calculate a test statistic to
help us test our hypothesis.
This test statistic can take on a range of values,
depending on the sample we have drawn from the
population.
All these values together give you the distribution for
the statistic.
Values in the tails are possible if our hypothesis is true, but very unlikely! Therefore, if the value of our test statistic falls in these regions, we reject our hypothesis
The cut-off points for the tails are usually defined such that the probability of getting a value this extreme or more extreme, if the hypothesis is true, is 0.05.
This 0.05 is known as the significance level (the probability of rejecting the hypothesis when it is actually true), and the cut-off point is the critical value
Therefore, we have established our benchmark or bar to judge if the evidence from the sample is sufficient to say that there is a statistical relationship
I.e., for a particular test statistic, if the p value (the probability of obtaining a test statistic at least this extreme when the null hypothesis is true) is < 0.05, there IS evidence to REJECT the null hypothesis.
Let's now return to our crosstabs
How is this knowledge translated to the current situation?
Chi-square Test
Statistical test (and test statistic) used in
connection with cross-tabulations
Tests the presence of an association between 2

nominal or ordinal variables
Based on the frequencies i.e. counts
Chi-Square cont.
Compares what you have observed (from your sample) with what you would expect if there were no relationship between the variables
If the difference between the observed and expected frequencies is too large, then you have evidence to reject the null hypothesis
Is this association significant?
Hypothesis test: the null hypothesis is that the variables are independent, i.e., there is no association between them.
Null: There is no association between age and type of transaction.
Alternative: There is an association between age and type of transaction.
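The comparison of observed with expected counts can be written out directly. A minimal sketch of the Pearson chi-square statistic on a hypothetical 2 x 2 table (age collapsed into two groups); the counts are invented, the function name is illustrative, and the critical values are the standard 0.05-level figures from a chi-square table.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count we would expect in this cell if there were no association
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Critical values of the chi-square distribution at the 0.05 level
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Hypothetical counts: rows = 60 and under / over 60, columns = Private / Business
observed = [[51, 49], [35, 15]]
statistic, df = chi_square(observed)
reject_null = statistic > CRITICAL_05[df]
print(round(statistic, 2), df, reject_null)
```

With these invented counts the statistic exceeds the critical value, so the null hypothesis of no association would be rejected at the 0.05 level; real data may of course lead to the opposite decision.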
A small exercise to try at home
Level Easy: Divide the population based on gender
Conduct chi-square tests to see if there is a difference in their preference
Level Medium: Divide the population into two groups based on age
Age >60
Age <60
Main Points from this module.
Examination of your data and data reduction help you, the analyst, to get a feel for what the data is about. A range of simple techniques helps with this task.
Cross tabulation, with an associated Chi-square test, allows you to find whether the association between the two (or more) variables is significant.
i.e. Chi-square tests for the significance of the cross tab.
Profiling the Customer:
Gaining Further Understanding of the Target Market
t-Tests
One Sample, Independent & Paired
Outline
Extending to techniques that allow us to test
assumptions about the data and differences between
groups of respondents.
One sample t-test
Independent t-test
Paired t-test
Key understanding from today:
So, from the techniques we cover today and next week, we can:
Understand our target market more
Understand similarity of thoughts, behaviours etc. between different basic groups within our target market,
and hence start to understand whether there are patterns in the data
Are men and women similar in their liking for _______?
Do older people like ______ more than younger ones?
Do people in ____ spend more time travelling than the ones in _____?
Going from the sample to the population
From the survey (i.e. the sample) we obtain descriptive statistics
However, we often need more information: we need to extend our findings from the sample to gain an understanding of the population
How is this done?
Sample vs Population

Sample (statistics)        Population (parameters)
mean $\bar{x}$             mean $\mu$
st dev $s$                 st dev $\sigma$
percentage $p$             percentage $\pi$
slope $b$                  slope $\beta$
Using our sample we can ...

Estimate a Parameter
How well does the sample information reflect the true
population?
Uses the sample information to compute an interval
which describes the range of the parameter
Test a Hypothesis
Does the sample reflect (a manager's) prior belief about
the population?
Uses information/evidence from the sample to infer about
the population parameter
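Estimating a parameter with an interval can be illustrated for a percentage. A minimal sketch, assuming simple random sampling and the usual normal approximation for a proportion; the function name and the sample figures are hypothetical.

```python
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n                                    # sample percentage
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * (p * (1 - p) / n) ** 0.5                # z times standard error
    return p - margin, p + margin

# Hypothetical sample: 120 of 400 respondents bought our brand
low, high = proportion_interval(120, 400)
print(f"Sample share 30%, 95% interval {low:.1%} to {high:.1%}")
```

The interval describes the range in which the population percentage plausibly lies, given the sample; a larger sample narrows it.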
Hypothesis Testing
From past experience we may have some
assumptions about the relationships between
variables or on how consumers may rate certain
aspects of our products
Aim: To examine whether a particular proposition

(hypothesis) concerning the population is likely to
hold
A Sampling Distribution: 2-tail test, $\alpha = 0.05$
[Figure: sampling distribution centred at 0, with a rejection region of $\alpha/2 = 0.025$ in each tail]
Developing Hypotheses
Null Hypothesis
Statement that asserts the status quo.
The one always tested by statisticians & market researchers.
Alternative Hypothesis
Statement that is the opposite of the null hypothesis.
Difference is not simply due to random error.
Let's put this in context
All techniques are conducted to help find out information on our questions about the data.
Currently, we are aiming to understand the data: how respondents answered the questions, their views etc.
So, let's look at one of the questions in the data: views on performance of staff
Example:
Past evidence suggests that on average, people are indifferent to using the car as a source of enjoyment (i.e. they always answer in the middle of a scale: neither agree nor disagree).
However, we think that this may not be so (our hypothesis).
Example:
Null hypothesis:
H0: $\mu = 4$
i.e. On average, customers neither agree nor disagree that the car is a major source of entertainment
Alternative hypothesis:
H1: $\mu \neq 4$
i.e. On average, customers agree or disagree that the car is a major source of entertainment
Example
From the SPSS output, we found
sample size: 154
sample mean: 3.84
test statistic (t value): -1.12
significance level: 5% (0.05)
probability of obtaining this test statistic or a more extreme value (sig (2-tailed), the p value): 0.264
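The t value in such output is the distance between the sample mean and the hypothesised mean, measured in standard errors: t = (sample mean - hypothesised mean) / (s / sqrt(n)). A minimal sketch on hypothetical answers (not the course data set); the critical value is the standard two-tailed 0.05 figure for 8 degrees of freedom from a t table.

```python
import math
import statistics

def one_sample_t(values, hypothesised_mean):
    """t statistic and degrees of freedom for H0: population mean = hypothesised_mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - hypothesised_mean) / standard_error
    return t, n - 1

# Hypothetical answers on a 7-point scale whose midpoint is 4
answers = [2, 3, 4, 2, 3, 4, 2, 3, 4]
t_value, df = one_sample_t(answers, hypothesised_mean=4)

# Two-tailed critical value of t for df = 8 at the 0.05 level (from a t table)
T_CRITICAL_DF8 = 2.306
reject_null = abs(t_value) > T_CRITICAL_DF8
print(round(t_value, 2), df, reject_null)
```

Comparing |t| with the critical value is equivalent to comparing the p value with 0.05, which is what SPSS reports.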
(Student's) t-distribution
General Interpretation
Reject H0 if:
test statistic is not in the allowable range, i.e. the p value is less than 0.05
If you cannot reject H0 (as here, p = 0.264), this means:
based on the sample there is NOT enough evidence to indicate that the population parameter differs from the hypothesised (in H0) value
In other words, the sample is consistent with the population parameter being equal to the hypothesised value (but we do not say that we accept H0)
Example
From the SPSS output, we found
sample size: 155
sample mean: 3.35
test statistic (t value): -4.02
significance level: 5% (0.05)
probability of obtaining this test statistic or a more extreme value (sig (2-tailed), the p value): 0.00
General Interpretation
Reject H0 if :
test statistic is not in the allowable range i.e. p value is
less than 0.05
If you reject H0 this means:
based on the sample there IS evidence to indicate that
the population parameter is not equal to the
hypothesised value
In other words, the sample indicates that the population parameter is unlikely to be equal to the hypothesised value.
Implications from our example

Since our test statistic is not in the allowable
range, we Reject H0
There is evidence which indicates that for
average customers buying a luxury car is about
being good to oneself.
How do we know this?
But do we stop here with our question?
We now have some understanding of people's views on the performance of the company, but do all groups of people think the same?
Further Understanding of the Target Market
What we have already seen:
do respondents care about a certain attribute (different from the mean?)
Not necessarily the mean
The next question: How do sub-groups in the target market differ? A very important question for the STP process and the first step in marketing strategy.
Sex
Education
Income
Some Techniques for Understanding the target market
No. of groups/samples?
2 groups: are the groups/samples related?
No: Independent t test (two groups of responses that are tested as though they may come from different populations)
Yes: Paired t test (two groups of responses that originated from the same population (people))
More than 2 groups: ANOVA (more than 2 groups of responses)
Independent samples t test

Examines differences between 2 groups
E.g. differences in behaviour, attitude
Assumes samples are independent, i.e. a respondent cannot belong to both groups
Uses the information in the samples to test whether
the 2 populations are distinct or not -
Is there a difference in the population averages?
Independent samples t test (2)
Variables required:
One to describe the groups (e.g., gender, usage)
Needs to be nominal (preferably)
One variable in which you are interested to see if there are differences
Needs to be scale (preferably)
Question ... Hypothesis
So, continuing with the overarching question (what are people's views of the company):
Are there differences between males and females in their thoughts on issue number ________?
Break down into more specific ANALYSIS questions:
Enables you to look at the broader question from different angles
Do males and females differ in how important comfort in a car is?
Do males and females differ in the weight they place on their family's opinion?
Let's follow this through.
Analysis Question:
Do males and females differ in the weight they place on their family's opinion?
Need to translate each Analysis question into a Hypothesis
First... Look at the questionnaire
Q: What is your sex?
Male/Female
When buying a new luxury car, my family's opinion is very important to me. (Issue 12 on Survey)
Scale: Strongly disagree to Strongly agree
Samples distinct or not?
Developing the Hypothesis
First step in testing hypotheses is to develop the
hypotheses to be tested.
Hypotheses are developed prior to the collection of

data & are part of a research plan.
Hypotheses allow a researcher to make comparisons

between two groups of respondents and to determine if
there are important differences between the groups.
Null hypothesis
No difference between the groups in terms of their average value
No difference between the population parameters
i.e. $\mu_1 - \mu_2 = 0$ OR $\mu_1 = \mu_2$
In other words, for our example:
H0: Males and females do not differ in the importance of their family's opinion
Alternative Hypothesis
There is a difference between the groups in terms of their average value
There is a difference between the population parameters
i.e. $\mu_1 - \mu_2 \neq 0$ OR $\mu_1 \neq \mu_2$
H1: Males and females DO differ in the importance of their family's opinion
Understanding the results

By looking at the sample means you may see a
difference
Is this difference significant (ie is it statistically
meaningful)?
Large enough to say that the average value of these 2
groups is different
Implying that the 2 populations are distinct.
Depends on sample size and confidence level
Output from Example
From the SPSS output we get
sample sizes: 56 and 87
variances equal or not? Equal (p=0.09)
means: 3.66 and 3.34
t value (test statistic): -1.04
p value (sig value): 0.3
Decision: Cannot Reject H0
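When the variances can be treated as equal, the t value for two independent groups is the difference in means divided by a standard error built from the pooled variance. A minimal sketch on hypothetical ratings (not the course data set); the function name is illustrative and the critical value is the standard two-tailed 0.05 figure for 8 degrees of freedom from a t table.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical ratings from two separate groups of respondents
males = [3, 4, 5, 4, 4]
females = [2, 3, 3, 2, 5]
t_value, df = independent_t(males, females)

T_CRITICAL_DF8 = 2.306   # two-tailed, 0.05 level, df = 8 (from a t table)
reject_null = abs(t_value) > T_CRITICAL_DF8
print(round(t_value, 2), df, reject_null)
```

Here the sample means differ by a full scale point, yet with only five respondents per group the difference is not significant, which illustrates why the decision depends on sample size as well as on the size of the difference.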
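Interpretation (1)
For the family's opinion example (p = 0.3) we cannot reject H0, so no difference between males and females can be claimed.
When the p value is below 0.05 (as for the "source of fun" statement, p = 0.009), there is enough evidence from the sample to reject the null hypothesis
Males and females do differ in their average opinion on a car being a source of fun.
Go back to the sample descriptives to describe what the difference is, i.e. which group, on average, agrees more strongly?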
Interpretation (2)
There is a significant difference between males and females in their opinion of a car being a source of fun and excitement. Females tend to agree more strongly with this statement compared to males (average 4.27 compared to 3.5, p=0.009)
Paired t test
Examines differences between 2 groups of
responses
One set of people, 2 sets of answers
before and after experiment
same respondent answering 2 questions
Check on consistency of answers
Matching of questions
(eg difference in importance and performance of a
range of attributes)
Assumes some connection between the questions
Overarching question: what do customers want in a service company?
What attributes are important?
Analysis questions:
Are all attributes equally important?
What are customers' views on the importance of pleasant interiors and the importance of staff being helpful?
Analysis question -> Hypothesis
Question to investigate
Analysis question: Is there a difference in how
customers rate the importance of speed and
mileage?
1 sample, 2 sets of responses
Both variables are scale (preferably)
First... go back to the data
Q: On the following scale, please indicate your view:
(1 = Critical Importance, 7 = Minor Importance)
Car Attribute Safety 1 2 3 4 5 6 7
Car Attribute Mileage 1 2 3 4 5 6 7
Null Hypothesis
No difference in the responses to the 2 questions
Mean of the differences is 0
i.e. $\mu_{diff} = 0$
H0: On average, there is no difference in the way customers rate the importance of speed and mileage.
Alternative hypothesis
There is a difference in the responses to the 2
questions
Mean of the differences is not 0
i.e. $\mu_{\text{diff}} \neq 0$
HA: On average, there is a difference in the way
customers rate the importance of speed and mileage.
Understanding the results
Look at the mean of the differences (i.e. the
difference between a respondent's answer to QA and
their answer to QB).
Is this mean significantly different from 0?
Variable used in the
paired t-test
ID Speed Mileage Diff
1 3 4 -1
2 7 7 0
3 2 5 -3
4 5 5 0
5 6 7 -1
6 4 4 0
7 2 7 -5
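The Diff column above is all the paired t-test needs. As a minimal sketch (not the SPSS procedure itself), the statistic can be computed by hand from the seven rows of the table; the function name `paired_t` is only illustrative.

```python
import math
import statistics

# Ratings from the seven respondents in the table above.
speed = [3, 7, 2, 5, 6, 4, 2]
mileage = [4, 7, 5, 5, 7, 4, 7]

def paired_t(a, b):
    """Return (mean difference, t statistic, degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]          # the Diff column
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / se, n - 1

mean_diff, t_stat, df = paired_t(speed, mileage)
print(f"mean diff = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

The p value would then be read from a t distribution with `df` degrees of freedom, which is what SPSS reports as the sig value.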
First ...
Look at sample descriptives to gain an
understanding of
what to expect, e.g. does the difference appear large or
small?
How to interpret, e.g. what variable may be rated higher
(or lower)
Output from the Example
From the SPSS output we get:
sample size: 154
Correlation: NS
Mean of the differences: -0.26
t value: -0.181
p value (sig value): 0.856
Decision: Cannot Reject H0
Interpretation
With p = 0.856, there is not enough evidence from the
sample to reject the null hypothesis
We cannot conclude that there is a difference in the way
people rate the importance of the two attributes
When a result is significant, always check: is it a
meaningful difference?
Is the difference large enough to act upon, or is it just
significant due to a large sample size?
The three tests from this module
One sample t-test:
Tests assumption about the mean of a variable
Independent sample t test:
tests for a difference between means of 2 unrelated
samples
Paired t test:
tests whether there is a difference in the way one sample
has answered 2 questions.
ANOVA Setup and Analysis
Examining Multiple Groups
Understanding sub-groups within your target market
In the previous module we looked at the target
market in terms of 2 distinct groups
What if there are more than 2 distinct groups of
people?
Do a number of 2-group tests? No:
this leads to an increase in the overall error (significance level)
Use ANOVA (Analysis of Variance)
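To see why a series of 2-group tests inflates the overall error, here is a rough sketch. It assumes every pairwise test is run at the 0.05 level and, as a simplification, that the tests are independent; the function name is hypothetical.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise tests.

    Simplifying assumption: the pairwise tests are independent.
    """
    n_tests = comb(n_groups, 2)            # number of distinct pairs of groups
    return n_tests, 1 - (1 - alpha) ** n_tests

n_tests, fwe = familywise_error(4)         # e.g. four age groups
print(f"{n_tests} tests, overall error about {fwe:.3f}")
```

With four groups there are six pairwise tests, and the overall chance of a false positive is well above the nominal 5%.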
Example:
Broad question:
Let us analyse, one at a time, what attributes are
important to different types of customers?
Analysis Question: Does the importance of
various car attributes differ according to age?
Age recoded into 4 groups or constant at 5:
17-35yrs
36-45yrs
46-55yrs
55yrs+
One-Way ANOVA
Let's take one attribute
Importance of gas mileage
rated from 1 (critical importance) to 7 (minor importance)
Therefore our question is now:
Does the average rating of importance of
mileage differ by age?
One-Way ANOVA
Only one categorical variable (a single factor)
Several levels (categories) for that factor
The typical hypothesis tested through ANOVA is that the
factor does not explain differences in the dependent
variable (i.e. the group means are equal, as in t-tests)
Apart from the tested factor(s), the groups
should be reasonably homogeneous
with one another
How does ANOVA work?

Tests to see whether the groups have the same
average values (i.e. come from the same
population)
Null hypothesis (H0): all the means are equal
Alternative hypothesis (Ha): at least one mean is different
2 variables:
One defines the groups (independent variable or factor); variable is
..
The other defines what you have measured (dependent
variable); variable is ..
Based on the variation between and within the groups
[Figure: responses to a question from respondents 1-5 in three age groups (Age <35, Age 35-55, Age >55); between variance is measured across the groups, within variance among the respondents inside each group. (Sudman & Blair 1998)]
[Figure: within-group versus between-group variation]
What does this mean for us?
Interested in whether the between groups variation is
much greater than the within groups variation.
If it is, then we have evidence that the groups do
have different average values
Then interested in discovering which group, or
groups, have different average values
The basic principle of the ANOVA

If the variation between the groups (explained by the
factor) is significantly larger than the variation within
the groups, then the factor is taken to be statistically
relevant in explaining the differences
The test statistic
The test statistic is computed as:
$$F = \frac{s_B^2}{s_W^2} = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$
This test statistic compares the weight of the
variance explained by the factors to the weight of the
variance not explained by the factors
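A small worked sketch of the F ratio may help. The ratings below are made up for illustration (three hypothetical age groups rating importance on the 1 to 7 scale), and `one_way_f` is an illustrative helper, not an SPSS function.

```python
from statistics import mean

def one_way_f(groups):
    """F = (variance between groups) / (variance within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: each value around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical ratings for three age groups.
ratings_by_age = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]
f_stat = one_way_f(ratings_by_age)
print(f"F = {f_stat:.1f}")
```

Here the group means (2, 4, 6) are far apart relative to the spread inside each group, so F is large; the p value would come from an F distribution with (k - 1, n - k) degrees of freedom.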
Hypotheses
Null Hypothesis:
The average rating of the importance of ... does not
differ by age
i.e. average (17-35yrs) = average (36-45yrs) = average (46-55yrs)
= average (55yrs+)
$\mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternate Hypothesis:
The average rating of the importance of ... differs by age
i.e. at least one average of the subgroups is different from the other
averages
E.g. $\mu_1 \neq \mu_2 \neq \mu_3$, OR $\mu_1 = \mu_2$ but $\mu_3$ and $\mu_4$ are
different, etc.
Underlying Assumptions
Normality of the dependent variable
Plots (primarily) and other tests
Homogeneity (equality) of variance across the
groups
This can be relaxed.
Example
SPSS output:
Test statistic - F:
p value (sig value):
Decision: ..H0
Interpretation
What does this mean?
There is evidence from our sample indicating that
the importance of .. differs by age.
Where do we go from here?
Question ...
Which group or groups are possibly different?
This is not given through the hypothesis test!
For our case, we can examine the sample means - is
this reliable?
Need more information than your own ability to
distinguish between numbers!
Especially if you have many sub-groups, as is the case
here!
Use of Post Hoc Tests

Allows you to understand which group or set of
groups are different in average value to the rest.
The test you choose depends on whether the
groups can be assumed to have equal variance
or not.
Equal var: Tukey, Duncan, Scheffe
Unequal var: Tamhane's T2, Dunnett's T3
Re-examine the SPSS output
Conclusion now?
Levene's test
It tests the null hypothesis that the population
variances are equal across all the sub-groups being
examined.
If p<0.05, then population variances are significantly
different.
Interpretation
The importance of speed of repairs does differ by
age. We just identified which ones are different!
Warning!!
Do NOT do a series of 2 sample tests to discover
these differences!
You will artificially (and incorrectly) increase the chance of
finding a statistically significant difference in your sample!
YOU CAN HOWEVER COMPARE THE MAXIMUM
AND MINIMUM MEANS to test if a difference exists.
NOTE:
Only go to the Post Hoc tests if
the ANOVA result is significant
(i.e. p value<0.05)
Main Points from ANOVA
ANOVA helps understand differences between
more than 2 independent groups
Post Hoc tests will identify which group(s) are
different
Choice of post hoc test depends on whether equal
variances can be assumed or not
Repeated Measures
ANOVA
Examining Multiple Groups - 2
Outline
Further Expanding our toolbox of techniques
Extension beyond 2 sets of responses:
Repeated Measures ANOVA
Extension beyond 2 sets of
responses
i.e., looking beyond the paired t-test
Difference in more than 2 sets of responses by the
same individuals
Need to account for the fact that the responses are
not independent
i.e., each respondent has provided information on each
question
Need to use Repeated Measures ANOVA
Through General Linear Models
What is GLM
The general linear model (GLM) is a statistical linear
model. It may be written as
Y = XB + U
where
Y is a matrix with series of multivariate responses
X is a design matrix
B is a parameter matrix (to be estimated)
U is a matrix containing errors
The errors are usually assumed to follow a multivariate
normal distribution.
Examples of questions
Has consumers' attitude towards brand X changed
over time (measured monthly for the last 6 months)?
Does on-going training improve participants' skill
development?
Is there a difference in respondents' liking of the four
brands of soft drink?
From question to hypothesis
Broad Question:
What attributes are important to consumers?
Analysis Q:
Are all the attributes equally important?
H0: There is no difference in the average rating of
importance of the listed attributes.
H1: There is a difference in the average rating of
importance of the listed attributes.
Steps to RMA
General Linear Model > Repeated Measures
Define the within-subject factor. In our case it is _____?
Give the measure a name: what does the scale
signify? Importance of the attribute.
Click Define and identify all the variables being
compared.
Click on the Options button.
Display means for attributes
Compare main effects: Bonferroni
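The Bonferroni option adjusts the pairwise comparisons for the number of comparisons made. As a rough sketch of the idea (the raw p values below are invented, and this is not SPSS's own code), each raw p value is multiplied by the number of comparisons and capped at 1.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values: raw p times the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p values for three pairwise comparisons of attributes.
raw = [0.004, 0.030, 0.400]
adjusted = bonferroni(raw)
for p, adj in zip(raw, adjusted):
    verdict = "significant" if adj < 0.05 else "not significant"
    print(f"raw p = {p:.3f} -> adjusted p = {adj:.3f} ({verdict})")
```

Note how a comparison that looks significant on its own (raw p = 0.030) may no longer be significant once the adjustment is applied.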
Interpretation
There is a significant difference in people's average
importance ratings of the various attributes of a customer service
company (p<.05).
Interpretation
Attribute 1 is statistically different in importance from
attributes 3,4,5,7..
Main Points from Repeated ANOVA

Repeated measures ANOVA helps you to
understand differences between more than 2
related answers from the ________
Once again, post hoc tests will identify which
group(s) are different
Think about what the results are telling you:
try to group the like variables together.
Module 6 - Exploring
Relationships
Relationships in general
Revisiting Crosstabs and Chi-
square
Introduction to Correlation
Outline
Where are we: what can we currently say about our data?
Types of relationships between variables
Relationships between 2 nominal variables
Revisiting cross tabs and Chi-square
Correlation
Pearson correlation
Spearman rank correlation
Introduction to Regression
Recap
What types of questions can we answer so far?

Examples:
What is the demographic profile of our sample?
Do they have definite opinions on topics?
On average, do different subgroups (eg
demographic, geographic, usage) differ in their
response or behaviour?
Are the general patterns consistent or not?
Let's expand our investigation further:
What other questions may we ask ourselves about the data
set we are investigating?
What attributes influence overall performance
perceptions of airlines?
What factors are associated with reading more
books?
What factors impact on satisfaction or likelihood of
purchase?
These are examples of questions for testing the
association/relationship between variables
Recall our types of analysis
Descriptive Analysis
E.g., Means, medians, frequency, standard deviation
Inferential and Difference Analysis
E.g., T-tests, ANOVA
Associative Analysis
E.g., Correlation, Crosstabs with chi-square
Predictive Analysis
E.g., Regression
Associative analysis: used to determine systematic relationships among
variables
Are the variables related?
If so, how are they related?
Types of relationships
Non-monotonic
Monotonic
Linear
Curvilinear and non-linear
1. Types of relationships - Non-monotonic
Presence or absence of a variable is associated with
presence or absence of another variable
No discernible direction to relationship (or not interested in
exploring), but a relationship exists
Variables:
type of data: at least one is nominal (or made
nominal)
Examples
Is there an association between gender and type
of information source used?
Is there an association between type of student and the
type of environmentally friendly product they would
purchase?
2. Types of relationships - Monotonic
Can only assign a general direction to the association
between two variables
Monotonic increasing
one variable increases as the other also increases
Monotonic decreasing
one variable decreases as the other variable
increases
No indication of the amount of change
Variables: type of data: both ordinal (subject to some trivial
exceptions)
Examples
Is there a relationship between age and amount of
influence a child has on choosing their clothes?
Is there a relationship between education level and interest
in different sports at the Olympics?
Is there a relationship between age and how often people
purchase a cause-related product?
[Figure: a further example of a relationship that is also monotonic]
3. Types of relationships - Linear
A straight-line association between two variables
More precise and more information than a monotonic
relationship
The amount of change is able to be calculated
y = a + bX
Variables:
type of data: both scale (metric); we will learn how to deal
with other variable types later.
E.g. Does likelihood of purchasing brand X increase
with an increase in attractiveness of the package?
Example
[Figure: scatter plot of Sales (y axis, 0 to 40) against Price (x axis, 0 to 6)]
4. Types of relationships - Curvilinear
The association between the 2 variables is described by a curve rather than a straight line.
E.g. U shape, J shape
Variables:
type of data: both scale (metric)
[Figure: Kuznets curve]
[Figure: J Shaped Curve]
Example - curvilinear relationship
[Figure: scatter plot of Income (y axis, 0 to 70) against Age (x axis, 0 to 100)]
Describing relationships:- Characteristics
Presence
a relationship exists between 2 variables
Direction
is the relationship positive or negative?
Strength of association
strong, moderate, weak or nonexistent
how consistent is the relationship
Summary of relationship characteristics
               Presence   Direction   Strength
Nonmonotonic   Yes        No          No
Monotonic      Yes        Yes         No
Linear         Yes        Yes         Yes
Non-linear     Yes        Yes         Yes
So, How do we test for these associations?
We can use crosstabs with Chi-square and Correlation.
The actual procedure (test) depends on the type of data we
have!
Examining Non-monotonic relationships -
Recap on Crosstabs & Chi square
Crosstabs with Chi-square are used to assess the
presence or not of a non-monotonic relationship
Recall
Variables are nominal/ordinal
Counts the number of observations in each possible sub-
group or cell
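As a reminder of what the chi-square statistic does with those cell counts, here is a minimal sketch. The 2 x 2 counts are invented for illustration (say, two regions by two transaction types), and `chi_square` is a hypothetical helper rather than an SPSS routine.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a crosstab."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Hypothetical counts: rows = regions, columns = types of transaction.
counts = [[30, 10],
          [20, 40]]
chi2, dof = chi_square(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}")
```

A large statistic relative to its degrees of freedom gives a small p value, i.e. evidence of an association; it still says nothing about what the association looks like.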
Example:
Question: Is there any association between region and
type of transaction?
Is this association significant?
Type of association?
Null: There is no association between region and type of
transaction.
Alternative: There is an association between region and
type of transaction.
From the example
SPSS output
p level: 0.005
Reject null
There is evidence to say that there is an association.
Need to now go and describe the association. How?
Remember
Chi-square only tells if there is an association or not

It does NOT describe the association or say how strong it is
Need to gain this information through other means
Revisit the crosstab
Use other statistics
Linear relationships - Correlation Analysis
Measured through the correlation coefficient: r

Range from -1 to +1
Absolute size of coefficient indicates the amount of
association - its strength
Sign indicates the direction
Correlation based on the degree of covariation among the
variables
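The idea that r is built from covariation can be sketched in a few lines. The data here are made up, and `pearson_r` is an illustrative helper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their own variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # covariation
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings on two scale variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.3f}")   # sign gives direction, absolute size gives strength
```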
The three steps of correlation and regression

1. Look at the scatter plot (maybe run a correlation)
2. Conduct the regression
3. Interpret the results
What different levels of correlation look like.
[Figure: scatter plots at different levels of correlation (Churchill 2000, Fig 21.6)]
Rules of thumb
0.81 - 1.0 strong

0.61 - 0.8 medium
0.41 - 0.6 weak
0.21 - 0.4 very weak
0.00 - 0.2 none
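The bands above apply to the absolute size of r. A small sketch of a lookup, with the cut-offs taken from the list and the function name assumed:

```python
def strength_label(r):
    """Map a correlation coefficient to the rule-of-thumb label (uses |r|)."""
    size = abs(r)
    if size > 0.8:
        return "strong"
    if size > 0.6:
        return "medium"
    if size > 0.4:
        return "weak"
    if size > 0.2:
        return "very weak"
    return "none"

for r in (0.85, -0.65, 0.391, 0.1):
    print(r, strength_label(r))
```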
Pearson Correlation
Variables need to be scale (very few exceptions to this).

Measures the degree of association between 2 variables
Correlation
NOT cause and effect
ONLY relationship between the 2 variables
ONLY LINEAR relationship
Remember there are a number of possible meanings
of a correlation of 0.
Example
Broad question: Is there any relationship between
perceptions of the customer service company?
Analysis Q: Is there an association between importance
of staff competence and importance of value for money?
Null: There is no relationship between the importance of
staff competence and the importance of value for money.
i.e. Null: $\rho = 0$
Alternate: There is a relationship between the importance of
staff competence and the importance of value for money.
Interpretation
Correlation coefficient: r = 0.391
Strength of association?
Significant: Yes, p value = 0.000
There is a positive relationship between the importance
of staff competence and importance of value for money
(r=0.391, p=0.000); however, the relationship is very weak
(r falls in the 0.21 - 0.4 band of the rules of thumb).
The association is linear.
Relating Correlation and Regression
In many situations, people do not just want to describe the
relationship between two variables, they want to go
deeper to predict the effect of one variable on the
other.
Therefore you can extend beyond correlation to
regression!
Let's look at a plot of the relationship between 2 variables
Scatter Plot
What can it tell us?
How precisely can one describe the relationship?
[Figure: scatter plot of Number of Cars (RECMD46E, y axis) against Number of Kids (TPERF46C, x axis)]
Regression
What is it?
Specification of the relationship (linear)
Y = a + b*X
Terminology
Predictor variables: the independent or X variables
metric variables
Criterion variable: the dependent or Y variable
MUST be a scale (metric) variable (in simple regression)
Error or residual: difference between the Y value you have
observed and the Y value predicted from the regression line.
Principles of regression
Equation: $Y = a + bX + \varepsilon$
$\varepsilon$ is the error
Estimation through least squares (MLS):
i.e. finds the line which minimises the sum of the squared
errors over all the x values ($\sum e^2$ is as small as possible)
Assumptions
error ($\varepsilon$) has mean of 0
variance of the error terms is constant
variance of errors is independent of the values of X
errors are normally distributed
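The least-squares line can be sketched directly from its definition. The data are invented and `fit_line` is a hypothetical helper; SPSS does the same job (plus the significance tests) for you.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + b*X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: X = predictor rating, Y = criterion rating.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)

# Residual e = Y actual - Y predicted, for every observation.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)                 # what least squares minimises
mean_y = sum(y) / len(y)
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - sse / sst                            # share of variation in Y explained

print(f"Y = {a:.2f} + {b:.2f} * X, R squared = {r_squared:.2f}")
```

The residuals of the fitted line sum to zero, which matches the assumption that the error has a mean of 0.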
[Figure: Scatter Plot, simple case (RECMD46E against TPERF46C), marking Y actual and Y pred for one observation]
$e = Y_{\text{actual}} - Y_{\text{pred}}$
Principles of regression (assumptions)
Example
regression with 1 independent variable
Analysis question:
Is the overall satisfaction with the car usefully predicted by
satisfaction with the attribute comfort?
Dependent (Criterion) variable (Y):
Overall satisfaction
Predictor variable (X):
Comfort of the car
First:
Look at:
correlations
scatter plot
Why?
Output
Look at:
R²: coefficient of determination
percentage of the variation in the Y variable explained by the
X variables
Results of ANOVA
significance of the whole regression procedure
Significance of coefficient
Does this variable make a significant contribution in explaining
the Y variable?
Size of coefficient
Equation
Resulting equation:
Ov Sat = 1.878 + 0.65 * Comfort Satisfaction
Interpretation of equation and coefficient:
For each unit increase in satisfaction with comfort, overall satisfaction
increases by 0.65 units
Satisfaction with comfort is significant in predicting
overall satisfaction (p level for this variable: p=0.000); this
relationship is moderate (R² = 23.2%)
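To make the reading of the coefficient concrete, the fitted equation can be used for prediction. The sketch below only plugs numbers into the equation from the slide; the function name is illustrative.

```python
def predict_overall_satisfaction(comfort):
    """Predicted overall satisfaction from the fitted equation on the slide."""
    return 1.878 + 0.65 * comfort

at_4 = predict_overall_satisfaction(4)
at_5 = predict_overall_satisfaction(5)
print(f"comfort = 4 -> {at_4:.3f}")
print(f"comfort = 5 -> {at_5:.3f}")
print(f"change for one extra unit of comfort: {at_5 - at_4:.2f}")
```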
Module 7 - Exploring
Relationships
Multiple Regression (Multi-variate Analysis)
Explore a couple of procedures for doing this
Understand possible problems, in particular multi-
collinearity, and how to deal with them
Extending the applicability of regression
Incorporation of non-metric independent variables
Recap simple regression
Dept var (Y): what you are interested in
predicting/estimating
Indept var (X): have control over these
Want: correlation between DV and IV
Regression: Finds line of best fit
Precisely describes linear relationship between DV & IV
Is the IV useful in predicting Y (does it impact Y)?
Extend this to Multiple Regression
Marketing relationships are complex: need to go
beyond simple regression
Equation:
$Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + \varepsilon$
Additional assumption now:
Predictor or independent variables (X's) are uncorrelated
If this is not obeyed, you have the problem of
Multicollinearity
Multicollinearity
Why is this a concern?
Affects the significance of the coefficients
Reduces the efficiency of the estimates (of the
coefficients)
Creates problems interpreting the coefficients
Subsequent use of the coefficients
Problems if want to use as a basis of what to use in
strategy - Since here you are interested in the
importance of each variable
Multicollinearity
How do you test this?

Correlation between pairs of variables
Condition Index
If CI >15, there is possibly a problem
If CI >30, there is definitely a problem
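Tolerance (amount of variability in the selected IV not explained
by the other IVs)
If Tolerance < 0.1 (or, more cautiously, < 0.2), there is possibly a problem
VIF (= 1/Tolerance): if VIF > 10 (or, more cautiously, > 5), there is possibly a problem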
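Tolerance and VIF are two views of the same number. As a small sketch: if regressing one IV on the other IVs gives an R squared, then tolerance is 1 minus that R squared and VIF is its reciprocal. The correlation of 0.9 below is hypothetical, and with only two predictors that R squared is simply their squared correlation.

```python
def tolerance_and_vif(r_squared_with_other_ivs):
    """Tolerance = share of an IV's variability NOT explained by the other IVs."""
    tolerance = 1.0 - r_squared_with_other_ivs
    vif = 1.0 / tolerance
    return tolerance, vif

# Hypothetical case: two predictors that correlate at r = 0.9.
r = 0.9
tolerance, vif = tolerance_and_vif(r ** 2)
print(f"tolerance = {tolerance:.2f}, VIF = {vif:.2f}")
```

In this case the VIF is just above 5, so it would be flagged under the stricter cut-off but not under the VIF > 10 rule.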
Procedures for Multiple Regression

1. ENTER method:
You have control over variables put in and taken out of
regression equation
2. STEPWISE methods:
Variables enter and leave regression equation
according to predefined rules
Procedure 1
ENTER Method
We go through this so that you understand

the process behind multiple regression.
Example - ENTER method:

Research Question
What impacts on customers' overall satisfaction?
Analysis question:
Are perceptions of the customer service company's
performance useful in predicting overall satisfaction?
If so, what aspects of performance are the most
influential?
Criterion/explained/Dependent Variable?
Predictor/Independent/Explanatory variables?
Method:
Enter method
Output
Fit of equation?
Usefulness of the regression procedure?
Which variables significant?
Problem of multi-collinearity?
Look at VIF>8
What do you do now?
Remove the offending variable(s) and re-run the
regression
Interpretation of equation
Ov Sat = Constant + b1*X1 + b2*X2 + ...
Interpretation of coefficients
The average change in Y for a unit change in the X, given
that all other X variables are held constant
For our example:
Overall satisfaction will increase by ..
Interpretation
Relative importance of the variables

variables may be measured on different scales
Use standardised betas (and/or t-values) to
compare importance of the different X variables
What variable is more influential in our equation?
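A standardised beta rescales a coefficient by the spread of its X variable and of Y, so that variables measured on different scales can be compared. A minimal sketch, using invented data and a hypothetical unstandardised coefficient of 0.6:

```python
from statistics import stdev

def standardised_beta(b, x, y):
    """Rescale an unstandardised coefficient b to standard-deviation units."""
    return b * stdev(x) / stdev(y)

# Hypothetical data and coefficient for a single predictor.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b = 0.6                      # unstandardised slope for these data
beta = standardised_beta(b, x, y)
print(f"standardised beta = {beta:.3f}")
```

With a single predictor the standardised beta equals the correlation r; in multiple regression the betas are compared with each other to judge which X is more influential.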
Report Interpretation - an illustration

Certain perceptions of the performance of the
customer service company have been found to
impact on a customer's overall satisfaction. For
instance, positive perceptions of X1 and X2
increase a person's overall satisfaction (R² = 0.xx),
with perception of X1 (b1=0.xx) having the strongest
impact. X3 and X4 were some factors that did not
impact on overall satisfaction (p>0.05).
Conducting Regression Analysis
[Flowchart: Scatter plots -> Choose variables -> Run regression -> Significance of overall procedure -> Overall fit of equation -> Assumptions obeyed? -> All predictor variables significant? -> Use equation. A NO at any check leads back to: Rethink variables.]

Procedure 2
STEPWISE REGRESSION
This is the procedure you would be expected
to understand and use.
What is Stepwise Regression?
Used for sorting through a number of independent
variables
From your set of indept variables, it will provide you
with a subset of variables which are useful (i.e.
significant) in predicting your dependent variable
Stepwise regression
Various types:
Forward - one in, then adds one at a time
Backwards - all in, then eliminates one at a time
Stepwise - variables can enter or leave the
equation at each step depending on their
contribution to explaining the dependent variable
Stepwise regression
Warning:
may not produce the best equation
Variables in equation are related to the
multicollinearity present
Therefore, does not replace common sense!
Example:
Analysis Q: What combination of the following
variables are useful in predicting the overall
satisfaction with the customer service company?
How do the results compare with our previous results
(using Enter method)?
Testing Assumptions of Regression
Assumptions of regression
Variables normally distributed

Errors have equal (constant) variance
Errors uncorrelated
Independent variables uncorrelated
Definitions and testing
Heteroscedasticity
variance of the errors is not constant
Testing: plot residuals (y axis) against predicted y (on X
axis)
Autocorrelation
the size of error is related to time; errors are not
independent
Testing: plot residuals (y axis) against time(X axis)
Normality
Testing: normal probability plot
Testing the Assumptions
Plots of the residuals in SPSS

Plot standardised residual against standardised y
Plot standardised residual against time
[Figure: residual plot illustrating heteroscedasticity]
[Figure: residual plot illustrating autocorrelation]
[Figure: residual plot in the ideal situation]
Splitting data based on levels of IV

Does the relationship being studied differ for
Men and women?
The four income levels?
The five education levels?
The five age categories?
Should we look at the aggregate as well as the case

scenarios?
Expanding the Applicability
of Regression
Incorporating non-metric
Independent Variables
Usual scale for independent variable?
Non-metric variables - incorporated via dummy
variables
Dummy variable is a variable that only takes the
value 0 or 1
Dependent variable must always be metric!
Defining a dummy variable
Number of dummy variables needed equals the

number of categories minus 1
ie No. dummy variables = No. categories - 1
Therefore, if you wish to incorporate gender into
a regression, you would only need 1 dummy
variable, D1, such that:
D1 = 1, if the respondent was male
= 0, if the respondent was female
         X variables                        Y variable
Resp     Gender   Dummy_gender   Quality    Satis
1        M        1              4          5
2        M        1              4          5
3        F        0              3          3
4        F        0              2          1
5        M        1              5          4
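The Dummy_gender column in the table can be produced mechanically. A short sketch, where `dummy_code` is a hypothetical helper and the age-group labels are invented, showing the "categories minus 1" rule:

```python
def dummy_code(values, reference):
    """One 0/1 column per category except the reference category."""
    categories = sorted(set(values) - {reference})
    return {f"D_{c}": [1 if v == c else 0 for v in values] for c in categories}

# Gender column from the table above; female is the reference (coded 0).
gender = ["M", "M", "F", "F", "M"]
gender_dummies = dummy_code(gender, reference="F")
print(gender_dummies)        # one dummy for two categories

# A variable with three categories needs two dummies (hypothetical labels).
age_group = ["young", "middle", "older", "young"]
age_dummies = dummy_code(age_group, reference="young")
print(age_dummies)
```

Respondents in the reference category score 0 on every dummy, which is why one fewer dummy than categories is enough.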
Dummy variable regression
Overall Equation:
Satis = a + b1*Dummy_gender +b2*Quality
This reduces to -
Males:
Satis = a + b1 +b2*Quality
Females:
Satis = a + b2*Quality
The non-metric variable has the effect of changing
the constant for each category.
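The shift in the constant can be checked numerically. The coefficients below are hypothetical placeholders, not estimates from the course data; the point is only that the male and female lines differ by b1 at every level of Quality.

```python
def predict_satis(a, b1, b2, dummy_gender, quality):
    """Satis = a + b1*Dummy_gender + b2*Quality."""
    return a + b1 * dummy_gender + b2 * quality

# Hypothetical coefficients, for illustration only.
a, b1, b2 = 1.0, 0.5, 0.6

gaps = []
for quality in range(1, 8):
    male = predict_satis(a, b1, b2, dummy_gender=1, quality=quality)
    female = predict_satis(a, b1, b2, dummy_gender=0, quality=quality)
    gaps.append(male - female)
    print(f"quality = {quality}: male = {male:.2f}, female = {female:.2f}")
```

The gap is the same at each quality level, so the two lines are parallel with different constants (a + b1 for males, a for females).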
Creating Binary Dummy variables
[Figure: plot of Satis-female and Satis-male (y axis, 0 to 5; x axis, 0 to 8)]
Incorporating dummy variables in regression
Explains case when you have more than 2 categories for
your nominal/ordinal variable (Remember we are talking
about INDEPENDENT variables!!)
Main Points from Module
Regression provides the specification (i.e.
equation) of the relationship between variables
for our cases, linear association
Breaking of assumptions leads to inaccurate
interpretation and unreliable results
Check for multicollinearity
Know how to identify the key predictors and how
to interpret a regression equation
Concentrate on understanding Stepwise
regression
