Module 1:
Rahul Govind
University of Pittsburgh
Carnegie Mellon University
University of Mississippi
Rahul Govind
The University of New South Wales
Recap MARK 2052
5. Sample design
Decision
6. Survey design
7. Collect data
8. Process data
9. Analyze data
Step 1: Establish Need for information
My product used to sell very well but its sales are now declining!
Step 2: Establish Information Needs
What are the questions that we need to ask (the
consumers)?
Will they give us answers to what we want to know in
step 1?
Show me the numbers!!!
Don't use too much intuition!!
What percentage of the time do you think you smell bad
amongst friends/co-workers?
What percentage of the time do you think your friends/
co-workers smell bad?
You think < 10%
Everybody else thinks 35%
Secondary and Primary Data
Step 5: Sample Design (for survey research)
Who should be interviewed? (Population and Sample)
Want to sell to the entire US. How do I survey EVERYBODY?
Decision maker/influencer
Dad who buys the car (purchaser) or
Child that drives it to school (user)
Step 7/8: Collect and Process Data
Interview quality control procedures
Interviewer quality
Make sure that you haven't hired a lazy guy
Confirm that he is doing his job
Data processing quality
Data cleaning procedures
Qn 6 asked - Do you prefer an automatic transmission?
Respondent did not answer
TRASH
Logic checks
Qn 5 asked - Do you own a car? Respondent said No.
Qn 6 asked Is it an American Car? Respondent said
Yes.
TRASH
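The two cleaning rules above (an unanswered question, and a logic check that fails) can be sketched in a few lines of Python. The records, field names and the exact rules below are hypothetical illustrations, not the course's data set or procedure.

```python
# Hypothetical coded responses; None marks a question left unanswered.
responses = [
    {"id": 1, "q5_owns_car": "Yes", "q6_american_car": "Yes"},
    {"id": 2, "q5_owns_car": "No", "q6_american_car": "Yes"},   # fails the logic check
    {"id": 3, "q5_owns_car": "No", "q6_american_car": None},    # non-owner skipped Q6: fine
    {"id": 4, "q5_owns_car": None, "q6_american_car": "No"},    # Q5 not answered
    {"id": 5, "q5_owns_car": "Yes", "q6_american_car": None},   # owner did not answer Q6
]


def is_usable(row):
    """Return False for records that the cleaning rules would discard."""
    if row["q5_owns_car"] is None:
        return False  # missing answer to the screening question
    if row["q5_owns_car"] == "Yes" and row["q6_american_car"] is None:
        return False  # owner left the follow-up question blank
    if row["q5_owns_car"] == "No" and row["q6_american_car"] == "Yes":
        return False  # logic check: a non-owner cannot own an American car
    return True


clean_ids = [row["id"] for row in responses if is_usable(row)]
print(clean_ids)
```

In practice a flagged questionnaire may be followed up rather than discarded; the sketch only shows how the checks can be automated.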
Don't make the analysis too simple, but don't go
crazy with the math either
Step 10/11: Write a report/Take Action
Writing reports
User friendly reports
Bullets and dashes format
A picture is worth 1000 words
BLOT strategy (bottom line on top)
Presenting findings
Be sensitive to the audience's level of knowledge
Be a manager, not a statistician
Sometimes numbers don't mean a thing! But most of the time
nothing means anything without numbers
Action
Design changes in the marketing mix
Go back and see if you need more data to make better decisions
Outcomes of this course
Use SPSS to analyze a variety of data typically
collected by marketers.
The two stories of Marketing
Build a better mousetrap and the world will beat a
path to your door - Ralph Waldo Emerson
Begin 3054
Statistical Analyses
Used to find out
What people have in common
How they differ
Predict how they will act in the future
Types of statistical analyses
Descriptive Analysis (who are they?)
Used to describe the data to reveal general pattern of
responses
To portray the typical respondent
Types of statistical analyses
Differences Analyses
To determine if differences exist between groups
Types of statistical analyses
Predictive Analyses
Allows forecasts of future events
Estimate the level of Y given the amount of X
Descriptive Analysis
E.g., Means, medians, frequency, standard deviation
Associative Analysis
E.g., Correlation, Crosstabs with chi-square, Factor
analysis, Cluster analysis
Predictive Analysis
E.g., Regression
How do you know what method to use?
Choice of Technique: Two or more variables
(Kinnear & Taylor 1991; Churchill 1999)
Editing
Coding
Some Editing issues
Preliminary questionnaire screening
Coding
Aim of coding:
Retain as much information in the data file as on the hard
copies of the questionnaire
Facilitate data analysis
Coding and Data entry: some issues
How often do you visit a dentist? (Tick ONE box only)
Every 6 months
Every year
Every 2 years
Only once in the last 5 years
Why have you not used the Optometry Clinic? (May tick
more than one option)
What is the likelihood you would go to a show at
the Opera House within the next year?
Measurement:
What are the various types of data?
Examples
Australian School of Business
MARK 3054: Market Analysis / MR-II
Module 2:
Data Preparation &
Customer Profiling
Outline
Question..
Once you have the data coded and entered in a
data set, what do you do first?
Objectives
The research objective can be broken down into sub-objectives,
for example:
To characterise customers.
To understand what the customers want from a customer service
company.
To understand the performance of the regions and whether they
differ in any way.
To evaluate the company's performance.
To identify the factors that impact on customers' satisfaction.
Getting to Know
the Data
Getting to know your data
Frequencies
Graphical examination
Measures of central tendency
Measures of dispersion
These help to:
Detect outliers
Detect errors
Test assumptions
Frequencies
What are they?
Type of data used?
Common uses of frequency tables
Data cleaning:
determine the degree of non-response
locate blunders
locate outliers
Determine empirical distribution of a variable
relates to graphing
Calculate summary statistics
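A frequency table of this kind is easy to build by hand. The sketch below uses made-up answers to a 1-5 rating question (the list, the `None` marker for non-response and the valid code range are all assumptions) and shows how the table exposes the degree of non-response and any keying blunders.

```python
from collections import Counter

# Hypothetical coded answers; None = no answer, 55 = a keying blunder.
answers = [3, 4, 4, 5, None, 2, 4, 55, 3, None, 1, 4]
valid_codes = {1, 2, 3, 4, 5}

freq = Counter(answers)
n = len(answers)

# Print value, count and percentage, with missing answers listed last.
for value in sorted(freq, key=lambda v: (v is None, v if v is not None else 0)):
    label = "missing" if value is None else value
    print(f"{label!s:>8} {freq[value]:>3} {100 * freq[value] / n:5.1f}%")

non_response_rate = freq[None] / n
blunders = sorted(v for v in freq if v is not None and v not in valid_codes)
print("non-response rate:", round(non_response_rate, 3))
print("codes outside the valid range:", blunders)
```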
How to use this information
Graphical Examination
Highlights:
the nature of the variable - the shape of the distribution
relationships between variables
unusual values
SPSS examples: What can you say from these
graphs?
bar chart and histogram
Assumptions
Common Assumptions for tests
Normality: Kurtosis and Skewness
Linearity
Equal Variance
Data Reduction
As you get to know your data, you start the process
of data reduction
Why?
Summarise
Communicate
To help summarise and communicate:
MEASURES OF CENTRAL TENDENCY
Mode - the mode is the value that occurs most frequently in a data
set or a probability distribution
Mean
Arithmetic Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Frequency Distribution
Range
Measures of Variability
Standard deviation
How much do the responses vary?
Do most respondents answer the same?
Low variation in responses = low variance in opinion
Do survey participants respond all over the scale?
Represents the typical difference of any one value from
the mean
Standard Deviation
It shows how much variation or "dispersion" there is
from the average (mean, or expected value). A low
standard deviation indicates that the data points tend
to be very close to the mean, whereas high standard
deviation indicates that the data are spread out over
a large range of values.
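As a rough illustration of these summary measures, the sketch below computes the mean, median, mode, range and sample standard deviation for a small set of made-up ratings. The data are hypothetical; `statistics.stdev` uses the sample formula with an n - 1 denominator.

```python
import statistics

# Hypothetical ratings on a 1-5 scale from ten respondents.
ratings = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]

mean = statistics.mean(ratings)      # arithmetic mean
median = statistics.median(ratings)  # middle value of the sorted data
mode = statistics.mode(ratings)      # most frequent value
value_range = max(ratings) - min(ratings)
sd = statistics.stdev(ratings)       # sample standard deviation (n - 1 denominator)

print(f"mean={mean}, median={median}, mode={mode}, range={value_range}, sd={sd:.3f}")
```

A small standard deviation relative to the scale would suggest that respondents largely agree; a value near the width of the scale would suggest answers spread all over it.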
If I don't have SD
2.778 vs 2.063
Other measures
Skewness: how much a distribution of
responses may be skewed to the left side or to
the right side
Leptokurtic vs Platykurtic
So what do we now know about
our customer service data?
Can gain a description of who is in our sample (age,
gender etc)
Describe the average and pattern of response to
key variables
So far ..
Started to summarise the data and gain some facts
about respondents
E.g., average response, variability in response,
commonly used categories etc
Cross tabulations
Cross tabulations
Extends the frequency table to 2 variables
Variables are nominal / ordinal
Counts the number of observations in each
possible sub-group or cell
Need to be on the lookout for too many cells with very
small counts (<5), which reduces reliability
Analyse/Descriptive Statistics/Crosstabs
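Outside SPSS, the same cell counts can be produced with a few lines of Python. The respondents below are invented for illustration (two age groups by type of transaction); the sketch counts each cell and flags cells with fewer than 5 observations.

```python
from collections import Counter

# Hypothetical respondents: (age group, type of transaction).
respondents = (
    [("Under 35", "Private")] * 12
    + [("Under 35", "Business")] * 3
    + [("35 and over", "Private")] * 6
    + [("35 and over", "Business")] * 9
)

cells = Counter(respondents)
rows = sorted({age for age, _ in respondents})
cols = sorted({kind for _, kind in respondents})

# Print the table of counts, one row per age group.
print(" " * 12 + "".join(f"{c:>10}" for c in cols))
for r in rows:
    print(f"{r:<12}" + "".join(f"{cells[(r, c)]:>10}" for c in cols))

# Flag cells whose count is below 5.
small_cells = [(r, c) for r in rows for c in cols if cells[(r, c)] < 5]
print("cells with count < 5:", small_cells)
```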
Let's follow up on a question:
The overarching questions here:
How can we characterise customers? What can we find
out about them? Are they the same or not?
Type of transaction
Private
Business
Interpreting the output:
Conclusion?
There appears to be some association between age
and type of transaction. However..
Implications
Hypothesis Testing
From past experience we may have some
assumptions about the relationships between
variables or on how consumers may rate certain
aspects of our products
Aim: To examine whether a particular proposition
(hypothesis) concerning the population is likely to
hold
Is there enough evidence from the sample to
reject the hypothesis?
If not, we say that from this sample we cannot reject the
hypothesis - we do not say that we accept it!
Hypothesis testing: creating a benchmark
From our sample we can calculate a test statistic to
help us test our hypothesis.
This test statistic can take on a range of values,
depending on the sample we have drawn from the
population.
All these values together give you the distribution for
the statistic.
The cut-off points for the tails are usually defined such
that the probability of getting a value this extreme or more
extreme, when the hypothesis is true, is 0.05.
This 0.05 is known as the significance level (the
probability of incorrectly rejecting a hypothesis that is
actually true), and the cut-off point is the critical value.
Therefore, we have established our benchmark or bar
to judge if the evidence from the sample is sufficient to
say that there is a statistical relationship.
I.e., for a particular test statistic, if the probability
of a test statistic at least this extreme (p value) is <0.05,
there IS evidence to REJECT the null hypothesis.
Is this association significant?
Hypothesis test - variables are
independent, i.e., there is no
association between them.
Null: There is no association between age and
type of transaction
Alternative: There is an association between
age and type of transaction.
Main Points from this module.
Profiling the Customer:
Gaining Further Understanding of the Target Market
t-Tests
One Sample, Independent & Paired
Outline
Extending to techniques that allow us to test
assumptions about the data and differences between
groups of respondents.
One sample t-test
Independent t-test
Paired t-test
Key understanding from today:
So , from the techniques we cover today and next
week, we can:
Understand our target market more
Understand similarity of thoughts, behaviours etc between
different basic groups within our target market,
and hence start to understand whether there are patterns
in the data
Sample vs Population
Sample (statistics)      Population (parameters)
mean: $\bar{x}$          mean: $\mu$
st dev: $s$              st dev: $\sigma$
percentage: $p$          percentage: $\pi$
slope: $b$               slope: $\beta$
Test a Hypothesis
Does the sample reflect (a manager's) prior belief about
the population?
Uses information/evidence from the sample to infer about
the population parameter
A Sampling Distribution: 2-tail test, $\alpha = 0.05$
[Figure: sampling distribution centred on 0, with $\alpha/2 = 0.025$ in each tail]
Developing Hypotheses
Null Hypothesis
Statement that asserts the status quo.
One always tested by statisticians & market researchers.
Alternative Hypothesis
Statement that is the opposite of the null hypothesis.
Difference is not simply due to random error.
Example:
Past evidence suggests that on average, people
are indifferent to using the car as a source of
enjoyment (i.e. they always answer in the middle
of a scale: neither agree nor disagree).
However, we think that this may not be so (our
hypothesis).
Example:
Null hypothesis:
$H_0: \mu = 4$
i.e. On average, customers neither agree nor disagree
that the car is a major source of entertainment
Alternative hypothesis:
$H_1: \mu \neq 4$
i.e. On average, customers agree or disagree that the car
is a major source of entertainment
Example
From the SPSS output, we found
sample size: 154
Sample mean: 3.84
test statistic (t value): -1.12
significance level: 5% (0.05)
probability of obtaining test statistic or greater value
(sig (2-tailed) - p value): 0.264
(Student's) t-distribution
General Interpretation
Do not reject H0 if:
test statistic is in the allowable range, i.e. the p value is
greater than 0.05
If you cannot reject H0 this means:
based on the sample there is NO evidence to indicate
that the population parameter is not equal to the
hypothesised (in H0) value
In other words, the sample is consistent with the
population parameter being equal to the hypothesised
value (here p = 0.264, so we cannot reject H0)
Example
From the SPSS output, we found
sample size: 155
Sample mean: 3.35
test statistic (t value): -4.02
significance level: 5% (0.05)
probability of obtaining test statistic or more extreme value
(sig (2-tailed) - p value): 0.00 (rounded; the p value is very small, not exactly zero)
General Interpretation
Reject H0 if :
test statistic is not in the allowable range i.e. p value is
less than 0.05
If you reject H0 this means:
based on the sample there IS evidence to indicate that
the population parameter is not equal to the
hypothesised value
In other words, the sample indicates that there is a
large probability that the population parameter is NOT
equal to the hypothesised value.
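The t value that SPSS reports for a one-sample test is the sample mean minus the hypothesised mean, divided by the standard error of the mean. A minimal sketch with ten made-up ratings follows; the data are hypothetical, and the critical value quoted in the comment is the standard two-tailed 5% table value for 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return (t, df) for H0: population mean equals mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.mean(values) - mu0) / standard_error
    return t, n - 1


# Hypothetical agreement ratings on a 1-7 scale; H0: mean = 4 (scale midpoint).
ratings = [3, 4, 2, 5, 3, 4, 2, 3, 4, 3]
t, df = one_sample_t(ratings, mu0=4)

critical = 2.262  # two-tailed 5% critical value of t for df = 9 (from t tables)
print(f"t = {t:.3f}, df = {df}")
print("reject H0" if abs(t) > critical else "cannot reject H0")
```

Comparing |t| with the critical value is equivalent to comparing the p value with 0.05.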
But do we stop here with our
question?
We now have some understanding of people's views
on the performance of the company, but do all
groups of people think the same?
How many groups/samples?
2 groups: Groups/Samples related?
N: Independent t test
Two groups of responses that are tested as though they
may come from different populations.
Y: Paired t test
Two groups of responses that originated from the
same population (people)
>2 groups: ANOVA
more than 2 groups of responses
Independent samples t test (2)
Variables required:
One to describe the groups (e.g., gender, usage)
Needs to be nominal (preferably)
One variable in which you are interested to see if there are
differences
Needs to be scale (preferably)
Question ... Hypothesis
So, continuing with the overarching question (what are
people's views of the company?):
Are there differences between males and females in their
thoughts on issue number ________?
Let's follow this through ...
Analysis Question:
Do males and females differ in the weight they place on
their family's opinion?
Null hypothesis
No difference between the groups - in terms of
their average value
No difference between the population
parameters
i.e. $\mu_1 - \mu_2 = 0$ OR $\mu_1 = \mu_2$
In other words, for our example:
H0: Males and females do not differ in the importance
of their family's opinion
Alternative Hypothesis
There is a difference between the groups - in terms of
their average value
There is a difference between the population
parameters
i.e. $\mu_1 - \mu_2 \neq 0$ OR $\mu_1 \neq \mu_2$
In other words, for our example:
H1: Males and females DO differ in the importance of
their family's opinion
Output from Example
From the SPSS output we get
sample sizes: 56 and 87
variances equal or not? Equal (p=0.09)
means: 3.66 and 3.34
t value (test statistic): -1.04
p value (sig value): 0.3
Decision: Cannot Reject H0
Interpretation (1)
There is enough evidence from the sample to reject
the null hypothesis
Males and females do differ in their average opinion
on a car being a source of fun.
Go back to the sample descriptives to describe
what the difference is, i.e. which group, on average,
agrees more strongly?
Interpretation (2)
There is a significant difference between Males
and Females in their opinion of a car being a
source of fun and excitement. Females tend to
agree more strongly with this statement
compared to males (average 4.27 compared to 3.5,
p=0.009)
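For two independent groups with equal variances assumed, the t statistic is the difference in group means divided by a standard error built from the pooled variance. The sketch below uses invented ratings for two small groups; the group labels, values and the equal-variance assumption are all illustrative.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent samples t test, equal variances assumed."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, na + nb - 2


# Hypothetical 1-7 agreement ratings.
males = [3, 4, 2, 5, 3, 4]
females = [5, 4, 6, 5, 4, 6]

t, df = pooled_t(males, females)
critical = 2.228  # two-tailed 5% critical value of t for df = 10 (from t tables)
print(f"t = {t:.3f}, df = {df}")
print("reject H0" if abs(t) > critical else "cannot reject H0")
```

If the test of equal variances fails, the unequal-variances row of the SPSS output would be read instead; that version is not shown here.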
Paired t test
Examines differences between 2 groups of
responses
One set of people, 2 sets of answers
before and after experiment
same respondent answering 2 questions
Check on consistency of answers
Matching of questions
(eg difference in importance and performance of a
range of attributes)
Assumes some connection between the questions
Overarching question - what do customers want in a
service company:
What attributes are important?
Analysis questions:
Are all attributes equally important?
What are customers' views on the importance of pleasant
interiors and importance of staff being helpful?
Question to investigate
Analysis question: Is there a difference in how
customers rate the importance of speed and
mileage?
First ... go back to the data
Q: On the following scale, please indicate your view:
Car Attribute: Safety
Critical Importance 1 2 3 4 5 6 7 Minor Importance
Null Hypothesis
No difference in the responses to the 2 questions
Mean of the differences is 0
i.e. $\mu_{diff} = 0$
Alternative hypothesis
There is a difference in the responses to the 2
questions
Mean of the differences is not 0
i.e. $\mu_{diff} \neq 0$
Variable used in the paired t-test: the difference between each respondent's two answers
Respondent   Answer 1   Answer 2   Difference
1            3          4          -1
2            7          7           0
3            2          5          -3
4            5          5           0
5            6          7          -1
6            4          4           0
7            2          7          -5
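The paired t test is a one-sample t test on the differences. As a sketch, the seven rows above give the following calculation; the variable names are illustrative, and the critical value in the comment is the standard two-tailed 5% table value for 6 degrees of freedom.

```python
import math
import statistics

# The two answers given by each of the seven respondents listed above.
answer_1 = [3, 7, 2, 5, 6, 4, 2]
answer_2 = [4, 7, 5, 5, 7, 4, 7]

# The variable actually tested: each respondent's difference.
differences = [a - b for a, b in zip(answer_1, answer_2)]

n = len(differences)
mean_diff = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)
t = mean_diff / standard_error  # H0: mean of the differences is 0
df = n - 1

critical = 2.447  # two-tailed 5% critical value of t for df = 6 (from t tables)
print(f"mean difference = {mean_diff:.3f}, t = {t:.3f}, df = {df}")
print("reject H0" if abs(t) > critical else "cannot reject H0")
```

With only seven respondents the mean difference of about -1.43 is not quite large enough, relative to its variability, to reject the null at the 5% level.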
First .
Look at sample descriptives to gain an
understanding of
what to expect, e.g. does the difference appear large or
small?
How to interpret, e.g. what variable may be rated higher
(or lower)
Output from the Example
From the SPSS output we get:
sample size: 154
Correlation: NS
Mean of the differences: -0.26
t value: -0.181
p value (sig value): 0.856
Decision: Cannot Reject H0
Interpretation
There is not enough evidence from the sample to reject
the null hypothesis (p = 0.856)
We cannot conclude that there is a difference in the way
people rate the importance of the two attributes
When a result is significant, always check: is it a
meaningful difference?
Is the difference large enough to act upon or is it just
significant due to a large sample size?
The three tests from this module
One sample t-test:
Tests assumption about the mean of a variable
Independent sample t test:
tests for a difference between means of 2 unrelated
samples
Paired t test:
tests whether there is a difference in the way one sample
has answered 2 questions.
ANOVA Setup and Analysis
Example:
Broad question:
Let us analyse, one at a time, what attributes are
important to different types of customers?
One-Way ANOVA
Let's take one attribute
Importance of gas mileage
rated from 1 (critical importance) to 7 (minor importance)
One-Way ANOVA
Only one categorical variable (a single factor)
2 variables:
One defines the groups (independent variable or factor): variable is ..
The other defines what you have measured (dependent
variable): variable is ..
Based on the variation between and within the groups
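[Figure: responses to a question (1 to 5) plotted for three age groups (Age <35, Age 35-55, Age >55). Between variance is the variation between levels of respondents (the groups); within variance is the variation of responses inside each group.]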
Within
Between
What does this mean for us?
Interested in whether the between groups variation is
much greater than the within groups variation.
The test statistic
The test statistic is computed as:
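For a one-way ANOVA the test statistic is the F ratio: the between-groups mean square divided by the within-groups mean square. A minimal sketch with three invented groups follows; the ratings are hypothetical, and the critical value in the comment is the standard 5% table value for (2, 9) degrees of freedom.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual responses around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical importance ratings for three age groups.
groups = [[2, 3, 2, 3], [4, 3, 4, 5], [5, 6, 5, 6]]
f_value, df_between, df_within = one_way_anova_f(groups)

critical = 4.26  # 5% critical value of F for (2, 9) df (from F tables)
print(f"F = {f_value:.2f}, df = ({df_between}, {df_within})")
print("reject H0" if f_value > critical else "cannot reject H0")
```

A large F means the group means differ by much more than the scatter inside the groups would explain.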
Hypotheses
Null Hypothesis:
The average rating of the importance of .. does not
differ by age
i.e. Average (17-35yrs) = average (36-45yrs) = average (46-55yrs)
= average (55yrs +)
$\mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternate Hypothesis:
The average rating of the importance of .. differs by age
i.e. At least one average of the subgroups is different from the other
averages
E.g. $\mu_1 \neq \mu_2 \neq \mu_3$, OR $\mu_1 = \mu_2$ but $\mu_3$ and $\mu_4$ are
different, etc.
Underlying Assumptions
Normality of the dependent variable
Plots (primarily) and other tests
Homogeneity (equality) of variance across the
groups
This can be relaxed.
Example
SPSS output:
Test statistic - F:
p value (sig value):
Decision: ..H0
Interpretation
What does this mean?
There is evidence from our sample indicating that
the importance of .. differs by age.
Question ...
Which group or groups are possibly different?
This is not given through the hypothesis test!
For our case, we can examine the sample means - is
this reliable?
Need more information than your own ability to
distinguish between numbers!
Especially if you have many sub-groups, as is the case
here!
Levene's test
It tests the null hypothesis that the population
variances are equal across all the sub-groups being
examined.
Interpretation
The importance of speed of repairs does differ by
age. We just identified which ones are different!
Warning!!
Do NOT do a series of 2 sample tests to discover
these differences!
You will artificially (and incorrectly) increase the chance of
finding a statistically significant difference in your sample!
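The size of the problem can be sketched with a short calculation. With four age groups there are six possible pairwise comparisons; if each were run at the 5% level and the tests were independent (a simplifying assumption), the chance of at least one false positive would be well above 5%.

```python
from math import comb

alpha = 0.05              # significance level of each individual test
groups = 4                # e.g. four age groups
comparisons = comb(groups, 2)  # number of distinct pairs of groups

# Probability of at least one false positive across all comparisons,
# assuming the tests are independent.
familywise_error = 1 - (1 - alpha) ** comparisons

print(f"{comparisons} comparisons, familywise error rate = {familywise_error:.3f}")
```

Post hoc procedures exist precisely to keep this overall error rate under control.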
NOTE:
Main Points from ANOVA
ANOVA helps understand differences between
more than 2 independent groups
Post Hoc tests will identify which group(s) are
different
Choice of post hoc test depends on whether equal
variances can be assumed or not
Repeated Measures
ANOVA
Outline
Further Expanding our toolbox of techniques
Extension beyond 2 sets of responses:
Repeated Measures ANOVA
Extension beyond 2 sets of
responses
i.e., looking beyond the paired t-test
Difference in more than 2 sets of responses by the
same individuals
Need to account for the fact that the responses are
not independent
i.e., each respondent has provided information on each
question
Need to use Repeated Measures ANOVA
Through General Linear Models
What is GLM
The general linear model (GLM) is a statistical linear
model. It may be written as
Y = XB + U
where
Y is a matrix with series of multivariate responses
X is a design matrix
B is a parameter matrix (to be estimated)
U is a matrix containing errors
Examples of questions
Has consumers' attitude towards brand X changed
over time (measured monthly for the last 6 months)?
Does on-going training improve participants' skill
development?
Is there a difference in respondents' liking of the four
brands of soft drink?
Question Hypothesis
Broad Question:
What attributes are important to consumers?
Analysis Q:
Are all the attributes equally important?
H0: There is no difference in the average rating of
importance of the listed attributes.
H1: There is a difference in the average rating of
importance of the listed attributes.
Steps to RMA
Interpretation
There is a significant difference in people's average
importance ratings of the various attributes of a customer service
company (p<.05).
Interpretation
Attribute 1 is statistically different in importance from
attributes 3,4,5,7..
Module 6 - Exploring
Relationships
Relationships in general
Revisiting Crosstabs and Chi-square
Introduction to Correlation
Outline
Recap
Recall our types of analysis
Descriptive Analysis
E.g., Means, medians, frequency, standard deviation
Inferential and Difference Analysis
E.g., T-tests, ANOVA
Associative Analysis
E.g., Correlation, Crosstabs with chi-square
Predictive Analysis
E.g., Regression
Associative Analysis
Types of relationships
Non-monotonic
Monotonic
Linear
Curvilinear and non-linear
Examples
Examples
This is also Monotonic
Example
[Figure: scatter plot of Sales vs Price (x-axis: price, 0 to 6; y-axis: sales, 0 to 40)]
The association between the 2 variables is described by a curve rather than a straight line.
Eg U shape, J shape
Variables:
Kuznets curve
J Shaped Curve
Example - curvilinear relationship
[Figure: Income vs Age (x-axis: Age, 0 to 100; y-axis: Income, 0 to 70)]
Presence
a relationship exists between 2 variables
Direction
whether the relationship is positive or negative
Strength of association
strong, moderate, weak or nonexistent
how consistent is the relationship
Summary of relationship characteristics
Nonmonotonic X X
Monotonic X
Linear
Non-linear
Examining Non-monotonic relationships -
Recap on Crosstabs & Chi square
Crosstabs with Chi-square are used to assess the
presence or not of a non-monotonic relationship
Recall
Variables are nominal/ordinal
Counts the number of observations in each possible
sub-group or cell
Example:
From the example
SPSS output
p level: 0.005
Reject null
There is evidence to say that there is an association.
Need to now go and describe the association. How?
Remember
Linear relationships - Correlation Analysis
What different
levels of
correlation look
like.
Rules of thumb
Pearson Correlation
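Pearson's r can be computed directly from its definition: the co-variation of the two variables divided by the square root of the product of their individual variations. The price and sales figures below are hypothetical and chosen only to show a strong negative linear relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: sales fall as price rises.
price = [1, 2, 3, 4, 5]
sales = [35, 30, 22, 18, 10]

r = pearson_r(price, sales)
print(f"r = {r:.3f}")  # sign gives the direction, magnitude gives the strength
```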
Example
Interpretation
Scatter Plot
What can it tell us?
How precisely can one describe the relationship?
[Figure: scatter plot; axis labels Number of Cars and Number of Kids, SPSS variable names RECMD46E and TPERF46C, horizontal axis 0 to 7]
Regression
What is it?
Principles of regression
Equation: $Y = a + bX + \varepsilon$
$\varepsilon$ is the error
Estimation through the method of least squares (MLS):
i.e. finds the line which minimises the sum of the squared
errors, $\sum e^2$, over all the x values
Assumptions
error ($\varepsilon$) has mean of 0
variance of the error terms is constant
variance of errors is independent of the values of X
errors are normally distributed
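The least squares line has a closed-form solution: the slope is the co-variation of X and Y divided by the variation of X, and the intercept makes the line pass through the point of means. A minimal sketch with made-up comfort and overall satisfaction ratings (hypothetical data, illustrative names):

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    a = mean_y - b * mean_x
    return a, b


# Hypothetical ratings on a 1-7 scale.
comfort = [1, 2, 3, 4, 5, 6, 7]
overall = [2, 3, 4, 4, 5, 6, 7]

a, b = fit_line(comfort, overall)

# Errors: actual value minus the value predicted by the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(comfort, overall)]
sum_squared_errors = sum(e ** 2 for e in residuals)

print(f"Y = {a:.3f} + {b:.3f} X, sum of squared errors = {sum_squared_errors:.3f}")
```

Any other straight line through these points would give a larger sum of squared errors; the residuals of the fitted line also sum to zero.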
[Figure: scatter plot of RECMD46E with the fitted line; for one observation the error is the vertical gap $e = Y_{actual} - Y_{pred}$]
Principles of regression (assumptions)
Example
regression with 1 independent variable
Analysis question:
Is the overall satisfaction with the car usefully predicted by
satisfaction with the attribute comfort?
Dependent (Criterion) variable (Y):
Overall satisfaction
Predictor variable (X):
Comfort of the car
First:
Look at:
correlations
scatter plot
Why?
Output
Look at:
Equation -
Resulting equation:
Ov Sat = 1.878 + 0.65 * Comfort Satisfaction
Interpretation of equation and coefficient:
For each unit increase in comfort satisfaction, predicted overall
satisfaction increases by 0.65 units
Module 7 - Exploring
Relationships
Extend this to Multiple Regression
Marketing relationships are complex, so we need to go
beyond simple regression
Equation:
Y = a + b1X1 + b2X2 + b3X3 + e
Additional assumption now:
Predictor or independent variables (Xs) are uncorrelated
If this is not obeyed, you have the problem of
Multicollinearity
Multicollinearity
Why is this a concern?
Affects the significance of the coefficients
Reduces the efficiency of the estimates (of the
coefficients)
Creates problems interpreting the coefficients
Subsequent use of the coefficients
Problems if you want to use the coefficients as a basis for
strategy, since there you are interested in the
importance of each variable
Multicollinearity
Procedure 1
ENTER Method
Output
Fit of equation?
Usefulness of the regression procedure?
Which variables significant?
Problem of multi-collinearity?
Look at VIF>8
What do you do now?
Remove the offending variable(s) and re-run the
regression
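A minimal sketch of how a variance inflation factor (VIF) can be computed, assuming NumPy is available. Each predictor is regressed on the other predictors and VIF = 1 / (1 - R²). The three predictors are hypothetical and built so that the first two are nearly identical; the cut-off of 8 is the rule of thumb quoted above.

```python
import numpy as np

def r_squared(X, y):
    """R-squared from a least-squares regression of y on X (with a constant)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def vif(X):
    """VIF for each column of X: regress it on all the other columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        factors.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return factors

# Hypothetical predictors: x2 is almost a copy of x1, x3 is unrelated.
i = np.arange(20, dtype=float)
x1 = i
x2 = i + 0.5 * np.sin(i)
x3 = np.where(i % 2 == 0, 1.0, -1.0)

vifs = vif(np.column_stack([x1, x2, x3]))
for name, value in zip(["x1", "x2", "x3"], vifs):
    flag = "VIF > 8, candidate for removal" if value > 8 else "ok"
    print(name, round(value, 2), flag)
```

Here x1 and x2 are flagged and x3 is not, so one of x1 or x2 would be dropped before re-running the regression.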
Interpretation of equation
Ov Sat = Constant + b1*X1 + b2*X2 + ...
Interpretation of coefficients
Interpretation
Conducting Regression Analysis
Scatter plots
Significance of variables
Overall fit of equation
Assumptions obeyed?
If NO at any check: Rethink overall procedure
Procedure 2
STEPWISE REGRESSION
What is Stepwise Regression?
Stepwise regression
Various types:
Forward - one in, then adds one at a time
Backwards - all in, then eliminates one at a time
Stepwise - variables can enter or leave the
equation at each step depending on their
contribution to explaining the dependent variable
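A simplified sketch of the forward variant, assuming NumPy is available. At each step it adds the predictor that raises R² the most and stops when the gain falls below a threshold. The min_gain threshold is an assumption made for illustration; statistical packages normally use significance tests for entry and removal. The data are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R-squared from a least-squares regression of y on X (with a constant)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def forward_select(X, y, min_gain=0.01):
    """Add one predictor at a time while R-squared improves by at least min_gain."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected, remaining, best_r2 = [], list(range(X.shape[1])), 0.0
    while remaining:
        # Score every candidate when added to the variables already chosen.
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2

# Hypothetical data: y depends on columns 0 and 1 only; column 2 is irrelevant.
i = np.arange(20, dtype=float)
X = np.column_stack([i, np.where(i % 2 == 0, 1.0, -1.0), np.sin(i)])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

selected, r2 = forward_select(X, y)
print("entered in order:", selected, "R-squared:", round(r2, 4))
```

Column 0 enters first, column 1 second, and column 2 never enters because it adds nothing once the other two are in the equation.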
Stepwise regression
Warning:
May not produce the best equation
Which variables end up in the equation depends on the
multicollinearity present
Example:
Testing Assumptions of Regression
Assumptions of regression
Definitions and testing
Heteroscedasticity
variance of the errors is not constant
Testing: plot residuals (y axis) against predicted y (on X
axis)
Autocorrelation
the size of error is related to time; errors are not
independent
Testing: plot residuals (y axis) against time (X axis)
Normality
errors are normally distributed
Testing: normal probability plot
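A minimal sketch of how the values for these residual plots can be produced, assuming NumPy is available. The data are hypothetical and include a slow wave over time so that the errors are clearly not independent. The lag-1 correlation of the residuals is an extra numeric check added here for illustration; the slides themselves rely on the plots.

```python
import numpy as np

def fit_and_residuals(x, y):
    """Fit Y = a + bX by least squares; return predicted values and residuals."""
    A = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    predicted = A @ coef
    return predicted, y - predicted

# Hypothetical time-ordered data with a slow wave left in the errors.
t = np.arange(20, dtype=float)
x = t
y = 1.0 + 2.0 * x + np.sin(t / 3.0)

predicted, residuals = fit_and_residuals(x, y)

# Heteroscedasticity check: plot residuals (y axis) against predicted (x axis).
# Autocorrelation check: plot residuals (y axis) against t (x axis).
lag1 = float(np.corrcoef(residuals[:-1], residuals[1:])[0, 1])
print("mean residual:", round(float(residuals.mean()), 10))
print("lag-1 residual correlation:", round(lag1, 2))
```

A lag-1 correlation well above zero means that neighbouring errors move together, which is the pattern a residuals-against-time plot would show as a wave rather than a random scatter.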
Heteroscedasticity
Autocorrelation
Ideal situation
Expanding the Applicability
of Regression
Incorporating non-metric
Independent Variables
Usual scale for independent variable?
Non-metric variables - incorporated via dummy
variables
Dummy variable is a variable that only takes the
value 0 or 1
Dependent variable must always be metric!
Defining a dummy variable
X variables Y Variable
Dummy variable regression
Overall Equation:
Satis = a + b1*Dummy_gender +b2*Quality
This reduces to -
Males:
Satis = a + b1 +b2*Quality
Females:
Satis = a + b2*Quality
The non-metric
variable has the
effect of changing
the constant for each
category.
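A minimal sketch of the dummy variable regression above, assuming NumPy is available. The data are hypothetical and generated without noise from Satis = 1.0 + 0.5*Dummy_gender + 0.4*Quality (male = 1, female = 0), so the fit should recover those values and show the shift in the constant.

```python
import numpy as np

# Hypothetical respondents: quality rating and gender dummy (1 = male, 0 = female).
quality = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
dummy_gender = np.array([1, 0, 1, 0, 1, 0, 1, 0], dtype=float)

# Noise-free illustrative outcome so the true coefficients are known.
satis = 1.0 + 0.5 * dummy_gender + 0.4 * quality

# Design matrix columns: constant, dummy, quality.
A = np.column_stack([np.ones_like(quality), dummy_gender, quality])
(a, b1, b2), *_ = np.linalg.lstsq(A, satis, rcond=None)

constant_male = a + b1   # Males:   Satis = (a + b1) + b2*Quality
constant_female = a      # Females: Satis = a + b2*Quality

print("a =", round(float(a), 3), "b1 =", round(float(b1), 3), "b2 =", round(float(b2), 3))
print("constant for males:", round(float(constant_male), 3))
print("constant for females:", round(float(constant_female), 3))
```

Both groups share the slope b2, so the two fitted lines are parallel and differ only in their constants, by b1.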
[Figure: line chart with two series, Satis-male and Satis-female]
Main Points from Module