
Introduction to Data Analysis

Basics

Levels of Measurement: Nominal, Ordinal, Interval, Ratio
Variables: Independent, Dependent, Moderating, Mediating, Control
Key Terms: Concepts, Construct, Variable, Definition (Dictionary, Operational)
Data Analysis Process

STAGES OF DATA ANALYSIS


EDITING

CODING

DATA ENTRY

DATA ANALYSIS

ERROR CHECKING AND VERIFICATION

Introduction

Preparation of Data
Editing, handling blank responses, coding, categorization and data entry.
These activities ensure the accuracy of the data and its conversion from raw form to reduced data.

Exploring, Displaying and Examining Data
Breaking down, inspecting and rearranging data to start the search for meaningful descriptions, patterns and relationships.

Coding Rules

Categories should be:
Appropriate to the research problem
Exhaustive
Mutually exclusive
Derived from one classification principle

Appropriateness
Let's say your population is students at institutions of higher learning.
What is your age group?

15-25 years
26-35 years
36-45 years
Above 45 years

Exhaustiveness

What is your race?

Malay
Chinese
Indians
Others

Mutual Exclusivity

What is your occupation type?

Professional
Managerial
Sales
Clerical
Others

Crafts
Operatives
Unemployed
Housewife

Single Dimension
What is your occupation type?

Professional
Managerial
Sales
Clerical
Others

Crafts
Operatives
Unemployed
Housewife

Coding Open-ended Responses

Coding Open Ended Questions

Handling Blank Responses

How do we take care of missing responses?
If more than 25% of the responses are missing, throw out the questionnaire.

Other ways of handling missing responses:
Use the midpoint of the scale
Ignore (system missing)
Mean of those responding
Mean of the respondent
Random number
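
The options above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure: the function names, the 1-5 response scale and the sample answers are assumptions made for the example. Applying `fill_with_mean` to one item across respondents gives the "mean of those responding"; applying it to one respondent's own answers gives the "mean of the respondent".

```python
from statistics import mean

def exceeds_missing_limit(answers, limit=0.25):
    """Return True when more than `limit` of the answers are blank (None)."""
    blanks = sum(1 for a in answers if a is None)
    return blanks / len(answers) > limit

def fill_with_midpoint(answers, low=1, high=5):
    """Replace blanks with the midpoint of the response scale (assumed 1-5)."""
    midpoint = (low + high) / 2
    return [midpoint if a is None else a for a in answers]

def fill_with_mean(answers):
    """Replace blanks with the mean of the non-blank values."""
    observed = [a for a in answers if a is not None]
    m = mean(observed)
    return [m if a is None else a for a in answers]

# Hypothetical answers of one respondent to five items; None marks a blank.
respondent = [4, None, 5, 3, 4]
print(exceeds_missing_limit(respondent))  # 1 blank out of 5 is 20%
print(fill_with_midpoint(respondent))
print(fill_with_mean(respondent))
```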

How to Select a Test

Measurement scale: Nominal
One-sample case: Binomial; χ² one-sample test
Two-sample tests, related samples: McNemar
Two-sample tests, independent samples: Fisher exact test; χ² two-samples test
k-sample tests, related samples: Cochran Q
k-sample tests, independent samples: χ² for k samples

Measurement scale: Ordinal
One-sample case: Kolmogorov-Smirnov one-sample test; Runs test
Two-sample tests, related samples: Sign test; Wilcoxon matched-pairs test
Two-sample tests, independent samples: Median test; Mann-Whitney; Kolmogorov-Smirnov; Wald-Wolfowitz
k-sample tests, related samples: Friedman two-way ANOVA
k-sample tests, independent samples: Median extension; Kruskal-Wallis one-way ANOVA

Measurement scale: Interval and Ratio
One-sample case: t-test; Z test
Two-sample tests, related samples: t-test for paired samples
Two-sample tests, independent samples: t-test; Z test
k-sample tests, related samples: Repeated-measures ANOVA
k-sample tests, independent samples: One-way ANOVA; n-way ANOVA

Data Transformation

Weights
Assigning numbers to responses on a pre-determined rule

Respecification of the Variable
Transforming existing data to form new variables or items
Recode
Compute

Scale Transformation
Reasons for transformation:
to improve interpretation and compatibility with other data sets
to enhance symmetry and stabilize spread
to improve the linear relationship between the variables (standardized score)

$z_i = \frac{X_i - \bar{X}}{s}$
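
As a small illustration of the standardized score, the sketch below computes z-scores for a list of values. The function name and the sample data are hypothetical; the sample standard deviation (n - 1 in the denominator) is assumed for s.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - x_bar) / s for x in values]

scores = [2, 4, 6]  # hypothetical raw scores
print(z_scores(scores))
```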

Data Transformation
Section 1 - Computer Anxiety
Computers make me feel uncomfortable
I get a sinking feeling when I think of trying to use a computer
Computers scare me
I feel comfortable using a computer
Working with a computer makes me nervous

Sample SPSS Codebook

Research Model
Attitude (5 items)
Subjective norm (4 items)
Perceived Behavioral Control (4 items)
Intention to Share Information (5 items)
Actual Sharing of Information (3 items)

Factor Analysis - Command

Assumptions in FA

Question:
How valid is our instrument?

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy       .882
Bartlett's Test of Sphericity    Approx. Chi-Square   2878.230
                                 df                   78
                                 Sig.                 .000

KMO should be > 0.5.
Bartlett's Test should be significant, i.e. p < 0.05.

Measure of Sampling Adequacy

MSA              Comment
0.80 and above   Meritorious
0.70-0.80        Middling
0.60-0.70        Mediocre
0.50-0.60        Miserable
Below 0.50       Unacceptable

Assumptions in FA

Anti-image Matrices

Anti-image Covariance
        Att1    Att2    Att3    Att4    Att5    Sn1     Sn2     Sn3     Sn4     Pbc1    Pbc2    Pbc3    Pbc4
Att1    .045    -.021   -.032   -.036   -.029   .014    -.013   -.027   .027    -.003   -.007   .011    -.011
Att2    -.021   .123    -.016   -.018   -.015   -.020   .000    .016    -.038   -.013   -.009   .004    -.010
Att3    -.032   -.016   .107    .012    -.018   -.015   .021    -.010   -.014   -.005   .009    -.020   .014
Att4    -.036   -.018   .012    .161    -.014   .000    -.013   .033    -.020   .001    .022    -.007   -.005
Att5    -.029   -.015   -.018   -.014   .106    -.019   .005    .024    -.012   .017    -.006   -.007   .022
Sn1     .014    -.020   -.015   .000    -.019   .317    -.121   -.063   .033    -.003   -.003   .070    -.052
Sn2     -.013   .000    .021    -.013   .005    -.121   .233    -.066   -.072   .014    -.003   -.041   .029
Sn3     -.027   .016    -.010   .033    .024    -.063   -.066   .223    -.127   -.021   .035    .002    .004
Sn4     .027    -.038   -.014   -.020   -.012   .033    -.072   -.127   .375    .004    -.008   -.001   .022
Pbc1    -.003   -.013   -.005   .001    .017    -.003   .014    -.021   .004    .253    -.161   .013    -.069
Pbc2    -.007   -.009   .009    .022    -.006   -.003   -.003   .035    -.008   -.161   .243    -.076   .011
Pbc3    .011    .004    -.020   -.007   -.007   .070    -.041   .002    -.001   .013    -.076   .247    -.164
Pbc4    -.011   -.010   .014    -.005   .022    -.052   .029    .004    .022    -.069   .011    -.164   .250

Anti-image Correlation
        Att1    Att2    Att3    Att4    Att5    Sn1     Sn2     Sn3     Sn4     Pbc1    Pbc2    Pbc3    Pbc4
Att1    .863a   -.286   -.458   -.421   -.420   .118    -.130   -.272   .210    -.031   -.067   .100    -.108
Att2    -.286   .961a   -.142   -.131   -.131   -.100   -.002   .094    -.179   -.073   -.050   .025    -.054
Att3    -.458   -.142   .936a   .095    -.165   -.084   .131    -.064   -.071   -.030   .055    -.125   .087
Att4    -.421   -.131   .095    .942a   -.105   .001    -.066   .175    -.081   .003    .111    -.036   -.027
Att5    -.420   -.131   -.165   -.105   .939a   -.102   .031    .154    -.058   .105    -.039   -.044   .134
Sn1     .118    -.100   -.084   .001    -.102   .891a   -.447   -.236   .095    -.012   -.010   .252    -.185
Sn2     -.130   -.002   .131    -.066   .031    -.447   .899a   -.290   -.243   .060    -.015   -.171   .119
Sn3     -.272   .094    -.064   .175    .154    -.236   -.290   .879a   -.438   -.088   .153    .007    .019
Sn4     .210    -.179   -.071   -.081   -.058   .095    -.243   -.438   .886a   .014    -.025   -.005   .073
Pbc1    -.031   -.073   -.030   .003    .105    -.012   .060    -.088   .014    .785a   -.650   .052    -.273
Pbc2    -.067   -.050   .055    .111    -.039   -.010   -.015   .153    -.025   -.650   .767a   -.311   .046
Pbc3    .100    .025    -.125   -.036   -.044   .252    -.171   .007    -.005   .052    -.311   .732a   -.662
Pbc4    -.108   -.054   .087    -.027   .134    -.185   .119    .019    .073    -.273   .046    -.662   .749a

a. Measures of Sampling Adequacy (MSA)

How many Factors?

Total Variance Explained

Initial Eigenvalues
Component   Total   % of Variance   Cumulative %
1           6.730   51.772          51.772
2           3.328   25.596          77.368
3           .962    7.400           84.769
4           .439    3.376           88.145
5           .432    3.325           91.470
6           .232    1.786           93.256
7           .215    1.651           94.907
8           .183    1.407           96.313
9           .131    1.010           97.324
10          .124    .951            98.275
11          .102    .785            99.059
12          .088    .673            99.733
13          .035    .267            100.000

Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %
1           6.730   51.772          51.772
2           3.328   25.596          77.368
3           .962    7.400           84.769

Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %
1           4.476   34.427          34.427
2           3.287   25.288          59.715
3           3.257   25.054          84.769

Extraction Method: Principal Component Analysis.
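
The "% of Variance" column can be reproduced from the eigenvalues. The sketch below is an illustration only, not the SPSS routine; it assumes the analysis is run on the correlation matrix, so that each of the 13 items contributes one unit of variance. It also counts the components with an eigenvalue above 1, a common rule of thumb that is not stated on the slide; the output above retains three components although only two exceed 1.

```python
# Initial eigenvalues copied from the Total Variance Explained table.
eigenvalues = [6.730, 3.328, 0.962, 0.439, 0.432, 0.232, 0.215,
               0.183, 0.131, 0.124, 0.102, 0.088, 0.035]

def variance_explained(eigs):
    """Return (% of variance, cumulative %) for each component."""
    n_vars = len(eigs)  # total variance equals the number of standardized items
    pct = [100 * e / n_vars for e in eigs]
    cumulative = []
    running = 0.0
    for p in pct:
        running += p
        cumulative.append(running)
    return pct, cumulative

pct, cumulative = variance_explained(eigenvalues)
n_above_one = sum(1 for e in eigenvalues if e > 1)  # eigenvalue-greater-than-1 rule of thumb
print(round(pct[0], 1), round(cumulative[2], 1), n_above_one)
```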

How many Factors? - Scree Plot

[Figure: scree plot of eigenvalue (vertical axis) against component number, 1 to 13 (horizontal axis)]

Rotation: Orthogonal or Non-Orthogonal

[Figure: two panels, "Orthogonal Factor Rotation" and "Oblique Factor Rotation". Each plots variables V1 to V5 against the unrotated Factor I and Factor II axes (-1.0 to +1.0) with the rotated Factor I and Factor II axes overlaid.]

Assigning Questions

Rotated Component Matrix(a)
        Component
        1       2       3
Att1    .897    .146    .377
Att2    .855    .197    .375
Att3    .871    .128    .369
Att4    .885    .098    .296
Att5    .907    .053    .319
Sn1     .401    -.046   .758
Sn2     .409    -.013   .825
Sn3     .376    -.070   .843
Sn4     .264    -.081   .820
Pbc1    .129    .888    .025
Pbc2    .106    .894    -.062
Pbc3    .067    .892    -.064
Pbc4    .076    .894    -.043
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.

Communalities
        Initial   Extraction
Att1    1.000     .967
Att2    1.000     .910
Att3    1.000     .910
Att4    1.000     .880
Att5    1.000     .928
Sn1     1.000     .738
Sn2     1.000     .847
Sn3     1.000     .857
Sn4     1.000     .749
Pbc1    1.000     .806
Pbc2    1.000     .815
Pbc3    1.000     .805
Pbc4    1.000     .808
Extraction Method: Principal Component Analysis.

Communality is the amount of shared, or common, variance among the variables.
General guideline: all communalities should be above 0.5.
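
With an orthogonal rotation, an item's communality equals the sum of its squared loadings across the retained components. The sketch below checks this for three items, using loadings copied from the Rotated Component Matrix. The helper name is hypothetical, and small differences from the SPSS "Extraction" column are expected because the loadings are rounded to three decimals.

```python
# Rotated loadings (components 1, 2, 3) copied from the table above.
loadings = {
    "Att1": (0.897, 0.146, 0.377),
    "Sn1": (0.401, -0.046, 0.758),
    "Pbc1": (0.129, 0.888, 0.025),
}

def communality(row):
    """Sum of squared loadings of one item across the retained components."""
    return sum(value ** 2 for value in row)

for item, row in loadings.items():
    print(item, round(communality(row), 3))
```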

Significant Loadings

Factor Loading   Sample Size Needed
0.30             350
0.35             250
0.40             200
0.45             150
0.50             120
0.55             100
0.60             85
0.65             70
0.70             60
0.75             50

Table in Report

                     Component
                     1       2       3
Att1                 .897    .146    .377
Att2                 .855    .197    .375
Att3                 .871    .128    .369
Att4                 .885    .098    .296
Att5                 .907    .053    .319
Sn1                  .401    -.046   .758
Sn2                  .409    -.013   .825
Sn3                  .376    -.070   .843
Sn4                  .264    -.081   .820
Pbc1                 .129    .888    .025
Pbc2                 .106    .894    -.062
Pbc3                 .067    .892    -.064
Pbc4                 .076    .894    -.043
Eigenvalue           4.476   3.287   3.257
% Variance (84.77)   34.43   25.29   25.05

Reliability - Command

Reliability
Question: How reliable are our instruments?

Reliability Statistics
Cronbach's Alpha   N of Items
.977               5

Item-Total Statistics
        Scale Mean if   Scale Variance if   Corrected Item-Total   Cronbach's Alpha
        Item Deleted    Item Deleted        Correlation            if Item Deleted
Att1    15.25           6.681               .973                   .965
Att2    15.26           6.560               .925                   .972
Att3    15.24           6.906               .929                   .972
Att4    15.21           6.825               .900                   .975
Att5    15.25           6.555               .935                   .970

The corrected item-total correlation should preferably be > 0.3.
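
For reference, Cronbach's alpha can be computed as alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below implements that formula. The item responses are made up for illustration (the raw data behind the SPSS output are not reproduced in the slides), and the function name is an assumption.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers of five respondents to a three-item scale.
toy_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 3, 4, 1],
]
print(cronbach_alpha(toy_items))
```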

Reliability

Reliability Statistics
Cronbach's Alpha   N of Items
.912               4

Item-Total Statistics
       Scale Mean if   Scale Variance if   Corrected Item-Total   Cronbach's Alpha
       Item Deleted    Item Deleted        Correlation            if Item Deleted
Sn1    11.20           4.243               .761                   .900
Sn2    11.03           4.135               .855                   .868
Sn3    11.00           4.021               .856                   .867
Sn4    11.21           4.250               .736                   .909

Reliability

Reliability Statistics
Cronbach's Alpha   N of Items
.919               4

Item-Total Statistics
        Scale Mean if   Scale Variance if   Corrected Item-Total   Cronbach's Alpha
        Item Deleted    Item Deleted        Correlation            if Item Deleted
Pbc1    10.48           4.984               .814                   .895
Pbc2    10.45           4.793               .826                   .892
Pbc3    10.43           5.042               .809                   .897
Pbc4    10.40           5.246               .814                   .897

Reliability

Reliability Statistics
Cronbach's Alpha   N of Items
.966               5

Item-Total Statistics
           Scale Mean if   Scale Variance if   Corrected Item-Total   Cronbach's Alpha
           Item Deleted    Item Deleted        Correlation            if Item Deleted
Intent1    15.28           6.591               .951                   .951
Intent2    15.28           6.612               .888                   .961
Intent3    15.29           6.553               .901                   .959
Intent4    15.28           6.716               .877                   .962
Intent5    15.24           6.445               .904                   .958

Table in Report

Variable    N of Item   Item Deleted   Alpha
Attitude                               0.977
SN                                     0.912
Pbcontrol                              0.919
Intention                              0.966
Actual                                 0.771

Example - Recoding
Perceived Enjoyment
PE1   The actual process of using Instant Messenger is pleasant
PE2   I have fun using Instant Messenger
PE3   Using Instant Messenger bores me
PE4   Using Instant Messenger provides me with a lot of enjoyment
PE5   I enjoy using Instant Messenger
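
PE3 is worded negatively, so it is typically reverse-scored before the items are combined into one scale score. The sketch below illustrates the recode and compute steps outside SPSS. The 1-5 response scale, the respondent's answers and the function name are assumptions made for the example.

```python
def reverse_code(score, low=1, high=5):
    """Reverse a score on a low..high scale (1 becomes 5, 2 becomes 4, ...)."""
    return low + high - score

# Hypothetical answers of one respondent on an assumed 1-5 scale.
respondent = {"PE1": 4, "PE2": 5, "PE3": 2, "PE4": 4, "PE5": 5}

recoded = dict(respondent)
recoded["PE3"] = reverse_code(respondent["PE3"])  # negatively worded item

# New variable: mean of the five (recoded) items.
enjoyment = sum(recoded.values()) / len(recoded)
print(recoded["PE3"], enjoyment)
```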

Recoding - Command

Data before Transformation

Computing New Variable - Command

Data after Transformation

Frequencies - Command

Frequencies
Questions:
1. Is our sample representative?
2. Are there data entry errors?

Gender
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male     144         75.0      75.0            75.0
        Female   48          25.0      25.0            100.0
        Total    192         100.0     100.0

Current Position
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Technician      34          17.7      17.7            17.7
        Engineer        66          34.4      34.4            52.1
        Sr Engineer     54          28.1      28.1            80.2
        Manager         32          16.7      16.7            96.9
        Above manager   6           3.1       3.1             100.0
        Total           192         100.0     100.0

Table in Report

                   Frequency   Percentage
Gender
  Male             144         75.0
  Female           48          25.0
Position
  Technician       34          17.7
  Engineer         66          34.4
  Sr Engineer      54          28.1
  Manager          32          16.7
  Above manager    6           3.1

Descriptives - Command

Descriptives

Descriptive Statistics
(The minimum values for the two "years" variables did not survive in the source output and are left blank.)

                                     N     Minimum   Maximum   Mean     Std. Deviation   Skewness (Std. Error)   Kurtosis (Std. Error)
Age                                  192   19        53        33.39    8.823            .667 (.175)             -.557 (.349)
Years working in the organization    192             18        5.36     4.435            1.448 (.175)            1.333 (.349)
Total years of working experience    192             28        9.04     7.276            1.051 (.175)            -.025 (.349)
Attitude                             192   2.00      5.00      3.8104   .64548           -.480 (.175)            .242 (.349)
subjective                           192   2.00      5.00      3.7031   .67034           -.101 (.175)            .755 (.349)
Pbcontrol                            192   2.00      5.00      3.4792   .73672           .015 (.175)             -.028 (.349)
Intention                            192   2.00      5.00      3.8188   .63877           -.528 (.175)            .687 (.349)
Actual                               192   2.33      5.00      4.0625   .58349           -.361 (.175)            -.328 (.349)
Valid N (listwise)                   192

Question:
1. Is there variation in our data?
2. What is the level of the phenomenon we are measuring?

Table in Report

                     Mean   Std. Deviation
Attitude             3.81   0.65
Subjective Norm      3.70   0.67
Behavioral Control   3.48   0.74
Intention            3.82   0.64
Actual               4.06   0.58

Chi Square Test - Command

Crosstabulation
Question: Is level of sharing dependent on gender?

Gender * Intention Level Crosstabulation
                                      Intention Level
                                      Low      High     Total
Male     Count                        110      34       144
         % within Gender              76.4%    23.6%    100.0%
         % within Intention Level     70.5%    94.4%    75.0%
         % of Total                   57.3%    17.7%    75.0%
Female   Count                        46       2        48
         % within Gender              95.8%    4.2%     100.0%
         % within Intention Level     29.5%    5.6%     25.0%
         % of Total                   24.0%    1.0%     25.0%
Total    Count                        156      36       192
         % within Gender              81.3%    18.8%    100.0%
         % within Intention Level     100.0%   100.0%   100.0%
         % of Total                   81.3%    18.8%    100.0%

Chi-Square Tests
                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             8.934(b)   1    .003
Continuity Correction(a)       7.704      1    .006
Likelihood Ratio               11.274     1    .001
Fisher's Exact Test                                          .002         .001
Linear-by-Linear Association   8.888      1    .003
N of Valid Cases               192

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.00.
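
The Pearson chi-square and the expected counts can be recomputed from the four observed counts in the crosstabulation. The sketch below is a plain-Python illustration rather than the SPSS routine, and the function name is hypothetical.

```python
def pearson_chi_square(table):
    """Return (chi-square, expected counts) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)
    return chi2, expected

# Rows: Male, Female. Columns: Low, High intention (counts from the table above).
observed = [[110, 34], [46, 2]]
chi2, expected = pearson_chi_square(observed)
smallest_expected = min(min(row) for row in expected)
print(round(chi2, 3), smallest_expected)  # 8.934 9.0
```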

T-test - Command

t-test
Question: (2 Independent) Does intention to share vary by gender?

Group Statistics
            Gender   N     Mean     Std. Deviation   Std. Error Mean
Intention   Male     144   3.9000   .60302           .05025
            Female   48    3.5750   .68619           .09904

Independent Samples Test

Levene's Test for Equality of Variances (Intention): F = 3.591, Sig. = .060

t-test for Equality of Means (Intention)
                              t       df       Sig.         Mean         Std. Error   95% Confidence Interval
                                               (2-tailed)   Difference   Difference   of the Difference
                                                                                      Lower     Upper
Equal variances assumed       3.122   190      .002         .32500       .10410       .11965    .53035
Equal variances not assumed   2.926   72.729   .005         .32500       .11106       .10364    .54636
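
Both rows of the independent samples test can be reproduced from the group statistics alone. The sketch below computes the pooled-variance t (equal variances assumed) and the Welch t with Satterthwaite degrees of freedom (equal variances not assumed). The function names are hypothetical, and p-values are left out because they would need a t-distribution routine.

```python
from math import sqrt

def pooled_t(m1, s1, n1, m2, s2, n2):
    """t and df assuming equal variances (pooled standard error)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """t and Satterthwaite df without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df

# Group statistics for Intention: Male, then Female.
t_equal, df_equal = pooled_t(3.9000, 0.60302, 144, 3.5750, 0.68619, 48)
t_welch, df_welch = welch_t(3.9000, 0.60302, 144, 3.5750, 0.68619, 48)
print(round(t_equal, 3), df_equal, round(t_welch, 3), round(df_welch, 1))
```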

Paired t-test - Command

t-test
Question: (2 Dependent) Are there differences between intention to share and actual sharing behavior?

Paired Samples Statistics
                     Mean     N     Std. Deviation   Std. Error Mean
Pair 1   Intention   3.8188   192   .63877           .04610
         Actual      4.0625   192   .58349           .04211

Paired Samples Correlations
                              N     Correlation   Sig.
Pair 1   Intention & Actual   192   .817          .000

Paired Samples Test
Paired Differences, Pair 1 (Intention - Actual)
Mean                                        -.24375
Std. Deviation                              .37326
Std. Error Mean                             .02694
95% Confidence Interval of the Difference   Lower -.29688, Upper -.19062
t                                           -9.049
df                                          191
Sig. (2-tailed)                             .000

One Way ANOVA - Command

One way ANOVA
Question: (k independent) Does intention vary by position?

ANOVA
Intention
                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   7.864            4     1.966         5.247   .001
Within Groups    70.068           187   .375
Total            77.933           191

Intention
Duncan(a,b)
                         Subset for alpha = .05
Current Position   N     1        2
Engineer           66    3.6424
Manager            32    3.6625
Technician         34    3.8941
Sr Engineer        54    4.0000
Above manager      6              4.5333
Sig.                     .101     1.000
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 19.157.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.

Mann-Whitney - Command

Mann-Whitney
Question: (2 independent) Do the variables vary by gender?

Ranks
            Gender   N     Mean Rank   Sum of Ranks
Intention   Male     144   103.64      14924.00
            Female   48    75.08       3604.00
            Total    192

Test Statistics(a)
                         Intention
Mann-Whitney U           2428.000
Wilcoxon W               3604.000
Z                        -3.266
Asymp. Sig. (2-tailed)   .001

a. Grouping Variable: Gender
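
The Mann-Whitney U in the output follows directly from the rank sums: for each group, U = (sum of ranks) - n(n + 1)/2, and the reported statistic is the smaller of the two. A minimal sketch, with a hypothetical function name:

```python
def mann_whitney_u(rank_sum_1, n1, rank_sum_2, n2):
    """Return the smaller U statistic computed from the two rank sums."""
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = rank_sum_2 - n2 * (n2 + 1) / 2
    return min(u1, u2)

# Sum of ranks and group sizes from the Ranks table: Male, then Female.
u = mann_whitney_u(14924.00, 144, 3604.00, 48)
print(u)  # 2428.0
```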

Kruskal-Wallis - Command

Kruskal-Wallis
Question: (k independent) Do the variables vary by position?

Ranks
            Position        N     Mean Rank
Intention   Technician      34    101.32
            Engineer        66    79.68
            Sr Engineer     54    114.54
            Manager         32    81.63
            Above manager   6     171.17
            Total           192

Test Statistics(a,b)
              Intention
Chi-Square    28.179
df            4
Asymp. Sig.   .000

a. Kruskal Wallis Test
b. Grouping Variable: Position

Correlation - Command

Correlation (Interval/ratio)
Question: Are the variables related?

Correlations
Pearson Correlation, with Sig. (2-tailed) in parentheses; N = 192 for every pair.

             Attitude        subjective      Pbcontrol       Intention       Actual
Attitude     1               .697** (.000)   .212** (.003)   .808** (.000)   .606** (.000)
subjective   .697** (.000)   1               -.052 (.471)    .653** (.000)   .552** (.000)
Pbcontrol    .212** (.003)   -.052 (.471)    1               .281** (.000)   .031 (.665)
Intention    .808** (.000)   .653** (.000)   .281** (.000)   1               .817** (.000)
Actual       .606** (.000)   .552** (.000)   .031 (.665)     .817** (.000)   1

**. Correlation is significant at the 0.01 level (2-tailed).

Table Presentation

             Attitude   subjective   Pbcontrol   Intention
Attitude     1
subjective   .740**
Pbcontrol    .201**     -.047
Intention    .885**     .662**       .326**
Actual       .660**     .553**       .059        .805**

*p< 0.05, **p< 0.01

Correlation (Ordinal)
Question: Are the variables related?

Correlations
Spearman's rho correlation coefficient, with Sig. (1-tailed) in parentheses; N = 130 for every pair.

              Pay             Promotion       Work            Supervision     Coworkers
Pay           1.000           .043 (.314)     -.415** (.000)  -.019 (.417)    -.090 (.154)
Promotion     .043 (.314)     1.000           -.476** (.000)  -.263** (.001)  -.259** (.001)
Work          -.415** (.000)  -.476** (.000)  1.000           .117 (.092)     .160* (.035)
Supervision   -.019 (.417)    -.263** (.001)  .117 (.092)     1.000           .029 (.372)
Coworkers     -.090 (.154)    -.259** (.001)  .160* (.035)    .029 (.372)     1.000

**. Correlation is significant at the 0.01 level (1-tailed).
*. Correlation is significant at the 0.05 level (1-tailed).

Friedman Test - Command

Friedman
Question: (k related samples) Is the rating similar?

Ranks
       Mean Rank
Sn1    2.34
Sn2    2.67
Sn3    2.70
Sn4    2.29

Test Statistics(a)
N             192
Chi-Square    43.149
df            3
Asymp. Sig.   .000

a. Friedman Test

Regression - Command

Multiple Regression
Question: Which variables can explain the intention to share?

Variables Entered/Removed(b)
Model   Variables Entered                    Variables Removed   Method
1       Pbcontrol, subjective, Attitude(a)   .                   Enter
a. All requested variables entered.
b. Dependent Variable: Intention

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .832(a)   .693       .688                .35703                       1.501
a. Predictors: (Constant), Pbcontrol, subjective, Attitude
b. Dependent Variable: Intention

Multiple Regression
ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   53.968           3     17.989        141.127   .000(a)
Residual     23.964           188   .127
Total        77.933           191
a. Predictors: (Constant), Pbcontrol, subjective, Attitude
b. Dependent Variable: Intention

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients                   Collinearity Statistics
Model 1       B       Std. Error            Beta                        t        Sig.   Tolerance   VIF
(Constant)    .191    .197                                              .971     .333
Attitude      .601    .059                  .607                        10.103   .000   .453        2.210
subjective    .227    .056                  .238                        4.043    .000   .472        2.116
Pbcontrol     .143    .037                  .165                        3.821    .000   .877        1.140
a. Dependent Variable: Intention
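
R Square, Adjusted R Square and F in the output above can be recovered from the ANOVA sums of squares. The sketch below shows the arithmetic with n = 192 cases and k = 3 predictors; the function name is hypothetical, and the results agree with the SPSS values only up to rounding of the printed sums of squares.

```python
def regression_fit(ss_regression, ss_residual, n, k):
    """Return (R squared, adjusted R squared, F) from the ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    r2 = ss_regression / ss_total
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f = (ss_regression / k) / (ss_residual / (n - k - 1))
    return r2, adjusted_r2, f

# Sums of squares from the ANOVA table; 192 cases, 3 predictors.
r2, adjusted_r2, f_value = regression_fit(53.968, 23.964, n=192, k=3)
print(round(r2, 3), round(adjusted_r2, 3), round(f_value, 1))
```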

Assumptions (Multicollinearity)

Collinearity Diagnostics(a)
                                                 Variance Proportions
Model   Dimension   Eigenvalue   Condition Index   (Constant)   Attitude   subjective   Pbcontrol
1       1           3.936        1.000             .00          .00        .00          .00
        2           .043         9.581             .00          .02        .10          .55
        3           .013         17.195            .91          .19        .02          .21
        4           .008         22.890            .09          .79        .88          .24
a. Dependent Variable: Intention

Assumptions (Outliers)

Casewise Diagnostics(a)
Case Number   Std. Residual   Intention   Predicted Value   Residual
70            3.152           5.00        3.8748            1.12520
82            4.042           5.00        3.5570            1.44295
83            3.071           4.20        3.1037            1.09631
166           3.152           5.00        3.8748            1.12520
178           4.042           5.00        3.5570            1.44295
179           3.071           4.20        3.1037            1.09631
a. Dependent Variable: Intention

After Removing Outliers

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .900(a)   .810       .807                .27373                       1.725
a. Predictors: (Constant), Pbcontrol, subjective, Attitude
b. Dependent Variable: Intention

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   58.261           3     19.420        259.182   .000(a)
Residual     13.637           182   .075
Total        71.898           185
a. Predictors: (Constant), Pbcontrol, subjective, Attitude
b. Dependent Variable: Intention

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients                   Collinearity Statistics
Model 1       B       Std. Error            Beta                        t        Sig.   Tolerance   VIF
(Constant)    .067    .153                                              .441     .659
Attitude      .758    .050                  .784                        15.281   .000   .396        2.523
subjective    .085    .047                  .091                        1.801    .073   .412        2.426
Pbcontrol     .145    .029                  .173                        5.015    .000   .875        1.143
a. Dependent Variable: Intention

Assumptions - Advanced Diagnostics (Hair et al., 2006)

Residuals Statistics(a)
                                    Minimum   Maximum   Mean      Std. Deviation   N
Predicted Value                     2.1329    4.9380    3.8188    .53156           192
Std. Predicted Value                -3.172    2.106     .000      1.000            192
Standard Error of Predicted Value   .027      .111      .048      .020             192
Adjusted Predicted Value            2.1423    4.9493    3.8179    .53167           192
Residual                            -.96087   1.44295   .00000    .35421           192
Std. Residual                       -2.691    4.042     .000      .992             192
Stud. Residual                      -2.731    4.253     .001      1.012            192
Deleted Residual                    -.98909   1.59761   .00086    .36911           192
Stud. Deleted Residual              -2.779    4.461     .004      1.031            192
Mahal. Distance                     .130      17.495    2.984     3.453            192
Cook's Distance                     .000      .485      .011      .051             192
Centered Leverage Value             .001      .092      .016      .018             192
a. Dependent Variable: Intention

Assumptions (Normality)

[Figure: histogram of the regression standardized residuals (horizontal axis) against frequency (vertical axis). Dependent variable: Intention. N = 192.]

Assumptions (Normality of the Error term)

[Figure: normal P-P plot of the regression standardized residuals, expected cumulative probability (vertical axis) against observed cumulative probability (horizontal axis). Dependent variable: Intention.]

Assumptions (Constant Variance)

[Figure: scatterplot of the regression studentized residuals (vertical axis) against Intention (horizontal axis). Dependent variable: Intention.]

Assumptions (Linearity)

[Figure: partial regression plot of Intention against Attitude. Dependent variable: Intention.]

Assumptions (Linearity)

[Figure: partial regression plot of Intention against subjective. Dependent variable: Intention.]

Assumptions (Linearity)

[Figure: partial regression plot of Intention against Pbcontrol. Dependent variable: Intention.]

Table Presentation

Dependent = Intention
Variable            Standardized Beta
Attitude            0.607**
Subjective Norm     0.238**
Perceived Control   0.165**

R2                  0.693
Adjusted R2         0.688
F Value             141.13
D-W                 1.501

*p< 0.05, **p< 0.01

Discriminant - Command

Discriminant Analysis
Question: Which variables can discriminate high and low intention to share?

Analysis Case Processing Summary
Unweighted Cases                                                      N     Percent
Valid                                                                 127   66.1
Excluded   Missing or out-of-range group codes                        0     .0
           At least one missing discriminating variable               0     .0
           Both missing or out-of-range group codes and at least
           one missing discriminating variable                        0     .0
           Unselected                                                 65    33.9
           Total                                                      65    33.9
Total                                                                 192   100.0

Group Statistics
                                             Valid N (listwise)
Level   Variable   Mean     Std. Deviation   Unweighted   Weighted
Low     Attitude   3.6481   .62349           104          104.000
        Norm       3.5409   .62813           104          104.000
        pbc        3.4038   .76981           104          104.000
High    Attitude   4.3130   .57470           23           23.000
        Norm       4.2609   .60995           23           23.000
        pbc        3.6522   .78240           23           23.000
Total   Attitude   3.7685   .66448           127          127.000
        Norm       3.6713   .68190           127          127.000
        pbc        3.4488   .77494           127          127.000

Dividing the Sample into Estimation and Split/Holdout Sample: Random Selection
Command:
TRANSFORM > RANDOM NUMBER SEED
TRANSFORM > COMPUTE
Randz = UNIFORM(1) > 0.65
Randz equals 1 for roughly 35% of respondents and 0 for the rest. Using the cases with Randz = 0 gives about 65% of respondents for estimation and the remainder for the holdout sample.
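
The same kind of random estimation/holdout split can be sketched outside SPSS. The snippet below is an illustration, not the SPSS command: the seed value and function name are arbitrary assumptions, and the exact group sizes will differ from the 127/65 split in the output because the random draws differ.

```python
import random

def split_sample(n_cases, estimation_share=0.65, seed=12345):
    """Return one flag per case: 0 = estimation sample, 1 = holdout sample."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    return [1 if rng.random() > estimation_share else 0 for _ in range(n_cases)]

flags = split_sample(192)
n_holdout = sum(flags)
n_estimation = len(flags) - n_holdout
print(n_estimation, n_holdout)
```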

Test for Model

Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .796            28.214       3    .000

Test Results
Box's M              5.942
F        Approx.     .939
         df1         6
         df2         9055.846
         Sig.        .465
Tests null hypothesis of equal population covariance matrices.

Goodness of Model

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          .257(a)      100.0           100.0          .452
a. First 1 canonical discriminant functions were used in the analysis.

Tests of Equality of Group Means
           Wilks' Lambda   F        df1   df2   Sig.
Attitude   .850            22.007   1     125   .000
Norm       .833            24.998   1     125   .000
pbc        .985            1.949    1     125   .165

Coefficients

Standardized Canonical Discriminant Function Coefficients
           Function 1
Attitude   .322
Norm       .741
pbc        .321

Canonical Discriminant Function Coefficients
             Function 1
Attitude     .524
Norm         1.185
pbc          .415
(Constant)   -7.759
Unstandardized coefficients

Structure Matrix
           Function 1
Norm       .883
Attitude   .828
pbc        .246
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.

Classification

Functions at Group Centroids
Level   Function 1
Low     -.236
High    1.069
Unstandardized canonical discriminant functions evaluated at group means

Classification Function Coefficients
             Level
             Low       High
Attitude     2.848     3.532
Norm         8.746     10.293
pbc          6.553     7.095
(Constant)   -32.031   -44.209
Fisher's linear discriminant functions

Cutting score for unequal group sizes:

$Z_{CU} = \frac{N_A Z_B + N_B Z_A}{N_A + N_B}$

$Z_A$ = centroid of Group A
$Z_B$ = centroid of Group B
$N_A$ and $N_B$ = number in each group
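
As a worked illustration, the weighted cutting score between two group centroids (each centroid weighted by the size of the other group) can be evaluated with the centroids and estimation-sample sizes reported above: Low: 104 cases, centroid -.236; High: 23 cases, centroid 1.069. Treating Low as Group A and High as Group B is an assumption made for the example, as is the function name.

```python
def cutting_score(n_a, z_a, n_b, z_b):
    """Weighted cutting score between two group centroids (unequal group sizes)."""
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)

# Group A = Low (104 cases), Group B = High (23 cases).
z_cu = cutting_score(n_a=104, z_a=-0.236, n_b=23, z_b=1.069)
print(round(z_cu, 3))
```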

Predictive Validity

Classification Results(b,c,d)
                                                    Predicted Group Membership
                                            Level   Low     High    Total
Cases Selected       Original       Count   Low     100     4       104
                                            High    16      7       23
                                    %       Low     96.2    3.8     100.0
                                            High    69.6    30.4    100.0
                     Cross-         Count   Low     100     4       104
                     validated(a)           High    16      7       23
                                    %       Low     96.2    3.8     100.0
                                            High    69.6    30.4    100.0
Cases Not Selected   Original       Count   Low     52      0       52
                                            High    6       7       13
                                    %       Low     100.0   .0      100.0
                                            High    46.2    53.8    100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the functions derived from all cases other than that case.
b. 84.3% of selected original grouped cases correctly classified.
c. 90.8% of unselected original grouped cases correctly classified.
d. 84.3% of selected cross-validated grouped cases correctly classified.

Benchmark for Comparison

How good is the Hit Ratio? Compute the Hit Ratio for the split (holdout) sample and compare it against:

Maximum Chance Criterion: This is just the proportion of cases in the largest group. It is the minimum criterion to be met by the Hit Ratio.
Proportional Chance Criterion: Should be used when group sizes are unequal. For two groups it is given as follows:

$C_{pro} = p^2 + (1 - p)^2$

where p = proportion of cases in one group.

Press's Q: Compares the number of correct classifications (n) against the total sample (N) and the number of groups (k):

$\text{Press's } Q = \frac{[N - (n \cdot k)]^2}{N(k - 1)}$

Press's Q is distributed as $\chi^2$ with 1 degree of freedom (critical values: 3.84 at the 0.05 level, 6.64 at the 0.01 level).
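
These benchmarks can be applied to the holdout ("cases not selected") classification results, where 52 Low and 7 High cases out of 65 were classified correctly. The sketch below is an illustration with hypothetical function names; the proportional chance function sums the squared group proportions, which reduces to p² + (1 - p)² for two groups.

```python
def max_chance(group_sizes):
    """Proportion of cases in the largest group."""
    return max(group_sizes) / sum(group_sizes)

def proportional_chance(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((size / total) ** 2 for size in group_sizes)

def press_q(total_n, correct_n, k):
    """Press's Q for total_n cases, correct_n correct classifications, k groups."""
    return (total_n - correct_n * k) ** 2 / (total_n * (k - 1))

holdout_groups = [52, 13]          # actual Low and High cases in the holdout sample
correct = 52 + 7                   # correctly classified Low + High
hit_ratio = correct / sum(holdout_groups)
q = press_q(sum(holdout_groups), correct, k=2)
print(round(hit_ratio, 3), max_chance(holdout_groups),
      round(proportional_chance(holdout_groups), 2), round(q, 2))
```

Here the hit ratio (about 0.908) exceeds both chance criteria (0.80 and 0.68), and Press's Q is well above the 6.64 critical value.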

Logistic Regression- Command


Initial Output

Case Processing Summary
Unweighted Cases(a)                        N     Percent
Selected Cases     Included in Analysis    192   100.0
                   Missing Cases           0     .0
                   Total                   192   100.0
Unselected Cases                           0     .0
Total                                      192   100.0
a. If weight is in effect, see classification table for the total number of cases.

No missing cases.

Dependent Variable Encoding
Original Value   Internal Value
Low              0
High             1

Classification Table(a,b)
                                   Predicted
                                   Sharing Level      Percentage
         Observed                  Low      High      Correct
Step 0   Sharing Level   Low       0        84        .0
                         High      0        108       100.0
         Overall Percentage                           56.3
a. Constant is included in the model.
b. The cut value is .500

The constant-only model correctly classifies all those with high values but misses all those with low values.
The overall percentage (56.3) is the proportion of respondents in the high sharing category.

Output

Variables in the Equation
                    B      S.E.   Wald    df   Sig.   Exp(B)
Step 0   Constant   .251   .145   2.984   1    .084   1.286

The constant is entered first; the other variables are not included.
The Wald statistic is like the t-value. The constant by itself does not significantly improve prediction.

Variables not in the Equation
                                  Score    df   Sig.
Step 0   Variables   Attitude     30.588   1    .000
                     SN           38.624   1    .000
                     PBC          .120     1    .729
         Overall Statistics       41.833   3    .000

Output

Block 1: Method = Enter

Omnibus Tests of Model Coefficients
                 Chi-square   df   Sig.
Step 1   Step    48.073       3    .000
         Block   48.073       3    .000
         Model   48.073       3    .000

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      215.088(a)          .221                   .297
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.

The model accounts for 29.7% of the variance (Nagelkerke R Square).
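
Both pseudo R-square values can be recovered from the -2 log likelihood of the fitted model (215.088), the model chi-square (48.073) and the number of cases (192). The sketch below uses the usual Cox & Snell and Nagelkerke definitions; it is an illustration, not SPSS's own code, and the function name is hypothetical.

```python
from math import exp

def pseudo_r_squares(neg2ll_model, model_chi_square, n):
    """Return (Cox & Snell R square, Nagelkerke R square)."""
    neg2ll_null = neg2ll_model + model_chi_square  # -2LL of the constant-only model
    cox_snell = 1 - exp(-model_chi_square / n)
    max_cox_snell = 1 - exp(-neg2ll_null / n)      # upper bound of Cox & Snell
    return cox_snell, cox_snell / max_cox_snell

cox_snell, nagelkerke = pseudo_r_squares(215.088, 48.073, 192)
print(round(cox_snell, 3), round(nagelkerke, 3))
```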

Hosmer and Lemeshow Test
Step   Chi-square   df   Sig.
1      71.722       8    .000

A significant chi-square indicates that the predicted probabilities do not match the observed probabilities. This is not what we usually want.

Contingency Table - Hosmer Lemeshow

Contingency Table for Hosmer and Lemeshow Test
              Sharing Level = Low     Sharing Level = High
Step 1        Observed   Expected     Observed   Expected     Total
1             16         15.496       2          2.504        18
2             18         15.112       2          4.888        20
3             14         12.942       6          7.058        20
4             12         8.200        8          11.800       20
5             4          1.512        0          2.488        4
6             2          12.814       32         21.186       34
7             10         6.312        8          11.688       18
8             2          6.777        20         15.223       22
9             0          3.959        20         16.041       20
10            6          .875         10         15.125       16

This is a more detailed assessment of the Hosmer-Lemeshow test. We need to look at how close or how different the observed and expected values are for each group.
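
The Hosmer-Lemeshow chi-square is the sum of (observed - expected)² / expected over both outcome columns of the contingency table. The sketch below recomputes it from the printed values. It is an illustration only, and the result differs slightly from the SPSS figure of 71.722 because the expected counts are rounded to three decimals.

```python
# Observed and expected counts per group, copied from the contingency table.
low_observed = [16, 18, 14, 12, 4, 2, 10, 2, 0, 6]
low_expected = [15.496, 15.112, 12.942, 8.200, 1.512,
                12.814, 6.312, 6.777, 3.959, 0.875]
high_observed = [2, 2, 6, 8, 0, 32, 8, 20, 20, 10]
high_expected = [2.504, 4.888, 7.058, 11.800, 2.488,
                 21.186, 11.688, 15.223, 16.041, 15.125]

def chi_square_sum(observed, expected):
    """Sum of (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

hl_statistic = (chi_square_sum(low_observed, low_expected)
                + chi_square_sum(high_observed, high_expected))
hl_df = len(low_observed) - 2  # number of groups minus 2
print(round(hl_statistic, 1), hl_df)
```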

Classification

Classification Table(a)
                                   Predicted
                                   Sharing Level      Percentage
         Observed                  Low      High      Correct
Step 1   Sharing Level   Low       48       36        57.1
                         High      8        100       92.6
         Overall Percentage                           77.1
a. The cut value is .500

The overall predictive accuracy = 77.1%

Variables in the Equation
                                                                      95.0% C.I. for EXP(B)
                       B        S.E.    Wald     df   Sig.   Exp(B)   Lower    Upper
Step 1(a)   Attitude   .717     .415    2.993    1    .084   2.049    .909     4.617
            SN         1.351    .443    9.302    1    .002   3.860    1.620    9.193
            PBC        -.197    .243    .657     1    .418   .821     .510     1.323
            Constant   -6.741   1.501   20.171   1    .000   .001
a. Variable(s) entered on step 1: Attitude, SN, PBC.

An increase of 1 unit on the Attitude measure increases the odds of sharing at a higher level by 2.049 times, controlling for SN and PBC.

An increase of 1 unit on the SN measure increases the odds of sharing at a higher level by 3.860 times, controlling for Attitude and PBC.
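
Exp(B) and its confidence interval follow from B and its standard error: the odds ratio is exp(B) and the 95% limits are exp(B ± 1.96 × S.E.). The sketch below recomputes them for the three predictors. The function name is hypothetical, and the results match the SPSS columns only up to rounding of the printed B and S.E. values.

```python
from math import exp

def odds_ratio_ci(b, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a logistic coefficient."""
    return exp(b), exp(b - z * se), exp(b + z * se)

# B and S.E. copied from the Variables in the Equation table.
coefficients = {"Attitude": (0.717, 0.415), "SN": (1.351, 0.443), "PBC": (-0.197, 0.243)}

results = {name: odds_ratio_ci(b, se) for name, (b, se) in coefficients.items()}
for name, (odds, lower, upper) in results.items():
    print(name, round(odds, 2), round(lower, 2), round(upper, 2))
```

An interval that contains 1, as for Attitude and PBC here, goes with a non-significant Wald test at the 0.05 level.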
