

Statistics and SPSS


Session 1

Introduction to Statistics and SPSS
At the end of Session 1 you will be able to:

1. Understand the definition of statistics.
2. Understand what descriptive and inferential statistics are.
3. Understand the various types of data.
4. Understand the sources of data.
5. Create a codebook and code a questionnaire.
6. Set the variable properties.
7. Code the values for each variable.
8. Know how to enter your data.






Introduction to Statistics
What is statistics?
What is a variable?
What is data?
Sources of data
Types of data
Descriptive Statistics
Graphical/Tables/Illustrations
Numerical Measurement
Inferential Statistics
What is statistics?
Statistics is a group of methods used to collect, analyze, present and
interpret data and to make good decisions.
What is a variable, an observation, a data set?
A variable is a characteristic under study that assumes different values
for different elements.

An observation is the value of a variable for an element.

A data set is a collection of observations on one or more variables.
Sources of data
Primary: observation, experiment, survey
Secondary: ready and available data sets from individuals or institutions
such as BNM, statistical departments, the World Bank, etc.
Survey: census or sampling
Sampling: probability or non-probability

Types of data
Nominal - data that can be grouped exclusively into one of a set of
collectively exhaustive groups.
Ordinal - data that can be grouped exclusively into one of a set of
collectively exhaustive groups and can be ranked.
Interval - data that can be grouped exclusively into one of a set of
collectively exhaustive groups, can be ranked, and the difference between
groups is known, but there is NO meaningful zero.
Ratio - data that can be grouped exclusively into one of a set of
collectively exhaustive groups, can be ranked, and the difference between
groups is known, with a meaningful zero.
Data can also be classified as qualitative or quantitative, and as
discrete or continuous.
Descriptive & Inferential Statistics
Descriptive statistics consists of methods for organizing, displaying and
describing data using illustrations and summary measures.

Inferential statistics consists of methods that use
sample results to help make decisions or
predictions about a population.

Descriptive Statistics
Graphical methods: tables and graphs
  Depends on the type of data: qualitative or quantitative
  Tables: frequency distribution table and relative frequency table
  Graphs: bar, pie, Pareto, histogram, line, polygon and ogive
Numerical measurements
  Central tendency: mean, median, mode and midrange
  Variability: variance, standard deviation
  Position: quantiles, percentiles, deciles
  Shape: skewness (left or right) and kurtosis (peaked or flat)
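As a rough illustration of the numerical measurements listed here, the sketch below computes them with Python's standard `statistics` module. The scores are made-up values, not data from the exercises, and the variance and standard deviation are the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical sample: exam scores for nine students (made up for illustration).
scores = [55, 60, 60, 65, 70, 75, 80, 85, 98]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
midrange = (min(scores) + max(scores)) / 2

# Variability (sample versions, dividing by n - 1)
variance = statistics.variance(scores)
std_dev = statistics.stdev(scores)

# Position: the three quartiles (cut points splitting the data into four parts)
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(mean, median, mode, midrange)
print(variance, std_dev)
print(q1, q2, q3)
```

Different packages use slightly different interpolation rules for quartiles, so the quartile values may not match SPSS output exactly.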


Codebook and Coding
Prepare a codebook (every question needs a code), e.g. male = 0, female = 1
  Variable
  SPSS variable name
  Coding instructions
Coding variable values
Closed-ended questions
  Dichotomous questions, e.g. Yes/No coded 0, 1; gender
  Multiple answers (tick as many answers as apply), e.g. a menu
    Create a new variable for each possible answer
    The value of each new variable is 0 or 1
Open-ended questions, e.g. "Give your opinion"
  List all the answers given
  Group the answers into meaningful groups (e.g. 2 or 3 groups)
Missing values: 9 = missing value, 8 = not applicable
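The coding rules above can be sketched in a few lines of Python. This is only an illustration: the question names, the menu options, and the `code_response` helper are hypothetical, while the codes 0/1 for dichotomous answers and 9 for a missing value follow the slide.

```python
# Hypothetical codebook for two questionnaire items (names are illustrative).
MISSING = 9          # respondent left the question blank
NOT_APPLICABLE = 8   # question did not apply to the respondent (unused below)

codebook = {
    "gender": {"male": 0, "female": 1},   # dichotomous question
}

# Multiple-answer question ("tick all that apply"): one 0/1 variable per option.
menu_options = ["rice", "noodles", "bread"]


def code_response(gender, ticked_menu):
    """Turn one raw questionnaire response into a row of numeric codes."""
    row = {"gender": codebook["gender"].get(gender, MISSING)}
    for option in menu_options:
        if ticked_menu is None:
            # Respondent skipped the whole question.
            row["menu_" + option] = MISSING
        else:
            row["menu_" + option] = 1 if option in ticked_menu else 0
    return row


print(code_response("female", ["rice", "bread"]))
print(code_response(None, None))
```

Each dictionary key corresponds to one column in the data file, which is why a multiple-answer question needs one variable per option.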

Data Entry
Prepare the template using Questionnaire from
Exercise 1
Using SPSS
Using Excel
Data Entry using Questionnaire from Exercise 1
SPSS first screen
How to open and exit a file
Data entry: Variable View
Example of Variable View
Example of Data View
What to look for in Variable View
Name: name of the variable
  Avoid characters such as spaces, $, *, !
  Use a short, simple name that relates to the questionnaire
Type: numeric or string
Width: length of the data
Decimals
Label: descriptive name of your variable
Values: the values assigned to your observations
Missing: missing-value codes, if you have any missing values

Session 2

Data Entry and Descriptive Statistics
At the end of Session 2 you will be able to:

1. Screen and clean up the data.
2. Obtain descriptive statistics from SPSS.
3. Recode, regroup and transform data.
4. Select cases for analysis.




Exercise 2: Screening and Cleaning
Use file Exercise 2 - Unclean Data Set.sav

Step 1 - Run a frequency table to screen the data.

Step 2 - Check for errors in each frequency table.

Step 3 - Correct any errors. Refer back to the original questionnaire.

Step 4 - Repeat Steps 1 to 3 until you are sure that the data set is clean.
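The screening steps can be sketched outside SPSS as well. The example below assumes a hypothetical marital-status column whose codebook allows the values 1 to 4 plus 9 for a missing value; the data are invented and include one deliberate typing error.

```python
from collections import Counter

# Hypothetical raw column: marital status, where the codebook allows
# 1-4 plus 9 for a missing value. The value 11 is a typing error.
valid_codes = {1, 2, 3, 4, 9}
marital_status = [1, 1, 2, 1, 11, 3, 1, 9, 2, 1]

# Step 1: frequency table (value -> count).
frequency = Counter(marital_status)
for value in sorted(frequency):
    print(value, frequency[value])

# Step 2: any value outside the codebook is an entry error to trace back
# to the original questionnaire. Keep the row position to find the case.
errors = [(row, value) for row, value in enumerate(marital_status)
          if value not in valid_codes]
print(errors)
```

After correcting the flagged rows against the questionnaire, the same check is run again until `errors` is empty.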

Descriptive statistics
FREQUENCY TABLE
Commands: Analyze, Descriptive Statistics, Frequencies
Frequency table - output
Purpose of Frequency table
Cleaning up
Detecting errors
Recoding/regrouping
Providing basic information about each variable
Exercise 3: Run descriptive statistics
Use the cleaned data set from Exercise 2.
Obtain descriptive statistics for each variable.
Commands: Analyze, Descriptive Statistics, Descriptives
Output:

Descriptive Statistics
                                      N     Minimum  Maximum  Mean     Std. Deviation
Matriks number                        1130  2.0E+09  2.0E+09  2.0E+09  467276.895
Taraf perkahwinan (marital status)    1130  1.00     4.00     1.0221   .18454
Etnik (ethnicity)                     1130  1.00     4.00     1.0549   .39773
Agama (religion)                      1130  1.00     6.00     1.0619   .45493
Valid N (listwise)                    1130
Descriptive statistics using Excel
Under DATA, choose DATA ANALYSIS
Data transformation
Recode
Commands: Transform, Recode into Same Variables or Recode into
Different Variables
Compute
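To make the two transformations concrete, here is a minimal Python sketch. The variables, the cut points for the age groups, and the `recode_age` helper are assumptions chosen for illustration; they do not come from the exercise files.

```python
# Hypothetical data: ages in years and two test scores per respondent.
ages = [19, 23, 35, 47, 62]
test1 = [10, 14, 9, 16, 12]
test2 = [12, 15, 11, 13, 18]


def recode_age(age):
    """Recode into a different variable: age in years -> age group code."""
    if age < 25:
        return 1   # under 25
    if age < 45:
        return 2   # 25 to 44
    return 3       # 45 and above


# Recode: the original variable is kept and a new one holds the groups.
age_group = [recode_age(age) for age in ages]

# Compute: a new variable calculated from existing variables.
total_score = [a + b for a, b in zip(test1, test2)]

print(age_group)
print(total_score)
```

Writing the groups to a new list mirrors recoding into a different variable, which keeps the original values available for checking.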
Session 3

Correlation and Regression

At the end of Session 3 you will be able to:

1. Obtain a correlation and interpret the result.
2. Differentiate between independent and dependent variables.
3. Run a simple regression analysis.
4. Perform a cross tabulation and a Chi-Square test.
5. Perform a t-test.




Correlation
Independent variable (X)
Dependent variable (Y)
Only for continuous (quantitative) data.
Ranges from -1 to +1.
Denoted by r, e.g. r = .11 (written starting with the decimal point).
Strength of relationship: close to +1 or -1 is strong, close to 0 is weak.
Significant or not.
Shows only ASSOCIATION, not CAUSATION.
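For readers who want to see where r comes from, the sketch below computes the Pearson correlation coefficient by hand in Python. The `pearson_r` function and the data are illustrative assumptions; the significance test that SPSS also reports is not covered here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: hours studied (X) and exam mark (Y).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 60, 68]

r = pearson_r(hours, marks)
print(round(r, 3))
```

A value near +1 here indicates a strong positive association in this made-up sample, but it says nothing about whether studying causes the higher marks.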
Exercise 4 - Correlation
Open file Exercise 4 - Correlation Record 1.sav
Run Correlation.


Correlation scatter plot
REGRESSION
Simple regression: one independent and one dependent variable.
Multiple regression: one dependent and several independent variables.
Predicting one variable from another.
Y = a + bX + error
a is the intercept and b is the slope (how much Y changes for a one-unit
change in X).
Error: all other factors that influence Y and are not captured by X.
Assumptions
1. Variable types
   Independent: quantitative or dummy variable
   Dependent: quantitative variable
2. Non-zero variance
3. No PERFECT multicollinearity
4. Predictors are uncorrelated with external variables
5. Homoscedasticity
6. Independent errors (Durbin-Watson)
7. Normally distributed errors
8. Independence
9. Linearity
Regression: OLS
Ordinary Least Squares method
Minimizes the sum of squared residuals
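As a hedged illustration of what least squares does for a single predictor, the sketch below uses the standard closed-form estimates for the slope and intercept. The `fit_simple_ols` function and the data are assumptions made for this example, not part of the exercise files.

```python
def fit_simple_ols(x, y):
    """Return (a, b) minimizing the sum of squared residuals of Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data: advertising spend (X) and sales (Y).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 8, 12]

a, b = fit_simple_ols(spend, sales)

# Residual = actual value minus the value predicted by the fitted line.
residuals = [yi - (a + b * xi) for xi, yi in zip(spend, sales)]
print(a, b)
print(residuals)
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.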
Regression: $R^2$
Residual sum of squares (SSE): actual value less predicted value, squared
and summed.
Regression sum of squares (SSR): predicted value less mean value, squared
and summed.
Total sum of squares (SST): actual value less mean value, squared and
summed.
$R^2$ = the proportion of the variation in Y that is explained by X
$R^2 = 1 - SSE/SST = SSR/SST$
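The sketch below checks the sums-of-squares identity numerically, using the standard definitions: SSE from actual minus predicted, SSR from predicted minus mean, and SST from actual minus mean. The five data points are invented, and a = 0.7 and b = 2.1 are the least-squares estimates for those points.

```python
# Hypothetical data and a fitted line Y = a + bX (a = 0.7 and b = 2.1 are
# the least-squares estimates for these five made-up points).
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 8, 12]
a, b = 0.7, 2.1

mean_y = sum(y) / len(y)
predicted = [a + b * xi for xi in x]

sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))   # actual - predicted
ssr = sum((pi - mean_y) ** 2 for pi in predicted)           # predicted - mean
sst = sum((yi - mean_y) ** 2 for yi in y)                   # actual - mean

r_squared = ssr / sst
print(sse, ssr, sst)
print(r_squared, 1 - sse / sst)
```

The two expressions for R squared agree only because the line was fitted by least squares with an intercept; for an arbitrary line, SSE + SSR need not equal SST.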

Exercise 5 - Regression
Open file Exercise 5 - Regression Record 2.sav


Regression: How?
Step 1 - Analyze, Regression, Linear
Regression output: descriptive statistics

Descriptive Statistics
                                           Mean      Std. Deviation  N
Record Sales (thousands)                   193.2000  80.69896        200
Advertising Budget (thousands of pounds)   614.4123  485.65521       200
No. of plays on Radio 1 per week           27.5000   12.26958        200
Attractiveness of Band                     6.7700    1.39529         200
Regression output: correlations

Correlations
Columns: (1) Record Sales (thousands); (2) Advertising Budget (thousands
of pounds); (3) No. of plays on Radio 1 per week; (4) Attractiveness of Band

                                                                (1)    (2)    (3)    (4)
Pearson Correlation  Record Sales (thousands)                   1.000  .578   .599   .326
                     Advertising Budget (thousands of pounds)   .578   1.000  .102   .081
                     No. of plays on Radio 1 per week           .599   .102   1.000  .182
                     Attractiveness of Band                     .326   .081   .182   1.000
Sig. (1-tailed)      Record Sales (thousands)                   .      .000   .000   .000
                     Advertising Budget (thousands of pounds)   .000   .      .076   .128
                     No. of plays on Radio 1 per week           .000   .076   .      .005
                     Attractiveness of Band                     .000   .128   .005   .
N                    Record Sales (thousands)                   200    200    200    200
                     Advertising Budget (thousands of pounds)   200    200    200    200
                     No. of plays on Radio 1 per week           200    200    200    200
                     Attractiveness of Band                     200    200    200    200
Regression output: model summary

Model Summary(c)
                                                            Change Statistics
Model  R        R Square  Adjusted  Std. Error of  R Square  F Change  df1  df2  Sig. F   Durbin-
                          R Square  the Estimate   Change                        Change   Watson
1      .578(a)  .335      .331      65.99144       .335      99.587    1    198  .000
2      .815(b)  .665      .660      47.08734       .330      96.447    2    196  .000     1.950
a. Predictors: (Constant), Advertising Budget (thousands of pounds)
b. Predictors: (Constant), Advertising Budget (thousands of pounds),
   Attractiveness of Band, No. of plays on Radio 1 per week
c. Dependent Variable: Record Sales (thousands)
Regression output: ANOVA table

ANOVA(c)
Model            Sum of Squares  df   Mean Square  F        Sig.
1  Regression    433687.8        1    433687.833   99.587   .000(a)
   Residual      862264.2        198  4354.870
   Total         1295952         199
2  Regression    861377.4        3    287125.806   129.498  .000(b)
   Residual      434574.6        196  2217.217
   Total         1295952         199
a. Predictors: (Constant), Advertising Budget (thousands of pounds)
b. Predictors: (Constant), Advertising Budget (thousands of pounds),
   Attractiveness of Band, No. of plays on Radio 1 per week
c. Dependent Variable: Record Sales (thousands)
Regression: how to report
                          B        SE B    β
Model 1
  Constant                134.14   7.54
  Advertising Budget      0.10     0.01    .58*
Model 2
  Constant                -26.61   17.35
  Advertising Budget      0.09     0.01    .51*
  Plays on BBC Radio 1    3.37     0.28    .51*
Note: $R^2$ = .34 for Model 1; $\Delta R^2$ = .33 for Model 2. * p < .001
Crosstabs: how?
For qualitative (ordinal and nominal) data only.
Commands: Analyze, Descriptive Statistics, Crosstabs
Is the independent variable in the row or the column?
Request percentages for the row or column where your independent
variable is.
Pearson Chi-Square test
Conditions:
  Expected count of more than 5 in each cell
  Mutually exclusive categories (each case falls in only one cell)
Likelihood ratio: use when samples are small
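To show what the Pearson Chi-Square statistic measures, the sketch below computes it by hand for a hypothetical 2x2 crosstab. The counts are invented, and the p-value and the continuity correction that SPSS may also report for 2x2 tables are not computed here.

```python
# Hypothetical 2x2 crosstab: rows = gender, columns = answer (yes, no).
observed = [
    [30, 20],   # male
    [15, 35],   # female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Condition from the slide: every expected count should be more than 5.
all_expected_ok = all(e > 5 for row in expected for e in row)

print(expected)
print(chi_square, degrees_of_freedom, all_expected_ok)
```

The larger the gap between observed and expected counts, the larger the statistic, and the stronger the evidence that the row and column variables are associated.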



Exercise 6
Open file Exercise 6 - Chi Square & Recode.sav
Cross tabulations
Interpret the results
