
DATA PROCESSING, ANALYSIS AND INTERPRETATION

(SOCIAL SCIENCE RESEARCH)


Pablo E. Subong, Jr., Ed.D., Ph.D.

West Visayas State University

OBJECTIVES
To develop skills in processing data manually and with SPSS
To be able to process hypothetical data
To be able to analyze the data properly

INTRODUCTION
SPSS for Windows is a software package that performs a wide variety of statistical procedures. It handles both data management and analysis: with SPSS we can manipulate data, make graphs, and run statistical techniques ranging from means to regression.

WHAT IS SPSS?

SPSS stands for Statistical Package for the Social Sciences

The SPSS home-page is: www.spss.com

WHAT CAN YOU DO WITH SPSS?


Run Frequencies
Calculate Descriptive Statistics
Compare Means
Conduct Cross-Tabulations
Recode Data
Create Graphs and Charts
Do T-Tests
Conduct ANOVAs
Run Various Types of Regressions
And Much More!

WHAT I WILL SHOW YOU TODAY!!

Bringing your data into SPSS
Recoding
SPSS uses
  Survey
  Experimental study
  Social science research

SPSS WINDOWS PROCESS


Data window
Variable view window
Output window
Chart editor window

MANAGEMENT OF DATA AND FILES


SPSS can read different types of data files. You can open not only SPSS files but also Excel files and other formats. You can create a new data set with SPSS. You can also edit, delete, and view the contents of your data file.

HOW TO USE DIFFERENT FILE TYPES?


Excel file
CSV file
SPSS file

TYPES OF VARIABLES

You can select the type of a variable:

String
Numeric

You can also select the level of measurement of a variable:

Categorical (nominal)
Ordinal
Interval

CATEGORICAL (NOMINAL)
A categorical variable is one that has two or more categories, but there is no intrinsic ordering to the categories.

Gender is a categorical variable.
Hair color is also a categorical variable.

ORDINAL VARIABLE
An ordinal variable is similar to a categorical variable. The difference between the two is that the categories of an ordinal variable have a clear ordering.

SES (socioeconomic status)
Education

Even though we can order these from lowest to highest, the spacing between the values may not be the same across the levels of the variable.

INTERVAL VARIABLE
An interval variable is similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced.

Example: annual income measured in euros

WHY DOES IT MATTER?


Statistical computations and analyses assume that the variables have specific levels of measurement.
Can you compute the average of hair color?
Does it make sense to compute the average of educational experience?
An average requires a variable to be interval.
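The point about levels of measurement can be sketched in a few lines of Python. This is only an illustration: the values and the education codes below are hypothetical and are not part of the sample data set.

```python
from statistics import mean, median, mode

# Hypothetical example values (not taken from the sample data set).
hair_color = ["black", "brown", "black", "blond", "black"]  # categorical (nominal)
education = [1, 2, 2, 3, 1]  # ordinal codes: 1 = elementary, 2 = high school, 3 = college
income = [12000, 18500, 23000, 30500, 41000]  # interval: annual income in euros

# Nominal: only counting makes sense, so report the most frequent category.
print(mode(hair_color))    # black

# Ordinal: the order is meaningful, so the median is a safe summary.
print(median(education))   # 2

# Interval: equal spacing makes the arithmetic mean meaningful.
print(mean(income))        # 25000
```

Asking for `mean(hair_color)` would raise an error, which is the programmatic version of "you cannot average hair color".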

DATA ANALYSIS
Data analysis embraces both the problem of finding an appropriate model, on the one hand, and model estimation and testing, on the other. In this context, the normality assumption becomes important. In the social sciences, it is hard to find data that follow the typical bell-shaped normal distribution.

NORMAL DISTRIBUTION

In general, the bell-shaped distribution has the following characteristics:

The average is located in the center of the distribution.
The greater the distance from the average, the lower the frequency.

Sample Coding Book


Infant's sex = sex
  Male = 1
  Female = 2
Family income ($) = fincome
  5,000-29,999 = 0
  30,000-59,999 = 1
  60,000-99,999 = 2
Maternal age (years) = m_age
Maternal smoking status = m_smk
  Yes = 1
  No = 0
Birth weight (grams) = bwgt
Maternal weight before pregnancy (pounds) = m_wgt
Father's weight before the pregnancy = f_wgt

Sample Birth Weight Data


ID     sex   fincome   m_age   m_smk   bwgt   m_wgt   f_wgt
1      2     2         29      1       3770   122     167
2      1     2         25      1       3742   125     200
3      2     1         28      0       3175   160     210
4      2     0         28      1       2919   110     165
5      2     0         19      1       3288   105     160
6      2     0         35      0       3175   120     160
7      1     0         27      1       3883   125     180
...
99     2     2         24      0       4337   123     173
100    1     1         23      0       4110   115     140
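A coding book is essentially a lookup table from numeric codes to labels. The sketch below, in Python rather than SPSS, shows one way to hold the value labels from the sample coding book and apply them to the first three cases of the sample data. The names `VALUE_LABELS`, `COLUMNS`, and `decode` are made up for this illustration.

```python
# Value labels copied from the sample coding book.
VALUE_LABELS = {
    "sex": {1: "Male", 2: "Female"},
    "fincome": {0: "5,000-29,999", 1: "30,000-59,999", 2: "60,000-99,999"},
    "m_smk": {1: "Yes", 0: "No"},
}

# Column order and the first three cases of the sample birth weight data.
COLUMNS = ["id", "sex", "fincome", "m_age", "m_smk", "bwgt", "m_wgt", "f_wgt"]
ROWS = [
    [1, 2, 2, 29, 1, 3770, 122, 167],
    [2, 1, 2, 25, 1, 3742, 125, 200],
    [3, 2, 1, 28, 0, 3175, 160, 210],
]


def decode(row):
    """Return one case as a dict, replacing coded values with their labels."""
    case = dict(zip(COLUMNS, row))
    for name, labels in VALUE_LABELS.items():
        case[name] = labels[case[name]]
    return case


decoded = [decode(row) for row in ROWS]
print(decoded[0])
```

Variables without value labels (such as m_age or bwgt) are left as numbers, which mirrors how SPSS shows labels only where they have been defined.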

DATA PROCESSING - SPSS DEMO

USING SPSS FOR WINDOWS

3 May 1999 20

Introduction
Data procedures
Statistical procedures
Syntax files
Editing output

INTRODUCTION

STEPS FOR ANALYZING DATA


Enter the data.
Select the procedure and options.
Select the variables.
Run the procedure.
Examine the output.

COMMON OPERATIONS - MENU OPTIONS

In the menu, click Statistics.
Choose Summarize.
Click Frequencies.

COMMON OPERATIONS - VARIABLES DIALOG BOX

This type of dialog box is used for many procedures.
Variables are selected from the list on the left.
Click the arrow to move them to the appropriate box on the right.

USING SPSS FOR WINDOWS - DATA PROCEDURES

Ways to Enter Data
Entering Data Directly
  Defining variables
  Entering data
Viewing Data
Recoding Variables
Computing New Variables
Selecting Cases

WAYS TO ENTER THE DATA


SPSS datafile
Import data
  Database file
  Spreadsheet file
  ASCII text file
Enter data directly with Data Editor

ENTERING DATA DIRECTLY - DEFINE THE VARIABLES

Name
Type and size
Labels
Missing values

DEFINE THE VARIABLE - NAME


Name the variable.

No more than 8 characters (the limit in the SPSS version shown; later versions accept longer names)
Each name unique
Must begin with a letter
Certain characters not allowed
Not case sensitive

DEFINE THE VARIABLE - TYPE


Define the variable type.
Define the variable width.
Define the number of decimal places.

DEFINE THE VARIABLE - LABELS


Labels will be displayed in the output.
A variable label can be more descriptive than the variable name.

DEFINE THE VARIABLE - MISSING VALUES

Missing values are used to define user-specified missing information.

No response
Refused to answer
Data entry mistakes

DEFINE THE VARIABLE - COLUMN FORMAT

Column Format is used to define column width and alignment in the Data Editor window.

ENTERING DATA DIRECTLY

Each row is a case (e.g., a survey form).
Enter the value for each variable.
Press the <Tab> key or the right arrow key to move to the next variable.
Leave the cell blank or use a user-defined missing value if there is no answer.
Press the <Enter> key to move to the next case.

CHANGE THE VIEW - VALUE LABELS


Data entered as numeric codes can be displayed as value labels.
In the menu, click View.
Click Value Labels.

RECODE PROCEDURE
Recode is used to
  change the values of an existing variable
  create a new variable based on the values of an existing variable

RECODE INTO NEW VARIABLE

In the menu, click Transform.
Select Recode.
Click Into Different Variable(s).
Select and move the variable(s) over.
Name and label the new variable.
Click Old and New Values.
For each value of the existing variable, enter the new value.
Repeat for each value or range of values.
Click Continue.
Click Change.
Click OK.
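The logic behind recoding into a new variable can be sketched in Python. This is not SPSS syntax, only an illustration: it recodes the maternal ages of the first seven sample cases into age groups. The cut points and the name `m_agegrp` are hypothetical.

```python
# Maternal ages (m_age) of the first seven cases in the sample birth weight data.
m_age = [29, 25, 28, 28, 19, 35, 27]


def recode_age(age):
    """Map an age in years to a group code; the cut points are hypothetical."""
    if age < 20:
        return 1  # under 20
    if age < 30:
        return 2  # 20 to 29
    return 3      # 30 and over


# The original variable is left untouched, as when recoding into a different variable.
m_agegrp = [recode_age(age) for age in m_age]
print(m_agegrp)  # [2, 2, 2, 2, 1, 3, 2]
```

Each old value (or range of values) gets exactly one new value, which is what the Old and New Values dialog asks you to specify.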

DEFINE LABELS FOR NEW VARIABLE


In the Data menu, click Define Variable.
Click Labels.
Enter value labels for the new variable.

COMPUTE PROCEDURE

Compute is used to create a new variable.
In the menu, click Transform.
Click Compute.
Name the new variable.
Click Type&Label to define the characteristics of the new variable.
Label the new variable.
Enter the variable type.
Enter the numeric expression that will determine the values of the new variable.
Click OK.
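What Compute does can be sketched outside SPSS as applying one numeric expression to every case. The Python illustration below converts the birth weights of the first seven sample cases from grams to kilograms. The target variable name `bwgt_kg` is hypothetical.

```python
# Birth weights in grams (bwgt) of the first seven cases in the sample data.
bwgt = [3770, 3742, 3175, 2919, 3288, 3175, 3883]

# The numeric expression "bwgt / 1000" is evaluated once per case.
bwgt_kg = [grams / 1000 for grams in bwgt]
print(bwgt_kg)
```

The new variable has one value per case, in the same order as the variable it was computed from.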

SELECT CASES

For a subset of the datafile, use Select Cases.
In the menu, click Data.
Click Select Cases...

SELECT CASES - ALCOHOL DRINKERS ONLY

To select only those cases which meet certain criteria, choose the If option.
Enter the expression that will determine which cases will be selected.
Click Continue.
When you've finished specifying selection criteria, click OK.
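Selecting cases with an If condition amounts to keeping only the rows for which an expression is true. The slides select alcohol drinkers; the sample birth weight data has no such variable, so this Python sketch selects smoking mothers (m_smk = 1) among the first seven cases instead.

```python
# id, m_smk, and bwgt for the first seven cases in the sample data.
cases = [
    {"id": 1, "m_smk": 1, "bwgt": 3770},
    {"id": 2, "m_smk": 1, "bwgt": 3742},
    {"id": 3, "m_smk": 0, "bwgt": 3175},
    {"id": 4, "m_smk": 1, "bwgt": 2919},
    {"id": 5, "m_smk": 1, "bwgt": 3288},
    {"id": 6, "m_smk": 0, "bwgt": 3175},
    {"id": 7, "m_smk": 1, "bwgt": 3883},
]

# Keep only the cases that satisfy the condition m_smk = 1.
smokers = [case for case in cases if case["m_smk"] == 1]
print([case["id"] for case in smokers])  # [1, 2, 4, 5, 7]
```

The unselected cases are only filtered out of the analysis here; the original list is unchanged.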

USING SPSS FOR WINDOWS - STATISTICAL PROCEDURES

Summarizing Data
  Frequencies
  Crosstabs (Chi Square)
Comparing Means
  T-Tests
  One-Way Analysis of Variance
Nonparametric Tests
  Wilcoxon Signed Ranks
  Mann-Whitney U
  Kruskal-Wallis

FREQUENCIES

In the menu, click Statistics.
Choose Summarize.
Click Frequencies.
Select and move the variables.
Click Statistics.
Choose the appropriate statistics.
Click Continue.
For histograms or other charts, click Charts.
Choose the type of chart and click Continue.
To select the format of the table(s), click Format.
Choose the format and click Continue.
Click OK to run the Frequencies procedure.
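A frequency table reports, for each value, its count, its percent, and the cumulative percent. The Python sketch below builds such a table by hand for family income (fincome) in the first seven sample cases. It illustrates the arithmetic only and does not reproduce the SPSS output layout.

```python
from collections import Counter

# Family income codes (fincome) of the first seven cases in the sample data.
fincome = [2, 2, 1, 0, 0, 0, 0]

counts = Counter(fincome)
n = len(fincome)

table = []          # rows of (value, frequency, percent, cumulative percent)
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / n
    cumulative += percent
    table.append((value, counts[value], round(percent, 1), round(cumulative, 1)))

for row in table:
    print(row)
```

The last cumulative percent should always come out as 100, which is a quick check that no case was lost.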

FREQUENCIES - FORMAT OPTION: ORGANIZE OUTPUT BY VARIABLES

FREQUENCIES - FORMAT OPTION: COMPARE VARIABLES

FREQUENCIES - DISTRIBUTION TABLE



FREQUENCIES - HISTOGRAM
[Histogram: Apgar 1 minute score (x-axis, 0.0 to 10.0) against Frequency (y-axis, 0 to 300). Mean = 7.8, Std. Dev = 1.83, N = 424]

CROSSTABS

In the menu, click Statistics.
Choose Summarize.
Click Crosstabs.
Move the outcome variable(s) to the Row(s) box.
Move the predictor variable(s) to the Column(s) box.
Click Statistics.
Select the appropriate statistics.
Click Continue.
To select the counts, percentages, and residuals to be displayed in each cell, click Cells.
Select the information to be displayed in each cell.
Click Continue.
To run the Crosstabs procedure, click OK.
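The chi-square statistic behind a crosstab compares each observed cell count with the count expected if rows and columns were independent. The Python sketch below works this out by hand for maternal smoking (rows) by infant's sex (columns) in the first seven sample cases. With so few cases the expected counts are far too small for a valid test, so treat it purely as an illustration of the arithmetic.

```python
from collections import Counter

# Infant's sex and maternal smoking status for the first seven sample cases.
sex = [2, 1, 2, 2, 2, 2, 1]
m_smk = [1, 1, 0, 1, 1, 0, 1]

# Observed cell counts, keyed by (row value, column value).
cells = Counter(zip(m_smk, sex))
rows = sorted(set(m_smk))
cols = sorted(set(sex))
n = len(sex)

row_totals = {r: sum(cells[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(cells[(r, c)] for r in rows) for c in cols}

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (cells[(r, c)] - expected) ** 2 / expected

df = (len(rows) - 1) * (len(cols) - 1)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```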

CROSSTABS - OUTPUT


INDEPENDENT SAMPLES T-TEST

In the menu, click Statistics.
Choose Compare Means.
Click Independent Samples T-Test.
Select and move the Test Variable(s) and the Grouping Variable.
Click Define Groups.
Enter the values for the groups.
Click Continue.
Click OK to run the procedure.
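For readers who want to see the arithmetic behind the procedure, here is a minimal Python sketch of the equal-variances (pooled) t-ratio. It assumes birth weight (bwgt) as the test variable and maternal smoking (m_smk) as the grouping variable, using only the first seven sample cases. The groups are far too small for a real analysis.

```python
from math import sqrt
from statistics import mean, variance

# Birth weight and maternal smoking status for the first seven sample cases.
bwgt = [3770, 3742, 3175, 2919, 3288, 3175, 3883]
m_smk = [1, 1, 0, 1, 1, 0, 1]

# Split the test variable by the two values of the grouping variable.
smokers = [w for w, s in zip(bwgt, m_smk) if s == 1]
nonsmokers = [w for w, s in zip(bwgt, m_smk) if s == 0]

n1, n2 = len(smokers), len(nonsmokers)
df = n1 + n2 - 2

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * variance(smokers) + (n2 - 1) * variance(nonsmokers)) / df
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_ratio = (mean(smokers) - mean(nonsmokers)) / std_error

print(f"t({df}) = {t_ratio:.3f}")
```

The degrees of freedom are n1 + n2 - 2, which is why a reported d.f. of 34 implies 36 cases in total.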

INDEPENDENT SAMPLES T-TEST - OUTPUT


ONE-WAY ANALYSIS OF VARIANCE

In the menu, click Statistics.
Choose Compare Means.
Click One-Way Analysis of Variance.
Move the dependent variable(s) to the Dependent List box.
Move the grouping variable(s) to the Factor box.
For comparison tests, click Post Hoc.
Select the appropriate Post Hoc comparisons.
Click Continue.
Click Options for
  Descriptive statistics
  Homogeneity of variance
  Mean plots
  Missing values options
Make appropriate selections, then click Continue.
To run the One-way ANOVA procedure, click OK.

ONE-WAY ANALYSIS OF VARIANCE OUTPUT



WILCOXON SIGNED RANKS TEST

In the menu, click Statistics.
Choose Nonparametric Tests.
Click 2 Related Samples.
Move selected variable pairs to the Test Pair(s) List box.
Choose the statistical test(s).
Click Options...
Check the Descriptives box for descriptive statistics.
Click OK to run the procedure.


MANN-WHITNEY U TEST

In the menu, click Statistics.
Choose Nonparametric Tests.
Click 2 Independent Samples.
Select and move the Test Variable(s) and the Grouping Variable.
Click Define Groups.
Enter the values for the groups.
Click Continue.
Click Options.
After changing options, click Continue.
Click OK to run the procedure.
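The U statistic itself can be obtained by simple pair counting: for every pair made of one case from each group, count which case has the larger value (ties count one half). The Python sketch below does this for birth weight by maternal smoking status in the first seven sample cases. It computes only U, not the significance level that SPSS also reports.

```python
# Birth weights of the first seven sample cases, split by maternal smoking status.
smokers = [3770, 3742, 2919, 3288, 3883]   # m_smk = 1
nonsmokers = [3175, 3175]                  # m_smk = 0


def u_statistic(group_a, group_b):
    """Count pairs where the group_a value exceeds the group_b value (ties = 0.5)."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


u_smokers = u_statistic(smokers, nonsmokers)
u_nonsmokers = u_statistic(nonsmokers, smokers)

# The two counts always add up to n1 * n2; the smaller one is reported as U.
u = min(u_smokers, u_nonsmokers)
print(u_smokers, u_nonsmokers, u)  # 8.0 2.0 2.0
```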

MANN-WHITNEY U TEST - OUTPUT


KRUSKAL-WALLIS TEST

In the menu, click Statistics.
Choose Nonparametric Tests.
Click K Independent Samples.
Move the dependent variable(s) to the Test Variable List box.
Move the grouping variable(s) to the Grouping Variable box.
Click Define Range.
Enter the minimum and maximum values for the Grouping Variable.
Click Continue.
Check the box for the Kruskal-Wallis H.
Click OK to run the procedure.

KRUSKAL-WALLIS TEST - OUTPUT


USING SPSS FOR WINDOWS - EDITING THE OUTPUT

Pivot Tables
Scatterplots
Charts

SCATTERPLOT

In the menu, click Graphs.
Choose Scatter.
Choose the appropriate type of plot.
Click Define.
Select and move the variables for the X and Y axes to the appropriate boxes.
Click OK to run the procedure.

SCATTERPLOT - OUTPUT
[Scatterplot: BTWT (y-axis, 0 to 5000) against BMI (x-axis, 10 to 70)]

Regression line must be added.

EDIT THE SCATTERPLOT

In the Output Window
  Click the chart object to select it.
  In the menu, click Edit.
  Choose SPSS Chart Object.
  Click Open.
The Chart Window will open.
In the Chart Window
  In the menu, click Chart.
  Click Options.
  Check the Total box.
  Click OK.

SCATTERPLOT - OUTPUT
[Scatterplot: BTWT (y-axis, 0 to 5000) against BMI (x-axis, 10 to 70)]

Regression line is added.
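The regression line drawn on a scatterplot is the ordinary least-squares line. As a rough Python sketch of how its slope and intercept are obtained, the example below uses maternal weight (m_wgt) and birth weight (bwgt) from the first seven sample cases. These stand in for the BMI and BTWT variables on the slide, whose data are not available here.

```python
from statistics import mean

# Maternal weight (x) and birth weight (y) for the first seven sample cases.
m_wgt = [122, 125, 160, 110, 105, 120, 125]
bwgt = [3770, 3742, 3175, 2919, 3288, 3175, 3883]

mean_x = mean(m_wgt)
mean_y = mean(bwgt)

# Least-squares slope: covariation of x and y divided by the variation of x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(m_wgt, bwgt))
sxx = sum((x - mean_x) ** 2 for x in m_wgt)
slope = sxy / sxx

# The fitted line always passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

print(f"bwgt = {intercept:.1f} + {slope:.2f} * m_wgt")
```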

EXERCISE DATASETS
Coding and recoding
Survey about smoking habit
Test of Difference

STATISTICAL DATA ANALYSIS AND INTERPRETATION


Prepared By: PABLO E. SUBONG, JR., Ed.D., Ph.D.

TABLE 1: NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS


Category               Mean    S.D.    Description
A. Entire Group        1.96    0.60    Average
B. Gender
   Male                1.85    0.66    Average
   Female              2.09    0.51    Average
C. SES
   High                1.75    0.72    Average
   Average             1.83    0.48    Average
   Low                 2.29    0.45    Average
D. Type of School
   Private             1.85    0.61    Average
   Public              2.07    0.59    Average
E. Mental Ability
   High                1.67    0.67    Average
   Average             1.90    0.47    Average
   Low                 1.96    0.60    Average

Scale        Description
1.00-1.66    High
1.67-2.32    Average
2.33-3.00    Low
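The Description column follows mechanically from the scale under Table 1. A small Python sketch of that lookup is given below. The function name `describe` is made up, and rounding the mean to two decimals before the lookup is an assumption, since the scale leaves gaps between its ranges (for example between 1.66 and 1.67).

```python
def describe(mean_score):
    """Return the verbal description for a mean, using the scale under Table 1."""
    score = round(mean_score, 2)  # assumption: means are reported to two decimals
    if 1.00 <= score <= 1.66:
        return "High"
    if 1.67 <= score <= 2.32:
        return "Average"
    if 2.33 <= score <= 3.00:
        return "Low"
    raise ValueError(f"mean {mean_score} is outside the 1.00-3.00 scale")


# Means taken from Table 1: entire group, low SES, and high mental ability.
for value in (1.96, 2.29, 1.67):
    print(value, describe(value))
```

All three of these means fall in the 1.67-2.32 band, which is why every row of Table 1 is described as Average.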

NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS

The NMAT performance of the BS Biology students is presented in Table 1. Generally, the NMAT performance of the BS Biology students is average (M=1.96, s.d.=0.60).

When they are classified by gender, socioeconomic status, type of school, and mental ability, the BS Biology students exhibit the same level of NMAT performance, which is average.

TABLE 2: T-TEST RESULTS FOR THE DIFFERENCES IN THE NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS
Compared Groups       d.f.    Mean     s.d.     t-ratio    t-Prob.
A. Gender             34                        1.782      0.084
   Male                       82.80    16.74
   Female                     71.63    20.92
B. Type of School     34                        0.496      0.623
   Private                    76.22    22.52
   Public                     79.44    15.87

p > 0.05 Not significant at 0.05 alpha

DIFFERENCES IN THE NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS

The differences in the NMAT performance of the BS Biology students are shown in Table 2. The t-test results reveal no significant difference in the NMAT performance of the BS Biology students when they are classified by gender, t(34)=1.782, p=0.084. The null hypothesis of no significant difference in NMAT performance between male and female students was therefore not rejected. This suggests that male and female BS Biology students perform at a comparable level in the NMAT. Likewise, when they are classified by type of school, students from private and public schools did not differ significantly in their NMAT performance, t(34)=0.496, p=0.623. This similar performance might be attributed to the fact that public schools can now compete with private schools in terms of the scholastic performance of their students.

TABLE 3-A: ANOVA RESULTS FOR THE DIFFERENCES IN THE NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS CLASSIFIED AS TO SOCIOECONOMIC STATUS
Sources of Variation    Degrees of Freedom    Sum of Squares    Mean Squares    F-ratio    F-Prob.
Between Groups          2                     1143.17           571.58          1.591      0.219
Within Groups           33                    11855.83          359.27
Total                   35                    12999.00

p > 0.05 Not significant at 0.05 alpha

TABLE 3-B: ANOVA RESULTS FOR THE DIFFERENCES IN THE NMAT PERFORMANCE OF THE BS BIOLOGY STUDENTS CLASSIFIED AS TO THEIR MENTAL ABILITY
Sources of Variation    Degrees of Freedom    Sum of Squares    Mean Squares    F-ratio    F-Prob.
Between Groups          2                     5346.50           2673.25         11.528     0.000
Within Groups           33                    7652.50           231.89
Total                   35                    12999.00

p < 0.05 Significant at 0.05 alpha
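The entries of an ANOVA table are tied together by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the F-ratio is the between-groups mean square divided by the within-groups mean square. The Python sketch below checks the figures of Table 3-B this way. It reuses the table's own numbers and does not re-analyze any data.

```python
# Sums of squares and degrees of freedom copied from Table 3-B.
ss_between, ss_within, ss_total = 5346.50, 7652.50, 12999.00
df_between, df_within = 2, 33

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F-ratio = between-groups mean square / within-groups mean square.
f_ratio = ms_between / ms_within

print(round(ms_between, 2), round(ms_within, 2), round(f_ratio, 3))
# The between and within sums of squares must add up to the total.
print(ss_between + ss_within == ss_total)
```

The same check applied to a table whose sums of squares do not add up to the total is a quick way to spot a transcription error.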

TABLE 3-C: POST HOC TEST FOR THE DIFFERENCES IN MEANS IN THE NMAT PERFORMANCE OF BS BIOLOGY STUDENTS CLASSIFIED AS TO MENTAL ABILITY
NMAT Performance by Mental Ability    Mean Difference    Sig.
High vs. Average                      12.75              0.138
High vs. Low                          29.75              0.000
Average vs. Low                       17.00              0.034

p < 0.05 Significant at 0.05 alpha

ANOVA results revealed no significant differences in the NMAT performance of the BS Biology students when they are classified by socioeconomic status, F(2,33)=1.591, p=0.219 (Table 3-A). That is, BS Biology students with high, average, and low socioeconomic status did not differ significantly in their NMAT performance. But when the BS Biology students are classified by mental ability, ANOVA results revealed a significant difference in their NMAT performance, F(2,33)=11.528, p<0.001. The results are reflected in Table 3-B.

Pairwise comparison using the Scheffe test in Table 3-C showed that BS Biology students with high and average mental ability do not differ significantly in their NMAT performance, but students with high mental ability differ significantly in their NMAT performance from students with low mental ability. Likewise, students with average mental ability differ significantly in their NMAT performance from students with low mental ability.

THANK YOU
