Qualitative: Qualitative variables are also known as "categorical" variables. They describe
attributes of objects by names or labels. A person's religion (e.g., Hindu, Muslim, Christian)
or the colour of the person's eyes (e.g., black, brown, blue) are examples of qualitative or
categorical variables.
Quantitative: Quantitative variables are also known as "numeric" variables. They record a
measurable quantity. For example, when we speak of the population of a city, we are
talking about the number of people in the city, which is a measurable attribute of the city.
Therefore, population would be a quantitative variable.
Example
Nominal (Categorical)
eg: Nationality
1 = Australian
2 = British
3 = Canadian
4 = Dane
5 = Other
Ordinal (Categorical)
eg: Education
1 = No education
2 = Primary School
3 = High School
4 = Graduate
5 = Postgraduate
Interval (Numeric)
eg: Age, recorded in whole years
Ratio (Numeric)
eg: Income
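As a rough illustration outside SPSS, the coding schemes in the example can be sketched in Python as lookup tables from numeric codes to labels. The dictionary and function names are hypothetical, chosen only for this sketch. The point it shows is that ranking is meaningful for the ordinal Education codes but not for the nominal Nationality codes.

```python
# Value labels from the example: numeric code -> label.
NATIONALITY = {1: "Australian", 2: "British", 3: "Canadian", 4: "Dane", 5: "Other"}
EDUCATION = {1: "No education", 2: "Primary School", 3: "High School",
             4: "Graduate", 5: "Postgraduate"}


def label(codes, value_labels):
    """Translate a list of numeric codes into their labels."""
    return [value_labels[c] for c in codes]


def higher_education(code_a, code_b):
    """Ordinal codes can be ranked: return the label of the higher level.

    This comparison makes sense for EDUCATION only. The NATIONALITY codes
    are arbitrary names, so comparing them with < or > has no meaning.
    """
    return EDUCATION[max(code_a, code_b)]


nationality_labels = label([2, 4, 1], NATIONALITY)
top_level = higher_education(3, 5)
print(nationality_labels)
print(top_level)
```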
Table - 02
Property
Description
Name
The name of the variable. Variable names cannot contain spaces. To
change a variable's name, double-click on the variable that you wish
to rename, then type the new variable name.
Type
The type of variable. This column refers to how the data is stored,
including the number of characters it can contain and other formatting
information. This is not to be confused with the types of variables
discussed at the beginning of Session II.
SPSS recognizes the following types:
Numeric, Comma, Dot, Scientific notation, Date, Dollar, Custom
currency, String and Restricted Numeric (integer with leading zeros).
To change a variable's type, click inside the cell corresponding to the
Type column for that variable. A square "..." button will appear; click
on it to open the Variable Type window. Click the option that best
matches the type of variable. Click OK.
Width
Decimals
The number of digits after a decimal point for each value of the
variable (applicable to numeric variables)
Label
Value
Missing
The user-defined values that indicate data are missing for a variable
(e.g., -99). Note that this does not affect or eliminate SPSS's default
missing value code ("."). This column merely allows the user to specify
alternative codes for missing values.
Columns
Align
Measure
Role
The role that a variable will play in your analyses (i.e., independent
Select the variables you wish to define in the box on the left and click on the blue arrow button.
The selected variables will be moved to the box on the right under the heading 'Variables to
Scan'. The Continue button is now enabled.
Click on Continue.
SPSS will scan the selected variables and identify the existing properties associated with those
variables and display them in a screen where you can view and change the properties for each
variable as shown in the following screen.
Figure - 03
On the screen in Figure - 03 you select each variable in turn from the scanned variables list and
enter the properties as described in Table - 02.
When you are done describing all the variables, click OK.
ADVANCED:
When you have completed defining the properties of all the variables, instead of clicking on the
OK button, you can click on the Paste button. This will open the SPSS Syntax Editor screen
into which all the SPSS commands used to define the variable properties will be pasted.
You can save this syntax into a file for future use. The next time you have to import your file
into SPSS, you will not need to go through all the steps shown above to define the
variable properties. Instead, you can open the syntax file you saved and execute all the commands
in it, and the variable properties will be defined.
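The idea of saving the property definitions once and re-applying them after every import can also be illustrated outside SPSS. The following is a minimal Python sketch, not SPSS syntax and not what the Paste button produces. The variable name age, its label and the sample rows are hypothetical, and -99 is the example missing-value code mentioned in Table - 02.

```python
# Hypothetical property definitions, saved once and reused after each import.
VARIABLE_PROPERTIES = {
    "age": {"label": "Age in years", "missing": {-99}},
}


def apply_missing_codes(rows, properties):
    """Return copies of the rows with user-defined missing codes set to None."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy, so the imported data stays untouched
        for name, props in properties.items():
            if new_row.get(name) in props["missing"]:
                new_row[name] = None
        cleaned.append(new_row)
    return cleaned


raw_rows = [{"age": 29}, {"age": -99}, {"age": 41}]  # made-up sample data
clean_rows = apply_missing_codes(raw_rows, VARIABLE_PROPERTIES)
print(clean_rows)
```

Because the definitions live in one place, re-importing the data only requires running the same script again.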
Concept Check:
1) Give 3 examples of Nominal variables in the Titanic dataset.
ANSWER:
3) What is the difference between Nominal and Ordinal variables?
ANSWER:
4) List the variables in the Titanic dataset that:
a) Can be placed on a scale of measurement.
ANSWER:
b) Can be considered Ordinal Variables.
ANSWER:
c) Are strings.
ANSWER:
5) Can .docx files be read into SPSS?
ANSWER:
A useful first step is to use the SPSS Frequencies command found from the menu.
1. Click on Analyze > Descriptive Statistics > Frequencies
2. Select all the variables in the list (except ones that represent serial number of cases or in
the example data set the Name of Passenger variable because one would expect a
name to be unique to a passenger).
3. Click on the Statistics button
4. In the Frequency statistics window, place a check mark against: Mean, Median, Mode and
any other optional statistic that you may be interested in examining.
5. Click on Continue
6. Click on Close
SPSS opens an Output Window and displays pages of summary statistics and frequency tables.
These can be used to:
- spot abnormal and extreme values (for example, Age could have been entered as 250 in a
particular case)
- count the number of cases with missing values (i.e., cases which have no data recorded for the
variable)
- identify variables that can be recoded into groups (e.g., Fare could be recoded into Low,
Medium and High)
- get a general feel for the integrity and suitability of the data for further analysis
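The checks listed above can be sketched in a few lines of Python to show what a frequency table is doing. This is only an illustration under stated assumptions: the Age values are made up, None stands for a case with no data recorded, and the plausible range of 0 to 120 is an assumption chosen for this sketch.

```python
from collections import Counter

# Made-up Age values; None stands for a case with no data recorded.
ages = [22, 38, None, 35, 250, 22, None, 54]


def frequency_table(values):
    """Count how often each non-missing value occurs, sorted by value."""
    counts = Counter(v for v in values if v is not None)
    return sorted(counts.items())


def count_missing(values):
    """Number of cases with no data recorded."""
    return sum(1 for v in values if v is None)


def out_of_range(values, low, high):
    """Return the non-missing values that fall outside a plausible range."""
    return [v for v in values if v is not None and not (low <= v <= high)]


age_table = frequency_table(ages)
n_missing = count_missing(ages)
suspect_ages = out_of_range(ages, 0, 120)  # 0 to 120 is an assumed range

for value, count in age_table:
    print(value, count)
print("missing:", n_missing)
print("suspect:", suspect_ages)
```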
As you would have observed, for variables measured on a scale (like Age and Fare), the
frequency table could be very long because each case is likely to have a unique number.
For scale variables, it is more informative to generate descriptive statistics.
1. Go to Analyze > Descriptive Statistics > Descriptives
2. Select the variables Age and Fare
3. Set the Options for the statistics you wish to see
4. Click OK.
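To make clear what such a summary contains, here is a minimal Python sketch that computes a handful of descriptive statistics for one scale variable. The fare values are made up and the function name describe is hypothetical. The statistics shown (count, minimum, maximum, mean and sample standard deviation) are chosen for illustration and are not a claim about exactly which options you selected in SPSS.

```python
import statistics

fares = [10.0, 20.0, 30.0, 40.0, 50.0]  # made-up values


def describe(values):
    """Summarise a scale variable with a few descriptive statistics."""
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
    }


summary = describe(fares)
for name, value in summary.items():
    print(name, value)
```

Five numbers summarise the variable here, whereas a frequency table would need one row per distinct fare.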
We have used the Frequency distribution here to detect wrongly coded variables and to spot
abnormal or extreme values in the data.
However, the Frequency distribution plays a greater role in statistics. It provides a useful summary
of the data being studied. It is part of a collection of statistics known as Descriptive Statistics,
which are used to describe the data. In particular, the output of the Frequencies command can
include measures of central tendency (the mean, median and mode) and of dispersion (the spread
of the data) for each variable.
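As a small worked example of these measures, the following Python sketch computes the mean, median, mode and range for a made-up list of ages. The values are invented for illustration and do not come from the Titanic dataset.

```python
import statistics

ages = [22, 24, 24, 30, 35, 45]  # made-up values, already sorted

mean_age = statistics.mean(ages)      # sum of the values divided by their count
median_age = statistics.median(ages)  # middle value (average of the two middle ones)
mode_age = statistics.mode(ages)      # most frequently occurring value
age_range = max(ages) - min(ages)     # a simple measure of spread

print(mean_age, median_age, mode_age, age_range)
```

Here the mean is 30, the median is 27 (halfway between 24 and 30), the mode is 24 and the range is 23.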
Test - 1
Look at the outputs of the Descriptive Statistics and Frequencies command and answer the
following:
1) What is the mean Fare paid by passengers on the Titanic?
ANSWER:
2) What is the mode of the Fares paid by passengers on the Titanic?
ANSWER:
3) How many cases in the Titanic dataset do not have Age entered?
ANSWER:
4) What is the mean Age of passengers on the Titanic?
ANSWER:
5) What is the median Age of passengers on the Titanic?
ANSWER:
6) What is the proportion of passengers on the Titanic who survived?
ANSWER:
7) How many passengers on the Titanic did not pay any fare?
ANSWER:
5 SPSS: Histograms
While the Frequency distribution displays a table of numbers that summarizes the distribution of
values of each variable, showing how the values are spread from minimum to maximum, the
Histogram provides a graphical representation of the distribution.
In SPSS, histograms are produced from the same menu option that produced frequency tables.
1. Click on Analyze > Descriptive Statistics > Frequencies
2. Select the variables for which you want to produce histograms (select Age and Pclass as
an example)
3. At the bottom of the variable select screen, uncheck the check-box against the label
Display Frequency Tables
4. Click on the Charts button
5. Select the radio button Histograms
6. Click on Continue
7. Click on Close
The histogram will be displayed in the currently open SPSS output window.
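What a histogram does with the data can be sketched in Python: the values are grouped into equal-width bins and the number of cases in each bin is counted. The ages are made up, the function name histogram_counts is hypothetical, and the bin width of 20 is an assumption for this sketch. SPSS chooses its own bin widths when it draws the chart.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width bins beginning at start.

    Bin i covers start + i * width up to, but not including,
    start + (i + 1) * width. Values outside all bins are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


ages = [2, 19, 22, 24, 27, 31, 35, 38, 54, 61]  # made-up values
counts = histogram_counts(ages, start=0, width=20, n_bins=4)

# Print a simple text histogram, one row of stars per bin.
for i, count in enumerate(counts):
    low = i * 20
    print(f"{low:>2}-{low + 19:<2} {'*' * count}")
```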
Figure - 04
typing mistakes
Data cleaning typically takes up a large share of the time spent on data analysis. It is nevertheless
a very important step, because erroneous data can lead to erroneous conclusions.
This session will be conducted as a hands-on exercise under supervision, according to the
following instructions.
Lab Exercise: Correcting and Cleaning Data
1. Read the supplied data file: titanic_ex_II.csv
2.
3.
4.
5.
6.
7.
8.
OR
1. Read the data from the file cafedata.xls into SPSS. Study the accompanying file
cafedata_documentation.txt which provides information about this dataset.
2. The article associated with this data set appears in the Journal of Statistics Education,
Volume 19, Number 1 (March 2011) issue. Read this article here:
http://www.amstat.org/publications/jse/v19n1/depaolo.pdf
3. Once the data has been read into SPSS, assign meaningful variable labels and value
labels.
4. Produce some frequency tables and histograms.
Online Resources:
1. https://statistics.laerd.com/statistical-guides/types-of-variable.php
2. https://statistics.laerd.com/statistical-guides/measures-central-tendency-mean-mode-median.php