
Quantifying Data

Data Entry
Define variables, enter case data, conduct runs
Coding and Recoding
If numeric values not pre-assigned, decide
on coding system
If there is open-ended data, would need to
decide how to deal with responses
Defining your variables

Data Cleaning
Reread each set of responses back (immediately) to
confirm accuracy
Possible-code cleaning
easiest way to check is to run a frequency
distribution
Contingency cleaning
Applies to the "if" (follow-up) questions
Sort by response
e.g., sort on "Do you recycle?", then check the
"What do you recycle?" variable
Can also run cross tabs and make sure the cells
that should be empty are empty
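
The two cleaning checks above can be sketched in a few lines of Python. This is only an illustration: the records, variable names, and code values are made up, and a real project would read them from its own data file.

```python
from collections import Counter

# Hypothetical coded survey records; names and codes are illustrative only.
# recycles: 1 = yes, 2 = no; what_recycled: 1 = paper, 2 = glass, None = skipped
VALID_RECYCLES = {1, 2}

cases = [
    {"id": 1, "recycles": 1, "what_recycled": 1},
    {"id": 2, "recycles": 2, "what_recycled": None},
    {"id": 3, "recycles": 7, "what_recycled": None},  # 7 is not a legal code
    {"id": 4, "recycles": 2, "what_recycled": 2},     # answered a skipped item
]

# Possible-code cleaning: a frequency distribution exposes illegal codes.
frequencies = Counter(case["recycles"] for case in cases)
illegal_codes = sorted(code for code in frequencies if code not in VALID_RECYCLES)

# Contingency cleaning: non-recyclers should have no follow-up answer.
contingency_errors = [
    case["id"]
    for case in cases
    if case["recycles"] == 2 and case["what_recycled"] is not None
]

print(frequencies)
print("Illegal codes:", illegal_codes)
print("Cases to recheck:", contingency_errors)
```

Any case flagged here would be checked against the original questionnaire before analysis.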

Basic Analysis: Measures of Central Tendency


Mean: sum of values divided by the
number of cases
simple average

Median: middle attribute in a ranked list of
observed attributes
unaffected by extreme cases

Mode: most frequently occurring attribute
used with nominal variables, e.g., sex
most respondents were women
usually reported with a percentage:
60% were women

Cross Tabs
Used often with Bivariate data
Convention usually places
independent variables across the top in columns
dependent variables in rows below

Coding and data entry options


Transfer sheets are special forms ruled off in
80 columns
Edge coding involves recording code #'s in
margins of questionnaires
Direct data entry involves entering data directly
into the computer, eliminating transfer sheets
Data entry by interviewer (CATI)
Optical scan sheets

Coding
What is it?
It is the assignment of numerical values to
information or responses gathered by
a research instrument

Codebook: describes the locations of
variables and lists the codes assigned to
the attributes of the variables
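
A codebook can be pictured as a lookup table. The sketch below is a minimal illustration, assuming two made-up variables, made-up column positions, and an assumed missing-data code of 9; none of these come from an actual study.

```python
# Hypothetical codebook: column position plus the code for each attribute.
codebook = {
    "sex": {"column": 1, "codes": {"female": 1, "male": 2}},
    "recycles": {"column": 2, "codes": {"yes": 1, "no": 2}},
}
MISSING = 9  # assumed code for blank or unusable answers


def code_response(variable, answer):
    """Return the numeric code for a raw answer, or MISSING if unknown."""
    codes = codebook[variable]["codes"]
    return codes.get(answer.strip().lower(), MISSING)


def code_case(raw_case):
    """Turn one questionnaire into a list of codes ordered by column."""
    ordered = sorted(codebook, key=lambda name: codebook[name]["column"])
    return [code_response(name, raw_case.get(name, "")) for name in ordered]


row = code_case({"sex": "Female", "recycles": "no"})
print(row)
```

Keeping locations and codes in one place means the same table can drive both data entry and later cleaning.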

Data Management Process


concerned with the process by which raw
data gathered by some instrument are
converted into numbers for analysis
purposes

Collect information with data-gathering
instrument
Use codebook to transfer this information to a
transfer sheet or code sheet (optional)
Create data file from information on code
sheet by entering data from a computer
keyboard
Check/clean up data file for accuracy
Data cleaning done by
Computer edit programs
Examine distributions
Contingency cleaning

What about open-ended items?


Read through responses and create a
preliminary code based on responses

If more than 10% of responses fall into the
"other" category, code needs to be
revised to include many of these
responses
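
The 10% rule of thumb is easy to check once responses have been given preliminary codes. A minimal sketch, assuming made-up category labels and a hypothetical helper name:

```python
from collections import Counter


def other_share(coded_responses, other_code="other"):
    """Fraction of coded responses that landed in the 'other' category."""
    if not coded_responses:
        return 0.0
    counts = Counter(coded_responses)
    return counts[other_code] / len(coded_responses)


def needs_revision(coded_responses, threshold=0.10):
    """True when more than `threshold` of responses are coded 'other'."""
    return other_share(coded_responses) > threshold


# Made-up preliminary codes for ten open-ended answers.
coded = ["cost", "time", "other", "cost", "other",
         "time", "cost", "cost", "time", "cost"]
print(other_share(coded), needs_revision(coded))
```

When the check fails, the "other" answers are reread and new categories are added for the themes that recur.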

Elementary Quantitative Analyses


To understand the meaning of
univariate, bivariate, and multivariate
analysis
To become familiar with the meaning
of several univariate and bivariate
statistics

Analysis Strategies
Why do we have to have them?
People who read our research
are interested in the highlights
Should try to communicate
findings in an understandable and
painless fashion

Three types of analysis


Univariate analysis
the examination of the distribution of cases on
only one variable at a time (e.g., college
graduation)
Bivariate analysis
the examination of two variables
simultaneously (e.g., the relation between
gender and college graduation)
Multivariate analysis
the examination of more than two variables
simultaneously (e.g., the relationship between
gender, race, and college graduation)

Purpose
Univariate analysis
Purpose: description
Bivariate analysis
Purpose: determining the empirical
relationship between the two variables
Multivariate analysis
Purpose: determining the empirical
relationship among the variables

Types of Statistics
Techniques that summarize and describe
characteristics of a group or make
comparisons of characteristics between groups
are known as descriptive statistics.
Inferential statistics are used to make
generalizations or inferences about a
population based on findings from a sample.
The choice of a type of analysis is based on the
evaluation questions, the type of data
collected, and the audience who will receive
the results.

Univariate Analysis
Involves examination of the distribution
of cases on only ONE variable at a time
Frequency distributions are listings of the
number of cases in each attribute of a
variable
Ungrouped frequency distribution
Grouped frequency distribution
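
As a rough illustration of the difference between the two, the sketch below counts the same made-up ages both ways. The 10-year interval width and the helper name are assumptions chosen for the example.

```python
from collections import Counter

ages = [18, 19, 19, 21, 24, 25, 25, 25, 31, 38]  # made-up values

# Ungrouped: one row per distinct value.
ungrouped = Counter(ages)


def age_group(age, width=10):
    """Label the interval an age falls in, e.g. 25 -> '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Grouped: values collapsed into intervals of equal width.
grouped = Counter(age_group(age) for age in ages)

for label in sorted(grouped):
    print(label, grouped[label])
```

Grouping trades detail for readability, which is usually worthwhile when a variable has many distinct values.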

Proportions express number of cases of
the criterion variable as part of the total
population; frequency of criterion
variable divided by N

Percentages are simply 100 × proportion
Or [100 × (frequency of criterion
variable divided by N)]

Rates make comparisons more
meaningful by controlling for
population differences
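
These three measures are simple arithmetic. The sketch below uses made-up figures, and the "per 1,000" base for rates is an assumption; other bases such as per 100,000 are equally common.

```python
def proportion(frequency, n):
    """Frequency of the criterion attribute divided by N."""
    return frequency / n


def percentage(frequency, n):
    """100 x proportion."""
    return 100 * proportion(frequency, n)


def rate(events, population, per=1000):
    """Events per `per` people, so places of different size can be compared."""
    return events / population * per


# Made-up figures: 60 women among 100 respondents.
print(proportion(60, 100), percentage(60, 100))

# Made-up figures: town A has more events, town B has the higher rate.
print(rate(50, 10_000), rate(30, 2_000))
```

The second comparison shows why rates matter: raw counts alone would suggest the larger town has the bigger problem.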

Measures of Central Tendency


Measures of central tendency reflect the
central tendencies of a distribution
Mode reflects the attribute with the
greatest frequency
Median reflects the attribute that
cuts the distribution in half
Mean reflects the average; sum of
attributes divided by # of cases
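
Python's standard `statistics` module computes all three measures. The values below are made up; the income list includes one extreme case to show how differently the mean and the median react to it.

```python
import statistics

incomes = [20, 22, 25, 27, 30, 31, 400]  # made-up values with one extreme case
print(statistics.mean(incomes))    # pulled upward by the extreme case
print(statistics.median(incomes))  # middle of the ranked list

# Mode for a nominal variable, reported with its percentage.
sex = ["woman", "woman", "man", "woman", "man"]
mode = statistics.mode(sex)
share = 100 * sex.count(mode) / len(sex)
print(f"Mode: {mode} ({share:.0f}%)")
```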

Measures of Dispersion
Measures of dispersion reflect the spread
of the distribution

Range is the difference between largest &
smallest scores; high minus low
Variance is the average of the squared
differences between each observation and
the mean
Standard deviation is the square root of
variance
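
The three definitions translate directly into code. This sketch uses made-up scores and divides by the number of cases, as in the definition above; note that many sample formulas divide by N - 1 instead (in Python, `statistics.pvariance` divides by N and `statistics.variance` by N - 1).

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

score_range = max(scores) - min(scores)  # high minus low
mean = sum(scores) / len(scores)
# Average of the squared differences from the mean (divides by N).
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
std_dev = math.sqrt(variance)

print(score_range, variance, std_dev)
```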

Types of Variables
Continuous: increase steadily in tiny
fractions
Discrete: jumps from category to
category

Subgroup Comparisons
Somewhere between univariate &
bivariate are Subgroup Comparisons
Present descriptive univariate data
for each of several subgroups
Ratios: compare the number of
cases in one category with the
number in another
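
A subgroup comparison can be sketched by splitting the cases first and then describing each part. The subgroup labels and scores below are made up for illustration.

```python
from collections import defaultdict

# Made-up cases: (subgroup, score)
cases = [("urban", 4), ("urban", 6), ("rural", 3), ("rural", 5), ("urban", 8)]

by_group = defaultdict(list)
for group, score in cases:
    by_group[group].append(score)

# Univariate description (here, the mean) reported separately per subgroup.
group_means = {group: sum(vals) / len(vals) for group, vals in by_group.items()}

# Ratio: number of cases in one category compared with another.
urban_to_rural = len(by_group["urban"]) / len(by_group["rural"])

print(group_means, urban_to_rural)
```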

Bivariate Analysis
Bivariate analysis focuses on the
relationship between two variables

Contingency Tables
Format:
attributes of independent
variable are used as column headings
and attributes of the dependent
variable are used as row headings
Guidelines for presenting &
interpreting contingency tables

Contents of table described in title


Attributes of each variable clearly described
Base on which percentages are computed
should be shown
Norm is to percentage down & compare across
Table should indicate # of cases omitted from
analysis
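
The guidelines above can be followed in a small sketch that percentages down the columns and prints the base for each. The gender and graduation counts are made up for illustration, and the layout is only one reasonable way to print such a table.

```python
from collections import Counter

# Made-up cases: (independent variable: gender, dependent: graduated)
cases = [
    ("women", "yes"), ("women", "yes"), ("women", "yes"), ("women", "no"),
    ("men", "yes"), ("men", "no"), ("men", "no"), ("men", "no"),
]

cell_counts = Counter(cases)
column_totals = Counter(gender for gender, _ in cases)
columns = ["women", "men"]  # attributes of the independent variable
rows = ["yes", "no"]        # attributes of the dependent variable


def column_percent(gender, outcome):
    """Percentage of a column's cases that fall in the given row."""
    return 100 * cell_counts[(gender, outcome)] / column_totals[gender]


print(f"{'Graduated':<10}" + "".join(f"{c:>8}" for c in columns))
for outcome in rows:
    cells = "".join(f"{column_percent(c, outcome):>7.0f}%" for c in columns)
    print(f"{outcome:<10}{cells}")
# Show the base on which the percentages were computed.
print(f"{'(N)':<10}" + "".join(f"{column_totals[c]:>8}" for c in columns))
```

Reading across a row then compares the subgroups directly, since each column sums to 100%.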

Multivariate Analysis
Multivariate analysis allows the
separate and combined effects of
the independent variables to be
examined
