To reverse score, switch the highest and lowest numerical values of a response, then substitute the
next highest and lowest values, and so on. On a five-point scale, for example, 1 becomes 5, 2 becomes
4, and 3 stays the same. Non-numerical responses, such as "other" or "not applicable," should not be
included in reverse scoring.
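The swap described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `reverse_score`, the 1-to-5 default scale, and the sample responses are all assumptions made for the example.

```python
def reverse_score(response, low=1, high=5):
    """Reverse-score a numeric response on a low..high scale.

    Non-numeric answers (for example "other" or "not applicable") are
    returned unchanged, because they are left out of reverse scoring.
    """
    if isinstance(response, bool) or not isinstance(response, (int, float)):
        return response
    # Swapping highest with lowest, next highest with next lowest, and so on
    # is the same as subtracting the response from (low + high).
    return low + high - response


# Hypothetical responses to one negatively worded item.
responses = [1, 2, 5, "not applicable", 4]
reversed_responses = [reverse_score(r) for r in responses]
print(reversed_responses)  # [5, 4, 1, 'not applicable', 2]
```

For a scale that does not run from 1 to 5, pass its own endpoints, for example `reverse_score(0, low=0, high=4)`.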
Recoding answers
You may need to recode some answers to questions that have an open-ended "other" response option.
For example, one person may answer the question, "Do you consider yourself African American,
Caucasian, Asian, Hispanic, or other?" by circling "other" and writing, "Chinese." To maintain
consistency, code the answer as "Asian" rather than "other."
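One way to keep such recoding consistent is to write the rule down once as a lookup table. The sketch below assumes a hypothetical table, `WRITE_IN_TO_CATEGORY`, that holds only the single mapping from the example above; a real table would be built from the write-in answers actually found in your data.

```python
# Hypothetical lookup table for write-in answers. Only the mapping from the
# example in the text is included; extend it to fit your own survey.
WRITE_IN_TO_CATEGORY = {
    "chinese": "Asian",
}


def recode(choice, write_in=""):
    """Return the category to record for one respondent.

    Answers other than "other" are kept as they are. An "other" answer is
    recoded when its write-in text matches the lookup table and is left as
    "other" when it does not.
    """
    if choice.strip().lower() != "other":
        return choice
    return WRITE_IN_TO_CATEGORY.get(write_in.strip().lower(), "other")


print(recode("other", "Chinese"))  # Asian
print(recode("Hispanic"))          # Hispanic
```

Write-ins that match nothing in the table stay coded as "other," so they can be reviewed by hand instead of being guessed at.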
To create more meaningful categories, combine the "agree" and "strongly agree" categories to obtain
the percentage of agreement (65%) and the "disagree" and "strongly disagree" categories to obtain
the percentage of disagreement (24%).
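Combining categories amounts to mapping each response to a broader label and then counting. The following sketch uses made-up counts for 100 respondents, chosen only so that the combined totals match the 65% and 24% quoted above; the breakdown within each combined category, the "neutral" option, and the names `COLLAPSE` and `collapsed_percentages` are assumptions.

```python
from collections import Counter

# Map each original response option to its combined category.
COLLAPSE = {
    "strongly agree": "agree",
    "agree": "agree",
    "neutral": "neutral",
    "disagree": "disagree",
    "strongly disagree": "disagree",
}

# Hypothetical raw responses from 100 respondents.
responses = (
    ["strongly agree"] * 25
    + ["agree"] * 40
    + ["neutral"] * 11
    + ["disagree"] * 14
    + ["strongly disagree"] * 10
)


def collapsed_percentages(raw):
    """Return the percentage of responses in each combined category."""
    counts = Counter(COLLAPSE[r] for r in raw)
    total = len(raw)
    return {category: 100 * n / total for category, n in counts.items()}


percentages = collapsed_percentages(responses)
print(percentages["agree"])     # 65.0
print(percentages["disagree"])  # 24.0
```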
Compute cross-tabulations
Compute cross-tabulations to see relationships between responses for two survey questions. For
example, you may ask students to provide their race in the first question and rate their overall
satisfaction with a project in the second question. Display response choices for the first question as
column labels and choices for the second question as row labels. A cross-tabulation might reveal that
the percentage of participants who were satisfied with a project depended on their race.
Cross-tabulations can highlight contrasts between groups of participants or findings that are
consistent across groups. Your research questions and hypotheses should help you choose which
cross-tabulations to compute.
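A cross-tabulation is a count of respondents for every combination of two answers. The sketch below builds one with the Python standard library and converts each column to percentages so that groups of different sizes can be compared. The data are invented, the neutral labels "Group A" and "Group B" stand in for the first question's response choices, and the helper names are assumptions.

```python
from collections import Counter

# Hypothetical (first answer, second answer) pairs, one per respondent.
answers = (
    [("Group A", "satisfied")] * 30
    + [("Group A", "dissatisfied")] * 10
    + [("Group B", "satisfied")] * 15
    + [("Group B", "dissatisfied")] * 25
)


def crosstab(pairs):
    """Count respondents for each (column, row) combination.

    Returns {column label: Counter({row label: count})}.
    """
    table = {}
    for column, row in pairs:
        table.setdefault(column, Counter())[row] += 1
    return table


def column_percentages(table):
    """Convert the counts in each column to percentages of that column."""
    return {
        column: {row: 100 * n / sum(counts.values()) for row, n in counts.items()}
        for column, counts in table.items()
    }


percentages = column_percentages(crosstab(answers))
print(percentages["Group A"]["satisfied"])  # 75.0
print(percentages["Group B"]["satisfied"])  # 37.5
```

Percentages within each column are used here because raw counts alone can mislead when the groups differ in size.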
Chi-square statistics
Chi-square statistics, used with data that fall into mutually exclusive categories (such as gender)
rather than continuous numerical data, tell you whether one categorical variable is independent of
another. For example, if you randomly survey 75 male and 75 female math students at UT Austin, a
chi-square test could tell you if satisfaction with math courses is independent of gender among UT
Austin students. Statistical programs provide a p value: the probability of seeing group differences
at least as large as those in your sample if the two variables were in fact independent. A p value of
.03 indicates that, if satisfaction did not depend on gender among UT Austin students, gender
differences as large as those in your sample would arise by chance alone only about 3% of the time.
Results are generally considered to be statistically significant if they yield p values of .05 or
less. Chi-square tests are usually held to require an expected count of at least five per cell, so, as
with all statistical tests, make sure you understand and meet the assumptions underlying the test.
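To make the calculation concrete, here is a minimal sketch of a chi-square test of independence for a 2 x 2 table, written with the standard library only. The counts are invented to match the 75-and-75 design in the example; the function name is an assumption, no continuity correction is applied, and the p value formula used is valid only for one degree of freedom (a 2 x 2 table). A statistical package would normally do this work.

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2 x 2 table of counts.

    Returns (chi-square statistic, p value, smallest expected count).
    The p value formula holds only for one degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, min_expected


# Hypothetical counts: rows are men and women, columns are
# satisfied and dissatisfied.
observed = [[45, 30],
            [30, 45]]
chi2, p_value, min_expected = chi_square_2x2(observed)
print(chi2)               # 6.0
print(round(p_value, 3))  # 0.014
print(min_expected)       # 37.5
```

With these made-up counts the p value is below .05 and every expected count is well above five, so the result would be reported as statistically significant.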
Correlation
Correlations indicate whether a statistically significant positive or negative relationship exists
between two continuous variables. For example, a positive correlation between time spent watching
television and time spent watching movies at theaters indicates that the more time respondents
spend watching television, the more time they tend to spend watching movies. Finding significant
correlations between variables does not tell you what causes those relationships.
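The direction and strength of such a relationship are usually summarized by the Pearson correlation coefficient, which runs from -1 to +1. The sketch below computes it directly from its definition; the weekly viewing hours are invented, the function name `pearson_r` is an assumption, and no significance test is included.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)


# Hypothetical weekly hours for five respondents.
tv_hours = [2, 4, 6, 8, 10]
movie_hours = [1, 1, 2, 3, 3]
r = pearson_r(tv_hours, movie_hours)
print(round(r, 2))  # 0.95
```

A value near +1, as here, means the two measures rise together; it says nothing about whether one causes the other.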
Linear regression
Linear regression is used to determine a line that best summarizes a set of data points and enables
the prediction of an outcome variable using one or more continuous variables. For example, you
might use course satisfaction ratings to predict course grades.
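For a single predictor, the best-fitting line can be found with the ordinary least-squares formulas for slope and intercept. The sketch below is a minimal illustration with invented satisfaction ratings and grades; the function name `fit_line` is an assumption, and a statistical package would also report how well the line fits.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: satisfaction rating (1 to 5) and course grade (percent).
satisfaction = [1, 2, 3, 4, 5]
grades = [60, 70, 75, 85, 90]
slope, intercept = fit_line(satisfaction, grades)
print(slope, intercept)       # 7.5 53.5
print(intercept + slope * 4)  # 83.5, the predicted grade for a rating of 4
```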
Confidence interval
A confidence interval provides a range of values likely to include an unknown population
value. Sampling error is usually expressed as a confidence interval. Even if you randomly sample to
ensure survey respondents are representative of the population of interest, there will still be some
error because of chance variation. For example, responses of 200 randomly selected graduate
students to a UT Austin library survey question asking how often they use interlibrary services during
an academic year yield a mean of 4.5 uses. A 95% confidence interval, expressed as 2.8 to 5.8, tells
you that you can be 95% confident that the actual mean number of uses for all graduate students
falls between 2.8 and 5.8 per year: intervals constructed this way capture the true population value
in about 95% of repeated random samples. The width of the confidence interval provides an estimate of
the degree of uncertainty about the unknown population value. Confidence intervals are only valid if a
survey sample is randomly selected.
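For a mean, a common large-sample approximation puts the 95% confidence interval at the sample mean plus or minus 1.96 standard errors. The sketch below applies that approximation to invented data for 200 respondents; it does not reproduce the library survey figures above, the function name is an assumption, and small samples would call for a t-based interval instead.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Approximate confidence interval for a mean (z=1.96 gives about 95%).

    Uses the large-sample normal approximation: mean +/- z * standard error.
    """
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# Hypothetical number of uses per year reported by 200 respondents.
uses = [2, 4, 4, 4, 5, 5, 7, 9] * 25
low, high = mean_confidence_interval(uses)
print(round(low, 2), round(high, 2))  # 4.72 5.28
```

Because the standard error shrinks as the sample grows, larger random samples give narrower intervals.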
Additional information
American Association for Public Opinion Research (2000). Standard Definitions: Final Dispositions
of Case Codes and Outcome Rates for Surveys, Second Edition [Electronic version]. Ann Arbor,
MI: AAPOR.
Linear regression. (n.d.). Retrieved June 21, 2006, from the Yale University Department of Statistics
Index of Courses 1997-98 Web site: http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm