
DATA ANALYSIS STEP BY STEP CHECKLIST

Luca Camerini & Carmen Faustinelli

1. Dataset preparation
   a. For each variable, enter the values, labels, and level of measurement.
   b. Give the variables meaningful code names.
   c. Code missing data (both system-missing and filtered-out missing).
2. Descriptive statistics
   a. For each categorical/ordinal/dichotomous variable, report counts, percentages, mode, and (where the categories are ordered) median.
   b. For each continuous variable, report the mean, mode, median, variance, standard deviation, skewness, kurtosis, and missing rate.
3. Preliminary data analyses
   a. Missing data analysis and, if needed, imputation. Both univariate and multivariate.
   b. Normality analysis. Both univariate and multivariate.
   c. Outlier analysis. Both univariate and multivariate.
   d. Power analysis.
4. Correlation analysis
   a. Correlate all the items related to the predictors with the items related to the outcomes and explore the correlation matrix. If a predictor has a very low correlation with the outcome, it is unlikely that any subsequent analysis involving that predictor will turn out to be significant. Be sure to use the appropriate correlation coefficient for the nature of the variables involved (e.g. Pearson vs. Spearman).
5. Measurement analysis
   a. At the most basic level, report Cronbach's alpha for the items that are assumed to belong to the same scale/index.
   b. At a more advanced level, perform a Confirmatory Factor Analysis to refine the scales.
   c. If nothing works, conduct an Exploratory Factor Analysis (this is to be avoided and considered the last option).
   d. If there are reasons to do so, check for measurement invariance.

6. Structural analysis
   a. Check that the assumptions of your test are met (e.g. homogeneity of variance for the ANOVA family, or absence of multicollinearity for the regression family). What follows assumes that regression is used (by far the most frequent case).
   b. First, check for moderated/interaction effects (if any). This can be accomplished by traditional methods (e.g. product-term analysis) or by more complex techniques (e.g. multigroup SEM solutions).
   c. Second, test the indirect effects (if any). This can be accomplished by traditional methods (e.g. the Baron-Kenny approach) or by more complex techniques (e.g. SEM).
   d. Third, run the regression analyses for all the assumed direct effects.
   e. NOTE: do not start with direct effects if the hypotheses involve mediation or moderation. Test the more complex effects first.
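
As an illustration of the product-term approach mentioned in step 6b, the idea can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of the original checklist: the data are simulated, the names `ols`, `simulate`, and `fit_moderation` are hypothetical, and the small least-squares solver only stands in for a statistics package, which would also report the standard errors and p-values needed to judge whether the interaction is significant.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def simulate(n=500, seed=0):
    """Hypothetical data: outcome y depends on x, m, and their product."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)  # predictor
        m = rng.gauss(0, 1)  # assumed moderator
        y = 1.0 + 0.5 * x + 0.3 * m + 0.4 * x * m + rng.gauss(0, 0.5)
        rows.append((x, m, y))
    return rows


def fit_moderation(rows):
    """Regress y on mean-centered x, mean-centered m, and their product."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_m = sum(r[1] for r in rows) / n
    X = [
        [1.0, x - mean_x, m - mean_m, (x - mean_x) * (m - mean_m)]
        for x, m, _ in rows
    ]
    y = [r[2] for r in rows]
    b0, bx, bm, bxm = ols(X, y)
    return {"intercept": b0, "x": bx, "m": bm, "x_by_m": bxm}


if __name__ == "__main__":
    for name, value in fit_moderation(simulate()).items():
        print(f"{name:>10}: {value: .3f}")
```

The coefficient labelled `x_by_m` is the estimate of the moderated effect. Mean-centering the predictors before forming the product does not change that coefficient; it only makes the lower-order coefficients readable as effects at the mean of the other variable.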
