
MK 852: Introduction to SPSS (Statistical Package for the Social Sciences)

I. To Begin:
The first thing you need to do every time you open SPSS is to open a data file. (For this tutorial we'll use an existing data file, rockvisaedit.sav.) SPSS shows data in two different views:
Data View lists the responses of all respondents who were interviewed.
Variable View shows the name and description of each variable as well as the possible values it can take.

II. Cleaning the Data: Eliminating the Missing Values


Few databases are perfect. There may be missing values and data entry or coding errors, so one of the first steps should be to check the database for errors and remove them. Missing values skew the results of data summaries and therefore should be marked as missing so that they are left out of the analysis.
a. Open Variable View.
b. Go to the Missing column.
c. Click on the cell of the appropriate question. This will open the Missing Values window.
d. Click on Discrete Missing Values. Enter 99. Click OK. (Alternatively, you can enter a Range, which will take care of everything that is not a correct response.)
Hint: After doing this for one variable, you can copy and paste the value in the Missing column to other cells.
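To see why the missing code matters, the effect can be sketched outside SPSS. The short Python example below is only an illustration: the response values are hypothetical, and the function name drop_missing is made up for this sketch. It assumes, as in step d, that 99 is the code for "no answer".

```python
from statistics import mean

MISSING_CODE = 99  # the discrete missing value entered in step d


def drop_missing(values, missing_code=MISSING_CODE):
    """Return only the valid responses, leaving out the missing code."""
    return [v for v in values if v != missing_code]


# Hypothetical responses on a 1-5 scale; 99 marks a missing answer.
responses = [4, 5, 99, 3, 99, 4]

valid = drop_missing(responses)
mean_with_missing = mean(responses)  # distorted because 99 is treated as a real answer
mean_valid = mean(valid)             # based on the valid answers only

print("Mean including the 99s:", mean_with_missing)
print("Mean of valid answers:", mean_valid)
```

Once 99 is declared as a missing value in Variable View, SPSS leaves those cases out of summaries in a similar way.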

III. Obtaining Descriptive Statistics of the Data


In SPSS, there are three different ways of obtaining descriptions of the data: Descriptives, Frequencies and Crosstabs.

1. Running Descriptives:
a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Click Descriptives.
d. Select the required variables.
e. Click Options.
f. Select Mean (only for quantitative variables), Maximum, Minimum and Standard Deviation.
g. Click Continue. Click OK.
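The statistics selected in step f can also be checked by hand. The following Python sketch computes them for a small list of hypothetical ratings; the data and the function name describe are assumptions made for illustration, and the standard deviation is the sample (n - 1) version.

```python
from statistics import mean, stdev


def describe(values):
    """Return N, minimum, maximum, mean and sample standard deviation."""
    return {
        "N": len(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Mean": mean(values),
        "Std. Deviation": stdev(values),  # sample standard deviation (divides by n - 1)
    }


# Hypothetical ratings on a 1-5 scale, with missing values already removed.
ratings = [2, 4, 4, 5, 3, 4]

summary = describe(ratings)
for name, value in summary.items():
    print(f"{name}: {value}")
```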

Example: On average, how do respondents compare the Visa credit card offer to the card they use most often? (Hint: Determine the average value for variable q30f.)
a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Click Descriptives.
d. Select q30f.
e. Click Options.
f. Select Mean, Std Deviation, Minimum, Maximum.
g. Click Continue. Click OK.
NOTE: SPSS gives the output of the analysis in a separate output window. You can use the tables and charts in the output window by simply copying and pasting them into a Word document. If you will be using the same output later, you need to save it on your computer by clicking on the Save button (or alternatively choosing the Save option from the File menu).

2. Running Frequencies: This procedure also provides graphical displays for the description of data.

a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Click Frequencies.
d. Select the required variable.
e. Options available are as follows:
   Statistics for descriptive statistics of quantitative variables
   Charts for graphical display of data
   Format to specify the order in which the results are to be displayed
f. Click Continue. Click OK.

Example: Which card do the respondents use most often? (Hint: Use variable q3 for answer.)
a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Click Frequencies.
d. Select q3.
e. Check Display frequency tables.
f. Click Statistics.
g. Choose Minimum and Maximum.
h. Click Continue.
i. Click Charts.
j. Select Bar Charts. Select Percentages.
k. Click Continue. Click OK.

3. Running Crosstabs: This is useful in examining the way different variables may be associated with each other. It is generally restricted to non-metric measures, since associations among metric data can be assessed with other procedures such as correlations.
a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Select Crosstabs.

d. Choose one or more variables for Row and one or more variables for Column.
e. Select Statistics for tests of association.
f. Select Cells for observed and expected values.
g. Select Format.
h. Click OK.
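What a crosstab counts can be illustrated with a few lines of Python. This is a minimal sketch, not SPSS output: the two lists of answers are hypothetical, and the function names crosstab and column_percentages are made up for the example.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    return Counter(zip(rows, cols))


def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    col_totals = Counter()
    for (_, col), n in table.items():
        col_totals[col] += n
    return {(row, col): 100 * n / col_totals[col] for (row, col), n in table.items()}


# Hypothetical answers of eight respondents to two categorical questions.
attitude = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]   # row variable
card = ["Visa", "Visa", "Visa", "Other", "Other", "Other", "Visa", "Other"]  # column variable

table = crosstab(attitude, card)
percent = column_percentages(table)

for cell in sorted(table):
    print(cell, "count:", table[cell], "column %:", percent[cell])
```

Column percentages answer the question "within each column category, how are the row answers distributed?", which is usually the easiest way to compare groups.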

Example: Is there a relationship between respondents agreeing to a change in their attitude towards the use of credit cards in the past year and the type of card they use most often? (Hint: Use variables q3 and q20 for answer.)
a. Choose Analyze from the menu.
b. Click on Descriptive Statistics.
c. Select Crosstabs.
d. Choose variable q3 for column and variable q20 for row.
e. Click Statistics. Select Chi-Square tests.
f. Click Continue.
g. Click Cells. Select Column under Percentages.
h. Click Continue. Click OK.
Interpretation: To interpret the output, you need to look at the table called Chi-Square Tests, which tells you whether the relation is statistically significant or not.
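The logic behind the chi-square test can be sketched as follows: expected counts are computed from the row and column totals, and the statistic sums the squared differences between observed and expected counts. The Python example below is an illustration under stated assumptions: the 2 x 2 table of counts is hypothetical, the function name chi_square is made up, and the p-value formula used is valid only for 1 degree of freedom. It computes the uncorrected Pearson chi-square.

```python
import math


def chi_square(observed):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a table of observed counts given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2 x 2 table of counts (rows: attitude changed yes/no; columns: two card types).
observed = [[30, 10],
            [20, 40]]

stat, df = chi_square(observed)

# With 1 degree of freedom, the p-value equals erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print("Chi-square:", round(stat, 3), "df:", df, "p:", p_value)
print("Significant at .05" if p_value < 0.05 else "Not significant at .05")
```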

IV. Transforming Variables


This is required when variables need to be condensed for ease of further analysis, when they need to be reverse coded for negative responses, or when the average response to a question needs to be determined.

1. Recoding: We need to recode the variables for the following purposes:
   Reassigning values of existing variables
   Creating a new variable by collapsing ranges of existing values into new values
   Reverse coding negative items
a. Choose Transform from the menu.
b. Select Recode.
c. Select Into Different Variables.
d. Select the required variables (in the case of multiple variables, they must be of the same type, that is, either all numeric or all string).
e. Enter an output name for each new variable and select Change.
f. Click Old and New Values. Specify an old value and a new value. Click Add.
g. Click OK. Check that a new column is created at the end of the data editor file.

Example: Collapse responses to variable q1i into 3 categories: Agree, Disagree, and Neither agree nor disagree.
a. Choose Transform from the menu.
b. Select Recode.
c. Select Into Different Variables.
d. Select variable q1i.
e. Under Output Variable, go to Name. Type newq1i. Click Change.
f. Click Old and New Values.
g. Recode as follows:

   Old Value    New Value
   5            1
   4            1
   3            2
   2            3
   1            3
   6            System-missing

NOTE: When entering old and new values you have two options.
You can enter each value one by one using the Value option under both old and new values, e.g. Old Value: Value: 5, New Value: Value: 1.
You can specify a range under old values, e.g. Old Value: Range: 4 through 5, New Value: Value: 1.
Don't forget to click Add each time you enter an old and new value.

h. After you finish all old and new values, click Continue. Click OK. Check to see that the new variable has been added as a new column.

Creating dummy variables to represent a categorical variable: You can use recoding to create dummy variables. (You will need to execute this step before your regression analysis.)

Example: Create dummy variables from the level of education of the primary wage earner of a household. (Hint: Use answers for q36.) Create dummy variables for three categories: Less than high school graduates and high school graduates; Some college and college graduates; Any postgraduate work. Since you have three different categories in your analysis, you will need two dummy variables. Use Old and New Values in the Recoding to create dummy variables as follows:

   Dummy Variable   Old and New Values
   High School      Less than high school (1) and high school (2) = 1; all else (3, 4, 5) = 0
   College          Some college (3) and college graduates (4) = 1; all else (1, 2, 5) = 0
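The two recoding tasks above amount to a lookup from old values to new values. The Python sketch below mirrors the q1i recode table and the q36 dummy coding; the sample data are hypothetical, the function names recode and dummy are made up for illustration, and None stands in for a system-missing value.

```python
# Old value -> new value, as in the q1i example; 6 becomes system-missing (None).
RECODE_Q1I = {5: 1, 4: 1, 3: 2, 2: 3, 1: 3, 6: None}


def recode(values, mapping):
    """Build a new variable from old values; the original list is left untouched.
    Values that are absent from the mapping become None (an assumption of this sketch)."""
    return [mapping.get(v) for v in values]


def dummy(values, ones):
    """Return 1 for values in the set `ones` and 0 for everything else."""
    return [1 if v in ones else 0 for v in values]


# Hypothetical answers to q1i, one of each possible code.
q1i = [5, 4, 3, 2, 1, 6]
newq1i = recode(q1i, RECODE_Q1I)

# Hypothetical answers to q36 (education codes 1-5).
q36 = [1, 2, 3, 4, 5]
high_school = dummy(q36, {1, 2})  # 1 = less than high school or high school
college = dummy(q36, {3, 4})      # 1 = some college or college graduate

print("newq1i:", newq1i)
print("High School:", high_school)
print("College:", college)
```

Respondents with postgraduate work (code 5) receive 0 on both dummy variables, which makes them the reference category in a later regression.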

2. Computing Variables: This is another method of collapsing data, but it allows the computation of variables based on numeric transformations of other variables.
a. Choose Transform from the menu.
b. Click Compute.
c. Type a name for the target variable.
d. To build an expression, either use functions that are listed in the Function area or type directly in the Numeric Expression field. When you use functions from the Function list, you need to fill in the parameters indicated by the question marks.
e. Click OK and a new column will be created at the end of the data editor file with the new variable.

Example: Find the average response to the importance people attach to their money, credit cards and material possessions. (Hint: Use variables l, m and n from question 1.)
a. Choose Transform from the menu.
b. Click Compute.
c. Type in Target Variable as q1lmn.
d. Select Mean from Functions.
e. Select variables q1l, q1m and q1n.
f. Click OK. Check that a new variable is created at the end of the data editor file.
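The computed variable is simply a per-respondent average across the three questions. A minimal Python sketch of that calculation follows; the answers are hypothetical, the function name row_mean is made up, and averaging only the non-missing answers of each respondent is an assumption of this sketch.

```python
from statistics import mean


def row_mean(*values):
    """Average the non-missing values for one respondent; None if all are missing."""
    valid = [v for v in values if v is not None]
    return mean(valid) if valid else None


# Hypothetical answers of three respondents; None marks a missing answer.
q1l = [4, 2, None]
q1m = [5, 3, None]
q1n = [3, None, None]

# One new value per respondent, like the new q1lmn column in the data editor.
q1lmn = [row_mean(l, m, n) for l, m, n in zip(q1l, q1m, q1n)]

print("q1lmn:", q1lmn)
```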

V. Statistical Analysis
1. Correlation Analysis: There are two types of correlation:
   Spearman for ordinal data
   Pearson for metric data (interval and ratio data)
a. Choose Analyze from the menu.
b. Choose Correlate.
c. Choose Bivariate.
d. Select the required variables.
e. Choose Spearman or Pearson under Correlation Coefficients.
f. Select Two-tailed under Test of Significance.
g. Select Flag Significant Correlations.
h. Paste and Run.
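The difference between the two coefficients can be sketched in a few lines of Python: Pearson works on the values themselves, while Spearman applies the same formula to their ranks. The data below are hypothetical and the function names are made up for illustration; significance tests are not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical answers of five respondents to two questions.
q1c = [1, 2, 3, 4, 5]
q1d = [2, 1, 4, 3, 5]

r = pearson(q1c, q1d)
rho = spearman(q1c, q1d)
print("Pearson r:", round(r, 3), "Spearman rho:", round(rho, 3))
```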

Example: Find the level of correlation between variables q1(c, d and e).
a. Choose Analyze from the menu.
b. Choose Correlate.
c. Choose Bivariate.
d. Select question 1 (c, d and e).
e. Choose Pearson under Correlation Coefficients.
f. Select Two-tailed under Test of Significance.
g. Check Flag Significant Correlations.
h. Click OK.

2. Regression Analysis: In order to run a regression analysis in SPSS, you first have to come up with a regression model. A thorough regression analysis follows these steps:
   Identify the dependent variable.
   Choose the independent variables.
   Run correlations among variables.
   Run regression model(s).
   Interpret the output.
The regression model is run in SPSS as follows:
a. Choose Analyze from the menu.
b. Choose Regression.
c. Select Linear.

d. Select the dependent variable.
e. Select the independent variables.
f. Click OK.

Example: Analyze the impact of an individual's attitudes towards shopping on the level of satisfaction with his or her present financial situation. (Hint: According to the corresponding regression model, the dependent variable is q1i and the independent variables are q1c, q1d, q1f, q1g, q1n, q35 and q39. Note that it is assumed here that you have already run a correlation between these variables to see whether they are correlated.)

Before running the regression, run correlations among the independent variables to check for multicollinearity, and then dummy code the variables q35 and q39.
a. Choose Analyze from the menu.
b. Choose Regression.
c. Select Linear.
d. Select the dependent variable (q1i).
e. Select the independent variables q1c, q1d, q1f, q1g, q1n, q35 and q39.
f. Click OK.

How to interpret the results of the regression analysis:
   Check the R-squared (R2) value to see what proportion of variability in the dependent variable is explained by the independent variables.
   The regression model must be significant: the ANOVA F-test results must have p < .05.
   Examine the standardized betas and their significance levels. Check whether p < .05 or the absolute value of t is greater than 2.
   Use the relative size of the significant standardized betas to arrive at substantive conclusions.
   Examine the direction of the relationship.
   Interpret the coefficients of dummy variables.
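The quantities named in the interpretation steps can be illustrated with a deliberately simplified case. The Python sketch below fits a regression with a single independent variable, whereas the tutorial's model has several; the data are hypothetical and the function name simple_regression is made up for this example. It reports R-squared, the standardized beta and the t value of the slope.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares with one independent variable.
    Returns intercept, slope, R-squared, standardized beta and t value of the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = my - slope * mx

    # Sum of squared residuals: the variability the model leaves unexplained.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - sse / syy

    # Standard error of the slope, using n - 2 degrees of freedom.
    std_error = math.sqrt(sse / (n - 2) / sxx)

    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": r_squared,
        "standardized_beta": slope * math.sqrt(sxx / syy),
        "t": slope / std_error,
    }


# Hypothetical data: one independent variable (x) and the dependent variable (y).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

result = simple_regression(x, y)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With one predictor the standardized beta equals the Pearson correlation between the two variables, and a positive sign indicates that the dependent variable rises as the independent variable rises.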
