
Chapter 1 Fundamental concepts SPSS - Descriptive statistics

Before starting any advanced analysis, it is a good habit to begin with some descriptive statistics and simple graphics to see what is going on in your data. Data file used: gss.sav

How to get there: Analyze > Descriptive Statistics > Frequencies

This menu selection opens the following Frequencies dialog box:

As you can see, the variable labels are difficult to read. To make the lists easier to read, we'll display variable names instead of labels in dialog boxes. Choose Edit > Options and, in the Options dialog box, click the General tab. In the Variable Lists group box (top right), select Display names and click OK. This change doesn't take effect until the next time you open a data file, so close the data file and reopen it. Return to the Frequencies dialog box. Now you'll see the following Frequencies dialog box:

Choose the variable(s) for which you need descriptive statistics by selecting them and clicking the arrow. They appear in the Variable(s): box. Display frequency tables is selected by default. A frequency table shows the absolute and relative frequencies, as well as the percentage and cumulative percentage of valid cases (cases without missing values). The cumulative percentage is the share of valid cases with a value less than or equal to the value in question.

Button Statistics

Many descriptive statistics can be selected here. The most important are the Mean, Median and Mode, and Std. deviation, Range, Minimum and Maximum. See following figure.
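For readers who want to check these statistics outside SPSS, here is a minimal sketch using Python's standard library. The list `ages` is invented for illustration and is not taken from gss.sav. The sample (n - 1) standard deviation is used here on the assumption that it matches what SPSS reports as Std. deviation.

```python
import statistics

# Hypothetical ages, invented for illustration (not taken from gss.sav).
ages = [23, 28, 28, 35, 41, 43, 52, 60]

mean = statistics.mean(ages)        # arithmetic mean
median = statistics.median(ages)    # middle value (average of the two middle ones here)
mode = statistics.mode(ages)        # most frequent value
std_dev = statistics.stdev(ages)    # sample standard deviation (divides by n - 1)
minimum, maximum = min(ages), max(ages)
value_range = maximum - minimum     # Range = Maximum - Minimum

print(f"Mean    {mean:.2f}")
print(f"Median  {median:.2f}")
print(f"Mode    {mode}")
print(f"Std.dev {std_dev:.2f}")
print(f"Range   {value_range} (min {minimum}, max {maximum})")
```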

Button Charts

Some simple charts can be obtained, such as bar charts, pie charts and histograms. A histogram is a graphical display of counts for ranges of data values. In histograms, you can choose to display the normal curve as well. See following figure.
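The idea of "counts for ranges of data values" can be sketched in a few lines of Python. The data and the bin width of 10 are assumptions chosen for illustration. SPSS picks its own bin boundaries, so this shows the principle, not the exact chart.

```python
from collections import Counter

# Hypothetical ages, invented for illustration (not taken from gss.sav).
ages = [19, 23, 28, 28, 31, 35, 41, 43, 47, 52, 60, 74]
bin_width = 10  # assumed bin width

# Map each value to the lower edge of its bin, e.g. 28 -> 20 and 41 -> 40.
counts = Counter((age // bin_width) * bin_width for age in ages)


def histogram_lines(counts, bin_width):
    """Return one text line per bin: range, a bar of '#', and the count."""
    lines = []
    for lower in sorted(counts):
        upper = lower + bin_width - 1
        lines.append(f"{lower:>3}-{upper:<3} {'#' * counts[lower]} ({counts[lower]})")
    return lines


for line in histogram_lines(counts, bin_width):
    print(line)
```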

Once a chart appears in the output, it can be modified from the SPSS Viewer. A new window opens, the SPSS Chart Editor, in which you make changes by clicking on a part of the chart (e.g. axis, legend, title). In the following figure, the Category Axis window appears after clicking on the x-axis title Respondent's Sex.

Output of running frequencies


When you perform an analysis using Frequencies on the variable degree, without indicating any options, the results are the following:

Output 1

Frequencies

Statistics
RS Highest Degree
N    Valid      1496
     Missing       4

RS Highest Degree
                           Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Less than HS        279       18,6        18,6              18,6
          High school         780       52,0        52,1              70,8
          Junior college       90        6,0         6,0              76,8
          Bachelor            234       15,6        15,6              92,4
          Graduate            113        7,5         7,6             100,0
          Total              1496       99,7       100,0
Missing   Don't know            2         ,1
          No answer             2         ,1
          Total                 4         ,3
Total                        1500      100,0

In the Statistics table, the number of cases (N) is split into Valid and Missing cases. In the frequency table RS Highest Degree, the variable degree is split into the possible answers (Less than HS, High school, etc.). For each answer, the table shows the absolute frequency (Frequency) and the relative frequency (Percent), as well as the percentage and cumulative percentage of valid cases (Valid Percent and Cumulative Percent). Percent gives the relative frequencies including the missing cases, whereas Valid Percent gives the relative frequencies excluding the missing cases, so that the relative frequencies of the valid cases add up to 100 %.
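The three percentage columns can be recomputed from the counts in Output 1. The sketch below is an illustration in Python, not SPSS syntax. It divides by all 1500 cases for Percent and by the 1496 valid cases for Valid Percent and Cumulative Percent. Python prints decimal points where the SPSS output above uses decimal commas.

```python
# Counts copied from Output 1 (RS Highest Degree).
valid = {
    "Less than HS": 279,
    "High school": 780,
    "Junior college": 90,
    "Bachelor": 234,
    "Graduate": 113,
}
missing = {"Don't know": 2, "No answer": 2}

n_valid = sum(valid.values())                 # valid cases only
n_total = n_valid + sum(missing.values())     # valid + missing cases

rows = []
cumulative_count = 0
for category, count in valid.items():
    cumulative_count += count
    rows.append((
        category,
        count,
        100 * count / n_total,                # Percent (missing included)
        100 * count / n_valid,                # Valid Percent (missing excluded)
        100 * cumulative_count / n_valid,     # Cumulative Percent
    ))

for category, count, pct, valid_pct, cum_pct in rows:
    print(f"{category:<15}{count:>5}{pct:>8.1f}{valid_pct:>8.1f}{cum_pct:>8.1f}")
```

For High school, for example, this gives 780 / 1500 = 52.0 % under Percent but 780 / 1496 = 52.1 % under Valid Percent, which is the small difference visible in the table.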

Output 2

When you perform an analysis using Frequencies on the variable age, selecting the options mean, median and mode (button Statistics) and histogram with normal curve (button Charts), some of the results are the following (the table Age of Respondent is left out because it is very large):

Frequencies

Statistics
Age of Respondent
N    Valid      1495
     Missing       5
Mean           46,23
Median         43,00
Mode             28a

a. Multiple modes exist. The smallest value is shown
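Footnote a describes a tie-breaking rule: when several values share the highest frequency, only the smallest is reported. A minimal Python sketch of that rule, using invented data rather than the gss.sav ages:

```python
import statistics

# Hypothetical ages with two equally frequent values (not taken from gss.sav).
ages = [28, 28, 35, 35, 41, 52]

modes = statistics.multimode(ages)   # all values that share the highest count
reported_mode = min(modes)           # the rule from footnote a: show the smallest

print("All modes:", modes)
print("Reported mode:", reported_mode)
```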

[Figure: histogram of Age of Respondent with normal curve. X-axis: Age of Respondent (20,0 to 90,0); y-axis: Frequency. Std. Dev = 17,42; Mean = 46,2; N = 1495,00]

As usual, the numbers of valid and missing cases are shown in the Statistics table. The other descriptive statistics (Mean, Median and Mode) appear in the same table. The histogram of the variable age shows its distribution, with Age of Respondent on the x-axis and Frequency on the y-axis. The distribution is roughly bell-shaped but skewed to the right: the mean (46,23) is larger than the median (43,00), which points to a longer tail at the higher ages.
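One rough way to check the direction of skew from the reported summary values alone is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. This is a quick rule of thumb and not something SPSS prints in this output. The inputs below are the Mean, Median and Std. Dev values from Output 2.

```python
# Values reported in Output 2 for Age of Respondent.
mean = 46.23
median = 43.00
std_dev = 17.42

# Pearson's second skewness coefficient: positive means a longer right tail,
# negative means a longer left tail, and values near 0 suggest symmetry.
pearson_skew = 3 * (mean - median) / std_dev

direction = "right" if pearson_skew > 0 else "left" if pearson_skew < 0 else "none"
print(f"Pearson skewness: {pearson_skew:.3f} (skew direction: {direction})")
```

A positive coefficient here is consistent with the mean lying above the median.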
