Usability of PSPP 1

Usability of PSPP

Lim Zhen Ling Samantha

Nanyang Technological University


PSPP is a free, open-source alternative to SPSS, the commonly used proprietary
statistical package. SPSS's cost of ownership is high, so access is limited to the school's
computer laboratories. It can also be accessed from home via the NTU VPN; however, users of
browsers other than Internet Explorer are unable to do so. Therefore, I would like to
investigate whether I can easily conduct similar statistical tasks using PSPP, and thereby
evaluate my user experience of this program.

I decided to follow up on a tutorial on multiple regression. The three tasks I would
like to carry out are: generating a box plot, histogram and summary statistics for the four
variables in the tutorial dataset; generating a correlations table for the four variables; and
running a linear regression for a reduced model, i.e. M2, which contains only one of the
three independent variables (IVs).

I conducted the tasks using my Acer laptop in my room at home. I found the GNU
PSPP website using the Google search engine. Then I downloaded the software via the
website and installed it on my laptop. With a simple click of the icon, the program started up.

My first impression of the program was that it looked more accommodating to the
novice user than SPSS. There are large colourful icons for various functions such as OPEN,
SAVE, GOTO CASE and FIND. Despite the visibility of these functions, I was at a loss as to
how to plan my action sequence, as the program did not appeal to my prior knowledge. Apart
from OPEN and SAVE, the other icon functions were unfamiliar. A closer inspection revealed
a horizontal menu bar at the top, similar to that of SPSS.

I clicked on the OPEN icon; a dialog box asking for the FILE NAME appeared; I selected
the dataset with a click and selected OPEN; the system gave feedback by loading the dataset
immediately. To generate the histogram, box plot and summary statistics, I selected ANALYZE
from the menu bar; I selected DESCRIPTIVE STATISTICS; a drop-down menu appeared at
the side; I selected EXPLORE and a dialog box appeared. I had to classify the four variables
into the FACTOR LIST and the DEPENDENT LIST. As I moved the cursor over the variables, I
was pleased to notice that the short form of each variable popped up on top of the long
variable name. This is a convenient feature. However, I noticed that I was unable to highlight
and drop multiple variables at one time. After classification, I selected STATISTICS; no
options for the choice of summary statistics were available. I selected CONTINUE; I clicked
on OPTIONS in the dialog box, but there were still no options for a histogram or box plot. I
restarted by exploring the alternative, FREQUENCIES. I closed the current dialog box; I
selected ANALYZE; I clicked on DESCRIPTIVE STATISTICS and then selected FREQUENCIES.
A dialog box popped up. I selected the DV and clicked on the arrow button in the middle to
shift the DV to the VARIABLES list. I scrolled down and ticked the boxes for mean, standard
deviation, kurtosis and skewness. I explored the dialog box prompted by the FORMAT button.
Still, there were no options for a histogram or box plot. I clicked OK to generate the output.
Another dialog box appeared. I had successfully obtained the summary statistics but not the
histogram or box plot, which PSPP lacks the functions to produce. However, I was surprised
to see that instead of being drawn graphically, the tables were made up of # and =
characters, which made the data difficult to read (see Fig. 1.1).

My second task was to obtain correlations for the four variables in the dataset. First, I
selected ANALYZE but there was no CORRELATIONS option in the drop-down menu. I
had reached a gulf of execution. I decided to search for the option by trial and error. Under
ANALYZE, there were only three options, and only the first two had drop-down
menus. I decided to explore DESCRIPTIVE STATISTICS first, as I felt it was unlikely that the
other options contained the CORRELATIONS function. A CORRELATIONS option was found
under the STATISTICS button in the CROSSTABS dialog box. However, after generating
the output, I realized that the option in CROSSTABS was a false clue, as the data obtained
did not coincide with that from SPSS. Before I could declare this task a failure, I recalled
that in SPSS I am able to manually type in the syntax that generates the output. I attempted
this, as it was likely that the CORRELATIONS function was available but that poor mapping
had compromised its visibility, resulting in critical functions being 'hidden' from the user. In
the CROSSTABS dialog box, I clicked on PASTE. A dialog box for the syntax appeared. I
deleted its contents, typed in the appropriate syntax used in SPSS and selected the PLAY
icon to generate the output. The icon is a good metaphor borrowed from the familiar
video player. An ERROR MESSAGE dialog box popped up indicating that the PRINT function
(of the syntax) could not be recognized (see Fig. 1.2). The feedback was not meaningful but
fortunately the response was constrained by the sole option available: OK. I selected it and
the output was generated successfully.

Lastly, I was to run a linear regression analysis for a reduced model. To begin, I
selected ANALYZE and then LINEAR REGRESSION. In the dialog box, I classified
progress as the DV and logo as the IV. I noticed there were no options for the entry method
of the variables, e.g. FORWARD or BACKWARD. I assumed that the default entry method, if
unspecified, was ENTER. I clicked on the STATISTICS button in the dialog box; I selected
the boxes for ANOVA, R and COEFF and clicked CONTINUE. I selected OK to generate the
output. The output was successfully generated; however, I noticed that the results were
reported to only two decimal places, which is less precise. I tried right-clicking and clicking
on the numerical data for options on the number of decimal places or the original unrounded
values; no options were available. Again, poor mapping hindered access to a function.

Instructions

Summary statistics (PSPP cannot generate histogram and box plot)

Go to ANALYZE (menu bar at the top); select DESCRIPTIVE STATISTICS; select
FREQUENCIES. Select the variable you would like to generate the statistics for, then click on
the ARROW BUTTON in the middle to put the variable into the VARIABLES list. Scroll
down and tick the boxes of the summary statistics you would like to generate. Click OK to
generate the output.
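
The same summary statistics can also be requested through the syntax window. The following is a minimal sketch, assuming the short variable name Progress from the tutorial dataset; it has not been checked against every PSPP version, so the subcommands accepted may differ.

```spss
* Summary statistics for one variable (Progress is an assumed short name).
* NOTABLE suppresses the frequency table so that only the statistics are shown.
FREQUENCIES
  /VARIABLES=Progress
  /FORMAT=NOTABLE
  /STATISTICS=MEAN STDDEV KURTOSIS SKEWNESS.
```

To cover all four variables in one run, list their short names after /VARIABLES=, separated by spaces.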

Correlations (by syntax input)

Go to ANALYZE; select DESCRIPTIVE STATISTICS; select CROSSTABS. You may assign
the variables in any way you wish (it does not matter, as the pasted syntax will be replaced).
After putting the variables under ROWS and COLUMNS, you will notice that the PASTE
button is active; select it. A dialog box will pop up. Clear the dialog box, i.e. by selecting all
and pressing DELETE. Then type in the following syntax, using the short forms of the
variable names:
CORRELATIONS
  /VARIABLES=Progress Logo Vocal Syntax
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.

Click on the PLAY icon to generate output.

Linear regression for a reduced model

Go to ANALYZE; select LINEAR REGRESSION; select your DV and place it under
DEPENDENT VARIABLE; select the IVs in your linear regression model and place them under
INDEPENDENT VARIABLE. Note that by selecting all the IVs available, you would be
running an analysis on the full model instead, so do not do so. Click on the STATISTICS
button; ensure that COEFF, R and ANOVA are ticked; select CONTINUE. Select OK to
generate the output.
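
The reduced model can also be run from the syntax window. This is a minimal sketch, assuming Progress as the DV and Logo as the single IV (short names from the tutorial dataset). In PSPP's REGRESSION command, variables listed under VARIABLES but not under DEPENDENT are treated as predictors; support for other subcommands may depend on the installed version.

```spss
* Reduced model M2: Progress regressed on Logo only (assumed short names).
* COEFF, R and ANOVA correspond to the boxes ticked in the STATISTICS dialog.
REGRESSION
  /VARIABLES=Logo
  /DEPENDENT=Progress
  /STATISTICS=COEFF R ANOVA.
```

Adding the other IVs after /VARIABLES= would fit the full model instead, so list only the IV that belongs to the reduced model.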

Conclusion

PSPP is a useful statistical tool. As it is an open-source project in its developmental
stages, usage remains somewhat bumpy and frustrating. As my mental model for using this
program is derived from experience with SPSS, I used that as a guide to complete the tasks
in PSPP. Therefore, the software appeared easy to use and tasks could be completed
wherever PSPP is consistent with the SPSS design. However, I do not have a mental model
for PSPP's unique organization and features. Given some time to learn these discrepant
functions, the action sequences will become automatic, and the software will seem
significantly easier to use. Still, not all errors can be accounted for by unfamiliarity. The lack
of visibility of critical functions, for instance, means that the designer's model is
misrepresented in the system model, resulting in a flawed user mental model. The
inappropriate conceptual model causes action sequences to be planned incorrectly,
inevitably leaving tasks uncompleted.

Overall, five to six errors were made, not inclusive of the trial and error in task two. The
alternative software, SPSS, would be easier to use due to an existing mental model and
well-learned automatic action sequences. Since SPSS is the most widely used statistical tool
in the market, designers should assume that most users' mental models are derived from it
and thus design PSPP to be more consistent with it. This will save users time in learning the
interface and thereby enhance user experience. The existing drop-down menus may be short,
easing serial search, but visibility and accessibility are compromised. I suggest displaying
more options in the initial drop-down menus; this will provide clues for generating the
correct action sequences and minimize gulfs of execution. Other critical functions, such as
the generation of graphs and the choice of entry method in linear regression, should
be visibly included. The intuitiveness of the program can also be enhanced. For instance, the
user should be allowed to manipulate the data directly.

Annexe

Fig. 1.1

Fig. 1.2