You are on page 1of 7

The Stata Journal

Editor
H. Joseph Newton
Department of Statistics
Texas A&M University
College Station, Texas 77843
979-845-8817; fax 979-845-6077
jnewton@stata-journal.com

Editor
Nicholas J. Cox
Department of Geography
Durham University
South Road
Durham City DH1 3LE UK
n.j.cox@stata-journal.com

Associate Editors
Christopher F. Baum
Boston College

Jens Lauritsen
Odense University Hospital

Nathaniel Beck
New York University

Stanley Lemeshow
Ohio State University

Rino Bellocco
Karolinska Institutet, Sweden, and
University of Milano-Bicocca, Italy

J. Scott Long
Indiana University

Maarten L. Buis
T
ubingen University, Germany

Thomas Lumley
University of WashingtonSeattle

A. Colin Cameron
University of CaliforniaDavis

Roger Newson
Imperial College, London

Mario A. Cleves
Univ. of Arkansas for Medical Sciences

Austin Nichols
Urban Institute, Washington DC

William D. Dupont
Vanderbilt University

Marcello Pagano
Harvard School of Public Health

David Epstein
Columbia University

Sophia Rabe-Hesketh
University of CaliforniaBerkeley

Allan Gregory
Queens University

J. Patrick Royston
MRC Clinical Trials Unit, London

James Hardin
University of South Carolina

Philip Ryan
University of Adelaide

Ben Jann
ETH Z
urich, Switzerland

Mark E. Schaer
Heriot-Watt University, Edinburgh

Stephen Jenkins
University of Essex

Jeroen Weesie
Utrecht University

Ulrich Kohler
WZB, Berlin

Nicholas J. G. Winter
University of Virginia

Frauke Kreuter
University of MarylandCollege Park

Jerey Wooldridge
Michigan State University

Stata Press Editorial Manager


Stata Press Copy Editors

Lisa Gilmore
Jennifer Neve and Deirdre Patterson

The Stata Journal publishes reviewed papers together with shorter notes or comments,
regular columns, book reviews, and other material of interest to Stata users. Examples
of the types of papers include 1) expository papers that link the use of Stata commands
or programs to associated principles, such as those that will serve as tutorials for users
rst encountering a new eld of statistics or a major new technique; 2) papers that go
beyond the Stata manual in explaining key features or uses of Stata that are of interest
to intermediate or advanced users of Stata; 3) papers that discuss new commands or
Stata programs of interest either to a wide spectrum of users (e.g., in data management
or graphics) or to some large segment of Stata users (e.g., in survey statistics, survival
analysis, panel analysis, or limited dependent variable modeling); 4) papers analyzing
the statistical properties of new or existing estimators and tests in Stata; 5) papers
that could be of interest or usefulness to researchers, especially in elds that are of
practical importance but are not often included in texts or other journals, such as the
use of Stata in managing datasets, especially large datasets, with advice from hard-won
experience; and 6) papers of interest to those who teach, including Stata with topics
such as extended examples of techniques and interpretation of results, simulations of
statistical concepts, and overviews of subject areas.
For more information on the Stata Journal, including information for authors, see the
web page
http://www.stata-journal.com
The Stata Journal is indexed and abstracted in the following:
R
CompuMath Citation Index
RePEc: Research Papers in Economics
R
)
Science Citation Index Expanded (also known as SciSearch

Copyright Statement: The Stata Journal and the contents of the supporting files (programs, datasets, and
c by StataCorp LP. The contents of the supporting files (programs, datasets, and
help files) are copyright 
help files) may be copied or reproduced by any means whatsoever, in whole or in part, as long as any copy
or reproduction includes attribution to both (1) the author and (2) the Stata Journal.
The articles appearing in the Stata Journal may be copied or reproduced as printed copies, in whole or in part,
as long as any copy or reproduction includes attribution to both (1) the author and (2) the Stata Journal.
Written permission must be obtained from StataCorp if you wish to make electronic copies of the insertions.
This precludes placing electronic copies of the Stata Journal, in whole or in part, on publicly accessible web
sites, fileservers, or other locations where the copy may be accessed by anyone other than the subscriber.
Users of any of the software, ideas, data, or other materials published in the Stata Journal or the supporting
files understand that such use is made without warranty of any kind, by either the Stata Journal, the author,
or StataCorp. In particular, there is no warranty of fitness of purpose or merchantability, nor for special,
incidental, or consequential damages such as loss of profits. The purpose of the Stata Journal is to promote
free communication among Stata users.
The Stata Journal, electronic version (ISSN 1536-8734) is a publication of Stata Press. Stata and Mata are
registered trademarks of StataCorp LP.

The Stata Journal (2009)


9, Number 4, pp. 643647

Stata tip 81: A table of graphs


Maarten L. Buis
Department of Sociology
T
ubingen University
T
ubingen, Germany
maarten.buis@uni-tuebingen.de

Martin Weiss
Department of Economics
T
ubingen University
T
ubingen, Germany
martin.weiss@uni-tuebingen.de

A useful option within the graph command is the by() option ([G] by option). This
option allows us to explore graphs across one or more additional categorical variables.
This Stata tip will discuss how to ne-tune the display of such a graph when we want
to compare graphs across two variables. Consider, for example, the graph created with
the following set of commands and displayed in gure 1:
. sysuse auto
(1978 Automobile Data)
. fillin rep78 foreign
. twoway scatter price mpg, by(foreign rep78, cols(5) compact)
Domestic, 2

Domestic, 3

Domestic, 4

Domestic, 5

Foreign, 1

Foreign, 2

Foreign, 3

Foreign, 4

Foreign, 5

Price

5,000

10,000

15,000

5,000

10,000

15,000

Domestic, 1

10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40

Mileage (mpg)
Graphs by Car type and Repair Record 1978

Figure 1. Comparing the relationship between price and mileage across repair status
and car type (with a title for every subgraph)
There is a tabular look to this graph, with the rows representing the origin of the
cars and the columns representing their repair status. This design helps us identify
the relevant comparisons in a way similar to so-called trellis graphics (Cleveland 1993;
Becker, Cleveland, and Shyu 1996) and lattice graphics (Sarkar 2008).
The rst step in creating such a graph is to make sure that the columns and rows are
properly aligned. We did that by using the cols() suboption of the by() option to make
sure that there are as many columns as there are categories of rep78. We also used
c 2009 StataCorp LP


gr0042

644

Stata tip 81

the fillin command ([D] fillin) to make sure that there is at least one observation
for every combination of rep78 and foreign, whereby those combinations that did
not exist in the original data contain only missing valueshere foreign cars with a
repair status of poor (1) or fair (2). This has the eect of creating empty graphs for
those nonexisting combinations. An alternative would be to use the holes() suboption
within the by() option, which would leave the area otherwise occupied by a subgraph
completely blank. Within the table-like logic of this graph, we like to use the empty
graphs for combinations that in principle could occur but did not, while reserving those
completely blank spaces for combinations that for logical reasons cannot occur. In our
example, there is no reason why foreign cars could not have a poor or fair repair status,
so we used the empty graphs. If we had compared graphs across gender and pregnancy
status, we would instead have left a hole for the combination pregnant men.
There is still a problem with this graph: the subgraph labels contain superuous
information and distract from the tabular design. Ideally, we would like to have titles
at the top representing the columns and titles on the right representing the rows. To
get the column titles, all we would need to do is keep only the top titles and suppress all
the others. In the example below, this is done in the following way: the variables rep78
and foreign are combined into one variable with the egen function group() ([D] egen
and Cox [2007]). Then value labels that correspond to the column titles are assigned
to the rst ve categories (rep78 contains ve categories, so in this graph there will be
ve columns). The remaining values of this new variable are assigned the value label
" ". This will suppress the remaining titles. This example also adds titles to the column
axis and the row axis. The resulting graph is shown in gure 2.
. egen group = group(foreign rep78)
(5 missing values generated)
. label define group 1
>
2
>
3
>
4
>
5
>
6
>
7
>
8
>
9
>
10

"Poor"
"Fair"
"Average"
"Good"
"Excellent"
" "
" "
" "
" "
" "

. label value group group


. twoway scatter price mpg,
> by(group, cols(5)
>
r1title("Car type",
>
orientation(rvertical)
>
size(medsmall))
>
t1title("Repair status",
>
size(medsmall))
>
note("") compact)

M. L. Buis and M. Weiss

645
Repair status
Fair

Average

Good

Excellent

Price

5,000

10,000

15,000

Car type

5,000

10,000

15,000

Poor

10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40

Mileage (mpg)

Figure 2. Comparing the relationship between price and mileage across repair status
and car type (with column titles)
The row titles can now be added using the Graph Editor ([G] graph editor and
Mitchell [2008, chapter 2]). The Graph Editor can be started within the graph window
by going to the file menu and clicking on Start Graph Editor. Within the Graph
Editor, the window of immediate interest is the Object Browser on the right-hand side
of the Graph Editor. If we expand plotregion1 (see gure 3) by clicking on the +, we
can see that the characteristics of the individual subgraphs are identied by adding a
[1] for the rst (top left) subgraph, [2] for the second subgraph (to the right of the
rst subgraph), etc.

Figure 3. The object browser


In our example, we want to add a title to the right of the fth and tenth subgraphs.
A title to the right of a graph is an r1title, so within the Object Browser, we are
looking for the elements r1title[5] for the top row title and r1title[10] for the

646

Stata tip 81

bottom row title. This is specic to this example, because we wanted to create a graph
with ve columns and two rows. If, for example, we wanted to make a graph with
three rows and three columns, we would be looking for r1title[3], r1title[6], and
r1title[9], for the top, middle, and bottom row title, respectively.
When you double-click on r1title[5], you will get the dialog box shown in gure 4.
In this dialog box, we can type the title. The remaining options govern how this row
title will look. If we use the sj scheme ([G] schemes intro) for our graph, then the
options shown in gure 4 will make sure that the row titles will look the same as the
column titles. If you use another scheme, you can nd out how to set those options by
double-clicking on one of the column titles and looking in the resulting dialog box at
how the options are set. The title for the bottom row is created in the same way by
typing the title and setting the options for r1title[10]. The resulting graph is shown
in gure 5.

Figure 4. Properties specied in the Graph Editor for r1title[5]

M. L. Buis and M. Weiss

647
Repair status
Fair

Average

Good

Execellent

15,000

Poor

Price

5,000

Foreign

10,000

15,000

Car type

5,000

10,000

Domestic

10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40 10

20

30

40

Mileage (mpg)

Figure 5. Comparing the relationship between price and mileage across repair status
and car type (with column and row titles)

References
Becker, R. A., W. S. Cleveland, and M.-J. Shyu. 1996. The visual design and control
of trellis display. Journal of Computational and Graphical Statistics 5: 123155.
Cleveland, W. S. 1993. Visualizing Data. Summit, NJ: Hobart.
Cox, N. J. 2007. Stata tip 52: Generating composite categorical variables. Stata Journal
7: 582583.
Mitchell, M. N. 2008. A Visual Guide to Stata Graphics. 2nd ed. College Station, TX:
Stata Press.
Sarkar, D. 2008. Lattice: Multivariate Data Visualization with R. New York: Springer.

You might also like