POLITICAL METHODOLOGY
Edited by
JANET M. BOX-STEFFENSMEIER
HENRY E. BRADY
and
DAVID COLLIER
OXFORD UNIVERSITY PRESS
THE OXFORD HANDBOOKS OF POLITICAL SCIENCE
General Editor: Robert E. Goodin
POLITICAL THEORY
John S. Dryzek, Bonnie Honig & Anne Phillips
POLITICAL INSTITUTIONS
R. A. W. Rhodes, Sarah A. Binder & Bert A. Rockman
POLITICAL BEHAVIOR
Russell J. Dalton & Hans-Dieter Klingemann
COMPARATIVE POLITICS
Carles Boix & Susan C. Stokes
LAW & POLITICS
Keith E. Whittington, R. Daniel Kelemen & Gregory A. Caldeira
PUBLIC POLICY
Michael Moran, Martin Rein & Robert E. Goodin
POLITICAL ECONOMY
Barry R. Weingast & Donald A. Wittman
INTERNATIONAL RELATIONS
Christian Reus-Smit & Duncan Snidal
CONTEXTUAL POLITICAL ANALYSIS
Robert E. Goodin & Charles Tilly
POLITICAL METHODOLOGY
Janet M. Box-Steffensmeier, Henry E. Brady & David Collier
This series aspires to shape the discipline, not just to report on it. Like the Goodin-Klingemann New Handbook of Political Science upon which the series builds, each of these volumes will combine critical commentaries on where the field has been with positive suggestions as to where it ought to be heading.
CHAPTER 14
EXPERIMENTATION IN POLITICAL SCIENCE
REBECCA B. MORTON
KENNETH C. WILLIAMS
1 The Advent of Experimental Political Science
Experimentation is increasing dramatically in political science. Figure 14.1 shows the number of experimental articles published by decade in three major mainstream journals, the American Political Science Review (APSR), American Journal of Political Science (AJPS), and Journal of Politics (JOP), from 1950 to 2005.¹ These figures do not include the new use of so-called survey experiments and natural experiments. Moreover, other political science journals have published experimental articles, such as Economics and Politics, Political Behavior, Political Psychology, Public Choice, Public Opinion Quarterly, and the Journal of Conflict Resolution, as have numerous economics and social psychology journals. Furthermore, a number of political scientists have published experimental work in research monographs; see for example Ansolabehere and Iyengar (1997), Lupia and McCubbins (1998), and Morton and Williams (2001).
As discussed in Kinder and Palfrey (1993), there are a number of reasons for the increase in experimentation in political science in the 1970s and 1980s. We believe that
¹ Earlier figures are from McGraw and Hoekstra (1994); more recent ones were compiled by the authors. Note that our figures are significantly greater than those compiled by McDermott (2002).
2 What is an Experiment?
2.1 The Myth of the Ideal Experiment
Political science is a discipline that is defined by the substance of what we study: politics. As Beck (2000) remarks, researchers freely use whatever methodological
McDermott's calculations were limited to publications by certain authors according to undefined criteria ("established political scientists").
² A full and detailed treatment of experimentation in political science cannot be presented in a chapter-length study; we encourage readers to consult Morton and Williams (2008), who expand on the concepts and issues presented here.
solutions are available, drawing from other social sciences like economics, sociology, and psychology as well as statistics and applied mathematics to answer substantive questions. Unfortunately, this has led to misinterpretations as experimentalists learn about their method from other disciplines, and few nonexperimentalists have an understanding of advances that have occurred outside political science. The most prominent misconception is that the best experiment is one where a researcher manipulates one variable, called a treatment, while having an experimental group who receives the treatment and a control group who does not, and randomly assigning subjects across groups. This perception arises largely because most nonexperimentalists learn about experiments through the lens of behavioral social science methods formulated in the 1950s. Certainly such a design can be perfect for the case where a researcher is interested in the causal effect of a binary variable that might affect the choices of subjects in some known context, and the researcher is interested in nothing else. But this is rarely true for any significant question in twenty-first-century political science research. Furthermore, the aforementioned advances in technology have changed the nature of experimental research in fundamental ways that researchers writing in the 1950s could not have anticipated. By adhering to an outdated view of what an experiment should be and what is possible with experimentation, political scientists often rule out experimentation as a useful method for many interesting research questions.
There is no perfect or true experiment. The appropriate experimental design depends on the research question, just as is the case with observational data. In fact, the variety of possible experimental designs and treatments is in some ways greater than the range of possibilities with observational data. But how is experimental research different from research with observational data? That is, what is experimental political science?
³ This assumes that the researcher measures the data perfectly; clearly choices made in measurement can result in a type of post-DGP intervention.
3 Other Features of Experimentation
Beyond intervention, some would contend that control and random assignment are "must have" attributes of an experiment. Below we discuss these two additional features.
3.1 Control
In the out-of-date view of political experimentation, control refers to a baseline treatment that allows a researcher to gather data where he or she has not intervened. But many experiments do not have a clear baseline, and in some cases it is not necessary. For example, suppose a researcher is interested in evaluating how voters choose in a three-party election conducted by plurality rule as compared to how they would choose in an identical three-party election conducted via proportional representation. The researcher might conduct two laboratory elections where subjects' payments depend on the outcome of the election, but some subjects vote in a proportional representation election and others vote in a plurality rule election. The researcher can then compare voter behavior in the two treatments.
In this experiment and many like it, the aspect of control that is most important is not the fact that the researcher has a comparison, but that the researcher can control confounding variables such as voter preferences and candidate identities in order to make the comparison meaningful. To make the same causal inference solely with observational data the researcher would need to do two things: (1) rely on statistical methods to control for observable confounding variables, such as control functions in regression equations, propensity scores, or matching methods, and (2) make the untestable assumption that there are no unobservable variables that confound the causal relationship the researcher is attempting to measure.⁴ In observational data, things like candidate identities cannot be held constant and it may be impossible to match the observable variables. By being able to control these things, an experimentalist can ensure that observable variables are perfectly matched and that many unobservable variables are actually made observable (and made subject to the same matching).
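The pull of an unobservable confounder described above can be illustrated with a small simulation. This is only a sketch with invented numbers, not anything from the chapter: a hidden trait both pushes subjects into the "treatment" and raises the outcome, so a naive observational comparison is biased upward, while randomized assignment recovers the true effect.

```python
import random

# Sketch (all quantities invented): an unobservable trait drives both
# selection into treatment and the outcome itself, so the observational
# difference in means is biased, while the experimental one is not.
rng = random.Random(1)
n = 20_000
TRUE_EFFECT = 1.0

def outcome(treated, trait):
    # outcome = causal effect of treatment + effect of the hidden trait
    return TRUE_EFFECT * treated + 2.0 * trait + rng.gauss(0, 0.1)

traits = [rng.random() for _ in range(n)]

# Observational world: high-trait people select into treatment.
obs = [(t > 0.5, outcome(t > 0.5, t)) for t in traits]

# Experimental world: treatment assigned by coin flip, trait ignored.
exp = []
for t in traits:
    d = rng.random() < 0.5
    exp.append((d, outcome(d, t)))

def diff_in_means(data):
    treated = [y for d, y in data if d]
    control = [y for d, y in data if not d]
    return sum(treated) / len(treated) - sum(control) / len(control)

obs_gap = diff_in_means(obs)  # roughly TRUE_EFFECT plus selection bias
exp_gap = diff_in_means(exp)  # close to TRUE_EFFECT
```

Holding the trait fixed across conditions, as a laboratory design can, would remove the observational bias in the same way that randomization does here.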
4 The Validity of Experimental Political Science Research
4.1 Defining Validity
The most important question facing empirical researchers is what can we believe about what we learn from the data, or, how valid is the research? The most definitive recent study of validity is provided in the work of Shadish, Cook, and Campbell (2002) (hereafter SCC). They use the term validity to refer to the approximate truth of the inference or knowledge claim. As McGraw and Hoekstra (1994) remark, political scientists have adopted a simplistic view of validity based on the early division of Campbell (1957), using internal validity to refer to how robust experimental results are within the experimental data, and external validity to refer to how robust experimental results are outside the experiment.⁵ Again, we note that political scientists are adhering to an outdated view, for there are actually many ways to measure the validity of empirical research, both experimental and observational.
SCC divide validity into four types, each of which is explained below: statistical conclusion validity, internal validity, construct validity, and external validity. Statistical conclusion validity is defined as whether there is statistically significant covariance between the variables the researcher is interested in, and whether the relationship is sizeable. Internal validity is defined as the determination of whether the relationships the researcher finds within the particular data-set analyzed are causal relationships. It is important to note that this is a much narrower definition than generally perceived by many political scientists. In fact, Campbell (1986) relabeled internal validity as "local molar causal validity."

When some political scientists think of internal validity, particularly with respect to experiments that evaluate formal models or take place in the laboratory, they are referring to construct validity. Construct validity has to do with how valid the inferences of the data are for the theory (or constructs) that the researcher is evaluating. SCC emphasize issues of sampling accuracy: that is, is this an appropriate data-set for evaluating the theory? Is there a close match between what the theory is about and what is happening in the manipulated data-generating process? Thus, when political scientists refer to internal validity they are often referring to a package of three things: statistical conclusion validity, local molar causal validity, and construct validity.
⁵ For an example of how the outdated view of validity still dominates discussions of experimentation in political science see McDermott (2002, 35-8).
Political scientists typically assume that external validity means that the data-set used for the analysis must resemble some natural or unmanipulated DGP. Since experimentation means manipulation and control of the DGP by the researcher, and an experimental data-set is generated through that manipulation and control, some argue that the results cannot be externally valid by definition. Nevertheless, even in observational data, by choosing variables to measure and study and by focusing on particular aspects of the data, the researcher also engages in manipulation and control through statistical and measurement choices. The researcher who works with observational data simplifies and ignores factors or makes assumptions about those things he or she cannot measure. Thus, the observational data-set is also not natural, since it is similarly divorced from the natural DGP; by the same reasoning the results also cannot be externally valid.
It is a mistake to equate external validity with whether a given data-set used to establish a particular causal relationship resembles the unmanipulated DGP (which can never be accurately measured or observed). Instead, establishing whether a result is externally valid involves replication of the result across a variety of data-sets. For example, if a researcher discovered that more informed voters were more likely to turn out in congressional elections, and then found this also to be true in mayoral elections, in elections in Germany, and in an experiment with control and manipulation, then we would say the result showed high external validity. The same thing is true with results that originate from experimental analysis. If an experiment demonstrates a particular causal relationship, then whether that relationship is externally valid is determined by examining the relationship across a range of experimental and observational data-sets; it is not ascertained by seeing if the original experimental data-set resembles a hypothesized version of the unmanipulated DGP.
One of the most interesting developments in experimental research has come in laboratory experiments that test the behavioral assumptions in rational choice-based models in economics. In order to test these assumptions a researcher must be able to maintain significant control over the choices before an individual and be able to measure subtle differences in preference decisions as these choices are manipulated by the researcher. These things are extremely hard to discern in observational data, since it is almost impossible to measure isolated preference changes with enough variation in choices. The robustness of some of the behavioral violations has been established across a wide range of experimental data-sets. An entire new field within economics, behavioral game theory, has evolved as a consequence, with a whole new set of perspectives on understanding economic behavior, something that was almost unthinkable twenty years ago. It is odd that many in political science who disagree with assumptions made in rational choice models are also quick to dismiss the experiments which test these assumptions; these critics claim that these experiments do not have external validity, but external validity is really about the robustness of these experiments across different formulations, and not about whether the experiment resembles the hypothesized unmanipulated DGP.
5 Dimensions of Experimentation in Political Science
5.1 Location: Field, Lab, or Web?
A dimension that is especially salient among political scientists, particularly nonexperimentalists, is location. Laboratory experiments are experiments where the subjects are recruited to a common location, the experiment is largely conducted at that location, and the researcher controls almost all aspects of the environment in that location, except for subjects' behavior. Field experiments are experiments where the researcher's intervention takes place in an environment where the researcher has only limited control beyond the intervention conducted, and the relationship between the researcher and the subject is often mediated by variables such as postal delivery times or subjects' work schedules that lie outside the researcher's control. Furthermore, since many of the individuals in a field experiment who are affected by the researcher are not aware that they are being manipulated, field experiments also raise ethical issues not generally involved in laboratory experiments. For example, in Wantchekon's (2003) experiment in Benin, candidates were induced to vary their campaign messages in order to examine the effects of different messages on voter behavior. But voters were unaware that they were subjects in an experimental manipulation.
had found that the surveys showed significantly different attitudes towards politics over time. But at the same time, the survey questions had changed. Thus, to determine whether attitudes had truly changed, or whether the survey question format changes accounted for the results, Sullivan et al. surveyed the same set of voters but randomly determined which set of questions (pre-1964 or after) each respondent would be asked. The researchers found that the change in the format of the survey implied that the respondents had different political attitudes, and concluded that the empirical analysis comparing the two time periods was fundamentally flawed.
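The logic of the Sullivan et al. design can be sketched in a few lines of code. Everything here is invented for illustration (the respondent pool, the 60 and 40 percent agreement rates); the point is only that when formats are randomly assigned, a systematic gap in answers must be attributed to the format itself rather than to real attitude change.

```python
import random

def assign_formats(respondent_ids, rng):
    """Randomly give each respondent the pre-1964 or post-1964 format."""
    return {r: rng.choice(["pre-1964", "post-1964"]) for r in respondent_ids}

rng = random.Random(7)
ids = list(range(1000))
fmt = assign_formats(ids, rng)

# Invented responses in which the format itself shifts answers: the same
# electorate "agrees" 60% of the time under the old format but only 40%
# of the time under the new one, even though attitudes are unchanged.
agrees = {r: rng.random() < (0.6 if fmt[r] == "pre-1964" else 0.4)
          for r in ids}

def pct_agree(which):
    group = [r for r in ids if fmt[r] == which]
    return sum(agrees[r] for r in group) / len(group)

gap = pct_agree("pre-1964") - pct_agree("post-1964")
# Because assignment was random, a sizeable gap reflects the question
# format, not any real change in attitudes.
```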
The most exciting advances in survey experimental design can be found in the work of Paul Sniderman and colleagues with respect to computer-assisted telephone interviewing (CATI) (see Sniderman, Brody, and Tetlock 1991). Using computers, telephone interviewers can randomly assign questions to subjects, thereby manipulating the question environment. The National Science Foundation-funded program Time-Sharing Experiments in the Social Sciences (TESS) has facilitated the advancement of survey experiments in political science, providing researchers with a large, randomly selected, diverse subject population for such experiments (not to mention the infrastructure needed for conducting the experiments with this population).⁷ For a more comprehensive review of current field experiments, see the chapter by Gerber and Green in this volume.
Payoffs to each voter type from the election of candidates A, B, and C:

Type 1: A = 1.00, B = 0.75, C = 0.10
Type 2: A = 0.75, B = 1.00, C = 0.10
Type 3: A = 0.25, B = 0.25, C = 1.00
voters choose sincerely, A would win. As Myerson and Weber (1993) demonstrate formally, all three of these outcomes are equilibria in pure strategies.
Suppose now, however, that there is a majority requirement: that is, in order for a candidate to win, he or she must receive at least 50 percent of the vote. If no candidate receives more than 50 percent of the vote, a run-off election is held between the two top vote receivers. In such a case, even if all voters vote sincerely in the first-round election and candidate C wins that vote, C would have to face either A or B in a run-off (assuming that ties are broken by random draws). In a run-off, either A or B would beat C, so C cannot win. Of course, voters of Types 1 and 2 might engage in strategic voting in the first round (as discussed above), thereby avoiding the need for a run-off. Morton and Rietz (2006) formally demonstrate that when there are majority requirements and voters perceive that there is some probability, albeit perhaps small, that their votes can affect the electoral outcome, the only equilibrium is for voters to vote sincerely in the first stage and let the random draw for second place choose which candidate faces C in a run-off (and then wins the election).
In the end, this theory suggests that strategic voting (in our payoff matrix) is not likely when there are majority requirements, but is possible when these requirements do not exist. These predictions depend on assumptions about payoffs, voting rules, and the distribution of voter types, as well as voter rationality. An experimenter can create a situation that closely matches these theoretical assumptions about payoffs, voting rules, and the distribution of voter types, and then evaluate whether individuals act in accordance with theoretical predictions. The results from such an experiment tell the theorist about the plausibility of their assumptions about human behavior.
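The comparative statics just described can be checked in a toy simulation. The payoff matrix follows the three-type example above, but the voter counts (3, 3, and 4) are our own illustrative assumption, chosen so that sincere voting makes C the plurality winner.

```python
import random
from collections import Counter

# Toy check of the comparative statics above. The payoffs mirror the
# three-type example; the voter counts are an illustrative assumption.
PAYOFFS = {
    1: {"A": 1.00, "B": 0.75, "C": 0.10},  # Type 1: A > B > C
    2: {"A": 0.75, "B": 1.00, "C": 0.10},  # Type 2: B > A > C
    3: {"A": 0.25, "B": 0.25, "C": 1.00},  # Type 3: C > A, B
}
COUNTS = {1: 3, 2: 3, 3: 4}

def sincere_ballots():
    """Every voter votes for his or her highest-payoff candidate."""
    ballots = []
    for vtype, n in COUNTS.items():
        best = max(PAYOFFS[vtype], key=PAYOFFS[vtype].get)
        ballots += [best] * n
    return ballots

def plurality_winner(ballots, rng):
    totals = Counter(ballots)
    top = max(totals.values())
    return rng.choice(sorted(c for c in totals if totals[c] == top))

def majority_runoff_winner(ballots, rng):
    totals = Counter(ballots)
    leader = max(totals, key=totals.get)
    if 2 * totals[leader] > len(ballots):
        return leader  # outright majority: no run-off needed
    # run-off between the top two vote receivers, ties broken randomly
    ranked = sorted(totals, key=lambda c: (totals[c], rng.random()),
                    reverse=True)
    a, b = ranked[0], ranked[1]
    second_round = Counter()
    for vtype, n in COUNTS.items():
        second_round[a if PAYOFFS[vtype][a] >= PAYOFFS[vtype][b] else b] += n
    return max(second_round, key=second_round.get)

rng = random.Random(0)
sincere = sincere_ballots()
plain = plurality_winner(sincere, rng)         # C wins the plurality vote
runoff = majority_runoff_winner(sincere, rng)  # A or B beats C in the run-off
coordinated = plurality_winner(["A"] * 6 + ["C"] * 4, rng)  # Types 1-2 coordinate
```

Under sincere voting, C wins a plain plurality contest but loses once a majority requirement forces a run-off; strategic coordination by Types 1 and 2 on a common candidate defeats C only in the no-run-off case, matching the theory's predictions.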
opinion polls can serve as a coordinating device for voters of Types 1 and 2; that is, they argue that these types of voters would like to coordinate on a common candidate. One way such coordination may be facilitated is by focusing on whichever candidate is ahead in a public opinion poll or in campaign contribution funds prior to the election. However, there is no basis within the Myerson and Weber theory on which to argue that voters will use polls or contributions in this fashion. Researchers have used experiments to see if voters do use polls and campaign contributions to coordinate, and the results support Myerson and Weber's conjecture that polls and contributions can serve as a coordination device (Rietz 2003).
In contrast to the approach of relaxing theoretical assumptions or investigating conjectures in the context of a theoretically driven model as above, some prominent political scientists see searching for facts through experiments as a substitute for theorizing. For example, Gerber and Green (2002) advocate experiments as an opportunity to search for facts with little or no theory, remarking (2002, 820): "The beauty of experimentation is that one need not possess a complete theoretical model of the phenomenon under study. By assigning... at random, one guarantees that the only factors that could give rise to a correlation... occur by chance alone." Although they go on to recognize the value of having theoretical underpinnings more generally, they clearly see this as simply being more efficient; they do not view a theoretical model as a necessary component in the experimental search for facts.
choices: a choice the voter makes when fully informed and a choice the voter makes when uninformed. Of course, because only one state of the world is ever observed for any given voter, the researcher must hypothesize that these counterfactual worlds exist (even though she knows that they do not). Furthermore, the researcher must then make what is called the stable unit treatment value assumption (SUTVA), which demands no cross-effects of treatment from one subject onto another's choices, homogeneity of treatment across subjects, and a host of other implicit assumptions about the DGP that are often left unexplored (Morton and Williams 2008). Finally, the researcher must also typically make the untestable assumption that unobservable variables do not confound the effect that she is measuring; that there are no selection effects that interfere with the randomization she is using in the experimental design. In the end, searching for facts even in field experiments requires that a researcher theorize. It is not possible to search for facts and discover nuances of the DGP through trial and error without theorizing about that process.
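The counterfactual logic in this passage can be made concrete with a small potential-outcomes simulation. All quantities are invented for illustration; the point is that under random assignment (and SUTVA) the observed difference in means recovers the average of treatment effects that are never jointly observed for any single voter.

```python
import random

# Invented potential outcomes: each voter has a turnout propensity when
# uninformed and a hypothetically higher one when informed. Only one of
# the two is ever observed per voter, yet a randomized experiment
# recovers their average difference, assuming SUTVA holds.
rng = random.Random(42)
n = 10_000
y_uninformed = [rng.random() * 0.5 for _ in range(n)]
y_informed = [y + 0.2 for y in y_uninformed]  # assumed constant +0.2 effect

true_ate = sum(yi - yu for yi, yu in zip(y_informed, y_uninformed)) / n

# Randomize which potential outcome is observed for each voter.
treated = [rng.random() < 0.5 for _ in range(n)]
obs_treated = [y_informed[i] for i in range(n) if treated[i]]
obs_control = [y_uninformed[i] for i in range(n) if not treated[i]]
est_ate = (sum(obs_treated) / len(obs_treated)
           - sum(obs_control) / len(obs_control))
# est_ate is close to true_ate even though no individual voter's
# treatment effect is ever observed.
```

Letting one subject's treatment spill over onto another's outcome, a violation of SUTVA, would break the equality between the estimate and the true average effect.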
Which empirical study is correct? Frechette, Kagel, and Morelli use data from laboratory experiments where the underlying bargaining framework was controlled by the
Concluding Remarks
Political science is an experimental discipline. Nevertheless, we argue that most political scientists have a view of experimentation that fits with a 1950s understanding of research methodology, and one that especially misunderstands the issue of external validity. These false perceptions lead many political scientists to dismiss experimental work as not being relevant to substantive questions of interest. In this chapter we have provided an analysis of twenty-first-century experimentation in political science, and have demonstrated the wide variety of ways in which experimentation can be used to answer interesting research questions while speaking to theorists, empiricists, and policy-makers.
We expect the use of experimentation techniques to continue to rise in political science, just as it has in other social sciences. In particular, we expect that the technological revolution will continue to expand the opportunities for experimentation in political science in two important ways. First, the expansion of interactive Web-based experimental methods will allow researchers to conduct large-scale experiments testing game-theoretic models that have previously only been contemplated. For example, through the internet, it is possible for researchers to consider how voting rules might affect outcomes in much larger electorates than has been possible in the laboratory. Second, advances in brain-imaging technology will allow political scientists to explore much more deeply the connections between cognitive processes and political decisions. These two types of experiments will join with traditional laboratory and field experiments to transform political science into a discipline where experimental work will one day be as prevalent as traditional observational analyses.
References

Ansolabehere, S., and Iyengar, S. 1997. Going Negative: How Political Advertisements Shrink and Polarize the Electorate. New York: Free Press.
------- Strauss, A., Snyder, J., and Ting, M. 2005. Voting weights and formateur advantages in the formation of coalition governments. American Journal of Political Science, 49: 550-63.
Baron, D., and Ferejohn, J. 1989. Bargaining in legislatures. American Political Science Review, 83: 1181-206.
Campbell, D. T. 1986. Science's social system of validity-enhancing collective belief change and the problems of the social sciences. Ch. 19 in Selected Papers, ed. E. S. Overman. Chicago: University of Chicago Press.
Davis, D., and Holt, C. 1993. Experimental Economics. Princeton, NJ: Princeton University Press.
Dickson, E., and Scheve, K. 2006. Testing the effect of social identity appeals in election campaigns: an fMRI study. Working paper, Yale University.
Fiorina, M., and Plott, C. 1978. Committee decisions under majority rule. American Political Science Review, 72: 575-98.
Frechette, G., Kagel, J., and Morelli, M. 2005. Behavioral identification in coalitional bargaining: an experimental analysis of demand bargaining and alternating offers. Econometrica, 73: 1893-937.
Gerber, A., and Green, D. 2000. The effects of canvassing, direct mail, and telephone calls on voter turnout: a field experiment. American Political Science Review, 94: 653-63.
------- 2002. Reclaiming the experimental tradition in political science. Pp. 805-32 in Political Science: State of the Discipline, ed. I. Katznelson and H. V. Milner. New York: W. W. Norton.
------- 2004. Get out the Vote: How to Increase Voter Turnout. Washington, DC: Brookings.
Gosnell, H. 1927. Getting out the Vote: An Experiment in the Stimulation of Voting. Chicago: University of Chicago Press.
Holland, P. W. 1988. Causal inference, path analysis, and recursive structural equation models (with discussion). Pp. 449-93 in Sociological Methodology, ed. C. C. Clogg. Washington, DC: American Sociological Association.
Kinder, D., and Palfrey, T. R. 1993. On behalf of an experimental political science. Pp. 1-42 in Experimental Foundations of Political Science, ed. D. Kinder and T. Palfrey. Ann Arbor: University of Michigan Press.
LaLonde, R. 1986. Evaluating the econometric evaluations of training programs. American Economic Review, 76: 604-20.
Lau, R. R., and Redlawsk, D. P. 2001. Advantages and disadvantages of cognitive heuristics in political decision making. American Journal of Political Science, 45: 951-71.
Lupia, A., and McCubbins, M. 1998. The Democratic Dilemma: Can Citizens Learn What They Need to Know? Cambridge: Cambridge University Press.
McDermott, R. 2002. Experimental methods in political science. Annual Review of Political Science, 5: 31-61.
McGraw, K., and Hoekstra, V. 1994. Experimentation in political science: historical trends and future directions. Pp. 3-30 in Research in Micropolitics, vol. iv, ed. M. Delli Carpini, L. Huddy, and R. Y. Shapiro. Greenwood, Conn.: JAI Press.
Morelli, M. 1999. Demand competition and policy compromise in legislative bargaining. American Political Science Review, 93: 809-20.
Morton, R. 1999. Methods and Models: A Guide to the Empirical Analysis of Formal Models in Political Science. Cambridge: Cambridge University Press.
------- and Rietz, T. 2006. Majority requirements and strategic coordination. Working paper, New York University.
------- and Williams, K. 2001. Learning by Voting: Sequential Choices in Presidential Primaries and Other Elections. Ann Arbor: University of Michigan Press.
------- ------- 2008. Experimental political science and the study of causality. Unpublished manuscript, New York University.
u tz,