How Much Disagreement is Good for Democratic Deliberation?

The CaliforniaSpeaks Health Care Reform Experiment 1


Kevin M. Esterling
Associate Professor
Department of Political Science
UC Riverside
kevin.esterling@ucr.edu

Archon Fung
Professor
JFK School of Government
Harvard University
archon_fung@harvard.edu

Taeku Lee
Professor
Department of Political Science
UC Berkeley
taekulee@berkeley.edu

March 26, 2010

1 We presented an earlier version of this paper at the Annual West Coast Experimental Political Science Conference, Del Mar, CA, May 15, 2009, and to a seminar at Claremont Graduate University on March 2, 2010. We thank the participants at both of these venues for valuable feedback. We thank AmericaSpeaks for providing the data.

Electronic copy available at: http://ssrn.com/abstract=1401151

Abstract

Are we the kind of creatures who are suited to govern ourselves through deliberation? We seek to answer one important component of this question: how do individuals respond to deliberation in groups with varying levels of disagreement? We use a natural experiment in which approximately 3000 individuals were divided into small groups of about 8-10 persons. These groups deliberated for one day about health care reform in California. We demonstrate that there is a non-monotonic effect of disagreement upon deliberative quality. Elements of deliberative quality include mutual respect, understanding, proffering of reasons and arguments, equal opportunity for discursive engagement, and neutrality. Deliberative quality is maximized at moderate levels of disagreement and is lower at high levels of ideological agreement or disagreement. Furthermore, individuals exhibit higher levels of persuasion in deliberative contexts of moderate disagreement. These findings support the view that many individuals have elements of a political psychology that is well suited for deliberation. They do not recoil when encountering disagreement, nor do they especially value deliberating with those who see the world in very similar ways. Instead, they regard as most successful deliberations with moderate levels of difference, perhaps those in which they acquire new information, perspectives, or reasons. Beyond our substantive finding, this paper offers a methodological template for experimental studies of deliberation.

Introduction

Are we the kind of political creatures who are suited to govern ourselves through deliberation? Philosophers and normative theorists have argued that we should govern ourselves through public deliberation. Deliberation underwrites the legitimacy of a democratic government. When laws and policies emerge from deliberations among citizens in which they offer each other reasons for those laws, aim toward consensus, and strive to minimize the extent of their disagreement (to seek an "economy of moral disagreement" in a principled way), those laws are the ones most likely to be acceptable to all citizens (Bohman, 1998; Cohen, 1989; Freeman, 2000; Gutmann and Thompson, 1996; Rawls, 1993). But how do individuals actually experience the demands of deliberation? If deliberation is like a healthy vegetable, high in fiber, vitamins, and minerals, democratically speaking, is it a food that people love to eat (Jacobs, Cook, and Carpini, 2009)? Conversely, are they allergic to it and so prone to react in unhealthy ways that outweigh deliberation's virtues (Hibbing and Theiss-Morse, 2002)? Or do most people suffer the demands of deliberation willingly but somewhat reluctantly, the way some people eat spinach?

An important part of the answer to this question hinges upon how individuals respond when they deliberate with others who disagree with them. Disagreement poses no particular challenge for models of democracy that aggregate citizens' preferences into a social choice (Downs, 1957; Elster, 1986). Disagreement is both a condition and a challenge, however, for deliberative democrats. Moral disagreement over issues of values, ideology, and policy is the problem that deliberation is intended to address (Thompson, 2008). Deliberative democrats advocate institutions of law and policy-making in which citizens and public leaders offer reasons to persuade one another and aim toward consensus (Cohen, 1989; Habermas, 1996; Knight and Johnson, 1994).
When, as will often be the case, they cannot reach consensus, they should aim to minimize the extent of their disagreement, seeking an "economy of moral disagreement" in a principled way (Gutmann and Thompson, 1996). Deliberation, therefore, requires individuals to listen to one another and to adjust their own views and positions in light of what they hear. Are we the kind of creatures who will respond positively by tolerating, listening, and adjusting in this way when we engage in deliberation with those who hold diverse views and who disagree with us? Or do we recoil from disagreement, entrenching our positions, resenting those who challenge them, perhaps losing confidence in our own positions, and suffering in silence?1

We investigate this question by testing empirically for the effect of disagreement on deliberative quality. We use data from the CaliforniaSpeaks statewide conversations on health care reform, held in August of 2007 in eight cities across California. At this event, nearly 3000 participants assembled to discuss health care issues and policy proposals. At each site, participants were assigned to small groups of about 8-10 persons and completed surveys in which they reported, among other things, their perceptions of the quality of the deliberative process at the event.

Identifying and estimating the effects of a deliberative exercise in practice is difficult. In general, empirical tests of deliberative quality are plagued by a ubiquitous form of self-selection termed homophily: the near-universal tendency of people to talk to others like themselves. We are interested, however, in how people respond to disagreement once it is encountered at different levels. Fortuitously, AmericaSpeaks' method of assigning participants to discussion tables effectively randomized participants to different levels of disagreement. Using estimates of the disagreement encountered at each table, we are able to show that participants' perception of the quality of the deliberative process, as well as the amount of within-table persuasion, was maximized at moderate or intermediate levels of disagreement.2
1 In her pathbreaking book Beyond Adversary Democracy, Jane Mansbridge (1980) argues that in contexts of deep disagreement, adversary modes of democratic decision-making are more appropriate than those, like deliberation, that aim toward consensus, in part because efforts to seek consensus can work to exclude and silence weaker parties.
2 The primary methodological challenge for this paper centers on our measure of the treatment itself, which in this case is measured with error. We propose and implement statistical methods for estimating treatment effects when the treatment itself is measured with error (Imai and Yamamoto, 2008). We use a Bayesian analytical method that accommodates both the uncertainty and potential biases that can arise with a stochastic treatment, and when either pretest or post-test data are missing.

E. Funglee: Disagreement & Deliberative Quality

Four Deliberative Psychologies

Some political psychologies are compatible with democratic deliberation and some are not. Consider four stylized types of psychological dispositions toward ideological disagreement and how they relate to the prospects for successful deliberation.

If individuals are disagreement-phobic (Type I), they seek out those who are similar to themselves and respond negatively when confronted with ideological difference. Difference-phobic individuals may be intolerant, resistant to understanding the arguments of others or to offering due regard for others' interests, or inclined to pursue strategies of strategic talk. To the extent that such individuals change their minds through deliberation, they might change most when with those who agree with them, confirming and exacerbating their prior views and resulting in a pattern of deliberative polarization (Sunstein, 2002). In Rawlsian terms, disagreement-phobic individuals are rational but not reasonable. The prospects for deliberative democracy are dim if disagreement-phobia is the dominant political psychology in a political culture or among humans generally.

Many theorists of deliberation have implicitly supposed a second type of deliberative psychology: individuals are reasonable and disagreement-tolerant (Type II). Rawls's account of liberalism, for example, requires individuals not just to be rational, in the sense that they want to achieve their individual ends, but also reasonable, in that they are willing to constrain the pursuit of their own ends according to principles and regulations that are fair. In the deliberative arena, such individuals must be willing to hear out those who disagree with them (though they do not relish disagreement) so that they can develop principles and agreements that are fair to all. Such individuals might be most persuaded in deliberative contexts of moderate disagreement because they hear and incorporate opposing viewpoints. However, rational but reasonable and tolerant individuals prefer deliberation in contexts of low disagreement because the requirements of reasonableness create less strain on the rational pursuit of individual ends. At the limit, where there is no ideological disagreement, being reasonable does not constrain rationality at all because everyone agrees. The reasonable and tolerant political psychology is the minimum threshold necessary for deliberation to be successful.

Disagreement-curious (Type III) individuals want to learn and challenge themselves by encountering other perspectives. The attitude here is not one of mere toleration, but of enthusiastic experimentation. Disagreement-curious individuals want to enter John Stuart Mill's (1859) wide-open marketplace of ideas enthusiastically, perhaps to test their own views or to try on those of others. One is unlikely to be challenged, pressed, or perhaps to learn very much in a deliberation (or, more accurately, a discussion) where everyone agrees; it's a boring experience. Disagreement-curiosity is substantially more favorable for democratic deliberation than a disposition that is merely reasonable and tolerant. With the latter type, deliberation requires concessions, albeit willingly made, while for the disagreement-curious type deliberation is an opportunity for exploration and development. At the same time, those who are disagreement-curious will be turned off by debates among loud-mouthed extremists who cling to their own views at all costs, with no interest in constructively engaging each other.

With the fourth and final type, disagreement-philia (Type IV), individuals favor deliberative environments with high levels of disagreement. They may be contrarians and debaters who relish performance or the challenge of persuading skeptical interlocutors. Disagreement-philia may be the most favorable type for deliberation in a plural society. However, we find little support in the literature on political or social psychology or deliberative democratic theory to suppose that this type is at all common.

In the experiment (described below), we asked individuals to evaluate the quality of their deliberative experience. If either disagreement-phobia (Type I) or reasonableness and toleration (Type II) is the prevalent deliberative psychological type, one would observe a monotonic pattern in which evaluations of deliberative quality decrease with the level of group ideological difference. If disagreement-philia (Type IV) is prevalent, we would observe the opposite monotonic pattern: evaluations of deliberative quality would increase with the level of ideological disagreement. What the data below show is a non-monotonic pattern that is consistent with the prevalence of the Type III, disagreement-curious deliberative psychology. Individual evaluations of deliberative quality are most favorable at moderate levels of ideological disagreement. Furthermore, individuals also exhibit evidence of greatest persuasion at moderate levels of ideological disagreement. We find, therefore, that the typical individual in these sessions exhibits a political psychology that is most conducive to democratic deliberation. That many of these citizen participants, deliberating in a real-world setting, appear to possess the political psychology most conducive to high-quality deliberation may surprise critics of deliberative democracy and deliberative theorists alike.

The CaliforniaSpeaks Natural Experiment

Our project envisions deliberative quality to be a cognitive outcome in response to the amount of discursive disagreement that one encounters; that is, the discussion itself should be causal. The important work of Diana Mutz (2004) has established that those who speak with others who are different from themselves (what she calls cross-cutting exposure) are more likely to know the arguments of others and likely to be more tolerant of difference. We build upon her work here with an experimental approach. The most pressing problem in testing this sort of deliberative expectation non-experimentally is self-selection. People generally select whom they talk to, and most people like to talk to others who are like themselves. This self-selection, where birds of a feather flock together, is commonly labeled homophily. Furthermore, those who engage with those who disagree may have a taste for disagreement and so seek it out. With either type of self-selection, the effects of the deliberation itself cannot be discerned from comparisons across deliberative sessions.

Differences in deliberative outcomes across the sessions might be spuriously related to the heterogeneity in personality types that also varies across these sessions. The danger here is that personality types drive both disagreement and deliberative quality, rather than the latter being caused by the former.3 We might especially expect a homophilic grouping of personality types to occur at a deliberative event that lacks assigned seating. At such a session, participants would enter a room sequentially and then sit down at a fixed number of discussion tables. Among the first to arrive, one can imagine ideological extremists grouping together at tables (conservatives with conservatives and liberals with liberals) and moderates grouping together (a mix of moderate liberals and moderate conservatives). As the tables fill up, the last to arrive will have less discretion over which table to choose, and hence may be forced to sit with others they normally would not choose, including extremists from the opposite end of the ideological spectrum. That is, among the late arrivers, some extreme liberals may be forced to sit with extreme conservatives simply because of space constraints, and vice versa. The ideologically homogeneous tables will experience the lowest disagreement, the moderates will experience moderate disagreement, and the late arrivers will experience the most disagreement. It is not difficult to imagine the quality of debate being driven by the distribution of personality types at each table, rather than by the discussion itself.

3.1 Natural Randomization in the Design of the CaliforniaSpeaks Event

We are able to overcome this problem of self-selection and homophily through a natural experiment that occurred in AmericaSpeaks' deliberative sessions (Fung and Lee, 2008).
3 This problem is easy to envision in the present context. One can imagine passive types choosing each other as discussion partners; they might have low disagreement with each other, the discussion will be flat, and so the session may be of low deliberative quality. Next, reasonable types might choose each other; they might have vigorous but moderate disagreement, and the session will be of high deliberative quality. Finally, irascible types might choose each other and talk past each other, and the session will be of low deliberative quality.

On August 11, 2007, over 3000 Californians gathered in eight locations throughout the state to discuss health care reform and to construct policy recommendations. AmericaSpeaks ran a total of eight events that day, in the cities of Eureka, Fresno, Los Angeles, Oakland, Riverside, Sacramento, San Diego, and San Luis Obispo. Participants spent the whole day at small group tables (typically 8 to 10 people per table) discussing health care issues and policies. The participants also filled out surveys prior to the session and then immediately afterward; the former survey is our source of pretreatment data and the latter is our source of post-treatment data.4

At the sessions, participants could not select their own tables. Instead, they were assigned to tables as they arrived at the event. In addition, AmericaSpeaks went to great lengths to ensure that participants went to and stayed at their assigned tables. Compliance with the table assignments worked especially well at four sites: Eureka, Riverside, Sacramento, and San Luis Obispo. At these sites, nearly all of the participants went to the table they were assigned, and they stayed at their assigned tables throughout the day.5 To avoid complications from non-compliance, we limit our analyses of table-level effects in this paper to these sites (1317 participants). Compliance with the table assignments is especially crucial in a deliberative experiment. If a participant fails to comply with her table assignment, this affects not only her own treatment received, but also the treatment received by the participants at the table she goes to, as well as by the participants at the table she does not go to.

While the assignment to tables was done with great care, the assignment process was not random. Instead, the assignment process was a complex function of the order in which participants arrived at the events. The assignment process was designed partly to separate participants who arrived at the same time, who might know each other, and partly to mix early arrivers with late arrivers.6

4 Participants noted their table number on both surveys. 223 of these participants (16.9 percent) failed to fill out a post-test survey. In the analyses below, we impute the missing post-test data from participants' pretest data as one would with ordinary survey research, and the model estimates are marginal to the missing data distributions (Tanner and Wong, 1987).
5 In Humboldt, none of 396 participants failed to comply with their table assignment; in San Luis Obispo, none of 264; in Sacramento, three of 407; and in Riverside, four of 250.

Although table assignments were not made randomly, it turns out that the level of disagreement that participants experienced varies across tables at the events in a way that is uncorrelated with the attributes or potential outcomes of participants. That is to say, the participants were assigned to disagreement levels (our treatment) in a manner that was indistinguishable from random assignment; in other words, the treatment was ignorable. We assert ignorability of the treatment both theoretically and empirically. Below we measure the level of disagreement at a table as a function of participants' ideological ideal points, where the greater the dispersion of the ideal points at a table, the greater the amount of disagreement. The appendix provides a detailed statistical assessment of the quality of the randomization, using formal distribution and covariate balance tests.

To prove ignorability for this natural experiment theoretically, one need only make one assumption:

Assumption 1 Ideology is uncorrelated with the participants' arrival order at the event.

Assumption 1 only requires that knowing a participant's ideology would tell us nothing about when the participant is likely to arrive at the event. That is, if liberals and conservatives are equally likely to be early or late, then assumption 1 is satisfied. To prove ignorability using assumption 1, consider the worst case for assignments: that AmericaSpeaks simply seated participants at tables sequentially according to their arrival order (that is, ignoring the actual mixing they did between late and early arrivers). The worst case would assume that people who love deliberating would show up early, and those who hate deliberating would show up late (or vice versa). By this assumed assignment procedure, people who love deliberation would be grouped at the same tables, and people who hate deliberation would be seated at other tables. In this case, the participants' potential outcomes, defined as both their baseline enjoyment of deliberation and how much increased satisfaction they receive from participating in the event, would be correlated with their arrival order. By assumption 1, however, ideology is unrelated to arrival order; liberals and conservatives are equally likely to arrive early or late. Because our measure of disagreement is a function of participants' ideological ideal points, this implies that disagreement levels at tables (i.e., the treatment) are uncorrelated with the arrival order, and hence assignment to disagreement levels is unrelated to participants' potential outcomes. That is, participants who love deliberation are just as likely to experience low, moderate, or high disagreement, and the same is true for those who hate deliberation. By assumption 1, then, participants' assignment to disagreement levels is ignorable, that is, essentially like a randomized experiment, and hence the causal effect of disagreement is identified in these data.

In this paper, we examine the effects of disagreement on two distinct types of deliberative outcomes. In the first model, we test whether the effect of (stochastically measured) disagreement is non-monotonic for how participants subjectively perceive the quality of the deliberative process and the quality of the policy recommendations decided on at each site. In the second model, we examine first whether within-table persuasion occurred, using spatial regression, and then whether the amount of persuasion is a non-monotonic function of disagreement. Testing for effects of disagreement on these two distinct outcomes, perceived deliberative quality and persuasion, lets us speak to the multifaceted nature of the concept of deliberation.

6 Specifically, the organizers at each site divided the tables into three groups. They assigned the first arriver to the first table of the first group, the second arriver to the second table of the first group, and so on. They repeated this process among the first group of tables until the tables in the first group were half-full, at which point they began assigning arrivers to the second group of tables. They repeated this process for the second and the third group of tables. When all of the tables were half-full, they used an identical assignment process to fill up the tables in the first group, then the tables in the second group, and then the third.
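The seating procedure that the organizers followed (tables split into three groups, each group filled to half capacity in turn, then each group filled completely) can be sketched in a few lines of code. This is only an illustration and not the organizers' actual tooling: the function names `seating_order` and `assign_tables`, the contiguous split of tables into groups, and the handling of odd table capacities are all assumptions made for the example.

```python
def seating_order(n_tables, seats_per_table, n_groups=3):
    """Return the table index given to each successive arrival.

    Hypothetical sketch: tables are split into contiguous groups, which is
    an assumption; the paper only states that there were three groups.
    """
    if n_tables < 1 or seats_per_table < 1:
        raise ValueError("need at least one table and one seat per table")

    tables = list(range(n_tables))
    size = -(-n_tables // n_groups)  # ceiling division
    groups = [tables[i:i + size] for i in range(0, n_tables, size)]

    first_half = seats_per_table // 2
    # Pass 1 fills every table to half capacity, pass 2 fills the rest.
    passes = [first_half, seats_per_table - first_half]

    order = []
    for seats_this_pass in passes:
        for group in groups:
            # Cycle through the tables of one group, one seat at a time, so
            # consecutive arrivers are separated whenever the group has two
            # or more tables.
            for _ in range(seats_this_pass):
                order.extend(group)
    return order


def assign_tables(n_arrivals, n_tables, seats_per_table, n_groups=3):
    """Map arrival positions 0..n_arrivals-1 to table indices."""
    order = seating_order(n_tables, seats_per_table, n_groups)
    if n_arrivals > len(order):
        raise ValueError("more arrivals than seats")
    return order[:n_arrivals]
```

With six tables of four seats each, for example, `assign_tables(5, 6, 4)` sends the first five arrivers to tables 0, 1, 0, 1, and 2, so people who arrive together are split up and each table ends up mixing early and late arrivers.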

Disagreement and Deliberative Quality

We posit that disagreement is causally related to the quality of deliberation. In the first model, we examine the effect of table-level disagreement on participants' subjective assessments of deliberative quality. These subjective assessments fall into two categories: those regarding the quality of the deliberative process, and those regarding the quality of the policy recommendations decided on at each site.

We cannot observe participants' true perceptions of deliberative quality, since quality as a concept is both multifaceted and ambiguous. Instead, we collect data on a wide array of subjects' perceptions of indicators of deliberative quality. These indicators were developed from theoretical accounts of deliberative democracy. We use these indicators to construct latent variable scales that measure participants' perception of deliberative quality in several ways. We include these scales as latent variable outcomes in a structural equation model (SEM), where the latent variables, structural effect parameters, and missing data parameters are all estimated simultaneously.

The post-test survey contained a block of items that measure participants' perception of the deliberative quality of the events, all with the following question stem: "We would like to begin by asking you how much you agree or disagree with the following statements about today's town meeting. (For each statement, please check only one option)." The response categories were Strongly Agree, Agree, Neither, Disagree, Strongly Disagree, and Don't Know (where the last category is treated as missing, and hence imputed as if the participant had been forced to answer; see Mondak, 2001). This block of items contained eight indicators for the quality of the deliberative process. These are:

People at this meeting listened to one another respectfully and courteously [Mutual respect]
Other participants seemed to hear and understand my views [Understanding]
Even when I disagreed, most people made reasonable points and tried to make serious arguments [Reason giving]
I am more informed about the challenges and options for health care reform in California [Informative session]

The meeting today was fair and unbiased. No particular view was favored. [Agenda neutrality]
Everyone had a real opportunity to speak today. No one was shut out and no one dominated discussions [Inclusion and equal opportunity for participation]
Participating today was part of my civic duty as a Californian to speak out and be heard on this issue [Civic duty]
I had fun today. Politics should be like this more often

This block also contained four items useful for measuring participants' subjective assessment of the quality of the policy recommendations adopted at their sites:

I personally agree with the voting results at the conclusion of today's meeting [Acceptance]
Decision makers should incorporate the conclusions of this town meeting into California's health care policy [Legitimacy]
I personally changed my views on health care reform as a result of what I learned today [Openness]
I would participate in an event like this one again

Figure 1 diagrams the statistical model, which is a full SEM. In the figure, variables in circles are latent variables (so the value for each participant is a distribution, not a constant), variables in rectangles are measured in a survey (pretreatment variables are shaded), and the arrows indicate variables that are assigned to equations. For each indicator equation, we use an ordered logit link function estimating m - 1 thresholds, where m is the number of response categories, and a factor coefficient.

[Figure 1: Deliberative Quality Statistical Model. Path diagram. Table-level functions: Disagreement = SD(Ideal Point_j) and Disagreement^2 = Var(Ideal Point_j), for j ∈ T_i, j ≠ i. Latent variables: Ideal Point_i, Process Quality_i, Policy Quality_i. Pretreatment variables: Party ID, Ideology Self Report, Gone Too Far, Limit Government, Seen Sicko, Pretest Response Missing, Site Fixed Effects. Post-treatment indicators of Process Quality: More Informed, People Respectful, Others Heard, Unbiased Meeting, Reasonable Points, Opportunity to Speak, Civic Duty, Had Fun. Post-treatment indicators of Policy Quality: View Changed, Voting Results, Leaders Incorporate, Participate Again.]

We regress each deliberative quality scale on a table-level disagreement measure and its square, with the first and second order terms testing for non-monotonic effects. As we describe next, the model takes the disagreement variable as a distribution to be estimated for each participant, and this correctly accounts for the stochastic nature inherent in our measure of disagreement. In this model, if the coefficients for the first order terms for disagreement are positive and significant, while the coefficients on the second order terms are negative and significant, then we can conclude that deliberative quality is a concave (or inverse-U) function of disagreement.
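To make the structure of this model concrete, one possible way to write it down is given below. The notation is introduced here purely for illustration and is an assumption rather than the specification actually estimated: $y_{ik}$ is participant $i$'s response to indicator $k$, $\tau_{kc}$ are the $m-1$ thresholds, $\lambda_k$ is the factor coefficient, $\eta_i$ is a latent quality scale, $D_i$ is the disagreement at $i$'s table, and $\epsilon_i$ is a residual. Any additional covariates in the structural equation are omitted.

```latex
\begin{align}
\Pr(y_{ik} \le c) &= \operatorname{logit}^{-1}\!\left(\tau_{kc} - \lambda_k \eta_i\right),
  \qquad c = 1, \dots, m-1, \\
\eta_i &= \beta_1 D_i + \beta_2 D_i^2 + \epsilon_i, \\
D^{*} &= -\frac{\beta_1}{2\beta_2}
  \qquad \text{(peak of the curve when } \beta_1 > 0,\ \beta_2 < 0\text{)}.
\end{align}
```

Under this sketch, a positive $\beta_1$ combined with a negative $\beta_2$ places the maximum of expected quality at the positive disagreement level $D^{*}$, which corresponds to the inverse-U pattern described in the text.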

4.1 Estimating the Treatment Received: Disagreement Levels

We measure table-level disagreement for subject i by the standard deviation of the estimated ideal points of the other subjects seated at i's table (i.e., excluding i's own ideal point). By this measure, the wider the dispersion of ideal points, the more disagreement there is, or is likely to be, at the table.

We use the following items to construct an ideology scale, where high values indicate a conservative ideal point and low values indicate a liberal ideal point.

We have gone too far in pushing equal rights in this country (five point agree/disagree scale)
Liberal/Moderate/Conservative self identification (three point scale)
Political Party self identification (Democrat, Independent, Republican)
Have you seen Sicko (yes/no)
Limit government's role to providing coverage for the unemployed (five point agree/disagree scale)

These items load on a single factor (results not reported), and are a mixture of standard ideology indicators (the first three items) and health-specific ideology indicators (the last two items). We use this battery of items to measure health-care-specific ideological ideal points, using a Bayesian latent variable ideal point estimator (Bafumi, Gelman, Park, and Kaplan, 2005; Clinton, Jackman, and Rivers, 2004; Ho and Quinn, 2008; Jackman, 2000; Martin and Quinn, 2002; Treier and Jackman, 2008). The statistical model treats each participant's ideal point as a parameter to be estimated, and since the model is fully Bayesian, the ideal points are estimated as distributions, not as simple point estimates or point estimates with standard errors. The model also regresses the participants' ideal point distributions on an indicator for pretest missingness and dummies for each site (excluding Eureka as the baseline) as a way to impute the missing pretreatment values of the ideal point indicators (see the discussion regarding sensitivity analyses below).

We use these ideal point distributions to estimate table-level disagreement, which we take to be the table-level dispersion of the ideal points for everyone else at the table.7 Since the ideal points are estimated rather than observed, they have uncertainty; the ideal points by necessity are distributions, not constants. Since the table-level measures of disagreement are themselves functions of these distributions, they also are stochastic distributions with uncertainty. To identify the effect of a stochastic treatment, we simultaneously 1) measure ideal points as distributions with a latent variable model, 2) estimate table-level disagreement as a function of these distributions, and 3) propagate the uncertainty of table-level disagreement (the treatment received) through the statistical model using a structural equation model. To accomplish this, we regress the outcome variables in a full structural equation model using the table-level disagreement distribution for each participant and its square (i.e., first and second order terms) as explanatory variables. The full model is diagrammed in figure 1.

7 The programming for the table-level functions is based on Congdon's Bayesian spatial models (Congdon, 2003, chapter 7).
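The leave-one-out disagreement measure, and the way its uncertainty carries forward, can be illustrated with a short sketch. This is not the WinBUGS code used for the paper; the function names, the data layout (one dictionary of ideal points per posterior draw), and the use of the sample rather than the population standard deviation are assumptions made for illustration.

```python
from statistics import stdev


def table_disagreement(ideal_points, table_of):
    """Leave-one-out disagreement for one set of ideal points.

    ideal_points: dict mapping participant id -> ideal point
    table_of:     dict mapping participant id -> table id
    Returns a dict mapping participant id -> SD of the ideal points of the
    *other* participants at the same table.
    """
    members = {}
    for person, table in table_of.items():
        members.setdefault(table, []).append(person)

    result = {}
    for person, table in table_of.items():
        others = [ideal_points[p] for p in members[table] if p != person]
        # The sample SD needs at least two other participants (assumption:
        # smaller tables are reported as None rather than guessed at).
        result[person] = stdev(others) if len(others) >= 2 else None
    return result


def disagreement_draws(draws, table_of):
    """Apply the measure to every posterior draw of the ideal points.

    The result holds, for each participant, one disagreement value per
    draw, so the treatment is itself a distribution rather than a constant.
    """
    per_person = {person: [] for person in table_of}
    for ideal_points in draws:
        values = table_disagreement(ideal_points, table_of)
        for person, value in values.items():
            per_person[person].append(value)
    return per_person
```

Recomputing the measure draw by draw, rather than once from posterior means, is what keeps the uncertainty in the ideal points attached to the disagreement values that enter the outcome regression.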

4.2 SEM Results

In this section we walk through the results of the model diagrammed in figure 1. To recap, the model measures ideal points and the two deliberative quality scales, calculates table-level disagreement using the estimated ideal points, and then tests whether disagreement causally affects deliberative quality.8
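One way to summarize such a test from simulated posterior draws is sketched below. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `concavity_summary` is hypothetical, and reporting the share of draws with an inverse-U shape is one plausible Bayesian reading of "positive and significant" first order terms with "negative and significant" second order terms.

```python
def concavity_summary(b1_draws, b2_draws):
    """Summarize posterior draws of the linear (b1) and squared (b2) terms.

    Returns the share of draws with b1 > 0 and b2 < 0 (an inverse-U shape)
    and the turning points -b1 / (2 * b2) computed for those draws.
    """
    if len(b1_draws) != len(b2_draws) or not b1_draws:
        raise ValueError("need two non-empty sequences of equal length")

    peaks = [
        -b1 / (2.0 * b2)
        for b1, b2 in zip(b1_draws, b2_draws)
        if b1 > 0 and b2 < 0
    ]
    share_concave = len(peaks) / len(b1_draws)
    return share_concave, peaks
```

The turning points indicate where on the disagreement scale quality peaks; comparing them with the observed range of table-level disagreement shows whether the peak falls at a moderate level or outside the data.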

4.2.1 Ideal Point Estimates and Deliberative Quality Scales

The statistical model estimates three scales: one scale measuring participants' ideological ideal points, and a pair of scales measuring participants' subjective perceptions of the quality of deliberation at their event. To measure each scale, we use a set of items that are highly inter-correlated; none of the items taken singly measures the latent variable that we desire to measure, but taken together in a measurement model the items estimate a scale representing the latent variable of interest. The scales are very similar to the more familiar factor scores estimated from factor analysis. The difference, however, is that in the context of a full structural equation model, the estimator is able to account for the uncertainty in how the scales are estimated. In effect, the model estimates a full distribution for each scale, for each participant, and the latent variables are measured by these distributions rather than by point estimates.

Table 1 reports the results from the latent variable portion of the model. The individual items are described above. Notice that for each scale, the factor coefficients are large and statistically significant. These factor coefficients statistically relate each indicator to the estimated scale. When the factor coefficients are large and significant, the items can be said to be reliable indicators for the latent variable. The model estimates the correlation between the process quality and the policy quality scales. We find a very high correlation between the two scales, 0.75 (p < 0.001). This shows that when participants believe the process is of high quality, they are also likely to believe the policy is of high quality, and vice versa. Interestingly, participants' ideology is uncorrelated with their assessment of either (i) the deliberative quality or (ii) the quality of the policy conclusions. This result lends support to deliberative theorists' emphasis that legitimacy grows from the deliberative character of a decision-making process rather than from its substantive results.9

8 We estimate the SEM using MCMC Bayesian data analysis. To implement the model, we use WinBUGS (Spiegelhalter, Thomas, Best, and Gilks, 1996). We assume a standard normal prior for the ideal points and a standard multivariate normal prior for the two deliberative quality outcome scales (with the correlation parameter freely estimated). We assign diffuse priors to all factor coefficients and structural parameters. All missing data are imputed dynamically using the model estimates (Tanner and Wong, 1987). We run the model until all parameters have converged using the Gelman and Rubin (1992) diagnostic, which happens very quickly (fewer than 1,000 draws are needed to achieve stationarity). We then run the chains an additional 10,000 iterations, thinning at regular intervals to yield a simulated marginal posterior distribution of the parameters with 1002 draws.
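The convergence check mentioned in footnote 8 can be sketched as follows. This is a simplified version of the Gelman and Rubin potential scale reduction factor for a single scalar parameter, without the degrees-of-freedom correction; the function name and the simulated chains are hypothetical and stand in for real sampler output.

```python
import random
import statistics


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])
    # Average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # Between-chain variance of the chain means, scaled by chain length.
    between = n * statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Three chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Three chains stuck around different values (not converged).
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 3.0, 6.0)]
print(round(gelman_rubin(mixed), 3), round(gelman_rubin(stuck), 3))
```

Values close to one suggest the chains agree with each other; values well above one suggest they have not yet reached a common distribution.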

4.2.2 Measured Disagreement

The model dynamically measures disagreement as a table-level function of the ideal points of participants at each table. The mean level of disagreement across the tables is approximately 0.88 with a standard deviation of about 0.26. We note that the table-level functions are computed using the full distributions of the ideal points of participants, not point estimates of those distributions (the latter would be the case if one were instead to use predicted factor scores as independent variables in the regression).

We retrieved the Bayesian posterior distributions of disagreement for each participant in the sample. We graph these posterior distributions for 100 randomly selected participants in figure 2. In the figure, known as a caterpillar plot, each vertical bar gives the 95 percent credible interval (technically, the 95 percent highest posterior density interval, which accommodates asymmetric posterior distributions); the overall mean is indicated with a horizontal dashed line. The table medians range from about 0.4 to about 1.3, a range that is almost exactly the mean plus and minus two standard deviations (0.88 ± 2 × 0.26, or about 0.36 to 1.40), which is just the amount of variation one would expect if the table assignments were random. There were no tables with complete agreement, which one might expect to see if seating were self-selected.

The medians of these distributions are connected with a red line. If one were to use point estimates of ideal points to construct the disagreement measure, one would retrieve a constant measure of disagreement for each participant, indicated by where the red line intersects each distribution. One cannot use these point estimates in this context, however, since they would mask the uncertainty with which disagreement is measured. Instead, the models below use the full distribution of each participant's ideal point to estimate disagreement, and so for each subject, disagreement enters each regression as a distribution instead of as a point estimate. In the structural equation model, these ideal point distributions are estimated at the same time as the structural parameters in the regressions, and hence the uncertainty in the ideal points is propagated through the statistical model (Tanner and Wong, 1987). This procedure correctly accounts for the uncertainty we have in measuring the treatment. Because table-level disagreement is ignorable, we are able to identify the causal treatment effect of disagreement even when the treatment itself is stochastic. Under the assumption that disagreement is randomly assigned (see above for the theoretical proof and the appendix for statistical tests), the statistical model can identify the causal effect of disagreement on deliberative quality.

9 Because of the way participants were assigned to tables, participants' ideology is uncorrelated with our main variable of interest, table-level disagreement. Indeed, the correlation between disagreement and ideological divergence from the median ideal point is -0.006 (p=0.85).

Table 1: Measurement Model Results

                                                   Mean     SE
Process Quality Factor Coefficients
  More Informed                                    1        (Fixed)
  People Respectful                                2.35     (0.20)
  Others Heard Views                               2.15     (0.15)
  Unbiased Meeting                                 1.39     (0.09)
  Reasonable Points                                1.72     (0.11)
  Opportunity to Speak                             2.06     (0.14)
  Civic Duty                                       1.29     (0.10)
  Had Fun Today                                    1.47     (0.11)
Policy Quality Factor Coefficients
  View Changed                                     1        (Fixed)
  Voting Results                                   2.01     (0.14)
  Leaders Incorporate                              2.31     (0.21)
  Participate Again                                1.76     (0.15)
Individual Ideology Factor Coefficients
  Party ID                                         2.01     (0.16)
  Ideology Self Report                             2.73     (0.25)
  Gone Too Far                                     1.67     (0.12)
  Limit Government                                 0.91     (0.08)
  Seen Sicko                                       1        (Fixed)
Additional Model Parameters
  Correlation betw. Process and Policy Factors     0.745    (0.027)
  Table Disagreement Mean                          0.88     (0.02)
  Table Disagreement Standard Deviation            0.26     (0.01)

Note: p ≤ 0.05.

[Figure 2: Stochastic Treatment: Disagreement Levels Across Tables. Table Disagreement Caterpillar Plot (100 Participants); horizontal axis: Table Rank; vertical axis: 95 Percent HPD Interval.]
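For readers unfamiliar with the highest posterior density interval used in the caterpillar plot, the following is a minimal sketch of how such an interval can be computed from a set of posterior draws: it is the shortest window that contains the requested share of the sorted draws. The function name and the simulated skewed draws are hypothetical; this is not the routine used to produce figure 2.

```python
import math
import random


def hpd_interval(draws, mass=0.95):
    """Shortest interval containing at least `mass` of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


# Hypothetical right-skewed posterior, where the HPD interval differs
# noticeably from an equal-tailed interval.
rng = random.Random(2)
skewed = [rng.expovariate(1.0) for _ in range(2000)]
low, high = hpd_interval(skewed)
print(round(low, 3), round(high, 3))
```

For a symmetric posterior the result is close to the usual equal-tailed interval; for a skewed one it shifts toward the region of highest density.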

4.2.3 The Causal Effects of Disagreement on Deliberation

The results for the causal effect of disagreement on deliberative quality are summarized in figures 3 and 4. Figure 3 presents the full posterior distributions of the structural parameters for the first and second order disagreement variables, for each of the two deliberative quality scale dependent variables, process quality and policy quality. Notice that, as expected, the first order coefficients are positive and statistically significant, while the second order coefficients are negative and significant. This implies that, for each scale, deliberative quality at each table is a concave function of disagreement.

[Figure 3: Deliberative Quality Parameter Densities. Panels: First Order Betas and Second Order Betas, each for Process Quality and Policy Quality; vertical axis: Density.]

Figure 4 gives a sense of the size of the effect. The black curve in the figure is the expected effect of disagreement on deliberative quality, that is, the effect based on model point estimates. The light blue curves in this figure represent 100 random draws of model parameters from the posterior distribution and help to give a sense of the uncertainty in the parameter point estimates. Recall that mean disagreement ranges from about 0.3 to 1.5, with an overall mean of 0.9. In each frame of this figure, notice that the expected deliberative quality at the two extremes of disagreement (low and high) is statistically different from the expected deliberative quality in the middle range. Deliberative quality thus appears to be maximized in the middle range. While deliberative quality is concave in disagreement, the concavity is not symmetric around the mean. Instead, tables with low disagreement perceive moderate deliberative quality; tables with moderate disagreement perceive high deliberative quality; and tables with high disagreement perceive low deliberative quality. This pattern is sensible and is consistent with our expectations for the non-monotonic effect of disagreement.

[Figure 4: Deliberative Quality Maximized at Moderate Levels of Disagreement. Panels: Process Quality (vertical axis: Debate/Process Quality Index) and Policy Quality (vertical axis: Policy Results Quality Index); horizontal axis: Table Ideological Disagreement.]
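The sign pattern described above can be made concrete with a short derivation. It assumes the structural part of the model is a simple quadratic in disagreement; the symbols Q_i (expected deliberative quality), D_i (table-level disagreement), and the betas are illustrative labels rather than the model's own notation.

```latex
\[
  \mathrm{E}[Q_i] = \beta_0 + \beta_1 D_i + \beta_2 D_i^2,
  \qquad \beta_1 > 0,\quad \beta_2 < 0
\]
\[
  \frac{\partial\, \mathrm{E}[Q_i]}{\partial D_i} = \beta_1 + 2\beta_2 D_i = 0
  \quad\Longrightarrow\quad
  D^{*} = -\frac{\beta_1}{2\beta_2} > 0,
  \qquad
  \frac{\partial^2\, \mathrm{E}[Q_i]}{\partial D_i^2} = 2\beta_2 < 0
\]
```

Under this assumption the curve has a single interior maximum at D*, and nothing forces D* to coincide with mean disagreement, which is one way the response can be concave without being symmetric around the mean.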

4.2.4 Sensitivity Analysis

We consider two sets of sensitivity analyses regarding the measurement of disagreement: the sensitivity of the model results to the table-level function we use to calculate disagreement, and to the distributional assumptions for the missing data.

First, we check to ensure that our model results do not depend on how we measure disagreement. In the main model, we measure disagreement for each participant as the standard deviation of the ideal points of everyone else at her table. As an alternative measure, we use the difference between the maximum and minimum ideal point (excluding the participant's ideal point) as the table-level disagreement function. The results we report do not change. The alternative disagreement measure has a higher variance and the structural effect parameters are correspondingly smaller, and as a result the effect of disagreement using this alternative measure is nearly identical to the effect using the standard deviation measure.

Second, consider the problem of missing pretreatment data. Of the 1317 participants in the study, 202 did not fill out a pre-test survey; this is 15.3 percent of the whole sample and 18.5 percent of those who did complete a post-test survey.10 Ordinarily, missing data pose no problem in a Bayesian model, since the missing data are simply taken as additional parameters to estimate, and the uncertainty over these estimates is also propagated through the other model parameters (Tanner and Wong, 1987). As we mention above, imputing the post-test survey responses using pretreatment survey responses is relatively straightforward. Unfortunately, we have no individual-level covariates with which to safely impute pretreatment responses, and of course, this matters very much for the estimate of the treatment, table-level disagreement, since table-level disagreement is now also a function of these missing data. While one might be willing to assume that those who failed to complete a pretest are on average like those who did complete the pretest, we cannot rule out the possibility that they differ. For example, what if everyone who failed to complete the pretest was conservative? Or liberal?

10 Participants who did not fill out either survey are dropped from the sample.

We use sensitivity analyses to investigate the possible consequences of missing pretreatment data (Imai and Yamamoto, 2008). In the sensitivity analysis, we can use a wide range of assumptions regarding the distributions of the missing data, and the range of estimates from these alternative models helps to identify a conceivable range for the treatment effect estimates. If the range is narrow, we can conclude the results are robust to the alternative assumptions.

We do the sensitivity analysis using three different distributional assumptions for the missing data. In the main analysis, we impute missing pretest data using the mean of the ideal points of the participant's site, and the variance of the full distribution of ideal points across all sites. In this analysis, the missing pretreatment data are draws from the site-level distribution of ideal points. We note here that on average, the participants at the Eureka site were slightly more conservative than those at the other three sites (p < 0.05). In this analysis, we assume respondents are as likely to be liberal or conservative as everyone else at the site, and our uncertainty in the imputation is governed by the full distribution of ideal points across all sites. In the second analysis, we assume that the missing pretreatment data are draws from a liberal distribution, with mean one standard deviation below the site-level mean and variance the same as above. The third analysis does the same thing but with an assumed conservative distribution.

When we ran these separate analyses, we found little to no change in the estimates of treatment effects (results not reported). The estimates of the main analysis are robust to these alternative assumptions. This is likely because the response rate on the pretreatment survey is very high, nearly 82 percent, and so there is simply not enough missing pretreatment data to matter for our analyses.
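The three distributional assumptions can be sketched as a single imputation routine with a shift parameter. This is a simplified, one-shot illustration: in the paper the imputation happens dynamically inside the Bayesian model, whereas here missing ideal points are filled once from an assumed normal distribution. The function name, the scenario labels, the example values, and the assumed conservative shift of one standard deviation above the site mean are hypothetical.

```python
import random
import statistics


def impute_missing(ideal_points, site_mean, overall_sd, shift_in_sd, rng):
    """Fill None entries with draws from an assumed normal distribution.

    shift_in_sd moves the center of the imputation distribution away from
    the site mean, in units of overall_sd.
    """
    center = site_mean + shift_in_sd * overall_sd
    return [rng.gauss(center, overall_sd) if x is None else x
            for x in ideal_points]


# Hypothetical scenario labels; lower ideal points are taken to be liberal.
SCENARIOS = {"site-level": 0.0, "liberal": -1.0, "conservative": 1.0}

# Made-up ideal points for one site, with None marking a missing pretest.
observed = [-0.8, None, 0.2, None, 1.1, 0.4, None, -0.3]
for name, shift in SCENARIOS.items():
    completed = impute_missing(observed, site_mean=0.1, overall_sd=1.0,
                               shift_in_sd=shift, rng=random.Random(3))
    print(name, round(statistics.stdev(completed), 2))
```

Re-estimating the treatment effect under each completed data set and comparing the estimates is the logic of the sensitivity check; a narrow spread across scenarios indicates robustness.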

5 The Effect of Disagreement on Persuasion

The second model tests for the effect of disagreement on table-level persuasion on the various policy proposals considered at the events. We test for persuasion in two steps. First, we use ordinary spatial regression methods to test whether there is a dependence of policy preferences among participants seated at the same table. This model tests whether there is a shift in participants' expected post-test policy preferences across the items, and whether this shift itself is a (positive or negative) function of the shifts of others seated at the table. After establishing the existence of persuasion, we then investigate whether the degree of persuasion is a non-monotonic function of disagreement.

5.1 Testing the Existence of Persuasion

The post-test survey has a block of items asking participants their policy preferences on a set of proposals. The block of questions is preceded with: "You have spent the day considering different ways to pay for and provide health care in California. In the earlier survey that you filled out we asked you about several different approaches. Now that the state-wide conversation is over we again want to know how strongly you agree or disagree with each of these approaches to changing our health care system. (For each approach, please check only one box.)" The response categories are the same five point scale plus "Don't Know" used for the deliberative quality scale items (where again, "Don't Know" is recoded as missing and its values imputed). The items are:

Q1: Expand coverage by working with employers to cover more working people and families
Q2: Fundamental change to insure all Californians through a state-administered system that all Californians and their employers pay into
Q3: Limit government's role to providing coverage for the low-income or unemployed, or those who can't get insurance on their own
Q4: All Californians should receive a health care voucher or tax credit, to be used to purchase their own coverage
Q5: Health insurance companies should be required to offer affordable coverage plans to everyone, regardless of their health condition

[Figure 5: Non-monotonic Persuasion Statistical Model. Path diagram with the following elements. Table-level functions: Disagreement = SD(Ideal Point_j) and Disagreement^2 = Var(Ideal Point_j), each computed over j ∈ T_i, j ≠ i. Outcomes: Post-Treatment Preference_i for (1) Employer Coverage, (2) State Admin. System, (3) Low Income Only, (4) Voucher or Tax Credit, (5) Insurance Co. Mandate. Explanatory nodes: Mean(Post-Treatment Preference_j) over j ∈ T_i, j ≠ i; Ideal Point_i; Pretest Response Missing_i; Site Fixed Effects_i. Ideal point indicators: Party ID_i, Ideology Self Report_i, Gone Too Far_i, Limit Government_i, Seen Sicko_i.]

Figure 5 diagrams the statistical model. In this first model, we test only the main effect of persuasion, and hence constrain the parameters associated with the dotted arrows of figure 5 to zero. In this case, the model is a standard nonlinear spatial regression model, as described in Congdon (2003, chapter 7). Post-treatment policy preferences for each individual are modeled as a function of her pretreatment ideological ideal point, as well as a function of the mean post-treatment preferences of others seated at her table. The key parameter here is ρ_k, estimated for each equation k = 1, ..., 5, which is the structural parameter capturing the effect of the others' preferences. If ρ_k is positive and significant, this indicates that if everyone else at the table has a large shift in their expected post-treatment preference, then person i can also be expected to have a shift in the same direction; conversely, if everyone else's preferences stay put, so do person i's. (Negative rhos are very unusual in this type of model.) This model is estimated in the same manner as the deliberative quality model of figure 1, with diffuse priors placed on the ρ_k parameters and all other estimation procedures the same.

Figure 6 shows the posterior distributions for each of the five rhos. Notice first that for each item, rho is positive, substantively quite large, and statistically significant. This indicates a very strong dependence of preference shifts within tables, and hence, given that the tables were randomly assigned,11 these results make a strong case for the existence of persuasion. In the figure, we have shaded two distributions with maroon; these items empirically load on the ideal point distribution and so are likely to be ideologically structured. The black-shaded distributions do not load on the ideology scale. These loadings are sensible, since the two maroon items involve new government programs, and the black items involve employer-centered reforms or vouchers. Notice, however, that the extent of persuasion is not dependent on the ideological structuring (or lack of structuring) of each item. In this case, persuasion occurs fairly equally across the five items.

11 Random assignment solves the basic problem of heterogeneity in testing causal effects in a spatial context; see Congdon (2003, chapter 7). In short, the problem of heterogeneity involves the existence of unobserved variables that are spatially distributed, and is exactly analogous to the problem of homophily we discuss above.
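The spatial term in the persuasion model, the mean post-treatment preference of everyone else at the table, can be sketched as follows. The linear predictor below is a deliberately simplified stand-in for the nonlinear model the paper estimates, and the function names, the coefficient names (`intercept`, `gamma`, `rho`), and the example values are all hypothetical.

```python
def leave_one_out_mean(values):
    """For each person, the mean of the values of everyone else at the table."""
    total = sum(values)
    return [(total - v) / (len(values) - 1) for v in values]


def linear_predictor(intercept, gamma, ideal_point, rho, others_mean):
    """Expected preference as a function of own ideology and tablemates' preferences."""
    return intercept + gamma * ideal_point + rho * others_mean


# Hypothetical table of four participants answering one policy item.
table_prefs = [2.0, 4.0, 3.0, 5.0]
ideal_points = [-1.0, 0.2, 0.0, 1.3]
others = leave_one_out_mean(table_prefs)
predicted = [linear_predictor(0.5, -0.3, x, 0.6, m)
             for x, m in zip(ideal_points, others)]
print(others)
```

A positive `rho` means a participant's expected preference moves in the same direction as the average preference of her tablemates, which is the dependence the rho parameters are meant to capture.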

[Figure 6: Within-Table Persuasion Monotonic Rho Densities. Parameter density of rho for each of Q1 through Q5, with rho on the horizontal axis from 0.0 to 1.0; maroon questions load on an ideological factor.]

5.2 Persuasion is Non-monotonic in Disagreement

Having established the existence of table-level persuasion, we now turn to the effect of disagreement on the degree of persuasion. If deliberative quality is a non-monotonic function of disagreement, then so too should be the extent of persuasion. To test for this, we re-ran the same model, but this time interacting the table-level mean preference shift with both the disagreement measure and the square of the disagreement measure. We use the model diagrammed in figure 5, but this time we estimate the parameters associated with the dotted arrows.12 The results are shown in figures 7 and 8.
[Figure 7: Non-monotonic Persuasion Rho Densities. Panels: First Order Rhos and Second Order Rhos for each of Q1 through Q5; vertical axis: Density.]

Notice that in figure 7, for each of the five items, the first order term is positive and significant, while the second order term is negative and significant. This indicates a concave function in disagreement. The estimated response functions are shown in figure 8. To construct these graphs, we take the first and second order rhos as estimated constants, set the table-level shift in preferences to one, and then vary disagreement from 0.3 to 1.5. Just as with the deliberative quality model, the response function in each case is concave. In addition, each is asymmetric: there is moderate persuasion at tables with low disagreement, high persuasion at moderate disagreement, and low persuasion at high disagreement. Again, this overall pattern is sensible, and strongly suggests that deliberative quality is non-monotonic in disagreement.

12 In the model, we only include the interaction terms, but not the main effect of either table-level disagreement or of table-level preference shifts. The models do not work when we include these. Thus, we are simply allowing the shape of the response function for preference shifts to bend as a function of disagreement.

[Figure 8: Persuasion is also Maximized at Moderate Disagreement. Panels: Employer Mandates, Low Income Insurance, Insurance Company Mandates, State Administered Programs, Voucher or Tax Credit; horizontal axis: Table Ideological Disagreement; vertical axis: Random Intercept.]
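The construction of these response functions can be sketched in a few lines: hold the first and second order rhos fixed, set the table-level preference shift to one, and evaluate the interaction terms over a grid of disagreement values from 0.3 to 1.5. The function name and the two coefficient values are hypothetical, chosen only so that the curve has the shape described above; they are not the paper's estimates.

```python
def persuasion_response(rho1, rho2, disagreement, table_shift=1.0):
    """Effect of a table-level preference shift, bent by disagreement."""
    return (rho1 * disagreement + rho2 * disagreement ** 2) * table_shift


# Made-up coefficients with the estimated sign pattern (positive, negative).
RHO_FIRST, RHO_SECOND = 3.0, -1.8

# Disagreement grid from 0.3 to 1.5 in steps of 0.1.
grid = [round(0.3 + 0.1 * k, 1) for k in range(13)]
curve = [persuasion_response(RHO_FIRST, RHO_SECOND, d) for d in grid]
peak = max(grid, key=lambda d: persuasion_response(RHO_FIRST, RHO_SECOND, d))
print(peak, round(curve[0], 3), round(curve[-1], 3))
```

With these illustrative values the response is moderate at the low end of the grid, highest in the middle, and lowest at the high end, mirroring the asymmetric concave pattern described in the text.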


Conclusion

Substantively, we show that both deliberative quality and the extent of persuasion appear to be non-monotonic functions of disagreement in democratic deliberations. This finding should be of interest to theorists of deliberative democracy as well as to empirical scholars who wish to test their work. Much prior research (for example, on group polarization, the psychology of confirmation, conflict aversion, and homophily) generally suggests that individuals are not well suited to cope with the disagreement that necessarily accompanies democratic deliberation. Our analysis of this natural experiment, however, shows that individuals are more likely to learn, to change their minds, to enjoy, and to regard as worthwhile deliberations in which there are moderate levels of disagreement. Or, in terms of our typology of the four political psychologies, the typical participant in these real-world deliberative sessions appears to be disagreement-curious.

In addition, we have advanced a number of methodological points regarding estimating treatment effects in this context. This rich dataset allows us to overcome many of the problems that one typically encounters in testing deliberative effects empirically. The random assignment solves the problem of homophily and other forms of self-selection; between-table deliberative effects are identified because table assignments were random. We have advanced a latent variable structural model that accommodates a stochastic (randomly assigned) treatment. We used data augmentation combined with sensitivity checks to handle missing pretreatment data. And we used ordinary data augmentation to handle missing post-treatment data. For these reasons, we feel that this study can usefully serve as a template for those interested in the empirical effects of deliberation, hopefully fostering more of the empirical turn in the literature on deliberative democracy.


Appendix: Assessing the Quality of the Randomization

In order to claim that the effects of disagreement we find are causal, we must show that participants' assignment to disagreement levels across tables is ignorable. A treatment is said to be ignorable if it is uncorrelated with the potential outcomes of the participants, both participants' baseline enjoyment of deliberation and their response to the sessions themselves (Rubin, 1974). Above we prove that the assignment process that AmericaSpeaks used at the events made disagreement levels ignorable, where this proof required only a single assumption. Here, we test for the ignorability of disagreement level statistically. To ensure these tests do not depend on our model and its specification, we measure the ideology of each participant using the five indicators (see figure 1) and ordinary principal components analysis (with polychoric correlations among the indicators). We then use these scores to calculate the mean ideology score at each of the 155 tables. We assess table-level disagreement as the standard deviation of ideology scores at each table.

We test for the ignorability of the treatment empirically in two ways. First, we test the distribution of ideology and disagreement across tables, and the results of these distributional tests are shown in figure 9. The first panel of figure 9 tests the distribution of mean ideology across tables. If assumption 1 is false, then by homophily we would expect liberals to tend to be grouped with liberals, and conservatives to tend to be grouped with conservatives. This would imply more homogeneously liberal and conservative tables than one would observe under random assignment. The first panel in figure 9 clearly demonstrates this is not the case. This panel plots the empirical quantiles of the distribution of mean ideology across tables against the theoretical quantiles from a normal distribution with the same mean and standard deviation. As the figure shows, several tests of distributions show that the empirical distribution is indistinguishable from a random normal distribution up to three moments. Thus, ideologies, and hence disagreement, are distributed randomly across tables.
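The quantile comparison described above can be sketched with the standard library alone. The code pairs each sorted table-level value with the corresponding quantile of a normal distribution that has the same mean and standard deviation, and adds a simple moment-based skewness as one summary of departure from normality. The function names, the plotting positions (k + 0.5)/n, and the example values are illustrative assumptions; the formal tests reported in figure 9 are not reproduced here.

```python
import statistics


def qq_pairs(values):
    """Pair theoretical normal quantiles with the sorted empirical values."""
    ordered = sorted(values)
    n = len(ordered)
    reference = statistics.NormalDist(statistics.fmean(ordered),
                                      statistics.stdev(ordered))
    # Plotting positions (k + 0.5) / n keep probabilities strictly inside (0, 1).
    return [(reference.inv_cdf((k + 0.5) / n), x) for k, x in enumerate(ordered)]


def skewness(values):
    """Moment-based skewness; zero for a perfectly symmetric sample."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mu) / sd) ** 3 for v in values])


# Hypothetical table-level mean ideology scores.
table_means = [-0.6, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.7]
pairs = qq_pairs(table_means)
print(round(skewness(table_means), 3))
```

If the table-level values behave like draws from a normal distribution, the pairs fall close to the 45-degree line when plotted.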

[Figure 9: QQ-Plots Showing Table-level Ideology and Disagreement are Normally Distributed. Panels: Table Ideology Means and Table Disagreement Means; horizontal axis: Theoretical Quantiles; vertical axis: Empirical Quantiles. Reported tests, in panel order: KS Test, p=0.61; Kurtosis Test, p=0.94; Skewness Test, p=0.14; and KS Test, p=0.91; Kurtosis Test, p=0.63; Skewness Test, p=0.67.]

The second panel of figure 9 tests the distribution of disagreement across tables. This panel likewise shows that disagreement across tables is distributed normally, as if disagreement for each table were a draw from a normal distribution. If assumption 1 were false, by homophily we would see most tables clumped toward a low range of disagreement, and fewer tables in the medium to high range; the skew would have to be to the right since space constraints in the room prevent participants from freely choosing other debate partners once the room begins to fill.

The second way to show ignorability of the treatment is through balance tests of covariates across the levels of disagreement. Table 2 shows that out of 17 covariates, only one (health status) has a conditional distribution that differs between participants who experience low levels of disagreement and participants who experience high levels of disagreement. If the treatment is ignorable, one would expect about one in twenty covariates to be out of balance. In addition, the omnibus balance test of the covariates taken jointly (Hansen and Bowers, 2008) cannot reject the null hypothesis of covariate balance.

Table 2: Observed Covariate Balance Test Results

                   Standardized Difference    Z-Score
Seen Sicko         -0.045                     -1.61
Democrat           -0.052                     -1.85
Independent         0.026                      0.93
Education           0.007                      0.24
Liberal            -0.036                     -1.26
Moderate            0.034                      1.22
Female              0.016                      0.56
Homeowner          -0.001                     -0.05
Black              -0.035                     -1.25
Hispanic           -0.040                     -1.42
Asian               0.013                      0.49
Employed           -0.009                     -0.32
Insured             0.011                      0.40
Health Status       0.068                      2.42
Major Illness      -0.038                     -1.36
Voted in 2006       0.013                      0.48
Ideology            0.019                      0.67

Omnibus Balance Test: chi-square = 28.3, DF = 30, p-value = 0.554

Note: p ≤ 0.05. Individual and omnibus balance test statistics computed with the xBalance feature of the R package RItools (Hansen and Bowers, 2008).

Taken together, the statistical tests reported in this appendix clearly indicate that the table assignments in the natural experiment were as if they were random. Because the treatment that participants actually received is ignorable in this study, we can test for the effects of table-level disagreement on deliberative quality across tables. Since assignment to disagreement level was essentially random, as one would find in a fully randomized experiment, we are able to identify the causal effect of disagreement on deliberative quality.
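The "one in twenty" expectation can be checked with a short calculation. It assumes, purely for illustration, that the 17 covariate tests are independent and each has a 5 percent chance of rejecting under a truly ignorable treatment; the variable names are hypothetical.

```python
# Number of covariates tested and the nominal significance level.
n_covariates = 17
alpha = 0.05

# Expected number of covariates flagged as imbalanced by chance alone.
expected_imbalanced = n_covariates * alpha

# Probability that at least one covariate is flagged by chance,
# assuming the 17 tests are independent (an illustrative simplification).
p_at_least_one = 1 - (1 - alpha) ** n_covariates

print(round(expected_imbalanced, 2), round(p_at_least_one, 3))
```

Under these assumptions, seeing a single covariate out of balance among 17 is about what chance alone would produce, which is consistent with the omnibus test failing to reject balance.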


References
Bafumi, J., A. Gelman, D. K. Park, and N. Kaplan (2005). Practical issues in implementing and understanding Bayesian ideal point estimation. Political Analysis 12 (Spring), 171–187.
Bohman, J. (1998). Survey article: The coming of age of deliberative democracy. Journal of Political Philosophy 6 (4), 400–425.
Clinton, J., S. Jackman, and D. Rivers (2004). The statistical analysis of roll call data. American Political Science Review 98 (May), 355–370.
Cohen, J. (1989). The Good Polity (Alan Hamlin and Philip Pettit ed.), Chapter Deliberation and Democratic Legitimacy, pp. 17–34. New York, N.Y.: Basil Blackwell.
Congdon, P. (2003). Applied Bayesian Modelling. Hoboken, N.J.: John Wiley & Sons, Ltd.
Downs, A. (1957). An Economic Theory of Democracy. New York, N.Y.: HarperCollins.
Elster, J. (1986). Foundations of Social Choice Theory (Jon Elster and Aanund Hylland ed.), Chapter The Market and the Forum: Three Varieties of Political Theory, pp. 103–132. New York, N.Y.: Cambridge University Press.
Freeman, S. (2000). Deliberative democracy: A sympathetic comment. Philosophy and Public Affairs 29 (4), 371–418.
Fung, A. and T. Lee (2008). The Difference Deliberation Makes: A Report on the CaliforniaSpeaks Statewide Conversations on Health Care Reform. AmericaSpeaks.
Gelman, A. and D. B. Rubin (1992). Inference from iterative simulation using multiple sequences. Statistical Science 7 (Nov.), 434–455.
Gutmann, A. and D. Thompson (1996). Democracy and Disagreement: Why Moral Conflict Cannot Be Avoided in Politics, and What Should Be Done about It. Princeton, N.J.: Princeton University Press.
Habermas, J. (1996). Between Facts and Norms: Contributions to a Discourse Theory of Law and Democracy (William Rehg, Translator ed.). Cambridge, Mass.: MIT Press.
Hansen, B. B. and J. Bowers (2008). Covariate balance in simple, stratified and clustered comparative studies. Statistical Science 23 (2), 219–236.
Hibbing, J. R. and E. Theiss-Morse (2002). Stealth Democracy: Americans' Beliefs About How Government Should Work. New York, N.Y.: Cambridge University Press.
Ho, D. E. and K. M. Quinn (2008). Measuring explicit political positions of media. Quarterly Journal of Political Science 3, 353–377.
Imai, K. and T. Yamamoto (2008, June). Causal inference with measurement error: Nonparametric identification and sensitivity analysis of a field experiment on democratic deliberations. Princeton University, Department of Politics typescript. http://imai.princeton.edu.
Jackman, S. (2000). Estimation and inference via Bayesian simulation: An introduction to Markov chain Monte Carlo. American Journal of Political Science 44 (April), 369–398.
Jacobs, L. R., F. L. Cook, and M. X. D. Carpini (2009). Talking Together: Public Deliberation and Political Participation in America. Chicago, Ill.: University of Chicago Press.
Knight, J. and J. Johnson (1994). Aggregation and deliberation: On the possibility of democratic legitimacy. Political Theory 22 (May), 277–96.
Mansbridge, J. J. (1980). Beyond Adversary Democracy. New York, N.Y.: Basic Books.
Martin, A. D. and K. M. Quinn (2002). Dynamic ideal point estimation via Markov chain Monte Carlo for the U.S. Supreme Court, 1953-1999. Political Analysis 10 (2), 134–153.
Mill, J. S. (1859). On Liberty. London: John W. Parker and Son.
Mondak, J. J. (2001). Developing valid knowledge scales. American Journal of Political Science 45 (Jan.), 224–238.
Mutz, D. C. (2004). Cross-cutting social networks: Testing democratic theory in practice. American Political Science Review 96 (Feb.), 111–126.
Rawls, J. (1993). Political Liberalism. New York, N.Y.: Columbia University Press.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66 (5), 688–701.
Spiegelhalter, D., A. Thomas, N. Best, and W. Gilks (1996). BUGS 0.5: Bayesian inference using Gibbs sampling manual (version ii). Technical report, MRC Biostatistics Unit.
Sunstein, C. R. (2002). The law of group polarization. Journal of Political Philosophy 10 (2), 175–195.
Tanner, M. A. and W. H. Wong (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82 (398), 528–540.
Thompson, D. F. (2008). Deliberative democratic theory and empirical political science. Annual Review of Political Science 11, 497–520.
Treier, S. and S. Jackman (2008). Democracy as a latent variable. American Journal of Political Science 52 (Jan.), 201–217.
