
Distinguishing between Coherent and Incoherent Texts

Résumé

Dans cet article, je montre que les théories actuelles du discours sont incapables d’expliquer pourquoi
des arrangements différents des mêmes fragments sémantiques ont des caractéristiques différentes par
rapport à la cohérence. Je propose un critère de cohérence basé sur la forte tendance d’unités de texte
associées à certaines relations rhétoriques à obéir à un ordonnancement canonique; en même temps ce
critère tient compte de la tendance des informations sémantiquement et rhétoriquement similaires à se
grouper dans des fragments de texte plus grands. Je formalise ce critère comme un problème de satis-
faction de contraintes et je montre comment on peut obtenir une procédure de décision qui est capable
de discerner entre les textes cohérents et les textes incohérents. La procédure a été implémentée en
Lisp.

Abstract

In this paper, I show that current discourse theories are not able to explain why different orderings of
the same textual segments exhibit different properties with respect to coherence. I then propose a crite-
rion of coherence that exploits both the strong tendency of textual units that are associated with certain
rhetorical relations to obey a canonical ordering and the inclination of semantically and rhetorically
related information to cluster into larger textual spans. I formalize this criterion as a constraint sat-
isfaction problem and I show how it can yield a decision procedure that is capable of distinguishing
between coherent and incoherent texts. The procedure has been implemented in Lisp.
Daniel Marcu

Department of Computer Science, University of Toronto, marcu@cs.toronto.edu

1. Motivation

Consider the following examples of incoherent text (in which each textual unit is labeled for reference):

(1) [The system performs the enhancement.a1 ] [Before that, the system resolves conflicts.b1 ] [First,
the system asks the user to tell it the characteristic of the program to be enhanced.c1 ] [The
system applies the transformations to the program. d1 ] [It confirms the enhancement with the
user.e1 ] [It scans the program in order to find opportunities to apply transformations to the
program.f1 ] (Hovy, 1988)

(2) [About 30% of the teenagers will become experimental smokers.a2 ] [We know that 3,000 teens
start smoking each day.b2 ] [About 90% of them once thought that smoking was something that
they’d never do.c2 ] [Of those who will start smoking, about 90% will end up with a pack and
a lighter for the rest of their lives. d2 ] [No matter how much one wants to stay a non-smoker,e2 ]
[the truth is that the pressure to smoke in junior high is greater than it will be any other time
of one’s life.f2 ] [About 75% of the young adults will pick up a cigarette and let curiosity take
over.g2 ]

As Hovy puts it, the text in example (1) “is not satisfactory because you have to work hard to make sense
of it” (Hovy, 1988, p. 163); and the same comment applies to example (2) as well. This paper will spell
out the reasons that understanding texts (1) and (2) is difficult, and will provide a formal account of
coherence. Section 2 acknowledges that coherence can be achieved in multiple ways and shows why
current formal and computational approaches to discourse analysis and text planning cannot be used
as means for differentiating between coherent and incoherent text. Section 3 presents the foundations
of our constraint-based formulation of coherence and a decision procedure that distinguishes between
coherent and incoherent texts.
2. Coherent and incoherent texts

Central to the approach that I present here is the observation that given a body of knowledge, there is
more than one way in which that knowledge can be expressed coherently. Preserving the labeling of
the textual units from text (2), one can say, for example, that both ordering e2 < f2 < a2 < d2 < b2 <
g2 < c2 and e2 < f2 < g2 < a2 < d2 < b2 < c2 are coherent. Text (3), which is given below, is a
realization of the latter.

(3) [No matter how much one wants to stay a non-smoker,e2 ] [the truth is that the pressure to smoke
in junior high is greater than it will be any other time of one’s life: f2 ] [75% of the young adults
will pick up a cigarette and let curiosity take over,g2 ] [About 30% of them will become experimental
smokers.a2 ] [Of those who will start smoking, about 90% will end up with a pack and a lighter
for the rest of their lives.d2 ] [We know that 3,000 teens start smoking each day,b2 ] [although it
is a fact that 90% of them once thought that smoking was something that they’d never do.c2 ]

Given the impressive body of research that addresses coherence, it is surprising to notice that few efforts
have been channeled towards studying incoherence.

On one hand, those who do analysis and build discourse structures on the top of linear text assume
implicitly that the input to their theories or systems is coherent. For example, by default, Lascarides,
Asher, and Oberlander (1992) label as “narration” all the discourse relations that cannot be identified
otherwise; and Asher (1993) labels them “continuation”. Mann and Thompson’s (1988) and
Hobbs’s (1990) approaches provide a too-weak definition for coherent texts because although text (2)
is incoherent, one can nonetheless associate with it a valid rhetorical structure tree (Marcu, 1996) (see
figure 1).

On the other hand, systems that generate text are usually able to provide only one or a few of the many
ways in which a body of knowledge can be expressed coherently. The most flexible approaches to text
generation, i.e., those based on plan operators (Hovy, 1988; Moore and Paris, 1993; Maybury, 1992;
Young and Moore, 1994) formalize the generation of coherent text as a top-down tree-expansion pro-
cedure (often called “hierarchical planning”) that maps in successive steps a high-level communicative
goal into a discourse tree. In these approaches, to determine all the possible ways in which coherence
can be achieved, it is necessary to quantify over all communicative goals and over all plans that can
be built starting from them. Incoherent text would be defined then, using a closed-world assumption
technique, as text that has not been produced in the first stage. Obviously, such a definition is unreasonable
both computationally and theoretically: in fact it is impossible even to enumerate all the high-level
communicative goals.

Figure 1: A rhetorical structure analysis of text (2).

3. Distinguishing between coherent and incoherent texts

Formally, the problem that I address can be expressed as follows: given a set $U = \{u_1, u_2, \ldots, u_n\}$ of
n semantic or textual units and a set RC of rhetorical relations¹ among these units, build an abstract
structure, i.e., a text plan, that subsumes all the ways in which the units in U can be realized coherently.
For example, if we take $a_2, b_2, \ldots, g_2$ from texts (2) and (3) to be the units in U and we enumerate all
rhetorical relations that hold among them, we obtain (4):

\[
\left\{
\begin{array}{l}
U = \{a_2, b_2, c_2, d_2, e_2, f_2, g_2\} \text{ and} \\
RC = \{\mathit{evidence}(a_2, f_2),\ \mathit{evidence}(b_2, f_2),\ \mathit{evidence}(d_2, f_2),\ \mathit{evidence}(g_2, f_2), \\
\qquad\quad \mathit{concession}(a_2, c_2),\ \mathit{concession}(b_2, c_2),\ \mathit{concession}(d_2, c_2),\ \mathit{concession}(g_2, c_2), \\
\qquad\quad \mathit{justification}(e_2, f_2),\ \mathit{restatement}(c_2, e_2)\}
\end{array}
\right.
\tag{4}
\]

Since a text can be thought of as a linear sequence of textual units, I assimilate the targeted abstract
structure with the set of all possible orderings that can be defined over U that are coherent with respect
to a given criterion.

1 Throughout this paper, the term “rhetorical relation” is used in Mann and Thompson’s sense (1988). The
convention that I use is that the nucleus of a relation is always given as the second argument in the tuple
relation_name(satellite, nucleus).

The criterion for coherence that I choose stems from what I believe to be an under-exploited part of
Mann and Thompson’s Rhetorical Structure Theory (1988): the corpus study that supports the development
of their theory exhibits strong patterns of ordering for the nuclei and satellites of various rhetorical
relations. They call these patterns canonical orderings (see figure 2).

Satellite before Nucleus    Nucleus before Satellite
Antithesis                  Elaboration
Conditional                 Purpose
Background                  Enablement
Justify                     Restatement
Concessive                  Evidence
Solutionhood

Figure 2: Canonical orders of text spans for some rhetorical relations (Mann and Thompson, 1988,
p. 256).

Our idea is to formalize both the strong tendency of textual units that are associated with certain rhetori-
cal relations to obey a given ordering; and the inclination of semantically and rhetorically related infor-
mation to cluster into larger textual spans (Mooney, Carberry, and McCoy, 1990; McCoy and Cheng,
1991). To avoid the enumeration of all the permutations of the elements in U , I adopt a constraint-
satisfaction perspective.

A formulation of coherence as a constraint-satisfaction problem. Following a constraint-satisfaction
approach, I associate with each textual unit an integer variable whose domain ranges from 1 to n, where
n is the cardinality of U. For example, I associate with formulation (4) seven variables,
$v_{a_2}, v_{b_2}, \ldots, v_{g_2}$, each ranging from 1 to 7.

With each rhetorical relation in RC I associate one ordering constraint and one adjacency constraint. Ordering
constraints are meant to capture the patterns displayed in figure 2; they are formalized as inequalities
that hold between the variables that are associated with the units given as arguments of the relation
under consideration. For example, for the rhetorical relation $\mathit{evidence}(a_2, f_2)$, the ordering constraint
is $v_{f_2} < v_{a_2}$ because in the canonical ordering for this relation, the nucleus $f_2$, i.e., the claim, goes
before the satellite $a_2$, i.e., the evidence. Adjacency constraints are meant to capture the inclination of
semantically or rhetorically related information to cluster. They are formalized by stipulating that the
absolute difference between the values of the variables that are associated with the nucleus and the
satellite of a rhetorical relation be 1. For the same relation, $\mathit{evidence}(a_2, f_2)$, the associated adjacency
constraint is $(v_{a_2} = v_{f_2} + 1) \lor (v_{a_2} = v_{f_2} - 1)$.
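
As a minimal sketch of how these two kinds of constraints could be checked, the Python fragment below represents a relation as a (name, satellite, nucleus) triple, following the convention of footnote 1. The names `NUCLEUS_FIRST`, `ordering_ok`, and `adjacency_ok` are illustrative assumptions and are not taken from the Lisp implementation; the canonical orders are read off figure 2, with every relation that is not listed as nucleus-first assumed to be satellite-first.

```python
# Illustrative sketch; all names are hypothetical, not from the Lisp implementation.

# Relations whose canonical order puts the nucleus before the satellite (figure 2).
# Any other relation is assumed here to put the satellite before the nucleus.
NUCLEUS_FIRST = {"elaboration", "purpose", "enablement", "restatement", "evidence"}


def ordering_ok(relation, position):
    """Check the ordering constraint of one (name, satellite, nucleus) relation."""
    name, satellite, nucleus = relation
    if name in NUCLEUS_FIRST:
        return position[nucleus] < position[satellite]
    return position[satellite] < position[nucleus]


def adjacency_ok(relation, position):
    """Check the adjacency constraint: satellite and nucleus are exactly one step apart."""
    _, satellite, nucleus = relation
    return abs(position[satellite] - position[nucleus]) == 1


# evidence(a2, f2): a2 is the satellite (the evidence), f2 the nucleus (the claim).
relation = ("evidence", "a2", "f2")
position = {"f2": 2, "a2": 3}  # the claim is immediately followed by the evidence
print(ordering_ok(relation, position), adjacency_ok(relation, position))
```

Here `position` plays the role of the integer variables: it maps each unit to the value assigned to its variable.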

Since text is inherently linear, I also associate a unicity constraint with every pair of variables; this
constraint prevents two textual units from being mapped to the same value. However, if two textual units occur
as arguments of the same relations in the set RC, it is impossible to distinguish between their rhetorical
contributions to the text. In these cases, the unicity constraint is not asserted. For example, in
formulation (4), one cannot treat $a_2$, $b_2$, $d_2$, and $g_2$ differently, because all these units participate in the same
relations: they are all evidence for $f_2$ and they all increase the positive regard for the situation presented
in $c_2$. Consequently, there is no unicity constraint associated with any pair of them, but for a pair such
as $a_2, e_2$, one will have $v_{a_2} \neq v_{e_2}$.
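
One way to make "occur as arguments of the same relations" concrete is to compare, for each unit, the set of (relation, role, other argument) triples in which it takes part. The sketch below does this for formulation (4); the helper names `signature` and `unicity_pairs` are assumptions introduced for illustration, and the reading of "same relations" as identical signatures is one plausible interpretation rather than a description of the Lisp implementation.

```python
from itertools import combinations

# Formulation (4), written as (name, satellite, nucleus) triples.
U = ["a2", "b2", "c2", "d2", "e2", "f2", "g2"]
RC = [
    ("evidence", "a2", "f2"), ("evidence", "b2", "f2"),
    ("evidence", "d2", "f2"), ("evidence", "g2", "f2"),
    ("concession", "a2", "c2"), ("concession", "b2", "c2"),
    ("concession", "d2", "c2"), ("concession", "g2", "c2"),
    ("justification", "e2", "f2"), ("restatement", "c2", "e2"),
]


def signature(unit, relations):
    """Collect the (relation, role, other argument) triples a unit takes part in."""
    sig = set()
    for name, satellite, nucleus in relations:
        if unit == satellite:
            sig.add((name, "satellite", nucleus))
        elif unit == nucleus:
            sig.add((name, "nucleus", satellite))
    return frozenset(sig)


def unicity_pairs(units, relations):
    """Pairs of units that can be told apart and so receive a unicity constraint."""
    return [
        (u, v)
        for u, v in combinations(units, 2)
        if signature(u, relations) != signature(v, relations)
    ]


pairs = unicity_pairs(U, RC)
print(("a2", "e2") in pairs)  # a2 and e2 play different roles
print(("a2", "b2") in pairs)  # a2 and b2 take part in the same relations
```

Under this reading, no pair drawn from a2, b2, d2, and g2 receives a unicity constraint, while every other pair does.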

It should be clear that in the general case, due to conflicts among various constraints, it will be impossi-
ble to simultaneously satisfy all the constraints that are associated with a given formulation. Therefore,
instead of searching for solutions that satisfy all the constraints, it is more appropriate to search for so-
lutions that maximize the number of constraints that are satisfied. The formulation presented in this
section yields the following definitions:

Definition 1 A constraint-satisfaction (CS) formulation of the coherence problem for an ordered set
U of textual units and an unordered set RC of rhetorical relations that hold between the members of U
consists of a set of n variables whose domains range from 1 to n, where n is the cardinality of U ; a set of
unicity constraints that are associated with every pair of variables that could be treated differently with
respect to RC ; and a set of ordering and adjacency constraints that are associated with every element
in RC .

Definition 2 Let $\{U, RC\}$ be a CS formulation of a coherence problem. We say that G is the associated
graph of problem $\{U, RC\}$ if it has a node for each element in U and an edge between every two
nodes whose corresponding elements in U occur in a relation in RC.
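
The associated graph of Definition 2 and a connectivity test on it can be sketched in a few lines of Python. The function names `associated_graph` and `is_connected` are hypothetical, and the breadth-first traversal is simply one standard way to test connectivity; nothing here is taken from the Lisp implementation.

```python
from collections import deque


def associated_graph(units, relations):
    """Build an undirected graph: one node per unit, one edge per relation."""
    graph = {unit: set() for unit in units}
    for _, satellite, nucleus in relations:
        graph[satellite].add(nucleus)
        graph[nucleus].add(satellite)
    return graph


def is_connected(graph):
    """Breadth-first search from an arbitrary node; connected if every node is reached."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(graph)


# Three units linked by two relations form a connected graph ...
linked = associated_graph(
    ["e2", "f2", "c2"],
    [("justification", "e2", "f2"), ("restatement", "c2", "e2")],
)
# ... whereas two units with no relation between them do not.
unrelated = associated_graph(["s1", "s2"], [])
print(is_connected(linked), is_connected(unrelated))
```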

Definition 3 Let T be a text, U a segmentation of that text into textual units, and RC the set of rhetor-
ical relations that hold among the units in U . Let P be the corresponding CS formulation of the coher-
ence problem and G its associated graph. We say that text T is coherent if and only if the ordering of
the units in T satisfies “most” of the ordering and adjacency constraints that are associated with P ,
and if G is connected. Otherwise, we say that T is incoherent.

The requirement for G being connected stems from the necessity to prevent the labeling of juxtaposi-
tions of the kind shown below as “coherent”:

(5) John bought a book. Pretty soon, everything will be covered with snow.
Definition 3 may be interpreted as a direct formulation of a decision procedure that distinguishes
between coherent and incoherent texts. A Lisp implementation of this decision procedure labels texts (1)
and (2) as incoherent because they satisfy only four out of 10 and nine out of 20 ordering and adjacency
constraints, respectively. The same procedure labels text (3) as coherent because it satisfies 14 out of
20 ordering and adjacency constraints.
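
The counting step of such a procedure can be sketched as follows. This is an illustration under stated assumptions, not the Lisp implementation: it assigns a distinct position to every unit, reads the canonical orders literally from figure 2, and uses the hypothetical names `NUCLEUS_FIRST` and `satisfied`. Because of these assumptions, its counts need not coincide exactly with those reported above.

```python
# Relations from formulation (4) as (name, satellite, nucleus) triples.
RC = [
    ("evidence", "a2", "f2"), ("evidence", "b2", "f2"),
    ("evidence", "d2", "f2"), ("evidence", "g2", "f2"),
    ("concession", "a2", "c2"), ("concession", "b2", "c2"),
    ("concession", "d2", "c2"), ("concession", "g2", "c2"),
    ("justification", "e2", "f2"), ("restatement", "c2", "e2"),
]

# The relations in (4) whose canonical order is nucleus before satellite (figure 2);
# concession and justification are taken to be satellite before nucleus.
NUCLEUS_FIRST = {"evidence", "restatement"}


def satisfied(order, relations):
    """Count the ordering and adjacency constraints that a linear order satisfies."""
    position = {unit: index + 1 for index, unit in enumerate(order)}
    count = 0
    for name, satellite, nucleus in relations:
        if name in NUCLEUS_FIRST:
            count += position[nucleus] < position[satellite]
        else:
            count += position[satellite] < position[nucleus]
        count += abs(position[satellite] - position[nucleus]) == 1
    return count


text2 = ["a2", "b2", "c2", "d2", "e2", "f2", "g2"]  # order of the units in text (2)
text3 = ["e2", "f2", "g2", "a2", "d2", "b2", "c2"]  # order of the units in text (3)
total = 2 * len(RC)  # one ordering and one adjacency constraint per relation
print(satisfied(text2, RC), satisfied(text3, RC), total)
```

Under these assumptions the ordering of text (3) satisfies more constraints than that of text (2), which agrees with the direction of the classification above.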

In texts that consist of only two or three sentences, in order to achieve various pragmatic effects, one
can disregard some of the canonical orderings that occur in natural language text without losing coher-
ence. However, when larger texts are considered, the canonical orderings can no longer be disregarded:
examples (1) and (2) prove this point.

One could argue that the use of “most” in definition 3 is not precise enough; however, given the inherent
structure of the problem and the large degree of disagreement that human subjects have with respect to
the classification of texts into coherent and incoherent (Mann and Thompson, 1988; Charolles, 1983),
I prefer for the moment to avoid the use of precise numerical values. I believe that such values could
be derived only from adequate empirical studies.

4. Conclusion

This paper argued that current formal and computational theories of coherence cannot detect
incoherent text, provided a formal account of coherence, and presented a decision procedure that is capable of
distinguishing between coherent and incoherent texts.

Acknowledgements. I am grateful to Graeme Hirst for invaluable comments on earlier drafts of this
paper. This research was supported by a fellowship and a grant from the Natural Sciences and
Engineering Research Council of Canada.

References

Asher, Nicholas. 1993. Reference to Abstract Objects in Discourse. Kluwer Academic Publishers,
Dordrecht.
Charolles, Michel. 1983. Towards a heuristic approach to text-coherence problems. In Fritz Neubauer,
editor, Coherence in natural-language texts, volume 38 of Papers in textlinguistics. Helmut Buske Ver-
lag Hamburg, pages 1–16.

Hobbs, Jerry R. 1990. Literature and Cognition. CSLI Lecture Notes Number 21.

Hovy, Eduard H. 1988. Planning coherent multisentential text. In Proceedings of the 26th Annual
Meeting of the Association for Computational Linguistics, pages 163–169, State University of New
York at Buffalo, June 27–30.

Lascarides, Alex, Nicholas Asher, and Jon Oberlander. 1992. Inferring discourse relations in context.
In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, pages
1–8.

Mann, William C. and Sandra A. Thompson. 1988. Rhetorical structure theory: Toward a functional
theory of text organization. Text, 8(3):243–281.

Marcu, Daniel. 1996. Building up rhetorical structure trees. To appear in the Proceedings of the
Thirteenth National Conference on Artificial Intelligence, Portland, Oregon, August 4–8.

Maybury, Mark T. 1992. Communicative acts for explanation generation. International Journal of
Man–Machine Studies, 37:135–172.

McCoy, Kathleen F. and Jeannette Cheng. 1991. Focus of attention: Constraining what can be said
next. In Cécile L. Paris, William R. Swartout, and William C. Mann, editors, Natural Language Gen-
eration in Artificial Intelligence and Computational Linguistics. Kluwer Academic Publishers, pages
103–124.

Mooney, David J., Sandra Carberry, and Kathleen F. McCoy. 1990. The generation of high-level struc-
ture for extended explanations. In Proceedings of the International Conference on Computational Lin-
guistics, COLING-90, volume 2, pages 276–281, Helsinki.

Moore, Johanna D. and Cécile Paris. 1993. Planning text for advisory dialogues: Capturing intentional
and rhetorical information. Computational Linguistics, 19(4):651–694.

Young, R. Michael and Johanna D. Moore. 1994. DPOCL: A principled approach to discourse plan-
ning. In Proceedings of the Seventh International Workshop on Natural Language Generation, pages
13–20, Kennebunkport, Maine, June.
