ABSTRACT
In today's world of digital media connecting millions of users, large amounts of information are being generated. These are potential mines of knowledge and could give deep insights into trends of both social and scientific value. However, since most of this content is highly unstructured, we cannot readily make sense of it. Natural language processing (NLP) is a serious attempt in this direction: to organise textual matter that is in a human-understandable form (natural language) in a meaningful and insightful way. Here, textual entailment can be considered a key component in verifying or proving the correctness or efficiency of this organisation. This paper surveys various proposed textual entailment methods, giving a comparative picture based on criteria such as robustness and semantic precision.
KEYWORDS
Natural Language Processing, Textual Entailment, Semantic Inference, Textual Inference, Textual Hypothesis
1. INTRODUCTION
In today's world, textual information is everywhere, with high volumes generated every day in the form of emails, chats, discussion forums and comments on articles, and it has become a herculean task to make sense of this text and categorise it meaningfully. The text generated here arises from human interactions and is in natural language.
DOI: 10.5121/ijaia.2016.7405
2. THE PROBLEM
Looking at the brief history of NLP [4], where since the advent of computers we have been trying to find ways to communicate with the machine effectively, we see an interesting trajectory of events. Earlier, a lot of importance was given to extracting meaning from each word that a human utters, the computer seeking as much information as possible from each of the words. This meaning-making or inference involved processing information at different levels, ranging from the syntactic, lexical and semantic to the discourse/pragmatic. But only later have we as a community started realising that this processing of each word is often not needed; instead one must focus on extracting as much sense as possible from the sentence as a whole. Thus the problem of NLP can be understood as extracting as much meaning as possible from the given sentence.
However, broadly from the early 21st century, the focus has been on semantic inference, and the area of Recognising Textual Entailment, popularly known as RTE, is a child of this pursuit. A typical pipeline followed for any natural language processing system is outlined in [4].
Various tasks that are typically encountered include Question Answering (QA), Information Extraction (IE), (multi-document) summarisation, and machine translation (MT) evaluation [3]. RTE is proposed as a generic task capturing various earlier existing tasks of QA, IE, MT, etc. The RTE Challenge guidelines [3] specifically look at 7 such tasks, namely Information Retrieval (IR), Comparable Documents (CD), Reading Comprehension (RC), Question Answering (QA), Information Extraction (IE), Machine Translation (MT) and Paraphrase Acquisition (PP). These are backed by some strong empirical studies and arguments [5][7].
The efficacy of any given approach is understood by 2 key measures, namely accuracy and a parameter defined as the Confidence-Weighted Score (cws). These are also used to compare the various approaches. The cws is a measure of how well a system correctly judges the T-H pairs, weighted by the classification of the TE into positive (T entails H), negative (T does not entail H) and non-TE (T neither entails nor contradicts H) categories [3][8]. A correct judgement made with high confidence scores higher than one made with low confidence, so the system should work well in identifying and explaining both successful and failed T-H matches. The cws is a score between 0 and 1, and the higher the score, the better the system meets the requirements of RTE. A system with a good overall cws (on all the 7 tasks, IR to PP) is deemed better at TE.
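The cws computation can be sketched as follows. This is a minimal illustration of the confidence-weighted score used in the RTE challenges: the system's judgements are ranked by decreasing confidence, and the precision over each prefix of the ranking is averaged, so correct judgements made with high confidence count for more.

```python
def confidence_weighted_score(judgements):
    """cws for a list of (confidence, correct) pairs, where `correct`
    is True when the system's entailment judgement for that T-H pair
    matched the gold annotation."""
    # Rank judgements from most to least confident.
    ranked = sorted(judgements, key=lambda jc: jc[0], reverse=True)
    correct_so_far = 0
    total = 0.0
    for i, (_, correct) in enumerate(ranked, start=1):
        if correct:
            correct_so_far += 1
        # Precision over the i most confident judgements.
        total += correct_so_far / i
    return total / len(ranked)

# A system that is right exactly where it is most confident scores high.
pairs = [(0.9, True), (0.8, True), (0.6, False), (0.3, False)]
print(round(confidence_weighted_score(pairs), 4))  # 0.7917
```

Note that plain accuracy for this toy run would be 0.5; the cws of about 0.79 rewards the fact that the two correct judgements carry the highest confidence.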
Figure 3: A system picture of the RTE using the atomic propositions approach [9]

concepts of T-H pairs. Some (310) knowledge axioms are also incorporated. Then WordNet is used to create lexical chains. The above processes are used to define the arguments of a predicate. This helps the logic prover, COGEX, later not to miss any possible connections between T and H. For each case, a prover score is calculated iteratively until a refutation is reached, relaxing the arguments of the predicate step by step. This did very well on the CD task, with an accuracy of 0.78 and a cws of 0.822. But on the other tasks the performance was not very satisfactory: the overall accuracy was 0.551 and the cws was 0.56.
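The iterative relaxation described above can be sketched as the following loop. This is not the actual COGEX implementation: the `unifies` and `relax` functions are hypothetical stand-ins for its predicate machinery, and the per-step penalty is purely illustrative.

```python
def prover_score(t_args, h_args, unifies, relax, penalty=0.1):
    """Sketch of a relaxation loop: try to prove the hypothesis from
    the text; each time the proof fails, relax the predicate arguments
    and retry at a reduced score."""
    score = 1.0
    args = list(h_args)
    while args:
        if unifies(t_args, args):   # refutation reached: proof found
            return score
        args = relax(args)          # drop or loosen one argument
        score -= penalty            # each relaxation lowers confidence
    return 0.0                      # no proof even fully relaxed

# Toy instantiation: H's arguments unify when they are a subset of T's,
# and relaxation drops the last (least important) argument.
unifies = lambda t, h: set(h) <= set(t)
relax = lambda a: a[:-1]
print(prover_score(["john", "car"], ["john", "car", "red"],
                   unifies, relax))  # 0.9
```

The design point is that a proof found only after relaxation still counts as entailment, but with a score discounted in proportion to how much had to be given up.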
Further, a model-building approach, if employed in contrast to the theorem-proving ones, handles negation well. Here a theorem prover [10] and a model builder are used in tandem to handle and account for both negative and positive results effectively. This hybrid approach very clearly shows an advantage over the previous studies, with an accuracy of 0.61 and a cws of 0.65. The same authors later tried logical inferences too, with no major improvements [15].
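The tandem use of a theorem prover and a model builder can be sketched as the following decision procedure. This is a toy over sets of propositional literals, not the original system (which used off-the-shelf first-order tools); "proving" T entails H is modelled as containment, and "model building" as a simple consistency check.

```python
def neg(lit):
    # Toy negation over string literals: "~p" <-> "p".
    return lit[1:] if lit.startswith("~") else "~" + lit

def consistent(facts):
    # A set of literals is satisfiable iff no p occurs together with ~p.
    return all(neg(f) not in facts for f in facts)

def entailment_decision(t, h):
    """Prover and model builder in tandem, over literal sets."""
    if h <= t:                     # prover: T proves H
        return "entails"
    if not consistent(t | h):      # T & H has no model at all
        return "contradicts"
    # Model builder: a model of T & ~H (negating some literal of H)
    # witnesses that T does not entail H.
    if any(consistent(t | {neg(l)}) for l in h):
        return "does not entail"
    return "unknown"

t = {"rain", "wet"}
print(entailment_decision(t, {"wet"}))     # entails
print(entailment_decision(t, {"~rain"}))   # contradicts
print(entailment_decision(t, {"cold"}))    # does not entail
```

The negative cases show why the model builder earns its keep: the prover alone can only confirm entailment, while a counter-model of T & ~H positively certifies non-entailment instead of leaving the case undecided.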
Though the results are not good in comparison with the earlier studies on the RTE2 and RTE3 datasets, the results are promising on the RTE4 and RTE5 datasets. Further, on-the-fly implementations have also shown good accuracies.

Most of the results here are very preliminary, yet they show promise in that they outperform the state-of-the-art first-order logic systems in the tests done on the FraCas datasets.
4. COMPARATIVE OBSERVATIONS
We can see that each of the above methods has different advantages and disadvantages from the logical point of view. Mixing semantic inferences with lexical and syntactic inferences is something the area seems to struggle with. The table in the appendix at the end shows a comparative overview of all the above-mentioned approaches and their performances, quantitatively wherever available. Also, most of these approaches are still being tested against more and more datasets, as the knowledge base is also developing in parallel, as can be seen with the various versions of RTE and the likes of FraCas.
International Journal of Artificial Intelligence and Applications (IJAIA), Vol. 7, No. 4, July 2016
5. FUTURE SCOPE
There have been attempts to solve the natural language processing problem for many decades. However, as can be seen in the above attempts to solve textual entailment, a very concerted effort is being put in by a large community of researchers to make the machine process and understand natural language.
In the case of textual entailment, many algorithms have been proposed, but most of them are still not able to crack the problem or give a breakthrough. A revolution in the approach to solving this problem (a paradigm shift along the lines of the historian and philosopher of science Thomas Kuhn [28]) is very badly needed. We can be more confident than ever, given the computational prowess the current technological age has.
With wider applications these days and the expanding reach of technology, thanks to the smartphone revolution, natural language processing is becoming a much sought-after technology. Moreover, the latest studies have also started looking at languages other than English, like a recent study in Arabic [29]. Also, there are studies going on for speech recognition in the same flow of things. So the approaches of higher-order logical inference and semantic inference with which we ended will be key threads to pursue in future directions.
There are two key aspects that can be seen as hurdles to the TE problem. As shown in the above review, the first is the algorithm: the approach has slowly transformed from a merely syntactic one to more rigorous ones, and we have stopped at the initial stages of higher-order logic inference. The second key issue is the availability of a sufficient base of resources, such as lexical resources and more rigorous test data. Both these problems need to be handled together. The second issue, however, could be the easier one, in the context of everyday interaction these days through digital media producing big data [30][32]. With more solid resources and potential test data, a crucial breakthrough is still needed in better algorithms, the first major problem. Attempts like Manning's Foundations of Statistical NLP [33] outline an endeavour in this very direction to chalk out the search for an effective algorithm. The latest handbook of semantic theory by Lappin [34] is another great guidebook to start the pursuit of a breakthrough. Some of the latest approaches in this direction are the bi-directional LSTM model with inter-attention [35] and various other new models [36] addressing entailment using limited data [37], which is one of the key issues with the NLP problem.
6. CONCLUSIONS
This area of research, as we can see, is a great meeting place for people from artificial intelligence, computer science, linguistics and philosophy. Also, practically, the amount of data being generated, and the information and insight it could carry, makes the problem even more relevant to today's requirements. Further, communication with the machine is a very interesting problem in the context of the entire human evolution and the scientific pursuit. The results of this research can be very productive in enabling intelligent machines and robots to be more user-friendly and to make a smooth transition into the natural world. Machines understanding human language as it is would be a breakthrough in the entire history of science and technology.
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
APPENDIX: TABLE
Columns: S. No; Approach; Key claim/task; overall cws (only for RTE); accuracy/precision (when mentioned); FraCas score; Advantages/Disadvantages; Authors; Year.

1. Textual Entailment Resolution via Atomic Propositions (Elena Akhmatova, 2005)
Key claim/task: Describing a computer tool which can extract text entailments by comparing atomic propositions.
Overall cws: 0.5067. Accuracy: 0.5188.
Advantages/Disadvantages: Simple; inadequate knowledge rule database for a robust entailment.

2. Robust Textual Inference via Learning and Abductive Reasoning (Rajat Raina, Andrew Y. Ng and Christopher D. Manning, 2005)
Key claim/task: Parsing sentences into logical-formula-like representations; a minimum-cost set of assumptions is then realised using an abductive theorem prover, and one sentence follows the other if a low-cost set of assumptions exists. This marries the precision of logical reasoning with the robustness of machine learning.
Overall cws: 0.651. Accuracy: 0.57.
Advantages/Disadvantages: More adaptive and flexible, but highly syntactic; semantic aspects still weak.

3. Applying COGEX to Recognize Textual Entailment (Abraham Fowler et al., 2005)
Key claim/task: Transforming the T-H pair into a logic-form representation with semantic relations; the system then generates axioms as linguistic rewriting rules and lexical-chain axioms to connect T and H.
Overall cws: 0.56. Accuracy: 0.551.
Advantages/Disadvantages: Better semantic connectivity, but creating the knowledge base can be tedious.

4. Recognising Textual Entailment with Logical Inference (Johan Bos and Katja Markert, 2005)
Key claim/task: Using model building and machine learning.
Overall cws: 0.65. Accuracy: 0.61.
Advantages/Disadvantages: Better semantic inference and a hybrid model, leading to more robust entailment.

5. Recognising Textual Entailment using LCC's GroundHog (Hickl et al., 2006)
Key claim/task: Introducing a new system for recognising textual entailment (known as GROUNDHOG), which uses a classification-based approach to combine lexico-semantic information derived from text-processing applications with a large collection of paraphrases acquired automatically from the WWW.
Score: >0.652.
Advantages/Disadvantages: Machine-learning-based approach with high accuracies possible, but requires a large number of training examples.

6. When logical inference helps determining textual entailment (and when it doesn't) (Johan Bos and Katja Markert, 2006)
Key claim/task: Comparison of logical inference with a shallow method in its efficacy for textual entailment.
Accuracy: 0.616 (shallow) and 0.606 (both).
Advantages/Disadvantages: No significant improvement in results using logical inference (except for a few tasks).

7. Natural Logic for Textual Inference (Bill MacCartney and Christopher D. Manning, 2007)
Key claim/task: Most approaches sacrifice semantic precision for robustness, while those based on first-order logic and theorem proving are highly brittle; this is a middle way.
Accuracy: 0.673. FraCas score: 59.56% (accuracy).
Advantages/Disadvantages: Tries to use both deep and shallow methods using natural logic (NatLog); among the first to use FraCas.

8. Semantic Inference at the Lexical-Syntactic Level (Bar-Haim, Dagan, Greental and Shnarch, 2007)
Key claim/task: Classical approaches to semantic inference rely on complex logical representations, while practical applications usually adopt shallower lexical or lexical-syntactic representations but lack a principled inference framework. Proposes a generic semantic inference framework that operates directly on syntactic trees; new trees are inferred by applying entailment rules, which provide a unified representation for varying types of inference.
Precision: 78.5, in an RE (Relation Extraction) setting.
Advantages/Disadvantages: Does not use the standard RTE or FraCas setting; more precise and elaborate, less approximate than the other studies; a new framework incorporating the semantic as well as the lexical-syntactic level.

9. Recognizing Textual Entailment with Logical Inference (Peter Clark and Phil Harrison, 2008)
Key claim/task: First the semantic interpretation of the sentence is performed, then the system checks whether the logic for H is implied by some inference-elaborated version of T. The system also tries to produce explanations for the entailments, sometimes erroneous ones.
Correctness: ~65%.
Advantages/Disadvantages: Provides explanations; produces preliminary positive results to start with; deductive style of reasoning.

10. Modeling Semantic Containment and Exclusion in Natural Language Inference (Bill MacCartney and Christopher D. Manning, 2008)
Key claim/task: Using a new method to reduce error by incorporating semantic inference, via a sequence of atomic edits between T and H.
Accuracy: 64.5 (RTE3). FraCas score: 70.49 (accuracy).
Advantages/Disadvantages: Hybrid method; more reliable results than the earlier NatLog; using semantic containment, exclusion and implicativity explains many everyday patterns.

11. Inference Rules and their Application to Recognizing Textual Entailment (Georgiana Dinu and Rui Wang, 2009)
Key claim/task: Starting with an automatically acquired collection, refining it and obtaining more rules using a hand-crafted lexical resource; then producing a dependency-based structure representation from texts, with the aim of providing a proper base for inference-rule application.
Precision: high (>55) across tasks.
Advantages/Disadvantages: Use of noisy knowledge; flexible combinatorial approach; tedious writing of inference rules.

12. Combined Distributional and Logical Semantics (M. Lewis and Mark Steedman, 2013)
Key claim/task: Follows formal semantics in mapping language to logical representations, but differs in that the relational constants used are induced by offline distributional clustering at the level of predicate-argument structure.
Accuracy: above 59%.
Advantages/Disadvantages: Distributional logical semantics; advanced clustering techniques deployed for textual entailment.

13. Logical Inference on Dependency-based Compositional Semantics (Tian, Miyao and Matsuzaki, 2014)
Key claim/task: Equipping the DCS framework with logical inference by defining abstract denotations as an abstraction of the computing process of denotations in the original DCS; an inference engine is built to achieve inference on abstract denotations. Furthermore, a way to generate on-the-fly knowledge in logical inference is proposed, by combining the framework with the idea of tree transformation.
Accuracy: 89% (single premise), 80% (multiple premises). FraCas score: 79.5% (single premise), 80% (multiple premises).
Advantages/Disadvantages: DCS and logical inference; on-the-fly implementations.

14. Higher-order logical inference with compositional semantics (Koji Mineshima et al., 2015)
Key claim/task: Developing a bridge between the parser and semantic composition; higher-order logic used for logical inference.
Accuracy: 69%.
Advantages/Disadvantages: Opens the gates for higher-order logic inferences.