International Journal of Scientific Engineering and Technology

Volume No.5 Issue No.1, pp: 28-31

ISSN:2277-1581
01 Jan.2016

Literature Review of Opinion Target Extraction Techniques


Poonam R. Gohad, Archana S. Vaidya
Department of Computer Engineering,
Gokhale Education Society's R. H. Sapat College of Engineering, Nashik, India
poonam.gohad@gmail.com, archana.s.vaidya@gmail.com
Abstract: Public opinion on events and products is of
increasing interest to customers as well as organizations.
A lot of work is currently being carried out in opinion mining.
One of its subtasks is opinion target extraction, in which opinion
targets are extracted and analyzed using various methods. These
methods include the word alignment model, double
propagation, shallow semantic parsing and conditional
random fields (CRF). CRFs are used in
cross-domain settings for target extraction. CRFs are also used
in a cross-language scenario in order to extract the opinion
targets of another language. This paper presents a literature
review of the mentioned methods.
Keywords: Opinion target extraction, Double propagation,
Word Alignment model, Shallow semantic parsing, cross
language.
I. Introduction
The thoughts of other people have always been important
information for most of us when making decisions.
Before the wide spread of the Web, when an individual wanted to
make a decision, he or she asked friends and family for their
opinions. When an organization wished to find the opinions or
reviews of the public about its services or products, it conducted
opinion polls and surveys. With the Web, however, and
especially with the explosive growth of user-generated content on
the Web in the past few years, the world has been
transformed [1].
There are millions of internet users at present, as a
result of which social media has gathered a
massive amount of valuable peer reviews and documents on
almost everything. With the fast growth of e-commerce, more
products are sold on the Web and more people are buying
products online. In order to improve customer satisfaction and
the shopping experience, it has become important for
manufacturers to enable customers to review or express
opinions on the products they buy. Most of the time the
reviews are free text, which makes it very hard for a potential
customer to read them all and to decide whether to
buy the product or not. To make this easier, mining this
pool of reviews and detecting opinion features has become
useful.
The analysis of opinions from a given review corpus is
known as opinion mining. Opinion mining analyzes people's
opinions, sentiments, attitudes and emotions regarding entities
such as products, services and organizations, and their
attributes. In opinion mining, an opinion target is the
attribute or entity on which the user's opinion has been expressed.
Opinion targets are the topics on which an opinion is
expressed. They are important because, without knowing the
targets of the opinion, the opinions expressed in either a sentence
or a document are of limited use. For example, consider the
sentence "I am not happy with the battery life of this phone";
here "battery life" is the target of the opinion being expressed. If
that is not known, this opinion is of little value [1]. The process
of finding such opinion targets is called opinion target
extraction. The task is typically addressed with
supervised learning algorithms such as Conditional
Random Fields, as well as with methods such as the word alignment
model, double propagation and shallow semantic parsing.
However, the supervised techniques need a large amount of
annotated data to train models that can label unseen data.

doi : 10.17950/ijset/v5s1/106
II. Opinion Target Extraction Techniques
A large amount of work is being carried out in opinion target
extraction, using a variety of approaches.
Supervised, unsupervised and semi-supervised
algorithms are all used in the opinion target extraction process.
Some approaches to opinion target extraction are described
below.
2.1 Double Propagation
Qiu et al. [4] proposed an approach that extracts
opinion words and targets iteratively, using the already known and
extracted opinion words and targets and identifying the syntactic
relations that link them. This Double Propagation
approach is so called because it propagates information back and
forth between opinion words and targets.
A seed opinion lexicon is required to bootstrap the
method. The extraction approach uses a rule-based strategy to
define relations, which is followed by a propagation algorithm.
Polarities are assigned to the extracted opinions using the
homogeneous rule, the heterogeneous rule and the intra-review rule.
During propagation, noise (incorrect opinion words and
targets) may be introduced. The idea of the propagation approach
is to first extract opinion words and opinion targets using the
seed opinion lexicon. The newly extracted opinion
words and targets are then used for further opinion target and word
extraction. The propagation continues until no more new
opinion words or targets can be identified. Thus targets can still
be extracted with high recall even if the seed opinion lexicon is
small, and at the same time the opinion lexicon is expanded.
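The propagation loop described above can be sketched as follows. This is only a minimal illustration and not the rule set of Qiu et al. [4]: it assumes that sentences have already been parsed into (head, relation, dependent) edges, it uses just two hypothetical relation names ("amod" and "conj"), and it leaves out part-of-speech checks, polarity assignment and pruning.

```python
from typing import Iterable, List, Set, Tuple

# A dependency edge is (head, relation, dependent). The relation names
# "amod" and "conj" and the toy edges below are illustrative assumptions.
Edge = Tuple[str, str, str]


def double_propagation(edges: Iterable[Edge],
                       seed_opinions: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Grow opinion-word and target sets until a full pass adds nothing."""
    edge_list: List[Edge] = list(edges)
    opinions: Set[str] = set(seed_opinions)
    targets: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, rel, dep in edge_list:
            found_targets: Set[str] = set()
            found_opinions: Set[str] = set()
            if rel == "amod":
                if dep in opinions:      # known opinion word modifies a noun
                    found_targets.add(head)
                if head in targets:      # known target has a modifier
                    found_opinions.add(dep)
            elif rel == "conj":          # conjoined words share a role
                if head in targets:
                    found_targets.add(dep)
                if dep in targets:
                    found_targets.add(head)
                if head in opinions:
                    found_opinions.add(dep)
                if dep in opinions:
                    found_opinions.add(head)
            if not (found_targets <= targets and found_opinions <= opinions):
                targets |= found_targets
                opinions |= found_opinions
                changed = True
    return opinions, targets


# Toy parses of "great screen", "screen and battery", "durable battery",
# "durable and cheap" and "cheap case".
TOY_EDGES: List[Edge] = [
    ("screen", "amod", "great"),
    ("screen", "conj", "battery"),
    ("battery", "amod", "durable"),
    ("durable", "conj", "cheap"),
    ("case", "amod", "cheap"),
]

opinion_words, opinion_targets = double_propagation(TOY_EDGES, {"great"})
print(sorted(opinion_words))
print(sorted(opinion_targets))
```

On the toy edges, the single seed word "great" is enough to reach every other opinion word and target. This shows why a small seed lexicon can still give high recall, and also why one wrong extraction can propagate further errors.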
A method has been defined for pruning the errors, but it does not
provide effective results. Hence, in order to improve precision,
further work on opinion target pruning methods needs to be
considered. Also, to improve relation coverage, automatic
learning of syntactic relations from a large corpus using pattern
matching techniques can be tried.
2.2 Shallow Semantic Parsing
Li et al. [5] proposed a simplified approach that uses
shallow semantic parsing to extract opinion targets. This is
done by formulating opinion target extraction as a shallow
semantic parsing problem in which the opinion expression is
considered the predicate and the corresponding targets are
considered its arguments. Firstly, opinion target extraction is
modeled at the parse tree level, which provides abundant
structured syntactic information, instead of at the word
sequence level, where only lexical information is available.
Secondly, the focus is on determining whether a
constituent is an opinion target or not, using a simplified
shallow semantic parsing framework. The evaluation shows that
structured syntactic information plays an important role in
capturing the domination relationship between an opinion
expression and its targets. Other types of information for opinion
target extraction can be explored here. Opinion holder
extraction can be included in the parsing framework to make it
more effective. Moreover, this parsing approach can be used for
cross-domain opinion target extraction and opinion holder
extraction.
2.3 Word based Translation Model
Liu et al. proposed an approach to extract opinion
targets using a word-based translation model (WTM) [6]. At first,
the word-based translation model is applied in a monolingual
scenario to mine the associations between opinion targets
and opinion words. Then, a graph-based algorithm is used to extract
opinion targets: candidate opinion relevance is estimated from
the mined associations and is combined with
candidate importance to generate a global measure.
By using the word-based translation model, opinion
relations can be captured more precisely, especially
long-span relations. This method can effectively avoid the noise
that arises from parsing errors when dealing with texts from vast
Web corpora. The graph-based algorithm extracts opinion
targets in a global process, which can effectively reduce the error
propagation problem of traditional bootstrapping-based methods.
Experimental results show that the method outperforms the double
propagation method.
Other word alignment methods, such as discriminative
models, can be used in the system. To identify the opinion
relations between words more precisely, some syntactic
information can be added to the word translation model to
constrain the word alignment process. Moreover, for
candidate opinion relevance estimation it would be useful to add
some knowledge of opinion words to the model.
2.4 Joint Inference
Yang and Cardie [7] observed that most
previous approaches perform the extraction of opinion entities
and opinion relations in a pipelined manner, in which the
interdependencies between the different extraction stages are not
captured. They instead identify opinion entities and opinion
relations jointly: within a sentence, the goal is to identify the
spans of opinion expressions and opinion arguments (targets and
holders) as well as the relations that link them.
The training data consists of text with manually annotated
opinion expression and argument spans, each of which has a list
of relation ids specifying the linking relations between
opinion expressions and their arguments.
Here, a joint inference model makes use of the
knowledge from predictors that each optimize a subtask of
opinion extraction, and obtains a globally optimal solution. The
inference goal is to find the optimal prediction for both opinion
entity identification and relation extraction. According to the
experimental results, the joint inference approach significantly
outperforms traditional pipeline methods and baselines that
tackle the subtasks of opinion extraction in isolation.
For future work, the model can be extended to handle more
complex opinion relations, e.g. nested as well as
cross-sentential relations. This can potentially be achieved by
implementing more powerful predictors and more complex
linguistic constraints.
2.5 Word Alignment Model
Liu et al. [8] proposed an approach using a
partially-supervised alignment model, in which opinion relation
identification is regarded as an alignment process. A
graph-based co-ranking algorithm is used to estimate the
confidence of each candidate, and the candidates with
higher confidence are extracted as opinion targets or
opinion words. Compared to previous methods
based on nearest-neighbor rules [11], this model captures
long-span opinion relations more precisely.
An opinion target can find its corresponding modifier
through word alignment. A partially-supervised word
alignment model (PSWAM) is employed for this purpose. A portion
of the links of the full alignment in a sentence can be obtained
easily, and these links can be used to constrain the alignment
model so that better alignment results are obtained. To obtain the
partial alignments, syntactic parsing is used. Although existing
syntactic parsers cannot precisely obtain the entire syntactic
tree of informal sentences, some opinion relations can still be
obtained precisely using high-precision syntactic patterns. A
constrained Expectation-Maximization (EM) algorithm based
on hill-climbing [17] is then used to determine the alignments in
sentences, such that the model is as consistent with these links as
possible. In this way, many errors induced by
completely unsupervised word alignment models (WAMs) are corrected.
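The effect of partial links on alignment training can be illustrated with a much simpler model than the one in [8]. The sketch below is an assumption-laden toy: it uses IBM Model 1 style EM in a monolingual setting (each word aligns to another word of the same sentence) rather than the hill-climbing based constrained EM of [17], and the names `candidates`, `constrained_em`, `best_links` and the toy sentences are hypothetical. Wherever a partial link is given, the E-step considers only that link.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Sentence = List[str]
# (sentence index, source position) -> position it must align to.
PartialLinks = Dict[Tuple[int, int], int]


def candidates(s: int, i: int, words: Sentence,
               links: PartialLinks) -> List[int]:
    """Allowed alignment positions for word i: the fixed link, or all others."""
    fixed: Optional[int] = links.get((s, i))
    if fixed is not None:
        return [fixed]
    return [j for j in range(len(words)) if j != i]


def constrained_em(sentences: List[Sentence], links: PartialLinks,
                   iterations: int = 10) -> Dict[Tuple[str, str], float]:
    """Estimate t[(src, tgt)] = P(src | tgt) with Model 1 style EM."""
    t: Dict[Tuple[str, str], float] = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        counts: Dict[Tuple[str, str], float] = defaultdict(float)
        totals: Dict[str, float] = defaultdict(float)
        for s, words in enumerate(sentences):
            for i, src in enumerate(words):
                cands = candidates(s, i, words, links)
                norm = sum(t[(src, words[j])] for j in cands)
                if norm == 0.0:
                    continue
                for j in cands:
                    # E-step: posterior of each allowed link.
                    post = t[(src, words[j])] / norm
                    counts[(src, words[j])] += post
                    totals[words[j]] += post
        # M-step: renormalise expected counts per aligned-to word.
        t = defaultdict(float, {pair: c / totals[pair[1]]
                                for pair, c in counts.items()})
    return t


def best_links(sentences: List[Sentence], links: PartialLinks,
               t: Dict[Tuple[str, str], float]) -> Dict[Tuple[int, int], int]:
    """Pick the most probable allowed position for every word."""
    result: Dict[Tuple[int, int], int] = {}
    for s, words in enumerate(sentences):
        for i, src in enumerate(words):
            cands = candidates(s, i, words, links)
            if cands:
                result[(s, i)] = max(cands, key=lambda j: t[(src, words[j])])
    return result


TOY_SENTENCES = [["great", "screen"], ["the", "screen", "looks", "great"]]
# Assume a high-precision pattern linked "great" (position 3) to "screen".
TOY_LINKS: PartialLinks = {(1, 3): 1}
TOY_T = constrained_em(TOY_SENTENCES, TOY_LINKS)
TOY_ALIGNMENT = best_links(TOY_SENTENCES, TOY_LINKS, TOY_T)
```

Links supplied in `TOY_LINKS` are always respected, while the remaining positions are filled in by the trained probabilities; this mirrors the idea of keeping the model consistent with the partial alignment.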
Syntax-based methods suffer from
parsing errors on informal online texts; this word alignment
model effectively alleviates the problem. Extracting opinion
targets or words can be regarded as a co-ranking process.
Specifically, an Opinion Relation Graph is constructed to
model all opinion target and opinion word candidates and the
opinion relations among them. A co-ranking algorithm based on
random walks is then proposed to estimate each candidate's
confidence on the graph. In this process, high-degree vertices
are penalized to weaken their impact and to decrease the
probability of a random walk going into unrelated regions of
the graph. At the same time, prior knowledge about the candidates
is calculated to indicate likely noise and is incorporated into the
ranking algorithm, so that it works together with the relation
information in estimating candidate confidence.
Finally, candidates with confidence higher than a threshold
are extracted.
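A co-ranking step of this kind can be sketched as below. The update rule, the logarithmic degree penalty, the rescaling step and the parameter `mu` are assumptions made for illustration and are not the formulas of [8]; priors are assumed to lie in [0, 1] and to be given for every candidate that appears in an edge.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def co_rank(edges: Dict[Tuple[str, str], float],
            prior_t: Dict[str, float], prior_o: Dict[str, float],
            mu: float = 0.3,
            iterations: int = 50) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (target confidences, opinion-word confidences)."""
    deg_t: Dict[str, int] = defaultdict(int)
    deg_o: Dict[str, int] = defaultdict(int)
    for tgt, op in edges:
        deg_t[tgt] += 1
        deg_o[op] += 1
    # Penalise edges that touch high-degree vertices (assumed log penalty).
    w = {(tgt, op): wt / ((1 + math.log(deg_t[tgt])) * (1 + math.log(deg_o[op])))
         for (tgt, op), wt in edges.items()}
    # Rescale so that the weights around any vertex sum to at most 1,
    # which keeps all confidences inside [0, 1].
    load: Dict[Tuple[str, str], float] = defaultdict(float)
    for (tgt, op), wt in w.items():
        load[("t", tgt)] += wt
        load[("o", op)] += wt
    scale = max(load.values(), default=1.0) or 1.0
    w = {pair: wt / scale for pair, wt in w.items()}

    conf_t, conf_o = dict(prior_t), dict(prior_o)
    for _ in range(iterations):
        # Each side mixes its prior with confidence flowing over the edges.
        new_t = {t: mu * p for t, p in prior_t.items()}
        new_o = {o: mu * p for o, p in prior_o.items()}
        for (tgt, op), wt in w.items():
            new_t[tgt] += (1 - mu) * wt * conf_o[op]
            new_o[op] += (1 - mu) * wt * conf_t[tgt]
        conf_t, conf_o = new_t, new_o
    return conf_t, conf_o


def extract(conf: Dict[str, float], threshold: float) -> List[str]:
    """Keep the candidates whose confidence exceeds the threshold."""
    return sorted(c for c, v in conf.items() if v > threshold)


# Toy Opinion Relation Graph: "thing" is a generic, high-degree candidate.
TOY_GRAPH = {
    ("battery", "long"): 1.0,
    ("screen", "bright"): 1.0,
    ("thing", "long"): 1.0,
    ("thing", "bright"): 1.0,
    ("thing", "good"): 1.0,
}
TOY_PRIOR_T = {"battery": 0.8, "screen": 0.8, "thing": 0.1}
TOY_PRIOR_O = {"long": 0.5, "bright": 0.5, "good": 0.5}
TARGET_CONF, OPINION_CONF = co_rank(TOY_GRAPH, TOY_PRIOR_T, TOY_PRIOR_O)
```

The low prior of "thing" plays the role of the prior knowledge that flags likely noise, and the degree penalty reduces how much confidence its many edges can pass on.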
In particular, the proposed model obtains better
precision because it uses partially supervised alignment
rather than unsupervised alignment. To decrease the
probability of error generation when estimating candidate
confidence, higher-degree vertices are penalized in the
graph-based co-ranking algorithm. Considering additional types of
relations between words, such as topical relations, in the Opinion
Relation Graph may prove beneficial for co-extracting opinion
targets and opinion words.
2.6 Single and Cross domain settings with Conditional Random
Fields
N. Jakob and I. Gurevych [12] modeled the problem as
an information extraction task based on Conditional
Random Fields (CRF) [16]. The supervised algorithm of
Zhuang et al. [9] is used as a baseline. The approach was
evaluated comprehensively on datasets from four different
domains, annotated with opinion targets at the sentence level.
The comparative evaluation shows how the CRF-based approach and
the baseline perform in single-domain and cross-domain
opinion target extraction settings. In the single-domain
setting, the CRF-based approach outperforms the
supervised baseline on all four datasets.
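Treating target extraction as a CRF-style labeling task means that each token receives a tag and a feature dictionary. The sketch below shows only this data representation, not a CRF implementation and not the feature set of [12]: the BIO tag names, the helper functions and the five features (including a bucketed word distance to an assumed opinion expression position) are hypothetical, and the POS tags are assigned by hand.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # [start, end) token offsets of one opinion target


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode target spans as one B/I/O tag per token."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B-TARGET"
        for i in range(start + 1, end):
            tags[i] = "I-TARGET"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into target spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a final span
        if tag != "I-TARGET" and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B-TARGET":
            start = i
    return spans


def token_features(tokens: List[str], pos_tags: List[str], i: int,
                   opinion_idx: int) -> Dict[str, str]:
    """Illustrative per-token features for a sequence labeler."""
    return {
        "token": tokens[i].lower(),
        "pos": pos_tags[i],
        "prev_pos": pos_tags[i - 1] if i > 0 else "BOS",
        "next_pos": pos_tags[i + 1] if i + 1 < len(tokens) else "EOS",
        # Word distance to the opinion expression, capped at 5.
        "dist_to_opinion": str(min(abs(i - opinion_idx), 5)),
    }


TOKENS = "I am not happy with the battery life of this phone".split()
POS = ["PRP", "VBP", "RB", "JJ", "IN", "DT", "NN", "NN", "IN", "DT", "NN"]
TAGS = spans_to_bio(len(TOKENS), [(6, 8)])  # "battery life"
FEATURES = [token_features(TOKENS, POS, i, opinion_idx=3)
            for i in range(len(TOKENS))]
```

A sequence labeler trained on such (features, tag) pairs predicts a tag sequence for unseen sentences, and `bio_to_spans` turns that sequence back into opinion targets.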
Error analysis indicates that additional features
that can capture opinions in complex sentences are required
to improve the performance of opinion target extraction.
The CRF-based approach also yields impressive results in the
cross-domain setting. The features employed scale
well across domains, even though the opinion target vocabularies
are substantially different. For future work, it is necessary to
investigate how machine learning algorithms that are
designed for the problem of domain adaptation perform in
comparison to this approach.
Since three of the features employed in the CRF-based
approach are based on the respective opinion expressions,
it is necessary to investigate how to alleviate the possible
negative effects of errors in opinion expression
identification when the expressions are not annotated in the gold
standard. Similar challenges regarding the analysis of
complex sentences were observed by Choi et al. [15]. Although the
data is user-generated, a manual inspection shows that the
documents have relatively high textual quality. It remains to be
investigated to what extent approaches taken in the analysis of
newswire, such as identifying targets with co-reference
resolution, can also be applied to this task on user-generated
discourse.

2.7 Cross Language Opinion Target Extraction
Zhou et al. [18] proposed an approach in which an English
annotated corpus is translated into the target language with the
help of a machine translation service. Natural language processing
tools are used to parse both the original English corpus and the
translated target language corpus. Features can be generated
directly from the target language corpus, or the features of the
English corpus can be projected into the target language
using word alignment information. POS taggers are used
to generate features in both languages. Thus, two target
language training datasets with different features are obtained,
one generated from the translated target language
corpus and the other projected from the original English
corpus. The features of both datasets are mapped into a unified
feature space, which means that the two training datasets can be
applied to the same target language test dataset. After training
two labeling models on the two training sets, the
co-training algorithm is used to improve the performance of both
models by exploiting unlabeled target language data.
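The projection of English-side features through word alignment can be sketched as follows. The function name, the "UNK" placeholder for unaligned tokens and the rule of keeping the first aligned tag are assumptions for illustration and are not taken from [18]; the sentence pair and its alignment are a hand-made toy example.

```python
from typing import List, Tuple


def project_tags(src_tags: List[str], alignment: List[Tuple[int, int]],
                 tgt_len: int, default: str = "UNK") -> List[str]:
    """Copy source-side tags onto target tokens along alignment links.

    alignment holds (source index, target index) pairs; target tokens
    without any link keep the default tag.
    """
    projected = [default] * tgt_len
    for src_i, tgt_i in alignment:
        if projected[tgt_i] == default:  # keep the first aligned source tag
            projected[tgt_i] = src_tags[src_i]
    return projected


# English side: "the screen is great" with hand-assigned POS tags.
EN_POS = ["DT", "NN", "VBZ", "JJ"]
# Toy translated side with three tokens, and a toy word alignment.
ZH_TOKENS = ["屏幕", "很", "棒"]
ALIGNMENT = [(1, 0), (3, 2)]
PROJECTED_POS = project_tags(EN_POS, ALIGNMENT, len(ZH_TOKENS))
```

Running a target-language tagger on `ZH_TOKENS` would give a second, independently generated feature view of the same sentence; two views of this kind are what the two labeling models in the co-training step are trained on.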
III. Conclusion
A lot of research has already been done in the opinion
mining field, and opinion target extraction in particular still has
a lot to explore. This paper has described various techniques for
extracting opinion targets in order to provide an
overview of each technique. In the double propagation method,
work needs to be done on opinion target pruning methods to
improve precision. Shallow semantic parsing can be used for
cross-domain opinion target extraction. In the word-based
translation model, syntactic information and knowledge of
opinion words can be added. The joint inference technique can
implement more powerful predictors and handle more complex
opinion relations. In the word alignment model, considering
additional types of relations between words may prove
beneficial. For the single- and cross-domain setting, the
performance of machine learning algorithms designed for domain
adaptation can be compared with this approach. In cross-language
opinion target extraction, conditional random fields and
co-training algorithms are used, and this approach can be applied
to various target languages.
Acknowledgement
We would like to acknowledge Gokhale Education Society's R.
H. Sapat College of Engineering, Computer department for
their support, guidance and encouragement for writing this
paper.
References
i. B. Liu, "Sentiment analysis and subjectivity," in Handbook of
Natural Language Processing, 2nd ed., N. Indurkhya and F. J.
Damerau, Eds. Boca Raton, FL, USA: Chapman and Hall/CRC, 2010.
ii. B. Pang and L. Lee, "Opinion mining and sentiment
analysis," Foundations and Trends in Information Retrieval,
vol. 2, no. 1-2, pp. 1-135, 2008.
iii. B. Liu and L. Zhang, "A survey of opinion mining and
sentiment analysis," in Mining Text Data, pp. 415-463.
iv. G. Qiu, B. Liu, J. Bu, and C. Chen, "Opinion word
expansion and target extraction through double propagation,"
Computat. Linguist., vol. 37, no. 1, pp. 9-27, 2011.
v. S. Li, R. Wang, and G. Zhou, "Opinion target extraction
using a shallow semantic parsing framework," in Proc. 26th AAAI
Conf. Artif. Intell., 2012.
vi. K. Liu, L. Xu, and J. Zhao, "Opinion target extraction using
word based translation model," in Proc. Joint Conf. Empir. Meth.
Nat. Lang. Process. Comput. Nat. Lang. Learn., Assoc. for Comput.
Linguist., 2012.
vii. B. Yang and C. Cardie, "Joint inference for fine-grained
opinion extraction," in Proc. ACL, 2013.
viii. K. Liu, L. Xu, and J. Zhao, "Co-extracting opinion targets
and opinion words from online reviews based on the word
alignment model," IEEE Transactions on Knowledge and Data
Engineering, vol. 27, no. 3, March 2015.
ix. L. Zhuang, F. Jing, and X. Zhu, "Movie review mining and
summarization," in Proc. ACM 15th Conf. Inf. Knowl. Manage., 2006,
pp. 43-50.
x. E. Cambria, B. Schuller, Y. Xia, and C. Havasi, "New avenues in
opinion mining and sentiment analysis," published by the IEEE
Computer Society, March 2013.
xi. M. Hu and B. Liu, "Mining opinion features in customer
reviews," in Proc. 19th Nat. Conf. Artif. Intell., San Jose, CA, USA,
2004, pp. 755-760.
xii. N. Jakob and I. Gurevych, "Extracting opinion targets in a
single- and cross-domain setting with conditional random fields," in
Proc. Conf. Empir. Meth. Nat. Lang. Process., 2010, pp. 1035-1045.
xiii. L. Zhuang, F. Jing, and X. Zhu, "Movie review mining and
summarization," in Proc. ACM 15th Conf. Inf. Knowl. Manage., 2006,
pp. 43-50.
xiv. T. Ma and X. Wan, "Opinion target extraction in Chinese
news comments," in Proc. 23rd Int. Conf. Comput. Linguistics,
Beijing, China, 2010, pp. 782-790.
xv. Y. Choi, C. Cardie, E. Riloff, and S. Patwardhan,
"Identifying sources of opinions with conditional random fields and
extraction patterns," in Proc. Human Language Technology
Conference and Conference on Empirical Methods in Natural
Language Processing, Vancouver, Canada, October 2005,
pp. 355-362.
xvi. J. D. Lafferty, A. McCallum, and F. C. N. Pereira,
"Conditional random fields: Probabilistic models for segmenting and
labeling sequence data," in Proc. 18th Int. Conf. Mach. Learn., 2001.
xvii. Q. Gao, N. Bach, and S. Vogel, "A semi-supervised word
alignment algorithm with partial manual alignments," in Proc. Joint
Fifth Workshop Statist. Mach. Translation Metrics MATR, Uppsala,
Sweden, Jul. 2010, pp. 1-10.
xviii. X. Zhou, X. Wan, and J. Xiao, "CLOpinionMiner: Opinion
target extraction in a cross-language scenario," IEEE/ACM
Transactions on Audio, Speech, and Language Processing, vol. 23,
April 2015.
