Dylan Valerio
University of the Philippines Diliman
dylan_valerio@yahoo.com
1. INTRODUCTION
The new user-centric Web hosts a large volume of data created by its users. Social media have turned ordinary users into co-creators of web content rather than passive consumers. Opinions expressed in social media, in the form of reviews or opinion posts, constitute an important and interesting topic worthy of exploration and exploitation. With the increasing accessibility of opinion resources such as movie reviews and social network tweets, and the availability of sentiment lexicons, opinion analysis is currently one of the more interesting problems in Natural Language Processing. The new challenge is to mine opinions from large volumes of text and devise suitable learning algorithms to understand the opinions of others.
Sentiment analysis refers to the use of language processing to extract the opinions and beliefs of people from electronic documents. It deals with analyzing and identifying expressions of the writer's opinion and mood. Sentiment analysis has been successfully used in the literature to analyze the subjective content of product reviews [1][2][3]. A task common to sentiment analysis is recognizing the sentiment polarity of a given document. Given a movie review R, sentiment polarity analysis is the binary classification task of inferring whether review R, represented as a set of words, belongs to one of the polarity classes p ∈ {+1, -1}, with +1 implying positive polarity and -1 implying negative polarity.
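As a minimal illustration of this task framing (not the approach proposed later in this paper), a purely word-level scorer could assign a review the sign of its summed word scores; the tiny lexicon below is hypothetical, for illustration only.

```python
# Toy illustration of sentiment polarity classification: a review is
# mapped to +1 (positive) or -1 (negative) by the sign of its summed
# word scores. The lexicon here is a hypothetical placeholder.
TOY_LEXICON = {"riveting": 1.0, "wonderfully": 0.8, "unfunny": -1.0, "badly": -0.7}

def polarity(review_words):
    score = sum(TOY_LEXICON.get(w, 0.0) for w in review_words)
    return 1 if score >= 0 else -1

print(polarity(["a", "riveting", "story"]))           # -> 1
print(polarity(["badly", "made", "and", "unfunny"]))  # -> -1
```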
A common approach to extracting polarity is analyzing word-level features. The occurrence of a word is a factor in computing the polarity of a sentence, which in turn contributes to the polarity of the entire document. This bag-of-words representation has led to varying success. However, we recognize that this representation is oversimplified. Sentences have a structure in which words either diminish, negate, or intensify polarity. Complex style, such as humor, sarcasm, and intelligence, is built on top of this structure; thus words, analyzed only individually, lose their context. Traditional methods, such as mapping words to negative and positive polarities with varying strengths, are ineffective at incorporating structure and context into sentiment analysis.
Consider the following sentences:
1. Enigma is well-made, but it's just too dry and too placid.
2. The story loses its bite in a last-minute happy ending that's even less plausible than the rest of the picture. Much of the way, though, this is a refreshingly novel ride.
At the onset, the first movie review above is positive, but the second clause has a more intense negative polarity, so the review should be considered negative. The second movie review appears to have more negative phrases, but at the end the author reverses the polarity of the document, giving it a positive sentiment. In this study, we discuss different approaches to incorporating structure and context into sentiment polarity classification.
2. RELATED LITERATURE
Nakagawa et al. used conditional random fields with hidden variables to capture word dependencies in sentiment analysis [1]. Instead of individual words, they used dependency trees to group words into subjective phrases that have a root head word. Obtaining the polarity of a phrase is contingent on obtaining the dependencies of words and their interactions with their respective parent words in the dependency tree. The hidden nodes capture information about the dependencies among words as well as the sentiments of phrases. Central to their paper is the idea of polarity reversal, which reverses the polarity of a phrase when the head word is part of a polarity reversal lexicon. They tested their algorithm on 4 Japanese and 4 English corpora and found that their Tree-CRF worked significantly better than the best baseline models that also take polarity reversal into account.
Matsumoto et al. also used dependency subtrees to incorporate structure in their work [3]. They used the frequent pattern mining techniques PrefixSpan and FREQT to extract frequently occurring word subsequences and subtrees. Due to the exponential growth of sequences over a corpus, they constrained their sentences to short clauses (SBAR), then removed some punctuation and words with part-of-speech tags irrelevant to the sentiment analysis task. They then used a feature vector representation of their corpus as input to a linear support vector machine.
Zhang et al. treated sentiment analysis as an information extraction task [2]. First, they used conditional random fields to recognize the features of a reviewed product, for example the camera or the battery life of a mobile phone. Second, they extracted the author's opinion about each recognized feature. Lastly, they extracted the opinion's polarity. The first task, product feature recognition, can be considered an entity recognition task in which the entity is the product feature. Using a state-of-the-art sequence tagger, conditional random fields, the authors delivered results such as [positive] [battery life] and [negative] [camera].
Common to most of the related literature described above is the attempt to capture sentence structure, specifically to recognize polarity reversal. Abu-Jbara and Radev discussed negation detection and resolution in their work [4]. They used an available corpus, the *SEM Shared Task 2012 corpus, which contains annotated works of Sir Arthur Conan Doyle: The Hound of the Baskervilles and The Adventure of Wisteria Lodge. The paper focused on three tasks: negation cue detection, negation scope resolution, and negated event detection.
3. METHODOLOGY
3.1 Data Set
For this paper, we used the corpus available from Rotten
Tomatoes, a website that lets users rate and review movies [15].
There is a total of 10,662 movie reviews, with an even number of
positive and negative sentiments. There are a total of 18,342
unique words. For the negative reviews only, there exist 5761
unique words not found in the positive reviews. On the other
hand, 5422 unique words are found in the negative reviews. A
table of the most frequent words unique to each set is outlined at
Table 1.
Table 1. Most frequent words unique to each sentiment class.

Word unique to Positive    Frequency    Word unique to Negative    Frequency
riveting                   20           unfunny                    26
gem                        17           badly                      25
wonderfully                15           poorly                     19
detailed                   14           disguise                   17
heartwarming               14           pointless                  17
lively                     14           seagal                     17
vividly                    14           bore                       16
polished                   13           benigni                    15
spare                      13           product                    14
tour                       13           pinocchio                  13
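The class-unique vocabularies and frequencies above can be obtained with simple set differences over per-class word counts; the sketch below uses placeholder reviews rather than the actual corpus.

```python
from collections import Counter

# Placeholder corpora; the real data set has 10,662 Rotten Tomatoes reviews.
positive_reviews = [["a", "riveting", "gem"], ["wonderfully", "detailed"]]
negative_reviews = [["unfunny", "and", "pointless"], ["badly", "made"]]

def vocab_counts(reviews):
    # Count word occurrences across all reviews of one class.
    return Counter(w for review in reviews for w in review)

pos, neg = vocab_counts(positive_reviews), vocab_counts(negative_reviews)
# Words appearing in one class but not the other (dict key views support set ops).
only_pos = {w: pos[w] for w in pos.keys() - neg.keys()}
only_neg = {w: neg[w] for w in neg.keys() - pos.keys()}
print(sorted(only_neg))  # -> ['and', 'badly', 'made', 'pointless', 'unfunny']
```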
3.3.1 Phrase-Level Sentiment Propagation
In this approach, sentiment scores are first calculated at the word level, contextual changes are then propagated to the phrase level, and finally the overall sentiment polarity is computed by summing the sentiment scores of the phrases. This approach is often called recursive sentiment analysis. Sentiment scores are propagated in a bottom-up manner, beginning with the prior polarity of each word and ending with the root of the tree, which contains the overall sentiment polarity.
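The bottom-up propagation described above can be sketched as a recursion over a tree of words and phrases; the node structure and the negation handling here are simplifying assumptions, not the paper's exact Java implementation.

```python
# Sketch of recursive (bottom-up) sentiment propagation. Each node carries
# a prior word-level score; a node marked as a negator flips the polarity
# of the subtree it governs. The root score gives the overall polarity.
class Node:
    def __init__(self, prior=0.0, negates=False, children=()):
        self.prior = prior          # prior word-level sentiment score
        self.negates = negates      # True for negators such as "not"
        self.children = list(children)

def propagate(node):
    score = node.prior + sum(propagate(c) for c in node.children)
    return -score if node.negates else score

# "not (very good)": the negator reverses its positive subtree.
tree = Node(negates=True, children=[Node(prior=0.9)])
print(propagate(tree))  # -> -0.9
```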
Aspect Phrases
Aspect extraction is an important preliminary phase since aspect phrases serve as the scope for sentiment word interactions. They are also a critical component in the computation of the overall sentiment score, which in turn determines the output sentiment polarity. Review aspect phrases are essentially non-overlapping word segments that form base noun phrases. For an input movie review, we extract the aspect noun phrases using a phrase chunker implemented in CRF++ [14].
Computation:
Current polarity score of the affected word * -1, if a negator is present
3. Reversers (Long-range Reversers)
The effects of reversers are similar to those of negators. As observed in the movie review domain, reverser terms such as but affect not just individual words but also the phrases preceding them.

Examples
(1) [The concept] is okay, [the scenery] is great and the [acting] is fine but the [movie] is too long

In the above example, the overall sentiment polarity is correctly classified as negative. The three preceding aspect phrases all receive reversed scores when but is propagated.
Computation:
For every preceding phrase, phrase score * -1, if a reverser is present
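A minimal sketch of the two shifter computations above, assuming phrase segmentation and prior scoring are already done; the phrase texts and scores below are illustrative.

```python
# Sketch of the negator and long-range reverser computations. When a
# reverser such as "but" is seen, the scores of all preceding phrases
# are multiplied by -1, matching the computation given above.
REVERSERS = {"but"}

def negate(word_score):
    # Negator computation: current polarity score of the affected word * -1.
    return -word_score

def apply_reversers(phrases):
    """phrases: list of (text, score); a reverser appears as its own item."""
    scores = []
    for text, score in phrases:
        if text in REVERSERS:
            scores = [-s for s in scores]  # reverse every preceding phrase score
        else:
            scores.append(score)
    return sum(scores)

phrases = [("the concept is okay", 0.3), ("the scenery is great", 0.8),
           ("the acting is fine", 0.4), ("but", 0.0),
           ("the movie is too long", -0.6)]
print(round(apply_reversers(phrases), 2))  # -> -2.1 (overall negative)
```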
Sentiment Lexicons
1.) We used SentiWords 1.0 to calculate the prior sentiment score of each word. SentiWords 1.0 is a freely available sentiment lexicon containing 155,000 words, each associated with a real-valued sentiment score. Sentiment scores are learned from SentiWordNet. SentiWords was built using the method described in Guerini et al. (2013) [5] and the dataset presented in Warriner et al. (2013). It was downloaded from https://hlt.fbk.eu/technologies/sentiwords
2.) To obtain the list of contextual shifters, we collected annotated words from the Harvard General Inquirer Lexicon. The lexicon is a collection of syntactic, semantic, and pragmatic information attached to part-of-speech tagged words. It was downloaded from http://www.wjh.harvard.edu/~inquirer/
3.) The prior sentiment score of an aspect phrase depends heavily on the prior scores of the nouns within it. An issue we encountered was that SentiWords 1.0 also assigns high scores to generally neutral nouns, biasing classification toward the mere presence of non-domain words. To circumvent this, we limit the sentiment effect of nouns: for each noun, we check whether it is related to the movie review domain by looking it up in a domain lexicon before obtaining its prior sentiment score from SentiWords. The movie review domain lexicon was obtained from http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html
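The noun-filtering step might look like the following sketch; both lexicons below are tiny placeholders standing in for SentiWords 1.0 and the movie-domain lexicon.

```python
# Sketch of the noun-filtering step: a noun contributes its SentiWords
# prior only if it appears in the movie-domain lexicon; other nouns are
# treated as neutral. Lexicon contents here are small placeholders.
SENTIWORDS = {"masterpiece": 0.8, "table": 0.4, "acting": 0.1}   # placeholder priors
MOVIE_DOMAIN = {"masterpiece", "acting", "plot", "director"}     # placeholder domain terms

def noun_prior(word):
    if word not in MOVIE_DOMAIN:
        return 0.0                      # suppress non-domain nouns
    return SENTIWORDS.get(word, 0.0)

print(noun_prior("masterpiece"))  # -> 0.8
print(noun_prior("table"))        # -> 0.0 (neutral noun, lexicon score suppressed)
```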
Sentiment Polarity Classification Algorithm
The algorithm for the phrase-level sentiment analysis is detailed below. The classifier is implemented in Java, with the necessary language processing modules downloaded from The Stanford Natural Language Processing Group at http://nlp.stanford.edu/software/index.shtml
a. Negators
b. Adverb modifiers (diminishers and intensifiers)
c. Adjective modifiers (diminishers and intensifiers)
d. Verb object modifiers
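Steps (b) and (c) above could plausibly be realized as multiplicative scaling of the modified word's score; the modifier list and weights below are illustrative assumptions, not values from this paper.

```python
# Sketch of intensifier/diminisher handling: a modifier scales the score
# of the word it modifies. Weights are illustrative assumptions.
MODIFIER_WEIGHTS = {"very": 1.5, "extremely": 2.0,   # intensifiers
                    "slightly": 0.5, "barely": 0.3}  # diminishers

def modified_score(modifier, base_score):
    # Unknown modifiers leave the score unchanged.
    return MODIFIER_WEIGHTS.get(modifier, 1.0) * base_score

print(round(modified_score("very", 0.6), 2))      # -> 0.9 (intensified)
print(round(modified_score("slightly", 0.6), 2))  # -> 0.3 (diminished)
```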
Threshold    Overall Accuracy    Precision    Recall      F-Score
0.0          62.2801%            79.6435%     59.1226%    0.6787
0.1          63.6875%            74.7655%     61.2135%    0.67314
0.2          64.6493%            71.1069%     62.9830%    0.6680
0.3          65.0481%            67.0732%     64.4725%    0.65747
0.4          65.3061%            62.9925%     66.0600%    0.64490
4. EXPERIMENTAL RESULTS
Threshold    Overall Accuracy    Precision    Recall      F-Score
0.0          61.6676%            78.9533%     58.6702%    0.6732
0.1          63.2527%            73.5884%     60.9824%    0.6670
0.2          63.7029%            69.5554%     62.2670%    0.65710
0.3          63.6654%            65.2223%     63.2527%    0.64222
0.4          64.1156%            61.0955%     65.0230%    0.6300

Method          Features                                    # Features    Accuracy     Precision    Recall       F-Score
Bag of Words    unigrams                                    9826          52.2738%     70.8861%     51.6570%     0.5976
                bigrams                                     12434         44.4444%     22.5504%     40.1168%     0.2887
                unigrams + bigrams                          22260         44.1163%     35.7712%     42.9375%     0.3903
Context         part-of-speech                              13126         56.7192%     59.0930%     56.9536%     0.5800
                sentiment word count (AFINN-111 lexicon)                  62.2128%     56.4463%     63.8050%     0.5990
Method              Features                                              # Features    Accuracy     Precision    Recall       F-Score
Proposed Methods    unigrams with negation                                8630          48.5004%     67.9475%     48.9204%     0.56885
                    unigrams with negation (reduced features)             200           52.5773%     77.7888%     51.7134%     0.6213
                    unigrams with negation (reduced features)             1700          53.7960%     60.1690%     53.3670%     0.56564
                    lemmas                                                              50.2430%     2.3490%      40.9680%     0.0444
                    phrase-level with context shifters (threshold 0.0)                  62.2801%     79.6435%     59.1226%     0.6787
                    phrase-level with context shifters (threshold 0.3)                  65.0481%     67.0732%     64.4725%     0.6422
5. DISCUSSION OF RESULTS
For the bag of words approaches, the unigrams feature is the most
accurate with 52.27% accuracy and F-score of 0.5976. In terms of
precision and recall, the unigram model has a good precision of
70.89% but has lower recall at 51.66%. The use of a bigram
model reduces the unigram accuracy to 44.44% and weakens the
classifier overall. A combination of bigrams and unigrams
resulted in an increase in F-measure on the bigram feature model
(0.29 to 0.39) but no improvement on its overall accuracy.
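For reference, the reported metrics follow the standard confusion-matrix definitions, with positive reviews as the positive class; the counts in the sketch below are hypothetical.

```python
# Standard definitions of the reported evaluation metrics, computed from
# a confusion matrix (positive reviews taken as the positive class).
def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

# Hypothetical counts, for illustration only.
acc, p, r, f = metrics(tp=400, fp=100, fn=150, tn=350)
print(round(acc, 3), round(p, 3), round(r, 3), round(f, 3))  # -> 0.75 0.8 0.727 0.762
```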
Incorporating context using part-of-speech tag features did not improve performance compared to the unigrams model: it has slightly higher recall but suffers greatly in precision. With the use of a sentiment lexicon for polarity features in the next method, overall accuracy increases from 52.27% to 62.21%. Recall also improves to 63.81%, but precision is lower at 56.45%. We observe that adding context features such as part-of-speech tags and polar word counts can improve the recall of bag-of-words models and achieve higher overall accuracy. The observed limitation of these two methods is their difficulty in classifying positive movie reviews, which entails lower precision.
In our proposed methods, unigram lemmas with negation proved only slightly better than plain unigrams. We also observed that feature reduction significantly improved our Naïve Bayes model's performance (increasing accuracy from 48% to 52%). Using only lemmas as features proved devastating to performance (50.24% accuracy but a 0.044 F-score). In the case of the phrase-level model with context shifters, evaluation results show that it outperforms the other classification models, including the bag-of-words approaches. We achieved the highest precision (79.64%) and F-score (0.6787) when the threshold is at its default value of 0.00. Further increasing the threshold to 0.30 reduced the precision to 67%, as expected, but achieved the highest recall (64.47%) and overall accuracy (65.05%).
In general, incorporating negation into the classifier model produces results similar to the two context baseline methods in terms of recall and overall accuracy. However, the negation-aware models perform well in classifying positive movie reviews, achieving better precision and higher F-scores. Modeling word interactions through polarity shifts further boosted sentiment polarity classification performance and achieved the highest results among all models.
6. CONCLUSION
In this paper, we discussed different approaches to sentiment polarity classification. We used Conditional Random Fields as a sequence labeling model for negation cue detection and aspect phrase extraction. Our proposed methods, which take into consideration the context of the reviews, performed significantly better than the traditional bag-of-words models. We achieved the highest performance measures using phrase-level analysis with context shifting. Indeed, the analysis of contextual word interactions can significantly improve classifier performance on the sentiment polarity classification task.
The results obtained with our implementations are highly encouraging and suggest that further modifications can be made and more effective sentiment classification models can be constructed.
REFERENCES
[1] Tetsuji Nakagawa, Kentaro Inui, and Sadao Kurohashi. 2010. Dependency tree-based sentiment classification using CRFs with hidden variables. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 786-794.
[2] Shu Zhang, Wen-Jie Jia, Ying-Ju Xia, Yao Meng, and Hao Yu. 2009. Opinion Analysis of Product Reviews. In Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD '09), vol. 2, 591-595.
[3] Shotaro Matsumoto, Hiroya Takamura, and Manabu Okumura.
2005. Sentiment classification using word sub-sequences and
dependency sub-trees. In Proceedings of the 9th Pacific-Asia
conference on Advances in Knowledge Discovery and Data
Mining (PAKDD'05), Tu Bao Ho, David Cheung, and Huan Liu
(Eds.). Springer-Verlag, Berlin, Heidelberg, 301-311.
[4] Amjad Abu-Jbara and Dragomir Radev. 2012. UMichigan: a conditional random field model for resolving the scope of negation. In Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval '12). Association for Computational Linguistics, Stroudsburg, PA, USA, 328-334.
[5] Marco Guerini, Lorenzo Gatti, and Marco Turchi. 2013. Sentiment Analysis: How to Derive Prior Polarities from SentiWordNet. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP '13), 1259-1269. Seattle, Washington, USA.