The development of information technology in recent years has made people's work
easier and more efficient. In particular, the advent of the Internet has completely
changed human life. People can easily search for information to serve their study,
work, and entertainment needs.
Today the amount of information on the Internet grows ever larger, which means that
using the World Wide Web to extract useful information for personal purposes always
involves certain difficulties. This is especially true for purposes that require
collecting large volumes of information from the Web. For example, a technology
company that manufactures mobile devices may want to collect customer reviews of its
products in order to discover customer tastes and, at the same time, to assess the
current quality of its own products as well as those of its competitors.
Analyzing customer comments is one application of sentiment analysis in text, a
research direction that is developing strongly. One of the important stages of
sentiment analysis is analyzing the subjectivity of the sentences in a text. This stage
recognizes the sentences that carry user opinions and discards the unimportant parts,
which helps make the sentiment evaluation of the text more accurate.
Given the importance of subjectivity analysis in text, the author of this thesis
studied the research on subjectivity analysis methods. The thesis successfully combines
syntactic parsing with the N-gram model and achieves very promising results on the
movie review data set [10]. The author also studied methods that research groups have
applied to sentiment classification in text and used them for subjectivity analysis,
such as Delta TF-IDF, Weighted log-likelihood ratio, Naive Bayes combined with SVM,
Frequent Dependency subtree, and Frequent Dependency subsequence. The author has tried
to implement and improve these methods to obtain good results for subjectivity
classification in text.
At present, the thesis focuses on analyzing users' movie reviews because this is the
hardest data to classify and is widely used in sentiment analysis research.

The rapid development of information technology in recent years has made people's
work easier and more efficient. One of its most notable outcomes is the Internet, which
now plays an indispensable role in human life. With the spread of the Internet, people
can easily find the information they need, whether to acquire new knowledge or for
entertainment.
The number of websites on the Internet keeps growing, which makes it difficult to
retrieve the information needed for individual purposes, especially purposes that
require collecting large volumes of data. A company that specializes in mobile devices,
for example, may want to collect user feedback to find out which kinds of products its
customers are interested in and to assess the quality of its own products as well as
its competitors'. Analyzing customer feedback is one application of sentiment analysis,
a research area that has grown rapidly in recent years. One of the critical tasks in
sentiment analysis is subjectivity classification, which identifies the sentences that
carry opinions and removes redundant objective information, thereby improving the
accuracy of the sentiment analysis process.
Because subjectivity classification plays a crucial role in analyzing documents, the
author of this thesis has studied state-of-the-art methods and recent research in this
field. The author has successfully combined an N-gram model with a syntactic method to
achieve considerably good results on the movie review data set [10]. In addition, he
has applied to subjectivity classification other methods that are frequently used in
sentiment analysis tasks, such as Delta TF-IDF, weighted log-likelihood ratio, Naive
Bayes combined with SVM, Frequent Dependency Subtree, and Frequent Dependency
Subsequence, and has sought to implement and improve these methods to achieve better
results. Currently, he concentrates on movie reviews, which are particularly difficult
to analyze and classify.
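
To make the idea of weighting N-gram features for subjectivity classification more
concrete, the following is a minimal sketch of a Delta TF-IDF style score. It is not the
thesis implementation. The function names, the toy sentences, the add-one smoothing, and
the sign convention (positive scores lean subjective) are all assumptions made for
illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def document_frequencies(sentences, n_max=2):
    """Count, for each n-gram, how many sentences contain it at least once."""
    df = Counter()
    for sentence in sentences:
        df.update(set(ngrams(sentence.lower().split(), n_max)))
    return df


def delta_tfidf(sentence, subj_df, obj_df, n_subj, n_obj, n_max=2):
    """Score each n-gram of `sentence` with a Delta TF-IDF style weight.

    score = count * log2(subjective rate / objective rate), where each rate is a
    smoothed document frequency divided by the smoothed set size.
    """
    counts = Counter(ngrams(sentence.lower().split(), n_max))
    scores = {}
    for term, count in counts.items():
        # Add-one smoothing (an assumption of this sketch) keeps the ratio
        # finite when an n-gram never occurs in one of the two sets.
        ratio = ((subj_df[term] + 1) * (n_obj + 1)) / ((obj_df[term] + 1) * (n_subj + 1))
        scores[term] = count * math.log2(ratio)
    return scores


# Hypothetical toy training sentences, invented for this example.
subjective = ["i loved this movie", "the acting was terrible"]
objective = ["the movie runs two hours", "the film was shot in paris"]

subj_df = document_frequencies(subjective)
obj_df = document_frequencies(objective)

if __name__ == "__main__":
    scores = delta_tfidf("i loved the movie", subj_df, obj_df,
                         len(subjective), len(objective))
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term!r}: {score:+.2f}")
```

In this sketch, an N-gram seen only in the subjective sentences gets a positive weight,
one seen equally often in both sets gets a weight of zero, and one seen mostly in the
objective sentences gets a negative weight. The resulting weights could then serve as
input features for a classifier such as an SVM.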