Kyung Soo Cho, Ji Yeon Lim, Jae Yeol Yoon, Young Hee Kim,
Seung Kwan Kim, and Ung Mo Kim
School of Information and Communication Engineering SungKyunKwan University,
2nd Engineering Building 27039 CheonCheon-Dong, JangAn-Gu,
Suwon 440-746, Republic of Korea
kisschks@hotmail.com, {01039374479,vntlffl}@naver.com,
younghees@gmail.com, libertas@korea.kr, umkim@ece.skku.ac.kr
Abstract. Many research fields now cross over and combine with one another,
and some problems in computer science cannot be solved by technique alone.
Opinion mining, too, sometimes needs solutions from other fields. For example,
we use a method from psychology to gain information about users from text.
Likewise, we previously suggested a new opinion mining method that uses
MapReduce together with a dictionary-like WordMap. A WordMap holds only the
category and value of each word. A novel opinion mining method could mine
opinions from the web more powerfully than before. Therefore, for stronger
opinion mining, we suggest a framework for opinion mining in MapReduce.
Keywords: Framework, Opinion mining, WordMap, POS tagging, MapReduce.
1 Introduction
Opinion mining and Semantic Web techniques are fascinating domains for search
engines. Of the two, opinion mining is a mining technique that extracts
evaluations from the internet, analyzes them, and outputs the results. These results
are useful in many areas, such as marketing and product reviews. Nonetheless,
current methods are inefficient and take too much time on huge data because they
process it on a single node. Cloud computing, the center of attention for the next
computing environment, is appropriate for settling this problem. MapReduce,
one of the cloud computing methods, is already used with the Google File System.
Therefore, this paper suggests a framework for opinion mining in MapReduce as a
novel attempt at design under a cloud computing environment, and we expect the
framework to show reasonable performance. The framework can be used by a
developer who has a particular objective and performance expectations when
building opinion mining tools in MapReduce.
This paper is organized as follows. In Section 2, we explain opinion mining
techniques and existing representative research related to the framework. In
Section 3, we present the framework for opinion mining in MapReduce. In
Section 4, we conclude the paper and describe our future work.
C. Lee et al. (Eds.): STA 2011 Workshops, CCIS 187, pp. 50-55, 2011.
© Springer-Verlag Berlin Heidelberg 2011
2 Related Work
2.1 Opinion Mining Methods
The study of opinion mining has been growing gradually since the late 1990s.
Also known as sentiment classification, opinion mining focuses not on the topic
itself but on a user's attitude toward that topic. In recent years, opinion mining
has been applied to product reviews and other commercial domains [1]. W.Y. Kim
et al. suggest a method for opinion mining of product reviews using association
rules [2]. The opinion mining field also includes feature-based opinion mining,
summarization, comparative sentence and relation mining, opinion search, opinion
spam, and the definition and construction of linguistic resources [3] [4].
In sentiment classification, reading and analyzing text yields results of the form
<word | value>, which is similar to the key/value data structure of MapReduce [5].
Sentiment classification is therefore likely to fit MapReduce well. In addition,
analysis rules such as the POS tagging technique [6] [7], or dictionary
information, can also be used.
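The correspondence between <word | value> results and MapReduce pairs can be illustrated with a minimal, single-process Python sketch. It is not the implementation of the proposed framework: the dictionary WORD_MAP, its entries and scores, and the function names map_phase, shuffle, reduce_phase, and run are all hypothetical, chosen only to show how a dictionary-like WordMap (category and value per word) could feed a map step that emits (word, value) pairs and a reduce step that sums them.

```python
from collections import defaultdict

# Hypothetical WordMap: word -> (category, value). Entries are illustrative only.
WORD_MAP = {
    "good": ("positive", 1),
    "great": ("positive", 2),
    "bad": ("negative", -1),
    "terrible": ("negative", -2),
}


def map_phase(review):
    """Emit a (word, value) pair for every word of the review found in WORD_MAP."""
    for token in review.lower().split():
        word = token.strip(".,!?")
        if word in WORD_MAP:
            _category, value = WORD_MAP[word]
            yield word, value


def shuffle(pairs):
    """Group values by word, as happens between the map and reduce steps."""
    grouped = defaultdict(list)
    for word, value in pairs:
        grouped[word].append(value)
    return grouped


def reduce_phase(word, values):
    """Sum the values collected for one word."""
    return word, sum(values)


def run(reviews):
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = [pair for review in reviews for pair in map_phase(review)]
    return dict(reduce_phase(w, vs) for w, vs in shuffle(pairs).items())


reviews = ["Great screen, good battery.", "Bad speaker. Good price!"]
scores = run(reviews)
```

In an actual MapReduce deployment the map and reduce functions would run on separate nodes and the grouping would be done by the runtime; here everything runs sequentially so that the data flow is easy to follow.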
Fig. 2. MapReduce
search results richer. It can also serve as a strong marketing analysis tool for
companies collecting reviews of their products, and governments can use this
framework for information gathering and analysis. For example, in the United
States, Google and the CIA have co-invested in a company called Recorded Future.
This company uses mining methods together with techniques for processing huge
data. This fact has been reported in several newspapers.
References
1. Conrad, J.G., Schilder, F.: Opinion mining in legal blogs. In: Proceedings of the 11th
International Conference on Artificial Intelligence and Law, pp. 231-236. ACM, New
York (2007)
2. Kim, W.Y., Ryu, J.S., Kim, K.I., Kim, U.M.: A Method for Opinion Mining of Product
Reviews using Association Rules. In: Proceedings of the 2nd International Conference on
Interaction Sciences: Information Technology, Culture and Human (ICIS 2009), Seoul,
Korea, November 24-26, pp. 270-274 (2009)
3. Esuli, A., Sebastiani, F.: SENTIWORDNET: A Publicly Available Lexical Resource for
Opinion Mining. In: Proceedings of the 5th Conference on Language Resources and
Evaluation (LREC 2006), Citeseer (2006)
4. Esuli, A., Sebastiani, F.: PageRanking WordNet synsets: An application to opinion
mining. In: Proceedings of the 45th Annual Meeting of the Association for Computational
Linguistics (ACL 2007), Citeseer (2007)
5. Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large clusters.
Communications of the ACM 51(1), 107-113 (2008)
6. Stanford Tagger Version 1.6 (2008), http://www.nlp.staford.edu/software/tagger.shtml
7. Stanford Parser Version 1.6 (2008), http://nlp.stanford.edu/software/lex-parser.shtml
8. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Foundations and Trends in
Information Retrieval 2(1-2), 1-135 (2008)
9. Kim, S.M., Hovy, E.: Determining the sentiment of opinions. In: Proceedings of the 20th
International Conference on Computational Linguistics (2004)
10. Potthast, M., Becker, S.: Opinion Summarization of Web Comments. Advances in
Information Retrieval, 668-669 (2010)
11. Read, J.: Using emoticons to reduce dependency in machine learning techniques for
sentiment classification. In: Proceedings of the ACL Student Research Workshop,
pp. 43-48. Association for Computational Linguistics (2005)
12. Kim, S.M., Hovy, E.: Extracting opinions, opinion holders, and topics expressed in online
news media text. In: Proceedings of ACL/COLING Workshop on Sentiment and
Subjectivity in Text, Sydney, Australia (2006)
13. Cho, K.S., Ryu, J.S., Jeong, J.H., Kim, Y.H., Kim, U.M.: Credibility Evaluation and
Results with Leader Weight in Opinion Mining. In: The 2nd International Conference on
Cyber-Enabled Distributed Computing and Knowledge Discovery, Huangshan, China,
October 10-12 (2010)
14. Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra,
T., Fikes, A., Gruber, R.E.: Bigtable: A distributed storage system for structured data.
ACM Transactions on Computer Systems (TOCS) 26(2), 4 (2008)
15. Cardona, K., Secretan, J., Georgiopoulos, M., Anagnostopoulos, G.: A grid based system
for data mining using MapReduce. Technical Report TR-2007-02, AMALTHEA (2007)
16. Bayir, M.A., Toroslu, I.H., Cosar, A., Fidan, G.: Smart miner: a new framework for
mining large scale web usage data. In: Proceedings of the 18th International Conference
on World Wide Web, pp. 161-170. ACM, New York (2009)
17. Xia, T.: Large-scale sms messages mining based on map-reduce. In: International
Symposium on Computational Intelligence and Design, ISCID 2008, pp. 7-12. IEEE, Los
Alamitos (2008)
18. Cho, K.S., Jung, N.R., Kim, U.M.: Using WordMap and Score-based Weight in Opinion
mining with MapReduce. In: IEEE International Conference on Service-Oriented
Computing and Applications (2010)