
INFORMATION NETWORK ENGINEERING FOR PRODUCT RATING USING SOCIAL MEDIA
(Case Study: Bank Central Asia)
Yansyah Saputra Wijaya
School of Electrical Engineering and Informatics
Institut Teknologi Bandung
Bandung, Indonesia
23515039@std.stei.itb.ac.id
Abstract: With the rapid growth of e-commerce and social media,
online rating systems allow buyers or users to post a review of
a product they have purchased or used. Online rating systems play
an important role in consumers' purchasing decisions. According
to a 2013 survey conducted by Dimensional Research, 90% of
respondents said that positive online reviews influence their
decision to buy a product, while 86% said that their decision not
to buy a product was affected by negative online reviews. Online
rating systems usually use a scale of 0 to 5. Customers are often
not objective in their assessments, for example choosing a rating
of 5 rather than 4, and a company may itself submit false ratings
to make its product look better. In both cases such ratings amount
to spam, and the resulting ranked opinion is unreliable. This
problem can be avoided if ratings are drawn from social media,
where opinions are given voluntarily. Posts in social media are
usually text that is largely unstructured, so a way to process
the text is needed; one such way is sentiment analysis. Sentiment
analysis is a technique that extracts user opinions and sentiment
from social media, capturing users' views without requiring a
questionnaire or survey.
Keywords: Twitter, Opinion Mining, Product Rating, Social Network

I. INTRODUCTION

With the rapid growth of e-commerce and social media, online
rating systems allow buyers or users to post a review of a
product they have purchased or used. Online rating systems play
an important role in consumers' purchasing decisions [1].
According to a 2013 survey conducted by Dimensional Research,
90% of respondents said that positive online reviews influence
their decision to buy a product, while 86% said that their
decision not to buy a product was affected by negative online
reviews [2].
Online rating systems usually use a scale of 0 to 5. Customers
are often not objective in their assessments, for example
choosing a rating of 5 rather than 4, and a company may itself
submit false ratings to make its product look better. In both
cases such ratings amount to spam, and the resulting ranked
opinion is unreliable. This problem can be avoided if ratings
are drawn from social media, where opinions are given
voluntarily [3].

Product ratings drawn from social media can help customers
search in more detail and choose products that rank well on the
features they care about. Unlike the ratings in commonly used
opinion applications, product ratings from social media are the
result of processing public opinion about the products people
are actually using [3].
Companies also gain information from the product ratings about
whether a product is well received by the public, so that they
can create better products that are more in line with what
people want.
Posts in social media are usually text that is largely
unstructured, so a way to process the text is needed; one such
way is sentiment analysis. Sentiment analysis is a technique
that extracts user opinions and sentiment from social media,
capturing users' views without requiring a questionnaire or
survey [4].
This paper discusses how data is collected from social media
and then processed to produce a product rating that is
visualized in charts.
Several studies have examined this area. Some report that the
informal wording of tweets makes sentiment analysis difficult
[4]. Others note that irrelevant writing, such as abbreviated
words and inappropriate or excessive notation, demands more
precision in sentiment analysis [5]. For this reason, this paper
only categorizes the data as positive, negative, or neutral [6].
The remainder of the paper is structured as follows:
Section II describes the related work. Section III describes
the author's motivation. Section IV discusses the method and
the dataset used for the experiment. Section V presents the
experimental results with current tools, and Section VI
concludes the paper.
II. RELATED WORK

In the last decade, interest in mining sentiment and opinions
in text has grown rapidly, due in part to the large increase in
the availability of documents and messages expressing personal
opinions [7].
Most existing methods process reviews only in terms of positive
and negative comments. This is not enough for a customer to make
a decision about a product. The approach proposed in [5] not
only finds the positive and negative comments for a product or
its features, but also rates them in order of positivity and
negativity. It also gives degrees of comparison for a particular
product and its features [5].
This information can also be recommended to people who search
for similar products, helping them make decisions based on
others' opinions [3]. An opinion from a random user does not
necessarily earn another user's trust; the opinions of a user's
connections are needed. The importance of a connection's opinion
changes over time, so the evolving trust relationships among
users need to be considered [8]. In particular, sentiment in
Twitter data has been used for prediction or measurement in a
variety of domains, such as the stock market, politics, and
social movements [6].
The main limitations reported across these studies are that the
task requires a lot of precision [5], that not enough data sets
are available [3], and that the accuracy of sentiment analysis
needs to be improved. It is also noticeable that, because
misspelled words and slang appear so often in tweets, it is not
easy to extract the sentiment [4].
III. MOTIVATION

The system proposed in this paper draws on a large source of
unstructured information, especially opinions, that may be
useful to others such as companies, their competitors, and other
consumers. For example, someone who wants to buy a camera can
look for comments and reviews from someone who has just bought
a camera and has written about the experience or about the
camera manufacturer. The buyer can use this customer feedback to
make a decision. A manufacturing company can likewise improve
its products or adjust its marketing strategies [5]. This is the
difference between the current system and our proposed system.

TABLE I. CURRENT SYSTEM
(+) Advantages                        (-) Disadvantages
The system is already available       The rating cannot be trusted
Does not depend on the data source    High operating costs
                                      Rating only for one product

Fig. 1. Advantages and disadvantages of the Current System

TABLE II. PROPOSED SYSTEM
(+) Advantages                        (-) Disadvantages
The rating can be trusted             Depends on the data source
Low operating costs                   Experts are needed to build the system
Rating not only for one product

Fig. 2. Advantages and disadvantages of the Proposed System

IV. METHOD AND MATERIAL

A. Data
Twitter is a popular microblogging service that allows users to
post tweets, status messages of up to 140 characters [9]. These
tweets usually carry personal views or emotions about the
subject they mention. For that reason, this paper uses Twitter
as its data source.
The system uses the Twitter API to collect data. An API
(Application Programming Interface) is an interface provided by
a developer so that other application developers can access the
application more easily. In essence, the API serves as a bridge
between one application and another.
It must be remembered that Twitter's public API provides only
1% or less of its entire traffic, without control over the
sampling procedure, which is likely insufficient for accurate
analysis of public opinion [5].

Fig. 3. Using API in System
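As a rough illustration of the collection step, the sketch below gathers tweets for one keyword page by page and drops duplicates. It is not the paper's implementation: `collect_tweets`, the `fetch_page` callable, the cursor convention, and the tweet dictionary shape are all hypothetical names introduced here, and `fake_fetch_page` merely stands in for a real API client.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tweet shape for this sketch: {"id": ..., "text": ...}
Tweet = Dict[str, str]
# Hypothetical page fetcher: (keyword, cursor) -> (tweets, next cursor or None)
PageFetcher = Callable[[str, Optional[str]], Tuple[List[Tweet], Optional[str]]]


def collect_tweets(keyword: str, fetch_page: PageFetcher,
                   max_tweets: int = 200) -> List[Tweet]:
    """Collect up to max_tweets unique tweets for a keyword, page by page."""
    seen_ids = set()
    collected: List[Tweet] = []
    cursor: Optional[str] = None
    while len(collected) < max_tweets:
        page, cursor = fetch_page(keyword, cursor)
        for tweet in page:
            if tweet["id"] in seen_ids:
                continue  # skip duplicates returned on overlapping pages
            seen_ids.add(tweet["id"])
            collected.append(tweet)
            if len(collected) == max_tweets:
                break
        if cursor is None:  # no further pages available
            break
    return collected


def fake_fetch_page(keyword: str, cursor: Optional[str]):
    """Stand-in for a real API call; returns two canned pages."""
    pages = {
        None: ([{"id": "1", "text": f"my {keyword}"},
                {"id": "2", "text": f"{keyword} again"}], "p2"),
        "p2": ([{"id": "2", "text": f"{keyword} again"},
                {"id": "3", "text": f"so {keyword}"}], None),
    }
    return pages[cursor]


tweets = collect_tweets("iphone slow", fake_fetch_page)
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular client library, so the same loop could be reused whichever API access level is available.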

B. Preprocessing
The text of tweets differs from the text in articles, books,
or even spoken language. It includes many idiosyncratic
uses, such as emoticons, URLs, "RT" for re-tweet, "@" for
user mentions, "#" for hashtags, and repetitions. It is
necessary to preprocess and normalize the text [6].
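A minimal sketch of such normalization is shown below, using only Python's standard `re` module. The function name `normalize_tweet` and the specific rules (dropping URLs, the "RT" marker, and user mentions, keeping the hashtag word, and shortening runs of three or more repeated characters to two) are assumptions made for illustration; the paper does not specify its exact preprocessing rules.

```python
import re


def normalize_tweet(text: str) -> str:
    """Normalize a raw tweet (illustrative rules, not the paper's exact ones)."""
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"\bRT\b", " ", text)         # drop the re-tweet marker
    text = re.sub(r"@\w+:?", " ", text)         # drop user mentions
    text = re.sub(r"#(\w+)", r"\1", text)       # keep the hashtag word only
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # "slooow" -> "sloow"
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


cleaned = normalize_tweet("RT @user: My #iPhone is sooooo slooow!!! http://t.co/abc")
```

Shortening repetitions to two characters rather than one keeps a trace of the emphasis, which could later be used when scoring how strong a sentiment is.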
C. Sentiment Model
The design of the sentiment model used in our system is based
on the assumption that the opinions expressed are highly
subjective. Therefore, the data is classified directly after
preprocessing, as shown in Fig. 4.
D. Opinion Classification
In opinion classification, tweets are categorized into three
classes: positive, negative, and neutral. For example, consider
the tweets
"I like the iphone because bodycase, camera and also not slow"
and
"Your iphone slow? This is the trick to make it faster"
The first tweet contains "slow" preceded by "not", so the system
classifies it as positive. The second tweet contains a question
mark after "slow", so the system classifies it as neutral. If a
tweet contains "slow" that is neither preceded by "not" nor
followed by a question mark, the system classifies it as
negative.
For better performance, the system should also assess the level
of satisfaction expressed in a tweet. For example, a tweet
containing "very slow" must receive a higher value than a tweet
that contains only "slow". If a tweet is negative, we need to
know how negative it is, so that the rating system is more
accurate.
In practice, however, Indonesian users often do not follow
dictionary spelling and use nonstandard abbreviations in their
tweets.
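The rules described in this subsection can be sketched as a small rule-based classifier. This is only an illustration under stated assumptions: the function name `classify_tweet`, the integer intensity scale (0, 1, 2), and the choice to treat tweets without the keyword as neutral are not taken from the paper.

```python
import re
from typing import Tuple


def classify_tweet(text: str, keyword: str = "slow") -> Tuple[str, int]:
    """Return (category, intensity) for a tweet, following the keyword rules."""
    lowered = text.lower()
    kw = re.escape(keyword)
    if not re.search(rf"\b{kw}\b", lowered):
        return ("neutral", 0)       # assumption: no keyword means no opinion
    if re.search(rf"\bnot\s+{kw}\b", lowered):
        return ("positive", 1)      # "not slow" negates the complaint
    if re.search(rf"\b{kw}\b\s*\?", lowered):
        return ("neutral", 0)       # "slow?" is a question, not an opinion
    # "very slow" is weighted higher than a plain "slow"
    intensity = 2 if re.search(rf"\bvery\s+{kw}\b", lowered) else 1
    return ("negative", intensity)


examples = [
    "I like the iphone because bodycase, camera and also not slow",
    "Your iphone slow? This is the trick to make it faster",
    "My iphone is slow",
    "My iphone is very slow",
]
labels = [classify_tweet(t) for t in examples]
```

A purely lexical rule set like this is easy to inspect, but it depends on exact spelling, which is why informal wording and abbreviations reduce its accuracy.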

TABLE IV. TWEET CLASSIFICATION
Keyword         Positive    Negative    Neutral
Iphone Slow     12          84
Samsung Slow    16          61

Fig. 6. Tweet Classification

After classifying the data, the next step is to compute the
rating and visualize it in a graph so that users can understand
it easily. The formula used is:

$$\text{Rating} = \frac{\text{Negative Tweet}}{\text{Positive Tweet} + \text{Negative Tweet}} \times 100 \qquad (1)$$

Fig. 4. Sentiment Model

Fig. 7. Product Rating Charts

V. RESULT

A total of 182 tweets were collected from Twitter. These tweets
contain the keywords "Iphone Slow" and "Samsung Slow". Of the
182 tweets, 102 contain the keyword "Iphone Slow" and 80 contain
the keyword "Samsung Slow".

TABLE III. TOTAL TWEET
Tweet           Total
Total Tweet     182
Iphone Slow     102
Samsung Slow    80
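As a worked illustration of the rating formula, read here as the number of negative tweets divided by the sum of positive and negative tweets, multiplied by 100, the sketch below applies it to the per-keyword counts in the tweet classification table. The function name `rating` is hypothetical, and reading the two surviving counts per keyword as positive and negative, in that order, is an assumption about the extracted table.

```python
from typing import Dict


def rating(positive: int, negative: int) -> float:
    """Share of negative tweets among opinionated tweets, as a percentage."""
    total = positive + negative
    if total == 0:
        raise ValueError("no positive or negative tweets to rate")
    return negative / total * 100


# Counts as read from the tweet classification table (assumed column order).
counts: Dict[str, Dict[str, int]] = {
    "Iphone Slow": {"positive": 12, "negative": 84},
    "Samsung Slow": {"positive": 16, "negative": 61},
}

ratings = {
    keyword: rating(c["positive"], c["negative"])
    for keyword, c in counts.items()
}
for keyword, value in ratings.items():
    print(f"{keyword}: {value:.2f}")
```

Because the keyword is "slow", a higher value under this reading means that a larger share of opinionated tweets complain about slowness.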

VI. CONCLUSION

We presented a system that uses Twitter sentiment analysis to
rate a product, helping customers search in more detail and
choose products that have a good rating on the desired features.
However, the system still lacks accuracy in classifying tweets
and rating the product, as described in the Opinion
Classification subsection. Furthermore, the lack of data that
results from using the free API is another problem. The
recommendation for further research is to obtain more accurate
results and to use the premium Twitter API to collect the data.
Fig. 5. Total Tweet

After preprocessing the tweets, we classified the data; the
result is shown in Table IV.

REFERENCES

[1] W. Duan, B. Gu, and A. B. Whinston, "The Dynamics of Online Word-of-Mouth and Product Sales - An Empirical Investigation of the Movie Industry," J. Retailing, vol. 84, no. 2, pp. 233-242, 2008.
[2] W. Zhou and Y. Liu, "Online Product Rating Manipulation and Market Performance," May 2015.
[3] R. Nithish, S. Sabarish, M. Navaneeth Kishen, A. M. Abirami, and A. Askarunisa, "An Ontology based Sentiment Analysis for mobile products using tweets," in ICoAC, 2013.
[4] W. Yen Chong, B. Selvaretnam, and L. Soon, "Natural Language Processing for Sentiment Analysis," in Engineering and Technology, 2014.
[5] V. Yogesh Karkare and S. R. Gupta, "Product Evaluation using Mining and Rating Opinions of Product Features," in Signal Processing and Computing Technologies, 2014.
[6] H. Wang, D. Can, A. Kazemzadeh, F. Bar, and S. Narayanan, "A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle," in Proceedings, pp. 115-120, 2012.
[7] B. Pang and L. Lee, "Opinion Mining and Sentiment Analysis," Ithaca, 2008.
[8] A. Davoudi and M. Chatterjee, "Product Rating Prediction Using Centrality Measures in Social Networks," Sarnoff Symposium, 2015.
[9] S. W. Davenport, S. M. Bergman, J. Z. Bergman, and M. E. Fearrington, "Twitter versus Facebook: Exploring the role of narcissism in the motives and usage of different social media platforms," Computers in Human Behavior, vol. 32, pp. 212-220, 2014.
