
Sentiment Analysis of smart phone based OS via Twitter tweets
Saif Ullah
UCP, saifullah27@gmail.com

Abstract - Over the past decade, the rapid growth of online social platforms has created a global space for communication through social media applications on the internet. A massive amount of information is generated on these platforms every day, and this heavy use of social media is influencing the digital market. Social media monitoring is a new way to learn customers' sentiment, loyalty, attitude, and behavior towards a particular brand. The objective of this study is to perform sentiment analysis of the available mobile operating systems in order to learn customers' sentiments and attitudes towards these brands. The following brands are selected: iOS, Android, Windows, BlackBerry, and Symbian. However, the content available on social media is unstructured, and analyzing unstructured data is still a challenging task.

Index Terms - Twitter, R language, sentiment analysis, machine learning

PROBLEM DEFINITION
Social media consumers and groups generate many kinds of information on social media platforms. This information should be monitored systematically to measure individuals' sentiments about different brands and products. Text mining is an approach that helps build valuable business insight through sentiment analysis of content, which may take different forms such as documents, tweets, Facebook posts, comments, videos, or images.

Several mobile operating systems are available in the market, each with its own functionality. The following mobile OS are selected for the current study: 1) iOS, 2) Android (Samsung), 3) BlackBerry, 4) Windows, and 5) Symbian. iOS runs on a limited set of devices such as the iPod, iPad, and iPhone, while Android runs on a huge number of devices from different companies, of which we focus on Samsung. BlackBerry offers models such as the KeyOne and Passport, and Windows runs on a limited number of mobile phones such as the Lumia. Symbian is no longer a well-established OS, but we include it to analyze why it failed. All of these operating systems have existed for a long time and have been supported by a large number of devices. To measure customers' behavior and sentiments towards these mobile operating systems, this paper presents a sentiment analysis study.

INTRODUCTION
In the last decade, social media platforms grew rapidly and built global communication through different social applications. A massive amount of information is generated on online social platforms every day. On Twitter, for example, there are more than 500 million posts per day [1]. Online social networks affect business, promotion, and web-based commerce because they reveal customer conduct and responses to specific products and services. Organizations are affected by people's opinions and purchase choices, and they now use the content of online social networks to analyze user behavior before entering the actual market. Social media analytics requires posts, comments, and feedback in order to draw conclusions. The process of retrieving information from online social networks and analyzing it for business decisions is called social media analytics. Generally, social media analytics techniques are used to discover customer sentiment, with the end goal of supporting marketing and customer service activities. Online networks are used naturally and flexibly by organizations and individuals to learn the current trends in the market. They help organizations learn clients' perspectives and their remarks on product quality and services, so that better decisions can be taken in favor of a successful business. The usual goals include increasing revenue, decreasing the cost of customer service, gathering customer feedback on products, and improving customer opinion of a specific product [1], [2].

To understand online network analytics, the problem needs to be viewed from two perspectives: the business perspective and the technical perspective. From the business perspective, an organization needs to know current market trends in order to compete and to target customers. In short, an organization must know the right time to introduce its products and services in the market. Moreover, organizations must be aware of the state of the market, that is, whether a similar product already exists. If similar products exist, it is essential to know what positive and negative feedback they have received, so that the organization can take action to improve its own products and services. This process gives the organization an advantage over its competitors.

After releasing a product or service in the market, organizations take an interest in validating customer feedback to learn the customer response towards the products or
services. This includes the number of followers, replies, retweets, and reactions to the released product. In the end, it enables organizations to know and understand customer behavior and opinion. The technical perspective concerns the difficulties faced when retrieving information from online networks and filtering positive and negative sentiments based on certain criteria. In addition, the analysis of online networks requires internet access and a large amount of memory to store the gathered information for further preprocessing and analysis. It also involves techniques for data cleaning and for transforming data from unstructured to structured form.

Determining customers' opinions is not as easy a task as it seems. Identifying customers' behavior towards a particular event or topic of interest requires sentiment analysis. Sentiment analysis determines the polarity of a user's attitude and the emotion expressed in a given sentence. To accomplish this, machine learning and natural language processing (NLP) approaches are used, and developers and analysts face difficulties when applying these approaches to unstructured content. In the last few years, interest in these analysis techniques has grown, with the aim of using social media content for sentiment analysis, opinion analysis, observing cohesion within communities, and advertising. Social media now plays an important role in shaping public opinion, so it is essential to use large datasets efficiently. A primary phase is to read, annotate, and limit the dataset to a size that can be analyzed easily.

Section I presents the scope of the study. Section II presents the research questions. Section III presents the tool used for sentiment analysis. Section IV gives a brief explanation of sentiment analysis. Section V presents the literature review, and Section VI presents the research methodology. Section VII presents the results. The conclusion is presented in Section VIII.

SCOPE
Five smart phone operating systems have been selected for the current study. Although Android is offered by multiple smart phone companies, we have limited it to Samsung through hashtags. iOS is offered by a single company, Apple Inc., so the iPhone series is selected for iOS. For BlackBerry, two models are selected (Passport and KeyOne). For Windows-based mobiles, the Nokia Lumia is selected.

RESEARCH QUESTION
Two major research questions are considered for the selected case study:
1. Explore how user response towards different OS changes over time.
2. Analyze and conclude which OS attracts more positive or negative sentiment.

TOOL
To extract tweets and perform sentiment analysis on them, the tool sentiment viz [20] has been used. The tool is open source and available online. It extracts tweets based on particular keywords and generates multiple results using a sentiment dictionary. The tool classifies tweets based on Russell's model of emotional affect [20]. Further, to estimate the sentiment score, the tool applies a machine learning approach, the Naïve Bayes model. Its sentiment dictionary contains more than ten thousand English terms. Every sentiment term is rated on a scale ranging from 1 to 9.

SENTIMENT ANALYSIS
The accessibility of the Internet, the Web, and cell phones has made it possible for an immense number of individuals to connect and communicate with each other through online network accounts from any place and at any time. Rather than being passive consumers, individuals have become active consumers by means of their social networking accounts. Smart phones have only accelerated this use. A survey conducted in 2015 reported that smart phone ownership had reached 86% among Americans aged 18 to 29 [9]. Social networking sites understood this long ago, and everybody can see the consequences. Every online network is available as an application on the Play Store and the Apple Store, and online networks have become a significant part of our daily lives. Facebook is currently the world's largest social network, with more than 1.44 billion users as of 2015 (around 20% of the world's population), followed by Instagram (300 million) and Twitter (284 million). Facebook's users around the globe spend an average of 20+ minutes per day on the social network liking, commenting, and scrolling through status updates, which accounts for almost 20% of all time spent on the web [8, 9].

This implies that, with so much data available online, we can easily track the "trends", "popular thoughts", and "opinions" of individuals. The public nature of user content, for example the number of 'follows', 'likes', and 'comments' on social networks, can give insight into the topics attracting user interest. People have also begun turning to social networks as sources of real-time news and opinions. To support this, platform providers such as Google, Twitter, and Facebook have made it possible to search through the enormous volume of public status updates. They have also offered access to public posts through their search APIs, resulting in a burst of commercial and research efforts to gather information through analysis of the shared content. This has motivated research into text analysis and into applying existing information retrieval and trend detection techniques to social media, in order to benefit from the knowledge contained in user-generated content. This analysis offers valuable insights into the topics
that attract the attention of a large part of social network users. The general opinions drawn from these analyses matter not only to individuals but are also extremely important for journalists (news reporters), organizations that track customer behavior, election outcome prediction, economic prediction, and more.

In this study, text (sentiment) analysis is performed on information gathered from Twitter tweets. The term sentiment analysis refers to the process of using a set of linguistic, statistical, and machine learning techniques to structure the content of textual sources for insight, exploratory data analysis, research, or investigation. Moreover, sentiment analysis is the process of identifying the emotional tone in a series of words, which in turn is used to gain an understanding of the attitudes, opinions, and emotions expressed within an online mention. This paper applies sentiment analysis across ten distinct categories, namely anger, anticipation, disgust, fear, joy, negative, positive, sadness, surprise, and trust.

Alongside the different machine learning techniques, sentiment techniques play an essential role in identifying the sentiment of massive amounts of content. Today, several analysis tools exist that perform particular tasks: identifying individuals' excitement about forthcoming movies, relating individuals' emotions to a political party to learn positive and negative attitudes, and measuring individual sentiment based on rating criteria, for example to learn the good and bad things about hotel and restaurant services and facilities. Because of the massive amount of information shared on social media platforms (blogs, forums, etc.), it is not possible to analyze this data manually.

LITERATURE REVIEW
Social media analysis is gaining attention day by day as a way to learn what individuals feel and think about a product or topic. Generally, sentiment approaches are divided into two families: 1) lexicon-based [15] and 2) machine learning [13]. The lexicon-based approach depends on a corpus (a collection of terms, information, or content). Along with the sentiment lexicon method, machine learning techniques use different types of methods such as syntactic, linguistic, and hybrid techniques. One study [14] established the polarity of reviews by recognizing the polarity of the adjectives that appear in them. The study reported ten times better accuracy compared to pure machine learning approaches. Unfortunately, these approaches fail when applied to a new domain because they are not flexible with ambiguous sentiment terms. Adjective terms change the meaning of a sentence in a sentiment lexicon [16].

In 2011, Zhang et al. [17] applied hybrid techniques to tweets for sentiment analysis. The hybrid technique was a combination of supervised learning and sentiment-bearing terms. The sentiment-bearing terms were taken from the Whissell [18] dictionary. They processed tweets in multiple stages to clean them: tokenization, expansion of abbreviations into the original words, and removal of links. After data cleaning, the tweets were separated into pleasant and unpleasant sentiments using a supervised learning approach with n-grams. In the same year, Jiang et al. [19] classified tweets with an SVM approach into three classes (pleasant, unpleasant, and neutral).

In 2009, Go et al. [11] conducted a study on the polarity of tweets. The researchers applied a supervised classification approach based on emoticons in tweets. Two emoticons were considered, happy ":)" and sad ":(", as positive and negative labels. The researchers applied different techniques such as SVM, Naïve Bayes, and Maximum Entropy and stated that straightforward use of unigrams gave good results, but that further improvement is possible by using unigrams and bigrams together. In 2005, Read applied a supervised classification technique to build an emoticon-labeled corpus based on positive and negative sentiments [10]. In 2010, Pak and Paroubek built a tweet corpus based on positive and negative emotions, compared different learning approaches, and claimed better results than earlier studies using Naïve Bayes with unigrams and part-of-speech tags [12].

METHODOLOGY
Generic Dataset

The sentiment analysis is conducted on the leading operating systems in the market: iOS, BlackBerry, Windows, Symbian, and Android. As many tweets as possible are fetched from Twitter through the sentiment viz tool. The tool itself builds a corpus from the tweets using lexicon methods and compares each token or emotion with a sentiment dictionary to decide whether it carries positive or negative sentiment. Further, to compute the sentiment score, a sentiment lexicon and appropriate techniques are applied. A sentiment lexicon maps words to sentiment scores. Results are presented using graphs. The overall process is divided into four phases:

Phase-1
• Gather tweets from Twitter through the Twitter API (viz tool)

Phase-2
• Perform data normalization on the gathered contents
• Perform feature reduction so that sentiment analysis techniques can be applied

Phase-3
• Conduct sentiment analysis
• Polarity classification
• Generate visualization of results

Phase-4
• Determine sentiment score based on emotions
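The lexicon lookup described in this section can be illustrated with a short sketch. It is written in Python purely for illustration (the study itself refers to R and to the sentiment viz tool), and it is not the tool's implementation: the mini-lexicon, its ratings, the neutral midpoint of 5, and the use of a mean over matched words are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical mini-lexicon: word -> pleasantness rating on a 1-9 scale.
# The words and ratings are invented for illustration only.
LEXICON = {"love": 8.7, "great": 7.5, "slow": 3.9, "broken": 2.4, "hate": 2.1}

MIDPOINT = 5.0  # assumed neutral point of the 1-9 scale


def tweet_polarity(tweet):
    """Return (mean rating, label) for a tweet.

    Returns (None, "unknown") when no token is found in the lexicon.
    """
    tokens = tweet.lower().split()
    ratings = [LEXICON[t] for t in tokens if t in LEXICON]
    if not ratings:
        return None, "unknown"
    score = mean(ratings)
    if score > MIDPOINT:
        label = "positive"
    elif score < MIDPOINT:
        label = "negative"
    else:
        label = "neutral"
    return score, label
```

Calling `tweet_polarity("I love my great phone")` averages the ratings of "love" and "great" and labels the tweet positive, while a tweet with no lexicon words is reported as unknown rather than forced into a class.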
Phase 1: Information Gathering
In Phase 1, the tweets are gathered through sentiment viz. Tweets from the past year (thousands of tweets) are gathered to perform sentiment analysis and to improve the accuracy of the results.

Phase 2: Data Cleaning
In Phase 2, the gathered dataset is transformed from unstructured to structured form through reduction techniques. The words are converted to lower case, and the text is trimmed to remove whitespace, stop words, and punctuation. After cleaning the gathered dataset, a term-document matrix is built to obtain the frequency of each word.

Phase 3: Sentiment Analysis
The gathered contents are classified by polarity using machine learning techniques. A comparative word cloud is generated based on the polarity classification. To generate the word cloud, R offers the wordcloud package, which helps to extract the most frequent words and visualize the results in a comparative way.

Phase 4: Sentiment Score
The sentiment scoring approach is applied to filter positive and negative sentiments; the results are presented in the next section. The positive and negative sentiment words are separated into two lists. To compute the sentiment score, the R package 'plyr' is used; the scoring routine takes a vector of sentences as input. To decide whether a word has a positive or negative impact, each word is compared against the dictionary. The routine calculates the difference between the number of positive and negative words based on word frequency. The resulting vector contains the text contents with their respective sentiment score magnitudes.

RESULTS

Graphical Representation of Sentiment Score

To present the results of the study on different smart phone operating systems, three types of graphs are generated. The first type of graph shows the cloud of hashtags, the second type shows the sentiment score calculated from different emotions, and the third type shows the time line (sentiment score of the last five days) for each operating system.

Figure 1 presents the hashtags considered for Android (Samsung). Although several companies offer Android-based smart phones, including Huawei, HTC, etc., the study focuses on Samsung Galaxy (S series) models, because the Samsung Galaxy series is the biggest competitor of the iPhone and other smart phone companies and has given its competitors a tough time in the market. The graph (Figure 1) shows the pleasant, unpleasant, active, and subdued hashtags over time. The tweets are filtered based on Android Samsung hashtags.

Figure 1 Android hashtags

Figure 2 presents the sentiment score for Android (Samsung). A few individuals lie in the unpleasant (tense and bored) quadrant. Further, Figure 2 presents the clusters of tweets based on different emotions.

Figure 2 Android Sentiment Score

Figure 3 presents the time line for Android (Samsung) over a few hours in a day. The tweets are clustered into four categories (happy, unhappy, relaxed, and upset) based on emotions. Tweet counts are presented for each category.

Figure 3 Android Time Line

Figure 5 shows the sentiment comparison of iOS-based smart phones. The generated graph shows that the response towards iOS-based smart phones changes over time. The resulting graphs show that the iPhone4 received less negative sentiment than the others. This is because the iPhone4 came with a new design, a high-resolution camera with video recording, and improved hardware performance over the 3G. The negative score increases slightly for the iPhone5 because the screen size was 4 inches. The negative sentiment score for the iPhone5C increased relative to the iPhone 4 and 5 because it was not very different from the iPhone5, apart from being less expensive and having a plastic body. Tremendous changes came with the iPhone6, 7, and 8. The positive sentiment score, as shown in Figure 4, increased tremendously for the iPhone6 because it came with a new design and hardware improvements. The iPhone7 also received very positive feedback from customers, but the iPhone8 did not. This may be because the iPhone8 did not improve much over the earlier models (iPhone6, 7).
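As an illustration of the Phase 2 cleaning and Phase 4 scoring steps described in the methodology, the following is a minimal sketch in Python. It is not the R code used in the study: the stop-word list and the positive and negative word lists are hypothetical placeholders, and the function names are invented for the example. It only shows the general idea of normalizing text, counting term frequencies, and scoring a tweet as the number of positive words minus the number of negative words.

```python
import re
from collections import Counter

# Hypothetical word lists; a real study would load full dictionaries.
STOP_WORDS = {"the", "is", "a", "my", "and", "but", "of", "to"}
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate", "broken"}


def clean(tweet):
    """Lowercase, strip punctuation and extra whitespace, drop stop words."""
    text = tweet.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation with spaces
    return [w for w in text.split() if w not in STOP_WORDS]


def term_frequencies(tweets):
    """Count how often each cleaned word occurs across all tweets."""
    return Counter(w for t in tweets for w in clean(t))


def sentiment_score(tweet):
    """Number of positive words minus number of negative words."""
    words = clean(tweet)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg
```

With these placeholder lists, a tweet that contains one positive and one negative word scores 0, so a score near zero can mean either mixed sentiment or no sentiment words at all; a fuller analysis would report the two counts separately.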
Figure 4 iOS hashtags

Compared to Android (Samsung), far fewer individuals lie in the unpleasant quadrant. The majority of people lie in the pleasant quadrant.

Figure 5 iOS Sentiment Score

The time line for iOS-based phones is presented in Figure 6, which shows the number of tweets falling in each category.

Figure 6 iOS Time Line

Figure 7 presents the hashtags for Windows-based mobiles. Figure 8 shows the sentiment score for Windows-based smart phones. The negative response is lower than for Android and iOS-based smart phones, but the tweet count is also lower than for Android and iOS. In relative terms, it can therefore be concluded that Windows-based smart phones are not supported by a large population.

Figure 7 Windows hashtags

Figure 8 Windows Sentiment Score

Figure 9 Windows Time line

Figure 10 shows the hashtags used for BlackBerry OS. Figure 11 shows the sentiment score for BlackBerry OS. A large number of individuals lie in the pleasant quadrant.

Figure 10 BlackBerry hashtags

Figure 11 BlackBerry Sentiment Score

Figure 12 BlackBerry Time line

Figure 13 shows the hashtags for Symbian OS, while Figure 14 shows the sentiment score. The number of users is lower than for the other OS.

Figure 13 Symbian hashtags

Figure 14 Symbian Sentiment Score
Figure 15 Symbian Time line

CONCLUSION
Due to the rapid growth of online networks in the last decade, a tremendous amount of data is generated every day. Analyzing the data available on online networks gives organizations an opportunity to dig into the data and make better decisions for business success. However, analyzing unstructured information is a challenging issue. This paper presented a review of past work in the field of sentiment analysis and performed sentiment analysis on the leading smart phone operating systems. The sentiment analysis is conducted on Twitter data because Twitter is one of the leading social networks. Three strategies are considered in the current work to analyze unstructured Twitter information: polarity, word cloud, and text mining. Five smart phone operating systems are taken as a case study for the sentiment analysis. The results show that iOS is the leading operating system for smart phones. However, Android has also been gaining users' attention in recent years due to improvements in hardware and software. For BlackBerry OS based smart phones, the tag clouds show that users are interested in the Passport and KeyOne models. Windows and Symbian based smart phones are not used by a large number of users compared to the other OS, as shown in the Results section. Compared to iPhones, Android-based smart phones are affordable in terms of cost and compatibility.

REFERENCES
[1] M. Rouse, "Social Media Analytics," TechTarget, November 2012.
[2] F. K. Gohar, "Social Media Network Analytics," in Seven Layers of Social Media Analytics: Mining Business Insights from Social Media, 2015.
[3] B. Pang and L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, nos. 1–2, pp. 1–135, 2008.
[4] M. Isaac, "Mining Social Media For Predictive Analytics," Kampala, Uganda: School of Computing and Engineering, Uganda Technology and Management University, 2015.
[5] P. Priyanka and M. Khushali, "A Review: Text Classification on Social Media Data," IOSR Journal of Computer Engineering, vol. 17, no. 1, 2015.
[6] J. La, "Comparison of data analysis packages: R, Matlab, SciPy, Excel, SAS, SPSS, Stata," 23 February 2009. [Online]. Available: https://brenocon.com/blog/2009/02/comparison-of-data-analysispackages-r-matlab-scipy-excel-sas-spss-stata/. [Accessed 31 October 2016].
[7] R. Matthew, Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More, O'Reilly, 2014.
[8] Irena Pletikosa Cvijikj and Florian Michahelles, "Monitoring Trends on Facebook," 2011.
[9] Here's how much time people spend on Facebook per day. (2015, July 08).
[10] Jonathon Read. Using emoticons to reduce dependency in machine learning techniques for sentiment classification, 2005.
[11] Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using distant supervision, 2009.
[12] Alexander Pak and Patrick Paroubek. Twitter as a corpus for sentiment analysis and opinion mining, 2010.
[13] Boiy, E., Moens, M.F.: A machine learning approach to sentiment analysis in multilingual web texts, 2009.
[14] Moghaddam, S., Popowich, F.: Opinion polarity identification through adjectives. CoRR abs/1011.4623 (2010)
[15] Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Computational Linguistics, 2011.
[16] Mullaly, A., Gagné, C., Spalding, T., Marchak, K.: Examining ambiguous adjectives in adjective-noun phrases: Evidence for representation as a shared core-meaning, 2010.
[17] Lei Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, and Bing Liu. Combining lexicon-based and learning-based methods for twitter sentiment analysis, 2011.
[18] Cynthia Whissell. The Dictionary of Affect in Language, 1989.
[19] Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent twitter sentiment classification, 2011.
