
Search and Tell: Topic Distributions in Queries and Tweets

Tanushree Mitra
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA, USA
tmitra3@cc.gatech.edu
David A. Shamma
Internet Experiences Group
Yahoo! Research
Santa Clara, CA, USA
aymans@acm.org
Eric Gilbert
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA, USA
gilbert@cc.gatech.edu
ABSTRACT
What topics are within bounds for search, but out of bounds for
Twitter, and vice versa? Where do they overlap? Using a ground-up,
empirically-based approach, we compare topics across tweets and
search queries. Using a random sample of tweets and queries, we
perform a deep content analysis. We find substantial and significant
differences on many topics. For example, we observe entertainment,
food and sports topics in tweets far more often than we see them
among search queries. On the other hand, people routinely query for
shopping, yet rarely mention it on Twitter. There is also considerable
overlap: we see references to celebrities, health & beauty, travel &
recreation, education, gaming and weather as often in tweets as in
searches. By identifying where topics converge and diverge, this
work bridges a contextual gap between social and search, informing
modern systems that combine the two, as well as computer-mediated
communication theory.
Keywords
twitter, search queries, social search, topics, cross-site
Categories and Subject Descriptors
H.5.3 [Information Interfaces and Presentation]: Group and Organization Interfaces - Web-based interaction
1. INTRODUCTION
. . . when one's activity occurs in the presence of other persons,
some aspects of the activity are expressively accentuated and
other aspects, which might discredit the fostered impression,
are suppressed. It is clear that accentuated facts make their
appearance in what I have called a front region; it should
be just as clear that there may be another region, a "back
region" or "backstage," where the suppressed facts make an
appearance.
The Presentation of Self in Everyday Life, p. 111 [19]
Most Americans find themselves constantly migrating between
search engines and social network sites (SNSs) [44], between a
place where we think out loud (e.g., Twitter) and a place where
we keep our thoughts to ourselves (e.g., Google). In Goffman's
language, Twitter is the front stage and search is the back stage:
the place where we can express thoughts and desires without fear of
social retribution. How do the two stages alter the meaning of what
we ask or post? What topics are within bounds for search, but out
of bounds for Twitter, and vice versa? Where do they overlap? In
other words, what are the topics that people use when they have an
audience (as in Twitter) versus when they interact with a machine
(as with a search engine)?
In this paper, we explore these questions. We believe the answers of-
fer contributions to two bodies of research: computer mediated com-
munication (CMC) theory and social search. Recent research shows,
for instance, that tweets can detect breaking news [43], enhance
situational awareness in an emergency [53], and predict opinion
polls [41]. Yet, all this work is predicated on what kinds of things
people will or will not say on systems like Twitter (i.e., compared to
a baseline like search). We believe our work informs the theoretical
underpinnings of work such as this. Moreover, the present research
has practical implications for social search systemssystems which
are based on the premise that search is not a solitary activity but
is informed by social sources of information [15, 16, 21, 35, 42].
Systems like socialmention, so.cl, and social-media integration in
general web search all hinge on the idea that they can pull information
from the social network in the context of a user's search [36]. In
other words, they connect what people say to what people search.
While we know a great deal about each side in isolation (e.g., both
on Twitter [22, 40] and in search [9, 11]), we know very little about
behavior at the intersection, part of a broader issue regarding cross-
site research.
This paper aims to bridge the gap. Rather than working at the
unigram or bigram level, we perform a deep content analysis to
look for contextual topics that emerge in a set of tweets and queries
randomly sampled from the same day. This allows us to move
beyond strictly lexical meaning and term matching, arriving much
closer to what people mean. For example, we observe entertain-
ment, food and sports in tweets far more often than we see them
among search queries. On the other hand, people routinely query
for shopping, yet rarely mention it on Twitter. And, as perhaps is to
be expected, we find references to pornographic material relatively
often in search, but almost never on Twitter. There is also overlap:
we see references to celebrities, health & beauty, travel & recreation,
education, gaming and weather as often in tweets as in searches.
We begin by reviewing recent work on both social network sites and
social search. Next, we categorize a sample of tweets and queries,
inductively developing a qualitative coding scheme to capture tweet
and query topics. We then arrive at a quantitative, distributional
analysis of these data and their associated codes. We conclude the
paper by comparing and contrasting the topics that emerge in these
two different systems, offering implications for social theories as
well as search tools and practice.
2. RELATED WORK
We first give a brief overview of related work on the usage of social
network sites and search systems. We also discuss the state of current
social search systems and how our work can inform their design.
2.1 Social Network Site and Search Engine Use
Social network sites (e.g., Facebook, Twitter, Google+) are used in
different ways to fulfill different purposes [4]. People use Twitter to
report their experiences, seek information about others, announce
events, and broadcast thoughts and opinions [22, 40, 41]. A detailed
analysis of tweets containing the @ symbol demonstrated the potential
of microblogging services like Twitter to be used as a collaboration
system [22]. Topics in microblogging streams have been classified
using sophisticated machine learning techniques [45, 23] and using
the hashtag as a strong identifier [29]. Textual analysis of event-
centric tweets has identified two types of temporal topics in Twitter
event streams: localized "peaky" topics, which are momentary in
nature, and persistent conversations, which last over time [48].
While studies have examined the topical space of microblogging
systems, we are unaware of any work that compares them to search
queries: an area of study which can inform the design of these hybrid
systems.
Additionally, social network sites (SNSs) have been explored as
question asking systems [36, 37], with particular focus on answer
speed, quality, and the types and topics of questions asked, but without
emphasis on the broader question of what topics emerge when people
use these systems. People ask questions of their social connections
via Facebook and Twitter status messages to get trusted responses,
personalized and high quality answers, and answers to subjective
questions [37, 36]. However, they still prefer Web searching over
question asking in SNSs [36]. The reason for this behavior is largely
attributed to the less relevant results returned by SNSs [16]. People
issue repeated Twitter queries to monitor current events, local news,
weather reports, and other people's activities and opinions [52]. On
the other hand, when using search engines, they change and modify
Web search queries to learn about a topic or to navigate through
Web content [52].
One formal investigation of trending topics involved tracking
influenza for early contagion detection. Ginsberg et al. showed it
is possible to use search queries to detect influenza epidemics in
areas with a large population of web search users [18]. Though
initially successful, complications arose as people changed their
health-seeking behaviors and habits, which caused errors in pre-
diction [13]. Recently, researchers have begun to examine Twitter
to find similar trends, which has been met with some optimism and some
criticism [8]. We assert that the fragility of these systems, both in their
construction and in temporal changes in behavior, comes from
the need to understand the information request or communication in
queries and tweets alike.
Our work builds on these studies by comparing and contrasting
the topical space of a widely used social system (Twitter) and Web
search queries from a major search engine (Yahoo! Search). While
several studies have suggested improvements to search interfaces by
mining queries and their associated results from social systems, none
of them have focused on the distinctions in topical meanings be-
tween social streams and search queries, a factor which influences
CMC systems.
2.2 Current Social Search Systems
Companies and researchers have explored combining social content
with search into what are known as social search systems. Google's
Social Search [10] improves search results by pulling content from
Twitter followers or from FriendFeed. (Google has since discontin-
ued this service.) Their follow-up "Search plus Your World" Google+
integration (http://www.google.com/insidesearch/plus.html) helps
users discover pages and other people's profiles based on a topic
of interest. Microsoft's So.cl (http://www.so.cl/) uses Facebook to
provide a similar experience. Bing's social search
(http://www.bing.com/social) allows a user to get help from their
Facebook friends on a search query.
While leveraging social connections to improve the search experi-
ence, these systems don't account for topical differences between
what a person will post publicly and what they will query privately
on their own machine. SearchBuddies, a socially embedded search
engine which provides algorithmic answers to questions posed via
Facebook status messages, saw reports of this from its participants [21].
Responses which were better off as private messages than as public
displays were explicitly deleted by the question asker. Moreover,
people are hesitant to answer certain questions for privacy reasons,
especially on sensitive topics (e.g., "how do I stop drinking," a query
from our dataset) [55]. The survey results [55] also indicated that users
are most inclined to ask Entertainment- and Technology-related ques-
tions of their social networks. A topical categorization of questions
asked on Twitter also found Entertainment to be the most popular topic,
followed by Personal and Health related questions [42]. However,
Personal and Health related questions received very few responses.
While differences like these have been addressed within the multime-
dia research community [49, 30, 39], they are rarely accounted for
when integrating search with social systems. Our work highlights
the need for social search systems to distinguish between the genres
of search queries. Imagine building a social search system to answer
queries about a topic, but those topics have low representation
in SNSs. Such a system will rarely return relevant results. These
shortcomings in current systems prompt us to identify topics of
interest which are more prevalent in search engines compared to
social systems, and those which span both spaces. By identifying topical
similarities and differences, our work addresses a gap that exists
in current social search system designs. Furthermore, the present
findings also lay new theoretical groundwork at the intersection of
two massive internet systems, social and search.
3. METHOD
We analyze two datasets: Twitter status messages and search query
data from the Yahoo! search engine. We then develop a topical
coding scheme for those statuses and queries, finally performing an
analysis on the distribution of topics.
3.1 Twitter Data
As part of the 2011 Text Retrieval Conference (TREC), the National
Institute of Standards and Technology (NIST) released a corpus of
16 million tweets collected over a timespan of two weeks, January
24th to February 8th, 2011 (http://trec.nist.gov/data/tweets/). The
corpus is a representative sample of Twitter from a timespan covering
two notable events: the Egyptian revolution and the U.S. Super Bowl.
Our main focus was not to study these event structures, but rather to
perform a topical analysis, examining the different topics people tweet
about versus what they search for on the web.
We next extracted all the tweets from a single day, January 30th,
to use in our analysis. January 25th marks the onset of the Egyptian
revolution, and Super Bowl XLV was played on February 6th.
Choosing a day in between provides a good sample for performing
cross-channel topical analysis. The sample from January 30th had
a total of 909,721 tweets, of which 789,125 were public tweets.
We fetched these using the Twitter API and, after filtering out non-
English tweets, we had 309,245 status messages. We randomly
selected 1,000 of them for deeper content analysis; we enforced this
limit primarily because of the resource-intensity of multiple-round
manual coding.
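
To make the filter-and-sample step concrete, the following is a minimal
sketch in Python. It is illustrative only: the input file name is
hypothetical (one tweet per line), and the langdetect package stands in
for the language filter; the seed is fixed only to make the sketch
repeatable.

# A minimal sketch of the filter-and-sample step described above.
import random
from langdetect import detect, LangDetectException

def load_english_tweets(path):
    english = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            text = line.strip()
            if not text:
                continue
            try:
                if detect(text) == "en":
                    english.append(text)
            except LangDetectException:
                continue  # too short or odd to classify; skip
    return english

tweets = load_english_tweets("tweets_2011-01-30.txt")  # hypothetical file
random.seed(0)
sample = random.sample(tweets, 1000)  # the hand-coded subsample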
3.2 Search Query Data
To compare against the Twitter NIST corpus, we used a single-
day sample of the search queries from the Yahoo! Web search
engine. This sample was also taken from January 30, 2011 and
was handed to us with all user-identifying information stripped from
the dataset. The sample was filtered to only include queries that came
from a valid, logged-in user with a valid browser cookie. This is
a standard technique to remove robots, crawlers and scripts from
the data and was done before we received the sample. In the end,
our dataset contained 10,219,017 search queries. From this set we
randomly selected 1,000 search queries for content analysis. All
together, the sampling, filtering, and repeat sampling removed most
session data and evidence of query repair [33].
3.3 Generating Codes
A ground-up, empirically-based approach was followed to generate
codes for tweets and search queries to nd the shape of the overall
corpus and not the repeat pattern of individuals. The rst author
iterated twice through all 2,000 data points generating codes that
reect the underlying topics behind the tweets and queries. Next,
all three authors went over the coding scheme and a subsample
of tweets and queries which reected these codes. After reducing
redundancy between the categories and collapsing those which had
less than 15 data points in both the search and tweet corpus, we
arrived at a coding scheme with 18 codes. Table 2 lists them along
with denitions and associated examples.
Prior to developing the coding categories, we referenced the topic
definitions of the Reuters OpenCalais topic classification system
(http://opencalais.com). OpenCalais is an emerging industry standard
for tagging web content and incorporates many training sources. The
topics included in it are: Business & Finance, Disaster & Accident,
Education, Entertainment & Culture, Environment, Health, Hospitality
& Recreation, Human Interest, Labor, Law & Crime, Politics, Religion
& Belief, Social Issues, Sports, Technology & Internet, Weather and
War & Conflict.
3.4 Qualitative Methods vs. Topic Models
Our choice of a qualitative approach was motivated by the follow-
ing reasons. First, the short length of Web queries, a sparse feature
space, and a changing query vocabulary over time make automatic
classification a challenging task [20, 27]. Automatic query classifi-
cation also yields some degree of inaccuracy. In contrast, research on
Web search queries has used human annotation and has reported
reliable results [51, 25, 26]. Thus we decided on manual coding for
search query categorization.
Deciding on the approach for tweet categorization required more
consideration. An existing body of quantitative research uses Latent
Dirichlet Allocation (LDA) [3] for topic modeling of tweets [23, 45,
52]. LDA is a popular unsupervised probabilistic topic modeling
technique which originated in the machine learning community.
LDA does not require any manually constructed training data, but
it needs the number of topics it is to distill from the document
collection. The topics are sets of related words which tend to co-
occur in similar documents. While LDA on tweets [45] has detected
clusters of words with stylistic cohesion, our initial exploration of
using LDA for identifying categories in our tweet dataset gave poor
results. While a detailed discussion of the quantitative analysis is
beyond the scope of this paper, we show the following example to
illustrate some shortcomings of LDA relative to human annotation:
Tweet: RT @chicagobulls: Pacers coach Jim O'Brien just
received his second T and an ejection
While both human raters classified the above tweet as Sports,
LDA assigned four different categories (Table 1). The top five words
in each of the topics generated by LDA and assigned to this tweet
hardly bear any resemblance to sports. In fact, the contextual domain
knowledge that helped the human raters easily classify this tweet is
absent in the quantitative approach.
               topic 1   topic 2   topic 3    topic 4
Probability    0.43      0.29      0.14       0.14
Top 5 words    time      like      get        http
               need      lol       going      bit
               watch     really    tomorrow   watching
               lol       one       week       another
               one       much      give       post

Table 1: LDA topic assignment for the tweet: "RT @chicagobulls:
Pacers coach Jim O'Brien just received his second T and an
ejection"
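
A run of this kind can be reproduced with an off-the-shelf LDA
implementation. The sketch below uses gensim as an illustration; the
tokenization, number of topics, and number of passes are assumptions of
the sketch, not the exact settings of our exploration.

# A sketch of an LDA run over tweets using gensim.
from gensim import corpora, models

docs = [
    "rt chicagobulls pacers coach jim obrien received second t ejection",
    "nowplaying ryan shaw it gets better",
    # ... the remaining tweets, lower-cased and tokenized
]
texts = [d.split() for d in docs]

dictionary = corpora.Dictionary(texts)
bow = [dictionary.doc2bow(t) for t in texts]

# LDA requires the number of topics to be fixed in advance.
lda = models.LdaModel(bow, id2word=dictionary, num_topics=4, passes=10)

# Per-document topic mixture, analogous to the probabilities in Table 1.
print(lda.get_document_topics(bow[0]))
# Top words per topic, analogous to the "Top 5 words" rows.
for topic_id in range(4):
    print(lda.show_topic(topic_id, topn=5))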
Another advantage of using manual coding is the ease with which
humans can categorize status updates containing images and videos.
Consider the following tweet which points to images of Apple
products.
Tweet: Ph.D.,new post: iGrill, iRig & Mikey
http://ow.ly/3MDpJ
The raters looked at the images posted at the above web link,
identified them as images of technology gadgets, and classified the
tweet as Technology. LDA, on the other hand, assigned two different
categories. Looking at the top five words in them, neither of these
categories could be tagged as technology. Based on these findings,
we decided on manual coding for tweet categorization.
One of our main guiding principles behind manual coding was
to perform a deeper content analysis, as opposed to a superficial
analysis on a large dataset.
3.5 Trace-Based Approach
Our current technique is effectively a close study of the traces left
by users in the Yahoo! search engine and in Twitter. Our choice of
a trace-based approach over recruiting participants and interviewing
them about their search and tweet behavior might at first appear to be
a shortcoming of our work. However, our rationale was based on
results reported by two studies, one contradicting the other.
A topical study [2] of an entire week's query logs from a search
engine indicated that the most popular queries were shopping (13%),
entertainment (13%), Research & Learn (9%), Computing (9%),
health (5%), travel (5%) and Personal & Finance (3%). However,
when queries were collected by making users aware of the collection
process and promising anonymity of their web usage behavior,
Celebrity queries were reported to be most prevalent (45.95%) [52].
This is likely because participants are not comfortable reporting their
searches related to financial and health issues or pornography [37].
Similar differences have also been reported in comparing naturally
occurring Q&A data with self-reported Q&A behavior [42]. While
self-reporting identified Technology as the most popular question
topic, the trace-based approach showed Entertainment, followed by
Personal and Health, as the popular topics. To avoid participant
response bias, we chose a trace-based approach.
One potential concern with this approach is that our sample might
contain tweets from both ordinary users and organizational accounts.
To get an approximate estimate of the bias in our sample, we picked
a sample of 100 user accounts who had tweets in our randomly
selected sample. We looked up each of their Twitter user profiles,
searching for signs of an ordinary user account: we read their profile
description and the first few tweets on their profile home page. If we
were able to determine that the account belonged to an individual user,
we labeled it as a user account; otherwise we left it unlabeled. This
process labeled 81% of the accounts as user accounts, suggesting a bias
small enough not to have any significant effect on our results. Moreover,
prior research on classifying Twitter users into ordinary and elite
users found that only about 5% of tweets received by ordinary users
are from organizations [54].
3.6 Coding
Two human raters independently categorized the 1,000 tweets and
1,000 queries in accordance with the coding scheme in Table 2. A
total of 673 conflicts were detected: 313 among tweets and 360
among queries. The inter-rater reliability measured using Cohen's
κ was 0.64, with point-wise agreement at 0.66. As per Altman's [1]
interpretation of κ, a value between 0.6 and 0.8 is considered to be
good agreement. However, we decided to undertake one round of
conflict resolution. The two raters together revisited a portion of
the conflicts to identify possible causes. Four conflicting data points
were selected from each of the 18 categories, from both the tweet set
and the search queries. Thus 144 of the 673 conflicts (roughly 20%)
were revisited.
Disagreements stemmed from two main sources. First, several tweets
had broken links. While the first rater simply classified these under
Miscellaneous, the second rater tried to find context from the words
preceding the link. They agreed to follow the latter approach for a
subsequent round of independent re-rating of the conflicts. Second,
there were tweets and queries which could belong to more than one
category. For example:

"that concert was great! :) now, i think im gonna AT-
TEMPT to read my biology chapter for this quiz &
work on my lab report."

The first rater assigned the category Education, while the second
focused on the first half of the tweet and tagged it as Entertainment.
Another example is the search query "2012 super bowl tickets."
This was categorized simultaneously as Sports and as Shopping.
To summarize, conflict resolution was difficult because of lack of
context, a common problem for short texts like tweets and queries.
Following Naaman et al.'s conflict resolution approach [40], we
retain the category interpretations of both raters. However, when
a category assignment is Miscellaneous, preference is given to the
rater who assigned a more specific category. Using this conflict
resolution method did not result in over-coding since, on average,
tweets and queries had 1.127 and 1.165 codes each, respectively.
Post conflict resolution, the raters independently re-rated the remaining
conflicts and reported κ = 0.74, with point-wise agreement at 0.76.
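
For reference, the reliability statistics above can be computed as in the
minimal sketch below, which assumes each item has been reduced to one
primary label per rater; the labels shown are toy data, not ours.

# Cohen's kappa and point-wise agreement between two raters.
from sklearn.metrics import cohen_kappa_score

rater1 = ["Sports", "Food", "News", "Shopping", "Education"]
rater2 = ["Sports", "Food", "Education", "Shopping", "Education"]

kappa = cohen_kappa_score(rater1, rater2)
agreement = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
print(f"kappa = {kappa:.2f}, point-wise agreement = {agreement:.2f}")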
3.7 Limitations
Our work has some limitations, however. For example, in comparing
the topical bounds of a widely used SNS (Twitter), we have limited
our study to only one type of interaction: public tweets, which
differ from the more private interactions taking place on Facebook,
Google+, or even protected tweets. Later, we discuss opportunities
for future research based on the limitations of this study.
4. RESULTS
We calculated the proportion of these categories in our random
sample of tweets and queries, and then performed χ² tests of in-
dependence. Figure 1 presents the results. Since we are testing for
differences in 18 categories simultaneously, we need a Bonferroni
correction, reducing the significance level to α = 0.05/18 ≈ 0.003.
We found non-random differences in 11 of the 18 categories, with
p < 0.003.
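
Each per-category comparison is an ordinary 2x2 χ² test of independence
(in category vs. not, tweets vs. queries). A sketch of the procedure
follows; the counts are illustrative values in the spirit of Figure 1,
not our exact coded data.

# Per-category chi-squared tests with a Bonferroni-corrected threshold.
from scipy.stats import chi2_contingency

n_tweets, n_queries = 1000, 1000
category_counts = {
    "Entertainment": (237, 141),  # (tweet count, query count), illustrative
    "Shopping": (36, 158),
}

alpha = 0.05 / 18  # Bonferroni correction across 18 simultaneous tests

for name, (t, q) in category_counts.items():
    # Rows: tweets vs. queries; columns: in category vs. not in category.
    table = [[t, n_tweets - t], [q, n_queries - q]]
    chi2, p, dof, _ = chi2_contingency(table, correction=False)
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name}: chi2({dof}, N = 2000) = {chi2:.2f}, p = {p:.4g} ({verdict})")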
4.1 Entertainment
We see that people tweet about entertainment far more than they
search for it, χ²(1, N = 2000) = 29.44, p < 0.003. This finding
echoes previous research studying question asking and
answering (Q&A) behavior on Twitter [42]. We categorized all
searches and tweets about music videos, lyrics, movies, shows and
parties as entertainment. Rentfrow and Gosling [46] showed that
people use music preferences to communicate information about their
identity to an audience, and the audience in turn uses this information
to paint a picture of the individual's personality. We believe that similar
dynamics come into play in other forms of entertainment: videos,
movies and parties. Using Twitter as a platform, people often try
to portray their entertainment preferences so that observers can
form an impression of them. On the other hand, search queries
related to entertainment reflect a user's intent to actually reach a
particular music video or gain information about a show of
interest. Compare the following examples:
Query: if you could see me now song
Tweet: #nowplaying Ryan Shaw - It Gets Better
In the query, the user is searching for a song which interests her
and need not worry about identity management. We return to the
interpretations of these results later.
Business/Finance
  Includes: Business and finance news, lottery, betting, banking, money,
  ecommerce, exchange rates, companies, stocks, startup ideas
  Example tweets: "Saturday Sourcebook: The 10 best innovation links of
  the past week"
  Example queries: lendingtree com; chase bnk; memacottage auction;
  louisiana lottery

Travel/Recreation
  Includes: Traveling, camping, leisure activities, hobbies, google maps,
  tweets with geolocation
  Example tweets: "What are you doing in Omaha for fun?"
  Example queries: whale watching california coast trips; Beach Hotel
  Thailand

Celebrity
  Includes: Celebrity @-mentions, celebrity searches
  Example tweets: "I dare @justinbieber"
  Example queries: what is john travolta nationality

Education
  Includes: University, college, research, class activities, tutoring,
  searching for educational material, library, history, arts, museums,
  galleries
  Example tweets: "I'm so excited to be teaching in Kidstown"; "It was a
  great school nonetheless... just way too costly"
  Example queries: different theories of learning; niagara university
  athletics

Entertainment
  Includes: Music, video, shows, movies, night clubs, sites for video
  (e.g., YouTube); tweeting nw (now watching), np (now playing)
  Example tweets: "I'm at Chicago Theatre"; "#nowplaying Tompi Love";
  "Just saw the preview for #CedarRapids. Looks super awesome!!"
  Example queries: great day to be alive lyrics; Do Something! Awards;
  YouTube; robot chicken streaming

Gaming
  Includes: Gaming websites, tweeting about games, levels achieved, new
  games
  Example tweets: "I unlocked the Focus Attack Sage achievement!"
  Example queries: Sky Fire Fighter games; game breaker TV; world of
  warcraft

Food
  Includes: Tweets mentioning what one is eating, drinking, cooking,
  having for breakfast, lunch, dinner. Queries for restaurants, food items
  Example tweets: "just hadd the bestest chinese food ever"; "@Maryy_xx3
  I love coffee especially hazelnut"
  Example queries: steves house of pizza; hummus; salmon with caper;
  baking diffrence between small and large muffins

Health/Beauty
  Includes: Medications, treatments, disorders, beauty tips, beauty
  solutions
  Example tweets: "Yea bit**, I drnk slim fast for breakfast! #realtalk"
  Example queries: How to lower cortisol levels

News
  Includes: General news, disasters, accidents, politics
  Example tweets: "1 dead in building mishap in Byculla
  http://bit.ly/fN46XR"
  Example queries: turkey newspaper

Profanity
  Includes: Use of profane language, swearing, cursing in searches and
  tweets
  Example tweets: "Not picky but my bitch gotta be dope to compliment the
  swagg"
  Example queries: asian girl shit

Pornography
  Includes: Searches and tweets about pornographic sites and content
  Example tweets: "imagine live without sucking titties"
  Example queries: adult chat & cams

Search
  Includes: Navigation to search engines
  Example queries: bing; google

Shopping
  Includes: Searching for products, gadgets, services, tweeting about
  purchases (e.g., "I bought this", "going to the mall")
  Example tweets: "im gettin 8 racks by ummm"; "Rented the social network
  and despicable me."
  Example queries: christian dior collection; austin area furniture stores

Advertisement
  Includes: Discounts, sales, coupons
  Example tweets: "Daily Mobile News: Verizon is Dropping Prices on..";
  "$75 for $150 Color Gloss"
  Example queries: dream horse ads; office depot ad

Sports
  Includes: Sports news, player and sports team mentions, sport events
  Example tweets: "RT @iMJody: #RealWomen play football"; "Leaning toward
  the Steelers"
  Example queries: sydney cup 2007; harrisburg game show; New England
  Patriots

Weather
  Includes: Weather news and information, querying for weather sites and
  channels
  Example tweets: "Can already smell summer"; "WEATHER: Saint Louis,
  Missouri Weather: 29F Mist"
  Example queries: 29 degree tavern fort worth; main climate; weather.com

Technology
  Includes: Computers, internet, iPhone, apps, software, hardware,
  hacking, troubleshooting
  Example tweets: "Nook Color Runs Android 3.0 Honeycomb Upsidedown..."
  Example queries: find and delete items in excel; how to hack yahoo;
  merge csv using command

Miscellaneous
  Includes: Tweets and queries which did not fall in the above categories
  Example tweets: "Someone take me on a cute date"; "Oh my god, the dog
  is wearing Sun Glasses. I did NOT see that coming!!"
  Example queries: at most of the time; ghosts in texas; hi how are you

Table 2: Coding scheme for tweets and search queries along with
examples from each category. Codes are not mutually exclusive.
[Figure 1: Tweets and search query proportions for each of the
categories. χ² tests of independence show that there is a signifi-
cant difference in 11 of the 18 categories (p < 0.003).]
4.2 Shopping and Advertisement
We found that shopping has a major presence in search queries
compared to tweets, χ²(1, N = 2000) = 83.58, p < 0.003, while
the converse holds for advertisements, χ²(1, N = 2000) = 101.14,
p < 0.003. Advertising is a persuasive communication medium,
with the goal of walking the customer from unawareness to aware-
ness of the product, ultimately concluding in a purchase [12]. Search
queries like "nike womens boots" and "walmart electric heaters"
reflect a user's awareness of these products and probably an intent
to buy them. Search engines make the best use of a user's intent to
target relevant ads. Thus people almost never search for
ads, but are exposed to relevant ones via algorithms.
The scenario is different on sites like Twitter. Twitter users are
exposed to ads in the form of tweets from friends and followers
talking about a product (implicit ads), or promoted tweets and trends
from organizations. These tweets may not be relevant to the user but
they are exposed to them nonetheless. A recent Forbes news column
entitled "GM Says Facebook Ads Don't Work, Pulls $10 Million
Account" set off a large online discussion about the effectiveness of
search versus social ads, mostly revolving around user intent [38].
We wade into this argument later in the Discussion section.
Often tweets document product purchases. They can inform us
about consumer action and perhaps be used as a direct measure of
the effectiveness of ads.
Tweet: Jus bought a pair of jays!
Tweet: I saw a bag similar to the bag of Ate kat from Ilocos
and so happy I purchased it!!
4.3 News and Sports
Twitter has played an important role in transforming the way news
stories get reported. For example, citizen journalists have used
Twitter to convey information otherwise difcult to convey via
conventional news channels [34]. In fact, breaking news [43] has
spread through Twitter well before traditional media published
it [24]. A larger proportion of tweets than search queries fall under
the News category, χ²(1, N = 2000) = 29.33, p < 0.003, perhaps
reflecting Twitter's "What's happening?" prompt. An interesting
point to note is how queries for news differ from tweets conveying
news. Consider the following example:
Tweet: RT @BreakingNews: 19 private jets carrying families
of wealthy #Egypt and Arab businessmen leave Cairo, go to
Dubai AP, Al Jazeera
Query: bbc arabic live
Both convey timely information about the unrest in the Arab world,
but the query leads the user to a one-stop destination for more
detailed and varied news. Similar to news, sports events are actively
discussed on Twitter, and we find a significantly higher number of
tweets about sports than searches, χ²(1, N = 2000) = 9.75, p <
0.003, perhaps reflecting the high decay-rate of sports information.
4.4 Food
Lending credence to a popular Twitter stereotype, we observe a
higher proportion of tweets about food as compared to searches,
χ²(1, N = 2000) = 26.47, p < 0.003. For example, consider the
following tweets:

Tweet: It will be love at first bite with these Chocolate and
Raspberry Cream Tarts.

Tweet: Bout To Bless My Stomach w/ This Chick-fil-A
Chicken Sandwhich #mhmm

One interpretation is that people talk about the good things they eat
more than they look for good things to eat. What motivates people
to talk about food on Twitter? Miyazaki [32] performed a series of
studies to discover whether knowledge of food is shared socially and
whether people use it to maintain relationships. He found positive
results on both points. Our findings corroborate these results.
4.5 Profanity and Pornography
Using profane content is a common practice in online communities,
and Twitter is no exception. What is perhaps surprising is that
profanity is almost always targeted at an audience (Twitter) and
rarely used in queries, χ²(1, N = 2000) = 22.51, p < 0.003.
Going back to the example quoted in an earlier section, we see how
a user expresses his frustration through his tweet, something which
we almost never encounter in search behavior.
Tweet: Sleeping pattern f***ed!! [asterisks in original]
Query: dyssomnia
This behavior pattern is flipped in the Pornography category,
χ²(1, N = 2000) = 43.66, p < 0.003.
4.6 Business & Finance
The Business/Finance category comprises tweets and search queries
related to business, money, banks and finances. We found a sig-
nificant difference between tweets and search queries belonging to
this category, χ²(1, N = 2000) = 10.38, p < 0.003. We return to its
implications later in the Discussion section.
4.7 Technology
Tweets and queries mentioning electronic and digital products, soft-
ware and hardware applications, and questions about troubleshooting
technological glitches have been grouped under Technology. We
see a higher percentage of queries than tweets in this category,
χ²(1, N = 2000) = 17.41, p < 0.003. While we see many
references to technical troubleshooting in search queries, we rarely
see them in tweets.
4.8 Search
Another, perhaps more obvious, category where we find a sweeping
presence in queries is Search, χ²(1, N = 2000) = 98.75, p <
0.003. These are mainly navigational queries [5], where the user's
intention is to reach the top level of a website they already have in
mind, such as searching for care.com or facebook.
4.9 Similar Topics
We find statistically indistinguishable behavior in 6 of the 18 categories
(excluding Miscellaneous). People tweet about celebrities and they
also search for celebrity news, photos, videos and styles. People
extensively use search engines to plan their vacations (e.g., "grand
canyon vacations lake resorts"). They also tweet about it once they
get there (e.g., "Had an excellent trip to the Somme this week").
In Health/Beauty, tweet and query proportions did not show sig-
nificant differences either. But it was illuminating to see queries
oriented towards finding information on medications, treatments
and diseases (e.g., "ringworm treatment"); tweets, on the other hand,
focused on maintaining a healthy and youthful appearance (e.g.,
"My eyelashes are falling out at an alarming rate. V concerned I will
wake up tomorrow with a bald eye"). Finally, we found it difficult to
classify about 9% of the queries and 4% of the tweets in the corpus,
labeling them as Miscellaneous.
5. DISCUSSION
Our study reveals the topical distribution emerging from two differ-
ent, massive internet systems: 1) Twitter, where words are visible to
an audience; and, 2) a search engine, where the sole interaction is
with a machine and the focus is information seeking. We see a large
proportion of tweets in the Entertainment category, mostly talking
about what someone is listening to or watching, and whether they
like it. People talk more about food and cooking than they look for
good things to eat. They enthusiastically tweet and retweet breaking
news and live sporting events. This behavior is perhaps an attempt
at self-verification [7], verifying whether others attribute the same
meaning to an individual's role performance.
Certain topics seem more appropriate for public consumption than
others. Compared to how often they search for them, people rarely
talk about certain topics on Twitter. For example, in the Shopping
and Business/Finance categories we see lower percentages of tweets
than search queries. Perhaps this is due to the stigma sometimes
attached to brazenly pursuing material possessions. Individuals
might not want to portray an image of a materialistic, self-centered
person to their Twitter audience. Or they may not want to reveal
the extent of their wealth [47]. We see this as a potentially deep area
for future research.
We also see similar phenomena in the Technology category. We
find several instances of search queries where people look for ways
to troubleshoot a problem. However, people tweet about these issues
at a much lower rate. Are people hesitant to admit their lack of technical
know-how in front of their Twitter followers despite the help they
may receive, or is it because they are more confident they will find a
better answer through a search engine? Or Twitter might present a
challenge for describing a problem in 140 characters, in contrast to
the well-developed language of short search queries.
5.1 Practical Implications
These findings provide a starting point for understanding what kinds of
things people say and what kinds of things they ask search engines. The
focus here is understanding more deeply the context around search and
social media: the intent of a social post versus the intent of a query.
The same text can represent different intents, which presents new
challenges for how algorithms and systems address the structure of
the content itself [31]. Consider the design of a question answering
system dealing with questions specific to the technological domain.
An important first step in such a system is question processing. If the
designer builds this component from technical troubleshooting questions
posed in SNSs, she will rarely find such questions there,
and the system will likely perform poorly. The scenario will be
similar when the domain is switched from Technology to Business &
Finance. If a business & financial advice knowledge base is built
from questions people ask on SNSs, the system would likely
only cover a small subset of the things people care about.
We saw that people query extensively for shopping, and tweet their
purchases and product preferences and even advertise products, but
never search for advertisements. Consider the following tweet:

Tweet: Just ordered myself a skirt from urban outfitters . . . when
I get some more money in feb im buying some braces too
. . . bring it on!

The user not only tells us about her recent purchase, but also talks
about her future purchasing intentions. Perhaps her recent invest-
ment in products from Urban Outfitters and her future intention
to buy braces can serve as useful inputs to an advertising system
for targeting relevant ads. Advertising systems often match the user-
entered Web search query against keywords associated with ads for
content-targeted advertising [28]. The challenge associated with
that approach is identifying relevant ads using queries which are
often short and lack context [6]. Query expansion using external
sources of knowledge, specifically Web search results and a large
taxonomy of commercial topics, has been suggested as a way to
address this challenge [6]. We believe that Twitter could be an
excellent source of information for query augmentation, especially
for those topics which are more often tweeted about than queried
for on search engines.
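
As a toy illustration of the kind of augmentation we have in mind, one
could expand a short commercial query with terms that co-occur with it
in a tweet corpus. The sketch below is deliberately naive, with a
hypothetical corpus and simple term overlap; a real system would need
tokenization, ranking, and spam filtering far beyond this.

# Toy tweet-based query expansion: append the terms that most often
# co-occur with the query's terms in a (hypothetical) tweet corpus.
from collections import Counter

tweets = [
    "just ordered a skirt from urban outfitters love it",
    "urban outfitters sale on dresses and skirts",
    "buying braces next month bring it on",
]

def expand_query(query, corpus, top_k=3):
    query_terms = set(query.lower().split())
    cooccurring = Counter()
    for tweet in corpus:
        terms = tweet.lower().split()
        if query_terms & set(terms):  # tweet mentions the query
            cooccurring.update(t for t in terms if t not in query_terms)
    return list(query_terms) + [w for w, _ in cooccurring.most_common(top_k)]

print(expand_query("urban outfitters", tweets))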
Perhaps most importantly, consider systems like Bings social search.
It prompts users to share searches with Facebook friends, irrespec-
tive of the genre of the search query. Given what we have found
regarding the distribution of topics like Business/Finance, should a
system prompt a user to broadcast a query such as the following?
"what does it mean if bank of america is taking up to 3
months to see if person qualifies for a mortgage"

Based on the shared social information, Bing also finds friends in
your social network who have knowledge relevant to the search
query. This approach may work best when the search query matches
up with something people often talk about on SNSs. Our findings
How can we harness the richness of public discourse without com-
promising privacy? Perhaps systems such as these could consider
allowing a user to post certain search queries anonymously to an
SNS. The three characteristics of social translucence theory [14]
visibility, awareness, and accountability would work in interesting
ways in such a system design. The person posting an anonymous
query is not visible to his social connections, but his connections
are aware that it was issued by someone within their social network.
This might permit discussion about otherwise off-limits topics, like
money, shopping, etc. However, will there be accountability attached
to answering a question like this?
5.2 Theoretical Implications
We believe our work makes two important contributions to existing
theory in cross-site CMC research. First, we show how topics differ
between two massive internet systems. This sets up an interesting
follow-on question: Why do such differences occur? While we often
speculate in this paper about personal motivations behind the data
we observe, observational quantitative studies like this one usually
cannot establish motivation. This is likely best tackled with interview
techniques.
We saw that people are more likely to talk about some topics than
search for them. The topical differences were drawn without respect
to individual differences on Twitter: for example, how might these
results differ with audience size? This gives way to new research
questions: What topics would be more popular if people could control
their audience or direct messages towards specific constituencies
(e.g., Google+ circles)? We need more work on problems like these
to see how differences like the ones we present here manifest under
variations like audience size or composition.
5.3 Limitations
One limitation of our study is that the demographics of the two
channels do not completely overlap [17, 50]. At the same time,
they are by no means entirely disjoint. We hope that future work
will look into overcoming this limitation by finely segmenting user
populations. Moreover, further research needs to be done to explore
deeper questions around motivation and intent. Also, our current
study is limited to highly public SNS interactions. The broad range
of social interactions, varying from highly public to very private,
is left unexplored. This suggests future work should examine how
the topical space varies as a function of privacy in SNSs; however,
that would require privileged data access.
6. CONCLUSION
Our study addresses the behavioral differences and similarities be-
tween two widely used internet systems: a search engine and Twitter.
It examines how topics in these systems compare and contrast and
draws a connection between what people say versus what they
search for. Addressing a broader issue of cross-site studies in CMC
research, this work bridges a gap between two important internet
systems, and can inform modern CMC research as well as the design
of social search. We hope that this study motivates future researchers
to find the intrinsic reasons behind the differences we observe.
7. ACKNOWLEDGEMENTS
We would like to thank various colleagues for reviewing early drafts
of this work. Also, we extend our gratitude to the funders of this
work.
8. REFERENCES
[1] Douglas G. Altman. Practical Statistics for Medical Research.
Chapman and Hall, 1991.
[2] Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury, David
Grossman, and Ophir Frieder. Hourly analysis of a very large
topically categorized web query log. In Proc. SIGIR, 2004.
[3] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent
dirichlet allocation. J. Mach. Learn. Res., 2003.
[4] d. boyd and N. B. Ellison. Social network sites: Definition,
history, and scholarship. Journal of Computer-Mediated
Communication, 13(1):210–230, 2008.
[5] Andrei Broder. A taxonomy of web search. SIGIR Forum,
36(2):3–10, September 2002.
[6] Andrei Z. Broder, Peter Ciccolo, Marcus Fontoura, Evgeniy
Gabrilovich, Vanja Josifovski, and Lance Riedel. Search
advertising using web relevance feedback. In Proc. CIKM,
pages 1013–1022, 2008.
[7] Peter J. Burke and Anna Riley. Identities and
self-verification in the small group. Social Psychology
Quarterly, 58(2):61–73, 1995.
[8] Declan Butler. When google got flu wrong. Nature,
494(7436):155, 2013.
[9] Michael D. Byrne, Bonnie E. John, Neil S. Wehrle, and
David C. Crow. The tangled Web we wove: a taskonomy of
WWW use. In Proc. CHI, pages 544–551, 1999.
[10] Mike Cassidy and Matthew Kulick. An update to google
social search. http://googleblog.blogspot.com/
2011/02/update-to-google-social-search.html,
2011. Accessed: 2/2013.
[11] Andy Cockburn and Steve Jones. Which way now? analysing
and easing inadequacies in www navigation. International
Journal of Human-Computer Studies, 45:105–129, 2000.
[12] Russell H. Colley. Defining Advertising Goals for Measured
Advertising Results. Assoc. of Natl. Advertisers, 1961.
[13] Samantha Cook, Corrie Conrad, Ashley L. Fowlkes, and
Matthew H. Mohebbi. Assessing Google Flu Trends
performance in the United States during the 2009 influenza
virus A (H1N1) pandemic. PLoS One, 6(8):e23610, 2011.
[14] Thomas Erickson and Wendy A. Kellogg. Social translucence:
an approach to designing systems that support social
processes. ACM Trans. Comput.-Hum. Interact., March 2000.
[15] Brynn M. Evans and Ed H. Chi. Towards a model of
understanding social search. In Proc. CSCW, pages 485–494,
2008.
[16] Brynn M. Evans, Sanjay Kairam, and Peter Pirolli. Do your
friends make you smarter?: An analysis of social strategies in
online information seeking. Inf. Process. Manage., November
2010.
[17] Evergreen Consulting Group. Search engine demographics
for 2010. http://evergreendirect.com/index.php/2010/02/search-engine-
demographics-for-2010, 2010. Accessed September 19, 2012.
[18] Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel,
Lynnette Brammer, Mark S. Smolinski, and Larry Brilliant.
Detecting influenza epidemics using search engine query data.
Nature, 457(7232):1012–1014, 2008.
[19] E. Goffman. The Presentation of Self in Everyday Life, 1959.
[20] Luis Gravano, Vasileios Hatzivassiloglou, and Richard
Lichtenstein. Categorizing web queries according to
geographical locality. In Proc. CIKM, 2003.
[21] B. Hecht, J. Teevan, M. R. Morris, and D. Liebling.
SearchBuddies: Bringing search engines into the
conversation. In Proc. ICWSM, 2012.
[22] C. Honey and S. C. Herring. Beyond microblogging:
Conversation and collaboration via twitter. In Proc. HICSS,
pages 1–10, 2009.
[23] Liangjie Hong and Brian D. Davison. Empirical study of topic
modeling in twitter. In Proc. SOMA, 2010.
[24] Mengdie Hu, Shixia Liu, Furu Wei, Yingcai Wu, John Stasko,
and Kwan-Liu Ma. Breaking news on twitter. In Proc. CHI,
pages 2751–2754, 2012.
[25] Scott B. Huffman and Michael Hochster. How well does result
relevance predict session satisfaction? In Proc. SIGIR, pages
567–574, 2007.
[26] Rosie Jones and Kristina Lisa Klinkner. Beyond the session
timeout: automatic hierarchical segmentation of search topics
in query logs. In Proc. CIKM, 2008.
[27] In-Ho Kang and GilChang Kim. Query type classification for
web document retrieval. In Proc. SIGIR, 2003.
[28] Anísio Lacerda, Marco Cristo, Marcos André Gonçalves,
Weiguo Fan, Nivio Ziviani, and Berthier Ribeiro-Neto.
Learning to advertise. In Proc. SIGIR, pages 549–556, 2006.
[29] David Laniado and Peter Mika. Making sense of twitter. In
Proc. ISWC, pages 470–485, 2010.
[30] Yu-Ru Lin, Hari Sundaram, Munmun De Choudhury, and
Aisling Kelliher. Discovering multirelational structure in
social media streams. ACM Trans. Multimedia Comput.
Commun. Appl., 8(1):4:1–4:28, February 2012.
[31] Peter Mika. Making things findable: semantics for web search
and online media. In Proc. WIMS, pages 3:1–3:2, 2011.
[32] Yoshihiko Miyazaki. Social knowledge of food: How and why
people talk about foods, 2008.
[33] Robert J. Moore, Elizabeth F. Churchill, and Raj Gopal Prasad
Kantamneni. Three sequential positions of query repair in
interactions with internet search engines. In Proc. CSCW,
pages 415–424, 2011.
[34] E. Morozov. Iran elections: A twitter revolution?, June 2009.
[35] Meredith Ringel Morris. A survey of collaborative web search
practices. In Proc. CHI, 2008.
[36] Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich.
A comparison of information seeking using search engines
and social networks. In Proc. ICWSM, 2010.
[37] Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich.
What do people ask their social networks, and why?: a survey
study of status message q&a behavior. In Proc. CHI, 2010.
[38] J. Muller. GM says Facebook ads don't work, pulls $10
million account. Forbes, http://onforb.es/LO5Hur, 2012.
Accessed September 19, 2012.
[39] Mor Naaman. Social multimedia: highlighting opportunities
for search and mining of multimedia data in social media
applications. Multimedia Tools Appl., 56(1):9–34, January
2012.
[40] Mor Naaman, Jeffrey Boase, and Chih-Hui Lai. Is it really
about me?: message content in social awareness streams. In
Proc. CSCW, 2010.
[41] Brendan O'Connor, Ramnath Balasubramanyan, Bryan
Routledge, and Noah Smith. From tweets to polls: Linking
text sentiment to public opinion time series. In Proc. ICWSM,
2010.
[42] Sharoda A. Paul, Lichan Hong, and Ed H. Chi. Is twitter a good
place for asking questions? a characterization study. In Proc.
ICWSM, 2011.
[43] Swit Phuvipadawat and Tsuyoshi Murata. Breaking news
detection and tracking in twitter. In Proc. WI-IAT, pages
120–123. IEEE Computer Society, 2010.
[44] K. Purcell. Search and email still top the list of most popular
online activities. Pew Internet & American Life Project, 2011.
[45] Daniel Ramage, Susan T. Dumais, and Daniel J. Liebling.
Characterizing microblogs with topic models. In Proc.
ICWSM, 2010.
[46] Peter J. Rentfrow and Samuel D. Gosling. Message in a
ballad: The role of music preferences in interpersonal
perception. Psychological Science, 17(3), 2006.
[47] S. Sengupta. Preferred style: Don't flaunt it in Silicon Valley.
New York Times, http://nyti.ms/N3sdkp, 2012. Accessed
September 19, 2012.
[48] David A. Shamma, Lyndon Kennedy, and Elizabeth F.
Churchill. Peaks and persistence: modeling the shape of
microblog conversations. In Proc. CSCW, 2011.
[49] David A. Shamma, Ryan Shaw, Peter L. Shafton, and Yiming
Liu. Watch what I watch: using community activity to
understand content. In Proc. MIR, pages 275–284, 2007.
[50] A. Smith. Twitter update 2011. Pew Research,
http://pewresearch.org/pubs/2007/twitter-users-cell-phone-
2011-demographics, 2011. Accessed September 19, 2012.
[51] Amanda Spink, Dietmar Wolfram, Bernard J. Jansen, and
Tefko Saracevic. Searching the web: The public and their
queries, 2001.
[52] Jaime Teevan, Daniel Ramage, and Meredith Ringel Morris.
#twittersearch: a comparison of microblog search and web
search. In Proc. WSDM, 2011.
[53] Sarah Vieweg, Amanda L. Hughes, Kate Starbird, and Leysia
Palen. Microblogging during two natural hazards events: what
twitter may contribute to situational awareness. In Proc. CHI,
pages 1079–1088, 2010.
[54] Shaomei Wu, Jake M. Hofman, Winter A. Mason, and
Duncan J. Watts. Who says what to whom on twitter. In Proc.
WWW, pages 705–714, 2011.
[55] Jiang Yang, Meredith Ringel Morris, Jaime Teevan, Lada A.
Adamic, and Mark S. Ackerman. Culture matters: A survey
study of social q&a behavior. In Proc. ICWSM, 2011.