Professional Documents
Culture Documents
Abstract—We consider the problem of ranking the popularity of items and suggesting popular items based on user feedback. User
feedback is obtained by iteratively presenting a set of suggested items, and users selecting items based on their own preferences
either from this suggestion set or from the set of all possible items. The goal is to quickly learn the true popularity ranking of items
(unbiased by the made suggestions), and suggest true popular items. The difficulty is that making suggestions to users can reinforce
popularity of some items and distort the resulting item ranking. The described problem of ranking and suggesting items arises in
diverse applications including search query suggestions and tag suggestions for social tagging systems. We propose and study
several algorithms for ranking and suggesting popular items, provide analytical results on their performance, and present numerical
results obtained using the inferred popularity of tags from a month-long crawl of a popular social bookmarking service. Our results
suggest that lightweight, randomized update rules that require no special configuration parameters provide good performance.
Index Terms—Popularity ranking, recommendation, suggestion, implicit user feedback, search query, social tagging.
1 INTRODUCTION
the history of tag selections.1 Fig. 1 shows an example user
W Econsider the problem of learning the popularity of
items that is assumed to be a priori unknown but has to interface to enter tags for a Web page.
be learned from the observed user’s selection of items. In The learning of item popularity is complicated by the
particular, we consider systems where each user is presented suggesting of items to users. Indeed, we expect that users
with a list of items (“suggested items”) and the user selects a would tend to select suggested items more frequently. This
could be for various reasons, for example, (least effort)
set of preferred items that can contain either suggested items
where users select suggested items as it is easier than
or any other items preferred by this user. The list of suggested
thinking of alternatives that are not suggested or (bandwa-
items would typically contain only a (small) subset of popular gon) where humans may tend to conform to choices of other
items. The goal of the system is to efficiently learn the users that are reflected in the suggestion set showing a few
popularity of items and suggest popular items to users. popular items. In practice, we find indications that such
Items are suggested to users to facilitate tasks such as popularity bias may well happen; see, for example, Sen et al.
browsing or tagging of the content. Items could be search [18] and Suchanek et al. [20]. In Fig. 3, we provide results of
query keywords, files, documents, and any items selected by our own user study that indicate users’ tendency to imitate.2
One may ask, if suggesting popular items seems proble-
users from short lists of popular items. A specific application
matic due to potential popularity disorder, why make
is that of tagging of the content where items are “tags” suggestions in the first place? This is for several reasons; for
applied by users to content such as photos (e.g., Flickr), example, suggestions may help recall what candidate items
videos (e.g., YouTube), or Web pages (e.g., del.icio.us) for are. A fix to avoid popularity skew would be to suggest all
their later retrieval or personal information management. candidate items and not restrict to a short list of few popular
items. This is often impractical for reasons such as limited
The basic premise of social tagging is that the user can choose
user interface space, user ability to process smaller sets
any set of tags for an information object according to her easier, and the irrelevance of less popular items. So, the
preference. In most existing social tagging applications, users number of suggestion items is limited to a small number
are presented with tag suggestions that are made based on (e.g., seven for the suggested tags in del.icio.us).
In this paper, our goal is to propose algorithms and
analyze their performance for suggesting popular items to
. M. Vojnovi c and D. Gunawardena are with Microsoft Research Ltd., users in a way that enables learning of the users’ true
7 JJ Thomson Avenue, CB3 0FB Cambridge, UK.
E-mail: {milanv, dinang}@microsoft.com. preference over items. The true preference refers to the
. J. Cruise is with the Department of Mathematics, Bristol University, preference over items that would be observed from the
University Walk, Royal Fort Annex, Bristol BS8 1TW, UK. users’ selections over items without exposure to any
E-mail: marjrc@bristol.ac.uk. suggestions. A simple scheme for ranking and suggesting
. P. Marbach is with the Bahen Center for Information Technology (BCIT),
University of Toronto, Room BA5232, Toronto, ON M5S 3G4, Canada. popular items (that appears in common use in practice)
E-mail: marbach@cs.toronto.edu.
1. We focus on the ranking of items where the only available information
Manuscript received 16 July 2008; revised 29 Nov. 2008; accepted 8 Dec. is the observed selection of items. In learning of the user’s preference over
2008; published online 16 Jan. 2009. items, one may leverage some side information about items, but this is out
Recommended for acceptance by Y. Chen. of the scope of this paper.
For information on obtaining reprints of this article, please send e-mail to: 2. The user study was conducted with about 400 participants that tagged
tkde@computer.org, and reference IEEECS Log Number a set of six Web pages. For a Web page, the user was either presented or not
TKDE-2008-07-0366. presented tag suggestions which enabled us to measure the frequencies of
Digital Object Identifier no. 10.1109/TKDE.2009.34. tag selections in the presence and absence of suggestions.
1041-4347/09/$25.00 ß 2009 IEEE Published by the IEEE Computer Society
1134 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 21, NO. 8, AUGUST 2009
Fig. 1. An example tag entry user interface. The user is presented with
recommended tags and selects any preferred tags by typing them in the
provided text box.
Fig. 2. User’s choice model: an item is selected over a set of eight items
presents a fixed number of the most popular items as with the set S ¼ f2; 4; 5; 7g of suggested items. With probability 1 pS ,
observed from the past item selections. We show analysis item is selected by sampling from the distribution r over the entire set of
that suggests such a simple scheme can lock down to a set of items, else, the same but confined to the items in the set S.
items that are not the true most popular items if the
popularity bias is sufficiently large, and may obscure the users—this is because we are interested in small
learning the true preference over items. In this paper, we suggestion sets and assume that we can allow for
propose alternative algorithms designed to avoid such randomized presentation order.
reinforcements and provide formal performance analysis of The design objective to learn the true popularity ranking
the ranking limit points and popularity of the suggested of items means that the ranking order induced by the
items. While convergence speed of the algorithms is of ranking scores ðtÞ at time t is the same as that induced by
interest, its formal analysis is out of the scope of this paper. the true popularity ranking scores r, as the number of users
In the remainder of the paper, we first define more t tends to be large. In other words, we want that for any two
precisely the problem that we consider (Section 2), provide items i and j, ri rj implies i ðtÞ j ðtÞ, for sufficiently
a summary of our results (Section 3), and discuss related large t. The design objectives are also to suggest true
work (Section 4). We then introduce the algorithms that we popular items and identify quickly the true popular items.
study (Section 5) and present our main analytical results Ideally, we would like the ranking order induced by ðtÞ to
(Section 6) and numerical results (Section 7). The proofs of conform to that induced by r after a small number of
our main results are given in the Appendix. selections. We characterize the precision of a set of items S
using the following metric. For a set S of size s,
2 PROBLEM FORMULATION jfi 2 S : ri rs gj
precðSÞ ¼ : ð1Þ
In this section, we present a more precise description of the s
problem of ranking and suggesting items that we consider, This is a standard information-retrieval measure of precision
and define a user’s choice model that we use for our analysis. [17], [21] defining the relevant items to be 1; 2; . . . ; v, where
2.1 Ranking and Suggesting Items v ¼ maxfi 2 C : ri rs g. Note that (1) values the same any
item i such that ri rs . Alternatively, one may consider
We consider the situation where users select items from a weighted precision that would value more items with larger
given set C :¼ f1; 2; . . . ; cg, where c > 1. We let frequencies, which we do not consider in this paper.
r ¼ ðr1 ; r2 ; . . . ; rc Þ We derive some of our analytical results assuming a
model for the user’s choice of items described in the
be the users’ true preference over the set of items C and following section.
call r the true popularity rank scores. For an item i, we
interpret ri as the portion of users that would select item i 2.2 User’s Choice Model
if suggestions were not made. We assume that the true We use the following model for how users choose items.
popularity rank scores r are such that: 1) ri is strictly Suppose a user is presented a set S of suggested items. The
positive for each item i, 2) items are enumerated such that user selects an item from the entire set of items by sampling,
r1 r2 rc , and 3) r is normalized such that it is a using the true item popularity distribution r, with prob-
probability distribution, i.e., r1 þ r2 þ þ rc ¼ 1. ability 1 pS . Otherwise, the user does the same but
An algorithm is specified by: 1) ranking rule: the rule that confines her choice to items in the suggestion set S. (See
specifies how to update the ranking scores of items that are Fig. 2.) In other words, we have
denoted by ¼ ð1 ; . . . ; c Þ and 2) suggestion rule: the rule
that specifies what subset of items to suggest to a user. We Prfselected item ¼ ij suggestion set ¼ Sg
assume that the size of the suggestion set is fixed to s, a ri 1i2S ð2Þ
positive integer that is a system configuration parameter. A ¼ ð1 pS Þri þ pS P :
j2S rj
notable difference with respect to ranking problems such as
that of web search results is that our ranking algorithms do We admit this simple and intuitive model in order to
not account for the order in which items are suggested to facilitate analysis under a model of user’s choice that biases
ET AL.: RANKING AND SUGGESTING POPULAR ITEMS
VOJNOVIC 1135
applications where the size of the suggestion set is limited 5.1 A Naive Algorithm
but the display of a larger set of popular is required. We first introduce the simple algorithm TOP which consists
Finally, we present numerical results obtained by of a ranking and a suggestion rule as defined below.
evaluating our analytical results using the popularity rank
scores of tags for bookmarks inferred from a month-long
crawl of the social bookmarking application del.icio.us.
4 RELATED WORK
The problem studied in this paper relates to the broad area
of recommendation systems (e.g., Kumar et al. [10] and
Sandler and Kleinberg [9]) in which the goal is to learn The ranking rule is to set the rank score of an item equal
which items are preferred by users based on the user’s to the number of selections of this item in the past. For this
selection of items. The important distinction of our problem algorithm and the algorithms introduced later, we initialize
is in that we consider a system with feedback, established
Vi ¼ 0 for each item i. The implicit assumption is that we
through suggestions. This may results in positive reinforce-
assume no prior information about the popularity of items,
ments similar in spirit to those of well-known preferential
and hence, initially assume that all items are equally
attachment [3], [5]. Another related area is that of voting
systems. Specifically, our system could formally be seen as popular.3 The suggestion rule sets the suggestion set to a
an instance of approval voting [8], [4] in that each user can set of the top s most popular items with respect to the
select any set of candidates offered on a voting ballot. current popularity rank scores.
Again, intrinsic to our problem is the continual feedback We will later identify cases when this simple algorithm
that indicates popular candidates. In the voting systems can get locked down to a ranking that induces a different
literature, it is well recognized that such feedback may bias ranking than that induced by the true popularity ranking r,
the voting results; see, e.g., Simon [19] for an analysis of the and thus, may fail to learn the true popularity of items. To
bandwagon and underdog effects induced by preelection overcome this problem, we consider in the following
polls. Our work is related to statistical learning problems of alternative ranking and suggestion rules.
the multiarmed bandit type (e.g., Lai and Robbins [11]),
described as follows: We consider a finite set of items. Each 5.2 Ranking Rules
user is presented with an item that is selected by this user In this section, we define two ranking rules called rank
with (unknown) probability specific to this item. The goal is rule 1 and rank rule 2.
to present items to the users such that the expected Rank rule 1. A simple ranking rule is the one that we
cumulative number of item selections is maximized. An already encountered in the algorithm TOP, where the rank
asymptotically optimal rule to decide which item to present score for an item i is incremented by 1 whenever a user
was found by Lai and Robbins [11] and was further selects this item.
extended by Anantharam et al. [1] to allow presenting
more than one item. Further related work includes that of Init: Vi ¼ 0 for each item i
using the click-through data to enhance the Web search At the t-th item selection:
results. Pandey et al. [15] and Cho et al. [6] studied the If item i selected:
entrenchment problem where the search engine result sets Vi Vi þ 1
lock down to a set of popular URLs and proposed to i Vi =t
intervene the results with randomly sampled URLs. An
We will see that this ranking may fail to discover the
important distinction of our problem to that studied in the
ranking order of the true popularity when combined with a
aforementioned prior work is that selection of items is not
suggestion rule that reinforces items that were selected
restricted to items presented to the user. Finally, we discuss
early on, as it is the case under TOP.
the research on social tagging as our numerical results are
provided specifically for this context. In [7], Golder and Rank rule 2. We noted that rank rule 1 may fail to
Huberman provide various statistical characterization discover the ranking order of the true popularity if used with
results on the tagging in the social bookmarking application suggestion rules such as TOP. To overcome this problem, we
del.icio.us. Sen et al. [18] studied the effect of the tag may redefine the rank scores in the following way.
suggestions on the users’ choice of tags in MovieLens Init: Ti ¼ 0, Vi ¼ 0, for each item i
systems, which they instrumented for experiments. Their
At the t-th item selection:
results suggest that tags applied by users are affected by tag
For each item i:
suggestions. Xu et al. [23] proposed a system for recom-
mended tags but their work differs from the goal of this If item i not suggested:
paper. Finally, we refer to Suchanek at al. [20] for an Ti Ti þ 1
estimation procedure of the imitation rate defined in this If item i selected:
paper and estimates for tagging of Web pages scenario. Vi Vi þ 1
i Vi =Ti .
5 ALGORITHMS 3. In practice, one may use prior information about item popularity. For
example, in social bookmarking applications, information from the key-
We provide a precise definition of the algorithms for ranking words metadata and content of Web pages can be used to bias the initial tag
and suggesting items that we consider in this section. popularity.
ET AL.: RANKING AND SUGGESTING POPULAR ITEMS
VOJNOVIC 1137
Now, the rank scores are updated only for an item that greater than one item, M2S is different from suggesting the
was selected and was not suggested to the user. The last distinct used items due to the random eviction of items
ranking score i for an item i can be interpreted as the rate from the suggestion set, but note that the rule does bias to
at which item i is selected over selections for which item i presenting recently used items. We will show how exactly
was not suggested. We have the following result: this update rule tends to bias the sampling of the suggestion
set with respect to true popularity rank scores of items. As
Lemma 1. Consider any suggestion rule combined with the rank
an aside, note that M2S relates to the self-organized sorting
rule 2 under the only assumption that each item exits the
of items known as move-to-front heuristic (e.g., [16]).
suggestion set infinitely often. Then, under the user’s choice It follows from the description of the suggestion rule M2S
model: limt!þ1 ðtÞ ¼ r. that any item would recurrently appear in the suggestion
set, provided only that it is recurrently selected by users
The lemma follows by noting that under the user’s choice with some positive probability (no matter how small). Our
model, an item i that is not suggested is selected with next algorithm is similar in spirit to M2S, but is designed so
probability proportional to ri . The result tells us that under that an item appears recurrently in the suggestion set only if
the user’s choice model, rank rule 2 combined with a sufficiently popular, with respect to its true popularity. This
suggestion rule from broad set, guarantees to learn the true lockdown feature characterizes the simple algorithm TOP,
popularity ranking of items. The only assumption is that but note that our goal is to ensure that the lockdown is to a
suggestion rule is such that each item exits the suggestion set subset of true top popular items. We call this new algorithm
with a probability that is lower bounded by a positive (but FM2S (frequency move to set) for the reasons that we
possibly very small) constant. However, rank rule 2 might discuss shortly; the algorithm is defined as follows.
have slow rate of convergence as the rank scores are
updated only over a subsequence of item selections that
were not suggested. For this reason, we will focus on rank
rule 1 combined with different suggestion rules and
consider robustness in learning the true popularity ranking.
Theorem 1. Consider the algorithm TOP under the user’s choice 1. Limit ranking. Under the user’s choice model, from
model with arbitrarily fixed tie break of item ranks. any initial ranking scores ð0Þ, the ranking scores ðtÞ
converge to the true ranking scores r if and only if the
1. Top-popular sets. Suppose the user imitation prob- imitation probability satisfies
ability is p < 1. Given r and s, the top popular
rankings are induced by the ranking scores given by 1
p< ; ð5Þ
( ðAÞ
ð1 pÞri þ p Pri r ; i 2 S;
i ðSÞ ¼ j2S j ð3Þ where ðAÞ is the spectral radius of the matrix A ¼
ð1 pÞri ; i 2 C n S; ðaij Þ defined by
where S is a set of s items that satisfies 1 X 1
! aij ¼ ri c1 P ; ð6Þ
X k2S 0 rk
maxj2CnS rj p s1 S 0 2Ss : fi;jg S 0
rj 1 : ð4Þ
j2S
min r
j2S j 1 p for i; j 2 C, where Ss contains all subsets of s distinct
items from C.
2. Suggestion set. Under condition (5), the limit
2. Uniqueness. The set f1; 2; . . . ; sg is a unique top frequency at which an item i is suggested to users is
popular set if and only if the imitation probability p is
such that s1
si ¼ ri þ ð1 ri Þ:
c1
p < pcrit ðr; sÞ;
The average precision of the suggestion set (1) is
where pcrit ðr; sÞ is the threshold imitation given by
v s1
aði; jÞ IEðprecðSÞÞ ¼ rv þ ð1 rv Þ ;
pcrit ðr; sÞ ¼ min ; s c1
0i<s<jc 1 þ aði; jÞ Pv
where rv :¼ ð i¼1 ri Þ=v.
with
Proof. Proof is provided in the Appendix. u
t
!
X
i X
j
riþ1
aði; jÞ ¼ rk þ rk 1 : The result in (5) gives an exact condition under which for
k¼1 k¼jsþiþ1
rj
given true popularity ranking scores r and the suggestion
set size s, the algorithm guarantees to learn the true
Proof. Proof is provided in the Appendix. u
t popularity ranking scores r. For this to hold, the imitation
probability must be smaller than the threshold asserted in
Equations (3) and (4) follow from our user choice model (5). The following corollary gives a sufficient condition.
and the definition of the top popular suggestion rule—(4)
is equivalent to saying that minj2S j maxj2CnS j . Item 2 Corollary 1. A sufficient condition for the relation in (5) to hold is
tells us that the true popularity ranking of items (with Pc
1 1 i¼csþ1 ri =ðs 1Þ
fixed tie break) is unique if and only if the imitation p< þ 1 :
s s r1
probability is smaller than the threshold asserted under
item 2. This threshold depends on the true popularity A stronger but simpler sufficient condition is p < 1=s.
ranking scores r and the size of the suggestion set s. This
result is obtained by finding the values of the imitation Note that for the suggestion set size of one item (s ¼ 1),
probability p for which the left-hand side in (4) is strictly the algorithm learns the true popularity rank r for any
greater than p=ð1 pÞ over all sets S of items other than imitation probability p < 1. Note further that in this case,
the set S . We will use the above result in Section 7 to the limit frequency at which an item i is suggested is equal
evaluate the threshold imitation probabilities for the true to ri . We will see that the same property holds with the next
popularity ranks r inferred from our data set. algorithm that we consider.
6.2 Frequency Proportional 6.3 Move to Set
We now consider the suggestion rule PROP combined with We show that under the suggestion set update rule
rank rule 1. In particular, we focus on characterizing its M2S starting from any initial suggestion set of items, the
robustness to imitation. We derive exact characterization of probability distribution of the suggestion set converges to a
the threshold imitation probability below which the ranking unique limit distribution that is characterized in terms of the
scores ðtÞ converge to the true popularity ranking scores r. true popularity distribution r and the suggestion set size s.
When this condition holds, we obtain a closed-form Theorem 3. The limit state of the suggestion rule M2S as the
expression for the frequencies with which items are number of items tends to be large is characterized as follows:
suggested to users and the average precision of the
suggested set. 1. Suggestion set. From any initial suggestion set Sð0Þ,
the probability distribution of the suggestion set SðtÞ
Theorem 2. The limit state of the suggestion rule PROP as the number of item selections t tends to infinity
combined with rank rule 1 is characterized as follows. converges to
ET AL.: RANKING AND SUGGESTING POPULAR ITEMS
VOJNOVIC 1139
Y
ðS 0 Þ / r i ; S 0 2 Ss ; ð7Þ The result of the theorem may prove useful for design of
i2S 0 new suggestion rules.
where the set Ss contains all subsets of s distinct items Proof. Suppose ri rj . i ðþ1Þ j ð1Þ ,
from the set of all items C.
X ri 1i2S 0
2. Frequencies at which items are suggested. The ð1 pÞri þ pS 0 P ðS 0 Þ
frequency at which an item i is suggested, si , has the S 0 2Ss k2S 0 rk
following properties: 1) the larger the item true X rj 1j2S0
popularity ri , the larger the frequency si and 2) the ð1 pÞrj þ pS 0 P ðS 0 Þ;
S 0 2Ss k2S rk
frequency si is sublinear in ri , i.e., si =ri sj =rj , for
P
any items i and j such that ri rj . where p ¼ pS 0 ðS 0 Þ. The last inequality is im-
S 0 2Ss
3. Combination with rank rule 1. For any imitation plied by
probability p < 1, the limit ranking scores induce the
X ri 1i2S0 X rj 1j2S0
true popularity ranking, i.e., pS 0 P ðS 0 Þ pS P ðS 0 Þ;
S 0 2Ss k2S k
0 r S 0 2S k2S 0 rk
s
ri rj ) i ðtÞ j ðtÞ; for sufficiently large t:
which is further implied by
Proof. Proof is available in the Appendix. It relies on X ri
Theorem 4, which is of independent interest and shown pS 0 P ðS 0 Þ1i2S0 ;j 2= S0
0 k2S 0 r k
later in this section. u
t S 2Ss
X rj
pS 0 P ðS 0 Þ1i 2= S0 ;j2S0 :
Item 1 tells us that in the limit (steady state), the 0 k2S 0 rk
S 2S s
algorithm samples a suggestion set S with the probability
proportional to the product of the true popularity of the The last inequality can be rewritten as
items in the set S. Hence, at a high level, the M2S update X r
rule is similar to PROP, one difference being that PROP tends pA[fig Pi ðA [ figÞ
ri þ k2A rk
to the sampling proportional to the sum of the true item A2Sij
X r ð8Þ
popularity rank scores (provided that the stability condition pA[fjg Pj ðA [ fjgÞ:
in Theorem 2 holds). Another way to look at the suggestion A2S
r j þ k2A rk
ij
rule M2S is as a construction of a Markov chain that has the P P
stationary distribution such that the probability of the Now, ri =ðri þ k2A rk Þ rj =ðrj þ k2A rk Þ indeed holds;
suggestion set is proportional to the product of the true thus, inequality (8) is true provided that
popularity rank scores of the items in the set. The frequency
at which an item appears in the suggestion set is given by pA[fig ðA [ figÞ pA[fjg ðA [ fjgÞ; all A 2 Sij ;
item 1 upon summing over the suggestion sets that which holds if both A and B hold. u
t
containing the given item. We were unable to obtain a
closed-form expression for this frequency in general for any 6.4 Frequency Move to Set
s 1. In the absence of a closed-form formula, item 2 In this section, we examine the suggestion rule FM2S. We
provides properties about the frequency at which an item is show that this algorithm tends to suggest only a subset of
suggested. Item 3 shows that the following interesting sufficiently true popular items (“competing set”) and
property holds: combining rank rule 1 with M2S guarantees precisely characterize the competing set of items.
learning the true popularity ranking for any imitation
probability less than 1. The result follows using the Theorem 5. The equilibrium points of FM2S are characterized
following theorem which is of general interest—it provides as follows:
a sufficient condition on the stationary distribution of the
1. Competing set of items. Assume that the state of the
suggestion set and the imitation parameter for the rank
algorithm FM2S has a stationary regime. In the
rule 1 to induce the true ranking. We denote with Si;j the set
stationary regime, only a subset of items f1; 2; . . . ; c0 g
containing all subsets of the set C n fi; jg of size s 1.
are suggested with strictly positive probability, where c0
Theorem 4. Consider a suggestion rule under the user’s choice is the largest integer i such that s i c and
model with parameters r and ðpS0 ; S 0 2 Ss Þ. Assume that the s
suggestion set under has the limit distribution . If for any i ri > 1 hi ðrÞ; ð9Þ
and j such that ri rj , the following monotonicity conditions i
hold: with hi ðrÞ the harmonic mean of r1 ; . . . ; ri , i.e.,
Fig. 7. FM2S. (a) The frequency at which rank 1 tag is suggested versus
suggestion set size. (b) Same as in left but showing mean frequencies at
which tags rank 1-10 are suggested.
Fig. 5. TOP: Critical imitation for the suggestion set size s ranging from 1
to 10 beyond which the algorithm can fail to learn the top k tags with suggested versus the true popularity rank of this tag; see
k ¼ s. The median values of the critical imitation are around 0.1. Fig. 6 for results obtained with the suggestion set size set to
seven tags. This illustrates the main properties of the
from 1 to 10 items. The median threshold imitation suggestion rule FM2S: locking down to displaying a set of
probability is around 0.1 across all the suggestion set sizes, items with high true popularity, the gradual decrease of the
suggesting that already at small level of imitation, the frequency with which an item is suggested with its true
undesired lockdown may happen. We also examined the popularity rank. In Fig. 9a, we have examined the average
threshold imitation beyond which TOP cannot guarantee precision of the suggestion set under FM2S. We now
learning the true popularity ranking of top k true popular examine the frequencies with which individual tags appear
tags. We refer the interested reader for these results to
in the suggestion set versus their true popularity ranks and
Appendix [22] (Fig. 4). In summary, results suggest that
across a range of suggestion set sizes. The results are shown
TOP may result in failing to learn the true popularity
in Fig. 7. In particular, we note that on average, the rank 1
ranking, for small values of the imitation probability.
Precision of suggestions. We next evaluate the mean tag of a bookmark appears in more than 90 percent of
precision of the suggestions made by algorithms PROP, M2S, suggestion sets, indicating good performance. In Theorem 5,
and FM2S. In Fig. 9, we show the respective mean precisions we have determined how many items eventually get into
of the suggested items for PROP, M2S, and FM2S for a range the suggestion set (“competing set”) with strictly positive
of suggestion set size from 1 to 40 tags. We observe that the probability as a function of the suggestion set size and the
mean precision under FM2S is better than that under either true popularity rank scores. How does the size of the
PROP or M2S. The mean precision under M2S is better than competing set for the true popularity ranks in our data set
under PROP, and this is particularly emphasized for small across differ with suggestion set sizes? Fig. 8 shows the ratio
suggestion set sizes (the range of practical relevance). The of the competing set size and the suggestion set size for
mean precision under FM2S is at least 80 p for suggestion suggestion set sizes ranging from 1 to 40 tags. We observe
set sizes greater or equal to 7, which may be regarded a that the competing set size is about twice or less than the
good performance. suggestion set size, for suggestion set sizes 5 tags. This
We next examine the suggestion rule FM2S in more suggests that the competing set tends to increase propor-
detail. We first examine the frequency at which a tag is tionally with the suggestion set size.
Convergence speed. We have evaluated speed of
convergence of the algorithms considered in this paper by
simulations. For space reasons, we omit to present these
Fig. 9. Average precision of the suggestion set: (a) PROP, (b) M2S, and (c) FM2S versus the suggestion set size. The results are obtained using
Theorems 2, 3, 4, and 5 with the inferred true tag popularities.
p
results here and refer the interested reader to [22, Fig. 6]. In min fðS 0 Þ > ; ð11Þ
S 0 2Ss nf1;...;sg 1p
summary, the results provided no evidence that either M2S
or FM2S are slower than TOP. where
!
X maxj2Ss nS rj
8 CONCLUDING REMARKS fðSÞ ¼ rj 1 : ð12Þ
j2S
minj2S rj
We proposed simple randomized algorithms for ranking
and suggesting popular items designed to account for Note that condition (11) is equivalent to p < f =ð1 þ f Þ
popularity bias. We focused on understanding the limit where f :¼ minS 0 2Ss nf1;...;sg fðS 0 Þ and, recall that Ss contains
ranking of the items provided by the algorithms, and how it subsets of C of size s. It suffices to show that
relates to that of the true popularity ranking and assessed f ¼ min0i<s<jc aði; jÞ, where a is defined in the theorem.
the quality of suggestions as measured by the true To that end, partition the set Ss in the following way. For a
popularity of suggested items. We believe that the problem set S 0 2 Ss , let i and j be such that the following hold: riþ1 ¼
posed in this paper opens interesting directions for future
maxk2CnS0 rk and rj ¼ mink2S0 rk . From (12), we have
research including analysis of convergence rates of the
ranking algorithms, consideration of alternative ranking min fðS 0 Þ ¼ min min gði; j; S 0 Þ;
S 0 2Ss nf1;...;sg 0i<s<jc S 0 :fi;jg S 0
and suggesting rules, and alternative user choice models. P riþ1
where gði; j; S 0 Þ ¼ ðr1 þ þ ri þ k2S 0 nfi;jg rk Þð rj 1Þ. The
APPENDIX result follows by noting that
min gði; j; S 0 Þ
A.1 Proof of Theorem 1 S 0 :fi;jg S 0
Item 1. Let Vi ðtÞ be the cumulative number of item i riþ1
¼ r1 þ þ ri þ rjsþiþ1 þ þ rj 1
selections over t item selections. Under the user’s choice rj
model and the TOP POPULAR suggestion rule, V is a ¼ aði; jÞ:
Markov chain specified by the transition probabilities
Converse: Suppose p pcrit ðr; sÞ. By the above identities,
P ðV 0 jV Þ ¼ i ðV Þ; this means that there exists a set S 0 2 Ss n f1; . . . ; sg such
where V 0 ¼ V þ ei with ei a vector of dimension c with all that (4) holds, which completes the proof. u
t
the coordinates equal to 0, but the ith equal to 1, and
A.2 Proof of Theorem 2
ri 1i2SV
i ðV Þ ¼ ð1 pÞri þ p Pc ; Item 1. V is a Markov chain with the transition probabilities
j¼1 rj 1j2SV
specified by
where SV is a set of s most popular items with respect to the P ðV 0 jV Þ ¼ i ðV Þ; ð13Þ
rank scores V .
By the law of large numbers, V ðtÞ=t converges to ðSÞ with V 0 ¼ V þ ei , where
as t goes to infinity, where ðSÞ is given by (3) and S is a X ri 1i2S0
i ðV Þ ¼ ð1 pÞri þ p P fS0 ðV Þ;
suggestion set that satisfies mini2S i ðSÞ maxi2CnS i ðSÞ. S 0 2S j2S 0 rj
s
P
Combining with (3), the latter condition can be rewritten Vj
j2S 0
as (4). fS0 ðV Þ :¼ P P :
S 0 2Ss j2S 0 Vj
Item 2. Direct: Suppose that ðSÞ with S ¼ f1; . . . ; sg is a P P c1
unique stationary ranking, for given imitation rate p. From Note that S 0 2Ss j2S 0 Vj ðtÞ ¼ m t, where m :¼ s1 .
(4), we have This follows from
ET AL.: RANKING AND SUGGESTING POPULAR ITEMS
VOJNOVIC 1143
!
XX X
c X A.3 Proof of Corollary 1
Vj ðtÞ ¼ 1j2S 0 Vj ðtÞ
S 0 2Ss j2S j¼1 S 0 2S s
We use the well known row-sum bound for the spectral
c radius of a non-negative matrix A:
c1 X c1
¼ Vj ðtÞ ¼ t ¼ m t:
s 1 j¼1 s1 ðAÞ max rðiÞ;
i
Let vðtÞ ¼ IEðV ðtÞÞ. It follows that where rðiÞ is row-sum of the ith row of the matrix A.
We have
1 X ri 1i2S0 X
vi ðt þ 1Þ ¼ vi ðtÞ þ ð1 pÞri þ p P vj ðtÞ;
m t S 0 2S j2S 0 rj j2S0 1 X 1
s rðiÞ ¼ s c1 P ; i ¼ 1; . . . ; c;
c1 s1 S 0 2Ss1 ðCnfigÞ 1 þ k2S 0
rk
where m :¼ s1 . For ðtÞ defined as ðtÞ ¼ vðtÞ=t, we have ri
to move to site v with given probability pðu; vÞ, and It remains only to show that
such an attempt is successful only if site v is not X X
already occupied by a particle. We note that our r ðS 0 n figÞ r ðS 0 n fjgÞ:
process X is an exclusion process where sites are S 0 2S s :i2S 0 ;j 2
= S0 S 0 2Ss :j2S 0 ;i 2
= S0
items, particles are items in the suggestion set, each But this clearly holds in view of (20).
particle attempts to move at instances of a Poisson Item 3. The result follows from Theorem 4. Indeed,
process with rate 1=s, and pðu; vÞ ¼ rv . The transition consider any suggestion set S such that i 2 = S and j 2 S for
rates of X are specified by some i and j for which it holds ri rj . Let S 0 ¼ S n fjg [ fig.
We then have
1
X ! X þ ei ej with rate ri ð1 Xi ÞXj : ð18Þ Y Y
s ðSÞ ¼ rk rj rk ri ¼ ðS 0 Þ;
Q k2Snfjg k2Snfjg
Items 2 and 2a. Let r ðAÞ :¼ k2A rk , A C. The long-
run frequency si at which item i is suggested is given by which shows that condition A holds. We assumed that pS
P p for 0 p < 1; thus, condition B is true.
0
S 0 2Ss :i2S 0 r ðS Þ
si ¼ P 0
: ð19Þ A.5 Proof of Theorem 5
S 0 2Ss r ðS Þ
A.5.1 System State and Dynamics
We need to show that ri rj implies si sj . Note that si Let Wi ðtÞ be the cumulative number of item i selections
sj is equivalent to when this item was not suggested. Let Xi ðtÞ ¼ 1 if
candidate i is in the suggestion set and Xi ðtÞ ¼ 0 otherwise.
X X
r ðS 0 Þ r ðS 0 Þ: Let ZðtÞ be the minimum Wi ðtÞ over the items i that are in
S 0 2S s :i2S 0 S 0 2S s :j2S 0 the suggestion set, i.e.,
Now, the last inequality is equivalent to ZðtÞ ¼ minfWi ðtÞ; i 2 C : Xi ðtÞ ¼ 1g:
X X
r ðS 0 Þ r ðS 0 Þ; Further, let MðtÞ be the number of items that are in the
S 0 2Ss :i2S 0 ;j 2
= S0 S 0 2S s = S 0 ;j2S 0
:i 2 suggested set with Wi ðtÞ of such an item equal to ZðtÞ, i.e.,
The last inequality can be rewritten as In the sequel, we consider redefined as follows. Let
DðtÞ ¼ ZðtÞ W ðtÞ and consider the Markov process
X 1 X
r ðS 0 Þ þ r ðS 0 n figÞ
0
S 2S :fi;jg S 0 r i 0 0
S 2S :i2S ;j 2
=S 0 ðtÞ ¼ ðDðtÞ; XðtÞ; MðtÞÞ ð21Þ
s s
X 1 X
r ðS 0 Þ þ r ðS 0 n fjgÞ: specified by the transition probabilities:
S 0 2S :fi;jg S 0
r j S 0 2S :j2S 0 ;i 2
= S0
s s
pð0 jÞ ¼ ri ð1 Xi Þ1Di >0 ;
Now, note that by ri rj , indeed Xj 1Dj ¼0
pð00 jÞ ¼ ri ð1 Xi Þ1Di ¼0 1M>1 ;
X X M
1 1 000
r ðS 0 Þ r ðS 0 Þ: pð jÞ ¼ ri ð1 Xi Þ1Di ¼0 Xj 1Dj ¼0 1M¼1 ;
r
S 0 2S :fi;jg S 0 i
r
S 0 2S :fi;jg S j
s s pðjÞ ¼ 1 pð0 jÞ pð00 jÞ pð000 jÞ;
ET AL.: RANKING AND SUGGESTING POPULAR ITEMS
VOJNOVIC 1145
have for all i 2 C which implies that IEðDi ðtÞÞ tends to 1 at t goes
to infinity, which cannot hold as by definition,
lim fri ½1 IEðXi ðtÞÞ rm ½1 IEðXm ðtÞÞg ¼ 0: ð24Þ
t!þ1 PrðDi ðtÞ 1Þ ¼ 1, for any t 0 and i 2 C. u
t
It follows that for all i 2 C We next define the set C 0 in terms of the parameters r
and s. We have the following.
1
IEðXi ðþ1ÞÞ ¼ 1 rm ½1 IEðXm ðþ1ÞÞ : ð25Þ Lemma 4. Any set C 0 that satisfies conditions (28), (29), and
ri
(30) must be such that C 0 ¼ f1; . . . ; ag for some
Note that a 2 fminðs; cÞ; . . . ; cg.
X
c Proof. Indeed, the assertion is true as for i 2 C 0 and rj ri ,
Xi ðtÞ ¼ s; for all t 0: ð26Þ conditions (28), (29), and (30) imply j 2 C 0 . t
u
i¼1
Combining the last two It remains only to show that a in the lemma is equal to
identities, we have
rm ð1 IEðXm ðþ1ÞÞÞ ¼ 1 sc hc ðrÞ. Inserting the last the expression asserted in the theorem. By condition 2, (29),
identity into (25), we obtain and (26), we have
s ha ðrÞ
s hc ðrÞ IEðXi ðþ1ÞÞ ¼ 1 1 ; i ¼ 1; . . . ; a: ð32Þ
lim IEðXi ðtÞÞ ¼ 1 1 ; i 2 C: ð27Þ a ri
t!þ1 c ri
As is positive recurrent, it must be IEðXi ðþ1ÞÞ > 0, for Now, note that (28) is equivalent to ra > ða sÞ=ðr11 þ
all i 2 C. By the above identity, this is equivalent to þ r1a Þ. By simple rearrangements, we have that the
1 1 sc hcrðrÞ > 0, for i 2 C, which after some elemen- last condition is equivalent to: a ra ðr11 þ þ r1a Þ < s. Let
i
tary calculus can be rewritten as (22). u
t fðaÞ be defined by the left-hand side of the last inequality. It
can be easily checked that fðaÞ is nondecreasing with a.
Remark 2. We showed that condition (22) is necessary for
to be positive recurrent. While we believe that the Hence, it follows that a is such that minðs; cÞ a c0 , where
condition (22) would also be sufficient, it remains open
0 1 1
to prove this. To this end, one may try the Lyapunov c ¼ max i 2 fb; . . . ; cg : i ri þ þ <s
r1 ri
stability methods.
1146 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 21, NO. 8, AUGUST 2009
with b :¼ minðs; cÞ. We next show that, in fact, it must be [22] M. Vojnovic, J. Cruise, D. Gunawardena, and P. Marbach,
“Ranking and Suggesting Tags in Collaborative Tagging Applica-
a ¼ c0 for the conditions (28), (29), and (30) to hold. To tions,” Technical Report MSR-TR-2007-06, Microsoft Research,
contradict, suppose that a < c0 . By the definition of c0 , we Feb. 2007.
have rc0 þ1 > ðc0 þ 1 sÞ=ðr11 þ þ r 10 Þ. By a simple alge- [23] Z. Xu, Y. Fu, J. Mao, and D. Su, “Towards the Semantic Web:
c þ1 Collaborative Tag Suggestions,” Proc. Workshop Collaborative Web
bra, we note that the last condition is equivalent to: rc0 þ1 > Tagging Workshop at the WWW 2006, May 2006.
ðc0 sÞ= ðr11 þ þ r10 Þ. But this violates condition (30);
c
hence, a contradiction. Milan Vojnovic received the BSc and MSc
degrees in electrical engineering from the Uni-
Item 2. The asserted frequencies of the appearance of versity of Split, Croatia, in 1998 and 1995,
items in the suggestion set follow from (32). respectively, and the PhD degree in commu-
Item 3. Take the expectation in (1) and use IEð1i2S Þ ¼ si , nication systems from EPFL, Switzerland, in
2003. He is a researcher in the Systems
where si ’s are given by (10). u
t and Networking Group at Microsoft Research
Cambridge, United Kingdom. His research inter-
ests are in design and analysis of computer
ACKNOWLEDGMENTS systems and networks. He received IWQoS
2007 Best Student Paper Award with Shao Liu and Dinan Gunawardena
The work by J. Cruise and P. Marbach was done in part
for a work on congestion control protocols, ACM SIGMETRICS 2005
while an intern and a visiting researcher with Microsoft Best Paper Award with Laurent Massoulié for a work on the performance
Research Cambridge. of peer-to-peer file dissemination, IEEE INFOCOM 2005 Best Paper
Award with Jean-Yves Le Boudec for a work on the random mobility
models, and ITC-17 2001 Best Student Paper Award with Jean-Yves Le
REFERENCES Boudec for a work on analysis of equation-based congestion control. In
2005, he has been awarded ERCIM Cor Baayen Award.
[1] V. Anantharam, P. Varaiya, and J. Walrand, “Asymptotically
Efficient Allocation Rules for the Multiarmed Bandit Problem with James Cruise received a degree and is
Multiple Plays—Part i: i.i.d. Rewards,” IEEE Trans. Automatic currently working toward the PhD degree in
Control, vol. 32, no. 11, pp. 968-976, Nov. 1987. mathematics at Cambridge University. He spent
[2] J.R. Anderson, “The Adaptive Nature of Human Categorization,” three months completing an internship at Micro-
Psychological Rev., vol. 98, no. 3, pp. 409-429, 1991. soft Research, Cambridge, United Kingdom.
[3] A.L. Barabási and R. Albert, “Emergence of Scaling in Random Recently, he has moved to a post at Bristol
Networks,” Science, vol. 286, pp. 509-512, 1999. University. His work has focused on asympotics
[4] S. Brams and P. Fishburn, Approval Voting. Birkhauser, 1983. for queuing theory.
[5] S. Chakrabarti, A. Frieze, and J. Vera, “The Influence of Search
Engines on Preferential Attachment,” Proc. Symp. Discrete Algo-
rithms (SODA), 2005.
[6] J. Cho, S. Roy, and R.E. Adams, “Page Quality: In Search of an
Unbiased Web Ranking,” Proc. ACM SIGMOD ’05, 2005.
[7] S. Golder and B.A. Huberman, “The Structure of Collaborative Dinan Gunawardena received the BA and MA
Tagging Systems,” J. Information Science, vol. 32, no. 2, pp. 198-208, degrees in computer science from Cambridge
2006. University, United Kingdom, in 1999 and 1995,
[8] http://en.wikipedia.org/wiki/Approval_voting, 2009. respectively. He is a research software develop-
[9] J. Kleinberg and M. Sandler, “Using Mixture Models for ment engineer with the Systems and Networking
Collaborative Filtering,” Proc. 36th Ann. ACM Symp. Theory of Group, Microsoft Research Cambridge, United
Computing (STOC), 2004. Kingdom. His research interests are in the design
[10] R. Kumar, P. Rajagopalan, and A. Tomkins, “Recommendation and development of multimedia computer sys-
Systems: A Probabilistic Analysis,” Proc. 39th Ann. Symp. Founda- tems, and embedded systems and networks. He
tions of Computer Science (FOCS), 1998. received the IWQoS 2007 Best Student Paper
[11] T.L. Lai and H. Robbins, “Asymptotically Efficient Adaptive Award with Shao Liu and Milan Vojnovic for a work on congestion control
Allocation Rules,” Advances in Applied Math., vol. 6, pp. 4-25, 1985. protocols. He has filed numerous patents in the area of computer systems
[12] T.M. Liggett, Interacting Particle Systems, second ed. Springer, 2006. and networks, and has contributed to networking research that has
[13] R.D. Luce, Individual Choice Behavior: A Theoretical Analysis. Dover, shipped in the Microsoft Vista Operating System. He also contributed to
1959. key enabling technology that is implemented in the Asymmetric Digital
[14] R.M. Nosofsky, “Choice, Similarity, and the Context Theory of Subscriber Line (ADSL) standards.
Classification,” J. Experimental Psychology, vol. 10, no. 1, pp. 104-
114, 1984.
Peter Marbach received the Eidg Dipl El-Ing
[15] S. Pandey, S. Roy, C. Olston, J. Cho, and S. Chakrabarti, “Shuffling
from the ETH Zurich, Switzerland, in 1993, the
Stacked Deck: The Case for Partially Randomized Ranking of
MS degree in electrical engineering from the
Search Engine Results,” Proc. 31st Int’l Conf. Very Large Data Bases
Columbia University, New York, in 1994, and the
(VLDB), 2005.
PhD degree in electrical engineering from the
[16] R.M. Phatarfod, “On the Matrix Occurring in a Linear Search
Massachusetts Institute of Technology (MIT),
Problem,” J. Applied Probability, vol. 18, pp. 336-346, 1991.
Cambridge, Massachusetts, in 1998. He was
[17] G. Salton and M.J. McGill, Introduction to Modern Information
appointed as an assistant professor in 2000, and
Retrieval. McGraw-Hill Education, 1983.
as an associate professor in 2005, at the
[18] S. Sen, S.K. Lam, A.-M. Rashid, D. Cosley, D. Frankowski,
Department of Computer Science, University of
J. Osterhouse, F.M. Harper, and J. Riedl, “Tagging, Communities,
Toronto. He has also been a visiting professor at Microsoft Research,
Vocabulary, Evolution,” Proc. 2006 20th Anniversary Conf. Compu-
Cambridge, United Kingdom, at the Ecole Polytechnique Federal at
ter Supported CooperativeWork (CSCW), 2006.
Lausanne (EPFL), Switzerland, and at the Ecole Normale Superieure,
[19] H.A. Simon, “Bandwagon and Underdog Effects and the
Paris, France, and a postdoctoral fellow at Cambridge University, United
Possibility of Election Predictions,” Public Opinion Quarterly,
Kingdom. His research interests are in the fields of communication
vol. 18, pp. 245-253, 1954.
networks, in particular in the area of wireless networks, peer-to-peer
[20] F. Suchanek, M. Vojnovic, and D. Gunawardena, “Social Tagging:
networks, and online social networks. He has received the IEEE
Meaning and Suggestions,” Proc. 17th ACM Conf. Information and
INFOCOM 2002 Best Paper Award for his paper “Priority Service and
Knowledge Management (CIKM), Oct. 2008.
Max-Min Fairness.” He is on the Editorial Board of the ACM/IEEE
[21] C.J. van Rijsbergen, Information Retrieval, second ed. Butterworths,
Transactions of Networking.
http://www.dcs.gla.ac.uk/Keith/Preface.html, 1979.