You are on page 1of 7

Complete Genre Listing

Exploring Implicit Hierarchical Structures for Recommender Systems


Action & Adventure
Action Classics
Anime & Animation
Animation for Grown-ups
Comedy
African-American Comedies
Action Comedies Anime Action Best of British Humor
Action Thrillers Anime Comedy Cult Comedies
Adventures Anime Drama Dark Humor & Black Comedies

Blaxploitation
Suhang Wang, Jiliang Tang, Yilin Wang and HuanTearjerkers
African-American Action Anime Fantasy
Anime Feature Films
Liu Foreign Comedies
Late
Historical Night Comedies
Documentaries Slashers and Serial Killers
Blockbusters Anime Horror Latino Comedies
Indie Documentaries Supernatural Horror
School of Computing, Informatics, and Decision Systems Engineering
Comic Books and Superheroes Anime Sci-Fi Gay & Lesbian
Political Comedies
Military Documentaries Teen Screams
Crime Action Anime Series Gay & Lesbian Comedies
Romantic
Miscellaneous Comedies
Documentaries Vampires
Deadly Disasters
Espionage Action
Arizona State University, USA
Kids' Anime Gay & Lesbian Dramas
Saturday Night Live
PBS Documentaries
Gay & Lesbian Romance
Werewolves
PoliticalScrewball
Documentaries Zombies

Heist Films
{suhang.wang,
Foreign Action & Adventure Classics
jiliang.tang, yilin.wang.1, huan.liu}@asu.edu
Classic Comedies
Foreign Gay & Lesbian
Slapstick
Rockumentaries
Indie Gay & Lesbian
ScienceSpoofs and Satire
and Nature Documentaries Romance
Martial Arts Classic Dramas Gay African-AmericanHome Romance
Social &Sports Comedies
Cultural Documentaries Books
Military & War Action Classic Sci-Fi & Fantasy Stand-Up
Sports Documentaries Lesbian Foreign Romance
Super Swashbucklers Classic Thrillers Travel &Mockumentaries
Adventure Documentaries Bisexual Search: Foreign Steamy Romance
Westerns Classic War Stories Showbiz Comedies
Mockumentaries LOGO Indie Romance
Home > Books > Antiques & Collectibles
Classic Westerns Romance Classics
Children & Family Epics Music & Musicals Categories Books
ForeignFaith & Spirituality Romantic Comedies
Ages 0-2 Abstract Film Noir ForeignFaith
Action &&Spirituality
Adventure Feature Films Classical Music Antiques & Collectibles
(30302) Romantic Dramas
30302 products found in
Sorted by: Bestselling |
Ages 2-4 Foreign Classics Foreign ArtInspirational
House Stories Classical Choral Music Americana (415) Steamy Romance
Blue Book Dolls
Ages 5-7 Foreign Classic Comedies Religious &
Foreign Children & FamilyMythic Epics Classical Instrumental Music Art (1583)
Items in real-world recommender systems exhibit
Ages 8-10 Foreign Classic Dramas Religious & Spiritual Dramas
Foreign Classics Opera & Operetta Autographs (145)
Books (1367)
Sports & Fitness
Paperback)Jan Fo
Paperback, 1980
Buy: $0.75 Save
Ages 11-12 Foreign Silent Films Religious Comedies
Foreign Classic Comedies & Satires Sacred Classical Music Baseball
certain hierarchical structures. Similarly, user pref-
Animal Tales Silent Films Faith
Foreign & Spirituality
Classic DramasDocumentaries Country & Western/Folk
Bottles (288)
Basketball
Buttons & Pins (66)
Postsecret : Extra
Book Characters Inspirational Biographies American Folk & Bluegrass Extreme Sports
Care & Restoration (122)
erences also present hierarchical structures. Re-
Cartoons Drama
Foreign Silent Films
Religion & Mythology Documentaries
Foreign Comedies Classic Country & Western
Clocks & Watches (746) Lives by Frank W
BMX & Extreme Biking
Coins, Currency & Medals
Hardcover, 2005
Buy: $1.79 Save
Coming of Age African-American Dramas
cent studies show that incorporating the explicit hi- Spiritual
Foreign Regions Mysteries New Country (3808) Extreme Combat & Mixed M
Dinosaurs Biographies AfricaInspirational Music Inspirational Music Comics (236)
Extreme Motorsports
Dolls (986)
erarchical structures of items or user preferences
Disney Classic Dramas
Courtroom Dramas
ArgentinaGospel Music Gospel Music Figurines (139) Extreme Snow &The IceJedi
Sports
Path : A
Daniel Wallace (2
Education & Guidance Inspirational
Australia/ New Zealand Rock & Pop Inspirational Rock & Pop Extreme Sports Compilation
Firearms & Weapons Wallace
can improve the performance of recommender sys-
Family Adventures Crime Dramas Belgium New Age New Age
(1406)
Furniture (338)
Hardcover, 2010
Mountain Biking Buy: $45.00
Family Animation Dramas Based on Real Life Brazil Sacred Classical Music Sacred Classical Music General (6970) Mountaineering & Climbing
tems. However, explicit hierarchical structures are
Family Classics Dramas Based on the Book China Sacred Folk & Traditional Music Sacred Folk & Traditional Music
Jewelry (416)
Skateboarding Standard Catalog
Glass & Glassware (1485)

Family Comedies Dramas Based on Bestsellers Judaica Jazz & Easy Listening Stunts & Generaland Jim Supica (
Mayhem
usually unavailable, especially those of user prefer-
Family Dramas Dramas Based on Classic Literature
Czech Republic
Kids' Inspirational
Eastern Europe Afro-Cuban & Latin Jazz
Kitchenware (361)
Football
Magazines & Newspapers
Richard Nahas
Hardcover, 2007
Buy: $24.92
Dramas Based on Contemporary Classic Jazz (46)
Golf
ences. Thus, there’s a gap between the importance
Family Sci-Fi & Fantasy
Kids' Music Literature
France Inspirational Sing-Alongs
Inspirational Stories for Kids Contemporary Jazz
Military (34)
Martial Arts, BoxingMy&Secret
Wrestling
Germany Non-Sports Cards (39) : A Po
of hierarchical structures and their availability. In
Kids' TV Family Dramas
Foreign Dramas
GreeceMindfulness & Prayer Jazz Greats
Swing & Big Band
Boxing
Paper Ephemera (104)
Performing Arts (149)
Extreme CombatBuy:
Hardcover)Frank
Hardcover, 2006
& Mixed M
Nickelodeon Hong KongHealing & Reiki $0.75 Save

this paper, we investigate the problem of explor-


Teen Comedies Gambling Dramas India Meditation & Relaxation Vocal Jazz
Political (35)
General Martial Arts
Popular Culture (553)
Teen Dramas Gay & Lesbian Dramas Iran Prayer & Spiritual Growth Vocal Pop Karate
Porcelain & China (790)
ing the implicit hierarchical structures for recom-
Teen Romance Indie Dramas Israel Karaoke Postcards (1250)
Kung Fu
A Lifetime of Sec
(2007, Hardcover

mender Latino Dramas


systems when they are not explicitly avail- Italy(a) Faith & Spirituality
Horror (b) Music & Musicals (c) half.com
Latin Music
Posters (928) Hardcover, 2007
Martial Arts & Boxing
Pottery & Ceramics (908) Worko
Buy: $3.46 Save
Documentary Medical Dramas JapanB-Movie Horror Afro-Cuban & Latin Jazz Self-Defense
Radios & Televisions (see

able. African-American Documentaries Military & War Dramas Creature Features also Performing Arts) (33)
Brazilian Music Tai Chi & QigongEmpty Mansions
We propose
Biographical a novel recommendation frame-
Documentaries Period Pieces
Korea
Figure 1: Netflix Movie Hierarchical Structure and half.com
Latin Cult Horror
America Latin Pop
Records (159)
Reference (333) Wrestling and the Spending
Crime
work Faith
HSR Documentaries
to bridge Pre-20th Century Period Pieces
the gap, which enables us to Foreign Horror Reggaeton Motorsports Dedman and Pau

& Spirituality Documentaries 20th Century Period Pieces


Book Hierarchical Structure
Mexico
Asian Horror Rock en Español
Rugs (31)
Silver, Gold & Other
Auto Racing
Dedman, Paul Cla
Hardcover, 2013
Middle East Metals (899) Buy: $9.00
captureInspirational
the implicit hierarchical
Biographies
Religion & Mythology Documentaries
structures of users
Political Dramas
Romantic Dramas
Italian Horror
Netherlands
Frankenstein
Traditional Latin Music Car Culture
Sports (see also headings
under Sports Cards) (167)
Music Lessons Extreme Motorsports
and items simultaneously.
Spiritual Mysteries Experimental results on
Showbiz Dramas Items in real-world recommender systems could exhibit
Philippines
PolandHorror Classics Bass Lessons
Sports Cards (517)
Stamps (1436)
Gunsmithing - th
Motorcycles & Motocross
Paperback)Patric
Paperback, 2010
Foreign Documentaries Social Issue Dramas RussiaMonsters
certain hierarchical structures. For example, Figure 1(a) and
Drum Lessons Other Sports
Teddy Bears (336) Buy: $14.89
two real world datasets demonstrate the effective-
HBO Documentaries Sports Dramas Satanic
Scandinavia Stories Guitar & Banjo Lessons Textiles & Costume (143)
Bodybuilding
1 Cycling
Tobacco-Related (160)

ness ofHistorical
the
Indie
Documentaries
proposed
Documentaries framework. Tearjerkers 1(b) are two snapshots from Netflix DVD rental page . In
Slashers
Southeast Asiaand Serial Killers
Supernatural Horror
Miscellaneous Music Lessons Toy Animals (110) The Overstreet C
Spain Piano & Keyboard Lessons Toys (1316) Horse Racing M. Overstreet (20
Military Documentaries Gay & Lesbian
Gay & Lesbian Comedies
the figure, movies are classified into a hierarchical struc-
Teen Screams
Thailand Voice Lessons Transportation (412)
Wine (28)
Tennis
Paperback, 2014
Buy: $19.89
Miscellaneous Documentaries Vampires
United Kingdom Musicals Miscellaneous Sports
PBS Documentaries Gay & Lesbian Dramas ture as genre→subgenre→detailed-category. For example,
Foreign Werewolves
Documentaries Classic Movie Musicals
Other (17)
Outdoor & Mountain Sports
1 Introduction
Political Documentaries
Rockumentaries
Gay & Lesbian Romance
Foreign Gay & Lesbian ForeignZombies
Dramas Classic Stage Musicals
the movie Schindler’s List first falls into the genre Faith
Contemporary Movie Musicals
Fishing
Hunting
2014 Standard Ca

Indie Gay & Lesbian Foreign Gay


Romance & Lesbian
Science and Nature Documentaries Contemporary Stage Musicals Mountain Biking
Recommender systems [Resnick and Varian, 1997] intend
Social & Cultural Documentaries Gay Spirituality, under which it belongs to sub-genre Faith &
ForeignAfrican-American
Horror Romance Foreign Musicals Mountaineering & Climbing
Lesbian Asian Horror
Foreign Romance
Sports Documentaries
to provide users with information of potential interest based
Travel & Adventure Documentaries Bisexual Spirituality Feature Films and is further categorized as In-
Italian Horror
Foreign Steamy Romance
Must-See Musicals
Show Tunes
Snow & Ice Sports
Extreme Snow & Ice Sports
LOGO Foreign Languages
Indie Romance
on their demographic profiles and historical data. Collab-
Mockumentaries spirational Stories (see the hierarchical structure shown in
Arabic Language
Romance Classics
Rock & Pop Ice Hockey
Music & Musicals
Foreign
orative Filtering (CF), which only requires past user rat-
Foreign Action & Adventure Classical Music
Fig. 1(a)). Similarly, Fig. 1(c) shows an Antiques & Col-
Romantic Comedies
Romantic Dramas 2
ings to predict
Foreign Artunknown
House Classical Choral Music
ratings, has attracted more and
Classical Instrumental Music
lectibles category from half.com . We can also observe hi-
Steamy Romance
Foreign Children & Family
Foreign [Classics
more attention Hofmann, 2004; Zhang et al., 2006; Koren,
Opera & Operetta erarchical structures, i.e., category→sub-category. For ex-
Sports & Fitness

2010]. Collaborative Filtering can be roughly categorized ample, the book Make Your Own Working Paper Clock be-
into memory-based [Herlocker et al., 1999; Yu et al., 2004; longs to Clocks & Watches, which is a sub-category of An-
Wang et al., 2006] and model-based methods [Hofmann, tiques & Collections. In addition to hierarchical structures
2004; Mnih and Salakhutdinov, 2007; Koren et al., 2009]. of items, users’ preferences also present hierarchical struc-
Memory-based methods mainly use the neighborhood in- tures, which have been widely used in the research of deci-
formation of users or items in the user-item rating matrix sion making [Moreno-Jimenez and Vargas, 1993]. For exam-
while model-based methods usually assume that an underly- ple, a user may generally prefer movies in Faith Spiritual-
ing model governs the way users rate and in general, it has ity, and more specifically, he/she watches movies under the
better performance than memory-based methods. Despite the sub-category of Inspirational Stories. Similarly, an antique
success of various model-based methods [Si and Jin, 2003; clock collector may be interested in Clocks & Watches sub-
Hofmann, 2004], matrix factorization (MF) based model has category under the Antiques & Collections category. Items
become one of the most popular methods due to its good per-
formance and efficiency in handling large datasets[Srebro et 1
Snapshots are from http://dvd.netflix.com/AllGenresList
al., 2004; Mnih and Salakhutdinov, 2007; Koren et al., 2009; 2
Snapshot is from http://books.products.half.ebay.com/antiques-
Gu et al., 2010; Tang et al., 2013; Gao et al., 2013]. collectibles W0QQcZ4QQcatZ218176
in the same hierarchical layer are likely to share similar 2.1 The Basic Model
properties, hence they are likely to receive similar rating In this work, we choose weighted nonnegative matrix factor-
scores. Similarly, users in the same hierarchical layer are ization (WNMF) as the basic model of the proposed frame-
likely to share similar preferences, thus they are likely to work, which is one of the most popular models to build
rate certain items similarly [Lu et al., 2012; Maleszka et al., recommender systems and has been proven to be effective
2013]. Therefore, recently, there are recommender systems in handling large and sparse datasets [Zhang et al., 2006].
exploiting explicit hierarchical structures of items or users WNMF decomposes the rating matrix into two nonnegative
to improve recommendation performance [Lu et al., 2012; low rank matrices U ∈ Rn×d and V ∈ Rd×m , where U is the
Maleszka et al., 2013]. However, explicit hierarchical struc- user preference matrix with U(i, :) being the preference vec-
tures are usually unavailable, especially those of users. tor of ui , and V is the item characteristic matrix with V(:, j)
The gap between the importance of hierarchical structures being the characteristic vector of vj . Then a rating score from
and their unavailability motivates us to study implicit hierar- ui to vj is modeled as X(i, j) = U(i, :)V(:, j) by WNMF. U
chical structures of users and items for recommendation. In and V can be learned by solving the following optimization
particular, we investigate the following two challenges - (1) problem:
how to capture implicit hierarchical structures of users and
items simultaneously when these structures are explicitly un- min kW (X − UV)k2F + β(kUk2F + kVk2F ) (1)
U≥0,V≥0
available? and (2) how to model them mathematically for
recommendation? In our attempt to address these two chal- where denotes Hadamard product and W(i, j) controls the
lenges, we propose a novel recommendation framework HSR, contribution of X(i, j) to the learning process. A popular
which captures implicit hierarchical structures of users and choice of W is - W(i, j) = 1 if ui rates vj , and W(i, j) = 0
items based on the user-item matrix and integrate them into otherwise.
a coherent model. The major contributions of this paper are
summarized next: 2.2 Modeling Implicit Hierarchical Structures
• We provide a principled approach to model implicit hi- In weighted nonnegative matrix factorization, the user pref-
erarchical structures of users and items simultaneously erence matrix U and the item characteristic matrix V can
based on the user-item matrix; indicate implicit flat structures of users and items respec-
tively, which have been widely used to identify communi-
• We propose a novel recommendation framework HSR,
ties of users [Wang et al., 2011] and clusters of items [Xu et
which enables us to capture implicit hierarchical struc-
al., 2003]. Since both U and V are nonnegative, we can fur-
tures of users and items when these structures are not
ther perform nonnegative matrix factorization on them, which
explicitly available; and
may pave the way to model implicit hierarchical structures
• We conduct experiments on two real-world recommen- of users and items for recommendation. In this subsection,
dation datasets to demonstrate the effectiveness of the we first give details about how to model implicit hierarchical
proposed framework. structures based on weighted nonnegative matrix factoriza-
The rest of the paper is organized as follows. In Section 2, tion, and then introduce the proposed framework HSR.
we introduce the proposed framework HSR with the details The item characteristic matrix V ∈ Rd×m indicates the
of how to capture implicit hierarchical structures of users and affiliation of m items to d latent categories. Since V is non-
items. In Section 3, we present a method to solve the op- negative, we can further decompose V into two nonnegative
timization problem of HSR along with the convergence and matrices V1 ∈ Rm1 ×m and Ṽ2 ∈ Rd×m1 to get a 2-layer im-
time complexity analysis. In Section 4, we show empirical plicit hierarchical structure of items as shown in Figure 2(a):
evaluation with discussion. In Section 5, we present the con-
clusion and future work. V ≈ Ṽ2 V1 (2)
where m1 is the number of latent sub-categories in the 2-nd
2 The Proposed Framework layer and V1 indicates the affiliation of m items to m1 latent
Throughout this paper, matrices are written as boldface cap- sub-categories. We name Ṽ2 as the latent category affiliation
ital letters such as A and Bi . For an arbitrary matrix M, matrix for the 2-layer implicit hierarchical structure because
M(i, j) denotes the (i, j)-th entry of M. ||M||F is the Frobe- it indicates the affiliation relation between d latent categories
nius norm of M and T r(M) is the trace norm of M if M is in the 1-st layer and m1 latent sub-categories in the 2-nd layer.
a square matrix. Let U = {u1 , u2 , . . . , un } be the set of n Since Ṽ2 is non-negative, we can further decompose the la-
users and V = {v1 , v2 , . . . , vm } be the set of m items. We
tent category affiliation matrix Ṽ2 to V2 ∈ Rm2 ×m1 and
use X ∈ Rn×m to denote the user-item rating matrix where
X(i, j) is the rating score from ui to vj if ui rates vj , oth- Ṽ3 ∈ Rd×m2 to get a 3-layer implicit hierarchical structure
erwise X(i, j) = 0. We do not assume the availability of of items as shown in Figure 2(b):
hierarchical structures of users and items, hence the input of V ≈ Ṽ3 V2 V1 (3)
the studied problem is only the user-item rating matrix X,
which is the same as that of traditional recommender systems. Let Ṽq−1 be the latent category affiliation matrix for the
Before going into details about how to model implicit hierar- (q − 1)-layer implicit hierarchical structure. The aforemen-
chical structures of users and items, we would like to first tioned process can be generalized to get the q-layer implicit
introduce the basic model of the proposed framework. hierarchical structure from (q − 1)-layer implicit hierarchical
(a) 2 Layers (b) 3 Layers (c) q Layers
Figure 2: Implicit Hierarchical Structures of Items via Deeply Factorizing the Item Characteristic Matrix.

structure by further factorizing Ṽq−1 into two non-negative


matrices as shown in Figure 2(c):
V ≈ Vq Vq−1 . . . V2 V1 (4)
Similarly, to model a p-layer user implicit hierarchical
structure, we can perform a deep factorization on U as
U ≈ U1 U2 . . . Up−1 Up (5)
where U1 is a n × n1 matrix, Ui (1 < i < p) is a ni−1 × ni
matrix and Up is a np−1 × d matrix.
With model components to model implicit hierarchical
structures of items and users, the framework HSR is proposed
to solve the following optimization problem
min ||W (X − U1 . . . Up Vq . . . V1 )||2F
U1 ,...,Up ,V1 ,...,Vq

Xp q
X
2
+ λ( ||Ui ||F + ||Vj ||2F )
Figure 3: An Illustration of The Proposed Framework HSR.
i=1 j=1

s.t. Ui ≥ 0, i ∈ {1, 2, . . . , p}, where Ai and Hi , 1 ≤ i ≤ p, are defined as:


Vj ≥ 0, j ∈ {1, 2, . . . , q} 
U1 U2 . . . Ui−1 if i 6= 1
(6) Ai = (8)
I if i = 1
An illustration of the proposed framework HSR is demon- And
strated in Figure 3. The proposed framework HSR performs 
Ui+1 . . . Up Vq . . . V1 if i 6= p
a deep factorizations on the user preference matrix U and the Hi = (9)
Vq . . . V1 if i = p
item characteristic matrix V to model implicit hierarchical
structures of items and users, respectively; while the original The Lagrangian function of Eq.(7) is
WNMF based recommender system only models flat struc- L(Ui ) = ||W (X−Ai Ui Hi )||2F +λ||Ui ||2F −T r(PT Ui )
tures as shown in the inner dashed box in Figure 3. (10)
where P is the Lagrangian multiplier. The derivative of
3 An Optimization Method for HSR L(Ui ) with respect to Ui is
The objective function in Eq.(6) is not convex if we update ∂L(Ui )
= 2ATi [W (Ai Ui Hi − X)] HTi + 2λUi − P
all the variable jointly but it is convex if we update the vari- ∂Ui
ables alternatively. We will first introduce our optimization (11)
method for HSR based on an alternating scheme in [Trigeor- By setting the derivative to zero and using Karush-Kuhn-
gis et al., 2014] and then we will give convergence analysis Tucker complementary condition [Boyd and Vandenberghe,
and complexity analysis of the optimization method. 2004], i.e., P(s, t)Ui (s, t) = 0, we get:
 T
Ai [W (Ai Ui V − X)] HTi + λUi (s, t)Ui (s, t) = 0

3.1 Inferring Parameters of HSR
(12)
Update Rule of Ui Eq.(12) leads to the following update rule of Ui as:
To update Ui , we fix the other variables except Ui . By re- s  T 
moving terms that are irrelevant to Ui , Eq.(6) can be rewritten Ai (W X)HTi (s, t)
as: Ui (s, t) ← Ui (s, t)  T 
Ai (W (Ai Ui Hi ))HTi + λUi (s, t)
min ||W (X − Ai Ui Hi )||2F + λ||Ui ||2F (7) (13)
Ui ≥0
Update Rule of Vi have p user layers and q item layers. This initializing process
Similarly, to update Vi , we fix the other variables except Vi . is summarized in Algorithm 3.1 from line 1 to line 9. After
By removing terms that are irrelevant to Vi , the optimization initialization, we will do fine-tuning by updating the Ui and
problem for Vi is: Vi using updating rules in Eq.(13) and Eq.(17) separately.
The procedure is to first update Vi in sequence and then Ui
min ||W (X − Bi Vi Mi )||2F + λ||Vi ||2F (14) in sequence alternatively, which is summarized in Algorithm
Vi ≥0
3.1 from line 10 to line 20. In line 21, we reconstruct the
where Bi and Mi , 1 ≤ i ≤ q, are defined as user-item matrix as Xpred = U1 . . . Up Vq . . . V1 . A miss-
ing rating from ui to vj will be predicted as Xpred (i, j)3 .

U1 . . . Up Vq . . . Vi+1 if i 6= q
Bi = (15)
U1 . . . Up if i = q
3.2 Convergence Analysis
and  In this subsection, we will investigate the convergence of Al-
Vi−1 . . . V1 if i 6= 1 gorithm 3.1. Following [Lee and Seung, 2001], we will use
Mi = (16)
I if i = 1 the auxiliary function approach to prove the convergence of
We can follow a similar way as Ui to derive update rule for the algorithm.
Vi as: 0

s Definition [Lee and Seung, 2001] G(h, h ) is an auxiliary


 T 
Bi (W X)MTi (s, t) function for F (h) if the conditions
Vi (s, t) ← Vi (s, t)  T T
 0
Bi (W (Bi Vi Mi ))Mi + λVi (s, t) G(h, h ) ≥ F (h), G(h, h) = F (h) (18)
(17)
are satisfied
Algorithm 1 The Optimization Algorithm for the Proposed Lemma 3.1 [Lee and Seung, 2001] If G is an auxiliary func-
Framework HSR. tion for F, then F is non-increasing under the update
Input: X ∈ Rn×m , λ, p, q, d and dimensions of each layer
Output: Xpred h(t+1) = arg min G(h, h( t)) (19)
p q
1: Initialize {Ui }i=1 and {Vi }i=1
2: Ũ1 , Ṽ1 ← WNMF(X, d)
Proof F (ht+1 ) ≤ G(h(t+1) , h(t) ) ≤ G(h(t) , h(t) ) ≤
3: for i = 1 to p-1 do G(h(t) )
4: Ui , Ũi+1 ← NMF(Ũi , ni ) Lemma 3.2 [Ding et al., 2006] For any matrices A ∈
0
5: end for Rn×n k×k
+ , B ∈ R + , S ∈ R+ , S ∈ R+
k×k k×k
and A, B are
6: for i = 1 to q-1 do symmetric, the following inequality holds
7: Ṽi+1 , Vi ← NMF(Ṽi , mi )
n X k 0
8: end for X (AS B)(s, t)S2 (s, t)
9: Up = Ũp , Vq = Ṽq 0 ≥ T r(ST ASB) (20)
s=1 t=1
S (s, t)
10: repeat
11: for i = 1 to p do Now consider the objective function in Eq.(7), it can be writ-
12: update Bi and Mi using Eq.(15) and Eq.(16) ten in the following form by expanding the quadratic terms
13: update Vi by Eq.(17) and removing terms that are irrelevant to Ui
14: end for
J (Ui ) = T r −2ATi (W X)HTi UTi

15:
16: for i = p to 1 do
+ T r ATi W (ATi Ui Hi ) HTi UTi
 
(21)
17: update Ai and Hi using Eq.(8) and Eq.(9)
T
18: update Ui by Eq.(13) + T r(λUi Ui )
19: end for
20: until Stopping criterion is reached Theorem 3.3 The following function
21: predict rating matrix Xpred = U1 . . . Up Vq . . . V1 0
G(U, U )
 
X Ui (s, t)
With the update rules for Ui and Vj , the optimization =−2 (ATi (W X)HTi )(s, t)Ui (s, t) 1 + log 0
algorithm for HSR is shown in Algorithm 3.1. Next we s,t
Ui (s, t)
briefly review Algorithm 3.1. In order to expedite the ap- 
X (ATi W (ATi Ui Hi ) HTi )(s, t)U2i (s, t)
proximation of the factors in HSR, we pre-train each layer + 0
to have an initial approximation of the matrices Ui and Vi . s,t
Ui (s, t)
To perform pretraining, we first use WNMF [Zhang et al.,
+T r(λUi UTi )
2006] to decompose the user-item rating matrix into Ũ1 Ṽ1 (22)
by solving Eq.(1). After that, we further decompose Ũ1 into
Ũ1 ≈ U1 Ũ2 and Ṽ1 ≈ Ṽ2 V1 using nonnegative matrix 3
The code can be downloaded from
factorization. We keep the decomposition process until we http://www.public.asu.edu/∼swang187/
is an auxiliary function for J (Ui ). Furthermore, it is a con- the Douban dataset and get a dataset consisting of 149,623
vex function in Ui and its global minimum is movie ratings of 1371 users and 1967 movies. For both
s  T  datasets, users can rate movies with scores from 1 to 5. The
Ai (W X)HTi (s, t) statistics of the two datasets are summarized in Table 1.
Ui (s, t) ← Ui (s, t)  T 
Ai (W (Ai Ui Hi ))HTi + λUi (s, t)
Table 1: Statistics of the Datasets
(23)
Dataset # of users # of items # of ratings
Proof The proof is similar to that in [Gu et al., 2010] and MovieLens100K 943 1682 100,000
thus we omit the details. Douban 1371 1967 149,623
Theorem 3.4 Updating Ui with Eq.(13) will monotonically
decrease the value of the objective in Eq.(6).
4.2 Evaluation Settings
Proof With Lemma 3.1 and Theorem 3.3, we have
(0) (0) (0) (1) (0) (1) Two widely used evaluation metrics, i.e., mean absolute er-
J (Ui ) = G(Ui , Ui ) ≥ G(Ui , Ui ) ≥ J (Ui ) ≥ ror (MAE) and root mean square error (RMSE), are adopted
. . . . That is, J (Ui ) decreases monotonically. to evaluate the rating prediction performance. Specifically,
Similarly, the update rule for Vi will also monotonically MAE is defined as
decrease the value of the objective in Eq.(6). Since the value P
of the objective in Eq.(6) is at least bounded by zero , we (i,j)∈T |X(i, j) − X̃(i, j)|
M AE = (24)
can conclude that the optimization method in Algorithm 3.1 |T |
converges.
and RMSE is defined as
3.3 Complexity Analysis v
uP  2
Initialization and fine-tuning are two most expensive opera-
u
t (i,j)∈T X(i, j) − X̃(i, j)
tions for Algorithm 3.1. For line 3 to 5, the time complex- RM SE = (25)
|T |
ity of factorization of Ũi ∈ Rni−1 ×d to Ui ∈ Rni−1 ×ni
and Ũi+1 ∈ Rni ×d is O(tni−1 ni d) for 1 < i < p, and where in both metrics, T denotes the set of ratings we want
O(tnn1 d) for i = 1, where t is number of iterations takes to predict, X(i, j) denotes the rating user i gave to item j
for the decomposition. Thus the cost of initializing Ui ’s is and X̃(i, j) denotes the predicted rating from ui to vj . We
O(td(nn1 + n1 n2 + · · · + np−2 np−1 ). Similarly, the cost of random select x% as training set and the remaining 1 − x%
initializing Vi ’s is O(td(mm1 + m1 m2 + · · · + mq−2 mq−1 ) as testing set where x is varied as {40, 60} is this work. The
(line 6 to 8). The computational cost of fine-tuning Ui in random selection is carried out 10 times independently, and
each iteration is O(nni−1 ni + nni m + ni−1 ni m). Sim- the average MAE and RMSE are reported. A smaller RMSE
ilarly, the computational cost of fine-tuning Vi in each it- or MAE value means better performance. Note that previous
eration is O(mmi−1 mi + mmi n + mi−1 mi n). Let n0 = work demonstrated that small improvement in RMSE or MAE
n, m0 = m, np = mq =P d, then the timePcomlexity of fine- terms can have a significant impact on the quality of the top-
p q
tuning is O(tf [(n + m)( i=1 ni−1 ni + j=1 mi−1 mi ) + few recommendation[Koren, 2008].
Pp Pq
nm( i=1 ni + j=1 mj )]), where tf is the number of iter-
ations takes to fine-tune. The overall time conplexity is the 4.3 Performance Comparison of Recommender
sum of the costs of initialization and fine-tuning. Systems
The comparison results are summarized in Tables 2 and 3 for
4 Experimental Analysis MAE and RMSE, respectively. The baseline methods in the
table are defined as:
In this section, we conduct experiments to evaluate the ef-
fectiveness of the proposed framework HSR and factors that • UCF: UCF is the user-oriented collaborative filtering
could affect the performance of HSR. We begin by introduc- where the rating from ui to vj is predicted as an aggre-
ing datasets and experimental settings, then we compare HSR gation of ratings of K most similar users of ui to vj . We
with the state-of-the-art recommendation systems. Further use the cosine similarity measure to calculate user-user
experiments are conducted to investigate the effects of dimen- similarity.
sions of layers on HSR. • MF: matrix factorization based collaborative filtering
4.1 Datasets tries to decompose the user-item rating matrix into
two matrices such that the reconstruction error is min-
The experiments are conducted on two publicly available imized [Koren et al., 2009].
benchmark datasets, i.e., MovieLens100K 4 and Douban 5 .
MovieLens100K consists of 100,000 movie ratings of 943 • WNMF: weighted nonnegative matrix factorization tries
users for 1682 movies. We filter users who rated less than 20 to decompose the weighted rating matrix into two non-
movies and movies that are rated by less than 10 users from negative matrices to minimize the reconstruction er-
ror [Zhang et al., 2006]. In this work, we choose
4
http://grouplens.org/datasets/movielens/ WNMF as the basic model of the proposed framework
5
http://dl.dropbox.com/u/17517913/Douban.zip HSR.
Table 2: MAE comparison on MovieLens100K and Douban
Methods UCF MF WNMF HSR-User HSR-Item HSR
40% 0.8392 0.7745 0.8103 0.7559 0.7551 0.7469
MovieLens100K
60% 0.8268 0.7637 0.7820 0.7359 0.7363 0.7286
40% 0.6407 0.5973 0.6192 0.5792 0.5786 0.5767
Douban
60% 0.6347 0.5867 0.6059 0.5726 0.5721 0.5685

Table 3: RMSE comparison on MovieLens100K and Douban


Methods UCF MF WNMF HSR-User HSR-Item HSR
40% 1.0615 0.9792 1.0205 0.9681 0.9672 0.9578
MovieLens100K
60% 1.0446 0.9664 0.9953 0.9433 0.9412 0.9325
40% 0.8077 0.7538 0.7807 0.7313 0.7304 0.7284
Douban
60% 0.7988 0.7403 0.7637 0.7225 0.7219 0.7179

• HSR-Item: HSR-Item is a variant of the proposed chical structures of users and items contain complemen-
framework HSR. HSR-Item only considers the implicit tary information and capturing them simultaneously can
hierarchical structure of items by setting p = 1 in HSR. further improve the recommendation performance.
• HSR-User: HSR-User is a variant of the proposed 4.4 Parameter Analysis
framework HSR. HSR-Users only considers the implicit
hierarchical structure of users by setting q = 1 in HSR. In this subsection, we investigate the impact of dimensions
of implicit layers on the performance of the proposed frame-
work HSR. We only show results with p = 2 and q = 2, i.e.,
0.722
W X ≈ W (U1 U2 V2 V1 ) with U1 ∈ Rn×n1 , U2 ∈
0.572
Rn1 ×d , V1 ∈ Rd×m1 , and V2 ∈ Rm1 ×m , since we have
RMSE

0.72
MAE

0.718
0.57 similar observations with other settings of p and q. We fix
500
200 400
500
0.568
200
400 d to be 20 and vary the value of n1 as {100, 200, 300, 400,
400
600
200
300 400
600
200
300
500} and the value of m1 as {200, 400, 600, 800, 1000}. We
800 800
m1
1000 100 n1
m1
1000 100 n1
only show results with 60% of the datasets as training sets due
(a) RMSE for Douban 60% (b) MAE for Douban 60% to the page limitation and the results are shown in Figure 4.
In general, when we increase the numbers of dimensions, the
performance tends to first increase and then decrease. Among
0.95
n1 and m1 , the performance is relatively sensitive to m1 .
0.945 0.74

5 Conclusion
RMSE

MAE

0.94
0.735
0.935

0.93
500 0.73 500 In this paper, we study the problem of exploiting the implicit
400 400
200
400
600
300 200
400
300 hierarchical structures of items and users for recommendation
200 600 200
800
m1
1000 100 n1
m1
800
1000 100 n1 when they are not explicitly available and propose a novel
recommendation framework HSR, which captures the im-
(c) RMSE for MovieLens100K (d) MAE for MovieLens100K
60% 60%
plicit hierarchical structures of items and users into a coher-
ent model. Experimental results on two real-world datasets
Figure 4: Parameter Analysis for HSR. demonstrate the importance of the implicit hierarchical struc-
tures of items and those of users in the recommendation per-
Note that parameters of all methods are determined via formance improvement.
cross validation. Based on the results, we make the following There are several interesting directions needing further in-
observations: vestigation. First, in this work, we choose the weighted non-
• In general, matrix factorization based recommender sys- negative matrix factorization as our basic model to capture
tems outperform the user-oriented CF method and this the implicit hierarchical structures of items and users and we
observation is consistent with that in [Koren et al., would like to investigate other basic models. Since social net-
2009]. works are pervasively available in social media and provide
• Both HSR-Item and HSR-Users obtain better results independent sources for recommendation, we will investigate
than WNMF. We perform t-test on these results, which how to incorporate social network information into the pro-
suggest that the improvement is significant. These re- posed framework.
sults indicate that the implicit hierarchical structures of
users and items can improve the recommendation per- 6 Acknowledgements
formance. Suhang Wang, Jiliang Tang and Huan Liu are supported by,
• HSR consistently outperforms both HSR-Item and or in part by, the National Science Foundation (NSF) un-
HSR-Users. These results suggest that implicit hierar- der grant number IIS-1217466 and the U.S. Army Research
Office (ARO) under contract/grant number 025071. Yilin and hierarchical structure of user profiles. Knowledge-
Wang is supported by the NSF under contract/grant number Based Systems, 47:1–13, 2013.
1135656. Any opinions expressed in this material are those [Mnih and Salakhutdinov, 2007] Andriy Mnih and Ruslan
of the authors and do not necessarily reflect the views of the Salakhutdinov. Probabilistic matrix factorization. In Ad-
NSF and ARO. vances in neural information processing systems, pages
1257–1264, 2007.
References [Moreno-Jimenez and Vargas, 1993] Jose Maria Moreno-
[Boyd and Vandenberghe, 2004] Stephen Boyd and Lieven Jimenez and Luis G Vargas. A probabilistic study of pref-
Vandenberghe. Convex optimization. Cambridge univer- erence structures in the analytic hierarchy process with in-
sity press, 2004. terval judgments. Mathematical and Computer Modelling,
[Ding et al., 2006] Chris Ding, Tao Li, Wei Peng, and Hae- 17(4):73–81, 1993.
sun Park. Orthogonal nonnegative matrix t-factorizations [Resnick and Varian, 1997] Paul Resnick and Hal R Varian.
for clustering. In Proceedings of the 12th ACM SIGKDD Recommender systems. Communications of the ACM,
international conference on Knowledge discovery and 40(3):56–58, 1997.
data mining, pages 126–135. ACM, 2006.
[Si and Jin, 2003] Luo Si and Rong Jin. Flexible mixture
[Gao et al., 2013] Huiji Gao, Jiliang Tang, Xia Hu, and Huan model for collaborative filtering. In ICML, volume 3,
Liu. Exploring temporal effects for location recommenda- pages 704–711, 2003.
tion on location-based social networks. In Proceedings of
the 7th ACM conference on Recommender systems, pages [Srebro et al., 2004] Nathan Srebro, Jason Rennie, and
93–100. ACM, 2013. Tommi S Jaakkola. Maximum-margin matrix factoriza-
tion. In Advances in neural information processing sys-
[Gu et al., 2010] Quanquan Gu, Jie Zhou, and Chris HQ tems, pages 1329–1336, 2004.
Ding. Collaborative filtering: Weighted nonnegative ma-
trix factorization incorporating user and item graphs. In [Tang et al., 2013] Jiliang Tang, Xia Hu, Huiji Gao, and
SDM, pages 199–210. SIAM, 2010. Huan Liu. Exploiting local and global social context
for recommendation. In Proceedings of the Twenty-Third
[Herlocker et al., 1999] Jonathan L Herlocker, Joseph A international joint conference on Artificial Intelligence,
Konstan, Al Borchers, and John Riedl. An algorithmic pages 2712–2718. AAAI Press, 2013.
framework for performing collaborative filtering. In Pro-
ceedings of the 22nd annual international ACM SIGIR [Trigeorgis et al., 2014] George Trigeorgis, Konstantinos
conference on Research and development in information Bousmalis, Stefanos Zafeiriou, and Bjoern Schuller. A
retrieval, pages 230–237. ACM, 1999. deep semi-nmf model for learning hidden representations.
In Proceedings of the 31st International Conference on
[Hofmann, 2004] Thomas Hofmann. Latent semantic mod- Machine Learning (ICML-14), pages 1692–1700, 2014.
els for collaborative filtering. ACM Transactions on Infor-
mation Systems (TOIS), 22(1):89–115, 2004. [Wang et al., 2006] Jun Wang, Arjen P De Vries, and Mar-
cel JT Reinders. Unifying user-based and item-based col-
[Koren et al., 2009] Yehuda Koren, Robert Bell, and Chris
laborative filtering approaches by similarity fusion. In Pro-
Volinsky. Matrix factorization techniques for recom- ceedings of the 29th annual international ACM SIGIR con-
mender systems. Computer, 42(8):30–37, 2009. ference on Research and development in information re-
[Koren, 2008] Yehuda Koren. Factorization meets the neigh- trieval, pages 501–508. ACM, 2006.
borhood: a multifaceted collaborative filtering model. In [Wang et al., 2011] Fei Wang, Tao Li, Xin Wang, Shenghuo
Proceedings of the 14th ACM SIGKDD international con- Zhu, and Chris Ding. Community discovery using nonneg-
ference on Knowledge discovery and data mining, pages ative matrix factorization. Data Mining and Knowledge
426–434. ACM, 2008. Discovery, 22(3):493–521, 2011.
[Koren, 2010] Yehuda Koren. Collaborative filtering with [Xu et al., 2003] Wei Xu, Xin Liu, and Yihong Gong. Doc-
temporal dynamics. Communications of the ACM, ument clustering based on non-negative matrix factoriza-
53(4):89–97, 2010. tion. In Proceedings of the 26th annual international ACM
[Lee and Seung, 2001] Daniel D Lee and H Sebastian Se- SIGIR conference on Research and development in infor-
ung. Algorithms for non-negative matrix factorization. In maion retrieval, pages 267–273. ACM, 2003.
Advances in neural information processing systems, pages [Yu et al., 2004] Kai Yu, Anton Schwaighofer, Volker Tresp,
556–562, 2001.
Xiaowei Xu, and H-P Kriegel. Probabilistic memory-
[Lu et al., 2012] Kai Lu, Guanyuan Zhang, Rui Li, Shuai based collaborative filtering. Knowledge and Data Engi-
Zhang, and Bin Wang. Exploiting and exploring hierar- neering, IEEE Transactions on, 16(1):56–69, 2004.
chical structure in music recommendation. In Information [Zhang et al., 2006] Sheng Zhang, Weihong Wang, James
Retrieval Technology, pages 211–225. Springer, 2012.
Ford, and Fillia Makedon. Learning from incomplete rat-
[Maleszka et al., 2013] Marcin Maleszka, Bernadetta Mi- ings using non-negative matrix factorization. In SDM,
anowska, and Ngoc Thanh Nguyen. A method for collabo- pages 549–553. SIAM, 2006.
rative recommendation using knowledge integration tools

You might also like