Modelling Cascades Over Time in Microblogs

Wei Xie, Feida Zhu, Siyuan Liu and Ke Wang*
Living Analytics Research Centre
Singapore Management University
* Ke Wang is from Simon Fraser University, and this work was done when the author was visiting
Living Analytics Research Centre in Singapore Management University.
Motivation
Business applications such as viral marketing have driven a lot of research effort on predicting whether a cascade will go viral.
In real life, there are very few truly viral cascades.
Previous research work* shows that temporal features are the key predictor of cascade size.
* Justin Cheng, Lada A. Adamic, P. Alex Dow, Jon M. Kleinberg, Jure Leskovec:
Can cascades be predicted? WWW 2014: 925-936
Time-aware Cascade Model

[Figure: an example cascade over users u0 to u5. At time t, users u0 to u3 have re-shared at times t0 to t3; by time t + dt, user u4 has also re-shared at time t4.]

The probability that user $u_i$ re-shares in the interval from $t$ to $t + dt$ is given by a hazard function:

$$P_i(t) = h_i\big(t, \{t_j\}_{u_j \in \mathrm{Followee}^{(i)}(t)}; \theta\big)\,dt$$

The probability of the cascade is built up step by step:

$$P(C(t + dt)) = P(C(t + dt) \mid C(t))\,P(C(t)), \qquad P(C(t_0)) = 1 \tag{1}$$

$$P(C(t + dt) \mid C(t)) = \prod_{u_i \in X(t)} P_i(t) \prod_{u_i \notin X(t)} \big(1 - P_i(t)\big) \tag{2}$$

The first product runs over users who have re-shared, the second over users who haven't yet.
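The two-product form of the step probability can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the reading of the first product as "users who re-share in the current step" are choices made here, not something the slides specify.

```python
from math import prod


def transition_probability(p, resharing):
    """P(C(t + dt) | C(t)) as a product over candidate users.

    p         -- dict mapping user id -> P_i(t), the chance that the user
                 re-shares in the current step (assumed to be given)
    resharing -- set of user ids that do re-share in the current step
    """
    # Users who re-share contribute P_i(t); all others contribute 1 - P_i(t).
    return prod(p_i if user in resharing else 1.0 - p_i
                for user, p_i in p.items())


def cascade_probability(steps):
    """Chain the steps together, starting from P(C(t0)) = 1."""
    prob = 1.0
    for p, resharing in steps:
        prob *= transition_probability(p, resharing)
    return prob
```

For example, with `p = {"u1": 0.2, "u2": 0.5}` and only `u1` re-sharing, the step probability is 0.2 times 0.5.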
Observations in Twitter
Observation 1. Only the first re-sharer matters.

$$P_i(t) = h_i(t, t_j; \theta)\,dt, \quad \text{where } j = \arg\min_j \{t_j \mid u_j \in \mathrm{Followee}^{(i)}(t)\}$$

Observation 2. The chance that a tweet is retweeted decreases as time goes by.

$$P_i(t) = h_i(\tau; \theta)\,dt$$

where $\tau = t - t_j$ and $h_i(\cdot)$ is a decreasing function.
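As a small illustration of the two observations, the sketch below picks the earliest re-share among a user's followees and returns the elapsed time since that exposure. The function name and the plain list of timestamps are assumptions made for this example.

```python
def first_resharer_delay(t, followee_reshare_times):
    """Return tau = t - t_j for the earliest followee re-share at or before t.

    followee_reshare_times -- re-share times of the user's followees
                              (hypothetical input layout)
    Returns None when no followee has re-shared by time t.
    """
    # Keep only followees who have already re-shared by time t.
    seen = [t_j for t_j in followee_reshare_times if t_j <= t]
    if not seen:
        return None
    # Observation 1: only the first re-sharer matters, so take the minimum.
    return t - min(seen)
```

The returned delay is the single argument a decreasing hazard function would take under Observation 2.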
Hazard Function Design

For an event time $T$ with density $f(t)$ and distribution function $F(t)$, the hazard function is

$$h(t) = \lim_{dt \to 0} \frac{P(t < T \le t + dt \mid T > t)}{dt} = \frac{f(t)}{1 - F(t)}$$

The cumulative hazard is

$$H(t) = \int_0^t h(u)\,du = \int_0^t \frac{F'(u)}{1 - F(u)}\,du = -\log(1 - F(u))\Big|_0^t = -\log(1 - F(t))$$

so that

$$F(t) = 1 - e^{-H(t)}$$

Exponential distribution: $H(t) = \lambda t$, $F(t) = 1 - e^{-\lambda t}$
Weibull distribution: $H(t) = (\lambda t)^k$, $F(t) = 1 - e^{-(\lambda t)^k}$

In both cases $H(\infty) = \infty \Rightarrow F(\infty) = 1 - e^{-\infty} = 1$.

Long-tail hazard:

$$H(\tau) = \lambda\big(1 - (\tau + 1)^{-\alpha}\big)$$

$$h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda\alpha(\tau + 1)^{-(\alpha + 1)}$$

$\lambda$: scale parameter
$\alpha$: shape parameter
$F(\infty) \approx H(\infty) = \lambda$ describes the eventual re-tweeting probability
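The following sketch evaluates a long-tail hazard numerically, assuming the form H(tau) = lam * (1 - (tau + 1)^(-alpha)) with `lam` as the scale parameter and `alpha` as the shape parameter. The parameter names and function names are labels chosen for this example.

```python
import math


def cumulative_hazard(tau, lam, alpha):
    """H(tau) = lam * (1 - (tau + 1)^(-alpha)); tends to lam as tau grows."""
    return lam * (1.0 - (tau + 1.0) ** (-alpha))


def hazard(tau, lam, alpha):
    """h(tau) = dH/dtau = lam * alpha * (tau + 1)^(-(alpha + 1))."""
    return lam * alpha * (tau + 1.0) ** (-(alpha + 1.0))


def reshare_probability(tau, lam, alpha):
    """F(tau) = 1 - exp(-H(tau)): chance of having re-shared by delay tau."""
    return 1.0 - math.exp(-cumulative_hazard(tau, lam, alpha))
```

Because H is bounded by `lam`, the eventual re-share probability stays below 1, and for small `lam` it is close to `lam` itself, since 1 - exp(-lam) is approximately lam.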
Hazard Rate Illustration

[Figure: retweeting rate (0 to 20) against time in minutes, with tC and 60 marked on the time axis.]
[Figure: hazard rate (0 to 16e-4) against time (0 to 60 minutes), comparing the empirical rate with the estimated rate.]
Dataset
From a Singapore-based Twitter data set, we extract all retweets to construct retweeting cascades, which yields 2,425,348 cascades in total.
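A minimal sketch of how retweet records might be grouped into cascades is given below. The record layout `(original_tweet_id, user_id, timestamp)` is hypothetical; the slides do not describe the actual data format or preprocessing.

```python
from collections import defaultdict


def build_cascades(retweets):
    """Group retweet records by the tweet they re-share.

    retweets -- iterable of (original_tweet_id, user_id, timestamp) tuples
                (an assumed layout, for illustration only)
    Returns a dict: original_tweet_id -> list of (timestamp, user_id),
    sorted by time.
    """
    cascades = defaultdict(list)
    for original_id, user_id, timestamp in retweets:
        cascades[original_id].append((timestamp, user_id))
    # Sort each cascade chronologically so re-share order is explicit.
    return {oid: sorted(events) for oid, events in cascades.items()}
```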
Probabilistic Model Fitting

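TMt: Threshold Model

$$h_i(t) = s\big(|\mathrm{Followee}^{(i)}(t)|\big), \quad \text{where } s(x) = \frac{1}{1 + e^{-a(x - b)}}$$

TCM-CH: Constant Hazard

$$H(\tau) = \lambda\tau, \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda$$

TCM-EH: Exponential Hazard

$$H(\tau) = \lambda\big(1 - e^{-k\tau}\big), \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda k e^{-k\tau}$$

TCM-LH: Long-tail Hazard (our proposed model)

$$H(\tau) = \lambda\big(1 - (\tau + 1)^{-\alpha}\big), \qquad h(\tau) = \frac{dH(\tau)}{d\tau} = \lambda\alpha(\tau + 1)^{-(\alpha + 1)}$$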


For each cascade, observe its development in the first $T_0$ for training, and the next $T$ for testing.
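The slides do not spell out the fitting procedure. One common choice for hazard models is maximum likelihood with right-censoring, sketched here for a long-tail hazard of the form H(tau) = lam * (1 - (tau + 1)^(-alpha)). Everything in this block, including the function names, the input layout, and the use of a plain grid search, is an assumption made for illustration.

```python
import math


def H_long_tail(tau, lam, alpha):
    """Cumulative hazard lam * (1 - (tau + 1)^(-alpha))."""
    return lam * (1.0 - (tau + 1.0) ** (-alpha))


def h_long_tail(tau, lam, alpha):
    """Hazard rate lam * alpha * (tau + 1)^(-(alpha + 1))."""
    return lam * alpha * (tau + 1.0) ** (-(alpha + 1.0))


def log_likelihood(delays, censored, lam, alpha):
    """Survival-analysis log-likelihood.

    delays   -- delays tau of exposed users who re-shared in the training window
    censored -- exposure durations of users who had not re-shared by its end
    """
    # Observed events contribute log f(tau) = log h(tau) - H(tau).
    ll = sum(math.log(h_long_tail(tau, lam, alpha)) - H_long_tail(tau, lam, alpha)
             for tau in delays)
    # Censored users contribute log S(tau) = -H(tau).
    ll -= sum(H_long_tail(tau, lam, alpha) for tau in censored)
    return ll


def grid_search(delays, censored, lams, alphas):
    """Return the (lam, alpha) pair on the grid with the highest log-likelihood."""
    return max(((lam, alpha) for lam in lams for alpha in alphas),
               key=lambda pair: log_likelihood(delays, censored, *pair))
```

A gradient-based optimiser would normally replace the grid search; the grid keeps the sketch dependency-free.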
Predicting Cascade Growth
Virality Prediction
Thanks
Our work is based on previous cascade models:
J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 12(3):211–223, 2001.
M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the temporal dynamics of diffusion networks. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011, pages 561–568, 2011.
S. A. Myers, C. Zhu, and J. Leskovec. Information diffusion and external influence in networks. In The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '12, Beijing, China, August 12-16, 2012, pages 33–41, 2012.
M. Gomez-Rodriguez, J. Leskovec, and B. Schölkopf. Modeling information propagation with survival theory. In ICML (3), pages 666–674, 2013.
N. Du, L. Song, M. Gomez-Rodriguez, and H. Zha. Scalable influence estimation in continuous-time diffusion networks. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 3147–3155, 2013.
