Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/
info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact support@jstor.org.
Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to Statistical Science.
http://www.jstor.org
This content downloaded from 202.92.130.58 on Tue, 10 Mar 2015 09:28:56 UTC
All use subject to JSTOR Terms and Conditions
Statistical Science
2006, Vol. 21, No. 2, 256-276
DOI: 10.1214/088342306000000222
© Institute of Mathematical Statistics, 2006
Network-Based Marketing: Identifying Likely Adopters via Consumer Networks
Shawndra Hill, Foster Provost and Chris Volinsky
Abstract. Network-based marketing refers to a collection of marketing techniques that take advantage of links between consumers to increase sales. We concentrate on the consumer networks formed using direct interactions. Using a new data set that represents the adoption of a new telecommunications service, we show very strong support for the hypothesis that network linkage can directly affect product/service adoption. Specifically, we show three main results: (1) "Network neighbors" (those consumers linked to a prior customer) adopt the service at a rate 3-5 times greater than baseline groups selected by the best practices of the firm's marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More detailed network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption.

Key words and phrases: Viral marketing, word of mouth, targeted marketing, network analysis, classification, statistical relational learning.
1. INTRODUCTION
Network-based
recognition
Shawndra Hill
is Associate
is a Doctoral
Professor,
and Management
Operations
Candidate
Department
viral marketing
(we do not consider multilevel
which
has
become
known as "network"
ing,
of Information,
Sciences,
N.
Leonard
Stem
Volinsky isDirector,
Labs
Jersey
Research,
07932,
Shannon
USA
Laboratory,
(e-mail:
Firms may
to-consumer
Chris
fprovost@stern.nyu.edu).
Florham
Park,
market
market
AT&T
use
their websites
advocacy
and Shah,
(Kautz, Selman
tomer feedback mechanisms
New
volinsky@research.att.com).
to facilitate
via product
1997)
consumer
recommendations
or via on-line
(Dellarocas,
2003).
cus
list,
re
Consumer networks may also provide leverage to the advertising or marketing strategy of the firm. For example, in this paper we show how analysis of a consumer network improves targeted marketing.
This paper makes two contributions. First we survey the burgeoning methodological research literature on network-based marketing, in particular on statistical analyses for network-based marketing. We review
be influential
the research
to spread information
about a product via word
of mouth,
it has been called viral marketing,
although
that term could be used to describe any network-based
techniques
portunities
us to postulate necessary data requirements for studying the effectiveness of network-based marketing and to highlight the lack of current research that satisfies those requirements. Specifically, research must have access both to direct links between consumers and to direct information on the consumers' product adoption.
Because of inadequate data, prior studies have not been able to provide direct, statistical support (Van den Bulte and Lilien, 2001) for the hypothesis that network linkage can directly affect product/service adoption.
The second contribution is to provide empirical support that network-based marketing indeed can improve on traditional marketing techniques. We introduce telecommunications data that present a natural testbed for network-based marketing models, in which communication linkages as well as product adoption rates can be observed. For these data, we show three main results: (1) "Network neighbors" (those consumers linked to a prior customer) adopt the service at a rate 3-5 times greater than baseline groups selected
to
enough (e.g., individuals, booksellers)
the traffic in paid-for editions
(Paumgarten,
When
to con
firms give explicit
incentives
stimulate
2003).
sumers
where
marketing
from
spreads
tion of using
commonly
athletes)
capitalize
to advocate
"cool" members
particularly
to adopt products
(Gladwell,
and
1997; Hightower,
Baker,
2002).
Brady
Network
targeting'. The third mode of network-based
is for the firm to market to prior purchasers'
marketing
social-network
any advo
neighbors,
possibly without
For network
the
cacy at all by customers.
targeting,
firm must have some means
to identify
these social
team. In
of the firm's marketing
by the best practices
the network allows the firm to ac
addition, analyzing
quire new customers who otherwise would have fallen
demographic
and substantially
formation.
allows
permit
improved
by
(3) More
network information
sophisticated
so as to
the ranking of the network neighbors
the selection of small sets of individuals with
of adoption.
example
The Hotmail
targeting and implicit advocacy:
free e-mail service appended
to the bottom of every
e-mail message
the hyperlinked
advertise
outgoing
ment,
complementary,
modes
"Get
targeting
(Montgomery,
user's
implicit
of
Individuals become
vocal advo
advocacy:
Explicit
cates for the product or service, recommending
it to
their friends or acquaintances.
Particular
individuals
at Hotmail,"
thereby
every current user
of
while
2001),
taking
Hotmail
advocacy.
customer base.
tially increasing
in the first month
alone Hotmail
of the
advantage
saw an exponen
Started
in July 1996,
acquired 20,000 cus
1996 the firm had acquired over
tomers. By September
100,000 accounts, and by early
lion subscribers.
some
There
in combination.
may be used
of viral marketing
combines net
work
Traditional
2. NETWORK-BASEDMARKETING
products
simply by conspicuous
firms have tried to induce the
adoption. More
recently,
same effect by convincing
of smaller social groups
A well-cited
as implicit advocates.
Firms
on influential individuals
(such as
consumers
neighbors.
These
three modes
or adoption
to consumer.
do not speak
Implicit advocacy: Even if individuals
about a product, they may advocate
through
implicitly
their actions?especially
through their own adoption
of the product. Designer
labeling has a long tradi
sumers
of awareness
the pattern
consumer
segments
marketing
of
methods
consumers.
do
Some
not
1mil
to
appeal
consumers
ap
more,
come
more
although
available
and more
has be
noise.
of
network-based
marketing
assumption
is that consumers
propagate
explicit advocacy
about products after they either
information
"positive"
aware of the product by traditional
have been made
key
through
vehicles
marketing
themselves.
Under
or have
the product
experienced
a particular subset
this assumption,
of consumers
more
friends
should want
useful
and Domingos,
2002). Firms
(Richardson
to find these influencers
and to promote
behavior.
Many
quantitative
research
pirical marketing
independently.
are collected
methods
assume
used
in em
that consumers
act
attributes
Typically, many explanatory
on each actor and used
in multivari
or tree induction.
In
ate modeling
such as regression
assumes
interde
network-based
contrast,
marketing
inter
among consumer
preferences. When
pendency
dependencies
their effects
to account for
exist, itmay be beneficial
in statis
in targeting models. Traditionally
as part of
are modeled
tical research, interdependencies
a
a covariance
either
within
structure,
particular obser
ex
vational unit (as in the case of repeated measures
or between
units. Studies of
observational
periments)
instead
network-based
attempt to measure
marketing
these interdependencies
through implicit links, such as
on geographic
or demographic
attributes, or
matching
of
links, such as direct observation
between actors. In this section, we re
the different
types of data and the range of statis
through explicit
communications
view
tical methods
we discuss
the extent
accommodate
to analyze
these methods
networked
them, and
naturally
data.
we discuss
the final
subsection,
inherent
challenges
in incorporating
this network
struc
ture.
3.1 Econometric Models
Econometrics is the application of statistical methods to the empirical estimation of economic relationships. In marketing
the estimation of two simultaneous equations: one for the marketing organization or firm and one for the market. Regression
and time-series
metric
used
3. LITERATUREREVIEW
statistical
an
In each case, we provide
systems.
of the approach and a discussion
of a promi
nent example. This (brief) survey is not exhaustive.
In
recommender
overview
to study the im
on
rice consump
interdependent
preferences
automobile
1991),
(Case,
(Yang and
purchases
have been used
of
Allenby,
Smith
and York,
studies, geogra
as
a
be
in
for
proxy
part
interdependence
phy
as opposed
tween consumers,
to direct, explicit com
are used in
munication.
different methods
However,
2003).
is used
the analysis.
Most
(2003)
recently, Yang and Allenby
are
that traditional
random effects models
suggested
not sufficient
sumer
to measure
networks.
chical mixture
the interdependencies
a Bayesian
developed
They
model where
of con
hierar
is built
interdependence
through an autoregressive
allows testing of the presence
structure
consumers
between
who
to
and demography
links are
in which
exhibit
or
geographic
showed
that the
demographic
is more
geographically
useful
than the demographic
for explaining
network
consumer behavior as it relates to purchasing
Japanese
cars. Although
they do not have data on direct commu
mod
six types of statistical research: (1) econometric
network
classification
surveys,
(2)
(3)
eling,
modeling,
with convenience
(4) designed
samples,
experiments
posed
Work
in network-based
of statistics,
(5) diffusion
theory
filtering
and
nication
between
consumers
as op
A drawback of this approach is that the interdependence matrix has size n^2, where n is the number of consumers; consumer networks are extremely large and prohibit parameter estimation using this method. Sparse matrix techniques or clever clustering of the observations would be a natural extension.
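To make the storage argument concrete, the following is a minimal sketch of a sparse representation of consumer links. It is not the method of any study cited here; the function names and the toy data are assumptions introduced only for illustration. Only observed links are stored, so memory grows with the number of links rather than with n^2.

```python
from collections import defaultdict


def build_sparse_links(pairs):
    """Store only observed consumer-to-consumer links as a dict of neighbor sets."""
    links = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-links
            links[a].add(b)
            links[b].add(a)
    return links


def stored_entries(links):
    """Number of nonzero entries kept (each undirected link is counted twice)."""
    return sum(len(neighbors) for neighbors in links.values())


def dense_entries(n):
    """Number of cells in a full n-by-n interdependence matrix."""
    return n * n


if __name__ == "__main__":
    # Hypothetical toy data: three distinct links among a million consumers.
    observed = [(1, 2), (2, 3), (1, 2), (7, 9)]
    links = build_sparse_links(observed)
    print("sparse entries:", stored_entries(links))
    print("dense entries: ", dense_entries(1_000_000))
```

The dictionary-of-sets layout is one simple choice; a compressed sparse matrix from a numerical library would serve the same purpose when matrix algebra is needed.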
3.2 Network Classification Models
Network classification models use knowledge of the links between entities in a network to estimate
a quan
in such a
about
research
have not been applied to consumer data. Much
in network classification
has grown out of the pioneer
(1999) on hubs and authorities
ing work by Kleinberg
potheses.
vey data
"point" to them. Al
both are
study uses statistical models,
notions of degree centrality
related to well-understood
and distance centrality from the field of social-network
by how many
though neither
others
influential
One paper that models a consumer network for maximizing profit is by Richardson and Domingos (2002), in which a social network of customers is modeled as a Markov random field. The probability that a given customer will buy a given product is a function of the states of her neighbors, attributes of the product and whether or not the customer was marketed to. In this framework
to every
to that customer,
the impact that
including
marketing
the marketing
action will have on the rest of the net
(e.g., through word of mouth). The authors tested
reviews from an In
their model on a database of movie
work
non-network
Their
logistic
on
the manufacturer
of an item
implies
a purchase
ofthat
asked
specific
the manufacturer
word-of-mouth
subsequent
able to capture whether
ers of their experience
and
were
analysis.
customer
researchers
product.
about
questions
behavior.
the customers
told oth
word-of-mouth
the researchers
which
contacts
behavior,
of the consumers'
product. Therefore,
of-mouth
actually
3.4
Designed
do not know
the
later purchased
address whether word
they cannot
affects individual
with
Experiments
sales.
Convenience
Samples
to study
enable researchers
Designed
experiments
a
network-based
in
controlled
marketing
setting. Al
a
the
convenience
though
subjects typically
comprise
who answer an
sample (such as those undergraduates
ad in the school
ment
the Web.
Typically
ANOVA
is used
to draw
conclu
sions.
Surveys
Most
research
tion on whether
Frenzen
in this area does
consumers
actually
not have
informa
and Nakamoto
that influence
formation
individuals'
through
(1993)
decisions
a market
studied
the factors
to disseminate
via word-of-mouth.
in
The
nontrusted
studied
esize
about
data
from
their networks.
a convenience
used
The
experiments
to generalize
sample
the
over
consumer
a complete
network. The authors also em
in their study. They found that the
ployed simulations
be
hazard
moral
the
(the risk of problematic
stronger
the stronger the
havior) presented by the information,
Gen
ties must be to foster information
propagation.
structure and
erally, the authors showed that network
information
characteristics
interact when
transmission
individuals
decisions.
Models
Diffusion
Diffusion
tools, both quantitative
theory provides
to assess
the likely rate of diffusion
and qualitative,
or product. Qualitatively,
researchers
of a technology
numerous
factors that facilitate or hin
have identified
der technology
2004), as well as
(Fichman,
adoption
social
that influence
factors
(Rogers,
product adoption
research involves empir
diffusion
2003). Quantitative
often
from diffusion models,
ical testing of predictions
informed by economic
theory.
The
most
notable
and most
influential
diffusion
model
individual
of product
adoption. Models
incorporate
assume
is ef
diffusion
that network-based
marketing
occurs
when
diffusion
understanding
and the extent to which
it is effective
is important
from using
for marketers,
these methods
benefit
may
fective.
individual-level
enable
In general,
tend accepted
aggregate-level
and the overall
Tout,
Evans
prod
the good
sales predictions,
measuring
of fit (R2 value) of the model
for 11 consumer
durable products. The success of the forecasts suggests
that the model may be useful in providing
long-range
ness
for product
sales or adoption. There has
forecasting
since
been considerable
work on diffusion
follow-up
this groundbreaking
and Kerin
work. Mahajan, M?ller
(1984) review this work. Recent work on product diffu
sion explores
2003) as well
2002)
of the product
2005);
they
(Ueda,
typically
the extent
to which
the Internet
as globalization
(Kumar
a
in
role
diffusion.
play
product
3.6 Collaborative
(Fildes,
and Krishnan,
and Recommender
Filtering
Systems
Recommender
to
mendations
recom
on
de
content
and
and link data (Adomavicius
methods
focus
Collaborative
Tuzhilin,
2005).
filtering
on the links between consumers;
the links are
however,
consumers
not direct. They associate
with each other
mographic
based
on shared purchases
Collaborative
filtering
or similar
is related
network-based
ratings of shared
to explicit consumer
both target market
because
marketing
tasks
benefit
from
ing
learning from data stored inmul
tables
and
(Getoor, 2005). For example, Getoor
tiple
Sahami
(1999), Huang, Chung and Chen (2004) and
and Greiner
between
relational
the connection
(2004) established
the recommendation
and statistical
problem
of proba
learning through the application
bilistic
adoption
and Yakan,
empirically
The model
yielded
Newton
uct diffusion
This model
the extension
as well as the comparison of results using individual- versus aggregate-level data.
products.
data. Data
of the current proportion of the population who have adopted. Specifically, let F(t) be the cumulative proportion of adopters in the population. The diffusion equation, in its simplest form, models F(t) as a function of p, the intrinsic adoption rate, and q, a measure of social contagion. When q > p, this equation describes an S-shaped curve, where adoption is slow at first, takes off exponentially and tails off at the end.
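As a hedged illustration of the diffusion equation described above, the sketch below assumes the standard Bass form dF/dt = (p + q F(t)) (1 - F(t)) with F(0) = 0. The text does not spell out the equation, so this parameterization, the function names and the values of p and q are assumptions chosen only for illustration. The closed-form solution shows the S-shape when q > p, with the adoption rate peaking at t* = ln(q/p) / (p + q).

```python
import math


def bass_cumulative(t, p, q):
    """Closed-form F(t) solving dF/dt = (p + q*F) * (1 - F) with F(0) = 0."""
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_inflection_time(p, q):
    """Time at which the adoption rate peaks; positive only when q > p."""
    return math.log(q / p) / (p + q)


if __name__ == "__main__":
    p, q = 0.03, 0.38  # illustrative values only, not estimates from any data set
    for t in range(0, 21, 4):
        print(t, round(bass_cumulative(t, p, q), 3))
    print("peak adoption rate at t =", round(bass_inflection_time(p, q), 2))
```

Under this form, the cumulative proportion at the inflection point is 1/2 - p/(2q), so a larger social-contagion parameter q relative to p pushes the take-off closer to the halfway mark.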
Since
1990;
do not
relational models (PRM's) (Getoor, Friedman, Koller and Pfeffer, 2001). However, neither group used explicit links between customers for learning. Recommendation systems may well benefit from information about explicit consumer interaction as an additional, perhaps quite important, aspect of similarity.
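The idea of associating consumers through shared purchases, rather than through direct links, can be sketched as follows. This is only an illustration: the Jaccard index is one common similarity measure and is an assumption here, not a method attributed to the cited work, and the names and baskets are hypothetical.

```python
def jaccard_similarity(purchases_a, purchases_b):
    """Share of items two consumers have in common, out of all items either bought."""
    a, b = set(purchases_a), set(purchases_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(consumer, baskets):
    """Rank the other consumers by shared-purchase similarity (ties broken by name)."""
    scores = [
        (jaccard_similarity(baskets[consumer], items), other)
        for other, items in baskets.items()
        if other != consumer
    ]
    return [other for _, other in sorted(scores, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    # Hypothetical purchase histories.
    baskets = {
        "ann": {"book", "phone", "camera"},
        "bob": {"phone", "camera"},
        "cy": {"bicycle"},
    }
    print(most_similar("ann", baskets))
```

Note that the links produced this way are implicit: two consumers can score as highly similar without ever having interacted, which is exactly the distinction drawn in the text.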
3.7
Research
and Statistical
Opportunities
see
We
is a burgeoning
interactions
that there
consumers'
body of work
and their effects
on purchasing.
the foregoing
To our knowledge
types
taken in re
statistical
represent the main
approaches
In each approach,
search on network-based
marketing.
or in
in the data collection
there are assumptions made
the analysis
that restrict them from providing
strong
that network
and direct support for the hypothesis
based
indeed
marketing
Surveys
techniques.
fer from small and possibly
biased
samples. Collaborative filtering models have large samples, but do not measure direct links between individuals. Models
in network
have
and econometrics
classification
instead
like geography
proxies
and almost
communications,
used
direct
accurate,
specific
data on which
historically
of data on
(and what)
customers
purchase.
To paint a complete
a particular
product,
for
the
for statistical
of how
research
to analyze
into network
Challenges
that addresses
influence.
question
issues:
many statistical
data
Data-set
size. Network-based
marketing
or
often arise from Internet
telecommunications
up
sets
ap
can
When
observations
be
and
quite large.
plications
the
number in the millions
(or hundreds of millions),
for
data
and
the
become
data
typical
analyst
unwieldy
cannot be handled
in memory
by standard statis
software. Even if the data can be loaded,
tical analysis
their size renders the interactive style of analysis com
mon with tools like R or Splus painfully
slow. In Inter
net or telecommunications
studies, there often are two
often
the matrix
work
attributes
such
can be
by some sophisticated
methodology,
into the analysis by creating new covariates.
It remains an open question whether
clever data en
handled
folded
still may be quite large. While much data mining research is focused on scaling up the statistical toolbox to today's massive data sets, random sampling remains an effective way to reduce data to a manageable size while maintaining the relationships we are trying to discover, if we assume the network information is fully encoded in the derived variables. The amount of sampling necessary will depend on the computing environment and the complexity of the model, but most modern systems can handle data sets of tens or hundreds of thousands of observations. When sampling, care must be taken to stratify by any attributes that are of particular interest or to oversample those attributes that have extremely skewed distributions.
Low incidence of response. In applications where the response is a consumer's purchase or reaction
to a mar
to compress
information
into attributes
the transaction
to be included in the actors' attribute set. It has been
of any continuous
requires discretization
independent
not
if there
which
be desirable. Also,
attributes,
may
are even a moderate
number of independent
attributes,
massive
sources
et al.,
that file squashing
(DuMouchel,
Volinsky
the best features of
1999), which attempts to combine
with
random
data
sampling, can be use
preprocessed
shown
attrition
prediction.
DuMouchel
et al.
the buckets
will
be
eling. Other
solutions
oversampling
positive
and/or undersampling
of
(2004) gave an overview
negative responses. Weiss
show
the literature on these and related techniques,
extended network
Incorporating
structure lend themselves
network
as to their effective
evidence
ing that there is mixed
ness. Other studies of note include the following. Weiss
network-centric
termined); generally
estimates or rankings,
fault. However, Weiss
to produce probability
speaking,
a 50:50 distribution
is a good de
and Provost's
tree induction,
mented
and Stephen
(2002) experi
Japkowicz
ma
and support-vector
neural networks
in addition to tree induction, showing
(among
with
chines,
machines
other things) that support-vector
sitive to class
imbalance. However,
they
are insen
considered
fluenced
useful
in need
response
of more
empirical
systematic
and theoretical
word-of-mouth
Separating
about
there is information
cations, one cannot
mouth
transmission
Social
theory
cate with
each
conclude
of information
tells
us
other
similar to each other, a concept called homophily (Blau, 1977; McPherson, Smith-Lovin and Cook, 2001). Homophily is exhibited for a wide variety of relationships and dimensions of similarity. Therefore, linked consumers probably are like-minded, and like-minded consumers tend to buy the same products. One way to address this issue in the analysis is to account for consumer similarity using propensity scores (Rosenbaum and Rubin, 1984). Propensity scores were developed in the context of nonrandomized clinical trials and attempt to adjust for the fact that the statistical profile of patients who received treatment may be different than the profile of those who did not, and that these differences
or enhance
indicators of homophily like demographic data, we can account (partially) for the possible confoundedness of other independent attributes.
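The adjustment idea can be illustrated with a deliberately simplified stand-in for propensity-score methods: exact subclassification on a single discrete covariate. This is an assumption made for illustration, not the authors' procedure; a real analysis would estimate a propensity score from many attributes and stratify on that score. Within each stratum, linked and unlinked consumers are compared, so that differences in the stratum mix do not masquerade as a network effect. All names and data are hypothetical.

```python
from collections import defaultdict


def stratified_link_effect(records):
    """Size-weighted average, over strata, of (linked - unlinked) adoption rates.

    records: iterable of (stratum, linked, adopted) with linked and adopted in {0, 1}.
    Strata that lack either a linked or an unlinked consumer are skipped.
    Returns None when no stratum contains both groups.
    """
    # stratum -> [[n_unlinked, adopters_unlinked], [n_linked, adopters_linked]]
    cells = defaultdict(lambda: [[0, 0], [0, 0]])
    for stratum, linked, adopted in records:
        cell = cells[stratum][linked]
        cell[0] += 1
        cell[1] += adopted
    total, weighted = 0, 0.0
    for (n0, a0), (n1, a1) in cells.values():
        if n0 == 0 or n1 == 0:
            continue
        size = n0 + n1
        weighted += size * (a1 / n1 - a0 / n0)
        total += size
    return weighted / total if total else None


if __name__ == "__main__":
    # Hypothetical records: (demographic cell, linked to a customer?, adopted?)
    data = [
        ("young", 1, 1), ("young", 1, 0), ("young", 0, 1), ("young", 0, 0),
        ("old", 1, 1), ("old", 1, 0), ("old", 0, 0), ("old", 0, 0),
    ]
    print(stratified_link_effect(data))
```

As the text notes, such an adjustment can only account partially for homophily, because it corrects for observed attributes only.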
social-network
in an unobserved
embeds
the actors
dicting
"social
(Getoor,
methods
allow
study.
as a Markov
(2001) used
to assign every node a "network value."
this technique
A node with high network value (1) has a high prob
(2) is likely to give the product
ability of purchase,
a high rating, (3) is influential on its neighbors'
rat
attributes
set of
method
One
modeled
by her neighborhood
field. Domingos
and Richardson
random
small
primarily
with unbalanced
Data with
to a robust
(em
analyses.
simple
in our analysis)
from
is to create attributes
ployed
the network
them into a traditional
data and plug
to let each actor be in
Another
is
analysis.
approach
to deal
techniques
include ensemble
data. Other
noise-free
structure.
data. Missing
transactions
data in network
Missing
are common?often
is observ
only part of a network
able. For instance, firms typically have transactional
data on their customers
only or may have one class
of communication
edge
an edge
everywhere
creates
this probability
Thresholding
can be added
which
lesser weight
related
closely
to the network,
(Agarwal and Pregibon,
framework
pseudo-edges,
perhaps with
This
is
2004).
to the link prediction
problem,
where
the next links will be
tries to predict
Nowell
and Kleinberg,
PRM
is not present.
models
which
(Liben
of the
the use
of reference uncertainty
and existence uncertainty. The
a
extension
includes
unified generative model
for both
content and relational structure, where interactions be
tween the attributes
and link structure are modeled
(Getoor,
Friedman,
Koller
and Taskar,
2003).
of
consumer-specific
ing team identified
attributes).
and marketed
The
firm's
primar
to po
service
large set
market
to a list of prospects
its standard methods.
We
using
network-related
effects or evidence
investigate
of "viral"
whether
stead
informa
fined 21 marketing
(Table 1) that were used
segments
for campaign management
and post hoc analyses. The
of consumers.
The team be
sample included millions
lieved
potheses
are not permitted to disclose certain details, including specifics about the service being offered and the exact size of the data set.
level
Med
Hi
3 2Y
4 2Y
Med
Hi
5 1Y
Med
6 1Y
73N
Hi
Hi
Med
102N
2
9N
N
N
2
20
21
2
on
demo
1
segments
(see Section
1 3 YHi
1-7
Med-Hi
Med-Hi
1-4
1-4
1-4
1-4
1-7
Hi
Hi0.10
PI
1.7
Hi
Hi0.25
PI
0.1
Med-Hi
8 3 NMed
1-7
Med-Hi
details)
%
Offer
Early Adopt
1-7
1-4
1-4
11
1NHi
1-4
1-4
1-7
1-4
4.1 for
PI
1.60.63
PI
1-4
1-4
1.7
PI
0.1
10.9
0.50P2
P2
13.1
Med-Hi
1-7
of %NN
list
2.41.26 PI
Hi
Y
19 1,2,3
based
ordered
IN?
16 ?
Hi
17N
3
1-7
Hi
181,2
N
1-4
Hi, Med
Hi, Med
were
or
Hi
P2 17.5
0.04
Hi 0.07
P2
11.0
Hi
P2 5.3
0.14
Hi
P27.7
0.25
Med-Hi
2.00.63 P2
Hi0.15
P2
2.0
15 1?Y??
P3
2.0
1.01
?
P2 1.6
0.46
Med-Hi
P2+
2.00.70
Hi P2+ 2.0
0.15
Med
12
N1
Hi
133N
N
Hi 14 1,2
customers
3 comprises
and/or those who have
services; Techl
any international
and
Tech2
low)
(1-10, where
l=high
and other
tech) are scores derived from demographics
Tech2
2Y
3
to campaigns.
attributes
important
previously
(hi, med
the marketing
Techl
based
Table
Intl
score
response
Other
to identify prospective
who would
targets?consumers
a targeted mailing.
receive
The data the marketing
us with did not contain
team provided
the underly
attributes
but in
ing customer
(e.g., demographics),
Segment
Loyalty
would
have varying
to
separate the seg
important
to learn the most from the campaign.
segments
of the service.
for
that de
of this, it was
and, because
technology
to
would be most
successful
that marketing
statistics
attributes
subscribed
new
Descriptive
for derived
firm, including
vices. Roughly,
level
loyalty
with moderate-to-long
tenure
In late 2004, a telecommunications firm undertook a large direct-mail marketing campaign to potential customers of a new communications service. This service
believed
values
response
ments
in this way
An important derived
involved
included
further. We
4.1
P3 1.80.67
LI
6.0Hi
0.05
Hi
L2
6.0
0.05
0.83
0.08
0.22
for our
attributes
that estimate
relaxed
customer
to use
the marketing
a proprietary
the customer
a new
to use
ous behavior.
also
We
show
received
different
on previ
that
indicating
based
product,
the Offer,
different marketing
mes
segments
that were
indicate different
postcards
sages: P1-P3
a "+"
and
different
L2
indicate
LI
and
letters,
sent,
the mailing.
that a "call blast" accompanied
indicates
Primary
The
by a "?" in Table
as indicated
models,
Hypothesis
research
goal we
and Network
here
consider
1.
Neighbors
is whether
re
con
between
of independence
laxing the assumption
the estimation
sumers can improve demonstrably
of
our
is
that
first
likelihood.
Thus,
hypothesis
response
someone who has direct communication
with a current
subscriber
is more
It should
be noted
munications
Data on communications events include anonymous identifiers for the transactors, a time stamp and the transaction duration. For the purposes of this research, all data are rendered anonymous so that individual identities are protected.
In pursuit of our hypothesis, we constructed an attribute called network neighbor (or NN), a flag that indicates whether the targeted consumer had communicated with a current subscriber. In Table 1, the percentage of network neighbors (% NN) is broken down by segment.
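The construction of the network-neighbor flag can be sketched as follows. This is a minimal illustration under stated assumptions: communication events are reduced to pairs of anonymous identifiers (the time stamp and duration mentioned above are ignored), direction is disregarded, and all identifiers and function names are hypothetical rather than taken from the firm's data.

```python
def network_neighbor_flags(targets, subscribers, comm_events):
    """Return {target_id: bool}, True if the target communicated with a subscriber.

    comm_events is an iterable of (party_a, party_b) identifier pairs.
    The direction of the communication is ignored.
    """
    subscribers = set(subscribers)
    contacted = set()
    for a, b in comm_events:
        if a in subscribers:
            contacted.add(b)
        if b in subscribers:
            contacted.add(a)
    return {t: t in contacted for t in targets}


if __name__ == "__main__":
    # Hypothetical anonymized identifiers.
    targets = ["t1", "t2", "t3"]
    subscribers = {"s1"}
    events = [("t1", "s1"), ("t2", "t3")]
    print(network_neighbor_flags(targets, subscribers, events))
```

A single pass over the event list suffices, which matters when the number of communication records is in the millions.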
In addition, the marketing team invited us to create our own segment, which they also would target. Our "segment 22" consisted of network neighbors that were not already on the current list of targets. To make sure
our list contained
calculated
scores
based
used
with
merit
for
viable
prospects,
the derived
technology
on our
the consumers
team
the marketing
and early
list. They
adopter
filtered
inclusion
on
the initial
was
clusion
list to Tech2
team allowed
had very
those network
For
it into segment 22 if
the market
However,
customers
who
they
a
of
purchase.
probabilities
who did not score high
neighbors
small
to warrant
in segment 22, we still
inclusion
enough
tracked their purchase records to see if any of them subscribed to the service in the absence of the marketing campaign; see below. Overall, the profile of the candidates in our segment 22 was considered to be subpar in terms of demographics, affinity and technological capability. Notably, for our final conclusions, these targets are potential customers the firm would have otherwise ignored. The size of segment 22 was about 1.2% of the marketing list.
To summarize, the above process divides the prospect universe along two dimensions: (1) targets (those consumers identified by the marketing models as being worthy of solicitation) and (2) network neighbors (those who had direct communication with a subscriber).
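The two-way division described above can be tabulated with a short sketch such as the one below. The record layout (three booleans per prospect), the function name and the toy data are assumptions for illustration only; no figures from the study are reproduced.

```python
from collections import Counter


def cross_classify(prospects):
    """Tabulate prospects by (is_target, is_network_neighbor).

    prospects: iterable of (is_target, is_network_neighbor, adopted) booleans.
    Returns {(is_target, is_nn): (count, adoption_rate)} for the cells that occur.
    """
    counts, adopters = Counter(), Counter()
    for is_target, is_nn, adopted in prospects:
        key = (bool(is_target), bool(is_nn))
        counts[key] += 1
        adopters[key] += int(bool(adopted))
    return {key: (n, adopters[key] / n) for key, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical prospects: (target?, network neighbor?, adopted?)
    sample = [
        (True, True, True),
        (True, True, False),
        (True, False, False),
        (False, True, True),
    ]
    for cell, (n, rate) in sorted(cross_classify(sample).items()):
        print(cell, n, rate)
```

Dividing each cell count by the count of a reference cell gives relative sizes of the four groups without disclosing absolute numbers.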
Table
2 shows
the relative
combination
or do not subscribe
the
firm.
4.3 Modeling with Consumer-Specific Data
To determine whether relaxing the independence assumption (using the network data) improves modeling, we fit models using a wide range of demographic and consumer-specific independent attributes (many of which are known or believed to affect the estimated likelihood of purchase). Overall, we collected the values for over 150 attributes to assess their effect on sales likelihood with the network-neighbor variable. These values included the following:
Loyalty data: We obtained finer-grained loyalty information than the simple categorization described above, including past spending, types of service, how often the customer responded to prior mailings, a loyalty score generated by a proprietary model and information about length of tenure.
Table
Data
= Y
Target
NN = Y
NN
= N
non target s
NN
1-22
size =
= N
Target
targets
Segments
Relative
NN
categories
Relative
0.015
Non-NN
Non-NN
nontargets
Relative
size >
Consumers
size =
1
models
but who
are
Consumers
were
data
The
for our
in each
up
education
household,
of members
number
ownership.
the census
Some
of this information
was
inferred at
the geographic
data.
Network attributes: As mentioned earlier, we observed communications of current subscribers with other consumers. In addition to the network-neighbor flag described above, we derived more sophisticated attributes from prospects' communication patterns. We will return to these in Section 5.6.
4.4
Data
Limitations
the amount
available
for each
not
on marketing
The
neighbors.
were
not
network
to be good
and
neighbors
prospects
also
the mar
by
model.
"relative
The
size"
value
shows
the number
of prospects
overall
response
Loyalty
of information
to relatively
to
helps
logistical
analyses.
straight
Distribution
loyalty groups
heavily.
The network-neighbor target group appears to skew toward the less loyal prospects; this is due to the fact that segment 22, which makes up a large part of the network-neighbor population, comprises predominantly low-loyalty consumers.
5. ANALYSIS
Next we will show direct, statistical evidence that consumers who have communicated with prior customers are more likely to become customers. We show this in several ways, including using our own best
(e.g., transactions)
in in
the difference
target. Given
as loyalty varies, we grouped customers by
formation
in our
loyalty level and treated the levels separately
ducting
out-of-sample
cated network
values.
they
considered
relatively
ing
neighbors,
scored poorly
group,
but were
network
who
not
keting
show
who
to because
models.
targets
1-21
identified
by marketing
Prospects
not network
neighbors.
who
were
marketed
Segments
Relative
Notes.
size =0.10
identified
models
and who also
by marketing
Prospects
are network
in
22 have re
Those
segment
neighbors.
on the marketing
scores.
model
thresholds
duced
efforts
to build
improved
competing
targeting models
assessments
of predictive
thorough
data. Then
attributes
we
consider
and con
ability
more
further.
on
sophisti
can be
FlG.
distribution
category.
by customer
Loyalty
The network
(NN) show a much
neighbors
1.
categories.
5.1 Network-Based Marketing Improves Response
The
three
bars
larger proportion
who
the service
adopted
the offer. For each
within
sizes of
consumers
Figure
the
three
than
loyalty
the non-NN
groups
for
our four
data
group.
network-neighbor effect is positive (the parameter estimate is greater than zero), demonstrating an increased take rate for the network-neighbor group within each segment. For
is significantly
of 0 (p < 0.05),
bor significantly
While
segment,
specified period following
we performed
a simple logistic regression for the inde
attribute versus the depen
pendent network-neighbor
In Figure 2, we graphically
dent sales response.
present
odds
17 of these
different
ratio
value
an independent
variable,
as
pretable
comparisons
neighbor
the log-odds
segments,
from the null hypothesis
inter
they are not as directly
of take rates of the network
and non-network-neighbor
groups
in a given segment. The take rates for the network neighbors are plotted versus the non-network neighbors in Figure 3, where the size of the point is proportional to the log size of the segment. All segments have higher take rates in the network-neighbor subgroup, except for the one segment that had no network-neighbor sales (the smallest sample size). Over the entire data set, the network neighbors' take rates were greater by a factor of 3.4. This value is plotted in Figure 3 as a dotted line with slope = 3.4. The right-hand plot of Figure 3 shows the relationship between each segment's take rate and
of low-loyalty
log odds).
the relative
network-neighbor
Segmentation provides an ideal setting to test the significance and magnitude of any improvement in modeling by including network-neighbor information, while
targeted
show
divided
Segments
FIG. 2. Results of logistic regression. Parameter estimates plotted as log-odds ratios with 95% confidence intervals. The number plotted at the value of the parameter estimate refers back to segment numbers from Table 1 (ordered by log odds).
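The per-segment test described here can be sketched in a few lines. For a logistic regression with a single binary predictor, the fitted coefficient equals the log-odds ratio of the 2x2 table of network-neighbor status against sale, so the estimate and a Wald 95% interval can be computed directly from counts. The counts below are hypothetical and the function name is an assumption made for illustration; this is not the authors' code.

```python
import math

def log_odds_ratio_ci(nn_sales, nn_total, non_sales, non_total, z=1.96):
    """Log-odds ratio of a sale for NN versus non-NN prospects, with a Wald interval.

    Equivalent to the slope of a logistic regression of sale on a binary NN flag.
    """
    a = nn_sales                 # NN prospects who bought
    b = nn_total - nn_sales      # NN prospects who did not buy
    c = non_sales                # non-NN prospects who bought
    d = non_total - non_sales    # non-NN prospects who did not buy
    if min(a, b, c, d) == 0:
        # e.g. a segment with no network-neighbor sales has no finite estimate
        raise ValueError("a zero cell makes the log-odds ratio undefined")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, (log_or - z * se, log_or + z * se)

# Hypothetical counts for one marketing segment (not taken from the paper).
estimate, (lower, upper) = log_odds_ratio_ci(
    nn_sales=30, nn_total=2000, non_sales=40, non_total=10000
)
# The effect is significant at the 5% level when the interval excludes 0.
significant = lower > 0 or upper < 0
```

A segment with a zero cell raises an error here, which mirrors why a segment without network-neighbor sales cannot be given a finite log-odds estimate.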
of the network-neighbor
effect after account
ing for this segment effect, we ran a logistic regression
across all segments,
the main effects for the
including
nificance
FIG. 3. Take rates for marketing segments. Left: For each segment, comparison of the take rate of the non-network neighbors with that of the network neighbors. The size of the glyph is proportional to the log size of the segment. There is one outlier not plotted, with a take rate of 11% for the network neighbors and 0.3% for the non-network neighbors. Reference lines are plotted at x = y and at the overall take-rate ratio of 3.4. Right: Plot of the take rate for the non-network group versus lift ratio for the network neighbors.
terms had
the interaction
of
from
segment
cases, and one
The
to be deleted:
one
22, which
from
the network
and used
for each
the two.
neighbors.
stepwise variable
regression
Coeff (ci.)
Segment
Segment
to get an
interval of
Significance2
to the coefficients
themselves.
There
are significant,
the segments
themselves
fore, although
in the presence
of the network attribute the segments'
effect ismostly
negated by the interaction effect. Since
1.7(0.9,2.5)
1.8(1.2,2.4)
2.1(1.3,3.0)
1.9(0.4,
3.3)
1.9(1.2,
2.5)
1.4(1.0,
1.9)
1.3(0.9,
1.7)
Segment = 8
Segment = 17
Segment = 19
NN x Segment = 1
NN x Segment = 2
NN x Segment = 4
NN x Segment = 6
NN x Segment = 7
NN x Segment = 8
NN x Segment = 17
NN x Segment = 19
is an esti
of these interactions
is important. Note
interpretation
that the magnitudes
of the interaction coefficients
are
2.0(1.7,2.3)
Segment = 5
Table 3
Coefficients and confidence intervals for the final segment model
Attribute
nificance
a network
neighbor
is at least as important
like
that
in this
context.
1.5(0.7,2.2)
In Table
2.2(1.6,2.9)
-1.1
(-2.1,
0.0)
an analog
-0.9
(-1.7,
-0.2)
-1.8
(-4.0,
0.4)
-1.5
(-2.6,
-0.6)
-1.2
(-1.7,
-0.6)
-0.8
(-1.3,
-0.4)
-1.6
(-2.8,
-0.5)
-1.1
(-1.9,
-0.3)
table confirms
the significance
of the main effects and
Each level of the nested model
is
of the interactions.
is
interactions
network-neighbor
of the prospect population.
is used
that so
that the
segments
Analysis
NN
interactions
10687
of attributes
at each
is shown
at the 0.05
not identified
us to compare
take rates
targets for the segments
types of targets. However, many of
targets fall into the network-only
the network-neighbor
segment 22. Segment
Significance^1
9 63
370
8 41
10733
22
The
that contained
Change in deviance
10869
of the group
Significance
5.2 Segment
study
11200
Intercept
Segment
Segment + NN
Segment
the network-neighbor
DF
Deviance
Variable
E4
table fo.
Segment 22 comprises prospects that the original marketing models deemed not to be good candidates for targeting. As we can see from the distribution in Figure 1, this segment for the most part contains consumers who had no prior relationship with the firm. We compare the take rates for segment 22 with the take rates for the combined group, including all of segments 1-21, in the leftmost three bars of Figure 4. The network-neighbor segment 22 is (not surprisingly) not as successful as the NN groups in segments 1-21, since the targets in segments 1-21 were selected based on characteristics that made them favorable for marketing. Interestingly, we see that the segment 22 network neighbors outperform the non-NN targets from segments 1-21. These segment 22 network neighbors, identified primarily on the basis of their network activity, were more likely by almost 3 to 1 to purchase than the more "favorable" prospects who were not network neighbors. Since those in segment 22 either were not identified by marketing analysts or were deemed to be unworthy prospects, they represent customers who would have "fallen through the cracks" in the traditional marketing process.

5.3 Improving a Multivariate Targeting Model

Now we will assess whether the NN attribute can improve a multivariate targeting model by incorporating all that we know or can find out (over 150 different attributes) about the targets, including geography, demographics and other company-specific attributes, from internal and external sources (see Section 3.2).

As discussed in Section 3.7, we tried to address (as well as possible) an important causal question that arises: Is this network-neighbor effect due to word of mouth or simply due to homophily? The observed effect may not be indicating viral propagation, but instead may simply demonstrate a very effective way to find like-minded people. This theoretical distinction may not matter much to the firm for this particular type of marketing process, but is important to make, for example, before designing future campaigns that try to take advantage of word-of-mouth behavior.

FIG. 4. Take rates for network neighbors in segments 1-21 compared with the all-network-neighbor segment 22 and with the nontarget network neighbors. All take rates are relative to the non-network-neighbor group (segments 1-21).
Bars: Network Neighbors, Segs 1-21: 1.35%; Non-Network Neighbors, Segs 1-21: 0.28%; Network Neighbors, Seg 22: 0.83%; Network Neighbors, Non-Targets: 0.11%.

Although we cannot control for unobserved similarities, we can be as careful as possible in our analysis to ensure that the statistical profile of the NN prospects is the same as the profile for the non-NN cases. Since our data set contains many more non-NN cases than NN cases, we match each NN case with a single non-NN case that is as close as possible to it by calculating propensity scores using all of the explanatory attributes considered (as described in Section 3.7). At the end of this matching process, the NN group is as close as is reasonably possible in statistical properties to the non-NN group.
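The matching step can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (for example, by a logistic regression of NN status on the explanatory attributes) and pairs each NN case with the closest unused non-NN case. The greedy, without-replacement strategy and all identifiers are assumptions for illustration; the text does not specify which matching algorithm was used.

```python
def match_on_propensity(nn_scores, non_nn_scores):
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement.

    nn_scores, non_nn_scores: dicts mapping a case id to its propensity score.
    Returns a dict mapping each matched NN id to one non-NN id.
    """
    available = dict(non_nn_scores)  # copy, so the caller's dict is not modified
    matches = {}
    # Process NN cases in score order so the result is deterministic.
    for nn_id, score in sorted(nn_scores.items(), key=lambda item: item[1]):
        if not available:
            break  # no non-NN cases left to match
        closest = min(available, key=lambda cid: abs(available[cid] - score))
        matches[nn_id] = closest
        del available[closest]  # each non-NN case is used at most once
    return matches

# Hypothetical propensity scores (not taken from the paper).
nn = {"a": 0.30, "b": 0.62}
non_nn = {"p": 0.10, "q": 0.33, "r": 0.60, "s": 0.90}
pairs = match_on_propensity(nn, non_nn)
```

Because non-NN cases greatly outnumber NN cases in the data described above, matching without replacement is unlikely to exhaust the pool of candidates.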
Due to heterogeneity of data sources across the three loyalty groups, we used the propensity scores to create a matched data set for each group. For each (individually), we fitted a full logistic regression including interactions and selected a final model using stepwise
Table
Results
of multivariate
model
Loyalty
1
NN
Significant
NN
Discount
attributes
Level
of Int'l
Referral
(-)
band
Tenure
firm
with
(-)
to loyalty
Belonged
Referral
program
of adults
Number
service
plan
Letter
of country
indicator
to
Belonged
loyalty program
Chumer
(-)
Recent
grad
at residence
children
Any
to
responder
mailing
College
Tenure
plan
Type of previous
score
Credit
Previous
(-)
Region
communicator
International
calling
plan
firm
with
Tenure
Comm.(I)
in house
# of devices
Revenue
NN
Discount
(vs. postcard)
responder
incentive
User
of
Any
children
to mailing
credit card
in house
(-)
(-)
in house
(-)
in house
(95% CI)
Take
rate
Notes.
the effect
variable
liers,
0.99
attributes
Significant
of the variable was
selection.
All
from
0.4%
across
levels
regressions
loyalty
a significant
interaction
(1) indicates
logistic
negative;
attributes were
checked
for out
with other at
and collinearity
the attributes
removed or combined
transformations
tributes, and we
for any significant correlations.
that accounted
0.84 (0.49, 1.49)
0.9%
Table 5 shows the results of the logistic regressions, which show the attributes that were found to be significant, those that were negatively correlated with take rate, and those that had interactions with the NN attribute. Each of the three models found the network-neighbor attribute to be significant along with several others. The significant attributes tended to be attributes regarding the prospects' previous relationships with the firm, such as previous international services, tenure with firm, churn identifiers and revenue spent with the firm. These
most
0.3%
indicates
(p < 0.05). Bold
with
the NN variable.
those
(0.52,1.16)
significance
customers,
knowing
to
any
responded
previous
significant effect.
at 0.01
whether
marketing
level;
(-) indicates
the customer
campaigns
has
has a
weakest.
5.4 Consumers Not Targeted
above,
only a select subset of our
was
list
based
network-neighbor
subject to marketing,
on relaxed
on eligibility
thresholds
criteria. The re
As
discussed
mainder
The
relative
chosen
any known favorable characteristics that would have put them on the list of prospects. The fact that they are network neighbors alone supports a relatively high take rate, even in the absence of direct marketing. This lends some support to an explanation of word-of-mouth propagation rather than homophily.
Finally, we will briefly discuss the remainder of the consumer space, the non-NN
tunately,
it is very difficult
which could be considered
a baseline
category,
all of the other
estimate
cludes
Unfor
group.
nontarget
to estimate
rate for
as customers
of the firm's competitors
and consumers
this product that do not have cur
who might purchase
rent telecommunications
service with any provider.
It
has been established
that the size of the communica
tions market
best
is difficult
to estimate
of this baseline
estimates
below
0.01%,
the nontarget network neighbors.
even
On the other hand, a by-product of our study is that we can upper-bound the effect of the mass marketing campaigns in general by comparing the target-NN group and the nontarget-NN group. The difference in take rates between the targeted network neighbors and the nontargeted network neighbors is about 10 to 1. This difference cannot all be attributed to the marketing effect, since the targeted group was specifically chosen to be better prospects and it is likely that more of them would have signed up for the service even in the total absence of marketing. However, it does seem reasonable to call this factor of 10 an upper bound on the effect of the marketing.
5.5 Out-of-Sample Ranking Performance
These results suggest that we
estimations
suggest
as to which
as well as network-neighbor status. Note that in different business scenarios, different types and amounts of data are available. For example, for low-loyalty customers, very few descriptive attributes are known. We report results here using all attributes; the findings are qualitatively similar for every different subset of attributes we have tried (namely, segment, loyalty, geography, demographic). The response variable is the same as
and we
above
els. We measure
nary response
can be
to respond to an offer. Such estimations
the consumer
pool is immense and a
quite valuable:
a
limited
will
have
be
budget. Therefore,
campaign
ing able to pick a better list of "top-/:" prospects will
to increased profit (assuming
lead directly
targeting
prises
the ability
to rank customers
consumer,
we
ing loyalty,
demographic
statistic,
Table
alty groups,
the expected
regression models.
quantifying
benefit
from
loy
the
improved
logistic
There is an increase in AUC for each group, with the largest increase belonging to loyalty level 1, for which the least information is available; note that here the ranking ability without the network information is not much better than random. To visualize this improvement, Figure 5(a) shows cumulative response ("lift") curves when using the model on loyalty group 3. The lower curve depicts the performance of the model using all traditional attributes,
analysis:
create
attributes
accurately.
a
record
that
(trad atts),
and geographic
as
the application of
models
Loyalty   trad atts   trad atts + NN
1         0.54        0.60
2         0.64        0.67
3         0.60        0.64
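Since the AUC values above are tied to the Mann-Whitney statistic, a small sketch may help show what the number measures: the probability that a randomly chosen responder is scored above a randomly chosen non-responder, with ties counted as one half. The scores and labels below are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc_mann_whitney(scores, labels):
    """Area under the ROC curve, computed as the normalized Mann-Whitney U statistic."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one responder and one non-responder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # responder ranked above non-responder
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical model scores and sales outcomes (1 = bought, 0 = did not buy).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0, 1]
auc = auc_mann_whitney(scores, labels)
```

An AUC of 0.5 corresponds to random ranking, which is why a value such as 0.54 indicates ranking ability only slightly better than chance.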
attributes
includ
attributes,
values
trad
Loyalty
Note.
com
AUC
logistic
Mann-Whitney
the ROC curve
For
the same
customers
likely
proves
used
[Figure 5, panels (a) and (b). Legend: trad atts; trad atts + NN. Horizontal axis: Cumulative % of Consumers Targeted (Ranked by Predicted Sales).]
curves
5.
curves.
(a) Lift
NN
Power
of the segmentation
with the NN
The model
attribute.
network-neighbor
for models
attribute
attribute.
be obtained
would
the NN
attribute,
improve the ranking
measures
of social
phisticated
network of existing customers.
a set
7 summarizes
Table
network
from
5.6 Improving Performance By Adding More Sophisticated Network Attributes
interaction
from
the
network
we
focus
on the social
the current
"the network"),
along with
of prospects who have communicated
will
call
the network
whether
we
(the network
can improve
targeting
the periphery
with those on
We
investigate
so
by using more
neighbors).
Table
Network
them represent
between
the nodes. The
relationships
intuitive social notions,
SNA measures
help quantify
such as connectedness,
social im
influence, centrality,
on.
so
to
and
understand
portance
Graph theory helps
attribute
ating mathematically.
Three of the attributes
and methods
that we
introduce
Table 7
Network attribute descriptions
Attribute                  Description
Degree                     Number of unique customers communicated with before the mailing
Transactions               Number of transactions to/from customers before the mailing
Seconds of communication   Number of seconds communicated with customers before mailing
Connected to influencer    Is an influencer in prospect's local neighborhood?
Connected component size   Size of the connected component prospect belongs to
Max similarity             Max overlap in local neighborhood with any existing neighboring customer
them as interconnected
for oper
can be de
Seconds
of social-network
that comprises
(only)
of this service (which here we
network
customers
social
Social-network
a consumer
is a network neigh
whether
Knowing
indicators of consumer-to
bor is one of the simplest
consumer
the fields
additional
the
and
analysis
involves
(SNA)
graph theory.
analysis
trans
information
measuring
(including
relationships
on
a
in
network.
The
nodes
between
mission)
people
the network
and
the
links
between
represent people
degree
5.6
of
with
attributes
sion. The
that we
we
terminology
relationship
in prospect's
local neighborhood?
to
of the connected
component
prospect
belongs
in local neighborhood
with any existing
overlap
neighboring
customer
sures
the number
of direct
connections
a node
Table
has.
ROC
Within
The network
Given
tices
the connected
of G
components
tices such that all vertices in each set are mutually
con
Attribute(s)
AUC
Transactions
0.68
Seconds
of communication
to influencer
Connected
component
All
network
All
traditional
nected
All
traditional
Note.
is linked
a prospect's local neighborhood.
Observing the local neighborhoods of a prospect's local neighbors, we can define a measure of social similarity. We define social similarity as the size of the overlap in the immediate network neighborhoods of two consumers. Max similarity is the maximum social
date customers signed up. Using this information, we define influencers as those subscribers who signed up for the service and, subsequently, we see one of their network neighbors sign up for the service. Connected to influencer is an indicator of whether the prospect is connected to one of these influencers. We appreciate that we do not actually know if there was true influence.
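The attribute definitions above can be made concrete with a small sketch over a toy communication log. Everything here is an assumption made for illustration: the record layout (party, party, seconds), the sign-up months, and the function name are hypothetical, and the paper's actual data pipeline is not described at this level of detail.

```python
from collections import defaultdict

def network_attributes(prospect, calls, signup_month):
    """Compute the social-network attributes for one prospect from a call log.

    calls: list of (party_a, party_b, seconds) records observed before the mailing.
    signup_month: dict mapping each existing customer to the month they signed up.
    """
    customers = set(signup_month)
    neighbors = defaultdict(set)
    for a, b, _ in calls:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Influencer: a customer with a neighboring customer who signed up later.
    influencers = {
        u for u in customers
        if any(v in customers and signup_month[v] > signup_month[u]
               for v in neighbors[u])
    }

    # Records between the prospect and existing customers.
    customer_calls = [
        (a, b, s) for a, b, s in calls
        if (a == prospect and b in customers) or (b == prospect and a in customers)
    ]
    local_customers = neighbors[prospect] & customers

    # Connected component containing the prospect (depth-first search).
    seen, stack = {prospect}, [prospect]
    while stack:
        node = stack.pop()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)

    return {
        "degree": len(local_customers),
        "transactions": len(customer_calls),
        "seconds": sum(s for _, _, s in customer_calls),
        "connected_to_influencer": bool(local_customers & influencers),
        "component_size": len(seen),
        # Largest neighborhood overlap with any neighboring customer.
        "max_similarity": max(
            (len(neighbors[prospect] & neighbors[c]) for c in local_customers),
            default=0,
        ),
    }

# Hypothetical call log: p* are prospects, c* are existing customers.
calls = [
    ("p1", "c1", 120), ("p1", "c1", 60), ("p1", "c2", 30),
    ("c1", "c2", 300), ("c2", "c3", 45), ("p2", "c3", 10), ("p3", "c4", 5),
]
signup_month = {"c1": 1, "c2": 3, "c3": 3, "c4": 2}
p1 = network_attributes("p1", calls, signup_month)
p3 = network_attributes("p3", calls, signup_month)
```

As the text cautions, the influencer flag records only the order of sign-ups among neighbors, not whether any actual influence took place.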
We
AUC
values
find
that some
and show
siderable
more
attributes
predictive power
value when combined.
of 0.68
for both
0.68
0.59
Degree
Connected
0.53
size
0.55
0.55
Similarity
nected
the prospect
is connected.
We
also move
beyond
analysis
each
AUC
of
0.71
(loyalty,
demographic,
+ all network
0.71
values
result
the constructed
in combination.
Results
from
network
geographic)
0.66
built
models
logistic
regression
as
well
attributes
individually,
are presented
for
loyalty-level
on
as
3 customers.
the network attributes, there is no additional gain in AUC, even though many of these attributes were shown to be significant in the broader analysis above. The similarities represented implicitly or explicitly in the network attributes seem to account for the information captured by traditional demographics and other marketing attributes. That traditional demographics and other marketing attributes do not add value is not only of theoretical interest, but practical as well, for example, in cases such as this where demographic data must be purchased.
Our result is further confirmed by the lift and take rate curves displayed in Figure 6(a) and (b), respectively. One can achieve substantially higher take rates using the new network attributes as compared to using the traditional attributes. For example, we find that for the top 20% of the targeted list, without the network attributes, the take rate is 2.2%; with the network attributes, it is 3.1%. Likewise, at the top 10% of the list, the take rate with the network attributes is 4.4% compared to 2.9% without them.
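The take-rate and lift figures quoted for the top 10% and top 20% of the list follow from ranking prospects by model score and measuring the response rate within the top slice. The sketch below shows that computation on hypothetical scores and outcomes; the function names are illustrative and the rounding rule for the slice size is an assumption.

```python
def take_rate_at_top(scores, responses, fraction):
    """Response rate among the top `fraction` of prospects ranked by score."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(round(fraction * len(ranked))))  # at least one prospect
    top = ranked[:k]
    return sum(response for _, response in top) / k

def lift_at_top(scores, responses, fraction):
    """Take rate in the top slice divided by the overall take rate."""
    overall = sum(responses) / len(responses)
    return take_rate_at_top(scores, responses, fraction) / overall

# Hypothetical scores and outcomes for ten prospects (1 = bought).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responses = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
top20_rate = take_rate_at_top(scores, responses, 0.20)
top20_lift = lift_at_top(scores, responses, 0.20)
```

Comparing this take rate for two competing models at the same cutoff is what the 2.2% versus 3.1% comparison above reports.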
6. LIMITATIONS
sion model
We
nication.
connected
an AUC
built with
of 0.71
the network
compared
the network
results
in
of 0.66 without
attributes?using
attributes described
keting
call that this represents
attributes
to an AUC
[Figure 6, panel (a). Legend: trad atts; trad atts + net. Horizontal axis: Cumulative % of Consumers Targeted (Ranked by Predicted Sales).]
FIG.
atts)
6.
(a) Lift
the network
attributes
attributes
rate
Power
attributes.
without
the model
without
net) outperforms
and 3.1 % with
attributes
the network
them
(trad atts).
the network
we used were
external
be at least partially
is not well known
sources.
collected
These
by purchasing
data are known to
erroneous
information
communication,
the consumers
Sales)
(b)
firm's
data from
(Ranked by Predicted
curves.
compared
(trad atts +
is 2.2%
0.8
0.6
Targeted
the
if
discussion
forums.
We expect the network-neighbor effect to manifest itself differently for different types of products. Most of the studies done to date on viral marketing have focused
example,
the new
for
of
target
ranked
by score,
the take
service
studied
here
to a roll-out
of another
Customers
who
signed
of telecommunications
pricing plans
to
and so confusing
that this
do not believe
is so extensive
as the pricing
For
plan and the new technology.
the pricing plan, we have the same knowledge
of the
network as we do for the new technology.
For those
ucts
in question discussed the product. In this regard, our data are inferior to some other domains where content is visible, such as Internet bulletin boards or product discussion forums.
For
attributes.
consumers
who
who
belong
they communicate
ordered
the percentage of these new customers who were network neighbors, that is, those who had previously communicated with a user of the product. This percentage is a measurement of the proportion of new sales being driven by network effects. By comparing this percentage across two products, we get insight into which product stimulates network effects more.
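This percentage is straightforward to compute from a purchase log, as the following sketch shows. It assumes each purchase record carries the month of purchase and a flag for whether the buyer had previously communicated with a user of the product; the record layout and the data are hypothetical.

```python
from collections import defaultdict

def network_neighborness(purchases):
    """Percentage of each month's purchasers who were network neighbors.

    purchases: iterable of (month, is_network_neighbor) pairs.
    Returns a dict mapping month to a percentage between 0 and 100.
    """
    totals = defaultdict(int)
    neighbor_counts = defaultdict(int)
    for month, is_nn in purchases:
        totals[month] += 1
        if is_nn:
            neighbor_counts[month] += 1
    return {m: 100.0 * neighbor_counts[m] / totals[m] for m in sorted(totals)}

# Hypothetical purchase log for one product over two months.
new_service = [
    (1, True), (1, False), (1, False), (1, False),
    (2, True), (2, True), (2, False), (2, False),
]
by_month = network_neighborness(new_service)
```

Running the same function on the purchase logs of two products and plotting the results by month gives the kind of comparison described in the text.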
an 8-month
We
now
ucts was
make
know
on which
results
[Figure 7. Legend: New Service; Pricing Plan. Horizontal axis: Month.]
FIG. 7. Network-neighborness plot for new service versus pricing plan.
The
we
here
study
telecommunciations
recently
for $2.6
Google's
explicit
networks
corresponds
ing discussed
earlier. Before the campaign, we can see that the network-neighbor effect was increasing, that more and more of the purchasers in a given month were network neighbors. During the mass marketing campaign, we exposed many non-network neighbors to the service and many of them ended up purchasing it, temporarily dropping the network-neighbor percentage. After the campaign, we see the network-neighbor percentage starting to increase again.
measure
This network-neighborness
should
not be
with
the success
This
other
difference
are more
for purchasing
for purchasing
fects of word
to discern without
are difficult
homophily
the content of the commu
versus
knowing
nication.
7. DISCUSSION
One
of the main
and to whom
concerns
deci
from
networks
sort of directed
take away are that the new service has a higher percent of purchasers who are network neighbors and also an increasing one (except for the dip in month 5). In
Interestingly,
indicate
direct marketing
that a firm can benefit
purchasing.
Taking
improves significantly and substantially on both the firm's own marketing "best practices" and our best efforts to collect and model with traditional data.
plan.
contrast
to base
that network-based marketing has applicability beyond traditional telecommunications companies. For example, eBay recently purchased Internet-telephony upstart Skype for $2.6 billion; they now also will have large-scale, explicit data on who talks to whom. With gmail, Google's e-mail service, Google now has access to explicit networks of consumer interrelationships and already is using gmail for marketing; directed network-based marketing might be a next step. Various systems have emerged recently that provide explicit linkages between acquaintances (e.g., MySpace, Friendster, Facebook), which could be fruitful fields for network-based marketing. As more consumers create interlinked blogs, another data source arises. More generally, these results suggest that such linkage data potentially could be a sort of data considered for acquisition by many types of firms, as purchase data now are being collected routinely by many types of retail firms through loyalty cards. Even academic departments could benefit from such data; for example, the enrollment in specialized classes could be bolstered by "marketing" to those linked to existing students. Such links exist (e.g., via e-mail). It remains to design tactics for using them that are acceptable to all.
above, itmay
of information
is in accord with
is a powerful
which
homophily,
social
than any demographic or geographic attributes. Either cause, homophily or word of mouth, is interesting both theoretically and practically.
ACKNOWLEDGMENTS
We would like to thank
of AT&T, as well as
of the University of Maryland, for useful discussions and helpful suggestions. We would also like to thank three anonymous reviewers who offered insightful comments on previous drafts.