Dipak C. Jain
J. L. Kellogg School of Management, Northwestern University, Evanston, Illinois 60208,
d-jain@kellogg.northwestern.edu
The measurement of customer lifetime value is important because it is used as a metric in evaluating decisions
in the context of customer relationship management. For a firm, it is important to form some expectations
as to the lifetime value of each customer at the time a customer starts doing business with the firm, and at each
purchase by the customer. In this paper, we use a hierarchical Bayes approach to estimate the lifetime value
of each customer at each purchase occasion by jointly modeling the purchase timing, purchase amount, and
risk of defection from the firm for each customer. The data come from a membership-based direct marketing
company where the times of each customer joining the membership and terminating it are known once these
events happen. In addition, there is an uncertain relationship between customer lifetime and purchase behavior.
Therefore, longer customer lifetime does not necessarily imply higher customer lifetime value.
We compare the performance of our model with other models on a separate validation data set. The models
compared are the extended NBD–Pareto model, the recency, frequency, and monetary value model, two models
nested in our proposed model, and a heuristic model that takes the average customer lifetime, the average
interpurchase time, and the average dollar purchase amount observed in our estimation sample and uses them
to predict the present value of future customer revenues at each purchase occasion in our hold-out sample. The
results show that our model performs better than all the other models compared both at predicting customer
lifetime value and in targeting valuable customers. The results also show that longer interpurchase times are
associated with larger purchase amounts and a greater risk of leaving the firm. Both male and female customers
seem to have similar interpurchase time intervals and risk of leaving; however, female customers spend less
compared with male customers.
Key words: customer lifetime value; customer equity; hierarchical Bayes
History: Accepted by Jagmohan S. Raju, marketing; received November 19, 2004. This paper was with the
authors 11 months for 3 revisions. Published online in Articles in Advance December 11, 2007.
It is noteworthy that research on CLV measurement has so far focused on specific contexts. This is necessary because the data available to a researcher or firm in different contexts might be different. The two types of context generally considered are noncontractual and contractual (e.g., Reinartz and Kumar 2000, 2003). A noncontractual context is one in which the firm does not observe customer defection, and the relationship between customer purchase behavior and customer lifetime is not certain (e.g., Fader et al. 2005; Schmittlein and Peterson 1994; Reinartz and Kumar 2000, 2003). A contractual context, on the other hand, is one in which customer defections are observed, and longer customer lifetime implies higher customer lifetime value (e.g., Thomas 2001, Bolton 1998, Bhattacharya 1998). The context of our study, as we describe later, has elements of both contractual and noncontractual settings, a scenario that has not been analyzed in-depth previously (Singh and Jain 2007).
Different models for measuring CLV arrive differently at estimates of the expectations of future customer purchase behavior. For example, some models consider discrete time intervals and assume that each customer spends a given amount (e.g., an average amount of spending in the data) during each interval of time. This information, along with some assumption about the customer lifetime length, is used to estimate the lifetime value of each customer by a discounted cash-flow method (Berger and Nasr 1998). In another model, Rust et al. (2004) combine the frequency of category purchases, average quantity of purchase, brand-switching patterns, and the firm’s contribution margin to estimate the lifetime value of each customer. Because customer purchase behavior might change over a customer’s lifetime with the firm, methods that incorporate past customer behavior to form an expectation of future customer behavior and, subsequently, the remaining customer lifetime value are likely to have advantages over other methods (e.g., Schmittlein and Peterson 1994).
A popular method that follows such an approach in a noncontractual context is the negative binomial distribution (NBD)–Pareto model by Schmittlein et al. (1987). In this model, past customer purchase behavior is used to predict the future probability of a customer remaining in business with the firm (the probability of each customer being alive). Along with a measure of purchase frequency and amount spent during a purchase, this probability can be used to estimate customer lifetime value (Reinartz and Kumar 2000, 2003; Schmittlein and Peterson 1994). The NBD–Pareto model is applied in instances where customer lifetimes are not known with certainty, i.e., it is not known when a customer stops doing business with a firm; the model assumes that individual customer lifetimes with the firm are exponentially distributed. As discussed by Schmittlein and Peterson (1994), in contexts (such as ours) where customer lifetimes are observed, the NBD–Pareto model has limitations and is not suitable.
Another approach that can naturally incorporate past behavioral outcomes into future expectations is a Bayesian approach (Rossi and Allenby 2003). Bayesian methods can incorporate such prior information in the structure of the model easily through the priors of the distributions of the drivers of CLV. Furthermore, this approach can be used in any context. Therefore, we use such an approach to measure customer lifetime value, leveraging the extra information available to the firm in observing customer lifetimes. A hierarchical Bayesian model is developed that jointly predicts a customer’s risk of defection and spending pattern at each purchase occasion. This information is then used to estimate the lifetime value of each customer of the firm at every purchase occasion. We compare the predictions from our model on a separate validation sample to those obtained from some extant methods of measuring CLV, namely, the extended NBD–Pareto framework,1 a heuristic method, and two models nested in our proposed model. We also compare the performance of our model in targeting customers with the performance of a recency, frequency, and monetary value (RFM) framework, in addition to the other models mentioned previously.
The results show that our proposed model performs better in terms of predicting customer lifetime value and also in targeting valuable customers than the methods used for comparison. We find that customers’ purchase timing, purchase amount, and risk of defecting are not independent of each other, which validates our joint modeling approach.
The remainder of this paper is organized as follows: the next section describes the data, §3 details the model development, §4 discusses the estimates, and §5 applies the model to a separate validation sample data set and compares its performance with other methods. Finally, §6 ends the paper with a summary and discussion of the results.
2. The Data
The data come from a membership-based direct marketing company. Examples of such companies are membership-based clubs such as music clubs, book clubs, and other types of purchase-related clubs. The membership is open to the general public.2 Information about any purchase by a customer is known to
1 Proposed by Schmittlein et al. (1987) and later extended by Schmittlein and Peterson (1994).
2 Due to a data confidentiality agreement with the company, we are unable to divulge more details about the company.
Borle, Singh, and Jain: Customer Lifetime Value Measurement
102 Management Science 54(1), pp. 100–112, © 2008 INFORMS
the firm only when the purchase happens. Similarly, customer lifetime length (total membership duration) with the firm is not known to the firm until a customer leaves the firm (i.e., the customer terminates her membership). In such firms, both the purchase timing and spending on purchases do not happen [...]
[...] estimation sample, contains 1,000 past customers and consists of a total of 7,108 purchase occasions. It traces the purchase behavior of these customers over their entire lifetime with the firm. The dates of membership initiation and termination are known for each customer, i.e., completed lifetime lengths are known
[Figure 1: Interpurchase Times. Axis label: purchase occasions.]
[Figure 3: Customer Lifetimes. Horizontal axis: lifetime in weeks.]
[Figure 4(a): Number of Customers Existing After a Particular Purchase. Horizontal axis: purchase occasion.]
[Figure 4(b): The Corresponding Hazard Pattern. Horizontal axis: purchase occasion.]
left after making the first purchase, the second purchase, the third purchase, and so on. Figures 4(a) and 4(b) display the histogram plot and the corresponding hazard of this exit pattern of customers, respectively. This is the third dependent quantity of interest and captures the customer mortality information. The horizontal axis in both figures is the number of purchase occasions; in our estimation sample, we observe a maximum of 41 purchase occasions.5 The vertical axis in Figure 4(a) is the number of customers who terminate their membership with the firm after a particular purchase occasion. The vertical axis in Figure 4(b) is the average probability of a customer defecting (the hazard rate) given that the customer has survived until a particular purchase occasion. Figure 4(b) also contains a third-degree polynomial approximation of the actual hazard pattern (the dotted line). An interesting facet about the empirical hazard pattern in Figure 4(b) is that the hazard rises until the sixth purchase occasion and then decreases until about the 17th purchase occasion and subsequently rises again. It is conceivable that people join the firm, try it out for a few occasions, and then some of the customers decide to quit the firm whereas others become consistent purchasers.
In the next section we introduce our model and subsequently apply it to predict the customer lifetime values at each purchase occasion.
3. The Model
Typical data for each customer can be depicted as in Figure 5. A customer joins the firm, makes her first purchase of $x_1 after t_1 weeks, makes her second purchase of $x_2 after another t_2 weeks, and so on until the ith purchase occasion. Subsequently, the customer leaves with a censored spell of t_{i+1} weeks.
We develop a joint model of the three dependent quantities of interest viz. the interpurchase time, the purchase amount, and the probability of leaving given that a customer has survived a particular purchase occasion (i.e., the hazard rate6 or the risk of defection). We specify models for interpurchase time, purchase amounts, and the risk of defection and then allow a correlation structure across these three models, thus leading to a joint model of these three quantities. The
5 The maximum number of times any customer bought from the firm was 41.
6 See Jain and Vilcassim (1991) for an exposition of hazard models.
model is then jointly estimated and we use the estimates to predict the customer lifetime value at each purchase occasion for every customer in the validation sample.
3.1. Interpurchase Time Model
The interpurchase time is measured in weeks and we assume that it follows an NBD process, i.e.,
$$\mathrm{TIME}_{hi} \sim \mathrm{NBD}(\lambda_{hi}, \alpha_1), \tag{1}$$
where $\mathrm{TIME}_{hi} = 0, 1, 2, 3, \ldots$ measures the interpurchase time in weeks for customer h at purchase occasion i (the time between the (i − 1)th and the ith purchase occasion), and $(\lambda_{hi}, \alpha_1)$ are the parameters of the NBD distribution. The parameter $\lambda_{hi}$ is the mean of the distribution and $\alpha_1$ is the dispersion parameter. The NBD is a well known and used distribution in the marketing literature. It is a generalization of the Poisson distribution and is useful in modeling over-dispersed count data. Another flexible distribution to model over-dispersed data is the COM-Poisson distribution (Boatwright et al. 2003); however, in our application the NBD outperformed the COM-Poisson in its predictive ability.7
The probability mass function of the NBD distribution is as follows:
$$P(\mathrm{TIME}_{hi} \mid \lambda_{hi}, \alpha_1) = \frac{\Gamma(\alpha_1 + \mathrm{TIME}_{hi})}{\Gamma(\alpha_1)\,\Gamma(\mathrm{TIME}_{hi} + 1)} \left(\frac{\alpha_1}{\alpha_1 + \lambda_{hi}}\right)^{\alpha_1} \left(\frac{\lambda_{hi}}{\alpha_1 + \lambda_{hi}}\right)^{\mathrm{TIME}_{hi}}. \tag{2}$$
Thus, the likelihood contribution of a complete spell is as given in Equation (2) whereas the likelihood contribution of a censored spell is as follows:
$$1 - \sum_{r=0}^{\mathrm{TIME}_{hi}} P(r \mid \lambda_{hi}, \alpha_1). \tag{3}$$
We further specify the parameter $\lambda_{hi}$ as follows:
$$\log \lambda_{hi} = \lambda_h + \lambda_i + \lambda_{1h} \log(\mathrm{lagTIME}_{hi}) + \lambda_2\, \mathrm{GENDER}_h, \quad \text{where } \lambda_i = \lambda' i + \lambda'' i^2. \tag{4}$$
The variable $\mathrm{GENDER}_h$ is the gender of customer h (female = 1; male = 0). The coefficient on $\mathrm{GENDER}_h$ addresses any gender differences in the population in terms of purchase frequencies (interpurchase times). The quadratic trend parameter $\lambda_i$ ($= \lambda' i + \lambda'' i^2$) allows for nonstationarity in the interpurchase times across purchase occasions.8
The parameter $\lambda_{1h}$ specifies the impact of lag interpurchase time9 on the current interpurchase time. We incorporate heterogeneity over this parameter by specifying a normal distribution for the $\lambda_{1h}$ values:
$$\lambda_{1h} \sim \mathrm{Normal}(\bar{\lambda}_1, \sigma_1^2). \tag{5}$$
3.2. Purchase-Amount Model
The amount (in dollars, used as a general unit of currency) expended by customer h on purchase occasion i is denoted by $\mathrm{AMNT}_{hi}$. We assume that this variable follows a log-normal process. Thus, we have
$$\log \mathrm{AMNT}_{hi} \sim \mathrm{Normal}(\mu_{hi}, \sigma^2), \tag{6}$$
where $(\mu_{hi}, \sigma^2)$ are the parameters (mean and variance, respectively) of the distribution. An analogous structure (analogous to the interpurchase-time model, Equations (4) and (5)) is allowed for the $\mu_{hi}$ parameter as follows:
$$\mu_{hi} = \mu_h + \mu_i + \mu_{1h} \log(\mathrm{lagAMNT}_{hi}) + \mu_2\, \mathrm{GENDER}_h, \quad \text{where } \mu_i = \mu' i + \mu'' i^2. \tag{7}$$
The coefficient $\mu_2$ specifies the impact of gender on purchase amounts and the coefficient $\mu_i$ allows for a nonlinear trend in the purchase amounts across purchase occasions. The coefficient $\mu_{1h}$ specifies the impact of lagged dollars spent on future amounts expended. We allow this parameter to vary across customers as follows:
$$\mu_{1h} \sim \mathrm{Normal}(\bar{\mu}_1, \sigma_2^2). \tag{8}$$
7 Although, we must point out that NBD may not dominate over the COM-Poisson in all applications. Where under-dispersion is prevalent in the data, the COM-Poisson will dominate over the NBD (Borle et al. 2007). Even in over-dispersed data, in some applications the COM-Poisson would give a better fit (see Shmueli et al. 2005).
8 Here, i indexes the purchase occasion. Higher-order polynomials (beyond quadratic) were also estimated and not found to be statistically significant.
9 In a few instances (less than 0.5% of the data) where the lag interpurchase time is 0, we replace the value with 1.
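As a rough illustration of the likelihood pieces in Equations (2) and (3), the sketch below evaluates the NBD probability mass function in its mean/dispersion form and the probability attached to a censored spell. It is not the authors' estimation code (the paper estimates the full model by hierarchical Bayes); the function names `nbd_pmf` and `censored_spell_prob` and the parameter values in the example are hypothetical and chosen only for illustration.

```python
import math


def nbd_pmf(t, mean, dispersion):
    """Equation (2): P(TIME = t) for an NBD with the given mean and dispersion."""
    log_p = (
        math.lgamma(dispersion + t)
        - math.lgamma(dispersion)
        - math.lgamma(t + 1)
        + dispersion * math.log(dispersion / (dispersion + mean))
        + t * math.log(mean / (dispersion + mean))
    )
    return math.exp(log_p)


def censored_spell_prob(t, mean, dispersion):
    """Equation (3): one minus the probability of a spell of at most t weeks."""
    return 1.0 - sum(nbd_pmf(r, mean, dispersion) for r in range(t + 1))


# Illustrative values only (assumed, not estimates from the paper).
mean_weeks, dispersion = 3.0, 2.0
complete = nbd_pmf(4, mean_weeks, dispersion)              # completed 4-week spell
censored = censored_spell_prob(4, mean_weeks, dispersion)  # spell censored at 4 weeks
print(f"complete spell: {complete:.4f}, censored spell: {censored:.4f}")
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for long spells; in an actual estimation these quantities would be evaluated once per customer and purchase occasion inside the sampler.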
Table 3(a) Parameter Estimates (Interpurchase-Time Model)
Parameter             Estimate
$\alpha_1$            2.2809* (0.04801)
$\lambda'$            0.0751* (0.00482)
$\lambda''$           −0.00138* (0.000169)
$\bar{\lambda}_1$     −0.0401* (0.01218)
$\sigma_1^2$          0.0326* (0.00255)
$\lambda_2$           −0.0324 (0.03961)
Table 3(b) Parameter Estimates (Purchase-Amount Model)
Table 4 Parameter Estimates (The Correlation Structure)
Mean vector
$\bar{\lambda}$       2.2587* (0.03843)
$\bar{\mu}$           2.7021* (0.02493)
$\bar{\delta}$        −9.3303* (0.41193)
Covariance matrix
                      TIME_hi              log AMNT_hi          h(LIFE_hi)
TIME_hi               0.2073* (0.01588)    0.0164* (0.00673)    0.8435* (0.08982)
log AMNT_hi           0.0164* (0.00673)    0.0757* (0.00567)    0.0422 (0.03788)
h(LIFE_hi)            0.8435* (0.08982)    0.0422 (0.03788)     6.1262* (0.94064)
expected interpurchase times and the expected purchase amounts, respectively, whereas the $\delta_h$ values can be interpreted as a measure of household-specific base-level risk of defection from the firm at each pur-
12 The estimates of household-specific parameters have not been reported in the manuscript for sake of brevity.
13 When the covariance matrix is converted to a correlation matrix, this correlation is found to be close to 13% [= 0.0164/(0.2073 × 0.0757)^0.5].
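The covariance-to-correlation conversion mentioned in footnote 13 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `cov_to_corr` is hypothetical, and the matrix entries are the Table 4 covariance estimates under our reading of the decimal placement (row and column order: interpurchase time, log purchase amount, hazard), which is an assumption.

```python
import math


def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    sd = [math.sqrt(cov[k][k]) for k in range(len(cov))]
    return [
        [cov[r][c] / (sd[r] * sd[c]) for c in range(len(cov))]
        for r in range(len(cov))
    ]


# Order: interpurchase time, log purchase amount, hazard (assumed reading of Table 4).
sigma = [
    [0.2073, 0.0164, 0.8435],
    [0.0164, 0.0757, 0.0422],
    [0.8435, 0.0422, 6.1262],
]
corr = cov_to_corr(sigma)
print(f"time vs. amount correlation: {corr[0][1]:.2f}")
```

The time versus amount entry comes out at roughly 0.13, in line with the 13% quoted in footnote 13.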
discussion of the estimates of the purchase-amount model and the risk of defection model (Tables 3(b) and 3(c), respectively).
The parameters $\lambda'$ and $\lambda''$ in Table 3(a) are the second-order polynomial approximation of the nonstationarity in interpurchase times after controlling for the effect of household-specific intercept $\lambda_h$ and covariates used in the model (Equation (4)). The signs of these coefficients indicate that interpurchase times tend to increase and then decrease as purchase occasions progress. It is possible that as purchase occasions progress, more and more customers “try out” the service and, in the long run, the less “loyal” and the more “erratic” purchasers have left the firm, and those remaining with the firm have consistent and perhaps higher frequencies of purchase.
The parameters $(\bar{\lambda}_1, \sigma_1^2)$ specify the mean and variance, respectively, of the normal heterogeneity distribution over the household-specific response parameters ($\lambda_{1h}$) representing the effect of lag interpurchase time on the current interpurchase time. These are estimated as (−0.0401, 0.0326), implying that, on average, the impact of lag interpurchase time on current interpurchase time is not significant. Now when we look at the customer-specific estimates of $\lambda_{1h}$ (not reported in the manuscript), we find that only 3.1% of customers have a “significant” estimate of $\lambda_{1h}$ (1.3% have a negative estimate, whereas the remaining 1.8% have a positive estimate), also indicating that the “average” effect of lagged interpurchase time on the current interpurchase time in the population is minimal (almost absent). Finally, the parameter $\lambda_2$ (= −0.0324) is not significantly different from 0, indicating that both male and female customers have similar interpurchase time intervals.
Now consider the estimates of the purchase amount model in Table 3(b). The parameters $\mu'$ and $\mu''$ approximate the nonstationarity in purchase amounts. Their signs indicate that purchase amounts initially increase and then decrease across purchase occasions. The parameter $\mu_2$ (= −0.0994) is significant and negative, indicating that women tend to spend less compared to men. On average, women tend to spend 9% [= 1 − exp(−0.0994)] fewer dollars per occasion than men.
The parameters $(\bar{\mu}_1, \sigma_2^2)$ specify the mean and variance, respectively, of the normal heterogeneity distribution over the household-specific response parameter for the effect of lag purchase amount on the current purchase amount ($\mu_{1h}$) in the purchase-amount model. Their estimates of (−0.0015, 0.0131) imply that, on average, the impact of lag purchase amount on current purchase amount is not significant. When we consider the customer-specific estimates of $\mu_{1h}$, we find that only 1.1% of customers have a significant $\mu_{1h}$, also indicating that the “average” effect of increases in lag purchase amounts on the current purchase amounts is minimal, i.e., insignificant.
Table 3(c) contains estimates from the customer-defection model. The parameters $\delta'$, $\delta''$, and $\delta'''$ form a third-degree polynomial approximation (as mentioned earlier, the higher-order terms in the polynomial were insignificant) of the nonstationarity in the hazard rate of customers leaving the membership at each purchase occasion after controlling for the effect of household-specific intercept $\delta_h$ and covariates used in the model (Equation (10)). The signs and magnitude of the $\delta'$, $\delta''$, $\delta'''$ parameters mirror the empirical hazard rate shown earlier in Figure 4(b) in that the hazard initially rises, then falls, and then rises again as purchase occasions progress. Uncovering the pattern of mortality is a very important part of the model and can be leveraged by the firm in improving predictions of customer lifetime value. To the best of our knowledge, the extant literature has not studied such a kind of application where the firm jointly uses customer mortality pattern and customer purchase behavior to better predict CLV.
The parameters $(\bar{\delta}_1, \sigma_3^2)$ specify the mean and variance of the normal heterogeneity distribution over $\delta_{1h}$’s that are the customer-specific response parameters for the effect of lag interpurchase time on the risk of defection. The estimates of $(\bar{\delta}_1, \sigma_3^2)$, in other words, (−0.1096, 0.1716), imply that $\bar{\delta}_1$, which is the average impact of lag interpurchase times on the risk of defection, is not significant. Alternately, looking at the customer-specific $\delta_{1h}$ values, we find that none are estimated to be significantly different from zero, implying that there is virtually no impact of lag interpurchase times on the risk of defection.
Similarly, the parameters $(\bar{\delta}_2, \sigma_4^2)$ [estimated as (0.6477, 0.1385)] specify the mean and variance of the normal heterogeneity distribution over $\delta_{2h}$’s that measure the impact of lag purchase amount on the risk of defection. On average, the estimates show a significant impact of lag purchase amount on the risk of defection. Looking at the customer-specific estimates, we find that most of the $\delta_{2h}$ values are positive and significant, also implying that higher spending by a customer corresponds to an increased subsequent risk of defection for the customer. The remaining parameter in Table 3(c), $\delta_3$, is the impact of gender on the risk of defection. The estimated value of −0.2317 is not significant, implying that men and women tend to have similar risks of defection.
In summary, the estimates show that there is significant nonstationarity in all of the three outcomes modeled (i.e., interpurchase time, amount spent, and risk of defection). Therefore, consideration of nonstationarity in measuring lifetime value of customers is likely to improve the measurements. We find that higher
spending by a customer is related to an increased risk of subsequent defection, and female customers spend less than male customers. The significance of correlations between the outcomes modeled shows the appropriateness of the joint modeling approach that we follow.
In the next section, we illustrate the usefulness of our model by applying it on a validation sample to predict the present values of the lifetime revenues of customers at each purchase occasion, i.e., customer lifetime value at each purchase occasion. We then compare the performance of the proposed model with some extant methods of CLV estimation and customer targeting.
5. Application of the Proposed Model
We consider two related applications of the proposed model and illustrate the usefulness of the model compared to the extant methods used.14 The first application is in predicting customer lifetime values and the second application is in targeting valuable customers.
5.1. Predicting Customer Lifetime Value
We apply the proposed model to predict the present value of future customer lifetime revenues at each purchase occasion for each customer in a validation data sample. This sample consisted of 500 past customers (a total of 3,547 purchase occasions) spread across a total of 29 purchase occasions (i.e., the maximum number of times any customer bought from the firm in this validation data set was 29). Because we know the actual lifetimes of all of these 500 customers, we can test the performance of our model in predicting customer lifetime values. Figure 6 is helpful in illustrating the prediction of CLV.
At the time of membership initiation (time zero), all that the firm knows about the customer (in terms of relevance to prediction using our proposed model) is the gender of the person. The lagged value of “time to next purchase” (interpurchase time) and the lagged value of purchase amount do not exist. So, using the gender covariate and using the non-household-specific parameters (in Tables 3(a)–3(c) and 4), the firm predicts (a) the probability of defecting before the first purchase “p1,” (b) the time to first purchase “t1,” and (c) the amount of the first purchase “x1.” These three predicted values are then used in a simulation of the entire lifespan of the customer.
The simulation is done as follows: the probability p1 is compared to a uniform(0, 1) draw and the “death” event before the next purchase occasion is decided. If simulated “death” does not occur, the customer spends x1 amount after time t1. So now, in the simulation, the customer has finished the first purchase occasion. Using the non-household-specific parameters in Tables 3(a)–3(c) and 4 and the now-available lagged values of interpurchase time and purchase amount (t1 and x1, respectively), the firm predicts the triad value (p2, t2, x2) for the next (second) purchase occasion. This simulation goes on until a simulated death event occurs, at which point the stream of simulated revenues is calculated for that customer and discounted to time zero (the time of the customer joining the service) using an annual discount rate of 12% (Gupta et al. 2004).15 This is done for all the customers in the data set and, thus, a total estimate of customer lifetime value at the time of joining service is obtained.16,17
After the first actual purchase event is observed by the firm for a customer, the firm has some more
14 As mentioned earlier, the context of our data is unique, and this limits the choice of extant methods for comparison.
15 A range of discount rates from 10% to 15% was also used; the relative performance of the model vis-à-vis other models considered does not change.
16 The simulation is done 1,000 times using the set of 500 thinned posterior draws from our MCMC chain and the CLV for each customer is averaged over these 1000 × 500 iterations.
17 Assuming costs of servicing customers to be the same across customers and, thus, without loss of generality assuming this to be 0, the estimate of future revenues discounted to the present time can be viewed as the customer lifetime value.
information on the customer, namely, the time to first purchase, the amount of the first purchase, and that the customer “survived” the purchase occasion. Using this information and the non-household-specific parameters (in Tables 3(a)–3(c) and 4) as priors, the firm estimates the household-specific parameters $(\lambda_h, \lambda_{1h})$ (Equation (4)), $(\mu_h, \mu_{1h})$ (Equation (7)), and $(\delta_h, \delta_{4h}, \delta_{5h})$ (Equation (10)). A simulation exercise is again carried out as described earlier except that now, wherever applicable, the household-specific parameters are used in the simulation, the end result being an estimate of customer lifetime value after the first purchase occasion.
A similar process is followed for each purchase occasion, updating the household-specific parameters with the available information and then simulating to predict CLV. Every interaction leads to more information about the customer and, thus, it is imperative that the firm use this information in future predictions (in our context it implies that the firm update the household-specific parameters after every interaction with the customer). The net result is that at each purchase occasion the firm gets an updated estimate of the future lifetime revenues from the customer discounted to the present time.
Note that in practice, a firm would use the model as follows. Whenever the firm carries out a predictive exercise to predict the CLV of its existing customers, it will look into its existing customer database. There would be many customers at varying points in their lifespan: some would have just joined, some would have completed their first purchase occasion, some would have completed the second purchase occasion, and so on. The firm would estimate the household-specific parameters for these customers [$(\lambda_h, \lambda_{1h})$ (Equation (4)), $(\mu_h, \mu_{1h})$ (Equation (7)), and $(\delta_h, \delta_{4h}, \delta_{5h})$ (Equation (10))] using the available purchase history for each customer and the non-household-specific parameters (in Tables 3(a)–3(c) and 4) as priors. Using these parameters, the firm would do a simulation exercise as described earlier to estimate the CLV for each customer. The firm would repeat this exercise every time it wished to obtain an estimate of the CLV for its existing customers.
To illustrate the relative advantage of the proposed model in predicting lifetime value, we compare the lifetime value estimates from our model with the following other models: (a) the extended NBD–Pareto framework; (b) a heuristic method; and (c) two models nested within our proposed model. We explain the details of these models below.
The NBD–Pareto model (by Schmittlein et al. 1987, and later extended by Schmittlein and Peterson 1994) (Model 4 here) is a well regarded model in the literature on customer lifetime valuation (Jain and Singh 2002, Reinartz and Kumar 2000), recommended to be applied in a noncontractual context. It has often been used as a benchmark to compare various methods of lifetime valuations (Fader et al. 2005). The underlying assumptions of the extended NBD–Pareto model (Schmittlein and Peterson 1994) are a Poisson purchase process for individual customers (with the Poisson rate distributed gamma across the population), an exponential distribution for individual customer lifetimes (with the exponential parameter distributed as gamma across the population), and a normal distribution for the dollar purchase amounts. Given these assumptions, Schmittlein and Peterson (1994) derive (among other things) an expression for the expected future dollar volume from a customer with a given purchase history. This can then be used to calculate the present value of future customer revenues. This is what we calculate for each customer at each purchase occasion in the model comparison.
One key point in the usefulness of the NBD–Pareto framework is that the researcher does not observe the time when a customer becomes inactive, i.e., the end of customer lifetime with the firm. This is clearly not the case in our application where we do observe complete customer lifetimes. So in some sense the comparison of predictive performance of the proposed model with the extended NBD–Pareto framework may not be a direct comparison. For the sake of completeness, however, we provide a comparison with the NBD–Pareto model.
The “heuristic” model (Model 5) is a simple method whereby we take the average customer lifetime, the average interpurchase time, and the average dollar purchase amount observed in our estimation sample and use them to predict the present value of future customer revenues at each purchase occasion in our hold-out sample. The heuristic model is a simple yet useful method to calculate CLV in the absence of any available “model.”
We also compare our proposed model with two models nested in it. The first nested model is our proposed model without the correlation structure across the three components of the model, i.e., this model treats customer defection, spending, and interpurchase time as independent of each other. The second nested model is the proposed model without the covariates (including the trend parameters).
Figure 7 displays the relative predictive performance of the models. We plot the actual average customer lifetime value after each purchase occasion in our hold-out sample, and compare it with the predictive performance of the proposed model and the other models. The average customer lifetime value is the mean of the lifetime values of all customers surviving a purchase occasion. In Table 5, we report the mean absolute deviation (MAD) of the predicted lifetime value vis-à-vis the actual lifetime values for all
Borle, Singh, and Jain: Customer Lifetime Value Measurement
110 Management Science 54(1), pp. 100–112, © 2008 INFORMS
[Figure 7. Horizontal axis: purchase occasions (0–28). Vertical axis: average customer lifetime value (0–150).]
customers across all purchase occasions. In addition, for illustration, we present the average actual and estimated customer value after the 6th purchase occasion. The horizontal axis in Figure 7 is the purchase occasion and the vertical axis is the average lifetime value across all customers who have survived a particular purchase occasion. It is clear from Figure 7 that the proposed model outperforms the other models compared across most of the purchase occasions.

As shown in Table 5, column 3, the overall prediction from the proposed model (Model 1) is better than the other alternatives. In Figure 7, the relative advantage of the proposed model over the model without correlation (Model 2) was not visually apparent, but comparing the MAD values, we find that the proposed model does much better than the nested model without correlations across the three components (Model 2). This demonstrates that there is clear value in modeling the correlation structure because we “lose” information if we assume independence across purchase times, purchase amounts, and the risk of defection. Comparing Model 3 (the other nested model without the covariates) with Model 1, we find that Model 3 performs poorly relative to Model 1. Hence, inclusion of covariates also helps to better predict the CLV. The MAD statistics show that the heuristic model also performs poorly relative to the proposed model.

We now compare the extended NBD–Pareto model to Model 3 (the proposed model without covariates) because the NBD–Pareto model does not include covariates. The MAD values show that Model 3 performs better than the extended NBD–Pareto model. One reason for the poor performance of the extended NBD–Pareto model (relative to the proposed model) may be that it does not use the extra information in observing completed lifetimes (thus, it cannot incorporate a time-varying mortality rate), which is explicitly used in our model formulation. The trend variable (Equation (10)) included in the customer-defection model (§3.3) estimates the time-varying trend observed in customer mortality and significantly improves the prediction of customer lifetime values. This highlights the value of including the time-varying trend in the model formulation to improve CLV prediction.

5.2. Targeting Valuable Customers
In another related application of the proposed model, we apply it to “score” customers for targeting. This allows us to compare the model performance with the widely used RFM value framework. The RFM framework is a commonly used technique to score customers for a variety of purposes (e.g., targeting customers for a direct-mail campaign). As the name suggests, the RFM framework uses information on a customer’s past purchase behavior along three dimensions (recency of past purchase, frequency of past purchases, and the monetary value of past purchase) to score customers. For our analysis, we employed an “advanced form of RFM scoring” (Reinartz and Kumar 2003). We regressed the purchase amounts at each purchase occasion (in the validation data sample) on the past purchase amounts, the past interpurchase time, and the past cumulative

Table 5 Predicting Customer Lifetime Values (Comparison Across Models)
Model types                                                   CLV (after the 6th purchase occasion) ($)   MAD (all observations) ($)
Actual average CLV                                            6998                                        0
Model 1 (Proposed model)                                      6004                                        4693
Model 2 (Proposed model without the correlation structure)    6222                                        5764
Model 3 (Proposed model without covariates)                   10410                                       6107
Model 4 (Extended NBD–Pareto model)                           11389                                       7229
Model 5 (A “heuristic” approach)                              2313                                        6184
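The MAD comparison reported in Table 5 can be sketched in a few lines of Python. This is only an illustration of the statistic, not the authors' code: the lifetime values, the model names `model_a` and `model_b`, and the helper `mean_absolute_deviation` are hypothetical, and each list is assumed to hold one entry per customer and purchase occasion in a hold-out sample.

```python
from typing import Dict, Sequence


def mean_absolute_deviation(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute gap between predicted and actual lifetime values."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical hold-out data: one entry per (customer, purchase occasion).
actual_clv = [70.0, 55.0, 90.0, 40.0]
predicted_clv: Dict[str, Sequence[float]] = {
    "model_a": [65.0, 60.0, 80.0, 45.0],
    "model_b": [90.0, 30.0, 120.0, 20.0],
}

# MAD per model; the model with the smallest MAD predicts best on this sample.
mad_by_model = {
    name: mean_absolute_deviation(pred, actual_clv)
    for name, pred in predicted_clv.items()
}
best_model = min(mad_by_model, key=mad_by_model.get)

print(mad_by_model)
print("lowest MAD:", best_model)
```

A lower MAD means the predictions sit closer to the realized lifetime values on average, which is the sense in which the models in Table 5 are ranked.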
Table 6 Targeting Customers (Comparison Across Models)
Figure 8 CLV of the Targeted Customers Across Purchase Occasions
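The regression-based scoring described in §5.2 can be illustrated with a small Python sketch. It is not the authors' implementation. The data are hypothetical, and only two regressors are used (past purchase amount and past interpurchase time), because the description of the third regressor is incomplete in the text. Ranking customers by fitted purchase amount is an assumption about how such scores might be used for targeting. The helpers `solve_linear_system`, `fit_ols`, and `score` are illustrative names.

```python
from typing import Dict, List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def score(coefs: Sequence[float], row: Sequence[float]) -> float:
    """Fitted purchase amount for one customer."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical training rows: (past purchase amount, past interpurchase time).
history = [(20.0, 4.0), (35.0, 2.0), (50.0, 8.0), (15.0, 6.0), (40.0, 3.0)]
# Hypothetical purchase amounts, generated here from 5 + 0.5*amount - 0.2*time.
amounts = [14.2, 22.1, 28.4, 11.3, 24.4]

coefs = fit_ols(history, amounts)

# Score hypothetical customers and rank them from highest to lowest score.
customers: Dict[str, Tuple[float, float]] = {
    "c1": (30.0, 5.0),
    "c2": (45.0, 2.0),
    "c3": (10.0, 7.0),
}
ranked = sorted(customers, key=lambda cid: score(coefs, customers[cid]), reverse=True)
print(ranked)
```

In practice a statistics library would replace the hand-written solver; the normal equations are used here only to keep the sketch free of external dependencies.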
can calculate the customer lifetime value for each customer at each purchase occasion.

We compare the performance of our model in two applications on a separate validation data set. First, in measuring CLV, we compare our proposed model with the extended NBD–Pareto model, a heuristic model, and two other models nested within our proposed model. Second, in targeting customers, we compare our proposed model to all the models compared earlier and an RFM value framework. The results show that our model performs better at both predicting the customer lifetime value and targeting valuable customers than the other models. We also find that jointly modeling customer spending, interpurchase time, and the risk of customer defection, incorporating time-varying effects in the model formulation, and including relevant covariates in the model significantly improve the predictive performance of the model.

Some of our key results show that longer spells of interpurchase time are associated with a greater risk of a customer leaving the firm and also larger purchase amounts (though the latter association is weak). Both male and female customers seem to have similar interpurchase time intervals; however, women spend less than men. The risk of defection is similar across male and female customers.

Most methods of estimating customer lifetime value can be best applied in specific situations where their critical assumptions are satisfied. Our approach is best suited for situations where a firm observes when a customer stops doing business with it, i.e., customer lifetimes with the firm are known to the firm after a customer leaves the firm, and customer purchase behavior is stochastic. Examples of such situations would be membership-based purchase clubs such as movie clubs, music clubs, book clubs, automobile associations, and membership-based retailers (e.g., Sam’s Club and Costco).

One potential drawback of this analysis may be the availability of appropriate covariates. However, the proposed model specification is flexible enough to incorporate a richer set of covariates, and thereby improve its predictive performance. What is encouraging, though, is that, despite limited availability of covariates, the proposed approach outperforms the extant methods of CLV prediction and customer targeting, at least in the context that is analyzed in this study.

Acknowledgments
The authors thank Joseph B. Kadane and Peter Boatwright for their valuable comments and suggestions on this paper. All authors contributed equally. The authors’ names appear in random order.

References
Berger, P. D., N. Nasr. 1998. Customer lifetime value: Marketing models and applications. J. Interactive Marketing 12 17–30.
Bhattacharya, C. B. 1998. When customers are members: Customer retention in paid membership contexts. J. Acad. Marketing Sci. 26(1) 31–44.
Blattberg, R. C., J. Deighton. 1996. Manage marketing by the customer equity test. Harvard Bus. Rev. (July–August) 136–144.
Boatwright, P., S. Borle, J. B. Kadane. 2003. A model of the joint distribution of purchase quantity and timing. J. Amer. Statist. Assoc. 98 564–572.
Bolton, R. N. 1998. A dynamic model of the duration of the customer’s relationship with a continuous service provider: The role of satisfaction. Marketing Sci. 17(1) 45–65.
Borle, S., U. M. Dholakia, S. S. Singh, R. A. Westbrook. 2007. The impact of survey participation on subsequent customer behavior: An empirical investigation. Marketing Sci. 26(5) 711–726.
Dowling, G. R., M. Uncles. 1997. Do customer loyalty programs really work? Sloan Management Rev. (Summer) 71–82.
Fader, P. S., B. G. S. Hardie, K. L. Lee. 2005. “Counting your customers” the easy way: An alternative to the Pareto/NBD model. Marketing Sci. 24(2) 275–284.
Gupta, S., D. R. Lehmann, J. A. Stuart. 2004. Valuing customers. J. Marketing Res. 41 7–18.
Jain, D., S. Singh. 2002. Customer lifetime value research in marketing: A review and future directions. J. Interactive Marketing 16 34–46.
Jain, D. C., N. J. Vilcassim. 1991. Investigating household purchase timing decisions: A conditional hazard function approach. Marketing Sci. 10(1) 1–23.
O’Brien, L., C. Jones. 1995. Do rewards really create loyalty? Harvard Bus. Rev. (May–June) 75–82.
Reinartz, W. J., V. Kumar. 2000. On the profitability of long-life customers in a noncontractual setting: An empirical investigation and implications for marketing. J. Marketing 64 17–35.
Reinartz, W. J., V. Kumar. 2003. The impact of customer relationship characteristics on profitable lifetime duration. J. Marketing 67 77–99.
Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Sci. 22(3) 304–328.
Rust, R., K. Lemon, V. Zeithaml. 2004. Return on marketing: Using customer equity to focus marketing strategy. J. Marketing 68 109–127.
Schmittlein, D. C., R. A. Peterson. 1994. Customer base analysis: An industrial purchase process application. Marketing Sci. 13(1) 41–67.
Schmittlein, D. C., D. G. Morrison, R. Colombo. 1987. Counting your customers: Who are they and what will they do next? Management Sci. 33(1) 1–24.
Shmueli, G., T. P. Minka, J. B. Kadane, S. Borle, P. Boatwright. 2005. A useful distribution for fitting discrete data: Revival of the COM-Poisson. J. Royal Statist. Soc., Ser. C 54(1) 127–142.
Singer, J. D., J. B. Willett. 2003. Applied Longitudinal Data Analysis. Oxford University Press, New York.
Singh, S. S., D. C. Jain. 2007. Customer lifetime purchase behavior: An econometric model and empirical analysis. Working paper, Rice University, Houston, TX.
Thomas, J. S. 2001. A methodology for linking customer acquisition to customer retention. J. Marketing Res. 38(May) 262–268.