
Mathematical transformations for the CLV functions

Probability of the customer being alive


The equation below gives the probability that a customer with history (frequency, recency, T) is currently alive, where x is the frequency and t_x the recency:

P(\text{alive} \mid x, t_x, T, r, \alpha, a, b) = \frac{1}{1 + \dfrac{a}{b + x - 1}\left(\dfrac{\alpha + T}{\alpha + t_x}\right)^{r + x}}
Since we might be multiplying very small numbers, we need to transform the equation above by applying the logarithm function, thus avoiding underflows. That is why the equation coded in the lifetimes library in Python is slightly different from the one presented in the article. The steps below show how to get to that equation:

1) Using the logarithm rule e^{\ln(x)} = x, let's simplify the second part of the denominator:

\left(\frac{\alpha + T}{\alpha + t_x}\right)^{r + x} = e^{(r + x)\,\ln\left(\frac{\alpha + T}{\alpha + t_x}\right)}

And since \frac{a}{b + x - 1} = e^{\ln\left(\frac{a}{b + x - 1}\right)}, and \ln(m \cdot n) = \ln(m) + \ln(n), then:

\frac{a}{b + x - 1}\left(\frac{\alpha + T}{\alpha + t_x}\right)^{r + x} = e^{(r + x)\,\ln\left(\frac{\alpha + T}{\alpha + t_x}\right) + \ln\left(\frac{a}{b + x - 1}\right)}
Thus, the final expression used by the lifetimes library in Python is:

P(\text{alive} \mid x, t_x, T, r, \alpha, a, b) = \frac{1}{1 + e^{(r + x)\,\ln\left(\frac{\alpha + T}{\alpha + t_x}\right) + \ln\left(\frac{a}{b + x - 1}\right)}}
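As a quick sanity check, the direct form and the log-transformed form can be compared numerically. This is a minimal sketch; the function names are mine and the parameter values below are made up for illustration, not fitted values:

```python
from math import log, exp

def p_alive_direct(x, t_x, T, r, alpha, a, b):
    # Direct form: 1 / (1 + (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x))^(r + x))
    return 1.0 / (1.0 + (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x))

def p_alive_log(x, t_x, T, r, alpha, a, b):
    # Log-transformed form: the exponent is computed in log scale first,
    # which avoids underflow when the individual factors are tiny
    log_div = (r + x) * log((alpha + T) / (alpha + t_x)) + log(a / (b + x - 1))
    return 1.0 / (1.0 + exp(log_div))

# Made-up parameters and one customer history (x=frequency, t_x=recency, T=age)
params = dict(r=0.24, alpha=4.41, a=0.79, b=2.43)
print(p_alive_direct(3, 20.0, 38.0, **params))
print(p_alive_log(3, 20.0, 38.0, **params))  # both print the same probability
```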

The code below is from beta_geo_fitter.py, lines 212 to 247:

# imports used by this snippet (declared at module level in beta_geo_fitter.py):
from numpy import exp, log, where
import numpy as np

def conditional_probability_alive(self, frequency, recency, T,
                                  ln_exp_max=300):
    """
    Compute conditional probability alive.

    Compute the probability that a customer with history
    (frequency, recency, T) is currently alive.

    From http://www.brucehardie.com/notes/021/palive_for_BGNBD.pdf

    Parameters
    ----------
    frequency: float
        historical frequency of customer.
    recency: float
        historical recency of customer.
    T: float
        age of the customer.
    ln_exp_max: int
        value at which log_div is clipped.

    Returns
    -------
    float
        value representing a probability

    """
    r, alpha, a, b = self._unload_params('r', 'alpha', 'a', 'b')

    log_div = (r + frequency) * log(
        (alpha + T) / (alpha + recency)) + log(
        a / (b + where(frequency == 0, 1, frequency) - 1))

    return where(frequency == 0, 1.,
                 where(log_div > ln_exp_max, 0.,
                       1. / (1 + exp(np.clip(log_div, None, ln_exp_max)))))
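A standalone sketch of the same logic (using only numpy, with the fitted parameters passed in as arguments; the parameter values are made up for illustration) shows the two special cases handled by the nested where calls: a customer with frequency == 0 is considered alive with probability 1, and a huge log_div is short-circuited to probability 0:

```python
import numpy as np

def conditional_probability_alive(frequency, recency, T,
                                  r, alpha, a, b, ln_exp_max=300):
    # Standalone version of the method above, parameters passed explicitly
    frequency = np.asarray(frequency, dtype=float)
    recency = np.asarray(recency, dtype=float)
    T = np.asarray(T, dtype=float)
    # where(frequency == 0, 1, frequency) keeps the log argument valid for x = 0;
    # that branch's value is discarded by the outer where anyway
    log_div = (r + frequency) * np.log(
        (alpha + T) / (alpha + recency)) + np.log(
        a / (b + np.where(frequency == 0, 1, frequency) - 1))
    return np.where(frequency == 0, 1.,
                    np.where(log_div > ln_exp_max, 0.,
                             1. / (1 + np.exp(np.clip(log_div, None, ln_exp_max)))))

# Made-up parameters; three customers: never repeated, recent repeater,
# and a frequent buyer last seen an absurdly long time ago (log_div > 300)
p = conditional_probability_alive(
    frequency=[0, 5, 100], recency=[0, 30, 1], T=[40, 35, 1e15],
    r=0.24, alpha=4.41, a=0.79, b=2.43)
print(p)  # first entry is exactly 1.0, last entry is exactly 0.0
```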

Cost function

This is the cost function as described in the BG/NBD article, built from the likelihood of observing a customer history (x, t_x, T):

L(r, \alpha, a, b \mid x, t_x, T) = A_1 \cdot A_2 \cdot \left(A_3 + \delta_{x > 0}\, A_4\right)

As with the other equations in the CLV model, we need to take the log of the cost function when implementing the algorithm in order to avoid underflows. The steps below show how this is done for each term of the cost function:

A_1 = \frac{\Gamma(r + x)\,\alpha^r}{\Gamma(r)} \;\Rightarrow\; \ln A_1 = \ln\Gamma(r + x) + r\,\ln(\alpha) - \ln\Gamma(r)

A_2 = \frac{\Gamma(a + b)\,\Gamma(b + x)}{\Gamma(b)\,\Gamma(a + b + x)} \;\Rightarrow\; \ln A_2 = \ln\Gamma(a + b) + \ln\Gamma(b + x) - \ln\Gamma(b) - \ln\Gamma(a + b + x)

A_3 = \left(\frac{1}{\alpha + T}\right)^{r + x} \;\Rightarrow\; \ln A_3 = -(r + x)\,\ln(\alpha + T)

A_4 = \frac{a}{b + x - 1}\left(\frac{1}{\alpha + t_x}\right)^{r + x} \;\Rightarrow\; \ln A_4 = \ln(a) - \ln(b + x - 1) - (r + x)\,\ln(\alpha + t_x)

Result = -\sum \left(\ln A_1 + \ln A_2 + \ln\left(e^{\ln A_3} + \delta_{x > 0}\, e^{\ln A_4}\right)\right) / N

where \delta_{x > 0} equals 1 when x > 0 and 0 otherwise, and N is the number of customers.
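The per-customer term of this negative log-likelihood can be sketched with only the Python standard library (math.lgamma computes ln Γ). This is an illustrative sketch, not the lifetimes implementation, and the parameter values in the example are made up:

```python
from math import lgamma, log, exp

def bgnbd_negative_log_likelihood(x, t_x, T, r, alpha, a, b):
    # ln A_1 = ln Gamma(r + x) + r ln(alpha) - ln Gamma(r)
    ln_a1 = lgamma(r + x) + r * log(alpha) - lgamma(r)
    # ln A_2 = ln Gamma(a + b) + ln Gamma(b + x) - ln Gamma(b) - ln Gamma(a + b + x)
    ln_a2 = lgamma(a + b) + lgamma(b + x) - lgamma(b) - lgamma(a + b + x)
    # ln A_3 = -(r + x) ln(alpha + T)
    ln_a3 = -(r + x) * log(alpha + T)
    if x > 0:
        # ln A_4 = ln(a) - ln(b + x - 1) - (r + x) ln(alpha + t_x)
        ln_a4 = log(a) - log(b + x - 1) - (r + x) * log(alpha + t_x)
        # ln(e^{ln A_3} + e^{ln A_4}), shifted by the max for stability (logsumexp)
        m = max(ln_a3, ln_a4)
        ln_sum = m + log(exp(ln_a3 - m) + exp(ln_a4 - m))
    else:
        # delta_{x>0} = 0, so the A_4 term drops out
        ln_sum = ln_a3
    return -(ln_a1 + ln_a2 + ln_sum)

# Made-up parameters and one customer history
print(bgnbd_negative_log_likelihood(x=3, t_x=20.0, T=38.0,
                                    r=0.24, alpha=4.41, a=0.79, b=2.43))
```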

This last transformation, the log of a sum of exponentials, is called logsumexp (or, in some libraries, softmax). According to Wikipedia:

The LSE function is often encountered when the usual arithmetic computations are performed in the log domain or log scale. Just as a multiplication in linear scale becomes a simple addition in log scale, an addition in linear scale becomes the LSE in the log domain.

A common purpose of using log-domain computations is to increase accuracy and avoid underflow and overflow problems when very small or very large numbers are represented directly (i.e. in a linear domain) using limited-precision floating point numbers.
In Scala (the Breeze library) this function is called softmax:

softmax: Return the softmax of the collection. log(exp(a(1)) + ... exp(a(n))). Also known as
logsumexp when used to sum up small probabilities in the log domain
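A quick illustration of why this matters, in pure Python with arbitrary numbers: summing two probabilities whose logs are around -1000 underflows to zero in the linear domain, while the shift-by-the-max logsumexp recovers the correct log of the sum:

```python
from math import log, exp

def logsumexp(values):
    # Stable ln(exp(v1) + ... + exp(vn)): shift by the max so the
    # largest term becomes exp(0) = 1 and cannot underflow
    m = max(values)
    return m + log(sum(exp(v - m) for v in values))

log_probs = [-1000.0, -1000.0]  # each probability is e^-1000, far below float range

# Naive linear-domain sum underflows to exactly 0.0
print(exp(-1000.0) + exp(-1000.0))  # 0.0

# Log-domain: ln(2 * e^-1000) = -1000 + ln(2)
print(logsumexp(log_probs))  # ≈ -999.307
```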
