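$$P(\text{alive} \mid x, t_x, T, r, \alpha, a, b) = \frac{1}{1 + \dfrac{a}{b + x - 1}\left(\dfrac{\alpha + T}{\alpha + t_x}\right)^{r + x}}, \quad x > 0$$

And since $y = \exp(\ln(y))$, and $\log(u \cdot v) = \log(u) + \log(v)$, then the second term of the denominator can be written as:

$$\frac{a}{b + x - 1}\left(\frac{\alpha + T}{\alpha + t_x}\right)^{r + x} = \exp\left((r + x)\ln\left(\frac{\alpha + T}{\alpha + t_x}\right) + \ln\left(\frac{a}{b + x - 1}\right)\right) = \exp(\text{log\_div})$$

Thus, the final expression being used by the lifetimes library in Python is:

$$P(\text{alive} \mid x, t_x, T, r, \alpha, a, b) = \frac{1}{1 + \exp(\text{log\_div})}$$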
From http://www.brucehardie.com/notes/021/palive_for_BGNBD.pdf
Parameters
----------
frequency: float
historical frequency of customer.
recency: float
historical recency of customer.
T: float
age of the customer.
ln_exp_max: int
to what value clip log_div equation
Returns
-------
float
value representing a probability
"""
r, alpha, a, b = self._unload_params('r', 'alpha', 'a', 'b')
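The library excerpt above is cut off right after the parameters are unpacked. As an illustration only, the expression 1 / (1 + exp(log_div)) can be sketched in plain Python as follows. The function name `p_alive_sketch`, the scalar-only interface, the parameter values in the usage lines, and the exact way `ln_exp_max` is applied are assumptions made for this sketch; they are not taken from lifetimes.

```python
import math


def p_alive_sketch(frequency, recency, T, r, alpha, a, b, ln_exp_max=300):
    """Hypothetical scalar helper for P(alive); not the lifetimes implementation."""
    # Under BG/NBD, dropout can only happen right after a purchase, so a
    # customer with no repeat purchases is treated as alive with probability 1.
    if frequency == 0:
        return 1.0

    # log_div = (r + x) * ln((alpha + T) / (alpha + t_x)) + ln(a / (b + x - 1))
    log_div = (r + frequency) * math.log((alpha + T) / (alpha + recency)) + math.log(
        a / (b + frequency - 1)
    )

    # Assumed clipping rule: cap log_div so that math.exp cannot overflow.
    log_div = min(log_div, ln_exp_max)

    return 1.0 / (1.0 + math.exp(log_div))


# Illustrative, made-up parameter values (not fitted to any data set).
p = p_alive_sketch(frequency=2, recency=30.0, T=39.0, r=0.25, alpha=4.5, a=0.8, b=2.5)
print(p)
```

Working with `log_div` rather than the raw ratio keeps the intermediate value small even when the power term would be very large, which is the reason the docstring exposes a clipping value.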
Cost function
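This is the cost function as described in the BG/NBD article, i.e. the likelihood of one customer's purchase history $(x, t_x, T)$, written here as a product of four terms:

$$L(r, \alpha, a, b \mid x, t_x, T) = \frac{\Gamma(r + x)\,\alpha^{r}}{\Gamma(r)} \cdot \frac{\Gamma(a + b)\,\Gamma(b + x)}{\Gamma(b)\,\Gamma(a + b + x)} \cdot \left[\left(\frac{1}{\alpha + T}\right)^{r + x} + \delta_{x > 0}\,\frac{a}{b + x - 1}\left(\frac{1}{\alpha + t_x}\right)^{r + x}\right]$$

where $\delta_{x > 0} = 1$ if $x > 0$ and $0$ otherwise.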
As with the other equations in the CLV model, we need to take the log of the cost function when
implementing the algorithm in order to avoid underflows. The steps below show how this is done for
each term of the cost function:
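$$\ln(A_1) = \ln\left(\frac{\Gamma(r + x)\,\alpha^{r}}{\Gamma(r)}\right) = \ln\Gamma(r + x) + r\ln(\alpha) - \ln\Gamma(r)$$

$$\ln(A_2) = \ln\left(\frac{\Gamma(a + b)\,\Gamma(b + x)}{\Gamma(b)\,\Gamma(a + b + x)}\right) = \ln\Gamma(a + b) + \ln\Gamma(b + x) - \ln\Gamma(b) - \ln\Gamma(a + b + x)$$

$$\ln(A_3) = \ln\left(\left(\frac{1}{\alpha + T}\right)^{r + x}\right) = (r + x)\ln\left(\frac{1}{\alpha + T}\right) = -(r + x)\ln(\alpha + T)$$

$$\ln(A_4) = \ln\left(\frac{a}{b + x - 1}\left(\frac{1}{\alpha + t_x}\right)^{r + x}\right) = \ln(a) - \ln(b + x - 1) - (r + x)\ln(\alpha + t_x)$$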
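The four log terms above translate almost line by line into code. The sketch below uses only the Python standard library (`math.lgamma` for the log-gamma function). The function name `log_terms_sketch`, the labels A1 to A4 (numbered in the order the terms appear above), the use of `-inf` to stand for a term that is switched off, and the parameter values are assumptions made for illustration, not the lifetimes implementation.

```python
import math


def log_terms_sketch(x, t_x, T, r, alpha, a, b):
    """Hypothetical helper returning ln(A1), ln(A2), ln(A3), ln(A4) for one customer."""
    # ln(A1) = ln Gamma(r + x) + r * ln(alpha) - ln Gamma(r)
    ln_a1 = math.lgamma(r + x) + r * math.log(alpha) - math.lgamma(r)

    # ln(A2) = ln Gamma(a + b) + ln Gamma(b + x) - ln Gamma(b) - ln Gamma(a + b + x)
    ln_a2 = (
        math.lgamma(a + b)
        + math.lgamma(b + x)
        - math.lgamma(b)
        - math.lgamma(a + b + x)
    )

    # ln(A3) = -(r + x) * ln(alpha + T)
    ln_a3 = -(r + x) * math.log(alpha + T)

    # ln(A4) = ln(a) - ln(b + x - 1) - (r + x) * ln(alpha + t_x)
    # The fourth term only contributes when x > 0; -inf encodes "A4 = 0" otherwise.
    if x > 0:
        ln_a4 = math.log(a) - math.log(b + x - 1) - (r + x) * math.log(alpha + t_x)
    else:
        ln_a4 = -math.inf

    return ln_a1, ln_a2, ln_a3, ln_a4


# Illustrative, made-up values (not fitted parameters).
print(log_terms_sketch(x=2, t_x=30.0, T=39.0, r=0.25, alpha=4.5, a=0.8, b=2.5))
```

Computing the log-gamma values directly, instead of taking the log of a gamma value, is what keeps the terms finite for large `x`.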
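One transformation remains: the third and fourth terms are added together before the log is taken, so their logs cannot simply be summed. Writing $A_3$ and $A_4$ for those two terms, the sum is rewritten as $\ln(A_3 + \delta_{x>0} A_4) = \ln\left(e^{\ln A_3} + \delta_{x>0}\, e^{\ln A_4}\right)$. This last transformation is called logsumexp (LSE). Some libraries also call it softmax, although that name more commonly refers to the normalized exponential function. According to Wikipedia: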
The LSE function is often encountered when the usual arithmetic computations are performed in log-
domain or log-scale. Like multiplication operation in linear-scale becoming simple addition in log-scale;
an addition operation in linear-scale becomes the LSE in the log-domain.
A common purpose of using log-domain computations is to increase accuracy and avoid underflow and
overflow problems when very small or very large numbers are represented directly (i.e. in a linear domain)
using a limited-precision, floating point numbers.
In Scala this function is called softmax:
softmax: Return the softmax of the collection. log(exp(a(1)) + ... exp(a(n))). Also known as
logsumexp when used to sum up small probabilities in the log domain
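In Python, the same operation can be sketched with the standard library alone. The version below subtracts the largest input before exponentiating, which is the usual way to keep the computation stable. The function name `logsumexp_sketch` and the example values are assumptions made for illustration.

```python
import math


def logsumexp_sketch(log_values):
    """Hypothetical helper: log(exp(v1) + ... + exp(vn)) computed stably."""
    m = max(log_values)
    # If every input is -inf, the sum of exponentials is 0 and its log is -inf.
    if m == -math.inf:
        return -math.inf
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    return m + math.log(sum(math.exp(v - m) for v in log_values))


# Two very small probabilities stored as logs.
log_terms = [-800.0, -801.0]

# Naive route: both exponentials underflow to 0.0, so the log would be undefined.
naive_sum = math.exp(log_terms[0]) + math.exp(log_terms[1])

# Stable route: the result stays finite and close to -800.
stable = logsumexp_sketch(log_terms)
print(naive_sum, stable)
```

Applied to the cost function, this means the log of the sum of the last two terms can be obtained from their logs without ever forming the tiny values themselves.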