
To Treat or Not to Treat:
The Application of Analytics to Credit & Collections.

Andy Christian
© 2007 AT&T Knowledge Ventures. All rights reserved. AT&T
and the AT&T logo are trademarks of AT&T Knowledge Ventures.

Introduction
It can be argued that Collections should be viewed as
a business opportunity.
Why do this?
• Reduce net bad debt
• Keep AR balances suppressed
In the current economy, collections has become
necessary to keep some companies solvent.
Analytics can help companies optimize their
collections activities so that resources aren’t wasted.

Overview of Collections Process

Model Step 1 – Credit Model
Model Step 2 – Risk Behavior
Model Step 3 – Treatment Action

[Flow diagram: Application for service -> Decision (Model Step 1) -> No / Yes; if Yes -> Billing -> Risk Evaluation (Model Step 2) -> Action Decision (Model Step 3) -> Collections Activity]


Data Architecture

[Data architecture diagram: data feeds labeled Monthly and Daily]

Clustering


Inside the Decision Engine

• We begin with the clustering process. Clusters were developed with the help of SAS Enterprise Miner 5.2.
• The Cluster node can be found under the ‘Explore’ tab.
• When clustering, the higher the cubic clustering criterion (CCC), the better.

Inside the Decision Engine
• SAS Enterprise Miner 5.2 creates a decision tree and an “English”
translation so that you can code it into your system.


Consumer Cluster Algorithm


• Here is the whole tree.

Clustering Result
• After separating the population into clusters, we see evidence of 5 distinct groups by their corresponding bad rates. The average bad rate for the total population is 3.3%.
• Cluster 1 – Good population with a 2.0% bad rate.
• Cluster 2 – Bad population with an 8.6% bad rate.
• Cluster 3 – Excellent population with a low 0.4% bad rate.
• Cluster 4 – Very bad population with a high 12.1% bad rate.
• Cluster 5 – Good to average population with a 3% bad rate.

[Pie chart: share of total and bad rate by cluster]
Cluster    Share of Total  Bad Rate
Cluster 1  12%             2.0%
Cluster 2  16%             8.6%
Cluster 3  32%             0.4%
Cluster 4  9%              12.1%
Cluster 5  31%             3.11%


Risk Behavior Modeling

Risk Behavior Modeling
• Now that we have 5 separate clusters, our next step is to use
Enterprise Miner to create 5 distinct models, one for each cluster.


Model 1/3 Lift Charts


• In order to measure lift, we chart the cumulative percentage of “bads” captured across all deciles. The larger the area under the curve, the bigger the lift, and thus the better the performance of the model.
• The new model created for Cluster 1/3 clearly outperforms the current model: in validation it captures 51.8% of bads in the first decile, versus 40.7% for the existing model.
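
The lift comparison can be sketched in a few lines of Python. This is an illustration only, not the SAS Enterprise Miner output: the function names are hypothetical, the bins are assumed to be equal-sized deciles ordered from riskiest to safest, and the bad counts are copied from the Cluster 1/3 validation tables in this deck.

```python
def cumulative_gains(bads_per_decile):
    """Cumulative share of all bads captured after each decile (riskiest first)."""
    total = sum(bads_per_decile)
    captured = [0.0]          # the curve starts at the origin
    running = 0
    for bads in bads_per_decile:
        running += bads
        captured.append(running / total)
    return captured


def area_under_gains(captured):
    """Trapezoid-rule area under the gains curve, assuming equal-width bins."""
    width = 1.0 / (len(captured) - 1)
    return sum(width * (a + b) / 2 for a, b in zip(captured, captured[1:]))


# Interval bad counts per decile from the Cluster 1/3 validation tables.
new_model = [660, 179, 108, 87, 70, 35, 36, 33, 30, 35]
existing_model = [518, 120, 100, 112, 78, 95, 103, 75, 36, 36]
random_baseline = [1] * 10    # a model with no separation: equal bads per decile

for name, bads in [("new", new_model),
                   ("existing", existing_model),
                   ("random", random_baseline)]:
    area = area_under_gains(cumulative_gains(bads))
    print(f"{name:>8}: area under gains curve = {area:.3f}")
```

A model with no separating power follows the diagonal, so the further a curve's area rises above the diagonal's, the more bads the model concentrates in the early deciles.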

[Figure: Cluster 1/3 Lift Chart (Cumulative) - LIVSCORE. X-axis: Decile (0% to 100%); Y-axis: Cum % of Bads (0% to 100%); Series: Random, Step2 Dev, Existing]

Validation for Model 1/3
The KS statistic is the maximum difference, across deciles, between the cumulative percentage of bads and the cumulative percentage of goods.
The larger the statistic, the better the ability of the model to separate goods from bads.
The new model has a KS value of 46.31%, compared to 30.96% for the existing model. This difference of about 15.4 percentage points is a substantial improvement.

Cluster 1/3 Equalized New Model Validation: KS Statistic 46.31%

Decile  Score Range  Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     1-760        660           14,008         14,668     51.8%   9.6%    51.8%          9.6%            42.21%         4.50%
20%     760-790      179           14,490         29,337     14.1%   10.0%   65.9%          19.6%           46.31%         1.22%
30%     790-798      108           14,561         44,006     8.5%    10.0%   74.4%          29.6%           44.78%         0.74%
40%     798-807      87            14,582         58,675     6.8%    10.0%   81.2%          39.6%           41.59%         0.59%
50%     807-811      70            14,598         73,343     5.5%    10.0%   86.7%          49.7%           37.05%         0.48%
60%     811-811      35            14,634         88,012     2.7%    10.1%   89.5%          59.7%           29.73%         0.24%
70%     811-811      36            14,633         102,681    2.8%    10.1%   92.3%          69.8%           22.50%         0.25%
80%     811-811      33            14,636         117,350    2.6%    10.1%   94.9%          79.9%           15.02%         0.22%
90%     811-811      30            14,639         132,019    2.4%    10.1%   97.3%          89.9%           7.31%          0.20%
100%    811-811      35            14,634         146,688    2.7%    10.1%   100.0%         100.0%          0.00%          0.24%
Total                1,273         145,415                                                                                 0.87%

Slide callout: 33% improvement.

Cluster 1/3 Existing Model Validation: KS Statistic 30.96%

Decile  Score Range  Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     0-711        518           14,150         14,668     40.7%   9.7%    40.7%          9.7%            30.96%         3.53%
20%     711-738      120           14,549         29,337     9.4%    10.0%   50.1%          19.7%           30.38%         0.82%
30%     738-765      100           14,569         44,006     7.9%    10.0%   58.0%          29.8%           28.22%         0.68%
40%     765-797      112           14,557         58,675     8.8%    10.0%   66.8%          39.8%           27.01%         0.76%
50%     797-837      78            14,591         73,344     6.1%    10.0%   72.9%          49.8%           23.10%         0.53%
60%     837-880      95            14,573         88,012     7.5%    10.0%   80.4%          59.8%           20.54%         0.65%
70%     880-933      103           14,566         102,681    8.1%    10.0%   88.5%          69.8%           18.61%         0.70%
80%     933-965      75            14,594         117,350    5.9%    10.0%   94.3%          79.9%           14.47%         0.51%
90%     965-981      36            14,633         132,019    2.8%    10.1%   97.2%          89.9%           7.23%          0.25%
100%    981-998      36            14,633         146,688    2.8%    10.1%   100.0%         100.0%          0.00%          0.25%
Total                1,273         145,415                                                                                 0.87%
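
The K-S column in these tables can be reproduced from the interval bad and good counts alone. The sketch below is a minimal illustration with a hypothetical function name, not the code used to build the slides; it assumes the deciles are ordered from lowest score to highest, as in the tables.

```python
def ks_statistic(bads, goods):
    """Largest gap between cumulative % of bads and cumulative % of goods."""
    total_bad, total_good = sum(bads), sum(goods)
    cum_bad = cum_good = 0
    best_gap = 0.0
    for bad, good in zip(bads, goods):
        cum_bad += bad
        cum_good += good
        gap = abs(cum_bad / total_bad - cum_good / total_good)
        best_gap = max(best_gap, gap)
    return best_gap


# Interval counts per decile, copied from the Cluster 1/3 validation tables.
new_bads = [660, 179, 108, 87, 70, 35, 36, 33, 30, 35]
new_goods = [14008, 14490, 14561, 14582, 14598, 14634, 14633, 14636, 14639, 14634]
old_bads = [518, 120, 100, 112, 78, 95, 103, 75, 36, 36]
old_goods = [14150, 14549, 14569, 14557, 14591, 14573, 14566, 14594, 14633, 14633]

ks_new = ks_statistic(new_bads, new_goods)
ks_old = ks_statistic(old_bads, old_goods)
print(f"new model KS      = {ks_new:.2%}")
print(f"existing model KS = {ks_old:.2%}")
print(f"difference        = {(ks_new - ks_old) * 100:.1f} points")
```

Because the statistic is taken at decile boundaries, it is a coarse version of KS; computing it on individual scores could give a slightly higher value.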

Model 2 Lift Charts


• For Cluster 2, the new model outperforms the current model, capturing 44.5% of bads in the first decile versus 38.1%.

[Figure: Cluster 2 Lift Chart (Cumulative) - LIVSCORE. X-axis: Decile (0% to 100%); Y-axis: Cum % of Bads (0% to 100%); Series: Random, Step2 Dev, Existing]

Validation for Model 2
The new model has a KS value of 48.03%, compared to 45.37% for the existing model. This difference of about 2.7 percentage points is a modest improvement.
Cluster 2 Equalized New Model Validation: KS Statistic 48.03%

Decile  Score Range  Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     1-354        1,389         3,857          5,246      44.5%   7.8%    44.5%          7.8%            36.73%         26.48%
20%     354-497      602           4,644          10,492     19.3%   9.4%    63.9%          17.2%           46.63%         11.48%
30%     497-553      353           4,893          15,738     11.3%   9.9%    75.2%          27.1%           48.03%         6.73%
40%     553-587      229           5,017          20,984     7.3%    10.2%   82.5%          37.3%           45.21%         4.37%
50%     587-609      172           5,074          26,230     5.5%    10.3%   88.0%          47.6%           40.44%         3.28%
60%     609-624      127           5,120          31,477     4.1%    10.4%   92.1%          58.0%           34.14%         2.42%
70%     624-635      91            5,155          36,723     2.9%    10.4%   95.0%          68.4%           26.61%         1.73%
80%     635-643      64            5,182          41,969     2.1%    10.5%   97.1%          78.9%           18.16%         1.22%
90%     643-650      54            5,192          47,215     1.7%    10.5%   98.8%          89.4%           9.37%          1.03%
100%    650-669      37            5,210          52,462     1.2%    10.6%   100.0%         100.0%          0.00%          0.71%
Total                3,118         49,344                                                                                  5.94%

Slide callout: 6% improvement.

Cluster 2 Existing Model Validation: KS Statistic 45.37%

Decile  Score Range  Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     0-464        1,189         4,057          5,246      38.1%   8.2%    38.1%          8.2%            29.91%         22.66%
20%     465-612      714           4,532          10,492     22.9%   9.2%    61.0%          17.4%           43.63%         13.61%
30%     612-660      363           4,883          15,738     11.6%   9.9%    72.7%          27.3%           45.37%         6.92%
40%     660-691      236           5,010          20,984     7.6%    10.2%   80.2%          37.5%           42.79%         4.50%
50%     691-714      156           5,091          26,231     5.0%    10.3%   85.2%          47.8%           37.47%         2.97%
60%     714-734      105           5,141          31,477     3.4%    10.4%   88.6%          58.2%           30.42%         2.00%
70%     734-767      103           5,143          36,723     3.3%    10.4%   91.9%          68.6%           23.30%         1.96%
80%     767-814      110           5,136          41,969     3.5%    10.4%   95.4%          79.0%           16.42%         2.10%
90%     814-904      87            5,159          47,215     2.8%    10.5%   98.2%          89.5%           8.76%          1.66%
100%    904-999      55            5,192          52,462     1.8%    10.5%   100.0%         100.0%          0.00%          1.05%
Total                3,118         49,344                                                                                  5.94%


Multiple Models Standardization


Now that we have all of our models complete and are satisfied with
their performance, we need to align or equalize these models so that
scores from each model will have equivalent meaning.

[Figure: native score range of each cluster model. Every scale runs between 1 and 999, but the direction differs from model to model (some run 1 to 999, others 999 to 1).]

Cluster    Avg. Bad Rate
Cluster 1  2.1%
Cluster 2  8.6%
Cluster 3  0.4%
Cluster 4  12.1%
Cluster 5  3.3%

Desired Odds Alignment
• To do this, we arbitrarily decide upon a common odds table so that a given score carries the same odds in every model.
• We then use linear regression to fit the standard odds rate curve.
• This can be done with curve fitting software in a matter of seconds.

Score  Bad Rate  Odds Rate
900    0.33%     0.33%
800    0.65%     0.66%
700    1.30%     1.32%
600    2.56%     2.63%
500    5.00%     5.26%
400    9.52%     10.53%
300    17.39%    21.05%
200    29.63%    42.11%
100    45.71%    84.21%
1      62.75%    168.42%

The odds of going bad double for every hundred point decrease in the score.
We chose 5% at 500 because it was easy to remember.
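
The desired odds table follows from two choices stated on the slide: a 5% bad rate at a score of 500, and odds that double for every 100-point drop. The Python sketch below regenerates it under those assumptions; the function and parameter names are hypothetical, and odds are taken as bad rate / (1 - bad rate).

```python
def odds_at(score, anchor_score=500, anchor_bad_rate=0.05, points_to_double=100):
    """Odds of going bad at a score, doubling for each points_to_double decrease."""
    anchor_odds = anchor_bad_rate / (1 - anchor_bad_rate)
    return anchor_odds * 2 ** ((anchor_score - score) / points_to_double)


def bad_rate_at(score, **anchor):
    """Convert the odds at a score back into a bad rate (probability)."""
    odds = odds_at(score, **anchor)
    return odds / (1 + odds)


print("Score  Bad Rate  Odds Rate")
for score in (900, 800, 700, 600, 500, 400, 300, 200, 100):
    print(f"{score:<5}  {bad_rate_at(score):<8.2%}  {odds_at(score):.2%}")
```

The last row of the slide's table (score 1) appears to use five full doublings from the 500 anchor, which strictly corresponds to a score of 0, so the loop above stops at 100.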


Equalizing the Scores


Using the desired odds rates table from the previous slide as the standard, each individual model’s actual odds rates, and automated curve fitting software, we calculated score transformations and an offset for each model that will give each one a common meaning.

• Model 1/3
Eqlzd_score = 1057230 - 2.894e+09/(Score13) + 2.64252e+12/(Score13^2) - 8.0451e+14/(Score13^3)
• Model 2
Eqlzd_score = 134086.9 - 3.4164e+08/(Score2) + 2.91346e+11/(Score2^2) - 8.294e+13/(Score2^3)
• Model 4
Eqlzd_score = 14951.87 - 3.0577e+07/(Score4) + 2.12634e+10/(Score4^2) - 4.9522e+12/(Score4^3)
• Model 5
Eqlzd_score = 94912.92 - 2.3263e+08/(Score5) + 1.90808e+11/(Score5^2) - 5.2203e+13/(Score5^3)

Final Equalization Results

Score Range  Equalized Count  % of Total  Bads   Bad Rate
800-849      101,095          30.5%       313    0.3%
750-799      33,377           10.1%       341    1.0%
700-749      4,628            1.4%        119    2.6%
650-699      21,156           6.4%        198    0.9%
600-649      96,157           29.0%       1,283  1.3%
580-599      9,805            3.0%        327    3.3%
550-579      10,532           3.2%        463    4.4%
500-549      18,917           5.7%        922    4.9%
450-499      10,698           3.2%        952    8.9%
400-449      6,279            1.9%        779    12.4%
350-399      4,232            1.3%        664    15.7%
300-349      2,977            0.9%        522    17.5%
250-299      2,237            0.7%        435    19.4%
200-249      1,776            0.5%        432    24.3%
150-199      1,350            0.4%        365    27.0%
100-149      1,137            0.3%        329    28.9%
50-99        829              0.3%        257    31.0%
1-49         4,218            1.3%        1,568  37.2%

We are a bit off of where we want to be, but the results are satisfactory. The true 5% bad rate lies somewhere between 490 and 499.

Equalizing the Scores
• The left-side graph shows the various unaligned model scores compared to the bad rate. Notice how each has a different score value for the same bad rate.
• The right-side graph shows the aligned model scores compared to the bad rate after being fitted to the desired curve. Aligning the scores ensures that we treat people fairly and consistently.
[Figure: two panels, “Unaligned Score” (left) and “Aligned Score” (right). X-axis: score (0 to 1200); Y-axis: Bad Rate (0 to 1.8); Series in both panels: 2a, 1_5, 3a, 4a, 6a, 7a, Desired]


Validation for Combined Models


Now that all the models have been combined and all of the scores equalized, we can create our validation tables to measure the KS of the new models.
The combined new model has a KS value of 58.08%, compared to 47.04% for the existing model. This difference of 11 percentage points is a substantial improvement.

All Clusters Equalized New Model Validation: KS Statistic 58.08%

Decile  Avg Score    Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     318          6,127         27,012         33,139     59.7%   8.4%    59.7%          8.4%            51.25%         18.49%
20%     550          1,706         31,434         66,279     16.6%   9.8%    76.3%          18.2%           58.08%         5.15%
30%     614          881           32,259         99,419     8.6%    10.0%   84.9%          28.2%           56.61%         2.66%
40%     652          509           32,631         132,559    5.0%    10.2%   89.8%          38.4%           51.41%         1.54%
50%     677          359           32,781         165,699    3.5%    10.2%   93.3%          48.6%           44.70%         1.08%
60%     727          241           32,899         198,839    2.3%    10.2%   95.7%          58.9%           36.80%         0.73%
70%     739          163           32,977         231,979    1.6%    10.3%   97.2%          69.1%           28.12%         0.49%
80%     761          124           33,016         265,119    1.2%    10.3%   98.5%          79.4%           19.04%         0.37%
90%     811          82            33,058         298,259    0.8%    10.3%   99.3%          89.7%           9.55%          0.25%
100%    775          77            33,064         331,400    0.7%    10.3%   100.0%         100.0%          0.00%          0.23%
Total   663          10,269        321,131                                                                                 3.10%

Slide callout: 19% improvement.

All Clusters Existing Model Validation: KS Statistic 47.04%

Decile  Avg Score    Interval Bad  Interval Good  Cum Total  Bad %   Good %  Cum % of Bads  Cum % of Goods  K-S Statistic  Interval Bad Rate
10%     321          3,611         29,529         33,140     35.2%   9.2%    35.2%          9.2%            25.97%         10.90%
20%     611          3,124         30,016         66,280     30.4%   9.3%    65.6%          18.5%           47.04%         9.43%
30%     678          979           32,161         99,420     9.5%    10.0%   75.1%          28.6%           46.56%         2.95%
40%     706          551           32,589         132,560    5.4%    10.1%   80.5%          38.7%           41.78%         1.66%
50%     733          467           32,673         165,700    4.5%    10.2%   85.0%          48.9%           36.15%         1.41%
60%     770          468           32,672         198,840    4.6%    10.2%   89.6%          59.1%           30.54%         1.41%
70%     821          420           32,720         231,980    4.1%    10.2%   93.7%          69.2%           24.44%         1.27%
80%     884          318           32,822         265,120    3.1%    10.2%   96.8%          79.5%           17.31%         0.96%
90%     945          234           32,906         298,260    2.3%    10.2%   99.1%          89.7%           9.34%          0.71%
100%    979          97            33,043         331,400    0.9%    10.3%   100.0%         100.0%          0.00%          0.29%
Total   768          10,269        321,131                                                                                 3.10%

Combined Models Lift Charts
• Once we have satisfactory performance out of the individual models we can then
combine them together to see the overall lift.
• Since this comparison does not require the predicted score value, it is not yet
necessary to equalize the scores.
[Figure: Combined Lift Chart (Cumulative) - LIVSCORE. X-axis: Decile (0% to 100%); Y-axis: Cum % of Bads (0% to 100%); Series: Random, Combined, Existing]

Treatment Modeling

Treatment Behavioral Modeling
• We’ve modeled a customer’s risk level over the next six
months.
• Now we have to determine which of these customers are going to miss their payment this month.
• Then we determine which treatment action will be the most
advantageous for us if the customer does miss their payment.

[Timeline figure: the risk modeling horizon (next six months) compared with the treatment modeling horizon (this month)]


Treatment Action Modeling


• The next step is to determine which action will be the best one to take if the customer misses a payment.
• The outcome of these 7 models, one per action listed below, provides us with the probability of curing under each action.

Standard Denial Notice - Sent on the day after pay-by date to inform customers that if they don’t pay then
they risk having their phone turned off.

STD+2 – Delay sending a Denial Notice by 2 days. This allows slow-payers to pay before we send a notice.

STD+5 – Delay sending a Denial Notice by 5 days.

Soft/Hard – Send a Reminder Notice, but follow it up with a Denial Notice if no payment is received.

1 Call Attempt – Dial the customer once. We consider a phone call a harsher treatment action than a notice.

3 Call Attempts – Dial the customer up to 3 times.

5 Call Attempts – Dial the customer up to 5 times.

Treatment Optimization
• The next step will be to run the Treatment Model results through a series of
optimization equations that account for the cost of each treatment compared to
the balance size of the account.
• The treatment that produces the most return for our cost is the one that is
chosen.
Standard_Notice = ((Prob. of No Treatment * Total Bill) + ((1 - Prob. of No Treatment) * (Prob. of Curing * Total Bill)))
    - (Notice Cost + (Prob. of Inbound Call * Cost of Inbound Call))

Notice + 2 Days = ((Prob. of No Treatment * Total Bill) + ((1 - Prob. of No Treatment) * (Prob. of Curing * Total Bill)))
    - (((Total Curr Chgs / 15) * (1 - Prob. of Curing) * Charge-off Rate)
       + Notice Cost + (Prob. of Inbound Call * Cost of Inbound Call))

Call_3 = ((Prob. of No Treatment * Total Bill) + ((1 - Prob. of No Treatment) * (Prob. of Curing * Total Bill)))
    - (((Total Curr Chgs / 15) * (1 - Prob. of Contact) * (1 - Prob. of Curing) * Charge-off Rate)
       + (Prob. of Contact * Cost of Outbound Call)
       + (Prob. of Contact * Cost of Outbound Call) * (1 - Prob. of Contact)
       + (Prob. of Contact * Cost of Outbound Call) * (1 - Prob. of Contact)^2
       + (Notice Cost + (Prob. of Inbound Call * Cost of Inbound Call)))
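
The three equations above translate directly into code. The sketch below is an illustration under stated assumptions, not the production decision engine: the function names, the Costs container, and every numeric input are hypothetical, and a single curing probability is reused for all three treatments, whereas the treatment models would supply one per action.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    notice: float           # cost of sending one notice
    inbound_call: float     # cost of handling one inbound call
    outbound_call: float    # cost of one outbound call
    charge_off_rate: float  # share of delayed charges eventually written off


def expected_collected(total_bill, p_no_treatment, p_cure):
    """Expected payment: pays untreated, or needs treatment and then cures."""
    return p_no_treatment * total_bill + (1 - p_no_treatment) * p_cure * total_bill


def notice_overhead(p_inbound_call, costs):
    return costs.notice + p_inbound_call * costs.inbound_call


def standard_notice(total_bill, p_no_treatment, p_cure, p_inbound_call, costs):
    return (expected_collected(total_bill, p_no_treatment, p_cure)
            - notice_overhead(p_inbound_call, costs))


def notice_plus_2(total_bill, total_curr_chgs, p_no_treatment, p_cure,
                  p_inbound_call, costs):
    delay_loss = (total_curr_chgs / 15) * (1 - p_cure) * costs.charge_off_rate
    return (expected_collected(total_bill, p_no_treatment, p_cure)
            - (delay_loss + notice_overhead(p_inbound_call, costs)))


def call_3(total_bill, total_curr_chgs, p_no_treatment, p_cure, p_contact,
           p_inbound_call, costs):
    delay_loss = ((total_curr_chgs / 15) * (1 - p_contact) * (1 - p_cure)
                  * costs.charge_off_rate)
    # Three call terms, each scaled by the chance that earlier attempts failed.
    call_cost = sum(p_contact * costs.outbound_call * (1 - p_contact) ** attempt
                    for attempt in range(3))
    return (expected_collected(total_bill, p_no_treatment, p_cure)
            - (delay_loss + call_cost + notice_overhead(p_inbound_call, costs)))


# Made-up inputs for one account, for illustration only.
costs = Costs(notice=1.0, inbound_call=5.0, outbound_call=2.0, charge_off_rate=0.2)
values = {
    "Standard_Notice": standard_notice(200.0, 0.5, 0.6, 0.1, costs),
    "Notice + 2 Days": notice_plus_2(200.0, 150.0, 0.5, 0.6, 0.1, costs),
    "Call_3": call_3(200.0, 150.0, 0.5, 0.6, 0.5, 0.1, costs),
}
best_treatment = max(values, key=values.get)
print(values, best_treatment)
```

With identical curing probabilities the cheapest action always wins; the choice only becomes interesting once each treatment carries its own modeled probability of curing.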


Bottom Line Impact


The 19% lift in KS means that we identify more bads without
disturbing a large number of goods. Identifying bads earlier in the
collections process allows us to collect sooner as well as cut them off
sooner.

• Contacting and collecting on these additional bads (with some additional goods) costs roughly an additional $300,000 in annual operational expenses. This includes letters, phone calls, and labor.

• Optimizing the action taken replaces the higher cost phone call
with the lower cost letter and reduces inbound calls received. This
translates to a $2,400,000 annual decrease in operational
expenses.

• However, cutting the Bads off sooner also means that we save at
least one month of additional billing. This translates into a
$10,500,000 annual decrease in uncollectible expense.

Summary

AT&T Credit & Collections uses SAS® products to:

• Cluster the population into like groups.
• Determine 6-month risk behavior.
• Determine the probability of needing treatment this month.
• Determine which action is the most beneficial.
