
Storytelling with Data to Executives
JIM GRAYSON (AUGUSTA UNIVERSITY)
MIA STEPHENS (JMP – DIVISION OF SAS)
Consider This Scenario
A bank is struggling with the way it decides who is a good credit risk and asks for your help to develop a model.

Discovery  Summit  2016 2


You follow the Business Analytics Process
[Figure: Business Analytics Process cycle. Labels recovered from the diagram: Business Problem, Define the Problem, Modeling Approach, Prepare for Modeling, Modeling, Deploy Model, Monitor Performance. May loop back at any step.]
From Building Better Models with JMP Pro, Grayson, Gardner and Stephens, 2015.
Data Preparation
Key Activities:
• Determine which data are needed
• Compile (or collect new) data
• Explore, examine and understand data
• Assess data quality
• Clean and transform data
• Define features
• Reduce dimensionality
• Create training, validation and test sets

Key Tools:
• SQL/Query
• Data table structuring - join, concatenate, update, stack, summarize,…
• Summary statistics and graphical displays, interactive tools and filtering
• Multivariate procedures (clustering, PCA,…)
• Transformations, creating derived variables
• Missing data utilities, outlier analysis, recoding, binning
• Creating holdout set(s)
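As a rough illustration of the "create training, validation and test sets" activity, the sketch below splits applicants into three sets while preserving the good/bad class mix. The 60/20/20 fractions, the function name `stratified_split`, and the synthetic labels are assumptions for illustration only; the slides do not say how the authors partitioned the data in JMP.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, validation, test) index lists, stratified by label.

    The 60/20/20 fractions are an assumption, not taken from the slides.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, validation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_valid = round(fractions[1] * len(indices))
        train += indices[:n_train]
        validation += indices[n_train:n_train + n_valid]
        test += indices[n_train + n_valid:]
    return train, validation, test


# Synthetic labels with the same class mix as the German Credit data
# (700 good, 300 bad); real rows would come from the data table.
labels = ["good"] * 700 + ["bad"] * 300
train, validation, test = stratified_split(labels)
print(len(train), len(validation), len(test))
```

Stratifying by class keeps the 70/30 mix in every set, so validation statistics are not distorted by an unlucky draw.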
Modeling
Key Activities:
• Choose the appropriate modeling method or methods
• Fit one or more models
• Evaluate the performance of each model using validation statistics (misclassification, RMSE, RSquare)
• Choose the best model or set of models to address the analytics problem (and ultimately the business problem)

Key Tools:
• Multiple Regression
• Logistic Regression
• Naïve Bayes
• kNN
• Classification and Regression Trees
• Bootstrap Forests and Boosted Trees
• Neural Networks
• Generalized Linear Models
• Survival Models
• Forecasting/Time Series
• Create ensemble models
• Model Comparison
• Text Mining
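Two of the validation statistics named above can be sketched in a few lines. The data here are made up, the 0.5 cutoff is an assumption, and the RMSE shown is one common definition for a binary response (actual coded 0/1 against the predicted probability); it is not claimed to be exactly how JMP computes its report.

```python
import math


def misclassification_rate(actual, predicted):
    """Share of rows where the predicted class differs from the actual class."""
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted_prob):
    """Root mean squared difference between 0/1 outcomes and probabilities."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted_prob)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical validation rows: 1 = good risk, 0 = not good risk.
actual = [1, 1, 0, 1, 0]
prob_good = [0.9, 0.6, 0.4, 0.3, 0.2]

# Assumed cutoff of 0.5 for turning probabilities into classes.
predicted = [1 if p >= 0.5 else 0 for p in prob_good]

print(misclassification_rate(actual, predicted))
print(rmse(actual, prob_good))
```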
The Data
• German Credit data set available at https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)

• Contains observations on 30 variables for 1,000 past applicants.

• Each applicant is rated as either “good credit” (700 cases) or “bad credit” (300 cases).

JMP
Presentation of Results
You have developed a model for identifying good credit risk applicants.

You present your modeling results to the executive team.

You Present This Information…
Measures of Fit for RESPONSE

Creator                                      Entropy  Generalized  Mean    RMSE    Mean     Misclassification  N    Average  AUC
                                             RSquare  RSquare      -Log p          Abs Dev  Rate                    Profit
Fit Ordinal Logistic                         0.1716   0.2681       0.5061  0.4081  0.3137   0.2550             200  0.0857   0.7929
Partition                                    0.1002   0.1633       0.5497  0.4322  0.3560   0.3150             200  0.0747   0.7090
Bootstrap Forest                             0.2242   0.3397       0.4739  0.3953  0.3377   0.2300             200  0.1118   0.8207
Boosted Tree                                 0.1999   0.3072       0.4888  0.4030  0.3448   0.2550             200  0.1135   0.8058
Neural                                       0.2417   0.3625       0.4632  0.3912  0.2974   0.2250             200  0.1153   0.8276
Fit Generalized Two Stage Forward Selection  0.3543   0.4982       0.3944  0.3555  0.2594   0.1750             200  0.1283   0.8750
Fit Generalized Two Stage Forward Selection  0.3378   0.4794       0.4045  0.3620  0.2729   0.2050             200  0.1315   0.8658
Fit Generalized Double Lasso                 0.3760   0.5222       0.3812  0.3543  0.2599   0.1800             200  0.12     0.8823

The best model, from a profit perspective, is a Two Stage Forward Selection model, with an average profit of 0.1315.
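The choice of "best" model depends on the criterion. The sketch below ranks the models from the Measures of Fit table by three of its columns; the `ModelFit` type and variable names are assumptions for illustration, while the numbers are copied from the table.

```python
from typing import NamedTuple


class ModelFit(NamedTuple):
    creator: str
    misclassification_rate: float
    average_profit: float
    auc: float


# Values copied from the Measures of Fit table (validation N = 200).
fits = [
    ModelFit("Fit Ordinal Logistic", 0.2550, 0.0857, 0.7929),
    ModelFit("Partition", 0.3150, 0.0747, 0.7090),
    ModelFit("Bootstrap Forest", 0.2300, 0.1118, 0.8207),
    ModelFit("Boosted Tree", 0.2550, 0.1135, 0.8058),
    ModelFit("Neural", 0.2250, 0.1153, 0.8276),
    ModelFit("Fit Generalized Two Stage Forward Selection", 0.1750, 0.1283, 0.8750),
    ModelFit("Fit Generalized Two Stage Forward Selection", 0.2050, 0.1315, 0.8658),
    ModelFit("Fit Generalized Double Lasso", 0.1800, 0.12, 0.8823),
]

best_by_profit = max(fits, key=lambda fit: fit.average_profit)
best_by_auc = max(fits, key=lambda fit: fit.auc)
best_by_error = min(fits, key=lambda fit: fit.misclassification_rate)

for label, fit in [("profit", best_by_profit), ("AUC", best_by_auc),
                   ("misclassification", best_by_error)]:
    print(f"Best by {label}: {fit.creator} "
          f"(profit {fit.average_profit}, AUC {fit.auc})")
```

Each criterion picks a different row, which is why the business objective (profit) has to be fixed before the comparison is read.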

And This Information…
Predictor: Fit Generalized Two Stage Forward Selection

Predicted Count
Actual RESPONSE   Good Risk   Not Good Risk
Good Risk         124         16
Not Good Risk     25          35

Decision Count
Actual RESPONSE   Good Risk   Not Good Risk
Good Risk         98          42
Not Good Risk     8           52

Predicted Rate
Actual RESPONSE   Good Risk   Not Good Risk
Good Risk         0.886       0.114
Not Good Risk     0.417       0.583

Decision Rate
Actual RESPONSE   Good Risk   Not Good Risk
Good Risk         0.700       0.300
Not Good Risk     0.133       0.867

Misclassification Rate: 0.2500
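The rate tables and the misclassification rate follow directly from the count tables. A minimal sketch, assuming rows are the actual class and columns the predicted (or decided) class, in the order Good Risk, Not Good Risk; the function names are illustrative.

```python
def row_rates(counts):
    """Divide each cell by its row total (rows = actual class)."""
    return [[cell / sum(row) for cell in row] for row in counts]


def misclassification_rate(counts):
    """Off-diagonal cells divided by the total count."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[i][i] for i in range(len(counts)))
    return (total - correct) / total


# Counts copied from the slide.
predicted_counts = [[124, 16], [25, 35]]
decision_counts = [[98, 42], [8, 52]]

for name, counts in [("Predicted", predicted_counts), ("Decision", decision_counts)]:
    rates = [[round(rate, 3) for rate in row] for row in row_rates(counts)]
    print(name, rates, misclassification_rate(counts))
```

Computed this way, the decision counts give 50 / 200 = 0.25, the rate shown on the slide, while the predicted counts give 41 / 200 = 0.205, which agrees with the 0.2050 listed in the Measures of Fit table for the model with average profit 0.1315.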

And This Information…
ROC Curve for RESPONSE=Good Risk
[Figure: ROC curves, Sensitivity versus 1-Specificity, both axes from 0.00 to 1.00]

Predictor                                 AUC
Prob[Good Risk]                           0.7929
Prob(RESPONSE==Good Risk)                 0.7090
Prob(RESPONSE==Good Risk)_1               0.8207
Prob(RESPONSE==Good Risk)_2               0.8058
Probability( RESPONSE=Good Risk )         0.8276
Probability( RESPONSE=Good Risk )_1       0.8750
Probability( RESPONSE=Good Risk )_2       0.8658
Probability( RESPONSE=Good Risk )_3       0.8823

What’s The Problem?
• We are proud of our technical work: we want to show our skills and worth to the organization, and we don’t want to “over-sell”
• We use our technical results, which are not understandable to a non-technical audience, to provide full disclosure and understanding
• A non-technical audience cannot bridge the gap between this “technical jargon” and their problem; it seems irrelevant to what they really want to know: THE ANSWER

Recommendations
• Best practices for storytelling to executives

• Example presentation for executives

SENIOR ANALYST (WISE OLD OWLS)
Advice on communicating analytic results to senior executives from Jeff Kline, “Owl speaks lion”, ORMS Today, August 2016

• Have a five-minute version and a two-minute version
• Clearly answer: What? So What? What now?
• Limit your presentation slides: Save brilliance for back-up slides
• Admit ignorance when you don’t know
• Be prepared to talk without slides
• Send your presentation ahead
• Practice and murder board before briefing (with a parliament of owls)
SENIOR EXECUTIVES (OLD LIONS)
Advice on communicating analytic results to senior executives from Jeff Kline, “Owl speaks lion”, ORMS Today, August 2016

• If I have only five minutes, so do you
• Don’t put the executive back in math class
• It is not necessary to share with me everything you have learned in reaching this point in your life
• Don’t raise an issue unless you also provide recommendations
• Give me the main points early
• If you can’t answer the question, say so and get back to me. Anything else is a waste of time
• More pictures, fewer words
Best Practices
Nancy Duarte, “How to Present to Senior Executives” [HBR]

• Summarize up front (high level findings, conclusions, recommendations, call to action)
• Set expectations (summary and discussion)
• Create summary slides (10% rule; rest in appendix)
• Give them what they asked for (answer specific request directly)
• Rehearse (run slides by honest coach)

Best Practices
Lisa Morgan, “Data Storytelling: What It Is, Why It Matters” [IW]

• General Storytelling Rules Apply (beginning, middle, end)
• Consider the Audience (don’t use a one-size-fits-all presentation)
• Collaborate (interdisciplinary activity)
• Avoid Distractions (address a specific goal; iceberg rule)

Charts: Two Questions -> Four Types

               CONCEPTUAL          DATA-DRIVEN
DECLARATIVE    Idea illustration   Everyday dataviz
EXPLORATORY    Idea generation     Visual discovery

Adapted from Good Charts by Scott Berinato, p. 76.

Two Questions -> Four Types
[Figure: the same grid, CONCEPTUAL to DATA-DRIVEN by DECLARATIVE to EXPLORATORY, annotated with the points below]
• Know the audience
• Keep it simple
• Make idea, not design, pop
Adapted from Good Charts by Scott Berinato, p. 76.

Sample Presentation
5-10 MINUTE PRESENTATION TO SENIOR EXECUTIVES
German Credit Modeling
AUGUSTA ANALYTICS
Complete Report
§ Executive Summary
§ Appendix - Modeling Methodology and Key Results

Executive Summary
§ Objectives
§ Current State
§ Future State
§ Summary

Objectives
Business Objective:
Improve net profits of loans by better identifying “good” customers.

Modeling Objective:
Develop a classification model to predict if an applicant is a good or bad credit risk.

Data Resources
Financial Resources
• Checking account balance
• Savings account balance
• Credit history
• Credit duration (months)
• Credit amount
• Installment rate as % disposable income
• Owns real estate
• Owns no property

Credit Purpose
• New car
• Used car
• Furniture
• Radio / TV
• Education
• Retraining

Credit Information
• Co-applicant
• Guarantor
• Number existing credits

Demographics
• Employment duration
• Age
• Rents
• Owns residence
• Job category
• Number of dependents
• Years at present residence
• Telephone in name

Current State
Average loan ~ $20,000

Current Unit Gain ~ ($0.055)

Current Revenue Per Loan ~ ($1,100)

70% Good Risks, 30% Bad Risks

Developed Model
Developed model to maximize profits:
• Maximize Revenues
• Minimize Losses

                        Prediction
Reality         Bad Credit    Good Credit
Good Credit                   TRUE   +0.35
Bad Credit                    FALSE  -1.0

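The unit gains quoted in this presentation can be reproduced from the profit matrix. The sketch below is a check under stated assumptions: applicants predicted to be bad credit are declined and contribute 0 (the slide shows values only for granted loans), the current state is taken to mean that every applicant is accepted, and the model's accepted counts (98 good, 8 bad, N = 200) are read from the Decision Count table. Function and constant names are illustrative.

```python
# Profit per unit of credit, as read from the slide's profit matrix.
# Assumption: declined applicants contribute 0 profit.
PROFIT_GOOD_ACCEPTED = 0.35
LOSS_BAD_ACCEPTED = -1.0


def average_unit_gain(good_accepted, bad_accepted, n_applicants):
    """Average profit per applicant for a given set of accepted loans."""
    total = (good_accepted * PROFIT_GOOD_ACCEPTED
             + bad_accepted * LOSS_BAD_ACCEPTED)
    return total / n_applicants


# Assumed current state: all 1,000 applicants accepted (700 good, 300 bad).
current_gain = average_unit_gain(good_accepted=700, bad_accepted=300,
                                 n_applicants=1000)

# Model: accepted counts from the Decision Count table (validation N = 200).
model_gain = average_unit_gain(good_accepted=98, bad_accepted=8,
                               n_applicants=200)

print(round(current_gain, 4), round(model_gain, 4))
```

Under these assumptions the results agree with the -0.055 current unit gain and the 0.1315 average profit reported for the chosen model.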


Future State
Average loan ~ $20,000

Predicted Average Unit Gain ~ $0.1315

Predicted Average Revenue Per Loan ~ $2,630

Final Results

                                              Current State   Classification Model
Unit Gain (Loss)                              -$0.055         $0.1315
Revenue (Loss) Per 1,000 Customers            -$1,100,000     $2,630,000
Net Revenue Improvement Per 1,000 Customers                   $3,730,000
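The dollar figures in this table are the unit gains scaled by the average loan and the number of customers. A small sketch of that arithmetic, with illustrative function and variable names; the $20,000 average loan and the two unit gains are taken from the slides.

```python
AVERAGE_LOAN = 20_000  # dollars, from the Current State slide


def revenue(unit_gain, n_customers=1_000, average_loan=AVERAGE_LOAN):
    """Revenue (or loss) in dollars for n_customers at a given unit gain."""
    return unit_gain * average_loan * n_customers


current_revenue = revenue(-0.055)   # current state
model_revenue = revenue(0.1315)     # classification model
improvement = model_revenue - current_revenue

print(f"{current_revenue:,.0f}  {model_revenue:,.0f}  {improvement:,.0f}")
```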

Summary
Current State:
Average loss per loan ~ ($1,100)

Modeling Results:
Developed a classification model to maximize net profits
Estimated average gain per loan ~ $2,630

Key Drivers:
Co-Applicant for Loan, Owns Residence, Rents, Number of Existing Credits, and Interactions Between Many Factors

Appendix to Executive Summary
• JMP 13 Web Reports
• JMP 13 Dashboards

Summary: Key Points
§ Summarize up front
§ Don’t put the executive back in math class
§ It is not necessary to share with the executive everything you have learned in reaching this point in your life
§ Give them what they asked for

Resource List
1. “How to Present to Senior Executives” by Nancy Duarte, HBR (Communications), October 4, 2012.
2. “Create a Presentation Your Audience Will Care About” by Nancy Duarte, HBR (Communications), October 10, 2012.
3. “Do Your Slides Pass the Glance Test?” by Nancy Duarte, HBR (Communications), October 22, 2012.
4. “Structure Your Presentation Like a Story” by Nancy Duarte, HBR (Communications), October 31, 2012.
5. “Data Storytelling: What It Is, Why It Matters” by Lisa Morgan, InformationWeek (Commentary), May 30, 2016. [http://www.informationweek.com/big-data/big-data-analytics/data-storytelling-what-it-is-why-it-matters/a/d-id/1325544 | last accessed June 30, 2016]
6. Good Charts by Scott Berinato, Harvard Business School Publishing, 2016.
7. “Owl speaks lion” by Jeff Kline, ORMS Today, August 2016.
