Organizations must take a more proactive approach to running their businesses. Predictive analytics enables the discovery of patterns and trends in historical data. Organizations using predictive analytics solutions generate an average return on investment of 145 percent.
Avoiding Worst Practices in Predictive Analytics
Maximize ROI by Avoiding Common Pitfalls
A White Paper

Table of Contents
Executive Summary
Identifying Common Worst Practices
  Failing to Focus on a Specific Business Initiative
  Ignoring Critical Steps
  Spending Too Much Time on Model Evaluation
  Investing Heavily in Analytic Tools With Little or No Return
  Failing to Operationalize
Avoiding Worst Practices
  Driving ROI
  Focusing on Bottom-Line Initiatives
  Preparing Data
  Evaluate the Model, Without Over-Evaluating
  Deploying the Results
Keys to Successful Predictive Analytics Deployment
  Understanding the Business Need
  Understanding the Data
  Preparing the Data
  Modeling
  Evaluation
  Deployment
WebFOCUS RStat: Cutting-Edge Predictive Modeling
Conclusion
Executive Summary
Reactive decision-making, while successful in the past, has proven ineffective in recent times. Organizations can no longer wait to make critical choices after an opportunity arises or a problem is uncovered. They must take a more proactive approach to running their businesses by anticipating important changes, events, and trends, and taking action before they occur. That's where predictive analytics comes in.

Unlike traditional reporting and analysis techniques, which provide a rear-view perspective of what has happened in the past, predictive analytics enables the discovery of patterns and trends in historical data to determine what will likely occur in the future. This eliminates the need for decision-makers to rely solely on intuition, giving them valuable, forward-looking insight that improves the effectiveness of plans, strategies, and decisions. In his blog, Forrester analyst James Kobielus claims that predictive analytics "is not just about forecasting what's coming down the pike. It's also about keeping the bad alternative futures from happening. If you can see the nasty things that might happen far enough in advance, you have a better chance of neutralizing or squelching them entirely." [1] According to research from IDC, the benefits are even more straightforward: organizations using predictive analytics solutions generate an average return on investment of 145 percent. [2] Unfortunately, many companies don't implement it correctly and fail to achieve these desired results.

In this white paper, we will investigate worst practices in predictive analytics. We'll discuss why these actions can derail predictive analytics initiatives, and what steps can be taken to avoid making such mistakes. We'll also highlight the key steps required for building and deploying effective predictive applications, and showcase WebFOCUS RStat, today's most powerful and full-featured solution for predictive analytics.
[1] Kobielus, James. "Interdictive Analytics: Catching Baddies at the Pass and in The Nick of Time," Forrester, July 2010.
[2] "The Financial Impact of Business Analytics: Key Findings," IDC, January 2003.

Identifying Common Worst Practices
As beneficial as predictive analytics can be to an organization, implementation and deployment projects often fall apart or fail to get underway due to common poor practices, procedures, and decisions, such as:
- Failing to focus on a specific business initiative that predictive analytics can enhance
- Ignoring crucial steps, such as data preparation and access, or deployment of results
- Spending too much time evaluating models
- Investing in tools that yield little or no returns
- Failing to operationalize findings

Failing to Focus on a Specific Business Initiative
The first step in any successful predictive analytics endeavor is to determine what business questions will be answered by the results. This enables organizations to more readily define project objectives and requirements in a way that satisfies the need driving the initiative. Predictive analytics is most effective when it is used to identify expected cases. For example, customers are scored for risk of churn, to predict who is most likely to defect to a competitor. Or they are scored to determine who is most likely to respond to a certain type of campaign or promotion. The expected behavior is known, but determining who is most likely to engage in a particular behavior requires predictive analytics to identify specific patterns. Though this benefit is substantial, most organizations are also trying to discover something critical that they don't already know. Many fail in this endeavor because they begin building their predictive applications with somewhat loose goals in mind. They try various models, or alter the underlying business questions over and over again.
This drains project resources and forces developers into a never-ending cycle of definition, evaluation, and fine-tuning. It can also prevent the organization from reaching its ultimate objective: the deployment of a predictive application for end users. The best approach, when decisions need to be made with little or no pre-existing knowledge, is to apply insight from patterns existing in the data to these new cases.

Ignoring Critical Steps
When deploying predictive analytics, many companies overlook important steps in the process. One of the most frequently ignored is data preparation and access. In reality, this should be the activity to which the most effort is devoted. In fact, data preparation typically accounts for approximately 60 to 80 percent of the cost of a predictive modeling initiative. Raw information must be gathered from various sources across the enterprise, and compiled in a final data set that is fed to the predictive model. This requires more than just pulling data from back-end systems and moving it into a centralized location, such as a data mart or data warehouse. Many companies fail to properly select, cleanse, and enhance data to make it truly analytics-ready. Others are totally unaware of how complete or accurate their information is; they think it is clean, but in reality, it is not.
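A quick profiling pass is one way to test the "we think it is clean" assumption before modeling begins. The following is a minimal sketch; the field names, validity rules, and sample records are hypothetical, standing in for whatever a real data set would contain.

```python
# Hypothetical validity rules per field; a real project would derive these
# from business definitions and source-system documentation.
RULES = {
    "age":     lambda v: isinstance(v, (int, float)) and 0 < v < 120,
    "revenue": lambda v: isinstance(v, (int, float)) and v >= 0,
    "region":  lambda v: v in {"NA", "EMEA", "APAC"},
}

def profile(records):
    """Count missing and invalid values per rule-checked field."""
    report = {f: {"missing": 0, "invalid": 0} for f in RULES}
    for rec in records:
        for field, is_valid in RULES.items():
            value = rec.get(field)
            if value is None:
                report[field]["missing"] += 1
            elif not is_valid(value):
                report[field]["invalid"] += 1
    return report

# Illustrative records with deliberately planted quality problems.
customers = [
    {"age": 34, "revenue": 1200.0, "region": "NA"},
    {"age": -1, "revenue": 80.0,   "region": "EMEA"},   # invalid age
    {"age": 52, "revenue": None,   "region": "LATAM"},  # missing revenue, invalid region
]
print(profile(customers))
```

Even a report this simple makes data quality a measured fact rather than an assumption, and tells the team where cleansing effort should go.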
Because the information will be dispersed throughout an organization, the proper tables, records, and attributes must be selected. Invalid or erroneous records must be located and corrected, and any missing data must be filled in. Without the proper knowledge and tools (for example, a comprehensive business intelligence (BI) platform that can profile, transform, or fill in information), data preparation will serve as nothing more than a stumbling block that creates significant delays. What happens when the information used in predictive modeling lacks integrity? The principle of "garbage in, garbage out" certainly applies here. If the information used is poor, the accuracy of the results will be as well.

Many companies also fail to share the results of their efforts on a wide scale. In this case, the insight provided by predictive analytics cannot deliver tangible business value to the very people who can use it, including executives and managers, frontline workers, and external stakeholders such as partners and suppliers. Further, results must not only be distributed to the right people, they must be delivered in a way that is easy for end users to understand, interpret, and act upon. Keeping the predictive analytic results in the back office is a sure way to garner disappointing results.

Spending Too Much Time on Model Evaluation
Predictive models must be evaluated to determine how accurately they predict patterns. First, they must be measured from a data perspective to ensure that all needed information is available and properly structured before the models are applied. Then they must be assessed from a business perspective to ensure they will meet end-user expectations and requirements. Accuracy comes at a cost, and companies must decide in advance how precise they need their models to be. Is 70 percent good enough? Or do results need to be at least 90 percent correct? Companies often tend to over-evaluate.
They add new variables to the models to increase their accuracy, which often requires rebuilding. They test and retest the models, spending tremendous amounts of time making continuous refinements because they are not quite perfect. This delays deployment, and prevents the organization from recognizing the substantial advantages that predictive analytics can offer. There is a tradeoff to be made between time to market, usefulness, and accuracy. Companies must sacrifice some precision in order to accelerate deployment. Or they must halt implementation and rollout and delay the realization of benefits to achieve higher levels of accuracy. The truth is, if a model is better than the current approach to forward-looking decision-making (and it likely is), then it should be considered ready for deployment. No model will ever be perfect, because shifting business strategies and evolving end-user needs require continuous modifications. The idea that models cannot be deployed until they are just right is just wrong, and companies risk never deploying them at all.
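The "better than the current approach" test can be made concrete with a holdout comparison. This sketch is purely illustrative: the tiny holdout set, the gut-feel baseline, and the rule-based candidate model are all invented stand-ins for whatever a real project would use.

```python
# Hypothetical holdout records: (tenure_months, num_complaints, churned?)
holdout = [
    (3, 4, True), (30, 0, False), (5, 2, True), (24, 1, False),
    (2, 5, True), (40, 0, False), (6, 3, True), (18, 0, False),
]

def current_approach(tenure, complaints):
    # Today's implicit assumption: nobody churns, so no action is taken.
    return False

def candidate_model(tenure, complaints):
    # Candidate rule: short tenure plus repeated complaints signals churn.
    return tenure < 12 and complaints >= 2

def accuracy(predict):
    """Fraction of holdout records the predictor gets right."""
    hits = sum(predict(t, c) == churned for t, c, churned in holdout)
    return hits / len(holdout)

baseline_acc = accuracy(current_approach)
model_acc = accuracy(candidate_model)
# The deployment criterion from the text: better than today's approach,
# not perfect. Refinement can continue after rollout.
ready = model_acc > baseline_acc
print(baseline_acc, model_acc, ready)
```

The point of the sketch is the stopping rule, not the model: once the candidate clearly beats the current approach on held-out data, further polishing delays the benefit without changing the deployment decision.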
How will a company know when its model is ready? If high-quality information is used, the model's accuracy is satisfactory from a business point of view, and it is properly designed to answer specific business questions, then the interpretability of the results should be the key criterion for determining if it is ready for prime time. It is important to remember that accuracy is key. Even if all other criteria are met, the model cannot be deployed if it does not meet accuracy standards. In addition, an estimate of the model's ROI can be determined, and when that ROI is at the proper level, the model can be deployed.

Investing Heavily in Analytic Tools With Little or No Return
There are several common mistakes made when it comes to investing in predictive analytics tools. Companies often buy expensive, complex analytic software that is far too sophisticated for their needs. These solutions not only come with very high price tags, but they are also typically hard to deploy and difficult to use by anyone other than statisticians and experienced analysts. As a result, they likely contain features and functions that will never be used. All of these factors will significantly diminish return on investment (ROI). Buyers should also determine if they are buying a package for research, or for deployment purposes. Solutions that will support a targeted research project will require only a single user license for the analyst responsible. Deployment, on the other hand, implies scale and will require an enterprise-level software solution. Many companies don't make this distinction, and end up either under- or over-buying. Other organizations try to build their own software, relying on internal programmers to create a predictive analytics application. Or they purchase a syntax-based solution that requires extensive amounts of manual coding. These solutions drain IT resources, and may not include all the necessary capabilities.
Users may also experience complexity in deploying results, rendering the solution totally ineffective. Finally, when it comes to the computing environment, organizations typically need two systems: one for predictive analytics, and a reporting system to deliver results. This creates additional and unnecessary hardware, support, and maintenance costs. A simpler and more cost-effective approach is to combine these into a single server environment.

Failing to Operationalize
For predictive analytics to succeed, it must be embedded into applications that are leveraged whenever users need to make decisions. If an application is not built and deployed, the effort devoted to creating a model will do nothing to enhance forward-looking decision-making. The results will remain in a document that few people will refer to in support of their daily activities. However, when a model is incorporated into a dashboard or reporting environment, the results will be readily accessible to end users, whenever they need them. This will help to create an analytics-driven culture across the entire business.
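Operationalizing, at its simplest, means persisting a fitted model so that any report or dashboard job can reload it and score current data on demand. The sketch below assumes a plain logistic scoring function; the coefficients and field names are hypothetical, not taken from any real product or model.

```python
import json
import math

# Hypothetical fitted coefficients, as they might be exported after training.
MODEL = {"intercept": -2.0, "weights": {"complaints": 0.9, "tenure": -0.05}}

def churn_score(model, record):
    """Logistic score in [0, 1] from stored coefficients."""
    z = model["intercept"] + sum(w * record[f] for f, w in model["weights"].items())
    return 1.0 / (1.0 + math.exp(-z))

# "Deploy": persist the model once (here as JSON; a real system might use
# a model repository or scoring service instead)...
saved = json.dumps(MODEL)

# ...then any downstream report job reloads it and batch-scores current data,
# so the results reach end users instead of sitting in a back-office document.
model = json.loads(saved)
batch = [{"id": 101, "complaints": 4, "tenure": 3},
         {"id": 102, "complaints": 0, "tenure": 36}]
scored = [{"id": r["id"], "churn_risk": round(churn_score(model, r), 3)}
          for r in batch]
print(scored)
```

The separation matters: modeling happens once, in the back office, while scoring runs repeatedly inside the application end users already consult.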
Avoiding Worst Practices
The worst practices we have highlighted don't have to derail a predictive analytics initiative. In fact, they can all be easily avoided by:

Driving ROI
When planning a predictive application, companies must consider total cost of ownership and anticipated return, to ensure that maximum value is achieved.

Focusing on Bottom-Line Initiatives
Create models that will provide forward-looking intelligence to help solve specific problems (e.g., minimizing customer churn by uncovering the factors that contribute to it) or help to achieve certain goals (e.g., increasing up-sell and cross-sell revenue by understanding what new products customers are most likely to buy).

Preparing Data
Guarantee the most accurate possible results by ensuring that disparate data is easily and properly accessed and cleansed before the models are created and applied.

Evaluate the Model, Without Over-Evaluating
The model must be tested to ensure that it provides better decision-making capabilities than current analysis methods. But over-evaluation can delay deployment and hinder ROI. It simply needs to be assessed until it is determined that it will provide value. At that point, it can be implemented. The statistical properties of the finished model are secondary to the value it brings to the business.

Deploying the Results
The insight provided by predictive analysis efforts must be shared with key stakeholders across and beyond the organization. For example, a bank that has predicted which customers are most likely to churn should disseminate that information to all those who interact with those clients, including call center staff and branch personnel. That way, everyone can contribute to correcting the problem and ensure that countermeasures are being implemented.
Keys to Successful Predictive Analytics Deployment
Now that we've discussed the wrong approach to predictive analytics, let's look at some of the critical steps that must be taken to ensure its success.

Understanding the Business Need
As mentioned earlier, it is crucial for companies to identify the drivers behind the predictive analytics project in the early planning stages. Once an organization defines what new information it is trying to uncover, what new facts it wants to learn, or what business initiatives need to be enhanced, it can build models and deploy results accordingly.

Understanding the Data
A thorough collection and exploration of the data should be performed. This enables those who are building the application to become familiar with the information at hand, so they can identify quality issues, glean initial insight, or detect relevant subsets that can be used to form hypotheses about hidden information. This also ensures that the available data will address the business objective.

Preparing the Data
To get data ready, IT organizations must select tables, records, and attributes from various sources across the business. Data must be transformed, merged, aggregated, derived, sampled, and weighted. It is then cleansed and enhanced to optimize results. These steps may need to be performed multiple times in order to make data truly ready for the modeling tool.

Modeling
Once information has been prepared, various modeling techniques should be selected and applied, and their parameters calibrated to optimal values. The choice of modeling technique is determined by the underlying data characteristics, or by the desired form of the model for scoring. In other words, some techniques may explain the underlying patterns in data better than others, and therefore the outcomes of various modeling methods must be compared.
A decision tree, for example, would be used if it were deemed important to have a set of rules as the scoring model, since rules are very easy to interpret. Several techniques can be applied to the same scenario to produce results from multiple perspectives.

Evaluation
Thorough assessments should be conducted from two unique perspectives: a technical/data approach, often performed by statisticians, and a business approach, which gathers feedback from the business issue owners and end users. This often leads to changes in the model; but while the technical/data evaluation is important, it should not be so stringent that it significantly delays implementation and use of the model. The model's business value should be the primary test.

Deployment
Deployment, the final step, can mean one of two things: the generation of a single report for analysis, or the implementation of a repeatable data mining or scoring application. The goal here is to create a reusable application that can be used to generate predictions for large volumes of current data. The results are then distributed to front-line workers, in a format they are comfortable with (reports, dashboards, maps, or graphics), to enable proactive decision-making.
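The appeal of a decision tree as a scoring model is that it reduces to plain if/then rules that business owners can read and challenge during evaluation. The splits below are hypothetical, standing in for what a tree learner would actually produce from churn data.

```python
def tree_score(record):
    """Score a customer record with hand-readable, tree-style rules.
    The thresholds are illustrative, not learned from real data."""
    # Rule 1: long-tenured customers with no complaints rarely churn.
    if record["tenure"] >= 24 and record["complaints"] == 0:
        return "low risk"
    # Rule 2: recent customers with repeated complaints are likely to churn.
    if record["tenure"] < 12 and record["complaints"] >= 2:
        return "high risk"
    # Everything else falls to a middle bucket for closer review.
    return "medium risk"

print(tree_score({"tenure": 36, "complaints": 0}))  # low risk
print(tree_score({"tenure": 4,  "complaints": 3}))  # high risk
```

Because each prediction traces to a named rule, the business-side evaluation described above can happen in plain language, without statistical training.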
WebFOCUS RStat: Cutting-Edge Predictive Modeling
WebFOCUS RStat from Information Builders is the market's first fully integrated BI and data mining environment, seamlessly bridging the gap between backward- and forward-facing views of business operations. With RStat, companies can easily and cost-effectively deploy predictive models as intuitive scoring applications, so business users at all levels can make decisions based on accurate, validated predictions instead of relying on instinct alone. WebFOCUS RStat provides a single platform for BI, data modeling, and scoring. This eliminates the need to purchase and maintain multiple tools, and frees analysts and other statisticians from spending countless hours extracting and querying data. At the same time, it reduces costs, simplifies maintenance, and optimizes IT resources.
With RStat, scoring routines can be incorporated into any WebFOCUS report or application. RStat's greatest benefit is its significantly increased accuracy. With the R engine (a powerful and flexible open source statistical programming language) as its underlying analysis tool, WebFOCUS RStat can deliver results that are consistent, complete, and correct every time. WebFOCUS RStat provides:
- A single tool with access to more than 300 data sources, for both BI developers and data miners
- Comprehensive data exploration, descriptive statistics, and interactive graphs
- In-depth data visualization and transformation
- Hypothesis testing, clustering, and correlation analysis
- The ability to build and export models for estimation and classification
- Comprehensive model evaluation
- Rapid application creation through easy incorporation of scoring routines into WebFOCUS reports
Conclusion
Avoiding common worst practices, and adopting best ones, is the key to successfully implementing and using predictive analytics. By knowing what pitfalls to avoid, and what important steps need to be taken, companies can accelerate implementation, maximize user adoption, and realize substantial ROI. Choosing the right supporting solution also plays a vital role in the success of a predictive application. Only WebFOCUS RStat offers unmatched data access capabilities, as well as all the tools needed to build a predictive model, manipulate the results, and deploy them to business users in a way that is easy to understand, interpret, and use.
Worldwide Offices Corporate Headquarters Two Penn Plaza New York, NY 10121-2898 (212) 736-4433 (800) 969-4636 United States Atlanta, GA* (770) 395-9913 Baltimore, MD (703) 247-5565 Boston, MA* (781) 224-7660 Channels (770) 677-9923 Chicago, IL* (630) 971-6700 Cincinnati, OH* (513) 891-2338 Dallas, TX* (972) 398-4100 Denver, CO* (303) 770-4440 Detroit, MI* (248) 641-8820 Federal Systems, DC* (703) 276-9006 Florham Park, NJ (973) 593-0022 Gulf Area (972) 490-1300 Hartford, CT (781) 272-8600 Houston, TX* (713) 952-4800 Kansas City, MO (816) 471-3320 Los Angeles, CA* (310) 615-0735 Milwaukee, WI (414) 827-4685 Minneapolis, MN* (651) 602-9100 New York, NY* (212) 736-4433 Orlando, FL (407) 804-8000 Philadelphia, PA* (610) 940-0790 Phoenix, AZ (480) 346-1095 Pittsburgh, PA (412) 494-9699 Sacramento, CA (916) 973-9511 San Jose, CA* (408) 453-7600 Seattle, WA (206) 624-9055 St. Louis, MO* (636) 519-1411, ext. 321 Washington DC* (703) 276-9006 International Australia* Melbourne 61-3-9631-7900 Sydney 61-2-8223-0600 Austria Raiffeisen Informatik Consulting GmbH Wien 43-1-211-36-3344 Bangladesh Dhaka 415-505-1329 Belgium* Brussels 32(0)2-743-02-40 Brazil InfoBuild Brazil Ltda. São Paulo 55-11-3285-1050 Canada Calgary (403) 437-3479 Montreal* (514) 421-1555 Ottawa (613) 233-7647 Toronto* (416) 364-2760 Vancouver (604) 688-2499 China Beijing 010-51289680, ext. 8010 Croatia InfoBuild CEE Strmec Samoborski 385-1-23-62-400 Czech Republic InfoBuild CEE Praha 420-221-986-460 Estonia InfoBuild Baltics Tallinn 372-5265815 Finland InfoBuild Oy Espoo 358-207-580-840 France* Sèvres +33 (0)1-45-07-66-00 Germany Eschborn* 49-6196-775-76-0 Greece Applied Science Ltd.
Athens 30-210-699-8225 Guatemala IDS de Centroamerica Guatemala City (502) 2412-4212 Hungary InfoBuild CEE Budapest 36-1-430-3500 India* InfoBuild India Chennai 91-44-42177082 Israel Malam Team SRL Products Petah-Tikva 972-3-7662040 Italy Milan 39-02-92-349-724 Japan KK Ashisuto Tokyo 81-3-5276-5863 Kuwait InfoBuild Middle East Safat 965-2-232-2926 Latvia InfoBuild Baltics Riga 371-67039637 Lebanon InfoBuild Middle East Beirut 961-4-533162 Lithuania InfoBuild Baltics Vilnius 370-5-268-3327 Mexico Mexico City 52-55-5062-0660 Netherlands* Amstelveen 31 (0)20-4563333 Nigeria InfoBuild Nigeria Garki-Abuja 234-803-318-4750 Norway InfoBuild Norge AS Oslo 47-4820-4030 Poland InfoBuild CEE Warszawa 48-22-657-0014 Portugal Lisboa 351-217-217-400 Qatar InfoBuild Middle East Doha 974-4-466-6244 Russian Federation InfoBuild CIS Moscow 7-495-797-20-46 n Armenia n Azerbaijan n Belarus n Kazakhstan n Kyrgyzstan n Moldova n Tajikistan n Turkmenistan n Ukraine n Uzbekistan Saudi Arabia InfoBuild Middle East Riyadh 966-1-479-7623 Singapore Automatic Identification Technology Ltd. Singapore 65-6286-2922 Slovakia InfoBuild CEE Bratislava 421-232-332-513 n Bulgaria n Romania n Serbia n Slovenia South Africa Fujitsu (Pty) Ltd. Cape Town 27-21-937-6100 Johannesburg 27-11-233-5432 South Korea Uvansys Seoul 82-2-832-0705 Spain Barcelona 34-93-452-63-85 Bilbao 34-94-452-50-15 Madrid* 34-91-710-22-75 Sweden InfoBuild AB Solna 46-8-578-772-01 Switzerland Dietlikon 41-44-839-49-49 Taiwan Galaxy Software Services, Inc. Taipei (866) 2-2586-7890 Thailand Datapro Computer Systems Co. Ltd. Bangkok 66(2) 301 2800 Turkey InfoBuild Turkey Ankara 90-312-266-3300 Istanbul 90-212-351-2730 United Arab Emirates InfoBuild Middle East Abu Dhabi 971-2-627-5911 n Bahrain n Egypt n Jordan n Oman Dubai 971-4-391-4394 United Kingdom* Uxbridge Middlesex 0845-658-8484 Venezuela InfoServices Consulting Caracas 58212-763-1653 * Training facilities are located at these offices. 
Corporate Headquarters Two Penn Plaza, New York, NY 10121-2898 (212) 736-4433 Fax (212) 967-6406 DN7506898.0811 Connect With Us informationbuilders.com askinfo@informationbuilders.com Copyright © 2011 by Information Builders. All rights reserved. All products and product names mentioned in this publication are trademarks or registered trademarks of their respective companies. Printed in the U.S.A. on recycled paper.