
Maverick: Learning to Love Bug-Ridden, Unpredictable IT

Gartner Symposium/ITxpo Africa 2009 August 3-5, 2009 Cape Town International Convention Centre Cape Town, South Africa

Nick Jones

Notes accompany this presentation. Please select Notes Page view. These materials can be reproduced only with written approval from Gartner. Such approvals must be requested via e-mail: vendor.relations@gartner.com. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This presentation, including any supporting materials, is owned by Gartner, Inc. and/or its affiliates and is for the sole use of the intended Gartner audience or other authorized recipients. This presentation may contain information that is confidential, proprietary or otherwise legally protected, and it may not be further copied, distributed or publicly displayed without the express written permission of Gartner, Inc. or its affiliates. 2009 Gartner, Inc. and/or its affiliates. All rights reserved.

Warning: This is a maverick presentation.

Software Engineering and Project Management is a Waste of Money


The Professional IT Fantasy of 2008: "Software quality and reliability is a fixable problem, we just need better software engineering and project management"

- Software and system quality is going to get worse whatever you do
- Stop wasting money on better software engineering; it won't help
- Stop wasting money on CMMI; it won't help
- Stop wasting money on better project management; that won't help either
- But, don't panic, quality isn't actually necessary for successful software

Warning: this is a "maverick" presentation. Non-maverick research at Gartner is vetted through a consensus process that draws on the collected wisdom of analysts practicing in the particular subject area. It reflects high impact, highly likely scenarios. Maverick research at Gartner is produced via an incubator process that shelters unconventional thinking from the rigors of our consensus-driven process to ensure we uncover and analyze high-impact future scenarios outside the zeitgeist of conventional thinking. We do not position maverick research as highly likely. We develop maverick positions in small teams. They are not discussed, negotiated and vetted through broad consensus. Maverick positions are intentionally on (or over) the edge to help clients (and Gartner) think about unconventional options. Interestingly, a majority of the unconventional "maverick" positions we have researched since 2004 have been increasing in likelihood but a significant number (a minority) have remained "low likelihood." Action Item: Clients should integrate our maverick research into their strategic planning processes and account for the unconventional positions in their scenario planning.


Nick Jones SA09_121, 8/09, AE


Key Issues
1. Why will software quality and reliability deteriorate through 2015?
2. How will organizations survive and profit from unpredictable and unreliable software?


Key Issue: Why will software quality and reliability deteriorate through 2015?
Market: Commercial and technical drivers will combine to reduce the proportion of software systems that deliver both availability and precision.
Tactical Guideline: Assess availability and precision separately when designing systems.

What Needs to Work, and What Won't Work


[Figure: two charts, labeled 2000 and 2015, each with axes Availability and Precision. Region labels: "Applications that must deliver precise, repeatable results, and must be available" and "Applications where precision and/or availability are optional".]

The goal of "traditional" software engineering has been to deliver both availability and precision. That is, systems that deliver correct results and are available whenever we want them. But, as this presentation will show, a combination of market trends and technology trends will combine to make these goals both unachievable and unnecessary for a much larger proportion of systems. Social networking systems like Facebook, for example, don't require the same level of precision as a banking application. We certainly don't claim that lower levels of availability and precision will be acceptable in all systems. For example, everyone expects precision in personal financial transactions. But even in applications where precision is expected, availability may become more negotiable. For example, Web, and especially mobile, banking may never deliver the classic call center's level of availability, but consumers may tolerate that as the price paid for being able to do things by themselves that required professional assistance in the past. Also, availability is shifting from an application issue to a system issue, where we use the word "system" in a very broad sense. For example, we may accept somewhat unreliable access to Web banking as long as we can switch to an alternative (if perhaps less functional) channel, such as the call center, if our Web access is unavailable. So the system has availability, even if the application doesn't.
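The point that "the system has availability, even if the application doesn't" can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the availability figures for the two channels are hypothetical, and the channels are assumed to fail independently, which real channels sharing a back end may not.

```python
def system_availability(channel_availabilities):
    """Probability that at least one channel is up, assuming independent failures."""
    p_all_down = 1.0
    for a in channel_availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


# Hypothetical figures, not taken from the presentation.
web = 0.95          # Web banking is up 95% of the time
call_center = 0.99  # the call center is up 99% of the time

combined = system_availability([web, call_center])
print(f"Web alone: {web:.4f}, Web with call center fallback: {combined:.4f}")
```

Under these assumptions the customer can reach the bank through some channel more often than through either channel alone, even though neither application was made more reliable.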


Strategic Imperative: Stop treating software development as if it were an engineering process.

Software Development is NOT Engineering


- The illusion that engineering principles will fix software has distracted the industry for decades.
- Software development is a knowledge acquisition process.
- Clients don't know what they want; goals and technology evolve over time.
- Systems development is a journey, not a process to achieve a well-defined destination.
- User behavior doesn't fit software life cycle models.
- Specifications are a photo of a race: a snapshot in time, but the race continues. The specification is only known precisely when the race is over; in other words, when the system is retired.

In 2004, Standish estimated that U.S. IT project success rates had increased to 34%, but mainly because projects had got smaller.

Key Issue: Why will software quality and reliability deteriorate through 2015? The illusion that software development is an engineering process has dogged the industry since its inception, and has contributed to setting unrealistic expectations of what software is, and how it should behave. Software development is a continuous knowledge acquisition activity, which produces software releases as a side-effect. It is certainly not a process that takes a blueprint and delivers a product. Knowledge is never perfect, users themselves don't have clear requirements, and there will always be scope creep and requirement shifts during the lifespan of any non-trivial project. Attempting to constrain user behavior to fit defined software life cycle models is like herding cats: unlikely to be a success. As projects get larger, these challenges become immense. One of the most successful strategies for increasing software project success over the last decade has been to reduce the size of projects. Smaller equals better, but this means that success can only be guaranteed for a project of zero size. Software methods are more like empirical process control than defined process control. Control by continuous monitoring and tuning, rather than attempting to follow a defined path.

Market: Through 2015, the social and financial penalties for low software quality will be insufficient to create a significant quality improvement in most regions.

The Psychology and Economics of Software


- A "system" is a combination of people and technology.
- A "development team" is a group of individuals with conflicting goals, inadequate communication and imperfect understanding.
- The software business model provides little personal or corporate accountability for errors:
  - Open source = quality by committee
  - Shareware/freeware = get what you pay for
  - Limited contractual liability/penalties
  - Lack of competition/high switching costs
  - Consumers don't have the opportunity to pay for quality, and probably wouldn't if they did

Human beings build systems at the limits of their ability. If you give them better tools they just build bigger systems.

Key Issue: Why will software quality and reliability deteriorate through 2015? Software professionals tend to come from mathematical and scientific backgrounds, which blinds them to the predominantly social nature of software development. In particular:
- Systems are not technology; they are a combination of people and technology.
- A "team" is a group of individuals, often with conflicting personal and corporate goals, who communicate inadequately, and who each have a different (and incomplete) understanding of the project goals.
- There are few penalties for mistakes. Software contracts limit liability. In many systems, switching to an alternative is extremely difficult because there may be no viable substitute, or switching costs are excessive. For example, how would you remove SAP or Windows from the average organization? So even dissatisfied users don't act.
- The broad expectation that much software is "free" does not provide an economic model that drives quality.
None of this is likely to change soon. Psychologically, humans tend to build to the limits of their capability and only change their behavior after a disaster, as illustrated many times by civil engineering. Providing better tools (for software or engineering) just encourages people to build bigger systems, not necessarily better systems.

Strategic Planning Assumption: New types of software error and vulnerability will continue to emerge through 2015.

Truths We Don't Talk About Enough


- Systems can be large, or they can be robust, but not both.
- Quality measurements are mostly guesswork. Success is subjective.
- Bugs in software are like sawdust in carpentry, an unavoidable cost of doing business.
- Success is a special case for large complex projects, not the norm.
- New classes of IT problem will continue to arrive for a decade, such as identity theft, human error, real-time sensors, scalability, network vulnerabilities.

Headlines:
- "Charter Communications deletes 14,000 email accounts"
- "Manufacturer blames bankruptcy on failed ERP implementation"
- "Billion-dollar IT failure at Census Bureau"
- "Chaos as £13bn NHS computer system falters"
- "Mideast submarine cable disruptions"

Key Issue: Why will software quality and reliability deteriorate through 2015? There seems to be no respite in the continuous stream of small and large-scale IT problems and disasters; a few examples from 2008 are listed above. Reviewing long-term records of software problems, such as the RISKS forum, suggests that problems and bugs are endemic to the industry. The detailed cause of problems changes over time as technology and society evolve, but serious software problems are a basic, unavoidable consequence of developing software. Bugs are the software industry's equivalent of sawdust, which is an unavoidable consequence of carpentry. As IT evolves, we can certainly expect new types of problem to arise, but there doesn't seem to be any evidence to suggest that the level of problems will decrease.


Tactical Guideline: Measure user satisfaction, not technical software quality metrics.

Does Software Quality Matter That Much?


- Software is amazingly useful, despite its cost, bugs, project overruns
- It runs our businesses, flies our planes, controls our pacemakers, simulates our second lives
- No-one would/could give it up. Everyone complains but no-one does anything. The world still turns
- Maybe reliability and quality aren't really that important?
- Maybe the problem isn't the software, but our expectations?
- The only true measure of software quality is user satisfaction (or at least, lack of rebellion).
- Many people (especially digital natives) seem to prefer agility to perfection.

Key Issue: Why will software quality and reliability deteriorate through 2015? Despite an endemic lack of quality, software has become essential to our lives. It's almost impossible to live without software in a Western society, and few people would even want to try. So we have already adapted to living in a world with a relatively high background level of IT-related problems. In fact, this is no different from any other useful technology, for example the automobile. Most people feel that driving is too expensive, and the level of casualties on roads is too high, but we won't give up our cars. Gerald Weinberg said: "If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization." But as civilization hasn't ended, maybe reliable software isn't actually that important (most of the time). So the real question is not "can we make software reliable?" but "how do we live with inadequate software?" and perhaps "will the inadequacies get better or worse?" The digital natives may also value agility more than quality. For example, many Web 2.0 sites have bugs, but evolve quickly to fix them, and this seems to be acceptable to many of their users.


Strategic Planning Assumption: The number of software problems related to synchronization and multithreading experienced by the average organization will rise through 2012.

The Future: Multicore and Clouds = More Untestable Systems


[Figure: progression from 2008 to 2012. 2008: single processor, single core, sequential or simple multithreading. 2012: multiprocessor clusters, multicore chips, massive parallelism, complex multithreading. Annotations: static code analysis is less helpful; asynchronicity and parallelism make testing and validation far more challenging; bugs can be unpredictable and not reproducible; testing cannot guarantee reproducible correctness.]

Key Issue: Why will software quality and reliability deteriorate through 2015? Looking forward, many technological trends will combine with the social challenges we mentioned earlier to cause further quality reduction in software. One of these is the shift from sequential computing to massively parallel computing. All the new high-power computing architectures are massively parallel, often with further architectural complexities, such as heterogeneous processors. For example, the world's fastest supercomputer in 2007 had just under 213,000 processors and just under 74 TB of main memory. Large numbers of asynchronous communicating threads and processes massively increase the complexity of programming and debugging, and reduce the value of tools such as code analysis, which struggle to predict potential dynamic behavior across multiple threads or processes. Testing becomes unrepeatable because unpredictable timing and synchronization issues can arise at run time. So tools to assist with quality assurance will become less effective, and a generation of programmers has to gain new skills in developing and testing in environments of massive parallelism and asynchronicity. These challenges will emerge at many levels, from browser applications (asynchronous JavaScript) and desktop applications (multicore CPUs) to data centers (clouds and clusters). Action Item: Train staff on asynchronous multiprocessing and multithreading.
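To make the "unrepeatable testing" point concrete, here is a minimal sketch that enumerates every possible interleaving of two threads performing a non-atomic increment on a shared counter. It is a deterministic simulation rather than real threading, and the thread names and step model are assumptions made for illustration; a real scheduler picks one of these interleavings on each run, which is why a test can pass many times and then fail.

```python
from itertools import permutations

# Each simulated "thread" performs a non-atomic increment: it reads the
# shared counter, then writes back the value it read plus one.
STEPS = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]


def valid(schedule):
    """A schedule is valid if each thread reads before it writes."""
    return all(
        schedule.index((t, "read")) < schedule.index((t, "write"))
        for t in ("A", "B")
    )


def run(schedule):
    """Execute one interleaving and return the final counter value."""
    counter = 0
    local = {}
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:
            counter = local[thread] + 1
    return counter


schedules = [s for s in permutations(STEPS) if valid(s)]
outcomes = {}
for s in schedules:
    outcomes.setdefault(run(s), []).append(s)

for value, group in sorted(outcomes.items()):
    print(f"final counter = {value}: {len(group)} of {len(schedules)} interleavings")
```

With only two threads and two steps each there are six interleavings, and four of them lose an update. The number of interleavings grows combinatorially with threads and steps, so exhaustive testing of this kind quickly becomes impractical.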

Strategic Imperative: Distributed service-oriented networked systems using "cloud computing" and large numbers of partners cannot provide guaranteed service levels.

The Future: A Distributed Mashup, Web 2.0 and SOA Nightmare


[Figure: a distributed system annotated with the following labels: "Anyone can change implementations without telling you"; "P2P uncertainty"; "Unknown components, dependencies, vulnerabilities"; "No overall architectural responsibility"; "'alt dev.' best effort performance"; "Network vulnerability"; "No overall management and control, no global SLAs".]

"We have good node-level availability: 5 9s, we have terrible system level availability: 2 9s," Jim Gray

Key Issue: Why will software quality and reliability deteriorate through 2015? The distributed, service-oriented architectures of the SOA and Web 2.0 generation will be highly challenging.
- There is no overall "system architecture," no overall architect and no overall control.
- Implementation of services can be changed at any time, without the knowledge of the service consumer.
- Applications are reliant on the performance of public networks, which can't be guaranteed.
- New approaches to implementing services, for example Skype or BitTorrent, rely on best effort techniques and peer-to-peer architectures where it's hard to provide service-level agreements (SLAs). Such alternate delivery models offer "good enough" performance at low cost, but no guarantees.
- Mashups, clouds and multi-vendor SOA systems are vulnerable to timing and synchronization issues.
- You have insufficient visibility into the system as a whole to correctly assess vulnerabilities.
- The entire system is reliant on networks which are unreliable and unpredictable, especially as their breadth increases. Globally, packet loss of a few percent is endemic; it can peak at over 12%.
So, despite the fact that we can create very reliable nodes (for example, data centers), the overall quality of systems in a networked world will deteriorate.
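The gap between node-level and system-level availability can be sketched numerically. The example below is not a reconstruction of Jim Gray's analysis; it assumes, purely for illustration, that a request succeeds only if every one of its dependencies is up and that dependencies fail independently. The dependency counts are hypothetical.

```python
import math


def serial_availability(node_availability, n_dependencies):
    """Availability of a chain in which all n independent parts must be up."""
    return node_availability ** n_dependencies


def nines(availability):
    """Express availability as a 'number of nines' (0.999 is about 3)."""
    return -math.log10(1.0 - availability)


node = 0.99999  # "five nines" for each individual node
for n in (1, 10, 100, 1000):
    a = serial_availability(node, n)
    print(f"{n:>5} dependencies: availability {a:.5f} (about {nines(a):.1f} nines)")
```

Under these assumptions, a chain of roughly a thousand five-nines components already falls to about two nines overall, before any network loss is taken into account.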

Tactical Guideline: Agree to only "best effort" service-level and support agreements for systems which involve consumer technology.

The Future: Consumerization is the Enemy of Traditional Software Quality


[Figure: service types plotted against two axes, "Number of independent participants in a service" and "Number of different devices, networks, platforms and technologies". From low to high on both axes: Embedded Code, Enterprise Application, Web 2.0 Service, Mobile Web Service, Pervasive, Interactive Computing. Low end: statically bound, predictable usage, single owner, more verifiable. High end: dynamically bound, unpredictable usage conditions, distributed ownership, human interaction, unverifiable.]

Key Issue: Why will software quality and reliability deteriorate through 2015? Consumerization implies a major architectural change. Classic corporate applications were statically bound, running on well-defined hardware owned by the corporation, had a single owner, a team of developers dedicated to the understanding and maintenance of the system, predictable demand and operations. As we evolve toward reduced control from central IT and greater employee autonomy, the range of devices accessing systems will grow dramatically. System utilization will become more unpredictable, and elements of the system may run on personal technology outside enterprise control. A "system" may include consumer-grade applications created by non-IT staff, delivered as Web mashups. Additionally, many more different devices, networks and operating systems are involved. The consequence is a much more ill-defined "system," with unpredictable usage conditions, including many more types of component, many of which are outside traditional enterprise control. Such systems are effectively untestable, and imply huge support and operations challenges. Worse still, techniques such as formal verification are not helpful when human interaction is involved, as a recent contributor to CACM noted: "Interaction systems are not only difficult to verify, but formally incomplete, impossible to verify."

Market: There will be many new opportunities for IT to support ill-defined tasks such as collaboration and personal assistance, but this will often involve systems which deliver imprecise or ill-understood results.

The Future: Fuzzy Systems and Imprecise Answers


- Contextual Systems = suggestions, hints and ideas, sometimes appropriate, sometimes not.
- Adaptive Systems may seem unpredictable.
- "Digital Assistants", but assistants aren't always right.
- Systems that "Learn" instead of being programmed, but you can't tell what they do or don't know.
- Systems operating on large quantities of Noisy Data, for example feature recognition in video streams.

Key Issue: Why will software quality and reliability deteriorate through 2015? Some of the new opportunities for using IT are in very ill-defined areas such as:
1. Contextual systems and "digital assistants" that deliver hints, ideas and suggestions.
2. Adaptive systems and algorithms that "learn."
3. Systems that operate on large quantities of noisy or imprecise data. For example, analyzing video streams to find potential criminal behavior or recognize faces.
4. Technology as a facilitator for social interactions. For example, social networks or finding "familiar strangers": people with whom you may have something in common and might want to meet.
Such techniques enable a wide range of new uses for IT, but will require a new way of thinking about computing as something that may not deliver precise answers, and that might not necessarily be clear about how or why a particular result was derived.


Tactical Guideline: Circulate a regular "weather report" which discusses new types of computing risk.

The Future: New Risks and New Developers


New risks, vulnerabilities, fragilities
- Consolidation creates risky software monocultures (proprietary or open)
- Cyber criminals become more sophisticated
- Agility corrodes strategy
- Unintended consequences and unexpected events
- Unexpected dependencies and correlations
- Predicting the consequence of disruptions or changes will be impossible
- More "intelligent" devices and services will imply more updates; each is a risk

The risk of a skills crisis
- Demand for software is increasing
- Developers are in short supply; low Computer Science enrolments
- Dangerous weapons in amateur hands, for example "smart devices" and personal supercomputers
- More amateur developers

Key Issue: Why will software quality and reliability deteriorate through 2015? IT systems have demonstrated a continuous history of challenges for decades, but the nature of the challenges evolves. Where will the next challenges emerge? A few potential sources include:
- Monocultures: Consolidation in parts of the industry creates monocultures that are as attractive to criminals and hackers as Windows is today.
- Agility: Agility and agile methods are showing promise, but they in turn bring risks, such as empowering small groups to operate independently of corporate strategy. Or perhaps the concept of central strategy is the problem?
- Mismatched skills and tools: New technologies will enable more smart (programmable) devices and state changes in personal computing hardware; for example, we can imagine the PC evolving into a 64 processor personal supercomputer within a decade. This is placing dangerous weapons in the hands of unskilled individuals. This will combine with the relative unpopularity of software as a career in recent years, and could exacerbate skills challenges and encourage amateur developers.
- Unintended consequences: The massive complexity of distributed SOAs will likely introduce unexpected system behaviors. The growth in intelligent devices implies a growth in software updates, every one of which poses a compatibility risk.

Key Issue: How will organizations survive and profit from unpredictable and unreliable software?

Surviving the Software Crisis Means Re-Thinking Software Expectations


2008 Fantasy: "Software quality and reliability is a fixable problem, we just need better software engineering and project management"

2015 Realism: "Software is unreliable and unpredictable, but we don't mind because we are prepared for that and it's still amazingly useful"

"We believe that current software engineering practices may be approaching the limit of the combination of functionality and reliability that they can deliver." (Rinard)

"Future software systems will be intelligent and adaptive. They will have the ability to seamlessly integrate with smart applications that have not been explicitly designed to work together." (Sterling)

Key Issue: How will organizations survive and profit from unpredictable and unreliable software? In the first part of this presentation we have tried to show that software bugs, unexpected behaviors and other software problems are an unavoidable consequence of the nature of IT, and that technology and social trends will likely make them more serious. So what should organizations do? If problems can't be avoided, organizations must prepare to live in a world where they are expected, intercepted and sometimes corrected before they can cause business damage. Current software engineering and quality techniques are not going to help; they are already approaching the limits of their capability and are ill-suited to the new challenges we identified.


Tactical Guideline: Create a working group within IT to identify and pilot technologies that make systems more robust and that identify and correct problems dynamically.

Technological Solutions: Paranoia, Redundancy, Validation


Paranoid Systems
Monitor local performance metrics continually. Define "normal" behavior and identify exceptions. "Rectifiers" force inputs to comfort zones. Don't trust interfaces or partners.
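The bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `rectify` and `ParanoidMonitor`, the sliding window, and the three-standard-deviation rule for "normal" are all hypothetical choices, not something the presentation prescribes.

```python
from collections import deque
from statistics import mean, stdev


def rectify(value, low, high):
    """Force an untrusted input back into the comfort zone [low, high]."""
    return max(low, min(high, value))


class ParanoidMonitor:
    """Learns "normal" from recent samples and flags exceptions (assumed rule)."""

    def __init__(self, window=50, tolerance=3.0, min_samples=5):
        self.samples = deque(maxlen=window)
        self.tolerance = tolerance
        self.min_samples = min_samples

    def is_normal(self, value):
        """True if value lies within `tolerance` standard deviations of the recent mean."""
        if len(self.samples) < self.min_samples:
            normal = True  # not enough history yet to define "normal"
        else:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            normal = abs(value - mu) <= self.tolerance * max(sigma, 1e-9)
        if normal:
            self.samples.append(value)  # only normal readings update the baseline
        return normal


monitor = ParanoidMonitor()
latencies_ms = [20, 22, 19, 21, 20, 23, 500]  # made-up partner response times
flags = [monitor.is_normal(x) for x in latencies_ms]
print(flags)                  # only the last reading is flagged as abnormal
print(rectify(150, 0, 100))   # an out-of-range input is forced back to 100
```

A real system would probably monitor several metrics at once and would need a policy for what to do with a flagged reading; the sketch only shows the detection and rectification steps.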

Testing Moves to Runtime


MOP (Monitoring Oriented Programming)

Avoid Monocultures
Redundancy like the space shuttle. Clean room implementations of key algorithms.
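One way to picture redundant clean-room implementations is a simple majority vote across independently written versions of the same function. The sketch below is an assumption about how such voting might look; `vote` and the three `total_*` functions are hypothetical names invented for the example.

```python
from collections import Counter


def vote(implementations, *args):
    """Run every independent implementation and return the majority result.

    A version that raises is treated as a dissenting vote, not a crash.
    """
    results = []
    for impl in implementations:
        try:
            results.append(impl(*args))
        except Exception:
            results.append(None)  # a failed version simply loses the vote
    winner, count = Counter(results).most_common(1)[0]
    if winner is None or count <= len(implementations) // 2:
        raise RuntimeError("no majority among redundant implementations")
    return winner


# Three hypothetical, independently written versions of one calculation.
def total_a(values):
    return sum(values)


def total_b(values):
    result = 0
    for v in values:
        result += v
    return result


def total_buggy(values):
    return sum(values[1:])  # deliberate bug: skips the first element


answer = vote([total_a, total_b, total_buggy], [1, 2, 3])
print(answer)  # the two correct versions outvote the buggy one
```

The vote only helps when the versions fail independently, which is the point of the clean-room requirement: copies of the same code share the same bugs.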

Key Issue: How will organizations survive and profit from unpredictable and unreliable software?
Techniques to explore include:
1. Make systems "paranoid": assume partners and external devices can't be trusted, monitor performance continuously to identify unexpected behavior, and consider software techniques to repair invalid inputs.
2. Move testing from development time to runtime; explore concepts such as Monitoring Oriented Programming (MOP).
3. Avoid monocultures; hybrids can be stronger. Remember that developers are not to be trusted either; many problems have their roots in human error. Consider redundant clean-room implementations of critical system components.
Corporate developers can also learn from real-time systems, which often have a different perspective on errors and focus on avoiding harm rather than continuing execution at all costs. The bottom line is that paranoia, redundancy and validation will shift a lot of development effort from implementing functional specifications to assuring robust operation in the field.
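To illustrate what moving testing to runtime might mean, the sketch below wraps a function so that a property is checked on every call in production, with a handler deciding what to do on a violation. It is written in the spirit of monitoring-oriented approaches but is not the API of any MOP framework; `monitored`, `log_and_clamp` and `discount_rate` are hypothetical names.

```python
import functools


def monitored(check, on_violation):
    """Wrap a function so `check` runs on every result at runtime.

    If the check fails, `on_violation` decides what to return instead.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if check(result):
                return result
            return on_violation(result, *args, **kwargs)
        return wrapper
    return decorator


violations = []


def log_and_clamp(result, *args, **kwargs):
    """Record the violation, then repair the value into the valid range."""
    violations.append(result)
    return max(0.0, min(1.0, result))


@monitored(check=lambda r: 0.0 <= r <= 1.0, on_violation=log_and_clamp)
def discount_rate(order_total):
    # Hypothetical business rule with a latent bug for very large orders.
    return order_total / 1000.0


ok = discount_rate(250)    # within the property, passed through unchanged
bad = discount_rate(5000)  # violates the property, logged and clamped
print(ok, bad, violations)
```

The check that would once have lived in a test suite now travels with the code, so a case nobody thought to test is still caught when it occurs in the field.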

Tactical Guideline: Explore academic research into techniques such as acceptability-oriented computing and failure-oblivious computing.

Technological Solutions: Acceptability and Resilience


Acceptability-Oriented Computing
"Acceptability envelope" built into the system. "The goal of perfection is counter productive." Layered/modular architecture with embedded constraints.

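As a rough illustration of an acceptability envelope with embedded constraints, the sketch below states a few properties a record must satisfy and repairs a damaged record so that it satisfies them again. The order record, its three properties and the names `repair_order` and `within_envelope` are assumptions made up for the example.

```python
def repair_order(order):
    """Return a copy of `order` forced back inside the acceptability envelope.

    The envelope here is three assumed properties: quantity is a positive
    integer, unit price is non-negative, and total equals quantity * price.
    """
    repaired = dict(order)
    qty = repaired.get("quantity")
    if not isinstance(qty, int) or qty < 1:
        repaired["quantity"] = 1
    price = repaired.get("unit_price")
    if not isinstance(price, (int, float)) or price < 0:
        repaired["unit_price"] = 0.0
    # The derived field is always recomputed rather than trusted.
    repaired["total"] = repaired["quantity"] * repaired["unit_price"]
    return repaired


def within_envelope(order):
    """True if the order already satisfies every acceptability property."""
    return order == repair_order(order)


damaged = {"quantity": -3, "unit_price": 9.5, "total": 1_000_000}
fixed = repair_order(damaged)
print(within_envelope(damaged), within_envelope(fixed), fixed)
```

The repaired record is not necessarily what the customer intended; the goal is only that execution stays acceptable rather than perfect.
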
Graceful Failure
"Failure oblivious" computing. Graceful degradation, states between operational and failure, meaningful behavior when resources are unavailable. For example, adaptive to network performance degradation.

Recovery-Oriented Computing
"Hardware faults, software bugs and operator errors are facts to be coped with, not problems to be solved."

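A minimal sketch of the recovery-oriented idea, assuming that the cheapest cure for a failed component is to rebuild just that component and retry. `FlakyCache` and `Supervisor` are hypothetical classes written for this illustration.

```python
class FlakyCache:
    """Hypothetical component that corrupts itself after a few requests."""

    def __init__(self):
        self.served = 0

    def get(self, key):
        self.served += 1
        if self.served > 3:
            raise RuntimeError("internal state corrupted")
        return f"value-for-{key}"


class Supervisor:
    """Recovers from a component failure by rebuilding it and retrying once."""

    def __init__(self, factory):
        self.factory = factory
        self.component = factory()
        self.restarts = 0

    def call(self, method, *args):
        try:
            return getattr(self.component, method)(*args)
        except Exception:
            self.component = self.factory()  # "reboot" just this component
            self.restarts += 1
            return getattr(self.component, method)(*args)


cache = Supervisor(FlakyCache)
results = [cache.call("get", k) for k in "abcde"]
print(results, cache.restarts)
```

The bug in `FlakyCache` is never fixed; callers simply stop seeing it, and the restart counter tells operators that something needs attention.
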
Resilient actors
Resilient partitioning of pervasive network applications.

Key Issue: How will organizations survive and profit from unpredictable and unreliable software?
Academics such as Rinard explore ways to make systems more robust. These include acceptability-oriented computing (abbreviated AOP below), "failure-oblivious" computing, recovery-oriented computing and resilient actors.
1. AOP is described as "An approach to the construction of systems in which a designer identifies a set of properties that the execution must satisfy to be acceptable to its users. This is in contrast to the traditional approach, which is to construct a system with as few errors as possible." AOP explicitly recognizes that perfection in terms of bug-free code is unachievable and counter-productive. AOP builds an "acceptability envelope" into systems. Code is layered and modular, with constraints and checks at multiple levels. It also includes concepts such as repair of damaged data structures.
2. Explore approaches that support graceful failure and degradation, such as "failure-oblivious" computing. For example, identify states between fully functional and failed, adapt to degraded partner or network performance, and try to do something meaningful even when some resources are unavailable.
3. Recovery-oriented computing designs in behavior to address bugs and errors.
4. "Resilient actors" is an architecture for distributed pervasive applications.
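The failure-oblivious idea can be illustrated with a buffer that keeps running through out-of-bounds accesses: invalid writes are discarded and invalid reads return a manufactured value. The published research applies this to memory errors in compiled programs; the Python class below is only an analogy, and `FailureObliviousBuffer` is a hypothetical name.

```python
class FailureObliviousBuffer:
    """A fixed-size buffer that keeps running through out-of-bounds accesses.

    Invalid writes are discarded and invalid reads return a manufactured
    value, instead of raising IndexError and terminating the caller.
    """

    def __init__(self, size, manufactured=0):
        self._data = [manufactured] * size
        self._manufactured = manufactured
        self.suppressed_errors = 0  # kept so operators can still see the problem

    def _valid(self, index):
        return isinstance(index, int) and 0 <= index < len(self._data)

    def __getitem__(self, index):
        if self._valid(index):
            return self._data[index]
        self.suppressed_errors += 1
        return self._manufactured

    def __setitem__(self, index, value):
        if self._valid(index):
            self._data[index] = value
        else:
            self.suppressed_errors += 1  # discard the write


buf = FailureObliviousBuffer(size=4)
for i in range(6):  # deliberate off-by-two overrun
    buf[i] = i * 10
values = [buf[i] for i in range(6)]
print(values, buf.suppressed_errors)
```

Whether a manufactured value is acceptable depends on the application; counting the suppressed errors keeps the failure visible even though execution continues.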


Tactical Guideline: Explore "market" architectures and local autonomy as approaches to enable graceful system behavior in unexpected situations.

Technological Solutions: Autonomy, Intervention, Robustness


Autonomy aids resilience and reduces risky dependencies
Such as intelligent traffic lights, GM auto painting machines. If global connectivity fails, there is still local intelligence. "Market" architectures versus central control architectures.
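Local intelligence of this kind could be sketched as a controller that prefers the central plan but keeps operating on a local default when the network is down or the plan looks wrong. The timing values and the names `choose_timing`, `central_ok` and `central_down` are assumptions for illustration only.

```python
LOCAL_PLAN = {"green_seconds": 30, "red_seconds": 30}  # assumed safe default


def choose_timing(fetch_central_plan, local_plan=LOCAL_PLAN):
    """Prefer the central plan, but keep operating locally if it is unavailable."""
    try:
        plan = fetch_central_plan()
    except (ConnectionError, TimeoutError):
        return local_plan, "local"
    # Stay paranoid: a reachable controller can still send nonsense.
    if plan.get("green_seconds", 0) <= 0 or plan.get("red_seconds", 0) <= 0:
        return local_plan, "local"
    return plan, "central"


def central_ok():
    """Stand-in for a healthy central controller."""
    return {"green_seconds": 45, "red_seconds": 20}


def central_down():
    """Stand-in for a failed control network."""
    raise ConnectionError("control network unreachable")


print(choose_timing(central_ok))
print(choose_timing(central_down))
```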

Allow human intervention
Unpredictable situations will arise. Some are best fixed by people.

Everything soft must be updatable
Because getting it right first time is impossible (even in a $1 RFID chip). But this is a risk as well!

Key Issue: How will organizations survive and profit from unpredictable and unreliable software? Approaches such as "market" architectures and autonomy can make systems more resilient and reduce dependencies. For example, General Motors replaced a complex global scheduling system for auto painting machines with a distributed "bidding" algorithm, where each machine attempted to optimize its own utilization. The result was better utilization. Computerized traffic-light control systems can operate autonomously if the central control or network fails, providing a more robust system. Because systems will fail for reasons which can't be predicted, there will always be a need for human intervention; some failures will be best repaired by people. Because reliable software is unachievable, anything with software in it that can't be discarded and replaced must be updatable, even very low-cost consumer devices. But this itself poses a risk: many system problems occur as unexpected side-effects of system updates, for example, some RIM BlackBerry outages in recent years.
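A toy version of a "bidding" allocation is sketched below: each machine prices a job from its own local state and the job goes to the lowest bidder, with no global schedule computed anywhere. This is not a description of the actual General Motors system; the cost model (queue length plus an assumed 15-minute colour changeover) and all names are invented for the example.

```python
class Machine:
    """A paint machine that prices jobs using only its own local state."""

    def __init__(self, name, current_color):
        self.name = name
        self.current_color = current_color
        self.busy_minutes = 0

    def bid(self, color, minutes):
        """Lower is better: queue length plus a changeover penalty (assumed 15 min)."""
        changeover = 0 if color == self.current_color else 15
        return self.busy_minutes + changeover + minutes

    def accept(self, color, minutes):
        self.busy_minutes = self.bid(color, minutes)
        self.current_color = color


def allocate(jobs, machines):
    """Give each job to the lowest bidder; no global schedule is ever computed."""
    assignments = []
    for color, minutes in jobs:
        winner = min(machines, key=lambda m: m.bid(color, minutes))
        winner.accept(color, minutes)
        assignments.append((color, winner.name))
    return assignments


machines = [Machine("booth-1", "red"), Machine("booth-2", "blue")]
jobs = [("red", 10), ("blue", 10), ("red", 10), ("blue", 10)]
assignments = allocate(jobs, machines)
print(assignments)
```

Because every decision uses only local information, removing a machine from the list still leaves a working allocator, which is the resilience argument made above.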


Tactical Guideline: Work with industry associations to lobby and educate politicians.

Social Risks and Social Solutions


Lawyers and Politicians Don't Understand IT
Need realistic regulation. Bad experiences: DMCA, e-discovery of RAM, internet filtering ... Industry responsibility to educate society. Regulate only where it matters, for example, safety-critical systems.

Safer Tools
Fewer Excel macros would be good.

Educate Consumers and Employees


More realistic expectations of what software is. Computers aren't always right. More IT output may need explicit statements of quality. For example: "there's a 30% chance of rain today."

Key Issue: How will organizations survive and profit from unpredictable and unreliable software? Expect continued tension between legislators' tendency to regulate after failures in what's seen as an engineering process, and the fact that regulation isn't the solution. If this isn't managed, we'll suffer legislation related to software behavior, failure and liability which will cripple business agility. Politicians and lawyers don't understand technology, especially IT, but will become ever more interested in it. The unintended consequences of their actions have already dogged the IT industry: for example, anti-hacking laws impacting the legality of security research; the DMCA, designed for digital media, being applied to third-party printer cartridges; and the recent ruling that e-discovery legislation applies to the contents of RAM. Some extensions of legal practice are inevitable and desirable; for example, litigation has already started in virtual worlds, such as around real estate and the value of virtual artifacts. It's also likely that the increasing emotional and social role of software and equipment will result in new areas of litigation. What happens, for example, when a child suffers emotional stress because a software update "kills" his/her virtual pet? We, the industry, have an obligation to educate society about the changing nature of IT, and to explore ways in which uncertainty can be explicitly communicated, for example, as we do with weather forecasts.


Tactical Guideline: Instruct your emerging trends team to explore the opportunities for genetic and biological algorithms as ways to exploit future computing power.

Massively Parallel Low Cost Computation + Unpredictable Software = Opportunity


You don't need an algorithm to calculate a solution, you need an algorithm to determine whether one solution is better than another

Genetic algorithms, memetic algorithms
"Evolve" solutions and even programs
Biologically inspired algorithms
For example, "ant's nest" optimization
Hybrid human/cloud solutions
HumanGrid; mTurk
Simulation
Results validated by humans
Algorithms which match computing in 2012 to 2015
May not give precise answers
But may give useful answers
Absorb large amounts of low-cost computing power

Complementary human/computing solutions exploiting the strengths of both

Key Issue: How will organizations survive and profit from unpredictable and unreliable software?
If we accept the value of fuzzy recommendations and approximate solutions in place of precise answers, then we find new ways to exploit the massive increases in low-cost computing power that clouds, clusters and multicore bring. For example:
1. Explore approaches that don't require algorithmic solutions, but only fitness computation; in other words, is one candidate result better or worse than another? Some of these, such as genetic algorithms and biologically inspired algorithms like swarm optimization, can exploit huge numbers of processors effectively.
2. Explore hybrid human/computing solutions that use networks to combine massive processing and massive numbers of human workers performing micro tasks. For example, HumanGrid or Amazon mTurk.
As a side effect, some of these algorithms are very resistant to some forms of error; for example, if some of the "population" of an evolutionary algorithm "die," the algorithm can still proceed. The bottom line is that once we embrace fuzzy software, we can find new ways to approach old problems using software.
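A minimal genetic algorithm makes both points concrete: the only problem-specific code is a comparison of two candidates, and the search carries on even when part of the population is lost in each generation. Everything here is an illustrative assumption, including the toy goal (a bit string with as many 1s as possible), the parameter values and the names `better` and `evolve`.

```python
import random


def better(a, b):
    """The only problem-specific knowledge: which candidate is better?"""
    return a if sum(a) >= sum(b) else b  # toy goal: as many 1 bits as possible


def evolve(bits=20, population=30, generations=40, death_rate=0.2, seed=1):
    """Return (best of the initial population, best candidate ever seen)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(population)]
    best = pop[0]
    for candidate in pop:
        best = better(best, candidate)
    first_best = best
    for _ in range(generations):
        # Some individuals "die" at random (a failed node, say); evolution continues.
        survivors = [p for p in pop if rng.random() > death_rate] or [best]
        pop = []
        while len(pop) < population:
            # Tournament selection needs only pairwise comparison.
            mother = better(rng.choice(survivors), rng.choice(survivors))
            father = better(rng.choice(survivors), rng.choice(survivors))
            cut = rng.randrange(1, bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [b ^ 1 if rng.random() < 0.02 else b for b in child]
            pop.append(child)
            best = better(best, child)
    return first_best, best


first_best, best = evolve()
print(sum(first_best), sum(best))  # score of the initial best vs. the final best
```

Each child can be scored independently, so the inner loop is the part that could be spread across many processors; the result is a useful answer, not a guaranteed optimum.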

Business Imperative Action Plan


Today
Understand the true nature of software; it's not engineering.

During the Next 36 Months


Reduce investments in futile attempts to improve software quality.
Stop trying to prevent bugs; ensure that their impact is non-catastrophic and that you can correct them quickly, if necessary.
Educate society: computers aren't precise. Make uncertainty explicit.
Implement systems which are autonomous, adaptive and paranoid.
Learn from academic research and domains like real-time safety-critical systems that are addressing issues such as resilience.

Long Term
Exploit Moore's Law to adopt applications which use massive computing power to provide approximate answers. Learn to live with unreliable, unpredictable software.

Remember, this is a maverick presentation. Maverick positions are not mainstream Gartner opinions. They are intentionally on (or over) the edge to help you think about unconventional options. Integrate our maverick research into your strategic planning processes to help you develop the unconventional positions within your scenario planning.
