1. Introduction
It is often difficult to know when to stop making incremental improvements in a software
development and maintenance process, and instead to make a bold shift. This case study
examines one concrete example, a particularly difficult year in the history of Tartan Inc., during
which I served as a senior engineer. Over the course of that year, we saw—and failed to see—a
variety of warning signs, reached several decision points, and eventually made a bold process
shift.
The following section describes Tartan’s market constraints, code base, and the software process
that had been in use for over a decade. The next section examines specific issues based on this
author’s experience.
One of the lessons of this experience is that more process is not necessarily better, and that
inappropriate process often exacerbates existing problems. Another lesson of this experience is
that software development processes really do matter, and that failure to keep software process
appropriate to the problem at hand can cause great problems. My point, in other words, is that
simple and lightweight process improvements can provide substantial benefit at reasonable cost.
Page 1 of 20
PRACTICUM: A TALE OF THREE PROCESSES: REFLECTION ON SOFTWARE DEVELOPMENT PROCESS CHANGE AT TARTAN
runtime system available for each target system. The Tartan focus on performance was critical to
market success for the company. Customers, including potential customers, routinely compared
the performance characteristics of the code generated by Tartan compilers to very carefully tuned
hand-written assembly code. Tartan’s emphasis on performance meant that these comparisons
were generally quite favorable. See [12] and the “Ada Outperforms Assembly” sidebar for a
notable example.
The second core focus of Tartan’s business model was customer support and interaction. Product
developers were made available as needed to address customer-reported issues. Customers could
expect to get bug workarounds within one or two business days. If the bug was a “show-stopper,”
they would receive shipment of a new tool incorporating their bug-fix within a week or two. Less
crucial fixes were typically deferred to the next bi-annual tool release.

Sidebar: Ada Outperforms Assembly
With the intent of getting an Ada waiver, a defense contractor assigned a junior programmer to
re-implement a portion of its software in Ada to prove that Ada could not produce real-time code.
The expectation was that the resultant machine code would be too large and too slow to be
effective for a communications application. Two days and sixteen labor hours later, the opposite
was verified. With only minor source code variations, one version of the compiled Ada code was
3x smaller than the original assembly version and 1.01x slower. A slightly modified version of
the Ada was 1.92x faster than the assembly version and 1.01x larger.
-- See “Ada Outperforms Assembly: A Case Study.” [8]

Tartan’s customers expected a major feature upgrade on a yearly basis, along with either one or
two additional bug-fix upgrades each year. Many of these customers
produced safety-critical systems. As a result, they cared deeply about the quality and reliability
of Tartan’s products. Supported customers paid a maintenance fee of 15% of purchase price per
year for upgrades and support. Revenue from support contracts made up 25% of Tartan’s gross
income.[1] Therefore, keeping supported customers happy was crucial for corporate survival.
[1] This percentage was growing steadily, but still fell below the industry average of 30%, due to year-to-year growth in Tartan’s sales.
Software Process
Prior to the emergence of the problems examined in this paper, Tartan had an informal tool-
focused software process. Automated source code control was used without exception. All
changes to the code were accompanied by informative log entries. A bug database tracked
problems uncovered in testing by Tartan or in the field by customers. While most developers did
unit and integration testing, some did not. The entire process was officially ad hoc, with the
details left to each developer's judgment. By 1992, this informal and lightweight process had been
successful for over a decade.
One reason for the success of Tartan’s software process was extremely low turnover of the
technical staff. The compiler group lost an average of one employee per year. Only half of these
losses involved employees leaving the company; the rest were internal transfers to other parts of
the development team. Furthermore, there was essentially no loss of senior engineers, only internal
transfers. This unusually low turnover led to a corporate memory that was both long in duration
and broad in scope of knowledge.
The most formal part of our process was the “known bug list.” This list contained complete
descriptions of the cause and symptoms of each bug, along with a small test case that
demonstrated the bug. Bugs were further categorized by product, host system, target architecture
and development board, and, where relevant, customer. Each bug was given a priority based on
its perceived seriousness, the presence or lack of a reasonable work-around, and its expected
degree of impact on Tartan’s reputation. Priorities were set by a small group that included
representatives from Engineering, Sales & Marketing, and Customer Support.
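As a rough illustration only (the paper does not describe Tartan's actual tooling, and all field names here are hypothetical), a "known bug list" entry carrying the categorization and priority scheme described above might be modeled like this:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model of a "known bug list" entry. The categories mirror
# those in the text: product, host, target, board, and (where relevant)
# customer, plus a priority set jointly by Engineering, Sales & Marketing,
# and Customer Support.
@dataclass
class KnownBug:
    bug_id: int
    cause: str                      # complete description of the cause
    symptoms: str                   # observable symptoms
    test_case: str                  # small test case demonstrating the bug
    product: str
    host_system: str
    target_architecture: str
    development_board: str
    customer: Optional[str] = None  # only where relevant
    priority: int = 3               # 1 = most serious

# Example entry (entirely invented for illustration).
bug = KnownBug(
    bug_id=101,
    cause="register allocator mishandles a large conflict graph",
    symptoms="compiler crash on very large procedures",
    test_case="tests/regress/bug101.ada",
    product="Ada compiler",
    host_system="Sun-4",
    target_architecture="Mil Std 1750a",
    development_board="eval board rev B",
    priority=1,
)
```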
Organization
The relevant groups within the company for the purposes of this study were Senior
Management—whose members had backgrounds in sales, marketing, and finance—and
Engineering. Engineering consisted of the Vice President of Engineering, the senior engineers,
and the rest of the developers. It is significant to note that the VP of Engineering was not
considered to be part of “Senior Management.”
Management decided to fill the engineering leadership gap. They read enough about software
engineering to recognize that we had a process problem, but did not have enough experience to
develop an effective solution. Instead, they mandated that we engage in a comprehensive testing
activity to “find all the bugs.” This well-meaning change was counterproductive. The result was
a tidal wave of newly identified bugs, along with resentment and bad morale in Engineering.
The volume of newly identified bugs was so large that nearly all of Engineering’s effort for a
six-month period was spent on test running, test automation, and failure analysis, leaving little
effort available to actually fix bugs. As a result we made little useful progress toward product
shipment for six months.
At this point, the senior engineers—including the author—finally concluded that an improved
software development process was the only solution that might enable us to ship new products
before the company went out of business. We designed a new process, paying careful attention
to Tartan’s situation and problems as well as to adoptability issues and technical issues. The
new process, about which much more later, focused on identifying and correcting bugs as early
as possible in the development process. After substantial difficulty with adoption, the new
process proved sufficient and Tartan shipped a new and more reliable product.
The remainder of this study consists of narrative interspersed with analysis. The narrative
sections describe events at Tartan as they unfolded, using the viewpoint we had at the time. Each
analysis section examines a decision point. For each, I examine the issue, the available
information, the decision, and the information used to make it. I then consider the outcome of the
decision, the useful information that could have been available if we had thought to look for it,
and suggest some alternative actions we could have chosen given the knowledge of the time but
with the wisdom of the present.
[2] Vendor-H was a large defense contractor.
[3] Byte-sized loads and stores were allowed at any address, 2-byte load/store operations were allowed only at even addresses, 4-byte at (address mod 4) = 0 addresses, etc.
surprises was necessarily large because Tartan’s performance commitment required the
engineers to handle a very large number of special cases for various target hardware. These
special cases made it much harder to correctly anticipate all the implications of any specific
change.
2. Issues
2.1. Process and Scale
2.1.1. Narrative
As noted previously, Tartan had succeeded for the prior decade with an ad hoc development
process. By the beginning of 1992, however, customers were complaining about reliability.
Users complained that products contained too many bugs. Several large customers were so
dissatisfied that they did not renew their maintenance contracts. At the same time, the
engineering staff was struggling to meet development and maintenance milestones—and failing.
Everything in development seemed slow, but no one had a clear idea of the reason.
Tartan’s Senior Management recognized these warning signs as indications of a serious problem
in Engineering. They correctly categorized it as a process problem. The engineering team, up to
and including the VP of Engineering, was completely unreceptive to this diagnosis. We
recognized the individual problems, but discounted Senior Management’s diagnosis. “What do
those guys know? They’re not technical, and our process has worked fine for over a decade!”
These problems were compounded by the need to produce regular feature upgrades. In early
1992, we had committed to a group of aggressive feature upgrades. These new features were
important both to satisfy market demand and to meet contractual requirements. Implementing
them required examining 80% of the compiler source code, changing 10% of it, and introducing
about 50 kSLOC of new code.
TI c3x/c4x family covered four variants; the i960 family covered three. Furthermore, between
1988 and the end of 1991, Tartan had added support for four additional processor variants to the
Motorola 68K family (bringing its total to fourteen), and three additional variants to the Mil Std.
1750a family (bringing its total to seven). The burden of special cases increases both with the
number of different target architecture families and with the number of variants within each
individual family—and Tartan’s goal of industry-leading performance meant that these special
cases had to be embraced, not ignored. The compiler Front End, Middle Pass, and Optimizer were
not immune to the problem of special cases either. These three components were heavily
parameterized so that they could remain essentially identical for all targets. However, the
profusion of special cases led to a steady stream of changes and improvements to that
parameterization. At the beginning of 1987, Tartan supported three target architecture families,
with fifteen distinct variants. By the beginning of 1992 Tartan supported five target architecture
families, with a grand total of twenty-nine distinct variants—a near doubling of targets.
Growth of surrounding tooling. In 1988 Tartan first fielded its own linker/loader. In 1989
Tartan shipped a new Debugger. In 1990 Tartan added a cross-reference tool. In 1991 Tartan
added two performance-profiling tools. Each of these tools required additional support from the
compiler engineers.
Growth of features and optimizations. In addition to the new tools listed above, each major
upgrade included new optimizations and code generation improvements. New compiler features
such as assembly code inserts and “intrinsic functions” also appeared during this period. The key
observation is that each upgrade added new possibilities for unanticipated interactions between
components.
Aggressive New Features. In early 1992, Tartan had committed to a group of aggressive feature
upgrades. Some notable examples were full debug support for optimized code, support for
complete traceability from source code to object code (even in the face of heavy optimization),
and a collection of new language features (as part of an “Ada-9x User-Implementer” study).
These features were important both to satisfy market demand and to meet contractual
requirements. Implementing the upgrades required examining 80% of the compiler source code,
changing 10% of it, and introducing about 50 kSLOC of new code. One key aspect of these new
features is that they required changes to some of the oldest, ugliest, buggiest, and least
understood parts of the compilers.
and development. As a result, we saw a significant increase of problems being found during
integration, or even later testing. Worst of all, there was a noticeable increase in problems that
were found by customers—the most expensive place of all! This shift of two-to-three phases in
problem discovery represents an increase of perhaps 8x (and possibly up to 1000x) in the cost to
fix the problems. With such a large increase in development costs, it is not surprising that the
Engineering staff was no longer able to meet milestones or to keep up with development
schedules. It is also not surprising that Senior Management focused on bug finding. Bug finding
is, after all, easy to measure, easy to understand, and can be addressed with straightforward
changes to the development process.
There were signs of these growth-related problems as early as 1990 and 1991. For example, the
typical project plan’s reserve for surprises increased from 25% of the total schedule in the late
‘80s, to 50% of the total schedule by 1991. Careful attention to this early indicator could have
identified the problem much sooner.
Tartan’s engineering staff saw many of the individual problems and warning signs. Nevertheless,
we failed to recognize that they all stemmed from a single underlying problem: Tartan’s
previously successful ad hoc software development process was no longer adequate for the size
and scope of our products.
There were indications, had we been looking for them, that we were encountering process problems:
Increased customer complaints. During the year prior to summer 1992, we saw a large
increase in customer complaints about reliability. This was reflected in customer bug
reports, in word-of-mouth reports from the sales and marketing staff, and—most
worrisome of all—in a sharp decrease in maintenance contract renewals.
“Everything seems hard.” The engineering staff began complaining that nearly every
task seemed more difficult and took longer than similar tasks had in the past. In
hindsight, this appears to have been due to the rising size and complexity of our code
base exceeding our ability to cope with it. At the time, we had no explanation.
Missed Milestones. We saw a large increase in missed milestones, but had no obvious
explanation. These milestones were both the large externally visible ones, like
implementing a new feature, and the fine-grained internal milestones, such as an
individual engineer’s goal to “get the following four things done over the next two
weeks.” The tasks involved were well within the range of our previous experience. We
had not suffered significant turnover of engineering staff. We weren’t trying to take on a
drastically larger set of problems without increasing staffing.
Tartan’s engineering staff and management saw these signs, and recognized them as problems.
We did not, however, correctly identify them as having a single underlying cause, and we did not
take any effective action to remedy the root problem. Instead, we worked harder—an ineffective
solution.
Senior management, although non-technical, was paying close attention to the few metrics that
were available from Engineering, e.g., missed milestones, as well as to customer complaints, bug
report rates, maintenance renewals, and other non-engineering metrics. This data, along with
their greater distance from the details of engineering, made it easier for them to realize that we
were facing a major problem. In hindsight, the VP of Engineering should have been attentive to
this data and come to the same realization before Senior Management did.
One reasonable question to ask is “Why didn’t the engineers believe Senior Management’s
diagnosis of the problem?” In my opinion, the answer is twofold: we were distracted by the
struggle to get our day-to-day work done and meet the looming deadlines, and we were suffering
from a severe case of “not invented here.”
Engineering was happy with the process we had. We were aware of studies showing that process
must change as the code base grows or the task at hand becomes more complex. However, we
were complacent. Although our code base had grown by nearly an order of magnitude, our
process had been able to handle it for more than a decade. Therefore, we did not consider process
to be part of the problem. Our complacency hampered our ability to recognize a process problem
when we encountered one.
2.2.1. Narrative
Tartan’s Senior Management correctly recognized that the engineering development process
needed to be revised. Because of Engineering’s refusal to recognize or address the problem,
Senior Management “took charge.” In an attempt to improve their understanding of the
Engineering side of the company, they read many articles from the business press on the subject
of software process. In the absence of useful input from Engineering, they mandated a solution:
comprehensive testing. Senior Management also decided to delay any further product shipment
until we returned to our previous level of quality and reliability. Catchphrases of the day
included “Find all the bugs, then fix them” and “We shall ship no compiler before its time.”
Engineering fought the introduction of this process change. Those of us in Engineering sensed
that simply adding additional testing effort would not solve our problems. However,
Engineering had lost much credibility by being unresponsive to Senior Management’s
suggestions, and by failing to correct our own problems. We thus lost the argument against
increased testing. Once the decision to start comprehensive testing was made, the engineering
organization worked quite hard to try to make it work, albeit with a fair amount of grumbling
along the way.
Tartan had a large and comprehensive test suite, comprising 2 MLOC spread across
approximately 5,000 executable programs. It included government-mandated language tests,
benchmarks and stress-test programs from customers, regression tests intended to exercise every
prior bug-fix, and purpose-written tests intended to exercise various compiler features.
The testing effort lagged in the beginning due to insufficient hardware resources (both host
workstations and target boards) and due to weaknesses in our test automation. These problems
were alleviated by spending significant amounts of money on additional hardware, by
substantial improvements to our test automation, and by creating a smaller “overnight” test suite
for frequent use by front-line developers.
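The paper does not show Tartan's automation, but the idea of carving a smaller "overnight" suite out of a much larger one can be sketched as follows. The selection criterion here (greedily taking the cheapest tests under a time budget) is purely illustrative, not Tartan's actual policy:

```python
# Hypothetical sketch: select a fast "overnight" subset from a large suite.
# Each test is a (name, estimated_seconds) pair; we greedily pick the
# cheapest tests until the overnight time budget is exhausted.
def overnight_subset(tests, budget_seconds):
    chosen, used = [], 0
    for name, cost in sorted(tests, key=lambda t: t[1]):
        if used + cost > budget_seconds:
            break
        chosen.append(name)
        used += cost
    return chosen

# Invented test names and timings, for illustration only.
suite = [
    ("acvc_001", 5),          # government-mandated language test
    ("stress_big", 3600),     # customer stress-test program
    ("regress_101", 2),       # regression test for a prior bug-fix
    ("bench_radar", 900),     # customer benchmark
    ("feature_generics", 30), # purpose-written feature test
]
quick = overnight_subset(suite, budget_seconds=60)
# quick -> ["regress_101", "acvc_001", "feature_generics"]
```

The full suite would still run periodically on all host/target/board combinations; the cheap subset only gives front-line developers fast feedback.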
Making a serious effort to run the entire test suite on all supported combinations of host
computer, target architecture, and development board found so many test failures that
Engineering couldn’t keep up with analyzing them, much less actually fixing any of the bugs
that caused the failures. During the second half of 1992, we struggled with this tidal wave of
test failures. Our list of reported test failures awaiting analysis peaked at over 13,000 items.
Although we eventually managed to diagnose failures as quickly as they were discovered, we
made little progress on fixing the underlying software. Meanwhile, the list of known bugs
continued to grow.
As the massive testing effort dragged on, and fall turned to winter with no end in sight, morale
in Engineering plummeted. It wasn’t just that the testing was yielding no useful result; the
engineers also resented having a “solution” forced on us from above.
Around the end of November, the senior engineers began to understand that we had been
correct—the “comprehensive testing” approach alone was not going to solve our problems. We
understood that we needed something more, or perhaps something different. However, we were
so busy analyzing test failures and fixing bugs that we had no time to spend on understanding
the real problem, much less on fixing it.
Meanwhile, the entire engineering staff was losing what little credibility it had left with Senior
Management. After six months under the “improved” process, there was no sign of actual
improvement. Senior Management was beginning to wonder whether they could trust
Engineering to deliver at all.
2.2.2. Analysis
We made two fundamental mistakes at Tartan during this period. First, Senior Management
mandated a specific change to our development process. This was a good-faith effort to do the
right thing. Unfortunately, Senior Management did not have the necessary experience to
recognize that using testing at the end of a process solely to “add quality” to the product being
tested does not work.
The fact that the mandated change turned out to be counterproductive should not come as a
surprise.[4] It is unreasonable to expect non-engineers to make correct engineering decisions
without the necessary background. On the other hand, Senior Management should have
understood that they should not be making detailed engineering decisions. A reasonable
alternative action would have been to insist that the Engineering leadership acknowledge and
solve our process problems. If necessary, this insistence could reasonably have taken the form of
replacing the VP of Engineering.
The Engineering leadership made the second fundamental mistake. Even though we had
predicted that it would fail, we allowed ourselves to be convinced to try the comprehensive
testing approach. Instead, we should have countered with an alternative. This was particularly
bad because we had all read The Mythical Man Month (especially the section on Product
Testing). We all “knew” that testing the quality in at the end does not work. Nevertheless, we
tried that approach for nearly six months before seriously planning an alternative approach.
[4] The surprise is that the Senior Management team ordered us to solve a real and important problem, but it wasn’t the key problem we faced. Tartan’s product testing before that point was clearly both insufficient and ineffective. Many problems that cropped up in the field really should have been found during in-house testing. That said, the added testing was needed to verify that our development process was (or, in this case, was not) producing good results. It could not, however, solve the fundamental problem.
The Engineering leadership is fundamentally responsible for the software development process.
When Senior Management dictated the comprehensive testing approach, the Senior Engineers
had a second chance to uphold that responsibility by taking charge of changes to our
software development process. This responsibility must be discharged by the
engineering leadership, even when it means discounting non-technical management’s specific
process fix.
2.3.1. Narrative
From December 1992 through February 1993, comprehensive testing continued as mandated.
Morale plummeted to new lows. No end to our problems was in sight.
Meanwhile, gripe sessions among the Senior Engineers slowly began to change into
brainstorming and problem-solving sessions. One key part of this change in attitude was our
realization that simply blaming Senior Management for the whole mess was an abdication of our
responsibilities as engineers. We were the people who should have made sure that quality did not
slip in the first place. And we were the only people in the company who might—just maybe—
have the knowledge and experience needed to fix our problems.
We studied the literature, both academic and popular. It is impossible, at this late date, to
reconstruct a complete list of references we consulted. Some that stand out for their contribution
to our eventual process, however, are:
• The Mythical Man Month by Fred Brooks [3]. This masterpiece of Software Engineering
literature began the entire effort. We were inspired by the fact that Brooks and his team
surmounted much greater problems nearly two decades earlier. The guidance he gave helped
to change us from “gripers” to “fixers.”
• Code Complete by Steve McConnell [5] stands out as a comprehensive and valuable
guide, both for its clear exposition of many issues, and especially for its suggestions on
further reading.[5]
• Fagan’s work [6] [7] on Inspections and their effectiveness. Although we knew about
this work before beginning our study, we didn’t really come to appreciate it until after
reading Code Complete. This body of work helped form our strategy in the area of reviewing
each other’s designs and code.
• Structured Walkthroughs by Ed Yourdon [11]. Another book we appreciated only after
reading Code Complete. This book inspired our choices of specific techniques for code
reviews and walk-throughs.
• Studies on the cost of fixing bugs in various phases of development [2] [4] [6]. These
papers convinced us to focus our efforts earlier in the software life-cycle.
[5] We were lucky enough to acquire a number of early copies of this book through contacts at the publisher.
[6] Throughout this document, names have been obfuscated to protect both the innocent and the guilty. That said, “Dave” was our most notorious, most productive, and most senior “coding cowboy.”
[7] A bug cut-down is a small collection of test cases that demonstrate the presence or absence of a specific bug. These test cases were produced during the debugging process, and would be harnessed as regression tests after successful correction of the bug.
COMMANDMENTS to a fellow engineer, and to convince that peer that the testing
performed was indeed sufficient.
Nothing in the new process was novel. It was a careful tailoring of well-known techniques to fit
Tartan’s specific circumstances.
After convincing management that the new development process was the right way to go—as
discussed further in the next section—we briefly shut down the testing process while all
developers analyzed and categorized the remaining unanalyzed test failures. After this point, four
months working within the new process was sufficient to produce a working product release.
The improved process held up over the following three years, until the company was sold in the
summer of 1996. During this period we added several new products, including a new source
language (C++) and several new target architectures, leading to a near doubling of the compiler
code base to 1.5 MSLOC. Comparison of before and after time sheets and other data suggests
roughly an 8x productivity improvement (see box on page 19 for more details).
2.3.2. Analysis
Key Observation: The engineers who designed the new process still did not clearly understand
that our inability to identify bugs in earlier phases of development, combined with the cost of fixing those
bugs later in development, was the key source of our sudden development meltdown. Fortunately,
our decision to focus on catching problems at earlier stages of the development cycle directly
addressed the root problem.
The three changes with the largest impact in the short term were institutionalizing the “Think–
Act–Review” pattern, replacing bug farms with all-new code, and making the rules for source
code check-in a normal part of our daily work.
2.3.2.1. The “Think-Act-Review” Pattern
The general theme of our new process was “Think–Act–Review.” It is self-evident that thinking
before acting is generally a good idea. Writing down what you've thought about serves two
purposes: First, it forces the writer to make a coherent presentation of his thoughts. Second, it
helps to provide convincing evidence that the writer has actually thought carefully about his
plans. Reviewing one’s output—whether it be designs, code, test plans, test results, or
whatever—provides both an opportunity for other eyes to find problems you have missed, and a
check that you actually followed the process. Although this description may sound rather
Waterfall-ish, we used it more as a relatively fine-grained iterative process. We aimed for
design-implement-test cycles that could be finished in about a week. Naturally, we spent much
less time for simple bug fixing, and occasionally more for major upgrades and new features.
The “Think–Act–Review” pattern provided several levels of benefit. In the small, the
requirement for having at least one peer review every change before check-in provided much of
the benefit claimed for “Pair Programming” at a fraction of the cost. This level of the pattern is
where the COMMANDMENTS OF CHECK-IN operated. Many subtle errors were caught during these
peer reviews—along with some embarrassingly blatant mistakes as well.
At a somewhat larger scale, the “Think–Act–Review” pattern fostered clarity of thinking by
requiring developers to explain their planned changes to a peer group. This requirement for
explanation typically led to a certain amount of writing about planned changes, which also
fostered more careful thought. These reviews served as a fine opportunity to check for missing
pieces and muddled thinking.
At the highest level, the “Think–Act–Review” pattern served to institutionalize reflection on the
development process itself. This was not originally a planned part of the process. During some of
our larger reviews, various engineers asked why we weren’t applying the pattern directly to the
process. We soon included a section on “what did we mess up process-wise?” in each major
review.
2.3.2.2. Replacing “Bug Farms”
Replacing bug farms had a huge impact on quality. For example, during the four months from
the start of the new process to actually shipping a new version of the product, we replaced five of
the buggiest packages in the compilers. This change alone fixed one third of the bugs on our list,
and more than half of the “really hard” bugs.[8] The 10% of available staff time we spent replacing
these packages was likely much less than the cost of analyzing and fixing that many bugs—
especially the “really hard” ones.
2.3.2.3. Rules for Source Code Check-in
The rules for source code check-in, known as “THE THREE COMMANDMENTS OF CHECK-IN,”
served to protect us from ourselves. The FIRST COMMANDMENT’s requirement for running the
over-night test suite led to a significant decrease in checked-in “fixes” that actually broke the
system for other developers. The SECOND COMMANDMENT’s requirement for passing purpose-
written tests and/or bug cut-downs helped us avoid checking in changes that did not behave as
expected, such as a “bug fix” that doesn’t actually fix the bug.
In many ways the THIRD COMMANDMENT was the most important. Most obviously, it served as
the penultimate line of defense against cutting corners on the process. Perhaps more importantly,
however, the simple act of explaining a bug-fix to another engineer often led to the bug-fixer
suddenly realizing that there were additional special cases he hadn’t considered, or that some
targets had different requirements. Suddenly trailing off in mid-explanation, saying “Wait a
minute! I forgot about…” was a common occurrence for the entire engineering staff.
The THIRD COMMANDMENT worked, in part, because each engineer knew that being responsive
to review requests from other engineers was in his or her own best interest. Today I may be
reviewing someone else’s code; tomorrow I’ll be looking for a reviewer. It’s always easier to
find someone to help when you’ve been helpful yourself. In addition, we found that it was
important to put the reviewer’s name on the check-in right along with the author’s name. This
practice encouraged reviewers to be careful about checking the appropriateness and correctness
of the change they were reviewing. If the fix doesn’t work, and your name is there as reviewer,
you expect to be asked why you didn’t catch the problem.
“Living righteously” by adhering to these commandments significantly reduced rework of bug
fixes and time wasted when someone else broke the system. It also helped us make the entire
process an ingrained part of our development culture.
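Taken together, the three rules amount to a simple gate on each check-in. The sketch below is a hypothetical illustration of that gate in Python; Tartan enforced these rules socially rather than with a tool, and every name and field here is an assumption made for the example.

```python
# Hypothetical sketch of the THREE COMMANDMENTS OF CHECK-IN as an automated
# gate. Tartan had no such tool; the names and fields below are illustrative
# assumptions, not the company's actual tooling.
from dataclasses import dataclass
from typing import List

@dataclass
class CheckinRequest:
    author: str
    reviewer: str = ""                    # THIRD: a peer who heard the explanation
    overnight_suite_passed: bool = False  # FIRST: full overnight test suite ran clean
    purpose_tests_passed: bool = False    # SECOND: purpose-written tests / bug cut-downs pass

def checkin_violations(req: CheckinRequest) -> List[str]:
    """Return the list of commandments this check-in would break."""
    violations = []
    if not req.overnight_suite_passed:
        violations.append("FIRST: overnight test suite did not pass")
    if not req.purpose_tests_passed:
        violations.append("SECOND: purpose-written tests or bug cut-downs did not pass")
    if not req.reviewer or req.reviewer == req.author:
        violations.append("THIRD: change was not explained to another engineer")
    return violations
```

Recording the reviewer’s name on the request mirrors the practice, described above, of putting the reviewer’s name on the check-in right alongside the author’s.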
8 “Really hard” in this context means either that failure diagnosis didn’t fit in the original time-box, thus moving the
bug to the “really hard” bug list, or that the first attempt to fix the bug failed, so the developer moved it to the “really
hard” bug list and moved on. These bugs were typically much more difficult to fix than the ordinary variety.
Example “really hard” bugs include a register allocation problem that manifested only when there were more than
1024 variables in the conflict graph, and a mysterious compiler crash caused when the Solaris OS occasionally
failed to restore register values correctly after a context switch.
2.3.3. Summary
Software process need not be complex. We obtained a tremendous improvement in useful
productivity by adopting a few simple ideas, all of which, by 1993, had been known for over a
decade. When carefully applied, even very simple approaches can have a large impact. We
increased the amount of process used by our organization, but we did so without adding
significant ceremony or paperwork. In our experience additional paperwork would not have
added value. Instead, we focused on activities that directly improved quality and productivity,
such as having programmers routinely inspect each other’s work.
Inappropriate process does more harm than good. The sections above show the effects of two
different inappropriate processes. The original process simply became inadequate due to growth
in size and complexity of our tasks. The “comprehensive testing” process led to a complete
meltdown of our ability to make forward progress in development. In Tartan’s market, either of these
processes would shortly have led to the failure of the company.
2.4.1. Narrative
It was surprisingly easy to get initial agreement to try the new process. Both management and
engineers signed up promptly when confronted with the likely alternative of going out of
business. The real problems came in the longer term.
Every developer had problems with the switch to the new process. Even those who designed the
new process found it difficult. Many parts of the process were easy to swallow, but one in
particular was very hard: the check-in rules. Even the most vocal proponents of
the check-in rules had trouble following them. Some developers hated the check-in rules so
much that they became process “hold-outs” over that issue alone.
A typical interaction early on might go like this:
Reviewer – “Did you follow the *&^%#*&^%$ check-in rules?”
Reviewee – <sighs> “Yes, I did follow those &^%&^% pain in the %%$# rules” (or perhaps
“No, I didn’t.” <sigh>).
Then one day Dave, the most notorious hold-out in the organization, “got it”: he suddenly
understood that the check-in rules helped all of us to avoid stupid mistakes at low cost. He
described his sudden understanding as being “almost a religious experience,” and asked me why the designers
of the process hadn’t made it clear that the &^%$&^%$& check-in rules were actually
important, and not just bureaucratic BS. I responded that we had tried to do so, but obviously
hadn’t gotten the idea across to him.
The next day, while reviewing some code I wanted to check in, Dave asked me “Did you follow
THE THREE COMMANDMENTS OF CHECK-IN?” (Imagine the “three commandments” part spoken
in a deep pretentious announcer-style voice). When I stopped laughing, I admitted that I actually
had missed a step that time.
Within days, this description of the check-in rules had spread through the entire team. After this
change, a typical interaction went more like:
Reviewer – “Did you FOLLOW THE THREE COMMANDMENTS?”
Reviewee – “Sure did! And here’s the proof…”
This would be a pointless anecdote, except for one surprising thing—compliance with the rules
went up for the entire team. Both the hold-outs and the engineers who were trying to follow the
new process very carefully found it easier to follow “THE THREE COMMANDMENTS OF CHECK-
IN” than it had been to follow “those &^%$&^%$ check-in rules.” The difference was so large
that we went back and changed the official description of the check-in rules to be “THE THREE
COMMANDMENTS OF CHECK-IN” in all the documents describing the process. It may sound
silly, but it worked. Similar application of humorous peer pressure solved nearly all of our other
problems with compliance with the new process. In the few cases where peer pressure was
ineffective, we resorted to management sanction.
The other main point of contention within the engineering organization came over the issue of
standards for coding style and formatting. About one month after starting to work with the new
process we noticed that many code walkthroughs and reviews involved more time arguing about
formatting than actually looking at the code!
We took a solution straight from McConnell’s Code Complete [5]: each group responsible for a
functional area developed their own consensus style guide. We then agreed that all new code
written by members of that group would follow their team’s standard. However, spending time
reformatting existing code was strictly forbidden unless the code in question was undergoing
very substantial change or rework. All debate about formatting was then placed off-limits
during reviews and walk-throughs. The details of the consensus style guides turned out to be far
less important than the consensus itself. This approach ended the “coding style wars” entirely,
and allowed us to focus on more important parts of the job at hand.
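The reformatting rule reduces to a small predicate. The sketch below is purely illustrative; the report gives no numeric threshold for “very substantial change or rework,” so the 50% figure is an assumption.

```python
# Illustrative sketch of the formatting policy: new code follows the team's
# consensus style guide, while existing code may be reformatted only when it
# is already undergoing very substantial change or rework. The 0.5 threshold
# is an assumed value; the report does not quantify "substantial."

def may_reformat(is_new_code: bool, fraction_rewritten: float,
                 rework_threshold: float = 0.5) -> bool:
    if is_new_code:
        return True  # new code: the team's style guide applies
    # old code: hands off unless it is already being substantially reworked
    return fraction_rewritten >= rework_threshold
```

The value of the rule lay less in its exact threshold than in taking formatting off the table during reviews, consistent with the observation that the consensus mattered more than its details.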
The big process problem with management came not during the adoption phase, but much
farther down the road. Management was continually tempted to respond to tight deadlines by
suggesting that we set aside the process for the duration of the “emergency.” Both engineering
line managers and the senior management team fell victim to this error at various times.
2.4.2. Analysis
Without management support, a new process has no teeth. In particular, it must be possible to
discipline developers who persistently fail to follow the process. When peer pressure fails, this
discipline requires management intervention.
A more difficult aspect of management support for software process is the temptation to set
aside the process to meet short-term goals. Sustaining a disciplined process, even a very lightweight
one, requires significant effort. There were similar temptations for the developers. Doing
more coding or debugging is much more fun than sitting in a code walk-through or reviewing
each other’s work prior to check-in. Having management offer the excuse of a short-term crisis
only adds to the developers’ temptation to backslide. Our only successful response was for the
engineering staff to patiently point out that the “overhead” activities that management was
suggesting we set aside were key parts of a process that was yielding tremendous productivity
improvements. “Skipping those activities makes things go slower, not faster!” In the occasional
cases where that approach didn’t work, we simply refused to comply with management’s
instruction to “temporarily” drop the process. This rather unsatisfactory approach has obvious
risks associated with it.
2.4.3. Summary
Process change is hard. Even after the initial resistance to change is overcome, it remains
difficult to change longstanding work habits. We almost didn’t manage to change our process.
Humorous peer pressure was a key enabler for process change in our case.
Humorous peer pressure works surprisingly well. The specific techniques we used may not be
appropriate in other organizations, but they worked far better than we ever expected. The most
successful humorous approaches were initiated by “process hold-outs,” usually as they finally
“converted” to the new process.
“Bottom up” process change is easier than “top down.” Tartan’s engineering staff was better
able to accept a process designed by their peers than a process mandated by management. Take
advantage of this tendency by involving the engineering staff early and fully in any process
change activity.
3. Process Revisited
If I found myself in a similar situation in the future, I would:
• Investigate “Extreme Programming” (XP). My reading suggests that many of the ideas
espoused by XP advocates are rather similar to those we considered important at Tartan,
though with somewhat different emphasis. For example, proponents of XP
recommend that all programming be performed by pairs of developers. This practice is
reminiscent of, but appears more expensive than, Tartan’s system of reviews. It would be
interesting to know whether the added expense is worthwhile. XP may well be better than
the process we designed at Tartan for at least some organizations and problems.
• Try harder to reduce the overhead of the new process. Examination of time sheets about
18 months into the new process showed that we were expending 40% of staff time on
“process overhead.” That overhead included reviews both formal and informal, code
walk-throughs, replacement of bug farms, and preparation time for reviews. We also
found that we’d achieved roughly an 8x improvement in completed work, with a
substantial decrease in needed rework as well. Finding a way to cut the 40% “process
overhead” to a smaller number would yield still greater improvement.
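As a rough back-of-the-envelope illustration (the linear-scaling assumption is mine, not the report’s): if the 60% of staff time spent on direct work produced the observed 8x improvement, the potential gain from trimming overhead scales with the direct-work fraction.

```python
# Back-of-the-envelope sketch, not from the report: assume output scales
# linearly with the fraction of staff time spent on direct work, and that
# a lighter process would preserve the same per-hour productivity gains.

def projected_speedup(current_speedup: float,
                      current_overhead: float,
                      new_overhead: float) -> float:
    """Scale the observed speedup by the change in direct-work fraction."""
    return current_speedup * (1.0 - new_overhead) / (1.0 - current_overhead)

# Cutting overhead from 40% to 30% of staff time:
print(round(projected_speedup(8.0, 0.40, 0.30), 1))  # ≈ 9.3
```

Whether such linear scaling would hold in practice is exactly the open question, since the “overhead” activities were themselves the source of much of the productivity gain.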
If asked to select pieces from Tartan’s new process in priority order, my first choice would be the
“Think–Act–Review” model in all its varied incarnations. My second choice would be THE
THREE COMMANDMENTS OF CHECK-IN. If the problem at hand involved a substantial body of old
code, I’d consider walk-throughs and bug farm replacement to have equal claim on third place.
4. Conclusion
This report has presented issues and lessons learned, from the author’s point of view, from
difficult changes in the software development process at Tartan Inc. between 1992 and 1993. Our
products out-grew our initial development process, leading to a sharp increase in defects and in
missed milestones in development. Senior management responded by imposing a modified
process with a very heavy emphasis on testing. This changed process only made our situation
worse. Finally, the senior engineers designed a new development process carefully tailored to
Tartan’s specific situation and needs. Working within the new process yielded enough
improvement to get development back on track.
5. References
[1] B. Boehm. “A Spiral Model of Software Development and Enhancement.” IEEE Computer
21, no. 5 (1988): 61–72.
[2] B. Boehm and P. N. Papaccio. “Understanding and Controlling Software Costs.” IEEE
Transactions on Software Engineering SE-14, no. 10 (October 1988): 1462–77.
[3] Frederick P. Brooks, Jr. The Mythical Man-Month. Reading, MA: Addison-Wesley, 1975.
[4] Robert H. Dunn. Software Defect Removal. New York: McGraw-Hill, 1984.
[5] Steve McConnell. Code Complete. Redmond, WA: Microsoft Press, 1993.
[6] Michael E. Fagan. “Design and Code Inspections to Reduce Errors in Program
Development.” IBM Systems Journal 15, no. 3 (1976): 182–211.
[7] Michael E. Fagan. “Advances in Software Inspections.” IEEE Transactions on Software
Engineering SE-12, no. 7 (July 1986): 744–51.
[8] P. K. Lawlis and T. W. Elam. “Ada Outperforms Assembly: A Case Study.” Proceedings of
TRI-Ada, 1992. Also available at http://www.seas.gwu.edu/~adagroup/sigada-website/lawlis.html
as of Mar. 2004.
[9] M. Lehman and L. Belady. Program Evolution: Processes of Software Change. Academic
Press, 1985.
[10] Winston W. Royce. “Managing the Development of Large Software Systems: Concepts and
Techniques.” Proceedings, IEEE WESCON, August 1970: 1–9.
[11] Edward Yourdon. Structured Walkthroughs. Yourdon Press Computing Series. Prentice
Hall, 1988.
[12] “Ada is Good for Real-Time” at “Ada Home: Home of the Brave Ada Programmers,”
http://www.adahome.com/Ammo/Stories/Tartan-Realtime.html. Current as of Feb. 2004.