
TESTING CONCEPTS

Contents:

Chapter 1: Introduction to Software Testing
1.1. Learning Objectives
1.2. Introduction
1.3. What is Testing?
1.4. Approaches to Testing
1.5. Importance of Testing
1.6. Hurdles in Testing
1.7. Defect Distribution
1.8. Testing Fundamentals
1.8.1. Testing Objectives
1.8.2. Test Information Flow
1.8.3. Test Case Design

Chapter 2: SOFTWARE QUALITY ASSURANCE
2.1. Learning Objectives
2.2. Introduction
2.3. Quality Concepts
2.4. Quality of Design
2.5. Quality of Conformance
2.6. Quality Control (QC)
2.7. Quality Assurance (QA)
2.7.1. Cost of Quality
2.8. Software Quality Assurance (SQA)
2.8.1. Background Issues
2.8.2. Software Reviews
2.8.3. Cost impact of Software Defects
2.8.4. Defect Amplification and Removal
2.9. Formal Technical Reviews (FTR)
2.9.1. The Review Meeting
2.9.2. Review reporting and record keeping
2.9.3. Review Guidelines
2.10. Statistical Quality Assurance
2.11. Software Reliability
2.11.1. Measures of Reliability and Availability
2.11.2. Software Safety and Hazard Analysis
2.12. The SQA Plan
2.12.1. The ISO Approach to Quality Assurance System
2.12.2. The ISO 9001 standard

Chapter 3: Program Inspections, Walkthroughs, and Reviews
3.1. Learning Objectives
3.2. Introduction
3.3. Inspections and Walkthroughs
3.4. Code Inspections
3.5. An Error Checklist for Inspections
3.6. Walkthroughs

Chapter 4: Test Case Design
4.1. Learning Objectives
4.2. Introduction
4.3. White Box Testing
4.4. Basis Path Testing
4.4.1. Flow Graph Notation
4.4.2. Cyclomatic Complexity
4.4.3. Deriving Test Cases
4.4.4. Graph Matrices
4.5. Control Structure Testing
4.5.1. Conditions Testing
4.5.2. Data Flow Testing
4.5.3. Loop Testing
4.6. Black Box Testing
4.6.1. Equivalence Partitioning
4.6.2. Boundary Value Analysis
4.6.3. Cause Effect Graphing Techniques
4.6.4. Comparison Testing
4.7. Static Program Analysis
4.7.1. Program Inspections
4.7.2. Mathematical Program Verification
4.7.3. Static Program Analysers
4.8. Automated Testing Tools

Chapter 5: Testing for Specialized Environments
5.1. Learning Objectives
5.2. Introduction
5.3. Testing GUIs
5.4. Testing of Client/Server Architectures
5.5. Testing Documentation and Help Facilities

Chapter 6: SOFTWARE TESTING STRATEGIES
6.1. Learning Objectives
6.2. Introduction
6.2. A Strategic Approach To Software Testing
6.3. VERIFICATION AND VALIDATION
6.4. Organizing for Software Testing
6.5. A Software Testing Strategy
6.6. Criteria for Completion of Testing
6.7. Strategic Issues
6.8. Unit Testing
6.8.1. Unit Test Considerations
6.8.2. Checklist for Interface Tests
6.8.3. Unit Test Procedures
6.9. Integration Testing
6.9.1. Different Integration Strategies
6.9.2. Top-Down Integration
6.9.3. Bottom-Up Integration
6.9.4. Regression Testing
6.9.5. Integration Test Documentation
6.10. Validation Testing
6.10.1. Validation Test Criteria
6.10.2. Configuration Review
6.10.3. Alpha and Beta Testing
6.11. System Testing
6.11.1. Recovery Testing
6.11.2. Security Testing
6.11.3. Stress Testing
6.11.4. Performance Testing
6.11.5. Debugging
6.11.6. The Debugging Process
6.11.7. Debugging Approach
6.12. Summary

Chapter 7: Software Quality Standards
7.1. CMM
7.2. Six Sigma

Chapter 8: Testing FAQ's

Chapter 1: Introduction to Software Testing

1.1. Learning Objectives

You will learn about:


What is Software Testing?
The need for Software Testing
Various approaches to Software Testing
Defect distribution
Software Testing Fundamentals

1.2. Introduction
Software testing is a critical element of software quality assurance and represents the ultimate check of the correctness of the product. A quality product enhances customer confidence in using the product and thereby improves business economics. In other words, a good quality product means zero defects, which is derived from a better quality process in testing.

The definition of testing is not well understood. People often use a totally incorrect definition of the word testing, and this is a primary cause of poor program testing. Examples of such definitions are statements like "Testing is the process of demonstrating that errors are not present", "The purpose of testing is to show that a program performs its intended functions correctly", and "Testing is the process of establishing confidence that a program does what it is supposed to do".

Testing a product means adding value to it, that is, raising the quality or reliability of the program. Raising the reliability of the product means finding and removing errors. Hence one should not test a product to show that it works; rather, one should start with the assumption that the program contains errors and then test the program to find as many of the errors as possible. Thus a more appropriate definition is:

Testing is the process of executing a program with the intent of finding errors.

What is the purpose of testing?

To show the software works: known as demonstration-oriented testing.
To show the software doesn't work: known as destruction-oriented testing.
To minimize the risk of the software not working up to an acceptable level: known as evaluation-oriented testing.
Why do we need to test?

Defects can exist in software, since it is developed by humans, who can make mistakes during development. However, it is the primary duty of a software vendor to ensure that the delivered software does not have defects and that the customer's day-to-day operations are not affected. This can be achieved by rigorously testing the software. The most common origins of software bugs are:

Poor understanding and incomplete requirements
Unrealistic schedules
Fast changes in requirements
Too many assumptions and complacency

Some of the major computer system failures listed below give ample evidence that testing is an important activity of the software quality process.

1. In April of 1999 a software bug caused the failure of a $1.2 billion military satellite launch, the costliest unmanned accident in the history of Cape Canaveral launches. The failure was the latest in a string of launch failures, triggering a complete military and industry review of U.S. space launch programs, including software integration and testing processes. Congressional oversight hearings were requested.
2. On June 4, 1996 the first flight of the European Space Agency's new Ariane 5 rocket failed shortly after launch, resulting in an estimated uninsured loss of half a billion dollars. The failure was reportedly due to the lack of exception handling for a floating-point error in a conversion from a 64-bit floating-point number to a 16-bit signed integer.

3. The computer system of a major online U.S. stock trading service failed during trading hours several times over a period of days in February of 1999, according to nationwide news reports. The problem was reportedly due to bugs in a software upgrade intended to speed online trade confirmations.

4. In November of 1997 the stock of a major health industry company dropped 60% due to reports of failures in computer billing systems, problems with a large database conversion, and inadequate software testing. It was reported that more than $100,000,000 in receivables had to be written off and that multi-million dollar fines were levied on the company by government agencies.

5. Software bugs caused the bank accounts of 823 customers of a major U.S.
bank to be credited with $924,844,208.32 each in May of 1996, according to
newspaper reports. The American Bankers Association claimed it was the
largest such error in banking history. A bank spokesman said the programming
errors were corrected and all funds were recovered.

All the above incidents reiterate the significance of thorough testing of software applications and products before they are put into production. They clearly demonstrate that the cost of rectifying a defect during development is much less than that of rectifying a defect in production.

1.3. What is Testing?

"Testing is an activity in which a system or component is executed under specified conditions; the results are observed and recorded and an evaluation is made of some aspect of the system or component." - IEEE

Executing a system or component is known as dynamic testing. Review, inspection, and verification of documents (requirements, design documents, test plans, etc.), code, and other software work products is known as static testing. Static testing is found to be the most effective and efficient way of testing; successful testing of software demands both dynamic and static testing.

Measurements show that a defect discovered during design that costs $1 to rectify at that stage will cost $1,000 to repair in production. This clearly points out the advantage of early testing.

Testing should start with small measurable units of code, gradually progress towards testing integrated components of the application, and finally be completed with testing at the application level.

Testing verifies the system against its stated and implied requirements, i.e., is it doing what it is supposed to do? It should also check that the system is not doing what it is not supposed to do, that it takes care of boundary conditions, how the system performs in a production-like environment, and how fast and consistently the system responds when data volumes are high.
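To make the IEEE definition concrete, the sketch below shows dynamic testing using Python's unittest framework. The component under test (an absolute_value function) is a hypothetical example, not taken from this text: it is executed under specified conditions, and the results are observed, recorded, and evaluated as pass/fail verdicts.

    import unittest

    def absolute_value(x):
        # Hypothetical component under test.
        return x if x >= 0 else -x

    class AbsoluteValueTest(unittest.TestCase):
        # Each test executes the component under specified conditions
        # and evaluates one aspect of its behaviour.
        def test_positive_input(self):
            self.assertEqual(absolute_value(5), 5)

        def test_negative_input(self):
            self.assertEqual(absolute_value(-5), 5)

        def test_boundary_zero(self):
            # Boundary condition: zero separates the two input classes.
            self.assertEqual(absolute_value(0), 0)

    if __name__ == "__main__":
        unittest.main()  # runs the tests, records and reports the results

Note how even this small example covers normal inputs as well as a boundary condition, in line with the checks listed above.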

1.4. Approaches to Testing


Many approaches have been defined in the literature. The suitability of an approach depends on the type of system being tested. Some of the approaches are given below:

Debugging-oriented:
This approach identifies errors while debugging the program. No distinction is made between testing and debugging.

Demonstration-oriented:
The purpose of testing is to show that the software works. Here, most of the time, the software is demonstrated in a normal sequence/flow, and all the branches may not be tested. This approach mainly serves to satisfy the customer and adds no value to the program.

Destruction-oriented:
The purpose of testing is to show that the software doesn't work.

Evaluation-oriented:
The purpose of testing is to reduce the perceived risk of the software not working to an acceptable level.

Prevention-oriented:
Testing is viewed as a mental discipline that results in low-risk software. It is always better to forecast possible errors and rectify them early.

In general, program testing is more properly viewed as the destructive process of trying to find the errors (whose presence is assumed) in a program. A successful test case is one that furthers progress in this direction by causing the program to fail. However, one also wants to use program testing to establish some degree of confidence that a program does what it is supposed to do and does not do what it is not supposed to do, and this purpose is best achieved by a diligent exploration for errors.

1.5. Importance of Testing


Testing cannot be eliminated from the life cycle, as the end product must be a bug-free and reliable one. Testing is important because:

Testing is a critical element of software quality assurance
Post-release removal of defects is the most expensive
A significant portion of life cycle effort is expended on testing

In a typical service-oriented project, about 20-40% of the project effort is spent on testing. It is much more in the case of human-rated software. For example, at Microsoft the tester-to-developer ratio is 1:1, whereas at the NASA shuttle development center (SEI Level 5) the ratio is 7:1. This shows how integral testing is to quality assurance.

1.6. Hurdles in Testing


As in many other development projects, testing is not free from hurdles. Some of the hurdles normally encountered are:

It is usually a late activity in the project life cycle
There is no concrete output, so it is difficult to measure the value addition
Lack of historical data
Recognition of its importance is relatively low
It is politically damaging, as you are challenging the developer
Delivery commitments
Excessive optimism that the software always works correctly

1.7. Defect Distribution


In a typical project life cycle, testing is a late activity. When the product is tested, the defects may be due to many reasons: a programming error, a defect in design, or a defect introduced at any other stage in the life cycle. The overall defect distribution is shown in fig 1.1.

Fig 1.1: Software defect distribution - Requirements 56%, Design 27%, Other 10%, Code 7%

1.8. Testing Fundamentals


Before understanding the process of testing software, it is necessary to learn the basic
principles of testing.

1.8.1. Testing Objectives

Testing is a process of executing a program with the intent of finding an error.

A good test is one that has a high probability of finding an as yet undiscovered error.

A successful test is one that uncovers an as yet undiscovered error.

The objective is to design tests that systematically uncover different classes of errors and do so with a minimum amount of time and effort.

Secondary benefits include:

Demonstrating that software functions appear to be working according to specification.
Demonstrating that performance requirements appear to have been met.
Data collected during testing provides a good indication of software reliability and some indication of software quality.

Testing cannot show the absence of defects; it can only show that software defects are present.

1.8.2. Test Information Flow


A typical test information flow is shown in fig 1.2.

Fig 1.2: Test information flow in a typical software test life cycle

In the above figure:

The software configuration includes a Software Requirements Specification, a Design Specification, and source code.

A test configuration includes a Test Plan and Procedures, test cases, and testing tools.

It is difficult to predict the time needed to debug the code; hence debugging is difficult to schedule.

1.8.3. Test Case Design

Some points to be noted during test case design:

Test case design can be as difficult as the initial design itself.
A test can check whether a component conforms to its specification - black box testing.
A test can check whether a component conforms to its design - white box testing.
Testing cannot prove correctness, as not all execution paths can be tested.

Consider the example shown in fig 1.3: a program with the structure illustrated there (with fewer than 100 lines of Pascal code) has about 100,000,000,000,000 possible paths. If one attempted to test these at a rate of 1,000 tests per second, it would take 3,170 years to test all paths. This shows that exhaustive testing of software is not possible.
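The arithmetic behind this estimate is easy to check; a quick sketch in Python, using the figures quoted above:

    # Back-of-the-envelope check of the exhaustive-testing estimate.
    paths = 10 ** 14           # possible execution paths
    rate = 1000                # tests executed per second
    seconds = paths / rate
    years = seconds / (60 * 60 * 24 * 365)
    print(round(years))        # about 3171, matching the text's 3,170 years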

Questions:

1. What is software testing? Explain the purpose of testing.

2. Explain the origin of the defect distribution in a typical software development life cycle.

Chapter 2: SOFTWARE QUALITY ASSURANCE

2.1. Learning Objectives

You will learn about:

Basic principles of software quality
Software Quality Assurance and SQA activities
Software reliability

2.2. Introduction
Quality is defined as a characteristic or attribute of something. As an attribute of an item, quality refers to measurable characteristics, things we are able to compare to known standards such as length, color, electrical properties, malleability, and so on. However, software, largely an intellectual entity, is more challenging to characterize than physical objects.

Quality of design refers to the characteristics that designers specify for an item. The grade of materials, tolerances, and performance specifications all contribute to the quality of design.

Quality of conformance is the degree to which the design specifications are followed during manufacturing. The greater the degree of conformance, the higher the level of quality of conformance.

Software Quality Assurance encompasses:

A quality management approach
Effective software engineering technology
Formal technical reviews
A multi-tiered testing strategy
Control of software documentation and the changes made to it
A procedure to assure compliance with software development standards
Measurement and reporting mechanisms

Software quality is achieved as shown in figure 2.1.

Figure 2.1: Achieving Software Quality - software engineering methods, formal technical reviews, measurement, standards and procedures, SCM, testing, and SQA together produce quality

2.3. Quality Concepts

What are the quality concepts?

Quality
Quality control
Quality assurance
Cost of quality

The American Heritage Dictionary defines quality as "a characteristic or attribute of something". As an attribute of an item, quality refers to measurable characteristics, things we are able to compare to known standards such as length, color, electrical properties, malleability, and so on. However, software, largely an intellectual entity, is more challenging to characterize than physical objects.

Nevertheless, measures of a program's characteristics do exist. These properties include:
1. Cyclomatic complexity
2. Cohesion
3. Number of function points
4. Lines of code

When we examine an item based on its measurable characteristics, two kinds of quality
may be encountered:
Quality of design
Quality of conformance

2.4. Quality of Design

Quality of design refers to the characteristics that designers specify for an item. The grade of materials, tolerances, and performance specifications all contribute to quality of design. As higher-grade materials are used and tighter tolerances and greater levels of performance are specified, the design quality of a product increases, provided the product is manufactured according to specifications.

2.5. Quality of Conformance

Quality of conformance is the degree to which the design specifications are followed during manufacturing. Again, the greater the degree of conformance, the higher the level of quality of conformance.

In software development, quality of design encompasses requirements, specifications, and the design of the system. Quality of conformance is an issue focused primarily on implementation. If the implementation follows the design and the resulting system meets its requirements and performance goals, conformance quality is high.

2.6. Quality Control (QC)
QC is the series of inspections, reviews, and tests used throughout the development cycle to ensure that each work product meets the requirements placed upon it. QC includes a feedback loop to the process that created the work product. The combination of measurement and feedback allows us to tune the process when the work products created fail to meet their specifications. This approach views QC as part of the manufacturing process. QC activities may be fully automated, manual, or a combination of automated tools and human interaction. An essential concept of QC is that all work products have defined and measurable specifications to which we may compare the outputs of each process; the feedback loop is essential to minimize the defects produced.

2.7. Quality Assurance (QA)


QA consists of the auditing and reporting functions of management. The goal of quality assurance is to provide management with the data necessary to be informed about product quality, thereby gaining insight and confidence that product quality is meeting its goals. Of course, if the data provided through QA identify problems, it is management's responsibility to address the problems and apply the necessary resources to resolve quality issues.

2.7.1. Cost of Quality


Cost of quality includes all costs incurred in the pursuit of quality or in performing quality-related activities. Cost of quality studies are conducted to provide a baseline for the current cost of quality, to identify opportunities for reducing the cost of quality, and to provide a normalized basis of comparison. The basis of normalization is usually money. Once we have normalized quality costs on a money basis, we have the necessary data to evaluate where the opportunities lie to improve our processes; furthermore, we can evaluate the effect of changes in money-based terms.

Quality costs may be divided into costs associated with:
Prevention
Appraisal
Failure

Prevention costs include:
Quality planning
Formal technical reviews
Test equipment
Training

Appraisal costs include activities to gain insight into product condition the "first time through" each process. Examples of appraisal costs include:
In-process and inter-process inspection
Equipment calibration and maintenance
Testing

Failure costs are costs that would disappear if no defects appeared before shipping a product to customers. Failure costs may be subdivided into internal and external failure costs. Internal failure costs are costs incurred when we detect an error in our product prior to shipment. Internal failure costs include:
Rework
Repair
Failure mode analysis

External failure costs are the costs associated with defects found after the product has been shipped to the customer. Examples of external failure costs are:
1. Complaint resolution
2. Product return and replacement
3. Helpline support
4. Warranty work

2.8. Software Quality Assurance (SQA)

How do we define quality?

Quality is conformance to explicitly stated functional and performance requirements, explicitly documented development standards, and implicit characteristics that are expected of all professionally developed software.

The above definition emphasizes three important points:

1. Software requirements are the foundation from which quality is measured. Lack of conformance to requirements is lack of quality.
2. Specified standards define a set of development criteria that guide the manner in which software is engineered. If the criteria are not followed, lack of quality will almost surely result.
3. There is a set of implicit requirements that often goes unmentioned (e.g., the desire for good maintainability). If software conforms to its explicit requirements but fails to meet implicit requirements, software quality is questionable.

2.8.1. Background Issues

QA is an essential activity for any business that produces products to be used by others. The SQA group serves as the customer's in-house representative; that is, the people who perform SQA must look at the software from the customer's point of view. The SQA group attempts to answer the questions below and hence ensure the quality of the software:

1. Has software development been conducted according to pre-established standards?
2. Have technical disciplines properly performed their roles as part of the SQA activity?
SQA Activities

The SQA plan is interpreted as shown in figure 2.2. SQA comprises a variety of tasks associated with two different constituencies:

1. The software engineers, who do technical work such as:
Performing quality assurance by applying technical methods
Conducting formal technical reviews
Performing well-planned software testing

2. The SQA group, which has responsibility for:
Quality assurance planning and oversight
Record keeping
Analysis and reporting

QA activities performed by the software engineering team and the SQA group are governed by a plan that specifies:
Evaluations to be performed
Audits and reviews to be performed
Standards that are applicable to the project
Procedures for error reporting and tracking
Documents to be produced by the SQA group
The amount of feedback provided to the software project team

Figure 2.2: Software Quality Assurance Plan - the activities of both the software engineers and the SQA team are governed by the SQA plan

What are the activities performed by the SQA group and the software engineering team?

Prepare an SQA plan for the project
Participate in the development of the project's software description
Review software engineering activities to verify compliance with the defined software process
Audit designated software work products to verify compliance with those defined as part of the software process
Ensure that deviations in software work and work products are documented and handled according to a documented procedure
Record any noncompliance and report it to senior management

2.8.2. Software Reviews


Software reviews are a "filter" for the software engineering process. That is, reviews are applied at various points during software development and serve to uncover errors that can then be removed. Software reviews serve to "purify" the software work products that result from analysis, design, and coding.

Any review is a way of using the diversity of a group of people to:
1. Point out needed improvements in the product of a single person or team;
2. Confirm those parts of a product in which improvement is either not desired or not needed;
3. Achieve technical work of more uniform, or at least more predictable, quality than can be achieved without reviews, in order to make technical work more manageable.

There are many different types of reviews that can be conducted as part of software engineering:
1. An informal meeting, if technical problems are discussed, is a form of review.
2. A formal presentation of a software design to an audience of customers, management, and technical staff is a form of review.
3. A formal technical review is the most effective filter from a quality assurance standpoint. Conducted by software engineers for software engineers, the FTR is an effective means for improving software quality.

2.8.3. Cost impact of Software Defects


To illustrate the cost impact of early error detection, we consider a series of relative costs based on actual cost data collected for large software projects. Assume that an error uncovered during design will cost 1.0 monetary unit to correct. Relative to this cost, the same error uncovered just before testing commences will cost 6.5 units; during testing, 15 units; and after release, between 60 and 100 units.
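These relative costs make it easy to estimate the impact of where defects are caught. The sketch below, in Python, is a minimal illustration: only the unit costs come from the text, while the defect counts and the 75-unit midpoint for post-release correction are assumptions.

    # Relative cost to correct an error, by the phase in which it is found
    # (1 unit = cost of correction during design), per the text.
    # "after_release" is quoted as 60-100 units; 75 is a midpoint assumption.
    RELATIVE_COST = {"design": 1.0, "before_test": 6.5,
                     "during_test": 15.0, "after_release": 75.0}

    # Hypothetical distribution of 100 defects across discovery phases.
    found = {"design": 40, "before_test": 30, "during_test": 25, "after_release": 5}

    total = sum(found[phase] * RELATIVE_COST[phase] for phase in found)
    print(total)  # 985.0 units, versus 100.0 if every defect were caught in design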

2.8.4. Defect Amplification and Removal


A defect amplification model can be used to illustrate the generation and detection of errors during the preliminary design, detail design, and coding steps of the software engineering process. The model is illustrated schematically in Figure 2.3. A box represents a software development step. During the step, errors may be inadvertently generated. A review may fail to uncover newly generated errors as well as errors from previous steps, resulting in some number of errors that are passed through. In some cases, errors passed through from previous steps are amplified (amplification factor, x) by current work. The box subdivisions represent each of these characteristics along with the percent efficiency for detecting errors, a function of the thoroughness of the review.

Figure 2.3: Defect Amplification Model - each development step receives errors from the previous step (some passed through, some amplified 1:x), adds newly generated errors, and a review or test detects a percentage of the total according to its error-detection efficiency; the remainder pass to the next step

Figure 2.4 illustrates a hypothetical example of defect amplification for a software development process in which no reviews are conducted. As shown in the figure, each test step is assumed to uncover and correct fifty percent of all incoming errors without introducing new errors (an optimistic assumption). Ten preliminary design errors are amplified to 94 errors before testing commences, and twelve latent defects are released to the field. Figure 2.5 considers the same conditions except that design and code reviews are conducted as part of each development step. In this case, the ten initial preliminary design errors are amplified to only 24 errors before testing commences, and only three latent defects exist. By recalling the relative costs associated with the discovery and correction of errors, the overall costs (with and without review) can be established for our hypothetical example.

To conduct reviews, a developer must expend time and effort, and the development organization must spend money. However, the results of the preceding example leave little doubt that we have encountered a "pay now or pay much more later" syndrome. Formal technical reviews (for design and other technical activities) provide a demonstrable cost benefit, and they should be conducted.
Figure 2.4: Defect Amplification - No Reviews (ten preliminary design errors grow to 94 errors entering integration testing; twelve latent errors are released to the field)


Figure 2.5: Defect Amplification - Reviews Conducted (with design and code reviews, the ten preliminary design errors grow to only 24 errors entering integration testing; three latent errors remain)
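The amplification model itself is straightforward to simulate. The sketch below, in Python, follows the structure of Figure 2.3; the pass-through fractions, amplification factors, and newly generated error counts are illustrative assumptions, not the exact values behind Figures 2.4 and 2.5.

    def step(incoming, passed_frac, amplification, new_errors, detection_eff):
        # One development step of the defect amplification model:
        #   incoming       errors arriving from the previous step
        #   passed_frac    fraction of incoming errors amplified by current work
        #   amplification  amplification factor x for those errors
        #   new_errors     errors newly generated during this step
        #   detection_eff  fraction of all errors the step's review/test detects
        amplified = incoming * passed_frac * amplification
        passed_through = incoming * (1 - passed_frac)
        total = passed_through + amplified + new_errors
        return total * (1 - detection_eff)  # errors escaping to the next step

    # Illustrative pipeline with no reviews: three development steps,
    # then three test steps that each remove 50% of incoming errors.
    errors = step(0, 0.0, 1.0, 10, 0.0)        # preliminary design
    errors = step(errors, 0.5, 1.5, 25, 0.0)   # detail design
    errors = step(errors, 0.5, 3.0, 25, 0.0)   # code/unit
    for _ in range(3):                         # integration, validation, system test
        errors = step(errors, 0.0, 1.0, 0, 0.5)
    print(errors)                              # latent errors released to the field

Raising detection_eff in the early steps (i.e., conducting reviews) shrinks the error population before amplification can act on it, which is exactly the effect the reviews-conducted case shows.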

2.9. Formal Technical Reviews (FTR)


FTR is an SQA activity performed by software engineers. The objectives of the FTR are:

To uncover errors in function, logic, or implementation for any representation of the software
To verify that the software under review meets its requirements
To ensure that the software has been represented according to predefined standards
To achieve software that is developed in a uniform manner
To make projects more manageable

In addition, the FTR serves as a training ground, enabling junior engineers to observe different approaches to software analysis, design, and implementation. The FTR also serves to promote backup and continuity, because a number of people become familiar with parts of the software that they might not otherwise have seen.

The FTR is actually a class of reviews that includes walkthroughs, inspections, round-robin reviews, and other small-group technical assessments of software. Each FTR is conducted as a meeting and will be successful only if it is properly planned, controlled, and attended. In the paragraphs that follow, guidelines similar to those for a walkthrough are presented as a representative formal technical review.

2.9.1. The Review Meeting


The focus of the FTR is on a work product - a component of the software. At the end of the review, all attendees of the FTR must decide whether to:
1. Accept the work product without further modification;
2. Reject the work product due to severe errors (once corrected, another review must be performed); or
3. Accept the work product provisionally (minor errors have been encountered and must be corrected, but no additional review will be required).
Once the decision is made, all FTR attendees complete a sign-off indicating their participation in the review and their concurrence with the review team's findings.

2.9.2. Review reporting and record keeping


The review summary report is typically a single-page form. It becomes part of the project historical record and may be distributed to the project leader and other interested parties. The review issues list serves two purposes:
1. To identify problem areas within the product; and
2. To serve as an action-item checklist that guides the producer as corrections are made. An issues list is normally attached to the summary report.

It is important to establish a follow-up procedure to ensure that items on the issues list have been properly corrected. Unless this is done, it is possible that issues raised will "fall between the cracks". One approach is to assign responsibility for follow-up to the review leader. A more formal approach assigns responsibility to an independent SQA group.

2.9.3. Review Guidelines


The following represents a minimum set of guidelines for formal technical reviews:
Review the product, not the producer
Set an agenda and maintain it
Limit debate and rebuttal
Enunciate problem areas, but don't attempt to solve every problem noted
Take written notes
Limit the number of participants and insist upon advance preparation
Develop a checklist for each work product that is likely to be reviewed
Allocate resources and time schedules for FTRs
Conduct meaningful training for all reviewers
Review your early reviews

2.10. Statistical Quality Assurance


Statistical quality assurance reflects a growing trend throughout industry to become more quantitative about quality. For software, statistical quality assurance implies the following steps:

Information about software defects is collected and categorized
An attempt is made to trace each defect to its underlying cause
Using the Pareto principle (80% of the defects can be traced to 20% of all possible causes), isolate the 20% (the "vital few")
Once the vital few causes have been identified, move to correct the problems that have caused the defects

This relatively simple concept represents an important step toward the creation of an adaptive software engineering process in which changes are made to improve those elements of the process that introduce errors. To illustrate the process, assume that a software development organization collects information on defects for a period of one year. Some errors are uncovered as the software is being developed; other defects are encountered after the software has been released to its end users. Although hundreds of errors are uncovered, all can be tracked to one of the following causes:

Incomplete or erroneous specification (IES)
Misinterpretation of customer communication (MCC)
Intentional deviation from specification (IDS)
Violation of programming standards (VPS)
Error in data representation (EDR)
Inconsistent module interface (IMI)
Error in design logic (EDL)
Incomplete or erroneous testing (IET)
Inaccurate or incomplete documentation (IID)
Error in programming language translation of design (PLT)
Ambiguous or inconsistent human-computer interface (HCI)
Miscellaneous (MIS)

To apply statistical SQA, the table below is built. Once the vital few causes are determined, the software development organization can begin corrective action. After analysis, design, coding, testing, and release, the following data are gathered:

Ei = the total number of errors uncovered during the ith step in the software engineering process
Si = the number of serious errors
Mi = the number of moderate errors
Ti = the number of minor errors
PS = the size of the product (LOC, design statements, pages of documentation) at the ith step
Ws, Wm, Wt = weighting factors for serious, moderate, and trivial errors, where the recommended values are Ws = 10, Wm = 3, Wt = 1

The weighting factors for each phase should become larger as development progresses. This rewards an organization that finds errors early.

At each step in the software engineering process, a phase index, PIi, is computed:

PIi = Ws(Si/Ei) + Wm(Mi/Ei) + Wt(Ti/Ei)

The error index, EI, is computed by calculating the cumulative effect of each PIi, weighting errors encountered later in the software engineering process more heavily than those encountered earlier:

EI = sum of (i x PIi)/PS = (PI1 + 2PI2 + 3PI3 + ... + i x PIi)/PS

The error index can be used in conjunction with the information collected in the table below to develop an overall indication of improvement in software quality.
Data collection for statistical SQA:

Error     Total    %    Serious    %    Moderate    %    Minor    %
IES         205   22         34   27         68   18      103   24
MCC         156   17         12    9         68   18       76   17
IDS          48    5          1    1         24    6       23    5
VPS          25    3          0    0         15    4       10    2
EDR         130   14         26   20         68   18       36    8
IMI          58    6          9    7         18    5       31    7
EDL          45    5         14   11         12    3       19    4
IET          95   10         12    9         35    9       48   11
IID          36    4          2    2         20    5       14    3
PLT          60    6         15   12         19    5       26    6
HCI          28    3          3    2         17    4        8    2
MIS          56    6          0    0         15    4       41    9
TOTALS      942  100        128  100        379  100      435  100
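A minimal sketch in Python of the phase index and error index computations follows; the per-phase counts and the product size are hypothetical, and Ei is taken to be Si + Mi + Ti.

    # Weighting factors recommended in the text.
    WS, WM, WT = 10, 3, 1

    def phase_index(serious, moderate, minor):
        # PIi = Ws(Si/Ei) + Wm(Mi/Ei) + Wt(Ti/Ei), assuming Ei = Si + Mi + Ti.
        total = serious + moderate + minor
        return (WS * serious + WM * moderate + WT * minor) / total

    # Hypothetical (Si, Mi, Ti) counts for four successive phases.
    phases = [(12, 30, 40), (8, 20, 35), (5, 12, 20), (2, 6, 10)]
    product_size = 25000  # PS, e.g., lines of code

    # EI = (1*PI1 + 2*PI2 + 3*PI3 + ...)/PS: later phases weigh more heavily.
    ei = sum(i * phase_index(*p) for i, p in enumerate(phases, start=1)) / product_size
    print(round(ei, 6))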

2.11. Software Reliability


Software reliability, unlike many other quality factors, can be measured directly and estimated using historical and developmental data. Software reliability is defined in statistical terms as the "probability of failure-free operation of a computer program in a specified environment for a specified time". To illustrate, program X is estimated to have a reliability of 0.96 over 8 elapsed processing hours. In other words, if program X were to be executed 100 times and required 8 hours of elapsed processing time each time, it is likely to operate correctly 96 times out of 100.

2.11.1. Measures of Reliability and Availability
In a computer-based system, a simple measure of reliability is mean time between failures (MTBF), where

MTBF = MTTF + MTTR

The acronyms MTTF and MTTR stand for mean time to failure and mean time to repair, respectively.

In addition to a reliability measure, we must develop a measure of availability. Software availability is the probability that a program is operating according to requirements at a given point in time, and is defined as:

Availability = MTTF / (MTTF + MTTR) x 100%

The MTBF reliability measure is equally sensitive to MTTF and MTTR. The availability measure is somewhat more sensitive to MTTR, an indirect measure of the maintainability of the software.
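A quick worked example of these measures in Python; the MTTF and MTTR figures are hypothetical.

    mttf = 680.0  # mean time to failure, in hours (hypothetical)
    mttr = 20.0   # mean time to repair, in hours (hypothetical)

    mtbf = mttf + mttr                         # MTBF = MTTF + MTTR
    availability = mttf / (mttf + mttr) * 100  # availability, in percent

    print(mtbf)                    # 700.0 hours
    print(round(availability, 2))  # 97.14 (%)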

2.11.2. Software Safety and Hazard Analysis


Software safety and hazard analysis are SQA activities that focus on the identification and assessment of potential hazards that may impact software negatively and cause an entire system to fail. If hazards can be identified early in the software engineering process, software design features can be specified that will either eliminate or control them.

A modeling and analysis process is conducted as part of safety analysis. Initially, hazards are identified and categorized by criticality and risk. Once hazards are identified and analyzed, safety-related requirements can be specified for the software; i.e., the specification can contain a list of undesirable events and the desired system responses to these events. The role of software in managing undesirable events is then indicated.

Although software reliability and software safety are closely related to one another, it is important to understand the subtle difference between them. Software reliability uses statistical analysis to determine the likelihood that a software failure will occur; however, the occurrence of a failure does not necessarily result in a hazard or mishap. Software safety examines the ways in which failures result in conditions that can lead to a mishap. That is, failures are not considered in a vacuum, but are evaluated in the context of an entire computer-based system.

2.12. The SQA Plan

The SQA plan provides a road map for instituting software quality assurance. Developed by the SQA group and the project team, the plan serves as a template for the SQA activities instituted for each software project. ANSI/IEEE Standards 730-1984 and 983-1986 define the outline of an SQA plan as shown below.
I. Purpose of Plan
II. References
III. Management
1. Organization
2. Tasks
3. Responsibilities
IV. Documentation
1. Purpose
2. Required software engineering documents
3. Other documents
V. Standards, Practices, and Conventions
1. Purpose
2. Conventions
VI. Reviews and Audits
1. Purpose
2. Review requirements
a. Software requirements reviews
b. Design reviews
c. Software V&V reviews
d. Functional audits
e. Physical audits
f. In-process audits
g. Management reviews
VII. Test
VIII. Problem Reporting and Corrective Action
IX. Tools, Techniques, and Methodologies
X. Code Control
XI. Media Control
XII. Supplier Control
XIII. Records Collection, Maintenance, and Retention
XIV. Training
XV. Risk Management

2.12.1. The ISO Approach to Quality Assurance System


ISO 9000 describes the elements of a quality assurance system in general terms. These elements include the organizational structure, procedures, processes, and resources needed to implement quality planning, quality control, quality assurance, and quality improvement. However, ISO 9000 does not describe how an organization should implement these quality system elements. Consequently, the challenge lies in designing and implementing a quality assurance system that meets the standard and fits the company's products, services, and culture.

2.12.2. The ISO 9001 standard


ISO 9001 is the quality assurance standard that applies to software engineering. The standard contains 20 requirements that must be present for an effective quality assurance system. Because the ISO 9001 standard is applicable to all engineering disciplines, a special set of ISO guidelines has been developed to help interpret the standard for use in the software process.

The 20 requirements delineated by ISO 9001 address the following topics:

1. Management responsibility
2. Quality system
3. Contract review
4. Design control
5. Document and data control
6. Purchasing
7. Control of customer-supplied product
8. Product identification and traceability
9. Process control
10. Inspection and testing
11. Control of inspection, measuring, and test equipment
12. Inspection and test status
13. Control of nonconforming product
14. Corrective and preventive action
15. Handling, storage, packaging, preservation, and delivery
16. Control of quality records
17. Internal quality audits
18. Training
19. Servicing
20. Statistical techniques
In order for a software organization to become registered to ISO 9001, it must establish policies and procedures to address each of the requirements noted above and then be able to demonstrate that these policies and procedures are being followed.
Questions:
Quality and reliability are related concepts but are fundamentally different in a number of ways. Discuss them.
1. Can a program be correct and still not be reliable? Explain.
2. Can a program be correct and still not exhibit good quality? Explain.
3. Explain in more detail the review techniques adopted in quality assurance.

Chapter 3: Program Inspections, Walkthroughs, and Reviews

3.1. Learning Objectives

You will learn about:

What static testing is and its importance in software testing
Guidelines to be followed during static testing
The process involved in inspections and walkthroughs
Various checklists to be followed while handling errors in software testing
Review techniques
3.2. Introduction

The majority of the programming community once worked under the assumption that programs are written solely for machine execution and are not intended to be read by people, so that the only way to test a program is by executing it on a machine. Weinberg built a convincing argument for why programs should be read by people, and indicated that this could be an effective error-detection process.

Experience has shown that human testing techniques are quite effective in finding errors, so much so that one or more of them should be employed in every programming project. The methods discussed in this chapter are intended to be applied between the time that the program is coded and the time that computer-based testing begins. Two observations motivate this:

It is generally recognized that the earlier errors are found, the lower the cost of correcting them and the higher the probability of correcting them correctly.
Programmers seem to experience a psychological change when computer-based testing commences.

3.3. Inspections and Walkthroughs

Code inspections and walkthroughs are the two primary human testing methods. They involve the reading or visual inspection of a program by a team of people, and both methods involve some preparatory work by the participants. The work normally takes place in a meeting, typically known as a "meeting of the minds", a conference held by the participants. The objective of the meeting is to find errors but not to find solutions to the errors (i.e., to test but not to debug).

What is the process involved in inspections and walkthroughs?

The process is performed by a group of people (three or four), only one of whom is the author of the program. Hence the program is essentially being tested by people other than the author, which is in consonance with the testing principle stating that an individual is usually ineffective in testing his or her own program. Inspections and walkthroughs are far more effective than desk checking (the process of a programmer reading his or her own program before testing it) because people other than the program's author are involved in the process. These processes also appear to result in lower debugging (error correction) costs, since, when they find an error, the precise nature of the error is usually located. Also, they expose a batch of errors, thus allowing the errors to be corrected later en masse. Computer-based testing, on the other hand, normally exposes only a symptom of the error, and errors are usually detected and corrected one by one.
Some observations:

Experience with these methods has found them to be effective in finding from 30% to 70% of the logic design and coding errors in typical programs. They are not, however, effective in detecting high-level design errors, such as errors made in the requirements analysis process.
Human processes find only the easy errors (those that would be trivial to find with computer-based testing), while the difficult, obscure, or tricky errors can only be found by computer-based testing.
Inspections/walkthroughs and computer-based testing are complementary; error-detection efficiency will suffer if one or the other is not present.
These processes are invaluable for testing modifications to programs, because modifying an existing program is a more error-prone process (in terms of errors per statement written) than writing a new program.

3.4. Code Inspections

An inspection team usually consists of four people. One of the four plays the role of moderator. The moderator is expected to be a competent programmer, but he or she is not the author of the program and need not be acquainted with its details. The duties of the moderator include:

Distributing materials for, and scheduling, inspection sessions
Leading the session
Recording all errors found
Ensuring that the errors are subsequently corrected

Hence the moderator may be called a quality-control engineer. The remaining members usually consist of the program's designer and a test specialist.

The general procedure is that the moderator distributes the program's listing and design specification to the other participants well in advance of the inspection session. The participants are expected to familiarize themselves with the material prior to the session. During the inspection session, two main activities occur:

1. The programmer is requested to narrate, statement by statement, the logic of the program. During the discourse, questions are raised and pursued to determine if errors exist. Experience has shown that many of the errors discovered are actually found by the programmer, rather than the other team members, during the narration. In other words, the simple act of reading one's program aloud to an audience seems to be a remarkably effective error-detection technique.

2. The program is analyzed with respect to a checklist of historically common programming errors (such a checklist is discussed in the next section).

It is the moderator's responsibility to ensure the smooth conduct of the proceedings and to see that the participants focus their attention on finding errors, not correcting them. After the session, the programmer is given a list of the errors found. The list of errors is also analyzed, categorized, and used to refine the error checklist to improve the effectiveness of future inspections.

The main benefits of this method are:

Early identification of errors
The programmer usually receives feedback concerning his or her programming style and choice of algorithms and programming techniques
Other participants gain in a similar way by being exposed to another programmer's errors and programming style
The inspection process is a way of identifying early the most error-prone sections of the program, thus allowing one to focus more attention on these sections during the computer-based testing processes

3.5. An Error Checklist for Inspections

An important part of the inspection process is the use of a checklist to examine the program for common errors. The checklist is largely language independent, as most of the errors can occur with any programming language.

Data-Reference Errors
1. Is a variable referenced whose value is unset or uninitialized? This is probably the
most frequent programming error; it occurs in a wide variety of circumstances.
2. For all array references, is each subscript value within the defined bounds of the
corresponding dimension?
3. For all array references, does each subscript have an integer value? This is not
necessarily an error in all languages, but it is a dangerous practice.
4. For all references through pointer or reference variables, is the referenced storage currently allocated? This is known as the "dangling reference" problem. It occurs in situations where the lifetime of a pointer is greater than the lifetime of the referenced storage.
5. Are there any explicit or implicit addressing problems if, on the machine being
used, the units of storage allocation are smaller than the units of storage
addressability?
6. If a data structure is referenced in multiple procedures or subroutines, is the
structure defined identically in each procedure?
7. When indexing into a string, are the limits of the string exceeded?
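
As an illustration of item 4, here is a minimal C sketch of our own (the checklist itself is language independent): the function returns the address of a local variable, so the pointer outlives the storage it references.

    #include <stdio.h>

    /* Dangling reference sketch: the pointer's lifetime exceeds the
       lifetime of the storage it points to. */
    int *bad_pointer(void)
    {
        int local = 42;
        return &local;       /* 'local' ceases to exist on return */
    }

    int main(void)
    {
        int *p = bad_pointer();
        printf("%d\n", *p);  /* undefined behavior: storage no longer allocated */
        return 0;
    }

An inspection (or a modern compiler warning) should flag the return statement before the program is ever run.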

Data-Declaration Errors

1. Have all variables been explicitly declared? A failure to do so is not necessarily an


error, but it is a common source of trouble.
2. If all attributes of a variable are not explicitly stated in the declaration, are the
defaults well understood?

3. Where a variable is initialized in a declarative statement, is it properly initialized?
4. Is each variable assigned the correct length, type, and storage class?
5. Is the initialization of a variable consistent with its storage type?

Computation Errors

1. Are there any computations using variables having inconsistent (e.g.,
nonarithmetic) data types?
2. Are there any mixed mode computations?
3. Are there any computations using variables having the same data type but
different lengths?
4. Is the target variable of an assignment smaller than the right-hand expression?
5. Is an overflow or underflow exception possible during the computation of an
expression? That is, the end result may appear to have a valid value, but an
intermediate result might be too big or too small for the machine's data
representations.
6. Is it possible for the divisor in a division operation to be zero?
7. Where applicable, can the value of a variable go outside its meaningful range?
8. Are there any invalid uses of integer arithmetic, particularly division? For
example, if I is an integer variable, whether the expression 2*I/2 is equal to I
depends on whether I has an odd or an even value and whether the multiplication
or division is performed first.
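
A brief C sketch of our own makes item 8 concrete: with integer arithmetic, the result depends on where the truncating division falls.

    #include <stdio.h>

    int main(void)
    {
        int i = 7;
        printf("%d\n", 2 * i / 2);    /* 7: multiply first, nothing is lost  */
        printf("%d\n", 2 * (i / 2));  /* 6: divide first, 7/2 truncates to 3 */
        return 0;
    }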

Comparison Errors
1. Are there any comparisons between variables having inconsistent data types (e.g.
comparing a character string to an address)?
2. Are there any mixed-mode comparisons or comparisons between variables of
different lengths? If so, ensure that the conversion rules are well understood.
3. Does each Boolean expression state what it is supposed to state? Programmers
often make mistakes when writing logical expressions involving "and", "or", and
"not".

4. Are the operands of a Boolean operator Boolean? Have comparison and Boolean
operators been erroneously mixed together?

Control-Flow Errors
1. If the program contains a multiway branch (e.g. a computed GO TO in Fortran),
can the index variable ever exceed the number of branch possibilities? For
example, in the Fortran statement,

GOTO(200,300,400), I
will I always have the value 1, 2, or 3?
2. Will every loop eventually terminate? Devise an informal proof or argument
showing that each loop will terminate.
3. Will the program, module, or subroutine eventually terminate?
4. Is it possible that, because of the conditions upon entry, a loop will never execute?
If so, does this represent an oversight? For instance, for loops headed by the
following statements:
DO WHILE(NOTFOUND)
DO I=X TO Z
What happens if NOTFOUND is initially false or if X is greater than Z?

5. Are there any non-exhaustive decisions? For instance, if an input parameter's
expected values are 1, 2, or 3, does the logic assume that it must be 3 if it is not 1
or 2? If so, is the assumption valid?
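
The following short C sketch (ours, with hypothetical names) shows this non-exhaustive decision: the final else silently assumes the remaining case.

    #include <stdio.h>

    /* The final 'else' assumes mode must be 3 whenever it is not 1 or 2. */
    static void classify(int mode)
    {
        if (mode == 1)
            printf("mode one\n");
        else if (mode == 2)
            printf("mode two\n");
        else
            printf("mode three\n");   /* also reached for 0, 4, -1, ... */
    }

    int main(void)
    {
        classify(4);   /* prints "mode three" -- is that assumption valid? */
        return 0;
    }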

Interface Errors

1. Does the number of parameters received by this module equal the number of
arguments sent by each of the calling modules? Also, is the order correct?
2. Do the attributes (e.g. type and size) of each parameter match the attributes of
each corresponding argument?

3. Does the number of arguments transmitted by this module to another module
equal the number of parameters expected by that module?
4. Do the attributes of each argument transmitted to another module match the
attributes of the corresponding parameter in that module?
5. If built-in functions are invoked, are the number, attributes, and order of the
arguments correct?
6. Does the subroutine alter a parameter that is intended to be only an input value?

Input/Output Errors

1. If files are explicitly declared, are their attributes correct?


2. Are the attributes on the OPEN statement correct?
3. Is the size of the I/O area in storage equal to the record size?
4. Have all files been opened before use?
5. Are end-of-file conditions detected and handled correctly?
6. Are there spelling or grammatical errors in any text that is printed or displayed by
the program?

3.5. Walkthroughs

The code walkthrough, like the inspection, is a set of procedures and error-detection
techniques for group code reading. It shares much in common with the inspection
process, but the procedures are slightly different, and a different error-detection technique
is employed.

The walkthrough is an uninterrupted meeting of one to two hours in duration. The
walkthrough team consists of three to five people who play the roles of moderator, secretary
(a person who records all errors found), tester, and programmer. It is suggested to have
other participants such as:
A highly experienced programmer,

A programming-language expert,
A new programmer (to give a fresh, unbiased outlook)
The person who will eventually maintain the program,
Someone from a different project, and
Someone from the same programming team as the programmer.

The initial procedure is identical to that of the inspection process: the participants are
given the materials several days in advance to allow them to study the program.
However, the procedure in the meeting is different. Rather than simply reading the
program or using error checklists, the participants "play computer." The person
designated as the tester comes to the meeting armed with a small set of paper test cases:
representative sets of inputs (and expected outputs) for the program or module. During
the meeting, each test case is mentally executed. That is, the test data are walked through
the logic of the program. The state of the program (i.e. the values of the variables) is
monitored on paper or a blackboard.
The test cases must be simple and few in number, because people execute programs at a
rate that is very slow compared to machines. In most walkthroughs, more errors are found
during the process of questioning the programmer than are found directly by the test
cases themselves.
Questions
1. Are code reviews relevant to software testing? Explain the process
involved in a typical code review.
2. Explain the need for inspection and list the different types of code reviews.
3. Consider a program, perform a detailed review, and list the review
findings in detail.

4. Chapter: Test Case Design

4.1. Learning Objectives

You will learn about:

Dynamic testing of Software Applications
White box and black box testing
Various techniques used in White box testing
Various techniques used in black box testing
Static program analysis
Automation of testing process

4.2. Introduction

Software can be tested either by running the programs and verifying each step of its
execution against expected results or by statically examining the code or the document
against its stated requirement or objective. In general, software testing can be divided into
two categories, viz. static and dynamic testing. Static testing is non-execution-based
testing and is carried out mostly by human effort. In static testing, we test design, code,
or any document through inspection, walkthroughs, and reviews, as discussed in the
previous chapter. Many studies show that the single most cost-effective defect reduction
process is the classic structural test: the code inspection or walkthrough. Code inspection
is like proofreading, and it benefits developers by identifying typographical errors, logic
errors, and deviations from the styles and standards normally followed.

Dynamic testing is an execution-based testing technique. The program must be executed to
find possible errors. Here, the program, module, or entire system is executed (run)
and the output is verified against the expected result. Dynamic execution of tests is based
on specifications of the program, code, and methodology.

4.3. White Box Testing

This testing technique takes into account the internal structure of the system or
component. The entire source code of the system must be available. This technique is
known as white box testing because the complete internal structure and working of the
code is available.

White box testing helps to derive test cases to ensure:

1. All independent paths are exercised at least once.

2. All logical decisions are exercised for both true and false paths.

3. All loops are executed at their boundaries and within operational bounds.

4. All internal data structures are exercised to ensure validity.

White box testing helps to:

Traverse complicated loop structures

Cover common data areas,

Cover control structures and sub-routines,

Evaluate different execution paths

Test the module and integration of many modules

Discover logical errors, if any.

Understand the code.

Why is white box testing used to test conformance to requirements?

Logic errors and incorrect assumptions are most likely to be made when coding
"special cases"; we need to ensure these execution paths are tested.

Incorrect assumptions about execution paths may lead to design errors; white box
testing can find these errors.

Typographical errors are random and just as likely to be on an obscure logical path as
on a mainstream path. "Bugs lurk in corners and congregate at boundaries."

4.4. Basis Path Testing

Basis path testing is a testing mechanism proposed by McCabe. The aim is to derive a
logical complexity measure of a procedural design and use this as a guide for defining a
basis set of execution paths.

Test cases that exercise the basis set will execute every statement at least once.

4.4.1. Flow Graph Notation

Flow graph notation helps to represent various control structures of any programming
language. Various notations for representing control flow are:

Fig 3.1: Notations used for control structures

On a flow graph:

Arrows called edges represent flow of control

Circles called nodes represent one or more actions.

Areas bounded by edges and nodes called regions.

A predicate node is a node containing a condition.

Any procedural design/ program can be translated into a flow graph. Later the flow graph
can be analyzed for various paths within it.

Note that compound Boolean expressions in tests generate at least two predicate nodes and
additional arcs.

Example:

Fig 3.2: Control flow of a program and the corresponding flow diagram

4.4.2. Cyclomatic Complexity

The cyclomatic complexity gives a quantitative measure of the logical complexity. This
value gives the number of independent paths in the basis set, and an upper bound for the
number of tests to ensure that each statement is executed at least once.

An independent path is any path through a program that introduces at least one new set of
processing statements or a new condition (i.e., a new edge).

Fig 3.3: Sample program and corresponding flow diagram

In Fig 3.3, the statements are numbered, and the corresponding nodes are numbered to
match. The sample program contains one DO and three nested IF statements.

From the example we can observe that the cyclomatic complexity of 4 can be calculated
in three ways:

1. Number of regions of flow graph, which is 4.

2. #Edges - #Nodes + 2, which is 11-9+2=4.

3. #Predicate Nodes + 1, which is 3+1=4.

The above complexity provides an upper bound on the number of test cases to be
generated, i.e., the number of independent execution paths in the program. The
independent paths (4 paths) for the program shown in Fig 3.3 are given below:

Independent Paths:

1. 1, 8

2. 1, 2, 3, 7b, 1, 8

3. 1, 2, 4, 5, 7a, 7b, 1, 8

4. 1, 2, 4, 6, 7a, 7b, 1, 8

Cyclomatic complexity provides an upper bound on the number of tests required to
guarantee coverage of all program statements.

4.4.3. Deriving Test Cases

Test cases can be designed in many ways. The steps involved in test case design are:

1. Using the design or code, draw the corresponding flow graph.

2. Determine the cyclomatic complexity of the flow graph.

3. Determine a basis set of independent paths.

4. Prepare test cases that will force execution of each path in the basis set.

Note: some paths may only be executable as part of another test.
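
As a brief, hedged illustration of these steps, consider the following small C function of our own (not the program of Fig 3.3). It has two predicate nodes, so V(G) = 2 + 1 = 3, and the three test cases in main each force one basis path.

    #include <stdio.h>

    /* Returns the index of the first negative element, or -1 if none. */
    static int first_negative(const int a[], int n)
    {
        int i;
        for (i = 0; i < n; i++)   /* predicate 1 */
            if (a[i] < 0)         /* predicate 2 */
                return i;
        return -1;
    }

    int main(void)
    {
        static const int pos[] = {3, 5};
        static const int neg[] = {3, -5};

        printf("%d\n", first_negative(pos, 0));  /* path 1: loop never entered    -> -1 */
        printf("%d\n", first_negative(pos, 2));  /* path 2: loop runs, if false   -> -1 */
        printf("%d\n", first_negative(neg, 2));  /* path 3: if true, early return ->  1 */
        return 0;
    }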

4.4.4. Graph Matrices

Graph matrices can automate the derivation of the flow graph and the determination of a
set of basis paths. Software tools to do this can use a graph matrix. A sample graph matrix
is shown in Fig 3.4.

The graph matrix:
Is a square matrix with size equal to the number of nodes.
Rows and columns of the matrix correspond to the nodes in the flow
graph.
Entries correspond to the edges.
The matrix can associate a number with each entry of the edge.
Using a value of 1 for each edge, the cyclomatic complexity is calculated as follows:
For each row, sum the column values and subtract 1.

Sum these totals and add 1.

For the flow graph of Fig 3.3, this again gives 4.
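
The row-sum rule is easy to mechanize. The following C sketch of our own applies it to a connection matrix for the flow graph of Fig 3.3 (edges inferred from the independent paths listed in Section 4.4.2); it prints V(G) = 4.

    #include <stdio.h>

    #define N 9   /* nodes 1, 2, 3, 4, 5, 6, 7a, 7b, 8 */

    /* Connection matrix: m[i][j] = 1 if an edge runs from node i to node j. */
    static const int m[N][N] = {
        /*          1  2  3  4  5  6 7a 7b  8 */
        /* 1  */ { 0, 1, 0, 0, 0, 0, 0, 0, 1 },
        /* 2  */ { 0, 0, 1, 1, 0, 0, 0, 0, 0 },
        /* 3  */ { 0, 0, 0, 0, 0, 0, 0, 1, 0 },
        /* 4  */ { 0, 0, 0, 0, 1, 1, 0, 0, 0 },
        /* 5  */ { 0, 0, 0, 0, 0, 0, 1, 0, 0 },
        /* 6  */ { 0, 0, 0, 0, 0, 0, 1, 0, 0 },
        /* 7a */ { 0, 0, 0, 0, 0, 0, 0, 1, 0 },
        /* 7b */ { 1, 0, 0, 0, 0, 0, 0, 0, 0 },
        /* 8  */ { 0, 0, 0, 0, 0, 0, 0, 0, 0 },
    };

    int main(void)
    {
        int i, j, total = 0;

        /* For each row with outgoing edges, sum the entries and subtract 1;
           the grand total plus 1 is the cyclomatic complexity. */
        for (i = 0; i < N; i++) {
            int row = 0;
            for (j = 0; j < N; j++)
                row += m[i][j];
            if (row > 0)
                total += row - 1;
        }
        printf("V(G) = %d\n", total + 1);   /* prints V(G) = 4 */
        return 0;
    }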
Some other interesting link weights can be measured by the graph, such as:
Probability that a link (edge) will be executed
Processing time for traversal of a link

Memory required during traversal of a link

Resources required during traversal of a link

Fig 3.4: Example of a graph matrix

4.5. Control Structure Testing

In programs, conditions are very important, and testing such conditions is more complex
than testing other statements such as assignment and declarative statements. Basis path
testing is one example of control structure testing. There are many ways in which a
control structure can be tested.

4.5.1. Conditions Testing

Condition testing aims to exercise all logical conditions in a program module. Logical
conditions may be complex or simple. Logical conditions may be nested with many
relational operations.

We can define:

Relational expression: (E1 op E2), where E1 and E2 are arithmetic expressions and
op is a relational operator. For example, (x+y) > (s/t), where x, y, s and t are variables.
Simple condition: a Boolean variable or relational expression, possibly preceded
by a NOT operator.
Compound condition: composed of two or more simple conditions, Boolean
operators, and parentheses, along with relational operators.
Boolean expression: a condition without relational expressions.

Normally, errors in expressions can be due to one or more of the following:
Boolean operator error

Boolean variable error

Boolean parenthesis error

Relational operator error

Arithmetic expression error

Mismatch of types

Condition testing methods focus on testing each condition in the program, whatever its
type. There are many strategies to identify errors.

Some of the strategies proposed include:

Branch testing: Every branch is executed at least once.

Domain Testing: Uses three or four tests for every relational operator depending
on the complexity of the statement.

Branch and relational operator testing: Uses condition constraints. Based on the
complexity of the relational operators, many branches will be executed.
Example 1: C1 = B1 & B2

Where B1, B2 are Boolean conditions.

A condition constraint has the form (D1, D2), where D1 and D2 can be true (t) or false (f).

The branch and relational operator test requires the constraint set {(t,t), (f,t), (t,f)}
to be covered by the execution of C1.

Coverage of the constraint set guarantees detection of relational operator errors.
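
A minimal C harness of our own makes the constraint set concrete: each test case below fixes (B1, B2) to one of the required true/false combinations.

    #include <stdio.h>

    /* Hypothetical condition under test: C1 = B1 && B2. */
    static int c1(int b1, int b2)
    {
        return b1 && b2;
    }

    int main(void)
    {
        /* Cover the constraint set {(t,t), (f,t), (t,f)} for (B1, B2). */
        struct { int b1, b2; } cases[] = { {1, 1}, {0, 1}, {1, 0} };
        int i;

        for (i = 0; i < 3; i++)
            printf("B1=%d B2=%d -> C1=%d\n",
                   cases[i].b1, cases[i].b2, c1(cases[i].b1, cases[i].b2));
        return 0;
    }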

4.5.2. Data Flow Testing

First, a data flow graph, similar to the control flow graph (see basis path testing), is drawn.
Test paths are then selected according to the locations of definitions and uses of variables.
Any variable defined in a program passes through the following states:

D: define the variable, normally in a declarative section;

U: use the variable, which was defined earlier in the program;

K: kill the variable; that is, it ceases to be available at some point during the execution of
the program.

Any variable that is part of the program will pass through these states. However, the
sequence of states is important. The possible two-state sequences can be classified as
follows:

DU: Normal,

UK, UU: Normal,

DD: Suspicious

DK: Probable bug

KD: Normal

KK: Probable bug

KU: bug

UD: Normal

For example,

DU: Normal means a variable is defined first and then used in the program, which is the
normal behavior of data flow in a program.

DK: Probable bug means a variable is defined and then killed before being used in the
program. This may be a bug: why was the variable defined and then killed without being
used?
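
A minimal C sketch of our own shows a DK sequence a data flow analyzer would flag: the variable is defined and then goes out of scope (is killed) without ever being used.

    #include <stdio.h>

    int main(void)
    {
        {
            int unused = 42;   /* D: defined ...                          */
        }                      /* K: killed at end of scope, never used --
                                  a DK sequence, a probable bug           */
        printf("done\n");
        return 0;
    }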

4.5.3. Loop Testing

Loops are fundamental to many algorithms. Loops can be categorized as simple,
concatenated, nested, or unstructured, and can be defined in many ways.

Examples:

Fig 3.5: Different types of Loops

To test loops, the following guidelines may be followed (a short harness sketch appears after the list):

Simple Loops of size n:

o Skip loop entirely

o Only one pass through the loop

o Two passes through the loop

o m passes through the loop, where m < n.

o (n-1), n, and (n+1) passes through the loop. This helps in testing the
boundary of the loops.

Nested Loops

o Start with inner loop. Set all other loops to minimum values.

o Conduct simple loop testing on inner loop.

o Work outwards and take the next nested loop.

o Continue until all loops are tested.

Concatenated Loops

o If independent loops, use simple loop testing.

o If dependent, treat as nested loops.

Unstructured loops

o Don't test: redesign. Unstructured loops are a sign of poor design.
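
Here is a hedged C harness sketch for the simple-loop guidelines above, assuming a hypothetical module whose loop has an operational maximum of n = 10. The buffer is one element larger than the maximum so that the (n+1) probe stays within legal storage.

    #include <stdio.h>

    /* Simple loop under test: sums the first n elements of buf. */
    static int sum_first(const int buf[], int n)
    {
        int i, s = 0;
        for (i = 0; i < n; i++)
            s += buf[i];
        return s;
    }

    int main(void)
    {
        static const int buf[11] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};

        /* Guideline passes for maximum size n = 10: skip the loop,
           one pass, two passes, m < n, then (n-1), n, and (n+1). */
        int passes[] = {0, 1, 2, 5, 9, 10, 11};
        int i;

        for (i = 0; i < 7; i++)
            printf("passes=%2d -> sum=%d\n", passes[i], sum_first(buf, passes[i]));
        return 0;
    }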

4.6. Black Box Testing

Functional tests examine the observable behavior of software as evidenced by its outputs,
without any reference to internal functions. This kind of testing is from the user's point of
view, as if the user were exercising the normal business functions.

Black box tests normally determine the quality of the software. It is an advantage
to create the quality criteria from this point of view from the beginning.

In black box testing, software is subjected to a full range of inputs and the outputs
are verified for correctness. Here, the structure of the program is immaterial.

The black box testing technique can be applied once unit and integration testing are
completed.

It focuses on functional requirements.

It is complementary to white box testing.

The main objective of the black box testing is to find:

1. incorrect or missing functions

2. interface errors

3. errors in data structures or external database access

4. performance errors

5. initialization and termination errors.

Some of the techniques used for black box testing are discussed below:

4.6.1. Equivalence Partitioning

The main objective of this method is to partition the input so that an optimal set of input
data is selected. The steps to be followed are:

1. Divide the input domain into classes of data for which test cases can be generated.

2. Attempt to uncover classes of errors, if any.

3. Identify both valid and invalid input data while partitioning.

4. Test the program for all types of data.

The method is based on equivalence classes for input conditions.

An equivalence class represents a set of valid or invalid states.

An input condition is either a specific numeric value, a range of values, a set of related
values, or a Boolean condition.

Equivalence classes can be defined by:

If an input condition specifies a range or a specific value, one valid and two
invalid equivalence classes are defined.

If an input condition specifies a Boolean or a member of a set, one valid and one
invalid equivalence class are defined.

Test cases for each input domain data item are developed and executed.

This method uses far fewer input data than exhaustive testing; however, boundary values
are not considered. Although it significantly reduces the number of inputs to be tested, it
does not test combinations of the input data.
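
As a hedged C illustration, take a hypothetical module that accepts an age between 1 and 120. The input domain splits into one valid class (1..120) and two invalid classes (below 1 and above 120); one representative per class suffices.

    #include <stdio.h>

    /* Hypothetical module under test: accepts an age between 1 and 120. */
    static int valid_age(int age)
    {
        return age >= 1 && age <= 120;
    }

    int main(void)
    {
        /* One representative per equivalence class. */
        int rep[] = {35, -5, 200};
        int i;

        for (i = 0; i < 3; i++)
            printf("age %4d -> %s\n", rep[i],
                   valid_age(rep[i]) ? "accepted" : "rejected");
        return 0;
    }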

4.6.2. Boundary Value Analysis

It is observed that boundary points of any input are often not tested properly, and this
leads to many errors; a large number of errors tend to occur at the boundaries of the input
domain. Boundary Value Analysis (BVA) leads to the selection of test cases that exercise
boundary values.
BVA complements equivalence partitioning: rather than selecting any element in an
equivalence class, select those at the 'edge' of the class.
Examples:
1. For a range of values bounded by a and b, test (a-1), a, (a+1), (b-1), b, (b+1).
2. If input conditions specify a number of values n, test with (n-1), n and (n+1) input
values.
3. Apply 1 and 2 to output conditions (e.g., generate table of minimum and
maximum size).
4. If internal program data structures have boundaries (e.g., buffer size, table limits),
use input data to exercise structures on boundaries.
BVA and equivalence partitioning together help in testing programs and cover most of
the conditions; however, neither method tests combinations of input conditions.
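
Continuing the hypothetical age-field example from the previous section, BVA for the range bounded by a = 1 and b = 120 exercises (a-1), a, (a+1), (b-1), b, and (b+1):

    #include <stdio.h>

    static int valid_age(int age)   /* same hypothetical module as before */
    {
        return age >= 1 && age <= 120;
    }

    int main(void)
    {
        int boundary[] = {0, 1, 2, 119, 120, 121};
        int i;

        for (i = 0; i < 6; i++)
            printf("age %4d -> %s\n", boundary[i],
                   valid_age(boundary[i]) ? "accepted" : "rejected");
        return 0;
    }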

4.6.3. Cause-Effect Graphing Techniques

Translation of natural-language descriptions of procedures to software-based algorithms
is error prone.

Example: From US Army Corps of Engineers:

Executive Order 10358 provides in the case of an employee whose work week varies from
the normal Monday through Friday work week, that Labor Day and Thanksgiving Day
each were to be observed on the next succeeding workday when the holiday fell on a day
outside the employee's regular basic work week. Now, when Labor Day, Thanksgiving
Day or any of the new Monday holidays are outside an employee's basic workweek, the
immediately preceding workday will be his holiday when the non-workday on which the
holiday falls is the second non-workday or the non-workday designated as the employee's
day off in lieu of Saturday. When the non-workday on which the holiday falls is the first
non-workday or the non-workday designated as the employee's day off in lieu of Sunday,
the holiday observance is moved to the next succeeding workday.

How do you test code that attempts to implement this?

Cause-effect graphing attempts to provide a concise representation of logical
combinations and corresponding actions. The procedure is:

1. Causes (input conditions) and effects (actions) are listed for a module, and an
identifier is assigned to each.

2. A cause-effect graph is developed.

3. The graph is converted to a decision table.

4. Decision table rules are converted to test cases.

Simplified symbology:

Fig 3.6: Representation of cause-effect nodes

4.6.4. Comparison Testing

In some applications the reliability is critical.

Redundant hardware and software may be used.

For redundant software, use separate teams to develop independent versions of the
software.

Test each version with the same test data to ensure all provide identical output.

Run all versions in parallel with a real-time comparison of results.

Even if only one version will run in the final system, for some critical applications we can
develop independent versions and use comparison testing or back-to-back testing.

When outputs of versions differ, each is investigated to determine if there is a defect.

The method does not catch errors in the specification.

4.7. Static Program Analysis


This strategy helps in identifying errors without executing the program. Peer reviewers
and programmers will use this strategy to uncover probable static errors.

4.7.1. Program Inspections

These were covered in the previous chapter.

4.7.2. Mathematical Program Verification

If the programming language semantics are formally defined, one can consider a program
to be a set of mathematical statements. We can attempt to develop a mathematical proof
that the program is correct with respect to its specification. If the proof can be
established, the program is verified, and testing to check the verification is not required.

There are a number of approaches to proving program correctness. We will only consider
the axiomatic approach.

Suppose that at points P(1), ..., P(n), assertions concerning the program variables and their
relationships can be made.

The assertions are a(1), ..., a(n).

The assertion a(1) is about inputs to the program, and a(n) about outputs.

We can now attempt, for k between 1 and (n-1), to prove that the statements between

P(k) and P(k+1) transform the assertion a(k) to a(k+1).

Given that a(1) and a(n) are true, this sequence of proofs shows partial program
correctness. If it can be shown that the program will terminate, the proof is complete.

Note: Students are requested to note this as an introductory section.

4.7.3. Static Program Analyses

Static analysis tools scan the source code to try to detect errors.

The code does not need to be executed.

Most useful for languages which do not have strong typing.

It can check:
1. Syntax.
2. Unreachable code
3. Unconditional branches into loops
4. Undeclared variables
5. Uninitialized variables.
6. Parameter type mismatches
7. Uncalled functions and procedures.
8. Variables used before initialization.
9. Non-usage of function results.
10. Possible array bound errors.
11. Misuse of pointers.

4.8. Automated Testing Tools

Automation of testing is a state-of-the-art technique in which a number of tools help
in testing programs automatically. Programmers can use any such tool to test their
programs and ensure quality. A number of tools are available in the market. Some of the
tools that help the programmer are:

1. Static analyser

2. Code Auditors

3. Assertion processors

4. Test file generators

5. Test Data Generators

6. Test Verifiers

7. Output comparators.

Programmers can select any tool depending on the complexity of the program.

Questions
1. What is black box testing? Explain.
2. What different techniques are available to conduct black
box testing?
3. Explain the different methods available in white box testing, with
examples.

5. Chapter: Testing for Specialized Environments

5.1. Learning Objectives

You will learn about:


Concept of Graphic User Interface (GUI) Testing,
GUI checklist for Windows, Data entry and related activities
Testing Client/Server Architecture
Testing documentation and help facilities

5.2. Introduction

The need for specialized testing approaches is becoming mandatory as computer software
has become more complex. White box and black box testing methods are applicable
across all environments, architectures, and applications, but unique guidelines and
approaches to testing are sometimes important. We address the testing guidelines for
specialized environments, architectures, and applications that are commonly encountered
by software engineers.

5.3. Testing GUIs


The growth of Graphical User Interfaces (GUIs) in various applications has become a
challenge for test engineers. Because of reusable components provided as part of GUI
development environments, the creation of the user interface has become less time
consuming and more precise. GUIs are becoming mandatory for applications, as users
are accustomed to them. Sometimes the user interface may be treated as a separate layer,
easily separated from the traditional functional or business layer. The design and
development of the user interface layer require a separate design and development
methodology; the main problem here is understanding user psychology during
development. Due to the complexity of GUIs, testing and generating test cases have
become more complex and tedious.

Because of modern GUI standards (same look and feel), common tests can be derived.
What guidelines should be followed to help create a series of generic tests for GUIs?

Guidelines can be categorized by operation. Some of them are discussed below:

For windows:
Will the window open properly based on related typed or menu-based commands?
Can the window be resized, moved, scrolled?
Does the window properly regenerate when it is overwritten and then recalled?
Are all functions that relate to the window available when needed?
Are all functions that relate to the window operational?
Are all relevant pull-down menus, tool bars, scroll bars, dialog boxes, and
buttons, icons, and other controls available and properly represented?
Is the active window properly highlighted?
Do multiple or incorrect mouse picks within the window cause unexpected side
effects?
Are audio and/or color prompts within the window or as a consequence of
window operations presented according to specification?
Does the window properly close?

For pull-down menus and mouse operations:


Is the appropriate menu bar displayed in the appropriate context?
Does the application menu bar display system-related features (e.g. a clock
display)?
Do pull-down operations work properly?
Do breakaway menus, palettes, and tool bars work properly?
Are all menu functions and pull-down sub-functions properly listed?
Are all menu functions properly addressable by the mouse?
Are text typeface, size, and format correct?
Is it possible to invoke each menu function using its alternative text-based
command?
Are menu functions highlighted (or grayed-out) based on the context of current
operations within a window?
Does each menu function perform as advertised?
Are the names of menu functions self-explanatory?
Is help available for each menu item, and is it context sensitive?
Are mouse operations properly recognized throughout the interactive context?
If multiple clicks are required, are they properly recognized in context?
If the mouse has multiple buttons, are they properly recognized in context?
Do the cursor, processing indicator (e.g. an hour glass or clock), and pointer
properly change as different operations are invoked?

Data entry:
Is alphanumeric data entry properly echoed and input to the system?
Do graphical modes of data entry (e.g., a slide bar) work properly?
Is invalid data properly recognized?
Are data input messages intelligible?
Are basic standard validations on each data item applied during data entry
itself?
Once the data is entered completely, if a correction is to be made to a specific
item, does the system require entering the entire data again?
Are mouse clicks properly recognized during data entry?
Are help buttons available during data entry?

In addition to the above guidelines, finite state modeling graphs may be used to derive a
series of tests that address specific data and program objects that are relevant to the GUI.

5.4. Testing of Client/Server Architectures
Client/server architectures represent a significant challenge for software testers. The
distributed nature of client/server environments, the performance issues associated with
transaction processing, the potential presence of a number of different hardware
platforms, the complexities of network communication, the need to service multiple
clients from a centralized (or in some cases, distributed) database, and the coordination
requirements imposed on the server all combine to make testing of C/S architectures and
the software that resides within them considerably more difficult than testing standalone
applications. In fact, recent industry studies indicate a significant increase in testing time
and cost when C/S environments are developed.

5.5. Testing Documentation and Help Facilities


Errors in documentation can be as devastating to the acceptance of the program as errors
in data or source code. You must have seen the difference between following the user
guide and getting results or behaviors that do not coincide with those predicted by the
document.
software test plan.

Documentation testing can be approached in two phases. The first phase, formal technical
review, examines the document for editorial clarity. The second phase, live test, uses the
documentation in conjunction with the actual program.

Some of the guidelines are discussed here:


Does the documentation accurately describe how to accomplish each mode of
use?
Is the description of each interaction sequence accurate?
Are examples accurate and context based?
Are terminology, menu descriptions, and system responses consistent with the
actual program?
Is it relatively easy to locate guidance within the documentation?
Can troubleshooting be accomplished easily with the documentation?
Are the document table of contents and index accurate and complete?
Is the design of the document (layout, typefaces, indentation, graphics) conducive
to understanding and quick assimilation of information?
Are all error messages displayed for the user described in more detail in the
document?
If hypertext links are used, are they accurate and complete?

The only viable way to answer these questions is to have an independent third party test
the documentation in the context of program usage. All discrepancies are noted, and
areas of document ambiguity or weakness are defined for potential rewrite.

Questions
1. Explain the need for GUI testing and its complexity.
2. List the guidelines a typical tester requires during GUI testing.
3. Select your own GUI-based software system and test the GUI-related functions
using the guidelines listed in this chapter.

6. Chapter: SOFTWARE TESTING STRATEGIES

6.1. Learning Objectives


You will learn about:
Various testing Strategies in Software Testing
Basic concept of Verification and Validation
Criteria for Completion of Testing
Unit Testing
Integration Testing
Validation Testing
System Testing
Debugging Process

6.2. Introduction

A strategy for software testing integrates software test case design methods into a well-
planned series of steps that result in the successful construction of software. Equally
important, a software testing strategy provides a road map for the software developer, the
quality assurance organization, and the customer: a road map that describes the steps to
be conducted as part of testing, when these steps are planned and then undertaken, and
how much effort, time, and resources will be required. Therefore, any testing strategy
must incorporate test planning, test case design, test execution, and resultant data
collection and evaluation.

A software testing strategy should be flexible enough to promote the creativity and
customization that are necessary to adequately test all large software-based systems. At
the same time, the strategy must be rigid enough to promote reasonable planning and
management tracking as the project progresses. Shooman suggests these issues:

In many ways, testing is an individualistic process, and the number of different
types of tests varies as much as the different development approaches. For many
years, our only defense against programming errors was careful design and the
native intelligence of the programmer. We are now in an era in which modern
design techniques are helping us to reduce the number of initial errors that are
inherent in the code. Similarly, different test methods are beginning to cluster
themselves into several distinct approaches and philosophies.

These approaches and philosophies are what we shall call strategy.



6.3. A Strategic Approach to Software Testing

Testing activity can be planned and conducted systematically; hence, templates are defined
into which specific test case design methods can be placed.

A number of software testing strategies have been proposed in the literature. All provide the
software developer with a template for testing, and all have the following generic characteristics:

Testing begins at the module level (or at the class or object level in object-oriented
systems) and works outward toward the integration of the entire computer-based system.
Different techniques are appropriate at different points in time.

Testing is conducted by the developer of the software and, for large projects, by an
independent test group.
Testing and debugging are different activities, but debugging must be accommodated
in any testing strategy.

What should a testing strategy look like?

A strategy for software testing must accommodate low-level tests that are necessary
to verify that a small source code segment has been correctly implemented as well as
high-level tests that validate major customer requirements.

A strategy must provide guidance for the practitioner and a set of milestones for the
manager. Because the steps of the test strategy occur at a time when deadline pressure
begins to rise, progress must be measurable and problems must surface.

6.4. Verification and Validation

Software testing is one element of a broader topic that is often referred to as verification
and validation(V&V). Verification refers to the set of activities that ensure that software
correctly implements a specific function. Validation refers to a different set of activities
that ensure that the software that has been built is traceable to customer requirements.

Boehm states it like this:
Verification: "Are we building the product right?"
Validation: "Are we building the right product?"

Figure 5.1: Achieving Software Quality (software engineering methods, formal technical
reviews, measurement, standards and procedures, SQA and SCM, and testing all
contribute to quality)

Figure 5.1 shows that the application of methods and tools, effective formal technical
reviews, and solid management and measurement all lead to quality that is confirmed
during testing.

Testing provides the last bastion from which quality can be assessed and, more
pragmatically, errors can be uncovered.

However, testing should not be viewed as a safety net. Quality cannot be tested into a
product: if it is not there before you begin testing, it will not be there when you finish
testing. Quality is incorporated throughout the software process.

Note:
It is important to note that V&V encompasses a wide array of SQA activities that include
formal technical reviews, quality and configuration audits, performance monitoring,
simulation, feasibility study, documentation review, database review, algorithm analysis,
development testing, qualification testing, and installation testing.

Although testing plays an extremely important role in V&V, many other activities are
also necessary.
6.5. Organizing for Software Testing

The software developer is always responsible for testing the individual units (modules) of
the program, ensuring that each performs the function for which it was designed. In many
cases, the developer also conducts integration testing, a testing step that leads to the
construction of the complete program structure. Only after the software architecture is
complete does an independent test group (ITG) become involved.

The role of an ITG is to remove the inherent problems associated with letting the builder
test the thing that has been built. Independent testing removes the conflict of interest that
may otherwise be present. After all, personnel in the ITG team are paid to find errors.
However, the software developer does not turn the program over to the ITG and walk away.
The developer and the ITG work closely throughout a software project to ensure that
thorough tests will be conducted. While testing is conducted, the developer must be
available to correct errors that are uncovered.

The ITG is part of the software development project team in the sense that it becomes
involved during the specification process and stays involved (planning and specifying test
procedures) throughout a large project.

However, in many cases the ITG reports to the SQA organization, thereby achieving a
degree of independence that might not be possible if it were a part of the software
development organization.

6.6. A Software Testing Strategy


The software engineering process may be viewed as a spiral, as illustrated in Figure 5.2.
Initially, system engineering defines the role of software and leads to software
requirements analysis, where the information domain, function, behavior,
performance, constraints, and validation criteria for software are established. Moving
inward along the spiral, we come to design and finally to coding.

To develop computer software, we spiral inward along streamlines that decrease the level
of abstraction on each turn.
Figure 5.2: Testing Strategy (system engineering, requirements, design, and code spiral
inward; unit testing, integration testing, validation testing, and system testing spiral
outward)

The strategy for software testing may also be viewed in the context of the spiral.
Unit testing begins at the vortex of the spiral and concentrates on each unit of the
software as implemented in source code. Testing progresses by moving outward along the
spiral to integration testing, where the focus is on design and the construction of the
software architecture. Taking another turn outward on the spiral, we encounter
validation testing, where requirements established as part of software requirements
analysis are validated against the software that has been constructed. Finally, we arrive at
system testing, where the software and other system elements are tested as a whole.
To test computer software, we spiral out along streamlines that broaden the scope of
testing with each turn.

Considering the process from a procedural point of view, testing within the context of
software engineering is a series of four steps that are implemented sequentially.

The steps are shown in Figure 5.3. Initially, tests focus on each module individually,
ensuring that it functions properly as a unit; hence the name unit testing. Unit testing makes
heavy use of white-box testing techniques, exercising specific paths in a module's control
structure to ensure complete coverage and maximum error detection.
Next, modules must
be assembled or integrated to form the complete software package. Integration testing
addresses the issues associated with the dual problems of verification and program
construction. Black-box test case design techniques are the most prevalent during integration,
although a limited amount of white-box testing may be used to ensure coverage of major
control paths. After the software has been integrated (constructed), sets of high-order tests
are conducted. Validation criteria (established during requirements analysis) must be
tested. Validation testing provides final assurance that software meets all functional,
behavioral, and performance requirements. Black-box testing techniques are used
exclusively during validation.

The last high-order testing step falls outside the boundary of software engineering and
into the broader context of computer system engineering. Software, once validated, must
be combined with other system elements (e.g., hardware, people, and databases). System
testing verifies that all elements mesh properly and that overall system
function/performance is achieved.

Figure 5.3: Software Testing Steps (requirements, design, and code correspond to
high-order tests, integration tests, and unit tests; the testing direction runs from unit tests
toward high-order tests)
6.7. Criteria for Completion of Testing

Using statistical modeling and software reliability theory, models of software failures
(uncovered during testing) as a function of execution time can be developed.
A version of the failure model, called the logarithmic Poisson execution-time model, takes
the form:

f(t) = (1/p) ln(l0 p t + 1)     (1)

where
f(t) = cumulative number of failures that are expected to occur once the software
has been tested for a certain amount of execution time t,
l0 = the initial software failure intensity (failures per unit time) at the beginning of
testing,
p = the exponential reduction in failure intensity as errors are uncovered and repairs
are made.
The instantaneous failure intensity l(t) can be derived by taking the derivative of f(t):
l(t) = l0 / (l0 p t + 1)     (2)

Using the relationship noted in equation (2), testers can predict the drop-off of errors as
testing progresses. The actual error intensity can be plotted against the predicted curve
(Figure 5.4). If the actual data gathered during testing and the logarithmic Poisson
execution-time model are reasonably close to one another over a number of data points,
the model can be used to predict the total testing time required to achieve an acceptably
low failure intensity.
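
A small C sketch (with assumed, purely illustrative parameters l0 = 10 failures per CPU hour and p = 0.05) evaluates equations (1) and (2) to show the predicted drop-off; compile with -lm.

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double l0 = 10.0, p = 0.05, t;

        for (t = 0.0; t <= 100.0; t += 20.0) {
            double f = (1.0 / p) * log(l0 * p * t + 1.0);  /* equation (1) */
            double l = l0 / (l0 * p * t + 1.0);            /* equation (2) */
            printf("t=%6.1f  cumulative failures=%7.2f  intensity=%6.3f\n",
                   t, f, l);
        }
        return 0;
    }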

Figure 5.4: Failure intensity as a function of execution time (failures per test hour plotted
against execution time t: data collected during testing compared with the predicted failure
intensity l(t), which starts at l0)

6.8. Strategic Issues

The following issues must be addressed if a successful software testing strategy is to be
implemented:
Specify product requirements in a quantifiable manner long before testing
commences. Although the overriding objective of testing is to find errors, a good
testing strategy also assesses other quality characteristics such as portability,
maintainability, and usability. These should be specified in a way that is measurable so
that testing results are unambiguous.
State testing objectives explicitly. The specific objectives of testing should be stated
in measurable terms: for example, test effectiveness, test coverage, mean time to
failure, the cost to find and fix defects, remaining defect density or frequency of
occurrence, and test work-hours per regression test should all be stated within the
test plan.

Understand the users of the software and develop a profile for each user
category. Use cases, which describe the interaction scenario for each class of user, can
reduce overall testing effort by focusing testing on actual use of the product.

Develop a testing plan that emphasizes "rapid cycle testing".
The feedback generated from the rapid cycle tests can be used to control quality levels
and corresponding test strategies.

Build "robust" software that is designed to test itself. Software should be designed
in a manner that uses antibugging techniques. that is software should be capable of
diagnosing certain classes of errors. In addition, the design should accommodate
automated testing regression testing.

Use effective formal technical reviews as a filter prior to testing. Formal technical
reviews can be as effective as testing in uncovering errors. For this reason, reviews
can reduce the amount of testing effort that is required to produce high-quality
software.
Conduct formal technical reviews to assess the test strategy and test cases
themselves. Formal technical reviews can uncover inconsistencies, omissions, and
outright errors in the testing approach. This saves time and improves product quality.

Develop a continuous improvement approach for the testing process. The test strategy
should be measured. The metrics collected during testing should be used as part of a
statistical process control approach for software testing.

6.9. Unit Testing

Unit testing focuses verification efforts on the smallest unit of software design: the
module. Using the procedural design description as a guide, important control paths are
tested to uncover errors within the boundary of the module. The relative complexity of the
tests and the errors they uncover are limited by the constrained scope established for unit
testing. The unit test is normally white-box oriented, and the step can be conducted in
parallel for multiple modules.

6.9.1. Unit Test Considerations

The tests that occur as part of unit testing are illustrated schematically in Figure 5.5.
The module interface is tested to ensure that information properly flows into and out of
the program unit under test. The local data structure is examined to ensure the data stored
temporarily maintains its integrity during all steps in an algorithm's execution.
Boundary conditions are tested to ensure that the module operates properly at boundaries
established to limit or restrict processing. All independent paths through the control
structure are exercised to ensure that all statements in a module have been executed at
least once. And finally, all error-handling paths are tested.

Tests of data flow across a module interface are required before any other test is initiated.
If data do not enter and exit properly, all other tests are doubtful.

Figure 5.5: Unit Test (test cases exercise the module's interface, local data structures,
boundary conditions, independent paths, and error handling paths)

6.9.2. Checklist for Interface Tests

1. Number of input parameters equals to number of arguments.


2. Parameter and argument attributes match.
3. Parameter and argument unit systems match.
4. Number of arguments transmitted to called modules equal to number of parameters.
5. Attributes of arguments transmitted to called modules equal to attributes of
parameters.
6. Unit system of arguments transmitted to call modules equal to unit system of
parameters.
7. Number attributes and order of arguments to built-in functions correct.
8. Any references to parameters not associated with current point of entry.
9. Input-only arguments altered.
10. Global variable definitions consistent across modules.
11. Constants passed as arguments.

When a module performs external I/O, the following additional interface tests must be
conducted:
1. File attributes correct?
2. Open/Close statements correct?
3. Format specification matches I/O statements?
4. Buffer size matches record size?
5. Files opened before use?
6. End-of-File conditions handled?
7. I/O errors handled?
8. Any textual errors in output information?

The local data structure for a module is a common source of errors. Test cases should be
designed to uncover errors in the following categories:

1. improper or inconsistent typing
2. erroneous initialization or default values
3. incorrect variable names
4. inconsistent data types
5. underflow, overflow, and addressing exceptions

In addition to local data structures, the impact of global data on a module should be
ascertained during unit testing.

Selective testing of execution paths is an essential task during the unit test. Test cases
should be designed to uncover errors due to erroneous computations, incorrect
comparisons, or improper control flow. Basis path and loop testing are effective
techniques for uncovering a broad array of path errors.

Among the more common errors in computation are:

1. misunderstood or incorrect arithmetic precedence
2. mixed mode operations
3. incorrect initialization
4. precision inaccuracy
5. incorrect symbolic representation of an expression.
Comparison and control flows are closely coupled to one another.

Test cases should uncover errors like:

1. Comparison of different data types
2. Incorrect logical operators or precedence
3. Expectation of equality when precision error makes equality unlikely
4. Incorrect comparison of variables
5. Improper or non-existent loop termination
6. Failure to exit when divergent iteration is encountered
7. Improperly modified loop variables.

Good design dictates that error conditions be anticipated and error handling paths set up
to reroute or cleanly terminate processing when an error does occur.

Among the potential errors that should be tested when error handling is evaluated
are:
1. Error description is unintelligible
2. Error noted does not correspond to error encountered
3. Error condition causes system intervention prior to error handling
4. Exception-condition processing is incorrect
5. Error description does not provide enough information to assist in the location of the
cause of the error.

Boundary testing is the last task of the unit test step. Software often fails at its
boundaries. That is, errors often occur when the nth element of an n-dimensional array is
processed, when the ith repetition of a loop with i passes is invoked, or when the maximum
or minimum allowable value is encountered. Test cases that exercise data structures,
control flow, and data values just below, at, and just above maxima and minima are very
likely to uncover errors.

6.9.3. Unit Test Procedures

Unit testing is normally considered as an adjunct to the coding step. After source-level
code has been developed, reviewed, and verified for correct syntax, unit test case design
begins. A review of design information provides guidance for establishing test cases that
are likely to uncover errors in each of the categories discussed above. Each test case
should be coupled with a set of expected results.
Because a module is not a standalone program, driver and/or stub software must be
developed for each unit test. The unit test environment is illustrated in Figure 5.6. In most
applications a driver is nothing more than a "main program" that accepts test case data,
passes such data to the module to be tested, and prints relevant results. Stubs serve to replace
modules that are subordinate to the module that is to be tested. A stub or "dummy
subprogram" uses the subordinate module's interface, may do minimal data manipulation,
prints verification of entry, and returns.

Drivers and stubs represent overhead. That is, both are software that must be developed
but that is not delivered with the final software product. If drivers and stubs are kept
simple, the actual overhead is relatively low. Unfortunately, many modules cannot be
adequately unit tested with "simple" overhead software. In such cases, complete testing
can be postponed until the integration test step (where drivers or stubs are also used).
Unit testing is simplified when a module with high cohesion is designed. When a module
addresses only one function, the number of test cases is reduced and errors can be more
easily predicted and uncovered.
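
The following C sketch (ours, with hypothetical module names) shows the arrangement of Figure 5.6: a driver feeds a test case to the module under test, while a stub stands in for its subordinate module, prints verification of entry, and returns a fixed value.

    #include <stdio.h>

    static double tax_rate(int code);   /* subordinate module, stubbed below */

    /* Module under test (hypothetical): computes net pay. */
    static double net_pay(double gross, int code)
    {
        return gross * (1.0 - tax_rate(code));
    }

    /* Stub: replaces the real subordinate module. */
    static double tax_rate(int code)
    {
        printf("stub tax_rate entered, code=%d\n", code);
        return 0.20;
    }

    /* Driver: a "main program" that passes test case data to the
       module under test and prints the relevant results. */
    int main(void)
    {
        printf("net_pay(1000.0, 1) = %.2f (expected 800.00)\n",
               net_pay(1000.0, 1));
        return 0;
    }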

Figure 5.6: Unit Test Environment (a driver passes test cases to the module to be tested,
whose subordinate modules are replaced by stubs; results are collected from the run)

6.10. Integration Testing


Integration testing is a systematic technique for constructing the program structure while
conducting tests to uncover errors associated with interfacing.

The objective is to take unit tested modules and build a program structure that has been
dictated by design.

6.10.1. Different Integration Strategies

There is often a tendency to attempt non-incremental integration; that is, to construct
the program using a "big bang" approach. All modules are combined in advance, and the
entire program is tested as a whole. Chaos usually results! A set of errors is
encountered. Correction is difficult, because isolation of causes is complicated by the vast
expanse of the entire program. Once these errors are corrected, new ones appear, and the
process continues in a seemingly endless loop.

Incremental integration is the antithesis of the big bang approach. The program is
constructed and tested in small segments, where errors are easier to isolate and correct;
interfaces are more likely to be tested completely; and a systematic test approach may be
applied. We discuss some of the incremental methods here:

6.10.2. Top-Down Integration

Top-down integration is an incremental approach to construction of program structure.


Modules are integrated by moving downward through the control hierarchy, beginning
with the main control module.

The integration process is performed in a series of five steps:


1. The main control module is used as a test driver, and stubs are substituted for all
modules directly subordinate to the main control module.

2. Depending on the integration approach selected (i.e., depth-or breadth first),


subordinate stubs are replaced one at a time with actual modules.

80
3. Tests are conducted as each module is integrated.

4. On completion of each set of tests, another stub is replaced with the real module.

5. Regression testing may be conducted to ensure that new errors have not been
introduced.

The process continues from step 2 until the entire program structure is built.
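A minimal sketch of this stub-replacement sequence, in Python with hypothetical module names (validate, store), might look like this:

    # Top-down sketch: the main control module is exercised first with stubs
    # substituted for its subordinates; stubs are then replaced one at a time
    # with actual modules, re-testing after each replacement.

    database = []

    def validate_stub(record):            # stub for one subordinate module
        return True

    def store_stub(record):               # stub for another subordinate module
        return "stored-stub"

    def store_real(record):               # real module that replaces store_stub
        database.append(record)
        return "stored"

    def main_control(record, validate=validate_stub, store=store_stub):
        # The main control module acts as the test driver for the hierarchy.
        return store(record) if validate(record) else "rejected"

    # Steps 1-3: test with all stubs in place.
    assert main_control({"id": 1}) == "stored-stub"

    # Step 4: replace one stub with the real module; step 5: regression-test.
    assert main_control({"id": 1}, store=store_real) == "stored"
    assert database == [{"id": 1}]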

The top-down strategy sounds relatively uncomplicated, but in practice, logistical problems
arise. The most common of these problems occurs when processing at low levels in the
hierarchy is required to adequately test upper levels. Stubs replace low-level modules at
the beginning of top-down testing; therefore, no significant data can flow upward in the
program structure.

The tester is left with three choices:

1. Delay many tests until stubs are replaced with actual modules.
2. Develop stubs that perform limited functions that simulate the actual module.
3. Integrate the software from the bottom of the hierarchy upward.

The first approach causes us to lose some control over correspondence between specific
tests and incorporation of specific modules. This can lead to difficulty in determining the
cause of errors and tends to violate the highly constrained nature of the top-down approach.
The second approach is workable but can lead to significant overhead, as stubs become
increasingly complex. The third approach is discussed in the next section.

6.7.3. Bottom-Up Integration

Modules are integrated from the bottom to the top. In this approach, processing required for
modules subordinate to a given level is always available, and the need for stubs is
eliminated.

A bottom-up integration strategy may be implemented with the following steps:

1. Low-level modules are combined into clusters that perform a specific software
subfunction.
2. A driver is written to coordinate test case input and output.
3. The cluster is tested.
4. Drivers are removed and clusters are combined moving upward in the program
structure.

As integration moves upward, the need for separate test drivers lessens. In fact, if the top
two levels of program structure are integrated top-down, the number of drivers can be
reduced substantially and integration of clusters is greatly simplified.
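As a minimal illustration (Python; parse_line and to_record are hypothetical low-level modules), a bottom-up cluster test might look like this:

    # Bottom-up sketch: two low-level modules are combined into a cluster that
    # performs a specific subfunction; a throwaway driver coordinates test case
    # input and output, and is removed once the cluster is integrated upward.

    def parse_line(line):                 # low-level module 1
        return [field.strip() for field in line.split(",")]

    def to_record(fields):                # low-level module 2
        return {"name": fields[0], "qty": int(fields[1])}

    def cluster(line):                    # the cluster under test
        return to_record(parse_line(line))

    def cluster_driver():
        # Driver: feeds test cases to the cluster and checks the output.
        cases = [("widget, 3", {"name": "widget", "qty": 3})]
        for line, expected in cases:
            actual = cluster(line)
            print(f"{line!r} -> {actual} {'PASS' if actual == expected else 'FAIL'}")

    cluster_driver()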

6.7.4. Regression Testing

Each time a new module is added as part of integration testing, the software changes.
New data flow paths are established, new I/O may occur, and new control logic is
invoked. These changes may cause problems with functions that previously worked
flawlessly. In the context of an integration test strategy, regression testing is the re-
execution of a subset of tests that have already been conducted, to ensure that changes have
not propagated unintended side effects.

Regression testing is the activity that helps to ensure that changes do not introduce
unintended behavior or additional errors.

How is regression testing conducted?

Regression testing may be conducted manually, by re-executing a subset of all test cases,
or by using automated capture-playback tools. Capture-playback tools enable the software
engineer to capture test cases and results for subsequent playback and comparison.

The regression test suite contains three different classes of test cases.
1. A representative sample of tests that will exercise all software functions.
2. Additional tests that focus on software functions that are likely to be affected by the
change.
3. Tests that focus on software components that have been changed.

Note:
It is impractical and inefficient to re-execute every test for every program function once
a change has occurred.
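One way to realize this selective re-execution is to tag each test case with the functions it exercises and pick the subset affected by a change. A minimal Python sketch, with hypothetical test names:

    # Regression-test selection sketch: re-run the representative sample plus
    # tests focused on functions likely to be affected by the change, rather
    # than every test for every function.

    all_tests = {
        "test_login_smoke":   {"functions": {"login"},  "kind": "representative"},
        "test_report_totals": {"functions": {"report"}, "kind": "representative"},
        "test_login_lockout": {"functions": {"login"},  "kind": "focused"},
    }

    def select_regression_suite(changed_functions):
        return sorted(
            name for name, meta in all_tests.items()
            if meta["kind"] == "representative"
            or meta["functions"] & changed_functions
        )

    print(select_regression_suite({"login"}))
    # -> ['test_login_lockout', 'test_login_smoke', 'test_report_totals']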

Selection of an integration strategy depends upon software characteristics and, sometimes,
the project schedule. In general, a combined approach that uses a top-down strategy for upper
levels of the program structure, coupled with a bottom-up strategy for subordinate levels,
may be the best compromise.

Regression tests should focus on critical module functions.

What is a critical module?

A critical module has one or more of the following characteristics:
Addresses several software requirements
Has a high level of control
Is complex or error-prone
Has a definite performance requirement.

6.7.5. Integration Test Documentation

An overall plan for integration of the software and a description of specific tests are
documented in a test specification. The specification is a deliverable in the software
engineering process and becomes part of the software configuration.

Test Specification Outline
I. Scope of testing
II. Test Plan
   1. Test phases and builds
   2. Schedule
   3. Overhead software
   4. Environment and resources
III. Test Procedures
   1. Order of integration
      Purpose
      Modules to be tested
   2. Unit tests for modules in build
      Description of tests for module n
      Overhead software description
      Expected results
   3. Test environment
      Special tools or techniques
      Overhead software description
   4. Test case data
   5. Expected results for build
IV. Actual Test Results
V. References
VI. Appendices

The following criteria and corresponding tests are applied for all test phases:
Interface integrity. Internal and external interfaces are tested as each module is
incorporated into the structure.
Functional validity. Tests designed to uncover functional errors are conducted.
Information content. Tests designed to uncover errors associated with local or global
data structures are conducted.
Performance. Tests designed to verify performance bounds established during software
design are conducted.

A schedule for integration, overhead software, and related topics are also discussed as
part of the test plan section. Start and end dates for each phase are established, and
availability windows for unit-tested modules are defined. A brief description of overhead
software (stubs and drivers) concentrates on characteristics that might require special
effort. Finally, test environments and resources are described.

6.8. Validation Testing

At the culmination of integration testing, software is completely assembled as a package.
Interfacing errors have been uncovered and corrected, and a final series of software tests,
validation testing, may begin.

Validation can be defined in many ways, but a simple definition is that validation
succeeds when software functions in a manner that can be reasonably expected by the
customer.

What are reasonable expectations?


Reasonable expectations are defined in the software requirements specification, a
document that describes all user-visible attributes of the software. The specification
contains a section titled "Validation Criteria".
Information contained in that section forms the basis for a validation testing approach.

6.8.1. Validation Test Criteria

A test plan outlines the classes of tests to be conducted, and a test procedure defines
specific test cases that will be used in an attempt to uncover errors in conformity with
requirements. Both the plan and the procedure are designed to ensure that:

All functional requirements are satisfied,
All performance requirements are achieved,
Documentation is correct and human-engineered, and
Other requirements like portability, error recovery, and maintainability are met.

6.8.2. Configuration Review

An important element of the validation process is a configuration review. The intent of
the review is to ensure that all elements of the software configuration have been properly
developed, are catalogued, and have the necessary detail to support the maintenance
phase of the software life cycle.

6.8.3. Alpha and Beta Testing

If software is developed as a product to be used by many customers, it is impractical to
perform formal acceptance tests with each one. Most software product builders use a
process called alpha and beta testing to uncover errors that only the end user seems able to
find.

The alpha test is conducted at the developer's site by a customer. The software is used in a
natural setting, with the developer "looking over the shoulder" of the user and recording
errors and usage problems. Alpha tests are conducted in a controlled environment.

The beta test is conducted at one or more customer sites by the end user(s) of the
software. Unlike alpha testing, the developer is generally not present; therefore, the beta
test is a "live" application of the software in an environment that cannot be controlled by
the developer.

The customer records all problems (real or imagined) that are encountered during beta
testing and reports these to the developer at regular intervals. Because of problems
reported during the beta test, the software developer makes modifications and then prepares
for release of the software product to the entire customer base.

6.9. System Testing

System testing is actually a series of different tests whose primary purpose is to fully
exercise the computer-based system. Although each test has a different purpose, all work
to verify that all system elements have been properly integrated and perform allocated
functions.

A classic system testing problem is finger pointing. This occurs when an error is
uncovered, and each system element developer blames the others for the problem. Rather
than indulging in such nonsense, the software engineer should anticipate potential
interfacing problems and (1) design error-handling paths that test all information coming
from other elements of the system; (2) conduct a series of tests that simulate bad data or
other potential errors at the software interface; (3) record the results of tests to use as
evidence if finger pointing does occur; and (4) participate in planning and design of
system tests to ensure that software is adequately tested.
In the sections that follow, we discuss the types of system tests that are worthwhile for
software-based systems.

6.9.1. Recovery Testing

Many computer-based systems must recover from faults and resume processing within a
pre-specified time. In some cases, a system must be fault tolerant; that is, processing
faults must not cause overall system function to cease. In other cases, a system failure
must be corrected within a specified period of time, or severe economic damage will
occur.

Recovery testing is a system test that forces the software to fail in a variety of ways and
verifies that recovery is properly performed. If recovery is automatic (performed by the
system itself), re-initialization, checkpointing mechanisms, data recovery, and restart are
each evaluated for correctness. If recovery requires human intervention, the mean time to
repair is evaluated to determine whether it is within acceptable limits.
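A minimal sketch of an automatic-recovery check in Python (the Service class, its checkpoint contents, and the one-second bound are all hypothetical):

    # Recovery-test sketch: force a failure, then verify that re-initialization
    # from a checkpoint restores state and completes within a time bound.

    import time

    class Service:
        def __init__(self):
            self.state = "running"
            self.checkpoint = {"last_tx": 41}

        def crash(self):
            self.state = "failed"

        def recover(self):
            restored = dict(self.checkpoint)   # data recovery from checkpoint
            self.state = "running"             # restart
            return restored

    svc = Service()
    svc.crash()
    start = time.monotonic()
    restored = svc.recover()
    elapsed = time.monotonic() - start
    assert svc.state == "running" and restored == {"last_tx": 41}
    assert elapsed < 1.0, "recovery exceeded the pre-specified time"
    print(f"recovered in {elapsed:.6f}s")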

6.9.2. Security Testing

Security testing attempts to verify that protection mechanisms built into a system will in
fact protect it from improper penetration.
During security testing, the tester plays the role(s) of the individual who desires to
penetrate the system. Anything goes! The tester may attempt to acquire passwords
through external clerical means, may attack the system with custom software designed to
break down any defenses that have been constructed; may overwhelm the system, thereby
denying service to others; may purposely cause system errors, hoping to penetrate during
recovery; may browse through insecure data, hoping to find the key to system entry; and
so on.
Given enough time and resources, good security testing will ultimately penetrate a
system. The role of the system designer is to make penetration cost greater than the value
of the information that will be obtained.

6.9.3. Stress testing

Stress testing is designed to confront programs with abnormal situations. In essence, the
tester who performs stress testing asks: "How high can we crank this up before it fails?"
Stress testing executes a system in a manner that demands resources in abnormal
quantity, frequency, or volume.
For example:
1. Special tests may be designed that generate ten interrupts per second, when one or two
is the average rate.
2. Input data rates may be increased by an order of magnitude to determine how input
functions will respond.
3. Test cases that require maximum memory or other resources may be executed.
4. Test cases that may cause thrashing in a virtual operating system may be designed.
5. Test cases that may cause excessive hunting for disk-resident data may be created.

Essentially the tester attempts to break the program.
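As a minimal sketch of the idea (Python; process_batch and its 5,000-event capacity are hypothetical stand-ins for the system under test):

    # Stress-test sketch: drive the input volume an order of magnitude above
    # the average rate, then another, and watch for the breaking point.

    def process_batch(events):
        if len(events) > 5000:                 # hypothetical capacity limit
            raise OverflowError("queue overflow")
        return len(events)

    average_rate = 100
    for multiplier in (1, 10, 100):            # 1x, 10x, 100x the average load
        load = ["event"] * (average_rate * multiplier)
        try:
            processed = process_batch(load)
            print(f"{multiplier:>3}x load: processed {processed} events")
        except OverflowError as exc:
            print(f"{multiplier:>3}x load: FAILED ({exc}) -- breaking point found")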

A variation of stress testing is a technique called sensitivity testing. In some situations, a
very small range of data contained within the bounds of valid data for a program may
cause extreme and even erroneous processing or profound performance degradation. This
situation is analogous to a singularity in a mathematical function.
Sensitivity testing attempts to uncover data combinations within valid input classes that
may cause instability or improper processing.

6.9.4. Performance Testing


Performance testing is designed to test the run-time performance of software. Performance
testing occurs throughout all steps in the testing process. Even at the unit level, the
performance of an individual module may be assessed as white-box tests are conducted.
However, it is not until all system elements are fully integrated that the true performance
of a system can be ascertained.

Performance tests are often coupled with stress testing and often require both hardware
and software instrumentation. That is, it is often necessary to measure resource utilization
in an exacting fashion. External instrumentation can monitor execution intervals, log
events as they occur, and sample machine states on a regular basis. By instrumenting a
system, the tester can uncover situations that lead to degradation and possible system
failure.
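Software instrumentation can be as simple as timing execution intervals and logging them as events. A minimal Python sketch (transaction is a hypothetical unit of work):

    # Instrumentation sketch: measure execution intervals and log each event.

    import time

    def transaction(n):
        return sum(i * i for i in range(n))

    def timed(label, fn, *args):
        start = time.perf_counter()
        result = fn(*args)
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed * 1000:.3f} ms")   # event log entry
        return result

    for size in (10_000, 100_000, 1_000_000):
        timed(f"transaction(n={size})", transaction, size)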

6.9.5. Debugging

Software testing is a process that can be systematically planned and specified. Test case
design can be conducted, a strategy can be defined, and results can be evaluated against
prescribed expectations.
Debugging occurs as a consequence of successful testing. That is, when a test case
uncovers an error, debugging is the process that results in the removal of the error.
Debugging is not testing, but it always occurs as a consequence of testing, as shown in
figure 5.7.

6.9.6. The Debugging Process

As shown in figure 5.7, the debugging process begins with the execution of a test case.
Results are assessed, and a lack of correspondence between expected and actual results is
encountered. In many cases, the non-corresponding data is a symptom of an underlying
cause as yet hidden. The debugging process attempts to match symptom with cause,
thereby leading to error correction.
The debugging process will always have two outcomes:
1. The cause will be found, corrected, and removed, or
2. The cause will not be found.

In the latter case, the person performing debugging may suspect a cause, design a test
case to help validate his/her suspicion, and work toward error correction in iterative
fashion.

[Figure 5.7: The Debugging Process. Test cases are executed and the results are compared with expected results; suspected causes are investigated through debugging and additional tests; once causes are identified, corrections are made and verified with regression tests.]
Why is debugging so difficult?
Some characteristics of bugs provide some clues:
1. The symptom and the cause may be geographically remote. That is, the symptom
may appear in one part of a program, while the cause may actually be located at a
site that is far removed. Highly coupled program structures exacerbate this
situation.
2. The symptom may disappear (temporarily) when another error is corrected.
3. The symptom may actually be caused by non-errors (e.g., round-off inaccuracies).
4. The symptom may be caused by human error that is not easily traced.
5. The symptom may be a result of timing problems, rather than processing
problems.
6. It may be difficult to accurately reproduce input conditions (e.g., a real-time
application in which input ordering is indeterminate).
7. The symptom may be due to causes that are distributed across a number of tasks
running on different processors.

6.9.7. Debugging Approaches

In general, three categories of debugging approaches may be proposed:

Brute force
Backtracking
Cause elimination

The brute force category of debugging is probably the most common and least efficient
method for isolating the cause of a software error. Brute force debugging methods are
applied when all other methods of debugging fail. Using a "let the computer find the
error" philosophy, memory dumps are taken, run-time traces are invoked, and the program
is loaded with WRITE statements. Somewhere in the mass of information produced, one
hopes to find a clue that leads to the cause of the error.
Backtracking is a common debugging approach that can be used successfully in small
programs. Beginning at the site where a symptom has been uncovered, the source code is
traced backward (manually) until the site of the cause is found. Unfortunately, as the
number of source lines increases, the number of potential backward paths may become
unmanageably large.

Cause elimination is manifested by induction or deduction and introduces the concept of
binary partitioning. Data related to the error occurrence are organized to isolate potential
causes. Alternatively, a list of all possible causes is developed, and tests are conducted to
eliminate each.

If initial tests indicate that a particular cause hypothesis shows promise, the data are
refined in an attempt to isolate the bug.
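Binary partitioning is easiest to see over an ordered list of candidate causes, such as recent changes, where some change first makes a test fail. A minimal Python sketch (the failing change index is hypothetical):

    # Binary-partitioning sketch: repeatedly halve the space of candidate
    # causes until the one that introduced the error is isolated.

    changes = list(range(1, 101))      # 100 candidate changes, in order
    first_bad = 57                     # hypothetical: change 57 introduced the bug

    def test_passes(change):
        return change < first_bad      # the test fails at and after the bad change

    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if test_passes(changes[mid]):
            lo = mid + 1               # cause lies in the later half
        else:
            hi = mid                   # cause is here or earlier
    print(f"isolated cause: change {changes[lo]}")   # -> change 57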

6.10. Summary
Software testing accounts for the largest percentage of technical effort in the
software process. Yet, we are only beginning to understand the subtleties of
systematic test planning, execution and control.
The objective of software testing is to uncover errors.
To fulfill this objective, a series of test steps (unit, integration, validation, and
system tests) are planned and executed.
Unit and integration tests concentrate on functional verification of a module and
incorporation of modules into a program structure.
Validation testing demonstrates traceability to software requirements, and
System testing validates software once it has been incorporated into a larger
system.
Each test step is accomplished through a series of systematic test techniques that
assist in the design of test cases. With each testing step, the level of abstraction
with which software is considered is broadened.
Unlike testing, debugging must be viewed as an art. Beginning with a
symptomatic indication of a problem, the debugging activity tracks down the
cause of an error. Of the many resources available during debugging, the most
valuable is the counsel of other software engineers.
The requirement for higher-quality software demands a more systematic approach
to testing.

Questions
1. What is the difference between Verification and Validation? Explain in your own
words.

2. Explain unit test method with the help of your own example.

3. Develop an integration testing strategy for any system that you have
implemented already. List the problems encountered during such a process.

4. What is validation test? Explain.

References

1. Software Engineering: A Practitioner's Approach, Fourth Edition, by Roger S.
Pressman, McGraw-Hill.

2. Effective Methods for Software Testing, by William Perry, Wiley.

3. The Art of Software Testing, by Glenford J. Myers, John Wiley & Sons.

7. Chapter: SOFTWARE QUALITY STANDARDS

Some software quality standards are:

ISO
CMM
Six Sigma

7.1. Capability Maturity Model

What is it?

A model for measuring the software development process of an organization.
Used to help organizations improve their software process.
It is a model that companies can use to measure the maturity of their process.
CMM is used by companies to help define their goals of managing the software process.
CMM does not constrain how the software process is implemented by an organization. It
simply describes what the essential attributes of a software process would normally be.

Where did it come from?

Initiated in response to the DoD's demands for better software development:
Cost of software development and maintenance
Quality of software being delivered
Timeliness of software delivery
In the past, software projects have typically been a year late and double the budget.
The quality of some software delivered was so poor that it was unsafe; it had to be
reworked or not used at all.
Before CMM was implemented, not one software project had arrived at the initially
stated date.
The Air Force has stated that at some point in the near future, all potential
developers will be required to demonstrate a software maturity of Level 3 in
order to be considered for bids.

Who developed CMM?

Developed by the Software Engineering Institute at Carnegie Mellon University,
with assistance from the Mitre Corporation.
The initial Process Maturity Framework was released in September 1987 by Watts
Humphrey of SEI.
CMM v1.0 was released in 1991 by Mark Paulk; the current version is v1.1.
SEI is federally funded, was started in 1984, is managed by the Advanced Research
Projects Agency, and is administered by the Air Force Electronic Systems Center.
Its mission is to provide leadership in advancing the state of the practice of software
engineering to improve the quality of systems that depend on software.

7.1.1. The 5 Levels of Software Process Maturity

Initial (1): unpredictable and poorly controlled.
Repeatable (2): can repeat previously mastered tasks.
Defined (3): process characterized and fairly well understood.
Managed (4): process measured and controlled.
Optimizing (5): focus on process improvement.

Level 1 Initial

Characteristics
Ad hoc
Little formalization
Tools informally applied to the process
Success depends on individual efforts

Goals for Advancing to Level 2


Project management
Management oversight
Quality assurance

Level 1 organizations have a tendency to be disorganized and chaotic. Processes are
undefined.
Projects are not well planned.
If projects are successful, it is usually due to the efforts of a few individuals in the
organization.
SEI attributes the root cause of a Level 1 organization to poor management.
The software process is unpredictable because of constant change and modification as
work progresses.
Schedules, budgets, functionality, and product quality are unpredictable.

Level 2 Repeatable

Characteristics
Basic project management established
Process discipline in place

Goals for Advancing to Level 3


Establish a process group
Establish a software development process architecture
Introduce software engineering methods

Key Process Areas - Level 2

Software Project Planning
Requirements Management
Software Project Tracking and Oversight
Software Subcontract Management
Software Quality Assurance
Software Configuration Management

Policies for managing software projects are in place.
Procedures to implement the policies are established.
Management processes allow the organization to repeat successful practices
developed on earlier projects, although the specific processes implemented by the
projects may differ.
The process is disciplined because planning and tracking of the software project are
stable and earlier successes can be repeated.
Basic project management controls are established:
Requirements Management: establish a common understanding between the
customer and the software project of the customer's requirements. This agreement is
the basis for planning.
Software Project Planning: establish reasonable plans for performing the software
engineering and for managing the project.
Project Tracking and Oversight: establish adequate visibility into actual progress so
that management can take effective actions.

Subcontract Management: select qualified subcontractors and manage them
effectively.
Quality Assurance: provide management with appropriate visibility into the process
and products.
Configuration Management: establish and maintain the integrity of products
throughout the software life cycle.

Level 3 Defined

Characteristics
Software processes are documented and standardized
All projects use a standard software process

Goals for Advancing to Level 4


Identify quality and cost parameters through process management
Establish process database
Gather and maintain process data
Assess quality of products

Key Process Areas - Level 3

Peer Reviews
Intergroup Coordination
Software Product Engineering
Integrated Software Management
Training Program
Organization Process Definition
Organization Process Focus

Level 4 Managed

Characteristics
Measurement of software process
Measurement of product quality

Goals of Advancing to Level 5

Automatic gathering of process data


Use data to analyze and modify process

Key Process Areas - Level 4

Software Quality Management
Quantitative Process Management

Level 5 Optimized

Characteristics
Continuous process improvement
Piloting of innovative ideas and technologies

Key Process Areas - Level 5

Process Change Management
Technology Change Management
Defect Prevention

How CMM is Implemented

Software Process Assessments (internal):
a) Determine the current software process
b) Identify improvement priorities
c) Build support for software process improvement

Software Capability Evaluations (external):
a) Identify qualified contractors
b) Monitor qualified contractors

SPA (Software Process Assessment)
a) Determines the state of an organization's current software process. Does the
company have a defined process? If so, what level is this process at?
b) Identifies improvement priorities. What are the organization's priorities in
refining their software process?
c) Does the organization have individuals devoted to software process
improvement? Does management back SPI?

SCE (Software Capability Evaluation)
An external evaluation used to judge the software process capability of contractor
organizations.

Common Steps of a Software Process Assessment

First, an outside assessment team is selected (step 1). This team should be trained in the
fundamental concepts of the CMM as well as the specifics of the assessment or evaluation
method. The members of the team should be professionals knowledgeable in
software engineering and management.

The second step is to have representatives from the site to be assessed or evaluated
complete the maturity questionnaire and other diagnostic instruments. Once this
activity is completed, the assessment or evaluation team performs a response
analysis (step 3), which tallies the responses to the questions and identifies those
areas where further exploration is warranted. The areas to be investigated
correspond to the CMM key process areas.

The team is now ready to visit the site being assessed or evaluated (step 4). The
team conducts interviews and reviews documentation to gain an understanding of
the software process followed by the site.

At the end of the on-site period, the team produces a list of findings (step 5) that
identifies the strengths and weaknesses of the organization's software process.

Finally, the team prepares a key process area profile (step 6) that shows the areas
where the organization has, and has not, satisfied the goals of the key process
areas.

Management View of Visibility of Software Process

Level 1
a) Amorphous entity, b) activities poorly staged

Level 2
a) Process is viewed as a succession of black boxes
b) Customer requirements are controlled
c) Project management practices are established

Level 3
a) The internal structure of the boxes, i.e., the tasks in the project's defined
software process, is visible.
b) The internal structure represents the way the organization's standard software
process has been applied to specific projects.

Level 4
a) Processes are instrumented and controlled quantitatively.
b) Ability to predict outcomes grows steadily more precise

Level 5
a) New and improved ways of building software are continually tried in a
controlled manner to improve productivity and quality.

7.1.2. Important Points about CMM

An organization must have at least 100 employees involved in the software process to
even be considered.
The organization must be able to devote time and individuals to the improvement
process.
Assessment can be costly.
There is questionable empirical evidence.
You can't skip levels.
All questions on the questionnaire must be answered favorably in order to move to
the next level.

Only large companies will be considered by the SEI for assessment.
One company's assessment cost $40,000.
Some of the statistics shown are fuzzy and cannot be readily proved.

An organization must build its foundation on one level before it can move to the
next. If you miss one question, you cannot be certified at the level to which that
question pertains.

7.2. Six Sigma

History and Background

In the late 1980s, as the popularity of the Malcolm Baldrige Award was peaking,
an engineer and statistician at Motorola, Dr. Mikel Harry, began to study process
variation as a way to improve performance. Dr. Harry formalized his Six Sigma
philosophy into a system for measurably improving business quality.

Dr. Harry is commonly viewed as the father of Six Sigma. He is currently co-
founder and member of the Board of Directors of the Arizona-based Six Sigma
Academy and claims ownership of Six Sigma terminology, although many firms
use the terms freely.
The Six Sigma approach became the focal point of Motorola's quality effort and a
way of doing business. Motorola's CEO began to tout the benefits of the
methodology, and other executives began to listen. Soon companies like General
Electric, Allied Signal, and Texas Instruments were on board.
The concept has since spread widely throughout the manufacturing sector, and
within the last two years, it has been receiving attention and interest in the
financial services sector. Six Sigma methodologies are delivering positive results in
the service sector, and the popularity of the technique is expected to grow.

Overview

Although definitions vary slightly by source, the most common might be: a
disciplined, data-driven approach and methodology for eliminating defects in any
process, from manufacturing to transactional and from product to service.

There are three overarching themes to Six Sigma:

Process focus (at its core, Six Sigma is about measuring process variation)
Meeting customer needs (process outputs must meet customer requirements)
Data driven (rigorous analytical methods drive improvements that deliver
measurable differences felt by the customer)

The objective of the Six Sigma methodology is to implement a measurement-based
approach that focuses on improving processes and reducing process variation through
Six Sigma projects. In essence, it quantifies how well a process is performing and seeks
to improve it by meeting customer requirements more frequently.
Dr. Harry and Motorola originally coined the term Six Sigma in 1986. The martial arts
terms used to describe levels of Six Sigma proficiency were also originally
adopted and coined by Motorola and are generally defined as follows:

Master Black Belt: the highest level of technical, organizational, and training
proficiency.
Black Belt: technically oriented individuals who need not be formally
trained statisticians or engineers, but typically possess a background in
college-level mathematics and/or statistics. Black Belts typically lead the
statistical piece of the project.
Green Belt: Six Sigma team leaders capable of forming and facilitating
Six Sigma teams and managing Six Sigma projects from concept to
completion. Typically, Green Belt training consists of five days of
classroom training conducted in conjunction with Six Sigma projects. Training
covers facilitation techniques, meeting management, project management,
quality management tools, quality control tools, problem solving, and
exploratory data analysis, all skills that Nolan consultants possess.

Although these terms are quite common, they are not universal. They indicate peer
recognition, not registration or licensure.

The three most prominent organizations promoting Six Sigma include:

International Society of Six Sigma Professionals (ISSSP)
American Society for Quality (ASQ)
International Quality Federation (IQF)

According to a noted Six Sigma expert, companies and consulting firms often
create their own titles to describe the work done by these technical leaders. There
is currently no standard describing the body of knowledge people with these
titles must master, let alone licensing or certifying credentials. Experts are
currently working to change that through the IQF.

Statistics

In statistical nomenclature, the Greek letter sigma is used to denote standard
deviation, which is a measure of variance about the mean or average. In a standard
bell-shaped curve that represents a normal distribution, one sigma, or one standard
deviation, represents about 68% of the measured population, two sigma about 95%,
three sigma about 99%, and so on. Six sigma equates to 99.99966% of the
measured population.
When measuring process performance, you divide the number of errors or defects
by the total number of opportunities. If a process has a 5% error rate, then that
process is performing at two sigma. This means that the process performs
correctly (meets customer requirements) 95% of the time.

Most organizations perform somewhere between 2.7 and 4 sigma. If process
performance reached six sigma, there would be only 3.4 errors per million
opportunities, which is near perfection.
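The arithmetic behind these figures is just defects divided by opportunities. A minimal Python sketch of the yield and defects-per-million-opportunities (DPMO) calculation described above:

    # Sigma arithmetic sketch: a 5% error rate is roughly two-sigma
    # performance (95% yield); six sigma allows only 3.4 defects per
    # million opportunities.

    def yield_and_dpmo(defects, opportunities):
        dpmo = defects / opportunities * 1_000_000
        process_yield = (1 - defects / opportunities) * 100
        return process_yield, dpmo

    for defects in (50_000, 3.4):
        y, dpmo = yield_and_dpmo(defects, 1_000_000)
        print(f"yield={y:.5f}%  DPMO={dpmo:g}")
    # -> yield=95.00000%  DPMO=50000   (about two sigma)
    #    yield=99.99966%  DPMO=3.4     (six sigma)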

The concept of Six Sigma is based on the theory of variation, meaning that all
things that are measured finely enough will vary. Variation in a process is driven by:
machines, materials, methods, measurement systems, environment, and people.
When there is no undue influence by any one of these six factors, the variation
produced is called common cause or normal variation. When one or more of
the components have an undue influence, special cause or abnormal
variation exists, which takes the form of multiple or bimodal distributions in
statistics language. This distinction is critical in order to select the best course for
management intervention, because only abnormal variation can be corrected or
reduced.

There are two methods for calculating sigma: the Discrete Method and the Continuous
Method. The Discrete Method assumes that the customer gives credit to the
service or product provided if only some of the customer requirements are met, so it
may be misleading. The Continuous Method is more appropriate for more
demanding customers. It tends to be more accurate in that it provides a picture of
the magnitude of variation and the type of variation, common or special cause variation,
and it requires data collection.

Once the average and standard deviation (sigma) of a process become known,
more specific measures of process performance or capability are typically applied.
These include the capability ratio, the capability index, and the capability index compared
to some constant. The capability ratio compares process performance against the
customer specification. The capability index is the inverse of the capability ratio.

These calculations have a limitation in that they are based on the assumption that the
process is centered at the mean when, in reality, processes drift from their
intended centers over time. A more precise measure is therefore the capability
index compared to a constant, k. There are two formulas that can be used. One is
used when the center of the distribution is closer to the upper customer
specification; the other is used when the center is closer to the lower specification.
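The text does not spell these formulas out, so the following Python sketch uses the standard definitions as an assumption: capability ratio Cr = 6*sigma / (USL - LSL), capability index Cp = (USL - LSL) / (6*sigma) (its inverse), and Cpk = min((USL - mean) / (3*sigma), (mean - LSL) / (3*sigma)), which uses whichever specification limit the process center is closer to:

    # Process-capability sketch using the standard formulas (assumed here,
    # not taken from the text above).

    def capability(mean, sigma, lsl, usl):
        cr  = 6 * sigma / (usl - lsl)                  # capability ratio
        cp  = (usl - lsl) / (6 * sigma)                # capability index
        cpk = min((usl - mean) / (3 * sigma),
                  (mean - lsl) / (3 * sigma))          # index vs. constant k
        return cr, cp, cpk

    # A process that has drifted from its intended center of 10.0 to 10.2:
    cr, cp, cpk = capability(mean=10.2, sigma=0.1, lsl=9.7, usl=10.3)
    print(f"Cr={cr:.2f}  Cp={cp:.2f}  Cpk={cpk:.2f}")  # Cpk < Cp shows the drift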

When applying these formulas, consideration must be given to short-term vs.
long-term process performance. In other words, a given data sample should be
considered short-term due to the variability of performance over time. In general,
the larger the sample size and/or the number of samples taken, the more accurate the
result.

Program Implementation and Project Steps

Once an organization has decided to implement the Six Sigma methodology, there
are some initial steps that need to be completed:

Develop process maps for core processes, key sub-processes, and enabling
processes, and assign a process owner for each.
Develop a measurement dashboard or scorecard (all measures for a given
process) for each process.
Develop a data collection plan (measure options, data sources, collection
forms, etc.) for each dashboard process and collect sufficient data.
Create project selection criteria and weight factors for choosing projects,
which should include impact on business objectives, current process
performance, current process cost or financial impact, feasibility (difficulty,
use of resources, time commitment), etc.
Rate processes and select potential Six Sigma project(s) based on overall
score.
It should be noted here that Six Sigma is not a business strategy. In fact, Six Sigma
would assume that strategic business objectives have already been developed. Processes
that are selected for Six Sigma projects are those that most closely relate to strategic
objectives. Once the initial program setup steps have been completed and an individual
project has been selected, the typical Six Sigma project would include the following
steps:

Develop a project team to include a sponsor, a leader, a technical expert (Black
Belt), and team members.
Prepare a project charter (business case, problem statement, scope, goals,
milestones, roles and responsibilities, etc.).
Identify customer needs and requirements.
Create high-level process maps to include process definition, start and stop
points, inputs, outputs, customers, customer requirements, suppliers, etc.

Establish baseline process performance and current sigma.
Determine process defects and conduct root cause analysis
Develop alternatives and select solution.

Implement the improvement and control measures to hold the gains.

The Six Sigma Project DMAIC Cycle

There are two Six Sigma methodologies that are alternately used depending upon
the type of project. For developing new processes at Six Sigma performance
levels, the methodology is DMADV (define, measure, analyze, design, verify). For
the far more common process-improvement projects, the methodology is DMAIC
(define, measure, analyze, improve, control). An illustration of the DMAIC project
cycle (Copyright 2000 by Thomas Pyzdek) follows:

[The DMAIC cycle: Define -> Measure -> Analyze -> Improve -> Control, repeating continuously.]

Six Sigma Use in Organizations and Its Results

According to the Six Sigma Academy, Black Belts save companies
approximately $230,000 per project. General Electric, for example, has estimated
benefits on the order of $10 billion during the first five years of implementation.
During the 1990s, Allied Signal's sales repeatedly rose in double digits, while
productivity and earnings rose dramatically. Texas Instruments adopted Six Sigma
with similar success. Other organizations using Six Sigma include Motorola,
Sony, Honda, Maytag, Johnson Controls, Raytheon, Canon, Hitachi, Polaroid, and
Lockheed Martin.

As you can see, the organizations listed above are manufacturing companies, but
more recently Six Sigma has spread into financial services. Good examples of this
include GE Capital Services, American Express, J.P. Morgan, Fannie Mae, Liberty
Insurance, Mount Carmel Health, and State Street Bank.

Service industry organizations differ significantly from manufacturing
organizations in their approach to quality in the following ways:

Day-to-day decision making on conformance to standards is largely in the hands
of line departments (i.e., there are no independent inspection personnel who have the
power to hold up delivery of a non-conforming product).

The concept of a separate manager and staff of specialists devoting full time to
quality control has only minority acceptance.

Organized coordination of the quality function seldom exists in continuing form.
For specific projects or crisis situations, it typically takes the form of temporary
committees.

It appears that, in spite of these concerns, there is significant potential for Six
Sigma to continue to expand into the financial services industry as reengineering
did in the 90s. A methodology that has a demonstrated track record of delivering
process improvement, increasing customer satisfaction, and delivering bottom-line
results will be hard to resist.

SIX STEPS TO SIX SIGMA

#1 Identify the product you create or the service you provide.
In other words... WHAT DO YOU DO?

#2 Identify the customer(s) for your product or service, and determine what they
consider important.
In other words... WHO USES YOUR PRODUCTS AND SERVICES?

#3 Identify your needs to provide the product/service so that it satisfies the
customer.
In other words... WHAT DO YOU NEED TO DO YOUR WORK?

#4 Define the process for doing your work.
In other words... HOW DO YOU DO YOUR WORK?

#5 Mistake-proof the process and eliminate wasted effort.
In other words... HOW CAN YOU DO YOUR WORK BETTER?

#6 Ensure continuous improvement by measuring, analyzing, and controlling the
improved process.
In other words... HOW PERFECTLY ARE YOU DOING YOUR CUSTOMER-
FOCUSED WORK?

8. Chapter: TESTING FAQs

1. What is black box/white box testing?


Black-box and white-box are test design methods. Black-box test design treats the system as a
black box, so it doesn't explicitly use knowledge of the internal structure. Black-box test design
is usually described as focusing on testing functional requirements. Synonyms for black-box
include: behavioral, functional, opaque-box, and closed-box. White-box test design allows one to
peek inside the box, and it focuses specifically on using internal knowledge of the software to
guide the selection of test data. Synonyms for white-box include: structural, glass-box, and
clear-box.

While black-box and white-box are terms that are still in popular use, many people prefer the
terms "behavioral" and "structural". Behavioral test design is slightly different from black-box test
design because the use of internal knowledge isn't strictly forbidden, but it's still discouraged. In
practice, it hasn't proven useful to use a single test design method. One has to use a mixture of
different methods so that they aren't hindered by the limitations of a particular one. Some call this
"gray-box" or "translucent-box" test design, but others wish we'd stop talking about boxes
altogether.

It is important to understand that these methods are used during the test design phase, and their
influence is hard to see in the tests once they're implemented. Note that any level of testing (unit
testing, system testing, etc.) can use any test design methods. Unit testing is usually associated
with structural test design, but this is because testers usually don't have well-defined
requirements at the unit level to validate.

2. What are unit, component and integration testing?

Note that the definitions of unit, component, integration, and integration testing are recursive:

Unit: The smallest compilable component. A unit typically is the work of one programmer (at
least in principle). As defined, it does not include any called sub-components (for procedural
languages) or communicating components in general.

Unit testing: In unit testing, called components (or communicating components) are replaced
with stubs, simulators, or trusted components. Calling components are replaced with drivers or
trusted super-components. The unit is tested in isolation.

Component: A unit is a component. The integration of one or more components is a
component.

Note: The reason for "one or more" as contrasted to "two or more" is to allow for components
that call themselves recursively.

Component testing: Same as unit testing, except that all stubs and simulators are replaced
with the real thing.

Two components (actually one or more) are said to be integrated when:

a. They have been compiled, linked, and loaded together.
b. They have successfully passed the integration tests at the interface between them.

Thus, components A and B are integrated to create a new, larger component (A,B). Note that
this does not conflict with the idea of incremental integration; it just means that A is a big
component and B, the component added, is a small one.

Integration testing: carrying out integration tests.

Integration tests (after Leung and White) for procedural languages. This is easily generalized for
OO languages by using the equivalent constructs for message passing. In the following, the word
"call" is to be understood in the most general sense of a data flow and is not restricted to just
formal subroutine calls and returns; for example, it includes passage of data through global data
structures and/or the use of pointers.

Let A and B be two components in which A calls B.
Let Ta be the component-level tests of A.
Let Tb be the component-level tests of B.
Tab = the tests in A's suite that cause A to call B.
Tbsa = the tests in B's suite for which it is possible to sensitize A (the inputs
are to A, not B).
Tbsa + Tab = the integration test suite (+ = union).

Note: Sensitize is a technical term. It means inputs that will cause a routine to go down a
specified path. The inputs are to A. Not every input to A will cause A to traverse a path in which
B is called. Tbsa is the set of tests which do cause A to follow a path in which B is called. The
outcome of the test of B may or may not be affected.
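Written out as a runnable sketch (Python, with hypothetical test names), the suite construction is just a set union:

    # Integration-suite sketch for the definitions above.

    Ta   = {"tA1", "tA2", "tA3"}      # component-level tests of A
    Tab  = {"tA2", "tA3"}             # tests in A's suite that cause A to call B
    Tb   = {"tB1", "tB2"}             # component-level tests of B
    Tbsa = {"tB2_via_A"}              # B's tests sensitized through inputs to A

    integration_suite = Tab | Tbsa    # Tbsa + Tab == the integration test suite
    print(sorted(integration_suite))  # -> ['tA2', 'tA3', 'tB2_via_A']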

There have been variations on these definitions, but the key point is that it is pretty darn formal
and there's a goodly hunk of testing theory, especially as concerns integration testing, OO testing,
and regression testing, based on them.

As to the difference between integration testing and system testing: system testing specifically
goes after behaviors and bugs that are properties of the entire system, as distinct from properties
attributable to components (unless, of course, the component in question is the entire system).
Examples of system testing issues:
resource loss bugs, throughput bugs, performance, security, recovery, and
transaction synchronization bugs (often misnamed "timing bugs").

3. What's the difference between load and stress testing?


One of the most common, but unfortunate, misuses of terminology is treating load testing and
stress testing as synonymous. The consequence of this ignorant semantic abuse is usually
that the system is neither properly load tested nor subjected to a meaningful stress test.

Stress testing is subjecting a system to an unreasonable load while denying it
the resources (e.g., RAM, disc, mips, interrupts, etc.) needed to process that
load. The idea is to stress a system to the breaking point in order to find bugs
that will make that break potentially harmful. The system is not expected to
process the overload without adequate resources, but to behave (e.g., fail) in a
decent manner (e.g., not corrupting or losing data). Bugs and failure modes
discovered under stress testing may or may not be repaired depending on the
application, the failure mode, consequences, etc. The load (incoming transaction
stream) in stress testing is often deliberately distorted so as to force the system
into resource depletion.

Load testing is subjecting a system to a statistically representative (usually)
load. The two main reasons for using such loads are in support of software
reliability testing and in performance testing. The term "load testing" by itself is
too vague and imprecise to warrant use. For example, do you mean
"representative load," "overload," "high load," etc.? In performance testing, load is
varied from a minimum (zero) to the maximum level the system can sustain
without running out of resources or having transactions suffer (application-
specific) excessive delay.

A third use of the term is as a test whose objective is to determine the maximum
sustainable load the system can handle. In this usage, "load testing" is merely
testing at the highest transaction arrival rate in performance testing.

4. What's the difference between QA and testing?

QA is more a preventive thing, ensuring quality in the company and therefore the
product, rather than just testing the product for software bugs.

TESTING means "quality control".
QUALITY CONTROL measures the quality of a product.
QUALITY ASSURANCE measures the quality of the processes used to create a
quality product.

5. What is the best tester to developer ratio?

Reported tester:developer ratios range from 10:1 to 1:10.

There's no simple answer. It depends on so many things: amount of reused
code, number and type of interfaces, platform, quality goals, etc.

It also can depend on the development model. The more specs, the fewer
testers. The roles can play a big part also. Does QA own beta? Do you include
process auditors or planning activities?

These figures can all vary very widely depending on how you define "tester" and
"developer". In some organizations, a "tester" is anyone who happens to be
testing software at the time -- such as their own. In other organizations, a
"tester" is only a member of an independent test group.

It is better to ask about the test labor content than it is to ask about the
tester/developer ratio. The test labor content, across most applications, is
generally accepted as 50%, when people do honest accounting. For life-critical
software, this can go up to 80%.

6. What is Software Quality Assurance?

Software QA involves the entire software development PROCESS - monitoring
and improving the process, making sure that any agreed-upon standards and
procedures are followed, and ensuring that problems are found and dealt with. It
is oriented to 'prevention'.

7. What is Software Testing?
Testing involves operation of a system or application under controlled conditions
and evaluating the results (e.g., 'if the user is in interface A of the application while
using hardware B, and does C, then D should happen'). The controlled conditions
should include both normal and abnormal conditions. Testing should intentionally
attempt to make things go wrong to determine if things happen when they
shouldn't or things don't happen when they should. It is oriented to 'detection'.
Organizations vary considerably in how they assign responsibility for QA and
testing. Sometimes they're the combined responsibility of one group or individual.
Also common are project teams that include a mix of testers and developers who
work closely together, with overall QA processes monitored by project managers.
It will depend on what best fits an organization's size and business structure.
8. What are some recent major computer system failures caused by
software bugs?
In March of 2002 it was reported that software bugs in Britain's national
tax system resulted in more than 100,000 erroneous tax overcharges.
The problem was partly attributed to the difficulty of testing the
integration of multiple systems.
A newspaper columnist reported in July 2001 that a serious flaw was
found in off-the-shelf software that had long been used in systems for
tracking certain U.S. nuclear materials. The same software had been
recently donated to another country to be used in tracking their own
nuclear materials, and it was not until scientists in that country
discovered the problem, and shared the information, that U.S. officials
became aware of the problems.
According to newspaper stories in mid-2001, a major systems
development contractor was fired and sued over problems with a large
retirement plan management system. According to the reports, the
client claimed that system deliveries were late, the software had
excessive defects, and it caused other systems to crash.
In January of 2001 newspapers reported that a major European
railroad was hit by the aftereffects of the Y2K bug. The company found
that many of their newer trains would not run due to their inability to
recognize the date '31/12/2000'; the trains were started by altering the
control system's date settings.
News reports in September of 2000 told of a software vendor settling a
lawsuit with a large mortgage lender; the vendor had reportedly
delivered an online mortgage processing system that did not meet
specifications, was delivered late, and didn't work.
In early 2000, major problems were reported with a new computer
system in a large suburban U.S. public school district with 100,000+
students; problems included 10,000 erroneous report cards and
students left stranded by failed class registration systems; the district's
CIO was fired. The school district decided to reinstate its original 25-
year-old system for at least a year until the bugs were worked out of
the new system by the software vendors.
In October of 1999 the $125 million NASA Mars Climate Orbiter
spacecraft was believed to be lost in space due to a simple data
conversion error. It was determined that spacecraft software used
certain data in English units that should have been in metric units.
Among other tasks, the orbiter was to serve as a communications relay
for the Mars Polar Lander mission, which failed for unknown reasons in
December 1999. Several investigating panels were convened to
determine the process failures that allowed the error to go undetected.
Bugs in software supporting a large commercial high-speed data
network affected 70,000 business customers over a period of 8 days in
August of 1999. Among those affected was the electronic trading
system of the largest U.S. futures exchange, which was shut down for
most of a week as a result of the outages.
In April of 1999 a software bug caused the failure of a $1.2 billion
military satellite launch, the costliest unmanned accident in the history
of Cape Canaveral launches. The failure was the latest in a string of
launch failures, triggering a complete military and industry review of
U.S. space launch programs, including software integration and testing
processes. Congressional oversight hearings were requested.
A small town in Illinois received an unusually large monthly electric bill
of $7 million in March of 1999. This was about 700 times larger than its
normal bill. It turned out to be due to bugs in new software that had
been purchased by the local power company to deal with Y2K software
issues.
In early 1999 a major computer game company recalled all copies of a
popular new product due to software problems. The company made a
public apology for releasing a product before it was ready.
The computer system of a major online U.S. stock trading service
failed during trading hours several times over a period of days in
February of 1999 according to nationwide news reports. The problem
was reportedly due to bugs in a software upgrade intended to speed
online trade confirmations.
In April of 1998 a major U.S. data communications network failed for
24 hours, crippling a large part of some U.S. credit card transaction
authorization systems as well as other large U.S. bank, retail, and
government data systems. The cause was eventually traced to a
software bug.
January 1998 news reports told of software problems at a major U.S.
telecommunications company that resulted in no charges for long
distance calls for a month for 400,000 customers. The problem went
undetected until customers called up with questions about their bills.
In November of 1997 the stock of a major health industry company
dropped 60% due to reports of failures in computer billing systems,
problems with a large database conversion, and inadequate software
testing. It was reported that more than $100,000,000 in receivables
had to be written off and that multi-million dollar fines were levied on
the company by government agencies.
A retail store chain filed suit in August of 1997 against a transaction
processing system vendor (not a credit card company) due to the
software's inability to handle credit cards with year 2000 expiration
dates.
In August of 1997 one of the leading consumer credit reporting
companies reportedly shut down their new public web site after less
than two days of operation due to software problems. The new site
allowed web site visitors instant access, for a small fee, to their
personal credit reports. However, a number of initial users ended up
viewing each others' reports instead of their own, resulting in irate
customers and nationwide publicity. The problem was attributed to
"...unexpectedly high demand from consumers and faulty software that
routed the files to the wrong computers."
In November of 1996, newspapers reported that software bugs caused
the 411 telephone information system of one of the U.S. RBOC's to fail
for most of a day. Most of the 2000 operators had to search through
phone books instead of using their 13,000,000-listing database. The
bugs were introduced by new software modifications and the problem
software had been installed on both the production and backup
systems. A spokesman for the software vendor reportedly stated that 'It
had nothing to do with the integrity of the software. It was human error.'
On June 4 1996 the first flight of the European Space Agency's new
Ariane 5 rocket failed shortly after launching, resulting in an estimated
uninsured loss of a half billion dollars. It was reportedly due to the lack
of exception handling of a floating-point error in a conversion from a
64-bit integer to a 16-bit signed integer.
Software bugs caused the bank accounts of 823 customers of a major
U.S. bank to be credited with $924,844,208.32 each in May of 1996,
according to newspaper reports. The American Bankers Association
claimed it was the largest such error in banking history. A bank
spokesman said the programming errors were corrected and all funds
were recovered.
Software bugs in a Soviet early-warning monitoring system nearly
brought on nuclear war in 1983, according to news reports in early
1999. The software was supposed to filter out false missile detections
caused by Soviet satellites picking up sunlight reflections off cloud-
tops, but failed to do so. Disaster was averted when a Soviet
commander, based on what he said was a '...funny feeling in my gut',
decided the apparent missile attack was a false alarm. The filtering
software code was rewritten.
9. Why is it often hard for management to get serious about quality
assurance?

Solving problems is a high-visibility process; preventing problems is low-visibility.


This is illustrated by an old parable:
In ancient China there was a family of healers, one of whom was known
throughout the land and employed as a physician to a great lord. The physician
was asked which of his family was the most skillful healer. He replied,
"I tend to the sick and dying with drastic and dramatic treatments, and on
occasion someone is cured and my name gets out among the lords."
"My elder brother cures sickness when it just begins to take root, and his skills
are known among the local peasants and neighbors."
"My eldest brother is able to sense the spirit of sickness and eradicate it before it
takes form. His name is unknown outside our home."

10. Why does Software have bugs?


Miscommunication or no communication - as to specifics of what an
application should or shouldn't do (the application's requirements).
Software complexity - the complexity of current software applications
can be difficult to comprehend for anyone without experience in
modern-day software development. Windows-type interfaces, client-
server and distributed applications, data communications, enormous
relational databases, and sheer size of applications have all
contributed to the exponential growth in software/system complexity.
And the use of object-oriented techniques can complicate instead of
simplify a project unless it is well-engineered.
Programming errors - programmers, like anyone else, can make
mistakes.
Changing requirements - the customer may not understand the effects
of changes, or may understand and request them anyway - redesign,
rescheduling of engineers, effects on other projects, work already
completed that may have to be redone or thrown out, hardware
requirements that may be affected, etc. If there are many minor
changes or any major changes, known and unknown dependencies
among parts of the project are likely to interact and cause problems,
and the complexity of keeping track of changes may result in errors.
Enthusiasm of the engineering staff may be affected. In some fast-
changing business environments, continuously modified requirements
may be a fact of life. In this case, management must understand the
resulting risks, and QA and test engineers must adapt and plan for
continuous extensive testing to keep the inevitable bugs from running
out of control.
Time pressures - scheduling of software projects is difficult at best,
often requiring a lot of guesswork. When deadlines loom and the
crunch comes, mistakes will be made.
Egos - people prefer to say things like:
'no problem'
'piece of cake'
'I can whip that out in a few hours'
'it should be easy to update that old code'

instead of:
'that adds a lot of complexity and we could end up
making a lot of mistakes'
'we have no idea if we can do that; we'll wing it'
'I can't estimate how long it will take, until I
take a close look at it'
'we can't figure out what that old spaghetti code
did in the first place'

If there are too many unrealistic 'no problem's', the result is bugs.
Poorly documented code - it's tough to maintain and modify code that is
badly written or poorly documented; the result is bugs. In many
organizations management provides no incentive for programmers to
document their code or write clear, understandable code. In fact, it's
usually the opposite: they get points mostly for quickly turning out code,
and there's job security if nobody else can understand it ('if it was hard to
write, it should be hard to read').
Software development tools - visual tools, class libraries, compilers,
scripting tools, etc. often introduce their own bugs or are poorly
documented, resulting in added bugs.
11. How can new Software QA processes be introduced in an existing
organization?
A lot depends on the size of the organization and the risks involved. For
large organizations with high-risk (in terms of lives or property) projects,
serious management buy-in is required and a formalized QA process is
necessary.
Where the risk is lower, management and organizational buy-in and QA
implementation may be a slower, step-at-a-time process. QA processes
should be balanced with productivity so as to keep bureaucracy from
getting out of hand.
For small groups or projects, a more ad-hoc process may be appropriate,
depending on the type of customers and projects. A lot will depend on
team leads or managers, feedback to developers, and ensuring adequate
communications among customers, managers, developers, and testers.
In all cases the most value for effort will be in requirements management
processes, with a goal of clear, complete, testable requirement
specifications or expectations.
12. What is verification? validation?

Verification typically involves reviews and meetings to evaluate documents,
plans, code, requirements, and specifications. This can be done with checklists,
issues lists, walkthroughs, and inspection meetings. Validation typically involves
actual testing and takes place after verifications are completed. The term 'IV & V'
refers to Independent Verification and Validation.

13. What is a 'walkthrough'?

A 'walkthrough' is an informal meeting for evaluation or informational purposes.
Little or no preparation is usually required.

14. What's an 'inspection'?

An inspection is more formalized than a 'walkthrough', typically with 3-8 people
including a moderator, reader, and a recorder to take notes. The subject of the
inspection is typically a document such as a requirements spec or a test plan,
and the purpose is to find problems and see what's missing, not to fix anything.
Attendees should prepare for this type of meeting by reading through the document;
most problems will be found during this preparation. The result of the inspection
meeting should be a written report. Thorough preparation for inspections is
difficult, painstaking work, but is one of the most cost effective methods of
ensuring quality. Employees who are most skilled at inspections are like the
'eldest brother' in the parable in 'Why is it often hard for management to get
serious about quality assurance?'. Their skill may have low visibility but they are
extremely valuable to any software development organization, since bug
prevention is far more cost-effective than bug detection.

15. What kinds of testing should be considered?


Black box testing - not based on any knowledge of internal design or code.
Tests are based on requirements and functionality.
White box testing - based on knowledge of the internal logic of an
application's code. Tests are based on coverage of code statements,
branches, paths, conditions.
unit testing - the most 'micro' scale of testing; to test particular functions or
code modules. Typically done by the programmer and not by testers, as it
requires detailed knowledge of the internal program design and code. Not
always easily done unless the application has a well-designed architecture
with tight code; may require developing test driver modules or test
harnesses.
incremental integration testing - continuous testing of an application as
new functionality is added; requires that various aspects of an
application's functionality be independent enough to work separately
before all parts of the program are completed, or that test drivers be
developed as needed; done by programmers or by testers.
integration testing - testing of combined parts of an application to
determine if they function together correctly. The 'parts' can be code
modules, individual applications, client and server applications on a
network, etc. This type of testing is especially relevant to client/server and
distributed systems.
functional testing - black-box type testing geared to functional
requirements of an application; this type of testing should be done by
testers. This doesn't mean that the programmers shouldn't check that their
code works before releasing it (which of course applies to any stage of
testing.)
system testing - black-box type testing that is based on overall
requirements specifications; covers all combined parts of a system.
end-to-end testing - similar to system testing; the 'macro' end of the test
scale; involves testing of a complete application environment in a situation
that mimics real-world use, such as interacting with a database, using
network communications, or interacting with other hardware, applications,
or systems if appropriate.
sanity testing - typically an initial testing effort to determine if a new
software version is performing well enough to accept it for a major testing
effort. For example, if the new software is crashing systems every 5
minutes, bogging down systems to a crawl, or destroying databases, the
software may not be in a 'sane' enough condition to warrant further testing
in its current state.
regression testing - re-testing after fixes or modifications of the software or
its environment. It can be difficult to determine how much re-testing is
needed, especially near the end of the development cycle. Automated
testing tools can be especially useful for this type of testing.
acceptance testing - final testing based on specifications of the end-user
or customer, or based on use by end-users/customers over some limited
period of time.
load testing - testing an application under heavy loads, such as testing of
a web site under a range of loads to determine at what point the system's
response time degrades or fails.
stress testing - term often used interchangeably with 'load' and
'performance' testing. Also used to describe such tests as system
functional testing while under unusually heavy loads, heavy repetition of
certain actions or inputs, input of large numerical values, large complex
queries to a database system, etc.
performance testing - term often used interchangeably with 'stress' and
'load' testing. Ideally 'performance' testing (and any other 'type' of testing)
is defined in requirements documentation or QA or Test Plans.
usability testing - testing for 'user-friendliness'. Clearly this is subjective,
and will depend on the targeted end-user or customer. User interviews,
surveys, video recording of user sessions, and other techniques can be
used. Programmers and testers are usually not appropriate as usability
testers.
install/uninstall testing - testing of full, partial, or upgrade install/uninstall
processes.
recovery testing - testing how well a system recovers from crashes,
hardware failures, or other catastrophic problems.
security testing - testing how well the system protects against
unauthorized internal or external access, willful damage, etc; may require
sophisticated testing techniques.
compatibility testing - testing how well software performs in a particular
hardware/software/operating system/network/etc. environment.
exploratory testing - often taken to mean a creative, informal software test
that is not based on formal test plans or test cases; testers may be
learning the software as they test it.
ad-hoc testing - similar to exploratory testing, but often taken to mean that
the testers have significant understanding of the software before testing it.
user acceptance testing - determining if software is satisfactory to an end-
user or customer.
comparison testing - comparing software weaknesses and strengths to
competing products.
alpha testing - testing of an application when development is nearing
completion; minor design changes may still be made as a result of such
testing. Typically done by end-users or others, not by programmers or
testers.
beta testing - testing when development and testing are essentially
completed and final bugs and problems need to be found before final
release. Typically done by end-users or others, not by programmers or
testers.
mutation testing - a method for determining if a set of test data or test
cases is useful, by deliberately introducing various code changes ('bugs')
and retesting with the original test data/cases to determine if the 'bugs' are
detected. Proper implementation requires large computational resources.
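
The idea can be illustrated with a small hypothetical sketch (not taken from
any particular tool): a 'mutant' is planted deliberately, and the test data is
judged by whether it detects ('kills') the mutant.

    // Minimal mutation-testing sketch (hypothetical example).
    #include <iostream>

    bool is_adult(int age)        { return age >= 18; }  // original code
    bool is_adult_mutant(int age) { return age >  18; }  // mutant: >= changed to >

    int main() {
        // Weak test data: ages 10 and 30 get the same answer from both
        // versions, so the mutant survives -- evidence that the test data
        // misses the boundary.
        for (int age : {10, 30, 18}) {
            bool killed = is_adult(age) != is_adult_mutant(age);
            std::cout << "age " << age << ": "
                      << (killed ? "mutant killed" : "mutant survives") << '\n';
        }
        // Only the boundary value 18 kills the mutant, showing that adding
        // it makes the test set more adequate.
        return 0;
    }
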
16. What are 5 common problems in the software development process?
poor requirements - if requirements are unclear, incomplete, too
general, or not testable, there will be problems.
unrealistic schedule - if too much work is crammed in too little time,
problems are inevitable.
inadequate testing - no one will know whether or not the program is
any good until the customer complains or systems crash.
featuritis - requests to pile on new features after development is
underway; extremely common.
miscommunication - if developers don't know what's needed or customers
have erroneous expectations, problems are guaranteed.
17. What are 5 common solutions to software development problems?
solid requirements - clear, complete, detailed, cohesive, attainable,
testable requirements that are agreed to by all players. Use prototypes
to help nail down requirements.
realistic schedules - allow adequate time for planning, design, testing,
bug fixing, re-testing, changes, and documentation; personnel should
be able to complete the project without burning out.
adequate testing - start testing early on, re-test after fixes or changes,
plan for adequate time for testing and bug-fixing.
stick to initial requirements as much as possible - be prepared to
defend against changes and additions once development has begun,
and be prepared to explain consequences. If changes are necessary,
they should be adequately reflected in related schedule changes. If
possible, use rapid prototyping during the design phase so that
customers can see what to expect. This will provide them a higher
comfort level with their requirements decisions and minimize changes
later on.
communication - require walkthroughs and inspections when
appropriate; make extensive use of group communication tools - e-
mail, groupware, networked bug-tracking tools and change
management tools, intranet capabilities, etc.; ensure that
documentation is available and up-to-date - preferably electronic, not
paper; promote teamwork and cooperation; use prototypes early on so
that customers' expectations are clarified.
18. What is software 'quality'?

Quality software is reasonably bug-free, delivered on time and within budget,
meets requirements and/or expectations, and is maintainable. However, quality is
obviously a subjective term. It will depend on who the 'customer' is and their
overall influence in the scheme of things. A wide-angle view of the 'customers' of
a software development project might include end-users, customer acceptance
testers, customer contract officers, customer management, the development
organization's management/accountants/testers/salespeople, future software
maintenance engineers, stockholders, magazine columnists, etc. Each type of
'customer' will have their own slant on 'quality' - the accounting department might
define quality in terms of profits while an end-user might define quality as user-
friendly and bug-free.

19. What is 'good code'?

'Good code' is code that works, is bug-free, and is readable and maintainable.
Some organizations have coding 'standards' that all developers are supposed to
adhere to, but everyone has different ideas about what's best, or what is too
many or too few rules. There are also various theories and metrics, such as
McCabe Complexity metrics. It should be kept in mind that excessive use of
standards and rules can stifle productivity and creativity. 'Peer reviews', 'buddy
checks', code analysis tools, etc. can be used to check for problems and enforce
standards.
For C and C++ coding, here are some typical ideas to consider in setting
rules/standards; these may or may not apply to a particular situation (a short
illustrative fragment follows this list):
minimize or eliminate use of global variables.

use descriptive function and method names - use both upper and lower
case, avoid abbreviations, use as many characters as necessary to be
adequately descriptive (use of more than 20 characters is not out of line);
be consistent in naming conventions.
use descriptive variable names - use both upper and lower case, avoid
abbreviations, use as many characters as necessary to be adequately
descriptive (use of more than 20 characters is not out of line); be
consistent in naming conventions.
function and method sizes should be minimized; less than 100 lines of
code is good, less than 50 lines is preferable.
function descriptions should be clearly spelled out in comments preceding
a function's code.
organize code for readability.

use whitespace generously - vertically and horizontally

each line of code should contain 70 characters max.

one code statement per line.

coding style should be consistent throughout a program (e.g., use of brackets,
indentations, naming conventions, etc.)

in adding comments, err on the side of too many rather than too few
comments; a common rule of thumb is that there should be at least as
many lines of comments (including header blocks) as lines of code.
no matter how small, an application should include documentation of the
overall program function and flow (even a few paragraphs is better than
nothing); or if possible a separate flow chart and detailed program
documentation.
make extensive use of error handling procedures and status and error
logging.
for C++, to minimize complexity and increase maintainability, avoid too
many levels of inheritance in class hierarchies (relative to the size and
complexity of the application). Minimize use of multiple inheritance, and
minimize use of operator overloading (note that the Java programming
language eliminates multiple inheritance and operator overloading.)
for C++, keep class methods small, less than 50 lines of code per method
is preferable.
for C++, make liberal use of exception handlers.
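
As a short illustration of several of the suggestions above (descriptive
names, a comment block before the function, error handling, one statement per
line), consider this fragment; the names and error policy are illustrative
only, not a prescribed standard:

    #include <stdexcept>
    #include <vector>

    // ComputeAverageScore: returns the arithmetic mean of the given scores.
    // Throws std::invalid_argument for an empty list, so callers cannot
    // silently divide by zero.
    double ComputeAverageScore(const std::vector<double>& scores)
    {
        if (scores.empty()) {
            throw std::invalid_argument("ComputeAverageScore: empty score list");
        }

        double runningTotal = 0.0;
        for (double score : scores) {
            runningTotal += score;
        }

        return runningTotal / static_cast<double>(scores.size());
    }
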
20. What is 'good design'?
'Design' could refer to many things, but often refers to 'functional design' or
'internal design'. Good internal design is indicated by software code whose
overall structure is clear, understandable, easily modifiable, and maintainable; is
robust with sufficient error-handling and status logging capability; and works
correctly when implemented. Good functional design is indicated by an
application whose functionality can be traced back to customer and end-user
requirements. For programs that have a user interface, it's often a good idea to
assume that the end user will have little computer knowledge and may not read a
user manual or even the on-line help; some common rules-of-thumb include:
the program should act in a way that least surprises the user

it should always be evident to the user what can be done next and how
to exit
the program shouldn't let the users do something stupid without
warning them.
21. What is SEI? CMM? ISO? IEEE? ANSI? Will it help?
SEI = 'Software Engineering Institute' at Carnegie-Mellon University;
initiated by the U.S. Defense Department to help improve software
development processes.
CMM = 'Capability Maturity Model', developed by the SEI. It's a model
of 5 levels of organizational 'maturity' that determine effectiveness in
delivering quality software. It is geared to large organizations such as
large U.S. Defense Department contractors. However, many of the QA
processes involved are appropriate to any organization, and if
reasonably applied can be helpful. Organizations can receive CMM
ratings by undergoing assessments by qualified auditors.
Level 1 - characterized by chaos, periodic panics, and heroic
efforts required by individuals to successfully
complete projects. Few if any processes in place;
successes may not be repeatable.

Level 2 - software project tracking, requirements management,
realistic planning, and configuration management
processes are in place; successful practices can
be repeated.

Level 3 - standard software development and maintenance processes
are integrated throughout an organization; a Software
Engineering Process Group is in place to oversee
software processes, and training programs are used to
ensure understanding and compliance.

Level 4 - metrics are used to track productivity, processes,
and products. Project performance is predictable,
and quality is consistently high.

Level 5 - the focus is on continuous process improvement. The
impact of new processes and technologies can be
predicted and effectively implemented when required.

Perspective on CMM ratings: During 1997-2001, 1018 organizations
were assessed. Of those, 27% were rated at Level 1, 39% at 2,
23% at 3, 6% at 4, and 5% at 5. (For ratings during the period
1992-96, 62% were at Level 1, 23% at 2, 13% at 3, 2% at 4, and
0.4% at 5.) The median size of organizations was 100 software
engineering/maintenance personnel; 32% of organizations were
U.S. federal contractors or agencies. For those rated at
Level 1, the most problematical key process area was in
Software Quality Assurance.
ISO = 'International Organization for Standardization' - The ISO 9001:2000
standard (which replaces the previous standard of 1994) concerns quality
systems that are assessed by outside auditors, and it applies to many
kinds of production and manufacturing organizations, not just software. It
covers documentation, design, development, production, testing,
installation, servicing, and other processes. The full set of standards
consists of: (a) Q9001-2000 - Quality Management Systems:
Requirements; (b) Q9000-2000 - Quality Management Systems:
Fundamentals and Vocabulary; (c) Q9004-2000 - Quality Management
Systems: Guidelines for Performance Improvements. To be ISO 9001
certified, a third-party auditor assesses an organization, and certification is
typically good for about 3 years, after which a complete reassessment is
required. Note that ISO certification does not necessarily indicate quality
products - it indicates only that documented processes are followed.
IEEE = 'Institute of Electrical and Electronics Engineers' - among other
things, creates standards such as 'IEEE Standard for Software Test
Documentation' (IEEE/ANSI Standard 829), 'IEEE Standard for Software
Unit Testing' (IEEE/ANSI Standard 1008), 'IEEE Standard for Software
Quality Assurance Plans' (IEEE/ANSI Standard 730), and others.
ANSI = 'American National Standards Institute', the primary industrial
standards body in the U.S.; publishes some software-related standards in
conjunction with the IEEE and ASQ (American Society for Quality).
Other software development process assessment methods besides CMM
and ISO 9000 include SPICE, Trillium, TickIT, and Bootstrap.
22. What is the 'software life cycle'?

The life cycle begins when an application is first conceived and ends when it is
no longer in use. It includes aspects such as initial concept, requirements
analysis, functional design, internal design, documentation planning, test
planning, coding, document preparation, integration, testing, maintenance,
updates, retesting, phase-out, and other aspects.

23. Will automated testing tools make testing easier?


Possibly. For small projects, the time needed to learn and implement
them may not be worth it. For larger projects, or on-going long-term
projects, they can be valuable.
A common type of automated tool is the 'record/playback' type. For
example, a tester could click through all combinations of menu
choices, dialog box choices, buttons, etc. in an application GUI and
have them 'recorded' and the results logged by a tool. The 'recording'
is typically in the form of text based on a scripting language that is
interpretable by the testing tool. If new buttons are added, or some
underlying code in the application is changed, etc. the application can
then be retested by just 'playing back' the 'recorded' actions, and
comparing the logging results to check effects of the changes. The
problem with such tools is that if there are continual changes to the
system being tested, the 'recordings' may have to be changed so much
that it becomes very time-consuming to continuously update the
scripts. Additionally, interpretation of results (screens, data, logs, etc.)
can be a difficult task. Note that there are record/playback tools for
text-based interfaces also, and for all types of platforms.
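
The record/playback idea can be sketched generically; the following is a
hypothetical illustration (real tools record GUI events and use their own
scripting languages), in which 'recorded' inputs are replayed against the
application logic and the new outputs are compared with the outputs logged
during recording:

    #include <iostream>
    #include <string>
    #include <vector>

    // Stand-in for the application behavior under test.
    std::string application(const std::string& input) {
        return "echo:" + input;
    }

    struct RecordedStep {
        std::string input;           // action captured during recording
        std::string expectedOutput;  // result logged during recording
    };

    int main() {
        // The 'recording', normally produced by the tool and maintained
        // as an editable script.
        std::vector<RecordedStep> script = {
            {"open file", "echo:open file"},
            {"save file", "echo:save file"},
        };

        // 'Playback': re-run each step and compare with the logged result.
        for (const RecordedStep& step : script) {
            std::string actual = application(step.input);
            std::cout << step.input << ": "
                      << (actual == step.expectedOutput ? "PASS" : "FAIL") << '\n';
        }
        return 0;
    }
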
Other automated tools can include:
code analyzers - monitor code complexity, adherence to
standards, etc.

coverage analyzers - these tools check which parts of the code have
been exercised by a test, and may be oriented to code statement
coverage, condition coverage, path coverage, etc.

memory analyzers - such as bounds-checkers and leak detectors.

load/performance test tools - for testing client/server and web
applications under various load levels.

web test tools - to check that links are valid, HTML code usage is
correct, client-side and server-side programs work, and a web site's
interactions are secure.

other tools - for test case management, documentation management,
bug reporting, and configuration management.

24. What makes a good test engineer?

A good test engineer has a 'test to break' attitude, an ability to take the point of
view of the customer, a strong desire for quality, and an attention to detail. Tact
and diplomacy are useful in maintaining a cooperative relationship with
developers, and an ability to communicate with both technical (developers) and
non-technical (customers, management) people is useful. Previous software
development experience can be helpful as it provides a deeper understanding of
the software development process, gives the tester an appreciation for the
developers' point of view, and reduces the learning curve in automated test tool
programming. Judgment skills are needed to assess high-risk areas of an
application on which to focus testing efforts when time is limited.

25. What makes a good Software QA engineer?

The same qualities a good tester has are useful for a QA engineer. Additionally,
they must be able to understand the entire software development process and
how it can fit into the business approach and goals of the organization.
Communication skills and the ability to understand various sides of issues are
important. In organizations in the early stages of implementing QA processes,
patience and diplomacy are especially needed. An ability to find problems as well
as to see 'what's missing' is important for inspections and reviews.

26. What makes a good QA or Test manager?


A good QA, test, or QA/Test(combined) manager should:
be familiar with the software development process

be able to maintain enthusiasm of their team and promote a positive
atmosphere, despite what is a somewhat 'negative' process (e.g.,
looking for or preventing problems)
be able to promote teamwork to increase productivity

be able to promote cooperation between software, test, and QA
engineers
have the diplomatic skills needed to promote improvements in QA
processes
have the ability to withstand pressures and say 'no' to other managers
when quality is insufficient or QA processes are not being adhered to
have people judgement skills for hiring and keeping skilled personnel

be able to communicate with technical and non-technical people,
engineers, managers, and customers.
be able to run meetings and keep them focused

27. What's the role of documentation in QA?


Critical. (Note that documentation can be electronic, not necessarily paper.) QA
practices should be documented such that they are repeatable. Specifications,
designs, business rules, inspection reports, configurations, code changes, test
plans, test cases, bug reports, user manuals, etc. should all be documented.
There should ideally be a system for easily finding and obtaining documents and
determining which documents contain a particular piece of information.
Change management for documentation should be used if possible.
28. What's the big deal about 'requirements'?
One of the most reliable methods of ensuring problems, or failure, in a complex
software project is to have poorly documented requirements specifications.
Requirements are the details describing an application's externally-perceived
functionality and properties. Requirements should be clear, complete, reasonably
detailed, cohesive, attainable, and testable. A non-testable requirement would be,
for example, 'user-friendly' (too subjective). A testable requirement would be
something like 'the user must enter their previously-assigned password to access
the application'. Determining and organizing requirements details in a useful and
efficient way can be a difficult effort; different methods are available depending
on the particular project. Many books are available that describe various
approaches to this task.
Care should be taken to involve ALL of a project's significant 'customers' in the
requirements process. 'Customers' could be in-house personnel or out, and could
include end-users, customer acceptance testers, customer contract officers,
customer management, future software maintenance engineers, salespeople,
etc. Anyone who could later derail the project if their expectations aren't met
should be included if possible.

Organizations vary considerably in their handling of requirements specifications.
Ideally, the requirements are spelled out in a document with statements such as
'The product shall...'. 'Design' specifications should not be confused with
'requirements'; design specifications should be traceable back to the
requirements.
In some organizations requirements may end up in high level project plans,
functional specification documents, in design documents, or in other documents
at various levels of detail. No matter what they are called, some type of
documentation with detailed requirements will be needed by testers in order to
properly plan and execute tests. Without such documentation, there will be no
clear-cut way to determine if a software application is performing correctly.
29. What steps are needed to develop and run software tests?
The following are some of the steps to consider:
Obtain requirements, functional design, and internal design specifications
and other necessary documents
Obtain budget and schedule requirements

Determine project-related personnel and their responsibilities, reporting
requirements, required standards and processes (such as release
processes, change processes, etc.)
Identify application's higher-risk aspects, set priorities, and determine
scope and limitations of tests
Determine test approaches and methods - unit, integration, functional,
system, load, usability tests, etc.
Determine test environment requirements (hardware, software,
communications, etc.)
Determine testware requirements (record/playback tools, coverage
analyzers, test tracking, problem/bug tracking, etc.)
Determine test input data requirements

Identify tasks, those responsible for tasks, and labor requirements

Set schedule estimates, timelines, milestones

Determine input equivalence classes, boundary value analyses, error
classes (a small example follows this list)
Prepare test plan document and have needed reviews/approvals

Write test cases

Have needed reviews/inspections/approvals of test cases

Prepare test environment and testware, obtain needed user
manuals/reference documents/configuration guides/installation guides, set
up test tracking processes, set up logging and archiving processes, set up
or obtain test input data
Obtain and install software releases

Perform tests

Evaluate and report results

Track problems/bugs and fixes

Retest as needed

Maintain and update test plans, test cases, test environment, and testware
through life cycle
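
As a small example of the 'equivalence classes and boundary values' step
mentioned above (hypothetical requirement: a valid age is 1 through 120), one
value can represent each equivalence class, plus the values on and around
each boundary:

    #include <iostream>

    bool is_valid_age(int age) { return age >= 1 && age <= 120; }

    int main() {
        struct { int value; bool expected; } cases[] = {
            {-5, false},                             // class: below the valid range
            {0, false}, {1, true}, {2, true},        // lower boundary and neighbors
            {60, true},                              // class: inside the valid range
            {119, true}, {120, true}, {121, false},  // upper boundary and neighbors
        };
        for (const auto& c : cases) {
            bool pass = (is_valid_age(c.value) == c.expected);
            std::cout << "age " << c.value << ": " << (pass ? "PASS" : "FAIL") << '\n';
        }
        return 0;
    }
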
30. What's a 'test plan'?

A software project test plan is a document that describes the objectives, scope,
approach, and focus of a software testing effort. The process of preparing a test
plan is a useful way to think through the efforts needed to validate the
acceptability of a software product. The completed document will help people
outside the test group understand the 'why' and 'how' of product validation. It
should be thorough enough to be useful but not so thorough that no one outside
the test group will read it. The following are some of the items that might be
included in a test plan, depending on the particular project:
Title

Identification of software including version/release numbers

Revision history of document including authors, dates, approvals

Table of Contents

Purpose of document, intended audience

Objective of testing effort

Software product overview

Relevant related document list, such as requirements, design documents,
other test plans, etc.
Relevant standards or legal requirements

Traceability requirements

Relevant naming conventions and identifier conventions

Overall software project organization and personnel/contact-
info/responsibilities
Test organization and personnel/contact-info/responsibilities

Assumptions and dependencies

Project risk analysis

Testing priorities and focus

Scope and limitations of testing

Test outline - a decomposition of the test approach by test type, feature,
functionality, process, system, module, etc. as applicable
Outline of data input equivalence classes, boundary value analysis, error
classes
Test environment - hardware, operating systems, other required software,
data configurations, interfaces to other systems
Test environment validity analysis - differences between the test and
production systems and their impact on test validity.
Test environment setup and configuration issues

Software migration processes

Software CM processes

Test data setup requirements

Database setup requirements

Outline of system-logging/error-logging/other capabilities, and tools such
as screen capture software, that will be used to help describe and report
bugs
Discussion of any specialized software or hardware tools that will be used
by testers to help track the cause or source of bugs
Test automation - justification and overview

Test tools to be used, including versions, patches, etc.

Test script/test code maintenance processes and version control

Problem tracking and resolution - tools and processes

Project test metrics to be used

Reporting requirements and testing deliverables

Software entrance and exit criteria

Initial sanity testing period and criteria

Test suspension and restart criteria

Personnel allocation

Personnel pre-training needs

Test site/location

Outside test organizations to be utilized and their purpose, responsibilities,
deliverables, contact persons, and coordination issues
Relevant proprietary, classified, security, and licensing issues.

Open issues

Appendix - glossary, acronyms, etc.


31. What's a 'test case'?
A test case is a document that describes an input, action, or event and an
expected response, to determine if a feature of an application is working
correctly. A test case should contain particulars such as test case identifier,
test case name, objective, test conditions/setup, input data requirements,
steps, and expected results.
Note that the process of developing test cases can help find problems in
the requirements or design of an application, since it requires completely
thinking through the operation of the application. For this reason, it's useful
to prepare test cases early in the development cycle if possible.
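
The particulars listed above can be pictured as a simple record; the field
names below are illustrative only, not a standard:

    #include <string>
    #include <vector>

    struct TestCase {
        std::string id;                 // test case identifier, e.g. 'TC-042'
        std::string name;               // short descriptive test case name
        std::string objective;          // what the test is meant to establish
        std::string setup;              // test conditions / setup
        std::string inputData;          // input data requirements
        std::vector<std::string> steps; // actions to perform, in order
        std::string expectedResult;     // the response that counts as a pass
    };
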
32. What should be done after a bug is found?
The bug needs to be communicated and assigned to developers that can fix it.
After the problem is resolved, fixes should be re-tested, and determinations
made regarding requirements for regression testing to check that fixes didn't
create problems elsewhere. If a problem-tracking system is in place, it should
encapsulate these processes. A variety of commercial problem-
tracking/management software tools are available. The following are items to
consider in the tracking process:
Complete information such that developers can understand the bug,
get an idea of its severity, and reproduce it if necessary.
Bug identifier (number, ID, etc.)

Current bug status (e.g., 'Released for Retest', 'New', etc.)

The application name or identifier and version

The function, module, feature, object, screen, etc. where the bug
occurred
Environment specifics, system, platform, relevant hardware specifics

Test case name/number/identifier

One-line bug description

Full bug description

Description of steps needed to reproduce the bug if not covered by a
test case or if the developer doesn't have easy access to the test
case/test script/test tool
Names and/or descriptions of file/data/messages/etc. used in test

File excerpts/error messages/log file excerpts/screen shots/test tool
logs that would be helpful in finding the cause of the problem
Severity estimate (a 5-level range such as 1-5 or 'critical'-to-'low' is
common)
Was the bug reproducible?

Tester name

Test date

Bug reporting date

Name of developer/group/organization the problem is assigned to

Description of problem cause

Description of fix

Code section/file/module/class/method that was fixed

Date of fix

Application version that contains the fix

Tester responsible for retest

Retest date

Retest results

Regression testing requirements

Tester responsible for regression tests

Regression testing results


A reporting or tracking process should enable notification of appropriate
personnel at various stages. For instance, testers need to know when retesting is
needed, developers need to know when bugs are found and how to get the
needed information, and reporting/summary capabilities are needed for
managers.
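
A few of the items above can likewise be sketched as a minimal bug record
(hypothetical field names; real problem-tracking tools define their own,
much richer schemas):

    #include <string>

    enum class BugStatus { New, Assigned, Fixed, ReleasedForRetest, Closed };

    struct BugReport {
        std::string id;           // bug identifier
        BugStatus   status;       // current bug status
        std::string application;  // application name/identifier and version
        std::string summary;      // one-line bug description
        std::string description;  // full description and steps to reproduce
        int         severity;     // e.g. 1 (critical) to 5 (low)
        std::string assignedTo;   // developer/group the problem is assigned to
    };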

33. What is 'configuration management'?

Configuration management covers the processes used to control, coordinate,
and track: code, requirements, documentation, problems, change requests,
designs, tools/compilers/libraries/patches, changes made to them, and who
makes the changes.

34. What if the software is so buggy it can't really be tested at all?

The best bet in this situation is for the testers to go through the process of
reporting whatever bugs or blocking-type problems initially show up, with the
focus being on critical bugs. Since this type of problem can severely affect
schedules, and indicates deeper problems in the software development process
(such as insufficient unit testing or insufficient integration testing, poor design,
improper build or release procedures, etc.), managers should be notified, and
provided with some documentation as evidence of the problem.

35. How can it be known when to stop testing?

This can be difficult to determine. Many modern software applications are so
complex, and run in such an interdependent environment, that complete testing
can never be done. Common factors in deciding when to stop are:
Deadlines (release deadlines, testing deadlines, etc.)

Test cases completed with certain percentage passed

Test budget depleted

Coverage of code/functionality/requirements reaches a specified point

Bug rate falls below a certain level

Beta or alpha testing period ends


36. What if there isn't enough time for thorough testing?

Use risk analysis to determine where testing should be focused.


Since it's rarely possible to test every possible aspect of an application, every
possible combination of events, every dependency, or everything that could go
wrong, risk analysis is appropriate to most software development projects. This
requires judgement skills, common sense, and experience. (If warranted, formal
methods are also available.) Considerations can include:
Which functionality is most important to the project's intended
purpose?
Which functionality is most visible to the user?

Which functionality has the largest safety impact?

Which functionality has the largest financial impact on users?

Which aspects of the application are most important to the customer?

Which aspects of the application can be tested early in the
development cycle?
Which parts of the code are most complex, and thus most subject to
errors?
Which parts of the application were developed in rush or panic mode?

Which aspects of similar/related previous projects caused problems?

Which aspects of similar/related previous projects had large
maintenance expenses?
Which parts of the requirements and design are unclear or poorly
thought out?
What do the developers think are the highest-risk aspects of the
application?
What kinds of problems would cause the worst publicity?

What kinds of problems would cause the most customer service
complaints?
What kinds of tests could easily cover multiple functionalities?

Which tests will have the best high-risk-coverage to time-required
ratio?
37. What can be done if requirements are changing continuously?
A common problem and a major headache.
Work with the project's stakeholders early on to understand how
requirements might change so that alternate test plans and strategies can
be worked out in advance, if possible.
It's helpful if the application's initial design allows for some adaptability so
that later changes do not require redoing the application from scratch.
If the code is well-commented and well-documented this makes changes
easier for the developers.
Use rapid prototyping whenever possible to help customers feel sure of
their requirements and minimize changes.

The project's initial schedule should allow for some extra time
commensurate with the possibility of changes.
Try to move new requirements to a 'Phase 2' version of an application,
while using the original requirements for the 'Phase 1' version.
Negotiate to allow only easily-implemented new requirements into the
project, while moving more difficult new requirements into future versions
of the application.
Be sure that customers and management understand the scheduling
impacts, inherent risks, and costs of significant requirements changes.
Then let management or the customers (not the developers or testers)
decide if the changes are warranted - after all, that's their job.
Balance the effort put into setting up automated testing with the expected
effort required to re-do them to deal with changes.
Try to design some flexibility into automated test scripts.

Focus initial automated testing on application aspects that are most likely
to remain unchanged.
Devote appropriate effort to risk analysis of changes to minimize
regression testing needs.
Design some flexibility into test cases (this is not easily done; the best bet
might be to minimize the detail in the test cases, or set up only higher-
level generic-type test plans)
Focus less on detailed test plans and test cases and more on ad hoc
testing (with an understanding of the added risk that this entails).
38. What if the project isn't big enough to justify extensive testing?

Consider the impact of project errors, not the size of the project. However, if
extensive testing is still not justified, risk analysis is again needed and the same
considerations as described previously in 'What if there isn't enough time for
thorough testing?' apply. The tester might then do ad hoc testing, or write up a
limited test plan based on the risk analysis.

39. What if the application has functionality that wasn't in the
requirements?

It may take serious effort to determine if an application has significant
unexpected or hidden functionality, and it would indicate deeper problems in the
software development process. If the functionality isn't necessary to the purpose
of the application, it should be removed, as it may have unknown impacts or
dependencies that were not taken into account by the designer or the customer.
If not removed, design information will be needed to determine added testing
needs or regression testing needs. Management should be made aware of any
significant added risks as a result of the unexpected functionality. If the
functionality only affects areas such as minor improvements in the user interface,
for example, it may not be a significant risk.

40. How can Software QA processes be implemented without stifling
productivity?
By implementing QA processes slowly over time, using consensus to reach
agreement on processes, and adjusting and experimenting as an organization
grows and matures, productivity will be improved instead of stifled. Problem
prevention will lessen the need for problem detection, panics and burn-out will
decrease, and there will be improved focus and less wasted effort. At the same
time, attempts should be made to keep processes simple and efficient, minimize
paperwork, promote computer-based processes and automated tracking and
reporting, minimize time required in meetings, and promote training as part of the
QA process. However, no one - especially talented technical types - likes rules or
bureaucracy, and in the short run things may slow down a bit. A typical scenario
would be that more days of planning and development will be needed, but less
time will be required for late-night bug-fixing and calming of irate customers.

41. What if an organization is growing so fast that fixed QA processes are
impossible?
This is a common problem in the software industry, especially in new technology
areas. There is no easy solution in this situation, other than:
Hire good people

Management should 'ruthlessly prioritize' quality issues and maintain focus
on the customer
Everyone in the organization should be clear on what 'quality' means to
the customer
42. How does a client/server environment affect testing?
Client/server applications can be quite complex due to the multiple dependencies
among clients, data communications, hardware, and servers. Thus testing
requirements can be extensive. When time is limited (as it usually is) the focus
should be on integration and system testing. Additionally,
load/stress/performance testing may be useful in determining client/server
application limitations and capabilities. There are commercial tools to assist with
such testing.

43. How can World Wide Web sites be tested?

Web sites are essentially client/server applications - with web servers and
'browser' clients. Consideration should be given to the interactions between html
pages, TCP/IP communications, Internet connections, firewalls, applications that
run in web pages (such as applets, javascript, plug-in applications), and
applications that run on the server side (such as cgi scripts, database interfaces,
logging applications, dynamic page generators, asp, etc.). Additionally, there are
a wide variety of servers and browsers, various versions of each, small but
sometimes significant differences between them, variations in connection
speeds, rapidly changing technologies, and multiple standards and protocols.
The end result is that testing for web sites can become a major ongoing effort.
Other considerations might include:
What are the expected loads on the server (e.g., number of hits per unit
time?), and what kind of performance is required under such loads (such
as web server response time, database query response times). What
kinds of tools will be needed for performance testing (such as web load
testing tools, other tools already in house that can be adapted, web robot
downloading tools, etc.)?
Who is the target audience? What kind of browsers will they be using?
What kind of connection speeds will they be using? Are they intra-
organization (thus with likely high connection speeds and similar
browsers) or Internet-wide (thus with a wide variety of connection speeds
and browser types)?
What kind of performance is expected on the client side (e.g., how fast
should pages appear, how fast should animations, applets, etc. load and
run)?
Will downtime for server and content maintenance/upgrades be allowed?
How much?
What kinds of security (firewalls, encryptions, passwords, etc.) will be
required and what is it expected to do? How can it be tested?
How reliable are the site's Internet connections required to be? And how
does that affect backup system or redundant connection requirements and
testing?
What processes will be required to manage updates to the web site's
content, and what are the requirements for maintaining, tracking, and
controlling page content, graphics, links, etc.?
Which HTML specification will be adhered to? How strictly? What
variations will be allowed for targeted browsers?
Will there be any standards or requirements for page appearance and/or
graphics throughout a site or parts of a site?
How will internal and external links be validated and updated? How often?

Can testing be done on the production system, or will a separate test
system be required? How are browser caching, variations in browser
option settings, dial-up connection variabilities, and real-world internet
'traffic congestion' problems to be accounted for in testing?

How extensive or customized are the server logging and reporting
requirements; are they considered an integral part of the system and do
they require testing?
How are cgi programs, applets, javascripts, ActiveX components, etc. to
be maintained, tracked, controlled, and tested?
Pages should be 3-5 screens max unless content is tightly focused on a
single topic. If larger, provide internal links within the page.
The page layouts and design elements should be consistent throughout a
site, so that it's clear to the user that they're still within a site.
Pages should be as browser-independent as possible, or pages should be
provided or generated based on the browser-type.
All pages should have links external to the page; there should be no dead-
end pages.
The page owner, revision date, and a link to a contact person or
organization should be included on each page.
44. How is testing affected by object-oriented designs?
Well-engineered object-oriented design can make it easier to trace from code to
internal design to functional design to requirements. While there will be little
effect on black box testing (where an understanding of the internal design of the
application is unnecessary), white-box testing can be oriented to the application's
objects. If the application was well-designed this can simplify test design.
45. What is Extreme Programming and what's it got to do with testing?
Extreme Programming (XP) is a software development approach for small teams
on risk-prone projects with unstable requirements. It was created by Kent Beck
who described the approach in his book 'Extreme Programming Explained'.
Testing ('extreme testing') is a core aspect of Extreme Programming.
Programmers are expected to write unit and functional test code first - before the
application is developed. Test code is under source control along with the rest of
the code. Customers are expected to be an integral part of the project team and
to help develop scenarios for acceptance/black box testing. Acceptance tests
are preferably automated, and are modified and rerun for each of the frequent
development iterations. QA and test personnel are also required to be an integral
part of the project team. Detailed requirements documentation is not used, and
frequent re-scheduling, re-estimating, and re-prioritizing is expected.
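
The 'write the test first' practice can be given a minimal flavor in code
(hypothetical example; XP teams normally use a unit-testing framework such as
an xUnit tool rather than bare asserts):

    #include <cassert>

    // Step 1: the test is written first; it cannot even link until
    // fahrenheit_to_celsius exists.
    double fahrenheit_to_celsius(double f);

    void test_fahrenheit_to_celsius() {
        assert(fahrenheit_to_celsius(32.0) == 0.0);
        assert(fahrenheit_to_celsius(212.0) == 100.0);
    }

    // Step 2: the simplest implementation that makes the test pass.
    double fahrenheit_to_celsius(double f) { return (f - 32.0) * 5.0 / 9.0; }

    int main() {
        test_fahrenheit_to_celsius();
        return 0;  // reaching here means the tests passed
    }
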
46. Common Software Errors
Introduction

This document takes you through a whirlwind tour of common software errors.
It is an excellent aid for software testing. It helps you to identify errors
systematically, increases the efficiency of software testing, and improves
testing productivity. For more information, please refer to Testing Computer
Software (Wiley).

Type of Errors

User Interface Errors

Error Handling

Boundary related errors

Calculation errors

Initial and Later states


Control flow errors

Errors in Handling or Interpreting Data

Race Conditions

Load Conditions

Hardware

Source, Version and ID Control

Testing Errors

Let us go through details of each kind of error.

User Interface Errors

Functionality
Sl No Possible Error Conditions
1 Excessive Functionality
2 Inflated impression of functionality
3 Inadequacy for the task at hand
4 Missing function
5 Wrong function
6 Functionality must be created by user
7 Doesn't do what the user expects

Communication
Missing Information
Sl No Possible Error Conditions
1 No on-screen instructions
2 Assuming printed documentation is already available.
3 Undocumented features
4 States that appear impossible to exit
5 No cursor
6 Failure to acknowledge input
7 Failure to show activity during long delays
8 Failure to advise when a change will take effect
9 Failure to check for the same document being opened twice
Wrong, misleading, confusing information
10 Simple factual errors
11 Spelling errors
12 Inaccurate simplifications
13 Invalid metaphors
14 Confusing feature names
15 More than one name for the same feature
16 Information overload
17 When are data saved
18 Wrong function
19 Functionality must be created by user
20 Poor external modularity
Help text and error messages
21 Inappropriate reading levels
22 Verbosity
23 Inappropriate emotional tone
24 Factual errors
25 Context errors
26 Failure to identify the source of error
27 Forbidding a resource without saying why
28 Reporting non-errors
29 Failure to highlight the part of the screen
30 Failure to clear highlighting
31 Wrong/partial string displayed
32 Message displayed for too long or not long enough
Display Layout
33 Poor aesthetics in screen layout
34 Menu Layout errors
35 Dialog box layout errors
36 Obscured Instructions
37 Misuse of flash
38 Misuse of color
39 Heavy reliance on color
40 Inconsistent with the style of the environment
41 Cannot get rid of on screen information
Output
42 Can't output certain data
43 Can't redirect output
44 Format incompatible with a follow-up process
45 Must output too little or too much
46 Can't control output layout
47 Absurd printout level of precision
48 Can't control labeling of tables or figures
49 Can't control scaling of graphs
Performance
50 Program Speed
51 User Throughput
52 Can't redirect output
53 Perceived performance
54 Slow program
55 Slow echoing
56 How to reduce user throughput
57 Poor responsiveness
58 No type ahead
59 No warning that the operation takes long time
60 No progress reports
61 Problems with time-outs
62 Program pesters you

Program Rigidity
User tailorability
Sl No Possible Error Conditions
1 Can't turn off case sensitivity
2 Can't tailor to hardware at hand
3 Can't change device initialization
4 Can't turn off automatic changes
5 Can't slow down/speed up scrolling
6 Can't do what you did last time
7 Failure to execute a customization commands
8 Failure to save customization commands
9 Side effects of feature changes
10 Can't turn off the noise
11 Infinite tailorability
Who is in control?
12 Unnecessary imposition of a conceptual style
13 Novice friendly, experienced hostile
14 Surplus or redundant information required
15 Unnecessary repetition of steps
16 Unnecessary limits

Command Structure and Rigidity


Inconsistencies
Sl No Possible Error Conditions
1 Optimizations
2 Inconsistent syntax
3 Inconsistent command entry style
4 Inconsistent abbreviations
5 Inconsistent termination rule
6 Inconsistent command options
7 Similarly named commands
8 Inconsistent Capitalization
9 Inconsistent menu position
10 Inconsistent function key usage
11 Inconsistent error handling rules
12 Inconsistent editing rules
13 Inconsistent data saving rules
Time Wasters
14 Garden paths
15 Choice can't be taken
16 Are you really, really sure
17 Obscurely or idiosyncratically named commands
Menus
18 Excessively complex menu hierarchy
19 Inadequate menu navigation options
20 Too many paths to the same place
21 You can't get there from here
22 Related commands relegated to unrelated menus
23 Unrelated commands tossed under the same menu
Command Lines
24 Forced distinction between uppercase and lowercase
25 Reversed parameters
26 Full command names are not allowed
27 Abbreviations are not allowed
28 Demands complex input on one line
29 No batch input
30 Can't edit commands
Inappropriate use of keyboard
31 Failure to use cursor, edit, or function keys
32 Non-standard use of cursor and edit keys
33 Non-standard use of function keys
34 Failure to filter invalid keys
35 Failure to indicate keyboard state changes

Missing Commands
State transitions
Sl No Possible Error Conditions
1 Can't do nothing and leave
2 Can't quit mid-program
3 Can't stop mid-command
4 Can't pause
Disaster prevention
5 No backup facility
6 No undo
7 No are you sure
8 No incremental saves

Error handling by the user
9 No user-specifiable filters
10 Awkward error correction
11 Can't include comments
12 Can't display relationships between variables
Miscellaneous
13 Inadequate privacy or security
14 Obsession with security
15 Can't hide menus
16 Doesn't support standard OS features
17 Doesn't allow long names

Error Handling

Error prevention
Sl No Possible Error Conditions
1 Inadequate initial state validation
2 Inadequate tests of user input
3 Inadequate protection against corrupted data
4 Inadequate tests of passed parameters
5 Inadequate protection against operating system bugs
6 Inadequate protection against malicious use
7 Inadequate version control

Error Detection
Sl No Possible Error Conditions
1 Ignores overflow
2 Ignores impossible values
3 Ignores implausible values
4 Ignores error flag
5 Ignores hardware fault or error conditions
6 Data comparison

Error Recovery
Sl No Possible Error Conditions
1 Automatic error detection
2 Failure to report error
3 Failure to set an error flag
4 Where does the program go back to
5 Aborting errors
6 Recovery from hardware problems

7 No escape from missing disks

Boundary related errors

Sl No Possible Error Conditions


1 Numeric boundaries
2 Equality as boundary
3 Boundaries on numerosity
4 Boundaries in space
5 Boundaries in time
6 Boundaries in loop
7 Boundaries in memory
8 Boundaries with data structure
9 Hardware related boundaries
10 Invisible boundaries
11 Mishandling of boundary case
12 Wrong boundary
13 Mishandling of cases outside boundary

Calculation Errors

Sl No Possible Error Conditions


1 Bad Logic
2 Bad Arithmetic
3 Imprecise Calculations
4 Outdated constants
5 Calculation errors
6 Impossible parentheses
7 Wrong order of calculations
8 Bad underlying functions
9 Overflow and Underflow
10 Truncation and Round-off error
11 Confusion about the representation of data
12 Incorrect conversion from one data representation to another
13 Wrong Formula
14 Incorrect Approximation

Race Conditions

Sl No Possible Error Conditions


1 Races in updating data
2 Assumption that one event or task has finished before another begins
3 Assumption that input won't occur during a brief processing interval
4 Assumption that interrupts won't occur during a brief interval
5 Resource races
6 Assumption that a person, device or process will respond quickly
7 Options out of sync during display changes
8 Task starts before its prerequisites are met
9 Messages cross or don't arrive in the order sent

Initial and Later States

Sl No Possible Error Conditions


1 Failure to set data item to zero
2 Failure to initialize a loop-control variable
3 Failure to initialize or re-initialize a pointer
4 Failure to clear a string
5 Failure to initialize a register
6 Failure to clear a flag
7 Data were supposed to be initialized elsewhere
8 Failure to re-initialize
9 Assumption that data were not re-initialized
10 Confusion between static and dynamic storage
11 Data modifications by side effect
12 Incorrect initialization

Control Flow Errors

Program runs amok


Sl No Possible Error Conditions
1 Jumping to a routine that isn't resident
2 Re-entrance
3 Variable contains embedded command names
4 Wrong returning state assumed
5 Exception handling based exits

Return to wrong place


Sl No Possible Error Conditions
1 Corrupted Stack
2 Stack underflow/overflow
3 GOTO rather than RETURN from sub-routine
Interrupts
Sl No Possible Error Conditions
1 Wrong interrupt vector
2 Failure to restore or update interrupt vector
3 Invalid restart after an interrupt
4 Failure to block or un-block interrupts

Program Stops
143
Sl No Possible Error Conditions
1 Dead crash
2 Syntax error reported at run time
3 Waiting for impossible condition or combinations of conditions
4 Wrong user or process priority

Loops
Sl No Possible Error Conditions
1 Infinite loop
2 Wrong starting value for the loop control variables
3 Accidental change of loop control variables
4 Commands that do or don't belong inside the loop
5 Improper loop nesting

If Then Else, or maybe not


Sl No Possible Error Conditions
1 Wrong inequalities
2 Comparison sometimes yields wrong result
3 Not equal versus equal when there are three cases
4 Testing floating point values for equality
5 Confusion between inclusive and exclusive OR
6 Incorrectly negating a logical expression
7 Assignment equal instead of test equal
8 Commands that belong inside the THEN or ELSE clause
9 Commands that don't belong in either case
10 Failure to test a flag
11 Failure to clear a flag

Multiple Cases
Sl No Possible Error Conditions
1 Missing default
2 Wrong default
3 Missing cases
4 Overlapping cases
5 Invalid or impossible cases
6 Commands that belong inside the THEN or ELSE clause
7 Case should be sub-divided

Errors Handling or Interpreting Data

Problems in passing data between routines


Sl No Possible Error Conditions
144
1 Parameter list variables out of order or missing
2 Data Type errors
3 Aliases and shifting interpretations of the same area of memory
4 Misunderstood data values
5 Inadequate error information
6 Failure to clean up data on exception handling
7 Outdated copies of data
8 Related variables get out of sync
9 Local setting of global data
10 Global use of local variables
11 Wrong mask in bit fields
12 Wrong value from table

Data boundaries
Sl No Possible Error Conditions
1 Un-terminated null strings
2 Early end of string
3 Read/Write past end of data structure or an element in it

Read outside the limits of the message buffer


Sl No Possible Error Conditions
1 Compiler padding to word boundaries
2 Value stack underflow/overflow
3 Trampling another process's code or data

Messaging Problems
Sl No Possible Error Conditions
1 Messages sent to wrong process or port
2 Failure to validate an incoming message
3 Lost or out of synch messages
4 Message sent to only N of N+1 processes

Data Storage corruption


Sl No Possible Error Conditions
1 Overwritten changes
2 Data entry not saved
3 Too much data for receiving process to handle
4 Overwriting a file after an error exit or user abort

Load Conditions

Sl No Possible Error Conditions


1 Required resources are not available
2 No available large memory area
3 Input buffer or queue not deep enough
4 Doesn't clear item from queue, buffer or stack

5 Lost Messages
6 Performance costs
7 Race condition windows expand
8 Doesn't abbreviate under load
9 Doesn't recognize that another process abbreviates output under load
10 Low priority tasks not put off
11 Low priority tasks never done

Doesn't return a resource


Sl No Possible Error Conditions
1 Doesn't indicate that it's done with a device
2 Doesn't erase old files from mass storage
3 Doesn't return unused memory
4 Wastes computer time
Hardware
Sl No Possible Error Conditions
1 Wrong Device
2 Wrong Device Address
3 Device unavailable
4 Device returned to wrong type of pool
5 Device use forbidden to caller
6 Specifies wrong privilege level for the device
7 Noisy Channel
8 Channel goes down
9 Time-out problems
10 Wrong storage device
11 Doesn't check the directory of current disk
12 Doesn't close the file
13 Unexpected end of file
14 Disk sector bug and other length dependent errors
15 Wrong operation or instruction codes
16 Misunderstood status or return code
17 Underutilizing device intelligence
18 Paging mechanism ignored or misunderstood
19 Ignores channel throughput limits
20 Assuming device is or isn't or should be or shouldn't be initialized
21 Assumes programmable function keys are programmed correctly
Source, Version, ID Control
Sl No Possible Error Conditions
1 Old bugs mysteriously reappear
2 Failure to update multiple copies of data or program files
3 No title
4 No version ID
5 Wrong version number on the title screen
6 No copyright message or a bad one
7 Archived source doesn't compile into a match for shipping code
8 Manufactured disks don't work or contain wrong code or data
Testing Errors
Missing bugs in the program
Sl No Possible Error Conditions
1 Failure to notice a problem
2 You don't know what the correct test results are
3 You are bored or inattentive
4 Misreading the Screen
5 Failure to report problem
6 Failure to execute a planned test
7 Failure to use the most promising test case
8 Ignoring programmer's suggestions

Finding bugs that aren't in the program


Sl No Possible Error Conditions
1 Errors in testing programs
2 Corrupted data files
3 Misinterpreted specifications or documentation
Poor reporting
Sl No Possible Error Conditions
1 Illegible reports
2 Failure to make it clear how to reproduce the problem
3 Failure to say you can't reproduce the problem
4 Failure to check your report
5 Failure to report timing dependencies
6 Failure to simplify conditions
7 Concentration on trivia
8 Abusive language
Poor Tracking and follow-up
Sl No Possible Error Conditions
1 Failure to provide summary report
2 Failure to re-report serious bug
3 Failure to check for unresolved problems just before release
4 Failure to verify fixes
47. Designing Unit Test Cases

Executive Summary

Producing a test specification, including the design of test cases, is the level of
test design which has the highest degree of creative input. Furthermore, unit test
specifications will usually be produced by a large number of staff with a wide
range of experience, not just a few experts.

This paper provides a general process for developing unit test specifications and
then describes some specific design techniques for designing unit test cases. It
serves as a tutorial for developers who are new to formal testing of software, and
as a reminder of some finer points for experienced software testers.

A. Introduction

The design of tests is subject to the same basic engineering principles as the
design of software. Good design consists of a number of stages which
progressively elaborate the design. Good test design consists of a number of
stages which progressively elaborate the design of tests:

Test strategy;
Test planning;
Test specification;
Test procedure.

These four stages of test design apply to all levels of testing, from unit testing
through to system testing. This paper concentrates on the specification of unit
tests; i.e. the design of individual unit test cases within unit test specifications. A
more detailed description of the four stages of test design can be found in the IPL
paper "An Introduction to Software Testing".

The design of tests has to be driven by the specification of the software. For unit
testing, tests are designed to verify that an individual unit implements all design
decisions made in the unit's design specification. A thorough unit test
specification should include positive testing, that the unit does what it is
supposed to do, and also negative testing, that the unit does not do anything that
it is not supposed to do.

Producing a test specification, including the design of test cases, is the level of
test design which has the highest degree of creative input. Furthermore, unit test
specifications will usually be produced by a large number of staff with a wide
range of experience, not just a few experts.

This paper provides a general process for developing unit test specifications, and
then describes some specific design techniques for designing unit test cases. It
serves as a tutorial for developers who are new to formal testing of software, and
as a reminder of some finer points for experienced software testers.

B. Developing Unit Test Specifications

Once a unit has been designed, the next development step is to design the unit
tests. An important point here is that it is more rigorous to design the tests before
the code is written. If the code was written first, it would be too tempting to test
the software against what it is observed to do (which is not really testing at all),
rather than against what it is specified to do.

A unit test specification comprises a sequence of unit test cases. Each unit test
case should include four essential elements:

A statement of the initial state of the unit, the starting point of the test case
(this is only applicable where a unit maintains state between calls);
The inputs to the unit, including the value of any external data read by the
unit;
What the test case actually tests, in terms of the functionality of the unit
and the analysis used in the design of the test case (for example, which
decisions within the unit are tested);
The expected outcome of the test case (the expected outcome of a test
case should always be defined in the test specification, prior to test
execution).
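
As a minimal sketch of how these four elements might be recorded (the field
names below are illustrative assumptions, not taken from any standard):

    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class UnitTestCase:
        # The four essential elements of a unit test case.
        initial_state: str      # starting state, where the unit keeps state between calls
        inputs: dict            # parameter values and any external data read by the unit
        what_is_tested: str     # functionality tested and the analysis behind the test case
        expected_outcome: Any   # always defined before the test is executed

    tc1 = UnitTestCase(
        initial_state="not applicable (unit is stateless)",
        inputs={"value": 4},
        what_is_tested="simplest valid input; exercises the normal processing path",
        expected_outcome=2,
    )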

The following subsections of this paper provide a six step general process for
developing a unit test specification as a set of individual unit test cases. For each
step of the process, suitable test case design techniques are suggested. (Note
that these are only suggestions. Individual circumstances may be better served
by other test case design techniques). Section 3 of this paper then describes in
detail a selection of techniques which can be used within this process to help
design test cases.

B.1 Step 1 - Make it Run

The purpose of the first test case in any unit test specification should be to
execute the unit under test in the simplest way possible. When the tests are
actually executed, knowing that at least the first unit test will execute is a good
confidence boost. If it will not execute, then it is preferable to have something as
simple as possible as a starting point for debugging.

Suitable techniques:

- Specification derived tests
- Equivalence partitioning

B.2 Step 2 - Positive Testing

Test cases should be designed to show that the unit under test does what it is
supposed to do. The test designer should walk through the relevant
specifications; each test case should test one or more statements of
specification. Where more than one specification is involved, it is best to make
the sequence of test cases correspond to the sequence of statements in the
primary specification for the unit.

Suitable techniques:

- Specification derived tests
- Equivalence partitioning
- State-transition testing

B.3. Step 3 - Negative Testing

Existing test cases should be enhanced and further test cases should be
designed to show that the software does not do anything that it is not specified to
do. This step depends primarily upon error guessing, relying upon the experience
of the test designer to anticipate problem areas.

Suitable techniques:

- Error guessing
- Boundary value analysis
- Internal boundary value testing
- State-transition testing

B.4. Step 4 - Special Considerations

Where appropriate, test cases should be designed to address issues such as
performance, safety requirements and security requirements. Particularly in the
cases of safety and security, it can be convenient to give test cases special
emphasis to facilitate security analysis or safety analysis and certification. Test
cases already designed which address security issues or safety hazards should
be identified in the unit test specification. Further test cases should then be
added to the unit test specification to ensure that all security issues and safety
hazards applicable to the unit will be fully addressed.

Suitable techniques:

- Specification derived tests

B.5. Step 5 - Coverage Tests

The test coverage likely to be achieved by the designed test cases should be
visualised. Further test cases can then be added to the unit test specification to
achieve specific test coverage objectives. Once coverage tests have been
designed, the test procedure can be developed and the tests executed.

Suitable techniques:

- Branch testing
- Condition testing
- Data definition-use testing
- State-transition testing

B.6. Test Execution

A test specification designed using the above five steps should in most cases
provide a thorough test for a unit. At this point the test specification can be used
to develop an actual test procedure, and the test procedure used to execute the
tests. For users of AdaTEST or Cantata, the test procedure will be an AdaTEST
or Cantata test script.

Execution of the test procedure will identify errors in the unit which can be
corrected and the unit re-tested. Dynamic analysis during execution of the test
procedure will yield a measure of test coverage, indicating whether coverage
objectives have been achieved. There is therefore a further coverage completion
step in the process of designing test specifications.

B.7. Step 6 - Coverage Completion

Depending upon an organization's standards for the specification of a unit, there
may be no structural specification of processing within a unit other than the code
itself. There are also likely to have been human errors made in the development
of a test specification. Consequently, there may be complex decision conditions,
loops and branches within the code for which coverage targets may not have
been met when tests were executed. Where coverage objectives are not
achieved, analysis must be conducted to determine why. Failure to achieve a
coverage objective may be due to:

Infeasible paths or conditions - the corrective action should be to annotate
the test specification to provide a detailed justification of why the path or
condition is not tested. AdaTEST provides some facilities to help exclude
infeasible conditions from Boolean coverage metrics.
Unreachable or redundant code - the corrective action will probably be to
delete the offending code. It is easy to make mistakes in this analysis,
particularly where defensive programming techniques have been used. If
there is any doubt, defensive programming should not be deleted.
Insufficient test cases - test cases should be refined and further test cases
added to a test specification to fill the gaps in test coverage.

Ideally, the coverage completion step should be conducted without looking at
the actual code. However, in practice some sight of the code may be
necessary in order to achieve coverage targets. It is vital that all test
designers should recognize that use of the coverage completion step should
be minimized. The most effective testing will come from analysis and
specification, not from experimentation and over dependence upon the
coverage completion step to cover for sloppy test design.

Suitable techniques:

- Branch testing
- Condition testing
- Data definition-use testing
- State-transition testing

B.8. General Guidance

Note that the first five steps in producing a test specification can be achieved:
Solely from design documentation;
Without looking at the actual code;
Prior to developing the actual test procedure.

It is usually a good idea to avoid long sequences of test cases which depend
upon the outcome of preceding test cases. An error identified by a test case early
in the sequence could cause secondary errors and reduce the amount of real
testing achieved when the tests are executed.

The process of designing test cases, including executing them as "thought
experiments", often identifies bugs before the software has even been built. It is
not uncommon to find more bugs when designing tests than when executing
tests.

Throughout unit test design, the primary input should be the specification
documents for the unit under test. While use of actual code as an input to the test
design process may be necessary in some circumstances, test designers must
take care that they are not testing the code against itself. A test specification
developed from the code will only prove that the code does what the code does,
not that it does what it is supposed to do.

C. Test Case Design Techniques

The preceding section of this paper has provided a "recipe" for developing a unit
test specification as a set of individual test cases. In this section a range of
techniques which can be used to help define test cases are described.

Test case design techniques can be broadly split into two main categories. Black
box techniques use the interface to a unit and a description of functionality, but
do not need to know how the inside of a unit is built. White box techniques make
use of information about how the inside of a unit works. There are also some
other techniques which do not fit into either of the above categories. Error
guessing falls into this category.

The most important ingredients of any test design are experience and common
sense. Test designers should not let any of the given techniques obstruct the
application of experience and common sense.

The selection of test case design techniques described in the following
subsections is by no means exhaustive. Further information on techniques for
test case design can be found in "Software Testing Techniques" 2nd Edition, B
Beizer,Van Nostrand Reinhold, New York 1990.

C.1. Specification Derived Tests

As the name suggests, test cases are designed by walking through the relevant
specifications. Each test case should test one or more statements of
specification. It is often practical to make the sequence of test cases correspond
to the sequence of statements in the specification for the unit under test. For
example, consider the specification for a function to calculate the square root of a
real number, shown in figure 3.1.
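
Figure 3.1 is not reproduced here. From the three specification statements
quoted below, the specified behaviour can be sketched as follows (print_line
stands in for the Print_Line library routine named in the specification):

    import math

    def print_line(text):
        # Stand-in for the Print_Line library routine required by the specification.
        print(text)

    def square_root(x):
        # Statement 1: for input >= 0, return the positive square root.
        if x >= 0:
            return math.sqrt(x)
        # Statements 2 and 3: for input < 0, display the error message via
        # Print_Line and return 0.
        print_line("Square root error - illegal negative input")
        return 0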

There are three statements in this specification, which can be addressed by two
test cases. Note that the use of Print_Line conveys structural information in the
specification.

Test Case 1: Input 4, Return 2

- Exercises the first statement in the specification


("When given an input of 0 or greater, the positive square
root of the input shall be returned.").

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative
input" using Print_Line.

- Exercises the second and third statements in the specification

("When given an input of less than 0, the error message


"Square root error - illegal negative input" shall be displayed
and a value of 0 returned. The library routine Print_Line shall
be used to display the error message.").
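
A minimal sketch of these two test cases as executable checks, assuming the
square_root sketch above and pytest's capsys fixture to capture the output of
print_line:

    def test_case_1():
        # Exercises the first statement of the specification.
        assert square_root(4) == 2

    def test_case_2(capsys):
        # Exercises the second and third statements of the specification.
        assert square_root(-10) == 0
        assert "Square root error - illegal negative input" in capsys.readouterr().out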

Specification derived test cases can provide an excellent correspondence to the
sequence of statements in the specification for the unit under test, enhancing the
readability and maintainability of the test specification. However, specification
derived testing is a positive test case design technique. Consequently,
specification derived test cases have to be supplemented by negative test cases
in order to provide a thorough unit test specification.

A variation of specification derived testing is to apply a similar technique to a
security analysis, safety analysis, software hazard analysis, or other document
which provides supplementary information to the unit's specification.

C.2. Equivalence Partitioning

Equivalence partitioning is a much more formalised method of test case design. It
is based upon splitting the inputs and outputs of the software under test into a
number of partitions, where the behaviour of the software is equivalent for any
value within a particular partition. Data which forms partitions is not just routine
parameters. Partitions can also be present in data accessed by the software, in
time, in input and output sequence, and in state.

Equivalence partitioning assumes that all values within any individual partition
are equivalent for test purposes. Test cases should therefore be designed to test
one value in each partition. Consider again the square root function used in the
previous example. The square root function has two input partitions and two
output partitions, as shown in table 3.2.
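
Table 3.2 is not reproduced here; from the partition labels used in the test
cases below, it can be summarised as:

    Input partitions:   (i) input < 0     (ii) input >= 0
    Output partitions:  (a) return >= 0   (b) error message displayed and 0 returned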

These four partitions can be tested with two test cases:

Test Case 1: Input 4, Return 2

- Exercises the >=0 input partition (ii)
- Exercises the >=0 output partition (a)

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative
input" using Print_Line.

- Exercises the <0 input partition (i)
- Exercises the "error" output partition (b)

For a function like square root, we can see that equivalence partitioning is quite
simple. One test case for a positive number and a real result; and a second test
case for a negative number and an error result. However, as software becomes
more complex, the identification of partitions and the inter-dependencies between
partitions becomes much more difficult, making it less convenient to use this
technique to design test cases. Equivalence partitioning is still basically a positive
test case design technique and needs to be supplemented by negative tests.

C.3. Boundary Value Analysis

Boundary value analysis uses the same analysis of partitions as equivalence
partitioning. However, boundary value analysis assumes that errors are most
likely to exist at the boundaries between partitions. Boundary value analysis
consequently incorporates a degree of negative testing into the test design, by
anticipating that errors will occur at or near the partition boundaries. Test cases
are designed to exercise the software on and at either side of boundary values.
Consider the two input partitions in the square root example, as illustrated by
figure 3.2.

The zero or greater partition has a boundary at 0 and a boundary at the most
positive real number. The less than zero partition shares the boundary at 0 and
has another boundary at the most negative real number. The output has a
boundary at 0, below which it cannot go.

Test Case 1: Input {the most negative real number}, Return 0, Output "Square
root error - illegal negative input" using Print_Line

-Exercises the lower boundary of partition (i).

Test Case 2: Input {just less than 0}, Return 0, Output "Square root error - illegal
negative input" using Print_Line

- Exercises the upper boundary of partition (i).


Test Case 3: Input 0, Return 0

- Exercises just outside the upper boundary of partition (i),
the lower boundary of partition (ii) and the lower boundary
of partition (a).

Test Case 4: Input {just greater than 0}, Return {the positive square root of the
input}

- Exercises just inside the lower boundary of partition (ii).

Test Case 5: Input {the most positive real number}, Return {the positive square
root of the input}

- Exercises the upper boundary of partition (ii) and the upper boundary of
partition (a).
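
A sketch of these five test cases as executable checks, assuming the
square_root sketch from section C.1; sys.float_info.max stands in for the most
positive/negative real numbers and sys.float_info.min for "just greater than 0":

    import math
    import sys

    MOST_POSITIVE = sys.float_info.max
    JUST_ABOVE_ZERO = sys.float_info.min   # smallest positive normal float

    def test_boundaries():
        assert square_root(-MOST_POSITIVE) == 0    # Test Case 1
        assert square_root(-JUST_ABOVE_ZERO) == 0  # Test Case 2
        assert square_root(0) == 0                 # Test Case 3
        assert square_root(JUST_ABOVE_ZERO) == math.sqrt(JUST_ABOVE_ZERO)  # Test Case 4
        assert square_root(MOST_POSITIVE) == math.sqrt(MOST_POSITIVE)      # Test Case 5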

As for equivalence partitioning, it can become impractical to use boundary value
analysis thoroughly for more complex software. Boundary value analysis can
also be meaningless for non scalar data, such as enumeration values. In the
example, partition (b) does not really have boundaries. For purists, boundary
value analysis requires knowledge of the underlying representation of the
numbers. A more pragmatic approach is to use any small values above and
below each boundary and suitably big positive and negative numbers.

C.4. State-Transition Testing

State transition testing is particularly useful where either the software has been
designed as a state machine or the software implements a requirement that has
been modelled as a state machine. Test cases are designed to test the
transitions between states by creating the events which lead to transitions.

When used with illegal combinations of states and events, test cases for negative
testing can be designed using this approach. Testing state machines is
addressed in detail by the IPL paper "Testing State Machines with AdaTEST and
Cantata".
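
As a small illustration, consider an invented two-state unit (the example is
not from the paper): a connection that may be OPEN or CLOSED. Test cases create
the events that drive each legal transition, and negative tests apply events
that are illegal in the current state:

    class Connection:
        def __init__(self):
            self.state = "CLOSED"

        def open(self):
            if self.state != "CLOSED":
                raise RuntimeError("illegal event 'open' in state " + self.state)
            self.state = "OPEN"

        def close(self):
            if self.state != "OPEN":
                raise RuntimeError("illegal event 'close' in state " + self.state)
            self.state = "CLOSED"

    def test_legal_transitions():
        c = Connection()
        c.open()
        assert c.state == "OPEN"
        c.close()
        assert c.state == "CLOSED"

    def test_illegal_event():
        # Negative test: 'close' is not a legal event in the CLOSED state.
        c = Connection()
        try:
            c.close()
            assert False, "an illegal event should have been rejected"
        except RuntimeError:
            pass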
C.5. Branch Testing
In branch testing, test cases are designed to exercise control flow branches or
decision points in a unit. This is usually aimed at achieving a target level of
Decision Coverage. Given a functional specification for a unit, a "black box" form
of branch testing is to "guess" where branches may be coded and to design test
cases to follow the branches. However, branch testing is really a "white box" or
structural test case design technique. Given a structural specification for a unit,
specifying the control flow within the unit, test cases can be designed to exercise
branches. Such a structural unit specification will typically include a flowchart or
PDL.

Returning to the square root example, a test designer could assume that there
would be a branch between the processing of valid and invalid inputs, leading to
the following test cases:

Test Case 1: Input 4, Return 2

- Exercises the valid input processing branch

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative
input" using Print_Line.

- Exercises the invalid input processing branch

However, there could be many different structural implementations of the square
root function. The following structural specifications are all valid implementations
of the square root function, but the above test cases would only achieve decision
coverage of the first and third versions of the specification.
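
The structural specifications of figure 3.3 are not reproduced here; the two
sketches below are invented to illustrate how the same functional specification
can hide different decision structures (print_line as in the C.1 sketch). Test
cases 1 and 2 achieve decision coverage of the first sketch but not the second:

    import math

    def square_root_v1(x):
        # Single valid/invalid decision: inputs 4 and -10 exercise both outcomes.
        if x < 0:
            print_line("Square root error - illegal negative input")
            return 0
        return math.sqrt(x)

    def square_root_v2(x):
        # An extra, purely structural branch (a fast path for exact integer
        # roots): inputs 4 and -10 never exercise the non-integer outcome,
        # so decision coverage is not achieved.
        if x < 0:
            print_line("Square root error - illegal negative input")
            return 0
        root = math.sqrt(x)
        if root == int(root):
            return int(root)
        return root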

It can be seen that branch testing works best with a structural specification for
the unit. A structural unit specification will enable branch test cases to be
designed to achieve decision coverage, but a purely functional unit specification
could lead to coverage gaps.

One thing to beware of is that by concentrating upon branches, a test designer
could lose sight of the overall functionality of a unit. It is important to always
remember that it is the overall functionality of a unit that is important, and that
branch testing is a means to an end, not an end in itself. Another consideration is
that branch testing is based solely on the outcome of decisions. It makes no
allowances for the complexity of the logic which leads to a decision.

C.6. Condition Testing

There are a range of test case design techniques which fall under the general
title of condition testing, all of which try to allay the weaknesses of branch testing
when complex logical conditions are encountered. The object of condition testing
is to design test cases to show that the individual components of logical
conditions and combinations of the individual components are correct.

Test cases are designed to test the individual elements of logical expressions,
both within branch conditions and within other expressions in a unit. As for
branch testing, condition testing could be used as a "black box" technique, where
the test designer makes intelligent guesses about the implementation of a
functional specification for a unit. However, condition testing is more suited to
"white box" test design from a structural specification for a unit.

The test cases should be targeted at achieving a condition coverage metric, such
as Modified Condition Decision Coverage (available as Boolean Operand
Effectiveness in AdaTEST). The IPL paper entitled "Structural Coverage Metrics"
provides more detail of condition coverage metrics.

To illustrate condition testing, consider the example specification for the square
root function which uses successive approximation (figure 3.3(d) - Specification
4). Suppose that the designer for the unit made a decision to limit the algorithm
to a maximum of 10 iterations, on the grounds that after 10 iterations the answer
would be as close as it would ever get. The PDL specification for the unit could
specify an exit condition like that given in figure 3.4.
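
Figure 3.4 is not reproduced here; a sketch of such an exit condition, assuming
the two parts are combined with OR and a Newton-style approximation step:

    def square_root_approx(x, desired_accuracy=1e-10):
        # Assumes x >= 0; input validation is omitted from this sketch.
        estimate = (x + 1) / 2.0          # initial guess, never zero
        iterations = 0
        while True:
            estimate = (estimate + x / estimate) / 2.0   # successive approximation step
            error = abs(estimate * estimate - x)
            iterations += 1
            # Exit condition in the style of figure 3.4.
            if error < desired_accuracy or iterations == 10:
                return estimate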

If the coverage objective is Modified Condition Decision Coverage, test cases
have to prove that both error<desired accuracy and iterations=10 can
independently affect the outcome of the decision.

Test Case 1: 10 iterations, error>desired accuracy for all iterations.

- Both parts of the condition are false for the first 9
iterations. On the tenth iteration, the first part of the
condition is false and the second part becomes true,
showing that the iterations=10 part of the condition can
independently affect its outcome.

Test Case 2: 2 iterations, error>=desired accuracy for the first iteration, and
error<desired accuracy for the second iteration.

- Both parts of the condition are false for the first iteration.
On the second iteration, the first part of the condition
becomes true and the second part remains false, showing
that the error<desired accuracy part of the condition can
independently affect its outcome.

Condition testing works best when a structural specification for the unit is
available. It provides a thorough test of complex conditions, an area of frequent
programming and design error and an area which is not addressed by branch
testing. As for branch testing, it is important for test designers to beware that
concentrating on conditions could distract a test designer from the overall
functionality of a unit.

C.7. Data Definition-Use Testing

Data definition-use testing designs test cases to test pairs of data definitions and
uses. A data definition is anywhere that the value of a data item is set, and a data
use is anywhere that a data item is read or used. The objective is to create test
cases which will drive execution through paths between specific definitions and
uses.
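
As a small illustration (the fragment is invented), each definition-use pair
below is a target for a test case that drives execution along a path from the
definition to the use:

    def clamp(x, limit):
        over = x > limit      # definition of 'over' (and computational uses of x, limit)
        if over:              # predicate use of 'over': it controls the branch
            return limit      # computational use of 'limit'
        return x              # computational use of 'x'

    # Two test cases drive both paths from the definition of 'over' to its use:
    assert clamp(5, 3) == 3   # definition with over == True, then predicate use
    assert clamp(2, 3) == 2   # definition with over == False, then predicate use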

Like decision testing and condition testing, data definition-use testing can be
used in combination with a functional specification for a unit, but is better suited
to use with a structural specification for a unit.

Consider one of the earlier PDL specifications for the square root function which
sent every input to the maths co-processor and used the co-processor status to
determine the validity of the result. (Figure 3.3(c) - Specification 3). The first step
is to list the pairs of definitions and uses. In this specification there are a number
of definition-use pairs, as shown in table 3.3.

These pairs of definitions and uses can then be used to design test cases. Two
test cases are required to test all six of these definition-use pairs:

Test Case 1: Input 4, Return 2

- Tests definition-use pairs 1, 2, 5, 6

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative
input" using Print_Line.

- Tests definition-use pairs 1, 2, 3, 4

The analysis needed to develop test cases using this design technique can also
be useful for identifying problems before the tests are even executed; for
example, identification of situations where data is used without having been
defined. This is the sort of data flow analysis that some static analysis tools can
help with. The analysis of data definition-use pairs can become very complex,
even for relatively simple units. Consider what the definition-use pairs would be
for the successive approximation version of square root!

It is possible to split data definition-use tests into two categories: uses which
affect control flow (predicate uses) and uses which are purely computational.
Refer to "Software Testing Techniques" 2nd Edition, B Beizer,Van Nostrand
Reinhold, New York 1990, for a more detailed description of predicate and
computational uses.

C.8. Internal Boundary Value Testing

In many cases, partitions and their boundaries can be identified from a functional
specification for a unit, as described under equivalence partitioning and boundary
value analysis above. However, a unit may also have internal boundary values
which can only be identified from a structural specification. Consider a fragment
of the successive approximation version of the square root unit specification, as
shown in figure 3.5 (derived from figure 3.3(d) - Specification 4).

The calculated error can be in one of two partitions about the desired accuracy, a
feature of the structural design for the unit which is not apparent from a purely
functional specification. An analysis of internal boundary values yields three
conditions for which test cases need to be designed.
Test Case 1: Error just greater than the desired accuracy
Test Case 2: Error equal to the desired accuracy
Test Case 3: Error just less than the desired accuracy

Internal boundary value testing can help to bring out some elusive bugs. For
example, suppose "<=" had been coded instead of the specified "<".
Nevertheless, internal boundary value testing is a luxury to be applied only as a
final supplement to other test case design techniques.
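
A sketch of the three internal boundary test cases; 'converged' is an invented
helper representing the exit test, and the equality case is the one that would
expose a "<=" coded in place of the specified "<":

    def converged(error, desired_accuracy):
        return error < desired_accuracy   # specified: strictly less than

    def test_internal_boundary():
        acc = 1e-6
        assert converged(acc * 1.01, acc) is False  # Test Case 1: error just greater
        assert converged(acc, acc) is False         # Test Case 2: equal; fails if "<=" coded
        assert converged(acc * 0.99, acc) is True   # Test Case 3: error just less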

C.9. Error Guessing

Error guessing is based mostly upon experience, with some assistance from
other techniques such as boundary value analysis. Based on experience, the test
designer guesses the types of errors that could occur in a particular type of
software and designs test cases to uncover them. For example, if any type of
resource is allocated dynamically, a good place to look for errors is in the
deallocation of resources. Are all resources correctly deallocated, or are some
lost as the software executes?

Error guessing by an experienced engineer is probably the single most effective
method of designing tests which uncover bugs. A well placed error guess can
show a bug which could easily be missed by many of the other test case design
techniques presented in this paper. Conversely, in the wrong hands error
guessing can be a waste of time.

To make the maximum use of available experience and to add some structure to
this test case design technique, it is a good idea to build a check list of types of
errors. This check list can then be used to help "guess" where errors may occur
within a unit. The check list should be maintained with the benefit of experience
gained in earlier unit tests, helping to improve the overall effectiveness of error
guessing.

D. Conclusion

Experience has shown that a conscientious approach to unit testing will detect
many bugs at a stage of the software development where they can be corrected
economically. A rigorous approach to unit testing requires:

That the design of units is documented in a specification before coding
begins;
That unit tests are designed from the specification for the unit, also
preferably before coding begins;
That the expected outcomes of unit test cases are specified in the unit test
specification.

The process for developing unit test specifications presented in this paper is
generic, in that it can be applied to any level of testing. Nevertheless, there will
be circumstances where it has to be tailored to specific situations. Tailoring of the
process and the use of test case design techniques should be documented in the
overall test strategy.

48. LITERATURE REVIEW

2.1 Introduction

The purpose of this dissertation is to increase understanding of how experienced
practitioners as individuals evaluate diagrammatic models in Formal Technical
Review (FTR). In this research, those aspects of FTR relating to evaluation of an
artifact by practitioners as individuals are referred to as Practitioner Evaluation
(PE). The relevant FTR literature is reviewed for theory and research applicable
to PE. However, FTR developed pragmatically without relation to underlying
cognitive theory, and the literature consists primarily of case studies with a very
limited number of controlled experiments.

Other work on the evaluation of diagrams and graphs is also reviewed for
possible theoretical models that could be used in the current research. Human-
Computer Interaction (HCI) is an Information Systems area that has drawn
extensively on cognitive science to develop and evaluate Graphical User
Interfaces (GUIs). A brief overview of cognitive-based approaches utilized in HCI
is presented. One of these approaches, the Human Information Processing
System model, in which the human mind is treated as an information-processing
system, provides the cognitive theoretical model for this research and is
discussed separately because of its importance. Work on attention and the
comprehension of graphics is also briefly reviewed.

Two further areas are identified as necessary for the development of the
research task and tools: (1) types of diagrammatic models and (2) types of
software defects. Relevant work in each of these areas is briefly reviewed and,
since typologies appropriate to this research were not located, appropriate
typologies are developed.

2.2 Formal Technical Review

Software review as a technique to detect software defects is not new -- it has
been used since the earliest days of programming. For example, Babbage and
von Neumann regularly asked colleagues to examine their programs [Freedman
and Weinberg 1990], and in the 1950s and 1960s, large software projects often
included some type of software review [Knight and Myers 1993]. However, the
first significant formalization of software review practice is generally considered
to be the development by Michael Fagan [1976] of a species of FTR that he
called "inspection."

Following Tjahjono [1996, 2], Formal Technical Review may be defined as any
"evaluation technique that involves the bringing together of a group of technical
[and sometimes non-technical] personnel to analyze a software artifact, typically
with the goal of discovering errors or other anomalies." As such, FTR has the
following distinguishing characteristics:

1. Formal process.
2. Use of groups or teams. Most FTR techniques involve real groups, but
nominal groups are used as well.
3. Review by knowledgeable individuals or practitioners.
4. Focus on detection of defects.

2.2.1 Types of Formal Technical Review

While the focus of this research is on the individual evaluation aspects of
reviews, for context several other FTR techniques are discussed as well. Among
the most common forms of FTR are the following:

1. Desk Checking, or reading over a program by hand while sitting at one's desk,
is the oldest software review technique [Adrion et al. 1982]. Strictly speaking,
desk checking is not a form of FTR since it does not involve a formal process or
a group. Moreover, desk checking is generally perceived as ineffective and
unproductive due to (a) its lack of discipline and (b) the general ineffectiveness of
people in detecting their own errors. To correct for the second problem,
programmers often swap programs and check each other's work. Since desk
checking is an individual process not involving group dynamics, research in this
area would be relevant, but none applicable to the current research was found.
It should be noted that Humphrey [1995] has developed a review method, called
Personal Review (PR), which is similar to desk checking. In PR, each
programmer examines his own products to find as many defects as possible
utilizing a disciplined process in conjunction with Humphrey's Personal Software
Process (PSP) to improve his own work. The review strategy includes the use of
checklists to guide the review process, review metrics to improve the process,
and defect causal analysis to prevent the same defects from recurring in the
future. The approach taken in developing the Personal Review process is an
engineering one; no reference is made in Humphrey [1995] to cognitive theory.
2. Peer Rating is a technique in which anonymous programs are evaluated in
terms of their overall quality, maintainability, extensibility, usability and clarity
by selected programmers who have similar backgrounds [Myers 1979].
Shneiderman [1980] suggests that peer ratings of programs are productive,
enjoyable, and non-threatening experiences. The technique is often referred
to as Peer Reviews [Shneiderman 1980], but some authors use the term peer
reviews for generic review methods involving peers [Paulk et al 1993;
Humphrey 1989].

3. Walkthroughs are presentation reviews in which a review participant, usually
the software author, narrates a description of the software and the other
members of the review group provide feedback throughout the presentation
[Freedman and Weinberg 1990; Gilb and Graham 1993]. It should be noted
that the term "walkthrough" has been used in the literature variously. Some
authors unite it with "structured" and treat it as a disciplined, formal review
process [Myers 1979; Yourdon 1989; Adrion et al. 1982]. However, the
literature generally describes walkthrough as an undisciplined process without
advance preparation on the part of reviewers and with the meeting focus on
education of participants [Fagan 1976].

4. Round-robin Review is an evaluation process in which a copy of the review
materials is made available and routed to each participant; the reviewers then
write their comments/questions concerning the materials and pass the
materials with comments to another reviewer and to the moderator or author
eventually [Hart 1982].

5. Inspection was developed by Fagan [1976, 1986] as a well-planned and
well-defined group review process to detect software defects; defect repair
occurs outside the scope of the process. The original Fagan Inspection (FI) is
the most cited review method in the literature and is the source for a variety of
similar inspection techniques [Tjahjono 1996]. Among the FI-derived
techniques are Active Design Review [Parnas and Weiss 1987], Phased
Inspection [Knight and Myers 1993], N-Fold Inspection [Schneider et al.
1992], and FTArm [Tjahjono 1996]. Unlike the review techniques previously
discussed, inspection is often used to control the quality and productivity of
the development process.

A Fagan Inspection consists of six well-defined phases:

i. Planning. Participants are selected and the materials to be reviewed are
prepared and checked for review suitability.
ii. Overview. The author educates the participants about the review
materials through a presentation.
iii. Preparation. The participants learn the materials individually.
iv. Meeting. The reader (a participant other than the author) narrates or
paraphrases the review materials statement by statement, and the other
participants raise issues and questions. Questions continue on a point
only until an error is recognized or the item is deemed correct.
v. Rework. The author fixes the defects identified in the meeting.
vi. Follow-up. The "corrected" products are reinspected.

Practitioner Evaluation is primarily associated with the Preparation phase.

In addition to classification by technique-type, FTR may also be classified on
other dimensions, including the following:

A. Small vs. Large Team Reviews. Siy [1996] classifies reviews into those
conducted by small (1-4 reviewers) [Bisant and Lyle 1996] and large (more
than 4 reviewers) [Fagan 1976, 1986] teams. If each reviewer depends on
different expertise and experiences, a large team should allow a wider
variety of defects to be detected and thus better coverage. However, a
large team requires more effort due to more individuals inspecting the
artifact, generally involves greater scheduling problems [Ballman and
Votta 1994], and may make it more difficult for all participants to
participate fully.

B. No vs. Single vs. Multiple Session Reviews. The traditional Fagan
Inspection provided for one session to inspect the software artifact, with
the possibility of a follow-up session to inspect corrections. However,
variants have been suggested.

Humphrey [1989] comments that three-quarters of the errors found in well-
run inspections are found during preparation. Based on an economic
analysis of a series of inspections at AT&T, Votta [1993] argues that
inspection meetings are generally not economic and should be replaced
with depositions, where the author and (optionally) the moderator meet
separately with inspectors to collect their results.

On the other hand, some authors [Knight and Myers 1993; Schneider et
al. 1992] have argued for multiple sessions, conducted either in series or
parallel. Gilb and Graham [1993] do not use multiple inspection sessions
but add a root cause analysis session immediately after the inspection
meeting.
C. Nonsystematic vs. Systematic Defect-Detection Technique Reviews.
The most frequently used detection methods (ad hoc and checklist) rely on
nonsystematic techniques, and reviewer responsibilities are general and not
differentiated for single session reviews [Siy 1996]. However, some methods
employ more prescriptive techniques, such as questionnaires [Parnas and
Weiss 1987] and correctness proofs [Britcher 1988].
D. Single Site vs. Multiple Site Reviews. The traditional FTR techniques
have assumed that the group-meeting component would occur face-to-face at
a single site. However, with improved telecommunications, and especially
with computer support (see item F below), it has become increasingly feasible
to conduct even the group meeting from multiple sites.
E. Synchronous vs. Asynchronous Reviews. The traditional FTR
techniques have also assumed that the group meeting component would
occur in real-time; i.e., synchronously. However, some newer techniques
that eliminate the group meeting or are based on computer support utilize
asynchronous reviews.
F. Manual vs. Computer-supported Reviews. In recent years, several
computer supported review systems have been developed [Brothers et al.
1990; Johnson and Tjahjono 1993; Gintell et al. 1993; Mashayekhi et al
1994]. The type of support varies from simple augmentation of the manual
practices [Brothers et al. 1990; Gintell et al. 1993] to totally new review
methods [Johnson and Tjahjono 1993].

2.2.2 Economic Analyses of Formal Technical Review

Wheeler et al. [1996], after reviewing a number of studies that support the
economic benefit of FTR, conclude that inspections reduce the number of defects
throughout development, cause defects to be found earlier in the development
process where they are less expensive to correct, and uncover defects that
would be difficult or impossible to discover by testing. They also note "these
benefits are not without their costs, however. Inspections require an investment
of approximately 15 percent of the total development cost early in the process [p.
11]."
In discussing overall economic effects, Wheeler et al. cite Fagan [1986] to the
effect that investment in inspections has been reported to yield a 25-to-35
percent overall increase in productivity. They also reproduce a graphical analysis
from Boehm [1987] that indicates inspections reduce total development cost by
approximately 30%.

The Wheeler et al. [1996] analysis does not specify the relative value of
Practitioner Evaluation to FTR, but two recent economic analyses provide
indications.

Votta [1993]. After analyzing data collected from 13 traditional inspections
conducted at AT&T, Votta reports that the approximately 4% increase in faults
found at collection meetings (synergy) does not economically justify the
development delays caused by the need to schedule meetings and the
additional developer time associated with the actual meetings. He also argues
that it is not cost-effective to use the collection meeting to reduce the number
of items incorrectly identified as defective prior to the meeting ("false
positives"). Based on these findings, he concludes that almost all inspection
meetings requiring all reviewers to be present should be replaced with
Depositions, which are three person meetings with only the author,
moderator, and one reviewer present.

Siy [1996]. In his analysis of the factors driving inspection costs and benefits,
Siy reports that changes in FTR structural elements, such as group size,
number of sessions, and coordination of multiple sessions, were largely
ineffective in improving the effectiveness of inspections. Instead, inputs into
the process (reviewers and code units) accounted for more outcome variation
than structural factors. He concludes by stating "better techniques by which
reviewers detect defects, not better process structures, are the key to
improving inspection effectiveness [Abstract, p. 2]." (emphasis added)

Votta's analysis effectively attributes most of the economic benefit of FTR to PE,
and Siy's explicitly states that better PE techniques "are the key to improving
inspection effectiveness." These findings, if supported by additional research,
would further support the contention that a better understanding of Practitioner
Evaluation is necessary.
2.2.3 Psychological Aspects of FTR
Work on the psychological aspects of FTR can be categorized into four groups.
1. Egoless Programming. Gerald Weinberg [1971] began the examination of
psychological issues associated with software review in his work on egoless
programming. According to Weinberg, programmers are often reluctant to
allow their programs to be read by other programmers because the programs
are often considered to be an extension of the self and errors discovered in
the programs to be a challenge to one's self-image. Two implications of this
theory are as follows:
i. The ability of a programmer to find errors in his own work tends to be
impaired since he tends to justify his own actions, and it is therefore more
effective to have other people check his work.
ii. Each programmer should detach himself from his own work. The work
should be considered a public property where other people can freely
criticize, and thus, improve its quality; otherwise, one tends to become
defensive, and reluctant to expose one's own failures.

These two concepts have led to the justification of FTR groups, as well as the
establishment of independent quality assurance groups that specialize in
finding software defects in many software organizations [Humphrey 1989].

2. Role of Management. Another psychological aspect of FTR that has been
examined is the recording of data and its dissemination to management.
According to Dobbins [1987], this must be done in such a way that individual
programmers will not feel intimidated or threatened.

3. Positive Psychological Impacts. Hart [1982] observes that reviews can
make one more careful in writing programs (e.g., double checking code) in
anticipation of having to present or share the programs with other
participants. Thus, errors are often eliminated even before the actual review
sessions.

4. Group Process. Most FTR methods are implemented using small groups.
Therefore, several key issues from small group theory apply to FTR, such as
group think (tendency to suppress dissent in the interests of group harmony),
group deviants (influence by minority), and domination of the group by a single
member. Other key issues include social facilitation (presence of others boosts
one's performance) and social loafing (one member free rides on the group's
effort) [Myers 1990]. The issue of moderator domination in inspections is also
documented in the literature [Tjahjono 1996].
Perhaps the most interesting research from the perspective of the current
study is that of Sauer et al. [2000]. This research is unusual in that it has an
explicit theoretical basis and outlines a behaviorally motivated program of
research into the effectiveness of software development technical reviews.
The finding that most of the variation in effectiveness of software
development technical reviews is the result of variations in expertise among
the participants provides additional motivation for developing a solid
understanding of Formal Technical Review at the individual level.

It should be noted that all of this work, while based on psychological theory, does
not address the issue of how practitioners actually evaluate software artifacts.
2.3 Approaches to the Evaluation of Diagrammatic Models
The focus of this dissertation is the exploration of how practitioners as individuals
evaluate diagrammatic models for semantic errors that would cause the resulting
system not to meet the functionality, performance, security, usability,
maintainability, testability or other requirements necessary to the purposes of the
system [Bass et al. 1998; Boehm et al. 1978].

2.3.1 General Approaches

Information Systems is an applied discipline that traditionally adapts concepts
and techniques from reference disciplines such as management, psychology,
and engineering to solve information systems problems. In searching for a
theoretical model that could be used in the current research, three separate
approaches were explored.

1. Computer Aided Design (CAD). Since CAD uses diagrams to specify the
design and construction of physical entities [Yoshikawa and Warman 1987], it
seemed reasonable to assume that techniques developed to evaluate CAD
diagrams might be adapted for the evaluation of diagrams used to specify
software systems. However, a review of the literature found relatively little work
on the evaluation of CAD diagrams, and that which was found
pertained to the formal (i.e., "mathematical") evaluation of circuit designs.
Discussion with William Miller of the University of South Florida Engineering
faculty supported this conclusion [Miller 2000], and this approach was
abandoned.

2. Radiological Images. While x-rays are not technically diagrams and do not
specify a system, they are visual artifacts and do convey information. Therefore,
it was reasoned that rules for reading radiological images might provide insights
into the evaluation of software diagrammatic models. Review of the literature
found nothing appropriate. More importantly, as further conceptual work was
done regarding the purposes of evaluating software diagrammatic models, it
became apparent that the reading of x-rays was not an appropriate analog. This
approach was therefore also abandoned.

3. Human-Computer Interaction (HCI). In reviewing the HCI literature, the
following facts were noted:

The language, concepts, and purposes of HCI are very similar to those of
information systems, and it is arguable that HCI is a part of information
systems. (See, for example, the Huber [1983] and Robey [1983] debate
on cognitive style and DSS design.)
HCI is solidly rooted in psychology, a traditional information systems
reference discipline.
Computer user-interfaces almost always have a visual component and are
increasingly diagrammatic in design.
User-interfaces can be and are evaluated in terms of the semantic error
criteria described above; i.e., defects in functionality, performance,
efficiency, etc.

Based on these facts, a decision was made to attempt to identify an HCI
evaluation technique that could be adapted for evaluation of software
diagrammatic models.

2.3.2 Human-Computer Interaction

Human-computer interaction (HCI) has been defined as "the processes,
dialogues . . . and actions that a user employs to interact with a computer
environment [Baecker and Buxton 1987, 40]."

2.3.2.1 HCI Evaluation Techniques

Mack and Nielsen [1994] identify eight usability inspection techniques:

1. Heuristic Evaluation. Heuristic evaluation is an informal method that
involves having usability specialists judge whether each dialogue element
conforms to established usability principles or heuristics. Nielsen, the author
of the technique, recommends that evaluators go through the interface twice
and notes that "[t]his two-pass approach is similar in nature to the phased
inspection method for code inspection (Knight and Myers 1993) [Nielsen
1994, 29]."

2. Guideline Reviews. Guideline reviews are inspections where an interface is
checked for conformance with a comprehensive list of guidelines. Nielsen and
Mack note that "since guideline documents contain on the order of 1,000
guidelines, guideline reviews require a high degree of expertise and are fairly
rare in practice [Nielsen and Mack 1994, 5]."

3. Pluralistic Walkthroughs. A pluralistic walkthrough is a meeting in which
users, developers, and human factors experts step through a scenario,
discussing usability issues associated with dialogue elements involved in the
scenario steps.

4. Consistency Inspections. Consistency inspections have designers
representing multiple projects inspect an interface to see whether it is
consistent with other interfaces in the "family" of products.

5. Standards Inspections. In a standards inspection, an expert on some
interface standard checks the interface for compliance with that standard.

6. Cognitive Walkthroughs. Cognitive walkthroughs use an explicitly detailed
procedure to simulate a user's problem-solving process at each step in the
human-computer dialog, checking to see if the simulated user's goals and
memory for actions can be assumed to lead to the next correct action.

7. Formal Usability Inspections. Formal usability inspections are designed to
be very similar to the Fagan Inspection used in code reviews.

8. Feature Inspections. In feature inspections the focus is on the functionality
provided by the software system being inspected; i.e., whether the function as
designed meets the needs of the intended end users.

These HCI evaluation techniques are clearly similar to FTR in that they involve
the use of knowledgeable individuals to detect defects in a software artifact; most
also involve a formal process and a group.

2.3.2.2 Cognitive Psychology and HCI


To assist in the design of better dialogues, HCI researchers have attempted to
apply the findings of cognitive psychology since, all other factors being equal, an
interface that requires less short-term memory resources or can be manipulated
more quickly because fewer cognitive steps are required should be superior. The
following is a brief overview of cognitive-based approaches utilized in HCI.

Human Information Processing System (HIPS). During the 1960s and
1970s, the main paradigm in cognitive psychology was to characterize
humans as information processors that processed information much like a
computer. While some of the assumptions of the original model proved to be
overly restrictive and other approaches have become popular, updated HIPS
models continue to be useful for HCI research. Given the importance of this
model for this research, a more complete treatment is provided in Section
2.4.1 below.

Computational approaches also adopt the computer metaphor as a
theoretical framework but conceptualize the cognitive system in terms of the
goals, planning, and action involved in task performance. Tasks are analyzed
not in terms of the amount of information processed in the various stages but
in terms of how the system deals with new information [Preece et al. 1994].

Connectionist approaches simulate behavior through neural network or
Parallel Distributed Processing (PDP) models in which cognition is
represented as a web of interconnected nodes. Connectionist models have
become increasingly accepted in cognitive psychology [Ashcraft 1994], and
this fact has been reflected in HCI research [Preece et al. 1994].

Human Factors/Actors. Bannon [1991, 28] argues that the term human
factors should be replaced with the term human actors to indicate "emphasis
is placed on the person as an autonomous agent that has the capacity to
regulate and coordinate his or her behavior, rather than being a simple
passive element in a human-machine system." The change is supposed to
facilitate focusing on the way people act in real work settings instead of
viewing them as information processors.

Distributed Cognition. An emerging theoretical framework is distributed
cognition. The goal of distributed cognition is to conceptualize cognitive
activities as embodied and situated within the work context in which they
occur [Hutchins 1990; Hutchins and Klausen 1992].

The human factors/actors and distributed cognition models are not
appropriate to the current study. The connectionist models show great promise
but are not yet sufficiently developed to be useful for this research. The
information processor models are however appropriate and sufficiently mature;
they provide the primary cognitive theoretical base for the dissertation.
Computational approaches are also utilized in that the study analyzes the
cognitive system in terms of the task planning involved in task performance.

2.4 Human Information Processing System (HIPS) Models and Related Topics

2.4.1 General Model

One of the major paradigms in cognitive science is the Human Information
Processing System model. In this model, humans are characterized as
information processors, in which information enters the mind, is processed in a
series of ordered stages, and then exits [Preece et al. 1994]. Figure 2.1
summarizes one version of the basic model [Barber 1988].

Figure 2.1 Human Information Processing Stages (adapted from Barber [1988])
An early attempt to apply the model was Card et al.'s The Psychology of Human-
Computer Interaction [1983]. In that work, the authors stated that the human
mind is also an information-processing system and developed a simplified model
of it that they called the Model Human Processor. Based on this model, they
made predictions about the usability of various user interfaces, performed
experiments, and reported their findings. The results were equivocal, and
subsequent cognitive psychology research has shown that the serial stage
approach to cognition of the original model is overly simplistic.
The original model also did not include memory and attention. Later versions do
include these processes, and Cowan [1995], in his exhaustive examination of the
intersection of memory and attention, discusses a number of these. Figure 2.2
summarizes a model that does include memory and attention [Barber 1988].

Figure 2.2 Extended Stages of the Information Processing Model (adapted
from Barber [1988])
HIPS models, such as Anderson's ACT-R [1993], continue to be developed and
are useful. Further, the information processing approach has recently been
described as the primary metatheory of cognitive psychology [Ashcraft 1994].
2.4.2 Coping with Attention as a Limited Resource
One of the earliest psychological definitions of attention is that of William James
[1890, vol. 1, 403-404]:
Everyone knows what attention is. It is the taking possession of the
mind, in clear and vivid form, of one out of what seem several
simultaneously possible objects or trains of thought. Focalization,
concentration of consciousness are of its essence. It implies withdrawal
from some things in order to deal more effectively with others . . .
(emphasis added)

This appeal to intuition explicitly states that attention is a limited resource.


In reaction to the introspection methodology of James, the Behaviorist movement
asserted that the study of internal representations and processes was
unscientific. Since behaviorists dominated American psychological thought during
the first half of the Twentieth Century, little or no work was done on attention in
America during this period. In Europe, Gestalt psychology became dominant at
this time and that school, while not actively hostile to attention studies, did not
encourage work in the area. World War II however led to a rethinking of
psychological approaches and acceptance of using the experimental techniques
developed by the behaviorists to study internal states and processes [Cowan
1995].

An example of this rethinking is the work of Broadbent [1952] and Cherry [1953].
They used a technique to study attention in which different spoken messages are
presented to a subject's two ears at the same time. Their research shows that
subjects are able to attend to one message if the messages are distinguished by
physical (rather than merely semantic) cues, but recall almost nothing of the
nonattended channel. In 1956, Miller reviewed a series of experiments that
utilized a different methodology and noted that, across many domains, subjects
could keep in mind no more than about seven "chunks" simultaneously. These
findings were among the first experimental evidence that attentional capacity is a
limited resource.
More recent experimental work continues to indicate that attention is a
limited resource [Cowan 1995]. Even those cognitive psychologists who
have recently challenged the very concept of attention assume their
"attention" analog is limited. One example of this would be Allport [1980] and
Wickens [1984], who argue that the concept of attention should be replaced
with the concept of multiple limited processing resources.
Based on an examination of the exhaustive review by Cowan [1995] of the
intersection of memory and attention, the Shiffrin [1988, 739] definition appears
to be representative of contemporary thought:

Attention has been used to refer to all those aspects of human
cognition that the subject can control . . . and to all aspects of cognition
having to do with limited resources or capacity, and methods of dealing
with such constraints. (emphasis added)

Since human cognitive resources are limited, cognitively complex tasks may
overload these resources and decrease the quality and/or quantity of outputs.
Various approaches to measuring the cognitive complexity of tasks have been
developed. In HCI, an informal view of complexity is often utilized. For example,
Grant [1990, sec. 1.3] defines a complex task as one for which there are a large
number of potential practical strategies. This definition is not inconsistent with
the measure assumed by Simon [1962] in his paper on the use of hierarchical
decomposition to decrease the complexity of problem solving.

Simon [1990] argues that humans develop mechanisms to enable them to deal
with complex, real-life situations despite their limited cognitive resources. One
such mechanism is task planning. According to Fredericksen and Breuleaux
[1990], task planning is a cognitive bargain in which the time and effort spent
working with an abstract, and therefore, smaller problem space during planning
minimizes actual work on the task in the original, detailed problem space.

Earley and Perry [1987, 279] define a task plan as "a cognitively based routine
for attaining a particular objective and consists of multiple steps." Newell and
Simon [1972] identify planning from verbal protocols as those passages in which:

1. a person is considering abstract specifications of the action/information
transformations required to achieve goals;
2. a person considers sequences of two or more such actions or
transformations; and
3. after developing the sequences, some or all of them are actually performed.

Two further items should be noted regarding planning:

1. Not all planning is original. Successful plans learned from others or by
experience may be stored in memory or externally [Newell and Simon 1972;
Wood and Locke 1990]. Without the recall, modification, and use of previous
plans, the development of expertise would be impossible.

2. Planning is not complete before action. Both theory and analysis of verbal
protocols indicate that periods of planning are interleaved with action
[McDermott 1978; Newell and Simon 1972]. In other words, practitioners will
often plan a response to part of a task, complete some or all of the actions
specified in that plan, plan a new response incorporating information acquired
during prior action period(s), complete the new actions, etc.

2.4.3 Application of the HIPS Model to This Research

In the HIPS model, the nature and amount of stimuli impact both information
processing and output. This research uses a key concept of the HIPS model,
attention, in two ways:

1. Attention is a critical and limited resource, and when attention is overloaded,
outputs decrease in quality and quantity; therefore, a meta-cognitive strategy
such as task planning that minimizes attentional load should improve outputs.

2. Patterns are another meta-cognitive strategy for minimizing attentional load;
therefore, understanding which patterns better support the cognitive
processing associated with evaluation of diagrammatic models may allow
individuals to be trained to use these better patterns, thus lessening their
attentional load and improving their outputs.

2.5 Research on the Comprehension of Graphics

Larkin and Simon [1987] consider why diagrams can be superior to a verbal
description for solving problems, and suggest the following reasons:

Diagrams can group together all information that is used together, thus
avoiding large amounts of search for the elements needed to make a
problem-solving inference.
Diagrams typically use location to group information about a single element,
avoiding the need to match symbolic labels.
Diagrams automatically support a large number of perceptual inferences,
which are extremely easy for humans.
As noted in Chapter 1, two of these depend on spatial patterns.
Winn [1994] presents an overview of how the symbol system of graphics
interacts with the viewers' perceptual and cognitive processes, which is
summarized in figure 2.3. In his description, the graphical symbol system
consists of two elements: (1) symbols that bear an unambiguous one-to-one
relationship to objects in the domain of reference, and (2) the spatial relations of
the symbols to each other. Thus, how symbols are configured spatially will affect
the way viewers understand how the associated objects are related and interact.
For the purposes of this dissertation, a particularly interesting finding is that
biases based on reading direction (left-to-right for English) affect the
interpretation of graphics.

Figure 2.3 Winn [1994] Processes Involved in the Perception and Comprehension of Graphics

Zhang [1997] proposes a theoretical framework for external representation-based
problem solving. In an experiment she conducted using a Tic-Tac-Toe board and
its logical isomorphs, the results show that Tic-Tac-Toe behavior is determined by
the configuration of the board. External representations are thus shown to be
more than just memory aids, and a representational determinism is suggested.
This last point is particularly relevant to this dissertation since it states that the
form of representation determines what information can be perceived in a
diagram.

2.6 Types of Diagrammatic Models

Selection of diagrammatic models to be included in the research task requires an
appropriate typology. Two diagrammatic model typologies were examined,
Wieringa [1998] and Visible Systems [1999].

2.6.1 Wieringa 1998

Wieringa, in his discussion of graphical structures or models that may be used in
software specification techniques, lists four general classes:

1. Decomposition Specification Techniques. These represent the conceptual
structure of data in a database system. Examples include Entity-Relationship
Diagrams (ERDs) and such ERD extensions as OO class diagrams.

2. Communication Specification Techniques. These show how the
conceptual components interact to realize external system interactions.
Examples include Dataflow Diagrams (DFDs), Context Diagrams, SADT
Activity Diagrams, Object Communication Diagrams, SDL Block Diagrams,
Sequence Diagrams, and Collaboration Diagrams.

3. Function Specification Techniques. These specify the external functions of
a system or the functions of system components. Examples include Function
Refinement Trees, Event-Response Specifications, and Use Case Diagrams.

4. Behavior Specification Techniques. These show how functions of a system
or its components are ordered in time. Examples include Process Graphs,
JSD Process Structure Diagrams, Finite (and Extended Finite) State
Diagrams, Mealy Machines, Moore Machines, Statecharts, and Process
Dependency Diagrams.

2.6.2 Visible Systems

The methods listing in Visible Systems [1999] was examined as a representative
of practitioner-oriented, CASE-tools-based typologies. Seven models are listed;
of these, six are diagrammatic in nature.

1. Functional Decomposition Model. Shows the business functions and the
processes they support drawn in a hierarchical structure; also known as the
Business Model. This type of model is of a high-level functional nature and
specifically applies to functions and not to the data that those functions use. It
is generally appropriate for defining the overall functioning of an enterprise,
not for individual projects.

2. Data Model. Shows the data entities of an application and the relationships
between the entities. Entities and relationships can be selected in subsets to
produce views of the data model. The diagramming technique normally used
to depict graphically the data model is the Entity Relationship Diagram (ERD)
and the model is sometimes referred to as the Entity-Relationship Model.

3. Process Model. Shows how things occur in the organization via a sequence
of processes, actions, stores, inputs and outputs. Processes are decomposed
into more detail, producing a layered hierarchical structure. The diagramming
technique used for process modeling in structured analysis is the Data Flow
Diagram (DFD). Several notations are available for representing process
modeling, with the most widely used being Yourdon/DeMarco and Gane &
Sarson.

4. Product Model. Shows a hierarchical, top-down design map of how the
application is to be programmed, built, integrated, and tested. The modeling
technique used in structured design is the structure chart. It is a tree or
hierarchical diagram that defines the overall architecture of a program or
system by showing the program modules and their interrelationships.

5. State Transition Model (Real Time Model). Shows how objects transition to
and from various states or conditions and the events or triggers that cause
them to change between the different states.

6. Object Class Model. Shows classes of objects, subclasses, aggregations
and inheritance and defines structures and packaging of data for an
application.

2.6.3 Evaluation of Typologies in Prior Work

In evaluating these two typologies for this research, two problems were noted:

1. Neither classification scheme includes diagrammatic representations of
Graphical User Interfaces (GUIs). While such representations are not technically
graphs (and thus not discussed by Wieringa) and are not listed in Visible
Systems, they may be used to specify parts of a system and are therefore
appropriate to this research.
2. Wieringa's work is based on the theoretical characteristics of graphs while
Visible Analyst is representative of practitioner-oriented, CASE-tool-based
typologies. Neither is appropriate to the research of this dissertation since
neither captures factors likely to affect the cognitive processing of
practitioners in evaluating software diagrammatic models.

While it would be relatively easy to add diagrammatic representations of GUIs to
Wieringa or Visible Analyst, it was concluded that the second problem
disqualified them for the purposes of this research. Further review of several
leading systems analysis and design texts [Fertuck 1995; Hoffer et al. 1998;
Kendall and Kendall 1995] did not yield an appropriate typology of diagrammatic
models, and it was therefore deemed necessary to develop one specifically for
this dissertation.

2.6.4 Diagrammatic Model Typology Development

The first step in the development process was to consult several systems
analysis and design and structured techniques texts for classification insights and
to derive lists of commonly used diagrammatic models. These included Fertuck
[1995], Hoffer et al. [1998], Kendall and Kendall [1995], and Martin and McClure
[1985].
Martin and McClure make a major distinction between hierarchical diagrams (i.e.,
those having one overall node or root and which do not remerge) and mesh or
network diagrams (i.e., those not having a single overall node or root or which do
remerge). For the purposes of this research, this distinction is operationalized as
the categorical variable hierarchical/not hierarchical.
Martin and McClure also make a major distinction between diagrams showing
sequence and those that do not. Sequence usually implies temporal
directionality; for this dissertation, the distinction is broadened to include the
possibility of logical and other forms of directionality and is operationalized as the
categorical variable directional/not directional.

A distinction found in all texts referenced is between data-oriented and
process-oriented diagrams. Inspection of diagram types shows that the distinction is
actually a data/process orientation continuum. For the purposes of this
dissertation, this continuum is collapsed into the categorical variable
data/hybrid/process oriented.
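To make the scheme concrete, the sketch below (illustrative only; the type, function, and variable names are mine, with the example assignments taken from tables 2.1 and 2.4 below) expresses the three categorical variables in code:

```python
# Illustrative sketch of the three-variable diagram typology.
from dataclasses import dataclass

@dataclass(frozen=True)
class DiagramType:
    name: str
    hierarchical: bool   # single root, no remerging [Martin and McClure 1985]
    directional: bool    # temporal, logical, or other directionality
    orientation: str     # "data", "hybrid", or "process"

def category(d: DiagramType) -> str:
    """Map a diagram type onto the 2 x 2 x 3 category scheme."""
    h = "Hierarchical" if d.hierarchical else "Not Hierarchical"
    s = "Directional" if d.directional else "Not Directional"
    return f"{h} / {s} / {d.orientation.capitalize()}"

# Example assignments, per tables 2.1 and 2.4:
examples = [
    DiagramType("Warnier-Orr (Data)", True, True, "data"),             # I
    DiagramType("Structure Chart", True, True, "hybrid"),              # II
    DiagramType("Data Flow Diagram", False, True, "hybrid"),           # VIII
    DiagramType("Entity-Relationship Diagram", False, False, "data"),  # X
    DiagramType("Typical GUI", False, False, "hybrid"),                # XI
]

for d in examples:
    print(f"{d.name}: {category(d)}")
```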

As a test of the feasibility of the classification scheme, twenty diagram types from
Martin and McClure, UML diagrams from Harmon and Watson [1998], and a
model of a "typical" GUI were then categorized. The results of this categorization
are shown in table 2.1.

Table 2.1 Diagrammatic Model Types

The twelve (2 x 2 x 3) categories are numbered I-XII across the dimensions
hierarchical/not hierarchical, directional/not directional, and data/hybrid/process
oriented:

I (Hierarchical, Directional, Data): Warnier-Orr (Data); Michael Jackson Data-Structure
II (Hierarchical, Directional, Hybrid): Functional Decomposition II; Structure Charts; HIPO (Overview); HIPO (Detail); Michael Jackson System Network; Action II
III (Hierarchical, Directional, Process): Functional Decomposition I; HIPO (VTC); Warnier-Orr (Process); Michael Jackson Program-Structure; Nassi-Shneiderman Charts; Action I
IV (Hierarchical, Not Directional, Data): (none)
V (Hierarchical, Not Directional, Hybrid): (none)
VI (Hierarchical, Not Directional, Process): (none)
VII (Not Hierarchical, Directional, Data): (none)
VIII (Not Hierarchical, Directional, Hybrid): Data Flow; Data Navigation; UML Sequence; UML Collaboration
IX (Not Hierarchical, Directional, Process): Flow Charts; UML State; UML Activity
X (Not Hierarchical, Not Directional, Data): Data Analysis; Entity-Relationship; Inverted-L
XI (Not Hierarchical, Not Directional, Hybrid): Typical GUI; UML Use Case; UML Class
XII (Not Hierarchical, Not Directional, Process): (none)

Inspection of table 2.1 shows that only seven of the twelve (2 x 2 x 3) possible
categories are actually populated. Table 2.2 shows the categorization of the
diagram types after collapsing unpopulated categories.

Table 2.2 Diagrammatic Model Types (Collapsed)

I (Hierarchical, Directional, Data): Warnier-Orr (Data); Michael Jackson Data-Structure
II (Hierarchical, Directional, Hybrid): Functional Decomposition II; Structure Charts; HIPO (Overview); HIPO (Detail); Michael Jackson System Network; Action II
III (Hierarchical, Directional, Process): Functional Decomposition I; HIPO (VTC); Warnier-Orr (Process); Michael Jackson Program-Structure; Nassi-Shneiderman Charts; Action I
VIII (Not Hierarchical, Directional, Hybrid): Data Flow; Data Navigation; UML Sequence; UML Collaboration
IX (Not Hierarchical, Directional, Process): Flow Charts; UML State; UML Activity
X (Not Hierarchical, Not Directional, Data): Data Analysis; Entity-Relationship; Inverted-L
XI (Not Hierarchical, Not Directional, Hybrid): Typical GUI; UML Use Case; UML Class

2.7 Types of Software Defects

A semantic software defect (the focus of this research) is defined as a
non-syntactic defect that causes a software artifact or resulting system not to have
the functionality, performance, security, usability, maintainability, testability or
other qualities necessary for the purposes of the system. In other words,
software defects are defined in terms of missing qualities. Other research
reviewed is not inconsistent with this approach. For example, Boehm et al. [1978]
and Bass et al. [1998] develop typologies of software qualities, and the definition
in Grady [1992, 122] of a defect as "any flaw in the specification, design, or
implementation of a product" inherently includes software qualities. Therefore,
the primary focus of the first section below is on typologies of software qualities.
The second section reviews other software defect typologies, and the third
section discusses the development of the typology used in this research.

2.7.1 Software Quality Typologies


An interesting early software qualities typology is the Software Quality
Characteristics Tree (SQCT) of Boehm et al. [1978]. The SQCT is a
hierarchical scheme in which the highest-level construct, General Utility, is
determined by two second-level constructs, As-Is Utility and Maintainability,
and one third-level construct, Portability. The second-level constructs are
each in turn determined by three further third-level constructs: As-Is Utility
by Reliability, Efficiency, and Human Engineering; Maintainability by
Testability, Understandability, and Modifiability. The third-level constructs
are determined by
various combinations of twelve primitive characteristics (Device
Independence, Completeness, Accuracy, Consistency, Device Efficiency,
Accessibility, Communicativeness, Structuredness, Self-Descriptiveness,
Conciseness, Legibility, and Augmentability), which are strongly
differentiated with respect to each other.

The Software Quality Characteristics Tree is shown in figure 2.4.

Figure 2.4 Boehm et al. [1978] Software Quality Characteristics Tree
(adapted)

The Grady [1992] software defect model is shown below in figure 2.5. It is also a
hierarchical model (with the root at the bottom) that classifies defects according
to origin, type, and mode. Grady describes six types of software defects that
correspond to the five modes plus a residual "Other" category:

1. Specifications/Requirements Defect. A mistake in the definition of the
customer/target needs for a system or system component. Such mistakes can
be in functional requirements, performance requirements, test requirements,
development standards, and so on.

2. Design Defect. A mistake in the design of a system or system component.
Such mistakes can be in algorithms, control logic, data structures, database
access, input/output formats, interface descriptions, and so on.

3. Code Defect. A mistake in the implementation of a computer program. Such
mistakes can be in product or test code, JCL, build files, and so on.

4. Documentation Defect. A mistake in any non-code product material
delivered to a customer. Such mistakes can be in user manuals, installation
instructions, data sheets, product demos, and so on. Mistakes in
requirements specification documents, design documents, or code listings are
assumed to be specification defects, design defects, and coding defects,
respectively.

5. Environmental Support Defect. Defects that arise as a result of the system
development and/or testing environment. Such mistakes can be in the
build/configuration process, the development/integration tools, the testing
environment, and so on.

6. Other.

Figure 2.5 Grady [1992] Software Defect Model

Bass et al. [1998] discuss ten technical qualities of software, dividing them into
those that are discernible at runtime (DR) and those not discernible at runtime
(NDR). The following is a brief discussion of the software qualities in their
typology:

1. Functionality (DR) is the ability of the system to do the work for which it was
intended; it is the basic statement of the system's capabilities, services, and
behavior.

2. Performance (DR) refers to the responsiveness of the system: the time
required to respond to stimuli (events) or the number of events processed in
some interval of time.

Bass et al. [1998, 79] note that "For most of the history of software
engineering, performance has been the driving factor in software architecture,
and this has frequently compromised the achievement of other qualities."

It should be noted that performance is relative to system requirements and
that what would otherwise be a "defect" may be the result of increasing some
other quality.

3. Security (DR) is a measure of the system's ability to resist unauthorized
attempts at usage and denial of service while still providing its services to
legitimate users.

4. Availability (DR) measures the proportion of time the system is up and
running and is typically defined as

Availability = MTF / (MTF + MTR),

where MTF = mean time to failure and MTR = mean time to repair.
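As a brief worked illustration (the figures and the helper function below are hypothetical, not taken from Bass et al. [1998]):

```python
# Illustrative sketch: availability from mean time to failure (MTF)
# and mean time to repair (MTR); the figures below are hypothetical.
def availability(mtf_hours: float, mtr_hours: float) -> float:
    """Proportion of time the system is up: MTF / (MTF + MTR)."""
    return mtf_hours / (mtf_hours + mtr_hours)

print(availability(998.0, 2.0))  # -> 0.998, i.e., 99.8% uptime
```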

5. Usability (DR) is largely a function of the user interface.

6. Maintainability (NDR). Bass et al. [1998] use the terms modifiability and
maintainability interchangeably and define modifiability as the ability of a
system to be changed quickly and cost-effectively. According to them,
modifications to a system can be broadly categorized as follows:

Extending or changing capabilities. This category includes corrective
maintenance and extensibility.
Deleting unwanted capabilities.
Adapting to new operating environments.
Restructuring.

7. Portability (NDR) is the ability of a system to run under different computing
environments.

8. Reusability (NDR) relates to the design of a system so that the system's
structure or some of its components can be reused in future applications.
Bass et al. [1998, 84] note that "Reusability is actually a special case of
modifiability..."

9. Integrability (NDR) is the ability to make the separately developed
components of the system work correctly together.

10. Software testability (NDR) refers to the ease with which software can be
made to demonstrate its faults through (typically execution-based) testing.

This research uses Bass et al. [1998] as the basis for the qualities dimension of
the software defects typology.
2.7.2 Other Defect Dimensions

Review of the literature yields three other dimensions for the classification of
software defects.

2.7.2.1 Class
Class refers to whether the defect is the result of logic or other required
structure's being missing (M), incorrect (I), or extra (E) [Ebenau and Strauss
1994].

While extra functionality may increase storage requirements or otherwise
decrease efficiency, the impact on functionality is generally less severe than that
caused by the other two types.

2.7.2.2 Severity

The defect severity categories generally listed are major (J), minor (N), and
(sometimes) trivial (T) [Ebenau and Strauss 1994; Gilb and Graham 1993; Kelly
et al. 1992].

A major defect is defined as one "that is expected to cause product failure,
departure from specifications, or prevent further correct development of the
product [Ebenau and Strauss 1994, 92]." A minor defect is defined as one "that
reduces the effectiveness, or confuses a product's representation, format, or
development process characteristics, but is not expected to impact the operation
or further development of the product [p. 92]."

2.7.2.3 Cause

Humphrey [1995], following Gale [1990], lists five categories of basic defect
causes:

1. Education. You did not understand how to do something.
2. Communication. You were not properly informed about something.
3. Oversight. You omitted doing something.
4. Transcription. You knew what to do but made a mistake in doing it.
5. Process. Your process somehow misdirected your actions.

2.7.3 Development of the Defect Typology

The four dimensions discussed above produce a four-dimensional defect space.
However, examination shows that dimensional simplification is appropriate.

1. Defect cause cannot be determined directly from examination of software
diagrammatic models.

2. Defect severity is defined in terms of impact on system functionality. Given
that functionality is a type of technical quality, a separate dimension would be
redundant.

Further simplification is achieved by ignoring extra functionality defects of the
class dimension. The rationale for this reduction is that, while defects associated
with extra functionality may increase storage requirements or otherwise decrease
efficiency, the impact on functionality is generally less severe than that caused by
missing and incorrect defects.

Change is also necessary on the qualities dimension. Six of the Bass et al.
[1998] qualities are not readily discernible from diagrammatic models and are
consequently not appropriate to the typology. However, according to Boehm et al.
[1978], the primitive quality Structuredness partially determines three of the six.
Similarly, Fenton and Neil [2001] list Structuredness as an internal attribute
associated with the external attributes reliability (or availability), maintainability,
and reusability. The six non-discernible qualities are listed below; a B indicates a
Boehm quality, and an F indicates a Fenton attribute.

Availability (F)
Maintainability (B, F)
Portability
Reusability (B, F)
Integrability
Testability (B)

Since Structuredness is associated with four of the six non-discernible qualities
and is readily discernible from a diagrammatic model, it is substituted as a
partial proxy.

During the early development of the research task, several subjects noted that
the scope of the diagrammatic models was not consistent. From a theoretical
perspective, lack of Scope Consistency is an instance of a general consistency
problem. In the structured approach to IS development, data and process models
are supposed to model the same system but are fundamentally separate. This
separateness leads to multiple problems, including lack of consistency [Repa
2001]. Consideration was given to adding the broader quality consistency to the
typology, but this was rejected because (1) some subjects perceived lack of
Scope Consistency to be a separate issue and (2) lack of Scope Consistency is
different in that it can generally be readily discerned by comparing data and
process models, while other consistency problems are apparent only after
significant functional analysis. Lack of Scope Consistency would be expected to
impact negatively on the integrability and maintainability of the specified system.

The resulting matrix is a two-dimensional defect space based on quality affected
and class. It should be noted that Scope Consistency and Structuredness are
treated as logical variables; the quality is either present or missing. Table 2.3
shows the resulting matrix.

Table 2.3 Software Defect Matrix: Qualities vs. Class

                        CLASS
QUALITY                 Missing     Incorrect
Scope Consistency       x           (n/a)
Structuredness          x           (n/a)
Functionality           x           x
Performance             x           x
Usability               x           x
Security                x           x

(Scope Consistency and Structuredness are logical variables, so only the
Missing class applies to them.)
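As an illustrative sketch (the names below are mine, not part of the research instruments), the defect space of table 2.3 can be enumerated in code; the enumeration also shows why table 2.4 has ten quality/class columns:

```python
# Illustrative sketch of the two-dimensional defect space in Table 2.3.
# Scope Consistency and Structuredness are logical qualities (present or
# missing), so the Incorrect class does not apply to them.
QUALITIES = ["Scope Consistency", "Structuredness", "Functionality",
             "Performance", "Usability", "Security"]
LOGICAL = {"Scope Consistency", "Structuredness"}
CLASSES = ["Missing", "Incorrect"]

defect_space = [(q, c) for q in QUALITIES for c in CLASSES
                if not (q in LOGICAL and c == "Incorrect")]

print(len(defect_space))  # 2x1 + 4x2 = 10 cells, matching Table 2.4
for quality, defect_class in defect_space:
    print(f"{quality}: {defect_class}")
```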

2.7.4 Diagrammatic Model Type vs. Software Defect Type Matrix

Table 2.4 shows the matrix resulting from combining the Diagrammatic Model
Type and Software Defect Type typologies.

Table 2.4 Diagrammatic Model Type vs. Software Defect Type

Columns (QUALITY x CLASS; M = missing, I = incorrect): Scope Consistency (M);
Structuredness (M); Functionality (M, I); Performance (M, I); Usability (M, I);
Security (M, I).

Rows (MODEL, with the typical diagram for each category):
Hierarchical-Directional-Data (I): Warnier-Orr (Data) Diagram
Hierarchical-Directional-Hybrid (II): Structure Chart
Hierarchical-Directional-Process (III): Warnier-Orr (Process) Diagram
Not Hierarchical-Directional-Hybrid (VIII): Data Flow Diagram
Not Hierarchical-Directional-Process (IX): Flow Chart
Not Hierarchical-Not Directional-Data (X): Entity-Relationship Diagram
Not Hierarchical-Not Directional-Hybrid (XI): Typical GUI

2.8 Summary and Conclusions

Prior theory and research that might inform the dissertation are reviewed. A large
body of research exists concerning Formal Technical Review, but review of this
work shows that it is not based on theory and therefore cannot inform this
research effort. The first part of the literature review therefore provides context
rather than explicating applicable theory.

Three techniques from non-information systems disciplines for evaluating visual
artifacts conveying meaning are evaluated. While the HCI evaluation techniques
prove not to be directly applicable to this research, one of the HCI paradigms,
the Human Information Processing System (HIPS) model, is found to be
relevant. The HIPS model is reviewed, as is cognitive science work on attention
and the comprehension of graphics.

Two other areas are identified as necessary for the development of the research
task and tools: (1) types of diagrammatic models and (2) types of software
defects. The literature is reviewed and new typologies are developed.
