
SENG 421: Software Metrics
Empirical Investigation (Chapter 4)

Department of Electrical & Computer Engineering, University of Calgary
B.H. Far (far@ucalgary.ca)
http://www.enel.ucalgary.ca/~far/Lectures/SENG421/04/

Contents

• Software engineering investigation
• Investigation principles
• Investigation techniques
• Formal experiments: Planning
• Formal experiments: Principles
• Formal experiments: Types
• Formal experiments: Selection
• Guidelines for empirical research

Empirical SE

• Fill the gap between research and practice by:
  - Developing methods for studying SE practice
  - Building a body of knowledge of SE practice
  - Validating research before deployment in industrial settings

SE Investigation

• What is software engineering investigation?
  - Applying "scientific" principles and techniques to investigate properties of software and of software-related tools and techniques.
• Why talk about software engineering investigation?
  - Because the standard of empirical software engineering research is quite poor.
SE Investigation: Examples

• Experiments to confirm rules of thumb
  - Should the LOC in a module be less than 200?
  - Should the number of branches in any functional decomposition be less than 7?
• Experiments to explore relationships
  - How does the project team's experience with the application affect the quality of the code?
  - How does the quality of the requirements affect the productivity of the designer?
  - How does the design structure affect the maintainability of the code?
• Experiments to initiate novel practices
  - Would it be better to start OO design with UML?
  - Would the use of SRE improve software quality?

SE Investigation: Why?

• To improve (a process and/or product)
• To evaluate (a process and/or product)
• To prove a theory or hypothesis
• To disprove a theory or hypothesis
• To understand (a scenario, a situation)
• To compare (entities, properties, etc.)
SE Investigation: What?

• Person's performance
• Tool's performance
• Person's perceptions
• Tool's usability
• Document's understandability
• Program's complexity
• etc.

SE Investigation: Where & When?

• In the field
• In the lab
• In the classroom
• Anytime, depending on what questions you are asking
SE Investigation: How?

• Hypothesis/question generation
• Data collection
• Data evaluation
• Data interpretation
• Feedback into an iterative process

SE Investigation: Characteristics

• Data sources come from industrial settings
  - This may include people, program code, etc.
• Usually:
  - Surveys
  - Case studies (hypothesis generation)
  - Experiments (hypothesis testing)
Empirical SE: Examples

• Comparison of different development processes
• Factors affecting success in code inspection meetings
• Requirements gathering for software tool development
• Improvement of organizational processes

Where Do Data Come From?

• First degree contact
  - Direct access to participants
  - Examples:
    - Brainstorming
    - Interviews
    - Questionnaires
    - System illustration
    - Work diaries
    - Think-aloud protocols
    - Participant observation
Where Do Data Come From?

• Second degree contact
  - Access to the work environment during work time, but not necessarily to participants
  - Examples:
    - Instrumenting systems
    - Real-time monitoring
    - Off-line monitoring
• Third degree contact
  - Access to work artifacts, such as source code and documentation
  - Examples:
    - Problem report analysis
    - Documentation analysis
    - Analysis of tool logs
Practical Considerations

• Hidden aspects of performing studies:
  - Negotiations with industrial partners
  - Obtaining ethics approval and informed consent from participants
  - Adapting "ideal" research designs to fit reality
  - Dealing with the unexpected
  - Staffing of the project

Investigation Principles

There are four main principles of investigation:
1. Stating the hypothesis: what should be investigated?
2. Selecting the investigation technique: conducting surveys, case studies, or formal experiments
3. Maintaining control over variables: dependent and independent variables
4. Making the investigation meaningful: verifying theories, evaluating the accuracy of models, validating measurement results
SE Investigation Techniques

• Three ways to investigate:
  - Formal experiment: a controlled investigation of an activity, carried out by identifying, manipulating and documenting key factors of that activity.
  - Case study: documenting an activity by identifying the key factors (inputs, constraints and resources) that may affect its outcomes.
  - Survey: a retrospective study of a situation that tries to document relationships and outcomes.

Case Study or Experiment?

• How to decide whether to conduct an experiment or perform a case study?

  Factor                 | Experiment    | Case study
  -----------------------|---------------|--------------
  Retrospective          | Yes (usually) | No (usually)
  Level of control       | High          | Low
  Difficulty of control  | Low           | High
  Level of replication   | High          | Low
  Cost of replication    | Low           | High
  Can generalize?        | Yes (maybe)   | No

• Control is the key factor.
Hypothesis

• The first step is deciding what to investigate.
• The goal of the research can be expressed as a hypothesis, in quantifiable terms, that is to be tested.
• The test result (the collected data) will confirm or refute the hypothesis.
• Example: Can Software Reliability Engineering (SRE) help us achieve an overall improvement in software development practice in our company?

Examples /1

• Experiment: research in the small
  - You have heard about software reliability engineering (SRE) and its advantages, and may want to investigate whether to use SRE in your company. You may design a controlled (dummy) project and apply the SRE technique to it. You may want to experiment with the various phases of its application (defining the operational profile, developing test cases and deciding on the adequacy of test runs) and document the results for further investigation.
Examples /2

• Case study: research in the typical
  - You may have used software reliability engineering (SRE) for the first time in a project in your company. After the project is completed, you may perform a case study to capture the effort involved (budget, personnel), the number of failures investigated, and the project duration.

Examples /3

• Survey: research in the large
  - After you have used SRE in many projects in your company, you may conduct a survey to capture the effort involved (budget, personnel), the number of failures investigated, and the project duration for all the projects. Then you may compare these figures with those from projects using conventional software testing techniques to see whether SRE leads to an overall improvement in practice. (A sketch of one such comparison follows.)
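One way such a survey comparison could be made concrete is sketched below in Python. The project figures, the group sizes, and the choice of a simple permutation test are all illustrative assumptions, not part of the course example.

    # Compare failure counts of SRE projects vs. conventionally tested
    # projects with a simple one-sided permutation test (synthetic data).
    import random
    from statistics import mean

    sre          = [12, 9, 15, 11, 8, 10]    # hypothetical failures per project
    conventional = [18, 14, 21, 16, 19, 17]

    observed = mean(conventional) - mean(sre)
    pooled = sre + conventional

    hits, trials = 0, 10_000
    for _ in range(trials):
        random.shuffle(pooled)
        diff = mean(pooled[len(sre):]) - mean(pooled[:len(sre)])
        if diff >= observed:
            hits += 1

    # A small p-value suggests the difference is unlikely to be chance alone.
    print(f"observed difference = {observed:.2f}, p ~ {hits / trials:.4f}")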
Hypothesis (cont'd)

• Other examples:
  - Can integrated development and testing tools improve our productivity?
  - Does Cleanroom software development produce better-quality software than conventional development methods?
  - Does code produced using Agile software development have fewer defects per KLOC than code produced using conventional methods?

Control /1

• What variables may affect the truth of a hypothesis? How do they affect it?
• Variables:
  - Independent: values are set by the experimenter or by the initial conditions
  - Dependent: values are affected by changes in other variables
• Example: the effect of "programming language" on the "quality" of the resulting code
  - Programming language is an independent variable; quality is a dependent variable.
Control /2

• A common mistake: ignoring other variables that may affect the values of a dependent variable.
• Example: Suppose you want to determine whether a change in programming language (independent variable) can affect the productivity (dependent variable) of your project. For instance, you currently use FORTRAN and you want to investigate the effects of changing to Ada. The values of all other variables should stay the same (e.g., application experience, programming environment, type of problem, etc.).
• Without this you cannot be sure that any difference in productivity is attributable to the change in language.
• But the list of other variables may grow beyond control!

Control /3

• How to identify the dependent and independent variables?
• Example: suppose the variables are related by
      A → D
      D & C → F
      F & B → Z
  and A, B, C are given (set externally).
• Using causal ordering: {A, B, C} → D → F → Z. So A, B and C are independent variables, D and F are intermediate, and Z is the dependent (response) variable. (A sketch that computes this ordering follows.)
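The causal ordering in the Control /3 example can be computed mechanically. Below is a minimal sketch (Python 3.9+, standard library only); the variable names A–Z are the hypothetical ones from the slide, and the dependency map encodes A → D, D & C → F, F & B → Z.

    from graphlib import TopologicalSorter

    # each variable maps to the set of variables that directly cause it
    causes = {
        "D": {"A"},                          # A -> D
        "F": {"D", "C"},                     # D & C -> F
        "Z": {"F", "B"},                     # F & B -> Z
        "A": set(), "B": set(), "C": set(),  # given (set externally)
    }

    print(list(TopologicalSorter(causes).static_order()))
    # e.g. ['A', 'B', 'C', 'D', 'F', 'Z'] -- the causal ordering

    independent = sorted(v for v, c in causes.items() if not c)
    dependent = sorted(v for v in causes
                       if all(v not in c for c in causes.values()))
    print(independent)   # ['A', 'B', 'C'] -- independent variables
    print(dependent)     # ['Z'] -- the dependent (response) variable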
Formal Experiments: Planning

1. Conception
   - Defining the goal of the investigation
2. Design
   - Generating quantifiable (and manageable) hypotheses to be tested
   - Defining the experimental objects or units
   - Identifying the experimental subjects
   - Identifying the response variable(s)
3. Preparation
   - Getting ready to start, e.g., purchasing tools and hardware, training personnel, etc.
4. Execution
5. Review and analysis
   - Reviewing the results for soundness and validity
6. Dissemination & decision making
   - Documenting conclusions
Formal Experiments: Principles

1. Replication
   - An experiment performed under identical conditions should be repeatable.
   - Confounded results (being unable to separate the effects of two or more variables) should be avoided.
2. Randomization
   - The experimental trials must be organized so that the effects of uncontrolled variables are minimized.
3. Local control
   - Blocking: allocating experimental units to blocks or groups so that the units within a block are relatively homogeneous. The blocks are designed so that the experimental design captures the anticipated variation within the blocks by grouping like varieties, so that this variation does not contribute to the experimental error.
   - Balancing: blocking and assigning treatments so that an equal number of subjects is assigned to each treatment. Balancing is desirable because it simplifies the statistical analysis.
Example: Blocking & Balancing

• You are investigating the comparative effects of three design techniques on the quality of the resulting code.
• The experiment involves teaching the techniques to 12 developers and measuring the number of defects found per 1000 LOC to assess the code quality.
• It may be the case that the twelve developers graduated from three universities. It is possible that the universities trained the developers in very different ways, so that coming from a particular university affects the way in which a design technique is understood or used.
• To eliminate this possibility, three blocks can be defined so that the first block contains all developers from university X, the second block those from university Y, and the third block those from university Z. Then the treatments are assigned at random to the developers within each block. If the first block has six developers, two are assigned to design method A, two to B, and two to C. (A sketch of this assignment follows.)

Formal Experiments: Principles (cont'd)

3. Local control (cont'd)
   - Correlation: the most popular technique for assessing relationships among observational data
   - Correlation can be linear or nonlinear.
   - Nonlinear correlation is hard to measure and may stay hidden. (A sketch demonstrating this follows the assignment example below.)
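A minimal sketch of the blocking-and-balancing assignment described above (hypothetical developer names; block sizes 6/3/3 as in the example):

    import random
    from collections import defaultdict

    developers = {   # developer -> university (hypothetical data)
        "dev01": "X", "dev02": "X", "dev03": "X",
        "dev04": "X", "dev05": "X", "dev06": "X",
        "dev07": "Y", "dev08": "Y", "dev09": "Y",
        "dev10": "Z", "dev11": "Z", "dev12": "Z",
    }
    treatments = ["A", "B", "C"]   # the three design techniques

    # Blocking: group developers by university.
    blocks = defaultdict(list)
    for dev, uni in developers.items():
        blocks[uni].append(dev)

    # Balancing: shuffle within each block, then deal treatments out in
    # round-robin order so each treatment gets an equal share per block.
    assignment = {}
    for devs in blocks.values():
        random.shuffle(devs)
        for i, dev in enumerate(devs):
            assignment[dev] = treatments[i % len(treatments)]

    for dev in sorted(assignment):
        print(dev, developers[dev], assignment[dev])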
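The warning that nonlinear correlation may stay hidden can be demonstrated directly. A minimal sketch (synthetic data; statistics.correlation requires Python 3.10+):

    from statistics import correlation

    x = [i / 10 for i in range(-50, 51)]   # symmetric range around 0
    y = [xi ** 2 for xi in x]              # perfect quadratic relation

    # Pearson's r is ~0 even though y is fully determined by x:
    print(round(correlation(x, y), 6))

    # On the positive half the relation is monotonic and r is close to 1:
    xp = [xi for xi in x if xi > 0]
    print(round(correlation(xp, [xi ** 2 for xi in xp]), 3))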
Formal Experiments: Types

• Factorial design:
  - Crossing: each level of each factor appears with each level of the other factors
  - Nesting: each level of one factor occurs entirely in conjunction with one level of another
  - A properly nested or crossed design may reduce the number of cases to be tested. (A sketch of a crossed design follows.)
• Advantages of factorial design:
  - Resources can be used more efficiently
  - Coverage (completeness) of the target variables' range of variation
  - Implicit replication
• Disadvantages of factorial design:
  - Higher costs of preparation, administration and analysis
  - The number of combinations grows rapidly
  - Some of the combinations may be worthless
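A minimal sketch of a fully crossed design (the factors and their levels are illustrative assumptions): every level of every factor is combined with every level of the others, which is also why the number of combinations grows rapidly.

    from itertools import product

    factors = {
        "language":   ["FORTRAN", "Ada"],
        "experience": ["junior", "senior"],
        "tooling":    ["with_SRE", "without_SRE"],
    }

    cells = list(product(*factors.values()))
    for cell in cells:
        print(dict(zip(factors, cell)))

    print(len(cells))   # 2 * 2 * 2 = 8 cells; grows multiplicatively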
Formal Experiments: Selection

• Selecting the number of variables:
  - Single variable
  - Multiple variables
• Example: measuring the time to code a program module with or without using a reusable repository
  - Without considering the effects of the programmers' experience
  - With considering the effects of the programmers' experience

Formal Experiments: Baselines

• A baseline is an "average" treatment of a variable across a number of experiments (or case studies).
• It provides a measure for identifying whether a value is within an acceptable range. (A sketch of such a check follows.)
• It may help in checking the validity of a measurement.
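A minimal sketch of using a baseline as an acceptance check (the measurements and the two-standard-deviation tolerance are illustrative assumptions):

    from statistics import mean, stdev

    past = [4.1, 3.8, 5.0, 4.4, 4.7, 3.9]   # defects per KLOC, past projects
    baseline, spread = mean(past), stdev(past)

    def within_acceptable_range(value, k=2.0):
        """Accept values within k standard deviations of the baseline."""
        return abs(value - baseline) <= k * spread

    print(round(baseline, 2), round(spread, 2))
    print(within_acceptable_range(4.9))   # True  -- consistent with baseline
    print(within_acceptable_range(9.5))   # False -- investigate this value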
Empirical Research Guidelines: Contents

1. Experimental context
2. Experimental design
3. Data collection
4. Analysis
5. Presentation of results
6. Interpretation of results

1. Experimental Context

Goals:
• Ensure that the objectives of the experiment have been properly defined
• Ensure that the description of the experiment provides enough detail for practitioners

Guidelines:
• C1: Specify as much of the context as possible. In particular, clearly define the entities, attributes and measures that capture the contextual information.
• C2: If a specific hypothesis is being tested, state it clearly prior to performing the study, and discuss the theory from which it is derived, so that its implications are apparent.
• C3: If the study is exploratory, state clearly, prior to data analysis, what questions the investigation is intended to address and how it will address them.
2. Experimental Design

Goals:
• Ensure that the design is appropriate for the objectives of the experiment
• Ensure that the objectives of the experiment can be reached using the techniques specified in the design

Guidelines:
• D1: Identify the population from which the subjects and objects are drawn.
• D2: Define the process by which the subjects and objects were selected (inclusion/exclusion criteria).
• D3: Define the process by which subjects and objects are assigned to treatments.
• D4: Restrict yourself to simple study designs or, at least, to designs that are fully analyzed in the literature.
• D5: Define the experimental unit.
2. Experimental Design (cont'd)

• D6: For formal experiments, perform a pre-experiment or pre-calculation to identify or estimate the minimum required sample size. (A sketch of such a pre-calculation follows this slide.)
• D7: Use appropriate levels of blinding.
• D8: Avoid the use of controls unless you are sure the control situation can be unambiguously defined.
• D9: Fully define all treatments (interventions).
• D10: Justify the choice of outcome measures in terms of their relevance to the objectives of the empirical study.

3. Data Collection

Goals:
• Ensure that the data collection process is well defined
• Monitor the data collection and watch for deviations from the experiment design
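Returning to guideline D6: a minimal sketch of a sample-size pre-calculation for a two-group comparison, using the standard normal-approximation formula n = 2((z_alpha/2 + z_beta) / d)^2 per group, where d is the standardized effect size. The effect size and error levels below are illustrative assumptions.

    from math import ceil
    from statistics import NormalDist

    def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
        z = NormalDist()
        z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test
        z_beta = z.inv_cdf(power)
        return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

    # Detecting a "medium" standardized effect (d = 0.5) with 80% power:
    print(sample_size_per_group(0.5))   # ~63 subjects per group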
3. Data Collection (cont'd)

• DC1: Define all software measures fully, including the entity, attribute, unit and counting rules. (A sketch of such a definition follows this slide.)
• DC2: Describe any quality control method used to ensure the completeness and accuracy of data collection.
• DC3: For observational studies and experiments, record data about subjects who drop out of the studies.
• DC4: For observational studies and experiments, record data about other performance measures that may be adversely affected by the treatment, even if they are not the main focus of the study.

4. Analysis

Goals:
• Ensure that the data collected from the experiment are analyzed correctly
• Monitor the data analysis and watch for deviations from the experiment design
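Returning to guideline DC1: a minimal sketch (the fields and the example measure are illustrative assumptions) of recording a measure definition so that every collected value names its entity, attribute, unit and counting rule.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class MeasureDefinition:
        name: str
        entity: str          # what is measured (e.g., a code module)
        attribute: str       # which property of the entity
        unit: str            # unit of the recorded value
        counting_rule: str   # how occurrences are counted

    defect_density = MeasureDefinition(
        name="defect_density",
        entity="code module",
        attribute="quality",
        unit="defects per KLOC",
        counting_rule="count unique confirmed defects; exclude duplicates",
    )
    print(defect_density)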
4. Analysis (cont'd)

• A1: Specify any procedures used to control for multiple testing. (A sketch of one such procedure follows this slide.)
• A2: Consider using blind analysis (avoid "fishing for results").
• A3: Perform sensitivity analysis.
• A4: Ensure that the data do not violate the assumptions of the tests used on them.
• A5: Apply appropriate quality control procedures to verify the results.

5. Presentation of Results

Goal:
• Ensure that the reader of the results can understand the objective, the process and the results of the experiment
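Returning to guideline A1: one common procedure for controlling multiple testing is the Bonferroni correction, sketched below. The choice of procedure and the p-values are illustrative assumptions; the guideline does not prescribe one.

    def bonferroni(p_values, alpha=0.05):
        """Return which results stay significant after correction."""
        m = len(p_values)
        return [p <= alpha / m for p in p_values]

    # Five hypothetical p-values from five tests on the same data set:
    print(bonferroni([0.004, 0.03, 0.20, 0.008, 0.049]))
    # [True, False, False, True, False] -- only p <= 0.01 survives (m = 5)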
5. Presentation of Results (cont'd)

• P1: Describe or cite a reference for all procedures used. Report or cite the statistical package used.
• P2: Present quantitative results as well as significance levels. Quantitative results should show the magnitude of effects and the confidence limits. (A sketch follows this slide.)
• P3: Present the raw data whenever possible. Otherwise, confirm that they are available for review by the reviewers and independent auditors.
• P4: Provide appropriate descriptive statistics.
• P5: Make appropriate use of graphics.

6. Interpretation of Results

Goal:
• Ensure that the conclusions are derived solely from the results of the experiment
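Returning to guideline P2: a minimal sketch (synthetic data, normal approximation) of reporting the magnitude of an effect with confidence limits rather than a bare significance verdict.

    from math import sqrt
    from statistics import NormalDist, mean, stdev

    before = [42.0, 39.5, 44.1, 40.2, 41.8, 43.0]   # hypothetical effort (h)
    after  = [36.2, 35.9, 38.4, 34.7, 37.1, 36.5]

    diff = mean(before) - mean(after)               # magnitude of the effect
    se = sqrt(stdev(before) ** 2 / len(before) +
              stdev(after) ** 2 / len(after))
    z = NormalDist().inv_cdf(0.975)                 # 95% confidence level

    print(f"effect = {diff:.2f} h, "
          f"95% CI = [{diff - z * se:.2f}, {diff + z * se:.2f}]")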
6. Interpretation of Results (cont'd)

• I1: Define the population to which the inferential statistics and predictive models apply.
• I2: Differentiate between statistical significance and practical importance.
• I3: Specify any limitations of the study.