
EMPIRICAL EVALUATION IN SOFTWARE

PRODUCT LINE ENGINEERING


Alvin Ahnassay, Ebrahim Bagheri, Dragan Gasevic
Laboratory for Systems, Software and Semantics, Ryerson University
Abstract:
Context: Software Product Line (SPL) engineering is a specific approach to software development that offers strategic methodologies and technologies tailored for the reuse of core assets through the capturing of variabilities and commonalities. It is imperative to evaluate software product line engineering practices through real empirical studies and to investigate how each proposed work contributes to the research and development communities.
Objectives: The objective of this research is to paint a big picture of the empirical evaluations that have been undertaken in the field of Software Product Lines (SPL) with reference to the development of methods, techniques, and approaches¹. Furthermore, we are interested in determining the quality of the empirical evaluations and how they can be improved.
Methods: We carried out a systematic literature review of the software product line methods,
techniques, or approaches reported from January 1, 2006 until December 31, 2011. The adopted
method follows well established guidelines in the literature for conducting systematic literature
reviews.
Results: 79 distinct studies representing a diversity of unique software product line methods, techniques, and approaches were selected according to the inclusion and exclusion criteria. The results revealed that a significant number of evaluations were conducted in academia, with only 25 studies conducted in an industrial setting. However, the majority of the evaluations effectively avoided scalability issues by using industrial-sized examples. This investigation also revealed quality issues in various aspects of the quality assessment criteria for research design and reporting.
Conclusions: This systematic literature review found that a large majority of evaluations had not been sufficiently designed or reported, which makes it difficult to rely on the available evidence. The findings highlight unresolved quality issues in empirical studies of software product lines and further reinforce the need for improvement in the quality of research design and reporting.
Keywords: Empirical Software Engineering, Software Product Line Engineering, Systematic
Literature Review

¹ In this paper, the phrase "methods, techniques and approaches" is intended to refer to any form of soft or hard means proposed to address a challenge in the area of software product lines.

1. Introduction
The diversity and complexity of stakeholders' requirements and the demand for high-quality software systems urge the need for approaches that enable effective reusability in developing software systems. Software Product Line Engineering (SPLE) has proven to be a paradigm that provides strategic software reuse [1], allowing organizations to reduce development costs and time to market and to increase product quality [36]. SPLE shifts from developing a single software system to developing a set of software systems which share a common and managed set of features [2]. Hence, SPLE can handle various stakeholders' requirements within a particular domain of interest and build new products using a stable collection of existing core assets.
In software product lines, reusability is considered a first-class notion; one lifecycle is introduced for developing reusable artifacts (development for reuse) and another for developing software systems from reusable artifacts (development with reuse) [3]. On the one hand, commonality and variability modeling are utilized in the first lifecycle (also known as the domain engineering lifecycle) to document common assets and their variability. On the other hand, configuration and decision making mechanisms are employed in the second lifecycle (also known as the application engineering lifecycle) to derive software systems fitted to stakeholders' requirements.
There are many studies in software product lines that present various methods, techniques, and approaches of software product line engineering and the asserted benefits of these practices. These studies propose new and enhanced methods, techniques, and approaches to software product line development. A common characteristic of these studies is the advantages and improvements they claim over traditional software development and existing software product line practices. In addition, many software engineering firms claim to have realized the benefits of adopting and utilizing the software product line approach [36]. However, little is known about what real evidence exists to support the benefits ascribed to software product line engineering practices.
This systematic review seeks to review and summarize empirical evidence reported between 2006
and 2011 from academia and industry on software product lines methods, techniques, and
approaches. The objective of this review is to:
- Search and consolidate empirical evidence supporting software product line methods, techniques, and approaches evaluated in academia and industry;
- Assess the evaluation strategies/procedures and the quality of reporting;
- Summarize and analyze the results; and
- Recommend areas for future research.

This review defines three research questions (see Section 2) that are answered based on the gathered evidence. Each research question is supported by sub-research questions that help characterize the quality of the research evaluations. Based on the nature of this research, the contribution of this review should be of interest to those who want to learn from experience while planning future studies on software product lines, as well as those who seek evidence to support their decision-making process for adopting software product line practices or for improving their existing practices based on the accumulated results.
This paper is structured as follows. Section 2 explains our motivation and the research questions
that this work intends to answer. Section 3 provides an overview of the design of this systematic
literature review. Section 4 discusses the execution of the review along with threats to validity.
Section 5 presents and discusses the results of the review in reference to the research questions
outlined in Section 2. Section 6 highlights and discusses the main findings derived from the
analysis of the results and points out the recommendations for empirical studies in software
product lines. After reviewing the related work and comparing our systematic review in that
context in Section 7, we conclude the paper and highlight possible future work.

2. Motivation and Research Questions


Systematic reviews are becoming a standard research method amongst software engineers [4][6]. However, practitioners still lack significant knowledge about this research method, and the number of explored topics remains limited [4]. This deficiency in explored topics holds true in the area of software product lines and justifies a need for more systematic literature reviews of software product line methods, techniques, and approaches.
Systematic reviews are performed to summarize existing evidence concerning a specific research
question, topic area or phenomenon of interest, and identify areas where research is lacking [5].
Using a fair and unbiased approach, this systematic literature review is intended to summarize evidence of various software product line methods, techniques, and approaches that have been subjected to an empirical evaluation. In addition, there is a need to identify possible gaps in existing software product line evaluation methods in order to recommend further empirical work in the area of software product lines and to provide a context for new software product line evaluation activities.
The research questions in this review are designed to capture what software product line methods, techniques, and approaches exist today, what the current state of their evaluations is, and to what extent the evaluations support adoption of the software product line methods, techniques, or approaches. The specific research questions that this study addresses are:

- RQ1: What software product line methods, techniques, and/or approaches exist today and what is the context in which they have been proposed?
- RQ2: What is the state of evaluations being conducted on software product line methods, techniques, and/or approaches?
- RQ3: To what extent do the evaluations support the adoption of software product line methods, techniques, and/or approaches?

In order to provide a proper level of detail for the abovementioned research questions, these questions are refined into several sub-research questions. All research questions and their descriptions are recorded in Table 1.

Table 1: Designated research questions for the study

1    What software product line methods, techniques, or approaches exist?
     Description: To provide an inventory of software product line methods, techniques, or approaches evaluated.

1.1  What problem area does the software product line method, technique or approach target?
     Description: To identify the problem area targeted by the software product line method, technique or approach.

1.2  What is the project lifecycle focus of the software product line method, technique or approach?
     Description: To identify when the software product line method, technique, or approach is used during the software product line lifecycle.

1.3  What project activities does the software product line method, technique or approach target?
     Description: To identify the software product line development activity targeted by the software product line method, technique, or approach.

1.4  To what degree is the evaluation of the method described?
     Description: To assess the quality of reporting used to describe the software product line method, technique or approach.

2    What is the state of the software product line method, technique or approach evaluation?
     Description: A summary of research questions 2.1-2.5 formulating the state of evaluation.

2.1  What research methods are used in the evaluation of the software product line method, technique or approach?
     Description: Identify the type of research method used to evaluate the software product line method, technique or approach, for example, experiments or case studies.

2.2  In what context is the software product line method, technique or approach evaluated?
     Description: Identify whether the evaluation was performed from an industry or academic perspective.

2.3  What types of subjects are used during the evaluation of the software product line method, technique or approach?
     Description: Identify the subjects used during the evaluation of the software product line method, technique or approach, for example, students, researchers, or practitioners.

2.4  What scale does the evaluation of the software product line method, technique or approach have?
     Description: Identify the scale at which the software product line method, technique or approach is evaluated, taking into consideration the magnitude of the evaluation conducted. For example, can the evaluation be considered industrial in size or a toy example?

2.5  What degree of realism does the evaluation of the software product line method, technique or approach have?
     Description: Measurement of how realistic the evaluation of the software product line method, technique or approach is in terms of applicability. The combination of research method, context, subjects, and scale gives the realism of the evaluation.

3    To what extent does the reported empirical evaluation enable practitioners to decide about the actual adoption of the software product line method, technique or approach?
     Description: Determine how well the reported empirical evaluation enables practitioners to assess the probability or likelihood that the software product line method, technique or approach is viable enough to be adopted as a practicable approach in the software product line community. This question is influenced by the findings of RQ 3.1.

3.1  To what degree is the software product line method, technique or approach evaluation described?
     Description: The degree to which the evaluation is described can be measured by combining the context described, the study design described, and the validity of the outcomes.

3. Review Method
This section provides the details surrounding the review protocol developed to guide the conduct
of this review. It discusses the systematic review design, data source and search strategy, study
selection criteria, quality assessment criteria, data extraction procedures, and data synthesis
procedures.

3.1. Systematic Review Design


Following the guidelines described in [5], this study has been carried out according to the three
main phases of systematic literature reviews: 1) Planning the review, 2) Conducting the review,
and 3) Reporting the review. These phases are illustrated below:

Figure 1: Systematic Literature Review Phases


The Planning Review Phase involves identifying the need for a review and developing a review protocol. The need for this study was discussed in the previous section. This section provides a discussion of the development of the review protocol used for this study. The review protocol specifies the methods used for finding primary studies (including search terms and data sources), study selection criteria, study quality assessment criteria, and procedures for data extraction and data synthesis.
The Conducting Review Phase involves identifying and selecting primary studies, assessing the quality of the primary studies, and extracting and synthesizing the information reported in the primary studies. The review execution is discussed in the next section.
The Reporting Review Phase involves synthesizing the data extracted during the review execution and summarizing the results of the included primary studies. The results and analysis of this review are reported in Section 5. The extracted information is tabulated in accordance with the review questions and is supported by descriptive analysis, aggregated totals and percentages, and graphical representation. Overall, this paper follows the recommendations for structuring reports of systematic reviews outlined in Table 9 of [5].

3.2. Data Sources and Search Strategy


The process of identifying relevant papers that undertake an empirical evaluation of a software product line method, technique or approach required the capability to retrieve research articles and papers made available through scientific publication databases. Specific publication databases were selected on the basis that they include research papers in the area of computing and information technology sciences. These digital databases were chosen because they provide the most important and highest-impact journals and conference proceedings and cover the fields of software product line engineering and software engineering [23]. Table 2 identifies and describes the digital databases chosen as adequate sources to identify research papers.
Table 2: Digital Libraries used to search for primary studies

ACM Digital Library: Search this database from ACM (Association for Computing Machinery) to access quality information in the scientific computing field. Includes full-text articles from ACM magazines, journals and conference proceedings as well as publications from affiliated organizations.

IEEE/IEE Electronic Library: Search this database to find research materials in the areas of computing science and electrical engineering.

ScienceDirect: ScienceDirect is Elsevier's digital library, one of the world's largest providers of scientific, technical and medical (STM) literature. Access a selection of journals from Elsevier Science covering all fields of science (including journal titles from Business, Psychology, Arts and Humanities, and Social Science).

SpringerLink: Formerly known as Springer-Verlag Journals; from this site one can access hundreds of Springer-Verlag and Kluwer journal titles, with considerable full text online. Coverage includes Chemical Sciences, Computer Sciences, Economics, Engineering, Environmental Sciences, Geosciences, Law, Life Sciences, Mathematics, Medicine, Physics, and Astronomy.

Wiley Online Library: Wiley Online Library, formerly known as Wiley InterScience, now includes journal content formerly on Blackwell Synergy, providing access to more than 3 million articles across 2300+ journals. Coverage includes business, computer science, criminology, education, health, law, mathematics, psychology, sciences and social sciences.

Retrieving research papers from the above databases required a specific combination of keywords that would identify papers suggesting that an evaluation of a Software Product Line method, technique, or approach was conducted. The search terms were tested and refined to establish accuracy of paper identification and to ensure the identification process was robust enough to minimize the risk of missing important and relevant papers. This was ensured through several rounds of revision based on input from all of the authors of this paper. The structure of the search strategy is based on the main concept being examined in the review, with the goal of finding empirical evaluations of Software Product Line methods, techniques, or approaches.
Table 3 presents the search terms used to identify potentially relevant articles in the computing and
information technology sciences literature. The population signifies the search for Software
Product Lines as well as Software Product Families as these two terms of reference are
interchangeable. The intervention signifies that an evaluation should have been performed in the
research paper.
Table 3: Search Terms used to find software product line literature

Population:   software product line OR software product famil*
              AND
Intervention: empiric* OR experiment* OR experience* OR lesson learned OR lessons learned OR lesson learnt OR lessons learnt OR evaluat* OR validat* OR stud* OR case* OR example* OR survey* OR investigat* OR analy* OR demo*
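The population and intervention term lists of Table 3 combine into a single boolean query string. The sketch below is illustrative only: each digital library accepts a slightly different phrase and wildcard syntax, so this generic form would need adapting per database.

```python
# Assembling the boolean search string from Table 3 (illustrative sketch).
# Quoting of multi-word phrases is an assumption; actual database query
# syntax varies per digital library.

population = ['"software product line"', '"software product famil*"']
intervention = [
    "empiric*", "experiment*", "experience*", '"lesson learned"',
    '"lessons learned"', '"lesson learnt"', '"lessons learnt"',
    "evaluat*", "validat*", "stud*", "case*", "example*",
    "survey*", "investigat*", "analy*", "demo*",
]

# (population terms OR-ed) AND (intervention terms OR-ed)
query = "({}) AND ({})".format(
    " OR ".join(population),
    " OR ".join(intervention),
)
print(query)
```

The population clause restricts results to the SPL domain; the intervention clause requires some form of evaluation vocabulary to appear.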

3.3. Study Selection


In addition to executing the automated search based on the search terms in Table 3, the results required further investigation to determine which papers should be included in and excluded from this study. Hence, we applied a set of inclusion (see Table 4) and exclusion (see Table 5) criteria to select appropriate papers for the study:

Table 4: Inclusion Criteria for determining the papers for the study

Criterion: Papers where the search terms are found.
Rationale: If the main objective of the paper is to evaluate a software product line method, technique, or approach, then it would most likely be mentioned in the title or abstract.

Criterion: Papers published from Jan-01-2006 to Dec-31-2011.
Rationale: Only interested in current and relevant evaluations conducted in recent years. The year 2006 was recommended because it is common practice in other related studies to investigate and focus on a 5-year review period [51].

Criterion: Papers where the full-text is available.
Rationale: If the full-text is not available for review, then there is no information to review and extract. If there is some information, it is most likely unreliable.

Criterion: Papers written in English.
Rationale: Time constraints and language barriers restrict this review to papers written in English only, because the author is unilingual and does not have the resources available for translation of other languages.

Criterion: Papers that are either a research paper, peer-reviewed paper, technical report or academic manuscript.
Rationale: Due to quality restrictions, this review was limited to searches in academic electronic databases and technical reports. Other sources of evidence, such as works-in-progress, were excluded.

Criterion: Papers that specify in the abstract that the paper benefits from an evaluation of a software product line method, technique or approach.
Rationale: If the main objective of the paper is to evaluate a software product line method, technique, or approach, then it would most likely be mentioned in the title or abstract.

Table 5: Exclusion criteria for filtering out papers for the study

Criterion: Papers that are duplicates of papers already included.
Rationale: Including duplications will skew the results of this review. If duplicate papers are found, only the latest version will be included and all others excluded.

Criterion: Papers that are systematic literature reviews.
Rationale: Systematic literature reviews that study other systematic literature reviews are considered tertiary studies. This systematic literature review is a secondary study such that it reviews primary studies.

Criterion: Papers that present a theoretical evaluation of a software product line method, technique or approach.
Rationale: Theoretical evaluations do not produce empirical evidence.

Criterion: Papers that are solely evaluations of software product line technologies rather than a method, technique or approach.
Rationale: The main focus of this study is on empirical evaluations of software product line methods, techniques, or approaches. Technologies are tools that are used to assist in the execution of methods, techniques, or approaches; they are not methods, techniques, or approaches themselves.

Criterion: Papers that are evaluations of software engineering methods, techniques, or approaches that are not a software product line method, technique or approach but are evaluated in the context of software product line development (i.e., Agile or Model-driven development practices used to develop a software product line).
Rationale: The main focus of this study is on empirical evaluations of software product line methods, techniques, or approaches. Papers that evaluate methods, techniques or approaches adopted from software engineering methodologies other than software product lines must be rejected even if they are applied in the context of software product line development.
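Taken together, Tables 4 and 5 act as a filter over the retrieved papers. The sketch below illustrates that selection logic; the field names on the paper record (`year`, `kind`, `is_duplicate`, and so on) are illustrative assumptions, not part of the review protocol itself.

```python
# Illustrative sketch of the study-selection filter implied by Tables 4 and 5.
# All dictionary field names are assumptions made for this example.

ACCEPTED_KINDS = {"research paper", "peer-reviewed paper",
                  "technical report", "academic manuscript"}

def include(paper):
    """Inclusion criteria of Table 4 (search-term match assumed upstream)."""
    return (2006 <= paper["year"] <= 2011
            and paper["language"] == "English"
            and paper["full_text"]
            and paper["kind"] in ACCEPTED_KINDS)

def exclude(paper):
    """Exclusion criteria of Table 5."""
    return (paper["is_duplicate"]
            or paper["is_slr"]
            or paper["evaluation_type"] in {"theoretical", "technology-only",
                                            "non-SPL method"})

def select(papers):
    """Keep papers that satisfy inclusion and fail no exclusion criterion."""
    return [p for p in papers if include(p) and not exclude(p)]
```

In practice this filtering was performed manually by the reviewers; the sketch only makes the decision rules explicit.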

3.4. Study Quality Assessment


In addition to the inclusion and exclusion criteria, it is important to assess the quality of the selected primary studies for a variety of reasons, one of which is to provide more detailed inclusion and exclusion criteria [5]. However, because the inclusion and exclusion criteria presented in Section 3.3 are already quite rigorous, the quality assessment criteria presented in this section serve instead to capture the variance in study quality by providing a way of scoring the quality of individual primary studies. Furthermore, the quality assessment criteria also serve to provide guidance in interpreting the findings of this review and in making recommendations for future work [5].
Utilizing the recommendations outlined in [24], the quality assessment criteria are intended to
appraise the attributes of research design and reporting of the selected primary studies. These
attributes are incorporated into the data extraction form presented in Table 7 of Section 3.5 and
include the following components:

- Theoretical context: describing the theoretical aspects of the proposed software product line method, technique, or approach creates a frame of reference from which the research objectives are derived [24][25].
- Research Design: combining the guidelines for empirical research design recommended in [24][25][26], consensus was reached on the specific aspects of research design deemed appropriate for assessing the quality of evaluations. These aspects include clearly stated research objectives supported by research questions, domain context, samples and instruments used to carry out the evaluation, and the data collection and analysis procedures. Describing the research design includes stating research objectives that are supported by research questions [24][25][26]. Stating the objective sets the focal point of the research [27]; this can be done in various ways, such as in a problem statement, hypotheses, or objective/goal statement. The research objective should be supported by research questions that state what is needed to fulfill the objective [24][25]. Moreover, Kitchenham [24] asserts that defining the hypothesis correctly is the most important part of designing empirical research. Equally important is describing the data collection and analysis procedures, which ensures that the evaluation can be easily replicated and that the results are analyzed correctly [24]. A well-defined research design has the advantage of making subsequent analysis straightforward and clear [24].
- Research Analysis: interpreting the results of an evaluation and relating the results back to the research objectives ensures the results are statistically or practically significant and valid [24]. Therefore, reporting the results with clear interpretation, and communicating the assumptions made, threats to validity, and lessons learned, is important for the reader to judge whether the findings are reasonable and trustworthy [24][25][26].
- Research Conclusions: researchers should place their findings in a context of implications by forming theories that constitute a framework for their analysis [25]. This is where researchers discuss practical impact, relate their work and results to earlier research, and give footing towards future work [26].

Based on the attributes listed above, the following quality assessment checklist was created (see
Table 6).
Table 6: Quality Assessment Checklist

Theoretical context:
- Is the purpose of the method explained?
- Is the theoretical aspect of the method explained?
- Is the implementation of the method explained?
- Are the advantages of the method explained?
- Are the limitations of the method explained?

Research Design:
- Does the study describe a problem, hypotheses or objective/goal statement?
- Does the study provide research questions that will support or reject the problem, hypotheses or objective/goal statement?
- Does the study describe the domain in which the evaluation was conducted?
- Does the study describe samples and instruments used to conduct the evaluation?
- Does the study describe its data collection procedures?
- Does the study describe its data analysis procedures?

Research Analysis:
- Does the study provide an interpretation or analysis of the results?
- Does the study describe its results relative to its problem, hypotheses or objective statement?
- Does the study describe assumptions made during the execution of the evaluation?
- Does the study discuss threats to validity?
- Does the study discuss lessons learned?

Research Conclusions:
- Does the study discuss the practical implications?
- Does the study discuss related work?
- Does the study provide recommendations for future work?

In order for the criteria in the Quality Assessment Checklist to provide a way of weighting the significance of individual primary studies, each question is scored based on the answer provided. The questions are close-ended, with limited responses (Yes/Somewhat/No) that, following [6], are respectively assigned scores of 1.0, 0.5, and 0.0. This scoring mechanism captures the variance in study quality, which in turn determines the quality of each individual study and the strength of its inferences.
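The scoring scheme above can be stated compactly. The sketch below assumes only what the paragraph describes: each Yes/Somewhat/No answer maps to 1.0/0.5/0.0 and a study's score is the sum over the checklist questions.

```python
# Quality-scoring sketch for the Table 6 checklist:
# Yes -> 1.0, Somewhat -> 0.5, No -> 0.0, summed per study.

SCORES = {"Yes": 1.0, "Somewhat": 0.5, "No": 0.0}

def quality_score(answers):
    """answers: one 'Yes'/'Somewhat'/'No' string per checklist criterion."""
    return sum(SCORES[a] for a in answers)

# Example: a study answering the 19 checklist questions.
answers = ["Yes"] * 10 + ["Somewhat"] * 5 + ["No"] * 4
print(quality_score(answers))  # 12.5
```

Dividing the sum by the number of questions would yield a normalized score in [0, 1] if studies with differing applicable criteria needed to be compared.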

3.5. Data Extraction


The data extraction form in Table 7 was designed to gather all the information required to address the research questions and quality assessment criteria. In addition to acquiring this information, the following standard information was also extracted from each primary study: Title of the Paper, Sources (Database and Journal), Date Published, Paper URL, Digital Object Identifier (DOI) and Authors.
The purpose of collecting the aforementioned information is to enable analysis of the meta-data of the studies themselves, for instance, distinguishing the time frames of the studies (i.e., how many studies were published in 2006 versus 2011). This measurement provides insight into the growth of and interest in software product line research. Other points of interest that can be answered include who the main players in software product line research are, how readers can access the studies via URL or DOI, and which sources are more likely to publish software product line research and, more importantly, high quality research. However, this review has limited its work to reporting the findings associated with answering the research questions stated in Section 2.
Table 7: Data Extraction Properties

1. Method
   Values: The method name provided in the reviewed literature.
   Description: The name of the method under evaluation (if the author provides one).
   RQ Mapping: RQ 1

2. Problem Area
   Values: The area of software product line development.
   Description: The area of software product line development which the method is intended to provide a solution for.
   RQ Mapping: RQ 1.1

3. Project Focus
   Values: Product Management, Domain Engineering, Product Engineering, Not Mentioned
   Description: Project focus specifies in which software product line development phase the method is meant to be used.
   RQ Mapping: RQ 1.2

4. Project (Lifecycle) Activity
   Values: Business Vision Evaluation, Scope Definition, Domain Requirements Analysis, Common Requirements, Variable Requirements, Domain Design, Domain Implementation, Domain Testing, Product Requirements, Product Design, Product Construction, Product Testing, Product Delivery/Support, Not Mentioned
   Description: The lifecycle activity specifies whether the method is meant to be used during a specific stage of the software development process. These activities can be seen as sub-processes of the three phases identified in the Project Focus property.
   RQ Mapping: RQ 1.3

5. Method described
   Values: Yes, No, Somewhat
   Description: 1. Is the purpose of the method explained? 2. Is the theoretical aspect of the method explained? 3. Is the implementation of the method explained? 4. Are the benefits of the method explained? 5. Are the limitations of the method explained?
   RQ Mapping: RQ 1.4

6. Research Method
   Values: Case Study, Experimental, Survey, Post-mortem Analysis, Other
   Description: The research method used to evaluate the software product line method; the type of evaluation classification.
   RQ Mapping: RQ 2, RQ 2.1, RQ 2.5, RQ 3

7. Context
   Values: Academia, Industry
   Description: Specifies the context in which the evaluation is made.
   RQ Mapping: RQ 2, RQ 2.2, RQ 2.5, RQ 3

8. Subjects
   Values: Practitioner, Researcher, Student
   Description: Specifies who uses the method in the evaluation.
   RQ Mapping: RQ 2, RQ 2.3, RQ 2.5, RQ 3

9. Scale of the evaluation
   Values: Toy example/Prototype, Down-scaled real example, Industrial
   Description: Specifies the scale on which the evaluation is made.
   RQ Mapping: RQ 2, RQ 2.4, RQ 2.5, RQ 3

10. Study design described
    Values: Yes, No, Somewhat
    Description: In combination with Study Context, does the study also provide information that addresses the following questions: 1. Does the study describe a problem, hypotheses or objective statement? 2. Does the study provide research questions that will support or reject the problem, hypotheses or objective statement? 3. Does the study describe the domain in which the evaluation was conducted? 4. Does the study describe samples and instruments used to conduct the evaluation? 5. Does the study describe its data collection procedures? 6. Does the study describe its data analysis procedures?
    RQ Mapping: RQ 3, RQ 3.1

11. Study reporting described
    Values: Yes, No, Somewhat
    Description: In combination with Study Context and Study Design, does the study also provide information that addresses the following questions: 1. Does the study describe assumptions made during the execution of the evaluation? 2. Does the study provide an interpretation or analysis of the results? 3. Does the study describe its results relative to its problem, hypotheses or objective statement? 4. Does the study discuss threats to validity? 5. Does the study discuss the practical implications? 6. Does the study discuss lessons learned? 7. Does the study discuss related work? 8. Does the study provide recommendations for future work?
    RQ Mapping: RQ 3, RQ 3.1

Properties 1-5 are intended to provide insight into RQ1: What software product line methods,
techniques, and/or approaches exist today? Creating an inventory of software product line
methods, techniques, and approaches that includes method names, method descriptions,
targeted problem areas, targeted project lifecycles and activities gives a snapshot of existing
software product line methods, techniques, and approaches being evaluated.


Properties 6-9 are intended to provide insight into RQ2: What is the state of evaluations
being conducted on software product line methods, techniques, and/or approaches?
Determining the state of software product line method evaluations takes into account the
research method used, the context in which the evaluation took place, the subjects that took part
in the evaluation, and the scale of the evaluation. These four properties describe the context of
the evaluations studied and provide adequate information to develop an understanding of the
state of evaluations.

Properties 6-12 are intended to provide insight into RQ3: To what extent do the evaluations
support the adoption of software product line methods, techniques, and/or approaches? For a
software product line method, technique or approach to be considered for adoption, a few
factors need to be weighed. The context in which the method, technique or approach was
evaluated is an important factor because the degree of realism influences how applicable the
evidence is in other organizational environments. The quality of research design and reporting
is also a major influential factor for adoption because it directly affects the trustworthiness of
the evidence.

Here we provide further description of each of the properties:


Property #1 Method: This property's value is in accordance with what is presented in the research
study as the software product line method, technique or approach being evaluated. If the author(s)
have given the software product line method a specific name, then this value will be
extracted and reported in accordance with what is stated in the study. If a specific name is not
available for reporting, then this property will provide a brief description of the method, technique
or approach. Example data entry: "Textual Variability Language (TVL) - a textual feature
modeling dialect geared towards software architects and engineers."
Property #2 Problem Area: This property's value identifies the problem area on which the software
product line method is focused. This value should highlight a specific problem area that the
software product line method is attempting to address. Example data entries: Product Derivation,
Variability Management, Visualization, and Architecture.
Property #3 Project Focus: This property's value indicates which phase the software product line
method is subjected to in relation to software product line development. The primary phases of
development are Product Management, Domain Engineering, and Product Engineering [36].
Although the three phases listed are the closest consensus
practitioners and researchers share regarding the phases of software product line development,
some authors may report the phases differently, mention alternative phases, or not mention a
phase. In this event, a general impression will be applied based on the reviewer's conclusions. If a
value cannot be determined then "unknown" will be applied.
Property #4 Project (Lifecycle) Activity: This property's value categorizes the software product
line method with respect to what development activity of software product line engineering it is
intended to be used in. The values for this property are adapted from [36].
Some authors may report the activities differently, mention alternative activities, or not
mention an activity. In this event, a general impression will be applied based on the reviewer's
conclusions. If a value cannot be determined then "unknown" will be applied.
Property #5 Method described: This property's value indicates how the method was reported. Five
questions related to the method description are asked and contribute to the scoring of this
property. These questions are listed under the study quality assessment criteria section.
Property #6 Research Method: This property's values are in accordance with the empirical research
methods discussed in [27][30]. These research methods are namely experiment, case study, survey,
and post-mortem analysis. Other types of research methods identified (e.g., Action Research)
will be tagged as Other but reported based on their classification.
Property #7 Context: This property's values indicate whether the evaluation occurred in an
industrial setting or an academic setting. If the context is not mentioned, then it will be assumed the
evaluation was conducted in an academic setting.
Property #8 Subjects: This property's values indicate the type of participants observed during the
evaluation. Participants may include practitioners, researchers, or students. If no participants are
identified, then it will be assumed the participants are the authors of the study, and they will be
classified as researchers.
Property #9 Scale of Evaluation: This property's values signify the size of the evaluation conducted
during the study. Evaluations are conducted on a small, medium, or large scale, ranging from
prototypes/toy examples, through down-scaled real-world examples, to industrial size projects. Open
source software product lines such as Berkeley DB will be classified as industrial scale
evaluations.
Property #10: Study Design described: This property's value indicates how the study was
designed. Six questions related to the design of the study are asked and contribute to the
scoring of this property. These questions are listed in the study quality assessment criteria section.
Property #11: Study Reporting described: This property's value indicates how the study was
reported. Eight questions related to the reporting aspect of the study are asked and contribute
to the scoring of this property. These questions are listed in the study quality assessment criteria
section.
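As a purely illustrative sketch (not part of the review protocol), the eleven extraction properties could be encoded as a simple record; the field names and default values below are our own shorthand for the properties described above, with the defaults reflecting the assumptions stated for Properties 7 and 8:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical encoding of the data extraction form (Properties 1-11).
# Field names are illustrative shorthand, not taken from the review itself.
@dataclass
class ExtractionRecord:
    method: str                   # Property 1: method name or brief description
    problem_area: List[str]       # Property 2: e.g. "Feature Modeling"
    project_focus: List[str]      # Property 3: DE, PE, PM, or "unknown"
    project_activity: List[str]   # Property 4: e.g. "DRA", or "unknown"
    method_described: float       # Property 5: score from the quality questions
    research_method: str          # Property 6: Experiment, Case Study, ...
    context: str = "Academia"     # Property 7: defaults to an academic setting
    subjects: str = "Researcher"  # Property 8: defaults to the study authors
    scale: str = "Toy example"    # Property 9
    design_score: float = 0.0     # Property 10
    reporting_score: float = 0.0  # Property 11

# Example entry using the TVL method mentioned for Property 1; the
# classification values here are invented for illustration.
record = ExtractionRecord(
    method="Textual Variability Language (TVL)",
    problem_area=["Feature Modeling"],
    project_focus=["DE"],
    project_activity=["DRA"],
    method_described=4.5,
    research_method="Experiment",
)
print(record.context)  # "Academia" -- the default when no context is mentioned
```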

3.6. Data Synthesis


The results of the review are presented based on the order of the research questions. Syntheses of
the results are descriptive in nature and complemented by quantitative summaries of classifications
presented in tabular format and graphical representation. The information extracted from the
primary studies is tabulated in accordance with the review questions. The tabulated information
focuses on similarities and differences between the primary studies and includes counts and
percentages of study properties and values identified in the data extraction form.
If there is specific information available to be extracted that will support the quantitative synthesis
of results then the results are presented using graphical plots, i.e., forest plot. Annotations will be
used on the forest plot to support sensitivity analysis [24].
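The tabulation described above amounts to counting how often each value of a property occurs and expressing the counts as percentages of the total number of studies. A minimal sketch of that computation, using invented classifications rather than the review's actual data:

```python
from collections import Counter

# Count each property value and express it as a percentage of the total
# number of studies, mirroring the Number/Percentage tables in Section 5.
def summarize(values, total):
    counts = Counter(values)
    return {v: (n, round(100 * n / total, 2)) for v, n in counts.most_common()}

# Hypothetical research-method classifications for five studies.
research_methods = ["Experiment", "Case Study", "Experiment", "Survey", "Experiment"]
summary = summarize(research_methods, total=len(research_methods))
print(summary["Experiment"])  # (3, 60.0)
```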


4. Conducting the Review


This section provides a description of how the review was executed (the steps taken to arrive at the
final selection of primary studies to be interpreted and analyzed) and the circumstances that
present threats to the validity of this review.

4.1. Inclusion and Exclusion of Studies


The review execution followed an automated approach by applying the search strategy discussed
in Section 3.2. In addition to the automated search on the databases, the authors manually checked
the returned results in order to make sure that no significant part of the literature is missing.
References were also crosschecked manually by one of the authors by browsing through the online
databases to ensure that no significant work in this area is overlooked by the automated search.
The overall search resulted in the identification of 615 articles. Of the 615 papers found, 128 were
from the ACM Digital Library, 271 were from the IEEE Xplore Digital Library, 48 were from
ScienceDirect, 122 were from SpringerLink, and 46 were from the Wiley Online Library. 347
articles were rejected when the inclusion criteria had been applied to the title and abstract of the
papers. Another 180 articles were rejected after further assessment of the quality of the papers on
the basis of the quality assessment criteria (explained in Table 22), 8 of which were tagged as
being related work. After the application of the inclusion, exclusion, and quality assessment
criteria, a total of 79 papers remained for data extraction and classification with the intent to be
subjected to analysis for the purpose of addressing the research questions.
As a reference management technique, a free service for managing and discovering scholarly
references called CiteULike (http://www.citeulike.org/) was used to store and manage the studies. A reference management
system like CiteULike helps in the removal of duplicate studies and provides a list of references
for readers [30]. The complete, unfiltered search results that include all 615 primary studies
initially found have been retained for future access in the event that reanalysis is required. The
unfiltered search results can be obtained by visiting http://www.citeulike.org/user/aahnassay.

4.2. Threats to Validity


The main threats to validity of this review are publication and selection bias, and data extraction
and classification. An explanation of each limitation is provided along with the actions taken to
minimize the impact of these threats on this review.

4.2.1. Validation of the review protocol


The review protocol developed for this systematic literature review was created and validated prior
to conducting the review. Several guidelines were consulted including Kitchenham [5], Wright et


al. [30], Biolchini et al. [31], and Petersen [41] to create the review protocol. However, it was
Kitchenham [5] that served as the primary source of guidance. The review protocol was prepared by the
first author, then reviewed by the second author and subsequently by the other two authors
to verify whether the research questions were formulated correctly, whether the search strings were
aligned with the research questions, and whether the data to be extracted would be adequate to
address the research questions. Based on feedback provided in the discussion of the research
protocol, improvements were made to the research questions, search strategy, study selection
criteria, quality assessment checklist, and data extraction strategy.

4.2.2. Validation of publication and primary study selection


Systematic reviews continue to be subjected to publication bias because researchers are more
likely to publish studies that produce positive results and refrain from publishing studies that
produce negative results [5][30]. This type of bias has the potential of distorting the results of this
review. Kitchenham [5] recommends performing manual searches in addition to automated
searches to include reference lists from relevant primary studies and articles, company journals
(e.g., the IBM Journal of Research), grey literature (e.g., technical reports and works in progress),
conference proceedings, and the Internet. To address this, besides the automated search on the
databases, the authors manually checked the results of the automated search in order to make sure
that no significant part of the literature is missing. References were also cross-checked manually
by one of the authors by browsing through the online databases to ensure that no significant work
in this area is overlooked by the automated search. Therefore, we believe publication bias was kept
to a minimum in our work. Moreover, any remaining effects on the results are considered to be
insignificant because the combined use of digital libraries gives access to quality scientific and
technical information found in over 3 million articles across over 2,300 journals.
The inclusion and exclusion criteria of this review include constraints that also produce a threat of
selection bias, such as selecting only studies written in the English language. Although this
restriction introduces some bias, it has minimal impact on the conclusions [30].
In order to validate the inclusion and exclusion criteria, a random set of five studies were reviewed
based on the inclusion and exclusion criteria. The results were carefully analyzed and validated by
all researchers involved in the study. All 615 studies were subjected to the selection process. Three
of the authors were involved in the selection process and 79 studies were deemed acceptable and
tagged as selected using the reference management system mentioned in Section 4.1. The
remaining studies were either rejected or classified as related work. Reasons for acceptance and
rejection were noted on all studies.

4.2.3. Validation of data extraction criteria and classification


Vague explanations of the data extraction criteria and problems with misclassification are examples
of threats to the validity of the results of this review. Data extraction criteria should be clearly
defined and relevant to providing the information needed to answer the study's objective.


Consequently, problems with data extraction descriptions will cause problems in classification.
Published research may be of poor quality, meaning the reports are written poorly or ambiguously
and at times fail to include relevant data [30]. This makes it difficult to fit the data extracted from
the papers into the chosen categories of the data extraction properties. Hence, it was necessary to
validate the data extraction properties against credible sources (i.e. experts in the field of empirical
research and software product lines).
The data extraction properties 1 and 2 were sourced directly from the primary studies reviewed in
this study. Each study reported the name and description of their proposed method, technique or
approach differently and the problem areas which their solution targeted were also reported
differently. Therefore, at best the verbatim names and descriptions of methods, techniques or
approaches and the targeted problem area were reported based on the information provided in the
source papers. In this circumstance, the extracted information was discussed between the authors
and the information was verified. Therefore, we had full agreement on names and descriptions of
the methods at the end of data extraction process.
The data extraction properties 3 and 4, which identified Project Timelines and Project Activities,
were sourced from the Wikipedia article for Product Family Engineering. However, knowing that
Wikipedia is not a credible source of information, further investigation was conducted into one of
the article's listed references [36]. Pohl et al.'s [36] book, Software Product Line
Engineering: Foundations, Principles, and Techniques, was used for validation, and the
information in the Wikipedia article was found to be acceptable.
The data extraction property 6 was sourced from [27]. Wohlin et al. [27] provided a discussion on
empirical research methods in web and software engineering. In their discussion they identified
four research methods: experiments, case studies, surveys, and post-mortem analysis. Furthermore,
these research methods are supported by Easterbrook et al. [37] on selection of empirical methods
for software engineering research.
The data extraction properties 7 - 11 were chosen from a consensual perspective after reviewing
the recommendations made in [24][25][26][38]. Kitchenham et
al. [24] produced a set of preliminary guidelines for empirical research in software engineering
which are intended to assist researchers in designing, conducting, and evaluating empirical studies.
Runeson and Host [25] produced guidelines for conducting and reporting case study research in
software engineering which provides empirically derived and evaluated checklists for researchers
and readers of case study research. Jedlitschka and Ciolkowski [26] produced reporting guidelines
for experiments in software engineering which provides guidance on the expected content of
reporting sections of empirical studies. Finally, Wieringa [38] created a unified checklist for
observational and experimental research in software engineering which identifies important
questions to be answered in experimental and case study research design and reports. Interestingly,
Wieringa [38] cites [24][25][26] in his work and examines commonalities between [24][25][26]
and CONSORT [39][40] in order to form his unified checklist.
Data classification could not be performed with complete certainty since the studies under review
did not provide precise answers to the data extraction criteria. Many properties were not described
correctly or not mentioned at all. In these circumstances, Kitchenham [5] and Biolchini et al. [31]
recommend contacting the author of a questionable study to assist in resolving uncertainties and
provide clarity on unknowns. However, Biolchini et al. [31] provide an alternative to contacting
authors, which allows general impressions of subjective evidence to be made by the reviewer.
Due to time constraints, the option to make general impressions of subjective evidence was used.
Again, in this circumstance, the extracted information was discussed between the authors and
agreement on the information was obtained.

5. Results and Analysis


This section provides a discussion and analysis surrounding the results of this systematic literature
review based on the 79 primary studies selected. The discussion is structured based on the research
questions presented in Section 2.2. In the following, the number of the research question related to
each section and/or subsection is given in parentheses and can be traced back to Table 1.

5.1. What Software Product Line methods Exist (RQ1)?


This review has identified 79 unique software product line methods, techniques, and approaches
that have undergone an empirical evaluation and are presented in Table 29 found in Appendix C.
Table 29 provides the method name, if one is specified, along with a description of the method.
The method name and description are described as reported by the author. 20 articles did not
provide a specific or formal name for the reported method being evaluated. In addition, each
method is also classified in terms of the problem area which the method is intended to address, the
project timeline focus, and project activities. Each method is mapped to its associated article using
the article id.

5.1.1. What problem area does the Software Product Line method target
(RQ1.1)?
Unlike the project lifecycles and project activities, the problem areas targeted by the software
product line methods were regularly mentioned in the reviewed articles. Many methods targeted a
variety of problem areas. Table 8 shows that Feature Modeling is represented the most by 21
papers, followed by Commonality and Variability Management (CVM) with 15 papers and
Requirements Management with 14 papers. Visualization is represented the least by only 1 paper,
followed by Process Evaluation and Improvement with 4 papers and Adoption with 5 papers. A
graphical representation of the results is presented in Figure 2. The results for each individually
reviewed paper in regards to RQ1.1 can also be viewed by referring to Table 29 in Appendix C.

Table 8: Targeted Problem Areas


Problem Area                              Code    Number    Percentage
Feature Modeling                          MOD     21        26.58
Commonality and Variability Management    CVM     15        18.99
Requirements Management                   REQ     14        17.72
Evolution and Maintenance                 EAM     13        16.45
Architecture                              ARC     10        12.66
Product Derivation                        PDR     10        12.66
Code Implementation                       COD     8         10.13
Testing                                   TEST    8         10.13
Configuration Management                  CON     7         8.86
Quality Control                           QCL     7         8.86
Adoption                                  ADP     5         6.33
Process Evaluation and Improvement        PRO     4         5.06
Automation                                AUTO    3         3.80
Project Management                        PRM     3         3.80
Visualization                             VIS     1         1.26
Documentation                             DOC     1         1.26

Figure 2: Targeted Problem Areas Evaluated

5.1.2. What is the project lifecycle focus of the Software Product Line
method (RQ1.2)?
The project lifecycle focus targeted by the software product line methods was rarely mentioned in
the reviewed articles. Out of the 79 articles, 64 did not specifically mention the project lifecycle
which the software product line method targeted. The absence of project lifecycle reporting
made it necessary to generalize the results based on the descriptions of the method reported by the
author in regards to the purpose of the method, the problem area the method addressed, and the
project activities the method targeted. Methods were often found to overlap and focus on more
than one project lifecycle. The main overlap encompassed Domain Engineering and
Product Engineering. Table 9 shows that Domain Engineering is represented the most with 61
papers, followed by Product Engineering with 42 papers, and Product Management with 8 papers.
Product Management is the least represented project lifecycle.


It is worth noting again that, because software product line methods commonly target more than
one project lifecycle, summing the Number column in Table 9 will not equal 79 (the total number
of articles reviewed) and summing the Percentage column will not equal 100%. The Number
and Percentage columns represent the number and percentage of papers out of 79 in which each
project lifecycle is targeted. One study spanned all three project lifecycles and 30 studies spanned
Domain Engineering and Product Engineering. Figure 3 depicts the overlap of papers
covering their targeted project lifecycle phases.
The results for each individually reviewed paper in regards to RQ1.2 can also be viewed by
referring to Table 29 in Appendix C.
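Because each paper contributes to every lifecycle it covers, the counting here is multi-label, which is why the percentages deliberately exceed 100% in total. A small sketch of this counting scheme, with invented study data:

```python
from collections import Counter

# Each study may target several project lifecycles (DE, PE, PM), so every
# study counts once toward each lifecycle it covers; percentages are taken
# out of the total number of studies and do not sum to 100%. Invented data.
studies = [
    {"DE", "PE"},        # spans Domain and Product Engineering
    {"DE"},
    {"DE", "PE", "PM"},  # spans all three lifecycles
    {"PE"},
]
counts = Counter(lc for s in studies for lc in s)
total = len(studies)
percentages = {lc: round(100 * n / total) for lc, n in counts.items()}
print(counts["DE"], percentages["DE"])  # 3 75
```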
Table 9 Targeted Project Lifecycles

Project Lifecycle      Code    Number    Percentage
Not Mentioned                  64        81
Domain Engineering     DE      61        77
Product Engineering    PE      42        53
Product Management     PM      8         10

Figure 3 Project Lifecycle Overlaps

5.1.3. What project activities does the Software Product Line method
target (RQ1.3)?
Similarly to the project lifecycle, the project activities were rarely mentioned in the reviewed
articles. Out of the 79 articles, 67 did not specifically mention the project (lifecycle) activity or
activities that the software product line method targeted. The absence of project (lifecycle) activity
reporting also made it necessary to generalize the results based on the descriptions of the method
reported by the authors. Many methods spanned a variety of project activities, in which case
they were counted towards all the activities that were covered. Table 10 shows that Domain
Requirements Analysis is represented the most with 29 papers, followed by Variable Requirements
with 22 papers. Business Vision Evaluation is represented the least with only 4 papers, followed by
Product Delivery and Support with 5 papers. A graphical representation of the results is presented
in Figure 4.
It is worth noting again that, because software product line methods commonly target more than
one project (lifecycle) activity, summing the Number column in Table 10 will not equal 79
(the total number of articles reviewed) and summing the Percentage column will not equal 100%.
The Number and Percentage columns represent the number and percentage of papers out of 79 in
which each project (lifecycle) activity is targeted.
The results for each individually reviewed paper in regards to RQ1.3 can also be viewed by
referring to Table 29 in Appendix C.
Table 10 Targeted Project Activities
Project Activities              Code    Number    Percentage
Not Mentioned                           67        85
Domain Requirements Analysis    DRA     29        37
Variable Requirements           VR      22        28
Domain Design                   DD      20        25
Product Construction            PC      20        25
Product Design                  PD      18        23
Common Requirements             CR      15        19
Domain Implementation           DI      14        18
Product Requirements            PR      14        18
Product Testing                 PT      9         11
Domain Testing                  DT      8         10
Scope Definition                SD      6         8
Product Delivery and Support    PDS     5         6
Business Vision Evaluation      BE      4         5

Figure 4 Targeted Project Activities

5.1.4. To what degree is the method described (RQ1.4)?


The degree to which the software product line method, technique, or approach is described was
graded based on how well the method, technique, or approach was reported to the reader.
It seems logical that, as a reader, one needs a minimal understanding of the method being
discussed to draw conclusions about the method and why it is being proposed. This section
summarizes and grades the degree to which the method is described in the reviewed articles by
answering the five questions set out in the quality assessment criteria (Table 22). The results for
each individually reviewed paper in regards to RQ1.4 can also be viewed by referring to Table 28
in Appendix B and Table 29 in Appendix C.
Table 11 shows a clear majority of papers identified the purpose of the software product line
method.
Table 11 Purpose of method explained
Q1. Is the purpose of the method explained?    Number    Percentage
Yes                                            78        99
No                                             1         1
Somewhat                                       0         0
Total                                          79        100

Table 12 shows that the clear majority of papers explained the theoretical aspect of the software
product line method that they proposed or employed.
Table 12 Theoretical aspect of method explained
Q2. Is the theoretical side of the method explained?    Number    Percentage
Yes                                                     74        94
No                                                      4         5
Somewhat                                                1         1
Total                                                   79        100

Table 13 shows the number of papers that support the theoretical background of the software
product line method with an explanation of how the method is or can be implemented. It is
reasonable to expect authors to provide the reader with an implementation explanation to support
the theoretical background of their methods so the reader understands how the method would be
built and used. 48 papers, a small majority, provided an explanation of how the method is
implemented.
Table 13 Implementation of method explained
Q3. Is the implementation of the method explained?    Number    Percentage
Yes                                                   48        61
No                                                    28        35
Somewhat                                              3         4
Total                                                 79        100

Table 14 shows that a clear majority of the papers explained, to some degree, the advantages of the
studied software product line method. It stands to reason that someone proposing a solution to a
problem would highlight and assert the advantages of their solution to convince the reader of the
benefits of their approach.
Table 14 Method advantages explained
Q4. Are the advantages of the method explained?    Number    Percentage
Yes                                                50        63
No                                                 22        28
Somewhat                                           7         9
Total                                              79        100

Table 15 shows a low occurrence of papers that report any limitations of their studied software
product line methods. In comparison to Table 14, there were more papers which outlined the
advantages of their proposed software product line method than there were which outlined the
limitations of their method. It would be advisable for authors to identify and provide the reader
with any limitations so that the confines of their proposed software product line methods are
understood in case the reader decides to further investigate the method and its practical
implications.

Table 15 Method limitations explained


Q5. Are the limitations of the method explained?    Number    Percentage
Yes                                                 14        18
No                                                  62        78
Somewhat                                            3         4
Total                                               79        100

Table 28 found in Appendix B provides the scoring of each of the five questions intended to
answer to what degree is the method described for each article reviewed. The distribution in Table
28 suggests that:

Researchers are quite consistent in stating the purpose and theoretical aspects of their
proposed methods, techniques, and approaches.

The overall average among the 79 studies suggests there is no significant issue with describing the
methods, techniques, and approaches, although there seems to be a lack of reporting in terms
of method limitations: either the researchers failed to report the limitations of their proposed
methods, techniques, and approaches, or they simply did not encounter any.

Table 16 categorizes the number of papers based on their quality scores for describing their
methods. Articles [S2, S6, S9, S22, S26, S28, S60, S64, and S72] scored the highest in terms
of describing the evaluated software product line method, technique, or approach.
The results for each individually reviewed paper in regards to RQ1.4 can also be viewed by
referring to Table 28 in Appendix B and Table 29 in Appendix C.
Table 16 Method described score summary
Score      Number    Percentage
4.5-5.0    9         11.39
3.5-4.0    36        45.57
2.5-3.0    20        25.31
1.5-2.0    12        15.19
0.5-1.0    1         1.27
0.0        1         1.27
Total      79        100.00

The distribution in Table 16 suggests that:

Only 9 (11.39%) papers provided responses to all five questions asked.

65 (82.27%) papers managed to describe at least three of the five quality assessment criteria.

For the most part, the authors tried to provide a useful description of their proposed methods,
as indicated by the 65 papers that addressed at minimum three of the five quality assessment
criteria. However, looking back at Table 15, where 62 papers did not address their proposed
methods' limitations, suggests that the results would have been considerably more favorable if
this single criterion had been described properly.
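The half-point score bins in Table 16 are consistent with mapping each Yes/Somewhat/No answer to 1/0.5/0 and summing over the five method-description questions. The exact weighting is our assumption, inferred from the bins rather than stated explicitly in the review; a hedged sketch:

```python
# Assumed scoring scheme: Yes = 1, Somewhat = 0.5, No = 0, summed over the
# five method-description questions (Q1-Q5). These weights are inferred from
# the half-point score bins in Table 16, not stated by the review itself.
WEIGHTS = {"Yes": 1.0, "Somewhat": 0.5, "No": 0.0}

def method_described_score(answers):
    """Total the per-question weights for one study's five answers."""
    return sum(WEIGHTS[a] for a in answers)

# A study answering Yes to Q1-Q4 and Somewhat to Q5:
score = method_described_score(["Yes", "Yes", "Yes", "Yes", "Somewhat"])
print(score)  # 4.5 -- falls in the top bin (4.5-5.0) of Table 16
```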


5.2. What is the state of SPL evaluation (RQ2)?


Determining the state of software product line evaluations takes into account the research method
used, the context in which the evaluation took place, the subjects that took part in the evaluation,
and the scale of the evaluation. These four properties describe the context of the studied evaluations.
The results for RQ2 are further discussed in detail in the next sections.

5.2.1. What research methods are used in the evaluation of SPL method
(RQ2.1)?
The results in Table 17 show that the most common research method used to evaluate software
product line methods, techniques, and approaches is the experiment, represented by 39 papers (49%),
followed closely by case studies, which were represented in 27 papers (34%). In previous works
[S15, S16, S17], case studies seemed to be the most common research method used for conducting
empirical evaluations. In contrast, this review found experiments to be somewhat more popular
than case studies in software product lines.

Table 17 Research methods used


Research Method         Number    Percentage
Experiment              39        49
Case Study              27        34
Post-mortem Analysis    10        13
Survey                  2         3
Action Research         1         1
Total                   79        100

5.2.2. In what context is the SPL method evaluated (RQ2.2)?


The results for context in Table 18 show that a significant number of empirical evaluations were
conducted in academia represented by 46 papers (58%) and the remaining 33 papers (42%)
represented evaluations conducted in an industrial setting.
Table 18 Context
Context     Number    Percentage
Academia    46        58
Industry    33        42
Total       79        100

5.2.3. What types of subjects are used in the evaluation of the SPL
method (RQ2.3)?
The results for subjects in Table 19 show that a substantial number of empirical evaluations used
researchers as subjects, represented by 45 papers (57%); 27 papers (34%) used practitioners, and 7
papers (9%) used students. The studies which used students as subjects typically recruited graduate
students studying at a Master's or Doctorate level.
Table 19 Subjects
Subjects        Number    Percentage
Researcher      45        57
Practitioner    27        34
Student         7         9
Total           79        100

5.2.4. What scale does the evaluation of the SPL method have (RQ2.4)?
The results for scale in Table 20 shows a clear majority of empirical evaluations used industrial
size examples represented by 46 papers (58%), followed by toy-examples in 18 papers (23%), and
down-scaled real world examples 15 papers (19%). Some examples of industrial sized evaluations
included domains such as the telecommunications, automotive, and aviation. These evaluations
focused on full blown systems as opposed to subsections of systems. Open source applications like
Berkley DB and MobileMedia which were categorized as industrial scale evaluations were
typically used in academia since the source files are publicly available. Evaluations that used toyexamples were typically done using prototypes.

Table 20 Scale

| Scale | Number | Percentage |
| Industrial | 46 | 58 |
| Toy-Example | 18 | 23 |
| Down-scaled Real World | 15 | 19 |
| Total | 79 | 100 |

5.2.5. What degree of realism does the evaluation of the SPL method
have (RQ2.5)?
Table 21 presents the number and percentage of papers categorized by their context description. By combining the four context properties of research method, context, subjects, and scale, the degree of realism of each evaluation can be determined.
Table 21 Degree of realism

| Research Method | Context | Subjects | Scale | Number (Percentage) |
| Experiment | Academia | Researcher | Industrial | 13 (16%) |
| Case Study | Industry | Practitioner | Industrial | 12 (15%) |
| Experiment | Academia | Researcher | Toy Example | 11 (14%) |
| Post-mortem Analysis | Industry | Practitioner | Industrial | - |
| Case Study | Academia | Researcher | Industrial | - |
| Experiment | Academia | Researcher | Down-Scaled | - |
| Case Study | Industry | Researcher | Industrial | - |
| Experiment | Academia | Student | Toy Example | - |
| Case Study | Academia | Researcher | Down-Scaled | - |
| Case Study | Academia | Researcher | Toy Example | - |
| Experiment | Industry | Practitioner | Down-Scaled | - |
| Experiment | Industry | Researcher | Down-Scaled | - |
| Post-mortem Analysis | Academia | Researcher | Industrial | - |
| Post-mortem Analysis | Industry | Practitioner | Down-Scaled | - |
| Action Research | Industry | Practitioner | Industrial | - |
| Case Study | Academia | Student | Toy Example | - |
| Experiment | Academia | Student | Down-Scaled | - |
| Experiment | Academia | Student | Industrial | - |
| Experiment | Industry | Practitioner | Industrial | - |
| Experiment | Industry | Practitioner | Toy Example | - |
| Survey | Industry | Practitioner | Industrial | - |
| Survey | Industry | Practitioner | Toy Example | - |
The distribution in Table 21 suggests that:

- The combination of research method, context, subject, and scale that occurs most often is experiments applied to an industrial-sized example by researchers in academia. However, this can be contested by case studies applied to an industrial-sized example by practitioners in industry, due to the minuscule difference of 1% between the two (see Table 21).
- Experiments are used more frequently in academia than in industry. This may suggest that experiments are much more difficult to set up and control when executed in an industrial setting.
- Less than half of the 27 case studies were conducted in a real-world environment using industrial-sized evaluations and practitioners as subjects (see Table 21).
- When a post-mortem analysis is used, it is more commonly used by practitioners in industry. This is evident from the fact that 80% of the evaluations using post-mortem analysis as a research method were carried out by practitioners in industry as opposed to researchers or students in academia.
- Action research and surveys are not popular choices of research method for conducting empirical evaluations.
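The grouping that produces the combinations in Table 21 can be sketched as a small counting computation. A minimal illustration in Python; the study tuples below are hypothetical stand-ins, not the actual 79 reviewed papers:

```python
from collections import Counter

# Each reviewed study is characterized by four context properties:
# (research method, context, subjects, scale). These tuples are
# illustrative only; the real data comes from the reviewed papers.
studies = [
    ("Experiment", "Academia", "Researcher", "Industrial"),
    ("Experiment", "Academia", "Researcher", "Industrial"),
    ("Case Study", "Industry", "Practitioner", "Industrial"),
    ("Experiment", "Academia", "Student", "Toy Example"),
]

# Count how often each combination occurs (the rows of Table 21).
profiles = Counter(studies)

for profile, count in profiles.most_common():
    pct = round(100 * count / len(studies))
    print(profile, count, f"{pct}%")
```

Sorting the counter by frequency reproduces the ordering of the table, with the most realistic or most common profiles at the top.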

Based on the distribution in Table 21, it is reasonable to conclude there is a need for more software product line evaluations with a higher degree of realism. The majority of evaluations were conducted with experiments, and although experimental evaluations ensure high validity, they also lack realism due to a smaller scope of study and a controlled environment [27]. Adopters of new software product line methods, specifically those in industry, want to see results that support the claimed benefits of software product lines as mentioned in the Introduction of this review; for example, will the method reduce development effort, reduce time-to-market, or increase profit margin? Practitioners want to feel comfortable that the new method can be applied in their own environment; therefore, researchers should increase the degree of realism of their evaluations by conducting more evaluations in an industrial context (i.e., an industrial environment, using practitioners and industrial-sized examples). It is further recommended to use case studies to increase the degree of realism, as case studies are suitable for industrial evaluations [27].
Further supporting the information in Table 21, Figure 5 shows that experiments and case studies are the dominant choices of research method among researchers, whereas action research, surveys, and post-mortem analysis are rarely used. Figure 5 also shows that most evaluations are conducted in academia and most evaluations use industrial-sized examples. The distribution in Figure 5 further confirms that experimental evaluations are typically performed in an academic setting whereas case studies are typically performed in an industrial setting. A common acknowledgment shared between [27][37][50] supports the findings that case studies are, by definition, empirical investigations of a phenomenon within a real-world context, while experiments are empirical investigations for testing a theory or hypotheses within a controlled environment (i.e., a classroom). Case studies are further classified in [27] as being very suitable for industrial evaluations because they can avoid scalability issues while sampling from variables representing a typical situation; however, [27] also acknowledges that the results of a case study are difficult to generalize, thus affecting the external validity of the evaluation. Experimental evaluations have higher validity than case studies because of their controlled environments [27][37][50]; however, like case studies, experiments also have difficulty generalizing results [50]. It is also reported in [50] that the software engineering community holds a common agreement that most experiments do not resemble an industrial situation. Therefore, the ideal goal would be to increase the validity of software product line methods, techniques, and approaches by conducting more experiments in an industrial setting.

[Figure: for each research method (Action Research, Case Study, Experiment, Post-mortem Analysis, Survey), the number of evaluations broken down by context (Academia, Industry) and by scale (Industrial, Down-scaled Real World Example, Toy Example).]
Figure 5 State of SPL Evaluations


5.3. To what extent do the reported empirical results enable decision-making regarding the actual adoption of the SPL method (RQ3)?
Referring back to Table 21, the majority of studies performed evaluations in academia, and a minority of studies would be considered static evaluations conducted in industry [32]. Static evaluations are studies that used experiments and surveys as research methods. There were 46 studies found to be academic evaluations, none of which used practitioners as subjects. There were 8 studies classified as static evaluations in industry, 6 of which were carried out using practitioners as subjects and 2 using researchers. The 46 academic studies potentially offer the least support for adoption of their software product line methods.
25 studies would be considered dynamic evaluations conducted in industry [32]. Dynamic evaluations are studies that used action research, case studies, and post-mortem analysis as research methods. Out of the 25 dynamic evaluations, 21 were carried out using practitioners as subjects and 4 used researchers. These 25 studies potentially offer the most support for adoption of their software product line methods. However, when reviewing the quality assessment scores reported in the next section (see Tables 22, 23, and 24), the average score for the 25 dynamic evaluations conducted in industry is 6.06, while the highest possible score is 14. This apparent lack of quality in research design and reporting has the potential to curtail adoption of the proposed software product line methods. The quality assessment scores for each individually reviewed paper in regards to RQ3 can be viewed by referring to Table 30 in Appendix D.
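The static/dynamic classification described above can be written out directly: per [32], experiments and surveys count as static evaluations, while action research, case studies, and post-mortem analysis count as dynamic. A minimal sketch in Python; the function name and output format are illustrative, not taken from the reviewed studies:

```python
# Research-method categories from Table 17, split per [32].
STATIC_METHODS = {"Experiment", "Survey"}
DYNAMIC_METHODS = {"Action Research", "Case Study", "Post-mortem Analysis"}

def evaluation_kind(method: str, context: str) -> str:
    """Classify a study as a static or dynamic evaluation, qualified by context."""
    if method in STATIC_METHODS:
        kind = "static"
    elif method in DYNAMIC_METHODS:
        kind = "dynamic"
    else:
        raise ValueError(f"unknown research method: {method}")
    return f"{kind} ({context.lower()})"

print(evaluation_kind("Case Study", "Industry"))   # dynamic (industry)
print(evaluation_kind("Experiment", "Academia"))   # static (academia)
```

Applying this rule over the 79 studies yields the counts quoted above: 46 academic evaluations, 8 static evaluations in industry, and 25 dynamic evaluations in industry.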

5.3.1. To what degree is the evaluation described (RQ3.1)?


All papers described the research design component of their evaluations (see Table 22). Only 28 out of 79 studies stated the objective of their empirical evaluation by providing a clear problem, hypotheses, or objective/goal statement, and only 13 of those 28 studies supported their objectives with specific research questions. 63 studies gave an indication of the domain in which their evaluation was conducted, whereas 16 studies did not. 40 studies described to some degree the samples and/or instruments used during their evaluations, whereas 39 studies did not. 30 studies took some initiative to describe their data collection and data analysis procedures, whereas 49 studies did not.
Table 22 Research Design Quality Assessment Criteria

| Research Design Quality Assessment Criteria | Yes | No | Somewhat |
| Q1. Does the study describe a clear problem, hypotheses or objective/goal statement? | 28 (35%) | 51 (65%) | 0 (0%) |
| Q2. Does the study provide research questions that will support or reject the problem, hypotheses or objective statement? | 13 (16%) | 66 (84%) | 0 (0%) |
| Q3. Does the study describe the domain which the evaluation was conducted in? | 61 (77%) | 16 (20%) | 2 (3%) |
| Q4. Does the study describe samples and instruments used to conduct the evaluation? | 31 (39%) | 39 (49%) | 9 (12%) |
| Q5. Does the study describe its data collection procedures? | 23 (29%) | 49 (62%) | 7 (9%) |
| Q6. Does the study describe its data analysis procedures? | 23 (29%) | 49 (62%) | 7 (9%) |


All papers described the research analysis component of their evaluations (see Table 23). The purpose of presenting a results analysis is for researchers to clearly explain how they arrived at their interpretation of the evaluation results, which should include an overview of the results, threats to validity, assumptions/inferences, and lessons learned [26].
Most studies included an interpretation or analysis of the evaluation results (see Table 23). 75 (95%) studies provided an interpretation or analysis of the results; only 4 studies did not. Furthermore, only 17 (22%) studies described their results relative to the problem, hypotheses, or objective/goal statement. Considering that only 28 (35%) studies stated a clear problem, hypotheses, or objective/goal statement, 61% of these studies discussed their results in relation to their problem, hypotheses, or objective/goal statement.
Only 5 (6%) studies communicated the assumptions made during their evaluations (see Table 23). An assumption corresponds to a decision made under insufficient or unavailable information, vague requirements, previous experience, lack of confidence or knowledge, or any situation in which a decision must be made so that something will work or progress can be made [33]. These decisions are rarely recorded, communicated, or reviewed [33]. As a result, the failure to communicate these decisions may lead to defects or post-deployment failures, or the decisions may become invalid due to changes in the environment in which the method is deployed [33].
Similar to the results found in [34], a considerably high rate of studies failed to discuss any threats to the validity of their research (see Table 23). Only 15 (19%) studies discussed threats to validity, in comparison to 64 (81%) studies that did not. Researchers should know that there are threats to the validity of any research study, and these should be examined before and during the study design phase and throughout the execution of the study [34]. The validity of a study denotes the trustworthiness of its results [25]; therefore, researchers have a responsibility to discuss the limitations of their study [24].
Describing what went well and what did not during the course of the evaluation provides a discussion of lessons learned [26]. By discussing the successes and failures encountered throughout the course of the evaluation, the researcher conveys knowledge that can be usefully applied in future research. The fact that 15 (19%) of the studies discussed lessons learned whereas 64 (81%) did not suggests that insufficient knowledge is being transferred for future research (see Table 23).
Table 23 Research Analysis Quality Assessment Criteria

| Results Analysis Quality Assessment Criteria | Yes | No | Somewhat |
| Q7. Does the study provide an interpretation or analysis of the results? | 60 (76%) | 4 (5%) | 15 (19%) |
| Q8. Does the study describe its results relative to its problem, hypotheses or objective statement? | 17 (22%) | 62 (78%) | 0 (0%) |
| Q9. Does the study describe assumptions made during the execution of the evaluation? | 5 (6%) | 74 (94%) | 0 (0%) |
| Q10. Does the study discuss threats to validity? | 15 (19%) | 64 (81%) | 0 (0%) |
| Q11. Does the study discuss lessons learned? | 15 (19%) | 64 (81%) | 0 (0%) |

Researchers should place their findings into a context of implications by forming theories that constitute a framework for their analysis [25]. This is where researchers discuss practical impacts, relate their work and results to earlier research, and give footing towards future work [26]. All papers described the research conclusions component of their evaluations (see Table 24).
When considering the adoption of a new method, technique, or approach, researchers and practitioners alike tend to ask questions specific to practical implications; for example, how will the method affect cost, time, and quality for developing and delivering the product, and will that effect be positive or negative [26]? The practical implications are earmarked as potential benefits for implementation in practice. As it is common for researchers to report positive findings rather than negative ones [5], it is not surprising that 58 (73%) studies discussed, to some degree, the practical implications of their work, whereas 21 (27%) of the studies did not (see Table 24).
Clarifying how the researchers' work relates to existing work is stipulated as important in published guidelines [25][26]. Researchers and practitioners alike require the ability to review alternative approaches and relations between evaluations [26]. 64 (81%) studies facilitated access to some alternative research and 15 (19%) studies did not (see Table 24). The studies that did not report related work may have done so because there may not be any reported alternatives to their work.
Describing what other research can be conducted places emphasis on future work for the purpose of investigating the results further or evolving the body of knowledge [26]. Providing recommendations for future work paves a direction of what is next for the studied topic. 54 (68%) of the studies provided some kind of recommendation for future work, 34 (43%) of which were clearly detailed while 20 (25%) provided a somewhat vague discussion; 25 (32%) studies did not recommend any further direction for their studied topic (see Table 24).
Table 24 Research Conclusions Quality Assessment Criteria

| Research Conclusions Quality Assessment Criteria | Yes | No | Somewhat |
| Q12. Does the study discuss the practical implications? | 45 (57%) | 21 (27%) | 13 (16%) |
| Q13. Does the study discuss related work? | 58 (73%) | 15 (19%) | 6 (8%) |
| Q14. Does the study provide recommendations for future work? | 34 (43%) | 25 (32%) | 20 (25%) |

According to the quality assessment criteria described in Section 3.4, Table 30 in Appendix D represents the score card on each of the criteria used for the quality assessment of each of the 79 studies included in this review. If the response given for a question is "Yes" it is given a score of 1.0, "Somewhat" is assigned a score of 0.5, and "No" is scored as 0.0. The maximum score a study can receive is 14. An evaluation is deemed of better quality based on a higher score. Table 30 is sorted from highest to lowest scores. It is interesting to note that only one study achieved the highest possible score.
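The scoring rule can be made concrete: each of the 14 criteria contributes 1.0 for "Yes", 0.5 for "Somewhat", and 0.0 for "No", and a study's score is the sum. A minimal sketch in Python; the example score card is hypothetical, not taken from Table 30:

```python
# Score assigned to each possible response on a quality criterion.
SCORES = {"Yes": 1.0, "Somewhat": 0.5, "No": 0.0}

def quality_score(responses):
    """Sum the per-criterion scores; 14 responses give a maximum of 14.0."""
    return sum(SCORES[r] for r in responses)

# Hypothetical score card for one study (responses to Q1-Q14).
responses = ["Yes", "No", "Yes", "Somewhat", "No", "No", "Yes",
             "No", "No", "No", "No", "Yes", "Yes", "Somewhat"]
print(quality_score(responses))  # 6.0 out of a possible 14.0
```

Averaging such scores over all 79 studies produces the overall mean of 5.92 reported below.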
The distribution in Table 30 suggests that:


- With the highest possible score being 14, the overall average of the 79 studies is 5.92. This lower-than-mid-range average indicates there are issues pertaining to the quality of research design and research reporting that need to be addressed in the software engineering community.
- The three highest-ranking criteria addressed in research evaluations are domain description, interpretation of results, and related work.

6. Discussion and Recommendations


This section provides a discussion on the results and recommendations to improve the state of
evaluations in software product line research.

6.1. Discussion
The findings from this systematic literature review reveal that software product line research has produced a diverse set of methods, techniques, and approaches. As previously mentioned, 79 different software product line methods, techniques, and approaches were identified, targeting problem areas like feature modelling, requirements management, and commonality and variability management. Despite this diversity, the 79 studies represented only 13% of the total number of studies this review initially started with. This may suggest that empirical evaluations using rigorous research methods are not being conducted as often as they should be, or that they are not being reported and published.
A positive finding of this review is that, across the diversity of methods, techniques, and approaches, all phases of the software product line life cycle were covered, with Domain Engineering being the most dominant. Furthermore, all software product line life cycle activities were also covered, with Domain Requirements Analysis (Common and Variable Requirements included) being the most dominant. Lastly, with no surprise, the three highest-ranking target problem areas were feature modelling, requirements management, and commonality and variability management. This shows that Domain Requirements seems to be one of the main targeted research areas in the software product line community. Kuloor and Eberlein [35] state that requirements engineering, although often an early-stage development activity, influences all phases of the software product life cycle. They go further to say that changes to requirements are likely to happen during any stage of development; therefore, a sound requirements engineering approach must be able to accommodate changes and the evolution of requirements [35].
This systematic literature review identifies a need for more realistic empirical evaluations to be conducted in industry, as the results revealed a majority of evaluations conducted in academia; however, when considering the full context of research method used, research context, subjects, and scale of evaluation, there seemed to be a stand-off in the state of evaluations between experiments conducted by researchers in academia on industrial-sized examples and case studies conducted by practitioners in industry using industrial-sized examples. The significance of this finding is that the majority of the evaluations scaled to industrial-sized applications. This is a positive finding because the software product line methods, techniques, and approaches being evaluated are more likely to be applicable in other environments. Conversely, the degree of realism of the experiments conducted by researchers in academia, despite using industrial-sized examples, is quite low; nonetheless, this is offset by the high degree of realism found with the case studies conducted by practitioners in industry using industrial-sized examples.
One of the major findings of this study is that the quality of research design and reporting is significantly low. Research design and reporting were scored against a 14-point quality assessment checklist. The overall average quality score among the reviewed evaluations was 5.92, with only 30% of the studies scoring 50% or higher and only 1 study scoring 100%. The issue of quality in research design and reporting remains a problem when reflecting back on the related works [10][20]. This is not positive, as the available evidence for the reviewed software product line methods, techniques, and approaches can hardly be considered reliable.
Empirical evaluation and assessment are expected to play a vital role in the rigorous evaluation of software product line methods, techniques, and approaches [10]. The synthesis of the available evidence shows that only one study [9] achieved the highest possible score for sound research design and reporting.
The main findings of this systematic literature review may have several implications for software
product line researchers and practitioners such as knowing what software product line methods,
techniques and approaches exist and areas where more empirical research is needed in the software
product line community. Another major implication originates from the continual lack of quality in
research design and reporting. This should further encourage the software product line community
to improve the state of evaluating research and reporting the outcomes such that the evidence can
be accepted as reliable and trustworthy.

6.2. Recommendations
With the exception of the diverse coverage of topic areas being researched and the positive aspects of using industrial-scaled evaluations, there are still areas of improvement that need to be addressed regarding the state of software product line evaluations and research reporting. The following recommendations are provided to help improve the state of software product line evaluations:

- Only 79 studies, representing 13% of the total number of studies found, conducted an empirical evaluation of a software product line method, technique, or approach. This suggests that much research is not rigorous in its methods and may only be providing theoretical evaluations as opposed to empirical evaluations. Therefore, more empirical evaluations need to be adopted in the software product line community among researchers and practitioners alike.

- A majority of evaluations were conducted using experiments in academia, which limits realism. It is reasonable to conclude there is a need for more software product line evaluations with a higher degree of realism. Adopters of new software product line methods, specifically those in industry, want to see results that support the claimed benefits of software product lines. Therefore, it is recommended that researchers increase the degree of realism of their evaluations by conducting more evaluations in an industrial context (i.e., an industrial environment, using practitioners and industrial-sized examples) using case studies. This will increase the degree of realism and improve the overall state of their evaluations.

- Finally, to improve on the lack of quality reporting, it is recommended that researchers adopt a set of reporting guidelines similar to "Preliminary guidelines for empirical research in software engineering" [24], "Guidelines for conducting and reporting case study research in software engineering" [25], "Reporting experiments in software engineering" [26], and "A Unified Checklist for Observational and Experimental Research in Software Engineering" [38]. Adopting reporting guidelines and using a checklist will ensure all reporting requirements are met and will increase the reliability and understandability of the research being reported.

7. Related Works
Systematic literature review based research is a fairly new research method that is becoming standard amongst software engineers. This section seeks to highlight similar works conducted in the area of software product lines. One way to find out how many software product line systematic literature reviews have been conducted in the past is to find tertiary studies (systematic literature reviews of systematic literature reviews) that have summarized past systematic literature reviews. One such tertiary study was conducted by Kitchenham et al. [6]. Their objective was to provide a catalogue of systematic literature reviews in software
engineering available to software engineering researchers and practitioners. They sought to find
out how many systematic literature reviews were published between January 1, 2004 and June 30,
2008, what research topics were addressed, and who is most active in systematic literature review
based research. Their results identified 53 software engineering systematic literature reviews
published between January 1, 2004 and June 30, 2008. Unfortunately, none of the 53 secondary
studies identified were reported as having a software product line research topic.
Silva et al. [7] added their contribution to Kitchenham et al.'s [6] tertiary study by conducting an extended tertiary study with the same objective. Silva et al.'s [7] tertiary study identified 67 systematic literature reviews for software engineering published between July 1, 2008 and
December 31, 2009 covering 24 software engineering topic areas. Out of the 67 secondary studies
identified, 7 secondary studies [43][44][45][46][47][48][49] were reported as having a software
product line research topic.
A more recent empirical investigation of systematic literature reviews in software engineering
conducted by Babar and Zhang [8] found 142 secondary studies published between January 1,
2004 and December 31, 2010 covering over 30 software engineering topics. Out of the 142
secondary studies identified, only 5 secondary studies were reported as having a software product
line research topic. Citations for the 5 secondary studies are not available for reporting because
they were not provided as references by Babar and Zhang [8]. Table 25 summarizes the coverage
of software product line systematic literature reviews found among the three tertiary studies
mentioned.


Table 25 Number of SPL SLRs found among SE SLRs

| Ref Id | Year of Article | Date Range of Studies | Topic Area | Article Type | # of Secondary Studies found | # of SPL Secondary Studies found | SPL Studies |
| [6] | 2010 | Jan/04 - Jun/08 | Software Engineering | Tertiary Study | 53 | 0 | N/A |
| [7] | 2011 | Jul/08 - Dec/09 | Software Engineering | Tertiary Study | 67 | 7 | [45, 46, 47, 48, 49, 50, 51] |
| [8] | 2011 | Jan/04 - Dec/10 | Software Engineering | Empirical Investigation with a tertiary study component | 142 | 5 | N/A |

Table 25 shows that software product line systematic literature reviews did not emerge until after June 2008, even though systematic literature reviews in the software engineering community had already begun much earlier.
reviews only make up a small fraction of research topics among software engineering systematic
literature reviews conducted up until the end of the year 2010.
Since the tertiary studies identified in Table 25 included software engineering systematic literature
reviews published between January 2004 and December 2010, an automated search was conducted
as part of this review to fill in the gap for the year 2011. The search strategy was not restricted to a
date range and used the population and intervention search criteria in Table 26 against the five
digital libraries: ACM Digital Library, IEEE Xplore Digital Library, ScienceDirect, SpringerLink,
and Wiley Online Library. Similar to the search strategy outlined in Section 3.2 Data Sources and Search Strategy, the process of identifying relevant papers that undertake a systematic literature review of software product line methods, techniques, or approaches needed the capability
to recover research articles and papers made available through scientific publication databases.
Specific publication databases were selected on the basis that they included research papers in the
area of computing and information technology sciences.
Table 26 Search terms used to find software product line systematic literature reviews

| Population | | Intervention |
| software product line OR software product family* | AND | systematic literature review OR systematic review OR literature review OR SLR OR systematic mapping |
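The population and intervention terms combine into a single boolean query string. The sketch below in Python shows only the logical structure of that combination; the exact query syntax accepted by each digital library differs:

```python
# Search terms from Table 26: population and intervention groups.
population = ['"software product line"', '"software product family*"']
intervention = ['"systematic literature review"', '"systematic review"',
                '"literature review"', '"SLR"', '"systematic mapping"']

def or_group(terms):
    # Join alternatives with OR and parenthesize the whole group.
    return "(" + " OR ".join(terms) + ")"

# Population AND Intervention, each as an OR-group of alternatives.
query = or_group(population) + " AND " + or_group(intervention)
print(query)
```

Each digital library would then receive this query, adapted to its own advanced-search syntax.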

The search found 14 unique software product line systematic literature reviews, 8 of which were published in 2011. The remaining 6 articles were published between 2008 and 2010, suggesting the search yielded similar results to [6][7][8]. Table 27 summarizes the software product line systematic literature reviews found.
Table 27 Software product line systematic literature reviews

| Ref Id | Year of Article | Date Range of Studies | Article Title | No. Secondary Studies |
| [9] | 2011 | Not mentioned | Adopting software product lines: A systematic mapping study | 34 |
| [10] | 2011 | 1990-2007 | A systematic review of evaluation of variability management approaches in software product lines | 97 |
| [11] | 2011 | 2009-2010 | Agile software product lines: a systematic mapping study | 32 |
| [12] | 2011 | Up to June 2010 | Agile product line engineering - a systematic literature review | 39 |
| [13] | 2011 | 2001-2008 | Software product line testing - A systematic mapping study | 64 |
| [14] | 2011 | 1996 to June 2010 | A systematic review of quality attributes and measures for software product lines | 35 |
| [15] | 2011 | 1993-2009 | A systematic mapping study of software product lines testing | 120 |
| [16] | 2011 | 2002 to June 2010 | A preliminary mapping study of approaches bridging software product lines and service-oriented architectures | 48 |
| [17] | 2010 | 1990-2009 | Requirements engineering for software product lines: A systematic literature review | 49 |
| [18] | 2009 | Jul. 2008 - Oct. 2008 | A Systematic Review of Software Product Lines Applied to Mobile Middleware | 7 |
| [19] | 2009 | Dec. 2007 to Jan. 2008 | Variability management in software product lines: a systematic review | 33 |
| [20] | 2009 | 2000-2007 | A systematic review of domain analysis solutions for product lines | 89 |
| [21] | 2009 | 1999-2009 | Gathering current knowledge about quality evaluation in software product lines | 39 |
| [22] | 2008 | 1998-2007 | Evaluating Domain Design Approaches Using Systematic Review | 17 |

It is not the intention of this systematic literature review to report the research objectives, results, and conclusions of all the related works in Table 27; however, an overview of articles [10][20] is provided for consideration and comparison when studying the results reported in this review. These two related works were selected for overview because, like this review, they sought to summarize evidence of software product line methods, techniques, and approaches subjected to an empirical evaluation. One difference between this review and [10][20] is that this review has a much broader scope of software product line methods, techniques, and approaches, whereas [10] only focused on variability management approaches and [20] only focused on domain analysis solutions. If more information is desired for any of the listed related works, please refer to the article's bibliography in the References section of this paper.
Chen and Babar [10] sought to review the status of evaluation of reported variability management
approaches and to synthesize the available evidence about the effects of the reported approaches.
They found a small number of studies that used rigorous scientific methods to evaluate the
reviewed approaches. Their investigation revealed significant quality deficiencies in the reporting
of evaluations. They also found that all studies, except one, reported only positive effects. They
concluded that a large majority of reported variability management approaches have not been sufficiently evaluated using scientifically rigorous methods; as such, they recommended that more rigorous evaluations of variability management approaches be conducted. They also noted that the available evidence they reviewed was sparse and of low quality but consistent across different studies, meaning the proposed approaches may be beneficial when applied properly in appropriate situations.
Khurum and Gorschek [20] wanted to analyze the level of industrial application and/or empirical validation of domain analysis solutions for software product lines and the extent to which the solutions were evaluated based on usability and usefulness. They found it difficult to evaluate the usability and usefulness of the proposed solutions for industry adoption because, like [10], many of the studies they reviewed lacked qualitative and quantitative results from empirical validations. They concluded by asserting that evidence gathered through empirical evaluations, supported by an explanation of the design and execution of the evaluation, could be beneficial for researchers to expand on the work and for practitioners to adopt the proposed solutions.
Both related works [10][20] covered literature published up until 2007. It is worth noting that both reported a lack of rigorous empirical validations coupled with low quality reporting. Because this review covers more recent literature published between 2006 and 2011, one point of interest is whether the empirical evaluations of proposed software product line methods, techniques, and approaches, and the quality of reporting produced by the evaluators, have improved.
Finally, we would like to comment on the scope of this systematic literature review. Given the importance of empirical support for approaches proposed in the software engineering community, we believe that such a systematic literature review is an important step towards finding gaps and issues with regards to identifying and reporting evidence of applicability. Dybå et al. [51] have performed a systematic literature review with a similar scope for the area of agile software development.

8. Conclusion and Future Work


This systematic literature review sought to determine how much real and extensive empirical evaluation has been undertaken in the field of software product lines with reference to development methods, techniques, and approaches. In order for this review to study relevant and current software product line methods, techniques, and approaches, research articles and papers published between 2006 and 2011 were identified through online scientific publication databases. The initial search found 615 primary studies; however, after applying carefully chosen inclusion and exclusion criteria, only 79 primary studies were selected for review on the premise that they performed an empirical evaluation of a software product line method, technique, and/or approach. An extensive set of properties and quality assessment criteria were used to extract the relevant information from each study. The extracted data were used to determine what software product line methods, techniques, or approaches exist today, what the current state of evaluations is, and to what extent the software product line methods, techniques, or approaches are deemed adoptable.
The results revealed a lack of proper design in the large majority of the evaluations of software product line methods, techniques, and approaches, which makes it difficult to rely on the available evidence. The other significant problem was the quality of reporting in empirical studies of software product line methods and approaches. The findings of this review further reinforce the need for improvement in the quality of research design and reporting. One positive outcome of this review was that most evaluations are being conducted with industrial-sized applications as opposed to toy examples. Recommendations for future work call for more thorough search strategies for finding primary studies, and for periodic systematic literature reviews to track the progress of researchers in addressing the quality issues found in this review.


APPENDIX A: Studies included in the review


[S1]. Acher, M., Collet, P., Lahire, P., Moisan, S., and Rigault, J. P. (2011). Modeling variability from requirements to runtime. In Engineering of Complex Computer Systems (ICECCS), 2011 16th IEEE International Conference on, pages 77–86.

[S2]. Ahmed, F. and Capretz, L. (2011a). An architecture process maturity model of software product line engineering. Innovations in Systems and Software Engineering, 7:191–207.

[S3]. Ahmed, F. and Capretz, L. (2011b). A business maturity model of software product line engineering. Information Systems Frontiers, 13:543–560.

[S4]. Aldekoa, G., Trujillo, S., Sagardui, G., and Diaz, O. (2008). Quantifying maintainability in feature oriented product lines. In Software Maintenance and Reengineering, 2008. CSMR 2008. 12th European Conference on, pages 243–247.

[S5]. Altintas, N., Cetin, S., Dogru, A., and Oguztuzun, H. (2011). Modeling product line software assets using Domain-Specific kits. Software Engineering, IEEE Transactions on, PP(99):1.

[S6]. Andrade, R., Ribeiro, M., Gasiunas, V., Satabin, L., Rebelo, H., and Borba, P. (2011). Assessing idioms for implementing features with flexible binding times. In Software Maintenance and Reengineering (CSMR), 2011 15th European Conference on, pages 231–240, Washington, DC, USA. IEEE Computer Society. http://dx.doi.org/10.1109/CSMR.2011.29

[S7]. Assuncao, W. K. G., de Freitas Guilhermino Trindade, D., Colanzi, T. E., and Vergilio, S. R. (2011). Evaluating test reuse of a software product line oriented strategy. In Test Workshop (LATW), 2011 12th Latin American, pages 1–6.

[S8]. Bagheri, E., Asadi, M., Gasevic, D., and Soltani, S. (2010). Stratified analytic hierarchy process: Prioritization and selection of software features. In Bosch, J. and Lee, J., editors, Software Product Lines: Going Beyond, volume 6287 of Lecture Notes in Computer Science, pages 300–315. Springer Berlin / Heidelberg.

[S9]. Bagheri, E. and Gasevic, D. (2011). Assessing the maintainability of software product line feature models using structural metrics. Software Quality Journal, 19:579–612.

[S10]. Bartholdt, J., Oberhauser, R., and Rytina, A. (2008). An approach to addressing entity model variability within software product lines. In Software Engineering Advances, 2008. ICSEA '08. The Third International Conference on, pages 465–471.

[S11]. Bonifácio, R. and Borba, P. (2009). Modeling scenario variability as crosscutting mechanisms. In Proceedings of the 8th ACM International Conference on Aspect-Oriented Software Development, AOSD '09, pages 125–136, New York, NY, USA. ACM.

[S12]. Cabral, I., Cohen, M., and Rothermel, G. (2010). Improving the testing and testability of software product lines. In Bosch, J. and Lee, J., editors, Software Product Lines: Going Beyond, volume 6287 of Lecture Notes in Computer Science, pages 241–255. Springer Berlin / Heidelberg.

[S13]. Cavalcanti, R., de Almeida, E. S., and Meira, S. R. L. (2011). Extending the RiPLE-DE process with quality attribute variability realization. In Proceedings of the Joint ACM SIGSOFT Conference on Quality of Software Architectures and Architecting Critical Systems, QoSA-ISARCS '11, pages 159–164, New York, NY, USA. ACM.

[S14]. Chakravarthy, V., Regehr, J., and Eide, E. (2008). Edicts: implementing features with flexible binding times. In Proceedings of the 7th International Conference on Aspect-Oriented Software Development, AOSD '08, pages 108–119, New York, NY, USA. ACM.

[S15]. Chastek, G., Donohoe, P., and McGregor, J. D. (2011). Commonality and variability analysis for resource constrained organizations. In Proceedings of the 2nd International Workshop on Product Line Approaches in Software Engineering, PLEASE '11, pages 31–34, New York, NY, USA. ACM.

[S16]. Chen, S. and Erwig, M. (2011). Optimizing the product derivation process. In Software Product Line Conference (SPLC), 2011 15th International, pages 35–44.

[S17]. Chen, Y., Gannod, G. C., and Collofello, J. S. (2006). A software product line process simulator. Software Process: Improvement and Practice, 11(4):385–409.

[S18]. Del Rosso, C. (2008). Software performance tuning of software product family architectures: Two case studies in the real-time embedded systems domain. Journal of Systems and Software, 81(1):1–19.

[S19]. Cirilo, E., Nunes, I., Kulesza, U., and Lucena, C. (2012). Automating the product derivation process of multi-agent systems product lines. Journal of Systems and Software, 85(2):258–276.

[S20]. Classen, A., Heymans, P., Schobbens, P.-Y., and Legay, A. (2011). Symbolic model checking of software product lines. In Proceedings of the 33rd International Conference on Software Engineering, ICSE '11, pages 321–330, New York, NY, USA. ACM.

[S21]. Dabholkar, A. and Gokhale, A. (2010). Middleware specialization for product-lines using feature-oriented reverse engineering. In Information Technology: New Generations (ITNG), 2010 Seventh International Conference on, pages 696–701.

[S22]. Deelstra, S., Sinnema, M., and Bosch, J. (2009). Variability assessment in software product families. Information and Software Technology, 51(1):195–218.

[S23]. Dhungana, D., Grünbacher, P., Rabiser, R., and Neumayer, T. (2010). Structuring the modeling space and supporting evolution in software product line engineering. Journal of Systems and Software, 83(7):1108–1122.

[S24]. Dong, J. S., Lee, K., Kim, K. H., Kim, S. T., Cho, J. M., and Kim, T. H. (2008). Platform maintenance process for software quality assurance in product line. In Computer Science and Software Engineering, 2008 International Conference on, volume 2, pages 325–331.

[S25]. Eriksson, M., Börstler, J., and Borg, K. (2009). Managing requirements specifications for product lines – an approach and industry case study. Journal of Systems and Software, 82(3):435–447.

[S26]. Feigenspan, J., Schulze, M., Papendieck, M., Kästner, C., Dachselt, R., Köppen, V., and Frisch, M. (2011). Using background colors to support program comprehension in software product lines. In Evaluation and Assessment in Software Engineering (EASE 2011), 15th Annual Conference on, pages 66–75.

[S27]. Ganesan, D., Lindvall, M., McComas, D., Bartholomew, M., Slegel, S., and Medina, B. (2010). Architecture-based unit testing of the flight software product line. In Bosch, J. and Lee, J., editors, Software Product Lines: Going Beyond, volume 6287 of Lecture Notes in Computer Science, pages 256–270. Springer Berlin / Heidelberg.

[S28]. Ganesan, D., Muthig, D., Knodel, J., and Yoshimura, K. (2006). Discovering organizational aspects from the source code history log during the product line planning phase – a case study. In Reverse Engineering, 2006. WCRE '06. 13th Working Conference on, pages 211–220.

[S29]. Geertsema, B. and Jansen, S. (2010). Increasing software product reusability and variability using active components: a software product line infrastructure. In Proceedings of the Fourth European Conference on Software Architecture: Companion Volume, ECSA '10, pages 336–343, New York, NY, USA. ACM.

[S30]. Guo, J., Wang, Y., Trinidad, P., and Benavides, D. (2011a). Consistency maintenance for evolving feature models. Expert Systems with Applications, (0).

[S31]. Guo, J., White, J., Wang, G., Li, J., and Wang, Y. (2011b). A genetic algorithm for optimized feature selection with resource constraints in software product lines. Journal of Systems and Software, 84(12):2208–2221.

[S32]. Hubaux, A., Boucher, Q., Hartmann, H., Michel, R., and Heymans, P. (2011). Evaluating a textual feature modelling language: Four industrial case studies. In Malloy, B., Staab, S., and van den Brand, M., editors, Software Language Engineering, volume 6563 of Lecture Notes in Computer Science, pages 337–356. Springer Berlin / Heidelberg.

[S33]. Hunt, J. M. (2006). Organizing the asset base for product derivation. In Software Product Line Conference, 2006 10th International, pages 65–74.

[S34]. Jiang, M., Zhang, J., Zhao, H., and Zhou, Y. (2008a). Enhancing software product line maintenance with source code mining. In Li, Y., Huynh, D., Das, S., and Du, D.-Z., editors, Wireless Algorithms, Systems, and Applications, volume 5258 of Lecture Notes in Computer Science, pages 538–547. Springer Berlin / Heidelberg.

[S35]. Jiang, M., Zhang, J., Zhao, H., and Zhou, Y. (2008b). Maintaining software product lines – an industrial practice. In Software Maintenance, 2008. ICSM 2008. IEEE International Conference on, pages 444–447.

[S36]. Kato, S. and Yamaguchi, N. (2011). Variation management for software product lines with cumulative coverage of feature interactions. In Proceedings of the 2011 15th International Software Product Line Conference, SPLC '11, pages 140–149, Washington, DC, USA. IEEE Computer Society. http://dx.doi.org/10.1109/SPLC.2011.51

[S37]. Kim, C., Bodden, E., Batory, D., and Khurshid, S. (2010). Reducing configurations to monitor in a software product line. In Barringer, H., Falcone, Y., Finkbeiner, B., Havelund, K., Lee, I., Pace, G., Rosu, G., Sokolsky, O., and Tillmann, N., editors, Runtime Verification, volume 6418 of Lecture Notes in Computer Science, pages 285–299. Springer Berlin / Heidelberg.

[S38]. Kim, J., Park, S., and Sugumaran, V. (2008a). DRAMA: A framework for domain requirements analysis and modeling architectures in software product lines. Journal of Systems and Software, 81(1):37–55.

[S39]. Kim, K., Kim, H., Kim, S., and Chang, G. (2008b). A case study on SW product line architecture evaluation: Experience in the consumer electronics domain. In Software Engineering Advances, 2008. ICSEA '08. The Third International Conference on, pages 192–197.

[S40]. Kim, T., Ko, I. Y., Kang, S. W., and Lee, D. H. (2008c). Extending ATAM to assess product line architecture. In Computer and Information Technology, 2008. CIT 2008. 8th IEEE International Conference on, pages 790–797.

[S41]. Kuvaja, P., Similä, J., and Hanhela, H. (2011). Software product line adoption: guidelines from a case study. In Proceedings of the Third IFIP TC 2 Central and East European Conference on Software Engineering Techniques, CEE-SET '08, pages 143–157, Berlin, Heidelberg. Springer-Verlag.

[S42]. Lamas, E., Dias, L. A. V., and da Cunha, A. M. (2011). Effectively testing for a software product line with OTM3 organizational testing management maturity model. In Information Technology: New Generations (ITNG), 2011 Eighth International Conference on, pages 274–279.

[S43]. Lee, S.-B., Kim, J.-W., Song, C.-Y., and Baik, D.-K. (2007). An approach to analyzing commonality and variability of features using ontology in a software product line engineering. In Software Engineering Research, Management and Applications, 2007. SERA 2007. 5th ACIS International Conference on, pages 727–734.

[S44]. Loesch, F. and Ploedereder, E. (2007a). Optimization of variability in software product lines. In Software Product Line Conference, 2007. SPLC 2007. 11th International, pages 151–162.

[S45]. Loesch, F. and Ploedereder, E. (2007b). Restructuring variability in software product lines using concept analysis of product configurations. In Software Maintenance and Reengineering, 2007. CSMR '07. 11th European Conference on, pages 159–170.

[S46]. Lorenz, D. and Rosenan, B. (2011). Code reuse with language oriented programming. In Schmid, K., editor, Top Productivity through Software Reuse, volume 6727 of Lecture Notes in Computer Science, pages 167–182. Springer Berlin / Heidelberg.

[S47]. Matos, P., Duarte, R., Cardim, I., and Borba, P. (2007). Using design structure matrices to assess modularity in aspect-oriented software product lines. In Proceedings of the First International Workshop on Assessment of Contemporary Modularization Techniques, ACoM '07, Washington, DC, USA. IEEE Computer Society.

[S48]. Mellado, D., Fernández-Medina, E., and Piattini, M. (2010). Security requirements engineering framework for software product lines. Information and Software Technology, 52(10):1094–1117.

[S49]. Mendonca, M. and Cowan, D. (2010). Decision-making coordination and efficient reasoning techniques for feature-based configuration. Science of Computer Programming, 75(5):311–332.

[S50]. Mu, Y., Wang, Y., and Guo, J. (2009). Extracting software functional requirements from free text documents. In Information and Multimedia Technology, 2009. ICIMT '09. International Conference on, pages 194–198.

[S51]. Niemelä, E. and Immonen, A. (2007). Capturing quality requirements of product family architecture. Information and Software Technology, 49(11-12):1107–1120.

[S52]. Niu, N. and Easterbrook, S. (2009). Concept analysis for product line requirements. In Proceedings of the 8th ACM International Conference on Aspect-Oriented Software Development, AOSD '09, pages 137–148, New York, NY, USA. ACM.

[S53]. Nobrega, J. P., Almeida, E., and Meira, S. R. L. (2008). InCoME: Integrated cost model for product line engineering. In Software Engineering and Advanced Applications, 2008. SEAA '08. 34th Euromicro Conference, pages 27–34.

[S54]. Nonaka, M., Zhu, L., Babar, M., and Staples, M. (2007). Project cost overrun simulation in software product line development. In Münch, J. and Abrahamsson, P., editors, Product-Focused Software Process Improvement, volume 4589 of Lecture Notes in Computer Science, pages 330–344. Springer Berlin / Heidelberg.

[S55]. Oliveira Junior, E. A., Maldonado, J. C., and Gimenes, I. M. S. (2010). Empirical validation of complexity and extensibility metrics for software product line architectures. In Software Components, Architectures and Reuse (SBCARS), 2010 Fourth Brazilian Symposium on, pages 31–40.

[S56]. Olumofin, F. G. and Misic, V. B. (2007). A holistic architecture assessment method for software product lines. Information and Software Technology, 49(4):309–323.

[S57]. Reis, S., Metzger, A., and Pohl, K. (2007). Integration testing in software product line engineering: A model-based technique. In Dwyer, M. and Lopes, A., editors, Fundamental Approaches to Software Engineering, volume 4422 of Lecture Notes in Computer Science, pages 321–335. Springer Berlin / Heidelberg.

[S58]. Riva, C., Selonen, P., Systä, T., and Xu, J. (2011). A profile-based approach for maintaining software architecture: an industrial experience report. Journal of Software Maintenance and Evolution: Research and Practice, 23(1):3–20.

[S59]. Romanovsky, K., Koznov, D., and Minchin, L. (2011). Refactoring the documentation of software product lines. In Huzar, Z., Koci, R., Meyer, B., Walter, B., and Zendulka, J., editors, Software Engineering Techniques, volume 4980 of Lecture Notes in Computer Science, pages 158–170. Springer Berlin / Heidelberg.

[S60]. Rosenmüller, M., Siegmund, N., Apel, S., and Saake, G. (2011). Flexible feature binding in software product lines. Automated Software Engineering, 18:163–197.

[S61]. Sales, R. J. and Coelho, R. (2011). Preserving the exception handling design rules in software product line context: A practical approach. In Dependable Computing Workshops (LADCW), 2011 Fifth Latin-American Symposium on, pages 9–16.

[S62]. Segura, S., Hierons, R. M., Benavides, D., and Ruiz-Cortés, A. (2010). Automated test data generation on the analyses of feature models: A metamorphic testing approach. In Software Testing, Verification and Validation (ICST), 2010 Third International Conference on, pages 35–44.

[S63]. Sellier, D., Mannion, M., and Mansell, J. (2008). Managing requirements inter-dependency for software product line derivation. Requirements Engineering, 13:299–313.

[S64]. Siegmund, N., Rosenmüller, M., Kästner, C., Giarrusso, P. G., Apel, S., and Kolesnikov, S. S. (2011). Scalable prediction of non-functional properties in software product lines. In Software Product Line Conference (SPLC), 2011 15th International, pages 160–169.

[S65]. Siegmund, N., Rosenmüller, M., Kuhlemann, M., Kästner, C., Apel, S., and Saake, G. SPL Conqueror: Toward optimization of non-functional properties in software product lines. Software Quality Journal, pages 1–31.

[S66]. Stoiber, R. and Glinz, M. (2010). Feature unweaving: Efficient variability extraction and specification for emerging software product lines. In Software Product Management (IWSPM), 2010 Fourth International Workshop on, pages 53–62.

[S67]. Stricker, V., Metzger, A., and Pohl, K. (2010). Avoiding redundant testing in application engineering. In Bosch, J. and Lee, J., editors, Software Product Lines: Going Beyond, volume 6287 of Lecture Notes in Computer Science, pages 226–240. Springer Berlin / Heidelberg.

[S68]. Teixeira, L., Borba, P., and Gheyi, R. (2011). Safe composition of configuration knowledge-based software product lines. In Software Engineering (SBES), 2011 25th Brazilian Symposium on, pages 263–272.

[S69]. Tian, Q. M., Chen, X. Y., Jin, L., Pan, P., and Ying, C. (2008). Asset-based requirement analysis in telecom service delivery platform domain. In Network Operations and Management Symposium, 2008. NOMS 2008. IEEE, pages 815–818.

[S70]. Uno, K., Hayashi, S., and Saeki, M. (2009). Constructing feature models using goal-oriented analysis. In Quality Software, 2009. QSIC '09. 9th International Conference on, pages 412–417.

[S71]. Uzuncaova, E., Khurshid, S., and Batory, D. (2010). Incremental test generation for software product lines. Software Engineering, IEEE Transactions on, 36(3):309–322.

[S72]. Villela, K., Dörr, J., and John, I. (2010). Evaluation of a method for proactively managing the evolving scope of a software product line. In Wieringa, R. and Persson, A., editors, Requirements Engineering: Foundation for Software Quality, volume 6182 of Lecture Notes in Computer Science, pages 113–127. Springer Berlin / Heidelberg.

[S73]. Weiss, D. M., Li, J. J., Slye, H., Dinh-Trong, T., and Sun, H. (2008). Decision-model-based code generation for SPLE. In Software Product Line Conference, 2008. SPLC '08. 12th International, pages 129–138.

[S74]. White, J., Benavides, D., Schmidt, D. C., Trinidad, P., Dougherty, B., and Ruiz-Cortes, A. (2010). Automated diagnosis of feature model configurations. Journal of Systems and Software, 83(7):1094–1107.

[S75]. White, J., Dougherty, B., Schmidt, D. C., and Benavides, D. (2009). Automated reasoning for multi-step feature model configuration problems. In Proceedings of the 13th International Software Product Line Conference, SPLC '09, pages 11–20, Pittsburgh, PA, USA. Carnegie Mellon University.

[S76]. White, J., Schmidt, D. C., Benavides, D., Trinidad, P., and Ruiz-Cortes, A. (2008). Automated diagnosis of product-line configuration errors in feature models. In Software Product Line Conference, 2008. SPLC '08. 12th International, pages 225–234.

[S77]. Xue, Y., Xing, Z., and Jarzabek, S. (2010). Understanding feature evolution in a family of product variants. In Reverse Engineering (WCRE), 2010 17th Working Conference on, pages 109–118.

[S78]. Yoshimura, K., Narisawa, F., Hashimoto, K., and Kikuno, T. (2008). FAVE: factor analysis based approach for detecting product line variability from change history. In Proceedings of the 2008 International Working Conference on Mining Software Repositories, MSR '08, pages 11–18, New York, NY, USA. ACM.

[S79]. Zhang, G., Ye, H., and Lin, Y. (2011). Using knowledge-based systems to manage quality attributes in software product lines. In Proceedings of the 15th International Software Product Line Conference, Volume 2, SPLC '11, New York, NY, USA. ACM.

APPENDIX B

Table 28: Quality assessment score card for method described


Article Id   Q1     Q2     Q3     Q4     Q5     Score
[S1]         1.0    1.0    1.0    0.0    0.0    3.0
[S2]         1.0    1.0    1.0    1.0    1.0    5.0
[S3]         1.0    1.0    1.0    1.0    0.0    4.0
[S4]         1.0    1.0    1.0    1.0    0.0    4.0
[S5]         1.0    1.0    1.0    1.0    0.0    4.0
[S6]         1.0    1.0    1.0    1.0    0.0    4.0
[S7]         1.0    0.0    0.0    0.0    0.0    1.0
[S8]         1.0    1.0    0.0    1.0    0.5    3.5
[S9]         1.0    1.0    1.0    1.0    1.0    5.0
[S10]        1.0    1.0    1.0    1.0    0.0    4.0
[S11]        1.0    1.0    0.0    1.0    0.0    3.0
[S12]        1.0    1.0    1.0    0.5    0.5    4.0
[S13]        1.0    1.0    0.0    0.0    0.0    2.0
[S14]        1.0    1.0    1.0    1.0    0.0    4.0
[S15]        1.0    1.0    0.0    0.0    0.0    2.0
[S16]        1.0    1.0    1.0    1.0    0.0    4.0
[S17]        1.0    1.0    0.0    1.0    0.0    3.0
[S18]        1.0    1.0    1.0    1.0    0.0    4.0
[S19]        1.0    1.0    1.0    1.0    0.0    4.0
[S20]        1.0    1.0    1.0    0.0    0.0    3.0
[S21]        1.0    1.0    0.0    0.0    0.0    2.0
[S22]        1.0    1.0    1.0    1.0    1.0    5.0
[S23]        1.0    1.0    1.0    1.0    0.0    4.0
[S24]        1.0    1.0    0.0    0.0    0.0    2.0
[S25]        1.0    1.0    1.0    1.0    0.0    4.0
[S26]        1.0    1.0    1.0    1.0    1.0    5.0
[S27]        1.0    1.0    1.0    0.0    1.0    4.0
[S28]        1.0    1.0    1.0    1.0    0.5    4.5
[S29]        1.0    1.0    0.0    1.0    1.0    4.0
[S30]        1.0    1.0    1.0    1.0    0.0    4.0
[S31]        1.0    1.0    1.0    0.0    0.0    3.0
[S32]        1.0    1.0    0.0    1.0    0.0    3.0
[S33]        1.0    1.0    0.0    1.0    0.0    3.0
[S34]        1.0    0.0    0.0    1.0    0.0    2.0
[S35]        1.0    1.0    0.0    0.5    0.0    2.5
[S36]        1.0    1.0    0.0    1.0    1.0    4.0
[S37]        1.0    0.0    1.0    0.0    0.0    2.0
[S38]        1.0    1.0    1.0    1.0    0.0    4.0
[S39]        1.0    1.0    0.0    0.0    0.0    2.0
[S40]        1.0    1.0    0.0    1.0    0.0    3.0
[S41]        1.0    0.5    0.0    0.5    0.0    2.0
[S42]        1.0    1.0    0.0    0.0    0.0    2.0
[S43]        1.0    1.0    1.0    0.0    0.0    3.0
[S44]        1.0    1.0    1.0    1.0    0.0    4.0
[S45]        1.0    1.0    0.5    0.5    0.0    3.0
[S46]        1.0    1.0    0.0    1.0    1.0    4.0
[S47]        0.0    0.0    0.0    0.0    0.0    0.0
[S48]        1.0    1.0    1.0    1.0    0.0    4.0
[S49]        1.0    1.0    1.0    0.0    0.0    3.0
[S50]        1.0    1.0    1.0    0.0    0.0    3.0
[S51]        1.0    1.0    0.0    0.0    0.0    2.0
[S52]        1.0    1.0    1.0    1.0    0.0    4.0
[S53]        1.0    1.0    1.0    1.0    0.0    4.0
[S54]        1.0    1.0    0.0    1.0    1.0    4.0
[S55]        1.0    1.0    0.0    0.5    0.0    2.5
[S56]        1.0    1.0    1.0    0.0    0.0    3.0
[S57]        1.0    1.0    0.0    0.0    0.0    2.0
[S58]        1.0    1.0    1.0    1.0    0.0    4.0
[S59]        1.0    1.0    1.0    0.0    0.0    3.0
[S60]        1.0    1.0    1.0    1.0    1.0    5.0
[S61]        1.0    1.0    1.0    0.0    0.0    3.0
[S62]        1.0    1.0    1.0    0.0    0.0    3.0
[S63]        1.0    1.0    1.0    1.0    0.0    4.0
[S64]        1.0    1.0    1.0    1.0    1.0    5.0
[S65]        1.0    1.0    0.0    1.0    1.0    4.0
[S66]        1.0    1.0    1.0    1.0    0.0    4.0
[S67]        1.0    1.0    1.0    1.0    1.0    5.0
[S68]        1.0    1.0    1.0    1.0    0.0    4.0
[S69]        1.0    1.0    0.0    1.0    0.0    3.0
[S70]        1.0    1.0    0.0    0.0    0.0    2.0
[S71]        1.0    1.0    1.0    1.0    0.0    4.0
[S72]        1.0    1.0    0.5    1.0    1.0    4.5
[S73]        1.0    1.0    1.0    1.0    0.0    4.0
[S74]        1.0    1.0    1.0    1.0    0.0    4.0
[S75]        1.0    1.0    1.0    1.0    0.0    4.0
[S76]        1.0    1.0    1.0    1.0    0.0    4.0
[S77]        1.0    1.0    0.0    1.0    0.0    3.0
[S78]        1.0    1.0    1.0    1.0    0.0    4.0
[S79]        1.0    1.0    0.5    1.0    0.0    3.5
TOTAL        78.0   74.5   49.5   53.5   15.5   271.0
AVERAGE      0.99   0.94   0.63   0.68   0.20   3.43

APPENDIX C
Table 29 RQ1 Results Summary
Articl

Method Name and/or Description

e Id

Problem

Project

Project

Method

Area

Focus

Activity

Describ
ed
Score

[S1]

A modeling process in which variability sources are

MOD

separated in different FMs and inter-related by

CVM

DE

DRA

3.0

VR

propositional constraints while consistency checking and


propagation of variability choices are automated.
[S2]

Architecture Process Maturity Model -

ARC

An approach to evaluate the current maturity of the

PRO

DE

DRA

5.0

CR

product line architecture development process in an

VR

organization.

DD
DI

[S3]

Business Maturity Model -

PRO

PM

BVE

4.0

EAM

PE

PDS

4.0

Domain-Specific Kits -

MOD

DE

DRA

4.0

A modeling approach for reusable software assets

REQ

A methodology to evaluate the current maturity of the


business dimension of a software product line in an
organization.
[S4]

Maintainability Index (MI) An approach to quantify maintainability of a SPL.

[S5]

[S6]

[S7]

VR

supported variability points.

DD

A technique for modularizing and flexibly binding

COD

DE

DI

varying features in form of idioms.

EAM

PE

PC

Reusable Asset Instantiation (RAI) -

TEST

DE

DT

PE

PT

A software product line test strategy.


[S8]

CR

composed of domain specific artifacts with a set of

Stratified Analytic Hierarchy Process -

CON

PM

SD

A method for ranking and selecting business objectives

PDR

PE

PR

Feature Model Structural Metrics -

EAM

DE

A technique using structural metrics to predict

MOD

and features from the feature model.


[S9]

3.5

DRA

5.0

CR
VR

Approach for Entity Model Variability (AEMV) - An

MOD

approach view and edit the entities within a domain

CVM

model, capture the variability, and shield against

1.0

PC

analyzability, changeability, and understandability.


[S10]

4.0

DE

DRA

4.0

CR
VR

extraneous differences.

42

[S11]

Modeling Scenario Variability as Crosscutting

CVM

DE

Mechanisms (MSVCM) -

DRA

3.0

VR

An approach for use case scenario variability


management.
[S12]

Feature Inclusion Graph (FIG) Basis Path -

TEST

PE

PT

4.0

Enhanced RiPLE-DE Process -

CVM

DE

DRA

2.0

An architecture and design process for SPLs that deals

QCL

A graph based testing approach that selects products and


features for testing based on a feature dependency graph.
[S13]

VR

with quality attributes variability.

DD
DI

[S14]

A variability mechanism for implementing the binding

COD

sites of software features that require flexible binding

DE

DI

PE

PC

DE

DRA

4.0

times.
[S15]

Commonality and variability analysis (C&VA) -

CVM

Commonality and Variability Analysis within an

CR

organization with a single-system development mentality

VR

2.0

attempting to adopt a product line approach.


[S16]

A technique to optimize product derivation by defining a

PDR

PE

PD

4.0

A process evaluation approach - the software product line

ADP

PM

BVE

3.0

process meta-model, and the simulation tool.

PRM
4.0

feature selection strategy.


[S17]

PRO
[S18]

Software Performance Tuning method -

ARC

DE

DT

A scenario-driven approach to evaluate performance and

EAM

PE

PT

An automated product derivation approach for Multi-

PDR

PE

PR

agent Software Product Lines

CON

PD

AUTO

PC

dynamic memory management efficiency of software


product line architectures.
[S19]

[S20]
[S21]

4.0

Featured Transition Systems (FTS) -

MOD

DE

DD

3.0

A modeling checking technique.

QCL

PE

PD

Feature-Oriented Reverse Engineering for Middleware

COD

PE

PC

2.0

CVM

DE

DRA

5.0

Specialization (FORMS) - A model-based approach for mapping product-line variant-specific feature requirements to middleware-specific features.

[S22] COVAMOF software variability assessment method (COSVAM) - A novel and industry-strength assessment process that addresses the issues associated with current variability assessment practice.

[S23] An approach that aims at reducing the maintenance effort by organizing product lines as a set of interrelated model fragments defining the variability of particular parts of the system.

[S24] Platform Maintenance Process - A process concerned with the implementation and validation of the details of change requests and the eventual release to the stakeholders.

[S25] Product line use case models (PLUSS) - An approach to manage natural-language requirements specifications in a software product line context.

[S26] An approach to support comprehension of variable software using background colors.

[S27] CFS unit testing framework - Strategies for improving the existing unit testing infrastructure.

[S28] Ownership Architecture - A reverse engineering approach to understanding the organizational aspects during the product line planning phase.

[S29] Software Product Line Infrastructure (SPLI) - An approach to support software product populations.

[S30] An approach to deal with inconsistencies in feature model evolution.

[S31] Genetic Algorithms for optimized Feature Selection (GAFES) - An approach that uses genetic algorithms for optimized feature selection in SPLs.

[S32] Textual Variability Language (TVL) - A textual feature modeling dialect geared towards software architects and engineers.
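To illustrate the kind of technique catalogued in [S31], the sketch below applies a toy genetic algorithm to feature selection: configurations are encoded as bit-vectors over features, invalid selections are repaired against simple requires/excludes constraints, and fitness rewards the total value of the selected features. The feature model, weights, and operators here are invented for illustration and are not the GAFES implementation.

```python
import random

# Hypothetical feature model: six optional features with illustrative
# per-feature values, one "requires" and one "excludes" constraint.
FEATURES = ["A", "B", "C", "D", "E", "F"]
VALUE = [4, 3, 5, 2, 6, 1]   # benefit of selecting each feature
REQUIRES = [(2, 0)]          # C requires A
EXCLUDES = [(1, 4)]          # B excludes E

def repair(config):
    """Repair a configuration against the requires/excludes rules
    (a single pass suffices for this toy model)."""
    for a, b in REQUIRES:
        if config[a] and not config[b]:
            config[b] = 1
    for a, b in EXCLUDES:
        if config[a] and config[b]:
            config[b] = 0
    return config

def fitness(config):
    return sum(v for bit, v in zip(config, VALUE) if bit)

def gafes_sketch(pop_size=20, generations=30, seed=42):
    rng = random.Random(seed)
    pop = [repair([rng.randint(0, 1) for _ in FEATURES]) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(FEATURES))  # one-point crossover
            child = p1[:cut] + p2[cut:]
            i = rng.randrange(len(FEATURES))       # one-bit mutation
            child[i] = 1 - child[i]
            children.append(repair(child))
        pop = survivors + children
    return max(pop, key=fitness)

best = gafes_sketch()
print(best, fitness(best))
```

Running the sketch prints a valid configuration together with its fitness; because the seed is fixed, runs are reproducible.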


[S33] Asset Organization for Product Derivation - Organization schemes for the software core assets used to build products.

[S34] Sequential Pattern Mining technique - An approach to simplifying the maintenance of software product lines and improving software quality.

[S35] Enhanced Software Maintenance Process - An approach for software product line maintenance and evolution.

[S36] Pair-wise Feature Interaction Matrix - A practical method for managing software product lines with respect to their feature interactions.

[S37] A monitoring technique that statically determines the feature combinations that cannot violate a stated property of a SPL.

[S38] DRAMA (Domain Requirements And Modeling Architectures) - A framework for modeling domain architecture based on domain requirements within product lines, with a focus on the traceable relationship between requirements and architectural structures.

[S39] Architectural Evaluation Model - An architecture analysis and improvement process.

[S40] Extended Architecture Tradeoff Analysis Method (EATAM) - An approach that analyzes variation points of quality attributes using feature modeling and creates variability scenarios for the derivation of the variation points.

[S41] Adoption Factory - An approach for producing plans for proceeding with the SPL adoption.

[S42] OTM3 Organizational Testing Management Maturity Model - A method using metrics to address defect management and the measurement of software testing results.

[S43] An approach to analyzing commonality and variability of features using semantic-based analysis criteria, which is able to change the feature model of a specific domain into a feature ontology.

[S44] A method to optimize documenting and analyzing variability.

[S45] A method for restructuring and simplifying the provided variability based on concept analysis.

[S46] Language Oriented Programming - A software development paradigm that aims to close the gap between the ability to reuse high-level concepts in software design and the ability to reuse the code implementing them, through extensive use of Domain Specific Languages (DSLs).

[S47] Design Structure Matrices - An approach to assess the modularity of SPLs through their dependencies.

[S48] Security Requirements Engineering Process for Software Product Lines (SREPPLine) - A holistic security requirements engineering framework with which to facilitate the development of secure SPLs and their derived products.

[S49] Collaborative Product Configuration (CPC) - An approach to describe and validate collaborative configuration scenarios, and a set of reasoning algorithms tailored to the feature modeling domain that can be used to provide automated support for product configuration.

[S50] Extended Functional Requirements Framework (EFRF) - An approach to extract functional requirements by analyzing text-based software requirements specifications (SRSs).

[S51] QRF (Quality Requirements of a software Family) - A method focusing on how quality requirements have to be defined, represented, and transformed to architectural models.

[S52] A conceptual framework to analyze the requirements assets in support of extractive and reactive SPL development.

[S53] Integrated Cost Model for Product Line Engineering (InCoME) - A method that produces cost and benefit values in order to help an organization decide whether an investment in a product line is worthwhile.

[S54] Project Cost Overrun Simulation Model - A cost overrun simulation model for time-boxed SPL development.

[S55] CompPLA and ExtensPLA - A technique for assessing PLA complexity and extensibility quality attributes using metrics.

[S56] Holistic Product Line Architecture Assessment (HoPLAA) - A holistic approach that focuses on risks and quality attribute tradeoffs for the purpose of assessing PLA.

[S57] Model-based Integration Testing technique - A model-based, automated integration test technique that can be applied during domain engineering for generating integration test case scenarios.

[S58] A UML-based approach for maintaining software products of a large-scale industrial product family.

[S59] Refactoring - An approach for refactoring technical documentation of SPLs.

[S60] Flexible Feature Binding - An approach that integrates static and dynamic feature binding at the implementation and modeling level.

[S61] Metamorphic testing for the automated generation of test data for the analyses of feature models.

[S62] An approach to preserve the exception policy of a family of systems along with its evolution, based on the definition and automatic checking of exception handling design rules.

[S63] Requirement Decision Model - A meta-model for requirement decision models to bring formalism and consistency to the structure and to model inter-dependencies between requirement selection decisions.

[S64] An approach to predict a product's non-functional properties based on the product's feature selection.

[S65] SPL Conqueror - A holistic approach for measuring and optimizing non-functional properties.

[S66] Feature Unweaving - An approach to evolve an integrated graphical requirements model into a product line model.

[S67] ScenTED-DF technique - A model-based technique to avoid redundant testing in application engineering.

[S68] An automated approach for verifying safe compositions for SPLs with configuration knowledge models.

[S69] Domain Requirement Model and Requirement Analysis Method - A new domain requirement model for solution domains.

[S70] Goal-oriented requirements analysis (GORA) and goal graphs - An approach to identify common and variable requirements based on product goals.

[S71] Incremental Test Generation Approach - An automatic technique for mapping a formula that specifies a feature into a transformation that defines incremental refinement of test suites.

[S72] PLEvo-Scoping - A method intended to help Product Line (PL) scoping teams anticipate emergent features and distinguish unstable from stable features, with the aim of preparing their PL for likely future adaptation needs.

[S73] FAST - A process for achieving generation of members of the product line.

[S74] Configuration Understanding and Remedy (CURE) - A constraint-based diagnostic approach.

[S75] CURE Configuration Understanding Remedy - An approach to debug feature model configurations.

[S76] MUlti-step Software Configuration problem solver (MUSCLE) - An automated method for deriving a set of configurations that meet a series of requirements over a span of configuration steps.
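The configuration-debugging idea behind approaches such as [S74] and [S75] can be sketched as a constraint-diagnosis problem: given an invalid feature selection, find a minimal set of features to flip so that the feature model's constraints hold. The toy model and brute-force search below are invented for illustration; the actual CURE approach relies on constraint solvers rather than enumeration.

```python
from itertools import combinations

# Toy feature model as propositional constraints over feature names.
# These features and rules are invented for illustration only.
FEATURES = ["Base", "GUI", "CLI", "Themes"]
RULES = [
    lambda c: c["Base"],                    # Base is mandatory
    lambda c: not c["Themes"] or c["GUI"],  # Themes requires GUI
    lambda c: not (c["GUI"] and c["CLI"]),  # GUI excludes CLI
]

def valid(config):
    return all(rule(config) for rule in RULES)

def diagnose(config):
    """Return a minimal set of features to flip to make `config` valid."""
    if valid(config):
        return set()
    for size in range(1, len(FEATURES) + 1):
        for flips in combinations(FEATURES, size):
            candidate = dict(config)
            for f in flips:
                candidate[f] = not candidate[f]
            if valid(candidate):
                return set(flips)
    return None  # unreachable when the model is satisfiable

broken = {"Base": True, "GUI": False, "CLI": True, "Themes": True}
print(diagnose(broken))  # {'Themes'}
```

For the example configuration, deselecting Themes alone restores validity, so the minimal diagnosis is {'Themes'}.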
[S77] A method that assists analysts in detecting changes to product features during evolution.

[S78] FAVE: Factor Analysis based Variability Extraction - An approach to detect variability across existing software by analyzing the change history.

[S79] Quality Attribute Knowledge Base (QA_KB) - An approach that represents the inter-relationships among quality attributes in the product configuration process.

APPENDIX D
Table 30 Score card for the quality assessment of 79 empirical studies (per-study totals over the quality criteria Q1-Q14, grouped by total score)

Total  Articles
14     S9
11     S6, S26
10.5   S66, S72
10     S17, S25, S32
9.5    S18
9      S13, S16, S65, S67, S74
8.5    S12, S31, S64
8      S19, S53
7.5    S41, S52
7      S44, S55, S58
6.5    S30, S38, S54
6      S10, S11, S20, S22, S23, S45, S68, S71, S77, S78
5.5    S2, S3, S4, S34, S46, S60
5      S5, S28, S29, S33, S36, S40, S48, S70
4.5    S7, S57, S61
4      S21, S24, S27, S42, S43, S49, S56, S62, S63, S76
3.5    S1, S35, S39, S51
3      S8, S37, S47, S50, S75, S79
2.5    S15, S59
2      S14, S73
1.5    S69

Column totals (Q1-Q14): 28, 13, 62, 35.5, 26.5, 26.5, 67.5, 15, 5, 17, 15, 51.5, 61, 44; grand total 467.5.
Column averages (Q1-Q14): 0.35, 0.16, 0.78, 0.45, 0.34, 0.34, 0.85, 0.19, 0.06, 0.22, 0.19, 0.65, 0.77, 0.56; average total per study 5.92.
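The AVG row of Table 30 divides each question's column total by the number of studies (79); note that the grand total of 467.5 implies a Q9 column total of 5. A quick arithmetic check:

```python
# Per-question column totals (Q1-Q14) from Table 30; the Q9 value of 5
# is inferred from the reported grand total of 467.5.
totals = [28, 13, 62, 35.5, 26.5, 26.5, 67.5, 15, 5, 17, 15, 51.5, 61, 44]
n_studies = 79

averages = [round(t / n_studies, 2) for t in totals]
print(averages)                            # reproduces the AVG row
print(sum(totals))                         # 467.5
print(round(sum(totals) / n_studies, 2))   # 5.92, the average total per study
```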

REFERENCES
[1]. Krueger, Charles W. "Benefits of Software Product Lines." Software Product Lines. BigLever Software,
Inc. Web. 19 Apr. 2012. <http://www.softwareproductlines.com/benefits/benefits.html>.
[2]. Clements, P., and Northrop, L. 2001. Software Product Lines: Practices and Patterns. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.


[3]. Czarnecki, K., and Eisenecker, U. Generative Programming: Methods, Tools, and Applications. ACM
Press/Addison-Wesley Publ. Co., New York, NY, USA, 2000.
[4]. Zhang, He., and M. A. Babar. "An Empirical Investigation of Systematic Reviews in Software
Engineering." Empirical Software Engineering and Measurement (ESEM), 2011 International
Symposium on (2011): 87-96. IEEE Xplore Digital Library. Web. 19 Apr. 2012.
[5]. Kitchenham, B.A., Procedures for Performing Systematic Reviews, tech. report SE0401, Dept. of
Computer Science, Univ. of Keele, and tech. report 0400011T.1, Empirical Software Eng., National
Information and Communications Technology Australia, 30 Aug. 2004.
[6]. Kitchenham, B., Pretorius, R., Budgen, D., Pearl Brereton, O., Turner, M., Niazi, M., and Linkman, S. "Systematic Literature Reviews in Software Engineering - A Tertiary Study." Information and Software Technology 52.8 (2010): 792-805. ScienceDirect. Web. 19 Apr. 2012.
[7]. Da Silva, Fabio Q.B., André L.M. Santos, Sérgio Soares, A. César C. França, Cleviton V.F. Monteiro, and Felipe F. Maciel. "Six Years of Systematic Literature Reviews in Software Engineering: An Updated Tertiary Study." Information and Software Technology 53.9 (2011): 899-913. ScienceDirect. Web. 19 Apr. 2012.
[8]. Zhang, H. and Babar, M. A. (2011). An empirical investigation of systematic reviews in software engineering. In Empirical Software Engineering and Measurement (ESEM), 2011 International Symposium on, pages 87-96.
[9]. Bastos, J. F., da Mota Silveira Neto, P. A., de Almeida, E. S., and de Lemos Meira, S. R. (2011). Adopting software product lines: A systematic mapping study. In Evaluation and Assessment in Software Engineering (EASE 2011), 15th Annual Conference on, pages 11-20.
[10]. Chen, L. and Babar, M. A. (2011b). A systematic review of evaluation of variability management approaches in software product lines. Information and Software Technology, 53(4):344-362.
[11]. da Silva, I. F., da Mota Silveira Neto, P. A., O'Leary, P., de Almeida, E. S., and de Lemos Meira, S. R. (2011c). Agile software product lines: a systematic mapping study. Software: Practice and Experience, 41(8):899-920.
[12]. Diaz, J., Pérez, J., Alarcón, P. P., and Garbajosa, J. (2011). Agile product line engineering: a systematic literature review. Software: Practice and Experience, 41(8):921-941.
[13]. Engström, E. and Runeson, P. (2011a). Software product line testing - a systematic mapping study. Information and Software Technology, 53(1):2-13.
[14]. Montagud, S., Abrahão, S., and Insfran, E. A systematic review of quality attributes and measures for software product lines. Software Quality Journal, pages 1-62.
[15]. da Mota Silveira Neto, P. A., do Carmo Machado, I., McGregor, J. D., de Almeida, E. S., and de Lemos Meira, S. R. (2011b). A systematic mapping study of software product lines testing. Information and Software Technology, 53(5):407-423.
[16]. Murugesupillai, E., Mohabbati, B., and Gasevic, D. (2011). A preliminary mapping study of approaches bridging software product lines and service-oriented architectures. In Proceedings of the 15th International Software Product Line Conference, Volume 2, SPLC '11, New York, NY, USA. ACM.
[17]. Alves, V., Niu, N., Alves, C., and Valença, G. (2010b). Requirements engineering for software product lines: A systematic literature review. Information and Software Technology, 52(8):806-820.
[18]. Bezerra, Y. M., Pereira, T. A. B., and da Silveira, G. E. (2009). A systematic review of software product lines applied to mobile middleware. In Information Technology: New Generations, 2009. ITNG '09. Sixth International Conference on, pages 1024-1029.


[19]. Chen, L., Ali Babar, M., and Ali, N. (2009b). Variability management in software product lines: a systematic review. In Proceedings of the 13th International Software Product Line Conference, SPLC '09, pages 81-90, Pittsburgh, PA, USA. Carnegie Mellon University.
[20]. Khurum, M. and Gorschek, T. (2009b). A systematic review of domain analysis solutions for product lines. Journal of Systems and Software, 82(12):1982-2003.
[21]. Montagud, S. and Abrahão, S. (2009). Gathering current knowledge about quality evaluation in software product lines. In Proceedings of the 13th International Software Product Line Conference, SPLC '09, pages 91-100, Pittsburgh, PA, USA. Carnegie Mellon University.
[22]. de Souza Filho, E., de Oliveira Cavalcanti, R., Neiva, D., Oliveira, T., Lisboa, L., de Almeida, E., and de Lemos Meira, S. (2008). Evaluating domain design approaches using systematic review. In Morrison, R., Balasubramaniam, D., and Falkner, K., editors, Software Architecture, volume 5292 of Lecture Notes in Computer Science, pages 50-65. Springer Berlin / Heidelberg.
[23]. "Journals by Title - Databases - AU Library." AU Library. Athabasca University. Web. 19 Apr. 2012. <
http://library.athabascau.ca/journals/title.php?subjectID=36>.
[24]. B.A. Kitchenham, S.L. Pfleeger, L.M. Pickard, P.W. Jones, D.C. Hoaglin, K. El Emam, J. Rosenberg, Preliminary guidelines for empirical research in software engineering, IEEE Transactions on Software Engineering 28 (8) (2002) 721-734.
[25]. P. Runeson and M. Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering, 14(2):131-164, 2009.
[26]. Jedlitschka, A., Ciolkowski, M. & Pfahl, D. (2008), Reporting experiments in software engineering, in F. Shull, J. Singer & D. Sjøberg, eds, Guide to Advanced Empirical Software Engineering, Springer-Verlag, London, chapter 8.

[27]. B. Kitchenham, "DESMET: A method for evaluating Software Engineering methods and tools," Department of Computer Science, University of Keele, UK, TR96-09, 1996.
[28]. Pohl, Klaus, Günter Böckle, and Frank J. van der Linden. Software Product Line Engineering: Foundations, Principles, and Techniques. Springer, 2005.
[29]. Wohlin, C., Höst, M., & Henningsson, K. (2006). Empirical Research Methods in Web and Software Engineering, in Mendes, E. and Mosley, N., Web Engineering, Springer Berlin Heidelberg, pp. 409-430.
[30]. Wright RW, Brand RA, Dunn W, Spindler KP. How to write a systematic review. Clin Orthop Relat Res
2007; 455:23-29.
[31]. Biolchini, J., Gomes, P., Cruz, A., & Travassos, G. (2005). Systematic review in software engineering.
Rio de Janeiro, Brazil: Systems Engineering and Computer Science Department, UFRJ.
<http://www.cronos.cos.ufrj.br/publicacoes/reltec/es67905.pdf>.
[32]. Gorschek, T., Garre, P., Larsson, S., et al (2006) A model for technology transfer in practice. IEEE Softw 23(6):88-95. doi:10.1109/MS.2006.147
[33]. G.A. Lewis, T. Mahatham, and L. Wrage, Assumptions Management in Software Development,
Technical Note CMU/SEI-2004-TN-021, Carnegie Mellon University, 2004.
[34]. Feldt, R., Magazinius, A., 2010. Validity Threats in Empirical Software Engineering Research - An
Initial Survey. The 22nd International Conference on Software Engineering and Knowledge
Engineering, SEKE 2010, Redwood City, USA.
[35]. C. Kuloor and A. Eberlein, "Requirements Engineering for Software Product Lines," in the Proc. of the 15th International Conference of Software and Systems Engineering and their Applications (ICSSEA'2002), vol. 1. Conservatoire National des Arts et Métiers, Paris, 2002.


[36]. Pohl, Klaus, Günter Böckle, and Frank van der Linden. Software Product Line Engineering: Foundations, Principles, and Techniques. New York, NY: Springer, 2005. Print.
[37]. Easterbrook, S., Singer, J., Storey, M.-A., and Damian, D. (2008). Selecting empirical methods for software engineering research. In Shull, F., Singer, J., and Sjøberg, D. I. K., editors, Guide to Advanced Empirical Software Engineering, pages 285-311. Springer London.
[38]. Wieringa, R.J. (2012). A Unified Checklist for Observational and Experimental Research in Software Engineering (Version 1). Technical Report TR-CTIT-12-07, Centre for Telematics and Information Technology, University of Twente, Enschede. ISSN 1381-3625.
[39]. D. Moher, S. Hopewell, K. Schulz, V. Montori, P. Gøtzsche, P. Devereaux, D. Elbourne, M. Egger, and D. Altman, for the CONSORT Group, CONSORT 2010 Explanation and Elaboration: updated guidelines for reporting parallel group randomised trials, British Medical Journal, p. 340:c869, 2010.
[40]. K. Schulz, D. Altman, and D. Moher, CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials, Annals of Internal Medicine, vol. 152, no. 11, pp. 1-7, 1 June 2010.
[41]. Petersen K, Feldt R, Mujtaba S, Mattsson M (2008) Systematic Mapping Studies in Software
Engineering. In: proceedings of the 12th International Conference on Evaluation and Assessment in
Software Engineering, pp. 71-80.
[42]. A. Brown and C. Lin. Limitations of Software Solutions for Soft Error Detection, The Second Workshop on System Effects of Logic Soft Errors, Urbana-Champaign, IL, April 11-12, 2006, (2006).
[43]. L. Chen et al., Variability management in software product lines: a systematic review, in: 13th SPLC, 2009, pp. 81-90.
[44]. L. Chen et al., A status report on the evaluation of variability management approaches, in: 13th EASE, 2009.
[45]. M. Khurum et al., A systematic review of domain analysis solutions for product lines, JSS 82 (12) (2009) 1982-2003.
[46]. B.P. Lamancha et al., Software product line testing - a systematic review, ICSOFT (2009) 23-30.
[47]. S. Montagud et al., Gathering current knowledge about quality evaluation in software product lines, SPLC (2009) 91-100.
[48]. E.D. de Souza Filho et al., Evaluating domain design approaches using systematic review, ECSA (2008) 50-65.
[49]. M. Khurum et al., Systematic review of solutions proposed for product line economics, in: 2nd MESPUL, 2008, pp. 386-393.
[50]. D. Sjøberg, T. Dybå, M. Jørgensen, The Future of Empirical Methods in Software Engineering Research, in: Future of Software Engineering (FOSE '07), IEEE, 2007, pp. 358-378.
[51]. Dybå, Tore, and Torgeir Dingsøyr. "Empirical studies of agile software development: A systematic review." Information and Software Technology 50, no. 9 (2008): 833-859.
