
International Workshop on Distributed Software Development

Daniela Damian, Schahram Dustdar



Paris, 29 August 2005

Workshop on Distributed Software Development


Workshop chairs: Daniela Damian (Univ. of Victoria, Canada) and Schahram Dustdar (Vienna University of Technology, Austria)

This volume contains the proceedings of the First International Workshop on Distributed Software Development (DiSD), co-located with the 13th International Requirements Engineering Conference. The workshop merges two major workshop series in the area of distributed software engineering: GSD (Global Software Development) and CSSE (Computer Support for Distributed Software Engineering), organized at ICSE, ASE and other major software engineering conferences over the past five years. Software development in geographically distributed settings is increasingly becoming common practice in the software industry. More and more software companies use computer-supported cooperative tools to overcome the geographical distance and to benefit from access to a qualified resource pool and a reduction in development costs. However, the increased globalization of software development creates software engineering challenges due to the impact of temporal, geographical and cultural differences, and it requires the development of techniques and technologies to address these issues. The processes of communication, coordination, and collaboration are key enablers of software development processes. In particular, one set of development activities directly affected by challenges in communication is Requirements Engineering (RE), whose activities pervade the entire development life-cycle. Industrial case studies reveal the significant impact that distance has on the management of requirements and show how well-known problems of RE are exacerbated in DiSD. The majority of distributed projects are characterized by distributed customer-developer relationships, be they inter-organizational projects or projects internal to multinational organizations. Failure to achieve a common understanding of system features, reduced trust and the inability to effectively resolve conflicts result in budget and schedule overruns and, ultimately, in damaged client-supplier relationships. Further, the developer team itself is often geographically distributed and experiences significant problems in requirements management and in related activities such as testing and project management (e.g. project planning, progress tracking). The goal of this workshop is to provide an opportunity for researchers and industry practitioners to explore both the state-of-the-art and the state-of-the-practice in DiSD. In particular, we intend to explore the specific challenges experienced by DiSD projects in conducting effective Requirements Engineering. Our call for papers elicited contributions on topics that included, but were not limited to:
- Case studies describing experiences of the challenges of DiSD, and of RE in DiSD in particular
- Theories of communication, coordination, collaboration and knowledge management in DiSD
- The multifaceted nature of challenges in requirements management in DiSD

- Collaboration infrastructure to support DiSD teams, and requirements management activities in particular
- Requirements Engineering processes and tools specifically targeted at DiSD projects

Workshop program

We are pleased to present a program of paper presentations and discussion that includes contributions on theories of Distributed Software Development as well as reports from practice. Twenty paper submissions were reviewed by the Program Committee for relevance to the workshop topics as well as for their potential to generate discussion of important topics at the workshop. Sixteen papers were selected for presentation at the workshop. The one-day program is structured to include three experience reports, five papers on theories of DiSD, five demos of collaborative tools for DiSD, and three reports of early research investigating Requirements Engineering processes in distributed, computer-mediated software projects. We hope you enjoy this year's workshop program, and we kindly thank you for your continued interest in this area of research.

Daniela Damian and Schahram Dustdar
Workshop Chairs

Organization
International Workshop on Distributed Software Development (DiSD 2005)
29 August 2005
Co-located with the 13th IEEE Requirements Engineering Conference 2005

Program Committee Co-Chairs
Daniela Damian (University of Victoria, Canada)
Schahram Dustdar (Vienna University of Technology, Austria)

Steering Committee
Filippo Lanubile, University of Bari, Italy
Harald Gall, University of Zürich, Switzerland
Andrea de Lucia, University of Salerno, Italy

Program Committee
Kevin Ryan, University of Limerick, Ireland
Christof Ebert, Alcatel, France
Andreas Braun, Accenture GmbH, Germany
Heather Oppenheimer, Lucent Technologies, USA
Lerina Aversano, University of Sannio, Italy
Cornelia Boldyreff, University of Lincoln, UK
Paolo Ciancarini, University of Bologna, Italy
Gianpaolo Cugola, Politecnico di Milano, Italy
Rick Dewar, Heriot-Watt University, UK
Paul Grünbacher, University of Linz, Austria
Frank Maurer, University of Calgary, Canada
Pierluigi Ritrovato, CRMPA, Italy
Andre van der Hoek, University of California, Irvine, USA
Rafael Prikladnicki, Pontifícia Universidade Católica do Rio Grande do Sul, Brazil
Liam Bannon, University of Limerick, Ireland
Brian Fitzgerald, University of Limerick, Ireland
Allen Dutoit, Technical University München, Germany
Stephen Rank, University of Lincoln, UK
Bernd Bruegge, Technical University München, Germany

Table of Contents
Experience Reports
Adel Taweel
A Case Study of a Successful Collaboration in Distributed Software Development ... 8

An Ngo-The, Kiem Hoang, Truc Nguyen, Nhien Mai
Extreme Programming in Distributed Software Development: A Case Study ... 17

Mark Sheppard
Organizational Pattern Mining - A Retrospective on XP in a Large Scale Software Development Project ... 23

Theories in Global Software Development


Pär J Ågerfalk, Brian Fitzgerald, Helena Holmström, Brian Lings, Björn Lundell, Eoin Ó Conchúir
A Framework for considering Opportunities and Threats in Distributed Software Development ... 43

Allan Scott, Luis Izquierdo, Sweta Gupta, Robert Elves and Daniela Damian
Leveraging Design Patterns in Global Software Development: A Proposal for a GSD Communication Pattern Language ... 58

Karin K. Breitman, Miriam Sayão, Leonardo M. Couto
Using ontologies in Distributed Software Development ... 72

Gamel O. Wiredu
Coordination as the Challenge of Distributed Software Development ... 80

Bernd Bruegge, Korbinian Herrmann, Axel Rauschmayer, Patrick Renner
Situational Requirements Engineering gets distributed ... 85

Collaborative tools: Tool demo


Fabio Calefato, Filippo Lanubile
Using the EConference Tool for Synchronous Distributed Requirements Workshops ... 91

Naoufel Boulila, Allen H. Dutoit, Bernd Bruegge
Bootstrapping Incremental Design: An Empirical Approach For Requirements Identification and Distributed Software Development ... 102

Timo Wolf, Allen H. Dutoit
Supporting Traceability in Distributed Software Development Projects ... 111

Andrea De Lucia, Fausto Fasano, Rita Francese, Rocco Oliveto
Traceability Management in ADAMS ... 125

Muhammad Ali Babar, June Verner
Groupware Requirements for supporting software architecture evaluation process ... 140

Brand new research


Daniela Damian, Filippo Lanubile, Teresa Mallardo
Investigating IBIS in a distributed educational environment: the design of a case study ... 153

Luis Izquierdo, Daniela Damian, Daniel German
Towards Requirements management in a special case of global software development: A study of asynchronous communication in the Open Source community ... 159

Tom O'Regan, Valentine Casey, Ita Richardson
Virtual Team Implementation and Management - A Position Paper ... 174

A Case Study of a Successful Collaboration in Distributed Software Development

Adel Taweel 1

Abstract: Market pressures and an increasingly reliable world-wide communications network have made geographically distributed software engineering a reality. Organisations are becoming more distributed across several countries, and skill shortages and the need for better utilisation of skills result in team members being unavoidably distributed across sites. These factors are forcing organisations to develop strategies and enabling technologies for more successful collaborative working. This paper reports on a case study of such successful collaborative working, with a focus on distributed software development teams in a project. It studies three teams, but looks particularly at one whose members are separated by distance and culture, and outlines the factors important to its success.
Keywords: global software development, collaborative working, distributed software engineering

1. Introduction
Software organizations, in some cases more than others, are increasingly separated by distance, time and cultural boundaries [1, 2, 6, 12, 3]. This challenges software development processes and organizations: there is less direct communication and interaction between team members, traditional team-building measures are harder to apply, and there are even greater possibilities for miscommunication and for misunderstanding of technical details and of common objectives and goals [1, 6, 3, 5, 13, 7]. This case study describes the software development life cycle of a project, from its inception to near completion, whose team members were distributed across the UK and Europe. Although there are no time differences between the team members, they are separated by distance, nationality and culture. Despite these differences, the software development team was considered successful, whereas other teams in the same project, with similar configurations, were not considered to have attained the same level of success. The main purpose of this case study is to identify factors that contribute to successful team collaboration across geographical and cultural boundaries, and to study the impact of the managerial strategy and of the effective use of collaboration tools in the different stages of the software development process, with a focus on the requirements specification, design and implementation stages.
1 University of Manchester, Manchester, M13 9PL, UK, a.taweel@manchester.ac.uk

The study followed a number of steps to establish and outline the main factors contributing to this successful collaboration: determining what the teams did during their work on the project, identifying the collaboration tools used, determining what was seen as a successful team, and identifying the factors that contributed to creating successfully collaborating teams. Section 2 of the paper gives an overview of the study and of the study project, including its organizational and geographical structure. Section 3 details the software development process followed, focusing on the requirements specification and coding phases and outlining the collaboration tools used. Section 4 discusses the main factors identified as contributing to this successful collaborative working; a detailed discussion of the study and a detailed analysis and evaluation of the data are beyond the scope of this paper. Finally, section 5 summarises the lessons learned.

2. Overview of the study and the study project


The study project is a 5-year programme funded to accomplish a set of pre-determined objectives and goals, including research and software outputs and products. The project includes 30 members distributed across five separate locations, with individuals at each site. The project team includes five sub-teams: three software development teams, one scientific team and one management team. The focus of this study is on the software development teams. The project aims to develop a new product based on three legacy products. The product itself is a software product for processing medical information, integrating four main components into a functioning system that eventually interfaces with clinical staff and other third-party hospital systems. Three of these four components are based on legacy systems that require further development; the fourth component is built from scratch. The development teams are made up mainly of new staff employed or brought in specifically for this project, with some members having application domain knowledge and others having experience with the legacy products. The two main characteristics of the development teams are the problems of geographical and cultural separation. The five sites are separated geographically by long distances. The official language of the project is English; however, cultural differences range from different languages (five languages) to different work cultures (three different cultures) and institutional work regulations. Although the objectives of the project were clear, the requirements (more specifically the functional requirements) had not been established. One essential and immediate task for the team was to gather, refine and agree requirements. However, given that the project is developing a new product, there was a set of unknown research and technical problems that needed to be looked at before concrete requirements could be established. The project also suffered a number of the usual start-up problems: new people not used to working together, processes not well defined or understood, and a management style yet to be established. In other words, a software product was being developed with untried organisational and management structures, further complicated by geographical and cultural barriers. The study has looked at the project almost from its start, although the very early parts of the case were studied retrospectively.

Since we are focusing on the collaboration tasks, the technologies used and the activities undertaken in the project, the study aimed to collect data about the critical factors in the success of the software development team as a successfully functioning multi-site team. Data were collected from various sources, including the project historical log, the project management team, the project's evolving documents, and the project development teams. One particular advantage was the structured documentation of the project and of the collaboration tools used in it, in which a significant amount of information was recorded about the development teams' activities, discussions, actions and so forth; from this, factors were studied as to what made the software development team a successful multi-site collaborating team. This recorded information provided a good indication of which activities were considered done well and achieved their objectives, and which were done poorly and only partially achieved, or failed to achieve, their objectives.

3. Software development process


The following briefly describes the development process followed, with a focus on the requirements specification and coding phases.

3.1. Eliciting requirements

One of the major tasks of the project was to establish and refine requirements. The main sources for the preliminary set of requirements were the scientific team and the involved users. However, due to distance and geographical separation, traditional requirements engineering methods, such as face-to-face interviews, focused meetings, etc., were not easily possible [11, 13]. The development team initially experimented with different methods of eliciting requirements to overcome these difficulties, until a general methodology was established. The general approach was based on an iterative cycle that begins with generic requirements and moves towards more specific and refined requirement specifications. It follows three main stages: 1) a preliminary general set, 2) defined, filtered (and prioritised) sets, and 3) finally, refined, well-understood and confirmed sets. To overcome the geographical separation, the first stage was carried out using collaboration tools, specifically web-based Wiki pages with specially designed web forms that allowed users and scientific team members to input their requirements. Although this stage resulted in a long list of requirements, after studying them in some detail the software development team found that more than 26% were repeated requirements, 10% included partial repetition, and 15% were spurious (too futuristic or beyond the objectives). To define, understand and prioritise these requirements, different collaboration tools were used in the 2nd stage. A sub-team of four members, made up of senior members of the studied development teams, the project manager and the scientific team, filtered and classified the initial set of requirements into prioritised categories based on the project's set objectives. Because of the size of this team, and because there is no major time difference between its members, this step was done using collaborative teleconferences with traditional web-based presentation tools. Then, in a wider community including the other development teams, the scientific team and selected users, and focusing incrementally on selected sets of requirements, web-based discussion forums were used to analyse and discuss these requirements until they reached a level at which the development team felt comfortable with the meaning, depth, importance and priority of each requirement to be carried to the next stage. In cases where agreement (or consensus) was not reached, the decision was deferred and delegated to the management team to address in greater detail with the relevant stakeholders.


Most such reported cases related, for instance, to prioritisation of requirements contrary to expectations. In cases where requirements were unclear or ambiguous, either the relevant members or users from whom these requirements originated were consulted further, and/or the requirements were put on a separate list and carried to the 3rd stage for more focused discussion. Because contributions in web-based discussion forums are written and asynchronous, they had the advantage that the involved members could think about their responses at their own pace and minimise potential misunderstandings. This also noticeably helped to reduce the impact of language and cultural barriers [1, 6, 14, 5, 13, 11]. On the other hand, this stage took longer to complete than initially anticipated, mainly due to the nature of asynchronous responses but also due to the absence or busy schedules of relevant users. In a few cases, users did not contribute unless they were individually invited to do so, which caused further delay, although they were repeatedly asked to engage in the process. The 3rd stage, in turn, was required to refine, fully understand and confirm each of the requirements, especially the functional requirements, to be carried forward to the software development teams to start design and implementation. The aim of this stage was to take separate categories of requirements and discuss them in detail sufficient for implementation. Two main types of collaboration tool were used in this stage: video- and teleconferences. However, it was realised early on, in fact from the first few sessions, that large distributed teams are not particularly productive in this type of activity using these tools, especially when disagreements arise. To overcome this problem, similarly to the 2nd stage above, requirements were re-categorised in terms of domain and functionality for the work of focused, smaller sub-teams. As noted, having a smaller team helped to allow greater interaction between team members, to create a more comfortable collaborative environment, and to elicit noticeably greater contributions from (especially less confident or shy) members [3, 6, 14, 13]. Using available presentation tools over teleconferences or videoconferences, the software development team produced prototypes for upcoming sessions, mainly using rapid prototyping tools [16], to illustrate and confirm understanding. E-mails were used occasionally, and both teleconferences and videoconferences regularly, in this stage; however, wherever and whenever possible, videoconferences (such as Access Grid tools [17]) were used to allow greater interaction between members. This experience was also noted by other research reported in [1, 2, 15, 3, 13]. As noted in section 4.2, we observed an important element for using these collaboration tools effectively: the team needs an established working relationship, perhaps through previous experience or through a number of face-to-face meetings. One of the noted effects of this experience is its contribution to the performance of one of the software development teams compared to the other two [9, 1, 6, 11, 10]. On the other hand, occasional breakdowns of the communication networks or of the collaboration tools resulted, in a few cases, in cancelled meetings, which caused significant delay. Therefore, it was essential to set up a long series of meetings as a contingency measure to help overcome this problem.
In cases where input on requirements was required from users or stakeholders with busy schedules, face-to-face meetings were the preferred choice, in most cases to avoid delays and in others because of non-familiarity with the tools. These face-to-face meetings were usually set up after the relevant requirements had gone through several iterations of discussion. Towards the end of this stage, a formal written document of the agreed requirement specifications was produced from all teams and circulated for final amendments. The project manager followed up this process with direct communication with individuals or teams to obtain general consensus from all stakeholders. An added value of the 3rd stage was its positive effect on change management.


While the development strategy in the project was iterative, to accommodate the inevitable change in requirements, the collected data indicate that this stage helped to minimise major changes in requirements in particular and contributed to change management in general.

3.2. Software coding process

After the completion of the first cycle of the requirements specification process, the software development teams, together with the system architect, started the design and implementation processes. As mentioned above, three components are based on legacy products and one was created from scratch. These four components are partially dependent; however, each distributed software development team was working on a vertical subsystem made up of two or more subcomponents, with interdependencies kept to a minimum where possible. The initial phase of allocating tasks to teams was therefore straightforward. Tasks in this phase were allocated to teams based on the logical understanding of the system. This was made easier by the existence of the legacy components, whose dependency pattern was known and not complicated; the main dependency was with the fourth component. In fact, the major part of this phase was completed before or while the teams were being formed. The general architecture of the system was initially created by the system architect and iteratively refined and agreed with the development teams; however, the detailed design of each vertical subsystem was done by the members of the respective development team as per the allocation. As mentioned above, in the 3rd stage of eliciting requirements the requirements were re-classified by domain and functionality, which coincided relatively well with the vertical subsystems; this led members of each development team to engage early on in understanding and prioritising the requirements, which greatly helped to facilitate the design stage. A central source version control system was set up to hold the developed source code, along with an enforced implementation strategy including coding conventions, testing plans, source code updates and so forth, to keep a coherent pattern of development. A central web-based bug tracking system (such as Bugzilla [18]) was set up with guidelines to enable systematic tracking and handling of errors. Although there were general guidelines for the three teams, each team defined the details of its own conventions, working patterns and collaboration mechanisms.
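As an illustration of how such an implementation strategy can be enforced at a central repository, the following minimal sketch (in Python) shows a commit check that ties source code updates to the bug tracking system and to a coding convention. This is our own illustrative sketch, not the project's actual tooling; the bug-reference pattern and the no-tabs rule are hypothetical stand-ins for the project's conventions.

    import re

    # Pattern for references to the central bug tracking system (e.g. "bug #142").
    BUG_REF = re.compile(r"bug\s*#\d+", re.IGNORECASE)

    def check_commit(message, source):
        """Return a list of policy violations for a proposed commit."""
        problems = []
        if not BUG_REF.search(message):
            # Ties every source code update to a tracked bug or task.
            problems.append("commit message must reference a tracked bug")
        if "\t" in source:
            # Stand-in for a real coding convention check (naming, formatting, ...).
            problems.append("coding convention: indent with spaces, not tabs")
        return problems

    # A repository hook would reject this commit on both counts.
    print(check_commit("quick fix", "int main() {\n\treturn 0;\n}"))

A gate of this kind makes the agreed conventions self-enforcing across sites, rather than relying on each distributed team remembering the guidelines.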

4. Successful collaboration: contributing factors


Several factors were identified as contributing to the success of the collaboration of the distributed software development teams, including the design of the development process as outlined above, the choice and effective use of collaboration tools, and the commitment of the team members. The following discussion concentrates on two further factors that had a significant effect, as noted by the team.

4.1. Managerial Strategy and team building

The impact of managerial strategy on the successful working of the team was clearly visible in the teams' responses. One advantage was the experience of the management team and their awareness of the literature on geographical and cultural issues in virtual and distributed software teams.


This generally helped the management team to avoid some of the pitfalls that are easily ignored, for example relying on traditional co-located team-building measures, on assumed trust, or on ad-hoc discussions [9, 1, 4]. With most members of the development teams newly employed or brought in, it was essential to establish a working relationship between them, at least within each individual team. A managerial factor that contributed significantly to the successful working of the team was setting up a pattern of initial face-to-face meetings. These working meetings helped to bridge gaps between members, to establish social and working knowledge between them, and to sow the seeds of trust in the teams [6, 10]. Other factors in building the teams, such as selecting complementary skills, domain knowledge, and minimizing cultural barriers, were also taken into account [9, 1, 6]. Also, since there were no immediate software deliverable deadlines, the software teams had sufficient start-up time to form a trustful working relationship. The importance of such factors in the team-building process has been noted elsewhere [9, 1, 6, 4, 13]. One of the noted managerial factors that contributed to successful working together in such a setup is establishing a pattern of self-management for each team and at each site. For example, the objectives for each team leader were outlined at the beginning, and each was given the context in which they were expected to work and to use their own initiative to drive the team to get the work completed. From the teams' responses, this approach gave the teams the freedom and flexibility to drive their own work in the way they saw as best suited to their own working pattern. This strategy was initially adopted as a result of the management team's experience of the positive effect it can have on teams' self-esteem and productivity, but, as the management team noted, its even greater effect on creating a successful distributed team was not realised at first. With this flexibility in hand, team leaders established their own working patterns and set their own focused vision of what needed to be done and the plan to achieve it. Thus the teams ended up with their own agendas but agreed on a shared critical path (e.g. for dependent components), critical resources (e.g. video/teleconferencing facilities, or shared staff) and meeting points (intermediate and final). Team leaders took the initiative to work and plan ahead proactively rather than wait to react when crises arose. The software development team leaders fruitfully exploited the context for self-management by successfully managing themselves. Another managerial factor that kept the system under a common vision was the bringing in of a system architect who brought the system's various technologies and development threads together. The architect was a common point of contact for team leaders, which helped to reduce friction and the amount of interaction otherwise needed for interdependent components. The architect established two main interfacing activities in the working pattern between members and teams: integration fests and design fests. The latter were used to set a commonly shared vision of the technical design of interdependent (or in some cases challenging) components, while the former were used in the initial stages to establish application interface patterns and, in later stages, to integrate components or resolve integration issues.

4.2. Collaboration and collaboration tools

Various collaboration tools were used in the project at various stages of the development process. Initially, the focus was on face-to-face meetings, mainly for the teams and team members to get to know each other and to help establish a working relationship and trust.


These factors were seen as essential by the management team. During this period, however, a limited combination of video- and teleconferences and web-based collaboration was also used. As the project advanced, the use of these collaboration tools became more common. Wiki pages and discussion forums were used as the main web-based collaboration tools to share documents and information and to host open discussion pages and forums. Video- and teleconferencing tools were used to substitute for face-to-face meetings, and e-mails, shared repositories and other tools were also often used. The importance of using suitable collaboration tools was realised early on in the project by the management team. Their function was seen not only as facilitating collaboration between team members but also, if effectively used, as creating a common hub of information that provides project-wide transparency of all components' and teams' (including the management team's) actions, activities, outputs, work plans and so forth. This transparency was seen to at least partly substitute for some of the activities that commonly exist in non-distributed teams, such as ad-hoc discussions and social or informal face-to-face meetings, which are important factors in the team-building process [1, 6, 4]. One essential factor for the successful working of the distributed team in general, and of the distributed software development teams in particular, is how successfully the teams and the members of each distributed team collaborate or work together. While the final goal is perhaps the quality of the output of each team and of the teams collectively, the process of having a successfully collaborating team was seen by the management team as the key to achieving this final goal. The argument is that a successfully working team, despite the geographical and cultural problems, will have a higher chance of repeatedly producing quality outputs, as opposed to one that has merely done a good job once. In other words, for a team, being successful has greater implications than having achieved its objectives once; the operational behaviours of the team are greater determining factors for a possibly continuing successful collaboration. The main factors derived from the recorded information and from the interviews with the software development team are: the use of the collaboration tools; the effectiveness of the tools; the way the tools are deployed and used; the teams' effectiveness in planning their activities; the teams' ability to self-manage; the working relationships between teams and team members; the quality of the output product; and the teams' ability to overcome the geographical and cultural problems. The management team, on the other hand, while generally agreeing with the above factors, placed less emphasis on the first three: while they recognise their importance, they believe these are contributing factors rather than essential to achieving the latter ones. The managerial strategy followed, including the measures undertaken in creating the teams, was also identified as a factor contributing to a successfully working distributed team.

5. Lessons learned
A number of lessons have been learned from this study; they are summarised below.

Suitable collaboration tools: It was clear that the use of suitable, robust, and simple but effective collaboration tools is essential to the success of the team. Tools that developers are not familiar with, that need significant training, that are complex to use, or that have cumbersome interfaces will eventually be abandoned.


Team members were hesitant to use tools with overloaded functionality, as such tools usually take longer to set up and load and are likely to confuse users and divert them from the main functionality.

Established working relationships: Face-to-face meetings were essential at the beginning of the project to help establish working relationships, trust and work patterns between members of the team. It was noticed that the studied team had more initial time to establish good working relationships, which later contributed to the team's productivity: compared to the other teams, it was usually faster at finding workarounds and resolving conflicts between team members.

Documentation: Precise and complete documentation is a critical factor for reducing ambiguities and ensuring consistency. Some of the collaboration tools provided part of the documentation, such as decisions taken and identified requirements or problems, as an added value; it is therefore very helpful to use suitable collaboration tools, especially when it comes to finding or referring to information. Documentation includes details of requirement specifications, design methods and notations, and coding conventions.

Flexible planning, clear objectives and work schedules: While good planning is crucial for the success of a distributed team, it is evident that inflexible plans add unnecessary pressure on team members, which may well affect their productivity. One noticeable factor was the managerial strategy that allowed each team self-management, which gave teams the distinct advantage of feeling in control and of better accommodating controlled deviations from planned schedules.

Interfaces and development methods and tools: One of the factors that emerged during this study was the importance of having a defined set of interfaces between the developed components. Design fests and integration fests proved very effective activities for achieving this aim. Having a compatible set of development methods and tools at the participating sites also reduced the pain and difficulty of moving code between team members and facilitated code testing on different machines.

Although the above lessons give a clear indication of some of the main factors contributing to creating successful collaboration in distributed software development teams, a number of issues and questions remain that need further investigation. For example, the team-building measures do not clearly show methods of retaining successful teams, or of rebuilding a team after it has failed, nor do they indicate the effect on a distributed team when staff leave or join in the middle of, or towards the end of, a task. One such example was recorded in one of the teams: although slowed productivity was indicated, the case was not sufficiently documented to analyse. Such effects depend not only on the new staff's abilities and experience but also on the timing and on the critical and dependency paths of the development.

Acknowledgment
The author thanks all colleagues and team members who contributed to this study.

References
[1] CARMEL, E., Global Software Teams: Collaborating Across Borders and Time Zones, Prentice Hall, Oct 1999


[2] GORTON, I. and Motwani, S., Issues in Co-operative Software Engineering Using Globally Distributed Teams, Information and Software Technology, Vol. 38, pp. 647-655, Jan 1996.
[3] GORTON, I. and Motwani, S. (1994), Towards a Methodology for 24-Hour Software Development Using Globally Distributed Development Teams, Proc. of the 1st IFIP/SQI International Conference on Software Quality and Productivity, Hong Kong, pp. 50-55.
[4] GERMAN, D., The GNOME Project: a Case Study of Open Source, Global Software Development, Software Process Improvement and Practice 2003; 8: 201-215.
[5] HERBSLEB, J. D., Grinter, R. E. and Finholt, T. A. (2001), An Empirical Study of Global Software Development: Distance and Speed, Proc. of ICSE 2001, Toronto, Canada, pp. 81-90.
[6] ISHII, H., Cross-cultural Communication and CSCW, in L. Harasim (ed.), Global Networks: Computers and International Communication, Cambridge/London: MIT Press, pp. 143-152, 1993.
[7] KRAUT, R. E. and Streeter, L. A. (1995), Coordination in Software Development, Communications of the ACM, Vol. 38, No. 3, pp. 69-81.
[8] LAU, F., On Managing Virtual Teams, Technical report series 1999 No. 1, University of Alberta, URL: http://www.bus.ualberta.ca/flau/Papers/cacm.htm, March 1999.
[9] MCGRATH, J., Time Matters in Groups, in [8], pp. 23-61.
[10] PAETSCH, F. et al., Requirements Engineering and Agile Software Development, 12th Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, Austria, p. 308, 2003.
[11] PRIKLADNICKI, R., Audy, J. and Evaristo, R., Global Software Development in Practice: Lessons Learned, Software Process Improvement and Practice 2003; 8: 267-281.
[12] SUZUKI, J. and Yamamoto, Y. (1999), Leveraging Distributed Software Development, IEEE Computer, Vol. 32, No. 9, pp. 59-65.
[13] SEAMAN, C. B. and Basili, V. R. (1997), Communication and Organization in Software Development: An Empirical Study, IBM Systems Journal, IBM Centre for Advanced Studies, Vol. 36, No. 4, pp. 550-564.
[14] TAWEEL, A. and Brereton, O. P., Developing Software Across Time Zones: An Exploratory Empirical Case Study, International Journal Informatica, December 2002.
[15] VAN FENEMA, P. C. (1997), Coordination and Control of Globally Distributed Software Development Projects: The GOLDD Case, Proc. of the 18th International Conference on Information Systems, Atlanta, GA, pp. 474-475.
[17] See www.accessgrid.org
[18] See www.bugzilla.org


EXTREME PROGRAMMING IN DISTRIBUTED SOFTWARE DEVELOPMENT: A CASE STUDY

An Ngo-The 1), Kiem Hoang, Truc Nguyen 2), Nhien Mai 3)

Abstract
This case study reports an ongoing effort to apply eXtreme Programming (XP) at Quantic, a Vietnamese IT company specialized in subcontracting outsourced software projects. While many other case studies address distributed software development from the perspective of the organization owning the projects, this study is special in that it approaches the problem from the perspective of the organization subcontracting the projects (the subcontractor). The study serves as the preliminary experiment of a research program concerning the application of agile methodologies in distributed software development. While one can argue that the lack of face-to-face contact makes distributed software development irrelevant for agile methodologies, we believe that agility would be the best response to communication issues that are too complicated to handle with a rigid process. Even when co-location is impossible, the agile approach can still inspire other flexible and efficient means of addressing communication issues. It is still too soon to talk about solid findings, but what we have observed reinforces our belief in the ability of the agile approach to address difficult issues in distributed software development.

1. Introduction
Quantic Ltd. (http://www.quantic.com.vn) is a Vietnamese IT company specialized in outsourcing, with clients in North America, Europe and Asia, including Nortel and Cisco. Until now, the projects received have been either small projects or components of larger projects; typically, the size of a project team is no more than ten. The company has its own software process but is ready to follow a customized process when required. As a subcontractor, the challenges faced by Quantic are not the same as those faced by project owners such as Nortel and Cisco. For example, it does not have to face the problem of coordinating different (sub-)project teams across many sites: the complexity of communication is reduced to two sites (the customer and the team). On the other hand, while the decisions concerning the development process are crucial for both sides (customer and subcontractor), the voice of the subcontractor's side is usually weak. Currently, one important concern of Quantic is to improve its software development process:
- to increase the efficiency of the communication between the team and the customer (e.g. understanding of the requirements, handling requirements changes on time);
- to increase the ability to comply with customized processes.
1) University of Calgary, ango@cpsc.ucalgary.ca
2) Center for IT Development, Vietnam National University-HCM City, {hkiem, ntttruc}@citd.edu.vn
3) Quantic Ltd, mai.hao.nhien@quantic.com.vn


In order to achieve such flexibility, agile methodologies [1] seem to be the most appropriate candidates. The company, in cooperation with the Center for IT Development (CITD), Vietnam National University at Ho Chi Minh City, has initiated a research program to investigate the application of agile methodologies in the development of outsourced software projects. It decided to start with a partial deployment of eXtreme Programming (XP) [3] in a suitable pilot project. Feedback from the project manager, the team members and the customer will be considered before deciding whether to continue exploring the application of XP. This paper reports this case study and is organized as follows: Section 2 describes the motivation for research on methodologies at Quantic; Section 3 discusses the rationale for the choice of the agile approach and of XP; Section 4 presents the deployment of XP in the pilot project; Section 5 discusses the lessons learnt and further research.

2. Motivation
In an outsourced project, the customer has a decisive voice in the development process. This means that there can be a different process for each customer. The positive side is that the company and its employees acquire a diversified portfolio of processes; the negative side is that it is more difficult for the company to organize a simple framework to capitalize on the experience and knowledge obtained from its projects. Currently, research in software process concentrates on the perspective of the organizations owning a project, and little is done from the perspective of the organizations subcontracting an outsourced project. We see here a need to explore new approaches to face this challenge. Furthermore, at Quantic, we observe that customers have applied certain measures to address the challenges of DSD. Here are some examples:
- Outsourced projects are either independent or loosely coupled with others. This reduces the need for intensive collaboration and therefore alleviates the communication problem [4].
- Liaisons, engineers who move from the development site to the customer site and stay there for a certain time, are used not only to improve technical understanding but also to develop relationships [2].
- Reduction of cultural distance [4] is achieved by the integration of a cultural liaison, an expatriate Vietnamese or a foreigner living in Vietnam, in almost every team.
However, many of the processes used are mainly variations of the waterfall model with some improvements (e.g. dividing the project into small sets of features so that each set can be finished in a shorter time). In this situation, despite all the measures to improve communication, misunderstandings and changing or new requirements are frequently detected at delivery. These problems can easily be related to the deficit of communication, which is almost unavoidable in an outsourced project. From the technological perspective, there is little we can do, since the company has used every available technique, such as telephone conferences, net meetings and groupware. Therefore, if there is a way to improve the situation, it must be sought from the methodological perspective.


Until now, our voice in the decisions related to the development process has been weak. As a stakeholder, we argue that active participation on our part would be beneficial to both sides. Conducting our own research on methodology will help us to promote this idea and to play this role more effectively.

3. Rationale of the choice of agile approach and XP


Before engaging in this experiment, we needed to consider further the arguments for and against the application of the agile approach. The introduction of the agile approach is not very welcome in Vietnam for the following reasons:
- The market is dominated by outsourcing, and it is widely believed that the lack of face-to-face communication makes agile methodologies, particularly XP, irrelevant.
- It is also believed that the deficit of communication should be compensated for by more formalized management processes and more comprehensive documents.
- Agile methodologies are still very controversial and not yet established.
We address these concerns by observing that:
- Agility would be the best response to communication issues that are too complicated to handle with a rigid process. Principles such as co-location and on-site customer should be interpreted in an agile manner and should not be used to invalidate the approach when they cannot be literally implemented.
- As pointed out by Simons [10] and Fowler [6], we do need more documents to compensate for this deficit of communication, and this can be very well addressed within agile methodologies.
- The fact that success stories are anecdotal is true not only of agile methodologies.

As for the appropriateness of agile methodologies for outsourced projects, our belief has been reinforced by the experiences summarized by Simons [10] and Fowler [6]. Other authors have pointed out that practices of agile methodologies are applicable and efficient in the context of DSD, e.g. iterative and incremental processes [8] and test-driven development [9]. We have chosen XP because:
- The practices of XP are concrete, intuitive and convincing, and can easily be accepted by the project managers and developers (of Quantic);
- XP can be deployed partially and incrementally, so the fact that certain practices are not convincing enough to everyone and need more preparation does not prevent its application. Since the initial commitment is light, the risk is low too, and therefore people are more willing to try.

4. Case study
4.1. Selection of the pilot project

A Web application for a Japanese customer was selected as the pilot project because:
- It is independent and small, and its requirements are highly volatile;
- The cultural distance between Vietnam and Japan is small;


- Most importantly, the customer is open to our suggestions concerning the development process, and the customer's side is committed.

4.2. Selection of XP practices

Since we had no previous experience with XP, and neither formal training nor an external coach was available, we had to be very careful in selecting the practices to deploy in this first experiment. We divide the 12 XP practices into three categories.

4.2.1. Strictly applied

In this category, we clearly indicate that the practices must be strictly respected. Four practices fall into this category: testing, refactoring, coding standards and simple design. The developers are trained and required to follow the guidelines in [3], [5], [7] to implement these practices at work. These practices are the best understood by the team members and almost completely under our control. No matter which process is imposed by the customer, these practices can be applied and help to improve the quality and productivity of the developers. Coding standards must be understood at the project level, i.e. when there is a conflict between our own standards and those of the customer, the latter prevail.

4.2.2. Adapted

These practices are adapted to the specific context of the project. Some are not completely under our control: we can only apply them to the extent that they make sense and remain in compliance with the customer's standards. Others are not completely understood by everyone, and therefore the team is not ready for their application.
- Continuous integration: Coding assignments are broken up into small tasks, preferably of no more than one day. When each task is completed, it is integrated into the collective code base. As a result, there are many product builds each day.
- Small releases: This practice and the previous one are strictly related. However, their implementation depends on the customer's collaboration. We try to have small releases, but sometimes the customer is not available to give feedback on them.
- On-site customer: This is obviously impossible. However, as suggested by Simons [10], a proxy customer is good enough.
- Pair programming: This practice is subtle and needs an appropriate coaching program; the team is not ready. In general, few people (in Quantic) strongly believe in the virtue of this practice. Therefore, we just explore it by requiring each developer to spend about one half-day per week pairing with the project manager.
- 40-hour week: The idea sounds attractive but is not easy to apply rigorously. We keep the main idea that overtime should be exceptional. Developers need more preparation to be able to make this practice beneficial to everyone (the management, the customer and the developers).
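As a concrete illustration of the strictly applied testing practice, the sketch below (in Python) shows a test-first unit test of the kind the guidelines in [3], [5], [7] prescribe, run on every integration build. The validated function and its domain are hypothetical, invented purely for illustration; the pilot project's actual code is not described in this paper.

    import unittest

    # Hypothetical production function for one small task (our invention, not
    # the pilot project's code): validate a postal code of the form "NNN-NNNN".
    def is_valid_postal_code(code):
        parts = code.split("-")
        return (len(parts) == 2 and len(parts[0]) == 3 and len(parts[1]) == 4
                and all(p.isdigit() for p in parts))

    class PostalCodeTest(unittest.TestCase):
        # Per the testing practice, these tests are written before the code.
        def test_accepts_well_formed_code(self):
            self.assertTrue(is_valid_postal_code("163-8001"))

        def test_rejects_malformed_codes(self):
            for bad in ("1638001", "16-38001", "abc-defg", ""):
                self.assertFalse(is_valid_postal_code(bad))

    if __name__ == "__main__":
        unittest.main()  # run on every integration; there are many builds a day

Keeping each task this small, with its tests integrated into the collective code base the same day, is what makes the continuous-integration practice above workable without an on-site customer.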


4.2.3. Recommended

These practices (collective ownership, metaphor, planning game) are not yet well understood and need more preparation. For example, collective ownership can only be achieved gradually as the pair programming practice becomes standard.

4.3. Deployment

The pilot project started in February 2005 and is planned to finish at the end of August 2005 (six months). The team has six members. The customer has a representative in Vietnam who can play the role of a proxy customer but does not substitute for the customer. The team communicates with the customer (in Japan) using email, telephone conferences, documents, etc. The proxy customer's role is to supplement this communication by clarifying the subtle points that are difficult to get across through other means of communication, bridging the differences in national and organizational culture, and materializing an informal relationship between the two sides. This person is not 100% dedicated to the project, but at least a face-to-face meeting (even an informal one) with the team can easily be arranged. The official language is English. The proxy customer can speak a little Vietnamese, and the project manager speaks Japanese fluently. The planning is still organized in a traditional manner with a blend of XP philosophy: it consists of releases and iterations as suggested by XP, but user stories are not used. The customer does not commit to writing them, and the project manager and the team are not eager to promote them. Instead, just enough requirements documents are sent to the team and clarified through discussion (telephone conferences and/or with the assistance of the proxy customer). The team breaks the requirements down into tasks. Task assignment and effort estimation are done by negotiation between the project manager and the developers. Past performance is used to estimate effort.
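As a small illustration of estimating effort from past performance, the sketch below (in Python) averages the recorded effort of similar past tasks. The task categories and figures are hypothetical; the paper does not report the project's actual estimation data.

    def estimate_effort(history, category):
        """Average actual effort (person-hours) of past tasks in one category."""
        actuals = [hours for cat, hours in history if cat == category]
        return sum(actuals) / len(actuals) if actuals else None

    # Hypothetical past performance records: (task category, actual person-hours).
    past = [("ui-form", 14), ("ui-form", 10), ("db-query", 6)]
    print(estimate_effort(past, "ui-form"))  # 12.0, a starting point for negotiation

Such a figure is only a starting point: the final estimate is still settled by negotiation between the project manager and the developer taking the task.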

5. Conclusions
We are not yet at the end of the pilot project; it is still too soon to draw conclusions about the appropriateness of XP at Quantic, let alone in Vietnam. However, following the progress of the project and the feedback from the developers, the project manager and the client, what we have observed so far is very encouraging. The average working time of the team is 44 hours/week, i.e. only 4 hours of overtime. This is still far from the ideal 40-hour week, but it is already considerable progress. Putting it in the context of Vietnam, where overtime (in IT companies) is a social norm (though not paid), we consider this achievement very significant. We observe that an important part of the misunderstood issues is due not to the lack of face-to-face communication, about which we can do little, but to the lack of product-user communication, which is very efficiently addressed by an agile approach, XP in our case. Even without proximity between the customer and the team, many issues concerning requirements have been detected early (not in the traditional sense; here, "early" means that little effort had been spent). The team, with experience of other projects developed with traditional methodologies, finds the XP practices practical and efficient, and the team's morale has been high during the project. The customer is also happy with the progress of the project; the early and frequent contact with the product is reassuring to the customer.


The customer establishes a sense of ownership of the product soon after some initial releases and feels much more in control of the project, despite the lack of direct contact with the team. When the plan needs to be modified due to changes in requirements, this understanding greatly facilitates the negotiation. We find that these releases are themselves a means of communication. Even though this project has not yet finished, Quantic has prepared to apply XP in other projects and to continue to cooperate with CITD. Our future research will explore the following directions:
- To fine-tune XP in the context of a subcontractor, which has no full control over the decisions related to the development process. This direction will investigate the question of adapting XP practices to different development processes.
- To investigate the role of releases as a means of communication.
We see an important role in DSD for research on methodologies from the perspective of the subcontractor; it can also be useful to other forms of DSD (e.g. offshore units within an international organization). In conclusion, globalization in software engineering is not a question of the dominance of one global process, but of the synergy of different processes coexisting harmoniously.

6. Acknowledgement
The authors gratefully acknowledge the financial support of the Alberta Informatics Circle of Research Excellence (iCORE), Quantic Ltd and CITD.

7. References
[1] http://www.agilealliance.com
[2] BATTIN, R.D., Crocker, R., Kreidler, J. and Subramanian, K., Leveraging Resources in Global Software Development, IEEE Software, March/April 2001
[3] BECK, K., Extreme Programming Explained, Addison-Wesley, Boston, 2000
[4] CARMEL, E. and Agarwal, R., Tactical Approaches for Alleviating Distance in Global Software Development, IEEE Software, March/April 2001
[5] FOWLER, M. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.

[6] FOWLER, M., Using Agile Software Process with Offshore Development, http://www.martinfowler.com/articles/agileOffshore.html, April 2004
[7] JEFFRIES, R., Anderson, A., and Hendrickson, C. Extreme Programming Installed. Addison-Wesley, 2000.

[8] PAASIVAARA, M. and Lassenius, C., Using Iterative and Incremental Processes in Global Software Development, in Proceedings of the ICSE Workshop on Global Software Development, May 2004
[9] SENGUPTA, B., Sinha, V., Chandra, S., Sampath, S. and Prasad, K.G., Test-Driven Global Software Development, in Proceedings of the ICSE Workshop on Global Software Development, May 2004
[10] SIMONS, M., Internationally Agile, InformIT, 15.03.2002


ORGANIZATIONAL PATTERN MINING - A RETROSPECTIVE ON XP IN A LARGE SCALE SOFTWARE DEVELOPMENT PROJECT


Mark Sheppard
University of Limerick / LogicaCMG Mobile Networks

Abstract
This paper examines the role of XP in a large scale distributed software development project, highlighting how this agile method contributed to the successful delivery of a large scale telecommunications messaging system. Additionally, it explores the organizational patterns that are evident in the project and relates the relevance of these organizational patterns to the context of distributed software development. Using an effective and efficient development process can have a significant impact on a software development project; employing a process that is lightweight, flexible and focused will contribute to a project's success. An adapted form of XP was successfully applied to a large scale telecommunications systems development project, and the contribution of XP to this product development was significant. One can thus conclude that XP can be applied to large scale software product development, but not without some level of adaptation. Additionally, the project structures and development process used in this project provide significant empirical evidence about the successful running of a large software development project. These structures and processes are worth capturing, and in this respect the organizational patterns described in [13] provide a way of capturing this knowledge. Thus, by examining the experiences of a successful distributed software project, it is possible to capture and document this experience and provide insight into organizational design for GSD.

1. Introduction
In today's highly competitive and fast-moving global economy there is a demand for the timely delivery of high-quality products. The emphasis is on reducing cost and time to market, and the need for an effective and efficient software development process is therefore evident. This emphasis on cost reduction and time to market is likely to lead, as with globalization in general, to the outsourcing of software development. Carmel [11] provides a taxonomy of structural arrangements for distributed software development, and Cockburn [12] also highlights a range of possible scenarios: multi-site, offshore, and the open source model. The significant characteristic is that there exists a virtual team, dispersed both in time and location. The driving factors are many; they include mergers, strategic partnerships, cost reduction, increased productivity, faster product development, closeness to a market, and so on [11]. The context of geographically dispersed multi-site development introduces many challenges and impediments. For distributed software development to be an effective (especially cost-wise) method for software system and product development, these must be addressed and overcome.


These challenges span the issues of infrastructure, tools, communications, culture, trust, co-ordination, integration, and so on. The use of agile development methodologies has been seen to successfully deliver high-quality, large-scale software systems [42], [20], [45], [19]. Typically the application of XP has been targeted at small to medium-sized projects; in [42] the pure XP model was adapted for application in a large-scale software development project, as summarized in section 3. Thus, if XP (and other agile development methods) can scale in terms of project size, it is pertinent to ask whether it is possible to utilize aspects of this development approach in the context of distributed software development. However, one of the central characteristics of XP and agile development, as highlighted in the agile manifesto [22], is interaction and communication, which occur within the project team and with the customer. It is widely recognized that distance significantly impacts communications [32], [18], [25], [11]. Additionally, many other factors identified as challenges to be overcome, such as different time zones, culture (national and organizational), and infrastructure, impact communications and interactions. Thus significant challenges exist and must be solved to successfully apply, adopt, and adapt agile methods in a distributed context. Nonetheless, these are not impossible challenges, and there is evidence that agile techniques can be successfully employed in a GSD context [25], [43], [44]. The promise of XP and other agile methods as an effective and efficient development process, with a significant impact on a software development project, makes them enticing candidates for distributed software development. Employing a process that is lightweight, flexible and focused will be of significant benefit to any software development project. To pursue this goal further, it is necessary to look beyond pure process, tools, and infrastructure: it is necessary to examine the organizational structure, roles, relationships, interdependencies, collaborations, and communication networks and patterns among organizational units successfully engaged in GSD, and to explore, identify, extract and document the organization and process patterns of GSD organizations as they overcome the barriers of time and distance.

This paper explores the successful use of XP on a large-scale software development project which was also distributed in nature. The exploration is from an organizational pattern perspective: that is to say, this paper provides a retrospective review of the software development project, highlighting the significant aspects of the approach taken, together with its successes and failures. This experience is then cast in terms of how it relates to the organizational structures and processes captured in Coplien's organizational patterns [13]. The significance of this study is that organizational patterns are considered to embody and capture many of the important characteristics and structures of an organization engaged in software development.

The structure of this paper is as follows: Section 2 provides background on global software development issues and introduces organizational patterns. Section 3 describes the project context, together with the successful adaptation of XP to large-scale software development; section 3.1 summarizes the main conclusions from the project.
Section 4 undertakes a pattern mining exercise: it examines the project's organizational structure and relates it to the organizational patterns in [13]. This contributes to the validation of organizational patterns as structuring tools for organizations, which can lead to effective and efficient software development.


Section 5 discusses the merits of this exercise, and section 6 provides some conclusions and highlights the future direction of this work.

2. Background and Foundation


This section briefly examines the background streams to the case study project described in section 3. There are fundamentally three streams of interest: agile development, distributed software development, and organizational patterns. The inter-relationship between the three streams is established by the fact that the project under review was a large-scale software development project, which used XP as its development methodology, and was distributed in structure.

2.1. GSD and the Distributed Software Development Challenge
A central theme in much of the literature is the impact of distance and time, i.e. geographical dispersal, on the communications of a project or an organization [11], [17], [21], [32]. A number of common issues and problems exist in many software development projects but are amplified in the context of distributed software development. These issues have been addressed extensively in the literature. For example, Carmel [11] provides a taxonomy of distributed organization structures and a number of strategies for overcoming the impedance of distance on a development project: reducing intensive collaboration, reducing cultural distance, and reducing temporal distance. Herbsleb [32] examines the issues under the thesis of Conway's Law, where integration and co-ordination come to the fore and where some of the informal activities that are an integral part of software development are severely impacted by distance. Damian [17] examines the use of groupware technology to overcome the impact of geographical dispersal. These challenges, stated briefly and incompletely, include:

Infrastructure: creating a distributed working environment that will promote easy and effective collaboration on software development. Achieving agreement on the use of a common set of development tools, SCM and integration strategies for effective multi-site collaborative working is often a challenge [4], [10], [36].

Culture: overcoming cultural differences in modes of working, approaches, ethos, trust, ownership and so on. Establishing appropriate levels of trust among project participants is essential to a project's survival and success [11], [41], [35].

Communications: it is essential to have effective and efficient communication mechanisms. This issue also has a cultural aspect: it is important that a common language and understanding exist among the collaborating parties. Learning to communicate effectively in an open and transparent manner is a significant challenge [37], [36], [11], [44], [23].

Methodologies: different approaches and processes can exist between sites. How are these differences harmonized? How are the working practices aligned? How can agile methods be applied, for example Distributed XP [19], [36], [40], [6], [23], [44]?

Recent work by Fitzgerald [20] highlighted the use of XP as the development paradigm and SCRUM as the management methodology. Additionally, [21] highlighted the communications gap that emerged when transforming a co-located team into a multi-site, inter-continental team in different time zones.


This is significant in that it highlights the difficulties of dispersing a team geographically, even in situations where there is a strong foundation in terms of infrastructure, culture, and trust. The central issue and overriding common theme in all of these issues is the impact on communications, and the tools and strategies needed to overcome this impedance.

2.2. Agile Development: the focused, timely delivery of quality working software systems
What is agile software development? This is a far-reaching and very broad question, and a concept that has been addressed extensively in the software development literature [1], [12], [22], [30], [24]. Not only is agile development concerned with effective and efficient ways of producing software, it also embodies a philosophical approach to how software can be developed. This is captured in the Agile Manifesto [22] and is a central theme in the writings of Highsmith and Cockburn [30], [31], [12]. At this point it is worth recollecting the central assertion of the Agile Manifesto: we are uncovering better ways of developing software by doing it and helping others to do it; through this work we have come to value:

individuals and interactions over processes and tools
working software over comprehensive documentation
customer collaboration over contract negotiation
responding to change over following a plan.

The value emphasis is on the items on the left-hand side. The right-hand side items are important and an integral part of any project, but they are not the most important and should be considered as supportive of the main development activity. There are a number of methods and methodologies under the agile umbrella, such as XP, SCRUM, ASD and Crystal, and these are described, compared and contrasted succinctly by Abrahamsson in [1]. One of the main characteristics of agile development is a focus on the delivery of working (high-quality, tested) software systems or products that are in line with customers' expectations. The software deliverables are produced in a timely fashion, and they work as per what the customer has asked for and what the customer wants (sometimes these are not congruent). In achieving this goal, agile software development embraces change and produces the deliverable products in an evolutionary manner. With this approach, the overall development cycle is divided into a number of iterations, each contributing a certain amount of working functionality. Agile methods do not attempt to develop systems in monolithic phases, as characterized by the waterfall method; agile avoids the BDUF (big design up front) and big bang integration syndromes through the use of iterations. The iteration and its associated delivery provide a focal point for obtaining feedback; additionally, in the context of GSD, they provide a synchronization and integration point. Iterations force decisions to be made [38]: decisions about what should be developed, and decisions about what has been developed. They are essential in controlling risk and in managing a project's progress. Thus, agile methods have a strong emphasis on communication and interaction, on continuous integration and delivery of working software, on close collaboration and co-location, and so on. It is also seen that geographical dispersal in time and distance impacts significantly on many of these core aspects of agile methods.
Therefore some significant questions are posed: do agile methods scale appropriately, and can they scale in time and space? Can they be applied to globally distributed software development?


Is it appropriate to apply them in the context of distributed software development, or should some other agile paradigm or methodology be used for GSD? The desire, or objective, is to make distributed software development effective and efficient. Establishing effective communications poses a significant challenge to multi-site development and globally distributed software development. We are seeing the emergence of empirical information on the use of agile methods in a DSD context [23], [43], [44]. These projects have used XP for their offshore development and report some significant findings, from which some patterns for GSD emerge; we look at these in section 6. The findings from these projects include: the use of site ambassadors; the use of a wiki web as a common repository for information; separating teams by functionality, not by activity (Conway's Law); expecting more process and documentation; not underestimating the impact of culture; and using synchronous communication as much as possible (phone and instant messaging).

2.3. Organizational Patterns: building organizations that produce quality software
In [25] Martin Fowler presents proposals for consideration when using XP and shows how it is possible to adapt it as a process to a particular project context, an approach analogous to software re-use within OOAD. This leads us to generalize on this idea of variation on a theme and to consider using OOAD techniques when composing, evaluating and structuring a development process. OOAD has espoused the theme of software re-use from its inception. This concept of re-use has been cultivated in the context of software design and software architecture by the application of Alexander's design pattern ideal to the software architectural domain, as detailed in the GoF's Design Patterns: Elements of Reusable Object-Oriented Software [27]. Similarly, Coplien [13] has taken the concept of the design pattern and used it in his study of organizational structure. Patterns span a multitude of domains; the focus of this work is on organizational structure and development process. In this sense we are interested in the aspects described in a range of process patterns, for example SCM patterns [9], [10], analysis and design in Caterpillar's Fate [33], requirements capture in RAPPeL [46], the SCRUM development process [8], Episodes [16], Process Improvement [5], A Development Process Generative Pattern Language [15], customer interaction patterns [39], and organizational patterns [13], [28], [29]. The origins of software patterns and organizational patterns lie in Alexander's architectural patterns used in the design of buildings, towns and cities, as related in the much-referenced works The Timeless Way of Building [2] and A Pattern Language [3]. Coplien [14] provides a definitive briefing on the patterns concept. These concepts are also explored in the writing of Gabriel [26], who refers to the concept of the quality without a name which patterns seem to espouse. In Alexander's work there is a human element to patterns in terms of their use and application, and this leads to an improvement in the quality (of life) through piecemeal growth and incremental repair. Similar sentiments are expressed with organizational patterns [13], but the context is that of the social structures of software development organizations. This is the foundation for the research described in this paper.
In essence, a pattern can be considered the documentation of a problem and its solution. The documentary form has a specified structure composed of: the pattern's name; the intent, which summarizes what the pattern does; the context, describing the scope and area of application of the pattern; a statement illustrating the problem domain, thus providing greater understanding of the pattern and its use; the issues or forces that are involved and addressed; and a description of the solution and the reasoning behind why the pattern works. Names are very important for patterns, and good names are essential. A pattern's name should encode the meaning of the pattern and evoke the essence of the problem and its solution. This is especially important in the domain of organizational structure and development process design. It is desirable that pattern names have an immediate impact, that they encapsulate the essence of the problem and solution addressed in the pattern; there should be a certain semantic resonance to the pattern name. Pattern names should be communication enablers that form part of the design vocabulary. Coplien [13] describes a pattern as: a recurring structural configuration that solves a problem in a context, contributing to the wholeness of some whole or system, that reflects some aesthetic or cultural value. A pattern in itself provides a solution to a particular design problem. When patterns are brought together as a collection, and relationships between them are established, they can be used to build systems, define architectures, establish frameworks, create structures and so on. In this way the collection of patterns defines or establishes a pattern language. When considering patterns as design problem/solution pairs, it is seen that they are empirical in origin, in the sense that patterns are not invented or created, but found on a recurring basis. As such, they capture important empirical knowledge relating to design structures, practices, and techniques. This knowledge, captured as patterns, is then available for re-use and can be applied in new contextual settings. Patterns capture important practice and often hidden structure. Patterns per se do not exist in isolation: they have relationships with other patterns in the domain context. They inter-work with other patterns, which means that patterns build on each other and establish a flow of problem resolution that helps build systems, in this case systems of human endeavour. The connectedness, relationships and collaborations among patterns help weave a pattern language. A pattern language enhances the power of a pattern: through it, a network of solutions is created and evolved by means of piecemeal growth and incremental repair, and thus system architectures are generated. The path taken will depend on the context in which the patterns are being applied and on what forces are being resolved. A pattern language takes you on a journey, a journey leading to a solution through the interconnections among the patterns in the language. With respect to organizational patterns, they generate the structure and behaviour of an organization through the interplay and working together of the patterns. The pattern language links the patterns together and provides the rules by which the patterns work together in meaningful ways. Coplien [13] describes it as analogous to a roadmap and a journey: a pattern language is an outline of the many ways that patterns may be put together; how they are put together depends on the context. So, while a pattern language is a roadmap, there are many ways from the start onwards to a journey into organizational growth. Patterns promote awareness of the sociological and human forces that have a bearing on software development activities.
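To make the documentary form and the builds-on relationships concrete, the following minimal sketch models a pattern record and one possible journey through a pattern language. It is purely illustrative: the Python class, field names and traversal are our own rendering of the structure described above, not Coplien's notation.

from dataclasses import dataclass, field

@dataclass
class Pattern:
    """One pattern: a named problem/solution pair in a context."""
    name: str            # evocative name, e.g. "NamedStableBases"
    intent: str          # what the pattern does
    context: str         # scope and area of application
    forces: list[str]    # the issues the pattern balances
    solution: str        # what to do, and why it works
    builds_on: list["Pattern"] = field(default_factory=list)

def journey(start: Pattern) -> list[str]:
    """Walk the builds-on links: one of many possible paths through the language."""
    path, seen, stack = [], set(), [start]
    while stack:
        p = stack.pop()
        if p.name not in seen:
            seen.add(p.name)
            path.append(p.name)
            stack.extend(p.builds_on)
    return path

The point of the sketch is only that a language is more than a catalogue: the links between patterns are what turn isolated solutions into a roadmap.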
The basic tenet of organizational patterns is that an organization is a system, and as such it has both structure and behaviour. It is a social system, and the activity of software development is primarily a social activity. Thus, how we structure our organizations to enable this activity will have a significant bearing on the outcome and success of the development activity.


The structure of an organization is reflected in the roles, relationships, collaborations, the network of communications, and the processes used to carry out the development activity. Culture will also have a significant bearing on the structure of an organization. Thus, considering an organization as a system necessitates contemplating its architecture, which is composed of structures and the relationships between structures. From case studies in the context of GSD we can extract patterns and create relationships between patterns, which will help us create, grow, and repair distributed software development organizations. Coplien and Harrison detail four pattern languages which focus on different facets of an organization: Project Management, Piecemeal Growth, Organizational Style, and People and Code. In section 4 we reflect on the case study MMSC development project and examine the patterns that can be identified within it. Additionally, we look at some problems encountered in this project and explore how the application of certain patterns could have helped.

3. XP in the Large - A Project Summary


In about the second quarter of 2001, Logica LMN embarked on the development of a large-scale telecommunications product: a multimedia messaging system (MMSC) targeted at the 3G market. The emphasis for this development was on high quality, both in terms of the system structure and design and in the levels of testing. At about this time the eXtreme Programming (XP) [7] methodology was gaining significant attention and credibility, and Logica LMN had used XP on a sub-project for a year prior to this. The results of this pilot were very positive, and XP was deemed worth using on the next large-scale MMSC development project. This was to replace the then Logica LMN Cortex process, which followed, in the main, the traditional waterfall methodology. The key drivers for the adoption of XP were: initial time to market, phased releases, quality deliverables, system evolution, good design practice, and functional adaptability in an evolutionary product market. It was perceived that XP could assist in meeting these goals. Typically the essence of XP is based on the scenario that requirements and functionality are captured in User Stories. The user stories are reviewed by the developers; after a conversation or conversations with the customer they are prioritized by the customer, and then the stories are implemented and delivered over the course of a number of iterations. In the MMSC product context there were a number of potential customers, with whom product management would usually converse, so an adaptation of the customer/developer relationship was needed. For this, a number of project stakeholders were identified: potential customers, product management, the product architect team, the product development manager, the product component teams, the product end-to-end test team, and the system test team. This effectively provides two virtual teams: a Customer team and a Development team. In essence there was a proxy customer. The Architecture team provides an overlap between the two virtual teams; this overlap is conceived to assist the communication process between the two domains and, additionally, within the product development teams.


The component-team-based structure for the overall development team was used to allow the practice of XP in its purer form at a fine granularity of development. The End-to-End test team was put in place to facilitate component integration and a continuous integration strategy. Additionally, there were two development sites, one in Dublin and one in Cork; this added another dimension in terms of communications overhead and the co-ordination and synchronization of development activities. The adaptation of XP in this context saw the introduction of the Team Story concept, which ensures that any team can work in an XP way. The relationship between User Stories and Team Stories is as follows: a Team Story is associated with exactly one, unique, User Story; a User Story has one or more Team Stories; when all the Team Stories associated with a User Story are complete, the User Story is complete; and a Team Story requires work from one team only (a sketch of this completion rule, in code, is given at the end of section 3.1). Tracking of stories and metrics is necessary in a large project. An online, web-based application was used for recording User Stories, Team Stories and the corresponding test coverage; this kept the administrative overhead to a minimum. Thus the adaptation of XP to the MMSC product development project included the following project elements and artifacts: team-based component development, User Stories, Team Stories, unit testing, acceptance tests, End-to-End testing, system tests, engineering releases, iterations, start-of-iteration and end-of-iteration meetings, online capture of user and team stories, and online capture of acceptance tests.

3.1. Main Conclusions from the MMSC Project
The project delivered a high-quality MMS within the expected time frames and with the expected functionality (as dictated by the customer) and quality. This can be attributed to the adoption of XP and its adaptation to suit large-scale product development. Iterative development facilitates tractable product development and project management. The use of unit tests, extended unit tests, in-team acceptance tests and end-to-end testing contributed to high-quality software deliverables. The component-team-based approach allows XP to be applied in its purer form at a microscopic level; this approach, together with End-to-End testing, produced a very usable and effective development process. Continuous integration is an integral part of XP, and regular integration testing at all levels (team story and user story, End-to-End test) was fundamental to avoiding big bang problems. In short, XP in its purest form cannot be applied to large-scale, high-grade product development; it needs to be augmented with some additional project management artifacts at a macroscopic level. The use of test-first design is difficult to achieve, as is the practice of pair programming; these require significant effort in terms of mentoring, tutoring and coaching. Yesterday's Weather became an effective mechanism for estimating the effort for a piece of work; however, care must still be taken to avoid under-estimating and over-committing in an iteration.


By keeping metrics for each iteration it was possible to check the project's progress at any time; this is important for global visibility within an organization. Developer-customer communications need careful attention. The use of the architect team as a proxy customer, while contributing to the architectural consistency of the product, did not quite achieve the developer/customer dialogue espoused by XP. Nonetheless, the two-virtual-team structure contributed strongly to the project's forward progress. The distributed software development aspects of the project did not achieve the desired objectives, and it is worth looking at this aspect of the project. It had a number of classic symptoms and pathologies associated with it: cultural differences, lack of trust, political issues, differences in approach and methodology, lack of buy-in, infrastructure, and communications problems.
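The User Story/Team Story adaptation described above reduces to a simple completion rule: a Team Story belongs to exactly one team and one User Story, and a User Story is complete exactly when all of its Team Stories are complete. The sketch below models that rule; the class names and the example stories are illustrative assumptions, not the project's actual online tracking tool.

from dataclasses import dataclass, field

@dataclass
class TeamStory:
    """A slice of one User Story that requires work from one team only."""
    team: str
    description: str
    complete: bool = False

@dataclass
class UserStory:
    """A customer-facing requirement, split into one or more Team Stories."""
    title: str
    team_stories: list[TeamStory] = field(default_factory=list)

    def split_for_team(self, team: str, description: str) -> TeamStory:
        # Each Team Story is associated with exactly one, unique, User Story.
        ts = TeamStory(team, description)
        self.team_stories.append(ts)
        return ts

    @property
    def complete(self) -> bool:
        # A User Story is complete only when all its Team Stories are complete.
        return bool(self.team_stories) and all(ts.complete for ts in self.team_stories)

# Usage: one hypothetical story split across two component teams.
story = UserStory("Subscriber sends an MMS to an email address")
core = story.split_for_team("messaging core", "route the MMS to the SMTP gateway")
gw = story.split_for_team("gateway", "convert the MMS to MIME mail")
core.complete = True
print(story.complete)   # False: the gateway slice is still open
gw.complete = True
print(story.complete)   # True: every Team Story is done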

4. Pattern Mining - Identifying Organizational Patterns


This section explores the application of XP from an organizational structure and process perspective. It sets about identifying the set of organizational patterns that can be found in the project under review; the patterns identified are briefly described in Appendix A. Many software projects are composed of developers, designers, software architects, testers, systems engineers, quality engineers, project leads, managers, customers, product managers, and various other roles. Additionally, a typical organization is composed of people developing products using a particular process. The product development takes place in the context of a project which executes a plan according to a set of objectives and produces a set of deliverables, while adhering to the monetary controls imposed by a development budget. The organizational structure used in the MMSC development project was a DivideAndConquer approach, which had the overall development divided amongst a number of component teams (TeamPerTask, or TeamPerComponent, which is a derivation of OwnerPerDeliverable). The project used XP as its development process, from which a number of patterns immediately fall out, such as EngageCustomers, IncrementalIntegration, CodeOwnership, StandUpMeeting, and DevelopingInPairs. The overall release and development cycles were organized in an iterative way: engineering releases focused on the GA release (DevelopmentEpisode), composed of a number of iterations (ProgrammingEpisode) which implemented user stories and progressed the working functionality towards the Product Management goal. Product management and the customer team selected user stories for an engineering release; these were prioritized and mapped to iterations within the engineering release. Prior to each engineering release, as part of the planning activity, rough estimates of all the user stories for that release were made. This helped SizeTheSchedule by providing a ballpark figure for the development effort, which enabled the development manager and the customer team to make a judgement as to the feasibility of what was being undertaken. The feedback from this exercise was in turn used to manage the expectations of the customer, i.e. product management, and to decide on a realistic WorkQueue for the engineering release. For each iteration there was a start-of-iteration meeting in which the team leads and the customer team (product management, the development manager, and the architect team) agreed the user stories for that iteration (WorkQueue). Each team then established its InformalLaborPlan by signing up for proposed user stories, based on the team's velocity (a measure of the work completed in a previous iteration), or YesterdaysWeather.


Each team DevelopedInPairs and followed DeveloperControlsProcess: developers made the estimates for the pieces of work they signed up for when selecting a user story. Thus developers were at the center (or near the center) of the development process. They needed to understand user stories (requirements), create solution structures and designs (in whiteboard sessions), build implementations and unit tests, and make component releases (IncrementalIntegration, ContinuousIntegration). In this context the team leads and managers had the roles of facilitators, mentors, XP coaches, problem expeditors, and developer liaisons (ManagerAlsoImplements). This was supported through the use of the FireWalls and GateKeeper patterns. For each iteration there was a NamedStableBase for each component, and these were collected into a product BaseLine that was End-to-End tested (IncrementalIntegration, EngageQualityAssurance, SmokeTest, AcceptanceTest, RegressionTest). During the course of an iteration, interim releases were made by teams; these were picked up by other, dependent development teams in PrivateWorlds and by the End-to-End test team as part of the incremental integration strategy. At the end of each iteration each team delivered the implemented user stories, which were unit tested and acceptance tested. These were gathered into a product release by the packaging and installation team and made available to SystemTest for further independent testing, focused primarily on performance and reliability (EngageQualityAssurance). Each release was accompanied by a documentation update by one of the team members (MercenaryAnalyst): an incremental product specification (IPS) document containing some essential information on component functionality, structure, and usage. It was not really a design document, but it contained some design information. The creation of the customer team (SurrogateCustomer) was designed to overcome the issue that there was no direct customer, as this was new product development and product management acted as our proxy customer. We did explicitly EngageCustomers with a trial installation of the product after the first engineering release, which represented a six-month development cycle. Overall, the project structure reflected OrganizationFollowsLocation, or Conway's Law. The system architecture permitted a division along standard interface boundaries (StandardsLinkingLocations), such that components were assigned to each team. There were eight teams in all, one of which was an ArchitectureTeam; all but one were located centrally at one site (SubsystemBySkill). At the start of the project some of the remote team collocated at the central site (FaceToFaceBeforeWorkingRemotely). This was to facilitate the architectural division and to mentor the teams in the XP process to be used as the development methodology. The architect for the remote team worked as an integral part of the ArchitectureTeam, which kicked off the project with an intensive initial analysis and design activity (LockEmUpTogether), followed by weekly meetings thereafter (SmokeFilledRooms). The architect team supplied the project with DomainExpertiseInRoles. It was also common for ArchitectAlsoImplements and ManagerAlsoImplements to apply. The overall product design authority was the chief architect (ArchitectControlsProduct), who also implemented significant product components (ArchitectAlsoImplements) and the core product framework (NamedStableBase).
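The sign-up step just described is, in effect, a capacity rule: a team commits to roughly the number of story points it completed in the previous iteration (YesterdaysWeather), which guards against the over-commitment warned about in section 3.1. A minimal sketch of that rule follows; the function name and the point estimates are illustrative assumptions, not project data.

def plan_iteration(candidate_stories: list[tuple[str, int]],
                   velocity: int) -> list[str]:
    """YesterdaysWeather: sign up for prioritized stories until the estimates
    reach the team's velocity (points completed in the previous iteration)."""
    committed, points = [], 0
    for title, estimate in candidate_stories:   # assumed already in priority order
        if points + estimate > velocity:
            break                               # stop rather than over-commit
        committed.append(title)
        points += estimate
    return committed

# Usage: a team that completed 13 points last iteration commits to 12.
stories = [("route MMS to email", 5), ("delivery reports", 4),
           ("media adaptation", 3), ("prepaid charging", 6)]
print(plan_iteration(stories, velocity=13))
# ['route MMS to email', 'delivery reports', 'media adaptation']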
Although the project structure was designed to encourage good communications between teams (ShapingCirculationRealms and ResponsibilitiesEngage), it was evident that distance had an impact on communications, such that issues between the remote teams took days to resolve. At the local level this was normally not the case: issues were usually resolved within hours (SmokeFilledRooms, ResponsibilitiesEngage).


Resolution of issues was often achieved through informal meetings (HallwayChatter, WaterCooler) and, within a team, by means of the StandUpMeeting. While overall the local teams were agreeable and responsive to the adoption of XP (though some people would not pair program), the remote team resisted, even though at the initial start-up of the project a number of the remote project team engaged in XP immersion with the local team. This immersion was provided by an external company; it was a useful and worthwhile investment, even though the project had an XP coach as part of the ArchitectureTeam (sometimes a prophet is not appreciated in her own land). The project structure used OrganizationFollowsLocation and StandardsLinkingLocations to divide the development work among the project teams. This allowed the remote team's subsystems to be stubbed or mocked, and hence reduced the immediate dependency on the remote team, despite their insistence and desire to follow a sequential waterfall with discrete development phases and a delivery of working product towards the latter stages of the engineering release cycle. Perhaps MercenaryAnalyst could have been used to allow the remote team to follow aspects of their desired development methodology, producing requirements specifications, functional specifications, test specifications and other ancillary documentation which the project overall did not require and deemed unnecessary. The required deliverables were working software as per the user stories, unit tests, acceptance tests, and the incremental product specification. Using MercenaryAnalyst could have freed their developers to focus on developing software. The main impact of this was that, as the project gained momentum, the remote team fell behind with their deliverables. Some remedial action was taken by dispatching the XP coach from the architect team to assist the remote team in adopting the XP development model and focusing on their deliverables. Additionally, project management and engineering management assisted the remote team in aligning their development activities with the local team. Thus UnityOfPurpose was established. Overall, the project established a CommunityOfTrust, mainly through the adoption of the XP development model, whereby DeveloperControlsProcess: developers sign up for work (user stories) and provide the estimates for this work, so they are more or less at the center of the process and WorkFlowsInward. This section has explored the project case study of section 3 from an organizational pattern perspective and highlighted the existence of various patterns in the project. An important aspect of patterns is their name, as it is through names that a certain amount of semantics is conveyed, a certain evocation with respect to the problem and solution being addressed. Although this is a retrospective study, in the sense that it examines the project structure and organizational architecture and then maps these to the Coplien-Harrison patterns [13], it is in keeping with the agile principles of reflection, learning, adjustment and evolution. The patterns identified are generative, in that they can be used to create and shape software development organizations. Therefore, just as we can extract recurring structures and solutions from software systems and then re-use, adapt and apply them in new contexts, it is also possible to extract and distil organizational and process artifacts so that they can be re-used and applied in structuring a distributed software development process.


5. Discussion
Many factors contribute to software quality and productivity, but the agile development process has increased its share of the spotlight in recent years. Additionally, the practice of GSD is becoming more and more pervasive. It is considered that the adoption of agile development methods can and will be beneficial in the context of GSD. This will not happen without adaptation and evolution of the agile genre; that is to be expected and is in keeping with the philosophies of agile development (we are uncovering better ways of developing software by doing it and helping others to do it), which include reflection, learning, and evolution as part of what it means to be agile. In addressing the challenges that exist in the distributed software development domain, we digress somewhat from the norm and look for assistance to the area of patterns and pattern languages. This enables us to look beyond process and consider what might be termed a holistic approach, incorporating organizational structure, people, roles, relationships, and communication patterns, i.e. the complete make-up of an organizational unit. By reflecting on the organizational structure and the development process used, it is possible to obtain significant knowledge about the application of XP in a large-scale distributed context. This knowledge can then be captured as organizational patterns, which reflect the roles, relationships, collaborations and the social communication networks that exist within a project. At one level these patterns provide documentation of problems and solutions in the context of software development organization architectures (roles, relationships, collaborations, communications networks); at another they capture empirical knowledge on how to effectively structure an organizational unit engaged in software development. Section 3 described how XP can be adapted for large-scale software development projects. This was also a distributed development, which exhibited many of the problems of distributed software development, not least the ubiquitous communication impedance problem and the cultural conflict problem. This study provides some deep insight into the structure, organizational architecture and processes of a successful large-scale development project using an agile development methodology (XP). At the same time it indicates that, even with a system architecture that facilitates an easy division of labour among sites (SubsystemBySkill, DeployAlongTheGrain), co-ordination and integration can run into difficulties, as identified by Herbsleb [32]. In this instance there were cultural impediments (at an organizational level) that worked against the successful application of Conway's Law, such that OrganizationFollowsLocation was less than optimal. However, deeper insight into the causes and rectification of such impediments to distributed development has been gained. It can also be seen from contemporary studies of offshore development that new patterns are emerging to handle such situations [23], [43], [44]; one such strategy is to maintain an emissary or ambassador at the remote site. Reflecting on the MMSC project is in keeping with the agile philosophy. From this exercise various processes and organizational structures were described and mapped to the organizational patterns of Coplien and Harrison [13]. This results in a deeper understanding of the pattern domain and helps develop a certain intuition about which organizational patterns may be used in a distributed agile context.
This exercise was interesting from the perspective that the adaptation of XP, and its application to a large-scale, multi-team development, was undertaken without prior knowledge of organizational patterns. Yet, upon reflection and analysis of the project structures, and then relating them to the Coplien-Harrison patterns, there was an easy association with these patterns. To do this it was first necessary to establish familiarity with the organizational patterns for our analysis to be successful. It also provides a validation that the structures and processes highlighted here can be applied in a similar distributed development context with reasonable confidence of them producing success.


In this retrospective a number of patterns that can be readily associated with GSD can be identified: FaceToFaceBeforeWorkingRemotely, ArchitectureTeam, FireWalls, GateKeeper, IncrementalIntegration, LockEmUpTogether, ResponsibilitiesEngage, TeamPerTask (or team per component), Conway's Law, UnityOfPurpose, CommunityOfTrust, StandardsLinkingLocations, and so on. It is necessary to develop an understanding of the range of possible patterns that may be applied in a distributed development context. With this understanding it is then possible to explore their use in a way that contributes to incremental repair and piecemeal growth for organizations engaged in GSD. It is informative and worthwhile to examine how the current techniques and methodologies of agile development and extreme programming can be applied in the context of distributed software development, to see whether it is possible to leverage the success of XP (and other agile methods) in its achievement of timely and quality software product delivery in a distributed development context. From case study analysis of this context, the overall objective is to create a Pattern Language for Distributed Software Development. Furthermore, it is possible to identify similar patterns in a number of research works. For example, in [21] FaceToFaceBeforeWorkingRemotely, CommunityOfTrust, and UnityOfPurpose can be readily identified. Similarly, in [6] patterns such as TeamPerTask, IncrementalIntegration, BuildPrototypes, EngageCustomers, and OrganizationFollowsLocation can be seen to exist. Additionally, patterns such as FaceToFaceBeforeWorkingRemotely, CustomerTeam, ArchitectureTeam, StandardsLinkingLocations, DeployAlongTheGrain, UnityOfPurpose, and CommunityOfTrust can all contribute to the effectiveness of remote collaborative interactions using communication tools such as those identified in [17], [18]. The reverse is also true, in the sense that patterns such as EngageCustomers applied in a distributed context will benefit from research such as Damian's [18]. Thus, the impact of and interplay between organizational patterns and current streams of research in GSD can be significant and rewarding. The significance of this study is that organizational patterns are considered to embody and capture many of the important characteristics and structures of an organization engaged in software development. Software development is essentially a social activity, and this is emphasized even more in the context of distributed software development. Thus, exploring, finding and capturing the successful techniques, processes, practices, and communication structures of organizations undertaking distributed software development, representing this knowledge as patterns, and creating a pattern language for distributed agile software development will provide valuable tools for structuring organizations and processes for effective distributed software development.

6. Conclusion
This paper has retrospectively explored the use of XP in a large-scale software development project which had a distributed nature to it. It has highlighted the contribution of XP and the associated project structures to the success of this project. It has also highlighted a number of the pitfalls that can occur when applying agile methods in a distributed context between two organizational units, even ones within the same company and the same national borders, but with different software development philosophies.


Furthermore, a pattern mining exercise was performed which identified the existence and use of a number of Coplien's organizational patterns [13]: SizeTheSchedule, DivideAndConquer, StandUpMeeting, TeamPerTask, ArchitectControlsProduct, ArchitectAlsoImplements, CommunityOfTrust, and IncrementalIntegration. These empirical findings offer further evidence of the significant and important role organizational patterns have to play in shaping organizations and creating development processes. Additionally, some new patterns or pattern variations can be seen to emerge, for example CustomerTeam, IntegrationTeam, RemoteEmissary, and CommonDevelopmentStrategy. In order to examine and address the challenges that exist in establishing appropriate software development processes for geographically dispersed development, we will take an organizational pattern approach. The overall objective is to create a pattern language for distributed agile development. The approach taken is to focus (initially) on the organizational structures that relate to the development process and to apply the concept of organizational patterns to the domain of distributed software development. In this we will build on the current knowledge captured in Coplien's organizational patterns [13]; part of our goal is to establish how to apply and map these patterns effectively in the context of distributed software development and then, through a series of case studies, to create a pattern language for distributed agile development. Such a pattern language should be generative, in the sense that it generates organizational structure and process. Additionally, it can be used to repair, improve and incrementally grow or evolve organizations engaged in distributed software development. Thus, within the context of GSD it is desirable and rewarding to pursue this avenue of empirical research and build upon the existing organizational and process patterns, finding new patterns that will progress the creation and evolution of appropriate organizational structures and processes for successful GSD execution.

Acknowledgements
This research has been supported by the Science Foundation Ireland Investigator Programme, B4STEP (Building a Bi-Directional Bridge Between Software ThEory and Practice) and carried out under the direction of Professor Brian Fitzgerald at the University of Limerick.

Appendix A - Organization Patlets


This section provides sample summaries of the patterns (patlets) identified in the case study detailed in this paper. These were extracted from [13].

A.1 Project Management Patlets
These patlets point to patterns for initial organizational design.

COMMUNITY OF TRUST: If you are building any human organization, Then: you must have a foundation of trust and respect for effective communication at levels deep enough to sustain growth.

SIZE THE SCHEDULE: If the schedule is too long, developers become complacent; but if it is too short, they become overtaxed. Therefore: reward meeting the schedule, and keep two sets of books.

NAMED STABLE BASES: If you want to balance stability with progress, Then: have a hierarchy of named stable bases that people can work against.

INCREMENTAL INTEGRATION: If you want developers to be able to test changes before publishing them, Then: allow developers to build the entire product code independently to allow testing with the very latest base (not the latest Named Stable Base).

PRIVATE WORLD: If you want to isolate developers from the effects of changes, Then: allow developers to have private work spaces containing the entire build environment.

WORK QUEUE: If deliverables are ill-defined, you need to allow time to do everything. Therefore: produce a schedule with less output than you have input. Use the list of IMPLIED REQUIREMENTS (really just names) as a starting point and order them into a likely implementation order, favoring the more urgent or higher priority items.

INFORMAL LABOR PLAN: If developers need to do the most important thing now, Then: let developers negotiate among themselves, or just figure out the right thing to do as regards short-term plans, instead of master planning.

DEVELOPMENT EPISODE: If we overemphasize individual contributor skills, work suffers. Therefore: approach all development as a group activity, as if no one had anything else to do.

DEVELOPER CONTROLS PROCESS: If you need to orchestrate the activities of a given location or feature, Then: put the Developer role in control of the succession of activities.

WORK FLOWS INWARD: If you want information to flow to the producing roles in an organization, Then: put the developer at the center and see that information flows toward the center, not from the center.

PROGRAMMING EPISODE: If you need to split up work across time, Then: do the work in discrete episodes with mind share to commit to concrete deliverables.

SOMEONE ALWAYS MAKES PROGRESS: If distractions constantly interrupt your team's progress, Then: whatever happens, ensure someone keeps moving toward your primary goal.

TEAM PER TASK: If a big diversion hits your team, Then: let a sub-team handle the diversion while the main team keeps going.

MERCENARY ANALYST: If you want to keep documentation from being a critical path roadblock for developers, Then: hire a MERCENARY ANALYST.

A.2 Piecemeal Growth Patlets These patlets summarize patterns for the growth of an organization once it is up and running.

37

SIZE THE ORGANIZATION: If an organization is too large, communications break down, and if it

is too small, it cant achieve its goals or easily overcome the difficulties of adding more people. Therefore: start projects with a critical mass of about 10 people.
ENGAGE CUSTOMERS: If you want to manage an incremental process that accommodates

customer input, and if you want the customer to feel loved, Then: engage customers after Quality Assurance and project management are prepared to serve them.
SURROGATE CUSTOMER: If you need answers from your customer, but no customer is available

to answer your questions, Then: create a surrogate customer role in your organization to play advocate for the customer.
FIRE WALLS: If you want to keep your developers from being interrupted by extraneous influences

and special interest groups, Then: impose a Fire Wall, such as a manager, who keeps the pests away.
GATE KEEPER: If you need to keep from being inbred, Then: use a GATE KEEPER role to tie

together development with other projects, with research, and the outside world.
UNITY OF PURPOSE: If a team is beginning to work together, Then: make sure all members agree

on the purpose of the team.


PATRON ROLE: If you need to insulate Developers so DEVELOPER CONTROLS PROCESS and

provide some organizational inertia at the strategic level, Then: identify a patron to whom the project has access, who can champion the cause of the project.
DOMAIN EXPERTISE IN ROLES: If you need to staff all roles, its difficult to determine how to

match people to roles to optimize communication. Therefore: match people to roles based on domain expertise, and emphasize that people play those roles in the organization.
SUBSYSTEM BY SKILL: If you need to organize subsystems for the long haul, Then: divide them

up by skills.
COMPENSATE SUCCESS: If enterprises are to succeed, they must reward the behaviors that

portend for success; but, these behaviors are varied, and success is difficult to measure. Therefore, establish a spectrum of reward mechanisms that reward both teams and individuals.
DEVELOPING IN PAIRS: If you want to improve the effectiveness of individual developers, Then: have people develop in pairs. ENGAGE QUALITY ASSURANCE: If developers cant be counted on to test beyond what they

already anticipate what might go wrong, Then: engage Quality Assurance as an important function. A.3 Organizational Style Patlets Good design lends a sense of style to anything we build. Each great organization has its own style. These patterns shape the style of an organization. Different organizational styles fit different needs, so these patterns provide a good foundation for tailoring an organization to your business and market.

38

DIVIDE AND CONQUER: If an organization is getting too large for communications to be effective

any more, Then: try partitioning it along lines of mutual interest and coupling, forming a separate organization and process.
CONWAYS LAW: If organization structuring concerns are torn between geography, expertise, politics, and other factors, Then: align the primary organizational structuring with the structure of the business domains, the structure that will be reflected in the product architecture. ORGANIZATION FOLLOWS LOCATION: If you need to distribute work geographically,

communications suffer, but you can limit the damage if work can be partitioned. Therefore: organize work at locations so groups of people that work together are at the same location.
FACE TO FACE BEFORE WORKING REMOTELY: If a project is divided geographically, Then: begin the project with a meeting of everyone in a single place. SHAPING CIRCULATION REALMS: If you need mechanisms to facilitate the communication

structures necessary for good group formation, Then: shape circulation realms.
RESPONSIBILITIES ENGAGE: If central roles are overloaded but you dont want to take them out of the communication loop Then: intensify communication more among non-central roles to lighten the load on the central roles HALLWAY CHATTER: If developers tend to huddle around the organizational core or supporting

roles are inadequately engaged with each other, Then: rearrange responsibilities in a way that encourages less isolation and more inter-working among roles and people.
THE WATER COOLER: If you need more communication between institutionalized organizations,

Then: leave space for everyday human activities at the workplace that can provide more complete and informal communication A.4 People And Code Patlets People and code are the two most important components of a software development organization. Customers wouldnt exist without code to sell to them, and code wouldnt exist without people. People write code, and the structure of code in turn affects how people organize. These patlets point to patterns that help an organization align the people and code structures properly.
ARCHITECT CONTROLS PRODUCT: If a project has a long life, Then: use the architect to carry the vision forward and serve as the long-term keeper of architectural style.

ARCHITECTURE TEAM: If you are building a system too large or complex to be thoroughly understood by a single individual, Then: build a team that has both the responsibility and the power to create the architecture.

LOCK 'EM UP TOGETHER: If your team is struggling to come up with an architecture, Then: isolate them physically for several days where they can work uninterrupted.

SMOKE FILLED ROOM: If you need to make a decision quickly and there are reasons to exclude others, Then: make the decision covertly so that the rationale remains private, though the decision will be publicized.

STAND UP MEETING: If there are pockets of misinformation or people out of the loop, Then: hold short daily meetings to socialize emerging developments.

DEPLOY ALONG THE GRAIN: If reuse is suffering from fragmentation of responsibilities for an artifact, Then: give people dedicated, long-term responsibility for a manageable piece of the system.

ARCHITECT ALSO IMPLEMENTS: If an architect is in an ivory tower, they are out of touch; yet someone needs to take the big and long view and reconcile it with practice. Therefore: the architect is materially involved in day-to-day implementation.

STANDARDS LINKING LOCATIONS: If you have geographically separated development, Then: use standards to link together parts of the architecture that cross geographic boundaries.

CODE OWNERSHIP: If you need responsibility for code and want to build on DOMAIN EXPERTISE IN ROLES, Then: give various individuals responsibility for the overall quality of the code.

FEATURE ASSIGNMENT: If you are trying to partition work in a large project, Then: make assignments of features to people.

PRIVATE VERSIONING: If you want to enable incremental changes without publishing them, Then: set up a mechanism for developers to version code without checking it in to a public repository.
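As one illustration of such a mechanism (ours, not the pattern authors'), the sketch below keeps numbered private snapshots of a file outside any shared repository; the directory layout and naming scheme are assumed for the example.

```python
# Illustrative sketch of PRIVATE VERSIONING: numbered local snapshots kept
# outside the shared repository. Directory layout and naming are assumed.

import shutil
from pathlib import Path

PRIVATE_DIR = Path(".private_versions")    # never checked in or shared

def snapshot(path):
    """Save a private, numbered copy of `path` and return the copy's path."""
    src = Path(path)
    PRIVATE_DIR.mkdir(exist_ok=True)
    existing = list(PRIVATE_DIR.glob(f"{src.stem}.v*{src.suffix}"))
    version = len(existing) + 1
    dst = PRIVATE_DIR / f"{src.stem}.v{version}{src.suffix}"
    shutil.copy2(src, dst)                 # an incremental, unpublished change
    return dst

# Usage: snapshot("parser.py") -> .private_versions/parser.v1.py, v2, ...
```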


A FRAMEWORK FOR CONSIDERING OPPORTUNITIES AND THREATS IN DISTRIBUTED SOFTWARE DEVELOPMENT

Pär J Ågerfalk1, Brian Fitzgerald1, Helena Holmström1, Brian Lings2, Björn Lundell2, Eoin Ó Conchúir1

1 Department of Computer Science and Information Systems, University of Limerick, Limerick, Ireland.
2 University of Skövde, School of Humanities and Informatics, P.O. Box 408, SE-541 28 Skövde, Sweden.
Abstract
In this paper we present an overview of the field of distributed development of software systems and applications (DD). Based on an analysis of the published literature, we consider threats to communication, coordination and control in DD caused by Temporal Distance, Geographical Distance, and Socio-Cultural Distance. The analysis results in a more complete framework for reasoning in the DD domain, which should be a useful resource for both academic researchers and practitioners.

1. Introduction
Distributed development of software systems and applications (DD) is an issue of increasing significance for organizations today, all the more so given the current trend towards outsourcing and globalisation. According to the World Investment Report 2004 [51], offshoring of IT-enabled services is forecast to expand 24-fold by 2007 from a base of $1 billion in 2002. The report also notes that while US companies have been relatively active, European companies have shown less inclination to offshore services.

There are many reasons why an organisation might consider adopting a DD model, including access to a larger labour pool and a broader skills base, cost advantage, and round-the-clock working. This is perhaps most evident in the many cases of outsourcing of software development to low-cost countries, e.g. [12], but is also relevant in the case of, for example, utilizing local expertise to satisfy local demands.

In ideal software development teams, members have rich interactions, both formal and informal; share a common organisational culture, which promotes good coordination and facilitates effective control; represent a good mix of all required technical skills and relevant experience, made readily accessible to all team members; and are familiar with, and provided with, homogeneous tools and technologies appropriate for the project. DD adds new demands to the software development process by potentially threatening each of these ideal properties.

In this paper we present an overview of the academic body of knowledge on opportunities and threats in distributed development, as represented in peer-reviewed research articles. The paper is organized as follows. Section 2 clarifies what we mean by distributed development. Section 3 presents the research approach adopted. Section 4 discusses the processes and dimensions to be used within a framework for reporting the literature on DD, and introduces the framework. Section 5 populates and elaborates the developed framework in order to identify important issues in DD. Finally, Section 6 briefly summarizes and concludes the paper.

2. Characterising Distributed Development


For the purpose of this research, we choose to define development broadly as any software development lifecycle activity. This extends beyond pure development activities and includes, for example, deployment and maintenance. This broad definition makes sense since we do not want to restrict our analysis strictly to new software product development. We use the term activity in a loose sense, including any individual or collective human action, at any level of granularity, that serves a particular purpose. According to activity theorist Engeström [23], an activity is something that transforms an object into an outcome. Hence, a development activity is an individual or collective action that transforms something (abstract or concrete) into something meaningful in the context of a software system's lifecycle. Thus, we would regard an individual developer's creation of a source code document as a development activity that transforms a requirements document into a piece of code. We would also regard a complete project, transforming an initial idea of a system into a working solution with documentation, associated work-processes, etc., as a development activity. This means that we can regard a project as distributed without requiring all of its sub-activities to be.

Intuitively, classifying a project or development team as distributed means the team members are not co-located but geographically spread out; we may thus say that there is a geographical distance between actors in a DD setting. However, as we shall see below, many core aspects of DD are related not to geographical distance, but rather to what can be called a socio-cultural distance. Socio-cultural distance has to do with the fact that different people give different meanings to a situation based on their socio-cultural background and belonging. According to Orlikowski and Gash [43, p. 176], "The frames of reference held by organizational members are implicit guidelines that serve to organize and shape their interpretations of events and organizational phenomena and give these meaning." Conflicts can arise from team members coming from different cultures, both national culture and organisational culture. National or local culture encompasses an ethnic group's norms, values, spoken language and styles of communication [46, 13]. Organisational culture encompasses the working unit's norms and values, and includes the culture of systems development [13]. Culture can have a huge effect on how people interpret a certain situation, and how they react to it. Hence, having shared (or overlapping) frames of reference is a precondition for people to succeed in communication and collaboration. At the very least, each actor needs to have an understanding of, and accept, the other's frames of reference, and understand that these might differ from the actor's own (i.e. "agree to disagree").

Certainly, geographical distance may imply increased socio-cultural distance. However, the socio-cultural distance can be great even with low geographical distribution. Similarly, a huge geographical distance does not automatically mean huge socio-cultural distance. Finally, a consequence of being geographically distributed over two or more time zones is that there is also a temporal distance involved. However, neither is temporal distance confined to geographically distributed settings. Rather, a temporal distance is present as soon as team members cannot interact face-to-face. This may be due to geographical distance, but may equally be a result of, for example, shift work.

3. Research Approach Adopted



This paper presents an overview of the field of DD. Based on an analysis of the published literature, the paper provides a preliminary analysis of DD in different industrial contexts, establishing basic characteristics. Inspired by Webster and Watson [52], we develop a framework to structure existing DD knowledge and studies. This required a two-phase process of search and refinement.

In the first phase, two parallel searches of the literature were conducted. These parallel searches were carried out relatively independently, thus achieving a form of triangulation in validating the resulting output. Each search was systematic, using keyword and author searches, and searches of tables of contents of journals and of conference and workshop proceedings. Bibliographic databases were used to assist in forwards and backwards referencing. Papers were included if they had a core focus on DD (the primary list), or were considered highly relevant for understanding core issues raised in the DD literature (the secondary list). We also compiled an extensive note file, including quoted sections from papers which contained their major import. This allowed faster filtering in the later stages of analysis, but context was always checked against the full text.

In the second phase, which commenced when the two searches were complete, the compiled lists were combined. Another iteration of the search was undertaken based on the combined lists, with a further check using bibliographic databases. As the analysis progressed, sources considered redundant or less relevant were removed from the secondary list. The full set of sources was then analysed with a view to developing a framework for compiling the key opportunities and threats considered to be inherent in DD. Practitioner literature was then consulted in order to check for congruence with the peer-reviewed sources.

Due to space limitations, we are not able to include all references in this paper, but are confined to a representative selection. The full list of references is available from the authors upon request.

4. A Framework for Analysing Issues in DD


For a number of years the international workshop on Global Software Development has highlighted the impact of distribution on communication, coordination and control within DD lifecycle activities, e.g. [20]. This view is consistent with the position taken by a number of authors who have focused on one or more of these three fundamental processes, e.g. [13, 26, 38, 39, 42, 50].

Communication is "the exchange of complete and unambiguous information - that is, the sender and receiver can reach a common understanding" [13]. The communication process concerns the transfer of knowledge and information between actors, and the tools used to facilitate such interaction. Communication is an essential process in all software development [18, 7] but becomes even more crucial in DD, since DD changes the communication context away from the ideal face-to-face setting [15] into a technology-mediated, and thus more restricted, one [1].

Coordination is "the act of integrating each task with each organisational unit, so the unit contributes to the overall objective" [13]. The coordination process concerns how this interaction makes actors interdependent on each other: "Two people have a coordination problem whenever they have common interests, or goals, and each person's actions depend on the actions of the other" [15, p. 62]. All software development obviously requires coordination, but DD increases this need as activities are distributed over time and space and across cultural borders, as we will discuss below.

Control is "the process of adhering to goals, policies, standards, or quality levels" [13]. The control process concerns the management and reporting mechanisms put in place to make sure a development activity is progressing. Control thus relates to project management and hence to the formalized structures required to ensure development of software on time, on budget and of desired quality.

The communication, coordination and control activities are affected over a number of dimensions, which have been well elaborated in the literature, e.g. [6, 8, 21, 24, 28, 41, 50]. These relate to temporal, geographical and socio-cultural distance.

Temporal distance is a directional measure of the dislocation in time experienced by two actors wishing to interact. Temporal distance can be caused by time zone difference or time-shifting work patterns. When organising work patterns, note must be taken both of the temporal overlap of parties, to facilitate communication, and of temporal coverage, for example to move towards 24x7 activities. In fact, time zone difference and time-shifting work patterns can work together to either increase or decrease temporal distance. For example, a one-hour difference in time zone within the EU can, because of different routines during a working day, lead to very few overlapping hours and an appearance of higher than expected temporal distance, but may offer increased temporal coverage. Conversely, an EU worker liaising with a counterpart in India working a late shift may experience low temporal distance, but such an arrangement will not offer increased temporal coverage. In general, low temporal distance improves opportunities for timely synchronous communication but may reduce management options.

Geographical distance is a directional measure of the effort required for one actor to visit another at the latter's home site. Geographical distance is best measured in ease of relocating rather than in kilometres. Two locations within the same country with a direct air link and regular flights can be considered close even if separated by great distance, but the same cannot be said of two locations which are geographically close but with little transport infrastructure and perhaps intervening borders. Further, even two actors within the same building but separated by long corridors and several floors will be affected by geographical distance. Ease of relocating has several facets, including ease and time of travel, and the necessity for visas and permits. How critical an actor is to the project in their home location may also implicitly affect perceived distance, as it will affect their ease of travelling. In general, low geographical distance offers greater scope for periods of co-located, inter-team working.

Socio-cultural distance is a directional measure of an actor's understanding of another actor's values and normative practices. As a consequence, it is possible for actor A to be socio-culturally closer to actor B than B is to A. It is a complex dimension, involving organisational culture, national culture and language, politics, and individual motivations and work ethics. It is possible to have a low socio-cultural distance between two actors from different national and cultural backgrounds who share a common organisational culture, but a high distance between two co-nationals from very different company backgrounds. At the very least, there is a need for an actor to understand and accept others' frames of reference, and to accept that these might differ from the actor's own (i.e. "agree to disagree"). In general, low socio-cultural distance improves communication and lowers risk.


The complete framework forms a matrix in which each cell represents the impact of one dimension on one process. We present an overview of this framework in Table 1, and relate prominent DD issues, including opportunities and threats, to the relevant cells by way of illustration. This table should be considered in addition to the general characterisations given above for each process and dimension. Hence, each cell highlights only what is specific with respect to the effect of one dimension on one process. In Section 5, we elaborate on each cell in this framework.
Table 1: An Overview of the Framework of Issues in DD.

Temporal Distance
  Communication: Reduced opportunities for synchronous communication, introducing delayed feedback. Improved record of communications.
  Coordination: With appropriate division of work, coordination needs can be minimised. However, coordination costs typically increase with distance. Time zone effectiveness can be utilised for gaining efficient 24x7 working.
  Control: Management of project artefacts may be subject to delays.

Geographical Distance
  Communication: Potential for closer proximity to market, and utilisation of remote skilled workforces. Increased cost and logistics of holding face-to-face meetings.
  Coordination: Increase in size and skills of labour pool can offer more flexible coordination planning. Reduced informal contact can lead to reduced trust and a lack of critical task awareness.
  Control: Difficult to convey vision and strategy. Communication channels often leave an audit trail, but can be threatened at key times.

Socio-Cultural Distance
  Communication: Potential for stimulating innovation and sharing best practice, but also for misunderstandings.
  Coordination: Potential for learning and access to a richer skill set. Inconsistency in work practices can impinge on effective coordination, as can reduced cooperation through misunderstandings.
  Control: Perceived threat from training low-cost 'rivals'. Different perceptions of authority/hierarchy can undermine morale. Managers must adapt to local regulations.
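Viewed as a data structure, the framework is simply a lookup from (process, dimension) pairs to lists of issues. The sketch below is our illustration only; the abbreviated issue strings are paraphrased from Table 1 and the helper function is invented for the example.

```python
# Illustrative sketch: the DD framework as a (process, dimension) matrix.
# Entries are abbreviated from Table 1; not an artifact of the paper itself.

PROCESSES = ("communication", "coordination", "control")
DIMENSIONS = ("temporal", "geographical", "socio-cultural")

framework = {
    ("communication", "temporal"): ["reduced synchronous communication",
                                    "improved record of communications"],
    ("coordination", "temporal"): ["coordination costs increase with distance",
                                   "time zone effectiveness for 24x7 working"],
    ("control", "temporal"): ["delays in managing project artefacts"],
    # ... the remaining six cells are filled in the same way ...
}

def issues(process, dimension):
    """Look up the DD issues recorded for one cell of the framework."""
    assert process in PROCESSES and dimension in DIMENSIONS
    return framework.get((process, dimension), [])

print(issues("control", "temporal"))
```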

5. Elaborating the Framework


DD puts new demands on the software process, imposed by increased complexity related to, for example, communication (formal, informal, potential lack of), coordination (time zones, social awareness, task-sharing, domain expertise, delays), cooperation (trust, "teamness"), control (policies, project management, power, uncertainty), culture (social, political), and technology and tools (heterogeneous technology, standardization). This means that any allocation of an issue to a single cell is necessarily arguable. The summary in Table 2 therefore places some issues in several cells, as indicated in comments within the text. However, our aim is to use the framework to bring some kind of primary order out of the complex interrelationships evident from the literature.


5.1 Communication in Distributed Development

5.1.1 Temporal Distance

Time zone effectiveness: Although the face-to-face setting is the basic prototype for communication [15], and generally considered the best means of exchanging ideas [11], asynchronous communication (over temporal distance) can be leveraged to the distributed team's advantage. By communicating in an asynchronous manner, teams can, for example, strive for round-the-clock development [29, 31], potentially reducing the time-to-completion for the project. Also, since asynchronous communication relies on technologies such as e-mail and fax [8, 19], a written communication history is usually left [13]. This provides for increased traceability and accountability, i.e. it facilitates finding out who said what to whom, and when this was said [1]. This is also a control issue.

Delayed communication: Being situated across different time zones, a remotely located colleague may not be at work when their help is needed. The use of asynchronous tools over temporal distances increases the amount of time it takes to receive a response. Questions received by asynchronous communication overnight can be overwhelming for the developers beginning work in the morning [8]. The conversion of ideas into e-mail form can also increase the risk of misunderstanding [19], particularly when the content of the communication is contentious or argumentative in nature [36].

Delayed feedback: The delay in receiving a response can increase the amount of time it will take to resolve the issue at hand [8]. The problem becomes exacerbated and can drag on over days [34, 36], with increasing vulnerability costs as a result [29, 25]. It has been suggested that issues would be resolved more efficiently if the teams were collaborating co-located [8].

5.1.2 Geographical Distance

Proximity to market/customer: One advantage of DD is the possibility of being close to the target market or the customer of the product being developed [29, 31, 32]. By facilitating communication across geographical distance, distributed teams can take advantage of having software developers placed both near the customer and in the home country, facilitating, for example, more effective requirements elicitation [19].

Lack of informal communication: One of the major issues highlighted in DD is the lack of informal communication that occurs within the distributed team due to geographical separation. Informal contact allows team members to develop working relationships, and allows a better flow of information about changes in the current project [33]; it is an essential part of software design and development [18, 7]. Informal contact is especially important in unstable, dynamic teams. Written documentation is inadequate when resolving misunderstandings about requirements [18, 19]. In co-located teams, informal contact, aka "coffee talk" [19], can account for about 75 minutes of the working day [33]. Both geographical and temporal distance reduce the opportunities for informal communication to take place [29, 36]. It has been found that even a small distance (30 meters) can greatly affect the level of communication between colleagues [3]. Naturally, even more attention has to be given to the effect of distance on communication in a global context.

Dependency on information and communication technologies: In DD, the dependency on information and communication technology is high. Here, technology is used for communication and, therefore, it has an impact on the most critical processes in an organisation: whether and how people communicate to coordinate their processes [49, 35]. Hence, a convenient and well-working technical infrastructure for information and communication, for example effective tools and work environments, seems to be a necessity for successful DD [22].

Increased effort to initiate contact: Having team members separated by geographical distance places a barrier on communication by increasing the effort required to initiate contact [30]. This can lead to developers taking the risk of applying minor modifications to the system without trying to make contact with the person who might have more knowledge of that part of the system [8]. As a consequence, errors may be introduced into the system, ultimately increasing the cycle time. A related factor in initiating contact is not knowing whom to contact [30, 6, 9]. This can arise from the lack of informal contact with remotely located colleagues. Due to lack of informal contact, a team member cannot easily learn of the skills and precise roles of their remote colleagues.

Providing technical infrastructure: When developing software within a global context, problems can arise with global support for the third-party tools being used. Battin et al. [6] found that different versions of tools were being offered in different countries by the third-party vendors. For example, the newest version of a tool was made available in the US, with older versions still being offered in other countries. Also, export regulations may prohibit diffusion of certain technology throughout the distributed team [6].

Cost of travel: Sometimes, meeting remote colleagues face-to-face is indispensable, especially in the early phases of a project. This travel can be very expensive and time-consuming [6]. Also, there may not be direct flights between the two points of travel, increasing the journey time. Furthermore, there is much more to travel-time than flight-time [6].

5.1.3 Socio-Cultural Distance

Innovation and shared best practices: A major positive effect of globally distributed development is innovation [22]. Developers from different cultural backgrounds may work together to continuously improve a product, to innovate and to improve processes. Best practices can be shared amongst developers and between development sites.

Asynchronous communication preferred by non-native speakers: Often in globally distributed development, some or all of the developers speak English only as a second language. Having to communicate in real time over teleconferences can be overwhelming for these people, who find it difficult to keep up with the conversation [48, 36]. Asynchronous communication allows non-native speakers to formulate their position and to check that they are making their point clear before sending the e-mail. Thus, non-native speakers of English tend to rely more heavily on asynchronous communication. This introduces the advantages and disadvantages of asynchronous communication, as identified earlier.

Language differences and misunderstandings: While English has become the international language for business matters, language competency is still a large stumbling block for communication within and between development teams [34]. In turn, misunderstandings can arise. Even if the whole team are native speakers of the language used in a project, problems can arise from different dialects and local accents [14]. If a major section of the team speaks a particular language natively, unlike their remote colleagues, a feeling of alienation can arise, with non-native speakers of the major language being at a disadvantage in expressing themselves [48].

Managing frames of reference: Establishing mutual understanding is important since it increases the likelihood that communication will be successful [16]. It may be difficult for culturally- and geographically-distributed teams to achieve mutual understanding. National culture can affect how negotiations are carried out and how commitments are accepted [22]. For example, in a Norwegian-Russian project, it was found that Norwegian conversation was more low-context than the Russians'. This caused frustrations, since the Russians relied more on the context of the conversation without explicitly stating some opinions [34]. In another study, of a German-Canadian project, the Germans were perceived as being blunt and stubborn, while the Canadians were viewed as being laid-back, chatty and indecisive [36]. Also, practices of agreeing to working late or not can vary between countries [10]. Altogether, this can lead to, for example, unevenly distributed information within the team and the inability to make clear which part of a message is the most important. In some projects a new common frame of reference develops, including team-specific language use, in-jokes, etc. [4].

5.2 Coordination in Distributed Development

5.2.1 Temporal Distance

Time zone efficiency: Temporal distance can be seen as beneficial in terms of coordination, in that coordination costs are reduced when team members are not working at the same time [25]. The producer of a unit of work can complete the work during the off-hours of the person who requested that work. In essence, coordination costs are reduced since no direct coordination takes place when two people are not working at the same time. A side effect is that although the coordination cost as such (i.e. time spent on coordination activities, waiting for task handovers, etc.) may be reduced, costs related to repairing the consequences of misunderstandings, reworking, etc., may increase [25].

Reduced hours of collaboration: An obvious disadvantage of being separated by temporal distance is that the number of overlapping hours during a workday is reduced between sites [6, 36, 14]. For example, a team located in the U.S. and in Ireland can have a total of 3 overlapping hours during a workday [14]. Even a one-hour time zone difference can mean many fewer overlapping hours. For example, with team members in Germany working from 8am-4pm with a 12pm lunch, coordinating with UK team members working from 9am-5pm with a 1pm lunch, there are only four overlapping hours in a day [30].

Synchronous team meetings difficult: Team members might have to work flexible hours in order to coordinate with their remote colleagues through real-time teleconferences, increasing the cost and effort of coordinating regularly [6].

Availability of technical infrastructure: Available technical infrastructures, and possible incompatibility, greatly affect the performance of DD teams. For example, most change management tools do not allow 24/7 access without disturbing engineers due to back-ups and synchronisations [22].

Coordination complexity: Software development in itself is a complex task with substantial non-routine work, and coordination itself can be costly [25]. The very nature of DD projects suggests that it is important not to rely on one person as a coordination channel between teams, since the unavailability of this person can affect inter-team communication and coordination [6]. At the very least, it is important to manage those people closely [34]. This is also an issue in the geographical distance dimension, and to some extent in the socio-cultural distance dimension.
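To make temporal overlap concrete, the short sketch below recomputes the Germany/UK example above from each site's working segments and UTC offset, and also computes the combined temporal coverage discussed in Section 4. This is our illustration, not part of the original paper; the working hours and offsets are the values assumed in the example.

```python
# Illustrative sketch: temporal overlap and coverage for two sites.
# Working segments and UTC offsets are the assumed values from the example.

def to_utc(intervals, utc_offset):
    """Convert local working intervals (start, end), in hours, to UTC."""
    return [(s - utc_offset, e - utc_offset) for s, e in intervals]

def overlap_hours(a, b):
    """Total hours during which both sites are working."""
    return sum(max(0, min(e1, e2) - max(s1, s2))
               for s1, e1 in a for s2, e2 in b)

def coverage_hours(a, b):
    """Total hours during which at least one site is working."""
    total, cur_start, cur_end = 0, None, None
    for s, e in sorted(a + b):
        if cur_end is None or s > cur_end:      # disjoint: close previous run
            total += (cur_end - cur_start) if cur_end is not None else 0
            cur_start, cur_end = s, e
        else:                                   # overlapping: extend the run
            cur_end = max(cur_end, e)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

# Germany (UTC+1): 8am-4pm with a 12pm lunch -> two working segments.
germany = to_utc([(8, 12), (13, 16)], utc_offset=1)
# UK (UTC+0): 9am-5pm with a 1pm lunch.
uk = to_utc([(9, 13), (14, 17)], utc_offset=0)

print(overlap_hours(germany, uk))    # 4 overlapping hours, as in the example
print(coverage_hours(germany, uk))   # 10 hours of combined temporal coverage
```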

5.2.2 Geographical Distance

Access to large labour pool: By coordinating development across several countries, companies can access large labour pools of skilled workers [32, 20].

Standardisation in work practices: The modularisation of work for DD requires standardisation of the software development environment, processes and practices. These standardised practices, including manuals, databases and implicit and undocumented systems, serve as points of reference to coordinate work across time and space [47]. On the other hand, allowing local variation in work practices may leverage local experience and reduce project overhead [2]. This is also an issue in the socio-cultural distance dimension.

Allocation of roles and team structure: In DD, there is the possibility to gain from a very large pool of expertise. In building project teams, people from different sites all over the world can be included, and project roles can be allocated to various development teams. This makes possible a flexible team structure in that people can relocate for shorter periods, allowing for effective project management independent of how the project is globally allocated [22]. Also, changes in allocation can address the challenge of replacing isolated expertise and instead create skill-broadening tasks and effective teamwork [22]. This is also a control issue.

Reduced trust: Creating trust can be hindered in a DD team, since normal communication such as face-to-face feedback and common experience are sources of trust which are lacking in a distributed environment [45]. Familiarity and confidence are stages of relationships that must take place before trust is formed. Achieving and maintaining trust in global teams is more difficult than in collocated teams [40]. At a distance, it is difficult to empathise with those at the other site [36]. Trust can also be corroded, for example when defects are introduced due to a developer not making the effort to contact a remote colleague before making changes to the system [8]. When there is a lack of trust, there is a lack of willingness to communicate [30]. On the other hand, studies of DD in a libre software context suggest that teams rely more on social control mechanisms than on trust [17, 27].

Lack of awareness/team spirit: The feeling of "teamness" with remote colleagues can be affected by physical separation and lack of informal contact [6, 33, 36]. Presumably, distance affects the stages by which individuals become coherent groups or teams. Due to physical separation and lack of face-to-face contact, team members may not be aware of the details of their remote colleagues' work activities. If awareness of current work isn't spread across the whole team, misunderstandings can continue unnoticed and code conflicts can arise. It can also be difficult to determine if a remote colleague is available to be contacted at a particular time [30]. This is also an issue in the socio-cultural distance dimension.

Modularisation of work: According to Conway's Law, the structure of the system mirrors the structure of the organisation that designed it [30]. The nature of DD leads teams to split their work across feature content into well-defined independent modules [22, 47, 5]. This allows decisions to be made about each component in isolation, and reduces problems in the system integration phase [30]. Partitioning work tasks horizontally, having each site responsible for the whole lifecycle of particular functions/modules, decreases interdependencies, and hence coordination costs [6]. This is also an issue in the temporal distance dimension.
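As a rough illustration of why this kind of partitioning lowers coordination costs, the sketch below counts cross-site dependencies, the edges that force inter-site coordination, for two assignments of the same modules to two sites. The module names, dependency graph and site assignments are invented for the example.

```python
# Illustrative sketch (invented example data): counting cross-site
# dependencies, i.e. the edges that force inter-site coordination.

# Hypothetical dependency graph: module -> modules it depends on.
DEPENDS_ON = {
    "billing":   ["core"],
    "reporting": ["core", "billing"],
    "core":      [],
    "ui":        ["core", "reporting"],
}

def cross_site_edges(site_of):
    """Return the dependencies that cross a site boundary for an assignment."""
    return [(m, d) for m, deps in DEPENDS_ON.items()
                   for d in deps if site_of[m] != site_of[d]]

# Partitioning along the grain: tightly coupled modules stay at one site.
partitioned = {"core": "A", "billing": "A", "reporting": "A", "ui": "B"}
# A less careful split of the same modules across the same two sites.
scattered   = {"core": "A", "billing": "B", "reporting": "A", "ui": "B"}

print(len(cross_site_edges(partitioned)))  # 2 cross-site dependencies
print(len(cross_site_edges(scattered)))    # 4 cross-site dependencies
```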

Lack of mechanisms for creating shared understanding: Without effective mechanisms for sharing information and facilitating common understanding, managers cannot exploit the benefits of DD [32]. Inadequate dispersal of important information about a project, such as the overall architectural vision, can leave teams with a skewed perception of which tasks are on the critical path [6]. Also, with a lack of understanding of the wider system, reuse opportunities may be overlooked [32]. This is also an issue in the temporal distance dimension, and to some extent in the socio-cultural distance dimension.

5.2.3 Socio-Cultural Distance

Mix of skills and experiences: Globalisation, in general, achieves a constructive cross-fertilization of varying backgrounds and experiences [22], which can enrich coordination efforts between distributed teams.

Language and cultural training: An investment in language training and cultural awareness may be required if team members come from different backgrounds [34], and a compromise culture may need to be established. A bridgehead [13] or liaison [6], i.e. a person from one site working in another site and acting as a mediator between sites, may be helpful.

Lack of domain knowledge: Work on a project can require specific domain knowledge that developers coming from different backgrounds do not have [34]. Organisations can have incompatible views on a domain, based on their own particular experience and expertise [18]. For example, a Norwegian firm outsourcing work on a Norwegian tax software package to a Russian firm realised that the Russian developers did not have sufficient knowledge of the Norwegian tax system when taking on the work [34].

Doubtful of others' capabilities: Developers may be doubtful of the knowledge of team members from other sites, and of their capabilities and skills [6]. This impression may be overcome by promoting familiarity between teams. It has, for example, been reported that American engineers can have concerns about the competency of international engineering teams [6].

5.3 Control in Distributed Development

5.3.1 Temporal Distance

Management of project artefacts: To maintain consistency among project artefacts, a configuration management tool with centralized storage is often used. Even when working from the same central repository, it may be unclear what problems are addressed by a new version of an artefact and what status it is in (such as whether it is still being tested) [9]. Also, when a DD project involves members from different organizations (aka a virtual organization), enforcing process and artefact standards can be particularly important in maintaining consistency and interoperability between project artefacts [26]. This is also a coordination issue.
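One way to make an artefact version's status explicit is to attach structured metadata to each checked-in version. The minimal sketch below is our own illustration of the idea; the field names and status values are assumptions, not a scheme proposed in the paper or by any particular configuration management tool.

```python
# Illustrative sketch: explicit status metadata attached to each artefact
# version in a central repository. Field names and statuses are assumed.

from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    IN_DEVELOPMENT = "in development"
    IN_TEST = "in test"
    RELEASED = "released"

@dataclass
class ArtefactVersion:
    artefact: str                 # e.g. "requirements-spec"
    version: int
    status: Status
    problems_addressed: list = field(default_factory=list)  # issue IDs

    def describe(self):
        ids = ", ".join(self.problems_addressed) or "none"
        return (f"{self.artefact} v{self.version} [{self.status.value}] "
                f"addresses: {ids}")

v = ArtefactVersion("requirements-spec", 7, Status.IN_TEST, ["DD-142"])
print(v.describe())   # remote sites can see the version is still being tested
```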

5.3.2 Geographical Distance

Lack of concurrent engineering principles: Synchronisation is important when teams hand off processes between sites. It requires commonly defined milestones and clear entry and exit criteria. Effectively implementing concurrent engineering principles in DD often becomes difficult because of volatile requirements, unstable specifications, the unavailability of good tools that support collaboration across time and space, and the lack of informal contact [32].

5.3.3 Socio-Cultural Distance

Perceived threat from low-cost alternatives: Employees in the higher-cost economies can feel that their jobs are under threat from their colleagues in lower-cost economies, creating a "we versus they" mentality [14]. They may see a threat to their future employment and promotion prospects: the "my job went to India and all I got was this lousy T-shirt" syndrome. As a result, they may not want to cooperate with their remote colleagues. This, in turn, affects the team's work and can compromise the benefits of globally distributed development. Apart from economic reasons, power struggles can arise between the different teams when the centre of power is not explicitly defined [36].

Adapting to local formalized norm structures: When working in a global setting, companies must learn about local formalized norm structures (applicable laws, traditions, regulations, etc.). For example, applications for visas and work permits may need to be sent some time before a trip between sites [6]. Also, different sites may prefer different development methods [26].

Different perceptions of authority/hierarchy: The nature of authority in a team environment can vary between cultures [37]. It has, for example, been found that Irish developers require their superiors to earn their respect, while U.S. developers give a more unquestioning respect to figures of authority [14].

5.4 Summary of Issues in Distributed Development

The main issues raised above are summarised in Table 2. Where an issue clearly relates to more than one process, or is impacted by more than one dimension, it is repeated in the table; entries where the effect is secondary are marked "(secondary)". This table is very much a summary, and headings may make only limited sense out of the context of the earlier text. In the text above we have also indicated whether an issue is mainly portrayed as a DD advantage or opportunity, a disadvantage or threat, or an open issue that deserves consideration but is not easily classified as one of the two. Due to the complex nature of DD, this classification is obviously coarse-grained but at least serves to indicate main trends in the published peer-reviewed DD literature.

As can be seen from Table 2, the proposed framework can effectively be used to structure the many issues pertinent in DD. Although there are obvious overlaps, the framework provides a structure for discussing DD issues which can be useful for understanding the DD domain as well as being a tool for identifying problem areas where more research is needed. We can, for example, see that control issues, in general, have not been addressed to the same extent as issues related to communication and coordination. The framework also highlights that although geographical distance is perhaps the most intuitive discriminating factor in distinguishing DD from traditional software development, many DD issues relate to socio-cultural and temporal distance. This is an important insight, since many lessons can probably be transferred from other areas dealing with these aspects to enrich the current DD field of investigation. From Table 2 we can also conclude that most of the published DD literature seems to focus on potential threats in DD, with some papers going on to suggest strategies for successful DD which ameliorate these threats, e.g. [6, 22, 30, 44, 45]. One future line of investigation would be to critically examine those threats and explore how and to what extent they might be leveraged into advantages.
Table 2: Framework of distributed development issues.

Temporal Distance
  Communication: Time zone effectiveness; Delayed communication; Delayed feedback.
  Coordination: Time zone efficiency; Reduced hours of collaboration; Synchronous team meetings difficult; Availability of technical infrastructure; Coordination complexity; Modularisation of work (secondary); Lack of mechanisms for creating shared understanding (secondary); Management of project artefacts (secondary).
  Control: Management of project artefacts; Time zone effectiveness (secondary).

Geographical Distance
  Communication: Proximity to market/customer; Lack of informal communication; Dependency on ICT; Increased effort to initiate contact; Providing technical infrastructure; Cost of travel.
  Coordination: Access to large labour pool; Standardisation in work practices; Allocation of roles and team structure; Reduced trust; Lack of awareness/team spirit; Modularisation of work; Lack of mechanisms for creating shared understanding; Coordination complexity (secondary).
  Control: Lack of concurrent engineering principles; Allocation of roles and team structure (secondary).

Socio-Cultural Distance
  Communication: Innovation and shared best practices; Asynchronous communication preferred by non-native speakers; Language differences and misunderstandings; Managing frames of reference.
  Coordination: Mix of skills and experiences; Language and cultural training; Lack of domain knowledge; Doubtful of others' capabilities; Lack of mechanisms for creating shared understanding (secondary); Standardisation in work practices (secondary); Coordination complexity (secondary); Lack of awareness/team spirit (secondary).
  Control: Perceived threat from low-cost alternatives; Adapting to local formalized norm structures; Different perceptions of authority/hierarchy.

6. Conclusion
The core challenges of DD seem to lie in the complexity of maintaining good communication, coordination and control when teams are dispersed in time (e.g. across time zones) and space, as well as socio-culturally. In this work we have elaborated on these themes, drawing on the growing body of literature in the area of DD. To structure our analysis and presentation we have developed a framework that integrates these aspects and provides a detailed overview of the DD field. Although the processes and dimensions of this framework emerge from the review, as far as we are aware the framework has never been fully articulated in the literature; in particular, the directional nature of every dimension in a DD context has not been explicitly noted before. Proven methods for successful DD have not yet been formulated, and the presented framework may be an important tool in identifying the most pressing research issues.

References


[1] ÅGERFALK, P.J. (2004) Investigating Actability Dimensions: A Language/Action Perspective on Criteria for Information Systems Evaluation, Interacting with Computers, Vol. 16, No. 5, pp. 957-988.
[2] AKMANLIGIL, M. and PALVIA, P.C. (2004) Strategies for global information systems development, Information & Management, Vol. 42, No. 1, pp. 45-59.
[3] ALLEN, T.J. (1977) Managing the Flow of Technology, MIT Press.
[4] ARMOUR, P.G. (2002) The organism and the mechanism of projects, Communications of the ACM, Vol. 45, No. 5, pp. 17-20.
[5] BASS, M. and PAULISH, D. (2004) Global Software Development Process Research at Siemens, In The 3rd International Workshop on Global Software Development (co-located with ICSE 2004), pp. 11-14, <gsd2004.cs.uvic.ca/docs/proceedings.pdf>
[6] BATTIN, R.D., CROCKER, R., KREIDLER, J. and SUBRAMANIAN, K. (2001) Leveraging resources in global software development, IEEE Software, Vol. 18, No. 2, pp. 70-77.
[7] BECK, K. (2000) Extreme Programming Explained: Embrace Change, Addison-Wesley, Reading.
[8] BOLAND, D. and FITZGERALD, B. (2004) Transitioning from a Co-Located to a Globally-Distributed Software Development Team: A Case Study at Analog Devices Inc., In The 3rd International Workshop on Global Software Development (co-located with ICSE 2004), pp. 4-7, <gsd2004.cs.uvic.ca/docs/proceedings.pdf>
[9] BRAUN, A., DUTOIT, A.H. and BRUGGE, B. (2003) A Software Architecture for Knowledge Acquisition and Retrieval for Global Distributed Teams, In International Workshop on Global Software Development (co-located with ICSE 2003), pp. 24-29.
[10] BRANNEN, M.Y. and SALK, J.E. (2000) Partnering across borders: Negotiating organizational culture in a German-Japanese joint venture, Human Relations, Vol. 53, No. 4, pp. 451-487.
[11] CARMEL, E. (1999) Global Software Teams: Collaborating Across Borders and Time Zones, Prentice Hall, Upper Saddle River.
[12] CARMEL, E. (2003) Introduction to the Special Issue of EJISDC: The Emergence of Software Exporting Industries in Dozens of Developing and Emerging Economies, The Electronic Journal on Information Systems in Developing Countries, <www.ejisdc.org>
[13] CARMEL, E. and AGARWAL, R. (2001) Tactical approaches for alleviating distance in global software development, IEEE Software, Vol. 18, No. 2, pp. 22-29.
[14] CASEY, V. and RICHARDSON, I. (2004) Practical Experience of Virtual Team Software Development, In European Software Process Improvement (EuroSPI) 2004, Trondheim, Norway.
[15] CLARK, H.H. (1996) Using Language, Cambridge University Press, Cambridge.
[16] CRAMTON, C.D. (2001) The Mutual Knowledge Problem and Its Consequences for Dispersed Collaboration, Organization Science, Vol. 12, No. 3, pp. 346-371.
[17] CROWSTON, K., ANNABI, H., HOWISON, J. and MASANGO, C. (2005) Effective work practices for FLOSS development: A model and propositions, In Proceedings of the 38th Hawaii International Conference on System Sciences, IEEE Computer Society, pp. 1-9.
[18] CURTIS, B., KRASNER, H. and ISCOE, N. (1988) A Field Study of the Software Design Process for Large Systems, Communications of the ACM, Vol. 31, No. 11, pp. 1268-1287.
[19] DAMIAN, D.E. and ZOWGHI, D. (2002) The impact of stakeholders' geographical distribution on managing requirements in a multi-site organization, In Proceedings IEEE Joint International Conference on Requirements Engineering, IEEE Computer Society, Los Alamitos, pp. 319-328.
[20] DAMIAN, D., LANUBILE, F. and OPPENHEIMER, H.L. (2003) Addressing the Challenges of Software Industry Globalization: The Workshop on Global Software Development, In Proceedings 25th International Conference on Software Engineering, IEEE Computer Society, Los Alamitos, pp. 793-794.
[21] DELONE, W., ESPINOSA, J.A., LEE, G. and CARMEL, E. (2005) Bridging Global Boundaries for IS Project Success, In Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05) - Track 1, IEEE Computer Society, pp. 1-10.
[22] EBERT, C. and DE NEVE, P. (2001) Surviving Global Software Development, IEEE Software, Vol. 18, No. 2, pp. 62-69.
[23] ENGESTRÖM, Y. (2000) Activity Theory as a Framework for Analyzing and Redesigning Work, Ergonomics, Vol. 43, No. 7, pp. 960-974.
[24] ESPINOSA, A. and CARMEL, E. (2003) The Impact of Time Separation on Coordination in Global Software Teams: a Conceptual Foundation, Software Process Improvement and Practice, Vol. 8, pp. 249-266.
[25] ESPINOSA, J.A. and CARMEL, E. (2004) The Effect of Time Separation on Coordination Costs in Global Software Teams: A Dyad Model, In Proceedings of the 37th Annual Hawaii International Conference on System Sciences (HICSS'04) - Track 1, IEEE Computer Society, pp. 1-10.
[26] EVARISTO, J.R., SCUDDER, R., DESOUZA, K.C. and SATO, O. (2004) A dimensional analysis of geographically distributed project teams: a case study, Journal of Engineering and Technology Management, Vol. 21, No. 3, pp. 175-189.
[27] GALLIVAN, M.J. (2001) Striking a balance between trust and control in a virtual organization: a content analysis of open source software case studies, Information Systems Journal, Vol. 11, No. 4, pp. 277-304.
[28] GHOSH, T., YATES, J.A. and ORLIKOWSKI, W.J. (2004) Using Communication Norms for Coordination: Evidence from a Distributed Team, In Proceedings of the Twenty-Fifth International Conference on Information Systems, Association for Information Systems, pp. 115-127.
[29] GRINTER, R.E., HERBSLEB, J.D. and PERRY, D.E. (1999) The Geography of Coordination: Dealing with Distance in R&D Work, In Proceedings of the ACM SIGGROUP Conference on Supporting Group Work, ACM Press, New York, pp. 306-315.
[30] HERBSLEB, J.D. and GRINTER, R.E. (1999) Splitting the Organization and Integrating the Code: Conway's Law Revisited, In Proceedings of the 21st International Conference on Software Engineering (ICSE '99), ACM Press, New York, pp. 85-95.
[31] HERBSLEB, J.D., MOCKUS, A., FINHOLT, T.A. and GRINTER, R.E. (2000) Distance, Dependencies, and Delay in a Global Collaboration, In CSCW 2000: ACM Conference on Computer Supported Cooperative Work, ACM Press, New York, pp. 319-328.
[32] HERBSLEB, J.D. and MOITRA, D. (2001) Guest Editors' Introduction: Global Software Development, IEEE Software, Vol. 18, No. 2, pp. 16-20.
[33] HERBSLEB, J.D. and MOCKUS, A. (2003) An Empirical Study of Speed and Communication in Globally Distributed Software Development, IEEE Transactions on Software Engineering, Vol. 29, No. 6, pp. 481-494.
[34] IMSLAND, V., SAHAY, S. and WARTIAINEN, Y. (2003) Key Issues in Managing a Global Software Outsourcing Relationship between a Norwegian and a Russian Firm: Some Practical Implications, In 26th Information Systems Research Seminar in Scandinavia, Finland.
[35] KAROLAK, D. (1998) Global Software Development: Managing Virtual Teams and Environments, Wiley/IEEE Computer Society, Los Alamitos.
[36] KIEL, L. (2003) Experiences in Distributed Development: A Case Study, In International Workshop on Global Software Development: GSD 2003 (co-located with ICSE 2003), pp. 44-47.
[37] KRISHNA, S., SAHAY, S. and WALSHAM, G. (2004) Managing Cross-Cultural Issues in Global Software Outsourcing, Communications of the ACM, Vol. 47, No. 4, pp. 62-66.
[38] MALONE, T.W. and CROWSTON, K. (1994) The interdisciplinary study of coordination, ACM Computing Surveys, Vol. 26, No. 1, pp. 87-119.
[39] MCCHESNEY, I.R. and GALLAGHER, S. (2004) Communication and co-ordination practices in software engineering projects, Information and Software Technology, Vol. 46, No. 7, pp. 473-489.
[40] MCDONOUGH, E.F., KAHN, K.B. and BARCZAK, G. (2001) An investigation of the use of global, virtual, and colocated new product development teams, Journal of Product Innovation Management, Vol. 18, No. 2, pp. 110-120.
[41] NICHOLSON, B. and SAHAY, S. (2001) Some political and cultural issues in the globalisation of software development: case experience from Britain and India, Information and Organization, Vol. 11, No. 1, pp. 25-43.
[42] NURMI, A., HALLIKAINEN, P. and ROSSI, M. (2005) Coordination of Outsourced Information System Development in Multiple Customer Environment: A Case Study of a Joint Information System Development Project, In Proceedings of the 38th Hawaii International Conference on System Sciences, IEEE Computer Society, Los Alamitos, pp. 1-10.
[43] ORLIKOWSKI, W.J. and GASH, D.C. (1994) Technological Frames: Making Sense of Information Technology in Organizations, ACM Transactions on Information Systems, Vol. 12, No. 2, pp. 174-207.
[44] PAASIVAARA, M. and LASSENIUS, C. (2003) Collaboration Practices in Global Inter-organizational Software Development Projects, Software Process Improvement and Practice, Vol. 8, pp. 183-199.
[45] PYYSIÄINEN, J. (2003) Building Trust in Global Inter-Organizational Software Development Projects: Problems and Practices, In International Workshop on Global Software Development (co-located with ICSE 2003), pp. 69-74, <gsd2003.cs.uvic.ca/gsd2003proceedings.pdf>
[46] ROBEY, D., KHOO, H.M. and POWERS, C. (2000) Situated Learning in Cross-Functional Virtual Teams, IEEE Transactions on Professional Communication, Vol. 43, No. 1, pp. 51-66.
[47] SAHAY, S. (2003) Global software alliances: the challenge of standardization, Scandinavian Journal of Information Systems, Vol. 15, pp. 3-21.
[48] SARKER, S. and SAHAY, S. (2002) Information Systems Development by US-Norwegian Virtual Teams: Implications of Time and Space, In Proceedings of the 35th Annual Hawaii International Conference on System Sciences, IEEE Computer Society, Los Alamitos, 10 p.
[49] SPROULL, L. and KIESLER, S. (1991) Connections: New Ways of Working in the Networked Organization, MIT Press, Cambridge.
[50] SUTANTO, J., KANKANHALLI, A. and TAN, B.C.Y. (2004) Task Coordination in Global Virtual Teams, In Proceedings of the Twenty-Fifth International Conference on Information Systems, Association for Information Systems, pp. 807-819.
[51] UNITED NATIONS (2004) World Investment Report 2004: The Shift Towards Services, United Nations Conference on Trade and Development, New York and Geneva, 468 p.
[52] WEBSTER, J. and WATSON, R.T. (2002) Analyzing the Past to Prepare for the Future: Writing a Literature Review, MIS Quarterly, Vol. 26, No. 2, pp. xiii-xxiii.

LEVERAGING DESIGN PATTERNS FOR GLOBAL SOFTWARE DEVELOPMENT: A PROPOSAL FOR A GSD COMMUNICATION PATTERN LANGUAGE

Allan Scott, Luis Izquierdo, Sweta Gupta, Robert Elves and Daniela Damian1
Abstract. We describe four communication patterns that emerged in our global software development experiences. We draw upon both our own experiences and past experiences described in the literature to identify instances and applications of these patterns. We employ known software engineering patterns to formalize our description of these communication patterns. In casting the formal software engineering patterns in the context of global software development communication, our aim is to leverage familiarity and experience.

1. Introduction

Researchers and practitioners of Global Software Development (GSD) use a wide variety of terminology to describe their experience. As researchers use ad-hoc terminology to describe their GSD experiences, fellow researchers and practitioners must continually familiarize themselves with new terminology. This confusion follows from the lack of formal GSD terminology. The rapidly growing body of literature that reports case studies of GSD warrants a formal language for describing GSD. In this paper we present the beginnings of a pattern-based approach to formalizing communication strategies commonly applied in GSD. The idea of developing this pattern language emerged during our experience in an academic GSD environment at the University of Victoria, Canada. An extensive survey of the literature in GSD and our involvement in two global software development projects with clients and developers in Canada and Australia drove our efforts in creating some structure in our understanding of the communication with our distributed team members. In both reading about others' GSD experiences and our own first-hand experience, we observed that identifiable communication patterns emerge. We leverage the façade, mediator, translator, and blackboard patterns from the domain of software engineering to describe the patterns we observed in GSD communication. Section 2 begins by describing in more detail the motivation behind this effort, while Section 3 introduces the GSD project that further motivated us to develop a formal language. The four patterns are detailed in Section 4, where we describe their manifestation within our own experience and then proceed to identify them in the GSD literature. We conclude by discussing directions for future research in Section 5.
1 Department of Computer Science, University of Victoria. E-mail: {aescott, luis, sgoyal, relves, danielad}@cs.uvic.ca


2. Motivation and research background

2.1 Why Use Patterns?

As a very young field, Global Software Development has very little formal terminology. Reports of successful (or, more rarely, failed) projects are usually ad-hoc, occasionally relying on terms that are commonly used but have never been rigorously defined. Using terms without rigorous definitions introduces the risk of miscommunication and misunderstanding. In contrast, once we have a common language, communication becomes easier and more effective, and it facilitates shared understanding of the knowledge we have about the field of GSD. Other attempts have been made to generalize the use of patterns, including work by Coplien [6], who introduced pattern languages and tools for software development in collocated organizations, specifically for the configuration of roles and communication. Beyer and Holtzblatt [3] developed technical organizational patterns for software development with distributed teams, such as Loose Interface, Parser/Builder, Hierarchy of Factories, and Handlers. However, this work is concerned with overall process rather than communication. To our knowledge, no other attempts have been made to introduce a formal language for describing GSD communication. There are several benefits to be derived from the use of design patterns (as briefly introduced in Section 2.2) to create a GSD communication pattern language: they are relatively well defined and documented, they offer a toolbox of proven solutions for various software design problems, and they allow developers to communicate with standardized terms. We hope that by using patterns these strengths can be leveraged in the GSD world. In addition to the general benefit derived from employing a more formal language in GSD, these patterns could be used to describe solutions to challenges in GSD. Researchers and practitioners could use them to refer to observations made in the field, and to recommend ways of improving practice based on patterns of communication that have proven successful in other projects.

2.2 Design Patterns

Software design patterns, introduced by Gamma, Helm, Johnson, and Vlissides in [11], are a set of standardized solutions for common problems in software design. Each pattern comes with a description of the context of the problem and how the pattern should be applied to make the design better. Design patterns should not be confused with algorithms. Algorithms are used to solve computational problems, while design patterns speak directly to software designers about design issues: the structure of a software system. An example is the commonly used Iterator pattern, which is defined as follows: "Provide a way to access the elements of an aggregate object sequentially without exposing the underlying representation" [11]. Patterns such as this are meant to describe best practices in software design. In addition, they are meant to give developers a common language with which to discuss software artifacts and what design decisions were made and why.
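To make the Iterator definition above concrete, here is a minimal Python sketch; the BugList class and its contents are our own illustration, not taken from [11]:

class BugList:
    """Aggregate whose internal representation stays hidden."""
    def __init__(self):
        self._bugs = []

    def add(self, bug):
        self._bugs.append(bug)

    def __iter__(self):
        # Python's built-in iterator protocol: clients get sequential
        # access without ever touching the private _bugs list.
        return iter(self._bugs)

tracker = BugList()
tracker.add("crash on login")
tracker.add("wrong timezone in report")
for bug in tracker:
    print(bug)

Because clients depend only on the iteration protocol, the aggregate's internal representation can change without affecting them, which is exactly the kind of shared vocabulary the pattern gives developers.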


3. Project Background
The Software Engineering program of the University of Victoria in Canada, the University of Technology, Sydney in Australia, and the University of Bari in Italy offered a joint Global Software Development course designed and taught by Daniela Damian and Ban Al-Ani. The students participated in various Global Software Development activities including a course project, which placed them in a realistic requirements engineering process with geographically distributed teams. The course was intended to be a realistic simulation of Requirements Engineering as practiced in a multi-distributed environment, in which stakeholders must overcome distance, time zone differences, cultural barriers, communication glitches and technology deficiencies to obtain a common understanding of their requirements.

Australian groups consisted of undergraduate and graduate students from the Information Technology program at the University of Technology, Sydney. Italian groups consisted of undergraduate students from the Dipartimento di Informatica of the University of Bari. Canadian groups consisted of graduate students from the University of Victoria. For this project the Canadian students were split into three groups of four members each, while the Australian participants were split into two groups of five members each; the Italian students were split into two groups, one of three students and another of seven students. With the exception of the Italian groups, each group played the dual roles of client and developer for two different software projects over seven weeks. Our Canadian group did not work with the Italian groups. Each group created its own structure and organization. Group leaders were elected, and each group also assigned members specific roles such as contact liaison, domain expert, and documentation keeper. All these members had to report their activities to the group during the group meetings. Before the distributed teams began interacting, the Canadian teams prepared the infrastructure to host a knowledge repository and defined communication protocols to be used during the project. During the project, teams had to interact and create documentation (deliverables) about their projects, including a request for proposal and a software requirements specification. Interactions between the teams were carried out with the goal of achieving the common understanding necessary to deliver such documents. Major challenges facing the teams were identified from the literature, and one of the most important was considered to be communication; sets of synchronous and asynchronous tools were listed and used before and during the interaction with the remote teams. For synchronous communication, tools such as teleconferencing systems (AG3, Polycom4), audio conferencing (Skype), and chat (IRC) were used. For asynchronous communication, tools such as email, mail groups (Google Groups), and IBIS (an online inspection system [15]) were used.

3 www.accessgrid.org
4 www.polycom.com


3.1 Project Development Process

The objective of the project was for every team to produce a complete Software Requirements Specification (SRS) document for a project. Each team was composed of a client group paired with a remote developer group. The client group created a Request for Proposal (RFP) document that presented a description, a list of stakeholders, and the main features of the system. The document was submitted to the developer group via email and posted on the project website. In turn, the developer group reviewed the document and prepared for the first formal meeting with the client group using teleconferencing systems and audio over IP. After an initial kick-off, both groups held a requirements elicitation in their first team meeting. During this meeting the developers had the opportunity to ask their clients for information they considered to be missing from the RFP document. This meeting set the foundation for a common project understanding. After this meeting, the developer group created the first version of the SRS, which primarily included functional and non-functional requirements. When the document was ready it was sent to the client group. Clients had the opportunity to use an asynchronous inspection tool called IBIS to perform individual inspections of the SRS. Each client reviewed the document and identified issues found in it. When the inspection was finished a moderator compiled a list of the issues for consideration by the developer group. Before the requirements negotiation meeting, the developer group selected major issues from the list to be discussed during the meeting. The requirements negotiation meeting was the second face-to-face meeting between the client and developer groups. In this meeting both groups had to cover as many issues as possible. This meeting contributed to a better understanding of the system from the client's point of view. The next step was the creation of a demo prototype; this was the third opportunity for the team to clarify any residual issues left with the specification. During the demo prototype meeting only audio over IP and VNC were used. Smart Board technology allowed the team members to perform interactive design. The final product of this process was a requirements specification for a system that could be built in four months. This specification had to fulfill the criteria laid out in the client's Request for Proposal and the developed prototype had to meet with the client's approval.


4. GSD Communication Patterns


We here describe four patterns of GSD communication; Figure 1 combines them into one team communication architecture. Group 1 represents our local Canadian group, while Group 2 represents the remote group in Australia. A member of our local team functioned as mediator throughout these meetings, while other members unofficially fulfilled the role of translator as the need arose. The Australians chose to communicate primarily through one member (a façade), while the Canadian group members chose to communicate individually on an issue-by-issue basis. Computer-based tools such as VNC supported synchronous communication (as on a blackboard), allowing the team to share a common knowledge repository. This tool proved very effective during our prototype demo meeting. Thus the four patterns described in this paper are the Mediator, Translator, Façade and Blackboard.

Figure 1: Team communication architecture and four communication patterns

The communication patterns presented in this paper emerged as a consequence of the nature of the project, and they were easier to recognize during our synchronous communication; this does not mean, however, that they are absent from asynchronous communication. Originally our local group (Group 1) had an email mediator, but we quickly realised that the mediator pattern was not applicable because our group's organization was flat. Every member of the group was able to contact our remote team members without a mediator. When an email was sent to one of the members of the group, the other members were aware of it via a Google Group created for this project. The remote group (Group 2) used the façade pattern in their asynchronous communication, meaning that all emails were sent to one member of this group, who was responsible for propagating this communication to the other group members. The translator role was mostly played by a domain expert who was in charge of replying to emails that contained technical questions about the project under development. CVS was intended to be the common documentation repository for the project and, according to the blackboard definition, falls into the category of an asynchronous blackboard. Hence these patterns were also used in our asynchronous communication. They are described in detail in the sections that follow.


4.1 Façade Pattern

Problem: Communication with members of a dynamic system is subject to disruption and/or breakdown.

Solution: Provide a single point of contact for the dynamic system: a façade. The façade is responsible for providing information after it has been processed internally for the team.

Design Rationale
If communication channels exist between clients and internal team members, changes to the internal team (i.e. new task assignments, new hires) can disrupt these communication channels. In order to present a clear and consistent channel for communication, a façade can be set up to shield the client from internal disruption and provide a consistent, professional business image.

Previous Work
The use of a liaison between local and remote colleagues is an instance of the façade pattern. Battin et al. [2] describe the success of having a liaison and how the liaison gave a 'face' to the remote team. Behind the façade many problems were solved, reducing the need for communication between local and remote team members. This expedites the project since the members do not have to waste time locating the responsible counterpart on the other team, a problem highlighted by many, including Herbsleb [13]. Battin describes the role of the liaison as active and reveals that they participated in development since they "understood [the] system and knew the right person to talk to when there was an issue" [2]. This is the role of a façade: to provide a single point of contact, a single interface.

Our Experience
The members of the remote Australian developer group had not met any of their Canadian or Australian colleagues before the project began. Therefore none of them knew how capable and trustworthy the others were. One member had substantial industry experience writing specification documents and therefore knew what had to be produced, and what questions needed to be asked. This individual provided the face for their organization, and as such took on the role of the façade. Communication between groups always went through this individual and did not need to involve any additional members. Questions were always addressed to this one façade and replies were always forthcoming from this individual. Only a single line of communication was maintained. This made technology-mediated interaction much simpler since there was little disruption switching between speakers. In contrast to this experience, the Australian client group did not use the same communication pattern. They took a more distributed approach to communication. This resulted in confusion and frustration, as we were unable to establish who was responsible for certain aspects of the project. This lack of explicit responsibility assignment may have contributed to their lack of responsiveness when contacted via email. With no clear protocol in place, every member could easily pass responsibility to the next member with no response ever forthcoming. This breakdown in communication severely hampered the development of trust between the two teams. A graphical representation of the façade pattern is shown in Figure 2.

Figure 2: The façade pattern.
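For readers less familiar with the software form of the pattern, the following Python sketch shows how a façade presents one interface while routing requests to hidden subsystems, much as the Australian developer group routed all questions through one member. All class names here are our own illustrative assumptions:

class RequirementsTeam:
    def clarify(self, question):
        return "requirements answer to: " + question

class ArchitectureTeam:
    def clarify(self, question):
        return "architecture answer to: " + question

class TeamFacade:
    """Single point of contact; internal structure stays invisible."""
    def __init__(self):
        self._requirements = RequirementsTeam()
        self._architecture = ArchitectureTeam()

    def ask(self, question):
        # Route internally; the caller never learns who answered.
        if "requirement" in question.lower():
            return self._requirements.clarify(question)
        return self._architecture.clarify(question)

contact = TeamFacade()
print(contact.ask("Is this requirement mandatory?"))

The client sees a stable interface (ask) even if the teams behind it are reorganized, mirroring the design rationale above.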

4.2 The Mediator Pattern

Problem: Allocation of a limited resource (usually speaking time during a meeting) among members of a group.

Solution: One member is given control of the resource, and allocates it as he sees fit. Others may pose requests for it to him, he may operate on a predetermined schedule, or (most often) he may combine the two approaches.

Design Rationale
Giving one member control of the resource minimizes confusion and communication overhead. Everyone knows who to pose requests to and who has final authority over conflicts.

Previous Work
In the literature, a classic example of the mediator pattern is the air traffic control tower [9]. At any airport there is limited air and runway space for planes wishing to take off or land, and obviously having two planes try to use the same space at the same time would be disastrous. It would be extremely difficult (if not impossible) for every airplane's crew to keep track of every other plane and somehow come to a mutual consensus on who gets to go where at what time. Hence, the control tower takes responsibility for tracking all the planes and allotting airspace. Gottesdiener [12] describes another example of the mediator pattern: the role of a workshop facilitator. While the facilitator role is somewhat expanded from that of a pure mediator, he still retains the core responsibility of keeping the workshop on schedule. He also plays a lead role in developing the agenda (which they call a workshop process). Miranda and Bostrom [17] reported that many traditional meeting structures are unproductive. They observed that productive meeting structures are mediated or reinforced by other supporting structures such as training and technology. Damian [7] presents different settings for requirements negotiation in a multi-distributed environment; the findings show that the groups that included a facilitator performed better than other distributed groups and reached the best negotiation outcomes. As stated, the responsibility of a mediator tends to revolve around allocating a limited, shared resource. In the first example, the control tower mediator is allocating airspace. In meetings, the mediator allots time. In software, however, the mediator pattern is typically seen as a way to reduce interdependence between objects. Rather than forcing every object to be aware of and communicate with every other object directly, objects communicate with each other through the mediator. This highlights a fairly evident facet of the mediator pattern in GSD: it discourages informal communication.

Our Experience
In our GSD work, we employed a mediator in every meeting we had with our remote colleagues in Australia, as shown in Figure 3.

Figure 3: The Mediator pattern.
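As a software analogue of the meeting role described in this section, the following Python sketch (names are purely illustrative) shows a mediator allocating the shared resource, speaking turns, according to an agenda, so that participants never coordinate directly with one another:

class MeetingMediator:
    def __init__(self, agenda):
        self.agenda = list(agenda)      # pairs of (item, speaker)
        self.participants = {}

    def register(self, participant):
        self.participants[participant.name] = participant
        participant.mediator = self     # participants know only the mediator

    def run(self):
        for item, speaker in self.agenda:
            print("-- agenda item:", item)
            self.participants[speaker].speak(item)

class Participant:
    def __init__(self, name):
        self.name = name
        self.mediator = None

    def speak(self, item):
        print(self.name, "addresses:", item)

m = MeetingMediator([("prototype demo", "Alice"), ("open issues", "Bob")])
for p in (Participant("Alice"), Participant("Bob")):
    m.register(p)
m.run()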

The purpose of having a mediator role during our communications was defined before starting the interaction with the remote group, for the following reasons:
Do not allow any of the groups to monopolize the time during synchronous interactions.
Follow an agenda and keep the meeting on schedule.
Have just one group liaison with the remote groups for asynchronous communication.
The mediator's function was simply to introduce the agenda and try to keep the meeting moving so that there was time to address all the important issues. After that, much of the job consisted of informing team members that it was their turn to speak. The mediator also took on the added responsibility of deciding who locally could best respond to remote questions. This is in contrast to the role of the façade, which takes responsibility for responding to all questions.


As previously mentioned, the agenda was a key component of the mediator's function. Agendas were decided upon in local meetings held before each remote meeting was scheduled to take place, in which all members would come to a consensus on what items should be raised as part of the agenda. Also mentioned before was the observation that it was difficult to establish informal communication when a mediator had the group focused on working through a task-oriented agenda. Though the mediator allowed us to focus on work items during our meetings, we lost some opportunity to participate in the less formal types of communication which can enhance trust [13]. Originally the mediator's role also included moderating asynchronous communication. The intent was to have only one liaison, and to avoid individual communication between the local and remote team members. After some interaction with the remote group we noted that the mediator pattern was not applicable because the group organization was flat. Every member of the group had the option to contact our remote team members without a mediator; moreover, the size of the project and the volume of communication allowed anybody to view all email, because all messages were stored in a Google Group created for this project.

4.3 Translator

Problem: Teams face communication glitches produced by language barriers or lack of specific technical expertise.

Solution: One or more members of the team play the role of translator, clarifying statements for non-native-language speakers of the group. Alternatively, one or more members of the group function as a domain expert, clarifying technical issues for other members of the group who are not as familiar with the subject.

Design Rationale
Language barriers arise during the development of a distributed project between teams that do not speak the same language. A translator is set up as an adapter between the teams to ensure communication is clear, accurate, and meaningful.

Previous Work
The Translator pattern (which is sometimes called Adapter) has no generic definition. The most common definitions are stated in terms of software implementation, e.g. the adapter "acts like a protocol translator between the client and the server" [19], or "convert[s] the interface of a component into another interface, through an intermediate adapter object" [16]. Based on definitions from the literature [10], we observe that the concept of the Translator/Adapter pattern can be adapted to the practice of Requirements Engineering in multi-distributed development teams. As we noted, the Translator/Adapter design pattern is present during the distributed project development process. The role of the translator is expressed in Figure 4.


Figure 4: The Translator Design Pattern
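In its software form the pattern can be sketched as follows; this hedged Python example (the glossary content and class names are our own assumptions) adapts a client's phrasing to the interface the domain expert expects:

class DomainExpert:
    def explain(self, term):
        glossary = {"SRS": "the Software Requirements Specification document"}
        return glossary.get(term, "no definition for " + term)

class TranslatorAdapter:
    """Presents the expert's interface in the client's terms."""
    def __init__(self, expert):
        self._expert = expert

    def clarify_for_client(self, question):
        # Reduce the client's free-form question to the technical term.
        term = question.rstrip("?").split()[-1]
        return self._expert.explain(term)

adapter = TranslatorAdapter(DomainExpert())
print(adapter.clarify_for_client("What do you mean by SRS?"))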

The globalization of software development has opened new opportunities for companies to become more competitive by outsourcing the development of partial or complete projects to developers in developing nations [1]. On the other hand, this practice introduces the challenge of needing to work across cultures to achieve a common understanding among the stakeholders of a project. From the literature [5] and experience [8] we understand the importance of communication to the success of a project. Companies with multi-distributed settings must cope with coordinating teams whose members may not all speak the same language. The most common language used in software development is English, but often not all members of a multi-distributed team are able to communicate in English at the level required to achieve complete understanding. Companies overcome the language barrier by using a translator/adapter, often an expatriate [14]. Such individuals grew up in a remote location but received their education in North America and have come back to their original country to act as expatriate managers for outsourcing projects. They are part of the cultural bridging staff.

Our Experience
Our local group had four members: two were born in Canada and their first language was English; for the other two members, one from India and one from Peru, English was a second language. As such it was a fairly multicultural team with varied backgrounds and technical experience, and the variety of accents and terminology in English was considerable. Considering that the remote colleagues were located in Australia and the communication media were computer-based, the non-native English speakers faced additional communication challenges. The original roles assigned to the members of the team had not included a translator, because language problems simply had not been anticipated; it was assumed that since both teams were located in English-speaking countries there would be no significant problems. The need to have a translator in the team did arise, for the following reasons:
Different team members have different accents and some team members were not native English speakers.
Sometimes computer-based communication introduced delay in the audio, which hampered understanding.
The high volume of information provided was not easy to digest in the short period of time scheduled for every teleconferencing meeting.
Time constraints limited the level of detail that could be put into questions and comments.
The domain expert also fits the description of a translator or adapter, because he or she is the person that translates the client's ideas and concerns for the rest of the team. The domain expert also clarifies technical issues associated with the design for all members of the team, co-located or remote. We identified the translator pattern in two roles assumed by members of the group during our GSD experience: the language translator and the domain expert translator.

The Language Translator
The role of language translator was not defined at the beginning of the project. The role was implicitly assumed by one of the two native English speakers of the group, and was switched between the two. If a question for one specific member of the team required extra clarification, the translator either repeated the question or explained its context for an individual who could answer it. After the response, if the translator felt that the answer was not meaningful enough, he restated the answer in terms that could be more understandable to the remote members of the team.

The Domain Expert Translator
The role of the domain expert translator was assumed by any member of the team who considered him/herself to be an expert in the area under discussion. This person explained/translated the client's concerns about technical issues or other specific subjects. This role has been identified as a "solution provider" in previous experiences [12].

4.4 Blackboard Pattern

Problem: Stakeholders are unable to build a satisfactory solution individually.

Solution: A team uses a common repository to share their knowledge and build a potential solution.

Design Rationale
The basic model of a blackboard consists of three components: blackboard, knowledge source and controller. The blackboard provides a central data repository and a space in which solutions can be worked out. Individuals interact with the blackboard, observing information posted by others and possibly posting information of their own or candidate solutions. The controller monitors the state of the blackboard and undertakes the necessary actions whenever a solution is found.
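A minimal Python sketch of these three components, with illustrative names and a deliberately trivial test for when a solution has emerged, might look like this:

class Blackboard:
    """Central repository on which partial results accumulate."""
    def __init__(self):
        self.entries = []

    def post(self, author, content):
        self.entries.append((author, content))

class KnowledgeSource:
    def __init__(self, name, contribution):
        self.name, self.contribution = name, contribution

    def act(self, board):
        board.post(self.name, self.contribution)

class Controller:
    """Monitors the board and stops once a solution is found."""
    def __init__(self, board, sources):
        self.board, self.sources = board, sources

    def run(self, is_solved):
        for source in self.sources:
            source.act(self.board)
            if is_solved(self.board):
                return self.board.entries
        return None

board = Blackboard()
sources = [KnowledgeSource("client", "issue list"),
           KnowledgeSource("developer", "revised SRS")]
print(Controller(board, sources).run(lambda b: len(b.entries) >= 2))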


The blackboard pattern for the usage of IBIS is shown in Figure 5.

Figure 5: The Blackboard pattern.

Previous Work
A canonical example of the blackboard pattern is a group of experts gathered around a physical blackboard to solve a problem. The experts stand around the blackboard, on which they can read and write, and as they work on the solution they write on the blackboard. The problem is considered solved if the group agrees that an adequate solution has emerged on the blackboard [20]. Braun, Dutoit and Brugge [4] described and applied a distributed concurrent blackboard architecture as the primary data repository for a distributed software development project. This architecture underlies iBistro, an environment that allows distributed teams to capture, structure and retrieve information. It also supports the capture of whiteboard content and the tracking of discussions. This information is stored in several layers of abstraction within the blackboard. Paasivaara and Lassenius [18] suggested the use of a bulletin board, e-mail lists and a problem e-mailbox for solving communication problems in multi-distributed software development. In their case study they observed that the bulletin board was the most effective medium for solving a problem related to miscommunication during the development process. Suchan and Hayzak [21] reported the use of Lotus Notes groupware as an information repository and for file sharing and email. During their case study, Lotus Notes helped the team re-see the client's problems from different points of view.

Our Experience
The blackboard pattern describes our usage of IBIS, VNC, and Google Groups. The goal of IBIS is to support scalable, distributed software inspection over the Internet, thus enabling the detection and elimination of defects as early as possible in the software life cycle [15]. The IBIS tool was used for inspection and discrimination as described in Section 3. VNC (Virtual Network Computing) software makes it possible to view and fully interact with one computer from any other computer or mobile device anywhere on the Internet. We used VNC frequently during our meetings with remote groups. The use of VNC in this context was as an interface through which both local and remote teams could simultaneously view a shared document (blackboard) and make changes. Both groups used VNC to synchronize work done on the requirements during team meetings. Google Groups is an online discussion group and mailing list system provided by Google Inc. Our Google Group worked as an asynchronous communication repository for our local (Canadian) group. Every project had its own Google Group, and all emails were sent to this common address, which functioned as a blackboard for our asynchronous communication. The advantage of using this technology was that it created an email repository in which it was easy to navigate conversation threads. Both groups collaborated to inspect the requirements and discriminate the issues found using IBIS. Each issue had a separate discussion thread where the developer and the client could discuss the issue by posting messages asynchronously. IBIS (an example of the blackboard pattern) was instrumental in enabling close cooperation and communication to resolve conflicts and confusion during the requirements negotiation phase.

5. Conclusions
This is an initial and unique approach to formalizing communication patterns that emerge in global software development. While there have been attempts to formalize the process of software design using patterns, to the best of our knowledge this is the first paper to consider using software engineering patterns to describe communication in global software development. We hope that using these patterns allows us to succinctly communicate a rich understanding of our GSD communication experiences.

Future Research
In our initial research we have identified and elaborated on four patterns of communication in GSD. Our present work must be validated in more distributed development projects within industry and be subject to refinement in order to make the patterns applicable to a wider range of projects. Future research must seek to identify and document additional patterns that help practitioners understand GSD and communicate their experiences in a formal manner. A complete collection of patterns could then be made available to the GSD community. Naturally, for any formalism to serve its purpose it must be widely adopted. Another approach to increasing our understanding of GSD and the problems faced may be to formalize the problems these patterns provide solutions to. This paper begins to formalize the solution domain; future work could entail attempts to formalize the problem domain of GSD: the organization and project concerned. Once we have both formalisms, we would be able to describe GSD solutions completely in a formal fashion.


6. References
[1] ARORA, A. and GAMBARDELLA, A., Globalization of the software industry: Perspectives and Opportunities for Developed and Developing Countries, May 2004.
[2] BATTIN, R.D., CROCKER, R., KREIDLER, J. and SUBRAMANIAN, K., Leveraging resources in global software development, IEEE Software, Vol. 18, No. 2, March-April 2001, pp. 70-77.
[3] BEYER, H. and HOLTZBLATT, K., Contextual design: defining customer-centered systems, Morgan Kaufmann Publishers Inc., San Francisco, CA, 1998.
[4] BRAUN, A., DUTOIT, A. and BRUGGE, B., A software architecture for knowledge acquisition and retrieval for global software development, In International Workshop on Global Software Development, International Conference on Software Engineering, Portland, Oregon, May 9, 2003.
[5] CARMEL, E., Global Software Teams, Chapter 4, Prentice Hall, 1999.
[6] COPLIEN, J., A Development Process Generative Pattern Language, In Coplien, J.O. and Schmidt, D., eds., Pattern Languages of Program Design, Addison-Wesley, Reading, MA, 1995.
[7] DAMIAN, D.E., EBERLEIN, A., SHAW, M.L.G. and GAINES, B.R., Using different communication media in Requirements Negotiation, IEEE Software, May/June 2000, pp. 28-36.
[8] DAMIAN, D. and ZOWGHI, D., An insight into the interplay between culture, conflict and distance in globally distributed requirements negotiations, In Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS'36), Hawaii, January 2003.
[9] DUELL, M., GOODSEN, J. and RISING, L., Non-Software Examples of Software Design Patterns, OOPSLA '97 workshop; examples available at http://www.cs.uni.edu/~wallingf/teaching/062/sessions/support/pattern-examples.pdf.
[10] FREDJ, M. and ROUDIES, O., A pattern based approach for requirements engineering, In The Tenth International Workshop on Database and Expert Systems Applications, 1999.
[11] GAMMA, E., HELM, R., JOHNSON, R. and VLISSIDES, J., Design Patterns, Addison-Wesley Professional, 1995.
[12] GOTTESDIENER, E., Requirements by Collaboration, Addison-Wesley, 2002.
[13] HERBSLEB, J.D., MOCKUS, A., FINHOLT, T.A. and GRINTER, R.E., An Empirical Study of Global Software Development: Distance and Speed, In Proceedings of the International Conference on Software Engineering, Toronto, Canada, May 15-18, 2001, pp. 81-90.
[14] KRISHNA, S., SAHAY, S. and WALSHAM, G., Managing cross-cultural issues in Global Software Outsourcing, Communications of the ACM, Vol. 47, No. 4, April 2004, pp. 62-66.
[15] LANUBILE, F., MALLARDO, T. and CALEFATO, F., Tool Support for Geographically Dispersed Inspection Teams, Software Process: Improvement and Practice, 8(4), October/December 2003, pp. 217-231.
[16] LARMAN, C., Applying UML and Patterns, Prentice Hall, 2002, pp. 342-345.
[17] MIRANDA, S. and BOSTROM, R., Meeting Facilitation: Process versus Content Interventions, Journal of Management Information Systems, 15(4), 1999, pp. 89-114.
[18] PAASIVAARA, M. and LASSENIUS, C., Collaboration practices in Global Inter-Organizational Software Development Projects, Software Process: Improvement and Practice, 8(4), 2003, pp. 183-200.
[19] PRASAD, Framework and Design Patterns: Reusability Revisited, www.cs.wright.edu/~tkprasad/courses/ceg860/L156DP.ppt, accessed May 2005.
[20] STIGER, P. and GAMBLE, R., Blackboard Systems Formalized Within Software Architectural Style, In IEEE International Conference on Systems, Man, and Cybernetics, 1997.
[21] SUCHAN, J. and HAYZAK, G., The Communication Characteristics of Virtual Teams: A Case Study, IEEE Transactions on Professional Communication, Vol. 44, No. 3, September 2001, p. 174.


Using ontologies in Distributed Software Development

Karin K. Breitman1, Miriam Sayo1,2, Leonardo M. Couto1,3
{karin, miriam}@inf.puc-rio.br, matriciano@petrobras.com.br
PUC-Rio1, PUC-RS2, Petrobras3

Abstract
Distributed software development poses new challenges to the requirements engineer. Communication among geographically distant parties, cultural differences and synchronization issues make verbal contact harder and impose more rigid requirements specification standards. The requirements document is often the only communication medium. The use of controlled vocabularies is common practice to help improve communication. In this article we argue that the use of ontologies in place of lexicons or dictionaries can greatly improve communication efficiency. We exemplify our approach with a real example, the Seismic Control System.

Keywords: Distributed Software Development, Ontology, Requirements

1. Introduction
Many companies are applying distributed software development as a way to reduce costs and shorten development time in the global market. Different perspectives on the difficulties encountered in the distributed process are available in the literature [Carmel99, Paré99, Prikladnicki04, Damian03, Evaristo04]. Communication is a central challenge. Communication is critical to the success of distributed projects: Gorton and Motwani relate that as much as 22% of project time is devoted to communication among developers alone, and the time spent on asynchronous communication, including e-mail and other communication tools, amounts to around 17%. Another study, by Cherry and Robillard [Cherry04], identified that cognitive synchronization, i.e., communication between two or more developers with the goal of making sure that they share the same knowledge or representation of a given object, consumes 29% of total development time. Comparing co-located to distributed processes, Bianchi et al. [Bianchi02] relate that stakeholders communicate more frequently in distributed projects. These studies indicate the strong importance of communication processes in distributed software projects. Communication activities are directly affected by cultural and linguistic differences among stakeholders. Very frequently such differences result in misunderstanding of the project requirements, which are usually stated in natural language. Requirements negotiation and prioritization are impacted by physical distances and communication problems among stakeholders. Time zone compatibility also becomes an issue when the time frames do not allow for synchronous information exchange. Collaborative requirements-related tasks sometimes end up being carried out without face-to-face exchanges, based solely on document exchange.


In order to maintain document consistency and unity, some authors suggest the adoption of standards [Prikladnicki03, Prikladnicki04, Damian03, Lopes04]. There are standards for the requirements specification, for the application vocabulary (or lexicon) and for the development process itself. Some process models used in practice clearly define the requirements artifacts. That is the case of RUP (Rational Unified Process), which advocates the use of the Vision document allied to Use Case Models, a Glossary and a Supplementary Specification. The IEEE requirements standard Std 830 is also frequently used in practice [Lopes04]. The most popular standards for the application vocabulary are lexicons and glossaries.

2. Controlled Vocabularies
As a means to improve project communication many Requirements Engineering approaches suggest the adoption of controlled vocabularies, such as lexicons or dictionaries, to capture the definitions of important application concepts [Sommerville98, Young01, Gottesdiener02, Leite90, Leffingwell00, Jacobson99, Robertson05]. The aim is to ameliorate problems with the project terminology, which Sommerville and Sawyer identify as the most common source of confusion in the requirements document [Sommerville98]. RUP prescribes the use of a Glossary of important terms and their definitions. This document is to be shared among developers and to serve as a basis for every project artifact. The glossary is useful in reaching a consensus about the definition of the various terms in use, thus reducing the risk of misunderstandings [Jacobson99]. Capturing naming conventions is part of the Volere method template [Robertson05]. This document contains a glossary that defines the meaning of terms used in the requirements document. As most projects make use of domain-specific terminology, this document is used as a shared reference resource. Lexicons and glossaries are widely used in software practice. However, their informality and lack of structure may impact the quality of the descriptions as well as the contextualization of information. In traditional software development there is an ongoing informal communication process that is responsible for filling those blanks. In distributed software development this mechanism is not available. Problems that result from different interpretations of the requirements are discovered later in the project, demanding rework and impacting costs and project deadlines. It is a well-known fact that the later an error is discovered, the more expensive it is to fix [Boehm81]. Cultural differences, in this case, aggravate the scenario and increase the probability of misunderstandings. Thus more formal representations are needed. In this light we propose the use of ontologies.

3. Ontology and Distributed Software Development


An ontology is a conceptual model used to represent concepts and relationships in a given domain. As a philosophical discipline, ontology building is concerned with providing category systems that account for a certain vision of the world [Guarino98]. In computer science, ontologies were first used by Artificial Intelligence researchers to facilitate knowledge sharing and reuse [Fensel01]. Today they are becoming widespread in areas such as intelligent information integration, cooperative information systems, agent-based software engineering and electronic commerce. One of the most cited ontology definitions is Gruber's: "An ontology is a formal, explicit specification of a shared conceptualization" [Gruber93], where "conceptualization" stands for an abstract model, "explicit" means that the elements are clearly defined and, lastly, "formal" means that the ontology should be machine processable. Ontologies provide a formal and structured way to describe a domain. They describe concepts, properties, relationships and axioms, which are organized in a taxonomic structure based on the generalization abstraction. Existing ontology language standards, such as OWL, provide a consistent way to write and interpret ontologies [W3C]. In Figure 1 we illustrate how a concept is described in an ontology (in the OWL language concepts are called classes). The Asserted Hierarchy tells us that a block is a type of Information Element, as are its siblings, from which block is disjoint. Besides its description, the entry contains hierarchical information (its superclass) and the relationships that the class Block holds to other classes in the ontology. A glossary entry would only contain the text in the description.

Figure 1 Ontological representation of the block concept
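As a hedged illustration of what Figure 1 states, the following Python sketch uses the rdflib library to assert the Block class as OWL triples. The namespace URI, and the choice of rdflib rather than the Protégé tooling discussed later, are our own assumptions:

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/seismic#")
g = Graph()
g.bind("ex", EX)

# Declare the classes and the taxonomic structure of Figure 1.
for cls in (EX.InformationElement, EX.Block, EX.SLAcquisition):
    g.add((cls, RDF.type, OWL.Class))
g.add((EX.Block, RDFS.subClassOf, EX.InformationElement))
g.add((EX.SLAcquisition, RDFS.subClassOf, EX.InformationElement))
g.add((EX.Block, OWL.disjointWith, EX.SLAcquisition))

# The natural language description is kept as an annotation.
g.add((EX.Block, RDFS.comment, Literal(
    "Part of a sedimentary basin defined by the geographical "
    "coordinates of its edges.")))

print(g.serialize(format="turtle"))  # rdflib >= 6 returns a str here

Unlike a glossary entry, the same structure that a human reads (subclass, disjointness) is available to a machine for checking and querying.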

3.1 Ontology versus Lexicons

We propose that lexicons and dictionaries should be replaced by ontologies in DSD. Ontologies provide a series of advantages over lexicons, which can be summarized as:
1. the format describing an ontology item (concept or relationship) is more structured (and thus easier to read and understand) than a loose natural language description.
2. ontologies are written using standard, machine-processable languages, e.g. OWL and DAML+OIL, instead of unstructured natural language.
3. the structure of an ontology is that of an acyclic graph, which allows better visualization than an alphabetical list.


4. ontologies allow for automatic consistency checks and verification, whereas lexicons are very time-consuming to verify.
5. ontology reasoning tools provide automatic concept classification, thus allowing the identification of concept subsets that conform to a group of conditions. This mechanism, also known as a view in the database community, is very practical when dealing with a large number of concepts.
6. ontologies promote reuse. Through a namespace borrowing mechanism it is possible to easily include concepts defined by other ontologies, reducing construction effort and time considerably.
7. existing ontology alignment tools, e.g. the Protégé Prompt plug-in [Protégé], allow for comparison and alignment with other ontologies and standards. This will be particularly useful when ontologies are adopted at large, for this mechanism can be used to detect differences in terminology. Classical examples are polysemy (a term used in different senses) and the evolution of a term over time. Even within the same company, terminology used by one project may assume different tones in another.

To illustrate the use of ontologies in a DSD project, we present examples from the Seismic Control Application. This application is currently being developed by distributed teams at a large petroleum company. Different branches of the company are involved and, because of the geographical distance that separates the groups, DSD techniques are in practice. At first a lexicon of terms was developed with the objective of providing a common reference for domain terminology. Developers found the lexicon useful but lacking in detail, especially where it concerned the discovery of relationships that hold among concepts. We have produced an ontology to replace the original lexicon; the ontology is now in use. Lexicons are organized in alphabetical order and have a flat structure. Relationships to other concepts must be inferred from the natural language descriptions of the term. Ontologies, on the other hand, are organized in the format of an acyclic graph. In such graphs, there is immediate visual identification of related concepts: concepts that are strongly related to one another are those connected by edges. Figure 2 illustrates the overall structures of the ontology and the lexicon. Comparing the block concept in both representations, we observe that the ontology provides a visual indication of other classes that are related to Block, whereas we can only infer possible relationships from the text in the lexicon.
Perhaps the greatest contribution of the adoption of ontologies is the fact that they can be automatically processed. Ontologies coded in OWL, the W3C consortium's recommended ontology language, are directly mapped onto Description Logics. The latter is a subset of First-Order Logic and allows ontologies to be processed by inference machines that verify consistency, answer queries and derive new classes automatically [Antoniou04]. From a requirements perspective an inference mechanism can be used to increase specification quality, since it allows for automatic consistency verification. The possibility of deriving new relationships from existing ones is one of the greatest differentials of ontologies when compared to traditional conceptual representation models. For instance, using object orientation formalisms it is possible to describe elements using a set of necessary conditions, i.e., to belong to a class an instance must present the whole set of necessary conditions. However, it is not possible to define necessary and sufficient conditions, i.e., definitions. With ontologies, on the other hand, it is possible to build new concepts from existing ones. In other words, it is possible for new concepts to be identified, and added to the ontology, automatically by the reasoning tool. Using this mechanism it is possible to create new partitions, or views, of the conceptual model.


[Figure 2 contrasts the flat, alphabetically ordered structure of the lexicon with the acyclic-graph structure of the ontology, using concepts from the Seismic Control Application such as Block, Information Element, SLAcquisition, SLProcessing, SeismicActivity, Seismic and Seismic_3D_Acq. The lexicon entry for Block reads: "Part of a sedimentary basin, formed by a vertical prism of undetermined depth, whose polygonal surface is defined by the geographical coordinates of its edges. It is the location where oil and natural gas exploration or production activities take place."]

Figure 2 Lexicon and Ontology structures

We exemplify the creation of views using the Seismic Ontology. Note that concepts are organized according to the kind of data they represent, i.e., information element types. Another way to organize the concepts would be along a dimensional axis, that is, all 2-D elements in one class and all 3-D elements in a second. Simple observation of the Seismic Ontology, depicted in Figure 2, shows that this would be an orthogonal way to organize concepts, for 2-D-related elements are spread across the ontology. To implement this new organization, or vision, we create a new class, Dimension_2D, under the ontology root, the owl:Thing class. We make this concept a definition by adding a necessary and sufficient restriction (Has_dimension 2D). With the help of the inference mechanism, or classifier, all the classes that fulfill the definition are now classified under the Dimension_2D class. Figure 3 illustrates the final result. This mechanism can be used to create different visions, or aspects, of the ontology concepts.
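A sketch of this defined class in the same rdflib style used earlier could look as follows; the property name Has_dimension and the literal value "2D" follow the text above, while the encoding details are our assumptions:

from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF

EX = Namespace("http://example.org/seismic#")
g = Graph()
g.bind("ex", EX)

g.add((EX.Dimension_2D, RDF.type, OWL.Class))

# The anonymous restriction class: "everything with Has_dimension 2D".
restriction = BNode()
g.add((restriction, RDF.type, OWL.Restriction))
g.add((restriction, OWL.onProperty, EX.Has_dimension))
g.add((restriction, OWL.hasValue, Literal("2D")))

# Necessary AND sufficient: equivalence, not mere subclassing, is what
# lets the classifier place qualifying classes under Dimension_2D.
g.add((EX.Dimension_2D, OWL.equivalentClass, restriction))

print(g.serialize(format="turtle"))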


Figure 3 Automatic identification of classes by inference

4. Conclusion
In this article we focused on how ontologies can be used to minimize critical DSD problems. Making domain knowledge explicit, providing a formal representation in which to share information unambiguously, and allowing for automatic consistency verification are some of the advantages that can help improve communication among distributed teams. From a practical point of view, the adoption of ontologies is facilitated by the large number of online ontology materials [OilEd_Tutorial, Horridge, Rector & Noy]. A great number of tools for the editing, verification, classification, alignment and visualization of ontologies, most of them freeware, are also available online. Examples are Stanford's Protégé and the University of Manchester's OilEd tool [Protégé, OilEd, OWL_Val, OntoViz]. In this article we focused on the advantages of replacing traditional vocabularies with ontologies. We believe that the system requirements should also be organized and represented using ontologies. We explored the use of ontologies to formalize service specifications in multi-agent systems in [Breitman04]. In that work we used ontologies to represent system requirements as a means to eliminate ambiguity and allow machine-based inference and verification. This approach made the requirements explicit and centralized the specification in a single document (the ontology itself), while at the same time providing a formal, unambiguous representation that could be shared among developers and software agents (ontologies are machine processable). Of course the adoption of ontologies alone will not improve the requirements elicitation process. Ontologies are conceptual models. A requirements process that uses ontologies as its conceptual model, however, demands that a great deal more information be elicited earlier in the process (early binding), thus reducing the risk of finding errors later in the process. Although effort-consuming, this can be very useful in requirements engineering practice. A known problem in requirements elicitation is tacit knowledge [Sommerville98]: trivial information is hard to elicit because most stakeholders will not remember it or will not think it important to state. Berry suggests that the requirements engineer behave in the manner of an ignoramus during elicitation, so as to force users and clients to offer more details [Berry95]. In this light an ontology serves as a formalization of the ignoramus, because it demands that information be provided exhaustively. This work is part of a larger project, in which we are investigating the potential uses of semantic web related technologies in Distributed Software Development settings. Among others, we are investigating the possible uses of Metadata, Semantic Web Services and Ontology to build an agent-based framework to support DSD. In this framework software agents will be responsible for coordinating communication, search, comparison and negotiation tasks that are currently performed, at great effort, by software developers.

5. References
[Antoniou04] Antoniou, G. & Harmelen, F. - A Semantic Web Primer - MIT Press, Cambridge, Massachusetts, 2004.
[Berners-Lee01] Berners-Lee, T., Hendler, J. & Lassila, O. "The Semantic Web". Scientific American, May 2001.
[Berry95] Berry, D. "The Importance of Ignorance in Requirements Engineering". Journal of Systems and Software, vol. 28(2), Feb. 1995, pp. 179-184.
[Bianchi02] Bianchi, A., Caivano, D., Lanubile, F., Rago, F. & Visaggio, G. "An Empirical Study of Distributed Software Maintenance". In: International Conference on Software Maintenance (ICSM02). Proceedings.
[Boehm81] Boehm, B. - Software Engineering Economics - Prentice Hall, 1981.
[Breitman04] Breitman, Karin K., Haendchen Filho, Aluizio, Haeusler, Edward H. & Staa, Arndt von. "Using Ontologies to Formalize Services Specifications in Multi-agent Systems". Lecture Notes in Computer Science, Volume 3228, Jan. 2004, page 92.
[Carmel99] Carmel, E. "Global Software Teams". Prentice-Hall, 1999.
[Cherry04] Cherry, S. & Robillard, P. "Communication Problems in Global Software Development: Spotlight on a New Field of Investigation". In: Third International Workshop on Global Software Development, May 24, 2004, Edinburgh, Scotland. Proceedings.
[Damian03] Damian, D. & Zowghi, D. "RE challenges in multi-site software development organizations". Requirements Engineering Journal, 8(3), 2003, pp. 149-160.
[Evaristo04] Evaristo, R., Audy, J.L.N., Prikladnicki, R. & Avritchir, J. "Wholly Owned Offshore Subsidiaries for IT Development: a Program of Research". AMCIS 2004, New York, NY.
[Fensel01] Fensel, D. Ontologies: a silver bullet for knowledge management and electronic commerce, Springer, 2001.
[Gottesdiener02] Gottesdiener, E. - Requirements by Collaboration - Addison-Wesley, 2002.
[Gruber93] Gruber, T.R. "A translation approach to portable ontology specifications". Knowledge Acquisition, 5: 199-220, 1993.
[Guarino98] Guarino, N. "Formal Ontology and Information Systems". In Proceedings of FOIS'98 - Formal Ontology in Information Systems, Trento, 1998.
[Horridge] Horridge, Matthew - A Practical Guide To Building OWL Ontologies With The Protégé-OWL Plugin - http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf
[Jacobson99] Jacobson, I., Booch, G. & Rumbaugh, J. The Unified Software Development Process, Addison-Wesley Longman, 1999.
[Leffingwell00] Leffingwell, D. & Widrig, D. Managing Software Requirements, a Unified Approach. Addison-Wesley, Reading, Massachusetts, 2000.
[Leite90] Leite, J.C.S.P. & Franco, A.P. "O uso de hipertexto na elicitação de linguagens da aplicação". In: 4º Simpósio Brasileiro de Engenharia de Software, 1990. Anais. Sociedade Brasileira de Computação, ed. pp. 124-133.
[Lopes04] Lopes, L., Majdenbaum, A., Bastos, R. & Audy, J. "Um Modelo de Requisitos para Especificação em Linguagem Natural".


[Noy01] Noy, N. & McGuinness, D. "Ontology Development 101: A Guide to Creating Your First Ontology". KSL Technical Report, Stanford University, 2001.
[OilEd] http://oiled.man.ac.uk/
[OilEd_Tutorial] http://oiled.man.ac.uk/tutorial/
[OntoViz] http://protege.stanford.edu/plugins/ontoviz/ontoviz.html
[OWL_Val] http://owl.bbn.com/validator/
[Paré99] Paré, G. & Dubé, L. "Virtual Teams: An Exploratory Study of Key Challenges and Strategies". In: 20th International Conference on Information Systems, Dec. 1999, Charlotte, North Carolina, United States. Proceedings. pp. 479-483.
[Prikladnicki03] Prikladnicki, R.; Audy, J. & Evaristo, R. "Global Software Development in Practice: Lessons Learned". Software Process: Improvement and Practice, 2003; 8: pp. 267-281.
[Prikladnicki04] Prikladnicki, R. & Audy, J. "MuNDDoS - Um Modelo de Referência para Desenvolvimento Distribuído de Software". In: XVIII Simpósio Brasileiro de Engenharia de Software, 2004, Brasília, DF, Brasil. Anais. pp. 289-304.
[Protégé] http://protege.stanford.edu/
[Rector & Noy] Rector, A.; Noy, N.; Knublauch, H.; Schreiber, G. & Musen, M. "Ontological Design Patterns and Problems: Practical Ontology Engineering Using Protégé-OWL". Tutorial, ISWC 2004. Available at: http://www.co-ode.org/resources/tutorials/iswc2004
[Robertson05] Robertson, S. & Robertson, J. Requirements-Led Project Management. Addison-Wesley, 2005.
[Sommerville98] Sommerville, I. Software Engineering. Addison-Wesley, Reading, MA, 1998.
[W3C] OWL Web Ontology Language Guide, section "Ontology Structure" (#StructureOfOntologies). Available at: http://www.w3.org/TR/2004/REC-owl-guide-20040210

[Young01] Young, R. Effective Requirements Practices, Addison-Wesley, 2001.


COORDINATION AS THE CHALLENGE OF DISTRIBUTED SOFTWARE DEVELOPMENT

GAMEL O. WIREDU


Abstract
This paper takes the position that coordination is the key challenge in the organisation of distributed software development. Based on distributed-, organisational- and software-based parameters, the paper presents an analysis of the mutual shaping between these parameters and coordination. The paper follows the analysis with tentative theoretical speculations about the potentialities of the complexities inherent in this mutual shaping, and argues that research efforts on distributed software development should be directed at the coordination challenge.

Introduction
Software development is a multifarious activity that presents extensive challenges to both researchers and practitioners concerned with it. In very broad terms, these challenges have been categorised by Brooks [1] into essence and accidents, to capture the inherent properties of software-in-development (essentials) and its attendant problems (accidents):
Following Aristotle, I divide them into essence, the difficulties inherent in the nature of software, and accidents, those difficulties that today attend its production but are not inherent. [1].

While both essentials and accidents of software are relevant in their own right and in combination, software development researchers, especially those in Organisation Science and Computer-Supported Cooperative Work (CSCW), have largely concerned themselves with developing management models and concepts aimed at tackling the accidents. To put software accidents into perspective and to capture their variegated nature, I categorise them broadly into three interdependent and interrelated facets: organisational, technological and socio-cultural. These categories may seem arbitrary, but they are not: they fall under well-known dimensions of organizations, namely people, process and technology [see for example 4]. The problem of organising therefore aims at optimising outcomes from these dimensions through several management functions [see for example 3]. One of these functions, coordination, emerges as a dominant concept that embodies action-related parameters such as communication, cooperation, collaboration and knowledge sharing, and inevitably relates to all other accident attributes in cause-effect fashion. This paper will argue that coordination is the pervading challenge of distributed software development. Based on the underlying premise that distribution is a significant accident, I argue, in addition, that research efforts must be directed at understanding the cause-effect complexities of coordination that are associated with distributed software development.

Drawing upon the Coordination Theory of Malone and Crowston [7, 8], an activity is coordinated through processes and mechanisms. Coordination processes are actions such as managing shared resources, producer/consumer relationships, simultaneity constraints and task/subtask dependencies. A coordination mechanism is a standardised or structured representation of these processes: it is a construct consisting of a coordinative protocol (an integrated set of procedures and conventions stipulating the articulation of interdependent distributed activities) on the one hand, and on the other hand an artefact (a permanent symbolic construct) in which the protocol is objectified [14, p. 165] (emphases in original).

Although distribution is specifically mentioned in Schmidt and Simone's definition, their deliberations on the concept virtually overlooked or ignored the specific problem of distribution (especially geographic or remote distribution) as a significant accident of software that is as pervading as coordination. By itself, distribution remotely separates software developers and hence raises the coefficient of coordination effort and the processes and mechanisms required [see 10]. An inevitable corollary of the remote separation of developers is the emergence of location or place as a significant conditioner of developers' actions. In a distributed activity, the modes of actors' actions necessarily correspond with the given conditions that derive directly from the peculiarities of those locations in the distribution that host those actions [16]. Against this background, an intriguing question is: what are the conditions that derive from the peculiarities of those locations in a distributed software development activity? In my opinion, they are the socio-cultural, organisational and technological accidents of software development, perceived as external factors that affect developers' actions. How, therefore, do these accidents condition the construction and operationalisation of coordination processes and mechanisms in distributed software development?

The latter question requires a comprehensive research endeavour, entailing empirical studies and theoretical analysis, to answer satisfactorily. However, this position paper will attempt to address it through a theoretical analysis and synthesis of software accidents and their interrelation with the essentials in distributed software development settings. The discussion will lay out tentative, yet clear, conceptual foundations of distributed development conditions, allowing for sound speculations about the accidents' potential to condition the construction and operationalisation of coordination processes and mechanisms. Although Brooks asserts that software engineering must direct more effort towards the essentials, this paper directs attention towards the coordination challenge, which is both manifested and operationalised through accidents. The motive is not to criticise Brooks' assertion; rather, it is to bring the coordination problem of distributed software development to the fore. Nevertheless, it is necessary, first of all, to briefly present the key essentials against which the subsequent expositions of the distribution-related accidents will be cross-examined and analysed.
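To make the dependency categories named above concrete, here is a minimal, purely illustrative Java sketch (invented for this text, not taken from any of the cited works; all names are hypothetical) that represents them as plain data types over which a coordination protocol could operate.

// A purely illustrative sketch of Malone and Crowston's dependency
// categories [7, 8] as plain Java types; not from the cited works.
import java.util.ArrayList;
import java.util.List;

public class CoordinationSketch {

    // The coordination-process categories named in the text.
    enum DependencyKind {
        SHARED_RESOURCE, PRODUCER_CONSUMER, SIMULTANEITY, TASK_SUBTASK
    }

    // A dependency between two development tasks.
    static final class Dependency {
        final String from, to;
        final DependencyKind kind;
        Dependency(String from, String to, DependencyKind kind) {
            this.from = from;
            this.to = to;
            this.kind = kind;
        }
    }

    public static void main(String[] args) {
        List<Dependency> deps = new ArrayList<Dependency>();
        // Producer/consumer: a design specification feeds the coding task.
        deps.add(new Dependency("design spec", "implementation",
                DependencyKind.PRODUCER_CONSUMER));
        // Shared resource: two remote sites contend for one build server.
        deps.add(new Dependency("site A build", "site B build",
                DependencyKind.SHARED_RESOURCE));

        // A coordination mechanism objectifies a protocol over such
        // dependencies; here, the "artefact" is just a printed listing.
        for (Dependency d : deps) {
            System.out.println(d.kind + ": " + d.from + " -> " + d.to);
        }
    }
}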

Distributed Software Development Essentials and Accidents


General Essentials of Software Development

Software development has inherent issues that confront it and necessitate effective handling. These issues, conceptualised as essentials by Brooks [1], mainly reflect the software product itself and the process of its development. Brooks labels the essentials as conformity, changeability, invisibility and complexity of the product and/or process. In more explicit terms, and in keeping with the coordination challenge which this paper espouses, the development process or approach holds as much significance as the outcome. Mathiassen and Stage [9] categorise software development into two broad, yet interwoven, areas. On the one hand, the approach, the mode of operation or means of expression adopted by developers, is either experimental or analytical depending on the situation. To them, the situation is engendered by either uncertainty or complexity of requirements information, and it influences the choice of approach leading to, on the other hand, the outcome, the means of expression: specifications or prototypes.


The development approach is interesting in the analysis of the coordination challenge because the accidents directly affect the approach more than they do the outcome. Accidents impact directly on the mutual shaping between the mode of operation and the means of expression because they introduce socio-cultural, organisational and technological variables that, more or less, determine the degrees of uncertainty and complexity in distributed software development.

Accidents of Distributed Software Development

The foremost distribution-related accidents to highlight are the organisational ones. In the context of this paper, the primary component is geographic distribution and its associated parameters, such as the distance between distributed locations and mobility. The distribution of a software development activity, the distance between locations, and the mobility of actors can individually or collectively be perceived within the dimensions of space, time and context. The spatial, temporal and contextual dimensions give more meaning to distribution, distance and mobility, and to the coordination processes and mechanisms that managers adopt to effectively organise distributed software development. Other organisational accidents are the normative tasks, processes, incentives, rules, and allocations of resources aimed at achieving predefined goals of the development process.

In distributed software development, there are peculiar socio-cultural issues particularly related to the personnel/developers of the locations in the distribution. The degree of socio-cultural accidents can be very minimal if the locations in the distribution are all characterised by common socio-cultural orientations or backgrounds. On the contrary, they can be very pronounced and determining if the locations do not share similar socio-cultural characteristics. Typical examples of the significance of pronounced cultural differences are espoused by Sahay et al. [13] in their research on globally distributed software engineering across borders. Culture is reflected in people's beliefs, perceptions, attitudes and orientations. Thus, in the organisation of distributed software development, managers have to be mindful of location-based socio-cultural accidents such as the role of power and knowledge in the production and reproduction of cultural norms; belief systems that translate into context-bound meanings of information and the nature of knowledge; reward systems and their process or outcome targets; modes of behaviour and outcome control; and the nature of organising in terms of markets, bureaucracies or clans [12].

In the organisation of activities, managers often adopt information and communication technologies (ICTs) and leverage them into computational coordination mechanisms. In distributed software development, managers tackle the pervading problem of distance and distribution with ICTs to facilitate and optimise communication, cooperation and collaboration among distributed developers. ICTs are therefore technological accidents of distributed software development that engage in mutual interaction and shaping with the organisational and socio-cultural accidents. To understand technological accidents and their impact on distributed organising [see 11], it is important to examine two broad areas of computational coordination mechanisms [14]. On the one hand, models of structures and processes concern aspects such as data flows, conceptual schemes, knowledge management repositories, knowledge representations, and inscribed rules and methods [5]. On the other hand, models of presentation and access concern issues such as user interface, functionality, ease of use and affordance.

Inherent in these socio-cultural, organisational and technological accidents are coordination processes and mechanisms. In fact, we can confidently view these accidents in their totality as dependent variables, and the essentials as independent variables, of distributed software development. Accidents are dependent on essential attributes such as task complexity and uncertainty [9], task interdependence and unit size [15, 8], task variety and analysability, equivocality of information processing [2], and the means of expression of the final product. As stated above, these essentials broadly relate to the software development process and product, which are mutually determining. Therefore, the mechanisms and processes that managers adopt and deploy to coordinate distributed software development reflect these essentials. For example, uncertainties in a highly interdependent distributed development may necessitate computational mechanisms that facilitate communication and knowledge sharing. However, in this same instance, the knowledge sharing motive may be affected by socio-cultural belief systems that translate into context-bound meanings of information and the nature of knowledge. The manager may therefore have to coordinate by instituting processes such as intermittent face-to-face meetings between developers in different locations through mobility (travelling and visiting) [6]. This example can be interpreted as the deployment of a computational coordination mechanism to address a communication and knowledge sharing problem, and a subsequent substitution of the mechanism with a coordination process (mobility) to serve the same purpose.

Note that while the software process and product condition coordination mechanisms and processes, the reverse effect also holds true. For example, wrong or poorly-timed coordination mechanisms and processes could further increase uncertainties instead of decreasing them. In this regard, we can also anticipate processes of construction and reconstruction of coordination mechanisms by developers in the development process. In short, the cause-effect relationships between the parameters under consideration (process, product and uncertainty on the part of software; mechanisms and processes on the part of coordination) are complex. This complexity translates into the need for a comprehensive study of these relationships through thorough empirical and theoretical analysis of the coordination challenge.

Position
Nevertheless, based on the brief discussions and expositions above, this paper takes the following position: Distribution is a significant accident of the software development process that has direct implications for developers' actions. Distributed activities are profoundly different from localised activities; hence, the challenges of distributed software development are different from their localised equivalents. It is argued that the coordination challenge of distribution is pervading and therefore unique in its own right. In view of this, coordination models that specifically address the peculiarity of coordination in distributed activities are critically necessary for the organisation of DSD activities. This translates into the imperative for research efforts that will identify the peculiar and unique problems of DSD activities, how these problems condition the development process and the actions of developers, the implications for the nature of the developed software, and possible sustainable solutions to those problems. To conclude, this paper merely brings the coordination challenge of distributed software development to the fore by demonstrating the complexity of the problem and by presenting snapshots of the potentialities of the relationships between the discussed parameters. It represents a tentative attempt to highlight the conceptual issues surrounding the coordination of distributed software development, with the aim of stimulating further discussion.

References
1. BROOKS, F. P., No Silver Bullet: Essence and Accidents of Software Engineering, IEEE Computer, 20(4): p. 10-19, 1987.
2. DAFT, R. L. and N. B. MACINTOSH, A Tentative Exploration into the Amount and Equivocality of Information Processing in Organizational Work Units, Administrative Science Quarterly, 26(2): p. 207-224, 1981.
3. FAYOL, H., General and Industrial Management, Pitman, New York, 1949.
4. HAMILTON, M. and H. KERN, Organizing for Successful Software Development, Prentice Hall PTR, 2001.
5. HANSETH, O. and E. MONTEIRO, Inscribing Behaviour in Information Infrastructure Standards, Accounting, Management and Information Technologies, 7(4): p. 183-211, 1997.
6. KRISTOFFERSEN, S. and F. LJUNGBERG, Mobility: From Stationary to Mobile Work, in: K. Braa, C. Sørensen and B. Dahlbom (eds.), Planet Internet, Studentlitteratur, Lund, 2000.
7. MALONE, T. W. and K. CROWSTON, What is Coordination Theory and How Can It Help Design Cooperative Work Systems?, in: Proceedings of the 3rd Conference on Computer-Supported Cooperative Work, ACM Press, New York, 1990.
8. MALONE, T. W. and K. CROWSTON, The Interdisciplinary Study of Coordination, ACM Computing Surveys, 26(1): p. 87-119, 1994.
9. MATHIASSEN, L. and J. STAGE, The Principle of Limited Reduction in Software Design, Information Technology & People, 6(2-3): p. 171-185, 1992.
10. NIDUMOLU, S. R., The Effect of Coordination and Uncertainty on Software Project Performance: Residual Performance Risk as an Intervening Variable, Information Systems Research, 6(3): p. 191-219, 1995.
11. ORLIKOWSKI, W. J., Knowing in Practice: Enacting a Collective Capability in Distributed Organizing, Organization Science, 13(3): p. 249-273, 2002.
12. OUCHI, W. G., Markets, Bureaucracies, and Clans, Administrative Science Quarterly, 25(1): p. 129-141, 1980.
13. SAHAY, S., B. NICHOLSON and S. KRISHNA, Global IT Outsourcing: Software Development Across Borders, Cambridge University Press, Cambridge, UK, 2003.
14. SCHMIDT, K. and C. SIMONE, Coordination Mechanisms: Towards a Conceptual Foundation of Computer Supported Cooperative Work Systems Design, Computer Supported Cooperative Work: The Journal of Collaborative Computing, 5: p. 155-200, 1996.
15. VAN DE VEN, A. H., A. L. DELBECQ and R. KOENIG, Determinants of Coordination Modes within Organizations, American Sociological Review, 41(2): p. 322-338, 1976.
16. WIREDU, G. O., Mobile Computing in Work-Integrated Learning: Problems of Remotely-Distributed Activities and Technology Use, University of London, London, 2005.


Situational Requirements Engineering Gets Distributed!

Bernd Bruegge1, Korbinian Herrmann2, Axel Rauschmayer3, Patrick Renner4
The development of the next generation of an operational facility such as a television broadcast transmitter is influenced by requirements that emerge during operation, maintenance and repair. For the formulation of these requirements, the context of the operational facility is important. Situational requirements engineering considers requirements in their context. As such facilities are typically deployed all over the world, situational requirements engineering gets distributed. We describe the activities, their integration into life cycle models, and an architecture to support this new kind of requirements engineering. It combines end-user information with technical context. Ontologies are used to formalize and store the knowledge gained. Merging instances of ontologies effectively enables distributed requirements engineering.

1. Introduction
This position paper proposes a concept to deal with the challenges of distributed software development and requirements engineering in the field of production, maintenance and repair of complex industrial operational facilities like television broadcast transmitters and wind power plants. The focus is on the development of the next generation of products based on existing ones. This proposal is based on the experience of three companies: Rohde & Schwarz, Nordex and HSG Facility Management. All these companies address a very specialized segment of the global market with a single product. Rohde & Schwarz produces television transmitters, Nordex manufactures wind power plants, and HSG is specialized in selling and operating building facilities such as escalators and elevators. These products are deployed to local distributors all over the world, each of them with diverging local requirements. Examples are different languages, laws and cultures, education, or purely technical reasons such as region-specific service concepts. Some of these requirements are usually not modeled explicitly in the design phase of the product. Therefore, integrated solutions have to be adaptable to location-specific requirements. Through deployment and operation, the development of the products effectively becomes distributed. The requirements elicitation process involves many participants: developers have to formulate technical requirements, local distributors have to adapt them to local requirements, and end users identify their need for new features or enhancements. This distribution of requirements elicitation
1 Institut für Informatik, Technische Universität München, Germany, bruegge@in.tum.de
2 Institut für Informatik, Technische Universität München, Germany, herrmann@in.tum.de
3 Institut für Informatik, Ludwig-Maximilians-Universität München, Germany, rauschma@informatik.uni-muenchen.de
4 Institut für Informatik, Technische Universität München, Germany, renner@in.tum.de


over space and over time creates a problem for the requirements analyst. First, the analyst is not able to observe end users and distributors with respect to some requirements. Second, some requirements are recognized only after the delivery of the system. Another problem is that most of the requirements in the application domain of interest are context dependent: they cannot be formulated at the desk. The requirements engineer has to formulate them at the workplace. The proposed solution is a situational requirements engineering concept that allows requirements to be engineered using information gained during operation as well as during maintenance and repair processes. The concept is based on knowledge management methods, in particular ontology techniques, to identify requirements. In the next section situational requirements engineering is introduced and related to existing approaches for requirements elicitation. Section 3 gives a more detailed view of the proposed activities and the process model that is introduced. Section 4 outlines the next steps necessary to realize the proposed concepts and to evaluate them.

2. Situational Requirements Engineering


Existing process models for requirements engineering are not very well suited to dealing with distribution and context in the application domain. Structured questionnaires [1], for example, imply that the developer already has knowledge about the application domain. They are not well suited to formulating exceptional behavior that depends on complex system states. The method of task analysis [2] lets the developers observe the end users in their context of work. The problem with this approach is that some situations, especially exceptional or error behavior, are so infrequent that they cannot be captured in advance; they often occur only after the deployment of the system. Thus, the analyst would need to observe the end user over a long period of time. Additionally, the analyst might have to acquire abilities of the end user, such as the capability to climb antennas. The cost of this approach is therefore prohibitive for our application domain. Scenario-based and participatory methods [3] try to solve the problem by incorporating the end user into the creation of a visionary scenario. This is problematic because the end users are taken out of their context of work as well, and they are usually not good at externalizing knowledge. Common to all these approaches is that they don't consider the context.

Context is an essential part of a requirement in the application domain. For instance, extremely high temperatures might cause a system crash of a television transmitter in Australia. In Siberia this system crash may not happen because of the different climate. Technical context includes the information related to the operational facility describing the system state. These are all available values of sensors and system variables. Examples are environment data, like temperature or wind speed, and internal details such as network traffic or configuration data. Local modifications and configurations that are performed by operators at some facilities are part of the context, as they might cause problems or reveal bugs in the system. Additionally, data from maintenance and repairs, such as the current step of the repair procedure, are included. Maintenance and repair concentrate on the activities that keep a deployed system running.

The proposed model, situational requirements engineering, combines requirements with their context. These new requirements are gained during the operational state of the facility. It does not substitute the requirements engineering activities at the start of the development of a facility. However, situational requirements engineering supports these activities with additional requirements gained during maintenance and repair of the operational facilities. Thus, this new form of requirements engineering focuses on the generation of requirements for future systems.


Because production and deployment take place at different locations, situational requirements engineering is inherently distributed.

3. Proposed Process, Methods and Platform


First, the activities of situational requirements engineering are described. Based on these, it is shown how to integrate them into a software life cycle model. Finally, a platform is proposed that supports situational requirements engineering.

3.1 Activities of Situational Requirements Engineering

The situational requirements engineering concept provides functionality for contextual and task-specific requirements elicitation. The following activities are necessary in situational requirements engineering: capturing end-user information, tracking the technical context, managing the collected knowledge, and providing feedback.

End-user information needs to be captured. The experience of the end users, with their tacit knowledge, is the core of requirements elicitation in the field. They know what is not documented and how to deal with exceptional behavior. Thus, it is necessary to support the easy capturing of end-user information: this can be explicit information in forms, semi-explicit information from system configuration parameters, and implicit feedback gained by observing the use of the system. To provide better interaction between the end user and the operational facility, there is a need for an easy interaction methodology that allows the user to formulate and send these requirements even under extreme conditions. As the context is always changing, it is important to track it during maintenance and repairs and relate it directly to the end-user information. This collective knowledge of captured end-user information and tracked technical context has to be managed and transmitted to the remote developer team. As system models are contained in the running system, this information can be related to the corresponding parts of the system model during or shortly after the information capture. Finally, situational requirements engineering provides feedback to the system developers. The extracted requirements influence the requirements engineering process in an ongoing system development process for a new product, and, because the process is iterative, this information will influence the next requirements engineering iteration. Here the structure and meaning of the information gathered during these activities might change: new or modified system modules have an impact on the documentation, configuration and use of the system.

3.2 Integration of Situational Requirements Engineering into Life Cycle Models

In the development of software for operational facilities there are two phases of special interest: the requirements engineering phase, in which the system is built, and the operating phase, with maintenance and repairs after the system is deployed (see Figure 1). In the requirements engineering phase the end user is explicitly integrated to specify the desired requirements. Assuming that the facility is not constructed from scratch, feedback collected from the use of previous facilities influences the system development in the form of requirements. The operating phase allows end-user information to be captured and the technical context to be tracked. This captured knowledge is especially important for the further development of the operational facilities.
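As a concrete, hypothetical illustration of the capture activities of Section 3.1 (all class and field names below are ours, not taken from the paper's architecture), an end-user observation can be recorded together with a snapshot of the technical context, so that both reach the remote developers as one unit:

// A minimal, hypothetical sketch of the capture activities; names
// are illustrative and not part of the proposed platform.
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class SituatedCapture {

    // Technical context: sensor values and system variables.
    static class ContextSnapshot {
        final Date takenAt = new Date();
        final Map<String, String> values = new LinkedHashMap<String, String>();
    }

    // End-user information tied to the context in which it arose.
    static class SituatedRequirement {
        final String userStatement;
        final ContextSnapshot context;
        SituatedRequirement(String userStatement, ContextSnapshot context) {
            this.userStatement = userStatement;
            this.context = context;
        }
    }

    public static void main(String[] args) {
        ContextSnapshot ctx = new ContextSnapshot();
        ctx.values.put("temperature", "54 C");       // environment data
        ctx.values.put("repairStep", "replace fan"); // maintenance state

        SituatedRequirement req = new SituatedRequirement(
                "Transmitter shuts down during afternoon heat", ctx);
        System.out.println(req.userStatement + " @ " + ctx.values);
    }
}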


Figure 1

Both of these phases can be covered by the same infrastructure, as the described activities are strongly integrated through sharing as much information as possible among the involved participants. The next section proposes an architecture to deal with the distribution of situational requirements engineering.

3.3 Situational Requirements Engineering Architecture

Situational requirements engineering requires the sharing of information among the distributed parties (see Figure 2): on the one hand, information from the end users, who work on the facility and whose context is tracked; on the other hand, information from the system developers, who use system models and documentation.

Figure 2


Following Gruber [4], an ontology is "an explicit specification of a conceptualization", with the following criteria for the design of such ontologies: clarity, coherence, extendibility, minimal encoding bias and minimal ontological commitment. Most of the ontologies used during system development are transparent to the end user: a rationale tool implies a rhetorical or argumentative ontology, system modeling tools use ontologies like UML diagrams, and even source code can be described inside an ontology. The requirements elicitation activities in general create several ontologies, for example one for the use cases and their relations. The ontologies of different tools are usually not modifiable and cannot be merged. Making the ontologies explicit, relating them to each other, and making them customizable to the needs of a single project or product integrates information and makes tracking and tracing possible beyond the borders of tools [5]. Ontologies in the proposed process model cover four areas: system models, documentation, user information and system state. The manufacturer of the system maintains these four ontologies. End users, distributors and developers fill instances of these ontologies with content. Thus, in each activity no complex overall ontology has to be maintained, but just an ontology covering the user's knowledge domain. During the four activities of situational requirements engineering these ontologies are accessed. Figure 3 shows two examples of instances of ontologies in the Resource Description Framework (RDF): the first one shows end-user information recording that the main_sensor needs repair; the second one, the technical context, shows the sensor for measuring the temperature.

Figure 3

Merging instances of these ontologies can combine the tracked context and the user information. In Figure 4 the two ontologies of Figure 3 are merged. RDF allows one to easily merge these two instances of ontologies, resulting in one common instance that contains the information of both sources. Note that, for illustration purposes only, the main node involved in the merging ("sensor:main_sensor") is highlighted with a bold border. The words that are prefixed with a colon indicate that different vocabularies are used depending on the domain of discourse. As ontologies formulated in RDF do not necessarily have to follow a strict schema, users may not only edit instances but also extend the ontology without switching to a meta layer. Merging such extensions would work analogously to the example above in most cases. As established tools are well suited for some steps in the development process, they shouldn't be replaced: adapters provide the necessary interaction between the existing systems and data standards on the one side and the ontology repository on the other. For example, UML diagrams for the system design remain in the established tools; when changes occur, interpreters re-import the UML data into the ontology so that it can be merged with other information. Thus users edit the ontology indirectly with their well-known tools, but at the same time profit from a merged ontology-based information pool, e.g. for traceability or browsing.

Figure 4
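As a sketch of how such a merge can be realised programmatically, the following Java fragment uses the Apache Jena RDF API; the vocabulary URIs and property names are hypothetical stand-ins for the paper's ontologies, mirroring the instances of Figures 3 and 4:

// A minimal sketch, assuming the Apache Jena RDF API; vocabularies
// (sensor:, user:) are invented for illustration.
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;

public class OntologyMergeSketch {
    static final String SENSOR = "http://example.org/sensor#";
    static final String USER = "http://example.org/user#";

    public static void main(String[] args) {
        // End-user information: the main sensor needs repair.
        Model userInfo = ModelFactory.createDefaultModel();
        Resource mainSensor = userInfo.createResource(SENSOR + "main_sensor");
        Property needs = userInfo.createProperty(USER, "needs");
        mainSensor.addProperty(needs, "repair");

        // Technical context: the same sensor measures the temperature.
        Model technical = ModelFactory.createDefaultModel();
        Resource sameSensor = technical.createResource(SENSOR + "main_sensor");
        Property measures = technical.createProperty(SENSOR, "measures");
        sameSensor.addProperty(measures, "temperature");

        // Merging: statements about the same resource URI are combined
        // into one common instance, as in Figure 4.
        Model merged = userInfo.union(technical);
        merged.write(System.out, "N-TRIPLE");
    }
}

Because both models describe the resource identified by the same URI (sensor:main_sensor), the union contains one node carrying both the user's and the technical statements, which is exactly the effect the figures illustrate.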

4. Conclusions and Future Work


The process, methods and architecture elaborated here constitute situational requirements engineering, which is tailored to the needs of distributed development of complex operational facilities. The user has a special role in the new process because he can elaborate and specify his requirements during the operational state. The combination of requirements and technical context allows undetected and exceptional requirements to be elicited. This is strictly oriented towards the needs of manufacturing companies that produce such complex facilities. Their advantage is continuous feedback from remote end users and operators during the running state, and the ability to start integrating it into the next generation of facilities. To evaluate the process, methods and tool described in Section 3, the next step is the implementation of the proposed architecture. It can then be applied and adapted to scenarios in the application domain that will be defined by our partners. Using real scenarios that include the further development of a complex operational facility will allow the postulated theses to be validated in a realistic setting.

5. References
[1] J. T. T. Barone & J. Switzer. Interviewing: Art and Skill. Allyn & Bacon, 1995.
[2] P. Johnson. Human Computer Interaction: Psychology, Task Analysis and Software Engineering. McGraw-Hill Int., 1992.
[3] J. M. Carroll (ed.). Scenario-Based Design: Envisioning Work and Technology in System Development. Wiley, 1995.
[4] T. R. Gruber. Towards Principles for the Design of Ontologies Used for Knowledge Sharing. In N. Guarino and R. Poli, editors, Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer Academic Publishers, Deventer, The Netherlands, 1993.
[5] T. Wolf and A. H. Dutoit. Supporting Traceability in Distributed Software Development Projects. Submitted to the International Workshop on Distributed Software Development, 2005.


USING THE ECONFERENCE TOOL FOR SYNCHRONOUS DISTRIBUTED REQUIREMENTS WORKSHOPS

Fabio Calefato, Filippo Lanubile 1)
Abstract
eConference is an XMPP-based conferencing tool that supports synchronous, structured communication in distributed scenarios. We present the usage of eConference in the context of distributed requirements engineering, where groups of stakeholders from different organizations are temporarily involved in communication-rich processes such as requirements workshops. We also describe an initial evaluation of the tool in the context of student project works.

1. Introduction
A requirements workshop is a requirements engineering (RE) technique for eliciting or negotiating software requirements, in which stakeholders are brought together to form a group, share information and take decisions with the help of a facilitator [16]. RE is the most communication-rich process of software development, and thus its effectiveness is greatly constrained by the geographical distance between stakeholders, as in the case of global software development [7]. For this reason, the need to develop a tool infrastructure to support teams of geographically dispersed stakeholders in developing requirements has been acknowledged in the past [23]. In [5] we presented P2PConference, a text-conferencing tool enabling synchronous, structured communication in distributed scenarios. In this paper we present the new version of our tool, now renamed eConference and based on the XMPP protocol, an IETF standard for instant messaging and presence awareness [21]. We have initially focused on text-only communication because multipoint audio-video communication poses significant practical barriers (e.g., expense, infrastructure, support) to deployment outside of research institutions. Erickson and Kellogg draw attention to the powerful characteristics of text-based communication: it is easy to use, persistent, and traceable, and it enables the use of search and visualization technologies [9]. In the next sections we first describe how the tool works and report on an initial evaluation for distributed requirements workshops in the context of student project work. We then discuss related work and point out future work.
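As background for readers unfamiliar with XMPP, the following minimal Java sketch shows a multi-user chat built on the open-source Smack library (3.x-era API). It is not taken from eConference's codebase, which the paper does not show, and the server, account and room names are invented:

// A minimal sketch of an XMPP multi-user chat, assuming the Smack
// library; it only illustrates the infrastructure XMPP provides.
import org.jivesoftware.smack.PacketListener;
import org.jivesoftware.smack.XMPPConnection;
import org.jivesoftware.smack.XMPPException;
import org.jivesoftware.smack.packet.Message;
import org.jivesoftware.smack.packet.Packet;
import org.jivesoftware.smackx.muc.MultiUserChat;

public class GroupChatSketch {
    public static void main(String[] args) throws XMPPException {
        XMPPConnection conn = new XMPPConnection("example.org");
        conn.connect();
        conn.login("tommaso", "secret");

        // Join a chat room hosted by the server's conference service.
        MultiUserChat muc =
                new MultiUserChat(conn, "workshop@conference.example.org");
        muc.join("Tommaso");

        // Display incoming statements, tagged with the sender.
        muc.addMessageListener(new PacketListener() {
            public void processPacket(Packet packet) {
                Message msg = (Message) packet;
                System.out.println(msg.getFrom() + ": " + msg.getBody());
            }
        });

        muc.sendMessage("Hello, everyone.");
    }
}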

2. Description
The primary functionality provided by eConference is a closed group chat with agenda, meeting minutes editing and typing awareness capabilities. The tool allows participants to communicate by typing statements that appear on all participants' message boards. Around this basic feature, other features have been built to help organizers control the discussion.
1) Dipartimento di Informatica, University of Bari, email: [calefato|lanubile]@di.uniba.it


The tool screen has six main areas: agenda, input panel, message board, hand raising panel, edit panel, and presence panel (see Figure 1). The agenda indicates the status of the meeting (started, stopped) as well as the current item under discussion. The input panel enables participants to type and send statements during the discussion. The message board is the area where the meeting discussion takes place. Statements are displayed sequentially, tagged with the time they were sent and the sender's name. The hand raising panel holds the queue of questions from participants asking to speak (described in Section 2.2). The edit panel is used to synthesize a summary of the discussion. The presence panel shows the participants currently logged in and the roles they play.

Figure 1. eConference screenshot

The tool usage is illustrated in this section using the following scenario: XYZ is a small firm based in Canada that has embarked on a project for the development of an e-commerce platform. For this project, XYZ is outsourcing part of the software development to an Italian offshore vendor. However, due to the considerable cost and effort of traveling and local arrangements, it is not feasible to organize face-to-face meetings on an ongoing basis. Hence, people must meet remotely. Requirements workshops will involve three groups of stakeholders: the customers and the onshore personnel, both located in Victoria, Canada, and the offshore developers, located in Bari, Italy.

2.1 eConference Organization

Daniela is the project manager. For the requirements workshops, she intends to use text-based communication to mitigate the language disparity issues. However, as an organizer, she does not want the communication to be unconstrained; also, she wants the organizers to have control power over the participants. Hence, she opts for eConference to organize and run the requirements workshops, as the tool accommodates the need for both structured communication and control over stakeholders. Using a wizard, Daniela is guided through a few steps necessary to collect all the required information (see Figure 2). The organization of an eConference follows a strict protocol, inspired by the CeBASE eWorkshop [3].

Figure 2. eConference organization wizard

Daniela is the director: being the actual workshop organizer, she is supposed to choose the type of eConference to set up, define the main topic and the other items of the discussion agenda, schedule the conference and, if necessary, training sessions, and finally send invitations by e-mail. In eConference there are three different types of conferences Daniela can choose from:

Meeting. It provides limited control power, since the moderator can only freeze disturbing participants (i.e., forbid them from typing and sending statements). This conference type models simple, remote brainstorming sessions.

Presentation. This is a more complex kind of conference: one special invited expert, the speaker, delivers his own virtual, text-based speech, and the other participants can ask questions after raising their hands.

Panel. It is a generalization of the presentation, since there is more than one speaker; these are the so-called panelists.

Among the three groups of stakeholders involved, Daniela has identified some key stakeholders, because she believes they will be able to foster the discussion. She therefore chooses to organize the requirements workshop as a Panel and invites the key stakeholders as panelists.


Daniela invites Philip to act as the moderator. As such, he will be responsible for monitoring and focusing the discussion. During a presentation or a panel, the moderator also has to manage the queue of submitted questions. Philip will also be responsible for assessing and setting the pace of the discussion, that is, deciding when it is time to move the discussion to another item. As the scribe, Daniela invites Sylvia. As the discussion moves from one item to another, Sylvia will have to capture and organize the results displayed on the edit panel. Thus, the content of the panel becomes the first draft of the requirements meeting minutes. Finally, Daniela decides to allow Philip, the moderator, to take an active part in the conference, but not Sylvia, the scribe, so as to keep her focused on the discussion flow [11],[16].

2.2 Running eConference

The Italian developer Tommaso has been invited to participate in the requirements workshop. Hence, he got an email from Daniela informing him about the event and how to launch eConference via Java Web Start2 [8] and join the workshop. On the agreed day, Tommaso clicks on the link and runs eConference.

2.2.1 The Moderator Perspective: a Smooth Discussion

Philip enters eConference as the moderator. Once joined, like any other stakeholder, he can broadcast files to the other participants: thus, Philip shares the documents that he will refer to during the event to facilitate the discussion (see Figure 3). He waits for all the stakeholders to join. There is a single participant who is late, but he decides to start the discussion anyway: he is not worried about that since, once joined, any latecomer will automatically receive the foregoing discussion, the edit panel content and the shared files.

Figure 3. File broadcast

After starting the discussion (see Figure 4), the stakeholders are allowed to interact as follows:
2 Java Web Start is a technology that eases the deployment of Java applications: with a simple click on a web link, it automatically downloads, searches for updates and runs the application for you.


Key stakeholders have been invited as panelists and, hence, are always granted the right to speak; other stakeholders are allowed to speak by raising their hands. The remaining participants, invited as observers, can only follow the proceedings passively.

Figure 4. The agenda (moderator perspective)

Philip selects the first item in the agenda and the panelists (i.e., the key stakeholders) start discussing it directly. The other invited stakeholders, instead, must press the raise hand button: Ann presses it and a small window pops up. Now Ann has to select the panelists to whom she wants to address her question. Though it is not mandatory, she fills out the text area with the question and sends it (see Figure 5).

Figure 5. Hand raising panel (participant perspective)

Each time a question is sent, it is displayed in the question queue. When hovering the mouse pointer over an element in the queue, each participant can get a preview of the question: this feature helps the moderator decide whether a question should be moved up or down in the queue, or even removed entirely (see Figure 6).


Figure 6. Hand raising panel (moderator perspective)

Philip considers Ann's question a relevant contribution to the current discussion and so he decides to satisfy it. Now Ann's input panel is enabled, so she can start to type and send statements. Meanwhile, two other questions have entered the queue: the first, from Michael, is also appropriate for the current item. The moderator can satisfy any number of questions at the same time. The second question, from Jennifer, is off topic and, hence, removed from the queue. Michael is now allowed to send statements too: he disagrees with what Ann said in the N-th statement. To indicate that he is referring to that statement precisely, he starts typing his statement like this: "N: I completely disagree with you Ann, since...". Each time eConference receives a statement that starts with a number followed by a colon, it turns it into an HTML anchor. Thus, when stakeholders click on such anchors, the message board scrolls to make the referenced statement visible. Two key stakeholders, Peter (an onshore developer) and Maria (an offshore developer), exhibit flaming behavior: they are arguing because Maria fears that the onshore developers are pushing management to outsource the less appealing part of the job. To calm them down and avoid a "we versus they" situation, Philip freezes them (see Figure 7): now neither of them can actively take part in the discussion, in that Peter cannot send statements anymore and Maria cannot raise her hand anymore. The moderator also writes, in his private text area of the agenda, a recommendation to the other stakeholders not to flame as well.


Figure 7. Presence panel (moderator perspective)
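The statement-referencing behaviour described above (a statement beginning with a number followed by a colon becomes an HTML anchor) can be sketched in Java as follows; this is an illustration of the rule, not eConference's actual implementation:

// A minimal sketch of the "N:" referencing rule; the anchor naming
// scheme (#stmt-N) is a hypothetical choice for illustration.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StatementLinker {
    private static final Pattern REF = Pattern.compile("^(\\d+):");

    // Wraps a leading "N:" reference in a link to statement N.
    public static String toHtml(String statement) {
        Matcher m = REF.matcher(statement);
        if (m.find()) {
            String n = m.group(1);
            return "<a href=\"#stmt-" + n + "\">" + n + ":</a>"
                    + statement.substring(m.end());
        }
        return statement;
    }

    public static void main(String[] args) {
        System.out.println(toHtml("42: I completely disagree with you, Ann."));
        // -> <a href="#stmt-42">42:</a> I completely disagree with you, Ann.
    }
}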

Discussion goes on until Philip decides that the current item has been debated enough, since the participants seem to have reached a consensus on it. Hence, he stops all the current questions (i.e., disallows further statements from the questioners) and announces that the discussion is now moving to the next item. Also, Philip thinks that Peter and Maria, the two heated participants, have calmed down now, so he unfreezes them. Again, all stakeholders begin to discuss the selected item, typing directly or raising their hands, according to their roles.

2.2.2 The Scribe Perspective

While Michael, Ann and the other panelists involved were debating the first item, Sylvia, the scribe, summed up both their individual views on it and the common consensus reached. She then presses the update button to propagate the edit panel content to all of the other participants (see Figure 8). As the discussion is moved on to the next item by the moderator, Sylvia begins to write down the other most relevant observations, and she will do so for all of the items in the agenda.

Figure 8. The edit panel (scribe perspective)


2.2.3 Ending Discussion

Once all of the items in the agenda have been discussed, Daniela announces that the event is over and Philip stops the discussion. After leaving the eConference, each stakeholder finds the logs of both the message board and the edit panel stored locally as HTML files. The meeting minutes, in particular, will serve as a draft from which to edit a more structured requirements document.

3. First Experience With The Tool


eConference has been used at the University of Bari to organize and run sixteen distributed requirements workshops. Our main intent was to test the tool itself. The participants were Master's students in computer science, attending a web engineering course. As their final course assignment they were required to develop an enterprise application, including both analysis and design documentation, working in groups of three to five people. All sixteen workshops were conducted during the course within a time frame of five weeks. The participants received one demo presentation of the tool. To provide further help, a detailed usage scenario was made available online. To simulate the geographical dispersion of the stakeholders, the students were allowed to use the tool from home as well as from laboratories in our department. Hence, the stakeholders involved in the workshops were: one of the researchers, acting as both workshop organizer and facilitator; one PhD student and one graduate student, acting as customers; and the students, playing the role of the developers, except one who was selected to act as scribe and produce the meeting minutes. Unlike in a JAD session, the scribe was free to contribute information to the workshop.

The minutes edited by the scribe were the main outcome of the workshops. They contained a general description of the application to develop, a high-level list of the features to implement, all the decisions taken, and the constraints imposed by the clients, both technical and functional. Afterwards, the minutes were first used by the developers to edit a full requirements specification document for their own application, and then by inspectors who cross-checked the aforementioned requirements specification using the IBIS tool [14].

To characterize the requirements workshops, in the following we provide a brief report of the data gathered from the tool logs. Table 1 shows the duration and the number of messages sent for each workshop. Duration was computed as the time span between the first and last message sent by any participant. System notifications of logon or presence were ignored because they are not relevant. It is remarkable that the shortest meeting (36 min., Grp 8) is not the event with the fewest messages (134, Grp 10). The longest meeting, instead, went on for 66 min. Given the small standard deviation (8.7 min.), we can state that a workshop generally lasted a little less than one hour (mean = 49.2 min.).

The feedback from the participants, received through interviews and direct observation, allowed us to spot enhancements other than those already present on our to-do list. Most of the suggestions we received were technical feature requests, such as extending the edit panel to support drawing (whiteboard), and adding a feature that allows the scribe to paste text highlighted in the message board with a single mouse right-click. More interestingly, some students reported that they found the available floor control features constraining and useless. Conversely, the customers reported floor control to be useful in preventing the discussion from becoming messy, especially when groups of five developers were involved.

Group     Duration (in min.)    Messages
Grp 1     55                    208
Grp 2     60                    333
Grp 3     39                    201
Grp 4     66                    314
Grp 5     63                    250
Grp 6     47                    230
Grp 7     47                    268
Grp 8     36                    138
Grp 9     47                    143
Grp 10    43                    134
Grp 11    53                    157
Grp 12    45                    301
Grp 13    48                    154
Grp 14    54                    378
Grp 15    46                    241
Grp 16    38                    203

Table 1. Duration and messages exchanged for each workshop

4. Related Work
Conducting a long-running, productive conversation through a digital medium can be very challenging, especially if more than a few people are involved. Thus, multimedia meetings and their facilitation have been studied in depth over the last two decades [6],[2],[19],[12],[1],[11]. Many of the existing distributed meeting tools use the metaphor of meeting rooms or shared workspaces. TeamRooms [20] and TeamSpace [10] are collaborative workspaces for managing work processes and maintaining shared artifacts in distributed projects, typically spanning months or years. Their most remarkable feature is the ability to seamlessly switch between synchronous and asynchronous support. Moreover, TeamSpace also supports different work modes, namely social/corridor-talk and meeting. These tools support synchronous and structured communication, but include no floor control features. Among recent research projects is Meeting Central [22]. The tool includes a valuable number of visual cues to convey social aspects during meetings, presentation and browser sharing, a VNC viewer for desktop sharing, and, finally, means for using existing PSTN or VoIP infrastructure. Interestingly, Meeting Central does not provide any control channel features or roles, because its aim is to leverage the social protocols inherent in any discussion, be it computer-mediated or face-to-face. This approach is in contrast with Moors' SmartPhone [18], which augments telephony by using a computer to add a symbolic control channel.

Other than general-purpose CSCW tools, specific collaborative RE tools have also been proposed. EasyWinWin [4] is a tool that implements the WinWin approach using the Group Support System (GSS), a commercial collaborative toolset. EasyWinWin defines a set of activities guiding stakeholders through a process of gathering, elaborating, prioritizing, and negotiating requirements. The use of the basic GSS tools is also reported in [23]: brainstorming, voting and group outlining are used during JAD sessions to help elicit requirements. RM-tool [15] is a web-based collaborative tool to support distributed stakeholders in requirements management. It is implemented on a commercial groupware infrastructure, namely Lotus Notes. RM-tool is focused on structured requirements modeling and thus offers no synchronous group decision support. The CRC tool [17] is a specialized electronic meeting system to facilitate communication amongst a distributed, multidisciplinary group engaged in the early stages of a software development project. In [13] a P2P toolset for requirements elicitation is presented. It was developed for Groove, a P2P platform based on the metaphor of shared workspaces. The toolset includes tools for authoring and delivering interviews, defining requirements according to the RQML structure, and voting. Also, there is a workshop tool for brainstorming sessions, which is comparable to eConference, except for the absence of any control channel feature.

5. Future Work
We have presented eConference, a tool to support synchronous and structured discussion in distributed contexts such as requirements workshops. The features of the tool have been outlined by illustrating its use in a plausible offshore development scenario. We have also reported our first experience with the tool while conducting requirements workshops at the University of Bari. As future work we intend to add features specific to requirements workshops, and to elicitation in particular. We are currently making the tool fully pluggable so that useful custom extensions can be implemented. This change will transform the tool into a framework and will allow us to develop both features tailored for RE and generic collaborative features, such as a voting tool (to help measure the attainment of a common consensus) and presentation/browser sharing (to achieve richer collaboration). Finally, we intend to run controlled experiments to assess whether a control channel is inherent also in computer-mediated discussion, as hypothesized in [22]. We aim to understand whether synchronous distributed requirements workshops can inherently capture those social protocols and rules which usually determine who can actually speak in non-moderated face-to-face meetings. We also want to discover how synchronous distributed requirements workshops are affected by factors such as the number of participants and their mutual relationships, the number of previous meetings, and familiarity with chat.

6. References
[1] ACKERMAN, M. S., et al. "Hanging on the wire": a field study of an audio-only media space. ACM Transactions on Computer-Human Interaction (TOCHI), 4(1):39-66, March 1997.
[2] AHUJA, S. R., ENSOR, R. J., HORN, D. N. The Rapport multimedia conferencing system. ACM SIGOIS and IEEE CS TC-OA Conference on Office Information Systems, ACM SIGOIS Bulletin, 9(2-3), April 1988.
[3] BASILI, V., et al. Building an Experience Base for Software Engineering: A Report on the First CeBASE eWorkshop. Proc. of Product Focused Software Process Improvement (PROFES 2001), Kaiserslautern, Germany, September 2001.
[4] BOEHM, B., GRUNBACHER, P., BRIGGS, R. O. Developing Groupware for Requirements Negotiation: Lessons Learned. IEEE Software, 18(3):46-55, May/June 2001.
[5] CALEFATO, F., LANUBILE, F. Peer-to-Peer Remote Conferencing. Proc. of the 3rd International Workshop on Global Software Development (GSD 2004), Edinburgh, Scotland, May 24, 2004.
[6] COOK, P., et al. Project Nick: Meetings augmentation and analysis. ACM Transactions on Office Information Systems (TOIS), 5(2):132-146, April 1987.
[7] DAMIAN, D., ZOWGHI, D. The Impact of Stakeholders' Geographical Distribution on Managing Requirements in a Multi-Site Organization. Proc. of the IEEE Joint International Conference on Requirements Engineering (RE 2002), Essen, Germany, September 2002.
[8] eConference, http://brooks.di.uniba.it/~calefato/econference/
[9] ERICKSON, T., KELLOGG, W. A. Social translucence: an approach to designing systems that support social processes. ACM Transactions on Computer-Human Interaction (TOCHI), 7(1):59-83, March 2000.
[10] GEYER, W., et al. A team collaboration space supporting capture and access of virtual meetings. Proc. of the 2001 International ACM SIGGROUP Conference on Supporting Group Work (GROUP 2001), Boulder, Colorado, USA, September 30 - October 3, 2001.
[11] GOTTESDIENER, E. Facilitated Workshops in Software Development Projects. Proc. of the Int. Conference on Applications of Software Measurement (ASM 2001), San Diego, CA, USA, February 2001.
[12] ISAACS, E. A., MORRIS, T., RODRIGUEZ, T. K. A Forum for Supporting Interactive Presentations to Distributed Audiences. Proc. of the Conference on Computer-Supported Cooperative Work (CSCW 1994), Chapel Hill, NC, USA, October 22-26, 1994.
[13] LANUBILE, F. A P2P Toolset for Distributed Requirements Elicitation. Proc. of the International Workshop on Global Software Development (GSD 2003), Portland, Oregon, USA, May 2003.
[14] LANUBILE, F., MALLARDO, T., CALEFATO, F. Tool Support for Geographically Dispersed Inspection Teams. Software Process: Improvement and Practice, 8(4):217-231, October/December 2003.
[15] LANG, M., DUGGAN, J. A Tool to Support Collaborative Software Requirements Management. Requirements Engineering, 6(3):161-172, Springer London, 2001.
[16] MACAULAY, L. A. Requirements Engineering. Springer-Verlag, 1996.
[17] MACAULAY, L. Co-operative requirements capture: prototype evaluation. In Spurr K., Layzell P., Jennison L., Richards N. (eds.), Computer Support for Co-operative Work, Wiley, Chichester, 1994, pp. 169-194.
[18] MOORS, T. The SmartPhone: Interactive Group Audio with Complementary Symbolic Control. Proc. of Distributed Communities on the Web (DCW 2002), Sydney, Australia, April 3-5, 2002.
[19] NUNAMAKER, J. F., et al. Electronic meeting systems to support group work. Communications of the ACM, 34(7):40-61, July 1991.
[20] ROSEMAN, M., GREENBERG, S. TeamRooms: network places for collaboration. Proc. of the ACM Conference on Computer Supported Cooperative Work (CSCW 1996), Boston, Massachusetts, USA, November 16-20, 1996.
[21] XMPP, IETF specifications, http://www.xmpp.org/specs/
[22] YANKELOVICH, N., et al. Meeting Central: Making Distributed Meetings More Effective. Proc. of the ACM Conference on Computer Supported Cooperative Work (CSCW 2004), Chicago, Illinois, USA, November 6-10, 2004.
[23] LIOU, Y. I., CHEN, M. Using Group Support Systems and Joint Application Development for Requirements Specification. Journal of Management Information Systems, Winter 1993/1994, 25-41.


Bootstrapping Incremental Design: An Empirical Approach For Requirements Identification and Distributed Software Development
Naoufel Boulila, Allen H. Dutoit, Bernd Bruegge
Institut für Informatik, Technische Universität München, Germany

Abstract
Developers designing software systems that lack clear requirements may adopt a development approach different from traditional ones such as Scrum or Feature-Driven Development. An exploratory software development approach and iterative, incremental delivery of software may help overcome issues resulting from the ill-defined requirements of the system under development. Developing software to be deployed in distributed settings requires adapting the exploratory approach to cope with the issues introduced by distribution. In this paper we describe a reflection on a development approach, BID (Bootstrapping Incremental Design), for identifying requirements and developing software based on a combination of an evolutionary approach, a participatory design approach, and incremental prototyping. We used the concept of the bootstrapping process, as known from compiler construction, to design a new approach which is basically a formative approach augmented with a self-improving mechanism. This method was the result of the development of a framework called SCOOP (Synchronous Collaborative Object-Oriented design Process) for supporting distributed synchronous software design meetings.

1. Motivation
An exploratory software development approach is the method of choice when developers and customers lack a clear requirements specification and lack development alternatives [5]. Developers might not be able to make critical design decisions or might not know how to best solve certain implementation problems. By experimenting with iterative prototyping, both customers and developers can gain new insights into their problem domains and thus come closer to better solutions. Iterative and incremental development allows developers to make small development steps, where each single step results in an extension or an improvement of the current version of the system. This approach fits well in single-site software development. For distributed software development, such as distributed real-time brainstorming and software design activities, we need to adapt and enrich the method with practices gained from experiments using evolutionary and incremental prototyping approaches. Our lack of understanding of the problem domain, the solution domain, the requirements, and the targeted users prevented us from following approaches and processes like Feature-Driven Development, Scrum, or RUP: these approaches assume that the requirements are well defined or are stable enough to plan the next step. We instead followed an empirical approach to identify the requirements, based on rapid prototyping, to develop the SCOOP framework.

We started developing the SCOOP framework [1] with ill-defined requirements. We conducted several case studies of participatory design meetings [2]. We used a formative and evolutionary approach in which we developed a prototype that we used during case studies in an attempt to identify the main requirements for distributed real-time software brainstorming and design meetings. The current state of the prototype and the knowledge gained from the case studies influenced the subsequent experimental meetings. Once we had a stable and usable prototype, we shifted our focus to evolving the prototype using the prototype itself. New features were designed during each set of case studies. These improvements were made incrementally while conducting distributed meetings for designing the missing functionalities. During the process of building the framework, we became aware of the necessity of shaping the method we were using in order to understand the requirements identification process, the incremental self-improving process, and the interaction of the users with the evolved system. Therefore, we identified an incremental approach that incorporates user participation, an evolutionary experimental method, and a bootstrapping-like (self-improvement-based) design method. The approach, called BID, made possible the identification of the requirements and the design of a flexible architecture to support global brainstorming meetings. In Section 2 we present global software development process models related to the model we used in developing the SCOOP framework. In Section 3 we describe the BID approach. In Section 4 we describe the case studies and the experimental approach we followed. In Section 5 we show how we applied the BID approach to develop the SCOOP framework. In Section 6 we discuss and conclude this paper.

2. Global Software Development Process Models


We distinguish two different types of global software development process models: horizontal process models and vertical process models. Software process models are abstract representations of software processes. A software process is a set of activities that are performed by the project participants toward a specific goal. In the following we describe the two process models (see Figure 1).
Horizontal process model

Figure 1: Collaboration dimensions of multiple-site development projects. The dashed ellipse (I) shows the horizontal process model; the dashed ellipse (II) shows the vertical process model.

In this case, the distributed software development process refers to the distribution of independent or loosely coupled activities over different locations (see Figure 2). That is, the process consists of loosely coupled activities that can be performed independently of each other. For example, the design of a system architecture, the implementation, and the testing activities can be conducted almost independently at different sites by different development groups. Every single site is fully responsible for its deliverable. This type of process model is adequate for sites located in different time zones, where communication and collaboration take place asynchronously (see Figure 1, dashed ellipse I).
Vertical process model


In the vertical process model, each activity of the development process is simultaneously shared and performed synchronously by different groups of developers located at geographically distributed sites (see Figure 3). As a result, brainstorming and software design, coding, debugging, inspecting, and documenting can be performed jointly by several participants at the same time. Sites that share the same time zone while being in different locations can perform concurrent synchronous work (see Figure 1, dashed ellipse II).

Figure 3: Vertical process model of distributed software development
Figure 2: Horizontal process model: distributing the process of development over multiple sites

The vertical process model, with its focus on concurrent software brainstorming and design in global software development, provides the context for this research.

3. The Bootstrapping Incremental Design Approach


The term bootstrapping has several connotations, ranging from binary loaders to compiler construction. Bootstrap most commonly refers to the program that begins the initialization of a computer's operating system, such as GRUB, LILO, or NTLDR: a small amount of code is required to start the computer, and it is used to load progressively more complex code until the full system has started up. Bootstrapping can also refer to the development of successively more complex programming environments. According to The Data Analysis BriefBook encyclopedia, "bootstrapping is a general term that describes any operation which allows a system to generate itself from its own small well-defined subsets." In the context of compilers, bootstrapping is the process by which a simple compiler is used to translate a more complicated compiler, which in turn may handle an even more complicated compiler, and so on. Many compilers for popular languages were first written in another implementation language and then rewritten in their own source language. Bootstrapping has yet another dimension, as described by Doug Engelbart [4], who emphasizes the importance of the self-improvement process. In real life many endeavors follow this concept of incrementally pulling oneself one step higher until reaching a goal. For instance, the Millau Viaduct in France, the world's tallest bridge, was recently built using the same bootstrapping concept: engineers first built the pillars of the bridge, then drew a thin cable across both hills, then used that cable to hoist a larger one. They then used both cables to pull a third, bigger one, eventually joining both hills; on top of the pillars, engineers put another layer of pylons that were used to join the intertwined wires. From the above definitions and examples we derive the following general definition (see Figure 4): bootstrapping is about taking a first step on the path, learning from it, and using it to lift oneself up to the next level; if the current level is not the desired goal, the current level is used to reach the next one, until the goal is attained. Figure 5 depicts a UML activity diagram describing the abstract bootstrapping method.

Figure 4. The abstract bootstrapping process
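To make the abstract method more tangible, the loop of Figures 4 and 5 can be written down in code. The following Java sketch is our own rendering for illustration; the Prototype interface, the knowledge list, and runExperiment are hypothetical placeholders and not part of the SCOOP framework.

    import java.util.ArrayList;
    import java.util.List;

    // A minimal sketch of the abstract bootstrapping loop of Figures 4 and 5.
    // All names are hypothetical placeholders, not part of SCOOP.
    public class BootstrappingSketch {

        interface Prototype {
            boolean isDesiredState();                     // goal test
            Prototype incorporate(String designArtifact); // system improvement
        }

        // Step 0 has already produced an initial prototype; iterate until
        // the prototype reaches the desired state (the goal).
        static Prototype bootstrap(Prototype prototype) {
            List<String> knowledge = new ArrayList<>();   // grows monotonically
            while (!prototype.isDesiredState()) {
                // Experiment: distributed meetings in which participants use
                // the prototype to design a functionality that is still missing.
                String designArtifact = runExperiment(prototype, knowledge);
                // Knowledge acquisition: knowledge(n) >= knowledge(n-1).
                knowledge.add("lesson learned from " + designArtifact);
                // The design artifact is incorporated and then disappears; the
                // improved prototype lifts the process one level higher.
                prototype = prototype.incorporate(designArtifact);
            }
            return prototype;
        }

        static String runExperiment(Prototype p, List<String> knowledge) {
            return "awareness model";  // e.g., a model sketched during a meeting
        }
    }

Each pass through the loop corresponds to one step of the BID approach described next.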

The BID Steps
The steps of the BID approach are illustrated in Figure 6.

Bootstrap Start: Step 0 (see Figure 6). This is the starting point of the bootstrapping process. We use scenario-driven development and the user-centered design approach to design and build an initial prototype system. The initial prototype has basic capabilities enabling synchronous shared scribbling and text writing using smart boards. Different meeting rooms accessing the same server can manipulate the same artifacts by drawing and clicking on the smart board. A keyboard enables users to enter text elements associated with the artifacts. Changes are immediately propagated to the other sites.

Figure 5: Bootstrap abstract method (UML activity state diagram)

Iteration (1) (see Figure 6). Based on the background gained from Step 0 and the initial prototype tool developed with very basic knowledge, we conduct a first experiment.

Experiment (1-n). A typical experiment is a series of organized distributed meetings to brainstorm the requirements of the prototype itself. We iterate over this meeting one or (n) times until reaching the desired state of the design, an improved design artifact, and an improved prototype.

Figure 6: The Bootstrapping Process

Prototype (1-n). The product of the initial step, used as the basic infrastructure for the meetings. The prototype evolves from one state to another and can be redesigned (n) times after (n) meetings, having improved a given functionality (n) times.

Design Artifacts. The possible models designed by the participants using the prototype that improve the behavior of the prototype. Models describe functionalities that the participants needed at that moment but that were not yet available. For example, a design artifact can be an awareness model designed during an experiment, then incorporated into the system prototype, then used again in a subsequent experiment, then refined and finally integrated. Over (n) experiments, a design artifact evolves in (n) steps. Design artifacts are only one type of artifact produced at the end of an iteration; moreover, they exist only temporarily and disappear as soon as the corresponding functionality is built and incorporated into the system.

Knowledge (1-n). The background knowledge produced by the learning mechanisms and the improved understanding of the requirements of the system. The accumulated knowledge is used to redesign the prototype in subsequent steps. Typically, knowledge (n) in Step (N-1), produced after (n) iterations, serves as the basis for Step (N) and evolves into knowledge (m) after (m) iterations. Although the design process is chaotic, the knowledge is guaranteed to remain the same in the worst case and to grow in the best case. Knowledge here is not quantifiable, but we can still argue that knowledge(n) in Step (N) >= knowledge(n-1) in Step (N-1): we need more knowledge to move from the current step (N-1) to the next step (N), and if we do not produce more knowledge, we are stalled in step (N-1). We conclude that the bootstrapping method is an intensive, evolutionary, knowledge-construction-based method.

Iteration (N) (see Figure 6). The current iteration (N) builds on the results of the previous iteration. Using the prototype and knowledge from Step (N-1), one or several design artifacts are produced and fed back into the prototype. The knowledge produced during an experiment (N.M) is used to redesign the prototype utilizing the design artifacts. At this level, we iterate over Step (N) until the prototype reaches the desired state; a typical desired state is a fully accepted and mature functionality of the system.

It is interesting to notice that each step of the BID approach is a bootstrapping process in itself. To arrive at a final system, we need several loops in which we incrementally iterate over several sub-bootstrapping processes, hence the name Bootstrapping Incremental Design process. The experimental bootstrapping process is described in Figure 6. Figure 7 shows a UML metamodel of the bootstrapping process: a bootstrapping process is basically a set of activities that build upon each other starting from a basic state. Knowledge acquisition and system improvement are the characteristics of a bootstrapping step. The system improvement activity yields an improved prototype for the subsequent bootstrapping step, using the knowledge acquired during the current step.

Figure 7: A Bootstrapping Metamodel

4. Case Studies
We conducted several case studies within the university program for postgraduate participants and Master students of the computer science curriculum at the Technische Universität München. Most of the subjects had relevant work experience in industry. We conducted experiments over a duration of four semesters with a total of seven groups of, on average, six volunteer participants each. During each semester, a series of experiments was conducted, consisting of regular meetings to brainstorm and collaboratively develop the SCOOP framework. A full development cycle lasted a whole semester, during which groups of volunteer participants met on a regular basis to sketch and develop an architecture to support distributed brainstorming meetings. During each semester we had two separate groups of six participants each, which enabled us to conduct two independent series of meetings running simultaneously. Each group was split into two sub-groups of three users each. During each meeting, each sub-group was provided with a different room equipped with the necessary hardware and software to run the meetings. The hardware consists of Smart Boards connected to networked computers via the Internet (see Figure 8). Each computer runs a basic skeleton application prototype, GroupUML [3], that enables different groups in different locations to collaborate on system architectural design using UML. A short tutorial on how to use GroupUML and the Smart Board was provided. Figure 9 shows a deployment diagram for GroupUML.


5. Applying The BID Approach


This section describes the application of the BID approach in developing the SCOOP framework.

Bootstrapping: Step 0 - the initial step
Problem. As outlined at the beginning of this paper, with no or very little understanding of the requirements and not enough knowledge about either the problem domain or the solution domain, we had to develop a system to support distributed synchronous brainstorming and software design.
Figure 8: Scenario of experimental setup for distributed same time / different place software design and brainstorming meetings

For this we first had to answer two questions: how to identify the initial requirements, and how to put the BID approach into practice?

Solution. We designed the initial bootstrapping step using the following approach. The initial requirements specification process was carried out in a user-centered way. Next, we conducted an initial distributed problem-solving session. We then gathered requirements by interviewing users and classified them according to two guiding questions: What sorts of tasks must be supported and mediated by the initial prototype? What types of infrastructure and technology do the users accept for carrying out their tasks?

The first question is answered through scenario-driven requirements elicitation, from which we derived the core requirements, such as support for sharing artifacts among dispersed developers, supporting communication, and enabling awareness information.

Figure 9: GroupUML deployment diagram

To answer the second question, we suggest using available hardware such as electronic whiteboards (smart boards) and networked computers. Smart boards are particularly adequate for conducting design and brainstorming meetings. This approach to identifying the initial requirements, using scenarios and rules from the user-centered design approach, is described in Figure 10.

Figure 10: Bootstrapping initial step

Bootstrapping: Step (n)
Problem. So far, we have defined the initial bootstrapping step, Step 0, which triggers the main bootstrapping process. We now need to apply the subsequent steps of the bootstrapping approach.

Solution. As we gain a better understanding of the issues from the previous step, Step (n-1), through the cumulative knowledge acquired during experiments, we can apply the following steps of the bootstrapping process. The next issue was how to use the prototype, how to develop the increments, and who should contribute to designing and evolving the whole system. The solution was to include the users of the prototype in designing and redesigning the prototype they are using.

Consider self-compiling compilers: a compiler is first written in another implementation language (say, Assembler); once it can compile a given programming language (say, Pascal), we can use it to improve itself. We rewrite the original compiler (written in Assembler) in the compiled language (Pascal), and finally the new compiler (written in Pascal) is compiled by the original one, so we can claim that the Pascal compiler is written in Pascal. By analogy, we apply a similar approach, the empirical BID approach, to the design of SCOOP: the initial prototype of SCOOP serves as the platform for conducting distributed software design meetings in which the design is about SCOOP itself.

For this bootstrapping step (Step (n)) we profit from the participative (or participatory) design approach. Since the users of a system know their needs best and can therefore model their requirements better than designers can, involving potential, skilled users in the design is a viable solution. Figure 11 shows a modified participative design approach adapted to our bootstrapping approach. The user-centered design process corresponds to the initial bootstrapping step that triggers the bootstrapping process; the participative design process is a traditional design process (e.g., as in the waterfall model) augmented with user participation. The iterative and participative nature of the bootstrapping process turns the traditional participative process into a bootstrapping participative process. Figure 12 depicts the BID process applied in developing the SCOOP framework.


6. Conclusion
In this paper we presented the BID approach, an empirical approach that applies the ideas of compiler bootstrapping to system development, resulting in an approach that reintroduces the ideas of flexibility, adaptability, and self-improvement. We developed an evolutionary approach called BID (Bootstrapping Incremental Design) that consists, first, of several case studies that serve to identify the initial requirements and, second, of incremental prototyping through which we discover and improve functionalities of the prototypes. Finally, the experimentation led to an initial framework architecture based on components and object-oriented techniques. The BID method can provide explicit, detailed guidance on eliciting the architectural requirements, designing the architecture, and analyzing the resulting design. Moreover, we provide software architects with a framework for understanding the technical trade-offs and risks they face as they make architectural design decisions.

Figure 11: Bootstrapped participative approach

7. References

[1] N. Boulila, A. H. Dutoit, B. Bruegge. SCOOP: A framework for supporting Synchronous Collaborative Object-Oriented Software Design Process. Proceedings of Cooperative Support for Distributed Software Engineering Processes (ASE 2004), Linz, 2004.
[2] N. Boulila, A. H. Dutoit, B. Bruegge. CSCW-based Software Engineering Course: A Case Study of Distributed Collaborative Software Modeling in Education. Proceedings of the International Conference on Applied Computing, pp. 271-278, Lisbon, Portugal, March 2004.
[3] N. Boulila, A. H. Dutoit, B. Bruegge. D-Meeting: An Object-Oriented Framework for supporting distributed modeling of software. ICSE 2003 International Workshop on Global Software Development, 2003.
[4] Olav W. Bertelsen. Elements of a Theory of Design Artefacts. Ph.D. Thesis, Department of Information and Media Science, Aarhus University, Aarhus, January 1998.
[5] J. Sametinger, A. Stritzinger. Exploratory software development with class libraries. Proc. of the 7th Joint Conference of the Austrian Computer Society, Klagenfurt, Austria, 1992.
Figure 12: Applying the bootstrapping design approach in designing the SCOOP framework


Supporting Traceability in Distributed Software Development Projects
Timo Wolf and Allen H. Dutoit
Abstract. Traceability is the ability to follow the life of a requirement, in both forward and backward directions. Traceability facilitates dealing with change by identifying which components may be impacted and which stakeholders may need to be consulted when specific requirements are revised. Rationale is the justification behind decisions, including alternatives explored, evaluation criteria, arguments, and decisions. In typical projects, little traceability or rationale is explicitly documented, because of the overhead and cost of capturing and maintaining this information. Instead, this information is often exchanged among stakeholders informally or during peer review activities, or reconstructed when changes or issues arise. In distributed projects, opportunities for such information exchanges are significantly reduced, resulting in even less traceability and rationale being captured. However, this information is critical when assessing the impact of changes, as few participants in a distributed setting have a complete overview of the system and the stakeholders involved. In this paper, we propose a model and tool support for uniformly representing and integrating system models and rationale. We support traceability within and across these models. Participants collaborate by attaching annotations, action items, and issues to the system models, providing the relevant collaboration context. By dealing with all three types of information together, we aim to increase their value and to decrease the overhead and cost of capturing and maintaining rationale and traceability, while providing participants an incentive for maintaining their consistency. To follow existing traces and dependencies across system models and rationale, we provide alternate graphical user interfaces that focus on clear and simple design and on desired tasks.

1. Introduction

Driven by the global market and resource requirements, distributed software development has become typical in technology companies. In contrast to single-site projects, communication and coordination issues, as well as language and cultural differences, are major challenges within global projects [1, 8] and are responsible for reduced productivity and success [11]. It is quite common that large industry projects split the development into subsystems and assign them to different sites. Each subsystem realizes coherent features and can be developed more or less independently of the other sites. As a consequence, interest among sites decreases, less information is exchanged, and the awareness of other sites' progress, problems, and decisions is minimal. Problems become visible and are confronted only when integrating subsystems, resulting in long integration cycles [8]. The interfaces among major subsystems are not sufficient for preventing such surprises at integration. Increasing the awareness

Chair for Applied Software Engineering, Technische Universität München, email: wolft@in.tum.de
Chair for Applied Software Engineering, Technische Universität München, email: dutoit@in.tum.de


of the other sites' development state, including problems and upcoming changes, is a key factor for increasing collaboration and trust among sites and, consequently, their ability to prevent or solve integration problems [1]. Software development projects need to deal with change during the whole life cycle. A variety of changes occur during development, including requirements changes triggered by the client, architectural changes triggered by technology enablers, and stakeholder changes, such as new clients, end users, or project managers. Because of dependencies among development artifacts, a single change can impact many other artifacts. Traceability is a key success factor for long-term projects. Dependency links between stakeholders, requirements, design, and implementation are captured and maintained to identify the impact of changes on models and the system, as well as to identify critical, decision-making stakeholders [12, 9, 10]. When developing in distributed settings, the need for traceability increases, as the information exchange and awareness of other sites decrease and the identification of change impact becomes more difficult. Rationale is the justification behind decisions, including alternatives explored, evaluation criteria, arguments, and decisions. While requirements and design decisions are captured in system models, their rationale is gradually lost as participants move to other projects or as their memories fade. The assumption behind rationale management is that externalizing rationale, making it available to project participants, and structuring it for long-term use can increase the quality of decisions, increase trust and consensus among stakeholders, and facilitate changes to existing systems [15]. While the usefulness of rationale management has been shown in specific cases, the overhead associated with training participants and formalizing knowledge remains a significant obstacle during development [16]. In this paper, we propose a model for uniformly representing traceability and rationale within the context of a system model. Annotations, in the form of comments, action items, or issues, are used as a lightweight communication mechanism and for making status information explicit. System models and annotations are represented as a generic graph. Relationships among system models and annotations are captured, either automatically or manually depending on the task at hand. Pre- and post-traceability are then supported by following these links. For example, when assessing the impact of a requirement change, a user can find all likely impacted requirements, test cases that would need to be adapted and rerun, related nonfunctional requirements, and components that need to be reviewed or changed. Similarly, when changing a component, all requirements related to the component can be identified and retested, and critical stakeholders who need to be consulted can be found. By putting equal emphasis on system elements and annotations, implicit relationships that are otherwise not captured in the system model can be inferred. For example, it is relatively straightforward to maintain explicit trace links between functional requirements (e.g., use cases) and the components in which they are realized. Conversely, tracing nonfunctional requirements to their corresponding architectural decisions is more complex. In our model, nonfunctional requirements can be used when assessing different options for addressing an issue.
This enables the developer to follow traces between system elements that are connected through an annotation. The proposed model is supported by a tool suite called SYSIPHUS [19]. SYSIPHUS is a distributed multi-user application that enables users to collaborate synchronously on the same system models (e.g., use cases, classes, components), documents (e.g., requirements analysis document, system design document), and annotations. Users familiar with CASE tools access SYSIPHUS with a graphical desktop application called RAT [18]. Users more familiar with a document management metaphor access SYSIPHUS through a web application called REQuest [6]. SYSIPHUS also includes an awareness component that notifies users of relevant changes. Starting at a specified element, a traceability wizard facilitates the identification of change impact and the exploration of the model, including system models, documents, and annotations. This paper is structured as follows. Section 2 introduces our generic graph model and shows how it is used to realize concrete system models. Section 3 describes the annotation model, including issues, comments, and action items, and its relation to the system models. Section 4 illustrates how different traceability scenarios are supported. Section 5 summarizes the current status of the tool suite. Section 6 discusses the case studies we have done so far. Section 7 discusses our model in the light of related work. Section 8 concludes this paper.

2. System Models

In SYSIPHUS, models are represented as a graph. The graph consists of two classes, ModelElement and ModelLink. All system model elements (e.g., use cases, nonfunctional requirements) and collaboration elements (e.g., comments, issues, action items) inherit from the ModelElement class. A model element can have an arbitrary number of links to other model elements. A link is represented by the class ModelLink and refers to exactly two other model elements. The class ModelLink is itself a model element and can be connected by other links to any number of model elements. The graph provides a generic and extensible basis for all concrete model elements. Figure 1 shows the graph and illustrates the typing mechanism by adding a use case and an actor model element, both of which extend the class ModelElement. To define that an actor may initiate many use cases, a class InitiatingActorUseCaseLink is used; it refines the class ModelLink by connecting only actors with use cases.

Figure 1. The UML class diagram shows the generic, type-less graph and an example of a typed refinement representing a use case and an actor. The InitiatingActorUseCaseLink refines the ModelLink and defines the initiated use cases of an actor.
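Rendered in Java, the generic graph of Figure 1 could look roughly as follows. The class names are taken from the text; the constructors and fields are our own assumptions for illustration, not the actual SYSIPHUS API.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the generic graph of Figure 1 (names from the text; the
    // constructors and fields are assumptions, not the SYSIPHUS API).
    class ModelElement {
        final List<ModelLink> links = new ArrayList<>();  // arbitrary number of links
    }

    // A link refers to exactly two model elements; since a link is itself
    // a model element, links can in turn be connected to other elements.
    class ModelLink extends ModelElement {
        final ModelElement first, second;

        ModelLink(ModelElement first, ModelElement second) {
            this.first = first;
            this.second = second;
            first.links.add(this);
            second.links.add(this);
        }
    }

    // Typed refinements: concrete system model elements...
    class Actor extends ModelElement { }
    class UseCase extends ModelElement { }

    // ...and a typed link that connects only actors with use cases,
    // defining which use cases an actor initiates.
    class InitiatingActorUseCaseLink extends ModelLink {
        InitiatingActorUseCaseLink(Actor initiator, UseCase useCase) {
            super(initiator, useCase);
        }
    }

In such a rendering, the host language's type system enforces that an InitiatingActorUseCaseLink connects only actors with use cases, which is the essence of the typing mechanism described above.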

SYSIPHUS stores all model elements in a central repository, tracking authorship and changes, and providing access control. Users access model elements in the context of a document. In SYSIPHUS, a document is a view of the model customized for a specific activity or role. Documents can be customized for each project. A document is defined in terms of sections and subsections, each containing either text, a diagram, or a filter. The filter mechanism is used to attach any selected model elements to the sections. A filter is defined as a class of element (e.g., a use case) and an optional number of property names and values (e.g., priority = high). For example, a requirements section may contain all use cases and a use case diagram. A management document may contain a section with all use cases of high priority. When a use case changes, both documents are kept consistent, because all sections display the same model. Assuming the priority of a use case changes from low to high, the use case stays in the requirements section but dynamically appears in the management document. To enable collaboration with external stakeholders, SYSIPHUS can export a document into files of different formats, such as PDF or RTF. Figure 2 illustrates the document model used in SYSIPHUS together with some selected model element classes.
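The filter mechanism can be pictured as a simple predicate over model elements; the sketch below is again illustrative, with assumed names.

    import java.util.Map;

    // Illustrative sketch of a section filter: it selects all elements of
    // a given class whose properties match the given values (e.g., all use
    // cases with priority = high). Names are assumptions, not SYSIPHUS code.
    class SectionFilter {
        final Class<?> elementClass;           // e.g., UseCase.class
        final Map<String, String> properties;  // e.g., {"priority": "high"}

        SectionFilter(Class<?> elementClass, Map<String, String> properties) {
            this.elementClass = elementClass;
            this.properties = properties;
        }

        // True if the element's class and property values match the filter.
        boolean accepts(Object element, Map<String, String> elementProperties) {
            return elementClass.isInstance(element)
                    && elementProperties.entrySet().containsAll(properties.entrySet());
        }
    }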
Figure 2. SYSIPHUS document structure (UML class diagram); only selected classes depicted for illustration.

3. Annotation Models

In SYSIPHUS, users collaborate by linking annotations to model elements. Comment, Issue, and ActionItem are the main classes within this model. All inherit from the Annotation class, which in turn inherits from ModelElement. Consequently, annotation elements have the same importance as system model elements and, unlike in other tools, are treated as first-class objects. Annotations can also be included in document sections, in the same way system model elements are. For example, a requirements specification or an architecture document typically has an open issues section, listing the annotations of class Issue whose status property is open. Figure 3 shows that a single model element (e.g., use case, class, component) can be annotated by many annotations and a single annotation can be linked to many model elements. Therefore, annotations can be used to represent complex relationships, relating many system and annotation elements from different stakeholders.


Figure 3. SYSIPHUS annotation model (UML class diagram).

Comments are a simple, informal, and unstructured way for project participants to communicate, similar to a newsgroup. SYSIPHUS does not impose any rules on the usage of comments. Project participants can reply to existing comments, creating discussion threads. Unlike in newsgroups, comments and replies can be attached to many model elements, providing a context for the discussion. While comments are suited for supporting lightweight, short-term collaboration, such as requesting clarifications or indicating superficial problems, they are not sufficient for long-running or complex design discussions involving conflicting criteria and many alternatives. To support such discussions, SYSIPHUS provides an issue model similar to QOC (Questions, Options, Criteria [13]), including elements for representing options (i.e., alternatives under consideration), criteria (i.e., qualities, such as nonfunctional requirements or design goals, that should be used to evaluate the alternatives), and assessments (i.e., judgments indicating how well an option satisfies a criterion). We selected QOC instead of the more popular IBIS model (Issue Based Information System [4]) as we observed that users often reverse engineer issue models from threaded discussions (as opposed to structuring them on the fly). Once an issue has been discussed and resolved, users can plan the resulting work in terms of action items. SYSIPHUS supports a simple task model consisting of action items. An action item has a description, a status, and a due date, and can be assigned to any project participant. Users can decompose an action item into more detailed action items. Like all annotations, action items can be attached to any number of model elements, to indicate which elements are likely to change in the process of accomplishing the task. The annotation elements of SYSIPHUS have been designed incrementally, based on the feedback of SYSIPHUS users. Our goal is to strike a balance between spontaneous, informal collaboration and formal, long-term rationale capture.
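Under the same illustrative assumptions as the graph sketch in Section 2 (the shapes and fields are ours, not the SYSIPHUS API), the annotation model of Figure 3 could be rendered roughly as follows.

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    // Sketch of the annotation model of Figure 3, building on the
    // ModelElement class sketched in Section 2. Illustrative only.
    abstract class Annotation extends ModelElement {
        String text;  // annotations are first-class model elements
    }

    class Comment extends Annotation {
        final List<Comment> replies = new ArrayList<>();  // discussion threads
    }

    class Criterion extends Annotation { }  // e.g., an NFR or a design goal

    class Assessment extends Annotation {   // how well an option satisfies a criterion
        Criterion criterion;
        String judgment;
    }

    class Option extends Annotation {       // an alternative under consideration
        final List<Assessment> assessments = new ArrayList<>();
    }

    class Resolution extends Annotation { } // records the outcome of an issue

    class Issue extends Annotation {        // QOC-style issue
        final List<Option> options = new ArrayList<>();
        Resolution resolution;              // set once the issue is resolved
    }

    class ActionItem extends Annotation {   // planned work after resolution
        String status;
        Date dueDate;
        final List<ActionItem> subItems = new ArrayList<>();  // decomposition
    }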

Table 1. Traceability matrix between selected model classes (user task, use case, service, component, class, NFR, test case, annotation), supported by SYSIPHUS. A cross indicates a direct link.

While using issue models for supporting daily collaboration and design tasks introduces an unnecessary overhead [16], project participants need a mechanism for documenting conflicts and their resulting agreements, as indicated by the experience with WinWin [2]. Our assumption is also that capturing such contentious issues can help in future changes, by finding the human source of a criterion or by identifying indirectly related model elements. We discuss how SYSIPHUS supports pre- and post-traceability next.

4. Pre- and Post-Traceability


Software development projects need to deal with change. It is quite common that changes occur to all artifacts during the whole project duration. For instance, the requirements may change when project participants gain deeper knowledge of the problem domain. When new technology is identified, the current architecture may need to be adapted. To maintain consistency among models, the elements related to a changed element need to be evaluated and, if needed, updated. In addition to identifying all related elements potentially impacted by the change, it is often necessary to identify the human sources of these elements, as they may have tacit knowledge about how (or whether) the change should be realized. Tracing from requirements forward, for example to impacted design, implementation, or test elements, is called post-traceability. Tracing from requirements (or other elements) back to their human sources is called pre-traceability [12]. SYSIPHUS supports post-traceability with two concepts. First, explicit links are used for tracing between elements. These links are part of the model (e.g., the InitiatingActorUseCaseLink introduced in Section 2). Table 1 shows selected traceability links supported by the default model. To increase the usability of tracing, the user can filter and search within the traceability graph. Moreover, all elements that might be impacted by a change can be automatically annotated with action items. These action items mark that a model element has been identified as impacted by a change but still needs to be reviewed. The second post-traceability concept is based on annotations. Users attach annotations, for example issues, to several model elements. This implies that the model elements are related. As annotations are also model elements, a user can trace over annotation links to find related elements. For example, a component can take part in the discussion of an issue in which a number of nonfunctional requirements are used as criteria in the resolution of the issue. Even if the component is not directly related to these nonfunctional requirements, the user will be able to find them by tracing through the issue.


Pre-traceability is supported similarly, by enabling the user to examine the author of a model element in relation to its use in an issue. For example, by examining who wrote a specific nonfunctional requirement and how it was used as a criterion when resolving an issue, the user can identify which qualities are critical for each stakeholder. This use of the issue model is similar to the WinWin process and tool [2].
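The trace through an issue described above is essentially a two-hop traversal of the generic graph: from an element to its annotations, and from each annotation to the other elements attached to it. A sketch, reusing the illustrative classes from the earlier listings:

    import java.util.LinkedHashSet;
    import java.util.Set;

    // Sketch: find elements indirectly related to a starting element through
    // a shared annotation (e.g., NFRs used as criteria in an issue in which
    // a component also takes part). Reuses the earlier illustrative classes.
    class AnnotationTracer {

        static Set<ModelElement> relatedThroughAnnotations(ModelElement start) {
            Set<ModelElement> related = new LinkedHashSet<>();
            for (ModelLink link : start.links) {
                ModelElement other = (link.first == start) ? link.second : link.first;
                if (other instanceof Annotation) {
                    // Hop 2: follow the annotation's links to its other elements.
                    for (ModelLink annotationLink : other.links) {
                        ModelElement neighbor = (annotationLink.first == other)
                                ? annotationLink.second : annotationLink.first;
                        if (neighbor != start && !(neighbor instanceof Annotation)) {
                            related.add(neighbor);
                        }
                    }
                }
            }
            return related;
        }
    }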

5. The Sysiphus Tools

In this section, we describe SYSIPHUS [19], a distributed environment that supports the conceptual model and traceability scenarios outlined in the previous sections. SYSIPHUS is composed of a suite of tools centered on a repository, which stores all models, rationale, and user information. The repository controls access and concurrency, enabling multiple users to work at the same time on the same models. Users access models either through a web interface (REQuest [6]) or a graphical user interface (RAT [18]). Both interfaces access the repository through a common model layer that defines the types of elements and links. The model layer can easily be extended to accommodate new element and link types, as dictated by the needs of a project. The model layer in turn uses the SYSIPHUS repository for persistence and for pushing changes to other users. With the web interface, users can access the repository from a variety of environments (e.g., lab, home, office) without installing additional software. It is divided into two parts, one for accessing and modifying the system models, the other for accessing the discussion threads, rationale, and action items. Users can trace from the system models to relevant collaboration models and back. The graphical user interface (see Figure 4) provides a UML view over the models, adopting the look and feel of typical UML modeling tools. The graphical user interface is typically used for drawing diagrams and, with drag-and-drop interaction styles, offers more intuitive ways to specify relationships among model elements than the web interface. While using a unified model for representing system models and annotations facilitates the use of traceability links, usability and visualization issues have been just as critical to our goal of lowering the overhead of capturing this information. For example, displaying the entire traceability graph quickly overwhelms the user, given the number of interrelationships among elements in realistic models. Instead, we evaluated various features for filtering links and for providing a starting element. In the next subsections, we single out three features of SYSIPHUS that illustrate these issues: awareness, traceability table, and traceability wizard.

5.1. Awareness

Within distributed projects, awareness of the other sites' work progress, including development artifacts, documentation, and problems, is a major key to reducing misunderstandings and redundant work and to increasing trust across sites. Awareness helps to identify critical issues early and makes it possible to resolve issues in time [3, 1, 8]. SYSIPHUS supports awareness of new model elements, modifications, and deletions by notifying project participants via email. Notifications range from documents and system models (e.g., use cases, components, classes) to issues and action items. To avoid broadcasting all changes to all project participants, SYSIPHUS enables users to configure the kinds of changes they want to be notified of. They can choose specific model element classes such as use case, component, issue, or task, resulting in an email if any elements of these classes are created, modified, or deleted. In addition, they can select existing documents or sections of documents they want to be notified of. For instance, if a user chooses the requirements analysis document, then he is informed of all text modifications of its sections and of all modifications to use cases included in the document. SYSIPHUS also enables the user to restrict notifications to changes made by specific persons. Notification emails are sent in hourly, weekly, or monthly intervals, containing all changes that occurred. For each change, the responsible person, the date and time, and a URL to the model element within SYSIPHUS are included.

Figure 4. Screenshot of the graphical user interface RAT.
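The subscription mechanism just described can be approximated as a filter over change events. The following sketch uses hypothetical names; it is not the actual SYSIPHUS notification API.

    import java.util.Date;
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical sketch of the awareness subscriptions described above.
    class ChangeEvent {
        Object element;   // the created, modified, or deleted element
        String author;    // the responsible person
        Date when;
        String url;       // link to the element within SYSIPHUS
    }

    class Subscription {
        final Set<Class<?>> elementClasses = new HashSet<>(); // e.g., UseCase.class
        final Set<String> documentSections = new HashSet<>(); // sections of interest
        final Set<String> authors = new HashSet<>();          // restrict to persons

        // Empty sets mean "no restriction" along that dimension.
        boolean matches(ChangeEvent event, String section) {
            boolean classOk = elementClasses.isEmpty()
                    || elementClasses.contains(event.element.getClass());
            boolean sectionOk = documentSections.isEmpty()
                    || documentSections.contains(section);
            boolean authorOk = authors.isEmpty() || authors.contains(event.author);
            return classOk && sectionOk && authorOk;
        }
    }

Matching events would then be batched into the hourly, weekly, or monthly digest emails described above.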

5.2. Traceability Table

SYSIPHUS provides two main graphical interfaces to visualize and follow traceability links. Both focus on reducing complexity, to support users in their daily work tasks. The first view displays all elements as a table (see Figure 5), with one element per row. Element properties are displayed in the columns. The columns also include the target elements of the traceability links. For instance, Figure 5 displays the use cases of an example project, with columns for Open Annotations and Realizing Classes. The user can access the properties of any displayed element by double-clicking on its name. Users can also configure which columns are displayed for a specific element class. The rows can be sorted by column, and the content can be searched and filtered. Hence, the tables are useful for inspection and review tasks. For example, a user can identify critical use cases by displaying all use cases and sorting for open issues. Alternatively, all use cases without a corresponding system test case can be displayed.

Figure 5. A screenshot of the traceability table view, containing use cases. The complexity of displaying traces is reduced by adding the trace targets in columns and using a label of the traceability link as column header.

5.3. Traceability Wizard

The second traceability view is graph-based and focuses only on the context surrounding a single element, which is displayed in the center of the graph pane. All its traces and related elements are displayed. The layout differs between annotations (issues, comments, and tasks) and the other elements. It is separated into four areas: left, right, top, and bottom. The subgraph associated with a system element is drawn so that all incoming system links are depicted on the left and all outgoing system links on the right. Annotations attached to the system element are displayed below it, while constraining elements (e.g., nonfunctional requirements) are displayed on top. In other terms, the horizontal axis displays system model dependencies while the vertical axis displays rationale information. As the graph focuses on only one element, traces between other displayed elements are hidden.

Figure 6. A sequence of screenshots of the traceability graph. Each screenshot shows the graph after the user clicks on a traced element.

Figure 6 shows such a sequence of screenshots. The graph is interactive and enables the user to click on a displayed element to set a new focus. Double-clicking on an element opens it in a new pane that displays its properties and enables its modification. Each screenshot shows the graph after the user clicks on a traced element, starting at the actor Customer: from the Customer, the user clicks on the user task Buy Article, then on the use case Pay for Article, followed by the issue "How should the customer input commands?", ending at the quality constraint Familiarity with Checkout Procedure.
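Read operationally, the four-area layout amounts to a simple classification of the focus element's neighbors. A sketch, again reusing the illustrative graph classes; isConstraining is a hypothetical helper, not the actual RAT implementation.

    // Sketch of the four-area layout of the traceability wizard for a
    // system element in focus.
    class WizardLayout {
        enum Area { LEFT, RIGHT, TOP, BOTTOM }

        static Area areaOf(ModelElement focus, ModelLink link) {
            ModelElement other = (link.first == focus) ? link.second : link.first;
            if (other instanceof Annotation) return Area.BOTTOM;  // rationale below
            if (isConstraining(other)) return Area.TOP;           // e.g., NFRs on top
            // Incoming system links on the left, outgoing ones on the right.
            return (link.second == focus) ? Area.LEFT : Area.RIGHT;
        }

        static boolean isConstraining(ModelElement element) {
            return false;  // placeholder: true for NFRs and similar constraints
        }
    }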

6. Evaluation

Our approach to evaluating and refining SYSIPHUS has been incremental. We have used a software engineering project course at our chair to explore and debug initial ideas [7]. We have used this project course as a first approximation of a distributed project, as students work part time, from different locations, and usually do not know each other before the start of the course. Moreover, students elicit requirements and deliver a system to an actual client external to the university. In a second phase, once we had taken into account feedback from the project course, we used documents and feedback from industrial partners to assess how realistic our ideas would be in a real project. This enables us to study aspects related to scale and long-term use that we cannot assess in an academic environment.

During the winter semester 2003/04, about 30 students developed a prototype forecasting system for the Wacker chemical company. SYSIPHUS was used for developing the requirements analysis and the system design documents. The requirements analysis document turned out to be one of the primary client deliverables of this project, used by the Wacker IT department for realizing a production version of the system. During the winter semester 2004/05, about 25 students developed CampusTV, a prototype system for supporting interactive and distributed lectures based on digital video broadcast (DVB-T). The hardware transmitters were provided by the company and our client Rohde & Schwarz. In both projects, annotations in SYSIPHUS were used for ad hoc collaboration (clarification questions, discussion of design issues) and for tracking open changes. Change requests were indicated as challenge questions that were then closed by the participant who implemented the change. This made it possible for instructors to monitor and address potential problem areas. In addition, we noticed that the students' awareness of other teams' work was much higher than in previous project courses. The students noticed open issues of other teams and, given the traced system model context, were able to understand, help, and collaborate across teams.

To assess our traceability model on an actual project, we used a set of documents from an industrial project developing a communication framework for enterprise phone applications. At the time, the project had been running for about nine months, included about 40 people, and was about to be distributed to four different sites, growing to about 100 participants. We selected about 120 use cases and 100 nonfunctional requirements from their requirements documents and entered into SYSIPHUS traces for these requirements, including links to design components and system tests. The documents we examined also included annotations, which we also entered into SYSIPHUS. We then used the entered models as examples to elicit feedback during interviews with key individuals, then with project management, and finally with all relevant persons as a group. Several participants interviewed had taken part in distributed projects and had used DOORS [17] as a requirements management tool. While this study is still ongoing, there are several preliminary lessons that we were already able to draw:

Different roles are involved in creating and using the traceability links. For example, architects or senior developers are typically responsible for linking requirements to components, whereas product managers and requirements engineers need change impact knowledge when prioritizing requirements. When the individuals involved are at different sites, we think that the likelihood that traceability links are entered is much reduced.

Individuals entering requirements in the tool are not necessarily the stakeholders who originated the requirements, as pointed out by Gotel [10]. However, architectural and development decisions seem more easily traceable to their authors.

The project we studied was organized in four-week iterations, at the end of which a product increment, including both documentation and code, was validated. We think that such fast-paced iterations are necessary to keep the model up to date so that different sites use annotations on the model to collaborate. Conversely, we think that the value of traceability through annotations would be reduced in projects where documentation either precedes the product or is reconstructed after the product is stable.

In summary, the observations we collected so far seem to indicate that the traceability model presented in this paper could be deployed and be useful in a distributed development environment. However, we also observe that the missing traceability links will probably most often be those between model elements owned by different sites or by different roles. In this case, it is critical that sufficient collaboration (either in the form of informal comments or formal issues) occurs over the model to make up for this deficit. In either case, this phenomenon would have to be studied with models or documents produced by a distributed project running for a longer period of time.

7. Related Work

Ebert and De Neve summarize in "Surviving Global Software Development" [8] experiences and best practices for globally distributed software development projects. The results focus on project organization structures, process management, and integrated workflow management. They point out that one of the real challenges is to spread awareness, communication, and knowledge to all development levels and sites in real time, especially when different cultures and languages are involved. In this paper, we proposed a model and prototype implementation to increase these properties by real-time collaboration on requirements and system models integrated with rationale knowledge and communication. Traceability between requirements, system models, and rationale knowledge and communication is supported to identify change impact as well as critical stakeholders.

Battin, Crocker, and Kreidler describe in [1] the main issues and solutions of a globally distributed project at Motorola. The main issues they encountered are the loss of communication richness between different sites and centralized software configuration management. Their solution approach for the loss of communication richness includes the exchange of documents and work products over the intranet, using a web site for each site. SYSIPHUS follows the same principle, as all artifacts, such as documents, models, rationale, and communication elements, are located in a central repository accessible from all sites. Thus, awareness of the other sites' work increases and foreign experts can be found.

Chisan and Damian develop a model of awareness support for software development in GSD [3]. They propose a workspace containing all development artifacts, such as the requirements or the design document, which supports notification of relevant project participants when the content of the workspace changes. To avoid broadcast notification, the model proposes artifact dependencies in order to notify selected people only. We agree with the need for awareness in distributed software development and have already implemented aspects of the proposed model. Our approach consists of a central repository containing all development artifacts, which are extensions of a generic graph model. Our concrete development artifacts range from documents (e.g., requirements, design, test) to UML models such as use cases, classes, or components, to design rationale based on QOC [13]. Awareness is already supported, as developers can subscribe to their project and get notified when elements change. Instead of being notified of all changes, they can define the artifact classes or document sections they want to be aware of. Notification of dependent artifacts is not supported right now, but would be a consequence of our traceability approach as proposed in this paper.

Mockus and Herbsleb show in [14] the need for finding experts in distributed projects. They introduce a tool that supports the measurement of expertise and expert searching in terms of people or organizational units. The tool works on the data of version control systems (VCS), capturing deltas of files. In addition, author information, modification dates, and a change log are captured. The tool enables the browsing and searching of work products and the expertise-based identification of people. The work products are mainly source code files and are visualized as a product hierarchy derived from the source code directory structure. VCSs are very good at dealing with text files, but they have problems with binary files, which are mainly used by CASE tools to store models. Often, MS Word files are used to create the project documents, containing images of diagrams. The SYSIPHUS server component captures change data similar to that of a VCS. Therefore, the same approach as in [14] could be used to identify experts within the requirements, system models, rationale knowledge, or communication elements. The introduced traceability concepts would further help to identify dependencies among experts, expertise, and work products in terms of model elements.

Dellen, Kohler, and Maurer propose in [5] methods and techniques to automatically extract causal dependencies between decisions from decompositions of development tasks and from the information flow of fine-grained software process models. The decisions are part of the rationale information and are supported with pros and cons. Decisions are made during tasks, which are part of the process model. The decisions of a task influence dependent tasks. The application of their approach overlaps with ours: the dependencies are used to identify change impact when a decision changes or becomes invalid. In contrast to our work, they focus on a process model and use development artifacts as inputs and outputs for tasks.
We focus on all kinds of development artifacts and support direct links between them and rationale, capturing the reasoning behind them and providing a context to resolve issues.

8. Conclusion and Future Work

In this paper we described a model to capture and manipulate development artifacts, such as documents, requirements and system models, as well as rationale knowledge and dependencies. Pre- and post-traceability between system models is supported and, in particular, traceability between rationale and system models. We observed that visualizing and working on these linked models is challenging, as the complexity of traces increases quickly. Consequently, we evaluated alternative graphical interfaces supporting the user in focusing on desired traces. Currently, we support only manual traceability, in which a user can only navigate between dependent elements. Intelligent filters, searches and automated operations on the traceability graph are needed to work with large amounts of data. For instance, SYSIPHUS users have to identify the change impact on design elements manually when changing a requirement. Ideally, the system should support change impact analysis by returning a set of elements that should be revised, rather than all reachable elements. The collaboration with an industrial project helped us to identify the needs of distributed development and assess the current version of our traceability concept. We gathered qualitative observations by surveying large student development projects in order to evaluate our increments. However, the work is still in progress and more evaluations are needed.

References
[1] Robert D. Battin, Ron Crocker, Joe Kreidler, and K. Subramanian. Leveraging resources in global software development. IEEE Software, pages 70-77, March/April 2001.
[2] Barry W. Boehm, Alexander Egyed, Julie Kwan, Daniel Port, Archita Shah, and Raymond J. Madachy. Using the WinWin spiral model: A case study. IEEE Computer, pages 33-44, July 1998.
[3] James Chisan and Daniela Damian. Towards a model of awareness support of software development in GSD. In The 3rd International Workshop on Global Software Development, pages 28-33, May 2004.
[4] Jeff Conklin and K. C. Burgess-Yakemovic. A process-oriented approach to design rationale. Human-Computer Interaction, 6(3-4):357-391, 1991.
[5] Barbara Dellen, Kirstin Kohler, and Frank Maurer. Integrating software process models and design rationales. In Knowledge-Based Software Engineering Conference, volume 11, 1996.
[6] Allen H. Dutoit and Barbara Paech. Rationale-based use case specification. Requirements Engineering Journal, 7(1):3-9, 2002.
[7] Allen H. Dutoit, Timo Wolf, Barbara Paech, Lars Borner, and Juergen Rueckert. Using rationale for software engineering education. In Timothy C. Lethbridge and Daniel Port, editors, 18th Conference on Software Engineering Education and Training, pages 129-136. IEEE, April 2005.
[8] Christof Ebert and Philip De Neve. Surviving global software development. IEEE Software, pages 62-69, March/April 2001.
[9] Orlena Gotel and Anthony Finkelstein. An analysis of the requirements traceability problem. In International Conference on Requirements Engineering, pages 94-101, Colorado, April 1994. IEEE.
[10] Orlena Gotel and Anthony Finkelstein. Contribution structures. In International Symposium on Requirements Engineering, pages 100-107. IEEE, March 1995.
[11] James D. Herbsleb and Audris Mockus. An empirical study of speed and communication in globally distributed software development. IEEE Transactions on Software Engineering, 29(6):481-494, June 2003.
[12] Matthias Jarke. Requirements tracing. Communications of the ACM, 41(12):32-36, December 1998.
[13] Allan MacLean, Richard M. Young, Victoria M.E. Bellotti, and Thomas P. Moran. Questions, options, and criteria: Elements of design space analysis. Human-Computer Interaction, 6(3-4):201-250, 1991.
[14] Audris Mockus and James D. Herbsleb. Expertise Browser: A quantitative approach to identifying expertise. In International Conference on Software Engineering, pages 503-512, May 2002.
[15] Thomas P. Moran and John M. Carroll. Design Rationale: Concepts, Techniques, and Use. Lawrence Erlbaum Associates, January 1996.
[16] Simon Buckingham Shum and Nick Hammond. Argumentation-based design rationale: what use at what cost? International Journal of Human-Computer Studies, 40(4):603-652, 1994.
[17] Telelogic. http://www.telelogic.com, 2005.
[18] Timo Wolf and Allen H. Dutoit. A rationale-based analysis tool. In Walter Dosch and Narayan Debnath, editors, Proceedings of the ISCA 13th International Conference on Intelligent and Adaptive Systems and Software Engineering (IASSE-04), pages 209-214. ISCA, July 2004.
[19] Timo Wolf and Allen H. Dutoit. Sysiphus at http://sysiphus.in.tum.de, May 2005.


Traceability Management in ADAMS


Andrea De Lucia, Fausto Fasano, Rita Francese, Rocco Oliveto
{adelucia, ffasano, francese, roliveto}@unisa.it
Dipartimento di Matematica e Informatica
Università di Salerno
Via Ponte don Melillo, 84084 Fisciano (SA), Italy

Abstract
Maintaining traceability links (dependencies) between artefacts enables the management of changes during incremental and iterative software development in a flexible way. In this paper we present the traceability environment offered by ADAMS (ADvanced Artefact Management System). Basically, the traceability layer is used to propagate events concerning changes to an artefact to the dependent artefacts, thus also increasing the context awareness within the project. The proliferation of the messages generated by the system could slow down the system and cause the developer to ignore notifications. Therefore, ADAMS also includes a visualisation tool that enables the software engineer to browse the dependences concerning a given artefact and selectively subscribe to the events he/she is interested in.

1. Introduction
In recent years, methodologies and technologies supporting the coordination and collaboration of distributed software engineering teams have been largely investigated. Configuration Management (CM) tools (see e.g., [9], [20]) are mostly used in software engineering projects to cope with cooperation and coordination problems. Existing CM systems are based on the workspace concept, representing the work environment of each user [19]. The adoption of such a separate area causes a lack of context awareness. As a matter of fact, the discovery of potential problems is delayed, as a developer is informed of work made by others on the artefacts he/she is working on, or on related artefacts, only after these have been checked-in. On the other side, research on Process Support Systems (PSSs) [1], [7], including Workflow Management Systems (WfMSs) [28] and Process-centered Software Engineering Environments (PSEEs) [2], aims at supporting the coordination of software development activities through process modelling and enactment. The Process Description Languages (PDLs) they propose for modelling business processes are too complicated for many practitioners to understand and manipulate. Thus, despite the advances made in the field, most of the proposed solutions have not gained wide acceptance in industry. Most PSSs are activity-based and model software processes in a top-down manner, focusing on the specification of the control and data flow between activities. In these systems the modelling of the activities is difficult and laborious, often similar to programming, and, as a consequence, project managers are not facilitated by their adoption. When unforeseen situations (frequently) happen, deviations from the process model are, in general, not supported. Even when such support is provided, these situations are too complicated to manage. Moreover, there is often a lack of integration with configuration management systems, and the production of an artefact is seen as the result of the execution of an activity [1]. Differently from CM tools, most recent PSSs integrate communication tools and notification mechanisms to make developers aware of events occurring within activities, providing greater support for context awareness [1].

Recent research has demonstrated that the lack of adequate traceability support is one of the main causes of over-running projects [6]. CM tools mainly enable versioning of artefacts, but traceability information among different artefacts is lacking, and even when it is supported, the traceability infrastructure fails as the system evolves. In PSSs, dependencies between artefacts can be derived from the data flow links between activities, but relationships between the artefacts produced during the software development process are not directly stored and maintained. As a result, handling changes is difficult. This paper focuses upon the traceability management features offered by ADAMS (ADvanced Artefact Management System), an artefact-based process support system for the management of human resources, projects, and software artefacts [12]. Rather than defining the control and data flow between activities, as in most PSSs, ADAMS models relations between the produced artefacts. Maintaining traceability links (dependencies) between artefacts supports the management of changes during incremental and iterative software development in a flexible way. Basically, the traceability layer is used to propagate events concerning changes to an artefact to the dependent artefacts, thus also increasing the context awareness in the project. ADAMS increases context awareness without overloading a software engineer with useless event messages. To this aim, a software engineer can selectively specify the events concerning the artefact he/she needs to be notified of. In particular, he/she can ask to be informed about events having a direct impact on the artefact he/she is working on, about events concerning all the indirectly dependent artefacts, or about events on specific artefacts. To support this feature ADAMS provides a traceability visualisation graph showing all the dependences concerning a given artefact. This graph can be visualised and browsed to look at the state of previously developed artefacts, to download the latest artefact versions, or to subscribe to events on specific artefacts in order to receive notifications concerning their development. When changes occur or the project grows, traceability link management tends to become a time-consuming and error-prone activity. ADAMS reduces the traceability maintenance problem by providing the traceability link recovery tools proposed in [13], [14]. The remainder of this paper is organised as follows. Section 2 discusses related work. Section 3 presents an overview of ADAMS, while Sections 4 and 5 describe traceability and event management, respectively. Finally, Sections 6 and 7 present a preliminary evaluation of the usage of ADAMS and concluding remarks, respectively.

2. Related Work
Several research and commercial tools are available that support traceability between artefacts. DOORS [27] and Rational RequisitePro [23] are commercial tools that provide effective support for recording, displaying, and checking the completeness of traced artefacts using operations such as drag and drop [27], or by clicking on a cell of a traceability link matrix [23]. RDD.100 [17] (Requirements Driven Development) uses an ERA (Element/Relationship/Attribute) repository to capture and trace complicated sets of requirements, providing functionalities to graphically visualise how individual pieces of data relate to each other and to trace back to their source. TOOR [21] is a research tool in which traceability is not modelled in terms of simple links, but through user-definable relations that are meaningful for the kind of connection being made. This lets developers distinguish among different links between the same objects. Moreover, TOOR can relate objects that are not directly linked using mathematical properties of relations such as transitivity. gIBIS [18] is a hypertext system designed to facilitate the capture of early design deliberations (rationale). It implements a specific method of design deliberation, called Issue Based Information System (IBIS) [24], based on the principle that the design process for complex problems is fundamentally a conversation among the stakeholders, in which they bring their respective expertise and viewpoints to the resolution of design issues. gIBIS (for graphical IBIS) makes use of colour and a high-speed relational database server to facilitate building and browsing typed IBIS networks made up of nodes (issues in the design problem) and links among them. REMAP [22] (REpresentation and MAintenance of Process knowledge) is another conceptual model based on IBIS, which relates process knowledge to the objects that are created during the requirements engineering process. REMAP offers a built-in set of types of traceability relations with pre-defined semantics. It was developed using an empirical study of the problem-solving behaviour of individuals and groups of information system professionals. Recently, artefact traceability has been tackled within the Ophelia project [26], which pursued the development of a platform supporting software engineering in a distributed environment. In Ophelia the artefacts of the software engineering process are represented by CORBA objects. A graph is created to maintain relationships between these elements and navigate among them. OSCAR [3] is the artefact management subsystem of the GENESIS environment [1]. It has been designed to non-invasively interoperate with workflow management systems, development tools, and existing repository systems. Each artefact in OSCAR possesses a collection of standard meta-data and is represented by an XML document containing both meta-data and artefact data that include linking relationships with other artefacts. OSCAR introduces the notion of active software artefacts that are aware of their own evolution. To support such an active behaviour, every operation on an artefact generates events that may be propagated by an event monitor to artefacts deemed to be interested in such events by their relationships with the artefact generating the event. Other tools [5], [6] also combine the traceability layer with event-based notifications to make users aware of artefact modifications. Chen and Chou [5] have proposed a method for consistency management in the Aper process environment. The method is based on maintaining different types of traceability relations between artefacts, including composition relations, and uses triggering mechanisms to identify artefacts affected by changes to a related artefact. Cleland-Huang et al. [6] have developed EBT (Event Based Traceability), an approach based on a publish-subscribe mechanism between artefacts. When a change occurs on a given artefact having the publish role, notifications are sent to all the subscriber (dependent) artefacts. Traceability has also been used in the PROSYT environment [7] to model software processes in terms of the artefacts to be produced and their interrelationships. This artefact-based approach results in process models composed of simpler and more general operations than the operations identified using an activity-based approach.
PROSYT is able to manage the inconsistencies between a process instance and the process model [8] by tracing the deviating actions and supporting the users in reconciling the enacted process and the process model, if necessary.

3. ADAMS overview
ADAMS (ADvanced Artefact Management System) is an artefact-based process support system. It enables the definition of a process in terms of the artefacts to be produced and the relations among them [12]. ADAMS emphasises the artefact life cycle by associating software engineers with the different operations they can perform on each artefact. This feature, together with the definition and management of resource permissions, represents a first level of process support and allows the project manager to focus on the practical problems involved in the process without getting lost in the complexity of process modelling, as happens with workflow management systems. ADAMS also enables process management in a flexible way, giving managers the possibility of changing the state of an artefact or the resources associated with it and the related permissions, thus supporting deviations from the development process model during its actual enactment. ADAMS supports quality management by associating each artefact type with a standard template and an inspection checklist to be validated during the review process. Each template can be customised for specific artefacts according to the organisation's quality plan and used to produce the first version of each artefact. As well as the artefact standard template, the standard checklist can be customised for specific artefacts. The support for cooperation is provided through typical configuration management features. In fact, ADAMS enables groups of people to work on the same artefact, depending on the required roles. Different software engineers can access the same artefact according to a lock-based policy or concurrently, if branch versions of the same artefact are allowed by the artefact manager. Moreover, the system has been enriched with features to deal with some of the most common problems faced by cooperative environments, in particular context awareness and communication among software engineers. A first context-awareness level is given by the possibility to see at any time the people who are working on an artefact. Context awareness is also supported through event notifications: software engineers working on an artefact are notified whenever relevant events happen to such an artefact. An example of such events is the creation of a newer version of the artefact. Event notification provides a solution to the isolation problem for resources working on the same artefact in different workspaces [25]. In fact, context awareness makes it possible to identify potential conflicts before they occur, because the system is able to notify interested resources as soon as an artefact is checked-out and potentially before substantial modifications have been applied to it. ADAMS provides support for artefact traceability management. Traceability links can be defined and managed by a project quality manager, who has the responsibility to keep them updated. A semi-automatic traceability tool has been integrated to support quality managers in their work. Traceability links are organised in a traceability graph, which can be visualised to provide a bird's-eye view of the project and to subscribe to events upon specific artefacts. Finally, another way software engineers can exploit the visualisation of traceability information is by sending a feedback whenever they discover an inconsistency in an artefact their work depends on. Feedbacks are then notified to the software engineer responsible for the artefact. Feedbacks, traceability management and event notifications are the mechanisms used by ADAMS to support the software development process.
This approach is much more flexible than activity-based workflow management systems [1], [15], [28], in particular with respect to deviations from the process model.

4. Traceability management in ADAMS


Software artefact traceability is the ability to describe and follow the life of an artefact (requirements, code, tests, models, reports, plans, etc.) developed during the software lifecycle in both a forwards and backwards direction (i.e., from its origins, through its development and specification, to its subsequent deployment and use, and through all periods of on-going refinement and iteration in any of these phases) [16]. Traceability can provide important insights into system development and evolution, assisting in both top-down and bottom-up program comprehension, impact analysis, and reuse of existing software, thus giving essential support in understanding the relationships existing within and across software requirements, design, and implementation [19].

Within ADAMS, software artefact traceability is mainly provided by specifying traceability links between pairs of artefacts. A traceability link represents a relationship between an independent artefact (or source artefact) and a dependent artefact (or target artefact). In ADAMS, it is possible to specify three types of traceability links (see Figure 1):

a dependence (directed link) is a relationship where the target artefact depends on, or is impacted by, changes applied to the source artefact;

an undirected link is a relationship where the artefacts impact on each other;

a composition denotes a whole-part relationship between two artefacts, such that the target artefact is a component of the source artefact.
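To make the link model concrete, a minimal Python sketch of the three link types follows. This is our illustration, not part of the ADAMS implementation; all names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class LinkType(Enum):
    DEPENDENCE = "d"   # target depends on / is impacted by the source
    UNDIRECTED = "u"   # source and target impact each other
    COMPOSITION = "c"  # target is a component of the source

@dataclass(frozen=True)
class TraceabilityLink:
    source: str          # id of the independent (source) artefact
    target: str          # id of the dependent (target) artefact
    link_type: LinkType
```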

Figure 1. Traceability link definition form

Traceability links can be organised in a traceability graph, where artefacts are identified by nodes and traceability links are represented by edges of the graph. Within a traceability graph we identify traceability paths, i.e., sets of artefacts connected by traceability links. Obviously, an artefact along a traceability path could be impacted by each artefact appearing in the part of the path preceding it (i.e., it plays the target role in a set of fictitious traceability links with such artefacts), and it could impact each artefact appearing in the part of the path following it (i.e., it plays the source role in a set of fictitious traceability links with such artefacts). We refer to such fictitious links between artefacts on the same traceability path as indirect traceability links.
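As an illustration of how indirect traceability links can be derived, the sketch below reuses the LinkType/TraceabilityLink model above and computes all fictitious source/target pairs by graph reachability. It is a simplification under our own assumptions: dependence and composition links are treated as directed edges, undirected links as bidirectional ones.

```python
from collections import defaultdict, deque

def indirect_links(links):
    """Return the set of (source, target) pairs connected by a traceability
    path of length >= 2, i.e. the indirect traceability links."""
    succ = defaultdict(set)
    for link in links:
        succ[link.source].add(link.target)
        if link.link_type is LinkType.UNDIRECTED:
            succ[link.target].add(link.source)

    indirect = set()
    for start in list(succ):
        direct = succ[start]
        seen, queue = set(direct), deque(direct)
        while queue:                      # breadth-first reachability
            node = queue.popleft()
            for nxt in succ.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        for node in seen - direct - {start}:
            indirect.add((start, node))   # reachable, but not directly linked
    return indirect
```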


ADAMS enables software engineers to create and manage such types of traceability links between artefacts (see Figure 1). The software engineer responsible for keeping the traceability layer consistent (the quality manager) is allowed to select the source and the target artefact among the already defined artefacts of a project. The link type field is used to identify the nature of the link, as described above.

4.1 Traceability recovery

In the first versions of ADAMS, traceability link identification was delegated to the software engineer, who had the responsibility to manage traceability links whenever new artefacts were added to the project, existing artefacts were removed, or newer versions were checked-in. We noticed that as the project grew, this task tended to become hard to manage. For this reason we have integrated in ADAMS a traceability link recovery tool, called ADAMS Re-Trace [13], [14], which supports the software engineer during traceability link definition. The tool uses Latent Semantic Indexing (LSI) [11] as an information retrieval method to calculate a similarity measure among the artefacts within a project. Given a source artefact ai, the tool compares it against the other artefacts in the artefact space and ranks these artefacts according to their similarity with ai. Moreover, the tool uses a similarity threshold to cut the ranked list and presents to the software engineer only the subset of top artefacts that are deemed similar to ai. It is worth noting that artefacts with a high similarity probably contain similar concepts, so they are likely good candidates to be traced to each other.
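The following sketch shows the general shape of such an LSI-based ranking. It is our reconstruction using scikit-learn, not the actual ADAMS Re-Trace code; the number of latent concepts and the threshold value are illustrative assumptions.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def candidate_links(artefact_texts, source_id, n_concepts=50, threshold=0.3):
    """Rank all artefacts by LSI similarity to `source_id` and cut the
    ranked list at the similarity threshold."""
    ids = list(artefact_texts)
    tfidf = TfidfVectorizer().fit_transform(artefact_texts[i] for i in ids)
    # LSI: project the term-by-document space onto a few latent concepts
    lsi = TruncatedSVD(n_components=min(n_concepts, tfidf.shape[1] - 1)
                       ).fit_transform(tfidf)
    src = ids.index(source_id)
    sims = cosine_similarity(lsi[src:src + 1], lsi)[0]
    ranked = sorted(((ids[j], s) for j, s in enumerate(sims) if j != src),
                    key=lambda pair: pair[1], reverse=True)
    return [(artefact, s) for artefact, s in ranked if s >= threshold]
```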

Figure 2. Traceability link recovery tool


In particular, ADAMS Re-Trace compares the links identified by the traceability recovery tool with the links manually traced by the software engineer and highlights the disagreements between the tool results and the software engineer's choices. To this aim, when the software engineer performs the traceability recovery function, the tool visualises three different sets of links (see Figure 2):

Lost Links: this set contains the links not yet traced by the software engineer but retrieved by the tool. Analysing this set, the software engineer is able to discover new links, thus enriching the set of traced links;

Warning Links: this set contains the links traced by the software engineer but missed by the tool. The warning links have to be investigated by the software engineer for two reasons: if the tool is actually right, the traceability link has to be removed, while if the software engineer is right, the indications of the tool might reveal inconsistencies in the usage of terms within the traced artefacts;

False Positives: this set contains the links classified as false positives by the software engineer but retrieved by the tool.

It is worth noting that the size of the Lost Links set grows when the threshold is low. To reduce the size of this set, the software engineer should also have the possibility to insert a negative link between two artefacts, thus preventing the tool from proposing retrieved links that are known to be false positives.
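The three sets amount to simple set operations over (source, target) pairs; a minimal sketch of the comparison (ours, not the tool's code):

```python
def classify_links(traced, retrieved, false_positives):
    """Partition candidate links into the three sets shown in Figure 2.
    All arguments are sets of (source, target) pairs."""
    lost = retrieved - traced - false_positives  # retrieved, not yet traced
    warning = traced - retrieved                 # traced, missed by the tool
    confirmed = retrieved & false_positives      # rejected earlier, retrieved again
    return lost, warning, confirmed
```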

As we can see in Figure 2, the software engineer can perform a specific action on each link. Considering a link in the Lost Links set, the software engineer can trace the proposed link (Trace), move it into the False Positive set (Move to False Positive) or postpone its analysis (None). Considering a link in the Warning Links set, the software engineer can correct his/her choice by removing the link (Remove Link), send a feedback to the managers of the traced artefact to notify quality problems (Send Feedback), or postpone its analysis (None). Finally, considering a link in the False Positive set, the software engineer can correct his/her choice by tracing the link (Trace), move it back to the Lost Links set (Move to Lost Links), or postpone its analysis (None).

4.2 Traceability graph visualisation

Traceability links in ADAMS are useful for impact analysis and change management during software development. Using such links it is possible to define a traceability graph whose nodes represent artefacts and whose edges represent traceability links. This graph can be visualised by a software engineer and browsed to look at the state of previously developed artefacts, to download the latest artefact versions, or to subscribe to events on specific artefacts in order to receive notifications concerning their development. To show the graph, software engineers select a source artefact from the artefact list and define the dependence types they want to visualise in the graph (see Figure 3). It is worth noting that the traceability graph is built starting from a source artefact and finding all the dependences of a specific type that involve it either as source or as target artefact. The traceability graph visualisation has been realised by integrating in ADAMS Grappa, an open source graph drawing package developed by AT&T Research Lab as a port of a subset of GraphViz to Java. Unfortunately Grappa does not provide a graph layout manager, so it delegates this functionality to an external layout engine; in our implementation, we use the GraphViz graph layout manager server. Grappa has also been customised with artefact type icons associated with the different types of artefact managed in ADAMS, in order to provide enhanced visual information
about the artefacts. Moreover, tool tips are used to show immediate artefact information as soon as the user positions the mouse over an icon. The artefact used as query is shown in red to point out its position within the traceability graph. Finally, each traceability link type has been associated with a specific notation. In particular:

a direct link is represented by a dotted black arrow labelled with a "d";

an undirected link is represented by a dotted red arrow labelled with a "u";

a composition is represented by a solid green arrow labelled with a "c".
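Since the layout is delegated to GraphViz, a traceability graph following this notation can be described in plain DOT; the sketch below (our illustration, reusing the link model introduced earlier) emits such a description:

```python
EDGE_STYLE = {
    LinkType.DEPENDENCE:  'style=dotted, color=black, label="d"',
    LinkType.UNDIRECTED:  'style=dotted, color=red, label="u", dir=none',
    LinkType.COMPOSITION: 'style=solid, color=green, label="c"',
}

def to_dot(links, query_artefact):
    """Emit a Graphviz DOT description of a traceability graph, with the
    query artefact highlighted in red (cf. Figure 4)."""
    lines = ["digraph traceability {"]
    lines.append(f'  "{query_artefact}" [color=red, fontcolor=red];')
    for link in links:
        lines.append(f'  "{link.source}" -> "{link.target}" '
                     f'[{EDGE_STYLE[link.link_type]}];')
    lines.append("}")
    return "\n".join(lines)
```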

Finally, the contextual menu shown on a right-click over an artefact of the graph has been customised. In particular, we added three new menu items to open the artefact card, to immediately send a feedback to the developers allocated to the artefact, and to subscribe to events concerning the specific artefact, respectively (see Figure 4).

Figure 3. Definition of the traceability graph properties

Figure 4. Traceability graph visualisation in ADAMS


5. Event management in ADAMS


One of the most common problems faced by cooperative environments and distributed development teams concerns context awareness. Even when supported by configuration management tools, such teams quickly suffer from the isolation problem that occurs when different people are involved in the development of the same (or correlated) artefacts within different workspaces. In ADAMS, context awareness is mainly supported through event notifications, i.e., messages generated in response to specific events triggered by one of its subsystems. Event notification in ADAMS adopts a publish-subscribe paradigm: as soon as an event concerning an artefact occurs, the developers who subscribed to such an event are notified. One of the main drawbacks of this approach is the proliferation of the messages generated by the system, which could slow down the system and cause the developer to ignore notifications. However, our approach is flexible enough to be tailored to specific user needs while minimizing the number of subscriptions requested of the software engineers and the number of unimportant messages they receive, resulting in a good compromise between an adequate context-awareness level and a bearable quantity of messages. By default, all the artefacts allocated to a developer are considered fully subscribed (all events concerning the artefact are notified), whereas all the other artefacts are considered unsubscribed. However, each developer can override this behaviour by specifying for each artefact the most appropriate notification level. To support the developers during the customisation of the types of event they want to be notified of, the ADAMS Event Management Subsystem subdivides events into a layered classification, in order to cluster events sharing the same level of relevance into the same class. This approach simplifies the subscription phase, because developers are only requested to choose the relevance class for which they need to be notified (see Figure 5). The layered structure of the event classification ensures that they will be notified as soon as an event belonging to the subscribed class, or to one of the upper-level classes, is generated. As an example, developers could choose to be notified as soon as an artefact they are allocated on is checked-out, in order to prevent the isolation problem, while being interested only in meaningful changes applied to the artefacts it directly or indirectly depends on. Such a distinction is possible due to the improved granularity of change management introduced in ADAMS. Notifications in ADAMS are also propagated across the traceability links, so that developers can be aware of changes applied to the artefacts their work depends on. This can help anticipate potential changes, thus reducing expensive later modifications. Event propagation also reduces the complexity of subscribing to several events for notification, avoids worthless redundancies, and prevents developers from forgetting indirect but essential subscriptions. In fact, software engineers are not required to subscribe to the same events for artefacts connected by traceability links; it is sufficient to subscribe only to the events occurring on the dependent artefacts, as the system is able to notify them as soon as those events occur on the directly or indirectly linked artefacts. The event propagation module uses the traceability graph to compute the set of dependent artefacts directly or indirectly impacted by the event.
For each impacted artefact, all the subscribers (at a suitable event level) are notified by the Event Management Subsystem. To select the set of the artefacts involved, the Event Management Subsystem incrementally computes the transitive closure of the traceability graph as soon as a new link is inserted or existing links are modified or removed. Such an approach allows a time-efficient computation of the set of dependent artefacts, relying on a relational database management system to maintain all the necessary information by storing the maximal traceability paths.
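An in-memory sketch of this incremental closure update follows; it is our simplification, whereas the real system keeps the closure in a relational database. Here `reach` is assumed to map each artefact to the set of all artefacts that directly or indirectly depend on it.

```python
def add_link(reach, source, target):
    """Incrementally maintain the transitive closure `reach` when a new
    traceability link source -> target is inserted: everything that could
    already reach `source` now also reaches `target` and its dependants."""
    downstream = {target} | reach.get(target, set())
    reach.setdefault(source, set()).update(downstream)
    for reachable in reach.values():
        if source in reachable:
            reachable.update(downstream)
```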


Figure 5. Event subscription in ADAMS

Figure 6. Activation of indirect event notification


Figure 7. Impact specification

Figure 8. Event Notification

To avoid the proliferation of messages, developers are allowed to specify whether they want to be notified about events concerning artefacts that indirectly impact their work, i.e., unsubscribed artefacts that are sources of indirect traceability links to artefacts they have subscribed to (see Figure 6). Moreover, with respect to the layered structure of the event classification, only events involving meaningful changes (or higher) are propagated across the traceability links.


To improve the quality of the notifications, each draft version of an artefact checked-in to ADAMS is tagged with an impact value characterising the new version with respect to the previously stored version. For this reason, developers are required to specify the impact of the changes implemented as soon as they check-in a new draft version of an artefact (see Figure 7), so that it is possible to distinguish between unimportant changes (e.g., text corrections, code comments, language improvements) and meaningful changes (e.g., requirement modifications, new functionalities, error corrections). This information, together with subscriptions, is used by the Event Management Subsystem to decide whether or not to send a notification for the newer version. Finally, to spare developers the effort of finding the version of the artefact for which a notification was generated (which could cause them to pass over the notification), each notification message has been enriched with references to the relevant artefacts concerning the notification, provided as hypertext links to the appropriate entities (see Figure 8).
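Putting subscriptions, propagation and impact values together, the notification decision at check-in might look like the following sketch. It is ours, not ADAMS code; the numeric impact scale and the data structures are assumptions, with `reach` as in the closure sketch above.

```python
UNIMPORTANT, MEANINGFUL = 1, 2   # illustrative impact scale (cf. Figure 7)

def on_check_in(artefact, impact, reach, subscriptions, inbox):
    """Notify subscribers of a new draft version, using the impact value
    the developer declared at check-in."""
    impacted = {artefact}
    if impact >= MEANINGFUL:            # only meaningful changes propagate
        impacted |= reach.get(artefact, set())
    for a in impacted:
        for user, min_level in subscriptions.get(a, {}).items():
            if impact >= min_level:     # layered event classification
                inbox.setdefault(user, []).append((a, artefact, impact))
```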

6. Preliminary evaluation
ADAMS was experimented with from April 15 to July 20, 2005 in the Software Engineering courses of the Computer Science program at the University of Salerno (Italy). The experimentation included about 150 students involved in seventeen different projects. We are still analysing the data of the experimentation, so we can only provide preliminary results. The average number of artefacts produced within each project was 192 (140 artefacts were produced for the smallest project and 325 artefacts for the largest project); on average, about 4 different versions were produced for each artefact (in some cases, up to 22 different versions of the same artefact were produced). The main software documents produced were the Requirements Analysis Document (RAD), the System Design Document (SDD), the Object Design Document (ODD), the Test Plan, and the test execution documents, besides project and quality management documents [4]. Using the composition links, such documents were decomposed into a large number of sub-artefacts. In particular, for each project the average number of composition links traced by the software engineers was about 100. It is worth noting that the Requirements Analysis Document was the artefact with the highest number of sub-artefacts. This result suggests that a fine-grained artefact decomposition was preferred to a coarse-grained decomposition for those artefacts involving a high number of resources concurrently, such as the RAD. This choice aims at reducing the number of concurrent branches by allowing different team members to develop different sections of the same document. Another important figure concerns the number of dependences and undirected links traced by the software engineers (the average number of these links for each project is about 250). Analysing such data (the average number of traced links and artefacts within each project), we can estimate the average size of a traceability graph. In particular, on average, each project has a traceability graph composed of about 200 nodes (artefacts) and 350 edges (traceability links). At the end of the projects, the involved students expressed their evaluation of ADAMS by filling in a questionnaire. A preliminary analysis of the collected data concerning the traceability recovery tool reveals two different behaviours in the usage of the tool: about half of the traceability managers manually traced a large number of links and mainly used the tool for artefact quality control. In particular, about 40% of these users declared that over 50% of the warning links highlighted real inconsistencies in the usage of domain terms; the remaining traceability managers used the tool to enrich the set of manually traced links. In particular, by analysing the set of lost links they greatly increased the number of traceability links. Moreover, evaluating the usefulness of the traceability link recovery feature, we notice that 96% of the traceability managers stated that this
feature completely satisfied their needs, and 85% of them asserted that the tool's performance was adequate to their needs. Concerning the traceability link visualisation tool, we noticed that only half of the users involved in the experimentation made intensive use of this functionality. This result was affected by performance issues, due to the use of an external layout manager server. We also point out that this feature was introduced only in June, which can be one of the reasons for the lack of attention paid by half of the users to this new functionality. Finally, a preliminary analysis has also been carried out on the notifications received by a selected group of software engineers involved in different projects and having different roles within their teams, in order to evaluate the tool's effectiveness (i.e., how many notifications were generated and how useful they were considered by each receiver) and the software engineers' behaviour (i.e., how much attention they paid to the received messages). Such an analysis revealed that, despite the notification filtering based on the subscription level and the possibility of classifying the relevance of a modification applied to an artefact version, the number of notifications is still too large to be managed by a single software engineer. Such a result was mainly caused by the proliferation of indirect notifications for a traceability graph with a high number of links. Another consideration concerns the software engineers' behaviour. We noticed that in a first stage of the project, the classification of notifications was very good (nearly 100%), and the declared usefulness of received messages was satisfactory (about 50%). Subsequently, the behaviour got worse. We noticed that this worsening occurred close to the delivery dates. In these periods, the number of generated notifications greatly increases, due to the high number of releases produced by the team members. Such a proliferation of messages (in a critical period) is responsible for a reduction of the software engineers' confidence in the tool. In fact, even when the number of notifications became manageable again, their behaviour was still inappropriate. A similar situation was also found for software engineers who were inactive or away for a short time: their notifications accumulated and it became hard to deal with them. Generalising, we noticed that the willingness to classify notifications is changeable. To face these problems we are going to introduce two improvements to the event engine subsystem. A first improvement aims at reducing the generation of unnecessary notifications. In particular, indirect notifications involving artefacts distant from the source of the notified event are quite unlikely to be useful for most software engineers. For this reason we intend to allow users to specify, within a subscription, the maximum distance that a notification can travel across the traceability graph. In this way, we limit the propagation of indirect notifications. The evaluation of the software engineers' behaviour indicated that user confidence in the tool decreases when the number of notifications increases. In order to minimise the number of ignored messages while keeping a good context-awareness level, we aim at adaptively tuning the event notification level, monitoring each software engineer's behaviour whenever he/she receives a new notification.
In particular, software engineers who have repeatedly ignored notifications concerning one or more artefacts will realistically ignore later similar notifications. Under this assumption, we intend to decrease the subscription level accordingly, in order to reduce the number of notifications they receive. Obviously, in such a tuning process we will take into account the actual working time (e.g., the last access date) of the involved software engineer.
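A minimal sketch of how such adaptive tuning could work, under the assumption that a subscription is encoded as the minimum impact level a user wants to see (so reducing notifications means raising that minimum); the threshold of five ignored messages is illustrative:

```python
IGNORE_LIMIT = 5   # consecutive ignored notifications before demotion

def record_reaction(user, artefact, opened, ignore_count, subscriptions):
    """Raise the minimum relevance a user's subscription requires after
    repeatedly ignored notifications; reading a message resets the count."""
    key = (user, artefact)
    ignore_count[key] = 0 if opened else ignore_count.get(key, 0) + 1
    if ignore_count[key] >= IGNORE_LIMIT:
        subscriptions[artefact][user] += 1  # fewer, more relevant messages
        ignore_count[key] = 0
```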

7. Conclusions
Software artefact traceability is fundamental to effectively manage the development and evolution of software systems, to help in program comprehension, maintenance, impact analysis,
and reuse of existing software. In this paper we have presented the traceability management features supported by an artefact management system called ADAMS. The major contribution of this work is to provide an effective solution to the context awareness problem. In ADAMS, context awareness is mainly supported through event notifications and traceability links, which are used to propagate events concerning changes of an artefact to the dependent artefacts. The proposed approach enables a software engineer to create and manage traceability links between artefacts, as well as to recover them while the artefacts evolve and the traceability structure inevitably deteriorates. The number of notifications a software engineer can receive is reduced by enabling him/her to visualise the traceability graph and to select the artefacts and the event types he/she is interested in. However, a preliminary evaluation has revealed that the number of notifications received by the software engineers is still too large to be managed for graphs with a high number of links, due to the proliferation of indirect notifications. A possible solution to this drawback is to allow users to specify, within a subscription, the maximum distance that a notification can travel across the traceability graph. At the end of the experimentation, we collected data on the produced documents, the traceability links traced, and the notifications generated within each project. Moreover, the students involved in the experimentation filled in a questionnaire to evaluate their experience with ADAMS. At present, we are analysing the results in order to address the problems pointed out and improve the tool.

References
[1] L. Aversano, A. De Lucia, M. Gaeta, P. Ritrovato, S. Stefanucci, and M.L. Villani, "Managing Coordination and Cooperation in Distributed Software Processes: the GENESIS Environment", Software Process: Improvement and Practice, vol. 9, no. 4, 2004, pp. 239-263.
[2] S. Bandinelli, E. Di Nitto, and A. Fuggetta, "Supporting Cooperation in the SPADE-1 Environment", IEEE Transactions on Software Engineering, vol. 22, no. 12, 1996, pp. 841-865.
[3] C. Boldyreff, D. Nutter, and S. Rank, "Active Artefact Management for Distributed Software Engineering", Proceedings of the 26th IEEE Annual International Computer Software and Applications Conference, Oxford, UK, IEEE Computer Society Press, 2002, pp. 1081-1086.
[4] B. Bruegge and A. Dutoit, Object-Oriented Software Engineering, 2nd edition, Prentice Hall, 2003.
[5] J.Y.J. Chen and S.C. Chou, "Consistency Management in a Process Environment", The Journal of Systems and Software, vol. 47, 1999, pp. 105-110.
[6] J. Cleland-Huang, C.K. Chang, and M. Christensen, "Event-Based Traceability for Managing Evolutionary Change", IEEE Transactions on Software Engineering, vol. 29, no. 9, 2003, pp. 796-810.
[7] G. Cugola, "Tolerating Deviations in Process Support Systems via Flexible Enactment of Process Models", IEEE Transactions on Software Engineering, vol. 24, no. 11, 1998, pp. 982-1001.
[8] G. Cugola, E. Di Nitto, A. Fuggetta, and C. Ghezzi, "A Framework for Formalizing Inconsistencies in Human-Centered Systems", ACM Transactions on Software Engineering and Methodology, vol. 5, no. 3, 1996, pp. 191-230.
[9] CVS Home Page, http://www.cvshome.org.
[10] J. Dag, B. Regnell, P. Carlshamre, M. Andersson, and J. Karlsson, "A Feasibility Study of Automated Natural Language Requirements Analysis in Market-driven Development", Requirements Engineering, vol. 7, no. 1, 2002, pp. 20-33.
[11] S. Deerwester, S.T. Dumais, G.W. Furnas, T.K. Landauer, and R. Harshman, "Indexing by Latent Semantic Analysis", Journal of the American Society for Information Science, no. 41, 1990, pp. 391-407.
[12] A. De Lucia, F. Fasano, R. Francese, and G. Tortora, "ADAMS: an Artefact-based Process Support System", Proceedings of the 16th International Conference on Software Engineering and Knowledge Engineering, Banff, Alberta, Canada, 2004, pp. 31-36.
[13] A. De Lucia, F. Fasano, R. Oliveto, and G. Tortora, "Enhancing an Artefact Management System with Traceability Recovery Features", Proceedings of the 20th International Conference on Software Maintenance, Chicago, Illinois, USA, 2004, pp. 306-315.
[14] A. De Lucia, F. Fasano, R. Oliveto, and G. Tortora, "ADAMS Re-Trace: a Traceability Recovery Tool", Proceedings of the 9th European Conference on Software Maintenance and Reengineering, The Manchester Conference Centre, Manchester, UK, 2005, pp. 32-41.
[15] D. Georgakopoulos, H. Hornick, and A. Sheth, "An Overview of Workflow Management: from Process Modelling to Workflow Automation Infrastructure", Distributed and Parallel Databases, vol. 3, no. 2, 1995, pp. 119-153.
[16] O. Gotel and A. Finkelstein, "An Analysis of the Requirements Traceability Problem", Proceedings of the 1st International Conference on Requirements Engineering, Colorado Springs, Colorado, USA, 1994, pp. 94-101.
[17] Holagent Corporation, RDD-100, http://www.holagent.com/new/products/modules.html.
[18] J. Conklin and M. Begeman, "gIBIS: A Hypertext Tool for Exploratory Policy Discussion", ACM Transactions on Office Information Systems, vol. 6, no. 4, 1988, pp. 303-331.
[19] J.D. Palmer, "Traceability", in Software Requirements Engineering, Second Edition, R.H. Thayer and M. Dorfman (editors), IEEE Computer Society Press, 2000, pp. 412-422.
[20] Perforce Home Page, http://www.perforce.com.
[21] F.A.C. Pinheiro and J.A. Goguen, "An Object-Oriented Tool for Tracing Requirements", IEEE Software, vol. 13, no. 2, 1996, pp. 52-64.
[22] B. Ramesh and V. Dhar, "Supporting Systems Development Using Knowledge Captured During Requirements Engineering", IEEE Transactions on Software Engineering, vol. 9, no. 2, 1992, pp. 498-510.
[23] Rational RequisitePro web site, http://www.rational.com/products/reqpro/index.jsp.
[24] H. Rittel and W. Kunz, "Issues as Elements of Information Systems", Working Paper No. 131, Institut für Grundlagen der Planung I.A., University of Stuttgart, 1970.
[25] A. Sarma and A. van der Hoek, "Palantír: Coordinating Distributed Workspaces", Proceedings of the 26th Annual IEEE International Computer Software and Applications Conference, Oxford, UK, IEEE Computer Society Press, 2002, pp. 1093-1097.
[26] M. Smith, D. Weiss, P. Wilcox, and R. Dewer, "The Ophelia traceability layer", in Cooperative Methods and Tools for Distributed Software Processes, A. Cimitile, A. De Lucia, and H. Gall (editors), Franco Angeli, 2003, pp. 150-161.
[27] Telelogic product DOORS, http://www.telelogic.com.
[28] Workflow Management Coalition, "Workflow Management Coalition Interface 1: Process Definition Interchange Process Model", Document no. WFMC-TC-1016-P, 1999, available at http://www.aiim.org/wfmc/standards/docs/if19910v11.pdf.


GROUPWARE REQUIREMENTS FOR SUPPORTING SOFTWARE ARCHITECTURE EVALUATION PROCESS


Muhammad Ali Babar and June Verner
Empirical Software Engineering, National ICT Australia Ltd. and University of New South Wales, Australia
{malibaba, june.verner}@nicta.com.au

Abstract
Scenario-based software architecture (SA) evaluation methods require a large number of stakeholders to be collocated for evaluation sessions. Since collocating stakeholders usually causes scheduling difficulties and can be an expensive exercise, especially in the context of the increasing trend towards global software development, such issues may discourage organizations from introducing disciplined architecture evaluation practices. In an attempt to find a cost-effective and efficient alternative to face-to-face meetings, we have developed the concept of a groupware-supported distributed SA evaluation process. In this paper, we analyze the unique characteristics of the proposed process to identify the requirements for a suitable collaborative tool that can effectively and efficiently support it. We also describe the design and execution of an empirically motivated approach to gathering the requirements for such a tool. Based on the findings of our study, we identify and discuss some of the features a groupware system should have to successfully support the architecture evaluation process in the context of distributed software development.

1. Introduction
It is widely recognized that SA evaluation is an effective quality assurance technique that helps identify potential architectural risks and questionable design decisions [11, 16]. Most of the well-known SA analysis methods are scenario-based [6], such as the Architecture Tradeoff Analysis Method (ATAM) [16], Performance Assessment of Software Architecture (PASA) [48] and Architecture-Level Maintainability Analysis (ALMA) [36]. Our initial studies of these methods revealed that most of these approaches rely extensively on the collaborative efforts of multiple stakeholders, who need to be brought face-to-face in order to elicit/refine business goals and analyze the suitability of the proposed architecture. In fact, some approaches suggest that evaluation meetings be held away from the development sites to avoid potential distractions [33]. However, it has been reported that collocating a large number of stakeholders is an expensive and time-consuming exercise, which can create logistical problems such as scheduling difficulties, traveling problems, etc. [40]. Moreover, stakeholders may have to travel if they are geographically separated, which is highly likely as distributed software development is increasingly becoming the norm rather than a fad [31, 41]. Furthermore, current architecture evaluation approaches provide little support to address several issues that characterise face-to-face meetings, such as conformity pressures, dominating personalities, cultural differences and so on [39, 40]. Altogether, such issues may hinder the widespread adoption of disciplined architecture evaluation practices. Thus, the research challenge is to find and assess an effective and efficient way of enabling physically dispersed stakeholders to participate in architecture processes without having to travel, while improving the overall process of SA evaluation. Previous experimental studies found that groupware applications provide an appropriate support mechanism to introduce the shift-work
concept in software development activities, to exploit organizational knowledge (mainly workforce) distributed across different time zones and geographical locations [23]. Groupware applications have been reported as an effective mechanism to minimise meeting costs, maximise asynchronous work and preserve a number of organisational resources [15, 17, 21, 22]. Moreover, it has also been found that the Internet and groupware systems have played a vital role in improving collaborative processes in a number of disciplines by minimizing dysfunctional behaviour and enhancing group productivity [25, 26, 40, 42]. Based on these results, we have developed the concept of a groupware-supported distributed software architecture evaluation process, which is aimed at addressing a number of the above-mentioned logistical issues characterizing the current evaluation approaches [5, 7]. We have demonstrated that gathering scenarios to specify non-functional requirements can be supported by a generic collaborative system without compromising the quality of the scenarios [7]. However, one of the major issues is how we can know which of the many features provided by different groupware systems are needed to successfully support the proposed process. Considering the unique requirements and the nature of the tasks and activities that characterize architecture evaluation, it may not be possible to use any available groupware application without substantial tailoring. Thus, it is imperative to identify the features needed for a distributed SA evaluation process. Similarly to Perry et al. [41], we decided to use experimentation to gather requirements. We designed an empirically motivated research program to assess the feasibility of a distributed SA evaluation process and to identify the basic requirements of a collaborative tool to support it. Moreover, our research program is also aimed at understanding the socio-technical aspects of conducting SA evaluation in distributed environments. The design of this research project is based on a framework of experimentation [10], with guidelines provided in [34]. The remainder of the paper is organized as follows. In the next section, we briefly review the work that has motivated our research program. We then analyze a generic process of SA evaluation to identify groupware support for its activities and tasks. In section 5, we discuss the goals, design, and execution process of our empirical research program. We provide a set of unique features required to support the proposed process in section 6. Conclusions and future work close the paper.

2. Background and Motivation


In this section we discuss SA evaluation, groupware systems in general, and groupware support for software development activities.

2.1 Software Architecture Evaluation

It is widely recognized that the quality attributes (such as maintainability and reliability) of complex software-intensive systems largely depend on the overall SA of such systems [11]. Since the SA plays a vital role in achieving system-wide quality, it is important to evaluate a system's architecture with regard to the desired quality requirements. Abowd et al. [2] proposed two broad categories of SA evaluation: questioning and measurement. The former category includes techniques like scenarios, questionnaires, and checklists. The latter category consists of metrics and simulation. Scenario-based methods (such as ATAM, PASA and ALMA) are considered the most mature and well-known of the questioning group. Most of these methods are structurally similar, but there are a number of differences among their activities and techniques [4]. One of the commonalities among these approaches is meeting-based activities.


The requirement to hold meetings to perform different evaluation activities is partially created by the very nature of the scenario-based approaches. An SA is evaluated against the desired quality attributes, which are often vaguely specified. Quality attribute workshops are aimed at eliciting the quality goals of a system by generating general as well as concrete scenarios [11]. Stakeholders also prioritize quality attributes and generate scenarios according to business goals. Workshops provide an opportunity for stakeholders to become familiar with the architectural approaches being used to achieve the quality goals. The presentation of the architecture may result in immediate questions being raised and addressed during the workshop. Architectural assessment normally requires the expertise and knowledge of different quality attribute experts, such as performance engineers and usability specialists. Furthermore, the effect of a particular quality attribute cannot be analyzed in isolation, as quality attributes have positive or negative influences on each other, which may require tradeoffs between achieving different levels of different quality attributes. All these activities require group discussions and decision-making processes, which necessitate meetings.

2.2 Groupware Systems

Groupware systems are computer-based applications that support communication, collaboration, and coordination among a group of people working towards a common goal; much of the time these people are geographically distributed [20]. A groupware system usually has a very diverse set of tools (such as e-mail, audio and video conferencing, calendars, content management, workflow management, electronic meetings) that complement each other [40]. These systems have emerged over the past decade as mainstream business applications to support a variety of problem solving and planning activities in a wide variety of organizations. A key benefit of groupware systems is increased efficiency compared with face-to-face meetings, by creating positive changes in group interactions and dynamics [22, 40]. Groupware systems have proven effective in reducing the time and resources required to complete a project by minimizing inter-activity intervals and delays [40, 41]. Researchers have shown that teams using groupware systems can reduce their labor costs by up to 50 percent and project cycle times by up to 90 percent [26]. Groupware systems also have the potential to effectively support meeting processes involving large groups [39] and to increase the number and quality of the ideas generated [47]. Groupware systems also provide a set of tools that can efficiently process the large amounts of information consumed or generated during meetings. Other notable attributes of such systems include anonymity, simultaneity, process structuring, process support and task support [39].

2.3 Groupware Support for Software Development Activities

It has been shown that face-to-face meetings for software inspections incur substantial cost and lengthen the development process [41]. Some studies have called into question the value of face-to-face inspection meetings [44]. Studies have also indicated that computer tools, including groupware, may improve inspections [45]. Groupware-supported inspection processes have been successfully evaluated as a promising way to minimise meeting costs, maximise asynchronous work and conserve a number of precious organisational resources [21, 29].
Moreover, it has also been shown that the software inspection process can be improved with group process support [46]. The Requirements Engineering (RE) community has also successfully used groupware applications to enable distributed teams of stakeholders to perform different RE tasks. For example, Liou and Chen integrated joint application development (JAD) and group support systems (GSS) to support requirements acquisition and specification activities [37]. Damian and her colleagues reported successful experiments with using a web-based collaborative tool to support requirements negotiation meetings [17]. Boehm and his colleagues developed a groupware tool to support their EasyWinWin requirements negotiation methodology [15] and integrated a CASE tool to improve the support for requirements engineering tasks [27]. These findings within other meeting-intensive software engineering activities provided a strong motivation to systematically evaluate the advantages and disadvantages of using web-based groupware applications to conduct SA evaluation in support of geographically dispersed software development teams. Moreover, we are also interested in leveraging different groupware tools to improve the effectiveness and efficiency of the individuals participating in the SA evaluation process. There may seem to be some similarities between SA evaluation and other software development activities that have already been successfully supported by groupware applications: for example, planning for SA evaluation and planning for code reviews, scenario development and win-condition elicitation for requirements negotiations, or identifying risky design decisions and defect detection. A natural question, then, is why we cannot use the tools developed for these activities.

Table 1 Summary of research efforts to use groupware for software development activities
Researchers | Research domain | Task types | Tool used | Why can't we use the existing tools?
Liou & Chen [37] | Requirements Engineering (RE) | Requirements specifications | GroupSystems tools | System only available to existing research partners
Boehm et al. [15]; Grunbacher & Braunsberger [28] | Requirements Engineering | Requirements negotiations | EasyWinWin | GroupSystems only available to existing research partners
Damian et al. [17] | Requirements Engineering | Requirements negotiations | NetMeeting | Insufficient workspace & documentation mgmt support
Lanubile et al. [35] | Technical reviews | Requirements & code inspections | IBIS | Specialized for inspection task & no concept of workspaces
Halling et al. [29] | Technical reviews | Requirements inspection | GroupSystems | Systems only available to existing research partners
Perry et al. [41] | Technical reviews | Code inspection | HyperCode | Specially built for code reviews
Macdonald & Miller [38] | Technical reviews | Code inspection | ASSIST | Specially built for inspection; supports any type of documents
Gorton et al. [23] | Software development | Design, coding, testing | Lotus Notes | Prototype built on Lotus Notes for experiments

Having analysed the tools used in the research on groupware support for various software development activities (see Table 1), we found that, despite some apparent similarities, the architecture evaluation process has several unique aspects that cannot be supported with the tools developed or tailored for other software development activities1. For example, the number and the categories of stakeholders (technical and business) simultaneously involved in SA evaluation are much larger and broader than in any other software development activity, and SA evaluation has unique requirements for a knowledge repository and rationale management. Table 1 provides some reasons that may make it difficult or impossible to use the groupware tools developed or tailored for other software development activities to effectively support a distributed SA evaluation process. Moreover, it is considered that different software development activities are inherently different in nature and require different tools, resources, skills and other resources [24]. Hence there is a need to determine the groupware requirements for enabling geographically dispersed stakeholders to participate in the various tasks of the software architecture evaluation process without being collocated.

1 Readers interested in knowing more about why the tools and processes for distributed RE, inspections, coding, etc. cannot be used for distributed SA evaluation are encouraged to consult the literature on the SA evaluation process to understand its unique requirements.

3. Research Approach
Our research is aimed at improving the practice of SA evaluation. Our strategy to achieve this objective is to understand the current approaches and techniques used to support the SA evaluation process and to identify the areas that need to be addressed in order to support their widespread adoption in industry. We have identified that existing methods of architecture evaluation do not scale up to the growing trend of distributed software development [5]. We have proposed the use of Internet-based groupware systems to address the issue of supporting the needs of geographically dispersed stakeholders (technical and business) [5]. Since a match between task and technology is considered vital to the success of a groupware system in supporting a group process [19], we cannot assume that any existing groupware system will be able to provide the required support. Hence, our next research challenge is to identify the requirements for a suitable groupware system for the proposed process. Our approach to determining the requirements for a suitable support environment has been guided by the following tenets:

- Derive the requirements for such a system from an analysis of the ways people work together according to current methods and of the changes proposed in these methods to support distributed teams [43].
- Conduct empirical studies to determine the needs of the people who have worked on certain tasks of the changed process with a support technology [41].
- Gather anecdotal evidence from the literature, discussions with colleagues, and personal experiences of using a technology or process [12].

In the following sections, we describe and discuss how we followed these principles to determine the basic requirements of a groupware application to support the distributed SA evaluation process.

3.1 Analysing the Software Architecture Process to Identify Groupware Requirements

We have mentioned that SA researchers have developed different methods of architecture evaluation and, although there are differences among these methods, we have identified five common activities by comparing four main approaches to evaluating architecture [4]. These five activities make up a generic scenario-based SA evaluation process that can be supported by a groupware application. In the following sections, we analyze the main tasks, inputs, deliverables and participants of each of the activities of this generic process to identify the collaborative tools that should be integrated into a support environment.

Evaluation planning and preparation - This activity is concerned with allocating organizational resources and setting goals for the evaluation, selecting an appropriate evaluation approach, identifying suitable stakeholders, preparing inputs and deciding on the evaluation team. This vital activity provides the roadmap for the process and identifies expected outcomes. Several tasks can be automated using a suitable collaborative tool: for example, identifying and checking the availability of the evaluators and stakeholders based on predefined criteria, suggesting evaluation methods, assigning tasks and notifying the participants. Most of the management-related tasks of organizing and conducting SA evaluation sessions are manual or semi-automated (computer-based data entry), and need to be optimized by providing a suitable web-based project management tool to plan, allocate, and monitor resources [25].


There is also a need for an appropriate workflow engine to tailor and execute a suitable SA evaluation process along with its activities, roles, artifacts and so on. The workflow engine should be able to make dynamic changes to processes in response to unanticipated developments [30].

Present architectural approaches - During this activity, software architects present the architecture of the system under consideration. They also identify the architectural styles or patterns used and explain how the chosen approaches can satisfy the scenarios. According to the existing SA evaluation methods, this activity is performed using an overhead projector, PowerPoint slides, or some similar mechanism. In order to conduct this activity in a distributed environment, different types of collaborative tools are required: for example, modeling tools to present and manipulate architectural views, and synchronous and asynchronous communication channels to discuss the architectural decisions, their rationale, and stakeholders' concerns about them. Moreover, there is a need for a knowledge repository to store and retrieve reusable architectural artefacts such as design decisions, general scenarios, known patterns and others. The availability of such a knowledge repository helps the stakeholders to fully comprehend the architectural approaches being used and then raise their concerns with informed arguments. The knowledge repository should also record the source of the knowledge (human or non-human) and a means of contacting it: for example, the contact details of the architect who proposed a certain design decision, or the URL of the J2EE patterns documentation. There is also a need for integrating groupware systems and Computer Aided Software Engineering (CASE) tools, as in [27], to enable different classes of stakeholders to view, modify and annotate the various views and diagrams of the architectural description during discussion. However, the ability to use the modification features may be limited depending on the role of the participant or the consent of the architect.

Elicit quality sensitive scenarios - The purpose of this activity is to develop scenarios that characterize the desired quality attributes of a system; for instance, the maintainability quality attribute can be specified by software change scenarios. Different techniques, like brainstorming or interviews, are used during this activity. Since there can be many more scenarios than can be analyzed given the available time and resources, the elicited scenarios are prioritized according to their value to business goals. Scenario elicitation and prioritization is the most expensive and time-consuming activity. Our research program particularly aims at improving this activity by developing different techniques to help enhance the quality and quantity of the generated scenarios with minimum resources. That is why we began our empirical investigation of the effectiveness of the distributed SA evaluation process with the scenario development activity. Currently, scenarios are developed in a brainstorming workshop or elicited by interviewing stakeholders, and are initially recorded using flipcharts. Scenarios are prioritized without any systematic voting system; the most prevalent form of prioritization is by raising hands. The distributed SA evaluation process needs a sophisticated Electronic Meeting System (EMS) that can support synchronous and asynchronous communication modes to support the prepared group method of developing scenarios [13].
For example, stakeholders should be able to develop and prioritize individual scenarios using a tool. All the stakeholders can see each other's scenarios, make comments and discuss each scenario using a discussion board or chat room. Then they can use an EMS to integrate their scenarios and brainstorm new ones [39]. Moreover, one of the tasks of an evaluator is to encourage the participants of SA evaluation meetings to diverge from familiar thinking patterns using brainstorming; the evaluator then tries to move the divergent group towards more focused thinking. An EMS equipped with brainstorming and idea-organizer tools should be used to optimize this activity [47]. Finally, instead of prioritizing scenarios by raising hands or openly assigning votes to certain scenarios [16], there is a need for a sophisticated voting tool that keeps the process anonymous; anonymity is important for separating ideas from organizational politics [40].
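To make the anonymous voting requirement concrete, the following minimal sketch tallies scenario-prioritization votes without attaching voter identities to any vote. It is an illustrative Python fragment under our own assumptions (the vote format, the single-ballot rule, and the point-allocation scheme are invented for exposition, not taken from any existing EMS):

from collections import Counter

class AnonymousBallot:
    """Tally prioritization votes without recording who gave points to what."""

    def __init__(self, scenario_ids):
        self.tallies = Counter({sid: 0 for sid in scenario_ids})
        self._voted = set()   # opaque voter tokens, kept only to block double voting

    def cast(self, voter_token, allocations):
        # allocations: dict mapping scenario id -> points (e.g. 100 points split).
        if voter_token in self._voted:
            raise ValueError("each stakeholder may vote only once")
        unknown = set(allocations) - set(self.tallies)
        if unknown:
            raise KeyError("unknown scenarios: %s" % unknown)
        self._voted.add(voter_token)   # the identity is never attached to the points
        for sid, points in allocations.items():
            self.tallies[sid] += points

    def ranking(self):
        return self.tallies.most_common()   # highest-priority scenarios first

ballot = AnonymousBallot(["S1", "S2", "S3"])
ballot.cast("token-041", {"S2": 60, "S1": 40})
ballot.cast("token-117", {"S1": 70, "S3": 30})
print(ballot.ranking())   # [('S1', 110), ('S2', 60), ('S3', 30)]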


Current SA evaluation processes also suffer from social and conflict issues (e.g. egoism, unfair floor control, and deficiencies in a spoken language) [33]. Moreover, a huge amount of information is generated during scenario development workshops. Appropriate tool support is required to handle the socio-technical as well as the information-processing needs of the proposed process. Several studies have shown that groupware systems provide an effective mechanism for capturing and processing large amounts of information [25, 26, 40], which can then be instantly disseminated.

Analyze architectural approaches - This activity is aimed at analyzing architectural approaches using the scenarios developed during the previous stage. The most common approach is impact analysis, which tries to map each scenario onto the architecture and reason about the architecture's suitability for that particular scenario. During this activity, the evaluators walk through a selected set of scenarios along with the architect to assess the suitability of the proposed architectural decisions for the high-priority scenarios. Having performed this activity in physical meetings, our conclusion is that it needs an effective and efficient mechanism for sharing information (with humans or computers), finding and evaluating design alternatives, identifying risks and non-risks, making tradeoffs, and storing and retrieving design rationale. We realize that a single application may not be able to support all these tasks, as most decision support tools tend to focus on a very small group of tasks [32]. However, a suitable groupware system will provide a range of decision support tools and a workspace where the evaluator(s) and the architect(s) can discuss the findings. The evaluation team and the architect also require a design decision repository to facilitate the decision-making process by presenting architecturally significant information in a readily usable format [8].

Interpret and present results - This activity is concerned with summarizing the results of all previous activities, interpreting the deliverables and presenting results to the stakeholders. The evaluation team also prepares a formal report for the evaluation's sponsors. In order to support this activity in a virtual environment, a groupware system should provide online document management and group memory tools. The evaluation team needs these tools to search through the annotated information to discuss and clarify ambiguities and debatable points, or to contact the source of a particular scenario or architectural concern using collaborative tools. A distributed SA evaluation process requires a groupware system to disseminate review reports electronically and to enable the stakeholders to see the review report and presentation online as well. There is also a requirement to generate post-review feedback forms and reminders to provide feedback, and to automate the data collection process. Another important requirement of any groupware system for the proposed process is to gather data for various metrics to measure the performance of the SA evaluation process, the evaluators, and the participants. The system shall be able to collect data on a wide variety of architectural artifacts (such as scenarios, architectural approaches, sensitivity points) along with their available meta-data, which can be processed to develop different metrics.
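As an illustration of the kind of metric collection we have in mind, the sketch below aggregates captured evaluation artifacts into simple performance measures. The record structure and the metric definitions are illustrative assumptions of ours, not the output of an existing tool:

from dataclasses import dataclass

@dataclass
class Scenario:
    author: str
    reused: bool     # drawn from the knowledge repository rather than newly generated
    priority: int    # assigned during prioritization (e.g. via anonymous voting)

def evaluation_metrics(scenarios, decisions):
    """scenarios: list of Scenario; decisions: list of (decision, alternatives) pairs."""
    reused = sum(1 for s in scenarios if s.reused)
    return {
        "scenarios_total": len(scenarios),
        "reuse_ratio": reused / len(scenarios) if scenarios else 0.0,
        "participants": len({s.author for s in scenarios}),
        "decisions_made": len(decisions),
        "avg_alternatives_per_decision":
            sum(len(alts) for _, alts in decisions) / len(decisions) if decisions else 0.0,
    }

# Example: two scenarios (one reused) and one decision with two alternatives.
print(evaluation_metrics(
    [Scenario("alice", True, 1), Scenario("bob", False, 2)],
    [("use a layered style", ["broker", "pipes-and-filters"])],
))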
One of our main goals is to design and validate mechanisms for quantifying the benefits of performing evaluations. We believe that automated tool support to capture and analyze the data used or generated during architecture evaluation is the first step towards that goal.

3.2 Empirical Study

Having developed the concept of a distributed software architecture evaluation process, we decided to evaluate the effectiveness of the proposed process and to determine the requirements of the needed technological support by means of a series of laboratory experiments.


It is suggested that laboratory-based experiments are appropriate for evaluating a concept when the cumulative body of knowledge related to a specific phenomenon is limited [3], as is the case with the groupware-supported distributed SA evaluation process. We designed a series of controlled experiments: one formal pilot study [9] and one large-scale study. This section is based on the requirements elicited using a post-experiment questionnaire; thus, a detailed description of the experimental design, its logistics, and its results is not within the scope of this paper. Rather, we briefly discuss only those aspects of the controlled experiment that we believe can help the reader to understand the procedure of gathering requirements for a groupware application to support architecture evaluation in the context of global software development. For the interested reader, we used an AB/BA cross-over experimental design, which provided all the participants with an opportunity to perform the required task once in a collocated environment and once in a distributed arrangement, using the software requirements of two different systems.

Participants - We expected around 150-160 participants, as there were initially 155 students enrolled in the Total Quality Management course. Our experiments were part of the academic assessment for this course. The participants had a strong technical background, varying degrees of work experience, and familiarity with the concepts of functional and non-functional requirements.

Training - For training purposes, there were two lectures of two hours each, covering the software architecture evaluation process, current methods, developing change scenarios to specify quality attributes, and the procedure of the experiments. Furthermore, the participants were provided with support materials at the beginning of the experiment. The participants also received training on using a collaborative tool, LiveNet, which was used to support the distributed meeting arrangements. This collaborative application was an integral part of the course, as the students were required to use it to participate in discussions on course-related topics.

Collaborative Tool - The participants were required to brainstorm and structure their scenario profiles in a distributed arrangement using a web-based groupware system. We decided to use a generic collaborative application, LiveNet, based on its features and its availability for research purposes. LiveNet provides a generic workflow engine and different features to support geographically distributed teams. LiveNet enables users to create workspaces and define the elements of a particular workspace [14].

Data collection procedure - We collected three sets of data from 156 participants: the individual scenario profiles, the group scenario profiles, and a questionnaire filled in by all the participants at the end of the experiment. The quality of the scenario profiles was used to compare the performance of the groups working in distributed and collocated arrangements, as reported in [7]. We used a Technology Acceptance Model (TAM) [18] based questionnaire to obtain self-reported information with which to assess the suitability of LiveNet to support a distributed SA evaluation process and to identify the required features. One question asked the participants to identify at least three features that they believed a groupware system should provide to support the architecture evaluation process in a distributed arrangement.
Analysis and results of the self-reported data - In order to analyze the self-reported data, we coded the participants' responses to the open-ended question on the required features. We coded the responses in two steps. First, we allocated the identified features to commonly known tools for supporting collaborative processes in distributed arrangements, such as discussion boards, chat rooms, repositories of artifacts, calendaring and so on. Then, we allocated each of the identified tools to the different types of functions that are usually required to support software development activities, such as synchronous and asynchronous communication channels, knowledge management, project planning, content management and others. Two researchers independently performed the coding process, and disagreements were discussed and resolved before finalizing the list of the features and their respective frequencies.
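A minimal sketch of this two-step coding, assuming illustrative response strings and category mappings of our own (the actual coding was performed manually by two researchers):

from collections import Counter

# Step 1: map raw responses to commonly known collaborative tools (illustrative).
RESPONSE_TO_TOOL = {
    "net-phone": "audio/video conferencing",
    "video chat": "audio/video conferencing",
    "discussion board": "discussion board",
    "file upload": "content management",
}

# Step 2: map each tool to the type of function it supports (illustrative).
TOOL_TO_FUNCTION = {
    "audio/video conferencing": "synchronous communication",
    "discussion board": "asynchronous communication",
    "content management": "content management",
}

def code_responses(responses):
    """Return the frequency of each function category across participant responses."""
    tools = [RESPONSE_TO_TOOL[r] for r in responses if r in RESPONSE_TO_TOOL]
    return Counter(TOOL_TO_FUNCTION[t] for t in tools)

print(code_responses(["net-phone", "video chat", "discussion board", "file upload"]))
# Counter({'synchronous communication': 2, 'asynchronous communication': 1,
#          'content management': 1})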


Table 2 shows the identified features and the number of participants who mentioned them.

Table 2 The features required of a groupware application suggested by the participants

Feature requested | Frequency
Synchronous communication (audio/video & advanced chat room features, Net-phone) | 88
Asynchronous communication (discussion board, email, instant messaging, etc.) | 38
Content management (upload/download, instant file transfer, version control) | 30
Awareness (participants' location & time zones, users' status, etc.) | 17
Knowledge management (repository of artefacts and staff skill & experience directory) | 15
Project management tool (planning, tracking, staff selection, & calendar) | 15
History log (tracking the changes made to the artefacts) | 13
Anonymous voting | 7
Modelling tools to view and modify architectural diagrams | 6
Multilingual support (multiple languages supported) | 6

The identified features reveal that the participants in our empirical studies understood the collaborative nature of the architecture evaluation process and focused on those groupware tools that can support communication and coordination among geographically dispersed stakeholders. A large majority of them emphasised the need for rich synchronous communication media (such as video/audio conferencing, Net-phone and others). Although the proposed SA evaluation process involves real-time, different-place activities, except for scenario development, which can be performed asynchronously, a significant number of the participants also identified tools required to support asynchronous communication (such as discussion boards, email and instant messaging). 19% of the participants also mentioned content management related features. All the categories of tools that attracted the top three frequencies (synchronous/asynchronous communication and content management) are considered vital to support decision making and management processes, which are characterised by information retrieval (or generation), sharing, and use [32]. Some of the important features required to support architecture evaluation (such as user and event awareness, knowledge management, project management, and modeling support) were identified by only a small number of the participants. One reason may be their lack of experience in the architecture evaluation process. Another explanation may be that LiveNet provides several synchronous and asynchronous communication channels (real-time chatting, discussion board, notification and others), awareness (user and event), document management (upload/download/sharing) and project planning (e.g. forming groups, task creation & assignment, and calendaring) tools, which may have given the participants the impression that they were supposed to identify only those features that LiveNet does not possess or does not perform well. For example, LiveNet has a real-time chat room; however, 42 of the participants mentioned that the chat room should be highly responsive and scalable, with advanced features like automatic storage, personalization, etc.

3.3 Features Identified from Anecdotal Evidence

Based on anecdotal evidence gathered from our experiences of software architecture design and evaluation, extensive literature reviews, and discussions with colleagues in the commercial and research domains, we have also identified a preliminary set of features that the proposed process of distributed architecture evaluation needs.


Apart from the general collaborative tools identified by analysing the activities of the proposed process and by the participants in the controlled experiments, we believe that a groupware system should have the following features to successfully support architecture evaluation in a distributed arrangement:

Electronic workspaces - Electronic workspaces should provide a virtual space along with all the tools and items required to support architecture evaluation. These workspaces should have a dynamic structure capable of accommodating continuous modifications and evolution in work patterns. One of the important requirements for such workspaces is peer-to-peer communication, a facility not provided by GroupSystems according to [29]. Peer-to-peer communication support enables communication among objects (artifacts, activities, roles) within a workspace as well as between different workspaces; there must also be strong correspondence between different workspaces. We envisage the need for an integrated collaborative environment whose workspaces are equipped with all the required groupware tools.

Support for modeling and documenting software architecture - An architecture description is usually captured in various views using different modeling languages, and each view may present the architectural concerns of a particular class of stakeholders. Moreover, architectural decisions, along with their contextual knowledge, also need to be documented and managed. We need a groupware system that can either provide an integrated modeling tool or easily import and export diagrams of the architecture design being reviewed from several CASE tools. Furthermore, this tool should also support suitable templates based on architecture documentation standards.

Architecture knowledge management repository - Our research efforts are focused on enabling organizations to do more with less in architecture processes. A sophisticated knowledge management repository is vital to this end. Such a repository improves the reusability of architecture artefacts (such as scenarios, patterns, tactics, design alternatives) [1]. The availability of architecture artifacts annotated with design rationale in a readily reusable format can enhance the quality of design decisions and evaluation results [16].

Concurrency control - Since we expect a number of stakeholders to be working concurrently, a good concurrency control mechanism is vital. The system must resolve conflicting requests, be highly responsive and robust, support data replication and turn taking, and not be a hindrance to tightly coupled teamwork [8].

Evaluation measures - A tool must be able to analyze the data consumed or generated during an evaluation and build basic metrics to quantify costs and benefits: for example, the ratio of reused to new objects consumed in different types of evaluations, the number of participants and the quantity and quality of the scenarios generated, the number of design decisions made and the alternatives considered for each decision, and so on. Such metrics can help evaluate the architecture evaluation process itself.
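To make the concurrency control feature above concrete, the following sketch shows one simple policy a workspace could implement: turn taking with a per-artifact lock. This is an illustrative Python fragment of our own, not the mechanism of any existing groupware system (a real workspace would also need replication and conflict resolution):

import threading

class TurnTakingArtifact:
    """A shared artifact that only the current turn-holder may modify."""

    def __init__(self, content=""):
        self.content = content
        self._holder = None
        self._lock = threading.Lock()

    def request_turn(self, user):
        with self._lock:
            if self._holder is None:
                self._holder = user
                return True
            return False          # someone else is editing; the request is refused

    def release_turn(self, user):
        with self._lock:
            if self._holder == user:
                self._holder = None

    def edit(self, user, new_content):
        with self._lock:
            if self._holder != user:
                raise PermissionError("%s does not hold the editing turn" % user)
            self.content = new_content

doc = TurnTakingArtifact("scenario S1: change the database vendor")
assert doc.request_turn("architect")
doc.edit("architect", "scenario S1: change the database vendor within 2 person-days")
doc.release_turn("architect")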

4. Summarizing the Requirements for Distributed Architecture Evaluation


Our study has identified a large number of groupware tools that need to be integrated into a support environment built around electronic workspaces. Such workspaces should support the collaboration, communication, and coordination requirements that characterise architecture evaluation in a physically collocated environment. Such a collaborative support environment is expected to incorporate functionalities that are currently found in different groupware applications; we have briefly mentioned some of them in Section 2.3, and others are offered by several application providers. Table 3 shows the activities of a generic architecture evaluation process, the main tasks of each activity, and the major participants involved. We also list the groupware tools that are expected to support each activity. Table 3 shows that we have grouped several common groupware tools (email, chat room, calendaring, planning, file download/upload, user and event awareness, etc.) into more sophisticated packaged systems like Electronic Meeting Systems (EMSs), project management tools, and content management tools.


We argue that a collaborative support environment that fulfils these requirements can not only support architecture evaluation but can also support other upstream activities, like requirements elicitation and negotiation, in the context of distributed software development. Our assertion that such a collaborative environment can also support other upstream development activities is based on the fact that several of the identified requirements for groupware-supported architecture evaluation can be fulfilled by the groupware systems that have been used for requirements acquisition or negotiation in distributed arrangements.
Table 3 Summary of the activities, tasks, and participants of the proposed process and the groupware tools required

Activity name | Main tasks | Major participants | Main groupware tools required
Planning & preparation | Set goals, identify evaluation team & major stakeholders, prepare inputs, arrange meeting facilities, etc. | Evaluation manager / project manager / evaluation team / architect and whoever else is required | Tools for project mgmt, content mgmt, decision support tools & EMSs
Explain SA approaches | Architect explains SA approaches; probe design decisions for their effects. | Evaluation team and architect | CASE, content mgmt & knowledge mgmt tools and EMSs
Elicit scenarios | Stakeholders develop scenarios and prioritize them according to business goals. | Evaluation team, architect and all major stakeholders | CASE, content mgmt & knowledge mgmt tools and EMSs
Analyze SA approaches | Evaluate design decisions with respect to desired scenarios; identify risks, non-risks, tradeoff points, and sensitivity points; document design rationale | Evaluation team, architect and some of the stakeholders | CASE, content mgmt, knowledge mgmt & decision support tools and EMSs
Interpret and present results | Interpret artifacts, prepare reports and present results. | Evaluation team and sponsors of the evaluation exercise | Content & knowledge mgmt tools and EMSs

5. Conclusion and Future Work


Like any other activity of the software development process, software architecture evaluation faces new challenges posed by the increasing popularity of distributed software development (or global software development). Available architecture evaluation approaches do not scale up to the needs of geographically dispersed stakeholders. Hence, new processes and appropriate support mechanisms are necessary. We have proposed a groupware-supported distributed architecture evaluation process aimed at supporting geographically dispersed stakeholders. Our proposed process is expected to address a number of logistical issues that characterize current architecture evaluation approaches by taking advantage of groupware technologies. However, we need to know which of the many features provided by different groupware systems are required for a suitable collaborative support environment. We have combined activity analysis, experimentation and anecdotal evidence to determine the requirements for a suitable groupware system that would enable geographically distributed stakeholders to perform architecture evaluation in the context of global software development. We have identified several requirements for a groupware support environment for the proposed process. We assert that a collaborative tool capable of supporting a distributed architecture evaluation process can also effectively support other upstream development activities like requirements acquisition, elicitation, specification, and negotiation. We envisage that most of the required features can be found in various groupware tools, except for the knowledge management support, which cannot be fulfilled by any generic tool. The research challenge is to identify currently available groupware tools and tightly integrate them into a collaborative support environment. Moreover, there is an imperative need to design and assess an appropriate data model that characterises the different constructs of architecture knowledge used or generated during the architecture evaluation process.


Such a data model could be implemented to provide the architecture knowledge management component of the support environment. We plan to undertake both of these tasks in our future work in this line of research. We hope that our work may also provide research directions to groupware systems developers and researchers interested in providing collaborative support to the different activities of distributed software development.

Acknowledgement

National ICT Australia is funded through the Australian Government's Backing Australia's Ability initiative, in part through the Australian Research Council.

6. References
[1] IEEE Recommended Practice for Architectural Description of Software-Intensive Systems: IEEE Standard No. 1471-2000.
[2] Abowd, G., et al., Recommended Best Industrial Practice for Software Architecture Evaluation, Tech Report CMU/SEI-96-TR-025, Software Engineering Institute, Carnegie Mellon University, 1997.
[3] Agarwal, R., P. De, and A.P. Sinha, Comprehending Object and Process Models: An Empirical Study. IEEE Transactions on Software Engineering, 1999. 25(4): p. 541-556.
[4] Ali-Babar, M. and I. Gorton. Comparison of Scenario-Based Software Architecture Evaluation Methods. 1st Asia-Pacific Workshop on Software Architecture and Component Technologies. 2004. Busan, South Korea.
[5] Ali-Babar, M. and I. Gorton. Supporting Architecture Evaluation Process with Collaborative Applications. 8th Int'l. Multitopic Conference. 2004: IEEE Computer Press.
[6] Ali-Babar, M., L. Zhu, and R. Jeffery. A Framework for Classifying and Comparing Software Architecture Evaluation Methods. 15th Australian Software Engineering Conference. 2004. Melbourne, Australia.
[7] Ali-Babar, M., et al. An Exploratory Study of Groupware Support for Distributed Software Architecture Evaluation Process. 11th Asia Pacific Software Eng. Conf. 2004. Busan, South Korea.
[8] Ali-Babar, M., et al. Mining Patterns for Improving Architecting Activities - A Research Program and Preliminary Assessment. 9th Int'l. Conf. on Empirical Assessment in Software Engineering. 2005. Keele, UK.
[9] Ali-Babar, M., et al., An Empirical Study of Groupware Support for Distributed Software Architecture Evaluation Process. Invited paper in Journal of Systems and Software, 2005.
[10] Basili, V.R., R.W. Selby, and D.H. Hutchens, Experimentation in Software Engineering. IEEE Transactions on Software Engineering, July, 1986. 12(7): p. 733-743.
[11] Bass, L., P. Clements, and R. Kazman, Software Architecture in Practice. 2nd ed. 2003: Addison-Wesley.
[12] Bass, L. and B.E. John, Linking usability to software architecture patterns through general scenarios. Journal of Systems and Software, 2003. 66(3): p. 187-197.
[13] Bengtsson, P. and J. Bosch, An Experiment on Creating Scenario Profiles for Software Change. Annals of Software Engineering, 2000. 9: p. 59-78.
[14] Biuk-Aghai, R.P. and I.T. Hawryszkiewycz. Analysis of Virtual Workspaces. Proceedings of the Database Applications in Non-Traditional Environments. 1999. Japan.
[15] Boehm, B., P. Grünbacher, and R.O. Briggs, Developing Groupware for Requirements Negotiation: Lessons Learned. IEEE Software, 2001. 18(3): p. 46-55.
[16] Clements, P., R. Kazman, and M. Klein, Evaluating Software Architectures: Methods and Case Studies. 2002: Addison-Wesley.
[17] Damian, D.E., et al., Using Different Communication Media in Requirements Negotiation. IEEE Software, 2000. 17(3): p. 28-36.
[18] Davis, F.D., Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 1989. 13(3): p. 318-340.
[19] Dennis, A.R., et al., Information technology to support electronic meetings. MIS Quarterly, 1988. 12(4): p. 591-624.
[20] Ellis, C.A., S.J. Gibbs, and G.L. Rein, Groupware: Some Issues and Experiences. Communications of the ACM, 1991. 34(1).
[21] Genuchten, M.V., W. Cornelissen, and C.V. Dijk, Supporting Inspection with an Electronic Meeting System. Journal of Management Information Systems, 1997-98. 14(3): p. 165-178.


[22] Genuchten, M.V., et al., Using group support systems for software inspections. IEEE Software, 2001. 18(3): p. 60-65.
[23] Gorton, I., I. Hawryszkiewycz, and L. Fung. Enabling Software Shift Work with Groupware: A Case Study. Proc. of the 29th Hawaii Int'l. Conf. on System Sciences. 1996.
[24] Gorton, I. and S. Motwani, Issues in co-operative software engineering using globally distributed teams. Journal of Information and Software Technology, 1996. 38(10): p. 647-655.
[25] Griffith, T., J.E. Sawyer, and M.A. Neale, Virtualness and Knowledge in Teams: Managing the Love Triangle of Organizations, Individuals, and Information Technology. MIS Quarterly, 2003. 27(2): p. 265-287.
[26] Grohowski, R., et al., Implementing Electronic Meeting Systems at IBM: Lessons Learned and Success Factors. MIS Quarterly, 1990. 14(4): p. 369-383.
[27] Grünbacher, P. Integrating groupware and CASE capabilities for improving stakeholder involvement in requirements engineering. Proc. of the 26th Euromicro Conference. 2000.
[28] Grünbacher, P. and P. Braunsberger, Tool Support for Distributed Requirements Negotiation, in Cooperative Methods and Tools for Distributed Software Processes, A. Cimitile, A.D. Lucia, and H. Gall, Editors. 2003: Milan, Italy.
[29] Halling, M., P. Grünbacher, and S. Biffl. Tailoring a COTS group support system for software requirements inspection. Proc. of the 16th International Conference on Automated Software Engineering. 2001.
[30] Hawryszkiewycz, I., Designing the Networked Enterprise. 1997: Artech House, Boston, USA.
[31] Herbsleb, J.D. and D. Moitra, Global software development. IEEE Software, 2001. 18(2): p. 16-20.
[32] Huber, G.P., Issues in the Design of Group Decision Support Systems. MIS Quarterly, 1984. 8(3): p. 195-204.
[33] Kazman, R. and L. Bass, Making Architecture Reviews Work in the Real World. IEEE Software, 2002. 19(1).
[34] Kitchenham, B.A., et al., Preliminary guidelines for empirical research in software engineering. IEEE Transactions on Software Engineering, 2002. 28(8).
[35] Lanubile, F., T. Mallardo, and F. Calefato, Tool Support for Geographically Dispersed Inspection Teams. Software Process Improvement and Practice, 2003. 8(4): p. 217-231.
[36] Lassing, N., et al., Experience with ALMA: Architecture-Level Modifiability Analysis. Journal of Systems and Software, 2002. 61(1): p. 47-57.
[37] Liou, Y.I. and M. Chen, Using Group Support Systems and Joint Application Development for Requirements Specification. Journal of Management Information Systems, 1993-94. 10(3): p. 25-41.
[38] Macdonald, F. and J. Miller, A Comparison of Computer Support Systems for Software Inspection. Journal of Automated Software Engineering, 1999. 6: p. 291-313.
[39] Nunamaker, J.F., et al., Electronic Meeting Systems to Support Group Work. Communications of the ACM, 1991. 34(7).
[40] Nunamaker, J.F., et al., Lessons from a Dozen Years of Group Support Systems Research: A Discussion of Lab and Field Findings. Journal of Management Information Systems, Winter, 1996-97. 13(3): p. 163-207.
[41] Perry, D.E., et al., Reducing inspection interval in large-scale software development. IEEE Transactions on Software Engineering, 2002. 28(7): p. 695-705.
[42] Piccoli, G., R. Ahmad, and B. Ives, Web-Based Virtual Learning Environments: A Research Framework and A Preliminary Assessment of Effectiveness in Basic IT Skills Training. MIS Quarterly, 2001. 25(4): p. 401-426.
[43] Poltrock, S.E. and G. Engelbeck, Requirements for a Virtual Collocation Environment. Journal of Information and Software Technology, 1999. 41(6): p. 331-339.
[44] Porter, A.A. and P.M. Johnson, Assessing Software Review Meetings: Results of a Comparative Analysis of Two Experimental Studies. IEEE Transactions on Software Engineering, 1997. 23(3): p. 129-145.
[45] Sauer, C., et al., The Effectiveness of Software Development Technical Reviews: A Behaviorally Motivated Program of Research. IEEE Transactions on Software Engineering, Jan., 2000. 26(1).
[46] Tyran, C.K. and J.F. George, Improving Software Inspections with Group Process Support. Communications of the ACM, Sep., 2002. p. 87-92.
[47] Valacich, J.S. and A.R. Dennis, Idea Generation in Computer-Based Groups: A New Ending to an Old Story. Organizational Behavior and Human Decision Processes, 1994. 57(3): p. 448-467.
[48] Williams, L.G. and C.U. Smith. PASA: A Method for the Performance Assessment of Software Architecture. Proc. of the 3rd Workshop on Software Performance. 2002. Rome, Italy.


Investigating IBIS in a Distributed Educational Environment: the Design of a Case Study

Daniela Damian, Filippo Lanubile, Teresa Mallardo 1)

Abstract
In this paper we present a first experience in conducting distributed software inspections in support of the remote communication between clients and developers collaboratively developing a requirements specification. We intend to assess whether remote synchronous requirements negotiations can benefit from asynchronous discussions that reduce the number of open issues to be resolved, and thus shorten the negotiation agenda. The case study took place in the context of a global software development course and involved three universities from three different continents.

1. Introduction
Meeting-centered activities are critical and difficult to conduct in a geographically distributed environment. We are investigating approaches that enable effective requirements negotiations [Boe98] and software inspections [Lai00] in distributed software development, as these activities should support collaborative software engineering in remote teams as well as they do in traditional software teams [Gr04, Hal03]. In this position paper we report the design of our research, which investigates the usefulness of asynchronous discussions, as part of the requirements inspection process, in facilitating more effective synchronous requirements meetings in distributed teams [Dam03]. In particular, we describe how we are planning to study the use of a web-based inspection tool, IBIS [Lan03], in support of the remote communication between clients and developers collaboratively developing a requirements specification. IBIS supports remote teams during the inspection of requirements documents, and in particular supports teams through the stages of issue Discovery, Collection, and Discrimination. During the Discovery stage, inspectors individually review the document with the help of checklists or scenarios, and record issues. In the Collection stage, the inspection leader or the document's author collates the recorded issues and eliminates duplicates. In the Discrimination stage, the inspection team makes decisions about the collated issues. The Discrimination stage is designed as a structured asynchronous discussion with two mechanisms: posting messages about each issue under discussion, and voting on whether an issue is a true issue or a false positive.
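To illustrate the Discrimination-stage mechanisms just described, the sketch below models an issue with its discussion messages and its true-issue/false-positive votes. This is a minimal Python rendering of our own, for exposition only; it is not IBIS's actual data model or implementation:

from dataclasses import dataclass, field

@dataclass
class Issue:
    description: str
    messages: list = field(default_factory=list)   # (author, text) pairs
    votes: dict = field(default_factory=dict)      # author -> True (true issue) / False

    def post(self, author, text):
        self.messages.append((author, text))

    def vote(self, author, is_true_issue):
        self.votes[author] = is_true_issue         # one vote per discussant

    def is_false_positive(self):
        # Treated as a false positive when no strict majority votes "true issue".
        if not self.votes:
            return False
        return sum(self.votes.values()) <= len(self.votes) / 2

issue = Issue("RS 1.0, section 3.2: 'fast response' is not quantified")
issue.post("inspector-1", "What response time do the clients actually expect?")
issue.vote("inspector-1", True)
issue.vote("author", True)
print(issue.is_false_positive())   # False: the majority consider it a true issue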
1 Daniela Damian, Department of Computer Science, University of Victoria, BC, Canada, email: danielad@cs.uvic.ca; Filippo Lanubile, Dipartimento di Informatica, University of Bari, Italy, email: lanubile@di.uniba.it; Teresa Mallardo, Dipartimento di Informatica, University of Bari, Italy, email: mallardo@di.uniba.it


2. Case study setting


To investigate the usefulness of asynchronous discussions in facilitating effective synchronous requirements negotiation meetings, we studied IBIS in six educational global project teams in a global software development (GSD) course. Each software project followed an iterative development process in which designers, in collaboration with clients, were to develop a requirements specification (RS): after a requirements elicitation stage, a requirements inspection of an early draft of the RS involved the discovery as well as the asynchronous discussion of requirements issues (in IBIS), and was further followed by requirements negotiations and prototype demonstrations before the final draft of the RS was delivered. The Global Software Development course was offered in a three-university collaboration involving the University of Victoria, Canada, the University of Technology, Sydney, Australia, and the University of Bari, Italy, between January and May of 2005 2. The course involved a total of 32 students: 12 of them were Masters and Doctorate students at the University of Victoria, 2 were graduate and 8 undergraduate students at the University of Technology, Sydney, and 10 were Masters students at the University of Bari. As shown in Table 1, the Canadian students worked on software projects with the Australian and Italian groups as follows: the 12 Canadian students formed three groups of 4 (Gr1-3), the Australian students formed two groups of 5 (Gr4-5), and the Italian students formed two groups, of 3 and 7 students respectively (Gr6cl and Gr6dev). Each Canadian and Australian group was involved in two different projects, playing the role of client (C) in one and developer (D) in the other. Each of the two Italian groups was involved in only one project, either as a client (Gr6cl) or as a developer (Gr6dev).

Table 1. Groups and allocation to course projects (C = client, D = developer)

Country | Group | Project A: PT1 (A1) | PT2 (A2) | Project B: PT3 (B1) | PT4 (B2) | Project C: PT5 (C1) | PT6 (C2)
Ca | Gr1 | C | | | | | D
Ca | Gr2 | | D | C | | |
Ca | Gr3 | | | | D | C |
Au | Gr4 | D | | | C | |
Au | Gr5 | | C | | | D |
It | Gr6cl | | | | | | C
It | Gr6dev | | | D | | |


2 More information can be found on the course website: http://segal.cs.uvic.ca/csc576b


2.1. The software projects

There were three distinct projects in the course (A, B and C).

Project A (A1 and A2 in Table 1): Global software development system. The students were to design a system to facilitate collaboration in GSD by supporting informal communication as well as document exchange in remote teams. Tasks supported by the tool included: displaying people's availability information, viewing changes between different versions of documents and the authors of those changes, visualizing the evolution history of a particular document, and discovering who has been working on a particular document or section of a document.

Project B (B1 and B2): iMedia system. The students were to design the interface for an iMedia application that would allow users to purchase movies online, organize their movie library, and play movies. One of the key requirements was that the interface be simple to use even for inexperienced computer users, without sacrificing key features.

Project C (C1 and C2): Virtual Realty system. The students were to design a system that provides accurate and easy-to-find information to real estate agents and home buyers in the Victoria area. The system had to display an interactive map on which the end-user can zoom in, zoom out, pan, etc., and click to get information about a property.

Two global software teams were allocated to each project, each with the client and developer groups in two different countries (see Table 1). The projects were assigned to groups before group membership was determined. The project assignment was done so that each group worked with a different partner group for each of the two projects it was assigned (with the partner group always located in a different country), and so that the two projects it worked on were on different topics.

2.2. The RS development process

Each project followed an iterative process of developing a requirements specification (RS) through collaboration between developers and clients over a period of 7 weeks. The RS development lifecycle consisted of ten phases of requirements discovery and validation, through which the understanding and documentation of requirements was to be improved:

1. Kickoff meeting
2. Create RFP
3. Analyze RFP
4. Requirements elicitation
5. Create RS 1.0
6. Discover issues on RS 1.0
7. Asynchronous discussions
8. Requirements negotiation
9. Create prototype demo
10. Create RS 2.0

The purpose of the asynchronous discussion was to come to an understanding of each issue and to determine which issues could be closed online (i.e. where resolution could be reached without further negotiation) and which remained open issues (anything else, which had to be further negotiated in real-time discussion).


Discussants attempted to close issues by using the two mechanisms in IBIS: posting messages with respect to a certain issue, and voting on whether it was still an open issue or was resolved and thus could be closed. Those issues that could not be resolved during the asynchronous discussion in IBIS (i.e. open issues) were then discussed during a scheduled requirements negotiation, held in a videoconference meeting between developers and clients. Each of these phases included either client, developer or group tasks, and ended with a project deliverable on which students were graded for the class. The final deliverable was the final version of the RS, which reflected the shared understanding of the project that the clients and the developers had built over the previous four phases. The project finished at the point where the developer group would start writing the code for the system called for by the project.

3. Research design: exploring the usefulness of asynchronous discussions to facilitate effective synchronous requirements negotiations
The usefulness of asynchronous discussions prior to requirements negotiations lies in focusing the synchronous meeting, in which requirements are negotiated, on the issues that could not be resolved earlier, during the asynchronous discussion. As a first assessment, we instructed half the projects to conduct the asynchronous discussion before the negotiation, and half the projects to move directly into the negotiation without an asynchronous discussion. Table 2 indicates which projects conducted the asynchronous discussion (AD) and which did not (No AD). The process variant (AD or No AD) was thus the main independent variable that we manipulated for experimental purposes.

Table 2. Experimental design

Project team (client/developer) | Process variant
A1 (gr1/gr4) | No AD
B1 (gr2/gr6dev) | No AD
C1 (gr3/gr5) | No AD
A2 (gr5/gr2) | AD
B2 (gr4/gr3) | AD
C2 (gr6cl/gr1) | AD

When asynchronous discussions were scheduled for a project team, both clients and developers used the IBIS tool over a week, as a threaded discussion forum. The aim was to come to an understanding of each issue by exchanging messages, and to an early resolution through a common agreement expressed by voting. Those open issues that could not be closed during the asynchronous discussion in IBIS were then left for the synchronous negotiation meeting. For those project teams which skipped the asynchronous discussion, all collated issues were considered open issues to be dealt with at the negotiation.


To measure the usefulness of asynchronous discussions, we are going to collect data about: collated issues, issues closed during the asynchronous discussion, open issues before the synchronous negotiation, issues closed during the synchronous negotiation, and open issues after the synchronous negotiation. In addition, to better understand what actually happened during the asynchronous discussions, we are also collecting the following information from the asynchronous discussions in IBIS: discussants, posted messages, messages per issue and per participant, and votes per issue and per participant. To complement the quantitative data, we gathered the students' perceptions of the usefulness of the asynchronous discussion through two questionnaires, administered at the middle and end points of the course. The two surveys were designed to elicit feedback about the processes as well as the tools used in the distributed project.
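As a sketch of how these counts could be derived from the recorded discussions, the fragment below computes the planned measures from per-issue status records. The record format is an assumption of ours for illustration:

# Each record notes when (if ever) the issue was closed: during the asynchronous
# discussion ("async"), during the negotiation ("sync"), or never (None).
def discussion_measures(issues):
    closed_async = sum(1 for i in issues if i["closed_in"] == "async")
    closed_sync = sum(1 for i in issues if i["closed_in"] == "sync")
    open_before = len(issues) - closed_async
    return {
        "collated_issues": len(issues),
        "closed_during_async_discussion": closed_async,
        "open_before_negotiation": open_before,
        "closed_during_negotiation": closed_sync,
        "open_after_negotiation": open_before - closed_sync,
    }

issues = [
    {"id": 1, "closed_in": "async"},
    {"id": 2, "closed_in": "sync"},
    {"id": 3, "closed_in": None},
]
print(discussion_measures(issues))
# {'collated_issues': 3, 'closed_during_async_discussion': 1,
#  'open_before_negotiation': 2, 'closed_during_negotiation': 1,
#  'open_after_negotiation': 1}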

4. Future work
We are planning to analyze our data. First, we want to measure the dependent variables to assess our research goal. Unfortunately, because the sample size is so small (only three projects for each process variant), we think it would make no sense to run any statistical tests of effectiveness. We will then reinforce our quantitative results with the questionnaire analysis and, in particular, with the comments on the open questions, from which we can try to explain unexpected results. We seek feedback from the workshop with regard to the research directions presented here, and we hope to present some findings from analyzing this data at the workshop.

Acknowledgements
We gratefully acknowledge the technical support of Luis Izquierdo and Fabio Calefato throughout the three-university course. Thanks also to Dr. Ban Al-Ani for her instrumental role in this collaboration and to all the students who participated in the remote projects.

References
[Boe98] B.W. Boehm, A. Egyed, J. Kwan, D. Port, A. Shah, R. Madachy. Using the WinWin Spiral Model: A Case Study. Computer, 31(7): 33-44, July 1998.
[Dam03] D. Damian and D. Zowghi. Requirements Engineering challenges in multi-site software development organizations. Requirements Engineering Journal, 8: 149-160, 2003.
[Gr04] P. Grünbacher, M. Halling, S. Biffl, H. Kitapci, and B.W. Boehm. Integrating Collaborative Processes and Quality Assurance Techniques: Experiences from Requirements Negotiation. Journal of Management Information Systems, 20(4): 9-29, 2004.
[Hal03] M. Halling, S. Biffl, and P. Grünbacher. An economic approach for improving requirements negotiation models with inspection. Requirements Engineering Journal, 8: 236-247, 2003.
[Her03] J.D. Herbsleb, A. Mockus. An Empirical Study of Speed and Communication in Globally-Distributed Software Development. IEEE Transactions on Software Engineering, 29(3): 1-14, 2003.


[Lai00] O. Laitenberger and J.M. DeBaud. An encompassing life cycle centric survey of software inspection. The Journal of Systems and Software, 50(1): 5-31, January 2000.
[Lan03] F. Lanubile, T. Mallardo, and F. Calefato. Tool Support for Geographically Dispersed Inspection Teams. Software Process: Improvement and Practice, 8(4): 217-231, October/December 2003.


Towards understanding requirements management in a special case of global software development: A study of asynchronous communication in the Open Source Community

Luis Izquierdo, Daniela Damian and Daniel M. German 1
Abstract. Distributed software development faces many challenges, and one of the most challenging issues is communication. Open Source projects are by nature multi-distributed, and their success seems to rely on the effectiveness of communication in the community through mailing lists, discussion forums, and IRC channels. This paper presents a first step in our research on the communication patterns in the mailing lists of active open source projects. We are interested in identifying patterns that may be useful in understanding RE practice in OSS projects, and in drawing implications from distributed projects that are successful despite not following the traditional formal practices of Requirements Engineering.

1. Introduction

Open Source Software (OSS) projects, a type of GSD project quite different from commercial projects in large corporations, exhibit a new and interesting style of development. Not only are the motivations different between OSS and commercial projects; the development of projects from inception to deployment is also quite different, considering that the formal practice of requirements engineering (RE) in OSS projects is almost non-existent. The stories of some successful OSS projects [1], despite the absence of traditional Software Engineering processes, are inspiring for the GSD community. One question that arises is: how can many OSS projects succeed without using formal RE practice? Or, is RE practice, as it is applied now, necessary for the success of any software project? What can be learnt from OSS projects that generalizes to other GSD projects, not necessarily open source? In our research we are interested in analyzing OSS rather than traditional commercial projects, and in how formal RE practice and formal specifications are replaced by more informal practices. These practices usually take advantage of the experience of the community members, who use the project's infrastructure (mailing lists, discussion forums, wikis, website, etc.) to informally gather requirements, reach a shared understanding of the desired functionality, and create a common knowledge repository accessible to all members of the community. One of the most interesting aspects of OSS development is the communication behaviour and the frequent reliance on asynchronous channels of communication to facilitate the work of volunteer developers in arbitrary locations, who rarely meet face to face and typically coordinate their activity by means of email and bulletin boards [2]. Time is one of the most important factors in business, but it does not seem to be a critical factor in OSS projects. Private companies have master development plans, which include milestones and deliverables; consequently, they become time-dependent. The speed of communication is critical for the success of a project, and companies must establish synchronous and asynchronous communication channels in order to ensure that information is gathered or propagated on time and that it reaches the right person.
1 Department of Computer Science, University of Victoria, Canada. Email: {luis, danielad, dmg}@cs.uvic.ca


While recent research has investigated the development of tools to support synchronous communication for ad-hoc distributed workgroups [3], more evidence needs to be gathered about the patterns of communication in different projects in the Open Source communities. Given the lack of information in this field, it is important that research helps us understand more about asynchronous communication in OSS projects, so as to possibly discern patterns of group and communication behaviour that further our understanding of how requirements for a project are being managed in the absence of formal RE practices. In this paper we present our research directions in tackling these questions, report on work in progress, and outline planned future directions. We describe a study of asynchronous communication patterns in open source projects and draw implications for future research in the area of RE in GSD. We recognize that this is very preliminary work in trying to learn from patterns of asynchronous communication in open source projects and, without attempting to overstate our findings, we discuss the many future work directions that are worth pursuing. We are seeking feedback from the workshop participants on the methodology followed and the validity of our preliminary results. The paper is structured as follows: Section 2 presents our motivation and related research in the area of requirements engineering in open source software projects. We then describe our research design in Section 3, outlining a number of research questions that represent a first step in research with a longer-term goal. The findings discussed in Sections 4 and 5 represent preliminary insights obtained from analyzing patterns of asynchronous communication in OSS projects, and motivate directions for future investigations of patterns that may be useful in understanding RE practice in OSS projects; these are described in Section 6.

2. Motivation and related work: Requirements Engineering in OSS

Traditional software requirements engineering processes contain a set of activities that have to be followed to produce a reliable, high-quality system [9],[10],[11],[12]. These activities include requirements elicitation, requirements specification, requirements analysis, requirements validation, and requirements communication. However, open source development communities do not seem to readily adopt or practice such RE processes and formal specifications. Despite this, these communities produce high-quality software that is used by people within the community and outside of it. So what processes or practices are being used to develop the requirements for OSS systems? Since no formal specification is defined in OSS projects, nor are requirements recorded in a software requirements specification, how are the software implementations validated against the requirements? Previous research sheds some light on how requirements are elicited and managed in OSS. A close examination of four OSS communities [8] found that requirements for open software can appear or be implied within an email message or within a discussion thread that is captured and/or posted on a project's website board for open review, elaboration, refutation, or refinement. Regardless of whether these requirements have any link to other documents, sources, or standards, they are requirements because some developers wanted these capabilities, and some of them were willing to implement them. Scacchi [8] identifies that requirements emerge from the experience of the community members through their email and board discussion forums, and that they are asserted rather than elicited. Messages on mailing list threads give rise to the development of narrative descriptions, which become functional and non-functional requirements. Further, when the capabilities are implemented, the concerned developers justify their requirements by providing the code that implements them. Senior members or code developers in the community vote to


accept or reject the capability [14]; this decision is recorded in the mailing list or discussion board archive, documenting who required what, when, why, and how. The importance of the mailing list as a knowledge repository is thus an important factor to consider. Mailing lists are an important part of what Cubranic calls the group memory of a project [15]. However, there is no evidence of further formal documentation of the process. The single isolated effort we could find was the AccessGrid community [13], where some developers elaborated a document containing a formal specification of the requirements for the AccessGrid. Further, asynchronous communication tools such as email are a very interesting group of tools for distributed communication and coordination. On the one hand, as seen above, the lack of formal Software Engineering practice generates open discussion of the requirements on the mailing list. Despite some efforts to develop synchronous tools to support communication during development activities in the OSS community [3], until these tools prove useful for distributed projects, the community will continue to rely on asynchronous tools to learn about what is required and to report the implementation of a certain capability. On the other hand, email is well known for its lack of support for ambiguous and conflictual situations. Mailing lists are weak in managing ambiguous information; emails get lost or forgotten, and issues remain unresolved [7]. They also provide no indication of when an electronic answer will come back, i.e. one is forced to rely on individual work styles, and this may introduce delay in resolving issues [7]. In the experience of the authors, the contribution of a member or a newcomer is slowed down by the asynchronous nature of email. The contributor does not know whether the contribution will be accepted, nor when. The lack of immediate feedback might frustrate the contributor, who might decide to leave the project. Hence the study of patterns of asynchronous communication becomes very important. Once the requirements are clarified, a small number of developers are usually responsible for most of their implementation [1]. There is no information, however, on how these developers communicate and coordinate their work in implementing these requirements. We know very little about how threads of discussion occur on requirements, clarifications about implementation decisions, testing, or other aspects of how feature development is being managed in OSS. In such an unexplored terrain, exploratory research may prove very beneficial. It is in this direction that our research is headed, and we present here some directions we explored and some preliminary findings.

3. Research Design: an empirical study of asynchronous communication in open source projects


The main objective of this empirical study is to identify communication patterns in the contribution of community members to the mailing lists of OSS projects. To reach our goal we defined a number of questions and methodological steps in pursuing them, as follows:

Q: Are mailing lists used to discuss feature development? Since communication about the design of OSS is performed almost entirely through mailing lists, we wanted to see whether we could identify a particular list, or category of lists, dedicated to the communication of requirements. We wanted to focus our investigation of communication patterns on the lists most likely to be used for discussions about development features. To this end, we asked a more specific question:

Q1: Which types of mailing lists are most commonly found in OSS? Is any particular mailing list typically used for the discussion of feature development?


To this end, we selected a sample of open source projects showing a significant level of activity: the 50 projects from SourceForge with the highest percentage of activity (we used the activity measure provided by one of the most popular portals, which hosts several thousand OSS projects). Our assumption was that higher activity in a project implies a higher amount of traffic on its mailing lists. We will further refer to these 50 projects as TOP50. From the TOP50, we catalogued the mailing lists of each project and classified them into a number of categories.

Q: Are there any patterns in the mailing list where feature management is discussed that could provide some insight into how feature management is done in OSS? The intention was to focus our investigation on those messages within this list that provide the most relevant information about feature management. In particular, we wanted to identify patterns in the amount of contribution to the list that could be related to feature development. It is well known that in OSS projects only a small percentage of developers generate most of the code [4], but it is not known whether the same pattern is followed in the communication behaviour on mailing lists. To pursue this question, we first concentrated on learning who the authors of these messages are and how frequently they contributed. Thus we asked these specific questions:

Q2: Is there any pattern in the number of messages per year on the list dedicated to feature management? We examined the traffic and content of this list in the top 10 projects from the TOP50 (we will refer to these 10 projects as TOP10). We wanted to know if there was a pattern in the distribution of messages with respect to time.

Q3: How many people contribute to the lists dedicated to discussion about feature management? Is there a relationship between the number of authors and the traffic of this list (as defined by the number of threads started in the mailing list)? The archives of mailing lists are available through the Web, and are organized by thread. We created a program to scrape and retrieve these lists, and decided to concentrate only on the first message of each thread (this decision is further described below). We extracted each thread for a particular list, along with its author. We wanted to see whether there was a correlation between the number of threads and the number of authors.

Q4: How many authors generate most of the traffic in the feature management list? Our hypothesis was that a small proportion of authors is responsible for most of the threads in the mailing list. To test this assertion we decided to find what percentage of threads was authored by the 10% most active authors.

3.1 Methodology of extracting information from SourceForge

Most successful OSS projects have their own development infrastructure (version control and configuration management systems, mailing list servers, defect tracking systems, etc.). Some portals provide support to projects that need infrastructure to start development and attract members to their communities. SourceForge [5] is one of the most widely used portals, with 98,605 projects registered at the time this paper was written. It provides the necessary infrastructure to deploy a project,


such as a CVS repository (for version control), a repository for product releases, bug tracking infrastructure, and a mailing list server. SourceForge allows a project administrator to create mailing lists according to the particular needs of each project. SourceForge also provides a number of statistics related to the activity of the projects it hosts, including: the most active projects of the week, the most active projects of all time, the projects with the highest number of downloads, etc. We selected the TOP50 as the top 50 from the list of most active projects of all time. OSSmole [6] is a project that releases data about the projects SourceForge hosts. For each project, it releases information such as its registered contributors, the role each plays (user, administrator, developer), and the date the contributor joined the project.

3.2 Gathering and Analysis of the Data

TOP10 is the list of the 10 most active projects of all time, based on the SourceForge statistics. Each of these projects had a mailing list called <project-name>-devel, where feature management usually took place (we will refer to this list as development). We proceeded to retrieve the archives of each of these 10 lists by scraping the repositories via HTTP requests. We retrieved the subject of each thread, that is, of the first message of a discussion thread. For instance, if someone sends a message to the list and 10 people reply to it, we only record the first message. The main reason is that we found it difficult and error-prone to extract all the messages, and decided that, at least at this stage of our research, accuracy was more important. One important assumption we make is that a person is as likely to reply to a message as to create a new thread; we expect to further explore this assumption in the future. For each thread we retrieved its original author's name, email address, and date of creation. Some mail clients are not very good at maintaining the metadata that labels a given message as a response to a thread, and we discovered that a reply to a thread would frequently appear as a new thread. The subject of these messages invariably contained the prefix "Re:", so we removed from our database all messages that contained this prefix in their subject. We also discovered that sometimes a developer would use different email addresses; we tried to address this issue by discovering all the email addresses that a given individual (identified by the author's name) had used. We also removed, manually, messages that were clearly spam.

The concept of contributor covers different types of actors in the system. We classify them into the following categories:
- Developers are members of the community who contribute to the development of the system, submit patches to improve functionality, and/or fix bugs.
- Users are members of the community who use the system. Their contribution is expressed in reporting bugs, listing desired functionality, and participating in discussion.
- Administrators are the members in charge of the management of the project; they could also be considered developers.

The purpose of this classification is to clarify the roles of the general contributor registered in SourceForge. We assumed that, according to this classification of roles in each project, the developers are the main users of the development mailing list.


Unfortunately, the data provided by SourceForge through the OSSmole project includes all the contributors of a project without distinction of role, which makes sense because roles can be switched at any time among the participants in a project: a user can become a developer and vice versa, and an administrator can pass administrative responsibilities on to another user or developer. Another potential problem for our research is that not all actual contributors may have registered in SourceForge as contributors to the project (for example, a person may send a patch to one of the registered developers).
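The scraper itself is not included in the paper; the Python sketch below illustrates the kind of extraction described in this section, under the assumption of a mailman-style archive whose per-month thread index lists each message as a link with the subject as the link text. The URL layout, both regular expressions, and the helper names (fetch, extract_threads, thread_counts) are our illustrative assumptions, not the authors' actual code.

```python
import re
import urllib.request
from collections import Counter

# Assumed mailman-style index markup; both patterns are illustrative only.
SUBJECT_RE = re.compile(r'<a href="msg\d+\.html">(?P<subject>[^<]+)</a>')
AUTHOR_RE = re.compile(r'<i>(?P<author>[^<]+)</i>')

def fetch(url: str) -> str:
    """Retrieve one archive index page over HTTP (the 'scraping' step)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def extract_threads(html: str):
    """Yield (subject, author) pairs for thread-starting messages only.

    As described above, any subject carrying the 'Re:' prefix is treated
    as a reply that leaked into the index and is discarded.
    """
    for subject, author in zip(SUBJECT_RE.findall(html), AUTHOR_RE.findall(html)):
        if subject.strip().lower().startswith("re:"):
            continue  # a reply mislabelled as a new thread
        yield subject.strip(), author.strip()

def thread_counts(index_urls):
    """Count the number of threads started by each author."""
    counts = Counter()
    for url in index_urls:
        for _subject, author in extract_threads(fetch(url)):
            counts[author] += 1
    return counts
```

The merging of multiple email addresses per author and the removal of spam were, as the authors note, performed manually and are therefore not shown here.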

4. Findings
4.1 Are mailing lists used to discuss feature development?

Q1: Which types of mailing lists are most commonly found in OSS? Is any particular mailing list used for the discussion of feature development?

In the TOP50 projects analyzed, the names and purposes of the mailing lists were diverse. We examined the content of messages and identified that the lists can be grouped into the following categories. A short explanation of the lists in each category is included:

General
- General: mailing list used as a repository of questions, messages, and reports from the users and developers of the project, e.g. the crystal-general mailing list.

Users Support
- Users / Help: used to post questions and discuss usability issues, e.g. the squirrelmail-users and dynapi-help mailing lists.
- Announce: mailing list used to keep the project community up to date with the evolution of the project, the stage of implementation of new functionality, and the notification of new releases. This list is usually moderated, e.g. the dri-announce mailing list.
- Docs: mailing list intended to discuss issues related to project documentation and translation; it is also used to track changes made to the documentation, e.g. the crystal-docs mailing list.

Development Support
- Development: mailing list used for discussion related to the development of features. Usually the name of these lists contains one of the following suffixes: dev, develop, or development, e.g. the dri-devel mailing list.
- Plugins: mailing list used to discuss issues related to the creation or implementation of plugins, e.g. the squirrelmail-plugins mailing list.

Automatically generated
- Patches: mailing list used to track the status of patches submitted to the system, e.g. the gaim-patches mailing list.
- Bugs: mailing list used to track bugs reported in the system. The messages on this list are usually generated automatically by SourceForge's bug tracking system, e.g. the gaim-bugs mailing list.


- Cvs: mailing list used to notify about commits performed to the CVS repository. The messages on this list are usually generated automatically by CVS, e.g. the bo2k-cvs-updates mailing list.

Others
- Internationalisation: mailing list intended to discuss efforts related to the internationalization of the project, i.e. translation into other human languages, e.g. the gaim-i18n mailing list.

We then counted how many projects had a list in each of the categories. Figure 1 shows the percentage of projects (out of a total of 50) per category. For ease of reading, these categories will be referred to as lists henceforth.

Figure 1: Most commonly used mailing lists and the percentage of projects in which each was found

The most important finding is that the development list was found in 60% of the projects. In what follows, we discuss these lists in more detail, in the light of the data from Figure 1, in an attempt to explain how the development list differs from the others and the reasons for which we believe it is the one that provides the most relevant information about feature management in open source projects.

OSS projects rely on the development mailing list to gather the requirements of the system from the user community. An important point is that the users identify themselves as developers; hence, the development mailing list becomes the one they feel is the right place to post their ideas, concerns, and questions. The announce mailing list is considered very important for the projects, because it keeps the members of the community up to date about the development progress of the system and sometimes provides roadmaps for future development; even so, only 40% of the projects in the sample have one. The cvs mailing list is present in 40% of the projects. The messages on the cvs and bugs mailing lists are generated automatically by the CVS repository manager and the bug tracking system respectively (a non-human contributor), and they are intended to inform their subscribers about the development activity in the project; thus, the amount of traffic generated would be a good indicator



of the developers' activity as a group, but the data tells us nothing about the developers' communication behaviour, nor can we find communication patterns in it. In OSS projects the concept of user is closely related to that of developer, but not all users are developers, and this differentiation is also expressed in the creation of the mailing lists. As observed in Figure 1, 24% of the projects have a users mailing list, which could be related to the nature of projects where the community has to split communication between the development mailing list and the users mailing list. The main purpose of this list is to separate usability issues from functionality issues. With this differentiation, OSS projects aim to avoid the discussion of unimportant issues on the development mailing list, focusing instead on issues that concern the functionality of the system. The help mailing lists have as their purpose to provide a community forum where users and developers can help one another with problems related to the use of the system. The general mailing list has no specific purpose; it can contain information about development, releases, plugins, and general questions. From Figure 1, we observed that 24% of the projects have this mailing list. The general mailing list is also used as an alternative to the announce mailing list in projects that do not have one. SourceForge also has an automated system to submit a patch, which a registered user can use; every patch submitted to the system creates a Request ID. Similar to the bugs mailing list, the patches mailing list is used to communicate the status of submitted patches. When a patch is committed to the CVS repository, a record of this event is kept in the patches mailing list, so it is mainly used as an archive to track the history of a patch. As we can observe in Figure 1, less than 10% of the projects have a docs mailing list, which could be interpreted as a lack of interest in the communities in generating documentation in support of the project. The open-discussion mailing list fosters communication that is not necessarily related to the technical issues of the system, since the creation of a sense of community is the main motivation for this list.

4.2 Are there any patterns in the mailing list where feature management is discussed that could provide some insight about how feature management is done in OSS?

Q2: Is there any pattern in the number of messages per year on the list dedicated to feature management?

We were interested to know whether the traffic on mailing lists remains uniform over the life of the project. For instance, does the number of postings remain the same, or does it increase over the life of the project?
Figure 2 shows the number of threads in the development list for each of the TOP10 projects from 2000 to 2004 (we did not want to show only partial totals for 2005). Unfortunately, the activity of these projects does not seem to follow any obvious pattern. Analyzing the graph we can observe, for example, that for the project dri (Direct Rendering Infrastructure) the traffic on the list in 2000 was approximately 1100 postings; judging from the shape of the curve, it was a very productive year for the project. In 2001, the number of threads starts to drop. In 2003, approximately 12 threads were submitted to the mailing list. This could be due to two possible scenarios: (1) the project was


abandoned, or (2) the project reached maturity, so that no further development was required. However, in 2004, the number of threads rises again to 1400 postings.
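As a minimal sketch of the aggregation behind Figure 2 (shown below), and assuming the creation date scraped for each thread is available as a datetime.date, the per-year tallies could be computed as follows; the function name and the sample data are hypothetical:

```python
from collections import Counter
from datetime import date

def threads_per_year(thread_dates):
    """Tally thread-starting messages by calendar year.

    thread_dates: an iterable of datetime.date objects, one per thread,
    as recovered from the archive pages.
    """
    return Counter(d.year for d in thread_dates)

# Toy example (not real project data):
sample = [date(2000, 3, 1), date(2000, 7, 9), date(2004, 1, 15)]
print(threads_per_year(sample))  # Counter({2000: 2, 2004: 1})
```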

Figure 2: Volume of traffic in the development mailing list (number of threads started per year, 2000-2004) for the TOP10 projects: bo2k, corelinux, crystal, dri, gaim, gimp_print, kicq, squirrelmail, tik, and vacm

Gimp-Print (an API for printing under Unix) has more stable communication traffic: the graph displays a gradual decline, but the project maintains high and stable traffic. SquirrelMail (a PHP4-based web email client) shows that 2002 was the year with the highest activity, and the following year the traffic dropped by 53%. A similar traffic level was maintained in the subsequent year for both projects, a behaviour related to the natural project life cycle.

Q3: How many people contribute to the lists dedicated to discussion about feature management? Is there a relationship between the number of contributors and the traffic on the mailing lists (as defined by the number of threads started in the mailing list)?

In order to investigate this relationship, we present the TOP10 projects in Table 1. The table presents the current number of subscribers to the development mailing list and the number of posters, where a poster is a person who starts a thread on the list. The number of subscribers is the total number of people subscribed to the list (as of April 2005) and does not reflect the number of subscribers over the entire history of the list.
Table 1. Most active projects in SourceForge with a development mailing list

| Project Name | Mailing Lists | Subscribers to the development list | Posters to the development list |
|---|---|---|---|
| Crystal Space 3D | announce, core, cvs, development, docs, main | 226 | 140 |
| Direct Rendering | announce, development, patches, users | 604 | 959 |
| Gaim | bugs, cvs, development, features, forums, il8n, patches, plugins | 404 | 586 |
| CoreLinux++ | cvs, development, public | 9 | 9 |
| SquirrelMail | announce, cvs, development, il8n, lang_es, plugins, siteupdates | 541 | 750 |
| TiK | development, news | 13 | 6 |
| BO2K | cvs, development, general, users | 140 | 66 |
| Gimp-Print | announce, development | 184 | 584 |
| VACM | development, features, general | 52 | 59 |
| kicq | announce, development, users | 32 | 18 |

As we observe in Table 1, all the projects have at least one mailing list besides the development one. The development mailing list is one of the most important lists, and it is usually supported by extra mailing lists that host communication on other, frequently non-technical, issues. Another point is that developers focused on the discussion of particular technical issues need extra communication channels to get a better perspective of the project as a whole. For example, TiK and Gimp-Print each have only two mailing lists: development plus news for TiK, and development plus announce for Gimp-Print; the second list is likely used as a medium to reach all the members of the community. In several cases the number of different posters is larger than the number of subscribers; we attribute this to the fact that some contributors might no longer be subscribed to the mailing list. To identify any relationship between the total number of posters and the traffic on the list, Table 2 shows the average number of postings per contributor. We observed a positive correlation, and this average lies between 3 and 4 for most of the projects. CoreLinux has a very large average and TiK a very small one, but we attribute this to the fact that both have a very small number of posters (as shown in Table 1).
Table 2. Number of threads per poster in the development mailing list

| Project | Number of posters | Number of threads | Average threads per poster |
|---|---|---|---|
| Crystal Space 3D | 140 | 432 | 3.09 |
| Direct Rendering | 959 | 3039 | 3.17 |
| Gaim | 586 | 1247 | 2.13 |
| CoreLinux++ | 9 | 113 | 12.56 |
| SquirrelMail | 750 | 2657 | 3.54 |
| TiK | 6 | 11 | 1.83 |
| BO2K | 66 | 214 | 3.24 |
| Gimp-Print | 584 | 2582 | 4.42 |
| VACM | 59 | 163 | 2.76 |
| kicq | 18 | 71 | 3.94 |
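The per-poster averages in Table 2, and the positive poster-traffic relationship noted above, can be reproduced directly from the published figures. The sketch below copies the counts from Tables 1 and 2; the authors do not state which correlation measure they used, so Pearson's r (via statistics.correlation, available from Python 3.10) is our assumption:

```python
import statistics

# (posters, threads) per project, copied from Tables 1 and 2.
data = {
    "Crystal Space 3D": (140, 432),
    "Direct Rendering": (959, 3039),
    "Gaim": (586, 1247),
    "CoreLinux++": (9, 113),
    "SquirrelMail": (750, 2657),
    "TiK": (6, 11),
    "BO2K": (66, 214),
    "Gimp-Print": (584, 2582),
    "VACM": (59, 163),
    "kicq": (18, 71),
}

for project, (posters, threads) in data.items():
    print(f"{project}: {threads / posters:.2f} threads per poster")

# Pearson correlation between the number of posters and the number of threads.
posters = [p for p, _ in data.values()]
threads = [t for _, t in data.values()]
print(f"r = {statistics.correlation(posters, threads):.3f}")
```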

However, this is just a rough characterization of the traffic on the list, and does not provide any indication of how many threads each poster starts, nor whether contribution to the list is indeed uniformly distributed across contributors. Our assumption being that a small proportion of posters start most of the threads on the development mailing list, we further investigated the following question:

Q4: How many authors generate the majority of the traffic in the development list?

It is well known from the literature [8] that during the development of an OSS project a small percentage of developers write most of the code of the application. We were interested to know whether the same pattern is present in the mailing lists. Our assumption is that the most active developers in a project are probably responsible for the majority of the threads on a mailing list, and therefore that a small number of posters contribute the majority of the traffic.
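A hedged sketch of how this hypothesis can be tested, given per-author thread counts such as those produced by the thread_counts sketch in Section 3.2; the function name and the toy data are hypothetical:

```python
def top_decile_share(counts):
    """Fraction of all threads started by the most active 10% of posters.

    counts maps author -> number of threads started; ties at the decile
    boundary are broken arbitrarily by the sort.
    """
    per_author = sorted(counts.values(), reverse=True)
    k = max(1, len(per_author) // 10)  # size of the top 10%
    return sum(per_author[:k]) / sum(per_author)

# Toy example (not real project data): one dominant poster among ten.
toy = {"alice": 40, "bob": 5, "carol": 3, "dave": 2, "erin": 1,
       "frank": 1, "grace": 1, "heidi": 1, "ivan": 1, "judy": 1}
print(f"top 10% share: {top_decile_share(toy):.1%}")  # -> 71.4%
```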


Table 3 shows, for each of the TOP10 projects, the percentage of threads created by the 10% most active posters.
Table 3. Percentage of contribution to the development mailing list

| Project name | Threads generated by the top 10% of contributors |
|---|---|
| Crystal Space 3D | 49.5% |
| Direct Rendering | 53.3% |
| Gaim | 44.1% |
| CoreLinux++ | 60.1% |
| SquirrelMail | 58.9% |
| BO2K | 48.5% |
| Gimp-Print | 72.8% |
| Tik | 27% |
| VACM | 50.9% |
| Kicq | 52.1% |

We observed that 10% of the contributors generate an average of 52% of the threads on the list, which confirms our hypothesis that contribution to the development mailing list follows a pattern similar to that of code contribution in OSS projects. The Tik project had the smallest percentage, 27%; we attribute this to the fact that it is a very small list, with only 6 different posters and 13 current subscribers. The top 10% of contributors is still a large number of individuals when a project has a large community; hence the size of the group relevant to study remains large. How is this 10% broken down? We proceeded to analyze two of these projects in more detail.

4.3 A closer look at how posters contribute to two projects

To further demonstrate these findings, we present the details of two projects that illustrate most of the insights we obtained from this analysis.

4.3.1 Dri - The Direct Rendering Project

This project has three mailing lists: announce, development, and users; the number of messages posted to each list is included in Table 4.
Table 4. Number of messages per mailing list in dri

| Mailing list name | Total number of messages (threads and replies) |
|---|---|
| announce | 10 |
| development | 32397 |
| users | 8325 |


This finding supports our original assumption that the development mailing list has the highest activity of all the mailing lists. Comparing the traffic of the development mailing list against the users mailing list, we can observe that the volume of the former is about four times larger. The development mailing list was further analyzed: 959 different posters submitted a total of 3039 threads, and 96 posters submitted a total of 1621 threads, which means that 10% of the posters generated approximately 53.3% of the threads on the list.
Figure 3 shows that the five top contributors to the mailing list contributed around 16% of the total threads posted. Considering that the list has 959 contributors, the contribution of the five top contributors is very significant to the project, but the fact that they are the most active contributors does not imply that they are the most active or productive developers. We plan to conduct further analysis to verify this assertion.

Figure 3. Percentage of threads posted to the dri development mailing list: top 5 contributors 16%, rest of the top 10% 37%, all other posters 47%

4.3.2 SquirrelMail

This project has 9 mailing lists hosted by SourceForge, all of which have a considerable amount of traffic; their statistics are shown in Table 5. A total of 750 posters submitted 2657 threads to the development mailing list from 2000 to December 2004; 75 posters submitted 1565 of these threads, i.e. approximately 58.9% of the threads were submitted by 10% of the posters. The five top contributors submitted 18.2% of the total number of threads to the development list.
Table 5. Number of messages per mailing list in SquirrelMail

| Mailing list name | Number of messages (threads and replies) |
|---|---|
| Announce | 52 |
| Cvs | 10914 |
| Development | 11299 |
| Il8n | 1693 |
| Lang-es | 792 |
| Plugins | 10979 |
| Siteupdates | 1438 |
| Stable | 616 |
| Users | 26875 |

Analyzing the findings from these two projects, we observe that 10% of the contributors generate approximately 55% of the threads. The five top contributors to the development mailing list are very active: they generated 16% of the total number of threads in dri, and 18.2% in SquirrelMail.

Figure 4. Percentage of threads posted to the squirrelmail development mailing list: top 5 contributors 18%, rest of the top 10% 41%, all other posters 41%

5. Conclusions
In this paper we report on our research analyzing OSS projects and how RE practice and formal specification are replaced by practices that take advantage of the experience of the community members, who use the project's infrastructure (mailing lists, discussion forums, wikis, website, etc.) to informally gather requirements, reach a shared understanding of the requirements being implemented, and create a common knowledge repository accessible to all members of the community. We described our research directions in investigating the use of asynchronous communication that facilitates the work of distributed developers who typically coordinate their work on requirements by means of mailing lists. We believe that mailing lists can be considered historical archives of projects, from which we can learn how informal Software Engineering processes are supported by the communication on the mailing lists. Through an analysis of data from projects in SourceForge, we presented some interesting findings, which we use as a basis for future research. In summary: (1) We identified one category of mailing lists that hosts most of the communication about feature development. We termed this category the development list and found that it was present in the majority of the 50 most active projects in SourceForge. The practical implication of this finding is that we can now narrow our exploration to this mailing list and define more specific research questions about communication behaviour relevant to requirements management in OSS. Hence we analyzed the traffic of messages on this list in the 10 most active projects. (2) Despite not finding any interesting pattern in the number of messages per year in the development list across these projects, we were able to make some interesting observations. While first identifying a direct correlation between the total number of contributors and the traffic on the list (which implies that the number of messages may grow in direct proportion to the number of contributors to the mailing list), we then found that the majority of the traffic is in fact generated by a small number of contributors. The practical implication is that we can further narrow our research to the communication of this small

number of contributors, as an indication of a representative sample of messages tackling requirements issues.

6. Future Research
These findings are only a first step in our exploratory research. While interesting, exploration in such unmarked territory may seem fruitless at times, as many directions need to be tried before some prove useful for the purpose of understanding communication around requirements in open source projects. We believe that important directions for future work, as indicated by this preliminary investigation, include:

1. Increase the information used in our investigation. We have been using only thread-starting messages in our analysis; we need to analyze every message in every thread. We might discover that some people are more likely to reply to a message than to start a new discussion, and vice versa. We will also observe projects that are not hosted on SourceForge and check whether they have communication patterns similar to the ones we have discovered.

2. In-depth analysis of the content of the messages posted to the development mailing list by the small group identified as contributing most to the list traffic, i.e. the top 10% of posters. One important question is whether there is any correlation between the developers who contribute most to the code and those who contribute most to the mailing list.

3. Examination and classification of these messages for possible patterns that could be mapped to discussions of certain features in development. Patterns in the construction of threads and their depth (replies to each posted message) may indicate communication networks, the people involved, and the development leadership for a certain feature.

4. According to our findings, approximately 90% of contributors to discussions on the mailing list are transient. In the absence of specification-centric knowledge management techniques, an interesting question that emerges is what support these mailing lists offer to newcomers who are trying to find out about requirements and their implementation status.

5. An analysis of the traffic in the development list around the release and pre-release of a new version of the system will help us understand how discussions on the mailing list impact the choices made by the community decision-makers. It will provide us with a list of functionality that was ignored or forced into the next release of the system.

We expect our current and future research to benefit other open source projects. Further research is needed to evaluate whether any findings from the open source domain can be transplanted to proprietary software development, where the motivations and organization are very different from those in open source.


7. References
[1] KOGUT, B., METIU, A., Open Source Software Development and Distributed Innovation, Oxford Review of Economic Policy, 17(2), pp. 248-264, 2001.
[2] MOCKUS, A., FIELDING, R., HERBSLEB, J., Two Case Studies of Open Source Software Development: Apache and Mozilla, ACM Transactions on Software Engineering and Methodology, 11(3): 309-346, July 2002.
[3] CALEFATO, F., LANUBILE, F., A decentralized conferencing tool for ad hoc distributed workgroups, Dipartimento di Informatica, University of Bari, 2004.
[4] CAPILUPPI, A., LAGO, P., MORISIO, M., Characteristics of Open Source Projects, in Proceedings of the 7th European Conference on Software Maintenance and Reengineering, March 2003.
[5] SourceForge.net, online at http://sourceforge.net/index.php. Accessed March 2005.
[6] OSSmole, online at http://ossmole.sourceforge.net. Accessed March 2005.
[7] DAMIAN, D., ZOWGHI, D., Requirements Engineering challenges in multi-site software development organizations, Requirements Engineering Journal, 8, pp. 149-160, 2003.
[8] SCACCHI, W., Understanding the Requirements for Developing Open Source Software Systems, IEE Proceedings - Software, 148(1), pp. 24-39, 2002.
[9] NUSEIBEH, B., EASTERBROOK, S., Requirements Engineering: A Roadmap, in A. Finkelstein (ed.), The Future of Software Engineering, ACM and IEEE Computer Society Press, http://www.softwaresystems.org/future.html, 2000.
[10] DAVIS, A.M., Software Requirements: Analysis and Specification, Prentice-Hall, 1990.
[11] JACKSON, M., Software Requirements & Specifications: Practice, Principles, and Prejudices, Addison-Wesley, Boston, MA, 1995.
[12] KOTONYA, G., SOMMERVILLE, I., Requirements Engineering: Processes and Techniques, John Wiley and Sons, New York, 1998.
[13] AccessGrid website, http://www.accessgrid.org. Accessed May 2005.
[14] FIELDING, R.T., Shared Leadership in the Apache Project, Communications of the ACM, 42(4), pp. 42-43, April 1999.
[15] CUBRANIC, D., MURPHY, G.C., Hipikat: Recommending Pertinent Software Development Artifacts, in ICSE 2003, Portland, IEEE CS Press, pp. 408-418, May 2003.


Virtual Team Implementation and Management - A Position Paper

Tom O'Regan, Valentine Casey, Ita Richardson
Abstract
This paper outlines the current and proposed research undertaken by the authors in the area of virtual team implementation and management. After outlining advantages and disadvantages of adopting a GSD undertaking, the authors discuss the development of a virtual team framework. This framework was developed by Casey [4] through qualitative research in a multinational company. It facilitates the development of a strategy for establishing and managing virtual teams. The research project is now moving to the implementation stage using action research. This paper presents the proposed strategy for action research and analysis of results.

1. Introduction
Information technology is providing the necessary infrastructure to support Global Software Development (GSD) and the conception of new types of software development teams, such as virtual teams. Although the implementation and running of such teams is fundamentally different from single-site teams, virtual teams could revolutionise the workplace through increased flexibility and productivity.

1.1 Advantages that GSD can offer

A major advantage of employing a GSD policy is the possibility of developing a product close to the target market or customer. Developing close to or in target markets reduces overall time to market, which results in cost savings for a project [3, 4]. When people from different backgrounds or cultures come together to work toward a common goal, the different levels of experience, technical knowledge, and elementary understanding of a problem can lean in favour of innovation [5]. Ebert and De Neve [5] declare that a mix of skill sets and life experiences within a team can result in improved coordination among GSD team members. By developing across international borders, companies have access to a new set of individuals that match their needs [6]. Furthermore, communication over temporal distances can be used to the advantage of a distributed software development team. These teams may work towards 24-hour development through the effective use of time zones, thereby reducing development time, which in turn reduces overall development costs. GSD in essence allows distributed teams to split up the tasks of a project and distribute them as separate jobs [6]. This allows development decisions about each project task to be made with a degree of independence [7].

1.2 Disadvantages of GSD

When operating across time zones, remotely located individuals might not be accessible instantly. This results in increased reliance on asynchronous communication tools such as email [1]. These tools, when used across temporal distances, increase the time to get a response. Delays in receiving responses may increase the time taken to resolve an issue and intrinsically reduce


productivity [1]. If the expert is remotely situated, the time scale can grow considerably, as can the effort needed to establish contact in the first place [7]. When development teams are separated geographically, the level of informal contact is reduced. This point is supported by Herbsleb and Mockus [8], who suggest that informal contact between team members builds better working relationships, facilitating a more desirable information flow regarding projects. A further disadvantage of GSD is the short overlap in time between sites in a given working day [11, 3] available for collaboration. English has become the international business language, but misunderstandings can occur, as there may be different dialects or accents [9]. Different cultures have varied approaches to development. Certain cultures may possess a set of values, a language, or an approach to authority that other cultures might not have the same regard for or place as much emphasis on [11]. Carmel [2] expands the list of potential cultural differences in the workplace to include concepts of space, material goods, and time keeping. These cultural differences, where encountered, can contribute to scenarios that affect productivity and morale. The lack of trust-building communication techniques in GSD inevitably makes establishing trust much harder than within collocated teams [13]. Face-to-face communication can build trust that may not be present in a distributed team [12]. Established trust gained from collocated experience can deteriorate over time in a distributed setting [4].

1.3 Research Project

Lipnack and Stamps define a virtual team as "a group of people who interact through independent tasks guided by common purpose that works across space, time and organizational boundaries with links strengthened by webs of communication technologies" [10]. The number of organisations employing virtual-team-based, globally distributed software development continues to increase. This expansion necessitates further research in this area, much of it industry driven. The objective of this research project is to evaluate and improve the current methods of deployment and to more fully comprehend and address the implications of implementing a GSD and virtual team development strategy.

2. Casey's Framework Development

The development of the GSD framework resulted from detailed qualitative research examining the implementation of GSD strategies within multinational organizations in Ireland. At the macro level, one of the most pressing problems is the need for a specific GSD-centric software development process. This process needs to be generic in nature, and is required to address and efficiently leverage the key variables and infrastructures that facilitate effective global software development. At the micro level, one of the key areas within such a process is the establishment and operation of virtual software testing teams. The qualitative research was undertaken within three distributed software development environments employing virtual teams:
- Local off-site development and testing, where virtual team members were situated in two locations in the same country.
- Offshore / near-shore development and testing, where team members were located in Ireland and the United States; the two countries are geographically distant, but are considered linguistically and culturally near shore.


- Offshore development, where virtual team members were geographically, linguistically, and culturally distant, with members in Ireland and the Far East exclusively involved in software testing.

On-site qualitative research, which included document review, observation, interviews, and completion of questionnaires, was carried out. This allowed us to identify the key variables and infrastructure required to facilitate effective software testing in a globally distributed virtual team environment. Analysis of the results and further literature review provided the basis for the development of the GSD framework for virtual team testing. This framework in turn provided the basis for the development of an effective strategy for establishing and managing virtual teams, with a focus on leveraging process, project management, and infrastructure. Specific areas identified in Casey's framework include:

1) The requirement for organisations to realistically evaluate the true cost and risks involved in undertaking a GSD virtual team strategy.
2) The provision of clear definitions of roles and responsibilities for managers and team members.
3) The development and implementation of a project management strategy which incorporates and addresses the specific advantages and disadvantages of virtual team testing.

The solution proposes to tackle the GSD issues in a number of ways. Senior management is required to understand the full implications of embarking on a GSD virtual team testing strategy. Once that has been achieved, the resources required must be made available to provide the infrastructure and training to efficiently leverage this approach. Effective project management needs to be undertaken to facilitate and ensure its successful implementation. The effectiveness of the framework and the proposed project-management-based strategy has been the subject of a quasi-Delphi analysis with international experts. (Further detail on the framework can be found in [4].)

3. Implementation Methodology

Figure 1 - Implementation Strategy

The research project is moving into Phase II. The objective is to implement the framework within a project employing virtual software testing teams in industry, aiming to make the global testing environment more effective and efficient. The implementation will be carried out through action research with a multinational partner, and it is expected that further refinement of the framework will occur.


Figure 1 illustrates the current implementation plan as agreed with our partner. Through the analysis of company data relating to a certain product, we will identify the features of that product and how well each feature is developed. For each product feature the company has a corresponding development team situated across multiple sites. Examples of metrics include actual versus planned development time, and maintenance problems caused at one site and solved at another. Use of these metrics will allow us to determine a measure of feature "goodness" in the product. This will be done in conjunction with our industry partner. As part of Casey's research [4], a set of qualitative data has been collected, consisting of semi-structured interviews with staff members and a series of questionnaires. This data is being analysed to indicate measures such as team cohesiveness, effectiveness, and performance. These measures are used in Organisational Theory research and will be transferred during this project to GSD research [14]. Following the analysis of the product features and their associated virtual project teams, we will investigate whether there is any correlation between the two, thereby answering the question: how can virtual teams produce the best features? Subsequently, we will relate these results to the relevant point in Casey's framework. We then proceed to implement only the relevant portion of the framework, completing the first stage of an action research cycle, diagnosis. Working with industrial practitioners, we will collaborate on the next stage of the cycle, action planning. This collaboration will result in the compilation of implementation plans and required organisational changes to ensure that virtual teams can, in fact, produce good, or even the best, product features. To make these changes we will consult Casey's research relating to the corresponding portion of the framework. Coupled with literature from the field, we will implement our changes, thus fulfilling the action-taking phase. Our industrial collaborators have expressed a requirement that quantitative data be presented to them detailing their position in relation to the measures used before and after our changes are made. This data will show any improvements the company has gained by adopting our changes, and also any deterioration; from this we will learn whether the desired result has been fully achieved and whether further iterations of the cycle are needed. The authors will also undertake iterations within a second company, with the aim of developing a more uniform framework by leveraging the results from the first iterations in a different scenario and monitoring the outcomes.

4. Conclusion and Further Work


We expect that the development of a successful implementation strategy will result in a framework that can be used by any multinational company to develop an effective strategy for establishing and managing virtual teams. In parallel with this project, the Irish Software Engineering Research Centre is undertaking a project with small-to-medium sized enterprises in which researchers will implement and modify the framework for use in small software development companies that are also involved in GSD.

5. Acknowledgement
This research has been supported by the Science Foundation Ireland Investigator Programme, B4STEP (Building a Bi-Directional Bridge Between Software ThEory and Practice), and the Irish Software Engineering Research Centre GSD for SME cluster project.


6. References
[1] Boland, D. and B. Fitzgerald (2004). Transitioning From A Co-Located To A Globally-Distributed Software Development Team: A Case Study At Analog Devices Inc. ICSE International Workshop On Global Software Development, Edinburgh, Scotland.
[2] Carmel, E. (1999). Global Software Teams: Collaboration Across Borders and Time Zones. Upper Saddle River, NJ, USA, Prentice Hall.
[3] Casey, V. and I. Richardson (2004). Practical Experience Of Virtual Team Software Development. European Software Process Improvement (EuroSPI) 2004, Trondheim, Norway.
[4] Casey, V. PhD thesis, to be published, University of Limerick, 2005.
[5] Ebert, C. and P. De Neve (2001). Surviving Global Software Development. IEEE Software 18(2): 62-69.
[6] Grinter, R. E., J. D. Herbsleb, et al. (1999). The Geography Of Coordination: Dealing With Distance In R&D Work. International Conference On Supporting Group Work.
[7] Herbsleb, J. D. and R. E. Grinter (1999). Splitting The Organisation And Integrating The Code: Conway's Law Revisited. 21st International Conference On Software Engineering, Los Angeles, California, United States, IEEE Computer Press.
[8] Herbsleb, J. D., A. Mockus, et al. (2000). Distance, Dependencies and Delay in a Global Collaboration. 2000 ACM Conference on Computer Supported Cooperative Work, Philadelphia, Pennsylvania, United States, ACM Press.
[9] Kiel, L. (2003). Experiences In Distributed Development: A Case Study. ICSE International Workshop On Global Software Development, Portland, Oregon, USA.
[10] Lipnack, J. and J. Stamps (1997). Virtual Teams: Reaching Across Space, Time and Organisations With Technology. John Wiley & Sons, NY.
[11] McDonough, E. F. III, K. B. Kahn, et al. (2001). An Investigation Of The Use Of Global, Virtual, And Collocated New Product Development Teams. Journal Of Product Innovation Management 18: 110-120.
[12] Pyysiainen, J. (2003). Building Trust In Global Inter-Organizational Software Development Projects: Problems And Practices. ICSE Workshop On Global Software Development.
[13] Robey, D., H. M. Khoo, et al. (2000). Situated Learning In Cross-Functional Virtual Teams. IEEE Transactions On Professional Communication 43(1): 51-66.
[14] Liu, W., P. Flood, J. Guthrie, and S. Mac Curtain (2005). High Performance Work Systems in Ireland: The Economic Case. National Centre for Partnership and Performance, Dublin.

