Strategies to Minimize Problems in Global Requirements Elicitation

Many challenges arise in global software development projects, most of which are related to the lack of face-to-face communication and people's need to feel comfortable with the technology that they use. In this paper we introduce a methodology to detect the problems which may occur during the global requirement elicitation process and propose solutions to reduce them.


Introduction
The Requirement Elicitation (RE) process is challenged by different factors, most of which are related to communication between stakeholders [7]. In addition, Global Software Development (GSD) is becoming increasingly common [17,19], and the distribution of stakeholders across various countries makes communication even more difficult. The geographic and temporal distance between stakeholders increases the difficulty of carrying out the RE process [12]. Communication is less effective because different time zones complicate synchronous communication and distance makes face-to-face meetings more difficult [17]. Communication is also hindered by cultural differences [19] and lack of awareness [17], which may cause misunderstandings. A complete list of critical factors is given in [21].
In an attempt to decrease the impact of some of these factors, we propose a methodology that helps to detect in advance the problems that are likely to take place, by taking into account the stakeholders' profiles and their environment. Moreover, the methodology proposes various strategies through which to decrease the impact of these problems or to avoid them.
The remainder of this paper is structured as follows: Section 2 describes the RE-GSD methodology. Section 3 outlines the experiment that is being performed to validate the first stages of the methodology. Finally, Section 4 presents conclusions and future work.

RE-GSD Methodology
The RE-GSD (Requirement Elicitation for Global Software Development projects) methodology attempts to detect in advance possible sources of problems that might take place in a GSD project, and suggests strategies to minimize them. The following sections will focus on the first two phases; however, Figure 1 shows how they integrate with the rest of the phases of a requirements definition process.

Figure 1 RE-GSD Methodology
In the following sub-sections we shall focus upon describing the first two stages, since both are the principal contributions and the key stages in attaining the goal of this methodology: detecting and correcting problems.

Phase 1: Preliminary Data Collection
The overall aim of this stage is to discover as much as possible about the environment and the people that will be part of the requirement elicitation process, along with the domain and main characteristics of the system under construction.
In order to gather the information we suggest organizing it into 3 different groups, as follows:
- The stakeholders
- The environment in which the elicitation will be carried out
- The characteristics of the system that will be built and its domain
Traditional requirements elicitation methodologies do not provide structured guidelines through which to obtain this information. To help collect this information we provide specially designed questionnaires and forms. Those related to the first two groups are shown in the following sub-sections.

Obtaining information about the stakeholders
In order to obtain information about the stakeholders involved in the process we propose to:
- Identify people whose participation is important in the requirement elicitation process, including people from different levels of the organization.
- Obtain personal information, such as stakeholders' cognitive characteristics, first and second languages, etc. The form with which to collect such information is shown in Form 1.

Considerations about Form 1:
(1) In GSD projects it is important to have clear information about which is the given name and which is the family name, since different cultures use different orders. For instance, in China and Korea the family name goes first, while in most Western countries, such as Spain, the family name comes last.
(2) Some people have a favourite name that they like to be called by. It might be a nickname, a special form of their given name or even a different name used by their family or friends. We think it is important to have a record of such a preference if the stakeholders wish to give such information, in order to allow them to feel more comfortable with the environment.
(3) As is widely recommended, it is more useful to ask for a stakeholder's date of birth rather than just his/her age, because the age can be calculated from the date of birth and the information does not need to be updated.
(4) Country of Origin and Residence are significant because, although the first language might be the same (for instance, Spanish), the country of origin may use certain words in a different way and this might also be a source of misunderstandings.

Stakeholder's Personal Information Form
(5) Information about the stakeholder's knowledge of foreign languages must be obtained for each language that the organization considers as a possible "second language" for its virtual teams. For convenience we only show one language in the table format.
(6) Since stakeholders are from different countries, it is important to know what their degree status is, according to a standard scale such as: Bachelor, Master, Post-Master, Doctor, Post-Doctoral.
(7) We consider it important to know the cognitive profile of the stakeholders and we have therefore used the Felder-Silverman Test, which can be obtained from the NC State University web page (http://www.engr.ncsu.edu/learningstyles/ilsweb.html). We are developing a tool to calculate such results and record them, along with the stakeholders' preferences.
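As a rough illustration of considerations (1) to (6), the information in Form 1 could be stored in a record such as the one below. This is only a sketch: the class name, field names and methods are hypothetical and are not part of the forms or of the tool mentioned above. It shows the given and family names kept separately, the optional preferred name, and the age derived from the date of birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StakeholderProfile:
    """Hypothetical record mirroring the fields discussed for Form 1."""
    given_name: str                # stored separately from the family name (1)
    family_name: str
    date_of_birth: date            # age is derived, never stored (3)
    country_of_origin: str         # may differ from residence (4)
    country_of_residence: str
    first_language: str
    preferred_name: str = ""       # optional, only if the stakeholder gives it (2)
    # language -> self-reported level, e.g. {"English": "intermediate"} (5)
    foreign_languages: dict = field(default_factory=dict)
    degree: str = ""               # e.g. "Bachelor", "Master", "Doctor" (6)

    def display_name(self) -> str:
        """Name the stakeholder prefers to be called by, if one was recorded."""
        return self.preferred_name or self.given_name

    def age_on(self, today: date) -> int:
        """Age in whole years on a given day, computed from the date of birth."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)
```

Keeping the two names in separate fields avoids having to guess the name order from a single string, which is the ambiguity that consideration (1) warns about.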
Moreover, we collect information about stakeholders' jobs, roles, responsibilities and schedules. Since they are distributed throughout various locations, we also obtain information about their habits at each location (time difference with other sites, working hours, lunch time, etc.). This is very important if members are to know how to contact each other. This information can be collected by using Form 2.

Considerations about Form 2:
(8) The stakeholder's location is important for the rest of the virtual team. It is also useful to know the time difference between the site in question and the Greenwich Meridian, along with the time difference between the individual sites in the countries that are involved.
(9) The role can be useful in defining priority between stakeholders' preferences. For instance, a project manager might decide not to be contacted by phone by designers or analysts, and this preference might carry more weight than his/her subordinates' preferences.
(10) Contact information gives us the opportunity to discover all the ways in which a person can be contacted.
(11) If other communication technologies can be used, stakeholders can include this information to make it clear to the rest of the stakeholders.
(12) Preferences about groupware tools are gathered in order to discover more about stakeholders and the way in which they prefer to interact with others.
(13) Timetable information is useful as it gives us information about stakeholders' routines, indicating working time, breaks to have lunch, and so on.
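To illustrate how the location and timetable information in considerations (8) and (13) might be used, the following sketch computes how many minutes per day two sites are simultaneously within working hours. The function names and the tuple layout are hypothetical, the UTC offsets are illustrative values, and working hours are assumed not to cross local midnight.

```python
from datetime import time

MINUTES_PER_DAY = 24 * 60


def utc_interval(start: time, end: time, utc_offset_hours: float):
    """Return local working hours as (start, end) minutes relative to 00:00 UTC."""
    shift = int(utc_offset_hours * 60)
    return (start.hour * 60 + start.minute - shift,
            end.hour * 60 + end.minute - shift)


def daily_overlap_minutes(site_a, site_b) -> int:
    """Minutes per day during which both sites are within working hours.

    Each site is a (start, end, utc_offset_hours) tuple.
    """
    a_start, a_end = utc_interval(*site_a)
    b_start, b_end = utc_interval(*site_b)
    total = 0
    # The timetable repeats daily, so compare A with B shifted by whole days.
    for days in (-2, -1, 0, 1, 2):
        shift = days * MINUTES_PER_DAY
        total += max(0, min(a_end, b_end + shift) - max(a_start, b_start + shift))
    return total


# Illustrative offsets only: UTC+1 and UTC-3, i.e. sites four hours apart.
site_1 = (time(9, 0), time(17, 0), 1)
site_2 = (time(9, 0), time(17, 0), -3)
shared = daily_overlap_minutes(site_1, site_2)  # 240 minutes
```

A small or empty overlap is a sign that synchronous communication will be hard to schedule for that pair of sites.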

Obtaining information about the organizational environment
In order to adjust the software to the existing environment and to avoid controversies, it is important to obtain information about the organization's structure, culture, and internal policies [23]. We also collect information about groupware technology, as this has a direct influence upon the communication process, and about the requirement elicitation techniques, since this choice is also very important for the quality of the requirements specification. To do this we have designed two questionnaires, one for stakeholders and the other for managers, in order to answer the following questions:

Communication patterns within the organization:
Do organizational policies exist which allow stakeholders to communicate freely with others in the virtual team or is there a person who must act as a mediator?

Phase 2: Virtual Team Definition and Problem Detection
As we have previously explained, the main goal in this phase is to define strategies to minimize the problems that may appear during requirements gathering. With this aim, we focus on the information gathered in the previous phase (personal information about stakeholders and their routine at work), identify the possible sources of problems and, finally, propose strategies to improve the requirements elicitation process. In order to organize this work, we have divided it into two tasks:
1) Detecting factors that may be a source of future problems
2) Determining the strategies to be applied in each case in order to minimize the effects of the problems detected in the previous step.
We shall now explain both tasks in greater detail.

Detecting factors that may be a source of future problems
We first need to determine which problems we expect to solve. The most common sources of problems cited in Global Software Development literature are:
- Problems caused by inadequate communication [14,20]
- Problems caused by time difference or time separation [8,14]
- Problems caused by cultural differences [6,14,19] (here we distinguish between problems caused by language differences and those caused by behavioural differences between cultures)
- Problems related to knowledge management [14]
The possible sources of problems must be analyzed in each iteration of the requirements elicitation process or every time the set of stakeholders changes. To do this, we need to analyze the information that has been gathered in Forms 1 and 2 in the previous phase, and to ask the stakeholders to fill in Form 3.

As we have mentioned before, these strategies are related to the problems that may appear in GSD projects during the requirements elicitation process, and we looked for strategies that covered the four main problems suggested in related literature [8,14]. As Table 1 shows, the three strategies are related to improving communication. For instance, improving stakeholders' knowledge of the foreign culture is naturally related to cultural difference problems, which place a significant strain on communication between software developers [15]. Similarly, by using ontologies to minimize problems due to language differences (which are a consequence of cultural diversity) we reduce conceptual ambiguities [24] (which is related to communication) and clarify the structure of knowledge [11], which helps knowledge management. Finally, technology selection is related in the first place to time difference, since groupware tools can be classified as synchronous or asynchronous, and also to language difference, considering the communication channels that groupware tools and requirements elicitation techniques can use (written text, audio and visual media).
Although problem detection and strategy application are not strictly sequential, we propose viewing the process as a set of linear steps, as can be seen in Figure 2.

Figure 2 Applying strategies for minimizing problems in GSD
In the first place we shall use the information about cultural differences. If cultural differences are detected, stakeholders must be aware of normal behaviour in other cultures as well as being conscious of their own behaviour, which may be seen as offensive or difficult to understand by others. Secondly, we shall analyze the stakeholders' country of origin. If all of them are from the same country of origin then understanding between them should not be a problem; but if this is not the case, we propose using ontologies to help reach a common understanding. In addition, when the stakeholders' first language is not the same, it is important to determine the degree of knowledge of a common language. If this level is intermediate or less, we propose restricting communication to asynchronous tools, in order to give people the chance to read and write with greater care. Finally, we propose using knowledge about the stakeholders' cognitive characteristics to choose the groupware tools and requirements elicitation techniques that are closer to the way in which they understand the world. In this case, we differentiate two ways to select the technology, depending on whether or not there are conflicts of preferences between the team members.
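The linear steps described above can be sketched as a simple decision procedure. This is a minimal illustration under stated assumptions: the `Member` record, the ordered `LEVELS` scale and the strategy labels are hypothetical names introduced here, and whether cultural differences have been detected is taken as an input rather than computed.

```python
from dataclasses import dataclass

# Assumed ordered scale for knowledge of the team's common language.
LEVELS = ["none", "basic", "intermediate", "advanced", "native"]

TRAINING = "A: training about cultural diversity"
ONTOLOGY = "B: use of a domain ontology"
ASYNC_ONLY = "C: restrict communication to asynchronous tools"
COGNITIVE = "C: select technology from cognitive characteristics"


@dataclass
class Member:
    name: str
    country_of_origin: str
    first_language: str
    common_language_level: str  # one of LEVELS


def suggest_strategies(team, cultural_differences_detected: bool) -> list:
    """Apply the steps in order and return the strategies that are suggested."""
    strategies = []
    # Step 1: cultural differences lead to training about cultural diversity.
    if cultural_differences_detected:
        strategies.append(TRAINING)
    # Step 2: different countries of origin lead to the use of ontologies.
    if len({m.country_of_origin for m in team}) > 1:
        strategies.append(ONTOLOGY)
    # Step 3: different first languages and a weak common language
    # (intermediate or less) lead to asynchronous tools only.
    if len({m.first_language for m in team}) > 1:
        lowest = min(LEVELS.index(m.common_language_level) for m in team)
        if lowest <= LEVELS.index("intermediate"):
            strategies.append(ASYNC_ONLY)
    # Step 4: cognitive characteristics are always used to choose technology.
    strategies.append(COGNITIVE)
    return strategies
```

For a team whose members come from two countries but share a first language, this sketch would suggest the ontology and the cognitive technology selection, but would not restrict the team to asynchronous tools.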
We shall now give more details about each strategy. Cultural differences cannot be avoided, but stakeholders can learn about the other culture. Training in cultural diversity is crucial for stakeholders to be aware of normal behaviour in other cultures and conscious of their own behaviour, especially of things that can be offensive or misunderstood. To minimize this kind of problem, we have used the following strategies:
- Cultural mediation: taking advantage of people who have visited the other site before, and who therefore know the customs and normal behaviour of the foreign culture, so that they become points of reference for communication with people at the other site. Such people are called mediators, bridgeheads [9] or liaisons [18].
- Virtual mentoring: based on simulation and virtual actors, this can be an interesting way of motivating stakeholders in foreign language training and cultural familiarization [22].
In addition to cultural diversity, GSD projects must also deal with language differences. Language differences can occur at a wide variety of levels, depending on whether or not stakeholders share the same first language. When people do not share a native language, English is usually the language chosen for interaction, and it is crucial to have a clear understanding of domain concepts and relationships. But even when people share a native language, if they come from different countries, idiomatic differences are a challenge for communication. For instance, people from Argentina and Spain share Spanish as their native language, but pronunciation differs and many words have different meanings at the two sites.

(B) Use of ontologies to minimize language differences
If all the stakeholders are from the same country of origin then understanding between them should not be a problem, but if this is not the case, we propose using ontologies to help reach a common understanding. The main reasons why ontologies may be important during the requirements elicitation process are:
- They clarify the structure of knowledge: domain concepts and their relationships are defined during the ontological analysis, and thus permit a clear specification of the nature of the concepts and the terms used to represent them, with regard to the body of knowledge that we intend to build [11].
- They reduce conceptual and terminological ambiguities: ontological analysis provides a framework that unifies criteria, even between people with different needs or viewpoints that depend on their own context [24].
Ontologies can be classified as follows [16]:
- Top Level Ontologies: describe very general concepts such as space, time, matter, object, event, action, and so forth. They are domain independent. Their intention is to unify criteria among large communities of users.
- Domain Ontologies: describe the vocabulary related to a generic domain (such as medicine, or automobiles), by specializing the terms introduced in the top-level ontology.
- Task Ontologies: describe the vocabulary related to a generic task or activity (such as diagnosing or selling), by specializing the terms introduced in the top-level ontology.
- Application Ontologies: describe concepts depending both on a particular domain and task, which are often specializations of both the related ontologies.
When an ontology exists for the application domain of a software system, or for the processes of its design and construction, it is an important tool that helps to avoid errors and problems in the different phases of the software product life cycle, from requirements analysis (where it facilitates the interaction between analyst and client) to maintenance (where it allows a better comprehension of the system under maintenance and of the modification requests).
We therefore propose:
- If a domain ontology exists, use it to resolve ambiguities and share knowledge.
- Otherwise (if a domain ontology does not exist), the ontology can be built as part of the requirements elicitation process.
We will propose certain easy steps and guidelines through which to build an ontology to assist communication between stakeholders. This ontology will be part of the whole software life cycle, because it can be started during the requirements elicitation phase and can grow throughout the successive cycles of the requirements gathering phase.
Moreover, we believe that it is very important to systematically include an ontology about the requirements elicitation process in the methodology, to help stakeholders to discover more about the process and its components. We know of the existence of certain works in this field which we hope will be published quite soon.
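As a toy illustration of how a domain ontology can reduce terminological ambiguity between sites that share a language, the sketch below stores, for each concept, a definition, the term used at each site and its relations to other concepts. The structure, the function names and the entries are hypothetical and far simpler than a real ontology. The two regional term pairs are common examples of vocabulary that differs between Spain (ES) and Argentina (AR).

```python
# Hypothetical mini-ontology; entries are illustrative only.
ONTOLOGY = {
    "Computer": {
        "definition": "Programmable device on which the system will run.",
        "terms": {"ES": "ordenador", "AR": "computadora"},
        "related_to": ["MobilePhone"],
    },
    "MobilePhone": {
        "definition": "Portable telephone used to access the system remotely.",
        "terms": {"ES": "móvil", "AR": "celular"},
        "related_to": ["Computer"],
    },
}


def concept_for(term: str):
    """Return the concept that a regional term refers to, or None if unknown."""
    for concept, entry in ONTOLOGY.items():
        if term.lower() in entry["terms"].values():
            return concept
    return None


def translate(term: str, target_site: str):
    """Map a term used at one site to the term used at the target site."""
    concept = concept_for(term)
    if concept is None:
        return None
    return ONTOLOGY[concept]["terms"].get(target_site)
```

A term that is missing from the ontology returns `None`, which marks it as a candidate for clarification in the next elicitation cycle rather than something to be guessed.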

(C) Technology selection
Although ontologies play an important role in the requirements elicitation process, because they help to clarify the structure of knowledge and reduce conceptual and terminological ambiguities [11,24], when the stakeholders' first language is not the same we propose analyzing the degree of knowledge of a common language. If this level is intermediate or less, we propose restricting communication to asynchronous tools, in order to give people the chance to read and write with greater care. Also, when the time difference is wide enough, communication technologies are commonly reduced to asynchronous ones, since synchronous communication is usually not possible. Furthermore, in order to improve communication, we propose going a step further and using knowledge about stakeholders' cognitive characteristics to choose the groupware tools and requirements elicitation techniques that are closer to the way in which they learn [2,3,4].
To do so, our methodology includes a selection process which uses fuzzy logic and fuzzy sets [1] to obtain rules from a set of representative examples, in the form of patterns of behaviour. The selection process is divided into two stages: the first is independent of any project and comprises phases 1 to 4, and the second is dependent upon a given project and covers phases 5 and 6, as is shown in Figure 3.
Phases 1 to 3 deal with looking for a set of examples, which are real data about stakeholders' preferences in their daily use of groupware tools and requirements elicitation techniques. We then analyze the data by using the machine learning algorithm proposed in [10], in which each example is turned into an initial rule and a finite set of fuzzy rules is iteratively found, which reproduces the input-output behaviour of the system (Phase 4). This algorithm was designed to obtain rules with a maximum degree of generality, and it then reduces the antecedent part as much as possible so as to obtain rules that can be easily understood and that closely approximate real-life examples. As we have mentioned previously, phases 1 to 4 constitute the project independent part, in which the example and preference rule databases can be improved through surveys and applied to different GSD projects.
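To give an idea of what a fuzzy preference rule looks like once it has been obtained, the sketch below evaluates a few rules against a stakeholder's learning-style scores. It does not reproduce the learning algorithm of [10], which is not described here. The rules, the tool names, the score range of -11 to 11 and the linear membership function are all assumptions made for illustration.

```python
# Assumed convention: each learning-style dimension is a score in [-11, 11];
# positive values lean towards the first pole named in the dimension key.
POSITIVE_POLE = {"active", "visual"}

# Hypothetical preference rules: (antecedents, suggested groupware tool).
RULES = [
    ({"visual_verbal": "visual", "active_reflective": "active"}, "videoconference"),
    ({"visual_verbal": "verbal", "active_reflective": "active"}, "chat"),
    ({"visual_verbal": "verbal", "active_reflective": "reflective"}, "email"),
]


def membership(score: float, label: str) -> float:
    """Degree in [0, 1] to which a score belongs to the fuzzy set `label`."""
    signed = score if label in POSITIVE_POLE else -score
    return min(1.0, max(0.0, signed / 11))


def evaluate(profile: dict) -> dict:
    """Return, for each tool, the strength of the strongest rule suggesting it.

    The antecedents of a rule are combined with the fuzzy AND (minimum).
    """
    strengths = {}
    for antecedents, tool in RULES:
        degree = min(membership(profile[dim], label)
                     for dim, label in antecedents.items())
        strengths[tool] = max(strengths.get(tool, 0.0), degree)
    return strengths
```

For example, `evaluate({"visual_verbal": 9, "active_reflective": 5})` gives the videoconference rule a strength of 5/11 and the other two rules a strength of 0, because a rule is only as strong as its weakest antecedent.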
The remaining phases consist of the application of our guide to a specific GSD project during a requirement elicitation process, and this is therefore called the project dependent stage. In this stage, we obtain the personal preferences of each person who is going to work in a given virtual team (Phase 5) and store them in a database which can be accessed every time people need to communicate with each other.
The technology selection process is carried out by studying and comparing the personal preferences of people who need to work together. This is done by means of an automatic tool that chooses and suggests the most appropriate technology (Phase 6). As we have explained previously, such strategies must take into account other factors besides the stakeholders' cognitive profiles, such as time differences between sites, the degree to which a common language is shared, and the current situation in the requirement elicitation process.
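One possible way to compare the preferences of people who need to work together is sketched below. It is not the automatic tool mentioned above. The tool names, the preference degrees and the maximin aggregation (choosing the tool whose least satisfied member is most satisfied) are assumptions, and the restriction to asynchronous tools is passed in as a flag derived from the time and language factors.

```python
# Hypothetical classification of groupware tools.
SYNCHRONOUS = {"videoconference", "chat", "phone"}
ASYNCHRONOUS = {"email", "forum", "wiki"}


def select_tool(preferences: dict, asynchronous_only: bool) -> str:
    """Suggest one tool for a group of people who need to communicate.

    `preferences` maps each member to {tool: degree in [0, 1]}; tools that a
    member did not rate count as 0. The tool with the highest minimum degree
    across members is chosen, with ties broken alphabetically.
    """
    candidates = ASYNCHRONOUS if asynchronous_only else SYNCHRONOUS | ASYNCHRONOUS

    def team_score(tool: str) -> float:
        return min(member.get(tool, 0.0) for member in preferences.values())

    return max(sorted(candidates), key=team_score)


# Illustrative preference degrees for two members of a virtual team.
preferences = {
    "analyst": {"videoconference": 0.9, "email": 0.4, "forum": 0.6},
    "client": {"videoconference": 0.8, "email": 0.7, "forum": 0.5},
}
```

With these illustrative values the sketch suggests videoconference when synchronous tools are allowed and forum when communication is restricted to asynchronous tools. A weighting by role, as mentioned in the considerations about Form 2, could replace the plain minimum.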

Validating the Proposal
In order to validate the first two stages of our methodology we have carried out an experiment in which 24 computer science post-graduate students took part. Twelve of them were from the University of Comahue (Argentina) and the rest were from the University of Castilla-La Mancha (Spain). The students were divided into eight teams of three people each, in which two Spanish students played the role of analysts and the other, from Argentina, played the role of user or client. The client communicated the requirements of a system to the analysts by means of a groupware tool. Each team had to face the same challenges: they had 4 hours of time difference, they had the same difference in timetables, and there were also cultural differences (for instance, although they share a common language, Argentinean people are often more formal than Spanish people, who like to get straight to the point, and the pronunciation and certain vocabulary are also different).
The distribution of the students was performed by considering their experience in requirement elicitation, and their gender and age, in an attempt to obtain teams with similar features. Another factor that we took into account was their learning styles, which were obtained from the Felder-Silverman test.
Since we only had eight teams, we had to reduce the independent variables to two, in order to have more than one group for each treatment. We then decided to test the use of groupware tools and the use of ontologies and their effect on the requirements elicitation process, by fixing the rest of the variables (time difference, culture difference, language difference and requirements elicitation techniques). After doing so, we defined four treatments:
- T1: using appropriate groupware and a domain ontology
- T2: using appropriate groupware without using a domain ontology
- T3: using non-appropriate groupware and a domain ontology
- T4: using non-appropriate groupware without using a domain ontology
The eight teams were randomly assigned to the four treatments (two teams for each treatment). Team members did not know that some groupware tools were supposed to be more suitable for their cognitive profiles, and team members in treatments T2 and T4 did not know that teams in treatments T1 and T3 could consult a domain ontology. We did so to avoid preconceived ideas, so that we could evaluate whether the teams using the more suitable groupware tools and the domain ontology performed better than the rest of the teams.
When the teams finished their work and sent the requirements specification they had written, we asked them to fill in a post-experiment questionnaire to obtain their perception of the communication with people in their group. Satisfaction with communication was defined as an ordinal variable on the scale 0-4 (0=very bad, 1=bad, 2=acceptable, 3=good, 4=very good) and the resulting data (24 questionnaires) were analyzed and summarized by teams (G1-G8) and treatments (T1-T4), as can be seen in Table 2. As previously explained, we had assigned groupware tools by means of a set of preference rules obtained in previous surveys [5]; we therefore expected that people using the groupware tool suggested by our rules (T1, T2) would feel more comfortable than those who did not. Similarly, we expected that people using the domain ontology (T1, T3) would feel more comfortable than those who did not. Table 2 shows that our expectations were met, since satisfaction with the communication media was higher for those treatments where groupware suitability was high. Moreover, it was observed that satisfaction for the teams that used domain ontologies was as good as or better than for the teams that did not.
In Figure 4 a comparison between means is shown, where it is noticeable that stakeholders' satisfaction with communication for treatments with higher groupware suitability (T1 and T2) is better than for the rest. Our current work focuses on analyzing the results in more detail, to determine whether the differences between treatments are significant, as well as analyzing their correlation with the quality of the specifications according to the judgment of experts.
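The two-factor design and the random assignment of the eight teams (G1-G8) to the four treatments can be sketched as follows. This is only an illustration of a balanced random assignment: the function name and the optional seed are hypothetical, and the sketch does not reproduce the assignment actually used in the experiment.

```python
import random

# The four treatments as combinations of the two independent variables.
TREATMENTS = {
    "T1": {"appropriate_groupware": True, "domain_ontology": True},
    "T2": {"appropriate_groupware": True, "domain_ontology": False},
    "T3": {"appropriate_groupware": False, "domain_ontology": True},
    "T4": {"appropriate_groupware": False, "domain_ontology": False},
}


def assign_teams(teams, seed=None) -> dict:
    """Randomly assign teams to treatments, the same number to each one.

    The number of teams is assumed to be a multiple of the number of treatments.
    """
    rng = random.Random(seed)
    shuffled = list(teams)
    rng.shuffle(shuffled)
    per_treatment = len(shuffled) // len(TREATMENTS)
    assignment = {}
    for i, treatment in enumerate(TREATMENTS):
        for team in shuffled[i * per_treatment:(i + 1) * per_treatment]:
            assignment[team] = treatment
    return assignment


assignment = assign_teams([f"G{i}" for i in range(1, 9)])
```

Shuffling first and then slicing guarantees two teams per treatment, which independent random draws per team would not.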

Conclusions and Future work
Many software organizations have adopted a distributed structure for software development in which members are spread across distant sites and communicate through groupware tools. In such environments, software development projects are affected by many factors that complicate communication, and new methodologies thus need to be developed to improve the requirements elicitation and development processes by considering the main difficulties they have to deal with.
With this aim in mind, we propose a methodology that extends previous generic models for requirements elicitation processes by considering the special conditions that take place in GSD.The main contribution of our methodology is the phase in which we focus upon the detection and analysis of possible problems (in a given virtual team), and we consequently suggest a set of strategies with which to minimize them.Furthermore, we consider stakeholders' cognitive aspects in order to select the technology that is best for them.This is significant because, as stakeholders might feel more comfortable expressing themselves when using a tool closer to the way in which they perceive and reason about the world, information gathered during the requirements elicitation process is expected to be more accurate and therefore the final product is closer to the clients' and users' needs.
In order to validate our approach, we are currently analyzing data obtained from a recent controlled experiment that involved 24 participants from universities in Spain and Argentina. The main goal of this experiment was to validate the performance of domain ontologies and groupware tool selection as communication facilitators in a simulated global environment. Our future work will focus upon carrying out replications of this experiment in which people of more nationalities will be involved and in which the time and cultural differences between them will be greater than in our current experiment.
To sum up, carrying out a "global experiment" is not an easy task, especially in the case of the requirements elicitation process, which may take many days or even weeks to finish, even in a simulated scenario. Moreover, when the subjects under study are students, many circumstances have to converge in order not to disturb their learning. On the other hand, carrying out the experiment in real industrial scenarios is even more challenging, as companies are often unwilling to spend time or money on allowing their employees to take part in this kind of experiment. However, after evaluating the results of our experiment, we will focus upon carrying out similar experiments for the remainder of the strategies, and we later plan to apply our methodology in industry in order to test its performance in real projects.

Form 2: Stakeholder's Work Information Form

- Which groupware tools are commonly used in the organization?
- Have stakeholders received training in the use of groupware tools?
- Which tools do they know best?
- Which tools have they not used before?
- Does any policy exist in the organization which limits the use of groupware tools? How does it do so?
- Are stakeholders willing to learn how to use other groupware?

Requirements elicitation techniques:
- Which requirement elicitation techniques are commonly used in the organization?
- Have stakeholders received training in the use of requirement elicitation techniques?
- Which techniques do they know best?
- Which techniques have they not used before?
- Are stakeholders willing to learn new techniques?

Figure 3 Phases to define and analyze personal preferences to choose appropriate technology in Virtual Teams

Figure 4 Phases to define and analyze personal preferences to choose appropriate technology in Virtual Teams

Table 1 Problems that are minimized for each strategy

Once possible sources of problems are detected, we use this information to suggest strategies to minimize them. The strategies we suggest are:
Strategy A: Training about cultural diversity
This strategy includes different approaches with the goal of making people aware of other customs and teaching them how to deal with people from the different cultures of the members that form the virtual team.

Table 2 Mean values for satisfaction about communication and written specifications