Ranking XP Prioritization Methods based on the ANP

The analytic network process (ANP) is considered
one of the most powerful tools to facilitate decision-making
in complex environments. The ANP allows decision makers to
structure their problems mathematically using a series of simple
binary comparisons. Research suggests that ANP can be useful in
software development, where complicated decisions are routinely
made. Industrial adoption of ANP, however, is virtually nonexistent
because of its perceived complexity. We believe that
ANP can be very beneficial in industry as it resolves conflicts
in a mutually acceptable manner. We propose a protocol for its
adoption by means of a case study that aims to explain a ranking
method to assist an XP team in selecting the best prioritization
method for ranking the user stories. The protocol was tested in
a professional course environment.


I. INTRODUCTION
Extreme Programming (XP) is a popular agile method based on taking 12 practices to their extreme in order to produce high-quality software. One of these practices is the planning game, in which XP team members meet to identify the system requirements. These requirements are written as user stories. According to Cohn, user stories are "short descriptions of functionality told from the perspective of a user that are valuable to either a user of the software or the customer of the software" [1]. These user stories are significant because they make it easy to structure a general framework for the system. They do this by testing the designed software against identified user stories. A development team reviews the written stories in order to ensure that the domain-specific information is adequate for the implementation. Using story points, the development team evaluates user stories to specify the cost and complexity of the implementation. Developers then break down the user stories into small tasks. Both developers and customers work together to prioritize user stories according to their business value.
Developers and customers usually agree on a well-known prioritization method in order to reconcile conflicting perspectives among them [2]. This selection, however, is often not based on a formal approach. Well-known methods include the numeral assignment technique, weighted criteria analysis, binary search tree, requirements triage, dot voting, pair-wise analysis, top-ten requirements, and the Kano model.
In this paper, the ANP is used to formalize the process of ranking the prioritization techniques that can be used to prioritize the system requirements. In this study, five prioritization techniques are selected as alternatives: Kano Model, Relative Weighting, Top-Ten Requirements, 100-Dollar Test, and MoSCoW.

II. RELATED WORK
Requirements may be prioritized based on various features. There is no consensus on the importance of these features in the process. Developers seek to increase the value delivered to the user by making the most suitable decision.
Based on a survey written by Wohlin and Aurum [3], Hoff et al. [4] introduced other features that influence the decision. According to Wohlin and Aurum [3], factors like delivery dates, stakeholder priority of requirement, and development cost-benefit were found to be the most significant features. Hoff et al. [4] presented features such as impact of maintenance, complexity, increased performance, and cost-benefit to the organization. Probability of success, testability, impact to the organization, and prior errors addressed are other factors added by Hoff et al. [4]. The authors investigated which features were the most significant by conducting a comprehensive survey. At the end of their study, the authors identified the most significant features when prioritizing system requirements for implementation. These factors were complexity, cost-benefit to the organization, delivery date/schedule, requirement dependencies, and fixes errors.
Boehm et al. [5] considered the cost of requirement implementation to be the most important feature when prioritizing system requirements. These costs involve aspects such as quality, documentation, stable requirements, availability of reusable software, complexity, and time-frame. Different factors affecting requirements prioritization have been introduced by Firesmith [6]. These factors include risk, time to market, personal preferences, requirements stability, legal mandate, dependencies, difficulty, business value, type of requirement, and frequency of use. Bakalova et al. [7] proposed various factors to be acknowledged when prioritizing requirements. These factors include effort estimation regarding size, input from developers, the context of the project, associated dependencies, external changes, and the prioritization criteria. For the prioritization criteria, the authors concentrated on business value, negative value, and risk as estimated by the user.
Patel and Ramachandran [8] ranked user stories based on market value, business risk, business functionality, customer priority, core value, and implementation cost. Wiegers [9], in contrast, prioritized requirements according to the risk associated with the implementation, the system benefits, technical cost, and penalties.
Carlshamre et al. [10] examined requirement interdependencies in an in-depth study. The authors presented the requirement interdependencies within various sets of requirements. The findings showed that 20% of the requirements are responsible for more than 70% of the interdependencies. The authors also argued that requirement interdependencies should be considered the most important factor when prioritizing requirements.

III. METHODOLOGY
The main objective of this research is to investigate how the analytic network process might be used to rank XP prioritization methods. The case study methodology, which is explained in [11], is the chosen research methodology.
The following research questions provide more focus for the case study:
1) How can the ANP assist in ranking the prioritization techniques in order to prioritize user stories?
2) How does the ANP influence the development team's communication and productivity?
Moreover, the study propositions are as follows:

Proposition 1:
The ANP captures the significant criteria and alternatives that affect the ranking of XP prioritization methods. In addition, the results of using the ANP display the order of alternatives and criteria based on their importance.

Proposition 2:
The ANP fosters creative debate and enhances team communication.

Proposition 3:
The ANP resolves conflicting perspectives within the development team during the ranking process.
After determining the study propositions, the criteria for interpreting the findings should be determined as well [4]. When the final findings are analysed, they are compared to the initial propositions to decide whether they match. Therefore, the criteria for interpretation are:

P1:
• Research shows that, for ranking requirements prioritization methods, the ANP introduces the criteria and alternative clusters and their level of relation.
• The ANP's findings are displayed precisely, with an order for both alternatives and criteria.

P2:
• Evidence shows that applying the ANP in planning game practice is simple and understandable.

P3:
• Evidence shows that the ANP helps create an environment of debate within the development team, which helps members share more knowledge.

P4:
• Evidence indicates that the ANP helps everyone's voice in the team be heard and resolves conflicting perspectives within the development team during the ranking process.
From the above questions, we derived the units of analysis for our study. The main objective is ranking various XP prioritization methods that can be applied to prioritize user stories. Accordingly, evaluating and ranking are two units of analysis. Another is the participants' perspective of the ANP benefits in each practice. Therefore, the design of this case study includes multiple cases, embedded with multiple units of analysis. The logic linking the collected data to the study propositions is shown at the end of this paper.

IV. DATA COLLECTION AND SOURCES
At the beginning of each use of the ANP in extreme programming, we identified the criteria that influence the ranking process and that help investigate the ANP's capabilities and advantages. Data was collected by searching previous studies and reviewing the literature. In addition, data triangulation was adopted in order to increase the validity of the study.
The major data source of this research is an extreme programming project conducted during the winter semester of 2016 at the University of Regina. The data sources in this research are:
• Questionnaires given to the students during the development of the XP project.
• Archival records, such as study plans, from the students.
• Comments from the customer.
• Open-ended interviews with the students.

V. CASE STUDY
The case study was conducted during a 12-week Winter 2016 semester at the University of Regina. Several studies, such as [12], [13], and [14], reported that a suitable XP team size is between three and seven members. Moreover, Ambler [15] emphasized that the success rate of agile projects is 83% with a team size of fewer than eleven members, and that the percentage decreases as the team grows beyond eleven people. The main cause of this reduction in the success rate is a lack of communication or misunderstanding in large teams. Therefore, this case study included 12 graduate students from the University of Regina and one additional participant, a client. These students had intermediate knowledge of the extreme programming process and practices, and different programming levels. The majority of these students were part of a professional program, meaning that their graduate degree was part of their professional development and that they had previous employment experience in the software industry. Some of these students were continuing to work part-time. The participants' backgrounds included various programming languages such as C++, Java, and PHP.
The participants were organized into two teams. The first team used the ANP method to make their decisions in the mentioned areas, and the second team followed the traditional XP method. Both teams were asked to develop a project called "Professors' Availability Managing System", complete with a set of requirements. The project was developed in 5 iterations, allowing two weeks for each. At the end of the project, the two teams had implemented all system requirements. The participants were asked to evaluate all user stories with each prioritization technique before using the ANP to rank the techniques.
Assistance materials that focused on planning game practices were given to the participants in order to ensure their understanding. These materials covered prioritizing user stories, writing user stories, and making programming commitments. The ANP team was given white papers, several presentations, and other important materials about the ANP in order to allow them to apply it in their development. Team 1 practiced several pairwise comparisons to increase their understanding of the ANP structure. At the end, the researcher handed out a survey to the participants in order to collect more data about the participants' perspectives.

VI. THE ANP
According to Saaty [16], "the Analytic Network Process (ANP) is a multi-criteria theory of measurement used to derive relative priority scales of absolute numbers from individual judgments (or from actual measurements normalized to a relative form) that also belong to a fundamental scale of absolute numbers". The ANP provides a structure to present a solution for a certain problem, which leads to a decision for that problem. In the ANP method, dependencies among various criteria are considered, making it different from the Analytic Hierarchy Process (AHP) [16]. Saaty states, "in fact the ANP uses a network without the need to specify levels. As in the AHP, dominance or the relative importance of influence is a central concept. In the ANP, one forms a judgment from the fundamental scale of the AHP by answering two kinds of questions with regard to strength of dominance: 1) Given a criterion, which of two elements is more dominant with respect to that criterion, 2) Which of two elements influences a third element more, with respect to a criterion?" [16].
In pairwise comparisons, entered values reflect the relative effect among elements with respect to a control criterion. These entered values are based on the importance of each criterion. As such, "the ANP is a useful tool for prediction and for representing a variety of competitors with their explicitly known and implicitly assumed interactions and the relative strengths with which they wield their influence in making a decision. It is also useful in conflict resolution where there can be many opposing influences" [16]. The network structure consists of different clusters, and these clusters contain various nodes or elements. These clusters are connected to each other based on the relative influences among the nodes. The links can either have external relative influence, which means elements in cluster X affect elements in cluster Y, or internal relative influence, which means elements in the same cluster (e.g., X) affect each other. In this case, the external relative influence is named outer-dependence, and the internal relative influence is named inner-dependence [17]. Figure 1 gives a general idea of the ANP structure [17].
Another aspect of the ANP structure is the prioritizing of different alternatives in order to make an appropriate decision. This starts by making pairwise comparisons, based on a fundamental scale, as shown in table I. Following this, "the vector of priorities is the principal eigenvector of the matrix. This vector gives the relative priority of the criteria measured on a ratio scale. That is, these priorities are unique within multiplication by a positive constant. If one ensures that they sum to one they are then unique and belong to a scale of absolute numbers" [17]. "The consistency index of a matrix is given by $\mathrm{C.I.} = (\lambda_{\max} - n)/(n - 1)$, where $n$ is the number of alternatives. The consistency ratio (C.R.) is obtained by forming the ratio of C.I. and the appropriate one of the set of numbers shown in table II, each of which is an average random consistency index computed for $n \le 10$ for very large samples. They create randomly generated reciprocal matrices using the scale $1/9, 1/8, \ldots, 1/2, 1, 2, \ldots, 8, 9$ and calculate the average of their eigenvalues. This average is used to form the Random Consistency Index R.I." [17]. The consistency ratio (C.R.) should be lower than 0.10; otherwise, the entered judgements need to be revised.
After obtaining all priorities from the pairwise comparisons, these priorities are placed in a supermatrix. According to Saaty [17], "the supermatrix represents the influence priority of an element on the left of the matrix on an element at the top of the matrix with respect to a particular control criterion. A supermatrix along with an example of one of its general entry matrices is shown in figure 2. The component C1 in the supermatrix includes all priority vectors derived for nodes that are parent nodes in the C1 cluster" [17].

Fig. 2. The Super-matrix of a network [17]
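The priority and consistency calculations described above can be illustrated with a minimal sketch. It approximates the principal eigenvector by power iteration and then applies C.I. = (λ_max − n)/(n − 1) and C.R. = C.I./R.I. The function names, the example judgments, and the R.I. values are assumptions made for illustration: table II is not reproduced in this text, so the random index values below are figures commonly quoted from Saaty's work and should be checked against that table.

```python
# Random consistency index (R.I.) values commonly quoted from Saaty's work.
# Assumption: table II is not reproduced here, so verify these before use.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.52, 4: 0.89, 5: 1.11,
                6: 1.25, 7: 1.35, 8: 1.40, 9: 1.45, 10: 1.49}


def priority_vector(matrix, iterations=200):
    """Approximate the principal eigenvector (summing to one) and lambda_max."""
    n = len(matrix)
    weights = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        # Because the weights sum to one, sum(A w) estimates lambda_max.
        lambda_max = sum(product)
        weights = [value / lambda_max for value in product]
    return weights, lambda_max


def consistency(matrix):
    """Return (weights, C.I., C.R.) for a reciprocal pairwise comparison matrix."""
    n = len(matrix)
    weights, lambda_max = priority_vector(matrix)
    if n < 3:
        # 1x1 and 2x2 reciprocal matrices are always consistent.
        return weights, 0.0, 0.0
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return weights, ci, cr


# Hypothetical judgments on the 1-9 scale for three elements.
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights, ci, cr = consistency(judgments)
print([round(w, 3) for w in weights], round(ci, 4), round(cr, 4))
```

A matrix whose C.R. is below 0.10 would be accepted under the threshold stated above; otherwise the judgments would be revisited.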

VII. PRIORITIZATION METHODS
There are several prioritization techniques that can be used to prioritize user stories. In this paper, commonly used methods are selected as alternatives, which can be summarized as follows:

1) Top-Ten Requirements
This method is based on selecting the ten requirements that customers consider most important, ignoring the internal order of the selected requirements [19]. This is significant in resolving any conflict between the customers. Any stakeholder may have more than ten main requirements, but the challenge is that some stakeholders might not be able to specify their top priorities. This technique is more appropriate for stakeholders who have equal importance.

2) Cumulative Voting (The 100-Dollar Test)
The 100-Dollar Test technique, explained by Leffingwell and Widrig [20], is simple and straightforward. The stakeholders have 100 imaginary units (money, hours, etc.) to spread among the requirements. Regnell et al. [21] suggested using a larger number of units (1,000, 10,000, or 100,000) if the number of requirements is too high, in order to give the stakeholders greater freedom in the prioritization. After spreading the units across the requirements, stakeholders count the total for each requirement and prioritize the requirements based on the highest total.

3) Relative Weighting
This method assesses each requirement according to the impact of its being present or absent in the project. Each requirement is evaluated on a scale of 0 to 9, where 0 indicates low influence and 9 indicates high influence. The stakeholders give each feature a value for having it as well as a penalty for not having it. Then, the stakeholders compute the value of each requirement relative to the total across all requirements in order to obtain the relative value.
Similarly, the stakeholders evaluate the cost of each requirement relative to the total across all requirements in order to obtain the relative cost. In the end, the priority is given by dividing the relative value by the relative cost [22].

4) Kano Model
In 1987, the Kano method was founded by Noriaki Kano in order to organize the requirements into five groups based on asking two questions [23]:
a) "Functional question: How do you feel if this feature is present?"
b) "Dysfunctional question: How do you feel if this feature is NOT present?"
From the five options below, the customer has to select one answer for each question [24]:
a) I like it.
b) I expect it.
c) I'm neutral.
d) I can tolerate it.
e) I dislike it.

5) MoSCoW
This method prioritizes the requirements based on values from the customer's point of view. The requirements are organized into four categories as follows [25]:
• M: Must have this attribute. This is not negotiable, and without it the project is considered a failure.
• S: Should have this attribute, if possible, in order to satisfy the customer. However, the project is not considered a failure in its absence.
• C: Could have this attribute if it does not influence anything else. This is less critical, and it is nice to have.
• W: Won't have it now, but would like to have it in the future.
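Two of the methods above reduce to simple arithmetic, and a short sketch may make them concrete. The function names, data layout, and example figures are hypothetical. For Relative Weighting, combining the value and the penalty by simple addition is an assumption, since the text only states that both are assigned; costs are assumed to be greater than zero.

```python
def cumulative_voting(allocations):
    """Sum the units each stakeholder spread over the requirements.

    allocations maps stakeholder -> {requirement: units}.
    Returns (requirement, total) pairs, highest total first.
    """
    totals = {}
    for votes in allocations.values():
        for requirement, units in votes.items():
            totals[requirement] = totals.get(requirement, 0) + units
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def relative_weighting(scores):
    """Priority = relative value / relative cost for each requirement.

    scores maps requirement -> (value, penalty, cost).
    Assumption: value and penalty are added to form the total value.
    """
    total_value = sum(value + penalty for value, penalty, _ in scores.values())
    total_cost = sum(cost for _, _, cost in scores.values())
    priorities = {}
    for requirement, (value, penalty, cost) in scores.items():
        relative_value = (value + penalty) / total_value
        relative_cost = cost / total_cost
        priorities[requirement] = relative_value / relative_cost
    return priorities


# Hypothetical data for two stakeholders and two user stories.
votes = {"stakeholder_1": {"story_A": 60, "story_B": 40},
         "stakeholder_2": {"story_A": 30, "story_B": 70}}
print(cumulative_voting(votes))

story_scores = {"story_A": (9, 9, 3), "story_B": (3, 3, 3)}
print(relative_weighting(story_scores))
```

In the second example both stories cost the same, so the story with the higher combined value and penalty receives the higher priority.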

VIII. PROPOSED CRITERIA FOR RANKING
To rank each prioritization technique, it is important to identify the criteria that affect the ranking process. These criteria are compared to show their interdependencies and are compared with respect to each alternative or prioritization technique. The prioritization techniques are compared with respect to the criteria in order to show the feedback in relation to the ranking process. In this paper, four criteria are proposed for ranking the prioritization techniques; however, different studies might apply the same methodology with different criteria. These four criteria are:
1) Accuracy: Which prioritization technique gives the most accurate outcomes?
2) Simplicity: What is the simplest prioritization method to understand and to apply?
3) Collaboration: Which prioritization method has the highest degree of collaboration between the team members?
4) Time: Which prioritization method saves the most time when prioritizing the user stories?

IX. ANP STRUCTURE FOR RANKING PRIORITIZATION METHODS
Structuring the problem in a network is the first step in the analytic network process. The network consists of three [...]. Figure 3 shows the ANP network for ranking the prioritization techniques. Next, the suitable ANP tables were generated, and all ANP team members received the tables. The ANP team was asked to fill out the pairwise comparisons based on the ANP fundamental scale that was described previously. General information, such as each member's experience and programming level, was collected on each cover page. The ANP participants were also asked to compare the criteria among each other with respect to each prioritization method. The participants then used a matrix in order to compare the selected criteria.
Accordingly, the participants were asked to use the prioritization techniques during the whole project development in order to experience the advantages and disadvantages of each technique. After that, the participants evaluated each prioritization technique based on the four criteria. This was achieved by giving the participants the suitable ANP tables and the other supporting materials mentioned above.
The participants first evaluated the four prioritization criteria with respect to each prioritization method using the Saaty scale that was described in table I. An example of the questions given to the participants is:
• With respect to MoSCoW, which criterion is more important, collaboration or simplicity, and by how much?
After completing the criteria evaluation, the participants then compared the prioritization methods with respect to each criterion. An example of the questions given to the participants is:
• With respect to simplicity, which method is simplest, Kano Model or Relative Weighting, and by how much?
The same comparisons and questions were repeated for all prioritization techniques and criteria.
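To give a sense of the questionnaire size implied by the two kinds of questions above, the following sketch enumerates every pairwise question for the four criteria and five prioritization methods. The function name and the exact question wording are illustrative assumptions, and the count covers only these two question types; it is not a reproduction of the tables actually handed to the participants.

```python
from itertools import combinations

CRITERIA = ["Accuracy", "Simplicity", "Collaboration", "Time"]
METHODS = ["Kano Model", "Relative Weighting", "Top-Ten Requirements",
           "100-Dollar Test", "MoSCoW"]


def build_questions(criteria, methods):
    """List every pairwise question for both directions of comparison."""
    questions = []
    # Criteria compared pairwise with respect to each prioritization method.
    for method in methods:
        for first, second in combinations(criteria, 2):
            questions.append(
                f"With respect to {method}, which criterion is more important, "
                f"{first} or {second}, and by how much?")
    # Methods compared pairwise with respect to each criterion.
    for criterion in criteria:
        for first, second in combinations(methods, 2):
            questions.append(
                f"With respect to {criterion}, which method is preferred, "
                f"{first} or {second}, and by how much?")
    return questions


questions = build_questions(CRITERIA, METHODS)
print(len(questions))  # 5 * 6 + 4 * 10 = 70
```

Under these assumptions each participant would answer 70 pairwise questions, which is consistent with the concern about the number of comparisons that participants later raised in the interviews.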

X. FINDINGS AND RESULTS
The prioritization methods were evaluated by each participant in Team 1 according to the mentioned criteria. The Super Decisions software [26] was used to compute the aggregated results for the ANP team. For Team 1, according to the criteria, the ranking of the prioritization methods was as follows: first, Kano Model; second, Top-Ten Requirements; third, Relative Weighting; fourth, MoSCoW; and fifth, 100-Dollar Test. Table III shows these results. Using the Super Decisions software, we were able to analyse the importance of each criterion based on all prioritization techniques, which was as follows: first, simplicity; second, collaboration; third, time; and fourth, accuracy. Figure 4 exhibits these findings.
For Team 2, the participants were asked to follow the traditional method in their decisions and were therefore asked to document each step in their process in terms of how and why each decision was made. Most of their decisions were made based on in-depth discussions and voting. Team 2's results show that the MoSCoW technique was given the highest rank among the prioritization techniques. Table IV displays the ranking of the prioritization methods by Team 2. In addition, when Team 2 was asked what the most important factor for ranking the prioritization techniques was, they ranked collaboration at the top. Table V shows the ranking of the criteria by Team 2.
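The text does not state how the individual judgments were combined into a team result. One approach commonly used in AHP/ANP practice is the element-wise geometric mean of the members' pairwise comparison matrices, which keeps the aggregated matrix reciprocal. The sketch below assumes that approach purely for illustration; it is not a description of what the software did in this study, and the function name and sample judgments are hypothetical.

```python
import math


def aggregate_judgments(matrices):
    """Element-wise geometric mean of several pairwise comparison matrices."""
    count = len(matrices)
    size = len(matrices[0])
    return [[math.prod(m[i][j] for m in matrices) ** (1.0 / count)
             for j in range(size)]
            for i in range(size)]


# Hypothetical judgments from two team members comparing two elements.
member_1 = [[1.0, 2.0], [0.5, 1.0]]
member_2 = [[1.0, 8.0], [0.125, 1.0]]
team = aggregate_judgments([member_1, member_2])
print(team)
```

The aggregated entry for the first element over the second is the square root of 2 × 8, and the opposite entry is its reciprocal.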

A. ANP Ranking Results
With respect to the four criteria, Team 1 ranked the Kano model as the highest prioritization technique. They ranked the top-ten requirements technique second. The relative weighting technique was ranked in the third position, and MoSCoW and the 100-dollar test were in the fourth and fifth positions respectively. Team 2 ranked the MoSCoW technique as the highest prioritization technique based on the traditional method of XP. Similar to Team 1, Team 2 ranked the top-ten requirements technique in the second position, followed by the Kano model in the third position. The 100-dollar test and relative weighting techniques were ranked in the fourth and fifth positions respectively. Moreover, when Team 2 members were asked about the most important criteria, they gave the collaboration factor the highest importance, while simplicity was considered the least important factor. In contrast, Team 1 considered simplicity the most important factor, with collaboration in the second position.
When considering each criterion individually, it was noted that the 100-dollar test technique was given the top score in terms of accuracy by Team 1. The Kano model was ranked highest with respect to time, simplicity, and collaboration. However, Team 2 ranked the MoSCoW technique highest with respect to the time criterion. These results show the choices made by each team. Rankings were completed individually; however, the consistency ratios were similar across the group.

B. Interview Results
After completing the project, the results of the ANP evaluation for ranking the prioritization methods were shown to the participants in order to conduct the interviews. Not all results were as expected, and some findings were surprising. The interviews involved open-ended questions in order to collect the participants' perspectives about the ANP, their perspectives on its benefits and disadvantages in XP, as well as their opinions about the best application for the ANP in XP among all mentioned practices. The collected data comprised handwritten notes from the interviews.
The interview results show positive comments from the participants regarding the ANP. The ANP was a helpful tool in resolving conflicting perspectives, and it encouraged each team member to participate in making decisions. The main concern was the time taken by the ANP evaluation and the number of pairwise comparisons. Another recommendation was to apply the ANP in more XP practices and study the effects. All ANP team members recommended using the ANP in their future XP projects.
On the other hand, Team 2 was not completely satisfied with their decision process. Some of the team members complained that the most experienced member had more voting weight than others, which led them to follow decisions that they may not have liked. Another issue is that the ANP allowed us to express the difference between each ranking position as a percentage; however, Team 2 could not specify the amount of difference between each ranked technique and criterion.

C. Questionnaires
Questionnaires were distributed among the participants in order to collect their experiences and viewpoints regarding the ANP. The questionnaires consisted of two sections. The first section included questions about the ANP as a ranking and decision tool, such as capturing the needed information, goodness of the decision structure, clarity of the criteria involved, and clarity of the alternatives involved. The second section included questions about the benefits of each extreme programming practice and the students' satisfaction, such as enhancing team communication, clarifying the ranking problem, creating positive discussion and learning chances, team performance, and satisfaction with the final results of the ANP. In this study, a seven-point Likert scale was used in order to determine the acceptability level of the ANP tool as follows: 1) Totally unacceptable. 2) Unacceptable.
After completing the questionnaire, the same steps were followed as in [27] in order to aggregate the collected data and display the total acceptability percentage. The total acceptability percentage (TAP) can be obtained as follows:
$\mathrm{TAP} = \text{average score} \times \frac{100}{7}$,
where the average score is the sum of all scores given by the team members divided by the number of team members.
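As a small worked illustration of the TAP formula, the sketch below computes the percentage from a list of seven-point Likert scores. The function name and the sample scores are hypothetical and are not the questionnaire data collected in this study.

```python
def total_acceptability_percentage(scores, scale_max=7):
    """TAP = average score * 100 / 7 for a seven-point Likert scale."""
    average_score = sum(scores) / len(scores)
    return average_score * 100 / scale_max


# Hypothetical scores from four respondents for one questionnaire statement.
sample_scores = [5, 6, 5, 5]
print(total_acceptability_percentage(sample_scores))
```

With these sample scores the average is 5.25, so the statement would be reported with a TAP of 75%.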
The following percentages show the level of acceptability for the ANP as a ranking and decision tool:
• Enhancing team communication: 75%.
• Clearing up conflict perspectives among the team members: 89%.
• Satisfaction of the ANP final results: 71%.
The data was collected from different data sources. We analyze this collected data by comparing it with the study propositions, based on the criteria for interpretation mentioned above. The following are the study propositions and their answers:
• For the first proposition, we can see that both the alternatives and criteria are structured sufficiently and are considered in figure 3. Also, the results and objectives accomplished by using the ANP to rank the prioritization methods can be seen in table III, which shows the ANP team's ranking of the XP prioritization techniques; the Kano model was ranked highest.
• The questionnaire statement 'satisfaction of the ANP final results' supported the second proposition, and the feedback on this was positive, at 71%. Moreover, the statement 'clearing up conflict perspectives among the team members' supported the third initial proposition, and the score was 89%.

XII. VALIDITY
In this section, related threats to validity are explained. These threats are construct validity, external validity, internal validity, and reliability. Several researchers emphasized that case studies are difficult to analyze due to biases and validity threats; as described in [28], "empirical studies in general and case studies in particular are prone to biases and validity threats that make it difficult to control the quality of the study to generalize its results".

A. Construct Validity
Construct validity ensures that "the treatment reflects the construct of the cause well, and the outcome reflects the construct of the effect well" [29]. It deals with matching the concept being researched and studied to the specific measurements. The small number of participants is the main threat to this case study.
Using various methods to ensure the validity of the results reduced this threat. Some of these methods are:
• Data triangulation: a major advantage of a case study is the opportunity to use several sources of evidence [30]. A chain of evidence is built by using interviews and surveys with various types of participants with different skills and experience levels, together with participants' comments and many observations. Therefore, a valid conclusion can be reached.
• Methodological triangulation: engaging a combination of research methods such as conducting an XP project to serve the study purpose, surveys, results of ANP pairwise comparisons, researchers' observations, and interviews.
• Member checking: showing the findings to the participants is recommended. This concern was addressed by presenting the final findings to all students in order to guarantee the accuracy of the study and to avoid researcher bias.

B. Internal Validity
Internal validity is about making sure the outcome is caused by the treatment (the effect). This type of validity is only related to explanatory case studies. This issue may be addressed by linking all data sources to the research questions, and linking the research questions to the research propositions.

C. External Validity
External validity ensures the relationship between the construct and the effect in order to guarantee that the experiment can be generalized to a different scope [29]. In this study, additional case studies will need to be conducted in different environments, such as industry, in order to involve more experts from the field. Conducting such case studies will help in comparing the various results and findings from different environments. Future work will add to increased external validity.

D. Reliability
Reliability deals with the procedure of data collection and findings. Other researchers should arrive at similar conclusions and results when following the same procedure. This can be done through the availability of the same research questions, data collection, and case studies designed by other researchers.

XIII. CONCLUSION
After applying the ANP with extreme programming in order to rank the most popular user story prioritization techniques, the participants found that the ANP was a beneficial tool to assist stakeholders in ranking the prioritization methods. Specifying the related criteria that affect the prioritization methods, such as simplicity, collaboration, accuracy, and time, might benefit the XP team members. The Kano model technique was the most preferred method for the ANP team in this case study. The ANP team also considered simplicity the most important criterion. The traditional XP team, on the other hand, ranked the MoSCoW method as the top alternative and considered collaboration the most important criterion.
Using the ANP tool, the XP team was able to evaluate each prioritization method with respect to different aspects. Moreover, the ANP allowed us to specify the difference between each element in our model as a percentage, while the traditional XP team was not able to do that. Furthermore, the traditional team ranked the prioritization methods by considering only the time criterion, without considering the other criteria in their decision. However, the ANP allowed Team 1 to rank the alternatives based on a multi-criteria decision-making approach, which helped the team rank the alternatives while considering different aspects. The ANP helped the team members resolve conflicts based on a structured approach grounded in scientific principles. The ANP ended up simplifying decision making, which maximized the effect of the software being developed. Given the participants' background and their reaction to the results from this case study, we believe that this protocol can be transferred into industry. Thus, we look forward to extending this approach to an industrial case.

Fig. 1. The analytic network process structure [17]

Fig. 4. The importance of the criteria by Team 1

TABLE IV. PRIORITIZATION METHODS RANKING BY TEAM 2

TABLE V. THE IMPORTANCE OF THE CRITERIA BY TEAM 2