Prioritizing the refactoring need for critical component using combined approach

Article history: Received March 16, 2017. Received in revised format: August 20, 2017. Accepted August 27, 2017. Available online August 29, 2017.
Refactoring is one of the most promising strategies for easing the maintainability issues of software. Owing to the lack of a proper design approach, code often inherits bad smells that may lead to improper functioning, especially when the code is subject to change and requires maintenance. Many studies have sought to optimize the refactoring strategy, since refactoring is also a very expensive process. In this paper, a component-based system is considered, and a Fuzzy Multi Criteria Decision Making (FMCDM) model is proposed that combines subjective and objective weights to rank the components by their urgency of refactoring. The JDeodorant tool is used to detect code smells in the individual components of a software system. The objective method uses the Entropy approach to rank the components that contain code smells. The subjective method uses the Fuzzy TOPSIS approach, based on decision makers' judgement, to identify the criticality and dependency of these code smells on the overall software. The suggested approach is implemented on component-based software having 15 components. The constituent components are ranked based on their refactoring requirements.
© 2018 Growing Science Ltd. All rights reserved.


Introduction
The quality of a software system is an important aspect on which the software industry depends for its survival in the current scenario of increasing complexity, lower cost, and universality. Software development consists of two major parts: in the first, the software is developed; in the second, it is maintained, which includes evolving the software according to customer needs. Optimizing software evolution with respect to time and cost is a challenging task (Khomh et al., 2009). Adding new functionality to a software system makes its design inconsistent and therefore violates the blueprint of the original design. Enhancing software leads to the addition or deletion of methods, which degrades the design and structure of the software; this degradation is called an anti-pattern (Li & Shatnawi, 2007; Mansoor et al., 2013; Peters & Zaidman, 2012). An anti-pattern is defined as a commonly occurring solution to the enhancement of the functionality of the software that always produces an adverse effect on the system. When an anti-pattern occurs in a software system, it is detected through code smells (Rahman et al., 2012). A code smell is not an anti-pattern at the design level, but it can be one at the programming level (Carneiro et al., 2010). Certain code smells, such as duplicate code and parallel inheritance, occur due to poor programming practices rather than a lack of design in the software system (Moha et al., 2010; Van Emden & Moonen, 2002).
According to Fowler and Beck (1999), a code smell should be removed by applying the appropriate refactoring technique to enhance the quality of the source code. Refactoring is the process of changing the structure of the code without changing program behavior (Mens et al., 2007). Refactoring not only increases the quality of the software but also increases the reusability of the code (Murphy-Hill et al., 2012). It gives developers more flexibility to alter their programs when user requirements change. Refactoring does not come for free; many challenges arise when refactoring a large code base, such as the interdependency of modules and the lack of tool support (Alshayeb, 2009). A lot of research has been done to prioritize and optimize refactoring. Along with selecting an appropriate refactoring strategy, it is equally important to decide the order in which refactoring should be done. This requires techniques that help determine the critical modules or components of a software system, so that more refactoring effort can be put into them to improve the quality of the software. In this paper, a medium-size component-based software system is considered, and code smells are detected in the individual components of the system using JDeodorant, an Eclipse-based tool. In the objective approach, the Entropy method is applied to rank the critical components based on the code smells present in each component. Expert opinion is then considered to rank the components based on attributes of the components and code smells, namely priority, severity, dependency, and importance. The combined approach is adopted to decide which component is more affected by the presence of code smells; it has both objective and subjective dimensions (Nagpal et al., 2016), i.e., the components are measured on an absolute scale using an analytical approach, and a relative grading of their importance is given using a subjective approach.
The basic idea of this work is to reorganize the components that need to be taken care of in future adaptations and extensions. Thus, in this paper, the components of a given software system are ranked according to their refactoring need. This paper is divided into five sections. The literature on existing refactoring approaches is discussed in Section 2, where the framework is also proposed. Section 3 presents in detail the analytical method, which uses the entropy approach to evaluate the weights of the components of the software. Section 4 discusses the Fuzzy TOPSIS approach used to evaluate the subjective weights. Section 5 discusses the ranking suggested by the combined approach. The conclusion and future scope are discussed in the last section.

Framework for ranking the components as per the refactoring need
Once a code smell is introduced into the code, it can only be cured by refactoring (Mens et al., 2007). It is important to apply the refactoring effort in the right direction to improve the quality of the code (Murphy-Hill et al., 2012; Alshayeb, 2009). Many recent studies have examined when and how to refactor code. Both manual and automated refactoring approaches are suggested in the literature. According to Murphy-Hill et al. (2012), about 90% of refactoring edits are performed manually (Kim et al., 2014; Soetens & Demeyer, 2010; Ouni et al., 2017). Vakilian et al. (2012) also concluded that automated refactoring is underused and misused. Thus, software designers recommend manual refactoring. Refactoring changes the code, which may increase the number of bugs if the software is not properly designed according to design pattern rules. Weißgerber and Diehl (2006) concluded that the number of bug reports increases after refactoring. Researchers have identified refactoring approaches for given code smells, but determining which code needs to be refactored first is still a matter of research.
Ongoing research in the domain of refactoring focuses on two main dimensions: first, identifying new refactoring opportunities, and second, ranking and prioritizing the existing refactoring approaches (Piveta et al., 2008; Tsantalis & Chatzigeorgiou, 2011). This study focuses on the second dimension. To prioritize the refactoring need of critical components, termed the Refactoring Index (R.I), a framework (Fig. 1) is proposed. Refactoring depends both on a quantitative measure, the amount of code smell, and on a qualitative measure, the effect of the code smell on the quality of the code. As both of these measures depend on multiple factors, the calculation of refactoring effort can be considered a Multi Criteria Decision Making (MCDM) problem. The approach can grade the components based on their need for refactoring. Two common approaches are used to find the weights, namely the subjective and objective methods (Nagpal et al., 2016), and a complete analysis requires a combined approach that includes both. In this paper, the weights of the different code smells are evaluated by combining decision-maker preferences (subjective weighting method) and a mathematical model (objective weighting method).
To evaluate the objective contribution to the Refactoring Index, the code smells in the code must be identified and then assessed using the mathematical model. In this paper, software developed using Component-Based Software Engineering (CBSE) and having 15 components is considered. The JDeodorant tool is used to identify the code smells in the independent components (Fontana et al., 2012; Hamid et al., 2013; Moha et al., 2010). The relative importance of these code smells is measured by calculating their weights using the entropy approach. The code smell values derived from the JDeodorant tool are normalized because they are not on a uniform scale. The value of the proposed Refactoring Index lies between 0 and 1, where 1 represents huge flaws in the design, meaning that the refactoring requirement is 100%, and 0 represents a negligible flaw in the design, meaning that the refactoring requirement is 0%. Shannon's Entropy method is used to assess the objective weights of the code smells (Shannon, 2001).

Fig. 1. Framework for Refactoring Index
The subjective weights are calculated using judgments from the decision makers (DMs). To evaluate the subjective weights, which reflect the judgment of the decision makers, the Fuzzy TOPSIS approach is used. The factors on which the experts' opinions were sought are Priority, Severity, Dependency, and Importance. These factors reflect how a code smell affects the quality of the software. In the evaluation of the Refactoring Index, each code smell has its own contributing weight. The combined weight assessment vector is evaluated after rescaling these weights to the same range. Finally, the Refactoring Index is calculated by multiplying the normalized values of the code smells by their respective combined weights. Based on the weights, the components are ranked in the order in which refactoring is needed.
The following sections discuss the objective and subjective weight evaluation.

Objective weight evaluation
Entropy, a concept that originated in thermodynamics as a measure of system disorder, was formulated for information by Shannon and is widely used in that form (Shannon, 2001). A software system poses a multi-attribute problem, and entropy proves very useful for finding the disorder in such a system (Zhang, 2015). A parameter with a small information entropy makes a substantial contribution to the comprehensive evaluation. In this study, code smells are detected in the individual components of the software system using the JDeodorant tool, and the weight of each code smell is evaluated using the Entropy method. If the values of a code smell differ greatly across the components under consideration, its entropy is small and its weight is significant, and vice versa (Zou et al., 2006). The Entropy method is based on mathematical computation and overlooks the decision maker's preference. The code smells and how they are measured are described below.

Code smell
A large variety of code smells is reported in the literature. This study focuses on the following four. They were chosen because they cover features of large and complex projects, such as Long Method and God Class, as well as smells related to the lack of good object-oriented coding practice, such as Type Checking and Feature Envy. These code smells are measured using the JDeodorant tool.

Long Method
A method with more than ten lines of code is said to exhibit the Long Method code smell. This code smell occurs because, when new functionality must be added to the software, it is more convenient to add a new line to an existing method than to write a new method. A long method is tough to maintain. The refactoring technique used to reduce this code smell is Extract Method (Singh & Kaur, 2017).
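As an illustration of the ten-line threshold described above, the following minimal Python sketch flags long functions in a piece of source text. It is not how JDeodorant detects this smell: the function name `find_long_methods`, the `max_lines` parameter, and the decision to count the signature line are assumptions made for this example, and Python's standard `ast` module stands in for a Java parser.

```python
import ast


def find_long_methods(source: str, max_lines: int = 10) -> list[tuple[str, int]]:
    """Return (function name, line count) for functions longer than max_lines."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ASSUMPTION: the signature line counts toward the length.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                flagged.append((node.name, length))
    return flagged


# Hypothetical input: one 2-line function and one 13-line function.
short_fn = "def short():\n    return 1\n"
long_fn = (
    "def long_one():\n"
    + "".join(f"    x{i} = {i}\n" for i in range(11))
    + "    return x0\n"
)
print(find_long_methods(short_fn + long_fn))  # [('long_one', 13)]
```

A flagged function would then be a candidate for Extract Method, i.e., moving a coherent group of its statements into a new, separately named function.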

Type Checking
The Type Checking code smell occurs mainly in the conditional statements of code fragments. For example, Java programmers who frequently use switch statements need to take care that the switch statements co-occur with enumerated data types. If another set of data types is used, the resulting smell is called Type Checking (Singh & Kaur, 2017).
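The following sketch contrasts a conditional that branches on a type code with one possible remedy, replacing the conditional with polymorphism. The remedy is not named in the text above and is shown here only as a common option; all class and function names are hypothetical, and Python is used for brevity although the study concerns Java code.

```python
import math
from abc import ABC, abstractmethod


# Smelly version: behaviour is chosen by inspecting a type code in a conditional.
def area_by_type_code(kind: str, size: float) -> float:
    if kind == "square":
        return size * size
    if kind == "circle":
        return math.pi * size * size
    raise ValueError(f"unknown shape kind: {kind}")


# One possible remedy: each subtype supplies its own behaviour.
class Shape(ABC):
    def __init__(self, size: float) -> None:
        self.size = size

    @abstractmethod
    def area(self) -> float:
        ...


class Square(Shape):
    def area(self) -> float:
        return self.size * self.size


class Circle(Shape):
    def area(self) -> float:
        return math.pi * self.size * self.size


shapes = [Square(2.0), Circle(1.0)]
print([round(s.area(), 2) for s in shapes])
```

With the second form, adding a new shape means adding a class rather than editing every conditional that inspects the type code.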

Blob (God Class)
A class that has several responsibilities and a large number of attributes, operations, and dependencies on other data classes, such that it becomes difficult for the class to work, is called a Blob; it centralizes the system's behavior and depends on data classes. If the Blob code smell is present in the system, a single change in one piece of the program requires modifications in other parts of the program too. This increases the difficulty for programmers. The Extract Method refactoring technique can be applied to remove this code smell (Singh & Kaur, 2017).

Feature Envy
This code smell occurs in the methods of a class: when a method uses more features of another class than of its own class, the resulting smell is called Feature Envy. This code smell increases the coupling between the two classes. The Move Method refactoring technique is used to remove it (Singh & Kaur, 2017).
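A minimal sketch of Feature Envy and of the Move Method remedy is given below. The classes `Customer`, `InvoiceBefore`, and `InvoiceAfter` are hypothetical and exist only to illustrate the idea; Python is used for brevity.

```python
class Customer:
    def __init__(self, name: str, street: str, city: str) -> None:
        self.name = name
        self.street = street
        self.city = city

    # After Move Method: the formatting logic lives with the data it uses.
    def address_label(self) -> str:
        return f"{self.name}, {self.street}, {self.city}"


class InvoiceBefore:
    """Smelly: address_label reads three Customer fields and none of its own."""

    def __init__(self, customer: Customer) -> None:
        self.customer = customer

    def address_label(self) -> str:
        c = self.customer
        return f"{c.name}, {c.street}, {c.city}"


class InvoiceAfter:
    """Refactored: the invoice delegates to the class that owns the data."""

    def __init__(self, customer: Customer) -> None:
        self.customer = customer

    def address_label(self) -> str:
        return self.customer.address_label()


customer = Customer("A. Rao", "12 Hill Road", "Pune")
print(InvoiceBefore(customer).address_label())
print(InvoiceAfter(customer).address_label())
```

Both versions print the same label; the difference is that `InvoiceAfter` no longer needs to know the internal fields of `Customer`, which lowers the coupling between the two classes.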

Tool used to detect the code smell
This section describes the tool used to detect code smells in the software. In this study, code smells are identified using the JDeodorant tool in a component-based software system having 15 components. JDeodorant, an Eclipse plugin, automatically detects the code smells Type Checking, Long Method, Blob, and Feature Envy in the code, as shown in Fig. 2 and Fig. 3. The values of the code smells detected by the tool are given in Table 1. To determine the relative importance of each code smell with respect to its presence, an analytical approach using entropy is given in this section.

Entropy Approach
The weight of each code smell is evaluated using the following steps.

Step 1: A data matrix with m alternatives and n criteria is formed, where the criteria are the code smells, namely (i) Long Method, (ii) Type Checking, (iii) Blob, and (iv) Feature Envy, and the alternatives are the components:

$$
X = \begin{array}{c|cccc}
 & C.S_1 & C.S_2 & \cdots & C.S_n \\ \hline
Cp_1 & x_{11} & x_{12} & \cdots & x_{1n} \\
Cp_2 & x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
Cp_m & x_{m1} & x_{m2} & \cdots & x_{mn}
\end{array}
$$

where $Cp_i$ ($i = 1, \ldots, m$) are the alternative components of the project, $C.S_j$ ($j = 1, \ldots, n$) are the evaluation criteria, and $x_{ij}$ represents the performance of alternative $Cp_i$ under criterion $C.S_j$. As discussed above, the values of the code smells detected by JDeodorant are given in Table 1.

Step 2: To measure each criterion on the same scale, i.e., between 0 and 1, the data matrix given in Step 1 is normalized using Eq. (1); the normalized values $r_{ij}$ are shown in Table 2.

Step 3: The entropy $E_j$ is calculated for each code smell as follows:

$$
E_j = -k \sum_{i=1}^{m} r_{ij} \ln r_{ij}, \qquad j = 1, 2, 3, \ldots, n \tag{2}
$$

where $k = 1/\ln m$.

Step 4: The weights for the objective method are calculated by Eq. (3) and given in Table 3:

$$
w_j = \frac{1 - E_j}{\sum_{j=1}^{n} (1 - E_j)} \tag{3}
$$

where $\sum_{j=1}^{n} w_j = 1$.

Step 5: To rank the critical components based on the code smells present, the Simple Additive Weighted (SAW) method (Hwang & Yoon, 2012) is applied. Eq. (4) gives, for the i-th component, the sum over the j-th criteria of the products of the normalized performance of the alternative for that criterion and the comparative importance (weight) of the criterion:

$$
C_i = \sum_{j=1}^{n} w_j \, r_{ij} \tag{4}
$$

where $C_i$ represents the criticality of the component, $w_j$ is the weight, which reflects the relative importance of the code smells present in a component, and $r_{ij}$ is the value of the code smells for a particular component.

Components are ranked according to their criticality using Eq. (4); the ranking is given in Table 4.

Table 4
Ranking of the components according to the criticality
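Steps 1 to 5 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the normalization of Step 2 is assumed to be column-sum normalization (the form commonly paired with the entropy method), the smell counts are hypothetical and are not the values of Table 1, and the function names `entropy_weights` and `saw_scores` are invented for this example.

```python
import math


def entropy_weights(counts):
    """Rows are components, columns are code smells (non-negative counts)."""
    m, n = len(counts), len(counts[0])
    col_sums = [sum(row[j] for row in counts) for j in range(n)]
    # ASSUMPTION: Step 2 is taken to be column-sum normalization.
    r = [[row[j] / col_sums[j] for j in range(n)] for row in counts]
    k = 1.0 / math.log(m)
    # Step 3: entropy per code smell; zero entries contribute nothing.
    entropy = [
        -k * sum(r[i][j] * math.log(r[i][j]) for i in range(m) if r[i][j] > 0)
        for j in range(n)
    ]
    # Step 4: weights proportional to (1 - entropy), scaled to sum to 1.
    divergence = [1.0 - e for e in entropy]
    total = sum(divergence)
    weights = [d / total for d in divergence]
    return r, weights


def saw_scores(r, weights):
    """Step 5: weighted sum of normalized smell values per component."""
    return [sum(w * v for w, v in zip(weights, row)) for row in r]


# HYPOTHETICAL counts for 3 components and 4 smells
# (Long Method, Type Checking, Blob, Feature Envy).
counts = [
    [12, 3, 1, 4],
    [2, 3, 5, 0],
    [6, 3, 0, 8],
]
r, weights = entropy_weights(counts)
scores = saw_scores(r, weights)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print("weights:", [round(w, 3) for w in weights])
print("scores:", [round(s, 3) for s in scores])
print("ranking (most critical first):", ranking)
```

In this hypothetical data the second smell has the same count in every component, so its entropy is 1 and its weight is (numerically) zero: a smell that does not distinguish the components contributes nothing to the objective ranking. The sketch assumes at least two components and no all-zero column.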

Subjective Method
The presence of code smells alone does not determine whether a particular component of the software is critical. Other traits of the code smells and the component also determine the criticality of a component. In this study, four parameters of the code smells and components, namely Priority, Severity, Dependency, and Importance (Ouni, 2015), are considered to find the criticality of the components. All of these parameters are subjective and are best judged by domain experts. Opinions on the criticality of a component based on these parameters are taken from decision makers, who are software architects, software analysts, and software developers from various industries. The Fuzzy TOPSIS approach is applied to the decision makers' opinions to rank the components according to their criticality score. In the classical TOPSIS approach proposed by Hwang and Yoon (2012), an individual judgment is expressed by definite values; however, obtaining opinions from decision makers as exact numbers is not possible in real-life problems. This restriction of the TOPSIS approach is overcome by using fuzzy set theory, which embraces linguistic variables rather than exact values. The method is based on the principle that the selected alternative has the shortest distance from the ideal solution and the farthest distance from the negative ideal solution. The chosen parameters and the Fuzzy TOPSIS approach are discussed below.

Priority
According to the decision makers, different code smells have different impacts on software quality. In this study, the decision makers' opinions on the criticality of a component depend on the amount of code smell present in it. For example, suppose four different code smells, namely Long Method, Type Checking, Blob, and Feature Envy, are present in a component of the software. The decision makers give their opinion on the criticality of the component based on the code smell that is present in the largest amount.

Severity
The code smells detected in this study have different impacts on the software system. A quantitative measure of a code smell alone does not determine the criticality of a component; criticality also depends on the severity of the code smell. According to the decision makers, every code smell has a different severity score, which measures how much the code smell negatively contributes to the software system. For example, if the two code smells God Class and Feature Envy have the same values as detected by the JDeodorant tool, God Class has a higher severity score than Feature Envy. Although these two code smells are present in the same amount, the critical component is identified by the severity of the code smell.

Dependency
To provide complex system functionality, groups of components in a software system depend on each other. This composite functionality is affected if any modification is made to any component of the software. In addition, replacing a component with a new version also changes the dependencies between components. A component with more code smells but fewer dependencies is said to be less critical than a component with more dependencies and fewer code smells.

Importance
The importance of a component of a software system depends on its usage. A component is said to be more important than others if it is used more than they are. A frequently used component with fewer code smells is said to be more critical than a rarely used component with more code smells. Identifying the importance of components can save refactoring effort.
The Fuzzy TOPSIS approach is applied based on the parameters discussed above. The following sections describe (i) fuzzy sets, (ii) linguistic variables, (iii) triangular fuzzy numbers, and (iv) defuzzification.

Fuzzy sets
Fuzzy sets make it easy to express individual perception and general information using linguistic terms.

Linguistic variables
Linguistic variables are used in situations that are very complex and difficult to define. Artificial intelligence, pattern recognition, information retrieval, and other related areas are domains of the linguistic approach (Zadeh, 1975). For example, rather than measuring the exact temperature, temperature can be represented as a linguistic variable whose values are fuzzy terms such as very cold, cold, hot, and very hot.

Triangular Fuzzy Number
The individual perception of a decision maker is approximated by linguistic variables. The approximation of a linguistic variable is represented by a Triangular Fuzzy Number (TFN), which is a widely adopted approach (Zadeh, 1975). A TFN is defined by a triplet low, medium, high (l, m, h), with the membership function given in Eq. (5):

$$
\mu(q) =
\begin{cases}
\dfrac{q - l}{m - l} & \text{if } l \le q \le m \\[2ex]
\dfrac{h - q}{h - m} & \text{if } m \le q \le h \\[2ex]
0 & \text{otherwise}
\end{cases} \tag{5}
$$

Table 5 shows the importance weights for each criterion and the membership function. Using the triangular membership function, various fuzzy arithmetic operations can be applied.
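The triangular membership function of Eq. (5) for a triplet (l, m, h) can be sketched as follows. The function name `triangular_membership` and the sample triplet (3, 5, 7) are hypothetical and do not come from Table 5; the guard for a degenerate left edge (l equal to m) is an addition made so the sketch never divides by zero.

```python
def triangular_membership(q: float, l: float, m: float, h: float) -> float:
    """Membership degree of q in the triangular fuzzy number (l, m, h)."""
    if not l <= m <= h:
        raise ValueError("a triangular fuzzy number needs l <= m <= h")
    if l <= q <= m:
        # Rising edge; if l == m the only admissible q is the peak itself.
        return 1.0 if m == l else (q - l) / (m - l)
    if m < q <= h:
        # Falling edge; h > m is guaranteed here because m < q <= h.
        return (h - q) / (h - m)
    return 0.0


# Hypothetical triplet standing in for one linguistic term.
for q in (2, 3, 4, 5, 6, 7, 8):
    print(q, triangular_membership(q, 3, 5, 7))
```

The membership rises linearly from 0 at l to 1 at m and falls linearly back to 0 at h; values outside [l, h] have membership 0.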

Defuzzification
Defuzzification is required to obtain the weights as real numbers. It can be performed in many ways, but the most popular is the centroid method, given in Eq. (6) (Deng, 1999).
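For a triangular fuzzy number (l, m, h), the centroid of the triangle lies at (l + m + h) / 3. The sketch below applies that formula; whether Eq. (6) is written in exactly this form is an assumption, and the function name `defuzzify_centroid` and the sample triplets are hypothetical.

```python
def defuzzify_centroid(tfn: tuple[float, float, float]) -> float:
    """Crisp value of a triangular fuzzy number (l, m, h) by the centroid method."""
    l, m, h = tfn
    if not l <= m <= h:
        raise ValueError("a triangular fuzzy number needs l <= m <= h")
    # Centroid of a triangle whose vertices sit at l, m and h on the value axis.
    return (l + m + h) / 3.0


print(defuzzify_centroid((3, 5, 7)))  # symmetric triangle: the centroid is the peak
print(defuzzify_centroid((1, 2, 6)))  # skewed triangle: the centroid shifts toward h
```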

Fuzzy TOPSIS method to calculate the subjective weight
Based on the above discussion of the chosen parameters, the following steps are used to rank the components according to their criticality score using the Fuzzy TOPSIS method (Sun, 2010).
Step (i): The evaluations of three decision makers (DMs) with three alternatives and four criteria, namely Priority, Severity, Dependency, and Importance, are calculated by Eq. (7) using Chen's methodology (Chen, 2000). The elements yij of the decision matrix give the performance rating of the i-th alternative component of the software system with respect to the n-th criterion according to the m-th DM.
Step (v): For each alternative i, the closeness coefficient (Ci) is calculated with Eq. (14) with reference to the negative ideal value. Similarly, the closeness coefficients (Ci) are evaluated for all alternatives, which are ranked according to their criticality as given in Table 13.
The framework for finding the rank of the critical component that requires more refactoring effort is given in Fig. 4. The subjective and objective methods are combined to obtain the combined weights used to identify the critical component.
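The closeness-coefficient step can be sketched as follows, using the form in which Chen's (2000) fuzzy TOPSIS is commonly stated: the vertex distance between triangular fuzzy numbers, a positive ideal of (1, 1, 1) and a negative ideal of (0, 0, 0) for every criterion, and a closeness coefficient equal to the negative distance divided by the sum of both distances. Whether the equations of this paper match this form term by term is an assumption. The input is a hypothetical, already weighted and normalized fuzzy decision matrix; the earlier steps that would produce it are not shown, and all names are invented for this example.

```python
import math

TFN = tuple[float, float, float]


def vertex_distance(a: TFN, b: TFN) -> float:
    """Vertex-method distance between two triangular fuzzy numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)


def closeness_coefficients(weighted: list[list[TFN]]) -> list[float]:
    """One coefficient per alternative; rows are alternatives, columns criteria."""
    fpis: TFN = (1.0, 1.0, 1.0)  # ASSUMPTION: positive ideal for every criterion
    fnis: TFN = (0.0, 0.0, 0.0)  # ASSUMPTION: negative ideal for every criterion
    result = []
    for row in weighted:
        d_pos = sum(vertex_distance(v, fpis) for v in row)
        d_neg = sum(vertex_distance(v, fnis) for v in row)
        result.append(d_neg / (d_pos + d_neg))
    return result


# HYPOTHETICAL weighted normalized ratings: 3 alternatives x 4 criteria
# (Priority, Severity, Dependency, Importance).
weighted = [
    [(0.5, 0.7, 0.9), (0.3, 0.5, 0.7), (0.1, 0.3, 0.5), (0.5, 0.7, 0.9)],
    [(0.1, 0.3, 0.5), (0.1, 0.3, 0.5), (0.3, 0.5, 0.7), (0.1, 0.3, 0.5)],
    [(0.7, 0.9, 1.0), (0.5, 0.7, 0.9), (0.5, 0.7, 0.9), (0.3, 0.5, 0.7)],
]
cc = closeness_coefficients(weighted)
ranking = sorted(range(len(cc)), key=lambda i: cc[i], reverse=True)
print([round(c, 3) for c in cc], ranking)
```

A coefficient near 1 means the alternative is far from the negative ideal and close to the positive ideal, so under this reading a higher coefficient marks a more critical component.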

Fig. 4. Framework for combined approach
As discussed earlier, the objective approach focuses on the analytical model. If the participating parameters vary in information gain, this is reflected in the weights of the components.
The subjective approach focuses on the opinions of the decision makers and is used here to grade the components based on the closeness coefficient. The outputs of SAW and the closeness coefficient are multiplied by 100 for better visibility of the results (Fig. 3); this does not change the ranking order. Both approaches focus on different dimensions and contribute significantly to the system; thus, the combined ranking is obtained by taking 50% of the weight from each of them. To further understand the contributions of the subjective and objective approaches, the contribution of each approach is also varied to 30% and 70% and vice versa, as given in Table 14. The variations given in Table 14 are reflected in Fig. 5. As can be seen from Table 14 and the figure, for component 15 the ranking value of the subjective approach is 9.3 and that of the objective approach is 3.9; the combined value is 5.5 at 30% weightage of the objective method and 70% weightage of the subjective method, 6.6 at 50% weightage of each method, and 7.7 at 70% weightage of the objective method and 30% weightage of the subjective method. The ranking of component 5 differs significantly between the subjective and objective methods, i.e., 6.05 and 12.9, but under the combined approach the difference is not significant.

Fig. 5. Variations in ranking using objective, subjective and combined approach
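The equal-share combination can be sketched as a weighted sum of the two values. This reading of "taking the 50% weight from each of them" is an assumption, as are the function name `combined_score` and the `objective_share` parameter; the two input values are the ones quoted above for component 15.

```python
def combined_score(objective: float, subjective: float,
                   objective_share: float = 0.5) -> float:
    """Weighted sum of the objective and subjective values of one component."""
    if not 0.0 <= objective_share <= 1.0:
        raise ValueError("objective_share must lie between 0 and 1")
    # ASSUMPTION: the subjective method receives the remaining share.
    return objective_share * objective + (1.0 - objective_share) * subjective


# Component 15 at equal shares: objective 3.9, subjective 9.3.
print(round(combined_score(3.9, 9.3), 1))  # 6.6
```

Passing a different `objective_share` gives the sensitivity variants in which one method is weighted more heavily than the other.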

Conclusion
Since refactoring requires significant effort, it is always the software architect's choice to optimize the whole refactoring process. The simple approach adopted by industry to improve software quality is to find the number of code smells in the code and remove them by applying an appropriate refactoring technique. However, along with the number of code smells, it is equally important to understand the subjective traits of code smells and components, namely Priority, Severity, Importance, and Dependency, to identify the critical components. This paper presents a new approach that combines weight assignment methodologies based on entropy and Fuzzy TOPSIS into a combined weight assignment approach. The Refactoring Index is evaluated using this approach, and the components are ranked according to their need for refactoring. Here, component 5 needs to be refactored.

Table 1
Matrix containing the values of the code smells detected by the JDeodorant tool

Table 2
Normalized value of the code smell

Table 3
Weights of the code smell

Table 5
Linguistic variable for importance weight of each code smell

Table 11
Distance from the positive solution

Table 12
Distance from the negative solution

Table 13
Criticality of the code smell with subjective method

Table 14
Identification of critical component using combined approach