Test Suite Generation for Object Oriented Programs: A Hybrid Firefly and Differential Evolution Approach

In model-based testing, test suites are derived from the design models in system specification documents rather than from program code, which reduces the cost and time of testing. In search-based software testing, nature-inspired metaheuristic search algorithms are used to automate and optimize test suite generation. This paper proposes a concrete model-based testing framework that uses the UML behavioral state chart model together with a hybrid of two popular nature-inspired algorithms, the Firefly Algorithm (FA) and Differential Evolution (DE). The hybrid algorithm is used to generate optimized test suites for the benchmark triangle classification problem. Experimental results show that the hybrid FA-DE search algorithm outperforms the individual model-based Firefly and Differential Evolution algorithms in terms of time complexity, exploration and exploitation, and variation in the test case generation process. The framework generates optimized test data for complete transition path coverage of the feasible paths of the example problem.


I. INTRODUCTION
Software development organizations spend more than two-thirds of the project development cost on product testing. The main aim of testing is to define a specific set of test suites capable of revealing the hidden errors in the software under test, thus avoiding bugs or system failures in the future [37], [53]. The two most widely adopted testing strategies are functional testing, commonly known as black box testing, and structural testing, popularly known as white box testing [67]-[71]. White box testing tests the logical flows, the key control flow paths, and the program logic of the software under test. Black box testing tests the functions or modules of the software, verifying the outputs generated for a given set of inputs.

The associate editor coordinating the review of this manuscript and approving it for publication was Dongxiao Yu.
Currently, object-oriented programming is widely used in almost every domain, including banking, stock markets, telecommunication, health management, university management, internet applications, rocket launching systems and mobile applications. Its popularity is due to its modular structure and to features such as encapsulation, polymorphism, inheritance and dynamic binding [22], which make developing and modifying applications simpler than in the structured programming style. The object-oriented testing paradigm, popularly known as grey box testing, was introduced in the late 1980s; its main challenge is testing the specific features introduced by object-oriented programming concepts. The same features that make object-oriented programming popular also make it hard to test. Therefore, a different testing style, known as model-based testing, was adopted for object-oriented testing [39]. In model-based testing, the test cases are derived directly from system specification and design documents, mainly from the dynamic (behavioural) models of software systems, instead of from the actual program code [49]; this is a difficult task and an open research problem [22], [27]. The existing testing strategies still cannot generate an optimal set of test cases or test suites for critical path coverage, and researchers are therefore still looking for new frameworks or methodologies to fully automate object-oriented testing [30].

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In the last two decades, researchers have started applying metaheuristic search algorithms [40], [43]-[45] to model-based testing [22], [31], [34], [39], [78] to generate optimized test cases or test sequences. Nature-inspired algorithms are gradually being hybridised [86], [87] to combine the best features of different algorithms [1], [18], [24], [32], [48], [63] and obtain more variation and quality in the solutions, as some metaheuristics are very efficient in exploration whereas others are good at exploitation. Hybrid metaheuristic techniques combine two search algorithms, one better at exploration and the other at exploitation, so that a balance between the two is maintained [86]. The Firefly Algorithm (FA) is a stochastic, population-based metaheuristic that has proved efficient in solving NP-hard problems in various fields of engineering and industry, such as wireless network design, market pricing, structural optimization and robotics [11], [16]. Many versions of the Firefly algorithm have been applied in fields such as cryptanalysis [18], graph colouring [17] and speech recognition [3], as well as for improving the speed of convergence, diagnosing Parkinson's disease, feature selection [6]-[8], and handling microarray data for cancer prediction [1], [3], [7]. Similarly, the Differential Evolution (DE) algorithm is based on the evolutionary principle of survival of the fittest; it is a very popular algorithm that has proved its excellence in global optimization, and hybridization of DE with several other algorithms has provided excellent results [29].
With these findings in mind, this paper proposes a novel FA-DE algorithm together with a framework for model-based testing of object-oriented programs using the UML behavioural state chart model. The proposed hybrid algorithm obtains good exploitation from the Firefly Algorithm (FA) and enhanced exploration from the Differential Evolution (DE) algorithm to generate balanced test suites for the benchmark triangle classification problem. First, the UML state chart models are converted to state chart graphs (SCG); then feasible test sequences are extracted from the SCG; finally, the model-based hybrid FA-DE algorithm is applied to select a suitable set of test suites from the vast set of possible test suites for testing those feasible paths. The efficiency of the algorithm is verified using the benchmark triangle classification problem. The objectives of the proposed work are achieved by implementing the following modules.
1) Development and generation of object-oriented test suites for the triangle classification problem using an FA-based model.
2) Development of a DE-based object-oriented model for generation and verification of test suites for the same classical problem.
3) Development of an FA-DE hybrid model for test suite generation and its performance verification using the triangle classification problem.
4) An extensive set of experiments on the well-known benchmark triangle classification problem, with a comparative analysis against state-of-the-art methods including single and hybrid metaheuristics.
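The first two steps of the framework described above (state chart model to SCG, then extraction of test sequences) can be illustrated with a minimal sketch. The graph below is a hypothetical SCG for triangle classification; its state names and edges are assumptions made for illustration and are not taken from the paper's model. The depth-first enumeration is likewise one simple way to list candidate paths, not necessarily the extraction procedure used in the framework.

```python
from typing import Dict, List

# Hypothetical state chart graph (SCG) for triangle classification.
# State names and transitions are illustrative assumptions only.
SCG: Dict[str, List[str]] = {
    "Start": ["ReadSides"],
    "ReadSides": ["NotTriangle", "ValidTriangle"],
    "ValidTriangle": ["Equilateral", "Isosceles", "Scalene"],
    "NotTriangle": ["End"],
    "Equilateral": ["End"],
    "Isosceles": ["End"],
    "Scalene": ["End"],
    "End": [],
}


def enumerate_paths(graph: Dict[str, List[str]], source: str, target: str) -> List[List[str]]:
    """Enumerate all simple paths from source to target with an explicit DFS stack."""
    paths: List[List[str]] = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for nxt in graph[node]:
            if nxt not in path:  # keep paths simple (no repeated states)
                stack.append((nxt, path + [nxt]))
    return paths


# Each enumerated path is one candidate test sequence through the graph.
paths = enumerate_paths(SCG, "Start", "End")
for p in paths:
    print(" -> ".join(p))
```

With this assumed graph the enumeration yields four start-to-end sequences, one per classification outcome; a search algorithm would then look for input triples that drive execution along each of them.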
The remaining parts of the paper are arranged as follows. Section II provides a detailed literature review; Section III explains the hybrid metaheuristic algorithms with parameter settings; Section IV describes the proposed framework for test suite generation; and Section V explains the experimental setup and the statistical analysis of the experimental results. Finally, Section VI provides the conclusions and possible future directions.

II. LITERATURE REVIEW
Search-based software testing, a newly emerging subdomain of testing, has shown promising results. In this field, nature-inspired metaheuristic algorithms, predominantly evolutionary Genetic Algorithms (GA) and genetic programming (GP) and bio-inspired algorithms including Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), the Firefly Algorithm (FA) and Cuckoo Search (CS), have been employed to automate test case generation and test case prioritization [33], [68], [76]. The evolutionary algorithms (mainly GA and GP) and the swarm-based metaheuristic algorithms [19], [32], including PSO, ACO, CS and FA, were used to generate test data or test cases targeting specific coverage criteria [12]. In the last two decades, as the paradigm shifted from structured programming towards object-oriented programming, a new testing methodology known as model-based testing was adopted to test object-oriented programs. Researchers then started applying both evolutionary and swarm-based metaheuristic search algorithms [43]-[45] to model-based testing [22], [31], [34], [39], [70] to generate optimized test cases or test sequences.
Metaheuristic algorithms, also known as nature-inspired algorithms, are robust optimizers in the category of stochastic algorithms. They were developed by replicating natural processes such as the gravitational force of attraction, various laws of physics, harmonics, principles of chemistry, the adaptability found in nature (i.e., the process of evolution), and the intelligence shown by its various species, from micro-organisms such as bacteria to birds, honey bees, flies, fishes, frogs, monkeys, wolves and many more [81].
Though more than 300 nature-inspired metaheuristic algorithms are available in the literature [82], the most widely accepted and popular primary algorithms include Genetic Algorithms (GAs), Differential Evolution (DE), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO), the Firefly Algorithm (FA) and Ant Colony Optimization (ACO). Bio-inspired metaheuristics have proved their proficiency in solving complex real-world problems in diverse fields of engineering: language recognition using the firefly algorithm [3]; speech recognition, where the parameters of a fuzzy neural network are optimized using the Firefly algorithm [13]; electronic circuit design; traffic optimization using popular metaheuristics such as GA, DE, ACO, genetic programming (GP) and ABC [36]; image contrast enhancement using FA with a chaotic sequence and data classification using ACO [8], [52]; healthcare, where the firefly model is used for Parkinson's disease diagnosis and classification [5]-[7]; and robotics, where a swarm-based glowworm optimization algorithm with multimodal functions was used for collective robotics applications and GA and PSO were used for intelligent robot path optimization [16], [59].
With the wider use of metaheuristics, it was observed that these algorithms are not suitable for solving all kinds of problems: a specific metaheuristic may perform excellently on one problem and very poorly on another, so no single algorithm, or small set of algorithms, can be standardized to obtain optimized solutions for all types of problems. The problem definition, fitness function and parameters play a major role in the performance of metaheuristic algorithms. Therefore, hybrid approaches combining two metaheuristics have gradually been adopted for solving complex problems: a hybrid model based on the TVIW-PSO-GSA algorithm and SVM was applied to classification problems [60], a binary hybrid grey wolf optimization technique was used for feature selection [61], and further applications include layout design, graphics and art design [28].
Model-based testing using metaheuristics appears complex and difficult, as only a small number of standard papers are available and the area remains challenging for researchers. Khari and Kumar [15] used a model-based Cuckoo search algorithm for test suite optimization; Shirole and Kumar [34] provided a detailed analysis of model-based test case generation processes using UML behavioural models; and Utting et al. [39] provided a thorough taxonomy of the various model-based testing approaches. Over the last two decades, more than seventy percent of the research on model-based testing using metaheuristics has been based on genetic algorithms [33], [72] or genetic programming [9]. Many researchers have suggested that hybrid models are more suitable than individual metaheuristic algorithms for providing better-optimized solutions, owing to their combined exploration and exploitation capabilities [86]. Very few papers [84] are available in the literature in which hybrid algorithms are applied to model-based testing of object-oriented programs. Out of the vast number of available papers, this section presents the basic papers that have proved useful in providing a detailed analysis and thorough understanding of metaheuristic algorithms, hybrid metaheuristic algorithms, and their application in the broad areas of search-based testing and model-based testing.
Uzun et al. [77] described search-based software testing as an emerging field of software engineering in which the software testing problem is reformulated as a search problem: a metaheuristic algorithm selects an appropriate set of test suites, and the fitness function is defined on the basis of a coverage metric. They noted that although a number of search-based optimization techniques are available, very little theoretical analysis exists regarding the suitability of particular techniques for specific testing problems. They also suggested that a hybrid global search approach may be suitable for solving the test data generation problem.
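To make the reformulation of testing as a search problem concrete, the sketch below defines a branch-distance style fitness for one coverage target of the triangle benchmark (reaching the "equilateral" outcome) and minimizes it with plain random search. Both the fitness function `equilateral_distance` and the input range are illustrative assumptions; they are not the fitness function or the search algorithm used in the works cited above.

```python
import random
from typing import Callable, Optional, Tuple


def equilateral_distance(a: int, b: int, c: int) -> int:
    """Branch-distance style fitness: 0 exactly when a == b == c, larger otherwise."""
    return abs(a - b) + abs(b - c)


def random_search(
    fitness: Callable[[int, int, int], int],
    budget: int = 2000,
    low: int = 1,
    high: int = 20,
    seed: int = 0,
) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Sample side triples uniformly and keep the one with the lowest fitness."""
    rng = random.Random(seed)
    best, best_fit = None, float("inf")
    for _ in range(budget):
        cand = tuple(rng.randint(low, high) for _ in range(3))
        fit = fitness(*cand)
        if fit < best_fit:
            best, best_fit = cand, fit
        if best_fit == 0:  # coverage target reached, stop early
            break
    return best, best_fit


best, best_fit = random_search(equilateral_distance)
print(best, best_fit)
```

A metaheuristic such as FA or DE would replace the uniform sampling loop while keeping the same fitness interface.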
Daniel et al. [87] made an exhaustive literature study of bio-inspired algorithms over the past few decades and found that the number of bio-inspired metaheuristic optimization approaches has reached such unprecedented levels that it may cloud the future prospects of this research field. They addressed the problem by proposing two comprehensive, principle-based taxonomies that allow future researchers to organize existing and upcoming algorithmic developments according to two well-defined criteria: the source of inspiration and the behavior of each algorithm. They reviewed more than three hundred publications dealing with nature-inspired and bio-inspired algorithms and, most interestingly, revealed that more than one-third of the reviewed bio-inspired solvers are versions of classical algorithms, mainly genetic algorithms, particle swarm optimization and ant colony optimization. The authors also suggested that hybridising existing algorithms may produce new algorithmic behaviours able to solve complex problems that remain unsolved by existing single metaheuristic algorithms, and that once solid proof is provided that hybrid approaches compensate for their increased complexity, those approaches would gradually be incorporated into the existing taxonomies of metaheuristic algorithms.
Khari and Kumar [15] have presented a cost effective and time efficient Cuckoo Search (CS) algorithm for test data optimization; they have provided a detailed statistical study for validating their results.
Zhang et al. [48] have proposed a hybrid firefly algorithm combining the advantages of firefly (FA) and differential evolution (DE) algorithms and also verified the performance of the hybrid algorithm using benchmark unimodal and multimodal functions. The experimental results show that the hybrid firefly algorithm had better performance than the original versions of FA, DE and PSO algorithms in convergence rate and in avoidance of getting trapped in local minima.
Nayyar et al. [51]- [53], Durbhaka et al. [54], Nayyar and Nguyen [55], Diwaker et al. [56], and Gheisari et al. [57] have provided a detailed understanding on evolutionary algorithms including GA, and swarm based algorithms including ACO, ABO, PSO, Glow worm, Cockroach swarm optimization, Cat swarm optimization, Dolphin echo location, Eagle strategy, monkey search algorithm etc., highlighting the various computational models, the versatile approaches along with their applications in the newly emerging complex fields of engineering including IOT, AI, Big data, Data mining and Robotics.
Panda and Dash [23] provided a detailed overview of the popular metaheuristic algorithms of the last two decades, including Cuckoo Search (CS), the Gravitational Search Algorithm (GSA), Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Differential Evolution (DE) and the Artificial Bee Colony algorithm (ABC), and compared the performance of these algorithms in generating test data for path-coverage-based testing.
Sahoo et al. [84] used a hybrid bee colony algorithm, combining Particle Swarm Optimization (PSO) and the Bee Colony (ABC) algorithm, together with a unified modelling language (UML) combination diagram for optimized test data generation using the UML state chart diagram and sequence diagram of an ATM system. Panthi and Mohapatra [27] used the firefly algorithm to generate optimized and prioritized test sequences from UML state machine diagrams. Firefly algorithms have proved themselves in solving numerical optimization problems, i.e., NP-hard problems, and the firefly algorithm also reduces the overall computational effort by 86 and 74 percent in comparison to the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), respectively.
Srivastava et al. [37] have used a modified firefly algorithm along with guidance matrix for generating optimal test paths. They claim that their methodology is capable of generating optimal discrete independent test paths that are highly useful in software testing.
Guohua et al. [83] presented a recent detailed review of popular strategies, known as ensemble strategies, that can be incorporated into different stages of population-based algorithms to enhance their efficiency, precision and robustness. These ensemble techniques improve population-based algorithms by providing versatile tools and paradigms for designing a better algorithm able to handle diverse optimization problems. The control parameters are updated automatically depending on the type and complexity of the optimization problem, i.e., the algorithm is tuned by automatically adapted parameters instead of by the usual trial-and-error method.
Grosan and Abraham [86] stated that although evolutionary computation has solved many practical problems in engineering, business, commerce and other fields, it sometimes fails to perform well because of poor parameter selection or inappropriate problem representation. This is in accordance with the No Free Lunch theorem, which states that, for any algorithm, high performance over one class of problems is paid for by poor performance over another class. The authors therefore emphasized the need for hybrid evolutionary algorithms, explained the several possibilities for hybridizing metaheuristic algorithms, and presented a detailed review of the available hybrid frameworks using PSO, ACO and the Bacterial Foraging algorithm (BFO), along with some generic hybrid evolutionary architectures developed during the last couple of decades.
The object-oriented testing paradigm was introduced in the late 1980s, and its main challenge was testing the specific features of object-oriented programming such as inheritance, multiple inheritance, polymorphism, encapsulation and overloading [22]. At the same time, the quality and dependability of the software must be ensured. The best way to reduce cost and handle program complexity when testing object-oriented software is the complete automation and optimization of the entire product testing phase.
In the early 1990s, Panda and Dash [22] tried to apply the traditional testing techniques to object-oriented programs and recommended that these programs can only be tested by considering the message passing between objects or the change of states of the objects, i.e., by taking into consideration the dynamic behaviour of the system. Therefore, generating test cases from design documents rather than from code would be more appropriate.
Sharma et al. [33] proposed that the model-based testing approach mainly concentrates on test case generation and test result evaluation using a model. A software model mainly describes the system behaviour in terms of the input sequences accepted by the system, a set of conditions, actions, the data flow between its modules and routines.
Panda and Dash [22] described that the code-based testing approaches are not applicable in object-oriented programs testing, due to the specific features of Object-Oriented programming concepts like data encapsulation, data abstraction, dynamic binding etc. and therefore model-based testing approaches are used for the testing of object oriented programs. Some of the popular software testing models include the unified modelling language (UML) models, finite state machines (FSM) models, Markov chains model and formal models. The design artefacts, mainly the UML behavioural models are mostly used by researchers for testing; these include Use case model, Activity model, State chart model, sequence model, object model and component model. A group of researchers also used combinatorial models fusing two UML models like Use case and sequence model, Activity and sequence model, state chart and sequence model etc.
Ananya and Swapan [2] demonstrated that UML models cannot be used directly for testing; some intermediate representation is required. A lot of research has been conducted in this direction, and the popular techniques include symbolic execution, OCL (Object Constraint Language), Directed graph (DG), System sequence diagram (SSD), Sequence diagram graph (SDG), Extended control flow graph (ECFG), State machine graph, Activity graph, Message flow graph (MFG), Use case diagram graph (UDG), Communication tree, and Object-oriented graph (OOG).
Srivastava et al. [37] revealed that it is a very tough task to analyse UML models, in particular the behavioural models as they capture the dynamic system behaviour. Specifically, for object-oriented programs, many researchers propose the automation of software testing process but till date test sequence generation and complete test coverage remains an open research problem. The existing testing strategies cannot guarantee the exact and optimal set of test cases or test suites for coverage of the critical paths as well as quality of testing.
Saeed et al. [49] explained that optimization algorithms are broadly classified into two primary categories: deterministic algorithms and stochastic algorithms. Deterministic algorithms include Hill Climbing, the Newton-Raphson method, the Simplex method and others; for the same set of initial values, these algorithms obtain the same set of final values. Stochastic algorithms produce a new set of solutions each time, even when they begin with the same set of initial points. These algorithms have advantages as well as disadvantages. The advantages include shared information, preservation of good solutions, and very little chance of mistaking a local best for the global best. The disadvantages are that these metaheuristic algorithms are complex, require many parameter settings, and show better performance with large data sets.
In the model-based testing approach, appropriate test suites can be extracted from UML models to test object-oriented programs; a detailed view of the literature on model-based testing from the last two decades is presented in TABLE I. Although more than two decades of research have been carried out, a concrete framework is still required to automatically generate optimized and prioritized test suites for hassle-free model-based testing of object-oriented programs and software.
In the literature, few papers are available for model-based test data generation employing firefly algorithm [27], [37]. The authors have used the ATM state chart model and vending machine model as a case study for their problem. Srivastava et al. [37] have used a combined graph reduction technique with firefly algorithm to generate discrete and independent paths. Panthi and Mohapatra [27] have generated optimized and prioritized test sequences from UML state machine models, specifically generated test sequences for the composite states. Samuel et al. [31] generated test cases from UML state machine diagrams by applying transformed predicate functions.
An exhaustive search of the literature of the last five years found only one research work on model-based testing using a hybrid Bee colony algorithm [84]. In that work, optimised path sequences are generated from UML combinational diagrams, namely the state-chart sequence diagram system graph (SCSEDG) of the ATM withdrawal operation. The hybrid Bee colony algorithm is developed by merging PSO and the Bee colony algorithm: first, the initial population is randomly generated, the fitness of each solution is calculated, and the candidate solutions are ranked according to their fitness values. The solutions are then divided into two groups; the best solutions are kept and the worst solutions are replaced with copies of the best solutions, after which the two metaheuristic algorithms are applied separately to obtain the best solutions. Here the solutions are path sequences, and only one optimized path is produced: the best path, having minimum cost.
That work is similar to ours in its model-based testing approach using UML diagrams and in its use of hybrid algorithms, but the two cannot be compared directly. Our objective is to generate test data for every feasible path, targeting transition path coverage, and our case study is the benchmark triangle classification problem, which has four paths. In the above work, the case study is the ATM withdrawal operation, and the objective is to select only the one path sequence having minimum cost.
An in-depth study of the available literature on model-based testing using metaheuristic algorithms showed that very few papers address the use of hybrid metaheuristics in model-based testing, and that no concrete framework is yet available for automatic test suite generation with complete path coverage. To overcome these difficulties, this paper proposes a novel FA-DE algorithm that generates optimal test suites fulfilling complete transition path coverage for model-based testing of object-oriented programs.

III. PROPOSED HYBRID MODEL-BASED TESTING APPROACH
Nature-inspired algorithms solve many complex problems in different fields of science and engineering efficiently. Among the several stochastic model-based nature-inspired algorithms, the Firefly Algorithm (FA) and Differential Evolution (DE) have performed particularly well on various complex optimization problems. Detailed descriptions of FA and DE are presented in Algorithm 1 and Algorithm 2. However, both algorithms have inherent limitations: FA searches nearby local regions and takes more time to converge, whereas DE explores more randomly because of its mutation operator and is prone to premature convergence. Therefore, integrating the respective merits of FA and DE, a hybrid algorithm denoted FA-DE is proposed in Algorithm 3 to obtain quality test suites for testing object-oriented programs.

A. FIREFLY ALGORITHM
The Firefly Algorithm (FA) is a population-based optimization technique in the swarm intelligence family, proposed by Yang and He [46] and Yang et al. [47]. The algorithm is inspired by the swarming and flashing-light characteristics of fireflies. On summer nights, groups of fireflies produce flashing light for two fundamental reasons: to attract partners for mating and to protect themselves from potential predators [8].
The flashing light follows two physical laws: first, the light intensity $I$ obeys the inverse-square law $I \propto 1/r^2$, so it decreases as the distance $r$ increases; second, the intensity decreases exponentially because the light is absorbed in the air. Therefore, in FA the light intensity is associated with the fitness value of the cost function to be optimized [11].

Algorithm 1, pairwise attraction loop:
for i ← 1 to N do
    for j ← 1 to N do
        Compute the distance r_ij between the two fireflies x_i and x_j using the Euclidean distance in Eq. (4)
        if light intensity f(x_i) < f(x_j) then   (the less bright firefly moves towards the brighter firefly)
            Compute the attractiveness, which varies with the absorption parameter γ and the distance r_ij, using Eq. (3)
            Move firefly x_i towards firefly x_j using Eq. (5)

The FA can be formulated based on the following three rules:
1) All fireflies are assumed to be unisex, so any firefly can be attracted to any other firefly irrespective of sex.
2) The attractiveness is proportional to light intensity of the fireflies.
3) The light intensity of a firefly is determined by the cost function that is to be optimized.

From the above rules, it follows that FA is designed around two important issues: i) the variation of light intensity and ii) the formulation of the attractiveness, which decreases with distance. The attractiveness of a firefly is tied to its flashing light intensity, which in turn is taken as the fitness value of the corresponding firefly. In addition, the flashing light of a firefly is absorbed in the air. Hence, the light intensity $I$ decreases with the distance $r$ and the absorption coefficient $\gamma$, as expressed in Eq. (1), where $I_0$ denotes the light intensity of the firefly at distance $r = 0$ and the light absorption coefficient is assumed to be fixed. To avoid the singularity at $r = 0$ in Eq. (1), the inverse-square law and absorption can be combined in the Gaussian form of Eq. (2).
Similarly, the attractiveness $\beta$ is also a function of distance and absorption. The attractiveness of a firefly is determined by its light intensity as seen by the fireflies in its neighborhood and is defined in Gaussian form in Eq. (3), where $\beta_0$ is the initial attractiveness of a firefly, a constant corresponding to distance $r = 0$. The terms light intensity $I$ and attractiveness $\beta$ are closely related: light intensity indicates the total light emitted by a firefly, whereas attractiveness denotes the amount of that light observed by other fireflies at a distance $r$. The distance $r_{ij}$ between any two fireflies $x_i$ and $x_j$ in a group is computed as the Euclidean distance in Cartesian space, Eq. (4), where $d$ denotes the dimensionality of each firefly and $x_i$ and $x_j$ are the $i$th and $j$th fireflies of the population.
In FA, movement follows the principle that the $i$th firefly is attracted to, and moves towards, the $j$th firefly when the $j$th firefly is more attractive than the $i$th. The movement is formulated in Eq. (5), where $\alpha$ is a randomization parameter and the third term generates a random number in the range $[-1, 1]$ from the Gaussian distribution. Eq. (5) gives the new position of the $i$th firefly as the sum of three terms: the current position of the $i$th firefly, the move towards the more attractive $j$th firefly, and a random walk in the range $[-1, 1]$. Finally, the steps of Firefly optimization are summarized in Algorithm 1.
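The relations described in this subsection can be sketched in a few lines of Python. The sketch assumes the standard forms from the firefly-algorithm literature, which Eqs. (3) to (5) are taken to match: attractiveness $\beta(r) = \beta_0 e^{-\gamma r^2}$, Euclidean distance $r_{ij} = \lVert x_i - x_j \rVert_2$, and the move $x_i \leftarrow x_i + \beta(r_{ij})(x_j - x_i) + \alpha\,\epsilon$. Here $\epsilon$ is drawn uniformly from $[-1, 1]$ per component, which is an assumption, as are the function names and default parameter values. The `brightness` callable stands for the light intensity $f(x)$; for a minimization problem it would be the negated cost.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def attractiveness(r: float, beta0: float = 1.0, gamma: float = 1.0) -> float:
    """Gaussian-form attractiveness: beta0 at r = 0, decaying with distance."""
    return beta0 * math.exp(-gamma * r * r)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two fireflies of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def firefly_sweep(
    pop: List[List[float]],
    brightness: Callable[[Sequence[float]], float],
    beta0: float = 1.0,
    gamma: float = 1.0,
    alpha: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """One pass of the pairwise loop: each dimmer firefly moves towards brighter ones."""
    rng = rng or random.Random()
    pop = [list(x) for x in pop]  # work on a copy
    light = [brightness(x) for x in pop]
    n = len(pop)
    for i in range(n):
        for j in range(n):
            if light[i] < light[j]:  # firefly j is brighter, so i moves
                r = euclidean(pop[i], pop[j])
                beta = attractiveness(r, beta0, gamma)
                pop[i] = [
                    xi + beta * (xj - xi) + alpha * rng.uniform(-1.0, 1.0)
                    for xi, xj in zip(pop[i], pop[j])
                ]
                light[i] = brightness(pop[i])  # re-evaluate after the move
    return pop
```

Setting `alpha=0` removes the random walk, which makes the attraction term easy to inspect in isolation; with `gamma=0` the attractiveness is constant at `beta0` regardless of distance.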

B. DIFFERENTIAL EVOLUTION ALGORITHM
Differential Evolution (DE) is a stochastic population-based optimization technique in the family of evolutionary algorithms, proposed by Rainer Storn and Kenneth Price [29]. The DE algorithm works on the evolutionary principle of survival of the fittest [39]. It primarily consists of two operators, mutation and recombination; the main role is played by the mutation operator, which is followed by the recombination operator. In evolutionary algorithms, each candidate solution is known as a genome or chromosome. The j-th component of the i-th target vector is initialized as follows, where rand is a uniformly distributed random number between 0 and 1.
$$x_{i,j} = x_{\min,j} + \mathrm{rand} \cdot \left(x_{\max,j} - x_{\min,j}\right)$$
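A minimal sketch of this initialization rule follows. The function name and the argument names `lower` and `upper` (per-component bounds x_min,j and x_max,j) are assumptions made for illustration.

```python
import random

def init_population(n_pop, lower, upper, rng=random):
    """Create n_pop target vectors with each component uniform within its bounds."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n_pop)]
```

For the triangle problem, for example, a call such as `init_population(30, [1, 1, 1], [10, 10, 10])` would give 30 candidate side-length triples; these bounds are hypothetical.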

1) MUTATION
After the initialization step, the DE algorithm uses the mutation operator to generate the i-th donor/mutant vector v_i^n corresponding to each target/parent vector x_i^n in the n-th iteration. The two most popular mutation operators are formulated as follows: where the random indices satisfy R_1 ≠ R_2 ≠ R_3 and are drawn from {1, 2, ..., N_p}. The control parameter F, known as the scaling factor, is a positive real number in the range [0, 2]. In this work, we use these two mutation operators in the experiments because they offer better results than other mutation strategies. Unlike GA, in the DE algorithm the target vector is not involved in the mutation operation.
    Bound trial vector u_i within sample space
    Evaluate fitness f(u_i)
    Perform greedy selection between f(u_i) and f(x_i) and update target vector x_i
  end for
  t = t + 1
end while
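As an illustration of the mutation step, the sketch below implements the widely used DE/rand/1 operator, v_i = x_R1 + F (x_R2 − x_R3). Whether DE/rand/1 is one of the two operators used in this work is an assumption, and the function name is hypothetical.

```python
import random

def mutate_rand_1(population, i, F=0.5, rng=random):
    """DE/rand/1 donor vector for target index i (assumed operator)."""
    # Pick three mutually distinct indices, all different from the target i,
    # so the target vector itself takes no part in the mutation.
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]
```

The population must hold at least four vectors so that three distinct indices other than i exist.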

2) CROSSOVER
To increase diversity in the search space, the recombination (crossover) operator next combines the components of the donor/mutant vector v_i^n with the target vector x_i^n to obtain the trial/offspring vector u_i^n = (u_{i,1}^n, u_{i,2}^n, ..., u_{i,d}^n). In this strategy, the trial vector is directly involved in the crossover operation. The DE algorithm basically uses two crossover operators: binomial (uniform) and exponential (two-point modulo). In binomial crossover, each of the d components is taken from the donor vector when a uniformly generated random number between 0 and 1 is less than or equal to a pre-defined control parameter called the crossover rate (Pc).
The binomial crossover is defined as in equation (8), where k is a randomly generated natural number in the range {1, 2, ..., d} and rand_i,j is a randomly generated real number in [0, 1]. For the exponential crossover, we first choose a random integer n from {1, 2, ..., d}. All components of the target vector before the n-th component are copied to the trial vector, and then the n-th variable of the donor vector is copied directly to the corresponding position of the trial vector. For subsequent components, real-valued random numbers are generated in [0, 1]. While rand_i,j ≤ Pc, the components are copied from the donor vector to the trial vector. Once rand_i,j > Pc, the remaining components of the target vector are copied to the trial vector. In this work, both crossover operators have been used to maintain diversity in the solutions; here the solutions are the different test cases. The experimental results revealed that exponential crossover produces variation in the test cases, which is needed for all-paths coverage.
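Both operators can be sketched as follows. The exponential variant follows the verbal description above (a contiguous donor run starting at index n, with no wrap-around); indices are zero-based and the function names are illustrative.

```python
import random

def binomial_crossover(target, donor, pc, rng=random):
    """Take each component from the donor when rand <= pc; index k always does."""
    d = len(target)
    k = rng.randrange(d)  # guarantees at least one donor component
    return [donor[j] if (rng.random() <= pc or j == k) else target[j]
            for j in range(d)]

def exponential_crossover(target, donor, pc, rng=random):
    """Copy a contiguous run of donor components starting at a random index n."""
    d = len(target)
    n = rng.randrange(d)
    trial = list(target)      # components before n stay from the target
    trial[n] = donor[n]       # the n-th component always comes from the donor
    j = n + 1
    while j < d and rng.random() <= pc:
        trial[j] = donor[j]   # keep copying from the donor while rand <= pc
        j += 1
    return trial              # components after the run stay from the target
```

Binomial crossover scatters donor components across the vector, whereas exponential crossover transfers them as one block, which changes neighbouring components together.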

3) SELECTION
Based on the fitness values of the target (parent) and trial (offspring) vectors, a greedy selection scheme decides which of the two vectors survives to the next generation (n + 1).
The selection procedure is expressed as: where f(.) is the objective (cost) function to be maximized. Note that only maximization problems are considered in this paper.
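A minimal sketch of greedy selection for maximization is given below. Letting the trial vector win ties is an assumption, and `fitness` is any callable that returns a number.

```python
def greedy_select(target, trial, fitness):
    """Return the vector that survives to the next generation (maximization)."""
    # The trial replaces the target only if it is at least as fit (tie rule assumed).
    return trial if fitness(trial) >= fitness(target) else target
```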

C. PROPOSED HYBRID FIREFLY-DIFFERENTIAL EVOLUTION ALGORITHM
To improve the quality of the solutions and overcome the limitations of the individual Firefly and Differential Evolution algorithms, this paper introduces a novel model-based framework built on a hybrid of the Firefly (FA) and Differential Evolution (DE) algorithms. In the proposed work, solution quality means generating test cases, from all parts of the search space, that achieve transition path coverage of the feasible paths present in the objective function. This is possible only when a metaheuristic optimization technique provides both exploitation and exploration at the same time. Each metaheuristic algorithm has its own strength in either exploitation or exploration.
Recently, researchers have shown interest in hybrid approaches [86], which combine more than one algorithm in a single framework to improve solution quality by giving importance to both exploitation and exploration. In this work, the firefly and differential evolution algorithms are combined to form an efficient hybrid metaheuristic approach, referred to as the hybrid FA-DE algorithm, in which FA and DE work together to increase exploitation and exploration capability simultaneously. As a result, the gap between exploitation and exploration decreases, which increases the number of test cases for each path. FA has good exploitation capability because each firefly moves towards a brighter neighbouring firefly. In this way, the fireflies of the population form different clusters based on their attractiveness. Similarly, DE is an evolutionary algorithm with higher exploration capability, because its mutation operator is applied to all candidate solutions in the population to increase randomness in the search space.
In FA, every firefly is attracted to a brighter firefly, and if a firefly cannot find a brighter neighbour it tries to find another firefly through a random walk [20], [21], [47]. Here the DE algorithm [48] replaces the random walk used for exploration of the search space. The differential evolution algorithm applies its mutation and crossover operators only to those fireflies that cannot find a brighter one. When a firefly is unable to find a brighter neighbour, it is considered a possible local best. In this case the DE operators, mutation and crossover, help the firefly avoid becoming trapped in a local optimum and so provide quality test suites for object-oriented programs. The detailed steps of the hybrid FA-DE are given in Algorithm 3, and the design of the hybrid FA-DE algorithm is shown in FIGURE 1.
An individual firefly is attracted to a neighbouring firefly when the latter is brighter, that is, has a higher fitness in the population. This process is repeated until every firefly in the population has been processed. To ensure diversity, only some fireflies are selected for the execution of DE: those for which a random number drawn from [0, 1] is greater than the selection rate of FA. DE thereby improves exploration capability and convergence speed. The combination of FA and DE therefore increases the random search behavior of the proposed hybrid algorithm, yielding high-quality solutions in the form of a uniform number of test cases for all-paths coverage.
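The interplay described above can be sketched as a single generation of a hybrid loop. This is an illustrative reading, not a transcription of Algorithm 3: DE/rand/1 mutation, binomial crossover, bound clipping, the parameter name `fa_rate` and all default values are assumptions.

```python
import math
import random

def hybrid_generation(pop, fitness, lower, upper, beta0=1.0, gamma=1.0,
                      alpha=0.2, F=0.5, pc=0.9, fa_rate=0.5, rng=random):
    """One generation of an illustrative FA-DE hybrid (maximization, len(pop) >= 4)."""
    n, d = len(pop), len(pop[0])

    def clip(x):
        # Keep every component inside its [lower, upper] bound.
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    fit = [fitness(x) for x in pop]
    new_pop = [list(x) for x in pop]
    for i in range(n):
        brighter = [j for j in range(n) if fit[j] > fit[i]]
        if brighter:
            # FA step: move towards each brighter firefly in turn.
            x = list(pop[i])
            for j in brighter:
                r_sq = sum((a - b) ** 2 for a, b in zip(x, pop[j]))
                beta = beta0 * math.exp(-gamma * r_sq)
                x = [a + beta * (b - a) + alpha * rng.uniform(-1.0, 1.0)
                     for a, b in zip(x, pop[j])]
            new_pop[i] = clip(x)
        elif rng.random() > fa_rate:
            # No brighter firefly: DE mutation and crossover replace the random walk.
            r1, r2, r3 = rng.sample([k for k in range(n) if k != i], 3)
            donor = [a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])]
            k = rng.randrange(d)
            trial = clip([donor[j] if (rng.random() <= pc or j == k) else pop[i][j]
                          for j in range(d)])
            if fitness(trial) >= fit[i]:   # greedy selection keeps the better vector
                new_pop[i] = trial
    return new_pop
```

Because the current best firefly changes only through greedy selection, the best fitness in the population never decreases from one generation to the next in this sketch.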

D. TIME COMPLEXITY OF HYBRID FIREFLY AND DIFFERENTIAL EVOLUTION ALGORITHM
Metaheuristic algorithms are used to find good solutions to NP-hard engineering optimization problems in polynomial time. They are simple in design and easy to implement, and hence they produce results efficiently compared with conventional optimization techniques.
Nowadays, many hybrid metaheuristic algorithms have been proposed for engineering design applications to provide more consistent and better results than individual metaheuristic algorithms. In practice, a hybrid algorithm is designed to eliminate the limitations of the individual metaheuristic algorithms at the cost of slightly higher time complexity. In this work, a hybrid FA-DE algorithm is proposed by combining the exploitation property of FA and the exploration feature of DE, with the expectation of generating consistent test suites that cover the entire problem space in less computing time. The firefly algorithm consists of two inner loops, each of maximum size n (the population size), and one outer loop of size t, the maximum number of generations of FA. Hence the worst-case time complexity of FA is O(n²t). Similarly, DE has the same time complexity, O(n²t). For small n (in our case, at most n = 50) and small t (in our case, at most t = 30), the computation cost is roughly linear in t. In general, most of the computational time is spent evaluating the fitness function, which is the same for all metaheuristic algorithms. In the proposed hybrid FA-DE algorithm, if a firefly fails to find a brighter neighbouring firefly, the DE algorithm is executed in place of the random walk of FA. Hence, the proposed hybrid FA-DE algorithm has the same time complexity, O(n²t), which is inexpensive relative to the standard FA, DE and other hybrid algorithms mentioned in the literature [46].
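As a worked check of this bound at the largest settings used here (n = 50, t = 30), the worst-case number of pairwise firefly comparisons is

$$n^2 t = 50^2 \times 30 = 2500 \times 30 = 75{,}000,$$

and for a fixed population size the count grows in direct proportion to t, for example 25,000 comparisons at t = 10 and 50,000 at t = 20.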

IV. PROPOSED FRAMEWORK
This paper suggests a framework, shown in FIGURE 2, to generate a set of suitable test suites from a UML state chart model using the hybrid FA-DE algorithm. Initially the UML state chart model is converted to a state chart graph (SCG). The start node of the SCG is then assigned weight 1, and the edges are also assigned weight 1. The parent node number is the weight of the child node, and if a child node has several parent nodes, the sum of all parent node weights is the weight of the child node [19]. Depth-first search is then applied to traverse the SCG, tracing the feasible paths and the total path cost, i.e. the sum of the node weights and edge weights assigned to each path. The total path weight is the fitness function for each feasible path [24]. After calculation of the fitness function, the fireflies are generated randomly at each node, and the sum of the edge and node weights is the fitness of the respective firefly. Finally, the hybrid FA-DE metaheuristic algorithm is applied to generate test suites for path-based coverage of the feasible paths.
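The path-tracing step can be sketched as follows, assuming an acyclic graph given as an adjacency dictionary. The node names and node weights in the usage note are hypothetical and are supplied as input rather than derived by the weighting rule above.

```python
def feasible_paths(graph, start):
    """Enumerate all start-to-leaf paths of an acyclic graph by depth-first search."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        children = graph.get(path[-1], [])
        if not children:
            paths.append(path)           # reached a final state
        for child in reversed(children): # reversed so children are visited in order
            stack.append(path + [child])
    return paths

def path_cost(path, node_weight, edge_weight=1):
    """Total path weight: node weights plus one edge weight per transition."""
    return sum(node_weight[s] for s in path) + edge_weight * (len(path) - 1)
```

For a hypothetical graph `{"A": ["B", "C"], "B": ["D", "E"]}`, the search yields the paths A-B-D, A-B-E and A-C, and each path's cost then serves as its fitness value.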

V. EXPERIMENTS AND RESULT ANALYSIS
For the example problem, the UML state chart was developed using ArgoUML, and the test suites were generated using MATLAB R2016b. The experiments were performed on an Intel Core i3 CPU at 2.0 GHz with 4 GB RAM, running 64-bit Windows.
This paper suggests a framework to generate optimized test suites for the model-based testing of object-oriented programs using the UML state chart model and the hybrid metaheuristic FA-DE algorithm. In this framework, the UML state chart model is first developed for the example case study, the benchmark triangle classification problem, using the ArgoUML CASE tool, as shown in FIGURE 3. The state model is then converted to a state chart graph (SCG), as shown in FIGURE 4, where the states of the state chart model become the nodes of the SCG. In the next step, the DFS graph traversal algorithm is applied to the SCG to generate the feasible path sequences, and the total path weight of each path sequence is calculated. The path weight is the fitness function of the problem, which is a maximization problem.
The hybrid FA-DE optimization algorithm has been used to generate test suites, and the path-based coverage criterion determines the fitness of the candidate solutions. In this work, the proposed hybrid FA-DE algorithm was implemented and its performance compared with metaheuristic algorithms including FA, DE, PSO and CS, as well as our recently published hybrid particle swarm optimization and gravitational search algorithm (PSO-GSA) [22] and hybrid Cuckoo search and simulated annealing (CS-SA) [24] metaheuristic algorithms for test suite generation. The first experiment determines the optimal control parameter values of the proposed algorithm, the second experiment compares four metaheuristic algorithms to point out the importance of the proposed hybrid algorithm, and finally the third experiment compares the

A. EXPERIMENTAL SETUP
The classic triangle classification problem is a benchmark problem that many researchers [22], [23] have used over the last three decades in software testing, specifically for test data generation [34]. The generated test suites must hold test data that satisfy the triangle characteristics and distinctive attributes of this problem, which places it in a very specific group of benchmark software testing problems. Separate sets of data are required for the different categories of triangle (isosceles, scalene and equilateral), which is too complex to work out manually [25]. The triangle classification problem has been used as a case study for generating test suites with metaheuristic algorithms such as Cuckoo search and Particle swarm optimization [22]-[26], and is already recommended as a benchmark problem for test case and test data generation. Therefore, the triangle classification problem is used in this work as a case study to generate test suites using the metaheuristic FA, DE and hybrid FA-DE algorithms.
The UML state chart model of the triangle classification problem is shown in FIGURE 3 and its state chart graph (SCG) is depicted in FIGURE 4. As shown in FIGURE 4, the triangle classification problem has six states and four feasible paths. State S1 is the test for a triangle, S3 is the not-a-triangle state, S2 is a composite state representing a triangle, S4 is the scalene state, S5 is the isosceles state, and S6 is the equilateral state. TABLE II lists the names of the paths, the state sequence of each path and the conditions that each path sequence satisfies. [...] TABLE IV and TABLE V. The test data generated at each generation for varying population sizes and numbers of generations are depicted graphically in Figure 5 and Figure 6.
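To make the link between test data and paths concrete, the sketch below maps three side lengths to the state sequence they would exercise. The transition structure (S1 to S3 for invalid input, S1 to S2 to one of S4, S5, S6 otherwise) is an assumption inferred from the state descriptions; the authoritative path definitions are those of TABLE II.

```python
def classify_triangle(a, b, c):
    """Return the assumed state sequence visited for side lengths a, b, c."""
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        return ["S1", "S3"]        # not a triangle
    if a == b == c:
        return ["S1", "S2", "S6"]  # equilateral
    if a == b or b == c or a == c:
        return ["S1", "S2", "S5"]  # isosceles
    return ["S1", "S2", "S4"]      # scalene
```

A test case is then a triple (a, b, c), and a path is covered once at least one generated triple maps to its state sequence. Randomly drawn sides rarely satisfy a = b = c, which is one reason uniform coverage of all paths is non-trivial.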
In TABLE IV and TABLE V, the minimum test cases indicate minimum path coverage, i.e., the minimum number of test cases generated for a particular path in any one execution of the algorithm. This is the lower bound on test case generation for that path. Similarly, the maximum test cases indicate maximum path coverage, i.e., the maximum number of test cases generated for a particular path in any one execution of the algorithm. This is the upper bound on test case generation for that path.
In this experiment, a minimum value of zero indicates that the path was not covered in at least one of the executions. A maximum value of zero means that no test case was generated for the path in any of the 30 executions of the algorithm. In the results, it is clearly observed that one minimum value is zero for the proposed FA-DE hybrid algorithm. The mean test cases indicate the average number of test cases generated for a particular path over all executions of the algorithm. The proposed FA-DE algorithm achieved a large mean number of test cases for all paths, which is not the case for all paths with FA and DE. Another important statistic is the standard deviation, which indicates the variation in test case generation across executions. According to the statistics in TABLE IV and TABLE V, the mean and standard deviation values are closer across all paths for the proposed FA-DE than for FA and DE. In addition, it was observed that FA has stronger exploitation capability, which leads it to cover some paths far more often than others. Similarly, DE has better exploration capability, so it covers more paths, with closer values, than FA. These results motivated us to hybridize FA and DE to produce balanced and stable statistics with much improved test suite generation compared with FA and DE alone. In many cases FA and DE generated no data for paths 2 and 3, whereas the proposed hybrid FA-DE algorithm provides a uniform number of test data for every path. Comparing TABLE IV and TABLE V, it is evident that for a small population and few generations, FA and DE are not able to generate an adequate number of test suites for all four paths. On the other hand, FA-DE is quite efficient in generating stable and uniform test suites for all four paths, specifically for path 3, which is a critical path of the problem.
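The per-path statistics discussed above can be computed as in the following sketch. The input is the list of test-case counts obtained for one path over repeated executions; using the sample standard deviation is an assumption, since the variant is not stated.

```python
import statistics

def path_statistics(counts):
    """Summarize the test-case counts of one path across repeated executions."""
    return {
        "min": min(counts),                 # 0 means the path was missed in some run
        "max": max(counts),                 # 0 means the path was never covered
        "mean": statistics.mean(counts),
        "std": statistics.stdev(counts),    # sample standard deviation (assumed)
    }
```

For hypothetical counts `[0, 4, 8]`, the minimum of 0 shows that the path was missed in one run even though the mean is 4.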
FIGURE 5 (a, b, c) and FIGURE 6 (a, b, c) depict the test suites generated for the triangle classification problem by DE; similarly, FIGURE 5 (d, e, f) and FIGURE 6 (d, e, f) show the test cases generated by FA, and FIGURE 5 (g, h, i) and FIGURE 6 (g, h, i) show the test cases generated by the proposed hybrid FA-DE algorithm. The inference derived above agrees with the graphs depicted in FIGURE 5 (g, h, i) and FIGURE 6 (g, h, i). Hence, the proposed hybrid FA-DE algorithm generates optimal test suites uniformly for all the test paths, as shown pictorially in the bar chart of FIGURE 7.
[Figure captions: test case generation with fixed population size (30) and variation in the number of generations (10, 20, 30); test case generation with fixed number of generations (10) and variation in population size (10, 20, 50).]
[...]ing in mind their popularity as well as their success in various complex software engineering problems, including test data generation, test sequence generation, test case prioritization, etc. At the same time, the proposed FA-DE was also compared with our published hybrid algorithms, PSO-GSA and CS-SA. The performance of these algorithms is compared using descriptive statistics and statistical hypothesis testing. In this experiment, the number of test cases and the execution time were used as metrics for the statistical analysis and comparison. The descriptive statistics used to analyse the test cases for the different paths of the triangle classification problem and the execution time of the metaheuristic algorithms include minimum, maximum, mean, median, mode, standard deviation, and standard error. In addition, single-factor ANOVA is used to verify whether or not the results of the metaheuristic algorithms differ significantly from each other. Referring to the results in [...] i.e. PSO-GSA and CS-SA.
In addition, statistical hypothesis testing can verify whether all the algorithms used in the experiments perform similarly or significantly differently, using the test suites generated by the respective metaheuristic algorithms for all four paths along with their execution times. To check this hypothesis, single-factor ANOVA has been used in our experiment. Initially, a null hypothesis (H0) was specified: all algorithms used in this experiment performed equally, without significant difference. Then an alternative hypothesis (H1) was defined: the algorithms used in the experiments differ significantly in performance. Referring to the results of TABLE VIII and TABLE IX, based on single-factor ANOVA on the test case generation and execution time of all algorithms, the null hypothesis was rejected using the F-critical value (2.14).
The F test statistic (54.07) is greater than the F-critical value (2.14) in TABLE VIII at the 5% level of significance. Hence, hypothesis H0 is rejected, and the numbers of test cases for all paths differ significantly among the algorithms. For the other performance metric, execution time, the F test statistic (2703.15) obtained using single-factor ANOVA is also greater than the F-critical value (2.14) at the 5% level of significance.
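The F statistic of a single-factor ANOVA can be computed as in the sketch below, where each group holds the observations of one algorithm (for example, test-case counts or execution times over repeated runs). The function name is illustrative; the decision rule is to reject H0 when the computed F exceeds the F-critical value.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a single-factor ANOVA over a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation between group means and variation within each group.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within
```

A large F means the spread between algorithm means is large relative to the run-to-run spread within each algorithm, which is the situation reported here (54.07 and 2703.15 against a critical value of 2.14).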

E. DISCUSSION
Because the performance of metaheuristic algorithms is degraded by improper selection of the control parameters, in the first experiment the proposed algorithm was executed with different combinations of control parameters in order to tune them.
In the second experiment, the performance of the proposed FA-DE was evaluated and statistically compared with the FA and DE metaheuristic algorithms. The results presented in TABLE IV and TABLE V revealed that the proposed FA-DE outperforms FA and DE in a large search space. The third experiment, using TABLE VI, TABLE VII and TABLE X, showed the motivation for selecting FA and DE for hybridization over other recently popular metaheuristic algorithms such as PSO and CS. The performance of the hybrid FA-DE was then statistically compared with our recently published PSO-GSA and CS-SA hybrid algorithms, and it was established that the proposed algorithm is superior in generating balanced test suites with inexpensive execution time.
Finally, using TABLE VIII and TABLE IX, based on the test cases for each path and the execution time of each algorithm (TABLE X), single-factor ANOVA statistically rejected the null hypothesis and accepted the alternative hypothesis, which states that the test suites generated by the different metaheuristic algorithms in the experiments are significantly different. The descriptive statistics showed that the proposed FA-DE hybrid algorithm can be an alternative method for test suite generation in object-oriented software testing. From the analysis of the experimental results and computation time, it can be concluded that the new hybrid FA-DE algorithm is capable of promising performance. The proposed hybrid FA-DE method has the following advantages: (1) The experimental results indicate that the two algorithms contribute complementary features, which increases test suite coverage.
(2) DE is executed, subject to the selection rate, only when a firefly fails to find a brighter neighbouring firefly. Hence, the execution time of the proposed FA-DE algorithm is only slightly higher than that of FA. (3) Owing to its balanced exploitation and exploration characteristics, the proposed method performs better than existing metaheuristic algorithms. (4) Most of the time, it generates test cases symmetrically for all critical paths. A comparison among all the algorithms is depicted pictorially in FIGURE 7. The major limitation of the proposed FA-DE algorithm is that its performance is sensitive to proper tuning of the control parameters.
The proposed framework can be applied to testing real-life problems, i.e. test suites can be derived by following the proposed methodology without any further modifications or improvements. The software requirement specification document of any problem statement can be used to design UML models, and test suites can be derived from those models by replicating the proposed framework.

VI. CONCLUSION
Automatic test suite generation in object-oriented programming is a challenging task. This paper proposes a novel hybrid FA-DE framework to generate optimized test suites targeting the path-based coverage criterion of testing. The Firefly algorithm and the Differential Evolution algorithm have proven their efficiency in solving many engineering optimization problems. Our proposed framework has successfully generated optimal test suites for effective testing of object-oriented programs using the UML state chart model.
The proposed framework is simulated using the benchmark triangle classification problem. The simulation results of the hybridized FA-DE algorithm clearly established the hybrid algorithm's efficiency in uniform exploration and exploitation of the solution space, generating uniform test suites for all the feasible paths of the example problem and achieving full path coverage. In contrast, the individual FA and DE algorithms efficiently generated test suites only for some specific paths. Therefore, the efficiency of the proposed framework in generating uniform test suites for all four paths of the triangle classification problem can be attributed to the exploitation and exploration capabilities of the hybrid algorithm. This framework can be further enhanced by using hybrid versions of other metaheuristic algorithms and different UML models, taking large data sets into consideration. In addition, the work will be extended by using ensemble strategies of metaheuristic algorithms [78] for test data generation in model-based testing of object-oriented programs.