Test Case Prioritization Based on Artificial Immune Algorithm

Abstract: Regression testing is an essential part of smart terminal program development. The test case suite is usually preprocessed with test case prioritization techniques to improve the efficiency of regression testing. To address the shortcomings of the traditional genetic algorithm on the test case prioritization problem, this paper proposes a test case prioritization algorithm for intelligent terminals based on the artificial immune algorithm. First, different orderings of the test case set are used as antibody encodings to initialize the antibody population; second, the Hamming distance is introduced as the antibody concentration index and used to calculate the excitation degree; finally, the antibodies undergo immune processing to find the optimal test case ordering. The experimental results show that the algorithm based on the artificial immune algorithm has stronger global search capability and is less likely to fall into a local optimum than the genetic algorithm, which indicates that the artificial immune algorithm is more stable and better suited to the test case prioritization problem.


INTRODUCTION
To better meet user needs, intelligent terminal software systems are updated and iterated at an accelerating rate, and their development has become an iterative process driven by regression testing [1]. After defects in a smart terminal program have been fixed, the system must be regression tested to ensure correctness across program branches. During the development and maintenance of intelligent terminal software systems, program requirements change frequently [2]. As versions iterate, the corresponding test case set grows larger and the cost of regression testing keeps increasing. By prioritizing the test case set according to a chosen strategy, the test cases that better cover system defects are placed at the front of the set. These higher-priority test cases are then executed first during regression testing, so that errors in the software are found early and the cost of software development is reduced.
Test Case Prioritization (TCP) refers to determining the order of test case execution according to a certain strategy for one or more test objectives to be optimized. The TCP technique helps testers find a better test case execution sequence. By executing the prioritized test cases, testers are able to find software defects earlier or at a lower cost [3]. It has been shown that test case prioritization is an NP-hard problem [5], and traditional deterministic algorithms have difficulty obtaining optimal solutions for it. For this reason, many heuristic optimization algorithms, including swarm intelligence algorithms, have been proposed [4,6,7]. Compared with deterministic approaches, both genetic algorithms and swarm intelligence optimization algorithms have shown better results in solving the TCP problem [8][9][10]. However, these algorithms are prone to falling into local optima and to premature convergence when solving the TCP problem [11].
The Artificial Immune Algorithm (AIA) is an intelligent algorithm that emerged after the swarm intelligence algorithms. Like them, it was brought into computer science from observational studies in the life sciences [12]. The AIA models, in mathematical terms, the information processing mechanism of an organism's immune system during an immune response, and builds a corresponding engineering model from it. It is able to generate and maintain population diversity and is characterized by adaptivity, stochasticity, parallelism, global convergence, and population diversity [13], which helps it avoid the "premature" convergence problem in the search for the optimal solution and obtain the global optimal solution of the problem to be optimized. Currently, the AIA performs well in many fields such as wireless sensing [14], military applications [15], and control engineering [16]. Based on the AIA, this paper proposes an optimization algorithm for the TCP problem of intelligent terminals. The algorithm is used to find the optimal test case sequence, and its performance is analyzed through experimental results.

RELATED WORK
Wong et al. [17] were the first to conduct research on test case prioritization techniques. They reduced and prioritized the test case set based on code modification information and the historical execution information of the test cases. Rothermel et al. [18,19] proposed the Total strategy and the Additional strategy and experimentally verified the effectiveness of these two strategies in solving the TCP problem. Li et al. [5] transformed the TCP problem into a knapsack problem and proved that the TCP problem is an NP-complete combinatorial optimization problem. They applied Hill Climbing and the Genetic Algorithm (GA) to the TCP problem in their experiments, and the results showed that the GA performed better. Zhang et al. [20] solved the TCP problem based on GA and proposed two test case prioritization evaluation metrics. Zhang et al. [21] proposed an algorithm based on Particle Swarm Optimization (PSO) with Tent chaos, which improved the three main characteristics of the Tent mapping and effectively avoided the premature convergence of the standard PSO at the later stage. Zhang et al. [19] proposed a hybrid "initial selection, ranking, re-selection" optimization method that combined test case selection and priority ranking, using the results of change impact analysis of the code based on function call paths.

AIA-BASED TEST CASE PRIORITIZATION
Test Case Priority Definition
Test case prioritization is one of the important tools for improving the efficiency of regression testing. Rothermel et al. [18] defined the test case prioritization problem as follows. Known: a test case set T; PT, the set consisting of all possible orderings of the test cases in T; and f, a mapping from PT to the real numbers. Purpose: find $T' \in PT$ such that $(\forall T'' \in PT)(T'' \neq T')\,[f(T') \geq f(T'')]$. In this definition, PT is the domain of the optimization objective function f, and f is a quantitative description of the ranking objective that measures the effectiveness of an ordering. The larger the value of f, the better the test case sequence is ordered. T' is the optimal test case ordering and T'' is any other ordering in PT, which is no better than T'.
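To make the definition concrete, the following minimal sketch searches PT exhaustively for a small test set. The function name `best_ordering` and the toy objective are hypothetical and are not part of the paper; the exhaustive search is only feasible for very small T, since PT contains n! orderings, which is why heuristic search is used in practice.

```python
from itertools import permutations


def best_ordering(tests, f):
    """Return an ordering T' in PT such that f(T') >= f(T'') for all T'' in PT."""
    # permutations(tests) enumerates PT, the set of all orderings of T.
    return max(permutations(tests), key=f)


def toy_objective(order):
    """Hypothetical objective f: the earlier test 't3' runs, the larger the value."""
    return -order.index("t3")


best = best_ordering(["t1", "t2", "t3"], toy_objective)
print(best)
```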

Introduction to the Artificial Immune Algorithm
The Artificial Immune Algorithm (AIA) is an intelligent optimization algorithm constructed by imitating the biological immune mechanism and combining it with the evolutionary mechanism of genes. It has the characteristics of a general immune system and uses a population search strategy to obtain the optimal solution of the problem by iterative computation. Compared with other algorithms, the AIA uses its own mechanism for generating and maintaining diversity to keep the population diverse, which helps it overcome the "premature" convergence problem of general optimization search and find the global optimal solution. The AIA has the advantages of adaptivity, stochasticity, parallelism, global convergence, and population diversity.

Coding Strategy Design
In the test case prioritization problem, the input is the test case set T and each antibody individual is a test case sequence. In this paper, different orderings of the test case set are used in the AIA as antibody encodings. Suppose the test case set is $T = \{t_1, t_2, \ldots, t_n\}$; the coding of one of the antibodies is shown in Fig. 1.
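The encoding can be sketched as follows: each antibody is simply a permutation of the test case identifiers, and the initial population is a list of random permutations. The function name `init_population` and the `seed` parameter are assumptions made for illustration, not names from the paper.

```python
import random


def init_population(test_ids, pop_size, seed=None):
    """Build pop_size antibodies, each a random ordering of the test case set."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    population = []
    for _ in range(pop_size):
        antibody = list(test_ids)  # copy so the input set is left unchanged
        rng.shuffle(antibody)      # one antibody = one permutation of T
        population.append(antibody)
    return population


population = init_population(["t1", "t2", "t3", "t4"], pop_size=5, seed=0)
for antibody in population:
    print(antibody)
```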

Affinity Function Design
Affinity characterizes the binding strength of immune cells to antigens and is similar to fitness in GA. The evaluation of affinity is problem-specific: for each optimization problem, the affinity evaluation function should be defined according to the characteristics of the problem and an understanding of its nature. For function optimization problems, the function value, or a simple transformation of it (e.g., the reciprocal or the negation), can usually serve as the affinity, while combinatorial optimization problems and more complex applied optimization problems require problem-specific analysis [23].
In this paper, affinity evaluates the similarity between an individual antibody and the optimal solution, which corresponds to the degree of match between antigen and antibody in the immune system. The Average Percentage of Statement Coverage (APSC) of each test case sequence is used as the affinity index of the antibody: the closer the APSC is to 1, the higher the affinity of the antibody and the closer the test case sequence is to the optimal solution.

Antibody Concentration Design
The antibody concentration characterizes the diversity of the antibody population. A high antibody concentration means that the population contains a large number of very similar individuals, so the search for the optimal solution is concentrated in one region of the feasible solution space, which is not conducive to global optimization. Therefore, the optimization algorithm should suppress individuals whose concentration is too high in order to preserve diversity [23].
In this paper, each antibody encodes an ordering of the test case set, so the Hamming distance between antibodies is used to measure antibody-antibody affinity. The formula for calculating the Hamming distance is shown in Eq. (1), where the coding similarity between two antibodies is used as the value of the Hamming distance; the term indexed by k is defined as shown in Eq. (2). In an antibody population, the higher the coding similarity of two antibodies, the greater this Hamming distance measure and the higher the antibody concentration. Note that the measure as used here therefore grows with similarity, whereas the conventional Hamming distance counts the positions at which two codes differ.
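Since Eqs. (1) and (2) are not reproduced in this text, the following is only a sketch under stated assumptions: the per-position term is taken to be 1 when two antibodies hold the same test case at position k and 0 otherwise, and the concentration of an antibody is taken to be its mean normalized match count against the other antibodies. Both function names and the normalization are hypothetical choices, not the paper's formulas.

```python
def position_matches(ab_i, ab_j):
    """Count positions at which two equal-length orderings hold the same test case."""
    return sum(1 for x, y in zip(ab_i, ab_j) if x == y)


def concentration(index, population):
    """Assumed form: mean normalized match count between one antibody and the rest."""
    target = population[index]
    others = [ab for k, ab in enumerate(population) if k != index]
    if not others:
        return 0.0
    # Dividing by the code length keeps each pairwise similarity in [0, 1].
    return sum(position_matches(target, ab) for ab in others) / (len(target) * len(others))


population = [[1, 2, 3, 4], [1, 2, 4, 3], [4, 3, 2, 1]]
print(position_matches(population[0], population[1]))  # 2 positions agree
print(concentration(0, population))                     # 0.25
print(concentration(2, population))                     # 0.0
```

Under these assumptions, an antibody that shares many positions with the rest of the population receives a high concentration and can be suppressed in favor of more distinct orderings.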

Antibody Excitation Degree Design
Antibody excitation is the combined ability of an antibody to respond to the antigen and to be activated by other antibodies. It is mainly influenced by affinity and concentration: it is proportional to affinity and inversely proportional to concentration [20]. In this paper, it is obtained by a simple mathematical operation on the antibody affinity and antibody concentration values, as shown in Eq. (3), where $act(ab_i)$ is the excitation degree of antibody $ab_i$, and a and b are constants.
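Eq. (3) is not reproduced in this text. One simple form consistent with the description, assumed here purely for illustration, is a linear combination that rewards affinity and penalizes concentration. The default values 0.66 and 0.34 are the excitation constants reported in the experimental setup; mapping them to a and b in that order is also an assumption.

```python
def excitation(affinity, concentration, a=0.66, b=0.34):
    """Assumed linear form: higher affinity raises excitation, higher concentration lowers it."""
    return a * affinity - b * concentration


# Hypothetical affinity (APSC) and concentration values for three antibodies.
affinities = [0.80, 0.80, 0.70]
concentrations = [0.50, 0.10, 0.00]

scores = [excitation(aff, den) for aff, den in zip(affinities, concentrations)]
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)  # antibody indices from most to least excited
```

In this toy example, the first two antibodies have equal affinity, but the second is ranked higher because it sits in a less crowded region of the population.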

AIA-based Test Case Prioritization Optimization Algorithm Flow
The AIA-based optimization algorithm proposed in this paper is shown in Algorithm 1. In Algorithm 1, line 1 is the input; lines 2 and 3 initialize the algorithm parameters and the antibody population; line 4 evaluates the affinity of each feasible solution in the population to obtain the maximum affinity aff_max in the current population and the corresponding antibody individual best; lines 5-28 perform the iterative optimization search over the antibody population, where the immune processing steps are immune selection, cloning, mutation, and clonal suppression; and line 29 performs the post-processing and visualization of the results.

Algorithm 1 AIA-based test case prioritization algorithm
The immune processing part of the algorithm is adapted to the TCP problem, as shown in Algorithm 2.
In Algorithm 2, lines 1-3 are the input and output of the algorithm, lines 4-9 perform immune selection on the current antibody population, and lines 10-26 perform immune clone selection on the antibody population.
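The listings of Algorithms 1 and 2 are not reproduced in this text, so the following is a compact sketch of an immune loop with the four named steps, not the authors' algorithm. All function names, the swap mutation, keeping the top half by excitation, the linear excitation form, and the random refresh of the population are assumptions made for illustration.

```python
import random


def apsc(order, coverage, num_statements):
    """Standard APSC of an ordering; coverage maps test id -> set of covered statements."""
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for stmt in coverage[test]:
            first_pos.setdefault(stmt, pos)  # position of first covering test
    n = len(order)
    return 1 - sum(first_pos.values()) / (n * num_statements) + 1 / (2 * n)


def concentration(antibody, population):
    """Assumed form: mean fraction of matching positions (self included, a uniform offset)."""
    total = sum(sum(x == y for x, y in zip(antibody, other)) for other in population)
    return total / (len(antibody) * len(population))


def swap_mutation(antibody, rng):
    """Swap two positions so the result is still a valid permutation."""
    child = list(antibody)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def aia_prioritize(coverage, num_statements, pop_size=20, iterations=50,
                   n_clones=5, p_mutation=0.6, a=0.66, b=0.34, seed=0):
    rng = random.Random(seed)
    tests = sorted(coverage)

    def affinity(antibody):
        return apsc(antibody, coverage, num_statements)

    def random_antibody():
        return rng.sample(tests, len(tests))

    population = [random_antibody() for _ in range(pop_size)]
    best = max(population, key=affinity)
    for _ in range(iterations):
        def excitation(antibody):
            return a * affinity(antibody) - b * concentration(antibody, population)

        # Immune selection: keep the most excited half of the population.
        selected = sorted(population, key=excitation, reverse=True)[: pop_size // 2]
        survivors = []
        for antibody in selected:
            # Cloning and mutation: each clone is mutated with probability p_mutation.
            clones = [swap_mutation(antibody, rng) if rng.random() < p_mutation
                      else list(antibody) for _ in range(n_clones)]
            # Clonal suppression: only the best of parent and clones survives.
            survivors.append(max(clones + [antibody], key=affinity))
        # Population refresh (assumed): fill the remainder with new random antibodies.
        population = survivors + [random_antibody() for _ in range(pop_size - len(survivors))]
        best = max(population + [best], key=affinity)
    return best, affinity(best)


# Hypothetical coverage data: test id -> statements covered (5 statements in total).
coverage = {"t1": {1}, "t2": {2, 3}, "t3": {1, 2, 3, 4, 5}, "t4": {5}}
best, score = aia_prioritize(coverage, num_statements=5)
print(best, score)
```

Keeping the best antibody found so far outside the population means a good ordering is never lost to suppression or refresh, while the random refresh keeps exploring other regions of the search space.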

EXPERIMENTAL VERIFICATION
Selection of Optimization Targets and Experimental Environment
Selection of Optimization Targets
For the test case prioritization optimization problem, the Average Percentage of Statement Coverage (APSC) of the test case sequence is selected as the optimization objective for the experiments in this paper [24].
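APSC represents the rate at which a sequence of test cases covers the statements in the software code and is defined as in Eq. (4):

$$APSC = 1 - \frac{TS_1 + TS_2 + \cdots + TS_m}{n \times m} + \frac{1}{2n} \quad (4)$$

where n is the number of test cases, m is the number of statements, and $TS_i$ is the position in the sequence of the first test case that covers statement i.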
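As an illustration, APSC can be computed directly from a coverage map. The sketch below uses the standard definition from the prioritization literature, 1 - (sum of first-covering positions) / (n * m) + 1 / (2n); the function name, the coverage data, and the data layout are hypothetical.

```python
def apsc(order, coverage, num_statements):
    """APSC of an ordering; coverage maps each test id to the set of statements it covers."""
    n = len(order)
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for stmt in coverage[test]:
            # Record the 1-based position of the first test that covers stmt.
            first_pos.setdefault(stmt, pos)
    if len(first_pos) != num_statements:
        # The formula assumes the suite reaches full statement coverage.
        raise ValueError("the suite does not cover every statement")
    return 1 - sum(first_pos.values()) / (n * num_statements) + 1 / (2 * n)


# Hypothetical coverage of 4 statements by 3 test cases.
coverage = {"t1": {1, 2}, "t2": {3}, "t3": {1, 2, 3, 4}}
good = apsc(["t3", "t1", "t2"], coverage, num_statements=4)
poor = apsc(["t1", "t2", "t3"], coverage, num_statements=4)
print(good, poor)
```

Running the test with the widest coverage first yields the higher APSC, which is the behavior the prioritization algorithm tries to exploit.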

Experimental Environment
The experiments were run on a 64-bit Windows 10 system with an x64-based Intel i5-8250U processor at 1.6 GHz and 8.00 GB of DDR4-2400 memory, using Python 3.9.0. The optimal sequence of test cases is obtained by iterative execution of the AIA. The antibody population size is set to 100, the antibody excitation constants are 0.66 and 0.34, and the clonal mutation probability is 0.6.

Test Data Set
The programs under test in this experiment are the code of five modules in the intelligent terminal system. Test case sets of different sizes are prepared for the different programs under test, and each set achieves full statement coverage of its program. The specific information is shown in Tab. 1.

Experiment 1: Exploring the Effect of the Number of Population Iterations on the Test Case Prioritization Algorithm
The main purpose of this experiment is to investigate the effect of the number of iterations on the AIA-based test case prioritization algorithm for a fixed antibody population size. The number of iterations was set to 50, 100, 150, and 200, and the test case sets of the five smart terminal programs under test were prioritized under each setting. The optimal test case sequences and the corresponding APSC values were recorded for each number of iterations. The APSC values obtained are shown in Fig. 2.
Fig. 2 reports the APSC values obtained after 50, 100, 150, and 200 iterations; the closer the APSC is to 1, the better the resulting test case sequence. As can be seen from Fig. 2, for a fixed antibody population size, the APSC of the intelligent terminal programs under test gradually approaches the optimum as the number of AIA iterations increases. After a certain number of iterations, the APSC values stabilize and no longer change significantly.

Experiment 2: Comparison of Artificial Immune Algorithm and Genetic Algorithm
This experiment is designed to compare the AIA with the GA in solving the TCP problem. We set the population size to 50 and the number of iterations to 200. The experimental procedure replicates that of the GA-based algorithm in the literature [17]. GA and AIA are each applied to prioritize the test cases of the programs under test, and the APSC values obtained from both are shown in Fig. 3. In Fig. 3, the diagonal stripes represent the APSC values obtained from the GA iterations and the small squares represent the APSC values obtained from the AIA iterations. As can be seen from Fig. 3, for a test program of a given scale, once the APSC value of test case prioritization has stabilized, the APSC value obtained by the AIA-based test case prioritization algorithm is significantly better than that of the GA-based algorithm. In solving the test case prioritization problem, the artificial immune algorithm used in this paper better preserves the optimal solutions obtained during the iterative process and keeps searching globally for possible optimal solutions, effectively avoiding falling into a local optimum.

CONCLUSION
In this paper, we proposed a test case prioritization algorithm based on the AIA for smart terminal systems. First, different orderings of the test case set were used as antibody encodings to initialize the antibody population; second, the Hamming distance was introduced as the antibody concentration index to calculate the excitation degree of the antibody population; then, the antibody population underwent immune processing and the optimal ordering of the test case set was found iteratively; finally, the experimental data were analyzed. The experimental results showed that, by bringing the idea of antibody immune mutation to the TCP problem, the proposed optimization algorithm has strong global search capability and good stability, and obtains better APSC values for the intelligent terminal test case sets than the GA.
In future work, we will try to apply more optimization algorithms to the prioritization of smart terminal test cases. In addition, we will try to analyze and improve artificial immune algorithms for the multi-objective test case prioritization problem.