Evolution test by improved genetic algorithm with application to performance limit evaluation of automatic parallel parking system

Performance limit evaluation of an automatic driving system before it is put on the market is critical for driving safety. The evolution test by genetic algorithm (GA) iteratively generates new test scenarios according to the latest test results. To avoid blind search and improve efficiency, a scenario complexity index is proposed to measure the test effectiveness indirectly and to guide the evolution process, under the assumption that automatic driving is more challenging to realise in a complex scenario. The traditional crossover and mutation operators are modified to generate more complex scenarios and thereby improve the test efficiency. The advantage of the improved crossover and mutation operators in increasing the offspring's scenario complexity index is analysed theoretically. Moreover, the influence of the design parameters on the evolution test process and the global convergence are also discussed. The new evolution test by the improved GA has been applied to find the collision condition of an automatic parallel parking system to validate its effectiveness.


This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Intelligent Transport Systems published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.

INTRODUCTION
Compared with traditional onboard systems, automatic driving brings much greater social and economic benefits, so it has become a hot spot for research and production [1][2][3]. Many competitions have been held to promote its commercialisation, for example, the Grand Challenge sponsored by the Defense Advanced Research Projects Agency [4] and the Intelligent Vehicle Future Challenges sponsored by the National Science Foundation of China [5]. Such competitions catch the public's attention and try to show the maturity of automatic driving technologies, but only typical use scenarios are considered. This is far from ensuring the reliability and performance of the automatic driving system (ADS), which relate directly to traffic safety and are influenced by uncontrolled traffic environments [6,7]. Field tests [8] and road tests [9] are widely used to validate the developed system, but they are time-consuming and costly. With the help of X-in-the-loop technologies [10,11], some researchers tested ADS by replaying recorded real traffic data to save time and money [6,12]. Compared with vehicle tests, this approach also has the advantages of safety and repeatability. However, its effectiveness is limited by the completeness of the data, and the tested ADS essentially works in open loop [13]. Besides, natural driving conditions contain many similar and simple scenarios, which are inefficient for finding hidden problems.
To realise an effective closed-loop test, some systematic methods have been adopted to design test scenarios [14]. Full coverage of the selected scenario elements can be ensured by combinatorial theory [15]. Moreover, if an analytic expression of the control logic can be established, the test can be accelerated by resampling the scenario elements to increase the probability of critical conditions [16]. However, most ADS algorithms are too complicated to be described by an analytic expression, and it is also impractical for suppliers to disclose their confidential logic to manufacturers.
To relax the dependence on prior information about the ADS, an accelerated test method that interactively optimises the probability of scenario elements using the cross-entropy method was proposed in [17,18]. This approach can find the optimal scenario that is most likely to detect the performance limit. It focuses on the probability distribution instead of the specific values of the scenario elements. It is regarded as an indirect optimisation method because, after finding the optimal distribution, we still need to look for the exact scenario that activates the performance limit. Some researchers therefore proposed direct optimisation, which treats the scenario elements as optimisation parameters and iteratively updates their values with intelligent algorithms to generate new test scenarios [5,19]. Klück et al. compared the efficiency of two frequently used intelligent algorithms, namely genetic algorithm (GA) and simulated annealing, in the performance limit evaluation of ADS through a number of experiments and concluded that GA is more efficient [20]. Vos et al. carried out an evolution test by GA for an adaptive cruise control system [21]. Furthermore, when doing the evolution test by GA for an autonomous parking system, Bühler et al. introduced a novel fitness function to accelerate the optimisation process [22]. These studies benefit test efficiency, but the evolution process is still random because the effectiveness of the generated offspring cannot be estimated beforehand.
To reduce the blindness of the search, an index is designed to measure the scenario complexity, under the assumption that automatic driving is more difficult to realise in complex scenarios. The offspring generation processes of the crossover and mutation operators are improved under the guidance of this scenario complexity index. With the idea of brother competition, the candidate offspring generated by the new crossover operator compete to survive according to their average scenario complexity index, and so do the candidate offspring generated by the new mutation operator. In this way, the generated offspring become more complex, which is beneficial to test efficiency. After a theoretical analysis of the improvement in scenario complexity, the effectiveness of the evolution test by this improved GA is validated by applying it to an automatic parallel parking system (APPS) to evaluate its collision avoidance performance.
The main contributions of this paper are:
1. A scenario complexity index is proposed to measure the effectiveness of the test scenario without conducting the test.
2. The traditional crossover and mutation operators of GA are re-designed to increase the searching speed by introducing the scenario complexity index.
3. The improvement of the offspring's scenario complexity index and the influence of the key design parameters are analysed theoretically.
4. The effectiveness of the evolution test by the improved GA is validated comparatively by application to an APPS.
The remainder of the paper is organised as follows. The main process is introduced in Section 2. Sections 3 and 4 design the new crossover and mutation operators, respectively. In Section 5, the performance is studied from two aspects: (1) the influence of the key parameter on the evolution test and (2) the global convergence. The effectiveness is validated in Section 6, and Section 7 concludes the paper.

EVOLUTION TEST BY IMPROVED GA
Similar to the traditional evolution test by GA, the improved one also converts performance limit evaluation to an optimisation problem:
$$\min_{X \in R} h(X) \quad (1)$$
where $X$ is the optimisation variable, that is, the test scenario; $h(X)$ is the objective function measuring the desired performance; and $R$ is the set composed of all possible test scenarios. In Expression (1), the performance limit is assumed to be the minimum value of the objective function as an example. For an application whose performance limit is the maximum, the optimisation problem can be converted to Expression (1) equivalently. The objective function is designed according to the desired performance of the tested ADS; it is used to evaluate the behaviour of the ADS and is always provided by the supplier.

Framework of evolution test
Like the traditional test, the evolution test by the improved GA is composed of five steps, as shown in Figure 1 [21], where the traditional crossover and mutation operators are modified to improve test efficiency. The new ones are called full crossover and multiple mutation, respectively. Under the assumption that the performance limit is more likely to be activated in complex scenarios, the full crossover and multiple mutation operators are designed to increase the possibility of producing more complex scenarios. The details are introduced in Sections 3 and 4, respectively.
The initial population, $\Pi_1 = \{X_1, \ldots, X_{2n}\}$ with $2n$ test scenarios, is generated randomly. Each test scenario, $X_i$, is composed of values selected randomly for the $m$ scenario elements. Each scenario is considered as an individual and denoted by a chromosome with $L$ genes. The performance is measured by an objective function. When it reaches the desired value, the performance limit has been found and the test process is terminated. For example, if the test target is to detect a collision, the distance from the obstacle can act as the objective function. Moreover, because test time is limited, the test is also terminated in practice when the number of iterations exceeds the allowable one.
To avoid premature convergence caused by over-selection of individuals, and to ensure that the selection probability of each individual is greater than zero, a sort-based fitness function, $f(X_i)$, is used for natural selection [23]. Here, $f(X_i)$ determines the selection probability of $X_i$; $A \in [1, 2)$ is the selection pressure, which has a positive correlation with the selection probability of good individuals; and $B_i \in [1, 2n]$ is the ranking number of $X_i$. To ensure global convergence, the best individual found during the test is recorded before natural selection, which is called an elitist selection strategy [24,25].
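The selection step described above can be illustrated with a short sketch. It assumes the common linear ranking form of a sort-based fitness, which is a stand-in and not necessarily the exact function used in the paper; the names `ranking_fitness` and `select_with_elitism` are hypothetical, and a smaller objective value is taken to be better, as in Expression (1).

```python
import random


def ranking_fitness(objective_values, pressure=1.9):
    """Linear ranking fitness (assumed form, not taken from the paper).

    objective_values: h(X_i) per individual; a smaller value is better.
    pressure: selection pressure A in [1, 2).
    """
    size = len(objective_values)
    if size < 2:
        raise ValueError("at least two individuals are required")
    if not 1.0 <= pressure < 2.0:
        raise ValueError("pressure must lie in [1, 2)")
    # Sort indices from worst (largest h) to best (smallest h).
    worst_to_best = sorted(range(size), key=lambda i: objective_values[i], reverse=True)
    fitness = [0.0] * size
    for rank, index in enumerate(worst_to_best):
        # The worst individual gets 2 - A > 0, the best one gets A.
        fitness[index] = 2.0 - pressure + 2.0 * (pressure - 1.0) * rank / (size - 1)
    return fitness


def select_with_elitism(population, objective_values, pressure=1.9, rng=None):
    """Record the best individual, then resample the population by fitness."""
    rng = rng or random.Random()
    best_index = min(range(len(population)), key=lambda i: objective_values[i])
    elite = population[best_index]
    weights = ranking_fitness(objective_values, pressure)
    selected = rng.choices(population, weights=weights, k=len(population))
    return elite, selected
```

Because the fitness depends only on the rank, a single outstanding individual cannot dominate the selection, and every individual keeps a non-zero selection probability as long as the pressure stays below 2.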

Scenario complexity index design
The crossover and mutation operators are the core of GA. To improve them by increasing the possibility of producing more complex scenarios, an index is required that measures the effectiveness of a test scenario without conducting the test. However, it is hard to set up an analytical function between the performance of the ADS and the test scenario. To overcome this challenge, the analytic hierarchy process (AHP) is adopted to design the scenario complexity index [14,15]. Different from the objective function in Expression (1), the scenario complexity index is an estimate of the effectiveness of the test scenario, that is, the ability of the test scenario to activate the performance limit of the tested ADS. As shown in Figure 2, to evaluate the test effect indirectly, a tree model is first built to analyse the influence of the scenario elements that affect the functionality and performance of the tested ADS. In Figure 2, $v_{i,j}$ is the $j$-th node in the $i$-th layer. In layers 1 to $F-1$, each node represents a scenario element, while in the bottom layer, each node is one value of its parent node. Since the number of nodes is limited, discretisation methods such as equivalence partitioning and boundary value analysis can be adopted to convert a continuous element to a limited number of discrete values [26,27].
The factors influencing ADS are numerous and complicated, and it is very difficult to determine the importance of each factor quantitatively when designing the test scenario. With this tree structure model, only a comparative analysis among the factors belonging to the same parent node is needed. Then, the importance degree of a node at the bottom layer is obtained by AHP [28,29]:
$$\lambda_k = \prod_{(i,j) \in \mathbb{R}} \omega_{i,j} \quad (3)$$
where $\lambda_k$ is the importance degree, $\omega_{i,j}$ is the relative importance of node $v_{i,j}$, and $\mathbb{R}$ is the set composed of the indexes of the path from $v_{F,k}$ to the root.
A test scenario, $X$, is represented by the values of its scenario elements, that is, $X = \{x_1, \ldots, x_m\}$, where $x_i$ is the value of the $i$-th scenario element. To find the exact performance limit, $x_i$ is generated randomly in its continuous range, so $x_i$ is generally not equal to a discrete value, $v_{F,k}$. Linear interpolation is used to calculate the importance degree of $x_i \in (v_{F,k-1}, v_{F,k})$, and the scenario complexity index is obtained by summation:
$$C(X) = \sum_{i=1}^{m} \left[ \lambda_{k-1} + \frac{x_i - v_{F,k-1}}{v_{F,k} - v_{F,k-1}} \left( \lambda_k - \lambda_{k-1} \right) \right] \quad (4)$$
where $C(X)$ is the scenario complexity index and, for each $x_i$, $k$ indexes the discrete value immediately above it.
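The index computation described above can be sketched as follows. The sketch assumes that the importance degree of a bottom-layer node is the product of the relative importances along its path to the root (the usual AHP composition), that the discrete values of each element are sorted in ascending order, and that values outside the discretised range are clamped to the nearest end point. The function names and the dictionary layout are hypothetical.

```python
from bisect import bisect_right


def leaf_importance(path_weights):
    """Importance degree of a bottom-layer node: product of the relative
    importances of the nodes on its path to the root (assumed AHP composition)."""
    degree = 1.0
    for weight in path_weights:
        degree *= weight
    return degree


def interpolated_importance(x, values, degrees):
    """Linearly interpolate the importance degree of a continuous value x.

    values: discrete values of one scenario element, strictly ascending.
    degrees: importance degree of each discrete value.
    Values outside the range are clamped (an assumption of this sketch).
    """
    if x <= values[0]:
        return degrees[0]
    if x >= values[-1]:
        return degrees[-1]
    k = bisect_right(values, x)  # values[k - 1] <= x < values[k]
    share = (x - values[k - 1]) / (values[k] - values[k - 1])
    return degrees[k - 1] + share * (degrees[k] - degrees[k - 1])


def complexity_index(scenario, elements):
    """Scenario complexity index: sum of interpolated importance degrees."""
    return sum(
        interpolated_importance(x, element["values"], element["degrees"])
        for x, element in zip(scenario, elements)
    )
```

A scenario is passed as a list of element values in the same order as `elements`, so the index can be evaluated before any simulation run is started.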

Design of full crossover operator
Traditional GA has no prior knowledge about the effectiveness of the offspring, so the crossover point is selected randomly, and there is a high possibility of missing the best one. To ensure that good offspring definitely exist among the candidates, the full crossover operator is designed as shown in Figure 3. It performs the single-point crossover at all positions. If all candidate offspring were selected to mutate, the number of offspring would grow geometrically. Therefore, the brother competition strategy is adopted to select the formal offspring pair according to the scenario complexity index, which has a positive correlation with the selection probability. In the pseudocode realising the full crossover operator (see Figure 3), the candidate offspring pairs are generated as $\Phi_{i,j} = Cr_{single}(\Phi_i, j)$ for $j = 1, \ldots, L$, and the last step (Line 6) randomly selects one candidate offspring pair as the formal one, $\tilde{\Phi}_i$, with the selection probability $w_{\Phi_{i,j}}$. Here, $Cr_{single}(\Phi_i, j)$ denotes the single-point crossover operation on the parent pair $\Phi_i$ at the $j$-th position [24,25], and $\alpha \in [0, +\infty)$ is the factor that has a positive correlation with the possibility of selecting the candidate offspring with a higher scenario complexity index as the formal offspring.
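A minimal sketch of the full crossover with brother competition is given below. The selection rule is an assumption: each candidate pair is chosen with a probability proportional to its average scenario complexity index raised to the power of the selection factor (called `alpha` here), which is uniform at `alpha = 0` and increasingly greedy as `alpha` grows. The paper's exact probability formula may differ. Chromosomes are plain Python lists, indices are assumed to be non-negative, and `complexity` is a hypothetical callable that maps a chromosome to its index.

```python
import random


def single_point_crossover(parent_a, parent_b, point):
    """Exchange the gene tails of two chromosomes after position `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def selection_probabilities(indices, alpha):
    """Assumed power-law weighting: probability proportional to index ** alpha."""
    top = max(indices)
    if top <= 0:
        return [1.0 / len(indices)] * len(indices)
    # Dividing by the largest index keeps large exponents from overflowing.
    raw = [(value / top) ** alpha for value in indices]
    total = sum(raw)
    return [value / total for value in raw]


def full_crossover(parent_a, parent_b, complexity, alpha, rng=None):
    """Cross at every position, then let the candidate pairs compete."""
    rng = rng or random.Random()
    length = len(parent_a)
    # Position `length` leaves the parents unchanged, giving L candidates.
    candidates = [
        single_point_crossover(parent_a, parent_b, j) for j in range(1, length + 1)
    ]
    averages = [(complexity(a) + complexity(b)) / 2.0 for a, b in candidates]
    probabilities = selection_probabilities(averages, alpha)
    return rng.choices(candidates, weights=probabilities, k=1)[0]
```

Only one pair survives per parent pair, so the population size is unchanged even though every crossover position is explored.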

Analysis of the offspring's scenario complexity index
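To show the influence of the full crossover on the offspring, the expectation of the scenario complexity index is analysed by comparison with the traditional single-point crossover operator. According to the pseudocode of the full crossover operator, we have
$$E(\bar{C}(\tilde{\Phi}_i)) = \sum_{j=1}^{L} w_{\Phi_{i,j}} \bar{C}(\Phi_{i,j}) \quad (5)$$
where $\bar{C}(\cdot)$ is the average scenario complexity index, $\Phi_{i,j}$ is the candidate offspring pair obtained by crossing the parent pair $\Phi_i$ at the $j$-th position, $w_{\Phi_{i,j}}$ is its selection probability, and $\tilde{\Phi}_i$ is the formal offspring pair. For the single-point crossover operator, whose offspring pair is denoted by $\hat{\Phi}_i$, the expectation of the scenario complexity index is analysed under two situations, according to whether the parent is selected as the offspring directly [24,25]:
1. $\Phi_{i,j}$ is selected as the offspring pair, and then $P(\hat{\Phi}_i = \Phi_{i,j}) = w_c / L$, where $w_c$ is the crossover probability;
2. $\Phi_i$ is selected as the offspring pair directly, and then $P(\hat{\Phi}_i = \Phi_i) = 1 - w_c$.
According to the above two conditions, the following theorem shows the relationship between the expectations of the offspring's scenario complexity index for the traditional and modified crossover operators.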
Theorem 1: Compared with the single-point crossover operator with the crossover probability $w_c$, the expectation of the offspring's scenario complexity index generated by the full crossover operator satisfies $E(\bar{C}(\tilde{\Phi}_i)) \ge E(\bar{C}(\hat{\Phi}_i))$ if either of the following conditions holds: (C1) $w_c = 1$; (C2) $w_c < 1$ and the average scenario complexity index of all candidate offspring pairs is not smaller than that of their parent, that is, $\frac{1}{L} \sum_{j=1}^{L} \bar{C}(\Phi_{i,j}) \ge \bar{C}(\Phi_i)$.
Theorem 1 shows the improvement of the offspring's scenario complexity index by the full crossover operator, which helps to find the critical condition. However, it is known from C2 that when $w_c$ is small enough, the full crossover operator may generate offspring with a smaller scenario complexity index. This situation hardly appears in practice because $w_c$ cannot be too small; otherwise it easily causes premature convergence [24].
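For the case in which the crossover probability equals 1, the comparison behind Theorem 1 can be sketched with Chebyshev's sum inequality. The shorthand is assumed for illustration only: $c_j$ stands for the average scenario complexity index of the $j$-th candidate offspring pair and $p_j$ for its selection probability under the full crossover operator, with the $p_j$ summing to one and non-decreasing in $c_j$. This is a compact restatement of the idea, not the proof given in the appendix.

```latex
% Assumed shorthand: c_j = average index of candidate pair j,
% p_j = its selection probability, sum_j p_j = 1, p_j non-decreasing in c_j.
% Since (p_j) and (c_j) are similarly ordered, Chebyshev's sum inequality gives
\[
\sum_{j=1}^{L} p_j c_j
\;\ge\;
\frac{1}{L}\Bigl(\sum_{j=1}^{L} p_j\Bigr)\Bigl(\sum_{j=1}^{L} c_j\Bigr)
\;=\;
\frac{1}{L}\sum_{j=1}^{L} c_j .
\]
% Left side: expected index of the offspring under full crossover.
% Right side: expected index under single-point crossover with crossover probability 1,
% where every position is chosen with probability 1/L.
```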

Design of multiple mutation operator
The traditional mutation operator generates only one population, which has a low possibility of including good individuals. To increase this possibility, the multiple mutation operator is designed as shown in Figure 4. It performs the traditional canonical mutation [30] on the population $N$ times. To keep the population size, the brother competition strategy is adopted to select the formal offspring according to their overall scenario complexity index. In the pseudocode of the multiple mutation operator (see Figure 4), each candidate offspring population is generated by $Mut(\Pi, w_m)$, and the last step (Line 4) randomly selects one candidate mutation offspring population as the formal one, $\tilde{\Pi}$, with a selection probability that depends on its scenario complexity index. Here, $Mut(\Pi, w_m)$ denotes the canonical mutation on the population $\Pi$ with the mutation probability $w_m$.
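The multiple mutation step can be sketched in the same spirit. Binary chromosomes and bit-flip mutation are assumed, and the competition rule is again an assumed power-law weighting of the average scenario complexity index of each candidate population, with a hypothetical selection factor `alpha`; the paper's exact probability may differ. `complexity` is a hypothetical callable returning a non-negative index for one chromosome.

```python
import random


def canonical_mutation(population, rate, rng):
    """Flip every binary gene independently with probability `rate`."""
    return [
        [1 - gene if rng.random() < rate else gene for gene in chromosome]
        for chromosome in population
    ]


def multiple_mutation(population, rate, repeats, complexity, alpha, rng=None):
    """Mutate the population `repeats` times and keep one candidate population."""
    rng = rng or random.Random()
    candidates = [canonical_mutation(population, rate, rng) for _ in range(repeats)]
    # Average scenario complexity index of each candidate population.
    averages = [
        sum(complexity(chromosome) for chromosome in candidate) / len(candidate)
        for candidate in candidates
    ]
    top = max(averages)
    if top <= 0:
        weights = [1.0] * repeats  # fall back to a blind, uniform choice
    else:
        # Assumed rule: weight proportional to (average index) ** alpha.
        weights = [(value / top) ** alpha for value in averages]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With `alpha = 0` all weights are equal, which corresponds to choosing one of the candidate populations blindly.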

Analysis of offspring's scenario complexity index
It is hard to compare the expectations of the scenario complexity index directly. To analyse the improvement of the scenario complexity index, a new operator named blind multiple mutation is introduced to act as a bridge between the traditional operator and the multiple mutation operator. Its definition and characteristic are introduced as follows.
Blind multiple mutation operator (BMMO): It performs the canonical mutation operation on a population $N$ times, and then selects one of the results as the formal one with an equal probability of $1/N$.
Lemma 1: From a statistical point of view, BMMO is equivalent to the canonical mutation operator.
Proof: The proof is in Appendix A.3.
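The intuition behind Lemma 1 can also be seen from the law of total probability. The notation below is assumed for this sketch: $\hat{\Pi}_1, \ldots, \hat{\Pi}_N$ are the $N$ candidate populations, each distributed like a single canonical mutation result $\hat{\Pi}$; $K$ is the index chosen uniformly and independently of the candidates; and $\tilde{\Pi} = \hat{\Pi}_K$ is the population kept by BMMO. This is an alternative short argument, not the appendix proof.

```latex
% Assumed notation: K is uniform on {1, ..., N} and independent of the candidates.
\[
P(\tilde{\Pi} = \Pi^{*})
= \sum_{k=1}^{N} P(K = k)\, P(\hat{\Pi}_k = \Pi^{*})
= \sum_{k=1}^{N} \frac{1}{N}\, P(\hat{\Pi} = \Pi^{*})
= P(\hat{\Pi} = \Pi^{*}).
\]
```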
With the help of BMMO, the following theorem is established to show the improvement of the scenario complexity index by the multiple mutation operator.
Theorem 2: Compared with the canonical mutation, the expectation of the scenario complexity index of the offspring generated by the multiple mutation operator satisfies $E(\bar{C}(\tilde{\Pi})) \ge E(\bar{C}(\hat{\Pi}))$, where $\tilde{\Pi}$ and $\hat{\Pi}$ denote the offspring populations generated by the multiple mutation operator and the canonical mutation operator, respectively.
Proof: The proof is in Appendix A.4.
Theorem 2 shows that the multiple mutation operator ensures that the expectation of the offspring's scenario complexity index is not smaller than that of the canonical one. It is concluded from Theorems 1 and 2 that the improved GA has a higher possibility of generating offspring with a higher scenario complexity index after each iteration. Since the driving task is more difficult to execute in complex scenarios, the evolution test by the improved GA finds the critical condition more easily.

Influence analysis of design parameters
From the pseudocode of the full crossover operator (Line 6) and Equation (5), Equation (7) gives $E(\bar{C}(\tilde{\Phi}_i))$ as a function of the design parameter $\alpha$. Similarly, from the pseudocode of the multiple mutation operator (Line 4) and Equation (21), Equation (8) gives $E(\bar{C}(\tilde{\Pi}))$. It is known from Equations (7) and (8) that the design parameter, $\alpha$, influences $E(\bar{C}(\tilde{\Phi}_i))$ and $E(\bar{C}(\tilde{\Pi}))$ directly. To study their relationship for the selection of $\alpha$, a function, $g(\alpha)$, is introduced in Equation (9). It is found by comparing Equation (9) with Equations (7) and (8) that the monotonic relation between $\alpha$ and $g(\alpha)$ is the same as that between $\alpha$ and $E(\bar{C}(\tilde{\Phi}_i))$ or $E(\bar{C}(\tilde{\Pi}))$. The following lemma on $g(\alpha)$, Lemma 2, is introduced first by using Equation (9).
Proof: The proof is in Appendix A.5.
By Lemma 2, the following theorem is established to show the influence of $\alpha$ on the offspring's scenario complexity index.
Theorem 3: The parameter, $\alpha$, has a positive correlation with $E(\bar{C}(\tilde{\Phi}_i))$ and $E(\bar{C}(\tilde{\Pi}))$, and both expectations have a finite limit as $\alpha \to +\infty$.
Proof: The proof is in Appendix A.6.
Theorem 3 shows that the offspring's scenario complexity index increases with $\alpha$ but has an upper bound. However, in practice, a larger $\alpha$ is not always beneficial to test efficiency, because the population converges more easily to the area with the highest local scenario complexity index if $\alpha$ is too large.
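The saturation described by Theorem 3 can be reproduced numerically under an assumed selection rule. The sketch below assumes that a candidate with positive index `u` is selected with a probability proportional to `u ** alpha`; this power-law form is an illustration only and is not confirmed to be the paper's formula. The function name `expected_index` is hypothetical.

```python
def expected_index(indices, alpha):
    """Expected index when candidate i is chosen with probability
    proportional to indices[i] ** alpha (assumed rule, positive indices)."""
    top = max(indices)
    # Scaling by the largest index avoids overflow for large alpha.
    raw = [(u / top) ** alpha for u in indices]
    total = sum(raw)
    return sum(w * u for w, u in zip(raw, indices)) / total
```

For example, evaluating `expected_index([0.2, 0.5, 0.8], alpha)` for growing values of `alpha` starts at the plain average when `alpha = 0` and approaches the largest candidate index without exceeding it, which mirrors the saturation of the test effect for large values of the selection factor.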

Convergence of improved GA
The following definition and lemma about the convergence of GA are introduced first. Definition [30]:

Lemma 3 [30]: For the canonical GA with the assumption that it is a Markov chain, the global convergence is achieved by the elitist selection strategy, if the transition matrix for selection is column-allowable, that of the crossover operator is stochastic and that of the mutation operator is positive.
Based on Lemma 3, the following theorem ensures the global convergence of the improved GA.

Theorem 4: The improved GA converges to the global optimum if the process is a Markov chain.
Proof: The proof is in Appendix A.7.
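The three matrix properties named in Lemma 3 can be checked mechanically on small examples. The sketch below uses the usual definitions for nonnegative matrices (stochastic: nonnegative entries with unit row sums; positive: all entries greater than zero; column-allowable: at least one positive entry in every column). The helper names are hypothetical, and the 2 x 2 matrices in the usage note are toy values, not the transition matrices of the improved GA.

```python
def is_stochastic(matrix, tol=1e-9):
    """Nonnegative entries and every row sums to one."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol for row in matrix
    )


def is_positive(matrix):
    """Every entry is strictly greater than zero."""
    return all(p > 0 for row in matrix for p in row)


def is_column_allowable(matrix):
    """Every column contains at least one positive entry."""
    columns = range(len(matrix[0]))
    return all(any(row[j] > 0 for row in matrix) for j in columns)


def matmul(left, right):
    """Plain matrix product for small lists of lists."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(len(right)))
         for j in range(len(right[0]))]
        for i in range(len(left))
    ]
```

With toy matrices such as `crossover = [[1.0, 0.0], [0.5, 0.5]]`, `mutation = [[0.9, 0.1], [0.1, 0.9]]` and `selection = [[1.0, 0.0], [0.5, 0.5]]`, the product `matmul(matmul(crossover, mutation), selection)` has only positive entries, which is the property the convergence argument relies on.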

APPLICATION AND ANALYSIS
The proposed evolution test by the improved GA can be used for different intelligent driving systems by properly defining the objective function, h(X ), and parameterising the test scenario to measure its complexity. Here, it is applied to the performance limit evaluation of an APPS in avoiding collision with the surroundings to show its effectiveness as an example.

Experiment platform and scenario
The experiment was conducted on a model-in-loop test platform shown in Figure 5. The schematic of the test scenario is shown in Figure 6. In this study, we focus on the evolution test algorithm, and there is only one vehicle acting as the surrounding object. The test objective is to find the minimum distance between the tested vehicle and the surroundings. An APPS has several types of performance and functionality, but only the collision avoidance performance is considered here as an example to validate the effectiveness of the evolution test by the improved GA. If more performances or functionalities need to be tested, the objective function in Expression (1) should be re-designed to describe them. According to the tree structure model introduced in Section 2.2, the considered scenario elements are listed in Table 1. All scenario elements are grouped into three types according to their physical meaning, that is, 'orientation', 'distance' and 'velocity'. Each scenario element is denoted by $I_i$, and its discrete values are listed in the column labelled 'layer 3'. By AHP, the importance degree of each discrete value is obtained before conducting the test, as shown in the column labelled with the importance degree, $\lambda_k$. During the evolution test, whenever a scenario is generated, its scenario complexity index can be calculated by Equation (4).

Experiment results and analysis
To validate the hypothesis that a test scenario with a higher complexity index is more likely to detect the performance limit, the test results and the scenario complexity indices of the 413 generated scenarios are analysed statistically, as shown in Figure 7. From a statistical point of view, the effectiveness of the test scenario has a positive correlation with its scenario complexity index.
With the settings $n = 5$, $L = 100$, $A = 1.9$, $w_m = 0.009$, $G_{te} = 25$, $\alpha = 600$, $N = 50$ and $w_c = 1$, the overall scenario complexity index of the offspring is shown in Figure 8(a). The proposed full crossover and multiple mutation operators increase the offspring's scenario complexity index. Therefore, the test efficiency is improved (see Figure 8(b)). Both the convergence and the ability to find the collision condition are better than those of traditional GA. The improved GA detects the collision situation at the 11th generation. In contrast, the worst condition found by traditional GA is a minimum distance of about 0.1 m at the 25th generation.

Influence analysis of parameters
The two important parameters of the improved GA are $\alpha$ and $N$ (see the pseudocode in Sections 3 and 4). Intuitively, $\alpha$ has a positive correlation with the tendency to select the candidate offspring with a higher scenario complexity index, and a larger $N$ generates more candidate offspring populations.
To show their influence, the evolution test has been conducted with different parameter values. The experiment was conducted six times for each value to reduce the effect of randomness. The average test results are shown in Tables 2 and 3.
It is found from Table 2 that the test effect becomes better as $\alpha$ increases, until $\alpha = 400$. After that, the result tends to be stable. This is caused by the saturation of the influence of scenario complexity on evolution. According to the results in Table 3, when $N$ is small, the test effect is not good enough because the probability that good offspring appear in the candidate offspring populations is low. As $N$ increases, this probability becomes higher, and so the test effect also becomes better. When $N$ reaches 30, the test effect becomes stable. The reason is that when the number of samples is large enough, the distribution of the scenario complexity index of the candidate offspring populations tends to be stable.

CONCLUSION
In this paper, an evolution test by the improved GA is presented for the performance limit evaluation of ADS to increase the test efficiency. Both theoretical analysis and application results show that:
1. The proposed scenario complexity index can statistically measure the test effectiveness without conducting the test.
2. The scenario complexity index is beneficial for guiding the generation of offspring, and the proposed full crossover and multiple mutation operators can generate more complex scenarios.
3. The ability to find the performance limit is improved by replacing the traditional operators with the full crossover and multiple mutation operators.

A.1
The average complexity index of a set of test scenarios, $\Pi_t = \{X_1, \ldots, X_t\}$, is
$$\bar{C}(\Pi_t) = \frac{1}{t} \sum_{i=1}^{t} C(X_i)$$

A.2
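Proof of Theorem 1 According to the algorithm of single-point crossover, the expectation of the scenario complexity index of its offspring is
$$E(\bar{C}(\hat{\Phi}_i)) = \frac{w_c}{L} \sum_{j=1}^{L} \bar{C}(\Phi_{i,j}) + (1 - w_c)\, \bar{C}(\Phi_i) \quad (10)$$
Subtracting Equation (10) from Equation (5) yields
$$E(\bar{C}(\tilde{\Phi}_i)) - E(\bar{C}(\hat{\Phi}_i)) = \sum_{j=1}^{L} \left( w_{\Phi_{i,j}} - \frac{w_c}{L} \right) \bar{C}(\Phi_{i,j}) - (1 - w_c)\, \bar{C}(\Phi_i) \quad (11)$$
When $w_c = 1$, Equation (11) is rewritten as
$$E(\bar{C}(\tilde{\Phi}_i)) - E(\bar{C}(\hat{\Phi}_i)) = \sum_{j=1}^{L} \left( w_{\Phi_{i,j}} - \frac{1}{L} \right) \bar{C}(\Phi_{i,j}) \quad (12)$$
To facilitate the analysis, the right side of Equation (12) is divided into two parts according to whether the full crossover operator increases the probability of generating $\Phi_{i,j}$:
$$E(\bar{C}(\tilde{\Phi}_i)) - E(\bar{C}(\hat{\Phi}_i)) = \sum_{j \in \Omega^{+}} \left( w_{\Phi_{i,j}} - \frac{1}{L} \right) \bar{C}(\Phi_{i,j}) + \sum_{j \in \Omega^{-}} \left( w_{\Phi_{i,j}} - \frac{1}{L} \right) \bar{C}(\Phi_{i,j}) \quad (13)$$
where $\Omega^{+} = \{ j \mid w_{\Phi_{i,j}} \ge 1/L \}$ and $\Omega^{-} = \{ j \mid w_{\Phi_{i,j}} < 1/L \}$. From the pseudocode (Line 6) of the full crossover, $\sum_{j=1}^{L} w_{\Phi_{i,j}} = 1$, and therefore
$$\sum_{j \in \Omega^{+}} \left( w_{\Phi_{i,j}} - \frac{1}{L} \right) = \sum_{j \in \Omega^{-}} \left( \frac{1}{L} - w_{\Phi_{i,j}} \right) \quad (14)$$
Then, substituting Equation (14) into Equation (13) yields Equation (15). According to the equation of $w_{\Phi_{i,j}}$ (Line 6 in the pseudocode of the full crossover), it has a positive correlation with $\bar{C}(\Phi_{i,j})$. It is concluded from the definition of $\Omega^{+}$ and $\Omega^{-}$ that every candidate pair in $\Omega^{+}$ has an average scenario complexity index not smaller than that of any pair in $\Omega^{-}$ (Equation (16)). Then, Theorem 1 (C1) is proved by substituting Equation (16) into Equation (15).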

A.3
Proof of Lemma 1 Let $\mathbb{Q} = \{0, 1\}^{L}$ denote the individual state space, $\mathbb{Q}^{2n}$ denote the population state space, $\Pi^{*}$ denote a population in $\mathbb{Q}^{2n}$, $\tilde{\Pi}$ denote the formal offspring population generated by BMMO and $\hat{\Pi}$ denote the offspring population generated by the canonical mutation operator.
The $N$ mutation operations of BMMO are independent and identically distributed, so $P(\tilde{\Pi} = \Pi^{*})$ can be written as in Equation (19). By the binomial expansion, Equation (20) holds. Substituting Equation (20) into Equation (19) gives $P(\tilde{\Pi} = \Pi^{*}) = P(\hat{\Pi} = \Pi^{*})$, and Lemma 1 is proved.

A.4
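Proof of Theorem 2 According to the pseudocode of the multiple mutation operator (Line 4), the expectation $E(\bar{C}(\tilde{\Pi}))$ is given by Equation (21). The expectation for the canonical mutation operator, $E(\bar{C}(\hat{\Pi}))$, is obtained from Lemma 1 as Equation (22). Combining Equations (21) and (22) yields Equation (23). From the pseudocode (Line 4) of the multiple mutation operator, it is known that the selection probabilities of the candidate offspring populations sum to one. Similar to the process from Equations (14) to (16), we have $E(\bar{C}(\tilde{\Pi})) - E(\bar{C}(\hat{\Pi})) \ge 0$, and the conclusion is proved.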

A.5
Proof of Lemma 2 The derivative of $g(\alpha)$ in Equation (9) with respect to $\alpha$ is considered first. The denominator of $g'(\alpha)$ is greater than 0, so its sign is determined by the numerator, which is rewritten as Equation (25) by mathematical induction. The induction process to obtain Equation (25) is as follows:
Step 1. Equation (25) holds when $s = 2$.
Step 2. Assume that Equation (25) holds for $s = t$. Since $g_{dt}(\alpha \mid s = t + 1) = [g_{dt}(\alpha \mid s = t + 1) - g_{dt}(\alpha \mid s = t)] + g_{dt}(\alpha \mid s = t)$, the case $s = t + 1$ is obtained by substituting $s = t$ and $s = t + 1$ into $g_{dt}(\alpha)$, respectively.
The limit of $g(\alpha)$ then follows from Equation (9).

A.6
Proof of Theorem 3 According to Lemma 2, Theorem 3 can be proved by replacing $u_i$ in Equation (9) with $\bar{C}(\Phi_{i,j})$ in Equation (7) and with $\bar{C}(\Pi_i)$, the average scenario complexity index of the $i$-th candidate mutation offspring population, in Equation (8), respectively.

A.7
Proof of Theorem 4 The elitist selection strategy is used by the improved GA. It is known from Lemma 3 that the global convergence can be ensured by proving that the transition matrices of the natural selection, the full crossover operator and the multiple mutation operator are column-allowable, stochastic and positive, respectively.
Let $\mathbf{S} = [s_{ij}]_{|\mathbb{Q}^{2n}| \times |\mathbb{Q}^{2n}|}$ denote the transition matrix of the natural selection, where $s_{ij}$ is the probability that the population is transferred from state $i$ to state $j$. Then, $\mathbf{S}$ is column-allowable because the selection probability of every individual is greater than zero, so a population can be transferred to itself with positive probability, that is, $s_{ii} > 0$, and each column of $\mathbf{S}$ contains at least one positive element. Let $\mathbf{C} = [c_{ij}]_{|\mathbb{Q}^{2n}| \times |\mathbb{Q}^{2n}|}$ denote the transition matrix of the full crossover operator, where $c_{ij}$ denotes the probability that the population is transferred from state $i$ to state $j$. The value of $c_{ij}$ only depends on the gene sequence of the parental population and the scenario complexity index of the candidate offspring. Therefore, all elements in $\mathbf{C}$ are constant. Furthermore, any state $i$ must be transferred to a state in $\mathbb{Q}^{2n}$. So $\sum_{j=1}^{|\mathbb{Q}^{2n}|} c_{ij} = 1$ and $\mathbf{C}$ is stochastic.
Let $\mathbf{M} = [m_{ij}]_{|\mathbb{Q}^{2n}| \times |\mathbb{Q}^{2n}|}$ denote the transition matrix of the multiple mutation operator, where $m_{ij}$ denotes the probability that the population is transferred from state $i$ to state $j$. Similarly, all elements in $\mathbf{M}$ are constant. For the canonical mutation operator, the probability that it transfers a population from state $i$ to state $j$ is positive [30]. Then, it is known from the pseudocode (Line 4) of the multiple mutation operator that once state $j$ appears in the candidate offspring, the probability that it is selected as the formal offspring is greater than 0. Therefore, $\forall i, j \in \{1, 2, \ldots, |\mathbb{Q}^{2n}|\}$, $m_{ij} > 0$, and so the transition matrix is positive.
In summary, Theorem 4 is proved by Lemma 3.