An Improved Genetic Algorithm for Constrained Optimization Problems

Many optimization problems in engineering take the mathematical form of a constrained optimization problem. In this paper, an improved genetic algorithm based on two-direction crossover and grouped mutation is proposed to solve constrained optimization problems. In addition to making full use of the direction information of the parent individuals, the two-direction crossover adds an extra search direction and finally searches along the better of the two directions, which improves the search efficiency. The grouped mutation divides the population into two groups and applies mutation operators with different properties to each group, giving full play to the characteristics of these operators and further improving the search efficiency. In experiments on the IEEE CEC 2017 competition on constrained real-parameter optimization and on ten real-world constrained optimization problems, the proposed algorithm outperforms other state-of-the-art algorithms. Finally, the proposed algorithm is used to optimize a single-stage cylindrical gear reducer.


I. INTRODUCTION
Many engineering practices involve the solution of a constrained optimization problem [1], [2], [3], [4]. A constrained optimization problem (COP) aims at finding the optimal solution of a numerical function (called an objective function) subject to equality or inequality constraints. COPs can be classified into many categories, such as linear programming [5], convex quadratic programming [6], convex optimization [7], and non-convex optimization [8]. Both linear programming and convex quadratic programming are typical types of convex optimization. A convex optimization problem is relatively easy to solve because there is no local optimum, whereas a non-convex optimization problem may have many local optima, and one cannot tell whether a locally optimal solution is the global optimum unless a better one is found. Unfortunately, the COPs in many engineering practices are non-convex optimization problems.
The associate editor coordinating the review of this manuscript and approving it for publication was Daniel Augusto Ribeiro Chaves .
Researchers have developed many types of evolutionary algorithms to solve COPs [9], [10], [11], [12]. The genetic algorithm (GA) is a powerful type of evolutionary algorithm that can solve non-convex optimization problems, because GA keeps a population of solutions and not only utilizes information in the current population but also explores new areas of the search region. Therefore, individuals in GA can escape from local optima. Researchers have made many improvements to GA, and the main efforts can be roughly classified into two categories: developing new crossover operators and developing new mutation operators.
The crossover operator generates offspring by recombining information from individuals in the current population. In the early period, crossover operators for real-coded genetic algorithms were designed by directly imitating crossover operators of binary GA. Simulated binary crossover simulates the principle of the single-point crossover operator on binary strings [13]. Later, researchers got rid of the shackles of binary coding and began to develop crossover operators directly on real numbers. Arithmetic crossover, local crossover, flat crossover, and blend crossover generate two offspring on the line segment between two parents [14], [15], [16], [17], [18]. Recently, researchers have focused on direction-based crossover operators, which search for new solutions using the direction from the worse parent to the better parent [19], [20], [21], [22].
The mutation operator randomly changes individuals to avoid getting stuck in local optima. To achieve this purpose, mutation operators often add random perturbations to individuals [23], [24], [25], [26], [27], [28]. Therefore, mutation also searches for the optimal solution according to certain rules, and different mutation operators have different characteristics. However, different evolution stages or different types of problems require different search strategies according to the evolution process or the nature of the problem [29]. Combinational mutation combines multiple mutation methods so that the mutation operator has the advantages of several operators [21], [30].
In this paper, an improved genetic algorithm, named GA-TDX, is proposed, which consists of two-direction crossover and grouped mutation. The two-direction crossover uses the direction from the worse parent to the better parent. However, because COPs are nonlinear, this direction is usually not the optimal search direction. Therefore, the two-direction crossover finds a direction at an angle of 45 degrees to this direction as a new search direction, searches for an individual along each of the two search directions from each of the two parents, and retains the two best individuals. This additional search direction gives the two-direction crossover a better search capability. In grouped mutation, individuals in different parts of the population are mutated by different single mutation operators, since individuals in different parts of the population suit different search tasks; to be precise, the best part of the population is good at finding the global optimum, whereas the worst part of the population is good at escaping from local optima.
The remainder of this paper is organized as follows. Section II briefly introduces the mathematical model and constraint handling technique of COP. Section III introduces details of the proposed GA-TDX. Section IV shows the experimental results of the proposed algorithm compared with other algorithms on the IEEE CEC 2017 benchmark functions and ten COPs. In Section V, GA-TDX is applied to optimize the parameters of a single-stage cylindrical gear reducer. Finally, Section VI concludes this paper.

II. CONSTRAINED OPTIMIZATION PROBLEM

A. MATHEMATICAL MODEL OF COP
The objective function of a real-valued COP is formulated in (1):

min f(x), x = (x_1, x_2, ..., x_n) (1)

where x ∈ F ⊆ S. The objective function f is defined on the search space S ⊆ R^n, and the set F ⊆ S defines the feasible region. Usually, the search space S is defined as an n-dimensional rectangle in R^n (domains of variables defined by their lower and upper bounds), as shown in (2):

l_i ≤ x_i ≤ u_i, i = 1, 2, ..., n (2)

The feasible region F ⊆ S is defined by a set of P + Q additional constraints, as shown in (3):

g_k(x) ≤ 0, k = 1, ..., P; h_k(x) = 0, k = P + 1, ..., P + Q (3)

The infeasible region I is defined by F ∪ I = S and F ∩ I = ∅. At any point x ∈ F, the inequality constraints g_k that satisfy g_k(x) = 0 are called the active constraints at x [31].
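To make the definitions above concrete, a minimal feasibility check might look as follows. This is a sketch in Python; the function names, the tolerance, and the handling of equality constraints are our own assumptions, not notation from the paper.

```python
import numpy as np

def is_feasible(x, lower, upper, ineq_constraints, eq_constraints, tol=1e-4):
    """Check whether x lies in the feasible region F = box S plus constraints."""
    x = np.asarray(x, dtype=float)
    # x must lie inside the search-space box S (bounds as in (2))
    if np.any(x < lower) or np.any(x > upper):
        return False
    # inequality constraints g_k(x) <= 0, as in (3)
    if any(g(x) > tol for g in ineq_constraints):
        return False
    # equality constraints h_k(x) = 0, satisfied within a tolerance
    if any(abs(h(x)) > tol for h in eq_constraints):
        return False
    return True
```

A point on the boundary of an inequality constraint (g_k(x) = 0) is feasible and, per the definition above, that constraint is active there.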
A COP is more challenging than an unconstrained optimization problem. The search region S of a COP is divided into several small irregular regions by the constraints. Some of these small regions are feasible, whereas others are infeasible. Thus, a COP is essentially an optimization problem to find the optimal solution in F, whereas an unconstrained optimization problem seeks the optimal solution in S. The first challenge of COP is that the optimal solution in F is always no better than the optimal solution in S, and as a result, the optimal solution is very likely located on the border between F and I. However, it is difficult for evolutionary algorithms to precisely search the boundary area [31]. Secondly, these small feasible regions may not be adjacent, and evolutionary algorithms have to find new feasible regions that have not yet been discovered, because these undiscovered regions may contain the optimal solution.

B. CONSTRAINT HANDLING TECHNIQUE
Constraint handling techniques are of great significance in efficiently solving constrained optimization problems.
In the evolutionary algorithm community, the most commonly used approach to handle constraints is the penalty function [38]. Researchers have proposed many constraint handling techniques, such as the static penalty [32], the dynamic penalty [33], [34], the superiority of feasible solutions [35], the ranking-based method [36], and the ensemble constraint handling technique [37]. The idea of the penalty method is to transform a COP into an unconstrained optimization problem by adding a penalty term (negative for maximization problems, positive for minimization problems) to the objective function based on the amount of constraint violation of a given solution. In this paper, the static penalty function is used because of its simplicity. The mathematical form of the static penalty is shown in (4).
where m is the predefined penalty parameter that defines the tolerance of constraint violations. To ensure the feasibility of the obtained optimal solution, m should be a sufficiently large positive number.
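A minimal sketch of the static-penalty idea follows. The specific violation measure (sum of max(0, g_k(x)) plus |h_k(x)|) is a common choice and an assumption on our part, since the exact form of the paper's equation (4) is not reproduced here.

```python
def static_penalty(f, x, ineq, eq, m=1e10):
    """Penalized objective F(x) = f(x) + m * G(x) for minimization,
    where G(x) totals the constraint violation of x.
    The violation measure is a common choice, not necessarily (4)."""
    violation = sum(max(0.0, g(x)) for g in ineq)   # g_k(x) <= 0 violated?
    violation += sum(abs(h(x)) for h in eq)          # h_k(x) = 0 violated?
    return f(x) + m * violation
```

With a large m, any infeasible solution is penalized so heavily that feasible solutions always dominate it during selection.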

III. THE PROPOSED ALGORITHMS
A. THE ALGORITHM STRUCTURE

Algorithm 1 shows the structure of the proposed algorithm, consisting of sorting grouping selection, two-direction crossover (TDX), and grouped mutation (GM). Sorting grouping selection cuts the population into two parts based on fitness value and pairs individuals between these two parts to ensure that the algorithm searches in the direction of decreasing objective value [21]. Then, TDX searches not only along the direction from the worse parent to the better parent but also along a random direction at 45 degrees to the former, to avoid getting stuck in a local optimum. Finally, GM cuts the population into two groups based on fitness value: the best group is used for local search, and the worst group is used for global search.

B. TWO-DIRECTION CROSSOVER
Given parents x1 and x2, if the fitness value of x1 is smaller than that of x2, the first search direction is d1 = x1 − x2. Then, using the method in (6) (details of this equation will be introduced later), a vector perpendicular to d1 and with the same length as d1 is generated, and adding it to d1 yields the second search direction d2, so that there is a 45° angle between d1 and d2. From each parent, one individual is searched along each of the two directions, and the better of the two individuals searched from each parent is retained, which makes TDX find offspring along the better of the two directions at each parent. The whole algorithm of TDX is shown in Algorithm 2. Fig. 1 shows an example of TDX in a 2-dimensional plane, with x1 and x2 given as before. TDX finds new points along the better of the two directions in each iteration. The search region of TDX consists of a line (along d1) and the surfaces of two cones (swept by vectors such as x2x4 and x1x6 in Fig. 1) whose apex angle is 90°, which gives TDX a fast search speed while helping it avoid getting trapped in local optima. Over many iterations, TDX effectively searches along the best of infinitely many directions (because the base circle of a cone contains infinitely many points, the number of vectors from the apex of a cone to a random point on the base circle is infinite).
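The procedure can be sketched as follows. This is not the paper's implementation: the perpendicular direction is obtained here by a Gram-Schmidt step on a random vector rather than by equation (6), and the step length is an illustrative assumption.

```python
import numpy as np

def tdx(x1, x2, f, rng=None):
    """Two-direction crossover sketch; f(x1) <= f(x2) is assumed.
    Searches from each parent along d1 = x1 - x2 and along a second
    direction at 45 degrees to d1; keeps the better point per parent."""
    rng = rng or np.random.default_rng()
    d1 = x1 - x2
    # make a random vector perpendicular to d1 (Gram-Schmidt step),
    # then scale it to the length of d1
    r = rng.standard_normal(d1.shape)
    perp = r - (r @ d1) / (d1 @ d1) * d1
    perp *= np.linalg.norm(d1) / np.linalg.norm(perp)
    d2 = d1 + perp            # |perp| = |d1| and perp ⟂ d1 -> 45-degree angle
    step = rng.random()       # illustrative step length
    offspring = []
    for parent in (x1, x2):
        c1, c2 = parent + step * d1, parent + step * d2
        offspring.append(c1 if f(c1) <= f(c2) else c2)
    return offspring
```

Because perp is perpendicular to d1 and has the same length, the angle between d1 and d1 + perp is exactly 45°, matching the geometry described above.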

1) FINDING A VECTOR PERPENDICULAR TO A GIVEN VECTOR
In linear algebra, if two vectors x and y are perpendicular, they satisfy the perpendicularity equation x · y = Σ_{i=1}^{n} x_i y_i = 0, which can be used to find a vector y perpendicular to a given vector x. Suppose x, y ∈ R^n; an easy method to find y is to randomly generate y_1 to y_{n−1} and then use the perpendicularity equation to calculate y_n = −(Σ_{i=1}^{n−1} x_i y_i) / x_n. However, if the denominator x_n is zero, the algorithm cannot continue on a computer. It is therefore necessary to develop a robust method to calculate a vector perpendicular to a given vector.
Algorithm 2 Algorithm of TDX
Require: Parents x1 and x2, where the fitness value of x1 is smaller than that of x2.

It is obvious that if x_n = 0, then y_n can take any value, and the objective is transformed into calculating a vector (y_1, y_2, ..., y_{n−1}) perpendicular to the given vector (x_1, x_2, ..., x_{n−1}). Thus, we could use the above method to randomly generate y_1 to y_{n−2} and calculate y_{n−1}. However, if x_{n−1} to x_{n−k} (n > k > 1) are zeros too, we have to transform the objective into calculating a vector (y_1, y_2, ..., y_{n−(k+1)}) perpendicular to the given vector (x_1, x_2, ..., x_{n−(k+1)}). This recursive procedure has uncontrollable recursion depth and computational time.
Although any x_n = 0 will stop the algorithm, the probability of x_n = 0 is very small. We therefore develop a fast algorithm to find a perpendicular vector, which has a constant computational time. Recall the perpendicularity equation: if x_n = 0, the algorithm cannot continue. Our purpose is to remove the risk of division by zero so that the algorithm can always continue. Therefore, we simply replace the denominator x_n with a non-zero uniformly distributed number in [−mean, mean], where mean is the mean value of x_n in the population. The mathematical form of this method is shown in (6). This method keeps x and y (approximately) perpendicular and avoids recursive calculations.
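A minimal sketch of this idea follows. It assumes the replacement denominator is drawn only when x_n is (near) zero and that mean is non-zero; the exact sampling details of equation (6) may differ.

```python
import numpy as np

def perp_vector(x, mean, rng=None):
    """Find y with x . y ~= 0 in constant time. If the last component
    of x is (near) zero, its role as denominator is taken by a non-zero
    uniform random number in [-mean, mean]; mean is assumed non-zero."""
    rng = rng or np.random.default_rng()
    y = rng.standard_normal(len(x))
    denom = x[-1]
    if abs(denom) < 1e-12:          # avoid division by (near) zero
        denom = 0.0
        while denom == 0.0:
            denom = rng.uniform(-mean, mean)
    y[-1] = -float(np.dot(x[:-1], y[:-1])) / denom
    return y
```

When x_n is non-zero, the returned y is exactly perpendicular to x; when the denominator had to be replaced, y is only approximately perpendicular, which is the price paid for avoiding the recursive procedure.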

C. GROUPED MUTATION
The sorted population is cut into two groups, and two single mutation operators are absorbed into GM. The first group consists of the best individuals, which are mutated by a local search mutation operator. The second group consists of the remaining individuals, which are mutated by a global mutation operator. Fig. 2 shows the process of GM, and the algorithm of GM is shown in Algorithm 3.

Algorithm 3 Algorithm of GM
GM cuts the sorted population x_s (in which the first individual is the best one) into two groups. The first mutation operator is a normal mutation [21], whose variance is calculated from two individuals as shown in (7). The calculated variance ensures that this normal mutation operator can adaptively search a small region.
In (7), o_ij and x_ij are the j-th variable of the i-th individual in the offspring population and the parent population, respectively, β is the proportion of the best group in the population, and x_βNP is the last individual in this group. The second mutation operator is the non-uniform mutation shown in (8), where rand is a uniformly distributed random number generated within the interval [0, 1], t is the current generation, T is the maximum number of generations, and γ is a shape parameter that controls the speed at which the step length decreases. The search range of this operator decreases adaptively as the number of iterations increases, so it has a larger search range in the early stage and better convergence in the later stage.
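The two operators can be sketched as follows. Since the paper's equations (7) and (8) are not reproduced here, the forms below are our assumptions: the normal mutation's spread is taken from the distance to the last individual of the best group, and the non-uniform mutation uses the classic form whose step shrinks as t approaches T.

```python
import numpy as np

def grouped_mutation(pop, t, T, lower, upper, beta=0.2, gamma=6, rng=None):
    """Grouped mutation sketch. pop is sorted best-first. The best
    beta-fraction gets a local normal mutation; the rest get a classic
    non-uniform mutation. Both operator forms are assumptions."""
    rng = rng or np.random.default_rng()
    NP, n = pop.shape
    k = int(beta * NP)                       # size of the best group
    off = pop.copy()
    # group 1: local search around the best individuals; the spread is
    # the distance to the last individual of the group (assumed rule)
    sigma = np.abs(pop[:k] - pop[k - 1]) + 1e-12
    off[:k] = pop[:k] + rng.normal(0.0, sigma)
    # group 2: non-uniform mutation; the step range shrinks with t
    for i in range(k, NP):
        for j in range(n):
            r = rng.random()
            shrink = 1.0 - r ** ((1.0 - t / T) ** gamma)
            if rng.random() < 0.5:
                off[i, j] = pop[i, j] + (upper - pop[i, j]) * shrink
            else:
                off[i, j] = pop[i, j] - (pop[i, j] - lower) * shrink
    return np.clip(off, lower, upper)
```

At t = T the shrink factor becomes zero, so the non-uniform mutation leaves individuals unchanged, which is what gives the operator its good late-stage convergence.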
Individuals in the first group are the better part of the population and are considered to be closer to the global optimum. Therefore, mutating them with the normal mutation operator, which searches a local region, has a higher probability of finding the global optimum. Individuals in the second group are the worse part of the population and are considered to be farther from the global optimum. Therefore, mutating them with the non-uniform mutation operator, which searches a large region, has a higher probability of finding the global optimum.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

A. SYSTEM DETAILS
The proposed algorithm was implemented in MATLAB R2022a under Windows 10 on an AMD 5700X CPU @ 2.3 GHz with 32 GB of RAM.

B. BENCHMARK SUITE FUNCTIONS
The performance of GA-TDX and the other methods is compared on the IEEE CEC 2017 competition on constrained real-parameter optimization [39] and on ten real COPs from [31], [40]. The ten COPs are mathematical models extracted from real-world optimization problems. Appendix A gives details of these COPs.

C. PARAMETER SETTINGS AND ALGORITHMS COMPARISONS
The penalty parameter is set to m = 10^10 and the population size to NP = 100. The algorithm stops when the number of iterations reaches the maximum number of iterations T. For the IEEE CEC 2017 benchmark functions, T is 100D, where D is the number of decision variables; for the ten COPs, T is 1000.
Eight state-of-the-art algorithms are adopted for comparison: GA-MPC [41], GA-LX [19], GA-DBX [20], GA-HNDX [21], GA-DEX [22], BOA [42], CS [43], and OSCPSO [44]. The parameters of these eight algorithms are the same as the values in the original papers in which they were proposed. The initial population of all algorithms is the same in each run, but the initial populations of different runs are different.

D. PERFORMANCE METRICS
Each COP is solved 25 times by each algorithm. Since the theoretical optimal solution of a COP is unknown, the non-negative difference between the optimal solution obtained by a specific method and the best solution obtained by all methods is used as the error of that method. To assess the performance of all contestant algorithms, the mean (denoted as F) and standard deviation of the error are used. The mean indicates the accuracy of the search process of the algorithm, while the standard deviation indicates how consistent the algorithm is when solving a certain optimization function.
To comprehensively compare the results of different algorithms over multiple functions, a rank value is used, and the best algorithm obtains the lowest rank value. For each problem, algorithm ranks are determined in terms of the mean values and median solutions at the maximum allowed number of evaluations, respectively. The total rank value of each algorithm is calculated as in (9), where meanrank_i and medianrank_i are the ranks of the mean and median on the i-th function, respectively.
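The rank computation can be sketched as follows, assuming each row of the input holds one function and each column one algorithm (this layout is our assumption, not stated in the paper).

```python
import numpy as np

def total_rank(mean_err, median_err):
    """Total rank value as in (9): for each function (row), rank the
    algorithms (columns) by mean error and by median error (1 = best),
    then sum both ranks per algorithm over all functions."""
    def ranks(a):                        # 1-based ordinal ranks per row
        return np.argsort(np.argsort(a, axis=1), axis=1) + 1
    return (ranks(mean_err) + ranks(median_err)).sum(axis=0)
```

The algorithm with the smallest total is the overall winner across the benchmark, which is how the tables in this section are summarized.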
For a rigorous performance comparison between GA-TDX and the other algorithms, the Wilcoxon signed rank test [45] is used for pairwise comparison between the contestant algorithms at the significance level α = 0.05, and the results are reported as W+, W−, p, and l. The W+ and W− values indicate the sum of ranks where the performance of GA-TDX is superior or inferior to the other peer, respectively. The p-value is the minimum level of significance needed to detect a difference in performance between the algorithms; if the p-value is less than α, the result of the better-performing algorithm is statistically significant. From the statistical point of view, using α and the p-value, the l value indicates whether GA-TDX is significantly worse (l = ''−''), not significantly different (l = ''=''), or significantly better (l = ''+'') than the contestant method.
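For illustration, W+ and W− can be computed as below. This sketch ranks tied |differences| ordinally for simplicity (the standard test uses average ranks and is available in statistics packages) and is not the paper's implementation.

```python
import numpy as np

def wilcoxon_ranks(err_a, err_b):
    """Compute W+ and W- for the paired Wilcoxon signed rank test:
    W+ sums the ranks of |difference| on problems where algorithm A
    beats B (smaller error), W- where it loses; zero differences are
    dropped, and ties in |difference| are ranked ordinally."""
    d = np.asarray(err_b, float) - np.asarray(err_a, float)
    d = d[d != 0.0]                                 # drop zero differences
    order = np.argsort(np.argsort(np.abs(d))) + 1   # 1-based ranks of |d|
    w_plus = order[d > 0].sum()                     # A better (smaller error)
    w_minus = order[d < 0].sum()                    # A worse
    return int(w_plus), int(w_minus)
```

A large W+ relative to W− means algorithm A wins on most problems, and by large margins on the problems where the |differences| are largest.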

E. SENSITIVITY ANALYSIS
It is anticipated that some parameters like β and γ have an important impact on search efficiency. In this subsection, a set of parameter sensitivity tests are carried out to investigate the impact of these parameters on the performance of GA-TDX.

1) EFFECTS OF β
The value of β determines the size of the two groups in GM, which may affect the effect of GM. The search performance of GA-TDX was investigated with β set from 0 to 1 with an interval of 0.1 on the IEEE CEC 2017 benchmark functions at D = 10, 30, 50. In Fig. 3, different values of β behave differently on different functions, and it is difficult to conclude that a certain value of β is superior to the others; overall, the value of β has little impact on the performance of the algorithm. As shown in Tab. 1, although the gap is small, β = 0.2 has the smallest rank value at all dimensions. This value is used in the following parts of this article.
2) EFFECTS OF γ
The value of γ affects the rate at which the search area of the non-uniform mutation operator shrinks as the number of iterations increases, which may affect the effect of GM. The search performance of GA-TDX was investigated with γ set from 0 to 10 with an interval of 1 on the IEEE CEC 2017 benchmark functions at D = 10, 30, 50. In Fig. 4, different values of γ behave differently on different functions, and it is difficult to conclude that a certain value of γ is superior to the others; overall, the value of γ has little impact on the performance of the algorithm. As shown in Tab. 2, although the gap is small, γ = 6 has the smallest rank value at all dimensions. This value is used in the following parts of this article.

F. ANALYSIS OF RESULTS USING IEEE CEC 2017 BENCHMARK FUNCTIONS
The pairwise comparison between the proposed algorithm and its peer algorithms is conducted using the Wilcoxon signed rank test and is summarized in Tab. 6, 7, and 8 in terms of W+, W−, p, and l values, as stated in Section IV-D. The l-values show a significant improvement of GA-TDX over the other eight competent algorithms, shown as a ''+'' sign.

G. ANALYSIS OF RESULTS USING TEN COPs
Tab. 9 shows the simulation results from solving the ten COPs. The proposed GA-TDX achieves the best F on 6 of the 10 COPs and has the lowest rank value, suggesting that it outperforms the other algorithms on these COPs. Tab. 10 shows the pairwise comparison between the proposed algorithm and its peers, conducted using the Wilcoxon signed rank test. The l-values show a significant improvement of GA-TDX over seven competing algorithms, shown as a ''+'' sign, and an insignificant improvement of GA-TDX over GA-DEX, shown as a ''='' sign.

V. OPTIMIZATION DESIGN OF SINGLE-STAGE CYLINDRICAL GEAR REDUCER
A gear reducer is a widely used transmission device that reduces speed and increases torque. Constrained by the strength of the material, the gears and shafts of the reducer must be large enough to carry the corresponding loads. However, the larger the gears and shafts, the higher the cost. Therefore, the design of a reducer aims to find the smallest shaft and gear sizes that satisfy the constraints.
In this paper, we optimize a single-stage cylindrical gear reducer. The structural sketch of the reducer is shown in Fig. 5. The gear ratio is u = 5, the input power is P = 58 kW, the speed of the driving gear is n1 = 1000 r/min, the allowable contact stress of the gears is [σ]H = 550 MPa, and the allowable bending stress is [σ]F = 400 MPa. The purpose is to find the design parameters that minimize the volume of the reducer under the strength and stiffness conditions. Since the parts inside the shell, i.e., the gears and shafts, are the basis for determining the volume of the reducer, the objective function (shown in (10)) is established according to the principle of minimizing the sum of their volumes, where the first two terms are the volumes of the two gears and the last term is the sum of the volumes of the two shafts.
The volume of the reducer depends on six parameters: the modulus M, the number of teeth of the first gear Z1, the thickness of the gears B, the thickness of the gearbox L, and the diameters of the first and second shafts, Dz1 and Dz2. Let the design variables be x = (M, Z1, B, L, Dz1, Dz2). The objective function is min f39(x) = V, and the constraints are shown in (11), where g1 is the lower limit constraint on the number of teeth, g2 and g3 are the lower and upper limit constraints on the tooth width factor, g4 is the lower limit constraint on the modulus, g5 is the upper limit constraint on the diameter of the first gear, g6 and g7 are the lower and upper limit constraints on the diameter of the first shaft, g8 and g9 are the lower and upper limit constraints on the diameter of the second shaft, g10 is the lower limit constraint on the box thickness, g11 is the contact stress constraint of the first gear, g12 and g13 are the bending stress constraints of the two gears, and g14 and g15 are the bending stress constraints of the two shafts.
When solving this problem, we increase the value of m to 10^200 to avoid the phenomenon that the previous value results in an unreasonable negative objective value. The other parameters are the same as before. Tab. 11 shows the simulation results from optimizing this reducer. The proposed GA-TDX achieves the best F, suggesting that it outperforms the other algorithms in optimizing this reducer. Tab. 12 shows the pairwise comparison between the proposed algorithm and its peers, conducted using the Wilcoxon signed rank test. The l-values show a significant improvement of GA-TDX over the competing algorithms.

VI. CONCLUSION
In this paper, a novel class of improved genetic algorithms is introduced to solve constrained optimization problems. The two-direction crossover utilizes the orientation information of paired individuals in the current population and adds an additional search direction. Searching for an individual in both directions and keeping the better individual improves the search efficiency. The grouped mutation introduces two mutation operators with different properties and performs different mutation operations on different groups of the population. Grouped mutation takes full advantage of the characteristics of each mutation operator and improves the search efficiency.
The performance of the proposed algorithm was tested on a challenging set of 28 benchmark problems taken from the IEEE CEC 2017 competition on constrained real-parameter optimization and on ten real constrained optimization problems. The simulation results show that the proposed algorithm significantly outperforms other improved genetic algorithms as well as state-of-the-art evolutionary algorithms. Finally, the optimization results for a single-stage cylindrical gear reducer show that the proposed algorithm can significantly reduce the size and cost of the gearbox.

APPENDIX A MATHEMATICAL MODEL OF TEN COPs
It is difficult to find the precise optimal solutions of the COPs used in this paper because their objectives and constraints are very complicated. The optimal solutions shown below are therefore not exact, but reference values. Details of the ten COPs are as follows.
COP1. There are 13 variables and 9 constraints in this COP. The best solution is x* = (1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1) with function value f(x*) = −15; six constraints are active at x*.
GANG XU received the B.E. and M.E. degrees in industry engineering from Northeast Agricultural University, China, in 2015 and 2017, respectively, where he is currently pursuing the Ph.D. degree with the Engineering College. His current research interest includes genetic algorithm theory and its application.
MO WANG received the B.E. degree in industry engineering and the M.M. degree in management science and engineering from Northeast Agricultural University, China, in 2018 and 2022, respectively. Her current research interest includes genetic algorithm theory and its application.