Using a Tolerance-based Surrogate Method for Computer Resources Saving in Optimization

This paper presents a very simple surrogate optimization method, the Tolerance-based Surrogate Method. Surrogate optimization is becoming essential as optimization is used more and more frequently in the development of new technologies. Fitness functions of such systems are often costly to evaluate, so keeping the number of fitness function evaluations to a minimum is of great importance for saving computer and time resources, i.e. the overall cost of design. Unlike other, more complex surrogate optimization methods, the tolerance-based surrogate method does not require excessive computational resources, is easy to implement, and can be combined with all types of optimization algorithms. The behaviour of the tolerance-based surrogate method is demonstrated on several modified benchmark problems. Afterwards, the method is verified on a real-world, time-demanding optimization task.


Introduction
Global optimization algorithms search for and compare different solutions in order to find the optimal one. The comparison of solutions is based on fitness values, which express the quality of a solution. The fitness values are obtained by evaluating fitness functions, which describe the behaviour of the optimized system in terms of its properties, called decision variables. Optimization is therefore the process of finding the minima or maxima of the fitness values.
If the system is described by one fitness function, the optimization is called single-objective and a single decision space vector is expected as the result. In contrast, multiple conflicting fitness functions lead to multi-objective optimization, and multiple trade-off solutions are obtained at the end of the process.
The fitness functions can have various forms. If a fitness function is expressed as a closed-form formula, it can be computed almost immediately. However, the fitness functions in real-world optimization problems can take considerably more time to compute. A common assumption is that the computation of the fitness values is the most time-demanding operation in the optimization process.
An example of such a complex optimization is the synthesis of the cavity resonator structure used in [1]. An optimization algorithm generates decision variables (e.g. design dimensions), and the calculation of the fitness values involves a full-wave simulation of the designed structure with the dimensions determined by the optimization algorithm.
Since evolutionary algorithms generally require a large number of fitness function evaluations during the optimization process, and each evaluation can take a significant amount of time, it is desirable to be able to skip unnecessary fitness function evaluations.
Various methods dealing with "redundant" fitness function evaluations are described in the open literature; they are generally called surrogate optimization methods. In [2], response surfaces are used to approximate the fitness functions, which are evaluated only at a few points. The authors of [3] propose the Progressive Optimum Search Using Evolving Reliable Regions (POSER) method, which first establishes a Kriging model (which includes an error prediction) from a few initial samples and then applies it to create a reliable region. The reliable region of the Kriging surrogate is progressively improved by additional samples. The method presented in [4] reduces the number of fitness function evaluations by fitting a function approximation model over the k nearest previously evaluated points. The method tries to identify the most promising offspring solutions and exploit their potential before other offspring solutions. The approximation model uses Symmetric Latin Hypercube Design (SLHD). In [5], a trust-region framework using an interpolating Radial Basis Function (RBF) model is used. Surrogate optimization methods were summarized in [6].
All the methods proposed in the open literature approximate or interpolate the unknown regions of the fitness function by sampling it at as few points as possible to obtain a reliable surrogate. Such complex methods are undoubtedly able to estimate a more accurate substitute solution than our proposed method. In contrast, the tolerance-based surrogate method simply stores all the evaluated solutions in an archive and reuses the stored fitness values later if a new solution is within a specified margin of an archived one. It can be used on any problem with any single-objective or multi-objective algorithm. While the methods proposed in [2][3][4][5][6], and other surrogate optimization methods found in the open literature, are rather difficult to implement, our tolerance-based surrogate method is, owing to its simple principle, very easy to implement and is also universal for all kinds of optimization algorithms and problems. Therefore, it has the potential to help engineers in various branches reduce design cost without studying the problem in depth or implementing complex surrogate methods.
In Sec. 2, the optimization technique used for the validation of the proposed method is described. In Sec. 3, the principle of the tolerance-based surrogate method is described. In Secs. 4 and 5, the metrics used for the validation of the results and the problems used for benchmarking are discussed. Section 6 is dedicated to the experimental verification of the proposed method. Section 7 presents the use of the tolerance-based surrogate method for the design of a band-stop filter. Finally, the conclusion of the paper is given in Sec. 8.

Optimization Technique
For the validation purposes of this paper, the FOPS toolbox [7] was used. The Multi-objective Particle Swarm Optimization (MOPSO) [8] algorithm was used to obtain the presented results. The Elitist Non-dominated Sorting Genetic Algorithm (NSGA-II) [9] and the Third Generalized Differential Evolution algorithm (GDE3) [10] were also tried, and the results they produced were practically the same as those from the MOPSO algorithm.
Minor differences in the results were related to the performance of the algorithms rather than to the tolerance-based surrogate method itself. Therefore, the results from the GDE3 algorithm are not presented in this paper. Although the results of the NSGA-II algorithm are also similar to those of the MOPSO algorithm, they are included in Appendix 1 because of the completely different nature of the NSGA-II algorithm.
The MOPSO algorithm is based on the simulation of the social behaviour of bees in a swarm. The position of a particle is changed according to its own experience and that of its neighbours, following the equation [8]:

$$\mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{v}_t \, \Delta t, \tag{1}$$

where $\mathbf{x}_t$ is the position of the particle at time step $t$, $\Delta t$ is the time step ($\Delta t = 1$), and $\mathbf{v}_t$ is the velocity vector at time step $t$. The velocity vector reflects the exchange of information and is defined as follows [8]:

$$\mathbf{v}_t = w \, \mathbf{v}_{t-1} + c_1 r_1 \left( \mathbf{x}_{pbest} - \mathbf{x}_{t-1} \right) + c_2 r_2 \left( \mathbf{x}_{gbest} - \mathbf{x}_{t-1} \right), \tag{2}$$

where $w$ is the inertia weight, $c_1$ and $c_2$ are the cognitive and social learning factors, respectively, $r_1, r_2 \in [0, 1]$ are random values, $\mathbf{x}_{pbest}$ is the position of the personal best, and $\mathbf{x}_{gbest}$ is the position of the global best.
The multi-objective variant of the PSO algorithm is extended by an external archive, a container that stores the non-dominated solutions found during the optimization run. The global best solution ($\mathbf{x}_{gbest}$ in (2)) is therefore selected from among the external archive members. To avoid overloading the external archive, a pruning method based on the crowding distance [9] is used.
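The particle update described above can be sketched in a few lines of Python. This is only an illustration of the standard PSO velocity and position update, not the FOPS implementation: the function name `pso_step`, the use of plain lists, and the choice to draw fresh random factors for every decision variable are assumptions made here.

```python
import random

def pso_step(x, v, x_pbest, x_gbest, w, c1=1.5, c2=1.0, dt=1.0, rng=random):
    """One velocity and position update for a single particle.

    x, v, x_pbest and x_gbest are equal-length lists of floats.
    Drawing r1 and r2 per decision variable is an assumption of this sketch.
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, x_pbest, x_gbest):
        r1, r2 = rng.random(), rng.random()  # random factors in [0, 1)
        vi_new = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vi_new)
        new_x.append(xi + vi_new * dt)
    return new_x, new_v

# A particle sitting on both its personal and the global best keeps moving
# only because of its inertia term.
position, velocity = pso_step([0.5, 0.5], [0.1, -0.2], [0.5, 0.5], [0.5, 0.5], w=0.5)
print(position, velocity)
```

In a multi-objective run, `x_gbest` would be picked from the external archive of non-dominated solutions before each call.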

Tolerance-based Surrogate Method
As mentioned above, the tolerance-based surrogate method allows an optimization algorithm to skip some fitness function evaluations. The question is which evaluations can be skipped.
If the design of an electromagnetic structure is considered, there are always limits on manufacturing precision, so it is useless to evaluate the fitness values for dimensions (decision variables) that differ at, e.g., the sixth decimal place. Moreover, the fitness values of such similar dimensions would most likely be very similar too, and their overall contribution to the optimization process would be minimal. This is the essential idea of the tolerance-based surrogate method. At the beginning of the optimization run, no fitness values are known and the fitness functions have to be evaluated for every solution. Each evaluated solution (i.e. the decision variables and the corresponding fitness values) is stored in the archive. At some point in the optimization process, the algorithm converges close to the true global optimum (the minimum or maximum for single-objective optimization and the Pareto-front for multi-objective optimization), and new solutions with yet unknown fitness values begin to be similar (or equal) to some members of the archive. Evaluating the fitness functions of such solutions contributes negligibly to the optimization process, so the fitness functions are not evaluated and the fitness values of the closest solution in the archive are used instead.
How close a new solution has to be to a member of the archive is defined by the vector of tolerances, which has as many elements as the problem has decision variables. Whenever the differences in all decision variables between some member of the archive and the new solution are lower than the corresponding elements of the tolerance vector, the fitness function evaluation is skipped.
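The archive lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the class name `ToleranceArchive` and its attributes are hypothetical, the comparison is strict (so a zero tolerance disables the method), and the "closest" archive member is taken to be the one with the smallest unscaled Euclidean distance among those within tolerance.

```python
class ToleranceArchive:
    """Stores evaluated solutions and reuses their fitness within a tolerance."""

    def __init__(self, fitness_fn, tolerances):
        self.fitness_fn = fitness_fn        # the expensive fitness function
        self.tolerances = list(tolerances)  # one tolerance per decision variable
        self.positions = []                 # decision vectors evaluated so far
        self.fitnesses = []                 # their fitness values
        self.surrogate_hits = 0             # number of skipped evaluations

    def _find(self, x):
        """Index of the closest archive member within tolerance, or None."""
        best, best_dist = None, float("inf")
        for idx, member in enumerate(self.positions):
            diffs = [abs(a - b) for a, b in zip(x, member)]
            # Every decision variable must differ by less than its tolerance.
            if all(d < t for d, t in zip(diffs, self.tolerances)):
                dist = sum(d * d for d in diffs)
                if dist < best_dist:
                    best, best_dist = idx, dist
        return best

    def evaluate(self, x):
        idx = self._find(x)
        if idx is not None:
            self.surrogate_hits += 1
            return self.fitnesses[idx]      # surrogate: reuse stored fitness
        f = self.fitness_fn(x)              # true evaluation
        self.positions.append(list(x))
        self.fitnesses.append(f)
        return f
```

An optimizer would call `archive.evaluate(x)` wherever it would otherwise call the fitness function directly. Only truly evaluated solutions are stored, so surrogate values are never propagated further.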
Figure 1 further clarifies the tolerance-based surrogate method. It depicts the decision space of a simple two-objective optimization problem and several solutions stored in the archive. A light grey grid denotes the limits of the decision variables, i.e. $x_1 \in [0.1, 1]$ and $x_2 \in [0, 1]$. The fitness functions are defined as follows:

$$f_1(\mathbf{x}) = x_1, \tag{3}$$

$$f_2(\mathbf{x}) = \frac{1 + x_2}{x_1}. \tag{4}$$

A red line marks the true Pareto-front of the problem in the decision space. There are 15 solutions stored in the archive (their positions are marked with thick black crosses). The tolerance vector was set to {0.05, 0.1}, and the areas within tolerance are marked by the hatched boxes around the solutions. Five of the solutions are marked with index numbers from 1 to 5.
Solution 1 is a true Pareto-optimal solution and its fitness values are {0.1, 10}. Solution 2 has the fitness values {0.3, 3.67}. Solution 3 is not far from optimality (note that its tolerance box covers a part of the true Pareto-front) and its fitness values are {0.5, 2.15}. Solution 4 has the fitness values {0.7, 1.643}, and solution 5 has the fitness values {1, 1} (it is also a true Pareto-optimal solution). All the indexed solutions are non-dominated (in the objective space).
If a newly generated solution has the position, e.g., {0.94, 0} (marked with a thin blue cross in Fig. 1), it falls in the tolerance area of solution 5 ({1, 0}). Even though its fitness values according to (3) and (4) would be {0.94, 1.064}, the fitness values {1, 1} of solution 5 are assigned to it. The deviation in fitness values caused by the tolerance-based surrogate method is relatively small in this case.
Another generated solution has the position, e.g., {0.14, 0} (also marked with a thin blue cross in Fig. 1). This solution falls in the tolerance area of solution 1 ({0.1, 0}). Even though its fitness values according to (3) and (4) would be {0.14, 7.143}, the fitness values {0.1, 10} of solution 1 are assigned to it. The difference between the true fitness values and the surrogate fitness values is rather large here, although the absolute distance between the archive member and the generated solution in the decision space is similar to that in the case described in the previous paragraph. This suggests that setting the tolerance vector can sometimes be a difficult task.
Solution 3 in Fig. 1 illustrates the main drawback of the tolerance-based surrogate method. The border of the hatched box of this solution lies on the true Pareto-front, but the solution itself is rather far away ({0.5, 0.075}). Therefore, if a new solution is generated within the hatched box, e.g. at {0.5, 0} (marked with a thin blue cross), the fitness values of the known solution are assigned to it. However, the fitness values of the true Pareto-optimal solution at the position {0.5, 0} according to (3) and (4) are {0.5, 2}, while the fitness values of solution 3 from the archive are {0.5, 2.15}. As a result, the solution {0.5, 0} (which is, as we know, better than the archive member) will be suppressed in the optimization process because of its downgraded fitness values. This means that the part of the true Pareto-front under solution 3 in Fig. 1 is inaccessible to the tolerance-based surrogate method when the values in the tolerance vector are too large.
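The three cases discussed above can be checked numerically. The sketch below assumes that the two fitness functions of the example are f1 = x1 and f2 = (1 + x2)/x1, which reproduces the fitness values quoted for solutions 1, 3 and 5; the helper names are hypothetical.

```python
def fitness(x1, x2):
    """Assumed two-objective example: f1 = x1, f2 = (1 + x2) / x1."""
    return (x1, (1.0 + x2) / x1)

def surrogate_error(new, member):
    """Absolute deviation between the reused and the true fitness values."""
    true_f = fitness(*new)
    used_f = fitness(*member)  # values that would be taken from the archive
    return tuple(abs(u - t) for u, t in zip(used_f, true_f))

# (new solution, archive member whose fitness is reused)
cases = {
    "near solution 5": ((0.94, 0.0), (1.0, 0.0)),
    "near solution 1": ((0.14, 0.0), (0.1, 0.0)),
    "near solution 3": ((0.5, 0.0), (0.5, 0.075)),
}

for label, (new, member) in cases.items():
    print(label, surrogate_error(new, member))
```

Under this assumption, the deviation in the second fitness value is roughly 0.064 near solution 5, 2.86 near solution 1 and 0.15 near solution 3, which shows how strongly the effect of one and the same tolerance vector depends on the local slope of the fitness functions.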
There is no methodology for estimating the proper tolerance vector. The tolerance vector depends on the optimized problem and on the user's knowledge of the problem. The drawback of the tolerance-based surrogate method, namely that it can make parts of the true Pareto-front inaccessible and therefore introduces uncertainty into the optimization process, is balanced by its positive effect on the overall number of fitness evaluations, i.e. the overall cost of optimization. In other words, setting the tolerance vector is a trade-off between the time saved and the inaccessible area that might occur around the true global optimum.
Note that the optimization algorithm can still reach any point within the decision space except for the close neighbourhoods of the archive members.
The drawback can be suppressed by using a discrete decision space. When the decision space is discrete, the tolerance vector can be set to almost zero values, and a new solution is either identical to an archive member or differs from it by at least one step of a discrete decision variable. If the new solution generated by the optimization algorithm already exists in the archive, it is not calculated again. In this scenario, no regions of the decision space are inaccessible to the optimization algorithm.
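For a discrete decision space, the archive lookup reduces to an exact-match check. One possible way to sketch this, shown here as an alternative to comparing with near-zero tolerances and not as the authors' implementation, is to key a dictionary by the integer grid indices of each solution. The function name, the `lower` and `step` parameters and the `stats` dictionary are assumptions of this sketch.

```python
def make_cached_fitness(fitness_fn, lower, step):
    """Wrap fitness_fn so that identical grid points are evaluated only once."""
    cache = {}
    stats = {"evaluated": 0, "reused": 0}

    def to_key(x):
        # Integer grid index per decision variable; rounding absorbs float noise.
        return tuple(round((xi - lo) / st) for xi, lo, st in zip(x, lower, step))

    def evaluate(x):
        key = to_key(x)
        if key in cache:
            stats["reused"] += 1
        else:
            cache[key] = fitness_fn(x)
            stats["evaluated"] += 1
        return cache[key]

    return evaluate, stats

# Hypothetical two-variable grid with a step of 0.1 starting at 0.
evaluate, stats = make_cached_fitness(lambda x: sum(x), lower=(0.0, 0.0), step=(0.1, 0.1))
evaluate((0.3, 0.7))
evaluate((0.3, 0.7))
print(stats)
```

A dictionary lookup also avoids the growing number of comparison operations that a linear scan of the archive requires.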

Evaluated Metrics
The performance of the tolerance-based surrogate method is tested from two points of view. The first one is the computational time required to perform a particular simulation run. For better insight into the time-saving property of the tolerance-based surrogate method, Table 3 contains the average count of fitness values obtained by our surrogate method.
The second point of view is the value of the generational distance, which was proposed in [11]. It defines the distance between a non-dominated set $P$ and the true Pareto-front $P^*$. It is obtained by the equation:

$$GD = \frac{\sqrt{\sum_{i=1}^{|P|} d_i^2}}{|P|}, \tag{5}$$

where $d_i$ is the minimal Euclidean distance in the objective space between the $i$-th solution from the set $P$ and any member of the true Pareto-front $P^*$:

$$d_i = \min_{k = 1, \dots, |P^*|} \sqrt{\sum_{m=1}^{M} \left( f_m(i) - f_m^*(k) \right)^2}, \tag{6}$$

where $f_m(i)$ is the $m$-th fitness value of the $i$-th member of the set $P$, $f_m^*(k)$ is the $m$-th fitness value of the $k$-th member of the set $P^*$, and $M$ is the number of objectives.
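A short sketch of the generational distance computation follows. It assumes the commonly used form of the metric from [11], i.e. the square root of the sum of squared minimal Euclidean distances divided by the number of solutions in P; the function name and the list-of-tuples representation are choices made for this illustration.

```python
import math

def generational_distance(front, true_front):
    """Generational distance between a non-dominated set and the true front.

    Both arguments are sequences of fitness vectors of equal length.
    Assumed form: sqrt(sum_i d_i^2) / |P|, with d_i the minimal Euclidean
    distance from the i-th member of `front` to any member of `true_front`.
    """
    total = 0.0
    for f in front:
        # Squared distance to the nearest member of the true Pareto-front.
        d2 = min(sum((a - b) ** 2 for a, b in zip(f, g)) for g in true_front)
        total += d2
    return math.sqrt(total) / len(front)

# Two found solutions, each at distance 1 from a sampled true front.
print(generational_distance([(0.0, 1.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 0.0)]))
```

The value is zero only when every member of the found set coincides with a sample of the true Pareto-front, so the result depends on how densely the true front is sampled.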

Testing Problems
The validation of the tolerance-based surrogate method was performed on several two-objective optimization benchmark problems. However, the tolerance-based surrogate method is independent of the number of objectives.
A summary of the benchmark problems can be seen in Tab. 1 (MOFON stands for Fonseca and Fleming's study, MOKUR stands for Kursawe's study, MOPOL stands for Poloni's study, and MOZDT1 and MOZDT6 stand for Zitzler, Deb and Thiele's studies). The benchmark problems are further described in [12].
The generational distance metric uses the true Pareto-fronts $P^*$ for the distance calculation. The true Pareto-fronts of the MOFON, MOZDT1, and MOZDT6 problems can be found in [12], while the true Pareto-fronts of the MOPOL and MOKUR problems were obtained by very dense sampling of the regions where the true Pareto-front is located.

Results
The controlling parameters of the MOPSO algorithm were set as follows: the inertia weight $w$ was linearly decreased from 0.6 to 0.4 over the iterations, the cognitive learning factor was $c_1 = 1.5$, and the social learning factor was $c_2 = 1$.
There were 100 agents in each simulation run over 100 iterations. Therefore, the fitness function would be evaluated 10 000 times if the tolerance-based surrogate method was disabled.
The evaluation of the fitness function is almost immediate in the case of the benchmark problems, so the use of the tolerance-based surrogate method would bring no benefit. For this reason, delays were inserted into the fitness functions. The delays were 0, 1, and 10 milliseconds. Because of the inserted delays, all evaluations of the fitness functions alone took 0, 10, and 100 seconds, respectively, for each simulation run if the tolerance-based surrogate method was disabled.
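The effect of the inserted delays can be reproduced with a small wrapper. The sketch below is an assumption about how such a delay might be added (a plain `time.sleep` before the evaluation); the function names are hypothetical, and the second helper only restates the arithmetic behind the 0, 10 and 100 second totals.

```python
import time

def with_delay(fitness_fn, delay_s):
    """Return a fitness function that waits delay_s seconds before evaluating."""
    def delayed(x):
        time.sleep(delay_s)  # emulates a costly simulation
        return fitness_fn(x)
    return delayed

def total_delay_seconds(agents, iterations, delay_ms):
    """Time spent in delays alone when every solution is evaluated."""
    return agents * iterations * delay_ms / 1000.0

for delay_ms in (0, 1, 10):
    print(delay_ms, "ms ->", total_delay_seconds(100, 100, delay_ms), "s")
```

Every evaluation skipped by the surrogate method saves one such delay, which is why the benefit grows with the cost of a single evaluation.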
The tolerance vector was defined as a fraction of the range of the problem's decision variables, i.e. 0, 0.001, 0.01, 0.05, and 0.1 times the range of each decision variable. The first value means that the range of each decision variable is "divided" into an infinite number of sections. In other words, the tolerance-based surrogate method is disabled. The last value means that the range of each decision variable is "divided" into 10 sections. The quotation marks refer to the fact that the tolerance-based surrogate method has nothing to do with the discretization of the decision variables. The tolerance-based surrogate method only takes the positions of two randomly generated solutions and checks whether their difference is lower than the tolerance or not. All values presented in Tabs. 2-5 are averages over 100 repetitions.
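Deriving the tolerance vector from the variable ranges is a one-line computation. The following sketch uses a hypothetical two-variable problem with bounds chosen only for illustration.

```python
def tolerance_vector(lower, upper, fraction):
    """Per-variable tolerance as a fraction of each decision variable's range."""
    return [fraction * (hi - lo) for lo, hi in zip(lower, upper)]

# Hypothetical bounds: x1 in [0.1, 1], x2 in [0, 1].
lower, upper = [0.1, 0.0], [1.0, 1.0]
for fraction in (0.0, 0.001, 0.01, 0.05, 0.1):
    print(fraction, tolerance_vector(lower, upper, fraction))
```

With a fraction of 0, every tolerance is zero and no archived solution can satisfy a strict "lower than" comparison, so the method is effectively switched off.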

Computational Time
Table 2 contains the average computational times of the particular simulations. The first three lines (where the tolerance-based surrogate method is disabled) show that the computational time of the optimization method alone is almost independent of the problem. It is also evident how the inserted delays affect the computational times.
The following lines show the computational times when the tolerance-based surrogate method is enabled. The higher the tolerance is, the larger the time saving is. In contrast to the disabled case, the computational time with the tolerance-based surrogate method enabled depends on the problem.
On several occasions, the computational time is larger when the tolerance-based surrogate method is enabled than when the tolerance vector elements are set to zero. The most obvious cases are those where no delay and small tolerance values were used. This behaviour is caused by the number of comparison operations required by the tolerance-based surrogate method. If the tolerance is small and no surrogate fitness values are found in the archive (MOKUR and MOZDT1 problems, see Tab. 3), the number of comparison operations increases quickly during the optimization, which slows down the entire process.
In particular, in the case of the MOZDT1 problem with the tolerance vector elements set to 0.001, no surrogate solutions were found (see Tab. 3). This problem has 30 decision variables, so the probability that a new solution is within the tolerance of some solution stored in the archive is lower than for the other problems. The number of comparison operations grows quickly from 100 × 30 after the first iteration to 10000 × 30 after the last iteration. As a result, the overall slowdown is almost 13 seconds.
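The growth of the comparison cost can be estimated with a simple count. The sketch below gives an upper bound under assumptions that are not stated in the text: the archive grows after every single evaluation, each new solution is compared with every archive member in all decision variables (no early exit), and no surrogate is ever found.

```python
def worst_case_comparisons(agents, iterations, n_vars):
    """Upper bound on scalar comparisons when no surrogate is ever found."""
    n_evals = agents * iterations
    # The n-th evaluation (counting from 0) is compared with n archive
    # members, each in n_vars decision variables: n_vars * sum(0..n_evals-1).
    return n_vars * n_evals * (n_evals - 1) // 2

print(worst_case_comparisons(100, 100, 30))
```

For 100 agents, 100 iterations and 30 decision variables, this bound is about 1.5 billion scalar comparisons, which makes a slowdown of several seconds plausible when the fitness function itself is almost free.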

Number of Surrogate Solutions
Table 3 shows how many fitness values were taken from the archive during each simulation run. The content of Tab. 3 correlates with the content of Tab. 2. The first three lines of the table were omitted because they contain only zeros (the tolerance-based surrogate method is disabled). With increasing tolerance values, the number of surrogate solutions also increases, because there is a higher probability of finding a close enough solution in the archive.
The differences between values with the same tolerance but for different problems are determined, firstly, by the number of decision variables. If a problem has only three decision variables (MOFON), it is much easier to randomly generate a solution close to a member of the archive than for a problem with 30 decision variables (MOZDT1).
Secondly, the differences are determined by the speed of convergence to the optimum. If the agents rapidly approach the global optimum, new solutions are very similar to the previous ones, so surrogate solutions can be found in the archive (MOFON). In contrast, if the agents approach the optimum slowly, the positions are continuously drawn towards the optimum (MOKUR) and surrogate solutions cannot be found in the archive.

Generational Distance
When the tolerances are increased, the generational distance also increases because of the drawback of the tolerance-based surrogate method described in Sec. 3. Differences between values of the generational distance for the same tolerance are caused by the difficulty of the problem. The MOFON problem is a relatively simple one. On the other hand, the MOZDT4 problem, with only 10 decision variables (compared with 30 for the MOZDT1 problem), has many local optima in which an algorithm can get caught, so the values of the generational distance are large.
Table 5 contains the generational distance values of the simulations in which a discrete decision space was used. Note that the second column in Tab. 5 is now called Fraction. In this case, the tolerances were set to very low values (1e−6). However, the discretization of the decision variables corresponds to the tolerances from the simulations with a real-coded decision space. Therefore, a fraction of 0.1 denotes that each decision variable was sampled at 11 points. When the discrete decision space is used, a surrogate is found only if a new solution is identical to a member of the archive. Otherwise, the fitness functions have to be evaluated.
Some problems (MOPOL and MOFON) in Tab. 5 show that if the fraction value is increased, meaning that the decision variables are sampled more sparsely, the generational distance deteriorates. This is caused by the fact that the true Pareto-optimal solutions do not coincide with the samples of the decision variables. The true Pareto-optimal set of the MOZDT problems corresponds to $x_1 \in [0, 1]$ while all the other decision variables are zero. Therefore, the discrete samples of the decision variables can match the true Pareto-optimal set.
Tables analogous to Tab. 2 and Tab. 3 for the discrete decision space are not presented, because their content is similar to that of the tables for the real-coded decision space.

Anisotropic Band-Stop Filter Design
Until now, only artificial benchmark problems have been considered for the verification of the tolerance-based surrogate method in this paper. The benchmark problems were artificially altered by inserting delays in order to obtain reproducible results, but in practice the use of the tolerance-based surrogate method on such optimization tasks serves no purpose.
The tolerance-based surrogate method was used to advantage in the synthesis of the electromagnetic equivalents of composite sheets [13]. An anisotropic band-stop filter based on a microstrip line above a uniplanar band-gap (UBG) ground plane was designed by multi-objective optimization.
The anisotropic band-stop filter is formed by the microstrip line above a ground plane with an array of etched slots of varying widths (Fig. 2). Changing the number of step-impedance slot lines $N$, the slot period $a$ (the dimension of the etched slots) and the angle $\varphi$ between the microstrip line and the step-impedance slots changes the transmission properties of the filter.
The transmission characteristics of the band-stop filter were obtained by a full-wave analysis in the transient solver of CST Microwave Studio. The design properties $N$, $a$ and $\varphi$ acted as the decision variables, and the fitness function evaluation consisted of a full-wave analysis in CST Microwave Studio followed by parsing of the transmission characteristics to obtain the fitness values. An RT/Duroid substrate ($h = 0.635$ mm, $\varepsilon_r = 10.2$, $t = 35$ µm) was used in the optimized structure.
Such an evaluation is clearly time-demanding (approximately 5 to 30 minutes), so great emphasis should be put on keeping the overall number of fitness function evaluations to a minimum.

Optimization Parameters
The decision variables were discrete and were defined as follows: the number of step-impedance slot lines $N \in \{5, 7, 9\}$, the slot period $a \in \{0.5, 0.6, \dots, 2.0\}$ mm, and the angle between the microstrip line and the step-impedance slots $\varphi \in \{0, 2, 4, \dots, 90\}^\circ$. The fitness values were obtained from the frequency response of the transmission coefficient defined as follows (see Fig. 3):

$$|S_{21}(\mathbf{x}, F)| > -5\ \text{dB}, \quad F \in [9\ \text{GHz}, 10\ \text{GHz}], \tag{9}$$

where $\mathbf{x} = [a, N, \varphi]^T$ is the position of a solution and $F$ denotes the frequency.
Each frequency band forms one fitness function, defined by (10)-(12). Basically, each fitness function is a sum of the $S_{21}$ values that violate the defined frequency mask (7)-(9). CST Microwave Studio produces 5001 frequency samples of $S_{21}$ within the interval from 0 GHz to 12 GHz.
where the operator ♦ denotes that the output is equal to the argument in square brackets only if the argument is positive. Otherwise, the output is zero.
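The idea of summing only the mask violations can be illustrated for the 9 GHz to 10 GHz band, where the transmission coefficient is required to stay above −5 dB. The sketch below is one plausible reading of such a fitness function and is not taken from equations (10)-(12): the function names, the sample-wise summation in dB and the default band limits are assumptions.

```python
def positive_part(value):
    """Return the argument if it is positive, otherwise zero."""
    return value if value > 0 else 0.0

def passband_violation(freqs_ghz, s21_db, f_lo=9.0, f_hi=10.0, limit_db=-5.0):
    """Sum of the amounts by which S21 drops below limit_db inside the band."""
    return sum(
        positive_part(limit_db - s)  # positive only where the mask is violated
        for f, s in zip(freqs_ghz, s21_db)
        if f_lo <= f <= f_hi
    )

# Four hypothetical frequency samples; only the last two violate the mask.
print(passband_violation([8.5, 9.0, 9.5, 10.0], [-20.0, -3.0, -8.0, -5.5]))
```

A fitness value of zero then means that the response satisfies the mask at every sampled frequency in the band.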

Optimization
The paper [13] compares the performance of two algorithms, GDE3 and NSGA-II, in the design of the band-stop filter. Both algorithms used 20 agents over 20 iterations, which means that each algorithm needed 400 fitness function evaluations per run. To obtain independent realizations of the stochastic processes, the runs of both algorithms were repeated 10 times.
In other words, 2 × 400 × 10 = 8000 fitness function evaluations would normally be needed. It would take almost 56 days to evaluate the fitness functions (assuming that each fitness function evaluation takes 10 minutes). However, the overall number of possible solutions is 3 × 16 × 46 = 2208 (the decision variables are discrete). This means that some solutions would certainly be evaluated more than once, which encourages the use of the tolerance-based surrogate method.
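The numbers in this estimate follow directly from the settings given above and can be checked with a few lines of Python (the variable names are chosen here for illustration):

```python
algorithms, agents, iterations, repetitions = 2, 20, 20, 10
minutes_per_evaluation = 10  # assumed average cost of one full-wave analysis

evaluations = algorithms * agents * iterations * repetitions
days = evaluations * minutes_per_evaluation / (60 * 24)

# Discrete decision space: N in {5, 7, 9}, a from 0.5 to 2.0 mm in steps of
# 0.1 mm (counted in tenths of a millimetre), phi from 0 to 90 degrees in
# steps of 2 degrees.
n_values = [5, 7, 9]
a_values_tenths_mm = range(5, 21)
phi_values_deg = range(0, 91, 2)
grid_points = len(n_values) * len(a_values_tenths_mm) * len(phi_values_deg)

print(evaluations, round(days, 1), grid_points)
```

Because the requested number of evaluations exceeds the number of distinct grid points by a factor of more than three, repeated evaluations of identical solutions are unavoidable without an archive.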
Thanks to the tolerance-based surrogate method, the whole procedure took about 15 days, and a total of 2022 solutions had to be full-wave analysed by CST Microwave Studio (an average fitness function evaluation took a little over 10 minutes). The remaining solutions (out of 2208 overall) were not reached during any optimization run.
Figure 3 shows the result of the anisotropic band-stop filter design using multi-objective optimization. The figure also shows the intended frequency mask (hatched areas) described by equations (7)-(9).

Conclusion
The Tolerance-based Surrogate Method, which reduces the time of the optimization process, has been introduced. It has been shown that certain fitness function evaluations are unnecessary, so the overall computational time of an optimization process can be reduced.
The drawback is that skipping fitness function evaluations can lead to a loss of precision. The precision loss can be reduced if a discrete decision space is used with an appropriate tolerance vector.
It was also discussed that if the tolerance-based surrogate method is used improperly, the optimization process can even be slowed down.
A real-world optimization task, the design of an anisotropic band-stop filter, was also presented. A fitness function evaluation took around 10 minutes, and the overall optimization time was reduced from around 2 months to approximately 2 weeks thanks to the tolerance-based surrogate method.
Since the evaluation of fitness functions can be very time-consuming, the proposed tolerance-based surrogate method can accelerate the whole optimization process even if only a few surrogate solutions are found.
The tolerance-based surrogate method can also be exploited in recurrent optimization tasks, either after altering the algorithm settings or after the crash of a simulation, because the archive of known solutions can be loaded before the beginning of the optimization process and surrogate solutions can be used from its early stages.

Fig. 1. Decision space of a two-objective problem with solutions stored in the archive.

Fig. 3. Frequency response of the transmission coefficients of the best solution found in the optimization process.
An average computational time in seconds - NSGA-II.
An average number of surrogate solutions - NSGA-II.