Automatic Design of Metaheuristics for Practical Engineering Applications

It is common to find multiple metaheuristics to solve continuous optimization problems. However, choosing what optimizer may obtain the best results for a given task requires exhaustive evaluations that are highly application-dependent. Besides, it is necessary to find sufficiently good tuning parameters to achieve satisfactory performance with the selected approach. In this context, the automatic design of algorithms, particularly those based on heuristics, has been increasing in popularity in the previous years due to its undoubted relevance nowadays. This paper explores a novel approach based on hyper-heuristics to carefully select population-based search operators and their tuning parameters to generate metaheuristics capable of dealing with a given practical engineering problem. The proposed strategy is assessed using three highly relevant and illustrative problems: training Artificial Neural Networks, designing PID controllers, and modeling a calorimetric phenomenon based on fractional calculus. In addition, we implement three well-known optimization metaheuristics to compare achieved solutions via the proposed hyper-heuristic strategy, namely Particle Swarm Optimization, Genetic Algorithm, and Cuckoo Search. Results from extensive numerical tests prove that the customized metaheuristics are generally superior to the three well-known algorithms, taking only a few iterations to converge to an optimal solution. This is an excellent indicator of alleviating the effort and expertise required to choose the proper methodology when dealing with real-valued optimization problems.


I. INTRODUCTION
Optimization methods have become a research topic of great interest, especially when designing complex engineering systems [1], [2], [3]. This may be something other than a trendy topic shadowed by sophisticated Artificial Intelligence techniques nowadays. Still, they are the theoretical and practical cornerstones of many modern applications, even those sophisticated methods. Technological advances have The associate editor coordinating the review of this manuscript and approving it for publication was Yeliz Karaca . enabled it to develop new engineering processes, leading to further optimization applications. From a practical point of view, optimization consists of adjusting or fine-tuning system design parameters based on one or more performing functions. Unfortunately, this task is not trivial in many cases, especially when the estimated searching space is complex, nonlinear, discontinuous, or presents high dimensionality [4], [5], [6], [7]. Indeed, most real-world engineering problems present high design complexity, nonlinear constraints, and vast solution domains. These issues justify using modern techniques capable of dealing with challenging optimization case studies. One family of these techniques comprises Metaheuristic (MH) algorithms. MHs have become an alternative to gradient-based optimization methods, especially when the latter evolves into ill-conditioned scenarios during the running time in the application. Thereunto, MHs are characterized by their flexibility, versatility, and algorithmic simplicity when dealing with optimization models [8].
Nevertheless, MHs are not magical. One must know how to select the most suitable, stable, and fast (if so) MH for a given case study. Notwithstanding, it is often necessary to know how to configure its internal parameters efficiently to obtain the highest method performance in terms of accuracy and the number of iterations required for global convergence. The levels of expertise and problem knowledge of the designer, engineer, or practitioner play a crucial role in successfully implementing these methodologies. No matter its flavors, the No-Free-Lunch theorem still reigns [9]. Various MHs have been proposed and evaluated over the last decades to solve a colorful palette of challenging problems [10]. However, some of the newly developed metaheuristics do not differ substantially from the general structures of well-reputed and conventional MHs such as Simulated Annealing (SA) [11], Differential Evolution (DE) [12], Genetic Algorithms (GA) [13], Grey-Wolf Optimizer (GWO) [14], and Particle Swarm Optimization (PSO) [15], to mention a few.
This study considers these robust MHs as basic heuristics for being analyzed and extracting their Search Operators (SOs) or simple heuristics to generate metaphor-less MHs automatically. In particular, we observe that many SOs are straightforward methodological or mathematical variations of fundamental optimization blocks, e.g., mutation, crossover, fitness functions, stopping criteria, and random searches. Some previous studies have used distinctive functional blocks of impressive methods to develop procedures to select two or more simple heuristics to obtain improved MHs in specialized applications [16]. However, finding a particular MH model with the proper tuning parameters is complex, and developing new strategies allows us to coin problem-specific methods or algorithms for problems sharing certain features. Hence, contemporary engineering design challenges requiring robust optimization algorithms are why combinatorial algorithms have emerged in this direction [17]. The fast progress of hybrid MHs has led to the development of a new optimization sub-area known as Hyper-Heuristics (HHs) [18]. These also heuristic-based algorithms deal with high-level problems by selecting and modifying simple (or low-level) heuristics to, in consequence, find a good quality solution in the low-level problem domain [19]. In other words, HHs focus on automating the model design and adapting the existing heuristic methods to address the search task more efficiently. In addition, these approaches aim to develop the level of generality at which search methodologies can operate [20]. Because of such versatility, HHs have been efficiently implemented in the literature, such as in combinatorial problems [21]. Looking in this direction, it is worth highlighting reports implementing HH models that rely on simple heuristic-based algorithms such as the so-called SA. The grand reception of Simulated Annealing as a high-level solver is undoubtedly explained by its characteristic simplicity, low computation burden, and remarkable effectiveness [18]. For example, Bai et al. developed a HH framework for solving two problems related to flexible decision support, i.e., scheduling university courses and packing garbage cans [22]. Their results showed that the SA-based HH improved the performance considerably over a Tabu-based HH strategy. Likewise, Garza-Santisteban et al. proposed a high-level solver based on SA for selecting the best heuristic sequence that fulfills the requirements of multiple instances of Job Shop Scheduling problems [23]. Moreover, several hybrid approaches incorporating SA-based processes are easily found in the literature [24], [25], [26]. For example, Mosadegh et al. presented a hyper-heuristic based on Q-Learning and Simulated Annealing for dealing with the mixed-model sequencing problem, including stochastic processing times in a multi-station assembly line [27]. Nevertheless, we find challenging to find reports in the literature that address practical engineering applications with continuous domains as low-level problems using any hyper-heuristic framework. Still, only a few approaches have been reported on benchmark continuous optimization problems [28], [29].
This work studies the implementation and feasibility of an automatic algorithm design strategy for generating metaheuristics tailored for practical engineering applications. In that regard, we use the hyper-heuristic framework proposed in [29] and [30] , which contains a collection of search operators extracted from common metaheuristics and an SA-based high-level solver. For the practical problems, we consider training Artificial Neural Networks (ANNs), designing Proportional-Integral-Derivative (PID) controllers, and modeling calorimetric processes via Fractional Calculus (FC). Therefore, we can structurally select the operators to generalize a solution and obtain optimal results for the conditions of each application. Moreover, it counts on high repeatability and performance to standardize operators for each task. We observe from the results that the MHs generated by the HH approach outperform basic MHs. Therefore, we provide the practitioners with an automatic methodology for finding MHs to deal with real optimization problems. So, the principal innovation of our proposal is that the users require a minimum level of expertise in metaheuristics to find one suitable for their needs. Plus, the main contributions of this study can be summarized as follows: 1) Prove the feasibility of the HHs for automatically designing solvers to deal with practical problems. 2) Illustrate an alternative way, based on metaphor-less MH, to tackle three dissimilar problem families. 3) Analyze the performance of the generated MH in terms of accuracy, the number of steps, and repeatability. This document is organized as follows. Sect. II presents the theoretical foundations supporting all methods this work VOLUME 11, 2023 covers. Sect. III describes the implemented framework to analyze the case studies. Sect. IV presents a theoretical overview of the proposed applications and how the implemented methodology deals with them. Finally, Sect. V gives the most relevant conclusions.

A. OPTIMIZATION
Optimization, also called mathematical programming, is focused on estimating the best model parameters that fit the available data, following some criteria among existing alternatives. Therefore, an optimization problem maximizes or minimizes an objective function, achieving the best outcome in practical scenarios. Strictly speaking, the optimization methods search and select the best available values of an objective function, supported mathematically by the following definitions.
Definition 1 (Arbitrary Problem Domain): Let X ⊆ S be a feasible domain since S is an arbitrary domain. This arbitrary domain can be defined according to the problem one is solving. It depends on the level of abstraction, such as continuous, combinatorial, and mixed.
Remark 1 (Particular Problem Domains): For continuous problem domains S = R D , where D stands for the number of dimensions, which is usually fixed and represents the number of design variables in design problems. For combinatorial problems S = Z ϖ , with ϖ is also the number of dimensions or cardinality. For mixed domains, we say that S = R D ×Z ϖ . Lastly, a particular case from the combinatorial domains is the heuristic space, such that S = H ϖ is the heuristic space.
Definition 2 (Minimization Problem): Let f (⃗ x) be a function defined on the set X ̸ = ∅ such as f (⃗ x) : ⃗ x ∈ X ⊆ R D → R. Therefore, the minimization problem is stated as where ⃗ x * ∈ X is the optimal solution that minimizes the objective function, i.e., f (⃗ x * ) ≤ f (⃗ x), ∀⃗ x ∈ X. For the sake of simplicity, we refer to a minimization problem with the tuple comprising its domain and function, such as (X, f ).

B. HEURISTICS
A heuristic is a procedure or operator capable of modifying or generating new potential solutions for optimization problems. Although one can find several definitions in the literature, most describe heuristics in combinatorial optimization domains [31]. However, there are a few cases in continuous domains [19]. In general, heuristics can be classified into three groups: low-level, mid-level, and high-level heuristics [19], [31], which are also known as Simple Heuristics (SHs), Metaheuristics (MHs), and Hyper-Heuristics (HHs). In the following lines, we briefly describe each of these groups; one can find further details in [29] and [32].

C. SIMPLE HEURISTICS
Simple heuristics are a fundamental part of search techniques. SHs directly address the problem domain, and usually are constructive or perturbative. Constructive heuristics generate novel solutions from scratch, whereas perturbative heuristics modify or perturb current solutions [33]. However, at the MH control level, we require that SHs can validate whether the process should be stopped or continued. In simple terms, the finalizer is an essential pillar controlling the search procedure many authors use to define MH [34]. Bearing this in mind, we briefly describe simple heuristics and their three categories. These concepts apply to an arbitrary problem domain S. Definition 6 (Finalizer): Let h f : X × Z 2 → H be a simple heuristic that evaluates the current solution quality and chooses which search operator to apply. To do so, it uses information about the iterative procedure in a criteria function c f : (Z + , R, X, . . .) → Z 2 . Then, h f ∈ H f is a finalizer given by since h 0 is a search operator and h e is the identity operator.

D. METAHEURISTICS
Metaheuristics are commonly defined refined strategies that control simple heuristics. MHs are in vogue in the literature due to their proven performance in different challenging scenarios, outperforming traditional gradient-based strategies [10], [35]. MHs are widely recognized because of their capabilities and performance in optimally solving complex engineering models [35]. Some representative MHs include the Differential Evolution [12], Particle Swarm Optimization [15], Cuckoo Search Algorithm [36], Spiral Optimization Algorithm [37], and Grey-Wolf Optimization [14], among others [38]. Definition 7 formally describes the general scheme for a metaheuristic according to [32].

Definition 7 (Metaheuristic):
Let MH be an iterative metaheuristic procedure that renders an optimal solution ⃗ x * for a given optimization task using an objective function f (⃗ x) (cf. Definition 2). This procedure is represented as a finite sequence of simple heuristics (cf. Definition 3) to be applied iteratively until fulfilling a stopping criterion, i.e.,

E. HYPER-HEURISTICS
The literature describes hyper-heuristics as high-level heuristics that control simple heuristics in optimization solving a given case study [18]. A HH either selects or generates low-level heuristics used to propose hybrid algorithms to address optimization tasks better. They differ from most MH applications because they typically work with the solution search space. So, a HH can be defined, according to Pillay and Qu [19], as follows.
Hence, this technique searches for the optimal heuristic configuration ⃗ h * that best approaches to the solution of (X, f ) with the maximal performance.
Remark 2 (Performance Metric): There is no unique expression for determining the performance, perf( ⃗ h | X, f ) since it depends on the desired requirements for the heuristic sequence ⃗ h. A numerical and practical way to assess this measurement, due to the stochastic nature of almost all the heuristic sequences, is to combine different statistics from several independent runs of ⃗ h over the same problem (X, f ). In this work, we employed the performance metric which comprises the negative sum of median and interquartile range of the last fitness values f ⃗ x * ,r achieved in N r runs, r = 1, . . . , N r , of the same solver.

III. METHODOLOGY
All the experiments in this work were carried out in Python v3.9 running on an ASUS TUF Gaming F17 with AMD Ryzen™ 7 Processor 5700G @ 8 CPU Cores, 16GB RAM, and Windows 10 64-bit. We used the framework CUSTOMHyS v1.0 at https://github.com/jcrvz/customhys and described by Cruz-Duarte et al. to implement the HH search [30]. Figure 1 summarizes the proposed methodology for obtaining optimal metaheuristics for three case studies. For this purpose, we employed Simulated Annealing (SA) to FIGURE 1. Hyper-heuristic methodology implemented to generate optimal metaheuristics for three practical problems.

TABLE 1.
Collection of search Operators employed as the heuristic space. These are composed by a perturbator, its tuning parameters, and a default selector, which can be changed. power our approach, which has been widely implemented as a high-level solver for hyper-heuristic applications and is also coded in CUSTOMHyS [30].
This SA-based HH requires the initial hyper-parameters (such as initial temperature, cooling rate, number of steps, and stagnation count), the low-level optimization problem (X, f ), and the heuristic space (collection of search operators). For the hyper-parameters, we have utilized those provided by default in CUSTOMHyS [30]. Concerning the high-level problem domain, we employed the heuristic space summarized in Table 1. These search operators are composed of a perturbator and a selector. VOLUME 11, 2023 Their names are often related to the MH from which they were extracted. Now, in the high-level problem domain, we utilized a collection of 205 SOs as the heuristic space. This domain is briefly shown in Table 2, where we regard different variations and predefined values for the heuristics tuning parameters according to the data summarized in Table 1. In Table 2, each row groups four versions of the same operator but with a systematic variation of the selector, such as Direct, Greedy, Metropolis, and Probabilistic, except the first row, Random Search, which has a predefined Greedy selector. For the eager reader, Cruz-Duarte et al. provide further details about this collection [30].
In addition, Pseudocode 1 describes the procedure associated with the proposed methodology shown in Figure 1. Consider that all the terminology, symbols, and concepts employed are coherent with the theoretical foundations presented in Sect. II. In simple words, the automatic design strategy of metaheuristics proposed in this work corresponds to the well-known Simulated Annealing as a hyper-heuristic. Such a fact is easy to notice from the backbone of Pseudocode 1, where the first block stands for initialization. Then the candidate heuristic sequence is modified and evaluated in the low-level problem in the main loop, followed by the Metropolis selection criterion. All is repeated until the temperature variable reaches a minimal value or the overall procedure is stagnated. The slight but substantial difference between this SA implementation with a typical one is the heuristic space exploration. This process is carried out by performing a randomly selected action from an action set (i.e., add, delete, and perturb) to find a candidate neighbor. Additional details about this implementation can be found in [29] and [30]. Bear in mind that for this work, we slightly adjusted the methods provided by the CUSTOMHyS framework to study the feasibility of this strategy on practical engineering problems, represented by (X, f ).
Lastly, the three case studies chosen for this work were applications related to Feedforward Neural Networks, Control Systems, and Thermal Modeling based on Fractional Calculus. Each of these cases is detailed in the next section. Nevertheless, it is essential to highlight that each case study has different problem dimensionality. Besides, we repeated each optimization process 50 times to ensure statistical significance, and we used a population size of 30 and a maximum of 30 hyper-heuristic steps. Plus, we analyzed the performance of MHs generated by the hyper-heuristic approach against three traditional MHs from the literature for each of the three case studies. These metaheuristics are Particle Swarm Optimization (PSO), Cuckoo Search (CS), and Gravitational Search Algorithm (GSA). We selected these MHs due to their implementations in similar applications [39], [40], [41], [42], [43], [44], [45], [46], [47].

IV. PRACTICAL APPLICATIONS
This section describes the three case studies we selected to illustrate the automatic design of metaheuristics. We carried out all the experiments from these cases following the methodology described above, and particularities are specified when necessary.

A. TRAINING FEEDFORWARD NEURAL NETWORKS
The first practical case study concerns a Machine Learning application, as follows. A Feedforward Neural Networks (FNN) is a particular architecture type of Artificial Neural Networks (ANNs) [48], [49], [50]. for r = 1 to N r do ▷ Perform repetitions required in (5)  17: return ⃗ x * (ϖ ) ▷ Return the best candidate solution attraction remains in letting us perceive the computational models structurally to simplify their analysis. So, the neuronal interconnections in this architecture generate a unidirectional information flow, i.e., the signal never passes more than once through a neuron before generating the output response, as shown in Figure 2.
FNNs have become famous for several reasons. First, they usually generalize well in practice [51], i.e., when trained with a large data set, they often provide a correct output for an input not contained in such a training set. Second, the neural training is performed with a reliable algorithm, the well-known Back-Propagation, in a reasonable number of iterations [52]. This algorithm is used to compute the gradient of all weights errors for a given input by propagating the error backward throughout the network. In the end, the training process concludes by determining the optimal weights and biases in the neuronal architecture. Keeping this in mind, in the first case study, we considered the problem of generating an optimal metaheuristic algorithm for training an FNN; i.e., for computing the optimal network weights and biases. In this case, the objective function to minimize is the Negative Log-Likelihood function [53], defined as follows, This function computes the error between actual values y i and the predictionsŷ i . In this case study, we estimated the weights and biases of the neural network architecture using the new metaheuristics generated by the HH framework to obtain repeatable results and good classification accuracy. Firstly, the IRIS dataset containing three classes of IRIS species (setosa, virginica, and versicolor) is loaded. This dataset is quite balanced, containing 50 samples per class. Besides, each sample is represented by a four-feature vector (sepal length, sepal width, petal length, and petal width). Moreover, Figure 3 shows the analyzed neural network architecture, which was composed of an input layer with four neurons, a hidden layer with 20 neurons using the tanh activation function, and an output layer with three neurons employing the softmax function.
The cost function in (6) was limited to using 30 steps in the HH framework as a computing budget, so we obtained different responses for each step performed. Figure 4 shows the first visualization of the preliminary experiments from the HH searching procedure.
The HH procedure generally tends to achieve a metaheuristic solver with great fitness statistics, represented by a lower dispersion and median of the fitness values. This trend is observed in the upper-right chart, where the violin plot shows the fitness output after running 50 times for each MH candidate. In particular, Figure 4 (upper left) depicts how the first MH candidate evolves over time, presenting the most significant dispersion and high stagnation. The    Figure 4 (lower-left), exhibits how the MH candidate improves its solving procedure but cannot reach a satisfactory and convergent fitness value under the imposed number of iterations. After 26 steps, the last MH candidate performance is significantly better, as shown in Figure 4 (lower-right). In this case, each replica in the 26 th hyper-heuristic step (each MH run) converged to the performance value of 0.146897252. Recall that this performance metric is associated with the combination of statistics: the sum of the median and the interquartile range of the last fitness values generated in several independent runs on the same problem (cf. (4)).  Once a candidate MH is generated, we test its performance by carrying out 50 repetitions with a population of 30 agents and up to 250 iterations. Furthermore, three classical MHs, PSO, CS, and GSA, were selected for comparison purposes. For these MHs, the same values of agents and number of iterations were also assigned. Table 3 displays the statistics for the accuracy achieved from the classical MHs and the one tailored for this case study. In the case of MH * , an accuracy value of 0.9932 was achieved, surpassing those obtained by PSO, CS, and GSA. In addition, when repeating the experiment, the standard deviation presented a value of ±6.6633 × 10 −4 , indicating a low value of the dispersion of the data. With these results, we can ensure that the generated metaheuristic (MH * )  This degree of repeatability obtained is desired for this type of application. To have a clearer view of this test, we can observe Figure 5. This box plot shows the accuracy of data obtained in the different repetitions. MH * presents the same accuracy value in almost all the repeats except for one occasion, which gave a value of 0.98. If we look at the classical MHs, we can see good results for CS and PSO with an average accuracy of 0.89 and 0.94, respectively. Although the accuracy values of this MH are good, they do not guarantee that we will obtain the same result when repeating the experiment. Finally, the algorithm that brought the worst results was GSA which presented an accuracy of 0.73, a positive asymmetry of its data, and a high dispersion of values between 0.6 and 0.98. These results verified what we observed in Figure 4, where the fitness function converges to a value close to zero at step 26.

B. DESIGNING CONTROL SYSTEMS
The second practical engineering problem lies in control theory, the cornerstone of almost all modern technologies. Automating industrial processes requires reliable methodologies to control systems to increase production and perform tasks efficiently. The correct implementation of these controllers indeed ensures desired performance for the automated systems. In general, Proportional Integral and Derivative (PID) controllers are the most widely used in the industry, mainly because they can handle systems with various plants with Multiple Inputs and Multiple Outputs (MIMO) [54]. PID controllers frequently work well in relatively simple industrial processes, but problems may appear when handling complex nonlinear systems. For this reason, optimization methods have become significant in control design, allowing the adaptive tuning of control parameters so that the systems respond efficiently to uncertainties [2]. In this context, various Metaheuristics (MHs) have been implemented for tuning control gains to obtain the desired performance. For example, Particle Swarm Optimization [55], Genetic Algorithm [56], and Cuckoo Search [57] have proven to be great alternatives that outperform traditional control design methods. Although MHs usually give good results, they only sometimes succeed in fitting every type of controller. In addition, inadequate fitting can cause problems in the analysis of stability and repeatability of the results. Hence, one can use a hyper-heuristic procedure for tailoring the MH that best fits as the problem solver to guarantee the stability and repeatability of the estimated solutions for any well-described model.
In the context of this case study, we considered the tuning problem for controlling a simple mechanical system, such as a Mass-Spring-Damper (MSD) arrangement, which is modeled via a second-order differential equation as shown, where m [kg] is the mass, b [kg/s] is the damping coefficient, and k [N/m] is the stiffness. Before continuing, let us consider the numerical solution of an uncontrolled MSD system, with m = 2 kg, b = 0.6 kg/s, and k = 1.2 N/m, while facing an impulsive force f ext = δ(t) N, as depicted in Figure 6. We chose these values only for illustrative purposes. To analyze better this resulting signal, we must consider the characteristic features such as the settling time With this simple simulation, we easily evidence the need to control the dynamic system using a PID controller. So, we implemented a hyper-heuristic approach for finding the optimal metaheuristic algorithm for tuning this controller. Figure 7 presents the overall process we carried out using a signal flow diagram. Using this methodology, we aimed to  achieve significant efficiency and stability for both the MH and the controller. The gains K P , K i , and K d are determined by the metaheuristic automatically designed by the hyperheuristic procedure. Each potential solution for this PID controller is simultaneously tested in the dynamic system to obtain the characteristic output features (i.e., L 1 , T s , E Ts , and E ss ). To do so, we implemented the following cost function, since α, β, γ ∈ R + , such that α + β + γ = 1, are regularization parameters to balance the relevance of each output feature error.
The MSD system is regulated by tuning a PID controller to obtain an overshoot of 5%, a settling time of 3 s, and zero steady state error for a Heaviside step input. Thus, the corresponding MSD model was analyzed with Simulink using the diagram shown in Figure 8.
Once the HH procedure has been implemented, we analyze the cost function defined in (8). This will allow us to optimize the PID controller parameters and thus obtain the requested system characteristics (overshoot, settling time, and steadystate error).
We then proceed to find a custom metaheuristic for this problem. Each generated MH uses a population size of 30 and a maximum of 50 iterations. Besides, a maximum of 30 steps is used during the HH search to achieve the optimal solver. Figure 9 shows a reliable hyper-heuristic tendency to obtain a performing MH solver. First, the upper-right plot of Figure 9 exhibits each step's dispersion profile during the hyper-heuristic search. It is worth noting that the interquartile measure started at step zero with a value of 0.84 and finalized with a remarkable 0.02 at step 23. Figure 9 (upper left) reveals the MH behavior given by the HH framework's first step. As expected, this generated MH presents the most significant dispersion in all the HH evolution. The lower-left plot in the figure displays the MH tailored from step 6, remarking a considerable improvement in its performance. Lastly, Figure 9, lower-right plot, depicts the results evaluation of step 26, corresponding to the tailored MH with the best performance.
Once the steps and replicas in the HH process are completed, the final customized MH for this case study is defined by the sequence {h 141 , h 144 , h 108 } (cf. Table 2), which is detailed as follows: • h 141 -Perturbator: Random Flight with a scale factor of 1.0 and a Lévy distribution; Selector: Probabilistic.
• h 144 -Perturbator: Random Flight with a scale factor of 1.0 and a Uniform distribution; Selector: Metropolis.
• h 108 -Perturbator: Genetic Crossover with coefficients equal to 0.5 and a mating pool factor of 0.4; Selector: Metropolis. Now, concerning the best metaheuristic achieved from this hyper-heuristic search, we test it to verify the performance of the tuned PID controller. Table 4 shows the results in estimating the K P , K I , and K D coefficients of the PID controller. To gather this information, we repeated the experiment 100 times to check the numerical stability and high accuracy of the generated MH. From that, we noticed a very low standard deviation. This behavior guarantees a high 7270 VOLUME 11, 2023   degree of repeatability, which is sought after when tuning this type of controller. In addition, Figure 10 shows the statistical information related to the cost function values. From this information, we notice a low dispersion in all the replicas carried out, i.e., a mean value of 4.7564×10 −5 and a standard deviation of 1.2337 × 10 −9 .
Next, the designed PID controller is applied to the Mass-Spring-Damper system to verify that the controlled response satisfies the requirements. Plus, we implemented PSO and CS to solve the design problem of this case study, employing the same specifications of iterations and population as in the tailored metaheuristic MH * . Figure 11 illustrates the curves corresponding to the behavior achieved by implementing the PID controllers obtained from each metaheuristic algorithm. In this figure, the horizontal red dotted line limits the observed overshoot of the controlled system, reaching a value of 4.9991%. Note that the vertical red dotted line corresponds to the settling time with a value of 2.996 s, representing a zero steady-state error. Moreover, Figure 11 shows a zoomed oval at the results precisely in the overshoot. Here, we quickly observe that the controllers generated by PSO and CS achieve overshoots of 5.1352% and 5.145%, respectively. Although the results obtained by PSO and CS were good, the tailored metaheuristic MH * presents higher precision and accuracy than the classical MHs.

C. MODELING FRACTIONAL THERMAL SYSTEMS
This work's last but not least case study comprises the fractional-based modeling of a non-conventional calorimetric process for electronic thermal management applications.  We used the ordinary model and the prototype device of this non-conventional calorimeter originally reported in [58]. Figure 12 shows a schematic of the non-conventional calorimeter, and the model corresponding to the temperature of the working fluid is given by Regard that a DUT can be a prototype electronic device or a simple breadboard circuit, among others. It is easy to notice that the model in (9) is an idealistic approach for a practical device with a multi-physics behaviour. For this reason, Cruz-Duarte et al. developed a sophisticated signal processing algorithm to effectively measure entering the working fluid [58]. Therefore, we propose a different alternative to model this process, starting from this simple model but taking advantage of fractional calculus. This area has proven its strength in describing practical and noisy engineering scenarios [59], [60].
Fractional Calculus (FC) is considered a generalization of the well-known integer-order calculus because it comprises derivatives and integrals with non-integer orders. Indeed, these operators go beyond orders with real numbers, such as complex quantities [61]. As we mentioned before, this tool has helped model sophisticated phenomena or systems, for example, those related to thermal processes [62] and ultracapacitor discharging cycles [63]. Different fractionalbased operators have been proposed in the literature. Some of the most popular ones are Caputo-Dzhrbashyan [64], Riemann-Liouville [65], and Grünwald-Letnikov [66]. Each of these operators has different mathematical properties, which one needs to examine and adjust to align with the problem under analysis.
In this work, we used the Caputo-Dzhrbashyan derivative [67], a variant of the Riemann-Liouville operator. This operator is formally defined by where z(t) : R → R is an arbitrary real function and α ∈ [0, 1] is the fractional-order. Expression (10) is used because it adds a versatile transformation kernel, widening the number of applications involving fractional-order differential equations [68]. Keeping the simple model using heat transfer concepts and traditional calculus and the Caputo-Dzhrbashyan derivative, we can easily restate the equivalent fractional-order model such as, To find the solution for this fractional differential equation, we utilized the Mittag-Leffler function, which is considered as an natural extension to the Euler function [69]. The resulting expression, the fractional-order model, is given by since the relative temperature θ(t) = θ T (t) − θ 0 is included to consider variations at room temperature θ 0 , this leads to real temperature θ T (t) estimation.
Lastly, the fractional model in (12) describes the temperature behavior of the working fluid contained in the calorimeter's reservoir during a microelectronic heat power estimation analysis. The problem in this case study is finding the optimal value for the fractional order α to fit the experimental temperature data better and, naturally, describe the calorimeter plant. This is possible via the so-called least squares problem, such as where ⃗ x * stands for the design vector comprising the parameters that describe the calorimeter's model. In our particular case,  the dataset with experimental measurements of the working fluid temperature during the analysis of a particular DUT; T ∋t i corresponds to the temporal stamps associated with temperaturesˆ . These data were achieved using the prototype presented in [58] and the distribution for the temperature sensors depicted in Figure 13. Plus, Figure 14 shows an illustrative example of the raw temperature data captured from a 150 resistor with 10 W power and energized with a 45 V voltage source as the DUT.
It is well-known that the problem stated in (13) is a non-linear regression easy to solve via a soft-computing methodology such as a metaheuristic (MH). The challenge was finding the proper method to solve the low-level problem, which was possible by employing the methodology described in the previous section. To solve the fitting problem, we defined a population of 40 agents performing up to 100 iterations and using up to three search operations per iteration. Figure 15 shows the HH search in tailoring the optimal MH (the upper-right plot), and the other three curves detail particular HH steps from the perspective of the metaheuristic evolution. Likewise the previous case studies, this HH procedure follows the same pattern toward minimizing the mean squared error. As one may notice, the HH evolution is faster in this practical application than in other case studies. For example, we rapidly observe that 30 iterations are enough for reaching a good fitness value in the 10 th step of the hyper-heuristic search; cf. the lower-left plot in Figure 15. Finally, this process obtained the best results at the 17 th step, where the dispersion of the fitness values (mean squared error) is substantially  smaller when compared to the initial candidate metaheuristic. Quantitatively speaking, the interquartile range of the fitness values started at 12.05 in the first HH step and ended at 0.12, with the last step corresponding to the optimal MH for this problem.
Once the steps and replicas in the HH process are completed, the final generated MH is defined by {h 4 , h 148 , h 101 } (cf. Table 2), corresponding to the following search operators: • h 4 -Perturbator: Central Force Dynamic as the perturbator; Selector: Metropolis.
• h 148 -Perturbator: Random Flight with a scale factor of 1.0 and with Gaussian distribution as the perturbator; Selector: Direct.
• h 101 -Perturbator: Genetic Crossover with coefficients equal to 0.5 and a mating pool factor of 0.4; Selector: Metropolis.
The experiment mentioned above was repeated 50 times to guarantee statistical robustness, so we estimated the parameters of the fractional-based calorimetric model summarized in Table 5. For each model's parameter, we present the best, mean, and standard deviation values since the best column corresponds to that configuration rendering the minimal MSE of 0.1903.  Furthermore, Figure 16 displays the comparison between the fractional model achieved by the generated metaheuristic and the traditional model achieved via the Levenberg-Marquart optimization method. We also show the experimental data for contrasting these behaviors. We obtained the traditional model by solving a first-order differential equation, whose solution is a trivial exponential function. Figure 16 shows a close-up of the final part of the graph, where the fractional model can fit the experimental data more accurately.
Lastly, as we did in the previous case studies, we implemented three well-known metaheuristics from the literature to compare their performance against the designed algorithm for this problem. The new generated MHs showed a better performance regarding the number of iterations and minimum error, as shown in Figure 17 (blue line). Also, we detected no stagnation due to the stoppage of this MH. It is noteworthy that the PSO, CS, and GSA were tuned and conditioned according to the literature. Besides, the population size was increased from 30 individuals to 60 for the classical MHs to compare the algorithms satisfactorily. The minimal fitness VOLUME 11, 2023 achieved by the generated MH was 0.1903, while PSO, GSA, and CS reached 0.2301, 0.2542, and 0.2683.

V. CONCLUSION
This paper proposed a novel and practical methodology to automatically design Metaheuristics (MHs) through a Hyper-Heuristic (HH) procedure for solving practical optimization engineering problems. The HH process is powered by the so-called Simulated Annealing algorithm, which searches within the heuristic space for a sequence that, at a lower level, explores the continuous problem domain to find the optimal solution. Therefore, we provided the practitioners with an automatic methodology for finding MHs to deal with real optimization problems. So, due to the nature of these algorithms, one can achieve excellent performance when using an MH tailored for a particular scenario on a similar problem. Even if the solution goodness is limited, such an MH serves as a seed algorithm for further adjustments. The principal innovation of our proposal is that the users require a minimum level of expertise in metaheuristics to find one suitable for their needs.
To illustrate the advantages of the proposed automatic design methodology, we selected three case studies from different fields. Such problems are the training of neural networks for image classification, the design of PID controllers for mechanical systems, and the modeling of a non-conventional calorimetric system via fractional calculus. We proved that the hyper-heuristic methodology is a reliable alternative for designing population-based metaheuristics that solve continuous engineering problems and exhibit a low computational cost. These optimal metaheuristics can be used in similar applications to those they were designed for or be taken as the initial seeds for more sophisticated techniques.
It is evident that this is only an hors d'oeuvre to several astonishing research and innovative applications. So, we expect to expand upon the number of real engineering applications and verify the standardization of the generated algorithms for similar problems. It is necessary to have real engineering problems that serve as benchmark test functions for these generated metaheuristics. In particular, we plan to work on problems related to renewable energy sources, representing an excellent research field with lots of potential for noteworthy contributions. We are moving ahead to study a more autonomous way to explore the heuristic space, an extended heuristic space, which includes the simple heuristics and their tuning parameters. For that purpose, a solid theoretical basis is required.

CONFLICTS OF INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.