CTLBO: Converged teaching–learning–based optimization

Abstract Teaching–learning–based optimization (TLBO) is an algorithm based on the influence of a teacher on the output of learners in a class. This method has been shown to be effective and efficient in finding optimum solutions when compared with other optimization methods. In this paper, an improved version of the TLBO algorithm, called converged teaching–learning–based optimization (CTLBO), is presented. It combines a proposed convergence operator with the teacher phase to find better solutions with a higher convergence rate. The method is tested on some benchmark problems and the results are compared with the original TLBO and other popular evolutionary algorithms. Furthermore, the introduced algorithm is used for optimization of fuzzy tracking control of a walking humanoid robot. Specifically, fuzzy tracking control with appropriate membership functions and error indices is employed as an intelligent approach to control the nonlinear dynamics of a humanoid robot. The sum of the integrals of absolute angle errors and absolute control efforts is regarded as the objective function addressed by both the TLBO and CTLBO algorithms in the present investigation.


PUBLIC INTEREST STATEMENT
Teaching-learning-based optimization (TLBO) is an algorithm based on the influence of a teacher on the learners in a class. The process of TLBO is divided into two parts. The first part is called the "teacher phase", and the second part is named the "learner phase". The teacher phase means learning from the teacher, while the learner phase indicates learning through the interaction between learners. In this paper, in order to improve the performance of the TLBO algorithm, the teacher phase is combined with a novel convergence formula to modify the converging process. The method is tested on some benchmark problems and the results are compared with the original TLBO and other popular evolutionary algorithms. The comparative study indicates that CTLBO is a promising global optimization approach and is superior to the other algorithms considered in terms of accuracy, speed, robustness, and efficiency.

Introduction
Optimization is the process of finding the best solution among the available ones and is used in the design of most engineering and economic systems in order to minimize a defined objective. Traditional techniques often fail to solve optimization problems that have many local optima, and thus there remains a need for efficient and effective optimization techniques. Ongoing research in this field indicates that nature-inspired meta-heuristic optimization methods often perform better than traditional ones on such problems. Moreover, evolutionary algorithms are widely used as modern heuristic minimization methods. Optimization algorithms of this type, such as the genetic algorithm (Back, 1996; Holland, 1975), ant colony optimization (Chen, Xiao, Li, Wang, & Huo, 2018), bee colony optimization (Karaboga & Akay, 2009), particle swarm optimization (Kennedy, 2011), championship sports leagues (Kashan, 2014), the imperialist competitive algorithm (Atashpaz-Gargari & Lucas, 2007), the team game algorithm (Mahmoodabadi, Rasekh, & Zohari, 2018) and the artificial root foraging optimizer (Ma, Zhu, Liu, Tian, & Chen, 2015), are known as meta-heuristic population-based algorithms.
The teaching-learning-based optimization algorithm, which is also an evolutionary one, was originally introduced by R.V. Rao in 2011 for mechanical design optimization problems (Rao, Savsani, & Vakharia, 2011). It was later extended to continuous non-linear large-scale problems (Rao, Savsani, & Vakharia, 2012). In recent years, this algorithm has attracted attention mainly due to its strong ability to find the global optimum point. For instance, in 2013, Satapathy et al. proposed a teaching-learning-based optimization based on orthogonal design for solving global optimization problems and called it OTLBO (Satapathy, Naik, & Parvathi, 2013a). Afterwards, they introduced a weighted teaching-learning-based algorithm and reported its superiority in comparison with other approaches (Satapathy, Naik, & Parvathi, 2013b). In addition, the multi-objective optimization of heat exchangers using a modified TLBO algorithm was proposed by Rao and Patel in 2013 (Rao & Patel, 2013). More details about the multi-objective improved teaching-learning-based optimization algorithm were presented by Rai in 2017 (Rai, 2017).
The two main issues of TLBO are its speed and its accuracy in finding the global optimum point. Hence, the main contribution and motivation of this paper is the improvement of TLBO by increasing the convergence speed, in order to reach better results with higher accuracy in a shorter time. The considered convergence operator, which is applied with a given probability, is added to the teacher phase of the algorithm. In order to assess the effectiveness of the proposed scheme, it was tested on both mathematical test functions and a real-world design problem. The numerical results show that the new algorithm not only performs better than the original TLBO, but is also more accurate than other well-known evolutionary algorithms.

Teaching-learning-based optimization
The process of TLBO is divided into two parts. The first part is called the "teacher phase", and the second part is named the "learner phase". The teacher phase means learning from the teacher, while the learner phase indicates learning through the interaction between learners (Venkata Rao, 2016).

Teacher phase
In this algorithm, the teacher is the learner that has the best level of knowledge. The teacher can only improve the mean performance of the class depending on the class capability. Let $M^K$ be the mean situation of the population at iteration $K$, and let $X^K_{teacher}$ be the teacher situation, which tries to move $M^K$ towards its own level. Then, a solution is updated according to the difference between the mean and the teacher situations (Equations (1)-(3)), where $TF$ is the teaching factor and $r^K$ denotes a vector of random numbers in the range [0,1]. The value of $TF$ can be either 1 or 2.
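The teacher phase described above can be sketched in Python as follows. This is a minimal sketch, assuming the update commonly stated for the original TLBO, X_new = X_old + r (X_teacher - TF * M), for a minimization problem. The function name `teacher_phase`, the per-learner random draw of TF, and the greedy acceptance of a candidate only when it improves the objective are assumptions of this sketch, not details confirmed by the text.

```python
import random


def teacher_phase(population, objective, rng=random):
    """One teacher phase over the whole class (minimization, assumed form)."""
    dim = len(population[0])
    costs = [objective(x) for x in population]
    # The teacher is the learner with the best (lowest) objective value.
    teacher = population[costs.index(min(costs))]
    # Mean situation of the class, one value per design variable.
    mean = [sum(x[d] for x in population) / len(population) for d in range(dim)]

    new_population = []
    for x, cost in zip(population, costs):
        tf = rng.randint(1, 2)  # teaching factor: either 1 or 2
        candidate = [
            x[d] + rng.random() * (teacher[d] - tf * mean[d]) for d in range(dim)
        ]
        # Assumed greedy acceptance: keep the candidate only if it is better.
        if objective(candidate) < cost:
            new_population.append(candidate)
        else:
            new_population.append(list(x))
    return new_population
```

With greedy acceptance, no learner ends the phase with a worse objective value than it started with, which makes the sketch easy to check on a simple function such as the sphere.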

Learner phase
In this phase, the learners increase their knowledge through interaction among themselves. A learner learns something new if another learner has more knowledge than it. In mathematical terms, for any $X_i$, a learner $X_j$ is randomly selected ($i \neq j$), and $X_i$ is updated according to which of the two learners has the better objective value.
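A minimal Python sketch of the learner phase follows, assuming the rule usually given for the original TLBO: a learner moves away from a randomly chosen partner when it is itself the better of the two, and towards the partner otherwise. The function name `learner_phase` and the greedy acceptance step are assumptions of this sketch.

```python
import random


def learner_phase(population, objective, rng=random):
    """One learner phase over the whole class (minimization, assumed form)."""
    n = len(population)
    new_population = []
    for i, x_i in enumerate(population):
        # Pick a partner j different from i.
        j = rng.choice([k for k in range(n) if k != i])
        x_j = population[j]
        f_i, f_j = objective(x_i), objective(x_j)
        # If X_i is better, step along (X_i - X_j); otherwise along (X_j - X_i).
        sign = 1.0 if f_i < f_j else -1.0
        candidate = [a + rng.random() * sign * (a - b) for a, b in zip(x_i, x_j)]
        # Assumed greedy acceptance: keep the candidate only if it is better.
        if objective(candidate) < f_i:
            new_population.append(candidate)
        else:
            new_population.append(list(x_i))
    return new_population
```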

CTLBO
In order to improve the performance of the TLBO algorithm, the teacher phase is combined with a novel convergence formula to modify the converging process. Let $p \in [0,1]$ be the convergence probability and $q \in [0,1]$ be a random number; if $p < q$, then a convergence operator is applied to generate the new situation from the old one. In this operator, $C$ is the social learning factor, inspired by the PSO algorithm (Mahmoodabadi & Ziaei, 2019), and represents the attraction of a learner towards the success of the class (the teacher). With a small value of this parameter, learners are allowed to move around their personal position, while a large value helps learners to converge to the best solution of the class. Previous research on the PSO algorithm suggests that the best solutions are obtained when $C$ is linearly increased over the iterations from 0.25 to 1.25 (Mahmoodabadi & Bisheban, 2014). In addition, for simplicity, the value of the convergence probability is set at $p = 0.5$. If the convergence probability condition is not satisfied, then the original teacher phase formulation, i.e. Equations (1)-(3), is used for each student. A flowchart of the proposed algorithm is illustrated in Figure 1. Besides, Figure 2 compares the population distributions of TLBO and CTLBO for the sphere test function (Table 1) with 50 students in three running phases. This simple investigation exhibits the higher convergence speed of CTLBO and its ability to adapt to a time-varying environment.
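The switching logic described above can be sketched in Python as follows. The exact form of the convergence operator is not given in the text here, so this sketch assumes a PSO-style social term, X_new = X_old + C r (X_teacher - X_old); that formula, the names `social_factor` and `ctlbo_teacher_phase`, and the greedy acceptance step are all assumptions for illustration only. The linear schedule for C (0.25 to 1.25), the test `p < q`, and the value p = 0.5 follow the description.

```python
import random


def social_factor(iteration, max_iterations, c_start=0.25, c_end=1.25):
    """Linearly increase C from c_start to c_end over the iterations."""
    if max_iterations <= 1:
        return c_end
    return c_start + (c_end - c_start) * iteration / (max_iterations - 1)


def ctlbo_teacher_phase(population, objective, iteration, max_iterations,
                        p=0.5, rng=random):
    """Teacher phase with a convergence operator (assumed PSO-style form)."""
    dim = len(population[0])
    costs = [objective(x) for x in population]
    teacher = population[costs.index(min(costs))]
    mean = [sum(x[d] for x in population) / len(population) for d in range(dim)]
    c = social_factor(iteration, max_iterations)

    new_population = []
    for x, cost in zip(population, costs):
        q = rng.random()
        if p < q:
            # Assumed convergence operator: attraction towards the teacher.
            candidate = [
                x[d] + c * rng.random() * (teacher[d] - x[d]) for d in range(dim)
            ]
        else:
            # Original teacher phase (assumed standard TLBO form).
            tf = rng.randint(1, 2)
            candidate = [
                x[d] + rng.random() * (teacher[d] - tf * mean[d]) for d in range(dim)
            ]
        # Assumed greedy acceptance: keep the candidate only if it is better.
        new_population.append(candidate if objective(candidate) < cost else list(x))
    return new_population
```

Calling `ctlbo_teacher_phase` once per iteration, followed by a learner phase, would give one possible main loop; the population size and iteration count are left to the caller.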

Comparison regarding mathematical test functions
In order to evaluate the accuracy and convergence speed of the proposed algorithm, several test functions with different characteristics are employed (Table 1). At first, the performance of the proposed algorithm is assessed with respect to solution accuracy through comparison with the harmony search algorithm (HSA) (Geem, Kim, & Loganathan, 2001), ant colony optimization (ACO) (Dorigo, 1992), artificial bee colony optimization (ABC) (Karaboga, 2010) and teaching-learning-based optimization (TLBO) (Rao et al., 2012). These assessments are carried out under the same conditions, such as the number of function evaluations, population size and dimensions. In this experiment, the dimensions (D) of the test functions are taken as 5, 10, 30, 50 and 100, with a maximum of 100,000 function evaluations. The results are presented in Table 2 in terms of the mean and the standard deviation (SD) (Equations (7) and (8), respectively) of the solution errors obtained in 30 ($n = 30$) independent runs by each algorithm.
These simulations demonstrate that the CTLBO achieves the global optimum for the sphere, quadric, step, Rastrigin and Griewank functions. Although TLBO outperforms CTLBO and the other algorithms on the Rosenbrock function for dimensions 5, 10 and 30, its solutions for the higher dimensions (50 and 100) are worse than those of the CTLBO. The ability to avoid being trapped in local optima and to achieve global optimal solutions suggests that the CTLBO can indeed benefit from the convergence operator.
Additionally, Figure 3 compares the evolutionary processes for four different test functions with respect to convergence characteristics. Generally speaking, these graphs reveal that CTLBO converges much faster than TLBO on the test functions.
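As an illustration of the evaluation protocol, the following Python sketch defines four of the named benchmarks in their usual textbook forms and summarizes a list of per-run solution errors by mean and SD. Table 1 is not reproduced here, so the function definitions are the standard ones rather than confirmed copies of the paper's, and the use of the sample standard deviation (divisor n - 1) is an assumption; the helper name `summarize_errors` is hypothetical.

```python
import math
import statistics


def sphere(x):
    return sum(v * v for v in x)


def rosenbrock(x):
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
        for i in range(len(x) - 1)
    )


def rastrigin(x):
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def griewank(x):
    quadratic = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return quadratic - product + 1.0


def summarize_errors(errors):
    """Return (mean, sample SD) of the solution errors over independent runs."""
    return statistics.mean(errors), statistics.stdev(errors)
```

Each function has a global minimum of zero (at the origin, or at the all-ones point for Rosenbrock), so the solution error of a run is simply the best objective value it found.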
Table 2. Comparison of results in terms of the mean and the standard deviation for HSA (Geem et al., 2001), ACO (Dorigo, 1992), ABC (Karaboga, 2010), TLBO (Rao et al., 2012) and CTLBO

Optimum design of the fuzzy controller

Dynamics and modeling of the humanoid robot
The humanoid robot, walking in the vertical plane, is simulated by means of a three-link model as shown schematically in Figure 4 (Mahmoodabadi, Taherkhorsandi, & Bagheri, 2014). The first link is considered the stance leg on the ground; the second link represents the head, arms, and trunk; while the third link refers to the swing leg. These links move freely in the vertical plane, and their parameters are given in Table 3, which is the anthropometric table for a humanoid robot with a height of 171 cm and a weight of 74 kg (Mahmoodabadi et al., 2014).
The Newton-Euler approach is used to obtain the dynamical equations of the model (Mahmoodabadi et al., 2014). Moreover, $\theta_1$, $\theta_2$ and $\theta_3$ are the angles between the first, second and third links and their assumed vertical lines, respectively.

Fuzzy tracking control of the humanoid robot
The proposed fuzzy tracking control is based upon a closed-loop fuzzy system. The state variable vector is chosen as $x = [x_1, x_2, x_3, x_4, x_5, x_6]$, and the errors are defined with respect to the desired state values $x_i^d$ ($i = 1, 2, \ldots, 6$). New error index parameters are introduced as the inputs of the fuzzy system. Furthermore, the constructed rules are given in Table 4 ($y_i$), and the considered membership functions are shown in Figure 5 ($\mu_{A_i^l}$). The inference result $f_i$ is calculated through the product-sum-gravity method (Mahmoodabadi et al., 2014), in which $M$ stands for the number of rules. Finally, the control efforts $u_1$, $u_2$ and $u_3$ are obtained from the inference results using the weighting constants $w_1$, $w_2$, $w_3$, $w_4$, $w_5$ and $w_6$, which are usually identified by a trial-and-error process. An appropriate alternative for choosing these factors is to employ optimization algorithms such as TLBO.
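To illustrate the inference step, the following Python sketch evaluates one rule module with a weighted-average form of product-sum-gravity inference, f = sum(mu_l(E) * y_l) / sum(mu_l(E)), taken over the M rules of the module. The triangular membership functions, their breakpoints, the consequent values and the names `triangular` and `infer` are hypothetical; the actual shapes in Figure 5 and the rules in Table 4 are not reproduced here.

```python
def triangular(x, left, center, right):
    """Triangular membership function with its peak at `center`."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def infer(error_index, rules):
    """Weighted-average inference over (membership_function, consequent) rules."""
    numerator = 0.0
    denominator = 0.0
    for membership, consequent in rules:
        grade = membership(error_index)
        numerator += grade * consequent
        denominator += grade
    # If no rule fires (input outside all supports), return zero output.
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical three-rule module: negative, zero, positive.
example_rules = [
    (lambda e: triangular(e, -2.0, -1.0, 0.0), -1.0),
    (lambda e: triangular(e, -1.0, 0.0, 1.0), 0.0),
    (lambda e: triangular(e, 0.0, 1.0, 2.0), 1.0),
]
```

For an error index of 0.5, the "zero" and "positive" rules each fire with grade 0.5 in this hypothetical module, so `infer(0.5, example_rules)` returns 0.5.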

Table 4. Rule modules for the input items (antecedent variable $E_i$)

Comparison of the optimal control problem solutions
In order to challenge the performance of the introduced controller, the desired trajectories of the joint angles are obtained using third-degree polynomials as follows:
$x_1^d = -0.0165t^3 + 0.1246t^2 - 0.2711t - 0.0083$ (13)
where $t$ signifies the time. These desired trajectories cause the zero-moment point to move into the stability polygon, and thus provide stability for the robot. In this paper, the sum of the integrals of absolute angle errors and absolute control efforts is regarded as the objective function to be minimized.
In addition, the vector $[w_1, w_2, w_3, w_4, w_5, w_6]$ contains the design parameters, which are conventionally obtained by a trial-and-error process; they are all positive constants and lie in the general region $0 < w_i < 1000$ ($i = 1, 2, \ldots, 6$). The sum of the integrals of absolute angle errors and absolute control efforts is a function of this vector's components. For the present optimization problem, the population size is set at 20, and the maximum number of iterations is fixed at 50.
The optimum objective functions and the corresponding design variables acquired by TLBO and CTLBO are given in Table 5. The corresponding joint angles and velocities are illustrated in Figure 6, while Figure 7 shows the related errors for the optimum design variables obtained by the CTLBO.
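The desired trajectory of Equation (13) and the objective function can be sketched in Python as follows. The sketch assumes that the angle errors and control efforts are available as uniformly sampled time histories from a simulation, that the integrals are approximated with the trapezoidal rule, and that all terms are summed with equal weight. The names `desired_x1`, `integral_abs` and `objective` are hypothetical.

```python
def desired_x1(t):
    """Desired trajectory of the first state, Equation (13)."""
    return -0.0165 * t**3 + 0.1246 * t**2 - 0.2711 * t - 0.0083


def integral_abs(samples, dt):
    """Trapezoidal approximation of the integral of |samples| (uniform step dt)."""
    magnitudes = [abs(s) for s in samples]
    return dt * (sum(magnitudes) - 0.5 * (magnitudes[0] + magnitudes[-1]))


def objective(angle_errors, control_efforts, dt):
    """Sum of integrals of absolute angle errors and absolute control efforts.

    Both arguments are lists of time histories (one list of samples per
    joint), each with at least two samples.
    """
    error_term = sum(integral_abs(history, dt) for history in angle_errors)
    effort_term = sum(integral_abs(history, dt) for history in control_efforts)
    return error_term + effort_term
```

In an optimization run, each candidate weight vector would be passed to a closed-loop simulation that produces these time histories, and the returned value would serve as the cost to be minimized.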

Conclusion
In this study, TLBO has been extended to CTLBO by utilizing two processes. The first has been implemented to increase the convergence speed, and the second has been applied to avoid being trapped in local optima. The CTLBO has been comprehensively evaluated on several well-known benchmark functions, and the results have been compared with those of other recently introduced optimization algorithms. This analysis has demonstrated the feasibility and efficiency of the proposed strategies in terms of convergence speed, global optimality, solution accuracy and algorithm reliability. Furthermore, as a real application, the considered algorithms have been utilized for the optimum design of a fuzzy controller for a humanoid robot. The numerical results illustrate the feasibility and efficiency of the CTLBO in comparison with TLBO for a complicated nonlinear problem.

Figure 2. Comparison of the position of students in three stages for the sphere test function, obtained by TLBO and CTLBO algorithms.

Figure 3. Convergence performance of TLBO and CTLBO on the test functions.

Figure 4. Parameters of the humanoid robot based upon the anthropometric table.

Figure 5. Membership function for the fuzzy control of the humanoid robot.

Figure 6. Desired and tracking trajectories of the joint angles and velocities for the optimal design variables obtained by CTLBO.

Figure 7. Related errors for the optimum design variables obtained by CTLBO.

© 2019 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.

Cogent Engineering (ISSN: 2331-1916) is published by Cogent OA, part of Taylor & Francis Group.

Table 3. Anthropometric parameters of the humanoid robot model

Table 5. Objective functions and design variables acquired by TLBO and CTLBO