Ant Colony Optimization based on Pareto optimality: application to a congested router controlled by PID regulation

ABSTRACT This work aims to stabilize TCP (transmission control protocol) network traffic and the queue of a congested router by designing a suitable Active Queue Management scheme. The problem is treated within control theory, using a PID (proportional–integral–derivative) controller tuned through an extension of the Hermite–Biehler theorem to quasi-polynomials. The tuning approach relies on Hurwitz stability: it provides a necessary and sufficient condition for the roots of the quasi-polynomial to lie in the left half plane. Since this stabilization method yields a set of admissible values for the parameters ‘P’, ‘I’ and ‘D’, it is relevant to optimize the closed-loop performance within this stability region. To this end, a multi-criterion Ant Colony Optimization based on Pareto optimality is used, given the conflicting nature of the closed-loop performance parameters. The set of optimal solutions is obtained by determining the Pareto front of the objective functions. The effectiveness of the proposed control scheme is evaluated via a series of numerical simulations in MATLAB and SIMULINK. The results are compared with those of the genetic algorithm and Ziegler–Nichols methods.


Introduction
The current era is experiencing an exponential increase in the size and diversity of communication networks. This digital revolution is therefore a challenge for today's communities and has become a fertile field for researchers who, in recent decades, have been increasingly motivated to design new, more reliable and more powerful computer network structures.
The underlying protocols must keep pace with the scale of current networks. The transmission control protocol (TCP)/Internet protocol (IP) suite, which is the standard for network communication, is used on the Internet to let machines around the world communicate. In this suite, TCP works together with IP so that packets are neither lost nor duplicated and so that each packet is confirmed to have reached its destination. However, when traffic increases, the network may become congested, a phenomenon quite common in routing. Routers then cannot cope and lose packets because there is no room to store them. Increasing the queue size further worsens congestion, since the time taken by a packet to reach the head of the queue becomes too great. In addition to queue saturation, these factors contribute to an overall slowdown of routers (Chiu & Jain, 1989; Ismail, El-Sayed, Elsaghir, & Morsi, 2014; Jacobson, 1988; Low, Paganini, & Doyle, 2002; Ohsaki, Sugiyama, & Imase, 2009). In order to establish a quantitative analysis of congestion, the behaviour of TCP was modelled by analogy with fluid flow, which led to a system of delay differential equations. Previous research has shown that this system can be approached from the point of view of control theory (Misra, Gong, & Towsley, 1999). Time delays such as those resulting from congestion in Internet traffic management give rise to characteristic functions called quasi-polynomials, which were first studied by Pontryagin (1955).
Pontryagin constructed a formal and highly relevant mathematical tool for analysing the stability of linear time-invariant (LTI) systems with delay, by formulating a necessary and sufficient condition for the roots of a quasi-polynomial to have negative real parts, that is to say, a Hurwitz stability condition for quasi-polynomials expressed as a root-interlacing property.
CONTACT Samira Chebli samira.chebli@gmail.com
The difficulty in stabilizing delayed systems is that the characteristic equations of their models have an infinite number of roots, in contrast to those of systems without delay. The study of delayed systems is therefore significantly more complex. The work of Bhattacharyya (2001, 2005) offers a solution to this difficulty, which consists in adopting an extension of the Hermite-Biehler theorem applied to quasi-polynomials. This theorem is used to stabilize the TCP model by means of a proportional-integral-derivative (PID) controller (Chebli, Elakkary, & Sefiani, 2017a, 2017b; Chebli, Elakkary, Sefiani, & Elalami, 2015).
The PID controller is used in more than 90% of industrial processes according to a survey conducted by the Japan Electric Measuring Instrument Manufacturers Association in 1989. This popularity is due to the combination of its characteristics, which together provide stable and near-optimal regulation. Its three actions, namely proportional (P), integral (I) and derivative (D), complement each other to ensure an immediate, and therefore rapid, correction of any deviation of the controlled quantity. They also compensate for large system inertia, eliminate the residual steady-state error, accelerate the system response and improve loop stability. The fact that the PID controller can be combined with transmitters and logic makes it a good candidate for regulating the transport of information, as in TCP/IP networks.
The problem described in this manuscript can be reduced to Pareto optimality or a multi-objective optimization problem (Censor, 1977;Pardalos, Migdalas, & Pitsoulis, 2008), since the purpose of this study is to act on rise time, settling time and overshoot rate of the step response of the closed-loop system, in order to minimize them. These dynamic performance criteria of the system are generally conflicting which hinders a simultaneous optimization of each objective.
In order to solve the optimization problem of the Active Queue Management (AQM) control system, we adopt a multi-objective Ant Colony Optimization (MOACO), a meta-heuristic technique based on the Pareto optimality concept (Colorni, Dorigo, & Maniezzo, 1992; Dorigo & Gambardella, 1997; Dorigo, Maniezzo, & Colorni, 1996). The Ant Colony Optimization (ACO) algorithm supports reliable data transmission. The way in which this algorithm searches for an optimal solution, building up a shortest path that is subsequently reused, makes it a natural choice for routing problems.
The use of ACO for routing in TCP networks has already been discussed (Khanpara, Valiveti, & Kotecha, 2010; Rathore & Khan, 2017; Ring, Munirajan, & Cole, 2004), but without taking the congestion problem into account. The contribution of this paper is to optimize the flow of packets in a congested router by means of a PID controller tuned by the MOACO technique.
We then carry out, via an illustrative example, a MATLAB simulation of the performance of the applied method (MOACO) for the PID-AQM control system.

Problem statement
The Internet consists of interconnected, highly heterogeneous, complex and dynamic modules. Its dominant transport protocol is TCP, which adopts end-to-end window-based flow control to avoid congestion.
The main property of TCP is to guarantee reliable communication on the Internet. Congestion occurs when a link or node carries so much data that its quality of service deteriorates. To correct the problems of tail-drop buffer management, Floyd and Jacobson introduced the concept of AQM, in which the routing nodes use a more sophisticated algorithm called RED (random early detection) to manage their queues (Chiu & Jain, 1989). Other algorithms, such as BLUE, REM or AVQ, were later proposed and introduced into Internet routers (Low et al., 2002; Ohsaki et al., 2009).
We design a congestion control mechanism based on AQM techniques by using control theory. To this end, we introduce a mathematical model of the TCP behaviour which was developed by Misra et al. (1999, 2000) and simplified by Hollot et al. (2001).
This dynamic model of TCP/AQM is based on fluid-flow and stochastic differential equations, and ignores the TCP timeout and slow-start mechanisms. The model used for the study is given by the following two coupled non-linear differential equations:
$$\dot{w}(t) = \frac{1}{R(t)} - \frac{w(t)\,w(t-R(t))}{2R(t-R(t))}\,p(t-R(t)) \qquad (1)$$
$$\dot{q}(t) = \frac{w(t)}{R(t)}\,N(t) - C \qquad (2)$$
where $w(t)$ denotes the average TCP window size (packets), $q(t)$ the average queue length (packets), $R(t) = q(t)/C + T_p$ the round-trip time (s), $T_p$ the propagation delay, $C$ the router's transmission capacity and $N(t)$ the number of TCP sessions. $\dot{w}(t)$ and $\dot{q}(t)$ denote, respectively, the time derivatives of $w(t)$ and $q(t)$, and $p(t)$ is the probability of packet discard due to the AQM mechanism at the router.
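As a rough illustration of how such a model is used, setting both derivatives to zero gives the equilibrium around which a linearization can be performed. The sketch below assumes the commonly cited form of the fluid-flow model of Misra et al. and Hollot et al., dw/dt = 1/R - w(t) w(t-R) p(t-R) / (2R) and dq/dt = N w/R - C, with constant N and R. The function name and the sample numbers are illustrative and are not taken from the paper.

```python
def operating_point(n_sessions, capacity, rtt):
    """Equilibrium of the TCP/AQM fluid model (assumed form, see text above).

    dq/dt = 0  ->  w0 = rtt * capacity / n_sessions
    dw/dt = 0  ->  w0**2 * p0 = 2
    """
    if min(n_sessions, capacity, rtt) <= 0:
        raise ValueError("N, C and R must be positive")
    w0 = rtt * capacity / n_sessions   # equilibrium window size (packets)
    p0 = 2.0 / w0 ** 2                 # equilibrium drop probability
    return w0, p0


# Illustrative values only (not the paper's simulation settings).
w0, p0 = operating_point(n_sessions=50, capacity=1000.0, rtt=0.1)
print(w0, p0)
```

The two relations show the basic trade-off of the model: for a fixed link capacity and round-trip time, more sessions mean a smaller equilibrium window and hence a higher equilibrium drop probability.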
Most real-world systems are non-linear, which makes the analysis of certain required properties and the design of controllers difficult, if not impossible. For this reason, linearizing such models by means of a suitable approximation is a relevant step. Taking q as the output and p as the input, the linearization of (1) and (2) around the operating point satisfies:
where the operating point is given by:
From (3) and (4), we derive the transfer function from δp to δq:
Considering a negative feedback control system with the AQM being the controller, the system to be controlled is given by:
Lemma 2.1: The plant G in (6) is stable for all positive values of R, C and N (Silva et al., 2001).

Proof:
The poles of the transfer function A lie in the left half of the complex plane for all positive values of the parameters R, C and N; thus A is always stable. We also have: for all positive values of the parameters R, C and N. So, according to the Nyquist stability test, the transfer function $\frac{1}{1 + A(s)Rse^{-Rs}}$ is stable for all positive values of R, C and N. We can thus conclude that the plant G defined in (8) is stable for all positive values of R, C and N.
The linearization around the operating point gives a second-order model with delay, described by the following transfer function (Chebli et al., 2015; Hollot et al., 2001; Misra et al., 1999):
Generally, an industrial process is modelled as a first-order system with delay, which is comparatively easy to analyse:
$$G(s) = \frac{K}{1 + Ts}\,e^{-Ls} \qquad (13)$$
where K, T and L represent, respectively, the steady-state gain, the time constant and the time delay of the plant. These three parameters are assumed to be positive. We calculate the first and second derivatives of the transfer functions (12) and (13) at s = 0:
Let:
So we have:
then, we obtain:
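To make the first-order approximation with delay more concrete, its unit-step response has a simple closed form: the output stays at zero for L seconds and then rises exponentially towards K with time constant T. The sketch below assumes the usual form G(s) = K e^{-Ls} / (1 + Ts); the function name and the sample plant values are hypothetical.

```python
import math


def fopdt_step(t, gain, time_const, delay):
    """Unit-step response of G(s) = gain * exp(-delay*s) / (1 + time_const*s)."""
    if t < delay:
        return 0.0  # nothing reaches the output before the dead time has elapsed
    return gain * (1.0 - math.exp(-(t - delay) / time_const))


# Sample the response of a hypothetical plant with K = 2, T = 1, L = 0.5.
samples = [round(fopdt_step(0.25 * k, 2.0, 1.0, 0.5), 3) for k in range(9)]
print(samples)
```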

PID controller approach
The PID controller is the most widely used controller in industry, covering more than 90% of industrial control needs. This predominance is due to its simplicity and efficiency. The system introduced in the previous section is a delay system. The delay both reflects a physical reality and makes it possible to describe part of the system dynamics more simply than with a higher-order model. Analysing the stability of delayed systems is not a trivial task: the complexity comes from the infinite number of roots of the characteristic equation of the system, which is a quasi-polynomial.
Recently, an extension of the Hermite-Biehler theorem applied to quasi-polynomials, adopted by Silva et al. (2001, 2005), has helped to solve this problem. We therefore chose it as the basis of the approach used in this article to tune the PID controller. It is an analytical approach for finding Hurwitz-stable PID gains. Based on the root-interlacing property, this extension of the Hermite-Biehler theorem essentially consists in characterizing the roots of the quasi-polynomial that have negative real parts. It should be noted that, in this study, the exponential term in the characteristic equation of the system has not been approximated, which preserves a number of the system's properties. In this manuscript, we address the determination of the whole stability region of the PID gains for the first-order system with delay. This step is essential for any design and tuning of PID controllers. Numerous research studies dedicated to the PID regulator have proposed several configurations for it. The most common is the closed-loop single input single output (SISO) configuration (Milosawlewitsch-Aliaga, Osornio-Rios, & Romero-Troncoso, 2010), which is shown in Figure 1.
Here $y_c$, $e$, $u$ and $y$ are, respectively, the desired, error, command and output signals, $C(s)$ is the PID controller and $G(s)$ is the plant to be controlled.
The control law of the PID regulator can be written in the Laplace domain as:
$$C(s) = K_p + \frac{K_i}{s} + K_d s$$
where $K_p$, $K_i$ and $K_d$ are, respectively, the proportional, integral and derivative gains of the PID controller.
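For orientation, the closed-loop characteristic equation can be worked out under two assumptions: the PID law is taken in its usual parallel form $C(s) = K_p + K_i/s + K_d s$, and the plant is the first-order model with delay $G(s) = K e^{-Ls}/(1+Ts)$. The following is a sketch of that derivation, not a reproduction of the paper's own equations.

```latex
\begin{align*}
1 + C(s)G(s) &= 1 + \left(K_p + \frac{K_i}{s} + K_d s\right)\frac{K e^{-Ls}}{1+Ts} = 0 \\
\Longrightarrow\quad \delta(s) &= s(1+Ts) + K\left(K_d s^2 + K_p s + K_i\right)e^{-Ls} = 0 \\
\delta^{*}(s) &= e^{Ls}\,\delta(s) = s(1+Ts)\,e^{Ls} + K\left(K_d s^2 + K_p s + K_i\right)
\end{align*}
```

Because $e^{Ls}$ has no finite zeros, $\delta(s)$ and $\delta^{*}(s)$ share the same roots, so requiring all roots of the quasi-polynomial $\delta^{*}(s)$ to have negative real parts is equivalent to requiring closed-loop stability.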

Ant Colony Optimization
The optimization of the PID controller parameters is treated here as an NP problem, that is to say, a problem whose exact resolution is assumed to be of exponential complexity, which motivates the use of meta-heuristics.
One of the advantages of meta-heuristics is that they are applicable to a wide range of hard optimization problems without significant or even radical changes to the algorithm used. A meta-heuristic can also refer to a combination of stochastic optimization algorithms and local search. These optimization methods require only a minimum of information about the problem in order to be implemented: all that matters is the criterion or criteria to be optimized (the objective functions).
ACO was inspired by the behaviour of ants in nature when searching for food. To communicate, these social insects use a volatile substance called pheromone, with which they mark their route between the nest and the food.
As illustrated in Figure 2, the ants that take the least time to travel back and forth between the nest and the food are the ones that have taken the shortest route. This path accumulates a higher concentration of pheromone, is more attractive to ants and therefore has a higher probability of being taken. It is thus reinforced more than the others and is ultimately chosen by the vast majority of ants. From this observation (Dorigo et al., 1996), ACO was formalized as a meta-heuristic optimization method. The Travelling Salesman Problem was the subject of the first implementation of ACO, the Ant System (AS) (Colorni et al., 1992). The basic algorithm is divided into three essential phases.

Initialization
The problem is to find the shortest Hamiltonian cycle in a graph, where each vertex of the graph represents a city. The distance between cities i and j is denoted by $d_{ij}$, and the pair (i, j) represents the edge between these two cities. We first initialize the quantity of pheromone on every edge to $\tau_{init}$. Each ant then traverses the graph and constructs a complete path (a solution).

Constructing ant solution
At each stage of the construction of a solution, the ant must decide which vertex it will move to. This decision is taken in a probabilistic way, based on the pheromone values and on heuristic information that helps in finding a good solution.
The probability that an ant k moves from vertex i to vertex j, where j belongs to the set $S_i^k$ of vertices not yet visited by ant k, is (Abolhasan, Wysocki, & Dutkiewicz, 2004)
$$p_{ij}^{k} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{l \in S_i^k} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}} \quad \text{if } j \in S_i^k, \qquad p_{ij}^{k} = 0 \quad \text{otherwise.}$$
α and β are two parameters that weight the importance of the pheromone intensity $\tau_{ij}$ and of the heuristic information $\eta_{ij}$, called visibility. The visibility guides the ants towards nearby cities and away from those that are too far ($\eta_{ij} = 1/d_{ij}$). For α = 0, only the visibility is taken into account, that is to say, the choice falls each time on the nearest city. If β = 0, only the pheromone trails influence the choice. To avoid selecting a path too quickly, a suitable compromise between these two parameters is required. Artificial ants have a form of memory, which is one of the axes around which meta-heuristics are built. It helps the algorithm converge towards global optima while avoiding stagnation in local optima, which prevents premature convergence and still lets the algorithm explore the search space. Nevertheless, it is imperative to find a compromise between exploring the search space and exploiting the paths already taken: too much exploration prevents the algorithm from converging, while too much exploitation leads to convergence towards a local optimum.
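The transition rule can be sketched in a few lines. The snippet below is a minimal illustration of the Ant System rule in which the probability of moving to an unvisited city is proportional to tau**alpha * eta**beta. The function name, the data layout (nested lists) and the toy distances are assumptions made for the example.

```python
def transition_probabilities(current, unvisited, tau, eta, alpha, beta):
    """Ant System rule: p_ij is proportional to tau_ij**alpha * eta_ij**beta."""
    weights = {j: (tau[current][j] ** alpha) * (eta[current][j] ** beta)
               for j in unvisited}
    total = sum(weights.values())
    if total <= 0.0:
        raise ValueError("no reachable unvisited city")
    return {j: w / total for j, w in weights.items()}


# Toy 3-city example: equal pheromone, city 1 is twice as close to city 0 as city 2.
dist = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.5], [2.0, 1.5, 0.0]]
eta = [[0.0 if d == 0 else 1.0 / d for d in row] for row in dist]  # visibility
tau = [[1.0] * 3 for _ in range(3)]                                # pheromone
probs = transition_probabilities(0, [1, 2], tau, eta, alpha=1.0, beta=1.0)
print(probs)
```

With equal pheromone on every edge, the nearer city is favoured; setting beta to 0 removes the influence of distance and makes both moves equally likely.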

Updating pheromone
When all the ants have constructed a solution, each ant k deposits an amount of pheromone $\Delta\tau_{ij}^{k}$ on its path. For any iteration t, if the edge (i, j) belongs to the tour of ant k, the quantity of pheromone deposited on this edge is
$$\Delta\tau_{ij}^{k}(t) = \frac{Q}{L^{k}(t)}$$
(and zero otherwise), where $L^{k}(t)$ is the total length of the tour of ant k and Q is a constant. The amount of pheromone added therefore depends on the quality of the solution obtained: the shorter the tour, the greater the amount of pheromone deposited.
In order not to discard all the poorer solutions obtained, and thus to avoid convergence towards local optima of poor quality, the evaporation of pheromone trails is simulated through a parameter ρ, called the evaporation rate (0 < ρ < 1), as follows:
$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t)$$
where $\Delta\tau_{ij}(t) = \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t)$, t represents a given iteration and m the number of ants.
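The deposit and evaporation steps can be combined into a single update, sketched below under the assumption that ρ acts as an evaporation rate, so that the old trail is multiplied by (1 - ρ), and that the graph is symmetric. The function and variable names are hypothetical.

```python
def update_pheromone(tau, tours, lengths, rho, q=1.0):
    """One Ant System update: evaporate, then add Q / L_k on each edge used by ant k."""
    n = len(tau)
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]
    for tour, length in zip(tours, lengths):
        # Close the cycle: the last city links back to the first one.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            new_tau[a][b] += q / length
            new_tau[b][a] += q / length  # symmetric problem assumed
    return new_tau


# One ant on a 4-city graph, tour 0 -> 1 -> 2 -> 3 -> 0 of length 4.
tau0 = [[1.0] * 4 for _ in range(4)]
tau1 = update_pheromone(tau0, tours=[[0, 1, 2, 3]], lengths=[4.0], rho=0.5)
print(tau1[0][1], tau1[0][2])
```

Edges on the tour end up with more pheromone than the unused edge (0, 2), which only evaporates.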
Mavrovouniotis and Yang (2014) explain the impact of the pheromone evaporation rate on the performance of ACO algorithms. If the environment is static or changes only slightly, a small pheromone evaporation rate is the best choice for optimal performance. In the opposite case, where the environment is quite dynamic, a large value of the evaporation rate is necessary. In the following, we assume that the environment is quite static.
Subsequently, many variants have been proposed (Dorigo & Gambardella, 1995; Dorigo et al., 1996; Stützle & Hoos, 2000). The first is characterized by the introduction of elitist ants. In this approach, the ant that has found the shortest route deposits a larger amount of pheromone in order to increase the likelihood that other ants explore the most promising solution.

MOACO based on Pareto optimality
Once the gain stability region of the PID controller is determined, it will be pertinent to search within this region for the optimal values of these parameters. However, the problem to optimize in this manuscript is not reduced to a single cost function, but to several.
The dynamic performance of a closed-loop system is generally assessed in terms of stability, precision, sensitivity and transient response, to name but a few. Recent control systems require more sophisticated performance criteria. This second category of criteria takes into account simultaneously the error and the time at which it occurs. The importance of this approach has been amply demonstrated in the literature. The error signal is commonly expressed as
$$e(t) = y_c(t) - y(t)$$
To cover broad aspects of the system dynamics, we consider a performance index that is very useful in designing linear control systems (Krohling & Rey, 2001; Sahib & Ahmed, 2016), namely the ITAE (Integral of Time-weighted Absolute Error) criterion. It is generally considered to be more selective than other performance indices: it penalizes long-duration transients and weights the error that persists in the system more heavily than the initial error:
$$\mathrm{ITAE} = \int_{0}^{\infty} t\,|e(t)|\,dt$$
The analysis of the dynamic performance of the system amounts to analysing its transient response. The latter is characterized by several parameters, including:
• Settling time: although, in theory, linear systems have transient regimes of infinite duration, their practical duration can be estimated through the settling time, defined as the time taken by the output to reach its final value to within 5%.
• Rise time: defined as the time at which the output signal crosses its asymptote for the first time. This parameter is retained in our case because the objective is to reduce the delay due to congestion and to reach the final value of the output signal quickly; the rise time corresponds to the moment at which this objective is reached. The rise time is, therefore, a relevant parameter for quantifying the speed of a closed-loop system, which is the ultimate objective of our study.
• Overshoot rate: a minimal or even zero value of this performance criterion guarantees a rapid damping of the stable system.
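The criteria listed above can be estimated numerically from a sampled step response. The sketch below follows the definitions given in the text (first crossing of the final value for the rise time, a 5% band for the settling time) and approximates the ITAE integral with the trapezoidal rule over the recorded window. It assumes a positive final value. The function name and the second-order test signal are hypothetical and are not the paper's AQM loop.

```python
import math


def step_metrics(t, y, y_final=1.0, band=0.05):
    """Return (rise_time, settling_time, overshoot_percent, itae) for a sampled step response."""
    err = [abs(y_final - yi) for yi in y]
    # Rise time: first sample at which the output reaches its final value.
    rise = next((ti for ti, yi in zip(t, y) if yi >= y_final), None)
    # Settling time: first sample after the output last leaves the +/- band.
    outside = [k for k, e in enumerate(err) if e > band * abs(y_final)]
    if not outside:
        settle = t[0]
    elif outside[-1] + 1 < len(t):
        settle = t[outside[-1] + 1]
    else:
        settle = None  # still outside the band at the end of the record
    overshoot = max(0.0, (max(y) - y_final) / abs(y_final) * 100.0)
    # ITAE: trapezoidal approximation of the integral of t * |e(t)|.
    itae = sum(0.5 * (t[k] * err[k] + t[k + 1] * err[k + 1]) * (t[k + 1] - t[k])
               for k in range(len(t) - 1))
    return rise, settle, overshoot, itae


# Hypothetical test signal: underdamped second-order response (zeta = 0.5, wn = 1).
zeta, wn = 0.5, 1.0
wd = wn * math.sqrt(1.0 - zeta ** 2)
t = [0.001 * k for k in range(20001)]
y = [1.0 - math.exp(-zeta * wn * ti)
     * (math.cos(wd * ti) + zeta / math.sqrt(1.0 - zeta ** 2) * math.sin(wd * ti))
     for ti in t]
rise, settle, overshoot, itae = step_metrics(t, y)
print(rise, settle, overshoot, itae)
```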
As mentioned above, we seek to optimize three parameters: rise time, settling time and overshoot rate. Since the solution sought is evaluated against more than one criterion, the problem is multi-objective. In this case, it is wise to use the ACO algorithm to look for the Pareto front, that is, the set of non-dominated solutions, which represent a compromise between the different objectives considered. Choosing one solution over another is therefore hardly justified, since no solution on the front is better or worse than the others on all objectives. Pareto's concept of optimality (Censor, 1977), named after the sociologist Vilfredo Pareto, was first used to describe a state of society in which the well-being of one individual cannot be improved without damaging that of another. This concept of optimality is therefore useful for dealing with conflicting goals. Subsequently, it was adopted by various optimization algorithms, including genetic algorithms (GAs) and ACO (Häckel, Fischer, Zechel, & Teich, 2008; Tušar & Filipič, 2015).
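The notion of a non-dominated solution can be made concrete with a small filter. In the sketch below, each candidate is a tuple of objective values to be minimized, here read as (rise time, settling time, overshoot). The candidate values are invented for illustration and are not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Invented candidates: (rise time, settling time, overshoot).
candidates = [(1.0, 5.0, 10.0), (2.0, 3.0, 12.0), (2.5, 6.0, 15.0), (3.0, 2.0, 9.0)]
front = pareto_front(candidates)
print(front)
```

The third candidate is discarded because the first one is better on all three objectives, while the remaining three each win on at least one objective and therefore all stay on the front.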
Generally, it is possible to reduce a multi-criteria problem to a single-criterion one by combining the objective functions into one through the use of weights (Aguila-Camacho & Duarte-Mermoud, 2013), as follows:
$$J = \sum_{i} \lambda_i f_i$$
where J is the aggregate objective function and $\lambda_i$ are the weights of the objective functions $f_i$. Nevertheless, this method implies that the weights of the objective functions are chosen by trial and error, which can be misleading and limits the search to the points corresponding to the chosen weights.
The limitations of this approach encourage us to keep the multi-objective character of the problem and solve it using the proposed method.
This optimization method applied to the AQM congestion problem is shown in Figure 3.

Simulation results and analysis
In this section, we test by simulation the performance of the closed-loop system with the ACO-PID controller. The Pareto front is also plotted. We compare the results with those obtained with the Ziegler-Nichols (ZN) method and with genetic algorithm (GA) optimization.
The ZN tuning technique is a classical method based on a step response experiment combined with a table that relates the controller parameters to the characteristics of the step response. It was chosen for this comparison because of its simplicity and efficiency (Åström & Hägglund, 1995). The GA is an optimization method that employs the principles of natural genetic systems (Chebli et al., 2017a; Goldberg & Holland, 1988) to seek a global solution of the optimization problem. GAs are stochastic optimization methods that sweep the entire admissible space in search of the optimal solution. The GA was chosen for this comparison because of its similarity to the ACO technique.
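The table used for the ZN method is not reproduced in the text. As an illustration, the sketch below applies the classical Ziegler-Nichols step-response settings for a first-order plant with delay, Kp = 1.2 T / (K L), Ti = 2 L and Td = 0.5 L, and converts them to parallel-form gains. The function name and the sample plant are hypothetical.

```python
def ziegler_nichols_pid(gain, time_const, delay):
    """Classical ZN step-response PID settings for K * exp(-L*s) / (1 + T*s)."""
    if min(gain, time_const, delay) <= 0:
        raise ValueError("K, T and L must be positive")
    kp = 1.2 * time_const / (gain * delay)
    ti = 2.0 * delay   # integral time
    td = 0.5 * delay   # derivative time
    return kp, kp / ti, kp * td  # (Kp, Ki, Kd) in parallel form


# Hypothetical plant: K = 1, T = 1, L = 0.1.
kp, ki, kd = ziegler_nichols_pid(1.0, 1.0, 0.1)
print(kp, ki, kd)
```

These settings are known to give an aggressive, oscillatory response for many plants, which is consistent with the role ZN plays here as a simple baseline.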
For a TCP/AQM network modelled by (13), it is assumed that N = 60 homogeneous TCP connections share one bottleneck link with a link capacity C = 1250 packets/s. Furthermore, the round-trip time is R = 400 s. The stability region for the variables $K_p$, $K_i$ and $K_d$ of the PID controller is shown as a three-dimensional plot in Figure 4. ACO is run with m = 50 ants, 40 iterations, α = 0.8 and β = 0.2. Based on the assumption that the environment is quite dynamic, we take an evaporation rate ρ = 0.7.
To visualize the results of an optimization problem with three objectives, 3D scatter plots are commonly used because of their simplicity, robustness and low computational cost. However, the non-uniformity of the data requires a multivariable interpolation method (Dudziak, 2007) in order to visualize them. Figure 5 presents the scatter plots of the Pareto fronts delimited by the rise time, the settling time and the overshoot rate, taken as objective functions, for the ITAE performance index. These Pareto fronts exhibit the sets of optimal values taken by the objective functions. The solutions provided are equivalent, in that no solution can be favoured at the expense of another. Table 1, which displays a sample of the Pareto front, shows the conflicting nature of the three objective functions for the ITAE criterion. It can be clearly seen that when one objective function is noticeably improved, the others deteriorate to some extent. It is concluded that the problem of optimizing the parameters of the dynamic response of the AQM system tackled in this manuscript is indeed a Pareto optimization problem. However, in light of the nature of the system, the objectives to be optimized can be prioritized further. In the case of the delayed AQM congestion control system, the minimization of the settling time comes first, followed by that of the rise time and finally that of the overshoot rate.
Thus, it can be shown that the ACO technique based on Pareto efficiency provides satisfactory results compared to the solutions given by the GA and ZN methods used in previous research works (Chebli & Akkary, 2016; Chebli et al., 2017a). Table 2 and Figure 6 summarize this comparison. We took only one solution for each performance index since all the solutions are equivalent, given that they belong to the Pareto surface; we chose the solutions which offer the smallest settling time and rise time. Figure 6 presents the step response for the different optimal PID controllers. From Table 2, it appears that the ZN method gives poor results compared to the random search methods for the different objective functions, especially for the overshoot (91.3046%), which induces strong oscillations and thus affects the stability of the system. The ITAE performance criterion, which is very sensitive, takes better values when optimized by ACO than by GA. It should be noted that the ITAE is very useful for practical applications which require a sensitive criterion, as is the case for the delayed AQM congestion control system presented in this manuscript.

Conclusion
In this work, an efficient and effective tuning approach based on MOACO is developed to optimize the PID parameters and obtain good performance.
The simulation results show that the proposed method is more flexible and efficient in terms of dynamic performance, namely the reduction of maximum overshoot, rise time and settling time. MOACO was capable of carrying out local search with a rapid rate of convergence. Thus, the proposed approach has proved its effectiveness through simulation results. The comparative study conducted between the proposed approach and the GA and ZN techniques further reinforces the relevance of the MOACO-tuned PID controller addressed in this manuscript.