Energy-Efficient Clustering Routing Protocol for Wireless Sensor Networks Based on Yellow Saddle Goatfish Algorithm

Abstract: The usage of wireless sensor devices in many applications, such as the Internet of Things and monitoring in dangerous geographical spaces, has increased in recent years. However, sensor nodes have limited power, and battery replacement is not viable in most cases. Thus, energy saving in Wireless Sensor Networks (WSNs) is the primary concern in the design of efficient communication protocols. Therefore, a novel energy-efficient clustering routing protocol for WSNs based on the Yellow Saddle Goatfish Algorithm (YSGA) is proposed. The protocol is intended to extend the network lifetime by reducing energy consumption. The network considers a base station and a set of cluster heads in its cluster structure. The number of cluster heads and the selection of optimal cluster heads are determined by the YSGA algorithm, while sensor nodes are assigned to their nearest cluster head. The cluster structure of the network is reconfigured by YSGA to ensure an optimal distribution of cluster heads and reduce the transmission distance. Experiments show competitive results and demonstrate that the proposed routing protocol minimizes the energy consumption, improves the lifetime, and prolongs the stability period of the network in comparison with state-of-the-art clustering routing protocols.


Introduction
In recent years, there has been significant growth in wireless communications because they allow great flexibility for data transmission. Wireless communications provide a combination of connectivity and mobility using air as a medium [1]. Among the various existing wireless technologies, the most popular are wireless sensor networks. They require efficient communication, from both the energy and the computational point of view, which is achieved by developing optimal algorithms for communication protocols.
Wireless networks of distributed sensors are composed of a collection of electro-mechanical microdevices scattered within a defined area. These sensors can reveal multiple kinds of information about the territory, and this data is exchanged and transmitted over a wireless channel [2]. Wireless sensors depend on batteries, which is the main problem because their energy autonomy is limited. Thus, a base station is required, to which the information is sent for processing.
this energy threshold, it is avoided that a node with a lower energy level than the threshold value turns into a CH. With some concepts similar to P-SEP, the clustering protocol Distributed Energy-Efficient Clustering Algorithm (DEEC) [23] was introduced for HFWSNs. DEEC also uses a threshold limit for selecting the sensor nodes as CHs. Its main characteristic is that the period of time in which a sensor node behaves as a CH depends on the sensor energy level. The idea behind this decision is that high-energy nodes produce stable transmission cycles in the network.
On the other hand, metaheuristic methods are optimization schemes able to solve complex systems. These techniques are inspired by our scientific understanding of biological or social processes that, at an abstract level, can be considered search strategies. Some examples of popular metaheuristic methods include Particle Swarm Optimization (PSO) [24], Genetic Algorithms [25], the Artificial Bee Colony (ABC) algorithm [26], the Differential Evolution (DE) method [27], the Gravitational Search Algorithm (GSA) [28], and the Flower Pollination Algorithm (FPA) [29]. Metaheuristic schemes do not need convexity, continuity, differentiability, or certain initial conditions, which is an important advantage in comparison to other techniques. Despite their interesting results, these search strategies face several difficulties when they are applied to highly multimodal optimization problems.
The Yellow Saddle Goatfish Algorithm (YSGA) [30] is a metaheuristic scheme that emulates the hunting behavior of the yellow saddle goatfish. The algorithm uses different operators that allow a better balance of exploration and exploitation, reducing the typical flaws of other metaheuristic methods, such as premature convergence and the tendency to be trapped in local optima. Such characteristics have motivated its use in different complex engineering optimization problems such as system control [31] and energies [32]. In contrast to other metaheuristic algorithms, YSGA implements Lévy flights and logarithmic spirals to update the position of search individuals. Furthermore, YSGA considers groups of particles to guide the search through different directions while covering more of the search space.
As an alternative to classical approaches, the problem of designing clustering protocols has also been addressed with metaheuristic techniques [33][34][35]. In the literature, metaheuristic methods have been shown to obtain better performance than those based on traditional computational techniques in terms of accuracy and robustness. From the metaheuristic perspective, the process of a clustering routing protocol is translated into an optimization problem. Therefore, an objective function is defined to evaluate the quality of a solution. Guided by this objective function, the algorithm modifies the configurations of sensor nodes until obtaining the solution that presents a longer work cycle for the network [36,37]. Several clustering protocols have been proposed based on metaheuristic principles. One example is Energy Centers using Particle Swarm Optimization (EC-PSO) [38]. In this approach, the CHs are first selected by a geometric scheme. Then, the PSO algorithm is used to identify the central sensor of the network, which will represent the CH. The EC-PSO method also includes a mechanism to avoid the consideration of nodes with low energy levels. Another interesting example is Genetic-Algorithm-Based Energy-Efficient Clustering (GAEEC) [39]. In this clustering protocol, a Genetic Algorithm (GA) is used to identify the CH sensors through a clustering metric. The GA uses as its objective function a model that combines the energy of the nodes and the transmission cost. Recently, a routing protocol based on the Gray Wolf Optimizer has also been proposed [17]. In this scheme, different fitness functions are defined to evaluate the node characteristics of each sensor. Such values are considered to be weights that are dynamically modified depending on the distance among the nodes of the network. Therefore, the idea is to identify the configuration of CH nodes that reduces the aggregate value of the weights.
Although metaheuristic methods present interesting results, they have a critical problem: premature convergence. As a consequence, such methods frequently obtain sub-optimal CH configurations, which reduces their working cycles. The difference between the proposed YSGA clustering protocol and other protocols that also implement metaheuristic algorithms is the strategy to select CHs. In our method, we have encoded candidate solutions to automatically choose the number of CHs and avoid a fixed percentage of CHs or predefined probabilities to select CHs. Our strategy ensures the optimal number of CHs is chosen in every round by the implementation of the YSGA. Furthermore, the proposed method includes a different cost function to guide the search toward the clustering network configuration that better fits the requirements of energy savings and load balancing.
In this work, a new clustering routing protocol for Wireless Sensor Networks is introduced. The scheme is developed based on the Yellow Saddle Goatfish Algorithm (YSGA). Under the proposed approach, the number of cluster heads and the selection of optimal cluster heads are determined by the YSGA algorithm, while sensor nodes are assigned to their nearest cluster head. Therefore, the cluster structure of the network is updated by the YSGA to ensure an optimal distribution of cluster heads and reduce the transmission distance. Experiments show competitive results and demonstrate that the proposed routing protocol minimizes the energy consumption, improves the lifetime, and prolongs the stability period of the network in comparison with state-of-the-art clustering routing protocols.
The next sections of the paper are structured as follows. In Section 2, the main characteristics of the Yellow Saddle Goatfish Algorithm (YSGA) are reviewed. Section 3 describes the proposed clustering protocol. Section 4 presents the numerical results of the proposed protocol compared with other well-known methods. Finally, the conclusions are discussed in Section 5.

Review of the Yellow Saddle Goatfish Algorithm
The YSGA is a recent metaheuristic algorithm inspired by the yellow saddle goatfish behavior. This optimization method considers a population of individuals divided into different groups, where each subpopulation is created using the k-means algorithm. Individuals in each group can play two different roles: chaser and blocker. Additionally, exchange roles and change zone operators are included in the search strategy of the YSGA.

Chaser Behavior
In every subpopulation, the individual with the best fitness value is the chaser Φ_l. This particle leads the group through the search, and its behavior is modeled by a Lévy flight random process. Thus, the location of the chaser is defined as: where α is the step size, whose value is 1. The values of u and v are estimated from the normal distribution as: Considering Γ as the Gamma function, σ_u and σ_v are defined as: The Lévy index β controls the tail of the probability distribution. The value of β is calculated as: where t_max is the maximum number of iterations, while t is the current number of iterations. The best chaser among groups is the global best particle Φ_best. Thus, the position of the global best is updated by the following equation:
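As an illustration of the chaser step, a Lévy-distributed step built from the ingredients named above (u, v, σ_u, σ_v, β, Γ) is commonly generated with Mantegna's algorithm. The following Python sketch assumes that construction, with σ_v = 1 and a step scaled by the offset between the chaser and the global best. The function names `sigma_u`, `levy_step`, and `move_chaser`, and the exact form of the position update, are assumptions made for illustration and are not the reference YSGA implementation.

```python
import math
import random


def sigma_u(beta: float) -> float:
    """Standard deviation of u in Mantegna's algorithm (sigma_v is taken as 1)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta: float, rng: random.Random) -> float:
    """Draw one heavy-tailed step u / |v|^(1/beta)."""
    u = rng.gauss(0.0, sigma_u(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def move_chaser(chaser, best, beta, alpha=1.0, rng=None):
    """Assumed update: chaser + alpha * step * (chaser - best), one step per dimension."""
    rng = rng or random.Random()
    return [c + alpha * levy_step(beta, rng) * (c - b) for c, b in zip(chaser, best)]
```

Under this assumed update, a chaser that already coincides with the global best does not move, because the offset (chaser - best) is zero in every dimension.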

Blocker Behavior
Every group has one chaser individual. Therefore, the remaining particles in each group are considered blockers ϕ_g. The positions of the blockers are updated according to a logarithmic spiral defined as: Here, ρ is a random number in the interval [a, 1], where a is linearly decreased from −1 to −2 over the iterations. The value of the parameter b is 1. The distance D_g between the blocker and the corresponding chaser is calculated as: The random number r lies in the interval [−1, 1].
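A logarithmic-spiral move of this kind can be sketched as follows. The code assumes the familiar spiral form D_g · e^(bρ) · cos(2πρ) + Φ_l applied per dimension, with D_g = |r · Φ_l − ϕ_g|. This form, the helper `spiral_bound`, and the function signature are assumptions made for illustration; only the ranges of ρ, r, a and the value b = 1 are taken from the text.

```python
import math
import random


def spiral_bound(t, t_max):
    """Lower bound a, decreased linearly from -1 (t = 0) to -2 (t = t_max)."""
    return -1.0 - t / t_max


def move_blocker(blocker, chaser, a, b=1.0, rho=None, r=None, rng=None):
    """Assumed logarithmic-spiral update of one blocker around its chaser.

    rho is drawn from [a, 1] and r from [-1, 1] unless given explicitly.
    """
    rng = rng or random.Random()
    rho = rng.uniform(a, 1.0) if rho is None else rho
    r = rng.uniform(-1.0, 1.0) if r is None else r
    spiral = math.exp(b * rho) * math.cos(2.0 * math.pi * rho)
    new_position = []
    for x_blocker, x_chaser in zip(blocker, chaser):
        distance = abs(r * x_chaser - x_blocker)  # D_g, per dimension
        new_position.append(distance * spiral + x_chaser)
    return new_position
```

Passing `rho` and `r` explicitly makes the move deterministic, which is convenient for checking the geometry before switching to random draws.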

Exchange of Roles and Change of Zone
The exchange of roles allows a blocker individual to become the chaser particle. This is a simple mechanism that updates the chaser if a blocker is better in terms of the fitness value. On the other hand, the change of zone is a strategy to escape from local optima. If a better solution is not found within a predetermined time, then the change of zone operation is executed according to the following equation: This equation updates the position of every particle p_g^t in the population without considering its role.
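The two operators can be sketched in Python as below. The exchange of roles follows the description directly, assuming a minimization problem in which the first element of a group is treated as its chaser. For the change of zone, the sketch assumes that each particle moves to the midpoint between its current position and the global best; that midpoint rule and the function names `exchange_roles` and `change_zone` are assumptions made for illustration.

```python
def exchange_roles(group, fitness):
    """Return the group reordered so that its best member (lowest fitness) is first.

    The first element is treated as the chaser, the rest as blockers.
    """
    best_index = min(range(len(group)), key=lambda i: fitness(group[i]))
    return [group[best_index]] + group[:best_index] + group[best_index + 1:]


def change_zone(particle, global_best):
    """Assumed change-of-zone move: jump halfway toward the global best."""
    return [(g + p) / 2.0 for p, g in zip(particle, global_best)]
```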
Although YSGA implements Lévy flights in its search strategy, as do other algorithms such as Cuckoo Search (CS), there are significant differences among them. As an example, CS uses a global population of host nests for the search. On the other hand, YSGA uses subpopulations of search agents with blockers and a chaser in each subpopulation. Besides, the main update mechanism in CS is Lévy flights, while the update mechanisms in YSGA are Lévy flights (for exploration) and a logarithmic spiral path (for exploitation). Additionally, the CS algorithm includes a selection operator, which discards the worst solutions with a certain probability to replace them with new ones. In contrast, YSGA implements other operators, such as the exchange of roles and the change of zone. These operators help to improve the solutions, so they do not need to be removed from the population by a selection mechanism. Furthermore, the exchange of roles promotes diversity in every subpopulation, while the change of zone avoids stagnation in local optima.

Proposed Clustering Routing Protocol
In this section, the clustering routing protocol based on the Yellow Saddle Goatfish algorithm is introduced. The operation of the proposed method is organized in two main phases: the configuration phase or set-up phase and the operation phase or steady-state phase.

Configuration Phase
Initially, network assumptions and its topological structure must be defined since different network configurations can be established depending on the requirements of the application. The network assumptions comprise the characteristics of the sensor nodes. On the other hand, in the network structure, the establishment of WSNs includes the deployment of nodes in the sensing area to create a topological structure where the data will be collected. Under such considerations, the proposed protocol considers the following network assumptions and structure. This configuration has been adopted to be consistent with other related works.
Network assumptions:
1. The network has one base station BS, a set of cluster heads CH, and a set of sensor nodes n.
2. The power of the base station is externally supplied, while the energy of sensor nodes is limited.
3. A sensor node will be considered dead when it is out of power.
4. All sensor nodes are homogeneous.
Network structure:
a. Initially, all nodes are randomly deployed in the sensing area.
b. Node locations will not change during the whole life of the network.
c. The base station is placed at the center of the sensing area.
d. The number of clusters is not fixed.
e. Every normal node (also called leaf node) is added to its nearest cluster head.
Once the network topology is determined, the entire network connection establishment process begins, where the set-up phase is executed. In the set-up phase, initial CHs are chosen to build the initial cluster network configuration. In the selection of the optimal CHs, the proposed strategy considers the following energy consumption model.

Energy Consumption Model
In WSNs, the activities that consume most of the energy are data transmission and reception. The energy consumption for transmitting or receiving data depends on the distance d and the size of the data packet. Under such considerations, the energy required to transmit an l-bit data packet is defined in Equation (8), where E_TX is the energy consumption for transmitting data, l is the data length, E_elec is the energy dissipation for transmitting or receiving 1 bit of data, ε_fs is the coefficient of energy dissipation in the free space model, ε_mp is the coefficient of energy dissipation in the multi-path attenuation model, and d_th is the transmission distance threshold defined in Equation (9).
The energy required to receive an l-bit data packet is calculated in Equation (10).
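The symbols above match the widely used first-order radio model, in which the amplifier cost grows with d² below the threshold distance and with d⁴ above it, and the threshold is the square root of ε_fs / ε_mp. The Python sketch below assumes that model and uses the parameter values reported in the experimental settings; the function names are hypothetical.

```python
import math

E_ELEC = 50e-9       # J/bit, electronics energy per bit
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier coefficient
EPS_MP = 0.0013e-12  # J/bit/m^4, multi-path amplifier coefficient


def distance_threshold() -> float:
    """Crossover distance d_th between the free-space and multi-path models."""
    return math.sqrt(EPS_FS / EPS_MP)


def tx_energy(bits: int, d: float) -> float:
    """Energy in joules to transmit `bits` over a distance of d metres."""
    if d < distance_threshold():
        return bits * E_ELEC + bits * EPS_FS * d ** 2
    return bits * E_ELEC + bits * EPS_MP * d ** 4


def rx_energy(bits: int) -> float:
    """Energy in joules to receive `bits`."""
    return bits * E_ELEC
```

With these values the threshold distance is roughly 87.7 m, so sending a 4000-bit packet over 50 m falls in the free-space regime and costs about 0.3 mJ, of which 0.2 mJ is electronics energy.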
Since a normal sensor node n_i only transmits data to the cluster head, its energy consumption is calculated by Equation (11). However, the energy consumption of a cluster head must include the energy consumption of receiving packets from cluster member nodes, aggregating data, and transmitting aggregated data to the base station. Therefore, the energy consumption of a cluster head CH_j is calculated as in Equation (12), where N_j is the number of member nodes in cluster j, and E_DA is the energy for 1 bit of data aggregation. Under the above considerations, the residual energy of a leaf sensor node n_i can be estimated by Equation (13), while the residual energy of a cluster head CH_j is described in Equation (14).
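One way to write these four quantities, consistent with the three cluster-head activities listed above (receiving, aggregating, transmitting), is sketched below. The receive-energy symbol E_RX, the distance notation d(·,·), and the choice to aggregate N_j member packets plus the cluster head's own packet are assumptions made for illustration and may differ from the exact form of Equations (11) to (14).

```latex
\begin{aligned}
E_{n_i}  &= E_{TX}\bigl(l,\ d(n_i, CH_j)\bigr) \\
E_{CH_j} &= N_j\,E_{RX}(l) + (N_j + 1)\,l\,E_{DA} + E_{TX}\bigl(l,\ d(CH_j, BS)\bigr) \\
E_r(n_i, t+1)  &= E_r(n_i, t) - E_{n_i} \\
E_r(CH_j, t+1) &= E_r(CH_j, t) - E_{CH_j}
\end{aligned}
```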
During the set-up phase, the YSGA is implemented to select initial CHs and build the initial cluster network configuration considering the presented energy consumption model. Once the YSGA has chosen the cluster heads, all the remaining nodes will join the nearest CH. However, if the distance of a leaf node to its closest CH is higher than the distance to the base station, then the leaf node will not be part of a cluster. Instead, it will transmit the information directly to the base station.
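The joining rule described above can be sketched in Python as follows, assuming nodes, cluster heads, and the base station are given as coordinate tuples. The function name `assign_nodes` and the use of `None` to mark a node that transmits directly to the base station are choices made for illustration.

```python
import math


def assign_nodes(nodes, cluster_heads, base_station):
    """Map each leaf node index to the index of its nearest CH, or to None
    when the base station is closer than that CH (direct transmission)."""
    assignment = {}
    for i, node in enumerate(nodes):
        d_bs = math.dist(node, base_station)
        j_best, d_best = None, math.inf
        for j, ch in enumerate(cluster_heads):
            d = math.dist(node, ch)
            if d < d_best:
                j_best, d_best = j, d
        # Join the cluster only if the nearest CH is not farther than the BS.
        assignment[i] = j_best if d_best <= d_bs else None
    return assignment
```

For example, with the base station at (50, 50) and cluster heads at (20, 20) and (80, 80), a node at (48, 50) stays unclustered because the base station is only 2 m away.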
The process of selecting CHs to build an optimal network configuration by the YSGA is described in the operation phase.

Operation Phase
In the operation phase, the information is transferred to the BS once the selection of optimal CHs has established the network connections. Data transmission occurs every round, where one round is completed every time the information is transferred from normal nodes to CHs, and then to the BS. After each round, the network is reconfigured by executing the YSGA algorithm to select the new optimal CHs and rebuild the network links. The process for choosing CHs and the optimal cluster network configuration is described in the next subsections.

Selection of Optimal Cluster Heads
In the proposed protocol, the YSGA algorithm determines the number of CHs and which nodes will be selected as CHs. In the selection, two main criteria are considered: the distance from CH to the base station and the residual energy of CH.
Regarding the first criterion, establishing the most efficient connection will promote energy savings. Therefore, the selection of CHs considers reducing the length of the links. Additionally, it is assumed that links work at the same frequency, and crosslinks are eliminated because they generate interference during data transmission.
Concerning the second criterion, the proposed method selects sensor nodes as CHs if their residual energy is high enough. Cluster heads spend more energy than normal nodes because, according to the energy consumption model, CHs consume energy when receiving, aggregating, and transmitting data. Hence, sensor nodes with the highest residual energy must be selected as CHs for load balancing.
In contrast to other clustering routing protocols, the number of CHs is not fixed. Instead, the number of CHs is dynamically changed in order to build the best network configuration in every round. Furthermore, the selection of CHs is not random or probabilistic, such as in LEACH or DEEC protocols. In contrast, in our method, the YSGA algorithm automatically finds the optimal number of CHs and selects the best sensor nodes to become CHs in every round. This strategy allows the implementation of the proposed protocol for a wide range of WSN applications without concern about determining the number of clusters.
Once the optimal cluster heads are selected, each sensor node is assigned to the nearest cluster head. Nevertheless, if the distance from the sensor node to the BS is shorter than the distance to the CH, then the sensor node is not clustered, and the information of this node is transmitted directly to the base station. After completing the clustering process, a potential network configuration is prepared for transmission, and it can be evaluated to determine if it is optimal. The procedure for selecting the best cluster network configuration will be explained in the following section.

Selection of Optimal Cluster Network Configuration
In the proposed protocol, the YSGA algorithm finds the optimal cluster network configuration C by selecting the optimal set of cluster heads. Thus, in every iteration, cluster heads are determined to generate a new network structure. After that, the network is evaluated in terms of its fitness value. In this work, we implement a cost function to evaluate the effectiveness of the network configuration. The objective function includes four terms: the total intra-cluster distance, the total distance from cluster heads to the base station, the energy consumption, and the residual energy of cluster heads. The cost function is defined in Equation (15),
where w_1, w_2, w_3, and w_4 are constant weighting factors that control the contribution of every element in the formulation. The first term is the total intra-cluster distance, calculated in Equation (16). Considering the subset of sensor nodes n_j assigned to cluster j and CH_j as the cluster head, the maximum and minimum total intra-cluster distances from every sensor node in n_j to its cluster head CH_j are d_max and d_min, respectively, while the total intra-cluster distance is d.
The second term is the total distance from cluster heads to the base station, formulated in Equation (17). Here, the maximum and minimum total distances from every cluster head CH_j to the base station BS are d_max and d_min, respectively, while the total distance from cluster heads to the base station is d.
The third term is the energy consumption, calculated in Equation (18), where E_n_ij is the energy consumption of the sensor node i that belongs to cluster j. The number of member nodes in cluster j is N_j. The maximum and minimum total energy consumption of the network are E_max and E_min, respectively. The fourth term, given in Equation (19), is the sum of the residual energy ratio between sensor node members and their cluster head, where the residual energy of the cluster head is E_r_CH_j, while the residual energy of sensor node members is E_r_n_ij. The maximum and minimum total residual energy of the network are E_max and E_min, respectively.
A smaller value of f(C) indicates that the selected cluster heads have generated a better cluster network configuration. In that sense, the YSGA algorithm will choose a set of cluster heads to build different cluster networks in every iteration in order to find the optimal network configuration. The proposed algorithm is summarized in Algorithm 1.
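Because each term is described together with its maximum and minimum, one plausible reading is that every term is min-max normalized before the weighted sum is formed. The Python sketch below assumes that reading; the normalization, the argument layout, and the function names `normalize` and `cost` are assumptions made for illustration rather than the exact formulation of the cost function.

```python
def normalize(value, lowest, highest):
    """Min-max scaling to [0, 1]; returns 0 when the range is degenerate."""
    if highest == lowest:
        return 0.0
    return (value - lowest) / (highest - lowest)


def cost(terms, bounds, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of four normalized terms; smaller is better.

    terms  : (intra_cluster_distance, ch_to_bs_distance,
              energy_consumption, residual_energy_ratio)
    bounds : one (min, max) pair per term
    """
    return sum(w * normalize(x, lo, hi)
               for w, x, (lo, hi) in zip(weights, terms, bounds))
```

With all four weights set to 1, as in the reported parameter settings, a configuration whose terms all sit halfway between their bounds would score 2.0 under this sketch.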
Update the optimal cluster network configuration as:

Simulation and Experimental Results
To estimate the performance of the proposed protocol, we take into account the network lifetime, network stability period, total residual energy, residual energy deviation, and the number of data packets received by the sink node. These metrics are described as follows:
• Network lifetime: The time between the start of the network operation and the death of the first node. It is also known as the network stability period.
• Network instability period: The time between the death of the first node and the death of all nodes.
• Residual energy: This metric is calculated by considering the total residual energy of all sensor nodes. Usually, this value is presented as a percentage of the total remaining energy of the network, as represented in Equation (20), where p(t) is the percentage of the total residual energy in round t. It is obtained from the total residual energy in round t, Σ_{i=1}^{N} E_r(n_i, t), and the sum of the initial energy E_o of all the sensor nodes in the network.
• Residual energy deviation: The variation between the node with the most remaining energy and the node with the least remaining energy. The residual energy deviation in round t is calculated in Equation (21), where E_r^max(n, t) and E_r^min(n, t) are the maximum and minimum residual energy among the set of nodes n in the network in round t. To facilitate the analysis of the results, the residual energy deviation is reported in percentage units.
• Throughput: The number of data packets received by the BS.
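The metrics listed above can be computed from per-round simulation logs. The Python sketch below assumes homogeneous nodes with the same initial energy E_o, a list of residual energies per node, and a list with the number of alive nodes per round; the function names and data layout are hypothetical.

```python
def residual_energy_percent(residual, initial_energy):
    """Total residual energy as a percentage of the total initial energy."""
    return 100.0 * sum(residual) / (len(residual) * initial_energy)


def residual_energy_deviation(residual):
    """Gap between the most and the least charged node in the current round."""
    return max(residual) - min(residual)


def death_rounds(alive_per_round, n_nodes):
    """Rounds at which the first, half, and all nodes are dead (None if never)."""
    def first_round(dead_target):
        for t, alive in enumerate(alive_per_round):
            if n_nodes - alive >= dead_target:
                return t
        return None
    return first_round(1), first_round(n_nodes / 2), first_round(n_nodes)
```

The first value returned by `death_rounds` corresponds to the network lifetime as defined above, and the gap between the first and last values corresponds to the instability period.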
For consistency in the experiments, the parameter settings of Table 1 have been considered for all the routing protocols in comparison. Regarding the tuning parameters of the YSGA algorithm, Table 2 shows the adopted values, for which the authors have reported the best performance of the YSGA. In addition to the settings of Table 1, Table 3 shows the values of additional parameters considered in every method in comparison.

Table 1.
Parameter | Value
Sensing area | 100 m²
Number of sensor nodes N | 100
Packet size l | 4000 bits
Energy for 1 bit of data aggregation E_DA | 5 nJ/bit
Energy dissipation for transmitting or receiving 1 bit of data E_elec | 50 nJ/bit
Coefficient of energy dissipation in the free space model ε_fs (transmission amplifier coefficient) | 10 pJ/bit/m²
Coefficient of energy dissipation in the multi-path attenuation model ε_mp (transmission amplifier coefficient) | 0.0013 pJ/bit/m⁴
Initial energy of sensor nodes E_o | 0.07 J

Table 2. Setting parameters of the proposed method.
Parameter | Value
Population size | 20
Maximum number of iterations | 50
w_1, w_2, w_3, w_4 | 1
α | 1
b | 1

The simulation was implemented in MATLAB R2019a (The MathWorks, Inc., Natick, Massachusetts, USA) on a computer with an Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz 1.99 GHz processor.

Network Lifetime
Figures 1 and 2 show the obtained results of the simulations in terms of the network lifetime. The history of alive nodes is reported in Figure 1, while the number of dead sensor nodes is illustrated in Figure 2. From these figures, it is evident that the proposed clustering routing protocol performs better than the protocols in comparison. Besides, nodes start to die after 230 rounds in the proposed protocol, while nodes begin to die after 180 rounds in DEEC.
Considering the achieved results, our method prolongs the lifetime of the network for more rounds than the other techniques. These results were obtained because our strategy automatically changes the number of clusters in every round to balance the energy load. Besides, the selection of sensor nodes to become CHs is also dynamic. Therefore, the roles of sensor nodes can change in each round to obtain the network configuration that reduces the energy consumption and increases the lifetime of the nodes.
Additionally, the clustering evolution of the network is presented in Figure 3, where the clustering formation is visualized in different rounds. This figure shows how clusters change as different cluster heads are selected over time. Furthermore, it reveals the distribution of dead nodes and how they are ignored in the clustering process.
In the simulation, sensor nodes are represented as circles, while communication links are symbolized as connection lines. From Figure 3, the base station can be observed at the center of the sensing area, represented as a blue circle. On the other hand, alive sensor nodes can be identified as green circles, while cluster heads are yellow circles, and dead nodes are simulated as black circles. The links that communicate intracluster nodes can be observed as cyan lines, while communication links from cluster heads to the base station are represented as magenta dashed lines. The connection between independent nodes and the base station is represented as blue lines. Figure 3a-d illustrate the optimal cluster network configuration in rounds 5, 100, 270, and 400, respectively.
The proposed method and DEEC might seem similar in their resulting patterns since both schemes are based on similar performing principles. Under such conditions, both techniques could generate similar rates of energy consumption in every round. However, the proposed strategy includes additional factors that influence the selection of CHs, which leads to an optimal cluster configuration that extends the lifetime of the network. In contrast, the DEEC method obtains suboptimal cluster topologies that produce higher consumption. Moreover, selected CHs do not always have the maximum remaining energy, which leads to inefficient energy expenditure. Furthermore, cluster formation also stops quickly in DEEC, which forces normal nodes to communicate directly to the base station, causing faster exhaustion of nodes, and consequently, reducing the network lifetime.

Network Instability Period
For a closer inspection, Figure 4 shows the round in which the first sensor node died. Additionally, the rounds in which half of the nodes and all of the nodes died are also reported in this figure. According to the obtained results, the proposed protocol achieves the highest values. In our method, the first node dies after 230 rounds, 50% of the nodes have died after 340 rounds, and 100% of the sensor nodes have died after almost 450 rounds. The protocols in comparison show inferior values regarding the first, half, and last dead nodes in the network.

Residual Energy
The total residual energy of the network is the sum of the remaining energy of all nodes in each round. This metric is reported in Figure 5, which shows that the proposed method saves more energy than the other protocols. The total residual energy is expressed as a percentage, with 100% corresponding to the residual energy in round 0 and 0% to the last round. As expected, our approach finds the network configuration that spends the least energy in every round, so the residual energy of the network is higher than that obtained with the other methods. From the figure, we can conclude that the proposed protocol achieves the lowest energy expenditure and lengthens the life of the nodes.
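The normalization described above can be sketched as follows. This is an illustrative example under stated assumptions: the function name `residual_energy_percent` and the nested-list input layout are hypothetical, and the percentage is taken relative to the total energy in round 0.

```python
def residual_energy_percent(energy_per_round):
    """Total residual energy per round, as a percentage of round 0.

    energy_per_round[r][i] is the remaining energy (in joules) of
    node i at round r (hypothetical input layout).
    """
    totals = [sum(nodes) for nodes in energy_per_round]
    initial = totals[0]
    if initial <= 0:
        raise ValueError("total energy in round 0 must be positive")
    return [100.0 * total / initial for total in totals]


# Illustrative two-node network; values are not taken from the paper.
history = [[0.5, 0.5], [0.25, 0.5], [0.0, 0.0]]
print(residual_energy_percent(history))
```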

Energy Deviation
The energy deviation is the difference between the residual energy of the sensor node with the highest residual energy and that of the node with the lowest. It is an indicator of load balancing: if the energy deviation is high, the energy expenditure is not balanced among the sensor nodes. Unbalanced energy expenditure causes premature node death and, consequently, reduces the network lifetime. Better load balancing therefore corresponds to a lower peak deviation. Figure 6 shows the energy deviation of the reported protocols. The highest energy deviation is reached by LEACH, while the lowest is achieved by DEEC and YSGAP. However, YSGAP reaches its highest peak after 230 rounds, whereas DEEC reaches it after 180 rounds. The later the highest peak is reached, the better. Therefore, we can conclude that YSGAP outperforms the other protocols in terms of energy deviation.
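As a minimal sketch of this metric, the deviation in each round is the maximum minus the minimum residual energy, and the peak is the largest deviation over all rounds. The function names and input layout are hypothetical, and the sketch assumes that every node, including depleted ones, is counted in each round.

```python
def energy_deviation(energy_per_round):
    """Per-round gap between the most and least charged node.

    energy_per_round[r][i] is the remaining energy (in joules) of
    node i at round r (hypothetical input layout).
    """
    return [max(nodes) - min(nodes) for nodes in energy_per_round]


def peak_round(deviation):
    """Return (round index, value) of the first highest peak."""
    peak = max(deviation)
    return deviation.index(peak), peak


# Illustrative two-node network; values are not taken from the paper.
history = [[0.5, 0.5], [0.5, 0.25], [0.25, 0.0], [0.0, 0.0]]
deviation = energy_deviation(history)
print(deviation, peak_round(deviation))
```

Under the criteria above, a protocol is preferable when the peak value is low and the round in which it occurs is late.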

Throughput
Throughput is the number of packets sent to the base station. The more packets sent, the better, since more of the information collected by the sensor nodes is delivered. Figure 7 shows the total number of packets sent to the base station in every round. From the figure, it is observed that YSGAP sends more data packets to the base station than the other protocols. This was expected: the proposed method prolongs the network lifetime and, consequently, more data is collected.
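One way to obtain such a curve is a running total of the packets delivered in each round. The sketch below assumes that the "total number of sent packets" is cumulative over rounds, which is an interpretation rather than something the text states explicitly; the function name and the input values are hypothetical.

```python
from itertools import accumulate


def cumulative_throughput(packets_per_round):
    """Running total of packets received by the base station.

    packets_per_round[r] is the number of packets delivered in
    round r (hypothetical input layout).
    """
    return list(accumulate(packets_per_round))


# Illustrative values only: deliveries stop once every node is dead.
print(cumulative_throughput([4, 4, 3, 0]))
```

Under this reading, a longer-lived network keeps adding to the total for more rounds, which is consistent with the explanation given for the YSGAP result.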

Conclusions
In this work, a novel clustering routing protocol called YSGAP was proposed. The method finds the optimal network configuration to reduce the energy consumption in every round and prolong the network lifetime. In its operation, the YSGAP protocol automatically determines the number of cluster heads and selects the sensor nodes that will play the role of cluster heads in every round. Our method was compared with some of the most popular clustering routing protocols in the literature: DEEC, LEACH, and SEP. Experimental results showed that the proposed strategy outperforms the methods in comparison. The evaluation demonstrated that our approach considerably increases the lifetime of the network, while also providing robustness in communications, fault tolerance, and time-bounded response.
As future work, the proposed protocol can be enhanced by including more terms in the cost function, such as a term that favors uniform cluster sizes for better load balancing. Additionally, the encoded candidate solutions might be arranged differently so that the dimensionality of the optimization problem is considerably reduced. This change can increase the scalability of the sensor network. Furthermore, the off-line execution of the protocol can be considered the main drawback of the proposed method. Therefore, a significant improvement would be a parallel implementation that speeds up the process of finding the optimal cluster configuration. This improvement would allow the protocol to run online in real-time applications.
Author Contributions: A.R. developed the energy algorithm, prepared and executed the scenario and test-bed, interpreted and analyzed the results, and designed the methodology. C.D.-V.-S. was involved in the algorithm design, supervised the research methodology and the approach of this work, performed the formal analysis, and interpreted the results. R.V. reviewed, interpreted, and drafted the simulation results, and reviewed the methodology and the manuscript. All authors have read and agreed to the published version of the manuscript.