Routing Protocol for Heterogeneous Wireless Sensor Networks Based on a Modified Grey Wolf Optimizer

Wireless sensor network (WSN) nodes are devices with limited power, and rational utilization of node energy and prolonging the network lifetime are the main objectives of the WSN’s routing protocol. However, irrational considerations of heterogeneity of node energy will lead to an energy imbalance between nodes in heterogeneous WSNs (HWSNs). Therefore, in this paper, a routing protocol for HWSNs based on the modified grey wolf optimizer (HMGWO) is proposed. First, the protocol selects the appropriate initial clusters by defining different fitness functions for heterogeneous energy nodes; the nodes’ fitness values are then calculated and treated as initial weights in the GWO. At the same time, the weights are dynamically updated according to the distance between the wolves and their prey and coefficient vectors to improve the GWO’s optimization ability and ensure the selection of the optimal cluster heads (CHs). The experimental results indicate that the network lifecycle of the HMGWO protocol improves by 55.7%, 31.9%, 46.3%, and 27.0%, respectively, compared with the stable election protocol (SEP), distributed energy-efficient clustering algorithm (DEEC), modified SEP (M-SEP), and fitness-value-based improved GWO (FIGWO) protocols. In terms of the power consumption and network throughput, the HMGWO is also superior to other protocols.


Introduction
In recent years, with the development of low-power digital circuits and wireless communication technology, wireless sensor networks (WSNs) have been widely used in many fields, such as military reconnaissance, medical aid, urban management, smart home and target tracking [1][2][3][4][5]. WSNs are mainly composed of a base station (BS) and a large number of randomly distributed sensor nodes. The energy of the sensor node is mainly powered by the battery. However, the battery power is very limited, and when the sensor nodes are placed in a complex or harsh environment, it is very difficult to charge or replace the battery [6,7]. Therefore, energy efficiency is the most important and critical task to lengthen the network lifetime in the design of WSN routing protocols [8][9][10][11].
In the routing protocol of WSNs, the most effective way to save the energy of the sensor nodes and lengthen the network life cycle is to design a routing protocol based on clustering [12,13]. The idea of a clustering algorithm is to divide sensor nodes into multiple regions; a region is also called a cluster. A sensor node is selected as a leader in each cluster, which is also known as the cluster head (CH),

Related Work
With the rapid evolution and development of sensor networks, the routing protocols of WSNs have attracted the attention of many scholars. However, the clustering routing protocol is a more efficient way to reduce the energy consumption of sensor nodes and lengthen the life cycle of networks relative to other typical routing protocols [28,29]. The most famous clustering routing protocol among homogeneous WSNs is the LEACH protocol [14]. In LEACH, the CHs are randomly selected, and the sensor nodes are periodically rotated into CHs. The network energy consumption is evenly allocated to each sensor node to lengthen the network life cycle and enhance the network throughput. However, many improved protocols based on LEACH have been proposed, such as LEACH-B [30], LEACH-C [17], CL-LEACH [31], H-LEACH [32], O-LEACH [33], etc. Vice-CH-enabled centralized cluster-based routing protocol (VCH-ECCR) [19] is a routing algorithm. The VCH-ECCR reduces the clustering frequency by selecting two-stage vice CHs, thereby reducing the overhead caused by clustering. The number of CHs is updated according to the number of surviving nodes in each round in the network. The experimental results illustrate that the protocol is better than that of traditional routing protocols. A fuzzy logic-based energy-efficient clustering for WSNs based on minimum separation distance enforcement between CHs (FL-EEC/D) [20] was proposed. In FL-EEC/D, a fuzzy logic model is proposed for the selection of CHs. When using this model for CH selection, the residual energy, compaction, density, position suitability, and distance from the BS are used as the main parameters to select a better node as the CH. The life cycle of the FL-EEC/D is raised relative to LEACH. The above protocol performs well in homogeneous WSN, but it performs poorly in HWSNs.
On the basis of the above routing protocols of homogeneous WSNs, many routing protocols of HWSNs have been proposed to further lengthen the life cycle of the network [34,35]. A SEP [21] for the HWSNs was proposed by Smaragdakis et al. The different initial energy levels of nodes indicate that this network is a multi-level HWSN. A two-level HWSN is defined by the SEP; namely, advanced nodes and normal nodes. The SEP protocol has different threshold formulas for advanced nodes and normal nodes. Compared with normal nodes, advanced nodes have a higher probability of becoming CHs, which reduces the remaining energy deviation between ordinary nodes and advanced nodes, thus better lengthening the life cycle of the network. An M-SEP [22] for HWSNs was proposed in 2015. The residual energy of the current node and the average energy of the network are considered in the M-SEP protocol, so that nodes with high residual energy have a higher probability of becoming the CH. The transmission rate of packets and the lifetime of the network are improved through this technique. A P-SEP [23] was proposed in 2017. In the selection of CHs of each round, the protocol controls the inherent randomness and deploys the threshold of heterogeneous energy to prevent low-energy nodes from becoming CHs in the next round. This approach results in improved performance in the network's lifetime. Hosen et al. proposed an energy-centric cluster-based routing protocol (ECCR) [11]. In this protocol, a predefined static cluster is proposed, whose purpose is to reduce the control information overhead in the cluster formation problem. Next, a gatekeeper is introduced, which carries the rank information of nodes with local data. The future cluster director is transferred to the role of the former cluster director of the previous round. This reduces the cost of CH elections over the entire network life cycle, thus extending the network life cycle.
In [25], a DEEC for HWSNs was designed. This protocol is a multi-level heterogeneous network protocol. DEEC chooses the CHs by defining a threshold formula, which is influenced by the remaining energy of the node and the average remaining energy of the network. The period taken for the node to become CH varies according to the remaining energy, so the probability of a high-energy node becoming a CH is higher than that of a low-energy node, which lowers the energy consumption between nodes and lengthens the stable transmission period of the network. An energy efficient clustering protocol to enhance the performance of the HWSN, EECPEP-HWSN [36], was proposed. EECPEP-HWSN was designed as a three-level HWSN, namely including ordinary nodes, advanced nodes, and super nodes. In the process of CH selection, the initial energy of the sensor nodes, hops and residual energy of Sensors 2020, 20, 820 4 of 18 the sensor nodes during operation are considered. This protocol improves the energy efficiency and stability of HWSN.
In recent years, heuristic and metaheuristic algorithms have been applied to WSNs for theoretical research [37][38][39]. However, many of the above algorithms have also been applied to clustering routing protocols to optimize the selection process of cluster heads, thus improving the network's life cycle [40][41][42][43][44][45]. Wang et al. proposed a special clustering method called energy centers searching using particle swarm optimization (EC-PSO) [46]. First, the protocol selects CHs by dividing node positions according to geometric methods. Then, the particle swarm optimization (PSO) algorithm is used to search the energy center of the network, and the nearest node to the energy center is selected as the CH. Finally, in order to avoid weak nodes becoming relay nodes, the algorithm proposes a protection mechanism with low energy consumption. By combining the above method, the protocol can effectively lengthen the network lifespan. A Genetic-Algorithm-Based Energy-Efficient Clustering (GAEEC) [47] was proposed. The genetic algorithm (GA) is used twice in the protocol. First, in order to evenly select the number of CHs, the protocol uses GA to divide the network into static clusters by distance. Second, the GA is used in the phase of CH selection. However, the selection of CHs is based on the remaining energy of the current node and the total transmission cost, so that nodes with a larger remaining energy have a higher probability of becoming the CH. Thus, the protocol effectively enhances the performance of the network lifespan. A multilayer hierarchical routing protocol (MLHP) [48] was presented. The algorithm divides the sensor nodes into three levels (normal nodes, advanced nodes, and super nodes). At the level of normal nodes, a centralized selection is proposed in which the BS plays an important role in CH selection. In the area of advanced nodes, a GWO route is presented for data transmission, and a distributed clustering algorithm based on cost function is presented for the level of super nodes. The above algorithm makes the MLHP have a longer network life cycle. In [49], an energy-efficient routing protocol for WSNs based on an improved grey wolf optimizer (FIGWO) was proposed. FIGWO improves the searching ability of the GWO optimal solution by considering the fitness value, and a better distribution and more balanced clustering structure of CHs are guaranteed. The transmission distance of the sensor node is recalculated according to the distance from CH and BS, thus reducing the network energy consumption and prolonging the network lifespan.

Grey Wolf Optimizer
The GWO algorithm is designed from the social hierarchy and hunting behavior of grey wolf groups [50]. Grey wolves are gregarious animals with an average population of five to 12. Wolves are classified into four categories: leader wolves (α), the second-best wolves (β), the third-best wolves (δ), and other wolves ( ), as shown in Figure 1. Among them, α wolves make decisions about their daily activities, such as how to hunt, when to rest, when to wake up and how to distribute food. The main task of β wolves are not only to assist α wolves in making decisions on the daily behavior of the wolves, but also to report the performance of other wolves to α wolves. δ wolves are the followers of α and β wolves, mainly following the orders of α and β wolves, but can command the lower wolves.
wolves mainly obey the orders of the higher wolves.
Sensors 2020, 20, x FOR PEER REVIEW 4 of 16 In recent years, heuristic and metaheuristic algorithms have been applied to WSNs for theoretical research [37][38][39]. However, many of the above algorithms have also been applied to clustering routing protocols to optimize the selection process of cluster heads, thus improving the network's life cycle [40][41][42][43][44][45]. Wang et al. proposed a special clustering method called energy centers searching using particle swarm optimization (EC-PSO) [46]. First, the protocol selects CHs by dividing node positions according to geometric methods. Then, the particle swarm optimization (PSO) algorithm is used to search the energy center of the network, and the nearest node to the energy center is selected as the CH. Finally, in order to avoid weak nodes becoming relay nodes, the algorithm proposes a protection mechanism with low energy consumption. By combining the above method, the protocol can effectively lengthen the network lifespan. A Genetic-Algorithm-Based Energy-Efficient Clustering (GAEEC) [47] was proposed. The genetic algorithm (GA) is used twice in the protocol. First, in order to evenly select the number of CHs, the protocol uses GA to divide the network into static clusters by distance. Second, the GA is used in the phase of CH selection. However, the selection of CHs is based on the remaining energy of the current node and the total transmission cost, so that nodes with a larger remaining energy have a higher probability of becoming the CH. Thus, the protocol effectively enhances the performance of the network lifespan. A multilayer hierarchical routing protocol (MLHP) [48] was presented. The algorithm divides the sensor nodes into three levels (normal nodes, advanced nodes, and super nodes). At the level of normal nodes, a centralized selection is proposed in which the BS plays an important role in CH selection. In the area of advanced nodes, a GWO route is presented for data transmission, and a distributed clustering algorithm based on cost function is presented for the level of super nodes. The above algorithm makes the MLHP have a longer network life cycle. In [49], an energy-efficient routing protocol for WSNs based on an improved grey wolf optimizer (FIGWO) was proposed. FIGWO improves the searching ability of the GWO optimal solution by considering the fitness value, and a better distribution and more balanced clustering structure of CHs are guaranteed. The transmission distance of the sensor node is recalculated according to the distance from CH and BS, thus reducing the network energy consumption and prolonging the network lifespan.

Grey Wolf Optimizer
The GWO algorithm is designed from the social hierarchy and hunting behavior of grey wolf groups [50]. Grey wolves are gregarious animals with an average population of five to 12. Wolves are classified into four categories: leader wolves ( ), the second-best wolves ( ), the third-best wolves ( ), and other wolves ( ), as shown in Figure 1. Among them, wolves make decisions about their daily activities, such as how to hunt, when to rest, when to wake up and how to distribute food. The main task of wolves are not only to assist wolves in making decisions on the daily behavior of the wolves, but also to report the performance of other wolves to wolves. wolves are the followers of and wolves, mainly following the orders of and wolves, but can command the lower wolves. wolves mainly obey the orders of the higher wolves.  During hunting, the location of the prey is assessed by α, β, and δ wolves, and the remaining wolves calculate the distance between themselves and the prey; then, the wolves encircle the prey. The following is the calculational formula of the wolf's position (1) where → X t+1 is the wolf's position in the (t + 1) th iteration, and → X t p is the prey's position at the t th iteration.  → D is the distance from the wolves to the prey, and is calculated as follows: (3) where → X t p and → X t are, respectively, the prey's position and the wolf's position in the t th iteration, → C is the coefficient vectors calculated in (4), and → r 2 is a random vector between 0 and 1. In GWO, the α, β, and δ wolves are thought to be closest to the prey. They also have the most extensive hunting experience, and thus can determine where the prey is located. Therefore, the prey's location can be calculated as follows where δ are the positions of the α wolves, the β wolves and the δ wolves in the (t + 1) th iteration, respectively. They are all calculated according to Equation (1). In addition, other wolves will update their positions around the prey, according to Formula (1). At the end of the iteration times, α wolves have the highest fitness value. Therefore, α wolves are considered as the optimal solution of the function. Algorithm 1 describes the GWO pseudo-code. In GWO, the end of the iteration indicates the best time for the wolves to capture their prey, and that all wolves are closest to their prey. In the routing protocol of the clustering algorithm, the node with the smallest distance in the cluster should be selected as the CH. Therefore, in the proposed algorithm, we assume that the sensor nodes are wolves and the CH is prey.

Proposed Algorithm
In this section, the routing protocol for HWSN based on the GWO is presented in detail.
Lengthening the network life cycle and enhancing the network throughput are the main objectives of the proposed study.

Network Model and Assumptions
In the network model, the following assumptions are made: 1.
All nodes are randomly distributed in the two-dimensional geographical area. Once the location is determined, no matter what happens, the location of the nodes will not change; 2.
In HWSNs, all sensor nodes are assigned different initial energy levels; 3.
The BS is located in the center of the sensing area, and its power is externally supplied; 4.
The energy of the sensor node is limited, and the battery cannot be charged; 5.
When the sensor node power is exhausted, the node will be considered dead.

Energy Consumption Model
In this paper, the energy consumption model is similar to that used in [51]. In WSNs, nodes are randomly deployed, and the locations of the nodes are not pre-designed. Most of the energy of a node is dissipated due to communication between nodes, depending on the distance between the nodes. Both data transmission and reception consume energy. Therefore, to transmit an (m − bit)-long data packet over the distance d, the required energy is where E TX is the energy consumed when the node transmits data, E elec is the energy dissipation of the process of transmitting 1 bit of data and the process of receiving 1 bit of data, ε f s is the coefficient of Sensors 2020, 20, 820 7 of 18 energy dissipation in the free-space model, ε mp is the coefficient of energy dissipation in the multi-path attenuation model, and d crossover is the threshold of the transmission distance, which is calculated as However, the energy consumption required by the receiving node to receive an m bit data packet is calculated as follows The energy consumption of CH can be calculated by the above model. The energy consumption of CH mainly includes three aspects: the energy consumption of receiving data packets of member nodes, fusing data and transmitting fusing data to the base station. The calculation formula is as follows: where CM num is the number of member nodes in the cluster and E DA is 1 bit of data aggregation energy cost; m is the packet length. The energy consumption of the non-CH node is only the energy consumption of sending data to the CH, and its mathematical expression is as follows: The total residual energy in the r th round is calculated as follows: where E totR (r − 1) is the total residual energy in the (r − 1) th round, CH num (r) represents the number of CHs in the r th round, N alive (r) represents the number of alive nodes in the network in the r th round, E CH (i) is the energy consumed by the i th CH, and E non−CH ( j) is the energy consumed by the j th non-CH.

Proposed Algorithm
In order to avoid the randomness of the CH selection, a centralized clustering routing protocol is proposed. The stage of CH selection is calculated centrally by BS through the modified grey wolf optimizer (MGWO), and then the result of CH selection is notified to all sensor nodes in the network in the form of broadcast. The phases at which the cluster member joins the cluster and steady-state are the same as in the LEACH [14] protocol. In the first stage of CH establishment, the sensor nodes first send the location and initial energy to the BS, which receives the information and saves it. The BS is given the initial energy of the nodes, and the node's position is fixed. The energy consumption of the node can be estimated by clustering information of each round, and the energy information of the node in each round can be obtained. Therefore, the nodes do not need to send position and energy information to the BS in each round.

Selection of Initial Clusters
The protocol uses the energy of the node and the distance from the node to the BS as parameters to select the initial clusters of the network, thereby fixing and limiting the number of CHs in each round, so that the number of CHs in each round is even. The use of the distance parameter makes the distribution of CHs in the network more uniform. The following are the rules used for selecting the initial clusters. The set of alive sensor nodes is divided into m (m is the number of desired clusters, which equals N/p, where N is the number of sensor nodes, and p is the portion of CHs) equal subsets according to the ascending result of the fitness value of the sensor nodes. In each subset, the sensor node that is closest to the middle point is selected as the initial CH. Finally, each node is added to the Sensors 2020, 20, 820 8 of 18 nearest CH according to the Euclidean distance to form the initial cluster. The node's fitness value is calculated according to the distance from the node to the BS and residual energy: , E r > 0 and the node is the normal node , E r > 0 and the node is the advanced node 0 , E r ≤ 0 , where a 1 is the weight, E i is the initial energy of the node, E r is the residual energy of the node, d toBS is the distance from the node to the BS, and d MAXBS and d MINBS are, respectively, the maximum and minimum distances between a sensor node and the BS.

Modified Grey Wolf Optimizer
The MGWO is used to select the CHs. In the original GWO [50], the prey's position was calculated based on the average weight of the α, β, and δ wolves. Considering the difference between the residual energy of the node and the distance between the node and the BS, the fitness value of the node is taken as the initial weight in the GWO, which is calculated by Equation (12). Consequently, based on the GWO and formula (12), the initial position of prey is computed as where ω Iα , ω Iβ , and ω Iδ represent the initial weights of the α wolf, the β wolf, and the δ wolf, respectively. F α , F β , and F δ are the first, the second, and the third best fitness values, respectively, which are calculated through Formula (12). The α, β, and δ wolves are the respective nodes corresponding to the three best fitness values. When the GWO is implemented in the proposed protocol, the fitness value of the node is changed after completing one data transmission, so the weight of the GWO will not change. In order to further improve the global search capability of the GWO, the weights are dynamically updated by vectors D and A. D is the distance between the wolf and its prey, and A is the coefficient vector. D and A are calculated according to Equations (2) and (3). At the (t + 1) th iteration, the location of the prey and the weight updating formula are expressed as can be obtained by Equation (1). At the end of the iteration, the node closest to the prey is generally selected as the CH. However, the residual energy of the node may not be able to complete the task of the CH, causing the node to die prematurely. Therefore, the nodes that are close to the prey and have more residual energy should be selected as the CH. Here, it is proposed to use the remaining energy of the node and the Sensors 2020, 20, 820 9 of 18 distance between the node and the prey as parameters of the fitness function to select the CH. The node with the smaller fitness value is more likely to become the CH. The fitness function is as follows: , E r > 0 and the node is the normal node , E r > 0 and the node is the advanced node 0 , E r ≤ 0 , (17) where a 2 is the weight; E r is the residual energy of the node; E MAX and E MIN are the maximum and the minimum residual energy in the cluster, respectively; d toP is the distance from the node to the prey; and d MAXP and d MINP are the maximum and minimum distance, respectively, between a sensor node and the prey in the cluster.

Selection of the Optimal Cluster Set
According to the clustering algorithm, a network can be divided into multiple clusters, and multiple clusters in the network are called a cluster set. In this study, the selected initial clusters are referred to as the initial cluster set, which is taken as the current optimal cluster set, and the objective function value of the current optimal cluster set is calculated. Each cluster in the current optimal cluster set is randomly changed by the MGWO to generate a new cluster, and a plurality of the new clusters forms a new cluster set; the objective function value of the new cluster set is then calculated. When the value of the objective function of the current optimal cluster set is greater than that of the new one, the new cluster set is accepted as the current optimal cluster set; otherwise, the new cluster set is accepted as the current optimal cluster set in a probabilistic manner. At the end of the iteration, the optimal cluster set is formed. The objective function is defined as where a 3 is the weight, d TCH is the total intra-cluster communication distance, and d TBS is the total communication distance from the CH to the BS. The objective function is designed based on the total inter-cluster communication distance and the total communication distance from the CH to the BS. A smaller objective function value indicates that the CH selection is more reasonable, the CH is optimal in the cluster, and the CH set is also optimal relative to the whole network. The HMGWO pseudo-code is described in Algorithm 2.

Simulation Results
To verify its performance, MATLAB software (MathWorks, USA) was used to simulate the HMGWO algorithm. Under the same experimental conditions, the HMGWO algorithm was compared with the SEP, DEEC, M-SEP, and FIGWO algorithms. The main simulation parameters are shown in Table 2. At the beginning of each round, the BS selects CHs using the algorithm proposed in this article, and each round last for 1 s. In this paper, we used the network lifetime, residual energy and the number of packets received by the BS as the evaluation indexes of the algorithm performance. However, since the HMGWO, SEP, DEEC, M-SEP, and FIGWO protocols are all single-hop routing protocols, the end-to-end delay was not taken into account in the indicator evaluation. The following is an explanation of indicators and nouns.

1.
Network lifetime: The period of time between the beginning of the network operation and the death of the first node, which is also known as the network stability period.

2.
Network instability period: The time duration between the death of the first node and the death of all nodes. 3.
Total residual energy: The sum of the remaining energy of all survival nodes. To facilitate the comparison of the total residual energy, the percentage of the total residual energy is used to represent the total residual energy [10,20], which is calculated as follows: where p t (r) is the percentage of the total residual energy in the r th round; E totR (r) is the total residual energy in the r th round; and N j=1 E i ( j) is the sum of the initial energies of all the nodes.

4.
Residual energy deviation: The deviation between the node with the most residual energy and the node with the least residual energy [46]. The residual energy deviation in the r th round is calculated as follows: where E MaxR (r) and E MinR (r) are the maximum and the minimum residual energy of the node in the network the r th round. However, to facilitate the analysis of the simulation results, the residual energy deviation is compared in percentage units.

5.
Throughput: The number of data packets received by the BS. Figure 2 shows that the HMGWO was superior to other protocols in terms of its network lifespan for different proportions of advanced nodes obtained from the simulation. When the ratio of advanced nodes is α = 0.1, it can be seen in Figure 2a that the network lifespan of the HMGWO increased by 55.7%, 31.9%, 46.3%, and 27.0%, respectively, relative to SEP, DEEC, M-SEP, and FIGWO. When the ratio of advanced nodes is α = 0.2, it can be seen in Figure 2b that the network lifespan of the proposed algorithm is increased by 51.2%, 32.0%, 24.6%, and 32.8%, respectively, relative to SEP, DEEC, M-SEP, and FIGWO. The main reason for the above results is that the nodes with low residual energies have a low probability of becoming the CH, which avoids the phenomenon of rapid death of the node with low residual energy, thus extending the life cycle of the network.

Network Lifetime
HMGWO was shorter than that of other protocols. The main reason for this is that the HMGWO considers the heterogeneity of nodes and the global optimization ability, such that the energy consumption between nodes is relatively balanced, and the network instability period of the HMGWO is shorter. The specific reasons are discussed in detail in Section 6.  Figure 4a,b show the number of data packets received by the BS. When the ratio of advanced nodes is = 0.1, it can be seen from Figure 4a that the network throughput of HMGWO increased by 150.4%, 142.5%, 70.5%, and 17.6%, respectively, relative to SEP, DEEC, M-SEP, and FIGWO. When the ratio of advanced nodes is = 0.2, it can be seen in Figure 4b that the network throughput of the proposed algorithm improved by 142.7%, 124.8%, 70.9%, and 32.8%, respectively, relative to SEP, DEEC, M-SEP and FIGWO. This is because the HMGWO algorithm adopts a more reasonable CH selection mechanism, and the nodes have a longer survival time. When the number of rounds is the same, the number of surviving nodes in the network is higher than that of other algorithms, so more data groups are generated, and the number of data packets received by the BS also increases. The simulation results of the network life cycle for when the first node died, 50% of the nodes died and all nodes died are shown in Figure 3 at different proportions of advanced nodes. As can be seen from Figure 3, the death time of the first node was later than that of other protocols. When the number of death nodes reached 50%, the network lifetime of HGWO was significantly extended compared to that of the SEP, FIGWO, DEEC, and M-SEP protocols. However, as can be seen in Figure 3a,b, when all nodes died, the lifetimes of SEP and FIGWO were greater than that of the algorithm proposed in this paper. It can be clearly seen in Figure 3 that the network instability period of the HMGWO was shorter than that of other protocols. The main reason for this is that the HMGWO considers the heterogeneity of nodes and the global optimization ability, such that the energy consumption between nodes is relatively balanced, and the network instability period of the HMGWO is shorter. The specific reasons are discussed in detail in Section 6.

Number of Packets Received by the BS
Sensors 2020, 20, x FOR PEER REVIEW 11 of 16 HMGWO was shorter than that of other protocols. The main reason for this is that the HMGWO considers the heterogeneity of nodes and the global optimization ability, such that the energy consumption between nodes is relatively balanced, and the network instability period of the HMGWO is shorter. The specific reasons are discussed in detail in Section 6.
(a) (b)   Figure 4a,b show the number of data packets received by the BS. When the ratio of advanced nodes is = 0.1, it can be seen from Figure 4a that the network throughput of HMGWO increased by 150.4%, 142.5%, 70.5%, and 17.6%, respectively, relative to SEP, DEEC, M-SEP, and FIGWO. When the ratio of advanced nodes is = 0.2, it can be seen in Figure 4b that the network throughput of the proposed algorithm improved by 142.7%, 124.8%, 70.9%, and 32.8%, respectively, relative to SEP, DEEC, M-SEP and FIGWO. This is because the HMGWO algorithm adopts a more reasonable CH selection mechanism, and the nodes have a longer survival time. When the number of rounds is the  150.4%, 142.5%, 70.5%, and 17.6%, respectively, relative to SEP, DEEC, M-SEP, and FIGWO. When the ratio of advanced nodes is α = 0.2, it can be seen in Figure 4b that the network throughput of the proposed algorithm improved by 142.7%, 124.8%, 70.9%, and 32.8%, respectively, relative to SEP, DEEC, M-SEP and FIGWO. This is because the HMGWO algorithm adopts a more reasonable CH selection mechanism, and the nodes have a longer survival time. When the number of rounds is the same, the number of surviving nodes in the network is higher than that of other algorithms, so more data groups are generated, and the number of data packets received by the BS also increases.   Figure 5a,b show the total residual energy of the network, from which it is obvious that the HMGWO had a higher total residual energy, which means that the network could survive for a longer period of time. As can be seen in Figure 5c,d, the HMGWO effectively reduced the residual energy deviation compared with other algorithms. In HMGWO, the nodes with more remaining energy were selected as the CH more frequently than the nodes with less remaining energy, so the residual energy deviation was more balanced relative to other protocols. The percentage of the total residual energy and the residual energy difference can be calculated by Equations (19) and (20), respectively.   Figure 5a,b show the total residual energy of the network, from which it is obvious that the HMGWO had a higher total residual energy, which means that the network could survive for a longer period of time. As can be seen in Figure 5c,d, the HMGWO effectively reduced the residual energy deviation compared with other algorithms. In HMGWO, the nodes with more remaining energy were selected as the CH more frequently than the nodes with less remaining energy, so the residual energy deviation was more balanced relative to other protocols. The percentage of the total residual energy and the residual energy difference can be calculated by Equations (19) and (20), respectively.   Figure 5a,b show the total residual energy of the network, from which it is obvious that the HMGWO had a higher total residual energy, which means that the network could survive for a longer period of time. As can be seen in Figure 5c,d, the HMGWO effectively reduced the residual energy deviation compared with other algorithms. In HMGWO, the nodes with more remaining energy were selected as the CH more frequently than the nodes with less remaining energy, so the residual energy deviation was more balanced relative to other protocols. The percentage of the total residual energy and the residual energy difference can be calculated by Equations (19) and (20), respectively.

Discussion
The HMGWO protocol outperformed the SEP, DEEC, M-SEP, and FIGWO protocols in terms of the network lifecycle, power consumption, and throughput. The reasons are proved and summarized as follows.
The SEP, DEEC, and M-SEP protocols are the routing protocols of HWSNs that do not consider the distance from node to BS when selecting CHs, and they just define the different threshold formulas of CH selection relative to the advanced nodes and normal nodes. In SEP, the selection of CHs in the whole network is inherently random. This makes the distribution of CHs in each round uneven, which will lead to unbalanced energy consumption among nodes and to the premature death of some nodes. Although the high-energy node in SEP has a higher probability of becoming the CH than the ordinary node, due to the random selection of the CHs, the frequency of some high-energy nodes becoming CHs may be small. This results in the slow death rate of the high-energy node, which extends the network instability period and the energy consumption among the nodes is not balanced. When the DEEC protocol selects the CH, the residual energy and the average energy of the network node are taken into account in the weighted election probability. That is to say, when the residual energy of the node is larger than the average energy, the average probability of the node being selected as the CH is higher than that of the node whose residual energy is smaller than the average energy; thus, the energy consumption between advanced nodes and ordinary nodes is reduced, and the life cycle of the network is extended, compared with SEP. In the model of CH selection, compared with DEEC, M-SEP applies the residual energy of nodes and the average energy of network nodes to the threshold formula. However, the weighted election probability does not change. When the residual energy is greater than the average energy, the threshold value of the node is relatively large, but there is still some randomness when the node is selected as the CH. The network lifetime is improved relative to the SEP protocol, but it is not optimized compared to DEEC in terms of residual energy deviation. On the contrary, our algorithm uses the fitness function (F 1 ) to select the initial CHs. The fitness function (F 1 ) takes into account the residual energy of the node and the distance from the node to the BS, so that the initial CHs are more evenly distributed in the network, and overcomes the randomness of the selected CHs. However, the selection of the initial cluster further fixes and limits the number of CHs, and provides the search range of the MGWO algorithm, thereby reducing the energy consumption in unnecessary clustering, saving network energy consumption, and prolonging the network life cycle.
The GWO is used to select the CHs in FIGWO. When using the GWO algorithm for the selection of CHs, the selection of the final CHs is carried out according to the Euclidean distance (at the end of the iteration, the algorithm can estimate the prey's position (here, the prey is called a virtual CH), and all nodes will calculate the distance to the virtual CH; the node with the smallest distance will be selected as the CH). In the algorithm proposed in this paper, the final CHs are selected by the fitness function (F 2 ), which is designed by the energy of the node and the distance from the node to the virtual CH. The node with high residual energy and nearest distance have a greater probability as the final CH, thereby avoiding the node with low remaining energy to become the final CH, and further reducing the energy consumption deviation of the network. For FIGWO, the GWO is used to select the final CH within each initial cluster, and the selected CH is only optimal in the cluster; however, the set of final CHs may not be optimal throughout the whole network. The FIGWO algorithm only solves the local optimal problem of CHs but does not consider the optimal problem of CHs in the whole network. However, the HMGWO adds a new algorithm to the outer layer of MGWO. The outer layer algorithm selects the bad set of CHs with a certain probability as the initial cluster set of the next iteration, so that the selected CH set is not only the best in the cluster, but also, the final CH set is the best in the whole network. HMGWO makes rational use of the heterogeneity of nodes, solves the randomness of CHs, and considers the global optimality of CHs. Therefore, HMGWO is superior to the SEP, DEEC, M-SEP, and FIGWO protocols in terms of the energy consumption, network life cycle, and throughput.

Conclusions
In this study, we proposed a routing protocol for HWSNs based on a modified grey wolf optimizer. By defining different fitness functions for both advanced nodes and normal nodes and modifying the GWO, this algorithm can ensure that advanced nodes in the cluster are more likely to be selected as CHs. Consequently, the burden of low-energy nodes being selected as CHs can be reduced, and the network lifetime can be extended. The simulation results show that, compared with traditional SEP, DEEC, M-SEP, and FIGWO protocols, the energy consumption, lifetime, and throughput of the network are significantly improved. In the future, the proposed HMGWO will be extended to larger sensor networks to consider multi-hop communication between CHs, thus reducing energy consumption between a remote BS and CHs. However, when the CHs are selected, how to reasonably reduce the extra energy consumed by all sensor nodes to transmit their positions and energy to the base station, thus reducing the unnecessary energy consumption, will also be a focus of further study.