An Adaptive Link-Level Recovery Mechanism for Electric Power IoT Based on LoRaWAN

The electric power Internet of Things (IoT) is a network system that meets multiple requirements of the power grid, such as infrastructure, environment recognition, interconnection, perception and control. Long Range Wide Area Network (LoRaWAN), with its advantages of ultra-long transmission range and ultra-low power consumption, has become the most widely used protocol in the electric power IoT. However, its extremely simple star topology also leads to several problems. When most terminals depend on one or a few gateways for communication, the gateways with heavier communication tasks have poorer communication quality. The load across gateways is unbalanced, which hinders long-term network operation. In addition, the electric power IoT environment is characterized by complex terminal deployment, wide coverage, and strong interference. These characteristics make links more vulnerable and can even interrupt communication services. Therefore, this paper proposes an adaptive link-level recovery mechanism based on link adjustment for LoRaWAN. When a communication link fails, multiple candidate links are selected based on Quality of Service (QoS) requirements and the distribution of LoRaWAN gateways and repeaters. The final link is then chosen from these candidates as follows. To account for network load balancing, a Link Recovery Adaptive algorithm based on the Kuhn-Munkres algorithm (LRAKM) is designed from the perspective of fault tolerance. For communication tasks whose optimal link passes through one or more overloaded gateways, the method adaptively moves them to a suboptimal link. This adaptive adjustment makes the network load more balanced. Simulation results show that LRAKM achieves a higher link recovery rate and a more balanced network in both sparse and dense environments. Furthermore, when the network load is heavier, LRAKM still balances the network load and improves the link recovery rate.


Introduction
We propose an adaptive link-level recovery mechanism from the perspective of fault tolerance. This mechanism, which uses repeaters as backup links, aims to eliminate critical nodes and ensure network load balancing. We propose a system for quantitatively scoring gateways by QoS, latency, duty cycle, and distance from the terminal. We also propose a complementary virtual node model to match terminals with gateways or repeaters.
The rest of this paper is organized as follows. Section II introduces the related work and LoRa applications in power scenarios. Section III describes the network model used to define the scenario. Section IV presents the KM-based link recovery adaptive algorithm. Section V shows the simulation results for the link failure recovery rate in two different scenarios. Section VI summarizes this paper.

Related Work
In recent years, with the rapid development of low-power wide-area networks (LPWANs), many improvements to them have been proposed. To improve LoRa coverage and stability, many studies have introduced multi-hop structures into LoRa.
Reference [8] developed a multi-hop network combining LoRa and concurrent transmission, which can significantly improve network efficiency. It proposed the offset-CT method to prevent the timing offset from diverging over the multi-hop network. Reference [9] presents the prototype design and testing of a long-range, self-powered IoT device. The coverage area and range can be extended significantly by deploying the devices in a multi-hop network topology. Reference [10] used multi-hop LoRa configurations to extend the range and increase the energy efficiency of the network, and also considered the optimal repeater placement for two or three hops. Reference [11] reports signal strength measurements for inter-building LoRa links and provides insights into factors that affect signal quality, such as the spreading factor. Reference [12] develops a new receiver structure that enables superposed LoRa signals with different odd/even SFs to be demodulated simultaneously based on the capture effect. Simulations verify that, by utilizing the capture effect, the proposed protocol can partly resolve the collisions caused by numerous access attempts, which enhances throughput compared to LoRaWAN.
In addition, to increase the transmission rate and throughput of LoRa, many papers propose hybrid structures that combine LoRa with other communication protocols to compensate for the structural deficiencies of LoRa, and apply them to the smart grid.
Reference [13] proposes an energy-efficient network topology and an efficient time division multiple access protocol used in conjunction with LoRa. The on-demand Time Division Multiple Access (TDMA) protocol provides more efficient broadcast and unicast services for data transmission, improving the performance of traditional LoRa networks. Reference [14] proposed an LPWAN communication structure for monitoring distribution networks and designed a 3G-LoRa-Sigfox hybrid communication architecture. The Sigfox network is used when priority is given to coverage, and the LoRa network is used when priority is given to energy consumption. Due to terrestrial structures in urban and suburban environments, the link distance of LoRa transmissions can be reduced. Reference [15] provides a data-driven comparison between LoRa and a variant of frequency-hopping based modulation named Telegram Splitting Multiple Access. The network performance comparison is conducted through system-level simulations using multiple IoT applications, with the aim of satisfying the precise needs of heterogeneous applications and network deployments. Reference [16] introduces the application of LoRa technology in IoT scenarios and discusses its advantages in efficiency, effectiveness, and architectural design over established models, especially for typical smart city applications. Reference [17] examined the wireless access methods of both NB-IoT and LoRa against the requirements of novel IoT smart grid applications. Reference [18] proposed a communication architecture based on LoRaWAN Class B to assess its feasibility for the modern power grid.
Fig. 1 shows the logical structure of the scenario considered in this paper. Terminals are directly connected to a LoRa gateway. A repeater, which may be another functional device equipped with a LoRa module, is used as a backup link. The network is highly redundant [19]: many repeaters or LoRa gateways receive data from the same terminal, but only one repeater or LoRa gateway continues to process the data. When the current link fails, the alternate set consists of all the gateways (or repeaters) that receive the redundant data. This network model makes full use of existing equipment and improves the stability of the network.

Network Model
A network consisting of a finite number of N terminals, R repeaters and 2 LoRa gateways is considered in this paper. Let k denote the failure probability. For model tractability, each terminal sends one data packet of a given payload length at a random moment in each period T. In each period, all the terminals involved in the N × k failed links need to be adjusted.

KM-Based Link Recovery Adaptive Algorithm
The process of the KM-based link recovery adaptive algorithm proposed in this paper is as follows.
We first calculate the duty cycle of each gateway or repeater to determine how busy it is. The set of alternatives is then determined from the other gateways or repeaters that receive redundant data, which in turn reveals all available communication links. The candidate links are screened by delay, energy consumption, the adaptive rate attributes of the communication link, and the QoS requirements of the transmission. The links that meet the conditions are sorted by communication quality. Finally, taking the selections of all other concurrent communication links into account, the best communication link is adaptively matched to each failed terminal based on the ranking result.

Scoring System for Candidate Links
This paper quantifies and scores the QoS requirements of the terminal (upstream and downstream data flow requirements, delay, bandwidth, etc.). Let the function $\mathrm{mark}(r, n)$ be the final score of gateway or repeater $r$ for terminal $n$. The higher the score, the busier the gateway or repeater.

$$\mathrm{mark}(r, n) = \alpha\,P(SF, r) + \beta\,\mathrm{dis}(r, n) + \theta\,\mathrm{time}(r, n)$$

where $P(SF, r)$ represents the load rate of the gateway/repeater $r$ on the channel with the spreading factor $SF$. There are 6 different channels, which do not affect each other. The SF value ranges from 7 to 12. $\mathrm{dis}(r, n)$ indicates the distance from terminal $n$ to gateway or repeater $r$. $\mathrm{time}(r, n)$ indicates the transmission time between $r$ and terminal $n$ (using ADR). Delay and bandwidth are reflected in the transmission time. For other constraints, such as upstream and downstream data flow requirements and jitter, the mark value of a link that does not meet them is set to positive infinity.

This paper sets the spreading factor (SF) by calculating the distance between the terminal and the repeater or LoRa gateway. In general, adaptive rate adjustment is applied only when terminals are moving [20]. In this paper, although the position of each terminal is fixed, a change of link changes the relative position of the terminal and its LoRa gateway or repeater. Therefore, the rate needs to be adjusted to match the current link conditions. The value depends on the distance:

$$SF = \begin{cases} \;\vdots & \\ 9, & 4 < \mathrm{dis} < 6 \\ 10, & 6 < \mathrm{dis} < 8 \\ 11, & 8 < \mathrm{dis} < 10 \\ 12, & 10 < \mathrm{dis} < 12 \end{cases}$$

According to the characteristics of LoRa, when the bandwidth and code rate are fixed, the transmission rate of LoRa, and hence the adaptive rate (ADR), is determined by the spreading factor.
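As a minimal sketch of the scoring rule described above, the following Python computes a score as a weighted sum of load rate, distance and transmission time, and returns positive infinity when a hard QoS constraint is violated. The weight values, the function names and the `meets_qos` flag are assumptions made for illustration; the paper does not give numeric weights. The distance-to-SF mapping covers only the ranges recoverable from the text, and the value 9 for the 4 to 6 range is inferred from the pattern of the other rows.

```python
import math

# Illustrative weights for load rate, distance and transmission time.
# These values are assumptions; the paper does not state them.
W_LOAD, W_DIST, W_TIME = 1.0, 0.1, 0.5


def sf_for_distance(dis):
    """Map a terminal-to-receiver distance to a spreading factor.

    Only the ranges recoverable from the text are covered; the value 9
    for the 4-6 range is inferred. Bounds are strict, as in the text.
    Returns None outside the covered ranges.
    """
    table = [(4, 6, 9), (6, 8, 10), (8, 10, 11), (10, 12, 12)]
    for low, high, sf in table:
        if low < dis < high:
            return sf
    return None


def mark(load_rate, dis, airtime, meets_qos=True):
    """Score a gateway or repeater for one terminal (lower is better)."""
    if not meets_qos:
        return math.inf  # link violates a hard constraint such as jitter
    return W_LOAD * load_rate + W_DIST * dis + W_TIME * airtime


# Hypothetical candidates for one failed terminal.
candidates = {
    "gateway_A": mark(0.60, 5.0, 0.35),
    "repeater_B": mark(0.10, 7.0, 0.62),
    "repeater_C": mark(0.05, 3.0, 0.20, meets_qos=False),
}
best = min(candidates, key=candidates.get)
```

With these assumed weights, the lightly loaded `repeater_B` scores lower than the closer but busier `gateway_A`, which is the kind of trade-off the score is meant to expose.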
LoRa packet time is equal to the sum of the preamble time and the packet transmission time, where the preamble time is calculated from the length of the preamble. LoRa uses pure ALOHA for transmission [21]. If two data packets collide, the total time taken by the two packets is the interval from the moment the earlier packet starts to be sent to the moment the later packet ends. The load factor (P) is derived from this channel occupancy.
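The packet time described above can be illustrated with the time-on-air formula published in the Semtech SX127x datasheets. This sketch is not taken from the paper: the 125 kHz bandwidth, coding rate 4/5, 8-symbol preamble, explicit header and CRC are assumed defaults, the payload of 9 is interpreted as 9 bytes, and `channel_load` is a hypothetical helper that estimates load as total airtime divided by the period.

```python
import math


def symbol_time(sf, bw_hz=125_000):
    """Duration of one LoRa symbol in seconds: 2^SF / BW."""
    return (2 ** sf) / bw_hz


def time_on_air(payload_bytes, sf, bw_hz=125_000, cr=1,
                n_preamble=8, explicit_header=True, crc=True,
                low_dr_opt=False):
    """Packet time = preamble time + payload time (Semtech formula).

    cr is the coding-rate index 1..4, meaning 4/5 .. 4/8.
    """
    t_sym = symbol_time(sf, bw_hz)
    t_preamble = (n_preamble + 4.25) * t_sym
    ih = 0 if explicit_header else 1
    de = 1 if low_dr_opt else 0
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    n_payload = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0)
    return t_preamble + n_payload * t_sym


def collision_span(start_a, start_b, airtime):
    """Time occupied by two overlapping packets under pure ALOHA:
    from the start of the earlier packet to the end of the later one."""
    first, last = sorted((start_a, start_b))
    return (last + airtime) - first


def channel_load(n_terminals, airtime, period_s):
    """Hypothetical load estimate: total airtime offered per period."""
    return n_terminals * airtime / period_s


# Assumed example: 9-byte payload at SF7, one packet per hour.
toa = time_on_air(payload_bytes=9, sf=7)
load = channel_load(1000, toa, 3600)
```

Because the symbol time doubles with each SF step, moving a terminal to a higher SF increases its airtime and therefore the load it places on the receiving gateway or repeater.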

Complementary Matching Model
The Kuhn-Munkres (KM) algorithm can be used to find maximum-weight matchings in bipartite graphs [22]. It is based on the Hungarian algorithm. The traditional KM algorithm can only deal with complete graph problems. However, in this scenario, the number of sensors and gateways is unequal, and multiple sensors can correspond to the same gateway. To handle these features, the improved model proposed in this paper adds virtual nodes on the gateway side and the sensor side. If a terminal sensor node cannot be connected to a gateway, the weight of the edge between them is positive infinity. For every other edge, the weight is its mark value. The weights of the edges between a virtual terminal sensor node and any of the gateway nodes are all zero. The difference between the actual scene and the model is shown in Fig. 2. The left side shows the original correspondence in the actual scene. The right side shows the improved complementary matching model. When adding virtual nodes, this paper first considers the nodes on the gateway and repeater side, because a gateway with a lower duty cycle has more positions available for new terminals. In the complementary matching model, the number of nodes corresponding to a gateway is the number of remaining positions of that gateway. When several virtual nodes of the same gateway are connected to the same terminal-side node, all of these edge weights are the same.
For each gateway $r$, the number of gateway-side nodes equals the number of remaining positions of $r$. The number of virtual nodes to be added on the terminal side is then the number needed to make both sides equal in size. Let $G(X, Y)$ represent the complementary matching model. It is a bipartite graph with $|X| = |Y|$, $\omega_{ij}$ is the weight of edge $x_i y_j$, and $l(v)$ is the label of node $v$. A node labeling $l(v)$ is called a feasible vertex labeling when $l(x) + l(y) \ge \omega(xy)$ for all $x \in X$, $y \in Y$, where $\omega(xy)$ represents the weight of edge $xy$. If $l(v)$ is a feasible vertex labeling and the subgraph formed by the edges with $l(x) + l(y) = \omega(xy)$ contains a perfect matching, then that matching is a maximum-weight matching of $G$.

A matrix is used to represent the result. If two elements are matched, the corresponding value in the matrix is one; all other values are zero. After all terminals are matched, the weight of each matching edge is checked. If it is positive infinity, the corresponding terminal could not be recovered.
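The complementary matching model can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: each receiver is expanded into one column per remaining position, virtual terminals with zero-weight edges pad the matrix to a square, and a large finite constant `BIG` stands in for positive infinity. Because a lower mark is better, the sketch uses the minimum-cost form of the Hungarian method with vertex potentials, which is equivalent to maximum-weight KM on negated weights. All function names are hypothetical, and the total number of free positions is assumed to be at least the number of failed terminals.

```python
BIG = 10 ** 9  # finite stand-in for "positive infinity" (unreachable link)


def hungarian(cost):
    """Minimum-cost perfect matching on a square cost matrix.

    Returns match[i] = column assigned to row i.
    """
    n = len(cost)
    u = [0.0] * (n + 1)      # row potentials (vertex labels)
    v = [0.0] * (n + 1)      # column potentials (vertex labels)
    owner = [0] * (n + 1)    # owner[j] = row matched to column j (1-based)
    way = [0] * (n + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        owner[0] = i
        j0 = 0
        minv = [float("inf")] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = owner[j0], float("inf"), 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[owner[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if owner[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the new row
            j1 = way[j0]
            owner[j0] = owner[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[owner[j] - 1] = j - 1
    return match


def build_matrix(marks, free_slots):
    """marks[t][r]: score of receiver r for terminal t (BIG if unreachable).
    free_slots[r]: remaining positions of receiver r."""
    columns = [r for r, slots in enumerate(free_slots) for _ in range(slots)]
    if len(columns) < len(marks):
        raise ValueError("sketch assumes enough free positions overall")
    rows = [[terminal[r] for r in columns] for terminal in marks]
    # Virtual terminals: zero-weight edges to every receiver-side node.
    rows += [[0] * len(columns) for _ in range(len(columns) - len(marks))]
    return rows, columns


def recover(marks, free_slots):
    """Return the receiver index per terminal, or None if not recovered."""
    matrix, columns = build_matrix(marks, free_slots)
    match = hungarian(matrix)
    result = []
    for t in range(len(marks)):
        r = columns[match[t]]
        result.append(r if marks[t][r] < BIG else None)
    return result


# Three failed terminals, two receivers; terminal 2 can only reach receiver 0.
marks = [[1.0, 4.0], [2.0, 3.0], [1.5, BIG]]
assignment = recover(marks, free_slots=[1, 3])
```

In this example receiver 0 has a single free position, so terminals 0 and 1 are moved to their suboptimal receiver 1 in order to leave that position for terminal 2, which has no alternative.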

Simulation Results and Discussion
In the power IoT scenario, a LoRa gateway often connects hundreds or even thousands of terminals. The packet transmission rate varies from one to two hundred packets per day. The number of connections that one LoRa gateway can support is closely related to the packet transmission rate. In the electric power IoT environment, supervisors need to quickly sense the current state of the power environment and respond rapidly to changes. Two different scenarios are defined in this paper. In cities, a large amount of existing power equipment can be modified to provide repeater functionality, which increases the network's fault tolerance. This corresponds to the dense repeater scenario. In areas with harsher natural conditions, such as wilderness or mountainous areas, the original power infrastructure is relatively poor and it is more difficult to add new dedicated repeater equipment. This corresponds to the sparse repeater scenario.
The first scenario is the sparse one. In the initial state, all terminals are connected to a LoRa gateway. When the link of a terminal that is directly connected to a gateway fails, the mechanism first checks whether another LoRa gateway can be reached. If so, all channels with an SF value greater than or equal to the current SF are considered. If none of them meets the requirements, the mechanism looks for a repeater.
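The fallback order described above (other gateways first, then repeaters) can be sketched as follows. The `Receiver` class, the `usable_sfs` field and the function name are hypothetical: `usable_sfs` stands for the set of spreading factors on which a receiver is assumed to satisfy the link requirements. The text does not state an SF restriction for repeaters, so this sketch assumes that any usable SF is allowed on a repeater.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    name: str
    is_gateway: bool
    # Spreading factors on which this receiver can accept the terminal
    # (assumed to be the outcome of the QoS screening).
    usable_sfs: frozenset


def find_recovery_candidates(current_sf, failed_receiver, receivers):
    """Return candidate (receiver name, SF) pairs for a failed link.

    Other gateways are tried first, on channels whose SF is greater than
    or equal to the current SF. Repeaters are considered only if no
    gateway channel qualifies.
    """
    def options(group, min_sf):
        return [(r.name, sf)
                for r in group if r.name != failed_receiver
                for sf in sorted(r.usable_sfs) if sf >= min_sf]

    gateways = [r for r in receivers if r.is_gateway]
    repeaters = [r for r in receivers if not r.is_gateway]
    # Assumption: no SF lower bound is applied to repeaters (7 = minimum SF).
    return options(gateways, current_sf) or options(repeaters, 7)


receivers = [
    Receiver("GW1", True, frozenset({9, 10})),
    Receiver("GW2", True, frozenset({8})),
    Receiver("RP1", False, frozenset({7, 8})),
]
via_repeater = find_recovery_candidates(9, "GW1", receivers)
via_gateway = find_recovery_candidates(8, "GW1", receivers)
```

The returned pairs are only candidates; choosing among them for all failed terminals at once is the job of the matching step.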
In the sparse repeater model set in this paper, there are two LoRa gateways and six repeaters. All terminals are evenly distributed in a rectangular area of 16 km × 20 km. The period is 3600 seconds/packet, that is, each terminal sends a data packet every hour. The payload is 9. Fig. 5 shows the link recovery rate for 1000 to 8000 terminals. The upper three curves in Fig. 5 form a group, showing the impact of three different recovery strategies on the link recovery rate when the failure rate is 0.01 (lower). As the number of terminals increases, the recovery rate gradually decreases. The lower three curves in Fig. 5 form a group, showing the impact of three different recovery strategies on the link recovery rate when the failure rate is 0.03 (higher). The LRAKM algorithm has the most obvious advantage when the number of terminals is between 2000 and 7000. Below 2000 terminals, the network load is light. Above 7000 terminals, the network load is too heavy: each gateway has reached its load limit, so the improvement from the algorithm is limited. Fig. 6 shows the proportion of energy consumed by each repeater. The LRAKM algorithm significantly improves load balancing compared to the distance-first algorithm.
In the case of dense repeaters, all terminals are connected to a LoRa gateway in the initial state. When a link fails, the mechanism first checks whether another LoRa gateway can be reached. If so, all channels with an SF value greater than or equal to the current SF are considered (the distance between the new gateway and the terminal will not be significantly less than the original distance). If none of the SF channels of the other gateways meets the requirements, the mechanism looks for a repeater. In the dense repeater model, there are two LoRa gateways and twenty-four repeaters. All terminals are evenly distributed in a rectangular area of 16 km × 20 km. Either 1% (upper three curves in Fig. 7) or 3% (lower three curves in Fig. 7) of the links fail. The period is 3600 seconds/packet. The payload is 9. Fig. 7 shows the link recovery rate in the case of dense repeaters. The link recovery rate of the mechanism using the LRAKM algorithm is higher than that of the other two algorithms, and the link recovery rate of all algorithms remains above 90%.
This paper also studies scenarios with a heavier network load to test whether the mechanism can be applied to them. Either 1% (the upper three curves in Fig. 8) or 3% (the lower three curves in Fig. 8) of the links fail, with a period of 1800 seconds/packet, that is, each node sends a data packet every half hour. The payload is 9.

Conclusion
LoRa technology is widely used in power IoT due to its advantages such as ultra-long transmission range and large-scale connectivity. When some terminals are unable to communicate due to link failure, new links need to be planned. If all terminals are connected to their optimal links, some gateways become critical nodes and the network load becomes uneven, which causes a series of problems and affects the long-term stable operation of the network. To address these problems, this paper proposes an adaptive link-level recovery mechanism from the perspective of fault tolerance. When a link failure occurs, all alternative links are first searched. These alternative links can either pass through repeaters or connect directly to a gateway. With the KM-based LRAKM algorithm proposed in this paper, the optimal links of some terminals are adjusted to sub-optimal links while the communication demand is still met. Across all terminals in the network, this adjustment achieves network load balancing and improves the link failure recovery rate. Simulation results show that the mechanism improves the link failure recovery rate and load balancing compared to the distance-first and idle-first mechanisms in both the sparse and the dense repeater scenarios. Furthermore, the mechanism still performs well under heavier network load.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.