Cluster-based flow control in hybrid software-defined wireless sensor networks

of handling the individual flows of each node, the SDN controller only manages incoming and outgoing traffic flows of clusters through border nodes, while the flows inside each cluster are controlled by a distributed legacy WSN routing algorithm. Our proof-of-concept implementations in both software and hardware show that our approach is efficient with respect to reducing the number of nodes that must be managed and the number of control messages. In comparison to benchmark solutions with and without clustering, our solution reduces communication costs for flow configuration in an SD-WSN by at least 27% and at most 88%, respectively, without degrading packet delay or delivery rate.


Introduction
Software-Defined Networking (SDN), in comparison to traditional networking, provides improved flexibility and reduced complexity when it comes to flow management [2,3]. Given the advantages and large-scale adoption of SDN within data-center networks and wide-area networks, a logical question is whether the same advantages can be expected when SDN is introduced within a wireless sensor network (WSN) [4]. However, most SDN research is focusing on wired networks, and only a few initiatives have attempted to extend the benefits of SDN to the wireless domain [5,6].

✩ An earlier 8-page version (Liu et al., 2019, [1]) of this paper was presented at the IEEE Wireless Communications and Networking Conference (WCNC), April 2019. Compared to the earlier version, we have (1) restructured and rewritten the entire paper; (2) redefined our proposed system model; (3) provided examples plus a mathematical proof w.r.t. the performance of our clustering algorithm; (4) performed five additional novel experiments that demonstrate the efficacy of our solution; (5) extended the related work section; and (6) added a section about future work.
A WSN typically consists of resource-constrained sensor nodes for monitoring the physical conditions of the environment, while the SDN paradigm provides a simple and flexible control approach to communication networks [7]. The confluence of these techniques is called Software-Defined Wireless Sensor Networks (SD-WSNs) [8][9][10]. Fig. 1 illustrates a generic architecture of an SD-WSN. In that architecture, the sensor nodes only perform packet forwarding, while all the control-plane operations, such as flow routing [11], Quality-of-Service (QoS) control [12], and load balancing [13], are performed by a logically centralized controller. Compared to the distributed control of a WSN, an SDN controller is able to manage and optimize WSN performance, such as energy consumption and communication flow, based on a global view of the entire network.

https://doi.org/10.1016/j.comnet.2020.107788
Received 9 July 2020; Received in revised form 1 December 2020; Accepted 29 December 2020
To implement an SD-WSN, the SDN architecture for wired networks must be mapped to WSN, which involves several difficulties:
• To achieve fine-grained flow control granularity, most existing SDN architectures require frequent message exchange between the data plane and the control plane [14]. Although this overhead is often acceptable in wired networks, the case for WSN is different. In a WSN, the control and data flows share the same wireless channel. Given that most wireless channels have limited bandwidth (in comparison to wired networks), the SDN control flows may significantly interfere with the data flows. For example, a burst of control packets requesting new flow table entries could stress the available wireless bandwidth.
• Nodes in an SD-WSN cannot completely decouple the data plane and control plane. In a typical SD-WSN architecture, the nodes and SDN controllers do not have wired connections. They transmit data via multi-hop wireless communication. Therefore, the nodes have to maintain a distributed local routing table for finding the SDN controller and receiving routing commands.
The observations above imply that the SDN architecture in wired networks cannot directly be applied to a WSN. Instead, to take advantage of the concept of SDN within a WSN, we need to balance the benefits and the communication overhead of SDN. Compared with a pure SDN paradigm, a hybrid SDN contains a mix of centralized SDN control and legacy network control, and thus shows the benefits of both paradigms [15][16][17]. Therefore, we aim to leverage hybrid SDN solutions to solve the above difficulties.
In this paper, we propose a cluster-based flow control approach called CluFlow. CluFlow is a hybrid SDN solution. It takes advantage of distributed legacy WSN routing and centralized SDN routing. Meanwhile, it makes a trade-off between the granularity of flow control and the communication overhead induced by the SDN controller. The properties of CluFlow are twofold. Firstly, CluFlow adopts network clustering to control traffic flows on the cluster level instead of at the level of individual nodes, which decreases the number of nodes and messages that are involved in flow control within an SD-WSN. Secondly, CluFlow makes SDN control work in parallel with distributed routing. The nodes inside the clusters use only distributed local routing and do not need to request flow table entries from the SDN controller. The communication delay caused by requesting flow table entries therefore decreases. Compared with existing SD-WSN solutions [18], the novelty of CluFlow is in two aspects. Firstly, we propose a solution for SD-WSN that combines centralized SDN-based routing among clusters and distributed legacy routing inside clusters. Secondly, to the best of our knowledge, this is the first work that partitions a network into clusters with minimum cluster border nodes.
In this paper, we realize the proposed design and provide the following main contributions:
• We take a graph-theoretic approach for clustering the network with the goal of minimizing the number of border nodes. The SDN controller manages communication by monitoring and controlling the border nodes of clusters.
• We propose a priority scheme to coordinate legacy WSN routing and SDN control, where cluster-level routing performed by the SDN controller has a higher priority than legacy routing. This hierarchical routing decreases the communication overhead of SDN control in WSNs.
• We implement an SD-WSN in simulation and real deployments, in which SDN control operates together with legacy distributed routing protocols.
This paper is organized as follows. The system model is presented in Section 2. Our solution of cluster-based flow control for hybrid SD-WSNs is addressed in Section 3. The simulation and hardware experiments are presented in Section 4 and the related work is discussed in Section 5. The future research directions are discussed in Section 6. Finally, we conclude this article in Section 7.

System model
We represent the network as an undirected graph $G = (V, E)$, in which $V = \{v_1, \dots, v_i, \dots, v_n\}$ represents the set of $n = |V|$ nodes and $E = \{e_1, \dots, e_j, \dots, e_m\}$ represents the set of $m = |E|$ edges. The nodes in the network transmit data via multi-hop communication. The nodes that share an edge are called neighbors. Suppose the set of nodes $V$ is partitioned into clusters $C = \{c_1, \dots, c_i, \dots, c_k\}$ with $k = |C|$. We make the following system assumptions with respect to our cluster-based flow control solution:
• We assume that there is one central SDN controller that is responsible for partitioning the network (in practice this could be multiple logically centralized controllers). Each node in the WSN reports its neighbor connectivity to the SDN controller. The SDN controller builds the WSN topology and partitions the network.
• Our solution targets a static network topology. Once the topology of the WSN changes, the nodes would report the new connectivity to the controller, and the controller would re-partition the new topology into clusters.
Cluster Head Nodes: To set the number and position of clusters, we specify cluster head nodes $\{h_1, \dots, h_i, \dots, h_k\}$, in which $k$ is the total number of clusters. Each head node must reside in a cluster. We require that $\{h_1, \dots, h_i, \dots, h_k\}$ are disconnected, which means there are no edges connecting any pair of cluster head nodes.
Cluster Border Nodes: If node $v_j$ belongs to cluster $c_i$ and one of its neighbor nodes belongs to another cluster, then we call $v_j$ a border node of cluster $c_i$. We refer to all the border nodes of cluster $c_i$ as node set $B_i$.
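The border-node definition above can be illustrated with a minimal Python sketch. The function name `border_nodes`, the edge-list input, and the toy topology are assumptions made for illustration only; they are not taken from the CluFlow implementation.

```python
from collections import defaultdict

def border_nodes(edges, cluster_of):
    """Return {cluster label: set of border nodes} for an undirected graph.

    edges      -- iterable of (u, v) node pairs
    cluster_of -- dict mapping every node to its cluster label
    """
    borders = defaultdict(set)
    for u, v in edges:
        if cluster_of[u] != cluster_of[v]:
            # The edge crosses a cluster boundary, so each endpoint has
            # a neighbor in another cluster and is a border node.
            borders[cluster_of[u]].add(u)
            borders[cluster_of[v]].add(v)
    return dict(borders)

# Hypothetical 5-node topology split into two clusters.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 5)]
cluster_of = {1: "c1", 2: "c1", 3: "c2", 4: "c2", 5: "c2"}
result = border_nodes(edges, cluster_of)
```

In this toy network, node 2 is the only border node of the first cluster, while nodes 3 and 5 are the border nodes of the second.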

Flow control in hybrid SD-WSNs
In this section, we present the design of CluFlow, including the solution overview, an algorithm for minimizing the number of border nodes, and a protocol for cluster-based SDN control.

Solution overview
In the design of CluFlow, centralized SDN control and decentralized legacy routing control coexist in the WSN. On the one hand, each WSN node operates legacy routing protocols. On the other hand, an SDN controller partitions the network into clusters and controls the communication flow among clusters. Specifically, the SDN controller sets routing rules at the cluster border nodes, which is called cluster-level routing, e.g. forward data flow from cluster $c_i$ to cluster $c_j$. In this condition, both legacy routing protocols and SDN routing control are performed in the border nodes.
To coordinate the hybrid routing control, we require the cluster-level routing rules to have higher priority than the local routing rules in the border nodes. For example, suppose $v_i$ and $v_j$ are two cluster border nodes that share an edge, with $v_i \in c_i$ and $v_j \in c_j$.
An example of cluster-level flow control is shown in Fig. 2. Suppose the distributed routing from $v_1$ to the SDN controller is $v_1 \to v_2 \to v_3 \to v_4 \to v_5$, as shown in Fig. 2(a). To use the cluster-level flow control, the network is partitioned into four clusters $c_1$, $c_2$, $c_3$, $c_4$ as shown in Fig. 2(b). The SDN controller sets the cluster-level routing rules in the border nodes of each cluster. The cluster-level routing rules are: (i) traffic flows between $c_1$ and $c_2$, $c_1$ and $c_3$, $c_2$ and $c_4$ are allowed; (ii) traffic flow between $c_3$ and $c_4$ is prohibited. So the routing from $v_1$ to $v_2$ does not fulfill the cluster-level routing, hence it is blocked. Thereafter, the border nodes of $c_4$ and $c_3$ rebuild their local routing tables. Finally, the route from $v_1$ to the SDN controller becomes […]
Based on the analysis above, we found that it is feasible to control the cluster-level data flow by cluster border nodes. This hybrid SD-WSN control brings benefits from the following perspectives.
• High scalability: The communication overhead caused by the SDN controller is scalable, which can be tuned by controlling the size of clusters.
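The priority scheme can be sketched as follows in Python. This is only an illustration under stated assumptions: the function `next_hop`, the representation of prohibited cluster pairs as a set of `frozenset` objects, and the node `v6` in cluster `c2` are hypothetical. In CluFlow the border nodes rebuild their local routing tables after a block; the sketch simplifies this to picking the first permitted neighbor.

```python
def next_hop(node, local_next_hop, neighbors, cluster_of, prohibited):
    """Choose the next hop at a border node.

    Cluster-level rules set by the controller take priority over the
    next hop computed by the local (legacy) routing protocol.
    """
    def allowed(u, v):
        pair = frozenset((cluster_of[u], cluster_of[v]))
        # Traffic inside one cluster is never restricted by cluster-level rules.
        return len(pair) == 1 or pair not in prohibited

    preferred = local_next_hop[node]
    if allowed(node, preferred):
        return preferred
    # The local choice violates a cluster-level rule: fall back to the
    # first neighbor whose cluster is permitted (simplified re-routing).
    for candidate in neighbors[node]:
        if allowed(node, candidate):
            return candidate
    return None  # no permitted neighbor

# Hypothetical border node v1 in c4 whose legacy next hop v2 lies in c3.
cluster_of = {"v1": "c4", "v2": "c3", "v6": "c2"}
neighbors = {"v1": ["v2", "v6"]}
local_next_hop = {"v1": "v2"}
prohibited = {frozenset(("c3", "c4"))}

blocked_choice = next_hop("v1", local_next_hop, neighbors, cluster_of, prohibited)
free_choice = next_hop("v1", local_next_hop, neighbors, cluster_of, set())
```

With the rule in place the packet is diverted to `v6`; without any cluster-level rule the legacy next hop `v2` is kept.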
However, the existing network clustering solutions cannot optimally partition the network and control the cluster border nodes for two reasons.
• Firstly, monitoring and controlling the cluster border nodes of all the clusters would cause replicated operations. For example, the network in Fig. 3(a) is partitioned into four clusters $c_1$, $c_2$, $c_3$ and $c_4$. In these clusters, the incoming flow to $c_1$ equals the sum of the outgoing flow from $c_2$ to $c_1$ and from $c_3$ to $c_1$. Therefore, there is no need to monitor all the cluster border nodes of $c_1$. Instead, we only need to monitor the cluster border nodes of $c_2$ and $c_3$ as shown in Fig. 3(b).
• Secondly, fewer border nodes means less control flow with the SDN controller. For example, the number of cluster border nodes in Fig. 3(c) is smaller than in Fig. 3(b). Although there are various methods for partitioning a network into clusters, to the best of our knowledge, none of them is suitable for our SD-WSN solution.
To cope with these problems, in the next section we present our approach to partition the network into clusters with a minimum number of cluster border nodes.

Problem definition
We formally define the problem of clustering with a minimum number of cluster border nodes as follows. Name the set of network nodes excluding the cluster head nodes in $G = (V, E)$ as $U$. Define $S$ as a set of nodes in $U$. We require that the network $G$ is partitioned into clusters after removing all the nodes in $S$, such that each cluster contains a cluster head node and any two clusters do not share a single edge. The aim is to select $S$ in $U$ with minimum $|S|$. The problem is expressed as
$$\min_{S \subseteq U} |S| \quad \text{s.t. removing } S \text{ from } G \text{ leaves mutually disconnected clusters, each containing a cluster head node.}$$
The problem above is a variant of the $k$-way node separators (NS) problem, which is known to be NP-hard for general graphs [19] and for which heuristic algorithms, e.g. [20], have been proposed. However, $k$-way NS algorithms cannot directly be used for our variant, because, to manage the flow of an SD-WSN, besides minimizing the number of separator/border nodes, the solution must have the following properties:
• The computational complexity must be small to enable the SDN controller to quickly find cluster border nodes after any network changes.
• The sizes of the partitioned clusters do not need to be balanced. We only require that each cluster head node resides in a cluster.
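To make the problem statement concrete, the following Python sketch checks whether a candidate set is a valid separator and finds a minimum one by exhaustive search. It follows the definition literally (every remaining component must contain exactly one head node) and is exponential in the number of nodes, so it only serves as a reference for tiny graphs; it is not the algorithm proposed in this paper. All function names and the toy graph are assumptions.

```python
from itertools import combinations

def components_without(adj, removed):
    """Connected components of the graph after deleting the nodes in `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def is_valid_separator(adj, heads, separator):
    """True if every remaining component contains exactly one head node."""
    comps = components_without(adj, separator)
    return all(sum(h in comp for h in heads) == 1 for comp in comps)

def min_separator_bruteforce(adj, heads):
    """Smallest valid separator drawn from the non-head nodes."""
    candidates = [v for v in adj if v not in heads]
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if is_valid_separator(adj, heads, set(subset)):
                return set(subset)
    return None

# Hypothetical graph: heads h1 and h2 are connected only through node c.
adj = {
    "h1": ["a", "b"], "h2": ["c"],
    "a": ["h1", "c"], "b": ["h1", "c"],
    "c": ["a", "b", "h2"],
}
separator = min_separator_bruteforce(adj, {"h1", "h2"})
```

Here removing the single node `c` leaves the clusters `{h1, a, b}` and `{h2}`, which share no edge.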

Algorithm
We propose a light-weight $k$-way node separators solution for partitioning the network into clusters.
Step I - Partition Network to Clusters. We first introduce a method to partition a network into two clusters. After that, we extend this method to multiple clusters.
Two Clusters: Suppose a network is required to be partitioned into two clusters $c_i$ and $c_j$. The cluster head nodes $h_i$ and $h_j$ are required to be clustered inside $c_i$ and $c_j$, respectively. As shown in Fig. 4(a), $h_i \in c_i$, $h_j \in c_j$, and $\{h_i\} \cup \{h_j\} \cup U = V$. We solve the problem as follows: […]
Multiple Clusters: Assume we have cluster head nodes […] Based on the method for partitioning two clusters, we partition the network into multiple clusters as follows: (i) Partition clusters between $h_i$ and the other cluster head node […] We use […] as cluster $c_i$. (iii) Remove $c_i$ from the network $G$. Repeat (i) to (iii) for each cluster head node until all clusters are partitioned.
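One standard way to separate two head nodes with a minimum number of nodes is to reduce the minimum vertex cut to a maximum-flow computation by splitting every node into an "in" and an "out" copy joined by a unit-capacity arc. The Python sketch below illustrates that reduction. It is an assumption about how the two-cluster step could be realized, not a reproduction of the authors' procedure, and it uses the Edmonds-Karp augmenting-path method instead of the Boykov-Kolmogorov algorithm used in the paper's implementation. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict, deque

def min_vertex_cut(adj, s, t):
    """Minimum set of nodes (other than s and t) whose removal disconnects s from t.

    Assumes an undirected graph given as symmetric adjacency lists and
    that s and t are not adjacent (head nodes are pairwise disconnected).
    """
    big = len(adj) + 1  # larger than any possible vertex cut
    cap = defaultdict(lambda: defaultdict(int))

    def add_arc(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0  # make sure the residual arc exists

    for v in adj:
        # Splitting arc: removing an ordinary node costs 1.
        add_arc((v, "in"), (v, "out"), big if v in (s, t) else 1)
        for w in adj[v]:
            add_arc((v, "out"), (w, "in"), big)

    source, sink = (s, "in"), (t, "out")

    def residual_tree():
        """BFS over arcs with remaining capacity; returns the parent map."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = residual_tree()
    while sink in parent:
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        parent = residual_tree()

    # A node is in the cut if its "in" copy is reachable but its "out" copy is not.
    return {v for v in adj if (v, "in") in parent and (v, "out") not in parent}

# Hypothetical graph with head nodes s and t.
adj = {
    "s": ["a", "b"], "a": ["s", "t", "b"], "b": ["s", "a", "c"],
    "c": ["b", "t"], "t": ["a", "c"],
}
cut = min_vertex_cut(adj, "s", "t")
```

In this toy graph the returned cut is `{a, b}`: removing these two nodes leaves `s` and `t` in different components, and no single node achieves that.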
Step II - Optimize Border Nodes. The intersection set $B_i$ in Step I is non-optimized. Therefore, we optimize the border nodes of $c_i$ in this step.
For example, as shown in Fig. 5(a), we cluster the network into $c_1$, $c_2$ and $c_3$ via Step I. Assume the cluster border nodes between $c_1$ and $c_2$ are $\{v_3, v_5, v_8, v_9\}$, and the cluster border nodes between $c_1$ and $c_3$ are $\{v_2, v_3, v_4, v_6, v_7\}$. The intersection set between […]
[…] Assume the data source and sink nodes of cluster $c_i$ are not in $B_i$. In this condition, all the flows of $c_i$ pass the edges in $E_{B_i}$. Therefore, monitoring the flows of the MVC nodes belonging to $B_i \cup N_i$ can capture all the flows of cluster $c_i$. This means the MVC nodes of $B_i \cup N_i$ can be used as an alternative set of border nodes to $B_i$. □
An example of the proposition is shown in Fig. 6. Suppose a network is partitioned into clusters $c_1$ and $c_2$, and node set $B_2 = \{v_1, v_2, v_3, v_4\}$ are the border nodes of $c_2$ as shown in Fig. 6(a). In $c_2$, the neighbor nodes of $B_2$ are $N_2 = \{v_5, v_6, v_7\}$, and the edges connected to $B_2$ are $\{e_1, e_2, e_3, e_4, e_5\}$. In this condition, we could manage the incoming and outgoing flows of $c_2$ by controlling the flows on the edges $\{e_1, e_2, e_3, e_4, e_5\}$ of $B_2$. At the same time, the MVC nodes of $B_2 \cup N_2$ are $\{v_1, v_6, v_7\}$. We could also manage the incoming and outgoing flows of $c_2$ by controlling the flows on the edges $\{e_1, e_2, e_3, e_4, e_5\}$ of $\{v_1, v_6, v_7\}$ as shown in Fig. 6(b). Therefore, the node set $\{v_1, v_6, v_7\}$ is an alternative to $B_2$ for controlling the incoming and outgoing flow of cluster $c_2$.
Based on the analysis above, for optimizing the border nodes of a cluster, we calculate the MVC on $B_i \cup N_i$ as $M_i$. Then, we use $M_i$ as an alternative to the cluster border nodes $B_i$ selected by Step I. Because $M_i$ is not necessarily smaller than $B_i$, we finally select the smaller set between $M_i$ and $B_i$ as the border nodes of cluster $c_i$.
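Step II can be sketched in Python as follows. The exhaustive vertex-cover search is a stand-in that is only practical for the small edge sets around a cluster border; the paper relies on an MVC solver [22] whose details are not reproduced here. The edge list in the example is hypothetical: it is merely one set of five edges consistent with the node sets quoted for Fig. 6, since the actual edges are not listed in the text.

```python
from itertools import combinations

def minimum_vertex_cover(edges):
    """Exhaustive minimum vertex cover; viable only for small edge sets."""
    nodes = sorted({v for edge in edges for v in edge})
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(nodes)

def optimized_border(border, cross_edges):
    """Keep the smaller of the Step I border set and the MVC of the cross edges."""
    cover = minimum_vertex_cover(cross_edges)
    return cover if len(cover) < len(border) else set(border)

# Border nodes 1..4, neighbor nodes 5..7, and five assumed connecting edges.
border = {1, 2, 3, 4}
cross_edges = [(1, 5), (2, 6), (3, 6), (3, 7), (4, 7)]
selected = optimized_border(border, cross_edges)
```

For this assumed edge set the cover `{1, 6, 7}` touches every connecting edge with three nodes, so it replaces the four-node border set.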

Cluster-based flow control
In this section, we analyze the computational complexity of our clustering solution, and present an SDN control protocol based on the cluster-level control.

Computational complexity
The solution for partitioning networks with a minimum number of border nodes is shown in Alg. 1.
In Step I, we utilize a Max-Flow-Min-Cut (MFMC) method to partition a network into two clusters. In our implementation, we chose the Boykov-Kolmogorov MFMC algorithm with a worst-case complexity of $O(n^2 m |C_{\min}|)$, in which $|C_{\min}|$ is the sum of the costs of boundary edges [21]. Then we extend this method from partitioning two clusters […]
In Step II, we optimize border nodes based on a solution to the Minimum Vertex Cover (MVC) problem [22]. Although the MVC problem is NP-complete, its calculation is only performed on a small number of cluster border nodes.

Communication protocol
The main protocol of cluster-based flow control in SD-WSN is shown in Fig. 7. The protocol has three phases. In the first phase, each network node sends neighbor connectivity information local-links to the controller. The controller builds the topology of the network based on the received neighbor connectivity information and partitions the network into clusters using the algorithm in Section 3.2. Then the controller sends a set-border command to the selected cluster border nodes. The network nodes that receive the set-border […]
CluFlow can be deployed as a network management service, which is connected to the SDN northbound APIs [23]. In such a system, CluFlow requests network information, including neighbor connectivity and data flow of each node, from the SDN controller. At the same time, the SDN controller interacts with the forwarding plane of WSN nodes through southbound APIs of communication protocols. In this way, the SDN controller adds and adjusts routing entries in the internal flow-table of cluster border nodes. To control cluster-level flow, SDN controllers configure the action of cluster border nodes mainly through two actions, i.e., forwarding packets to a destination or dropping packets of a source.
CluFlow is able to work with the standard OpenFlow protocol [14], but it is not tied to any specific southbound protocol. An example of an OpenFlow-based flow table that could be used for CluFlow is shown in Fig. 8. For example, the flow table entry could match the IP address of the source node, the IP address of the destination node, and the ports of the service. The detailed design of how to translate the routing policies of CluFlow to flow table entries is implementation-specific, and will be part of our future work.
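As an illustration of such a flow table, the Python sketch below models entries that match on source address, destination address, and port, and that carry one of the two actions mentioned above (forward or drop). It is a simplified, hypothetical data structure: the field names, the wildcard convention, and the addresses are assumptions, and it is neither the OpenFlow wire format nor the table of Fig. 8.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FlowEntry:
    priority: int
    action: str                       # "forward" or "drop"
    src_ip: Optional[str] = None      # None acts as a wildcard
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    next_hop: Optional[str] = None    # only meaningful for "forward"

    def matches(self, packet):
        fields = (("src_ip", self.src_ip),
                  ("dst_ip", self.dst_ip),
                  ("dst_port", self.dst_port))
        return all(want is None or packet.get(key) == want
                   for key, want in fields)

def lookup(table, packet):
    """Return the highest-priority matching entry, or None on a table miss."""
    hits = [entry for entry in table if entry.matches(packet)]
    return max(hits, key=lambda entry: entry.priority) if hits else None

# A low-priority default forward rule (standing in for legacy routing) and a
# high-priority cluster-level rule that drops traffic from one source.
table = [
    FlowEntry(priority=1, action="forward", next_hop="fe80::2"),
    FlowEntry(priority=10, action="drop", src_ip="fe80::4"),
]
dropped = lookup(table, {"src_ip": "fe80::4", "dst_ip": "fe80::1", "dst_port": 5683})
forwarded = lookup(table, {"src_ip": "fe80::7", "dst_ip": "fe80::1", "dst_port": 5683})
```

Giving the cluster-level entry the higher priority mirrors the requirement that cluster-level rules override local routing at the border nodes.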
Fig. 9 illustrates two examples of how SDN controls the flow among clusters. In the initialization stage, an SDN controller first gathers topological information of the WSN to build a local network representation. After that, the SDN controller calculates the network clustering and sends control messages to the border nodes.
• Without Blockage on Border Nodes: Suppose node $v_1$ requests to transmit packets to the data sink as shown in Fig. 9(a). In the first place, node $v_1$ uses local routing to calculate the next hop,

Experimental setup and results
In this section, we test and evaluate CluFlow in simulation and in a real deployed WSN. Firstly, we examine the validity of Alg. 1 (Section 4.2). Secondly, we test the practicality of the protocol shown in Fig. 8 (Section 4.3). Thirdly, we compare the number of border nodes between CluFlow and the benchmark approaches (Section 4.4 and Section 4.5). After that, we examine how the search space of clustering affects the number of cluster border nodes (Section 4.6). Then, we measure the communication load of CluFlow using real communication protocol stacks in an SD-WSN simulator (Section 4.7). Finally, we evaluate the number of border nodes and communication cost in a real deployed WSN (Section 4.8).

Benchmark approaches
We compare the performance of CluFlow with the following four benchmark solutions.
Minimum Vertex Cover Nodes (MVC): This benchmark solution monitors and calculates the communication flow belonging to the minimum vertex cover (MVC) nodes in the network. Then we calculate the flows on the edges based on the incoming and outgoing flows on the MVC nodes.
Cluster Border Nodes of Voronoi Clustering (CB): Based on cluster head nodes, we partition the network into Voronoi clusters [24]. We monitor the traffic flow of every cluster border node of all the clusters. The incoming and outgoing flows of a cluster are the sums of the incoming and outgoing flows of its cluster border nodes, respectively.
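A Voronoi-style partition of a graph around given head nodes can be sketched with a multi-source breadth-first search, as in the Python example below. The choice of hop count as the distance measure and the tie-breaking by search order are assumptions made for this sketch; the exact construction used in [24] may differ.

```python
from collections import deque

def voronoi_clusters(adj, heads):
    """Assign every reachable node to its nearest head node in hop count.

    adj   -- dict mapping each node to a list of neighbors
    heads -- list of head nodes (list order breaks distance ties)
    """
    owner = {head: head for head in heads}
    queue = deque(heads)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in owner:
                # First visit comes from the closest head (BFS order).
                owner[w] = owner[u]
                queue.append(w)
    return owner

# Hypothetical path network 1 - 2 - 3 - 4 - 5 with head nodes 1 and 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
owner = voronoi_clusters(adj, [1, 5])
```

Node 3 is equidistant from both heads and is assigned to head 1 only because that head is listed first.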

Cluster Border Nodes of Minimum Vertex Cover Voronoi Clustering (MVC-CB): We first partition the network into Voronoi clusters using the solution CB. After that, we change the network into a cluster-level topology as shown in Fig. 10. Specifically, we use a cluster-level […]

Balanced Graph Partition (METIS): This benchmark solution adopts the widely used METIS algorithm [25] of balanced network partitioning. For the balanced partitioning problem [26], the objective is to partition $V$ of $G$ into $k$ ($k > 1$) subsets, such that (i) the subsets have equal size and are disjoint; (ii) the number of edges with endpoints in two subsets is minimized. METIS only sets the number of clusters, but does not set the cluster head nodes. We set the key parameters of METIS as follows. The scheme for partitioning is multilevel $k$-way partitioning. The scheme for computing the initial partitioning is to grow a bisection using a greedy strategy. Each partitioning subset is contiguous.

Validity test of cluster border nodes
The purpose of this experiment is to validate that the cluster border nodes selected by Alg. 1 (Section 3.2) can correctly capture all the incoming and outgoing flows of clusters. In the experiment, the head nodes are specified as the sink nodes of each cluster. Each network node sends packets to all the head nodes. We measure: (i) the total number of incoming packets received by the head node (named as $P^h_i$ in cluster $c_i$); (ii) the total number of incoming packets received by all the border nodes of each cluster (named as $P^b_i$ in cluster $c_i$). Then we compare these two values. If $P^h_i$ equals $P^b_i$, it means the cluster border nodes capture all the incoming flow of a cluster. Hence, the cluster border nodes selected by Alg. 1 (Section 3.2) can correctly capture all the flows of clusters. The diagram of the experimental design is shown in Fig. 11.
We implement the experiment in Matlab. The deployment area is 100 m × 100 m, and the nodes are randomly deployed. The number of nodes in the experiments is set to 60, 80, 100, 120, and 140, respectively. The transmission range of each node is identical within a single experiment. For different experiments, we reset the transmission range so that each node always has 6 nodes on average within its transmission range. We assume a perfect wireless channel without packet loss. We randomly select cluster head nodes in the network. The number of these head nodes equals the number of required clusters. These head nodes are at least 5 hops away from each other. The network is partitioned into 6 and 9 clusters separately using Alg. 1. The transmission speed of each node is randomly set in the initialization and constant during the testing. The routes from each node to the head nodes are built via shortest path routing. For every set of testing parameters, including the number of nodes and clusters, we run 50 rounds of testing.
The experimental results illustrate that $P^h_i$ and $P^b_i$ are equal in every cluster for each round of the test. This experiment demonstrates that the cluster border nodes selected by Alg. 1 can capture all the incoming and outgoing flows of clusters, which can be used to correctly calculate the flows among clusters.

Practicality test of cluster-based flow control
We show the practicality of the protocol shown in Fig. 8 (Section 3.3) by controlling the cluster-level traffic flow in a case study. We implemented the experiment in Matlab. The deployment area is 100 m × 100 m, and the nodes are randomly deployed. The network consists of 200 nodes and is partitioned into 4 clusters. There are 6 nodes on average within the transmission range of each node. We assume a perfect wireless channel without packet loss. The head node of $c_1$ is set as the sink node. Every node of the network sends packets to the sink via shortest path routing. The cluster-level topology and flow without CluFlow control are shown in Fig. 12(a). The time interval between the present and the next sending time of every node is uniformly distributed in [1, 8] seconds. The nodes in $c_1$ and $c_3$ send packets of 10 bytes in the whole experiment. The nodes in $c_2$ and $c_4$ send packets of 10 bytes before 400 s, and packets of 50 bytes after 400 s. The SDN controller sets cluster-level routing rules to block the flows between $c_2$ and $c_3$, and between $c_2$ and $c_4$, after 600 s, as shown in Fig. 12(b).
The real-time traffic flows from $c_2$ to $c_1$ and from $c_3$ to $c_1$ are shown in Fig. 12(c). In the experimental results, the flows from $c_2$ to $c_1$ and from $c_3$ to $c_1$ are quite unbalanced between 400 s and 600 s. The main reason is that the traffic generated by the nodes inside $c_2$ and $c_4$ increases significantly after 400 s and all of it passes through $c_2$. After 600 s, the flows from $c_2$ to $c_1$ and from $c_3$ to $c_1$ are better balanced. The main reason is that the controller resets the cluster-level routing rules, under which the traffic generated by the nodes inside $c_4$ is prohibited from passing through $c_2$. So, the traffic generated by the nodes inside $c_4$ must pass through $c_3$. Compared with using only local distributed routing, cluster-level SDN control makes the flow from $c_2$ to $c_1$ and the flow from $c_3$ to $c_1$ more balanced. This case study shows the practicality of cluster-based flow control.

Number of border nodes in unbalanced clustering
We compare the number of cluster border nodes created by CluFlow to the unbalanced clustering solutions MVC, CB, and MVC-CB. A smaller number of cluster border nodes means a lower communication cost between nodes and the SDN controller.
In the experiment, the number of nodes in the network is set to 60, 80, 100, 120, and 140, respectively. The network is partitioned into 6 and 9 clusters, respectively. The other settings of the network are the same as in Section 4.3. For each set of parameters, we run 10 rounds of testing. The experimental results are illustrated in Fig. 13. The results show that the number of border nodes created by CluFlow is much smaller than that of the benchmark approaches. As the total number of network nodes increases, the percentage of improvement increases, because the state space for partitioning clusters is larger in larger networks. In the testing with 140 nodes and 6 heads, CluFlow has 83%, 65%, and 34% fewer border nodes than MVC, CB and MVC-CB, respectively.
Compared with MVC and CB, the number of border nodes selected by MVC-CB is smaller. The main reason is that MVC-CB inherits some properties of CluFlow, including (i) abstracting the network to cluster-level topology and (ii) controlling the border nodes of MVC clusters. But MVC-CB only uses Voronoi cluster partition. So CluFlow, using cluster partition Alg. 1, has fewer cluster border nodes than MVC-CB. Meanwhile, as the number of clusters increases from 6 to 9, the number of cluster border nodes increases in both CluFlow and benchmark solutions. This means the cost for flow control of cluster border nodes increases as the number of clusters becomes larger.

Number of border nodes in balanced clustering
We compare CluFlow with the balanced clustering solution METIS. To increase the state space for clustering, we increase the number of network nodes (compared with the experiments in unbalanced clustering) to 100, 150, 200, 250, and 300. The values of other experimental parameters are the same as in the experiments in unbalanced clustering.
The experimental results are shown in Fig. 14. The number of border nodes produced by METIS is much higher than that of CluFlow. In the testing of 300 nodes, CluFlow has 71% and 68% fewer border nodes than METIS with 6 and 9 clusters, respectively. The main reason is that METIS needs to balance the cluster size while minimizing the number of cut edges, which produces more cluster border nodes. Compared with METIS, CluFlow aims to minimize the number of border nodes without any requirement on balanced partitioning.

Search space of cluster border nodes
In this experiment, we observe how the search space of clustering affects the number of cluster border nodes. $U$ is the search space of cluster border nodes. We set $U$ as follows. Firstly, we randomly select cluster head nodes in the network. These cluster head nodes are at least 8 hops away from each other. Secondly, we make Voronoi clusters in the network based on the cluster head nodes. Name the border nodes of all the Voronoi clusters as $B_{\mathrm{Vor}}$. Name the nodes that reside outside $B_{\mathrm{Vor}}$ and have 1 hop distance to any node in $B_{\mathrm{Vor}}$ as $D_1$. Name the nodes that reside outside $B_{\mathrm{Vor}}$ and have 2 hop distance to any node in $B_{\mathrm{Vor}}$ as $D_2$. Finally, we create $U$ in the following three scenarios.
• Scenario 3: $U$ includes all the nodes except the cluster head nodes.
In the three scenarios, the size of $U$ in scenario 1 is the smallest, and the size of $U$ in scenario 3 is the largest. We set the number of nodes in different experiments to 100, 120, 140, 160, and 180, and the number of clusters to 4. The other settings are the same as in Section 4.3. For each setup, we perform 10 rounds of testing. The testing results are shown in Fig. 15. As the size of $U$ increases, the number of border nodes decreases. The main reason is that a larger $U$ provides a bigger search space to partition clusters, so that the possibility to find fewer border nodes increases.

Communication cost
To back up our claim that a smaller number of border nodes leads to less control traffic, we perform experiments to assess the number of flow configuration messages in an SD-WSN simulated scenario. We use the Cooja simulator [27] with sky motes. We use IT-SDN [28] as the southbound protocol, since it is tailored to WSNs. The version of IT-SDN is 0.4.1, which is configured to use source-routed control packets. A simple custom neighbor discovery protocol is employed, which gathers neighborhood information at the beginning of the simulation by periodic beacons. We set the number of nodes in different experiments to 60, 80, 100, 120, and 140, and the number of clusters is 6 and 9. The other settings are the same as in Section 4.3.
We select a sink node in the network and the controller is located at the sink node. The SDN controller interacts only with the cluster border nodes, while the other nodes route packets according to a distributed routing algorithm. Since our goal is to study the behavior of SDN control messages in the face of different clustering algorithms, the nodes are configured with static routing tables instead of dynamic distributed local routing. Every node in the network transmits one 10-byte data packet to the data sink per minute. Fig. 16 displays the average number of flow configuration messages for each clustering solution and for a traditional SD-WSN without clustering.
Our approach CluFlow yields the fewest control messages. In comparison to not using clustering, the reduction ranges from 78% to 88%. MVC-CB is the closest to CluFlow; however, it produces on average 75% and 27% more control messages for 6 and 9 clusters, respectively. We observe that the number of control messages increases as the number of clusters becomes larger. This is mainly because the number of border nodes tends to increase with the number of clusters.
The goal of cluster-level routing is to reduce the control overhead. While the experiments above show CluFlow mitigates the control cost of SD-WSN in comparison to other clustering algorithms, it is crucial to investigate whether this gain comes at the expense of degrading other metrics or not. To this end, we measure two important communication metrics, which are the data delivery rate as shown in Fig. 17 and packet delay as shown in Fig. 18.
The experimental results show that all the tested clustering approaches have achieved over 98.5% delivery rate, and there is no statistical difference among them. In addition, CluFlow presents the highest delivery rate in half of the scenarios. Regarding the packet delay, the time difference between the highest value and the lowest value in each testing point is always less than 5 ms. This means none of the solutions presents significantly lower or higher delay than the others. The small variations observed are likely to arise from the underlying simulation randomness.
Based on the above experimental results, we conclude that, in comparison to the benchmark solutions, CluFlow significantly mitigates the communication cost of SD-WSN, while it does not degrade the other important communication metrics.

Performance in a real indoor WSN
We set up a real indoor WSN in a university building to test CluFlow, and measure the number of border nodes and the communication cost. The deployed nodes are CC2650STK SensorTag motes [29] running Contiki 3.0 [30] with the IEEE 802.15.4 MAC standard [31]. We use CSMA/CA collision avoidance, ContikiMAC radio duty cycling, and the RPL [32] routing protocol. The Tx power of each node is set to 0 dBm, and the Rx sensitivity is -100 dBm. In total, 32 nodes are deployed in an area of 65 m × 38 m, as shown in Fig. 19. The sink node is attached to a SensorTag Debugger DevPack, which is connected to a computer by a USB cable. The SDN controller runs on a computer and communicates with the WSN through the sink node.
In the experiment, each mote reports the connectivity of its neighbor nodes to the SDN controller every 30 s. The controller uses these reports to build the topology of the network. The network is partitioned into 3 clusters based on 3 cluster head nodes. The selected border nodes send monitoring data and routing requests to the controller every 3 s. Once the controller receives a request, it sends a reply back. The controller uses the monitoring data of all the border nodes to calculate the traffic flow among clusters. We do not instantiate cluster-level routing in this test, i.e., the border nodes do not change local routing rules after receiving the reply messages from the SDN controller. The payload size of each packet is 64 bytes. The experiment lasts for 600 s for CluFlow and for each benchmark approach.
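The aggregation step performed by the controller can be sketched as follows. This is only an illustration: the report format (a dictionary with the hypothetical fields `border`, `src_cluster`, `dst_cluster`, and `packets`) is an assumption, and the sketch further assumes that each inter-cluster packet is reported exactly once, for example by the border node through which it leaves its cluster.

```python
from collections import defaultdict


def cluster_flows(reports):
    """Aggregate border-node monitoring reports into cluster-level flows.

    Each report is a dict with hypothetical fields:
      border      - identifier of the reporting border node
      src_cluster - cluster the traffic originates from
      dst_cluster - cluster the traffic is forwarded to
      packets     - number of packets observed in the reporting period
    """
    flows = defaultdict(int)
    for r in reports:
        # Traffic that stays inside one cluster is not a cluster-level flow.
        if r["src_cluster"] != r["dst_cluster"]:
            flows[(r["src_cluster"], r["dst_cluster"])] += r["packets"]
    return dict(flows)


reports = [
    {"border": "b1", "src_cluster": 2, "dst_cluster": 1, "packets": 12},
    {"border": "b2", "src_cluster": 3, "dst_cluster": 1, "packets": 7},
    {"border": "b3", "src_cluster": 2, "dst_cluster": 1, "packets": 5},
]
flows = cluster_flows(reports)
print(flows)
```

Because only border nodes report, the size of this input grows with the number of border nodes rather than with the total number of nodes.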
We count the number of cluster border nodes and the communication cost, i.e., the number of IP packets sent and forwarded by the border nodes. The results are shown in Fig. 20. Compared with the benchmark approaches, CluFlow uses the fewest border nodes and incurs the lowest communication cost. Compared with the results in Section 4.4 and Section 4.7, the improvement of CluFlow over MVC, CB, and MVC-CB is smaller. The main reason is that the total number of nodes is smaller in the real network deployment, so the clusterings produced by the different solutions differ less.
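The two quantities counted here can be sketched as below. The sketch rests on a simplifying assumption that may be broader than the definition used by CluFlow: a node is treated as a border node whenever at least one of its neighbors belongs to a different cluster. The data structures (`adjacency`, `cluster_of`, `counters`) are hypothetical.

```python
def border_nodes(adjacency, cluster_of):
    """Return nodes with at least one neighbor in another cluster.

    Assumption for this sketch: this neighbor-based test stands in for the
    border-node definition of the clustering algorithm.
    """
    return {
        node
        for node, neighbors in adjacency.items()
        if any(cluster_of[n] != cluster_of[node] for n in neighbors)
    }


def communication_cost(counters, borders):
    """Sum the sent and forwarded packet counters over the border nodes."""
    return sum(counters[b]["sent"] + counters[b]["forwarded"] for b in borders)


# Line topology a - b - c - d, split into clusters {a, b} and {c, d}.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}
counters = {
    "a": {"sent": 10, "forwarded": 0},
    "b": {"sent": 10, "forwarded": 10},
    "c": {"sent": 10, "forwarded": 0},
    "d": {"sent": 10, "forwarded": 0},
}
borders = border_nodes(adjacency, cluster_of)
cost = communication_cost(counters, borders)
print(sorted(borders), cost)
```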

Related work
Most existing WSN structures utilize distributed control solutions. They face the same difficulties as traditional wired networks, such as lack of a high-level abstraction and ossified protocol stack. Dynamically changing control policy in a WSN becomes increasingly difficult as the size of the WSN increases [8]. For example, as the communication flow pattern or environment changes, if a WSN needs to achieve better performance, the control plane of each sensor node must be (re)programmed. In a large-scale WSN, this task is difficult to handle. Therefore, a WSN needs a high-level centralized SDN control.
There is already a large body of research on SDN in wired networks. These solutions provide improved flexibility and reduced complexity for flow control. For example, in [33], a hybrid mechanism is presented to control distributed routing by centralized management. The SDN controller injects routing guidance, e.g., fake nodes, into networks. In [34], the wired SDN is partitioned into clusters. Only border switches are connected to and controlled by the SDN controller. SDN switches forward all the received messages to the SDN controller. After receiving messages, the SDN controller changes the routing information and sends it back to SDN switches.
Hybrid SDN is a networking paradigm where both centralized SDN control and distributed networking paradigms coexist [1,15–17]. Although using SDN technologies on top of legacy networking devices poses several challenges [35], hybrid SDN is gradually being adopted by industry and academia. For example, the research in [36] presents a service-based hybrid SDN model in a wireless mesh backhaul for the coexistence of SDN-controlled network services and distributed network services. [37] proposes an incremental deployment strategy and a throughput-maximization routing for deploying a hybrid SDN. [38] addresses the efficient deployment problem of hybrid SDN devices through the maximum coverage problem.
In the research on hybrid SDN, some works focus on managing SDN control by network clustering. In [39], an SDN-enabled 5G vehicular ad-hoc network is proposed. Due to the mobility of vehicles, the solution utilizes vehicle clustering for reducing the overhead of cellular networks and providing better communication quality. The aim of this work is to manage the network of mobile vehicles, but the control for SD-WSN is not researched. [40] proposes an active network management QoS scheme for managing data flow in SD-WSN. In its implementation, a WSN is partitioned into clusters, and multiple base stations are used as cluster heads. However, this solution uses SDN controllers to manage all the member nodes of the clusters. Although this solution is easy to implement, the communication cost of managing WSN nodes is much higher than that of CluFlow. The research in [41] proposes an SDN-based clustering mechanism, which considers power, trust, secure centrality, mobility, priority, and heterogeneity in the Internet of Things. Compared with CluFlow, this paper aims to provide secure clustering by adaptive cluster head selection instead of decreasing the communication cost of SDN control. Meanwhile, the paper assumes that the cluster heads are the main communication bridges, and the SDN controller does not manage the routing of multi-hop communication as in CluFlow. The research in [42] provides a two-level hybrid SDN control mechanism for the Internet of Things. In the mechanism, a routing protocol based on multi-hop clustering is used on the first level, and an SDN for managing the global network is leveraged on the second level. Compared to the SD-WSN structure of CluFlow, this paper assumes a hierarchical network structure, in which the communication among clusters is through SDN switches. In addition, the aim of this paper is to meet QoS requirements, while CluFlow aims to reduce the communication cost of SDN control.
SD-WSN is one of the most important research directions in hybrid SDN. TinySDN is an early effort in developing an SD-WSN [43]. Its experiments use a 7-node network, focusing on the delay metric. CORAL-SDN aims at exploring the impact of discovery algorithms on SD-WSN performance [44]. The authors assessed metrics related to topology discovery on a variety of 25-node topologies. Control overhead, however, is not measured. [45] proposes SDN, an IPv6-compatible SD-WSN stack. Their experiments show that approximately 15% of the traffic in a 30-node network is composed of SDN control packets. [28] uses IT-SDN to perform an SD-WSN scalability study. Its experiments are based on networks with up to 289 nodes under varying networking conditions. The results indicate that control traffic grows linearly with the network size, suggesting the need for control traffic reduction mechanisms. [46] proposes a mechanism for on-line metric assessment on SD-WSN systems. However, the mechanism further increases the control overhead in larger networks. The research in [47] provides a solution to utilize OpenFlow in wireless networks. It uses the OpenFlow centralized controller for routing data traffic. SDN-WISE [48] designs and implements a complete SDN system in a real multi-hop wireless network. Its SDN components consist of SDN controller, topology manager, protocol stacks, and wireless motes. It provides a stateful solution and reduces the amount of communication between nodes and SDN controllers. The research in [49] creates an SDN framework for IoT systems based on SDN-WISE and Open Network Operating System (ONOS) [50]. To connect IoT and SDN, it extends the functionality of ONOS as the controller in a WSN, while the communication protocol relies on SDN-WISE. Generally, none of the above wireless SDN structures completely decouples the control plane and data plane. The SDN controller must rely on distributed routing to set up control flow in the nodes that are several hops away. To update flow table entries, the nodes and the SDN controller have to periodically exchange request and reply messages over multiple hops. This process increases communication delay and control overhead in wireless networks.
Besides researching SD-WSN architectures, some works focus on increasing the performance of WSNs using an SDN structure, such as energy efficiency, task scheduling, and routing. SDN-ECCKN [51] proposes an SDN-based energy management system for WSN. The system reduces the total transmission time to increase the network lifetime. [52] minimizes energy consumption on sensors with guaranteed quality-of-sensing in a multi-task SD-WSN. It utilizes a centralized SDN to formulate the minimum-energy sensor activation by jointly considering sensor activation and task mapping. The work in [53] presents an energy-efficient routing algorithm based on the SD-WSN framework. To minimize the transmission distance and the energy consumption of sensor nodes, the algorithm partitions the WSN into clusters and dynamically assigns tasks to the intra-cluster nodes by a cluster control node.

Limitations and future research directions
In this work, we propose a new solution for communication flow control in hybrid SD-WSNs, which leverages the benefits of network clustering and legacy distributed routing in WSN. At the same time, we are aware of the following key points of improvement.
• It is unclear how to apply our solution to dynamic WSNs. In a dynamic WSN, the network topology changes frequently. Therefore, if CluFlow is directly used in a dynamic WSN, the SDN controller has to calculate new clusters frequently, which will increase the communication cost.
• It is important that cluster-level routing rules and distributed routing rules coordinate correctly. For example, cluster-level routing rules and distributed routing rules must avoid livelock routing [54]. The detailed design of cluster-level routing rules, and the coordination of cluster-level routing and distributed legacy routing will be part of our future work.
• In this paper, we propose an effective network clustering solution. The clustering algorithm is generic to any kind of network, and not specific to WSNs. Meanwhile, clustering is a widely used method in communication networks. It is therefore valuable to evaluate whether our solution is effective in other wired and wireless networks.
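To make the coordination issue in the second point concrete, the sketch below checks whether overlaying cluster-level rules on distributed next-hop rules creates a forwarding loop, which is one simple way in which packets can circulate without ever reaching the sink. It is an illustration under assumptions, not part of CluFlow: routing state is modeled as one next hop per node, cluster-level rules are assumed to override local rules on the nodes where they are installed, and all names are hypothetical.

```python
def find_forwarding_loop(next_hop, source, sink):
    """Follow next hops from source; return the loop as a list, or None.

    next_hop: {node: next node} (hypothetical single-path routing state)
    """
    visited = []
    node = source
    while node != sink:
        if node in visited:
            # The path re-entered a node: everything from its first visit
            # onward forms the loop.
            return visited[visited.index(node):]
        visited.append(node)
        node = next_hop.get(node)
        if node is None:
            return None  # dead end: a different fault, not a loop
    return None


# Distributed (legacy) rules deliver packets a -> b -> c -> sink.
local_rules = {"a": "b", "b": "c", "c": "sink"}
# A cluster-level rule installed on border node c redirects traffic to a.
cluster_rules = {"c": "a"}
# Assumption: cluster-level rules take precedence where both exist.
combined = {**local_rules, **cluster_rules}

loop = find_forwarding_loop(combined, "a", "sink")
print(loop)
```

A controller could run such a check before installing a cluster-level rule and reject rules that would introduce a loop.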

Conclusion
In this work, we have presented CluFlow, a cluster-based hybrid SD-WSN architecture. The key idea is to manage communication flows using central SDN control on the cluster border nodes. Consequently, the control overhead required for programming the network can be significantly reduced. To this end, we have provided a clustering algorithm tailored for minimizing the number of cluster border nodes. In this way, CluFlow can effectively take advantage of distributed control at node-level and centralized control at cluster-level. Based on simulations and testbed experiments, we have demonstrated that CluFlow can significantly decrease the number of border nodes and the communication load for controlling/monitoring cluster-level communication compared to benchmark solutions.

CRediT authorship contribution statement
Qingzhi Liu: Conceived and designed the analysis, Collected the data, Contributed data or analysis tools, Performed the analysis, Wrote the paper. Long Cheng: Conceived and designed the analysis, Contributed data or analysis tools, Performed the analysis, Wrote the paper. Renan Alves: Conceived and designed the analysis, Collected the data, Contributed data or analysis tools, Performed the analysis, Wrote the paper. Tanir Ozcelebi: Conceived and designed the analysis, Performed the analysis, Wrote the paper. Fernando Kuipers: Conceived and designed the analysis, Performed the analysis, Wrote the paper. Guixian Xu: Conceived and designed the analysis, Performed the analysis, Wrote the paper. Johan Lukkien: Conceived and designed the analysis, Performed the analysis, Wrote the paper. Shanzhi Chen: Conceived and designed the analysis, Wrote the paper.

Fig. 3. Use cluster border nodes for controlling the flow of clusters. The network topologies in (a), (b), and (c) are identical. The network is partitioned into four clusters, indexed 1 to 4.

Fig. 6. An example of an alternative to the border nodes for controlling the incoming and outgoing flow of a cluster.

Algorithm 1: Clustering with Minimum Border Nodes

Fig. 8. An example of an SD-WSN flow table that could be used for CluFlow.

Fig. 9. An example of CluFlow protocol execution. The network is partitioned into four clusters with various colors. The gray nodes represent cluster border nodes.

Fig. 10. Build a cluster-level network based on a partitioned network.

Fig. 11. Validity test of cluster border nodes. Every incoming flow of the cluster passes the cluster border nodes, so that the sum of the incoming flows in all the cluster border nodes equals the incoming flows of the sink node.

Fig. 12. Case study of cluster-based flow control by CluFlow. The SDN controller balances the cluster-level flows (from cluster 2 to cluster 1 and from cluster 3 to cluster 1) by re-configuring the cluster-level routes (from cluster 3 to cluster 2 and from cluster 4 to cluster 2).

Fig. 13. The number of border nodes using CluFlow and the benchmark approaches MVC, CB, and MVC-CB with 6 clusters and 9 clusters.

Fig. 14. The number of border nodes using CluFlow and the balanced cluster partitioning approach METIS with 6 clusters and 9 clusters.

Fig. 15. The number of cluster border nodes using CluFlow with different sizes of .

Fig. 16. The number of flow configuration packets with 6 clusters and 9 clusters in an SD-WSN. "All Nodes" represents traditional SD-WSN without clustering, in which all the network nodes communicate with the SDN controller.

Fig. 17. Data delivery rate with 6 clusters and 9 clusters in an SD-WSN. "All Nodes" represents traditional SD-WSN without clustering, in which all the network nodes communicate with the SDN controller.

Fig. 18. Packet delay with 6 clusters and 9 clusters in an SD-WSN. "All Nodes" represents traditional SD-WSN without clustering, in which all the network nodes communicate with the SDN controller.

Fig. 19. The deployment of a WSN in a building. Orange circles • represent the positions of the deployed nodes. The node with green diamond background ⧫ is the SDN controller. The nodes with blue square background ■ are the heads of clusters.
We split each node   of  into two nodes    and    and connect them by an edge   .Suppose   has a neighbor node   in .If the hop distance from   to ℎ  is smaller than from   to ℎ  , then   is the previous hop of   and we connect to    .Otherwise, we connect    to    .If the hop distance from   equals that
Proposition 1. Suppose   is the set of border nodes in cluster   .Name   as the subset of   −   , in which each node has at least a neighbor in   .The minimum vertex cover (MVC) of   ∪   is an alternative to the border nodes   for controlling the incoming and outgoing flow of cluster   .
Proof.Name the set of edges in   ∪   as    .Name the set of edges with one endpoint in   and another endpoint in   as    .Based on the property of MVC, each edge in    has at least one endpoint in the MVC nodes of   ∪   .Thus monitoring the flows of the MVC nodes belonging to   ∪  can capture all the flows in    .   is a subset of    .Therefore, monitoring the flows of the MVC nodes belonging to   ∪   can capture all the flows in