Cooperative Data Collection Mechanism Using Multiple Mobile Sinks in Wireless Sensor Networks

Data collection problems have received much attention in recent years. Many data collection algorithms for wireless sensor networks (WSNs) construct paths and adopt one or more mobile sinks to collect data along them. However, the efficiency of the established paths can still be improved. This paper proposes a cooperative data collection algorithm (CDCA), which aims to prolong the network lifetime of a given WSN. The CDCA initially partitions the n sensor nodes into k groups and assigns one mobile sink to each group as its local mobile sink, which collects the data generated by the sensors of that group. Then the CDCA selects an appropriate set of data collection points in each group and establishes a separate path passing through all the data collection points of the group. Finally, a global path is constructed, and the rendezvous time points and the speed of each mobile sink are arranged so that data can be delivered from the k local mobile sinks to the global mobile sink. Performance evaluations reveal that the proposed CDCA outperforms the related works in terms of rendezvous time, network lifetime, fairness index and efficiency index.


Introduction
Wireless sensor networks (WSNs) have been used in many applications, including healthcare, trajectory tracking, environmental monitoring, smart home, military surveillance, coverage, recharging and data collection [1][2][3][4][5][6][7][8][9]. The data collection issue has received much attention in recent years. In the literature, many studies have developed data collection approaches that aim to cope with unbalanced energy consumption in a given region. An advanced routing transfer-low-energy adaptive clustering hierarchy approach (ART-LEACH) has been proposed in [10]. It assumed that the sensor nodes and the sink node are static. Such studies have reduced energy consumption and extended network lifetime. However, the closer a sensor node is to the sink, the greater its data-forwarding workload and power consumption, leading to an energy imbalance problem.
To deal with the energy imbalance issue, numerous studies have adopted mobile sinks to collect data from sensor nodes. These studies are mainly classified into two categories: single mobile sink and multiple mobile sinks. In the first class, some studies [11,12] adopted one mobile sink to patrol all sensor nodes and collect their readings. Since there is no data forwarding, the power consumption of the sensor nodes is minimized. However, the constructed path is too long to be efficient, raising the problem of buffer overflow at the sensor nodes.
To resolve the path inefficiency problem, some other studies [13][14][15] fall in the partial data-forwarding class, which uses a mobile sink that passes through only some of the nodes. The visited sensors are called collection points (CPs). Most of these mechanisms partitioned the sensor nodes

• Achieving the purpose of cooperative data collection. The proposed CDCA partitions the data collection task into k subtasks through benefit calculations. The mobile sinks can cooperatively collect data along the constructed paths.
• Prolonging network lifetime. The proposed CDCA considers the forwarding workload of each sensor, selects the k sensors with the largest forwarding workloads and then partitions a tree with n sensors into k subtrees. This can minimize the energy consumption of the sensors with maximal energy consumption and hence prolong the network lifetime.
• Balancing the workloads of sensors for packet forwarding in each subset. The proposed mechanism considers the length cost between any two consecutive CPs. Therefore, the established route allows the mobile sink to visit more collection points. Compared with weighted rendezvous planning (WRP) [15], the proposed CDCA distributes the data-forwarding workloads to more collection points, prolonging the network lifetime.
• Maintaining a stable cycle for the rendezvous opportunities between each local mobile sink and the global sink. The proposed mechanism adjusts the velocities of the mobile sinks to ensure that each local mobile sink and the global sink rendezvous periodically at stable times.
The remainder of this work is organized as follows. Section 2 reviews the existing works related to this study. Section 3 describes the assumptions, the network environment and the problem formulation. Section 4 presents the proposed mechanism. The experimental results of the proposed mechanism are presented in Section 5. Finally, Section 6 offers a conclusion and future works.

Related Work
In the literature, studies that adopted mobile sinks for collecting data are mainly classified into two categories: those using a single mobile sink [11][12][13][14][15] and those using multiple mobile sinks [16][17][18][19][20]. The following briefly reviews these related works. Studies [11][12][13], which fall in the first class, used a single mobile sink to collect data from all sensor nodes. Study [11] proposed a heuristic tour-planning schedule for single-hop data collection. This study aims to decrease the path length while still visiting all sensor nodes. However, the algorithm may cause long delays in a large-scale sensor network. In [12], a mobile sink collects data from all sensor nodes in a given wireless sensor network. The main objective of this algorithm is to minimize the latency of data collection by designing the shortest path that passes through all sensor nodes. Somasundara et al. [13] proposed an algorithm that builds the path of the mobile sink so that all data are collected before the buffer of any sensor node overflows. However, as the number of sensor nodes increases, it becomes impractical due to its high computational complexity.
The previously mentioned studies, which visit all sensors, become impractical when the number of sensor nodes grows. To cope with this issue, some other works [14,15] fall in the partial data-forwarding category, where the mobile sink only visits some sensor nodes. The nodes that are not visited by the mobile sink forward their data to the closest visited sensors. Yi-Hsuan et al. [14] proposed a heuristic algorithm which consists of two steps. The first step finds the root of a data aggregation tree such that the hop distance from any node to the root is minimized. The second step selects sensors as collection points to be visited by the mobile sink for data collection. However, the heuristic path construction algorithm did not consider the distance between the current CP and the next CP. This might result in a long path for the mobile sink.
Hamidreza et al. [15] proposed weighted rendezvous planning (WRP), where each sensor node is assigned a weight corresponding to its hop distance from the tour and the number of data packets that it forwards to the closest CP. Then a set of CPs is determined for constructing a near-optimal traveling tour that minimizes the energy consumption of the sensor nodes. However, it did not consider the path cost between two consecutive CPs. This reduces the number of sensors visited by the mobile sink and hence leads to a shorter network lifetime or less collected data.
The abovementioned studies used only a single mobile sink to visit sensor nodes and collect their data. When the size of the network area or the number of sensors grows, the path length of the mobile sink must increase accordingly. The mobile sink then spends much more time on each round, which causes high latency for data collection. Therefore, a single mobile sink may not be sufficient for applications that require low latency in a large-scale WSN. Several studies [16][17][18][19][20] use multiple mobile sinks for data gathering in WSNs to reduce latency. The following reviews these studies.
Zhao et al. [16] adopted multiple mobile sinks that move along straight lines and receive data sent from sensors via multihop transmission. When the sensors are deployed uniformly and the density is high, the algorithm performs well. However, if some sensors in the network are blocked by obstacles or holes, the performance degrades because the mobile sinks only move along straight lines. To cope with this problem, Aslanyan et al. [17] adopted multiple mobile sinks that move randomly in the monitoring region for data collection. In addition, a mobile sink delivers the received data to another mobile sink when they fall within each other's communication range. Since the route is randomly determined, the latency is unstable and difficult to estimate.
Stefano et al. [18] proposed an energy-efficient clustering scheme with delay reduction in data gathering, which focuses on energy-efficient routing of data from sensor nodes to the base station using multiple mobile sinks. This increased the network lifetime, throughput and delivery rate. However, the proposed mechanism did not consider speed control for the mobile sinks, and hence cooperation between the mobile sinks is not addressed. Edison et al. [19] presented a bio-inspired networking strategy that allows static sensors on the ground to cooperate with multiple mobile sensors in the air, applied to the scenario of large-area surveillance. The strategy can provide efficient communication among the sensor nodes and reduce the number of messages exchanged in the network. However, the proposed mechanism did not consider speed control for the mobile sinks, and hence cooperation between the mobile sinks is not addressed. Duc Van et al. [20] proposed a data collection scheme, called HiCoDG, which adopted multiple mobile sinks to collect and relay data in a cooperative manner. The HiCoDG algorithm aims to find the optimal paths for the mobile sinks and the global mobile sink so as to minimize the traveling distances of the mobile sinks. However, the proposed mechanism did not consider the balance of the number of nodes in each cluster or the distance between the current cluster-head and the next cluster-head.
Most of the studies mentioned above emphasize improving data freshness. However, most of them did not take into account velocity control for the mobile sinks or cooperation between the mobile sinks. As a result, the latency of data collection might increase and the constructed path is not efficient enough. This paper proposes a cooperative data collection algorithm, called CDCA. Compared with the existing studies, the proposed CDCA offers several contributions, including achieving cooperative data collection, prolonging network lifetime, balancing the workloads of sensors for packet forwarding and maintaining a stable cycle for the rendezvous opportunities between each mobile sink and the global sink. The following summarizes the comparison of the related works and the proposed CDCA in terms of several important parameters that affect the network lifetime of the given sensor network and the cooperation between the mobile sinks. Table 1 compares each related work with this paper in terms of the number of mobile sinks, path construction, data collection latency and mobile sink cooperation. It shows that the proposed mechanism exhibits all good characteristics, as compared with the other mechanisms.

Table 1. Comparison of the related works and the proposed CDCA.

Mechanism            Number of mobile sinks   Path construction   Data collection latency
[14]                 single                   O                   long
[15]                 single                   O                   long
[16]                 multiple                 O                   short
[17]                 multiple                 ×                   long
[18]                 multiple                 ×                   short
[19]                 multiple                 ×                   short
[20]                 multiple                 O                   short
The proposed CDCA    multiple                 O                   short

Network Environment and Problem Formulation
This section introduces the network environment and assumptions of the modeled WSN. Then the problem formulation is presented.

Network Environment
Given a monitoring region Ɽ, this paper considers a mobile wireless sensor network that consists of a set of n sensor nodes S = {s_1, s_2, s_3, ..., s_n} randomly deployed in Ɽ. There are k + 1 mobile sinks M = {m_0, m_1, ..., m_k} aiming at collecting data from all sensor nodes. The k mobile sinks m_1, ..., m_k collect data from all sensors in their own local areas, while m_0, treated as the global mobile sink, collects data from the k mobile sinks. The global mobile sink m_0 is initially placed at the location of the static sink, whose function is to upload the data to the Internet for further processing. Assume that the communication ranges of all mobile sinks and sensor nodes are identical. Each mobile sink has rich power or is supported by an energy harvesting system. In addition, this paper assumes that all mobile sinks are aware of the locations of all sensor nodes and of their own locations. Among the k mobile sinks, there is a leader, denoted by m_leader, which is in charge of executing the proposed data collection mechanism and broadcasting the results to all mobile sinks.
All sensor nodes periodically generate one data packet in every time period t, and the packet must be delivered to the sink. This paper aims to develop a data collection mechanism which partitions the sensors into k disjoint sets T = {T_1, T_2, ..., T_k} and assigns the i-th set T_i to mobile sink m_i. Since the mobile sink m_i aims to collect data from all sensor nodes in T_i, it establishes a tour passing through all sensors in T_i. However, for some large-scale applications, such data collection is time consuming, which might lead to the problem of buffer overflow. To prevent this situation, the leader should select some sensors in T_i and only visit them. Let P_i = {p_{i,1}, p_{i,2}, p_{i,3}, ..., p_{i,k_i}} be the set of k_i selected CPs in T_i. In each round, the mobile sink m_i visits each p_{i,j}, collects its data and then returns to the root of T_i. Let π_i represent the path passing through all CPs p_{i,j} in T_i, and let |π_i| denote the length of π_i. In the next subsection, the problem formulation of this paper will be presented.
Figure 1 illustrates an example of several CPs and five mobile sinks in a rectangular region. In Figure 1, the red circles represent the static sink whereas the black circles denote the selected CPs. The sensor nodes have been partitioned into five groups, and a path is organized for the mobile sink m_i of each group. Every mobile sink m_i moves along its established path, which passes through all CPs in T_i, to collect data. The global mobile sink m_0 leaves the static sink, moves along the constructed path and collects all data stored at the root of each T_i. Then the global mobile sink m_0 goes back to the static sink.
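As a small illustration of the path notation, the sketch below computes the length |π_i| of a closed tour that leaves the root of T_i, visits the CPs in a given order and returns to the root. It is only a sketch under stated assumptions: positions are Euclidean coordinates, the visiting order is already decided, and the mobile sink moves at a constant speed. The names `tour_length`, `round_time` and `speed` are hypothetical and do not come from the paper.

```python
import math

def tour_length(root, collection_points):
    """Length of the closed path root -> p_1 -> ... -> p_k -> root.

    Assumption: points are (x, y) tuples and distances are Euclidean.
    """
    stops = [root, *collection_points, root]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))

def round_time(root, collection_points, speed):
    """Time for one round, assuming a constant speed (hypothetical parameter)."""
    return tour_length(root, collection_points) / speed

# Root at the origin and two CPs forming a 3-4-5 right triangle with it.
length = tour_length((0.0, 0.0), [(3.0, 0.0), (3.0, 4.0)])
print(length)                                                    # 12.0
print(round_time((0.0, 0.0), [(3.0, 0.0), (3.0, 4.0)], 2.0))     # 6.0
```

A longer tour implies a longer round, which is why the number of visited CPs has to be traded off against buffer overflow and latency.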

Problem Formulation
Energy conservation is the most important factor that determines the lifetime of a wireless sensor network. In general, the lifetime of a given WSN is defined as the length of time from the point at which the network starts operating to the point at which the first sensor node runs out of energy. This paper aims to develop a data collection mechanism which initially partitions the n sensor nodes into k subsets T = {T_1, T_2, ..., T_k} and assigns one mobile sink m_i to collect the data generated by the sensors in T_i. Then the data collection mechanism selects k_i CPs from the locations of the sensors in T_i and establishes a route π_i which passes through the k_i CPs for data collection. Since the CPs are the bottlenecks of network lifetime, reducing their energy consumption can prolong the network lifetime.
This paper adopts the energy consumption model of [21]. The energy consumption of wireless sensor nodes is mainly incurred by data sending and receiving. Let sender s_i and receiver s_j be a communication pair in which s_i sends δ bits to s_j. Let d_ij be the distance between s_i and s_j. The energy consumption for each sensor node receiving δ bits can be measured by Equation (1).
where β is the parameter indicating the energy consumption for receiving one bit. The energy consumption for sender s_i to transmit a packet to its parent s_j is expressed as Equation (2).
where ε_1 is a parameter denoting the energy consumption for transmitting one bit, ε_2 is the energy consumption factor of the amplifier circuit, d_ij is the distance between sensors s_i and s_j, and γ is the path-loss exponent, which usually ranges between 2 and 4 depending on the environment.
The lifetime of the sensors in T_i highly depends on the value of k_i. A large value of k_i indicates that the mobile sink m_i can visit more CPs. This implies that all sensors in T_i can be further partitioned into k_i subsets. Each subset is associated with a CP, and all sensors in this subset transmit their data to this CP. Consider each subset and the corresponding CP. Since the distance between a sensor and the CP might be larger than the communication range, a tree rooted at the CP is constructed for relaying the data from each sensor to the CP along the tree topology. Let subtree T_ij be rooted at CP p_ij and have u_ij + 1 nodes. The sensor nodes of each subtree T_ij upload their data to the root p_ij. These roots store the data and then transmit them to the mobile sink when the mobile sink visits them. A route that visits all CPs should be constructed for each mobile sink m_i to collect data from all CPs in T_i. The energy consumption of the sensor nodes is mainly incurred by data transmitting and receiving. Let E_ij denote the energy consumption of p_ij in each round. Equation (3) evaluates the value of E_ij, which equals the sum of the energy consumed for receiving u_ij packets and for transmitting u_ij + 1 packets in subtree T_ij.
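The energy model described above can be sketched as follows. This is a minimal sketch that assumes the common first-order radio model forms E_rx = β·δ for receiving δ bits and E_tx = δ·(ε_1 + ε_2·d^γ) for transmitting δ bits over distance d, and that the per-round CP energy is the cost of receiving u_ij packets plus the cost of transmitting u_ij + 1 packets, as stated in the text. The exact expressions of Equations (1)-(3) may differ; the function names and the choice of a single transmission distance for the CP are hypothetical.

```python
def rx_energy(bits, beta):
    """Assumed form of Equation (1): energy to receive `bits` bits."""
    return beta * bits

def tx_energy(bits, distance, eps1, eps2, gamma):
    """Assumed form of Equation (2): energy to send `bits` bits over `distance`."""
    return bits * (eps1 + eps2 * distance ** gamma)

def cp_energy_per_round(u, bits, distance, beta, eps1, eps2, gamma):
    """Assumed form of Equation (3) for a CP whose subtree has u + 1 nodes.

    The CP receives u packets from its descendants and transmits u + 1
    packets (theirs plus its own). Using one common `distance` for all
    transmissions is a simplifying assumption of this sketch.
    """
    receive = u * rx_energy(bits, beta)
    transmit = (u + 1) * tx_energy(bits, distance, eps1, eps2, gamma)
    return receive + transmit

# Toy integer parameters chosen only to make the arithmetic easy to follow.
print(rx_energy(100, 2))                          # 200
print(tx_energy(10, 2, 1, 1, 2))                  # 10 * (1 + 1 * 2**2) = 50
print(cp_energy_per_round(3, 10, 2, 2, 1, 1, 2))  # 3 * 20 + 4 * 50 = 260
```

The sketch makes the dependence on u_ij explicit: a CP with a larger subtree both receives and transmits more packets, which is why the CPs are treated as the lifetime bottleneck.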
The network lifetime is determined by the lifetime of the sensor with the least energy. Since each CP consumes the highest energy in its tree, this paper aims to reduce the energy consumption of the CP that consumes the most energy among all CPs. Let s^max_ij denote the bottleneck CP in T_i. The value of s^max_ij can be derived by applying Equation (4). Expression (5) reflects the goal of this paper, a Min-Max objective that minimizes the energy consumption of the bottleneck CP.
When achieving objective (5), the following constraints (6)-(9) must be satisfied. Let |L| denote the length of path L. Let L_min denote the shortest Hamiltonian path passing through every CP in tree T_i and L_max denote the maximal length of a valid path. Let d_i denote the rendezvous point of the global mobile sink m_0 and the local mobile sink m_i. Let π_0 denote the path that passes through all rendezvous points.
(1) Distance Constraint: Constraint (6) states that the length of each path π_i should be smaller than or equal to the maximal length that the mobile sinks can move and larger than the minimal length.
(2) Rendezvous Delay Constraint: Each mobile sink m_i should collect data from all CPs in T_i and then arrive at the rendezvous point d_i to forward its data to the global sink m_0. Let m_0 and m_i arrive at rendezvous point d_i at time points t^1_i and t^2_i, respectively. To guarantee that the collected data are fresh, Constraint (7) should be satisfied, where η is an acceptable time delay at each rendezvous point.
(3) Battery Constraint: Let one round, denoted by ς_i, represent the time period required for collecting one data packet from each sensor in T_i. Let Ω denote the lifetime of a given WSN. The number of rounds in which the mobile sink can visit T_i is Ω/ς_i. Since the energy consumption of p_ij is E_ij in each round, the total energy consumption of CP p_ij is E_ij × (Ω/ς_i). Constraint (8) checks whether the total energy consumption is less than or equal to the battery capacity B.
(4) Flow Constraint: Let S_ij denote the set of sensors in the subtree T_ij rooted at CP p_ij. Let f^out_ij denote the total number of packets sent by CP p_ij in each round and let f^in_i denote the number of packets created by each s_i in set S_ij in each round. To guarantee that all packets of S_ij are received by the CP p_ij, and that p_ij then forwards the received packets and its own packet to the mobile sink, Flow Constraint (9) should be satisfied.
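The four constraints can be read as simple feasibility checks. The sketch below is one plausible reading of the prose, not the paper's exact formulation: it assumes the distance constraint is L_min ≤ |π_i| ≤ L_max, the rendezvous delay constraint is |t^1_i − t^2_i| ≤ η, the battery constraint is E_ij × (Ω/ς_i) ≤ B, and the flow constraint requires the CP to send every packet created in S_ij plus its own packet (so S_ij is taken to exclude the CP itself). The function name and argument names are hypothetical.

```python
def check_constraints(path_len, l_min, l_max,
                      t_global, t_local, eta,
                      energy_per_round, lifetime, round_time, battery,
                      packets_out, packets_in):
    """Return which of the (assumed) constraints (6)-(9) hold for one CP/path."""
    return {
        # (6) path length bounded by the shortest tour and the maximal valid length
        "distance": l_min <= path_len <= l_max,
        # (7) global and local sinks reach the rendezvous point within eta of each other
        "rendezvous_delay": abs(t_global - t_local) <= eta,
        # (8) per-round energy times number of rounds must fit in the battery
        "battery": energy_per_round * (lifetime / round_time) <= battery,
        # (9) the CP forwards all packets of its subtree plus its own packet
        "flow": packets_out == sum(packets_in) + 1,
    }

result = check_constraints(
    path_len=120, l_min=100, l_max=150,
    t_global=10.0, t_local=12.0, eta=5.0,
    energy_per_round=2.0, lifetime=1000.0, round_time=10.0, battery=250.0,
    packets_out=4, packets_in=[1, 1, 1],
)
print(result)  # every entry is True for these toy values
```

Returning the individual checks rather than a single boolean makes it easy to see which constraint rules out a candidate set of CPs.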

The Proposed CDCA Algorithm
This section proposes the data collection mechanism, which aims to partition the whole wireless sensor network into k disjoint sets and assign each set to a mobile sink m_i that collects the data generated in this set. For each set of sensors, the proposed mechanism further determines the set of CPs and establishes a path visiting all CPs such that the lifetime of the sensors in this set is maximized under constraints (6)-(9). The proposed data collection algorithm is composed of three major phases: the Network Partition Phase, the CP Selection and Path Construction Phase and the Speed Control Phase. In the Network Partition Phase, the algorithm partitions the n sensor nodes into k disjoint subsets. Then the CP Selection and Path Construction Phase selects k_i CPs from the locations of the sensors in T_i and establishes a path π_i which visits the k_i CPs for data collection. In the Speed Control Phase, the algorithm coordinates the speeds of mobile sinks m_0 and m_i. As a result, each mobile sink m_i can collect the data generated by the sensors in T_i while the mobile sink m_0 further collects data from each mobile sink m_i. The following gives the details of the three phases.

Network Partition Phase
Given a minimum spanning tree (MST) T, the root of T can be considered as the sink, and the mobile sink m_0 is located at the root. The Network Partition Phase aims to partition T into k disjoint subtrees T = {T_1, T_2, ..., T_k} of similar sizes. Each mobile sink m_i will be allocated to T_i for data collection. Let S denote the set of all sensor nodes. Let Ñ and Ŝ = S\Ñ denote the sets of selected and not yet selected roots for the k subtrees, respectively. Let s_largest denote the sensor with the largest cost in Ŝ. The basic concept of this phase is to find the sensor s_largest in Ŝ in each run to play the role of a tree root until k roots are selected. In each run, the selected s_largest joins set Ñ. After that, the minimum spanning tree is restructured. The following gives a general description of how the sink splits up the MST T.
Let D(i, T) represent the number of hops from sensor s_i to the root of its tree. Let NS(i, T) denote the number of sensors in the subtree rooted at s_i. Assume that each sensor creates one packet in each round.
Let NP(i) represent the number of packets received by sensor node s_i, including its own reading. Equation (10) presents the relation between NP(i) and NS(i, T).
Let Hops(i, sink) denote the number of hops from s_i to the sink of the tree that contains sensor node s_i. If sensor s_i is selected as the root of some subtree T_i, the number of packets saved by selecting s_i as the CP is evaluated as shown in Equation (11).
Let W_i denote the cost obtained by selecting s_i as the tree root. The value of W_i is measured by the number of packets saved for transmission from some s_j ∈ P to s_i, as shown in Equation (12).
The sink calculates the W_i of each s_i ∈ Ŝ and selects the sensor s_largest with the largest cost. Expression (13) reflects the condition under which a sensor is selected as the tree root.
After that, the sensor s_largest is added into Ñ and removed from Ŝ. The aforementioned operations are executed repeatedly, round by round, until k roots have been selected in Ñ. As a result, the final set of selected roots is Ñ = {Ñ_1, Ñ_2, ..., Ñ_k}. The k nodes in Ñ serve as the roots of the k subtrees T_1, T_2, ..., T_k.
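The greedy root selection described above can be sketched as follows. Because the exact forms of Equations (10)-(13) are not restated here, the sketch makes two explicit assumptions: NP(i) is the number of nodes in the subtree rooted at s_i (one packet per sensor per round, including its own), and the cost W_i is NP(i) multiplied by the hop count from s_i to the root of its current tree, that is, the number of packet transmissions saved when s_i becomes a root. The tree is given as a hypothetical `parent` dictionary, and ties are broken by dictionary order.

```python
def depth(parent, v):
    """Number of hops from v to the root of the tree that currently contains v."""
    hops = 0
    while parent[v] is not None:
        v = parent[v]
        hops += 1
    return hops

def subtree_packets(parent):
    """Assumed NP(i): packets handled by each node per round (own + descendants)."""
    count = {v: 1 for v in parent}
    # Visit the deepest nodes first so every child is folded into its parent
    # before the parent is folded into the grandparent.
    for v in sorted(parent, key=lambda x: depth(parent, x), reverse=True):
        if parent[v] is not None:
            count[parent[v]] += count[v]
    return count

def select_roots(parent, k):
    """Greedily pick k subtree roots; returns (roots, restructured parent map)."""
    parent = dict(parent)          # work on a copy
    roots = []
    for _ in range(k):
        packets = subtree_packets(parent)
        cost = {v: packets[v] * depth(parent, v)
                for v in parent if v not in roots}
        best = max(cost, key=cost.get)   # s_largest under the assumed cost
        roots.append(best)
        parent[best] = None              # detach: best becomes a tree root
    return roots, parent

# Toy tree: 0 is the sink; edges 0-1, 1-2, 2-3, 2-5, 0-4.
tree = {0: None, 1: 0, 2: 1, 3: 2, 4: 0, 5: 2}
roots, new_tree = select_roots(tree, 1)
print(roots)   # [2]: node 2 relays 3 packets over 2 hops, the largest saving
```

Recomputing the packet counts and hop distances after each detachment mirrors the restructuring of the tree between runs.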
Till now, this paper have partition the tree T into several subtrees T 1 , T 2 , . . . , T k . The next operation is trying to balance the subtrees. Let T max and T min denote the trees with maximal and minimal numbers of sensor nodes, respectively. Let Θ(T) denote the number of nodes in tree T. The next step is to balance the tree size. The tree balancing can be achieved by moving some nodes of other tree to tree T min or moving some nodes from tree T max to other trees. To achieve this, the following defines neighboring trees which allow them to move node from one to another. Two trees T i and T j are said to be neighboring trees if there exist s i ∈ T i and s j ∈ T j such that s i and s j are neighbors. The following illustrates how the tree balancing can be achieved. Let N(T) denote the set of neighboring trees of tree T. Let sensor s best denote the sensor that will be moved to tree T min . The sensor that satisfies Expression (14) will be selected to move to tree T min .
where dis(s, T) denotes the number of hops from sensor s to the root of tree T. A large hop distance from a sensor to the root of tree T i indicates that the energy consumed for forwarding its packets is large. On the contrary, a small hop distance from the sensor to the root of tree T min indicates that the energy consumed for forwarding its packets is small. As a result, moving the sensor that meets the requirement of Expression (14) reduces the maximal energy consumption of tree T i and increases the minimal energy consumption of tree T min . These operations are repeated until the numbers of nodes in the subtrees are uniform, that is, |T 1 | = |T 2 | = · · · = |T k |. The following presents the proposed BTP algorithm (Algorithm 1).

20. If ( T i and T min are neighboring trees)
22. Remove the edge linking s best to its parent;
23. Connect s best to the nearest node in T min ;}}
24. Return T ;

The following presents an example of operating the proposed BTP algorithm. The proposed BTP algorithm calculates the W i of each sensor node in the tree and then selects s 9 , which has the maximal cost index value, as the candidate for the new root. The BTP algorithm then adds s 9 into the set Ñ and removes s 9 and edge (s 5 , s 9 ) from the minimal spanning tree given in Figure 2a. As a result, the original MST is partitioned into two trees, as shown in Figure 2b. Each repetition of the BTP algorithm selects the sensor node with the maximal cost index value to play the role of a new root, until the number of nodes in Ñ reaches k. In the second repetition, the proposed BTP algorithm selects s 16 as the new root, since s 16 has the maximal cost index value. As shown in Figure 2c, tree T has been partitioned into three subtrees T 1 , T 2 and T 3 .
Then the BTP algorithm tries to balance the subtrees T 1 , T 2 and T 3 . The numbers of nodes in subtrees T 1 , T 2 and T 3 are 6, 10 and 8, respectively, so the average number of nodes of the three subtrees is 8. To balance the three subtrees, T 2 should move two nodes to T 1 . According to Expression (14), the proposed BTP algorithm evaluates the distances from each sensor node in subtrees T 1 and T 2 to the roots of T 1 and T 2 . As a result, sensor s 6 satisfies Expression (14) and is selected to play the role of s best . Hence an additional edge (s 6 , s 4 ) is added to subtree T 1 and edge (s 6 , s 9 ) is deleted from T 2 . These operations are repeated until the sizes of the three subtrees are identical, that is, |T 1 | = |T 2 | = |T 3 |. Figure 2e,f depict the process of moving the two nodes s 6 and s 7 from T 2 to T 1 .
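The arithmetic of this example can be checked with a short sketch. It only tracks subtree sizes: it ignores the neighboring-tree requirement and the choice of s best by Expression (14), and the function name `moves_to_balance` is hypothetical.

```python
def moves_to_balance(sizes):
    """Move one node at a time from the largest to the smallest subtree.

    Returns the balanced sizes and the list of (source, target) subtree
    indices. Only the node counts are modelled, not which sensor moves.
    """
    sizes = list(sizes)
    moves = []
    while max(sizes) - min(sizes) > 1:
        src = sizes.index(max(sizes))   # plays the role of T_max
        dst = sizes.index(min(sizes))   # plays the role of T_min
        sizes[src] -= 1
        sizes[dst] += 1
        moves.append((src, dst))
    return sizes, moves


# Subtrees T1, T2, T3 from the example hold 6, 10 and 8 nodes.
balanced, moves = moves_to_balance([6, 10, 8])
print(balanced, moves)
```

With sizes 6, 10 and 8, the sketch performs two moves from the second subtree to the first, matching the two nodes s 6 and s 7 that the example moves from T 2 to T 1 .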
Figure 2 (panels e and f). (e) Selecting sensor node s 6 to move from T 2 to T 1 ; (f) Selecting sensor node s 7 to move from T 2 to T 1 .
After completing this phase, the mobile sink executes the CP Selection and Path Construction Phase which is described in the next subsection.

CP Selection and Path Construction Phase
This phase aims to select k i CPs from the sensors in T i and construct a path π i which visits the k i CPs for data collection.
Let P i denote the set of data collection points which have been selected to play the role of roots in subtree T i . Let S i denote the set of all sensors in subtree T i and Ŝ i denote the set of sensors that are not selected to play the role of CPs. This phase selects one best sensor node s best from set Ŝ i at a time. The selected s best then joins the set P i and is removed from Ŝ i . After that, the subtree is restructured. Data collected by each node in set P i can be sent directly to the mobile sink m i , since the mobile sink m i visits each selected CP in set P i . In this way, the energy that other sensors would otherwise spend forwarding the data of the CP is saved.
Let P i = {p i,1 , p i,2 , p i,3 , · · · , p i,y } denote the set of y CPs which have been selected in subtree T i . Let R i denote the shortest Hamiltonian route that connects the y CPs in P i . Since the selection of each CP constructs a new subtree, subtree T i has been partitioned into y subtrees. Let p i,j denote the root of tree T i,j . Let s x be any sensor in tree T i,j . Let p closest denote the CP which is closest to sensor node s x . Equation (15) derives p closest .
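As an illustration of finding p closest , the following sketch runs a breadth-first search from s x and stops at the first CP it reaches. Equation (15) is not reproduced in this text, so measuring closeness in hops (rather than Euclidean distance) is an assumption, and the adjacency-list input and the name `closest_cp` are hypothetical.

```python
from collections import deque


def closest_cp(adjacency, source, cps):
    """Return (cp, hops) for the CP nearest to `source`, or (None, None).

    `adjacency` maps each sensor to the list of its neighbors.
    Closeness is measured in hops, which is an assumption of this sketch.
    """
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node in cps:
            return node, hops
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None, None


# A four-sensor chain s1 - s2 - s3 - s4 with CPs at both ends.
chain = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2", "s4"], "s4": ["s3"]}
print(closest_cp(chain, "s3", {"s1", "s4"}))
```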
Let b x denote the benefit index obtained by determining s x as the data collection point. The value of b x , as derived in Equation (16), can be evaluated by the number of packets that are saved for transmission divided by the cost of distance from some s i,j ∈ P i to s i,i .
This phase will calculate b x for each s x ∈ S i and select the best node s best to play the role of CP, where s best satisfies the following condition. Equation (17) derives s best .
Then a new Hamiltonian route R i for mobile sink m i is constructed by adding the new CP s best to the existing route, so that R i connects all (y + 1) CPs in the set P i ∪ {s best }. The algorithm then checks whether the length |π i | of the new route is smaller than the length bound L max . If so, the proposed algorithm adds the collection point s best to set P i and removes it from the set of sensors not yet selected as CPs; the edge that connects s best and its parent is removed accordingly. Otherwise, the selection operation terminates. The set P i is the output of this phase.
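The selection loop above can be sketched as follows, under simplifying assumptions. The benefit values are supplied as fixed inputs, whereas the paper recomputes b x after each restructuring; the shortest closed route is found by brute force over all visiting orders, which is only practical for a handful of CPs; positions are planar coordinates. The names `tour_length` and `select_cps` are hypothetical.

```python
from itertools import permutations
from math import dist


def tour_length(points):
    """Length of the shortest closed route through `points` (brute force)."""
    if len(points) < 2:
        return 0.0
    first, rest = points[0], points[1:]
    best = float("inf")
    for order in permutations(rest):
        route = (first, *order, first)
        length = sum(dist(a, b) for a, b in zip(route, route[1:]))
        best = min(best, length)
    return best


def select_cps(root, candidates, l_max):
    """Add candidates in decreasing benefit order while the route fits.

    `candidates` is a list of (benefit, position) pairs. Benefits are
    treated as fixed here, which simplifies the procedure in the text.
    """
    cps = [root]
    for _benefit, position in sorted(candidates, reverse=True):
        if tour_length(cps + [position]) < l_max:
            cps.append(position)
        else:
            break   # selection terminates at the first candidate that does not fit
    return cps


candidates = [(5, (0, 3)), (4, (4, 3)), (3, (40, 0))]
print(select_cps((0, 0), candidates, l_max=15))
```

In this toy input the first two candidates give a closed route of length 12, which is below the bound of 15, while the third would stretch the route far beyond it, so selection stops there.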

Speed Control Phase
This phase first constructs a global path for the global mobile sink m 0 , which passes through the root of each T i , by applying the Hamiltonian algorithm. Let T = {T i |1 ≤ i ≤ k} denote the set of k subtrees. The global mobile sink m 0 leaves the static sink, moves along the constructed path and collects all data stored at the root of each T i . Then the global mobile sink m 0 goes back to the static sink. Let {π 0 , π 1 , π 2 , . . . , π k } denote the paths of the mobile sinks m 0 , m 1 , . . . , m k , respectively. Let t collect i denote the time that the mobile sink m 0 spends collecting data at the i-th rendezvous point. The total time, denoted by t arrive i , required for the global mobile sink m 0 to travel from the static sink to the i-th rendezvous point can be measured by Equation (19).
The total time, denoted by t cycle , required for mobile sink m 0 touring a cycle can be measured by Equation (20).
The velocity of the mobile sink m i can be measured by Equation (21).
where |R i | denotes the length of the path in subtree T i that passes through all collection points.
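A rough sketch of the timing quantities in this phase is given below. Equations (19) to (21) are not reproduced in this text, so every formula here is an assumption: the global sink is assumed to travel at a constant speed `v0`, its arrival time at a rendezvous point is the accumulated travel and collection time, the cycle time adds the return leg to the static sink, and each local sink is assumed to complete exactly one tour of length |R i | per global cycle. The function name `schedule` and all parameter names are hypothetical.

```python
def schedule(segment_lengths, collect_times, v0, local_route_lengths):
    """Compute assumed arrival times, cycle time and local sink speeds.

    segment_lengths: k + 1 leg lengths of the global path
                     (static sink -> point 1 -> ... -> point k -> static sink).
    collect_times:   k collection times, one per rendezvous point.
    v0:              assumed constant speed of the global mobile sink.
    local_route_lengths: k route lengths |R_i| of the local sinks.
    """
    arrive = []
    clock = 0.0
    for leg, collect in zip(segment_lengths, collect_times):
        clock += leg / v0          # travel to the next rendezvous point
        arrive.append(clock)
        clock += collect           # stay there while collecting data
    t_cycle = clock + segment_lengths[-1] / v0   # return to the static sink
    # Assumption: each local sink finishes one tour per global cycle.
    speeds = [length / t_cycle for length in local_route_lengths]
    return arrive, t_cycle, speeds


arrive, t_cycle, speeds = schedule([100, 200, 100], [10, 10], 10, [300, 600])
print(arrive, t_cycle, speeds)
```

Under these assumptions, a local sink with a longer route must move faster so that it is back at its rendezvous point when the global sink returns in the next cycle.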

Performance Evaluations
This section compares the performance of the proposed CDCA against the existing HiCoDG [20] and the balanced regional partitioning algorithm (BRPA) [22] in terms of rendezvous time, network lifetime, fairness index, standard deviation (SD) of energy consumption and efficiency index. The original BRPA uses a single mobile sink to collect data from all CPs. For comparison, the whole region is partitioned into several subregions, and each subregion applies the BRPA to select the CPs and is allocated a mobile sink to collect data from the selected CPs. Table 2 lists the parameters of the simulation environment. In the experiments, the sensor nodes are placed uniformly in an 800 m × 800 m rectangular region. The sensing range of each sensor is set to 20 m and the communication range is set to 30 m. The initial energy of each sensor is 100 J. Each sensor creates one data packet in each round, where a round is a predefined time period in which the mobile sinks cooperatively pass through all rendezvous points and data are collected from all local mobile sinks. Every mobile sink m i is assumed to know the time point at which it meets the global mobile sink m 0 .
To further investigate the performance of the compared mechanisms under different sensor distributions, four scenarios are considered in the experiments, as shown in Figure 3. The first scenario, called the balanced deployment scenario (BD-Scenario for short), adopts random deployment, and all sensor nodes can communicate with each other. The other three scenarios are unbalanced deployment scenarios, called UD1-Scenario, UD2-Scenario and UD3-Scenario. Large circle-shaped and X-shaped holes exist in the central region in UD1-Scenario and UD2-Scenario, respectively. In the UD3-Scenario, the sensors are connected to form an X-shape, which is partitioned by the holes.
Figure 4 shows the rendezvous time points between each of the eight local mobile sinks and the global mobile sink in the four scenarios. The four scenarios show a similar trend: the BD-Scenario performs best, and the rendezvous times increase with the number of mobile sinks. On average, the BD-Scenario saves 57% of the time cost compared with the UD3-Scenario, while the UD2-Scenario and UD1-Scenario save 18% and 23%, respectively. The main reason is that the BD-Scenario has a balanced deployment of sensors and contains no hole; a hole might force a mobile sink to move from one side to the other purely for travelling, without any data collection. As a result, the BD-Scenario leads to a shorter path length. For each group, the proposed CDCA selects more CPs and creates a shorter path for each local mobile sink. Thus both the average number of hops from each sensor to its CP and the average length between CPs are shorter. Consequently, the delay time of each local mobile sink is reduced. This also reduces the rendezvous time, since the global mobile sink can rendezvous with each local mobile sink earlier.
Figure 5a,b show the network lifetimes of the four scenarios as the number of sensor nodes varies from 300 to 800, using seven and nine mobile sinks, respectively. With the proposed CDCA, the lifetime of the WSN increases with the number of mobile sinks. The BD-Scenario has the longest network lifetime when the number of mobile sinks is nine. This occurs because more mobile sinks can visit more CPs and hence reduce the average number of hops from each sensor to the corresponding CP. Since UD1-Scenario, UD2-Scenario and UD3-Scenario have holes which block data transmissions, the BD-Scenario has the best performance in terms of network lifetime.
Figure 6 compares the three algorithms in the four scenarios in terms of fairness index. Let E i denote the energy consumption of sensor node s i in each round. The Fairness Index is defined by Expression (22).
Figure 6 shows the fairness indexes of the three approaches: CDCA, HiCoDG and BRPA. The BRPA approach partitions the monitoring region into several equal subregions and assigns a mobile sink to collect data from all sensors in each subregion. Since different subregions contain different numbers of sensors, BRPA has a smaller fairness index value. In HiCoDG, neither the balance of the number of nodes among clusters nor the distance from the current cluster-head to the next CP is considered. Therefore, more overhead is needed for travelling from one CP to another. As a result, fewer CPs can be included, leading to a larger hop distance from each sensor to the corresponding CP. The proposed CDCA determines more CPs and distributes the data forwarding workload among them. Hence the fairness index of CDCA stays at a nearly constant value close to 1 in all cases. In general, the proposed CDCA outperforms the HiCoDG and BRPA schemes in terms of fairness index in all cases.
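To make the fairness measure concrete, the sketch below computes Jain's fairness index over the per-round energy consumptions E i . Expression (22) is not reproduced in this text, so the use of Jain's index is an assumption; it is a common choice and is consistent with the statement that a value close to 1 indicates balanced consumption. The function name `fairness_index` is hypothetical.

```python
def fairness_index(energies):
    """Jain's fairness index: (sum E_i)^2 / (n * sum E_i^2).

    Equals 1 when all sensors consume the same energy and approaches
    1/n when a single sensor carries the entire load.
    """
    total = sum(energies)
    squares = sum(e * e for e in energies)
    return (total * total) / (len(energies) * squares)


print(fairness_index([2, 2, 2, 2]))   # perfectly balanced consumption
print(fairness_index([4, 0, 0, 0]))   # one sensor carries all forwarding
```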
The standard deviation (SD) of the energy consumption of each sensor node was measured using Equation (23).
where E i denotes the energy consumption of sensor node s i and E avg denotes the average energy consumption of the sensor nodes. A small SD value means that energy consumption is evenly distributed, which indicates a long network lifetime. Figure 7 further investigates the effects of the numbers of sensor nodes and mobile sinks on the SD value in the four scenarios for the three mechanisms. The number of sensor nodes was varied from 300 to 800, and the number of mobile sinks was varied from 2 to 10. In general, the SD value decreases as the number of sensor nodes increases. The BRPA yielded the largest SD value among the three compared mechanisms because the node balance in each subtree was not considered. In the HiCoDG scheme, multiple mobile sinks cooperate with each other to collect and relay data, resulting in a lower SD value than that of the BRPA. However, the SD value of the HiCoDG scheme was higher than that of the proposed CDCA mechanism. This occurs because HiCoDG did not consider the node balance factor in each cluster. Moreover, the distance from the current cluster-head to the next cluster-head is not considered when constructing the path. The proposed CDCA mechanism partitions the sensor nodes into several disjoint subsets and selects as many CPs as possible in T i . Then CDCA constructs a path π i passing through all CPs and collects data from higher-burden CPs in each subset. After that, the CDCA applies the speed control mechanism to maintain a stable rendezvous time point between the global mobile sink m 0 and each mobile sink m i . As a result, the proposed CDCA mechanism yielded the lowest SD value, as shown in Figure 7.
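The SD measure can be illustrated with a short sketch. Equation (23) is not reproduced in this text; the population form (dividing by the number of sensors n) is assumed here, and the paper's equation may instead divide by n − 1. The function name `energy_sd` is hypothetical.

```python
from math import sqrt


def energy_sd(energies):
    """Population standard deviation of per-sensor energy consumption."""
    average = sum(energies) / len(energies)
    variance = sum((e - average) ** 2 for e in energies) / len(energies)
    return sqrt(variance)


print(energy_sd([2, 4, 4, 4, 5, 5, 7, 9]))   # uneven consumption
print(energy_sd([3, 3, 3]))                  # identical consumption
```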
Let ζ efficiency represent the efficiency index, which equals the average path length cost multiplied by the resource unused rate. The path length cost is measured by the total length l at divided by the number of CPs (τ). The maximal path length constraint l max is regarded as the resource, and the unused resource is measured by 1 − (l at /l max ). Equation (24) indicates that the efficiency index depends on the path length cost for each CP and the resource unused rate.
where l at denotes the actual length of the travelling path of mobile sink m i in each subtree in each round and τ denotes the number of CPs in the subtree. Figure 8 depicts the impact of path length on the efficiency index in the four scenarios. In general, the efficiency indices of the BRPA, HiCoDG and CDCA mechanisms increase with the number of mobile sinks, as shown in Figure 8a,b. The proposed CDCA outperforms HiCoDG and BRPA because these two mechanisms do not consider the constraints on the number of mobile sinks and on the path length. As shown in Figure 8, the BRPA yielded the smallest efficiency index value among the three compared mechanisms because the node balance in different subtrees was not considered. In addition, the efficiency index value of the HiCoDG scheme was smaller than that of the proposed CDCA mechanism. This occurs because the HiCoDG scheme also did not consider the node balance among clusters, and the distance between the current cluster-head and the next cluster-head is not taken into consideration. The proposed CDCA balances the number of nodes in different subtrees and finds as many CPs as possible under the constraint on the total path length in each subtree. Therefore, the proposed CDCA achieves higher resource utilization and smaller path cost than the other two mechanisms. Consequently, the proposed CDCA mechanism yielded the largest efficiency index value.
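The verbal definition of the efficiency index can be evaluated as in the following sketch. It takes the description literally, multiplying the path length cost l at /τ by the unused rate 1 − (l at /l max ); Equation (24) itself is not reproduced in this text and may combine or normalize these terms differently. The function name `efficiency_index` and the sample values are hypothetical.

```python
def efficiency_index(l_at, l_max, tau):
    """Path length cost per CP multiplied by the unused share of l_max.

    l_at:  actual length of the mobile sink's path in the subtree.
    l_max: maximal allowed path length (the resource).
    tau:   number of CPs in the subtree.
    """
    path_cost = l_at / tau            # length cost per collection point
    unused_rate = 1 - l_at / l_max    # share of the length budget left over
    return path_cost * unused_rate


print(efficiency_index(l_at=50, l_max=100, tau=5))
print(efficiency_index(l_at=75, l_max=100, tau=5))
```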

Conclusions
This paper considers the data collection issue in a given mobile wireless sensor network where multiple mobile sinks aim to collect data from a given set of static sensors. The proposed CDCA mechanism aims to prolong the network lifetime by visiting as many CPs as possible, such that the forwarding cost of each static sensor is reduced. The proposed CDCA mechanism primarily consists of three phases: the Network Partition Phase, the CP Selection and Path Construction Phase, and the Speed Control Phase. In the Network Partition Phase, the CDCA partitions the sensor nodes into k disjoint subsets. In the CP Selection and Path Construction Phase, the CDCA selects k i CPs from the sensors in T i and constructs a path π i which passes through the k i CPs and collects data from higher-burden CPs. In the Speed Control Phase, the algorithm controls the speeds of the mobile sinks such that the global sink and each local mobile sink can rendezvous. The performance results show that the proposed CDCA outperforms the other mechanisms in terms of rendezvous time, network lifetime, fairness index, standard deviation (SD) of energy consumption and efficiency index. The proposed CDCA reduces the energy consumption of the tested WSNs by 18% and 37% in comparison with HiCoDG and BRPA, respectively.
Further studies for exploring energy recharging are planned. The large WSN is partitioned into smaller areas where each area is assigned a mobile sink. The proposed CDCA can be extended such that it can establish a recharging path for each area while considering the coverage issue. Moving along the path, a mobile recharger can recharge the sensors for maintaining maximal coverage for a given WSN.