Robust Data Recovery in Wireless Sensor Network: A Learning-Based Matrix Completion Framework†

In this paper, we are interested in the data gathering for Wireless Sensor Networks (WSNs). In this context, we assume that only some nodes are active in the network, and that these nodes are not transmitting all the time. On the other side, the inactive nodes are considered to be inexistent or idle for a long time period. Henceforth, the sink should be able to recover the entire data matrix whie using the few received measurements. To this end, we propose a novel technique that is based on the Matrix Completion (MC) methodology. Indeed, the considered compression pattern, which is composed of structured and random losses, cannot be solved by existing MC techniques. When the received reading matrix contains several missing rows, corresponding to the inactive nodes, MC techniques are unable to recover the missing data. Thus, we propose a clustering technique that takes the inter-nodes correlation into account, and we present a complementary minimization problem based-interpolation technique that guarantees the recovery of the inactive nodes’ readings. The proposed reconstruction pattern, combined with the sampling one, is evaluated under extensive simulations. The results confirm the validity of each building block and the efficiency of the whole structured approach, and prove that it outperforms the closest scheme.


Introduction
During the last decades, the Internet of Things (IoT) has emerged as a new business model that is composed of billions of communicating devices. Hence, it has gained considerable attention in both the scientific community and industry. However, the inclusion of the IoT into the fifth generation cellular systems (5G) and their evolution still represent a formidable technical challenge due to the huge number of sensors and the generated information. Note that one of the main challenges of the 5G is the massive connectivity for Machine-Type Communications (MTC) and managing its coexistence with the high rate continuous traffic that is generated by Human-Type Communications (HTC) in an efficient and effective manner. An interesting proposal is the Compressive Sensing (CS), which reduces the number of active agents at a given time slot, while remaining able to recover the sensing data. In general, Wireless Sensor Networks (WSNs) consist of a large set of sensor nodes, which are self-organising and geographically distributed across the network. They are usually used to monitor various physical phenomena with a high resolution, such as in forests, under water, as well as in civilian and habitat application areas. Usually, these devices operate in an unattended mode and they are unable to renew their batteries. Hence, energy efficiency is the main challenge for these networks, since it directly affects their lifetimes and, thus, their sustainability. In usual data gathering techniques, each sensor node takes measures and sends data to the sink node via multi-hop transmission. If nodes face packet losses, due to collisions or buffer overflows, packets are retransmitted, which leads to a high sensing cost and a heavy traffic, especially in large-scale networks. Indeed, reducing the number of transmitting source nodes, using techniques such as CS, is not only useful in reducing the collisions, but also crucial for sensor nodes that need to sleep to prolong their lifetimes.
Recently, it has been shown that the integration of the Matrix Completion (MC) technique, viewed as an extension of CS, has enhanced wireless networks scenarios. If the received data matrix has a low-rank structure, then it can be recovered with high accuracy while using the partially received elements [1]. Firstly, data are directly sensed in compressed form and the high energy-intensive recovery algorithm is executed at the sink node. Hence, the computation complexity is moved from sensor nodes to the sink. This meets well the resource-constrained devices and significantly reduces the energy consumption. Secondly, because MC handles the data in its matrix form, it can fully capture the signal correlation in both space and time dimensions and, hence, achieves a satisfying interpolation quality with a higher compression rate (very few transmitted readings).
In some applications, especially the densely deployed WSNs, the sensed data are, in general, highly correlated, and redundancy exists between the sensor nodes that belong to the same geographic area. These nodes can be arranged into a group or a cluster. Because they are monitoring the same targets or events, collecting raw data from all of the cluster members becomes inefficient and wasteful for the energy. Therefore, a sufficient subset of nodes can be selected from each group, according to a certain criteria, to be the representative of the whole network. These active nodes deliver their readings to the sink under a compression ratio that is guaranteed by MC theory, while the rest of nodes remain silent and do not participate in the sensing operation. Thus, as an extension to [2], we carry on with the twofold compression technique that has been updated compared to the paper [2]. First, we assume that part of nodes do not sense the environment at all. We can consider that these sensors are inactive or idle for a long period or that these nodes are absent. Specifically, in this paper, these notations are only related with the sensing activity, and all of the nodes are connected in order to participate in data forwarding (Here, a node is absent in the sense that its data reading is completely missing, and the sink node has to recover it correctly). The second compression level is that, at each time slot, only a subset of the active nodes, referred to as the transmitting ones, send their sensing data to the sink. Different from [2], the nodes having the higher correlation with other nodes, i.e., best represent the network, are selected as representative sensor nodes. Indeed, in order to be chosen as active nodes, they should be able to capture enough information regarding the others and the whole network. This strategy not only minimizes the energy cost and extend the network lifetime, but it also helps to avoid other problems, such as the traffic congestion collapse [3]. It is true that in [4], the choice of the active nodes follows a deterministic based metric. However, unlike [4], in this work, we explain, in detail, each building block of the introduced structured MC-based data gathering framework (the representative nodes selection process and the network clustering phase). Subsequently, we separately evaluate them in the numerical results section in order to illustrate the benefits of each building block of the proposed technique. Furthermore, in this paper, we propose a Multi-Gaussian signal model that introduces the solution of reproducing a signal retaining the behaviour of a given real world data by adjusting the correlation parameters. For that reason, this method represents an effective alternative to the real world signals.
The application of the just mentioned atypical high-loss scenario leads to a significant number of empty rows in the signal data matrix (a row (resp. column)) is called an empty row (resp. column) if and only if all of the values of the row (resp. column) are un-sampled), which completely disagrees with MC fundamentals. In fact, because MC approaches are based on the minimization of the matrix rank, they become useless when there is any empty row or empty column in the matrix. Indeed, MC techniques have been conceived to recover a matrix containing random missing elements [5]. In the state-of-art of MC-based algorithms, to the best of our knowledge, Ref. [6] is the only paper who dealt with the case, where there is a small number of missing rows in the received data matrix by applying a spatial pre-interpolation technique, which recovers data from neighboring sensor nodes. However, as the number of active nodes decreases, we also face absent nodes having absent neighbor sensors. Thus, this framework becomes unable to recover the data rows of these isolated sensor nodes. Hence, although this approach is interesting, it seems not well suited for the addressed scenario and it fails to take the existence of isolated sensor nodes (absent nodes having all their neighbors absent) into account. In this context, we develop our scheme, which, firstly, schedules the sampling pattern after efficiently identifying the different clusters and their representative nodes. Secondly, it treats the case of high compression ratios with a considerable number of inactive sensor nodes (empty rows) while using a sequence of three different interpolation techniques.
The proposed framework is also useful for another challenging scenario; when we have a small number of sensors that have to be deployed in a spacious area. Indeed, either the sensor nodes are costly or the environment is large enough to be content with the limited number of sensors. This may also concern the harsh environments that are difficult to access such as volcanoes and other troublesome environments, where the deployment of many sensor nodes is not practical and becomes expensive. However, in many applications, the amount of gathered data must be significant enough to be processed. The idea here is to place a relatively small number of spatially spaced sensor nodes to control the correlated field under a compression ratio. These sensor nodes represent other sensor nodes that do not really exist. Particularly, the sensory data field is, most of the time, highly correlated and redundant between nearby sensor nodes, which makes possible to estimate the readings at locations, where the signal cannot be sensed.
The main contributions of the paper are summarized, as follows: • We generate a synthetic space-time signal that is composed of different Gaussians, each of which presents a cluster of wireless nodes. Like all the WSNs signals' profiles, the generated signals are correlated in space and time, where spatial and temporal correlation parameters and models differ from one Gaussian to another and can be separately adjusted. • For the sampling part, only a small subset of sensor nodes is selected to be active and report its readings. For each detected cluster, the active sensor nodes selection is achieved by considering the correlation criteria. Subsequently, for each time instant, we choose the transmitting sensor nodes with the same percentage from each cluster in order to ensure the diversity in the transmitted data, notably for the high compression ratios. • For the reconstruction part, we propose using three different techniques to accurately rebuild the entire data matrix. In the first step, we fill the missing readings of the active sensor nodes by applying the MC. Subsequently, we carry on with the spatial pre-interpolation to handle a part of the empty rows while adjusting the 1-hop topology matrix to the presence of the disjoint clusters in the monitored field. Finally, we recover the rows of the isolated sensor nodes using a minimization problem interpolation-based technique with a spatial correlation matrix. In this paper, the third stage of data recovery pattern has been re-investigated and improved to be more efficient when compared to the one used in papers [2,4], i.e., providing a lower data recovery error for the isolated nodes. In the numerical results section, we evaluate the two techniques with respect to a tuning parameter, and we show that the proposed minimization problem interpolation-based method significantly enhances the data recovery performance. • Through extensive simulations, we show that the proposed framework outperforms other existing techniques in the literature, especially when the number of inactive nodes increases.
The remainder of the paper is organized, as follows. The next section discusses the related work, and Section 3 provides a brief overview on the MC theory and introduces the problem formulation of the paper. Section 4 presents the signal model that we used for the evaluation of our approach. In Section 5, we introduce the efficient clustering method that we propose. In Section 6, we present a strategy that selects the set of the representative sensor nodes. Section 7 is dedicated to the data reconstruction framework. Before concluding the paper in Section 9, we carry out, in Section 8, extensive simulations in order to evaluate the performance of the proposed approach.

Related Work
Environmental WSN signal profiles exhibit both spatial and temporal dependency. Such structures generate redundancy and enable a succinct representation of the data while using a number of coefficients that are much smaller than its actual dimension. One popular postulate of such low-dimensional structures is sparsity, which is, a signal can be simply represented with a few non-zero coefficients in an invertible proper sparsifying domain [7]. CS has been introduced as a good fit for such application in both the acquisition and reconstruction of the signal [8]. With a number of measurements proportional to the sparsity level, CS enables the reliable reconstruction of the signal. Indeed, the latter can be encoded using a much lower sampling frequency than the traditional Nyquist one [9][10][11]. In order to handle the under-determined linear systems, efficient convex relaxation and greedy pursuit-based solvers have been proposed, such as NESTA [12], L1-MAGIC [13], and orthogonal matching pursuit (OMP) [14]. Over the past years, plenty of papers have addressed the data gathering problems in WSNs by the integration of the CS theory, which had made appealing progress in the network energy consumption [15][16][17][18][19][20][21].
Originally, CS-based schemes were designed to sample and recover sparse vectors, and they were classified either as purely spatial approaches [18][19][20][21][22] or as purely temporal ones [23]. Despite the incorporation of the kronecker CS framewok, the standard resolution of CS is still formulated in vector form [15,16,[24][25][26]. Moreover, tools from linear algebra are still needed in order to reformulate the data matrix into the vector form. Without the need of computing an adaptive sparsifying basis, MC has recently emerged using another type of structural sparsity (a low-rank matrix holds singular values composing a sparse spectrum) [27], which is the matrix low rank property [1]. Because it treats the data matrix as a genuine matrix, MC can take advantage of the correlation in its two dimensions and capture more information. In [28], the authors have found that the data reconstruction performance of the MC depends on the compression ratio. In our previous work [29], we have illustrated that a simple MC-based approach requires a smaller fraction of sensor node readings. In [30], a state-of-the-art of MC-based algorithm for compressive data gathering has introduced the short-term stability with the low-rank feature. The considered feature was used not only to reduce the recovery error, but also to recover the likely empty columns appearing in the received data matrix. The existence of the empty columns was possible, since the readings were forwarded according to a presence probability. Differently, Zhou et al. in [31], have taken advantage of the temporal stability feature and a MC method based on the Bayesian inference to interpolate the missing data. Furthermore, the authors, in [32], addressed joint CS and MC. They have used the CS to compress the sensor node readings and then the MC to recover the non-sampled or lost information. However, this approach has not been compared to other state-of-the-art approaches to show its real contribution. In addition, they have not taken advantage of the space-time correlation of the signal as it should be, since they have used standard compression and sparsifying matrices for the CS. Different from [32], Wang et al. in [33], explored the graph based transform sparsity of the sensed data and considered it as a penalty term in the resolution of the MC problem. Similarly, Ref. [34] combined the sparsity and the low-rank feature in the decoding part, and, as in [33], has used the alternating direction method of multipliers to solve the constrained optimization problem. However, the authors have focused on vector-valued signals when sampling. In [35], the authors introduced an active sparse mobile crowd sensing approach that is based on MC with the intention to reduce the data acquisition cost, while inferring the non-sampled data readings. Because adaptability and efficiency are two very important issues in WSNs data gathering, Ref. [36] has proposed an adaptive and online data gathering scheme for weather data, purely based on MC requirements. In contrast to our proposed approach, this paper has addressed the sampling side differently. Indeed, they have focused on the sampled data locations in the received data matrix, whereas we have considered the sampled data locations in the network area.
The authors of [6] have focused on the case of MC recovery with the existence of successive data missing or corruption, which is referred to as structure faults. Indeed, they have considered that successive data may be missing or corrupted due to channel fading or sensor node failures, which creates successive missing data on rows and/or on columns. However, treating a significant number of totally missing rows was out of the scope of their paper. In this paper, we investigate how to solve a challenging problem in the WSNs: how to omit a considerable number of sensor nodes from the monitoring field and estimate their readings from the partially reported readings of the representative sensor nodes using a MC-based approach. It is worth mentioning that efficiently identifying the clusters, their representative nodes, as well as the data transmission schedule significantly affect the recovery accuracy.

Overview of Matrix Completion
As an extension of CS, the MC technique has emerged recently to benefit from the signal low-rank feature in order to recover the missing data from a substantially limited number of matrix entries [1]. That is, a partially unknown matrix M ∈ IR N×T of rank r min{N, T} can be entirely reconstructed if a subset of its sampled elements M ij as well as their indices (i, j) ∈ Ω are available at the receiver side. The entry-wise partial observation operator P Ω : IR N×T → IR N×T is defined by the following expression: Roughly speaking, the goal of the MC is to find a low-rank matrix X that is consistent with the observed measurements M ij . According to [1], if Ω contains enough information and if M ∈ IR N×T is a low rank matrix (to check whether the data matrix has a low-rank or approximately low-rank structure, one can perform the Singular Value Decomposition method [37]), we can fill the unknown entries by solving the following rank minimization problem: Yet, problem (2) is not convex, and algorithms solving it are doubly exponential. Fortunately, the nuclear norm X * minimization problem, which is a convex relaxation, can be solved. In fact, it is deployed as an alternative to the NP-hard rank minimization problem [38]. Thus, we have: X * , which is also referred as the trace norm of X, denotes the sum of its singular values τ i ≥ 0. In the literature, various efficient solvers for this type of systems have been suggested. For example, the Singular Value Thresholding (SVT) optimizes an approximation of (3) by adding a Frobenius-norm term to the objective function [39]: Different from (3), another method has been proposed to approximate (2) rather than the nuclear norm, which is the matrix factorization. Low rank matrix fitting (LMaFit) [40], Sparsity Regularized SVD (SRSVD), and Sparsity Regularized Matrix Factorization (SRMF) [27] are among the approaches that use the matrix factorization method. These approaches are based on the fact that any matrix X ∈ IR N×T of a rank up to r can be explicitly written as the product of two matrices with the form X = LR tr , where L ∈ IR N×r and R ∈ IR T×r . Hence, the goal here is to search over the set of rank-r matrices and find a point LR tr that is closest to the set of matrices, which meets M at all known entries. In order to solve the problem, an alternating minimization scheme is used by fixing one of L and R and making the other one as the optimization variable.

Problem Formulation
Consider a WSN that is composed of a set N = {1, . . . , N} of N sensor nodes. Let X ∈ IR N×T denote the data matrix that contains measurements that are collected by the set N during a sensing period of length T time slots. Precisely, the entry in the i th row and t th column of X, x i,t represents the t th data reading (t ∈ [1, T]) sensed by the i th node (i ∈ N ). Both of the considered scenarios aim to estimate the full sensor nodes' readings, X, through the use of a small subset N rep = {1, . . . , N rep N} of active sensors, being denoted by representative sensor nodes. It is worth mentioning that the number of active sensors is relatively small when compared to the number of inactive/inexistent ones. Specifically, decreasing the number of active sensors can likely generate a set of absent sensors that also have all their neighbors absent. We call them isolated (IS) sensor nodes.
We propose grouping together sensor nodes having similar readings in the same cluster while using a spectral clustering technique. In fact, the whole network is organized, as follows: N = J j=1 CL j and N = ∑ J j=1 cl j , where cl j is the number of sensor nodes that belong to CL j (cl j = card(CL j )), J is the number of detected clusters and CL j is the cluster j. Note that sensor nodes should capture enough information to be chosen as active. In the following, we define the node selection criterion, i.e., determine the nodes having the best presentation of the whole network. It will be shown in the sequel that the representative node selection as well as the data transmission schedule depend on the detected clusters.
To further reduce energy consumption, the representative sensors do not transmit their raw data to the sink. Instead, they trade on the data sensing along the T time instants and deliver a part of their readings according to a given compression ratio, which is, m < N rep readings rather than N rep readings per time slot. Consequently, the received data matrix M ∈ IR N×T is composed of N rep partially empty data rows and (N − N rep ) completely empty data rows. Note that, to replace any missing entry in M, we set a "zero" as a placeholder. We use a binary sample matrix Ω M ∈ IR N×T that we call sensing and transmitting schedule to indicate, in each time slots t, which nodes sense and transmit its measurements. That is, Ω M(i,t) = 1 if x i,t is available and 0 otherwise. Hence, the incomplete delivered data matrix M can be represented as the Hadamard product between Ω M and X.
The first aim of our work is to well identify the matrix Ω M , as it represents the sampling schedule, which is of prime importance in the recovery performance.
The second aim of our work is to successfully recover all of the missing entries using a limited number of received readings. Therefore, we opted for the MC technique because of its numerous aforementioned benefits. Yet, the application of MC with the existence of a significant number of empty rows is still a challenging task to tackle, since the presence of empty rows or columns impedes the MC reconstruction. Thereby, in this paper we propose a novel interpolation technique that will be annexed to the MC one in order to recover the empty rows. It is noteworthy that the MC, as the first step in the reconstruction operation, is an important part since the performance of the subsequent proposed interpolation technique depends on the recovery accuracy of the MC. Figure 1 illustrates an example of a WSN that consists of N = 16 sensor nodes, among which N rep = 6 sensor nodes are selected to be active. The proposed combined reconstruction approach targets filling all of the missing entries corresponding to the non-transmitted readings.

Signal Model
In this section, we investigate the generation of a synthetic signal that is composed of different Gaussians, each of which presents a portion of the whole monitored geographic area. Because structure and redundancy in data are often synonymous with sparsity, which is analogous to low-rank [27], each portion of the signal is correlated in space and time, where the spatial correlation as well as the temporal correlation parameters differ from one Gaussian to another. These parameters can be separately adjusted because the corresponding functions are independent [41].
The proposed signal model is inspired from [41] that has introduced the solution of reproducing a signal retaining the behavior of a given real world data by adjusting the correlation parameters. In their model, all of the generated samples of the whole signal are Gaussian random variables with zero mean and unit variance. However, in this paper, we consider heterogeneous fields that are divided into a number of regions. Each one is modeled by a specific Gaussian (mean, variance) and different correlation characteristic. The number of different Gaussians as well as their distribution on the field can be fixed or defined according to the kind of signal that one wants to reproduce. This method represents an effective alternative to the real world signals.
In order to generate the signal of interest, we suppose that is the space domain, where x and y are the space coordinates. Consider that we have H different regions, where D h is the space domain of region h = 1, 2, . . . , H, and D = H h=1 D h . Likewise, we suppose that the time is slotted into equal time slots t = 1, 2, . . . ., T. Without a loss of generality, in Algorithm 1 we describe how to generate a correlated portion of the signal z x (p h , t) : D h × T → IR representing one region, where T is the time domain and p h is a point in (x, y) plane that corresponds to region h. The signal of the whole area is the combination of all the generated portions.
In order to obtain a spatially correlated signal, we apply to the signal, to be generated, a 2D filtering procedure using a specific correlation function rs(p), where p = (x, y). Among the numerous existing models in the literature, we generate the signal using the Gaussian filtering, as used in ( [15] Equation (2)), which can be controlled by the parameter γ > 0 (the Power Exponential model [41], when ν is equal to 2). The coloration of the signal with rs(p) has to be done in the frequency domain. Hence, before modeling the spatial correlation, a Fourier transformation is performed. Regarding the temporal correlation, the authors of [41] have used an autoregressive filter to enforce the temporal correlation in the signal model. Because the time is slotted into equal time slots, they only consider the one-step time correlation and use a simple coefficient ρ ∈ [0, 1].

Algorithm 1 Model for generating a portion of the signal
Input: the generated field for t = 1 : w x (p h , t), the temporal correlation parameter ρ h , the spatial correlation parameter γ h , the spatial correlation function computed in the frequency domain Rs(ω h ) = F(rs(p h )).
. 10: To start the signal generation process, for t = 1, we define w x (p h , t) : D h × T → IR to be an i.i.d random Gaussian. That is, for any specific position p h (x, y), w x (p h , t) is a Gaussian random variable with mean η h ∈ IR and unit variance. Algorithm 1 describes how to produce a portion of the whole signal z x (p, t) : D × T → IR, which represents the (x, y) signal. By construction, z x (p, t) is a three-dimensional (3D) matrix of size (2y D × 2x D × T). The data matrix of interest, X, denotes the two-dimensional (2D) signal that is discretized by the N sensor nodes along the T time slots. Figure 2 illustrates an example of an area of size 100m × 100m that is monitored by N = 50 sensor nodes. We can notice, through the colors, that this field is divided into three different regions (H = 3) that are presented by three different Gaussians.

Clusters Detection
In this section, we investigate the partition of the deployed sensor nodes into J clusters. The main reason for partitioning the nodes is to involve all of the detected clusters in the data sensing and transmission. It is well-known, in the conventional MC, that transmitting sensors are selected in a random way during the T time slots. This kind of selection can disregard sensors that belong to the small clusters, which deteriorates the recovery process. However, if we make all of the clusters contribute in the data transmission process, then we fortify the diversity in the delivered data set. Therefore, for each t, according to a given compression ratio and using the same percentage, a set of sensor nodes is picked from each cluster to form the sampling and transmission schedule. It will be shown, in the simulation section, that taking the detected clusters during the sampling process into account significantly enhances the data recovery performance, especially for the high compression ratios. Indeed, our aim is to partition the sensor nodes into different clusters in such a way that we attempt to maximize the intra-cluster similarities and minimize the inter-cluster similarities. Such a successful grouping can be achieved while using the Normalized Spectral Clustering Usually, sensor nodes, which are situated spatially close to each other, have similar readings. Nevertheless, there are some cases, where nearby nodes are separated by a certain barrier and they have readings relatively different from each other. Given the example of sensor nodes deployed in a city to monitor the air pollution. Suppose that we have a public garden located next to a road. Hence, the nearby nodes, which are placed on different sides of the borders, do not necessarily have similar readings. Therefore, to cluster the nodes, the sink relies on their delivered readings (at the initialization, we let all of the sensor nodes send their information during a short learning period T l p T) and considers the set of data vectors, χ l p = {x tr l p 1 , x tr l p 2 , . . . , x tr l p N }, which we want to partition into J clusters. x l p i ∈ IR 1×T l p , viewed as a T l p -dimensional data points, holds the readings that are sent by the sensor node i during the learning period. The spectral clustering technique performs data clustering and treats it as a graph partitioning problem without setting any assumption on the clusters form. It transforms the given set χ l p into a weighted graph G = (V, E) while using some notion of symmetric similarity matrix A ∈ IR N×N , where each vertex v i represents x l p i , and each edge between two vertices v j and v i represents the similarity a j,i ≥ 0. It is recommended to use the Normalized Spectral Clustering, as mentioned above. Hence, we implemented the NJW algorithm [42] (the algorithm name, NJW, is attributed according to the authors' names, which is, Ng, Jordan, and Weiss), which is detailed in Algorithm 2.
Commonly, identifying the number of clusters J in an optimal manner is the main concern of all clustering algorithms. Generally, with spectral clustering, we find the number J by analyzing the Laplacian matrix eigenvalues that are computed using A and according to the chosen clustering method. In this work, we choose to apply the eigengap heuristic [43], which defines J by finding a drop in the magnitude of Laplacian eigenvalues, {λ 1 , λ 2 , . . . , λ N }, sorted in increasing order. That is: The idea here is to pick the number J in such a way that all of the Laplacian eigenvalues λ 1 , . . . , λ J are very small when compared to λ J+1 , which marks relatively a large value.
Regarding the similarity matrix A, we opted for the Gaussian kernel to measure the similarity between the data points {x l p i } [42], where σ is a scaling parameter that controls the neighborhoods width: According to ([42] Theorem 2), an appropriate σ can be automatically fixed after repeatedly running the algorithm while using a number of values and choosing the one that forms the least distorted partition in the spectral representation space. In order to determine the appropriate parameter σ, in ([43] Section 8), the authors had provided several rules of thumb that are frequently used. As an example, the method that we have used states that σ can be chosen to be in the order of nearly the mean distance of a point to its k th m nearest neighbor, where k m ∼ log(N) + 1.

Algorithm 2 The Ng, Jordan, and Weiss (NJW) Spectral Clustering algorithm
Input: The set of data vectors χ l p = {x tr l p 1 , x tr l p 2 , . . . , x tr l p N }, the number J of clusters to detect according to (5).
Pre-processing: 1: Calculate the similarity matrix A according to (6). 6: Normalize the U's rows to norm 1 in order to get the matrix U n ∈ IR N×J , that is, i,j ) 1/2 . Clustering: 7: Treat each row of U n , (u n i ) i=1,. . . ,N , as a data point in IR J , then partition them into J subgroups, Q 1 , . . . , Q J , using k-means algorithm. Output: Clusters CL 1 , . . . , CL J with CL j = i u n i ∈ Q j . Figure 3 plots the sorted eigenvalues of the normalized Laplacian matrix that is computed from the generated signal of the example of Section 4 while using the first four steps of the aforementioned clustering algorithm. Clearly, there is a relatively large gap between the 3rd and 4th eigenvalue of this trace. According to metric (5), the data set contains three clusters, which is well approved.

Sampling Pattern
In this section, we determine how the correlation criteria can be considered to select the representative sensor nodes and how we take the detected clusters in the selection process as well as in the sensing and transmission schedule into account. Unlike our previous work [2], where the set N rep of representative sensor nodes is randomly chosen, in this paper N rep must hold enough information towards the other nodes to be chosen as representative of the network. Relying on the Enhanced Correlation Based Deterministic Node Selection (ECB-DNS) procedure, which was used in previous works [15,16], the active sensor nodes selection is achieved by considering the inter-spatial correlation, which is computed through the conditional variances of the sensor nodes. This technique enables selecting the sensor node g * holding the maximum informative value m with respect to the set S 1 of sensor nodes that are not selected yet. Namely: In (7), σ ig represents the covariance between the reading x i of sensor i and the reading x g of sensor g, whereas σ 2 g presents the variance of x g . It is noteworthy that the way of exploiting this technique in our approach is different to that in [15]. According to their scenario, all of the N nodes contribute to the data sensing and transmission over the T time slots, while, in this approach, only N rep N nodes are selected to be active and represent the J detected clusters. In order to cover all of the clusters, the set N rep consists of the combination of J subsets, (N rep j ) j=1,. . . ,J , where N rep j includes N rep j representative nodes picked from cluster CL j while using the same shared percentage pct Nrep . That is: In (8), if pct Nrep % × cl j is not an integer, we round N rep j to the nearest integer greater than or equal to the value of that element. Here, the selection of the sets N rep j of clusters' representative nodes is independent from one cluster to another. Hence, the set S 1 that appears in expression (7) is replaced by the set S j 1 , which represents the sensor nodes of the cluster CL j that are not yet selected. Thus, we have: The selection process is the same for the J sets N rep j . Thus, for each cluster CL j , according to (9), at each iteration n ∈ {1, . . . , N rep j }, a sensor node g * (n) is selected and moved from set S j 1 to set S j 2 . Note that S j 2 represents the set of nodes of cluster CL j that are already chosen during the previous iterations. Once a sensor g * (n) is put in S j 2 , the metric m of the remaining sensors of set S j 1 should be recomputed in order to prepare the selection of the next sensor node g * (n + 1). Here, by removing g * (n) from S j 1 , we cancel its impact on the rest of the nodes in S j 1 . Hence, the selection of the sensor node g * (n + 1) will be achieved as if the sensor node g * (n) did not exist in the network. The node selection process, especially the manner in how we remove the correlation effect of node g * (n) from S j 1 , follows the steps that are outlined in Algorithm 3. For the initialization, we define the data matrix sent during the learning period X l p = [x tr l p 1 , x tr l p 2 , . . . , x tr l p N ] tr ∈ IR N×T l p that we partition into J sub-matrices X j l p ∈ IR cl j ×T l p , where X j l p holds data sent by nodes belonging to CL j . Besides, we assume that the spatial correlation feature inherent in X l p reflects that in X. By analogy with [15], the computational complexity of selecting N rep j representative nodes from CL j is O(N rep j cl j ). However, different from [15], where in each time slot t, a new and a different set of active transmitting source nodes should be found using the node selection metric, in this work, the selection of the representative nodes' set is performed only once, at the beginning of the sensing period T.
Given the example of Figure 1, we can note the existence of three detected clusters within the network. We suppose that pct Nrep = 30. Thus, 30% of nodes will be selected from each cluster to be active. That is to say that we should pick N rep 1 = 2 sensors from CL 1 , N rep 2 = 1 sensor from CL 2 and N rep 3 = 3 sensors from CL 3 . That is, in total N rep = 6 representative sensors. Based on the correlation among the sensor nodes and using Algorithm 3, the obtained subsets are as follows: Once the set N rep of representative sensor nodes is defined, the sink focuses on the sensing and transmitting schedule, Ω M , by assigning m transmitting nodes for each time instant t. Obviously, these nodes are picked from the set N rep . As has been stated in the previous section, in order to ensure the diversity in the delivered data, the m transmitting nodes are chosen in such a way that we randomly pick, with the same shared percentage pct m , m j nodes from each subset N rep j corresponding to cluster CL j . Likewise (8), we have: Let us focus again on the example of Figure 1, we suppose that pct m = 20. Thus, for each t, 20% of sensors from each subset N rep j are randomly designated to deliver their data to the sink. Because the used number N of this example is very small, we end with m j = 1 transmitting sensor from each cluster for each t.
To conclude, rather than selecting, in a purely random way, the measurement locations, as usually used in the conventional MC method, in this section we presented how to intelligently assign transmitting sensor nodes that can well represent the network relying on their correlations.

5:
Remove the reading x j l p g * (n) of node g * (n) from X j 1 so that it becomes X ] ∈ IR cl j −n×T l p and X j 2 takes the values of node g * (n) so that X j 2 = x j l p g * (n) . 6: Following that removal, Σ j can be written as: else if (n ≥ 2) then 8: Following the removal of node g * (n − 1) from S j 1 , re-compute the conditional covariance matrix of X j 1 knowing X j 2 = x j l p g * (n−1) ; Σ j 1,1|2 ∈ IR cl j −(n−1)×cl j −(n−1) where: According to (9) and using Σ j 1,1|2 , re-compute the metrics m then select g * (n).

Reconstruction Pattern
After revealing in detail how to select the N rep representative sensor nodes and how to schedule their participation in the data sensing and transmission, in this section we focus on how to approximate the entire N × T data matrix X based on the limited amount of reported readings. Isolating (N − N rep ) inactive sensor nodes from the sampling and transmission schedule entails the existence of (N − N rep ) fully empty rows in the received data matrix M ∈ IR N×T , which impedes the MC technique that is completely unable to estimate the original matrix. Therefore, the use of other complementary interpolation techniques becomes needed. In this context, we develop a structured MC-based recovery algorithm that is able to ensure the reconstruction of the entire N × T data matrix X. Stage 1: obviously, it is not feasible to directly apply the MC technique with the existence of (N − N rep ) fully empty rows. Therefore, we have to remove these rows from M. We denote the resultant matrix as M MC ∈ IR N rep ×T , containing the partially delivered readings of the representative sensor nodes. We carry on with the same removal from Ω M to obtain Ω MC ∈ IR N rep ×T . Subsequently, making use of the solution introduced in (4) or any other method proposed for the MC resolution, we fill the missing entries of M MC that correspond to the non-transmitted data readings of the N rep sensor nodes. The threshold parameter τ roughly equals 100 times the largest singular value of M MC , as has been introduced in [39]. We denote X ∈ IR N rep ×T as the combination of the MC based estimation and directly observed data. Finally, we update X ∈ IR N×T by adding the (N − N rep ) empty rows and then placing them in their proper corresponding locations of M.
Stage 2: after filling the random missing readings, leave the (N − N rep ) completely missing rows that correspond to the inactive sensor nodes. In this phase, we carried on with the spatial pre-interpolation technique of [6], which rebuilds the data of an empty row relying on the available data of the neighboring sensor nodes. To apply this method, they used a kind of an N × N binary symmetric matrix Y that they called a 1-hop topology matrix, where both of the columns and rows denote the sensor nodes. The sink assigns 1 to Y(i, j) and Y(j, i) if it finds that sensor node i and sensor node j are 1-hop neighbors. However, according to the signals nature that we consider, and to avoid untrustworthy data reconstruction, we consider that, even though two sensor nodes are geographically close to each other, if they do not belong to the same cluster, then they are not considered to be neighbors.
The number N rep of the active sensor nodes is very small when compared the total number N, which means that the (N − N rep ) inactive sensor nodes constitute the preponderant portion of the network, as mentioned before. Consequently, there are several IS nodes in the network (having all of their neighbors absent). Hence, with the use of the stated topology matrix Y, this interpolation technique can achieve the data reconstruction only for the absent sensor nodes that have neighbors that belong to N rep . We suppose that the network distribution contains N Is isolated sensor nodes. Subsequently, the resulting data matrix X ∈ IR N×T , obtained at the end of this stage, still holds N Is empty rows to be recovered (N Is all-zeros rows). For the detailed steps of the above interpolation method, the reader may refer to ([6] Section VI) (As for the complexity of the used spatial pre-interpolation, according to [6], it is estimated to be very low, since this technique is based on simple matrix multiplication with neighbor information). Stage 3: since the above interpolation technique is limited to recover only a part of the total empty rows (absent nodes), we resort to a second spatial interpolation to rebuild the remaining part of the empty rows (isolated nodes). Benefiting once again from the spatial dependency among the sensor nodes, we fill the remaining empty rows while using the following minimization problem: where Ω IS denotes the set of indexes of the non isolated nodes, i.e., the representative nodes and the absent ones. S represents the spatial constraint matrix, whose computation steps will be detailed hereafter, f ac 1 and f ac 2 are two tuning parameters, and X ∈ IR N×T is the final reconstructed data matrix. It is noteworthy that the above proposed minimizationbased interpolation technique has been updated when compared to the one of our previous works ([2] Equation (8)) and ([4] Equation (5)), and, through simulations, we found out that the updated minimization significantly enhances the data reconstruction quality of the isolated nodes. Note that the resolution of this optimization problem can be easily accomplished while using the semidefinite programming (SDP). We opted for the CVX package [44], implemented in Matlab, as an advanced convex programming solver, in order to solve (11) and obtain X.
In this equation, the matrix S ∈ IR N×N relatively reflects our knowledge regarding the spatial structure inherent in the data, since it is computed based on the learning data matrix X l p ∈ IR N×T l p . This spatial matrix expresses the similarities between the sensor nodes' readings. Suitably, we use the Euclidean distance as a distance function in the data domain of the sensor nodes to model the similarity between the rows of X l p , whereby the smaller the distance between two rows, the closer they are. Below are the steps to obtain S: 1-We initiate these steps with an all-zeros matrix S.

2-
The similarity between the rows in X l p is not evident as the ordering of the sensor nodes' indexes in X l p is arbitrary. Thus, for each row i of X l p , we search for the set j i of indexes of the K closest rows to i, which is, 3-Assuming that the row i can be approximated through the linear combination of the rows of set j i , we perform the linear regression to compute the weight vector W = [w(1), . . . , w(K)] ∈ IR 1×K through the following equation: 4-Finally, we assign 1 to S(i, i) and −w(k) to S(i, j k ). As soon as these steps have been carried out for all the rows i, we obtain the matrix S, with which we interpolate X, as in (11) (here, since, for each row of X l p , we search for the set j i , while using a simple Euclidean distance, the complexity is O(N). Moreover, performing the linear regression in (12) to compute the weight vector W is basically dominated by simple multiplication and division operations of matrix, which makes the complexity low). Now, there remains the last adjustment to realize, that is, the scaling of the two parameters, f ac 1 and f ac 2 of (11). The regularization parameters f ac 1 and f ac 2 are introduced in order to establish a trade-off between a close fit to the matrix X and the intention of fulfilling the N Is remaining empty rows while using S. Through several simulations, we found that adjusting these parameters nicely improves the reconstruction performance, and the found values of f ac 1 and f ac 2 are independent of the size of the matrix (N and T) as well as the Gaussians' values composing the synthetic signal.
Let us focus again on the example shown in Figure 1. The dotted lines refer to the neighborhood relation between sensors. As we can see, the sensors {5, 8, 10, 11, 14} are each linked at least to a representative sensor. Thus, their data readings can be easily recovered through the spatial pre-interpolation method of stage 2. Whereas, the data readings of the sensors {2, 3, 4, 7, 15} are recovered thanks to the minimization (11) of stage 3.

Numerical Results
In this section, we first evaluate our proposed structured approach with the variation of the tuning parameter f ac 1 of the minimization (11) of stage 3, while fixing f ac 2 , in order to measure the data reconstruction error ratio with respect to the different simulated values of f ac 1 and choose the appropriate one that gives the lowest data recovery error. Secondly, we compare the performance of our proposed structured approach, with the fixed tuning parameter f ac 1 , to that of a benchmark scheme, which was designed basically on what was proposed in [6] and in line with our scenarios' requirements. Indeed, at the end of their work, Xie et al. considered, in [6], that there is a small number of empty rows in M, which is, for N = 196, 14 data rows were missing, namely 7% of N (i.e., 93% of N of representative sensors). As we have already stated at the beginning of this paper, treating an important number of missing rows has not been the main focus of their work. Thus, their proposed approach has not taken the existence of the isolated nodes in the network into account. In fact, they basically focused on the existence of successive missing or corrupted entries in the received data matrix M. However, to the best of our knowledge, this is the unique paper that has treated a similar case using MC, and with which we can compare our approach in the first part of this section. Subsequently, in the second part, we try to separately evaluate the benefits of each building block of the proposed approach, namely: • Involving all of the detected clusters equitably in the sampling process. • Selecting the representative sensor nodes using Algorithm 3. • Adding the minimization (11) to the reconstruction pattern.
Making use of the generated signal of the example of Section 4, we perform our structured approach over different scenarios to illustrate the impact of these aforementioned techniques on the interpolation accuracy of the data matrix. To measure the reconstruction error, we opted for the following metrics, where X and X represent, respectively, the initial raw data matrix and the reconstructed one: 1-N MAE tot : the Normalized Mean Absolute Error on all missing entries: 2-N MAE MC : the Normalized Mean Absolute Error on the partially missing entries, which correspond to the non-transmitted readings of the representative nodes: where Ω mc is the set of indexes of the partially missing entries, as found in the received data matrix M ∈ IR N×T . 3-N MAE ER : the Normalized Mean Absolute Error on the missing entries of the fully empty rows, which correspond to the inactive sensor nodes' readings: where Ω ER is the set of indexes of the (N − N rep ) empty rows, found in the received data matrix M ∈ IR N×T . 4-CR: the Compression Ratio: where Ω = {(i, t) | Ω M (i, t) = 1}. Hence, card(Ω) denotes the number of observed entries in M. We vary pct Nrep from 10 to 80, and, for each given pct Nrep , we vary pct m from 10 to 80, in order to assess the proposed approach under different CRs. It is obvious that the range of the values of CR depends on the value assigned to pct Nrep . The larger pct Nrep , the higher CR range can be used. Note that we are mainly interested in the small values of pct Nrep and pct m , since we are considering the high loss scenarios. Specifically, we consider that N = 50 sensor nodes are randomly distributed in a square observation area of size 100 m × 100 m, and we monitor the WSN during T = 100 time slots.
To begin, we measure the data reconstruction error ratio N MAE tot of our proposed structured approach with the variation of the regularization parameter f ac 1 . To do so, we fix f ac 2 to 1, then, we accordingly adjust f ac 1 , which vary from the value 1 to the value 10 −15 . Note that we have used K = 5 during all of the simulations of this paper. Figure 4 shows the effect of f ac 1 on the data recovery performance of our approach. For pct m = 20, we vary pct Nrep and for each case the N MAE tot is calculated with respect to f ac 1 . As we can note, the minimization (11) of stage 3 typically performs better for the value f ac 1 = 10 −5 than the other values. For that reason, we retain this value and use it in all of the next experiments.
In Section 7, the proposed minimization-based interpolation technique (11) of stage 3 has been investigated and then updated when compared to the one of our previous works ( [2] Equation (8)) and ([4] Equation (5)), as we have mentioned. Figure 5 illustrates a performance comparison in terms of N MAE tot between the two methods for different values of pct Nrep and with respect to the regularization parameter f ac 1 . As we can clearly notice through the simulations of Figure 5, for different values of f ac 1 , the data recovery performance is highly improved with the proposed minimization-based interpolation technique of this paper compared to the one shown in our original papers. In the third simulation, we implement a benchmark approach that is based on what was proposed in [6]. The sampling pattern of this approach consists in choosing the set N rep of representative sensor nodes in a purely random way, which is exactly the same as randomly selecting the empty rows. Likewise, for each time instant t, m nodes are uniformly selected from the set N rep to deliver their readings to the sink. Here, neither the selection of the representative sensors nor the selection of the transmitting ones takes the detected clusters into account. As for the reconstruction pattern, to obtain the final recovered data matrix X, this approach performs the MC, and then the spatial pre-interpolation. The temporal pre-interpolation was omitted, since we do not consider the existence of empty columns in the observed data matrix M (This is not the case with our scenario, since, at every t, we ensure the transmission of m readings sensed in different m locations). In Figure 6, we have measured the N MAE tot with respect to the variation of CR, namely pct m , for different values of pct Nrep . Our approach distinctly outperforms the benchmark one across the entire ranges of CR, as we can note from the plots. We are able to go up to 90% of missing rows (pct Nrep = 10) with an interesting reconstruction performance, N MAE tot of about 0.008, while the benchmark technique yields an N MAE tot of [0.47, 0.5].   For the convenience of comparison, we have implemented the N MAE MC and N MAE ER in order to separate the error ratios and demonstrate the recovery performance enhancement that has been achieved by our proposed approach on, respectively, the partially and fully missing readings. Note that the considered framework extremely reduces the overall network energy consumption, since we only use a small set of representative sensors for the data transmission. Furthermore, when compared to the benchmark approach, the proposed one can further improve the sensors lifetime. In fact, for a given N MAE tot target of 0.02 and pct Nrep = 60, we compute the energy consumption during the T time instants for the both compared approaches, depending on the number N of sensors. In this simulation, we consider that two nodes i and j can directly communicate with each other, without the need for relaying, only if the Euclidean distance dst i,j between them is within some transmission radius (r) that scales with Θ( logN/N) [21]. To route the data towards the sink node, we perform the shortest path tree that was computed by the Dijkstra algorithm [16]. The following model is used in order to compute the energy consumption during data transmission [45].
where E Tx (L, dst i,j ) and E Rx (L) represent, respectively, the amount of energy that is consumed by a specific node i, to deliver or receive an L-bit packet through a distance of length dst i,j . In (17), E elec is the energy that is required by the transceiver circuitry at the sender or the receiver and ε amp is the energy consumed by the transmitter's amplifier. Regarding the parameters setting, L = 120 bits [15], E elec = 50 nJ/bit and ε amp = 100 pJ/bit/m 2 [45]. Figure 9 illustrates the energy consumption for the proposed framework as well as for the benchmark one. Indeed, our approach requires far less sensor nodes' readings, consequently, much less energy consumption, in order to achieve the same reconstruction performance. Let us now focus on the benefits of the clusters selection. We show that taking the detected clusters during the representative nodes selection process as well as during the assignment of the sensing and transmitting schedule into account significantly ameliorates the data recovery performance. Thus, we compare our approach to another one, for which we proceed, regardless the existence of the different clusters. The set N rep of representative sensor nodes is selected according to (7) instead of (9), i.e., the spatial correlation criteria are present during the node selection process. Nevertheless, we do not have equitable representation of the different regions that compose the whole network. Withal, for each t, the m transmitting nodes are picked from the set N rep in a purely random way to sense then deliver their data readings, i.e., m = pct m % × N rep instead of (10). To recover the received data matrix, both algorithms apply the 3-stage reconstruction pattern of Section 7. Figure 10 illustrates the 3-D bar graph of the N MAE tot values with the variation of pct Nrep and pct m . This simulation shows how curiously interesting the clusters consideration is. The barres depict that our approach provides a considerable improvement in terms of N MAE tot when compared to the algorithm of comparison, especially in the high compression ratios, i.e., when the number of transmitting sensor nodes is very limited. Note that without enforcing the involvement of all the clusters in the data sensing and transmission process, sensor nodes that belong to the small clusters could be totally ignored, which gravely deteriorates the recovery process.
In Figures 11 and 12, we have measured, respectively, the N MAE MC and the N MAE ER with respect to the variation of CR, namely pct m , for different values of pct Nrep . Figures 11 and 12 highlight the effect of the introduced block on the recovery of, respectively, the representative nodes' and the inactive nodes' readings. Although both of the techniques apply the same MC resolution method, the N MAE MC of our approach is much lower than that of the benchmark. The N MAE ER also seems to be heavily affected, despite the fact that the clusters consideration, at the base, only targets the first stage of the reconstruction pattern, which is the MC resolution. For example, with (pct Nrep = 20, pct m = 10), (pct Nrep = 40, pct m = 10), and (pct Nrep = 60, pct m = 10), we can reach an improvement respectively of 93.88%, 87.87%, and 79.38%, when we enforce the involvement of all the clusters in the data sensing and transmission.  The next scenario aims to prove the importance of neatly selecting the N rep representative nodes. Making use of the spatial correlation in the selection process, as detailed in Algorithm 3, these nodes are selected under the criterion of having the best representation of the whole network. We compare our algorithm to another one that selects its representative nodes randomly in order to investigate the efficiency of the proposed selection process. However, in order to be comparable, this one takes the existing clusters when selecting its representative nodes into account. Hence, the set N rep of representative nodes consists of the combination of J subsets, (N rep j ) j=1,. . . ,J , where N rep j includes N rep j representative nodes selected randomly from cluster CL j while using the same shared percentage pct Nrep , where N rep = ∑ J j=1 N rep j and N rep j = pct Nrep % × cl j . Both of algorithms design their sensing and transmitting schedules, Ω M ∈ IR N×T , based on their selected sets N rep of representative nodes, as described in Section 6 and according to (10). To recover the received data matrix, both of the performed algorithms apply the three-stage reconstruction pattern of Section 7. Figures 13 and 14 depict the results of this simulation. Figure 13 illustrates the N MAE tot . As we can see, when compared to the random selection process, the selection scheme of Algorithm 3 provides a considerable improvement in terms of N MAE tot for the high CRs. The gap between the two curves decreases as we increase the number N rep of representative nodes, namely pct Nrep , since we decrease the probability of choosing different sets N rep .
Let us focus on Figure 14, which highlights the N MAE ER to reveal the impact of our selection process on the reconstruction performance of the empty rows. Expectedly, we find that the N MAE ER is sensitive to the used selection method, which confirms the aforementioned hypothesis. That is, in order to guarantee an accurate reconstruction for the inactive nodes missing data, great care must be taken when selecting the set N rep .   The last simulation highlights the benefit of the 3 rd stage of the proposed reconstruction pattern. We compare our algorithm to the one that only uses the first two stages of Section 7 to obtain its final recovered data matrix X. Following the same logic of the previous experiences, in order to be comparable, we use the sampling pattern of Section 6 with both of the simulated algorithms, which yields the same set N rep of representative nodes and, consequently, the same set of inactive nodes. Noticeably, we can detect a considerable gap in terms of N MAE tot between the barres of Figure 15. This difference for all of the pct Nrep values comes from the non-reconstructed readings of the N Is isolated nodes with the algorithm of comparison. Because we simulated the same network with the same sensor nodes neighboring, the set of the N Is isolated nodes is the same for both of the compared algorithms. Figure 16, which depicts the N MAE ER for both approaches, illustrates that we can reduce the reconstruction error of the empty rows up to 96.89% for (pct Nrep = 10, pct m = 40), 96.08% for (pct Nrep = 20, pct m = 40), 93.65% for (pct Nrep = 30, pct m = 40) and 90.2% for (pct Nrep = 40, pct m = 40), when we apply the minimization (11). These results show that the number of isolated nodes is important for a small pct Nrep . Hence, adding a third interpolation technique, as our proposed minimization (11), becomes heavily needed. Otherwise, we end with a data matrix, which is almost half built, even less.

Conclusions
In this paper, we have investigated an interesting challenge in the dense WSNs. In fact, we have proposed letting a significant number of sensor nodes remain idle. Subsequently, relying on a novel MC-based reconstruction framework, we recover their readings based on the received ones. The strength of our approach lies in its integration or inclusivity for both the compression and reconstruction patterns. For the sampling part, by making use of the inter-spatial correlation feature, we have presented a strategy that neatly selects a restricted number of representative sensor nodes under the criterion of having the best representation of the whole network. Subsequently, for each cluster, we schedule where and when to sense the field. As for the reconstruction part, by taking advantage of the readings similarities in WSNs, we propose an optimization technique that is annexed to the MC resolution. This method, which is positioned in the third stage of the recovery operation, guarantees the reconstruction of all the empty rows corresponding to the omitted sensor nodes. Altogether, these techniques succeed in handling the aforementioned high loss scenario. We have obtained satisfactory results proving the efficiency and the robustness of the proposed techniques as well as the whole unified approach. The results, which were obtained with the multi-Gaussian generated signal, outperform all of the state of the art techniques. They revealed that we are able to go up to 90% of missing rows (i.e. only 10% of N of representative sensor nodes), while we still achieve an interesting reconstruction performance by giving a N MAE tot of about 0.008 when compared to the benchmark one, which is still within the range of [0.47, 0.5].