Data prediction, compression, and recovery in clustered wireless sensor networks for environmental monitoring applications
Introduction
Recent climate change and natural disasters around the world have underscored the importance of environmental monitoring, which is rapidly developing as a major application of wireless sensor networks (WSNs) [1], [2], [3], [4]. For instance, the sensor nodes of a WSN can be deployed in a harsh environment to periodically measure meteorological and hydrological parameters of their surroundings, such as light, temperature, humidity, and wind speed and direction. By drawing on advanced wireless communication and sensor technology, WSNs offer advantages over traditional networks in aspects such as robustness, clustering for scalability, and self-organization [5], [6], [7].
Although WSNs provide several benefits in the field of environmental monitoring, energy conservation must be taken into account in almost all application areas. The main reason is that sensors in such environments cannot be recharged or replaced, so the available energy is strictly limited. Energy saving has therefore become one of the major design concerns in WSNs, and a number of energy-efficient schemes have been proposed that address it from different aspects [8], [9], [10], [11], [12], [13]. A sensor node is typically equipped with sensing, computing, and wireless communication modules, of which the communication module consumes the most energy [5], [14]. Moreover, in continuous monitoring most data change slowly, which results in a large amount of spatial or temporal redundancy, so frequent communication between sensor nodes wastes limited energy. Basically, the increase in network lifetime is proportional to the reduction in the number of transmitted data packets. Following this principle, data reduction, which aims to reduce the amount of data transmitted, has become one of the most actively developed solutions [15], [16], [17], [18].
One of the most efficient ways to achieve data reduction in a WSN is data prediction, which uses predicted values in place of the real ones and thereby avoids transmitting the data. In real-world scenarios, obtaining a precise measurement for every sampling period is often unnecessary and costly. Data prediction techniques focus on minimizing the number of measurements transmitted from the sensor nodes during continuous monitoring. A key concern, however, is to keep the prediction error within a user-given error bound. In periodic sensing applications, especially environmental monitoring, consecutive observations of a sensor node are temporally correlated to a certain degree. Our prediction model exploits this temporal correlation to predict data for the monitoring application subject to the user-defined error tolerance. The result of this correlation-based approach is a dual prediction protocol that markedly reduces the frequency of data transmissions while guaranteeing the prediction accuracy.
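The dual prediction idea can be illustrated with a minimal Python sketch. It is not the paper's OSSLMS algorithm: it assumes a standard normalized LMS filter in place of the optimal step size, and the names `NLMSPredictor`, `run_dual_prediction`, `order`, `mu`, and `e_max` are hypothetical. The node and the receiver run identical predictors, and a reading is transmitted only when the prediction misses it by more than the error bound.

```python
class NLMSPredictor:
    """Normalized LMS one-step predictor (an assumption, not the paper's OSSLMS)."""

    def __init__(self, order=4, mu=0.5, eps=1e-6):
        self.mu = mu
        self.eps = eps
        self.weights = [0.0] * order
        self.history = [0.0] * order  # oldest value first, newest last

    def predict(self):
        return sum(w * x for w, x in zip(self.weights, self.history))

    def update(self, value):
        """A real reading is available: adapt the weights, then store the reading."""
        error = value - self.predict()
        norm = self.eps + sum(x * x for x in self.history)
        self.weights = [w + self.mu * error * x / norm
                        for w, x in zip(self.weights, self.history)]
        self.history = self.history[1:] + [value]

    def push_prediction(self):
        """No transmission: store the predicted value and keep the weights frozen."""
        self.history = self.history[1:] + [self.predict()]


def run_dual_prediction(readings, e_max, order=4, mu=0.5):
    """Return the receiver-side series and the number of transmitted readings."""
    node = NLMSPredictor(order, mu)
    sink = NLMSPredictor(order, mu)  # identical copy kept by the receiver
    reconstructed, sent = [], 0
    for reading in readings:
        if abs(reading - node.predict()) <= e_max:
            # Prediction is accurate enough: both sides advance silently.
            node.push_prediction()
            sink.push_prediction()
            reconstructed.append(sink.history[-1])
        else:
            # Prediction missed the bound: transmit and update both filters.
            sent += 1
            node.update(reading)
            sink.update(reading)
            reconstructed.append(reading)
    return reconstructed, sent


if __name__ == "__main__":
    import math
    temps = [20 + 0.5 * math.sin(i / 50) for i in range(500)]  # synthetic data
    series, sent = run_dual_prediction(temps, e_max=0.1)
    print(f"transmitted {sent} of {len(temps)} readings")
```

Because both predictors see exactly the same inputs, the receiver's value never deviates from the true reading by more than `e_max`, which is the guarantee the text refers to.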
An alternative approach to data reduction is to use compression techniques [19], [20], which reduce the amount of transmitted data by reducing its size. In general, data compression schemes fall into two categories: lossless and lossy. Lossless compression requires that the original data be perfectly reconstructed from the compressed data. By contrast, lossy compression allows some features of the original data to be lost after decompression. For highly resource-constrained WSNs, lossless algorithms are usually not necessary, even though they recover the data exactly. Put another way, lossy compression is better able to reduce the amount of data sent over the WSN. For lossy compression, the amount of compression achieved and the reconstruction error are the important criteria for judging the quality of an algorithm. Our work uses Principal Component Analysis (PCA) to compress the original data and is shown to obtain satisfactory results on both criteria. More importantly, the error introduced by PCA compression is negligible compared with the prediction error, which keeps the total error within the user's acceptable bound.
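As a rough illustration of lossy compression by PCA, the following Python sketch compresses an M × P matrix and reconstructs it. It assumes that rows are sampling instants and columns are sensor nodes, and that the compressed payload consists of the scores, the retained components, and the column means; the function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_compress(data, k):
    """Project an M x P matrix onto its top-k principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                # shape (k, P)
    scores = centered @ components.T   # shape (M, k)
    return scores, components, mean


def pca_recover(scores, components, mean):
    """Reconstruct an approximation of the original M x P matrix."""
    return scores @ components + mean


def transmitted_values(m, p, k):
    """Count of numbers sent under the assumed payload: scores, components, mean."""
    return m * k + k * p + p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, p, k = 200, 8, 1
    # Synthetic, strongly correlated readings: one shared trend plus node offsets.
    trend = 20 + np.sin(np.linspace(0, 6, m))
    data = trend[:, None] + rng.normal(0, 1, (1, p)) + 0.01 * rng.normal(size=(m, p))
    recovered = pca_recover(*pca_compress(data, k))
    print("values sent:", transmitted_values(m, p, k), "of", m * p)
    print("mean square error:", float(np.mean((recovered - data) ** 2)))
```

The two printed quantities correspond to the two criteria named above: how much the data shrink and how large the reconstruction error is.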
To obtain an energy-efficient scheme for continuous environmental monitoring, in this paper we develop a novel framework that carefully combines data prediction, compression, and recovery in cluster-based WSNs. The main idea of the framework is to reduce the communication cost through data prediction and compression while guaranteeing accuracy. First, the sensor nodes collecting environmental parameters are grouped into multiple clusters based on their physical locations. At the same time, a dual prediction mechanism using the least mean squares (LMS) prediction algorithm with an optimal step size is implemented at the sensor nodes and their respective cluster heads (CHs), which not only improves the prediction accuracy but also achieves faster convergence during the initial stage of the algorithm. Then, after a sampling period, the CHs extract the principal components of the collected data by PCA, so that redundant data are not transmitted. Finally, the data are recovered at the base station (BS). Throughout the entire process, all errors are controllable and kept within the tolerable bound. After data reduction, the size of the data recovered at the BS equals that of the raw sensory data collected by all nodes, which helps the BS gain a more in-depth understanding of the environmental parameters. The simulation results also demonstrate that the combination of the LMS prediction algorithm and the PCA technique is energy-efficient for environmental monitoring applications in cluster-based WSNs.
The remainder of the paper is organized as follows. Section 2 discusses related work on data reduction techniques in WSNs. Section 3 describes the dual prediction mechanism between the sensor nodes and the CH, in which an optimal step size LMS (OSSLMS) prediction algorithm is presented. Section 4 proposes an energy-efficient data compression approach based on PCA that removes data redundancy. Section 5 evaluates the communication cost and analyzes the mean square error of data prediction and compression. Simulation results are provided in Section 6 to validate the efficacy and efficiency of the proposed approach. Finally, Section 7 presents the conclusions and future work.
Section snippets
Related work
Many models have been proposed to perform data prediction in WSNs. The autoregressive (AR) model uses a linear regression function embedded in the sink to estimate future sensor readings. By regularly collecting local measurements, the sensor node can compute the coefficients of the linear regression from past real values. These coefficients are then delivered to the sink to perform time series forecasting.
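The AR scheme described above can be sketched in Python as follows. This is a generic least-squares AR fit, not the method of any cited paper; it assumes an AR model without an intercept term (so the series is taken to be mean-removed), and the names `fit_ar` and `forecast` are hypothetical. The node would run `fit_ar` on its past readings and send only the coefficients; the sink would run `forecast`.

```python
import numpy as np


def fit_ar(values, order):
    """Least-squares fit of x[t] ~ a[0]*x[t-1] + ... + a[order-1]*x[t-order]."""
    values = np.asarray(values, dtype=float)
    # Each design row holds the `order` values preceding x[t], newest first.
    design = np.array([values[t - order:t][::-1]
                       for t in range(order, len(values))])
    target = values[order:]
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs


def forecast(coeffs, recent, steps):
    """Iterate the AR recursion, starting from `recent` (oldest value first)."""
    window = list(recent[-len(coeffs):])
    predictions = []
    for _ in range(steps):
        nxt = float(np.dot(coeffs, window[::-1]))  # newest value pairs with a[0]
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return predictions


if __name__ == "__main__":
    # Synthetic series generated by a known AR(2) recursion.
    series = [1.0, 1.2]
    for _ in range(28):
        series.append(1.5 * series[-1] - 0.7 * series[-2])
    coeffs = fit_ar(series[:25], order=2)   # computed at the sensor node
    print("coefficients sent to the sink:", coeffs)
    print("sink forecast:", forecast(coeffs, series[:25], steps=5))
```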
Within the context of AR model, the paper [21] proposed a
Network model
Because sensor nodes have limited battery capacity, the design of energy-aware network architectures has been an important research issue in WSNs. Grouping sensor nodes into clusters has been widely applied in WSNs to achieve energy efficiency and a long network lifetime. A cluster-based structure is considered to have more advantages than other network models, including scalability, ease of data fusion, and robustness.
Fig. 1 depicts a typical two-tier hierarchy in a WSN, in which
Principal component analysis for data compression and recovery
Simulation results in the following section show that OSSLMS achieves high performance in data reduction for a simple (single-layer) star WSN. However, the major obstacle to implementing the OSSLMS prediction algorithm on a clustered network is that the CHs have only the predicted values, not the actual measurements of the sensor nodes, which means that the prediction error cannot be calculated. This in turn causes the prediction mechanism between the CHs and the base station to fail.
If CHs
Communication cost
In this paper, reducing the communication cost means saving energy while guaranteeing the user-defined data accuracy. Energy consumption is taken here to be proportional to the communication cost. First, we analyze the communication cost in data prediction. As shown in Fig. 1, the OSSLMS scheme is deployed in the first stage of the clustered sensor network. Consider a set of P sensor nodes forming a cluster. Let represents a M × P data matrix where xi is a row vector
Data set
To evaluate the proposed algorithms, we perform simulations based on the Intel Lab Data, collected by the Intel Berkeley Research Lab over a one-month period. From this publicly available sensor data set, we selected a set of temperature readings sampled every 31 seconds by the 54 Mica2Dot sensors deployed in the lab. Since our scheme is specific to clustered sensor networks, we carried out the experiment on two clusters, in which two sets of neighboring nodes (ID: 38–45
Conclusions and future work
In this paper, we propose a new framework for processing environmental data in a clustered WSN, which utilizes three data analysis techniques: prediction, compression, and recovery. In our framework, an optimal step size for the LMS algorithm is obtained to perform data prediction at both the node and the cluster head; the cluster head then applies centralized PCA for data compression, and the base station finally recovers the original data within the error tolerance. We have proposed an
Acknowledgments
The work described in this paper was supported by a grant from the National Natural Science Foundation of China (No. 61370107).
References (53)
- et al. Wireless sensor networks for healthcare: a survey. Comput. Netw. (2010)
- et al. Extending the lifetime of wireless sensor networks: a hybrid routing algorithm. Comput. Commun. (2012)
- et al. Deployment guidelines for achieving maximum lifetime and avoiding energy holes in sensor network. Inf. Sci. (2013)
- et al. Communication/computation tradeoffs for prolonging network lifetime in wireless sensor networks: the case of digital signatures. Inf. Sci. (2012)
- et al. An energy-efficient data gathering algorithm to prolong lifetime of wireless sensor networks. Comput. Commun. (2010)
- et al. Geographic energy-aware non-interfering multipath routing for multimedia transmission in wireless sensor networks. Inf. Sci. (2013)
- et al. Aggregation in sensor networks with a user-provided quality of service goal. Inf. Sci. (2008)
- et al. Energy-efficient and high-accuracy secure data aggregation in wireless sensor networks. Comput. Commun. (2011)
- et al. Adaptive model selection for time series prediction in wireless sensor networks. Signal Process. (2007)
- et al. Prediction-based data aggregation in wireless sensor networks: combining grey model and Kalman filter. Comput. Commun. (2011)