Elsevier

Information Sciences

Volume 329, 1 February 2016, Pages 800-818

Data prediction, compression, and recovery in clustered wireless sensor networks for environmental monitoring applications

https://doi.org/10.1016/j.ins.2015.10.004

Abstract

Environmental monitoring is one of the most important applications of wireless sensor networks (WSNs) and usually requires a network lifetime of several months or even years. However, the limited battery energy of sensor nodes makes a satisfactory network lifetime extremely difficult to achieve, which becomes a bottleneck in scaling such applications. In this paper, we propose a novel framework that combines data prediction, compression, and recovery to achieve both accuracy and efficiency of data processing in clustered WSNs. The main aim of the framework is to reduce the communication cost while guaranteeing the accuracy of data processing and data prediction. In this framework, data prediction is achieved by implementing the Least Mean Square (LMS) dual prediction algorithm with an optimal step size obtained by minimizing the mean-square deviation (MSD), so that the cluster heads (CHs) can obtain a good approximation of the real data from the sensor nodes. On this basis, a centralized Principal Component Analysis (PCA) technique is used to compress the predicted data on the CHs and to recover it at the sink, which saves communication cost and eliminates the spatial redundancy of the sensed environmental data. All errors generated in these processes are evaluated theoretically and shown to be controllable. Based on the theoretical analysis, we design a number of algorithms for implementation. Simulation results using real-world data demonstrate that our framework provides a cost-effective solution for applications such as environmental monitoring in cluster-based WSNs.

Introduction

Recent climate change and natural disasters around the world have highlighted the importance of environmental monitoring, which is rapidly developing as a major application of wireless sensor networks (WSNs) [1], [2], [3], [4]. For instance, the sensor nodes of a WSN can be deployed in a harsh environment to periodically measure meteorological and hydrological parameters of their surroundings, such as light, temperature, humidity, and wind speed and direction. By using advanced wireless communication and sensor technology, WSNs have advantages over traditional networks in aspects such as the ability to withstand harsh conditions, clustering for scalability, and self-organization [5], [6], [7].

Although WSNs provide several benefits in the field of environmental monitoring, energy conservation must be taken into account in almost all application areas. The main reason is that sensors in such environments cannot be recharged or replaced, which means that their energy supply is limited. Therefore, energy saving has become one of the major design concerns in WSNs, and a number of energy-efficient schemes have been proposed to save energy in different parts of the system [8], [9], [10], [11], [12], [13]. A sensor node typically contains sensing, computing, and wireless communication modules, of which the communication module consumes the most energy [5], [14]. Moreover, in continuous monitoring, most of the data change slowly, which results in a large amount of spatial or temporal redundancy, so frequent communication between sensor nodes wastes limited energy. Basically, the increase in network lifetime is proportional to the reduction in the number of transmitted data packets. Following this principle, data reduction has become one of the main solutions for reducing the amount of data transmission [15], [16], [17], [18].

The most efficient way to achieve data reduction in a WSN is data prediction, which uses predicted values instead of the real ones and thereby avoids data transmission. In a real-world scenario, it is often unnecessary, and costly, to obtain precise measurements for every sample period. Data prediction techniques focus on minimizing the number of measurements transmitted from the sensor nodes during continuous monitoring. However, one key concern is to keep the prediction error within a user-given error bound. In periodic sensing applications, especially environmental monitoring, consecutive observations of a sensor node are temporally correlated to a certain degree. In our prediction model, this temporal correlation is exploited to predict the data for the monitoring application based on the user-defined error tolerance. The result of this correlation-based approach is a dual prediction protocol that markedly reduces the frequency of data transmissions while guaranteeing the prediction accuracy.
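The dual prediction idea can be illustrated with a minimal sketch. It assumes that the node and its receiver run identical filters, so a single simulated filter state stands for both sides. The function name `lms_dual_prediction`, the parameters `order`, `mu`, and `e_max`, and the normalized LMS update with a fixed step size are illustrative assumptions; the optimal step size derived in the paper is not reproduced here.

```python
def lms_dual_prediction(readings, order=4, mu=0.5, e_max=0.5):
    """Simulate a node and its receiver running the same LMS filter.

    Hypothetical sketch: a reading is transmitted only when the shared
    prediction misses it by more than e_max. Uses a normalized LMS update
    with a fixed step size mu (an assumption, not the paper's optimal one).
    Returns the series seen by the receiver and the number of transmissions.
    """
    weights = [0.0] * order
    history = list(readings[:order])        # the first readings are always sent
    reconstructed = list(readings[:order])
    sent = len(history)
    for x in readings[order:]:
        window = history[-order:][::-1]     # most recent value first
        prediction = sum(w * h for w, h in zip(weights, window))
        error = x - prediction
        if abs(error) > e_max:
            # Prediction too far off: the node transmits x and both sides
            # apply the same weight update, so they stay synchronized.
            norm = sum(h * h for h in window) + 1e-12
            weights = [w + mu * error * h / norm
                       for w, h in zip(weights, window)]
            value = x
            sent += 1
        else:
            # Prediction within tolerance: nothing is transmitted and both
            # sides feed the predicted value back into their history.
            value = prediction
        history.append(value)
        reconstructed.append(value)
    return reconstructed, sent
```

In this sketch every reconstructed value differs from the true reading by at most `e_max`, because a reading is either transmitted exactly or replaced by a prediction that was already checked against the bound.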

An alternative approach to data reduction is to use compression techniques [19], [20], which reduce the amount of transmitted data by reducing its size. In general, data compression schemes fall into two categories: lossless and lossy. Lossless compression requires that the original data be perfectly reconstructed from the compressed data. By contrast, lossy compression allows some features of the original data to be lost after decompression. For highly resource-constrained WSNs, lossless algorithms are usually not necessary, even though they perform better on data recoverability. In other words, lossy compression is better able to reduce the amount of data sent over the WSN. For lossy compression, the compression ratio and the reconstruction error are the important criteria for judging the quality of a compression algorithm. Our work, which uses the Principal Component Analysis (PCA) method to compress the original data, is shown to obtain satisfactory results on both criteria. More importantly, the error generated by the PCA compression is negligible compared to the prediction error, which keeps the total error within the user’s acceptable bound.
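As a rough illustration of lossy compression by PCA, the sketch below keeps only the first principal component of a block of readings, found by power iteration. The function names, the choice of a single component, and the pure-Python power iteration are assumptions made for this example; the paper's centralized PCA algorithm and its component selection are not reproduced here.

```python
import math

def first_principal_component(rows, iterations=200):
    """Estimate the mean and leading principal direction of the samples.

    rows: list of M samples, each a list of P node readings (hypothetical
    layout). Power iteration assumes the uniform start vector is not
    orthogonal to the leading eigenvector.
    """
    m, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / m for j in range(p)]
    centered = [[r[j] - mean[j] for j in range(p)] for r in rows]
    # P x P sample covariance matrix of the centered readings.
    cov = [[sum(c[a] * c[b] for c in centered) / m for b in range(p)]
           for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break                      # all readings identical: no direction
        v = [x / norm for x in w]
    return mean, v

def compress(rows, mean, v):
    """Project each centered sample onto v: one score per sample."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

def recover(scores, mean, v):
    """Rebuild approximate samples from the scores, the mean, and v."""
    return [[mean[j] + s * v[j] for j in range(len(v))] for s in scores]
```

With M samples from P nodes, this sketch sends M scores plus the mean and direction vectors (2P values) instead of M × P readings. The reconstruction is exact only when the centered readings lie along a single direction; otherwise the discarded components appear as reconstruction error.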

To obtain an energy-efficient scheme for continuous environmental monitoring, we develop in this paper a novel framework that carefully combines data prediction, compression, and recovery in cluster-based WSNs. The main idea of the framework is to reduce the communication cost through data prediction and compression while guaranteeing accuracy. First, the sensor nodes that collect environmental parameters are grouped into multiple clusters based on their physical locations. At the same time, a dual prediction mechanism using the LMS prediction algorithm with an optimal step size is implemented at the sensor nodes and their respective CHs, which not only improves the prediction accuracy but also achieves faster convergence during the initial stage of the algorithm. Then, after each sampling period, the CHs extract the principal components of the collected data using PCA, so that redundant data are removed. Finally, the data are recovered at the base station (BS). Throughout the entire process, all errors are controllable and kept within the tolerable bound. After data reduction, the size of the data recovered at the BS equals that of the raw sensory data collected by all nodes, which helps the BS gain a more in-depth understanding of the environmental parameters. The simulation results also demonstrate that the combination of the LMS prediction algorithm and the PCA technique is energy-efficient for environmental monitoring applications in cluster-based WSNs.

The remainder of the paper is organized as follows. In Section 2, we discuss related work on data reduction techniques in WSNs. In Section 3, we describe the dual prediction mechanism between the sensor nodes and the CH, and present an optimal step size LMS (OSSLMS) prediction algorithm. Section 4 proposes an energy-efficient data compression approach based on PCA, which removes data redundancy. In Section 5, we evaluate the communication cost and analyze the mean square error of data prediction and compression. Simulation results are provided in Section 6 to validate the efficacy and efficiency of the proposed approach. Finally, Section 7 presents the conclusions and future work.

Section snippets

Related work

Many models have been proposed to perform data prediction in WSNs. The autoregressive (AR) model uses a linear regression function embedded in the sink to estimate future sensor readings. By regularly collecting local measurements, the sensor node computes the coefficients of the linear regression from past real values. These coefficients are then delivered to the sink, which uses them to perform time series forecasting.
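The division of work in the AR approach can be sketched as follows, assuming a first-order model with an intercept for simplicity. The function names `fit_ar1` and `forecast` are hypothetical, and the models in the cited work may use higher orders and different fitting procedures.

```python
def fit_ar1(series):
    """Node side: fit x[t] = intercept + slope * x[t-1] by least squares."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return mean_y, 0.0             # constant series: predict its level
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return mean_y - slope * mean_x, slope

def forecast(last_value, intercept, slope, steps):
    """Sink side: iterate the received model to estimate future readings."""
    estimates = []
    value = last_value
    for _ in range(steps):
        value = intercept + slope * value
        estimates.append(value)
    return estimates
```

In this sketch the node transmits only the two coefficients and its latest reading, and the sink produces the intermediate estimates itself until the node sends an updated model.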

Within the context of AR model, the paper [21] proposed a

Network model

Because sensor nodes have limited battery capacity, the design of energy-aware network architectures has been an important research issue in WSNs. Grouping sensor nodes into clusters has been widely applied in WSNs to achieve energy efficiency and a long network lifetime. A cluster-based structure has several advantages over other network models, including scalability, ease of data fusion, and robustness.

Fig. 1 depicts a typical two-tier hierarchy in a WSN, in which

Principal component analysis for data compression and recovery

Simulation results in the following section show that OSSLMS can achieve high performance in data reduction for a simple single-layer star WSN. However, the major obstacle to implementing the OSSLMS prediction algorithm on a clustered network is that the CHs do not have the actual measurements of the sensor nodes, only the predicted values, which means that the prediction error cannot be calculated. This in turn causes the prediction mechanism between the CHs and the base station to fail.

If CHs

Communication cost

In this paper, reducing the communication cost means saving energy while guaranteeing the user-defined data accuracy. In this context, energy consumption is proportional to communication cost. First, we analyze the communication cost of data prediction. As shown in Fig. 1, the OSSLMS scheme is deployed in the first stage of the clustered sensor network. Consider a set of P sensor nodes forming a cluster. Let $X = [x_1, x_2, \ldots, x_M]^T$ represent an M × P data matrix where $x_i$ is a row vector

Data set

To evaluate the proposed algorithms, we perform simulations based on the Intel Lab Data collected by the Intel Berkeley Research Lab during a one-month period. From this publicly available sensor data set, we selected a set of temperature readings sampled every 31 seconds from 54 Mica2Dot sensors deployed in the lab. Since our scheme is specific to clustered sensor networks, we carried out the experiment on two clusters, in which two sets of neighboring nodes (ID: 38–45

Conclusions and future work

In this paper, we propose a new framework for processing environmental data in a clustered WSN, which uses three data analysis techniques: prediction, compression, and recovery. In our framework, an optimal step size for the LMS algorithm is obtained to perform data prediction both at the node and at the cluster head; the cluster head then applies centralized PCA for data compression, and the base station finally recovers the original data within the error tolerance. We have proposed an

Acknowledgments

The work described in this paper was supported by a grant from the National Natural Science Foundation of China (No. 61370107).

References (53)

  • F. Marcelloni et al. Enabling energy-efficient and lossy-aware data compression in wireless sensor networks by multi-objective evolutionary optimization. Inf. Sci. (2010)
  • H. Zhou et al. A novel stable selection and reliable transmission protocol for clustered heterogeneous wireless sensor networks. Comput. Commun. (2010)
  • S. Zhao et al. Variable step-size LMS algorithm with a quotient form. Signal Process. (2009)
  • H.C. Woo. Variable step size LMS algorithm using squared error and autocorrelation of error. Proc. Eng. (2012)
  • R. Szewczyk et al. Habitat monitoring with sensor networks. Commun. ACM (2004)
  • C. Alippi et al. A robust, adaptive, solar-powered WSN framework for aquatic environmental monitoring. IEEE Sens. J. (2011)
  • H. Liu et al. A wireless sensor network prototype for environmental monitoring in greenhouses. Proceedings of International Conference on Wireless Communications, Networking and Mobile Computing (WiCom) (2007)
  • F. Ingelrest et al. Sensorscope: application-specific sensor network for environmental monitoring. ACM Trans. Sens. Netw. (2010)
  • I.F. Akyildiz et al. Wireless Sensor Networks (2010)
  • W. Dargie et al. Fundamentals of Wireless Sensor Networks: Theory and Practice (2010)
  • Y. Yun et al. Maximizing the lifetime of wireless sensor networks with mobile sink in delay-tolerant applications. IEEE Trans. Mob. Comput. (2010)
  • J. Polastre et al. Telos: enabling ultra-low power wireless research. Proceedings of the Fourth International Symposium on Information Processing in Sensor Networks (IPSN) (2005)
  • J. Zheng et al. Distributed data aggregation using Slepian–Wolf coding in cluster-based wireless sensor networks. IEEE Trans. Veh. Technol. (2010)
  • H. Jiang et al. Prediction or not? An energy-efficient framework for clustering-based data collection in wireless sensor networks. IEEE Trans. Parallel Distrib. Syst. (2011)
  • C. Caione et al. Distributed compressive sampling for lifetime optimization in dense wireless sensor networks. IEEE Trans. Ind. Inf. (2012)
  • Y. Quer et al. Sensing, compression and recovery for wireless sensor networks: monitoring framework design. IEEE Trans. Wirel. Commun. (2012)