A Novel Association Rule-Based Data Mining Approach for Internet of Things Based Wireless Sensor Networks

Wireless Sensor Networks (WSNs) are among the fundamental technologies of the Internet of Things (IoT), deployed in diverse applications to carry out precise real-time observations. The limited resources of a WSN, combined with the massive volume of fast-flowing IoT data, make data aggregation and analytics challenging. Recently, data mining-based solutions have been proposed to effectively handle the data generated by the sensors and to analyze its patterns for deducing the required information. This growing need motivated us to propose a distributed and efficient data mining technique that not only handles the massive, rapidly generated sensor data, but also increases the life span of the network. In this paper, we propose a novel scheme for IoT-based WSNs that mines the sensor data using association rules without moving it to any Cluster Head (CH) or Base Station (BS). The proposed scheme enables sensors to perform computations locally, so that only minimal higher-level statistical summaries of the data at Cluster Members (CMs) are exchanged with their CH. The scheme is evaluated via extensive simulations, and the results demonstrate that integrating it into existing protocols significantly reduces the communication overhead, which ultimately prolongs the network lifetime and stability.


I. INTRODUCTION
With the advent of advanced communication technologies and sophisticated protocols, new paradigms have emerged on the technological horizon. The most prevalent is the Internet of Things (IoT), which has attained extensive popularity and acceptance due to its broad range of applications in our daily life. One of the main functions and integral parts of the IoT is environment sensing using sensors that are deployed standalone or embedded in other objects such as smartphones, cars, and buildings. For effective decision making, data is gathered continuously, on a large scale, from a domain of interest such as battlefields, natural disasters, health monitoring, coal mines, agriculture, and weather. For such large-scale scenarios, sensor nodes are interconnected wirelessly to form a self-organized network, called a Wireless Sensor Network (WSN), which is considered the main source of the huge volume of generated data. According to the International Data Corporation (IDC) report, IoT devices will reach a threshold of 6 billion, while the data communicated will rise to 175 Zettabytes over 2018-2025, giving a 30 percent boost to real-time data by 2025 [1]. (The associate editor coordinating the review of this manuscript and approving it for publication was Shaohua Wan.)
One of the key data generation sources for the IoT is WSNs, in which sensors gather huge volumes of data across diverse applications such as remote health monitoring, weather, and surveillance [2]-[4]. These applications are generally critical and require real-time analysis for effective decision making and reliable network operation. By integrating the WSN with the IoT and the cloud (as depicted in Fig. 1), an almost unlimited storage and computation facility can be provided to such real-time applications for efficient data analytics using Artificial Intelligence and Machine Learning based algorithms. However, the intra-WSN data traffic should be minimized; otherwise, the availability of the IoT and the cloud may not be properly utilized due to the limited lifetime of the WSN, as we will discuss later in this article. The gateway node is responsible for bridging the WSN and the IoT. To manage the data in WSNs efficiently, various authors have advocated database-oriented methods for efficient data handling, treating the WSN as a distributed database. In this context, the sensor nodes, being data sources, act as database relations stored at each sensor, constituting a network-level distributed database [5], [6]. Distributed database management in WSNs provides a two-fold advantage: first, the cost, in terms of energy, of data collection and analysis is minimized; second, SQL-like abstractions can be implemented on WSNs, which simplifies data collection and query processing [7], [8].
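As a rough illustration of such an SQL-like abstraction, the sketch below treats one node's readings as a local relation and answers a count query over it; the schema (temp, humidity) and the threshold are invented for the example.

```python
# Illustrative sketch: a sensor node's local readings treated as a
# database relation, queried through a minimal SQL-like abstraction.
# Field names (temp, humidity) and the threshold are hypothetical.

def select_count(relation, predicate):
    """SQL-like 'SELECT COUNT(*) WHERE predicate' over a local relation."""
    return sum(1 for row in relation if predicate(row))

readings = [  # one node's local relation: rows of attribute -> value
    {"temp": 31, "humidity": 40},
    {"temp": 28, "humidity": 55},
    {"temp": 33, "humidity": 38},
]

hot = select_count(readings, lambda r: r["temp"] > 30)
print(hot)  # -> 2
```

In a real deployment the predicate would arrive as a query from the CH or BS, and only the count, not the rows, would travel back over the radio.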
Recently, the data being relations of a database, data mining-based approaches have been used in the IoT domain to efficiently and effectively analyze the huge-volume, high-speed data generated by WSNs for intelligent decision making. However, mining the data from resource-constrained sensors is considered a challenging task; hence, new AI-based techniques are emerging to devise effective analytics on the huge quantity of data generated by WSNs for the IoT infrastructure. Basic examples of the techniques used for data analytics and for producing intelligent applications in the IoT domain are classification, clustering, association analysis, time series analysis, and outlier analysis. Among these, association analysis is concerned with determining interesting patterns in a huge set of data items and finding all the co-occurrence relationships in a dataset, called associations. Association rule mining techniques for WSNs can be classified, based on the data processing location, into centralized and distributed (in-network) methods. In a centralized method, the data from the entire network is stored at a central site for further analysis, and the initial data reduction is performed at that central site. In contrast, the distributed (in-network) method considers the limited resources of the sensor nodes and performs some extra computation at the nodes to limit the messages and communication energy spent in transferring the data to the central site. Association rule mining is used in various sub-systems of the IoT domain, such as smart homes, water quality monitoring [9], event detection in traffic management systems, health monitoring, and intrusion detection [10], [11]. However, although effective, a large body of these schemes is not appropriate for WSN-based IoT, because they are usually resource-hungry and require a centralized architecture, which does not suit the distributed and resource-constrained nature of the WSN.
It is therefore necessary to redesign and customize the data mining-based approaches to adapt them to the WSN architecture for the overall performance improvement of the IoT. For instance, a viable technique for a WSN is one that is distributed and capable of tackling continuous and rapid streams of data without incurring a lot of processing and communication overhead. Moving in this direction, one of the main factors is energy conservation, which is always required in battery-powered, IoT-incorporated sensors.
Clustering in WSNs serves as a major strategy for preserving energy and extending the network lifetime [19], [25], [26]. A clustering algorithm in a WSN is divided into three phases. Phase 1 initiates the clustering process; in phase 2, the most important step is performed, i.e., Cluster Head (CH) selection, after which the clusters are formed; and in phase 3, data are collected from the cluster members (CMs) and then forwarded to a BS or sink. CH selection is the essential step of the clustering process, and various algorithms in the literature handle it: for example, DEC [27] utilizes energy as the factor for CH selection, LEACH-MF [3] utilizes energy, moving speed, and pause time, while CREEP [28] and ECH [4] utilize energy and distance. In our work, we assume that the network is clustered using any clustering algorithm for WSNs.
In this paper, the WSN is divided into clusters, each having a CH leading its CMs (see Fig. 1). In the proposed association rule algorithm, the CHs have the ability to decompose computations into local ones at their CMs. The CHs and CMs exchange minimal statistical summaries in order to perform partial computation tasks. Eventually, the partial outcomes from each CH are aggregated by the BS. The BS may ultimately transfer this data to the cloud, via the IoT platform, for further processing and analysis. Moreover, our approach is not only distributed in nature, which suits the WSN architecture, but also considerably reduces the data exchanged among sensors and between sensors and the BS, thereby saving network bandwidth and energy, and hence prolonging the lifespan of the network.
The contributions of this paper are as follows. We propose a novel version of the association rule-based data mining approach for the IoT-based WSN platform that finds the association rules of the sensor data without moving the data to the BS or the CHs. The proposed scheme considerably reduces the energy depletion of the sensor nodes, which is their main constraint. The effectiveness of the proposed scheme is demonstrated through an analysis of the total exchanged messages and through integration with benchmark algorithms in extensive simulations, comparing the results with the original benchmarks over various metrics. The results obtained show a significant performance improvement in terms of energy and network lifetime.
The remainder of the article is organized as follows. Section II presents the related work, Section III presents the problem formulation, Section IV presents the proposed technique, Section V presents the experimentation and results, and finally, Section VI concludes the paper.

II. RELATED RESEARCH
In the context of IoT, WSNs are the main data generation source offering new prospects for data mining and data analytics research to extract useful information for diverse array of applications. In the literature, a wide range of data mining algorithms have been applied in the IoT domain for discovering different knowledge patterns. These algorithms have been exploited in a variety of ways [29]. Frequent mining, sequential mining, classification, and clustering are the prominent approaches developed to analyze data and to make effective decisions in the IoT systems [30].
Various methods related to our proposed work have been presented in the literature, and are reviewed as follows. The authors in [31] proposed a centralized technique named Data Stream Association Rule Mining (DSARM), used to identify the missing values in the data captured by sensors. This technique detects the sensor nodes that recursively transmit duplicate data and estimates the missing values by manipulating the readings reported by other related sensors. Another estimation technique, proposed in [32], [33] and termed Closed item-sets-based Association Rule Mining (CARM), is employed to deduce the recent sensor association rules in a sliding window based on the latest closed item-sets. In [34], the authors proposed an online one-pass technique that changes the WSN data stream into a list form, called an Interval List (IL), thereby enabling inter-stream association rule mining from the huge sensor data stream. In [35], the authors proposed a rule-learning-based technique that extracts distinct rules from the data reported by sensors to control and coordinate the operations performed by the network. In [36], the authors proposed a tree-like data structure, called the sensor pattern tree, that is used to derive association rules from sensor data; this technique is advantageous because it scans the database only once.
In the literature, various distributed approaches have also been proposed to solve application-related issues and to optimize the performance of WSNs. The authors in [37] address the problem of distributed clustering in the context of WSNs. An asynchronous distributed clustering algorithm is proposed, based on in-network learning: the sensors learn clusters from the data being sensed without communicating the raw data to the BS, thereby minimizing mining time and communication overhead. k-means and Gaussian Mixture Models were used as the main clustering algorithms. This approach requires each node to know the other network nodes and to communicate data summaries. In [38], the authors proposed an environmental monitoring system and investigated the effect of environmental parameters on the Gross Primary Productivity (GPP) level by evaluating six classification models, including naïve Bayes, support vector machine, multilayer perceptron, decision tree, and k-nearest-neighbor. After selecting the best classifier, it is deployed on the sensors to predict the effect of the sensed data on the GPP level. To reduce communication overhead, the sensors communicate the outcomes to the BS only if the GPP is affected by the sensor readings.
In [39], the authors analyzed large datasets derived from a WSN-based real-time air pollution monitoring system. Different decision-making strategies were devised after applying business intelligence and data mining techniques to the data; k-means was used for clustering. The approach proposed in [40] integrates the WSN with an Artificial Neural Network (ANN) to detect forest fires. Data related to fire, such as smoke, light, and temperature, is collected by sensors deployed in different regions and transmitted to a pre-trained ANN installed at the BS. The ANN then detects fire and generates alarms. In [41], the authors proposed a forest fire detection and monitoring framework for clustered WSNs, and presented the inter- and intra-cluster protocols. To reduce communication overhead, the fire detection task is performed only by the cluster heads.
In [42], a data mining technique is used in a clustered WSN to detect forest fires. Each individual node is responsible for fire detection using a data mining-based classifier. Upon detection, a sensor node transmits an alarm message to the BS, routed via different CHs and CMs. To reduce the communication overhead, each sensor node sends only the abnormal values to the BS using the mining technique; all duplicate and normal readings are discarded. Data mining approaches have also been applied to advance agriculture using WSNs integrated with the IoT. For example, [43]-[45] employed association rules and linear regression to mine the sensor readings for efficient and accurate decision making.
In [46], the authors proposed an approach for energy reduction in the WSNs, called EK-means. The scheme works in two steps. In the first step, similar data is eliminated at each sensor using Euclidean distance. In the second step, the communication overhead is reduced by applying an enhanced version of k-means clustering at data aggregator nodes to group duplicate datasets produced by the neighbors into one cluster before sending the data to the BS. In [47], the authors proposed a Distributed k-means Clustering (KDC) mechanism. The authors also proposed an efficient data aggregation technique for WSN using adaptive weighted allocation which is based on their proposed KDC. The aim of the KDC is to reduce the data duplication at the sensor nodes. A closely related approach is used by [48] in underwater WSN. In [49], the authors proposed a data mining algorithm and a compact tree structure named as Associated Sensor Pattern tree (ASP-tree) for WSN. Both of these techniques are used to capture the associated sensor patterns. For overhead reduction, all associated patterns are produced by scanning the whole dataset only once using a pattern growth-based technique.
The above-mentioned schemes are summarized in Table 1. The existing techniques incur considerable communication overhead to achieve improved real-time decision making with enhanced precision for WSN applications. This overhead usually becomes one of the main causes of energy depletion in the WSN.
Our proposed technique is distributed and differs from the ones mentioned above. Its main idea is that we consider the data at the sensor nodes in the WSN to be a distributed database (D). The data at each sensor node is stored in the form of rows and columns (the columns denote sensor attributes). In our proposed method, the sensor nodes are grouped into different clusters, each having a CH that manages the cluster and a set of CMs. The CMs periodically answer the queries generated by their CHs; however, only the statistical summaries are communicated back to the CHs. The CHs, after aggregating the summaries, send the accumulated and computed outcome to the BS for final processing. These queries (SQL-like abstractions) are often continuous, so that the application is notified continually about the changes recorded by the sensors, unlike traditional queries that mainly focus on the current state of a database [5]. Sensor nodes do not respond to queries unless they have recorded new readings since the previous probing. This approach to data accumulation reduces the huge volume of data communicated across the network and the computation burden incurred on the BS, which ultimately increases the lifespan of the WSN. Furthermore, for efficient query processing with low energy dissipation and minimal delay, an efficient load balancing policy is employed that takes into account the remaining power and the load of the nodes.
The time synchronization problem of synchronizing the local clocks of the sensor nodes in a WSN has been extensively studied in the literature over the last two decades, and yet no specific time synchronization scheme is available that achieves a high order of accuracy with great scalability, independent of topology and application [50]-[52]. In this paper, a synchronization process between the sensor nodes is assumed; however, the synchronization process is not the main study of this paper.

III. PROBLEM FORMULATION
In this section, we describe the different types of data distribution and the proposed methodology to manage these distributed databases without moving and joining them at one node, such as a CH or the BS. As mentioned earlier, we assume that each sensor node possesses a component database containing a set of attributes.

A. IMPLICIT GLOBAL DATABASE
Each sensor node s_i stores a component D_i of the global database D. The data at s_i is stored in the form of tuples. A database component D_k residing at node s_k (k ≠ i) includes certain attributes shared with D_i and some diverse non-shared attributes. This distribution strategy requires a "Join" operation to construct the implicit global database D from the components D_i. The proposed methodology utilizes the shared attributes to perform the data processing. This scheme presents a more realistic approach than applying non-overlapping single-key attribute sets for the components allocated around the nodes in the network. The implicit global database D exists as fragments (each fragment represents the data at one sensor node) distributed over the nodes in the network. By implicit format we mean that each tuple of D exists in a distributed format at the sensor nodes, i.e., the tuples do not explicitly exist at the BS or the end user.

B. INTEGRATION OF SENSOR DATABASES
We assume that the global database D corresponding to the WSN is distributed as local database components over all the sensor nodes in the WSN, as described above for the implicit global database. The global database D can be generated at the end user or the BS through the join of these component relations, and can provide data suitable for performing computation as well as mining activities using association rules. The key focus of the proposed scheme is to mine the implicit global database D using association rules while keeping every data fragment D_i at its sensor s_i and minimizing the data communication between the CMs and their CH. As a result, only the local results for each database D_i at sensor node s_i are transmitted from the CM s_i to its respective CH for aggregation. Finally, the results from the CHs are sent to the BS to produce the final association rule results.
The mathematical formulation of the proposed problem can be described as follows: consider a WSN with n sensor nodes, where each sensor node s_i has a database component D_i with A_i as its set of attributes. Let A be the union of the attributes of the local components at all sensor nodes in the WSN:

A = A_1 ∪ A_2 ∪ ... ∪ A_n (1)
Let S_ij be the set of attributes shared between D_i and D_j:

S_ij = A_i ∩ A_j, i ≠ j (2)

Then, the union of all shared attributes among the local relations can be defined as the set S, computed as:

S = ∪_{i≠j} S_ij (3)

The proposed scheme emphasizes determining the association rules of the implicit D through minimal communication of messages among the WSN nodes. Therefore, the global computation task is divided into local computations taking into account the shared-attribute constraints. As a result, aggregating the summaries of the local computation results can produce the global association rules. This can be formulated mathematically in a general way as follows: consider a function F applied on the explicit database D to obtain the result R:

R = F(D) (4)

As stated previously, the required distributed computation is the derivation of the association rules corresponding to D. Here, F denotes the algorithmic implementation of the derivation of association rules for D, and R represents the obtained association rules for D.
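The attribute-set construction of Equations 1-3 can be sketched in a few lines of Python. The shared attributes a, b, c follow the paper's running example, while the per-node non-shared attributes (e, f, g) are assumptions made for this sketch.

```python
from itertools import combinations

# Sketch of Equations 1-3: per-node attribute sets A_i, pairwise shared
# sets S_ij = A_i ∩ A_j, and their union S. The shared attributes a, b, c
# follow the paper's example; e, f, g are hypothetical local attributes.

attrs = {
    "s1": {"a", "b", "c", "e"},
    "s2": {"a", "b", "c", "f"},
    "s3": {"a", "b", "c", "g"},
}

A = set().union(*attrs.values())                 # Equation 1: union of all A_i
S_pairs = {(i, j): attrs[i] & attrs[j]           # Equation 2: S_ij = A_i ∩ A_j
           for i, j in combinations(attrs, 2)}
S = set().union(*S_pairs.values())               # Equation 3: union of all S_ij
print(sorted(S))  # ['a', 'b', 'c']
```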
If the database D is implicitly defined, it is the responsibility of every CM to perform the local computation on its respective database component. The local results are then exchanged with the CH to obtain the global computation results. The attribute set S shared between the components finally makes it possible to generate the global D from the components D_1 to D_n. The corresponding realization of the function F given in Equation 4 can be rewritten as:

F(D) = H(h_1(D_1, S), h_2(D_2, S), ..., h_n(D_n, S)) (5)

Here, h_i(D_i, S) represents the i-th CM's local computation on D_i at s_i, and the operation H represents the aggregation, executed by the CHs, of the local computation results. Each problem involves a distinct set of h-operators (the h_i's), and the features of H and the h_i's rely on S and the participating D_i's. Finally, the BS receives the association rules discovered by the CHs. H and the h_i's, therefore, need to be dynamically determined by the CH for each instance of F(D), depending on the participating nodes, the attributes contained in their native databases, and the sharing pattern of the attributes.

IV. DISTRIBUTED ASSOCIATION RULES MINING FOR IoT BASED WSN
The key focus of our newly proposed scheme is to identify the global association rules of the implicit global database D. This global computation task is divided and allocated over the sensor nodes in the network such that the computations are executed locally at each CM and only the statistical summaries are gathered and communicated. The BS initiates the global computation task by sending requests to the CHs to perform the computations and find the association rules with their CMs. On receiving such a request, each CH starts to create the shared relation with its CMs and asks its CMs to execute computations, such as support and confidence, locally, as we explain in detail with an example in this section. This minimizes the size of the messages and ensures that only a minimal number of messages are communicated between the CMs and the CH and between the BS and the CHs, which in turn minimizes the energy consumed and hence enhances the WSN lifetime. The proposed distributed mining technique for deriving association rules from a cluster in an IoT-based WSN comprises three main phases: Initialization, Support and Confidence Computing, and Aggregation, as depicted in Fig. 3. During the initialization phase, each CH generates the shared relation using the attributes of the shared set and their values obtained from the CMs. In the Support and Confidence Computing phase, each CH queries its members to compute the support and confidence. Finally, in the Aggregation phase, the CH determines the local association rules and sends the aggregated results to the BS.
The whole network is partitioned into k distinct clusters with the help of a clustering technique, such as DEC [27]. Each CH, CH_i, has n associated CMs s_i^j, j = 1, ..., n. The CH and its CMs collaboratively execute the mining algorithm presented in Algorithm 1.

A. INITIALIZATION PHASE
In this phase, every CH generates the shared relation as follows. We define the relation Pshared as the cross product of all distinct values of the shared attributes in S, i.e., it includes the records corresponding to all possible combinations of values of the attributes in S, mediating the formation of the global D. The records that have a zero count at some node are then eliminated from Pshared, and the resulting relation is the shared relation, as described in Algorithm 2. This phase is executed by every CH_i.

Algorithm 1 Mining Algorithm (will be executed by every CH_i)
1: Call the Shared Relation procedure.
2: Find the candidate item-sets C_i.
3: Find the frequent item-sets F_i by calling the find-frequent-item-sets procedure.
4: Extract the rules R_i by calling the Extract Rules(F, support, confidence) procedure.
5: Compute the total number of tuples at CH_i (N_total-i).
6: Send to the BS: R_i with the confidence of each rule, and N_total-i.

Example: For clarification, we consider three database components at the sensor nodes s_1, s_2 and s_3, with databases D_1, D_2 and D_3 respectively, such that these databases jointly determine the global implicit database D. Moreover, we assume that s_1, s_2 and s_3 are CMs of a cluster with CH CH_1. The local databases D_1, D_2 and D_3 of the three nodes are shown in Table 2. The shared attributes are a, b and c, with distinct values a = {a1, a2}, b = {b1, b2, b3} and c = {c1, c2}. The relation Pshared is the cross product of the values of the shared attributes, as in Table 3. Considering only the tuples in Pshared that have a non-zero count at s_1, s_2 and s_3 (the record (a1, b1, c2) has a zero count at s_3), the relation Shared is as in Table 4.
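A minimal sketch of this initialization step is given below. The local relations are hypothetical stand-ins, projected onto the shared attributes only; they are arranged so that, as in the example, the tuple (a1, b1, c2) has a zero count at s_3.

```python
from itertools import product

# Sketch of the initialization phase (Algorithm 2): build Pshared as the
# cross product of the distinct shared-attribute values, then drop every
# tuple that has a zero count at some CM. The local relations below are
# hypothetical; only the shared columns (a, b, c) are shown.

domains = {"a": ["a1", "a2"], "b": ["b1", "b2", "b3"], "c": ["c1", "c2"]}

# each CM's local relation projected on the shared attributes
local = {
    "s1": [("a1", "b1", "c1"), ("a1", "b1", "c2"), ("a2", "b2", "c1")],
    "s2": [("a1", "b1", "c1"), ("a1", "b1", "c2"), ("a2", "b2", "c1")],
    "s3": [("a1", "b1", "c1"), ("a2", "b2", "c1")],  # no (a1, b1, c2)
}

pshared = list(product(*domains.values()))  # 2*3*2 = 12 candidate tuples

def count(node, tup):
    """Local count of a shared tuple at one CM (a CM-side query)."""
    return local[node].count(tup)

shared = [t for t in pshared if all(count(n, t) > 0 for n in local)]
print(shared)  # [('a1', 'b1', 'c1'), ('a2', 'b2', 'c1')]
```

Note that (a1, b1, c2) survives at s_1 and s_2 but is dropped because s_3 reports a zero count for it, mirroring the example.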

B. SUPPORT AND CONFIDENCE COMPUTING PHASE
This phase is implemented by each CH_i to perform the following two tasks: 1) enumerate the candidate-sets of the next level using the frequent item-sets of the prior level; and 2) compute the support and confidence values. Every CH_i initiates the execution of the Support and Confidence Computing phase. CH_i is responsible for the crucial control operations, such as determining and handling both the active and candidate item-sets, and communicating with its CMs to determine the support and confidence. The computation is therefore decomposed and iteratively executed and managed by every CH_i.
The support of an item-set can be defined as the ratio between the count of transactions which include the item-set and the total count of transactions in the implicit D. Hence, the major computational primitive required is the determination of total tuples count in D.

1) TUPLES COUNT IN IMPLICIT DATABASE
The tuples count in the implicit database can be computed only after obtaining the local computation outcomes from each sensor node. However, such computations that must satisfy particular attribute-value conditions (shared tuples) are challenging, and are detailed below. We decompose this process of counting tuples by requesting feedback from the CMs holding the D_i's regarding the local counts. The corresponding replies from the CMs are then used to determine N_total(D), i.e., the total number of tuples in D. This can be expressed as follows:

N_total(D) = Σ_j Π_t N(D_t)_cond_j (6)

where cond_j denotes the attribute-value condition of the j-th tuple of the Shared relation, and N(D_t)_cond_j denotes the count of tuples in D_t at CM s_t that satisfy cond_j. Based on Equation 6, we can write the count contributed by a single shared tuple as:

N_j(D) = Π_t N(D_t)_cond_j (7)

such that j refers to the j-th tuple of the relation Shared. Such a summary is required for each tuple in Shared from each CM. The role of the function H is to calculate the sum of products from the received summaries according to Equation 6, in which each product term represents the count of tuples satisfying cond_j in a D_i, and the result gives the number of distinct tuples fulfilling cond_j needed for the implicit Join of all the D_i's. The summation is then performed over the product terms computed for each tuple. This operation simulates a Join executed on all the databases without explicit enumeration of the tuples. The most favorable aspect of decomposing N_total(D) is that each product term N(D_t)_cond_j can be translated into an SQL query, SELECT COUNT(*) WHERE cond_j, executed by CM s_t.
In the example mentioned above, the count for each shared tuple can be computed using Equation 6 by taking cond_j as the items of the tuple. For the first tuple of the Shared relation, cond_j is (a = a1, b = b1, c = c1), which gives N(D_1)_cond_j = 2, N(D_2)_cond_j = 2, and N(D_3)_cond_j = 2, so the product term is 8, i.e., the number of tuples corresponding to the first shared tuple is 8. The indexed relation Shared and the number of tuples corresponding to each shared tuple are computed in Table 5. The total number of tuples (N_total) is 23.
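The sum-of-products of Equation 6 can be sketched as follows. The counts (2, 2, 2) for the first shared tuple follow the paper's example; the counts for the second tuple are hypothetical, so the resulting total is illustrative rather than the paper's value of 23.

```python
# Sketch of Equation 6: N_total(D) is the sum, over the Shared tuples, of
# the product of per-CM local counts N(D_t)_cond_j. The first row of
# counts (2, 2, 2) follows the paper's example; the second row is
# hypothetical.

shared_tuples = [("a1", "b1", "c1"), ("a2", "b2", "c1")]

# local_counts[node][tuple] = the CM's reply to SELECT COUNT(*) WHERE cond_j
local_counts = {
    "s1": {("a1", "b1", "c1"): 2, ("a2", "b2", "c1"): 1},
    "s2": {("a1", "b1", "c1"): 2, ("a2", "b2", "c1"): 3},
    "s3": {("a1", "b1", "c1"): 2, ("a2", "b2", "c1"): 1},
}

def n_total(tuples, counts):
    total = 0
    for t in tuples:                 # one product term per cond_j
        term = 1
        for node in counts:
            term *= counts[node][t]  # N(D_t)_cond_j from that CM's reply
        total += term                # simulates the implicit Join
    return total

print(n_total(shared_tuples, local_counts))  # 2*2*2 + 1*3*1 = 11
```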

2) SUPPORT AND CONFIDENCE FOR CANDIDATE SETS
The support of an item-set can be defined as the ratio between the number of transactions that include the item-set and the total number of transactions in D, and the confidence of a rule is the fraction of the transactions containing the antecedent X that also contain the consequent Y. Hence, the major computational primitive required is the determination of the total tuples count in D, which can be computed only after obtaining the local computation outcomes from each sensor node. It is possible to extend the tuples-count decomposition to count the tuples that additionally meet a new condition by modifying cond_j of Equation 6, as given below; this is essential for identifying the support of a candidate frequent item-set.
N(D_t)_{cond_j AND new-condition} (8)

The method by which CH_i finds the support of a candidate frequent item-set is as follows. In the relation Shared, CH_i checks the specified condition, identifies the tuples matching the attribute-value pairs in the candidate set, and retains those tuples to find the number of tuples in this reduced Shared relation. The support of a candidate set of attribute-value pairs is the ratio of the resulting candidate-set count to the total count N_total. Algorithms 3 and 4 provide the frequent item-sets computation at each CH and the candidate item-set generation procedure, respectively.

Algorithm 3 Frequent Item-Sets
Continuing the previous example, in order to extract the frequent item-sets we assume the following: (1) the minimal support threshold is 0.30, i.e., the minimum number of occurrences for an item-set to appear in the list of frequent item-sets is 7 (0.30 × 23); (2) the minimum size of an item-set is 1; and (3) the maximum size of an item-set is the largest item-set found (i.e., k = 1, 2, 3, etc.).
• According to our support threshold, the set of frequent 1-item-sets from C_1 is F_1 = {a1, a2, b1, b2, c1, e2, e4, f1, f2, f6}. The set of candidate 2-item-sets (C_2) is the set of 2-combinations of the items in F_1 such that no two items belong to the same column; e.g., (a1, a2) is an invalid combination. The resulting combinations include {a2, b1}, {c1, e1}, {e2, f1}, {b2, ...}, and so on. As in the previous step, the frequency of each combination in the table can be computed using Equation 8; for example, the frequency of {e1, f1} is obtained with new-condition = (e = e1 and f = f1) and cond_j = the shared tuple with index j, j = 0, 1, 2, 3, 4. Similarly, the numbers of tuples containing the other non-shared items are given in Table 7.
• To form the set of candidate 3-item-sets, C_3, we find the combinations of items in F_2 such that no two items belong to the same column (e.g., any combination containing both a1 and a2 is invalid), joining any two 2-item-sets from F_2 that intersect with each other.
• Similarly, we can obtain the frequency of each item-set in C_3. The set of frequent 3-item-sets, F_3, consisting of those candidate 3-item-sets in C_3 that have minimum support, is shown in Table 8.
• The combination from F_3 is C_4 = {a2, b2, c1, e2}; the count of this item-set is 3, which is below the support threshold, so F_4 = φ. Thus C_5 = φ, the algorithm execution terminates, and all the frequent item-sets have been obtained.
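The column constraint used when forming the candidate sets can be sketched as below; the convention that an item's leading letter names its column (a1 and a2 both belong to column a) is taken from the example's naming, and the frequent items fed in are a small hypothetical subset.

```python
from itertools import combinations

# Sketch of candidate generation with the column constraint: a candidate
# item-set may not contain two items from the same attribute (column).
# Items are strings whose leading letter names the column, following the
# example's naming (so {a1, a2} is invalid).

def column(item):
    return item[0]  # 'a1' -> 'a' (works for the example's item names)

def gen_candidates(frequent, k):
    """All k-item combinations of the frequent items, one item per column."""
    items = sorted(set().union(*frequent)) if frequent else []
    out = []
    for combo in combinations(items, k):
        cols = [column(i) for i in combo]
        if len(set(cols)) == k:        # reject same-column combinations
            out.append(set(combo))
    return out

f1 = [{"a1"}, {"a2"}, {"b1"}, {"c1"}]  # hypothetical frequent 1-item-sets
c2 = gen_candidates(f1, 2)
print(c2)  # {a1, a2} excluded; pairs across distinct columns kept
```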

C. EXTRACTION AND INTEGRATION OF RULES
Extraction: During this step, each CH_i extracts the association rules using the frequent item-sets F. The major steps of the rule-extraction process, as presented in Algorithm 5, are given below.

Algorithm 5 Extract Rules (will be executed by every CH_i)
1: Input: F_i: large item-sets, support, confidence
2: Output: R_i: association rules satisfying support and confidence at CH_i
3: R = φ
4: for each f ∈ F_i do
5:   for each c ⊂ f | c ≠ φ, c ≠ f do
6:     if support(f)/support(c) ≥ confidence then
7:       R = R ∪ {c ⇒ (f − c)}
8:     end if
9:   end for
10: end for

• For each frequent item-set f ∈ F, all nonempty subsets c of f with c ≠ f are examined, and the rule c ⇒ (f − c) is retained if support(f)/support(c) ≥ confidence (where confidence is a threshold).
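The rule-extraction step above can be expressed as a minimal Python sketch, assuming the supports of all frequent item-sets and their subsets are available in a dictionary (the names and data layout are illustrative assumptions):

```python
from itertools import combinations

# Hedged sketch of the confidence-based rule extraction: for every
# nonempty proper subset c of a frequent item-set f, emit the rule
# c => (f - c) when support(f)/support(c) meets the confidence threshold.

def extract_rules(frequent_sets, support, min_conf):
    rules = []
    for f in frequent_sets:
        for size in range(1, len(f)):
            for c in map(frozenset, combinations(f, size)):
                conf = support[f] / support[c]   # confidence of c => f - c
                if conf >= min_conf:
                    rules.append((c, f - c, conf))
    return rules

support = {frozenset({"a1"}): 0.5,
           frozenset({"b1"}): 0.4,
           frozenset({"a1", "b1"}): 0.35}
rules = extract_rules([frozenset({"a1", "b1"})], support, 0.7)
# {a1} => {b1} with confidence 0.70; {b1} => {a1} with confidence 0.875
```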
Then, a message is sent to the BS from every CH_i; it includes the rule set R_i with the confidence of each rule and the total tuple count N_i.
Integration: The BS integrates the rules by considering their confidence with the help of N_i and the total tuple count of the global WSN database. The major steps of the integration process, as presented in Algorithm 6, are given below.
• R_i, i = 1, . . ., k, is the rule set of CH_i, where j is the rule count in R_i, c_i^j is the confidence, and r_i^j is the rule.
• N_i is the tuple count obtained from CH_i.
• Let δ be the total tuple count obtained from the k clusters.

Algorithm 6 Rules Integration (will be executed by the Base Station)
1: N_total−j is the number of tuples at cluster j
2: Let confidence(R_i^j) be the confidence of rule R_i at cluster j
3: Let δ be the total number of tuples obtained from k clusters
4: R = R_1 ∪ · · · ∪ R_k {all rules received from the k clusters}
5: δ = Σ_{i=1}^{k} num_of_tuples_i
6: for each rule R_i ∈ R do
7:   weight(R_i) = confidence(R_i^j) × num_of_tuples_j / δ
8: end for
9: Select the rules that satisfy the weight threshold value

In our example, the rules finally generated at CH_1 are shown in Table 9.
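The integration step can be sketched as below. Since the weighting line of Algorithm 6 is only partially legible in the source, this sketch assumes each rule's confidence is weighted by its cluster's share of tuples, num_of_tuples_j/δ, and summed over the clusters that reported it; that weighting, like all names here, is an assumption rather than the paper's exact formula:

```python
# Hedged sketch of rule integration at the BS: weight each reported
# confidence by the reporting cluster's fraction of the global tuple
# count and keep the rules whose aggregate weight passes the threshold.

def integrate(cluster_rules, min_weight):
    """cluster_rules: list of (num_of_tuples_j, {rule: confidence}) per cluster."""
    delta = sum(n for n, _ in cluster_rules)          # total tuples over k clusters
    weights = {}
    for n_j, rules in cluster_rules:
        for rule, conf in rules.items():
            weights[rule] = weights.get(rule, 0.0) + conf * n_j / delta
    return {r: w for r, w in weights.items() if w >= min_weight}

clusters = [(60, {"a1=>b1": 0.9}),
            (40, {"a1=>b1": 0.6, "c1=>e2": 0.5})]
print(integrate(clusters, 0.5))
```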

D. COMPLEXITY COMPUTING AND ANALYSIS OF HYBRID ALGORITHM
The cost of working with an implicitly specified set of tuples can be measured in various ways. One cost model computes the number of messages that must be exchanged among the various sites (sensor nodes). The complexity of distributed query processing in databases has been discussed in [53], and this cost model measures the total data transferred to answer a query. In our case, the local computation at every sensor can be neglected in comparison with the message exchange, and the amount of data transferred is very small (statistical summaries), but the number of messages exchanged may grow rapidly with the number of iterations of the proposed mining algorithm [54]-[58]. At each cluster in the WSN, a number of messages need to be exchanged by the hybrid algorithm in order to extract the association rules. Let us say:
1) K is the number of CHs,
2) m is the average number of CMs in each cluster, and
3) the algorithm runs up to k frequent item-set levels.
We derive below an expression for the number of messages that need to be exchanged by our proposed algorithm when dealing with the implicit set of tuples.

1) CREATING THE SHARED RELATION
The number of messages exchanged during the creation of the Shared relation by every CH can be computed as follows:
• m messages to enquire about the shared attributes among the CMs,
• m messages to enquire about and receive the distinct shared attribute items to create the Pshared relation, and
• m messages to compute the count of each tuple in Pshared to find the Shared relation.

2) DETERMINING k FREQUENT ITEM-SETS
Frequent item-sets at each level of the association rule algorithm can be determined by exchanging only 2m messages among the CMs. If the association rule algorithm needs to run up to k levels, then a total of 2mk messages must be exchanged among the CMs. This number of messages does not depend on the number of tuples in each database, and the system is therefore easily scalable to large databases. It is also much smaller than the data that would need to be transferred if we were to accumulate all databases at one site and then perform the data mining task there. Hence, the total count of exchanged messages is as follows (Algorithms 3 and 4). The total message count for one cluster is 3m + 2mk, so the total for K clusters is:

Total number of messages = Km(2k + 3)

This complexity analysis shows that the number of messages that need to be exchanged between the CHs and their CMs does not depend on the size of the database at each sensor. The communication complexity depends primarily on the number of attributes and the manner in which they are shared among the sensor nodes. This is significant because it shows that as the sizes of the individual databases grow, the communication complexity of the algorithm remains unaffected. The computational cost of the local computations grows with the database size at each individual sensor, but here, too, our decomposable version has an advantage over transporting the data, joining it, and then running the traditional association rule algorithm at the BS: there is a tremendous saving in computational cost when the decomposable version is executed instead of moving the data, creating a join, and then running the association rule algorithm.
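As a quick sanity check, the per-cluster and total message counts derived above can be evaluated directly:

```python
# Direct encoding of the derived totals: 3m + 2mk messages per cluster
# (Shared-relation setup plus k mining levels) and K*m*(2k+3) overall,
# independent of the per-node database size.

def messages_per_cluster(m, k):
    return 3 * m + 2 * m * k

def total_messages(K, m, k):
    return K * m * (2 * k + 3)

# e.g. 10 clusters, 9 members each, 4 frequent-item-set levels
assert total_messages(10, 9, 4) == 10 * messages_per_cluster(9, 4)
print(total_messages(10, 9, 4))  # 990
```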
Also, for the communication cost, the number of partial results that need to be transmitted is far fewer than the messages that would have to be transmitted if the entire databases were collected at some central site such as the BS. Another important gain of the decomposable version is that it preserves the privacy of the data by not requiring any data tuples to be placed on the communication network. It also preserves the integrity of the individual databases because no sensor needs to update or write into any of the participating databases; all queries are strictly read queries.

V. SIMULATION RESULTS
In this section, we evaluate the performance of our proposed algorithm using MATLAB R2016b. During the simulation-based experimentation, we conducted two types of experiments to validate and evaluate the performance of the proposed approach; both experiments employ DEC [27] as the clustering algorithm in the WSN. In the first experiment, we examine the effect of the support value, the number of shared attributes, and the number of CHs on the number of messages exchanged. In the second experiment, the network lifetime, the number of alive nodes per round, and the average remaining energy are used as metrics before and after the integration of our proposed approach with the DEC [27] and CREEP [28] clustering algorithms. We consider heterogeneous multifunctional sensor nodes, where each sensor node has the ability to sense multiple attributes [59], [60]. Each sensor node maintains a flat table as a database, and each column of that table represents an attribute. The attributes are assigned randomly to each sensor node from a predefined set of attributes.

A. EXCHANGED MESSAGES PARAMETERS
In this set of experiments, 100 sensor nodes are randomly deployed on a 2D plane to monitor a region of size 100 × 100 m². All experimental results were obtained by averaging over various topology seeds while using different sets of clusters.
Support value: In this experiment, we demonstrate the effect of the selected support value on the number of messages exchanged. The support value is varied from 0.1 to 1 in increments of 0.1, and at each value the number of messages is calculated. Fig. 4 depicts that as the support value increases, the number of messages exchanged decreases. The number of messages is reduced by 40% to 90% compared with the centralized approach (or centralized extraction), in which all data is communicated to the CH, i.e., with a zero support value. Hence, our proposed data mining scheme reduces the amount of communicated data and, as a result, decreases the communication overhead.

Cluster head percentage: In this experiment, we used the same settings as in the previous one; however, this time we vary the percentage of CHs from 5% to 50% in increments of 5% and compute the number of messages at each percentage. Fig. 5 depicts the effect of the CH percentage on the number of messages. It is evident from the figure that the number of messages goes up as the percentage of CHs increases. This is because, as the percentage of CHs increases, the number of messages communicated to the BS also increases; moreover, the increase in the number of clusters raises the number of extracted rules, which leads to additional communication overhead.

Percentage of shared attributes: In this experiment, we employ a setup identical to that of our first experiment; however, we vary the percentage of shared attributes from 5% to 50% in increments of 5% and calculate the number of messages at each percentage. The effect of the number of shared attributes on the number of exchanged messages is depicted in Fig. 6. It is evident from the figure that an increase in the percentage of shared attributes induces a decrease in the number of messages. This is because the increase in the number of shared attributes decreases the percentage of unshared attributes; as a result, the number of messages required for determining and handling the unshared attributes decreases.

B. VALIDATION AND EFFECTIVENESS
In this set of experiments, the network area is 100 m × 100 m with 100 randomly deployed nodes, and the BS is placed at the center of the network area. The clustering algorithms adopted are DEC [27] and the Cluster-Head Restricted Energy Efficient Protocol (CREEP) [28], with 10% of the nodes acting as CH nodes. The energy model and its parameters are the same as prescribed in [27]. The total network energy is assumed to be 102 J. All the experiments use different random topology seeds. For comparison, we integrated our proposed algorithm into DEC, denoted by DEC+Proposed Approach, and into CREEP, denoted by CREEP+Proposed Approach, and we then compare them with the original DEC and CREEP algorithms using the following metrics: 1) first node dies (FND): the time elapsed from the start of the experiment until the first sensor node dies; 2) number of alive sensor nodes per round: the number of alive sensor nodes in the network after each round; 3) average remaining energy per round: the ratio of the total remaining energy of all sensor nodes to the number of nodes.
• First node dies: Fig. 7 depicts that the integration of our proposed algorithm with DEC and CREEP extends the overall network lifespan until the FND, i.e., the proposed approach considerably reduces the energy depletion across all sensor nodes. This is because our proposed approach tackles the data accumulation process via mining and hence saves energy. Moreover, extracting association rules among sensors helps capture the set of sensors that report the same or predictable data, which leads to a reduction in the depleted energy.
• Number of alive sensor nodes per round: Fig. 8 shows the number of alive nodes in the network per round for the original DEC, original CREEP, DEC+Proposed Approach, and CREEP+Proposed Approach. It is evident from the figure that the numbers of alive nodes in the integrated DEC and integrated CREEP are higher than those of the original DEC and CREEP algorithms. This is because our proposed algorithm reduces the energy depletion at each sensor node by reducing the volume of data sent by each node through the distributed mining algorithm, which leads to energy conservation.
• Average remaining energy per round: Fig. 9 depicts the residual energy after each round for the original DEC, original CREEP, DEC+Proposed Approach, and CREEP+Proposed Approach. The results show that the integrated DEC and integrated CREEP retain the maximum residual energy in each round, i.e., the energy consumed by the integrated DEC and CREEP is less than that consumed by the original DEC and CREEP. The reason for this energy conservation is that our proposed approach decreases the transmission of duplicate information.

VI. CONCLUSION AND FUTURE WORK
WSNs, being an integral part of the IoT, are the main sources of huge volumes of data. This data, if not managed properly, would cause serious resource management problems. In this research article, we have proposed a novel approach to manage the data efficiently: our technique mines the data generated by sensors locally without communicating it to any cluster head or base station, and cluster members communicate only statistical summaries to the cluster heads. The main idea of our proposed scheme is to confine computations to the sensor nodes and to reduce inter-node communication in order to minimize the energy wastage and communication overhead; this is how our proposed approach increases the sensor network lifetime. Our proposed idea is backed by extensive simulations, where the results obtained show the efficiency of our scheme in terms of reduced energy depletion and prolonged network lifetime. Our future work can be outlined as follows. First, applying the proposed scheme to mobile WSNs, where the network topology is continually changing, would be challenging because these changes may affect the results of the scheme.
One idea to solve this is to employ the concept of fog nodes and move the cluster head operations to a fog node; another is to make the scheme aware of the underlying clustering protocol and topology. Second, selecting improper and inflexible support and confidence thresholds could increase the computational complexity, so selecting the most appropriate and adaptable thresholds needs further investigation. Third, the proposed scheme benefits from the heterogeneity of multifunction sensors to reduce the traffic size and to preserve the privacy of the sensor data, but a preprocessing step that performs data labeling is required at each node. Improper data labeling could increase the number of messages and the computation costs, so this step should be handled carefully.