Recovering Missing Data via Top-k Repeated Patterns for Fuzzy-based Abnormal Node Detection in Sensor Networks

The stream data acquired by heterogeneous Internet of Things (IoT) sensors are seldom perfect. Most of the collected data streams include either missing or abnormal values caused by various factors such as failure, malfunction, or integrity attacks. Such unreliable data affect real-time monitoring and compromise the quality of data analysis. Applications that simply run anomaly detection over incomplete sensor data streams may therefore remain unreliable. A reliable method for recovering the missing data and detecting the abnormal ones is thus indispensable in the IoT environment. This paper presents FuzHD++, a new method to recover missing sensor data and detect abnormal nodes jointly rather than independently. Both elements, data recovery and abnormal node detection, rely on the observed temporal and spatial correlation of sensor data to achieve reliable recovery estimation and detection performance. In the data recovery process, the system adopts a matrix profile to extract the top-k repeated patterns from different sensor nodes. Furthermore, it utilizes the k-nearest neighbor estimator to recover the missing data based on the extracted pattern information of multiple neighbor nodes. During the abnormal node detection process, the system adopts a refined fuzzy rule-based detection method. The refined fuzzy rule-based inference system integrates the expert rules and the rules obtained from sensor data analysis to treat the ambiguity in the decision-making process. We validated the performance of FuzHD++ by comparing it with existing methods using two real-world datasets. Our results showed that the proposed missing sensor data recovery method improves the root mean square error by more than 20% over most existing methods. Furthermore, FuzHD++ achieved an average accuracy of 92% for analyzing the sensor readings and detecting the abnormal ones.
According to the results, the proposed mechanisms based on the observed temporal and spatial correlation analysis improve the robustness of IoT against data loss and integrity attacks.


I. INTRODUCTION
Since the emergence of the Fourth Industrial Revolution, there has been a growing trend in the use of elements of the Internet of Things (IoT). As a result, IoT environments such as smart homes and smart cities are becoming increasingly popular and have permeated various areas of our lives. In this context, the use of sensors on IoT devices ensures a seamless connection between the devices and the physical world.
Indeed, modern IoT devices are computer-like devices that come with a wide range of heterogeneous sensors connected through a dynamic and distributed wireless sensor network (WSN). In this context, the primary requirements of such devices are that they monitor their environmental conditions, report sensor data, and perform appropriate actions in response to the surrounding circumstances [1]. Meeting these requirements depends upon the reliability and trustworthiness of the collected sensor data. Unfortunately, the hostile environment of IoT sensors makes this issue more challenging. Most of the stream data obtained from such sensors include either missing or abnormal data caused by various factors such as sensor malfunction, transmission errors, storage errors, or malicious attacks. Indeed, the growing popularity of IoT, coupled with physical vulnerability and lack of standardization [1,2], has led malicious attackers to take an interest in IoT devices. As a result, various types of malicious activities already exist that attempt to compromise the security and privacy of IoT devices. Specifically, some studies [3][4][5][6][7][8][9] have shown that attackers can compromise and manipulate sensor data in real deployments through false data injection attacks (FDIAs). Such compromised nodes hamper the system's functionality, leading to inappropriate decisions by operators and possibly catastrophic effects [10]. This is where anomaly detection has become a necessity.
To guarantee a safe and reliable IoT system, a myriad of solutions have already emerged that tackle the problem of anomaly detection in sensor data [11,12], [33][34][35][36][37][38][39][40]. However, most of these works do not take into account the presence of missing data within the collected sensor streams. Incomplete sensor data can confuse anomaly detection methods, as they may create false conclusions that lead to wrong detection results. To improve our chances of correctly detecting abnormal sensor nodes, we think it is necessary to ensure the completeness of a sensor data stream before using it in any anomaly detection system. Therefore, it is essential to use appropriate techniques to handle missing and anomalous data with the capability of recovering and detecting them in an integrated framework.
The problem of abnormal node detection was tackled in our previous work [11], in which the spatiotemporal (ST) and multivariate attribute (MVA) correlations of heterogeneous sensor readings were considered in the detection process. The collected sensor data were analyzed through a hierarchical framework based on fuzzy logic to take advantage of domain knowledge and treat the ambiguity in the decision of detecting abnormal nodes. To handle missing data, we carried forward the last observed value. Two major issues remain in the methods presented in the original paper, concerning the adopted missing data recovery method and the design choice of the fuzzy inference system (FIS). First, the adopted missing data recovery method can be described as a hard recovery approach, as we are rigidly forced to pick the single last observed value as the only recovered value. Such a naive interpolation method may not be the best way to handle missing sensor values, and it may affect the abnormal node detection performance. Second, the adopted abnormal node detection method considers a FIS based on background knowledge. However, such a FIS may suffer from a loss of accuracy, especially when dealing with a crafty FDIA.
In this paper, we refer to our prior work as fuzzy-based hierarchical detection (FuzHD), and we tackle its limitations by presenting FuzHD++, a new method to recover missing sensor data and detect abnormal nodes jointly rather than independently. Specifically, in FuzHD++, we propose two new methods, which we refer to as top-k repeated patterns (TkRP) and fuzzy-based hierarchical detection with a refined rule base (FuzHD+rRB). These novel methods were devised to handle the missing sensor data and ensure higher abnormal node detection accuracy. In TkRP, we adopt the concept of a matrix profile [13] to extract the top-k repeated patterns from different sensor nodes. Furthermore, TkRP utilizes the k-nearest neighbor (k-NN) estimator to recover the missing data based on the extracted pattern information of multiple neighbor nodes. In FuzHD+rRB, the system adopts a refined fuzzy rule-based detection method to take advantage of domain knowledge and treat ambiguity in the decision-making process. Specifically, we design a hybrid FIS to detect abnormal nodes. Along with the background knowledge-based predefined rules, we also use the so-called Wang-Mendel (WM) method [14,15] for generating fuzzy rules from sensor data, making it a more comprehensive and flexible FIS.
The main contributions of this paper are as follows.
1) To guarantee accurate and reliable abnormal node detection, we introduce a new recovery method, TkRP, to correctly recover the missing sensor data.
2) We improve the previously proposed FIS in FuzHD [11] by introducing FuzHD+rRB, in which we design a hybrid FIS to detect abnormal nodes, making it a more comprehensive and flexible FIS.
3) We use a new FDIA threat model to generate a malicious dataset from the original sensor data [12], allowing us to test the abnormal node detection method and evaluate its performance against different threat severity levels.
4) We evaluate our proposed methods through a number of experiments designed to test their parameterization (number of top-k patterns, number of sensor nodes per cluster, length of sensor data streams), accuracy, and efficiency.
5) We also augment our evaluation with the Intel Lab dataset [42] in addition to the Yokota Lab dataset, as in our previous work.
Our experiments using the two real-world datasets demonstrate that the proposed missing sensor data recovery method TkRP improves the root mean square error (RMSE) by more than 20% over most existing methods. Furthermore, the combination of the two proposed methods, FuzHD++, achieves better results than FuzHD in terms of abnormal node detection, with an average accuracy improvement of 14.11%.
The remainder of the paper is organized as follows. Section II reviews the related methods of missing data recovery and anomaly detection in time series. Section III describes some essential background characteristics before introducing our proposed method. Section IV presents the detailed architecture and design of TkRP and FuzHD+rRB to recover missing sensor data and detect abnormal nodes, respectively. We then describe the experimental setup in Section V and present an analysis of the results and evaluation in Section VI. Section VII contains some concluding remarks and perspectives.

II. RELATED WORK
Applications, such as anomaly or abnormal node detection, built upon incomplete sensor data streams are obviously unreliable. If the missing data cannot be filled accurately, existing detection algorithms can hardly be performed. Recovering dirty and missing data could improve clustering over spatial data [16]. For sensor data streams, we argue that recovering the missing values can also improve applications such as abnormal sensor node detection. To guarantee a dependable IoT system, it is essential to conduct studies to deal with these two issues. Although much related research has indeed been carried out, most work tends to focus on recovering missing data or detecting anomalous data, and few studies have simultaneously addressed these two problems.

A. MISSING SENSOR DATA RECOVERY
The straightforward idea is to carry forward the last observed value [11]. However, such a naive interpolation method may not be the best way to handle missing sensor values, as it may confuse the abnormal node detection method and raise false alarms. The estimation of missing data has been extensively researched by applying different methods, for example, mean imputation, k-NN imputation, maximum likelihood, Bayes estimator, regression imputation, deletion, and multiple imputation [17]. However, none of these methods can be used in sensor data because they can only deal with discrete data and not continuous data. According to the underlying method, recovery algorithms for the missing sensor data problem can be classified as either matrix based or pattern based.
A matrix-based algorithm transforms the sensor data in a way that allows the application of dimensionality reduction. The singular value decomposition (SVD) method [18] is the most popular method that has been used to achieve such a goal [19][20][21][22][23]. Other matrix-based algorithms rely on techniques that differ from SVD, such as principal components analysis [24][25][26], centroid decomposition [27], matrix factorization [28], and nonnegative matrix factorization [29]. All of these matrix-based recovery algorithms multiply back the matrices after reduction and use the results to fill the original missing values. However, the number of reduced dimensions needs to be parameterized, as it heavily impacts the accuracy-efficiency trade-off. Moreover, these methods do not consider the sensor spatial correlation.
In contrast, pattern-based recovery methods [30][31][32] consider the high spatial and temporal correlation between sensor data streams. When a sensor stream is incomplete with missing values, an algorithm leverages the similarity to any number of reference sensor streams. The observed values in the reference sensor streams are treated as a query pattern. Any incomplete sensor stream matching in that pattern may reveal candidate replacement values in the base streams. Similar to matrix-based algorithms, pattern-based techniques also require predefined user parameters. The length of the query pattern dramatically impacts the accuracy-efficiency trade-off. If the pattern is too small, the technique loses accuracy; if the pattern is too big, the computational time in pattern comparison becomes too costly.
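As a rough illustration of the matrix-based idea described above (reduce, multiply back, fill), the following sketch iterates a truncated SVD over a time-by-sensor matrix. It is a generic hard-imputation loop and not the algorithm of any cited work; the function name `svd_impute`, the default rank, and the iteration count are assumptions chosen only for illustration.

```python
import numpy as np

def svd_impute(X, rank=1, n_iter=100):
    """Fill NaN cells of X (rows: time, columns: sensors) by iterated low-rank SVD.

    Hypothetical helper: `rank` plays the role of the number of reduced
    dimensions that matrix-based methods need as a parameter.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from the per-sensor mean of the observed readings.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Multiply the truncated factors back together.
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed cells are kept; only the missing cells are overwritten.
        filled = np.where(missing, low_rank, X)
    return filled
```

Choosing `rank` too small discards real structure, while choosing it too large increases cost and lets noise through, which is the accuracy-efficiency trade-off noted in the text.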

B. ANOMALY DETECTION IN SENSOR DATA
Several techniques for anomaly detection in IoT have been proposed, but most either restrict their application to faults or failures [33][34][35], or to specific network attacks alone [36][37][38][39]. The area of FDIA detection for WSN has been overlooked by existing works, and only a few studies tackled this issue [11,12,40].
In this context, anomaly detection methods for WSNs can be classified as those that run directly on sensing devices (i.e., distributed methods) or those that run on the cloud (i.e., centralized methods). Performing anomaly detection in a central processing system allows us to adopt complex algorithms and, consequently, to obtain accurate results. A centralized approach is proposed where all heterogeneous sensor streams are collected and controlled in a centralized base station [11,33]. The proposed solution evaluates the intensity of the correlation between the sensor streams by calculating the lag correlation between them.
A centralized failure detection approach is proposed where the base station aggregates the network sensor readings and detects failures by finding an insufficient flow of incoming data [34]. In contrast, distributed methods run directly on sensor nodes equipped with light computation capability. Most of these approaches require historical data samples to be kept in the sensor node, which has limited memory storage. A rule-based distributed fuzzy inference system for WSNs is proposed that combines both local and neighboring observations to identify the occurrence of events [10,35]. Their experimental results showed that using fuzzy logic improved the accuracy of the event detection. Thus, notwithstanding the limitations of the aforementioned works, few studies have simultaneously addressed both the problem of missing data recovery and abnormal node detection. Instead of simply discarding the sensor streams with missing data, we propose to recover them and then detect the abnormal nodes. Table 1 shows a summary of the characteristics of different approaches along with our proposed method.
In this paper, our proposed methods utilize the observed temporal and spatial correlation of sensor data to achieve reliable estimation and detection performance.

III. PRELIMINARY BACKGROUND
This section provides the essential background characteristics used in our proposed framework and discusses some assumptions about the monitoring environments considered in this paper.

A. SYSTEM AND SENSOR DATA MODEL
An environmental monitoring application in a WSN is defined as an application that monitors the real world and issues a report whenever an event of interest arises during a certain period in a specific location. This paper considers a typical WSN architecture consisting of heterogeneous sensor nodes, a server, and a network connecting all sensor nodes. The server is for collecting and processing sensor data. All the sensor nodes in the WSN are connected to this server directly or indirectly (Figure 1). This paper addresses the network scalability issue by adopting a hierarchical WSN topology based on two-level clustering. The adopted clustering method allows us to properly utilize the network energy among all nodes, capture the correlation between the sensors, and enhance the system's trustworthiness.

VOLUME 4, 2016

TABLE 1: Summary of the characteristics of different approaches along with our proposed method. Methods listed: [20], SoftImp [21], SVT [22], Grouse [24], Rosl [25], Spirit [26], CDRec [27], TeNMF [29], DynaMMo [30], STMVL [31], TKCM [32], SMART [34], 6thSense [35], FuzHD [11].
Before describing our proposed approach, we give definitions of the key terms used in this paper. To implement such a thorough monitoring system, $n$ sensor nodes $(S_1, S_2, \ldots, S_n)$ are geographically divided into clusters, each covering a certain area. Each cluster should include one cluster head (CH) and other heterogeneous cluster member (CM) nodes arranged into groups according to their type. Each group is controlled by a cluster aggregator (CA). The CMs are responsible for sensing and collecting various attributes, such as temperature, humidity, and light intensity. The CA is responsible for all communication between the CM and CH nodes. Once all the sensed data within the cluster are collected, the CH forwards the messages directly to the server.
Definition 4: We denote $size(C)$ as the number of sensors deployed in the cluster $C$. Let $C(S_i)$ be the cluster within which $S_i$ is located. The clustering formation is based on a defined distance threshold, $th_d$. Two sensors, $S_i$ and $S_j$, belong to the same cluster $C$ if and only if $C(S_i) = C(S_j)$, and the distance between $L(S_i)$ and $L(S_j)$ is less than $th_d$.
Definition 5: In addition to the clustering, homogeneous sensor nodes within the same cluster are divided into groups according to their type $T(S_i)$. Let $A(S_i)$ denote the group within which $S_i$ is located. The two sensor nodes $S_i$ and $S_j$ belong to the same group within the same cluster $C$ if and only if $T(S_i) = T(S_j)$ and $C(S_i) = C(S_j)$.
Definition 6: Let $I(S_i, S_j, t)$ denote an input report message received by $S_i$ from $S_j$ at time $t$.
Definition 7: $\overrightarrow{O(S_i)} = \{O(S_i, 1), O(S_i, 2), \ldots, O(S_i, m)\}$ is the sensor node's output data stream, where $O(S_i, t)$ is the data sensed by $S_i$ at time $t$, and $m$ is the length of the sensor data stream.
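The cluster and group membership conditions of Definitions 4 and 5 can be written as two small predicates. The sketch below is only an illustration: the `Sensor` record, its field names, and the use of two-dimensional Euclidean distance for the distance between locations are assumptions that do not come from the paper.

```python
from dataclasses import dataclass
from math import dist

@dataclass(frozen=True)
class Sensor:
    name: str
    location: tuple  # L(S_i), assumed here to be (x, y) coordinates
    kind: str        # T(S_i), e.g. "temperature"
    cluster: int     # C(S_i), cluster identifier

def same_cluster(a, b, th_d):
    """Definition 4: equal cluster id and distance strictly below th_d."""
    return a.cluster == b.cluster and dist(a.location, b.location) < th_d

def same_group(a, b):
    """Definition 5: equal sensor type and equal cluster."""
    return a.kind == b.kind and a.cluster == b.cluster

def groups_in_cluster(sensors, cluster_id):
    """Map each sensor type to the names of the cluster members of that type."""
    groups = {}
    for s in sensors:
        if s.cluster == cluster_id:
            groups.setdefault(s.kind, []).append(s.name)
    return groups
```

Each key returned by `groups_in_cluster` corresponds to one group that would be managed by a cluster aggregator.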

B. ASSUMPTIONS
Our research is based on the following assumptions.
• To reduce the complexity of the problem, we assume that every sensing environment is characterized by its environmental conditions, such as temperature, light intensity, and relative humidity.
• As noted, the clustering concept is adopted for the network topology. Although several complex and innovative clustering techniques have been proposed for WSNs, this paper considers a very simple clustering technique for environmental monitoring in WSNs.
• All clusters should be composed of homogeneous and heterogeneous sensor nodes to maintain high event-detection accuracy.
• Depending on the application, a CH node can be a special sensor with more potential than other sensor nodes in terms of energy, bandwidth, and memory. However, in this paper, we consider that all the sensor nodes in the network have the same performance characteristics. In addition, the role of the CH is periodically rotated among all nodes to balance the energy consumption and the traffic load in the network.
• In this paper, the CAs or the CHs do not aggregate the collected data. Instead, we need to keep the actual collected data from each sensor to recover the missing data and detect the abnormal data.
• N-modular redundancy is used to achieve a dependable and fault-tolerant WSN. Furthermore, the considered WSN must satisfy a good distribution of the clusters where at least three sensor nodes must be deployed within one cluster (i.e., triple modular redundancy is a particular case of N-modular redundancy).
• While some sensor nodes may be compromised and considered abnormal nodes, we assume that the majority of the sensors will remain trustworthy.

This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

FIGURE 2: Overview of FuzHD++: Fuzzy-based hierarchical abnormal node detection with top-k repeated patterns as a missing data recovery method. (Panel labels: Step 1, Step 2, Step 3; local decisions, group decisions.)

IV. PROPOSED APPROACH
Our proposed approach aims to guarantee the system's dependability by recovering the missing sensor data and detecting abnormal nodes. This problem can be expressed as follows.
Problem. Given n coevolving correlated sensor stream sequences provided by n heterogeneous sensors collected at the same time, recover the missing sensor data, determine at any point in time which sensors are abnormal, and report all such nodes.
To address these challenges and guarantee reliable and secure monitoring of WSN, we propose two new methods, called TkRP and FuzHD+rRB, for detecting abnormal nodes in a heterogeneous WSN while recovering the missing sensor data. Both TkRP and FuzHD+rRB utilize the observed temporal and spatial correlation of sensor data to achieve reliable estimation and detection performance. First, the proposed TkRP adopts a matrix profile to extract the top-k repeated patterns from different sensor nodes and utilizes the k-NN pattern information of neighbor nodes to recover the missing data. Second, FuzHD+rRB detects abnormal nodes using a fuzzy logic-based hierarchical detection method. The proposed framework is depicted in Figure 2, which shows the various sensor node modules and the flowchart for recovering missing data and processing abnormal nodes. We follow a three-step process to achieve our objective, as described in detail in the following subsections.

A. STEP 1 IN FUZHD++: DATA ACQUISITION AND FUZHD+RRB ABNORMAL NODE-LOCAL DETECTION
The first step involves collecting heterogeneous sensor streams from the various clusters deployed in the monitored area and performing the abnormal node-local detection. In the following subsections, we explain the details related to the local detection module in FuzHD+rRB.

1) Definition of the input/output variables along with their membership function
The CM senses environmental events and executes the local detection process to check whether the newly collected data are subject to abnormality. Figure 3 illustrates details of the design of the adopted scheme for the local detection module. This detection module considers temporal semantic correlations to derive a crisp local decision. Every CM maintains a short-term history of the collected sensed data. This aggregation of data is used to construct a sliding time window containing the most recent sensed data in the sensor node stream. In the literature on stream processing, time windows are a familiar concept [33]. In this paper, we use the time window not only as a mechanism for bounding the sensor node stream aggregation but also to profile the behavior of the sensor node readings over time [11]. The sensed data will be time correlated, and the variation range will usually be small in the short term [41]. By using the time window concept, we can derive valuable information regarding the sensor nodes' temporal similarity. The sensed data contain a k-second timestamp, indicating the time at which the sensor node reported the reading.

FuzHD+rRB: Local detection module
As shown in Figure 4, the sensor node time-series samples are grouped into $(p + 1)$ frames to compose the sliding time window of size $W_l$, where $l \in \{0, \ldots, p\}$.
As time passes, the window slides in one-frame increments over the sensor data stream. Each frame contains $T$ successive sensor readings. After setting the sliding time window, we apply a summarization function to extract the relevant information about sensor node temporal similarity. $F_0$ is the frame containing the most recently collected sensor readings. $F_1$ is the frame for the $T$ previous sensor readings. $p + 1$ is the size of the sliding time window, and $T$ is the number of sensor readings within each frame. For each frame $F_l$ within the window, we calculate the temporal similarity between the frame $F_l$ and the current frame $F_0$ [11]. The temporal similarity is given by Equation (1).
The average temporal similarity (TAS) between the current frame data and the data in the window is then calculated. As indicated in Equation (2), the average similarity is calculated as a weighted summation, because the closer a frame is to the current timestamp frame, the more it is correlated with it [11].
Here, $T_{f_0}$ is the last timestamp of the collected sensor reading $O(S_i, T)$ in frame $f_0$. Likewise, $T_{f_l}$ is the last timestamp of the collected sensor reading $O(S_i, T)$ in frame $f_l$. The smaller the TAS, the more the frame at the current timestamp deviates from the historical sensor node data. After the CM finishes the calculation of the TAS, it conducts the fuzzy local detection process. This paper proposes a fuzzified methodology for detecting abnormal nodes in the network. It uses fuzzy logic to identify the severity of abnormality of sensor nodes rather than just giving crisp results. Moreover, a fuzzy-based system tends to provide rules that are by nature easy to interpret.
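To make the sliding-window bookkeeping concrete, the sketch below splits a stream into frames and computes a weighted average similarity against the current frame. The exact forms of Equations (1) and (2) are not reproduced in this text, so both formulas in the code are stand-ins chosen only for illustration: the per-frame similarity is assumed to be 1 / (1 + mean absolute difference), and the weights are assumed to decay as 1/l with the frame index. All function names are hypothetical.

```python
import numpy as np

def split_frames(stream, T, p):
    """Return frames F_0..F_p (T readings each); F_0 holds the newest readings."""
    stream = np.asarray(stream, dtype=float)
    needed = (p + 1) * T
    if stream.size < needed:
        raise ValueError("stream is shorter than the sliding time window")
    window = stream[-needed:]            # most recent (p + 1) * T readings
    frames = window.reshape(p + 1, T)    # oldest frame first
    return frames[::-1]                  # reorder so that index 0 is F_0

def frame_similarity(f_l, f_0):
    """ASSUMED stand-in for Equation (1): 1 / (1 + mean absolute difference)."""
    return 1.0 / (1.0 + np.mean(np.abs(f_l - f_0)))

def average_temporal_similarity(stream, T, p):
    """ASSUMED stand-in for Equation (2): frames closer to F_0 get larger weights."""
    frames = split_frames(stream, T, p)
    sims = np.array([frame_similarity(frames[l], frames[0]) for l in range(1, p + 1)])
    weights = 1.0 / np.arange(1, p + 1)  # weight 1 for F_1, 1/2 for F_2, ...
    return float(np.sum(weights * sims) / np.sum(weights))
```

Under these assumptions, a steady stream yields a TAS of 1, and a sudden jump in the newest frame pulls the TAS toward 0, matching the statement that a smaller TAS signals a larger deviation from the historical data.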
Thereby, together with the obtained TAS value, both the current raw sensed value $O(S_i, t)$ and its timestamp are fuzzified through predefined membership functions (MFs). Choosing to include the sensed data timestamp is highly significant for the accuracy of the detection [42]. The monitored environment can differ for each time-of-day segment. As a result, the input-output response will also differ depending on the time of day. For example, the light intensity and temperature during the day are generally higher than at night.
In this paper, we consider an environment monitored by sensor nodes for temperature, humidity, light, and smoke density. For each type of sensor node, the local detection module takes three linguistic variables as its input: sensed value, average temporal similarity, and sensed data timestamp. In the fuzzification process, the three crisp values are converted into degrees of membership, with each membership using the trapezoid and triangle models. The trapezoidal function is determined as follows:
\[
\mu_{\mathrm{trap}}(x; a, b, c, d) =
\begin{cases}
0, & x \le a \\
\dfrac{x - a}{b - a}, & a < x < b \\
1, & b \le x \le c \\
\dfrac{d - x}{d - c}, & c < x < d \\
0, & x \ge d
\end{cases}
\]
where $x$ is a member of the universal set, and the parameters $a, b, c, d$ (with $a < b < c < d$) determine the x coordinates of the four corners of the underlying trapezoidal MF. As for the triangular MF, the function is specified by three parameters $\{a, b, c\}$ as follows:
\[
\mu_{\mathrm{tri}}(x; a, b, c) =
\begin{cases}
0, & x \le a \\
\dfrac{x - a}{b - a}, & a < x \le b \\
\dfrac{c - x}{c - b}, & b < x < c \\
0, & x \ge c
\end{cases}
\]

Citation information: DOI 10.1109/ACCESS.2022.3181742. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

The sensed value input is one of temperature Te, humidity Hu, light Li, or smoke Sm. The trapezoidal and triangular MF for the Te variable has four semantic values, i.e., very low VL, low-to-medium LM, medium-to-high MH, and very high VH. For instance, the MFs for temperature are shown in Figure 5. The triangular MFs for the Hu, Li, and Sm variables have three semantic values, i.e., low Lo, medium Me, and high Hi. The triangular MF for average temporal similarity TAS has two semantic values, i.e., small Sm and big Bi. Finally, the trapezoidal MF for the sensed data timestamp TS has four semantic values, i.e., night Nt, morning Mo, afternoon Af, and evening Ev [11]. After being fuzzified, the three fuzzy inputs are then fed into the fuzzy inference system.
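The trapezoidal and triangular MFs described above translate directly into code. The following is a minimal sketch; the breakpoints in `HUMIDITY_SETS` are hypothetical values chosen for illustration and are not the parameters used in the paper.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal MF with corners a < b < c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

def triangle(x, a, b, c):
    """Triangular MF with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical breakpoints for relative humidity in percent (assumption).
HUMIDITY_SETS = {"Lo": (-50.0, 0.0, 50.0), "Me": (0.0, 50.0, 100.0), "Hi": (50.0, 100.0, 150.0)}

def fuzzify(value, sets=HUMIDITY_SETS):
    """Return the degree of membership of a crisp value in each linguistic term."""
    return {label: triangle(value, *abc) for label, abc in sets.items()}
```

Letting the outer sets extend beyond the physical range (for example, a left foot at -50) keeps the membership of the extreme readings 0 and 100 equal to 1 in Lo and Hi, respectively.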

2) Generation of the refined rule set for FuzHD+rRB
The inference engine of the proposed FIS uses the Mamdani-type fuzzy process. A FIS can be designed either from domain knowledge or from data. In FuzHD [11], the adopted FIS is based on expert knowledge. However, it may suffer from a loss of accuracy under different environmental conditions. Rule-base legibility is essential to take full advantage of a FIS. For this reason, here we improve the previously proposed FIS in our FuzHD by designing a hybrid FIS that combines two kinds of information, namely background knowledge and hidden knowledge in data. In FuzHD+rRB, the generation of the fuzzy rule base for the FIS is conducted offline and is decomposed into two main phases (Figure 6).
• In the rule induction, we use the WM method for generating fuzzy rules from the sensor sample data along with the background knowledge-based predefined rules.
• In the rule validation and selection, the initial combination rule set is further checked to only keep those that are relevant and select those that are appropriate for inclusion in the refined rule set.

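FIGURE 5: Membership functions for the temperature variable (axis labels: temperature sensed data, membership function).

2.1) Rule induction:
A fuzzy rule takes the form IF premise THEN consequent, where the premise is the fuzzy input variables connected by and and or logical connectors, and the consequent is the fuzzy output variable. More formally, the rule obtained from data pair $i$ reads: if $x_1$ is $A^i_1$ and $\ldots$ and $x_p$ is $A^i_p$ then $y$ is $C_i$. The fuzzy sets $A^i_j$ are those for which the MF of $x^i_j$ is maximum for each input variable $j$ from pair $i$. The fuzzy set $C_i$ is that for which the MF of the observed output, $y_i$, is maximum.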
The predefined fuzzy rule base comprises a set of rules designed to decide the probability of the node being abnormal. By considering the background knowledge, we use heuristics to build the rule base for our abnormal node detection. An example might be as follows.
IF Te is Hi AND TAS is Sm AND TS is Nt, THEN Abnormal is Hi
For instance, the predefined fuzzy rules related to temperature sensor nodes are listed in Table 2. This set of rules contains the rules involving linguistic variables based on the sensed values from temperature sensor nodes. The rule base for the other sensor types can be constructed similarly [11]. Once we acquire the predefined rules, we move to the WM-derived rules. The adopted WM model proceeds in the following three steps.

1) Each variable of the input space is automatically divided into fuzzy regions. The WM model does not impose any specific partition for the input variables. Indeed, they are equally partitioned on a predefined number of triangular membership fuzzy sets. Each domain interval is divided into $2N+1$ regions. The center of each MF lies in the center of the region, and the extreme lies at the center of the next region.
2) Then, we generate the fuzzy rules from the given data pairs from the sample sensor data. One fuzzy rule is generated for each input-output data pair from the sample data. The output is computed through centroid defuzzification.
3) Finally, the conflicting rules are removed. For example, the rules that share the same antecedent but with different consequents are removed.
After acquiring both the predefined and WM-derived rules, we merge the two sets to obtain the initial combination rule set.
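The three WM steps above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the function names are hypothetical, the triangular sets are placed on an evenly spaced grid of $2N+1$ centres, and conflicts are resolved by keeping the rule with the highest degree for each antecedent, which is the usual WM convention and is an assumption relative to the description in the text.

```python
import numpy as np

def centres(lo, hi, n_regions):
    """Centres of n_regions equally spaced triangular fuzzy sets on [lo, hi]."""
    return np.linspace(lo, hi, n_regions)

def memberships(x, cs):
    """Membership of x in each triangular set; the feet sit on neighbouring centres."""
    width = cs[1] - cs[0]
    return np.clip(1.0 - np.abs(x - cs) / width, 0.0, 1.0)

def wang_mendel(X, y, in_ranges, out_range, N=1):
    """Return {antecedent region indices: consequent region index}."""
    n_regions = 2 * N + 1                       # step 1: 2N+1 regions per variable
    in_cs = [centres(lo, hi, n_regions) for lo, hi in in_ranges]
    out_cs = centres(out_range[0], out_range[1], n_regions)
    best = {}                                   # antecedent -> (degree, consequent)
    for row, target in zip(X, y):               # step 2: one rule per data pair
        antecedent, degree = [], 1.0
        for value, cs in zip(row, in_cs):
            mu = memberships(value, cs)
            antecedent.append(int(np.argmax(mu)))
            degree *= float(np.max(mu))
        mu_out = memberships(target, out_cs)
        consequent = int(np.argmax(mu_out))
        degree *= float(np.max(mu_out))
        key = tuple(antecedent)
        # Step 3 (assumed variant): on conflict, keep the higher-degree rule.
        if key not in best or degree > best[key][0]:
            best[key] = (degree, consequent)
    return {key: value[1] for key, value in best.items()}
```

With `N=1`, each variable has three regions, which could be read as Lo, Me, and Hi for region indices 0, 1, and 2.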

2.2) Rule validation and selection:
In the rule validation, we aim to identify the relevant rule set from the initial combination rule set. Clearly, certain rules will show some redundancy after combining the two sets. Therefore, such redundant rules need to be removed and not included in the relevant rule set. Moreover, rules having a number of MF fuzzy sets that differ from that defined by the WM model are also removed and are not included in the relevant rule set. In the rule selection, we aim to select the appropriate rules to be included in the refined rule set. At this stage, we assess the conflicting rules and rank them according to their prediction accuracy in the sample sensor data. A rule is added to the refined rule set if the expected predictive accuracy of the rule meets the desired accuracy and is not subsumed by a conflicting rule with a lower expected predictive accuracy.

3) Defuzzification
Finally, the defuzzification process of obtaining a single number from the output of the aggregated fuzzy set is conducted. It is used to transfer FIS results into a crisp output, which is used to make a local decision and send a report message to the CH for further analysis that eventually leads to the final decision. The centroid method is used to calculate the crisp value of the fuzzy output as follows:
\[
x^{*} = \frac{\sum_{i=1}^{k} x_i \, \mu(x_i)}{\sum_{i=1}^{k} \mu(x_i)}
\]
where $x^{*}$ is the crisp value, $x_i$ represents each member of the output universe, $\mu(x_i)$ is the aggregated output MF, and $k$ is the number of items in the fuzzy set.
The confidence for abnormal node detection is defined as the output. The triangular MF for the fuzzy output variable is defined in terms of three levels, i.e., low (Lo), medium (Me), or high (Hi). Figure 7 shows the MFs for abnormal detection confidence. The linguistic variables represent the detector's confidence about the presence of a data integrity attack. For example, if the local detection value is higher than 30, we are more than 30% certain that it is an abnormal node. If the detector's confidence is smaller than 30%, it is more than likely that it is not an abnormal node.
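As an illustration of centroid defuzzification over the Lo/Me/Hi output sets, consider the following sketch. The breakpoints in `OUTPUT_SETS` are hypothetical (the actual MFs are those of Figure 7), and clipping each set at its rule firing strength followed by max-aggregation is an assumed Mamdani-style choice.

```python
def tri(x, a, b, c):
    """Triangular MF with feet a and c and peak b (a == b or b == c gives a shoulder)."""
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0

# Hypothetical output sets on a 0-100 confidence scale.
OUTPUT_SETS = {"Lo": (0, 0, 50), "Me": (0, 50, 100), "Hi": (50, 100, 100)}

def centroid(firing, step=1):
    """Crisp value x* = sum(x_i * mu(x_i)) / sum(mu(x_i)) over a sampled universe.

    `firing` maps an output set name to the firing strength of its rules.
    """
    num = den = 0.0
    for x in range(0, 101, step):
        # Clip each output set at its firing strength, then aggregate with max.
        mu = max(
            (min(strength, tri(x, *OUTPUT_SETS[name]))
             for name, strength in firing.items()),
            default=0.0,
        )
        num += x * mu
        den += mu
    return num / den if den else 0.0
```

A call such as `centroid({"Lo": 0.2, "Hi": 0.7})` returns a confidence value that leans toward the high end of the scale, which would then be compared with the decision threshold.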

B. STEP 2 IN FUZHD++: TKRP MISSING DATA RECOVERY AND FUZHD+RRB ABNORMAL GROUP DETECTION
After collecting all the data (i.e., sensed data and local decisions) from the CMs, the CA verifies whether there are missing values within the sensed values. When missing values occur in the CMs, the CA executes the missing data recovery. In this subsection, we introduce our new recovery method, which uses the matrix profile (Figure 8) to extract the top-k repeated patterns from sensor nodes within the same group and utilizes the k-NN pattern information of neighbor nodes to recover the missing data. The top-k repeated patterns are only extracted from the CMs identified as normal sensors during the local detection process.

1) TkRP: Construction of the reference pattern database
The first step toward recovering the missing sensor data is constructing the reference pattern database. Figure 9 illustrates the construction process. From each collected sensor data stream $\overrightarrow{O(S_i)}$, the CA starts computing the stream's matrix profile.
The matrix profile is a recently proposed data structure [12] that annotates a time series to solve the problem of anomaly detection and motif discovery. Besides its novelty, the method is robust, scalable, and parameter free. Hence, we adopt this data structure for our proposed recovery method. The matrix profile comprises two primary components, namely, a distance profile and a profile index. The distance profile is a vector of minimum z-normalized Euclidean distances. The profile index contains, for each subsequence, the index of its first nearest neighbor, i.e., the location of its most similar subsequence.
Then, we compute the pairwise distance among these windowed subsequences against the entire sensor data stream $\overrightarrow{O(S_i)}$. The distance calculations occur m − o + 1 times, where m is the length of the sensor data stream and o is the window size. Because the subsequences are pulled from the sensor data stream itself, an exclusion zone is required to prevent trivial matches. For example, a subsequence of the sensor data stream matching itself, or matching a subsequence very close to itself, is considered a trivial match. The exclusion zone is simply half of the window size before and after the current window index. The values at these indices are ignored when computing the minimum distance and the nearest neighbor index. Figure 8 illustrates an example of matrix profile calculation. It shows the computation of a distance profile starting at the second window. The matrix profile stores the distances in Euclidean space, meaning that a subsequence with a distance close to 0 is very similar to another subsequence in the sensor data stream, whereas a subsequence with a distance far from 0, say 100, is unlike any other subsequence.
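The computation described above can be illustrated with a brute-force sketch. It is intended only to make the definitions concrete (z-normalized Euclidean distance, m − o + 1 windows, an exclusion zone of half the window size on each side); it is not the scalable algorithm of [12], and the function names are assumptions.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; constant subsequences map to all zeros."""
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def matrix_profile(stream, o):
    """Return (distance profile, profile index) for window size o."""
    stream = np.asarray(stream, dtype=float)
    n_sub = len(stream) - o + 1                      # m - o + 1 windows
    subs = np.array([znorm(stream[i:i + o]) for i in range(n_sub)])
    excl = o // 2                                    # exclusion zone half-width
    profile = np.full(n_sub, np.inf)
    index = np.full(n_sub, -1)
    for i in range(n_sub):
        dist = np.linalg.norm(subs - subs[i], axis=1)
        # Ignore trivial matches around the current window index.
        dist[max(0, i - excl):min(n_sub, i + excl + 1)] = np.inf
        j = int(np.argmin(dist))
        if np.isfinite(dist[j]):
            profile[i], index[i] = dist[j], j
    return profile, index
```

On a stream that repeats the same pattern several times, every window has an exact match one period away, so the profile is close to 0 everywhere and the index points to the repeated occurrence.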
With the matrix profile computed, it becomes simple to find the top-k repeated patterns [43].
Definition 10: The most repeated pattern is a pair of subsequences where $|v - w| \geq gap_{th}$ and $|a - b| \geq gap_{th}$, with $gap_{th}$ denoting the gap that exists between the subsequences, where $gap_{th} < 0$.

[Figure residue: S1's data stream O(S1,1), O(S1,2), ..., O(S1,10) with a sliding window of a given window size]

Here, for simplicity, we only deal with pairs. However, we can also extend the notion of most repeated patterns to a set of subsequences that are very similar to each other. Once we finish extracting the top-k repeated patterns, we extract the snippets. For each extracted pair of subsequences within the top-k repeated patterns, we extract their snippet.
Definition 12: The most repeated pattern snippet is a subsequence of the most repeated pattern. Time series snippets are defined to describe the most representative subsequences in a time series [44]. The primary use of snippets here is to find the subsequence patterns that occupy most of each top-k repeated pattern in question, so as to summarize the repeated pattern at a high level. In this paper, instead of extracting the k-th snippet [39], we are only interested in extracting the top-1 snippet for each extracted pair of subsequences within the top-k repeated patterns, because the top-1 snippet is the most representative pattern that summarizes the subsequence. For each extracted snippet, we check whether a similar one already exists in the reference pattern dataset and, if not, insert it.
Definition 13: The reference pattern dataset is a database that contains a set of d snippets that represent the most repeated patterns of sensors located within the same group. Having a reference database of patterns, we are now able to recover the missing sensor data.
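The selection of top-k pairs and the insertion into the reference pattern dataset can be sketched as below, assuming a distance profile and profile index have already been computed. The overlap rule, the similarity tolerance `tol`, and the function names are assumptions made for illustration; the snippet extraction of [44] is not reproduced here, so the caller supplies the representative subsequence.

```python
import numpy as np

def top_k_pairs(profile, index, o, k):
    """Pick up to k non-overlapping (i, j) pairs with the smallest profile distance."""
    profile = np.asarray(profile, dtype=float)
    chosen, used = [], []
    for i in np.argsort(profile):
        i, j = int(i), int(index[i])
        if j < 0 or not np.isfinite(profile[i]):
            continue
        # Skip candidates overlapping a location already used by a closer pair.
        if any(abs(p - q) < o for p in (i, j) for q in used):
            continue
        chosen.append((min(i, j), max(i, j)))
        used.extend((i, j))
        if len(chosen) == k:
            break
    return chosen

def insert_if_new(database, snippet, tol=0.5):
    """Append the snippet unless a similar one (distance <= tol) is already stored."""
    snippet = np.asarray(snippet, dtype=float)
    for ref in database:
        if len(ref) == len(snippet) and np.linalg.norm(ref - snippet) <= tol:
            return False
    database.append(snippet)
    return True
```

Under these assumptions, the CA would call `top_k_pairs` once per sensor stream and then `insert_if_new` for each representative subsequence, so that the dataset holds only distinct patterns.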
2) TkRP: Missing data recovery method
Figure 10 depicts the workflow of the missing data recovery method. The CA collects the latest observations generated by the CMs' sensor data stream. A null value is included in the stream when missing observations are reported. If missing measurements are reported, the CA requests an estimation of the missing values. This estimation is based on the k-NN algorithm and uses the previously constructed reference database that includes the top-k repeated patterns. The CA checks whether a similar pattern already exists in the reference dataset and identifies the patterns similar to the current one. After identifying similar patterns, the CA computes the k-NN distances and generates the estimated values.
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.
[Notation residue: size of the cluster C_p to which sensor i belongs; subsequence of sensor i of length o starting from t]
Then, the recovered sensor data stream follows the process described in subsection B.1 to check whether the stream includes any new patterns, and the reference pattern database is updated when necessary.
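A minimal sketch of the k-NN estimation step is given below. The estimator is an assumption, since the source equation is not available here: the current window is compared with each reference pattern on its observed positions only, and the missing positions are filled with an inverse-distance-weighted average of the k nearest patterns. Normalization of the window and patterns is omitted for brevity.

```python
import numpy as np

def knn_recover(window, references, k=3):
    """Fill NaN entries of `window` from the k nearest reference patterns.

    `references` is a 2-D array-like whose rows have the same length as `window`.
    """
    window = np.asarray(window, dtype=float)
    refs = np.asarray(references, dtype=float)
    missing = np.isnan(window)
    if not missing.any():
        return window.copy()
    observed = ~missing
    # Distance to each reference pattern, computed on observed positions only.
    dists = np.linalg.norm(refs[:, observed] - window[observed], axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + 1e-9)          # closer patterns weigh more
    estimate = (weights[:, None] * refs[nearest]).sum(axis=0) / weights.sum()
    recovered = window.copy()
    recovered[missing] = estimate[missing]
    return recovered
```

For instance, a window `[20.0, 21.0, nan, 23.0]` matched against reference patterns that read 22 at the third position would be recovered with a value close to 22, while the observed readings stay unchanged.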

3) Abnormal node group detection
The group detection module operates at the CA level by considering spatial semantic correlations. In this detection module, a more accurate decision is made by including the local decisions of multiple homogeneous CMs located in the same group within the same cluster. After receiving a local decision message from a CM, the CA stores the crisp decision value. After collecting all the CMs' local decisions, the CA executes the group detection process for each CM node to give a group decision about the node's abnormality. The group detection module uses two inputs: the crisp local decision of the CM under evaluation and the local decisions of the CMs in the same group. The fuzzifier converts the crisp values into degrees of membership by applying the corresponding MF. After fuzzification, a sigma-count factor [45] is used as a measure of fuzzy cardinality to quantify the CMs' local decisions.
In the sigma-count, f is a fuzzy set characterized by an MF $\mu_f(S_i)$, which gives the degree of similarity for $S_i$, and $S = (S_1, S_2, \ldots, S_n)$ is the set of CMs. Precisely, $f(S_i)$ is the property of interest related to the sensor node's local decision, e.g., "Abnormal level is high." A fuzzy majority quantifier is then used to obtain a fuzzified indication of the consensual CMs' local decisions. For a more accurate decision, we use the Most quantifier to characterize the fuzzy majority of the CMs' local decisions, which takes any value from the interval 0 to 1 as the truth value of its proposition [46].
Next, the fuzzified inputs and the quantified CMs' local decisions are fed into the fuzzy inference process. The fuzzy rule base comprises a set of rules designed to decide about the abnormality of the CM. An example of the format of the rule is "IF Abnormal is H AND Most(CMsDecision) is L THEN Abnormal is H." Fuzzy inference combines the rules to obtain an aggregated fuzzy output. Finally, the defuzzifier converts the fuzzy output variable back to a crisp value that is used to make a group decision and reported to the CH.
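The quantification step can be illustrated as follows. The sketch assumes the standard sigma-count (the sum of membership degrees), divides it by the number of CMs to obtain a proportion, and uses a commonly cited piecewise-linear form of the Most quantifier with breakpoints 0.3 and 0.8; the paper's exact definitions in [45] and [46] may differ.

```python
def sigma_count(memberships):
    """Fuzzy cardinality: sum of the membership degrees mu_f(S_i)."""
    return sum(memberships)

def most(r):
    """Assumed 'Most' quantifier: 0 below 0.3, 1 above 0.8, linear in between."""
    if r >= 0.8:
        return 1.0
    if r <= 0.3:
        return 0.0
    return 2.0 * r - 0.6

def most_cms_agree(memberships):
    """Truth value of 'Most CMs report the property f' for one group."""
    if not memberships:
        return 0.0
    return most(sigma_count(memberships) / len(memberships))
```

Here `memberships` would hold, for each CM in the group, the degree to which its local decision satisfies a property such as "Abnormal level is high"; the returned truth value then serves as the `Most(CMsDecision)` input of the rule base.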

C. STEP 3 IN FUZHD++: FUZHD+RRB ABNORMAL NODE CLUSTER DETECTION
Cluster identification is processed at the CH level by considering the ST and MVA sensor correlations. In this detection module, a more accurate decision is made by including the group decisions of multiple heterogeneous CAs located in the same cluster. After receiving a group-decision message from a CA, the CH stores the crisp decision value. After collecting all the CA group decisions, the CH performs the fuzzy inference for each sensor node to give the cluster decision about the node's abnormality. The detection mechanism is similar to that for group detection. However, compared with the group decision, the cluster decision considers the observations from heterogeneous sensor nodes in addition to those from homogeneous sensor nodes.
The CH's fuzzy rule base comprises a set of rules designed to determine the CM's abnormality. An example of the rule might be "IF Abnormal is Lo AND Most(CAsDecision) is Lo, THEN Abnormal is Lo." If abnormal nodes are detected, the CH sends a report message to the server.

V. EXPERIMENTAL SETUP
This section describes the datasets used to evaluate our proposed approach and the details of the sensor network that we have implemented, including the deployment setting and the experimental scenario design.

A. DATA ACQUISITION
To show that our proposed approach applies to real-world WSNs deployed with heterogeneous sensors, we use two datasets that have different types of sensor deployment. The first dataset is the publicly available Intel Berkeley Research Lab dataset [47]; the second is the Yokota Lab dataset, whose data were collected from the WSN deployed in our laboratory.

1) Intel Berkeley Research Lab dataset
In this dataset, the data were collected from 54 sensor nodes deployed in the Intel Berkeley Research Lab between February 28 and April 5, 2004. To effectively monitor the whole lab environment, the 54 sensors were unevenly distributed in different locations in the research lab. Mica2Dot sensors with weatherboards were used to collect time-stamped topology information, along with temperature (in degrees Celsius), humidity (temperature-corrected relative humidity ranging from 0 to 100%), light (in Lux; a value of 1 Lux corresponds to moonlight, 400 Lux to a bright office, and 100,000 Lux to full sunlight), and voltage values (in volts, ranging from 2 to 3). The batteries, in this case, were lithium-ion cells that [...] A new reading was collected every 31 seconds. In total, 2.3 million readings were collected from these sensors. The sensors were dispersed in the lab, as shown in Figure 11.

2) Yokota Lab dataset
In addition to the Intel Lab dataset, we also collected sensor data streams from 27 sensor nodes in our laboratory between January 24 and July 25, 2018. The real-world sensor data were collected periodically while performing our usual daily activities. The sensor nodes were deployed using the Raspberry Pi 2 and 3 Model B microcontroller platforms (Table 5), as we consider the Raspberry Pi to be the best IoT hardware platform in terms of performance and flexibility. Each physical sensor node is equipped with one temperature sensor module (DS18B20 temperature sensor), one humidity sensor (DHT11 humidity sensor), one smoke density sensor (MQ-2 smoke sensor), and one digital light intensity sensor (BH1750 digital light sensor), yielding a total of 64 sensors. The technical characteristics of the Raspberry Pi platforms, sensors, and server used in our experimental setting are described in Tables 4 and 5. As shown in Figure 12, the sensor nodes were divided into five clusters separated from each other and with different environmental conditions. Two clusters comprised five sensor nodes each and were located in our laboratory room. The third cluster consisted of six physical nodes located in a kitchen corner and exposed to sunlight, the fourth consisted of six physical nodes located in a seminar room, and the fifth consisted of five physical nodes located in a server room. Each sensor transmits data approximately every two minutes, giving a total of 20.9 million readings.

B. DATA PREPROCESSING
Three main steps must be performed to prepare the dataset for evaluation: cleaning the raw sensor data, injecting false sensor data, and physically separating the sensor nodes into clusters. Cleaning the data is necessary to ensure that the proposed abnormal node detection is only executed on known FDIAs, allowing for consistent evaluation. After that, new false sensor data may be injected. The clustering process is also considered a necessary process here to capture the sensor data correlation adequately. In the following subsections, we explain the three main steps in more detail.

1) Data cleaning
When using the Intel Berkeley Research Lab dataset, we encountered an issue related to the notion of time variation (i.e., epoch) within the collected data. Indeed, the epoch is necessary to build a baseline that works on sensor data streams such as our collected dataset or the Intel Lab dataset. However, for the Intel Lab dataset and even our dataset, the notion of the epoch is loosely defined. Indeed, even though sensor nodes are commanded to collect a new reading in every defined epoch, the fact of having multiple [...] For the WSNs deployed at the Intel Lab, the reasons behind the failures were communication problems and the sensor battery condition. In addition, we found that the readings of sensor node five in the Intel Lab data were not recorded. Consequently, this node was removed from the dataset.
Concerning our deployed WSNs, some sensor nodes had missing data for different epochs because of SD card corruption. However, because of the sensor constraints, we found that the epoch was not strongly defined in either dataset. Indeed, both datasets were missing less than 10% of the expected measurements. Thus, we needed to standardize the concept of epoch and set it to a well-defined size. To unify the size, we split the readings into epochs of two minutes each. The proposed missing sensor data recovery TkRP was used to substitute each missing sensed value in an epoch. If a sensor had more than one reading during the epoch, we took the average of these measurements. We recovered missing data using the original dataset containing normal sensor readings. In other words, at this stage, we are still not dealing with malicious data. The summary of the datasets is shown in Table 7.
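The epoch standardization described above can be sketched as follows. The timestamps in seconds, the function name, and the use of `None` to mark an epoch without readings are assumptions; in the pipeline described above, those empty epochs are the ones later filled by TkRP.

```python
from collections import defaultdict

EPOCH_SECONDS = 120  # two-minute epochs, as in the text

def to_epochs(readings, start, end, epoch_seconds=EPOCH_SECONDS):
    """Average the (timestamp, value) readings of one sensor per epoch.

    Returns one entry per epoch in [start, end); None marks a missing epoch.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        if start <= ts < end:
            buckets[int((ts - start) // epoch_seconds)].append(value)
    n_epochs = -(-(end - start) // epoch_seconds)    # ceiling division
    return [
        sum(buckets[e]) / len(buckets[e]) if e in buckets else None
        for e in range(n_epochs)
    ]
```

For example, readings at 0 s and 31 s fall into the same first epoch and are averaged, while an epoch with no reading is reported as `None`.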

2) False sensor data injection
Given the lack of sensor datasets with malicious data for WSNs and the need to test the accuracy of our approach, we propose an FDIA model to create an attack strategy. An attacker's goal in the context of FDIA is to evoke or hide events without triggering the detection alarm. The primary challenge is to maintain a balance between the outcome of the attack and the risk of being detected. In our prior work [11], the proposed attack models were unsophisticated and not comprehensive enough to support the claims in the paper.
In that work, we only considered three trivial cases where the attacker deliberately either randomly changes a sensor reading or selects the minimum or the maximum possible value. We carefully chose the type of attacked sensor, insertion time, and attack period. With such a proposed attack strategy, it is impossible to guarantee a variety of attack patterns, resulting in uninteresting attack outcomes that are easy to detect. In contrast, WSNs are subject to various threats where we cannot simply anticipate the attacker's actual intention.
To tackle this issue, here we use an attack strategy that generates a malicious dataset from the original sensor data, allowing us to test the detection algorithm and evaluate its performance against different threat severity levels [12]. We create evaluation data that include FDIAs based on the initially collected dataset. The occurrence probability of malicious data follows an exponential distribution (Equation (5)), whose parameter ranges from 500 to 1000. In addition, we defined nine types of false data. The difference between the real data and the injected false data in the evaluation data follows a Gaussian distribution (Equation (6)), where σ ∈ {1, 2, 3, ..., 9}. Following Equations (5) and (6), we injected false sensor data readings into the initially collected sensor data. The sensor type, FDIA type, and insertion time were chosen randomly. With such an FDIA strategy, the attack can be very stealthy and deceive the detection mechanism without being easily detected.
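The injection procedure can be illustrated with the sketch below. Several details are assumptions rather than the paper's exact Equations (5) and (6): the exponential parameter is interpreted as the mean gap (in readings) between two injections, the Gaussian offset is taken as zero-mean, and one of the nine σ values is drawn uniformly for each injection.

```python
import random

def inject_false_data(stream, mean_gap=750, rng=None):
    """Return (attacked stream, labels) with Gaussian offsets at random positions.

    mean_gap: assumed mean number of readings between injections (500 to 1000).
    """
    rng = rng or random.Random()
    attacked = list(stream)
    labels = [False] * len(attacked)
    pos = int(rng.expovariate(1.0 / mean_gap))       # first injection position
    while pos < len(attacked):
        sigma = rng.randint(1, 9)                    # one of nine false-data types
        attacked[pos] += rng.gauss(0.0, sigma)       # zero-mean offset (assumed)
        labels[pos] = True
        pos += 1 + int(rng.expovariate(1.0 / mean_gap))
    return attacked, labels
```

The returned labels provide the ground truth against which the detection results can later be scored.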

VI. EVALUATION
Each conducted experiment was repeated twice, and the average results were taken. To measure TkRP accuracy, we adopt the most commonly used evaluation metric, that is, the RMSE between the original value and the recovered value:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - \hat{x}_i \right)^2}$$
where $x_i$ is the original value, $\hat{x}_i$ is the recovered value, and $n$ is the number of recovered values. To compare the efficiency and accuracy of TkRP against state-of-the-art recovery algorithms (introduced in Table 1), we use the recent benchmark that evaluates missing value recovery techniques in time series [48]. We set missing sensor values to appear arbitrarily in the middle of a randomly chosen sensor data stream in the dataset. We then vary the size of the missing block from 20% to 80% of the chosen sensor data stream and measure the average recovery accuracy using RMSE. We normalize the error across all algorithms using a z-score (the lower, the better) and present the results in Figures 13-21.
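The evaluation metric and the normalization step can be written down in a few lines. This is a minimal sketch; the function names are assumptions, and the z-score uses the population standard deviation of the per-algorithm errors.

```python
import math

def rmse(original, recovered):
    """Root mean square error between original and recovered values."""
    if len(original) != len(recovered) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(original, recovered)) / len(original)
    )

def zscores(errors):
    """Z-score normalization of the errors obtained by the compared algorithms."""
    mean = sum(errors) / len(errors)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))
    return [(e - mean) / std if std > 0 else 0.0 for e in errors]
```

With `zscores`, an algorithm whose RMSE is below the average of all compared algorithms receives a negative score, which matches the "lower is better" reading of the figures.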

1) Evaluation of TkRP with an increasing number of top-k patterns
In this set of experiments, we used the Intel Lab dataset and the Yokota Lab subdataset (one-month period). The most critical parameter for TkRP is the number of top-k extracted repeated patterns. We observed that the runtime of TkRP increases along with the value of k (Figures 15 and 16). This result was expected because the higher the value of k, the higher the number of comparisons performed to produce the recovery (causing more time- and space-intensive computations). Surprisingly, increasing k did not always improve accuracy (Figures 13 and 14). The pattern extraction used by TkRP keeps only the most repeated patterns and filters out the rest.
At some point, the extra extracted patterns resort to infrequent pattern extraction that corrupts the recovery. To achieve a suitable trade-off between accuracy and efficiency, the best top-k proved to be k ∈ {8, 9, 10}.

FIGURE 17: Accuracy comparison between pattern-based methods: Intel Lab dataset
FIGURE 18: Accuracy comparison between pattern-based methods: Yokota Lab dataset

2) Comparison of TkRP with other related works
In this set of experiments, we study the effect of traffic load associated with the decrease in the amount of sensed data on the performance of TkRP. In other words, we evaluate [...] For the evaluation, we use the Intel Lab dataset and the Yokota Lab sub-dataset (i.e., one-month period). We also present the accuracy results using RMSE. We set the size of the top-k to 10. Figures 17-21 show the results. The results show that TkRP outperforms the other algorithms. Indeed, this experiment shows that, in general, TkRP takes advantage of having correlated sensor streams to produce better accuracy. The improvement is more noticeable in the case of the Yokota Lab dataset, although both datasets stand out by the very high correlation between their sensor streams. In the Yokota Lab dataset, the correlation between the sensors is higher than in the Intel Lab dataset because of the small distance between the CMs. This is why TkRP, which captures such correlations by design, performs so well compared with the other related works. We also observe in this experiment that the error does not always increase along with the size of the missing block. However, as shown in Figure 19, the runtime of TkRP increases almost linearly as the missing rates increase. Thus, although the [...]

3) Scalability with a large dataset
In this experiment, we study the effect of traffic load associated with the increase in the amount of sensed data on the performance of TkRP. In other words, we evaluate the scalability of TkRP when dealing with an increase in the length of sensor data streams. For the evaluation, we use the Yokota Lab sub-dataset (i.e., one-month period) and the whole Yokota Lab dataset (i.e., six-month period). Figure 22 illustrates the experiment results. The increase in RMSE occurs as expected because adding more incomplete sensor streams increases the number of missing values. However, the RMSE still remains under 1.

4) Impact of the number of sensor nodes per cluster on the performance of TkRP
In this set of experiments, we evaluate the impact of the number of sensor nodes per cluster on TkRP performance. Figure 23 depicts the efficiency of TkRP and its recovery accuracy on the Intel Lab dataset when increasing the number of sensor nodes per cluster. We only use the Intel Lab dataset for these experiments since it has more sensor nodes scattered in different areas. We set the size of the missing block to 20%. The sensor stream length is fixed while the number of sensor nodes per cluster varies. Here, we also use the average RMSE and runtime values to assess the efficacy and efficiency of TkRP. The experiment shows that the RMSE of TkRP remains largely unaffected when we vary the number of sensor nodes per cluster. This was expected, as TkRP is only interested in extracting the top-k repeated patterns from neighboring sensor nodes and disregards the rest. It can be seen from the results that when the number of neighbor nodes is small (less than 4), the selected neighbor nodes may not be in the adjacent area of the node's physical location. Thus, errors may happen while recovering the actual values of the incomplete sensor streams. Nevertheless, there is a slight decrease in RMSE when the number of sensors increases from 5 to 11. That is due to the increase in the number of nodes sharing similar physical locations.

B. ACCURACY OF THE ABNORMAL NODE DETECTION METHOD
To evaluate the combination of the two proposed methods, FuzHD++ (FuzHD+rRB with TkRP), in terms of abnormal node detection, four performance metrics were used: accuracy, precision, recall, and F1-score. We used precision, recall, and F1-score to quantify the detection accuracy. Accuracy is the degree to which the detection results confirm [...] Recall is the percentage of the identified abnormal nodes among the actual abnormal sensors. The F1-score is a measure of a test's accuracy and is calculated from the precision and recall of the test. In addition, we used one month of data from the Yokota Lab dataset and one week of data from the Intel Lab dataset as the sample sensor data to generate the WM-derived rules. Figures 23 and 24 show the evaluation results on the two datasets, measuring the extent to which FuzHD++ and the previous method FuzHD detect abnormal nodes.
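For reference, the four metrics can be computed from the confusion counts as in the sketch below, which uses the standard definitions with "abnormal" as the positive class. The function name and the boolean encoding of labels are assumptions.

```python
def detection_metrics(actual, predicted):
    """Accuracy, precision, recall, and F1-score for abnormal node detection.

    `actual` and `predicted` are equal-length sequences of truthy/falsy labels,
    where a truthy value means "abnormal".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Because the F1-score is the harmonic mean of precision and recall, it always lies between the two.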
Even though the environmental conditions for each cluster in the two datasets differed, the proposed combined method, FuzHD++, achieved good detection accuracy with a low false-positive rate for the task of analyzing the sensor readings to determine whether the sensor nodes were behaving normally or had been exposed to FDIAs.
The results show that FuzHD++ achieved an average accuracy of 92%, average precision of 84%, recall rate of 85.50%, and average F1-score of 85%. Moreover, FuzHD++ achieved better detection results than those achieved using FuzHD, with an average accuracy improvement of over 14.11%, improved average precision of 8.13% and recall rate of 14%, and a higher average F1-score of 11.05%. Therefore, we conclude that our proposed method detects abnormal nodes with high accuracy.
Furthermore, Figures 23 and 24 show the evaluation results of FuzHD++ and FuzHD+rRB (with the last observed values as the missing recovery method) using the two datasets. The results show that by adopting the proposed TkRP method as a missing recovery method, the abnormal node detection accuracy achieved an average accuracy improvement of over 1.6%. Overall, both datasets did not suffer from high missing rates (less than 10%), explaining the low improvement.

1) Evaluation of FuzHD++ under higher missing data rates
To investigate this issue further, we broke down the Yokota Lab dataset into smaller defined groups. Table 8 illustrates the breakdown of the missing data rate of the Yokota Lab dataset by one-month period and cluster location. From the obtained results, we selected three cases with high missing data rates as case studies to evaluate the importance of FuzHD++ for better detection accuracy. The first case evaluates the abnormal node detection accuracy of the cluster located in the kitchen area during the period from March 24 to April 24 (with a medium missing data rate of 25.44%). The second case evaluates the abnormal node detection accuracy of the cluster located in the seminar room during the period from June 24 to July 25 (with a medium-high missing data rate of 35.08%). Finally, the third case evaluates the abnormal node detection accuracy of the cluster located in the seminar room during the period from March 24 to April 24 (with a high missing data rate of 58.73%). Figure 25 illustrates the F1-score evaluation of FuzHD, FuzHD+rRB with the last observed values as a missing recovery method, and FuzHD++ for each case study. The results show that FuzHD++ achieved the best detection results, with an average F1-score improvement of over 8.22%. Thus, instead of simply replacing the missing values with the last observed values, the proposed recovery method TkRP is a better choice for FuzHD+rRB to achieve higher abnormal node detection accuracy.

2) Impact of the number of sensor nodes per cluster on the performance of FuzHD++
In this set of experiments, we evaluate the impact of the number of sensor nodes per cluster on FuzHD++ performance. Figure 27 depicts the detection accuracy and the efficiency of FuzHD++ on the Intel Lab dataset when increasing the number of sensor nodes per cluster. Here, we use the average F1-score and runtime values to assess the abnormal node detection accuracy and efficiency of FuzHD++. It can be seen from the results that when the number of neighbor nodes is small, the selected neighbor nodes may not be in the neighboring area of the node's physical location, which may cause a wrong decision. Indeed, FuzHD++ uses the temporal and spatial correlation of the sensing data between neighboring nodes to detect the abnormal nodes and assist the fuzzy logic-based decision-making system. As the number of nodes per cluster increases, the number of nodes sharing similar spatial environment conditions gradually increases, and the detection accuracy improves accordingly. However, with many nodes grouped into one cluster, FuzHD++ will use the sensed data of nodes at a longer physical distance in the decision-making, which will lead to an increase in the false detection rate. Thus, FuzHD++ depends not only on the number of neighboring sensor nodes to achieve a higher detection rate but also on the distance or the spatial correlation between intra-cluster neighboring sensors.

VII. CONCLUSION
To improve the chances of correctly detecting abnormal nodes in the sensor network, we think it is necessary to ensure the completeness of a sensor data stream before using it in any anomaly detection system. This paper presents FuzHD++, a new method that handles missing and anomalous data with the capability of recovering and detecting them in an integrated framework. In FuzHD++, the two new methods, TkRP and FuzHD+rRB, are devised to recover missing data and detect abnormal nodes, respectively. The observed temporal and spatial correlations of sensor data are utilized in both elements to effectively achieve reliable estimation and detection performance. In TkRP, we adopt a matrix profile to extract the top-k repeated patterns from different sensor nodes. Furthermore, it utilizes the k-NN estimator to recover the missing data based on the extracted pattern information of multiple neighbor nodes. In FuzHD+rRB, we adopt a refined fuzzy rule-based abnormal node detection method with a refined rule base. The refined rule-based FIS integrates the expert rules and the rules obtained from sensor data analysis, making it a more comprehensive and flexible inference system. We demonstrated the effectiveness of our proposed methods by conducting a variety of performance evaluations. We evaluate our proposed methods through a number of experiments designed to test their parameterization (number of top-k patterns, number of sensor nodes per cluster, length of sensor data streams), accuracy, and efficiency. Our experiments using two real-world datasets demonstrate that the proposed missing sensor data recovery method TkRP achieves improved RMSE results of over 20% compared with most existing methods. Furthermore, the combination of the two proposed methods, FuzHD++, achieves better results than FuzHD in terms of abnormal node detection, with an average accuracy improvement of over 14.11%. 
In addition, the experimental results show that the proposed method depends not only on the number of neighboring sensor nodes but also on the distance or the spatial correlation between intra-cluster neighboring sensors to achieve a higher detection rate. FuzHD++ can detect abnormal sensor nodes under FDIAs but cannot yet detect other types of attacks. Extending FuzHD++ to other attack types is the subject of our ongoing work. Such an extension may contribute to enhancing overall IoT security.