On the Security of Data Collection and Transmission from Wireless Sensor Networks in the Context of Internet of Things

In the context of the Internet of Things (IoT), multiple cooperative nodes in wireless sensor networks (WSNs) can be used to monitor an event, jointly generate a report, and then send it to one or more Internet nodes for further processing. A primary security requirement in such applications is that every event data report be authenticated to the intended Internet users and effectively filtered on its way to those users, so that data collection and transmission from the WSN are secure. However, most existing schemes developed for WSNs do not consider the Internet scenario, while traditional mechanisms developed for the Internet are not suitable because of the resource constraints of sensor nodes. In this paper, we propose a scheme, which we refer to as Data Authentication and En-route Filtering (DAEF), for WSNs in the context of IoT. In DAEF, signature shares are generated and distributed based on verifiable secret sharing cryptography and an efficient ID-based signature algorithm. Our security analysis shows that DAEF can defend against node compromise attacks as well as denial of service (DoS) attacks in the form of report disruption and selective forwarding. We also analyze energy consumption to show the advantages of DAEF over some comparable schemes.


Introduction
To become an indispensable part of the Internet of Things (IoT), wireless sensor networks (WSNs) need to adopt IP technologies to create a seamless, global network infrastructure together with the Internet. To achieve this goal, many standardization organizations have been actively pursuing standardization work for creating a global sensor network infrastructure [1,2]. In the context of the IoT, any IP-enabled node in the Internet shall be able to communicate directly with any remote sensor node in a WSN that is used to monitor specific events. Data transmission from a WSN node to an Internet node can be event-driven, as when multiple cooperative sensor nodes sporadically sense that something has happened (for example, a fire or a door opening), or it can be scheduled at predefined intervals (for example, reporting the temperature every two hours). To monitor specific events of interest, more than one sensor node can be used to collect data and transmit it via multihop wireless paths to one or more Internet nodes to improve robustness, especially in an environment in which security threats from internal and external attacks due to node compromises are a serious concern. Under such circumstances, data reports must still be sent to one or more of the intended Internet users as accurately as possible.
Since sensor nodes in a WSN may be deployed in an unattended environment, as shown in Figure 1, attackers can relatively easily compromise one or more sensor nodes and use them to inject false event data (e.g., compromised node A can report false data for the event) or to disrupt the transmission of legitimate event data (e.g., compromised node B can tamper with or even discard true data for the event during multi-hop forwarding). If undetected, such attacks can cause not only false alarms but also the depletion of the limited energy of the sensor nodes. Moreover, Internet users may not be notified of a real event quickly enough to handle it in time and avoid serious consequences. Therefore, it is imperative that security services such as data authenticity and availability be provided to resist such attacks.
In many critical applications, the data collected by sensor nodes can also be sensitive. Therefore, it is important to ensure data authenticity so that reports of false or nonexistent events can be detected. If a report for an event is collectively endorsed by multiple sensor nodes, data authenticity can be ensured in the sense that a certain number of compromised nodes cannot collectively forge a report. That is, to forge a valid report for an event, a larger number of nodes have to be compromised.
Moreover, since denial of service (DoS) attacks can occur as the result of corrupted partial endorsements or discarded true data for the event, authentication alone may not be sufficient. Data availability measures must therefore be employed in addition to those for authenticity to make the security measures highly resilient to DoS attacks. En-route filtering of false data is vital in saving scarce network resources and in prolonging the life of the network.
Consequently, to achieve these security objectives, authentication and efficient en-route filtering of data sent from WSN nodes to one or more Internet users must be developed to detect false data injection and to counter DoS attacks.
Traditional detection mechanisms developed for the Internet usually rely on infrastructure equipment (e.g., firewalls) to filter out distributed denial of service (DDoS) packets, which is not adequate for WSNs because of the resource-constrained nature of sensor nodes and the lack of a comparable infrastructure in WSNs. Meanwhile, current detection mechanisms developed for WSNs rely primarily on predistributed keys shared between the sensor nodes and the sink node. They cannot be directly applied to the IoT scenario, because they send the data of an event sensed by cooperative sensor nodes only to one local sink rather than to one or more Internet users, who are usually situated in different locations, possibly in different networks, and who may not be able to establish shared keys with every sensor node to authenticate its event data.
In this paper, we propose a data authentication and en-route filtering (DAEF) scheme to ensure the security of data collection and transmission from WSNs in the context of IoT. In DAEF, we make use of verifiable secret sharing cryptography to distribute shares, based on the most efficient ID-based signature scheme, to multiple cooperative sensor nodes. To tolerate an adversary that compromises multiple neighboring nodes in the event area, the event report is collectively generated, digitally signed, and forwarded to one or more of the intended Internet nodes through multipath routing. The main contributions of this paper are summarized as follows.
(1) After identifying the security requirements on data collection and transmission from WSNs in the context of IoT, we propose a secure and efficient solution to security problems that have not been sufficiently dealt with in existing solutions.
(2) We propose a data authentication and en-route filtering scheme, referred to as DAEF, that does not require any preshared keys between the Internet users and the sensor nodes. Furthermore, not only can the scheme tolerate node compromise attacks, but it can also mitigate the impact of DoS attacks while minimizing energy consumption for the WSNs.
(3) We analyze some existing data en-route filtering mechanisms proposed for WSNs with respect to their characteristics and limitations and compare them with DAEF.
(4) We illustrate how DAEF can be used as a secure and efficient mechanism to counter report disruption attacks on WSNs, a capability that we have not found so far in the literature.
(5) We conduct a performance comparison between DAEF and some comparable schemes to demonstrate DAEF's advantages over those schemes in terms of energy consumption, making DAEF suitable for WSNs.
The rest of this paper is organized as follows. In the next section, we review some related work on data authentication and en-route filtering. In Section 3, we present our proposed scheme, which includes assumptions, the threat model and design goals, two preliminaries, and finally the procedure of the scheme. In Section 4, we analyze our proposed scheme in terms of security and performance and compare it to some comparable schemes. Finally, in Section 5, we conclude this paper and discuss some future work.

Authentication Frameworks for IoT.
There are currently some authentication frameworks for data reports designed specifically for WSNs in the IoT scenario. Oliveira et al. proposed the Secure-TWS scheme to authenticate the communication from a single node to multiple users by using certificate-based signatures, in which the certification authority (CA) is part of the existing infrastructure in the Internet and, hence, is easy to provide, since the Internet users only need to trust the CA and do not have to allow the CA to impersonate them [3]. When the users receive a data report signed by a sensor node, they download the sensor node's public key and the corresponding certificate from the CA to authenticate the report using signature verification. Yasmin et al. proposed a framework for authenticated broadcast/multicast by the sensor node using the identity-based online/offline signature (IBOOS) scheme [4]. The offline phase performs most of the signature computations to calculate a partial signature, which is stored on the sensor nodes. Whenever a sensor node reports an event, it performs minor computations to obtain the final signature from the stored partial signature.
The above two schemes enable all sensor nodes in the WSNs to send messages to report critical situations and allow every node on the path from the sender node to the receiving users to verify and filter out false data as early as possible without using any shared keys. The computation overhead of the first scheme is lower than that of the second, but it incurs a higher communication cost due to the transmission of certificates.
However, these two schemes do not take into consideration the existence of compromised nodes that may inject false event data or disrupt the transmission of legitimate event data. First, an event may be reported by a single sensor node that has been compromised but not yet detected, in which case the false report can propagate to the users, who may then mistakenly take incorrect measures. Second, should there be a compromised node on the routes to the Internet users, the users might be misled or might not receive any messages at all. Therefore, it is necessary to use multiple surrounding sensor nodes to collectively generate a legitimate data report, which should also be forwarded to the Internet users via multipath routing.

Authentication Based on Symmetric Cryptography in WSNs.
In WSNs, the problem of authenticating an event report collected by multiple sensors to the local sink node has attracted a great deal of attention in recent years [5-20]. Most of these schemes achieve their goals by using message authentication codes (MACs) based on symmetric keys. The basic idea is to attach MACs to the event reports and to ensure that a legitimate report carries a certain number of valid MACs. When the event report is forwarded to the sink along a routing path, intermediate nodes can detect and drop a forged report if it does not carry enough valid MACs. SEF [5] and IHA [6] are two such schemes for filtering injected false data in WSNs. SEF allows both the sink node and the en-route nodes to authenticate, with a certain probability, a report that has $T$ MACs generated by a cluster attached to it, by using keys from different partitions of a global key pool. IHA verifies a report that has  ( =  + 1) MACs and one compressed MAC attached that are computed by the cluster lead node in a deterministic and hop-by-hop fashion through using pairwise keys between two upper or lower associated nodes that are  hops away. All of the following schemes follow this T-authentication approach but realize it with different techniques. In DSF [7], dual key sharing is used, that is, random key sharing and associated key sharing, to reduce the number of hops over which false data are forwarded. In RAS [8], dynamic authentication tokens from a one-way hash chain are used based on a predesigned partition-overlapping key pool scheme. In KAEF [9], a one-way key chain authentication method is used to generate and verify endorsements for transferred data. In STEF [10], the query-response operation mode is adopted and the concept of a ticket is proposed based on lightweight one-way functions, so that messages are forwarded only if they contain a valid ticket originally issued by the base station. In PCREF [11], polynomials stored in each node are adopted, including an authentication polynomial and a check polynomial derived from the primitive polynomial, and used for endorsing and verifying the reports. In DEFS [12], the so-called Hill Climbing approach is used to ensure that nodes close to a source cluster hold more keys for the source cluster than those that are further away, to balance the network. In GPREF [13], a multiaxis division based approach for deriving location-aware keys is used. In DREF [14], an authentication scheme capable of filtering invalid messages, called CFA [21], is employed together with a novel idea of embedding proximity information into a Bloom filter prepared for the query purpose. In EAB [15], an en-route authentication bitmap is developed by using Bloom filter techniques to build an authentication manifest. In BECAN [16], a bandwidth-efficient cooperative authentication scheme is proposed based on the random graph characteristics of sensor node deployment and the cooperative bit-compressed authentication technique. Last but not least, in LBRS [17] and LEDS [18], location-based keys are utilized to authenticate a data report so that compromised nodes cannot break the entire WSN even though a certain area of the WSN may have been affected.
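The basic MAC-based filtering idea described above can be sketched as follows. This is a minimal illustration rather than the mechanism of any particular scheme: the threshold value `T = 3`, the key identifiers, and the function names (`make_mac`, `endorse`, `en_route_check`) are assumptions made for the example, and HMAC-SHA256 stands in for whatever MAC a concrete scheme uses.

```python
import hashlib
import hmac

T = 3  # assumed threshold: a legitimate report must carry at least T MACs


def make_mac(key: bytes, report: bytes) -> bytes:
    """Endorse a report with HMAC-SHA256 under one symmetric key."""
    return hmac.new(key, report, hashlib.sha256).digest()


def endorse(report: bytes, keys: dict) -> dict:
    """Each endorsing node attaches one MAC, indexed by its key ID."""
    return {key_id: make_mac(key, report) for key_id, key in keys.items()}


def en_route_check(report: bytes, macs: dict, my_keys: dict) -> bool:
    """Forwarding decision of an intermediate node.

    The report is dropped if it carries fewer than T MACs, or if any MAC
    made with a key this node also holds fails verification. MACs made
    with keys the node does not hold cannot be checked here and are left
    to downstream nodes or the sink.
    """
    if len(macs) < T:
        return False
    for key_id, tag in macs.items():
        key = my_keys.get(key_id)
        if key is not None and not hmac.compare_digest(tag, make_mac(key, report)):
            return False
    return True


# Hypothetical usage: three cluster nodes endorse, a forwarder shares key 2.
cluster_keys = {1: b"k1", 2: b"k2", 3: b"k3"}
report = b"fire@room12"
macs = endorse(report, cluster_keys)
forwarder_keys = {2: b"k2"}

forwarded = en_route_check(report, macs, forwarder_keys)
forged = dict(macs)
forged[2] = b"\x00" * 32  # a bogus endorsement for key 2
dropped = not en_route_check(report, forged, forwarder_keys)
```

The sketch also makes the key-sharing limitation visible: a forwarder that holds none of the endorsing keys cannot check anything, and a remote Internet user would need shared keys with the sensor nodes to verify the report at all.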

Authentication Based on Asymmetric Cryptography in WSNs.
Some asymmetric cryptographic schemes, such as CCEF [19], PDF [20], and LTE [22], rely on signature approaches and can enable any report, not just the report that is sent to the sink node, to be authenticated and filtered en route. Moreover, they do not require any preshared keys. CCEF employs a filtering mechanism based on a commutative cipher from public key cryptography, in which cluster nodes can establish a secret association with the sink on a per-session basis, while the en-route nodes are equipped with a witness key that is used to verify the authenticity of reports without knowing the original session key. PDF leverages Shamir's threshold cryptography and elliptic curve cryptography (ECC) to reject false data packets, while LTE makes use of identity-based cryptography (IBC) based on bilinear pairing to bind the private key of each sensor node to both its identity and its geographic location.

Critical Analysis and Comparison of Existing Schemes.
Although the existing schemes can effectively perform authentication and false data report filtering, they still have limitations in several respects, such as the key-sharing limitation, the T-threshold limitation, the node location limitation, the static route limitation, the lack of tolerance to report disruption attacks, and the lack of tolerance to selective forwarding attacks, to name a few. These limitations are discussed below.
(1) Key-sharing limitation refers to the requirement that every node that generates a report must share key material with the sink.
(2) T-threshold limitation refers to the situation in which the compromise of more than T authentication partitions would put the whole WSN in danger. Such a limitation makes the network less resilient to an increase in the number of compromised nodes.
(3) Node location limitation refers to the requirement that each node be equipped with GPS capability to locate itself, since only a rough estimate of the location can be achieved for sensor nodes that are not equipped with GPS.
(4) Static route limitation refers to the situation in which reports are forwarded to a fixed sink along a preestablished path and each en-route node is associated with the source nodes in the event area.
(5) Lack of tolerance to report disruption attacks refers to the situation in which intentional submission of corrupted partial MACs or signatures by compromised sensor nodes disrupts the process of data filtering by some sensor nodes on the forwarding route. Such attacks are also called packet pollution attacks or false-endorsement based DoS (FEDoS) attacks [23].
(6) Lack of tolerance to selective forwarding attacks refers to the situation in which one or more compromised forwarding sensor nodes can drop a legitimate report. Such attacks are also called path-based DoS (PDoS) attacks [24].
Table 1 provides an analysis of the limitations of some of the main existing schemes. We can see from the table that all symmetric key-based schemes exhibit the key-sharing limitation, since it is practically infeasible for every remote user to establish a shared key with each and every one of the sensor nodes. Therefore, these symmetric key-based schemes are not suitable for WSNs in the context of IoT. As a result, only PDF and LTE, which are asymmetric cryptography-based signature schemes, can be used for event data authentication and filtering in the IoT scenario. However, PDF is vulnerable to both report disruption attacks and selective forwarding attacks. Only DREF, LEDS, and LTE can resist report disruption attacks and selective forwarding attacks that result from node compromises in the event area and from the use of a single route from a sensor node to the sink node. The analysis thus shows that only LTE is applicable to the IoT scenario and can provide some level of tolerance to both types of DoS attacks. Therefore, we compare the performance of our proposed scheme only to that of LTE at the same security level.

Location-Based Threshold-Endorsement (LTE) Mechanism.
Due to its functional capability as well as its applicability to the IoT scenario, we would like to elaborate a little more on the LTE scheme.In LTE, the sensor network is divided into  *  square cells of equal side length .Each cell is labeled with a pair of integers ⟨, ⟩, where 1 ≤  ≤  and 1 ≤  ≤ .⟨ 0 ,  0 ⟩ is the location of the sink.The cell key of cell ⟨, ⟩ is  , = ( ‖ ) where  is the network master secret key.Let holds, where V (0) , = ( , , ).If the check does not succeed, AP should proceed to verify each received   , by checking if (1) holds.Consider the following: If the check succeeds, AP considers node   , legitimate.AP is therefore able to pinpoint all the endorsers offering false signatures and delete them from Ω.
AP then sends to the sink the final report along a multihop path discovered using a secure multipath routing protocol. Upon receiving a report ⟨Λ, Υ , , ℎ(Λ ‖ )⟩ to be forwarded, each intermediate node computes where  pub =  is a public system parameter. If the report is authentic, then   = . Therefore, if ℎ(Λ ‖   ) = ℎ(Λ ‖ ), then an intermediate node would consider the report authentic and hence forward it to the next hop.
However, LTE is not always feasible because it requires that every sensor node be equipped with localization capability, which incurs extra communication overhead as well as latency. Moreover, the bilinear pairing utilized in LTE is too expensive for low energy sensor nodes. In LTE, a data report must be cosigned by $T$ out of $n$ nodes ($T \le n$) in the event area. Thus, an adversary needs to compromise at least $T$ nodes in order to inject false data. It is worth mentioning, however, that the relationship between $T$ and $n$ is not fully discussed in LTE with respect to dealing with report disruption attacks. From our analysis, the critical relationship is $n - (T - 1) \ge T$: even if the maximum number of compromised nodes that the scheme tolerates against false data injection, namely $T - 1$, all misbehave, the remaining legitimate nodes must still be able to produce a valid endorsement, so that the compromised nodes cannot cooperatively cause a report disruption attack. Thus, we have $n \ge 2T - 1$.
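The relationship between the endorsement threshold and the group size can be checked with a few lines of code. In this sketch, `t` stands for the threshold, `n` for the number of nodes in the event area, and `compromised` for the number of misbehaving nodes; the symbol names and both function names are assumptions made for the illustration.

```python
def min_group_size(t: int) -> int:
    """Smallest n satisfying n - (t - 1) >= t, that is, n = 2t - 1."""
    return 2 * t - 1


def can_still_endorse(n: int, t: int, compromised: int) -> bool:
    """True if the honest nodes alone can supply the t required endorsements."""
    return n - compromised >= t


# With a threshold of 3, up to 2 compromised nodes cannot forge a report.
# A group of 5 also keeps them from blocking one; a group of 4 does not.
threshold = 3
smallest = min_group_size(threshold)
robust = can_still_endorse(5, threshold, threshold - 1)
fragile = can_still_endorse(4, threshold, threshold - 1)
```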

The Proposed Scheme
In this section, we describe the proposed DAEF, a data authentication and en-route filtering scheme that can be used to ensure the security of data collection and transmission from WSNs in the context of IoT. In applications to which DAEF can be applied, we assume that a group of sensor nodes is used to monitor an event. In DAEF, a group ID-based signature is introduced for each event report so that any intermediate node and any Internet user with the identity of the group lead node can easily verify the event report. This removes the key-sharing and T-threshold limitations while realizing T-authentication to tolerate node compromise attacks, without requiring static routes or node localization technology. DAEF also employs a verifiable secret sharing algorithm to distribute and verify signature shares in order to defend against report disruption attacks. Last but not least, we employ the most efficient ID-based signature algorithm in DAEF to reduce the computation overhead that results from signature generation and verification operations. Moreover, a multipath routing protocol should be used in DAEF to resist selective forwarding attacks.

Assumptions.
We assume that all the sensor nodes in WSNs are deployed uniformly and bootstrapped securely using the scheme proposed in [25] so that the one-hop neighboring sensor nodes can establish pairwise keys and trust relationships to form a network with a multi-hop cluster-tree hierarchical topology. Each node establishes and stores a neighbor trust list as shown in Figure 2. We also assume that every event of interest can be detected by multiple, say  ( > 1), sensor nodes in one group, each group covering a detecting area. The event then needs to be reported, during which a group of at least  ( ≥ 2 − 1) nearby legitimate nodes should collaboratively agree on the event that will be forwarded to one or more Internet users. In this scenario,  and  are predefined system parameters and are determined by the application requirements. For each group, a lead node is elected and is responsible for collecting and summarizing all the detection results received from the detecting nodes in the group and for generating a final report on behalf of the group. The group of neighboring nodes generates and broadcasts the signed report to the lead node, which then aggregates the signatures before sending them to one or more Internet users through one or more forwarding nodes.
Moreover, we assume that the Internet users are determined by the service provider (SP) based on the service provided by the sensor nodes. Taking smart home applications as an example, the detection of fire in a home should be reported to the family members, the security guards of the community, and the fire department, while the temperature in the home may be reported to all family members every two hours. The corresponding software code can be preloaded in the sensor nodes prior to deployment. Furthermore, the code can be dynamically updated by using an end-to-end secure communication protocol (see [26]).

Threat Model and Design Goals.
An adversary can eavesdrop on all traffic, inject packets, replay older packets, and take full control of the compromised nodes to launch false report injection attacks and DoS attacks. In our model, we assume that at most $T - 1$ neighboring nodes in an event area can be compromised. Our objective is to design a scheme to detect these attacks on the event report in the IoT scenario. DAEF should achieve the following goals.
Figure 2: Neighbor trust list. Neighbor ID is the ID of the node's neighbor; short address is the short address of the node's neighbor; shared key is the pairwise key shared between the node and its neighbor; relationship indicates whether the neighbor is the node's father, child, sibling, and so on; hop count is the distance between the node and the edge router; trust value is the result computed with the utility value method in a multiple criteria decision making scheme.
(1) It should not require the establishment of preshared keys between Internet users and sensor nodes (to overcome the key-sharing limitation).
(2) It should not depend on the localization technology to prevent compromised nodes from breaking the whole WSN (to overcome the T-threshold limitation).
(3) It should tolerate node compromises in the WSNs even if the locations of the sensor nodes may not be known (to overcome the node location limitation).
(4) It can mitigate the impact of DoS attacks including report disruption attacks and selective forwarding attacks (to overcome the FEDoS and PDoS attacks).
(5) It should keep the overhead of communication and computation as low as possible in the WSNs.

Given a message , the signer   performs the following steps to sign the message.
The signer then sends (,   , ℎ, ) to the receiver. To verify the message and the signature, the receiver does the following steps.
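To make the sign and verify flow concrete, the following sketch implements a Schnorr-style ID-based signature. It is a toy under several assumptions and should not be read as the exact algorithm of DAEF: it works in a multiplicative group modulo a Mersenne prime instead of on an elliptic curve, it uses one SHA-256-based hash with domain tags in place of separate hash functions, and all names (`setup`, `extract`, `sign`, `verify`, `R`, `h`, `z`, `c`) are hypothetical. It was chosen because its verification has the same general shape as the check used in the scheme: a hash of the identity and a public value, followed by a combination of the signature with the public system parameter.

```python
import hashlib
import secrets

P = 2**127 - 1  # Mersenne prime; toy modulus standing in for the curve group
ORDER = P - 1   # exponents are reduced modulo the group order
G = 3           # fixed base element, playing the role of the base point


def H(tag, *parts) -> int:
    """Hash a domain tag and arbitrary values to an exponent."""
    data = "|".join([tag] + [str(p) for p in parts]).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def setup():
    """Key generation center: master secret x and public parameter G^x."""
    x = secrets.randbelow(ORDER - 1) + 1
    return x, pow(G, x, P)


def extract(x, identity):
    """Private key (R, s) bound to an identity: s = r + H1(ID, R) * x."""
    r = secrets.randbelow(ORDER - 1) + 1
    R = pow(G, r, P)
    s = (r + H("H1", identity, R) * x) % ORDER
    return R, s


def sign(identity, R, s, message):
    """Signature (R, h, z) on a message under the identity's private key."""
    y = secrets.randbelow(ORDER - 1) + 1
    Y = pow(G, y, P)
    h = H("H2", identity, message, R, Y)
    z = (y + h * s) % ORDER
    return R, h, z


def verify(p_pub, identity, message, sig):
    """Recompute Y = G^z * (R * P_pub^c)^(-h) and compare the hash."""
    R, h, z = sig
    c = H("H1", identity, R)
    base = R * pow(p_pub, c, P) % P  # equals G^s for an honest signer
    Y = pow(G, z, P) * pow(base, (ORDER - h) % ORDER, P) % P
    return h == H("H2", identity, message, R, Y)


# Hypothetical usage: a lead node signs a report, anyone with its ID verifies.
master, p_pub = setup()
R, s = extract(master, "lead-node-7")
sig = sign("lead-node-7", R, s, "fire@room12")
accepted = verify(p_pub, "lead-node-7", "fire@room12", sig)
```

Verification needs only the signer's identity and the public parameter, which is why no preshared key between the verifier and the sensor node is required.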

The Data Authentication and Filtering Scheme
Initialization.
During the bootstrapping phase, the lead node in every group distributes the secret share of   to all group nodes. Specifically, in th group, the lead node   generates a secret polynomial    () and distributes the secret share    (  ) to every group node   using the shared key between them. Then,   deletes   and    () but stores    (  ). Therefore,   only needs to be authenticated by any other  − 1 group nodes and get  − 1 secret shares in order to reconstruct   .
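The distribution and later reconstruction of a secret from shares can be sketched with plain Shamir secret sharing. The example is a simplification under stated assumptions: the field modulus `Q`, the threshold of 3, the node IDs 1 to 5, and the function names are all chosen for illustration, and the pairwise-key encryption of each share in transit is omitted.

```python
import random

Q = 2**127 - 1  # a Mersenne prime, used as the field modulus for this toy


def make_polynomial(secret: int, t: int, rng: random.Random) -> list:
    """Random polynomial of degree t-1 whose constant term is the secret."""
    return [secret % Q] + [rng.randrange(Q) for _ in range(t - 1)]


def evaluate(coeffs: list, x: int) -> int:
    """Horner evaluation of the polynomial at x, modulo Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc


def reconstruct(shares: dict) -> int:
    """Lagrange interpolation at x = 0 from {node_id: share}."""
    secret = 0
    for i, y_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * (-j) % Q
                den = den * (i - j) % Q
        # pow(den, Q - 2, Q) is the modular inverse of den (Fermat).
        secret = (secret + y_i * num * pow(den, Q - 2, Q)) % Q
    return secret


# Hypothetical usage: the lead node (ID 1) shares its secret among 5 nodes.
rng = random.Random(7)
t, node_ids = 3, [1, 2, 3, 4, 5]
secret = 123456789
coeffs = make_polynomial(secret, t, rng)
shares = {i: evaluate(coeffs, i) for i in node_ids}

# The lead node keeps only shares[1]; with t-1 = 2 further shares it can
# rebuild the secret, while fewer shares give an unrelated value.
recovered = reconstruct({i: shares[i] for i in (1, 3, 5)})
```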

Report Generation.
When an event occurs, the lead node will prepare a report, say, .To get an agreement on the event from other group nodes, the lead node   broadcasts  to all the group nodes and authenticates itself to them.After receiving , a group node will find the difference between the received  and what it has sensed.If the difference is within a predefined error range, it will agree on  and endorse the signature.These  group members including   itself, taking one node   as an example, will generate one random polynomial     () as shown in (3), in which the share of the random number is     () =     (  ) ( = 1, 2, . . ., ), distribute     () to the other sensor nodes   , and broadcast    =     (mod) ( = 0, 1, 2, . . .,  − 1).Consider the following: Note that the distribution of secret shares should be protected using the shared keys between the communicating peers.Therefore, each node will receive no less than −1 ( ≤  ≤ ) secret shares which should be verified as discussed in Section 3.3.2.For example,   should verify     () received from   according to formula (4).Consider the following: During the verification phase, a compromised group member may be detected.If   find a corrupted partial secret share sent by a group member,   broadcasts the  of the compromised group member.If more than  members claim that one node has been compromised, each legitimate sensor node can find all compromised group members and no less than  legitimate group members.These  legitimate sensor nodes should use the verified  − 1 secret shares and its own share to compute the share of   , denoted as    , shown in (5).Consider the following: The secret random number   is generated from the polynomial    () = ∑  =1     (), shown in formula ( 6), which is endorsed by these  legitimate sensor nodes and   =    (0).Therefore, no sensor node knows   , and any  sensor nodes can reconstruct   using Lagrange interpolation:  (7).It is possible that some of the neighboring nodes have been compromised during this 
phase and, thus, may provide the lead node   with incorrect signatures.Therefore,   should verify their authenticity by checking the equation   =  − ℎ(  +  pub ) where  =  1 (  ‖   ).Consider the following: Finally,   broadcasts the final data report (,   , ℎ, ) and assigns multiple upstream nodes in its neighbor trust list to make the report forwarded to the Internet users through multipath routing.In the cases in which the compromised lead node may either not send the final report or transmit a bogus report with a wrong (,   , ℎ, ), it will be detected by all legitimate group nodes.The verification is the same as the en-route filtering operations to be described in Section 3.4.3.In this case, the legitimate neighboring nodes will randomly elect a new lead node among themselves to generate a new threshold-endorsement and send the final report to the Internet users.The whole report generation procedure is illustrated in Algorithm 1.
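The share exchange and verification in this phase can be illustrated with a Feldman-style verifiable secret sharing sketch, in which every node deals shares of its own random polynomial, publishes commitments to the coefficients, and each receiver checks its share against those commitments. This is an assumption about the general technique, not a transcription of the scheme's equations: the toy group (`P = 2039`, `Q = 1019`, `G = 4`), the threshold, the node IDs, and the function names are all illustrative, and the parameters are far too small for real use.

```python
import random

P = 2039  # toy safe prime, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup generated by G
G = 4     # a square modulo P, hence a generator of the order-Q subgroup


def deal(t, node_ids, rng):
    """One node's contribution: a random polynomial of degree t-1 over Z_Q,
    a private share f(j) for every node j, and public commitments G^a_k."""
    coeffs = [rng.randrange(Q) for _ in range(t)]
    shares = {j: sum(a * pow(j, k, Q) for k, a in enumerate(coeffs)) % Q
              for j in node_ids}
    commitments = [pow(G, a, P) for a in coeffs]
    return shares, commitments


def verify_share(j, share, commitments):
    """Feldman check: G^share == product over k of C_k^(j^k) modulo P."""
    expected = 1
    for k, c in enumerate(commitments):
        expected = expected * pow(c, pow(j, k, Q), P) % P
    return pow(G, share, P) == expected


def lagrange_at_zero(shares):
    """Interpolate the joint polynomial at 0 from {node_id: share} modulo Q."""
    total = 0
    for i, y_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * (-j) % Q
                den = den * (i - j) % Q
        total = (total + y_i * num * pow(den, Q - 2, Q)) % Q
    return total


# Hypothetical usage: 5 nodes jointly generate a random secret, threshold 3.
rng = random.Random(1)
t, ids = 3, [1, 2, 3, 4, 5]
deals = {i: deal(t, ids, rng) for i in ids}

# Every node checks every share it received; a tampered share is rejected.
all_ok = all(verify_share(j, deals[i][0][j], deals[i][1]) for i in ids for j in ids)
bad_detected = not verify_share(2, (deals[1][0][2] + 1) % Q, deals[1][1])

# Each node adds up its verified shares to get a share of the joint secret,
# which no single node ever holds; any t of these shares reconstruct it.
joint_shares = {j: sum(deals[i][0][j] for i in ids) % Q for j in ids}
k = lagrange_at_zero({j: joint_shares[j] for j in (1, 2, 3)})
```

Because the check uses only public commitments, a node that receives a corrupted share can prove the misbehavior to its neighbors, which is the property the scheme relies on to counter report disruption.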

En-Route Filtering of Data Report.
We denote   as the en-route verification probability. The forwarding sensor node verifies the signature of a report with the probability   which is a predefined system parameter. As discussed in Section 3.3.1, the verifying intermediate node or the final Internet user first computes  =  1 (  ‖   ), then checks (8). If the verification is successful, the data report is regarded as authentic and forwarded to multiple upstream nodes in the neighbor trust list; otherwise, it is immediately discarded.
Finally, should the lead node change the report with a wrong (,   , ℎ, ) in the final step, it would be detected by the forwarding nodes as well as by all the legitimate neighboring nodes. These nodes can then elect a new lead node to generate a new threshold-endorsement and send the final report to the Internet nodes. In the worst case, even if the attacker can derive the private key   by compromising  nodes in the event area, it will not affect any other groups.

Mitigation of Report Disruption Attacks.
DAEF leverages verifiable Shamir's secret sharing cryptography described in Section 3.3.2, which has been shown to be secure [28]. In the report generation phase, at least  legitimate sensor nodes cooperatively generate the secret random number   by exchanging the shares     () =     (  ), which may be disrupted by compromised nodes. For example, an attacker may provide an incorrect share     () to a neighboring node so that it will not get the right share    = ∑  =1     (  ). However, in DAEF, each node will verify the received shares, detect compromised neighboring nodes, and broadcast the detection result so that only legitimate shares will be used in the    and only the shares   computed by legitimate neighboring nodes can be used by the lead node. Note, however, that if the attacker distributes the wrong shares to only some of the legitimate neighboring nodes that are able to detect the wrong shares, based on the detected results broadcast by such neighboring nodes, it is not possible for the legitimate neighboring nodes to identify the compromised nodes.

Mitigation of Selective Forwarding Attacks.
In DAEF, in order to mitigate selective forwarding attacks, the lead node and the intermediate nodes forward the final report (,   , ℎ, ) to multiple upstream nodes that are in their neighbor trust lists to ensure that the report reaches the Internet users through multipath routing. Unless all the forwarding nodes are compromised, the legitimate report will ultimately be delivered to its destinations. However, this solution incurs high communication overhead. In the worst case, in which all the upstream nodes of a forwarding node are compromised, another route should be found by using a secure multipath routing protocol in the WSN, such as SPREAD [29], which is outside the scope of this paper.

Performance Analysis.
In this section, we evaluate the performance of DAEF in terms of filtering efficiency, the number of hops that false data can travel, and the ratio of compromised area, which are the main metrics for the evaluation of en-route filtering schemes [30]. We also analyze and compare DAEF to LTE in terms of computation, communication, and energy consumption. We conducted all experiments in the analysis, evaluation, and comparison by using MATLAB [31] plus some programming in VC++ whenever necessary.

General Analysis.
First, let us analyze the filtering efficiency $p_h$, which is defined as the probability that false data is successfully filtered within a specified number of hops $h$. Similar to LTE, if each forwarding node verifies a report with probability $P_f$, the probability that false data is filtered out within $h$ hops is

$p_h = 1 - (1 - P_f)^h$.

Clearly, the greater the value of $P_f$, the greater the $p_h$ that can be achieved, that is, the better the filtering efficiency. However, a smaller $P_f$ lowers the computation overhead of intermediate nodes. Figure 3 shows the filtering efficiency for different values of $P_f$, from which we can see that when $P_f$ is 0.5, more than 90% of false reports are filtered out within 4 hops. Even for a small $P_f$, say 0.2, less than 11% of false reports can travel over 10 hops. Therefore, for a large WSN that has long forwarding paths, DAEF can efficiently filter out false reports as early as possible to save the energy of legitimate nodes.
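The two figures quoted above can be reproduced numerically. The sketch assumes that every forwarding node verifies independently with the same fixed probability and that a verifying node always detects a false report, which gives a geometric form for the filtering probability; the function and parameter names are illustrative.

```python
def filtering_efficiency(p_verify: float, hops: int) -> float:
    """Probability that a false report is dropped within `hops` hops when
    each forwarding node verifies independently with probability p_verify."""
    return 1.0 - (1.0 - p_verify) ** hops


def survival(p_verify: float, hops: int) -> float:
    """Probability that a false report is still travelling after `hops` hops."""
    return 1.0 - filtering_efficiency(p_verify, hops)


within_4_hops = filtering_efficiency(0.5, 4)  # 1 - 0.5**4 = 0.9375
beyond_10_hops = survival(0.2, 10)            # 0.8**10, roughly 0.107
```

Under these assumptions, a verification probability of 0.5 filters about 94% of false reports within 4 hops, and with a probability of 0.2 about 10.7% of false reports survive 10 hops, consistent with the statements in the text.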
The number of hops that false data can travel (ℎ) is defined as the average number of hops over which false data are forwarded before being filtered and dropped, which reflects the filtering effectiveness. Similar to LTE, the average number of hops that a false report is forwarded before being filtered out is
The ratio of compromised area  is defined as the percentage of compromised sensor nodes in the terrain, which reflects how resilient the filtering is to an increase in the number of compromised nodes. Given that the network size is  and that the average number of nodes in each group is , when the attacker successfully compromises  sensor nodes, the probability that no node in an event group is compromised is
Let  () represent the probability that  nodes are compromised in a group. Then,
Therefore, the percentage of compromised nodes is
Figure 4 shows the resilience as the number of compromised nodes increases. We can observe that the percentage of

Comparison Analysis.
We analyze DAEF and LTE in terms of computation cost and communication overhead and then compare them in terms of energy consumption to show the advantages that DAEF has over LTE. First, we analyze the computation cost. In these schemes, the expensive operations are pairing (), point multiplication (), and exponentiation (Exp). In DAEF, each endorsing sensor node performs one point multiplication operation to generate       and  ×  exponentiation operations, which comprise  exponentiation operations to generate   ( = 0, 1, 2, . . .,  − 1) and ( − 1) ×  exponentiation operations to verify the  − 1 received shares. Thus, the total number of computational operations in an event group is  *  +  2  * Exp. Meanwhile, each verifying forwarding sensor node needs three point multiplication operations to authenticate an event report. In contrast, in LTE, the number of computational operations in an event group is 2 *  + ( + 1) *  + 2 * Exp when no compromised nodes are selected by the lead node, but ( + 1) *  + ( + 1) *  + ( −  + 2) * Exp in the worst case. In addition, each forwarding sensor node needs two pairing operations and one exponentiation operation to authenticate an event report.
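The DAEF operation count above can be tallied mechanically. The sketch below follows the prose description only: each endorsing node performs one point multiplication, generates one promise per polynomial coefficient, and verifies each received share at the cost of one exponentiation per coefficient. The names `t` (number of promises per node) and `n` (number of nodes in the event group) are labels assumed for this sketch, and the LTE counts are left out because they cannot be derived from the prose alone.

```python
def daef_group_operations(t: int, n: int) -> dict:
    """Count expensive operations for one DAEF event group.

    Assumed labels (hypothetical, chosen for this sketch):
      t: number of promises each node publishes
      n: number of endorsing nodes in the event group
    """
    if t < 1 or n < 1:
        raise ValueError("t and n must be positive")
    point_mult_per_node = 1              # one point multiplication per node
    exp_for_promises = t                 # one exponentiation per promise
    exp_for_checks = (n - 1) * t         # t exponentiations per received share
    exp_per_node = exp_for_promises + exp_for_checks  # equals n * t
    return {
        "point_mult": n * point_mult_per_node,
        "exp": n * exp_per_node,         # equals n * n * t for the whole group
    }


# Parameter pairs used later in the energy comparison.
for t, n in [(2, 3), (3, 5), (4, 7), (5, 9), (6, 11)]:
    print((t, n), daef_group_operations(t, n))
```

The exponentiation total grows quadratically with the group size, which is why the cost of report generation in DAEF rises quickly for larger groups even though each forwarding node still performs only three point multiplications per verification.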
Communication overhead is defined as the total number of messages generated in an event group. In DAEF, the  sensor nodes have to jointly generate a random number   for each report, which contributes most of the communication overhead because the private key   can be distributed during the bootstrapping phase. For a share   , each sensor node needs to send  − 1 secret shares to the  − 1 neighboring nodes and broadcast one promise   ( = 0, 1, 2, . . .,  − 1) and one detection result. In the signature generation phase, each legitimate group sensor node sends       and   to the lead node, and the lead node broadcasts   along with   . Thus, the total number of messages required to generate a signature for an event report is ( +  + 1). In contrast, in LTE, each endorsing node only needs to send one share   , to the lead node, which broadcasts . Thus, the total number of messages required to generate a signature for an event report is .
Note that the above analysis assumes that there is no compromised node in any of the event groups. Let us now compare DAEF with LTE in terms of energy consumption for the whole WSN under the assumption that there is at least one compromised node in an event group at the time of reporting an event, since the two schemes offer a similar level of security and both can deal with report disruption attacks and selective forwarding attacks.
We employ a model similar to that of LTE when analyzing and comparing energy consumption, which is determined by communication overhead as well as computation cost. We assume that the sensor nodes have the same capabilities as a standard Crossbow MICA2 mote [32], which has an 8-bit ATmega128L microcontroller clocked at about 7.37 MHz and complies with the IEEE 802.15.4 standards with a data transmission rate of 12.4 kbps. According to [4], completing a 160-bit point multiplication operation of ECC, a pairing operation, and an exponentiation operation consumes 24.3 mJ, 62.73 mJ, and 2.81 mJ, respectively. In addition, MICA2 consumes 52.2 μJ and 19.4 μJ to transmit and to receive one byte, respectively. We assume that the length of the node's ID is 2 bytes, so that the lengths of     (),   ,      ,   ,   and   are 2 bytes, 20 bytes, 40 bytes, 20 bytes, 40 bytes, and 40 bytes, respectively. The original report is assumed to be 15 bytes, thus allowing us to transmit a report in one data packet. We denote  as the average number of hops a report travels in the WSN. For the sake of simplicity, we only consider a single routing path; thus, both schemes involve   en-route filtering operations.
In the analysis, we specify (, ) to be (2, 3), (3, 5), (4, 7), (5, 9), and (6, 11), respectively, because of the relationship between  and  discussed in Section 2.4. Figure 5 shows the energy consumption for various  when   = 0.2 under the condition that the number of compromised nodes in the event group is the maximum value  − 1. We can see from the figure that energy consumption in DAEF is lower than that in LTE for (2, 3), (3, 5), (4, 7), and (5, 9). However, the difference narrows as the value of  increases, and once (, ) exceeds (6, 11), DAEF consumes more energy than LTE. In addition, when  increases by one hop, the increase in energy consumption in DAEF is about 15 mJ whereas that in LTE is more than 25 mJ. The reason is that computation in DAEF consumes less energy than computation in LTE: DAEF incurs () cost of point multiplication and ( 2 ) cost of exponentiation for report generation and three point multiplication operations for en-route report verification, whereas LTE incurs () cost of pairing, () cost of point multiplication, and () cost of exponentiation for report generation and two pairing operations and one exponentiation operation for en-route report verification. However, the energy consumption of communication in DAEF, which generates ( 2 , ) communication cost, is higher than that in LTE, which only generates (, ) communication cost.
Figure 6 shows the energy consumption for various   when  = 10 under the condition that the number of compromised nodes in the event group is the maximum value  − 1. With the same (, ), the difference in energy consumption grows as   increases. This is because one filtering operation in LTE requires two pairings and one exponentiation, which costs about 128 mJ of computation, whereas one filtering operation in DAEF requires three point multiplications, which cost about 72 mJ.
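The per-operation energy figures quoted earlier are enough to check the cost of one filtering operation in each scheme. The sketch below also estimates the expected verification energy along a path, assuming each forwarding node verifies with probability `p` over `hops` hops; `p` and `hops` are names chosen for this sketch, and only computation energy is counted (transmission and reception are ignored).

```python
# Per-operation energy on a MICA2-class mote, in millijoules (values quoted in the text).
E_POINT_MULT = 24.3
E_PAIRING = 62.73
E_EXP = 2.81


def verification_energy(scheme: str) -> float:
    """Computation energy (mJ) for one en-route verification of a report."""
    if scheme == "DAEF":
        return 3 * E_POINT_MULT          # three point multiplications
    if scheme == "LTE":
        return 2 * E_PAIRING + E_EXP     # two pairings plus one exponentiation
    raise ValueError(f"unknown scheme: {scheme}")


def expected_filtering_energy(scheme: str, p: float, hops: int) -> float:
    """Expected verification energy (mJ) along a path.

    Assumption for this sketch: each of `hops` forwarding nodes verifies
    the report independently with probability p.
    """
    return p * hops * verification_energy(scheme)


for scheme in ("DAEF", "LTE"):
    per_check = verification_energy(scheme)
    per_hop = expected_filtering_energy(scheme, 0.2, 1)
    print(f"{scheme}: {per_check:.2f} mJ per check, {per_hop:.2f} mJ per extra hop at p=0.2")
```

The per-check values come out at 72.9 mJ for DAEF and 128.27 mJ for LTE, in line with the rounded figures in the text. At a verification probability of 0.2 the expected cost of one additional hop is roughly 14.6 mJ and 25.7 mJ, which is consistent with the "about 15 mJ" and "more than 25 mJ" increments reported for Figure 5, and the gap between the schemes scales linearly with the probability.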
Figure 7 shows the energy consumption as the number of compromised nodes in an event group increases when   = 0.2 and  = 10. We can see from the figure that, with the same (, ), both DAEF and LTE consume less energy as the number of compromised nodes increases. This is because the number of legitimate nodes decreases as the number of compromised nodes increases, which reduces the computation cost and communication overhead in both schemes. Meanwhile, in DAEF, the larger the value of  is, the larger the difference is for the same (, ). In LTE, however, the reduction in energy consumption becomes less significant as the value of  increases. That is because the communication overhead in DAEF changes more with the number of compromised nodes than that in LTE does, which can be seen when (, ) = (5, 9).
Figure 8 shows the energy consumption for various values of the parameter  when   = 0.2 and  = 10 under the condition that the number of compromised nodes in the event group is the maximum value  − 1. The energy consumption in DAEF is lower than that in LTE except for the values  ≥ 11 and  > 4. However, the former increases slowly while the latter decreases as  increases. In LTE, regardless of the value of , the computation cost differs little, which results from the number of valid shares   , randomly selected by the lead node. Moreover, the larger the value of , the more effective computation and communication become, which reduces the waste of energy. In DAEF, the slowly increasing energy consumption results from the computation and broadcasting of the promises   ( = 0, 1, 2, . . .,  − 1).
In conclusion, DAEF outperforms LTE for data authentication and en-route filtering when an event group has a smaller number of nodes and larger number of compromised nodes.

Conclusion
In this paper, we proposed DAEF, a new data authentication and filtering scheme, to ensure the security of data collection and transmission from WSNs in the context of IoT. In the scheme, verifiable secret sharing cryptography is used to distribute the shares to multiple neighboring collective sensor nodes, based on the most efficient ID-based signature scheme. As long as an adversary does not compromise more than - neighboring nodes in an event area, any event report can be collectively generated with a digital signature attached and forwarded via multipath routing to multiple Internet nodes. Analysis of the proposed scheme showed that DAEF can effectively defend against node compromise attacks and DoS attacks in the forms of report disruption attacks and selective forwarding attacks. A quantitative comparison of DAEF with the LTE scheme in terms of energy consumption showed that DAEF outperforms LTE when an event group contains fewer nodes and more compromised nodes. In the future, we will conduct more experiments in real network settings to verify the results and to further improve the performance of DAEF in terms of latency.

Figure 1: Compromised nodes can inject false event data or disrupt the transmission of legitimate event data.

Figure 3: The filtering efficiency with different   .

Figure 4: The percentage of compromised groups with the increased number of compromised nodes when  = 1000.

Figure 7: Energy consumption with an increasing number of compromised nodes for various (, ) when   = 0.2 and  = 10.
, denote the th node with location   , in cell ⟨, ⟩. LTE utilizes the secret sharing technique to assign a share of  , to each   , ; that is,   , has the share   , . An event occurs in cell ⟨, ⟩ and is detected by  ≥  nodes. The lead node AP in cell ⟨, ⟩ chooses a random  ∈  *  and computes  = (, )  to be broadcast to the other detecting nodes. Upon receiving , each detecting node   , endorses the report Λ by computing   , =   , ℎ(Λ ‖ ). The node then sends to AP   , encrypted and authenticated with the pairwise key shared with the AP. Once receiving  or more such endorsements, AP would randomly select  endorsers denoted by a set notation Ω which may include itself. AP would then calculate  , = ∑ ∈Ω     , =  , ℎ(Λ ‖ ) and Υ , =  , + . The final report is ⟨Λ, Υ , , ℎ(Λ ‖ )⟩. Once deriving  , , AP verifies its authenticity by checking if equation ( ,

Table 1: Analysis of limitations.
3.3. Preliminaries
3.3.1. ID-Based Signature. As discussed above, an ECC-based signature (i.e., ECDSA) requires one point multiplication operation to generate a signature and two point multiplication operations to verify a signature. Moreover, authentication of the public key of the signer also requires two point multiplication operations. For an ECC of 160 bits, ECDSA produces a 60-byte signature, resulting in more than a 160-byte message payload (including a 60-byte ECDSA signature and a certificate with at least 100 bytes). With the current state-of-the-art technology, the most efficient ID-based signature (i.e., vBNN [27]) needs one point multiplication operation to sign a message and three point multiplication operations to verify the signature, with the length of a signature being 83 bytes. Let us briefly describe the vBNN scheme below. Given a sensor node   , the SP picks a random number   ∈   , where the multiplicative group   = [1, . . .,  − 1] computes   =   , where  is a large prime, and an elliptic curve (  ) is defined over a finite field   = [1, 2, . . .,  − 1]. Then, SP calculates   =   +  1 (  ‖   ) in which  is the master key of the WSN picked by SP and  1 : {0, 1} * ×  1 →   , where  1 is an additive group of the prime order .   and   are stored in the sensor node   .
Sharing. Sensor node   generates a secret polynomial   () =  0 +  1  +  2  2 + ⋅ ⋅ ⋅ +  −1  −1 , where  0 , . . .,  −1 are random numbers picked by the sensor node   and the secret key   can be picked as   =  0 . The secret share of   for the neighboring node   is thus     =    (  ). Then, any  sensor nodes together can reconstruct   by using Lagrange interpolation   = ∑  =1      /(  −   )) is the Lagrange coefficient. However, it is computationally infeasible for fewer than  sensor nodes to reconstruct the secret key   . All     must be distributed through secure communication channels. The sensor node   broadcasts  0 =   0 and   =    (mod ) ( = 1, 2, . . .,  − 1), and every   can verify the received
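The sharing steps described here (a random polynomial, per-node shares, broadcast promises, and Lagrange reconstruction) follow the pattern of Feldman-style verifiable secret sharing. The sketch below illustrates that pattern with deliberately tiny, insecure parameters; the constants `P`, `Q`, `G`, the function names, and the share-checking equation are assumptions of this sketch (the standard Feldman check), not details taken from the paper.

```python
import random

# Toy group parameters (far too small for real use), assumed for this sketch:
# P = 2 * Q + 1 with P and Q prime; G generates the subgroup of order Q in Z_P*.
P = 2039
Q = 1019
G = 4


def make_polynomial(secret: int, t: int, rng: random.Random) -> list:
    """Coefficients a_0..a_{t-1} with a_0 = secret, all reduced mod Q."""
    return [secret % Q] + [rng.randrange(Q) for _ in range(t - 1)]


def evaluate(coeffs: list, x: int) -> int:
    """Evaluate the polynomial at x (mod Q) using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc


def make_promises(coeffs: list) -> list:
    """Public promises A_k = G^{a_k} mod P, one per coefficient."""
    return [pow(G, c, P) for c in coeffs]


def verify_share(x: int, share: int, promises: list) -> bool:
    """Check G^share == prod_k A_k^(x^k) mod P (standard Feldman check)."""
    expected = 1
    for k, a_k in enumerate(promises):
        expected = expected * pow(a_k, pow(x, k, Q), P) % P
    return pow(G, share, P) == expected


def reconstruct(points: list) -> int:
    """Recover the secret f(0) from t points (x, f(x)) by Lagrange interpolation."""
    secret = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (-xm) % Q
                den = den * (xj - xm) % Q
        secret = (secret + yj * num * pow(den, -1, Q)) % Q  # needs Python 3.8+
    return secret


rng = random.Random(2013)
coeffs = make_polynomial(secret=123, t=3, rng=rng)
promises = make_promises(coeffs)
shares = [(x, evaluate(coeffs, x)) for x in range(1, 6)]  # node IDs 1..5
print(all(verify_share(x, s, promises) for x, s in shares))
print(reconstruct(shares[:3]))
```

Any three of the five shares recover the secret, while a share that has been altered fails the check against the broadcast promises, which is the property that lets honest nodes detect a misbehaving dealer.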