How not to secure wireless sensor networks: A plethora of insecure polynomial-based key pre-distribution schemes

Three closely-related polynomial-based group key pre-distribution schemes have recently been proposed, aimed specifically at wireless sensor networks. The schemes enable any subset of a predefined set of sensor nodes to establish a shared secret key without any communications overhead. It is claimed that these schemes are both secure and lightweight, making them particularly appropriate for network scenarios where nodes have limited computational and storage capabilities. Further papers have built on these schemes, e.g. to propose secure routing protocols for wireless sensor networks. Unfortunately, as we show in this paper, all three schemes are completely insecure; whilst the details of their operation vary, they share common weaknesses. In every case we show that an attacker equipped with the information built into at most two sensor nodes can compute group keys for all possible groups of which the attacked nodes are not members, which breaks a fundamental design objective. The attacks can also be achieved by an attacker armed with the information from a single node together with a single group key to which this sensor node is not entitled. Repairing the schemes appears difficult, if not impossible. The existence of major flaws is not surprising given the complete absence of any rigorous proofs of security for the proposed schemes.


Introduction
In this paper we are concerned with the problem of enabling groups of nodes (such as exist in a Wireless Sensor Network (WSN)) to establish shared secret keys for use by any subset of the nodes, using only information distributed to the nodes in advance by a trusted Key Generation Centre (KGC). In particular we examine three separate (albeit closely related) protocols described as being designed for wireless sensor networks, although they have no particular features restricting their operation to this use case. Indeed, they would work just as effectively for any case where a number of hardware tokens, such as RFID tags or smart cards, are to be distributed by a single trusted party. The schemes we consider are designed to ensure that nodes which are part of the system, but not part of a selected group, cannot access the key for that group, i.e. nodes can only access keys for groups of which they are a part.
All three of the schemes we consider are polynomial-based. They appear to be inspired by the work of Blundo et al. [2]; however, unlike this prior art, the schemes all have major flaws. Unfortunately, many other polynomial-based group key distribution schemes have been shown to be flawed [7,9,10,11,12], meaning that the approach needs to be used with care. In any event, there are many existing schemes which achieve the same objectives in an efficient way and which have rigorous proofs of security; see, for example, Boyd et al. [3], and, of course, the previous work of Blundo et al. [2].
The remainder of this paper is structured as follows. Some preliminary remarks are made in §2. The 2015 Harn-Hsu protocol is described in §3, together with an attack against this scheme. The very closely-related 2015 Harn-Gong scheme is briefly considered in §4. A more recent scheme, the 2019 Albakri-Harn protocol, is described and analysed in §5. Other issues are discussed in §6, and brief conclusions are provided in §7.

A preliminary observation
All three of the schemes we describe involve arithmetic computed in the ring of integers $\mathbb{Z}_N$, where $N$ is an 'RSA modulus' [14], i.e. $N = pq$ for two large prime numbers $p$ and $q$. This means that addition and multiplication are computed modulo $N$. For today's RSA applications, $p$ and $q$ are typically chosen to be of the order of $2^{1024}$, and we make this assumption throughout. This ensures that with today's techniques and computing resources, factoring $N$ is infeasible (and this is necessary to ensure that the RSA cryptosystem is secure).
If $p$ and $q$ are of this size, then the probability that a random integer $s$ ($1 \le s < N$) is coprime to $N = pq$ is very close to 1. To see this, observe that $\phi(n)$, the Euler phi function that gives the number of positive integers less than $n$ that are coprime to $n$ (Menezes et al. [8], Definition 2.100), satisfies $\phi(pq) = (p - 1)(q - 1)$ for any distinct primes $p$, $q$ ([8], Fact 2.101). Thus the probability that a random positive integer less than $N = pq$ (where $p$ and $q$ are of the order of $2^{1024}$) is coprime to $N$ is
$$\frac{\phi(N)}{N-1} = \frac{(p-1)(q-1)}{pq-1} \approx 1 - \frac{1}{p} - \frac{1}{q} \approx 1 - 2^{-1023}.$$
Hence we assume throughout that all randomly chosen integers are coprime to $N$ and so have multiplicative inverses modulo $N$; moreover, such inverses can readily be computed ([8], Algorithm 2.142). That is, we can readily compute divisions modulo $N$, a fact that underlies all the attacks we describe.
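As a minimal illustration of division modulo $N$, the following Python sketch uses a toy modulus built from two small Mersenne primes. The parameter sizes and the helper names `inverse_mod` and `divide_mod` are our own choices for illustration and are not taken from any of the schemes; the three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
import math
import random
from fractions import Fraction

# Toy modulus from two known Mersenne primes (hypothetical parameters,
# far too small for real use; the text assumes primes near 2**1024).
p, q = 2**61 - 1, 2**89 - 1
N = p * q

def inverse_mod(s: int, n: int = N) -> int:
    """Multiplicative inverse of s modulo n (ValueError if gcd(s, n) != 1)."""
    return pow(s, -1, n)

def divide_mod(a: int, s: int, n: int = N) -> int:
    """Compute a / s modulo n, i.e. a times the inverse of s."""
    return a * inverse_mod(s, n) % n

rng = random.Random(0)
s = rng.randrange(1, N)
while math.gcd(s, N) != 1:      # practically never triggers
    s = rng.randrange(1, N)
a = rng.randrange(1, N)
quotient = divide_mod(a, s)     # satisfies quotient * s = a (mod N)

# Exact fraction of the integers 1..N-1 that are NOT coprime to N:
# (N - 1 - phi(N)) / (N - 1) = (p + q - 2) / (N - 1).
non_coprime_fraction = Fraction(p + q - 2, N - 1)
```

Even for these small primes the fraction of non-invertible residues is below $10^{-18}$, which is why the attacks below treat every value as invertible.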

The Harn-Hsu scheme
In 2015 Harn and Hsu [5] proposed a group key pre-distribution protocol for WSNs. Unfortunately, as we describe in detail below, the Harn-Hsu scheme is insecure. In particular, anyone with the shares belonging to one node can compute the keys for all possible groups, regardless of membership. The flaws giving rise to the attack are rather fundamental, meaning that it is difficult to imagine how the scheme could be rescued.

The protocol
We next describe the Harn-Hsu protocol [5]. The protocol involves a set of ℓ end nodes S = {S 1 , S 2 , . . . , S ℓ } and a Key Generation Centre (KGC). Note that, in this description and in the description of related protocols (in §4 and §5), the terminology and notation have been changed slightly from that used in the source papers to ensure consistency. As with all such protocols, the Harn-Hsu protocol has two phases, as follows.
• Share generation, a pre-distribution phase in which the KGC generates a set of $\ell - 1$ shares for each node, where each share $s_{i,j}(x)$ is a polynomial. It is assumed that the KGC distributes the shares to the nodes using a secure channel, e.g. at the time of manufacture/personalisation, and that each node stores its shares securely.
• Group key establishment, where all members of a subset of nodes independently generate a group key for use by nodes in that subset. Note that, if repeated with the same subset of nodes, the key establishment process will generate the same key, but a different key is generated for each node subset.
We next describe the operation of these two phases in greater detail.

Share generation
The KGC first chooses an 'RSA modulus' $N$, i.e. $N = pq$ where $p$ and $q$ are two primes chosen to be sufficiently large to make factoring $N$ infeasible. It is also assumed that each node $S_i$ has a unique identifier $ID_i$. It is (presumably) the case that the KGC keeps $p$ and $q$ secret, although, as we see below, this does not appear to make a significant difference to the security of the scheme.
The KGC next chooses a polynomial f over Z N of degree k for some prechosen k. We suppose that the polynomial coefficients are chosen uniformly at random from Z N . No explicit guidance is given on the choice of k, but later it is claimed that the scheme is secure if up to k nodes are captured and their secrets are revealed. We assume throughout that k ≥ 2.
For every S i (1 ≤ i ≤ ℓ) the KGC calculates a set of ℓ − 1 shares in the following way.
• The KGC first computes f (ID i ).
• The KGC then computes $u_{i,\ell-1}$ as
• Finally, the shares are computed as
Finally, the KGC equips each node $S_i \in S$ with the following:
• the values of $N$ and the node's own identifier $ID_i$;
• the identifiers $ID_j$ of all of the other nodes.

Group key establishment
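Suppose some subset $S' \subseteq S$ of sensors ($|S'| \ge 2$) wish to share a group key. The group key $K_{S'} \in \mathbb{Z}_N$ for $S'$ is
$$K_{S'} = f(0)^{\ell-|S'|} \prod_{S_j \in S'} f(ID_j) \bmod N.$$
Any node $S_i \in S'$ can compute $K_{S'}$ using its shares. However, a node $S_k \notin S'$ cannot use its shares to compute $K_{S'}$, at least not in the way described above.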
Finally note that it should be clear that the group key $K_S$ for the set of all sensors is simply
$$K_S = \prod_{i=1}^{\ell} f(ID_i) \bmod N.$$

Critical vulnerabilities
We describe below a simple insider attack which enables the discovery of all the group keys. This can be achieved using the shares held by any one of the nodes. The attack works regardless of the choice of the polynomial degree k. We also show that, even without access to a set of shares, knowledge of some keys enables others to be deduced.

Some simple deductions
We start with a simple but key observation where here, and in the remainder of §3, we suppose that the values z 1 , z 2 , . . . , z ℓ satisfy Proof By definition, we immediately have The result follows immediately from the definition of z r .
We next have a related result.
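Lemma 3.2 If $S' \subseteq S$ is some non-empty subset of the nodes, then the key $K_{S'}$ for the group $S'$ satisfies:
$$K_{S'} = f(0)^{\ell} \prod_{S_r \in S'} z_r \bmod N.$$
Proof By definition (and using the notation established in §3.1.2) we have
$$K_{S'} = f(0)^{\ell-|S'|} \prod_{S_r \in S'} f(ID_r) \bmod N,$$
and substituting $f(ID_r) = z_r f(0) \bmod N$ for each $S_r \in S'$ gives the result. This immediately gives the following.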

Corollary 3.3 The group key $K_S$ shared by all nodes satisfies
$$K_S = f(0)^{\ell} \prod_{r=1}^{\ell} z_r \bmod N.$$
We also have the following.

Completing the insider attack
First observe that, using Lemma 3.1, anyone possessing the shares belonging to a single sensor node can learn the value of $z_r$ for every $r$ ($1 \le r \le \ell$).
Next observe that, since a single set of shares enables recovery of the group key $K_S$ shared by all nodes, the attacker can obtain $f(0)^{\ell} \bmod N$ since it follows from Corollary 3.3 that:
$$f(0)^{\ell} = K_S \Big(\prod_{r=1}^{\ell} z_r\Big)^{-1} \bmod N.$$
Given knowledge of $f(0)^{\ell} \bmod N$ together with the complete set of values $z_1, z_2, \ldots, z_\ell$, Lemma 3.2 enables the computation of any group key $K_{S'}$. This completes the attack.
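The insider attack can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the scheme in [5]: we assume the group key has the form $f(0)^{\ell-|S'|}\prod_{S_r \in S'} f(ID_r) \bmod N$, that $z_r$ is defined by $f(ID_r) = z_r f(0) \bmod N$, and (purely to make the sketch self-contained) that a captured share is a scalar multiple $u f(x) \bmod N$ for an unknown $u$. The modulus is a toy value and all function names are hypothetical.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(2015)

ell, k = 6, 3                         # number of nodes, degree of f
ids = [rng.randrange(1, N) for _ in range(ell)]        # public identifiers
coeffs = [rng.randrange(1, N) for _ in range(k + 1)]   # secret coefficients

def f(x: int) -> int:
    """The KGC's secret polynomial, evaluated modulo N."""
    return sum(c * pow(x, e, N) for e, c in enumerate(coeffs)) % N

def group_key(members) -> int:
    """Reference key (ASSUMED form): f(0)^(ell-|S'|) * prod f(ID_r) mod N."""
    members = list(members)
    return pow(f(0), ell - len(members), N) * prod(f(ids[r]) for r in members) % N

# --- attacker's side: one captured share and the all-nodes key ----------
u = rng.randrange(1, N)               # scalar unknown to the attacker

def share(x: int) -> int:
    """Captured share, ASSUMED here to equal u * f(x) mod N."""
    return u * f(x) % N

# z_r = share(ID_r) / share(0) = f(ID_r) / f(0); the unknown u cancels.
z = [share(ID) * pow(share(0), -1, N) % N for ID in ids]
K_all = group_key(range(ell))         # stand-in: any node can derive K_S
f0_ell = K_all * pow(prod(z) % N, -1, N) % N            # equals f(0)^ell

def forged_key(members) -> int:
    """Group key rebuilt from f(0)^ell and the z_r values alone."""
    return f0_ell * prod(z[r] for r in members) % N
```

Under these assumptions `forged_key` agrees with `group_key` for every subset, including subsets that exclude the node whose share was captured, and the degree $k$ plays no role.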

An outsider attack
Even if an attacker does not possess any of the shares, then attacks are still possible if the attacker has access to some of the keys K S ′ generated using the system. We give a simple example of an attack, but many other variants are possible.
Suppose an attacker has access to three group keys: $K_{S_1}$, $K_{S_2}$ and $K_{S_3}$, for groups $S_1$, $S_2$ and $S_3$, where the attacker also knows the membership of these groups. Suppose also (to make the discussion simpler) that $S_1 = S_2 \cup \{S_y\}$ for some node $S_y$. Then, by Corollary 3.4, the attacker can immediately compute $z_y$.
If $S_y \in S_3$, then the attacker can now compute
$$K_{S_3 \setminus \{S_y\}} = z_y^{-1} K_{S_3} \bmod N.$$
That is, the attacker can compute another valid group key.
This attack works because there are simple algebraic relationships between the keys.
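The algebraic relationship exploited here can be checked numerically. The sketch below again assumes the key form $f(0)^{\ell-|S'|}\prod_{S_r \in S'} f(ID_r) \bmod N$; the toy modulus, group choices and names are hypothetical. The outsider sees only three keys and their memberships.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(7)

ell, k = 6, 3
ids = [rng.randrange(1, N) for _ in range(ell)]
coeffs = [rng.randrange(1, N) for _ in range(k + 1)]

def f(x: int) -> int:
    """The KGC's secret polynomial, evaluated modulo N."""
    return sum(c * pow(x, e, N) for e, c in enumerate(coeffs)) % N

def group_key(members) -> int:
    """Reference key (ASSUMED form): f(0)^(ell-|S'|) * prod f(ID_r) mod N."""
    members = list(members)
    return pow(f(0), ell - len(members), N) * prod(f(ids[r]) for r in members) % N

# Keys the outsider is assumed to have observed, with known memberships.
y = 4
group_2 = [0, 1]
group_1 = group_2 + [y]              # group 1 is group 2 plus node y
group_3 = [2, 4, 5]                  # a third group that contains node y
K1, K2, K3 = group_key(group_1), group_key(group_2), group_key(group_3)

z_y = K1 * pow(K2, -1, N) % N        # ratio of the two related keys
derived = K3 * pow(z_y, -1, N) % N   # key for group 3 with node y removed
```

No share and no knowledge of $f$ is used on the attacker's side; two divisions modulo $N$ suffice.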

The Harn-Gong scheme
In 2015, the same year in which the Harn-Hsu paper appeared, Harn and Gong [4] presented another group key pre-distribution scheme. It would appear that this paper was actually submitted a few months earlier than the Harn-Hsu paper. However, we consider it after Harn-Hsu as the scheme it describes is essentially just a special case of the Harn-Hsu scheme.
Like the Harn-Hsu scheme, a KGC distributes shares to each of a set of ℓ nodes. The KGC first chooses an RSA modulus N and a polynomial f (x) over Z N of degree k for some pre-chosen k. The single share sent to each user S i is computed as Using the notation of §3, it should be clear that this is simply a special case of the Harn-Hsu scheme where the set of ℓ − 1 shares for a user are all chosen to be equal, i.e. where u i,1 = u i,2 = . . . = u i,ℓ−1 and hence . All other aspects of the scheme are identical.
Since it is just a special case of the Harn-Hsu scheme, precisely the same attacks work, and hence we do not consider the scheme further here.

The Albakri-Harn scheme
Four years after the Harn-Hsu paper appeared, in 2019 Albakri and Harn [1] proposed yet another group key pre-distribution protocol for WSNs. Three variants of the protocol are described, a basic scheme, which has a heavy storage overhead, and two derived schemes which use the same underlying idea but reduce the storage requirement for individual nodes. Given its fundamental role, we focus here on the basic scheme.
At first sight the scheme is somewhat different to the two schemes we have examined so far. However, more careful analysis reveals that it is again very closely related. Not surprisingly, the Albakri-Harn scheme is also insecure.
In particular, if two nodes collude, or if one node gains access to a single key to which it is not entitled, then all keys for all groups, regardless of membership, can be computed.

The protocol
We describe the Basic scheme, [1]. The protocol involves a set of ℓ sensor nodes S = {S 1 , S 2 , . . . , S ℓ } and the KGC. The protocol has two phases.
• Token generation, a pre-distribution phase in which the KGC generates a token T i for each node S i ∈ S, 1 ≤ i ≤ ℓ. As previously, it is assumed that the KGC distributes the tokens to the nodes using a secure channel, e.g. at the time of manufacture/personalisation.
• Group key establishment, where all members of a subset of nodes independently generate a group key for use by nodes in that subset.
We next describe the operation of these two phases in greater detail.

Token generation
Again as before, the KGC first chooses an 'RSA modulus' $N$, i.e. $N = pq$ where $p$ and $q$ are two primes chosen to be sufficiently large to make factoring $N$ infeasible. It is also assumed that each node $S_i$ has a unique identifier $ID_i$. The KGC next chooses a set of $\ell$ univariate polynomials $F = \{f_1, f_2, \ldots, f_\ell\}$ over $\mathbb{Z}_N$, each of degree $t-1$ for some $t$. In the absence of further information, we suppose here that the coefficients of these polynomials are chosen uniformly at random from $\mathbb{Z}_N$. No explicit guidance is given on the choice of $t$, but later it is claimed that the scheme is secure if up to $t - 1$ nodes are captured (and presumably their secrets are revealed), and that the choice of $t$ is a trade-off between the computational complexity of key establishment and the security of the scheme. We assume throughout that $t \ge 2$, since if $t = 1$ all polynomials are of degree zero and all tokens (and group keys) are identical.
For every $i$ ($1 \le i \le \ell$) the KGC calculates the token $T_i$ as the following polynomial in $\ell - 1$ variables:
$$T_i = f_i(ID_i) \prod_{j \ne i} f_j(x_j) \bmod N.$$
Finally, the KGC equips each node $S_i \in S$ with the following:
• the token $T_i$;
• the values of $N$ and the node's own identifier $ID_i$;
• the identifiers $ID_j$ of all of the other nodes.
It is important to note that some of the steps above are based on the author's interpretation of the Albakri-Harn paper, [1], as many details are left unclear.

Group key establishment
Suppose some subset $S' \subseteq S$ of sensors ($|S'| \ge 2$) wish to share a group key. The group key $K_{S'} \in \mathbb{Z}_N$ for $S'$ is defined to be:
$$K_{S'} = \prod_{S_j \in S'} f_j(ID_j) \prod_{S_j \notin S'} f_j(0) \bmod N.$$
Any of the nodes $S_i \in S'$ can compute $K_{S'}$ by evaluating the token $T_i \bmod N$ for a particular choice of the indeterminates $x_j$, namely by setting $x_j = ID_j$ if $S_j \in S'$ and $x_j = 0$ otherwise. However, a node $S_k \notin S'$ cannot use its token $T_k$ to compute $K_{S'}$, at least not in the way described above.
Finally note that it should be clear that the group key $K_S$ for the set of all sensors is simply
$$K_S = \prod_{i=1}^{\ell} f_i(ID_i) \bmod N.$$
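Key establishment by token evaluation can be illustrated as follows. The sketch assumes the token has the form $T_i = f_i(ID_i)\prod_{j \ne i} f_j(x_j) \bmod N$ and models it as a function of the chosen values $x_j$ rather than as an expanded polynomial; the toy modulus and the names `token_eval` and `node_key` are hypothetical.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(2019)

ell, t = 5, 3                         # number of nodes; degree is t - 1
ids = [rng.randrange(1, N) for _ in range(ell)]
F = [[rng.randrange(1, N) for _ in range(t)] for _ in range(ell)]  # F[r][s] = f_rs

def f(r: int, x: int) -> int:
    """Polynomial f_r evaluated modulo N."""
    return sum(c * pow(x, s, N) for s, c in enumerate(F[r])) % N

def token_eval(i: int, xs: dict) -> int:
    """Token of node i (ASSUMED form) evaluated at x_j = xs[j], j != i."""
    return f(i, ids[i]) * prod(f(j, xs[j]) for j in range(ell) if j != i) % N

def node_key(i: int, members: set) -> int:
    """Value node i obtains: x_j = ID_j for group members, 0 otherwise."""
    xs = {j: (ids[j] if j in members else 0) for j in range(ell) if j != i}
    return token_eval(i, xs)

group = {0, 2, 3}
member_keys = [node_key(i, group) for i in sorted(group)]
outsider_value = node_key(1, group)   # node 1 is not in the group
```

Every member obtains the same value, whereas the non-member's evaluation carries the factor $f_1(ID_1)$ in place of $f_1(0)$ and so differs from the key.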

An observation
It was stated above that this scheme is closely related to the Harn-Hsu scheme. This is not immediately apparent, as the Harn-Hsu scheme involves a node being given a set of shares each of which is a univariate polynomial, and in the Albakri-Harn scheme a sensor node is given a token consisting of a single multivariate polynomial. However, this token is actually analogous to the product of the Harn-Hsu shares, in the case where the share polynomials have distinct indeterminates (although the polynomials f i in Albakri-Harn are all distinct). Armed with this insight, the protocol then works in an essentially identical way to Harn-Hsu. It is therefore not surprising that, as we discuss below, closely analogous attacks apply.

Stage 1: Partial polynomial recovery
In the discussions below we need notation for the coefficients of the polynomials in $F$, and hence, recalling that all these polynomials have degree $t - 1$, we suppose that
$$f_r(x) = \sum_{s=0}^{t-1} f_{rs} x^s, \quad 1 \le r \le \ell.$$
We first consider what a single node $S_i$ can learn about the polynomials in $F$ from a single token $T_i$. It follows immediately from the definition that $T_i$ consists of the sum of all terms of the form
$$f_i(ID_i) \prod_{j \ne i} f_{jk_j} x_j^{k_j} \bmod N,$$
where $0 \le k_j \le t - 1$ for every $j$. That is, $S_i$ will know the value of
$$f_i(ID_i) \prod_{j \ne i} f_{jk_j} \bmod N$$
for every choice of the exponents $k_j$. Given these observations, for any $r$ ($1 \le r \le \ell$, $r \ne i$) and any $s$ ($1 \le s \le t - 1$), and using the known coefficients of the polynomial $T_i$, it is simple to compute the ratio of two such coefficients, namely
$$\frac{f_i(ID_i) \prod_{j \ne i} f_{jk_j}}{f_i(ID_i) \prod_{j \ne i} f_{jk'_j}} \bmod N,$$
where $k_j = k'_j$ for every $j$ except $j = r$, for which $k_r = s$ and $k'_r = 0$. In this case all the identical terms cancel, and the above expression equals $f_{rs}/f_{r0} \bmod N$.
That is, for every polynomial $f_r$ ($r \ne i$), the ratio of each coefficient in $f_r$ to the constant coefficient $f_{r0}$ can be computed. Hence $S_i$ can readily discover the values $w_1, w_2, \ldots, w_{t-1}$ where $w_s = f_{rs}/f_{r0} \bmod N$. Moreover, given $ID_r$ (which is known to all nodes), $S_i$ can use the above to deduce that $f_r(ID_r) = z_r f_{r0}$ for some $z_r$ known to $S_i$, namely $z_r = 1 + \sum_{s=1}^{t-1} w_s ID_r^s \bmod N$, for every $r \ne i$.
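The coefficient-ratio computation can be made concrete with a small Python sketch. It assumes the token form $T_i = f_i(ID_i)\prod_{j \ne i} f_j(x_j) \bmod N$ and stores the expanded token as a table mapping exponent tuples to coefficients; this representation, the toy modulus and the function names are our own assumptions.

```python
import random
from itertools import product

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(1)

ell, t = 4, 3
ids = [rng.randrange(1, N) for _ in range(ell)]
F = [[rng.randrange(1, N) for _ in range(t)] for _ in range(ell)]  # F[r][s] = f_rs

def f(r: int, x: int) -> int:
    """Polynomial f_r evaluated modulo N."""
    return sum(c * pow(x, s, N) for s, c in enumerate(F[r])) % N

def token_coefficients(i: int):
    """KGC side: coefficient table of T_i (ASSUMED form), keyed by exponents."""
    others = [j for j in range(ell) if j != i]
    lead = f(i, ids[i])
    table = {}
    for exps in product(range(t), repeat=len(others)):
        c = lead
        for j, k_j in zip(others, exps):
            c = c * F[j][k_j] % N
        table[exps] = c
    return others, table

def recover_z(others, table):
    """Node side: recover z_r for every r != i from the table alone."""
    zero = (0,) * len(others)
    base_inv = pow(table[zero], -1, N)
    z = {}
    for pos, r in enumerate(others):
        z_r = 1
        for s in range(1, t):
            exps = tuple(s if m == pos else 0 for m in range(len(others)))
            w_s = table[exps] * base_inv % N       # equals f_rs / f_r0
            z_r = (z_r + w_s * pow(ids[r], s, N)) % N
        z[r] = z_r
    return z

others, table = token_coefficients(0)
z = recover_z(others, table)
```

Note that `recover_z` touches only the public identifiers and the token's own coefficients; the unknown factor $f_i(ID_i)$ cancels in every ratio, which is also why $z_i$ itself is not obtained.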

Stage 2: Pair-wise collusion to complete the attack
We next describe how, if two nodes collude, e.g. by sharing their tokens, they can completely break the system; that is they will have the means to readily compute every possible group key, including for groups excluding both of them. That is, as is the case for all three schemes examined, the system is completely insecure if just two nodes collude, regardless of the choice of t.
We first need the following simple result (analogous to Lemma 3.2), which uses the notation of §5.1.

Lemma 5.1
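Suppose the values $z_1, z_2, \ldots, z_\ell$ satisfy $f_r(ID_r) = z_r f_{r0} \bmod N$ for every $r$ ($1 \le r \le \ell$). If $S' \subseteq S$ is some non-empty subset of the nodes, the key $K_{S'}$ for the group $S'$ satisfies:
$$K_{S'} = \prod_{v=1}^{\ell} f_{v0} \prod_{S_r \in S'} z_r \bmod N.$$
Proof By definition we have
$$K_{S'} = \prod_{S_r \in S'} f_r(ID_r) \prod_{S_r \notin S'} f_r(0) \bmod N.$$
We assumed that $f_r(ID_r) = z_r f_{r0} \bmod N$ for every $r$, and it trivially holds that $f_r(0) = f_{r0}$ for every $r$. Hence
$$K_{S'} = \prod_{S_r \in S'} z_r f_{r0} \prod_{S_r \notin S'} f_{r0} \bmod N.$$
The result now follows by re-arranging the products.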
The following corollary (analogous to Corollary 3.3) is immediate.

Corollary 5.2 The group key $K_S$ shared by all nodes satisfies
$$K_S = \prod_{v=1}^{\ell} f_{v0} \prod_{r=1}^{\ell} z_r \bmod N.$$
We now observe that, from the arguments in §5.2.1, if any two users collude then they can learn the complete set of values z 1 , z 2 , . . . , z ℓ satisfying f r (ID r ) = z r f r0 mod N , 1 ≤ r ≤ ℓ.
Next observe that, since they can both readily compute the group key $K_S$ shared by all nodes, they can obtain $\prod_{v=1}^{\ell} f_{v0} \bmod N$ since it follows from Corollary 5.2 that:
$$\prod_{v=1}^{\ell} f_{v0} = K_S \Big(\prod_{r=1}^{\ell} z_r\Big)^{-1} \bmod N.$$
Armed with knowledge of $\prod_{v=1}^{\ell} f_{v0} \bmod N$ together with the complete set of values $z_1, z_2, \ldots, z_\ell$, the colluding nodes can use Lemma 5.1 to immediately compute any group key $K_{S'}$, regardless of whether either of the colluding nodes is a member of the group $S'$. This completes the attack.
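The collusion step is sketched below in Python. To keep the example short, the values $z_r$ that each node derives from its token are produced by a stand-in function `known_z` that reads the secret polynomials directly; this shortcut, the toy modulus and all names are assumptions made for illustration. The group key follows the definition $\prod_{S_r \in S'} f_r(ID_r)\prod_{S_r \notin S'} f_r(0) \bmod N$.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(52)

ell, t = 6, 3
ids = [rng.randrange(1, N) for _ in range(ell)]
F = [[rng.randrange(1, N) for _ in range(t)] for _ in range(ell)]  # F[r][s] = f_rs

def f(r: int, x: int) -> int:
    """Polynomial f_r evaluated modulo N."""
    return sum(c * pow(x, s, N) for s, c in enumerate(F[r])) % N

def group_key(members: set) -> int:
    """Reference key: f_r(ID_r) for members, f_r(0) for non-members."""
    return prod(f(r, ids[r]) if r in members else f(r, 0) for r in range(ell)) % N

def known_z(i: int) -> dict:
    """Stand-in for Stage 1: the z_r (r != i) that node i learns from T_i."""
    return {r: f(r, ids[r]) * pow(F[r][0], -1, N) % N
            for r in range(ell) if r != i}

# Nodes 0 and 1 pool what they know: together they hold every z_r.
z = {**known_z(0), **known_z(1)}
K_all = group_key(set(range(ell)))    # both colluders can compute K_S
prod_f0 = K_all * pow(prod(z.values()) % N, -1, N) % N

def forged_key(members: set) -> int:
    """Lemma 5.1 applied with the recovered product and the z_r values."""
    return prod_f0 * prod(z[r] for r in members) % N

target = {2, 3, 4}                    # a group excluding both colluders
```

The forged key for `target` matches the genuine one even though neither colluding node belongs to that group, and nothing in the computation depends on $t$.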

An alternative to collusion
Just as previously, even in this case it remains possible to completely break the system if a node (by some means) learns a single key for a group of which the node is not a member. Suppose node $S_i$ knows the key $K_{S'}$ for the group $S'$, where $S_i \notin S'$. By Lemma 5.1, $S_i$ knows that:
$$\prod_{v=1}^{\ell} f_{v0} = K_{S'} \Big(\prod_{S_r \in S'} z_r\Big)^{-1} \bmod N,$$
where every $z_r$ on the right-hand side is known to $S_i$ because $S_i \notin S'$. Knowledge of $\prod_{v=1}^{\ell} f_{v0} \bmod N$ together with the (almost) complete set of values $z_1, z_2, \ldots, z_\ell$ (except for $z_i$) enables $S_i$ to now compute any group key $K_{S''}$ for any group $S''$ for which $S_i \notin S''$. This completes the attack.
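The single-node variant differs from the collusion attack only in how the product of constant coefficients is obtained. In the following Python sketch the values $z_r$ ($r \ne i$) are again supplied by a stand-in that reads the secret polynomials directly, and the leaked key is simply computed from the reference definition; both shortcuts, the toy modulus and the names are assumptions for illustration.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(53)

ell, t = 6, 3
ids = [rng.randrange(1, N) for _ in range(ell)]
F = [[rng.randrange(1, N) for _ in range(t)] for _ in range(ell)]  # F[r][s] = f_rs

def f(r: int, x: int) -> int:
    """Polynomial f_r evaluated modulo N."""
    return sum(c * pow(x, s, N) for s, c in enumerate(F[r])) % N

def group_key(members: set) -> int:
    """Reference key: f_r(ID_r) for members, f_r(0) for non-members."""
    return prod(f(r, ids[r]) if r in members else f(r, 0) for r in range(ell)) % N

i = 0                                 # the attacking node
# Stand-in for Stage 1: node i knows z_r for every r != i, but not z_i.
z = {r: f(r, ids[r]) * pow(F[r][0], -1, N) % N for r in range(ell) if r != i}

leaked_group = {1, 2}                 # node i is not a member
K_leaked = group_key(leaked_group)    # the one key node i should not have

# Rearranged Lemma 5.1: product of all f_v0 from the leaked key.
prod_f0 = K_leaked * pow(prod(z[r] for r in leaked_group) % N, -1, N) % N

def forged_key(members: set) -> int:
    """Key for any group that does not contain node i."""
    return prod_f0 * prod(z[r] for r in members) % N

target = {3, 4, 5}
```

Groups containing node $i$ need no attack at all, since the node can compute those keys legitimately from its token.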

A simplified attack
Using the same notation as before, we now present an even simpler version of the attack approaches described in §5.2. This attack does not seek to recover information about the polynomials in F, but instead recovers just sufficient information to be able to recover all group keys. We use the notation established in §5.2.

Learning a set of ratios
We start with another simple observation, analogous to Corollary 3.4.
Proof By Lemma 5.1, Similarly, by Corollary 5.2, The result follows immediately.
From Lemma 5.3, it follows that $S_i$ can learn the value of $z_r$ for every $r \ne i$ ($1 \le r \le \ell$).

Completing the attack
Armed with the values of $z_r$ for every $r \ne i$ ($1 \le r \le \ell$), the attacks described in §5.2.2 and §5.2.3 now work exactly as previously described.
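One way to realise the simplified attack is for node $S_i$ to evaluate its own token twice, once for the group of all nodes and once for the group with a single node $S_r$ removed, and to divide the results. The Python sketch below illustrates this reading; it assumes the token form $T_i = f_i(ID_i)\prod_{j \ne i} f_j(x_j) \bmod N$, and the toy modulus and function names are hypothetical.

```python
import random
from math import prod

p, q = 2**61 - 1, 2**89 - 1          # toy primes (hypothetical parameters)
N = p * q
rng = random.Random(54)

ell, t = 5, 3
ids = [rng.randrange(1, N) for _ in range(ell)]
F = [[rng.randrange(1, N) for _ in range(t)] for _ in range(ell)]  # F[r][s] = f_rs

def f(r: int, x: int) -> int:
    """Polynomial f_r evaluated modulo N."""
    return sum(c * pow(x, s, N) for s, c in enumerate(F[r])) % N

def token_eval(i: int, xs: dict) -> int:
    """Token of node i (ASSUMED form) evaluated at x_j = xs[j], j != i."""
    return f(i, ids[i]) * prod(f(j, xs[j]) for j in range(ell) if j != i) % N

def node_key(i: int, members: set) -> int:
    """Value node i obtains: x_j = ID_j for group members, 0 otherwise."""
    xs = {j: (ids[j] if j in members else 0) for j in range(ell) if j != i}
    return token_eval(i, xs)

def ratio_z(i: int, r: int) -> int:
    """z_r as the ratio of two keys node i is entitled to compute."""
    everyone = set(range(ell))
    K_all = node_key(i, everyone)
    K_without_r = node_key(i, everyone - {r})
    return K_all * pow(K_without_r, -1, N) % N

i = 0
z = {r: ratio_z(i, r) for r in range(ell) if r != i}
```

No coefficient of the token needs to be inspected: two legitimate key computations per value of $r$ are enough to obtain every $z_r$ with $r \ne i$.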

Other observations

But what about the security analyses?
On the face of it, the above analyses are rather surprising in the light of apparently robust claims made in the Harn-Hsu [5], Harn-Gong [4], and Albakri-Harn [1] papers. In all three papers there are apparently 'theorems' proving the security of the respective schemes. However, in every case closer examination reveals that the 'proofs' are not in any way rigorous arguments.
For example, Theorem 1 of Albakri and Harn [1] states that 'The adversaries cannot obtain any information of secret polynomials selected by KGC'. The 'secret polynomials' referred to here are the polynomials in F, and the set of possible adversaries includes parties with knowledge of one or more tokens T i . However, we have shown in §5.2 that knowledge of a single token is sufficient to learn all but one of the polynomials in F up to a constant factor. How can this be, given Theorem 1? Examination of the 'proof' of Theorem 1 reveals that it is by no means a rigorous proof; it is more a series of unsubstantiated claims. For example, the proof starts with the following statement: 'Capturing one sensor -It is obvious that by capturing any single sensor S i , and obtaining the token T i , the adversary cannot recover information of any individual polynomial f i , nor the product of all individual polynomials'. That is, the 'proof' of the main claim seems to amount to a statement that it is 'obvious'. Sadly, the claimed result is clearly not as obvious as the authors hoped.
More seriously, this highlights the need for newly proposed cryptographic protocols to be provided with robust and rigorous evidence of their security. Indeed, this has been the state of the art for a couple of decades, as has been very widely discussed; see, for example, Boyd et al. [3].

Three almost identical schemes
Quite apart from the lack of security, it is unfortunate that three such similar schemes have been published separately, given that all three papers we considered ([1,4,5]) share one author. Moreover, the authors make no attempt to explain the close relationships between the three schemes.

An application
To make matters worse, some authors have sought to build broader security schemes on top of one of the schemes considered here. For example, Harn et al. [6] describe a secure routing protocol for WSNs, of which the Harn-Hsu scheme [5] forms an integral part; indeed, for some reason the authors have chosen to describe the Harn-Hsu scheme again in some detail. This, of course, means that the routing protocol, regardless of its design, is inherently insecure. It would, of course, have been good design practice to describe the routing protocol in terms of a 'black box' technique for key pre-distribution, and then to mention possible candidates for this black box, since there is no inherent reason to couple the two techniques. This would have avoided the main problem.

Summary and conclusions
As we have demonstrated, the Harn-Hsu, Harn-Gong and Albakri-Harn schemes all possess fundamental flaws. Given the nature of these flaws, it is difficult to imagine how the schemes could be rescued. Indeed, there is no reason to believe that a secure scheme can be designed using the underlying approach adopted in all three cases. As observed in §1, many related polynomial-based group key distribution schemes have been shown to be flawed [7,9,10,11,12]. Again as observed above, there are many existing schemes which achieve the same objectives in an efficient way and which have rigorous proofs of security; see, for example, Boyd et al. [3] and Blundo et al. [2].
Fundamentally, the fact that the authors have not provided rigorous proofs of security for the various schemes means that attacks such as those described here remain possible. It would have been more prudent to follow established wisdom and only publish a scheme of this type if a rigorous security proof had been established. Similar remarks apply to the all-too-often misconceived attempts to fix broken schemes, unless a proof of security can be devised for a revised scheme. Achieving this seems very unlikely for variants of the schemes we have examined.