Algebraic secret sharing using privacy homomorphisms for IoT-based healthcare systems

Abstract: The healthcare industry is one of the promising fields adopting Internet of Things (IoT) solutions. In this paper, we study secret sharing mechanisms towards resolving privacy and security issues in IoT-based healthcare applications. In particular, we show how multiple sources can share their data amongst a group of participants without revealing their own data to one another or to the dealer. Only an authorised subset of participants is able to reconstruct the data. A collusion of fewer participants has no better chance of guessing the private data than a non-participant who has no shares at all. To realise this system, we introduce novel research on secret sharing in the encrypted domain. In the modern healthcare industry, a patient's health record often contains data acquired from various sensor nodes. In order to protect information privacy, the data from the sensor nodes is encrypted at once and shared among a number of cloud servers of medical institutions via a gateway device. The complete health record can be retrieved for diagnosis only if the number of presented shares meets the access policy. The retrieval procedure does not involve decryption and therefore the scheme is favourable in time-sensitive circumstances such as a surgical emergency. We analyse the pros and cons of several possible solutions and develop practical secret sharing schemes for IoT-based healthcare systems.


Introduction
The Internet of Things (IoT) is an emerging technology that utilises cloud-connected devices to collect data for analysis. The healthcare industry is one of the most promising fields to have adopted IoT solutions since its early stage. The development of wearable technology, wireless body area networks and cloud computing has established a new way for medical practitioners to acquire health data from patients. It greatly benefits health monitoring, epidemiological studies, and pharmaceutical research [1][2][3][4][5]. A common IoT-based architecture for healthcare applications is illustrated in Figure 1, which consists of a gateway device, a cloud server and several sensor nodes. Each sensor node can be viewed as a piece of wearable equipment used for monitoring an individual's health status, such as heart rate, blood pressure, brain waves, or glucose level. Under the given framework, the sensor nodes send the medical data to a local gateway device via wireless communication such as Wi-Fi or Bluetooth, whereas the gateway device aggregates the data and stores it in the cloud server for further analysis. However, there are risks of information leakage during data transmission and storage. For example, an adversary may attempt to eavesdrop on the wireless communication, attack the gateway device or even gain access to the cloud server. Therefore, it is advisable to encrypt the data at each sensor node immediately after it is produced and to incorporate secret sharing schemes to realise access control. In more detail, each sensor node transmits the encrypted data to the gateway device, which integrates the data and encodes it into shares of information. Due to security concerns, these shares are stored in separate cloud servers and data retrieval must conform to the access policy. To realise this system, we present novel research on secret sharing in the encrypted domain.
Secret sharing is a field of cryptography originated independently by Blakley [6] and Shamir [7] in 1979. A secure (t, n)-threshold scheme splits a secret message into n pieces of information in such a way that any fewer than t ≤ n pieces reveal no information about the secret. Only in the presence of t or more pieces can the secret be determined. Each piece of information is generally called a 'share' (in Shamir's terminology) or a 'shadow' (in Blakley's terminology). An intuitive way of splitting a secret message, say, 'password', is to split it literally into shares: 'pa------', '--ss----', '----wo--', and '------rd'. This naïve approach is, however, insecure in the sense that every share leaks a part of the secret. Shamir proposed an elegant solution to share the secret in a secure manner. Suppose that a dealer wants to share a secret among n participants in such a way that the secret can be reconstructed only when t or more participants pool their shares together. Let the secret be denoted by s and let us generate t − 1 random numbers denoted by $r_1, r_2, \ldots, r_{t-1}$. Then, we form a polynomial
$$f(x) = s + r_1 x + r_2 x^2 + \cdots + r_{t-1} x^{t-1} \pmod{P}, \tag{1.1}$$
where P is a randomly chosen prime number. Let us draw any n points from the polynomial, for example, (1, f(1)), (2, f(2)), ..., and (n, f(n)), and distribute them to the n participants respectively as shares.
It is observed that there are t unknown coefficients in the polynomial and thus, with t different points, one is able to solve for them, including the secret (i.e. the constant term). In other words, the reconstruction process simply uses Lagrange interpolation to solve a set of t simultaneous equations. Shamir's scheme is algebraic in nature, in contrast to Blakley's scheme, which is based on geometric structures. As a toy example of Blakley's construction, consider the secret as a point in a three-dimensional space and the shadows as hyperplanes whose common intersection is the secret point. Any three of the planes suffice to identify the point. As each successive shadow is exposed, however, the range of possible values of the secret narrows. Since the introduction of secret sharing, numerous extended problems have appeared. The study of general access structures was initiated by Ito, Saito, and Nishizeki [8] and has been a principal line of research since then [9][10][11][12]. To manage various malicious behaviours by dishonest parties, the notion of verifiable secret sharing was introduced by Chor et al. [13] and has been studied extensively thereafter [14][15][16][17][18][19]. Another closely related branch is visual cryptography, originated by Naor and Shamir [20] for the secrecy of visual information, including greyscale, colour, and halftone images [21][22][23].
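To make the polynomial construction and the interpolation step concrete, the following is a minimal sketch of a Shamir-style (t, n)-threshold scheme in Python. The prime, the secret value and the function names `make_shares` and `reconstruct` are illustrative assumptions rather than anything prescribed by the scheme.

```python
import secrets

# Illustrative prime modulus (the Mersenne prime 2^61 - 1); any prime larger
# than the secret and the number of participants would do.
P = 2**61 - 1

def make_shares(secret, t, n, prime=P):
    """Split `secret` into n shares such that any t of them recover it."""
    # f(x) = secret + r_1 x + ... + r_{t-1} x^{t-1} (mod prime)
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):      # Horner's rule
            acc = (acc * x + c) % prime
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares, prime=P):
    """Lagrange interpolation at x = 0 yields the constant term (the secret)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret

shares = make_shares(123456789, t=3, n=5)
recovered = reconstruct(shares[:3])     # any three shares suffice
```

Fewer than t shares leave the constant term undetermined, which is the property the threshold definition above requires.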
Over the past decades, secret sharing schemes have found various applications. One possible application in the healthcare industry is to protect the privacy of patients' health records against cybersecurity threats while allowing efficient access for a group of authorised physicians and surgeons. In IoT-based healthcare applications, a patient's health record often contains medical data acquired from different sensor nodes. A full measure of privacy protection ought even to prevent data revelation between sensor nodes. More generally, we consider the problem of sharing multiple secrets (i.e. health data) generated from t different sources (i.e. sensor nodes) amongst a society of n participants (i.e. cloud servers of medical institutions). Every secret is prohibited from being revealed to another source as well as to the dealer (i.e. gateway device). We remark that this statement can also be applied to other mutually distrustful situations, especially in the case of commercial applications. This problem, though different, is similar to that studied by Ingemarsson and Simmons [24]. In their study, they noted that the problem of sharing a secret in the absence of a trusted dealer had been largely ignored by researchers in this area. In response, they introduced a two-level control protocol to share a secret determined by a democratic consent scheme without mutually trusted parties. Each participant contributes equally a private input to the determination of the secret and distributes the contribution among the other participants through an autocratic sharing scheme.
In the following, let us consider two naïve solutions to our research problem and analyse their pros and cons. Among a variety of privacy protection mechanisms, encryption has a high level of reliability and universality. Naturally, the secrets are encrypted once they have been produced by the sources. The problem is therefore reduced to the sharing of encrypted data. Consider a key server that has a pair of public and private keys. The public key is used for encryption, whereas the private key is employed for decryption. The first solution is to create shares of the private key by arranging the key as the constant term in Eq. (1.1). The encrypted files, instead of being encoded as shares, are stored in a database. Once the required number of participants collaborate, the private key is reconstructed and the files in the database can then be deciphered. On the one hand, this solution is simple and the computational load of the sharing procedure is light. On the other hand, to access the secret files, one must run one reconstruction algorithm for the key plus one decryption algorithm for the files. In addition, this scheme requires different pairs of public and private keys for different sets of secret files (e.g. different patients' health records); otherwise, once the participants reconstruct the private key, they will be able to decipher all the files stored in the database. Furthermore, storing all the important files in a central database may leave them vulnerable to a number of cyber attacks. Thus, it is reasonable to share the files among authorised participants to reduce the risk of cyber threats.

Mathematical Biosciences and Engineering, Volume 16, Issue 5, 3367-3381

In the second solution, suppose there are t encrypted secrets denoted by $E(s_0), E(s_1), \ldots, E(s_{t-1})$. We form a polynomial by arranging the t encrypted secrets as the t coefficients in Eq. (1.1). More generally, we can assume that there are k encrypted secrets, where k ≤ t, and choose t − k random numbers as the rest of the coefficients to complete the polynomial. Either way, we can draw n points as the shares for individual participants. In the presence of t shares or more, the encrypted data can be reconstructed. With the decryption key, the data will eventually be revealed. In practice, this scheme has a non-trivial issue of key distribution amongst the participants. It may be addressed by one of the following approaches. First, use a secure channel to transmit the key to individual participants. Second, let the pair of encryption and decryption keys be generated by a key agreement protocol (e.g. the Diffie-Hellman key exchange protocol [25]) amongst the group of participants, instead of being generated by the key server. Third, encrypt the key with each participant's public key and send it to the corresponding participant as an instance of asymmetric cryptography (e.g. elliptic curve cryptosystems [26]). Aside from the issue of key distribution, this scheme still requires extra effort from participants, namely, one reconstruction step for the encrypted data plus one decryption step for the original data. This may be troublesome in particular situations. For instance, when there is a surgical emergency, the time delay for accessing health records becomes problematic. Hence, we conclude that these naïve solutions, though feasible, are deficient in several aspects, which motivates us to seek finer constructions.
In this paper, we study how multiple sources can share their secrets amongst a group of n participants without revealing their own secrets to one another. We analyse the pros and cons of several possible solutions and develop practical schemes: a simple (2, 2)-threshold scheme, an extended (n, n)-threshold scheme, and a generalised (t, n)-threshold scheme. The developed schemes follow Shamir's construction, in which a collusion of fewer than t participants has no better chance of guessing the secret than a non-participant who has no privileged information at all. The remainder of this paper is organised as follows. Section 2 gives the preliminaries of privacy homomorphisms. Section 3 reviews a naïve (2, 2)-threshold scheme. Section 4 discusses an extended (n, n)-threshold scheme. Section 5 studies a generalised (t, n)-threshold scheme. The paper is concluded in Section 6 with directions for future research.

Privacy homomorphisms
The term 'privacy homomorphisms' was coined by Rivest et al. to describe special encryption functions which permit encrypted data to be operated on [27]. These special algebraic mappings between the plaintext and ciphertext spaces allow the result of operations on the ciphertexts, when deciphered, to match the result of operations on the plaintexts. Let us see a well-understood example of privacy homomorphisms. Let p and q be two large primes and the modulus N = p · q. Let e and d be the public and private keys of the RSA cryptosystem, respectively. Note that e and d satisfy the condition that
$$e \cdot d \equiv 1 \pmod{\varphi(N)},$$
where φ is Euler's phi function, i.e. φ(N) = (p − 1)(q − 1). The RSA cryptosystem has an encryption function
$$c \equiv m^e \pmod{N},$$
and a decryption function
$$m \equiv c^d \pmod{N},$$
where m denotes the message and c denotes the ciphertext. Suppose that we wish to generate the encrypted result which, when decrypted, matches the product of two messages $m_1$ and $m_2$ through operations on the ciphertexts $c_1$ and $c_2$. This is achieved by
$$c_1 \cdot c_2 \equiv m_1^e \cdot m_2^e \equiv (m_1 \cdot m_2)^e \pmod{N}.$$
For more information about the RSA cryptosystem, the reader is referred to [28].
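As a quick numerical check of the multiplicative property, the sketch below uses textbook RSA with deliberately tiny, insecure parameters. The primes, the exponent and the helper names are illustrative assumptions only.

```python
# Textbook RSA with toy parameters (insecure; for illustration only).
p, q = 61, 53
N = p * q                       # 3233
phi = (p - 1) * (q - 1)         # 3120
e = 17                          # public exponent, coprime to phi
d = pow(e, -1, phi)             # private exponent: e * d = 1 (mod phi)

def encrypt(m):
    return pow(m, e, N)

def decrypt(c):
    return pow(c, d, N)

m1, m2 = 7, 11
c_product = (encrypt(m1) * encrypt(m2)) % N   # operate on ciphertexts only
product = decrypt(c_product)                   # equals m1 * m2 mod N
```

The product is recovered correctly only modulo N, so the plaintext product must stay below N for the result to be meaningful as an integer.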
Since the introduction of privacy homomorphisms, there has been a surge of interest in the design of homomorphic cryptosystems (e.g. the ElGamal [29], Okamoto-Uchiyama [30], and Damgård-Jurik [31] cryptosystems). One of the well-studied homomorphic cryptosystems is the Paillier cryptosystem [32]. Let $m_1$ and $m_2$ be two arbitrary messages, N be the product of two large primes, E(·) be the encryption function, and D(·) be the decryption function. The Paillier cryptosystem permits homomorphic addition:
$$D\big(E(m_1) \cdot E(m_2) \bmod N^2\big) = m_1 + m_2 \pmod{N},$$
and homomorphic multiplication:
$$D\big(E(m_1)^{m_2} \bmod N^2\big) = m_1 \cdot m_2 \pmod{N}.$$
More details of the Paillier cryptosystem are described as follows. It consists of three phases: a key generation phase, an encryption phase, and a decryption phase. In the key generation phase, we choose two large primes p and q. Then, we compute N = pq and λ = lcm(p − 1, q − 1), where 'lcm' stands for least common multiple. Afterwards, we select a random integer $g \in (\mathbb{Z}/N^2\mathbb{Z})^*$ and calculate
$$\mu = \big(L(g^{\lambda} \bmod N^2)\big)^{-1} \pmod{N},$$
where
$$L(u) = \frac{u - 1}{N}.$$
As a result, the public key is (N, g) and the private key is (λ, µ). In the encryption phase, let m be a message to be encrypted and r be a randomly selected integer, where m, r ∈ Z/NZ. The ciphertext is then computed as
$$c \equiv g^m \cdot r^N \pmod{N^2}. \tag{2.9}$$
This scheme exhibits ciphertext expansion, as the message space is $M = \mathbb{Z}/N\mathbb{Z}$ and the ciphertext space is $C = (\mathbb{Z}/N^2\mathbb{Z})^*$. In the decryption phase, the plaintext message is deciphered by
$$m \equiv L(c^{\lambda} \bmod N^2) \cdot \mu \pmod{N}.$$
It is observed that the decryption process involves a modular exponentiation, which is computationally expensive, in addition to other operations of minor cost. Therefore, as mentioned above, in some time-sensitive applications, one would wish not to involve decryption in the secret reconstruction process. In the remainder of this paper, we assume all the homomorphisms applied are those of the Paillier cryptosystem unless otherwise specified. Nevertheless, the applicable homomorphisms include, but are by no means limited to, the homomorphisms of this particular cryptosystem.
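The three phases can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the primes are tiny, g is fixed to N + 1 (a common valid choice, not the random g described above), r is drawn coprime to N, and the function names are hypothetical. It requires Python 3.9 or later for `math.lcm`.

```python
import math
import secrets

def keygen(p, q):
    """Key generation: returns ((N, g), (lam, mu))."""
    N = p * q
    lam = math.lcm(p - 1, q - 1)
    g = N + 1                                   # assumption: simple valid g
    L = (pow(g, lam, N * N) - 1) // N           # L(u) = (u - 1) / N
    mu = pow(L, -1, N)
    return (N, g), (lam, mu)

def encrypt(pub, m):
    """c = g^m * r^N mod N^2 with a fresh random r coprime to N."""
    N, g = pub
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return pow(g, m, N * N) * pow(r, N, N * N) % (N * N)

def decrypt(pub, priv, c):
    """m = L(c^lam mod N^2) * mu mod N."""
    N, _ = pub
    lam, mu = priv
    return (pow(c, lam, N * N) - 1) // N * mu % N

pub, priv = keygen(293, 433)                    # toy primes, insecure
N2 = pub[0] ** 2
m1, m2 = 1234, 56
c1, c2 = encrypt(pub, m1), encrypt(pub, m2)
added = decrypt(pub, priv, c1 * c2 % N2)        # homomorphic addition
scaled = decrypt(pub, priv, pow(c1, m2, N2))    # multiplication by a plaintext
```

Note that the multiplicative property scales an encrypted message by a known plaintext factor; it does not multiply two ciphertexts' contents.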

(2, 2)-threshold secret sharing
Recently, we proposed a (2, 2)-threshold multi-secret sharing scheme to split a batch of two secrets into two shares via a semi-honest (or honest-but-curious) cloud service provider [33]. Only in the presence of both shares can the batch of two secrets be restored. Let us describe how this scheme can solve the problem of privacy-preserving secret sharing. Let $s_1$ and $s_2$ be two secrets generated from two separate sources, respectively. To preserve the privacy of the secrets, $s_1$ and $s_2$ are encrypted immediately after being produced. The encrypted secrets $E(s_1)$ and $E(s_2)$ are uploaded to the dealer for sharing. Let $x_1$ and $x_2$ be any integers that satisfy
$$\gcd(x_1^2 - x_2^2, N) = 1.$$
Note that 'gcd' stands for greatest common divisor. It is not difficult to find proper $x_1$ and $x_2$ because N is the product of two large primes. Since $x_1^2 - x_2^2 = (x_1 + x_2)(x_1 - x_2)$, the condition is equivalent to $\gcd(x_1 + x_2, N) = 1$ and $\gcd(x_1 - x_2, N) = 1$. Then, two shares are created as
$$E(y_1) \equiv E(s_1)^{x_1} \cdot E(s_2)^{x_2} \pmod{N^2},$$
$$E(y_2) \equiv E(s_1)^{x_2} \cdot E(s_2)^{x_1} \pmod{N^2}.$$
Following the homomorphic properties, we rewrite
$$E(y_1) = E(x_1 s_1 + x_2 s_2), \tag{3.5}$$
$$E(y_2) = E(x_2 s_1 + x_1 s_2). \tag{3.6}$$
The dealer distributes $x_1$ and $x_2$ to the two participants and sends $E(y_1)$ and $E(y_2)$ to the key server for decryption. The decrypted results are
$$y_1 \equiv x_1 s_1 + x_2 s_2, \qquad y_2 \equiv x_2 s_1 + x_1 s_2 \pmod{N}. \tag{3.7}$$
Then, $y_1$ and $y_2$ are also dispensed to the participants. When the participants pool their shares $(x_1, y_1)$ and $(x_2, y_2)$ together, they compute
$$x_1 y_1 - x_2 y_2 \pmod{N}.$$
Note that
$$x_1 y_1 - x_2 y_2 \equiv (x_1^2 - x_2^2) \, s_1 \pmod{N}.$$
Since $\gcd(x_1^2 - x_2^2, N) = 1$, we know there exists one and only one modular multiplicative inverse such that
$$(x_1^2 - x_2^2) \cdot (x_1^2 - x_2^2)^{-1} \equiv 1 \pmod{N}.$$
The value of $(x_1^2 - x_2^2)^{-1}$ can be found with the extended Euclidean algorithm. Eventually, the secret $s_1$ is unveiled by
$$s_1 \equiv (x_1^2 - x_2^2)^{-1} (x_1 y_1 - x_2 y_2) \pmod{N}. \tag{3.11}$$
In the same manner, the secret $s_2$ is decoded as
$$s_2 \equiv (x_1^2 - x_2^2)^{-1} (x_1 y_2 - x_2 y_1) \pmod{N}. \tag{3.12}$$
It is worth noting that even though $y_1$ and $y_2$ have been disclosed to the key server during the process, $s_1$ and $s_2$ are still kept secret since the key server has no knowledge of $x_1$ and $x_2$. The secret reconstruction process does not involve the decryption operation and is thus time-efficient.
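The whole (2, 2) protocol can be traced end to end with a toy Paillier instance. The sketch below is illustrative only: the primes are tiny, g = N + 1 is an assumed simplification, the secret values are hypothetical sensor readings, and the roles (sources, dealer, key server, participants) are collapsed into one script. It requires Python 3.9 or later.

```python
import math
import secrets

# Toy Paillier instance (insecure parameters; g = N + 1 is an assumption).
p, q = 293, 433
N, N2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = N + 1
mu = pow((pow(g, lam, N2) - 1) // N, -1, N)

def E(m):
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return pow(g, m, N2) * pow(r, N, N2) % N2

def D(c):
    return (pow(c, lam, N2) - 1) // N * mu % N

# Sources: encrypt the secrets immediately (hypothetical readings).
s1, s2 = 72, 120
c1, c2 = E(s1), E(s2)

# Dealer: pick x1, x2 with gcd(x1^2 - x2^2, N) = 1, working on ciphertexts only.
while True:
    x1, x2 = secrets.randbelow(N), secrets.randbelow(N)
    if math.gcd(x1 * x1 - x2 * x2, N) == 1:
        break
Ey1 = pow(c1, x1, N2) * pow(c2, x2, N2) % N2   # E(x1*s1 + x2*s2)
Ey2 = pow(c1, x2, N2) * pow(c2, x1, N2) % N2   # E(x2*s1 + x1*s2)

# Key server: decrypts y1, y2 but never sees x1, x2.
y1, y2 = D(Ey1), D(Ey2)

# Participants: pool (x1, y1) and (x2, y2); no decryption is needed.
inv = pow((x1 * x1 - x2 * x2) % N, -1, N)
r1 = inv * (x1 * y1 - x2 * y2) % N
r2 = inv * (x1 * y2 - x2 * y1) % N
```

The last three lines use only modular arithmetic, which reflects the point made above that reconstruction avoids the expensive decryption step.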

(n, n)-threshold secret sharing
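Let us extend the previous (2, 2)-threshold scheme to an (n, n)-threshold scheme. For conciseness, we omit modulus symbols in the following description where there is no ambiguity. Let the n secrets generated from the sources be denoted by $s_1, s_2, \ldots, s_n$. After encryption, the encrypted results, written as $E(s_1), E(s_2), \ldots, E(s_n)$, are transmitted to the dealer for sharing. The dealer chooses n random numbers $x_1, x_2, \ldots, x_n$ such that the n × n matrix X constructed from them has a modular multiplicative inverse $X^{-1}$ in Z/NZ. Equivalently, X must satisfy gcd(det(X), N) = 1 and det(X) ≠ 0. Note that 'det' stands for determinant. Writing $X_{i,j}$ for the (i, j) entry of X, let the dealer compute
$$E(y_i) = \prod_{j=1}^{n} E(s_j)^{X_{i,j}}, \quad i \in \{1, 2, \ldots, n\}.$$
According to the homomorphic properties, we derive
$$E(y_i) = E\Big(\sum_{j=1}^{n} X_{i,j} \, s_j\Big), \quad i \in \{1, 2, \ldots, n\}. \tag{4.3}$$
The sharing process can be carried out by cloud computing to relieve the dealer of computational burdens without revealing private information about the secrets. The dealer dispenses $x_1, x_2, \ldots, x_n$ to the n participants respectively and passes $E(y_1), E(y_2), \ldots, E(y_n)$ to the key server for decryption. The decrypted results, written as
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = X \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{pmatrix}, \tag{4.4}$$
are allocated to individual participants as well. When all the participants pool their shares together, they retrieve the secrets by
$$\begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{pmatrix} = X^{-1} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}. \tag{4.5}$$
Example. Let us demonstrate that the previous (2, 2)-threshold scheme is actually a special case of the (n, n)-threshold scheme. In the case where there are two secrets $s_1$ and $s_2$ to be encoded, the dealer randomly chooses $x_1$ and $x_2$ to form a matrix
$$X = \begin{pmatrix} x_1 & x_2 \\ x_2 & x_1 \end{pmatrix}.$$
Then, we compute
$$E(y_1) = E(s_1)^{x_1} \cdot E(s_2)^{x_2} = E(x_1 s_1 + x_2 s_2), \qquad E(y_2) = E(s_1)^{x_2} \cdot E(s_2)^{x_1} = E(x_2 s_1 + x_1 s_2),$$
which are equivalent to the results in Eq. (3.5) and Eq. (3.6). By decryption, we obtain
$$y_1 = x_1 s_1 + x_2 s_2, \qquad y_2 = x_2 s_1 + x_1 s_2,$$
which are equal to the results in Eq. (3.7). Eventually, we retrieve the secrets by
$$\begin{pmatrix} s_1 \\ s_2 \end{pmatrix} = X^{-1} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = (x_1^2 - x_2^2)^{-1} \begin{pmatrix} x_1 & -x_2 \\ -x_2 & x_1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},$$
which are identical to the results in Eq. (3.11) and Eq. (3.12).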
As an extension of the (2, 2)-threshold scheme, this scheme has the same security strength. It is theoretically secure in the sense that any subset of participants has absolutely no knowledge of the secrets unless all the shares are present. The secret reconstruction procedure does not involve decryption. Thus, it is time-efficient and can be established without any means of key distribution.
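The encoding y = Xs and decoding s = X^{-1}y over Z/NZ can be illustrated in the plaintext domain as follows. This is a sketch under assumptions: X is simply a random n × n matrix that happens to be invertible modulo N (the particular structure built from x_1, ..., x_n is not reproduced here), the inverse is computed by Gauss-Jordan elimination, and the helper names and secret values are hypothetical. In the encrypted domain, each sum would become a product of ciphertext powers as in Eq. (4.3).

```python
import math
import secrets

def inverse_mod(X, N):
    """Gauss-Jordan elimination over Z/NZ; returns None if no unit pivot is found."""
    n = len(X)
    # Augment X with the identity matrix.
    A = [[v % N for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(X)]
    for col in range(n):
        pivot = next((r for r in range(col, n)
                      if math.gcd(A[r][col], N) == 1), None)
        if pivot is None:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], -1, N)
        A[col] = [v * inv % N for v in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % N for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def matvec(X, v, N):
    return [sum(a * b for a, b in zip(row, v)) % N for row in X]

N = 293 * 433                       # toy modulus, product of two primes
s = [72, 120, 98, 36]               # hypothetical secrets s_1..s_n
n = len(s)

# Dealer: retry until the random matrix is invertible modulo N.
while True:
    X = [[secrets.randbelow(N) for _ in range(n)] for _ in range(n)]
    X_inv = inverse_mod(X, N)
    if X_inv is not None:
        break

y = matvec(X, s, N)                 # shares, as in Eq. (4.4)
recovered = matvec(X_inv, y, N)     # reconstruction, as in Eq. (4.5)
```

The retry loop stands in for the requirement gcd(det(X), N) = 1; for a modulus with two large prime factors, a random matrix satisfies it with overwhelming probability.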

(t, n)-threshold secret sharing
In light of the previous (n, n)-threshold scheme, we further derive a generalised (t, n)-threshold scheme. Before we proceed further, let us discuss some possible (t, n)-threshold schemes and analyse their pros and cons. Let $\{s_i\}_{i=1}^{t}$ denote t secrets generated from separate sources and P be a large prime. With Shamir's algorithm, the dealer constructs a polynomial
$$f(x) = \sum_{i=1}^{t} s_i x^{i-1} \pmod{P}$$
and draws n points $(x_1, f(x_1)), (x_2, f(x_2)), \ldots, (x_n, f(x_n))$ as shares for n participants. In our defined scenario, the secrets are encrypted into $\{E(s_i)\}_{i=1}^{t}$ immediately after being produced. Let k denote the decryption key. The first possible scheme is to split k into n shares by drawing n points from the following polynomial:
$$f(x) = k + \sum_{j=1}^{t-1} r_j x^{j} \pmod{P},$$
where $\{r_j\}_{j=1}^{t-1}$ are t − 1 randomly chosen integers. The encrypted data has to be stored in a database so that when t or more participants reconstruct the key collaboratively, they can retrieve and decrypt the data. Nonetheless, the database may be vulnerable to numerous cyber attacks. The second possible scheme is to create shares according to the following polynomial:
$$f(x) = \sum_{i=1}^{t} E(s_i) \, x^{i-1} \pmod{P}.$$
When t or more participants co-operate, they can reconstruct the encrypted data. In order to decrypt the data, the scheme must engage a key distribution protocol to share the key amongst the participants.
To compensate for the shortcomings, one may think of combining the previous two solutions and building a polynomial of the following form:
$$f(x) = k + \sum_{j=1}^{t} E(s_j) \, x^{j} \pmod{P}. \tag{5.4}$$
In this way, the authorised subset of participants is able to reconstruct and decrypt the data from the shares. With the knowledge of the key, however, the dealer is able to decipher the data and thus privacy is threatened. Since the previous strategies all have obvious limitations, we need to find another way. For a moment, let us forget about the problem of sharing ciphertexts and only consider sharing the plaintexts, since extending the idea to the sharing of ciphertexts is easy once the following concepts are understood. Let $s_{t,1}$ denote a vector of t secrets, $y_{n,1}$ denote a vector of n shares, and $X_{n,t}$ denote an n × t matrix. We define an encoding function
$$y_{n,1} = X_{n,t} \, s_{t,1} \tag{5.5}$$
and a decoding function
$$s_{t,1} = X_{t,t}^{-1} \, y_{t,1}, \tag{5.6}$$
where $y_{t,1} \subset y_{n,1}$ and $X_{t,t} \subset X_{n,t}$. In the case of (n, n)-threshold secret sharing, the above encoding and decoding functions are equivalent to Eq. (4.4) and Eq. (4.5), respectively. In the previous special case, we only require that $X_{n,n}$ has a modular multiplicative inverse. In the current generalised case, however, we require that any t × t sub-matrix of $X_{n,t}$ has a modular multiplicative inverse. In fact, when t = n, the current requirement reduces to the previous one, since the one and only sub-matrix of $X_{n,t}$ is $X_{n,t}$ itself. Our question is hence: is it possible to construct a valid matrix $X_{n,t}$ such that any square matrix $X_{t,t}$ consisting of t rows of $X_{n,t}$ has a multiplicative inverse? A matrix A is invertible if and only if its determinant is non-zero. When t and n are small numbers, we could use trial and error to construct a valid $X_{n,t}$ such that $\det(X_{t,t}) \neq 0$ for any $X_{t,t}$. This approach is, however, not practical, since the number of t-row combinations to be checked grows rapidly with the ratio between n and t, and collisions become difficult to handle. To obtain a valid matrix in a systematic way, one possible solution is to construct a Vandermonde matrix.
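Definition (Vandermonde matrix). An n × t Vandermonde matrix has the form
$$V = \begin{pmatrix} 1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{t-1} \\ 1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{t-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \alpha_n & \alpha_n^2 & \cdots & \alpha_n^{t-1} \end{pmatrix}.$$
For a t × t square Vandermonde matrix, the determinant is given by
$$\det(V) = \prod_{1 \le i < j \le t} (\alpha_j - \alpha_i).$$
Example. Let us compute det(A), where A is a square Vandermonde matrix: by the formula above, det(A) is the product of the differences $\alpha_j - \alpha_i$ over all pairs i < j.
Corollary (Invertible Vandermonde matrix). A square Vandermonde matrix is invertible if and only if all $\alpha_i$ are distinct. When this condition holds, the matrix has a non-zero determinant.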
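The determinant formula can be checked numerically. The sketch below builds a small Vandermonde matrix from illustrative values (the nodes 2, 3 and 5 are chosen here for demonstration and are not taken from the paper) and compares a determinant computed by Gaussian elimination with the closed-form product.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def vandermonde(alphas, t):
    """Rows (1, a, a^2, ..., a^(t-1)) for each a in alphas."""
    return [[a ** j for j in range(t)] for a in alphas]

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return 0                      # singular matrix
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign                  # a row swap flips the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return int(sign * result)

alphas = [2, 3, 5]                        # illustrative distinct nodes
A = vandermonde(alphas, 3)                # [[1, 2, 4], [1, 3, 9], [1, 5, 25]]
closed_form = prod(aj - ai for ai, aj in combinations(alphas, 2))
# (3 - 2) * (5 - 2) * (5 - 3) = 6, matching det(A)
```

Repeating a node, for example using 2, 2 and 5, makes one factor of the product vanish, so the determinant is zero and the matrix is not invertible.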
Given the above preliminaries, we can start with the detailed construction of sharing ciphertexts. Consider a Vandermonde matrix written as
$$X_{n,t} = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{t-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{t-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{t-1} \end{pmatrix},$$
where $\{x_i\}_{i=1}^{n}$ are all distinct. The shares are created as
$$E(y_i) = \prod_{j=1}^{t} E(s_j)^{x_i^{j-1}}, \quad i \in \{1, 2, \ldots, n\}. \tag{5.8}$$
Due to privacy homomorphisms, the above results are equivalent to
$$E(y_i) = E\Big(\sum_{j=1}^{t} s_j \, x_i^{j-1}\Big), \quad i \in \{1, 2, \ldots, n\}. \tag{5.9}$$
After decryption, the results become
$$y_i = \sum_{j=1}^{t} s_j \, x_i^{j-1}, \quad i \in \{1, 2, \ldots, n\},$$
or alternatively, as expressed in Eq. (5.5). Each participant will receive a share $(x_i, y_i)$, where i ∈ {1, 2, ..., n}. Suppose that a subset of participants has gathered a collection of shares, say, $(x_j, y_j)$, where j ∈ {1, 2, ..., t}. Hence, the participants form a matrix
$$X_{t,t} = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{t-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{t-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_t & x_t^2 & \cdots & x_t^{t-1} \end{pmatrix}$$
and reconstruct the secrets with Eq. (5.6). Note that $X_{t,t}$ is a square Vandermonde matrix and is thus invertible. The reader may have observed that when $X_{n,t}$ is a Vandermonde matrix, Eq. (5.5) and Eq. (5.6) are the encoding and decoding functions of Shamir's scheme per se. Let f(x) denote Shamir's encoding function and g(x) denote ours. The connection between the two functions can be expressed as
$$g(x) = \prod_{i=1}^{t} E(s_i)^{x^{i-1}} = E\Big(\sum_{i=1}^{t} s_i \, x^{i-1}\Big) = E(f(x)). \tag{5.12}$$
Except for the processing domain (either the plaintext or the ciphertext domain), a noticeable difference between the two schemes is the decoding process, for which Shamir uses Lagrange interpolation and we utilise a matrix multiplication. We remark that there are many studies on fast algorithms for matrix inversion [34] and multiplication [35][36][37].
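The construction above can be exercised end to end on a toy instance. The following sketch makes several assumptions for illustration: tiny Paillier primes with g = N + 1, hypothetical secret values, nodes x_i = 1, ..., n (whose pairwise differences are coprime to N), and a Gauss-Jordan routine for the modular inverse. It requires Python 3.9 or later.

```python
import math
import secrets
from itertools import combinations

# Toy Paillier instance (insecure parameters; g = N + 1 is an assumption).
p, q = 293, 433
N, N2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = N + 1
mu = pow((pow(g, lam, N2) - 1) // N, -1, N)

def E(m):
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return pow(g, m, N2) * pow(r, N, N2) % N2

def D(c):
    return (pow(c, lam, N2) - 1) // N * mu % N

def inverse_mod(X):
    """Gauss-Jordan elimination over Z/NZ (assumes unit pivots exist)."""
    n = len(X)
    A = [[v % N for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(X)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if math.gcd(A[r][col], N) == 1)
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], -1, N)
        A[col] = [v * inv % N for v in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % N for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

t, n = 3, 5
s = [72, 120, 98]                       # hypothetical secrets s_1..s_t
enc = [E(v) for v in s]                 # encrypted at the sources
xs = list(range(1, n + 1))              # distinct nodes x_1..x_n

# Dealer: E(y_i) = prod_j E(s_j)^(x_i^(j-1)) mod N^2, as in Eq. (5.8).
enc_shares = []
for x in xs:
    c = 1
    for j, e in enumerate(enc):
        c = c * pow(e, x ** j, N2) % N2
    enc_shares.append(c)

ys = [D(c) for c in enc_shares]         # key server decrypts y_i only
shares = list(zip(xs, ys))

def reconstruct(subset):
    """Decode with Eq. (5.6): s = X_tt^{-1} y, using any t shares."""
    X = [[pow(x, j, N) for j in range(t)] for x, _ in subset]
    X_inv = inverse_mod(X)
    y = [v for _, v in subset]
    return [sum(a * b for a, b in zip(row, y)) % N for row in X_inv]

recovered = reconstruct(shares[1:4])
```

Any t of the n shares give the same result, and the participants never touch the private key during reconstruction.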

Conclusion
In this paper, we address a novel research problem of secret sharing in the encrypted domain for IoT-based healthcare applications. We study the problem of sharing encrypted data, acquired from different sensor nodes, among a set of cloud servers. In conclusion, the proposed schemes are theoretically secure in the following senses. First, since the secret data is concealed by a secure encryption algorithm immediately after its creation, the dealer as well as other sources cannot access the secret data. Second, the key server only has partial shares and is thus also unable to retrieve the secret data. Third, in conformity with the access policy, a subset of fewer than the required number of participants does not suffice to decode the secrets either. In addition, the data is not required to be stored in a common database, so the scheme is not vulnerable to cyber threats against such a database. Furthermore, since data retrieval does not involve computationally expensive decryption operations, the scheme is advantageous in time-sensitive circumstances. In the near future, we intend to extend this work to a more general access structure based on the assumption that there are dishonest parties involved. Another line of further investigation is the application of this work to visual cryptography.

Figure 1. An IoT-based healthcare architecture. The health data acquired from the sensor nodes (e.g. heart rate, blood pressure, brain wave, and glucose level) is aggregated at the gateway and stored in the cloud.