A Modified Symmetric Key Fully Homomorphic Encryption Scheme Based on Reed-Muller Code

Abstract: Homomorphic encryption has become a popular and powerful cryptographic primitive for various cloud computing applications. In recent decades, several developments have been made. A few schemes based on coding theory have been proposed, but none of them supports unlimited operations with security. We propose a modified Reed-Muller code based symmetric key fully homomorphic encryption scheme that improves security by using a message expansion technique. Message expansion with a prepended random fixed-length string provides a one-to-many mapping between message and codeword, and thus a one-to-many mapping between plaintext and ciphertext. The proposed scheme supports an unlimited number of both mod 2 addition and mod 2 multiplication operations. We make an effort to prove the security of the scheme in the sense of indistinguishability under chosen-plaintext attack (IND-CPA) through a game-based security proof. The security proof gives a mathematical analysis of the hardness and its complexity. A security analysis against all the known attacks, with respect to the message expansion and the homomorphic operations, is also presented.


Introduction:
There has been a continuous shift towards cloud computing, which provides a means of public storage and related services (1,2). At the same time, data stored in the public cloud are more vulnerable to unauthorized access and to attacks. Securing the data with encryption, however, hinders public access (3). Homomorphic encryption helps in providing public access and security on the data in a single step. Homomorphic encryption schemes satisfy an important property: given two ciphertexts $c_1 = E_K(m_1)$ and $c_2 = E_K(m_2)$, where $m_1, m_2$ are plaintexts and $K$ is the key, one can compute $c = c_1 \circ c_2 = E_K(m_1 \circ m_2)$ for some operation $\circ$ such that $D_K(c) = m_1 \circ m_2$. For example, the textbook RSA encryption scheme is multiplicatively homomorphic (4). If $\circ$ corresponds to only a single operation, such as addition or multiplication, the scheme is called partially homomorphic. Several partially homomorphic encryption schemes have been proposed and successfully used in applications such as oblivious polynomial evaluation, electronic voting, multiparty computation, private information retrieval, deep learning systems, big data systems, and medical applications (5)(6)(7)(8). However, in order to perform arbitrary computations over the encrypted data, so that the scheme is suitable for any application in general, it must support an unlimited number of both addition and multiplication operations over the ciphertexts. An encryption scheme that supports unlimited additions and multiplications, and thus allows arbitrary computations over the encrypted data, is called a Fully Homomorphic Encryption (FHE) scheme. Constructing an FHE scheme was a long-standing dream of cryptographers; it was first solved theoretically in a pioneering work by Craig Gentry using an innovative construction method (9,10).
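The multiplicative homomorphism of textbook RSA mentioned above can be illustrated with a minimal sketch. The parameters below (p = 61, q = 53, e = 17) are toy values chosen only for illustration and are not taken from the paper; textbook RSA without padding is not secure in practice.

```python
# Toy textbook RSA parameters (illustrative assumption, far too small for real use).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

def enc(m):
    """Textbook RSA encryption: m^e mod n."""
    return pow(m, e, n)

def dec(c):
    """Textbook RSA decryption: c^d mod n."""
    return pow(c, d, n)

m1, m2 = 7, 12
c = (enc(m1) * enc(m2)) % n  # multiply the ciphertexts only

# Decrypting the product of ciphertexts yields the product of plaintexts (mod n).
assert dec(c) == (m1 * m2) % n
```

Only one operation (multiplication) is preserved here, which is why such a scheme is partially rather than fully homomorphic.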
Because of the time complexity of Gentry's generic blueprint, most of the later work tried either to bring Gentry's scheme closer to practicality or to use different security assumptions to create an FHE scheme with practical time complexity (11)(12)(13)(14)(15)(16)(17). Even though these works have shown progressive improvements over one another, none of them qualifies for practical implementation. The design of an FHE scheme with an implementation model is still a challenge.
HE schemes based on coding theory are of interest because of the availability of alternative security assumptions in solving multivariate equations over a finite field and the simplicity of the decoding operation (18,19). Because of the simplicity of the linear mapping in the decoding function, homomorphic operations can be supported by encryption schemes based on coding theory (18). Schemes based on Reed-Muller codes, McEliece codes, and Goppa-code-based McEliece variants, however, have not been homomorphic in nature (19)(20)(21)(22)(23)(24)(25). Hence, these schemes do not support computation of arbitrary functions over the encrypted data. Another McEliece code-based scheme supports homomorphic addition but does not support homomorphic multiplication (26). Armknecht et al. (2011) presented the first code-based Somewhat Homomorphic Encryption (SHE) scheme using Reed-Muller codes. The computational complexity of this scheme is quadratic for encryption and linear for decryption, and linear for the homomorphic addition and multiplication operations. A new Reed-Muller Code (RMC)-based FHE scheme, based on the scheme presented by Armknecht et al. (2011), has been proposed with a de-noising step that nullifies the error terms produced during each homomorphic multiplication (27). This is, in a way, a ciphertext refreshing or post-processing idea similar to the bootstrapping step in Gentry's blueprint (9).
The RMC-based FHE scheme (27) may be vulnerable owing to the one-to-one mapping between message (plaintext) and codeword. The underlying codeword, which sits in the ciphertext at the fixed positions specified by the secret key, might be computed through a Chosen Plaintext Attack (CPA). The present work modifies the RMC-based FHE scheme (27) with a padding mechanism that gives a one-to-many mapping between message and codeword, which improves its security against CPA. The scheme is proved CPA secure and is analyzed with respect to the changes and new techniques suggested, to show that the proposed scheme is secure against all the known attacks.
The proposed scheme, with its security proof and security analysis, is presented in the subsequent sections of the paper. The proposed modified RMC-based symmetric key FHE scheme is detailed in the section Modified RMC-Based Symmetric Key FHE Scheme. A game-based security proof is provided in the section The Proposition is CPA Secure. Security analysis against the known attacks is discussed in the section Analysis of Known Attacks, and the work is concluded in the last section.

Modified RMC-Based Symmetric Key FHE Scheme: Notations
The symbols used in this work and their meanings are as follows:
1. $\mathbb{F}$ is an arbitrary finite field.
2. Vectors are denoted by lower-case bold letters.
3. '+' denotes the addition operation over GF(2).
4. '.' denotes the multiplication operation over GF(2).
5. An integer is denoted by a lower-case italic letter.

Proposed Fully Homomorphic Encryption scheme
In the proposed scheme, Reed-Muller encoding and decoding operations are used. The decoding algorithm used here is similar to the decoding used in the RMC-based FHE scheme (27). In the encryption step, a novel method is adopted to expand the message by prepending a random binary string, so that a new (non-deterministic) codeword is generated each time the same message is encoded. This provides a one-to-many mapping between message and codeword.

Overview of the scheme
Like any other FHE scheme, the proposed scheme consists of the KeyGen, Encrypt, Decrypt, and Evaluate algorithms.
In the KeyGen algorithm, given the security parameter $\lambda$, the RM parameters $(r, m)$ are chosen, which determine the length of the codewords and the maximum length of the message that can be encoded. The actual plaintext chosen for encryption is shorter than the maximum length of the message that can be encoded by RM(r, m); this enhances security. Precisely, the length of the actual plaintext to be encrypted is limited by a separate maximum plaintext length parameter. Upon choosing the RM parameters $(r, m)$, the length $n$ of the codeword, the dimension $k$, the generator matrix $G_{r,m}$ for the code RM(r, m), and the length $l$ of the ciphertext are computed. A secret key is chosen as a random subset of the set of integers $\{1, 2, \ldots, l\}$. The Encrypt algorithm takes the plaintext vector (whose length is at most the maximum plaintext length), the generator matrix $G_{r,m}$, the maximum length of the message, and the key as inputs and produces a ciphertext vector of length $l$ as output. Before encryption, the given plaintext is expanded to a message of length $k$: a zero vector is first prepended to bring the plaintext up to the maximum plaintext length, and a random bit vector of appropriate length is then prepended to the result. For easy recovery of the plaintext at the decryption stage, it is recommended to choose the random vector with a fixed length, equal to the message length minus the maximum plaintext length. In fact, prepending the zero vector is optional; it is needed only when the given plaintext is shorter than the maximum plaintext length.
The generator matrix $G_{r,m}$ is used to transform the expanded plaintext message in $\mathbb{F}_2^{k}$ into an $n$-bit codeword. The codeword is then used to generate the ciphertext: its bits are embedded in an $l$-bit random vector at the locations specified by the key.
The Decrypt algorithm takes the secret key and the ciphertext as inputs. It recovers the codeword from the ciphertext by collecting the bits at the locations specified by the key. The recovered codeword is then decoded to produce the expanded plaintext message, from which the actual plaintext is recovered by discarding the leading random bits (the message length minus the maximum plaintext length) and then discarding the zero bits until the first non-zero bit is encountered.
The procedure H.Add performs homomorphic addition and the procedure H.Mul performs homomorphic multiplication; together they are used to construct the Evaluate algorithm. Homomorphic component-wise mod 2 addition can be performed directly because of the structure of the ciphertexts. The same does not hold for homomorphic component-wise mod 2 multiplication. To overcome this problem, a de-noising step is added to the multiplication operation to nullify the error introduced, as discussed in the RMC-based FHE scheme (27).
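The key generation, message expansion, embedding, and recovery steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 0-based key positions, the choice l = n^2, the example parameters (r = 2, m = 4, maximum plaintext length t = 4), and the use of the binary Moebius transform in place of an explicit generator matrix are all assumptions made here. Encoding through the Moebius transform is equivalent to multiplying by a generator matrix whose rows are the evaluations of the monomials of degree at most r, and the decoder below handles only error-free codewords.

```python
import random

def mobius(bits):
    """Binary Moebius transform: maps polynomial coefficients to a truth
    table and back (the transform is its own inverse)."""
    a = list(bits)
    step = 1
    while step < len(a):
        for i in range(len(a)):
            if i & step:
                a[i] ^= a[i ^ step]
        step <<= 1
    return a

def monomials(r, m):
    """Bit masks of the monomials of degree <= r in m variables."""
    return [u for u in range(1 << m) if bin(u).count("1") <= r]

def rm_encode(msg, r, m):
    """Encode a k-bit message as the truth table of its polynomial."""
    coeffs = [0] * (1 << m)
    for bit, u in zip(msg, monomials(r, m)):
        coeffs[u] = bit
    return mobius(coeffs)

def rm_decode(codeword, r, m):
    """Exact decoding of an error-free codeword (no error correction)."""
    coeffs = mobius(codeword)
    return [coeffs[u] for u in monomials(r, m)]

def keygen(r, m, rng):
    """Secret key: n random positions inside a ciphertext of length l = n^2."""
    n = 1 << m
    l = n * n
    return rng.sample(range(l), n), l

def encrypt(plain, key, l, r, m, t, rng):
    k = len(monomials(r, m))                            # message length (dimension)
    prefix = [rng.randrange(2) for _ in range(k - t)]   # fresh random prefix
    zeros = [0] * (t - len(plain))                      # optional zero padding
    codeword = rm_encode(prefix + zeros + plain, r, m)
    cipher = [rng.randrange(2) for _ in range(l)]       # random carrier string
    for pos, bit in zip(key, codeword):
        cipher[pos] = bit                               # embed at secret positions
    return cipher

def decrypt(cipher, key, r, m, t):
    k = len(monomials(r, m))
    msg = rm_decode([cipher[pos] for pos in key], r, m)
    body = msg[k - t:]                                  # drop the random prefix
    while body and body[0] == 0:                        # drop zero padding
        body.pop(0)
    return body

rng = random.SystemRandom()
r, m, t = 2, 4, 4
key, l = keygen(r, m, rng)
plain = [1, 0, 1]
assert decrypt(encrypt(plain, key, l, r, m, t, rng), key, r, m, t) == plain
```

Because the zero padding is stripped up to the first non-zero bit, a plaintext in this sketch is expected to start with a 1, which mirrors the recovery rule stated in the text.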
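The contrast between addition and multiplication can be seen at the level of the codewords themselves. The sketch below is an illustration of a general property of Reed-Muller codes, not the de-noising procedure of (27), which is not reproduced here. The helper names and the parameters (r = 1, m = 3) are assumptions. The component-wise XOR of two codewords is again a codeword of RM(r, m), whereas the component-wise AND corresponds to a product polynomial whose degree can exceed r.

```python
def mobius(bits):
    """Binary Moebius transform between polynomial coefficients and truth table."""
    a = list(bits)
    step = 1
    while step < len(a):
        for i in range(len(a)):
            if i & step:
                a[i] ^= a[i ^ step]
        step <<= 1
    return a

def monomials(r, m):
    """Bit masks of the monomials of degree <= r in m variables."""
    return [u for u in range(1 << m) if bin(u).count("1") <= r]

def rm_encode(msg, r, m):
    coeffs = [0] * (1 << m)
    for bit, u in zip(msg, monomials(r, m)):
        coeffs[u] = bit
    return mobius(coeffs)

r, m = 1, 3
a = [0, 1, 0, 0]          # the polynomial x1
b = [0, 0, 1, 0]          # the polynomial x2
ca, cb = rm_encode(a, r, m), rm_encode(b, r, m)

# Addition: XOR of codewords encodes the XOR of the messages (linearity).
xor_word = [x ^ y for x, y in zip(ca, cb)]
assert xor_word == rm_encode([x ^ y for x, y in zip(a, b)], r, m)

# Multiplication: AND of codewords is the truth table of x1*x2, which has
# degree 2 > r, so it is not a codeword of RM(1, 3).
and_word = [x & y for x, y in zip(ca, cb)]
coeffs = mobius(and_word)
too_high = [u for u in range(1 << m) if coeffs[u] and bin(u).count("1") > r]
assert too_high == [3]    # mask 0b011 is the monomial x1*x2
```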

Algorithms
The algorithms describing the construction of the proposed scheme are presented as follows. KeyGen($1^{\lambda}$): upon input of the security parameter $\lambda$, the scheme parameters and the secret key are generated as outlined in the overview. Similarly to H.Add, H.Mul is mod 2 multiplication over the ciphertexts. Unlike addition (H.Add), mod 2 multiplication is not straightforward because of the noise that appears in the resulting ciphertext, the component-wise product of the two input ciphertexts. Multiplication with the de-noising step, as discussed in the RMC-based FHE scheme (27), eliminates the additional noise generated during multiplication, and decryption of the resulting ciphertext yields the product of the two underlying plaintexts.

The Proposition is CPA Secure:
It may be noticed that the key vector is an instance of the sparse subset sum problem (9)(10)(11). However, the sum of the bit positions, or any such aggregate corresponding to the key positions, is not disclosed anywhere. So, this becomes a hidden sparse subset sum problem (28), for which no solution is known. The security of the scheme therefore depends entirely on the size and randomness of the ciphertext and on the secret key permutation chosen, which determines the positions at which the codeword is embedded. Once the codeword is recovered, decoding is simple. In general, no explicit hard problem assumption is known for the security of a cryptosystem that embeds the plaintext bits in a randomly generated bit stream.
Similar schemes have been studied and their security proved by game-based security proofs (29)(30)(31); the security of the present work can be proved along the same lines, as follows.

The IND-CPA game
For a probabilistic symmetric key algorithm, IND-CPA is defined by the following game between an adversary A and a challenger C. A is assumed to be a probabilistic polynomial time (PPT) algorithm; that is, A must complete the game and output a guess within a polynomial number of steps (31). Let the parameters have their usual meaning as described in the Notations. Let the proposed symmetric key encryption scheme be S = (KeyGen, Encrypt, Decrypt, Evaluate). Then the IND-CPA game, adapted from (32), is defined as follows.

Symmetric key CPA indistinguishability game:
1. C generates a random secret key based on the security parameter and retains it. In the proposed scheme, the size of the ciphertext and the key depend on the RM parameters $(r, m)$ (32). A function $\epsilon(\lambda)$ is said to be negligible if for every $c \in \mathbb{N}$ there exists an integer $\lambda_0$ such that $\epsilon(\lambda) \le 1/\lambda^{c}$ for all $\lambda \ge \lambda_0$. Now, let a bit string be drawn uniformly at random from $\{0,1\}^{l}$. We claim that, under the assumption of random embedding of the permuted codeword in this string, no PPT adversary A can learn any information about the plaintext or the codeword (beyond what it could guess at random) except with negligible probability. This is proved in Theorem 1: with the parameters chosen or computed as defined, an adversary A that attempts to break the IND-CPA security of the proposed scheme has an advantage $\epsilon(\lambda)$, where $\epsilon(\lambda)$ is negligible.
Proof. Given the two plaintexts, the probability that the challenger chooses either one uniformly at random is $1/2$. We analyse the advantage the adversary A can gain over and above this random guess. Given a ciphertext of length $l$, A first tries to obtain the length $n$ of the codeword. The only way it can do this is by assuming that $l = n^{2}$, the minimum length of the ciphertext, because the parameters $r, m$ are kept secret and $l \ge n^{2}$. Based on the value of $n$, A selects a subset of size $n$ from $[l]$ from which it can retrieve the codeword. The probability of selecting the right subset is $1/\binom{l}{n}$. Alternatively, A may directly guess a codeword from the set of all possible strings of length $n$. In fact, only a few of these strings, namely $2^{k} \ll 2^{n}$, are codewords. Suppose that A is able to compute $k$. Even then, the permuted codeword may not belong to the set of codewords of dimension $k$, and the problem boils down to choosing a string from the set of $2^{n}$ strings. Clearly, this probability is $1/2^{n}$. Upon choosing the permuted embedded codeword, A has to compute the exact permutation of the chosen $n$-bit string to obtain the actual embedded codeword. However, the number of permutations depends on the pattern of the string. For codewords such as all 0's and all 1's, the permutation has no effect, so we assume that the permutation can be guessed with probability 1 in such cases. In the worst case, therefore, the overall probability of guessing a codeword and its permutation combines the two terms $1/2^{n}$ and $1/\binom{l}{n}$ (32). Lemma 1 shows that the factorial-based component, which is the inverse of a combination function, is also negligible for sufficiently large parameter values. Hence, from Proposition 1, $\epsilon(\lambda)$ is negligible as claimed.
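To give a feel for the two guessing probabilities discussed in the proof, the sketch below evaluates them for a few small parameter sets. It assumes the reading used here, namely codeword length n = 2^m and the minimum ciphertext length l = n^2; the function name and the chosen values of m are illustrative only.

```python
import math

def guessing_work(m):
    """Return (n, l, bits_subset, bits_string) for n = 2^m and l = n^2.

    bits_subset is log2 of the number of n-subsets of l positions,
    bits_string is log2 of the number of n-bit strings.
    """
    n = 1 << m
    l = n * n
    bits_subset = math.log2(math.comb(l, n))  # math.comb needs Python 3.8+
    bits_string = float(n)
    return n, l, bits_subset, bits_string

for m in (3, 4, 5):
    n, l, bits_subset, bits_string = guessing_work(m)
    print(f"m={m}: n={n}, l={l}, "
          f"log2 C(l,n) = {bits_subset:.1f}, log2 2^n = {bits_string:.0f}")
```

Even for these toy sizes the number of candidate position subsets grows much faster than the number of n-bit strings, which is consistent with the factorial-based term being the smaller of the two probabilities.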
Proof. We have, based on the assumption that retrieving a secret, randomly embedded permutation is hard, specifically when the bit string is sufficiently large, that the proposed scheme is CPA secure.
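The negligibility of the factorial-based component can be made concrete with a standard bound on binomial coefficients. The derivation below is a sketch under the assumptions $l \ge n^{2}$ and $n = 2^{m}$ used in the analysis; it is not a reproduction of the paper's own derivation.

```latex
\[
\binom{l}{n} \;=\; \prod_{i=0}^{n-1} \frac{l-i}{n-i} \;\ge\; \left(\frac{l}{n}\right)^{n}
\quad\Longrightarrow\quad
\frac{1}{\binom{l}{n}} \;\le\; \left(\frac{n}{l}\right)^{n} \;\le\; n^{-n} \;=\; 2^{-m\,2^{m}}
\qquad (l \ge n^{2},\; n = 2^{m}).
\]
```

Each factor satisfies $(l-i)/(n-i) \ge l/n$ because $l \ge n$, and $l \ge n^{2}$ gives $n/l \le 1/n$. The bound $n^{-n}$ decreases faster than the inverse of any polynomial in $n$.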

Analysis of Known Attacks:
The proposed scheme has been analyzed for the following known attacks.
Guessing the key for sufficiently large parameters is considered a very difficult problem. Therefore, such an attack against the codeword would not be effective and can be thwarted with ease.
2) Privacy of the homomorphic operations:
Privacy, in the present scheme, is inherent because of the constant ciphertext size. This size does not change in the course of operations over the ciphertext. Hence it becomes hard to predict the type and number of operations performed.

3) Attacks with respect to the first bit of codeword:
An attack on the first bit is of no significance, as the length of the codeword and the locations of the remaining codeword bits are not revealed. Deducing the first bit is of no use in comparison to the total remaining length of the codeword.
4) Attacks against the DSCP:
Any such scheme can be defended by keeping the length of the ciphertext at the 5/2 bound for a given codeword length (18). The proposed scheme, whose ciphertext length is larger than this bound, can defend itself against such an attack.
5) Attack based on the properties of the RM code:
In general, the mapping between a message and a codeword in an RM code is deterministic; that is, a given message always yields the same codeword. So, when a codeword is embedded in a random string using the same key, the same bits always appear at the positions specified by the key whenever the same message is encrypted. A simple XOR operation over a sufficient number of ciphertexts will then reveal the positions at which the codeword is embedded, because those positions contain zero bits after XORing. The only way to defend against this attack is to change the message to be encrypted in such a way that encoding the same message results in different codewords. Prepending a random string and zero bits, as described in the Encrypt algorithm, serves this purpose, and with this message expansion operation such attacks can be thwarted.
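The XOR-based attack on a deterministic embedding can be demonstrated with a short simulation. This is a minimal sketch under assumptions: the sizes (n = 16, l = 256, 64 ciphertexts), the seed, and the use of a random bit list as a stand-in for the fixed codeword are illustrative choices, and no Reed-Muller encoding is performed because the attack only relies on the embedded bits being the same in every ciphertext.

```python
import random

rng = random.Random(2021)            # seeded only to make the demo repeatable
n, l, samples = 16, 256, 64

key = rng.sample(range(l), n)                       # secret embedding positions
codeword = [rng.randrange(2) for _ in range(n)]     # stand-in for a fixed codeword

def embed(codeword, key, l, rng):
    """Hide the codeword bits at the key positions of a fresh random string."""
    cipher = [rng.randrange(2) for _ in range(l)]
    for pos, bit in zip(key, codeword):
        cipher[pos] = bit
    return cipher

# The same message (hence the same codeword) is encrypted many times.
ciphers = [embed(codeword, key, l, rng) for _ in range(samples)]

# Positions whose XOR with the first ciphertext is always zero never change.
first = ciphers[0]
constant = [pos for pos in range(l)
            if all(c[pos] ^ first[pos] == 0 for c in ciphers)]

# Every key position is constant; random positions almost surely are not.
assert set(key) <= set(constant)
print(len(constant), "constant positions found;", n, "positions in the key")
```

With message expansion, the random prefix changes the codeword on every encryption, so the bits at the key positions are no longer fixed and this filter loses its grip.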

6) Attack concerned with guessing the $r, m$ values:
It is proposed to keep the Reed-Muller parameters $r, m$ secret for additional security. However, it is possible to guess the $r, m$ values as described below, though guessing them does not compromise the security of the scheme. As a simple analysis, consider the length of the ciphertext, which is $l$ bits, where $l \ge n^{2}$. Unless $l \gg n^{2}$, we get $\sqrt{l} \approx n$; in other words, the value of $\sqrt{l}$ will be close to $n$. Since $n = 2^{m}$, it is easy to guess the value of $n$ as a power of 2, and from that it is easy to compute $m = \log_2 n$. Now, having the value of $m$, the $(r, m)$ values can be any of the pairs $(1, m), (2, m), \ldots, (m, m)$. So, one can consider these pairs in turn, compute the corresponding generator matrix, and obtain the plaintext corresponding to the codewords in each of these codes. However, the complexity of such an attack to determine the codeword, and in turn the corresponding plaintext, is exponential, i.e., $2^{n}$, as discussed in Theorem 1 in terms of probability. Moreover, the plaintext recovered each time would correspond to an expanded plaintext, and obtaining the original plaintext from it adds a further exponential complexity. Thus, the overall complexity is the product of these two exponential terms.
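The parameter-guessing step described above can be sketched as follows. The sketch assumes the relations used in the analysis, namely n = 2^m and a ciphertext length l that is not much larger than n^2; the function name is hypothetical, and the dimension formula is the standard one for Reed-Muller codes (the sum of binomial coefficients C(m, i) for i from 0 to r).

```python
import math

def candidate_parameters(l):
    """Guess n, m and the candidate (r, m, k) triples from the ciphertext length l."""
    # Largest power of two not exceeding sqrt(l) is taken as the guess for n.
    n = 1 << (math.isqrt(l).bit_length() - 1)
    m = n.bit_length() - 1                       # m = log2(n)
    candidates = []
    for r in range(1, m + 1):
        k = sum(math.comb(m, i) for i in range(r + 1))   # dimension of RM(r, m)
        candidates.append((r, m, k))
    return n, m, candidates

n, m, candidates = candidate_parameters(256)
for r, m_, k in candidates:
    print(f"RM({r},{m_}): codeword length {n}, dimension {k}")
```

Listing the candidate codes is cheap; as the text notes, the cost lies in locating the embedded codeword for each candidate, not in enumerating the pairs.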

Conclusion:
In the present work, a modified RMC-based symmetric key FHE scheme with a padding mechanism has been proposed. Padding the plaintext with a random bit string ensures a one-to-many (non-deterministic) mapping between plaintext and ciphertext. It is shown that the padding mechanism enhances security against chosen-plaintext attacks. The modified KeyGen, Encrypt, and Decrypt algorithms incorporating the padding mechanism have been presented. The security of the proposed scheme is proved through an IND-CPA game-based proof, which argues that retrieving a fixed-length bit string embedded at secret positions in a large random bit string is a hard problem. Further, the scheme is analyzed with respect to the changes and new techniques suggested, to show that it is secure against all the known attacks. The scheme involves simple operations, which makes it easy to implement; in particular, the homomorphic operations H.Add and H.Mul are simple mod 2 operations. In this paper, we focused on the study of the algorithm and its security proof with respect to the proposed enhancements. Other aspects of the practical implementation of the scheme, with different parameters, remain to be studied, and its performance needs to be compared with that of related works in future work.

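Theorem 1.
Let the proposed symmetric key encryption scheme be S = (KeyGen, Encrypt, Decrypt, Evaluate). Let $\lambda$ be the security parameter, and let the other parameters be chosen or computed as defined. Then an adversary A that attempts to break the IND-CPA security of the proposed scheme has an advantage $\epsilon(\lambda)$, where $\epsilon(\lambda)$ is negligible.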

Open Access, Baghdad Science Journal, 2021, Vol. 18, No. 2 (Suppl. June). P-ISSN: 2078-8665, E-ISSN: 2411-7986.
The key represents a set of bit positions, random in nature, within a binary string of length $l$. That is, it is the set of locations at which the bits of the codeword are to be embedded during encryption. The parameters, the generator matrix, and the key are all kept secret.
2. A is given the security parameter $1^{\lambda}$ and oracle access to the encryption algorithm Encrypt(·) under the secret key.

1) Brute-force attack: Based on Theorem 1, it can be deduced that the present scheme is secure against brute-force attack for all the cases where