Efficiently Processing Complex-Valued Data in Homomorphic Encryption

Abstract We introduce a new homomorphic encryption scheme that is natively capable of computing with complex numbers. This is done by generalizing recent work of Chen, Laine, Player and Xia, who modified the Fan–Vercauteren scheme by replacing the integral plaintext modulus t by a linear polynomial X − b. Our generalization studies plaintext moduli of the form X^m + b. Our construction significantly reduces the noise growth in comparison to the original FV scheme, so much deeper arithmetic circuits can be homomorphically executed.


Introduction
The goal of homomorphic encryption is to allow for arbitrary arithmetic operations on encrypted data, such that the decrypted result equals the outcome of the same calculation carried out in the clear. Since the publication of Gentry's seminal work [15], this research area has evolved rapidly and is on the verge of reaching a first degree of maturity, as was recently demonstrated e.g. by practical implementations of privacy-enhanced electricity load forecasting [2,4], digital image processing [1,10], and medical data management [8,12,17]. Most of the current focus lies on somewhat homomorphic encryption (SHE), where the schemes are capable of homomorphically evaluating an arithmetic circuit having a certain predetermined computational depth. The leading proposals for realizing this goal are the Brakerski-Gentry-Vaikuntanathan (BGV) scheme [5] and the Fan-Vercauteren (FV) scheme [13].
In actual applications, the input to the homomorphic evaluation of an arithmetic circuit C needs to be preprocessed in two steps. The first step is encoding, where one's task is to represent the actual 'real world data' as elements of the plaintext space of the envisaged SHE scheme. This plaintext space is a certain commutative ring, and the encoding should be such that real world arithmetic agrees with the corresponding ring operations, up to the anticipated computational depth.
In the original descriptions of BGV and FV, the plaintext space is a ring of the form R_t = Z[X]/(t, f(X)), where t ≥ 2 is an integer and f(X) ∈ Z[X] is a monic irreducible polynomial. Throughout this paper we will stick to the common choice of 2-power cyclotomics f(X) = X^n + 1, where n = 2^k for some integer k ≥ 1. Encoding numerical input is typically done by taking an integer-digit expansion with respect to some base b, then replacing b by X and finally reducing the digits modulo t. Decoding then amounts to lifting the coefficients back to Z, for instance by choosing representatives in (−t/2, t/2], and evaluating the result at X = b. Thanks to the relation X^{−1} ≡ −X^{n−1} it is possible to allow the expansions to have a fractional part. In this case the decoding step must be preceded by replacing the monomials X^i of degree i > B by −X^{i−n}, for some appropriate point of separation B. All these parameters need to be chosen in such a way that the evaluation of C on the encoded data decodes to the right outcome. At the same time one wants t to be as small as possible, because its size highly affects the efficiency of the resulting SHE computation. Selecting optimal parameters is a tedious application-dependent balancing act to which a large amount of recent literature has been devoted, see e.g. [2,7,9,11,12,17,19].
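The integer encoder described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the base b is assumed odd so that balanced digits exist for negative inputs, and no fractional part is handled.

```python
def balanced_digits(x, b):
    """Balanced base-b digits of x (odd b assumed), least significant first."""
    digits = []
    while x != 0:
        r = x % b
        if r > b // 2:
            r -= b
        digits.append(r)
        x = (x - r) // b
    return digits


def encode(x, b, n, t):
    """Replace b by X in the expansion of x and reduce the digits modulo t."""
    coeffs = [0] * n
    for i, d in enumerate(balanced_digits(x, b)):
        coeffs[i] = d % t
    return coeffs


def mul(f, g, n, t):
    """Product in R_t = Z[X]/(t, X^n + 1), using X^n = -1."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k >= n:
                out[k - n] = (out[k - n] - fi * gj) % t
            else:
                out[k] = (out[k] + fi * gj) % t
    return out


def decode(coeffs, b, t):
    """Lift coefficients to (-t/2, t/2] and evaluate at X = b."""
    total = 0
    for i, c in enumerate(coeffs):
        c = c - t if c > t // 2 else c
        total += c * b ** i
    return total
```

With b = 3, n = 16 and t = 64, multiplying the encodings of 13 and −7 and decoding gives −91, because the coefficients of the product stay within (−t/2, t/2] and the degree stays below n.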
Because in practice n is of size at least 1024, the plaintext spaces R_t can a priori host an enormous range of data, even for very small values of t. Unfortunately this is hindered by their structure, which is not a great match with numerical input data types like integers, rationals or floats. For example, if t = 2 then it is not even possible to add a non-zero element to itself without incorrect decoding. Because of such phenomena, values of t are required that typically consist of dozens of decimal digits, badly affecting the efficiency. An idea to remedy this situation has been around for a while [5,14,16] and uses a polynomial plaintext modulus, rather than just an integer. Recently the first detailed instantiation of this idea was given by Chen, Laine, Player and Xia [7], who adapted the FV scheme to plaintext moduli t = X − b for some b ∈ Z_{≥2}. In this case the plaintext space becomes R_{X−b} = Z[X]/(X − b, X^n + 1) ≅ Z/(b^n + 1), whose structure is a much better match with the common numerical input data types. This allows for much smaller plaintext moduli (norm-wise), with beneficial consequences for the efficiency, or for the depth of the circuits C that can be handled [7, Section 7.2].
This paper further explores the paradigm that the structure of the plaintext space R_t should match the input data type as closely as possible. Concretely, we focus on complex-valued data types, such as cyclotomic integers and floating point complex numbers. We study this setting mainly in its own right, but note that complex input data has been considered in homomorphic encryption before, e.g., in the homomorphic evaluation of the Discrete Fourier Transform studied by Costache, Smart and Vivek [10] in the context of digital image processing, where the input consists of cyclotomic integers.

Representing complex numbers.
One naive way to encode a complex number z would be to view it as a pair of real numbers, for instance using Cartesian or polar coordinates. These can be fed separately to the SHE scheme, which is now used to evaluate two circuits. A more direct way is to use a complex base b. For instance, one could take b = e^{πi/n}, as was done by Cheon, Kim, Kim and Song [9], albeit in a somewhat different context. This choice has the additional feature that f(b) = 0, so that wrapping around modulo f(X) = X^n + 1 does not lead to incorrect decoding. However, finding an integer-digit base-b expansion with small norm which approximates z sufficiently well is an n-dimensional lattice problem, which is practically infeasible. To get around this, Costache, Smart and Vivek [10] instead use b = ζ := e^{πi/m} for some divisor m | n, which is small enough for finding short base-ζ approximations, while preserving the feature that wrapping around modulo f(X) is harmless. But in their approach, a huge portion of plaintext space is left unused. Indeed, the encoding map is

    Z[ζ] → R_t : Σ_{i=0}^{m−1} z_i ζ^i ↦ Σ_{i=0}^{m−1} z̄_i Y^i,

where Y = X^{n/m}, t ≥ 2 is an integral plaintext modulus and z̄_i is the reduction of z_i mod t, so that all plaintext computations are carried out in the subring Z[Y]/(t, Y^m + 1), which is of index t^{n−m} in R_t. Our proposal is to resort to a plaintext modulus of the form t = X^m + b for some small integer b, with |b| ≥ 2. In this case, for m < n, we have R_{X^m+b} = Z[X]/(X^m + b, X^n + 1) = Z[X]/(b^{n/m} + 1, X^m + b). An additional assumption (which is discussed in more detail in the next section) is that

    there exists α ∈ Z/(b^{n/m} + 1) such that α^m = b̄,    (1)

where b̄ denotes the reduction of b modulo b^{n/m} + 1. Throughout we fix such an α and let β be its multiplicative inverse, which necessarily exists. This implies that (βX)^m + 1 = 0, therefore we have a well-defined ring homomorphism

    Z[ζ] → R_{X^m+b} : Σ_{i=0}^{m−1} z_i ζ^i ↦ Σ_{i=0}^{m−1} z̄_i β^i X^i,    (2)

which is surjective with kernel (b^{n/m} + 1).
In other words, while Costache, Smart and Vivek restrict their computations to an injective copy of Z[ζ]/(t) inside R_t, we can view R_{X^m+b} as an isomorphic copy of Z[ζ]/(b^{n/m} + 1). Essentially, our approach transfers the unused part of the plaintext space coming from the large dimension n into a larger integral modulus, reflected in the exponent n/m.
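To make the role of α and β concrete, the following sketch searches by brute force for an α with α^m = b modulo b^{n/m} + 1 and checks that (βX)^m = −1 once X^m is replaced by −b. The function name `find_alpha` and the toy parameters n = 16, m = 2, b = 2 are illustrative assumptions only; the search is exponential in the size of the modulus and is not meant for realistic n.

```python
def find_alpha(b, m, n):
    """Smallest alpha with alpha^m = b mod b^(n/m) + 1, or None if none exists."""
    p = b ** (n // m) + 1
    for alpha in range(1, p):
        if pow(alpha, m, p) == b % p:
            return alpha
    return None


n, m, b = 16, 2, 2
p = b ** (n // m) + 1          # 2^8 + 1 = 257
alpha = find_alpha(b, m, n)
beta = pow(alpha, -1, p)       # modular inverse (Python 3.8+)

# In the plaintext ring X^m = -b, so (beta * X)^m = beta^m * (-b).
beta_x_to_the_m = (pow(beta, m, p) * (-b)) % p
```

Here `beta_x_to_the_m` equals p − 1, i.e. −1 modulo p, which is what makes the map ζ ↦ βX well defined.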
In the remainder of this paper, we explain how this observation can be used to efficiently process complex-valued input data in homomorphic encryption. First, in Section 2 we explain how to encode and decode elements of the ring Z[ζ] of 2m-th cyclotomic integers and discuss the assumption (1), with special attention to the case m = 2 where Z[ζ] = Z[i] is the ring of Gaussian integers. Next, in Section 3 we explain how this can be used to encode other data types such as cyclotomic rationals or complex floats, either by resorting to LLL as in [10] or by using Chen et al.'s fractional encoder from [7]. In Section 4 we discuss how to adapt the FV scheme so that it can cope with plaintext spaces of the form R_{X^m+b}. Finally, in Section 5 we discuss the performance of this adaptation in comparison with previous approaches. In short, we can reach a depth at least 5 times that of the best approach which directly encrypts encodings of complex numbers [10]. We can also reach very similar depths to the state of the art where one encrypts the real and imaginary parts separately [7]. However, since we natively encrypt complex numbers, our ciphertexts are two times smaller and hence our approach is more efficient by roughly a factor two in space and three in time.

Encoding and decoding elements of Z[ζ ]
Encoding
Encoding an element of Z[ζ] happens in two steps. The first step applies the map (2), yielding a polynomial of degree less than m which typically has very large coefficients. The second step is comparable to the hat encoder of Chen et al. [7] and switches to another representative by spreading this polynomial across the range 1, X, ..., X^{n−1} while making the coefficients a lot smaller. The result will then be lifted to R = Z[X]/(X^n + 1) and fed to our adaptation of the FV scheme, where the smaller coefficients are important to keep the noise growth bounded.
Here is how this second step is carried out in practice: we think of the coefficients z̄_i β^i as being represented by integers between −⌊b^{n/m}/2⌋ and ⌈b^{n/m}/2⌉. We then expand these integers to base b using digits a_{i,j} from the range −⌊b/2⌋, ..., ⌊b/2⌋ to find

    z̄_i β^i = Σ_{j=0}^{n/m−1} a_{i,j} b^j.

There is a minor caveat here, namely if b is odd then there are more integers modulo b^{n/m} + 1 than there are balanced b-ary expansions of length at most n/m. This is easily resolved by allowing the last digit to be one larger. For even b the situation is opposite: since z̄_i β^i is represented by an integer of size at most b^{n/m}/2 = b/2 · b^{n/m−1} we have a surplus of base-b expansions. Here it makes sense to choose an expansion with the smallest Hamming weight (e.g., if b = 2 then we simply pick the non-adjacent form). We denote the maximal number of non-zero coefficients that can appear in a fresh encoding by N_b.
Given such base-b expansions of the coefficients, we replace each occurrence of b by −X^m and then substitute the results in the image of (2). We end up with an expansion Σ_{i=0}^{n−1} c_i X^i where the c_i are represented by integers of absolute value at most ⌊b/2⌋, or in fact ⌊(b + 1)/2⌋ if we take into account the caveat.
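The second encoding step can be sketched as below for an odd base b, including the caveat that the last digit may be one larger. The helper names are hypothetical, and the input is assumed to be the list of m coefficients already produced by the map (2). The function `reduce_mod_xm_plus_b` is only there to check that the spread polynomial represents the same plaintext.

```python
def balanced_expansion(x, b, k):
    """k balanced base-b digits of x (odd b), least significant first.
    The last digit absorbs what remains, so it can be one larger."""
    digits = []
    for _ in range(k - 1):
        r = x % b
        if r > b // 2:
            r -= b
        digits.append(r)
        x = (x - r) // b
    digits.append(x)
    return digits


def spread(coeffs, b, m, n):
    """Spread sum_i coeffs[i] X^i over 1, X, ..., X^(n-1) with small coefficients."""
    k = n // m
    p = b ** k + 1
    out = [0] * n
    for i, c in enumerate(coeffs):
        c %= p
        if c > p // 2:            # centred representative
            c -= p
        for j, a in enumerate(balanced_expansion(c, b, k)):
            out[i + m * j] = a * (-1) ** j   # b^j becomes (-X^m)^j
    return out


def reduce_mod_xm_plus_b(poly, b, m, n):
    """Reduce modulo X^m + b (and hence modulo b^(n/m) + 1)."""
    p = b ** (n // m) + 1
    out = [0] * m
    for idx, c in enumerate(poly):
        out[idx % m] = (out[idx % m] + c * (-b) ** (idx // m)) % p
    return out
```

For n = 8, m = 2, b = 3 the modulus is 82; spreading the coefficients [41, 77] gives a polynomial whose coefficients have absolute value at most 2 = (b + 1)/2, and reducing it modulo X^2 + 3 returns [41, 77].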

Decoding
In order to decode a given expansion Σ_{i=0}^{n−1} c_i X^i we walk through the same steps in reverse order. First we pick another representative by reducing the expansion modulo X^m + b, in order to end up with Σ_{i=0}^{m−1} c'_i X^i, where c'_i ∈ Z/(b^{n/m} + 1). This can be rewritten as Σ_{i=0}^{m−1} (c'_i α^i)(βX)^i, whose preimage under (2) is Σ_{i=0}^{m−1} z_i ζ^i with z_i the centred representative of c'_i α^i modulo b^{n/m} + 1.

On the assumption (1)
Usually n and m are determined by security considerations and the concrete application. To apply our encoding method we want to find a small value of b for which condition (1) is met. This is easiest if n/m is small or m is small. If no satisfactory value of b can be found then one can try to enlarge m and view Z[ζ] as a subring of a higher degree cyclotomic ring. Below we give two lemmas constraining the possible choices for b given m and n, still assuming we are working with 2-power cyclotomic f. Their proofs are given in the full version of this paper [3]. We note that it does not help to allow for negative b in our case, that is for n = 2^k, because b satisfies (1) if and only if −b does.

Lemma 2.2. Let g be an element of order n in Z_{4n}^× and let t be an element of order 2 not in ⟨g⟩, so that Z_{4n}^× = ⟨t⟩ × ⟨g⟩. If condition (1) is satisfied for odd b > 1 and m > 1 then b mod 4n is an element of the subgroup ⟨t⟩ × ⟨g^m⟩. In particular this implies that b ≡ ±1 mod 4m.
In fact, one may always take g = 3 and t = −1 in the above lemma.
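The lemma can be checked numerically on toy parameters. The sketch below tests condition (1) by exhaustive search and computes multiplicative orders naively; both function names are hypothetical and the parameters n = 8, m = 2 are chosen only to keep the search tiny.

```python
def satisfies_condition_1(b, m, n):
    """True if some alpha satisfies alpha^m = b modulo b^(n/m) + 1."""
    p = b ** (n // m) + 1
    target = b % p
    return any(pow(a, m, p) == target for a in range(p))


def multiplicative_order(g, modulus):
    """Order of g in the unit group modulo `modulus` (g assumed to be a unit)."""
    order, power = 1, g % modulus
    while power != 1:
        power = (power * g) % modulus
        order += 1
    return order


n, m = 8, 2
good = [b for b in range(3, 10, 2) if satisfies_condition_1(b, m, n)]
```

Among the odd values 3, 5, 7, 9 only 7 and 9 satisfy condition (1) for these parameters, and both are ±1 modulo 4m = 8, as the lemma predicts. The order of 3 modulo 4n = 32 is 8 = n, consistent with the choice g = 3.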
Our method is particularly friendly towards Gaussian integers. Indeed if m = 2 then one can always take b = 2, as one easily verifies that α^2 = 2 where α = 2^{n/8}(2^{n/4} − 1). The map (2) then defines an isomorphism between R_{X^2+2} and Z[i]/(2^{n/2} + 1). If this ring is not large enough to ensure correct decoding, then one can move to slightly larger values of b. The next choice which always works is b = 4, where one can simply take α = 2. Here the ring becomes Z[i]/(2^n + 1).
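For Gaussian integers the whole encode, multiply, decode round trip fits in a short sketch. The toy parameters n = 16, m = 2, b = 2 and all function names are assumptions made for illustration; the non-adjacent form is used for the base-2 digits, and α is taken as 2^{n/8}(2^{n/4} − 1) modulo 2^{n/2} + 1, which squares to 2.

```python
N, M, B = 16, 2, 2
P = B ** (N // M) + 1                                  # 2^8 + 1 = 257
ALPHA = (2 ** (N // 8) * (2 ** (N // 4) - 1)) % P      # alpha^2 = 2 mod P
BETA = pow(ALPHA, -1, P)                               # Python 3.8+


def centred(c):
    c %= P
    return c - P if c > P // 2 else c


def naf(x):
    """Non-adjacent form of x with digits in {-1, 0, 1}, least significant first."""
    digits = []
    while x != 0:
        d = 0
        if x % 2:
            d = 2 - (x % 4)
            x -= d
        digits.append(d)
        x //= 2
    return digits


def encode(z):
    """Encode the Gaussian integer z = (re, im) as a polynomial of degree < N."""
    out = [0] * N
    for i, c in enumerate([z[0], z[1] * BETA]):        # map (2): i -> BETA * X
        for j, a in enumerate(naf(centred(c))):
            out[i + M * j] = a * (-1) ** j             # 2^j becomes (-X^2)^j
    return out


def mul(f, g):
    """Product in Z[X]/(X^N + 1)."""
    out = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            sign = -1 if i + j >= N else 1
            out[(i + j) % N] += sign * fi * gj
    return out


def decode(poly):
    """Reduce modulo X^2 + 2 and undo the map (2)."""
    c = [0, 0]
    for idx, coeff in enumerate(poly):
        c[idx % M] += coeff * (-B) ** (idx // M)
    return (centred(c[0]), centred(c[1] * ALPHA))
```

Multiplying the encodings of 3 − 5i and 2 + i and decoding returns (11, −7), matching (3 − 5i)(2 + i) = 11 − 7i. The result is only correct while the real and imaginary parts stay within the centred range modulo 257.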

Encoding complex-valued input data
In this section we look at the more general problem of encoding floating point complex numbers. Our approach will be to approximate these complex numbers by suitable cyclotomic rationals and then proceed as in Section 2. We have many choices for such approximations including the choice of m which defines which root of unity we are working with. We also have the choice between using integer or rational coefficients for the approximation. Perhaps the most obvious and straightforward approach is to consider our complex number z written in terms of its real and imaginary parts, say z = x + yi for some real numbers x and y. We can then approximate x and y by rationals depending on how much precision we require. This leads us to considering the case m = 2 and the question then arises of how to encode fractional coefficients.

Fractional encoding
Here we consider how to encode a rational number into the space Z/pZ for some integer p, so that it can then be expanded using the technique in Section 2. This problem was considered by Chen, Laine, Player and Xia in [7, Section 6]. Their approach is to define a finite subset P of Q along with an encoding map Enc : P → Z/pZ and a decoding map Dec : Enc(P) → P. The maps should satisfy, firstly, correctness: Dec(Enc(x/y)) = x/y for x/y ∈ P and secondly, Enc should be both additively and multiplicatively homomorphic so long as it still encodes an element of P. The natural choice for the map Enc is Enc(x/y) = x·y^{−1} mod p, where the inverse of y is computed modulo p. Care thus needs to be taken to ensure that y has such an inverse, which is ensured with a careful choice of P.
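The map Enc(x/y) = x·y^{−1} mod p and its homomorphic behaviour can be illustrated as follows. The function name `enc` and the modulus p = 257 are illustrative assumptions; the decoding map Dec and the set P are not modelled here.

```python
from fractions import Fraction


def enc(value, p):
    """Enc(x/y) = x * y^(-1) mod p; raises ValueError if y is not invertible mod p."""
    value = Fraction(value)
    return (value.numerator * pow(value.denominator, -1, p)) % p


p = 257
half = enc(Fraction(1, 2), p)
quarter = enc(Fraction(1, 4), p)
three_quarters = enc(Fraction(3, 4), p)
```

Here `half + quarter` agrees with `three_quarters` modulo p, and `half * enc(Fraction(3, 2), p)` does as well, which is the additive and multiplicative compatibility required of Enc.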
In our setting the coefficient modulus p is of the form b^{n/2} + 1, thus if one wants roughly the same precision for the integer and fractional parts one can take for an odd base b while for even b one can choose where δ ∈ {0, 1} depending on whether you want one more base-b digit in the fractional (δ = 0) or integer (δ = 1) part.
The encoding of an element e ∈ P is then computed as −e·b^{n/2} mod (b^{n/2} + 1). The important thing to note about using this encoding is that for decoding to work the result of the computations must lie in P. If your input data are complex numbers and you approximate them using n/4 fractional b-ary digits then it is likely that after one multiplication the result is no longer in P. Thus one must appropriately choose the precision with which to encode the data, depending primarily on the depth of the circuit to be evaluated and the final precision required. The only constraint is that the precision should be a divisor of b^{n/4} so that −e·b^{n/2} is an integer.
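Since b^{n/2} ≡ −1 modulo b^{n/2} + 1, multiplying by −b^{n/2} does not change the residue class, which is why −e·b^{n/2} agrees with x·y^{−1}. A small sketch, with the hypothetical name `enc_scaled` and toy parameters b = 2, n = 16:

```python
from fractions import Fraction


def enc_scaled(e, b, n):
    """Compute -e * b^(n/2) mod b^(n/2) + 1 for a rational e whose denominator divides b^(n/2)."""
    scale = b ** (n // 2)
    scaled = -Fraction(e) * scale
    if scaled.denominator != 1:
        raise ValueError("denominator of e must divide b^(n/2)")
    return int(scaled) % (scale + 1)


p = 2 ** 8 + 1
direct = (3 * pow(4, -1, p)) % p        # x * y^(-1) mod p for e = 3/4
scaled = enc_scaled(Fraction(3, 4), 2, 16)
```

Both `direct` and `scaled` are the same residue modulo 257.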
We note that the fractional encoder does not require m to be 2. However, when m ≠ 2 there appears to be no straightforward way to find a good rational approximation with small numerators and denominators, except when the denominators are all equal. In this case, if this denominator is r, then we simply require an approximation of rz in Z[ζ] subject to some constraint on the coefficients. However, the problem of finding such an approximation to our complex number itself, rather than a scaling, is interesting in its own right as it avoids the need for encoding fractional values and tracking the denominator inherently present in such encodings.

Integer coefficient approximation
The task of finding a cyclotomic integer closely approximating an arbitrary complex number was considered by Costache, Smart and Vivek in [10]. Here the idea is to solve an instance of the closest vector problem (CVP) in the (scaled) lattice Z[ζ], where the power basis is scaled and split into real and imaginary parts, which are approximated by integers. In detail: we choose a scaling constant C > 0, and define the constants ⌈C·ℜ(ζ^i)⌋ and ⌈C·ℑ(ζ^i)⌋ for i = 0, ..., m − 1, which serve as the last two coordinates of the i-th basis vector. The target vector in our CVP instance will then be the appropriately scaled real and imaginary parts of the complex number z we wish to approximate. Concretely, this vector is (0, ..., 0, ⌈ℜ(Cz)⌋, ⌈ℑ(Cz)⌋).
If (z_0, ..., z_{m−1}, A, B) is a solution to the CVP instance then we must have A = Σ_{i=0}^{m−1} z_i ⌈C·ℜ(ζ^i)⌋ ≈ ℜ(Cz), and similarly for the imaginary part. We therefore see that Σ_{i=0}^{m−1} z_i ζ^i is a good approximation to z. Further, C gives some control over the quality of the approximation: a larger C gives a finer-grained lattice but also increases the size of the last two coefficients of the basis vectors. This may lead to a larger distance between the target vector and the closest lattice point, which in turn makes solving the CVP instance harder and negatively affects the quality of our approximation of Cz.
In [10] the authors solve this CVP instance using the embedding technique. Namely, they attempt to solve the shortest vector problem in the lattice spanned by the rows of the matrix obtained from a basis of the CVP lattice by appending a zero to each basis vector and adding, as an extra row, the target vector extended by a final coordinate T, for some non-zero constant T. With suitable parameter choices, performing LLL reduction on this lattice will return a basis of short vectors for this lattice, among which at least one has ±T in the final coordinate. The remaining coefficients then give plus or minus the target vector minus a close vector.
One issue with the embedding technique is that each new CVP instance requires performing lattice reduction, which for large m is rather time-consuming. In typical applications we want to approximate many different complex numbers using the same C, so only the target vector changes. A more efficient approach therefore is to perform lattice reduction on the CVP lattice itself. Since this is independent of the target vector, it needs to be done only once, so we can spend significantly more time in this step to find a good basis of this lattice. We can then apply a technique such as Babai's nearest plane algorithm, or Babai's rounding algorithm, with this reduced basis to find an approximate closest vector.
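The following sketch illustrates Babai's nearest plane algorithm on a lattice of the kind described above. Several things here are assumptions made for illustration: the basis rows are taken to be unit vectors extended by the rounded, scaled real and imaginary parts of ζ^i, all function names are hypothetical, and no lattice reduction is performed, so the basis stands in for the reduced basis one would use in practice and the approximation quality may be poor.

```python
import cmath
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cvp_basis(m, C):
    """Assumed basis: row i is the unit vector e_i followed by round(C*Re(zeta^i)), round(C*Im(zeta^i))."""
    zeta = cmath.exp(1j * math.pi / m)
    rows = []
    for i in range(m):
        w = zeta ** i
        row = [0] * (m + 2)
        row[i] = 1
        row[m] = round(C * w.real)
        row[m + 1] = round(C * w.imag)
        rows.append(row)
    return rows


def gram_schmidt(basis):
    """Gram-Schmidt orthogonalisation (floating point, no normalisation)."""
    ortho = []
    for v in basis:
        w = [float(x) for x in v]
        for u in ortho:
            mu = dot(v, u) / dot(u, u)
            w = [wi - mu * ui for wi, ui in zip(w, u)]
        ortho.append(w)
    return ortho


def babai_nearest_plane(basis, target):
    """Integer coefficients of a lattice vector close to `target`."""
    ortho = gram_schmidt(basis)
    residual = [float(x) for x in target]
    coeffs = [0] * len(basis)
    for i in reversed(range(len(basis))):
        c = round(dot(residual, ortho[i]) / dot(ortho[i], ortho[i]))
        coeffs[i] = c
        residual = [r - c * x for r, x in zip(residual, basis[i])]
    return coeffs


m, C = 4, 1000
z = 0.3 + 0.7j
basis = cvp_basis(m, C)
target = [0] * m + [round(C * z.real), round(C * z.imag)]
coeffs = babai_nearest_plane(basis, target)   # candidate z_0, ..., z_{m-1}
```

The Gram-Schmidt data depends only on the basis, so in an application approximating many numbers it would be computed once and reused for every target vector.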

Adapting the Fan-Vercauteren SHE scheme
In this section we construct a variant of the FV scheme [13] with plaintext modulus X m + b following the blueprint given in [7]. We prove correctness of this scheme (Theorem 4.1) and analyze the noise growth induced by homomorphic arithmetic operations (Theorem 4.2). The proofs of these theorems are given in the full version of the paper [3].

Basic scheme
Writing R = Z[X]/(X^n + 1), the ciphertext space is defined by R_q = R/(q) for some positive integer q, while the plaintext space is R_{X^m+b} = R/(X^m + b). We will assume that b ≪ q. Recall that in the original FV scheme the plaintext space is R/(t) for some positive integer t ≪ q. We define the scaling parameter Δ_b as Δ_b = ⌊q/(X^m + b)⌉, where the quotient is computed in K = Q[X]/(X^n + 1) and rounded coefficient-wise. Obviously, Δ_b is the analogue of the scalar Δ = ⌊q/t⌋ in the original FV scheme. Other parameters are the error distribution χ_e = D(σ^2) on R (coefficient-wise with respect to the power basis, with standard deviation σ) and the key distribution χ_k = U_3, which uniformly generates elements of R with ternary coefficients (with respect to the power basis). We also define the decomposition base w and denote ℓ = ⌊log_w q⌋.
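As an illustration of the scaling parameter, the sketch below assumes that Δ_b is the coefficient-wise rounding of q/(X^m + b) computed in Q[X]/(X^n + 1), by analogy with ⌊q/t⌋. For n/m even, the inverse of X^m + b is (b^{n/m} + 1)^{−1} · Σ_j (−1)^j b^{n/m−1−j} X^{mj}, which is what the code uses. Function names and toy parameters are hypothetical.

```python
from fractions import Fraction


def delta_b(q, b, m, n):
    """Coefficient-wise rounding of q / (X^m + b) in Q[X]/(X^n + 1); assumes n/m is even."""
    k = n // m
    norm = b ** k + 1
    coeffs = [0] * n
    for j in range(k):
        coeffs[m * j] = round(Fraction(q * (-1) ** j * b ** (k - 1 - j), norm))
    return coeffs


def plaintext_modulus(b, m, n):
    """Coefficient vector of X^m + b."""
    t = [0] * n
    t[0], t[m] = b, 1
    return t


def mul_negacyclic(f, g):
    """Product in Z[X]/(X^n + 1)."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            sign = -1 if i + j >= n else 1
            out[(i + j) % n] += sign * fi * gj
    return out


q, b, m, n = 2 ** 20, 2, 2, 8
delta = delta_b(q, b, m, n)
error = mul_negacyclic(plaintext_modulus(b, m, n), delta)
error[0] -= q   # (X^m + b) * delta - q
```

The polynomial `error` has coefficients of absolute value at most (b + 1)/2, so (X^m + b)·Δ_b is close to q in the same way that t·⌊q/t⌋ is close to q in FV.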
The new encryption scheme ComFV is then defined in the same way as FV where t and Δ are replaced by X^m + b and Δ_b, respectively.
The security of this scheme is based on the same argument as that of the original FV scheme. In particular, it is hard to distinguish the public key pk and ciphertext pairs from uniform tuples according to the decision version of the Ring-LWE problem [18]. The evaluation key evk does not leak any information about the secret key as long as a circular security assumption holds [13].
Recall that for an element a ∈ K the canonical (infinity) norm of a is defined as ‖a‖^can_∞ = max_{0≤j<n} |a(ξ_j)|, where the ξ_j are the complex roots of X^n + 1. To verify correctness we use the notion of invariant noise introduced in [7]. The invariant noise of a ciphertext ct = (c_0, c_1) encrypting a plaintext msg ∈ R_{X^m+b} is an element v ∈ K with the smallest canonical norm such that ((X^m + b)/q) · [c_0 + c_1 · s]_q = msg + v + g · (X^m + b) for some g ∈ R.¹ Decryption then works correctly when ‖v‖^can_∞ < 1/2, as supported by the following theorem.

Homomorphic operations
In this section we show how homomorphic addition and multiplication are performed in the new scheme. We prove correctness of these operations and estimate the invariant noise growth. Throughout this section, Ct(msg, v) denotes a ciphertext encrypting message msg ∈ R_{X^m+b} with invariant noise v. Addition is the coordinate-wise sum of corresponding ciphertext components: ComFV.Add(ct_0, ct_1) = ([ct_0[0] + ct_1[0]]_q, [ct_0[1] + ct_1[1]]_q).
The invariant noise grows additively after homomorphic addition. Multiplication consists of two steps. The first one, denoted ComFV.BMul, returns the coefficients of the ciphertext product when expressed as a polynomial in s, namely of (ct_0[0] + ct_0[1]s)(ct_1[0] + ct_1[1]s). The second step then maps the degree two term back to degree one using the relinearization technique.
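The algebraic shape of these two operations can be sketched on toy polynomials. This is not the full ComFV.BMul: the scaling by (X^m + b)/q with rounding, and the relinearization step, are deliberately omitted, and the expansion is carried out directly modulo q. All names and parameters are illustrative assumptions.

```python
def centred_mod(x, q):
    r = x % q
    return r - q if r > q // 2 else r


def poly_add(f, g, q):
    return [centred_mod(a + b, q) for a, b in zip(f, g)]


def poly_mul(f, g, q):
    """Product in R_q = Z[X]/(q, X^n + 1)."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            sign = -1 if i + j >= n else 1
            out[(i + j) % n] += sign * fi * gj
    return [centred_mod(c, q) for c in out]


def comfv_add(ct0, ct1, q):
    """Coordinate-wise sum of the two ciphertext components."""
    return (poly_add(ct0[0], ct1[0], q), poly_add(ct0[1], ct1[1], q))


def degree_two_expansion(ct0, ct1, q):
    """Coefficients of (ct0[0] + ct0[1] s)(ct1[0] + ct1[1] s) as a polynomial in s."""
    e0 = poly_mul(ct0[0], ct1[0], q)
    e1 = poly_add(poly_mul(ct0[0], ct1[1], q), poly_mul(ct0[1], ct1[0], q), q)
    e2 = poly_mul(ct0[1], ct1[1], q)
    return (e0, e1, e2)


def evaluate_at(ct, s, q):
    """Evaluate sum_k ct[k] * s^k in R_q."""
    acc = [0] * len(s)
    power = [1] + [0] * (len(s) - 1)
    for c in ct:
        acc = poly_add(acc, poly_mul(c, power, q), q)
        power = poly_mul(power, s, q)
    return acc
```

Evaluating the sum at s gives the sum of the evaluations, and evaluating the three-term expansion at s gives the product of the evaluations, which is the identity that relinearization then compresses back to two components.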
¹ In [7] the authors do not reduce c_0 + c_1 · s modulo q, but this does not change v as the quotient is absorbed into g.

Heuristic 4.2 (Multiplication noise). Given two ciphertexts ct
with very high probability.
We note that the dominating term here is the first term and not the term containing the product of the canonical norms of the multiplicands since the canonical norms are smaller than 1/2 when the ciphertext can be decrypted correctly.

Comparison with FV: regular circuits
To estimate the performance of ComFV in a general setting and fairly compare it with the original FV scheme and the work of [7], we resort to regular circuits as introduced in [11]. These circuits have already been used in [7] for the same purpose. A regular circuit consists of D computational levels where each level contains A ∈ {0, 3, 10} addition levels, requiring 2^A inputs, followed by one multiplication. Therefore, in total the number of inputs required is 2^{D(A+1)}. Each circuit input is given by a complex number with real and imaginary parts from (−U, U) for some U ∈ {2^8, 2^16, 2^32, 2^64}. We will always use a precision of 16 fractional bits in this paper, which in the case of a complex number refers to both the real and imaginary parts independently.
Our aim is to compare ComFV to the previously best known scheme allowing native complex inputs as well as to the state of the art when encoding the real and imaginary parts separately [7]. We will compare this method with our method where we use the same encoding of the complex number as a cyclotomic integer. We chose m = 4 as this is the minimal m for which Z[ζ] is dense in C and it allows us to use b = 4^h for some h ∈ N, taking α = 2^{h/2} if h is even and α = 2^{(h(n+4)−4)/8}(2^{hn/4} − 1) if h is odd. We also use m = 4 when using to ensure correct decoding at depth D is to require 2^{16D} | b^{n/4} so taking b a power of two looks a good fit. We again compare this approach with ours; in this case we also use the fractional encoder.
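The stated choice of α for b = 4^h and m = 4 can be checked directly: α^4 should equal b modulo b^{n/4} + 1. The function name below is hypothetical and the check is run only on small n that are multiples of 8.

```python
def alpha_for_power_of_four(h, n):
    """alpha with alpha^4 = 4^h modulo (4^h)^(n/4) + 1, following the two cases for h."""
    p = 4 ** (h * n // 4) + 1
    if h % 2 == 0:
        return (2 ** (h // 2)) % p
    return (2 ** ((h * (n + 4) - 4) // 8) * (2 ** (h * n // 4) - 1)) % p


def check(h, n):
    """True if the chosen alpha satisfies condition (1) for m = 4 and b = 4^h."""
    p = 4 ** (h * n // 4) + 1
    return pow(alpha_for_power_of_four(h, n), 4, p) == (4 ** h) % p
```

For example, with h = 1 and n = 8 the modulus is 17 and α = 6, and indeed 6^4 ≡ 4 modulo 17.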
We computed the theoretical and heuristic maximal depth of a regular circuit which can be reached using FV, the CLPX approach of using plaintext modulus X − b, and our ComFV, with parameters n, q and σ given in the SEAL library [6] and the relinearization base w = 2^32. Our results are presented in Table 1. In the table we also give a value for b (or t) which allows one to reach this maximal depth. This b is very often not unique, and in this case we give the smallest b for which there is a decryption error at the next level. To find a heuristic estimate of the maximal depth that can be reached in each scheme we take a carefully chosen complex number and use this as the complex number given for all inputs of the circuit. One reason for this can be seen in Table 1, where for A = 10 depths of 14 can be achieved. This requires 2^{14·11} = 2^154 inputs, meaning that using different inputs would be completely infeasible in practice. Another good reason for choosing all inputs to be the same is that during addition there is no cancellation occurring; indeed the A levels of addition simply become the worst case of scaling by 2^A. The precise complex number we chose depends on the encoding scheme, but essentially one finds one with an encoding which has many large coefficients. If the fractional encoder is used then we take the complex number to be (U − 2^{−16})(1 + i), while when using the cyclotomic integer approximation approach it is a matter of trial and error, but this need only be done once for each U and m.
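The input count of a regular circuit, and the effect of feeding the same value to every input, can be written down in a few lines. The function names are hypothetical; the second function evaluates the circuit in the clear, where A addition levels on equal inputs amount to scaling by 2^A before each multiplication.

```python
def regular_circuit_inputs(depth, additions):
    """Number of inputs of a regular circuit with D levels and A addition levels per level."""
    return 2 ** (depth * (additions + 1))


def evaluate_with_equal_inputs(x, depth, additions):
    """Plain evaluation of a regular circuit when every input equals x."""
    for _ in range(depth):
        x = (2 ** additions * x) ** 2   # A addition levels, then one multiplication
    return x
```

For D = 14 and A = 10 the first function returns 2^154, the figure quoted above.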
From Table 1 we see that in all cases our methods greatly outperform the best scheme natively encrypting complex numbers. At a minimum we can achieve 5 times the depth, and for larger n our method becomes even more efficient, as the amount of plaintext space not being efficiently used only grows in the current solution. The CLPX method on the other hand is able to achieve slightly larger depths than our scheme, at most one more for the largest n we consider. Where our method improves is efficiency: we effectively halve the ciphertext size and are expected to be roughly three times faster, because we can use one multiplication operation per level whereas the CLPX approach requires three.

Conclusion
We constructed a new encoding algorithm for complex data values and a corresponding somewhat homomorphic encryption scheme by utilizing a polynomial plaintext modulus of the form X^m + b. This choice allows for a much better use of the available plaintext space and much slower noise growth compared to existing solutions encrypting complex numbers. As a result, for the same ciphertext modulus q and degree n, we can homomorphically evaluate between 5 and 12 times deeper circuits compared to existing solutions based on FV and natively encoding complex numbers. In comparison to the state of the art, which encrypts the real and imaginary parts of the complex numbers separately, our method reduces the size of ciphertexts by a factor of 2, making our scheme twice as efficient in space and roughly three times as efficient in time.