On McEliece-Type Cryptosystems Using Self-Dual Codes With Large Minimum Weight

One of the Round 3 Finalists in the NIST post-quantum cryptography call is the Classic McEliece cryptosystem. Although it is one of the most secure cryptosystems, the large size of its public key remains a practical limitation. In this work, we propose a McEliece-type cryptosystem using large minimum distance error-correcting codes derived from self-dual codes. To the best of our knowledge, such codes have not been implemented in a code-based cryptosystem until now. Moreover, we modify the decryption step of the system by introducing a decryption algorithm based on two private keys. We determine the parameters of binary codes with large minimum distance which, if implemented in a McEliece-type cryptosystem, would provide security levels of 80, 128, and 256 bits, respectively. For the 80-bit security case, we construct a large minimum distance self-dual code of length 1064, and use it to derive a random punctured code to be used in the corresponding McEliece-type cryptosystem. Compared to the original McEliece cryptosystem, the key size is reduced by about 38.5%, although an optimal decoding set is yet to be constructed to make the new system fully defined and usable.


I. INTRODUCTION
The process initiated by NIST to standardize one or more quantum-resistant public-key cryptographic algorithms is ongoing and is currently in its fourth round.¹ One of the candidate submissions for public-key encryption and key establishment is the Classic McEliece cryptosystem. This indicates that, after a long period of research on the original encryption scheme [11], this public-key cryptosystem is still considered one of the most secure.
Still, there is a major drawback, namely the size of its public key. This is a practical limitation for broad use in current communication systems. For comparison, at the 128-bit security level the public key of the McEliece cryptosystem has a size of around 187.69 Kb [4], whereas the RSA public key for the same security level has a size of 3 Kb (equivalently, 3 072 bits) [13, Table 2].
The associate editor coordinating the review of this manuscript and approving it for publication was Oussama Habachi.
¹ As of April 2023.
A significant number of studies aim to minimize the key size of the McEliece cryptosystem by using different families of error-correcting codes, but most of these variants have been broken (see e.g. [6], [12], [14]). This paper proposes a McEliece-type cryptosystem using codes with a higher error-correction capability than the codes adopted until now. By increasing the minimum distance of the implemented codes, we aim to decrease the size of the public key of the cryptosystem. More specifically, we use high minimum distance punctured codes derived from self-dual codes. Such punctured codes have no specific structure and do not belong to any known family of error-correcting codes. Our choice to use binary self-dual codes as source codes is based on two reasons: first, self-dual codes with large minimum distance exist (e.g., the extended Golay code), and second, there is an algorithm for constructing self-dual codes [7], [9], [22]. To the best of our knowledge, self-dual codes or punctured codes derived from them have not been implemented in a code-based cryptosystem until now. The reason is most likely twofold. First, binary self-dual codes with high minimum distance are known only up to length 130, which is too small for current security requirements. Second, there was no efficient decoding algorithm for such codes until recently [23], an exception being the extended Golay code [16].
The main contributions of this paper can be summarized as follows:
• We determine the parameters of a putative optimal self-dual code, from which a punctured code would provide a classical security level of 80, 128, and 256 bits (respectively, a quantum security level of 67, 101, and 183 bits) if implemented in a McEliece-type cryptosystem.
• For the 80-bit security case, we construct an optimal self-dual code of length 1 064. To the best of our knowledge, such a code is presented here for the first time.
• We derive a punctured code of this self-dual code to generate the public key of a McEliece-type cryptosystem. Further, we modify the decryption step of the system by introducing a decryption algorithm that uses two private keys, namely the punctured and the self-dual code.
Our theoretical analysis estimates that the security level of the system so defined is 80 and 67 bits against classical and quantum attacks, respectively. The size of the resulting public key is 276.39 Kb, whereas the best-known example of a binary Goppa code providing the same security level in the original McEliece cryptosystem requires a public key of 449.85 Kb [4]. Therefore, in this case, we achieve a reduction of the key size of around 38.5%. The results on the 80-bit security case suggest that self-dual codes can be used in a McEliece-type cryptosystem to reduce the key size for the same security level. However, a current limitation is that, to make this cryptosystem usable, one also needs to define an optimal decoding set. The computational search for an optimal decoding set is currently under way for the 80-bit security level case, and we leave the complete definition and analysis of our cryptosystem in this particular instance for future research.
In summary, the main innovation underlying this paper is the idea to investigate self-dual codes for McEliece-type cryptosystems. The motivation, in perspective, is to obtain more compact public keys in such cryptosystems, which is the main issue for their use. The approach proposed in this paper is still far from providing a practical solution, as the limitation outlined above on the optimal decoding set suggests. However, we deem this work to indicate a promising future research direction in code-based cryptosystems.

II. BACKGROUND
Let $\mathbb{F}_2^n$ be the $n$-dimensional vector space over the binary field $\mathbb{F}_2$, whose vector sum is the bitwise XOR between $n$-bit vectors, while the multiplication by a scalar corresponds to the logical AND between a single bit and an $n$-bit vector. The Hamming distance between two vectors in $\mathbb{F}_2^n$ is the number of coordinates where they differ, while the Hamming weight (or simply weight) $\mathrm{wt}(v)$ of a vector $v \in \mathbb{F}_2^n$ is the number of nonzero coordinates in $v$. A $k$-dimensional subspace $C$ of $\mathbb{F}_2^n$ is called an $[n, k, d]$ binary linear code, where $d$ is the minimum Hamming distance between any pair of distinct vectors (also called codewords) of $C$. Equivalently, $d$ is the minimum Hamming weight among all nonzero codewords of $C$. Since an $[n, k, d]$ binary linear code $C$ is a vector subspace, it can be spanned by a $k \times n$ generator matrix $G$ of rank $k$. On the other hand, a parity-check matrix $H$ for $C$ is an $(n - k) \times n$ matrix such that $Hx^{\top} = 0$ if and only if $x \in C$. The vector $s = Hx^{\top}$ is also called the syndrome of $x$. The coset of a vector $x \in \mathbb{F}_2^n$ is the set $x + C = \{x + c : c \in C\}$, and a coset leader is any element in $x + C$ with minimum Hamming weight.
Two binary linear codes $C_1$ and $C_2$ of length $n$ are called equivalent if one can be obtained from the other by a permutation of coordinates, that is, if there exists a permutation $\sigma \in S_n$, with $S_n$ being the symmetric group of degree $n$, such that $\sigma(C_1) = C_2$. In particular, if a permutation $\sigma$ maps a code $C$ to itself, then $\sigma$ is called an automorphism of the code.
The inner product in $\mathbb{F}_2^n$ is given by $\langle u, v \rangle = \sum_{i=1}^{n} u_i v_i \pmod 2$ for $u, v \in \mathbb{F}_2^n$, and $u$ and $v$ are orthogonal if this product is equal to 0. Then, $C^{\perp} = \{v \in \mathbb{F}_2^n : \langle u, v \rangle = 0, \ \forall u \in C\}$ is the orthogonal of the code $C$.
The code $C$ is called self-orthogonal if $C \subseteq C^{\perp}$, and self-dual if $C = C^{\perp}$. It is known that the weight of any codeword of a binary self-dual code is even [17, p.9]. A linear $[n, k, d]$ code can correct up to $t = \lfloor (d - 1)/2 \rfloor$ errors. Let $C$ be a linear code and $C^i$ the set of all words of $C$ without the $i$-th coordinate. Then, $C^i$ is the punctured code of $C$ on the $i$-th position.
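The definitions above can be checked on a small example. The following is a minimal sketch, not taken from the paper: it uses the extended Hamming [8, 4, 4] code as an assumed toy self-dual code, and the helper names (`weight`, `codewords`, `min_distance`, `is_self_orthogonal`, `puncture`) are illustrative only. Exhaustive enumeration is feasible here only because the code is tiny.

```python
from itertools import product

def weight(v):
    """Hamming weight: number of nonzero coordinates of v."""
    return sum(v)

def codewords(G):
    """All 2^k codewords spanned by the rows of G over F_2."""
    n = len(G[0])
    words = []
    for coeffs in product([0, 1], repeat=len(G)):
        word = [0] * n
        for c, row in zip(coeffs, G):
            if c:
                word = [a ^ b for a, b in zip(word, row)]
        words.append(tuple(word))
    return words

def min_distance(G):
    """Minimum weight over all nonzero codewords (exhaustive search)."""
    return min(weight(c) for c in codewords(G) if any(c))

def is_self_orthogonal(G):
    """True if every pair of rows has inner product 0 over F_2."""
    return all(sum(a & b for a, b in zip(u, v)) % 2 == 0 for u in G for v in G)

def puncture(G, i):
    """Generator matrix of the code punctured on coordinate i."""
    return [row[:i] + row[i + 1:] for row in G]

# Assumed toy example: extended Hamming [8, 4, 4] code in systematic form.
G8 = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

# Self-orthogonal with dimension n/2, hence self-dual.
print(is_self_orthogonal(G8) and len(G8) * 2 == len(G8[0]))
print(min_distance(G8), min_distance(puncture(G8, 0)))
```

Puncturing one coordinate lowers the minimum distance by at most one, which is why the paper starts from a self-dual code whose minimum weight exceeds the target of the punctured code.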

A. McEliece CRYPTOSYSTEM
The McEliece cryptosystem is the first code-based cryptosystem, and it was proposed by Robert McEliece in 1978 [11]. The original cryptosystem uses a binary [1 024, 524] code with an error-correcting capability of 50 errors. The steps of the encryption scheme are as follows:
1) Define the system parameters:
• $k$: the length of the message block.
• $n$: the length of the ciphertext.
• $t$: the number of intentionally added errors (equal to the error-correcting capability of the implemented linear code).
2) Key generation: define:
• $G$: a generator matrix of an $[n, k, 2t + 1]$ code for which there is a fast decoding algorithm.
• $P$: a random $n \times n$ permutation matrix.
• $S$: a random dense $k \times k$ non-singular matrix.
Compute $G' = SGP$, as well as $P^{-1}$ and $S^{-1}$, the inverses of $P$ and $S$. Note that $G'$ generates a linear code with the same $n$, $k$, and $t$. Then, one has the public key $(G', t)$ and the private key $(G, P, S)$.
3) Encryption: split the data for encryption into $k$-bit blocks. Then each block $m$ is encrypted as $r = mG' + e$, where $e$ is a random vector of length $n$ and weight $t$.
4) Decryption: the received vector $r$ is decrypted as follows:
a) Compute $r' = rP^{-1}$, which is $mSG + eP^{-1}$.
b) Decode $r'$ into a codeword $c'$ using the efficient decoding algorithm for the code with generator matrix $G$, so that $c' = mSG$.
c) Compute $c$ such that $cG = c'$ (if $G$ is in systematic form, then $c$ consists of the first $k$ bits of $c'$).
d) Compute $m = cS^{-1}$.
The scheme above can be applied with any family of linear codes for which a fast decoding algorithm is known and which contains a significant number of different codes for the chosen length, dimension, and error-correcting capability. The original system in [11] employs a binary [1 024, 524, 101] Goppa code.
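The four steps above can be sketched end to end on a toy instance. The following is an illustrative sketch only: it assumes the Hamming [7, 4, 3] code (so $t = 1$) with syndrome decoding in place of a Goppa code, and all function names are hypothetical. It offers no security and is meant only to make the algebra of key generation, encryption, and decryption concrete.

```python
import random

# Assumed toy code: Hamming [7, 4, 3] in systematic form, corrects t = 1 error.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
K, N, T = 4, 7, 1

def mat_mul(A, B):
    """Matrix product over F_2."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def vec_mat(v, M):
    """Row vector times matrix over F_2."""
    return mat_mul([v], M)[0]

def gf2_inverse(M):
    """Gauss-Jordan inversion over F_2; returns None if M is singular."""
    n = len(M)
    A = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col]), None)
        if pivot is None:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col and A[r][col]:
                A[r] = [a ^ b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def keygen(rng):
    """Return the public matrix G' = SGP and the private pair (S^-1, P^-1)."""
    while True:
        S = [[rng.randint(0, 1) for _ in range(K)] for _ in range(K)]
        S_inv = gf2_inverse(S)
        if S_inv is not None:
            break
    perm = list(range(N))
    rng.shuffle(perm)
    P = [[int(perm[i] == j) for j in range(N)] for i in range(N)]
    P_inv = [list(col) for col in zip(*P)]  # inverse of a permutation matrix is its transpose
    return mat_mul(mat_mul(S, G), P), (S_inv, P_inv)

def encrypt(m, G_pub, rng):
    """r = m G' + e with a random error vector e of weight T."""
    e = [0] * N
    for pos in rng.sample(range(N), T):
        e[pos] = 1
    return [a ^ b for a, b in zip(vec_mat(m, G_pub), e)]

def decrypt(r, private):
    S_inv, P_inv = private
    r1 = vec_mat(r, P_inv)                                        # mSG + eP^-1
    s = [sum(h & x for h, x in zip(row, r1)) % 2 for row in H]    # syndrome
    if any(s):
        r1[[list(c) for c in zip(*H)].index(s)] ^= 1              # flip the erroneous bit
    return vec_mat(r1[:K], S_inv)                                 # G systematic: first K bits are mS

rng = random.Random(0)
G_pub, private = keygen(rng)
print(decrypt(encrypt([1, 0, 1, 1], G_pub, rng), private))
```

In a real instance the syndrome lookup is replaced by the fast decoder of the secret code, which is the only part of decryption that depends on the code family.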

B. CRYPTANALYSIS
As with any other public encryption scheme, the McEliece cryptosystem gives the following information to the attacker: the encryption parameters, the encryption and decryption algorithms, and the public key. Hence, the adversary can also select any plaintext and compute the corresponding ciphertext.
Concerning the adversary's goals (total break, partial break, or distinguishing break), there are three main categories of attacks:
• Key-recovery attack: the attacker deduces the private key.
• Message-recovery attack: the attacker obtains part of or the complete plaintext corresponding to a ciphertext without knowing the private key.
• Distinguishing attack: the attacker can distinguish a ciphertext from a random message without knowledge of the private key, or the attacker can distinguish the public key from a random code.
Next, we review a few of the known attacks on the McEliece encryption scheme. For each attack, we evaluate either the probability of success or, inversely, the average number of attempts needed until the attacker achieves its target. For algorithmic attacks, the security level of a system is defined as the minimum work factor, where the work factor is the average number of elementary (binary) operations needed to perform a successful attack [1, p.72].
In the following sections, we describe the main attacks published in the relevant literature, assuming that a McEliece cryptosystem is defined by a private key $(G, P, S)$, where $G$ is a $k \times n$ generator matrix of a binary $[n, k, 2t + 1]$ code, $P$ is a random $n \times n$ permutation matrix, and $S$ is a random dense $k \times k$ non-singular matrix. The public key is $(G', t)$, where $G' = SGP$. Further, we assume that the attacker has access to a ciphertext $c$ produced by the encryption scheme.
We start by first recalling the components over which brute-force attacks can be mounted. Then, we describe the basic Information Set Decoding (ISD) attack and its work factor, along with some of its improved versions, particularly Stern's ISD attack.

1) BRUTE-FORCE ATTACKS
A brute-force attack can be mounted against different components of the encryption system:
• Against the message: the attacker takes a random message $m_1$ of length $k$, encrypts it to $c_1 = m_1 \cdot G'$, and computes the difference $e_1 = c - c_1$. If the difference $e_1$ has weight $\le t$, then the plaintext corresponding to the ciphertext $c$ is exactly $m_1$ and the attack succeeds. The probability of success is $1/2^k$, since the number of all possible messages of length $k$ is $2^k$.
• Against the coset leaders of the code $C'$ generated by $G'$: the attacker computes the syndrome of all coset leaders. The coset leader with syndrome equal to the syndrome of the ciphertext $c$ is the error vector. Knowing the error vector, one can compute the codeword and then the message. The number of coset leaders is $|\mathbb{F}_2^n| / |C'| = 2^{n-k}$. Therefore, the work factor of this attack is at least $2^{n-k}$.
• Against the error vector: the attacker searches among the vectors $e$ of length $n$ and weight $t$ for one whose syndrome is equal to the syndrome of the received vector $c$ (the ciphertext). Thus, it is a search for $e$ such that $S(e) = e \cdot H^{\top}$ equals $S(c)$, where $H$ is the parity-check matrix corresponding to $G'$. This problem is equivalent to finding $t$ columns of $H$ whose sum equals the syndrome $S(c)$. Since there are $\binom{n}{t}$ possible choices for the vector $e$, the work factor of the brute-force attack against the error vector is $\binom{n}{t}$.
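To give a sense of scale, the three brute-force work factors listed above ($2^k$, $2^{n-k}$, and $\binom{n}{t}$) can be expressed in bits. The sketch below evaluates them for the original [1 024, 524] parameters with $t = 50$; the function name `brute_force_bits` is an illustrative choice, not notation from the paper.

```python
from math import comb, log2

def brute_force_bits(n, k, t):
    """log2 of the search-space size for each brute-force target."""
    return {
        "message": float(k),                 # 2^k candidate messages
        "coset_leaders": float(n - k),       # 2^(n-k) coset leaders
        "error_vector": log2(comb(n, t)),    # C(n, t) error vectors of weight t
    }

for target, bits in brute_force_bits(1024, 524, 50).items():
    print(f"{target}: about 2^{bits:.1f}")
```

All three exponents are far above the 80-bit level, which is why the structural and information-set attacks discussed next determine the actual security estimate.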

2) INFORMATION SET DECODING (ISD) ATTACK
An information set for an $[n, k]$ code $C$ is any subset $A = \{i_1, \dots, i_k\}$ of $k$ coordinates such that, for any given set of $k$ values at these coordinates, there is exactly one codeword of $C$ taking those values. The information set thus consists of any $k$ indices such that the corresponding $k$ columns of a generator matrix of $C$ have rank $k$.
Let the ciphertext be $c = mG' + e$, where $G'$ is a generator matrix of an $[n, k, 2t + 1]$ code $C$ and $e$ is an error vector of weight $t$. Let $A$ be an information set of $k$ coordinates such that all entries of the error vector indexed by $A$ are 0. In summary, the algorithm for the ISD attack works as follows:
1) Choose $k$ out of $n$ indices for the information set. These $k$ columns of $G'$ are permuted to the first $k$ positions.
2) If the resulting $k \times k$ submatrix is invertible, invert it to recover a candidate message, which takes $O(k^3)$ operations [11], since it entails solving $k$ linear equations in $k$ unknowns.
3) Accept the candidate if the corresponding error vector has weight $t$; otherwise, return to step 1.
In the original code-based cryptosystem, Goppa codes were used, and for these codes, around 29% of the choices of $k$ columns are invertible. Therefore, the work factor for the ISD attack is
$$O(k^3) \cdot \frac{\binom{n}{k}}{\beta \binom{n-t}{k}},$$
where $\beta$ is the proportion of invertible choices of $k$ columns out of $n$ for the generator matrices of the family of $[n, k, 2t + 1]$ codes. Note that $\beta$ depends on the specific family.
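The ISD work factor can be estimated numerically. The sketch below assumes the work factor has the form $k^3 \binom{n}{k} / (\beta \binom{n-t}{k})$, i.e., the expected number of information sets to try (the inverse of the probability that a random information set avoids all $t$ error positions, corrected by the invertibility rate $\beta$) times a cost of $k^3$ per trial. Taking the hidden constant in $O(k^3)$ as 1 and the name `isd_work_factor_bits` are assumptions made for illustration.

```python
from math import comb, log2

def isd_work_factor_bits(n, k, t, beta=0.29):
    """log2 of k^3 * C(n, k) / (beta * C(n - t, k)).

    C(n - t, k) / C(n, k) is the probability that k randomly chosen
    coordinates contain none of the t error positions; beta is the
    fraction of column choices that give an invertible submatrix.
    """
    trials_bits = log2(comb(n, k)) - log2(comb(n - t, k)) - log2(beta)
    return 3 * log2(k) + trials_bits

# Original McEliece parameters: [1024, 524] code correcting t = 50 errors.
print(f"basic ISD: about 2^{isd_work_factor_bits(1024, 524, 50):.1f} operations")
```

Working in logarithms keeps the computation in floating-point range even though the binomial coefficients themselves have hundreds of digits.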

3) STERN'S ISD ATTACK
Stern [21] proposed a refinement of the ISD attack, which is based on the use of the extended code generated by $G''$, obtained by appending the ciphertext $c$ to $G'$ as an additional row:
$$G'' = \begin{pmatrix} G' \\ c \end{pmatrix}.$$
It is known [1] that this code has only one minimum-weight codeword, which coincides with $e$. Stern's attack consists in finding the unique codeword $e$ of weight $t$ in the code generated by $G''$. The algorithm is probabilistic and takes two input parameters, $p$ and $l$, together with the parity-check matrix of the extended code. The work factor of one iteration of the attack is $B = f_1 + f_2 + f_3$, where the terms $f_1$, $f_2$, and $f_3$ are given in [21]. The total work factor of the attack is $B / P_t$, where $P_t$ is the probability of finding a codeword of weight $t$ in one iteration; an estimate of $P_t$ is given in [21].

4) QUANTUM BASIC INFORMATION SET DECODING ATTACK
Let $v = mG + e$, with $G$ and $e$ defined as before. The Basic Quantum ISD attack first searches for an invertible submatrix $G_S$ of $G$ by selecting $k$ of its columns. Once it is found, the algorithm computes $m = v_S G_S^{-1}$, where $v_S$ denotes the corresponding $k$ coordinates of $v$, then determines $mG \in \mathbb{F}_2^n$, and finds the error vector $e = v - mG$, checking whether its Hamming weight is $t$.
According to [3], a random search for a root succeeds after approximately $\binom{n}{k} / \left(0.29 \binom{n-t}{k}\right)$ iterations, where one evaluation of this function costs around $O(n^3)$ bit operations. Grover's algorithm needs about the square root of this number of iterations, i.e., $\sqrt{\binom{n}{k} / \left(0.29 \binom{n-t}{k}\right)}$. Then the work factor for the Basic Quantum ISD attack, which is the total number of qubit operations for finding a solution, is $O(n^3) \sqrt{\binom{n}{k} / \left(0.29 \binom{n-t}{k}\right)}$. Note that the meaning of 0.29 is that, on average, 29% of the selected matrices $G_S$ are non-singular when $G$ is a generator matrix of a Goppa code. A list of the described attacks, with the names used further in this work, is reported in Table 1.
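The quantum estimate can be evaluated in the same way as the classical one. The sketch below assumes the work factor $n^3 \sqrt{\binom{n}{k} / (\beta \binom{n-t}{k})}$ with the constant hidden in $O(n^3)$ taken as 1 and $\beta = 0.29$; the function name `quantum_isd_bits` is hypothetical.

```python
from math import comb, log2

def quantum_isd_bits(n, k, t, beta=0.29):
    """log2 of n^3 * sqrt(C(n, k) / (beta * C(n - t, k))).

    Grover's algorithm takes the square root of the classical number
    of iterations, which halves the corresponding exponent.
    """
    iterations_bits = log2(comb(n, k)) - log2(comb(n - t, k)) - log2(beta)
    return 3 * log2(n) + 0.5 * iterations_bits

# Original McEliece parameters: [1024, 524] code correcting t = 50 errors.
print(f"basic quantum ISD: about 2^{quantum_isd_bits(1024, 524, 50):.1f} operations")
```

Only the iteration count is square-rooted; the per-iteration cost is unchanged, so the quantum security level is more than half the classical one.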

III. PARAMETERS ESTIMATION FOR SELF-DUAL CODES WITH BIT SECURITY 80, 128, AND 256
To estimate the parameters of self-dual codes that would provide a security level of 80, 128, and 256 bits, we apply the upper bounds on the work factor of the attacks in the previous section to the recently proposed Goppa codes with these security levels. Since the attacks we consider are not the best known, we expect to obtain higher values for the upper bounds. We then use these higher values to estimate the parameters of the self-dual codes.
The private key of the original McEliece cryptosystem is a [1 024, 524] Goppa code with an error-correcting capability of 50 errors. Initially, it was estimated to provide a security of 64 bits. Later, via an improved version of Stern's attack in [4], the security of the system was reduced to 60.5 bits. In the same publication, the authors proposed parameters for Goppa codes whose implementation in the McEliece cryptosystem would provide a security level of 80, 128, and 256 bits. The proposed codes are listed in Table 2. The latest proposed codes providing security levels of 128, 192, and 256 bits are in the NIST proposal [2].
From the results listed in Table 2, it follows that we have to search for codes providing a bit security level of 83, 148, and 302 to ensure that they provide at least 80, 128, and 256 bits of security against the latest attacks. In Table 3, we list the parameters of a few such codes. Note that these are the parameters of the punctured $[n, k, 2t + 1]$ codes. The corresponding self-dual codes must be of length $n + 2$ and minimum weight $2t + 3$, to ensure that the punctured codes are within the required parameters. Upper bounds for the minimum weight of a putative self-dual $[n_1, n_1/2, d_1]$ code are given in [20].
Remark 1: In our estimation, we consider a minimum weight that is 15% smaller than these bounds. In this way, we achieve the following:
• we increase the probability that such a code exists and can be constructed;
• if such a code exists, then a large number of codes with the same length and minimum weight exist. This is a preliminary requirement for the security of the McEliece-type cryptosystem.
The size of the putative punctured codes $B_1$, $B_9$, and $B_{31}$ is at least 38% smaller than the size of the smallest proposed Goppa codes $D_1$, $D_2$, and $D_4$ providing the security level of 80, 128, and 256 bits, respectively. In the next section, we will present a possible construction of a self-dual code where the punctured code has the parameters of $B_1$.
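The size comparison can be reproduced with a few lines of arithmetic. The sketch below recomputes the relative reduction from the two 80-bit key sizes quoted in this paper (276.39 Kb and 449.85 Kb). The helper `systematic_key_bits` reflects the common convention, assumed here rather than stated in the text, that a public generator matrix in systematic form is stored as its $k \times (n - k)$ non-identity part.

```python
def systematic_key_bits(n, k):
    """Bits needed to store the non-identity part of a systematic k x n generator matrix."""
    return k * (n - k)

def reduction_percent(new_size, old_size):
    """Relative size reduction of new_size with respect to old_size, in percent."""
    return 100.0 * (1.0 - new_size / old_size)

# Key sizes (in Kb) quoted in the text for the 80-bit security level.
print(f"reduction: {reduction_percent(276.39, 449.85):.2f}%")

# Original McEliece [1024, 524] code under the systematic-form convention.
print(f"original McEliece key: {systematic_key_bits(1024, 524)} bits")
```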

IV. A NEW EXAMPLE OF McEliece-TYPE CRYPTOSYSTEM WITH 80-BIT SECURITY
In this section, we first construct an example of a binary [1 064, 532, d ≥ 162] self-dual code, to define a McEliece-type cryptosystem with 80-bit security. Then, we derive a punctured code from this code to generate the public key of the encryption scheme. Next, we discuss an efficient decoding algorithm suitable for the new self-dual code. The decoding is used in the decryption step of the cryptosystem. Further, we propose a modified decryption algorithm for the McEliece-type system with two private keys: the new binary [1 064, 532, d ≥ 162] self-dual code and one of its punctured codes. The decryption integrates the decoding of the complete self-dual code. Finally, we discuss the bit security level of the McEliece-type cryptosystem thus defined.
Here we construct such a code, aiming for $d$ to be at least 162. Note that this value for $d$ is much smaller than the upper bound (see Remark 1).
To construct a binary [1 064, 532, d ≥ 162] self-dual code, we use a known algorithm presented in [7] and [22]. Let us assume that a self-dual [1 064, 532, d ≥ 162] code exists. Let $B$ be such a code and let $B$ have an automorphism $\sigma$ of order 133 with 8 cycles of length 133 and no fixed points. Without loss of generality, $\sigma$ can be represented as $\sigma = \Omega_1 \Omega_2 \cdots \Omega_8$, where $\Omega_i$ is a cycle of length 133 for $1 \le i \le 8$.
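The cycle structure of the assumed automorphism can be made concrete. The sketch below builds a permutation of 1 064 coordinates with 8 cycles of length 133 and checks its order and the absence of fixed points. Placing each cycle on a block of consecutive coordinates is an assumption made for illustration, as are all function names; `project` only mirrors the idea of keeping one coordinate per cycle for a vector that is constant on every cycle.

```python
from math import gcd

CYCLE_LEN, NUM_CYCLES = 133, 8
N = CYCLE_LEN * NUM_CYCLES  # 1064 coordinates

def build_sigma():
    """sigma[j] is the image of coordinate j; cycle i acts on 133*i .. 133*i + 132."""
    sigma = [0] * N
    for i in range(NUM_CYCLES):
        base = i * CYCLE_LEN
        for j in range(CYCLE_LEN):
            sigma[base + j] = base + (j + 1) % CYCLE_LEN
    return sigma

def cycle_lengths(perm):
    """Lengths of the disjoint cycles of a permutation given as a list."""
    seen, lengths = [False] * len(perm), []
    for start in range(len(perm)):
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def order(perm):
    """Order of a permutation: least common multiple of its cycle lengths."""
    result = 1
    for length in cycle_lengths(perm):
        result = result * length // gcd(result, length)
    return result

def is_fixed(v, perm):
    """True if the vector v is invariant under the coordinate permutation."""
    return all(v[perm[j]] == v[j] for j in range(len(v)))

def project(v):
    """Keep one coordinate per cycle of a vector that is constant on each cycle."""
    return [v[i * CYCLE_LEN] for i in range(NUM_CYCLES)]

sigma = build_sigma()
print(order(sigma), cycle_lengths(sigma), any(sigma[j] == j for j in range(N)))
```

A vector fixed by the permutation is constant on each of the 8 cycles, so projecting it yields a vector of length 8, which matches the length-8 image code used in the construction.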
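If $v \in B$, then $v$ can be expressed as the sum of a vector of the subcode $F_\sigma(B)$ and a vector of the subcode $E_\sigma(B)$, so that $B$ has a generator matrix of the form
$$G = \begin{pmatrix} X \\ Y \end{pmatrix}, \qquad (4)$$
where $X$ is a generator matrix of $F_\sigma(B)$ and $Y$ is a generator matrix of $E_\sigma(B)$. The maps $\pi$ and $\phi$ are defined as follows: $\pi : F_\sigma(B) \to \mathbb{F}_2^8$ with $(\pi(v))_i = v_j$ for some $j \in \Omega_i$, $i = 1, \dots, 8$, and $\phi : E_\sigma(B) \to P^8$, where $v|_{\Omega_i} = (v_0, v_1, \dots, v_{132})$ is identified with the polynomial $\phi(v|_{\Omega_i})(x) = v_0 + v_1 x + \cdots + v_{132} x^{132}$ in $P$ for $1 \le i \le 8$, and $P$ is the set of even-weight polynomials in the quotient ring $R_1 = \mathbb{F}_2[x]/(x^{133} - 1)$. An inner product in $P^8$ is defined as
$$\langle g, h \rangle = \sum_{i=1}^{8} g_i(x)\, h_i(x^{-1})$$
for all $g, h \in P^8$.
Algorithm 1 Construction of a Self-Dual Code Having an Automorphism
1 Determine a generator matrix $X'$ of $\pi(F_\sigma(B))$.
2 Determine the generator matrix $X = \pi^{-1}(X')$ of $F_\sigma(B)$.
3 Determine a generator matrix $Y'$ of $\phi(E_\sigma(B))$.
4 Determine the generator matrix $Y = \phi^{-1}(Y')$ of $E_\sigma(B)$.
5 if $G$, formed by the rows of $X$ and $Y$, generates a code with a minimum weight $d$, then
6 return $G$ ($G$ generates $B$);
7 else
8 return to 1.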
To construct the code $B$, we follow the steps of Algorithm 1.
1) Determine a generator matrix of $\pi(F_\sigma(B))$.
Since the image $\pi(F_\sigma(B))$ is a binary self-dual code of length 8, a possible generator matrix of $\pi(F_\sigma(B))$ is: [...]
The image $\phi(E_\sigma(B))$ is contained in $P^8$, where $P$ is the set of even-weight polynomials in $R_1$. [...]
• $e_i(x) e_j(x) = 0$, $i \neq j$.
After generating the idempotent $e_j(x)$ of the ideal $I_j$, for $j = 1, \dots, 9$, we observe that $e_1(x^{-1}) = e_2(x)$, $e_3(x^{-1}) = e_4(x)$, $e_5(x^{-1}) = e_6(x)$, $e_7(x^{-1}) = e_8(x)$, and $e_9(x^{-1}) = e_9(x)$. The same relations also hold for the generator polynomials $g_i(x)$ for $1 \le i \le 9$, i.e., $g_1(x^{-1}) = g_2(x)$, $g_3(x^{-1}) = g_4(x)$, etc. Using these relations and the self-orthogonality of the image $\phi(E_\sigma(B))$, we construct a generator matrix $Y'$ of $\phi(E_\sigma(B))$ composed of cells $Y_1, \dots, Y_9$, where $Y_j$ is a $4 \times 8$ matrix with elements of $I_j$, for $j = 1, \dots, 9$. The cells $Y_1$, $Y_3$, $Y_5$, and $Y_7$ are constructed under certain conditions, which we discuss at the end of this section. The cells $Y_2$, $Y_4$, $Y_6$, and $Y_8$ are obtained from the previous four cells using the orthogonality condition Eq. (8). There we also present a particular example of the complete generator matrix $Y'$ of $\phi(E_\sigma(B))$ in Eq. (10). We note that for each of $Y_1$, $Y_3$, [...]
In the binary generator matrix $Y$ obtained from $Y'$, each entry of the first 8 rows is a right circulant $3 \times 133$ matrix, since $I_j$ is a cyclic [133, 3] code for $j = 1, 2$, and each entry of the rest of the rows is a right circulant $18 \times 133$ matrix, because $I_j$ is a cyclic [133, 18] code for $j = 3, \dots, 9$.²
In step 3 we mentioned that the cells $Y_1$, $Y_3$, $Y_5$, and $Y_7$ are constructed under certain conditions. The matrix $Y'$, with cells $Y_i$, $i = 1, \dots, 9$, specifies the generator matrix $Y$ of the subcode $E_\sigma(B)$. The minimum weight of the code $B$ has to be greater than or equal to 162. Thus, the same has to hold for the minimum weight of the code generated by $Y$. In this regard, we construct the $Y_i$ cells according to the following requirements:
• each row of $Y_i$, $i = 1, \dots, 9$, has at least four nonzero elements, i.e., each row has weight greater than or equal to four;
• the weight of $Y_1$ and $Y_9$ is at least 3.
In step 4, the matrix $Y$ has to satisfy the following requirements:
• The first 24 rows, corresponding to $Y_1$ and $Y_2$, have a minimum weight of at least 162.
• Each next 18 rows, corresponding to a row in $Y_s$, $s = 3, \dots, 9$, have a minimum weight of at least 162.
• The linear combinations of up to 8 rows of the 144 rows of $Y$ corresponding to $Y_7$ and $Y_8$, and of the 72 rows corresponding to $Y_9$, have a weight of at least 162.
Once we have constructed the generator matrices for the subcodes $F_\sigma(B)$ and $E_\sigma(B)$, we can proceed to step 5 of Algorithm 1.
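The passage from polynomials to binary rows relies on right circulant blocks and on the substitution $x \to x^{-1}$. The sketch below illustrates both on a toy ring $\mathbb{F}_2[x]/(x^7 - 1)$ instead of $\mathbb{F}_2[x]/(x^{133} - 1)$; the polynomials `g1` and `g2` are assumed examples (the two even-weight cyclic [7, 3] generators), not the polynomials used in the paper, and the function names are illustrative.

```python
def right_circulant(coeffs, rows):
    """First `rows` rows of the right circulant with first row `coeffs`.

    Row s holds the coefficients of x^s * a(x) modulo x^n - 1.
    """
    n = len(coeffs)
    return [[coeffs[(j - s) % n] for j in range(n)] for s in range(rows)]

def reciprocal(coeffs):
    """Coefficients of a(x^-1) modulo x^n - 1."""
    n = len(coeffs)
    return [coeffs[(-j) % n] for j in range(n)]

def poly_mul(a, b):
    """Product a(x) * b(x) in F_2[x] / (x^n - 1)."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    c[(i + j) % n] ^= 1
    return c

# Toy generators for n = 7 (coefficient lists, lowest degree first):
g1 = [1, 0, 1, 1, 1, 0, 0]  # (x + 1)(x^3 + x + 1)   = 1 + x^2 + x^3 + x^4
g2 = [1, 1, 1, 0, 1, 0, 0]  # (x + 1)(x^3 + x^2 + 1) = 1 + x + x^2 + x^4

block = right_circulant(g1, 3)       # 3 x 7 circulant block of a cyclic [7, 3] code
print(block)
print(poly_mul(g1, g2))              # product of the two ideals' generators
print(reciprocal(g1) in right_circulant(g2, 7))
```

In the toy ring the two ideals annihilate each other and are swapped by $x \to x^{-1}$, which is the same pairing pattern the text reports for $I_1, I_2$ and for the other paired ideals.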

If the matrix $G$ formed by the rows of $X$ and $Y$ generates a code with a minimum weight $d \ge 162$, where $X$ and $Y$ are given in Eq. (9) and Eq. (11), respectively, then $G$ generates the code $B$.
² The first rows of the circulants corresponding to the polynomials in the matrix $Y'$ and the corresponding binary generator matrix $G$ can be found at the following repository: https://github.com/NoAuthorSubmission/McEliece_Data
To obtain a generator matrix of $B$, we developed C++ software that performs the following operations:
• Construct the sub-matrices of $Y$ corresponding to $Y_s$, with $s = 1, \dots, 8$, which meet the conditions in step 3.
• Create the sub-matrix of $Y$ corresponding to $Y_9$ defined in step 3.
• Create the matrix $G$ defined in Eq. (4) and the parity-check matrix $H$ of $G$.
• Compute the weight of all linear combinations of up to 8 rows of $G$ and of $H$. This calculation is performed by implementing the algorithm for efficiently computing the codewords of fixed weight in linear codes (for the binary case) presented in [5].
Calculating the exact minimum weight has a work factor of $2^{87}$ (with respect to Stern's attack, Section II-B), which is infeasible. Instead, the following computations are carried out: [...]
A set $C$ of codewords is called cyclically different if no codeword in it can be obtained from another by applying $\varphi^l$, for some $l$, that is, $b \neq \varphi^l(c)$ for $1 \le l \le pr - 1$, $\forall\, b, c \in C$. An optimal decoding set can be defined after experiments with sets of cyclically different codewords of minimum weight or mixed sets of codewords of different weights close to the minimum weight. The new self-dual code $B$ possesses an automorphism $\sigma$ of order 133 with eight cycles of length 133 and no fixed points. Therefore, the decoding scheme of [23] is a valid decoding scheme for $B$.
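The partial minimum-weight check described above (weights of all linear combinations of up to 8 rows) can be sketched with a naive enumeration. This is not the fixed-weight codeword algorithm of [5]; it is a plain brute-force illustration on an assumed toy generator matrix (the extended Hamming [8, 4, 4] code), and `min_weight_up_to` is a hypothetical name. Restricting the number of combined rows yields an upper bound on the true minimum weight, which is exact only when all rows are allowed.

```python
from itertools import combinations

def min_weight_up_to(G, max_rows):
    """Smallest nonzero weight among sums of at most max_rows rows of G over F_2."""
    n = len(G[0])
    best = None
    for r in range(1, max_rows + 1):
        for rows in combinations(G, r):
            word = [0] * n
            for row in rows:
                word = [a ^ b for a, b in zip(word, row)]
            w = sum(word)
            if w and (best is None or w < best):
                best = w
    return best

# Assumed toy example: extended Hamming [8, 4, 4] code in systematic form.
G8 = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

print(min_weight_up_to(G8, 2), min_weight_up_to(G8, len(G8)))
```

For a systematic generator matrix, a sum of $r$ rows has weight at least $r$ on the identity part, so low-weight codewords can only arise from combinations of few rows of $G$ or, by the same argument on the dual side, of $H$.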
Next, we define the McEliece cryptosystem using the punctured code $B_p$. Recall that $B_p$ is a [1 062, 531, d′ ≥ 160] punctured code obtained from the self-dual code $B$, while $G$ and $G_p$ are generator matrices of $B$ and $B_p$, respectively.
1) System parameters:
• $k = 531$: the length of the message $m$.
• $n = 1\,062$: the length of the ciphertext.
• $t$: the number of intentionally added errors.
2) Key generation:
• $P$: a random $n \times n$ permutation matrix.
• $S$: an invertible $k \times k$ matrix such that $G'_p = S G_p P$ is in systematic form.

3) Encryption:
• $e$: a random error vector of length $n$ with $\mathrm{wt}(e) = t$.
• $m \to r = m G'_p + e$.
4) Decryption: For the decryption, we define two more elements. Let $S_1$ and $P_1$ be extended matrices of $S$ and $P$ defined as follows:
$$S_1 = \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix}, \qquad P_1 = \begin{pmatrix} I_2 & 0 \\ 0 & P \end{pmatrix},$$
where $I_2$ is the $2 \times 2$ identity matrix. One can show that for the matrices defined above the following holds:
$$(0|m) \cdot S_1 \cdot G \cdot P_1 = (0|m \cdot S) \cdot G \cdot P_1 = (m^*_1, m^*_2 \,|\, m \cdot S \cdot G_p \cdot P) = (m^*_1, m^*_2 \,|\, m \cdot G'_p).$$
Thus, we can decode $r'$ of length 1 062 by decoding a padded vector $(*, * \,|\, r')$ of length 1 064 with the initial self-dual code $B$. [...] decoding the public key or decoding $B_1$ is expected to be as difficult as the problem of decoding a random code.
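The padding identity used in the decryption step can be checked numerically on small random matrices. The sketch below rests on several assumptions that are not spelled out in the text: $G_p$ is taken to be $G$ without its first row and first two columns, $S_1$ is the block-diagonal matrix of $1$ and $S$, and $P_1$ is the block-diagonal matrix of the $2 \times 2$ identity and $P$. All names and the toy dimensions are illustrative.

```python
import random

def mat_mul(A, B):
    """Matrix product over F_2."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

def block_diag(A, B):
    """Block-diagonal matrix with blocks A and B."""
    top = [row + [0] * len(B[0]) for row in A]
    bottom = [[0] * len(A[0]) + row for row in B]
    return top + bottom

def random_matrix(rows, cols, rng):
    return [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]

def padded_identity_holds(kp, np_, rng):
    """Check that the last np_ coordinates of (0|m) S1 G P1 equal m S Gp P."""
    G = random_matrix(kp + 1, np_ + 2, rng)       # stands in for the self-dual code
    G_p = [row[2:] for row in G[1:]]              # assumed relation between G and G_p
    S = random_matrix(kp, kp, rng)                # invertibility is not needed for the identity
    perm = list(range(np_))
    rng.shuffle(perm)
    P = [[int(perm[i] == j) for j in range(np_)] for i in range(np_)]
    S1 = block_diag([[1]], S)
    P1 = block_diag([[1, 0], [0, 1]], P)
    m = [rng.randint(0, 1) for _ in range(kp)]
    left = mat_mul(mat_mul(mat_mul([[0] + m], S1), G), P1)[0]
    right = mat_mul(mat_mul(mat_mul([m], S), G_p), P)[0]
    return left[2:] == right

rng = random.Random(1)
print(all(padded_identity_holds(3, 6, rng) for _ in range(20)))
```

Under these assumptions the first two coordinates of the padded word are unconstrained, which is why the decoder of the full self-dual code has to try the possible values of the two starred positions.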

V. CONCLUSION
This paper proposed a McEliece-type cryptosystem using high minimum distance self-dual codes and punctured codes derived from them. We determined the parameters of a putative optimal self-dual code, providing a classical (respectively, quantum) security level of 80, 128, and 256 (respectively, 67, 101, and 183) bits. For the 80-bit security case, we constructed an optimal self-dual code of length 1 064, reducing the key size by around 38.5% with respect to the original McEliece cryptosystem. The main limitation of our work is that a complete optimal decoding set is needed to make our cryptosystem practically usable. The computational search for such a decoding set is currently under way, and it is a direction for future research on the topic.