A Post-Quantum Biometric Template Protection Scheme Based on Learning Parity With Noise (LPN) Commitments

Biometric recognition can authenticate individuals through an intrinsic link between the individual and their physical, physiological and/or behavioral characteristics. This leads to a higher security level than authentication based solely on knowledge or possession. One of the reasons why biometrics is not completely accepted is the lack of trust in the storage of biometric templates on external servers. Biometric data are sensitive and should be protected, as contemplated in the data protection regulations of many countries. In this work, we propose the use of biometric Learning Parity With Noise (LPN) commitments as a template protection scheme. To the best of our knowledge, this is the first proposal for biometric template protection based on the LPN problem (that is, the difficulty of decoding random linear codes), which offers post-quantum security. Biometric features are compared in the protected domain. The irreversibility, revocability, and unlinkability properties are satisfied, as well as resistance to False Acceptance Rate (FAR), cross-matching, Stolen Token, and similarity-based attacks. A recognition accuracy with a 0% FAR is achieved because user-specific secret keys are employed, and the False Rejection Rate (FRR) can be adjusted by means of a threshold to preserve the accuracy of the unprotected scheme in the Stolen Token scenario. Good performance in terms of execution time, template storage and operation complexity is obtained for security levels of at least 80 bits. The proposed scheme is employed in a dual-factor authentication protocol from the literature to illustrate how it provides security when the authentication and database (cloud) servers can be malicious. The proposed LPN-based protection scheme can be applied to any biometric trait represented by binary features and any matching score based on Hamming or Jaccard distances. In particular, experimental results are included for a practical finger vein-based recognition system implemented in Matlab.


I. INTRODUCTION
Nowadays, our society has widely accepted the use of biometric systems as a means of user authentication. The problem is that biometric data, which are stored as a template at the registration (enrollment) phase, are sensitive and, hence, should be protected, as contemplated in the data protection regulations of many countries [1]. Another problem is that biometric data that have been revealed can no longer be employed, in order to avoid impersonation and privacy attacks. This also motivates the protection of templates, since people cannot provide many biometric traits.
The associate editor coordinating the review of this manuscript and approving it for publication was Marina Gavrilova.
The ISO/IEC 24745 standard on biometric information protection establishes the requirements of irreversibility, unlinkability, and revocability for biometric template protection schemes [1]. Irreversibility means that no information related to the biometric data can be recovered even if protected templates are compromised. Hence, biometric data remain private. Unlinkability means that no adversary can know which individual is the owner of the protected template, thus allowing user identity privacy. In case a protected template is compromised, it should be revocable or renewable to obtain a new protected template from the same biometric sample. Traditionally, biometric template protection schemes have been classified into 1) biometric cryptosystems and 2) feature transformations or cancelable biometrics [2].
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Biometric cryptosystems bind a secret cryptographic key to the biometric data. Among them, fuzzy extractor, fuzzy vault and fuzzy commitment schemes have been proposed, the latter being widely employed [3]-[5]. In Fuzzy Commitments [6], the commitments are Auxiliary or Helper Data generated as a combination of biometric data with an error correction codeword indexed by a cryptographic key. A cryptographic hash of the secret key (or of the error correction codeword) is stored together with the Helper Data. Biometric data should be represented as binary strings and the Hamming distance is used as the distance metric. Matching is performed by attempting to recover the cryptographic key from the Helper Data and the input biometric data, applying error correction decoding. Irreversibility is based on the computational difficulty of retrieving either the key or the biometric data from the stored Helper Data. Unlinkability and revocability are based on employing different keys.
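The Fuzzy Commitment workflow described above can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: a 5-fold repetition code stands in for the error correction code, SHA-256 stands in for the cryptographic hash, and the names `encode`, `decode`, `commit` and `open_commitment` are hypothetical helpers that do not come from [6].

```python
import hashlib
import secrets

REP = 5  # repetition factor of the toy error correction code (assumption)

def encode(key_bits):
    """Repetition code: repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]

def decode(word):
    """Majority vote over each block of REP bits."""
    return [int(sum(word[i:i + REP]) > REP // 2) for i in range(0, len(word), REP)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def digest(bits):
    """Hash of a bit list (one byte per bit, enough for a sketch)."""
    return hashlib.sha256(bytes(bits)).hexdigest()

def commit(biometric, key_bits):
    """Return (Helper Data, hash of the key)."""
    return xor(biometric, encode(key_bits)), digest(key_bits)

def open_commitment(helper, key_hash, probe):
    """Accept if the key decoded with the probe hashes to the stored value."""
    return digest(decode(xor(helper, probe))) == key_hash

key = [secrets.randbelow(2) for _ in range(8)]
enrolled = [secrets.randbelow(2) for _ in range(8 * REP)]
helper, key_hash = commit(enrolled, key)

genuine = list(enrolled)
genuine[3] ^= 1    # one bit error in block 0
genuine[17] ^= 1   # one bit error in block 3
impostor = [b ^ 1 for b in enrolled]  # every bit differs

print(open_commitment(helper, key_hash, genuine))   # errors are corrected
print(open_commitment(helper, key_hash, impostor))  # decoding yields a wrong key
```

The sketch makes the acceptance rule visible: any probe that is close enough to the enrolled data opens the commitment, whoever provides it.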
The recognition accuracy of biometric cryptosystems is worse than that of systems without protection, also known as baseline systems. Hence, their security is much lower than that of a cryptographic system because their False Acceptance Rate (FAR) is not sufficiently small. Considering a brute-force attack (also known as FAR attack), the FAR should be smaller than $2^{-N}$ to achieve at least N-bit security. However, the FAR of biometric systems usually ranges from $10^{-5}$ (17 bits) to $10^{-7}$ (24 bits) [4]. Therefore, multibiometric fusion should be employed to improve security. Another limitation of biometric cryptosystems that forces the use of multibiometric fusion is the low entropy of biometric traits [4].
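The conversion between a FAR value and its equivalent security level in bits can be checked with a short Python sketch. The function name `far_to_bits` is a hypothetical helper; the bit counts quoted in the text correspond to rounding $-\log_2(\mathrm{FAR})$ up.

```python
import math

def far_to_bits(far):
    """Security level, in bits, of a brute-force (FAR) attack: -log2(FAR)."""
    return -math.log2(far)

for far in (1e-5, 1e-7):
    bits = far_to_bits(far)
    print(f"FAR = {far:g}: {bits:.2f} bits, rounded up to {math.ceil(bits)}")

# FAR that would be needed for 80-bit security.
print(f"80-bit security needs FAR below {2.0 ** -80:.3e}")
```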
In the feature transformation approach, the biometric template is protected by a transformation function, which is applied at the registration and the verification phases. Therefore, biometric data are compared in the protected domain. Transformations can be non-invertible or invertible (salting). Transformation functions proposed in the literature are BioHashing [7], Alignment-Robust Hashing (ARH) [8], re-mapping and warping [9], and Bloom filters [10]. Unlinkability and revocability are based on the variation of the parameters of the transformation functions. Irreversibility depends on the difficulty to obtain the original biometric data from the transformed data.
Transformed templates often contain less information than the original templates. Hence, the usual consequence is a degradation of the recognition performance compared to the baseline version (without transformation) [7]-[11]. As in biometric cryptosystems, multibiometric fusion should be employed to improve security [12].
The accuracy obtained with the transformation can be improved thanks to the entropy added by a user-specific secret key, as in salting schemes. In fact, the advantage of salting schemes, such as BioHashing [7], is that, theoretically, a 0% error rate can be achieved due to the use of a dual recognition based on the biometric information and the user-specific secret key. However, this is risky and not advisable because an attacker can use the device holding the user-specific secret key to improve the chances of successful authentication. This is known as the Stolen Token scenario [7]. Besides, as with biometric cryptosystems, a limitation of many salting schemes is that their security is much lower than that of a cryptographic system because they are not robust to FAR attacks [13].
On the other hand, most cancelable biometric schemes apply similarity-preserving transformations, also called Locality Sensitive Hashing, in order to preserve in the protected domain the accuracy obtained in the unprotected domain [14], [15]. The problem is that this similarity-preserving or distance-preserving property (distances between unprotected samples are nearly the same as the distances between protected samples) can be exploited by similarity-based attacks that break these schemes. If an attacker can access the protected template, he/she can apply a search algorithm that generates first guesses randomly, transforms them to the protected domain, computes their distances to the protected template, uses this information to improve the probability of success of new guesses, and repeats the process until reaching a successful guess. The work in [15] confirms the vulnerability of BioHashing and Bloom-filter schemes to a similarity-based attack driven by a Genetic Algorithm. The work in [14] introduces non-linearity in the transformation with the aid of a deep neural network, but this requires retraining whenever a new user is enrolled.
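The search loop described above can be sketched in Python. This is a deliberately simple greedy bit-flipping search, not the Genetic Algorithm of [15], and it runs against a toy transform (a bit permutation followed by an XOR mask) that preserves Hamming distances exactly. The transform, the helper names and the assumption that the attacker can evaluate the transform on arbitrary guesses are all illustrative.

```python
import random

def make_transform(m, rng):
    """Toy distance-preserving transform: permute the bits, then XOR a mask."""
    perm = list(range(m))
    rng.shuffle(perm)
    mask = [rng.randrange(2) for _ in range(m)]
    return lambda bits: [bits[p] ^ k for p, k in zip(perm, mask)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def hill_climb(protected, transform, m, rng):
    """Flip one bit at a time and keep the flip if the protected distance drops."""
    guess = [rng.randrange(2) for _ in range(m)]
    best = hamming(transform(guess), protected)
    queries = 1
    for i in range(m):
        guess[i] ^= 1
        distance = hamming(transform(guess), protected)
        queries += 1
        if distance < best:
            best = distance      # the flip moved the guess closer: keep it
        else:
            guess[i] ^= 1        # otherwise revert the flip
    return guess, best, queries

rng = random.Random(0)
m = 64
template = [rng.randrange(2) for _ in range(m)]
transform = make_transform(m, rng)
protected = transform(template)

recovered, distance, queries = hill_climb(protected, transform, m, rng)
print(recovered == template, distance, queries)
```

Because the toy transform preserves distances exactly, each query leaks whether one bit of the guess is right, so the template is recovered after m + 1 distance evaluations. Transforms that preserve distances only approximately need more elaborate searches, but they leak in the same way.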
An alternative approach recently proposed to preserve the accuracy of baseline systems is homomorphic encryption [16]. When it is employed in a biometric application, the template and the input biometric data are encrypted by using a public key. The comparison is performed in the encrypted domain by means of an encrypted score computation operation. Thus, the resulting score after comparison is encrypted. In order to obtain the final score, a decryption operation by using a private key should be applied.
The practical implementation of Fully Homomorphic Encryption schemes is still a challenge because not all the operations needed to obtain an encrypted score are feasible, due to their high computational and memory requirements [2]. The practical proposals of biometric Homomorphic Encryption schemes only allow a limited subset of operations (additions or multiplications) in the encrypted domain. The most used approach is the additively homomorphic scheme and, specifically, the Paillier homomorphic encryption scheme [17]. In the schemes based on Paillier homomorphic encryption, the security of the operations employed is based on hard problems that cannot currently be solved in polynomial time, such as the Discrete Logarithm Problem and the Integer Factorization Problem. However, these problems are not so hard for quantum computers, which is a relevant threat to consider, because protection schemes that are considered secure today will not be so in the future.
Among the systems believed to resist the attacks of quantum computers, lattice-based cryptography has attracted most interest. Lattice cryptography uses high-dimensional geometric structures to hide information creating problems that are considered impossible to solve if the private key is unknown, even for quantum computers. Homomorphic encryption can be also constructed on the lattice problem fundamentals. In [18], two variants of Homomorphic Encryption are employed based on ideal-lattice and, particularly, ring-LWE (ring Learning With Errors) schemes, which are an example of ideal lattice cryptography.
The drawback of homomorphic encryption-based approaches is not only their high computational cost but also their memory requirements, since the size of the protected template is around two orders of magnitude greater than that of the unprotected template [18]. In addition, a simple attack algorithm has been reported against the authentication server that computes the final decrypted score. The biometric data can be revealed in at most 2N−T queries, where N is the bit length of the biometric template and T is the authentication threshold [19].
In this work, we propose a post-quantum lightweight solution based on lattice cryptography. Specifically, Learning Parity with Noise (LPN) commitments are employed to protect biometric data. Our proposal of biometric LPN commitments uses a public generator matrix to convert biometric data to linear codewords that are then randomized with a user-specific secret. LPN commitments are not opened (the secrets are not revealed) but compared in the protected domain. The commitments created with impostor secrets are detected and directly rejected without proceeding to calculate a biometric similarity score. Hence, the False Acceptance Rate is 0%.
In comparison, conventional Fuzzy Commitments also use a public generator matrix, but to convert a secret to a linear codeword that is then combined with the biometric data. Biometric cryptosystems using Fuzzy Commitments accept an individual if the commitment can be opened (the secret can be reconstructed) because the biometric data provided at verification are sufficiently similar to the data provided at enrollment. Hence, the FAR is not 0% and FAR attacks can be successful.
LPN-based schemes have been applied to pseudorandom generators, symmetric key encryption, secret-key authentication protocols, public-key identification, and zero-knowledge proofs [20], [21]. However, to the best of our knowledge, this is the first proposal of LPN-based cryptography for biometric template protection. The main contributions of this paper are the following:
• The first biometric template protection scheme based on LPN commitments, whose security relies on an NP-complete problem that is hard for both classical and quantum computers.
• A low cost solution in terms of computational and memory requirements for protected template generation and storage (lower than approaches based on homomorphic encryption).
• High security against attacks to recover the biometric data, because comparison is done in the protected domain, using efficient cryptographic protocols.
• Resistance to similarity-based attacks because LPN commitments are random (computationally hiding) and, hence, do not preserve the distance values obtained between unprotected samples with respect to the distance values obtained between protected samples.
• A recognition accuracy with a FAR of 0% because user-specific secret keys are employed in the biometric LPN commitments. In case of the Stolen Token scenario, where an attacker uses a client device with a user-specific secret key, the accuracy of the unprotected approach is preserved.
• A security level comparable to a cryptographic system, even with unibiometric systems.
• Experimental results are included from a practical implementation in Matlab.
• The proposed solution was applied to a finger vein-based biometric system, compared to other systems, and evaluated in terms of irreversibility, revocability and unlinkability, as established in the standard ISO/IEC 24745.
This work is structured as follows. Section II describes our proposal for applying LPN commitments to biometric template protection. The operations required are defined, and a security analysis is carried out, considering a distributed scenario with cloud-based services where our scheme is included in an authentication protocol proposed in the literature. The implementation of biometric LPN commitments by using Matlab functions is explained in Section III. Parameters are selected to achieve several security levels, and performance is evaluated in terms of execution time, template storage and operation complexity. In addition, a comparison to homomorphic encryption-based proposals is included. A practical realization using finger veins is presented in Section IV. Accuracy, irreversibility, revocability, unlinkability, and resistance to attacks are proven and compared to other proposals of biometric template protection schemes applied to finger veins. Finally, Section V concludes the work.

II. PROPOSAL OF BIOMETRIC TEMPLATE PROTECTION BASED ON LPN COMMITMENTS

A. DEFINITION OF BIOMETRIC LPN COMMITMENTS
Commitment schemes are fundamental cryptographic primitives for cryptographic protocols. A commitment scheme allows a party to commit to a message by using a secret key that keeps it hidden from others. The security properties required of a commitment are the hiding and binding properties. Hiding means that one cannot learn anything about the committed message from the commitment. Binding means that the commitment created for a message is different from the commitment created for a different message.
An LPN commitment is based on encoding a message (in our proposal, biometric data) by using a random linear code with some noise added to the codeword. Formally, the LPN commitment to an m-bit message $B \in \{0,1\}^m$ is as follows [21]:
$$\mathrm{Com}(B) = A \cdot (r \,\|\, B) \oplus e$$
where $A \in \{0,1\}^{n \times (l+m)}$ is the public random matrix; $\cdot$ is the matrix-vector product computed with bitwise AND and XOR operations; $\|$ is the concatenation of two vectors; $\oplus$ is the bitwise XOR operation; $r$ is a uniformly random vector in $\{0,1\}^l$ included to add randomness; and $e$ is a low-weight random vector in $\{0,1\}^n$ following a Bernoulli distribution with parameter $\tau$ ($0 < \tau < 1/2$), i.e., every bit in $e$ has a probability $\tau$ of being 1 and a probability $1 - \tau$ of being 0. When the weight of $e$ is exactly $n\tau$, instead of $n\tau$ in expectation, the LPN problem is called exact LPN (xLPN for short) [21].
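The commitment operation Com(B) = A · (r||B) ⊕ e can be illustrated with a minimal Python sketch using toy parameters. The helper names and the sizes (l = 8, m = 16, n = 52, w = 6) are assumptions chosen for readability, far below any secure setting, and `random.Random` is used only for reproducibility; it is not a cryptographically secure generator.

```python
import random

def random_matrix(n, k, rng):
    """Uniformly random n x k binary matrix (the public matrix A)."""
    return [[rng.randrange(2) for _ in range(k)] for _ in range(n)]

def exact_weight_noise(n, w, rng):
    """Noise vector with exactly w ones (the exact LPN variant)."""
    e = [0] * n
    for pos in rng.sample(range(n), w):
        e[pos] = 1
    return e

def mat_vec_gf2(A, s):
    """Matrix-vector product over GF(2): AND for products, XOR for sums."""
    return [sum(a & x for a, x in zip(row, s)) % 2 for row in A]

def commit(A, r, B, e):
    """Com(B) = A . (r || B) xor e."""
    codeword = mat_vec_gf2(A, r + B)
    return [c ^ x for c, x in zip(codeword, e)]

rng = random.Random(1)            # toy generator, not secure
l, m, n, w = 8, 16, 52, 6         # toy sizes, k = l + m = 24
A = random_matrix(n, l + m, rng)
r = [rng.randrange(2) for _ in range(l)]
B = [rng.randrange(2) for _ in range(m)]   # stand-in for biometric features
e = exact_weight_noise(n, w, rng)

com = commit(A, r, B, e)
print(len(com), sum(e))
```

Removing the codeword from the commitment returns the noise vector, which is the quantity whose weight the verification algorithm of [21] checks.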
Using the same notation as above, the search version of the LPN problem with parameters $k \in \mathbb{N}$ (the length of a secret $s$), $\tau \in \mathbb{R}$ (the noise rate of each $e[i]$), and $n \in \mathbb{N}$ (the number of samples) asks to find a k-bit secret $s$ from the n noisy linear equations resulting from $b = A \cdot s \oplus e$, where $A$ is public. In our case, $r$ and $B$ (the biometric data) concatenated form the secret. The computationally hard problem underlying the security (i.e., the computational hiding property) of the LPN commitment scheme is the search LPN problem, which can be stated as the NP-complete problem of decoding random linear codes [22]. Since the decoding problem in random linear codes is known to be robust for quantum as well as for classical computers, the search LPN problem is suitable for the construction of quantum-resistant commitments of secret biometric data $B$.
Setting n = θ(k) = θ(l + m) large enough, the commitment scheme becomes computationally hiding and perfectly binding (with overwhelming probability over the choice of A). On the one hand, the binding property is satisfied by the large distance of the code generated by the random matrix A. On the other hand, the hiding property is satisfied by the LPN assumption which implies that A · s ⊕ e is pseudorandom.
Let us define a linear code $C$ as a k-dimensional subspace of $\{0,1\}^n$. In the decoding problem, the input is a noisy version of a codeword $c \in C$, $c \oplus e$, with error vector $e \in \{0,1\}^n$ of Hamming weight $w$. In a typical setting, the weight $w$ is upper bounded by the code distance $d$, which is the minimum Hamming distance between two codewords (full distance decoding). The target of decoding is to recover the codeword $c$ (which is equivalent to finding $e$).
Every instance of the LPN problem is an instance of a syndrome decoding problem where $n$ is the length of the codeword, $k$ is the linear code rank, $A$ is the generator matrix, and $w$ is the linear code distance ($d$) obtained from an error parameter $\tau$ as $w = n\tau$. Letting $n$ be the number of samples, we can write an LPN instance as the following matrix-vector tuple:
$$(A,\; b = A \cdot s \oplus e)$$
where $e = (e_1, \ldots, e_n)$ and the i-th rows of $A$ and $b$ represent the i-th LPN sample. Nowadays, the best algorithms for decoding random binary linear codes formulated as a syndrome decoding problem are based on Information Set Decoding (ISD) [23], a probabilistic decoding strategy that essentially tries to guess $k$ correct positions in the noisy received word, $b$. The running time, $T$, of decoding algorithms is typically a function of the parameters $n$, $k$ and $w$. If the Gilbert-Varshamov bound is used, $w$ is a function of $n$ and $k$, and therefore the running time can be expressed as a function of $n$ and $k$ only. For all Information Set Decoding algorithms, the highest running time is achieved when the code rate $k/n$ is slightly below 1/2. In that case, the ISD algorithms offer exponential running times of the form $T(n) = 2^{\alpha n}$, where $\alpha$ is a constant which can be used as a metric to compare the different algorithms.

B. COMPARISON OF BIOMETRIC LPN COMMITMENTS IN THE PROTECTED DOMAIN
In general, the algorithms of a commitment scheme are: key generation (KGen), which outputs a public commitment key; commitment generation (Com), which outputs a commitment for a message; and verification (Ver), which verifies the commitment. In the LPN commitment scheme proposed in [21], KGen generates the public key $A$; Com outputs the randomness $r$ and the commitment from the public key $A$ and a message $m$: $\mathrm{Com}(m) = A \cdot (r \| m) \oplus e$; and Ver takes the key $A$, the randomness $r$, the commitment $\mathrm{Com}(m)$, and the message $m$, and outputs 1 (successful verification) if $\mathrm{Com}(m) \oplus A \cdot (r \| m)$ has weight $w$, and 0 (failed verification) otherwise.
In our proposal of biometric LPN commitments, the message is the biometric data, $B_t$ at enrollment and $B_v$ at matching, which should always be protected. Therefore, in our proposal, KGen generates the public matrix $A$; Com outputs $r_t$ and $\mathrm{Com}(B_t) = A \cdot (r_t \| B_t) \oplus e_t$ at enrollment, and $r_v$ and $\mathrm{Com}(B_v) = A \cdot (r_v \| B_v) \oplus e_v$ at matching; and Ver is modified to work only with protected data, that is, with commitments. Our verifier combines the biometric LPN commitments by an XOR operation as follows:
$$\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v) = A \cdot \big((r_t \oplus r_v) \,\|\, (B_t \oplus B_v)\big) \oplus (e_t \oplus e_v)$$
This result can be considered as a system of linear equations with $A$ as coefficient matrix and $A|[\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ as augmented matrix. If $\mathrm{Com}(B_t)$ and $\mathrm{Com}(B_v)$ are generated from the genuine prover, $e_t = e_v$. Hence, the XOR operation applied to genuine commitments results in
$$\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v) = A \cdot \big((r_t \oplus r_v) \,\|\, (B_t \oplus B_v)\big)$$
Since $r_t$ and $r_v$ can be known by the verifier, it can be checked whether the recovered $(r_t \oplus r_v)$ is correct. Then, the verifier employs $(B_t \oplus B_v)$ to compute the score measurement of the template and the input features. Typically, the score is based on the Fractional Hamming Distance (FHD), which can be computed as follows:
$$\mathrm{FHD}(B_t, B_v) = \frac{\mathrm{HD}(B_t, B_v)}{m}$$
where HD is the Hamming Distance, which equals the Hamming weight of $B_t \oplus B_v$, and $m$ is the total number of bits in the biometric data. In addition, the score can be based on the Jaccard Distance (JD), which can be computed as follows:
$$\mathrm{JD}(B_t, B_v) = \frac{2\,\mathrm{FHD}(B_t, B_v)}{\mathrm{FHW}(B_t) + \mathrm{FHW}(B_v) + \mathrm{FHD}(B_t, B_v)}$$
where FHW is the Fractional Hamming Weight (that is, the Hamming weight divided by $m$). In the case of the Jaccard distance, the Hamming weights of the biometric data are needed, but this is not a problem since they do not reveal any sensitive information about the biometric data. If the score calculated (based on $\mathrm{FHD}(B_t, B_v)$ or $\mathrm{JD}(B_t, B_v)$) is below an authentication threshold, the verification outputs 1 (success); otherwise, it outputs 0 (failure).
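Both scores can be obtained from the XOR of the feature vectors, as the following Python sketch illustrates. It assumes the standard identity for binary vectors, in which the Jaccard distance equals 2·HD divided by the sum of the two Hamming weights and HD; the function names and the 8-bit example vectors are hypothetical.

```python
def fhd_from_xor(diff):
    """Fractional Hamming Distance computed from B_t xor B_v."""
    return sum(diff) / len(diff)

def jd_from_xor(diff, hw_t, hw_v):
    """Jaccard distance from B_t xor B_v and the two Hamming weights.

    For binary vectors, |union| = (hw_t + hw_v + HD) / 2 and
    |intersection| = (hw_t + hw_v - HD) / 2, so JD = 2*HD / (hw_t + hw_v + HD).
    """
    hd = sum(diff)
    denominator = hw_t + hw_v + hd
    return 0.0 if denominator == 0 else 2 * hd / denominator

B_t = [1, 1, 0, 0, 1, 0, 1, 0]
B_v = [1, 0, 0, 1, 1, 0, 1, 0]
diff = [x ^ y for x, y in zip(B_t, B_v)]   # what the verifier recovers

fhd = fhd_from_xor(diff)
jd = jd_from_xor(diff, sum(B_t), sum(B_v))
print(fhd, jd)

# Cross-check against the set-based definition of the Jaccard distance.
intersection = sum(x & y for x, y in zip(B_t, B_v))
union = sum(x | y for x, y in zip(B_t, B_v))
print(1 - intersection / union)
```

The cross-check at the end uses the unprotected vectors only to validate the identity; a verifier working in the protected domain would have access to `diff` and the two Hamming weights only.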
If $\mathrm{Com}(B_t)$ and $\mathrm{Com}(B_v)$ are generated from genuine and impostor provers, respectively, $e_t \neq e_v$. In this case, the system of linear equations with $A$ as coefficient matrix and $A|[\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ as augmented matrix cannot be solved, because the rank of the augmented matrix is higher than the rank of the coefficient matrix. As stated by the Rouché-Frobenius theorem, the system has a solution if and only if the ranks of the coefficient matrix and the augmented matrix are equal. Therefore, the impostor is directly rejected without proceeding to a score measurement.
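The comparison in the protected domain can be sketched end to end in Python. This is a minimal illustration under stated assumptions: toy sizes (l = 8, m = 16, n = 52, w = 6), a non-cryptographic random generator, a plain Gauss-Jordan solver over GF(2), and hypothetical helper names. It is not the Matlab implementation used in this work.

```python
import random

def solve_gf2(A, b):
    """Solve A x = b over GF(2). Returns x, or None if rank([A|b]) > rank(A)."""
    n, k = len(A), len(A[0])
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    pivots, rank = [], 0
    for col in range(k):
        pivot = next((i for i in range(rank, n) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(n):
            if i != rank and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[rank])]
        pivots.append(col)
        rank += 1
    if any(rows[i][k] for i in range(rank, n)):
        return None                      # inconsistent: ranks differ
    x = [0] * k
    for i, col in enumerate(pivots):
        x[col] = rows[i][k]
    return x

def commit(A, r, B, e):
    s = r + B
    return [(sum(a & x for a, x in zip(row, s)) % 2) ^ ei for row, ei in zip(A, e)]

def noise(n, w, rng):
    e = [0] * n
    for pos in rng.sample(range(n), w):
        e[pos] = 1
    return e

def verify(A, l, com_t, r_t, com_v, r_v, threshold):
    diff = [x ^ y for x, y in zip(com_t, com_v)]
    solution = solve_gf2(A, diff)
    if solution is None:
        return False                     # different noise vectors: reject
    if solution[:l] != [x ^ y for x, y in zip(r_t, r_v)]:
        return False                     # recovered r_t xor r_v is wrong
    fhd = sum(solution[l:]) / (len(solution) - l)
    return fhd < threshold

rng = random.Random(7)                   # toy generator, not secure
l, m, n, w = 8, 16, 52, 6
A = [[rng.randrange(2) for _ in range(l + m)] for _ in range(n)]
e_user, e_other = noise(n, w, rng), noise(n, w, rng)
B_t = [rng.randrange(2) for _ in range(m)]
B_v = B_t[:]
B_v[0] ^= 1
B_v[5] ^= 1                              # genuine probe with two bit errors
r_t = [rng.randrange(2) for _ in range(l)]
r_v = [rng.randrange(2) for _ in range(l)]

com_t = commit(A, r_t, B_t, e_user)
genuine = verify(A, l, com_t, r_t, commit(A, r_v, B_v, e_user), r_v, 0.25)
wrong_key = verify(A, l, com_t, r_t, commit(A, r_v, B_v, e_other), r_v, 0.25)
print(genuine, wrong_key)
```

With the same noise vector the system is consistent and the score is computed from the recovered B_t ⊕ B_v. With a different noise vector the system is inconsistent except with negligible probability, so the probe is rejected even though its biometric features are close to the template.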

C. USE OF BIOMETRIC LPN COMMITMENTS IN AN AUTHENTICATION PROTOCOL
In this work, we apply biometric LPN commitments in the typical scenario where cloud-based services and distributed architectures are employed, as proposed in [13]. The entities involved are: 1) N users (i = 1, . . . , N ), each one with a client device; 2) a client device which obtains user biometrics, identities and keys; 3) an authentication server in charge of the verification of biometric LPN commitments; and 4) a database server for (cloud) storage. In this protocol, there are two authentication factors: 1) the biometrics, and 2) the knowledge of a user key or the possession of a token with the user key stored in a secure memory or reconstructed with a Physical Unclonable Function (PUF) [24]. In the following, the knowledge of a user key is considered, as being more general [13].
The enrollment and verification phases are illustrated in Fig. 1 and Fig. 2. During the enrollment phase, the client device acquires the biometric samples $s_{ti}$, the user key $k_i$ and the user identity $ID_i$. From the biometric samples $s_{ti}$, the client device extracts the biometric features $B_{ti}$. The noise vector $e_i$ is derived from the user key $k_i$ and the user identity $ID_i$ by using what we call a Weighted Key Derivation Function (WKDF). This function starts from an all-zero vector of $n$ elements. Since the resulting vector $e_i$ must have a constant weight $w$, as commented in Subsection II.A, $w = n\tau$ ones are inserted in the sequence of zeros. The positions of the $w$ ones are drawn uniformly at random from the range $[1, n]$ by a deterministic random generator, which provides the same positions whenever the user introduces the same $k_i$ and $ID_i$. If a random position is repeated, it is discarded, and a new position is generated until $w$ ones are inserted. More details about this function are given in the following Section. A practical implementation can be seen in [25].
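The WKDF steps above can be sketched in Python. As an assumption made only for illustration, SHA-256 in counter mode replaces the deterministic random generator; this is neither the Matlab implementation of Section III nor the implementation in [25], and the function name `wkdf` and the example key and identity are hypothetical.

```python
import hashlib

def wkdf(user_key: bytes, user_id: bytes, n: int, w: int):
    """Derive an n-bit vector with exactly w ones from (user_key, user_id).

    Positions come from SHA-256 applied to key, identity and a counter
    (an assumed stand-in for the deterministic random generator).
    Repeated positions are discarded and a new one is drawn.
    """
    if not 0 <= w <= n:
        raise ValueError("w must be between 0 and n")
    e = [0] * n
    counter = 0
    placed = 0
    while placed < w:
        block = hashlib.sha256(
            user_key + b"|" + user_id + b"|" + counter.to_bytes(8, "big")
        ).digest()
        counter += 1
        # Reducing a 256-bit value modulo n leaves a negligible bias.
        position = int.from_bytes(block, "big") % n
        if e[position]:
            continue            # repeated position: discard it
        e[position] = 1
        placed += 1
    return e

e_first = wkdf(b"correct horse", b"alice", 904, 112)
e_again = wkdf(b"correct horse", b"alice", 904, 112)
e_wrong = wkdf(b"wrong key", b"alice", 904, 112)
print(sum(e_first), e_first == e_again, e_first == e_wrong)
```

The same key and identity always give the same vector, while a different key gives a different vector except with negligible probability, which is the property the verifier relies on to reject commitments created with impostor secrets.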
The random vector $r_{ti}$ is generated by using a Random Number Generator (RNG). The public matrix $A_i$, which is obtained by the KGen algorithm, can be stored locally in the client device. Then, the associated biometric LPN commitment $\mathrm{Com}(B_{ti}) = A_i \cdot (r_{ti} \| B_{ti}) \oplus e_i$ is created. The client device sends $(ID_i, \mathrm{Com}(B_{ti}), r_{ti})$ to the authentication server. The authentication server maps $ID_i$ to a unique index $i$, stores $(i, ID_i)$ in its local database and sends $(i, \mathrm{Com}(B_{ti}), r_{ti})$ to the database server for storage.
During the verification phase, the client device acquires the biometric samples $s_{vi}$, the user key $k_i$ and the user identity $ID_i$. The input biometric features $B_{vi}$ are extracted from the biometric samples $s_{vi}$, $e_i$ is derived by using the WKDF from the input user key $k_i$ and the user identity $ID_i$, and $r_{vi}$ is generated by using the RNG. Then, the associated biometric LPN commitment $\mathrm{Com}(B_{vi}) = A_i \cdot (r_{vi} \| B_{vi}) \oplus e_i$ is created using the retrieved $A_i$. The client device sends $(ID_i, \mathrm{Com}(B_{vi}), r_{vi})$ to the authentication server, together with $A_i$ (although this is a public matrix that could be obtained in another way). The authentication server recovers the index $i$ associated with the received $ID_i$ from its local database, and retrieves $(\mathrm{Com}(B_{ti}), r_{ti})$ from the database server by using a Private Information Retrieval (PIR) scheme. Then, the authentication server carries out the verification algorithm as described in Subsection II.B. PIR is a protocol that allows the authentication server to retrieve an element from the database server without the owner of the database being able to determine which element was queried. A secure PIR is employed together with database anonymization, as proposed in [13], to satisfy user identity privacy.
The communication channels among the protocol entities are assumed to be secure, which is a usual scenario. This means that an external adversary cannot intercept or modify a message communicated through the channels. Besides, the client device is assumed to be trusted, that is, we assume that it neither stores user IDs, keys or biometric samples nor executes malicious software. Finally, at the enrollment phase, all the entities are assumed to behave honestly.
However, an external adversary can use the client device to carry out impersonation attacks, which is the Stolen Token scenario discussed in the Introduction, and can also attack the information stored in the database server. Biometric LPN commitments are robust to these attacks, as described in the following section.
In addition, since external adversaries cannot obtain more information than the internal ones, the protocol considers malicious authentication and database servers at the verification phase. If the authentication server is malicious, the target is to learn the user biometrics or keys. However, this is not possible because the authentication server does not have access to this information. If the database is malicious, the target is to obtain the link between the commitment and the user identity. However, since the database is anonymized and a PIR protocol is employed, the user identity privacy is satisfied.

D. SECURITY ANALYSIS OF THE BIOMETRIC LPN COMMITMENT AS TEMPLATE PROTECTION SCHEME
According to the ISO/IEC 24745:2011 standard on biometric information protection [1], the biometric template protection schemes should fully meet the security requirements of irreversibility, revocability (or renewability) and unlinkability.
Irreversibility is related to the difficulty of recovering the original biometric features from the protected template. Irreversibility in an LPN commitment is based on the security of the LPN problem, that is, the hardness of decoding random linear codes (an NP-complete problem resistant to quantum algorithms) [20]. Since $A$ and $e$ are both random, the resulting LPN commitment is random. Hence, in terms of Shannon entropy, the entropy in bits of the biometric LPN commitments is practically 100%, independently of the biometric feature. The entropy provided by Fuzzy Commitments is lower, as depicted in Table 1, with data taken from [5].
Revocability (or renewability) is related to the ability to create a new and different protected template from the same biometric features of the same individual $i$ by using different keys. This security requirement is associated with the binding property of an LPN commitment [21]. If $\mathrm{Com}(B_{ti}) = A_i \cdot (r_{ti} \| B_{ti}) \oplus e_i$ is created from the biometric features $B_{ti}$, another random commitment $\mathrm{Com}'(B_{ti}) = A_i \cdot (r'_{ti} \| B_{ti}) \oplus e'_i$ can be created from the same features.
The decodability-based cross-matching attack presented in [26] for Fuzzy Commitments is based on XOR-ing two Fuzzy Commitments created with the same linear Error Correction Code. The attack checks whether the result is decodable and, hence, detects that the biometric features are similar. In biometric LPN commitments, this attack is not possible since $e_i$ and $e'_i$ would have to be equal for the system of linear equations to be solvable.
A FAR attack is possible due to the interclass correlation between biometric samples from different individuals that are very similar. It considerably reduces the security of Fuzzy Commitments and salting schemes, as discussed in the Introduction. For biometric LPN commitments, if the value of $e$ is unknown, the equation system based on $A|[\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ cannot be solved even if the biometric features $B_t$ and $B_v$ are similar. Therefore, FAR attacks are avoided by a biometric LPN commitment-based template protection scheme.
In addition to these security requirements, a template protection scheme should maintain its security under the so-called Stolen Token scenario. Originally, the Stolen Token scenario comes from the BioHashing technique [27], where a physical device or token stores the user key. In our application context, this scenario is possible since an attacker can access the client device and employ it for recognition during the verification phase. The commitment is then created with $e_v = e_t$ and the verifier obtains a matching score from $[B_t \oplus B_v]$. However, $B_t$ belongs to the genuine individual and $B_v$ belongs to the impostor. Therefore, the recognition results are the same as in the unprotected system.
Concerning similarity-based attacks, if an attacker knows the protected template $\mathrm{Com}(B_{ti}) = A_i \cdot (r_{ti} \| B_{ti}) \oplus e_i$, generates first guesses $B'$ randomly, and transforms them to the protected domain, $\mathrm{Com}(B') = A_i \cdot (r' \| B') \oplus e'$, the distance between $\mathrm{Com}(B_{ti})$ and $\mathrm{Com}(B')$ does not reveal information about the distance between $B_{ti}$ and $B'$, because $\mathrm{Com}(B_{ti})$ and $\mathrm{Com}(B')$ are random (computationally hiding). To carry out a similarity-based attack in the authentication protocol described above, the attacker would have to discover the association between a commitment and a user identity, that is, break the database anonymization, and, in addition, would have to employ the client device of that user, that is, be in the Stolen Token scenario. Only then is the attacker able to generate $\mathrm{Com}(B') = A_i \cdot (r' \| B') \oplus e_i$ and, from the distance between $\mathrm{Com}(B_{ti})$ and $\mathrm{Com}(B')$, extract information about the distance between $B_{ti}$ and $B'$.

III. IMPLEMENTATION AND PERFORMANCE EVALUATION A. SOFTWARE IMPLEMENTATION OF THE BIOMETRIC LPN COMMITMENT-BASED PROTECTION SCHEME
Our proposal has been developed in Matlab and thus the implementation of operations is based on Matlab functions. The first step to create a biometric LPN commitment is to generate the keys. The generation of the n · (l + m)-bit matrix A requires a uniformly distributed random generator. This is possible by employing the Matlab function rand if the result is rounded. The generation of the n-bit vector e requires a weighted uniform random bit generator with Hamming weight equals to nτ . The Matlab function randperm determines randomly the positions of the nτ elements of e with value 1. The rest of the elements are established to 0. A seed is employed by randperm which is associated to the user identity and key. In this way, the Matlab function randperm acts as a Weighted Key Derivation Function (WKDF).
The LPN commitment $\mathrm{Com}(B) = A \cdot (r \| B) \oplus e$ is composed of binary (AND) multiplications and binary (XOR) additions. The LPN commitment operation is translated to Matlab code as a modulo-2 operation applied to the sum of $A \cdot (r \| B)$ and e. Beforehand, the biometric features B are extracted and appended to r. The l-bit vectors r are generated with a uniform random bit generator based on the Matlab function rand.
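The commitment operation can be sketched in a few lines of Python, shown here only as an illustration and not as the authors' Matlab code. The names commit and random_bits are hypothetical, and the use of the secrets module for r is a choice of this sketch.

```python
import secrets


def random_bits(length):
    """Return `length` uniform random bits (used here for the vector r)."""
    return [secrets.randbits(1) for _ in range(length)]


def commit(A, r, B, e):
    """Compute Com(B) = A.(r||B) XOR e over GF(2).

    A is a list of n rows of k = len(r) + len(B) bits, e is a list of n bits.
    """
    x = r + B  # concatenation r||B
    if len(A) != len(e) or any(len(row) != len(x) for row in A):
        raise ValueError("dimension mismatch")
    # AND for the products, then the sum plus e reduced modulo 2 (XOR)
    return [(sum(a & b for a, b in zip(row, x)) + e_i) % 2
            for row, e_i in zip(A, e)]
```

Because every operation is linear over GF(2), the XOR of two commitments computed with the same A and e equals A applied to the XOR of the two committed vectors, with e cancelling out.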
At the verification phase, the authentication server has to solve the system of linear equations with A as coefficient matrix, $[\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ as vector of constant terms, and $A \mid [\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ as augmented matrix. To apply Gaussian elimination, A is reduced to upper triangular form. The gflineq Matlab function used for this operation finds a particular solution over the Galois field of two elements, GF(2). Two types of operations are required: 1) swapping the current row with a row containing a pivot element, and 2) clearing all non-zero elements in the column except the pivot element, and setting the pivot element to one, by adding to one row a scalar multiple of another and reducing modulo 2. Given the constant terms $[\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$, the authentication server first checks whether the rank of the augmented matrix $A \mid [\mathrm{Com}(B_t) \oplus \mathrm{Com}(B_v)]$ is k (like that of the coefficient matrix A). If the ranks differ, the authentication server ends the verification with a failure.

VOLUME 8, 2020
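The verification step can be illustrated with a small Gauss-Jordan solver over GF(2). This is a Python sketch under stated assumptions, not the gflineq routine used by the authors: the function names are hypothetical, A is assumed to have full column rank k, and both commitments are assumed to share the same A and e, so that their XOR equals A applied to (r_t XOR r_v) || (B_t XOR B_v).

```python
def matvec_gf2(A, x):
    """Return A.x over GF(2)."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]


def solve_gf2(A, c):
    """Solve A.x = c over GF(2); return x, or None if no unique solution exists."""
    n, k = len(A), len(A[0])
    M = [row[:] + [c_i] for row, c_i in zip(A, c)]  # augmented matrix A|c
    for col in range(k):
        # 1) find a row at or below the current one with a 1 in this column
        pivot = next((i for i in range(col, n) if M[i][col]), None)
        if pivot is None:
            return None  # rank(A) < k
        M[col], M[pivot] = M[pivot], M[col]
        # 2) clear the column everywhere else by XOR-ing the pivot row
        for i in range(n):
            if i != col and M[i][col]:
                M[i] = [a ^ b for a, b in zip(M[i], M[col])]
    # Rows k..n-1 now have zero coefficients; a 1 in the last column there
    # means rank(A|c) > rank(A), i.e. the system is inconsistent.
    if any(M[i][k] for i in range(k, n)):
        return None
    return [M[i][k] for i in range(k)]


def protected_hamming_distance(A, com_t, com_v, l):
    """Hamming weight of B_t XOR B_v recovered from two commitments, or None."""
    c = [a ^ b for a, b in zip(com_t, com_v)]
    x = solve_gf2(A, c)
    if x is None:
        return None  # verification failure
    return sum(x[l:])  # skip the first l bits, which hold r_t XOR r_v
```

When the two commitments were built with different e values, the system is typically inconsistent and the function returns None, which corresponds to the rank check that makes the server end the verification with a failure.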

B. BIOMETRIC LPN COMMITMENT PARAMETERS AND PERFORMANCE
The LPN commitment parameters can be selected according to the latest results on Information Set Decoding (ISD) algorithms presented in [23]. In that work, the worst-case running time obtained for decoding random binary linear codes (considering full distance decoding) is $2^{0.0885 \cdot n}$, which means a security level of $0.0885 \cdot n$ bits. It is achieved for $k/n = 0.46$ with relative distance $w/n = d/n = 0.1237$, using an improved version of the BJMM decoding algorithm of Becker et al. [28]. The value of n is selected to achieve the security level and then k and w are obtained. For different security levels, Table 2 shows execution times of the main operations in the LPN commitment-based template protection scheme. Execution times are averages over ten runs of the software implementation described above on a 3.3 GHz Intel Core i5-7400 CPU. The most time-consuming operation is the triangularization of the matrix A. However, the operations and the values required for the upper triangularization can be pre-calculated at the enrollment phase and made known to the authentication server to speed up the comparison of biometric LPN commitments at the verification phase. Table 3 shows a comparison of our proposal to other proposals from the literature based on homomorphic encryption. The proposal in [18] offers a security level as high as ours (more than 80-bit security against exhaustive-search and birthday attacks). The results of our proposal consider the parameters selected in Table 2. The n values determine the protected vector length while the k values determine the unprotected vector length. The storage requirements of our proposal are the lowest. Regarding the cost of the operations at the verification phase, encryptions and decryptions are the most costly operations for homomorphic encryption approaches [29]. In contrast, our proposal does not require decryption and the operations involved are the simplest.
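The parameter selection described above amounts to simple arithmetic, sketched here in Python. The rounding rules (ceiling for n, nearest integer for k and w) and the function name are assumptions of this sketch, so the values it gives for other security levels may differ from those in Table 2. For 80 bits it reproduces n = 904 and k = 416, the values used later in the paper.

```python
import math

WORST_CASE_EXPONENT = 0.0885  # security bits per code bit, from [23]
RATE = 0.46                   # k/n at the worst case
RELATIVE_DISTANCE = 0.1237    # w/n = d/n at the worst case


def lpn_parameters(security_bits):
    """Return (n, k, w) for a target security level in bits."""
    n = math.ceil(security_bits / WORST_CASE_EXPONENT)
    k = round(RATE * n)
    w = round(RELATIVE_DISTANCE * n)
    return n, k, w
```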

IV. PRACTICAL REALIZATION WITH FINGER VEINS

A. BIOMETRIC RECOGNITION BASED ON FINGER VEINS
Although our proposal can be applied to any biometric trait represented by binary features, this section presents an example realization that protects finger vein features. The finger vein extractor employed is based on the Wide Line Detector, a state-of-the-art finger vein extractor [30] initially proposed in [31].
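The input to the Wide Line Detector is the brightness of a finger-vein image F and the output is a binary feature image V whose background pixels have the logic value '0' and whose vein pixels have the logic value '1'. A circular neighborhood region N with radius r is defined for each center pixel $(x_0, y_0)$ of F as follows:

$$N(x_0, y_0) = \left\{ (x, y) \;\middle|\; \sqrt{(x - x_0)^2 + (y - y_0)^2} \le r \right\}$$

and the brightness similarity between two pixels is measured as

$$s(x, y, x_0, y_0, u) = \begin{cases} 0, & F(x, y) - F(x_0, y_0) > u \\ 1, & \text{otherwise} \end{cases}$$

where u is a brightness contrast threshold. Then, each pixel $(x_0, y_0)$ in V is defined as follows:

$$V(x_0, y_0) = \begin{cases} 1, & m(x_0, y_0) \le g \\ 0, & \text{otherwise} \end{cases}$$

where m is the summation of the similarities within the circular neighborhood region:

$$m(x_0, y_0) = \sum_{(x, y) \in N(x_0, y_0)} s(x, y, x_0, y_0, u)$$

and g is a geometric threshold defined as half the maximum value that m can take.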
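A minimal pure-Python sketch of this detector is given below as an illustration only. It assumes the usual formulation of the Wide Line Detector: a neighbor counts as similar when its brightness exceeds that of the center by at most u, and a pixel is labeled as vein when the similarity count m does not exceed g. Leaving the border pixels (closer than r to the image edge) as background, and the function name itself, are further assumptions of the sketch.

```python
def wide_line_detector(F, r=5, u=1, g=41):
    """Return a binary image V (1 = vein, 0 = background) from brightness image F.

    F is a list of rows of numeric brightness values.
    """
    height, width = len(F), len(F[0])
    # Offsets of all pixels inside the circular neighborhood of radius r
    offsets = [(dy, dx)
               for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)
               if dx * dx + dy * dy <= r * r]
    V = [[0] * width for _ in range(height)]
    for y0 in range(r, height - r):
        for x0 in range(r, width - r):
            center = F[y0][x0]
            # m counts the neighbors that are not brighter than center + u
            m = sum(1 for dy, dx in offsets
                    if F[y0 + dy][x0 + dx] - center <= u)
            V[y0][x0] = 1 if m <= g else 0
    return V
```

With r = 5 the neighborhood contains 81 pixels, so half of the maximum value of m is 40.5, which is consistent with the threshold g = 41 used with this radius. A dark pixel inside a thin vein has few neighbors that are as dark as itself, so m stays below g and the pixel is marked as vein.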
Since the extracted feature vectors are unbalanced (with a large difference between the numbers of 1s and 0s), we propose to measure the matching score with the Jaccard distance, which is the score already discussed in Subsection II.B as Equation (5). With the knowledge of FHW($B_t$) and FHW($B_v$), the authentication server can compute this score from the biometric LPN commitments and compare it with a threshold to output a success or a failure. We point out that our matching score is normalized, while that proposed in [32] is not.
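To see why the two weights suffice, the following Python sketch computes the standard Jaccard distance, 1 minus the ratio of intersection to union, from the Hamming weight of $B_t \oplus B_v$ and the Hamming weights of the two vectors. It assumes that FHW denotes those Hamming weights and that Equation (5) corresponds to this standard definition; the function name is hypothetical.

```python
def jaccard_distance(hamming_distance, weight_t, weight_v):
    """Jaccard distance between two binary vectors from aggregate counts.

    Uses |intersection| = (w_t + w_v - d_H) / 2 and
         |union|        = (w_t + w_v + d_H) / 2,
    so the distance 1 - |intersection| / |union| equals 2*d_H / (w_t + w_v + d_H).
    """
    twice_union = weight_t + weight_v + hamming_distance
    if twice_union == 0:
        return 0.0  # both vectors are all-zero: treat them as identical
    return 2 * hamming_distance / twice_union
```

For example, the vectors 11001 and 10101 share two ones out of four positions set in either vector, so their Jaccard distance is 0.5; the function gives the same value from d_H = 2 and weights 3 and 3.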

B. ACCURACY ANALYSIS OF THE UNPROTECTED APPROACH
In order to obtain biometric recognition results, we applied the Wide Line Detector to extract features from the finger vein images of the Tsinghua University Finger Vein database [33], in particular from THU-FVFDT3 FV3_Test (which contains 4 samples of finger vein images for each of 610 individuals). We use the Wide Line Detector implementation from [34] with the parameters r = 5, u = 1 and g = 41.
Matching experiments were performed following the FVC (Fingerprint Verification Competition) protocol [35]. Genuine comparisons were made between every pair of samples corresponding to the same individual (in total, $(4 \cdot 3 / 2) \cdot 610 = 3{,}660$ comparisons). Impostor comparisons were made between the first sample of an individual and the first sample of each of the other individuals (in total, $610 \cdot 609 / 2 = 185{,}745$ comparisons).
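The two comparison counts follow directly from binomial coefficients, as this short Python check illustrates (the function name is hypothetical):

```python
from math import comb


def fvc_comparison_counts(individuals, samples_per_individual):
    """Return (genuine, impostor) comparison counts under the FVC protocol."""
    # every unordered pair of samples of the same individual
    genuine = comb(samples_per_individual, 2) * individuals
    # first sample of each individual against the first sample of every other
    impostor = comb(individuals, 2)
    return genuine, impostor
```

For 610 individuals with 4 samples each, it returns 3,660 genuine and 185,745 impostor comparisons.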
The finger vein image has $370 \times 576$ pixels, but the area centered on the middle of the image, which corresponds roughly to the middle phalanx, is usually described as the most stable and the most discriminant area for finger vein recognition, as indicated in [3]. Hence, we have evaluated finger vein feature vectors formed by $32 \times 64$ bits (that is, 2,048 bits), and no displacements were applied to the feature vectors, as in [3]. The EER (Equal Error Rate) obtained, that is, the error rate at the point where the False Rejection Rate equals the False Acceptance Rate, was 0.34%.

C. RECOGNITION ACCURACY ANALYSIS OF THE PROTECTED APPROACH
The analysis is performed by considering the parameters selected in Table 2 for 80-bit security with k = 416, which determines the number of divisions of the unprotected feature vector, and n = 904, which determines the protected feature vector length according to the number of divisions of the unprotected feature vector. For a 2,048-bit unprotected feature vector, eight 256-bit divisions are considered (with l = 160, m = 256 and k = l + m = 416). The time to compare two commitments is 101.84 ms using the Matlab implementation described above. Although this time is competitive, it can be reduced considerably if the code is optimized. Table 4 shows a comparison of the recognition accuracy of our proposal and other template protection schemes based on finger veins. Our proposal is the only one that does not reduce the recognition accuracy in the protected domain. The False Acceptance Rate of the protected approach is 0% because an impostor, who does not know the user-specific secret key (the user key in the authentication protocol in Subsection II.C), is directly rejected. The False Rejection Rate (FRR) can be adjusted depending on the authentication threshold selected for the biometric data. If the authentication threshold of the EER of the unprotected domain is also used in the protected approach, the FRR is 0.34%, as shown in Table 4. In that case, in the Stolen Token scenario, the EER of 0.34% is preserved. In all the other proposals, the recognition accuracy is always reduced when the protected approach is used.
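The division of the feature vector can be sketched as follows in Python. The helper names are hypothetical, and the length computation assumes one n-bit commitment per m-bit division, which is one reading of the description above rather than a detail confirmed by the text.

```python
def split_into_blocks(bits, m):
    """Split a bit list into consecutive m-bit divisions."""
    if len(bits) % m != 0:
        raise ValueError("feature length must be a multiple of m")
    return [bits[i:i + m] for i in range(0, len(bits), m)]


def protected_length(feature_bits, m, n):
    """Total protected length, assuming one n-bit commitment per division."""
    return (feature_bits // m) * n
```

With a 2,048-bit feature vector and m = 256, split_into_blocks returns eight divisions. Each division is extended with l = 160 random bits to reach k = 416 before being committed, and under the stated assumption the eight commitments occupy 8 · 904 = 7,232 bits.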

D. SECURITY ANALYSIS OF THE PROTECTED APPROACH
In order to evaluate unlinkability, we applied the framework proposed in [36] by considering the distributions of mated and non-mated instances. The Jaccard distances of mated instances are computed with the commitments of templates extracted from different samples of the same instance by using different e values. The Jaccard distances of non-mated instances are computed with the commitments of templates extracted from samples of different instances by using different e values. If both distributions coincide, the unlinkability of a scenario is proven. Fig. 3 proves the unlinkability of our proposal.
The revocability property is satisfied if different protected templates can be generated from the same sample by using different e values. The distribution of distances between such templates is also shown in Fig. 3. It overlaps extensively with the mated and non-mated distributions above and, therefore, the revocability of our proposal is also proven.
Regarding unlinkability, our proposal outperforms the results obtained by using re-mapping, warping and Alignment-Robust Hashing proposals included in [8]. The rest of the proposals do not provide unlinkability results. Revocability results are not provided by the proposals considered.
The evaluation of the resistance to similarity-based attacks of the biometric LPN commitments was performed according to [14], which considers that protected templates are secure only if the mutual information between the normalized distances of the impostor data in the protected and unprotected domains is very small. In [14], mutual information is assumed to be upper bounded by the variance of the distribution of the impostor protected distances. The results obtained are shown in Fig. 4. It illustrates that the impostor protected distances (y-axis) do not change with respect to their impostor unprotected distances (x-axis), that is, their correlation is quite small. In fact, the variance of the impostor protected distances was $4.104 \cdot 10^{-5}$, much lower than the variance obtained for the solution proposed in [14], whose value is 0.31425. Therefore, it is very difficult for an attacker to infer the unprotected distances.
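The two quantities discussed here, the variance of the protected distances and their correlation with the unprotected distances, can be computed as in the following Python sketch. The choice of the population variance and of the Pearson coefficient is an assumption of this sketch; the text does not state which estimators were used.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    sx, sy = math.sqrt(variance(xs)), math.sqrt(variance(ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)
```

Protected distances that stay concentrated around a single value regardless of the unprotected distance give a variance close to zero and a correlation close to zero, which is the behavior described for Fig. 4.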
Regarding the Stolen Token scenario, let us consider a scenario where an attacker is able to generate the biometric LPN commitment at the verification phase using the same values for A and e. As expected, Fig. 5 illustrates through the DET curve that the recognition results from the unprotected and the Stolen Token approaches are the same. In [7], the recognition performance in the Stolen Token scenario is significantly worse because the recognition results are affected by the dimensionality reduction of the BioHashing transformation.

V. CONCLUSION
In this work, we have proposed the use of LPN commitments to construct a biometric template protection scheme. To the best of our knowledge, this is the first such scheme based on the LPN problem. Its use is described with a dual-factor authentication protocol in a distributed scenario where authentication and database servers can be malicious.
Irreversibility is based on the LPN problem, which is the difficulty of decoding random linear codes. Parameters are selected to obtain security of 80, 128, 256 and 512 bits. The analysis of execution times (of the order of milliseconds using non-optimized code for the verification of biometric LPN commitments), template storage (with a protected vector length of approximately 2 times the length of the unprotected vector), and operation complexity (based on ANDs and XORs) shows that a practical realization has low cost. Hence, this scheme is feasible for hardware with constrained resources and for real-time verification. Recognition is achieved with an FAR of 0% and an FRR that can be adjusted depending on the authentication threshold selected for the biometric data and can be set to preserve the accuracy of the unprotected scheme in the Stolen Token scenario. Revocability, unlinkability, and resistance to FAR, cross-matching, and similarity-based attacks are also achieved. Experimental results are compared to other proposals from the literature based on homomorphic encryption, transformation, and biometric cryptosystems.
The application of biometric LPN commitments is possible for any biometric trait represented by binary features. In this work, the biometric LPN commitments are applied to finger veins extracted by the Wide Line Detector. For this realization, we have proposed a comparison of finger veins based on the Jaccard distance (more suitable for binary feature vectors with an unbalanced number of ones and zeros).
ILUMINADA BATURONE received the degree (Hons.) and Ph.D. degree (Hons.) in physics from the University of Seville, Seville, Spain, in 1991 and 1996, respectively. She has been with the Instituto de Microelectrónica de Sevilla, University of Seville-CSIC, since 1990. She is currently with the Department of Electronics and Electromagnetism, University of Seville, where she is a Full Professor. She has coauthored the books Microelectronic Design of Fuzzy Logic-Based Systems (CRC Press, 2000) and Fuzzy Logic-Based Algorithms for Video De-Interlacing (Springer, 2010). She has more than 150 scientific articles. She has participated in more than 40 Spanish and European research and industrial projects, leading 12 of them. She holds three patents. She is one of the developers of the Xfuzzy environment. Her current research interests include hardware security, microelectronic design of crypto-biometric systems, hardware design for embedded control, and neuro-fuzzy systems.