Unlinkable Zero-Leakage Biometric Cryptosystem: Theoretical Evaluation and Experimental Validation

Template protection is an issue of paramount importance for the design of secure and privacy-compliant biometric recognition systems. Template unlinkability, together with template irreversibility, is an essential requirement to properly guarantee template protection. In fact, it ensures that templates generated from the same trait, but used in different applications, cannot be linked to the same identity. This paper deals with the design of a system satisfying the unlinkability requirement. The robustness of the proposed solution is evaluated by exploiting methods stemming from the theory of stochastic optimization, as well as by using quantitative measures specifically proposed to characterize the unlinkability of biometric protection schemes. A case study using finger-vein biometrics is considered to test the proposed cryptosystem on non-ideal data. The proposed scheme guarantees 128 bits of security with acceptable false recognition rates in real-life conditions. Moreover, we provide guidelines to determine the parameters of the transformations to be applied to real biometric traits so as to ensure proper recognition, security, and unlinkability performance.


I. INTRODUCTION
THE use of biometric traits in automatic recognition systems offers several advantages over traditional approaches relying on passwords or tokens, since biometric characteristics cannot be lost or forgotten and, in general, allow a much easier and more natural human-machine interaction. Nonetheless, the usage and storage of biometric data also pose several threats [1], [2]. For instance, if a biometric identifier is compromised, an attacker could exploit the collected information to impersonate its owner and fraudulently gain access to specific resources. Therefore, the need to revoke biometric credentials could arise, posing an issue given the limited number of available traits. Furthermore, biometric data, when used as universal identifiers, could be used to track the users' activities across different domains, thus posing privacy concerns. Biometric data can also reveal sensitive information about their owners, which might be exploited for discriminatory purposes [3]. Not surprisingly, biometric information is treated as strictly confidential by the EU General Data Protection Regulation (GDPR), which recommends managing it with adequate levels of security. Therefore, the aforementioned concerns have to be carefully taken into account and properly addressed when designing a biometric recognition system. In more detail, the templates generated from the raw biometric data should be protected as effectively as the traits themselves, since it is often possible to adequately reconstruct the original data from their representations [4]. However, even if a template is not reversible, it must be considered sensitive data. Unfortunately, standard cryptographic algorithms cannot be effectively used to protect a biometric template, since comparison in the encrypted domain is not feasible due to the noisy nature of biometric data [1].
Homomorphic encryption has been exploited to tackle the aforementioned disadvantage, designing pipelines where the recognition step is performed in the encrypted domain [5]. However, the related computational complexity is relatively high, and trusted servers are needed to manage the exchange of the involved data, making this solution impractical for many applications.
Several biometric template protection (BTP) schemes have been proposed to design secure and privacy-compliant biometric systems. BTP methods generally modify the available biometric representations to generate alternative templates that do not leak information about the original data. According to the ISO/IEC 24745 standard [6], a proper BTP method should satisfy the following properties:
• irreversibility: given a protected template, it should not be possible to reconstruct its unprotected version;
• renewability: from a given biometric sample, it should be possible to issue multiple protected templates;
• unlinkability: given two protected templates, generated from the same biometric sample and stored in different datasets, it should not be possible to determine that they belong to the same subject;
• performance: the use of a BTP scheme should not significantly affect the system recognition performance.
BTP schemes are typically implemented by means of two distinct methodologies, namely cancelable biometrics [7] and biometric cryptosystems [8]. Cancelable biometrics refers to methods applying either invertible or non-invertible transformations to the biometric data. While non-invertible transformation-based approaches have been proposed for several of the most used biometric traits [9], [10], their irreversibility has rarely been evaluated through rigorous proofs, due to the intrinsic difficulty of proving the non-invertibility of a function against any possible kind of attack.
On the other hand, biometric cryptosystems [11] can be classified into key-generation and key-binding approaches. The former extract cryptographic keys from the considered biometric data. The latter aim at securing a cryptographic key by means of biometric data and vice versa: the key and the biometric data are combined into a template that can be split into a pseudonymous identifier (PI) and auxiliary data (AD) [6].
Key-generation approaches commonly lack renewability and unlinkability since, by definition, the key is generated by exploiting the biometric trait only. On the other hand, detailed evaluations have been performed on the information about the original secrets leaked from templates protected with key-binding approaches [12]. Fundamental trade-offs among recognition performance, irreversibility, and security have been, for instance, discussed in [13] and [14]. In this regard, it has been demonstrated that both security, measured as the mutual information between the employed secret key and the stored AD, and irreversibility, intended as the difficulty of retrieving the original biometric information from the AD, can be improved only at the cost of worsening the achievable recognition performance. This paper stems from our previous work [15], where we proposed a zero-leakage key-binding approach based on the use of quantization index modulation (QIM), with the goal of embedding a secret key within a biometric representation. The approach in [15] guarantees that the generated AD does not reveal any information regarding either the employed secret binary key or the associated PI, thus achieving perfect security. Nonetheless, as detailed in the following, the scheme proposed in [15] is vulnerable to linkability attacks. In this paper we overcome this issue by designing a novel approach that enforces the desired template unlinkability, thus obtaining a zero-leakage biometric cryptosystem satisfying all the properties required by the ISO/IEC 24745 standard.
The effectiveness of the proposed approach is evaluated by testing its robustness against different linkage attacks, considering both methods stemming from the theory of stochastic optimization and quantitative measures specifically designed to characterize the unlinkability of biometric protection schemes. The influence of the parameters employed in the proposed approach on the security and unlinkability of the templates created from non-ideal biometric data is also investigated. Furthermore, in order to apply the proposed BTP scheme to real-life biometric data, finger-vein patterns are considered as a case study.
The paper is organized as follows. Section II briefly outlines the zero-leakage cryptosystem introduced in [15], here analyzed with respect to the unlinkability requirement, and then further developed. In Section III, the approach proposed to generate unlinkable templates is described. Its effectiveness against different attacks is tested in Section IV. Specifically, attacks aimed at linking templates generated from the same original biometrics, and protected using different keys, are taken into account, considering biometric data with ideal distributions. In Section V, the issues to be faced when dealing with non-ideal data are discussed, and guidelines to define the parameters employed in the proposed approach are provided. The experimental tests conducted to verify the effectiveness of the proposed method in practical scenarios are described in Section VI, while conclusions are given in Section VII.

Fig. 1. Zero-leakage biometric cryptosystem [15].

II. A ZERO-LEAKAGE CRYPTOSYSTEM
In this Section, the zero-leakage biometric cryptosystem presented in [15], and sketched in Figure 1, is briefly summarized. In more detail, in the enrolment stage, a fixed-length biometric representation w ∈ R^L and a secret key with K bits are used to generate the pair (PI, AD), where the PI is a hashed version of the employed key, whereas a transformed version of w and an encoded version of the key are used to generate the AD using QIM.
Specifically, the K bits of the key are encoded into a string of N bits through an error correcting code (ECC), to handle the intra-class variability of the considered biometric data. The use of highly efficient ECCs such as turbo codes, and the representation of the employed biometric templates with continuous variables, instead of binary ones as in the fuzzy commitment [16], are recommended for biometric cryptosystems in order to approach the Shannon limit during the decoding process, thus making it possible to achieve the best possible recognition performance in terms of false rejection rate (FRR) [17]. The N encoded bits are divided into L symbols, each corresponding to a, potentially different, number B of bits. Each symbol is embedded into a single coefficient w of the representation vector as:

$$ z = \left[ \Phi(w) - s \right]_{2\pi}, \qquad (1) $$

which represents the AD, where:
• [·]_{2π} is the modulo-2π operation;
• s is a symbol belonging to an alphabet with M elements and associated to the B bits to embed;
• Φ(·) is a point-wise function defined as follows:

$$ \Phi(w) = CDF_X^{-1}\left( CDF_W(w) \right), \qquad (2) $$

where CDF_W(w) and CDF_X(x) are, respectively, the cumulative distribution functions of the original biometric coefficient W and of the target variable X. As mentioned in [18], a zero-leakage biometric cryptosystem should guarantee that the auxiliary data Z leak only a negligible amount of information about the associated secret key S and the biometric trait X. Information-theoretic analysis [14] has proved that the mutual information between the employed biometric representation and the stored AD cannot be null. In fact, the assumption that I(X, Z) = 0 implies that Z would not retain any information about the employed biometric trait X, with the consequence that the only achievable operating condition would be the one with FRR = 100% [14]. It is therefore possible to design cryptosystems with only close-to-zero leakage about the biometric data [19].
On the other hand, the zero-leakage property is achievable for the employed secret key, assuring a null mutual information between the secret key and the AD [20], [21], i.e., I(S, Z) = 0. Within the considered framework, this latter requirement can be met by choosing the function Φ(·) in such a way that the characteristic function CF of the target variable X, i.e., the Fourier transform of its probability density function (PDF), satisfies the condition in Eq. (3) [15]. A family of random variables X fulfilling the requirement in Eq. (3) is the one whose PDF has a raised cosine shape [15], given in Eq. (4), with parameter γ ∈ [0, 1]. As shown in [15], the choice of the parameter γ is responsible for both the irreversibility and the capacity of the resulting coefficient x. Specifically, in the considered framework, the irreversibility can be evaluated by measuring, for each coefficient of the employed biometric representation, the root mean square error between the actual value x and its best estimate x̂(z) obtained by exploiting the knowledge of the associated AD z, as defined in Eq. (5). The irreversibility P, evaluated according to Eq. (5) as a function of the parameter γ and parameterized with respect to an increasing number of bits B embedded into x, is shown in Figure 2.
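The link between the raised cosine shape and the zero-leakage property can be checked numerically. The sketch below is only illustrative: it assumes the standard unit-area raised-cosine form with support [−π(1+γ), π(1+γ)], which is consistent with the ranges quoted in this paper but is not copied from Eq. (4), and the function names are hypothetical. It verifies that the density of [X]_{2π} is uniform, so that a shift by any symbol s leaves the distribution of the AD unchanged.

```python
import math

def raised_cosine_pdf(x, gamma):
    """Raised-cosine density with parameter gamma in [0, 1] (ASSUMED standard form)."""
    if gamma == 0.0:
        # Degenerate case: uniform density on [-pi, pi).
        return 1.0 / (2.0 * math.pi) if -math.pi <= x < math.pi else 0.0
    flat_edge = math.pi * (1.0 - gamma)    # end of the flat region
    outer_edge = math.pi * (1.0 + gamma)   # end of the support
    ax = abs(x)
    if ax <= flat_edge:
        return 1.0 / (2.0 * math.pi)
    if ax < outer_edge:
        return (1.0 + math.cos((ax - flat_edge) / (2.0 * gamma))) / (4.0 * math.pi)
    return 0.0

def wrapped_pdf(y, gamma):
    """Density of [X]_{2pi} at y in [-pi, pi): the PDF summed over its 2*pi shifts."""
    return sum(raised_cosine_pdf(y + 2.0 * math.pi * k, gamma) for k in (-1, 0, 1))

def max_deviation_from_uniform(gamma, points=1000):
    """Largest gap between the wrapped density and the uniform value 1/(2*pi)."""
    uniform = 1.0 / (2.0 * math.pi)
    grid = [-math.pi + 2.0 * math.pi * (i + 0.5) / points for i in range(points)]
    return max(abs(wrapped_pdf(y, gamma) - uniform) for y in grid)
```

Under these assumptions the deviation stays at floating-point noise level for every γ, which is the behavior one would expect from a target distribution that hides the embedded symbol.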
Higher values of irreversibility, corresponding to a negligible leakage about the original biometric information, are achieved for increasing values of γ. As suggested in [15], in the proposed scheme the γ value used in the employed transformation is determined with the goal of minimizing the information leakage about the used biometric data, thus achieving close-to-zero leakage about X, by guaranteeing that P ≈ 1.
On the other hand, in [15] it has also been shown that the capacity of each coefficient x, i.e., the upper bound of the number of bits B that can be embedded in the enrolment stage and reliably retrieved during verification using an ECC, decreases when increasing the value of γ. In Figure 3, the capacity C vs γ is depicted for a synthetic biometric coefficient x characterized by a signal-to-noise ratio (SNR) equal to 4.7 dB, as in [22]. It is evident that the use of larger values of γ improves the irreversibility of the created templates and yet reduces the capacity of the obtained representations. This behavior makes the coefficient capable of hosting a smaller number of bits, and results in a greater vulnerability to brute force attacks since the usable secret keys must be shorter. This confirms a trade-off between security and irreversibility [14].
As shown in the following sections, the parameter γ also influences the unlinkability of the proposed enhanced system, i.e., its ability to properly generate multiple protected templates from the same biometric representation.

III. PROPOSED APPROACH: GENERATION OF UNLINKABLE TEMPLATES
In this Section, we show the limits of the system we have proposed in [15] (see Section II) in terms of unlinkability, and we propose a possible solution for their mitigation.

Algorithm 1 Key Embedding Process
The method we have proposed in [15] appears not to be robust with respect to the unlinkability of protected templates generated from the same biometric data. In fact, as shown in [23], given a biometric trait x and a pair of keys k_1 and k_2 encoded into s_1 and s_2, the corresponding AD are obtained as

$$ z_1 = [x - s_1]_{2\pi}, \qquad z_2 = [x - s_2]_{2\pi}. $$

We observe that:

$$ [z_1 - z_2]_{2\pi} = [s_2 - s_1]_{2\pi}, $$

meaning that the difference between two AD generated from the same biometric data x is bound to a discrete set of values. On the other hand, when the AD are created from different biometric representations x_1 and x_2, we have

$$ [z_1 - z_2]_{2\pi} = [(x_1 - x_2) + (s_2 - s_1)]_{2\pi}, $$

which is not restricted to such a discrete set. It is worth observing that even when the two biometric representations are not identical, but differ because of the intra-class variability of the considered trait, [z_1 − z_2]_{2π} would still be close to [s_2 − s_1]_{2π}, as shown in Figure 4. Therefore, a linkability attack would be able to relate the two identifiers, thus posing privacy concerns. In order to overcome the limitations of the method in [15], we propose the following approach. With reference to Figure 5 and to the pseudo-code in Algorithm 1, the generic coefficient x_i of the template x is obtained as follows:

$$ x_i = g\Big( \sum_{j=1}^{L} A_{ij} \, f(w_j) \Big). $$

In detail, f(·) is a point-wise function:

$$ f(w) = CDF_N^{-1}\left( CDF_W(w) \right), $$

where CDF_N is the cumulative distribution function of the standard normal distribution, designed such that the coefficients u_i = f(w_i) have a normal distribution N(0, 1). Assuming that the template coefficients w_i are statistically independent, as commonly assumed in the analysis of biometric cryptosystems [14], u = (u_1, u_2, . . . , u_L)^⊺ will be normally distributed with an identity covariance matrix, namely u ∼ N(0, I). Therefore, u is a realization of a rotationally symmetric distribution. The vector v = Au, where A is a record-specific L × L orthonormal matrix, is therefore a realization of the same process N(0, I). Roughly speaking, given a properly designed matrix A, it is not possible to distinguish two independent realizations (u, u⋆) ∼ N(0, I) × N(0, I) from the pair (u, Au). This is a key factor for the unlinkability of AD instances.
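The template x, with coefficients distributed as in Eq. (4), is finally obtained by applying to v a proper point-wise transformation g(·):

$$ g(v) = CDF_X^{-1}\left( CDF_N(v) \right), $$

with CDF_N the cumulative distribution function of the standard normal distribution. Then, the pair (PI, AD) is obtained as summarized in Section II, with the AD given by (z, A).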
It can be noted that, when A = I, the proposed scheme is equivalent to the one in [15]:

$$ x_i = g\left( f(w_i) \right) = \Phi(w_i). $$

This suggests that not all orthonormal matrices A ∈ R^{L×L} are eligible for the proposed system. This aspect will be further explained in Section V. In summary, the key embedding process can be expressed by the pseudo-code given in Algorithm 1. The inverse procedure, i.e., the key retrieval, is summarized by the pseudo-code given in Algorithm 2. Further implementation details are provided in Section VI-D.
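The embedding chain described in this Section (normalization, record-specific rotation, mapping to the target distribution, QIM) can be sketched as follows. This is not the paper's Algorithm 1 but a minimal illustration under stated assumptions: the normalized coefficients u = f(w) are taken as already available, since f(·) depends on a CDF estimated from training data; the target CDF is the standard raised-cosine one, not copied from Eq. (4); ECC encoding and key hashing are omitted; all function names are hypothetical.

```python
import math

def mod_2pi(value):
    """Wrap a real value into [-pi, pi)."""
    return (value + math.pi) % (2.0 * math.pi) - math.pi

def normal_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def rc_cdf(x, gamma):
    """CDF of the ASSUMED standard raised-cosine density with parameter gamma."""
    a = math.pi * (1.0 - gamma)   # end of the flat region
    b = math.pi * (1.0 + gamma)   # end of the support
    if x <= -b:
        return 0.0
    if x >= b:
        return 1.0
    if abs(x) <= a:
        return 0.5 + x / (2.0 * math.pi)
    d = abs(x) - a
    upper = 1.0 - gamma / 2.0 + (d + 2.0 * gamma * math.sin(d / (2.0 * gamma))) / (4.0 * math.pi)
    return upper if x > 0 else 1.0 - upper

def g(v, gamma):
    """Map a standard normal value to the raised-cosine target by CDF matching."""
    target = normal_cdf(v)
    lo, hi = -math.pi * (1.0 + gamma), math.pi * (1.0 + gamma)
    for _ in range(80):           # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if rc_cdf(mid, gamma) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def embed_key(u, A, symbols, gamma):
    """u: normalized coefficients f(w); A: orthonormal matrix given as a list of rows."""
    v = [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in A]   # v = A u
    x = [g(v_i, gamma) for v_i in v]                                  # x = g(v)
    return [mod_2pi(x_i - s_i) for x_i, s_i in zip(x, symbols)]       # z = [x - s]_{2pi}
```

For instance, `embed_key([0.3, -1.2], A, [0.0, math.pi], 0.5)` with a 2 × 2 rotation matrix `A` returns two AD coefficients in [−π, π). Bisection is used here only to keep the sketch dependency-free; a tabulated inverse CDF would serve equally well.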
It is worth mentioning that a similar approach, yet relying on permutation matrices, has been proposed in [24] to improve the unlinkability of the fuzzy commitment BTP scheme [16]. A permutation matrix Π ∈ {0, 1}^{L×L} is a special kind of orthonormal matrix, whose entries are all zero except for a single 1 in each row and column. However, the use of a permutation matrix would be ineffective to obtain the desired unlinkability property for the considered zero-leakage biometric cryptosystem. In fact, if a permutation matrix Π is used instead of a generic orthonormal matrix A, we obtain that g(Πu) = Πg(u), since g(·) is a point-wise function. Hence, given a pair of AD z_1 and z_2, derived from the same biometric trait w, we have:

$$ z_1 = [\Pi_1 g(u) - s_1]_{2\pi} \quad \text{and} \quad z_2 = [\Pi_2 g(u) - s_2]_{2\pi}, $$

from which

$$ [\Pi_1^{\top} z_1 - \Pi_2^{\top} z_2]_{2\pi} = [\Pi_2^{\top} s_2 - \Pi_1^{\top} s_1]_{2\pi} = [\tilde{s}_2 - \tilde{s}_1]_{2\pi}, $$

where the last step exploits the fact that the permutation of a string comprising a set of discrete symbols produces another string with coefficients belonging to the same alphabet. Therefore, this approach leads to the same scenario in Eq. (6) and Fig. 4(a), thus failing to provide unlinkability.
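The failure of the permutation-based variant can be reproduced with a small simulation. The sketch below is illustrative only and rests on assumptions not stated in the paper: the transformed coefficients x = g(u) are drawn uniformly in [−π, π) (the γ = 0 case), the alphabet consists of M = 4 equally spaced symbols, and all names are hypothetical. After undoing the public permutations, the difference between two AD from the same trait falls on multiples of 2π/M, whereas AD from different traits do not.

```python
import math
import random

def mod_2pi(value):
    """Wrap a real value into [-pi, pi)."""
    return (value + math.pi) % (2.0 * math.pi) - math.pi

def protect(x, symbols, perm):
    """AD for a permutation-based variant: z_i = [x[perm[i]] - s_i]_{2pi}."""
    return [mod_2pi(x[p] - s) for p, s in zip(perm, symbols)]

def undo_permutation(values, perm):
    """Apply the transposed permutation, sending entry i back to position perm[i]."""
    out = [0.0] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out

def offsets_in_symbol_steps(z1, perm1, z2, perm2, num_symbols):
    """Differences of the re-aligned AD, expressed in units of 2*pi/M."""
    step = 2.0 * math.pi / num_symbols
    a, b = undo_permutation(z1, perm1), undo_permutation(z2, perm2)
    return [mod_2pi(u - v) / step for u, v in zip(a, b)]

def demo(same_trait, length=8, num_symbols=4, seed=0):
    """Largest distance of any re-aligned AD difference from the symbol lattice."""
    rng = random.Random(seed)
    step = 2.0 * math.pi / num_symbols
    x1 = [rng.uniform(-math.pi, math.pi) for _ in range(length)]
    x2 = x1 if same_trait else [rng.uniform(-math.pi, math.pi) for _ in range(length)]
    ads = []
    for x in (x1, x2):
        perm = list(range(length))
        rng.shuffle(perm)                      # record-specific permutation (public AD)
        symbols = [step * rng.randrange(num_symbols) for _ in range(length)]
        ads.append((protect(x, symbols, perm), perm))
    (z1, p1), (z2, p2) = ads
    offsets = offsets_in_symbol_steps(z1, p1, z2, p2, num_symbols)
    return max(abs(o - round(o)) for o in offsets)
```

Calling `demo(True)` yields a value at floating-point noise level, while `demo(False)` yields a clearly non-zero value, which is exactly the cue a linkage attack would exploit.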

IV. UNLINKABILITY ANALYSIS
As shown in [14], no helper data scheme can guarantee, from an information-theoretic perspective, a null mutual information between the original biometric data and the stored AD, and consequently a perfect unlinkability. In fact, a certain amount of template information has to be retained in the AD to absorb the intra-class variability of the biometric trait and guarantee reliable recognition performance. Nevertheless, the linkability attack can be made computationally infeasible. In this regard, the unlinkability property of the proposed approach is here investigated.
Specifically, we analyze the system robustness against two different attacks. The attack described in Section IV-A relies on the assumption that the space discretization carried out by the QIM module should match in case of mated biometric traits. We show that the verification of such a hypothesis reduces to a Boolean Satisfiability (SAT) problem, hence it can only be solved by brute force. The attack described in Section IV-B attempts to link distinct AD by matching the best estimates of the biometric templates the attacker can obtain from the AD themselves. We show that the system can be set in a way that the mated estimates are indistinguishable from non-mated ones, thus making the attack ineffective.

A. Space Discretization Attack
The first attack we consider is an extension of the one proposed in [23], described by Eq. (6) and Figure 4. The direct application of such an attack is not effective against the proposed system, since each coefficient of z does not depend on a single element of w, as in [15], being instead obtained as a non-linear function of the entire original template. The domain of z is still quantized as a function of u, but in a convoluted manner.
We try to retrieve u from z = [g(Au) − s]_{2π}, which implies:

$$ [g(Au)]_{2\pi} = [z + s]_{2\pi}. $$

Given that the co-domain of g(·) is limited to [−2π, +2π), g(Au) can be recovered from [z + s]_{2π} only up to a vector ξ ∈ {0, 1}^L, which represents the information lost by the modulo operation. By inverting g(·) and A, u can then be expressed as a function of z, A, s, and ξ. Considering now two AD sets {z_1, A_1} and {z_2, A_2}, generated respectively by the inputs {w_1, k_1} and {w_2, k_2}, and assuming the same biometric trait w_1 = w_2 = w, the two resulting expressions of u must coincide, which leads to Eq. (17). Eq. (17) represents a system of non-linear equations whose unknowns are the coefficients of s_1 and ξ_1, and whose solution would allow demonstrating that z_1 and z_2 are linked to the same identity. We claim that there is no algorithm that can solve this problem in polynomial time. Let us recast the problem as a minimization problem. We rely on a stochastic optimization algorithm, e.g., a genetic algorithm (GA), whose objective is to find the pair (s, ξ) that minimizes the fitness function t(s, ξ) defined in Eq. (18). The mixture of modulo, rotation/reflection, and non-linear operators makes the system of equations strongly non-smooth. The hardness of the resulting optimization problem can be assessed through the fitness-distance correlation [25]:

$$ r = \frac{\frac{1}{n} \sum_{i=1}^{n} (t_i - \bar{t})(d_i - \bar{d})}{\sigma_t \, \sigma_d}, $$

where t_i and d_i are the fitness and distance values of the i-th of n sampled candidate solutions, and t̄, d̄, σ_t, and σ_d are respectively the means and standard deviations of the fitness function t and the distance to the optimum d. Such correlation reaches the value r = 1 in case the global optimum is found during the optimization process [25]. The distance d between a candidate solution and the optimum is measured through the norm operator || · ||. Figure 6 shows an estimation of the average fitness-distance correlation r, obtained with a Monte-Carlo simulation for different γ values, with B = 1, L = 24 (see footnote 4), and orthonormal matrices randomly generated as described in [26]. It is evident that r rapidly decreases for increasing γ, and approximately reaches r = 0 when γ ≥ 0.3, suggesting that querying the fitness function in Eq. (18) would not give any useful feedback to find the solution of Eq. (17).
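The fitness-distance correlation used above is a plain sample correlation between two lists of values, and can be computed as in the following sketch. The function name is hypothetical, and the fitness and distance samples are assumed to come from whatever optimizer and distance definition the analyst adopts; the paper's own fitness function in Eq. (18) is not reproduced here.

```python
import math

def fitness_distance_correlation(fitness, distance):
    """Sample correlation between fitness values and distances to the optimum."""
    n = len(fitness)
    t_mean = sum(fitness) / n
    d_mean = sum(distance) / n
    # Covariance and standard deviations use the same 1/n normalization.
    cov = sum((t - t_mean) * (d - d_mean) for t, d in zip(fitness, distance)) / n
    sigma_t = math.sqrt(sum((t - t_mean) ** 2 for t in fitness) / n)
    sigma_d = math.sqrt(sum((d - d_mean) ** 2 for d in distance) / n)
    return cov / (sigma_t * sigma_d)
```

A fitness that grows linearly with the distance gives r close to 1, which is the favorable case for hill-climbing, while a fitness unrelated to the distance gives r close to 0, which is the regime reported for large γ.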
To get more insights into the complexity of the optimization problem, we can inspect the scatter plots of the fitness as a function of the considered distance. Examples of such plots are reported in Figures 7 and 8, for systems using γ = 0 and γ = 1, respectively. For illustrative purposes, these figures refer to a simple scenario with a simulated biometric template with L = 8 coefficients and B = 1 bit embedded into each element of x. For γ = 0, the significant correlation between distance and fitness suggests that a hill-climbing-based algorithm can solve the optimization problem. If γ = 1, there is instead no correlation between the two values.

Footnote 4: Simulations with larger templates were computationally infeasible with the available computing node: 2 Xeon 16-core 2.3 GHz processors, 8 × 16 GB RAM, 4 NVIDIA Tesla V100 32 GB.
The analysis based on the fitness-distance correlation suggests that solving the problem in Eq.
The entropy of S is the key length K, while the equivocation of Ξ given S and AD is given by the uncertainty of g(Au) given [g(Au)]_{2π} = [z + s]_{2π}; hence, H(Ξ | S, AD) = H(X | X_{2π}).
The equivocation of each coefficient X is expressed in terms of P(x) = 2π rc_γ^{2π}(x), and its expected value is obtained by averaging over the distribution of X. Interestingly, by solving the integral numerically, it turns out that the average equivocation grows roughly linearly with γ, as shown in Fig. 9. It is worth mentioning that such equivocation is zero with γ = 0. In fact, with reference to Figure 10, when γ = 0 no information is lost after the modulo operator, since x ∈ [−π, π), and, in this case, the modulo operator is a bijective function. Summarizing, solving the linkability problem in Eq. (17) is equivalent to randomly guessing approximately K + L × H(X | X_{2π}) bits.
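The growth of the average equivocation with γ can be explored numerically with the sketch below. It is an illustrative derivation, not the paper's own expressions: it assumes the standard raised-cosine density, and it models the ambiguity left by the modulo operation as a binary choice between the wrapped value and its 2π-shifted alias, with the probability of the non-shifted candidate taken as 2π times the density. All names are hypothetical.

```python
import math

def raised_cosine_pdf(x, gamma):
    """Raised-cosine density with parameter gamma in [0, 1] (ASSUMED standard form)."""
    if gamma == 0.0:
        return 1.0 / (2.0 * math.pi) if -math.pi <= x < math.pi else 0.0
    flat_edge = math.pi * (1.0 - gamma)
    outer_edge = math.pi * (1.0 + gamma)
    ax = abs(x)
    if ax <= flat_edge:
        return 1.0 / (2.0 * math.pi)
    if ax < outer_edge:
        return (1.0 + math.cos((ax - flat_edge) / (2.0 * gamma))) / (4.0 * math.pi)
    return 0.0

def binary_entropy(p):
    """Entropy in bits of a binary choice with probability p (clamped to [0, 1])."""
    p = min(1.0, max(0.0, p))
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def average_equivocation(gamma, steps=20000):
    """Mean of the per-value equivocation over the (uniform) wrapped coefficient."""
    total = 0.0
    for i in range(steps):
        y = -math.pi + 2.0 * math.pi * (i + 0.5) / steps   # midpoint rule on [-pi, pi)
        posterior = 2.0 * math.pi * raised_cosine_pdf(y, gamma)
        total += binary_entropy(posterior)
    return total / steps

def linkage_guess_bits(key_bits, length, gamma):
    """K + L * H(X | X_2pi): bits to guess according to the summary above."""
    return key_bits + length * average_equivocation(gamma)
```

Under these assumptions the equivocation vanishes at γ = 0 and doubles when γ goes from 0.5 to 1, in line with the roughly linear trend described above.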

B. Template Estimation-Based Attack
Other linkability attacks can be attempted by computing the best estimates of the biometric representations that generated different AD. From Eq. (16), given the AD, namely {z, A}, the original template u can be estimated as the expected value E[·] of u over all (s, ξ) pairs, as in Eq. (26). Therefore, given two sets AD_1 = {z_1, A_1} and AD_2 = {z_2, A_2}, an attacker can first estimate the corresponding representations û_1 and û_2 and then compute their similarity through a linkage function l = L(û_1, û_2). The effectiveness of such a linkage function can be assessed using metrics specifically designed to evaluate template unlinkability, such as those proposed in [27] or [28]. In more detail, we here consider the linkability measure D_↔^sys defined in [27] as:

$$ D_{\leftrightarrow}^{sys} = \int p(l \mid H_m) \, D_{\leftrightarrow}(l) \, dl, $$

where

$$ D_{\leftrightarrow}(l) = \begin{cases} 0 & \text{if } \omega \, LR(l) \le 1, \\ 2 \, \dfrac{\omega \, LR(l)}{1 + \omega \, LR(l)} - 1 & \text{otherwise} \end{cases} $$

is the score-specific linkability,

$$ LR(l) = \frac{p(l \mid H_m)}{p(l \mid H_{nm})} $$

is the likelihood ratio between mated (H_m) and non-mated (H_nm) score distributions, and ω = p(H_m)/p(H_nm) denotes the ratio between the unknown prior probabilities of the mated and non-mated score distributions. The measure D_↔^sys is bound within [0, 1], with D_↔^sys = 1 obtained for fully distinguishable mated and non-mated distributions, therefore corresponding to fully linkable AD. On the other hand, D_↔^sys = 0 is achieved for fully overlapped distributions, meaning that two AD derived from the same biometric trait cannot be linked using the considered linkage function.
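The linkability measure of [27] can be estimated from two sets of linkage scores, as in the sketch below. It is a minimal histogram-based estimator written for illustration: the number of bins, the handling of empty bins, and the function name are assumptions, and a practical evaluation would likely use a more careful density estimate.

```python
def system_linkability(mated, non_mated, bins=50, omega=1.0):
    """Histogram estimate of the system-level linkability from score samples."""
    lo = min(min(mated), min(non_mated))
    hi = max(max(mated), max(non_mated))
    width = (hi - lo) / bins or 1.0          # guard against identical scores

    def histogram(scores):
        counts = [0] * bins
        for s in scores:
            counts[min(int((s - lo) / width), bins - 1)] += 1
        return [c / len(scores) for c in counts]

    p_m, p_nm = histogram(mated), histogram(non_mated)
    d_sys = 0.0
    for m, nm in zip(p_m, p_nm):
        if m == 0.0:
            continue                          # bin carries no mated probability mass
        if nm == 0.0:
            local = 1.0                       # only mated scores fall here: fully linkable
        else:
            ratio = omega * m / nm            # omega * LR for this bin
            local = 2.0 * ratio / (1.0 + ratio) - 1.0 if ratio > 1.0 else 0.0
        d_sys += m * local                    # weight by the mated probability mass
    return d_sys
```

Identical mated and non-mated score samples give 0, while completely separated samples give 1, matching the two limiting cases discussed above.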
As shown in [15] and mentioned in Section II, high values of γ in Eq. (4) make the estimate û in Eq. (26) arbitrarily unreliable, and therefore the described linkability attack ineffective. The behavior of the linkability measure D_↔^sys as a function of γ, obtained using the Euclidean distance as linkage function, is shown in Figure 11. As can be seen, D_↔^sys is roughly equal to 1 for γ = 0, i.e., two AD generated from the same trait are fully linkable. As γ grows, two AD related to the same identity become as unlinkable as templates obtained from distinct users. Examples of mated and non-mated distributions, together with the corresponding linkability measures, are shown in Figures 12 and 13, for γ = 0 and γ = 1 respectively.

V. DEALING WITH NON-IDEAL DATA
The system described in Section III and the analysis of its effectiveness reported in Section IV refer to the ideal assumption of biometric templates with i.i.d. coefficients. The i.i.d. hypothesis is commonly assumed for the security assessment of most biometric cryptosystems [19], [21], as well as in the analysis of the requirements ensuring the zero-leakage conditions [14], [20]. However, biometric representations with i.i.d. coefficients are hardly encountered in real life. In more detail, templates usually employed in biometric recognition systems include strongly correlated elements. Moreover, the distributions of the coefficients can be significantly different, with some features characterized by much greater discriminative capabilities than others. Therefore, the design of a protection mechanism applicable to real biometric data needs to take into account many issues not addressed when considering ideal conditions, in order to avoid significant security losses, which become increasingly severe the more the biometric data deviate from ideal assumptions [29].
In order to approximate the i.i.d. condition, whitening methods, such as principal component analysis (PCA) or independent component analysis (ICA), could be employed. These methods have the side effect of generating representations with coefficients having uneven SNRs, i.e., most of the meaningful information is concentrated in a few components [15]. Dealing with data having the aforementioned characteristics has a major impact on the selection of the orthonormal matrices employed in the proposed cryptosystem.
In order to gain a deeper understanding, we consider a toy model where the template u is made of two coefficients, i.e., u_1 and u_2, with Gaussian distributions and identity covariance matrix. Let us assume that both coefficients are affected by additive Gaussian noise, having respectively variance σ_1^2 and σ_2^2, with σ_1^2 + σ_2^2 = 1. The Shannon capacity of the system is given by:

$$ C = \frac{1}{2} \log_2\left( 1 + \frac{1}{\sigma_1^2} \right) + \frac{1}{2} \log_2\left( 1 + \frac{1}{\sigma_2^2} \right). $$

Such capacity tends to infinity as |σ_1^2 − σ_2^2| → 1, that is, when one of the two coefficients is noiseless. On the other hand, the overall capacity is minimum when the noise is evenly distributed between the two coefficients, as depicted in Figure 14. Therefore, in order to guarantee high capacity, it is desirable to describe the coefficients in a vector basis that concentrates the noise in a few coefficients, leaving the remaining ones noiseless, which is what PCA tries to achieve. Unfortunately, the application of a random orthonormal matrix to a given representation tends to distribute the noise more evenly across the coefficients. In fact, considering a generic transformation:

$$ A = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}, $$

and denoting with σ'_1^2 and σ'_2^2 the noise variances of the transformed coefficients, we have

$$ \sigma_1'^2 - \sigma_2'^2 = (2\cos^2\phi - 1)\,(\sigma_1^2 - \sigma_2^2), $$

where |2 cos^2 φ − 1| < 1, meaning that the operating point on Figure 14 moves left and the overall capacity decreases. The capacity loss becomes more prominent the more the original coefficients have uneven SNRs. In the extreme case, even a noiseless element would be mapped into a noisy term, leading to an infinite capacity loss. On the other hand, no capacity is lost when combining features having the same SNR.

Fig. 14. Overall capacity of a two-coefficient template vs SNR balance.
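The two-coefficient toy model lends itself to a quick numerical check. The sketch below assumes unit-variance coefficients, independent noise components before the rotation, and a capacity evaluated coefficient by coefficient from the marginal noise variances; these are modelling assumptions made for illustration, and the function names are hypothetical.

```python
import math

def two_coefficient_capacity(noise_var_1, noise_var_2):
    """Sum of per-coefficient Gaussian capacities, in bits, for unit-variance signals."""
    return 0.5 * math.log2(1.0 + 1.0 / noise_var_1) + 0.5 * math.log2(1.0 + 1.0 / noise_var_2)

def rotate_noise(noise_var_1, noise_var_2, phi):
    """Marginal noise variances after rotating the two coefficients by the angle phi."""
    c2 = math.cos(phi) ** 2
    return (c2 * noise_var_1 + (1.0 - c2) * noise_var_2,
            (1.0 - c2) * noise_var_1 + c2 * noise_var_2)

def capacity_after_rotation(noise_var_1, noise_var_2, phi):
    """Capacity of the toy model once the rotation has mixed the noise."""
    return two_coefficient_capacity(*rotate_noise(noise_var_1, noise_var_2, phi))
```

For example, starting from noise variances (0.1, 0.9), a rotation by π/4 balances them to (0.5, 0.5) and lowers the capacity from about 2.27 bits to about 1.58 bits, whereas rotating an already balanced pair changes nothing.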
Given these observations, specific care needs to be taken when dealing with real biometric traits, in order to generate templates with features having high SNR, and thus preserve the original capacity. Specifically, each template coefficient should be combined only with features having similar SNRs. This goal can be achieved by rearranging the vector u so that it is sorted with respect to the SNR, and designing the matrix A as a banded matrix. The bandwidth Q ∈ {1, 2, . . . , L} of the matrix A controls the trade-off between capacity and unlinkability. Note that in the extreme case of Q = 1, the orthonormal matrix is diagonal and the proposed approach collapses to the original one described in [15], providing no unlinkability at all. This approach can be implemented by initializing A as a diagonal matrix R whose elements are randomly chosen in {−1, +1}, i.e., a random reflection matrix. Then we iteratively rotate randomly chosen pairs of coefficients (i, j), such that |i − j| < Q, by a random angle θ_ij ∈ (0, π/20). This can be formalized by the use of Givens rotation matrices [30]. Specifically, from a set of matrix operators G_ij, each performing a rotation on the (i, j) plane, with G_ij(k, k) = 1 for k ≠ i, j, G_ij(i, i) = G_ij(j, j) = c = cos θ_ij, G_ij(j, i) = −G_ij(i, j) = s = sin θ_ij, and all remaining entries equal to zero, we design the orthonormal matrix A as:

$$ A = \Big( \prod_{(i,j) \in S} G_{ij} \Big) R, \qquad (33) $$

with S a sequence of pairs drawn from {(i, j) : |i − j| < Q}. A visual example of a matrix A obtained as in Eq. (33) is shown in Figure 15. It can be noticed that such a matrix is not banded in a strict sense, because the many consecutive rotations may produce combinations of coefficients for which |i − j| ≥ Q. Nevertheless, since the weights of the combinations decrease with |i − j|, the resulting matrix can be regarded as banded in a loose sense.
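The construction of the record-specific matrix can be sketched as follows. The number of rotations, the way pairs are sampled, and the function names are assumptions made for illustration, since the text does not fix them; the rotations are applied by left-multiplication, following the element-wise definition of G_ij given above.

```python
import math
import random

def banded_orthonormal_matrix(length, bandwidth, num_rotations, seed=0):
    """Random reflection followed by Givens rotations on pairs with |i - j| < Q."""
    rng = random.Random(seed)
    # Start from a random reflection matrix R (diagonal entries in {-1, +1}).
    A = [[0.0] * length for _ in range(length)]
    for k in range(length):
        A[k][k] = rng.choice((-1.0, 1.0))
    if bandwidth < 2 or length < 2:
        return A  # Q = 1: no admissible pair, so A stays diagonal
    for _ in range(num_rotations):
        i = rng.randrange(length - 1)
        j = min(length - 1, i + rng.randrange(1, bandwidth))   # 0 < j - i < Q
        theta = rng.uniform(0.0, math.pi / 20.0)
        c, s = math.cos(theta), math.sin(theta)
        row_i, row_j = A[i], A[j]
        # Left-multiply by G_ij: G(i,i) = G(j,j) = c, G(j,i) = s, G(i,j) = -s.
        A[i] = [c * a - s * b for a, b in zip(row_i, row_j)]
        A[j] = [s * a + c * b for a, b in zip(row_i, row_j)]
    return A

def orthonormality_error(A):
    """Largest absolute entry of A A^T - I."""
    n = len(A)
    return max(abs(sum(A[i][k] * A[j][k] for k in range(n)) - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
```

Because every step is an exact rotation applied to an orthonormal matrix, the result remains orthonormal up to rounding, and with Q = 1 the function returns the bare reflection matrix, which corresponds to the linkable scheme of [15].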
The effects of the Q parameter on the properties of the resulting templates can be illustrated through an example relying on a synthetic template w made of L = 24 coefficients, whose capacities are not evenly distributed as shown in Figure 16, mimicking the behavior of biometric data whitened using, for example, a PCA or ICA method. The overall capacity of the template x generated through the proposed approach is shown in Figure 17. As can be seen, the number of bits that can be embedded in x rapidly decreases for increasing values of Q, until a plateau is reached when the effective bandwidth of the rotation matrix A is saturated. Template unlinkability with respect to the employed parameter Q is evaluated by considering the attacks described in Section IV. The fitness-distance correlation r is shown in Figure 18, considering synthetic templates made of L = 24 coefficients and one bit embedded into each coefficient. The behavior of the linkability measure D_↔^sys is instead shown in Figure 19, for different values of γ. As expected, templates generated from the same original representations become less linkable as both parameters Q and γ increase.

VI. EXPERIMENTAL VALIDATION ON REAL BIOMETRIC DATA
The proposed zero-leakage unlinkable cryptosystem, depicted in Figure 20, is tested using real biometric data. Specifically, in Section VI-A we introduce the biometric database exploited in the tests. The feature extraction approach is described in Section VI-B. The preprocessing applied to the extracted features to generate a template w with independent coefficients, and the estimation of the point-wise function f(·), are given in Section VI-C. Details about both the employed ECC and the QIM are given in Section VI-D. The obtained results are finally discussed in Section VI-E.

A. Finger Vein Biometrics
Without any loss of generality, in our experiments we have considered finger-vein biometrics, and specifically the SDUMLA database [31], containing images of the index, middle, and ring fingers captured from the left and right hands of 106 subjects. Six gray-scale samples of 320 × 240 pixels are available for each finger.
Assuming an open-set scenario, the employed database has been split into two equal-size subsets, with data from 53 subjects employed for training, and samples from the remaining ones for testing. The employed feature extractor has been trained using each available finger as an independent class. Then, the template used in the experimental tests is obtained by concatenating the features obtained from the three fingers of a user's hand, in order to handle identifiers with a larger number of coefficients. Therefore, the considered testing dataset comprises 6 samples for each of 53 × 2 classes.

B. Feature Extraction
A fixed-length feature vector, with the desirable discriminative capabilities described in Section II, is obtained by using the approach proposed in [32], where representations of vein patterns suitable for verification systems have been obtained using deep learning techniques.
In more detail, a Densenet-161 [33] convolutional neural network (CNN), modified with the addition of a custom embedder layer producing 2048 features in the final output layer, has been trained using an additive angular margin penalty (AAMP) [34] as loss function. Such an approach allows training the employed network in a standard modality for classification purposes, while achieving the additional goal of generating representations having the largest possible inter-class variance, as well as the smallest possible intra-class variance.
The employed network has been trained by initializing Densenet-161 with weights pre-trained on the ImageNet dataset for object recognition purposes, while a Glorot uniform distribution has been used to initialize the fully-connected layers of the custom embedder. Stochastic gradient descent (SGD) with a batch size of 64, a learning rate of 0.01 divided by 10 after every 30 epochs, a momentum of 0.9, and a maximum number of 120 epochs have been considered during training. As for the hyper-parameters of the employed AAMP loss function, the penalty margin has been selected in the range m ∈ [0.3, 0.7], with a step size of 0.05, as the one providing the best results, while the associated scale parameter has been selected in the range s ∈ [16,96], with a step size of 16.
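The training schedule and the hyper-parameter grid described above can be written down compactly. This is only a sketch: the zero-based epoch indexing is an assumption, and the text does not state whether m and s were tuned jointly or separately, so the Cartesian product below is just one possible reading.

```python
def learning_rate(epoch, base_lr=0.01, drop_every=30, factor=10.0):
    """Step decay: divide the learning rate by `factor` every `drop_every` epochs."""
    return base_lr / (factor ** (epoch // drop_every))


# Candidate AAMP margins: 0.30, 0.35, ..., 0.70 (step 0.05).
margins = [round(0.30 + 0.05 * k, 2) for k in range(9)]
# Candidate AAMP scales: 16, 32, ..., 96 (step 16).
scales = list(range(16, 97, 16))
# Joint grid, assuming every (m, s) pair is evaluated.
grid = [(m, s) for m in margins for s in scales]
```

Under these assumptions, the 120 epochs are covered by four learning-rate stages (0.01, 0.001, 0.0001, and 0.00001), and the joint grid contains 9 × 6 = 54 configurations.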
In [32], an equal error rate (EER) of 0.02% on SDUMLA, using identifiers derived from a single finger, has been reported. The concatenation of the features associated with three fingers allows further improving the performance of an unprotected system, with the FRR and the false acceptance rate (FAR) reported in Figure 21, when using the Euclidean distance to compare the considered identifiers. Given the size of the employed database, the obtained FRR and FAR curves do not intersect each other; it is therefore only possible to report that EER < 0.06%, the lowest measured FRR, for an unprotected system.
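As an illustration of how FAR and FRR curves such as those in Figure 21 are typically obtained, the sketch below computes both rates from lists of genuine and impostor Euclidean distances at a given threshold. The function names and the toy distances are hypothetical; the comparison protocol actually used in the experiments is not detailed in this section.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def far_frr(genuine, impostor, threshold):
    """A comparison is accepted when its distance is <= threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return far, frr


# Toy distances (illustrative values only).
genuine = [0.2, 0.3, 0.4, 0.9]
impostor = [0.8, 1.1, 1.3, 1.5]
far, frr = far_frr(genuine, impostor, threshold=0.85)
```

With a finite number of comparisons, the smallest non-zero measurable rate is one over the number of comparisons. For instance, if all 15 pairs among the 6 samples of each of the 106 test classes were compared, there would be 1590 genuine comparisons and a minimum non-zero FRR of 1/1590 ≈ 0.063%, which would be consistent with the reported bound; this protocol is an assumption.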

C. Preprocessing
As remarked in Section V, in order to enforce the security requirement, the employed biometric representations should have independent features. Unfortunately, the features extracted through a CNN are not independent and therefore they should be further processed to generate an appropriate representation, namely w in our approach, as input of the proposed protection scheme.
To this end, we have here exploited the Reconstruction Independent Component Analysis (RICA) [35], differently from [15], where the PCA has been employed. RICA is an unsupervised feature learning approach, which possesses some advantages with respect to ICA. In fact, ICA requires a whitening stage, commonly performed through PCA, which makes its application difficult when dealing with high-dimensional input data and limited training sets. These conditions apply to the considered scenario, since the CNN described in Section VI-B extracts 2048-long templates, while the available training set only comprises 53 × 6 unique fingers, 6 instances each, i.e., 1908 samples. Therefore, the total number of samples is slightly smaller than the dimension of the input space. Furthermore, since the samples of each user are strongly correlated, the number of reliable components that a PCA can learn is limited by the number of available classes. Therefore, while the use of classical PCA or ICA is inappropriate in the considered framework, RICA can instead be effectively applied in such over-complete scenarios, since it does not need an initial whitening stage.
We set the RICA algorithm to extract 128 features from the original 2048 coefficients. As mentioned in Section VI-A, the proposed cryptosystem is tested on biometric representations obtained by combining the features extracted from three fingers of a user's hand, thus obtaining a template w with L = 384 coefficients.
During the training stage, the templates obtained by applying RICA to the representations generated through the employed CNN are also examined in order to estimate the PDFs of each feature, required to define the functions f (·) introduced in Eq. (8). Since the treated coefficients can be assumed independent, the distribution estimates can be easily computed through the marginal variables. For each feature, seven different distributions are fit to the available data. The function f (·) associated with each coefficient is chosen by selecting the best fitting distribution by means of the Anderson-Darling test [36].
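The selection step can be sketched as follows. The three candidate families below (normal, Laplace, logistic) and their simple parameter estimates are placeholders chosen for illustration, since the seven distributions actually considered are not listed here. Ranking the candidates by the raw Anderson-Darling statistic A² is likewise a simplifying assumption rather than the exact test procedure of [36].

```python
import math
import random


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def laplace_cdf(x, mu, b):
    z = (x - mu) / b
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def logistic_cdf(x, mu, scale):
    return 1.0 / (1.0 + math.exp(-(x - mu) / scale))


def anderson_darling(samples, cdf):
    """A^2 = -n - (1/n) * sum_i (2i-1) [ln F(x_i) + ln(1 - F(x_{n+1-i}))]."""
    xs = sorted(samples)
    n = len(xs)
    eps = 1e-12  # keeps the logarithms finite in the tails
    u = [min(max(cdf(x), eps), 1.0 - eps) for x in xs]
    total = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
                for i in range(n))
    return -n - total / n


def best_fit(samples):
    """Return the candidate with the smallest A^2, plus all scores."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / n)
    med = sorted(samples)[n // 2]
    b = sum(abs(x - med) for x in samples) / n
    candidates = {
        "normal": lambda x: normal_cdf(x, mu, sigma),
        "laplace": lambda x: laplace_cdf(x, med, b),
        "logistic": lambda x: logistic_cdf(x, mu, sigma * math.sqrt(3.0) / math.pi),
    }
    scores = {name: anderson_darling(samples, cdf)
              for name, cdf in candidates.items()}
    return min(scores, key=scores.get), scores
```

In the described pipeline this selection would be repeated independently for each of the L coefficients, which is possible because the coefficients are treated as independent and only marginal distributions are needed.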
In summary, the proposed preprocessing allows creating representations x, after having set γ , with i.i.d. coefficients. These templates are used as input of the proposed QIM-based protection scheme.

D. System Configuration
The implementation of the proposed biometric cryptosystem needs the design of the function g(·), the required ECC, and the allocation of the bits within the template coefficients.
As outlined in Sections II and III, the function g(·) can be specified by selecting the roll-off parameter γ of the employed raised-cosine distribution. As discussed in Sections IV and V, the unlinkability of the proposed system improves for increasing values of γ, just like the irreversibility shown in Figure 2, with no significant improvements for γ > 0.7. Since the overall capacity is instead negatively affected by high γ values, in the performed tests we have opted to select a g(·) function with γ = 0.7 for all the coefficients. The N encoded bits c are distributed among the coefficients of x as a function of their capacity, according to an allocation rule in which ⌊·⌋ denotes the floor function, K the size of the secret key, C the capacity of the considered coefficient, and α a parameter, identical for all the L features, chosen in such a way that the bits assigned to all the coefficients sum to the size N of the encoded secret key. As for the ECC, we have used turbo codes with N/K = 7. Specifically, we have used codes specified in the Long Term Evolution (LTE) standard. Turbo codes are particularly powerful because they can rely on log-likelihood-based receivers to perform soft decoding, so as to approach Shannon's capacity, thus minimizing the FRR.
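The following sketch shows one possible way to realize a capacity-driven allocation of this kind. It is not the paper's allocation formula, which is not reproduced here: the rule ⌊α·C⌋ per coefficient, the bisection search for α, and the tie-breaking step are all assumptions made for illustration.

```python
import math


def allocate_bits(capacities, n_bits):
    """Assign floor(alpha * C) bits per coefficient so that they sum to n_bits."""
    if sum(capacities) <= 0:
        raise ValueError("at least one coefficient must have positive capacity")

    def total(alpha):
        return sum(math.floor(alpha * c) for c in capacities)

    # Bracket the largest alpha whose allocation does not exceed n_bits.
    lo, hi = 0.0, 1.0
    while total(hi) <= n_bits:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if total(mid) <= n_bits:
            lo = mid
        else:
            hi = mid
    bits = [math.floor(lo * c) for c in capacities]
    # Assumed tie-break: if several coefficients gain a bit at the same alpha,
    # hand out the missing bits by largest fractional part of alpha * C.
    shortfall = n_bits - sum(bits)
    by_fraction = sorted(range(len(capacities)),
                         key=lambda i: lo * capacities[i] - bits[i],
                         reverse=True)
    for i in by_fraction[:shortfall]:
        bits[i] += 1
    return bits
```

For example, with K = 128 and N/K = 7, the call would be `allocate_bits(capacities, 896)`, where `capacities` holds the per-coefficient capacity estimates. Coefficients with higher capacity receive more bits, while coefficients with negligible capacity receive none.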
As mentioned in Section II, the possibility of using likelihood-based decoders is the main advantage of real-valued auxiliary-data schemes over binary ones. In fact, the hard quantization needed in classical schemes, such as in the fuzzy commitment, leads to huge information losses with a significant impact on the probability of correct recognition.

E. Obtained Results
The overall capacity C tot of the representations obtained following the described approach is reported in Figure 22, where the influence of the bandwidth of the orthonormal matrix A on the system characteristics is shown. As already outlined in Section V, an increase in the values of Q and γ negatively affects the attainable capacity. Choosing a value γ = 0.7 for the employed raised cosine distribution allows achieving capacities around 250 bits for Q ≤ 16, while secret keys with 128 bits can be obtained also for large values of Q.
The recognition performance achievable by applying the proposed protection method to finger-vein traits, as a case study and without any loss of generality, is reported in Figure 23. While the system is inherently designed to work at FAR = 2^−K, the achievable FRR depends both on the length of the employed secret key and on the value Q adopted for the orthonormal matrix A. A system with Q = 1, as in [15], achieves the best possible recognition performance, with FRR = 0.1% for K = 80, yet it is not able to provide any unlinkability. On the other hand, the approach proposed here guarantees unlinkability with an FRR lower than 5% when using secret keys with K = 128 bits. This makes the considered system secure against any brute-force attack carried out with currently conceivable technology. Embedding keys with 256 bits would imply an FRR of about 10% for Q ≤ 16, while worse recognition performance is achieved when using orthonormal matrices with larger bandwidths, due to the associated reduction in the available capacity. To the best of our knowledge, no other biometric cryptosystems able to embed secret keys with lengths in the order of hundreds of bits, and able to properly meet, at a satisfactory recognition rate, both the required security and renewability constraints, have been proposed in the literature.
It is worth remarking that conditions at FAR = 2^−K are achievable in the proposed system under the assumption of i.i.d. representations x. However, since both the RICA projections and the PDFs of each feature in w are estimated over a training dataset, and then applied to a different one, realistic applications of the proposed method would result in non-ideal characteristics for the biometric templates x adopted in testing conditions. This is due to the inaccurate estimates of the distributions of the involved coefficients, and it is typically more significant for smaller training datasets. The influence of such discrepancies on the achievable security has been analyzed by evaluating the security bound K · DoF/N described in [29], with DoF representing the degrees of freedom of the best binomial distribution fitting the inter-class Hamming distance distribution obtained when comparing the binarized templates x of a subject with those associated with possible impostors. In case of independent coefficients, the estimated DoF would correspond to N, with the achievable security therefore corresponding to the length of the secret keys employed in the proposed scheme, that is, K · DoF/N ≈ K. Figure 24 shows the results obtained when considering Q = 1 in the employed rotation matrix (A = I), in order to avoid any other source of fluctuation in addition to the inter-class variability. For the training dataset, a behavior close to the ideal one (K · DoF/N ≈ K) is achieved. However, the mismatch between the distributions characterizing the training and the testing datasets causes a performance worsening on the testing dataset. Nevertheless, the consequent amount of degradation is limited, thus allowing high values of K · DoF/N to be reached.
Fig. 24. Effective security as in [29].
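The security bound discussed above can be illustrated numerically. The sketch below estimates the degrees of freedom with the common moment-based binomial fit DoF = p(1 − p)/σ², where p and σ² are the mean and variance of the normalized inter-class Hamming distances. Whether [29] uses exactly this estimator is an assumption, and the random bit strings merely stand in for binarized impostor templates.

```python
import random


def degrees_of_freedom(normalized_distances):
    """Moment-based binomial fit: DoF = p (1 - p) / variance."""
    n = len(normalized_distances)
    p = sum(normalized_distances) / n
    var = sum((d - p) ** 2 for d in normalized_distances) / (n - 1)
    return p * (1.0 - p) / var


def effective_security(key_bits, dof, n_bits):
    """Security bound K * DoF / N, in bits."""
    return key_bits * dof / n_bits


# Ideal case: N independent, unbiased bits per template (synthetic data).
rng = random.Random(0)
N = 256
distances = [bin(rng.getrandbits(N) ^ rng.getrandbits(N)).count("1") / N
             for _ in range(5000)]
dof = degrees_of_freedom(distances)
```

For independent bits the estimate stays close to N, so the bound is close to K. Correlated coefficients inflate the variance of the Hamming distances, lower the DoF, and reduce the effective security accordingly.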
Therefore, the analysis carried out confirms that the proposed protection scheme, which applies transformations to the original templates to make the coefficients of the representations w independent, is effective in guaranteeing high levels of security in practical scenarios.
Finally, Figure 25 reports the linkability measure D sys ↔ computed on real data, for different values of K, when considering Q = 16. The obtained results show that the evaluated D sys ↔ is weakly correlated with the length K of the employed secret key, yet with a decreasing slope. In addition, as already shown in Figures 11 and 19 for ideal and synthetic data, our analysis shows that very low linkability rates can be achieved also in real-world conditions.

VII. CONCLUSION
In this paper, the biometric cryptosystem proposed by the authors in [15] has been improved in order to make it immune to linkability attacks. In contrast with other methods proposed in the literature, unlinkability is here achieved using parameters that can be considered as public, with no requirements for their secret storage.
The effectiveness of the proposed solution is tested against two different kinds of attacks. The first one, based on stochastic optimization, is shown to be infeasible due to the computational complexity required to solve a system of non-linear equations. The second one relies on estimates of the employed biometric representations, and it is evaluated using quantitative measures, showing that templates stemming from the same identity cannot be linked if the parameters of the proposed scheme are properly selected.
In addition, real-world scenario data have been considered as a case study, and guidelines to properly design the components of the proposed scheme are given. The proposed cryptosystem has been applied to biometric templates derived from finger-vein patterns, using CNNs to generate the employed representations, and processing the obtained data to achieve feature independence, irreversibility, and unlinkability. The performed tests have shown that it is possible to perform protected biometric recognition while guaranteeing user-friendly recognition performance in terms of FRR at FAR ≈ 0, and a level of security comparable with the one achieved in current cryptographic protocols relying on keys with at least 128 bits.