A Hybrid Information Reconciliation Method for Physical Layer Key Generation

Physical layer key generation (PKG) has become a research focus as it solves the key distribution problem, which is difficult in traditional cryptographic mechanisms. Information reconciliation is a critical process in PKG to obtain symmetric keys. Various reconciliation schemes have been proposed, including the error detection protocol-based approach (EDPA) and error correction code-based approach (ECCA). Both EDPA and ECCA have advantages and drawbacks, regarding information leakage, interaction delay, and computation complexity. In this paper, we choose the BBBSS protocol from EDPA and BCH code from ECCA as a case study, analyzing their comprehensive efficiency performance versus pass number and bit disagreement ratio (BDR), respectively. Next, we integrate the strength of the two to design a new hybrid information reconciliation protocol (HIRP). The design of HIRP consists of three main phases, i.e., training, table lookup, and testing. To comprehensively evaluate the reconciliation schemes, we propose a novel efficiency metric to achieve a balance of corrected bits, information leakage, time delay, and computation time, which represents the effectively corrected bits per unit time. The simulation results show that our proposed method outperforms other reconciliation schemes to improve the comprehensive reconciliation efficiency. The average improvement in efficiency is 2.48 and 22.36 times over the BBBSS and BCH code, respectively, when the range of the BDR is from 0.5% to 11.5%. Compared to the BBBSS protocol and the BCH code, HIRP lies at a mid-level in terms of information leakage and computation time cost. Besides, with the lowest time delay cost, HIRP reaches the highest reconciliation efficiency.


Introduction
Wireless communication is ubiquitous in our daily life, and it is expected to support extremely high data rates and radically new applications in the foreseeable future. Meanwhile, wireless transmission is vulnerable to eavesdropping attacks due to the broadcast nature of the wireless medium. Therefore, safeguarding data transmission is given the top priority in the development of next-generation wireless networks [1,2]. A paradigmatic problem of securing data transmission is the key distribution. Traditional public key cryptography techniques are widely used in existing communication networks. However, they require a public key infrastructure and are computationally intense, and thus encounter key distribution and management difficulties in the limited-resource mobile networks. Furthermore, with the advent of quantum computers capable of rapidly performing a complex and massive factorization, the traditional cryptography mechanism based on computation complexity is no longer reliable.
In existing work, reconciliation efficiency is the most commonly used evaluation metric, and it is inversely proportional to the leakage bit rate (LBR). However, few works take the interaction delay and computation complexity into account, even though these factors may greatly affect reconciliation efficiency in specific scenarios. For example, heavy interaction is unbearable in long-distance satellite communications, and computation complexity has to be considered in Internet of Things (IoT) networks with limited resources. Furthermore, both EDPA and ECCA have their pros and cons, and there is still a gap regarding how to integrate their strengths to improve the reconciliation efficiency. To address these problems, this paper carries out a comprehensive theoretical study of information reconciliation schemes for efficiently establishing identical secret keys. Our contributions are as follows.
• A comprehensive reconciliation evaluation metric is proposed, taking into consideration the LBR, interaction delay, and computation overhead. The metric represents the number of effectively corrected bits per unit time. The calculation expression of the metric is derived in this paper.
• The characteristics of BBBSS and BCH are analyzed from the perspective of the proposed metric. Combining the advantages of the two, a new hybrid information reconciliation protocol (HIRP) is proposed. The detailed realization steps of HIRP are presented, including training, table lookup, and testing.
• The simulation results verify the theoretical analysis of both BBBSS and BCH. Monte Carlo simulations validate that the proposed HIRP outperforms the other two approaches and provides more efficient information reconciliation in PKG.
The rest of the paper is organized as follows. Section 2 introduces the system model, the secret key generation process, the general information reconciliation model, and the problem studied in this paper. Section 4 provides a comprehensive reconciliation efficiency metric and presents the calculation expression for each factor. Section 5 proposes a new hybrid information reconciliation protocol (HIRP) and designs the realization algorithms. Section 6 presents the simulation results, and Section 7 concludes the paper.

General System Model
We consider a general single-input single-output single-eavesdropper (SISOSE) model. All users are equipped with a single antenna. Alice and Bob are two distinct legitimate users separated by a distance of d meters. The communication system works at a carrier frequency of $f_c$ GHz with a bandwidth of B Hz, and the data transmission rate is then taken as B bits per second (bps). Alice and Bob intend to extract secret keys from their channel characteristics to protect the data transmission. Key generation requires a temporally dynamic channel, and the channel variation can be introduced by the movement of users and/or surrounding objects [19].
Eve is a passive eavesdropper located more than one coherence distance away from both Alice and Bob. According to the definition of coherence distance, the coherence distance at a carrier frequency of 2.4 GHz is 6.25 cm. Therefore, we assume that Eve experiences a fading channel independent of that of Alice and Bob. Nevertheless, Eve knows the entire communication protocol, the pilots, and all the information transmitted over the public channels between Alice and Bob.
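As a quick check of the 6.25 cm figure, the sketch below assumes the common rule of thumb that the coherence distance equals half a wavelength. That rule, the rounded speed of light, and the function name are assumptions made for illustration; the text does not spell out the definition it uses.

```python
SPEED_OF_LIGHT = 3.0e8  # m/s, rounded value (assumption)


def coherence_distance_m(carrier_hz: float) -> float:
    """Half-wavelength rule of thumb for the coherence distance (assumed)."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return wavelength / 2.0


# 2.4 GHz carrier, as in the example above
print(f"{coherence_distance_m(2.4e9) * 100:.2f} cm")
```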
The notations used in this paper, and their definitions, are summarized in Table 1.

Secret Key Generation Process
Considering the $t_i$-th round of secret key generation, $t_i \in \{1, 2, \cdots\}$, Alice and Bob generate secret keys during a period of time T, as shown in Figure 1. The process includes four main steps: channel sounding, quantization, information reconciliation, and privacy amplification. First, Alice and Bob estimate their channel characteristics through channel sounding, i.e., sending pilots to each other. Eve may also estimate her channel to Alice or Bob. For simplicity, Eve's channel refers to that between Eve and Bob in this paper. Denote the channel characteristics estimated during T by user $u \in \{A, B, E\}$ as $H_u$ with length $L_H$, where A, B, and E represent Alice, Bob, and Eve, respectively. Second, user u maps the values of $H_u$ into a bit sequence through quantization, e.g., channel quantization with guardband (CQG) used in [20]. The quantized bit sequence is represented as $Q_u$ with length $L_Q$.
At this point, unavoidable bit disagreements exist between $Q_A$ and $Q_B$, caused by the time delay in TDD systems, hardware differences, and noise [21]. Although preprocessing approaches, e.g., principal component analysis (PCA) [20], can be applied, the bit disagreements are not fully eliminated. However, even a single bit difference in a secret key triggers an avalanche effect, leading to complete decryption failure. To deal with this problem, Alice and Bob correct the bit disagreements of $Q_u$ through information reconciliation, and the corrected bit sequence is denoted by $R_u$ with length $L_R = L_Q$. In total, $L_M$ bits of parity information M are transmitted during the reconciliation process. The dashed line in Figure 1 shows that the communication may be either bidirectional or one-way. Unfortunately, M is also leaked to Eve as she knows all the information transmitted through public channels. According to the leftover hash lemma, $L_M$ bits arbitrarily chosen from $R_u$ are discarded to guarantee key security during the privacy amplification step. For example, when $L_M = 40$ and $L_R = 168$, a simple realization is to map the 168-bit corrected sequence $R_u$ to a 128-bit random sequence $P_u$ through an MD5 hash function. Finally, the key consistency is verified by sending a simple hash value $V_u$ of $P_u$ from one user to the other. When the hash values are identical, the $t_i$-th round of secret key generation is successful. Otherwise, the round fails, and $P_u$ is discarded.
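The privacy amplification and verification steps can be sketched as follows for the 168-bit example. The MD5 mapping follows the text. The bit packing, the helper names, and the use of a truncated SHA-256 digest as the check value $V_u$ are assumptions made for illustration only.

```python
import hashlib
import secrets


def bits_to_bytes(bits):
    """Pack a list of 0/1 integers into bytes, MSB first, zero-padded."""
    padded = list(bits) + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        byte = 0
        for bit in padded[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def privacy_amplify(reconciled_bits):
    """Map the reconciled sequence R_u to a 128-bit key P_u with MD5."""
    return hashlib.md5(bits_to_bytes(reconciled_bits)).digest()


def verification_tag(key):
    """Short check value V_u (assumed: SHA-256 truncated to 4 bytes)."""
    return hashlib.sha256(key).digest()[:4]


r_alice = [secrets.randbelow(2) for _ in range(168)]
r_bob = list(r_alice)  # reconciliation succeeded: identical sequences
p_alice, p_bob = privacy_amplify(r_alice), privacy_amplify(r_bob)
print(len(p_alice) * 8, verification_tag(p_alice) == verification_tag(p_bob))

r_bob[0] ^= 1  # one residual bit error changes the whole key
print(privacy_amplify(r_bob) == p_alice)
```

The second comparison illustrates why a single residual disagreement makes the round fail: the hashed keys no longer match, so the verification value differs and $P_u$ is discarded.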

A General Model for Information Reconciliation
Various approaches, including EDPA and ECCA, are proposed for information reconciliation. In this section, we establish a general model for them. During the information reconciliation step, Alice communicates with Bob over public channels for K passes. All information transmitted over public channels is assumed to be error-free.
An EDPA, such as BBBSS, has $J(k)$, $k = 1, 2, \cdots, K$, rounds of back-and-forth interactions in the k-th pass. In each round of interaction, Alice first sends the parity information to Bob, and Bob then feeds back the information about the error position to Alice. On the other hand, an ECCA, such as BCH codes, generally has only one pass and one round of communication, i.e., K = 1 and J = 1. This is because an error-correcting code suffers from error propagation when the number of errors exceeds its error-correcting capability. Therefore, it is inefficient for ECCA to gradually reduce the number of bit errors through multiple passes or rounds. Besides, ECCA is a one-way communication in which Alice sends a syndrome to Bob but Bob does not provide feedback. Bob uses the syndrome to correct his channel observation through decoding algorithms, e.g., the Viterbi algorithm.
Figure 2 illustrates the information reconciliation process for both EDPA and ECCA. During the k-th pass of communication, Alice and Bob divide $Q_u$ into $N_G(k)$ groups with group length $L_{RG}(k)$. Denote $N_{aGE}$ as the estimated average number of error bits in one group. For EDPA, the group length $L_{RG}(k)$ is designed to guarantee that each group has about one error, i.e., $N_{aGE} = 1$. During the first round of communication, Alice sends the parity of each group to Bob, and Bob feeds back the indexes of the error groups. A group is defined as an error group if the parity information of Alice and Bob is different. Then, for each error group, $J(k) - 1$ rounds of bisect error-correcting are applied to find the position of the error bit. As for ECCA, the group length $L_{RG}$ depends mainly on the affordable decoding complexity of Bob. Within the affordable range, the larger the $L_{RG}$, the more accurate the $N_{aGE}$. Therefore, $L_{RG}$ is usually set to the largest affordable length. Each group may have more than one error in this case.
The ratio of $N_{aGE}$ to $L_{RG}$ is set to match the coarse bit disagreement ratio (BDR) of $Q_u$, which is estimated from the signal-to-noise ratio (SNR). Then, ECCA chooses the error correction code $C(n_c, k_c, t_c)$, where $n_c$, $k_c$, and $t_c$ are the code length, message length, and error-correcting number, respectively. Code $C(n_c, k_c, t_c)$ satisfies that the message bit length $k_c = L_{RG}$ and the error-correcting number $t_c \ge N_{aGE}$. Next, Alice divides $Q_A$ into groups and sends the syndromes of all groups to Bob. According to the syndromes, Bob corrects the inconsistent bits in $Q_B$ using decoding algorithms.

Problem Statement
Information reconciliation approaches are mainly categorized as EDPA or ECCA, and each has its advantages and drawbacks. The downside to EDPA is that it needs multiple passes and multiple rounds of back-and-forth interactions, as shown in Figure 2. When Alice is far away from Bob, it causes a very large interaction delay and communication overhead. Furthermore, the efficiency of EDPA decreases with the increase in pass number. The proof is provided in Section 3. On the plus side, EDPA just uses bisect error-correcting, which consumes less computation and leaves less leakage of information.
Conversely, ECCA only has one pass, one round, and one-way communication, so the interaction delay and communication overhead are significantly reduced. The negative side of ECCA is that it has expensive computation overhead and large information leakage, especially in low SNR scenarios. If $L_{RG}$ is small, the estimate of $N_{aGE}$ is inaccurate, which may lead to error propagation. If $L_{RG}$ is large, the decoding complexity is high. Even worse, the information leakage of ECCA increases rapidly with the rise of $N_{aGE}$. Table 2 summarizes the features of EDPA and ECCA. Since both EDPA and ECCA have their pros and cons, this raises a natural question: "How to comprehensively evaluate the performance of an information reconciliation approach?" Existing work only considers one or two performance indicators, e.g., information leakage, and lacks an evaluation of interaction time, complexity, etc. The subsequent problem is: "Is it possible to integrate the strengths of both EDPA and ECCA to design a new reconciliation approach that makes a trade-off among all these performance indicators?" To address this problem, we first propose a comprehensive metric to evaluate the efficiency of reconciliation approaches. Next, we discuss the performance of EDPA and ECCA, respectively. In this paper, we choose BBBSS of EDPA and the BCH code of ECCA as a case study. Under the guidance of the new metric, we then design a new approach named HIRP to achieve good efficiency.

A Comprehensive Information Reconciliation Evaluation Metric
In this section, we propose a comprehensive reconciliation efficiency metric, taking consideration of corrected bits, information leakage, interaction delay, and computation time.

Information Leakage
Information reconciliation poses a security threat as eavesdroppers can infer keys from the exchanged information. Therefore, the information leakage should be considered when evaluating an information reconciliation scheme.
Definition 1. Denote η as the information leakage ratio, which is defined by
$$\eta = \frac{I(R_A, M)}{L_R},$$
where $R_A$ is the reconciled key with length $L_R$, M is the information disclosed during interaction, and $I(R_A, M)$ is the mutual information between them, which represents the information that eavesdroppers can obtain about the key. To guarantee the security of the final key, at least a proportion η of the reconciled key should be removed in the privacy amplification step.
The information leakage ratio is bounded below by
$$\eta \ge h(\varepsilon),$$
where $h(\varepsilon)$ is the entropy of the BDR ε with
$$h(\varepsilon) = -\varepsilon \log_2 \varepsilon - (1 - \varepsilon) \log_2 (1 - \varepsilon).$$
The lower bound of η represents the minimum amount of interaction information per bit for $Q_A$ and $Q_B$ to obtain identical keys. Since $L_M$ is the length of M, the maximum number of disclosed bits is $L_M$. When M has a linear relationship with $Q_A$, the number of disclosed bits is $L_M$. Otherwise, it is less than $L_M$ due to the increased ambiguity caused by nonlinearity. Therefore, the upper bound of η is $L_M / L_R$. In this paper, we calculate the information leakage ratio through its upper bound, $\eta = L_M / L_R$, for security purposes.
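The two bounds can be evaluated numerically with the short sketch below. It assumes that the lower bound is the binary entropy $h(\varepsilon)$ of the BDR and uses the upper bound $L_M / L_R$ adopted in this paper. The function names are illustrative, and the sample values $L_M = 40$ and $L_R = 168$ are those of the privacy amplification example.

```python
import math


def binary_entropy(eps: float) -> float:
    """h(eps) in bits; assumed to be the lower bound on the leakage ratio."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1.0 - eps) * math.log2(1.0 - eps)


def leakage_upper_bound(l_m: int, l_r: int) -> float:
    """Upper bound eta = L_M / L_R used for security purposes."""
    return l_m / l_r


print(round(leakage_upper_bound(40, 168), 3))  # 0.238
print(binary_entropy(0.05))                     # lower bound at a BDR of 5%
```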

Interaction Delay
The interaction delay represents the time spent on exchanging the information M. It can become significant in EDPA, which has multiround interactions. Denote $T_{delay}$ as the interaction delay, which includes two parts, i.e., the data transmission time and the propagation time. The data transmission time is determined by the $L_M$ transmitted bits and the system bandwidth B, whereas the propagation time rises with the total number of interactions $\sum_{k=1}^{K} J(k)$. In long-distance communications, such as satellite communications, the latter term becomes the dominant factor in the interaction delay. Therefore, the number of information interactions should be lowered to reduce the delay and communication overhead.
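To illustrate how the number of interactions drives the delay, the sketch below uses an assumed two-term form: a transmission term $L_M / B$ plus a per-interaction cost multiplied by $\sum_k J(k)$. The per-interaction cost (one-way propagation d/c plus a fixed overhead) and the function name are hypothetical choices; the default values d = 5 km, B = 4 MHz, and 50 ms overhead are borrowed from the simulation settings.

```python
def interaction_delay(l_m_bits, bandwidth_hz, rounds_per_pass,
                      distance_m=5_000.0, overhead_s=0.05):
    """Transmission time plus an assumed cost per interaction."""
    speed_of_light = 3.0e8                        # m/s, rounded
    transmission = l_m_bits / bandwidth_hz        # data rate taken as B bit/s
    per_interaction = distance_m / speed_of_light + overhead_s
    return transmission + sum(rounds_per_pass) * per_interaction


one_way = interaction_delay(400, 4e6, [1])         # ECCA-like: K = 1, J = 1
multi = interaction_delay(400, 4e6, [8, 9, 10])    # EDPA-like: three passes
print(one_way, multi)
```

With the same amount of exchanged information, the multipass case is dominated by the interaction count rather than by the transmission term.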

Computation Time
In some resource-constrained systems, the performance of error-correcting schemes may be limited since decoding algorithms require multiple iterations. Therefore, computation complexity, which is characterized by the computation time $T_{comp}$, should be taken into account. $T_{comp}$ is given by
$$T_{comp} = t_c N_{eqAdd},$$
where $t_c$ is the time cost of an "equivalent addition" and $N_{eqAdd}$ represents the number of equivalent additions. The required mathematical and logical operations can be viewed as multiples of "equivalent addition" due to current digital signal processor (DSP) specifications in [22]. In Table 3, computation operations are normalized to 5. $T_{comp}$ is determined by the BDR of the initial keys, the group size, and the decoding complexity. The higher the BDR is, the longer the computation time is. Generally, ECCA has a much heavier computation cost than EDPA.

Effective Reconciliation Rate ξ
To achieve a balance among the above factors, we propose a novel comprehensive metric ξ, called the effective reconciliation rate, to evaluate the performance. The definition of ξ is given by
$$\xi = \frac{N_{corr}(1 - \eta)}{T_{delay} + T_{comp}}, \tag{7}$$
where $N_{corr}$ denotes the number of corrected inconsistent bits. In other words, ξ represents the number of effectively corrected bits per unit time, and therefore reflects the efficiency of an information reconciliation approach. ξ is negatively correlated with the information leakage η, the interaction delay $T_{delay}$, and the computation time $T_{comp}$, so reducing any of them improves ξ. The higher the ξ is, the more efficient the information reconciliation approach is.
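The sketch below evaluates ξ under the assumed form $N_{corr}(1 - \eta) / (T_{delay} + T_{comp})$, i.e., the corrected bits discounted by the leaked fraction and divided by the total time spent. The function name and the sample numbers are illustrative and are not taken from the simulations.

```python
def effective_reconciliation_rate(n_corr, eta, t_delay, t_comp):
    """Effectively corrected bits per unit time (assumed form of xi)."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("leakage ratio must lie in [0, 1]")
    if t_delay + t_comp <= 0.0:
        raise ValueError("total time must be positive")
    return n_corr * (1.0 - eta) / (t_delay + t_comp)


# Same corrected bits and leakage, but a shorter interaction delay
slow = effective_reconciliation_rate(10, 0.2, t_delay=0.9, t_comp=0.1)
fast = effective_reconciliation_rate(10, 0.2, t_delay=0.3, t_comp=0.1)
print(slow, fast)
```

Under this form, a scheme whose leakage ratio reaches 1 has ξ = 0 regardless of how fast it runs.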

A Hybrid Information Reconciliation Protocol
In this section, we first review the characteristics of BBBSS and BCH from the perspective of the new metric ξ. Combining the advantages of both, we propose a new approach named HIRP, which aims to improve the comprehensive reconciliation efficiency.

BBBSS
The BBBSS protocol uses permutation and block bisection to remove the discrepancies [23]. Define one pass of bisect and permutation block correction as one pass of BP. Figure 3 illustrates the flow chart of BBBSS, in which solid blocks contain information interactions. Permutation distributes the disagreements randomly, and the key strings are then grouped into blocks according to the estimated BDR. The recommended block length is $L_{RG}(k) = 0.73/\varepsilon(k)$, where $\varepsilon(k)$ is the BDR for the k-th pass. Alice and Bob then exchange the parity of each block to find the error blocks and apply bisect error correcting to correct the disagreements. Since this method cannot detect a block that has an even number of disagreements, multiple passes of BP are required. The pass iteration terminates when the parities of all blocks are identical.
We further define the efficiency metric of the k-th pass, $\xi^{(k)}$, as ξ evaluated with the corrected bits, information leakage, time delay, and computation time of that pass alone. The information leakage of the k-th pass depends on $N^{(k)}_{corr}$, the number of error groups in the k-th pass, and on $J(k) = \log_2 L_{RG}(k) + 1$, the number of back-and-forth interactions in the k-th pass: besides the first round, which finds the error groups, an additional $\log_2 L_{RG}(k)$ rounds are needed to find the error positions. The time delay of the k-th pass grows with $J(k)$, and the computation time $T_{comp}$ grows linearly with $L_M(k)$.
In general, with the increase of the pass number k, the group number $N_G(k)$ and the corrected bits $N^{(k)}_{corr}$ decline. However, the group length $L_{RG}(k)$ increases, and thus the interaction number $J(k)$ increases. As stated in Section 4.2, the latter term in the time delay plays a dominant role. When k is small, one round of interaction is efficient because it processes the parity information of multiple groups in parallel. However, when k is large, even a single error group may need a round of interaction, which causes low efficiency. With $N^{(k)}_{corr}$ and $T^{(k)}_{delay}$ playing dominant roles, ξ decreases with the increase in pass number, which is also verified in the simulations of Section 6. To sum up, BBBSS has high efficiency in the first several passes and then becomes less efficient in subsequent passes.
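One pass of BP can be sketched as below: a shared random permutation, a parity comparison per block, and a bisect search in every block whose parities differ. This is a single-process illustration in which both sequences are available locally; the function names, the shared random generator standing in for an agreed permutation, and the use of the true residual BDR to choose the block length are simplifying assumptions.

```python
import random


def parity(bits, idx):
    """Parity of the bits selected by the index list idx."""
    return sum(bits[i] for i in idx) % 2


def bisect_fix(alice, bob, idx):
    """Find and flip one error in a block with an odd number of errors.

    Returns the number of extra parity bits disclosed by the search.
    """
    leaked = 0
    while len(idx) > 1:
        half = idx[:len(idx) // 2]
        leaked += 1
        if parity(alice, half) != parity(bob, half):
            idx = half                    # odd number of errors in first half
        else:
            idx = idx[len(idx) // 2:]     # otherwise they sit in second half
    bob[idx[0]] ^= 1
    return leaked


def bp_pass(alice, bob, block_len, rng):
    """One pass of permutation and bisect; returns (corrected, leaked)."""
    order = list(range(len(alice)))
    rng.shuffle(order)                    # permutation known to both sides
    corrected = leaked = 0
    for start in range(0, len(order), block_len):
        block = order[start:start + block_len]
        leaked += 1                       # block parity sent by Alice
        if parity(alice, block) != parity(bob, block):
            leaked += bisect_fix(alice, bob, block)
            corrected += 1
    return corrected, leaked


rng = random.Random(0)
alice = [rng.randint(0, 1) for _ in range(2048)]
bob = [b ^ (rng.random() < 0.05) for b in alice]   # BDR of about 5%
for k in range(1, 4):
    errors = sum(a != b for a, b in zip(alice, bob))
    if errors == 0:
        break
    block_len = max(2, round(0.73 * len(alice) / errors))  # 0.73 / BDR
    corrected, leaked = bp_pass(alice, bob, block_len, rng)
    print(k, block_len, corrected, leaked)
```

Each detected error block yields exactly one corrected bit, while blocks with an even number of errors pass the parity check unnoticed, which is why further passes are needed.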

BCH
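BCH code $C(n_c, k_c, t_c)$ has only one pass of interaction. Since each group uses the same code, the ξ of one group is equal to the overall ξ. The syndrome of one group carries $n_c - k_c$ bits for $k_c$ key bits, so the leakage ratio satisfies
$$\eta = \frac{n_c - k_c}{k_c}.$$
When $C(n_c, k_c, t_c)$ is capable of correcting all of the errors, $N_{corr} = t_c$. The time delay $T_{delay}$ consists of a single one-way transmission, and $T_{comp}$ rises with the increase of $t_c$. The metric ξ is
$$\xi = \frac{t_c \left(1 - \frac{n_c - k_c}{k_c}\right)}{T_{delay} + T_{comp}}. \tag{13}$$
To correct $t_c$ errors, it has to be satisfied that $n_c - k_c \ge 2t_c + 1$. Assume that $n_c \approx k_c + (2t_c + 1)$, then Equation (13) is approximated as
$$\xi \approx \frac{t_c (k_c - 2t_c - 1)}{k_c (T_{delay} + T_{comp})}. \tag{14}$$
In one group, $C(n_c, k_c, t_c)$ satisfies that the message bit length $k_c = L_{RG}$ and the error-correcting number $t_c \ge N_{aGE}$. Thus, $\varepsilon \approx t_c / k_c$, and $k_c$ is a constant that mainly depends on the affordable decoding complexity of Bob. Substituting $t_c \approx \varepsilon k_c$, Equation (14) is further written as
$$\xi \approx \frac{a_1 \varepsilon^2 + a_2 \varepsilon}{a_3 + T_{comp}(\varepsilon)},$$
with $a_1 = -2k_c$, $a_2 = k_c - 1$, and $a_3 = T_{delay}$, where $T_{comp}(\varepsilon)$ increases monotonically with ε. Because $a_1 < 0$ and $a_3 > 0$, ξ generally decreases with increasing ε. In summary, BCH is more efficient in low BDR regions.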
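As a numerical check, the sketch below evaluates the leakage ratio of the code C(8191, 4148, 311) used in the simulations, under the assumption that the leaked information per group is the $n_c - k_c$ syndrome bits relative to the $k_c$ key bits. The function name is illustrative.

```python
def bch_leakage_ratio(n_c: int, k_c: int) -> float:
    """Assumed upper-bound leakage: syndrome bits over message bits."""
    return (n_c - k_c) / k_c


n_c, k_c, t_c = 8191, 4148, 311
print(round(bch_leakage_ratio(n_c, k_c), 3))  # 0.975
print(round(t_c / k_c, 3))                    # 0.075, i.e. a BDR of about 7.5%
```

The result is consistent with the observation that the leakage ratio of BCH approaches 1 at a BDR of 7.5%, at which point almost no secret bits remain after privacy amplification.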

The Algorithm of the Proposed HIRP
In the previous analysis, both BBBSS and BCH are efficient under specific conditions. On the one hand, BBBSS is effective for the first few passes of BP, and its BDR is reduced about threefold after each pass. On the other hand, BCH shows better efficiency in low BDR regions. Inspired by these observations, we propose a hybrid approach named HIRP to further improve the reconciliation efficiency, combining the virtues of both BBBSS and BCH. Figure 4 illustrates the flow chart of HIRP. The core idea is that when the BDR is high, several passes of BP are first exploited to reduce it to a low value, and then the few residual errors are corrected by BCH, which is efficient in low BDR regions. Algorithm 1 gives the details of the realization steps of HIRP, which contains three main phases, i.e., training, table lookup, and testing.

Algorithm 1 Algorithm of HIRP
  ...
  2. Traverse all possible BPs for different ε in range and calculate their efficiency respectively.
  3. Find the optimal pass number to maximize the efficiency ξ and draw Table 4.
  ...
  ... Table 5 with the estimated BDR.
  3. Use the $H_p$ algorithm for reconciliation, which first applies $p_{designed}$ passes of BP and then eliminates the remaining disagreements by BCH codes.

Define $H_p$ as the HIRP approach with p passes of BP. When p = 0, HIRP turns into BCH, and when p gets large enough, HIRP is equal to BBBSS. The selection of the parameter p is critical to our proposed approach. In the training phase, we first collect the optimal p values that achieve the maximal ξ for a group of BDRs. Figure 5 shows a realization framework to find $p_{optimal}$. By adding artificial noise to the training data $Q^{Train}_A$, we get the desired $Q^{Train}_B$ with BDR ε ranging from 0.5% to 11.5%. The collected results of $p_{optimal}$ versus ε are shown in Table 4.

Table 4. Optimal pass number $p_{optimal}$ versus BDR ε.
ε (%)          0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5  10.5  11.5
$p_{optimal}$  1    2    2    2    3    3    3    3    3    3    3     3

Although $H_p$ achieves the highest ξ when $p = p_{optimal}$, the traversing method is complicated, and its cost is huge in practical applications. Besides, it is challenging to go through all the possible $H_p$ for every possible ε. To deal with this problem, we design a new table of $p_{designed}$ versus ε by combining Tables 4 and 5. The element $(\varepsilon_{in}, p, \varepsilon_{out})$ in Table 5 represents the input BDR, the pass number, and the corresponding output BDR. From Table 5, the BDR is reduced to roughly one third after every pass. After p passes, the output BDR $\varepsilon_{out}$ satisfies
$$\varepsilon_{out} \approx \frac{\varepsilon_{in}}{3^{p}}.$$

Table 5. Output BDR after p passes of bisect and permutation (BP).

To simplify the process, $p_{designed}$ is calculated as the minimum value of p that satisfies
$$\frac{\varepsilon_{in}}{3^{p}} \le \varepsilon_{th},$$
where the threshold $\varepsilon_{th}$ is set as the largest value of $\varepsilon_{out}$ among the marked elements. From Table 5, the threshold of our simulation is $\varepsilon_{th} = 0.425$. According to the above rules, Table 6 gives the value of $p_{designed}$ versus different ε.

Table 6. Designed pass number $p_{designed}$ versus BDR ε.
ε (%)           0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5  10.5  11.5
$p_{designed}$  1    2    2    2    3    3    3    3    3    3    3     4

In the testing phase, Bob first estimates ε. The coarse BDR estimation can be calculated according to the channel signal-to-noise ratio. After that, $Q_A$ and $Q_B$ are grouped with length $0.73/\varepsilon_{coarse}$, which satisfies $N_{aGE} = 1$. Then Alice and Bob exchange the parity of each block for a fine BDR estimation. Next, Bob selects the corresponding pass number $p_{designed}$ from Table 6. Finally, Bob conducts the algorithm of $H_p$ for information reconciliation and recovers the bit sequence $Q_A$. The block diagram of the testing phase is illustrated in Figure 6. The testing phase needs neither sophisticated communication nor a heavy computation cost. The only additional operation is the table lookup, which is easy to realize in practice.
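The rule for the designed pass number can be checked with the short sketch below. It assumes that the BDR shrinks by exactly a factor of three per pass, that the threshold 0.425 is expressed in percent, and that the BDR grid runs from 0.5% to 11.5% in steps of 1%. These readings are assumptions, and the function name is illustrative.

```python
def designed_pass_number(bdr_percent, threshold_percent=0.425, shrink=3.0):
    """Smallest p such that bdr / shrink**p does not exceed the threshold."""
    p = 0
    while bdr_percent / shrink ** p > threshold_percent:
        p += 1
    return p


bdr_grid = [0.5 + i for i in range(12)]      # 0.5%, 1.5%, ..., 11.5%
print([designed_pass_number(e) for e in bdr_grid])
# [1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4]
```

Under these assumptions the rule yields the values 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4 reported for the designed pass number, including the step to four passes at a BDR of 11.5%, where 11.5/27 ≈ 0.426 just exceeds the threshold.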

Simulations
In this section, we present simulation results for BBBSS, BCH, and our proposed HIRP scheme with $p_{optimal}$ and $p_{designed}$ for comparison. The communication distance is set as d = 5 km, and the communication bandwidth is B = 4 MHz. The communication overhead per interaction is set as $T_0$ = 50 ms to account for packet loss.
First, we simulate the efficiency metrics of BBBSS in every individual pass. The results are given in Figure 7. Both the corrected bit number and the information leakage ratio decrease with the increase in pass number. The interaction time delay rises at first and then goes down after the fourth pass. This is caused by the fact that the group length $L_{RG}$ increases, while the number of error groups is not reduced significantly. Overall, as shown in Figure 8, the metric ξ decreases with the increase in pass number, which means that BBBSS has a high efficiency in the first several passes. The simulation results coincide with the theoretical analysis in Section 5.1.
We also simulate the individual performance of BCH for different BDRs in Figure 9. With the increase in the BDR, the information leakage ratio, the time delay, and the computation time show an upward trend. Therefore, ξ presents a general falling tendency, as shown in Figure 10. When the BDR is lower than 1.5%, ξ has a slight increase. This is because the corrected bit number rises significantly with the increase in BDR.
Next, we compare the performance in terms of the information leakage, the time delay, the computation time, and the comprehensive efficiency for various information reconciliation approaches, including BBBSS, BCH, and HIRPs with optimal and designed pass numbers. Figure 11a shows the information leakage ratio versus the BDR. BCH has the highest information leakage ratio, which rises significantly with the increase in BDR. When the BDR is 7.5%, the BCH code is chosen as C(8191, 4148, 311), and the leakage ratio reaches 1. Therefore, we do not present the BCH performance results for BDRs larger than 7.5%. The leakage ratio of the HIRPs is almost identical to that of BBBSS, and both grow slowly with the BDR. Figure 11b shows the interaction time delay as a function of the BDR. BBBSS has a longer interaction time than the others.
HIRPs have the shortest time delay and the slowest growth for BDRs above 1.5%. Figure 12a shows the computation time with respect to the BDR. The computation complexity rises significantly as the BDR increases. BCH has the longest computation time, which becomes significant in high BDR regions. BBBSS has the shortest computation time, and the HIRPs lie in between. In addition, the computation times of BBBSS and the HIRPs rise slowly with the increase in BDR.
Figure 12b compares the efficiency ξ of the different information reconciliation approaches. In low BDR regions, BCH performs better than BBBSS, while in high BDR regions, the opposite is true. It is observed that the ξ of HIRP outperforms that of both BBBSS and BCH in all BDR regions. It should be noted that when only the information leakage and computation time are considered, HIRP seems to have no advantage over BBBSS. However, Figure 11b shows that HIRP has a much lower time delay than BBBSS, whose multipass interaction increases its time delay considerably. Therefore, the final comprehensive efficiency of HIRP is higher than that of BBBSS. In addition, HIRP with the designed p has almost the same performance as HIRP with the optimal p. The results verify the effectiveness of our proposed approach. Table 7 shows the numerical improvement of HIRP over BBBSS and BCH. According to Equation (7), the effective reconciliation rate ξ decreases with increasing information leakage η, time delay $T_{delay}$, and computation time $T_{comp}$. Compared to BBBSS, the comprehensive efficiency ξ is improved 2.48 times, mainly because HIRP reduces $T_{delay}$ by 73% on average. Compared to BCH codes, HIRP reduces η, $T_{delay}$, and $T_{comp}$, which results in an average efficiency improvement of 22.36 times.

Conclusions
This paper examined the efficiency of information reconciliation approaches. We introduced a comprehensive reconciliation efficiency metric that jointly considers the corrected bits, the information leakage, the interaction delay, and the computation time. Furthermore, we analyzed the characteristics of both BBBSS and BCH from the perspective of the metric. The efficiency of BBBSS decreases with the pass number, and BCH has low efficiency in high BDR regions. Inspired by this, we proposed the HIRP method, which exploits a certain number of passes of BP and then corrects the residual errors by BCH. The design of HIRP contains training, table lookup, and testing phases. The simulation results verified the effectiveness of our proposed HIRP approach. HIRP improves the comprehensive reconciliation efficiency 2.48 and 22.36 times compared with BBBSS and BCH, respectively. It makes a trade-off between individual performance indicators by achieving a median value of information leakage, interaction delay, and computation time. In the future, we plan to study the parameter design of HIRP from a theoretical point of view in some specific scenarios. In addition, we chose the BBBSS protocol of EDPA and the BCH code of ECCA as a case study in this paper. As a next step, we plan to extend HIRP to a more general hybrid method that considers more protocols and codes in EDPA and ECCA.