Optical Systems Identification through Rayleigh Backscattering

We introduce a technique to generate and read the digital signature of networks, channels, and optical devices equipped with fiber-optic pigtails, with the aim of enhancing physical layer security (PLS). Attributing a signature to networks or devices eases their identification and authentication, thus reducing their vulnerability to physical and digital attacks. The signatures are generated using an optical physical unclonable function (OPUF). Because OPUFs are established as the most potent anti-counterfeiting tool, the created signatures are robust against malicious attacks such as tampering and cyber attacks. We investigate the Rayleigh backscattering signal (RBS) as a strong OPUF to generate reliable signatures. Contrary to other OPUFs, which must be fabricated, the RBS-based OPUF is an inherent feature of fibers and can be easily obtained using optical frequency domain reflectometry (OFDR). We evaluate the security of the generated signatures in terms of their robustness against prediction and cloning. We demonstrate the robustness of the signatures against digital and physical attacks, confirming their unpredictability and unclonability. We explore signature cyber security by considering the random structure of the produced signatures. To demonstrate signature reproducibility through repeated measurements, we simulate the signature of a system by adding random white Gaussian noise to the signal. This model is proposed to address services including security, authentication, identification, and monitoring.


Introduction
Communication around the interconnected world is an inseparable part of quotidian life. Indeed, fiber-optic communication networks are the most deployed telecommunication infrastructures across the globe [1]. Modern technology is developing towards providing secure, fast, cost-effective communication with high capacity around the world. Enhancing the protection of accessibility and confidentiality of networks and data is indispensable to protect communication systems from the constantly growing threat of adversarial attacks. Whereas network developers are responsible for building, developing, and maintaining data networks, network security engineers oversee security systems providing network security against adversary threats. For this purpose, network security engineers have defined the security and confidentiality of the networks. Considering the Open Systems Interconnection (OSI) model, which is known as a protocol-independent network communication model, security and confidentiality are generally subject to the upper layers. The OSI security architecture (ITU-T recommendation) delivers a standard for data security by identifying attacks, security services, and security mechanisms. From the last layer, namely,

Signature Extraction
Although RBS is usually considered a disruptive noise in fiber-optic systems, one that leads to throughput degradation, we propose to exploit RBS to read the signature of fibers. Every fiber has an individual RBS pattern. This is a result of the particles randomly distributed along the fiber core, which originate in the manufacturing process of the fiber. These particles are much smaller than the wavelength of the light that propagates in the fiber, and they cause Rayleigh scattering. The unknown positions and varying distribution of the particles in the fibers are an advantage, since they make the RBS-based signature unpredictable.
Optical time or frequency domain reflectometry (OTDR/OFDR) is generally used to measure fiber RBS. From the large pool of OFDR techniques proposed in the literature [33,34], we chose coherent optical frequency domain reflectometry (C-OFDR) [35], which exploits coherent reception. At the transmitter side, a continuous wave (CW) laser scans a frequency range by means of direct tuning or external linear frequency modulation. The interferometer arms carry the RBS from the interrogated fiber (IF) and the original linearly frequency-swept light, which acts as the local oscillator (LO). On the receiver side, RBS data are acquired using an analog-to-digital converter (ADC) with N samples. C-OFDR ensures that RBS is measured as a function of distance with high spatial resolution and high sensitivity [35,36]. Figure 1 illustrates a simplified experimental scheme for implementing C-OFDR; the details of the experimental setup are not shown. The figure presents C-OFDR for RBS measurement as a technique for reading a fiber signature, in order to introduce the concept of optical identification. In long-distance measurements, where the distance is beyond half of the laser coherence length, phase noise and frequency sweeping nonlinearity must be properly compensated, as described in [35].
The received RBS combined with the LO at the photodetector [31] is expressed by Equation (1), where n represents the number of reflection points of the IF, E_0 is the amplitude of the light, R_i and τ_i are the reflectivity of the i-th reflection point and the round-trip time of the light reflected at that point, and γ is the laser sweep rate. It should be noted that random phase noise is neglected in Equation (1).
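As an illustration of how a C-OFDR beat signal maps to a distance-resolved RBS trace, the following sketch assumes the standard beat-note form for a linearly swept source, in which each reflection point contributes a cosine at the beat frequency γτ_i. This form may differ in detail from Equation (1). The start frequency `f0`, the group index `N_GROUP`, the Hann window, and the function names are assumptions made for the example and are not taken from the measurement setup.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)
N_GROUP = 1.468    # assumed group index of the fiber


def simulate_beat_signal(delays, reflectivities, gamma, f0, fs, n_samples, e0=1.0):
    """Sum of beat notes, one per reflection point (phase noise neglected).

    Assumed form: 2*E0^2 * sum_i sqrt(R_i) * cos(2*pi*(f0*tau_i + gamma*tau_i*t - 0.5*gamma*tau_i^2)).
    """
    tau = np.asarray(delays, dtype=float)[:, None]
    amp = np.sqrt(np.asarray(reflectivities, dtype=float))[:, None]
    t = (np.arange(n_samples) / fs)[None, :]
    phase = 2.0 * np.pi * (f0 * tau + gamma * tau * t - 0.5 * gamma * tau**2)
    return 2.0 * e0**2 * np.sum(amp * np.cos(phase), axis=0)


def rbs_trace(beat, gamma, fs):
    """Map the beat spectrum to distance: f_beat = gamma * tau, z = c * tau / (2 * n_g)."""
    spectrum = np.abs(np.fft.rfft(beat * np.hanning(len(beat))))
    freqs = np.fft.rfftfreq(len(beat), d=1.0 / fs)
    distance = C * freqs / (2.0 * N_GROUP * gamma)
    return distance, spectrum


# Hypothetical parameters: 200 weak scatterers in a 0.5 m fiber plus one strong reflector.
rng = np.random.default_rng(0)
gamma, f0, fs, n_samples = 7.5e12, 1.94e14, 100e3, 8192
z = np.append(rng.uniform(0.0, 0.5, 200), 0.3)
refl = np.append(np.full(200, 1e-6), 1e-2)
beat = simulate_beat_signal(2.0 * N_GROUP * z / C, refl, gamma, f0, fs, n_samples)
distance, spectrum = rbs_trace(beat, gamma, fs)
```

In this toy model the strong reflector appears as the dominant peak of `spectrum`, while the weak scatterers form the random background that plays the role of the RBS pattern.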

Binary Signature
The RBS-based signatures rely on the OPUF challenge-response protocol: any stimulus (called a challenge) maps to a unique result (called a response), and the two together form a challenge-response pair (CRP). Figure 2 shows the challenge-response protocol in our proposed model to generate an OPUF-based signature with the following CRP:
Challenge (C): the frequency-modulated continuous wave (FMCW) parameters on the Tx side represent the challenge. The FMCW parameters are: the sweep rate γ (Hz/s), which indicates the rate at which the frequency changes in FMCW; the sweep range ∆F (∆F = λ_stop − λ_start), which indicates the distance between the start and stop wavelengths in FMCW; and the values of the start and stop sweep wavelengths. Every parameter has an important role in the spatial resolution and quality of the obtained RBS pattern; this is beyond the scope of this manuscript and is clarified in [35]. The FMCW parameters required to define a specific challenge to generate the signature are summarized in Table 1.
Response (R): the obtained RBS pattern represents the response. Changing the FMCW parameters, i.e., changing the challenge, changes the RBS pattern accordingly. We can therefore extract different RBS patterns from the same fiber by stimulating it with various challenges. As a consequence, even if the adversary has access to the fiber, he cannot generate the intended signature without applying the correct challenge.
The procedure used to generate the signature is shown in Figure 3. Part of the obtained RBS pattern is extracted using a selecting window (SW). The measured data are converted to the binary domain as a bit sequence, called the main sequence. This operation is easily performed using a low-cost single-bit ADC. Subsequently, a sequence of random bits, called the key K^(1)_{F1,F2}, is generated and stored together with two parameters called flag points, F1 and F2, that represent a specific random position in the key and in the main sequence, respectively. The main sequence and the key are then combined, as described in Figure 3c. The final sequence starts with the main sequence bits up to the F2 position, followed by the key bits up to the F1 position. The remaining bits are then appended in the same order: the main sequence from the F2 position to the end, followed by the key bits from the F1 position to the end. Note that F1 and F2 are given as percentages of the corresponding sequence length.
The final sequence is converted into a quick-response (QR) code, i.e., a two-dimensional (2D) binary matrix, to denote the signature. This step was performed in MATLAB by converting the bits into a binary image. In the worst-case scenario, the adversary has physical access to the nominated fiber (or device) and possesses strong knowledge and high-performance devices that can predict the key. Even in this case, the adversary cannot find out the signature without knowing the key parameters. Consequently, adding a key with the mentioned parameters to the original data enhances the signature robustness against such adversary attacks. In Section 4.3, we will demonstrate which signature length, and consequently key length, can be defined according to the level of security expected from the signatures.
Sensors 2023, 23, 5269
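The signature-generation steps described above (single-bit conversion, key insertion at the flag points, and reshaping into a 2D binary matrix) can be sketched as follows. This is a minimal illustration and not the MATLAB code used in this work: the function names are hypothetical, the median threshold used for binarization is an assumption (the text only states that a single-bit ADC is sufficient), and the result is a plain square binary matrix without the finder patterns or error correction of a standard QR symbol.

```python
import math

import numpy as np


def binarize(rbs_window, threshold=None):
    """Single-bit quantization of the selected RBS window (the main sequence).

    Assumption: samples above the median map to 1, the rest to 0.
    """
    rbs_window = np.asarray(rbs_window, dtype=float)
    if threshold is None:
        threshold = np.median(rbs_window)
    return (rbs_window > threshold).astype(np.uint8)


def combine(main, key, f1_percent, f2_percent):
    """Interleave key and main sequence at the flag points F1 (key) and F2 (main).

    Order: main[:F2], key[:F1], main[F2:], key[F1:].
    """
    f1 = int(len(key) * f1_percent / 100)
    f2 = int(len(main) * f2_percent / 100)
    return np.concatenate([main[:f2], key[:f1], main[f2:], key[f1:]])


def to_binary_matrix(bits):
    """Reshape the final sequence into a square binary image."""
    side = math.isqrt(len(bits))
    if side * side != len(bits):
        raise ValueError("sequence length must be a perfect square")
    return bits.reshape(side, side)


# Hypothetical usage: 4000 samples plus a 96-bit key give a 64 x 64 matrix.
rng = np.random.default_rng(1)
rbs_window = rng.normal(size=4000)                 # stand-in for measured RBS data
key = rng.integers(0, 2, size=96, dtype=np.uint8)  # random key bits
signature = to_binary_matrix(combine(binarize(rbs_window), key, 30, 60))
```

Because F1 and F2 are stored with the key, the verifier can rebuild the same interleaving, whereas an adversary who only predicts the key bits still lacks the insertion positions.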

Signature Evaluation
We propose a new method to generate a signature associated with any optical fiber segment (i.e., optical links or devices' fiber pigtails) to assist services including security, identification, authentication, and monitoring at the physical layer. However, the generated signatures must be unique and reproducible for every fiber; thus, repeating measurements with the same challenge should result in the same signature. Moreover, the signatures should be robust against physical and digital attacks. To address these features, specific metrics must be considered.
In [31–37], we evaluated the uniqueness and reproducibility of the obtained signatures using the Hamming distance. In this work, we further assess the signatures to complete the evaluation of the model's performance. In addition to the Hamming distance, we considered the following metrics, which have been used to assess the OPUF responses of the optical diffuser [38]:
(A) Cross-Correlation Coefficient (XCOR-C): computed between A and B, two binary signatures of dimension N × M. The XCOR-C assesses the signatures as follows:
i. it shows how distinguishable the signatures generated from a single fiber (OPUF) with different challenges are;
ii. it shows that a noisy signature is distinguishable from an imposter version, which is generated with the same OPUF but with the wrong challenge;
iii. it assists in finding the best challenges to generate more reliable signatures.
The uniqueness assessment of the signatures is implemented through the XCOR-C. In this study, we have considered the Pearson cross-correlation coefficient, which provides information regarding the structural similarity between the acquired signatures and accurately distinguishes the noisy signature from the imposter. It is the most common way to assess the similarity between acquired PUF responses [38].
(B) Euclidean Distance (ED): computed between x and y, two signature vectors (non-binary) of length N. The ED assesses the signatures as follows:
i. the ED between the RBS obtained through several measurements of the same OPUF with the same challenge translates into reproducibility;
ii. the ED between the RBS obtained through several measurements of the same OPUF but with different challenges indicates unpredictability;
iii. the ED between the RBS obtained through several measurements of various OPUFs but with the same challenge indicates unclonability.
(C) Hamming Distance (HD): computed between s1 and s2, two signature vectors (binary sequences) of length M. The HD counts the bits that differ between the two vectors, i.e., the number of bits that would have to be flipped for the two signatures to become identical. The HD assesses the signatures as follows:
i. the HD between the binary sequences obtained through several measurements of the same OPUF with the same challenge indicates reproducibility;
ii. the HD between the binary sequences obtained through several measurements of the same OPUF but with different challenges indicates unpredictability;
iii. the HD between the binary sequences obtained through several measurements of various OPUFs but with the same challenge indicates unclonability;
iv. the HD between the rows of the binary image (QR code) gives uniqueness [37–39].
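For reference, the three metrics can be written compactly using their standard definitions. The sketch below is an illustration only: the function names are hypothetical, and the XCOR-C is taken to be the 2D Pearson coefficient, consistent with the description of the metric given in the text.

```python
import numpy as np


def xcor_c(a, b):
    """Pearson cross-correlation coefficient between two equally sized signatures."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()
    denom = np.sqrt(np.sum(da**2) * np.sum(db**2))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant signature")
    return float(np.sum(da * db) / denom)


def euclidean_distance(x, y):
    """ED between two real-valued (non-binary) signature vectors of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.linalg.norm(x - y))


def hamming_distance(s1, s2, percent=False):
    """Number (or percentage) of positions where two binary sequences differ."""
    s1 = np.asarray(s1)
    s2 = np.asarray(s2)
    differing = int(np.count_nonzero(s1 != s2))
    return 100.0 * differing / s1.size if percent else differing
```

With these definitions, identical signatures give an XCOR-C of 1 and an ED and HD of 0, while two independent random binary signatures are expected to give an XCOR-C near 0 and an HD near 50%.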
It is worth mentioning that the results of HD (ED) are commonly presented in histogram form [38]. In this manner, it is easy to find the HD (ED) threshold for accepting (genuine) or rejecting (imposter) a signature. Two histograms are compared: the HD (ED) between responses of one OPUF obtained through several measurements with the same challenge, which differ only by measurement noise (genuine signatures), and the HD (ED) between measurements of the same OPUF with different challenges (imposters). Any overlap between the two histograms implies the presence of false positives (imposter signature accepted) or false negatives (genuine signature rejected) [31]. Zero overlap guarantees that no two challenges will result in an identical response, ensuring the signature's unpredictability and unclonability.
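The histogram comparison can be illustrated with a small numerical experiment. In the sketch below, the RBS responses are replaced by synthetic Gaussian vectors, so the resulting distances are not those reported in this work; the noise level, the midpoint rule for the threshold, and all function names are assumptions chosen for the example.

```python
import numpy as np


def simulate_distances(n_samples=4000, n_trials=100, noise_std=0.05, seed=0):
    """ED of noisy copies (genuine) and of unrelated responses (imposter) to a reference."""
    rng = np.random.default_rng(seed)
    reference = rng.normal(size=n_samples)  # synthetic stand-in for an RBS response
    genuine = np.array([
        np.linalg.norm(noise_std * rng.normal(size=n_samples))  # reference vs. reference + noise
        for _ in range(n_trials)
    ])
    imposter = np.array([
        np.linalg.norm(reference - rng.normal(size=n_samples))
        for _ in range(n_trials)
    ])
    return genuine, imposter


def acceptance_threshold(genuine, imposter):
    """Midpoint between the two histograms, or None if they overlap."""
    if genuine.max() >= imposter.min():
        return None
    return 0.5 * (genuine.max() + imposter.min())


def error_rates(genuine, imposter, threshold):
    """Return (false positive rate, false negative rate) for a distance threshold."""
    false_negative = float(np.mean(genuine > threshold))
    false_positive = float(np.mean(imposter <= threshold))
    return false_positive, false_negative


genuine, imposter = simulate_distances()
threshold = acceptance_threshold(genuine, imposter)
```

When the two sets of distances do not overlap, any threshold placed in the gap yields zero false positives and zero false negatives on the simulated data.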

(D) Statistical Test Suite (STS)
The aim of using the STS is to test the randomness of the bits in the signature sequences (S). The hypothesis test returns h = 1 or h = 0 together with a p-value p, referred to here as the probability of randomness. If h = 0, the test does not reject the null hypothesis that the bits in S are in random order. Small values of p cast doubt on the validity of the null hypothesis; when h = 1, the test rejects the randomness of the bits.
One of the most significant ways to explore a signature's cybersecurity is by considering the random structure of the produced signatures (bits). In [32], we demonstrated the strength of the signature against a brute force attack using Hamming distance. Here, STS with h = 0 verifies the results in [32].
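The specific statistical test is not detailed here. As an illustration of a test that returns the pair (h, p) with the convention described above, the sketch below implements the frequency (monobit) test of the NIST SP 800-22 suite. The choice of this test, the significance level of 0.01, and the function name are assumptions; a full STS run comprises several further tests.

```python
import math

import numpy as np


def monobit_test(bits, alpha=0.01):
    """NIST SP 800-22 frequency (monobit) test.

    Returns (h, p): h = 0 if randomness is not rejected, h = 1 if it is rejected.
    """
    bits = np.asarray(bits).ravel().astype(np.int64)  # flatten a 2D signature if needed
    n = bits.size
    s = int(np.sum(2 * bits - 1))       # map 0/1 to -1/+1 and sum
    s_obs = abs(s) / math.sqrt(n)
    p = math.erfc(s_obs / math.sqrt(2))
    h = int(p < alpha)
    return h, p
```

A balanced sequence gives p close to 1 and h = 0, whereas a sequence dominated by ones or zeros gives a very small p and h = 1.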

Security Validation
To validate the security robustness of the signatures generated using our proposed model, in terms of uniqueness, unclonability, and unpredictability, we considered various simulation scenarios for a 0.5 m fiber pigtail using C-OFDR. The measurement scenario was modified depending on which security feature of the signature was to be explored. In all scenarios, we considered a single-polarization signal measured with C-OFDR.

Signature Evaluation through the ED and HD Metrics
To investigate the reproducibility and unpredictability of the signatures, which are defined in the following section, we considered an ADC with N = 4000 samples.
To validate robustness, we generated the signature by applying the following challenge: Ci (60 nm/s, 30 nm, 1530 nm, 1560 nm). The signature was stored as valid. Subsequently, we generated 100 signatures of the same fiber with the same challenge (Ci), adding random white Gaussian noise to emulate the measurement noise that affects measurements taken at different times. We computed the ED between the original signature and the noisy ones, as shown in Figure 4a.
To validate unpredictability, we generated the signature of the same fiber by applying 40 different challenges, obtained by changing the sweep rate (and the sweep range in a few cases). One worst-case scenario of an adversary attack is when the adversary has access to the fiber and attempts to use different challenges to obtain the original signature. In essence, any new challenge generates a new response, even though the challenge is applied to the same PUF, and this fact neutralizes the adversarial attempts. To demonstrate this, we computed the ED between the original signature and the signatures obtained with different challenges. The results are depicted in Figure 4b. Figure 4 illustrates that there was no overlap between the robustness and unpredictability histograms, which diminishes the probability of false positives and false negatives. The histogram of robustness shows a very low ED (around 0.9 on average), which means that the obtained responses are very similar notwithstanding the added noise. By contrast, the histogram of unpredictability has ED values (around 11 on average) that are higher by at least one order of magnitude.
To validate unclonability, we generated 100 signatures from 100 different fibers by applying the same challenge to all of them. Another worst-case scenario of adversary attacks is when the adversary knows the challenge but has no access to the original PUF and tries to apply the known challenge to different PUFs to find the signature. Because any PUF has a unique response for each specific challenge, the adversary's trials will fail again. We computed the ED between the original signature and the signatures obtained from the other fibers with the same challenge as the original signature. Figure 5 shows no overlap between robustness and unclonability, which reduces the probability of false positives and false negatives.
In all cases, the ED was computed before the RBS conversion to the binary domain. To investigate the robustness, unpredictability, and unclonability of the signatures with the HD metric, we converted the abovementioned signatures to the binary domain, added the key, and generated the QR code. The normalized histograms of the HD (expressed in percent) between the original signature and the noisy signatures, and between the original signature and the signatures obtained with different challenges, are shown in Figure 6a,b, respectively. The HD between the original signature and the noisy signatures, and the HD between the original signature and the signatures obtained from other fibers with the same challenge as the original signature, are shown in Figure 7a,b, respectively. The results of the HD metric, computed in the binary domain, confirm the results obtained with the ED metric.

Signature Evaluation XCOR-C
Are noisy signatures adequately distinguishable from imposters? To answer this question, we studied the signatures through the XCOR-C metric. The original signature (QR code) was generated and stored in the database. The XCOR-C was then calculated between the original and noisy signatures. Meanwhile, different signatures (imposters) were generated by applying various challenges to the same OPUF (the original fiber pigtail). Subsequently, the XCOR-C between these signatures and the original signature was calculated. The obtained results indicate that signatures produced by different challenges on one OPUF are distinguishable from the noisy versions of the original signature, as shown in Figure 8a. Afterward, we placed one noisy signature among the imposter signatures and carried out the XCOR-C once more. The noisy original signature (genuine) was detected, appearing with the highest cross-correlation coefficient (peak) among the results, as shown in Figure 8b.
Based on the obtained results, it is possible to define a threshold (TH) to accept or reject a signature as genuine. The decision rule is: if the XCOR-C is below the threshold (here 0.5), the signature is considered an imposter; if it is above the TH, it is deemed genuine.
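The decision rule can be sketched as a search for the candidate with the highest XCOR-C, followed by a comparison with the threshold. The example below uses synthetic random bit matrices in place of measured signatures; the 5% bit-flip rate used to emulate noise, the function names, and the treatment of a coefficient exactly equal to the TH as genuine are assumptions.

```python
import numpy as np


def pearson(a, b):
    """Pearson cross-correlation coefficient between two binary matrices."""
    da = a.astype(float) - a.mean()
    db = b.astype(float) - b.mean()
    return float(np.sum(da * db) / np.sqrt(np.sum(da**2) * np.sum(db**2)))


def identify(reference, candidates, threshold=0.5):
    """Return (index of the genuine candidate or None, list of XCOR-C scores)."""
    scores = [pearson(reference, c) for c in candidates]
    best = int(np.argmax(scores))
    # Assumption: a score exactly equal to the threshold is accepted as genuine.
    return (best if scores[best] >= threshold else None), scores


rng = np.random.default_rng(2)
reference = rng.integers(0, 2, size=(64, 64), dtype=np.uint8)
flips = (rng.random(reference.shape) < 0.05).astype(np.uint8)  # emulated noise
noisy_genuine = reference ^ flips
imposters = [rng.integers(0, 2, size=(64, 64), dtype=np.uint8) for _ in range(20)]

candidates = imposters[:10] + [noisy_genuine] + imposters[10:]
index, scores = identify(reference, candidates)
```

In this synthetic case the noisy genuine signature, placed at position 10 among the imposters, is the only candidate whose coefficient exceeds the threshold.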

Cyber Security
One of the most significant ways to explore a signature's robustness against cyber attacks is to consider the random structure of the produced signatures (bits). Indeed, a sequence that consists of random bits is extremely strong against cyber attacks attempting to predict that sequence. To investigate the randomness of our proposed signatures, we generated 500 two-dimensional signatures (QR codes) of 60 × 60 bits, each built from 3500 samples and a 100-bit key. After running the STS on the 500 signatures, only seven of them failed the test, i.e., 1.4% failed sequences. We increased the signature dimension by measuring the RBS with a higher number of samples (4000) and obtained a signature of 64 × 64 bits, including a 96-bit key. The test showed only 0.6% failed sequences. Finally, we increased the number of samples to 5000 to raise the QR resolution to 71 × 71 bits, including a 41-bit key. As a result, we obtained 0.0% failed sequences, i.e., all signatures passed the test. The probability of randomness of the signatures with different QR resolutions is depicted in Figure 9.
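In the three configurations above, the number of samples plus the key length equals the number of bits of the square QR matrix (3500 + 100 = 60², 4000 + 96 = 64², 5000 + 41 = 71²). The sketch below reproduces this arithmetic by padding the main sequence up to the next perfect square. The rule is inferred from the reported numbers and is not stated as a design rule in this work; the function name is hypothetical.

```python
import math


def square_side_and_key_length(n_samples):
    """Smallest square side that holds n_samples bits, and the key bits needed to fill it."""
    side = math.isqrt(n_samples)
    if side * side < n_samples:
        side += 1
    return side, side * side - n_samples


for n in (3500, 4000, 5000):
    side, key_bits = square_side_and_key_length(n)
    print(f"{n} samples -> {side} x {side} bits with a {key_bits}-bit key")
```

Under this rule, a sample count that is already a perfect square would leave no room for a key, so a larger side would have to be chosen in that case.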

Discussion
Optical identification (OI) through an optical signature is an excellent complementary security tool for networks, supporting the security of systems, subsystems, and devices. Such an OI method may find uses within several applications. It can be used in applications related to classic network security [32], quantum network security [31], or network quality of transmission (QoT). Regarding a network device census, as highlighted in [40], it is common that network operators are not fully aware of all the fiber types deployed in a network. This complicates QoT estimation: the added inaccuracy forces the operator to assume larger network margins than necessary, with a consequent underestimation of the optical reach and an increase in regeneration costs [41]. Regarding network security, OI can be implemented in different network scenarios [32] to identify, authenticate, and monitor networks. As a simple example, we consider a point-to-point communication system that can be schematized through three subsystems: transmitter, channel, and receiver. Each subsystem, as clarified in the previous sections, has its own signature, labeled Transmitter Signature, Channel Signature, and Receiver Signature, respectively. In this scenario, when the channel is assumed to be a passive subsystem, three possible security approaches can be envisaged:
a. the transmitter reads the Channel Signature and the Receiver Signature to be sure that the information will pass through the specific channel and reach the specific receiver;
b. the receiver reads the Transmitter Signature and the Channel Signature so that the receiver knows the sender and the physical path;
c. the transmitter reads the Receiver Signature, and the receiver acquires the Transmitter Signature, so that both know with whom they are talking. Both may also read the Channel Signature to check the path.

Conclusions
In this paper, the concept of a network digital signature has been introduced. In particular, we have proposed to generate such a signature by using optical physical unclonable functions (OPUFs) that are intrinsically present in fibers and are determined by the specific internal characteristics of each fiber. Specifically, the OPUF identified and tested in this work is the Rayleigh backscattering signal (RBS). This signal has been analyzed to understand whether it is a good OPUF candidate that satisfies the requirements a PUF must generally fulfill to be adopted as a security mechanism in authentication scenarios. Such requirements are usually unclonability, robustness, unpredictability, and randomness; all of them have been verified and evaluated on the basis of different metrics, such as the Hamming distance (HD), the Euclidean distance (ED), and the cross-correlation coefficient (XCOR-C). In all cases, the RBS has been shown to perform well and has been confirmed to grant both robustness and distinctiveness.
The optical identification concept represents a novel approach to physical layer security which can be applied to any optical communication system and network. It provides supplementary quality and security to optical systems and network operations.
Future work will be dedicated to investigating other types of metrics specific to real-valued signals, such as the RBS, to better understand whether superior performance can be achieved, particularly in terms of robustness. Following this line of investigation, diverse operating scenarios will be considered to understand the actual applicability of the proposed solution in more depth.