Qualification of distributed optical fiber sensors using probability of detection curves for delamination in composite laminates

Despite the promising application of Distributed Optical Fiber Sensors (DOFS) in monitoring damage in composite structures, their implementation outside academia is still unsatisfactory due to the lack of a systematic approach to assessing their damage detection performance. The existing tool developed for non-destructive evaluation, Probability of Detection (POD) curves, needs to be adapted for structural health monitoring applications to account for spatial and temporal dependency. Damage detection performance with DOFS is deeply related to the inherent variability sources of the system, the strain transfer properties of the optical fiber, and the loading conditions, which determine the damage-induced strain on the structure. This paper establishes a systematic approach based on the Length at Detection (LaD) method to qualify DOFS for damage detection in composites under different scenarios. Specifically, this study considers two DOFS with different strain transfer properties for monitoring delamination in carbon fiber reinforced polymers double-cantilever beam specimens under mode I quasi-static and fatigue loading. The POD curves derived from the LaD method confirm that this methodology can quantify the change in the detection performance due to the DOFS type and the loading conditions. The study also proposes a practical solution to compare POD curves obtained with different sample sizes, by introducing the concept of virtual specimens to simulate the lower confidence bound convergence.


Introduction
Over the last decades, composite laminates have become the predominant structural material in various engineering applications. Nowadays, the quest to develop safer and lighter structures still fosters the scientific community to investigate different damage mechanisms in composite materials and their reciprocal interaction. However, despite the impressive amount of research, open questions are still present, and the understanding of the physics behind failure modes in composites is limited. 1 Moreover, composite structures are particularly susceptible to flaws arising from the manufacturing process and service and exhibit complex failure modes as opposed to metals. Among them, delamination constitutes one of the most common damage mechanisms and can also occur in adhesive bonds. 2 Using large safety factors mitigates the risks of catastrophic structural failure but leads to heavier designs and might not be deemed sufficient to guarantee safety.
Consequently, delamination growth represents a severe threat to structural integrity in carbon fiber reinforced polymer (CFRP) structures, and it becomes necessary to implement Structural Health Monitoring (SHM) strategies. SHM can provide essential information about delamination existence, location, and size. Moreover, it can also deepen the understanding of other correlated damage mechanisms and thus promote the introduction of innovative composite materials and structures. 3 SHM offers a wide range of techniques, each with its strengths and weaknesses depending on the application.
Among them, Optical Fiber Sensors (OFS) provide numerous advantages over traditional strain sensing techniques. For example, they are intrinsically immune to electromagnetic interference; they have large bandwidth, which enables multiplexing solutions; and can survive harsh environmental conditions if protected with appropriate coatings and cable sheaths. Alj et al. provide further details about the durability of OFS. 4 Moreover, their lightweight and small size allow them to be embedded in composites 5 as well as 3D printed structures, 6 and they have recently been shown to be a viable alternative to accelerometers for modal analysis. 7 Recent advances in optical fiber technology fostered the use of Distributed Optical Fiber Sensors (DOFS) based on Raman, Brillouin, 8 and Rayleigh backscattering. DOFS based on Rayleigh backscattering are particularly promising for monitoring damage, such as delamination growth, in composites since they provide millimeter resolution along the fiber length within several meters of range. 9 However, assessing the damage detection performance of optical fibers is not a straightforward process. The optical fiber datasheet specifies the geometrical, mechanical, and optical properties. On the other hand, the interrogator datasheet provides the resolution, wavelength range, wavelength stability, maximum sensor length, measurement uncertainty, and sampling rate. Nevertheless, these metrics do not directly assess strain-based damage detection performance since damage is not a physical quantity that can be directly measured. Indeed, Axiom IVa of SHM states that sensors cannot measure damage and that a feature extraction process is needed to obtain damage-related information. 10 For example, considering delamination monitoring, the detection performance is expected to change depending on the loading conditions because they affect the damage-induced strain in the structure. 
In addition, depending on the strain transfer occurring from the structure to the fiber core, 11 DOFS may exhibit different detection performances. The current literature lacks well-established methodologies for certification and performance evaluation for damage detection, preventing the adoption of this technology in many applications.
The performance of Non-Destructive Evaluation (NDE) methods, widely accepted in many industries (aerospace, automotive, oil and gas, medical, and marine, to name a few), is evaluated following the guidelines provided in the MIL-HDBK-1823A. 12 First, damage detection performance is quantified using Probability of Detection (POD) curves and Probability of False Alarm (PFA). Furthermore, varying the threshold value makes it possible to evaluate the POD against the PFA and obtain the so-called Receiver Operating Characteristic curve. 13,14 It is legitimate to ask whether these NDE reliability metrics can be applied to SHM. The naive application of POD curves in SHM would lead to inconsistent results. One of the most critical differences between NDE and SHM is their variability sources. The human factor represents the highest variability contribution in NDE, whereas SHM is affected by both temporal and spatial sources of variability. Moreover, SHM is typically characterized by repeated measurements over time, implying that the independent measurement assumption used in NDE does not hold. 15 Meeker et al. 12 reviewed and proposed statistical methods for SHM, 16 extending the theory described in the MIL-HDBK-1823A. The authors demonstrate that the Length at Detection (LaD) 17 and the Repeated Measures Random Effects Model (REM²) 18 are valid statistical methods to handle SHM data. However, in both cases, the lack of data often hinders their application since it is challenging to manufacture and test many structures equipped with identical sensing systems. Model-Assisted POD (MAPOD) curves can reduce the amount of experimental data required. They allow the modeling of many types of variability sources, but the computational cost can be prohibitive due to the curse of dimensionality. Surrogate modeling can mitigate this problem and is already available in software such as CIVA. 19,20
A systematic literature review by the authors 21 shows that in SHM, POD curves were mainly applied to Guided Waves. 14,15,22-29 The same review 21 highlights that only a few POD studies on OFS are present, and no POD studies on DOFS are available. Grooteman developed a numerical model of a three-stringer thermoplastic composite panel instrumented with Fiber Bragg Gratings (FBGs) and computed the frequency shift in the eigenmodes. Then, using the modal strain energy as a damage indicator, the author generated a POD curve using the hit/miss approach. 33 Sbarufatti and Giglio 34 developed POD curves to quantify the performance of FBGs bonded onto an aluminum stiffened panel in terms of minimum detectable crack length. In that work, the authors compared the confidence interval for a population proportion method 35 with the one-sided tolerance interval (OSTI) for a normal distribution. 36 To the best of the authors' knowledge, no study presents a rigorous methodology for qualifying DOFS in different scenarios using POD curves. A first attempt at generating POD curves for DOFS was explored by Falcetelli et al. 37 This study substantially extends that preliminary research activity by establishing a systematic methodology based on the LaD method to qualify the damage detection performance of DOFS. The method is validated considering the use of DOFS for monitoring delamination in composite structures as a case study. Specifically, Single-Mode (SM) DOFS with ORMOCER® coating and Graded-Index Multimode (GIM) DOFS with a dual acrylate coating are surface mounted onto CFRP double cantilever beam (DCB) specimens under Mode I quasi-static and fatigue loading.
POD curves developed with the LaD method are used to evaluate the performance of the monitoring system of the two DOFS types in the two loading conditions. The results confirm that both strain transfer and loading conditions affect POD curves and prove that the proposed methodology can quantify the damage detection performance of DOFS in different scenarios.
Moreover, the authors introduce a practical approach to evaluating the required number of specimens based on the expected level of uncertainty. This method serves a dual purpose: by introducing the concept of virtual specimens, it can also be used to compare POD curves generated from different sample sizes. This dual functionality might be of great help in real applications, where homogeneous datasets are rare.
The final aim of this article is not to promote the use of specific DOFS for delamination detection but rather to develop a comprehensive methodology to assess the performance of DOFS and show the implications of conducting a POD study in SHM using DOFS. The proposed methodology aims to provide the SHM community with a reference procedure required to deploy DOFS in composite aircraft structures.
The article is organized as follows: Section ''Materials and methods'' presents the methodologies of this work, including the basic notions about POD in SHM, the method to estimate the required number of specimens and to compare POD curves produced from various sample sizes, the working principle of DOFS with Rayleigh backscattering, and the experimental methodology employed in the study; Section ''Results'' presents the results of the research; Section ''Discussion'' discusses the results and highlights their implications; Section ''Conclusions'' retraces the main stages of the article and suggests potential future research activities.

Materials and methods
POD for SHM: LaD method
The MIL-HDBK-1823A 12 describes how to derive POD curves using the well-known $\hat{a}$ versus $a$ method. As an example, Figure 1 shows the output of this methodology using synthetic data. The critical $a_{90}$ value represents the crack length whose POD equals 90%. On the other hand, $a_{90/95}$ represents the same concept but refers to the POD lower bound. Hence, it can be thought of as the crack length that is detected with a POD of 90% at a 95% confidence level. The difference between $a_{90/95}$ and $a_{90}$ reflects the uncertainty associated with the experiments.
Unlike in NDE, where all the observations can be considered independent, in SHM engineers may have to deal with repeated measurements. In this case, the hypothesis of statistical independence of the observations no longer holds, making it impossible to apply the standard linear regression model used for POD in NDE studies. 21 The LaD method avoids the issue of dealing with repeated measurements by considering only one observation among the time series of data. Similar to the standard method proposed in the MIL-HDBK-1823A, 12 detection is positive if a measurement is above a given threshold. However, only the measurements at which damage (delamination in this study) is detected for the first time are considered. The crack lengths at detection lie on the threshold, potentially following different statistical distributions. Assuming that the population is normally distributed, the POD curve can be determined as a function of the crack length using Equation (1):

$$\mathrm{POD}(a) = \Phi_{\mathrm{norm}}\left(\frac{a - \bar{x}}{s}\right) \quad (1)$$

The symbol $\Phi_{\mathrm{norm}}$ denotes the standard normal cumulative distribution function; $\bar{x}$ and $s$ represent the sample mean and standard deviation of the lengths at detection, respectively, which differ from the true mean and the true standard deviation, both of which are unknown. One of the available statistical methods to produce confidence intervals is the OSTI approach. 17 The tolerance bound $T$ for a certain quantile of a normally distributed population can be obtained from Equation (2):

$$T = \bar{x} + k_{n,\gamma,\alpha}\, s \quad (2)$$

The tolerance factor, $k$, is the key term in Equation (2) and depends on three parameters: $n$, $\gamma$, and $\alpha$. The first parameter, $n$, represents the number of samples. In this study, as outlined in Section ''Experimental setup,'' the number of samples is equivalent to the total number of DOFS segments bonded to the specimens. It can be demonstrated that $k$ decreases as $n$ increases. 38 The second parameter, $\gamma$, controls the desired confidence level and is set to 95% in most cases. Finally, the third parameter, $\alpha$, defines the coverage level. In POD studies, the value of $\alpha$ is usually equal to 90%.
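As an illustration of Equations (1) and (2), the following minimal Python sketch computes the POD curve, $a_{90}$, and $a_{90/95}$ from a set of lengths at detection. It assumes that the tolerance factor is obtained from the non-central t-distribution as $k = t_{n-1,\gamma,\delta}/\sqrt{n}$ with $\delta = z_{\alpha}\sqrt{n}$, which is the standard expression for a one-sided normal tolerance bound. The function names and the sample values are hypothetical and are not taken from the experimental campaign.

```python
import numpy as np
from scipy.stats import norm, nct


def tolerance_factor(n, gamma=0.95, alpha=0.90):
    """One-sided tolerance factor k for a normal population (assumed form)."""
    delta = norm.ppf(alpha) * np.sqrt(n)  # non-centrality parameter
    return nct.ppf(gamma, df=n - 1, nc=delta) / np.sqrt(n)


def lad_pod(lengths_at_detection, gamma=0.95, alpha=0.90):
    """Return the POD function, a90, and a90/95 for the given lengths at detection."""
    x = np.asarray(lengths_at_detection, dtype=float)
    x_bar = x.mean()
    s = x.std(ddof=1)  # sample standard deviation

    def pod(a):
        # Equation (1): normal CDF of the standardized crack length
        return norm.cdf((np.asarray(a, dtype=float) - x_bar) / s)

    a90 = x_bar + norm.ppf(alpha) * s                            # POD(a90) = alpha
    a90_95 = x_bar + tolerance_factor(x.size, gamma, alpha) * s  # Equation (2)
    return pod, a90, a90_95


# Hypothetical lengths at detection in mm, for illustration only.
pod, a90, a90_95 = lad_pod([3.1, 4.0, 4.4, 5.2, 5.9, 6.3])
```

By construction, `a90_95` is always larger than `a90`, and the gap between the two shrinks as more lengths at detection become available.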
It should be noted that the LaD is not the only method to derive POD curves. For instance, the REM² method 16,18 is a valid alternative that allows more efficient data use. Nevertheless, when fewer than ten observations are available, issues arise in fitting a five-parameter model such as the REM². 21,39 In these kinds of studies, where testing many specimens or even real structures becomes costly and time-consuming, the LaD approach seems more appropriate.

Required number of specimens
Here the authors propose a practical scheme to assess the number of samples, $n$, required for the experimental campaign. The strategy is to first perform a pilot study with the few samples required to compute $\bar{x}$ and $s$. Then, $\Delta = a_{90/95} - a_{90}$ is iteratively computed, leveraging the properties of the non-central t-distribution, 40 and increasing the number of samples, $n$, in each cycle. Once $\Delta$ no longer exceeds the imposed tolerance value, the algorithm exits the while loop and returns the required number of samples. The corresponding pseudo-code is represented in Table 1, where $t_{n-1,\gamma,\delta}$ is the inverse cumulative distribution function, evaluated at $\gamma$, of the non-central t-distribution with $n-1$ degrees of freedom and non-centrality parameter $\delta$.

Table 1. Pseudo-code for the required number of specimens (steps 4 to 7).
Step | Description
4 | Non-centrality parameter of the non-central t distribution
5 | n = n + 1: increase the number of specimens by 1
6 | Computation of the tolerance factor k
7 | End of the While Loop

Simulating the effect of virtual specimens on the lower bound
As the $a_{90/95}$ value decreases with $n$, in principle it would only be possible to compare POD curves obtained from equal sample sizes. This aspect can be a limiting factor when acquiring data is particularly expensive and time-consuming. Therefore, in real applications, there is a need to consistently compare POD curves generated from different numbers of test structures. The problem can be tackled by virtually augmenting the number of samples to a common value. The first step is to use the LaD method to compute the values of $\bar{x}$ and $s$. Then, assuming that the tested specimens properly captured the main variability sources affecting the experimental setup, it is possible to simulate the effect of an increasing number of specimens. Using the same equations described in Table 1, one can compute the tolerance factor $k$ at different $n$ values while keeping the values of $\bar{x}$ and $s$ fixed. Introducing these virtual specimens shrinks the lower bound toward the POD curve, potentially allowing POD curves obtained from small datasets to be compared with others obtained from larger sample sizes. The results of the proposed simulation must be taken cautiously: based on user expertise and the available previous knowledge, one must judge whether the initial hypothesis, that the original specimens properly capture the inherent variability of the experimental setup, holds.
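The two procedures described above (the required number of specimens and the virtual augmentation of the sample size) can be sketched in Python as follows. This is a minimal illustration under the assumption that $k = t_{n-1,\gamma,\delta}/\sqrt{n}$ with $\delta = z_{\alpha}\sqrt{n}$, so that $\Delta = (k - z_{\alpha})\,s$; it is not a transcription of Table 1. The function names, the starting sample size, and the iteration cap `n_max` are hypothetical choices.

```python
import numpy as np
from scipy.stats import norm, nct


def tolerance_factor(n, gamma=0.95, alpha=0.90):
    """One-sided tolerance factor k for a normal population (assumed form)."""
    delta = norm.ppf(alpha) * np.sqrt(n)  # non-centrality parameter
    return nct.ppf(gamma, df=n - 1, nc=delta) / np.sqrt(n)


def required_specimens(s, tol, n_start=3, n_max=1000, gamma=0.95, alpha=0.90):
    """Smallest n for which Delta = a90/95 - a90 does not exceed tol.

    s is the sample standard deviation estimated in the pilot study.
    """
    z = norm.ppf(alpha)
    n = n_start
    # Delta = (k - z) * s does not depend on the sample mean.
    while (tolerance_factor(n, gamma, alpha) - z) * s > tol:
        n += 1
        if n > n_max:
            raise ValueError("tolerance not reached within n_max specimens")
    return n


def virtual_lower_bound(x_bar, s, n_virtual, gamma=0.95, alpha=0.90):
    """a90/95 recomputed for n_virtual samples with x_bar and s kept fixed."""
    return x_bar + tolerance_factor(n_virtual, gamma, alpha) * s
```

Calling `virtual_lower_bound` with a common `n_virtual` for every case study places the lower bounds on an equal footing, provided that the pilot estimates of $\bar{x}$ and $s$ are representative.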

Specimens manufacturing
The DCB coupons were produced following the guidelines described in the ASTM D5528 standard. 41 The AS4 HexPly 8552® unidirectional carbon prepreg 42 was employed to fabricate a 300 mm square panel with a $[0_{24}]$ stacking sequence by hand layup. A 12 µm Teflon™ film was placed at the panel mid-plane during lamination. This non-adhesive insert served as an initiation site for the delamination, providing an initial crack length of 50 mm. The specimens were cut from the panel using an automated Proth® cutting machine such that 25 mm strips were obtained. Ad hoc loading blocks were machined, matching the specimen width of 25 mm. Before bonding, the loading block surface was sandblasted, whereas the bonding surface of the specimens was lightly abraded with sandpaper. Impurities were removed with an alcohol solution, and the 3M™ Scotch-Weld™ EC-9323 structural epoxy adhesive 43 was used for bonding.

Optical fiber sensors
Two types of DOFS were used in this study: SM DOFS with ORMOCER® 44 coating, produced by FBGS Technologies GmbH (Jena, Germany), and GIM DOFS, produced by Plasma Optical Fibre (Eindhoven, The Netherlands). These were connected via LC/APC connectors to an ODiSI-B interrogator, 45 developed by LUNA Innovations Inc. (Roanoke, VA, USA). The interrogator uses swept-wavelength coherent interferometry to measure Rayleigh backscattering, 9,46,47 which originates from non-propagating material-density fluctuations. 48 The scattered light exhibits a repeatable profile that is sensitive to longitudinal strain, $\varepsilon$, and temperature variation, $\Delta T$. By correlating the scattered light profile before (baseline) and after (testing) a certain perturbation, it is possible to compute the spectral shift, $\Delta\nu$, or the variation in the resonance wavelength, $\Delta\lambda$, of the scattered light according to Equation (3):

$$\frac{\Delta\lambda}{\lambda} = -\frac{\Delta\nu}{\nu} = K_T\,\Delta T + K_\varepsilon\,\varepsilon \quad (3)$$

where $K_T$ and $K_\varepsilon$ are the temperature and strain calibration constants, and $\lambda$ and $\nu$ are the mean optical wavelength and frequency. Equation (3) resembles the response of an FBG sensor. However, in this case, strain and temperature changes can be computed as a function of the fiber length with a certain spatial resolution, $\Delta x$, rather than just at the grating location. In this study, the interrogator was set up with a sampling frequency of 23.8 Hz and $\Delta x$ equal to 0.65 mm.
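To make the relation between spectral shift and strain concrete, the short sketch below solves $-\Delta\nu/\nu = K_T\,\Delta T + K_\varepsilon\,\varepsilon$ for $\varepsilon$. The numerical values are assumptions: a center wavelength of 1550 nm, $K_\varepsilon = 0.78$, and $K_T = 6.45 \times 10^{-6}$ per degree Celsius are typical figures quoted for germanosilicate fibers and are not taken from this study. The function name is hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def strain_from_spectral_shift(dnu_ghz, dT=0.0, wavelength_nm=1550.0,
                               K_eps=0.78, K_T=6.45e-6):
    """Strain (dimensionless) from a spectral shift in GHz.

    All default constants are assumed typical values, not calibrated ones.
    """
    nu_ghz = C_LIGHT / (wavelength_nm * 1e-9) / 1e9  # center frequency in GHz
    return (-dnu_ghz / nu_ghz - K_T * dT) / K_eps


# With these assumed constants, a shift of -1 GHz at constant temperature
# corresponds to a tensile strain of a few microstrain.
eps = strain_from_spectral_shift(-1.0)
```

In practice, the calibration constants depend on the fiber and coating and should be determined for the specific sensor in use.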

Experimental setup
Many studies propose analytical solutions for DCB specimens. The simplest analytical solution considers the DCB arms as cantilever beams clamped at the crack tip. 49 Both the Euler-Bernoulli beam theory and the Timoshenko beam theory can be used, with the latter providing more accurate results (Euler-Bernoulli-based solutions are a special case of Timoshenko-based solutions when the shear stiffness becomes infinite). 50 The plot in Figure 2(a) shows a qualitative representation of the theoretical (Euler-Bernoulli solution) and expected measured experimental strain profiles along the longitudinal direction (x-axis) of the DCB specimen. In the scheme shown in Figure 2, Seg. #1, Seg. #2, and Seg. #3 denote the three bonded segments present in each specimen. The configuration was chosen to minimize the bending radii of the DOFS. Figure 3 shows an example of a DCB specimen used in the fatigue test and the DOFS positioned above its top surface. The DOFS were bonded using ThreeBond 1742® cyanoacrylate adhesive. 51 Before testing, one side of the DCB coupons was coated with a thin layer of white spray paint. After drying, vertical lines spaced 1 mm apart were drawn as a reference for visually estimating the crack length from the camera. An extra vertical mark was placed at the crack tip after the pre-cracking procedure explained in the D5528 standard. 41 Figure 4 shows a picture captured by a 9-megapixel camera positioned in front of the specimen.
The true crack length is estimated by exploiting its relationship with the compliance $C$, which is defined as the ratio between the load point displacement $\delta$ and the load $P$ applied to the DCB specimen. As explained in the D5528 standard, 41 and shown in Sans et al., 52 there is a linear relationship between the cube root of $C$ and the crack length $a$:

$$C^{1/3} = c_1\, a + c_2 \quad (4)$$

where $c_1$ and $c_2$ are the fitting parameters of the linear model. Therefore, once a sufficient number of observations is available, the linear model can be fitted, allowing subsequent crack lengths to be estimated from the $C$ values (available at each time step) without inspecting hundreds of images. For example, Figure 5 shows the linear regression performed on specimen number 4, along with confidence and prediction intervals. As shown in the zoomed view, the data fall inside the prediction intervals.
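The compliance-based estimation of the crack length can be sketched as follows. The example assumes the linear model is written as $C^{1/3} = c_1 a + c_2$ and fitted by ordinary least squares; the function names and the synthetic calibration data are hypothetical.

```python
import numpy as np


def fit_compliance_model(a_obs, C_obs):
    """Fit C^(1/3) = c1 * a + c2 to visually measured crack lengths a_obs."""
    c1, c2 = np.polyfit(a_obs, np.cbrt(C_obs), deg=1)  # slope, intercept
    return c1, c2


def crack_length_from_compliance(C, c1, c2):
    """Invert the fitted model to estimate the crack length from C = delta / P."""
    return (np.cbrt(C) - c2) / c1


# Synthetic calibration data (mm and mm/N), for illustration only.
a_cal = np.array([50.0, 55.0, 60.0, 65.0, 70.0])
C_cal = (0.01 * a_cal + 0.2) ** 3
c1, c2 = fit_compliance_model(a_cal, C_cal)
a_est = crack_length_from_compliance(C_cal, c1, c2)
```

Once `c1` and `c2` are available, every new compliance value recorded by the test machine yields a crack length estimate without further image analysis.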

Data structure
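Strain data acquired during the static and fatigue tests for the i-th DOFS segment of the j-th specimen are organized in a matrix $S_i^j$ as follows:

$$S_i^j = \begin{bmatrix} \varepsilon(t_1, x_1) & \cdots & \varepsilon(t_1, x_N) \\ \vdots & \ddots & \vdots \\ \varepsilon(t_M, x_1) & \cdots & \varepsilon(t_M, x_N) \end{bmatrix} \quad (5)$$

where $t$ and $x$ represent the time at which the measurement was taken and the location along the x-coordinate, respectively, and $M$ and $N$ denote the number of time steps and of sensing points. The columns of $S_i^j$ can be interpreted as the time history of a single sensing element, whereas each row shows the strain profile along the fiber segment at a certain moment.
Similarly, the crack length is organized in a vector $a_c^j$, where the subscript $c$ indicates that the crack length has been estimated from the compliance with Equation (4).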
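The row and column convention of the strain matrix can be illustrated with a small NumPy array. The array sizes and values below are placeholders; only the indexing convention (rows are time steps, columns are sensing points) follows the description above.

```python
import numpy as np

n_times, n_points = 4, 6  # hypothetical numbers of time steps and sensing points

# Placeholder strain matrix for one DOFS segment of one specimen:
# S[m, k] is the strain at time step m and sensing point k.
S = np.arange(n_times * n_points, dtype=float).reshape(n_times, n_points)

time_history = S[:, 2]    # column: time history of the third sensing point
strain_profile = S[1, :]  # row: strain profile along the segment at the second time step

# Crack length vector for the same specimen, one entry per time step (placeholder values).
a_c = np.linspace(50.0, 53.0, n_times)
```

Keeping `a_c` aligned with the rows of `S` makes it straightforward to pair each strain profile with the crack length at the same time step.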

Static tests
A Zwick-20 kN tensile test machine was used for static testing, as shown in Figure 6. The Zwick software was set up to synchronize the LUNA system and the camera. The tensile load was applied at a displacement rate of 1 mm/min. A sampling frequency of 0.5 Hz was used to collect data. The first experimental campaign used a total of five DCB specimens equipped with SM DOFS with ORMOCER® coating. Since three optical fiber segments are bonded onto each specimen, the number of linear regressions available to build POD curves is three times the number of specimens.
The same methodology was applied in a second experimental campaign, where six specimens equipped with GIM DOFS were tested. Preliminary results revealed that GIM optical fibers are more sensitive to small bending radii. As a result of the repeated bending of the optical fiber, the configuration shown in Figure 2 would have resulted in an unsatisfactory signal-to-noise ratio. Therefore, in this case, only one central optical fiber segment was bonded in the specimen.

Fatigue test
An experimental fatigue test campaign was carried out on three specimens, where SM DOFS with ORMOCER® coating were surface bonded using the scheme previously shown in Figure 2. The DCB specimens were mounted in an MTS-10 kN Elastomer hydraulic test machine equipped with a 10 kN load cell. The whole experimental setup is shown in Figure 7.
The fatigue tests were performed in load control. Figure 8 shows a schematic overview of how the cyclic loading was applied to the DCB specimens. Preliminary fatigue tests using DCB specimens manufactured from the same CFRP laminate were performed to assess the optimal load level to be used during fatigue testing. This preliminary study found that 80% of the pre-cracking load was the optimum load level for delamination growth. Lower loads would have led to very slow delamination growth, whereas higher loads would have resulted in unstable delamination growth, which is not suitable for developing POD curves.
The MTS software was programmed to reach 80% of the pre-cracking load, P, with a ramp. Then, after every 500 cycles at 5 Hz with a loading ratio, R, equal to 0.1, the test was paused, and a trigger signal was sent to both the LUNA system and the camera to allow synchronized DOFS measurement and crack length estimation, respectively. This scheme was necessary because the DOFS signal-to-noise ratio degrades with vibrations, and acquiring clean data without interrupting the test is difficult. Moreover, this acquisition configuration guarantees that DOFS measurements are acquired under the same applied load on the specimens during the fatigue test, which is desirable since the damage index (defined in Section ''Damage index definition'') is load-dependent. Although this choice brings some difficulties because the crack propagation may become unstable, it ensures that the damage index depends only on the crack propagation and not on the applied load.

Damage index definition
The first step in developing POD curves is to identify a proper damage-sensitive feature. From theory, it is possible to predict that the stress field reaches its maximum compressive value at the crack tip. Therefore, the strain value at the crack tip is a potential damage-sensitive feature. For a generic delamination value, and thus a generic time value $t$, it is possible to define a damage index $DI_t$ as the strain measured at the crack tip, which corresponds to the lower peak of the strain profile acquired at time $t$. Figure 9 shows an example of the strain profiles obtained using DOFS at different times in the static test. The black stars, placed at the lower peak of each strain profile, highlight the crack tip position and its propagation as the delamination grows. Due to the non-linear strain transfer occurring between the specimen and the optical fiber, 11 and the distortion in the measured strain due to the interrogator resolution, the strain does not decrease linearly with the delamination length.
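A possible way to extract such a damage index from the strain matrix is sketched below: for each time step, the lower peak of the strain profile is located and its magnitude is stored. Taking the absolute value, so that the index grows as the compressive peak deepens, is an assumption made here for illustration, as are the function name and the sample values.

```python
import numpy as np


def damage_index(S):
    """Return the damage index per time step and the index of the lower strain peak.

    S is a strain matrix whose rows are strain profiles at successive time steps.
    The absolute value of the peak is an assumed sign convention.
    """
    S = np.asarray(S, dtype=float)
    tip_idx = S.argmin(axis=1)   # sensing point of the lower peak (crack tip)
    di = np.abs(S.min(axis=1))   # magnitude of the strain at the lower peak
    return di, tip_idx


# Two placeholder strain profiles (microstrain), for illustration only.
di, tip_idx = damage_index([[0.0, -10.0, -2.0],
                            [0.0, -3.0, -20.0]])
```

The returned `tip_idx` also tracks how the crack tip moves along the fiber segment from one acquisition to the next.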

Results
Static test
SM optical fibers. Figure 10 shows the application of the LaD method to SM DOFS with ORMOCER® coating. The abovementioned damage index behaves linearly with respect to the crack length, and linear regression is performed for every damage index vector $DI_i^j$. Since every regression line has its own intercept and slope, the between-segment and between-specimen variability is considered in the model.
In Figure 10, the abscissa is zero at the onset of the bonding length of each DOFS segment. The threshold was chosen by quantifying the noise level in preliminary experiments. Specifically, three standard deviations of the noise data were added to the highest intercept of the regression lines. This procedure avoids negative lengths at detection, which would be equivalent to saying that the crack was detected before it reached the bonded region of the DOFS, which should not be possible in principle.
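The threshold selection and the extraction of the lengths at detection can be sketched as follows. The example assumes that each length at detection is taken as the crack length at which the regression line of a segment crosses the threshold, and that every regression line has a positive slope. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def lengths_at_detection(segments, noise_std):
    """Compute the lengths at detection and the threshold.

    segments: list of (crack_length, damage_index) array pairs, one per DOFS segment,
              with crack_length measured from the onset of the bonded length.
    noise_std: standard deviation of the noise from preliminary experiments.
    """
    # One regression line per segment: damage_index = slope * crack_length + intercept
    fits = [np.polyfit(a, di, deg=1) for a, di in segments]
    # Threshold: highest intercept plus three noise standard deviations
    threshold = max(intercept for _, intercept in fits) + 3.0 * noise_std
    # Crack length at which each regression line reaches the threshold
    lad = np.array([(threshold - intercept) / slope for slope, intercept in fits])
    return lad, threshold


# Two synthetic segments with exact linear behavior, for illustration only.
a = np.array([0.0, 5.0, 10.0])
lad, threshold = lengths_at_detection([(a, 2.0 * a + 1.0), (a, 4.0 * a + 3.0)],
                                      noise_std=1.0)
```

Because the threshold lies above every intercept, all lengths at detection returned by this sketch are positive when the slopes are positive.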
The normality assumption of the lengths at detection can be verified using the Anderson-Darling test (Figure 11). The null hypothesis, $H_0$, states that the data follow a normal distribution. The null hypothesis can be rejected if, for a certain significance level, $\alpha$, the Anderson-Darling statistic, $A^2$, is greater than the critical value. In the present case, considering a significance level $\alpha = 0.05$ and a sample size $N = 13$, the critical value is equal to 0.679. The Anderson-Darling statistic was $A^2 = 0.298 < 0.679$. Hence, $H_0$ cannot be rejected, and the normality assumption is considered acceptable.
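For reference, the Anderson-Darling check can be reproduced with the sketch below, which computes $A^2$ for a normal distribution with estimated mean and standard deviation. The 5% critical value is obtained from the small-sample correction $0.787 / (1 + 4/N - 25/N^2)$, which reproduces the value of 0.679 quoted above for $N = 13$; this constant is an assumption that should be checked against the statistical reference in use. The function name and the sample are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def anderson_darling_normal(x):
    """Return the Anderson-Darling statistic A^2 and an assumed 5% critical value."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    z = (x - x.mean()) / x.std(ddof=1)  # standardize with sample estimates
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum (2i - 1) [ln F(z_i) + ln(1 - F(z_{n+1-i}))]
    a2 = -n - np.sum((2 * i - 1) * (norm.logcdf(z) + norm.logsf(z[::-1]))) / n
    crit_5pct = 0.787 / (1.0 + 4.0 / n - 25.0 / n**2)
    return a2, crit_5pct


# Placeholder sample of 13 values built from normal quantiles, for illustration only.
sample = norm.ppf((np.arange(1, 14) - 0.5) / 13)
a2, crit = anderson_darling_normal(sample)
reject_normality = a2 > crit
```

With small samples, the test has limited power, so a non-rejection should be read as the absence of evidence against normality and not as proof of it.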
Under the assumption that the crack lengths at detection, denoted as black squares in Figure 10, follow a normal distribution, it is possible to build a POD and its relative lower bound by applying Equations (1) and (2), respectively. Figure 12 shows the POD that was obtained using this methodology.
The identified values for $a_{90}$ and $a_{90/95}$ are reported in Figure 12.
GIM fibers. The same methodology used for SM DOFS in Section ''SM optical fibers'' is now applied to the static test data obtained with GIM DOFS. The LaD results are shown in Figure 13. Although it is difficult to verify the normality assumption using the Anderson-Darling test due to the low number of samples, the collected data are enough to show how a different strain transfer performance affects the resulting POD curve. The GIM DOFS has a dual acrylate coating whose stiffness is lower than that of the ORMOCER® coating of the SM DOFS. This results in a lower strain transfer performance and a higher discrepancy between the real strain (the one present on the specimen surface) and the measured strain (the strain present in the fiber core). Figure 14 displays the corresponding POD curve, with $a_{90}$ and $a_{90/95}$ equal to 13.03 and 18.56 mm, respectively. The poor strain transfer performance is reflected in $a_{90}$ and $a_{90/95}$ values that are significantly higher than in the previous static case.
Applying the method proposed in Section ''Required number of specimens,'' it is possible to show the convergence of the lower bound as the number of specimens increases (Figure 15). For example, when the number of specimens is equal to 97, the $a_{90/95}$ value reaches 14.02 mm. This procedure is useful for comparing experimental data collected from different sample sizes. As will be shown in Section ''Discussion,'' to compare the $a_{90/95}$ values of different experimental setups, the number of samples is virtually augmented to 30 in all cases.

Fatigue test
Figure 16 shows the LaD method applied to the fatigue test data. The variability within segments of the same specimen and between different specimens is more pronounced than in the static case, even though the same type of SM DOFS (with ORMOCER® coating) was used. The corresponding POD curve is shown in Figure 17, with $a_{90}$ and $a_{90/95}$ equal to 5.88 and 7.82 mm, respectively.
The data highlight that variability due to both between-specimen and within-specimen heterogeneity is present. The first two specimens (black and blue in Figure 16) have more data points than in the static case because of the large number of samples acquired, one every 500 cycles. On the other hand, the third specimen (magenta in Figure 16) has few data points because the crack propagated beyond the bonded region of the DOFS right after the application of the pre-cracking load and propagated faster than in the previous two cases.
Compared to the static case, the measured strain is lower because the specimens were fatigue loaded at 80% of the pre-cracking load, P. This translates into a lower signal-to-noise ratio and consequently lower DI values, producing higher $a_{90}$ and $a_{90/95}$. Table 2 summarizes the results in terms of $a_{90}$ and $a_{90/95}$ for the different case studies.

Discussion
Comparison of POD curves
Different optical fibers in the same loading configuration exhibit different $a_{90}$ values, as shown in Table 2. GIM DOFS with the dual acrylate coating have lower strain transfer performance than SM fibers with ORMOCER® coating, which is reflected in the higher $a_{90}$ value. Note that the loading condition also plays a significant role in these metrics, even when the fiber is the same. This can be seen by comparing the first and the third columns of Table 2. In fatigue loading, the POD curves are worse than in the static case, with higher values of $a_{90}$.

Comparison of POD lower bounds with virtual samples
The difference between $a_{90/95}$ and $a_{90}$, $\Delta$, can be considered a measure of the variability sources involved in the experiments. Indeed, as shown in Equation (2), the position of the lower confidence bound is a function of the standard deviation, $s$, of the lengths at detection.
In this study, the number of DOFS segments (samples) in each case is different. This situation is likely to occur in real applications due to the availability of different DOFS or, for example, a limited amount of time to perform fatigue tests compared to static tests.
As described in Section ''Required number of specimens,'' since the tolerance factor k decreases as n increases, a higher number of samples, n, shrinks the difference between the POD curve and its lower bound, decreasing $a_{90/95}$ and correspondingly $\Delta$. For example, Figure 15 shows how the sample size affects the $a_{90/95}$ value. The result was obtained using the pseudo-code developed in Table 1 (Section ''Required number of specimens''). However, performing long and expensive experimental activities is not always possible. In such a case, the only way to lower $a_{90/95}$ is to redesign the experimental setup and reduce the associated variability sources to achieve a lower sample standard deviation.
Nevertheless, it would be interesting to compare the results obtained in this research with the same number of samples for each case study. Following the procedure outlined in Section ''Simulating the effect of virtual specimens on the lower bound,'' the authors virtually augmented the number of samples of the different case studies to 30 units. Under the assumption that the experimental data correctly captured the variability sources involved in the experimental setup, this methodology allows a fair comparison between the different cases, eliminating a potential bias due to the different sample sizes. Applying this procedure to the data in Table 2 yields the results in Table 3.
The results confirm what is already seen in Table 2, although the differences among the case studies are less pronounced.
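The effect of virtual specimens on the lower bound can be sketched as follows. The snippet assumes that the virtual augmentation keeps the experimental sample mean and standard deviation fixed and only replaces the real sample size with the virtual one when evaluating the tolerance factor. This is one reading of the procedure, not a reproduction of the authors' implementation. The tolerance factor again uses a closed-form handbook approximation, and the case names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def tolerance_factor(n, coverage=0.90, confidence=0.95):
    """Approximate one-sided normal tolerance factor k (handbook formula)."""
    z_p = NormalDist().inv_cdf(coverage)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1.0 - z_g**2 / (2.0 * (n - 1))
    b = z_p**2 - z_g**2 / n
    return (z_p + math.sqrt(z_p**2 - a * b)) / a


def a9095(sample_mean, sample_std, n):
    """a90/95 for a sample of size n, assuming normal lengths at detection."""
    return sample_mean + tolerance_factor(n) * sample_std


N_VIRTUAL = 30  # common virtual sample size used for the comparison

# Hypothetical (mean in mm, standard deviation in mm, real sample size)
cases = {
    "case A": (5.0, 0.8, 8),
    "case B": (7.5, 1.1, 14),
}

results = {}
for name, (m, s, n_real) in cases.items():
    real = a9095(m, s, n_real)
    virtual = a9095(m, s, N_VIRTUAL)  # same mean and std, larger n
    results[name] = (real, virtual)
    print(f"{name}: a90/95 = {real:.2f} mm (n={n_real}) "
          f"-> {virtual:.2f} mm (n={N_VIRTUAL})")
```

With the mean and standard deviation held fixed, the lower bound moves toward the POD curve as n grows. The case with fewer real samples therefore benefits more from the augmentation, which would be consistent with smaller differences between case studies after equalizing the sample size.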

Interpretation and implications of the results
The DOFS type proved to be a determining factor in the POD analysis, which can be directly correlated to the different strain transfer properties. The loading type is also shown to be a key variable. This is not surprising since DOFS are sensitive to strain, which depends on the applied load. The larger scatter in the fatigue data can be attributed to a lower signal-to-noise ratio, for two reasons. First, the test itself involves a higher amount of noise due to vibrations. Second, the crack propagates at a lower load, thus further reducing the signal-to-noise ratio. Moreover, the mechanisms involved in delamination growth are different in fatigue loading compared to quasi-static loading. 2 For example, a different amount of fiber bridging can affect the strain field in the process zone, 53 thus affecting the damage index and the POD parameters.
This result suggests that the loading mode could also lead to different POD curves. Indeed, different mode mixities of Mode I and Mode II would affect the process zone and the strain profile, thus affecting the damage index. In such a case, a novel and more appropriate damage index should be developed because the strain at the crack tip may no longer be the best damage-sensitive feature.
Temperature variations are not considered in this study but are expected to be determinant in the POD analysis due to the relation between $\Delta T$ and $\Delta\lambda$ given in Equation (3). More generally, variations of Environmental and Operational Conditions (EOCs), damage morphology, sensor drift due to degradation (of the sensor and of the coupling), and additional variability sources dependent on the specific application will certainly affect POD curves. Therefore, it is essential to raise awareness about the limitations of the results and perform sensitivity studies to address the influence of the most determinant parameters.

Upscaling POD curves
In real applications, it may be impractical to test a sufficiently high number of structures to perform a statistically consistent POD study for DOFS. Indeed, one would need to produce a large number of identical complex structures, each equipped with an identical DOFS setup. Even though the proposed methodology was developed considering DOFS in laboratory case studies, it offers a framework for assessing POD curves in real applications in two different ways.
First, it is possible to use the same methodology as a basis to derive MAPOD for DOFS. This could be achieved by simulating the outcome of the LaD method given the noise level, the loading conditions, and the strain transfer properties of the DOFS-structure mechanical system. The variability sources can be modeled by assigning a probability distribution to each of the most critical parameters.
Second, POD curves obtained at the coupon level could be transferred to the structure level to monitor a specific damage type. The objective is to use the proposed methodology and build an experimental setup that mimics the local perturbation caused by damage in the strain field of a real structure. For example, in a hot spot monitoring scenario, where the structure is expected to fail due to mode-I delamination, the POD curves obtained from equivalent DCB specimens can provide an acceptable estimate of the damage detection performance of the system in the real application.

Conclusions
To the best of the authors' knowledge, this is the first time an experimental POD study has been performed for DOFS based on Rayleigh backscattering. The study proposed a methodology to develop POD curves using the LaD method focusing on delamination, which is one of the major causes of failure for composites. Mode I static and fatigue loading experiments were performed on DCB specimens with two types of DOFS (SM fibers with ORMOCER® coating and GIM fibers with dual acrylate coating).
Better POD curves could probably be obtained by using stiffer adhesives, redesigning the experimental setup to have lower noise, or using DOFS with higher strain transfer properties. However, the case studies presented here only serve as examples to show the implications of performing a POD study in SHM using DOFS. The goal is to develop an easily reproducible methodology to assess the performance of DOFS and to draw the attention of the SHM community to this topic, which is often underestimated.
The following bullet points summarize the main findings of this research:
• Both loading conditions and DOFS type affect the performance in delamination detection.
• POD curves for DOFS can also be sensitive to different loading modes, damage types, and laminate stacking sequences, dramatically increasing the problem complexity compared to classical NDE applications.
• The LaD model proved effective in producing POD curves for DOFS, but the normality assumption is difficult to verify as the sample size decreases. Other POD models, such as the REM, do not require any normality assumption but are difficult to fit with small sample sizes.
• In many cases, the only feasible solution is to derive a MAPOD. The proposed framework, combined with preliminary knowledge regarding the most frequent damage modes in the structure, could be used to develop MAPOD for DOFS.
• The study provides a practical approach to estimating the required number of samples for the POD study. The same approach can be used to simulate the lower bound convergence, imposing a certain number of virtual samples to compare POD curves obtained from different sample sizes. Caution must be taken in interpreting the results since the underlying assumption is that the available samples properly captured the variability. The presence of unexpected variability sources that are not captured in the experiments, such as varying EOCs, leads to unconservative results.
Based on the findings of this work, further research is needed and should be devoted to the following aspects:
• development of multi-dimensional POD curves varying the mode mixity between Mode I and Mode II for delamination;
• development of a MAPOD framework for DOFS;
• linking the concepts of strain transfer and POD curves;
• development of compensation strategies for varying EOCs, sensor drift, and other variability sources potentially affecting POD curves;
• analysis of the upscaling potential and limitations of such a methodology, in terms of both structural complexity and loading complexity.
The final aim of this work is to spark a constructive debate in the SHM community about developing the most appropriate methodologies to certify DOFS for damage detection using POD curves.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.