Ultrasonic Guided Wave Estimation of Minimum Remaining Wall Thickness Using Gaussian Process Regression

Ultrasonic Guided Waves (UGW) offer the possibility of inspecting a strip across a structure rather than just the point under a traditional bulk wave transducer. This can increase the rate of inspection and enable inspection under obstructions. This paper investigates the instantaneous phase characteristics of the shear horizontal guided waves for various defect depths and widths. The Gaussian process regression is then evaluated for estimating the minimum remaining wall thickness between a pair of transducers. A Gaussian process regression model is built using the fusion of large-scale simulated and low-scale real experimental data. For this purpose, a more precise model of an electromagnetic acoustic transducer is initially built by integrating both electromagnetic and elastic wave fields. Then the simulated data set is built after having been calibrated using a genetic algorithm. The examination of an unseen simulated evaluation data set shows that 96% of data has an error during thickness gauging of less than 10 per cent of wall thickness. Finally, an experimental testing data set containing three different defects with depths of 3.7, 5.7 and 9.2 mm was examined, resulting in a good depth prediction of large defects with < 1 mm error for defects wider than one wavelength.


Introduction
According to a study by the National Association of Corrosion Engineers, corrosion is estimated to cost the global economy around $2.5 trillion annually [1]. To avoid catastrophic failures, materials and structures should be regularly evaluated to identify corrosion defects that may drastically change their mechanical properties in service [2]. Conventional ultrasonic NDE for in-situ corrosion thickness gauging is typically carried out in a point-by-point manner using an ultrasonic bulk wave transducer, where the probe is scanned over every point on the surface. This method is the most suitable for some applications because of the small inspection area of the sample under investigation and the potentially high measurement accuracy. However, point-by-point inspection, which generates a huge amount of data with low information content, is a time-consuming approach for inspecting larger industrial assets such as the storage tanks and pipelines found in the oil and gas and power industries. Mobile robotic platforms are increasingly replacing manual inspection to make ultrasonic point-by-point inspection more efficient; however, by its very nature, point-by-point inspection will always be time-consuming [3,4].
A potential NDT solution without the constraints of traditional bulk ultrasound is to take advantage of ultrasonic guided waves (UGW), which are suitable for mid/long-distance inspection [5], material characterisation [6,7] and structural health monitoring [8]. This class of waves is appropriate for use on curved/irregular surfaces [9], such as composite skin-stringer materials [10], which can be complicated to inspect using bulk wave techniques. UGWs are guided by the upper and lower surfaces of the component under inspection. Unlike ultrasonic bulk waves, guided waves can also be used to inspect and detect defects such as corrosion in inaccessible regions. Khalili and Cawley have conducted comprehensive research on selecting guided wave modes [11]. They reported that the first shear horizontal (SH1) wave mode experiences a significant reflection from a wide-area gradual thinning defect when the minimum remnant thickness of the defect brings the frequency-thickness product below the SH1 cut-off. In contrast, the A1 Lamb wave mode offers the best performance for inspecting sharp, severe defects. Electromagnetic acoustic transducers (EMATs) are increasingly used to generate shear horizontal ultrasonic guided waves [9]. In contrast to contact piezoelectric ultrasonic transducers, EMATs can be used on conductive materials without any surface preparation or couplant, making them more appropriate for in-field inspection of industrial assets. Shear horizontal ultrasonic guided waves avoid much of the signal attenuation related to liquid loading, which is problematic for most Lamb wave modes, and are straightforward to generate with an EMAT [12].
Multiple factors should be considered for an effective and efficient ultrasonic guided wave inspection process, as summarised in Figure 1. These factors include: a) excitation factors such as the transducers, wave mode, excitation frequency and measurement setup; b) the feature extraction method and its linear and nonlinear damage-sensitive properties; and c) data visualisation strategies such as tomographic imaging or statistical models.
For example, Khalili and Cawley [11] considered both pulse-echo and pitch-catch modes to extract amplitude-based parameters of the recorded waveforms. Huthwaite [13] explored the phase velocity of multiple recorded signals scattered from a defect in the frequency domain to present a thickness map of the structure under inspection. Extraction of nonlinear ultrasonic signatures such as harmonics in a pitch-catch measurement has also been reported in [14] to identify and localise defects. To this day, there is no single, well-established solution for UGW inspection, and research is ongoing to identify the most robust parameters for UGW inspection in different application scenarios. Guided waves are still considered mainly a qualitative ultrasonic testing method, used primarily for screening purposes. For instance, simple guided wave inspection results still lack the information required to determine the type of defects present and their characteristics, such as those of the schematic defect in Figure 2, although plenty of robust signal processing techniques have been developed to identify the existence and location of defects [14]. The dimensions of the defect, such as its depth and lateral extent, are needed for estimating the remaining useful life of the structure under inspection. Therefore, this research aims to investigate a data-driven approach for a quantitative guided wave-based inspection system that can have applications in different industries such as manufacturing, oil and gas and energy. Knowing the defect characteristics, the system can then predict the remaining useful life of the structure.
Figure 2. Pitch-catch measurement setup to measure the minimum remaining wall thickness using guided waves; T and R stand for transmitter and receiver in the pitch-catch mode.
A number of publications have reported on the determination of remaining wall thickness using UGW and a simple pitch-catch or pulse-echo measurement.
Different damage-sensitive properties, including the cut-off frequency [15], phase/group velocity, and reflection and transmission coefficients, have also been evaluated. Table 1 summarises the content of several publications. Different types of investigation have been reported; for example, references [16,17] use a wide range of wavelengths to investigate the cut-off frequency of the SH1 mode, which requires a customised EMAT to provide this range of wavelengths (#1&2). Ref [18] (#8) used SH0 and SH1 waves and classified the defects into three categories of low, medium and high severity, while ref [19] (#9) investigated the correlation between some features and the defect depth/length. Based on this literature review, the available physics-based approaches are neither complete nor well established for estimating the remaining wall thickness at defects using standard EMATs and ultrasonic techniques. Machine learning techniques such as shallow and deep neural networks and Gaussian process regression [28] are being investigated for non-destructive testing and evaluation (NDT&E) and structural health monitoring (SHM) purposes in the context of ultrasonic inspection [29,30]; however, this technology is still at an early stage of research and development. ML publications have addressed the optimisation of data acquisition strategies [31], artefact suppression [32], damage detection [33,34] and damage classification [35]. In the context of damage quantification, Paixão et al. have explored the quantification of a delamination area in a composite structure using a Gaussian Process Regression (GPR) model [36]. In that work, damage-sensitive features are extracted by autoregressive models and the Mahalanobis squared distance.
As can be observed in Table 1, machine learning (ML)-based approaches [37] have also recently emerged in this area as a potential solution to address the defect depth; however, a large amount of labelled data for training is required. In practice, the collection of substantial labelled data is often expensive or even impractical in many NDT applications. To tackle this issue, the training process is mainly carried out using simulated data modelled in finite element method (FEM) software. Compared with real-world data, simulated data has the following advantages: a) cost-effective, b) easier to label, c) faster and practically scalable, and d) less wear and tear of transducers and inspection systems. Synthetic data sets generated from cloud-based FEM simulations can be created in hours using software platforms such as OnScale [38].
This manuscript brings together experimental observations and simulations in a logically connected way for the in-situ studies of structural integrity. The authors have previously investigated the integration of robotic vehicles with ultrasonic guided waves generated by non-contact transducers such as EMAT [39,40] and air-coupled transducers [41] in order to enhance the inspection process, measurement repeatability and the ability to access certain hazardous locations. This was followed by an investigation of guided wave mapping techniques such as Bayesian occupancy grid mapping to interpret SH ultrasonic data better using robotic platforms [42,43]. This research extends the authors' previous studies, creating a richer ultrasonic mapping of structures by characterising defects on the fly as the robot navigates on the structural assets. The contribution of this paper is an estimation of the minimum remaining wall thickness using shear horizontal guided waves in a pitch-catch mode, using a novel combination of GPR and SH guided waves. The final aim is to integrate this technique into a crawler-based inspection system for autonomous large area inspection, equipped with a pair of generation and reception transducers to highlight areas of significant wall loss using guided waves, and provide defect characteristics such as the defect depth. The specific novelty of the paper can be summarised as follows:
• Evaluation of instantaneous characteristics of UGW signals as a function of defect depth and width.
• Demonstration that the Gaussian Process Regression (GPR) technique can be used as a new estimation method to predict the remnant wall thickness using large-scale simulation data.
• Demonstration that the simulated GPR model can be used for estimating the remnant wall thickness of wide defects (≥ 100% of wavelength) using the SH0 wave mode in practice.
The remainder of this paper is organised as follows: Section 2 provides general background on Gaussian process regression.
Section 3.1 presents the implementation of the finite element analysis (FEA) model to simulate the SH guided waves, the evaluation of different signal features as a function of defect depth and width, and the analysis of the testing data set to obtain the GPR model. Section 3.2 details the processing steps to extract the instantaneous features and the evaluation of remnant thickness using the simulated data set. Section 4 covers the experimental evaluation of the algorithm and investigates the feasibility of exploring the simulated GPR model to predict the minimum remaining wall thickness of defects in an experiment. Section 5 concludes this paper and notes the limitations of the proposed algorithm as well as future areas of relevant work.

Theoretical background on Gaussian process regression
The Gaussian process (GP) is a probabilistic supervised machine learning framework that has been widely exploited for regression and classification tasks [44,45]. A Gaussian process is defined by its mean and covariance function (kernel function). In Gaussian process regression (GPR), the data can be divided into the training data $X$ with the corresponding known output $f(X)$, and testing data $X_*$ with the corresponding unknown output $f(X_*)$. Their prior mean $m(\cdot)$ and covariance $k(\cdot,\cdot)$ are expressed as the joint distribution [45]:
$$\begin{bmatrix} f \\ f_* \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} m(X) \\ m(X_*) \end{bmatrix}, \begin{bmatrix} K(X,X) & K(X,X_*) \\ K(X_*,X) & K(X_*,X_*) \end{bmatrix} \right) \tag{1}$$
where $K(X,X_*)$ denotes the matrix of covariances evaluated at all pairs of training and testing points, $K(X,X)$ and $K(X_*,X_*)$ are the covariances between the individual variates within $X$ and $X_*$, respectively, $f$ is the noise-free observations under training input $X$, and $f_*$ is the noise-free output under the test input $X_*$. Once the GP prior is specified, the GP posterior can predict the testing output $f_*$. It can be defined by the conditional distribution as below:
$$f_* \mid X, f, X_* \sim \mathcal{N}(\mu_*, \Sigma_*),$$
$$\mu_* = m(X_*) + K(X_*,X)\,K(X,X)^{-1}\,\big(f - m(X)\big),$$
$$\Sigma_* = K(X_*,X_*) - K(X_*,X)\,K(X,X)^{-1}\,K(X,X_*),$$
where $\mu_*$ and $\Sigma_*$ are the posterior mean and its covariance. An exponential covariance function with automatic relevance determination (ARD) is used for this purpose as follows:
$$k(x_i, x_j) = \sigma_f^2 \exp(-r), \qquad r = \sqrt{\sum_{m=1}^{d} \frac{(x_{im} - x_{jm})^2}{\sigma_m^2}},$$
where the hyperparameters $\sigma_f^2$ and $\sigma_m$ are the signal variance and a separate length scale for each predictor $m$, $m = 1,2,\dots,d$; $x_i$ is a sample input point and $x_j$ is some other sample input point within the data set.

Figure 3 shows a schematic of the pitch-catch measurement setup used in the simulation. The simulation was built using a 10 mm aluminium sample to be consistent with an available experimental sample for estimating the remaining wall thickness. In order to reduce the computational time, a quarter of the model was considered, making use of symmetry. The mesh of the model is rectangular with an element size of 1 mm, except in the defect region of the plate, where the mesh is finer with a size of 0.5 mm. The two mesh regions were joined using the "glue" command in OnScale. Particle velocity in the x-direction (in-plane component, see Figure 3) was measured for evaluation.
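As an illustration of the GPR background in Section 2, the following minimal sketch computes a GP posterior mean and covariance with an ARD exponential kernel. It is not the implementation used in this work: the constant prior mean, the jitter term, the toy data and the fixed hyperparameter values are all assumptions made for the example. In practice the hyperparameters would normally be chosen by maximising the marginal likelihood.

```python
import numpy as np


def ard_exponential(Xa, Xb, sigma_f, length_scales):
    """ARD exponential kernel: sigma_f^2 * exp(-r), r = length-scaled Euclidean distance."""
    diff = (Xa[:, None, :] - Xb[None, :, :]) / length_scales
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return sigma_f ** 2 * np.exp(-r)


def gp_posterior(X, f, X_star, sigma_f, length_scales, jitter=1e-8):
    """Posterior mean and covariance at X_star for noise-free observations f at X.

    ASSUMPTION: constant prior mean equal to the training average; the small
    jitter is added only for numerical stability of the Cholesky factorisation.
    """
    m = f.mean()
    K = ard_exponential(X, X, sigma_f, length_scales) + jitter * np.eye(len(X))
    K_s = ard_exponential(X, X_star, sigma_f, length_scales)
    K_ss = ard_exponential(X_star, X_star, sigma_f, length_scales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f - m))
    mu = m + K_s.T @ alpha                    # posterior mean
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                      # posterior covariance
    return mu, cov


# Toy data (hypothetical): two features mapped to a "remaining thickness" in mm.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(40, 2))
f_train = 10.0 - 3.0 * np.abs(X_train[:, 0]) + 0.5 * X_train[:, 1]
X_test = rng.uniform(-1.0, 1.0, size=(5, 2))

length_scales = np.array([1.0, 2.0])          # one length scale per predictor
mu_star, cov_star = gp_posterior(X_train, f_train, X_test, 5.0, length_scales)
```

The diagonal of `cov_star` gives the predictive variance at each test point, which is what makes a GP attractive when a confidence estimate on the predicted thickness is needed.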

Finite Element Analysis
In order to make a calibrated model of the measurement system, the finite element simulation consists of the following three parts:
• Simulation of a periodic permanent magnet (PPM) EMAT in COMSOL Multiphysics® [46] to calculate a more accurate Lorentz force distribution rather than using a uniform distribution [47].
• Three-dimensional finite element simulations of ultrasonic wave propagation in OnScale using inputs generated by COMSOL.
• Optimisation of the simulation parameters by combining a genetic algorithm with the OnScale software.
Note that the simulation involves both electromagnetic and wave propagation studies [25]. In this paper, the electromagnetic part was solved in COMSOL to obtain the Lorentz force distribution, and the 3D wave propagation part was solved in OnScale to reduce the computational time.

Simulation of PPM EMAT
To obtain an accurate representation of the Lorentz force field, the electromagnetic interaction of a PPM EMAT with the sample under inspection was modelled in COMSOL. This is a crucial step since the simulated data forms the foundation of the training dataset for the machine learning algorithm and, therefore, must match the data produced experimentally. The Lorentz forces are given by:
$$\mathbf{F}_L = \mathbf{J}_e \times \mathbf{B}_0, \tag{7}$$
where $\mathbf{B}_0$ is the static magnetic field and $\mathbf{J}_e$ the eddy current density induced in the plate [48]. Note that the terms appearing on the right-hand side of equation (7) are independent of each other. Therefore, two separate sub-models were developed, one simulating the magnetic field of the transducer and the other the eddy current density induced by the coil in the sample.
The first sub-model, developed using COMSOL's magnetic fields interface, computes the magnetic field distribution, see Figure 4, through the curl of the magnetic vector potential, $\mathbf{A}$:
$$\mathbf{B}_0 = \nabla \times \mathbf{A}.$$
This 3D simulation was set up to mimic the EMAT used in the experiment, with two rows of 6 magnets, each with alternating polarity and a spacing of 2.5 mm along the propagation direction (Z). The magnets are identical, with a remanent flux density $B_r = 0.21$ and relative permeability $\mu_r = 1$. The geometrical dimensions are given in Table 2. The magnets are surrounded by an air volume of 83×50×27 mm in the COMSOL model. Note that the spacing between the centres of magnets with the same polarity orientation is 25 mm in order to generate SH waves with a 25 mm wavelength. A rectangular mesh with an element size of 1 mm was used in this simulation to be consistent with the mesh size of the wave propagation model.
The second sub-model considers the action of the coil on an aluminium plate. This 2D simulation models the middle cross-section of the plate and the coil. For simplicity, the coil was modelled as two rectangles with dimensions of 20 mm (corresponding to half the length of the actual EMAT coil) by 0.315 mm (related to the diameter of the coil used in the EMAT), as shown schematically in Figure 5. These two rectangular coils correspond to the alternating current inside a racetrack EMAT coil. To increase the accuracy of the simulation, a custom coefficient form PDE (partial differential equation) [49] scheme was developed to model the complete equation of the source current density [50], whose parameters are the permeability, the conductivity, the relevant component (see Figure 3) of the magnetic vector potential, and the cross-sectional area of the coil. The output signal of a RITEC RPR-4000 Pulser/Receiver system (RITEC Inc., Warwick, RI) was replicated in the simulation and used as the current input.
The mesh is rectangular with an element size of 1 mm to be consistent with the wave propagation model, except within the skin depth of the plate underneath the coil, where the mesh is finer with a size of 0.07 mm to ensure that there is a sufficient number of elements to resolve the skin effect. Ultimately, this model computes the eddy current distribution at the surface of the plate [48]. The Lorentz forces, which are the primary source of the shear horizontal wave excitation, were then calculated using equation (7). The Lorentz force distribution at the sample surface was finally fed into OnScale as the excitation input for the 3D ultrasonic wave simulation, generating models such as the one shown in Figure 6a. Throughout this research, the samples were excited with a sinusoidal 3-cycle tone burst at 128 kHz (25 mm wavelength) to generate a dominant SH0 wave mode, see Figure 6b. For the sake of presentation, the x-symmetrical part of the model is turned on in the OnScale simulation.
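The excitation described above can be sketched numerically as follows. This is only an illustration: the rectangular gating (no window function), the number of samples and the choice of sampling frequency (the simulation rate quoted in the experimental section) are assumptions, and the actual simulation input replicates the measured RITEC output rather than an ideal burst. The sketch also shows the phase velocity implied by the quoted frequency and wavelength.

```python
import numpy as np

F_CENTRE = 128e3     # Hz, excitation frequency quoted in the text
WAVELENGTH = 25e-3   # m, nominal EMAT wavelength quoted in the text
N_CYCLES = 3         # cycles in the tone burst
FS = 2.2321e6        # Hz, simulation sampling frequency quoted later in the paper


def tone_burst(fc, n_cycles, fs, n_samples):
    """Rectangularly gated sine burst (ASSUMPTION: no amplitude window)."""
    t = np.arange(n_samples) / fs
    burst = np.sin(2.0 * np.pi * fc * t)
    burst[t >= n_cycles / fc] = 0.0   # switch off after n_cycles periods
    return t, burst


t, burst = tone_burst(F_CENTRE, N_CYCLES, FS, n_samples=256)

# Phase velocity implied by frequency x wavelength; for the non-dispersive
# SH0 mode this equals the bulk shear velocity of the plate material.
implied_speed = F_CENTRE * WAVELENGTH   # about 3200 m/s
```

A short burst of only three cycles has a broad bandwidth, which is consistent with the text's wording that SH0 is the dominant, rather than the only, mode generated.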

Optimisation of material parameters in simulation
Besides the simulation of the PPM EMAT, the uncertainty in the sample parameters and the measurement setup may also have an impact on the generated SH signal. These simulation parameters include the density, longitudinal velocity, shear velocity and damping of the sample under inspection, and the separation distance between the EMAT generator and receiver. In real-life scenarios, depending on the inspection purposes, other parameters such as temperature may also need to be taken into account. To gain insight into how much the aforementioned parameters impact the SH signals, the parameters were individually varied by ten per cent in the simulation, with the impact shown in Figure 7. The difference between the original signal and the new simulated signal was quantified by the signal difference coefficient (SDC) defined in equation (12). The interaction effect of simulation parameters was not considered here. Figure 7 shows that variations in separation distance and shear velocity have the most significant impact on the generated SH signal. In contrast, density variation has the least effect, as shown by the zoomed-in section presented in Figure 7(b). It is also evident that shear velocity and shear attenuation have a higher impact than their longitudinal counterparts, which is expected as we are dealing with a shear horizontal wave mode. In real-life scenarios, these values contain some uncertainty and are found either by measuring them experimentally or by looking them up in standards and/or the literature. Therefore, these values, with the corresponding uncertainties, were fed into a genetic algorithm (GA) to find the parameters that most closely match the available experimental SH data from the intact state of the material. The process is shown in Figure 8 and gives the optimal values of the parameters discussed in Figure 7. The combination of FEM and a genetic algorithm helps to find an optimised set of structural parameters [52].
The shear and longitudinal parameters were initially estimated experimentally using a pulse-echo method. This estimate was then fed into the GA together with the level of uncertainty observed in the experiments. To calculate the fitness function, the difference between the simulated SH signal and the experimental one, both from the intact state of the sample, was quantified. In this research, the GA population size was set to 20, and it was allowed to run for up to 30 generations. The fitness function can be defined by incorporating both the signal difference coefficient (SDC) and the normalised root mean square error (NRMSE). To do that, Pearson's linear correlation coefficient, $\rho_{s,e}$, is initially defined as below:
$$\rho_{s,e} = \frac{\operatorname{cov}(x_s, x_e)}{\sigma_{x_s}\,\sigma_{x_e}}, \tag{11}$$
where $x_s$ and $x_e$ indicate the simulated and experimental signals at the intact state, respectively, $\operatorname{cov}$ is the covariance and $\sigma$ the standard deviation. Using equation (11), the SDC can then be defined as below such that the result always has a value between 0 and 1:
$$\mathrm{SDC} = 1 - (\rho_{s,e} + 1)/2. \tag{12}$$
Also, as the SDC is not sensitive to amplitude-only changes of the signal, the NRMSE is additionally used; it is defined through $\lVert \cdot \rVert$, which indicates the 2-norm of a vector. In the end, the cost function incorporates both the SDC and NRMSE metrics:
$$\text{Objective function} = (\mathrm{SDC} + \mathrm{NRMSE})/2. \tag{14}$$
Table 3 lists the optimised simulation parameters for the aluminium alloy sample under inspection. A comparison between the optimised simulated signal and the experimental signal is shown in Figure 9. There is still a small discrepancy between simulation and experimental data. However, the simulated SH0 wave mode correlates much more closely with its corresponding experimental data than the SH1 wave mode does. Further investigation is required to identify the source of the discrepancies between the real and simulated signals. Note that a part of the SH0 wave mode within a time window centred at the maximum amplitude is selected for building the ML model.
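The calibration loop described in this section can be sketched as below. This is a minimal illustration under several assumptions, not the procedure actually run with OnScale: the NRMSE normalisation (residual norm divided by the norm of the experimental signal), the GA operators (elitism, blend crossover, Gaussian mutation), the parameter bounds, and the function `toy_simulation`, which is a hypothetical stand-in for one finite element run, are all introduced here for the example. Only the population size of 20, the 30 generations and the form of the SDC and objective function are taken from the text.

```python
import numpy as np


def sdc(x_s, x_e):
    """Signal difference coefficient: 1 - (rho + 1) / 2, with rho Pearson's coefficient."""
    rho = np.corrcoef(x_s, x_e)[0, 1]
    return 1.0 - (rho + 1.0) / 2.0


def nrmse(x_s, x_e):
    # ASSUMPTION: residual 2-norm normalised by the 2-norm of the experimental signal.
    return np.linalg.norm(x_s - x_e) / np.linalg.norm(x_e)


def objective(x_s, x_e):
    """Cost combining both metrics: (SDC + NRMSE) / 2."""
    return (sdc(x_s, x_e) + nrmse(x_s, x_e)) / 2.0


# Hypothetical stand-in for a finite element run of the intact plate.
FS, FC, SEPARATION = 2.2321e6, 128e3, 0.3   # Hz, Hz, m (separation is illustrative)
T = np.arange(512) / FS


def toy_simulation(params):
    velocity, damping = params
    arrival = SEPARATION / velocity
    envelope = np.exp(-0.5 * ((T - arrival) / (1.5 / FC)) ** 2)
    carrier = np.sin(2.0 * np.pi * FC * (T - arrival))
    return np.exp(-damping * SEPARATION) * envelope * carrier


def genetic_search(cost, lower, upper, pop_size=20, n_generations=30, seed=0):
    """Small elitist GA; the operators are illustrative choices."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    history = []
    n_elite = pop_size // 4
    for _ in range(n_generations):
        costs = np.array([cost(p) for p in pop])
        pop = pop[np.argsort(costs)]
        history.append(float(costs.min()))
        elite = pop[:n_elite]                        # survivors kept unchanged
        n_children = pop_size - n_elite
        pa = elite[rng.integers(n_elite, size=n_children)]
        pb = elite[rng.integers(n_elite, size=n_children)]
        w = rng.uniform(size=pa.shape)
        children = w * pa + (1.0 - w) * pb           # blend crossover
        children += rng.normal(scale=0.02 * (upper - lower), size=children.shape)
        pop = np.vstack([elite, np.clip(children, lower, upper)])
    costs = np.array([cost(p) for p in pop])
    return pop[np.argmin(costs)], float(costs.min()), history


measured = toy_simulation(np.array([3180.0, 0.5]))   # pretend experimental intact signal
lower = np.array([2880.0, 0.0])                      # illustrative bounds on
upper = np.array([3520.0, 1.0])                      # [shear velocity, damping]
best, best_cost, history = genetic_search(
    lambda p: objective(toy_simulation(p), measured), lower, upper
)
```

Doubling the amplitude of a signal leaves the SDC at zero but raises the NRMSE, which is why the two metrics are averaged in the cost function.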

Defect introduction in simulation
Having calibrated the simulation model, the simulated dataset for the ML was built with the process described by the flowchart in Figure 8. However, defects need to also be included in the model as the purpose is to estimate the depth of the defect. Four different defect shapes are considered for this research, shown in Figure 10. Circular (a.k.a flat bottom hole, FBH), rectangular, elliptical, and tapered defects were chosen to train the model in simulation. The depth and width of all defects were varied between [1-10] mm and [10-60] mm (40%-240% of wavelength) with 1 mm step size, respectively. It is important to note that the taper angle of the tapered defect also varies as the depth of this type of defect is changed. This is because the bottom width of the defect was fixed as half of the defect size at the top surface, as shown in Figure 10b.
Different shapes of defects should be included in training data to avoid bias and make the ML model insensitive to the shape of the defect. Different defects, such as the ones with sharp edges, may cause various complex wave interaction effects at the entering or exiting of the defect region [53]. Ideally, all types of potential real defects should be modelled to build a model that is invariant to the type of defects; however, this work focuses on proving the concept on the lab scale. In a real scenario, at least 8 different shapes of defects should be considered for the corrosion modelling according to the ASTM G46 [54]. These include narrow (deep), elliptical, wide (shallow), subsurface, undercutting, horizontal grain attack, and vertical grain attack.
Furthermore, to make the ML model invariant to the defect position, the position should also vary across the entire interrogation path. However, due to computational time and for the sake of proof of concept, in this manuscript it was only changed by ±5 mm along the wave propagation path (z-axis, see Figure 3). Therefore, the ML data includes four different defect shapes varying in depth and width in addition to the data set obtained by changing the defect position.
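The parameter sweep described in this section can be enumerated as in the sketch below. It only illustrates the size of the depth/width/shape grid stated in the text; the dictionary layout and field names are hypothetical, the additional position-shifted cases are not enumerated, and the actual composition of the data set is the one reported in Table 4.

```python
from itertools import product

WAVELENGTH_MM = 25
SHAPES = ("circular", "rectangular", "elliptical", "tapered")
DEPTHS_MM = range(1, 11)    # 1 to 10 mm in 1 mm steps
WIDTHS_MM = range(10, 61)   # 10 to 60 mm in 1 mm steps


def build_cases():
    """Enumerate every shape/depth/width combination of the sweep."""
    cases = []
    for shape, depth, width in product(SHAPES, DEPTHS_MM, WIDTHS_MM):
        case = {
            "shape": shape,
            "depth_mm": depth,
            "width_mm": width,
            "width_pct_wavelength": 100.0 * width / WAVELENGTH_MM,
        }
        if shape == "tapered":
            # Bottom width fixed at half the top width, so the taper
            # angle changes with depth, as noted in the text.
            case["bottom_width_mm"] = width / 2.0
        cases.append(case)
    return cases


cases = build_cases()
n_cases = len(cases)   # 4 shapes x 10 depths x 51 widths
```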

Feature extraction
Conventional amplitude-based processing approaches can easily be affected by various parameters such as variation in the EMAT lift-off and mounting conditions [55]. To reduce this impact, the phase characteristics were evaluated in this research. Similar to the traditional amplitude processing approaches, it has been reported that the instantaneous phase (IP) can be used as an additional source of information for defect detection by improving conventional amplitude-based analysis [56][57][58]. It has been shown that IP images could result in fewer artefacts and sidelobe interference (more suitable for defects near the surface), and there is no need for time-gain compensation. Pavlopoulou et al. [59,60] have also shown that instantaneous characteristics of ultrasonic guided waves have a strong potential for defect characterisation. They have reported that the damage index extracted from the IP increases as the damage develops.
The instantaneous phase was extracted by transforming each signal, $x(t)$, from the time to the phase, $\phi(t)$, domain. The instantaneous phase, $\phi(t)$, in time-domain analysis can be understood from the expressions below:
$$x(t) = A \sin(\omega t + \Delta\phi), \tag{15}$$
$$\phi(t) = \omega t + \Delta\phi, \tag{16}$$
where $A$ is the amplitude of the signal and $\omega$ the angular frequency. There are a number of definitions to extract the instantaneous characteristics of a signal. The most popular approach is defined by Gabor and Ville (see Pavlopoulou et al. [59]) as:
$$z(t) = x(t) + j\,\mathcal{H}[x(t)], \tag{17}$$
$$\phi(t) = \arg[z(t)], \tag{18}$$
where $z(t)$ and $\mathcal{H}[x(t)]$ are the analytical signal and Hilbert transform, respectively. Note that $\phi(t)$ is also the wrapped phase in the range of $[-\pi, +\pi]$. As an example, Figure 11 shows a typical variation of IP against the depth and the width of circular and tapered defects at the maximum amplitude instant of SH0, see Figure 9. Note that thirty IPs around the maximum amplitude were used for building the GPR model, whereas Figure 11 only shows the one IP at the maximum amplitude of the signal. A simple interpretation is that the IP decreases monotonically up to some defect width and then increases. The abrupt discontinuities observed for the deep defects (8-10 mm) are artefacts of phase wrapping. This means that the SH0 signal reaches the out-of-phase state ($-\pi$) compared to the intact state, and as we further change the depth and width, the phase shift increases. However, as it is then outside the range of $[-\pi, +\pi]$, the phase values are increased by $2\pi$ to put them within the mentioned range. The instantaneous phase values were then cubed so that the feature distribution more closely approximates a Gaussian distribution by reducing its skewness.
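The feature extraction described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the analytic signal is built with an FFT-based Hilbert transform, the thirty samples are taken as a contiguous window centred on the envelope maximum, and the test burst is synthetic. The exact windowing used in this work may differ.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal z(t) = x(t) + j * H[x(t)]."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0          # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)   # negative frequencies removed


def ip_features(x, n_points=30):
    """Cubed wrapped instantaneous phase around the envelope maximum.

    ASSUMPTION: the n_points samples form a contiguous window centred on the
    sample of maximum envelope amplitude.
    """
    z = analytic_signal(x)
    phase = np.angle(z)                        # wrapped to (-pi, pi]
    centre = int(np.argmax(np.abs(z)))         # maximum-amplitude instant
    start = max(0, min(centre - n_points // 2, len(x) - n_points))
    return phase[start:start + n_points] ** 3  # cubing reduces skewness


# Synthetic SH0-like wave packet (illustrative values only).
FS, FC = 2.2321e6, 128e3
t = np.arange(512) / FS
packet = np.exp(-0.5 * ((t - 100e-6) / (1.5 / FC)) ** 2) * np.sin(2 * np.pi * FC * (t - 100e-6))
features = ip_features(packet)
```

Note that for a pure sine wave the phase of the analytic signal equals the argument of the sine minus π/2, so Hilbert-based phases differ from the argument in equation (15) by a constant offset that does not affect comparisons between signals.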
The ML model was built using the available simulated data set and the experimental intact data set. In real-life scenarios, the real intact data set could be collected using a reference sample with identical material properties and thickness. Lee et al. have shown that combining a few high-fidelity experimental data sets with large-scale low-fidelity simulation data can enhance prediction accuracy in Gaussian process regression [62]. Table 4 shows the composition of the data set used for the ML model. It includes a variation in the shape, size and position of the defect. The data includes all calibrated simulated data in addition to the 130 experimental signals measured from the intact state of the sample. This shuffled data was then split in a ratio of 80:20 for training and evaluation, respectively. Each IP feature was zero-centred and normalised by its standard deviation on the training set. The evaluation (unseen simulated) and testing (see section 4.2) data sets were then z-scored relative to the mean and standard deviation of the training set, rather than their own, to place them on the same scale as the training set [61]. Figure 12a illustrates a good linear correlation (RMSE = 0.35 mm) between predicted and ground truth data. The prediction error (the mismatch between predicted and true values) also confirms that 96.45% of the data has a prediction error of less than 1 mm, see Figure 12b.
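The data preparation and error summary described above can be sketched as follows. The function names, the random seed and the synthetic arrays are hypothetical; only the 80:20 split, the use of training-set statistics for scaling, and the RMSE and 1 mm error criteria come from the text.

```python
import numpy as np


def split_and_scale(features, targets, train_fraction=0.8, seed=0):
    """Shuffle, split 80:20, and z-score both parts with TRAINING statistics."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_train = int(round(train_fraction * len(features)))
    train_idx, eval_idx = order[:n_train], order[n_train:]
    mean = features[train_idx].mean(axis=0)
    std = features[train_idx].std(axis=0)
    x_train = (features[train_idx] - mean) / std
    x_eval = (features[eval_idx] - mean) / std   # same scale as the training set
    return x_train, targets[train_idx], x_eval, targets[eval_idx], (mean, std)


def error_summary(predicted, truth, tolerance_mm=1.0):
    """RMSE and the fraction of predictions within the tolerance."""
    err = np.asarray(predicted) - np.asarray(truth)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    within = float(np.mean(np.abs(err) < tolerance_mm))
    return rmse, within


# Synthetic stand-in: 100 signals with 30 IP features each (illustrative only).
rng = np.random.default_rng(1)
ip_matrix = rng.normal(loc=2.0, scale=3.0, size=(100, 30))
thickness_mm = rng.uniform(0.0, 10.0, size=100)
x_tr, y_tr, x_ev, y_ev, (mu, sd) = split_and_scale(ip_matrix, thickness_mm)
```

Any later experimental test set would be scaled with the same `mu` and `sd`, never with its own statistics.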

Experimental setup
Experiments were carried out on two 10 mm thick samples to evaluate a) the instantaneous phase of SH wave modes on a mild steel sample containing three defects of different depths but identical diameter, and b) the estimation of the experimental minimum remaining wall thickness on an aluminium sample containing three different defects (differing in diameter and depth) using the SH0 wave mode. The defect details of each experiment are introduced in sections 4.1 and 4.2, respectively. For all measurements, a RITEC RPR-4000 Pulser/Receiver (RITEC Inc., Warwick, RI) [63] was used to generate a 3-cycle tone burst at the desired frequency. The transmitted signals were then measured using a PicoScope 5000a series oscilloscope (Pico Technology, Cambridgeshire, UK) [64], which was triggered by the RITEC. All the measuring equipment was controlled by a LabVIEW-based programme. A commercial pair of EMATs from Sonemat Ltd. [65], with model numbers SHG2541-S and SHD2541-S and a nominal wavelength of 25 mm, was used. The transducers' dimensions were identical to the ones used in section 3.1.1. In this experiment, the signals were measured at a sampling frequency of 62.5 MHz; however, they were downsampled to 2.2321 MHz to match the sampling frequency used in the simulation for further analysis.
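The two sampling frequencies quoted above are related by an integer factor, which the sketch below makes explicit and uses for a simple decimation. The factor of 28 is inferred from the quoted rates, and the windowed-sinc anti-aliasing filter (its length and cut-off) is an assumption chosen for the example; the resampling method actually used is not stated in the text.

```python
import numpy as np

FS_IN = 62.5e6        # Hz, acquisition sampling frequency
FS_TARGET = 2.2321e6  # Hz, simulation sampling frequency
FACTOR = round(FS_IN / FS_TARGET)   # integer decimation factor
FS_OUT = FS_IN / FACTOR             # resulting sampling frequency


def downsample(x, factor, n_taps=561):
    """Low-pass with a Hamming-windowed sinc, then keep every factor-th sample.

    ASSUMPTION: cut-off at 0.8 of the new Nyquist frequency; n_taps is odd so
    that the symmetric filter introduces no time shift with mode="same".
    """
    n = np.arange(n_taps) - (n_taps - 1) / 2
    cutoff = 0.4 / factor                           # cycles per input sample
    taps = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(n_taps)
    taps /= taps.sum()                              # unit gain at DC
    filtered = np.convolve(x, taps, mode="same")
    return filtered[::factor]


# Check on a synthetic 128 kHz tone sampled at the acquisition rate.
t_in = np.arange(FACTOR * 200) / FS_IN
tone = np.sin(2 * np.pi * 128e3 * t_in)
tone_ds = downsample(tone, FACTOR)
```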

Instantaneous phase evaluation
To gain more insight into instantaneous phases, an experiment was conducted on a 10 mm thick mild steel sample (S275) containing three different FBHs with depths of 3.7 mm, 5.5 mm and 9.2 mm along the scanning direction, see Figure 14. The defects' diameters were intentionally kept fixed at 25 mm (100% of wavelength) so that only the effect of defect depth on the instantaneous phase of the measured signals is evaluated in this section. The instantaneous phase was calculated at the maximum amplitude instant of both the SH0 and SH1 wave modes. Figure 15a shows that there is a direct relationship between the SH0 phase and the defect depth, as we would expect from the simulation data shown in Figure 11: the phase of the SH0 mode decreases monotonically as the defect depth increases. However, this relationship does not hold for the SH1 data, see Figure 15b. The suitability of SH0 compared to the SH1 wave mode for the prediction of defect depth has also been reported by Hirao and Ogi [66]. We can also observe from Figure 15 that the phase of the SH1 wave mode is more sensitive to the presence of the defects than that of the SH0 mode; nevertheless, the SH0 wave mode seems more suitable for extracting the defect depth information in this simple pitch-catch configuration. It has also recently been reported that SH1 is a promising candidate for thickness mapping when the phase velocity information [12] and/or cut-off properties [16,17], see Table 1 (#1 and #2), are used as sensitive features. For the former case, SH0 is not dispersive and cannot be used for velocity inversion thickness mapping; therefore, the first dispersive anti-symmetric mode was selected [12]. For the latter case, the transducer should be configured in such a way as to accommodate a wide range of wavelengths to create a time-frequency dispersion map.
From Table 1, it can be deduced that the optimal wave mode for thickness mapping is still an open question, as it also depends on the data acquisition scheme and the extracted feature. However, it has been shown that SH wave modes are promising candidates for obtaining thickness information compared to the Lamb wave modes when phase velocity information is used in a ring excitation scheme consisting of 120 transducers [12]. In contrast to this, we are considering a simple pitch-catch measurement, as this setup is more suitable for robotic deployment. Consequently, thickness estimation with a single pitch-catch pair will not be as precise as with the ring excitation system, since this measurement neglects diffraction and scattering around the defect. Thus, there is a trade-off between the complexity of the measurement scheme and the amount of information extracted for the thickness mapping.
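The dispersion properties referred to above (a non-dispersive SH0 mode and an SH1 mode with a cut-off) follow from the standard dispersion relation of SH guided waves in an isotropic plate, which the sketch below evaluates. The shear velocity of 3200 m/s is an assumption inferred from the quoted 128 kHz and 25 mm wavelength, not the calibrated value of Table 3, and the function names are hypothetical.

```python
import math

C_SHEAR = 3200.0     # m/s, ASSUMPTION: 128 kHz x 25 mm for the SH0 mode
THICKNESS = 10e-3    # m, plate thickness used throughout the paper
WAVELENGTH = 25e-3   # m, nominal EMAT wavelength


def sh_cutoff_frequency(n, thickness, c_s=C_SHEAR):
    """Cut-off frequency of mode SHn: n * c_s / (2 * thickness)."""
    return n * c_s / (2.0 * thickness)


def sh_frequency_at_wavelength(n, wavelength, thickness, c_s=C_SHEAR):
    """Frequency at which mode SHn has the given wavelength."""
    return c_s * math.sqrt((1.0 / wavelength) ** 2 + (n / (2.0 * thickness)) ** 2)


def sh_phase_velocity(n, frequency, thickness, c_s=C_SHEAR):
    """Phase velocity of mode SHn, or None at or below cut-off (no propagation)."""
    ratio = n * c_s / (2.0 * frequency * thickness)
    if ratio >= 1.0:
        return None
    return c_s / math.sqrt(1.0 - ratio ** 2)


sh1_cutoff = sh_cutoff_frequency(1, THICKNESS)                     # Hz
sh1_at_25mm = sh_frequency_at_wavelength(1, WAVELENGTH, THICKNESS)  # Hz
sh0_velocity = sh_phase_velocity(0, 128e3, THICKNESS)               # m/s
sh1_in_thin_wall = sh_phase_velocity(1, 200e3, 5e-3)                # below cut-off
```

Under these assumptions the SH0 phase velocity is independent of thickness, which is why it cannot support velocity-inversion thickness mapping, whereas SH1 stops propagating once the remaining wall is thin enough.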
Figure 14. A cross-section of the 10 mm mild steel sample used solely for the assessment of the instantaneous phase feature, showing that only the depth of the defects varies along the scanning direction.
Figure 15. Instantaneous phase at the maximum amplitude instant using the experimental data containing three FBHs with identical diameter (25 mm) but different depths, see Figure 14: (a) SH0 wave mode, (b) SH1 wave mode.
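The feature used in this section, the instantaneous phase at the maximum amplitude instant, can be illustrated with a minimal sketch. It assumes the phase is taken from the analytic signal (signal plus its Hilbert transform) at the envelope peak; the sampling rate, centre frequency, burst shape and phase offset below are invented for illustration and are not the experimental settings. The helper `analytic_signal` is a hypothetical name; it is built from the FFT in the same way as the analytic signal returned by `scipy.signal.hilbert`.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive ones."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def instantaneous_phase_at_peak(signal):
    """Return (phase in radians, sample index) at the maximum of the envelope."""
    analytic = analytic_signal(signal)
    envelope = np.abs(analytic)            # instantaneous amplitude
    peak = int(np.argmax(envelope))        # maximum amplitude instant
    return float(np.angle(analytic[peak])), peak

# Synthetic Gaussian-windowed tone burst (all values are assumptions).
fs = 10e6                                  # sampling rate, Hz
f0 = 128e3                                 # centre frequency, Hz
t = np.arange(4000) / fs                   # 400 microseconds of data
t0 = 200e-6                                # arrival time of the wave packet
phi = 0.7                                  # phase shift to be recovered, rad
burst = np.exp(-((t - t0) / 20e-6) ** 2) * np.cos(2 * np.pi * f0 * (t - t0) + phi)

phase, idx = instantaneous_phase_at_peak(burst)
print(f"phase at envelope peak: {phase:.3f} rad at sample {idx}")
```

In practice each wave mode (SH0, SH1) would first be gated in time so that the envelope maximum belongs to the mode of interest; that gating step is not shown here.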
Note that a systematic comparison between Figure 11a and Figure 15a was not the purpose of this paper, as a steel sample was used in this section. Furthermore, the data measured here was not included in the experimental testing data set of the GPR model (section 0). Due to logistics and practicalities, we could not gain access to an aluminium sample containing FBHs with the same diameter and different depths. However, this material discrepancy does not compromise our comparative evaluation of the IPs extracted from the SH wave modes. Figure 15a shows a correlation between the IP and the depth for the SH0 wave mode; however, this data was measured by keeping the defect diameter fixed while changing only the defect depth, see Figure 14. When the diameter or width of the defect also changes, the interpretation is no longer straightforward, as shown in Figure 15a, and there is no longer a simple correlation between IP and depth. As shown in Figure 11, the instantaneous phase depends on the defect depth, width and shape. The depth therefore cannot simply be estimated using physics-based approaches in real-life scenarios when IPs are used as sensitive features. However, data-driven approaches such as the GPR model can be used to learn this complicated relationship.

Detection and remnant thickness estimation of defects using an experimental data set
An additional testing data set of practical measurements was created using a 10 mm thick aluminium sample (grade 6082) containing three FBHs with different depths and widths, as shown in Figure 16. The same aluminium material as in sections 3.1 and 3.2 was used because the final purpose is to evaluate the measured experimental data using the model built in section 3.2. The feature extraction method explained in section 3.2 was also applied to this experimental data to prepare the testing data for depth prediction. Note that this testing data set was not used in the training and validation steps and served only for testing the GPR model. To make the task more challenging, the depths and widths were selected such that the width of the FBHs decreases as their depth increases. The data was measured in a scan across the three defects, from the shallow, wide flaw to the narrow, deep flaw, see Figure 16. Before assessing the GPR model, the signals were first evaluated using three reference-based algorithms with the intact signal as the reference. This analysis confirms that, if the purpose is merely to identify defects in this pitch-catch configuration, signal processing algorithms can be used instead of a data-driven approach. The primary added value of machine learning is then the estimation of defect depth rather than defect detection alone.
The measured signals were evaluated using the SDC (see equation (12)), dynamic time warping (DTW) [67] (see Appendix A) and the squared Mahalanobis distance (SMD) to check for the presence of a defect, see Figure 17. The SMD is calculated as [68]
$$\mathrm{SMD} = (x - \mu)^{T} \Sigma^{-1} (x - \mu), \tag{20}$$
where $x$ is the measured SH0 signal at each position and $\mu$ and $\Sigma$ are the mean and covariance of the 130 intact signals, respectively.
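As an illustration of two of these reference-based measures, the sketch below computes the squared Mahalanobis distance of equation (20) and a textbook DTW distance. Several points are assumptions rather than the paper's implementation: each signal is represented by a short feature vector (8 values here) so that the covariance of 130 intact signals is well conditioned, a pseudo-inverse is used as a guard against a singular covariance, and the DTW uses an absolute-difference local cost with no window constraint, which may differ from the formulation in Appendix A. The function names and the random data are hypothetical.

```python
import numpy as np

def squared_mahalanobis(x, intact):
    """SMD of vector x with respect to intact signals, shape (n_signals, n_features)."""
    mu = intact.mean(axis=0)                   # mean of the intact signals
    cov = np.cov(intact, rowvar=False)         # covariance of the intact signals
    diff = x - mu
    # The pseudo-inverse guards against a singular covariance (assumption).
    return float(diff @ np.linalg.pinv(cov) @ diff)

def dtw_distance(a, b):
    """Textbook dynamic time warping distance with an absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(cost[i - 1, j],      # step in a only
                                     cost[i, j - 1],      # step in b only
                                     cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])

# Hypothetical data: 130 intact feature vectors and two test vectors.
rng = np.random.default_rng(0)
intact = rng.normal(size=(130, 8))
smd_ref = squared_mahalanobis(intact.mean(axis=0), intact)           # intact-like
smd_defect = squared_mahalanobis(intact.mean(axis=0) + 5.0, intact)  # shifted

reference = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
stretched = np.array([0.0, 1.0, 1.0, 2.0, 1.0, 0.0])   # time-stretched copy
dtw_same = dtw_distance(reference, stretched)
dtw_changed = dtw_distance(reference, reference + 0.5)  # amplitude offset
```

A vector equal to the intact mean gives an SMD of zero, while the shifted vector gives a large value. The DTW distance is zero for the time-stretched copy and positive once the amplitudes differ, which is why it tolerates small arrival-time shifts between the intact and measured signals.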
These three robust algorithms can be used to measure the similarity between time-series data. Of the three, the SDC and SMD give a higher contrast between defect and intact regions for defect detection. The SMD is more sensitive to wide, shallow defects than the SDC, see Figure 17a, whereas the SDC is more sensitive to narrow, deep defects, see Figure 17b. Note that all three algorithms are reference-based, and the intact state of the sample was therefore used as the reference. The depth of the FBHs was also estimated using the ML model built from the low-scale real and large-scale simulated data, as listed in Table 4. Around 450 unseen experimental EMAT data sets were measured to evaluate this model. As shown in Figure 18, the depth of both wide FBHs (depths of 3.7 mm and 5.5 mm, widths of 100% and 140% of the wavelength) can be successfully estimated; however, for the 15 mm wide FBH (60% of the wavelength), there is an error of 24.67%. This mismatch can likely be attributed to the defect's small width-to-wavelength ratio. As shown in Figure 11, estimating the defect depth using the SH0 wave mode becomes more challenging at lower defect width-to-wavelength ratios. Future work should therefore focus on selecting an excitation frequency or wave mode with a shorter wavelength to improve prediction accuracy.
Figure 18. Estimation of the experimental testing data set containing three FBH defects using the GPR model built in section 3.2, showing a good estimation of the wide defects but a 24.67% error in the prediction of the defect with a 60% width-to-wavelength ratio. The depth prediction and the ground truth are indicated by the solid blue and dashed red lines, respectively.
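To make the regression step concrete, the following is a minimal sketch of Gaussian process regression with a squared-exponential kernel, returning a predictive mean and standard deviation. It is not the model of section 3.2: the single input feature, the linear phase-to-depth relation in the toy data, the kernel choice and the fixed hyperparameters (`length_scale`, `noise_var`) are all assumptions made for illustration, whereas a practical model would use the extracted feature set and optimise the hyperparameters.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale ** 2)

def gpr_predict(x_train, y_train, x_test, length_scale=1.0, noise_var=1e-4):
    """Posterior mean and standard deviation of a GP with a constant mean."""
    y_mean = y_train.mean()                      # centre the targets
    signal_var = float(np.var(y_train))          # prior variance (assumption)
    k = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k += noise_var * np.eye(len(x_train))        # observation noise
    k_s = rbf_kernel(x_train, x_test, length_scale, signal_var)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = y_mean + k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Toy training set: an invented linear relation between phase and defect depth.
depths = np.linspace(0.0, 9.0, 10)               # defect depth, mm
x_train = (-0.3 * depths)[:, None]               # hypothetical phase feature, rad
y_train = 10.0 - depths                          # remaining wall thickness, mm

# One test point inside the training range, one far outside it.
x_test = np.array([[-0.3 * 2.5], [3.0]])
pred, std = gpr_predict(x_train, y_train, x_test)
print(f"predicted thickness: {pred[0]:.2f} mm (std {std[0]:.3f} mm)")
```

The predictive standard deviation is small inside the range covered by the training data and grows for the second test point, which lies far from it. This is the property that makes a GP attractive when the testing data, such as a narrow defect, may fall outside the conditions represented in training.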

Conclusion
Combining ultrasonic guided wave technology with the prediction of the remaining useful life of a structure provides a more efficient and effective approach to the robotic inspection of large structures. To this end, we have evaluated the feasibility of a data-driven approach for estimating the remnant thickness at a defect as a critical damage attribute. An ML model built mainly from a calibrated simulated data set and real intact SH0 data was used to estimate the depth of three FBHs in real experimental data. The validation data set shows that a good estimation of defects can be obtained by extracting information from the instantaneous phase. However, the evaluation of the real experimental data shows that the GPR model is still restricted by the characteristics of the SH0 wave mode, such as its wavelength. A good estimation was achieved for wide defects, with a defect width-to-wavelength ratio of at least 100%; however, the model failed to predict the depth of the narrow, deep defect accurately. The authors recognise that further work is required to address this issue reliably, as well as to estimate real-life corrosion defects.