Reliability based impact localization in composite panels using Bayesian updating and the Kalman filter

Abstract In this work, a reliability-based impact detection strategy for a sensorized composite structure is proposed. Impacts are localized using Artificial Neural Networks (ANNs), with the guided waves recorded during impacts used as inputs. To account for variability in the recorded data under operational conditions, Bayesian updating and Kalman filter techniques are applied to improve the reliability of the detection algorithm. The possibility of having one or more faulty sensors is considered, and a decision fusion algorithm based on sub-networks of sensors is proposed to improve the application of the methodology to real structures. A strategy for reliably categorizing impacts into high energy impacts, which are likely to cause damage in the structure (true impacts), and low energy non-damaging impacts (false impacts), is also proposed to reduce the false alarm rate. The proposed strategy employs classification ANNs with different features extracted from the captured signals used as inputs. The proposed methodologies are validated by experimental results on a quasi-isotropic composite coupon impacted with a range of impact energies.


Introduction
The application of Structural Health Monitoring (SHM) techniques in the aviation industry has gained considerable attention in recent years due to the increased use of composites in aircraft for the many advantages they offer over traditional materials. However, impact damage in composites, in particular Barely Visible Impact Damage (BVID), can be a major concern if not detected in time. SHM is a promising technique that can provide impact detection [1-5] and, consequently, damage detection and characterization [6-11] by monitoring the structure with permanently installed sensors. Depending on the sensing technology, sensors can be used in passive or active configurations. However, for any SHM system to be adopted as a non-destructive damage inspection (NDI) technique it must comply with high reliability requirements under operational conditions. One way to improve the reliability of any decision-making algorithm is by adopting Bayesian updating or the Kalman filter. For an SHM system to be employed reliably in practice, it should be capable of distinguishing between impact events that result in damage and those that do not (false alarms), as well as operating with a high level of reliability when one or more sensors have become faulty during service.
As discussed by Niri et al. [25] and in a review of acoustic-emission (AE) source localization techniques by Kundu et al. [26], the problem of AE source localization for simple structures such as isotropic metallic plates and quasi-isotropic composite plates has largely been solved. However, more work needs to be carried out on their application to complex structures (L or T shaped panels or those with internal cavities), where the propagation from the AE source to the sensors is interrupted.
The main objective of this work involves the application of the probabilistic techniques, Bayesian updating and the Kalman filter, to an existing impact localization algorithm, in this case, ANNs, with the aim of improving the reliability of the algorithm when used with data containing uncertainties, representing the response of the structure under real operational conditions. The robustness of the proposed methodology is examined by considering the possibility of a faulty sensor, whose effects are mitigated by the method proposed in Section 8 utilising Bayesian updating and the Kalman filter.
The main novel contributions of this paper are threefold: (I) An investigation into the reliability enhancing capabilities of Bayesian updating and the Kalman filter with regards to impact localization when applied to sensor data containing uncertainties has been carried out. The aim is to improve the application of ANNs to real structures where the recorded data would be subjected to noise and variabilities due to geometrical tolerance, bonding quality, load levels, and environmental effects. A comparison of the performance of Bayesian updating and the Kalman filter has also been undertaken. (II) A method for mitigating the effect of a possible faulty sensor, involving fusing ANN sub-networks using Bayesian updating and the Kalman filter, has been proposed. The aim is to improve the robustness of the employed impact localization algorithm and make it more suitable for use in real structures under real operational conditions where sensor failure is a distinct possibility. (III) A classification strategy for passive-sensing systems with the aim of reliably differentiating between damaging and non-damaging impacts, and thereby reducing the false alarm rate, has also been presented. The classification strategy is presented for a fairly idealised case involving impacts of a relatively small range of masses, impact energies, and angles of attack and should therefore be seen as a first step towards applying the concept of impact classification to structures under real operational conditions.
The layout of the paper is as follows. Details of the proposed Bayesian updating (BU) and Kalman filter (KF) algorithms for impact localization are outlined in Sections 2 and 3 respectively. Section 4 outlines the methodology for constructing and training an ANN for impact location estimation, together with a parametric study to achieve its optimum performance with experimental data obtained by impacting a composite plate with impacts of varying energy. The experimental setup is detailed in Section 5. A methodology for reliably differentiating between true and false impacts, using ANNs with identifiers as inputs, is presented in Section 6. Finally, the performance of BU and the KF when applied to a healthy sensor network, with no faulty sensors, is compared and reported in Section 7. Section 8 then expands on this, with the application of BU and the KF to a faulty sensor network where the possibility of sensor failure is considered.

A Bayesian updating approach to impact location estimation
Bayesian updating (BU), or Bayesian inference (BI), uses prior assumptions, coupled with new information, to provide a more up-to-date prediction regarding the probability of an event [12]. In this section, the application of BU to impact detection using sensor data is outlined.
BU is an improvement over traditional deterministic approaches to damage identification. BU can provide a probabilistic prediction of the impact location and show, quantitatively, the uncertainties associated with the prediction. Bayes' rule is the key to BU and is shown below:
$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)} \tag{1}$$
where $P(H \mid D)$ is the posterior probability of hypothesis $H$ given new data $D$, $P(D \mid H)$ is the likelihood of data $D$ given hypothesis $H$, $P(H)$ is the prior probability of hypothesis $H$, and $P(D)$ is a normalising factor to ensure that the integral of $P(H \mid D)$ over the predefined domains is equal to unity. To initiate the BU process, it is necessary to make an initial guess for the prior probability $P(H)$. In the absence of experience, a uniform distribution can be assigned to $P(H)$.
Referring to Eq. (1), the hypothesis $H$ in the case of impact location estimation is that the true impact location is at some coordinates $(x, y)$ on the top surface of the plate, and $D$ is an impact location estimate $(x_{ann}, y_{ann})$ from an ANN. Eq. (1) can be rewritten as:
$$P(x, y \mid D_k) = \frac{P(D_k \mid x, y)\,P(x, y \mid D_{k-1})}{P(D_k)} \tag{2}$$
where the posterior probability $P(x, y \mid D_k)$ is the probability of the impact being located at $(x, y)$ on the top surface of the plate after being given $D_k$, the impact location estimate from the $k$'th ANN. $D_k = (x_{ann,k}, y_{ann,k})$, where $x_{ann,k}$ and $y_{ann,k}$ are the $x$ and $y$ coordinates of the impact location estimate from the $k$'th ANN respectively. $P(D_k \mid x, y)$ is the likelihood function. $P(x, y \mid D_{k-1})$ is the prior probability of the impact being located at $(x, y)$ before being given the impact location estimate from the $k$'th ANN. It is also the posterior probability from the previous cycle, calculated using the impact location estimate from the $(k-1)$'th ANN. $P(D_k)$ is a normalising factor to ensure that the integral of $P(x, y \mid D_k)$ over the $x$ and $y$ domains is equal to unity:
$$P(D_k) = \iint P(D_k \mid x, y)\,P(x, y \mid D_{k-1})\,dx\,dy \tag{3}$$
The likelihood function can be determined by assuming that the error in the location estimates from the ANN follows a normal distribution:
$$P(D_k \mid x, y) = \frac{1}{2\pi\sqrt{|\Sigma_k|}} \exp\left(-\frac{1}{2}(\mathbf{x} - \mu_k)^T\,\Sigma_k^{-1}\,(\mathbf{x} - \mu_k)\right), \qquad \mathbf{x} = (x, y)^T \tag{4}$$
where:
$$\mu_k = \begin{pmatrix} \mathrm{mean}(X_{hist,k}) \\ \mathrm{mean}(Y_{hist,k}) \end{pmatrix}, \qquad \Sigma_k = \begin{pmatrix} \mathrm{var}(X_{hist,k}) & 0 \\ 0 & \mathrm{var}(Y_{hist,k}) \end{pmatrix}$$
and $X_{hist,k}$ is a vector containing the $x$ coordinates of all the impact location estimates from the 1st ANN to the $k$'th ANN, $X_{hist,k} = [x_{ann,1}, x_{ann,2}, \ldots, x_{ann,k}]^T$, and $Y_{hist,k}$ is identical but for the $y$ coordinates. It is assumed that there is no correlation between the $x$ and $y$ coordinates; therefore the off-diagonal terms in $\Sigma_k$ are zero.
After the $k$'th impact location estimate (from the $k$'th ANN) has been obtained, the entries of $\Sigma_k$ and $\mu_k$ are updated and a new likelihood distribution $P(D_k \mid x, y)$ is calculated using Eq. (4) for each coordinate pair $(x, y)$. The posterior distribution $P(x, y \mid D_k)$ is then calculated using Eq. (2). The posterior impact location estimate from BU is determined by finding the pair of coordinates $(x, y)$ on the top surface of the plate that corresponds to the maximum of $P(x, y \mid D_k)$.
To initiate BU, it is necessary to choose an initial prior probability distribution $P(x, y \mid D_0)$. For this work, a uniform distribution was chosen. A flowchart of the proposed BU algorithm is shown in Fig. 1.
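The BU cycle described in this section can be illustrated with a minimal grid-based sketch. It is not the authors' implementation: the 1 mm grid over a 200 mm × 200 mm region, the sample ANN estimates, the use of the running mean and variance of the estimate history for the likelihood, and the `min_std` floor (added here only to avoid a zero variance at the first cycle) are all assumptions made for illustration.

```python
import numpy as np

def bayesian_update(prior, xs, ys, x_hist, y_hist, min_std=2.0):
    """One BU cycle on a grid: posterior is proportional to likelihood * prior."""
    mu_x, mu_y = np.mean(x_hist), np.mean(y_hist)
    # Assumption: floor the standard deviation so the first cycle (k = 1),
    # where the history has zero variance, still gives a valid likelihood.
    sd_x = max(np.std(x_hist), min_std)
    sd_y = max(np.std(y_hist), min_std)
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    # Uncorrelated bivariate normal (constant factor dropped; it cancels
    # when the posterior is normalised).
    likelihood = np.exp(-0.5 * (((X - mu_x) / sd_x) ** 2 + ((Y - mu_y) / sd_y) ** 2))
    posterior = likelihood * prior
    return posterior / posterior.sum()  # discrete analogue of the normalising factor

# Hypothetical grid (mm) and uniform initial prior
xs = np.linspace(0.0, 200.0, 201)
ys = np.linspace(0.0, 200.0, 201)
posterior = np.full((xs.size, ys.size), 1.0 / (xs.size * ys.size))

# Hypothetical location estimates from four ANNs (mm)
ann_estimates = [(98.0, 52.0), (104.0, 47.0), (101.0, 50.0), (99.0, 51.0)]
x_hist, y_hist = [], []
for x_ann, y_ann in ann_estimates:
    x_hist.append(x_ann)
    y_hist.append(y_ann)
    # The posterior of the previous cycle is the prior of the current one
    posterior = bayesian_update(posterior, xs, ys, x_hist, y_hist)

# BU estimate: grid point with the maximum posterior probability
i, j = np.unravel_index(np.argmax(posterior), posterior.shape)
bu_estimate = (xs[i], ys[j])
```

Because the posterior is kept on a grid, the same array can be plotted directly to show the uncertainty associated with the estimate.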

A Kalman filter approach to impact location estimation
The Kalman filter (KF), like Bayesian updating, uses new measurements that contain noise and uncertainties to update predictions on the current state of a system. The KF was first proposed by R.E. Kalman in 1960 [27]. For conciseness, the methodology for the implementation of the KF in this work is the focus of this section. The reader is referred to [28] for a detailed explanation of the workings of the KF.
Applying the KF to passive sensing, where the AE source location is static, requires a different approach than when tracking a moving object whose coordinates in physical space change with time. Therefore, some simplifications can be made to the KF equations. Because the AE source location is static, the system has no inputs, $u_k$, or process noise, $w_{k-1}$, that can alter its location. The process noise covariance matrix, $Q$, is therefore also omitted. The state matrix, $A$, is equal to the identity matrix $I$, and the prior state estimate for time step $k$, $\hat{x}_k^-$, is equal to the posterior state estimate for the previous time step, $\hat{x}_{k-1}$. The prior estimate error covariance matrix, $P_k^-$, is also equal to the posterior estimate error covariance matrix of the previous time step, $P_{k-1}$. This is the approach used by Niri et al. [19].
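These simplifications can be traced through the prediction step of the standard discrete Kalman filter. The short derivation below is a sketch in conventional KF notation; the control-input matrix $B$ is not named in the text and is introduced here only for illustration.

```latex
\begin{aligned}
\hat{x}_k^- &= A\,\hat{x}_{k-1} + B\,u_k = \hat{x}_{k-1}
  && (A = I,\ u_k = 0) \\
P_k^- &= A\,P_{k-1}\,A^{T} + Q = P_{k-1}
  && (A = I,\ Q = 0)
\end{aligned}
```

With a static source, the prediction step therefore carries the previous posterior forward unchanged, and only the measurement update alters the estimate.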
In [19], the initial estimates for the means $(\hat{x}_s, \hat{y}_s)$ and variances $(\sigma_{x_s}^2, \sigma_{y_s}^2)$ of the AE source location $(x_s, y_s)$ were chosen to be:
$$\hat{x}_s = \frac{x_U + x_L}{2}, \quad \hat{y}_s = \frac{y_U + y_L}{2}, \quad \sigma_{x_s}^2 = \frac{(x_U - x_L)^2}{12}, \quad \sigma_{y_s}^2 = \frac{(y_U - y_L)^2}{12} \tag{9}$$
where subscripts $U$ and $L$ denote the maximum and minimum coordinates of the edges of the plate respectively. Eq. (9) describes an uncorrelated uniform distribution. The KF was applied in a similar manner to Niri et al. [19]. As well as the above simplifications, the measurement matrix $H$ was chosen to be equal to the identity matrix $I$ to simplify the calculations. The state vector $x_k = (x_{imp}, y_{imp})^T$ contains the real coordinates of the impact location and is constant for all $k$. The measurement vector $z_k = (x_{ann,k}, y_{ann,k})^T$ contains the coordinates of the $k$'th impact location estimate (provided by the $k$'th ANN). It was assumed that the measurement noise $v_k$ could be treated as a random Gaussian variable with covariance matrix $R_k$. The updated equations for the KF are:
$$K_k = P_{k-1}\,(P_{k-1} + R_k)^{-1}$$
$$\hat{x}_k = \hat{x}_{k-1} + K_k\,(z_k - \hat{x}_{k-1})$$
$$P_k = (I - K_k)\,P_{k-1}$$
$$\hat{x}_0 = \begin{pmatrix} \hat{x}_s \\ \hat{y}_s \end{pmatrix} \tag{15}$$
$$P_0 = \begin{pmatrix} \sigma_{x_s}^2 & 0 \\ 0 & \sigma_{y_s}^2 \end{pmatrix} \tag{16}$$
where:
$$R_k = \begin{pmatrix} \mathrm{var}(X_{hist,k}) & 0 \\ 0 & \mathrm{var}(Y_{hist,k}) \end{pmatrix}$$
$X_{hist,k}$ and $Y_{hist,k}$ are defined in Section 2. $K_k$ is termed the Kalman gain and can be thought of as a weight that puts more emphasis either on $\hat{x}_{k-1}$ or on the residual $(z_k - \hat{x}_{k-1})$ when calculating $\hat{x}_k$. The variances in $R_k$ provide a measure of the uncertainty in the estimates provided by the ANNs. To obtain an estimate for the initial measurement noise covariance matrix, $R_0$, the impact location estimates from three additional ANNs were used before the KF algorithm was initiated. In addition, for the first step there is no $\hat{x}_k$ or $P_k$ from a previous step, and therefore an initial guess must be made for both variables to initiate the KF algorithm; these are taken to be (15) and (16). The KF iteratively updates its estimate for the impact location using new location estimates provided by ANNs. At the end of each iteration a probability distribution $P(x, y \mid z_k)$ can be created per the bivariate normal distribution (4) with the covariance matrix $\Sigma_k$ equal to $P_k$ and $\mu_k = \hat{x}_k$.
A flowchart of the proposed KF algorithm is shown in Fig. 2.
The KF can actively reduce the influence of new noisy data when calculating its next estimate by varying the entries of $K_k$. $K_k$ is a function of the covariance matrix of the previous estimates from the KF, $P_{k-1}$, and the covariance matrix of the estimates provided by the ANNs, $R_k$. If a new estimate $z_k$ from an ANN has a higher uncertainty than previous estimates $(z_{k-1}, z_{k-2}, \ldots, z_1)$, then $K_k$ will act to reduce the influence of $z_k$ on the current KF estimate $\hat{x}_k$ and will instead give more influence to the previous KF estimate $\hat{x}_{k-1}$. Therefore, the effect of noisy measurements is minimised.
Even though the KF in this work has been used as a means to improve the reliability of the decision-making algorithm (ANN) in the presence of noisy data, it also provides an ideal opportunity to correlate time of arrival (ToA) measurements with impact locations through the use of the $H$ matrix, in a similar manner to that proposed by Niri et al. [19] with ToA triangulation. This method could provide a very good means of evaluating the performance of the ANN and will be addressed in future work.
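The weighting behaviour of the Kalman gain can be seen in a short sketch of the simplified filter for a static source ($A = H = I$, no process noise). This is an illustration under stated assumptions rather than the authors' code: the sample estimates are invented, the initial state is the centre of a hypothetical 200 mm × 200 mm region with uniform-distribution variances, and including the three additional ANN estimates in the running history used for $R_k$ is a choice made here.

```python
import numpy as np

def kalman_static(z_list, x0, P0, r0_samples):
    """Simplified KF for a static 2-D source with A = H = I and Q = 0."""
    x = np.asarray(x0, dtype=float)
    P = np.asarray(P0, dtype=float)
    # Assumption: the estimates used to initialise R stay in the history
    history = [np.asarray(s, dtype=float) for s in r0_samples]
    identity = np.eye(2)
    for z in z_list:
        z = np.asarray(z, dtype=float)
        history.append(z)
        # Measurement noise covariance from the estimate history,
        # with x and y treated as uncorrelated (diagonal matrix)
        R = np.diag(np.var(np.array(history), axis=0))
        K = P @ np.linalg.inv(P + R)   # Kalman gain
        x = x + K @ (z - x)            # weight the residual by the gain
        P = (identity - K) @ P         # posterior error covariance
    return x, P

# Hypothetical region bounds (mm) giving a uniform-distribution initial guess
x_lo, x_hi, y_lo, y_hi = 0.0, 200.0, 0.0, 200.0
x0 = [(x_hi + x_lo) / 2.0, (y_hi + y_lo) / 2.0]
P0 = np.diag([(x_hi - x_lo) ** 2 / 12.0, (y_hi - y_lo) ** 2 / 12.0])

# Hypothetical ANN estimates (mm): three to initialise R, four to filter
r0_samples = [(97.0, 49.0), (103.0, 53.0), (100.0, 50.0)]
z_list = [(98.0, 52.0), (104.0, 47.0), (101.0, 50.0), (99.0, 51.0)]
x_hat, P_final = kalman_static(z_list, x0, P0, r0_samples)
```

Since the initial covariance is large compared with $R_k$, the first gain is close to the identity and the first measurement dominates; as $P_k$ shrinks, later measurements are weighted less.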

Artificial neural networks for impact location estimation
In order to obtain impact location estimates, feedforward backpropagation Artificial Neural Networks (ANNs) were created and employed using the numerical computing software MATLAB. The data obtained from the experimental work was randomly assigned to training, validation, and test datasets per the proportions 70%, 15%, and 15%. Estimating the location of an impact using ToAs is essentially a nonlinear regression (function approximation) problem. The 'tansig' or 'Hyperbolic tangent sigmoid' transfer function is therefore suitable in this context.
There are no set rules for optimising an ANN [29]. Therefore, to determine the optimum architecture for the ANNs used in this work a short parametric study was carried out. The study involved determining the optimum number of neurons for the hidden layer and the optimum training function used for the ANNs. It was found that a neuron number of 20 with the training function 'trainlm' or Levenberg-Marquardt backpropagation provided the highest accuracy and the fastest training times.

Evaluating the performance of an ANN
In order to evaluate the performance of an ANN with its test dataset it is useful to create a fitness function. This fitness function could be the overall mean squared error (MSE) of the ANN's impact location estimates. A more comprehensive means of defining the fitness function would be to allow the user to evaluate the Probability of Detection (POD) of impact location estimates that correspond to a particular error or the error that corresponds to a certain POD. This can be achieved using a cumulative distribution function (CDF) and is the method used by Sharif-Khodaei et al. [22] and Mallardo et al. [30].
In this work a POD of 90% was used. Therefore, the fitness function is the absolute error corresponding to 90% probability in the analytical CDF plot of best fit. In this work the fitness function is given as a ratio of the distance between sensors 1-8 (197.99 mm) and is hereafter termed the 'fitness function ratio' or in the abbreviated form 'FFR'.
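As a rough illustration of the FFR, the sketch below takes the error at 90% POD from the empirical distribution of location errors and divides it by the 197.99 mm sensor distance. Two assumptions differ from the text: the empirical quantile stands in for the analytical CDF of best fit, and the absolute error is taken to be the Euclidean distance between the true and estimated locations. The function name is hypothetical.

```python
import numpy as np

SENSOR_DISTANCE_MM = 197.99  # reference distance quoted in the text

def fitness_function_ratio(true_xy, est_xy, pod=0.90):
    """Error at the given POD, expressed as a ratio of the sensor distance."""
    true_xy = np.asarray(true_xy, dtype=float)
    est_xy = np.asarray(est_xy, dtype=float)
    # Assumption: absolute error = Euclidean distance (mm)
    errors = np.linalg.norm(est_xy - true_xy, axis=1)
    # Assumption: empirical quantile instead of a fitted analytical CDF
    error_at_pod = np.quantile(errors, pod)
    return error_at_pod / SENSOR_DISTANCE_MM

# Hypothetical test set: errors of 0, 1, ..., 10 mm along x
true_xy = np.zeros((11, 2))
est_xy = np.column_stack([np.arange(11.0), np.zeros(11)])
ffr = fitness_function_ratio(true_xy, est_xy)
```

A lower FFR means that 90% of the location estimates fall within a smaller fraction of the sensor spacing.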

Time of arrival acquisition
ToA acquisition is important when the objective of a SHM system is to determine the location of an impact. Two ToA acquisition methods that are widely used in the field of SHM are the threshold method [30] and the Hilbert transform [13,22].
In this work, a combination of the threshold method and the Hilbert transform was used to calculate ToAs. Firstly, the signals were normalised with respect to the largest amplitude, so that the amplitudes lie within $[-1, 1]$. During the experimental work, it was noticed that the frequency of the background noise was concentrated below 100 Hz while the dominant frequencies of the impacts were distributed above 100 Hz. To reduce the influence of this noise, a 3rd order Butterworth high-pass filter with a cut-off frequency of 100 Hz was used. The normalised-voltage threshold used was 0.2, with the ToA being determined when the envelope of the signal first exceeds this threshold.
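The ToA procedure above can be sketched with SciPy as follows. This is a minimal illustration, not the authors' implementation: the envelope is assumed to be the magnitude of the analytic signal from the Hilbert transform, zero-phase filtering with `sosfiltfilt` is a choice made here (it applies the filter forwards and backwards), and the sampling rate and synthetic burst are invented.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def time_of_arrival(signal, fs, cutoff_hz=100.0, threshold=0.2):
    """Return the first time (s) the envelope exceeds the threshold, or None."""
    signal = np.asarray(signal, dtype=float)
    # Normalise with respect to the largest amplitude
    normalised = signal / np.max(np.abs(signal))
    # 3rd order Butterworth high-pass filter (zero-phase: an assumption)
    sos = butter(3, cutoff_hz, btype="highpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, normalised)
    # Envelope from the analytic signal (Hilbert transform)
    envelope = np.abs(hilbert(filtered))
    above = np.nonzero(envelope > threshold)[0]
    return above[0] / fs if above.size else None

# Hypothetical signal: decaying 5 kHz burst arriving at t0 = 20 ms
fs = 100_000.0
t = np.arange(0.0, 0.05, 1.0 / fs)
t0 = 0.02
burst = np.where(t >= t0,
                 np.sin(2 * np.pi * 5000.0 * (t - t0)) * np.exp(-(t - t0) / 0.005),
                 0.0)
toa = time_of_arrival(burst, fs)
```

In a multi-sensor setup the function would be applied to each of the recorded signals, giving one ToA per sensor as input to the localization ANN.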

Experimental procedure
A 300 mm × 225 mm T800/M21 composite plate with a quasi-isotropic layup of $[0/{+45}/{-45}/90]_{2s}$ was manufactured and a 200 mm × 200 mm box with an inner uniform grid was drawn on the top surface of the plate. The intersections of the grid lines were used as impact location targets, giving a total of 64 different impact locations. A diagram of the composite plate can be seen in Fig. 3. The small black plus signs represent impact location targets; all 64 targets are numbered. The sensors are marked by large blue crosses and numbered.
Eight piezo-electric sensors were attached at regular intervals 30 mm from the edge of the fixture (outer dashed line) on the bottom surface of the plate and used to record guided waves produced by the drop impacts of varying energies. The data acquisition was carried out using an 8-channel NI PXIe-5105 oscilloscope with an input voltage range of ±15 V. The trigger voltage was set to a value of 0.2 V, which is higher than the voltage amplitudes of the background noise recorded by the sensors, but small enough to ensure the signals are reliably captured. In the experiment, the composite plate was impacted on the top surface, with the impactors being guided to the impact locations using a cylindrical guidance tube with an inner diameter of 50 mm. Due to the guidance tube having a diameter 10 mm larger than that of the impactors, an inaccuracy of at most 5 mm was estimated for the impact locations.
The plate was clamped in the two 50 mm wide strips either side of the impact area, ensuring that the clamping apparatus provided minimal interference during the experimental work. The hatched regions in Fig. 3 show the areas of the plate which were in contact with the clamping apparatus; both regions are 25 mm wide.
The two types of impactors used in the experiment both had a diameter of 40 mm and dry masses of 2.7 g and 24 g. In total, 6 different impact energy levels were tested: 10 mJ, 45 mJ, 85 mJ, 120 mJ, 155 mJ, and 190 mJ, which are referred to as E0-1, E0-2, E1, E2, E3, and E4 respectively. These different impact energy levels were obtained by increasing the mass of the impactors and altering the impact height, see Table 1. Each impact of energy levels E0-1 and E0-2 was repeated 3 times, giving a total of 192 impacts for each of these two energy levels. For energy levels E1-E4 the impacts were repeated 5 times, giving a total of 320 impacts for each of these energy levels. A total of 1664 impacts were conducted during the experiment.
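The impact counts quoted above follow directly from the 64 impact locations and the number of repeats per energy level; the short check below reproduces the arithmetic (variable names are illustrative only).

```python
# Number of impact locations and repeats per energy level, as stated in the text
n_locations = 64
repeats = {"E0-1": 3, "E0-2": 3, "E1": 5, "E2": 5, "E3": 5, "E4": 5}

# Impacts per energy level: 64 locations times the number of repeats
impacts_per_level = {level: n_locations * r for level, r in repeats.items()}

# Total over all six energy levels: 2 * 192 + 4 * 320
total_impacts = sum(impacts_per_level.values())
```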

Defining false alarm
A false alarm (or false positive) impact is defined as an impact which has hit a structure and is thought to have caused damage when in reality no damage was sustained. To define false alarm, it is necessary to first define true impacts and false impacts. True impacts are impacts which damage the structure while false impacts do not cause damage. It would not be practical to damage the composite plate used in the experimental work discussed in Section 5, as each impact would cause damage that would influence the results of all future impacts. For this reason, it was decided that no damage-causing impacts would be conducted and that impacts of different energies from the experimental work discussed in Section 5 would be arbitrarily categorised into different energy levels representing different hypothetical damage levels. This would enable an investigation into defining false alarm, but without the prospect of damaging the plate. The different energy levels and the corresponding impactors are defined in Table 1. The two lightest impactors from the experiment were chosen as the false or 'non-damaging' impacts and given designations E0-1 and E0-2 for 2.7 g and 12.7 g respectively. The four heavier impactors were chosen to be the true or 'damaging' impacts and were given designations E1-E4. The four heavier impactors from the experiment each have impact energies that are about 35 mJ apart, which allows equally-sized energy categories to be created.
An investigation was carried out to determine identifiers that could be used to reliably differentiate between true and false impacts with the aim of minimising the false-alarm rate, defined as the percentage of false impacts that are incorrectly identified as true impacts. The first identifier examined was the dominant/instantaneous frequency of each impact, calculated using the continuous wavelet transform (CWT). The second identifier involved calculating the relative energies of the impacts, also using the CWT. The third and final identifier involved calculating the power spectral density (PSD) of the captured signals, providing a measure of the power of the signals which could be related to impact energy. The CWT has recently found use in the field of SHM and is often preferred over traditional methods such as the threshold or Hilbert transform approaches due to its good resolution in both the time and frequency domains [19,31]. Examples of its use can be seen in [4,14,16-19,31]. The method used in this work is similar to that used by Niri et al. [19].
The complex Morlet wavelet is often used as the mother wavelet in the CWT due to its good resolution in the time and frequency domains. For this reason, the Morlet wavelet is used in this work. In addition, there is no set method for choosing the optimum values of the central frequency (Hz), $F_c$, and the bandwidth parameter (Hz), $F_b$ [19]. Therefore, a trial and error approach was undertaken and the optimum values were chosen based on which values provided the clearest scalograms. The values of $F_c$ and $F_b$ for E0 were 2 Hz and 0.2 Hz, and for E1-E4 they were 1.5 Hz and 0.2 Hz respectively.
The instantaneous frequency, $f_i$, of a signal is defined as the frequency that corresponds to the point of maximum energy in the scalogram of that signal and can be thought of as the signal's dominant frequency. Instantaneous frequencies may vary substantially between signals for a single impact; therefore, the average instantaneous frequency, $\bar{f}$, of an impact would be a more effective identifier. The average instantaneous frequency (Hz) for each of the 64 impact locations at each energy level was calculated and plotted in Fig. 4.
In Fig. 4 there is a clear difference between the false impacts and the true impacts. The frequencies of energy levels E1-E4 are concentrated between 100 and 300 Hz while the frequencies of E0 are distributed over a wider range from 200 to 800 Hz. This can also be seen in Table 2, which shows that both the mean and the standard deviation tend to decrease as impact energy increases. This suggests that instantaneous frequency could be a suitable identifier.
This behaviour is similar to the behaviour seen in Ghajari et al. [29] where an ANN was employed to predict the impact force history of impactors of varying masses on a composite plate. It was found that small mass impactors exhibited a wider range of dominant frequencies while large mass impactors exhibited a much smaller range. This difference in dominant frequencies was explained as being due to differences in the response of the plate as a result of different impact masses. Large mass impacts tend to have longer impact durations and thus the plate experiences a very slow deformation rate and displays a frequency spectrum with energy concentrated towards low frequencies. The plate therefore provides a quasi-static response [32]. Small mass impacts on the other hand, tend to have shorter durations and the plate therefore experiences a faster deformation rate and provides a flexural-wave controlled response. The frequency spectrum shows a wider distribution of energy over a wider range of frequencies.
Olsson [32] identified that the response of the plate is governed largely by the impactor-plate mass ratio and less by the impact velocity. This could explain why the mean instantaneous frequencies for energy levels E1 and E2, which both use an impactor of mass 24.46 g, are quite similar even though their impact energies differ by about 33.60 mJ.
Dominant frequency is not only influenced by the impactor-plate mass ratio but also by many other factors such as the damping characteristics of the impactor, the angle of contact, and the shape of the impactor/plate. These factors would cause uncertainty in the classification of an impact based on a dominant frequency threshold, as used in this work with ANNs. The impacts conducted in the experimental work were purposely of low impact energy to prevent damage to the plate. Therefore, it would be expected that the dominant frequencies obtained from the captured signals would have a narrow range. In reality, the range of dominant frequencies, if both damage causing and non-damage causing impacts were considered, would be greater, potentially causing the uncertainty created by variations in the damping characteristics to become less significant and allowing for a clearer differentiation between damaging and non-damaging impacts.
To improve the reliability of impact classification with dominant frequencies, the cut-off frequency of the high-pass filter discussed in Section 4.2 could be varied automatically depending on the noise level in each signal. This would improve the robustness of the technique when applied to different impactors. Evidently, more work needs to be carried out to quantify the influence of impact energy and impactor-structure interaction characteristics on dominant frequency.

CWT coefficient integrals
As noted by Niri et al. [19], the scalogram of a signal is obtained by calculating the squared modulus of the CWT coefficients at each scale $a$ and time $b$. By calculating the squared modulus of the CWT coefficients at $\bar{f}$ for each time $b$ and for each of the 8 signals for an impact, and then numerically integrating the results over time $b$, a measure of the energy content of each signal can be obtained. The steps are shown below:
(1) An impact occurs, which allows 8 signals to be obtained from 8 sensors.
(2) The CWT is used to obtain 8 instantaneous frequencies $f_i$.
(3) Calculate the average instantaneous frequency $\bar{f}$ and convert it into a scale.
(4) Calculate the squared modulus of the CWT coefficients for each signal at the scale found from step 3 and at each time $b$.
(5) Numerically integrate the squared modulus of the CWT coefficients found from step 4 over time for each signal to obtain an arbitrary scalar quantity Q (s) that represents the signal's energy content at the frequency $\bar{f}$.
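The five steps above can be sketched in NumPy as follows. Several details are assumptions rather than the authors' method: the complex Morlet wavelet follows the common `cmor` definition $\psi(t) = (\pi F_b)^{-1/2} e^{2\pi i F_c t} e^{-t^2/F_b}$, the scale (in seconds) for a frequency $f$ is taken as $F_c / f$, the instantaneous frequency is searched over a user-supplied grid of candidate frequencies, and the test signals are synthetic. Each signal is assumed to be longer than the wavelet kernel.

```python
import numpy as np

def cmor(t, fb, fc):
    """Complex Morlet wavelet (assumed 'cmor'-style definition)."""
    return (np.pi * fb) ** -0.5 * np.exp(2j * np.pi * fc * t) * np.exp(-(t ** 2) / fb)

def cwt_at_frequency(signal, fs, freq, fb, fc):
    """CWT coefficients of one signal at a single frequency, for every time b."""
    scale = fc / freq                                  # assumed scale in seconds
    m = int(np.ceil(4.0 * np.sqrt(fb) * scale * fs))   # truncate the Gaussian tail
    tau = np.arange(-m, m + 1) / fs
    wavelet = cmor(tau / scale, fb, fc) / np.sqrt(scale)
    # Correlation with the conjugate wavelet, written as a convolution
    kernel = np.conj(wavelet[::-1])
    return np.convolve(signal, kernel, mode="same") / fs

def instantaneous_frequency(signal, fs, freqs, fb, fc):
    """Step 2: frequency at the point of maximum scalogram energy."""
    peak_energy = [np.max(np.abs(cwt_at_frequency(signal, fs, f, fb, fc)) ** 2)
                   for f in freqs]
    return float(freqs[int(np.argmax(peak_energy))])

def cwt_energy_integrals(signals, fs, freqs, fb=0.2, fc=1.5):
    """Steps 2-5: average instantaneous frequency and Q for each signal."""
    f_i = [instantaneous_frequency(s, fs, freqs, fb, fc) for s in signals]
    f_bar = float(np.mean(f_i))                        # step 3
    q = [np.sum(np.abs(cwt_at_frequency(s, fs, f_bar, fb, fc)) ** 2) / fs
         for s in signals]                             # steps 4 and 5
    return f_bar, np.array(q)

# Hypothetical signals: a decaying 200 Hz tone at two amplitudes
fs = 5000.0
t = np.arange(0.0, 0.5, 1.0 / fs)
base = np.sin(2 * np.pi * 200.0 * t) * np.exp(-t / 0.05)
freqs = np.arange(50.0, 501.0, 10.0)
f_bar, q = cwt_energy_integrals([base, 2.0 * base], fs, freqs)
```

Because Q is built from the squared modulus, doubling the signal amplitude quadruples Q, which is consistent with higher energy levels showing higher values of Q.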
It can be seen from Fig. 5 that there is a clear distinction between the true impacts of E1-E4 and the false impacts of E0-1 and E0-2. However, this distinction between the two types of impacts is not as clear as it was for the instantaneous frequencies shown in Fig. 4. Overall, this method behaves as expected, with the higher energy levels showing higher values of Q.
To show the variance of Q for the different energy levels, the maximum, minimum, and average values of Q were calculated and plotted in Fig. 6. It can be seen in Fig. 6a and b that the variance of Q for energy level E0-2 is much higher than for E0-1. The highest variance occurs at the extremes of the x-axis, between impact locations 1-8 and 57-64, and gradually reduces towards the centre of the x-axis. By comparing these results to the impact locations seen in Fig. 3, it can be seen that the high variance areas correspond to the rows of impact locations which are nearest to the unclamped edges of the plate (locations 1-16 and 49-64). The peaks seen in Fig. 6b seem to occur at impact locations corresponding to the middle of each row. Based on these observations, the regions of high variance in Q correspond to regions of low bending stiffness: those regions near the unclamped edges ($y \le 50$ mm and $y \ge 150$ mm) and along the middle of the plate ($70$ mm $\le x \le 130$ mm). Similar behaviour can also be seen with energy levels E1-E4 in Fig. 6c, suggesting that this high variance becomes more prevalent as the impact energy is increased. The peaks in the maximum and mean curves in Fig. 6c occur at impact locations corresponding to the middle of each row, and the troughs either side of each peak occur at impact locations that are at the edges of each row, nearest to the clamped regions. It can also be seen in Fig. 6a-c that the variance increases towards the right-hand side of the x-axis, suggesting that the bending stiffness reduces towards the top of the plate. This could be due to an inconsistency relating to the clamping apparatus used.

Power spectral density integrals
The power spectral density (PSD) of a signal describes how the power of that signal is distributed over the frequency domain. Its units are power per frequency (W/Hz) and by integrating PSD over a frequency range it is possible to obtain the power within that frequency range [33].
Welch's PSD estimate method was used to compute the PSD distribution for each signal, as it ensures that the resulting PSD values are scaled by the sample frequency in order to provide units of power/frequency and that the integral has units of power. The PSD was evaluated over the frequency range 0-250 kHz.
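A PSD integral of this kind can be sketched with `scipy.signal.welch`, which returns a one-sided density estimate for real signals when `scaling="density"` is used. The segment length, sampling rate, and test tone below are assumptions for illustration, and the trapezoidal rule is written out explicitly.

```python
import numpy as np
from scipy.signal import welch

def psd_integral(signal, fs, f_lo=0.0, f_hi=250e3, nperseg=1024):
    """Integrate Welch's PSD estimate over [f_lo, f_hi] to obtain a power."""
    signal = np.asarray(signal, dtype=float)
    freqs, psd = welch(signal, fs=fs, nperseg=min(nperseg, signal.size),
                       scaling="density")
    band = (freqs >= f_lo) & (freqs <= f_hi)
    f, p = freqs[band], psd[band]
    # Trapezoidal integration of power/frequency over frequency
    return float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(f)))

# Hypothetical check: a sine of amplitude 2 has a mean power of 2**2 / 2 = 2
fs = 1.0e6
t = np.arange(20000) / fs
tone = 2.0 * np.sin(2 * np.pi * 10e3 * t)
power = psd_integral(tone, fs)
```

Applied to the 8 signals of an impact, the function would give the 8 PSD integrals used as a classification identifier.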
From Fig. 7 we can see that there is a clear distinction between E0-1 and E1-E4 which is to be expected since the impact energy of E0-1 is much less than that of E1-E4. However, it was not expected that E0-2 would have similar PSD integral values to E1 and E2. When compared to the instantaneous frequency distributions shown in Fig. 4 or the integrals of the squared modulus of the CWT coefficients shown in Fig. 5 it is clear that the PSD integrals are the least effective at differentiating between true and false impacts, with the most effective appearing to be the instantaneous frequencies of Fig. 4.

Artificial neural networks for impact type classification
It was initially thought that adopting a threshold approach using the data obtained from the methods described in Sections 6.1-6.3 would be an effective means of determining whether an impact was either a true or false impact. However, given the asymmetric distributions seen in Figs. 4-7, it was decided that defining a threshold would be a rather unreliable approach. Given the effectiveness of using ANNs in classification problems, it was believed that ANNs could be effective for classifying impact types. To test this hypothesis a pattern recognition and classification ANN was designed and the most effective inputs were determined based on the minimization of the false-alarm rate.
The impacts from energy levels E0-E4 were randomly assigned to the train, validation, and test datasets per the proportions 50%, 25%, and 25%, as these were found during preliminary tests to provide the minimum average false alarm rate and standard deviation. The ANN's architecture consisted of 1 hidden layer with 24 neurons. The inputs used for each impact were any combination of the following sets:
Input set 1: 8 × 1 vector of the instantaneous frequencies for each impact.
Input set 2: 8 × 1 vector of the Q values for each impact.
Input set 3: 8 × 1 vector of the PSD integrals for each impact.
Impact classification is a pattern recognition and classification problem and therefore requires an ANN that is designed differently than the ANN designed in Section 4 for impact location estimation. The training function used was 'trainrp' or 'resilient backpropagation' as it has been shown to perform very well with pattern recognition problems [34]. The transfer functions used for the hidden layer and for the output layer were 'tansig' and 'logsig' respectively.
The targets given to the ANN were the probabilities of each impact being a false impact (impact energies significantly below the threshold of damage initiation). When the ANN was fully trained, it could be given a vector of inputs corresponding to an impact and was able to provide a conditional probability estimate as to whether the impact was a false impact. If the conditional probability estimate was 50% or above, the impact was considered to be a false impact; if it was below 50%, it was considered to be a true impact.
A study was carried out to determine the most effective inputs for the ANN. All combinations of input sets 1, 2, and 3 were evaluated based on their average false alarm rate and standard deviation with the impacts in the test dataset over 1000 trained ANNs. The impacts used in the train, validation, and test datasets were randomised for each of the 1000 trained ANNs to ensure that the false alarm rates obtained were reliable and repeatable. The average false alarm rates and standard deviations of the test impacts over 1000 ANNs for different combinations of input sets 1, 2, and 3 are shown in Table 3.
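The repeated randomised split described above can be sketched as follows. This is a minimal illustration in Python, not the workflow used in this work; `random_split`, `repeat_study`, and the `train_and_score` callback are hypothetical names, and the callback stands in for training one ANN and returning its false alarm rate on the test impacts.

```python
import random
import statistics

def random_split(n_impacts, fractions=(0.5, 0.25, 0.25), rng=None):
    """Randomly assign impact indices to train, validation, and test sets."""
    if rng is None:
        rng = random.Random()
    indices = list(range(n_impacts))
    rng.shuffle(indices)
    n_train = round(fractions[0] * n_impacts)
    n_val = round(fractions[1] * n_impacts)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

def repeat_study(n_impacts, train_and_score, n_repeats=1000, seed=0):
    """Re-randomise the split for every repeat and summarise the scores.

    train_and_score(train, val, test) is a hypothetical callback that trains
    one network and returns a single score, e.g. its false alarm rate.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, val, test = random_split(n_impacts, rng=rng)
        scores.append(train_and_score(train, val, test))
    return statistics.mean(scores), statistics.stdev(scores)
```

Reporting both the mean and the standard deviation over the repeats, as in Table 3, separates the effect of the input sets from the effect of one particular random split.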
When discussing false alarm impacts, a discussion on false negative impacts is prudent. A false negative impact is defined as the case where a true impact has been mistakenly identified as a false impact. The average false negative rate and standard deviation were calculated over 1000 trained ANNs for each combination of input sets 1, 2, and 3. The results are shown in Table 3.
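The two error rates can be illustrated with a short sketch. It assumes that a false alarm is a false impact labelled as a true impact, that a false negative is a true impact labelled as a false impact, and that each rate is normalised by the number of impacts in its own class; the normalisation and the function names are assumptions made for this example.

```python
def is_labelled_false_impact(p_false, threshold=0.5):
    """Apply the 50% rule: an estimate at or above the threshold is a false impact."""
    return p_false >= threshold

def error_rates(p_false_estimates, is_false_impact, threshold=0.5):
    """Return (false alarm rate, false negative rate) for a set of test impacts.

    p_false_estimates: network outputs, the estimated probability of a false impact.
    is_false_impact:   ground-truth flags, True for a false (non-damaging) impact.
    """
    false_alarms = false_negatives = n_false = n_true = 0
    for p, truth in zip(p_false_estimates, is_false_impact):
        labelled_false = is_labelled_false_impact(p, threshold)
        if truth:
            n_false += 1
            if not labelled_false:
                false_alarms += 1      # false impact reported as a true impact
        else:
            n_true += 1
            if labelled_false:
                false_negatives += 1   # true impact reported as a false impact
    far = false_alarms / n_false if n_false else 0.0
    fnr = false_negatives / n_true if n_true else 0.0
    return far, fnr
```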
The results in Table 3 for the individual inputs are as expected, with input 3 proving to be the least effective and input 1 proving to be the most effective. The false alarm rate improves significantly when combinations of inputs are used to train the ANNs and the best performance is achieved by using a combination of all three inputs which achieved the lowest average false alarm rate of 1.72% and a low average false negative rate of 0.03%.
From Table 3 we can see that the average false negative rates are very low for every combination except for the one which involves the PSD integrals in isolation. These very low values can be explained by the fact that the values of instantaneous frequencies, CWT coefficient integrals, and PSD integrals show very little variance for the true impacts E1-E4 in the 8 × 1 vectors calculated for each of these identifiers for each impact and so are therefore less likely to be identified incorrectly. This can be clearly seen in Figs. 4-7. The opposite is true for E0, which shows a very high amount of variance and so is more likely to be incorrectly identified. This contributed to the relatively high values of average false alarm rates seen in Table 3.
In conclusion, ANNs seem to be very effective at differentiating between true and false impacts when given the appropriate inputs, with the most effective inputs being a combination of input sets 1, 2, and 3.

Bayesian updating and the Kalman filter - Healthy sensor network
Bayesian updating and the Kalman filter were applied to the impact location estimates provided by an impact localization algorithm utilising ANNs. The aim was to improve the reliability of the impact localization algorithm when used with sensor data containing uncertainties, obtained from experimental work. In this section the SHM sensor network is assumed to be healthy, with no faulty sensors.

Methodology
In total, 4 different sets of ANNs were trained. Each set was trained with a different combination of impacts from energy levels E1-E4. For example, set 1 was trained with the impacts from energy levels E2-E4. Set 2 was trained with E1, E3, and E4, etc. This was to test their generalisation ability. 100 ANNs were trained for each set, giving a total of 400 ANNs. Each ANN was trained in a similar manner as in Section 4 with 1 input layer and a hidden layer of 20 neurons. The training function was 'trainlm' and the transfer function was 'tansig'. The data for each energy level was divided into two datasets: dataset 1 and dataset 2. Dataset 1 contained the training, validation and testing data. In the case of set 1 this dataset included impacts from E2-E4. Dataset 2 included the running data, which was run with the ANNs after they had been trained. In the case of ANN set 1 this dataset included the impacts of E1. Dataset 2 simulated impacts that occurred after the ANN had been installed into a SHM system and tested the ability of the ANN to generalise its estimates. The proportion of impacts from dataset 1 that were assigned to the train, validation and test datasets were 70%, 15%, and 15% respectively. The impacts assigned to these datasets were randomised for each of the 100 ANNs for each of the four ANN sets. The FFR of each ANN with the test dataset was determined and the 100 ANNs for each set were ranked based on this FFR, with the best ANN of a set having the lowest FFR of that set. These ANNs were then run with their corresponding dataset 2 to obtain impact location estimates. Each set's dataset 2 contained 320 impacts.
The impact location estimates of these 100 ANNs were input sequentially into BU and the KF according to their ranking, with the estimates from the highest ranked ANN being input first. It was hoped that this would maximise the performance of BU and the KF and enable more accurate posterior impact location estimates to be obtained. In total, there were 100 cycles, with each cycle being composed of two independent phases, the BU phase and the KF phase. An overview of the method used for each BU and KF phase can be seen in Figs. 1 and 2. At the end of each phase, BU and the KF both provided a posterior impact location estimate for each of the impacts in dataset 2. Two FFRs, one for BU and one for the KF were calculated at the end of each cycle using these posterior location estimates.
To determine the effectiveness of BU and the KF they were compared to the mean performance of the 100 ANNs when run with dataset 2 and to the mean ± standard deviation. The FFRs of BU and the KF were also compared to a third method. This third method involved the vector $\mu_k$, calculated for each cycle $k$, being used as a third set of posterior estimates to act as a baseline to check if BU and the KF were performing correctly. This method can be thought of as a moving average and is referred to as MEAN.

Results and discussion
It can be seen from Fig. 8 and Table 4 that BU and the KF both provide lower FFRs than the mean performance of the 100 ANNs after only using the estimates from about 3 ANNs. The FFRs of the KF seemed to follow those of MEAN quite closely and differed at most by only 1%. BU, however, did not follow MEAN very well, with a maximum difference of 2.71% between them. BU provided average FFRs that were lower than for the KF for ANN sets 1, 2, and 4. Both BU and the KF were able to provide FFRs that were lower than MEAN for sets 1 and 4 and only slightly higher than MEAN for sets 2 and 3. To assess how close the KF and BU follow MEAN it is possible to calculate the mean-squared error (MSE) between the FFRs of the KF and MEAN and between BU and MEAN. It was found that the average value of MSE of the KF across E1-E4 was $5.02 \times 10^{-6}$ while for BU it was $8.02 \times 10^{-6}$. This shows that the KF follows MEAN more closely than BU, suggesting that the KF provides a more reliable performance improvement than BU does when compared to the mean of the 100 ANNs. The variability of BU's performance can be explained as follows: From Fig. 1 it is clear that to obtain a posterior impact location estimate from BU, $(x_{\mathrm{BU},k}, y_{\mathrm{BU},k})$, it is first necessary to calculate $P(x, y \mid D_k)$ over the x and y domains. This is not the case for the KF, where $P(x, y \mid z_k)$ can be calculated after $(x_{\mathrm{KF},k}, y_{\mathrm{KF},k})$. This means that to numerically carry out BU, the domain is divided into square pixels each with centre coordinates $(x, y)$.
Since it is not necessary to calculate $P(x, y \mid z_k)$ to obtain $(x_{\mathrm{KF},k}, y_{\mathrm{KF},k})$, it is expected that the KF would be faster than BU if $P(x, y \mid z_k)$ is only calculated for the final cycle. In this case, it was found that the KF could complete a cycle 462 times faster than BU, with an average wall-time of $5.127 \times 10^{-4}$ s per cycle compared to 0.237 s for BU. If $P(x, y \mid z_k)$ is calculated for each cycle, then the KF has a similar wall-time to BU, 0.161 s per cycle compared to 0.162 s for BU.
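As a quick check of the quoted speed-up, the ratio of the two per-cycle wall-times reported above can be written out, where $t_{\mathrm{BU}}$ and $t_{\mathrm{KF}}$ are shorthand introduced here for those two wall-times:

```latex
\frac{t_{\mathrm{BU}}}{t_{\mathrm{KF}}}
  = \frac{0.237\ \mathrm{s}}{5.127 \times 10^{-4}\ \mathrm{s}}
  \approx 462
```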
A pixel resolution of 200 × 200 is used in this work which corresponds to a pixel size of 1 mm × 1 mm. Dividing the domain into pixels has the effect of ensuring that a posterior location estimate from BU will be equal to the centre coordinates $(x, y)$ of a pixel, whereas the coordinates of a posterior location estimate from the KF can take the value of any real number. This explains the behaviour of BU seen in Fig. 8c and d where the FFR of BU undergoes significant change very quickly, suggesting that several of the posterior location estimates were shifted by one or more pixels, therefore causing the overall FFR for that cycle to change significantly. This is more likely to occur when the number of ANNs used is small, because the posterior location estimates from BU are still converging, but can still happen even when many ANNs have been used. This can be seen with the 100th ANN in Fig. 8d. It is worth noting that the jump is not significant (an increase of $8.1 \times 10^{-3}$ from 0.0981 to 0.1062) and corresponds to an increase in absolute error of 1.60 mm from 19.43 mm to 21.03 mm. It is recommended that a fine pixel mesh be used with BU to minimise the effect on FFR.
A means of allowing BU to follow MEAN more closely would be to reduce the pixel size. However, just halving the size of the pixels causes their number to increase by a factor of 4, meaning that the processing time required for BU also increases by about 4 times.
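The pixel quantisation and its cost can be illustrated with a minimal grid-based sketch. This is not the BU formulation of Fig. 1: it assumes a uniform prior, an isotropic Gaussian likelihood of hand-picked standard deviation `sigma_mm` around each ANN estimate, and pixel centres at half-pixel offsets. All function and parameter names are hypothetical.

```python
import math

def pixel_centres(length_mm, pixel_mm):
    """Centre coordinates of the pixels along one axis of the domain."""
    n = int(round(length_mm / pixel_mm))
    return [(i + 0.5) * pixel_mm for i in range(n)]

def grid_bayesian_update(estimates, sigma_mm, length_mm=200.0, pixel_mm=1.0):
    """Fuse (x, y) estimates on a pixel grid.

    Returns the pixel centre with the highest posterior and the pixel count.
    Assumptions (not from the paper): uniform prior and independent isotropic
    Gaussian likelihoods of standard deviation sigma_mm around each estimate.
    """
    xs = pixel_centres(length_mm, pixel_mm)
    ys = pixel_centres(length_mm, pixel_mm)
    best_log_post, best_xy = -math.inf, None
    for y in ys:
        for x in xs:
            # Log-posterior up to a constant: sum of Gaussian log-likelihoods.
            log_post = -sum((x - ex) ** 2 + (y - ey) ** 2
                            for ex, ey in estimates) / (2.0 * sigma_mm ** 2)
            if log_post > best_log_post:
                best_log_post, best_xy = log_post, (x, y)
    return best_xy, len(xs) * len(ys)
```

Every pixel is visited once per evaluation, so the work scales with the pixel count: with the defaults (a 200 mm domain and 1 mm pixels) there are 40,000 pixels, and halving the pixel size to 0.5 mm gives 160,000. The returned estimate is always a pixel centre, which is why a small change in the inputs can move it by a whole pixel.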
It is also worth noting that BU and the KF provided the greatest performance improvement with sets 1 & 4. These two sets contain the least accurate ANNs, with high mean FFRs of 12.09 and 13.05 respectively, suggesting that there is a high level of uncertainty associated with their impact location estimates. Both BU and the KF provided lower average FFRs than MEAN for these two sets, thereby suggesting that BU and the KF can improve reliability when applied to data containing high levels of uncertainty.
In conclusion, both BU and the KF provided an improvement over the mean performance of the 100 ANNs for each set. They were also able to achieve a similar level of performance to MEAN and, in some cases, provide better performance, suggesting the correct application of both methods. Overall, BU provided an average FFR of 8.29 over E1-E4 that is slightly lower than that obtained for the KF, 8.31. The value found from MEAN is 8.31. The results shown in Fig. 8 and Table 4 suggest that the KF provides a performance improvement that is more consistent than for BU. This is most clearly seen in Fig. 8c where BU has a large region of relatively high FFR between cycles 16-54. See Figs. A1 and A2 in the Appendix for examples of posterior impact location estimates provided by BU and the KF.

Bayesian updating and the Kalman filter - Faulty sensor network
In this section, the reliability enhancing capabilities of Bayesian updating and the Kalman filter when applied to an impact localization algorithm utilising ANNs for a faulty SHM sensor network are investigated.
During the lifetime of a structure one or more sensors can become faulty (either due to an impact or due to degradation from fatigue and/or environmental loads) and provide erroneous voltage data leading to inaccurate ToA measurements, adversely affecting the performance of the entire SHM system. In order to be considered for use in the maintenance of aircraft, SHM systems must satisfy strict reliability and robustness requirements [11]. This includes the ability of the system to operate with adequate performance in the presence of one or more faulty sensors, as well as the quick detection of a faulty sensor. One way to improve the reliability of a SHM system would be to provide the system with the ability to detect the presence of a faulty sensor within its sensor network, allowing for the quick repair or replacement of the damaged sensor. Recent work by Sharif-Khodaei et al. [35] demonstrated that electromagnetic interference (EMI) tests can be used for such a purpose. Comparison between signal data obtained during operation and signal data obtained for the training dataset could also be used in the case of ANNs. The scope of this work is not to come up with a new means of detecting a faulty sensor, which is assumed to be an inherent capability of the employed SHM system, but instead to develop a means of mitigating its effects on the SHM system as a whole once its presence has been detected, thereby improving both the robustness and reliability of the system under real operational conditions. The robustness of a SHM system could be improved by including redundant sensors, which would ensure that the system could still operate with good performance even if one or more sensors become faulty during operation. A downside of this would be that the cost of the SHM system would increase. Therefore, a means of mitigating the effect of one or more faulty sensors while reducing the need for redundant sensors is desirable.
Three methods of achieving this with ANNs are investigated in this work. The first two are: (1) replacing the ToAs of the faulty sensor with those of the nearest healthy sensor, and (2) deleting the ToAs of the faulty sensor from subsequent calculations. A third method, proposed in this work for the case where ANNs are used for impact location estimation, is to train ANNs for different sensor combinations and use BU and the KF to calculate posterior impact location estimates from these ANN sub-networks, which could be more accurate than those provided by using ANNs with methods 1 and 2.

Methodology
To triangulate the location of an impact it is necessary to use a SHM system composed of at least 3 sensors. In a SHM system of 8 sensors this means that sensor combinations consisting of 3-8 sensors are possible. Method 3 involves training multiple ANNs with different combinations of sensors. For example, in the case where the ANNs are trained with sensor combinations consisting of 3 sensors this would provide a total of 56 distinct sensor combinations and therefore 56 corresponding ANNs. In the case where the sensor combinations consist of 7 sensors this would provide a total of 8 combinations, and so 8 ANNs. The total number of sensor combinations, m, for each number of sensors c in a combination, can be found using the following equation:

$$m = \binom{8}{c} = \frac{8!}{c!\,(8-c)!}$$

For c = 3, 4, 5, 6, 7 the values of m are 56, 70, 56, 28, 8 respectively. All the impacts from energy levels E1-E4 were involved in this section. These impacts were split initially into two datasets, dataset 1 and dataset 2. Dataset 1 was used to train, validate, and test the ANNs while dataset 2 was used to run the ANNs after they had been created to simulate new data being received by the ANNs, possibly after a sensor has become faulty. 200 8-sensor ANNs were designed as described in Section 4 and the FFR of each of these ANNs when run with dataset 2 was determined and the average of these FFRs was calculated. This provided the average FFR of the 8-sensor ANNs when run with the pristine data, when none of the sensors were faulty.
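The sensor sub-networks used by method 3 can be enumerated directly, which also confirms the quoted values of m. The sketch below assumes the sensors are simply numbered 1-8; `sensor_subnetworks` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb

SENSORS = range(1, 9)  # assumed numbering: sensors 1 to 8

def sensor_subnetworks(c, sensors=SENSORS):
    """List every distinct combination of c sensors drawn from the network."""
    return list(combinations(sensors, c))

# Number of sub-networks (and hence ANNs to train) for each combination size.
counts = {c: len(sensor_subnetworks(c)) for c in range(3, 8)}
```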
Generally, sensor faults fall into one of two categories, soft faults or hard faults [36]. A sensor could be completely broken (a hard fault), incapable of providing an output of any meaningful use. This could be simulated by replacing the ToAs of a healthy sensor with zeros or some other constant value [37]. A second, less severe level would involve the sensor being faulty in such a way that it provides distorted outputs, leading to the SHM system producing incorrect results (a soft fault). With regard to soft faults, it has been found that sensor faults in the form of bonding defects between a sensor and a structure can lead to the received Lamb wave signals being distorted, with the phase, shape, and amplitude of the received signals all showing notable differences when compared to the signals received by healthy sensors [38]. These distortions will inevitably lead to high levels of uncertainty in the obtained ToA measurements. In general, there are three types of soft faults [39]: bias, in which the sensor reading is offset by a constant amount from its normal value; drift, in which the difference between the sensor reading and its normal value increases linearly with time; and precision degradation, where the sensor reading contains a significant amount of white noise. In this work, a sensor with the precision degradation fault is considered. In order to simulate this type of fault, a significant amount of uncertainty was added to the ToA measurements of sensor 1 for dataset 2 in the form of a stochastic Gaussian variable with a mean of zero and a standard deviation equal to 10% of the magnitude of the ToA measurement. This created a new, faulty version of dataset 2. Dataset 1 was untouched. Eq. (19) describes this process:

$$T_i^F = T_i + u(T_i) \qquad (19)$$

where $T_i$ is the ToA (s) obtained from sensor 1 for the $i$th impact, $T_i^F$ is the faulty version of $T_i$, and $u(T_i)$ is a stochastic variable such that $u(T_i) \sim N(0, \sigma_i^2)$, where $N$ denotes a normal distribution with standard deviation $\sigma_i = T_i/10$. The 8-sensor ANNs trained as described earlier in the section were run with the new faulty version of dataset 2 and the average FFR was determined. It was expected that it would be higher than the average FFR of the ANNs run with the pristine version of dataset 2 calculated earlier.
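The precision degradation fault of Eq. (19) can be simulated in a few lines. The sketch below is illustrative only: `add_precision_degradation` is a hypothetical name, and the ToAs are assumed to be held in a plain list of floats in seconds.

```python
import random

def add_precision_degradation(toas, rel_std=0.10, rng=None):
    """Return a faulty copy of the ToAs following Eq. (19).

    Each ToA receives zero-mean Gaussian noise whose standard deviation is
    rel_std (10% by default) of the magnitude of that ToA.
    """
    if rng is None:
        rng = random.Random()
    return [t + rng.gauss(0.0, rel_std * abs(t)) for t in toas]
```

Only the ToAs of the faulty sensor in dataset 2 would be passed through this function; the training data in dataset 1 stay untouched, as stated above.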
Of methods 1 and 2, method 1 involved replacing the ToAs of the faulty sensor with those of the nearest healthy sensor. In the event where there are two healthy sensors equally close to the faulty sensor, the replacement sensor is randomly chosen from these two. In this work the replacement sensor was chosen to be sensor 2. Method 2 involved deleting the ToAs of the faulty sensor, therefore the ToAs of sensor 1 were deleted for this method.
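Methods 1 and 2 amount to simple operations on the ToA data for one impact, sketched below for a single faulty sensor. The sensor coordinates, the dictionary layout, and all function names are assumptions made for this illustration.

```python
import math
import random

def nearest_healthy_sensor(faulty, positions, rng=None):
    """Pick the sensor closest to the faulty one; ties are broken at random.

    positions maps a sensor id to its (x, y) coordinates. Only one faulty
    sensor is handled in this sketch.
    """
    if rng is None:
        rng = random.Random()
    fx, fy = positions[faulty]
    distances = {s: math.hypot(x - fx, y - fy)
                 for s, (x, y) in positions.items() if s != faulty}
    d_min = min(distances.values())
    candidates = sorted(s for s, d in distances.items() if math.isclose(d, d_min))
    return rng.choice(candidates)

def method_1_replace(toas, faulty, replacement):
    """Method 1: overwrite the faulty sensor's ToA with the replacement's ToA."""
    repaired = dict(toas)
    repaired[faulty] = toas[replacement]
    return repaired

def method_2_delete(toas, faulty):
    """Method 2: drop the faulty sensor's ToA from subsequent calculations."""
    return {s: t for s, t in toas.items() if s != faulty}
```

Method 1 keeps the input size unchanged, so an existing 8-sensor ANN can still be used, whereas method 2 shortens the input and therefore needs an ANN trained without the deleted sensor.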
In total, 5 different groups of ANNs were designed for method 3. Group 1 was composed of the ANNs trained with sensor combinations consisting of 3 sensors, group 2 with sensor combinations consisting of 4 sensors etc. The steps of the proposed BU and KF algorithm for fusing these ANN sub-networks are shown below: (1) Randomly assign the impacts from E1-E4 to datasets 1 and 2.  Table 6) and compare them with the mean performance of the 8-sensor ANNs and the other methods investigated (Table 5).
It was expected that the sensor combinations at the front of the ranking in step 9 would provide lower FFRs than those at the back of the rankings when run with dataset 2 in step 10. This seems to be the case when looking at Fig. 9. It can also be seen that the sensor combinations containing the faulty sensor are the worst performing combinations when run with dataset 2, which is to be expected. It was hoped that by inputting the impact location estimates obtained from step 10 into BU and the KF according to this ranking that the accuracy of the posterior impact location estimates obtained would be maximised. This is justified by observations made during the coding process where it was found that the posterior impact location estimates provided by BU and the KF are highly sensitive to the first few inputs provided and gradually become less sensitive as more inputs are given. Therefore, inputting the highest-ranked sensor combinations first and the lowest-ranked last should allow more accurate posterior impact location estimates to be obtained.
Although in this work the precision degradation fault is considered, the proposed methodology for mitigating the effect of a faulty sensor using BU and the KF could also be applied to the remaining types of soft faults, bias and drift, with minimal alterations.

Results and discussion
The results shown in Fig. 10 and Table 5 show that with only one faulty sensor, the performance of the ANNs deteriorates significantly. The FFR increases from an average of 7.77 for the ANNs run with the pristine ToAs to 108.04 for the ANNs run with the faulty ToAs, an increase by a factor of 13.90.
Of methods 1 and 2, method 2 led to the lowest FFR, with an average of 8.94, while method 1 provided a higher average FFR of 18.72. Both methods 1 and 2 provided a considerable improvement in performance over the 8-sensor ANNs run with the faulty ToAs, but still performed worse than the 8-sensor ANNs run with the pristine ToAs, which is to be expected. Method 1 is the simplest to carry out. However, it is only applicable to SHM systems that contain many sensors that are reasonably close together. This method would be less effective if used with a sparser sensor network. Method 2 is the more complex of the two, as it requires ANNs to be created for all possible combinations of faulty sensors.

Fig. 9. Fitness function ratios of the best ANN from each sensor combination when run with dataset 2. The order of the combinations along the x axis shows the ranking of the 3-sensor combinations determined from step 9.
From Fig. 10 we can see that the performance of MEAN deteriorates significantly after the estimates from a certain number of ANNs have been used. This is because some of the ANNs include the ToAs of the faulty sensor. For example, the 3-sensor ANNs run with ToAs containing the faulty sensor are 36-56, for the 5-sensor ANNs it is 22-56, etc. It is clear that as the number of sensors used in each combination increases, the proportion of combinations containing the faulty sensor increases significantly and MEAN performs worse.
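The ANN index ranges quoted above follow from a simple count, assuming an 8-sensor network with a single faulty sensor and that the healthy combinations rank ahead of the contaminated ones. The function name below is hypothetical.

```python
from math import comb

def faulty_share(c, n_sensors=8):
    """Count c-sensor combinations without and with the single faulty sensor.

    Returns (healthy count, contaminated count, contaminated fraction).
    A combination contains the faulty sensor when its remaining c - 1
    members are drawn from the other n_sensors - 1 sensors.
    """
    with_faulty = comb(n_sensors - 1, c - 1)
    total = comb(n_sensors, c)
    return total - with_faulty, with_faulty, with_faulty / total
```

For c = 3 this gives 35 healthy and 21 contaminated combinations (positions 36-56 when the healthy ones rank first), and for c = 5 it gives 21 and 35 (positions 22-56). The contaminated fraction equals c/8, so it grows linearly with the combination size.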
In Table 6 we can see that BU and the KF both performed well for the 4, 5, and 6 sensor ANNs, providing FFRs that were on average 16.41% and 21.33% lower respectively than that of method 2. It can be seen with the healthy SHM system in Section 7 that the KF performs similarly to MEAN; this is because there is relatively little variance in the estimates provided by the ANNs in that case. In Fig. 10 we can see that the KF and MEAN perform similarly when the estimates from the combinations not including the faulty sensor are used. However, when the sensor combinations involving the faulty sensor are used, the performance of the moving average (MEAN) worsens significantly while the performance of the KF remains largely unchanged. This is because the KF is able to actively reduce the influence of new noisy data when calculating its next estimate by varying the entries of the Kalman gain matrix $K_k$, thereby minimising the effect of these new noisy measurements.
BU also shows very little change; this is due to the influence of $\Sigma_k$ and $\mu_k$ in the calculation of $P(x, y \mid D_k)$.
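The contrast between the KF and a moving average can be illustrated with a one-dimensional, static-state Kalman filter. This is a textbook simplification and not the formulation of Fig. 2: the state is a single coordinate, and the measurement variances are supplied by hand, whereas the method used in this work defines them in its own way. All names are hypothetical.

```python
def kalman_static(measurements, variances, x0, p0):
    """Scalar Kalman filter for a constant state.

    measurements: sequence of estimates z_k of one coordinate.
    variances:    assumed measurement variance r_k for each z_k.
    x0, p0:       prior estimate and prior variance.
    Returns a list of (posterior estimate, Kalman gain) for each step.
    """
    x, p = x0, p0
    history = []
    for z, r in zip(measurements, variances):
        k = p / (p + r)        # gain shrinks when the measurement is noisy
        x = x + k * (z - x)    # move towards the measurement by the gain
        p = (1.0 - k) * p      # posterior variance
        history.append((x, k))
    return history

def running_mean(measurements):
    """Cumulative mean of the measurements, i.e. a MEAN-style baseline."""
    means, total = [], 0.0
    for n, z in enumerate(measurements, start=1):
        total += z
        means.append(total / n)
    return means
```

With three consistent measurements followed by one outlier that carries a large variance, the gain for the outlier is close to zero and the filtered estimate barely moves, whereas the running mean is pulled strongly towards the outlier.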
In Fig. 10 we can see that BU and the KF converge after about 10 ANNs. BU and the KF perform noticeably worse with the 3 and 7-sensor ANNs. The performance of the 3-sensor ANNs can be explained by the fact that the sensor combinations used consist of the minimum number of sensors required for locating an impact, thereby making them less effective than ANNs that utilise combinations of a greater number of sensors. Hence, the posterior estimates obtained using BU and the KF will be less accurate. Seven of the eight 7-sensor ANNs contain the faulty sensor and thus their performance is very poor. Given that there were only 8 of these ANNs, BU and the KF were not able to converge before they encountered the ANNs run with the faulty sensor, and as a result their performance worsened significantly.
Overall, BU and the KF performed very well and were able to provide FFRs that were less than that of their nearest competitor, method 2. Given the difference in computation time between BU and the KF as discussed in Section 7.2 and the fact that the KF outperformed BU for almost every sensor-number combination ANN, it is clear that the KF is the most effective method, with BU being a close second. The KF also provided an improvement in performance that was more reliable across different sensor numbers, showing less variation than BU. The most effective sensor-number combination ANNs seem to be those utilising 6 sensors as they provided the lowest FFRs for BU and the second-lowest for the KF. They were also the second-least computationally expensive, with the number of possible sensor combinations being the second-lowest at 28 combinations. If the number of faulty sensors were to be increased above 1 then it is likely that the optimal sensor combination would move towards containing 3, 4, or 5 sensors.

Conclusions
In this paper a new methodology based on Bayesian updating and the Kalman filter has been proposed to improve the reliability of impact location estimation based on ANNs when used with sensor data containing uncertainties. Under real operational conditions, recorded sensor data will include noise and variabilities due to geometrical tolerance, bonding quality, load levels, and environmental effects. Results showed that in the case of the healthy sensor network, Bayesian updating and the Kalman filter improved the reliability of impact detection by 24.77% and 24.59% respectively while also providing quantitative estimates regarding the uncertainty in their posterior location estimates.
The possibility of sensor failure has also been considered and its effects mitigated, improving the application of the proposed impact localization algorithm to real structures. In this case, it was found that by providing sub-networks of sensors and applying a decision fusion algorithm on a number of these sub-networks, the reliability of the impact location estimates could be improved. It was shown that the sub-networks consisting of 4, 5, or 6 sensors provided the greatest improvement with Bayesian updating and the Kalman filter. For each of these combinations, Bayesian updating and the Kalman filter provided a performance improvement over the mean solution. The greatest improvement was achieved using sub-networks consisting of 6 sensors, which provided average improvements of 21.36% and 23.60% for Bayesian updating and the Kalman filter respectively when compared to the next best method, deleting the faulty sensor. Overall, results suggest that Bayesian updating and the Kalman filter are capable of improving the reliability of impact location estimation with ANNs. For the proposed SHM system based on ANNs with PZT transducers to be considered for use in the maintenance of aircraft, it is required to comply with strict requirements for reliability and robustness. This has been addressed in this work through the application of the proposed Bayesian updating and Kalman filter methodologies to both healthy and faulty sensor networks. Future work will seek to apply the proposed Bayesian updating and Kalman filter methodologies to more complex structures, improving their applicability to the aeronautics industry. In addition, a new strategy for differentiating between true and false impacts with the aim of minimising the false alarm rate was presented.
By conducting a study into the most effective features extracted from captured signals, it was shown that by utilising a combination of instantaneous frequencies, CWT coefficient integrals, and PSD integrals in conjunction with a pattern recognition and classification ANN, an average false alarm rate of 1.72% could be achieved. This demonstrates that the proposed method is capable of reliably differentiating between impacts of different energy levels, and indicates its potential for differentiating between damaging and non-damaging impacts. Future work will aim to build on the proposed classification strategy, improving its application to real structures and involving the consideration of more complex geometries and a wider range of impactor masses, energies, and angles of attack.