An optimized data fusion strategy for structural damage assessment using electromechanical impedance

This paper proposes a new sensor-network-optimized data fusion approach for structural health monitoring of metallic structures using electromechanical impedance (EMI) signals. The integrated approach fuses damage detection, quantification and classification in the EMI technique on the basis of a common healthy-state baseline model. To this end, principal component analysis (PCA) is carried out and the corresponding root mean square deviation (RMSD) index is calculated to study the information carried by the piezoelectric transducer's impedance (|Z|), admittance (|Y|), resistance (R) and conductance (G) in the frequency domain. The new optimized data fusion approach is realized at the sensor level using PCA and at the variable level using self-organizing maps (SOMs). The SOM comparative studies are performed using the Q-statistic (Q index) and Hotelling's T² statistic (T index). The proposed methodology is tested and validated on an aluminum plate with multiple drilled holes of variable size and location. In the process, a centralized data-fused baseline eigenvector is prepared from the healthy structure and the damage responses are projected onto this baseline model. The statistical, data-driven damage matrices are calculated, compared with the RMSD index and used in a fusion-based data classification using the SOM. The proposed method shows robust damage sensitivity to hole location and hole enlargement irrespective of the wide frequency range selection; the selected frequency range contains the resonant frequency range.


Introduction
All man-made civil, aerospace and mechanical structures have a limited lifespan and are prone to structural defects such as corrosion, fatigue, erosion, wear and delamination. These structural defects can be monitored using suitable structural health monitoring (SHM) techniques. The electromechanical impedance (EMI) method is one of the SHM techniques that has been used in the high-frequency domain to assess the local health of a structure [1,2]. This method uses metrics as damage-identification formulas. These metrics are usually based on a comparison of the healthy and damaged spectra. Piezoelectric lead zirconate titanate (PZT) transducers act as both sensors and actuators. The electrical impedance of the bonded PZT transducer is equal to the voltage (V) applied to the PZT transducer divided by the current passing through it. The measured electric current (I) is used to calculate the EMI, Z(ω), for the circular PZT [3,4] using equation (1), where $\bar{\varepsilon}^{T}_{33} = \varepsilon^{T}_{33}(1 - j\delta)$, $\bar{s}^{E}_{11} = s^{E}_{11}(1 - j\eta)$, $c_p = 1/\sqrt{\rho s^{E}_{11}(1 - \mu^{2})}$ and $\varphi = \omega a / c_p$; $J_0$ and $J_1$ are the zero- and first-order Bessel functions of the first kind; $Z_a$ is the short-circuited mechanical impedance of the piezoelectric transducer (actuator); $Z_s$ is the mechanical impedance of the structure; $h_p$ is the thickness of the piezoelectric transducer; $a$ is the radius of the transducer; $\rho$ is the density; $s^{E}_{11}$ is the compliance coefficient; $d_{31}$ is the piezoelectric coefficient of the transducer for direction 3-1 (electric field applied in direction 3, strains in direction 1); $j$ is the imaginary unit; $\bar{\varepsilon}^{T}_{33}$ is the complex permittivity of the piezoelectric transducer for direction 33; $\eta$ is the mechanical loss factor and $\delta$ is the dielectric loss factor.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
The effectiveness of the EMI method depends on the selection of an effective frequency spectrum, which is usually hard to determine for incipient damage in the structure. There is no established theoretical methodology to determine the effective frequency range of the transducer from the experimental data. A robust frequency range for damage detection in the EMI method can be determined by trial and error [5]. The current trend in data-driven EMI damage detection focuses on fusing sensor data. It is established that multiple sensors are more effective than one sensor alone [6]. Data fusion is the process of merging information from various sources to reduce uncertainty and obtain a better signal-to-noise ratio. Data fusion can be carried out at three levels: data level, decision level and feature level [7][8][9]. Although significant work has been done on data fusion, very few techniques have been implemented for EMI-based SHM. One of the exceptions is the work by Shishir et al, who described a novel data fusion technique using a parameter F that combines the information of R and G for the detection of a low level of damage [10]. In addition, the deployment of PZT sensor arrays on large monitored structures has recently become popular. This method enhances the probability of successful damage detection for complex structures [11]. Multisensor data fusion provides additional information that improves the assessment of structural health [12]. Several other data-fusion-based damage detection techniques exist, and these techniques have certain limitations.
Bayesian probability-based approaches to structural damage quantification are popular in non-destructive testing using acoustic emission and guided ultrasonic wave propagation, but they require reasonable assumptions to be applied suitably [6,13]. The performance of the Bayesian probabilistic framework and of the state estimation method depends on the selection of the model, and these approaches often require subjective decisions about the prior probabilities and model selection [6,14]. Zhao et al [15] proposed a hierarchical ensemble scheme for data fusion based on the Dempster-Shafer (DS) theory and the rotation forest (RF) method. RF is used to build an accurate and diverse base and DS is used to combine the outputs of the RF data sources [15,16]. Fuzzy logic is considered good at addressing the vagueness and imprecision of each decision, but it depends heavily on a mathematical foundation that changes with the domain of application [6,14,17]. However, there is no ubiquitous data fusion technique that applies to all SHM applications [14]. Feature-level fusion based on machine learning techniques such as artificial neural networks (ANNs), support vector machines or deep learning is becoming popular in data fusion [6]. Recently, Chen proposed a deep-learning-based data fusion concept to detect and localize cracks on the metallic surfaces of nuclear power plant reactors [18]. In this work, we propose combining principal component analysis (PCA) with the root mean square deviation (RMSD) and a machine learning approach to detect and classify damage.
PCA was developed by Karl Pearson, integrated into mathematical statistics by Harold Hotelling, and is used to reduce the dimensions of complex multivariable data sets [19]. Joe Quin used PCA-based Q statistics and T² statistics for fault detection and diagnosis in a polyester film manufacturing process [20]. Mujica explored these statistical techniques to detect and distinguish damage in steel plate and turbine blade structures [21]. Tibaduiza proposed a data-driven statistical approach using PCA for damage classification with a distributed piezoelectric active sensor network and time-domain vibrational structural responses [22]. Park et al employed a PCA model of impedance data to identify loose bolts in a bolted Al plate structure in wireless SHM [23]. They used an onboard active sensor system consisting of impedance measuring chips and a micro-fiber composite sensor. Further, the authors of [24] combined unsupervised hierarchical clustering and k-means clustering in a methodology for damage detection on near-surface-mounted fiber reinforced polymers using the EMI technique. Using this method, the authors tried to separate the different loading stages into clusters. Junior et al used a self-organizing map (SOM) classification architecture with RMSD features of the real part of the impedance in a very narrow frequency range. The features showed significant improvement in EMI-based damage classification of a multipoint metal dressing tool [25].
A data-driven approach is more suitable than a model-based approach when mathematical modeling of the system is not of interest [6,12]. The application of a data-driven fusion technique in the EMI method for damage detection is motivated by the need to obtain a unified visualization of damage to the structure. It helps to extract knowledge from frequency-domain data and so improves damage detection through decision making. In previous work [26], the authors tried to quantify the sensitivity to the hole-generating process in a square aluminum plate using the resistance-, conductance- and susceptance-based EMI features separately. The literature review shows that several works have addressed data fusion, but only a limited number of data fusion techniques have been implemented in EMI-based SHM.
In this paper, we propose a new EMI-based data fusion approach for a sensor network that uses a damage detection algorithm and a statistical matching strategy to classify the damaged and undamaged (healthy) conditions of an aluminum plate with multiple holes. The data fusion allows the extraction of information from frequency-domain data to improve damage detection through decision making. The PCA-based RMSD damage index is calculated to detect abnormalities in the structure. A modified algorithm based on PCA projection is used to localize the damage using the contribution of each sensor to the RMSD index. The overall structural damage detection is achieved by first performing local data fusion (sensor data integration), which integrates the information from four sensors, and then performing global data fusion (|Z|, |Y|, G and R from four sensors), which combines the frequency-domain features using the SOM. The final assessment result is obtained by integrating the variance contribution of the RMSD indices with the feature-level fusion obtained from the four data variables (|Z|, |Y|, G and R) using the SOM.

Theory and methodology
RMSD is the most popular damage detection index employed in EMI techniques. Damage is quantified with respect to the healthy state of the structure using the RMSD formula of equation (2), where $n$ is the number of frequency spectrum samples, the symbol 0 denotes the healthy state and $D_i$ is a single sample of the spectrum of the damage state. This paper introduces a new approach to data analysis using PCA, which provides further opportunities for damage classification using the statistical Q index and T² index. For the demonstration of EMI data fusion, |Y|, |Z|, G and R from four sensors were used. Data fusion is applied at several levels to analyze the damage to the structure. First, fusion is performed at the data level by directly combining the raw data using the variance contribution of the principal components of the sensor network. Second, feature-level fusion is performed by supplying the statistical indices of the heterogeneous variables (|Y|, |Z|, G and R from four sensors) as input to the SOM. Statistical features of the signatures are extracted from the original raw data using PCA-based damage indices, and these features are concatenated prior to the decision-level SOM and the effective RMSD fusion. A general data-fusion-based framework of the methodology adopted for damage classification of an aluminum plate using a sensor network is given in figure 1.
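The RMSD index described above can be sketched in a few lines. The form used here, the summed squared deviation from the healthy spectrum normalized by the summed squared healthy spectrum, is the one commonly seen in EMI literature and is an assumption about equation (2), not a transcription of it. The function name `rmsd_index` and the sample values are hypothetical.

```python
import math

def rmsd_index(healthy, damaged):
    """RMSD of a damage-state spectrum relative to the healthy baseline.

    Assumed form: sqrt(sum((D_i - D_i^0)^2) / sum((D_i^0)^2)).
    """
    if len(healthy) != len(damaged):
        raise ValueError("spectra must have the same number of samples")
    numerator = sum((d - h) ** 2 for h, d in zip(healthy, damaged))
    denominator = sum(h ** 2 for h in healthy)
    return math.sqrt(numerator / denominator)

# Hypothetical conductance samples (arbitrary units), for illustration only.
healthy_g = [1.0, 2.0, 3.0, 2.0]
damaged_g = [1.1, 2.3, 2.8, 2.0]
print(rmsd_index(healthy_g, damaged_g))
```

An identical spectrum gives an index of zero, and the index grows as the damage-state spectrum departs from the baseline. Some authors report the same quantity as a percentage.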
The variables |Z|, |Y|, G and R of the sensor network have different magnitudes and scales and can be scaled using the mean and standard deviation of each sensor's measurements. The standard normalization procedure is applied using the formula:
$$\bar{D}_{ij} = \frac{D_{ij} - \mu_i}{\sigma_i} \qquad (3)$$
where $D_{ij}$ represents the jth sample of the ith sensor, $\mu_i$ is the mean of $D_{ij}$ and $\sigma_i$ is the standard deviation of $D_{ij}$.
The PCA-based baseline model is developed by arranging the data in an I × J matrix, D1, a normalized matrix of the data set containing the information (|Y|, |Z|, G and R) from the different sensors (I) of the sensor network. PCA is used to combine the pre-processed data from the different piezo-actuators and to compute the covariance matrix of the data, its eigenvalues and eigenvectors, and the principal components. The components are organized in descending order of variance contribution. The covariance matrix $C_d$ of the normalized data matrix is calculated following [19]. This covariance matrix has dimension I × I and measures the degree of linear relationship among all variables. If $V$ contains the eigenvectors of the covariance matrix $C_d$ and $P$ is the damage-state data matrix, then the damage score matrix $T$, which represents the projection of the damage data set in the direction of $V$, is given by:
$$T = PV$$
The most common PCA-based damage detection indices are the Q index and Hotelling's T² index. The former analyzes the variability of the projected data in the residual subspace and the latter uses the new space of the principal components [19][20][21]. If $\hat{I}$ is the identity matrix, $x$ is the corresponding piezo-actuator variable, $x^T$ is the transpose of $x$, $P_1$ contains the reduced eigenvectors, $P_1^T$ is the transpose of $P_1$ and $\Lambda$ is the diagonal matrix of eigenvalues, then the statistical indices are calculated using equations (6) and (7):
$$Q = x\,(\hat{I} - P_1 P_1^T)\,x^T \qquad (6)$$
$$T^2 = x\,P_1 \Lambda^{-1} P_1^T\,x^T \qquad (7)$$
The feature indices and scores extracted by the PCA technique (length of the Q index and T² index) are combined using the mixing weight matrix to build a simplified ANN. The SOM is an ANN technique based on an unsupervised algorithm for the classification of different states of the structure. An SOM is a set of nodes connected to the inputs through weights. These nodes are usually connected in a rectangular or hexagonal topology, and the winning neuron is chosen according to the similarity between the weights and the input variables (x1, x2).
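The baseline construction and the two statistical indices described above can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' implementation. The data layout (rows are measurements, columns are variables), the sample covariance with a 1/(n - 1) factor, the function names `fit_baseline` and `q_and_t2`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_baseline(healthy, n_components=1):
    """Fit a PCA baseline to healthy data (rows: measurements, columns: variables)."""
    mean = healthy.mean(axis=0)
    std = healthy.std(axis=0, ddof=1)
    z = (healthy - mean) / std                 # z-score normalization
    cov = z.T @ z / (z.shape[0] - 1)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder by descending variance
    p1 = eigvecs[:, order[:n_components]]      # retained eigenvectors (P1)
    lam = eigvals[order[:n_components]]        # retained eigenvalues (diagonal of Lambda)
    return mean, std, p1, lam

def q_and_t2(data, mean, std, p1, lam):
    """Return the Q and T^2 indices of each row of data against the baseline."""
    z = (data - mean) / std
    scores = z @ p1                            # projection onto the retained PCs
    residual = z - scores @ p1.T               # part outside the PC subspace
    q = np.sum(residual ** 2, axis=1)          # x (I - P1 P1^T) x^T
    t2 = np.sum(scores ** 2 / lam, axis=1)     # x P1 Lambda^-1 P1^T x^T
    return q, t2

# Synthetic stand-in data, not measurements: four correlated variables.
rng = np.random.default_rng(0)
loadings = np.array([1.0, 0.8, 1.2, 0.9])
healthy = rng.normal(size=(50, 1)) * loadings + 0.05 * rng.normal(size=(50, 4))
damaged = rng.normal(size=(20, 1)) * loadings + 0.05 * rng.normal(size=(20, 4))
damaged[:, 0] += 1.0                           # hypothetical shift in one variable

baseline = fit_baseline(healthy)
q_h, t2_h = q_and_t2(healthy, *baseline)
q_d, t2_d = q_and_t2(damaged, *baseline)
print(q_h.mean(), q_d.mean())
```

In this toy case the shift breaks the correlation captured by the baseline, so it appears mainly in the residual subspace and raises the Q index of the shifted rows.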
The SOM algorithm is based on the minimum Euclidean distance $d(x_j, w_{ij})$ between the input variables and each neuron, given by equation (8) [26].
SOMs differentiate features according to their internal representation of the input signals and are becoming a promising tool for clustering and visualizing high-dimensional data [27,28]. The input-layer neurons are fully connected to the output neurons of the Kohonen layer through the weight matrix, and the strongest response determines the output. The relation between the weight matrix and the input is given by equation (9).
The winner in the Kohonen layer at the kth iteration is given by equation (10). Here, $w_{ij}$ is the weighting factor between the ith neuron of the input layer and the jth neuron of the Kohonen layer, and $x_{ij}$ is the input signal of the network in the form of the Q and T² indices. A diagram of the two-dimensional rectangular-topology SOM, showing the neurons of the input and Kohonen layers, is given in figure 2. For the two inputs, $x_{ij}$ takes the values x1 and x2, which are combinations of the variables |Z|, R and |Y|, G, respectively.
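The winner selection described above can be illustrated with a minimal sketch. It finds the neuron with the smallest Euclidean distance to the input and then moves that neuron toward the input using the textbook Kohonen update. The update rule, the omission of the neighbourhood function, the function names and the numbers are assumptions for illustration; equations (8)-(10) themselves are not reproduced here.

```python
import math

def best_matching_unit(x, weights):
    """Return the index of the neuron whose weight vector is closest to x."""
    distances = [math.dist(x, w) for w in weights]
    return min(range(len(weights)), key=distances.__getitem__)

def update_winner(x, weights, winner, learning_rate=0.5):
    """Move only the winning neuron toward x (neighbourhood function omitted)."""
    weights[winner] = [
        w + learning_rate * (xi - w) for xi, w in zip(x, weights[winner])
    ]

# Hypothetical two-input example (x1, x2) with three Kohonen-layer neurons.
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
x = [0.9, 0.8]
winner = best_matching_unit(x, weights)
update_winner(x, weights, winner)
print(winner, weights[winner])
```

A full SOM also updates the neighbours of the winner and decays the learning rate over iterations; a toolbox implementation handles both.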
In this paper, PCA is combined with the SOM for structural damage detection. The SOM is applied to the damage score matrix to classify the damage state of the structure using a Kohonen SOM toolbox for Matlab [29].

Experimental setup
The experiments were performed on a thin square aluminum plate with an attached sensor network of piezo-actuators. The locations of the piezo-actuators are the result of a different study that used an optimization approach for guided-wave-based damage detection [30,31]. The sensor placement was optimized using a genetic algorithm for guided-wave-based damage localization based on (a) maximum coverage of the structure with at least one sensor-actuator pair, (b) maximum coverage of the structure with three sensor-actuator pairs, and (c) a minimum number of sensors [31]. For the EMI study, this sensor network should be treated simply as an example of a distributed network, because the optimization did not concern EMI-based damage detection. The study investigates the sensitivity of the EMI responses to different sizes of the drilled holes 'D-a' and 'D-b' in the Al plate, as shown in figure 3. The dimensions of the Al plate were 100 × 100 × 0.1 cm³ and the room temperature was kept constant (approximately 24 °C) while conducting the experiment. A HIOKI IM3570 Impedance Analyzer was used to measure the EMI signatures at the piezo-actuator terminals. In the first step, a 5 mm diameter hole was drilled at location D-a; it was then enlarged to 8 mm and further to 10 mm, as shown in figure 3. After introducing the 10 mm hole at location D-a, a 5 mm hole was drilled at a new location (D-b).

Results and discussions
The conductance spectra of the sensor network for the healthy and damaged states of the Al plate are shown in figure 4 for the piezo-actuators P1, P4 and P5, which are equidistant from the hole, and P8, which is farther away from the hole. Prescreening of the EMI signatures indicated that the 17-600 kHz frequency range is suitable for demonstrating the method for all variables for damage detection and classification. This frequency range also contains the resonant frequency range (180-250 kHz). Hence, the method is tested in both the narrow (180-250 kHz) and the wide (17-600 kHz) frequency ranges.
The RMSD damage indices were calculated using equation (2) for these piezo-actuators for the different stages of damage severity (5 mm, 8 mm and 10 mm hole). The RMSD damage indices for the wide (17-600 kHz) and narrow (180-250 kHz) frequency ranges are given in figures 5 and 6, respectively. Piezo-actuator P1 shows the maximum RMSD index for all features (|Z|, |Y|, G and R), as shown in figure 5. Piezo-actuator P5 shows the second-highest sensitivity for the variables R and G, but for |Z| and |Y| it behaves exceptionally and its index is lower than that of P8. Figure 5 also shows that the RMSD index does not increase with damage severity for all sensors and all variables. Further, the damage sensitivity of the piezo-actuators in the narrow frequency range (figure 6) is poor in comparison with their performance in the wide frequency range. The RMSD values for the 8 mm and 10 mm hole damage cases are not always higher than for the 5 mm hole damage case. Similarly, the RMSD for the 10 mm hole damage case is not always higher than for the 8 mm hole damage case in both the narrow and the wide frequency ranges. In summary, the RMSD index does not reveal the damage severity in either frequency range.
A method based on principal component contribution was analyzed for all these variables (|Z|, |Y|, G and R from four sensors). Most of the data variation is contained in the first few principal components, so the 1st principal component was used for damage analysis. The healthy-state baseline model for the four piezo-actuators is trained on healthy-state data from ten experiments per piezo-actuator.
These data are organized in matrix form to create a high-dimensional space matrix (J × I) from which the baseline PCA model is built. The RMSD is calculated after normalizing the projection of the damage-state data onto the baseline model. Equation (11) was used to calculate the RMSD index for the principal components of the projected data. Figure 7 shows the conductance spectra of P1, P4, P5 and P8 reconstructed using the 1st PC, which are used to calculate the RMSD index in the 17-600 kHz frequency range. Figure 8 shows the RMSD indices based on the 1st principal component for P1, P4, P5 and P8 for the |Z|, |Y|, G and R variables. The 1st-PC-based RMSD index for piezo-actuator P1 shows the maximum sensitivity to damage, with larger index values in every case (figure 8). Based on the 1st-PC RMSD damage indices, P5 shows more sensitivity than P8 for all variables in the case of the 8 mm hole. However, the R-based RMSD of P5 is slightly lower than that of P8 for the 5 mm hole and hence verifies the P8 is at a larger distance from the P5. However, in the resonance frequency range, P1, P4 and P5 show better sensitivity than P8, with higher index values and an increasing trend with damage severity as in    (17-600 kHz). This approach provides flexibility for data fusion using the variance contribution of PC1, PC2 and so on in the effective RMSD index. Figure 10 shows the variance with respect to the principal components for the variables |Z|, |Y|, G and R in the baseline model of the sensor network.
For |Z| and |Y|, most of the variance is contributed by PC1, as shown in figures 10(a) and (b). However, PC2 cannot be ignored for the R and G variables, as shown in figures 10(c) and (d). The general formula used to calculate the effective RMSD combines the following quantities: $w_1$, $w_2$ and $w_n$ are the variance contributions of the corresponding principal components; $\mathrm{RMSD}_{PC1}$, $\mathrm{RMSD}_{PC2}$ and $\mathrm{RMSD}_{PCn}$ are the RMSD values based on the 1st, 2nd and nth principal components of the projected damage data with respect to the healthy state. Figures 11(a)-(c) compare the effective RMSD index, which combines the 1st and 2nd PCs, with the 1st-PC RMSD and the traditional RMSD for the most sensitive piezo-actuator, P1, for the 5 mm, 8 mm and 10 mm diameter hole cases. The PCA-based RMSD has a higher scale than the traditional RMSD for the variables |Y|, |Z|, G and R. In figure 11(d), the effective RMSD (using PC1 and PC2), used to quantify damage severity, shows an increasing trend for the 5 mm, 8 mm and 10 mm diameter holes.
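One plausible reading of the effective RMSD, given the quantities defined above, is a sum of the per-component RMSD values weighted by their variance contributions. The sketch below implements that reading as an assumption; the source formula is not reproduced in this text, and the function name `effective_rmsd` and all numbers are hypothetical.

```python
def effective_rmsd(rmsd_per_pc, variance_contributions):
    """Weighted sum w_1*RMSD_PC1 + w_2*RMSD_PC2 + ... + w_n*RMSD_PCn (assumed form)."""
    if len(rmsd_per_pc) != len(variance_contributions):
        raise ValueError("one weight is required per principal component")
    return sum(w * r for w, r in zip(variance_contributions, rmsd_per_pc))

# Hypothetical values: PC1 explains 70% of the variance and PC2 explains 30%.
weights = [0.7, 0.3]
rmsd_pc = [0.10, 0.20]
print(effective_rmsd(rmsd_pc, weights))
```

With this form, a variable whose variance is concentrated in PC1 (as reported for |Z| and |Y|) yields an effective RMSD close to the PC1 RMSD, while PC2 contributes noticeably for variables such as R and G.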
In the second case, the method was used to identify the hole at location 'D-b' using the PC1 RMSD with respect to the common healthy-state baseline model, following the methodology described above. P5 shows the highest sensitivity in this study, which supports that P5 is nearest to the damage location 'D-b'. The PC1 RMSD of P1, P4, P5 and P8 for all the variables |Z|, |Y|, G and R is shown in figure 12. P8 shows the lowest sensitivity in all cases compared with P1, P4 and P5.
This methodology widens the scope of the study to damage classification and detection using data fusion and opens further opportunities for damage indices and scores. Figures 13 and 14 show the Q index and T² index calculated for P1 for damage classification of the 5 mm, 8 mm and 10 mm holes. P1 was selected to demonstrate the variation of the Q index (equation (6)) and the T² index (equation (7)) because it has the maximum sensitivity to the damage among all piezo-actuators. These indices are plotted against the rearranged dimensional score (corresponding to 25 measurements of each case) of the four sensor data variables |Z|, |Y|, R and G, and show very small differences.
The results obtained from the Q index and T² index for the healthy state and for the 5 mm, 8 mm and 10 mm diameter drilled holes are fused for |Z|, |Y|, R and G by entering them as input to the SOM; this is the data fusion step. An SOM can be used to group similar features and contrast different ones based on heterogeneous feature-level fusion. The Q index and T² index datasets of the |Z|, R and |Y|, G variables of piezo-actuator P1 are grouped into a data matrix that forms the input to the SOM. These datasets are further normalized using the variance method. Since the input to the SOM is in terms of the indices (Q index and T² index), training time for
In the second damage case, 'D-b', P5 was selected to demonstrate the Q index and T² index based classification because it has the maximum sensitivity to the damage among all piezo-actuators. The fused Q index and T² index based on |Y| and G cannot differentiate the healthy (H) and damage cases (D-a and D-b) in the structure, as shown in figures 19 and 20. Hence it can be concluded that the fusion of the variables |Z| and R provides better-quality data than the variables |Y| and G, which supports the fusion-based RMSD index calculation. Figures 19 and 20 show that the Q index based damage classification is less sensitive, since it is unable to classify all the damage cases, while the T² index is more sensitive in damage classification.
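The feature-level fusion step described above, stacking the Q and T² indices of two variables into one input matrix and normalizing it before the SOM, can be sketched as follows. It assumes that the "variance method" means scaling each feature to zero mean and unit variance. The function names and the index values are hypothetical and do not come from the experiments.

```python
import statistics

def fuse_features(*feature_columns):
    """Stack per-measurement feature lists column-wise into one input matrix."""
    if len({len(column) for column in feature_columns}) != 1:
        raise ValueError("all feature columns must have the same length")
    return [list(row) for row in zip(*feature_columns)]

def normalize_variance(rows):
    """Scale each column to zero mean and unit sample variance (assumed method)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.stdev(column) for column in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical Q and T^2 indices of |Z| and R for four measurements.
q_z, t2_z = [0.10, 0.12, 0.45, 0.50], [1.0, 1.2, 3.8, 4.1]
q_r, t2_r = [0.20, 0.18, 0.60, 0.66], [0.9, 1.1, 2.9, 3.3]

som_input = normalize_variance(fuse_features(q_z, t2_z, q_r, t2_r))
print(som_input[0])
```

Each row of `som_input` is one fused feature vector, so indices with different scales carry comparable weight in the distance computation of the SOM.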

Conclusions
An integrated, data-fusion-based robust identification method is proposed for structural damage estimation irrespective of whether a wide or a narrow frequency range is selected; the selected frequency range, chosen by a trial-and-error approach, contains the resonant frequency range. This paper successfully applied data fusion at the sensor level (P1, P4, P5 and P8) using a common baseline model to identify multiple damages at different locations in the wide frequency range. The proposed method presents an integrated approach using PCA and the SOM as a more robust technique for damage localization. The SOM is used for variable-level data fusion of the four sensor data variables |Z|, |Y|, R and G in the EMI technique. The proposed method combines the PCA-based RMSD index, damage classification based on statistical PCA tools, and data-fusion-based SOM classification for SHM. This work combines the data fusion technique based on variance contribution with the machine-learning SOM in the decision on damage identification.
• The comparison between the standard RMSD approach and the PCA-based RMSD index with data fusion, for damage detection and evaluation, indicates the potential of the proposed approach with respect to the traditional approach.
• The methodologies developed in this work were successfully tested and validated by drilling a 5 mm hole and enlarging it to 8 mm and then to 10 mm. The fusion of the variables |Z| and R provides better-quality data than the variables |Y| and G, which supports the fusion-based RMSD index calculation.
• The application of data fusion increases the value of data mining by using the data variables |Z|, |Y|, R and G in the EMI technique. The SOM of the T² index shows better performance than that of the Q index for the fused variables.
• The method shows robust damage sensitivity in both the selected resonance frequency range (180-250 kHz) and the wide frequency range (17-600 kHz), which contains the resonant frequency range, irrespective of the trial-and-error-based damage detection approach.
Future research can focus on exploring the potential of the proposed approach for robust SHM of complex metallic and composite structures under variable operating conditions, which is the subject of ongoing research by the authors.