Independent component analysis for pulse compressed frequency modulated thermal wave imaging for inspection of mild steel

Non-destructive testing and evaluation plays a crucial role in various sectors for assessing the reliability of materials. Of the available non-destructive examination techniques, thermal non-destructive testing provides fast and remote inspection of materials. Among the widely used thermal non-destructive testing techniques, frequency modulated thermal wave imaging has gained importance due to its higher test sensitivity and resolution. A matched filter approach applied to the obtained temporal temperature distribution further concentrates the supplied excitation energy into a narrow-duration, high-peak-power pulse. In this paper, the merits of the reconstructed high-peak-power pulsed data are examined in the context of independent component analysis. The obtained results indicate that using pulse compressed data improves defect detectability and detection reliability while reducing memory usage and computational cost.


Introduction
Materials used in industrial sectors such as aerospace, electronics, biomedical, and marine are prone to defects. These defects may occur during the manufacturing stage itself or may develop over time due to wear and tear of the material in use [1][2][3]. The defects need to be detected as early as possible to avoid future mishaps. Non-destructive testing and evaluation (NDT&E) techniques inspect the material without impairing its usefulness [1][2][3]. Active infrared thermography is one of the most attractive NDT&E techniques due to its fast and remote testing capabilities. Active thermal wave imaging can be implemented in various modes, namely pulsed, step, and lock-in thermal wave imaging [3][4][5][6][7]. Frequency modulated thermal wave imaging (FMTWI) is now widely used as it combines the merits of the pulsed and lock-in modes of thermal wave imaging [8,9]. In FMTWI, frequency modulated thermal waves from the heating sources impinge on the specimen surface. The thermal response of the specimen surface is captured by the infrared camera over a period of time. Because the excitation sweeps a band of frequencies, FMTWI can scan a range of depths inside the specimen in a single experiment [10][11][12].
The data captured by the infrared camera is further enhanced using data processing techniques, as defects are often not visible in the raw thermographic data. These processing techniques can be broadly categorized as time, frequency, and statistical domain processing techniques [13][14][15]. Statistical domain processing techniques such as principal component analysis (PCA), independent component analysis (ICA), factor analysis (FA), non-negative matrix factorization (NMF), and higher-order statistics (skewness and kurtosis) have recently been implemented in the field of NDT&E due to their high data compression ability and enhanced defect detection. Thermography analyses based on PCA and NMF were applied to defective specimens to bring out the defect-related information in one of the components [16][17][18]. Higher-order statistics such as skewness and kurtosis also compressed thermographic data into one image frame depicting the defects but were sensitive to the noise present in the data [19]. ICA highlighted the defects present in the test specimens in one of the estimated independent components in the case of infrared thermography and eddy current pulsed thermography [20][21][22]. More recently, ICA has been compared with PCA for its defect detection ability, and the independent components provided a better signal to noise ratio for the defects than the principal components [21,22]. The noise rejection capabilities of the mainlobe of the pulse compressed reconstructed FMTWI data were highlighted in the context of PCA based thermography [23]. The present work examines defect detectability when the mainlobe of the pulse compressed data, instead of the full data, is processed with ICA.
In this paper, a mild steel specimen having six flat-bottom hole defects has been inspected using FMTWI. The pulse compressed thermal data is obtained by pre-processing the FMTWI data using cross-correlation. Further, the full data and the mainlobe decay curve of the reconstructed pulse compressed data have been processed using ICA to ascertain the advantages of the mainlobe in terms of defect detection reliability, computational time, memory usage, and defect detectability, with the signal to noise ratio (SNR) of the defects as a figure of merit. The effects of the input parameters of the ICA algorithm have also been taken into consideration during the analysis. The mathematical aspects of ICA are described in section 2. Section 3 details the methodology, including the experimentation and processing techniques. Section 4 discusses the results obtained, and section 5 concludes the paper.

Mathematical theory
Independent component analysis is based on the blind source separation model. It decomposes the data into components that are statistically independent. The statistical independence of the components is assessed using a measure of their non-Gaussianity or a measure of their mutual information [24]. ICA algorithms are designed either to locate the maxima of non-Gaussianity of the independent components or to locate the minima of their mutual information. Among the different algorithmic implementations of ICA, FastICA is one of the most widely used due to its computational efficiency, which it owes to a fixed-point iteration scheme for locating the maxima of non-Gaussianity of the independent components [25]. The captured thermographic data D is assumed to be a linear combination of independent components C, as represented in (1):

$$D = AC \tag{1}$$

where A is the matrix of weights for the linear combination representation of the data. The independent components in C are estimated using the FastICA algorithm by locating the maxima of non-Gaussianity of the independent components. The estimation becomes faster if the data is whitened beforehand [25,26]. The whitening transformation transforms the data so that the transformed data is uncorrelated with unit variance. PCA is used for whitening the data before ICA. In the analysis, the number of independent components (NOC) into which the data is to be decomposed is decided beforehand. Similarly, as PCA is used as the whitening step, the number of eigenvalues (NEV) to be retained is also set before the IC estimation process to ensure a reliable representation of the data by the retained eigenvalues. The effect of both these parameters on the defect detection reliability is discussed in the Results and discussions section.
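For reference, the whitening step and the fixed-point iteration described above are commonly written as follows. This is a sketch of the standard FastICA formulation in [25], not a statement of the exact variant used here: the symbols $\tilde{D}$, $U_k$, $\Lambda_k$, $Z$, $w$ and the nonlinearity $g$ are notation introduced for illustration, and the choice $g(u)=\tanh(u)$ is an assumption, since the contrast function is not specified in the text.

```latex
\begin{aligned}
\tilde{D} &= D - \mathbb{E}[D]
  && \text{(centering)} \\
Z &= \Lambda_k^{-1/2}\, U_k^{\top} \tilde{D}
  && \text{(PCA whitening, } k = \mathrm{NEV} \text{ retained eigenpairs)} \\
w^{+} &= \mathbb{E}\!\left\{ Z\, g\!\left(w^{\top} Z\right) \right\}
        - \mathbb{E}\!\left\{ g'\!\left(w^{\top} Z\right) \right\} w
  && \text{(fixed-point update)} \\
w &\leftarrow \frac{w^{+}}{\lVert w^{+} \rVert}
  && \text{(normalization)}
\end{aligned}
```

Here $U_k$ and $\Lambda_k$ hold the $k$ leading eigenvectors and eigenvalues of the covariance of $\tilde{D}$, so that the covariance of $Z$ is the identity. The update is repeated until $w$ converges, once for each of the NOC components with decorrelation between them, and the estimated components are the projections $w^{\top} Z$.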

Experimental set-up and data processing
In this experiment, a mild steel sample has been considered as the inspection material. Six flat-bottom hole defects at different depths are introduced in the sample to test the depth-resolving capabilities of the processing techniques. Figure 1(a) shows the schematic and dimensions of the mild steel sample with six flat-bottom hole defects D1-D6 at depths of 1.2 mm, 1.37 mm, 1.97 mm, 2.13 mm, 2.32 mm and 3.42 mm, respectively, from the top surface of the specimen. The top surface of the specimen has been inspected using FMTWI by illuminating it with an incident heat flux having a frequency sweep of 0.01-0.1 Hz for a duration of 100 s, as shown in the experimental set-up in figure 1(b). An infrared camera with a spectral range of 3-5 μm and a spatial resolution of 320×240 pixels has been used to capture the thermal responses at a 20 Hz sampling rate during the active heating of the test specimen. Further, a zero-mean thermal sequence is obtained by subtracting the mean rise in temperature from the captured thermal responses at all spatial locations of the test specimen. Pulse compression is then applied to the zero-mean thermal sequences to reconstruct the thermal data so that most of the vital information is compressed into the mainlobe. ICA is then applied to this pulse compressed FMTWI data to test its efficacy on the full-length data and on the mainlobe of the compressed data.
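The pre-processing chain described above (mean removal followed by pulse compression through cross-correlation) can be sketched for a single pixel as below. This is a minimal illustration under stated assumptions, not the processing code used in the experiment: the temperature profile is synthetic, the mean rise is modelled as a straight line, the reference signal is an ideal chirp, only non-negative lags are computed, and the mainlobe decay curve is taken as the samples from the correlation peak to the first non-positive value. All function names are hypothetical.

```python
import math

def linear_chirp(n, fs, f0, f1):
    """Linear FM signal sweeping f0 -> f1 Hz across n samples at rate fs."""
    duration = n / fs
    rate = (f1 - f0) / duration  # sweep rate in Hz/s
    return [math.sin(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
            for t in (i / fs for i in range(n))]

def remove_linear_trend(x):
    """Subtract the least-squares straight line (assumed model of the mean rise)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = (sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return [v - (x_mean + slope * (i - t_mean)) for i, v in enumerate(x)]

def cross_correlate(x, ref):
    """Cross-correlation of x with ref for non-negative lags 0..len(x)-1."""
    return [sum(a * b for a, b in zip(x[lag:], ref)) for lag in range(len(x))]

def mainlobe_decay(corr):
    """Return the peak index and the samples from the peak to the first non-positive value."""
    peak = max(range(len(corr)), key=corr.__getitem__)
    end = peak
    while end < len(corr) and corr[end] > 0:
        end += 1
    return peak, corr[peak:end]

# Excitation parameters from the text: 0.01-0.1 Hz sweep, 100 s, 20 Hz frame rate.
fs, duration = 20, 100
n = fs * duration  # 2000 frames
reference = remove_linear_trend(linear_chirp(n, fs, 0.01, 0.1))

# Synthetic pixel response: chirp riding on a linear temperature rise (assumed values).
heated = [s + 0.002 * i for i, s in enumerate(linear_chirp(n, fs, 0.01, 0.1))]
zero_mean = remove_linear_trend(heated)

compressed = cross_correlate(zero_mean, reference)
peak_lag, mainlobe = mainlobe_decay(compressed)
print("peak lag:", peak_lag, "mainlobe samples:", len(mainlobe))
```

In a real sequence the same steps would be repeated for every pixel, and the reference would typically be a measured profile rather than an ideal chirp. The mainlobe length printed by this sketch depends on the assumed definition and is not expected to reproduce the frame count reported in the results.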

Results and discussions
As discussed in the previous section, the captured FMTWI data is pre-processed. The typical temporal distribution of temperature at a location over a mild steel specimen is shown in figure 2(a) and the typical temporal distribution of mean removed temperature at a location is shown in figure 2(b). Pulse compression by cross-correlation concentrates the significant thermal information in a mainlobe, as shown in figure 2(c). The mainlobe decay curve is highlighted and represented as a zoomed-in region in figure 2(c).
The independent components have been estimated for the full duration of data comprising 2000 frames and also for mainlobe decay curve of the pulse compressed data consisting of 153 frames (only 7.65% of full data).  The obtained results have been compared in terms of various factors such as defect detection reliability, memory usage, computational cost and defect detectability in terms of SNR as a figure of merit. The results are discussed as follows:

Defect detection reliability
The selection of the NEV and NOC parameters in the FastICA algorithm plays a crucial role in defect detection reliability. The reliability has been analyzed by solving the IC estimation problem for different combinations of NEV and NOC. In this analysis, the values of NEV and NOC have been chosen as D/2^n, where n = 0, 1, 2, 3, ..., continuing until D/2^n reaches a minimal number. Here, D is the total number of frames in the data. Because the estimation is statistical in nature, ICA has been run five times for each NEV and NOC combination to assess the defect detection reliability. The defect detection reliability in the case of full data is indicated in table 1, where D=2000. Some boxes are marked as Not Applicable (N/A) because such combinations do not exist, as NOC can only be equal to or less than NEV. As shown in table 1, if all 2000 eigenvalues of the full data are retained, defects are not detected for any value of NOC in any of the five runs. As NEV is decreased to 1000, 500, and so on, defects are visible only for a few values of NOC. Very low values of NOC are not able to detect defects. Defects are detected with full reliability only when the data is decomposed into the highest possible number of components, e.g., when NOC=NEV or NOC=NEV/2. Estimating independent components also involves considerable computation, so a very large value of NOC increases the computation time. On the other hand, when lower values of NEV are selected, such as 15, 7, and 3, smaller values of NOC enable reliable defect detection, but at the cost of an insufficient representation of the data by fewer eigenvalues. It is therefore difficult, in the case of full data, to select an NEV and NOC combination that provides reliable defect detection and is also computationally efficient.
In contrast, the defect detection reliability for the possible values of NEV and NOC is shown in table 2 for the mainlobe decay curve data, where D=153. Here, the total number of frames is only 153, compared to 2000 in the case of full data. Moreover, most of the significant information has been compressed into the mainlobe of the pulse compressed data, which makes the mainlobe decay curve a reliable representation of the data in a compressed form. Table 2 shows that every possible combination of NEV and NOC detects the defects in all five runs, except at the lowest value of NOC. In general, far fewer independent components need to be estimated for the mainlobe decay curve than for the full data, and hence much less computation time is involved.
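The NEV and NOC grid underlying tables 1 and 2 can be enumerated with a short sketch. Two points are assumptions rather than statements from the text: D/2^n is rounded down to an integer (which reproduces the values 15, 7, and 3 mentioned for D=2000), and the series stops at a hypothetical `minimum` of 3, since the exact stopping value is not given.

```python
def halving_values(d, minimum=3):
    """Values d // 2**n for n = 0, 1, 2, ... while the value is >= minimum (assumed cutoff)."""
    values = []
    v = d
    while v >= minimum:
        values.append(v)
        v //= 2
    return values

def valid_combinations(d, minimum=3):
    """All (NEV, NOC) pairs from the halving series with NOC <= NEV."""
    values = halving_values(d, minimum)
    return [(nev, noc) for nev in values for noc in values if noc <= nev]

def comparison_case(d):
    """The case NEV = D/4, NOC = NEV/2, with floor division as an assumption."""
    nev = d // 4
    return nev, nev // 2

for frames in (2000, 153):
    print(frames, halving_values(frames), comparison_case(frames))
```

Pairs excluded by the NOC <= NEV condition correspond to the N/A boxes in the tables.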
To compare IC estimation on the full data and on the mainlobe decay curve data in terms of memory usage and computational cost, one particular instance of NEV and NOC has been considered. NEV has been chosen equal to one-fourth of the total number of eigenvalues, and NOC has been chosen equal to half of NEV. These values have been chosen because they detected the defects in all five runs for both data sets. Table 3 presents the comparison between IC estimation on the full data and on the mainlobe decay curve, and shows a 97.26% improvement in computational time and a 92.28% improvement in memory usage for the mainlobe decay curve over the full data.
Note to table 1: D stands for the total number of image frames present in the full data, which is equal to 2000 image frames, and 'a/b' indicates that all the defects were detected in 'a' runs out of a total of 'b' runs.
Note to table 2: D stands for the total number of image frames present in the mainlobe decay curve data, which is equal to 153 image frames, and 'a/b' indicates that all the defects were detected in 'a' runs out of a total of 'b' runs.
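As a rough sanity check on the reported savings, the reduction in frame count alone can be computed from the two data sizes. The sketch below uses only the frame counts given in the text (2000 and 153); the helper name is hypothetical, and the measured time and memory figures of table 3 are not reproduced here.

```python
def percent_reduction(full, reduced):
    """Relative saving of `reduced` with respect to `full`, in percent."""
    return 100.0 * (full - reduced) / full

frames_full, frames_mainlobe = 2000, 153
fraction_kept = 100.0 * frames_mainlobe / frames_full
frame_reduction = percent_reduction(frames_full, frames_mainlobe)

print(f"frames kept: {fraction_kept:.2f}%")
print(f"frame reduction: {frame_reduction:.2f}%")
```

The frame reduction of about 92.35% is close to the reported 92.28% improvement in memory usage, which is consistent with memory scaling roughly with the number of frames processed.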

Defect detection quality
In the previous sub-section, the strong data compression abilities of the mainlobe decay curve data have been highlighted by interpreting the obtained ICA results in terms of defect detection reliability, computational cost, and memory usage. In this sub-section, the quality of defect detection for the full data and for the mainlobe decay curve data has been assessed using the SNR of all six defects as a figure of merit. For this comparison, IC estimation is considered for the particular case where NEV=D/4 and NOC=NEV/2, for the reason cited earlier. Figures 3(a) and (b) depict the independent component highlighting the defects obtained by ICA of the full data and of the mainlobe decay curve data, respectively. In quantitative terms, the SNR bar chart in figure 3(c) compares both cases. Overall, the independent component obtained from the mainlobe decay curve data provides SNR values similar to those obtained from the independent component estimated on the full data. More specifically, the deeper defects D4, D5, and D6 exhibit better SNR in the case of the mainlobe, whereas the shallower defects D1, D2, and D3 already have good SNR in the full data because of their higher signal levels. Hence, the mainlobe decay curve of pulse compressed data is beneficial in critical scenarios involving test specimens with deeper defects.
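The SNR figure of merit can be illustrated with a short sketch. The text does not give the SNR formula, so the definition below, the contrast between mean defect and mean sound-region intensity divided by the standard deviation of the sound region and expressed in decibels, is an assumption based on common practice in thermography. The function name and pixel values are hypothetical.

```python
import math
import statistics

def snr_db(defect_pixels, sound_pixels):
    """Assumed SNR: 20*log10(|mean(defect) - mean(sound)| / std(sound))."""
    contrast = abs(statistics.fmean(defect_pixels) - statistics.fmean(sound_pixels))
    noise = statistics.pstdev(sound_pixels)
    return 20.0 * math.log10(contrast / noise)

# Hypothetical pixel values taken from one independent-component image.
defect_region = [11.0, 11.0, 11.0, 11.0]
sound_region = [0.0, 2.0, 0.0, 2.0]
print(f"SNR = {snr_db(defect_region, sound_region):.1f} dB")
```

Under this definition, a deeper defect with lower contrast yields a lower SNR unless the noise in the sound region is reduced as well, which is one way to read the benefit reported for D4 to D6.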

Conclusion
This paper highlights the efficient data compression abilities of pulse compressed reconstructed FMTWI data in the context of ICA. The merits of the mainlobe decay curve data over the full data have been ascertained by comparing the results obtained by ICA on both. The mainlobe decay curve provides better defect detection reliability for almost every combination of NEV and NOC. IC estimation is also more efficient in terms of computation and memory usage for the mainlobe decay curve, owing to the smaller number of frames. Further, the better SNR values obtained for the deeper defects in one of the independent components of the mainlobe decay curve highlight the compression abilities of pulse compressed data.