Eigen Background Subtraction for Industrial Flaw Detection: Application to High-Pressure Turbine Blade CT Scans

We propose a state-of-the-art approach that is the first to use Eigen background subtraction to reveal flaws in three-dimensional Computed Tomography data. Our method is composed of two main steps. During the first step, principal component analysis (PCA) is applied to a stack of flaw-free blade data. From a statistical perspective, a series of “flaw-free” characteristic functions is extracted. The second step consists of decomposing the blade of interest according to the functions calculated from PCA. This projection allows the construction of a synthetic blade without any flaws. A subtraction between the synthetic blade and the real blade highlights the abnormal variations. The main advantage of this technique is that the processing remains applicable even when the measurement system or the parts themselves have a variability that is greater than the flaw size.


Introduction
In an aircraft engine, high-pressure (HP) turbine blades are located just behind the combustion chamber. They operate in a deleterious environment with high pressures, high rotation speeds and high temperatures. Cooler air extracted from the compressor circulates in a cavity network within the blade. Computed Tomography (CT) is one of the quality control methods suitable for the next generation of parts, whose geometries are becoming increasingly complex. On the one hand, CT allows separation of the cavities, and on the other hand, it can better characterize anomalies. The problem with the CT method is the amount of data generated, which makes finding the flaws more difficult for an operator.
In the industrial context, a widespread method to help in the search for anomalies involves alignment with a defect-free reference part called a golden part (GP) [1,2]. This object is defined as the mean of several parts validated by the quality process. Each produced part is then compared with this GP, and deviations from the GP can indicate potential defects. These methods are very reliable as long as the measurement system and the produced parts do not change significantly. In our case, the variation in the produced parts is larger than the sought indications.
More complex background subtraction methods are typically used to segment moving regions in image sequences taken by a static camera by comparing each new frame to a model of the scene background. The work of Piccardi [3] presents several techniques that are representative of this class. We can sort these techniques into two families. The first family uses the temporal evolution of a pixel's values, such as the running Gaussian average, the temporal median, or a mixture of Gaussians [4][5][6]. The assumption is that a higher and more compact distribution is more likely to belong to the background. The second family consists of methods that analyse the feature space [3,7,8]. It is intuitive that neighbouring locations are spatially correlated in the modelling and classification of values. Various morphological operators, such as kernel density estimation or sequential kernel density approximation, analyse the global space for data cleansing. In our case, we cannot use these methods because variations of equivalent magnitude do not share an equivalent acceptance criterion: some geometric patterns are more critical for part acceptability than others, and criteria of different strictness are applied.
The method that we propose is based on a scheme that is similar to the Eigen subtraction method [7]. This method applies PCA on a 2D image and reduces the obtained modal base formed by the resulting eigenvectors. Building on this idea, we propose a new approach that projects the actual image on a modal base generated from a flaw-free part database. This projection creates a synthetic image, which is the "flaw-free" component of the real part. Subtraction between this synthetic image and the actual image then draws attention to the defective component of the part. This solution is applied to help an operator in the industrial quality process detect any anomaly in a produced part. The method is intended to contribute to the safety and reliability of produced parts and to help address the issue when an anomaly is detected.

Materials and Methods
The 3D PCA method presented here was developed for CT applications. It is based on a statistical learning approach that can highlight a defect on a part with the information from a collection of flaw-free parts (training dataset). The input data consist of tomographic volumes of AM1 turbine blades, and the measurement parameters are summarized in Table 1.
The input data are volumes discretized in voxels [9]. Each voxel in this grid is associated with a grey value representing the X-ray absorptivity of the measured volume. The volume is then registered with the reference Computer-Aided Design (CAD) mesh to eliminate the coordinate system gap. A high-pass filter is then applied to the volume.

Data Analysis on the Training Dataset
Since we want to perform a statistical analysis based on a dataset of parts that are considered healthy, we build a tensor representation of the CT scans for all the parts measured as the reference base. Each volume can be expressed by a scalar function that returns the grey level linked to the X-ray absorption at position x ∈ R^3. Since we are dealing with a digital system, the volume is stored as a voxelized representation indexed by the voxel coordinates (i, j, k). N flaw-free parts are extracted from the complete dataset and are vectorized to build the healthy dataset χ. The latter is centred for statistical purposes by subtracting the mean column from each column X^(i) of χ. The centred dataset is denoted as X for the rest of the paper.
We consider that every part is an exclusive combination of a defective and a healthy component. We assume that the defective component has two possible origins: global geometrical scatter, or metallurgical or local geometrical anomalies (such as porosities, shrinkage and local movements). The first type of variation is a consequence of the nonideality of the process, which provokes geometrical deviations along the parts. As we consider hollow parts manufactured by the investment casting process, independent movements of both inner and outer surfaces characterize the global geometrical variations. The second category, which is the most interesting in this work, is more critical for the quality process.
To avoid interference due to global (at the scale of the part) geometrical variations, we work on subdomains of the initial volume, usually slices. Once the initial volume is subdivided, a further local registration of each subvolume is applied to fix the global geometrical variations. Hence, a volume X from the training dataset that is decomposed into s subdomains is handled as the set of its subvolumes X^l, with l = 1, ..., s. Local variations, such as distortion or shearing, cannot be fixed by local registration; they will be part of the second category of defects.
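The vectorization and centring step described above can be sketched as follows. This is a minimal illustration on toy data, not the authors' implementation: the function name `build_centred_dataset` is hypothetical, and the sketch stores one part per row (the transpose of the column convention used in the text), which is an assumption made for convenience with NumPy.

```python
import numpy as np

def build_centred_dataset(volumes):
    """Vectorize N flaw-free volumes and centre them.

    volumes: iterable of 3D arrays with identical shape (i, j, k).
    Returns (X, mean): X has shape (N, i*j*k) with one part per row
    (an assumption of this sketch), and mean is the voxel-wise mean
    of the training parts, flattened to a vector.
    """
    stack = np.stack([np.asarray(v, dtype=np.float64).ravel() for v in volumes])
    mean = stack.mean(axis=0)
    return stack - mean, mean

# Toy data standing in for registered, filtered CT volumes.
rng = np.random.default_rng(0)
toy_volumes = [rng.normal(size=(4, 5, 6)) for _ in range(8)]
X, mean_volume = build_centred_dataset(toy_volumes)
```

After centring, every voxel has zero mean across the training parts, so the subsequent decomposition describes variations around the average flaw-free part.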
As shown in the work of [7], we perform PCA on each X^l to determine the modal base, truncated to the modes with the highest eigenvalues. To perform PCA, we apply singular value decomposition (SVD). For a given matrix A with dimension (n, p) and n ≤ p, its rank r ≤ n is also the dimension of the space generated by the linearly independent vectors of A. In other words, it corresponds to the independent information contained in A. Hence, we can compute the eigenbase of a given volume with the N healthy parts. To represent the characteristic functions of each local subvolume, we gather all the subvolumes X^l of the whole dataset X. As the data of X are vectorized, we build a matrix X^l_{1,N} that gathers the subvolume l for each flaw-free part of X. Technically, we perform SVD on each X^l_{1,N} for a given subvolume l such that:
$$X^{l}_{1,N} = U \Sigma V^{T}$$
where U and V are rectangular transfer matrices from the original vector space to the orthogonal vector space, whose columns are called left and right singular vectors, respectively, and ·^T denotes the transposition. In general, the matrix Σ is pseudo-diagonal since, beyond a given rank r (depending on the input), its diagonal values are null. All values of indices i ≤ r are nonnull and decreasing. Thus, truncation is realized at t ≤ r, and the matrix V is extracted as the truncated eigenbase, which is also called the truncated modal base.
In our case, we have the values N = 170, s = 50, i = j = k = 700 and t = 170. Regarding the Σ matrices obtained for the subvolumes, no truncation is necessary, as no redundancies are observed in the data. This means that our sample choice is not yet sufficient to completely parametrize the healthy space dimension. All bases V are formed by t eigenvectors, also called modes, and the associated coefficients, v. The larger the training set is, the larger the matrix will be, allowing for a more precise representation with a greater number of eigenmodes. A summary of the overall algorithm is presented in Fig. 1.
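As an illustration of this step, the sketch below extracts a truncated modal base from a centred training matrix with `numpy.linalg.svd`. It assumes one flaw-free part per row, so that the right singular vectors are the spatial modes; the function name `modal_base` and the toy dimensions are hypothetical and much smaller than the values reported above.

```python
import numpy as np

def modal_base(X_l, t=None):
    """Return the t leading modes of a centred training matrix.

    X_l: array of shape (N, p), one vectorized subvolume per row
         (row layout is an assumption of this sketch).
    Returns (V, sigma): V has shape (p, t) with orthonormal columns
    (the modes), and sigma holds the t largest singular values in
    decreasing order.
    """
    _, sigma, Vt = np.linalg.svd(X_l, full_matrices=False)
    if t is None:
        t = len(sigma)  # no truncation
    return Vt[:t].T, sigma[:t]

# Toy example: 10 parts, 200 voxels per subvolume.
rng = np.random.default_rng(1)
X_l = rng.normal(size=(10, 200))
X_l -= X_l.mean(axis=0)          # centring removes one degree of freedom
V, sigma = modal_base(X_l, t=9)
```

In this toy case the centring leaves at most N − 1 = 9 independent directions, which is why 9 modes are requested.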

Volume Analysis
Once the PCA bases have been computed on the flaw-free parts for all subvolumes, a healthy base set is available, and the 3D PCA method can be applied to newly scanned parts from the complete dataset. The main difference between our method and the methods in the literature occurs in this last step. In contrast with Oliver et al. [7], our training set is only composed of healthy data. In that way, we do not consider the modal shape of flaws within the modal base. The base contains only 3D spatial functions describing the manifold of acceptable geometrical variabilities.
The next step is to apply the same preprocessing pipeline to a new scan. A diagram is shown in Fig. 2 to explain the processing algorithm. This new volume is then decomposed with respect to the previously calculated modes. For a new turbine blade scan, we assume that it can be decomposed into two exclusive components: H, which includes the healthy parts, and D, which includes the anomalies. From a linear algebra point of view, an orthogonal projection onto a PCA base splits the volume into a combination of the base vectors and a residual that is orthogonal to the subspace spanned by the base. Hence, the scan is the sum of its reconstruction from the modes and a residual R, which is, as there is no truncation, equal to D. Furthermore, because the defects are not present in the training dataset, their projection on the orthogonal base will be small. Hence, when reconstructing the volume, they will be part of the reconstruction or projection error R.
Figure 3 illustrates the different components with a real subvolume. The new synthetic volume H, as shown in Fig. 3B, represents the volume without defects. Then, a subtraction between this synthetic volume and the real volume is performed; see Fig. 3C. This allows the removal of the healthy elements from the real volume. Thus, only the volume of the anomalies (Fig. 3C), not spatially coded in the modes, will be part of the residual.
This subtraction removes the repeatable artefacts and the correct metal geometry; that is, the flaw signature is strongly highlighted compared to the original data.
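The projection and subtraction can be sketched on synthetic data as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the healthy variation is simulated as a combination of three hidden modes, the flaw is a single bright voxel at a hypothetical position, and the function name `residual_volume` is introduced only for this example.

```python
import numpy as np

def residual_volume(scan, mean, V):
    """Split a vectorized scan into a synthetic healthy part and a residual.

    scan: 1D array (vectorized subvolume), mean: training mean (1D),
    V: array of shape (p, t) whose orthonormal columns are the modes.
    Returns (residual, healthy) with scan == healthy + residual.
    """
    coeffs = V.T @ (scan - mean)      # coordinates of the scan in the modal base
    healthy = mean + V @ coeffs       # synthetic flaw-free volume
    return scan - healthy, healthy

rng = np.random.default_rng(2)
p, n_parts = 300, 12
hidden_modes = rng.normal(size=(3, p))             # simulated healthy variability
train = rng.normal(size=(n_parts, 3)) @ hidden_modes
mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
V = Vt[:3].T                                       # modal base learnt from healthy data

new_scan = rng.normal(size=3) @ hidden_modes       # healthy component of a new part
new_scan[42] += 5.0                                # synthetic flaw (hypothetical position)
residual, healthy = residual_volume(new_scan, mean, V)
flaw_index = int(np.argmax(np.abs(residual)))
```

Because the healthy component of the new scan lies in the subspace learnt from the training parts, it is reconstructed by the modes, and the residual is dominated by the voxel that carries the synthetic flaw.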

Results
A qualitative visual comparison of the raw volume, the result of a golden part subtraction method and the result of our 3D PCA method is shown in Fig. 4.
In Fig. 4, an Inconel wire (nickel-based alloy), selected among the standard equipment used to estimate the image quality indicator (IQI) value, is placed into the part to represent a flaw. Different features are noticeable in the different figures. First, the contrast of the IQI does not decrease with the 3D PCA method, while the structure signal decreases significantly. Another remarkable behaviour is that the artefact is erased after processing (compare the artefact on the foot of the blade in Fig. 4 before (left) and after (right) the 3D PCA method). As this artefact is a consequence of the complex geometry in X-ray CT, it is repeated in the measurements and is removed with the flaw-free modal base. Consequently, the repeatable artefact is contained in the modal base V.
The quantification of the algorithm performance is shown in Fig. 5. Here, the anomalies are represented by six different Inconel devices of several diameters: 250, 200, 160, 130, 100, and 80 µm. An estimation of the contrast-to-noise ratio (CNR), defined in Formula 6, of the grey level of the residual signal part (GL H ) and the grey level of the anomaly signal (GL D ) is used as a performance metric. The noise considered is the background standard deviation σ B .
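Formula 6 is not reproduced in this text, so the sketch below assumes a common form of the CNR: the difference between the anomaly grey level GL_D and the residual structure grey level GL_H, divided by the background standard deviation σ_B. The use of region means as grey levels, the mask arguments and the function name `contrast_to_noise` are assumptions of this illustration.

```python
import numpy as np

def contrast_to_noise(residual, anomaly_mask, structure_mask, background_mask):
    """Assumed CNR = (GL_D - GL_H) / sigma_B, estimated from three regions."""
    gl_d = residual[anomaly_mask].mean()        # grey level of the anomaly signal
    gl_h = residual[structure_mask].mean()      # grey level of the residual structure
    sigma_b = residual[background_mask].std()   # background noise
    return (gl_d - gl_h) / sigma_b

# Toy 1D "residual": 4 background, 4 structure and 4 anomaly voxels.
values = np.array([-1.0, 1.0, -1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0])
index = np.arange(values.size)
cnr = contrast_to_noise(values, index >= 8, (index >= 4) & (index < 8), index < 4)
```

With this convention, a positive CNR means that the anomaly signal exceeds the remaining structure signal, which matches the interpretation given for Table 2.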
We can see in Table 2 that in the blade, all the CNR values of the wires are greater than 0. In other words, the signal of the wires becomes greater than the signal of the structure. In the foot of the blade, which is a very thick zone (more than 10 × 40 mm), the large wire becomes distinguishable. The grey level of the IQI and the noise remain stable, and the structure that is the cause of false-positives has been erased. This method is more efficient than the GP method.

Discussion
Thanks to the method described in this paper, we are able to segment defects between 80 µm (in the thinner parts of the turbine blade) and 200 µm (in the thicker parts of the turbine blade). The discussed technique can be used to help a human operator find regions of interest in the CT data and thus reduce the control time of a part. The algorithm is tested
In high-risk applications, such as HP turbine blade development, one major problem with classification-based machine learning or template matching methods [10] is the presence of false negative results, especially if the anomaly is not present in the training dataset. Our method is based on valid samples, which means that these samples pass the quality control requirements. Since our method learns the nominal shape of the product available in the training set, it can highlight any anomaly, independent of its spectral signature. The 3D PCA method applied in this paper does not learn from the defect shape; it only extracts the anomalies not represented in the flaw-free database. Particular attention must be paid to database construction, as no unwanted anomalies should be inserted.
Another limitation of the method is its susceptibility to geometric variance. The performance of 3D PCA methods on manufacturing anomalies that have a small size by nature is significantly degraded if the large-scale geometry variation of the sample is not controlled. To increase performance (i.e., reduce the need for training parts), we segment the part into multiple sub volumes where large-scale variation is less significant. The most straightforward decomposition approach is to cut the volume in slices, but on more complex geometries, especially with thin structures (susceptible to bending), a local decomposition approach may be more adaptable. In addition, we register those subvolumes to ensure the smallest geometric variation. The 3D PCA method is then applied independently to each discretized part of the initial volume, and the parts are then patched back together.
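The slice-wise decomposition and reassembly can be sketched as below. This is a simplified illustration: the local registration of each subvolume is not included, the function name `process_by_slab` is hypothetical, and the `process` callback stands for whatever per-subvolume treatment is applied (for example, a projection residual computed with the modal base of that subvolume).

```python
import numpy as np

def process_by_slab(volume, s, process, axis=0):
    """Cut a volume into s slabs along one axis, process each, and reassemble.

    process(l, slab) must return an array with the same shape as slab;
    l is the subvolume index, so a different modal base can be used per slab.
    """
    slabs = np.array_split(volume, s, axis=axis)
    return np.concatenate([process(l, slab) for l, slab in enumerate(slabs)], axis=axis)

# Toy volume cut into 3 slabs of thickness 3, 2 and 2 voxels.
volume = np.arange(7 * 4 * 4, dtype=np.float64).reshape(7, 4, 4)
restored = process_by_slab(volume, 3, lambda l, slab: slab)                 # identity
centred = process_by_slab(volume, 3, lambda l, slab: slab - slab.mean())   # per-slab processing
```

Using `numpy.array_split` rather than `numpy.split` lets the number of slabs be chosen independently of the volume size, at the cost of slabs with slightly different thicknesses.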

Conclusion
This paper describes our proposed method, which allows the suppression of normal variations in turbine blades and of repeatable artefacts. The novelty of our work is that with very few training data (approximately 170 samples), our model can make the signal from the background smaller than the signal from the indications. Furthermore, our method is not susceptible to false negatives caused by defect types that are absent from the training set, because it learns only from flaw-free parts. The proposed method achieves this performance because the spatial invariance is controlled. However, this requires a good understanding of the process variation and an ad hoc discretization of the controlled parts into subvolumes.