Corrosion Detection for Automated Visual Inspection

Vessels constitute one of the most cost-effective means of transporting bulk goods around the world. However, despite the efforts to reduce maritime accidents, they still occur and, from time to time, have catastrophic consequences in personal, environmental and financial terms. Structural failure is the major cause of shipwrecks and, as such, Classification Societies impose extensive inspection schemes for assessing the structural integrity of vessels.


Introduction
Vessels constitute one of the most cost-effective means of transporting bulk goods around the world. However, despite the efforts to reduce maritime accidents, they still occur and, from time to time, have catastrophic consequences in personal, environmental and financial terms. Structural failure is the major cause of shipwrecks and, as such, Classification Societies impose extensive inspection schemes for assessing the structural integrity of vessels.
The external and internal parts of the hull can be affected by different kinds of defects typical of steel surfaces and structures, corrosion being of paramount importance. Nowadays, to detect these defects, visual hull inspections are carried out at great cost [1]: the vessel has to be emptied and placed in a dockyard, where temporary staging, lifts and/or movable platforms typically need to be installed to allow workers to inspect closely (and repair, if needed) all the different metallic surfaces and structures. Given the huge dimensions of some vessels, this process can mean the visual assessment of more than 600,000 m² of steel. Besides, the surveys are often performed in hazardous environments that are difficult to access, and the operational conditions are sometimes extreme for human operation. For large-tonnage vessels, such as Ultra Large Crude Carriers (ULCC), total expenses can be as high as one million euros.
Corrosion is a clear indicator of the state of the metallic structures of the hull and is thus of great interest to the surveyor [2]. Different kinds of corrosion may arise: general corrosion, which appears as non-protective friable rust and can occur uniformly on uncoated surfaces; pitting, a localized process that is normally initiated by a local breakdown of the coating and that results, through corrosive attack, in deep pits of relatively small diameter that can in turn lead to hull penetration in isolated random places; grooving, again a localized process, but this time characterized by linear-shaped corrosion that occurs at structural intersections where water collects and flows; and weld metal corrosion, which affects the weld deposits, mostly due to galvanic action with the base metal, and is likelier in manual welds than in machine welds.
The goal of the EU-funded FP7 MINOAS project [3] is to reengineer the inspection process through the incorporation of robotic technologies. Some of these robots are equipped with cameras that can provide images of the different areas of the vessel hull to be inspected [1]. This chapter reviews related work on visual defect detection in general, including corrosion, and then describes two pattern recognition approaches developed within the context of the MINOAS project to detect corrosion from digital images.

Related Work
Regarding automated visual defect detection, the scientific literature contains a large number of proposals. These can be classified according to the kind of surface on which they look for defects: some approaches address defect detection on particular objects or surfaces (e.g. LCD displays [4], printed circuit boards [5], copper strip surfaces [6], ceramic tiles [7], etc.), while other methods are intended for the detection of defects on generic surfaces (e.g. [8][9][10][11]).
A second classification can be established according to the kind of defect that they try to detect. In this regard, many approaches intended for the inspection of generic surfaces look for general, unspecific defects, although a considerable number of contributions are dedicated to a specific type of defect. This is the case of [12], [13] and [14], which are part of a large collection of contributions to automated visual crack detection.
Regarding corrosion detection from images, just a few works can be found in the literature (see e.g. [15][16][17][18]). This small number of contributions indicates that there is still much work to do. By way of example, the following sections introduce two novel algorithms for corrosion detection and assess their performance against a varied set of test images.

Description of the algorithm
WCCD is a supervised classifier built around a cascade scheme in which both stages can be considered weak classifiers. The idea is to chain different fast classifiers with poor individual performance in order to obtain a global classifier with much better performance. To this end, each weak classifier exploits different features of the items to classify, reducing the number of false positive detections at each stage. For good global performance, each classifier must present a false negative percentage close to zero.
The first stage of the cascade is based on the premise that a corroded area presents a rough texture. Roughness is related to the energy of the symmetric gray-level co-occurrence matrix (GLCM), calculated for intensity values downsampled to the range 0 to 31, for a given direction α and distance d [19]. The energy is obtained by means of Equation 1:
$$E = \sum_{i=0}^{31} \sum_{j=0}^{31} p(i, j)^2 \qquad (1)$$
where p(i, j) is the probability of the occurrence of gray levels i and j at distance d and orientations α or α + π. Patches with an energy lower than a given threshold τ_E, i.e. those exhibiting a rough texture, become candidates for closer inspection.
The second stage filters the pixels of the patches that have passed the roughness stage. This stage makes use of the colour information that can be observed in corroded areas. More precisely, the classifier works in the Hue-Saturation-Value (HSV) space, following the observation that the HSV values found in corroded areas are confined to a bounded subspace of the HS plane. Although the V component has been found to be neither significant nor necessary to describe the colour of corrosion, it is used to prevent the well-known instabilities in the computation of hue and saturation when the colour is close to white or black. In that case, the pixel is classified as non-corroded.
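The roughness test of the first stage can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes 8-bit gray levels, the horizontal direction (α = 0), and a patch given as a list of rows; the function names `glcm_energy` and `is_rough` are hypothetical, and the threshold value must be supplied by the caller.

```python
def glcm_energy(patch, d=5, levels=32):
    """Energy of the symmetric GLCM of a gray-level patch for alpha = 0.

    patch: list of rows of 8-bit intensities (0..255).
    """
    step = 256 // levels                      # 8 when levels == 32
    quantised = [[v // step for v in row] for row in patch]
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in quantised:
        for x in range(len(row) - d):
            i, j = row[x], row[x + d]
            counts[i][j] += 1                 # orientation alpha
            counts[j][i] += 1                 # orientation alpha + pi
            total += 2
    if total == 0:
        raise ValueError("patch is narrower than the distance d")
    # Energy: sum of squared co-occurrence probabilities.
    return sum((c / total) ** 2 for r in counts for c in r)


def is_rough(patch, tau_e, d=5):
    """A patch is a corrosion candidate when its GLCM energy is below tau_e."""
    return glcm_energy(patch, d=d) < tau_e
```

A perfectly uniform patch concentrates all probability in one GLCM cell and yields the maximum energy of 1, whereas a textured patch spreads the probability over many cells and yields a lower energy.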
A training step is performed prior to the application of this second stage of the corrosion classifier. In this case, training consists of building a two-dimensional histogram of HS values for image pixels known to be affected by corrosion in the training image set. The resulting histogram is subsequently filtered by zeroing entries whose value is below 10% of the highest peak.
The classifier proceeds as follows for every 3-tuple (h, s, v): (1) pixels close to black (v < mV) or white (v > MV ∧ s < mS) are labelled as non-corroded, and (2) for the remaining pixels, the HS histogram is consulted and the pixel is labelled as corroded if HS(h, s) > 0. Here, mV, MV and mS are given thresholds.
Notice that the stages of the cascade cannot be reversed since they do not work with the same kind of entities: while the second stage works at the pixel level, the first stage operates over 15 × 15-pixel image patches since it depends on texture, which necessarily involves a pixel neighborhood. Figure 1 shows the flow diagram of WCCD.
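The training step can be illustrated with the short sketch below. It assumes that training pixels are available as 8-bit RGB triples and uses the standard-library `colorsys` conversion; the function name `train_hs_histogram` and its parameters are illustrative, not taken from the original work.

```python
import colorsys


def train_hs_histogram(corroded_rgb_pixels, bins=256, keep_fraction=0.10):
    """Build an HS histogram from corroded pixels and zero the low bins."""
    hist = [[0] * bins for _ in range(bins)]
    for r, g, b in corroded_rgb_pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * bins), bins - 1)     # hue index
        si = min(int(s * bins), bins - 1)     # saturation index
        hist[hi][si] += 1
    peak = max(max(row) for row in hist)
    cutoff = keep_fraction * peak
    # Zero every entry whose height is below 10% of the highest peak.
    return [[c if c >= cutoff else 0 for c in row] for row in hist]
```

The filtering step discards hue-saturation combinations that appear only rarely among the corroded training pixels.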
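The two-step decision rule can be written compactly as below. The sketch assumes 8-bit h, s and v values and a 256 × 256 histogram indexed as `hs_hist[h][s]`; the default thresholds are the values reported for WCCD (mV = 50, MV = 200, mS = 50), and the function name is hypothetical.

```python
def classify_pixel(h, s, v, hs_hist, m_v=50, M_v=200, m_s=50):
    """Return True when the pixel (h, s, v) is labelled as corroded."""
    if v < m_v:                      # close to black: hue is unstable
        return False
    if v > M_v and s < m_s:          # close to white: hue is unstable
        return False
    return hs_hist[h][s] > 0         # consult the trained HS histogram
```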

Performance of WCCD
The performance of WCCD depends on the performance of its different stages. To configure the parameters of the roughness stage, several experiments have been performed considering different values for d and α when calculating the GLCM and, consequently, its energy level. No significant differences have been observed among the output values, and so the parameter values are set to d = 5 (pixels) and α = 0 (horizontal direction). As for the energy threshold τ_E, its value determines the algorithm performance in terms of computation time and number of false positives, since all patches with a high energy level are discarded and only those with a low value become input to the colour-checking step.
The parameters of the colour-based step, mV, MV and mS, are set to prevent the instabilities of the h and s values from affecting the pixel classification. Using 8-bit HSV values, both the minimum value mV and the minimum saturation mS are set to 50. The maximum value MV is set to 200.
Figure 2 provides classification outputs for the same input image using different energy thresholds τ_E. This parameter can be tuned to decrease false positives and allow only the detection of the most significant corroded areas. In the images, pixels labelled as corroded are colour-coded to indicate the probability of successful classification. To be more precise, the colour depends on the height of the corresponding histogram bin relative to the maximum bin height, max{HS(·, ·)}.
The performance of the detector has been quantified by means of a varied set of test images and manually generated ground truth data (after the proper configuration of the different parameters, as explained above). False positive and false negative percentages, respectively FP/no. pixels and FN/no. pixels, have been used as the figures of merit. This assessment has entailed the implementation of different techniques to filter the HS histogram and a subsequent comparison among the alternatives with regard to the generalization capability of the resulting classifier.
Downsampling the histogram to 32 levels for hue and saturation has been the first filter considered, which merely groups bins with similar hue-saturation values. As can be seen in the first and second rows of Table 1, this filter has resulted in a considerable improvement in comparison with the original 256 × 256 HS histogram, and thus it has been taken as the reference for comparison with the other filtering strategies.
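The two figures of merit can be computed as in the following sketch, which assumes that the detector output and the ground truth are binary masks of the same size (lists of rows with truthy values for corroded pixels). The function name is illustrative.

```python
def misclassification_percentages(predicted, ground_truth):
    """Return (FP %, FN %) with respect to the total number of pixels."""
    fp = fn = n = 0
    for p_row, g_row in zip(predicted, ground_truth):
        for p, g in zip(p_row, g_row):
            n += 1
            if p and not g:
                fp += 1          # labelled corroded, actually clean
            elif g and not p:
                fn += 1          # corroded pixel that was missed
    return 100.0 * fp / n, 100.0 * fn / n
```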
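Grouping bins in this way amounts to summing blocks of the original histogram. A minimal sketch, assuming a square histogram whose side is a multiple of the grouping factor (256 and 8 here, giving 32 levels), with a hypothetical function name:

```python
def downsample_histogram(hist, factor=8):
    """Sum factor x factor blocks of bins, e.g. 256 x 256 -> 32 x 32."""
    n = len(hist) // factor
    out = [[0] * n for _ in range(n)]
    for h, row in enumerate(hist):
        for s, count in enumerate(row):
            out[h // factor][s // factor] += count
    return out
```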
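More specifically, two more attempts have been made to reduce the false negative percentage while preserving the false positive percentage. On the one hand, the Parzen windows method [19] has been applied to the original 256 × 256 histogram using the two-dimensional Gaussian kernel shown in Equation 2:
$$K(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2}\right) \qquad (2)$$
where µ_x and µ_y are the hue and saturation values for the neighbourhood center, x and y are the values for a nearby sample, and σ is the standard deviation. The algorithm performance has been assessed by filtering the histogram using different values for σ. The best misclassification percentages are shown in the third row of Table 1 and correspond to σ = 12.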
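A Parzen-window smoothing of the histogram with an isotropic Gaussian kernel can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the kernel is truncated at three standard deviations, contributions falling outside the histogram are dropped, and the circular nature of hue is ignored. The function name `parzen_smooth` is hypothetical.

```python
import math


def parzen_smooth(hist, sigma=12.0, radius=None):
    """Spread each non-empty bin of a square histogram with a 2D Gaussian."""
    n = len(hist)
    if radius is None:
        radius = int(3 * sigma)               # assumed truncation
    norm = 1.0 / (2.0 * math.pi * sigma * sigma)
    out = [[0.0] * n for _ in range(n)]
    for mx, row in enumerate(hist):
        for my, count in enumerate(row):
            if count == 0:
                continue
            for x in range(max(0, mx - radius), min(n, mx + radius + 1)):
                for y in range(max(0, my - radius), min(n, my + radius + 1)):
                    d2 = (x - mx) ** 2 + (y - my) ** 2
                    out[x][y] += count * norm * math.exp(-d2 / (2.0 * sigma * sigma))
    return out
```

Only non-empty bins are visited, so the cost grows with the number of populated hue-saturation pairs rather than with the full histogram size.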
On the other hand, a bilateral filter [20] has been applied to the original 256 × 256 histogram, considering the bin heights as the intensity values of an image. This approach filters the histogram using a kernel consisting of two Gaussians, one for the spatial domain and another for the range domain. After the different experiments carried out, the best performance has been obtained for σ_spatial = 15, σ_range = 1 and a kernel size of 30 pixels. The fourth row of Table 1 provides the resulting performance values. The resulting histograms for the different filters can be found in Figure 3. In conclusion, the approach based on the bilateral filter seems to provide the best results, although the different strategies lead to similar final performance. The bilateral filter has thus been selected as part of WCCD.
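A plain bilateral filter over a histogram treated as an image can be sketched as below. This is a direct, unoptimised illustration: the window radius is taken as half the kernel size, borders are handled by clipping the window, and no normalisation of the bin heights is applied before filtering; all three choices are assumptions, as is the function name.

```python
import math


def bilateral_filter(hist, sigma_spatial=15.0, sigma_range=1.0, kernel_size=30):
    """Edge-preserving smoothing: weights combine distance and height difference."""
    rows, cols = len(hist), len(hist[0])
    radius = kernel_size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            centre = hist[i][j]
            num = den = 0.0
            for u in range(max(0, i - radius), min(rows, i + radius + 1)):
                for v in range(max(0, j - radius), min(cols, j + radius + 1)):
                    spatial = ((u - i) ** 2 + (v - j) ** 2) / (2.0 * sigma_spatial ** 2)
                    rng = (hist[u][v] - centre) ** 2 / (2.0 * sigma_range ** 2)
                    w = math.exp(-spatial - rng)
                    num += w * hist[u][v]
                    den += w
            out[i][j] = num / den        # den >= 1: the centre weight is 1
    return out
```

With a small range deviation, bins whose heights differ strongly from the centre bin receive almost no weight, so sharp peaks in the histogram are preserved while flat regions are smoothed.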
Examples of final classification outputs for WCCD are provided in Figure 4.
Regarding the execution times, WCCD provides corrosion-labelled images in 7-25 ms. These execution times correspond to images ranging from 120,000 to 172,000 pixels, and to a runtime environment comprising a laptop fitted with an Intel Core2 Duo processor (2.20 GHz, 4 GB RAM).

AdaBoost based Corrosion Detector (ABCD)

Description of the algorithm
ABCD makes use of the Adaptive Boosting paradigm (AdaBoost) for both learning and classifying corroded areas. Decision trees, as produced by the Classification and Regression Trees (CART) learning technique [21, 22], are used as weak classifiers.
Laws' texture energy filter responses are used to feed the decision trees, because these filters are able to enhance different features of every material texture. We use a filter bank with 48 different filters obtained by combining five 1D five-component basic filters. To describe a texture, the corresponding gray-level patch T is convolved with each energy filter of the set (T ⊗ filter → c) and different statistical measures are taken over a 15 × 15 neighbourhood of the filter response c; these measures constitute the texture descriptor.
During the learning process, AdaBoost is fed with 192 statistical measures per image patch (4 statistical measures × 48 energy filters), together with the roughness of the patch (computed as in the first stage of the WCCD detector) and its class label. The output is a set of weak classifiers together with their weights, which allows patches belonging to defective areas to be discriminated from those that do not.
In addition, the same technique used in the second stage of the WCCD algorithm, based on a Hue-Saturation histogram, is also used in ABCD to filter the pixels from the patches that are classified as corroded by AdaBoost. The flow diagram for ABCD is shown in Figure 5.
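The construction of Laws-type 2D masks from 1D vectors can be illustrated as follows. The sketch assumes the standard Laws vectors (level, edge, spot, wave, ripple), which are not listed in the text, and builds the 25 masks given by their pairwise outer products; how the 48-filter bank of ABCD is derived and which four statistics are used are not specified here, so the mean absolute response below is only a hypothetical placeholder statistic.

```python
# Standard Laws 1D vectors (assumed; the text does not list them).
LAWS_1D = {
    "L5": [1, 4, 6, 4, 1],       # level
    "E5": [-1, -2, 0, 2, 1],     # edge
    "S5": [-1, 0, 2, 0, -1],     # spot
    "W5": [-1, 2, 0, -2, 1],     # wave
    "R5": [1, -4, 6, -4, 1],     # ripple
}


def laws_masks():
    """All 25 5x5 masks obtained as outer products of the 1D vectors."""
    return {a + b: [[va * vb for vb in LAWS_1D[b]] for va in LAWS_1D[a]]
            for a in LAWS_1D for b in LAWS_1D}


def filter_response(patch, mask):
    """Valid-mode correlation (no kernel flip) of a patch with a 5x5 mask."""
    k = len(mask)
    rows, cols = len(patch) - k + 1, len(patch[0]) - k + 1
    return [[sum(mask[u][v] * patch[r + u][c + v]
                 for u in range(k) for v in range(k))
             for c in range(cols)] for r in range(rows)]


def mean_abs(response):
    """One hypothetical statistic: mean absolute filter response."""
    values = [abs(x) for row in response for x in row]
    return sum(values) / len(values)
```

Every mask that involves a zero-sum vector (E5, S5, W5 or R5) gives a zero response on a constant patch, so the resulting measures react to texture rather than to average brightness.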

Performance of ABCD
We have considered three versions of AdaBoost: Real, Gentle and Modest. Real AdaBoost is a generalization of the basic AdaBoost algorithm. Gentle AdaBoost [23] is a more robust and stable version of Real AdaBoost, used, for example, by the Viola-Jones object detector. Finally, Modest AdaBoost [24] is a version aimed mainly at better resistance to overfitting.
For training the algorithm, a total of 39746 patches have been gathered from 25 different images. These patches have been labelled as defective (12952 patches) or non-defective (26794 patches) by means of visual inspection. One half of the patches has been used to train the different versions of AdaBoost, while the other half has been used as control samples to assess the performance of the resulting classifiers.
The AdaBoost parameters have been configured as follows:
1. The maximum number of boosting iterations, i.e. the number of weak classifiers that make up the final classifier, has been set to 100.
2. The tree depth, that is, the depth of the CART, which determines how good the weak classifiers are, has been set to three levels.
Both parameters have been configured to improve the detection performance without prolonging the learning time unnecessarily.
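To make the role of the boosting iterations concrete, the sketch below implements the basic discrete AdaBoost algorithm with one-level decision stumps. It is deliberately simpler than the configuration described above: it is neither Real, Gentle nor Modest AdaBoost, and it uses stumps instead of depth-three CART trees. Labels are assumed to be +1 (defective) and -1 (non-defective), and all function names are illustrative.

```python
import math


def train_stump(X, y, w):
    """Best (error, feature, threshold, polarity) stump under weights w."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def stump_predict(stump, x):
    _, f, thr, pol = stump
    return pol if x[f] >= thr else -pol


def adaboost_train(X, y, n_rounds=100):
    """Return a list of (weight, stump) pairs, at most n_rounds long."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        if stump[0] < 1e-10:          # a perfect weak classifier: stop early
            break
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

In the setting described above, each row of `X` would hold the 192 texture measures plus the roughness of one patch, and `n_rounds` corresponds to the maximum number of boosting iterations.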
After the execution of the three versions of AdaBoost, their performances have been assessed. Table 2 shows the error percentages obtained for the three versions.
As can be seen, the best results are obtained for the Gentle version of AdaBoost. Some results obtained for this version, after adding the colour-based filter, are shown in Figure 6. Among the different images shown, the one in the first row deserves special mention: corrosion in the form of pitting, affecting a very small fraction of the image, is successfully detected.
The final performance of the complete algorithm has been analysed following the same procedure used for assessing the performance of WCCD. The false positive and false negative percentages have again been used as the figures of merit. The values obtained are shown in Table 3.
Regarding the execution times, ABCD provides corrosion-labelled images in 300-512 ms. These execution times correspond to the same images and runtime environment used for assessing WCCD.

Conclusions
In this chapter we have introduced two vision-based corrosion detection algorithms that have been developed within the context of the European project MINOAS. Both algorithms are based on the idea of combining weak classifiers to obtain a good global performance.
After assessing their performance, the misclassification percentages obtained for both algorithms turn out to be non-zero. These results can be explained by analysing the kind of misclassifications and the areas where they appear. On the one hand, the FN percentages are not zero because the detectors tend to label as corrosion the center of the corroded area, while the borders are usually not totally labelled. On the other hand, the FP percentages are not zero either, due to the presence of different structures in the image that are misclassified as defects.
Nevertheless, if corroded areas are considered as entities and it is assumed that the labelling of a single pixel within a defective area is useful, then the ratio between the number of undetected defective areas and the true number of defective areas turns out to be zero for both algorithms, since all of them are detected. In this regard, it is important to remember that the detectors are intended to facilitate visual inspections, and, thus, reporting the existence of corroded areas in images is considered worthwhile even if the areas are not completely labelled.
Taking into account the misclassification ratios, it is not clear whether WCCD performs better or worse than ABCD. However, based on the shorter execution times and the qualitative evaluation of the results, it seems that WCCD outperforms ABCD.

Figure 2. Corroded areas detected by WCCD for different energy threshold values

Figure 3. HS histograms resulting from the different filtering strategies (WCCD)

Table 2. Error percentages obtained for the different AdaBoost versions

Table 3. Misclassification measures for ABCD