Complex Crack Segmentation and Quantitative Evaluation of Engineering Materials Based on Deep Learning Methods

Recognizing and quantifying microcracks on surfaces is crucial for early detection of structural damage, as they can lead to more complex issues in engineering structures. In this study, a dataset reflecting varying surface cracks in various engineering materials from the 2018 Ecuador earthquake was constructed. Furthermore, we proposed deep learning-based methods for recognizing and quantifying complex surface cracks. The methods utilized an enhanced U-Net semantic segmentation model, along with noise reduction and topological parameter extraction post-processing algorithms. Compared to the original U-Net, the intersection-over-union for simple extended cracks, complex crisscrossed cracks, and microcracks was significantly improved, reaching as high as 91.71%, 89.56%, and 100%, respectively. The results showed that the enhanced U-Net architecture had a steady training process and high accuracy in segmenting various types of cracks, especially microcracks, which were difficult to detect through manual observation or conventional imaging techniques. Therefore, the proposed model is suitable for pre-alarming monitoring of multiscale damage. A noise reduction post-processing algorithm was introduced to enhance the segmenting capability by extracting detailed characteristics of prediction samples. Statistical analysis of false-positive samples showed that 90% of misrecognitions were eliminated compared to the original predictions. The study also developed an approach for determining the topological parameters of complicated patterns on hierarchical cracks in the refined masks at the pixel level. The average relative errors of area, perimeter, length, compactness, average width, and centroid coordinate for crack samples in ground-truth annotations and refined masks are below 10.75%, 1.89%, 1.51%, 6.20%, 9.75%, and 0.72%, respectively. Finally, a graphical user interface integrating all algorithms was explored and applied for damage assessment of structural components under various environmental disturbances.


I. INTRODUCTION
Engineering materials are the foundational elements of engineering structures and inevitably undergo the effects of environmental factors, aging, and dynamic loads during long-term service. This often leads to accumulation of damages at multiple scales. Detecting these damages accurately is critical for structural health monitoring [1], [2], [3]. Surface cracks are one of the most intuitive indicators to assess the damage status throughout the lifecycle of engineering structures [4]. They generally present complex geometrical features due to different inducing conditions, such as stress concentration, loading level, and temperature gradient. Microcracks, as precursors of severe fractures, offer early warning for internal damages to engineering materials. Early recognition of these cracks ensures timely intervention and informed decisions for repair and maintenance of engineering structures, which is more cost-effective than repairing severe damage and structural failures after the fact. Compared to other types of structural damages, surface cracks present more complex geometric features, making it challenging to establish a universal framework for accurate recognition [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Massimo Cafaro.
Over the last few decades, vibration-based and computer vision-based techniques have become popular for identifying damages in engineering materials [7], [8], [9]. Vibration-based methods depend on the structural dynamic response of the entire structural system. However, due to their low sensitivity in the frequency domain, they may fail to identify local surface cracks. On the other hand, manual inspection relies heavily on empirical observations and subjective judgment, making it unsuitable for highly accurate identification [10]. With the recent advancements in computer vision techniques, vision-based recognition has become a popular non-contact and automatic inspection approach [11]. This technique is used for dynamic displacement tracking [12], [13], global deformation recognition [14], and local damage detection of engineering infrastructure [15], [16], [17], [18], [19], [20], [21]. Image processing-based methods (IPMs) were commonly used in the early stages of crack detection research. For example, Fujita et al. [22] used a combination of median and linear filters based on the Hessian matrix to segment cracks in images. Abdel-Qader et al. [23] compared four edge detection methods and found the most robust algorithm to be the fast Haar transform. Kim et al. [24] evaluated five commonly used crack binarization methods to obtain optimal parameters for segmentation. To improve detection accuracy, global features of cracks are combined with filter-based techniques and classification algorithms, such as artificial neural networks (ANNs) and support vector machines (SVMs). Li et al. [25] developed a crack detection algorithm using Canny edge detection and SVMs. Jahanshahi et al. [26] proposed a morphological feature extractor and an ANN/SVM classifier to extract cracks on engineering materials. Despite the benefits of image processing technologies, background interferences can limit their performance, especially for microcracks.
Deep learning models trained on multi-scale datasets have proven to be highly automated, accurate, and robust in comparison to other methods. Since the introduction of LeNet-5 [27] by LeCun et al., convolutional neural networks (CNNs) have become the dominant architecture for deep learning models. Some studies, such as Cha et al. [28], used a sliding window CNN model for crack detection on engineering material surfaces. Others, such as Mohtasham Khani et al. [29], combined CNN and classical image processing methodologies to propose a crack detection framework. However, most patch-level detection algorithms only label the category and localize the window position that covers an approximate region of cracks, rather than elaborately quantifying detailed features of edge, shape, morphology, and distribution at the pixel level.
Crack recognition at the pixel level generally uses end-to-end semantic segmentation methods, which allow the direct identification of pixels belonging to crack regions without the need for initial region localization and subsequent edge detection. The fully convolutional network (FCN) [30] can be used to detect cracks at the pixel level. Various network architectures have been used as the encoder backbone, including VGG16 and VGG19 [31], Inception V3 [32], and ResNet152 [33]. In particular, VGG19 was used as the FCN backbone to segment cracks in infrastructural systems by Yang et al. [34]. Bang et al. [35] selected ResNet152 as the backbone and utilized deconvolution layers for up-sampling in the decoder part. Dung and Anh [36] compared the performance of FCN models with different backbone networks of VGG16, InceptionV3, and ResNet152 on an open-source dataset of crack images, and VGG16 was found to have the best performance as the backbone of FCN for crack segmentation. Compared to other frameworks, U-Net [37] has proven to be more effective in binary semantic segmentation, with a simple symmetrically-designed encoder-decoder structure, and has shown great success in processing medical images. In engineering materials, U-Net has been used for damage segmentation by Liu et al. [38]. Zhang et al. [39] investigated different U-Net-like models, indicating that deeper architectures had better accuracy for crack detection. Shi et al. [40] replaced the original backbone of U-Net architecture with VGG-net and established VGG-U-Net to segment corrosion regions and cracks on an engineering structure.
It is evident from the aforementioned literature that commonly-used deep learning-based methods were merely designed for detecting surface cracks on concrete structures. However, more detailed quantification information needs to be acquired from a limited number of images for more precise damage assessment. Accurately capturing and quantifying surface cracks with intricate morphologies and complex backgrounds remains a great challenge. In addition, misrecognitions increase with background complexity. Therefore, a post-processing method for the elimination of false-positive samples must be developed, despite the challenges that come with it. Moreover, related studies on the extraction of morphological parameters for crack quantification are scarce, and the corresponding algorithm should be devised. Furthermore, the integration of accurate extraction and quantification of cracks remains an issue, particularly when applied to the assessment of structural components.
To address these issues, this study proposes deep learning-based methods for accurate detection and quantification of surface cracks. The methods are based on the hypothesis that randomly sampled images of external damage to engineering materials can be correlated with their internal damage, enabling further safety assessment of structural components. The model, based on the enhanced U-Net architecture, is designed to effectively detect complex cracks with diverse morphologies against complex background disturbances. Additionally, the model is integrated with a noise reduction algorithm and a morphological parameter extraction algorithm, and a graphical user interface (GUI) is introduced for refined assessment of structural components. The paper is organized into six sections. Section II introduces the overall framework of the proposed method. Section III provides details of the investigated dataset and configurations of the training procedure. Section IV presents the test results of crack segmentation and quantification, which further validate the effectiveness of the enhanced U-Net model. Section V integrates the proposed model and algorithms into a GUI for actual applications in structural damage assessment. Finally, Section VI concludes this study.

II. METHODOLOGY

A. METHODOLOGY OVERVIEW
The proposed method for accurately recognizing and quantifying surface cracks on engineering materials consists of three modules, as shown in Fig. 1. The first module is an enhanced U-Net model that is used for semantic segmentation of complex cracks. The second module is an image processing technique (IPT) module that reduces false-positive noise samples in predictions. The third module is a crack quantification and geometric morphology extraction module. These modules are integrated into a platform for practical applications of damage assessment of structural components. More detailed descriptions are provided in the following subsections.
[...] called skip connection is employed to merge the encoding and decoding features. The improvements of the new model over the original U-Net are as follows:
1) The input grayscale image of the original U-Net model has a resolution of 572 × 572, and the feature map size is reduced by two pixels in each spatial dimension after each convolution operation without zero padding. In this study, the color image is first converted to grayscale, and then the image with a size of 512 × 512 is input into the model. Zero padding is added in the convolution operation to ensure that the input and output sizes of each layer remain the same.
2) Batch normalization (BN) is added after each convolutional layer to control the mean and variance of multilevel feature maps within a specific range, which reduces internal covariate shift. This operation improves the learning capacity of the model.
3) Bilinear interpolation is utilized for the up-sampling procedure in the decoder, instead of transposed convolution. This reduces the model parameters during feature map reconstruction and enhances the training efficiency.
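As a rough illustration of improvements 1) and 3), the sketch below applies the standard convolution output-size formula and counts the weights of a transposed convolution. The helper names and the 1024-to-512 channel example are hypothetical and are not taken from Table 1; bilinear interpolation itself has no learnable parameters.

```python
def conv_output_size(size: int, kernel: int = 3, padding: int = 0, stride: int = 1) -> int:
    """Spatial size after one convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def transposed_conv_params(in_ch: int, out_ch: int, kernel: int = 2) -> int:
    """Learnable weights plus biases of one transposed convolution layer."""
    return in_ch * out_ch * kernel * kernel + out_ch


# Original U-Net: two unpadded 3x3 convolutions shrink 572 -> 570 -> 568.
unpadded = conv_output_size(conv_output_size(572))

# Enhanced model: zero padding of 1 keeps a 512 x 512 input at 512 x 512.
padded = conv_output_size(conv_output_size(512, padding=1), padding=1)

# Hypothetical channel sizes: parameters that a 2x2 transposed convolution
# would need and that bilinear up-sampling avoids.
saved = transposed_conv_params(1024, 512)

print(unpadded, padded, saved)
```

The saving per up-sampling stage depends on the actual channel widths of the decoder, so the figure printed here is only indicative.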
The network architecture of the enhanced U-Net model is as follows, and the detailed configurations are given in Table 1.
1) In the first encoder layer, the input image undergoes a convolution process using 64 convolutional kernels of size 3 × 3. The generated feature map then goes through a BN layer and a nonlinear activation layer of Rectified Linear Unit (ReLU). This entire process is referred to as "Conv 3 × 3, BN, ReLU" (CBR), as shown in Fig. 2. The CBR procedure is repeated, resulting in a 512 × 512 × 64 feature map as the output feature map of Down1. The spatial dimension of the feature map is then reduced to 256 × 256 × 64 by applying a max pooling operation with a 2 × 2 kernel. The second encoder layer Down2 includes two sets of CBR with 128 channels, and this feature map is also processed by a max pooling operation. This procedure is similar to that of Down1. The CBR and max pooling operations are repeated until the fifth encoder layer Down5, where the spatial dimension of the feature maps decreases, and the channel number increases for more global and abstract feature extraction. The last encoder layer Down5 contains a feature map of 32 × 32 × 512.
2) To achieve a detailed classification of cracks through semantic segmentation, a series of up-sampling and feature fusion operations are utilized in the decoder section. In this study, we applied bilinear interpolation to the output feature map of Down5 with a magnification factor of two. This produced a feature map of 64 × 64 × 512. Furthermore, we designed the feature fusion procedure through a skip connection operation that copies the output feature map in the encoder stage and concatenates it with the up-sampled feature map in the decoder stage. A similar process of up-sampling and feature fusion is repeated in the decoder from Up4 to Up1. As a result, the channel number decreases while the spatial dimension increases in the decoding process to reconstruct the crack semantic map. Eventually, we generate a feature map with the same resolution as the input image and 64 channels. The final semantic map is obtained by applying a set of convolutions with a 1 × 1 kernel (conv 1 × 1) and a pixel-wise classification using softmax.
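The feature-map sizes quoted above can be checked with a small shape-bookkeeping sketch. The encoder widths (64, 128, 256, 512, 512) follow the text; the decoder output widths (256, 128, 64, 64) are an assumption borrowed from common bilinear U-Net implementations, since Table 1 is not reproduced here, and `trace_shapes` is a hypothetical helper.

```python
def trace_shapes(size=512,
                 enc_channels=(64, 128, 256, 512, 512),
                 dec_channels=(256, 128, 64, 64)):  # decoder widths are assumed
    """Return (name, height, width, channels) per encoder stage and
    (name, height, width, channels_after_concat, channels_out) per decoder stage."""
    encoder = []
    s = size
    for k, ch in enumerate(enc_channels):
        if k > 0:
            s //= 2  # 2x2 max pooling halves the spatial size
        encoder.append((f"Down{k + 1}", s, s, ch))

    decoder = []
    ch = enc_channels[-1]
    for k, out_ch in enumerate(dec_channels):
        s *= 2  # bilinear up-sampling with a magnification factor of two
        skip_ch = encoder[-(k + 2)][3]  # matching encoder stage for the skip connection
        decoder.append((f"Up{len(dec_channels) - k}", s, s, ch + skip_ch, out_ch))
        ch = out_ch
    return encoder, decoder


encoder, decoder = trace_shapes()
for stage in encoder + decoder:
    print(stage)
```

Under these assumptions the trace ends with a 512 × 512 map of 64 channels at Up1, which is the input to the final 1 × 1 convolution.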

C. POSTPROCESSING MODULE FOR FALSE-POSITIVE NOISE REDUCTION
To further refine the semantic segmentation results, a post-processing module for noise reduction is designed to eliminate false-positive predictions caused by complex background interferences. The false-positive samples can be characterized by minuscule area and scattered distribution. The directed area of a closed polygon can be calculated from the coordinates of the points on its boundary. Therefore, in order to calculate the directed area of predicted samples, it is necessary to obtain the boundary set of each sample. For this purpose, an image scanning and contour tracing algorithm, as a fundamental image processing method, is selected to obtain the boundary set. A simple and fast procedure is presented in Algorithm 1.
A discrete function f(i, j) = 0 or 1 in the Cartesian coordinate system is used to represent a binary crack mask with m × n pixels. First, a zero matrix l of the same size as f is set to mark visited pixels. Then, a variable C_s storing the output boundary sets of all samples is initialized as an empty list. A variable C storing the boundary set of each sample is initialized as an empty list and passed to C_s. A variable c_p storing the coordinate of the current pixel is updated within the loop to trace the boundary path of a sample. A variable storing the neighborhood coordinates of the current pixel is defined as in Fig. 3(a), where the 8-neighbor domain is recommended. Finally, the variable C_s records the boundary sets of all samples. The directed area of each sample can be calculated from the boundary pixel coordinates C in C_s, determined by
$$ S = \frac{1}{2} \sum_{i=1}^{n} \left( x_i y_{i+1} - x_{i+1} y_i \right), \qquad (x_{n+1}, y_{n+1}) = (x_1, y_1) $$
where n denotes the number of boundary pixels and (x_i, y_i) denotes the coordinate of the i-th boundary pixel in the boundary set of a certain prediction region C. As shown in Fig. 3, the false-positive noise samples are typically horizontally (where x_1 = x_2 = ... = x_n, therefore S = 0) or vertically (where y_1 = y_2 = ... = y_n, therefore S = 0) connected pixels.
Algorithm 1 (recoverable steps): when f(i, j) = 1 and l(i, j) = 0, the current pixel c_p is appended to C and marked as visited, l(i, j) = 1; then, in a loop, the neighbors of c_p are obtained by get_neighbors(c_p), and each neighbor (ni, nj) is checked for f(ni, nj) = 1 and for not yet being visited.
Because the directed area of these samples is equal to zero, they can be recognized by setting the area threshold to zero. Thus, these false-positive noise samples can be eliminated by replacing them with pure background pixels. The general flow of the main techniques and implementation procedure is shown in Algorithm 2.
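The idea of this subsection can be sketched in plain Python as follows. This is not the authors' Algorithm 1 or 2: it swaps in a Moore-neighbor boundary trace with a simple "stop on return to the start pixel" rule, uses the shoelace formula for the directed area, and adds a flood fill (an assumption) so that each connected sample is visited once. All function names are hypothetical.

```python
from collections import deque

# 8-neighbor offsets (row, col) in clockwise order, starting from west.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]


def is_crack(mask, i, j):
    return 0 <= i < len(mask) and 0 <= j < len(mask[0]) and mask[i][j] == 1


def trace_boundary(mask, start):
    """Moore-neighbor trace of the outer boundary of the sample containing start.
    start must be the first pixel of the sample in raster order (west is background)."""
    boundary, current, back = [start], start, 0
    while True:
        for step in range(1, 8):  # clockwise search, beginning after the backtrack pixel
            d = (back + step) % 8
            nxt = (current[0] + OFFSETS[d][0], current[1] + OFFSETS[d][1])
            if is_crack(mask, *nxt):
                break
        else:
            return boundary  # isolated pixel: no crack neighbor
        if nxt == start:
            return boundary  # simplified stopping rule
        prev = (current[0] + OFFSETS[d - 1][0], current[1] + OFFSETS[d - 1][1])
        back = OFFSETS.index((prev[0] - nxt[0], prev[1] - nxt[1]))
        boundary.append(nxt)
        current = nxt


def directed_area(boundary):
    """Shoelace formula over ordered (row, col) boundary pixels (x = col, y = row)."""
    n, total = len(boundary), 0
    for k in range(n):
        (y0, x0), (y1, x1) = boundary[k], boundary[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2


def flood_fill(mask, start, visited):
    """Collect all 8-connected pixels of one sample and mark them as visited."""
    pixels, queue = [], deque([start])
    visited[start[0]][start[1]] = True
    while queue:
        i, j = queue.popleft()
        pixels.append((i, j))
        for di, dj in OFFSETS:
            ni, nj = i + di, j + dj
            if is_crack(mask, ni, nj) and not visited[ni][nj]:
                visited[ni][nj] = True
                queue.append((ni, nj))
    return pixels


def remove_zero_area_samples(mask):
    """Replace every sample whose directed area is zero with background pixels."""
    visited = [[False] * len(mask[0]) for _ in mask]
    refined = [row[:] for row in mask]
    for i, row in enumerate(mask):
        for j, value in enumerate(row):
            if value == 1 and not visited[i][j]:
                boundary = trace_boundary(mask, (i, j))
                pixels = flood_fill(mask, (i, j), visited)
                if directed_area(boundary) == 0:
                    for pi, pj in pixels:
                        refined[pi][pj] = 0
    return refined


mask = [
    [1, 1, 1, 0, 0, 0],  # horizontal run of pixels: zero directed area
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],  # 2 x 2 block: non-zero directed area, kept
    [1, 0, 0, 0, 0, 0],  # isolated pixel: zero directed area
]
refined = remove_zero_area_samples(mask)
```

Note that any one-pixel-wide collinear run, including a diagonal one, has zero directed area under this rule, so the threshold also removes such samples.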

D. MORPHOLOGICAL PARAMETER EXTRACTION FOR COMPLEX CRACK QUANTIFICATION MODULE
Fig. 4 illustrates the schematics of extracting morphological parameters for crack quantification, encompassing both single and complex criss-cross cracks. As displayed in Fig. 4(a), the pixels of a single crack are segregated into five categories: the starting pixel, the ending pixel, the two lists of pixels on the left and right boundaries, and the internal pixels. On the other hand, Fig. 4(b) demonstrates how the pixels of a complex criss-cross crack are categorized into two parts: the list of pixels on the boundary and the internal pixels.
Algorithm 2 (noise reduction, recoverable steps): if S = 0, then g(i, j) = 0 for every (i, j) in C; otherwise, g(i, j) = f(i, j) for every (i, j) in C; finally, g(i, j) is returned.
As shown in Fig. 4(a), a single crack is relatively simple and splits along an explicit direction. The quantification procedure for a single crack is determined by (1)-(4), in which n denotes the number of total boundary pixels and the average width is
$$ w_{avr} = \frac{S}{L} \tag{4a} $$
where S is the area of the crack region and L is the crack length.
During the long-term service of structural components, a single crack may gradually expand and intersect with other cracks, forming complex criss-cross cracks. As shown in Fig. 4(b), some basic morphological parameters of a complex criss-cross crack are determined in the same manner as for a single crack, including perimeter C, length L = 0.5C, centroid coordinates $(\bar{x}, \bar{y})$, area of crack region S, area of the corresponding convex polygon $\overset{\frown}{S}$, and compactness λ. Compared to simple cracks, complex criss-cross cracks randomly extend in diverse directions, possibly propagating into intricate crack networks. Because the box-counting model has excellent capacity and high reliability in the quantitative characterization of complex surface cracks [41], [42], the box-counting dimension is employed as an additional morphological parameter to quantify the complexity of criss-cross cracks and is determined by (5).
$$ D = \lim_{L \to 0} \frac{\log N(L)}{\log (1/L)} \tag{5} $$
where D is the box-counting dimension, which reflects the complexity of criss-cross cracks; L and N denote the side length and the number of the unit square boxes that cover the crack region, respectively.
A general flow of morphological parameters extraction for crack quantification follows Algorithm 3.
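A minimal sketch of the basic parameters named in this subsection is given below. Only L = 0.5C and w_avr = S/L are taken from the text; computing the area with the shoelace formula, the perimeter as the sum of Euclidean segment lengths, and the centroid as the mean of the boundary coordinates are assumptions, and `single_crack_parameters` is a hypothetical helper rather than Algorithm 3.

```python
import math


def single_crack_parameters(boundary):
    """boundary: ordered list of (x, y) boundary points of one closed crack region."""
    n = len(boundary)
    twice_area = 0.0
    perimeter = 0.0
    for k in range(n):
        (x0, y0), (x1, y1) = boundary[k], boundary[(k + 1) % n]
        twice_area += x0 * y1 - x1 * y0            # shoelace term
        perimeter += math.hypot(x1 - x0, y1 - y0)  # assumed perimeter definition
    area = abs(twice_area) / 2
    length = 0.5 * perimeter                       # L = 0.5 C, as stated in the text
    centroid = (sum(x for x, _ in boundary) / n,   # assumed: mean of boundary points
                sum(y for _, y in boundary) / n)
    return {
        "area": area,
        "perimeter": perimeter,
        "length": length,
        "average_width": area / length,            # w_avr = S / L, equation (4a)
        "centroid": centroid,
    }


# Idealized straight crack: a 10 x 2 pixel rectangle.
params = single_crack_parameters([(0, 0), (10, 0), (10, 2), (0, 2)])
print(params)
```

For this rectangle the estimated average width is 20/12 ≈ 1.67 rather than 2, because L = 0.5C also counts the two short ends; the approximation improves as the crack becomes more slender.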

III. IMPLEMENTATION DETAILS

A. DATASET
The study involved collecting images of different scales and views from various devices, such as single-lens reflex cameras and smartphones. These images originate from the 2018 Ecuador earthquake and show different severities of surface cracks on various engineering materials, with complex background disturbances under real scenarios. The established dataset consists of 209 pairs of original images and the corresponding pixel-level labels with resolutions of 6016 × 4000, 4288 × 2848, 4288 × 3848, 4608 × 3456, and 480 × 360. These image-label pairs are cut into 4133 patches with a resolution of 512 × 512 to improve training and testing efficiency. The patch generation procedure involves image cropping with an overlap of 256 pixels in a left-to-right and top-to-bottom manner, as shown in Fig. 5.
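The sliding-window cropping can be sketched as below. The 512-pixel patch size and 256-pixel overlap come from the text; how borders are handled is not stated, so shifting the last patch back to the image edge is an assumption, as is returning a single origin for images smaller than one patch. The function names are hypothetical.

```python
def patch_origins(length, patch=512, overlap=256):
    """Start offsets of patches along one image axis."""
    stride = patch - overlap
    if length <= patch:
        return [0]  # assumption: small images yield one (padded or resized) patch
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] + patch < length:
        origins.append(length - patch)  # assumption: align the last patch with the border
    return origins


def crop_grid(width, height, patch=512, overlap=256):
    """Top-left (x, y) corners in left-to-right, top-to-bottom order."""
    return [(x, y)
            for y in patch_origins(height, patch, overlap)
            for x in patch_origins(width, patch, overlap)]


grid = crop_grid(1024, 768)
print(grid)
```

The number of patches per image depends on the border rule and on any resizing applied beforehand, so this sketch is not expected to reproduce the 4133 patches reported above.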
Notably, the proportion of crack pixels is exceedingly low, resulting in an imbalanced data issue between crack and background patches. To address this issue, 50% of pure background patches were eliminated, with 590 containing crack-like edges and complex background disturbances added. Finally, the generated dataset contained 2362 crack patches and 1181 non-crack patches, with the details clarified in Table 2. These patches were split into three subsets for training, validation, and testing in a ratio of 7:2:1, containing 2,480, 709, and 354 patches, respectively.
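The subset sizes follow from the 7:2:1 ratio applied to the 3543 patches. The short check below assumes that each share is rounded to the nearest integer, which is one rounding rule that reproduces the reported counts; `split_counts` is a hypothetical helper.

```python
def split_counts(total, ratios=(7, 2, 1)):
    """Split total into integer parts proportional to ratios (nearest-integer rounding)."""
    counts = [round(total * r / sum(ratios)) for r in ratios]
    counts[0] += total - sum(counts)  # absorb any rounding remainder in the training set
    return counts


total = 2362 + 1181          # crack patches + non-crack patches
counts = split_counts(total)
print(total, counts)
```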

B. LOSS FUNCTIONS, OPTIMIZATION ALGORITHMS, AND TRAINING HYPERPARAMETERS
The U-Net models for crack segmentation have been trained using the binary cross-entropy loss function [43], which measures the difference between the predicted probability distribution and the actual label distribution for binary segmentation of crack or background. This loss function is used for both the original and enhanced versions of the U-Net model and is given by (6).
$$ \mathcal{L} = -\frac{1}{h w} \sum_{i=1}^{h} \sum_{j=1}^{w} \left[ \hat{y}_{i,j} \log y_{i,j} + \left(1 - \hat{y}_{i,j}\right) \log \left(1 - y_{i,j}\right) \right] \tag{6} $$
where h and w denote the height and width of an image, respectively; $y_{i,j}$ and $\hat{y}_{i,j}$ represent the value of the pixel at position (i, j) in the predicted binary segmentation and the actual label, respectively. In terms of the optimization algorithm, Adam [44] is selected from several candidate algorithms [45], including stochastic gradient descent (SGD), SGD with momentum [46], and RMSprop [47], and the training hyperparameters are determined as follows. The exponential decay rates β_1 and β_2 for the first-order and second-order moment estimates are set as 0.9 and 0.99, respectively, while the numerical stability constant ε is 10^{-8}. The initial learning rate η is set as 10^{-4} and follows a decay strategy of 8% every five epochs. The total number of training epochs t is 100, and the batch size is 8.
41402 VOLUME 12, 2024. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The models are trained on a workstation with an NVIDIA GeForce RTX-2080Ti GPU and an Intel(R) i9-10900X CPU. The code is developed using Python 3.7 with a software environment of PyTorch 1.8.1 and NumPy 1.21.6.
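The loss and learning-rate schedule described in this subsection can be illustrated with a dependency-free sketch. It is written in plain Python rather than PyTorch so that it runs on its own; the interpretation of the schedule as a multiplication by 0.92 every five epochs, the clipping constant `eps`, and the function names are assumptions.

```python
import math


def bce_loss(pred, label, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    h, w = len(label), len(label[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            p = min(max(pred[i][j], eps), 1 - eps)  # clip to avoid log(0)
            total += label[i][j] * math.log(p) + (1 - label[i][j]) * math.log(1 - p)
    return -total / (h * w)


def learning_rate(epoch, base=1e-4, drop=0.08, every=5):
    """Assumed step decay: the rate is reduced by 8% after every five epochs."""
    return base * (1 - drop) ** (epoch // every)


loss = bce_loss([[0.5, 0.5]], [[1, 0]])   # equals log(2) for an uninformative prediction
rates = [learning_rate(e) for e in (0, 4, 5, 10)]
print(loss, rates)
```

In a PyTorch setting, the same configuration would presumably map to `torch.nn.BCELoss`, `torch.optim.Adam` with `betas=(0.9, 0.99)` and `eps=1e-8`, and `torch.optim.lr_scheduler.StepLR` with `step_size=5` and `gamma=0.92`, although the source does not state which classes were used.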

C. EVALUATION METRICS
The model performance of the enhanced U-Net for crack segmentation is evaluated by the crack Intersection-over-Union (IoU) [48], which measures the overlap ratio between the predicted segmentation map and the ground-truth mask for a crack and ranges from 0 to 1. In addition, the mean IoU (mIoU) is the average IoU over both classes of crack and background. In general, higher values of IoU and mIoU indicate better segmentation accuracy:
$$ IoU = \frac{TP}{TP + FP + FN} \tag{7} $$
$$ mIoU = \frac{1}{n + 1} \sum_{i=0}^{n} \frac{TP_i}{TP_i + FP_i + FN_i} \tag{8} $$
where n represents the number of predicted classes (n is set as 1 in this study, considering only crack as the foreground), and i denotes the class index of crack (i = 1) and background (i = 0); TP, FP, and FN represent the number of predictions associated with true-positive, false-positive, and false-negative crack pixels, as shown in Table 3.
In this subsection, the proposed model is compared with the original U-Net [37] and its series of improved models [49], [50], [51]. The corresponding results are summarized in Table 4, where the top scores are highlighted in bold. According to the data in Table 4, the enhanced U-Net achieves a superior identification result compared to the other methods. Specifically, its mIoU is higher than that of the original U-Net by 16.13%.
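The IoU and mIoU metrics of this subsection can be sketched for small binary masks as follows. Returning 1.0 when a class is absent from both the prediction and the ground truth is an assumption (the source does not say how empty classes are treated), and the function names are hypothetical.

```python
def iou(pred, truth, cls):
    """Intersection-over-Union of class cls between two masks of 0/1 labels."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            tp += p == cls and t == cls
            fp += p == cls and t != cls
            fn += p != cls and t == cls
    union = tp + fp + fn
    return tp / union if union else 1.0  # assumption for a class absent in both masks


def mean_iou(pred, truth, classes=(0, 1)):
    """Average IoU over background (0) and crack (1)."""
    return sum(iou(pred, truth, c) for c in classes) / len(classes)


pred = [[1, 1, 0, 0]]
truth = [[1, 0, 0, 0]]
crack_iou = iou(pred, truth, 1)        # 1 TP, 1 FP, 0 FN
background_iou = iou(pred, truth, 0)   # 2 TP, 0 FP, 1 FN
miou = mean_iou(pred, truth)
print(crack_iou, background_iou, miou)
```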

A. TRAINING AND TEST RESULTS OF ENHANCED U-NET FOR CRACK SEGMENTATION
Representative test results with complicated background interferences using trained models at epochs 10 and 100 are shown in Fig. 7. In some cases, complex textures and rough surfaces may be wrongly identified as cracks. Comparing the prediction results using trained models at epochs 10 and 100, it is evident that the sufficiently trained model has fewer misidentification predictions (i.e., false-positive samples as shown in red circles and false-negative samples as shown in green squares).
Fig. 8 presents some representative test results to further verify the model performance of crack segmentation under real disrupting scenarios. These include a single crack with simple background, a single crack with dark illumination, cracks with complicated surface texture, complex criss-cross cracks, microcracks, and pure background with complicated disturbances. The model is able to successfully segment most cracks with various morphological features and different backgrounds.
For instance, a single crack with a simple background is accurately distinguished with a maximum mIoU reaching up to 91.71%, as shown in Fig. 8(a). The model also achieves satisfactory segmentation performances with dark illuminations and complicated surface textures, with the maximum mIoU presenting a slight decrease to 84.15% and 72.05% as shown in Fig. 8(b) and 8(c), respectively. In comparison, complex criss-cross cracks with intricate geometry exhibit a significant improvement of segmentation with a maximum mIoU increased up to 89.56% as shown in Fig. 8(d). Moreover, the enhanced U-Net demonstrates excellent performance in detecting microcracks that are usually neglected in the labeling process or conventional surveying techniques. The related mIoU reaches 100%, as shown in Fig. 8(e). Compared with the original U-Net model, the enhanced U-Net identifies these microcracks more sensitively, further improving the robustness of crack segmentation and alleviating the negative effect of labeling leakage. In addition, pure background patches with burglar meshes, fences, and crack-like edges are correctly classified, as presented in Fig. 8(f). The enhanced U-Net has demonstrated stable and reliable accuracy on crack segmentation with diverse morphological features and complex background disturbances. However, there are still many false-positive predictions, resulting from similarities in surface textures and edge boundaries between cracks and disturbances. Consequently, some false-positive samples appear as spurs and isolated noises around the crack edge. To enhance the segmenting capability in crack segmentation, a noise reduction operation is designed as the post-processing procedure.

B. VERIFICATION OF NOISE REDUCTION MODULE
Fig. 9 presents a comparison of three types of masks (i.e., ground-truth annotations, the original predictions, and the noise-reduced binary masks), revealing that the noise reduction results in smoother crack boundaries. Moreover, the noise reduction module proves advantageous in the subsequent crack quantification process, as it helps to present crack regions more clearly by reducing spurs and isolated noises.
To further explicitly validate the effectiveness of the module, the quantity of true-positive and false-positive noise samples in the three types of masks is calculated as:
$$ N_N^P = N_P - N_G, \qquad N_N^N = N_N - N_G \tag{9} $$
where $N_N^P$ and $N_N^N$ denote the quantity of false-positive samples in the original predictions and the noise-reduced binary masks, respectively. $N_G$, $N_P$, and $N_N$ denote the quantity of all samples in ground-truth annotations, the original predictions, and the noise-reduced binary masks, respectively. They are equal to the quantity of boundary sets in the masks, which can be obtained by Algorithm 1.
Based on (9), Fig. 10 displays the statistical results of the instances in Fig. 9(a)-(f). It can be seen that the number of noise samples is reduced from hundreds to dozens, leading to a tenfold drop in quantity. The proposed noise reduction module proves effective in eliminating false-positive samples, demonstrating its generalizability across diverse crack morphologies under background disturbance scenarios. However, there are still dozens of persistent false-positive samples. It should be noted that these samples consist of regionally clustered pixels with a tiny area that is not equal to 0. Therefore, the proposed noise reduction method is not applicable to this type of sample, and further research is needed.

C. VERIFICATION OF MORPHOLOGICAL PARAMETER EXTRACTION FOR CRACK QUANTIFICATION MODULE
When assessing the condition of engineering materials, it is important to extract geometric information of surface cracks. To achieve this, morphological parameters are extracted from both the noise-reduced binary masks and the ground-truth annotations and then compared. The critical morphological parameters for simple cracks include area, perimeter, length, compactness, width distribution along the longitudinal direction, average width, maximum width, minimum width, and centroid coordinates, which can be calculated by Algorithm 3. Quantification results of instances in Fig. 11 are shown in Table 5. In addition, the Pearson correlation coefficient in (10) is used to measure the correlation of the width distribution along the longitudinal direction between the two types of masks, as depicted in Fig. 12.
$$ \rho_{G,R} = \frac{E\left[ (G - \mu_G)(R - \mu_R) \right]}{\sigma_G \sigma_R} \tag{10} $$
where µ_G and σ_G represent the mean and standard deviation of the width distribution along the longitudinal direction of the ground-truth annotation, denoted as G. Similarly, µ_R and σ_R are those of the corresponding noise-reduced binary mask, denoted as R.
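The Pearson correlation between two width profiles can be sketched as below. The sketch assumes that both profiles are sampled at the same longitudinal positions and therefore have equal length; the sample widths are made-up illustration values, not data from Table 5, and `pearson` is a hypothetical helper.

```python
import math


def pearson(g, r):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(g)
    mu_g, mu_r = sum(g) / n, sum(r) / n
    cov = sum((a - mu_g) * (b - mu_r) for a, b in zip(g, r)) / n
    sigma_g = math.sqrt(sum((a - mu_g) ** 2 for a in g) / n)
    sigma_r = math.sqrt(sum((b - mu_r) ** 2 for b in r) / n)
    return cov / (sigma_g * sigma_r)


# Hypothetical width profiles (in pixels) of a ground-truth mask and a refined mask.
widths_g = [2.0, 3.0, 4.0, 3.0, 2.0]
widths_r = [2.1, 3.2, 3.9, 3.0, 1.8]
rho = pearson(widths_g, widths_r)
print(rho)
```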
The proposed quantitative method effectively extracts the morphological parameters of a single crack with complicated backgrounds. The correlation coefficient for the single crack exceeds 0.5, while for a single crack with fewer background disturbances, it exceeds 0.9. The average relative errors of the parameters, including area, perimeter, length, compactness, average width, maximum width, minimum width, and centroid coordinate, are all small. For example, the average relative error of area is lower than 10.75%, while the average relative error of centroid coordinate is lower than 0.72%. Furthermore, we observed that cracks with straight line morphology typically exhibit a higher compactness than curved cracks. The compactness value is typically above 0.3 for straight cracks and below 0.3 for other types of cracks. These findings suggest that the proposed quantitative method is effective for analyzing single cracks in various scenarios.
Compared to simple cracks, complex criss-cross cracks propagate along diverse directions, which increases the likelihood of evolving into intricate crack networks. Therefore, in addition to measuring area, perimeter, length, compactness, and average width based on Algorithm 3, fractal dimension analysis (i.e., box-counting dimension) is performed by (5), measuring the complexity of criss-cross cracks. The first step is to cover the crack regions with unit square boxes of different sizes L and obtain the corresponding number of boxes N(L). Second, a scatter diagram is plotted with log(1/L) as the abscissa and log(N(L)) as the ordinate, and a linear regression is employed, with the box-counting dimension determined from the slope of the fitted line. Fig. 13 shows some criss-cross crack samples selected from the test dataset with various backgrounds, and the box-counting dimensions D of the ground-truth annotation and the corresponding noise-reduced binary mask are nearly equal. This indicates that the ground-truth annotation and the corresponding recognition result have a similar level of complexity. Furthermore, the results in Table 6 demonstrate that the average relative errors of area, perimeter, length, compactness, average width, and centroid coordinate are lower than 2.85%, 1.51%, 1.51%, 6.30%, 2.02%, and 0.33%, respectively. These results suggest that the morphological feature parameters of cracks are well-identified. However, residual false-positive samples in the noise-reduced binary mask still affect the accuracy of quantitative results. Therefore, future research should focus on discriminating between true-positive and false-positive samples in the quantification results.
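The two box-counting steps described above can be sketched as follows: count the boxes of side L that contain at least one crack pixel, then fit a line to log N(L) against log(1/L). The choice of box sizes (powers of two), the grid anchored at the image origin, and the function names are assumptions.

```python
import math


def box_count(pixels, size):
    """Number of size x size boxes on a fixed grid containing at least one crack pixel."""
    return len({(i // size, j // size) for i, j in pixels})


def box_counting_dimension(pixels, sizes=(1, 2, 4, 8, 16)):
    """Slope of the least-squares line through (log(1/L), log N(L))."""
    xs = [math.log(1 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Sanity checks on shapes with known dimension.
filled_square = {(i, j) for i in range(64) for j in range(64)}
straight_line = {(0, j) for j in range(64)}
d_square = box_counting_dimension(filled_square)
d_line = box_counting_dimension(straight_line)
print(d_square, d_line)
```

A filled square and a straight line give slopes of about 2 and 1, respectively, so a criss-cross crack network is expected to fall between these values.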

V. PRACTICAL APPLICATIONS FOR POST-EARTHQUAKE ASSESSMENT

A. PLATFORM INTEGRATION BY GUI
A GUI for crack recognition and quantification has been developed by combining the crack segmentation model, the crack refinement algorithm, and the crack quantification algorithm in Section II. The crack prediction and measurement system comprises functional modules for crack segmentation, noise reduction, and morphological quantification, as shown in Fig. 14. To use the system, an image is input into the GUI, and the well-trained model applies the optimal parameters to obtain the crack segmentation result. The result is then refined by noise reduction, with crack pixels annotated in white and background pixels in black, and displayed in the Prediction Result section.

B. ACTUAL APPLICATIONS FOR POST-EARTHQUAKE DAMAGE ASSESSMENT OF CONCRETE STRUCTURE COMPONENTS
Accurate recognition of surface cracks plays a vital role in assessing the damage status of structural components such as columns, shear walls, and beams. To aid in post-earthquake damage assessment of concrete structural components, the segmentation and quantification model developed above is applied. The model utilizes detailed fine-scale damage descriptions of vertical load-bearing components based on the work of Zhang [53] and the seismic damage evaluation specification [54]. Experimental images recording the whole failure process of the components were collected for this purpose. The damage index [55] of these typical damaged components was calculated, and the damage level coinciding with the visible damage phenomenon was determined. The results are presented in Table 7.
Because the crack regions and the related components share the same geometric scale in an image, the physical geometric parameters of cracks can be calculated as:

$$w_{actual} = \frac{L_{actual}}{L_{image}} \, w_{image} \quad (11)$$

where $w_{actual}$ and $w_{image}$ denote the maximum width of cracks in physical coordinates (mm) and pixel coordinates (pixels), respectively. Similarly, $L_{actual}$ and $L_{image}$ denote the cross-sectional length of RC components in physical coordinates (mm) and pixel coordinates (pixels).
The maximum width of cracks in physical coordinates is calculated using (11). This width is then cross-referenced with the damage descriptions in Table 7 to identify the damage severity and damage index of the vertical load-bearing component. For example, the maximum width of cracks in the plastic hinge zone on the surface of column (a) in Table 8 is 0.8112 mm. This width falls between 0.5 mm and 1 mm, which agrees with the damage description of Minor 2 in Table 7. Some other representative examples of damaged columns are shown in Table 8.
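The scale conversion and range check can be illustrated with a short Python sketch. The function names and the input values (a 400 mm cross-section spanning 2000 pixels and a 4.056-pixel crack width) are hypothetical, chosen only so that the result reproduces the 0.8112 mm of the example; only the 0.5 mm to 1 mm range for Minor 2 comes from the text, and the remaining thresholds of Table 7 are not reproduced here.

```python
def pixel_to_physical(w_image_px, l_actual_mm, l_image_px):
    """Convert a crack width from pixels to millimetres.

    The scale factor is the ratio of the component's known cross-sectional
    length (mm) to its measured length in the image (pixels).
    """
    if l_image_px <= 0:
        raise ValueError("image length must be positive")
    return l_actual_mm / l_image_px * w_image_px


def is_within(width_mm, lower_mm, upper_mm):
    """Check whether a width falls inside a closed range given in mm."""
    return lower_mm <= width_mm <= upper_mm


# Hypothetical measurement: all three inputs are illustrative assumptions.
w_mm = pixel_to_physical(w_image_px=4.056, l_actual_mm=400.0, l_image_px=2000.0)
print(f"{w_mm:.4f} mm, within the Minor 2 range: {is_within(w_mm, 0.5, 1.0)}")
```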

VI. CONCLUSION
The study presents three novel methods for segmenting and quantifying complex surface cracks on engineering materials across various backgrounds: an enhanced U-Net and two post-processing algorithms for noise reduction and morphological parameter extraction. The key conclusions are as follows:
1) An enhanced U-Net model is developed by improving the fundamental operations of convolution, BN, and upsampling. Examined on the test set from the 2016 Ecuador earthquake, the model demonstrates exceptional capacity to detect diverse cracks with various morphologies and severities against complex background disturbances. The successful detection of microcracks further attests to its superior performance in multiscale damage identification for pre-alarming monitoring.
2) A noise reduction algorithm and a morphological quantification algorithm are designed to eliminate small-region false-positive misrecognitions and determine the geometric parameters of crack samples. The results show that these operations effectively reduce scattered fragments and enable the measurement of morphological feature similarity between the ground-truth annotation and the corresponding recognition result with high reliability and robustness.
3) An integrated platform of the well-trained enhanced U-Net model, post-processing noise reduction, and morphological feature parameter extraction is developed based on a GUI for practical applications, which enhances the convenience of complex crack recognition and quantification for post-earthquake components. The damage levels are determined based on external appearances in this study, following the hypothesis that external damage to engineering materials correlates with their internal damage. Future investigations will explore internal damage recognition and quantification in computer vision to validate the effectiveness of the proposed methods.

FIGURE 1. Overview of semantic segmentation, noise-reduction postprocessing, and morphological quantification for complex surface cracks on engineering material.

FIGURE 2. Schematics of enhanced U-Net model for complex crack segmentation.

(b), the most common forms of false-positive predictions often appear independently as spurs around the edge of cracks, which are horizontally (where x_1 = x_2 = ... = x_n,

Algorithm 1 Image Scanning and Contour Tracing for Binary Mask f(i, j) With m × n Pixels
Input: Binary crack mask f(i, j) = 0 or 1;
Output: Boundary sets of all predictions C_s;
1: Set l = zeros(m, n), C_s = []
2: for i = 1 to m do
3:   for j = 1 to n do
4:

Algorithm 2 Noise Reduction for Binary Mask f(i, j) With m × n Pixels
Input: Binary crack mask f(i, j) = 0 or 1;
Output: Noise-reduced binary crack mask g(i, j) = 0 or 1;
1: Calculating C_s based on Algorithm 1
2: for C in C_s do
3:   Calculating S based on
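Since the listing of Algorithm 2 is incomplete here, the following Python sketch only illustrates the general idea of removing small-region false positives from a binary mask. It is not the authors' procedure: it assumes that a region is an 8-connected set of crack pixels, that S can be approximated by the pixel count of the region, and that regions below a hypothetical `min_area` threshold are discarded, whereas the paper obtains regions by contour tracing and computes S from its own formula.

```python
from collections import deque

import numpy as np


def remove_small_regions(mask, min_area):
    """Return a copy of a binary mask without regions smaller than min_area.

    Regions are 8-connected components; min_area is a hypothetical threshold.
    """
    mask = np.asarray(mask) > 0
    rows, cols = mask.shape
    visited = np.zeros((rows, cols), dtype=bool)
    cleaned = np.zeros((rows, cols), dtype=np.uint8)
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    for i in range(rows):
        for j in range(cols):
            if not mask[i, j] or visited[i, j]:
                continue
            # Breadth-first search collects one connected region.
            region = [(i, j)]
            visited[i, j] = True
            queue = deque(region)
            while queue:
                r, c = queue.popleft()
                for dr, dc in neighbours:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr, nc] and not visited[nr, nc]):
                        visited[nr, nc] = True
                        region.append((nr, nc))
                        queue.append((nr, nc))
            # Keep the region only if its pixel area reaches the threshold.
            if len(region) >= min_area:
                for r, c in region:
                    cleaned[r, c] = 1
    return cleaned
```

A pure area threshold cannot remove spurs that are attached to a true crack, so it covers only the isolated fragments among the false positives.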

FIGURE 4. Schematics of morphological parameter extraction for crack quantification.

$m_1$ and $m_2$ denote the numbers of left and right boundary pixels, and $n = m_1 + m_2 + 2$; $j$ and $k$ denote the indices of left and right boundary pixels; $(x_i, y_i)$ denotes the image coordinates of the $i$th boundary pixel; $d_i$ represents the length between two adjacent boundary pixels $i - 1$ and $i$; $C$ and $L$ denote the perimeter and length of a crack region; $(\overset{\frown}{x}_i, \overset{\frown}{y}_i)$ denotes the coordinates of the $i$th vertex pixel on the convex polygon, with $p$ vertex pixels, of a crack region; $(\bar{x}, \bar{y})$ denotes the centroid coordinates of a crack region; $S$ and $\overset{\frown}{S}$ represent the pixel areas of a crack region and the corresponding convex polygon determined by (1) and (3), respectively; $\lambda$ reflects the compactness of a crack as the ratio of $S$ to $\overset{\frown}{S}$; $w_{avr}$ is the average crack width; $w_l$ denotes the crack width along the longitudinal direction, with $l \in (1, \min(m_1, m_2))$; $w_{max}$ and $w_{min}$ denote the maximum and minimum crack widths.
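A few of these quantities (the area S, the convex-polygon area, the compactness λ, and the centroid) can be approximated with the Python sketch below. It is an illustration under stated assumptions, not the formulas (1) and (3) of the paper: S is taken as the crack pixel count, the convex polygon is the hull of the pixel corners (so that its area is never smaller than S), and the centroid is the mean of the pixel indices. All function names are hypothetical.

```python
import numpy as np


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in boundary order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    x = np.array([v[0] for v in vertices], dtype=float)
    y = np.array([v[1] for v in vertices], dtype=float)
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def crack_shape_parameters(mask):
    """Area, centroid, convex-hull area and compactness of a binary mask."""
    rows, cols = np.nonzero(np.asarray(mask) > 0)
    if rows.size == 0:
        raise ValueError("mask contains no crack pixels")
    area = float(rows.size)  # assumed stand-in for S: pixel count
    centroid = (float(cols.mean()), float(rows.mean()))  # (x, y) pixel indices
    # Hull of the four corners of every crack pixel, as (x, y) points.
    corners = [(int(c) + dx, int(r) + dy)
               for r, c in zip(rows, cols) for dx in (0, 1) for dy in (0, 1)]
    hull_area = polygon_area(convex_hull(corners))
    return {"area": area, "centroid": centroid,
            "hull_area": hull_area, "compactness": area / hull_area}
```

With this convention a convex region such as a filled rectangle has a compactness of 1, while branching or curved cracks give smaller values.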

FIGURE 5. Schematic for patch generation of image-annotation pairs.

FIGURE 6. Training and validation result of enhanced U-Net model for crack segmentation.

Fig. 6 presents the training details of the enhanced U-Net model. The model shows a significant drop in training and validation losses in the first 25 epochs, followed by a gradual decrease until convergence after 90 epochs. This indicates that 100 training epochs are sufficient for the enhanced U-Net model. The optimal network parameters are selected based on the minimum validation loss and are utilized for crack segmentation on the test dataset. The validation IoU for crack and mIoU for crack and background show a sharp increase and trend towards convergence after 100 epochs of training, which is consistent with the observation of loss curves.

In this subsection, the proposed model is compared with the original U-Net [37] and its series of improved models [49], [50], [51]. The corresponding results are summarized in Table 4, where the top scores are highlighted in bold. According to the data in Table 4, it is evident that the enhanced U-Net achieves a superior identification result
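For reference, the crack IoU and the mIoU over the crack and background classes mentioned above can be computed from a predicted mask and its annotation as in the following Python sketch. It uses the standard intersection-over-union definition; the function names and the label convention (0 for background, 1 for crack) are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np


def class_iou(pred, target, cls):
    """Intersection over union of one class between two label masks."""
    pred_c = np.asarray(pred) == cls
    target_c = np.asarray(target) == cls
    union = np.logical_or(pred_c, target_c).sum()
    if union == 0:
        return float("nan")  # class absent from both masks
    return float(np.logical_and(pred_c, target_c).sum() / union)


def mean_iou(pred, target, classes=(0, 1)):
    """Mean IoU over the given classes, ignoring classes absent from both."""
    return float(np.nanmean([class_iou(pred, target, c) for c in classes]))


# Toy example with assumed labels: 0 = background, 1 = crack.
pred = np.array([[1, 1, 0, 0]])
target = np.array([[1, 0, 0, 0]])
print(class_iou(pred, target, 1), mean_iou(pred, target))
```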

FIGURE 8. Representative test results of crack segmentation under complex scenarios (black: background; white: crack).

FIGURE 9. Comparison results between original predictions of the enhanced U-Net and noise reduction.

FIGURE 10. Statistical analyses of predictions of the enhanced model and noise reduction results corresponding to Figure 9 (a)-(f).

FIGURE 11. Morphological parameter extraction for single crack.

FIGURE 12. Comparisons of crack width distribution along longitudinal direction between recognition results and ground-truth annotations.

FIGURE 13. Box-counting dimension D for the ground-truth annotations and recognition results.

FIGURE 14. Flowchart and GUI integration for practical applications of crack segmentation and quantification.

TABLE 1. Configuration details of enhanced U-Net model.

TABLE 3. Confusion matrix of crack segmentation.

TABLE 4. Performance comparison of different models.

FIGURE 7. Representative test results at different training stages.

TABLE 5. Comparisons of single crack quantification between recognition results and ground-truth annotations.

TABLE 6. Comparisons of criss-cross crack quantification between recognition results and ground-truth annotations.

TABLE 8. Damage assessment of deteriorated concrete columns under 2018 Ecuador earthquake.