Non-destructive detection of highway hidden layer defects using a ground-penetrating radar and adaptive particle swarm support vector machine

In this paper, a method that uses a ground-penetrating radar (GPR) and the adaptive particle swarm support vector machine (SVM) method is proposed for detecting and recognizing hidden layer defects in highways. Three common road features, namely cracks, voids, and subsidence, were collected using ground-penetrating imaging. Image segmentation was performed on acquired images. Original features were extracted from thresholded binary images and were compressed using the kl algorithm. The SVM classification algorithm was used for condition classification. For parameter optimization of the SVM algorithm, the grid search method and particle swarm optimization algorithm were used. The recognition rate using the grid search method was 88.333%; the PSO approach often yielded local maxima, and the recognition rate was 86.667%; the improved adaptive PSO algorithm avoided local maxima and increased the recognition rate to 91.667%.


INTRODUCTION
Many forms of road deterioration can develop after prolonged utilization of expressways; examples include crack formation, development of voids, and subsidence (Marecos et al., 2017). The main cause of these conditions is the appearance of cracks under the roadbed, which gradually affects the road surface and causes surface cracks. Continued use of expressways causes incremental damage and can significantly increase the amount and type of maintenance work. With increasing traffic volumes in China, it is necessary to develop more efficient and automated road condition-detection methods.
Much research has been conducted on using ground-penetrating radars (GPRs) for landform surveys. With the continuous development of the GPR technology and with the improvement of detection accuracy, the use of the GPR technology for non-destructive detection of structural road conditions has been garnering increasing attention. Yuanlei et al. used the GPR technology to detect different physical anomalies, based on physical simulations and field tests, and the response characteristics across the two scenarios were consistent (Si & Wang, 2018). Shili et al. used the GPR technology to detect hidden defects, such as road breaks, voids, and subsidence, and acquired defect images (Guo, Xu & Li, 2018). Literature (Liang & Su, 2001) uses amplitude attenuation method in ultrasonic testing to evaluate the corrosion damage of reinforced structures in concrete roads. Document (Mandal, Tinjum & Edil, 2016) detects the defects in concrete road base by ultrasonic detection technology and establishes the mathematical model of sound wave propagation in different media. Literature (Wen et al., 2011) establishes the mathematical model of ultrasonic wave propagation in asphalt pavement and concrete pavement to identify highway surface defects, analyzes its feasibility and realizes ultrasonic nondestructive testing of road defects based on this.
Using methods from the image recognition field, this study focuses on the design of shallow hidden defect classifiers based on the support vector machine (SVM) algorithm. The SVM method has been widely studied and applied in different contexts. Hou improved the SVM algorithm's low precision around the hyperplane and reduced its computational complexity for processing large amounts of data; they also improved the algorithm's training efficiency, and managed to reduce the number of false calls (Hou et al., 2018). El-Saadawi & Hatata (2017) used the SVM algorithm for the stator winding protection of synchronous generators, and achieved good results. The SVM algorithm has been widely used in a variety of context, such as big data, medical, agricultural, and transportation applications (Wang, Du & Wang, 2019;Wang et al., 2019;Zhang, Hu & Mao, 2008). Using the SVM algorithm, this study optimizes and improves its parameter selection process. By comparing the optimization performance of the grid search (GS) (Liu & Zhang, 2019) method and the particle swarm optimization (PSO) (Yang et al., 2018) algorithm, the superiority of the PSO algorithm is demonstrated, and the SVM classification algorithm for PSO-based parameter optimization is studied (Ma et al., 2018). By collecting radar images of three diseases on the 107 national highway section from Zhengzhou to Xinxiang in Henan Province, it is shown that the obtained method performs well in detecting hidden pavement defects.

Detection principle of the ground penetrating radar
In the GPR approach, the ground is irradiated by high-frequency electromagnetic waves using a transmitting antenna, while the waves reflected from the ground are detected using a receiving antenna. Electromagnetic waves are reflected differently from different underground media; thus, different waveforms registered by the receiving computer for analysis (Benedetto et al., 2017). Figure 1 shows the GPR detection process.
The reflection coefficient of an electromagnetic wave mainly depends on the dielectric constants of the medium in which the wave travels originally and the medium from which the wave is reflected, as given by Eqs. (1) and (2): where, r is the reflection coefficient; ε 1 ε 2 are the relative dielectric constant of the incident medium and the medium from which the wave is reflected, v is the echo speed, and c is the speed of light in vacuum.

Selection of the ground penetrating radar
The ground penetrating radar used in this study is LTD-2100 ground penetrating radar system developed by China Electronics Technology Group Corporation. The main components of the system are 900 MHz shielded antenna, data acquisition host, computer, data cable, etc.

Image preprocessing
The original images of roads detected by a GPR include the reflection characteristics of road conditions and various types of clutter caused by environmental factors. The clutter distribution is typically non-uniform, affecting the correct recognition and classification of road conditions. To improve the accuracy of the road condition determination, it is necessary to remove the effect of this clutter (Bu, 2017). To extract the features associated with various road conditions, the acquired images should be segmented.
Image filtering. The objective of image filtering is to minimize the effects of noise and interference in raw images. The noise and interference are mainly attributed to the processes -image acquisition and image transmission. Gaussian filtering is typically used for image denoising, which can remove unnecessary interference and protect the edge of the image.
Image segmentation. The objectives of image segmentation are to classify foreground and background pixels, and to determine the organization of foreground pixels (i.e., detect foreground objects) (Fengjun et al., 2018).
The Canny operator has been widely used for image segmentation of radar images, and its segmentation performance after Gaussian filtering is demonstrated in Fig. 2. Figure 2A is the original radar acquisition diagram. Figure 2B is the result after Gaussian transformation. It can be seen that the clutter in the original image is removed. Figure 2C is the disease characteristic waveform diagram extracted after Canny operator. To retain only the necessary information, the image was subjected to the morphological operations of  Figure 3A is the image after Canny operator segmented the disease waveform. Figure 3B is an enhanced disease waveform diagram after the segmented waveform is processed by digital morphology, so as to facilitate the extraction of disease features in the following text. The processed image was segmented and objects were extracted using the vertical projection method, as shown in Fig. 4.  to geometric distortions; (4) distance between similar images should be as small as possible; and (5) distance between different images should be as large as possible. The algorithm for extracting feature vectors should be simple, and the dimensionality of the feature space should not be too high for ensuring the classification performance of the system (Xiao & Liu, 2017). To satisfy the above requirements, the following features were used: area, image complexity, image texture, and seven rectangular features. In this study, three types of road hazards are studied, namely, road cracking, hollowing, and subsidence. Through simulation and actual images, combined with the classification of expert experience, three types of samples are collected to describe these conditions. Twentynine feature vectors were extracted for each sample, and the obtained set of features is shown in Tables 1, 2 and 3.
The dimensions of the different features are very different. Direct use of feature data not only reduces the system performance, but also affects the classification accuracy. To avoid this shortcoming, it is necessary to perform data normalization (Chowdhury et al., 2019). Let the eigenvector of a pattern vector be X = (x 1 ,x 2 ,...,x m ). Then the normalized where, x i,max ,x i,min are the maximum and minimum of {x i (k)|k − 1,...,P}, and P is the overall number of samples in the training set.
Feature selection. In this study, K − L transformation was used as the feature selection algorithm (Jun, 2016). K − L transformation can take into account different classification information and realize supervised feature extraction. Under the criterion of minimum mean square error, it can obtain an orthogonal transformation matrix A that can map the original feature X from high-dimensional space D to low-dimensional space vector y . K −L transformation can retain the data component with the largest variance in the original data and highlight the data difference. Empirical knowledge was used for removing several highly correlated features, and K −L changes were used for reducing the dimensionality of the original feature space. A relatively simple class-center classifier was used for identifying the samples after the K − L dimensionality reduction transformation. The recognition rate is shown in Fig. 5 for the test set. According to Fig. 5, when the dimensionality of the feature space reaches 12, the recognition rate saturates; thus, the optimal dimensionality of the feature space is 12. K −L transformation can take into account different classification information, realize supervised extraction of features with too high correlation, and then use K −L transformation to extract data information from them, thus achieving the purpose of compressing dimensions and improving recognition rate.

The SVM classifier design
Basic principle of the SVM SVM is a binary classification algorithm that is trained using the supervised learning paradigm. The algorithm attempts to determine an optimal classification hyperplane, whereby the edge distance between two sample classes and the dividing hyperplane (decision boundary) is maximized. The larger the edge distance is, the more separated the two sample classes are, the stronger the classification robustness, and the better the method's generalizability (Liu et al., 2018). The hyperplane equation is where ω is the normal to the hyperplane, and x determines the angle with the hyperplane; b is the distance between the hyperplane and the origin. The hyperplane is denoted as (ω,b); then, the distance from a certain point (sample) to the hyperplane (ω,b) is denoted by r: Let the support vector be 1 and point away from the hyperplane, so that ω T x + b = 1, and let the vectors in the two different classes be γ and point away from the hyperplane: For optimal segmentation, we need to find the hyperplane with the largest interval; that is, we need to find the parameters ω and b that maximize γ . According to Eq. (6), only ω −1 should be maximized.

Optimization of the SVM classifier parameters
We used MATLAB (Mathworks, Inc.) to validate the diagnostic accuracy of the SVMbased classifier with respect to the road conditions. We obtained a dataset comprising 100 examples of road cracking, hollowing, and subsidence conditions; 80% of these images were used for training the method, while the remaining 20% were used for testing the classifier. The performance of any SVM classifier depends on the penalty factor and kernel function parameters. The optimal parameter values are typically determined using the grid search approach, which exhausts the set of possible parameter combinations to determine the optimal combination.
The penalty term c and the kernal function parameters g were considered to grow exponentially in the [2 −14 ,2 14 ] range; at the same time, the step between each cell on the grid was set to 0.5, that is, the growth time-points were 2 −14 ,2 −13.5 ,...,2 14 . As shown in Fig. 6, the best classification performance was obtained for log 2 (c)=7.60, best log 2 (g ) = −6.60.
As shown in the retrieval results above, the accuracy of log 2 (c) between [2 5 ,2 10 ] and log 2 (g ) between [2 −10 ,2 −4 ] was relatively good; thus, the step was set to 0.2 in this range, and the parameters were further optimized. The results are shown in Fig. 7.
According to Fig. 7, the best classification was obtained for best = 7 and best= −5.2. These optimal values were used as the SVM classifier parameters for validating its classification performance; then, the accuracy of the classifier with these parameters was characterized. The validation results are shown in Fig. 8, and the final validation accuracy rate was 88.333%, the image recognition time is t = 0.630 s.

Adaptive mutation particle swarm
The results obtained using the grid search approach can meet the desired detection requirements, but because the grid search approach only considers discrete combinations of parameters, the optimal solution can be easily missed. At the same time, the grid search method is exhaustive, and needs to consider all the possible cases; thus, this optimization approach is time-consuming. To improve the accuracy and reduce the computation time, the PSO algorithm is proposed in this section.
The PSO approach. PSO simulates a flock of birds, which are modeled as massless particles (Hsieh et al., 2018). Each particle has only two attributes: velocity v and position x. Velocity represents the speed of movement, and position represents the direction of movement. Each particle searches for the optimal solution separately in the search space, and records it as the current individual extreme value P; the individual extreme values are shared among all the particles in the entire swarm, and the optimal individual extreme value is determined as the current global optimal solution G of the entire swarm of particles. All of the particles in the particle swarm adjust their speeds and positions according to the current individual extremum P found by themselves and the current global optimal solution G shared by the entire particle swarm. The underlying PSO process is relatively simple, and can be divided into the following steps: (1) initializing the particle swarm; (2) evaluating the particles, that is, calculating the fitness values; (3) searching for individual extrema P; (4) searching for the global optimal solution G; (5) modifying the particles' speeds and positions (Yang, Wang et al., 2019). The update equations are: Where ω is the inertia factor, c 1 and c 2 are the acceleration constants, and rand (0,1) are random numbers in the interval (0,1). p id denotes d the dimension of the individual extreme value of variable i.p gd represents the dimensionality d of the global optimal solution. k represents the current number of iterations.
The PSO approach. The following parameter values were used: c 1 = 2,c 2 = 2, population size = 20, maximal number of iterations = 200, and cross-validation fold K = 5. The fitness curve for the optimized SVM classifier parameters is shown in Fig. 9. After iteration 15, the fitness reached its optimal value. At this point, the classifier accuracy was 89.197%, the best c was 0.7596, and the best g was 0.7674. Although the basic PSO algorithm exhibits a good convergence speed and good optimization performance, it can prematurely converge onto a locally optimal solution; as a result, the population will easily stagnate without external pressure.
Variation improvement. To deal with the premature convergence problem, we utilized the concept of mutations that is often used in genetic algorithms, and incorporated mutations into the PSO framework. For that, we defined a trigger to allow particles to escape locally optimal solutions, thus ensuring a global search (Tong et al., 2018). The population fitness variance σ 2 was used for determining whether a local maximum was reached in the iteration process. The population fitness variance σ 2 was defined as follows: In the above equation, N is the number of particles, f is the normalized calibration factor, f i is the fitness of the first particle, and f avg is the average fitness. It can be seen that the larger the value of σ 2 , the more divergent the particle swarm; conversely, the smaller the population fitness variance, the more convergent the particle swarm is. Values of σ 2 close to zero indicate that the particle swarm is approaching the globally optimal solution or is converging onto a locally optimal solution. To avoid premature trapping of the particle swarm in local optima, the swarm is subjected to mutations: In the above equation, C is a normally distributed random number in the [0,1] interval and is the variation factor; rand(0,1) is a random number in the [0,1] range; k is the number of iterations. Mutations alter the particles' positions and thus allow to escape local optima (Deng et al., 2019).
The fitness curve of the improved adaptive mutation PSO algorithm for SVM-based parameter optimization is shown in Fig. 10. Evidently, the best fitness converges to a local maximum after 17 generations. Mutations allow the particle swarm to escape this local maximum after 97 generations; after that, optimization parameters are determined with higher accuracy. The finally estimated parameter values are: c = 7.751, g = 5.754. The fitness at this global optimum is 91.698%. Simulation results. Based on the results in the previous section, the SVM model was constructed using the following parameter values: c = 0.7596 and g = 0.7674, which were determined using the PSO method. The results are shown in Fig. 11.
For these parameter values, the accuracy of the model was 86.667%, the image recognition time is t = 0.590s; however, the recognition accuracy had not improved. The main explanation was the trapping of the particle swarm in a local optimum during the process of parameter optimization. An SVM classifier was then constructed and validated using c = 7.751 and g = 5.754; these parameter values were determined using the adaptive mutation PSO algorithm. The validation results are shown in Fig. 12.
From Fig. 12, the classification accuracy of the SVM classification model with the parameter values determined by the adaptive mutation PSO was 91.667%, the image recognition time is t = 0.615s. This classification accuracy is significantly higher compared with that achieved by the SVM method that uses parameters optimized by the grid search approach.

CONCLUSIONS
According to the requirements of automatic recognition of highway hidden layer conditions, this paper proposes an automatic detection and recognition method that uses an SVM with parameters optimized using the adaptive mutation PSO approach. In this method, PSO with mutations is used for parameter optimization. In this study, three different methods were used for parameter optimization: (1) the grid search method, (2) the PSO approach, and (3) the adaptive mutation PSO. MATLAB and Python were used for implementing these optimization methods, and the optimization processes and their results were validated. Compared with the grid search method and the simple PSO approach, the accuracy of the SVM with parameters optimized using mutation PSO was higher, translating into better performance on automatic identification of highway conditions. Our simulation results showed that the classification accuracy of the SVM classifier with the grid search method was 88.333%, the classification accuracy of the SVM classifier with PSO was 86.667%, and the classification accuracy of the SVM classifier with mutation PSO was 91.667%. Thus, the effect of SVM classifier with mutation PSO is obviously better than that of the other two. Compared with the grid search method, the classification accuracy was improved. However, owing to the pre-processing of images and processing of feature-related data, some defect-related information is likely to become distorted, which can affect the accuracy of recognition. If a zero-distortion image processing method can be found, the recognition accuracy will be greatly improved. At present, only cracks, voids and subsidence can be analyzed and studied, while asphalt pavement diseases and defects can be divided into 11 categories and 21 items. In order to better improve