Buried Object Characterization Using Ground Penetrating Radar Assisted by Data-Driven Surrogate-Models

This work addresses artificial-intelligence-based buried object characterization using 3-D full-wave electromagnetic simulations of a ground penetrating radar (GPR). The task is to characterize a cylindrical perfect electric conductor (PEC) object buried in various dispersive soil media and at different positions. The main contributions of this work are (i) development of a fast and accurate data-driven surrogate modeling approach for buried object characterization, (ii) construction of the surrogate model in a computationally efficient manner using small training datasets, and (iii) development of a novel deep learning method, the time-frequency regression model (TFRM), that employs raw signals (with no pre-processing) to achieve competitive estimation performance. The presented approach is favourably benchmarked against state-of-the-art regression techniques, including multilayer perceptron (MLP), Gaussian process (GP) regression, support vector regression machine (SVRM), and convolutional neural network (CNN).


I. INTRODUCTION
Ground penetrating radar (GPR) has a wide range of applications, such as the detection of buried mines, pipes, and wires. GPR is used as a near-surface remote sensing technique, and its working principle is based on electromagnetic (EM) wave theory [1], [2], [3]. It operates by sending and receiving electromagnetic signals using antennas. In [4], a conventional C-band horn antenna was used to transmit and receive the signals. In [5], an L-band TEM horn antenna was used for data collection to detect buried objects by applying background subtraction or clutter reduction techniques. In a typical GPR system, the antenna moves along a path or a synthetic aperture above the ground surface and scans the subsurface. The signal received at a specific point is a 1D time-varying amplitude signal, referred to as an A-scan. Scanning along an axis yields 2D data, i.e., a B-scan image. Most of the available studies have investigated scattered fields from a buried object via B-scans, which are concatenated A-scans [6], [7], [8], [9], [10], [11], [12]. The reason is that a buried cylindrical object such as a pipeline, a wire, or a rebar produces a hyperbolic signature in the B-scan [6], and recognition of this hyperbolic signature or pattern is the most common detection approach in analytical, numerical, and artificial intelligence (AI) methods [6], [7], [8], [9], [10], [11], [12].
The associate editor coordinating the review of this manuscript and approving it for publication was Claudia Raibulet.
Several methods are available to reduce reflections from the ground surface and from the scanned subsurface medium. These are collectively referred to as data (or image) pre-processing. Some studies, e.g., [5], [9], and [12], report pre-processing methods based on background subtraction for buried object detection and for estimating object-related properties (characterization of position, object size, etc.) via a hyperbola. In [9], after pre-processing of B-scan images, a column-connection clustering (C3) algorithm is proposed to identify the regions of interest, and a neural network (NN) classifies the C3 outputs to identify hyperbolic signatures. Subsequently, an orthogonal-distance hyperbola fitting algorithm is applied to identify the hyperbola [9]. Furthermore, upon removing the ground reflection, normalized B-scans are used as inputs for convolutional neural networks [12]. In [13], 3D GPR data generated by scanning along and perpendicular to an axis are analyzed using CNN and LSTM (Long Short-Term Memory) units arranged in a cascaded framework for detection of buried explosive objects and discrimination of target and non-target alarms. Another approach is permittivity mapping of subsurface structures for lining detection [14], [15] using customized CNN and deep neural network frameworks. With this approach, dielectric images can be obtained from B-scan data by inversion. Also, feature extraction techniques are applied to average-window-subtracted B-scan images [16] for detecting objects using support vector machine (SVM) and NN classifiers. The buried objects are identified using geometrical features such as minor and major axes, along with statistical features obtained via principal components, mean, variance, kurtosis, etc. [16].
The literature includes a number of works that study buried object identification/determination by means of classifier algorithms. Therein, the purpose is to answer questions of the form ''is there an object?'' or ''what is the material type of the object?'' The response belongs to one of the predefined classes. Furthermore, various tasks such as identification, localization, estimation of the object size and of the dielectric features of the host medium, and classification of material type and shape are solved in addition to object detection [6], [7], [9], [11], [12], [16], [17], [18]. To solve these tasks, AI-based surrogate modelling approaches are proposed, including cascaded networks, e.g., an NN operating on the Hilbert transform of time signals to estimate the object shape, material class, and depth, the dielectric properties of the host medium, and the object size [12]. In addition, the object size is estimated via Gaussian process (GP) regression from windowed pre-processed B-scan images, the results of material type classification, the hyperbola curvature, and the depth of the object [17]. Another study predicts the object radius by using compressed reflected signals, as well as the depth and water content of the subsurface, as the inputs of its machine learning framework [18]. The procedure of [18] is also an example of a cascaded framework, with the predicted object radius depending on other characteristic parameters of the model. In [18] and [19], A-scans of 2,000 randomly created scenarios are used to estimate the characterization parameters with satisfactory results. The use of A-scan analysis in the mentioned studies aims at practical processing and a reduction of the computational resources needed for generating the training data sets.
The techniques mentioned above require data obtained through a series of signal and image processing methods for identification and characterization of a buried object. In this work, as one of its contributions, the characterization of a buried object is achieved by using a data-driven surrogate model constructed with a sparse dataset, without removing ground reflections, background subtraction, B-scan image processing, or hyperbola analysis. In particular, removing ground reflections or subtracting the background might be challenging for buried object scenarios that involve more than one soil type, i.e., host medium dielectric properties that change with the percentage of water content. Herein, three different soil media, distinguished by their water content, are taken into account. The main aim of this work is to prioritize computationally efficient surrogate modeling and to propose a novel deep-learning-based framework that characterizes the object in terms of its geophysical parameters through A-scan analysis, without any background removal or extraction of a hyperbolic signature, using a small number of linearly sampled training scenarios. In addition to proposing this framework, a performance comparison with state-of-the-art techniques reported in the literature is provided for parametric estimation in buried object characterization. The target characterization is carried out independently of the subsurface dielectric properties. In addition, the A-scan data (1-D time signals) obtained from 315 scenarios is analyzed to localize the object and estimate its size by using artificial intelligence (AI) techniques. A novel deep-learning-based model (time-frequency regression model, TFRM) is proposed to simultaneously predict the depth, the location (the lateral position), and the radius of the object at low computational cost.
For supplementary evaluation of the performance of the proposed model under realistic scenarios, additional data sets were generated by incorporating random noise to the A-scan signals. This allows for analysing the effects of the environmental and internal noise of the GPR system on the performance of the proposed surrogate modeling methodology. Furthermore, the proposed surrogate modeling approach is validated using the measurement data.
The remainder of the paper is organized as follows. Section II provides a brief explanation of the GPR model. In Section III, the proposed modelling framework is introduced along with a brief characterization of the benchmark methods. The numerical results are discussed as well, including performance evaluation based on the mean absolute error (MAE) and relative mean error (RME). Also, B-scan-based estimation of the buried object geometry is provided for a number of scenarios. Section IV concludes the paper.

II. GPR MODEL AND GENERATION OF DATA SETS
This section provides a conceptual explanation of the studied ground penetrating radar (GPR) model and the data set generation for data driven surrogate modelling for buried object characterization. We start by explaining the GPR setup and the studied buried object characterization problem. This is followed by discussing the data set generation, and the details of the selected training and test scenarios of the surrogate model.

A. PROBLEM FORMULATION
The primary problem to be solved in this work is estimation of the geophysical parameters of a buried cylindrical PEC object [4], [6], [11], [18], [19] in different soil media by using A-scan analysis together with surrogate modelling techniques, specifically, a dedicated TFRM framework described in detail in Section III. The configuration of the ground penetrating radar model and the datasets being processed are elaborated on in Sections II-B and II-C, respectively. Section III provides the details of the surrogate modelling method developed to carry out the estimation process, along with the numerical results.

B. CONFIGURATION OF GPR MODEL. DESIGN SCENARIOS
In this work, we consider a ground penetrating radar (GPR) problem of characterizing a buried object, based on 3D full-wave electromagnetic (EM) analysis. Here, a C-band pyramidal horn antenna is used in a monostatic configuration [8]. Figure 1 shows the configuration of the characterization process. The scatterer is defined as a perfect electric conductor (PEC) object such as a wire, a pipe, or a rebar. The travelling time of the wave transmitted by the antenna can be computed using the object depth, the relative permittivity of the subsurface, and the speed of light in free space. Note that the wave propagation time depends monotonically on both the depth and the subsurface permittivity.
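The relation between depth, permittivity, and travelling time can be illustrated with a minimal sketch. It assumes a homogeneous, low-loss, non-dispersive medium with a single real relative permittivity and ignores the antenna height above the ground; the function names and the numerical example are illustrative and are not taken from the paper.

```python
import math

C0 = 299_792_458.0  # speed of light in free space, m/s


def two_way_travel_time(depth_m, eps_r):
    """Round-trip time (s) to a reflector at depth_m in a medium with
    real relative permittivity eps_r (simplifying assumption)."""
    velocity = C0 / math.sqrt(eps_r)  # wave velocity inside the medium
    return 2.0 * depth_m / velocity


def depth_from_travel_time(time_s, eps_r):
    """Inverse relation: reflector depth (m) from the round-trip time."""
    return 0.5 * time_s * C0 / math.sqrt(eps_r)


# Hypothetical example values (not from the paper): 100 mm depth, eps_r = 4.
t = two_way_travel_time(0.100, 4.0)
print(f"two-way travel time: {t * 1e9:.3f} ns")
```

Both a larger depth and a larger permittivity increase the travelling time, which is the monotonic dependence noted above.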
The dimensions of the subsurface are set to 400 × 300 × 500 mm. The subsurface is assumed to be a dispersive soil parameterized by the percentage of water content and the dielectric features, as described by the extended Debye model [20]
$$\varepsilon(\omega) = \varepsilon_\infty + \frac{\varepsilon_s - \varepsilon_\infty}{1 + j \omega t_0} - j \frac{\sigma}{\omega \varepsilon_0}$$
where $\varepsilon_s$ and $\varepsilon_\infty$ are the relative permittivity values at zero and infinite frequency, respectively, $t_0$ is the relaxation time, $\sigma$ is the conductivity, $\omega$ is the angular frequency, and $\varepsilon_0$ is the free space permittivity. These parameters are listed in Table 1 [20] and used to compute the subsurface permittivity. Using this model, three different types of soil media are chosen according to their water content of 0.2%, 2.8%, and 5.5% [20]. The buried object estimation is carried out without any information about the soil type or the dielectric permittivity of the background medium. These soil types, distinguished with respect to the water content, have been chosen similarly as in [18] and [19]. GPR scenarios have been configured by placing the aforementioned scatterer of various radii at different depths in the dispersive soil medium, and at different lateral positions relative to the origin of the synthetic aperture. The geometrical structure of the GPR scenarios is shown in Figure 2.
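As a hedged illustration of how such a dispersive permittivity can be evaluated, the sketch below implements the commonly used single-pole Debye form with a conductivity term, eps(w) = eps_inf + (eps_s - eps_inf) / (1 + j w t0) - j sigma / (w eps0). The sign convention and all parameter values in the example are assumptions made for illustration; they are not the Table 1 values of [20].

```python
import math

EPS0 = 8.8541878128e-12  # free space permittivity, F/m


def debye_permittivity(freq_hz, eps_s, eps_inf, t0, sigma):
    """Complex relative permittivity of a single-pole Debye medium with
    static conductivity (assumed form, exp(+jwt) time convention)."""
    w = 2.0 * math.pi * freq_hz
    relaxation = (eps_s - eps_inf) / (1.0 + 1j * w * t0)
    conduction = 1j * sigma / (w * EPS0)
    return eps_inf + relaxation - conduction


# Hypothetical soil parameters, NOT taken from Table 1 of the paper.
eps = debye_permittivity(6e9, eps_s=3.0, eps_inf=2.5, t0=1e-10, sigma=1e-3)
print(f"eps' = {eps.real:.4f}, eps'' = {-eps.imag:.4f}")
```

The negative imaginary part represents losses; its magnitude grows with the conductivity and with the strength of the relaxation term.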
In the next subsection, the generation of the training and the test data sets for a computationally efficient construction of surrogate model for buried object characterization is presented.

C. DATA SET FOR CHARACTERIZATION OF BURIED CYLINDRICAL PEC OBJECT
As mentioned earlier, one of the main contributions of this work is to propose a method for buried cylindrical object [4], [6], [11], [18], [19] characterization, which, to a certain extent, trades off accuracy for exceptional computational efficiency. Achieving this goal requires constructing the surrogate model using a training dataset that is as small as possible. The parameter space has four dimensions, which include the object depth, its lateral position, its radius, and the soil water content. Table 2 gathers information about the design of experiments, which is a linear sampling method used for generating 315 sample points within the given variable ranges. For each scenario, 16 A-scans are taken. It should be mentioned that by increasing the number of A-scans for each scenario, it is possible to increase the accuracy of the model for estimating the lateral position and the object size.
However, this would also significantly increase the dataset generation time. In this work, the 3D full-wave simulation tool CST Microwave Studio has been used for obtaining the data. The discretized structure of the GPR model contains approximately 25,000,000 mesh cells, and the average simulation time of each scenario is ∼14 hours (hardware configuration: Intel(R) Core(TM) i7-CPU @ 2.60GHz Turbo Boost 4.5 GHz, 16 GB RAM). Thus, for the sake of computational efficiency, the number of A-scans has been kept at sixteen. It should also be noted that, although it is possible to include the water content of the ground as another input variable, such an approach might not be feasible in practice due to the inhomogeneities in the examined area. To ensure a more realistic approach, the water content of the ground is varied in a discrete sense, i.e., by using three different values. However, this information is not provided to the model in either the training or the test process. Although this approach makes the modelling process more challenging and is detrimental to the predictive power of the model, it needs to be followed to be more in line with realistic GPR applications.
FIGURE 2. Diagram explaining the geometry of the GPR model and the characteristic parameters of the buried object to be identified.
VOLUME 11, 2023
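The linear (grid) sampling of the four-dimensional parameter space can be sketched as follows. The water content levels are those stated in the text; the ranges and the number of levels per geometric parameter are hypothetical placeholders chosen only so that the grid has 315 points, since the actual design is given in Table 2.

```python
from itertools import product


def linspace(lo, hi, n):
    """n equally spaced values from lo to hi (inclusive)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


# HYPOTHETICAL ranges and level counts (the real ones are in Table 2).
depths_mm = linspace(50.0, 200.0, 7)
laterals_mm = linspace(-100.0, 100.0, 5)
radii_mm = linspace(5.0, 25.0, 3)
water_content_pct = [0.2, 2.8, 5.5]  # the three soil types from the text

# Full-factorial grid: every combination of the sampled levels.
scenarios = list(product(depths_mm, laterals_mm, radii_mm, water_content_pct))

A_SCANS_PER_SCENARIO = 16
n_a_scans = len(scenarios) * A_SCANS_PER_SCENARIO
print(len(scenarios), "scenarios,", n_a_scans, "A-scans")
```

With roughly 14 hours of simulation per scenario, every additional level in any dimension multiplies the data generation cost, which is why the grid is kept sparse.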
As for the test data set, the Latin Hypercube Sampling (LHS) [21] method is used to create randomly selected sample points to prevent over-fitting of the model. A total of 63 scenarios are generated for performance evaluation of the surrogate models. Figure 3 presents the configuration of the characterization parameters for the training and test data sets. The considered problem of estimating geophysical parameters of a buried object is 2D. The training and testing scenarios include B-scan images (2D data), each of which consists of 16 concatenated A-scans. Each A-scan is a time-varying normalized power amplitude signal obtained at one point along the synthetic aperture, and it has a length of 600 samples. In addition, each A-scan is labelled with an A-scan ID according to its point on the scanning path (400 mm). Each input thus consists of 600 time-varying amplitudes and the A-scan ID (601 × 1). Here, the A-scan ID is an integer between 1 and 16. The training and testing datasets consist of 315 linearly sampled scenarios and 63 randomly selected scenarios, respectively, with the data acquired using a full-wave EM simulator. Each scenario contains 16 A-scan signals (a total of 5040 and 1008 A-scans for the training and test data sets, respectively).
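A minimal sketch of Latin Hypercube Sampling is given below for readers unfamiliar with the method. Each variable range is divided into as many equal strata as there are samples, and every stratum is used exactly once per variable. The bounds are hypothetical, and only the three continuous geometric parameters are sampled here; how the discrete water content was assigned to the test scenarios is not covered by this sketch.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points; each dimension has one point per stratum."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        column = [lo + (s + rng.random()) / n_samples * (hi - lo) for s in strata]
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# HYPOTHETICAL bounds for depth, lateral position, and radius (mm).
bounds = [(50.0, 200.0), (-100.0, 100.0), (5.0, 25.0)]
test_points = latin_hypercube(63, bounds)
print(len(test_points), "test scenarios, first:", test_points[0])
```

Because the test points fall between the grid levels used for training, the test error reflects interpolation accuracy rather than memorization of the training scenarios.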
In the studied GPR problem, the received reflected signal or statistical features of the signal (power amplitude and A-scan ID) are used as inputs of the AI-based surrogate models. The latter are employed to predict the characteristics of the buried objects. A-scans have been used as inputs to enable practical processing, as well as to reduce the computational resources necessary for generating the training/testing datasets [18], [19]. The raw time-variable signal dataset consists of the power amplitude of the received reflected signals versus the travelling time of the wave along the penetration axis. The time window is set to 12 ns, based on the relation between the travelling time, the depth, and the subsurface permittivity. Figure 5 shows a modulated Gaussian signal with a center frequency of 6 GHz, used as the excitation signal, along with an example of A-scans obtained for a specific scenario. The same figure shows a B-scan image constructed by combining the A-scans obtained from reflections of the transmitted excitation signal along a selected scanning path. Fig. 4 presents a black-box representation of the proposed data-driven surrogate modeling approach. The 601 features defined on the left-hand side of the black box are the input variables used for the characterization of the targeted outputs (depth, lateral position, and radius of the object), presented on the right-hand side of the model.

D. NOISY DATA SETS FOR CHARACTERIZATION OF BURIED CYLINDRICAL PEC OBJECT
In this study, for further verification of the proposed surrogate model, new data sets are generated by adding random noise [15], [22], [23], [24], [25], [26], [27], [28] to the generated raw A-scans. The literature offers different approaches to noise incorporation, serving different purposes such as data augmentation [14], [23], [24], [29], approximating realistic scenarios [14], [22], [24], [29], [30], and testing the sensitivity and stability of the considered models [15], [25], [26], [27], [28]. The cases studied in [22] and [23] are arranged to bring the models closer to real-time applications, specifically by considering noisy data sets. In [22], noise with a signal-to-noise ratio of 10 dB is added to B-scan images generated by the gprMax simulation tool. In [23], random noise is added to B-scans generated by gprMax. Another technique integrates real background reflections into B-scans in order to generalize to realistic scenarios, as well as for data augmentation [14]. A similar technique is applied in a study of buried target detection with deep learning [29], using real background data without targets and with the air-soil boundary reflections removed. Another approach replaces randomly chosen pixels (typically, from 0.3% to 25% of the overall number of pixels) with white and black pixels to obtain noisy data for further verification of the model in realistic scenarios [30]. This methodology is applied to GPR B-scans generated by gprMax for predicting the object size using a CNN model and extracted hyperbolic patterns [30]. The method is suitable for analysis based on B-scan images or 2D data processing techniques; it approximates real environmental conditions because the reflected waves in concatenated form (B-scan) are distorted to a degree that depends on the percentage of pixels altered.
The particular problem considered in this work is solved by using A-scan analysis, i.e., the combination of 1D time-varying amplitude signals (A-scans) and the A-scan ID, so the methodology of randomly adding white and black pixels is not applicable to the proposed surrogate modeling approach. Hence, for further verification of the proposed surrogate model, supplementary noisy data sets were created with different SNR values (20 dB and 30 dB) by adding white Gaussian noise [15] to emulate conditions that are closer to real-time or on-site applications. Fig. 6 shows noisy A-scans for two different scenarios and for SNR values of 20 dB and 30 dB.
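The noise incorporation step can be sketched as below. The sketch assumes that the noise power is set relative to the measured mean power of each clean A-scan; whether the data sets of this work were generated with exactly this convention is not stated, and the test signal is purely illustrative.

```python
import math
import random


def add_awgn(signal, snr_db, seed=None):
    """Add white Gaussian noise so that the expected SNR equals snr_db.
    Assumption: SNR is defined with respect to the measured signal power."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) of a noisy realization of a known clean signal."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative 600-sample signal (not a simulated A-scan).
clean = [math.exp(-((i - 200) / 40.0) ** 2) * math.sin(0.6 * i) for i in range(600)]
for snr in (20.0, 30.0):
    noisy = add_awgn(clean, snr, seed=1)
    print(f"target {snr:.0f} dB, measured {measured_snr_db(clean, noisy):.1f} dB")
```

For a finite-length signal the measured SNR fluctuates slightly around the target value, since the noise is a random realization.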
It should be emphasized that the approach to noise incorporation employed in this work is commonly used in the literature. For example, adding Gaussian noise is proposed for automatic recognition and localization of pipelines [28]. A deep convolutional neural network model is developed for coastal hazard mitigation [24]. Therein, random Gaussian noise is inserted to replicate field scenarios [24]. Several studies on noise suppression and denoising follow the same approach and utilize Gaussian noise addition [25], [26], [27], [28]. In GPR systems, interior or system noise leads to interference with the reflected signals; it is modelled as white Gaussian noise [25]. The K-singular value decomposition (K-SVD) dictionary learning method has been introduced for the denoising of GPR signals [25], and it can effectively suppress Gaussian noise. Another study proposes the ensemble empirical mode decomposition (EEMD) method for noise suppression on original GPR data with added white Gaussian noise [26]. The method has been shown to be successful for denoising both synthetic noisy and practical GPR data [26]. Also, white Gaussian noise of different amplitudes is added to generated signals to test the sensitivity to noise and the stability of the model [15], [27].
To facilitate further research in this area, the data sets (with and without noise) used in this work have been shared in the IEEE DataPort [31]. To briefly describe the data sets, the training and the testing sets are data matrices of size 5040 × 605 and 1008 × 605, respectively. Here, 5040 (obtained from 315 different scenarios) and 1008 (obtained from 63 different scenarios) are the sample sizes of the training and test data sets, respectively. The features in columns 1 through 601 are the inputs of the model, while the depth, lateral position, and radius of the object, and the water content of the soil, are stored in columns 602 through 605. As mentioned in Section II-B, although the water content of the soil is a variable, this feature is not presented to the model since such an approach might not be feasible in practice due to the inhomogeneities in the examined area. In [31], three data set pairs are provided for the cases studied in this work, each with the structure explained above: (I) data without any noise, (II) data with 20 dB SNR, and (III) data with 30 dB SNR.
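The column layout described above can be made concrete with a short sketch that separates one 605-element row into model inputs, regression targets, and the withheld water content. The function name is hypothetical, and the row used in the example is synthetic; loading the actual files from [31] is outside the scope of this sketch.

```python
N_INPUTS = 601   # columns 1..601: A-scan ID and 600 amplitude samples
N_COLUMNS = 605  # columns 602..605: depth, lateral position, radius, water content


def split_row(row):
    """Split one data row into (inputs, targets, water_content)."""
    if len(row) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} values, got {len(row)}")
    inputs = list(row[:N_INPUTS])
    depth, lateral_position, radius, water_content = row[N_INPUTS:]
    # The water content is kept separately and never passed to the model.
    return inputs, (depth, lateral_position, radius), water_content


# Synthetic row whose values equal their 1-based column numbers.
row = list(range(1, N_COLUMNS + 1))
inputs, targets, water = split_row(row)
print(len(inputs), targets, water)

# Sample counts quoted in the text: 16 A-scans per scenario.
assert 315 * 16 == 5040 and 63 * 16 == 1008
```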

III. DATA DRIVEN SURROGATE MODELING
This section outlines the state-of-the-art techniques utilized in the context of model-based characterization of buried objects, and introduces a novel deep-learning-based approach, the time-frequency regression model (TFRM). MATLAB has been used as the coding environment for all analysis, training, and testing of the benchmark methods and the proposed TFRM framework.

A. STATE-OF-THE-ART OF SURROGATE-BASED CHARACTERIZATION OF BURIED OBJECTS
Surrogate-based characterization of buried objects has been a subject of extensive research in recent years. Some of the popular methods utilized in this context include SVRM [32], MLP [3], [9], [11], GP regression [17], and CNN [12], [13], [30], [33]. These methods are briefly characterized below, and will be used as benchmark methods against which the modelling approach introduced in Section III-B is compared.
The Gaussian process approach is an approximation-based machine learning technique widely used for estimation and classification problems. GP regression is based on generalizing Gaussian probability distributions to functions [17], [34]. In this work, the kernel function for numerical experiments has been selected as ''matern3/2'' [35], and the prediction method has been defined as block coordinate descent with a block size of 250. Determination of hyper-parameters is an important step in surrogate modelling. Here, Bayesian optimization (a built-in optimization tool in MATLAB) has been used to determine the optimum hyper-parameters. For validation, the K-fold technique with K = 5 has been used.
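For reference, the Matérn 3/2 covariance function and a 5-fold index split can be sketched as follows. The kernel expression is the standard one; the length scale and signal standard deviation are placeholder values, and the contiguous, unshuffled fold assignment is an assumption of this sketch rather than the documented behaviour of the MATLAB routines used in this work.

```python
import math


def matern32(x, y, length_scale=1.0, sigma_f=1.0):
    """Matern 3/2 kernel: sigma_f^2 * (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l)."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    s = math.sqrt(3.0) * r / length_scale
    return sigma_f ** 2 * (1.0 + s) * math.exp(-s)


def kfold_indices(n_samples, k=5):
    """Contiguous K-fold split: list of (train_indices, validation_indices)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, validation))
        start += size
    return folds


folds = kfold_indices(5040, k=5)
print(len(folds), "folds,", len(folds[0][1]), "validation samples each")
print("k(x, x) =", matern32([0.0, 1.0], [0.0, 1.0]))
```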
Another surrogate modeling technique in the machine learning class is the support vector regression machine (SVRM). Support vector methods have been applied not only to object and material type detection through classification [7], [11], [16], using the support vector machine (SVM), but also to prediction of soil permittivity and depth [32] in a regression setting. SVRM belongs to the group of supervised statistical learning methods [7], [11], [16], [32].
Herein, similar to the GP regression model, the hyper-parameters of SVRM are determined via Bayesian optimization to realize a nonlinear mapping between the received reflected signals and the characteristic parameters of the scatterer. The optimally selected hyper-parameters for SVRM are: a Gaussian kernel function, a box constraint of 1.72, a kernel scale of 0.1977, and an epsilon of 0.0008.
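The role of two of these hyper-parameters can be illustrated with a short sketch. It assumes the common convention in which the inputs are divided by the kernel scale before the Gaussian kernel exp(-||x - y||^2) is applied, and it uses the standard epsilon-insensitive loss of support vector regression; the input vectors in the example are arbitrary.

```python
import math

KERNEL_SCALE = 0.1977  # value reported in the text
EPSILON = 0.0008       # value reported in the text


def gaussian_kernel(x, y, kernel_scale=KERNEL_SCALE):
    """Gaussian kernel on inputs scaled by kernel_scale (assumed convention)."""
    sq_dist = sum(((a - b) / kernel_scale) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist)


def epsilon_insensitive_loss(target, prediction, epsilon=EPSILON):
    """Errors smaller than epsilon are ignored; larger ones grow linearly."""
    return max(0.0, abs(target - prediction) - epsilon)


print(gaussian_kernel([0.10, 0.20], [0.12, 0.25]))
print(epsilon_insensitive_loss(1.0, 1.0005))  # inside the epsilon tube
print(epsilon_insensitive_loss(1.0, 1.0100))  # outside the epsilon tube
```

A small epsilon, as selected here, means that almost every residual contributes to the loss, so the fitted function follows the training targets closely.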
MLP mimics biological neural systems in the form of interconnected neurons. This framework has demonstrated high accuracy in surrogate modeling applications [3], [9], [10], [11], [36]. The MLP model utilized here features the following hyper-parameter configuration: two hidden layers with 32 and 16 hidden neurons, respectively; log-sigmoid activation functions; and training by the Levenberg-Marquardt algorithm. Similar to GP regression and SVRM, these hyper-parameters are also obtained via Bayesian optimization.
The last state-of-the-art benchmark surrogate model is CNN, which is a deep learning technique [6], [12], [13], [29], [33]. It takes its name from the convolutional layer, which is one of the main elements of the network and has the ability to automatically extract features owing to the convolution filter in this layer. To implement the CNN model, several blocks are utilized: a convolution layer (filter), a batch normalization layer, a pooling layer, an activation function, and a fully connected (FC) layer [37], all involved in the hidden layer. The structure and the hyper-parameter configuration of the CNN used in this work are as follows: three convolution layers (3 × 1), each followed by a batch normalization layer and a ReLU (Rectified Linear Unit) activation layer, as well as the pooling layers (2 × 1) incorporated after the last convolution layer, and a fully connected layer with three neurons to model the requested outputs. The CNN model is trained using a version of the back propagation algorithm (referred to as the ''Adam'' optimizer) with a batch size of 256. The learning rate has been set to $10^{-3}$, and the maximum number of epochs to 300. Other user-defined parameters, such as the size and the number of the convolution filters (32, 64, 128) and the pooling layer, are set up as recommended in the literature in the context of buried object detection and characterization [13].
At the same time, it should be mentioned that, in the cited works, the input data is two-dimensional; therefore, the filters of the CNN layers are of the corresponding dimensionality, whereas the filters of the CNN model used in this work are one-dimensional.
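To make the notion of a one-dimensional filter concrete, the sketch below applies a single 3 × 1 filter to a short signal, followed by ReLU and 2 × 1 pooling. Zero padding at the borders and max pooling are assumptions of this sketch (the text does not state the padding mode or the pooling type), and the filter weights are arbitrary rather than learned.

```python
def conv1d_same(x, w):
    """Slide an odd-length filter w over x (cross-correlation, zero padding)."""
    half = len(w) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(w[k] * padded[i + k] for k in range(len(w))) for i in range(len(x))]


def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]


def max_pool(x, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


signal = [1.0, 2.0, 3.0, 4.0]   # toy stand-in for an A-scan segment
kernel = [1.0, 0.0, -1.0]       # arbitrary 3 x 1 filter
features = max_pool(relu(conv1d_same(signal, kernel)))
print(features)
```

In the 2D case used in the cited works, the same operation slides a small matrix over a B-scan image instead of a short vector over a time signal.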
All the algorithms considered in this work, including CNN and the proposed technique, use the same 1D data sets, which comprise the time-varying reflected normalized power amplitude and the A-scan ID (1 × 1) assigned according to the position along the scanning aperture. As mentioned in Section II-C, the data set used for the proposed surrogate modeling is a combination of 1D time-varying amplitude signals (A-scans, 600 × 1) and the A-scan ID, giving an input of size 601 × 1. The first input is the A-scan ID, which varies between 1 and 16 along the scanning axis. It should be reiterated that the proposed TFRM is a customized deep-learning-based framework that internally converts the given input data into a time versus frequency spectrogram, which is a specific representation of the A-scan and A-scan ID combination.

B. PROPOSED TIME-FREQUENCY REGRESSION MODEL TECHNIQUE FOR SURROGATE MODELING OF BURIED OBJECTS
This section introduces the proposed deep-learning-based time-frequency regression model (TFRM) for buried object characterization. The Short Time Fourier Transform (STFT) of a 1D signal offers a joint distribution that enables both time and frequency analysis [38]. In addition, STFT images (2D data) have been used to classify sound signals in many CNN-based studies [39]. The proposed TFRM has been inspired by the aforementioned developments. Deep learning algorithms have shown great success in problems featuring large data sets (e.g., in big data applications). However, the applicability and success rate of deep learning are not strictly limited to cases featuring large numbers of training samples [40]. Deep learning algorithms can be used to transform the input data to a higher-dimensional space using a large number of neurons and layers, and then to distill the transformed data into lower dimensions to better handle the interrelations between the inputs and outputs of the problem. Thus, even with a small number of training samples, it is possible to use deep learning algorithms to create a globally accurate surrogate model [41].
As mentioned earlier, the proposed methodology utilizes GPR data as time-varying signals, i.e., A-scan analysis is carried out. The A-scans obtained from the B-scans are augmented with an A-scan ID (1 × 1), assigned according to the lateral position along the scanning aperture, for each of the A-scans of all of the B-scans. Here, the A-scan ID is an integer between 1 and 16. In this way, the 2D data (B-scan) is reduced to a 1D combination of reflected field amplitude versus time and A-scan ID (601 × 1), so that A-scan data [31] with a length of 601 is used in this study. It should be emphasized that the proposed TFRM is a customized deep-learning-based framework that internally transforms the given input data into a 2D time-frequency spectrogram, which is a specific representation of the A-scan and A-scan ID combination. The literature offers a similar approach [13], oriented towards detection of buried explosive objects and discrimination of target and non-target alarms. Therein, 3D GPR volume data is used as an input by converting 2D data.
First, before the model is trained, the A-scan signals are zero-averaged. Then, blocks of length 32 are extracted with an overlap of sixteen samples to form the STFT image of the signals. Each block is multiplied by a Kaiser window [42] to reduce spectral leakage. Subsequently, Fast Fourier Transforms (FFT) of length 64 are taken. Since the FFT magnitude is symmetric for real signals, only the one-sided portion of the image is kept. Thus, a spectrogram of dimension 33 × 36 is obtained. The magnitude spectrum is calculated to process the spectrogram as an image. With the conversion of complex numbers to real numbers by taking their magnitude, the STFT magnitude spectrum images of the A-scan signals are formed. Next, these images are reduced to dimension 32 × 32, so that each side is a power of two (cf. Fig. 7).
The TFRM model consists of five main blocks, as shown in Fig. 8. The first three blocks are used to extract the features from the 32 × 32 × 1 STFT image. After the features of the STFT image are extracted in the convolution layers, the data is processed by batch normalization and a Leaky ReLU activation layer with a scale parameter of 0.1 (a scalar multiplier of negative inputs). Owing to the global average pooling layer [43] used in the fourth block, a 1 × 1 × 256 feature vector is obtained. In the fifth and last block, the feature vector is processed by 512 neurons in the fully connected layer and converted to the characteristic parameters, i.e., the location (depth, lateral position) and the radius of the object.
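The STFT pre-processing steps described above can be sketched in plain Python as follows. The block length (32), overlap (16), and FFT length (64) follow the text. The Kaiser shape parameter (beta = 5) and the reduction to 32 × 32 by cropping are assumptions of this sketch, since neither is specified here; a direct DFT is used instead of an FFT routine to keep the example self-contained.

```python
import cmath
import math


def bessel_i0(x, terms=25):
    """Modified Bessel function I0 via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) / k
        total += term * term
    return total


def kaiser_window(n, beta):
    """Kaiser window of length n with shape parameter beta."""
    values = []
    for i in range(n):
        ratio = 2.0 * i / (n - 1) - 1.0
        values.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))) / bessel_i0(beta))
    return values


def stft_magnitude(signal, win_len=32, overlap=16, n_fft=64, beta=5.0):
    """One-sided STFT magnitude, indexed as spec[frequency_bin][time_frame]."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]                 # zero-averaging
    window = kaiser_window(win_len, beta)
    hop = win_len - overlap
    n_frames = (len(x) - win_len) // hop + 1       # 36 for 600 samples
    n_bins = n_fft // 2 + 1                        # 33 one-sided bins
    twiddle = [[cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(win_len)]
               for k in range(n_bins)]
    spec = [[0.0] * n_frames for _ in range(n_bins)]
    for t in range(n_frames):
        frame = [x[t * hop + n] * window[n] for n in range(win_len)]
        for k in range(n_bins):
            spec[k][t] = abs(sum(f * w for f, w in zip(frame, twiddle[k])))
    return spec


def crop_to_32(spec):
    """Reduce the 33 x 36 image to 32 x 32 by cropping (assumed method)."""
    return [row[:32] for row in spec[:32]]


# Illustrative 600-sample input (not a simulated A-scan).
a_scan = [math.sin(2 * math.pi * 8 * n / 64) for n in range(600)]
image = crop_to_32(stft_magnitude(a_scan))
print(len(image), "x", len(image[0]))
```

With 600 samples, a block length of 32, and a hop of 16 samples, the number of frames is floor((600 - 32) / 16) + 1 = 36, which matches the 33 × 36 spectrogram size stated above.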
The Mean Absolute Error (MAE) and the Relative Mean Error (RME) have been used as performance metrics of the surrogate models. The metrics are defined as
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|T_i - P_i\right|, \qquad \mathrm{RME} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|T_i - P_i\right|}{\left|T_i\right|},$$
where N is the total number of samples, whereas T_i and P_i are the target and model-predicted values, respectively, for the ith sample. Table 3 shows the average errors of all considered surrogate models and their corresponding average training times, averaged over ten independent runs. In order to clearly demonstrate the performance of the proposed TFRM, it is compared with the state-of-the-art and most commonly used regression algorithms reported in the literature (GP regression, SVRM, MLP, CNN). To ensure a fair comparison between TFRM and these counterpart algorithms, the hyperparameter configurations of the benchmark models are set to standard values from the literature or to the configurations given in the counterpart works [13]. It can be observed that the accuracy of the proposed TFRM framework is significantly better than that of all benchmark surrogates (the training datasets being identical for all models). In particular, the MAE of TFRM is about half that of the second-best model (here, CNN). Furthermore, the training time of TFRM is considerably lower than for all benchmark methods. Table 4 shows the modelling errors for individual characteristic parameters for the proposed and the benchmark models. It can be observed that, in a qualitative sense, the results are consistent with those presented in Table 3. As can be seen from the error metrics of the benchmark models and the proposed model, the estimation of the geophysical parameters of the buried object (depth, lateral position and radius, each estimated independently) from raw A-scan signals is satisfactory, and it is achieved in a computationally efficient manner even when sparse training samples are used.
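The two error metrics can be computed with a few lines of code. The sketch below assumes the usual forms: MAE as the mean of the absolute errors, and RME as the mean of the absolute errors normalised by the magnitude of the target. The RME form in particular should be treated as an assumption, and the numbers in the example are hypothetical, not taken from the tables.

```python
from typing import Sequence


def mean_absolute_error(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """MAE: average of |T_i - P_i| over all samples."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def relative_mean_error(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """RME: average of |T_i - P_i| / |T_i| (assumed form; targets must be non-zero)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical depth values in mm, for illustration only.
true_depths = [100.0, 200.0, 50.0]
predicted_depths = [110.0, 190.0, 45.0]
mae = mean_absolute_error(true_depths, predicted_depths)
rme = relative_mean_error(true_depths, predicted_depths)
print(round(mae, 3), round(100.0 * rme, 2))  # MAE in mm, RME in percent
```

Because RME divides by the target, small targets such as the object radius produce a large relative error even when the absolute error is small, which is why both metrics are worth reporting side by side.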
Also, Table 5 and Figure 10 demonstrate the target and predicted parameters of selected test samples for the second-best model (the CNN benchmark) and the proposed model. The TFRM framework is trained with the Adam optimizer, a gradient-based optimization method used together with the back propagation algorithm. Here, it is used with a batch size of 256 and a maximum of 300 epochs, and the data is shuffled in every epoch. Figure 9 shows the training progress as training loss versus iteration number.
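The batching scheme mentioned above (batch size of 256, reshuffling in every epoch) can be sketched as follows. The helper name, the seed handling, and the sample count of 1000 are hypothetical; the optimizer update itself is omitted.

```python
import random
from typing import List, Optional


def epoch_batches(n_samples: int, batch_size: int = 256,
                  seed: Optional[int] = None) -> List[List[int]]:
    """Return the sample indices of one epoch, shuffled and cut into mini-batches."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)                      # a new random order for this epoch
    return [order[i:i + batch_size] for i in range(0, n_samples, batch_size)]


# Hypothetical training set of 1000 samples; one optimizer iteration per batch.
batches = epoch_batches(1000, batch_size=256, seed=0)
print(len(batches), [len(b) for b in batches])  # prints "4 [256, 256, 256, 232]"
```

Calling the helper once per epoch with a different seed (or with `seed=None`) gives a different order each time, so the iteration count on a loss curve equals the number of epochs times the number of batches per epoch.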

C. B-SCAN EVALUATION
In the proposed approach, buried object characterization is carried out using the 1D data of the normalized power amplitude of the reflected signals. The latter are obtained for different positions of the scanning aperture and processed by means of the proposed TFRM framework. For additional validation, and to better demonstrate the performance of the proposed approach as compared to the benchmark methods, a B-scan evaluation is studied in this section.
The B-scan images are 2D datasets that can be generated by combining the A-scan data obtained from different positions [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. In the GPR setup defined in Section II, a B-scan can be obtained using sixteen A-scan datasets for each scenario. Again, it should be mentioned that increasing the total number of A-scans can increase the accuracy of the model; however, this comes at the expense of computational efficiency. The performance evaluation of the best benchmark method, CNN, and the proposed approach, TFRM, is presented in Table 5 and Fig. 10.
These results are the average values for all characteristic parameters (depth, lateral position, radius) obtained from sixteen different A-scans, compared with the actual object characteristics. As can be seen, the performance of the proposed TFRM framework (average MAE of 11.95) is clearly superior in terms of object characterization to that of the best benchmark technique, CNN (average MAE of 23.01).
It should be noted that TFRM can almost pinpoint the center position of the target object, whereas CNN fails in this respect.
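One reading of the B-scan evaluation is that the sixteen per-A-scan estimates of a scenario are averaged into a single estimate per parameter, which is then compared with the true object. The sketch below follows that reading; the function name and all numerical values are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_predictions(per_ascan: Sequence[Sequence[float]]) -> Tuple[float, ...]:
    """Average per-A-scan estimates (depth, lateral position, radius) column-wise."""
    n = len(per_ascan)
    n_params = len(per_ascan[0])
    return tuple(sum(p[j] for p in per_ascan) / n for j in range(n_params))


# Sixteen hypothetical estimates (mm) scattered around a true object (300, 500, 20).
true_object = (300.0, 500.0, 20.0)
estimates: List[Tuple[float, float, float]] = [
    (300.0 + d, 500.0 - d, 20.0 + 0.1 * d) for d in range(-8, 8)
]
bscan_estimate = average_predictions(estimates)
abs_errors = [abs(e - t) for e, t in zip(bscan_estimate, true_object)]
print(bscan_estimate, abs_errors)
```

Averaging over the aperture damps the scatter of the individual A-scan estimates, at the cost of requiring all sixteen A-scans of the scenario before a result is available.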
As mentioned in Section II-D, for a supplementary evaluation of the performance of the proposed model, new data sets were generated by adding random noise [15], [22], [23], [24], [25], [26], [27], [28] to the raw A-scan signals. In this way, the effect of the environmental and internal noise of the GPR system was analyzed. In particular, the supplementary noisy data sets were created with different SNR values (20 dB and 30 dB) by adding white Gaussian noise [15]. These noisy data sets have been analyzed without any pre-processing to estimate the characteristic parameters under scenarios that are closer to real-time or on-site applications. For this investigation, the second-best surrogate model, CNN, was used for comparison with the proposed framework, and the results have been presented in the following table. The performance evaluation of the best benchmark model, CNN, and the proposed approach, TFRM, for the noisy data sets is also demonstrated in Table 7 and Figure 11.
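Noisy A-scans at a prescribed SNR can be generated as in the following sketch. It assumes that the SNR is defined as the ratio of the average signal power to the noise variance; the function name, the seed, and the damped-sinusoid placeholder signal are illustrative choices, not taken from the paper.

```python
import math
import random
from typing import List, Sequence


def add_awgn(signal: Sequence[float], snr_db: float, rng: random.Random) -> List[float]:
    """Add zero-mean white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """Empirical SNR in dB from a clean signal and its noisy version."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)


# Placeholder A-scan: damped sinusoid with 601 samples.
clean = [math.exp(-n / 200.0) * math.sin(2.0 * math.pi * n / 40.0) for n in range(601)]
rng = random.Random(42)
noisy_20 = add_awgn(clean, 20.0, rng)
noisy_30 = add_awgn(clean, 30.0, rng)
print(measured_snr_db(clean, noisy_20), measured_snr_db(clean, noisy_30))
```

The measured values fluctuate around 20 dB and 30 dB because the noise is random and the signal is short; the target SNR is only met on average.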
Here, it must be emphasized that all the data samples in the training and test data sets (315 scenarios obtained with linear sampling and 63 scenarios obtained with the Latin Hypercube Sampling (LHS) method) are completely different from each other. This can be observed in Fig. 3 and in the obtained results. In the case of high similarity between the training and test samples, or of a large number of training samples compared to test samples, all surrogate models would exhibit high accuracy (due to either true performance or over-fitting). Since the same data samples are utilized to construct all models (GP regression, SVRM, MLP, CNN, TFRM), all methods would be expected to present similarly high accuracy if the training and test samples were similar. However, the results presented in Tables 4-7 and Figs. 10-11 suggest the opposite. Furthermore, according to the results in Tables 4 and 6, when the complexity of the data set is increased, e.g., by adding noise, the performance of TFRM remains superior to that of the nearest benchmark method (CNN): for the SNR value of 30 dB, the average MAE of TFRM increases from 11.9 to 27.5, whereas the average MAE of CNN increases from 23 to 38.3. Thus, although increasing the complexity of the data set, either by changing the training-to-test ratio or by adding noise, clearly affects the overall performance of all models, as can be seen from the presented results, the proposed TFRM remains superior to the benchmark methods in such cases.
In the proposed model, the characteristic parameters of the object are estimated without any information about the background medium, such as the dielectric permittivity or the water content, which, according to the Debye model, is proportionally related to the dielectric permittivity and conductivity of the subsurface. This approach is followed to account for the complexity of the generated scenarios, in which the relative dielectric permittivity of the different subsurface media depends on the water content. As noted in the study on reconstruction of permittivity images with a deep convolutional network [33], the real dielectric features of the underground medium are complex and difficult to obtain. For this reason, and because of the relationship between the travel time of the signals (the depth) and the subsurface permittivity, the error rates in the estimation of the geophysical parameters of the object can be negatively affected.
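The coupling between travel time, depth and permittivity can be made concrete with the standard low-loss, non-dispersive approximation, in which the wave velocity is c / sqrt(eps_r). This is a simplification of the dispersive Debye media considered in the paper and is used here only to show why an unknown permittivity makes depth estimation ambiguous; the travel time and permittivity values are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def depth_from_travel_time(two_way_time_s: float, eps_r: float) -> float:
    """Depth of a reflector from the two-way travel time in a low-loss medium."""
    velocity = C0 / math.sqrt(eps_r)        # wave velocity in the soil
    return velocity * two_way_time_s / 2.0  # halve the two-way path


# The same 10 ns echo maps to different depths for different soils.
for eps_r in (4.0, 9.0, 16.0):
    print(eps_r, round(depth_from_travel_time(10e-9, eps_r), 4))
```

An identical echo delay therefore corresponds to depths that differ by a factor of two between eps_r = 4 and eps_r = 16, so a model that is not told the permittivity has to infer it implicitly from the waveform.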
By adding the water content parameters to the input data, an additional analysis has been conducted to investigate the influence of knowledge about the background media on the proposed model. The resulting average MAE is 11.1 mm and the average RME is 15.30%. The characteristic parameter prediction is as follows: the radius is predicted with an MAE of 7.3 mm and an RME of 33.2%;
In this study, the performance of the proposed surrogate modeling approach was also validated using measurement data. 1D A-scans have been collected through measurements in a "sand pool" environment. The purpose is to demonstrate that the proposed approach is also applicable when physical measurements are used as the source of data.
The experimental samples are obtained in the laboratory at Yıldız Technical University. During the process, raw B-scan data corresponding to various scenarios are generated by the impulse ground penetrating radar system, which is utilized in various subsurface imaging operations [44], [45], [46]. Figure 12 shows the experimental setup. The measurements are taken in a wooden pool filled with inhomogeneous dry soil consisting of a mixture of small stones and sand. The pool, in which the object is buried, is used as the scanning subsurface domain. It has dimensions of approximately 1.40 m (width), 0.22 m (depth) and 1.15 m (length).
The experimental setup (GPR, transmitter and receiver antennas) is manually moved above the soil along a scanning path with an approximate length of 1.40 m. Each B-scan has a size of 382 (discrete time steps) × 65 (number of A-scans). Hence, the input data length is 383, the first entry being the A-scan ID according to the lateral position on the scanning axis, and the remaining part being the amplitude of the reflected time signal (382 discrete time steps). Figure 13 presents A-scans from a sample test scenario. The buried object, with radii of 10 mm, 15 mm, 20 mm, 25 mm and 30 mm, is placed at different locations (depth and lateral positions of 550 mm, 650 mm, 770 mm and 900 mm). The data set created
Two case studies are used to investigate the performance of the proposed surrogate modeling approach for measurement data. The first one (Case 1) is based on K-fold validation [47]. In the second one (Case 2), the available data is split into training and hold-out data sets.
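The two validation schemes can be sketched as index splits. The number of folds (5), the hold-out fraction (0.2), the sample count (40) and the function names are hypothetical, since the text does not state them; whether the split is made per scenario or per A-scan is likewise left open here.

```python
import random
from typing import List, Tuple


def k_fold_splits(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Case 1: every sample is used for testing exactly once across the k folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, folds[i]))
    return splits


def hold_out_split(n_samples: int, test_fraction: float,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Case 2: a single split into training and hold-out indices."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


splits = k_fold_splits(40, k=5)
train_idx, test_idx = hold_out_split(40, test_fraction=0.2)
print([len(test) for _, test in splits], len(train_idx), len(test_idx))
```

K-fold validation gives an error estimate averaged over all folds, which is useful when measured data is scarce, whereas the hold-out split yields a single test set that is never seen during training.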
The unprocessed raw B-scan images are mapped to 1D data (A-scan and A-scan ID combination, 383 × 1) for use in the proposed A-scan analysis technique, which simultaneously yields the characteristic parameters of the buried object in terms of its depth, lateral position and radius. For the sake of comparison, the CNN method has been used as a benchmark technique. The results can be found in Tables 8 and 9, as well as in Fig. 14.
These results are the average values for all characteristic parameters (depth, lateral position, radius) obtained from different runs of the models. Figure 14 and Table 10 demonstrate the predicted characteristic parameters compared with the actual object parameters. As can be seen, the performance of the proposed TFRM framework (average MAE of 25.9) is clearly better in terms of object characterization than that of the benchmark (average MAE of 35.0). In particular, the TFRM-based prediction of the geophysical parameters is satisfactory, as indicated in Table 10 and Fig. 14, and it is considerably better than that of CNN.

IV. CONCLUSION
This work introduced a novel approach to surrogate-assisted characterization of buried objects. The presented approach utilizes AI methods, in particular, a deep-learning-based TFRM framework. Its major advantages are computational efficiency and the ability to construct an accurate representation of the buried object characteristics using small training datasets. The presented framework has been comprehensively validated using a number of specific cases of a perfectly electric conductor (PEC) object buried in different dispersive soil media at various positions. It has also been compared to a number of state-of-the-art benchmark methods, including GP regression, MLP, SVRM, and CNN, all commonly used for the same purpose. The results indicate competitive performance of the proposed technique, with an MAE value of less than 12 (as compared to an MAE of 23 for the best benchmark approach, CNN). These results also demonstrate that the TFRM framework can be viewed as an efficient and accurate approach to solving the GPR-based buried object characterization task under a low computational budget. The proposed methodology for estimating the characteristic parameters has been validated using noisy data (SNR values of 20 dB and 30 dB) and measurement data. The error metrics and geometrical representations of the characteristic parameters indicate that the novel regression surrogate model, TFRM, is superior to the benchmark models, including CNN. Future work will include the extension of the proposed data-driven surrogate modeling approach to the characterization of multiple objects and to an increased number of characteristic parameters of the buried objects.
REYHAN YURT received the M.Sc. degree in electronics and communication engineering from Çankaya University, Turkey, in 2017. She is currently pursuing the Ph.D. degree with Yıldız Technical University, Turkey. Her main research interests include analytical and numerical electromagnetic modeling approaches, application of artificial intelligence-based algorithms, and analytical and numerical modeling of microwave devices.
HAMID TORPI received the B.S. and M.S. degrees in electronics and communication engineering from Yıldız University, Istanbul, Turkey, in 1988 and 1991, respectively, and the Ph.D. degree from Yıldız Technical University (YTU), in 1997. He is currently an Assistant Professor with YTU. His research interests include neural network applications of microwave circuit and devices, antennas, and design of microwave circuits and devices. He was a recipient of Science Awards from the Turkish Scientific and Technical Research Council and YTU.
PEYMAN MAHOUTI received the M.Sc. and Ph.D. degrees in electronics and communication engineering from Yıldız Technical University, Turkey, in 2013 and 2016, respectively. His main research interests include analytical and numerical modeling of microwave devices, optimization techniques for microwave stages, application of artificial intelligence-based algorithms, analytical and numerical modeling of microwave and antenna structures, surrogate-based optimization, and application of artificial intelligence algorithms.
AHMET KIZILAY received the B.Sc. degree in electronics and telecommunications engineering from Yildiz University, in 1990, and the M.Sc. and Ph.D. degrees in electrical engineering from Michigan State University, East Lansing, MI, USA, in 1994 and 2000, respectively. In July 2001, he joined the Department of Electronics and Communications Engineering, Yıldız Technical University, where he is currently a Full Professor. His main research interests include time-domain electromagnetic scattering, electromagnetic wave theory, computational electromagnetics, antennas and propagation, and microwave remote sensing and imaging.