Data Mining Techniques in Artificial Neural Network for UWB Antenna Design

With data mining techniques for preprocessing the training patterns, this paper proposes an artificial neural network (ANN) model for parametric modeling of the electromagnetic behavior of ultrawide band (UWB) antennas. In this ANN method, two data mining techniques, correlation analysis and data classification based on a support vector machine (SVM), are employed to determine the geometrical variable inputs and to classify the inputs during the training and testing processes. Compared with the traditional ANN, the proposed model with data mining achieves accurate results with small training datasets. The validity and efficiency of the proposed method are confirmed with two band-notched UWB antenna examples.


Introduction
In recent studies, the artificial neural network (ANN) has been recognized as a powerful tool in electromagnetic (EM) modeling and passive component design [1][2][3][4]. An ANN can learn the relationship between geometrical variables and EM responses through a training process. Once the geometrical parameters are input, the trained ANN quickly outputs accurate EM responses. Thus, the efficient and repeatable ANN model is a good alternative to empirical models or EM simulations [5]. An advanced study, which combines the neural network with the transfer function (TF), was developed to model the EM behavior of embedded passive components [6]. In [6], the neural network is used for mapping the geometrical variables onto the coefficients of the TF. This approach is an important advance in parametric modeling of passive components because it does not rely on prior knowledge [7].
Due to their high data rate, extremely low spectral power density, high-precision ranging, low cost and low complexity, ultrawide band (UWB) communication systems have attracted great attention in the wireless world since the Federal Communication Commission (FCC) allocated the unlicensed 3.1-10.6 GHz band for UWB communication [8]. Generally, a UWB antenna has a number of geometrical parameters that affect antenna performance. The description of the relationship between one geometrical variable and the corresponding EM responses is usually not quantitative and often relies on experience. With many geometrical parameters, the computational complexity of the ANN increases and the optimized design of the antenna also becomes time-consuming. Therefore, reducing the number of parameters is an effective solution. Moreover, when the geometrical variations are large, the corresponding EM responses lead to TFs of different orders. ANN training may then become unstable and converge slowly. To solve this problem, one common way is to set the TF order to the maximum one among all geometrical samples [9]. However, a high-order TF used to fit a low-order response may result in non-unique or arbitrary solutions for vector fitting and hence in discontinuities of the poles/residues. Recently, the pole/residue tracking technique has been proposed as another way to handle the order-changing problem.
In this paper, an ANN model based on data mining techniques is proposed to address the above two problems from a new angle for parametric modeling of UWB antennas. In the proposed model, two data mining techniques, correlation analysis and data classification, are employed to determine the geometrical variable inputs and then to classify the inputs during the training and testing processes. To reduce the dimensionality of the ANN input dataset, correlation analysis is used to reveal the relationship between the geometrical parameters of an antenna and the corresponding EM responses. To make the ANN accurately learn the mapping from the geometrical variables to the TF coefficients, the original training samples are classified into different categories according to their TF orders. Meanwhile, a support vector machine (SVM) classifier, which has been proved effective in modeling microwave devices and antennas [10], is trained with the collected geometrical variables and the corresponding TF orders in the training process, and it then classifies the geometrical variables into the proper categories in the testing process. The TF coefficients corresponding to the frequency-domain EM responses are set as the outputs of the proposed ANN model. Finally, the proposed model is applied to the design of two band-notched UWB antennas.

Correlation Analysis
Designing and optimizing an antenna with many geometrical variables is time-consuming. To reduce this cost, it is important to reduce the number of geometrical variables. However, an accurate description of the relationship between each geometrical variable and the corresponding EM responses cannot rely on experience alone.
The Pearson correlation coefficient is a measure of the linear dependence between two real-valued vectors. Thus, in this paper, it is employed to study the linear dependence between two groups of variables. Let $V = [v_{m,n}]_{M \times N}$ be a matrix representing the geometrical variables, where $M$ is the number of geometrical variables in each sample and $N$ is the total number of samples. Let $F_L = [f\_l_{m,n}]_{M \times N}$ and $F_U = [f\_u_{m,n}]_{M \times N}$ be matrices representing the lower and upper limits of the notched band, respectively. According to [11,12], the Pearson correlation coefficients between $v_m$ and the corresponding lower and upper limits of the notched band can be respectively defined by

$$r_{v_m f\_l} = \frac{\sum_{n=1}^{N} (v_{m,n} - \bar{v}_m)(f\_l_{m,n} - \overline{f\_l}_m)}{\sqrt{\sum_{n=1}^{N} (v_{m,n} - \bar{v}_m)^2}\,\sqrt{\sum_{n=1}^{N} (f\_l_{m,n} - \overline{f\_l}_m)^2}},$$

$$r_{v_m f\_u} = \frac{\sum_{n=1}^{N} (v_{m,n} - \bar{v}_m)(f\_u_{m,n} - \overline{f\_u}_m)}{\sqrt{\sum_{n=1}^{N} (v_{m,n} - \bar{v}_m)^2}\,\sqrt{\sum_{n=1}^{N} (f\_u_{m,n} - \overline{f\_u}_m)^2}},$$

where

$$\bar{v}_m = \frac{1}{N}\sum_{n=1}^{N} v_{m,n}, \quad \overline{f\_l}_m = \frac{1}{N}\sum_{n=1}^{N} f\_l_{m,n}, \quad \overline{f\_u}_m = \frac{1}{N}\sum_{n=1}^{N} f\_u_{m,n}.$$

After confirming the relationship between each geometrical variable and the frequency information in all samples, the geometrical input dataset for the training process can be built. If the absolute value of a Pearson correlation coefficient is less than 0.3, the corresponding geometrical variable is fixed as a constant to decrease the total number of geometrical variables. Then the training and testing data of the ANN are constructed with the design of experiments (DOE) method [13] based on the correlated geometrical variables $V_{corre} = \{v_{corre}\}_{M' \times 1}$, where $M'$ is the dimension of $V_{corre}$ and $M' \le M$. HFSS 15.0 software performs the full-wave EM simulation and generates the training data according to $V_{corre}$. The TF is used to represent the EM responses versus frequency and it is presented as

$$H(s) = \sum_{i=1}^{Q} \frac{r_i}{s - p_i},$$

where $p_i$ and $r_i$ are the pole and residue coefficients of the TF, respectively, and $Q$ is the order of the TF [9].
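As a minimal sketch of the correlation step, the snippet below computes the Pearson coefficient between one swept geometrical variable and the two notch-band limits. The sweep values are made up for illustration and are not taken from the simulations in this paper; the function name `pearson` is likewise only an assumption of this sketch.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up sweep of one strip length (mm) against the notch limits (GHz).
lengths = [15.0, 16.0, 17.0, 18.0]
f_lower = [5.6, 5.3, 5.0, 4.8]
f_upper = [6.1, 5.8, 5.5, 5.2]

r_lower = pearson(lengths, f_lower)
r_upper = pearson(lengths, f_upper)

# A variable with |r| below 0.3 would be fixed as a constant.
is_correlated = max(abs(r_lower), abs(r_upper)) >= 0.3
```

In this toy sweep both coefficients are close to -1, so the variable would be retained as an ANN input.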
The initial training data of the neural networks are obtained by the vector fitting technique [14]. With vector fitting, we obtain the pole and residue coefficients of the TF corresponding to a given set of EM responses. However, different responses may lead to different TF orders, and it is difficult for an ANN to learn the relationship between the geometrical variables and TF coefficients of different orders. Thus a data-classification technique is introduced in the following section.
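To make the pole-residue representation concrete, the following sketch evaluates a TF of the form sum of r_i / (s - p_i) at s = j2πf. It assumes the poles and residues are already available (for example from a vector fitting run, which is not implemented here); the numerical values are hypothetical.

```python
import math

def tf_response(freq_hz, poles, residues):
    """Evaluate H(s) = sum_i r_i / (s - p_i) at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * freq_hz
    return sum(r / (s - p) for p, r in zip(poles, residues))

# Hypothetical order-2 model: one complex-conjugate pole pair (made-up values).
poles = [complex(-1.0e9, 3.5e10), complex(-1.0e9, -3.5e10)]
residues = [complex(2.0e9, 1.0e8), complex(2.0e9, -1.0e8)]

order = len(poles)                        # the TF order Q
h = tf_response(5.5e9, poles, residues)   # complex response at 5.5 GHz
```

The TF order is simply the number of pole/residue pairs, which is why responses of different complexity yield coefficient vectors of different lengths.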

Data Classification
To reduce the interference among the original samples and train the ANN with high accuracy, the training samples are classified into different categories C_k (k = 1, 2, …, K) by their TF orders, where K is the total number of categories. The samples with the same TF order are classified into one category, and the order of each category is denoted as Q_k (k = 1, 2, …, K). Each category contains the inputs V_corre and the output TF poles/residues of the same order. Since the relationship between V_corre and the TF coefficients is nonlinear and unknown, ANNs are employed to learn this nonlinear relationship through the training process. Let O = {O_1, …, O_W} be a vector representing the frequency responses of the EM simulations, where W is the number of frequency sampling points, and let O' = {O'_1, …, O'_W} be a vector representing the outputs of the pole-residue-based TF. The objective here is to minimize the error between O and O' for different V_corre by adjusting the internal weights and thresholds of the ANN. It is worth noting that each category C_k is used to train only one ANN model, ANN_k.
At the same time, the training samples are also used to train an SVM model, which determines the TF orders of V_corre for classification during the testing process. The major advantage of the SVM is its use of convex quadratic programming, which has only global minima; thus, it avoids being trapped in local minima. Owing to this property, the SVM has been applied to a wide range of classification tasks [15][16][17][18]. Let Q' = {Q'_1, …, Q'_K} be a vector representing the output of the SVM and Q = {Q_1, …, Q_K} be a vector representing the actual TF orders. The training objective is to minimize the error between Q' and Q for different V_corre by adjusting the internal weights and thresholds of the SVM. For more details on the SVM, one can refer to [19].
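The grouping of training samples by TF order can be sketched as follows. The sample tuples and their values are placeholders, and `group_by_order` is a hypothetical helper name; each resulting category would then be used to train its own ANN.

```python
from collections import defaultdict

def group_by_order(samples):
    """Split (geometry, poles, residues) samples into categories keyed by TF order."""
    categories = defaultdict(list)
    for geometry, poles, residues in samples:
        order = len(poles)  # samples with equal order share one category
        categories[order].append((geometry, poles, residues))
    return dict(categories)

# Placeholder samples with made-up coefficients: two of order 2, one of order 4.
samples = [
    ([15.0, 0.0], [-1 + 2j, -1 - 2j], [0.5 + 0.1j, 0.5 - 0.1j]),
    ([16.0, 0.5], [-1 + 3j, -1 - 3j], [0.4 + 0.2j, 0.4 - 0.2j]),
    ([17.0, 1.0], [-1 + 2j, -1 - 2j, -2 + 5j, -2 - 5j],
     [0.3j, -0.3j, 0.1 + 0j, 0.1 + 0j]),
]
categories = group_by_order(samples)
```

Because every sample in a category has the same number of poles and residues, each per-category ANN has a fixed output dimension.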

The Whole Process of the Proposed Model
The whole process of the proposed model can be described as follows. First, the relationship between each geometrical variable and the corresponding EM response is analyzed with Pearson correlation coefficients to determine V_corre. Because the Hecht-Nielsen method [20] sets the number of hidden-layer nodes to 2n + 1 when the input layer has n nodes, reducing the number of inputs decreases the dimensionality of both the input and hidden layers of the ANN, thereby reducing the computational complexity and improving ANN stability. Then the obtained geometrical variables are employed for full-wave EM simulation to construct the training samples. With vector fitting, we obtain the pole and residue coefficients of the TF corresponding to each full-wave EM simulation. According to their TF orders, the training samples are classified into the proper categories for ANN training. Meanwhile, the SVM is trained for classification during the testing process. In the testing process, V_corre is first classified into the proper category with the trained SVM and is then input to the corresponding ANN_k to obtain the TF coefficients. The whole process of the proposed model is shown in Fig. 1.
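The testing-stage flow described above can be sketched as a simple dispatch: classify the geometry, pick the matching per-order model, and evaluate the pole-residue TF. In this sketch `classify_order`, `ann_order_2` and `ann_order_4` are toy stand-ins for the trained SVM and ANN_k models, with made-up coefficients; they only illustrate the routing, not the trained models of this paper.

```python
import math

def predict_response(geometry, freq_hz, classify_order, ann_by_order):
    """Route a geometry through the order classifier, then the matching ANN."""
    order = classify_order(geometry)                 # stand-in for the trained SVM
    poles, residues = ann_by_order[order](geometry)  # stand-in for ANN_k
    s = 1j * 2.0 * math.pi * freq_hz
    return sum(r / (s - p) for p, r in zip(poles, residues))

def classify_order(geometry):
    """Toy rule replacing the SVM: the TF order depends on the first variable."""
    return 2 if geometry[0] < 16.0 else 4

def ann_order_2(geometry):
    """Toy replacement for the order-2 ANN (fixed, made-up coefficients)."""
    return [-1e9 + 3e10j, -1e9 - 3e10j], [1e9 + 0j, 1e9 + 0j]

def ann_order_4(geometry):
    """Toy replacement for the order-4 ANN (fixed, made-up coefficients)."""
    return ([-1e9 + 3e10j, -1e9 - 3e10j, -2e9 + 5e10j, -2e9 - 5e10j],
            [1e9 + 0j, 1e9 + 0j, 5e8 + 0j, 5e8 + 0j])

ann_by_order = {2: ann_order_2, 4: ann_order_4}
response = predict_response([15.0, 0.0], 5.0e9, classify_order, ann_by_order)
other = predict_response([17.0, 0.0], 5.0e9, classify_order, ann_by_order)
```

Keeping one model per order means no ANN ever has to output a padded or truncated coefficient vector.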

Parametric Modeling of a Single Band-notched UWB Antenna
A single band-notched UWB antenna in Fig. 2 is considered as the first example [21]. The proposed model is applied to two different cases, i.e., Case 1 with a narrow parameter range and Case 2 with a wide one. In both cases, the DOE method with seven levels defines a total of 49 training samples, while the DOE method with five levels defines a total of 25 testing samples, as shown in Tab. The total time for training-data generation from EM simulations is 1.63 hours, and the total time for testing-data generation is 0.83 hours.
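The sample counts above can be illustrated with a small grid generator. This is only a sketch under the assumption of a full-factorial design over two variables, which happens to give 7² = 49 and 5² = 25 points; the DOE scheme of [13] may differ, and a full-factorial grid would not reproduce a 49-sample design over six variables. The ranges are hypothetical.

```python
from itertools import product

def level_grid(ranges, levels):
    """Full-factorial grid with `levels` evenly spaced values per variable."""
    axes = []
    for low, high in ranges:
        step = (high - low) / (levels - 1)
        axes.append([low + i * step for i in range(levels)])
    return [list(point) for point in product(*axes)]

# Hypothetical ranges in mm for two geometrical variables.
ranges = [(15.0, 18.0), (0.0, 3.0)]
training = level_grid(ranges, 7)   # seven levels per variable
testing = level_grid(ranges, 5)    # five levels per variable
```

Note that with identical ranges the two grids share their corner and center points, so a test set intended to be fully unseen would need offset levels.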
The geometrical variables and the corresponding TF orders from the training samples are set as the input and output of the SVM for training, respectively. Based on the K-fold cross-validation method, the penalty parameter c and the kernel parameter g are set to 2 and 1, respectively. Meanwhile, to select a proper kernel function, four kernel functions are examined with LIBSVM, a toolbox proposed by Lin Chih-Jen [22], as shown in Tab. 2, and the radial basis function achieves the highest accuracy. To reduce the sensitivity of the SVM to the input data, all the input data are normalized. For the 25 testing samples, the classification results are shown in Fig. 4.
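The normalization step is not specified further in the text. As one plausible reading, the sketch below applies min-max scaling to [0, 1] per input variable, with the scaling limits taken from the training set only; both the scheme and the helper names are assumptions of this sketch.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of a list of input vectors."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Made-up training inputs (mm).
train = [[15.0, 0.0], [16.5, 1.5], [18.0, 3.0]]
lows, highs = fit_min_max(train)
scaled = apply_min_max(train, lows, highs)
```

The same `lows` and `highs` would be reused for the testing inputs so that both sets share one scale.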
Meanwhile, the training samples are divided into the proper categories according to their TF orders for ANN training. We use the Hecht-Nielsen method [20] to determine the number of hidden-layer nodes: when the input layer has n nodes, the hidden layer has 2n + 1 nodes.
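The 2n + 1 rule is easy to check numerically. The sketch below applies it to the two input sizes that arise from the correlation analysis in this paper (two and six geometrical variables); whether frequency is counted as an additional input node is not stated, so it is left out here as an assumption.

```python
def hidden_nodes(n_inputs):
    """Hidden-layer size from the 2n + 1 rule."""
    return 2 * n_inputs + 1

single_notch_hidden = hidden_nodes(2)  # two geometrical inputs
dual_notch_hidden = hidden_nodes(6)    # six geometrical inputs
```

Each geometrical variable removed by the correlation analysis therefore removes one input node and two hidden nodes.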
After the modeling process, the average training errors are 0.531% for Case 1 and 0.582% for Case 2, while the average testing errors are 0.684% and 0.714%.
To the trained model with the data mining techniques. It also means that less time for sample collection from EM simulations is required to train the proposed model, as illustrated in Tab. 5.
Table 6 shows the range and the constant value of each variable in the correlation analysis. The notched bands are in the frequency range of VSWR ≥ 3. The Pearson correlation coefficients between each geometrical variable and the lower and upper limits of the two notched bands (f_l1, f_u1, f_l2 and f_u2) are listed in Tab. 7.
It is obvious that the geometrical variables W_A1 and W_A2 can be fixed as geometrical constants of 2.5 mm and 2.5 mm because of their small Pearson correlation coefficients. Thus the number of geometrical variables is reduced to six. Frequency is an additional input parameter with an original range of 2-12 GHz. The model has one output, i.e., O' = VSWR.
The proposed model is also evaluated with 49 training samples and 25 testing samples (as shown in Tabs. 8 and 9) in two different cases. Similarly, the comparison model is also evaluated with 49 training samples and 25 testing samples, with W_A1 and W_A2 added as geometrical variables. The total time for training-data generation from EM simulations is 2.45 hours, and the total time for testing-data generation is 1.25 hours.
The trained SVM is also used in the testing process for data classification. Similarly, the number of hidden-layer nodes is determined with the Hecht-Nielsen method. To further evaluate the effectiveness of the proposed model, 81 samples defined with the nine-level DOE and 49 samples defined with the seven-level DOE are respectively used for training and testing in the proposed model, as shown in Tab. 9. The comparison model is also evaluated with the added geometrical variables. The total time for training-data generation from EM simulations is 4.05 hours, and the total time for testing-data generation is 2.45 hours.
As illustrated in Tab. 10, when the number of training samples is 49, the accuracy of the proposed model is good but that of the comparison model is unsatisfactory. When the number of training samples is increased to 81, the accuracy of the comparison model improves markedly. Thus, fewer training samples are required to train the proposed model than the comparison one. Once training is completed, the trained model, which is a substitute for the time-consuming EM simulation, can be applied to design optimization. As an example of using the trained model for antenna design, two separate antennas are optimized to reach two different design specifications, shown in Fig. 8. The objective of Specification 1 is that the dual band-notches cover 5.05-5.4 GHz and 5.65-5.85 GHz for the rejection of interference with existing wireless local area networks (WLANs) such as IEEE 802.11a in the USA (5.15-5.35 GHz and 5.725-5.825 GHz) [23]. For Specification 2, the dual band-notches are required to cover 5.1-5.5 GHz and 5.7-5.85 GHz for the rejection of interference with the existing WLANs.
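One way to turn such a specification into a quantity an optimizer can minimize is to count the in-band frequency points that fail the notch criterion VSWR ≥ 3. The sketch below does this for the Specification 1 bands; the function `notch_shortfall` and the VSWR samples are hypothetical, and the actual objective used in the optimization (for instance any requirement outside the notches) is not stated in the text.

```python
def notch_shortfall(freqs_ghz, vswr, bands, threshold=3.0):
    """Count in-band frequency points whose VSWR is below the notch threshold."""
    shortfall = 0
    for f, v in zip(freqs_ghz, vswr):
        in_band = any(low <= f <= high for low, high in bands)
        if in_band and v < threshold:
            shortfall += 1
    return shortfall

# Notch bands of Specification 1 in GHz.
SPEC_1 = [(5.05, 5.4), (5.65, 5.85)]

# Made-up VSWR samples standing in for the output of the trained model.
freqs = [5.0, 5.1, 5.3, 5.5, 5.7, 5.8, 6.0]
vswr = [1.8, 4.2, 5.0, 1.9, 2.6, 4.1, 1.5]
misses = notch_shortfall(freqs, vswr, SPEC_1)
```

Here only the point at 5.7 GHz lies inside a required notch band with VSWR below 3, so the design would not yet meet the specification.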
With the flower pollination algorithm (FPA) [24], the design optimization of the dual band-notched UWB antenna is performed by calling the proposed model repeatedly. The initial parameters are chosen as V_corre = [15 15 0 0 0 0]^T. The optimization takes only about 45 seconds to reach the optimal solution for each antenna. The optimized geometrical values for the two separate antennas are [15.
Radiation patterns of H- and E-planes: (a) @3.1 GHz for Antenna 1, (b) @3.1 GHz for Antenna 2, (c) @5.6 GHz for Antenna 1, (d) @5.6 GHz for Antenna 2, (e) @9 GHz for Antenna 1, (f) @9 GHz for Antenna 2, (g) @11 GHz for Antenna 1, and (h) @11 GHz for Antenna 2.

Conclusion
In this paper, a new ANN model based on data mining techniques is proposed for parametric modeling of the EM behavior of UWB antennas. In this method, correlation analysis and data classification are employed to decrease the number of geometrical variables and to classify the inputs during the training and testing processes. Compared with the comparison model, the proposed model with data mining achieves accurate results with small training datasets. Two parametric modeling examples, a single band-notched UWB antenna and a dual band-notched UWB antenna, are employed to confirm the validity of the proposed model. The proposed model is computationally efficient, especially for EM design optimization.
The parameters are as follows: L = 36 mm, L_1 = 16 mm, L_2 = 22.7 mm, L_3 = 1.3 mm, L_4 = 1.5 mm, L_5 = 0.5 mm, L_6 = 11.5 mm, W = 24 mm, W_1 = 6.5 mm, W_2 = 10.25 mm, and W_3 = 3.5 mm. The parameters of strip length, width and position (L_A, W_A, L_B and W_B) are varied. To study the correlation between one geometrical variable and the corresponding lower and upper limits of the notched band, the other variables are set as constants. The notched band is in the frequency range of VSWR ≥ 3. The Pearson correlation coefficients of L_A, W_A, L_B and W_B for the corresponding lower limits of the notched band are r_LAf_l = -0.9468, r_WAf_l = -0.1947, r_LBf_l = 0.9616 and r_WBf_l = -0.0252. The Pearson correlation coefficients of L_A, W_A, L_B and W_B for the corresponding upper limits are r_LAf_u = -0.9376, r_WAf_u = -0.1754, r_LBf_u = 0.9242 and r_WBf_u = -0.0364. It is obvious that the geometrical variables W_A and W_B can be fixed as geometrical constants of 4 mm and 0 mm because of their small Pearson correlation coefficients. Thus the number of geometrical variables is reduced to two, i.e., V_corre = [L_A, L_B]^T. Frequency is an additional input parameter with an original range of 2-12 GHz. The model has one output, i.e., O' = VSWR.
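The selection above can be reproduced directly from the reported coefficients. The sketch below applies the 0.3 threshold, assuming a variable is retained when either its lower-limit or its upper-limit coefficient reaches the threshold in absolute value; that either/or reading is an assumption, since the text does not say how the two coefficients are combined.

```python
# Reported coefficients: name -> (r for lower limit, r for upper limit).
REPORTED = {
    "L_A": (-0.9468, -0.9376),
    "W_A": (-0.1947, -0.1754),
    "L_B": (0.9616, 0.9242),
    "W_B": (-0.0252, -0.0364),
}
THRESHOLD = 0.3

# Keep a variable if either coefficient reaches the threshold (assumed rule).
kept = [name for name, (r_l, r_u) in REPORTED.items()
        if max(abs(r_l), abs(r_u)) >= THRESHOLD]
fixed = [name for name in REPORTED if name not in kept]
```

With these values the retained variables are L_A and L_B, which agrees with the V_corre stated in the text, while W_A and W_B become constants.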

Fig. 3. Range of each variable and the VSWR results: (a) L_A (W_A = 4 mm, L_B = 0 mm and W_B = 0 mm), (b) W_A (L_A = 17 mm, L_B = 0 mm and W_B = 0 mm), (c) L_B (L_A = 17 mm, W_A = 4 mm and W_B = 0 mm), and (d) W_B (L_A = 17 mm, W_A = 4 mm and L_B = 0 mm).

Tab. 7. Pearson correlation coefficients of each variable.

Figure 6 shows the outputs of two different tests of the dual band-notched UWB antenna with the proposed model and HFSS simulation. The geometrical variables for the two tests, which lie in the range of the training data, are V_corre1 = [15.25 16.6 0.5 1.1 0.45 0.3]^T and V_corre2 = [16.2 15.6 0.3 0.9 -0.45 0.3]^T. It is observed that the proposed model based on data mining achieves good accuracy for geometrical variables that were never used in the training process. Meanwhile, two other tests, selected outside the range of the training data, are chosen to evaluate the proposed model. The geometrical variables for these two tests are V'_corre1 = [17.6 17.6 1.85 1.85 0.95 0.85]^T and V'_corre2 = [17.6 12.8 1.9 1.9 0.95 0.85]^T. From Fig. 7, it is observed that the model achieves good accuracy even though these samples are outside the range of the training data.

Fig. 6. Comparison of VSWRs: (a) Sample 1 and (b) Sample 2, where the samples are in the range of training data.

Fig. 7. Comparison of VSWRs (a) Sample 1 and (b) Sample 2, where the samples are out of the range of training data.
9617 16.0412 0.0057 2.4697 2.2123 0.4768]^T and [17.0315 15.8679 2.1758 1.8964 0.1367 1.1635]^T. For the two antennas, the radiation patterns of the H-plane and E-plane at 3.1 GHz, 5.6 GHz, 9 GHz and 11 GHz are shown in Fig. 9. The radiation efficiency, gain in the broadside direction, and group delay for the two antennas are shown in Figs. 10 and 11. Compared with direct EM optimization, in which the EM simulations are repeatedly called by the FPA, the design using the proposed model saves considerable time, as shown in Tab. 11.

Fig. 11. Group delay for Antenna 1 and Antenna 2.

| CPU time of model development | Direct EM optimization | Proposed model |
|---|---|---|
| Antenna 1 | 10 hours | 45 s |
| Antenna 2 | 11 hours | 45 s |
| Total | 21 hours | 3.7 hours (training) + 90 s |

Tab. 11. Running time of direct EM optimization and the proposed model.
In this section, two application examples of band-notched UWB antennas are used to evaluate the proposed model. The inputs are the geometrical variables and the operating frequency of the antennas, and the output of the overall model is the voltage standing wave ratio (VSWR). HFSS 15.0 software performs the full-wave EM simulation and generates the training and testing data for modeling. All calculations in this paper are performed on an Intel i7-4870 2.50 GHz machine with 16 GB RAM.
Fig. 1. The whole process of the proposed model.
To evaluate the proposed model, an ANN model that does not include data mining techniques is employed as a comparison model for this single band-notched UWB antenna. Similarly, the seven-level training data (49 samples) and five-level testing data (25 samples) defined by the DOE method are used for the comparison model. The training and testing data are summarized in Tab. 3. The number of output nodes is defined according to the maximum TF order. After the modeling process, the average training errors are 8.461% for Case 1 and 9.703% for Case 2, while the average testing errors are 10.461% and 11.134%.
Tab. 5. Running time of the two models for the single band-notched UWB antenna.

Fig. 5. Structure of the dual band-notched UWB antenna.