Classification of Microwave Planar Filters by Deep Learning



Introduction
Functions and abilities of the brain have fascinated engineers for decades. The first attempts to model the brain with electronic systems appeared during World War II. In 1943, Warren McCulloch and Walter Pitts developed a simple neural network with electrical circuits. In 1949, Donald Hebb reinforced the concept and pointed out that neural pathways are strengthened each time they are used.
Thanks to advances in computers, Nathaniel Rochester from the IBM research laboratories led the first effort to simulate a neural network in the 1950s. In 1956, the Dartmouth Summer Research Project on Artificial Intelligence provided a boost to both artificial intelligence and neural networks. In the years following the Dartmouth Project, John von Neumann developed a simple neuron using telegraph relays and vacuum tubes. After that, Frank Rosenblatt began to work on the perceptron. The perceptron computed a weighted sum of the inputs, subtracted a threshold, and passed one of two possible values out as the result.
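The computation performed by the perceptron can be sketched in a few lines of Python (the weights and the threshold below are hand-picked for illustration only):

```python
def perceptron(inputs, weights, threshold):
    """Rosenblatt's perceptron: weighted sum of the inputs minus a
    threshold, passed through a hard-limiting step function."""
    s = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return 1 if s >= 0 else -1  # one of two possible output values

# Example: a perceptron realizing a logical AND of two binary inputs
print(perceptron([1, 1], [0.5, 0.5], 0.8))   # both inputs active -> 1
print(perceptron([1, 0], [0.5, 0.5], 0.8))   # one input active -> -1
```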
In 1959, Bernard Widrow and Marcian Hoff presented models called ADALINE (adaptive linear elements, see Fig. 1) and MADALINE (multiple adaptive linear elements). MADALINE, an adaptive filter eliminating echoes on phone lines, was the first neural network to be applied to a real-world problem.
Because of the earlier successes, the potential of neural networks was overestimated, particularly given the limitations of the available electronics. The unfulfilled claims led to a halt in funding.
In 1982, John Hopfield showed how simple models of the brain can be used to create useful devices. At the same time, Japan announced an effort to further develop neural networks, and the US resumed funding the research once again. The modern era of artificial neural networks had started. Most applications of neural networks have used the feed-forward structure (see Fig. 2). Neurons in the input layer distribute input signals to neurons in the first hidden layer. Hidden neurons multiply input signals by synaptic weights w^(l)(i, j). Here, l denotes the number of the hidden layer, i is the number of the neuron in the input layer and j indicates the number of the neuron in the hidden layer [1]. The products of input signals and synaptic weights are summed and the threshold θ^(l)(j) is subtracted. Indexes l and j refer to the layer and the neuron as previously. The output of the summer is limited by a non-linear activation function (a Gaussian function, a unipolar sigmoid or a bipolar one in most cases). That way, the output signal of the neuron is obtained [1].
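The operation of a single hidden neuron described above can be sketched as follows (a minimal Python illustration; the bipolar sigmoid is realized by tanh, and all numeric values are arbitrary):

```python
import math

def neuron_output(inputs, weights, theta):
    """Output of one hidden neuron: input signals multiplied by
    synaptic weights, summed, the threshold theta subtracted, and the
    result limited by a bipolar sigmoid (tanh) activation function."""
    s = sum(w * x for w, x in zip(weights, inputs)) - theta
    return math.tanh(s)

y = neuron_output([0.2, -0.5, 0.1], [1.0, 0.4, -0.3], 0.05)
print(y)  # a value limited to the open interval (-1, 1)
```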
Feed-forward neural networks are trained to map vectors of input patterns [p1, p2, ..., pN]^T to vectors of output targets [t1, t2, ..., tM]^T, where N is the number of input neurons and M denotes the number of output neurons [1].
Let us assume that a feed-forward network is used to model a patch antenna (see Fig. 3). In order to map dimensions of the antenna (the width W, the length L, the width of the microstrip feeder w) and parameters of the substrate (the dielectric constant εr, the height h) to the input impedance (Rin + jXin) at the frequency f, the input patterns [W, L, w, εr, h, f] have to correspond to 6 input neurons and the output targets [Rin, Xin] have to be related to 2 output neurons. The number of hidden layers and neurons in those layers should be sufficiently high to provide capacity for absorbing the stored information [2].
Information is stored in the synaptic weights of the network during training: input patterns are introduced to input neurons, and synaptic weights are changed to obtain the corresponding output targets at output neurons. Hence, knowledge is distributed over the whole network [2].
When an unknown input pattern is introduced to the input of a trained network, a proper output target is obtained at the output neurons. Therefore, the feed-forward network is usually used as a black-box model approximating results of measurements or CPU-time expensive numerical analyses [2].
During the latest development of artificial neural networks, several researchers pointed out that artificial neural networks have sometimes been used to understand brain functions, but their primary design has not been intended to provide realistic models of biological function. Therefore, the much more general term deep learning is used nowadays [3]. The term deep learning emphasizes the fact that neural architectures consist of a relatively high number of hidden layers. Alternatively, various neural networks are composed into a cascade identifying edges, classifying objects and sorting outputs, for example [3].
When searching for the keywords microwave-filter-deeplearning in the IEEE Xplore database, about 20 papers published in recent years can be obtained. The contributions can be divided into the following categories:
• A conventional mapping is performed by conventional feed-forward networks [4][5][6].
• A conventional mapping is performed by several feedforward networks arranged into a cascade creating the deep structure [7][8][9].
• An inverse mapping is performed by different types of neural networks arranged into a cascade creating the deep structure [10], [11].
Moreover, IEEE published in 2021 two special issues devoted to machine learning in microwaves [12] and to artificial intelligence in electromagnetics [13]. From the viewpoint of terminology, deep learning is a subset of machine learning, and machine learning is a subset of artificial intelligence [3]. From the viewpoint of contents:
• Feed-forward networks were exploited for fast parametrized electromagnetic modeling of microwave filters [12]. Using transfer functions as prior knowledge for model development, a CPU-efficient tool was obtained for the high-level electromagnetic design with repetitive geometrical variations. Hence, conventional mapping was boosted in a clever way using conventional neural networks.
• An overview of artificial intelligence techniques applied to forward modeling, remote sensing, adaptation of reconfigurable antenna arrays, biomedical imaging and inverse design was provided [13]. Dealing with the inverse design, the application of deep learning techniques covers forward and inverse mapping of electromagnetic structures (meta-structures, reflect-arrays, nano-structures) using feed-forward networks, support vector machines and generative adversarial networks.
Obviously, the exploitation of deep learning structures consisting of different neural networks is quite rare in the field of electromagnetics (and microwave filters especially). Moreover, those structures are dominantly used for inverse mapping.
In this paper, a deep structure consisting of different neural networks is applied to the identification of planar filters from images of their layout:
• The first neural network identifies edges in the layout.
• Depending on edges, the second network identifies inductive and capacitive segments of the layout.
• Considering topology created by inductive and capacitive segments, the third neural network classifies the filtering structure.
• Identifying the number of repetitions of the fundamental segment of the filter, the fourth network estimates the order of the filter.
According to our knowledge, the described approach has not been published in the open literature yet.

Planar Filters
In order to verify the functionality of the deep structure classifying planar filters, the following filters, obtained by design synthesis [14], were included in the training:
• Stepped impedance low-pass filter;
• Low-pass filter with shunts;
• Band-pass filter consisting of short-ended quarter-wavelength shunts.
The training set can be completed by other filter types like elliptic low-pass structures, filters consisting of coupled resonators, etc. [14]. Although the deep structure then becomes larger and the training process takes a longer time, the fundamental principles stay unchanged.
In order to prepare training patterns, a MATLAB script based on closed-form descriptions of planar segments was created. Particular segments were described by ABCD matrices, and the whole filtering structure was cascaded by their multiplication. The accuracy of the generated training models is very limited because mutual couplings among segments are neglected, fringing fields are not taken into account and parasitic phenomena are not considered.
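The cascading step can be illustrated by a short numpy sketch (this is not the authors' MATLAB script; the characteristic impedances and electrical lengths below are arbitrary placeholder values):

```python
import numpy as np

def line_abcd(z0, beta_l):
    """ABCD matrix of a lossless transmission-line section with
    characteristic impedance z0 and electrical length beta_l (rad)."""
    return np.array([[np.cos(beta_l), 1j * z0 * np.sin(beta_l)],
                     [1j * np.sin(beta_l) / z0, np.cos(beta_l)]])

def cascade(sections):
    """Cascade segments by multiplying their ABCD matrices in order."""
    m = np.eye(2, dtype=complex)
    for abcd in sections:
        m = m @ abcd
    return m

def s21(abcd, z0=50.0):
    """Transmission coefficient of the cascade from its overall ABCD matrix."""
    a, b, c, d = abcd.ravel()
    return 2.0 / (a + b / z0 + c * z0 + d)

# Stepped-impedance low-pass sketch: alternating low/high impedance sections
filt = cascade([line_abcd(z, 0.4) for z in (20, 120, 20, 120, 20)])
print(abs(s21(filt)))
```

A quick sanity check of the model: cascading two identical line sections must equal one section of double electrical length.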
In order to verify the classification abilities of the deep structure, photographs of implemented filters were used. Examples of testing structures are shown in Fig. 4.

Proposed Architecture
CNNs are a type of feed-forward neural network with a modified architecture. The architecture of CNNs usually consists of convolutional layers followed by pooling layers, where each neuron in a convolutional layer is connected to some region in the input. This region is usually called a local receptive field. All weights (filters) in CNNs are shared based on the position within a receptive field. The convolution operation can be described as follows [15]:

y(x, y) = Σu Σv f(u, v) g(x − u, y − v)

where f(x, y) is the input image at position (x, y) and g(x − u, y − v) is a trainable filter. The pooling layers in a CNN reduce the dimensionality of features, which leads to a reduction of connections between layers and, hence, reduces the computational time [16].
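A direct (and deliberately naive) implementation of these two operations, assuming single-channel inputs and a "valid" output size, might look like:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: each output pixel is the sum of the local
    receptive field multiplied by the (shared) trainable filter."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    k = kernel[::-1, ::-1]  # flip so the sum matches f(u,v) g(x-u, y-v)
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * k)
    return out

def max_pool(feature_map, size=2):
    """Max pooling: reduces the dimensionality of the feature map."""
    h = (feature_map.shape[0] // size) * size
    w = (feature_map.shape[1] // size) * size
    f = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return f.max(axis=(1, 3))

img = np.zeros((6, 6)); img[:, 3:] = 1.0   # image with a vertical edge
edge_filter = np.array([[1.0, -1.0]])      # simple horizontal-gradient filter
print(max_pool(np.abs(conv2d(img, edge_filter))))
```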
Due to the type of input data and the required filter classification, a CNN was selected. The lightweight CNN architecture described in Tab. 1 represents a good choice compared to state-of-the-art architectures that are purely designed to achieve great results on competitive datasets (MNIST, CIFAR, etc.) containing hundreds of classes; such models have high computational demands [17]. All the layers and their parameters were taken from [17].
The proposed architecture requires an input image of 224×224 px. To comply with this condition, each input image was adjusted to the correct resolution. For both training scenarios, the Adam optimization algorithm with its default settings was chosen, and cross entropy was used as the loss function.
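The loss and the optimizer can be sketched in plain numpy (a toy linear classifier stands in for the CNN of Tab. 1; all sizes and values are illustrative only, not the actual training code):

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Cross-entropy loss of one sample from raw class scores."""
    z = logits - logits.max()                 # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label], np.exp(log_probs)

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with the usual default hyperparameters."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy example: a linear classifier scoring 3 filter classes
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3)) * 0.1
x, label = rng.normal(size=4), 2
loss, probs = softmax_cross_entropy(x @ w, label)
grad = np.outer(x, probs - np.eye(3)[label])  # gradient of CE w.r.t. w
w, m, v = adam_step(w, grad, np.zeros_like(w), np.zeros_like(w), t=1)
new_loss, _ = softmax_cross_entropy(x @ w, label)
print(loss, new_loss)
```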

Dataset
The dataset was generated by a MATLAB program for the purpose of training the neural networks. The aim was to generate a set of images that represent planar structures and that are physically correct. The generation was based on predefined values of permittivities εr = [2.10, 2.55, 2.55, 2.59, 3.00, 3.38, 3.78, 4.43, 4.80, 6.22] and frequencies fc = [433, 888, 1200, 1600, 2400, 2800, 5200, 8200, 28000, 32000] MHz. The dataset was generated using grayscale pixels. The background is represented by black color, the planar structure is represented by white color, and the vias (electrical connections between copper layers in a printed circuit board) are represented by gray color.
For the purpose of structure and order recognition of the planar filters, we have generated 2 datasets:
• Structure dataset - the dataset consists of three classes split into folders: bandpass, lowpass shunt and lowpass stepped impedance. Each class consists of 1000 images. Hence, the total number of images is 3000.
The presented datasets can be downloaded from [18]. Examples of generated structures can be found in Fig. 6.
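The pixel encoding described above can be reproduced with a hypothetical rasterizer such as the following numpy sketch (the gray level used for vias and all geometry are assumptions for illustration, not values taken from the actual MATLAB generator):

```python
import numpy as np

BACKGROUND, VIA, STRUCTURE = 0, 128, 255  # via gray level is an assumed value

def render_filter(width=224, height=224, n_segments=5):
    """Hypothetical rasterizer mimicking the dataset encoding: black
    background, white planar structure, gray vias."""
    img = np.full((height, width), BACKGROUND, dtype=np.uint8)
    seg_w = width // n_segments
    for i in range(n_segments):
        x0 = i * seg_w
        # alternate narrow (high-impedance) and wide (low-impedance) strips
        h = 20 if i % 2 == 0 else 60
        y0 = height // 2 - h // 2
        img[y0:y0 + h, x0:x0 + seg_w] = STRUCTURE
    img[height // 2 - 2:height // 2 + 2, 2:6] = VIA  # grounding via pad
    return img

img = render_filter()
print(img.shape, sorted(np.unique(img)))
```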

Experiments and Discussion
Data-augmentation techniques (rotation, grayscale conversion, normalization, and resizing) were applied to the datasets (described in Sec. 4) in order to improve the generalization of the trained model and to increase the amount of input data. With augmentation applied, the number of images in the datasets was tripled. After CNN training, the trained models were validated on a separate evaluation dataset that had not been used for training before.
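The augmentation pipeline might be sketched as follows (a numpy-only illustration; restricting rotation to 90-degree steps and resizing by nearest-neighbor sampling are assumptions, not the authors' exact transforms):

```python
import numpy as np

def augment(image, rng):
    """Simple augmentation sketch: random 90-degree rotation,
    normalization to [0, 1], and nearest-neighbor resize to 224x224."""
    image = np.rot90(image, k=rng.integers(0, 4))
    image = image.astype(np.float32) / 255.0           # normalization
    rows = np.linspace(0, image.shape[0] - 1, 224).round().astype(int)
    cols = np.linspace(0, image.shape[1] - 1, 224).round().astype(int)
    return image[np.ix_(rows, cols)]                   # resize

rng = np.random.default_rng(1)
out = augment(np.zeros((300, 180), dtype=np.uint8), rng)
print(out.shape, out.min(), out.max())
```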

Filter Classification
Results from training and evaluation using the CNN architecture from Tab. 1 for the classification of microwave planar filters are shown in Fig. 7 and 8. High accuracy (99.8% on evaluation data) and low loss were achieved during 20 epochs of training. Figure 8 shows the confusion matrix, where each row represents the instances in an actual class (band pass filter, low pass shunt filter and low pass stepped filter), while each column represents the instances in a predicted class. Given 100 evaluation samples per class (300 images in total), the proposed CNN architecture missed 2 images, which were predicted as band pass filters instead of low pass shunt filters.
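Such a confusion matrix and the corresponding accuracy can be computed as in the following sketch (the labels below are made up for illustration and do not reproduce the reported results):

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

# Toy evaluation mirroring the 3-class task (0 = band pass,
# 1 = low pass shunt, 2 = low pass stepped); labels are made up
actual    = [0, 0, 1, 1, 1, 2, 2]
predicted = [0, 0, 1, 0, 1, 2, 2]   # one shunt filter mistaken for band pass
cm = confusion_matrix(actual, predicted, 3)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```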

Filter Order Estimation
Results from training and evaluation using the CNN architecture from Tab. 1 for the classification of filter orders are shown in Fig. 9 and 10. High accuracy (94.87% on evaluation data) and low loss were achieved during 20 epochs of training. Figure 10 shows the confusion matrix, where each row represents the instances in an actual class (the order of the filter), while each column represents the instances in a predicted class. Given 26 evaluation samples per class (156 images in total), the proposed CNN architecture missed 6 images of the 3rd order, one image of the 5th order and one image of the 13th order.

Real Data Evaluation
In order to verify the constructed neural networks, we used photographs of planar filters.Examples of the photographs can be seen in Fig. 4.
For the presented mechanism to work, it is crucial to present the image in grayscale, where the white color represents the conductive layer, the black color the surroundings, and the gray color the vias. To function properly, it is necessary to provide high-quality patterns, with sharp edges in particular.
The achieved accuracy on the tested images (40 images) is 98% for the task of filter classification. For the task of order estimation, the accuracy is 91%.

Conclusion
This paper demonstrates the benefits of deploying CNNs for the classification of planar filters and the estimation of their order. The proposed architecture has been trained to recognize three categories of planar filters and their order (from the 3rd to the 13th order). Due to the insufficiency of real data, the training datasets were generated by a simplified MATLAB model in which mutual coupling among segments was neglected, fringing fields were not taken into account and parasitic phenomena were not considered. However, the neural network architecture could be extended to recognize additional filter types or adapted to other attributes of filters with minimum modifications. The CNNs themselves could be further improved by bitwise compression of individual variables to achieve a faster network response, but this was not considered here [19]. Finally, the presented network was tested using real photographs of planar filters. The accuracy for the filter classification task is 98% and for the order estimation task 91%. Although the algorithm proved to be powerful, the resulting accuracy highly depends on the quality of the input images.

Fig. 1 .
Fig. 1. Adaptive linear element (ADALINE). Synaptic weights and a summer forming the neuron correspond to a finite impulse response adaptive filter.

Fig. 3 .
Fig. 3. Patch antenna fed by a microstrip transmission line. Dimensions of the layout and parameters of the substrate form the input pattern. The input impedance of the antenna (resistance and reactance) corresponds to the output target.
In Sec. 2, planar filters used for the training of the deep structure are briefly presented. Section 3 describes the particular neural networks creating the deep structure and discusses the software implementation, training patterns and the training itself. In Sec. 4, the functionality of the trained deep structure is verified. Section 5 concludes the paper.

Fig. 6 .
Fig. 6. Planar filters included into the training set: (a) bandpass filter consisting of short-ended quarter-wavelength shunts - thirteenth order, (b) low-pass filter with shunts - ninth order, (c) stepped impedance low-pass filter - fifth order.

Fig. 7. Fig. 8.
Fig. 7. Accuracy function of CNN during training on basic classification of filters.

Fig. 9 .
Fig. 9. Accuracy function of CNN during training on classification of orders of filters.

Fig. 10 .
Fig. 10. Confusion matrix representing CNN accuracy on evaluation data. The color bar represents quantity.

Fig. 11 .
Fig. 11. Upper: Photograph of the printed circuit board of the low-pass stepped impedance filter. Bottom: Extracted image of the photograph used as an input for NNs.

Fig. 12 .
Fig. 12. Upper: Photograph of the printed circuit board of the low-pass filter with shunts. Bottom: Extracted image of the photograph used as an input for NNs.

Tomas GOTTHANS received the bachelor's and Ing. degrees in Electrical Engineering from the Brno University of Technology in 2008 and 2010, respectively, and the Ph.D. degree from the Universite de Marne-La-Vallee, France, in January 2014. In 2011, he joined the ESIEE Paris, ESYCOM Laboratory, where he worked on the project AMBRUN (in collaboration with Thales Communications and Security, TeamCast, Supelec). He is currently an Associate Professor with the Department of Radio Electronics, Brno University of Technology. His research interests include digital predistortion of power amplifiers, wireless communications theory, and chaos theory.

Zbynek RAIDA (born 1967 in Opava) graduated from Brno University of Technology (BUT), Faculty of Electrical Engineering and Communication (FEEC). Since 1993, he has been with the Department of Radio Electronics, FEEC BUT. In 1996 and 1997, he occupied the position of independent researcher at Laboratoire de Hyperfrequences, Universite Catholique de Louvain, Belgium, working on variational methods for the numerical analysis of electromagnetic structures. He and his team have been researching methods of numerical modeling and optimization of electromagnetic structures, ways of applying artificial neural networks to solving electromagnetic compatibility issues, and advanced approaches to the design of special antennas.
From 2002 to 2006, he occupied the position of the Vice Dean for Research. From 2006 to 2013, he headed the Department of Radio Electronics. Since 2010, he has managed the SIX Research Center. He is a Senior Member of the IEEE. He is active in the Antennas and Propagation Society, Microwave Theory and Techniques Society, Computational Intelligence Society, and Signal Processing Society.

About the Authors . . .

Jiri VESELY was born in Liberec, Czech Republic on July 3, 1972. He received the Ph.D. degree for his work on low-flying target location using surface seismic waves in 2001, and became an Associate Professor with a habilitation treatise on modern ELINT system principles at the University of Defense in Brno, Czech Republic in 2012. His main fields of study are modern signal-source location principles and algorithms, radar signal analysis and classification for ELINT and EW systems, and radar tracking and data fusion in complicated environments. He began his teaching career in 1996 at the Radar Department of the Military Academy in Brno, and since 2019 he has led the Department of Communication, Radar and Electronic Warfare Technology, Faculty of Military Technology, University of Defense.

Jana OLIVOVA received her master's degree in Communication Engineering in 2007, and from 2007 to 2010 she worked on several projects in the field of numerical modelling and optimization of high-frequency structures. In 2011, she finished her Ph.D. studies on multiobjective optimization in EMC at Brno University of Technology and received the Ph.D. degree. Since 2011, she has been working at the University of Defense, Faculty of Military Technology, Department of Communication Technologies, Electronic Warfare and Radiolocation. She is interested in the use of optimization methods in radiolocation and in new methods of manufacturing antenna structures.

Jakub GOTTHANS received his master's degree in System Engineering and Informatics from the Brno University of Technology in 2014. In 2019, he started his Ph.D. studies at the Brno University of Technology. Since 2014, he has worked on multiple SATCOM projects (in collaboration with Honeywell Aerospace). His research interests include SATCOM systems and machine learning.