Constrained tandem neural network assisted inverse design of metasurfaces for microwave absorption

Designing microwave absorbers with customized spectra is an attractive topic in both the scientific and engineering communities. However, due to the large number of design parameters involved, the design process is typically time-consuming and computationally expensive. To address this challenge, machine learning has emerged as a powerful tool for optimizing design parameters. In this work, we present an analytical model for an absorber composed of a multilayered metasurface and propose a novel inverse design method based on a constrained tandem neural network. The network can provide structural and material parameters optimized for a given absorption spectrum, without requiring expert knowledge. Furthermore, additional physical attributes, such as absorber thickness, can be optimized when soft constraints are applied. As an illustrative example, we use the neural network to design broadband microwave absorbers with a thickness close to the causality limit imposed by the Kramers-Kronig relations. Our approach provides new insights into the reverse engineering of physical devices.

Subsequent advancements, such as the Jaumann absorber [9,10] and circuit analog absorbers, have resulted in thinner and lighter microwave absorbers with a broader working bandwidth within the targeted frequency range. However, small thickness and large working bandwidth are conflicting requirements because of the limitation imposed by the Kramers-Kronig relations [11-13]. Recently, microwave absorbers based on metasurfaces have been proposed [4,14-21]. These metamaterial absorbers have sub-wavelength geometric patterns that can induce multiple electric and magnetic resonances, providing numerous design parameters that can be tuned to satisfy diverse requirements. However, the relation between the design parameters and the electromagnetic response is often quite complicated, and tuning the design parameters to achieve the target response is effectively an optimization process in a high-dimensional parameter space, which is tedious and time-consuming. Indeed, it would be highly desirable if the design problem could be solved as an inverse scattering problem, which is itself very challenging for electromagnetic waves.
In recent years, machine learning has found applications in many areas. With the ability to learn from data, a machine learning system can find hidden characteristics of the data and establish relationships between data sets. It has been used successfully in many fields, such as computer vision [22-26], natural language processing [27-31], and time series forecasting [32-36]. In physics, machine learning has also been used extensively [37-42], including for solving inverse problems [43-50]. Recent studies have demonstrated the powerful capabilities of machine learning in the design of optical and nanophotonic devices [51-59]. However, achieving a design with specific optimized physical attributes is challenging because of the extensive experimentation and iteration required to identify the optimal combination of physical parameters. This often involves complex artificial neural networks (ANNs) and iterative processes. Therefore, it is worthwhile to further explore and demonstrate how machine learning can assist in designing a microwave absorber that achieves a target absorption spectrum. Moreover, it would be even more desirable if the designed absorber could simultaneously have a thickness close to the causality limit, which represents the thinnest absorber achievable with passive media.
In this work, we demonstrate the potential of ANNs in realizing the inverse design of a broadband microwave absorber based on metasurfaces. Using training data generated from an analytical model of a microwave absorber composed of a multilayered metasurface, we establish and train a forward network that maps the design parameters of the absorber to its frequency response (in our case, the reflection spectrum). To overcome the significant challenge of one-to-many mapping in inverse design [43], we build a tandem network by connecting the forward network to an inverse network. In addition, we include a constraint on the thickness of the design as a penalty term in the cost function to prevent the design from being excessively bulky. After the training process, our constrained tandem neural network (CTNN) provides a design that not only satisfies the target absorption spectrum but also has a thickness very close to the causality limit. Our investigation indicates that machine learning can effectively solve inverse design problems with specific requirements, and our approach can be readily extended to the design of acoustic and elastic wave absorbers.

II. GENERATING TRAINING DATA
We first introduce the fundamental structure of the microwave absorbers that we aim to design. Metamaterials and metasurfaces, composed of periodic arrays of subwavelength resonant elements, can manipulate the electric and magnetic response. By carefully tuning the resonant elements, good absorption can be achieved [15]. Nevertheless, such a single-layer absorber typically has a narrow absorption band. To achieve broadband absorption, many multilayered metasurfaces with multiple resonant modes have been proposed [60,61]. However, the increased thickness of such structures may render them too bulky for various applications. According to the Kramers-Kronig relations, for normally incident illumination there is a theoretical limit on the total thickness $d_{tot}$ of a metal-backed non-magnetic absorber with relative permeability $\mu_r = 1$ at the static limit [11,62],

$$d_{tot} \geq d_{min} = \frac{1}{2\pi^{2}}\left|\int_{0}^{\infty}\ln|\Gamma(\lambda)|\,\mathrm{d}\lambda\right|, \qquad (1)$$

where $\Gamma(\lambda)$ is the reflection coefficient as a function of the wavelength $\lambda$. This expression shows that a high-performing absorber, capable of effectively absorbing a wide range of wavelengths, requires a considerable minimum thickness. Therefore, designing a broadband absorber with a small thickness is challenging.
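As a rough numerical illustration of this thickness limit, the sketch below integrates $\ln|\Gamma(\lambda)|$ over wavelength with the trapezoidal rule for a reflection spectrum given in dB. It assumes the standard form of the bound for $\mu_r = 1$, $d_{min} = |\int \ln|\Gamma|\,\mathrm{d}\lambda| / (2\pi^2)$, and treats the reflection as 0 dB outside the sampled band. The function names and the idealized box-shaped spectrum are illustrative assumptions and do not reproduce the trapezoid-like target used later in this work.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def min_thickness(freqs_hz, refl_db):
    """Causality bound |integral of ln|Gamma| d(lambda)| / (2*pi^2), in metres.

    freqs_hz: sampled frequencies in Hz.
    refl_db:  reflection 20*log10|Gamma| in dB at those frequencies.
    Outside the sampled band the reflection is assumed to be 0 dB.
    """
    # Convert each sample to (wavelength, ln|Gamma|) and sort by wavelength.
    pts = sorted((C0 / f, r * math.log(10.0) / 20.0)
                 for f, r in zip(freqs_hz, refl_db))
    integral = 0.0
    for (l0, y0), (l1, y1) in zip(pts, pts[1:]):
        integral += 0.5 * (y0 + y1) * (l1 - l0)  # trapezoidal rule
    return abs(integral) / (2.0 * math.pi ** 2)


def excess_percent(d, d_min):
    """How much thicker a design is than the bound, in percent."""
    return 100.0 * (d - d_min) / d_min


# Idealized example (assumption): a flat -20 dB reflection from 4.5 to 31.5 GHz.
freqs = [4.5e9 + k * (27e9 / 2000) for k in range(2001)]
d_box = min_thickness(freqs, [-20.0] * len(freqs))
print(f"box-spectrum bound: {d_box * 1e3:.2f} mm")

# Thickness values quoted in Section V: design 12.552 mm, bound 12.339 mm.
print(f"excess over bound: {excess_percent(12.552, 12.339):.1f} %")
```

The box spectrum gives a smaller bound than the 12.339 mm quoted in Section V, which is expected because the sloped edges of a trapezoid-like target, especially on the long-wavelength side, add to the integral.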
Our objective is to design a structure that possesses excellent absorption properties across a wide frequency range while maintaining a thickness that approaches the minimum limit dictated by causality. We consider a stacked multilayered structure, as depicted in Fig. 1. Such a structure is chosen because it can be fabricated in a layer-by-layer manner and its electromagnetic response can be simulated using simple models. The proposed broadband absorber comprises four layers of metasurfaces (resistive patches placed on dielectric substrates, shown in yellow), which are separated by spacer layers (grey). In the $i$th layer ($i = 1, 2, 3, 4$), the metasurface is a periodic array of resistive square patches (brown) with sheet resistance $Rs_i$, periodicity $a_i$, and square length $w_i$. This multilayered structure is placed on a flat metallic ground plane (red), which works as a perfect electric conductor (PEC), resulting in zero transmission. Therefore, the absorption of the multilayered structure is $1 - |\Gamma|^2$, and perfect absorption is achieved when $|\Gamma| = 0$. This generally requires the absorber to be impedance-matched to free space.
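To connect the relation $1 - |\Gamma|^2$ with the decibel values used later, the short sketch below converts a reflection level in dB into the absorbed power fraction for a PEC-backed absorber with zero transmission. The function name is a hypothetical helper introduced only for illustration.

```python
def absorption_from_reflection_db(refl_db):
    """Absorbed power fraction 1 - |Gamma|^2 for a reflection level 20*log10|Gamma| in dB."""
    gamma_mag = 10.0 ** (refl_db / 20.0)
    return 1.0 - gamma_mag ** 2


for level in (-10.0, -20.0, -30.0):
    print(f"{level:6.1f} dB -> absorption {absorption_from_reflection_db(level):.4f}")
```

A 20 dB reflection reduction therefore corresponds to $|\Gamma| = 0.1$ and about 99% absorption.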
Although simulation software packages such as COMSOL can be used to obtain the absorption spectrum of arbitrary metamaterial structures numerically, full-wave simulation of 3D problems is time-consuming, particularly when generating the absorption spectra of many absorbers with varying parameters. Therefore, for this design, we use the simple structure shown in Fig. 1, which employs a square patch array and enables us to solve for the absorption spectrum almost analytically. By employing the capacitive circuit absorber approach [63], we construct an equivalent circuit model for this multilayered structure, as shown in Fig. 1(c). This enables us to efficiently generate a large amount of training data for machine learning.
The capacitive circuit absorber approach uses low-pass RC circuits instead of the conventional resonant RLC circuits to model resistive patches with a large square length ($w/a > 0.7$) [63]. For each layer, the resistive square patch array can be modeled as a series RC circuit with resistance $Re_i$ and capacitance $C_i$, which are complicated functions of the patch parameters ($Rs_i$, $a_i$, $w_i$). For normal incidence, the values of the lumped elements can be evaluated analytically [64] in terms of the free-space admittance $Y_0$, the speed of light $c$, and the averaged relative permittivity $\epsilon_{ri} = (\epsilon_{di} + \epsilon_{ci})/2$ of the spacers and substrates in which the patch is embedded [65]. These expressions involve an auxiliary function of the gap between patches, $g = a - w$, with $A = [1 - (a/\lambda)^2]^{-1/2} - 1$ and $\alpha = \sin(\pi g/2a)$. The admittance $Y_i$ of the $i$th series RC circuit then follows from circuit theory [67].
The dielectric substrates and spacers can be represented by transmission line sections with lengths $t_i$ and $d_i$, respectively. According to transmission line theory [67], the input admittances of the successive layers can be calculated iteratively. The spacer of the first layer is backed by the PEC and can be regarded as a short-circuited transmission line. Therefore, for the transmission line circuit in Fig. 1(c), the reflection coefficient is

$$\Gamma = \frac{Y_0 - Y_{in}}{Y_0 + Y_{in}}, \qquad (9)$$

where $Y_{in}$ represents the input admittance of the absorber and is equal to the input admittance of the fourth layer, $Y_{in} = Y_4$. Using Eq. (9), the reflection spectrum $R(\lambda) = 20\log_{10}|\Gamma(\lambda)|$ of this 4-layer microwave absorber can be obtained analytically. As shown in Fig. 3(b), the analytical results are in good agreement with the results obtained from the full-wave simulation.
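The iterative admittance calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact model of this work: it uses the $e^{j\omega t}$ time convention, lossless dielectrics with real relative permittivity, and textbook transmission line formulas. The lumped values $R$ and $C$ are taken as inputs rather than computed from the patch geometry, and the ordering of line sections and shunt elements is supplied by the caller. All function names are hypothetical.

```python
import math

C0 = 299_792_458.0     # speed of light in vacuum, m/s
Z0 = 376.730313668     # free-space impedance, ohm
Y0 = 1.0 / Z0          # free-space admittance, S


def series_rc_admittance(r_ohm, c_farad, freq_hz):
    """Admittance of a series RC branch (e^{j*omega*t} convention)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (r_ohm + 1.0 / (1j * omega * c_farad))


def reflection(freq_hz, first_spacer, elements):
    """Reflection coefficient of a PEC-backed stack at normal incidence.

    first_spacer: (eps_r, thickness_m) of the section touching the ground plane.
    elements:     items above it, bottom to top, each ("line", eps_r, thickness_m)
                  or ("shunt", admittance_in_siemens).
    """
    k0 = 2.0 * math.pi * freq_hz / C0
    eps_r, d = first_spacer
    yc = Y0 * math.sqrt(eps_r)
    # Short-circuited line: Y = -j * Yc * cot(beta * d).
    y = yc / (1j * math.tan(k0 * math.sqrt(eps_r) * d))
    for kind, *args in elements:
        if kind == "shunt":
            y = y + args[0]  # parallel branches add their admittances
        else:
            eps_r, d = args
            yc = Y0 * math.sqrt(eps_r)
            t = math.tan(k0 * math.sqrt(eps_r) * d)
            # Admittance transformation along a lossless line section.
            y = yc * (y + 1j * yc * t) / (yc + 1j * y * t)
    return (Y0 - y) / (Y0 + y)


# Sanity check with a Salisbury screen: a sheet matched to free space placed
# a quarter wavelength above the ground plane should not reflect.
f = 10e9
gamma = reflection(f, (1.0, C0 / (4.0 * f)), [("shunt", Y0)])
print(abs(gamma))
```

A multilayer absorber is described by alternating `("shunt", series_rc_admittance(R, C, f))` and `("line", eps_r, thickness)` entries, and sweeping `f` gives the reflection spectrum $20\log_{10}|\Gamma|$.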

III. CONSTRAINED TANDEM NEURAL NETWORK
In this part, we introduce how to use the constrained tandem neural network to realize the inverse design of metasurface microwave absorbers that operate within a specified frequency range. To represent the design parameters of the microwave absorber, we use an array $D = \{D_1, D_2, ..., D_N\}$, and we refer to the array $R = \{R_1, R_2, ..., R_M\}$ as the response vector, which represents the reflection at discretely sampled frequency points within the specified range. In this way, our objective can be reinterpreted as creating the inverse mapping from the response vector $R$ to the design parameter vector $D$, which is commonly referred to as an inverse problem.
It is natural to use an ANN to form this mapping directly, in view of the ability of such a network to fit highly nonlinear functions [68]. However, this approach proves ineffective because of the one-to-many relationship within the dataset: there exist multiple design parameter vectors $D_{real}$ that correspond to a given response vector $R$. Therefore, the loss function of the network, Eq. (10), which compares the predicted design $D_{pred}$ with $D_{real}$, will encounter difficulties, as $D_{pred}$ bounces around in parameter space when attempting to converge toward the multiple targets represented by $D_{real}$. Such a struggle often results in significant difficulties, or even outright failure, in achieving network convergence.
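One common way to see the problem is with a toy forward model whose inverse is not unique. The sketch below is an illustrative assumption rather than part of the model used in this work: it takes $R = D^2$, so that $D = +1$ and $D = -1$ give the same response, and scans for the prediction that minimizes the mean squared error against both valid designs.

```python
# Toy one-to-many problem (assumption): R = D**2, so D = -1 and D = +1 both give R = 1.
valid_designs = [-1.0, 1.0]
target_response = 1.0


def direct_loss(d_pred):
    """Mean squared error of one prediction against every valid design."""
    return sum((d_pred - d) ** 2 for d in valid_designs) / len(valid_designs)


# Scan candidate predictions between -2 and 2 in steps of 0.01.
candidates = [i / 100.0 for i in range(-200, 201)]
best = min(candidates, key=direct_loss)

print("loss-minimizing prediction:", best)
print("its response:", best ** 2, "target:", target_response)
```

The minimizer of the design-space loss lies between the two valid designs and does not reproduce the target response, which is why a loss defined on $D$ is poorly suited to one-to-many data.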
To overcome this issue, the tandem neural network (TNN) method was proposed [47]. The TNN consists of two parts, the forward network (the orange part) and the inverse network (the blue part), as shown in Fig. 2(a). As stated above, the forward network takes $D$ (the right part of the crimson rectangle) as the input and $R$ (the right green rectangle) as the output, while the inverse network takes $R$ (the left green rectangle) as the input and $D$ (the left part of the crimson rectangle) as the output. The two parts of the tandem network are connected end-to-end, meaning that the output of the inverse network is the input of the forward network. Such an R-D-R structure is similar to an encoder-decoder ANN, and the intermediate output $D$ can be regarded as the latent vector, which encodes the information in the response vector $R$. Broadly speaking, the latent vector can be assigned different physical meanings in different settings. In some cases, it simply stands for "compressed information" without explicit physical meaning.
To train the TNN, one should first train the forward network so that it forms the mapping from the design vector $D$ to the response $R$. After this training, the forward network is frozen, which means that its weights and biases are kept unchanged in the subsequent steps. The TNN is then trained to minimize the difference between the input of the inverse network and the output of the forward network. In this process, only the weights and biases of the inverse network are changed.
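The two-stage procedure can be sketched with a deliberately small toy problem. Everything below is an illustrative assumption: linear models stand in for the forward and inverse networks, the "analytical model" is $R = d_1 + d_2$, and plain gradient descent replaces the optimizer. The point is only the order of operations: fit the forward model, freeze it, then update the inverse model alone through the response-space loss.

```python
import random

random.seed(0)


def analytical_model(d):
    """Toy stand-in for the circuit model: the response is d1 + d2."""
    return d[0] + d[1]


# Stage 1: fit the forward model R_hat = a1*d1 + a2*d2 on (D, R) pairs.
designs = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(200)]
a = [0.0, 0.0]
for _ in range(500):
    grad = [0.0, 0.0]
    for d in designs:
        err = a[0] * d[0] + a[1] * d[1] - analytical_model(d)
        grad[0] += 2 * err * d[0] / len(designs)
        grad[1] += 2 * err * d[1] / len(designs)
    a = [a[0] - 0.5 * grad[0], a[1] - 0.5 * grad[1]]

# Stage 2: keep `a` frozen and train only the inverse model D = (w1*R, w2*R)
# so that forward(inverse(R)) reproduces R.
targets = [random.uniform(0.5, 1.5) for _ in range(200)]
w = [0.9, -0.3]
for _ in range(500):
    grad = [0.0, 0.0]
    for r in targets:
        d = (w[0] * r, w[1] * r)                 # intermediate design
        err = a[0] * d[0] + a[1] * d[1] - r      # R_pred - R_real
        # Chain rule through the frozen forward model; `a` is never updated here.
        grad[0] += 2 * err * a[0] * r / len(targets)
        grad[1] += 2 * err * a[1] * r / len(targets)
    w = [w[0] - 0.2 * grad[0], w[1] - 0.2 * grad[1]]

r_target = 1.0
design = (w[0] * r_target, w[1] * r_target)
print("design:", design, "response:", analytical_model(design))
```

In this toy, the inverse model reproduces the target response, but which of the many valid designs it settles on depends on the initial weights; with the initialization above one component even ends up negative.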
As such, the loss function is defined by the difference between $R_{pred}$ and $R_{real}$. Here, $R_{real}$ denotes the ground truth corresponding to the design, while $R_{pred}$ denotes the predicted response given by the network. Since the output $R$ is unique, no conflict arises from competing objectives. When the training of the TNN is completed, the inverse mapping from $R$ to $D$ has been established. However, this method also has some drawbacks. While the output $R$ of the TNN is unique, there are multiple possible choices for the intermediate output $D$. As the network is unbiased towards different $D$, the design parameters provided by the TNN have a significant degree of randomness. Therefore, it may provide design parameters that are undesirable, such as those with an excessively large thickness.
To impose additional control over the intermediate output design parameters, we propose the constrained tandem neural network (CTNN). In contrast to the ordinary TNN, we introduce a soft constraint term into the loss function, Eq. (12): the penalty $\alpha (d_{tot} - d_{min})^2$ is added to the response loss of the TNN. Here $(d_{tot} - d_{min})^2$ represents the difference between the total thickness $d_{tot}$ of the absorber and the minimum thickness $d_{min}$, and $\alpha$ is a constant coefficient that controls the weight of this second loss term. We have the flexibility to include multiple constraint terms as desired, but our primary focus is on a term that prioritizes the selection of thinner slabs, as thickness is a critical factor in absorber design. The constraint incurs an additional cost that penalizes thicker designs, thereby favoring thinner meta-structures that yield the same response. As Fig. 2(b) illustrates, although there are two different design parameter vectors $D_A$ and $D_B$ that yield essentially the same response, the extra penalty term differentiates them because they have different total thicknesses. This penalty leads the CTNN to converge towards $D_B$, because the total thickness of the design $D_B$ is closer to the causality limit than that of $D_A$. In this way, the arbitrariness of the design parameter vector $D$ is removed.
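The effect of the soft constraint can be sketched with a small loss function. This is a minimal sketch under assumptions: the response term is taken to be a mean squared error, thicknesses are expressed in millimetres, and the two designs $D_A$ and $D_B$ with their thickness values are hypothetical numbers chosen only to mirror the situation in Fig. 2(b). The default $\alpha = 0.01$ is the value used in Section IV.

```python
def ctnn_loss(r_pred, r_real, d_tot, d_min, alpha=0.01):
    """Response loss plus soft thickness penalty alpha * (d_tot - d_min)**2.

    The mean squared error used for the response term is an assumption.
    """
    response_term = sum((p - q) ** 2 for p, q in zip(r_pred, r_real)) / len(r_real)
    constraint_term = alpha * (d_tot - d_min) ** 2
    return response_term + constraint_term


# Two hypothetical designs that reproduce the target response equally well.
target = [-20.0, -20.0, -20.0]      # reflection in dB at three sample frequencies
loss_a = ctnn_loss(target, target, d_tot=14.0, d_min=12.3)   # thicker design D_A
loss_b = ctnn_loss(target, target, d_tot=12.6, d_min=12.3)   # thinner design D_B

print(f"loss for D_A: {loss_a:.4f}, loss for D_B: {loss_b:.4f}")
```

Because the response terms are identical, the penalty alone separates the two candidates, and gradient descent on this loss is pushed towards the thinner design. Additional penalty terms, for example on the sheet resistance, could be added to the sum in the same way.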
Additionally, since the soft constraint term can be chosen as desired, it enables control over other design parameters, making it possible to meet various design requirements. For example, by introducing a constraint related to the sheet resistance $Rs$, we can realize a design with smaller resistance values, ensuring that the material parameters are constrained to a prespecified range. We also note that this method can be generalized to inverse design for other systems, since the physical meaning of the latent vector and of the constraint can be defined according to the problem at hand.

IV. NETWORK TRAINING
The training steps for the CTNN are the same as those for the TNN. The only change is the customized loss function that incorporates the constraint. Therefore, the training time is almost the same as for the TNN. After the training, the network can be applied end-to-end and no further training is needed.
Considering the fabrication feasibility, we simplify the design by requiring the parameters of the dielectric substrates of different layers to take the same values, $\epsilon_{ci} = \epsilon_C$ and $t_i = T_C$, and by fixing the dielectric constant of the spacers as $\epsilon_{di} = \epsilon_d = 1.2$, which can be realized by using porous PVC substrates. The lattice constants are $a_i$ = 6.8 mm, 6.8 mm, 3.4 mm, and 1.7 mm for $i = 1, 2, 3, 4$, respectively. These structural parameters have been used previously to design absorbers [11].
The forward network D-R has 4 hidden layers with 80, 320, 320, and 320 neurons, respectively, so that the layer width does not change abruptly, and the leaky ReLU activation function (k = 0.01) is adopted. The input and output layers contain 14 and 320 units, respectively, to match the data. We take $\alpha = 0.01$ in the constraint term. The inverse network R-D also contains 4 hidden layers, with 320, 320, 320, and 80 neurons, respectively, and the Adam optimizer is adopted with a learning rate of $10^{-4}$. We randomly generate the training data within a pre-specified range of parameters (given in the 2nd column of Table I). We use $10^5$ supervised D-R data pairs to train the forward network. As shown in Fig. 3(a), after 100 epochs of training of the forward network, the loss decreases to an acceptable value. The design parameters and the corresponding response of one of the test samples are shown in Table I and Fig. 3(b). Then, we use $10^5$ R data to train the CTNN. The data are split into training, validation, and test sets in the ratio 8:1:1. Figure 4(a) demonstrates that after 50 epochs of training, the loss function decreases to an acceptable value. In the following section, we provide examples of using this trained CTNN to design microwave absorbers.
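The bookkeeping implied by this setup can be checked with a few lines of arithmetic. The sketch below assumes fully connected layers with biases and assumes that the inverse network ends in a 14-unit output layer mirroring the forward input; neither detail is stated explicitly in the text, and the helper names are hypothetical.

```python
FORWARD_SIZES = [14, 80, 320, 320, 320, 320]   # input, 4 hidden layers, output
INVERSE_SIZES = [320, 320, 320, 320, 80, 14]   # 14-unit output is an assumption


def leaky_relu(x, k=0.01):
    """Leaky ReLU with negative slope k."""
    return x if x >= 0.0 else k * x


def count_parameters(sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def split_counts(n_samples, ratios=(8, 1, 1)):
    """Number of samples in each subset for an integer ratio split."""
    total = sum(ratios)
    return tuple(n_samples * r // total for r in ratios)


print("forward parameters:", count_parameters(FORWARD_SIZES))
print("inverse parameters:", count_parameters(INVERSE_SIZES))
print("train/validation/test:", split_counts(10 ** 5))
```

Under these assumptions each network has roughly 3.4e5 trainable parameters, and the 8:1:1 split of $10^5$ samples gives 80000 training, 10000 validation, and 10000 test samples.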

V. INVERSE DESIGN OF MICROWAVE ABSORBERS
Based on the trained neural network, we can now design an EM wave absorber according to a given absorption spectrum. Here we show a specific example in Fig. 4(b), where we want to design a broadband absorber capable of achieving 20 dB absorption across the frequency range of 4.5 to 31.5 GHz. Mathematically, the input response has a trapezoid-like shape (the green dashed line). Notice that we provide the desired spectrum using reflection coefficients. We input the desired spectrum into the trained tandem network. The intermediate layer output gives the design parameters, as shown in the 4th column of Table I. In Fig. 4(b), the desired absorption spectrum specified above is shown as the green dashed line. The red line is the absorption spectrum of the designed structure calculated with the analytical method, while the black open dots are the absorption calculated for the same structure using full-wave computation (using the COMSOL package). These results show that the absorber we designed has absorption characteristics that match the required absorption spectrum well. In particular, the absorber has a 20 dB reflection reduction covering the frequency interval 4.5 to 31.5 GHz. We also calculated the minimum absorber thickness corresponding to our desired spectrum as required by causality, which is found to be 12.339 mm. The thickness of our design is 12.552 mm, only 1.7% thicker than the causality limit. For comparison, the ordinary TNN gives a design that is 5.3% thicker than the causality limit (shown in the 5th column of Table I). More designs featuring various response spectra are included in the supplementary materials to illustrate the versatility of our network. We should also note that the design parameters are experimentally realizable. The required value of $\epsilon_C$ can be realized by various plastics such as PVC and PET, and ITO films can serve as the resistive square patches [69].

VI. CONCLUSION
In conclusion, we demonstrated a CTNN-assisted approach to design a custom microwave absorber that meets a prescribed frequency response requirement. This approach uses the tandem network to solve one-to-many problems effectively and efficiently. Furthermore, we added a soft constraint to the loss function to enable the network to achieve biased convergence towards a specific target, namely minimal thickness in this case. We construct the relation between the absorber design vector $D$ and its absorption response vector $R$ by training the forward network and the tandem network in turn. After the training, our network can conduct the inverse design and give the corresponding design parameters for customized input absorption responses. We use the network to design a broadband absorber whose thickness is very close to the causality limit. We note that data from numerical simulations or experiments can also replace the training data for the network if such information is available. The approach can be easily applied to the inverse design of optical, acoustic, and other devices.

Constrained tandem neural network assisted inverse design of metasurfaces for microwave absorption: supplemental document
Fig. 1 shows more designs given by the CTNN. The desired spectra are denoted by the green dashed lines, while the absorption responses obtained by the improved CTNN, the traditional TNN, and the full-wave numerical simulation are represented by the red lines, blue lines, and open circles, respectively. The parameters of the designs are shown in Table I. The inset graphs show the percentage difference between the thickness of the design and the thickness given by the causality limit, showing that the CTNN always gives a thinner absorber. The blue (red) bar represents the design given by the TNN (CTNN). It is evident that the CTNN is more capable than the traditional TNN of providing designs that approach the causality limit. Besides, these results show that our model is applicable to a wide range of desired spectra.

FIG. 1. Microwave absorber and its equivalent circuit model. (a) Schematic of a 4-layer microwave absorber composed of resistive square patch arrays (brown), dielectric substrates (yellow), and spacers (gray). A top-view picture of the patch array is shown at the lower right. $Rs$, $a$, and $w$ are the sheet resistance of the patch, the lattice constant, and the length of the square, respectively. (b) The side view of the absorber. $d_{1,2,3,4}$ and $t_{1,2,3,4}$ are the thicknesses of the spacers and substrates of the different layers. (c) The equivalent circuit model of the absorber. The $i$th resistive square patch array can be modeled as a series RC circuit with resistance $R_i$ and capacitance $C_i$. The dielectric substrates and spacers can be represented by transmission line sections with lengths $t_i$ and $d_i$, respectively.

FIG. 2. Constrained tandem neural network (CTNN). (a) The CTNN is a tandem network composed of the forward network (orange) D-R and the inverse network (blue) R-D. The forward network takes the latent vector $D$ as input and the corresponding response $R$ as output, while the inverse network takes $R$ as input and $D$ as output. In our case, the components of the array $D$ are exactly the design parameters, and $R$ represents the reflectivity at discrete frequencies. A soft constraint imposed on the loss function can differentiate the designs $D_A$ and $D_B$ corresponding to the same response. (b) With thicker designs penalized, the latent vector finally converges to an extremum where the design thickness is closer to the causality limit (as the red arrow shows).

TABLE I. The design parameters of the absorber. The random sampling region of design parameters in the training sets is given in the 2nd column. The sample from the test set shown in Fig. 3(b) is given in the 3rd column, and two specific absorber designs, given respectively by the CTNN and the TNN for our desired absorption response shown in Fig. 4(b), are given in the 4th and 5th columns.

FIG. 3. Forward training of data. (a) Loss descending during the training. The orange line denotes the training loss while the blue line denotes the validation loss. After training for 100 epochs, the forward network achieved convergence. (b) The absorption responses obtained by the analytical model, the forward network D-R, and the full-wave numerical simulation are represented by the red line, green dashed line, and open circles, respectively.

FIG. 4. Training of CTNN and the design of the broadband absorber. (a) Loss descending during the training. The orange line denotes the training loss while the blue line denotes the validation loss. (b) Using the CTNN to design the absorber with the desired absorption response denoted by the dashed green line. The absorption responses obtained by the improved CTNN, the traditional TNN, and the full-wave numerical simulation are represented by the red line, blue line, and open circles, respectively. Compared to the traditional TNN (with a thickness 5.3% thicker than the causality limit), the improved CTNN (with a thickness 1.7% thicker than the causality limit) exhibits better performance.

FIG. 1: More designs given by the CTNN. (a) and (b) are "W"-type absorbers. (c) and (d) are absorbers designed for a specific frequency band.