Experiment-based deep learning approach for power allocation with a programmable metasurface

Deep learning, as a highly efficient method for metasurface inverse design, commonly uses simulation data to train deep neural networks (DNNs) that can map desired functionalities to proper metasurface designs. However, the assumptions and simplifications made in the simulation model may not reflect the actual behavior of a complex system, leading to suboptimal performance of the DNNs in practical scenarios. To address this issue, we propose an experiment-based deep learning approach for metasurface inverse design and demonstrate its effectiveness for power allocation in complex environments with obstacles. Enabled by the tunability of a programmable metasurface, large sets of experimental data in various configurations can be collected for DNN training. The DNN trained on experimental data inherently incorporates complex factors and can adapt to changed environments through its on-site data-collecting and fast-retraining capability. The proposed experiment-based DNN holds potential for intelligent and energy-efficient wireless communication in complex indoor environments.

With growing interest in exploring new phenomena and more complex applications with metasurfaces, an efficient and accurate metasurface inverse design method has become crucial.
Conventional inverse design procedures are usually guided by optimization algorithms [29][30][31][32], which involve time-consuming iterative searches carried out case by case. However, iterative methods cannot be implemented for real-time applications that require fast switching between different functionalities. Recently, deep learning approaches have emerged as more efficient methods for metasurface inverse design [33][34][35][36][37][38]. Once the deep neural network (DNN) is well trained on vast amounts of data, the network can immediately find proper metasurface designs for different targets without going through iterations again. In particular, DNN-assisted metasurface inverse design has been applied in beamforming and power allocation [39][40][41][42][43] to control the power delivered toward target users at different locations in wireless communication systems. However, most of these works rely on training the DNN with simulation data, which may oversimplify the modeling parameters in complex scenarios, thus limiting their use in realistic situations [44,45].
Additionally, changes in environments can significantly compromise the performance of a trained DNN or render it ineffective.
In this work, we propose an experiment-based deep learning approach for metasurface inverse design to achieve power allocation in complex environments with obstacles. Without the need for sophisticated modeling and time-consuming simulations, we train the DNN directly with experimental data, which can be measured in various configurations of a programmable metasurface. Experimentally collected data inherently incorporate complex factors in realistic situations that are difficult to include in a simulation model. Our results demonstrate that the experiment-based DNN is effective in controlling the power transmitted toward multiple receivers and can adapt to changed environments through its on-site data-collecting and fast-retraining capability. The proposed experiment-based deep learning scheme offers a promising direction for leveraging real-world data to achieve accurate and efficient metasurface inverse designs for boosting or damping Wi-Fi and 5G signals in complex indoor environments.

DNN for power allocation with programmable metasurface
We aim to control the power transmitted to specific receivers in a complex environment, generally with an obstacle, using a programmable metasurface together with an experiment-based deep learning approach, as shown in Fig. 1. A reflective programmable metasurface featuring tunable reflection phase profiles in the microwave regime is illuminated with a monochromatic excitation signal from a feed horn. The metasurface comprises 20 columns of unit cells, and the reflection phase {φ_i} (i = 1, 2, …, 20) for each column can be independently controlled. After the reflected wave is scattered by an obstacle (a metal frame in this case), the scattered field intensities at three specific locations are measured by three open-end waveguide probes. Our deep neural network (DNN) consists of a forward scattering engine (FSE) and an inverse-design engine (IDE), as shown in Fig. 1. Next, the IDE is constructed with the reverse topology of 3-50-50-20 fully connected layers. The input target intensities {I_i} are inversely transformed to the desired reflection phase profile {φ_i}.
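The 3-50-50-20 layer topology quoted above can be sketched in a few lines of plain Python. This is only an illustration of the layer widths: the tanh activation, the random initialization, the helper names (`init_mlp`, `forward`, `count_parameters`), and the example target are assumptions that do not come from the text, and the network here is untrained.

```python
import math
import random

# Layer widths from the text: 3 target intensities -> 50 -> 50 -> 20 column phases.
IDE_LAYERS = [3, 50, 50, 20]


def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Propagate x through the layers (tanh on hidden layers is an assumption)."""
    for layer_index, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if layer_index < len(params) - 1:
            x = [math.tanh(v) for v in x]
    return x


def count_parameters(params):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


ide = init_mlp(IDE_LAYERS)
# Hypothetical "001"-style target: strong signal at one probe, weak at the others.
phases = forward(ide, [0.0, 0.0, 0.6])
print(len(phases), count_parameters(ide))
```

With these widths the network maps 3 inputs to 20 outputs and holds 3770 trainable parameters, which is small enough for fast on-site retraining.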
During the training of the IDE, the MSE between {I_i} and {I_i′} (the output of the IDE combined with the pre-trained FSE) is used as the loss function, and no experimental data is needed in this stage. Finally, for any target set {I_i}, the output of the IDE, {φ_i}, can be used as the input of the real metasurface to test whether the experimentally measured intensities are similar to {I_i}. We note that, due to the inverse-design nature of the problem, there may be multiple phase profiles {φ_i} that achieve the same set of target intensities {I_i}. Integrating the IDE and the pre-trained FSE into a single DNN (an autoencoder setting for results instead of design parameters) can help mitigate this nonuniqueness issue [46]. We also note that there is an additional pre-trained quantization network (approximated using a smooth function) when the IDE is connected to the FSE. The quantization network maps continuous input values to 8 possible discrete values of the reflection phase for realistic implementation of the programmable metasurface with an FPGA. More details of the DNN architecture can be found in Sec. 1 in the Supplementary Materials.
We first need to investigate the possible range of the measured intensities from the metasurface. In the scenario without the obstacle placed in front of the metasurface, we randomly generate 10000 sets of phase profiles on the metasurface and use the 3 fixed probes to experimentally measure the corresponding intensities. The intensities are plotted as three-dimensional points in Fig. 2(d). The details of experimental data collection can be found in Fig. S5 in the Supplementary Materials. We note that the intensities plotted in the figure are normalized by the maximum intensity received from the 3 probes in the given 10000 sets of measurements. The color of the points denotes the sum of the intensities from the 3 probes. The contour surfaces appear approximately as planes, and more data points are located near the coordinate origin. Normalized intensities below 0.6 account for 97.9%, 94.1%, and 98.8% of the total data at probes 1, 2, and 3, respectively. In the following, these data are used to train the DNN, and any target normalized intensity values are assumed to range from 0 to 0.6.
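The normalization and the per-probe coverage figures described above can be computed as in the following sketch. The data are a synthetic stand-in (exponentially distributed random numbers), not the experimental measurements, so the printed fractions will not match the 97.9%, 94.1%, and 98.8% reported in the text. The function names are hypothetical.

```python
import random

THRESHOLD = 0.6  # upper bound of the target range used in the text


def normalize(measurements):
    """Divide every intensity by the largest value seen at any probe."""
    i_max = max(max(row) for row in measurements)
    return [[value / i_max for value in row] for row in measurements]


def fraction_below(normalized, probe, threshold=THRESHOLD):
    """Share of measurement sets whose intensity at one probe is below threshold."""
    hits = sum(1 for row in normalized if row[probe] < threshold)
    return hits / len(normalized)


# Synthetic stand-in for the 10000 measured sets (assumption, not real data).
rng = random.Random(0)
raw = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(10000)]

normalized = normalize(raw)
coverage = [fraction_below(normalized, probe) for probe in range(3)]
print(coverage)
```

Restricting targets to the range that most measurements fall into keeps the inverse design from being asked for intensities the metasurface rarely or never produces.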

DNN training and testing without obstacle
To test the performance of the trained DNN, we first demonstrate three special cases called "001", "101", and "000". The "001" case denotes that the metasurface steers the scattered fields toward one particular probe, which receives a strong signal, while the other two probes receive weak signals. Similarly, in the "101" case, two probes receive strong signals while the central probe receives a weak signal. The "000" case sets a minimum or zero target power level for the signals received at all 3 probes. The black bars in Fig. 3 mark the target intensities used to realize the three special cases in an actual experimental setting. In particular, these results enable the programmable metasurface to deliver or damp signals at different locations, pointing to applications of the metasurfaces as RISs, e.g., a room decorated with such metasurfaces to selectively deliver signals at different locations [16]. We note that the experimental conditions remain the same for the whole training and test process. To evaluate the overall performance, we show that our system can control the allocated power to arbitrary target values within the reasonable range, as shown in Fig. 3(d)-(f).

On-site updated DNN with obstacle
The above results demonstrate that our DNN-assisted metasurface is capable of controlling the power delivered to specific receivers on demand in the scenario without obstacles. Normally, a DNN that is well trained for a specific scenario may fail to work after the ambient conditions change (the emergence of obstacles, for example). However, our experiment-based DNN can readily adapt to the changed ambient conditions because it can be retrained using experimental data that are collected within a short time and updated periodically. To demonstrate the adaptivity of the system, a metal frame obstacle is added between the metasurface and the three probes, as shown in Fig. 1. We input the same 3000 testing sets of target intensities to the previous DNN, trained without the obstacle.

Discussion and Conclusion
In this work, we use 3 probes at specific locations to collect the experimental training data for the DNN construction. Our scheme also allows for the control of scattered fields at other locations by adding more probes, depending on the number of target users. Furthermore, at the current stage, our system manipulates field patterns in the horizontal plane, as our metasurface only has the degree of freedom to control the phase profiles along the y direction (each column is independently controlled). To further enhance the system's capabilities, power allocation with higher degrees of freedom in space can be achieved by independently controlling each unit cell of the metasurface in two transverse dimensions.
In summary, we have proposed an experiment-based DNN approach for power allocation enabled by a programmable metasurface. We train the DNN directly on experimental data, circumventing the need for complex modeling and computationally intensive simulations to generate training data. The experimental data inherently incorporate complex factors that can be challenging to simulate or model, leading to more reliable and robust DNN results. Our experimental results demonstrate that the experiment-based DNN can effectively control the power transmitted toward multiple receivers and can adapt to changed environments through its on-site data-collecting and fast-retraining capability. Our work provides valuable insights into the potential of leveraging real-world data for more accurate and efficient metasurface designs for intelligent and energy-efficient wireless communication in complex indoor environments.

Fig. 1 Schematic of the experiment-based DNN for power allocation with a programmable metasurface. A reflective metasurface with tunable reflection phase profiles {φ_i} is excited by a monochromatic source at 11 GHz from a feed horn. Three fixed probes in specific locations are employed to measure the corresponding field intensities after scattering by a metal frame obstacle. By collecting large sets of measured intensities as the experimental training data, a forward scattering engine (FSE) is pre-trained to convert reflection phase profiles {φ_i} to predicted intensities {I_i′}. Additionally, an inverse-design engine (IDE) is developed to transform input target intensities {I_i} to the required output reflection phase profiles {φ_i}. The integrated DNN, which combines the IDE with the pre-trained FSE, is trained to coordinate the metasurface inverse design to manipulate the scattered fields on demand.
To obtain the experimental training data, we design and fabricate a programmable metasurface consisting of 20 × 20 unit cells operating at 11 GHz, in which the reflection phase of each column {φ_i} can be independently controlled, as shown in Fig. 2(a). The reflected fields depend on the phase profiles {φ_i} assigned to the 20 columns of the metasurface. By varying the reflection phase profiles rapidly in time, a large set of experimental data can be collected from the 3 probes within a short time for the DNN training. In our case, 10000 sets of randomly selected {φ_i} are chosen as input to the metasurface, and the corresponding experimental training data can be collected within 10 seconds.
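As a rough illustration of the data-collection budget described above, the sketch below draws random column-wise phase profiles and works out the average time available per configuration. It assumes that each column is drawn uniformly from the eight discrete phase states mentioned elsewhere in the text and represents a state by an index from 0 to 7. The actual phase values and the sampling procedure are not specified here, so both are assumptions, as are the names used.

```python
import random

NUM_COLUMNS = 20           # independently controlled columns
NUM_STATES = 8             # discrete phase states per column (index 0..7)
NUM_PROFILES = 10000       # measurement sets collected for training
ACQUISITION_TIME_S = 10.0  # total collection time quoted in the text


def random_profiles(count, seed=0):
    """Draw `count` phase profiles, one state index per column."""
    rng = random.Random(seed)
    return [[rng.randrange(NUM_STATES) for _ in range(NUM_COLUMNS)]
            for _ in range(count)]


profiles = random_profiles(NUM_PROFILES)

# Average time budget per configuration, in milliseconds.
time_per_profile_ms = 1000.0 * ACQUISITION_TIME_S / NUM_PROFILES

# Number of distinct profiles under the eight-state assumption.
design_space = NUM_STATES ** NUM_COLUMNS

print(time_per_profile_ms, design_space)
```

Under these assumptions each configuration gets about 1 ms on average, and the 10000 training profiles sample only a tiny fraction of the 8^20 possible designs, which is why a learned forward model is useful.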

Figure 2(b) shows the unit structure of the metasurface with detailed geometric parameters. Three copper layers are printed on two substrate layers (Rogers 4003C, relative permittivity ε_r = 3.55, loss tangent tan δ = 0.0027) and a bonding layer (Rogers 4450F, ε_r = 3.52, tan δ = 0.004). A varactor diode (MAVR-000120-14110P), an active component whose capacitance changes with the bias voltage, is embedded between two metallic patches on the top layer. Two metallic vias electrically connect to the negative "−" electrode in the middle layer and the positive "+" electrode in the bottom layer, respectively. By applying different bias voltages to

Fig. 2 Metasurface design and experimental training data collection. (a) Schematic of the programmable metasurface design. The reflection phases {φ_i} for 20 columns on the metasurface can be dynamically controlled in time. Scattered fields caused by large sets of phase profiles are experimentally collected for the DNN training. (b) Unit cell structure with geometric parameters. The embedded varactor diode works as an active component whose capacitance changes with the bias voltage, leading to a frequency shift of the dipole resonance of the metasurface. (c) Measured reflection phase of the metasurface at different bias voltages. The vertical orange line indicates the operating frequency of 11 GHz. Eight discrete phase states at 11 GHz are used to control the metasurface. (d) Normalized intensities at probes 1, 2, and 3, experimentally measured for the DNN training process in the scenario without obstacle. 10000 sets of intensities are collected from 10000 sets of random reflection phase profiles.
The proposed experiment-based deep learning approach for power allocation is first demonstrated in the scenario without the obstacle. The randomly generated phase profiles {φ_i} and the corresponding measured intensities in Fig. 2(d) comprise 10000 sets of data, with 8100 sets used for training, 900 sets for validation, and the remaining 1000 sets for testing. The MSE loss function between {I_i′} and the measured intensities is used to train the FSE for 10000 epochs with the Adam optimizer and a learning rate of 0.0001, achieving a final validation loss of 0.002. The testing results of the FSE are shown in Fig. S3 in the Supplementary Materials, indicating a relatively low MSE of 0.0011 on average. Then we train the integrated DNN comprising the IDE and the pre-trained FSE. 48000 configurations of the target intensities {I_i} are randomly generated, with the target intensity at each probe chosen from a uniform distribution over [0, 0.6], which is a reasonable range for the DNN to achieve, as illustrated in Fig. 2(d). 40500 sets of {I_i} are used as training data, 4500 sets as validation data, and the remaining 3000 sets as testing data. The integrated DNN is trained for 6000 epochs, using the MSE loss function between {I_i} and {I_i′} and an Adam optimizer with a learning rate of 0.0005, achieving a final validation loss of 0.0003 at convergence. Notably, only the weights of the IDE are optimized in the latter training process.
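The two data splits described above can be reproduced with a short helper. This is a minimal sketch: the shuffling and the seeds are assumptions (the text states only the split sizes and the uniform target range), and the FSE samples are represented by their indices and not by measured data.

```python
import random


def split_dataset(samples, n_train, n_val, n_test, seed=0):
    """Shuffle the samples and cut them into train/validation/test parts."""
    if n_train + n_val + n_test != len(samples):
        raise ValueError("split sizes must add up to the dataset size")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# FSE stage: 10000 measured (phase profile, intensity) pairs, shown as indices.
fse_train, fse_val, fse_test = split_dataset(range(10000), 8100, 900, 1000)

# Integrated-DNN stage: 48000 targets, each probe drawn uniformly from [0, 0.6].
rng = random.Random(1)
targets = [[rng.uniform(0.0, 0.6) for _ in range(3)] for _ in range(48000)]
ide_train, ide_val, ide_test = split_dataset(targets, 40500, 4500, 3000)

print(len(fse_train), len(fse_val), len(fse_test))
print(len(ide_train), len(ide_val), len(ide_test))
```

Because the second stage uses only synthetic targets and the frozen FSE, its dataset can be made larger than the measured one without any further experiments.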
In the figure, the experimentally measured results (blue bars) match well with the targets and network predictions, showing that our DNN-assisted metasurface can manipulate the scattered fields on demand.

Fig. 3 Performance of DNN-assisted power allocation without obstacle. (a)-(c) Three special cases of "001", "101", and "000" for the 3 probes. The black bars, orange bars, and blue bars represent the target intensities {I_i}, the predicted intensities {I_i′} from the DNN, and the measured intensities from the experiment, respectively. (d)-(f) General cases for the 3 probes with 3000 sets of test data. The predicted intensities {I_i′} (orange points) and the measured intensities (blue points) are both plotted against the target intensities {I_i} (horizontal axis). The black dashed line is plotted for reference. The closer a data point is to the reference line, the smaller its error from the target value.
We input the remaining 3000 testing sets of {I_1, I_2, I_3} as the target intensities to the trained DNN and obtain the 3000 sets of {φ_i} and predicted intensities {I_1′, I_2′, I_3′}. In Fig. 3(d)-(f), the horizontal axes and right-hand vertical axes denote the target intensities {I_1, I_2, I_3} and predicted intensities {I_1′, I_2′, I_3′}, respectively. We observe that the 3000 orange data points show a linear distribution around the dashed reference lines I_i′ = I_i, showing that the DNN has been well trained to predict the intensities of the three probes according to the input targets. The mean squared errors (MSEs) between {I_1, I_2, I_3} and {I_1′, I_2′, I_3′} are 0.61 × 10⁻³, 0.52 × 10⁻³, and 0.59 × 10⁻³ for the 3 probes. Next, we evaluate the performance in an actual experimental test. The 3000 sets of {φ_i} are implemented by the metasurface, and the corresponding measured intensities are plotted against the target intensities. As expected, the measured results denoted by blue points are distributed linearly around the dashed reference lines, on which the measured intensity equals the target intensity, indicating that the system can control the allocated power at the 3 probes to target values. The MSEs for the measured results are 2.4 × 10⁻³, 2.1 × 10⁻³, and 3.1 × 10⁻³ for the 3 probes, respectively. The errors for predicted and measured results may come from limited training samples, phase quantization errors, and noisy data acquisition.
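The per-probe MSE quoted above is the squared deviation from the target, averaged over all test sets separately for each probe. The sketch below shows that calculation on three made-up sets. The numbers are illustrative only and are not taken from the experiment, and the function name is hypothetical.

```python
def per_probe_mse(targets, results):
    """Mean squared error between targets and results, one value per probe."""
    n_sets = len(targets)
    n_probes = len(targets[0])
    return [
        sum((r[p] - t[p]) ** 2 for t, r in zip(targets, results)) / n_sets
        for p in range(n_probes)
    ]


# Illustrative values only (hypothetical, not measured data).
targets = [[0.0, 0.0, 0.6], [0.6, 0.0, 0.6], [0.0, 0.0, 0.0]]
measured = [[0.1, 0.0, 0.5], [0.5, 0.1, 0.6], [0.0, 0.1, 0.0]]

mse = per_probe_mse(targets, measured)
print(mse)
```

The same function applies to both comparisons in the text: targets against DNN predictions, and targets against experimentally measured intensities.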

Feeding the target intensities {I_1, I_2, I_3} to the previous DNN (trained without the obstacle) yields the phase profiles {φ_i}. By implementing the 3000 sets of {φ_i} on the metasurface, we measure the corresponding intensities and plot them against the target {I_1, I_2, I_3}, as shown in Fig. 4(a)-(c). For (a) and (b), the measured data points fall below the dashed reference lines, which means the signals transmitted to these two probes are blocked or scattered away by the added obstacle. For (c), the measured results show a poor linear correlation with the target values because of the obstacle. The MSEs are 7.5 × 10⁻³, 22 × 10⁻³, and 4.6 × 10⁻³ for Fig. 4(a)-(c), respectively, showing larger errors than in the case without the obstacle in Fig. 3(d)-(f). Therefore, the original DNN trained without the obstacle performs poorly under the changed ambient conditions.
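To put the degradation in perspective, the measured MSEs reported for the two scenarios can be compared probe by probe. The short calculation below uses only the values quoted in the text. It assumes that panels (a)-(c) of Fig. 4 and panels (d)-(f) of Fig. 3 list the probes in the same order.

```python
# Measured MSEs quoted in the text, one entry per probe.
MSE_NO_OBSTACLE = [2.4e-3, 2.1e-3, 3.1e-3]   # Fig. 3(d)-(f)
MSE_WITH_OBSTACLE = [7.5e-3, 22e-3, 4.6e-3]  # Fig. 4(a)-(c), previous DNN

# Factor by which the error grows once the obstacle is added.
ratios = [after / before
          for before, after in zip(MSE_NO_OBSTACLE, MSE_WITH_OBSTACLE)]

# Probe (1-based) with the largest relative degradation.
worst_probe = ratios.index(max(ratios)) + 1

for probe, ratio in enumerate(ratios, start=1):
    print(f"probe {probe}: MSE increased {ratio:.1f}x")
print("largest degradation at probe", worst_probe)
```

Under that ordering assumption, the error grows by roughly a factor of 3, 10, and 1.5 at the three probes, with the second probe affected most.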

Fig. 4 Performance of DNN-assisted adaptive power allocation with an obstacle. (a)-(c) The measured intensities against the target intensities {I_i} for the 3 probes, using the previous DNN trained without an obstacle. The data points deviate from the dashed reference line, showing that the previous DNN fails to work after an obstacle is added. (d)-(f) The measured intensities against the target intensities {I_i}, using the on-site updated DNN trained with the obstacle. The data points return to the dashed reference line, showing that the experiment-based DNN adapts to the changed ambient conditions.