Robust and Scalable Flat-Optics on Flexible Substrates via Evolutionary Neural Networks


DOI: 10.1002/aisy.202100105

In the past 20 years, flat-optics has emerged as a promising light manipulation technology, surpassing bulk optics in performance, versatility, and miniaturization capabilities. As of today, however, this technology is yet to find widespread commercial applications. One of the challenges is obtaining scalable and highly efficient designs that can withstand the fabrication errors associated with nanoscale manufacturing techniques. This problem becomes more severe in flexible structures, in which deformations appear naturally when flat-optics structures are conformally applied to, for example, biocompatible substrates. Herein, an inverse design platform that enables the fast design of flexible flat-optics that maintain high performance under deformations of their original geometry is presented. The platform leverages suitably designed evolutionary large-scale optimizers, equipped with fast-trained neural network predictors based on encoder-decoder architectures. This approach supports the implementation of flexible flat-optics robust to both fabrication errors and user-defined perturbation stress. This method is validated by a series of experiments in which broadband flexible light polarizers, which maintain an average polarization efficiency of 80% over 200 nm bandwidths when measured under large mechanical deformations, are realized. These results could be helpful for the realization of a robust class of flexible flat-optics for biosensing, imaging, and biomedical devices.

Introduction
Flat-optics engineers surface materials supporting sharp changes of phase, polarization, or direction of electromagnetic (EM) waves, [1] controlling light propagation for diverse applications ranging from imaging [2,3] to optical communications, [4,5] energy harvesting, [6,7] and computing. [8,9] While in the past flat-optics design was mainly driven by intuition, recent years have seen growing interest in inverse design techniques. These methods are based on various applications of optimization theory and, more recently, artificial intelligence, opening up promising approaches to implement flat-optics devices that could overcome challenges not yet addressed by intuition-driven realizations. [10] Two main classes of inverse design techniques based on optimization theory are genetic methods and topology optimization. The former imitates biological evolution and exploits the dynamics of a suitably defined set of parameters that encompass genetic reproduction steps. [11] These algorithms iterate populations in the design space based on a predefined fitness function representing the design objective. During progressive iterations, candidate structures with better performance spontaneously emerge from random genetic mutations and crossover. [12][13][14] Topology optimization, in contrast, exploits discrete material distributions, such as, for example, binary structures. [15] After an iterative refinement, the distribution of materials evolves and clear boundaries appear, defining optimal designs. [16] Topology optimization has been successfully used to implement metalenses, [17][18][19] polarizers, [20][21][22] and wavelength splitters. [23] More recent inverse design techniques exploit statistical learning models based on deep learning neural networks. [24][25][26] These techniques have been successfully applied in the design of nanophotonic chiral metamirrors, [27] diffractive metagratings, [28] plasmonic waveguides, [29] and configurable plasmonic phase-change material (PCM) metasurfaces. [30]

Figure 1 shows a quantitative overview of the state-of-the-art performances reported with these methods. We rank each method based on parameters that provide the largest possible overlap with results available in the literature. In the case of optimization techniques [18][20][21][22][23][31][32][33][34][35] (Figure 1a), results are illustrated in terms of working bandwidth and efficiency (transmission, reflection, scattering, etc.), whereas the performance of deep learning (DL) methods [28][29][30][36][37][38][39][40][41][42][43] (Figure 1b) is classified in terms of network size and mean squared error (MSE) of the designs obtained with these approaches. This research, while still young, reports promising outcomes, with many techniques already capable of designing complex devices with efficiencies above 80% over optical bandwidths larger than 200 nm.
The majority of the devices shown in Figure 1 work on rigid substrates such as glass, quartz, or sapphire. [1,44] Another class of devices, which has recently attracted considerable interest, is represented by flat-optics realized on flexible materials. [45][46][47][48] Flexibility allows conformal integration on general surfaces, including biocompatible materials, opening a wide range of applications in integrated optoelectronics for sensing and soft medical devices, such as contact lenses. [49][50][51][52] A major hurdle in this field is performance degradation when the device operates in deformed conditions. In these cases, performance worsens considerably with respect to the ideal values of nondeformed configurations. Studies on Au nanorod arrays, [53] for example, reported diffraction efficiency dropping from above 90% to below 40% when the nanorods' stretch ratio changed from ideal values. While there is currently no general technique that can solve this problem, artificial intelligence methods such as the ones shown in Figure 1 can be adapted to address this issue.
In this work, we propose to generalize a flat-optics inverse design platform based on nanoscale neural network universal approximators [54,55] and develop a neural prediction unit that takes into account fabrication robustness for the design of flexible flat-optics at visible frequencies and in purely dielectric materials. We validate these results by implementing a new class of flexible flat-optics components whose optical response shows almost negligible variation under deformation. This design platform is general and can be applied to a wide variety of flat-optics components and systems, opening the way to flexible flat-optics on transparent substrates with robust performance.

Fast Neural Network Spectral Predictor
The inverse design platform discussed in the study by Getman et al. [55] exploits an autonomous learning framework for rule-based evolutionary design (ALFRED), composed of a global optimizer and a neural predictor unit. This approach engineers a large-scale resonance network in physical dielectric nanoresonators, which are theoretically demonstrated to act as universal approximators that can predict any user-defined function. Due to the nanoscale nature of the elements being engineered (typical thickness around or below 100 nm and in-plane feature sizes as small as 50 nm), first-principle simulations are required to provide accurate predictions, and no approximate theory can be used in this design scheme. While the optimizer exploits a general parallel algorithm that is identically applied to any design, the predictor is tailored to the specific task defined by the user. The main purpose of the predictor unit is to address the computational bottleneck arising from conducting 3D first-principle simulations during each iteration of ALFRED's optimizer. Here we develop a DL-based spectral predictor unit, which predicts the finite-difference time-domain (FDTD) solution for the computation of flat-optics transmission/reflection responses in standard and deformed flat-optics dispersive materials.
The unit designed in this work allows ALFRED to rapidly evaluate an objective function by merely querying the network on a time scale of several milliseconds, which is $\approx 10^4$ times faster than parallel first-principle simulations launched on a typical scientific workstation. The predictor neural network processes a binary image with a candidate flat-optics geometry and returns its transmission/reflection response for both transverse electric (TE) and transverse magnetic (TM) polarizations. The predictor combines two sequential neural networks: an image feature extraction block, which transforms the input image into low-dimensional feature constituents, and a logic block, which maps features to output spectra. In this work, we compare different image processing architectures based on convolutional neural networks (CNN), namely, VGG, ResNet, DenseNet, and EffNet [56][57][58][59] (Figure S3, Supporting Information), and choose EffNet due to its superior performance over the others. The logic block consists of several multilayer perceptrons (MLPs), sequentially connected and trained with a supervised learning approach from a set of first-principle spectra, each sampled with 9 nm wavelength resolution. The logic block outputs a vector of predicted spectral points, representing the predicted spectra of the material response at the input of the CNN layer.

The overall training process of the neural predictor consists of three main stages: pretraining, training, and post-training. Following the ideas of U-net, [60] we design the pretraining stage using an encoder-decoder architecture (Figure 2a) with skip connections. The system exploits the CNN blocks as an encoder (blue) with multiple upsampling blocks as decoders (green). The system is trained to conduct two main tasks: autoencoding and semantic segmentation. In the autoencoder task, the system performs a pixel-by-pixel binary classification, defining whether a particular pixel belongs to a nanoresonator or not (Figure 2b). The semantic segmentation task, in contrast, classifies each pixel as belonging to a specific shape such as, for example, rectangle, circle, ring, polygon, or the remaining space (Figure 2c).
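As a rough illustration of the inputs and pretraining targets described above, the sketch below rasterizes a list of rectangular resonators into a binary image and a per-pixel class map. It is only a minimal example under stated assumptions: the 64-pixel grid, the function name `rasterize_boxes`, and the box format `(x0, y0, width, height)` are hypothetical and not taken from the paper; only the 500 nm period comes from the text.

```python
import numpy as np

PERIOD_NM = 500.0   # unit-cell period quoted in the text
GRID = 64           # hypothetical image resolution (not given in the text)

def rasterize_boxes(boxes, grid=GRID, period=PERIOD_NM):
    """Return (mask, labels) for boxes given as (x0, y0, width, height) in nm.

    mask   : binary image, 1 inside a resonator, 0 elsewhere (autoencoding target)
    labels : integer image, 0 = background, 1 = rectangle (segmentation target)
    """
    # Coordinates of the pixel centres across one period of the unit cell.
    centres = (np.arange(grid) + 0.5) * period / grid
    xx, yy = np.meshgrid(centres, centres, indexing="xy")
    mask = np.zeros((grid, grid), dtype=np.uint8)
    for x0, y0, w, h in boxes:
        inside = (xx >= x0) & (xx < x0 + w) & (yy >= y0) & (yy < y0 + h)
        mask[inside] = 1
    # Only one shape class (rectangle) is used in this sketch.
    labels = mask.astype(np.int64)
    return mask, labels

# A single 275 x 100 nm box placed inside the 500 nm cell.
mask, labels = rasterize_boxes([(100.0, 200.0, 275.0, 100.0)])
fill_fraction = mask.mean()
```

With more shape classes (circle, ring, polygon), `labels` would carry one integer per class, while `mask` would stay binary.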
The pretrained convolution branch is then connected with the logic layer, completing the training process of the whole unit using the following mean-squared error (MSE) loss function:

$$L = \frac{1}{N_\omega} \sum_{i=1}^{N_\omega} \left| s_{\mathrm{pred}}(\omega_i, g) - s_{\mathrm{true}}(\omega_i, g) \right|^2 \qquad (1)$$

with $s_{\mathrm{pred}}(\omega_i, g)$ and $s_{\mathrm{true}}(\omega_i, g)$ representing the predicted and ground-truth transmittance and reflectance, respectively, at a sample frequency $\omega_i$ for the input geometry $g$, and $N_\omega$ the number of sampled frequencies. Figure 2d-g shows an example of training results obtained by applying the Adam optimization algorithm [61] on a training dataset composed of an array of silicon boxes (up to 5) with period 500 nm and discrete thicknesses from 50 nm to 300 nm, with a 25 nm step. Figure 2d shows probability density […] Figure 2e shows the MSE obtained using different pretraining tasks (solid lines, orange for autoencoder and blue for semantic segmentation) versus a model that does not have any pretraining (solid line, green). Pretraining improves both convergence time and prediction error, with the performance arising from autoencoding slightly above that of semantic segmentation. Figure 2f shows a typical TE/TM spectral prediction versus ground-truth values for a two-box configuration with period 500 nm and thickness 100 nm (panel g). Figure S1, Supporting Information, shows additional examples of spectral predictions versus ground-truth values for random cuboid geometries.
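The MSE loss used to train the predictor can be written in a few lines. The following is a minimal NumPy sketch, assuming the spectra are stored as real-valued arrays sampled at the same frequencies; the function name `spectral_mse` and the sample values are illustrative only.

```python
import numpy as np

def spectral_mse(s_pred, s_true):
    """Mean-squared error between a predicted and a ground-truth spectrum."""
    s_pred = np.asarray(s_pred, dtype=float)
    s_true = np.asarray(s_true, dtype=float)
    if s_pred.shape != s_true.shape:
        raise ValueError("predicted and true spectra must have the same shape")
    return float(np.mean((s_pred - s_true) ** 2))

# Synthetic four-point spectra, used only to exercise the function.
s_true = np.array([0.90, 0.80, 0.70, 0.60])
s_pred = np.array([0.88, 0.83, 0.70, 0.55])
loss = spectral_mse(s_pred, s_true)
```

In an actual training loop the same quantity would be computed by the deep learning framework so that gradients can flow to the network weights.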

Post-training and Data Analysis
The aim of post-training is to increase the prediction success rate of the model selected during pre- and main training. We begin by selecting a threshold mean square error $L_m$, below which we consider the prediction acceptable. For this analysis, we choose $L_m = 0.004$. Figure 3a-b shows two examples of successful and failing predictions with MSE below (a) and above (b) $L_m$, respectively.
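The acceptance criterion above amounts to thresholding the per-sample loss. A minimal sketch, assuming the test losses are already available as an array (the values below are synthetic and the helper name `success_rate` is hypothetical):

```python
import numpy as np

L_M = 0.004  # acceptance threshold quoted in the text

def success_rate(losses, threshold=L_M):
    """Flag each prediction as successful (loss < threshold) and return the overall rate."""
    losses = np.asarray(losses, dtype=float)
    successful = losses < threshold
    return successful, float(successful.mean())

# Synthetic per-sample MSE values.
losses = np.array([0.0005, 0.0086, 0.0031, 0.0120])
successful, rate = success_rate(losses)
```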
Once $L_m$ is set, we identify in the predicted test samples a subgroup of spectra with a high rate of failing predictions. Following the approaches used in unsupervised learning based on principal component analysis (PCA), [62] we extract from each spectrum 15 significant, high-variance components. We then group the spectra using a k-means clustering algorithm [62] and visualize each cluster with a t-distributed stochastic neighbor-embedding (t-SNE) algorithm. [62] Figure 3c shows clustering results visualized in 2D nonlinear t-SNE coordinates. Using k-means, we split the test dataset into three different clusters (Figure 3c, orange, green, and blue areas). Supplementary Note II and Figure S2, Supporting Information, show results for different total numbers of clusters in k-means. Panel d shows the distribution of successful ($L < L_m$) and failing ($L > L_m$) predictions using the same 2D nonlinear t-SNE coordinates of Figure 3c.
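The dimensionality reduction and clustering steps can be sketched with NumPy alone. This is a simplified stand-in rather than the authors' pipeline: PCA is done through an SVD, k-means through plain Lloyd iterations, and the t-SNE visualization step is omitted. The random `spectra` array is synthetic, and the function names are hypothetical.

```python
import numpy as np

def pca_reduce(spectra, n_components=15):
    """Project spectra (n_samples, n_points) onto their leading principal components."""
    centred = spectra - spectra.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def _nearest(points, centroids):
    """Index of the closest centroid for every point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k=3, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _nearest(points, centroids)
        # Keep the old centroid if a cluster happens to become empty.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return _nearest(points, centroids), centroids

rng = np.random.default_rng(1)
spectra = rng.random((300, 40))        # synthetic stand-in for predicted spectra
components = pca_reduce(spectra)       # 15 components per spectrum
labels, centroids = kmeans(components, k=3)
```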
Comparing Figure 3c,d, we observe that the blue cluster region gathers considerably more failing predictions than the others, with a rate of failed predictions that is 3.9 times larger than in the whole test dataset. Panel e visualizes this analysis quantitatively by comparing the probability density functions (PDFs) of the MSE, computed via histograms for the 4000 test predictions, among the three different clusters. The highest density of predictions in the blue cluster is shifted away from 0, indicating a higher failure rate for this cluster region.
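The comparison between a cluster's failure rate and that of the whole test set (the 3.9 factor quoted above) is a simple ratio. A minimal sketch with synthetic losses and cluster labels, where the helper name `relative_failure_rates` is hypothetical:

```python
import numpy as np

def relative_failure_rates(losses, labels, threshold=0.004):
    """Failure rate of each cluster divided by the failure rate of the whole dataset."""
    losses = np.asarray(losses, dtype=float)
    labels = np.asarray(labels)
    failed = losses > threshold
    overall = failed.mean()
    if overall == 0:
        raise ValueError("no failing predictions in the dataset")
    return {int(c): float(failed[labels == c].mean() / overall)
            for c in np.unique(labels)}

# Synthetic example: cluster 2 collects most of the failing predictions.
losses = np.array([0.001, 0.002, 0.009, 0.001, 0.010, 0.012, 0.003, 0.001])
labels = np.array([0, 0, 0, 1, 2, 2, 1, 1])
ratios = relative_failure_rates(losses, labels)
```

A cluster with a ratio well above 1 is the natural candidate for generating additional training geometries.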
Following this analysis, we then generate an additional dataset of 10 000 geometries that belong to the high-failure (blue) cluster. We add this additional dataset to the original one and train the network again. Figure 3f-h shows the effect of post-training. Panels f and g show the example of a failing prediction ($L = 0.0086$) on the model without post-training, which becomes successfully predicted after post-training ($L < 0.001$). Figure 3h shows the PDF of the MSE calculated on the test dataset before and after post-training. Post-training reduces the prediction error and shifts the maximum of the distribution back toward the origin. We present additional comparisons using different CNNs in Figure S3, Supporting Information.

Experiments
We design experimental samples robust to fabrication errors and geometrical deformations using a statistical approach [55] based on weighted least squares. Specifically, we create a set $g_1, g_2, \ldots, g_N$ of randomly perturbed structures with different geometrical features obtained by applying to each cuboid resonator, with sizes $\Delta x$ and $\Delta y$ in the transverse $x$ and $y$ directions, uniform random changes with mean values $\mu_x = 0$, $\mu_y = 0$, and standard deviations $\sigma_x = 0.15\,\Delta x/\sqrt{3}$ and $\sigma_y = 0.15\,\Delta y/\sqrt{3}$. We chose these deformations to reproduce fabrication errors of electron beam lithography (EBL), such as those caused by the proximity effect, [63] that amount to tens of nanometers. Due to the rigid nature of the Si boxes nanopatterned on the surface, we also assume that these values reproduce deformation of the flexible substrate. We verified this assumption experimentally, carrying out measurements of samples under different mechanical stress conditions and comparing the results with the nondeformed case.
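A set of perturbed geometries of this kind can be generated as follows. The sketch assumes that the standard deviation is $0.15\,\Delta x/\sqrt{3}$, which for a zero-mean uniform distribution corresponds to size changes within ±15% of each dimension (a uniform variable on $[-a, a]$ has standard deviation $a/\sqrt{3}$). This reading, the function name `perturb_sizes`, and the array layout are assumptions made for illustration.

```python
import numpy as np

def perturb_sizes(sizes, n_samples=50, rel_factor=0.15, seed=0):
    """Draw zero-mean uniform size perturbations for each resonator.

    sizes : array of shape (n_boxes, 2) holding (dx, dy) in nm
    returns an array of shape (n_samples, n_boxes, 2) with the perturbed sizes
    """
    rng = np.random.default_rng(seed)
    sizes = np.asarray(sizes, dtype=float)
    sigma = rel_factor * sizes / np.sqrt(3.0)   # assumed standard deviation
    half_width = np.sqrt(3.0) * sigma           # uniform on [-a, a] has std a / sqrt(3)
    delta = rng.uniform(-half_width, half_width, size=(n_samples,) + sizes.shape)
    return sizes + delta

# One 275 x 100 nm cuboid, perturbed 50 times.
nominal = np.array([[275.0, 100.0]])
perturbed = perturb_sizes(nominal, n_samples=50)
```

For the 275 nm side this gives changes of up to about 41 nm, in line with the tens of nanometers mentioned for EBL errors.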
Once the values of $\sigma_{x/y}$ are set, we then create the following cost function.
in which $\mathcal{F}(g)$ is the cost function associated with the undeformed geometry, and $\mathcal{F}(g_n)$ addresses the deformed case. The coefficient $\alpha$ provides a weight to the deformed versus the nondeformed part of the cost function, allowing fine tuning of the design outcome. We illustrate this approach in the implementation of robust polarizers at the operating wavelength of $\lambda_0 = 900$ nm, and with $\delta\lambda = 100$ nm operational bandwidth, and compare the performance of structures found with and without optimizing for robustness. We use an unperturbed cost-function defined as follows.
with $s^{+}_{\mathrm{TE}}$ and $s^{+}_{\mathrm{TM}}$ being the TE and TM transmission of the structure.
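One way to picture the weighting between the undeformed and deformed terms is a convex combination of the nominal cost and the average cost over the perturbed set. The form below, $(1-\alpha)\,\mathcal{F}(g) + \alpha\,\frac{1}{N}\sum_n \mathcal{F}(g_n)$, is an assumption chosen because it reduces to the nominal cost at $\alpha = 0$ and weighs both terms equally at $\alpha = 0.5$; it is not claimed to be the exact expression used in the paper. The toy cost function and all names are hypothetical.

```python
import numpy as np

def robust_cost(cost, g, perturbed, alpha=0.5):
    """Blend nominal and deformed costs (assumed form, for illustration only)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    nominal = cost(g)
    if alpha == 0.0 or len(perturbed) == 0:
        return float(nominal)
    # Average cost over the set of perturbed geometries g_1 ... g_N.
    deformed = np.mean([cost(g_n) for g_n in perturbed])
    return float((1.0 - alpha) * nominal + alpha * deformed)

def toy_cost(width_nm, target_nm=275.0):
    """Hypothetical stand-in for a spectral cost: relative distance from a target width."""
    return abs(width_nm - target_nm) / target_nm

perturbed_widths = [250.0, 275.0, 300.0, 330.0]
cost_plain = robust_cost(toy_cost, 275.0, perturbed_widths, alpha=0.0)
cost_robust = robust_cost(toy_cost, 275.0, perturbed_widths, alpha=0.5)
```

In the design loop, `cost` would be evaluated through the neural predictor, which is what makes averaging over many perturbed geometries affordable at every optimizer iteration.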
To evaluate the effect of the robustness analysis, we run two independent searches for structures with deformation tolerance ($g_T$) and without deformation tolerance ($g$). These cases correspond to $\alpha = 0.5$ and $\alpha = 0.0$, respectively. We choose $\alpha = 0.5$ for $g_T$ to provide an equal balance between the efficiency ($\alpha = 0$) and robustness ($\alpha \neq 0$) terms appearing in Equation (2). For both $g$ and $g_T$ cases, we create sets of $N = 50$ perturbed geometries $G_T$ and $G$ at each iteration and for each particle of ALFRED's swarm optimizer, which continuously evaluates and optimizes the generalized cost-function $\mathcal{F}_g$ of Equation (2).

Figure 4 shows the results of the robust optimization design. Figure 4a,b shows 2D masks of the obtained $g$ (a) and $g_T$ (b) geometries with the corresponding overlay of 50 randomly perturbed geometries $G$ and $G_T$. As a figure of merit for the polarizers, we choose the polarization efficiency, defined as

$$E(\lambda) = \frac{\left| s^{+}_{\mathrm{TE}}(\lambda) - s^{+}_{\mathrm{TM}}(\lambda) \right|}{s^{+}_{\mathrm{TE}}(\lambda) + s^{+}_{\mathrm{TM}}(\lambda)}.$$

Figure 4c,d compares the polarization efficiency of the fabrication-tolerant structure $g_T$ with that of the structure $g$ without tolerance. Each line on the graphs represents the polarization efficiency versus wavelength for a particular geometry from the sets $G_T$ and $G$. Almost one-third of the geometries from the $G$ set obtained without robust design show mediocre performance under deformations, with polarization efficiencies lower than 10% for certain cases within the […]

To validate these results, we fabricate and characterize two broadband flexible flat-optics polarizing beam splitters operating between 600 and 800 nm. The first sample uses the previous robust design at $\alpha = 0.5$, whereas the other represents the optimal design at $\alpha = 0$.
The fabrication process is shown in Figure 5a. It begins by applying a 30 μm-thick piece of kapton tape to a 200 μm-thick slab of microscope glass with the help of 40 μm of silicone adhesive. We then deposit a 250 nm-thick layer of amorphous silicon on the tape by plasma-enhanced chemical vapor deposition (PECVD), and on top of it we spin coat a layer of ZEP520A electron beam resist. We then pattern a mask on the resist through EBL, develop the resist, and deposit chromium to create an inverse mask of our structures. After this step we remove the excess resist and dry etch the sample, using chromium as a hard mask to protect the silicon. The chromium is then removed through wet etching. As a final step, we peel off the kapton with the structures from the glass by softening the adhesive with ethanol. Figure 5b shows a picture of the finalized sample on top of flexible kapton tape. The robust design consists of periodic silicon cuboids with dimensions 275 × 100 nm and thickness 250 nm. Figure 5c shows a scanning electron microscope image of the silicon nanostructures of the final robust sample.
We conduct the characterization of the devices under flat and curved conditions using the setup of Figure 6a. In both flat and curved cases, we illuminate 1 × 1 mm samples with a broadband halogen source (Ocean Optics DH-2000), followed by a linear polarizer mounted on a computer-controlled motorized stage. The polarizer is rotated to measure the device transmission at each angular orientation $\Delta\theta$, and each measurement is normalized to the results obtained with no sample. Normalization is conducted separately for each polarization. Figure 6b shows measurement results for the robust sample when the device is used in a flat configuration. A transmission of 85% is experimentally achieved for the TE polarization over the design bandwidth, whereas an average of 94% of TM-polarized input light is reflected across the same wavelength range, resulting in an 85% polarization efficiency for this device. To assess the designs' performance under various deformation conditions, we fix the ends of the devices to the jaws of a spanner wrench and curve the samples to different degrees by adjusting the wrench opening. We measure both devices under circularly flexed conditions at four different radii of curvature, between 5.5 and 1 mm. We compare the performance of the samples by computing the average polarization efficiency and the average extinction ratio (ER), defined as $\mathrm{ER} = s^{+}_{\mathrm{TE}} / s^{+}_{\mathrm{TM}}$, over the design bandwidth for each condition. Figure 6c shows the relative increase/decrease in the average polarization efficiency of both samples with respect to their flat measurements. The robust design shows almost constant performance across deformations, exhibiting an improvement of around 3% at the tightest curvature condition. On the contrary, the nonrobust design efficiency drops significantly when the device is curved, presenting a 53% decrease in performance for the largest deformation. Figure 6d shows the behavior of the average ER across deformations. 
The graph plots the relative deviation in both samples of the bandwidth-averaged ER under flat and curved conditions. In the robust design, the effect of curvature manifests as a performance improvement. In the nonrobust case, in contrast, the effect results in a performance drop. Quantitatively, the robust design shows a 21% ER improvement on average across deformations, with less than 11% variation between curved states. In contrast, the nonrobust polarizer experiences an average performance drop of 40% when deformed, with ER changes between curved states that exceed 43%, showing a considerable performance decrease.

Figure 5. Flexible flat-optics: manufacturing. a) The fabrication process of our device. We begin with kapton tape that has an adhesive silicone layer underneath. To keep it flat, we attach it to a slab of glass. 250 nm of amorphous silicon is then deposited on top by PECVD, followed by ZEP520A resist for EBL patterning. The design is imprinted through EBL, after which we use an electron beam evaporator to deposit chromium and form a solid mask. We then dry etch the silicon and remove the chromium mask through wet etching. Following this, the kapton with silicon structures on top can be removed from the glass and used. b) A photograph of the final device. c) A scanning electron microscope micrograph of the device in (b).

Conclusion
For any resonance-based light-processing structure, given a desired input material response characterized by a set of $N$ fixed points in the spectrum, it is possible to set the position of around $2N$ independent resonant frequencies of resonator boxes. [53] This operation could be done theoretically in the simplest structure possible, composed of a single resonator, if all the required resonances could be shifted to the correct values. However, in a physical system, resonances cannot be manipulated directly, but only indirectly by modifying the resonator shape. This operation alters at once all resonances of the system, and also their corresponding lifetimes, which in turn define the coupling coefficients among resonators (as described in detail in another study [53]). The situation is even more complex in the case of deformations, in which we look for resonance shifts that, on the one hand, define the material response and, on the other, keep it stable around a specific fixed point under deformation. To tackle this problem, we developed the global optimizer ALFRED, which takes into account all of these considerations and automatically looks for the simplest configuration of resonators that could address this issue. The proposed approach of a modified cost-function in Equation (3) could be seen as a regularization technique providing a trade-off between the fabrication (deformation) tolerance and the device's operational efficiency. This allows the framework to conduct complex tasks while looking for the simplest set of structures that could be realized experimentally. We validated this platform by realizing flat-optics linear polarizers that experimentally maintain polarization efficiencies of around 80% over a 200 nm bandwidth when curved to different degrees. These results can hopefully contribute to facilitating commercial applications of flexible flat-optics devices in various settings. 
Emerging techniques for high-throughput, low-cost fabrication of nanoscale devices, such as roll-to-roll nanoimprint lithography, require robust flexible designs to overcome the challenges of patterning defects that result from mold irregularities, thermal expansion, and uneven pressure application. [64][65][66] The approach described in this work can help in achieving a high tolerance to deformation and dimensional errors, contributing also to the possible future integration of flexible flat-optics with flexible electronics. As an example, the flexible color filters developed in the study by Di Falco et al. [67] and the flexible polarizers obtained here can be integrated to replace their bulk equivalents in flexible LCD displays that could be made ultra-thin. Robustness to deformation may also facilitate the development of biocompatible devices. Integrating flexible robust flat-optics on silk and hydrogel substrates [68][69][70] can open the door to the creation of a new class of epidermal sensors insensitive to device strain for diverse applications, including enhanced sensing of bacterial infection. [71] Other applications can be envisaged in the area of color vision, for the creation of deficiency-correcting contact lenses that could be robust to spherical deformations. [51]

Experimental Section
Sample Nanofabrication: As the base for our flexible devices, we used polyimide tape (Kapton Polyimide Film, 3M Tape 5413) adhered to an 18 mm-wide, ≈200 μm-thick square slab of borosilicate glass. We then grew on top of it a uniform layer of amorphous silicon via PECVD to a thickness of 250 nm. The thickness was verified by ellipsometry (UVISEL Plus, from HORIBA). We applied a positive electron beam resist ZEP 520A (from ZEON corporation) on the silicon through spin coating at 4000 RPM for 60 s and then cured the sample on a hotplate at 180 °C for 3 min. A last layer of the conductive polymer AR-PC 5090.02 (ALLRESIST) was then spin-coated onto the sample at 4000 RPM for 60 s to avoid charging effects during EBL, after which we baked the device again on a hotplate at 100 °C for 1 min. We used a JEOL JBX-6300FS EBL system at a 100 kV accelerating voltage to write the pattern. The development was done by submerging the exposed sample in deionized water for 60 s, then in n-amyl acetate (ZED-N50 from the ZEON corporation) for 90 s, and finally in isopropyl alcohol for 90 s. We then used electron beam evaporation to deposit a 22 nm-thick layer of chromium on the sample. We conducted lift-off by submerging the sample in N-methyl-2-pyrrolidone (ALLRESIST) at 80 °C for 1 h and sonicated the solution for 1 min afterwards to create a protective chromium mask for the successive etching step. Etching was conducted through a reactive ion-etching process with SF6 to remove the unprotected silicon and expose the kapton. Following this, we removed the chromium mask by submersion in perchloric acid and ceric ammonium nitrate solution (TechniEtch Cr01 from MicroChemicals) for 30 s. Removal of the kapton tape with the device from the glass was achieved by submerging the sample in ethanol to soften the adhesive, after which we gently peeled the tape off the glass.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.