A Topological Encoding Method for Data-driven Photonics Inverse Design

Data-driven approaches have been proposed as effective strategies for the inverse design and optimization of photonic structures in recent years. In order to assist data-driven methods for the design of topology of photonic devices, we propose a topological encoding method that transforms photonic structures represented by binary images to a continuous sparse representation. This sparse representation can be utilized for dimensionality reduction and dataset generation, enabling effective analysis and optimization of photonic topologies with data-driven approaches. As a proof of principle, we leverage our encoding method for the design of two dimensional non-paraxial diffractive optical elements with various diffraction intensity distributions. We proved that our encoding method is able to assist machine-learning-based inverse design approach for accurate and global optimization.


Introduction
Over the past two decades, the advancement of photonics has been enabling vast approaches for manipulating light in the wavelength scale. By engineering the building blocks of the photonics materials and devices, the behavior of light, such as phase, amplitude and polarization of transmitted light and near field responses of wave, can be accurately controlled. Diverse photonic devices ranging from diffractive optical elements (DOE) to metamaterials (MM) and metasurfaces (MS) 1,2 , are designed for the extensive applications such as virtual/augmented reality display 3,4 , miniaturized imaging systems 5,6 , and quantum optics platform 7 . However, the complex mechanism of light-matter interaction prevents an intuitive strategy for the design of the building blocks in the photonic devices. As such, various inverse design and optimization algorithms have been developed for the expeditious design of photonic structures. For example, given some parameters that define the photonic device, adjoint methods 8,9 calculates the gradients of the parameter with respect to the objective function, and incrementally updates the structure by subtracting the gradients from the parameters. Metaheuristic optimizations 10,11 , on the other hand, treat the physical system as a black box, and update the parameters following the manner inspired by physical and biological systems. sparse representation. However, when some operations such as filtering out high-frequency components, are applied to the transformed sparse representation, the inverse Fourier transform (IFT) of the sparse representation may not be a binary image anymore. Consequently, FT is not able to encode, decode, and manipulate arbitrary topologies of photonic structures for the general purpose of dimensionality reduction, data analysis, and device optimization.
Instead of simply applying FT to the binary image, we carry out FT to the level set function ( , ) that defines the topology of the structure. As illustrated in Fig. 1(a), a level set function ( , ) is defined as a 3D surface, and the topology of the structure (encircled by red line) can be represented as the zero-level set of as Γ = {( , )| ( , ) = 0}. Given certain binary images as shown in Fig. 1(b), our encoding strategy is first to construct a level set function, and then apply FT to the level set function so as to derive the sparse representation of the binary image as shown in Fig. 1(c). Reversing the whole process reconstructs a sparse representation to a binary image. The detailed procedure of encoding and decoding photonic structures are shown in Fig. 1 (d). We first construct the level set function of the original image ( , ) through the transform: 1 ( , ) = 345 (6,7) (1) The exponential function maps the original image to the discrete value {−1,1} , , so that the topology of the structure can be represented by Γ = {( , )| 1 ( , ) = 0}. By carrying out FT to 1 , we can find the sparse representation of 1 in the complex frequency space as 9 1 : 6 , 7 < = ℱ[ 1 ( , )] By the property of FT, 9 1 naturally satisfies the condition: 9 1 : 6 , 7 < = 9 1 @ :− 6 , − 7 < where 9 1 @ is the conjugate of 9 1 . Thus, for an image with N pixels, the DOF of its sparse representation is also N. Certain operations P such as filtering out high-frequency components can be performed on this sparse representation A ( , ) = 1 ( , ). In order that the decoded image from A ( , ) is also binary, A should meet the same condition 9 A : 6 , 7 < = 9 A @ :− 6 , − 7 <.
To decode the image, we apply the encoding process in a reversed order. In detail, we first apply inverse Fourier transform (IFT) to 9 A to find the level set function that represents the photonic structure and then retrieve the binary image representation through the operation: where ang(•) is a function that calculates the phase of A ( , ), and

Properties of the encoding method
The encoding method illustrated above transforms a binary image to a continuous sparse representation, allowing incremental variation of the topology of the structures by perturbing 9 A : 6 , 7 <. We will illustrate several properties of this encoding method and show its advantages in representing photonic structures for data-driven photonic discovery. To be consistent with terminology in machine learning, we will call the space of the sparse representation as latent space and 9 A as latent vector.

a. Dimensionality reduction
Suppose a photonic structure represented in an image with number of pixels N, the DOF of the structure is also N. Performing inverse design and optimization in such a high-dimensional space is difficult. On the other hand, the topology of a structure usually presents some properties such as continuity and connectiveness for the purpose of proper simulation and fabrication. Thus, the available structures cluster in a small region in the N-dimensional image space. Machine learning algorithms such as VAE and GAN have been used to reduce the dimensionality of the photonic structures. However, a trained ML model can only faithfully encode and decode the structures topologically similar to the training dataset. Our method, derived without the dependence of data, is able to process binary image data in a fast and general manner. Figure 2(a) to (d) illustrate the process of encoding and decoding a photonic structure. The initial structure, as shown in Fig. 2(a), is represented in an image with = 64 × 64. Our method encodes the structure into its frequency representation 9 1 , the norm of which is presented in Fig.  2(b). After cropping the high-frequency components as in Fig. 2(c), only dominant components are kept in 9 A . Recovering 9 A into binary images through our method, we achieve the decoded image. Interestingly, as decoding the structure is the essentially performing the IFT, the recovered images can have arbitrarily large DOF. The decoded example shown in Fig. (d) has a resolution of 128 × 128. As cropping inevitably deletes portion of the information in the latent space, our encoding method is an irreversible lossy compression. The higher order of terms kept in the latent space, the finer feature will be remained in the decoded images. Other dimensionality reduction approaches can be applied to the encoded latent vectors to further reduce the complexity of the inverse design problem.

b. Continuity of the latent space
Since FT and IFT are uniformly continuous operators, the decoded image A ( , ) can be incrementally varied by perturbing 9 A ( 6 , 7 ). Figure 2(e) to (i) shows a continuous topological variation from the first pattern shown in Fig. 2(b) to the last one in Fig. 2(f). The first and last patterns are randomly constructed with latent vectors 9 D , 9 [ ∈ ℝ ]×] . Figure 2(c) to 2(e) present the intermediate that are decoded from the linearly interpolated latent vector 9 A = 9 D + (1 − ) 9 [ , where ∈ (0,1). As we can observe, the two distinct topologies can be smoothly transformed by linearly interpolating their latent vectors. This property is indispensable for the fast convergence when evolution strategy (ES) is utilized for the topology optimization 32 . It is noteworthy that linear interpolation does not always result in continuous topological transformation, especially when the latent space is high dimensional and the topologies of the two patterns are significantly distinct. In this situation, geodesic, representing the shortest path between the two patterns in the latent space, should be computed for the smooth transformation.

c. Symmetric property
Symmetry is a crucial geometric property that should always be considered in the design of photonic devices. Properly leveraging the symmetric property reduces the time of simulation and mitigates the difficulty of optimization. Our encoding method maintains the symmetric properties of binary images in the frequency spaces. Figure 2(j) to (n) display a few randomly generated patterns with some symmetric properties. The dimension of the latent space we chose is = 5 × 5. Without any constraints on the latent space, the generated pattern shown in Fig. 2(j) does not present any symmetry. In order to generate a centrosymmetric pattern, we need to enforce the latent vector to be centrosymmetric, i.e., 9 A : 6 , 7 < = 9 A :− 6 , − 7 <. Combined with Eq. (3), this condition is equivalent to the latent vectors being real, reducing the DOF of the latent vectors to ⌊ /2⌋ + 1 = 13 . Figure 2(k) is a randomly generated centrosymmetric pattern with such constrain. Similarly, if 9 A : 6 , 7 < = 9 A : 6 , − 7 < and 9 A : 6 , 7 < = 9 A :− 6 , 7 < are enforced, we can generate axisymmetric patterns such as the one shown in Fig. 2(l). The DOF in this case is reduced to :f√ /2h + 1< [ = 9 . By additionally constraining 9 A : 6 , 7 < = 9 A ( 7 , 6 ) , axisymmetric patterns with axis of symmetric = can be produced as shown in Fig. 2(m) and (n). In this circumstance, the DOF of the pattern is jk √, [ l + 2m /2 = 6, indicating that only six variables are required for arbitrarily manipulate the topology of photonic structure. Such symmetric properties of the encoding/decoding method can be leveraged for reducing the parameters in the inverse design of metasurfaces and photonic crystals with specific polarization requirements.

d. Multilevel optimization
When the DOF of a photonic structure is large, optimization techniques suffer from problems such as slow convergence and local minimum. In this situation, multilevel optimization 36 can be used for designing the structure and enhancing the performance. By the nature of FT, our encoding method allows the multilevel optimization of photonic structures by gradually modifying the corresponding latent vectors. Figure 2(o) to 2(s) present an example that adding finer features to the initial structure ( Fig. 2(o)) through our encoding method. The initial structure is constructed from a 9 A J ∈ ℝ ]×] . By attaching additional vectors to the latent vector, we can augment the latent vector to 9 A D ∈ ℝ n×n in a higher dimension. Figure 2(p) presents the decoded image of the augmented latent vector 9 A D . A few features such as the hole in the center appear. Repeating the augmentation process results in the incremental evolution of the structures with finer features as shown in Fig. 2(q) to 2(s). This unique property of the encoding method enables the consolidation of traditional optimization and multilevel optimization for the inverse design of high DOF photonic structures.

Designing non-paraxial diffractive optical elements (DOEs)
As a case study, in this section we will represent how the encoding method can be applied in the inverse design and optimization of binary DOEs. Traditionally, the design of DOEs are relied on iterative Fourier transform algorithm (IFTA). The algorithm can generate binary phase masks whose FT is proportional to the required diffraction intensity distributions. However, IFTA does not take into account the actual physical process, resulting in the inaccuracy of the modeling for diffraction intensity. This inaccuracy prevents an effective design method for the few dots, nonparaxial diffractive beam splitters. To solve this problem, we consolidate the proposed encoding method and a hybrid inverse design algorithm 32 to design 3 × 3 non-paraxial diffractive beam splitters with various diffraction intensity distributions. The cross section of common configuration of the DOEs is shown in Fig. 3(a). The grating and the substrate share the same materials with a refractive index = 1.56, and the thickness of the grating pattern is = 840 . Monochromatic light with a wavelength of J = 940 is incident from the substrate side. We set the period of the DOE as = 2.83 , so that the angle between the 1 st and the 0 th order diffraction is about 21°. The objective of the design is to identify the DOE patterns that are able to accurately diffract the incident light into the 9 different directions with various required intensity distributions.
In order to leverage data-driven approaches such as DL for the fast and global optimization, we first generate sufficient DOE that are able to diffract light to the desired directions. Since the encoding method we proposed is based on FT, it is sufficient to sample 9 A from [−1,1] t×t as the sparse representation for the design of 3 × 3 diffractive beam splitters. In practice, we can further simplify the representation of each DOE for the convenience of the training of a neural network model. In detail, we write 9 A as: where { 3 | = {1, … ,9}} are real numbers. Equation (7) satisfies the condition defined by Eq. (3) and has the DOF of 9. Sampling a random 9 A is achieved by independently sampling each 3 from a uniform distribution. We reorganize the entries in 9 A to an encoded vector = [ D , [ , … , n ], and take the encoded vector as the input of the network. For each encoded vector v, we simulate the diffraction efficiencies of corresponding grating structures with rigorous coupled wave analysis (RCWA) 37 . Under the wavelength of incident light J = 940 , the DOEs produce ,~• 6 ‚ As such, (v, K) is a training pair of the network model. To construct the whole dataset for the training of the network, we randomly sampled 15,000 encoded vectors v and performed the process outlined above. We split the dataset into two parts with 12,000 for training and the rest for validation. Since the grating patterns are represented in low-dimensional vectors v, simple neural network architecture is sufficient for accurate approximation of the diffraction efficiency. As shown in Fig. 3(b), we build an eight-layered fully connected neural network with input of encoded vector v and output of efficiency vector K. All the hidden layers of the network contain 128 neurons, and the nonlinear activations after each input and hidden layers are ReLU. Figure 4(a) presents the loss variation during the training process. The validation loss below 0.03 is achieved after 100 epochs of training.
In order to globally optimize the topology of the grating patterns, we adopt the modified evolution strategy (ES) 32 as shown in Fig. 3(c). The algorithm starts with sampling a population of random encoded vector v. Each vector is regarded as an individual in the population. These vectors are simulated through the neural network simulator. Based on the simulated results, the population is subsequently evaluated by certain design objective, and the elites of the population are selected for the following reproduction and mutation. The algorithm iterates until one of the individuals achieve the design criterion or the maximum iteration is reached. To design DOEs with various diffraction intensity distributions ƒ"… ∈ [0,1] t×t , we define the uniformity error of a design as: where ˆ~• 6 and ˆ~3 ‰ are the maximum and minimum intensity of the scaled diffraction intensity , where is the simulated performance of the designed beam splitter. Our objective is to minimize 1 ‡ ‡ of a DOE design given certain intensity distribution ƒ"… . As neural network is an approximation model, deviation of identified solution is unavoidable. Instead of augmenting the dataset for an improved network simulator, we choose to carry out the ES-based optimization for 150 times, simulate all the identified structures, and select the optimal solution with minimal  Fig. 5, the left plot is the tiled unit cell of the designed DOE, the middle image shows the simulated diffraction intensities of all orders, and the right plot compares the desired diffraction intensities (blue) and RCWA simulated results of the designed DOE (orange). With the help of our encoding method, the hybrid framework successfully identified DOE structures with diffraction intensity distribution matching the design objectives. Quantitively, the uniformity errors 1 ‡ ‡ of designed DOEs from Fig. 5(a) to 5(h) are 0.035, 0.045,  0.073, 0.068, 0.194, 0.036, 0.352, and 0.079, respectively. Since 1 ‡ ‡ is calculated through scaled intensity Š‹•OE1 and objective intensity ƒ"… is the denominator of Š‹•OE1 , the error is extremely sensitive to the objective intensities with small values. In the examples shown in Fig. 5(f), 5(h) and (i), diffraction intensities in some orders are required to be small. In such cases tiny disagreement of the actual diffraction and objective intensities induces large 1 ‡ ‡ . Nevertheless, the overall intensity distributions of all designs have excellent agreement with respect to the objectives, confirming the effectiveness of our encoding methods in the machine-learning-based inverse design approaches.

Conclusion
In summary, we have proposed an encoding method that is able to transform the topology of a photonic structure represented in discrete, high-dimensional, and binary pixelated images into a continuous sparse representation. We explored the properties of this encoding methods, such as continuity of the latent space and symmetric properties, and discussed the potential application of this method in the dimensionality reduction and data generation for the data-driven photonics optimization. As a case study, we utilized the encoding method and a deep learning-based optimization framework to design 3 × 3 DOEs with non-paraxial diffraction angle and various diffraction intensity distributions. The encoding method allows us to generate sufficient data for the optimization without explore unnecessary solution space. The encoded DOE represented in the low dimension also enhances the accuracy of the network and, as a result, increases the fidelity of the design.
Although the proposed encoding methods are aimed to assist data-driven inverse design of photonic structures, other derivative-free traditional optimizations 38 can also take advantage of the continuous low-dimensional representation of photonic structures. If global optimization is not required, local search algorithms can be applied to the latent vectors of the structures without generating redundant dataset. In the future, we expect to explore the application of the encoding method with both traditional and data-driven optimization approaches for the discovery and design of other photonic media such as photonic crystals and metasurfaces, and anticipate to consolidate the encoding method with deep generative models to produce complex patterns for the general inverse design of photonic structures. Fig. 1    . Light with a wavelength J = 940 is incident from the substrate side. Our aim is to optimize the grating pattern such that the central 3 × 3 order diffraction present various intensity distributions. The angle between 0 and +1 order diffraction is 21°. (b) Architecture of neural network simulator for predicting the diffraction intensities of DOEs. The input is the encoded vectors of the DOEs, and the output is the vector containing normalized diffraction intensities and maximum intensity of all diffraction orders. The network is an eight-layer fully connected networks, and each hidden layer has 128 neurons. (c) Schematic of the evolution strategy. Randomly generated latent vectors are evaluated by the network simulator. Elites whose performance are closed to the design objectives are selected for subsequent reproduction and mutation. The algorithm iterates until some encoded vectors satisfies the design objectives or the maximum iteration is reached.