Meta-model-based multi-objective optimization for robust color reproduction using hybrid diffraction gratings

We propose an efficient and versatile optimization scheme, based on the combination of multi-objective genetic algorithms and neural-networks, to reproduce specific colors through the optimization of the geometrical parameters of metal-dielectric diffraction gratings. To illustrate and assess the performance of this approach, we tailor the chromatic response of a structure composed of three adjacent hybrid V-groove diffraction gratings. To be close to the experimental situation, we include the feasibility constraints imposed by the fabrication process. The strength of our approach lies in the possibility to simultaneously optimize different contradictory objectives, avoiding time-consuming electromagnetic calculations. © 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
The generation of structural colors using metallic, dielectric or hybrid nanostructures is currently the subject of a significant amount of research works [1][2][3]. The interest on this mechanism for color generation arises from its great potential for the design of ultra-thin color-tunable devices, with high resolution and efficiency beyond the diffraction limit, for applications in optical security, imaging and display technologies or color filtering [4][5][6]. Also, structural coloration is an important complement to traditional colorant-based pigmentation mechanisms.
Very often, however, the chromatic response of the structures used for color generation is characterized through parametric studies based on the systematic variation of their different geometrical parameters. Although this approach has proven successful, it can be very time consuming and does not necessarily provide the optimal solutions that best match the searched colors. Several works making use of mono-objective optimization algorithms have been focused on the solution of this kind of inverse problem [7,8]. Nevertheless, when the studied structures are complex it is necessary to satisfy simultaneously several often contradictory objectives. In that case there is not a single optimal solution but multiple trade-off solutions among the searched objectives, making thus necessary to resort to more sophisticated strategies like multi-objective optimization. This approach has been explored by Wiecha et al to design colour pixels based on silicon nano-structures [9] and by J.Jung to simultaneously improve the performance and robustness of a plasmonic wave-guide [10]. However, since several iterations are needed to search for the optimal solutions, one strategy proposed to avoid time-consuming electromagnetic calculations is to replace them with approximations based on meta-models. Meta-Modeling is increasingly used due to the ongoing improvement of high-performance computer systems and particularly the evolution of artificial intelligence tools, which not only have become a powerful method for the modelling of physical phenomena, but also to understand, simulate and predict the optical or resonant response of the interaction between light and matter at the nanoscale [11][12][13][14][15][16].
In this work, we propose an efficient multi-objective optimization (MOO) scheme, based on the combination of Genetic Algorithms (GA) with a neural-networks-based meta-model, to tailor the chromatic response of hybrid metal-dielectric gratings. We make use of the meta-model to replace the time-consuming electromagnetic method, required to compute the spectral data related to the colors generated with the grating.
This contribution is organized as follows: In Sect. 2 we describe the geometry of the diffraction grating studied throughout this work. Also, we succinctly outline the rigorous electromagnetic method used to generate the spectra from which the colors are generated. Furthermore, we introduce and describe the meta-model used. In Sect. 3, we formulate the inverse problem to be solved and introduce the variables of interest to be optimized together with the constraints to be satisfied. We present our main results in Sect. 4, where we first illustrate the limitations of mono-objective optimization through an example and then we show the optimal solutions obtained with our MOO scheme. Furthermore, through a post-optimization procedure we discuss robustness of the optimal solutions found and their sensitivity to fabrication errors. Also, we study the influence of the grating's geometrical parameters on the colors generated. Finally, in Sect. 5 we present our concluding remarks.

Direct problem: meta-model-based generation of spectral data
Throughout this work we consider, without loss of generality, the periodically corrugated hybrid multilayered structure depicted in Fig. 1. The profile z = s(x) is assumed to be invariant along the y direction and is piece-wise defined by Eq. (1) : where h is the amplitude of the grating. The aperture of the V-groove region is defined as R α = T α /Λ with T α = |x 2 −x 1 | and Λ represents the grating's period. The region s(x)<z<s(x)+e A is filled with a uniform layer of aluminum (Al) whereas the region s(x) + e A <z<s(x) + e s is filled with Silicon Nitride (Si 3 N 4 ), a high refractive index material. We denote the Al and Si 3 N 4 layers' thicknesses e A and e S , respectively. The two semi-infinite regions z ≥ s(x) + e A + e S and z ≤ s(x) are assumed to be isotropic and homogeneous media with refractive indices equal to 1.5. This configuration corresponds to the case in which the fabricated structure is protected from surrounding wear with a transparent dielectric overlay.
For the sake of brevity, in our study we arbitrarily consider the TM polarization (p-polarization) where the magnetic field is parallel to the V-groove axis (Oy) and we assume that the plane of incidence is the plane (Oxz). Nevertheless, the TE polarization state could be also used.

C-Method
The theoretical spectral data related to the V-groove grating shown in Fig. 1 are computed using the C-Method, also known as Chandezon Method, through the commercial software McGrating [17]. The C-method is a well known and powerful tool especially suited for the computation of diffraction efficiencies of shallow sinusoidal or pseudo-sinusoidal gratings. It solves Maxwell's equations using the following coordinates transformation [18,19]: where s(x) is the grating's profile. Although this transformation complicates the functional form of Maxwell's equations, the surface roughness is flattened, thus making the matching of boundary conditions simpler. In the new (u,v,w) coordinates system, the problem is reduced to an eigenvalues problem whose resolution determines the propagative and evanescent modes of the system. Unlike other methods [20], it has been shown that the convergence of the C-method depends weakly on the incident field's polarization and material permittivities. Also, it is well suited for multi-layer systems as the one studied in this contribution [21]. However, an important limitation is the aspect ratio and the sharp edges of the profile, which may introduce discontinuities in the profile's derivative function that may generate numerical instabilities in Fourier space.
In our case, to insure fast convergence, to avoid spurious effects and to remove the hyper singularities at the sharp edges [22] (marked with red circles in Fig. 2), we rounded them using a spline function considering a radius of curvature of about 7 nm for the corners and 2 nm for the tip end. These values were selected from convergence and computing time considerations. The grating's profile with rounded edges can be defined by means of a Fourier series. It is noteworthy to mention that in a experimental situation this rounding effect is unavoidable. It must be mentioned that depending on the V-groove's depth and aperture it is necessary to readjust the number of harmonics to ensure convergence. As a matter of fact, the smaller the angle, the slower the convergence because the edges are steeper. For this reason, we introduced a convergence loop in our process to generate the reflectance spectra and we set a stop error criterion of 10 −3 between two consecutive iterations. Figure 3 shows the evolution of the absolute error |R 0 N − R 0 N−1 | calculated throughout the convergence loop as a function of the number of harmonics. The red line defines the stop criterion. For example, for a V-groove grating defined with Λ = 310nm and h = 120nm, we need 51 harmonics to ensure convergence for the grating with R α = 0.6, against 91 harmonics in the case of R α = 0.2, corresponding respectively to 75.5°and 28.9°V-groove angles. For a V-groove grating defined with Λ = 310nm and h = 120nm, we need 51 harmonics to ensure convergence for the one with R α = 0.6, against 91 harmonics in the case of R α = 0.2, corresponding respectively to 75.5°and 28.9°V-groove angles.

Meta-model
To better understand how the meta-model is used in this work, let us consider two sets X and Y of N elements x i and y i i ∈ [1 · · · N], respectively. Each element x i can be interpreted as a vector whose components are the geometrical parameters (R α , Λ, h, e A , e S ) i of a diffraction grating, that from now on we write in a more compact form as (p l ) i ; l ∈ [1 · · · 5]. On the other hand, y i is a vector whose components are the CIE L*a*b* chromatic coordinates associated to each grating belonging to X. The CIE L*a*b* space is a perceptually uniform color space particularly suited to human perception [23].
The meta-model can thus be interpreted as an approximation of the non trivial relationship, noted y i = f (x i ), between the geometry of the grating and the color it generates. Mathematically, the meta-model can be written as which, in continuity with our previous work [12], was obtained using an artificial neural network (ANN) as intensive learning meta-model [24,25]. In this contribution we consider three hidden layers with 100 neurons each. The weights are determined through the minimization of the mean square error on the training set and the learning method employed is the gradient descent, which is calculated by the back-propagation principle.
To train the meta-model we generated a color database, composed of 14000 elements, using the spectra numerically obtained with the C-method, considering TM-polarized light and zeroth order diffraction in reflection configuration. Throughout our numerical experiments we found that this size of the database was enough to minimize the average prediction error and, at the same time, to have a sufficient number of elements to generate the database in a reasonable time. Each geometrical parameter (p l ) l=1..5 lies within the lower and upper bounds shown in Table 1. These values were chosen taking into account industrial feasibility considerations.
The chromatic response of each structure in the database was calculated using the CIE 2deg color matching functions and the standard illuminant D65. The related perceived colors are depicted in Fig. 4 in the CIE L*a*b* space. We can notice that by adjusting the structure's parameters, we can reach a wide and vivid visible color gamut. The difference between the initial color database and the predicted one is shown in Fig. 5(a). We quantify the prediction error through the color difference ∆E f , f = ∆E(f (x i ), f (x i )) with the formula given by CIEDE2000 [26]. The histogram in Fig. 5(b) illustrates the performance of the meta-model prediction, we notice that 78% of the initial elements of the database are predicted with a ∆E f , f <2 and more than 98% with a ∆E f , f ≤ 4, a difference that is barely perceptible to the human eye. This result provides confidence on the accuracy of the meta-model and its use to replace the C-Method.

Inverse problem: color reproduction
In some applications where the goal is to generate true color images like, for example, in displays technologies or zero order devices (ZOD) for hologram security authentication images, a convenient approach often used is to define the requested colors and then to design the optimal structure that produces them (inverse problem). In this study, we aim at finding the optimal geometrical parameters of a structure, composed of three adjacent V-groove gratings, to best match the target color pixels labelled A, B and C in Fig. 6. An important fabrication constraint is that the three gratings must be simultaneously optimized on the same structure. Although each of them may have different geometrical parameters, they must share the same thicknesses of deposited materials, Al and Si 3 N 4 in the present work. The reason for this restriction is that during the fabrication process it is easier to deposit one or several uniform layers all over the sample, for example by thermal evaporation, than layers of different thicknesses.
Very often, this kind of V-shape gratings is fabricated through the replication of a pattern into a polymer film. The hard master stamp (rigid mold) can be manufactured, for example, by focused ion beam (FIB) milling process [27] or by the combination of photolithography with anisotropic etching [28]. During the replication process, the transfer of the master pattern to the polymer film can be done by roll to roll or UV casting nano-imprint methods [29].
An optimal structure, noted s k , should then be composed of the combination of three adjacent gratings [x A , x B , x C ] k , each of them characterized by 5 geometrical parameters [p 1j , p 2j , p 3j , p 4 , p 5 ] k ; j = A, B, C as shown in Table 2, subject to the constraints In order to relate the geometrical parameters of a grating with the searched color, we defined the objective functions g A , g B and g C in terms of the CIE2000 color difference formula [26] These objective functions measure the closeness between each target color A,B or C and each pixel's color computed with the meta-model. At first sight, one may intuitively think to formulate this multi-objective optimization problem as a mono-objective one to search for a single optimized solution, based on a weighted combination of all the objectives. Another more rigorous approach would be to take into account the different objective functions g A , g B and g C making use of a dedicated multi-objective optimization method.
It should be noted that the presence of multiple objectives in a problem gives rise, in principle, to a set of optimal trade-off solutions (known as Pareto-optimal solutions) instead of a single optimal solution. The Pareto front fully determines the whole set of potential choices that optimally satisfy the trade-offs among the different objective functions. Basically, in a multi-objective optimization scheme each structure s i is ranked by non-domination relationships based on its performance (g A , g B , g C ) s i . In our case, it was established that a structure s 1 dominates a structure s 2 if g i (s 1 ) ≤ g i (s 2 ); i = A, B, C and g i (s 1 )<g i (s 2 ) for at least one function g i . In the absence of any further information, one of these Pareto-optimal solutions cannot be said to be better than the others. In this case, a further processing is required to arrive at a single preferred solution. In general, from the set of optimal structures, one can select the solution that best fits the current design needs or provides insightful information on the physical phenomenon studied.

Results
At this stage it is convenient to illustrate the performance of our meta-model-based MOO scheme through some examples.

Limits of the mono-objective optimization
Let us define the objective functional h 1 to be minimized as where g A , g B and g C are the color differences described in Sect. 3. Although, for the sake of brevity we only present in Fig. 7 some typical results obtained when Eq. (6) is minimized. The optimal geometrical parameters are retrieved with a numerical precision of 0.1 nm. Although this resolution is hardly reachable during the fabrication, the results of extensive numerical experiments showed that neither the spectra nor the colors are significantly modified at this scale. It should be noted that any other combination of these functions could be used as well. However, we found throughout our numerical simulations that the optimization algorithm, which was the Particle Swarm Optimization (PSO) [30] coupled with the meta-model presented earlier, converged to similar results. That is, the convergence strongly depends on the form of the functional chosen and the colors reproduced are far from the target ones, as it can be seen in Fig. 7 for the color pixel B. We used the PSO and ANN numerical implementations respectively provided by Python through the libraries PySwarm and Keras [31,32].

Meta-model based optimization scheme (MOMmBO)
As mentioned in Section 3, one way to avoid the situation illustrated by the previous example is to use multi-objective optimization, where the objective functions constitute a multi-dimensional space. Our goal thus is to retrieve the geometrical parameters of the structure in Fig. 1 that produce the target colors depicted in Fig. 6. That is, we aim at finding a trade-off among the objective functions g A , g B and g C to be minimized (Section 3), subject to the constraints given by Eq. (4) and set to guarantee physically feasibility. Also, the upper and lower bounds of the searched geometrical parameters are shown in Table 1.
To search for the trade-off solutions, we used the Non-Dominated Sorting Genetic Algorithm (NSGA-II) for MOO [33], combined with meta-model described in Section 2. The NSGA-II is one of the most popular population-based algorithms used for evolutionary multi-objective optimization (EMO). It is a very efficient method for solving problems with a small number of objective functions. We used the implementation of NSGA-II provided in Python via the library Platypus [34] with 1000 initial individuals and 50000 iterations, the optimal 3D-Pareto front obtained is shown in Fig. 8(a). In order to assess the convergence behavior of the NSGA-II, we searched for the target colors in Fig. 6 considering different randomly generated initial populations. To facilitate the visualization, we show in Fig. 8(a) only two of the 3D-Pareto fronts obtained. This result suggests that the NSGA-II is not sensitive to initialization and presents a good reproducibility. Furthermore, despite the large number of evaluations needed because of the population size, the meta-model f makes possible to obtain a 3D-Pareto front in about 311 seconds.
We see that there are off-center solutions which favor two objective functions while giving poor performances to the third one. Nevertheless, at this stage, the solutions that will be retained will be mainly those which minimize the three functions simultaneously as g A , g B , g C <30. We display in Fig. 8(b), using the projected CIE a * b * plane space, the 1000 solutions obtained in the 3D-Pareto front. We see that the resulting front does not give completely aberrant solutions, as these remain about potentially acceptable colors. In Fig. 9(a), we show an example of solution found in the Pareto front obtained with g A = 13.02, g B = 11.42 and g C = 9.76 with the corresponding specular reflexion spectra in Fig. 9(b).   Fig. 9. (a) One of the solutions found on the 3D Pareto-front obtained with g A = 13.02, g B = 11.42, g C = 9.76 ; (b) Reflexion spectra of the optimal structure presented in Fig. 9(a).
The retrieved colors are by far closer to the target ones than those depicted in Fig. 7 obtained using the mono-objective optimization.

Robustness
An important issue in our study is to obtain robust solutions insensitive to the variations of the design variables, which correspond to the potential error introduced when the samples are fabricated. To identify the most robust solutions, noted s * k , among those located in the Pareto front, we defined a 5-dimensional hyper volume centered on s * k and with a radius of , where ± is the accuracy of the fabrication process. In our case, we fixed arbitrarily = [0.025, 2nm, 2nm, 2nm, 2nm]. This parameter is to be adjusted according to the estimated accuracy of the used manufacturing technology.
In this hyper-volume, we chose randomly M elements s * kl ; k = 1..P and l = 1..M, where P and M respectively represent the number of Pareto optimal solutions and the number of robustness testing points about each solution of the Pareto set. In our case, P = 1000 corresponded to the population size chosen for the optimization process and M was arbitrarily fixed to 500. We then calculated the predicted colors f (s * kl ) using the meta-model. A robust solution s * k will be therefore a solution with a small color difference Dv kl = ∆E( f (s * kl ), f (s * k )); ∀l ∈ 1..M with respect to each structure within the hyper-volume. Each individual s * kl within the hyper-volume was then ranked based on its performance Dv kl . We chose then to analyse the distribution of individual performances by calculating the mean of the best and the worst ranked elements. This analysis guarantees that during the fabrication process, independently of where the error is between ± , even in the worst configuration, the color will not change significantly. We show in Fig. 10 (yellow dots), the most robust optimal solutions. It remains to choose from this set the easiest solution to manufacture.
Before closing this section we emphasize, once again, the importance of the meta-model on the significant reduction of the computing time required in the robustness study.
This can be better visualized computing the number of required electromagnetic simulations, which is given by the product PxM, where P and M are the parameters previously defined. Since the estimated time required for one simulation is about 42 s and PxM= 500000 simulations, one easily finds that 243 days would be necessary to perform all the numerical simulations. It is important to notice that, at least for the present application, these computing times were enough to obtain the expected results. However, this should no be necessarily the case for problems in which a larger number of iterations may be required.

Pareto solutions analysis
The set of optimal solutions is used not only to achieve an optimal design, but also to obtain and extract additional information to better understand how the geometrical parameters affect the observed spectral variations. It also serves as a guide to understand the physical phenomena underlying the generation of colors.
We show in Fig. 11 the distribution of the optimal geometrical parameters p l l=1..3 obtained on the Pareto front for fixed values of the thickness e A and e S . We see in Fig. 11(a) and Fig. 11(b) that, even though it depends on the period Λ and the amplitude h, R α seems to be the geometrical parameter that introduces the most significant spectral differences between the three pixels. That is, for example, two structures belonging to the 3D-Pareto front can have the same period (Λ = 265nm for example) but two different colors (pixel A and B) if the parameter R α is different ( Fig. 11(a)). Furthermore, the grating depth h does not vary drastically between the three pixels ( Fig. 11(b)). Thus, at constant h and Λ, different colors can be obtained just by modifying the groove angle. This fact suggests that the angle of the V-groove governs the origin of the spectral variations observed between the three pixels. It has been shown that gap surface modes (GSPs) confined laterally between the tapered groove sidewalls can be excited in such structures. As it can be seen in Fig. 12, where near-field intensity maps have been computed for resonance wavelengths of the three optimal pixels'structures presented in Fig. 9(b), there is a strong field confinement in the z direction especially when decreasing the groove's angle (Fig. 12(c)). The tapered groove can be seen as a stack of infinitesimally thick metal-insulator-metal structure, with a variable insulator thickness along the z direction. Then, the lower the thickness, the more the effective refractive index and the confinement of the field are important [35,36]. However, if the V-groove angle is large, these GSPs modes can leak into surface plasmons on the side of the groove. It has been previously shown, for the case of V-grooves milled in a semi infinite metallic surface, that these resonant peaks are related to the formation of standing waves due to interference of counter propagating GSPs reflected by the groove bottom and opening [37,38]. Nevertheless, further studies are needed for the case of hybrid thin films as the ones considered in this contribution. However, this is out of the scope of the present work and it will be subject of a future publication. Fig. 12. Time averaged (colormap) and dispacement (black arrows) of the electric field at resonance wavelengths (a),(b),(c) of respectively the pixels A,B and C of the optimal structure presented in Fig. 9(a).

Concluding remarks
The research evidence presented in this contribution clearly shows that the use of modern optimization tools, prior to fabrication, provides an efficient way to tailor and to optimize the optical or resonant response of an optical device for a particular application. In this sense, within the frame of the present work, the use of a multi-objective approach, which clearly outperforms mono-objective strategies, opens the possibility to increase the complexity of the diffractive structures employed for color reproduction, thus leading to chromatic effects useful for the design of optically variable devices like those used for optical documents security, to give an example.
In the present work, a post-optimization study served to find those optimized diffraction gratings less sensitive to variations of the geometrical parameters, which is an unavoidable situation when the structure is fabricated even with modern laboratory equipment. Furthermore, the post-optimization stage made evident that among the geometrical parameters optimized, the groove's angle of aperture had the strongest effect on the generation of color.
The inclusion of a meta-model in the optimization process is a practical way to significantly decrease the computing time. As it has been shown throughout this work, the iterative nature of the multi-objective approach makes the optimization of the diffraction gratings extremely long when done with the rigorous electromagnetic method. It should be noted that generating the database to train and validate the neural network can be time consuming; however, this is done just once. Afterwards, the meta-model can be used without being necessarily included it in an optimization process.
At last, the modular structure of the optimization scheme presented in this work not only facilitates its numerical implementation, but it also makes of it a generic and suitable tool that could be easily used for the solution of inverse or optimization problems in other branches of optics and photonics.