Computational complex optical field imaging using a designed metasurface diffuser

Various speckle-based computational imaging techniques that exploit the ability of scattering media to transfer hidden information into the speckle pattern have recently been demonstrated. Current implementations suffer from several drawbacks associated with the use of conventional scattering media (CSM), such as their time-consuming characterization, instability with time, and limited memory-effect range. Here we show that by using a random dielectric metasurface diffuser (MD) with known scattering properties, many of these issues can be addressed. We experimentally demonstrate an imaging system with the ability to retrieve complex field values using a MD and the speckle-correlation scattering matrix method. We explore the mathematical properties of the MD transmission matrix such as its correlation and singular value spectrum to expand the understanding about both MDs and the speckle-correlation scattering matrix approach. In addition to a large noise tolerance, reliable reproducibility, and robustness against misalignments, using the MD allows us to substitute the laborious experimental characterization procedure of the CSM with a simple simulation process. Moreover, dielectric MDs with identical scattering properties can easily be mass-produced, thus enabling real-world applications. Representing a bridge between metasurface optics and speckle-based computational imaging, this work paves the way to extending the potentials of diverse speckle-based computational imaging methods for various applications such as biomedical imaging, holography, and optical encryption.


Simulations and design of the meta-atoms
The transmitted amplitude and phase of the meta-atoms with normally incident light were calculated using the rigorous coupled wave analysis technique [1]. The α-Si meta-atoms were 652 nm tall, and the square lattice constant was 500 nm. Given the high numerical aperture of the designed MD, a smaller lattice constant could be used to increase the scattering efficiency and reduce unwanted diffraction into the substrate [2]. Refractive indices of α-Si and fused silica for the operating wavelength of 850 nm in the simulation were 3.56 and 1.44, respectively. Side lengths of the meta-atoms, Dx and Dy, were varied in the simulation to achieve independent phase control to cover the full 2π range for x and y polarizations. Figures S1(b) and S1(c) show transmittance and transmission phase for both polarizations for a periodic array of the meta-atoms as functions of Dx and Dy. Then, we optimized Dx and Dy as functions of ϕx and ϕy [ Fig. 2(c) of the main text] to provide high transmission and the desired phase shifts. Since the MD is designed to operate in cross-polarization mode, only meta-atoms operating as HWPs are required (corresponding to the dashed black lines in Fig. S1(a) and main text Fig. 2(c)). The meta-atoms are rotated 45 degrees (with respect to the x and y axes) to change the state of polarization from horizontal to vertical, and modulate the phase through the method explained in the supplementary material of Ref. 3.

Design of the phase profile of the MD
To design the phase profile of the MD, the GS algorithm was exploited [4,5]. We performed 100 repetitions to get the phase profile of the MD. The initial phase profile was determined by random number generation with a uniform distribution in the 0-2π range. Each repetition process consists of four steps. First, the Fourier transform of the phase profile is calculated. Second, the amplitude in the Fourier domain is updated to be uniform inside the disk with a radius of NAk0, where NA is 0.6 and k0 is the propagation constant in vacuum. Third, the inverse Fourier transform of the updated Fourier domain distribution is performed. Fourth, we update the phase profile to have unity transmission inside the MD aperture. The final phase profile of the MD is acquired after 100 repetitions. Figure 2(d) in the main text shows the amplitude distribution of the finalized MD design in the Fourier domain.

Device fabrication
A 652-nm-thick layer of α-Si was deposited using plasma-enhanced chemical vapor deposition on a ~500-µm-thick fused silica substrate. For nano-patterning, a positive electron-beam (e-beam) resist (ZEP520A) was spin-coated on the sample (~300 nm). To avoid electrostatic charging during the e-beam lithography, a charge-dissipating polymer (aquaSave, Mitsubishi Rayon) was spin-coated on the e-beam resist layer. The metasurfaces were patterned by an e-beam lithography system (Vistec EBPG5000+) and developed in the ZED-N50 developer. A ~70-nm Al2O3 layer was deposited and then the nano-pattern was transferred to the Al2O3 layer by a lift-off process. The Al2O3 layer was exploited as a hard mask to etch the α-Si layer in a mixture of C4F8 and SF6 gases, and was then removed by a mixture of ammonium hydroxide and hydrogen peroxide. The metallic aperture was patterned by photo-lithography, deposition of a 100-nm-thick gold layer with a 10-nm chromium adhesion layer, and a lift-off process. To avoid confusion about the shape of the MDs shown in Fig. 2(e), it is necessary to note here that the MDs are square-shaped with 1.6-mm side lengths, and the gold apertures are circular with 1.6-mm diameters (i.e. the transmittance in the overlapping region is negligible, so the final device aperture is determined by the gold aperture). For all results in the main text, we treated the MD as circularly shaped with a 1.6mm diameter.

Computational procedure
The transmission matrix of the MD (T) in each specific condition was computed using the wave propagation method [6] and the calculated phase mask of the MD. The flow graph in Fig. S2 shows the procedure for a certain optical setup. The phase mask of the MD, distances between the optical elements, magnification and NA of the custom-built microscope, and the input/output mode conditions were used to compute T. First, each input mode is propagated by z1 to find its corresponding field distribution at the MD plane. Passing through the MD is modeled by multiplying the field and the a priori known phase mask of the MD. The fields are then propagated by z2 to the working distance of the objective lens. The microscope is modeled as a Fourier domain filter (with the 0.4 NA of the objective lens) followed by a Fourier domain up-sampling to take into account the magnification and the NA of the objective lens. Oversampling can happen because of the microscope if / > /2NA , where is the camera pixel period, Mag denotes magnification, and λ is the wavelength. The oversampling effect can be compensated by a proper Fourier domain downsampling. In our case, the speckle intensity pattern captured on a 500×500 array of 7.4-μm pixels centered at the optical axis are utilized in the retrieval process. Considering the 16.7 × magnification and the diffraction-limited spot size of 1.06 μm (determined by the NA of the optical system), the fields on the 500×500 camera pixel array can be effectively down-sampled to a 210 × 210 array of 17.6-μm pixels at the camera plane. The 210×210 array at the camera plane corresponds to a 210×210 array of 1.06-μm pixels at a plane far from the objective lens by the working distance of the lens. Finally, the simulated complex fields on the 210×210 output array form the columns of T, with each column corresponding to a single input mode (for a 60×60 array of input modes). The filtering and the up-sampling processes will be eliminated if a miniaturized CMOS image sensor and a thin linear polarizer are used instead of the microscope to capture the speckle pattern. Also, the down-sampling process can be eliminated by adjusting the magnification of the optical system. Using parallel processing on 6 cores of a multi-core CPU (Intel Xeon E5-2640), it took almost 2 hours to compute T. We expect the time to be reduced significantly with GPU and advanced parallel computing. We should also note that for each specific set of input/output parameters T should only be calculated once. The speckle correlation scattering matrix Z was used to retrieve the complex fields [7]. The definition of Z is shown in Eq. (1) in the main manuscript. Before calculating Z, it is necessary to perform the down-sampling for the speckle intensity patterns captured by the image sensor so that the oversampling effect is compensated. When Z has a rank of one (approximately), a single eigenvector constitutes the initial field retrieval estimate. In practice, the eigenvector corresponding to the largest eigenvalue forms the initial estimate.
The first retrieval process (including the computation of Z) takes most of the retrieval time (~22 seconds without parallel computing). To reduce the noise, we then performed the iteration procedure based on the modified GS algorithm proposed by Lee and Park [7]. Before the iterations, the pseudo-inverse of T (T -1 ) was calculated by singular value decomposition. Using T, T -1 , the first estimate of the target, and the captured intensities of the speckle pattern, the 20 iterations based on the GS algorithm were performed. The 20 iterations take ~8 seconds on average without any parallel computing.

Measurement procedure
Schematic of the optical setup used to measure the speckle patterns with the amplitude targets is shown in Fig. S5(a). An 850-nm semiconductor laser (Thorlabs, L850P010) was coupled to a single mode fiber for illumination. The fiber was connected to a collimator package (Thorlabs, F220APC-850). A linear polarizer (Thorlabs, LPVIS100-MP2) was placed in front of the collimator to confirm the horizontal (x) polarization state of the input, and a fiber polarization controller was used to maximize the power passing through the polarizer. A bi-convex lens with a 5 cm focal distance (Thorlabs, LB1471-B-ML) was used to confine the illuminated region on the targets. We used a custom-built microscope setup to capture the speckle pattern, which consisted of a 20 × objective lens (Olympus, LMPlanFL) with an NA of 0.4, a tube lens with a focal distance of 15 cm (Thorlabs, AC254-150-B-ML), a linear polarizer aligned along the y axis (Thorlabs, LPVIS100-MP2) to reject unscattered light, and a CCD camera (CoolSNAP K4, Photometrics). In front of the camera, an optical band pass filter (Thorlabs, FL850-10) was used to decrease the background noise and limit the bandwidth. Schematics of the optical setup used for holographic imaging with the MD are shown in Fig. 5(a). The optical setup was mostly similar to the setup described above and shown in Fig. S5(a). The bi-convex lens used to confine the illumination region was removed. Also, an aperture with a ~150-µm diameter (Thorlabs, P150H) was inserted at the position between the sample and the MD, where the complex field was retrieved. The schematics of the optical setup used to capture the reference images for comparison with the holographic images is shown in Fig. 5(d). Compared with the setup shown in Fig. 5(a), the MD and the linear polarizer in the custombuilt microscope are removed. In addition, the objective lens was moved to bring the sample in focus.

S2. MATHEMATICAL PROPERTIES AND NOISE TOLERANCE OF A RANDOMLY GENERATED T MATRIX
We investigated the noise tolerance of a randomly generated T matrix for comparison. The random transmission matrix TR was generated using the code in the supplementary materials of Ref. 7. Similar to the results shown in Fig. 3(b) and 3(c) in the main text, the GR matrix generated from TR and the eigenvalue spectrum of R † R / are shown in Figs. S8(a) and S8(b), respectively. As seen in Fig. S8(a), the columns of TR are almost perfectly uncorrelated. Moreover, Fig. S8(b) shows that the entries of TR are statistically independent [8,9]. Using TR, we investigated the noise tolerance in the same way used for noise tolerance study of the MD. Gaussian intensity noise with various energies was added to the speckle pattern generated by TR to adjust the SNR value. The results shown in Fig. S8(c-j) verify that the complex fields could be retrieved when the SNR value is equal to or greater than 5 when using TR.
Comparing the noise tolerance shown in Fig. S8 to the results shown in Fig. 5 for the MD, one can see that the complex fields can be retrieved for an SNR value of 1 only with the MD. This shows that despite the existing correlation between the neighboring input modes and the dependence between entries of T, the MD demonstrates higher noise tolerance compared to the randomly generated transmission matrix.

S3. NUMERICAL ANALYSIS OF ROBUSTNESS AGAINST MISALIGNMENT
We numerically investigated the performance of the MD and the method with axial and transverse misalignments. The location of the meta-diffuser shown in Fig. S10(a) is changed transversely or axially to study the effect of the misalignment. Thanks to the large memory-effect range of the meta-diffuser [4], the misalignment leads to relatively small changes in the speckle patterns. Referring to Figs. S10(b) and S10(c), the axial and transverse misalignments cause magnification/demagnification and parallel shift of the speckle patterns, respectively [10]. Thus, these changes in the speckle pattern affect the retrieved complex fields. A difference in the estimated and the actual distance between the aperture and the object results in an out-of-focus retrieved image as shown in Fig.  S10(e). Figure S10(f) shows that the out-of-focus images can be corrected simply through numerical propagation. In Fig. S10(g), the case of an error in the estimation of the distance between the MD and the image sensor degrades the results, but it has still almost negligible results if the error is less than 10 µm. As the results in Fig.  S10(h) show, a transverse misalignment of the MD approximately results in a transverse displacement of the speckle pattern and the retrieved image along with a gradient in the phase. It is worth noting that this shift itself can help in aligning the system by showing the misalignment type and direction. Also, Fig. S10(i) demonstrates that the images can be corrected and the phase gradient can be mitigated by cropping the speckle patterns in the corresponding shifted region if the image sensor has additional pixels. In summary, the numerical investigation demonstrates that our system is robust against axial and transverse types of misalignment and the results can be corrected or exploited for the aligning process. Moreover, the MD could be manufactured monolithically on top of a CMOS image sensor in the fabrication to minimize these types of misalignment, particularly the axial error between the MD and the CMOS image sensor.

S4. EFFECT OF THE NUMBER OF ITERATION ON COMPLEX FIELD RETRIEVAL
To explore how the number of iterations affects the complex field retrieval, we performed additional simulations by changing the number of iterations from 5 to 100 (i.e. 5/10/20/30/50/100 iterations) for both the numerically generated amplitude and phase targets [shown in Figs. 3(d) and 3(e), respectively], and the experimental target [shown in Fig. 4(a)]. In Figs. S11(a-f), it is clearly shown that there is no considerable difference after 20 iterations for both the numerically generated and the experimental targets. For the numerically generated amplitude and phase targets, we have also plotted the correlations between the original targets shown in Figs. 3(d) and 3(e) and the retrieved complex field maps shown in Figs. S11(a-d) versus the number of iterations. In Fig. S11(g), the plot clearly shows that the correlation is saturated after 20 iterations and reaches 0.986 just after 5 iterations.

S5. EFFECT OF THE NUMERICAL APERTURE AND THE MAGNIFICATION/FIELD OF VIEW OF THE OBJECTIVE LENS IN COMPLEX FIELD RETRIEVAL
We have performed additional experiments to investigate the effects of the objective lens on field retrieval quality. We used 10X (0.3 NA) and 50X (0.7 NA) objective lenses to compare with the results based on the 20X (0.4 NA) objective lens in the main manuscript. The NA of the objective lens determines the diffraction limited spot size corresponding to the effective sampling pixel size for the speckle pattern (i.e. effective output pixel size). In other words, the effective sampling pixel size of the 10X and 50x objective lenses are 1.42 µm and 0.57 µm, respectively. In order for the comparison of the results to be fair, we fixed not only the input pixel size at 2.5 µm, but also the number of output modes (M) and input modes (N) at 44100 and 3600, respectively. With the 10X objective lens, a large 280×280 µm 2 central area of speckle pattern is sampled by a 210×210 array of 1.42-μm pixels. On the other hand, a smaller central area of 120×120 µm 2 of the speckle pattern is sampled by a 210×210 array of 0.57-μm pixels with the 50X objective lens. The targets shown in Fig. 4(a) and (b) were exploited again to compare with the results based on the 20X objective lens shown in Figs. 4(eh). First of all, the retrieved complex maps based on the 10X objective lens shown in Fig. S12(a) are comparable with the targets shown in Fig. 4(a) and (b), as well as the retrieved results based on the 20X objective lens shown in Fig. 4(e-h). However, the retrieved amplitude and phase maps based on the 50X objective lens shown in Fig. S12(b) verify that the targets cannot be retrieved well with the 50X objective lens under these conditions. The experimental results with the three different objective lenses shown in Figs. 4(eh), S12(a), and S12(b) clearly demonstrate that it is better to capture a large area of the speckle pattern with lower resolution than a small area of the speckle pattern with higher resolution (assuming fixed numbers of input and output modes). Finally, we increased the number of the output modes from 44100 to 176400, keeping the input pixel size the same for the 50X objective lens. This means that the 240×240 µm 2 central area of the speckle pattern is sampled by a 420×420 array of 0.57-μm pixels. In these new conditions, the complex fields are successfully retrieved using the 50X objective as shown in Fig. S12(c). Thus, if the sampling area is determined, it is better to increase the number of output modes with smaller output pixel size. In this case, the ratio between the number of output and input modes increases, resulting in a more accurate field retrieval as well as a rise in the computational time.  The virtual optical setup used to calculate T. z1 and z2 represent the distances between the MD and the input and output planes, respectively. The optical components in the green dashed box (that magnify the speckle pattern and image it onto the image sensor) are optional and can be dropped for miniaturization. (b) Flow graph of the process used to calculate T. The speckle pattern corresponding to each input mode (pixel) is calculated using the wave propagation technique and the phase mask of the MD. The objective and tube lens are modeled as a low pass filter and a Fourier up-sampling. If there is an oversampling effect caused by high magnification, Fourier down-sampling can be performed to compensate the oversampling effect. The complex field of the speckle pattern corresponding to each input mode constitutes a column in T. The low pass filtering and up-sampling processes in green dashed box will be eliminated if the optical components in green dashed box in (a) are dropped. Moreover, the down-sampling process in red dashed box will be eliminated if the magnification of the optical system is adjusted to prevent the oversampling effect. Fig. S3. Visualization of G -I matrix. The main diagonal of G is always a unit diagonal by definition. Thus, the main diagonal part is removed for visualization. It clearly shows that the correlations between neighboring modes are close to the 0.18 and the correlations between further modes are suppressed quickly. The average of the correlations between two arbitrary input modes is 0.006.     Fig. 4(a).   Schematic drawing of the axial misalignment and its effect on speckle pattern. ∆z1 and ∆z2 are the errors in estimated distances between the input plane, the MD, and the speckle plane. Right: illustration represents that the speckle is approximately magnified or de-magnified as a result of the axial misalignment. (c) Schematic drawing of the transverse misalignment and its effect on the speckle pattern. Right: illustration represents that the transverse misalignment leads to an approximate shift of the speckle pattern parallel to the misalignment direction. (d-i) Retrieved amplitude (top) and phase (bottom) distributions. The SNR value is fixed at 100 to account for the noise in real systems. An amplitude mask is used as the target for the analysis. All coordinates and numbers in the figure are given in microns. (d) Retrieved complex field without any misalignment (e) Retrieved results for ∆z1 ≠ 0. The numbers above the images denote the value of ∆z1. (f) Corrected results from the fields in (e) through numerical propagation. (g) Retrieved results for ∆z2 ≠ 0. The numbers above the images denote the value of ∆z2. (h) Retrieved results at (∆x, ∆y) ≠ (0, 0). The coordinates above the images show the values of ∆x and ∆y. (i) Corrected results for (∆x, ∆y) ≠ (0, 0). The correction is achieved by cropping the speckle patterns in the properly shifted regions at the output plane. From the first to the fourth, the results are the corrected images from the results shown in (h), in the same order. The fifth column of images shows the corrected results for the case of (∆x, ∆y) = (-50, 50). Although the amplitudes are corrected well by cropping the speckle patterns in the shifted region, the correction process is accompanied with a relieved linear gradation in the phase maps.