3D super-resolution microscopy based on nonlinear gradient descent structured illumination

: Three-dimensional structured illumination microscopy (3D-SIM) is an essential tool for volumetric fluorescence imaging, which improves both axial and lateral resolution by down-modulating high-frequency information of the sample into the passband of optical transfer function (OTF). And when combining with the 4Pi structure, the performance of 3D-SIM can be further improved. The reconstruction results of generally used linear 3D algorithm, however, are lack of high-fidelity and proneess to generate artifacts. In this paper, we proposed a novel iterative algorithm based on gradient descent combined with a nonlinear optimizer, which can be applied to all 3D-SIM setups (including I 5 S setup). We verified through simulation that the proposed solution, termed as nonlinear gradient descent structured illumination microscopy (NGD-SIM), achieves more fidelity results which can reach the limitation of theoretical resolution improvement of SIM. Moreover, it can be firmly validated on simulation that this algorithm can effectively reduce the amount of raw data in the case of sinusoidal-pattern illumination, i.e., the algorithm doesn’t need five-step phase shifting; data with any number of phases can theoretically be reconstructed. Our method also provides the possibility to extend the application of sinusoidal-pattern illumination to any kind of interference fringe, which is generated by diversified types of illumination mode.


Introduction
Since it was proposed and invented, optical fluorescence microscopy has become an irreplaceable method for biomedical researches.However, because of the diffraction limit, the resolution of conventional microscopes is constrained to around 200 nm, which is not sufficient for developing research requirements.In order to go beyond the diffraction limit, diversified solutions have been developed in the last two decades.These methods, such as photoactivated localization microscopy (PALM) [1,2] and stochastic optical reconstruction microscopy (STORM) [3,4], rely on statistical computations and repetitive shootings to surpass the diffraction limit.And stimulated emission depletion microscopy (STED) [5] is based on scanning confocal microscopy to "reshape" the point spread function (PSF) of the system.Compared with what is mentioned above, structured illumination microscopy (SIM) [6][7][8] plays an essential role in live-cell imaging.STORM and PALM are supposed to use thousands of images to localize each emitter, which makes it difficult to observe living cells, owing to long-time imaging procedure and drift.Furthermore, the light intensity required in STED is much higher than SIM, and it may cause severe photobleaching and phototoxicity.In contrast, SIM takes much less time in the imaging step and uses significantly lower light to obtain images.
Due to the shape of Optical Transfer Function (OTF), which is confined by Numerical Aperture (NA) [9], in the spatial frequency domain, the lateral resolution is several times better than the axial resolution.Considering the limit of light collecting angle, which determines NA for a single objective lens, a second objective lens is used and placed in a symmetrical position with the first objective lens about the focal plane.By using two objective lenses, the light collecting angle is doubled, and the new OTF can be stretched in axial direction correspondingly.The method that uses two identical objective lenses was applied in fluorescence microscopy such as 4Pi-B confocal microscopy [10], I 2 M [11] and I 5 M [12].For SIM, by combining I 5 M and structured illumination, which is termed I 5 S [13], the axial resolution can be improved more than twice; that is, the OTF will be further stretched in SIM.
In order to reconstruct super-resolution images from SIM, numerous algorithms have been reported [6,[14][15][16][17].But nearly most of them aim at solving two-dimensional problems.In three-dimensional imaging, the most commonly used algorithm is the Wiener filtering strategy (later, we will use Wiener-Filter SIM to refer to this linear 3D-SIM reconstruction algorithm) [13,18], which is not accurate enough and usually produces artifacts.In this manuscript, we report a novel algorithm called Nonlinear Gradient Descent SIM (NGD-SIM) by utilizing a gradient descent mechanism with a nonlinear optimizer, which is widely used in deep learning and many other applications.A similar gradient descent method has been applied in two-dimensional Fourier ptychographic microscopy called PEFP to estimate the information of Interference fringes [19].Here we expand this idea to 3D-SIM to calculate 3D object iteratively.The nonlinear gradient descent algorithm will accelerate the convergence of the iterative process.The new method demonstrates that it can perform ideally in both sparse and continuous samples according to simulations.In the meantime, compared with former techniques, the resolution and quality of reconstructed images obtained by NGD-SIM can be improved to a better degree both laterally and axially.According to simulations on MATLAB, the best lateral and axial resolution can reach 102.5nm and 62.5nm respectively (comparing with Wiener-Filter SIM, which are 140nm and 85nm under the same empirical condition).Our algorithm, to some extent, can greatly reduce the amount of the raw image data used in reconstruction and improve the temporal resolution consequently.

Forward model of the imaging process
Three-dimensional structured illumination microscopy is a linear, translation-invariant (LTI) optical system, whose response can be described by Transfer Function (in optical system, it is named PSF) H(r) and convolution theory.The observed diffraction-limited data D(r) is the convolution of the pointwise product of the illumination pattern I(r) and the emitting object S(r) with the PSF.
Where r denotes real-space coordinate vector and * refers to convolution.Owing to the moiré fringe effect, the illumination pattern I(r) in Eq. (1) down-modulates the original undetectable high-frequency components of the object into the passband of the detection OTF support [20,21].And these high-frequency components can be straightforwardly extracted if the illumination pattern satisfies the following conditions [13,18]: First, the illumination pattern can be decomposed into a sum of a finite number of components, each of which is separated into a production of an axial and a lateral function.The specific expression is as follows: where m = 0, ±1, ±2; J m (r xy ) and I m (z) denotes lateral and axial illumination intensity in the realspace domain respectively.Second, each lateral function should be a simple harmonic function (i.e., J m (r xy ) can be expressed as e im(2πpr xy +ϕ) , which implied that Jm (k xy ) = δ(k xy − mp)e imϕ .Third, relative to the focal plane of the microscope, the illumination patterns are supposed to remain fixed when acquiring a volumetric data set by moving the region of interest of the sample to the focal plane along the z-axis.Under the above conditions, combining Eq. ( 1) and Eq. ( 2), The observed diffraction-limited data D(r) can be rewritten as Eq.(3), In Eq. ( 3), the specific form of J m (r xy ) has been given above, and here we give the form of I m (z).It needs to be clearly stated that J m (r xy ) has the same expressions for 3D-SIM and I 5 S, while the expressions of I m (z) are slightly different.The expressions of I m (z) for 3D-SIM are shown in Eq. ( 4) and I 5 S in Eq. ( 5), where p z 1 equals to 1  λ ex (1 − cos θ) and p z 2 equals to 1 λ ex .θ denotes the angle between the center beam and the side beam.λ ex denotes wavelength of excitation light.
The Fourier transform of Eq. ( 3) takes the form: where k denotes the coordinate vector in reciprocal space Equation ( 6) reveals that the acquired volumetric data is a mixture of the different frequency bands of the sample, one for each index m.If these different frequency bands can be moved to the correct position, it is equivalent to artificially expanding the support of the original OTF (synthetic OTF).However, the range of the synthetic OTF support in the Fourier domain depends on the distribution of frequency shift points formed by Ĩ(k) ( Ĩ(k) is the Fourier transform of the illumination pattern I(r), which is expressed as the accumulation of a series of δ functions, and each δ function represents a frequency shift point).For the general 3D-SIM (three-beam illumination, single objective lens reception) and I 5 S (symmetrically two-sided three-beam illumination, double objective lens reception), the illumination pattern I(r) is different, specifically expressed as different expressions of I m (z) (refer to Eq. ( 4) and Eq. ( 5)).Therefore, I 5 S will produce more frequency shift points in the Fourier domain than the 3D-SIM.At the same time, due to the dual objective lens reception, the detection OTF support of the I 5 S will be extended in the k z direction than the original OTF support of 3D-SIM.The above two effects combine to make the axial resolution of I 5 S higher than that of 3D-SIM.The corresponding original OTF, synthesized OTF, and Fourier domain frequency shift points distribution comparison diagrams of 3D-SIM and I 5 S are shown in Fig. 1.

Sample reconstruction of Wiener-SIM
The conventional sample reconstruction method firstly separates these information components by acquiring additional images with varying illumination pattern phases mφ to construct a set of five independent linear equations from which Dm (k) can be extracted.Then Dm (k) is moved back to their correct position, that is, ±mp and a linear and normalized Wiener filtering algorithm is utilized to stitch these components into a high-resolution image containing high-frequency information.
The Wiener-filter SIM stitched these components using Wiener filtering strategy and it can be realized by the following formula if Dm (k) has been solved.
Where S(k) is the estimate of the true sample information and ω 2 is the Wiener parameter.A(k) is an apodization function which is used to reduce the frequency components outside the synthesized OTF support to zero.

Sample reconstruction of NGD-SIM
Here we propose an iterative algorithm combined with a nonlinear optimizer based on gradient descent that does not require solving linear equations and can obtain a higher fidelity of reconstructed image while greatly reducing the volume of the raw data.
We regard the sample reconstruction process as a minimization problem and construct following loss function, where S est (r) denotes the estimation of the sample and P(r) denotes the captured volumetric data; ∥•∥ 2 is the square of the L2 norm.
The partial derivation of the loss function with respect to S est (r) is According to Eq. ( 9), a conventional gradient descent algorithm can be used to update S est (r), where k denotes the learning rate of which we recommend take the value as 1/3 when all physical parameters participating in the iteration have been normalized.In fact, the value of 1/3 is that we take the reciprocal of the Lipschitz constant L of the system (i.e.1/L) as the step size of the gradient descent.This choice is a conservative choice that must be able to converge.The explanation of Lipschitz constant and how to calculate Lipschitz constant are given in Appendix A.
To speed up the convergence rate, we learn from the strategy of machine learning and introduce Root Mean Square Propagation (RMSProp) [22] nonlinear optimizer into the gradient descent process.Then Eq. ( 7) will evolve into the following form, where we recommend a value of 0.99 for parameter β and 0.1 for parameter α. ε is set to 10 −7 by default and the initial value of V is set to zero.In fact, Eq. ( 11) calculates the cumulative sum of decayed gradient squared and β represents the decay rate.Equation ( 12) takes the reciprocal of the sum of decayed gradient squared as the final equivalent gradient which reduce the differences of update rate between different features, and uses an exponential factor α to control the overall update step size.Regarding the choice of parameters here, we refer to the choices in the original literature for β and ε, and we found that the α value (0.5) in the original literature does not make the algorithm converge.We enumerated the α parameters and chose 0.1 as the recommended value for α, which shows the fastest convergence rate.The effect of specific α parameters on the convergence speed and reconstruction results can be found in Appendix B. Refer to Appendix C for the comparison of convergence speed and reconstruction results between NGD-SIM and gradient descent SIM (GD-SIM) without a nonlinear optimizer.
Owing to the fact that the intensity of the sample is non-negative, a correction needs to be added at the end of each iteration.
The above reconstruction process indicates that the proposed algorithm does not require fivestep phase shifting to acquire different frequency bands of the sample Dm (k).In the simulation, the observed diffraction-limited data with only one phase is available to acquire better results than using Wiener Filter while more phase redundancy can improve the accuracy and noise tolerance of the reconstruction results.

Results and analysis of simulation
In order to validate the performance of our algorithm, several simulations have been done under the conditions of both sparse and continuous samples.In order to fit the practical experiments, relevant parameters are preset as close to the actual situations as possible: excitation wavelength λ ex = 640 nm, emission wavelength λ em = 670 nm, numerical aperture NA = 1.49, refractive index of the embedding medium n = 1.518, the aperture angle θ = 60°, The amount of phase shift each time φ = 2 m π, where m is the total number of phase shifting.These parameters can define the wave number k and OTF in the frequency domain correspondingly.The pixel size in the simulated continuous samples is set as 50 nm which is compatible with the pixel size of EMCCD and in the sparse samples is set as 10 nm in order to get a better quantitative resolution for comparison.To comparing with ordinary Wiener Filter, the simulated pattern will have three orientations (0°, 60°, 120°) and five-phase shifts to fit the requirements of Wiener Filter.Hence up to 15 raw images are recorded for each 3D sample.Furthermore, we also simulated the situation when the number of phase shifting is from one to five to prove that NGD-SIM can effectively reduce the amount of raw data.To obtain best axial resolution, all the simulated data are made from I5S imaging model which is more complex than general 3D-SIM.
We first apply NGD-SIM (using different number of phases) and Wiener Filter to 87 simulated sparse fluorescent 10nm-diameter beads to demonstrate the reconstruction ability.The results are shown in Fig. 2. The lateral slices (Fig. 2(a2)-(f2)) and profiles (Fig. 2(g1), (g2)) illustrate a significant improvement in lateral resolution by using NGD-SIM.With increasing the number of phases used, the fidelity of NGD-SIM performs better.Comparing with the full width at half maximum (FWHM) of Wiener Filter which is 140 nm, the FWHM of NGD-SIM reaches 125 nm, 123 nm, 122.5 nm, 120 nm and 102.5 nm when applying 1 to 5 phases respectively; In addition, the axial slices (Fig. 2(a3)-( f3)) and profiles (Fig. 2(h1), (h2)) demonstrate that striking improvement in resolution can achieve in both axial and lateral direction.In traditional Wiener Filter, the FWHM of axial profile is around 85 nm, nonetheless, by using different number of phases, it can reach 70 nm, 64 nm, 67.5 nm, 62.5 nm and 62.5 nm respectively.It is also worth finding that with increasing the number of phases, the out-of-focus information can be suppressed, and when using 15 images these blurred patterns are totally removed, which denotes the axial resolution improvement.In Fig. 2(h2), When using 1 phase (3 images), the position of profile peak is deviated slightly (as depicted in the sky-blue curve).This is because when the number of phase shifts is too small, the process of iteration will produce errors in the calculation of the position of a single point.Therefore, though the 1-phase reconstruction result of NGD-SIM outperforms the result when using Wiener Filter, it is best to use at least 3 phases (9 images) in the image reconstruction procedure in the case of comprehensive consideration of resolution and data volume.
For continuous and complex 3D objects, the wide-field simulated microtubules are restored by both two techniques (Fig. 3).The wide field image shows severe out-of-focus blurring which obscures the sample (Fig. 3(a1)).By using Wiener Filter (Fig. 3(c1)-(c3)), the blurring can be removed effectively, but there are still many structures that remain unresolved.In contrast to Wiener Filter, NGD-SIM performs much better in both axial and lateral direction especially when applying 3 or more phases in this iterative algorithm (Fig. 3(d1)-(d3), (e1)-(e3)): the out-of-focus information and artifacts are perfectly removed and the thickness of microtubules is much slenderer (as shown in the yellow outlined regions).NGD-SIM outperforms Wiener Filter in the position where fluorescence is densely distributed.Our solution also shows a distinguishable function when using only 1 phase (3 images).Though it does recover a few details which are even better than using Wiener Filter, the continuity of the whole image is still destructed (Fig. 3(f3)).Therefore, after considering the tradeoff between the amount of data in the algorithm and the effect of image reconstruction, we recommend that at least 3 phases of raw   data should be used in the algorithm to process the image, so as to obtain reliable super-resolution results while reducing the amount of data used.
In addition, in order to test the robustness of the proposed algorithm against noise, Poisson noise, which is related to photon count and the system itself, is added to raw data stacks of microtubules.As illustrated in Fig. 3(b2), (c4)-(f4), the resolution and quality of wide-field images are degraded due to the noise.On the contrary, however, it is obvious that the novel algorithm can still significantly improve the image quality, which is much better than the traditional method.As for noise from the environment, different degree of Gaussian Noise is added to the whole image stacks of simulated microtubules and thus makes the raw data has different Signal to Noise Ratio (SNR i ). Figure 4 shows the reconstruction results of these raw data by Wiener Filter and NGD-SIM.It is obvious that no matter how the noise of raw data changes, the reconstruction results of NGD-SIM are always more superior and obtain higher SNR (SNR r , as shown in Fig. 5) than using Winer Filter and show stronger robustness to Gaussian Noise.Even the images processed with 1 phase are better than traditional Wiener Filter.What must be taken into account is the choice of parameter ω 2 in Wiener Filter, so we choose ω 2 which derives the best result to compare with NGD-SIM.The detail of the relation between ω 2 and reconstruction performances of Wiener Filter is shown in Appendix D (Fig. 11).To quantitively measure the speed of NGD-SIM and Wiener Filter, both techniques were coded by MATLAB2020a and executed on the same device (Intel(R) Xeon(R) Gold 5120 CPU, 2.2 GHz, NVIDIA Quadro P6000, RAM 128GB).The size of a single frame image in the simulation is set as 301×301 and one 3D data will contain 301 frames.In order to meet the requirements of using GPU acceleration, we estimate that the memory of GPU needs at least 12G.The reconstruction process of Wiener Filter will take around 600s (10 minutes).Considering to achieve results with high-fidelity, we recommend presetting the number of iterations to 2000 when using NGD-SIM, which will take around 3000s (50 minutes), i.e., it will take 1.5s per iteration on average.

Conclusion and discussion
We proposed a novel iterative algorithm, namely NGD-SIM, based on gradient descent combined with a nonlinear optimizer called Root Mean Square Propagation (RMSprop), which allows us to reconstruct 3D raw data and obtain reconstruction images with higher fidelity for all kinds of 3D-SIM.Taking linear reconstruction algorithm for a reference, another distinguishing advantage of NGD-SIM is that it requires a smaller number of raw images but can achieve higher reconstruction resolution, which will greatly improve the imaging speed and temporal resolution of the system in actual imaging.The proposed method is proven to be applicable for any illumination pattern, including both sinusoidal and speckle illumination.Therefore, it is entirely possible to further reduce the number of captured imaged if the illumination pattern is well-designed, like multi-spot patterns formed by the interference of four orthogonal light beams.We believe NGD-SIM will provide a more effective and precise method in any 3D structured illumination microscopic system or spatial frequency modulation super-resolution Imaging setup, including I 5 S setup.

Fig. 1 .
Fig. 1.OTF support by 3D-SIM and I 5 S in 3D rendering.(a) The OTF support of conventional optical microscope.(b) The OTF support of I 5 S, compared with OTF using a single objective lens, is stretched in the kz axis.(c) Effective OTF support is produced by three-beam illumination in ordinary 3D-SIM.(d) The effective OTF support by illumination with six beams and detection through two opposing objective lenses.(e) The frequency shifts under the different illumination modes.The red dots denote extended frequency shift when using interference of three beams; The purple dots represent the extra frequency shifts for six beams by applying I 5 S, which will never be present in ordinary 3D-SIM (Three-beam interference).

Fig. 4 .
Fig. 4. single xy slices of reconstruction results of a complex sample by Wiener Filter and NGD-SIM using different phase (image) numbers under different degree of noises.The corresponding SNR of raw images (SNR i ) is marked on the far left.The SNR of recovered images (SNR r ) is marked on the upper right of each image.(a1)-(f1) Reconstruction results of Wiener Filter under different SNR i (different degree of Gaussian noises).(a2)-(f2) Results of NGD-SIM reconstructed with 5 phases (15 images) under different SNR i .(a3)-(f3) Results of NGD-SIM reconstructed with 3 phases (9 images) under different SNR i .(a4)-(f4) Results of NGD-SIM reconstructed with 1 phase (3 images) under different SNR i .

Fig. 5 .
Fig. 5. Reconstruction quality by Wiener Filter and NGD-SIM, measured by SNR (marked as SNR r ), with different SNR of raw images (marked as SNR i ) and different numbers of phases (images).

Fig. 6 .
Fig. 6.Reconstruction results of NGD-SIM (5 phases, 15 images) with different α.(a) A xy slice of 3D groundtruth simulated sample.(b)-(i) Reconstruction results of NGD-SIM with different α by 2000 iterations.Corresponding α is marked in the upper right corner of each image.

Fig. 7 .
Fig. 7. Reconstruction performances with different α parameter and different number of iterations.The performances are valued by Mean Square Error (MSE).The black dotted box is an enlarged view of MSE under different α with the number of iterations around 1500-2000 times.When α = 0.1, the reconstruction result is best.

Fig. 8 .
Fig. 8.Comparison of NGD-SIM and ordinary gradient descent algorithm (here marked as GD-SIM) for simulated fluorescent beads.(a) The result of NGD-SIM with 2000 iterations.(b1)-(b11) Results of GD-SIM with number of iterations from 1000 to 11000.The full width at half maximum (FWHW) of the beads in the green boxes under the influence of iterations is shown in Fig. 10.

Figure 8 and
Figure8and Fig.9are reconstruction results of NGD-SIM (2000 iterations) and ordinary gradient descent algorithm (GD-SIM, from 1000-12000 iterations).And Fig.10shows the MSE of reconstructed microtubules and FWHM of reconstructed beads as a function of iteration.It is obvious that NGD-SIM converges much fast than GD-SIM which denotes the better resolution and less processing time.

Fig. 9 .
Fig. 9. Comparison of NGD-SIM and ordinary gradient descent algorithm (here marked as GD-SIM) for simulated microtubules.(a) a xy slice of 3D microtubules sample.(b) Results of NGD-SIM with 2000 iterations.(c1)-(c6) Results of GD-SIM with number of iterations from 2000 to 12000.The MSE of reconstructed results under the influence of iterations is shown in Fig. X.

Fig. 10 .
Fig. 10.Performances of reconstruction by NGD-SIM and GD-SIM with different number of iterations.The performances of reconstructed microtubules and fluorescent beads are measured by MSE and FWHM respectively.As shown in the black dotted lines, it is obvious that the reconstruction effect of 2000 NGD-SIM iterations is comparable to the counterpart with 12000 iterations for both simulated microtubules and fluorescent beads.