Deep-ROCS: from speckle patterns to superior-resolved images by deep learning in rotating coherent scattering microscopy: supplement

Rotating coherent scattering (ROCS) microscopy is a label-free imaging technique that overcomes the optical diffraction limit by adding up the scattered laser light from a sample obliquely illuminated from different angles. Although ROCS imaging achieves 150 nm spatial and 10 ms temporal resolution, simply summing different speckle patterns may cause loss of sample information. In this paper we present Deep-ROCS, a neural network-based technique that generates a superior-resolved image by efficient numerical combination of a set of differently illuminated images. We show that Deep-ROCS can reconstruct super-resolved images more accurately than conventional ROCS microscopy, retrieving high-frequency information from a small number (6) of speckle images. We demonstrate the performance of Deep-ROCS experimentally on 200 nm beads and by computer simulations, where we show its potential for even more complex structures such as a filament network.


The optical setup of ROCS
In this section we describe the optical setup of the ROCS microscope 1,2 used in this work. The setup is a 4f system operated in a total internal reflection (TIR) dark-field regime. The light source in our experiments is a focused laser at a wavelength of 405 nm. A pair of galvanometric scan mirrors (SM) (S-8210M, Sunny Technology, Beijing, China) deflects the laser beam along a circle in the back focal plane (BFP) of the objective lens. A polarizer then ensures azimuthal polarization for all illumination directions. Next, the laser beam passes through a tube lens with a focal length of 200 mm (Thorlabs, Newton, NJ, USA). A beam splitter (BS) transmits 70% of the light to the detection path and reflects 30% to the illumination path. The illumination beam passes through the objective and hits the sample at an oblique angle, in the total internal reflection regime. The reflected light is blocked by a diaphragm, as in dark-field mode, and the scattered light passes through an additional tube lens and is collected by an sCMOS camera.
Figure S1: ROCS microscopy optical setup. The 2D scan mirror (SM) deflects the laser beam to the desired positions on the objective lens such that the light hits the sample at an oblique angle and from different azimuthal directions. The reflected light is blocked by a diaphragm and the scattered light is collected by the sCMOS camera.
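To make the scan geometry concrete, the following is a minimal sketch of how the beam targets on the BFP circle could be computed for a set of azimuthal illumination directions. This is our own illustration, not the control code of the setup; the illumination NA and objective focal length used here are placeholder assumptions, and the BFP radius formula assumes an objective obeying the sine condition.

```python
import numpy as np

def bfp_scan_positions(n_angles=6, na_ill=1.4, f_obj_mm=1.8):
    """Target (x, y) positions in the objective back focal plane [mm] for
    n_angles equally spaced azimuthal illumination directions at a fixed
    oblique polar angle set by na_ill (BFP radius = f_obj * NA under the
    sine condition).  na_ill and f_obj_mm are placeholder values."""
    r_bfp = f_obj_mm * na_ill
    phi = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    return r_bfp * np.cos(phi), r_bfp * np.sin(phi)

# Six azimuthal directions, as used for Deep-ROCS in this work
x_bfp, y_bfp = bfp_scan_positions(n_angles=6)
```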

Bead simulation for neural network training
In this section we describe the bead simulations in detail. First, we set several parameters that control the simulated bead data set: field of view (FOV) = 256x256 pixels, pixel size of 37 nm, number of simulated illumination angles = 1-72, and a number of beads in the range [400, 1000], which bounds the possible number of beads in a simulated experiment. In each experiment we sample the number of beads uniformly from the given range and iteratively choose each bead position in the FOV while maintaining a distance of 3 pixels from the edge and 5 pixels between beads. Finally, to account for the finite bead dimensions, we convolve the resulting simulated positions with a disk filter h whose radius matches the experimental bead size. The resulting simulation can be seen in figure 3 of the main text.
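A minimal NumPy sketch of this placement procedure is given below. The original data generation was done in Matlab; the function names, the rejection-sampling loop, and the use of scipy for the convolution are our own illustration of the described steps.

```python
import numpy as np
from scipy.ndimage import convolve

def simulate_bead_positions(fov=256, n_beads_range=(400, 1000),
                            edge_margin=3, min_dist=5, rng=None):
    """Sample a bead count uniformly from n_beads_range and place beads at
    random pixel positions, keeping edge_margin pixels from the border and
    at least min_dist pixels between beads (rejection sampling)."""
    rng = np.random.default_rng() if rng is None else rng
    n_beads = rng.integers(n_beads_range[0], n_beads_range[1] + 1)
    positions = []
    while len(positions) < n_beads:
        cand = rng.integers(edge_margin, fov - edge_margin, size=2)
        if all(np.hypot(*(cand - p)) >= min_dist for p in positions):
            positions.append(cand)
    img = np.zeros((fov, fov))
    for r, c in positions:
        img[r, c] = 1.0
    return img

def disk_kernel(radius):
    """Binary disk filter h giving each bead a finite extent."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return (x**2 + y**2 <= radius**2).astype(float)

# A radius of ~3 pixels roughly matches a 200 nm bead at a 37 nm pixel size.
bead_map = simulate_bead_positions()
bead_image = convolve(bead_map, disk_kernel(radius=3), mode="constant")
```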
Next, we set the parameters of the optical setup to match the experimental setup discussed in section 2.3 of the main text. The simulated light source wavelength is 405 nm, and the objective lens numerical aperture is 0.7 for the observed images and 1.17 for the ground truth images. For a more realistic bead simulation, we added imaging effects (such as noise and defocus) according to the "Simulation of coherent TIR and ROCS microscopy" section.

Filament simulation
The filament simulation is based on previous work by A. Shariff et al. 3. The model parameters implemented in our simulation are the elongation step length, the number of filaments, the collinearity parameter, and the mean and standard deviation of the underlying normal distribution; no centrosome or membrane were considered. After the filament trajectories are set, we plug them into our ROCS microscopy simulation to visualize the data as it would appear in a ROCS experiment. The last part of this simulation is relevant for the ground-truth image only: we saturate the imaged filaments to obtain an almost binary image of filament structure (high intensity) and background (low intensity). This pseudo-binarization enables the neural network to infer the filament structure and avoids intensity accumulation at filament crossings. The final output can be seen in figure 6.
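A simplified sketch of such an elongation model is shown below. It follows the general idea of Shariff et al. 3 (random starting point, repeated elongation steps whose direction is either kept or perturbed); all parameter values, the angle-perturbation rule, and the function names are our own placeholders rather than the values used in the paper.

```python
import numpy as np

def simulate_filament(n_steps=200, step_len=1.0, collinearity=0.9,
                      angle_sigma=0.3, fov=256, rng=None):
    """Grow one 2D filament by repeated elongation steps: with probability
    `collinearity` the previous direction is kept, otherwise the direction is
    perturbed by a normally distributed angle (mean 0, std angle_sigma).
    All parameter values here are placeholders."""
    rng = np.random.default_rng() if rng is None else rng
    pos = rng.uniform(0, fov, size=2)            # random starting point (row, col)
    theta = rng.uniform(0, 2 * np.pi)            # random initial direction
    trajectory = [pos.copy()]
    for _ in range(n_steps):
        if rng.random() > collinearity:          # occasional change of direction
            theta += rng.normal(0.0, angle_sigma)
        pos = pos + step_len * np.array([np.cos(theta), np.sin(theta)])
        pos = np.clip(pos, 0, fov - 1)           # keep the trajectory inside the FOV
        trajectory.append(pos.copy())
    return np.array(trajectory)

def filament_ground_truth(n_filaments=10, fov=256, rng=None):
    """Rasterize several filaments and pseudo-binarize the result:
    filament pixels get a high constant value, background stays at zero."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.zeros((fov, fov))
    for _ in range(n_filaments):
        for r, c in simulate_filament(fov=fov, rng=rng).round().astype(int):
            img[r, c] = 1.0
    return img
```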

Training procedure
The training phase of the neural network for the bead experiment is divided into two main stages: i) basic network training and ii) transfer learning. In the first stage we generated simulated bead data using Matlab, as described in the "Bead simulation for neural network training" section above. We deliberately generated very dense bead patterns to improve the network's ability to distinguish between background and real signal. We then trained the neural network on the simulated data for 5 epochs. In the second stage we took the TIFF images captured in our experiments and cropped them to a size of 256x256 pixels. We trained the neural network again, this time on the experimental data, using the final weights from the first stage as the weight initialization for the second stage. We trained the neural network for 25 more epochs in the transfer-learning stage. An important point to note is that when we repeated the first training stage for more than 5 epochs (10, 15, etc.), the performance of Deep-ROCS degraded. We assume this happened because the network weights became specialized to reconstructing the simulated data (which looks more synthetic) and could not adapt to the more realistic data seen in the experiments.
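A minimal PyTorch-style sketch of this two-stage procedure is given below. The supplement does not specify the framework, architecture, loss, or optimizer, so the placeholder single-layer network, the L2 loss, the Adam optimizer, and the dummy data loaders are all our own assumptions; only the 6-image input, the 256x256 crops, and the 5 + 25 epoch split are taken from the text.

```python
import torch

# Placeholder network: in practice a deeper (e.g. U-Net-like) architecture maps
# the stack of 6 differently illuminated speckle images to one output image.
model = torch.nn.Conv2d(in_channels=6, out_channels=1, kernel_size=3, padding=1)
device = "cuda" if torch.cuda.is_available() else "cpu"

def dummy_loader(n_samples=8):
    """Stand-in for the real data loaders (simulated beads / experimental crops)."""
    x = torch.randn(n_samples, 6, 256, 256)   # 6 speckle images per sample
    y = torch.randn(n_samples, 1, 256, 256)   # high-NA ROCS target
    ds = torch.utils.data.TensorDataset(x, y)
    return torch.utils.data.DataLoader(ds, batch_size=2)

def train(model, loader, epochs, lr=1e-4):
    """Simple supervised training loop with an L2 loss (illustrative choice)."""
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for speckle_stack, target in loader:
            opt.zero_grad()
            loss = loss_fn(model(speckle_stack.to(device)), target.to(device))
            loss.backward()
            opt.step()

# Stage 1: basic training on simulated beads, 5 epochs only (more epochs hurt transfer).
train(model, dummy_loader(), epochs=5)
torch.save(model.state_dict(), "stage1_weights.pt")

# Stage 2: transfer learning on experimental 256x256 crops,
# initialized from the stage-1 weights, for 25 more epochs.
model.load_state_dict(torch.load("stage1_weights.pt"))
train(model, dummy_loader(), epochs=25)
```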
The first stage of the training took approximately 40 minutes per epoch and the second stage approximately 1.5 hours per epoch on a standard workstation equipped with 16 GB of memory and an Intel(R) Core(TM) i7-8700 3.20 GHz CPU (a total of ~40 hours). When we trained the neural network on a single NVIDIA Titan RTX GPU with 24 GB of memory, the whole training phase was 5 times faster (~8 hours). In the case of the filament simulation, we used a single-stage training phase in which we trained the neural network on simulated data for 30 epochs. The training data was simulated according to previous work 3, as described in the "Filament simulation" section. We noticed that binarizing the ground-truth image of the filaments, so that the background has zero value and the filament locations have a high constant value, helped accelerate the convergence of the training phase. Training on the filament data with a single NVIDIA Titan RTX GPU with 24 GB of memory took approximately 12 hours.

Simulation of coherent TIR and ROCS microscopy
Simulating optical scattering by spherical beads can be solved rigorously with a vectorial diffraction model incorporating Mie diffraction theory 4. For this work we used a computationally simpler scalar diffraction model (similar to the Born approximation 5), which preserves the main effects in the coherent TIR images. To create a single set of coherent TIR images, we first simulate the spatial polarizability distribution $\alpha(\mathbf{r})$ in 3D object space, where the entire sample is assumed to lie at the same axial position (on the coverslip) and each scatterer occupies a 3D sphere according to its volume, with an amplitude corresponding to its polarizability 2, namely
$$\alpha(\mathbf{r}) \;=\; \sum_j \alpha_j\, \mathbb{1}\!\left(|\mathbf{r}-\mathbf{r}_j| \le R_j\right),$$
where $\alpha(\mathbf{r})$ and $\mathbf{r}_j$ are the volumetric polarizability distribution and the center positions of the scatterers making up the sample, and $R_j$ is the radius of the $j$-th scatterer. The positions considered in this work are either randomly scattered (e.g., for beads) or lie along a prespecified shape, e.g., a filament. This volumetric distribution is multiplied by a decaying amplitude (normalized to 1 at the axial position $z = 0$, which corresponds to the coverslip) that stems from the TIRF illumination at a TIRF angle of $\theta_{TIRF} = \sin^{-1}\!\left(\mathrm{NA}_{ill}/n_{oil}\right)$, where $\mathrm{NA}_{ill}$ is the numerical aperture of the illumination path and $n_{oil}$ is the refractive index of the immersion oil. To reduce computation time, we solve the optical propagation in the system by a 2D Fast Fourier Transform and add a depth-encoding phase $\Phi_{depth}(z) = k_0\, n\, z \cos\theta$ to each z slice, under the assumption that the changes in z are much slower than the focal length of the detection objective; here $k_0 = 2\pi/\lambda$, $n$, and $\theta$ are the free-space wavenumber, the refractive index, and the propagation direction in the medium (water or air), respectively. We define a Coherent Transfer Function (CTF)
$$\mathrm{CTF}(k_x, k_y) \;=\; P(k_r)\, e^{\,i\,\Phi_{def}(k_r)},$$
where $k_r = \sqrt{k_x^2 + k_y^2} = k_0\, n \sin\theta$ is the radial coordinate in Fourier space and $k_x$ and $k_y$ are the spatial frequencies of the scattered photon. $P(k_r)$ is the aperture function of the CTF according to the detection numerical aperture $\mathrm{NA}_{det}$, and $\Phi_{def}(k_r)$ is the defocus phase, which is a function of the objective Nominal Focal Plane (NFP) position, with $\theta_{imm}$ being the propagation angle in the immersion medium, determined by $\theta$ according to Snell's law.
A correction according to Rayleigh-Gans-Debye theory 6, which is effectively a "Gaussian-like" amplitude low-pass filter, is also added to the CTF. Each image in the set is illuminated at a different azimuthal angle $\Theta$, which adds a lateral phase ramp to the object. The spectrum of a single image is obtained by multiplying the spectrum of the laterally phase-shifted object by the CTF, and the recorded image is then created by another 2D Fourier transform according to the used modality; e.g., for bright field the reference field is added to the scattered field before detection. To account for measurement noise, we added the camera noise statistics and assumed the incoming signal is Poisson distributed after normalizing the image to a desired photon count. To improve the robustness of our method, we added variability to the simulated images in the form of phase aberrations and noise. The phase aberrations consist of a random defocus position for each set of images (normally distributed around the focus at the center of the bead) and a randomly rotated astigmatic phase added to the CTF.
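A highly simplified, single-slice NumPy sketch of this image-formation model is given below. It keeps the main ingredients described above (lateral illumination phase ramp, hard-aperture CTF with a defocus phase, 2D FFT propagation, Poisson shot noise) but omits the TIR decay, the Rayleigh-Gans-Debye correction, and the astigmatic aberrations; the function name, parameter values, and exact form of the defocus phase are our own simplification rather than the authors' implementation.

```python
import numpy as np

def coherent_tir_image(obj, pixel_nm=37.0, wavelength_nm=405.0, na_det=0.7,
                       n_med=1.33, theta_ill_deg=65.0, phi_deg=0.0,
                       defocus_nm=0.0, photons=1e4, rng=None):
    """Simulate one coherently illuminated image of a thin (single z-slice)
    object.  obj is a 2D complex polarizability map."""
    rng = np.random.default_rng() if rng is None else rng
    n = obj.shape[0]
    k0 = 2 * np.pi / wavelength_nm                       # free-space wavenumber [1/nm]
    kx = 2 * np.pi * np.fft.fftfreq(n, d=pixel_nm)       # angular spatial frequencies
    KX, KY = np.meshgrid(kx, kx, indexing="ij")
    kr = np.sqrt(KX**2 + KY**2)

    # Oblique illumination at polar angle theta_ill and azimuth phi:
    # adds a lateral phase ramp to the object.
    theta = np.deg2rad(theta_ill_deg)
    phi = np.deg2rad(phi_deg)
    k_ill = k0 * n_med * np.sin(theta)
    x = np.arange(n) * pixel_nm
    X, Y = np.meshgrid(x, x, indexing="ij")
    obj_ill = obj * np.exp(1j * k_ill * (np.cos(phi) * X + np.sin(phi) * Y))

    # Coherent transfer function: hard aperture (detection NA) + defocus phase.
    aperture = (kr <= k0 * na_det).astype(float)
    kz = np.sqrt(np.maximum((k0 * n_med) ** 2 - kr**2, 0.0))
    ctf = aperture * np.exp(1j * kz * defocus_nm)

    # Propagate: object spectrum * CTF, back to image space, detect intensity.
    field = np.fft.ifft2(np.fft.fft2(obj_ill) * ctf)
    intensity = np.abs(field) ** 2
    intensity *= photons / max(intensity.sum(), 1e-12)   # set the photon budget
    return rng.poisson(intensity).astype(float)          # add shot noise
```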

Resolution quantification
In this section we examine more quantitatively the resolution improvement of Deep-ROCS on the data set used in this work.
Quantifying resolution in coherent microscopy is less straightforward than in incoherent imaging, due to the strong dependence of the signal on the sample's phase properties. Therefore, we define the resolution by measuring the spatial frequency coverage of the reconstructed image 1,7. In the presence of noise one cannot simply search for non-zero values in the 2D Fourier transform of the image; therefore, we define an energy threshold that is meant to find the transition from signal-related to noise-related frequency content. The imaging resolution is then obtained from the maximal spatial frequency $k_{max}$ by the division

$$d \;=\; \frac{1}{k_{max}}.$$
In figure S2a,b we show the 2D Fourier transforms of a single image of an experimentally measured bead sample reconstructed by ROCS and by Deep-ROCS, respectively. To better visualize the frequency content of the Fourier transforms, we saturated the images at spatial frequencies whose intensity exceeds 0.1% of the maximal value. The decision rule for the maximal spatial frequency of the ROCS and Deep-ROCS reconstructions is shown in figure S2c. We calculated the energy of the saturated Fourier transform inside a circle centered at the (0,0) frequency and normalized it by the area of the circle. As the circle radius increases, the normalized energy decreases because of the transition from signal to noise. We chose the 85th percentile of the normalized energy values as an arbitrary threshold to distinguish between signal and noise: for each circle radius we calculated the normalized energy and set the maximal spatial frequency as the largest radius whose normalized energy still exceeded this threshold. The conversion from circle radius to the matching spatial frequency is $k_{max} = r \cdot \Delta k$, where $\Delta k$ is the frequency resolution in our experiment and $r$ is the radius chosen by the 85th-percentile threshold. The calculated radii for ROCS and Deep-ROCS were 48 and 70 image pixels, respectively, corresponding to a ROCS resolution of ~200 nm (similar to the results reported in previous ROCS papers), compared to ~150 nm for Deep-ROCS. To test the robustness of this arbitrary threshold, we varied the percentile threshold between the 80th and 90th percentile of the normalized energy values; over this range we obtain resolutions of 190-210 nm for ROCS and 135-155 nm for Deep-ROCS. It appears that Deep-ROCS can extend the spatial frequency coverage because it is trained to reconstruct images with a frequency content similar to that of ROCS microscopy with a higher NA. Moreover, the improved contrast in figure S2b suggests that Deep-ROCS also denoises the reconstructed image, since we train Deep-ROCS to reconstruct sharper, less noisy, high-NA ROCS images.
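A sketch of this frequency-coverage estimate is given below. The 0.1% saturation, the circle-normalized energy, the 85th-percentile threshold, and the 37 nm pixel size follow the description above; the function name, the exact order of operations, and the radius scan are our own illustrative choices.

```python
import numpy as np

def rocs_resolution(img, pixel_nm=37.0, sat_frac=1e-3, percentile=85):
    """Estimate the resolution from the spatial frequency coverage of an image:
    saturate the Fourier magnitude, compute the circle-normalized energy as a
    function of radius, threshold it at the given percentile, and convert the
    largest radius above threshold to a resolution in nm."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    spec = np.minimum(spec, sat_frac * spec.max())        # saturate bright peaks
    n = img.shape[0]
    yy, xx = np.ogrid[:n, :n]
    r_map = np.sqrt((yy - n // 2) ** 2 + (xx - n // 2) ** 2)

    radii = np.arange(2, n // 2)
    energy = np.array([spec[r_map <= r].sum() / (np.pi * r**2) for r in radii])
    threshold = np.percentile(energy, percentile)
    r_max = radii[energy >= threshold].max()              # largest radius above threshold

    dk = 1.0 / (n * pixel_nm)                             # frequency resolution [1/nm]
    k_max = r_max * dk
    return 1.0 / k_max                                    # resolution in nm
```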