Deep-learning super-resolution light-sheet add-on microscopy (Deep-SLAM) for easy isotropic volumetric imaging of large biological specimens

Isotropic 3D histological imaging of large biological specimens is highly desired but remains highly challenging for current fluorescence microscopy techniques. Here we present a new method, termed deep-learning super-resolution light-sheet add-on microscopy (Deep-SLAM), to enable fast, isotropic light-sheet fluorescence imaging on a conventional wide-field microscope. After integrating a minimized add-on device that transforms an inverted microscope into a 3D light-sheet microscope, we further integrate a deep neural network (DNN) procedure to quickly restore the ambiguous z-reconstructed planes that suffer from the still-insufficient axial resolution of the light-sheet illumination, thereby achieving isotropic 3D imaging of thick biological specimens at single-cell resolution. We apply this easy and cost-effective Deep-SLAM approach to the anatomical imaging of single neurons in a meso-scale mouse brain, demonstrating its potential for readily converting commonly used commercial 2D microscopes into high-throughput 3D imaging systems, a capability previously exclusive to high-end microscopy implementations.


Introduction
Light-sheet fluorescence microscopy (LSFM) allows three-dimensional imaging of biological samples with high speed and low photo-bleaching, and has recently emerged as an important alternative to conventional epifluorescence imaging approaches [1][2][3][4][5][6][7]. Toward the ultimate goal of fast, accurate, noninvasive spatiotemporal imaging, a variety of LSFM implementations have evolved from the classic SPIM to provide superior-quality imaging of samples ranging from single cells to entire organs [8][9][10][11][12][13][14][15][16][17][18][19]. Aside from these advanced LSFM modalities, simple LSFM approaches, e.g. the established Open-SPIM, have also been invented with simplified structures and reduced cost, for more widespread LSFM applications under ordinary conditions [20,21]. Easier ways have also been reported to enhance existing conventional microscopes with simple retrofits containing plane-illumination and sample-scanning additions, thus providing a compact and cost-effective solution for implementing LSFM imaging on epifluorescence microscopes. Given the large number of such conventional microscopes in service, these techniques offer compelling ways for researchers to readily access advanced LSFM imaging [22][23][24][25]. However, owing to the simple means used for light-sheet generation, the axial resolution of the system, which is compromised by the range of laser-sheet illumination, remains largely insufficient, being limited to ambiguous cellular resolution (∼10-20 µm) when imaging large histological specimens. Multi-frame resolution-enhancement techniques, such as Fourier ptychography, structured illumination, and voxel super-resolution [26][27][28], can computationally address this issue by reconstructing a super-resolved 3D image from a number of low-resolution measurements, but at the expense of increased acquisition time and low processing throughput.
Unlike the above-mentioned multi-frame methods, deep-learning-enabled image restoration has recently become a promising tool for various light-microscopy techniques [29][30][31], with a trained neural network capable of directly inferring a higher-quality image from a single low-quality measurement. Deep-learning-based restoration has also been applied to the resolution enhancement and denoising of LSFM images, for which the acquisition of high-quality LSFM data for network training is relatively difficult [32,33]. Building on these state-of-the-art developments, we herein report deep-learning-enabled super-resolution light-sheet add-on microscopy (Deep-SLAM), which combines a simple add-on device with an efficient deep neural network (DNN) and allows a conventional 2D microscope to perform isotropic 3D imaging of large biological specimens. In addition to the hardware add-on that endows a wide-field microscope with 3D optical-sectioning capability, the newly integrated DNN procedure further computationally improves the axial performance of the system from 15-µm cellular resolution to isotropic 3-µm single-cell resolution. We demonstrate this Deep-SLAM approach by imaging GFP-tagged single neurons in a meso-scale mouse brain. Compared with the very poor performance of the original microscope, or the still-fuzzy 3D reconstruction obtained by applying the add-on device alone, the simple and efficient Deep-SLAM hybrid strategy readily achieves over 10-fold enhanced axial resolution and much higher signal contrast, allowing otherwise indistinguishable nerve endings to be super-resolved across a large 3D brain space. Furthermore, we demonstrate accurate counting of cell populations and segmentation of single neurons as a result of the significantly improved image quality.

Dataset preparation and experimental settings
The Deep-SLAM procedure comprises SLAM imaging on an inverted microscope (Olympus IX73) [22] with improved image contrast and axial resolution, followed by isotropic enhancement using the CARE deep-learning model [29]. The built-in epi-fluorescence mode of a conventional microscope illuminates the entire thick sample without rejecting out-of-focus excitation, yielding completely blurred z reconstructions (Fig. 1(a)). SLAM imaging was then enabled through a compact add-on device, which provides horizontal light-sheet illumination to the sample (Fig. 1(b)). While SLAM enhances the axial resolution and signal contrast by introducing a plane-illumination mode, a relatively thick laser sheet (e.g., ∼15 µm) was necessary to cover a large field of view (FOV) of the sample (e.g., ∼3 mm), resulting in ambiguous axial resolution insufficient for discerning single cells. Following the iso-CARE procedure, we generated synthetic low-resolution axial slices by applying a degradation model to the better-resolved lateral slices (Fig. 6). These raw lateral slices (ground truths) were then paired with their degraded versions, which simulate the low-resolution axial slices, to form the training dataset for CARE network training. Finally, the acquired SLAM images were resliced into a stack of axial slices and restored by the trained model to generate an output stack with improved, isotropic resolution (Fig. 1(c)).
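The reslice-and-restore step can be sketched as follows. This is an illustrative outline, not the authors' code: `restore_slice` is a hypothetical stand-in for the trained network's 2D inference function, and the nearest-neighbor upsampling to the isotropic grid is a simplification (a real pipeline would typically interpolate).

```python
import numpy as np

def restore_isotropic(vol, restore_slice, z_step=5):
    """Sketch of the Deep-SLAM reconstruction step.

    vol: anisotropic SLAM stack, axes (z, y, x), with coarse z sampling
         (e.g. 5-µm steps vs 1-µm lateral pixels).
    restore_slice: 2D inference function of the trained network
                   (hypothetical placeholder).
    """
    z, y, x = vol.shape
    # Upsample z to the isotropic grid (nearest-neighbor for brevity).
    up = np.repeat(vol, z_step, axis=0)
    out = np.empty_like(up, dtype=np.float32)
    # Reslice into xz planes and restore each plane independently.
    for j in range(y):
        out[:, j, :] = restore_slice(up[:, j, :])
    return out
```

Restoring the yz reslices in the same way and fusing the two outputs would be a natural extension, but the minimal version above already conveys the per-plane restoration idea.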

Characterization of Deep-SLAM
We imaged sub-diffraction fluorescent beads (0.5-µm diameter, Lumisphere, BaseLine Chromtech) using the original inverted microscope (4×/0.16 objective), the thick SLAM mode (0.02 illumination NA), the thin SLAM mode (0.06 illumination NA) and the Deep-SLAM mode (0.06 illumination NA), to compare their point spread functions (PSFs, shown in Fig. 2(a)-(d)). Compared to the extremely poor axial performance (∼42 µm) of the epi-illumination mode over a wide FOV (∼3 mm), the raw SLAM result showed much improved axial resolution (∼15 µm in the z axis), although the elongated PSF indicated that it was still anisotropic (Fig. 2(a), (b), (f)). Meanwhile, images obtained by the thin SLAM mode had a higher axial resolution (∼5 µm) but a sharply reduced FOV (∼280 µm), owing to the intrinsic limitation of the Gaussian beam. In contrast, Deep-SLAM showed near-isotropic resolution (∼3 µm in the PSFs) close to that of the thin SLAM mode while maintaining the 3-mm-wide FOV (Fig. 2(d), (f)).

The blurred and down-sampled lateral slices of SLAM, with noise added, were generated to simulate the axial slices of the SLAM image, which still show ambiguous axial resolution incapable of resolving single cells (Step 1). The degraded lateral slices (synthetic low-resolution data), paired with the raw lateral slices (high-resolution ground truths), were trained using a CARE network model (Step 2). Finally, the trained network directly infers an isotropic 3D image stack from the raw anisotropic SLAM input (Step 3).
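The PSF comparison above reduces to estimating the full width at half maximum (FWHM) of intensity profiles taken through each bead. A minimal helper for this, written as our own illustrative sketch rather than the authors' analysis code:

```python
import numpy as np

def fwhm(profile, px_size=1.0):
    """FWHM of a 1D bead intensity profile (e.g. an axial line through
    a bead image), with linear interpolation at the half-max crossings.

    profile: 1D array of intensities; px_size: sampling step in µm.
    """
    p = np.asarray(profile, dtype=float)
    p = p - p.min()                 # remove background offset
    half = p.max() / 2.0
    above = np.where(p >= half)[0]  # samples above half maximum
    left, right = above[0], above[-1]

    def cross(i):
        # Sub-pixel crossing position between samples i and i + 1.
        return i + (half - p[i]) / (p[i + 1] - p[i])

    l = cross(left - 1) if left > 0 else float(left)
    r = cross(right) if right < len(p) - 1 else float(right)
    return (r - l) * px_size
```

For a Gaussian-like PSF, this estimate converges to 2√(2 ln 2) ≈ 2.355 times the Gaussian sigma, which is the usual way the ∼42-µm, ∼15-µm and ∼3-µm figures quoted above would be extracted from bead images.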

Network validation on cleared mouse brain
We further demonstrated the performance of Deep-SLAM by imaging fluorescence-labelled neurons in a transgenic mouse brain (Thy1-GFP-M). A large clarified brain tissue (PEGASOS method [34]) was imaged and then reconstructed using the above-mentioned Deep-SLAM approach. Single somata in the original whole-tissue-scale thick SLAM images remained ambiguous (Fig. 3, Fig. 4(a), (d)), while the deep-learning restoration was capable of super-resolving these fine structures across the entire 3-mm FOV (Deep-SLAM, middle, Fig. 3, Fig. 4(b), (e)), which furthermore enabled follow-up biological analyses, such as accurate neuron segmentation and soma counting (Fig. 6(a), (b)). We also compared the Deep-SLAM results with the thin SLAM mode (Fig. 3(b), (c) right, Fig. 4(c), (f)), which can achieve single-cell resolution only in a few ROIs owing to its small illumination FOV. The normalized root mean square error (NRMSE) also validated the sufficient accuracy of the network restoration in Deep-SLAM.
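The NRMSE used for this validation can be computed as below. Note that the paper does not state its exact normalization; normalizing the RMSE by the ground-truth dynamic range, as here, is one common convention.

```python
import numpy as np

def nrmse(pred, gt):
    """Root-mean-square error between a restored slice and its
    ground truth, normalized by the ground-truth dynamic range."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    return rmse / (gt.max() - gt.min())
```

On the semi-synthetic validation data, `gt` would be a raw lateral slice and `pred` the network's restoration of its degraded counterpart.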

High-throughput isotropic imaging of half mouse brain using Deep-SLAM
We demonstrate that Deep-SLAM can achieve high-throughput, isotropic 3D imaging of a half mouse brain at single-cell resolution, showing how a function previously belonging to high-end LSFM implementations can now be realized on a conventional 2D microscope. We used SLAM to quickly image GFP-tagged neurons in a large half mouse brain (Tg: Thy1-GFP-M, ∼10 × 3 × 5 mm³) in merely ∼5 minutes (Fig. 5(a)), equivalent to an acquisition throughput of 1.7 × 10⁸ voxels per second (1 × 1 × 5 µm voxel size). At such a large scale, compared with the thin-and-narrow SLAM mode, Deep-SLAM with its wide FOV reduced the number of image-stitching operations (∼5 vs ∼40) and showed lower photo-bleaching as well. Then, by including only a small amount of SLAM self-acquired data for training, the DNN model could restore various types of neurons in the half brain. Diverse neuronal structures distributed across five different brain sub-regions were chosen to demonstrate successful 3D visualization at isotropic single-cell resolution (Fig. 5(b)-(f)). Neurons located at the edge of the FOV were also compared to validate the consistent performance of Deep-SLAM across the whole FOV (Fig. 5(g), (h)).

Improved cell counting and neuron tracing based on Deep-SLAM result
As a result of the isotropic brain imaging by Deep-SLAM, significantly more neuronal details densely packed in a cortex region could be segmented and traced using Imaris. The total length of segmented filaments is 759 µm in the thick SLAM result, 1239 µm in the Deep-SLAM result, and 1248 µm in the thin SLAM result, as shown in Fig. 5(a). Meanwhile, the counts of dense cell bodies located in different brain sub-regions reconstructed by Deep-SLAM were also more accurate than those from the raw thick SLAM mode, using the high-resolution thin-SLAM results as references (Fig. 5(b)).

SLAM imaging of large clarified mouse brain
We adopted SLAM imaging of the mouse brain on an Olympus IX73 inverted microscope [22] (Fig. 7(b), (c)). The brain of an 8-week-old transgenic adult mouse (Thy1-GFP, line M, Jackson Laboratory) was clarified using an organic-solvent-based clearing method (PEGASOS [34]). Two set screws on the sample holder clamped the cleared and hardened half mouse brain (Fig. 7(d)), which was dipped into the glass chamber of the SLAM add-on, filled with refractive-index-matched solvent (n = 1.54). A 473-nm laser beam was reshaped into a laser light sheet by a cylindrical lens (CL, Thorlabs, LJ1810L2-A) of the add-on device, to selectively illuminate the brain sample at the focal plane of a 4×/0.16 objective. An adjustable slit (AS, Thorlabs, VA100C/M) was used to switch between the thick and thin SLAM modes (wide open for thin SLAM, 1-mm width for thick SLAM). Three compact translational stages aligned the light sheet with the sample in three directions (Fig. 7(c)). The confocal range of the light sheet could be adjusted along the propagation direction (x axis) by translational stage 3, and the sample could be adjusted along the y direction by translational stage 1. The motorized actuator (on translational stage 2) in the device scanned the sample through the light sheet (z direction) while the camera simultaneously recorded the consecutive planes at a rate of 20 frames per second.

Validation of the image degradation model
Ambiguous axial resolution is a common problem in 3D microscopy (Fig. 8(a)). Such anisotropy is caused by the inherent axial elongation of the optical PSF and the low axial sampling rate of volumetric acquisitions typically required for fast imaging. To obtain isotropic resolution, we used the well-resolved lateral slices as ground truths and degraded them to obtain the corresponding low-resolution training data, which simulated the anisotropic axial slices. In this way, we generated low-high resolution data pairs to train the deep-learning model, which could finally restore the axial slices to near-isotropic resolution. We generated the low-axial-resolution semi-synthetic data using the following image degradation operations (Fig. 8(b)): (1) Anisotropic transform. We first convolved the high-resolution lateral slices with a synthetic PSF (simulating the axial elongation of the measured PSF) to obtain a blurred image similar to the axial slices.
(2) Down-sampling. We down-sampled the blurred image by 5 times along the x axis (from 1 µm to 5 µm) using a re-slicing method, to simulate the coarser z-scan steps.
(3) Noise addition. We adjusted both the mean and variance of randomly generated noise to produce a series of synthetic images and compared them with the experimentally measured low-resolution images. When the semi-synthetic images have an SNR and Fourier-domain distribution similar to those of the real low-resolution axial slices, the degradation process is considered convincing (Fig. 9). Finally, these operations transformed the high-resolution lateral slices (xy) into synthetic lower-resolution images similar to the measured axial slices (xz or yz) for network training.
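The three degradation operations above can be sketched as follows. The parameter values (blur width, z step, noise level) are illustrative defaults, not the paper's calibrated ones, and a Gaussian blur stands in for convolution with the measured anisotropic PSF:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(lateral_slice, axial_sigma=6.0, z_step=5, noise_std=0.02,
            rng=None):
    """Degrade a high-resolution lateral (xy) slice to mimic a
    low-resolution axial (xz) slice, per the three steps in the text."""
    rng = np.random.default_rng() if rng is None else rng
    img = lateral_slice.astype(np.float32)
    # (1) Anisotropic transform: elongate the PSF along one axis only.
    blurred = gaussian_filter(img, sigma=(axial_sigma, 0.0))
    # (2) Down-sampling: keep every z_step-th row (1 µm -> 5 µm),
    # then resample back onto the original grid.
    coarse = blurred[::z_step, :]
    coarse = np.repeat(coarse, z_step, axis=0)[:blurred.shape[0], :]
    # (3) Noise addition: additive noise tuned to match measured SNR.
    return coarse + rng.normal(0.0, noise_std, coarse.shape)
```

In the actual pipeline the noise statistics would be tuned until the synthetic slices match the real axial slices in SNR and Fourier-domain distribution, as described above.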
By applying the above-mentioned degradation operations, we generated 4000 training pairs of fluorescent microspheres (size of 128 × 128 pixels per pair) from four image blocks (∼2 × 2 × 1 mm³ each) acquired using thick SLAM. For the network training on mouse brain data, we generated 6000 training pairs (size of 128 × 128 pixels per pair) from six image blocks (∼2 × 2 × 0.25 mm³ each) acquired using thick SLAM.
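Extracting the 128 × 128 training pairs from the acquired blocks can be sketched as below. The random-crop strategy is our assumption (the paper does not specify how patches were sampled), and `degrade_fn` is a placeholder for the degradation model:

```python
import numpy as np

def sample_training_pairs(hr_stack, degrade_fn, n_pairs, patch=128,
                          rng=None):
    """Crop matching (degraded, ground-truth) patches from the lateral
    slices of a high-resolution stack, axes (z, y, x)."""
    rng = np.random.default_rng() if rng is None else rng
    z, y, x = hr_stack.shape
    pairs = []
    for _ in range(n_pairs):
        k = int(rng.integers(z))                 # pick a lateral slice
        i = int(rng.integers(y - patch + 1))     # top-left corner
        j = int(rng.integers(x - patch + 1))
        hr = hr_stack[k, i:i + patch, j:j + patch]
        pairs.append((degrade_fn(hr), hr))       # (low-res, ground truth)
    return pairs
```

Running this over the four (beads) or six (brain) thick-SLAM blocks would yield the 4000 and 6000 pairs reported above.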

Deep network structure
We adopted the CARE model to obtain the nonlinear mapping function between low-resolution input and high-resolution output [29]. A U-Net framework was used, as it has achieved good performance in various biomedical applications. Thanks to elastic-deformation data augmentation, U-Net requires only a small number of labeled images and a relatively short training time. The network generates intermediate outputs from the low-resolution input data and quantitatively compares them with the high-resolution label data. The resulting loss function then iteratively optimizes the model until convergence. The CARE network was implemented in Python using TensorFlow and TensorLayer. The training process (4000 training pairs of size 128 × 128 each) took ∼3 hours on a single Nvidia 2080Ti GPU.

Neuron tracing and cell body counting
The neurons in the whole-brain data were segmented semi-automatically using the commercial Imaris software. The Autopath mode of the Filament module was applied to trace the neurons. We first assigned one point on a neuron to initiate the tracing. Then, Imaris automatically calculated the pathway according to the image data, reconstructed the 3D morphology, and linked it with the previous part. This procedure was repeated several times until the whole neuron, which could also be recognized by the human eye, was segmented. The trajectories of the neuron are shown in Fig. 5(a). We used the Spots module of Imaris to count the cell nuclei. We chose two blocks in the hippocampus and cortex regions to count the number of cells densely distributed in these areas. Then the automatic creation function of the Spots module was applied to count the cell number for all the channels, each of which represents an encephalic region.

Fig. 6. Accuracy-improved brain analysis by Deep-SLAM. (a) Comparative image-based neuron segmentation/tracing using Imaris. Pyramidal neurons in the same cerebral cortex region resolved by the thick SLAM (1st column), Deep-SLAM (2nd column) and thin SLAM (3rd column) modes were segmented and traced. The total length of filaments in the image of each mode is also compared in the right column. (b, c) Comparative neuron cell body identification and counting in two 400 × 400 × 400 µm³ ROIs in the hippocampus and cortex. The 3D visualizations as well as the quantitative results validate that the axial improvement by Deep-SLAM is substantially beneficial to accurate neuron analysis, which is otherwise challenging for the regular thick SLAM mode owing to the axial blurring. The number of neurons identified in the image of each mode is also compared (inset). (d, e) 3D visualization of six 50 × 50 × 50 µm³ ROIs indicated by the blue boxes in (a, b). The magnified views clearly show the accuracy-improved cell counting by Deep-SLAM. Scale bar, 50 µm.
Comparing the counting results of raw thick SLAM and Deep-SLAM with those from the high-resolution thin SLAM mode shows that the Deep-SLAM result also leads to more accurate counting of these dense cell bodies.

Qualitative comparison of Deep-SLAM with other imaging modalities
We compared our Deep-SLAM with conventional epi-illumination microscopy, the thin SLAM mode, the thick SLAM mode, and the Bessel light-sheet fluorescence microscopy that has emerged in recent years. Compared with the conventional epi-illumination microscope, the addition of either the thin or the thick SLAM mode improves the image contrast and axial resolution. However, due to the inherent trade-off between axial resolution and FOV, the FOV has to be sacrificed for thinner optical sectioning. Deep-SLAM further resolves this conflict by combining the merits of the large FOV and high throughput of the thick SLAM mode with the high axial resolution of the thin SLAM mode. At the same time, no further hardware retrofits to the existing system are needed. Table 1 qualitatively summarizes the performance of the original epifluorescence microscope and the different SLAM imaging modes.

The CARE network, based on a 2D U-Net framework, was applied in our Deep-SLAM. U-Net logically contains an encoder-decoder architecture with skip connections. Instead of predicting a single value per output pixel, it predicts a per-pixel distribution (parameterized by mean and scale), which results in a 1-channel output. The mean value is used as the prediction. The numbers on the left and upper sides of the layers represent the size and number of the feature maps, respectively. "Conv" represents a convolutional layer; "Concat" means a concatenation operation. The 5×5 Conv layers use the Rectified Linear Unit (ReLU) as the activation function. We used Adam as the optimizer with a learning rate of 0.0004.
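When a network predicts a per-pixel mean and scale, as described in the figure caption, training typically minimizes a per-pixel negative log-likelihood; a Laplace distribution is one common choice for this kind of probabilistic restoration. A minimal numpy sketch of such a loss, as an illustration rather than the paper's exact implementation:

```python
import numpy as np

def laplace_nll(mean, scale, target, eps=1e-6):
    """Mean per-pixel negative log-likelihood of a Laplace distribution
    with predicted mean and scale, evaluated against the ground truth.

    NLL per pixel: |target - mean| / scale + log(2 * scale).
    """
    s = np.maximum(scale, eps)  # keep the scale strictly positive
    return float(np.mean(np.abs(target - mean) / s + np.log(2.0 * s)))
```

At a perfect prediction (mean equal to target) with unit scale, the loss reduces to log 2 per pixel; the scale channel lets the network report higher uncertainty where the restoration is less reliable.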

Conclusion
We have demonstrated an efficient and cost-effective imaging approach that combines a light-sheet imaging add-on (hardware) with DNN restoration (software) to allow an ordinary inverted microscope to realize minute-timescale histological imaging of a large mouse brain (∼10 × 3 × 5 mm³) at isotropic single-cell resolution, previously a capability unique to high-end, complicated LSFM implementations. Benefitting from the high quality of the Deep-SLAM imaging results, we also successfully implemented image-based neuron tracing and cell counting with high accuracy across a large scale. Since exploring the distribution of brain regions and neurons at the cellular level is key to understanding complex biological questions about brain function, our approach could be a valuable tool for readily enabling image-based segmentation and tracing of neurons at the system level and in three dimensions. Beyond the demonstration on brain imaging, we believe this simple and cost-effective method could be equally efficient for the large-scale histological imaging of other whole organs.