3D Computational Cannula Fluorescence Microscopy enabled by Artificial Neural Networks

Computational Cannula Microscopy (CCM) is a minimally invasive approach for high-resolution widefield fluorescence imaging deep inside tissue. Rather than using conventional lenses, a surgical cannula acts as a lightpipe for both excitation and fluorescence emission, and computational methods are used for image reconstruction. Here, we enhance CCM with artificial neural networks to enable 3D imaging of cultured neurons and fluorescent beads, the latter inside a volumetric phantom. We experimentally demonstrate a transverse resolution of ~6µm, a field of view of ~200µm and axial sectioning of ~50µm for depths down to ~700µm, all achieved with a computation time of ~3ms/frame on a laptop computer.


Introduction
Neurons are distributed in the brain in a complex manner in 3D space. Therefore, neural imaging requires data in all 3 spatial dimensions, ideally with fast acquisition. Computational Cannula Microscopy (CCM) has been shown to be effective for imaging fluorescence deep inside the brain with minimal trauma [1,2]. Previous work also demonstrated the potential of computational refocusing to achieve quasi-3D imaging (in air) [3]. Recently, we also demonstrated the application of Artificial Neural Networks (ANNs) to drastically improve the speed of image reconstructions in CCM [4]. Here, we extend our previous work to enable ANNs to perform fluorescence imaging in a 3D volume. Specifically, we investigated three different ANNs to enable 3D CCM, and demonstrated 3D imaging using cultured neurons and fluorescent beads, both in air and inside a volumetric phantom. 3D imaging of neurons in vivo in the intact brain is typically achieved with 2-photon imaging. Although impressive resolution, field of view and speed have been demonstrated recently [5], such mesoscopes require fairly expensive equipment and complex procedures, and are typically limited to depths of a few hundred micrometers. Many other approaches exist for imaging fixed/dead neurons, including clearing tissue to render it transparent [6] and utilizing polymeric expansion techniques [7]. Lightsheet microscopy [8,9] and structured-illumination approaches [10] have also been successfully employed for fast, high-resolution volumetric imaging of transparent samples. Tomography from multiple 2D images using deep convolutional ANNs has been demonstrated for semiconductor-device inspection using reflected light [11]. Alternative machine-learning approaches have been combined with tomography [12] and light-field microscopy [13]. Computational refocusing can also achieve 3D imaging, either with optics-free setups [14-16] or with diffractive masks [17]. Recent work has applied similar principles to 3D wide-field fluorescence microscopy of clear samples using miniaturized mesoscopes [18]. In contrast to these approaches, CCM is able to image deep inside opaque or highly scattering tissue such as the mouse brain [2,19]. Furthermore, CCM has the advantage that the ratio of field of view to probe diameter is close to 1, thereby allowing for minimally invasive imaging. In our experiments, the probe (cannula) diameter is 220µm. Alternative approaches to imaging through multi-mode fibers have also been described [20-23], but most of these rely on the temporal coherence of the excitation light and thereby require more complex equipment and computational methods.

Experiment
The schematic of our CCM setup is shown in Fig. 1a (and Fig. S1) [24]. The cannula (FT200EMT, Thorlabs) can be inserted into the sample. The excitation light (LED with center wavelength = 470 nm, M470L3, Thorlabs) is coupled to the cannula through an objective lens. Fluorescence from the sample is collected by the same cannula (therefore, an epi-configuration). The image at the top (distal) end of the cannula is imaged onto an sCMOS camera (C11440, Hamamatsu). Reflected excitation light is rejected by a dichroic mirror and an additional filter. An exemplary image is shown in the right inset. A reference microscope is placed underneath the sample to image the same region as the cannula, and the corresponding image is shown in the right inset as well. First, we performed a series of experiments to determine that the volume of interest (constrained by the fluorescence collection and excitation efficiencies) is limited to approximately 100µm from the bottom surface of the cannula. Subsequently, we restricted our experiments to 3 layers spaced by 50µm in close proximity to the cannula (as illustrated in the left inset in Fig. 1a). Similar to our previous work [4], here we used both mouse primary hippocampal cultured neurons and slides with fluorescent beads to create a dataset for training the ANNs. However, unlike our previous work, we acquired this dataset for 3 layers, as illustrated in the left inset of Fig. 1a. Details of sample preparation are described in section 2 of the supplement [24]. A total of 16,700 images from each layer were recorded.
Figure 1b shows the architecture of ANN1_r, which is used to convert the input CCM image into the fluorescence image. It consists of dense blocks that prevent the gradients from vanishing too fast. Each dense block includes 3 individual layers: 2 convolutional layers with ReLU activation functions followed by a batch-normalization layer. The structure is a typical U-Net with skip connections that concatenate the encoder and decoder outputs [4]. A second ANN, referred to as ANN1_c (Fig. 1c), was used to classify the images into the 3 layers (the layer index is stored as metadata in the images). It includes 8 blocks and a final classifier. Each block consists of one 2D convolution layer with a ReLU activation function followed by a batch-normalization layer. We add a max-pooling operation between every two blocks, which down-samples the input and prevents overfitting by taking the maximum value in each 2x2 filter region. For both ANNs, the loss function is the pixel-wise cross-entropy, defined as

L = -Σi [gi log(pi) + (1 - gi) log(1 - pi)],

where gi and pi represent the ground-truth and predicted pixel intensities, respectively. This loss function imposes sparsity. An alternative ANN, referred to as ANN2, that outputs 3 layer images was also explored and is described in section 3 of the supplement [24].
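As a concrete illustration, the pixel-wise cross-entropy loss above can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' published code: the function name and the assumption that intensities are normalized to [0, 1] are ours.

```python
import numpy as np

def pixelwise_cross_entropy(g, p, eps=1e-7):
    """Pixel-wise (binary) cross-entropy between ground truth g and prediction p.

    Both arrays hold intensities assumed normalized to [0, 1]; eps guards the
    logarithms against p = 0 or p = 1. Returns the mean loss over all pixels.
    """
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(g * np.log(p) + (1.0 - g) * np.log(1.0 - p))))

# A perfect prediction gives a near-zero loss; an inverted one a large loss.
g = np.array([[0.0, 1.0], [1.0, 0.0]])
good = pixelwise_cross_entropy(g, g)
bad = pixelwise_cross_entropy(g, 1.0 - g)
```

Because most pixels in a fluorescence image are dark (gi = 0), minimizing this loss pushes the corresponding predictions toward zero, which is the sense in which it imposes sparsity.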

Results
Imaging results using ANN1_r and ANN1_c for cultured neurons and fluorescent beads are summarized in Fig. 2a. The ground-truth images were obtained by the reference microscope as described earlier and confirm the accuracy of the ANN outputs. The layer index predicted by ANN1_c was verified against the metadata of the corresponding reference images. The structural similarity index (SSIM) and mean absolute error (MAE) for ANN1_r, both averaged over 1000 test images, were 90% and 1%, respectively. The classification accuracy of ANN1_c, averaged over 1000 images, was 99.8%. Figures 2b and 2c show results from ANN2 with cultured neurons and fluorescent beads, respectively. The performance of ANN2, averaged over 1000 test images with 3 layers per image, was 96% (SSIM) and 0.4% (MAE). We further evaluated the computation time on a computer equipped with an Intel(R) Core(TM) i7-4790 CPU (clock frequency of 3.60 GHz, 16.0 GB of memory) and an NVIDIA GeForce GTX 970 GPU. The average reconstruction time for ANN1_r and ANN2 was 3.3ms and 3.4ms, respectively. The average classification time for ANN1_c was 3.6ms.
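The MAE figures quoted above can be read as the mean absolute pixel error expressed as a fraction of the full intensity range. A minimal sketch under that assumption (the exact normalization used by the authors is not stated, and the function name is ours):

```python
import numpy as np

def mean_absolute_error(truth, pred):
    """Mean absolute pixel error between two images.

    Intensities are assumed already normalized to [0, 1], so the returned
    value is directly a fraction of the full intensity range (1% -> 0.01).
    """
    return float(np.mean(np.abs(truth.astype(float) - pred.astype(float))))

# A reconstruction that is uniformly off by 0.01 has a 1% MAE.
truth = np.zeros((4, 4))
pred = np.full((4, 4), 0.01)
err = mean_absolute_error(truth, pred)
```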
We also compared the performance of the two reconstruction ANNs by applying them to the same input image in Fig. 3a, with the ground-truth image in Fig. 3b (the layer index is labelled 2). The reconstructed result from ANN1_r is in Fig. 3c; the output of ANN1_c is 2. The corresponding output from ANN2 is shown in Figs. 3d-f for layers 1, 2 and 3, respectively. In order to estimate resolution, we imaged a single fluorescent bead, whose ground-truth image and cross-section through the bead are shown in Figs. 3g and 3h, respectively. A bead diameter of 5.9µm is measured. The corresponding outputs from ANN1_r and ANN2 are in Figs. 3i-l, and the corresponding measured bead diameters were 6.6µm and 5.6µm, respectively.
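Bead diameters such as the 5.9µm above are typically read off a cross-section as its full width at half maximum (FWHM). The sketch below shows one way to do that, with linear interpolation between samples; the paper does not specify its exact procedure, so treat the function as an illustrative assumption.

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a single peak sampled at positions x.

    Finds the half-maximum crossing on each side of the peak and linearly
    interpolates between the two bracketing samples.
    """
    y = np.asarray(y, dtype=float)
    half = y.max() / 2.0
    i_peak = int(np.argmax(y))
    # Left crossing: last sample below half-max before the peak.
    left = np.where(y[:i_peak] < half)[0][-1]
    x_left = np.interp(half, [y[left], y[left + 1]], [x[left], x[left + 1]])
    # Right crossing: first sample below half-max after the peak.
    right = i_peak + np.where(y[i_peak:] < half)[0][0]
    x_right = np.interp(half, [y[right], y[right - 1]], [x[right], x[right - 1]])
    return x_right - x_left

# A Gaussian with sigma = 2.5 µm has FWHM = 2*sqrt(2*ln 2)*sigma ~ 5.89 µm.
x = np.linspace(-15.0, 15.0, 601)
width = fwhm(x, np.exp(-x**2 / (2 * 2.5**2)))
```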
Finally, we fabricated a phantom made of agarose dispersed with fluorescent beads, as illustrated by the photographs in Fig. 4a [24]. The cannula was carefully inserted into the phantom while CCM images were recorded. ANN1_r was retrained with a synthetic dataset created by combining the 3-layer CCM images into a single "synthetic" CCM image (see section 4 of the supplement [24]). We refer to this new network as ANN1_r*, which is trained to reconstruct an image comprised of the projection of the fluorescence signal from within 100µm of the proximal end of the cannula onto a single plane. The CCM images and corresponding output images of ANN1_r* at various depths are shown in Fig. 4b. Only a subset of the images is shown here; the complete set is included in section 5 of the supplement [24]. This stack of 2D images can then be combined into a reconstructed 3D image, as shown in Fig. 4c [24]. One of the advantages of ANNs over previous approaches that utilize singular-value decomposition (SVD) [1-3] is the much higher computation speed. In Table 1, we summarize the performance of the 2 ANN approaches and the SVD method. The performance of the 2 ANN approaches is similar; each was averaged over 1000 test images, each containing 3 layers. Classification accuracy was defined as the ratio of the number of images with a correctly predicted layer index to the total number of images tested (1000). The data used for SVD came from a single layer, hence classification accuracy is not applicable.
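The synthetic-dataset step above combines the 3 per-layer CCM frames into one frame. A minimal sketch, assuming the combination is a simple intensity sum clipped to the sensor range (the actual recipe is in section 4 of the supplement [24], and the function name is ours):

```python
import numpy as np

def synthesize_ccm_frame(layers):
    """Combine per-layer CCM frames into one synthetic frame.

    Assumes the combination is a pixel-wise intensity sum clipped to [0, 1],
    mimicking simultaneous fluorescence collection from all three layers.
    """
    stack = np.stack([layer.astype(np.float64) for layer in layers])
    return np.clip(stack.sum(axis=0), 0.0, 1.0)

# Three toy 8x8 layers; a bright "bead" in layer 1 saturates the combined frame.
layers = [np.full((8, 8), 0.2), np.full((8, 8), 0.3), np.full((8, 8), 0.4)]
layers[0][0, 0] = 0.9
frame = synthesize_ccm_frame(layers)
```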

Conclusion
In conclusion, we demonstrate 3D imaging of fluorescent beads in a volumetric phantom using a surgical cannula as the lightpipe for both excitation and fluorescence. Image reconstructions in multiple planes were achieved using trained artificial neural networks. We trained two types of neural networks on both experimental and quasi-experimental data (augmented by synthesizing multiple plane images together). The system was able to achieve a lateral resolution of ~6µm, axial sectioning of ~50µm and imaging depths as large as 0.7mm. The field of view was approximately equal to the diameter of the cannula, thereby allowing for imaging a wide area with minimally invasive surgery. We increased the exposure time to make sure that all the beads could be seen. Most beads were reconstructed at z = 650µm, except for the two beads marked by red circles.

Fig. 1 .
Fig. 1. Overview of computational-cannula microscopy (CCM). (a) Schematic of the microscope. The left inset shows the 3 layers that are captured, spaced by 50µm from the proximal end of the cannula. The right insets show recorded images with CCM (top) and with the reference microscope (bottom). (b) Details of ANN1_r, which is trained to take the input CCM image and output the reconstructed image of 1 layer. A modified version of this network, ANN2, outputs 3 images, one for each layer (see SI) [24]. (c) Details of ANN1_c, which classifies the input CCM image into one of the 3 layers.

Fig. 2 .
Fig. 2. Experimental results of 3D CCM. (a) Fluorescence samples were reconstructed using ANN1_r, while the layer index was predicted by ANN1_c. The first 3 rows show cultured neurons, while the last row shows fluorescent beads (diameter = 4µm). (b) ANN2 produces 3 reconstructed images, one for each layer. This is an example where layer 2 contains the neuron. (c) Another example from ANN2, where layer 3 contains the neuron. Many additional examples from all networks are included in the SI [24].

Fig. 3 .
Fig. 3. (a-f) Comparison of results from the different reconstruction ANNs using the same input image. In (c), the output of ANN1_c is labeled as 2 on top. (g-l) Images of a single fluorescent bead (diameter = 4µm) obtained using the 2 networks (the bead is in layer 1). Images (a)-(f) are the same size, as are (g), (i) and (k).

Fig. 4 .
Fig. 4. Imaging inside a volumetric phantom: (a) Photographs of the phantom. The bottom image shows the cannula being inserted into the phantom; the blue light is the excitation. (b) Output of ANN1_r*, trained on a synthetic dataset (see text for details), at various depths z inside the phantom (see Visualization 1). (c) Reconstructed 3D image. The streaks in the z direction indicate that the cannula pushed some beads down inside the phantom during the experiment (see Visualization 2).

Fig. S6 .
Fig. S6. Reconstructed results from the ANN system consisting of the reconstruction ANN and the classification ANN. (a) Results from the neuron dataset. The SSIM and MAE for the training dataset are 0.9213 and 0.0082; the SSIM and MAE for the testing dataset are 0.8974 and 0.0104. The accuracy of the classification ANN reaches 0.9980. The layer column shows the predicted layer numbers, which correspond to z-positions. (b) Dataset built with the bead sample. The SSIM and MAE for the training dataset are 0.9791 and 0.0017; the SSIM and MAE for the testing dataset are 0.9878 and 0.0012. The accuracy of the classification ANN is 0.9665.

Fig. S7 .Fig. S8 .Fig. S9 .
Fig. S7. Comparison of reconstructed results in different layers: the results are three single-bead images from the three different layers. The resolution is similar for all three layers.

Fig. S10 .
Fig. S10. Additional phantom results: the phantom used in this test is 5mm thick. We could not track the bead distribution with the reference microscope. The cannula began to be inserted into the phantom at the position of the red arrows.