Fast light-field 3D microscopy with out-of-distribution detection and adaptation through conditional normalizing flows

Real-time 3D fluorescence microscopy is crucial for the spatiotemporal analysis of live organisms, such as neural activity monitoring. The eXtended field-of-view light field microscope (XLFM), also known as the Fourier light field microscope, is a straightforward, single-snapshot solution for this task. The XLFM acquires spatial-angular information in a single camera exposure. In a subsequent step, a 3D volume can be algorithmically reconstructed, which makes the XLFM exceptionally well-suited for real-time 3D acquisition and potential analysis. Unfortunately, traditional reconstruction methods (such as deconvolution) require lengthy processing times (0.0220 Hz), hampering the speed advantages of the XLFM. Neural network architectures can overcome the speed constraints but do not automatically provide a way to certify the realism of their reconstructions, which is essential in the biomedical realm. To address these shortcomings, this work proposes a novel architecture, based on a conditional normalizing flow, to perform fast 3D reconstructions of neural activity in live immobilized zebrafish. It reconstructs volumes spanning 512 × 512 × 96 voxels at 8 Hz, and it can be trained in under two hours due to the small dataset requirements (50 image-volume pairs). Furthermore, normalizing flows provide a way to compute the exact likelihood of a sample. This allows us to certify whether the predicted output is in-distribution or out-of-distribution (OOD), and to retrain the system when a novel sample is detected. We evaluate the proposed method using a cross-validation approach involving multiple in-distribution samples (genetically identical zebrafish) and various out-of-distribution ones.
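As a minimal sketch of how an exact likelihood could be turned into an in-distribution or out-of-distribution decision, assume the flow returns one log-likelihood per sample and that a sample is flagged when it falls below a low percentile of the training log-likelihoods. The percentile rule, the function names, and the synthetic numbers are illustrative assumptions, not the procedure used in this work.

```python
import numpy as np


def fit_threshold(train_log_likelihoods, percentile=1.0):
    """Hypothetical rule: use a low percentile of the training
    log-likelihoods as the decision threshold."""
    return float(np.percentile(train_log_likelihoods, percentile))


def is_out_of_distribution(log_likelihood, threshold):
    """Flag a sample whose log-likelihood falls below the threshold."""
    return bool(log_likelihood < threshold)


# Synthetic stand-ins for the log-likelihoods a trained flow would return.
rng = np.random.default_rng(0)
train_ll = rng.normal(loc=-100.0, scale=5.0, size=1000)
threshold = fit_threshold(train_ll)

typical_sample_ll = -100.0   # close to the training mean
unlikely_sample_ll = -200.0  # far below anything seen in training
flag_typical = is_out_of_distribution(typical_sample_ll, threshold)
flag_unlikely = is_out_of_distribution(unlikely_sample_ll, threshold)
```

A flagged sample would then be a candidate for the retraining step mentioned above; how many flagged samples should trigger retraining is left open here.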


Supplementary document: Fast light-field 3D microscopy, out of distribution detection and adaptation through Conditional Normalizing Flows

1. THE CONDITIONAL WAVELET FLOW ARCHITECTURE
We augmented the FrEIA framework [3] to better handle the problem at hand. Some of the modifications are:
• Permutation along random dimensions: The permutation operation is typically applied only along the channel dimension. We modified the implementation to randomly select a dimension and apply a random permutation to the elements along that dimension.
• 1D Haar transform: The original Wavelet-Flow paper [2] up/down-sampled the images along the x-y axes. In XLFM images, however, the lenslet images already contain the high-definition information in the x-y dimensions, so compressing the images only to up-sample them again seemed counterintuitive. Instead, we perform the up/down-sampling along the channel dimension. We start with eight depths and up-sample with 4 CWFs until reaching 96 depths.
Additionally, instead of the traditional permutation functions, where only the channel dimension is permuted, we included permutations where other dimensions can be permuted. See Fig. 1 for details on the internal components of the network.
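The two modifications above can be illustrated with a small NumPy sketch. It shows only the invertible building blocks: an orthonormal 1D Haar split along the depth (channel) axis, and a fixed random permutation along a randomly chosen axis. The function names, the toy volume shape, and the orthonormal scaling are assumptions made for illustration; this is not the FrEIA-based implementation, and the learned flows that predict the detail coefficients are omitted.

```python
import numpy as np


def haar_forward(x, axis=0):
    """Split x along `axis` into Haar average (low) and detail (high) halves."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)


def haar_inverse(low, high, axis=0):
    """Recombine the two halves, doubling the resolution along `axis`."""
    low = np.moveaxis(low, axis, 0)
    high = np.moveaxis(high, axis, 0)
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    out = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=low.dtype)
    out[0::2], out[1::2] = even, odd
    return np.moveaxis(out, 0, axis)


def make_random_axis_permutation(shape, rng):
    """Pick a random axis and a random ordering of its elements."""
    axis = int(rng.integers(len(shape)))
    perm = rng.permutation(shape[axis])
    return axis, perm


def permute(x, axis, perm):
    """Reorder the elements of x along `axis` according to `perm`."""
    return np.take(x, perm, axis=axis)


def unpermute(x, axis, perm):
    """Undo `permute` by applying the inverse ordering."""
    return np.take(x, np.argsort(perm), axis=axis)


rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 16, 16))  # toy volume: (depth, y, x)

# Down-sample along depth only; x-y resolution is left untouched.
low, high = haar_forward(volume, axis=0)
restored = haar_inverse(low, high, axis=0)

# Permute along a randomly selected axis, then invert the permutation.
axis, perm = make_random_axis_permutation(volume.shape, rng)
shuffled = permute(volume, axis, perm)
unshuffled = unpermute(shuffled, axis, perm)
```

Both operations are exactly invertible and volume-preserving, which is the property a normalizing flow needs from such layers.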

LOW RESOLUTION 3D RECONSTRUCTION (LR-NET)
LR-net, shown in Fig. S4a, performs a low-axial-resolution reconstruction with an XLFM image and a low-resolution mean volume as input. It comprises the following parts:
• The perspective views corresponding to each microlens are cropped and stacked along the channel dimension.
• The views are passed to a 2D U-net, where the channel dimension encodes the volume's axial dimension.
• In parallel, the mean volume is passed through two ConvNext convolutional blocks and a global attention module in a residual multiplicative manner.
• The global attention module selects the relevant pixel-wise information by first flattening the image into a 1D array and then applying a Conv1D, a ReLU, a second Conv1D, and a sigmoid activation. The sigmoid outputs a value from 0 to 1 that multiplies (weights) the original pixel.

Table S1. Description of the fish used in this work and the cross-validation set in which each was used as the testing set.
Table S2. Improvement upon fine-tuning on 10 images of the testing set (columns 3-4), or by appending the ten images to the cross-validation training set (columns 5-6).

Fig. S1. Single conditional normalizing flow used within the CWFA, also present in Fig. 1 as CNF1, CNF2, etc. In blue, the CAT block is responsible for computing a scaling and translation from the condition and applying it to the input.

Fig. S2. Hyper-parameter ablation optimized towards the Pearson correlation coefficient. The selected architecture is highlighted with a red star.
Fig. S5. Neural activity comparison with different methods. (a) The MIP of the GT volume, with a subset of the active neurons highlighted. (b) A reconstructed frame for each method. (c) The neural potentials of a subset of 6 neurons over 100 frames (10 seconds). (d) The mean Pearson correlation coefficient (PCC) across the six cross-validation folds and the three methods, evaluated on the 50 most active neurons. Note that the PCC was measured directly on the 3D volumes, and only the 2D projection of the neuron coordinates is shown in (a).
• The U-net and ConvNext paths are added into a single volume, resulting in 512 × 512 × 96/2^n voxels, where n is the number of down-sampling steps in the CWFA (5 in our case).
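The global attention module listed among the LR-net parts can be sketched in NumPy as a sigmoid gate computed from the flattened image. The kernel size, the hidden channel count, the random weights, and the choice to keep the channel axis while flattening only the pixels are all assumptions made for illustration; the original module may differ in each of these.

```python
import numpy as np


def conv1d_same(x, weight, bias):
    """1D cross-correlation with zero padding that preserves the length.

    x: (C_in, L), weight: (C_out, C_in, K) with odd K, bias: (C_out,).
    """
    k = weight.shape[2]
    pad = k // 2
    padded = np.pad(x, ((0, 0), (pad, pad)))
    # windows has shape (C_in, L, K): one K-wide window per position.
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=1)
    return np.einsum("oik,ilk->ol", weight, windows) + bias[:, None]


def global_attention(x, w1, b1, w2, b2):
    """Flatten -> Conv1D -> ReLU -> Conv1D -> sigmoid -> weight the pixels."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                       # image as a 1D array
    hidden = np.maximum(conv1d_same(flat, w1, b1), 0.0)
    logits = conv1d_same(hidden, w2, b2)
    gate = 1.0 / (1.0 + np.exp(-logits))             # values between 0 and 1
    weighted = (flat * gate).reshape(c, h, w)
    return weighted, gate.reshape(c, h, w)


rng = np.random.default_rng(0)
channels, hidden_channels, kernel = 3, 4, 3          # hypothetical sizes
features = rng.standard_normal((channels, 8, 8))

w1 = 0.1 * rng.standard_normal((hidden_channels, channels, kernel))
b1 = np.zeros(hidden_channels)
w2 = 0.1 * rng.standard_normal((channels, hidden_channels, kernel))
b2 = np.zeros(channels)

weighted, gate = global_attention(features, w1, b1, w2, b2)
```

Because the gate lies between 0 and 1, each output pixel is the input pixel scaled down by its learned relevance, which matches the multiplicative weighting described for the module.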