Automatic phase aberration compensation for digital holographic microscopy based on deep learning background detection

We propose a fully automatic technique to obtain aberration-free quantitative phase imaging in digital holographic microscopy (DHM) based on deep learning. Traditional DHM solves the phase aberration compensation problem by manually detecting the background for quantitative measurement. This is a drawback for real-time implementation and for dynamic processes such as cell migration. A recent automatic aberration compensation approach using principal component analysis (PCA) in DHM avoids human intervention regardless of the cells' motion. However, it corrects spherical/elliptical aberration only and disregards the higher-order aberrations. Traditional image segmentation techniques can be employed to spatially detect cell locations. Ideally, automatic image segmentation techniques make real-time measurement possible. However, existing automatic unsupervised segmentation techniques perform poorly when applied to DHM phase images because of aberrations and speckle noise. In this paper, we propose a novel method that combines a supervised deep learning technique based on a convolutional neural network (CNN) with Zernike polynomial fitting (ZPF). The CNN performs automatic background region detection, which allows ZPF to compute the self-conjugated phase to compensate for most aberrations.


Introduction
Three-dimensional image retrieving techniques are important for many applications [1][2][3][4][5]. These techniques can roughly be divided into interferometric and non-interferometric techniques. Digital Holographic Microscopy (DHM) is an interferometric, non-invasive technique for acquiring real-time quantitative phase images which has an enormous impact in many fields such as biology of living cells [6][7][8][9], neural science [10], nanoparticle tracking [11], biophotonics, bioengineering and biological processes [12], microfluidics [13], and metrology [14,15]. A traditional DHM system records a digital hologram optically using a microscope objective (MO), and the image reconstruction is performed digitally using optical propagation techniques [16,17]. However, the use of an MO introduces phase aberrations which can be superposed over the biological sample (object). A successful image reconstruction requires very tedious alignment and precise measurement of the system parameters such as reference beam angle, reconstruction distance, and MO's focal length, which are often difficult to achieve in a laboratory environment. To overcome these difficulties, in a previous analysis, multiple-wavelength DHM and telecentric DHM configurations were employed, which allowed canceling the bulk of the optical phase aberration due to the MO and the reference beam [18]. Residual aberrations could be compensated digitally by using Principal Component Analysis (PCA) [19,20] or Zernike polynomial fitting (ZPF) [18,21,22]. However, the use of a multi-wavelength source makes the system setup more complicated and expensive. Also, the existing digital compensation techniques still have other drawbacks. ZPF requires background information to find the phase residual, which is detected semi-manually by cropping the background area to perform the fitting [18,21,22]. PCA, on the other hand, automatically predicts the phase residual by creating a self-conjugated phase to compensate for the aberrations, but it assumes that the phase aberrations have only linear and spherical components and leaves higher-order phase aberrations unaccounted for. Therefore, an automatic detection of the background areas in DHM would be highly desirable. Many segmentation techniques have been proposed, which can be divided into semi-automatic techniques such as active contour [23], region growing [24], graph cut [25], and random walker [26], which require predefined seeds, and fully automatic segmentation techniques such as edge-based [27], region-based [27], split and merge [28], and watershed techniques [29]. However, in the case of DHM, these existing methods are not reliable because of the overwhelming phase aberrations and speckle noise in the images.
In this work, we propose a combination of ZPF with a fully automatic background detection using a deep learning convolutional neural network (CNN) to compensate for all the phase aberrations without human intervention such as manual cropping. This technique works with single- and multiple-wavelength telecentric configurations. Theoretical and experimental results will be used to quantitatively assess growth and migratory behavior of invasive cancer cells.

Bi-telecentric DHM optical setup
Figure 1 shows the bi-telecentric digital holographic microscopy (BT-DHM) system in vertical transmission mode, suited for biological sample analyses. A similar setup can also be used in reflection mode. This setup is used in an afocal configuration, where the back focal plane of the MO coincides with the front focal plane of the tube lens, with the object placed at the front focal plane of the MO, resulting in the cancellation of the bulk of the spherical phase curvature normally present in traditional DHM systems. The optical beam from a HeNe laser travels through a spatial filter and periscope system and is collimated with a collimating lens to produce a plane wave beam. The collimated beam is split into a reference beam and an object beam, which is focused on the biological sample using an afocal configuration. The two beams, which are tilted by a small angle (<1°) from each other, are recombined using a beam splitter and interfere with each other on a CCD to generate an off-axis hologram. The magnification of the BT-DHM system is $M = -f_{TL}/f_{MO}$, where $f_{TL}$ and $f_{MO}$ are the focal lengths of the tube lens and the MO. The most common numerical reconstruction algorithms used in reconstructing digital holograms are the discrete Fresnel transform, the convolution approach, and the reconstruction by angular spectrum, which is defined as [15,17]:

$$u(\xi, \eta) = F^{-1}\left\{ F\{h(x, y)\}\, \exp\left[ j\,\frac{2\pi d}{\lambda} \sqrt{1 - (\lambda f_x)^2 - (\lambda f_y)^2} \right] \right\},$$

where $d$ is the distance between the image plane and the CCD, $h(x, y)$ is the hologram, $u(\xi, \eta)$ is the reconstructed image, $F$ is the Fourier transform operator, $\lambda$ is the wavelength, $f_x$ and $f_y$ are the spatial frequencies, and $L \times L$ are the dimensions of the CCD sensor. This is intuitively understood by realizing that the holographic recording is now simply a recording of the geometrically magnified virtual image located at distance $d$. Thus, the pixel resolution is automatically scaled accordingly. For a transmissive phase object on or between transmissive surfaces, the phase change due to the change in index $\Delta n$ over the optical thickness $T$ can be calculated as:

$$\Delta\varphi(\xi, \eta) = \frac{2\pi}{\lambda}\, \Delta n\, T(\xi, \eta),$$

where the phase due to the biological sample, $\Delta\varphi(\xi, \eta)$, is what remains of $\varphi(\xi, \eta)$, the total phase of the object beam without the bi-telecentric configuration, once the spherical phase curvature of the MO, with radius of curvature $R$, is removed. Conventional image reconstruction using Eq. (3) contains phase aberrations which can be mitigated with the image reconstruction pipeline shown in Fig. 2. First, the hologram is converted into the Fourier domain, and the +1 order component is extracted. Then the wrapped phase image is obtained by extracting the phase of the inverse Fourier transform of the cropped spectrum, followed by phase unwrapping. The unwrapped phase is fed into a trained Convolutional Neural Network (CNN) model to determine the background areas. The background information is then fed into the ZPF to calculate the conjugated phase aberration. Phase compensation is done in the spatial domain by multiplying the inverse Fourier transform of the cropped +1 spectrum order with the complex exponential term which contains the conjugated phase aberration. Then, the Fourier transform of the compensated hologram is centered and zero padded to the original image size. Finally, the angular spectrum reconstruction technique is performed to obtain the phase height distribution of the full-sized, aberration-free reconstructed hologram, as shown in Fig. 2.
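The angular spectrum step of the pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `angular_spectrum`, the sign convention of the exponent, the 5 µm pixel size and the 1 mm distance are all hypothetical choices, and evanescent components are simply set to zero.

```python
import numpy as np

def angular_spectrum(field, wavelength, pixel_size, distance):
    """Propagate a sampled complex field by `distance` (all lengths in metres)."""
    ny, nx = field.shape
    # Spatial-frequency grids matching the FFT layout of the sampled field.
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    propagating = arg > 0
    kz = 2.0 * np.pi * np.sqrt(np.where(propagating, arg, 0.0))
    # Transfer function; evanescent components are dropped (an assumption).
    transfer = np.where(propagating, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Toy check: a pure-phase field propagated forward and then backward.
rng = np.random.default_rng(0)
field = np.exp(1j * rng.uniform(-np.pi, np.pi, (64, 64)))
forward = angular_spectrum(field, 632.8e-9, 5e-6, 1e-3)
back = angular_spectrum(forward, 632.8e-9, 5e-6, -1e-3)
```

With these sampling values every spatial frequency is propagating, so propagating back by the negative distance recovers the input field to numerical precision.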
One of the crucial steps in the image reconstruction pipeline stated above is to train the proposed deep learning CNN, which requires a training data set of sub-sampled phase aberration images and their corresponding ground truth (label) images. Section 3 describes in detail the data preparation steps for training the CNN model, and Section 4 describes how the CNN model is implemented.

Biological sample and data preparation
The cancer cells from the highly invasive MDA-MB-231 breast cancer cell line were seeded on type I collagen hydrogels, polymerized at 4 mg/ml and a temperature of 37°C in 35 mm glass-bottomed petri dishes. The cells on collagen were incubated for 24 hours in DMEM medium containing 10% fetal bovine serum, in standard tissue culture conditions of 37°C, 5% CO2, and 100% humidity. Then, cells were taken from the incubator and imaged with the bi-telecentric DHM setup described above to produce phase reconstruction maps. Figure 3 shows the steps of data collection/preparation using a PCA method [19]. This process contained three separate parts: (a) collecting background phase aberrations (red path), (b) collecting a single cell data set which includes phase height distribution and binary mask (blue path), and (c) assembling the data that is input to the proposed CNN model (green path). Along the red path, random background phase aberrations were obtained while there was no sample present in the BT-DHM system. As shown in Fig. 1, MO1 and MO2 were both shifted up, down, and rotated to create different phase aberrations. Two hundred and ten holograms without a sample present were captured and reconstructed using the angular spectrum method. The background sub-sampled (256 × 256) phase aberration was reconstructed after using a band-pass filter around the +1 order (virtual image location) by using an inverse Fourier transform and phase unwrapping.
Along the blue path, forty holograms containing cancer cells were also reconstructed using the PCA method. For the training stage of the deep learning CNN, 306 single cells were manually segmented from those forty reconstructed holograms to obtain real phase distribution images and corresponding ground truth binary images (0 for background, 1 for cells). Then, each of the cells' phase distribution images, binary masks, and sub-sampled phase aberration images were augmented using flipping (horizontally and vertically) and rotating (90°, 180°, and 270°). Therefore, 1836 single cell phase distribution images, the corresponding 1836 single cell binary masks, and 1260 sub-sampled background phase aberrations were obtained. In order to create the training data set, 4-10 real phase maps of cells were added at random positions into each of the 1260 phase aberration images that contain no samples. It should be noted that the total phase is the integral of the optical path length (OPL). These phase maps were preprocessed with a moving average filter [5 × 5] to smooth out the edges due to the manual segmentation. Similarly, and corresponding to the same 4-10 random positions of the real phase maps, the ground truth binary masks were added to a zero background phase map to create the labeled data set. Note that different types of cells can produce different shapes. In our case, a future objective would be to quantitatively assess the growth and migratory behavior of invasive cancer cells, and hence cells from the invasive MDA-MB-231 breast cancer line were used for this purpose. Note that, for each type of cell, manual segmentation is only performed once, in the data preparation stage. Usually, deep learning CNN techniques require a certain amount of training data to produce good results. This additional overhead to collect and prepare the training data can be expensive. However, by augmenting 210 phase images (without sample present) and 306 cell images through flipping and rotation, we are able to create a training data set of 1260 phase aberration images and their corresponding ground truth images. Eighty percent of these images were randomly selected for training, and the rest of the images were used for validation.
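The augmentation and synthesis steps described above can be illustrated with a short NumPy sketch. It is a simplified sketch under stated assumptions: the helper names `augment` and `synthesize` are hypothetical, the toy "cell" is a disk of constant phase, overlapping cells are simply summed, and the 5 × 5 moving average smoothing is omitted.

```python
import numpy as np

def augment(image):
    """Return the image, its two flips and its three 90-degree rotations (6 views)."""
    return [
        image,
        np.fliplr(image),    # horizontal flip
        np.flipud(image),    # vertical flip
        np.rot90(image, 1),  # 90 degrees
        np.rot90(image, 2),  # 180 degrees
        np.rot90(image, 3),  # 270 degrees
    ]

def synthesize(background, cells, masks, rng, n_min=4, n_max=10):
    """Add 4-10 random cells to a sample-free aberration map and build its label."""
    phase = background.copy()
    label = np.zeros(background.shape, dtype=np.uint8)
    for _ in range(rng.integers(n_min, n_max + 1)):
        k = rng.integers(len(cells))
        h, w = cells[k].shape
        r = rng.integers(0, background.shape[0] - h + 1)
        c = rng.integers(0, background.shape[1] - w + 1)
        phase[r:r + h, c:c + w] += cells[k] * masks[k]  # phase adds along the path
        label[r:r + h, c:c + w] |= masks[k]             # 1 = cell, 0 = background
    return phase, label

# Toy data standing in for a measured aberration map and one segmented cell.
rng = np.random.default_rng(0)
background = rng.normal(size=(256, 256))
yy, xx = np.ogrid[:21, :21]
mask = ((yy - 10) ** 2 + (xx - 10) ** 2 <= 100).astype(np.uint8)
cell = 2.0 * mask
cells, masks = augment(cell), augment(mask)
phase, label = synthesize(background, cells, masks, rng)
```

Six views per image reproduce the counts quoted in the text: 306 × 6 = 1836 cell images and 210 × 6 = 1260 background aberration images.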

Deep learning convolutional neural network training
In this section, we describe the implementation of the deep learning CNN for automatic background detection in digital holographic microscopic images. The deep learning architecture contains multiple convolutional neural network layers [30,31], including max pooling layers, unpooling layers with rectified linear unit (ReLU) activation function, and batch normalization (BN) function, similar to the architecture used in [32]. Let $x^{(i)}$, $x'^{(i)}$, and $y^{(i)}$ denote the input data volume corresponding to the initial group of phase aberration images, the currently observed volume data at a certain stage of the CNN model, and the output data volume of the CNN model, respectively. The input and output data volumes, along with the ground truth images, have a size of (batchSize × imageWidth × imageHeight × channel), where batchSize is the number of images in each training session. In our model, the input volume has a size of (8 × 128 × 128 × 1) (1 channel indicates a grayscale image), whereas the output volume has a size of (8 × 128 × 128 × 2) (2 channels for 2 classes obtained from the one-hot encoding of the ground truth images [33]). The model shown in Fig. 4 is a simpler version of the U-Net model [32]. An output neuron in the U-Net model is computed through convolution operations (which we define as a convolution layer) with the preceding neurons connected to it, such that these input neurons are situated in a local spatial region of the input. Specifically, each output neuron in a neuron layer is computed by the dot product between its weights and a connected small region of the input volume, with the addition of the neuron bias:

$$x'^{(i)}_{l} = \sum_{j=1}^{M} W^{(j)}_{l}\, x'^{(j)}_{l-1} + B_{l}, \quad i = 1, 2, \ldots, N,$$

where $W$ is the weight, $B$ is the bias, $j$ is the index in the local spatial region, $M$ is the total number of elements in that region, $N$ is the total number of neurons in each layer, which can be changed depending on the architecture, and $l$ is the layer number. The U-Net model contains two parts: down-sampling (left half of Fig. 4) and up-sampling (right half of Fig. 4). After each convolutional layer, the ReLU activation function [34] and the BN function [35] were applied to effectively capture non-linearities in the data and speed up the training. In the down-sampling path, feature extraction is done through convolution, which transforms the input image into a multi-dimensional feature representation [31,36]. On the other hand, the up-sampling path is a shape generator that produces object segmentation from the extracted features obtained from the convolution path. ReLU activation improves the computational speed of the training stage of the neural network and avoids the "vanishing gradient" issue that arises with the sigmoidal function traditionally used for this purpose. The ReLU activation function we used is defined as in [34]:

$$\mathrm{ReLU}\left(x'^{(i)}\right) = \max\left(0,\, x'^{(i)}\right), \quad i = 1, 2, \ldots, N,$$

where $x'^{(i)}$ is the $i$th pixel in the volume data under training and $N$ is the total number of pixels in the volume data: N = batchSize × layerwidth × layerheight × channel, where layerwidth and layerheight are the width and height of the image at the $l$th layer, and channel is the number of weights $W$ in the $l$th layer.
On the other hand, batch normalization allows the system to: (a) have much higher learning rates, (b) be less sensitive to the initialization conditions, and (c) reduce the internal covariate shift [35]. BN can be implemented by normalizing the data volume to make it zero mean and unit variance, as defined in Eq. (8):

$$\hat{x}'^{(i)} = \gamma\, \frac{x'^{(i)} - E\left[x'^{(i)}\right]}{\sqrt{\mathrm{Var}\left[x'^{(i)}\right] + \varepsilon}} + \beta, \qquad (8)$$

where $E[\cdot]$ and $\mathrm{Var}[\cdot]$ are the mean and variance of the data volume, $\varepsilon$ is a regularization parameter (used to avoid the case of uniform images), $\gamma$ is a scaling factor, $\beta$ is the shifting factor, and $\hat{x}'^{(i)}$ is the output of the BN stage. The down-sampling and up-sampling were done using max pooling and unpooling, respectively. Max pooling is a form of non-linear down-sampling that eliminates non-maximal values and helps in reducing the computational complexity of upper layers by reducing the dimensionality of the intermediate layers. Also, max pooling was done in part to avoid overfitting. The unpooling operation is a non-linear form of up-sampling a previous layer by using nearest neighbor interpolation of the features obtained by max pooling, gradually resulting in the shape of the samples. Our deep learning CNN model has a symmetrical architecture with max pooling and unpooling filters, both with a 2 × 2 kernel size.
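As a rough illustration of the building blocks just described, the following NumPy sketch applies ReLU, batch normalization, 2 × 2 max pooling and nearest-neighbor unpooling to a volume of the stated size (8 × 128 × 128 × 1). The function names are hypothetical, the normalization statistics are taken per channel over the batch and spatial axes (an assumption), and the learned parameters γ and β are fixed to 1 and 0.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(x, 0.0)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each channel to zero mean and unit variance, then scale and shift."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def max_pool_2x2(x):
    """Keep the maximum of every non-overlapping 2 x 2 block."""
    b, h, w, c = x.shape
    return x.reshape(b, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

def unpool_2x2(x):
    """Nearest-neighbor up-sampling by a factor of 2 along both image axes."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 128, 128, 1))
y = batch_norm(relu(x))
pooled = max_pool_2x2(y)
restored = unpool_2x2(pooled)
```

Pooling halves the spatial size to 64 × 64 and unpooling restores 128 × 128, which mirrors the symmetrical down-sampling and up-sampling paths of the model.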
In our experiment, the Softmax function, a popular linear classifier defined in Eq. (9), was used in the last layer to calculate the prediction probability of background/cell potential as:

$$S\left(y^{(i)}\right) = \frac{e^{y^{(i)}}}{\sum_{j} e^{y^{(j)}}}, \quad i = 1, 2, \ldots, N, \qquad (9)$$

where the sum runs over the class channels of the same pixel and $N$ (8 × 128 × 128 × 2) is the number of pixels (neurons) that need to be classified in the segmentation process. An error is a discrepancy measure between the output produced by the system and the correct output for an input pattern. A loss value is the average of the errors between the predicted probability $S(y^{(i)})$ and the corresponding ground truth pixel $L^{(i)}$. In our study, the loss function was measured by using the cross entropy function, which is defined as:

$$Є = -\frac{1}{N} \sum_{i=1}^{N} L^{(i)} \log S\left(y^{(i)}\right).$$

The training algorithm was performed by iterating the process of feeding the phase aberration images in batches through the model and calculating the error Є, using an optimizer to minimize the error. The Stochastic Gradient Descent (SGD) optimizer was employed in the back propagation algorithm. Instead of evaluating the cost and the gradients over the full training set, SGD evaluates the values of these parameters using fewer training samples. The learning rate was initially set to 1e-2, the decay to 1e-6, and the momentum to 0.96 [37]. Other parameters used in this work were: a batch size of 8, an image size of 128 × 128 instead of 256 × 256 to avoid memory overflow (images are resized at the end), a channel depth of 32 at the first layer and 512 at the deepest layer, and training for 360 epochs. The proposed model was implemented in Python using the TensorFlow/Keras framework [38], and the implementation was GPU-accelerated with an NVIDIA GeForce 970M. The training and validation losses are shown in Figure 5.

Testing data and phase aberration compensation
To evaluate the performance of the proposed deep neural network and ZPF technique, 30 holograms recorded by the BT-DHM system (see Fig. 1) and reconstructed using the pipeline in Fig. 2 were tested. The background of a phase aberration image was first located, so that the background pixel representation can be used in the ZPF model [18]. As shown in Fig. 2, the unwrapped phase [39] is passed through the trained CNN model discussed in Section 4 (Fig. 2(g)) to produce the mask prediction $y^{(i)}$ in Eq. (9). The output of the model is normalized to the [0, 1] range and the threshold is set to 0.5 to classify the background and cell area, which yields the binary mask $B^{(i)}$ of Eq. (11). Figure 6 shows exemplary channels of selected layers in the down-sampled (top row) and up-sampled (bottom row) paths (see Fig. 4), respectively, for visualization. The image on the top left is the raw phase aberration. The next five images on the top row are the outputs of consecutive down-sampling layers. The first five images on the lower row are the outputs of the up-sampled layers, and the lower right image is the binary mask obtained using the threshold function defined in Eq. (11). The down-sampled layers contain the strong features of the image, such as the parabolic intensities and edges, while the up-sampled layers contain the shape of the cells.
In order to measure the conjugated background phase aberration, the pixels from the raw phase image are chosen corresponding to the background pixels' locations obtained from the binary image (where $B^{(i)} = 1$), then converted to a 1D vector to perform the polynomial fitting [18]. Then, the polynomial fitting was implemented using a 5th-order polynomial with 21 coefficients as:

$$S(x, y) = \sum_{i=0}^{5} \sum_{j=0}^{5} p_{ij}\, x^{i} y^{j}, \quad i + j \le 5,$$

where $p_{ij}$ are the coefficients, $i$ and $j$ are the polynomial orders, and $x$ and $y$ represent the pixel coordinates.
The $p_{i,j,z}$ matrix consists of coefficients corresponding to each order of the Zernike polynomials. The Zernike polynomial model is used to construct the conjugated phase, where the $Z_k$ coefficients are expressed according to the Zemax classification. After obtaining the background area from the CNN, the conjugated phase aberration was calculated using ZPF and then multiplied with the initial phase. To obtain the full-size aberration-compensated reconstructed image, zero padding and spectrum centering were performed on the Fourier transform of the aberration-compensated hologram. Then, the angular spectrum reconstruction technique was performed to obtain the phase height distribution of the full-sized, aberration-free reconstructed hologram, as shown in Fig. 2.
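The background-only fit described above can be sketched as an ordinary least-squares problem. The sketch below is illustrative and makes several assumptions: it fits plain monomials $x^i y^j$ with $i + j \le 5$ (21 terms) and does not convert them to Zernike coefficients, the names `poly_terms` and `fit_background` are hypothetical, coordinates are normalized to [-1, 1], a synthetic disk stands in for the CNN output, and compensation is done by subtracting the fitted surface from the unwrapped phase rather than by multiplying complex exponentials.

```python
import numpy as np

def poly_terms(x, y, order=5):
    """Stack the monomials x**i * y**j with i + j <= order (21 terms for order 5)."""
    return np.stack([x**i * y**j
                     for i in range(order + 1)
                     for j in range(order + 1 - i)], axis=-1)

def fit_background(phase, background_mask, order=5):
    """Fit the polynomial on background pixels only; evaluate it on the full grid."""
    ny, nx = phase.shape
    y, x = np.meshgrid(np.linspace(-1, 1, ny), np.linspace(-1, 1, nx), indexing="ij")
    A = poly_terms(x[background_mask], y[background_mask], order)  # (n_pixels, 21)
    coeffs, *_ = np.linalg.lstsq(A, phase[background_mask], rcond=None)
    return poly_terms(x, y, order) @ coeffs, coeffs

# Synthetic test image: smooth aberration plus one disk-shaped "cell".
ny = nx = 128
y, x = np.meshgrid(np.linspace(-1, 1, ny), np.linspace(-1, 1, nx), indexing="ij")
aberration = 3 * x**2 + 2 * y**2 + 0.5 * x * y + x
cell_region = x**2 + y**2 < 0.04
phase = aberration + 1.5 * cell_region

cell_probability = cell_region.astype(float)  # stand-in for the CNN prediction
background_mask = cell_probability < 0.5      # threshold at 0.5
surface, coeffs = fit_background(phase, background_mask)
compensated = phase - surface
```

Because the cell pixels are excluded from the fit, the fitted surface follows the aberration only, and the cell keeps its phase height after compensation while the background becomes flat.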
In the Dice coefficient (DC), |.| denotes the area, and A and A' are the segmented areas of a test image based on the deep learning CNN and on manual segmentation, respectively. The background's DC (0.9582-0.9898) is much higher than the cells' DC (0.7491-0.8764) because of the larger common area in the background. This lessens the effect of true negative and false positive scenarios in ZPF. Figure 9 shows the comparison between the PCA and CNN + ZPF techniques. The CNN + ZPF technique produces better results than the PCA technique in approximating the conjugated residual phase. Figures 9(a) and 9(c) show the phase compensation using PCA and CNN + ZPF, respectively. Figures 9(b) and 9(d) are the wrapped conjugated residual phases computed using PCA and CNN + ZPF, respectively. When the PCA technique is used, the residual phase, which contains an elliptical concentric pattern, is fitted using the least-squares method for the two dominant singular vectors corresponding to the first two dominant principal components. This does not compensate for all the distorted regions of the phase distribution. However, the CNN + ZPF technique takes advantage of the background area; the majority of the background information is fitted with higher orders (up to 5th order). Hence, the conjugated phase aberration looks more distorted because of those higher orders. Figure 9(e) shows the Zernike coefficients of the phase difference between the CNN + ZPF method and the PCA method, indicating the error in phase compensation when using the PCA method. Another example of testing data is shown in Fig. 10. The same cancer cell line was used, but the cells were adherent to the surface of a thin collagen hydrogel layer. MDA-MB-231 cells were placed on a collagen layer, fed with DMEM supplemented with 10% FBS, and incubated for one day to promote adhesion to collagen. Collagen polymerization conditions, at a concentration of 4 mg/ml and a polymerization temperature of 4°C, were set to produce a collagen network with large-diameter fibers [40]. The microscope stage was warmed to 37°C with a stage warmer, and the cell culture media was buffered with 10 mM HEPES. BT-DHM was able to capture phase reconstruction map features consistent with collagen fibers from gels formed at the above polymerization conditions. Due to the different temperatures during collagen polymerization (37°C versus 4°C), one image in the new data set has collagen fiber features not apparent in the CNN model training image set. However, the background region was correctly detected even with the introduction of the new features. Thus, the CNN + ZPF technique has higher accuracy in measuring the phase aberration (1.68 rad of flatness using PCA and 0.92 rad of flatness using CNN + ZPF), as shown in Fig. 10(e).
To further validate the proposed technique, a data set with more cancer cells than the training images in the CNN model was used (i.e., the training data set contains 4-10 cells in a single phase image). Figure 11 shows a typical result with a real phase image containing 15 cells. The CNN model managed to detect the background area regardless of the number of cells that appear in the image. The CNN model managed to learn representations and make decisions based on local spatial input. By scanning kernel filters spatially over the data volume, convolutional layers could detect the cells' region features spatially, which is better suited to enhance the ZPF performance, resulting in better phase aberration compensation. In Fig. 11(e), the dashed profile crosses 3 different cells, from left to right. The phase heights of the three cells are the same for both techniques. While the phase aberration remains visible for the 3rd cell using PCA, the aberration is cancelled using the proposed technique. The whole motivation is to ensure proper cell phase visualization for further analysis, without a phase offset error. Thus, ensuring a flat phase in the background is crucial for correct analysis. Hence, CNN + ZPF is a fully automatic technique that outperforms the PCA method in terms of accuracy and robustness, and can be implemented in real time [18,19].

Conclusion
We have proposed and demonstrated a combination of a deep learning convolutional neural network with the Zernike polynomial fitting technique to automatically compensate for the phase aberration in a DHM system. The technique benefits from PCA's ability to obtain the training data for the deep learning CNN model. The trained CNN model can be used as an automatic and in situ process of background detection and full phase aberration compensation. From the data testing stage, we noticed that even with images having new features that did not appear in the training process, the CNN managed to detect the background with high precision. While many image segmentation techniques are not robust when applied to DHM images due to the overwhelming phase aberration, the CNN segments the background spatially based on features, regardless of the number of cells and their unknown positions. Thus, the trained CNN technique in conjunction with the ZPF technique is a very effective tool that can be employed in real time for autonomous phase aberration compensation in a DHM system.
frequencies. In DHM we introduce an MO to increase the spatial resolution, which is computed according to Eq. (4). Due to the magnification $M$ introduced by the MO, the pixel sizes in the image plane, $\Delta\xi_{mag}$ and $\Delta\eta_{mag}$, scale according to Eq. (4), where $N$ is the number of pixels in one dimension, and $\Delta x$, $\Delta y$ denote the sampling intervals or pixel size, $\Delta x = \Delta y = L/N$.

Fig. 2. Image reconstruction pipeline with phase aberration compensation based on CNN + ZPF: (a, b, c) Hologram, Fourier spectrum of the hologram, and cropped +1 order spectrum, respectively. (d, e) Wrapped phase and unwrapped phase of (c), respectively. (f) CNN trained model, (g) output binary segmentation, (h) visualization of background detection, (i) Zernike polynomial fitting, (j) phase aberration calculated from ZPF, (k) Fourier spectrum after phase aberration compensation, (l) Fourier spectrum with zero padding and centering, (m) reconstructed phase map using angular spectrum, and (n) final unwrapped phase map.

Fig. 3. The pipeline of the data preparation process used to train the CNN model. The blue path collects data for single cell segmentations and binary masks. The red path collects sub-sampled phase aberrations. The green path shows how the data is fed to the CNN model.
Figure 5 shows the training and validation losses obtained over 360 epochs. Each epoch contains 126 batches of training data. The parameters were updated after each training batch. The training loss and the validation loss started at 0.48 and 0.2916, respectively. The results suggest that the training loss decreases quickly (i.e., the model learns quickly) during the first 50 epochs, while the validation loss decreases with random oscillations (i.e., a transitory period) over the same epochs. Note that the validation loss was slightly lower than the training loss from epoch 50 to epoch 220, which implies that the model learned slowly in this period. Between epochs 220 and 360, the validation loss was slightly higher than the training loss. Both values decreased slowly to 0.0256 and 0.0237, respectively.
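The loss values quoted above are cross-entropy values computed on Softmax outputs. A minimal NumPy sketch of that computation is given below; the function names are hypothetical, the loss is averaged per pixel (the normalization used in the experiments may differ by a constant factor), and the all-zero network output is only a stand-in for an untrained model. The last lines check the batch count: 80% of 1260 images in batches of 8 gives 126 batches per epoch.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last (class) axis, shifted for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, one_hot, eps=1e-12):
    """Mean over pixels of -sum_c L_c * log(S_c)."""
    return float(-np.mean(np.sum(one_hot * np.log(probs + eps), axis=-1)))

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(8, 128, 128))  # 0 = background, 1 = cell
one_hot = np.eye(2)[labels]                      # shape (8, 128, 128, 2)
logits = np.zeros((8, 128, 128, 2))              # uninformative network output
loss = cross_entropy(softmax(logits), one_hot)   # equals ln 2 for two classes

# Batches per epoch: 80% of 1260 images, batch size 8.
n_train = 1260 * 80 // 100
batches_per_epoch = n_train // 8
```

An uninformative two-class prediction gives a loss of ln 2 ≈ 0.693, which puts the reported starting and final values in context.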
Figures 7(a) and 7(b) show a typical manual segmentation and the CNN model's segmentation of the test image in Fig. 2. Figure 7(c) shows the Dice coefficient (DC), or F1 score, of the background area and the cell area for 9 typical cases in the test data. DC is computed according to the following equation:

$$DC = \frac{2\,|A \cap A'|}{|A| + |A'|}.$$
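For reference, the Dice coefficient of two binary masks is twice their common area divided by the sum of their areas, which can be computed as in the sketch below. The function name `dice` and the 8 × 8 toy masks are illustrative assumptions, not the test data.

```python
import numpy as np

def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

# Toy masks: the "CNN" cell is the "manual" cell shifted by one pixel.
manual = np.zeros((8, 8), dtype=bool)
manual[2:6, 2:6] = True
cnn = np.zeros((8, 8), dtype=bool)
cnn[2:6, 3:7] = True

cell_dc = dice(cnn, manual)          # 2 * 12 / (16 + 16) = 0.75
background_dc = dice(~cnn, ~manual)  # 2 * 44 / (48 + 48), about 0.917
```

Even in this toy case the background score is higher than the cell score, because the same boundary disagreement is spread over a much larger common area.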

Fig. 7. (a) Typical manual segmentation on the test image of Fig. 6, (b) CNN model's segmentation, and (c) background (BG) dice coefficient and cell dice coefficient on 9 cases of the test data.

Figure 8(a) shows a typical reconstruction of real data wrapped phase with aberrations. It is worth noting that the cells in this image do not appear in the training data set. This means that these holograms were not segmented in the data preparation process. Figure 8(b) shows the result of background detection using the deep learning CNN classification process. In this example, considerable differences between the training data and the real data were observed. Cells obtained from real data have smoother edges than the ones obtained in the training data. The CNN produces an intentional over-segmentation of the cell area, which is actually beneficial for background detection. Figure 8(c) is the result of applying (through multiplication) the binary mask on the unwrapped reconstructed phase containing aberrations. Then the phase aberration in the background region was fitted using ZPF to compute the residual phase, as shown in Fig. 8(d). Figure 8(e) shows the phase distribution after compensating in the spatial domain according to Fig. 2. Figure 8(f) shows the final result after phase unwrapping.


Fig. 9. (a) Phase compensation with PCA, (b) conjugated residual phase of (a), (c) phase compensation using CNN + ZPF, (d) conjugated residual phase of (c), (e) Zernike coefficients of the phase difference between the CNN + ZPF technique and the PCA technique using a $1/|\log|a_k||$ scale, and (f) profiles of the yellow dashed lines in (a) and (c), corresponding to the blue and red lines, respectively. Yellow bars denote the flatness of the region of interest.
Figure 9(f) shows the profiles of a diagonal dashed line (from bottom left to top right) of PCA's result of Fig. 9(a) and CNN + ZPF's result of Fig. 9(c).The two profiles have different bias phases; the background phase of CNN + ZPF has better flatness (1.35 rad and 0.65 rad) than PCA's background (corresponding to 2.4 rad and 0.95 rad) which can be seen inside the blue dashed rectangle.

Fig. 10. (a) Phase aberration, (b) unwrapped phase overlaid with the CNN's image segmentation mask, where the background (color denoted) is fed into ZPF, (c) conjugated residual phase using CNN + ZPF, (d) fibers are visible after aberration compensation and are indicated by blue arrows, and (e) phase profile along the dashed line in (d). Yellow bars denote the flatness of the region of interest.

Fig. 11. (a) Phase aberration, (b) unwrapped phase overlaid with the CNN's image segmentation mask, where the background (color denoted) is fed into ZPF, (c) conjugated residual phase using CNN + ZPF, (d) 3D phase after compensation, and (e) phase profile along the dashed diagonal line from left corner to right corner.