PhUn-Net: ready-to-use neural network for unwrapping quantitative phase images of biological cells

Abstract: We present a deep-learning approach for solving the problem of 2π phase ambiguities in two-dimensional quantitative phase maps of biological cells, using a multi-layer encoder-decoder residual convolutional neural network. We test the trained network, PhUn-Net, on various types of biological cells, captured with various interferometric setups, as well as on simulated phantoms. These tests demonstrate the robustness and generality of the network, even for cells of different morphologies or different illumination conditions than PhUn-Net has been trained on. In this paper, for the first time, we make the trained network publicly available in a global format, such that it can be easily deployed on every platform, to yield fast and robust phase unwrapping, not requiring prior knowledge or complex implementation. By this, we expect our phase unwrapping approach to be widely used, substituting conventional and more time-consuming phase unwrapping algorithms.


Introduction
The presence of phase ambiguities and their unwrapping algorithms constitute a wide field of research in both optical and non-optical imaging and sensing applications [1]. Specifically, the quantitative phase map measured using interferometric quantitative phase imaging (QPI) [2,3] suffers from this problem. This measured phase is proportional to the optical path delay (OPD), indicating the extent to which light was delayed when passing through the sample in relation to a clear medium. QPI yields inherently high contrast for isolated cells in vitro without the need for staining, along with valuable quantitative information regarding both the internal geometrical structure and the refractive index (RI) distribution of the sample [4]. These key advantages led to the thriving of QPI as a leading imaging modality for both biological and medical research in the past decade [5][6][7][8][9]. However, the phase of the wavefront, which carries this quantitative information, is encoded inside a complex exponent [10], such that the phase obtained by an interferometric setup [11,12] is wrapped, meaning that it is a modulo 2π function of the actual phase, where the quotient might be unknown [1,13]. Thus, for every pixel in which the sample is optically thicker than the wavelength, an ambiguous OPD value is obtained. Generally, the two-dimensional (2-D) phase unwrapping problem is ill-posed; nevertheless, it can be exactly solved when the ground truth (GT) phase profile satisfies the 2-D extension of Itoh's continuity condition [13], stating that the local spatial discrete gradient in the GT phase profile does not exceed an absolute value of π radians (so there is no aliasing). In our case, since the inverse problem at hand is unwrapping rather than denoising, we refer to the noisy version of the unwrapped phase as the GT phase.
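The wrapping operation and Itoh's continuity condition described above can be illustrated with a minimal NumPy sketch. The function names (`wrap`, `satisfies_itoh`, `itoh_unwrap_1d`) are hypothetical helpers introduced here for illustration and are not part of PhUn-Net; the 1-D integration of wrapped differences is the textbook form of Itoh's method, shown only to make the condition concrete.

```python
import numpy as np

def wrap(phase):
    """Wrap phase values into the interval (-pi, pi]."""
    return np.pi - (np.pi - phase) % (2 * np.pi)

def satisfies_itoh(phase):
    """Check the 2-D continuity condition on an unwrapped (GT) phase map:
    every difference between neighboring pixels is below pi in magnitude."""
    dy = np.abs(np.diff(phase, axis=0))
    dx = np.abs(np.diff(phase, axis=1))
    return bool(dy.max() < np.pi and dx.max() < np.pi)

def itoh_unwrap_1d(wrapped):
    """Unwrap a 1-D signal by integrating its rewrapped differences.
    Exact only when the underlying phase satisfies the continuity condition."""
    steps = wrap(np.diff(wrapped))
    return np.concatenate(([wrapped[0]], wrapped[0] + np.cumsum(steps)))
```

When the condition holds, the rewrapped difference between neighboring pixels equals the true difference, so summing these differences recovers the phase up to the value of the first pixel. A single pixel pair that violates the condition propagates a 2π error along the integration path, which is why path selection matters in 2-D.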
Owing to the multiple possible unwrapping paths enabled for the 2-D problem, the unwrapped phase profile can be found in a satisfactory manner even in cases where the continuity condition is not met in few pixels, as is often the case, by various path-based phase unwrapping algorithms [1]. Alternatively, the task of phase unwrapping can be formulated as an inverse problem in a path-independent way, minimizing the error between the intermediate phase reconstruction and the measured phase with various regularization terms [14]. Nevertheless, the state of the art algorithms are computationally heavy, not readily available in all software platforms, and often difficult to implement.
Artificial intelligence (AI) has been used in the last few decades in many areas of science and technology, to solve difficult combinatorial problems. The AI stochastic search is a group of techniques that uses probabilistic methods relying on randomized decisions, including algorithms such as simulated annealing, neural networks, and genetic algorithms [15]. Several attempts have been made to solve the phase unwrapping problem using AI stochastic search techniques [16][17][18][19][20][21][22][23][24]; yet these algorithms have not been challenged by the variety of vexing phase unwrapping situations occurring in actual data analysis, including steep spatial gradients, but rather were usually given only simple scenarios [15]. Specifically, artificial neural networks have the potential to yield a robust solution to the phase unwrapping problem, since they can learn the characteristics of the input data, and if trained properly, use this information to unwrap the phase more accurately while ignoring noise. In the past, several attempts have been made to apply phase unwrapping using neural networks both in one dimension [18,19] and in two dimensions [20][21][22][23][24] for various applications; yet up until recently, all suggested neural networks were shallow, consisting of fewer than 5 layers. In the past couple of years, the concept of deep learning has emerged as a gold-standard solution to many types of problems in endless fields [25], where the revolution lies in the use of hundreds of hidden layers, consisting of millions of parameters, which is enabled by recent computational advancements. Specifically in the field of image processing, deep convolutional neural networks have revolutionized problems ranging from basic classification [26] and segmentation [27] to complex inverse problems in imaging [28][29][30][31][32][33][34][35].
For the latter case of inverse problems, the residual neural network (ResNet) architecture [36], which adds short-term memory to each layer, has taken the lead due to its ability to force the network to learn new information in every layer beyond what is already encoded in the network.
Harnessing recent advances, both in software and in hardware, we have recently explored different deep learning-based approaches for solving the problem of phase unwrapping in objects with high spatial gradients (not meeting Itoh's continuity condition) via simulation [37], answering the fundamental question of whether this problem should be addressed as an inverse problem or as a semantic segmentation problem (finding the missing quotient). We have come to the conclusion that the most accurate results can be obtained by treating the phase unwrapping problem as an inverse problem, and then rounding the result to the closest integer addition of 2π, yielding an effective segmentation operation, as was previously proposed by Pritt, to account for inconsistency [38]. Others have then demonstrated a combined framework, where the phase unwrapping problem is treated as a semantic segmentation problem, followed by clustering-based smoothness optimization, demonstrated on simulative data [39]. Their work further supported our conclusion that a segmentation neural network on its own is unable to yield satisfactory unwrapping results. Wang et al. have recently proposed an intelligent simulative database generation method for phase-type objects [40], enabling them to train a neural network that generalizes well to noisy and aliased experimental data, thus tackling one of the main challenges of the method, which is the inaccessibility of experimental GT. Nevertheless, this network is not publicly available, and thus cannot be tested on other types of biological cells. The concept of deep-learning phase unwrappers was recently demonstrated for other applications as well, including 2-D phase unwrapping in optical metrology [41,42] and lens-free imaging [43], as well as temporal phase unwrapping in fringe projection profilometry [44].
In the current paper, we present a phase unwrapping procedure based on a deep encoder-decoder residual convolutional neural network, dubbed PhUn-Net (Phase Unwrapping Network). We trained this network on experimental data composed of thousands of biological cells, and demonstrate its generalized capabilities on various unseen types of biological cells. The trained network is made here publicly available in a global format (Dataset 1 [45]) such that, in contrast to previous deep learning phase unwrappers as well as conventional phase unwrappers, PhUn-Net can be easily deployed on every platform to yield fast and robust phase unwrapping of QPI data of biological cells.

Methods
The actual phase profile undergoes a modulo 2π transformation in the recording process, constituting the phase unwrapping problem. However, in practice, the relation between the original and recorded phase profiles is not quite as trivial as an integer addition of 2π. The recorded phase profile is also subjected to various types of noise, depending on the physics of the recording system. For example, in digital holographic microscopy, this noise includes speckle noise, shot noise, and readout noise, as well as typical aberrations caused by an inhomogeneous illumination. Viewing the problem of reconstructing the original phase from the recorded phase as an inverse problem, as we have found to be the optimal approach [37], actually means trying to find the pixel values that globally minimize the error. This is similar to minimum-norm methods, where we try to minimize the norm of the differences between the gradient of the intermediate phase reconstruction and that of the measured phase [14]. Using this approach, the pixel values in the output of the neural network are continuous, and not restricted to have an exact modulo 2π relation to the input phase map, such that no precise solution is to be expected per-pixel. To further improve the accuracy of the results, we add a simple post-processing step, which effectively rounds the value in each pixel of the solution to the closest integer addition of 2π to the input, thus yielding an effective segmentation operation. This step, first suggested by Pritt [38] to achieve a congruent solution that yields the input phase upon rewrapping the solution, can be achieved by subtracting the initial solution from the wrapped input phase, rewrapping the result of the subtraction to (-π, π], and then adding it to the initial solution. 
Thus, we allow a complex relation with many degrees of freedom between the input and output in the network itself, which is not restricted to a modulo 2π operation, but restrict the final, post-processing output to be consistent with the input in terms of wrapping cycles, since the inverse problem at hand is purely unwrapping.
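The congruence post-processing step described above (subtract the initial solution from the wrapped input, rewrap to (-π, π], and add the result back to the initial solution) can be sketched in a few lines of NumPy. The function names are hypothetical and chosen here for illustration; only the arithmetic follows the description in the text.

```python
import numpy as np

def wrap(phase):
    """Wrap phase values into the interval (-pi, pi]."""
    return np.pi - (np.pi - phase) % (2 * np.pi)

def make_congruent(wrapped_input, raw_solution):
    """Force the raw network output to differ from the wrapped input
    by an integer multiple of 2*pi in every pixel."""
    # Rewrapped residual between the measured phase and the raw solution.
    residual = wrap(wrapped_input - raw_solution)
    # Adding the residual snaps each pixel to the nearest value that is
    # congruent with the input, so rewrapping the result returns the input.
    return raw_solution + residual
```

The result is exact in any pixel where the raw solution is within π radians of the true unwrapped value, which is why an approximate continuous solution from the network is sufficient.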

Network architecture
To implement a neural network fit for solving an inverse problem, we chose to use the residual neural network (ResNet) concept [36], with an encoder-decoder architecture, which has taken the lead in this field. A diagram of the suggested deep neural network architecture is shown in Fig. 1.
The main differences between our suggested network and the more commonly used networks include the use of a leaky ReLU activation function for training speed and dealing with the vanishing gradient problem, the lack of batch normalization, and the use of a learned upsampling procedure that maintains the overall number of pixels [33]. The input to the network is a wrapped phase image of size 512×512, and its output is an unwrapped phase image of the same size. The connection weights were trained using the adaptive moment estimation (ADAM) optimizer, on the mean squared error between the network output and the GT unwrapped phase images (L2 norm), with an initial learning rate of 0.0001, over 29 epochs. The entire training process took 496 minutes on NVIDIA's Tesla K40c GPU.

Dataset for training and validation
For training and validation, we cropped 512×512 pixel regions out of 3275 1024×1024 pixel transmission-mode quantitative phase images of biological cells of human source in a watery medium, taken with our various experimental interferometric phase microscopy systems, creating 7826 images. Each of the images used by the network contained 512×512 pixels, which is typically suitable for imaging at least a single cell, even if using a large magnification of 60× or 100×. 7500 of these images were used for training, and the remaining 326 were used for validation. The cells used for training and validation were imaged under partially coherent light. Specifically, the sperm cells were prepared and imaged as described in [46], using the off-axis tau interferometer [47], whereas all other cells were prepared and imaged as described in [8], using a Mach-Zehnder off-axis interferometer. In contrast, part of the test set, presented in the Results section, includes cells with different morphologies acquired under highly-coherent light.
Image pre-processing: All quantitative phase images were extracted from the corresponding interferograms using the Fourier transform filtering reconstruction algorithm, which includes performing a digital Fourier transform, cropping one of the two conjugate cross-correlation terms, and then performing an inverse Fourier transform [10]. Since we acquired optimal off-axis interferograms, in which the cross-correlation term occupies no more than 1/4 of the bandwidth, achieved by adjusting the magnification according to the numerical aperture (NA) of the microscope objective and the pixel size of the camera [48], the cropping step could be performed automatically, without further optimization of the cropping window size. The phase argument of the resulting complex wavefront is the initial phase profile of the sample. To prepare this phase profile for unwrapping, we subtracted from it an object-free phase map acquired with the same illumination, and, if needed, also a fitted inclined plane, to achieve a flat background. For training, each respective ground-truth phase map was obtained by applying the reliability-based 2-D phase unwrapping algorithm [49], which ensures a modulo 2π relation to the input, and zeroing negative values in the resultant unwrapped phase map (to minimize artifacts).
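The Fourier transform filtering steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the square cropping window, and the convention that the caller supplies the location of the cross-correlation peak in the centered spectrum are all assumptions made here.

```python
import numpy as np

def extract_wrapped_phase(hologram, carrier, half_width):
    """Recover the wrapped phase from an off-axis hologram.

    carrier    : (row, col) index of the cross-correlation peak in the
                 fftshift-centered spectrum (assumed known by the caller).
    half_width : half the side of the square cropping window, in pixels.
    """
    # Step 1: digital Fourier transform, with zero frequency at the center.
    spectrum = np.fft.fftshift(np.fft.fft2(hologram))
    # Step 2: crop one cross-correlation term; the carrier peak lands at the
    # center of the cropped window, which removes the off-axis carrier.
    r, c = carrier
    crop = spectrum[r - half_width:r + half_width,
                    c - half_width:c + half_width]
    # Step 3: inverse Fourier transform of the cropped term.
    field = np.fft.ifft2(np.fft.ifftshift(crop))
    # The argument of the complex wavefront is the wrapped phase.
    return np.angle(field)
```

Note that cropping reduces the pixel count: a window whose side is 1/4 of the spectrum, consistent with the bandwidth statement above, turns an N×N hologram into an (N/4)×(N/4) phase map. The background subtraction described in the text would be applied to this output before unwrapping.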

Results
The calculation time using PhUn-Net was ∼0.35 seconds for 512×512 inputs using Intel's Xeon E5-1620 v3 3.50 GHz 32.0 GB RAM 64 bit CPU, and only ∼0.033 seconds using NVIDIA's Tesla K40c GPU. The trained PhUn-Net is provided as an Open Neural Network Exchange (ONNX) file, which can run on many platforms, and is attached as a data file with this paper (Dataset 1 [45]).
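As a rough illustration of how an ONNX file of this kind might be run, the sketch below uses the ONNX Runtime Python package. Several details are assumptions that are not stated in the paper and should be checked against the actual file: the model path passed by the caller, the 1×1×512×512 (batch, channel, height, width) input layout, the float32 data type, and the absence of any input normalization. The helper names are hypothetical.

```python
import numpy as np

def to_network_input(wrapped_phase):
    """Shape a 512x512 wrapped phase map as a float32 batch of one image.
    The NCHW layout used here is an assumption; inspect the model's
    declared input shape before relying on it."""
    arr = np.asarray(wrapped_phase, dtype=np.float32)
    if arr.shape != (512, 512):
        raise ValueError("expected a 512x512 wrapped phase image")
    return arr[np.newaxis, np.newaxis, :, :]

def run_phun_net(model_path, wrapped_phase):
    """Run the ONNX model on the CPU and return the raw unwrapped estimate."""
    import onnxruntime as ort  # imported lazily; only needed for inference
    session = ort.InferenceSession(model_path,
                                   providers=["CPUExecutionProvider"])
    # Query the input name from the model instead of hard-coding it.
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: to_network_input(wrapped_phase)})
    return np.squeeze(outputs[0])
```

Calling `session.get_inputs()[0].shape` on the loaded model is a quick way to confirm or correct the assumed layout. The raw output would then go through the rounding post-processing step described in the Methods section.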
Since the final result (after the post-processing step) is effectively a semantic segmentation map multiplied by 2π and added to the input, we quantified unwrapping success by the accuracy of the segmentation. As commonly performed in semantic segmentation, we define accuracy as the percentage of correctly classified pixels out of the total number of pixels in the image.
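The accuracy measure defined above can be written as a short NumPy sketch. The function names are hypothetical; the computation compares the per-pixel integer number of 2π cycles added to the wrapped input by the post-processed result and by the GT.

```python
import numpy as np

def wrap_count(unwrapped, wrapped):
    """Integer number of 2*pi cycles added to each pixel of the input."""
    return np.rint((unwrapped - wrapped) / (2 * np.pi)).astype(int)

def unwrapping_accuracy(result, ground_truth, wrapped):
    """Percentage of pixels whose wrap count matches that of the GT."""
    correct = wrap_count(result, wrapped) == wrap_count(ground_truth, wrapped)
    return 100.0 * correct.mean()
```

For a 512×512 image, a single misclassified pixel lowers this accuracy by roughly 0.0004 percentage points, which gives a sense of scale for the values reported below.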
In order to test PhUn-Net, we first present the results obtained for unwrapping unseen phase images of the same type as the network was trained on (in terms of cell type and optical setup). Figure 2 presents a comparison between the results obtained for unwrapping phase images of human sperm cells in a watery medium (similar to the ones of the training set) using PhUn-Net and using the robust reliability-based 2-D phase unwrapping algorithm [49]. Figure 2(a) presents the network input, Fig. 2(b) presents the raw network output, and Fig. 2(c) presents the network output after it has been made consistent with the input by rounding to the closest integer addition of 2π. Figure 2(d) presents the GT, obtained by the robust reliability-based 2-D phase unwrapping algorithm [49], and Fig. 2(e) presents the binary difference between Fig. 2(d) and Fig. 2(c) (every entry of the original difference is an integer multiple of 2π, due to the rounding post-processing step). Each row in Fig. 2 features a different sample, where the top row presents an empty slide, used to verify a lack of hallucinations, meaning that an empty input results in an empty output, i.e. without ghost cells. Indeed, the network does not generate ghost cells in irrelevant locations, as is evident from multiple tested empty slides; however, several erroneous pixels are present at the left border of the image, caused by the convolutional nature of the solution. Rows 2-4 in Fig. 2 present human sperm cells in a watery medium, including both isolated and aggregated cells, inducing low gradients, some presenting imaging artifacts. As can be seen, the images were properly unwrapped, which in this case did not include adding any integer multiples of 2π. Even though the left part of the bottom image, exhibiting a notable artifact induced by poor interference, presented some local unwrapping errors, it did not affect the rest of the image, leaving the cells unharmed. As can be seen from Fig. 2, while the network itself is not sufficient to get accurate results, it approximates the solution well enough such that a simple post-processing rounding operation is enough to get excellent results.
Figure 3 presents a comparison between the results obtained for unwrapping phase images of cancer cells in a watery medium (similar to the ones of the training set) using PhUn-Net and using the robust reliability-based 2-D phase unwrapping algorithm [49]. Figure 3(a) presents the network input, Fig. 3(b) presents the raw network output, Fig. 3(c) presents the network output after it has been made consistent with the input by rounding to the closest integer addition of 2π, Fig. 3(d) presents the GT, obtained by the unwrapping algorithm [49], and Fig. 3(e) presents the binary difference between the latter two. Each row in Fig. 3 features a different sample, where rows 1-3 are colorectal adenocarcinoma colon (SW-480) cells, rows 4-5 are melanoma skin (WM-115) cells, and rows 6-7 are metastatic melanoma skin (WM-266-4) cells. All types of cells present successful unwrapping, excluding a small number of isolated pixels near the borders.
Next, in order to further validate the ability of PhUn-Net to generalize for unseen phase images, we also present the results obtained by unwrapping phase images of various types and morphologies of cells unseen by the network, imaged under coherent light (as opposed to partially coherent during training).
Breast cancer cells from an MDA-MB 468 cell line were imaged during flow using the off-axis tau interferometer [47], under a 7.9 mW, 633 nm helium-neon laser. Figure 4 presents a comparison between the results obtained for unwrapping phase images of these cells using PhUn-Net and the robust reliability-based 2-D phase unwrapping algorithm [49]. Figure 4(a) presents the network input, Fig. 4(b) presents the raw network output, Fig. 4(c) presents the network output after it has been made consistent with the input by rounding to the closest integer addition of 2π, Fig. 4(d) presents the GT, obtained by the unwrapping algorithm [49], and Fig. 4(e) presents the binary difference between the latter two. As can be seen, the network yields an erroneous reconstruction in a small, isolated, group of pixels, mainly in the image borders, but also at the center of the image displayed in the second row.
Yeast cells (Saccharomyces cerevisiae) were imaged in suspension during their reproduction by budding using the off-axis tau interferometer [47], under a 532 nm wavelength coherent diode-pumped laser (Compass 2158M-50). Figure 5 presents a comparison between the results obtained for unwrapping phase images of yeast cells using PhUn-Net and the robust reliability-based 2-D phase unwrapping algorithm [49]. Figure 5(a) presents the network input, Fig. 5(b) presents the raw network output, Fig. 5(c) presents the network output after it has been made consistent with the input by rounding to the closest integer addition of 2π, Fig. 5(d) presents the GT, obtained by the algorithm [49], and Fig. 5(e) presents the binary difference between the latter two. As can be seen, the network yields an erroneous reconstruction only in a small number of pixels, most of which are in the noisy upper left border of the image displayed in the third row.
Red blood cells were imaged during flow under an external flipping interferometric module [50] under a 10 mW, 633 nm helium-neon laser. Figure 6 presents a comparison between the results obtained for unwrapping phase images of these cells using PhUn-Net and the robust reliability-based 2-D phase unwrapping algorithm [49]. Figure 6(a) presents the network input, Fig. 6(b) presents the raw network output, Fig. 6(c) presents the network output after it has been made consistent with the input by rounding to the closest integer addition of 2π, Fig. 6(d) presents the GT, obtained by the algorithm [49], and Fig. 6(e) presents the binary difference between the latter two, exhibiting a nearly perfect reconstruction (other than a single pixel), even in difficult areas with overlaps. The test set presented in Figs. 4-6 introduces not only different morphologies, but also significantly higher speckle noise relative to the training set, due to the use of coherent light sources. As can be seen, all types of cells tested were properly unwrapped by PhUn-Net, minimally harmed by the notable reduction in image quality. To further validate the performance of the suggested network, we created a simulated phantom, where the GT is known, and added various levels of noise, as can be seen in Fig. 7. The simulated phase phantom was encoded in an off-axis hologram and quantized to 8 bits, to simulate realistic recording and extraction. The first row presents the phantom in an ideal state, without noise, resulting in a 100% accurate reconstruction. The second row in Fig. 7 presents a phase image subjected to diffraction, caused by a microscope objective with NA = 1.34. The third row in Fig. 7 presents a phase image with an added non-flat illumination surface, in addition to diffraction.
Both of the above cases still yield perfect reconstruction. The fourth row presents a phase map with all of the above artifacts, as well as shot (Poisson) noise, resulting in a minor decline in accuracy (<0.01%). The fifth row presents a phase map with added speckle noise with variance 0.025 on top of all other artifacts, surprisingly still yielding 100% accuracy. Finally, the sixth row presents a phase map with added speckle noise with variance 0.15 on top of all other artifacts, again resulting in only a minor decline in accuracy (<0.01%). To further test the dependency of the reconstruction accuracy on the speckle variance, we checked multiple speckle variance levels, ranging from 0.01 to 0.5 in 0.005 increments, as shown in Fig. 8. While the accuracy is indeed reduced with the rise of speckle variance, even with very high speckle noise the accuracy is still high, reaching a lower bound of 99.97%. Finally, after having seen that training PhUn-Net on partially coherent data generalizes well to coherent data, we wanted to check the opposite case. For that purpose, we used the exact same neural network described above with the same basic training set, only this time speckle noise with variance 0.05 was synthetically added to all training images. The results of applying this network to some of the partially coherent images presented in Figs. 2-3 are given in Fig. 9. As is shown in Fig. 9, this network gave similar results to the one trained on the original dataset.
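The speckle-variance sweep described above could be set up along the following lines. The paper does not specify the noise model, so this sketch assumes multiplicative speckle of the form J = I + n·I, with n drawn from a zero-mean uniform distribution of the requested variance; the function name and the choice of a seeded NumPy generator are likewise assumptions made for illustration.

```python
import numpy as np

def add_speckle(image, variance, rng):
    """Add multiplicative speckle noise, J = I + n * I, where n is uniform
    with zero mean and the given variance (assumed noise model)."""
    # A uniform variable on [-a, a] has variance a**2 / 3.
    half_range = np.sqrt(3.0 * variance)
    noise = rng.uniform(-half_range, half_range, size=image.shape)
    return image + image * noise

# Variance levels from 0.01 to 0.5 in 0.005 increments (99 values).
variances = np.linspace(0.01, 0.5, 99)
```

In a full experiment, each variance level would be applied to the simulated data, the phase would be extracted and unwrapped, and the accuracy would be recorded per level to produce a curve like the one in Fig. 8.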

Discussion and conclusions
The potential of artificial neural networks in yielding a robust solution to the phase unwrapping problem was identified decades ago, due to their ability to learn the characteristics of the input data, and if trained properly, use this information to unwrap the phase more accurately while ignoring noise; yet until recent years, technological setbacks limited the complexity of the networks that could be trained and prevented them from being a feasible solution. Deep learning has changed the computational arena in countless fields. Specifically in the field of image processing, deep convolutional neural networks have revolutionized problems ranging from basic classification and segmentation to complex inverse problems in imaging.
In this paper, we trained a 117-layers-deep residual convolutional neural network to solve the phase unwrapping problem for transmission-mode quantitative phase images of biological cells of human source in a watery medium. We made this network publicly available (Dataset 1 [45]).
Even though the network was trained on phase images acquired using a low-coherence light source, it generalizes well to noisier images acquired using a coherent light source, with various cell morphologies.
Possible limitations of PhUn-Net are discussed in this paragraph. First, since the network was trained on phase maps of cells in a watery medium, it is not expected to be suitable for completely dried cells, as those induce significantly higher gradients than those the network was trained on, which may cause errors in specific locations. Second, PhUn-Net is not suitable for reflection microscopy. Third, due to its convolutional nature, the network is less reliable near the edges of the image, thus it is best practice to center the 512×512-pixel FOV on the region of interest, which is typically suitable for at least one cell even under larger magnifications. Additionally, other input sizes cannot be used with PhUn-Net; yet similar networks can be trained for all input sizes, as well as for other scenarios, using the deep learning approach developed in this paper. Finally, while it is able to endure noise, as demonstrated by additional simulation results, PhUn-Net is prone to failure when the background of the input is not flat, thus it is crucial to acquire a sample-free hologram to compensate for beam curvatures and other illumination inhomogeneities. However, this referencing step is quite trivial. Even in the absence of a proper background hologram, this step can be replaced by numerical surface fitting schemes that account for wrapping artifacts, or even by a separate dedicated neural network that runs prior to PhUn-Net.
[Figure caption] Each colorbar corresponds to the images above it. The accuracy for rows 1-6 is: 99.6%, 100%, 99.95%, 100%, 99.99%, 99.99%, respectively.
Even though we chose an architecture treating the phase unwrapping problem as an inverse problem, thus not expecting an exact modulo 2π transformation from the input to the output, the network learned its statistical transformation from the GT, which, in our case, did have a theoretical exact modulo 2π relation to the input. Nevertheless, as can be seen from the results, the network did not learn this exact modulo 2π transformation, and needed the simple post-processing rounding to achieve accurate results, due to its global-minimization nature. We believe this is still the better choice of implementation, since if we had chosen a semantic-segmentation architecture, which simply outputs an integer for each pixel (i.e. the quotient), the result would suffer from severe local errors [37]. Moreover, the separation of the segmentation step from the network allows each end-user to choose whether or not they want to apply it, thus effectively choosing between local and global minimization.
Note that the GT used for training PhUn-Net was a specific phase unwrapping algorithm [49], which worked well for the phase maps used. Nevertheless, the same deep learning approach can be trained with other GT images, obtained by running a different phase unwrapping algorithm, a combination of several unwrapping algorithms, or even using a synthetic wavelength [51], to obtain a more robust network.
To conclude, we presented PhUn-Net, a deep residual convolutional neural network able to match the performance of a robust phase unwrapping algorithm for biological cells in a watery medium. This neural network, which is now made publicly available, can be easily deployed in almost every programming framework, not requiring prior knowledge or language-oriented efficient implementation of complex phase unwrapping algorithms. Furthermore, with dedicated training including difficult data and coinciding GTs, the same approach can be made applicable to cases that cannot be solved trivially using standard algorithms. The unwrapping time using PhUn-Net is only ∼0.033 seconds for 512×512 inputs using a Tesla K40c GPU, and thus is highly applicable for real-time data processing.
To further facilitate the relevance of PhUn-Net, we urge researchers in the field to send us their QPM data, including non-trivial cases, possibly accompanied with GT images acquired experimentally, and we will release an updated version of the network for public use every time sufficient data has been accumulated. We hope that the QPM community will adopt and widely use the publicly available PhUn-Net.
Note: Please send an email to omni@eng.tau.ac.il with a link to a drive where your wrapped and unwrapped quantitative phase maps are located. PhUn-Net will be trained with this data and the updated version of the network will be released at www.eng.tau.ac.il/∼omni as soon as enough training data is accumulated.