Automated single cardiomyocyte characterization by nucleus extraction from dynamic holographic images using a fully convolutional neural network

: Human-induced pluripotent stem cell-derived cardiomyocytes (hiPS-CMs) beating can be eﬃciently characterized by time-lapse quantitative phase imaging (QPIs) obtained by digital holographic microscopy. Particularly, the CM’s nucleus section can precisely reﬂect the associated rhythmic beating pattern of the CM suitable for subsequent beating pattern characterization. In this paper, we describe an automated method to characterize single CMs by nucleus extraction from QPIs and subsequent beating pattern reconstruction and quantiﬁcation. However, accurate CM’s nucleus extraction from the QPIs is a challenging task due to the variations in shape, size, orientation, and lack of special geometry. To this end, we propose a novel fully convolutional neural network (FCN)-based network architecture for accurate CM’s nucleus extraction using pixel classiﬁcation technique and subsequent beating pattern characterization. Our experimental results show that the beating proﬁle of multiple extracted single CMs is less noisy and more informative compared to the whole image slide. Applying this method allows CM characterization at the single-cell level. Consequently, several single CMs are extracted from the whole slide QPIs and multiple parameters regarding their beating proﬁle of each isolated CM are eﬃciently measured.


Introduction
Digital holographic microscopy (DHM) providing quantitative phase images (QPIs), is a promising tool in the field of label-free and computer-aided assessments [1][2][3]. The benefits of numerical reconstruction of digitally recorded holograms made it possible to study micrometer-sized living cells [4][5][6]. DHM demonstrating the capability of monitoring cell's dry mass is particularly well suited to achieve label-free screening of living cells [7]. Cardiomyocytes (CMs) are the main contractile elements of the heart leading to pump the blood to the entire body through the vessels by an orchestrated contraction-relaxation cycle coordinated by electrical stimuli generated by pacemaker cells [8]. CMs change their shape rapidly within a beat cycle with meaningful intermediate events named contraction and relaxation. Human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) are elastic elements for modeling human disease, cardiotoxicity, and therapy in vitro to regenerate viable cardiac tissue due to the limited capability of the heart to regenerate functional cardiac tissue [9]. There are several critical issues regarding stem cell therapy including the transfer and survival of the implanted stem cells in the myocardium [10]. During the CM beating activity, the cell tissue fiber consecutively shortens and lengthens. The fiber flexibility is essential for the cell's proper functionality. While cardiac cell's beating activity, dry mass redistribution can be monitored with DHM. The authors in [11,12] used label-free holographic microscopy for quantitative evaluation of cell's volume variation after

Digital holographic microscopy
The off-axis DHM is used to obtain the phase image of CMs in this study. In the off-axis configuration, the laser source is divided into an object and reference beams (see Fig. 1). The microscope objective (MO) magnifies the object wave and CCD camera records the hologram generated by the interference of the magnified object wave (O) and the reference wave (R) incident  at a small angle "θ" to provide the off-axis property. Figure 2(a) shows the recorded hologram of CMs, which can be expressed as the sum of four terms: where |R| 2 is the intensity of the reference wave and |O| 2 is that of the object wave. R* and O* denote the complex conjugates of the two waves. Two separate real image from twin image and zero-order noise a spatial filter in the frequency domain is designed. This filter can only preserve the bandwidth of the real image and eliminates the unwanted bandwidth. Finally, the phase images of cardiomyocytes are numerically reconstructed using numerical reconstruction algorithms on the computer [25][26][27]. Figure 2 shows one off-axis digital hologram of cardiac cells and its corresponding reconstructed optical path difference (OPD) images of cardiac cells. . The CCD resolution is 1920 ×1200 pixel (the hologram size is 1024×1024 efficient for FFT computation). The phase stability of imaging system is around ∆ϕ=0.05°, which in studying the biological samples with limited reflective index is equivalent to a thickness of several nanometers. The 666nm laser source delivered an intensity of ∼200W/cm 2 to the specimen plane, which is nearly six orders of magnitude less light, and the required exposure time was only 0.4ms. The reconstruction process of the cardiac cells phase image was conducted afterward using a standard PC, at a rate of several images per second. 540 were recorded at a sampling frequency of 10Hz.

ROI identification with dynamic beating activity analysis
To precise ROI identification, the dynamic beating activity comparison is carried out between the ROI and non-ROI as shown in Fig. 2(c) and the results of beating is shown in Fig. 3. The QPIs in this study are related to the optical path difference (OPD). The dynamic of beating profile is obtained from the spatial variance between successive time-lapse QPI images as explained in [28]. As it has been mentioned, OPD variance reflects the time course of the cell dry mass redistribution occurring during the cardiomyocytes contraction-relaxation cycle. Since the OPD value redistribution of the ROI is dominant comparing to non-ROI (the signal amplitude and significant contraction-relaxation peaks; see Fig. 3), the OPD variance in the ROI can significantly reflect beating activity. In contrast, the OPD variance in the non-ROI demonstrate no beating activity.

Convolutional neural network
Deep learning methods have been applied to a wide variety of problems more especially for medical image analysis [29]. In many cases, deep learning-based methods have surpassed state-of-the-art methods performance by capturing hierarchical representations of data. These models are based on the sequential application of the convolutional layer, where the output of one convolution layer is the input to the next one. Each convolutional layer provides one feature hierarchies at a specific level. Layer's weights are initialized by a set of random weights and a set of biases. Convolutional neural networks (CNN) is a type of deep learning model for capturing hierarchical features of data using locally shared weights. A convolution filter is formed by shared weights to the same output. CNN captures features of input data via multiple consecutive convolution kernels followed by max-pooling layers for data dimension reduction. CNN mainly consists of the following elements. 1) A set of learnable filters to extract local features. 2) A nonlinear function as an activation function and 3) a max-pooling layer which aggregates the local feature specification to reduce the data dimensions. The max-pooling operation used for the down-sampling purpose which obtains the maximum value of each filter in the convolution layer. The summation of each convolution layer is applied to a nonlinear function named as a rectified linear unit (ReLU) as an activation function. The ReLU function is a nonlinear function applied to increase the nonlinearity of the CNN feature maps. However, depending on the different tasks, different network architectures are proposed which might be more efficient than simple CNN-based networks. A fully convolutional neural network (FCN) is a type of CNN in which the fully connected layer is replaced with another convolution layer [30]. The current study focuses on single hiPS-CMs characterization by CM's nucleus extraction using the FCN-based deep learning model.

U-Net
U-Net is one of the most popular widely used end-to-end network models for semantic segmentation which consists of encoder and decoder pathway, with skip connections between the corresponding layers that yield good segmentation performance. The designed architecture is based on symmetric pathways for accurate localization. The first section of the U-Net extracts deep features and the second part is responsible for the segmentation based on previously extracted features. However, there are some limitations with U-Net architecture that can be denoted as less flexibility and lack of scalability. While deeper networks yield better segmentation, at the same time increases parameter space and cause gradient vanishing [31].

Proposed network model
This section introduces the concepts employed for efficient CM's ROI extraction. A novel FCN-based network architecture is proposed for accurate CM's nucleus extraction which takes advantage of parallel multi-pathways features concatenation with dense connection blocks and residual connections [32]. The overall structure and different building blocks of the proposed network architecture are shown in Fig. 4. The overall network structure leads to a better training performance when compared to the U-Net model meanwhile it improves the pixel classification accuracy. The proposed FCN-based network model's building blocks are briefly explained below: Fig. 4. The proposed FCN-based network architecture for cardiac cells ROI extraction has been made up of several modular blocks which are explained as follows: Parallel multi-pathway features concatenation (blue box) which takes the advantage of different kernel sizes: 1×1, 3×3, and 5×5. Each pathway is a composition of a convolutional layer + batch normalization layer + rectified linear unit. The features are concatenated at the end. 2×2 max-pooling layer with a stride of two is used for down-sampling data. The dense connection technique is used for efficient gradient propagation to prevent vanishing gradient. The residual connections are denoted with dotted horizontal red arrows representing residual skip connections.

Parallel multi-pathways features' concatenation
In multi-pathways features concatenation technique, different feature maps extracted by various kernel sizes are concatenated. Regarding the convolution layers' kernel size, it is hard to decide which kernel size might be more efficient for the task in hand and different kernel sizes will result in different features. The commonly used convolutional kernel size is 3×3. Therefore, we proposed to use different kernel sizes in different pathways in parallel consisting of namely 1×1, 3×3, and 5×5 followed by batch normalization (BN) and rectified linear unit (ReLU). Features from different pathways are concatenated at the end. Max-pooling layer with the size of (2×2) with a stride of 2 is used for down-sampling feature maps from different pathways to reduce the data dimension.

Dense connection
We propose to use convolutional dense connection blocks explained in [33]. Within each dense block, layers are directly connected with their preceding layers, which is implemented via the concatenation of feature maps in subsequent layers. The major advantages of dense connection blocks can be mentioned as efficient gradient propagation to prevent vanishing gradient which normally happens in deep networks and reuse of the feature maps from previous layers instead of only the last layer which results in better network performance.

Residual connection
Residual connection technique is to utilize skip connections, or short-cuts to jump over some layers that facilitate the training of deep networks [34,35]. In addition, the shorter connection between layers close to the output and input yields better performance, reduces the number of parameters, and easier to train. Besides, pooling operations cause to lose some spatial information. These skip connections allow the network to retrieve the lost spatial information. In our proposed network model, the output of the standard 3×3 convolutional layer prior to the pooling operation is transferred to the corresponding output of the up-sampling layer.

Patch extraction
Generally, deep learning methods require plenty of samples for the training. There are several different methods to increase the number of training samples, for instance, data augmentation method. Still, these methods cannot increase the data sample significantly. To tackle this problem, we utilized a patch extraction method, which can significantly increase the number of samples to a sufficient level for training of FCN-based deep learning models. We extract patches from the QPI containing multiple CMs using a sliding window and capturing patches along with corresponding ground truth (manually extracted) with a size of 32 ×32 pixels. The captured patches using a sliding window are not overlapped (see Fig. 5).

Experimental results
After the manually annotated images are reconstructed, we trained the proposed network by using the training dataset images (generated by patch extraction method) and their corresponding ground truth labels. The whole dataset contains 2500 extracted patches along with corresponding ground truth. From the whole dataset 80% (n = 2000) is used for training and 20% (n = 500) is used as a test dataset. We evaluated our trained network model for ROI extraction using some test images containing multiple cardiac cells (see Fig. 6(a)). To ease the visual evaluation on network performance, predicted masks from the proposed network model and predicted mask by the U-Net network model are shown in Fig. 6(b) and (c), respectively. The ground truth mask is shown in Fig. 6(d). From the experimental results, we can see the predicted mask by the U-Net model contains extra seeds and shows over and under segmentation. The experimental results show that despite the difficulties mentioned above for ROI extraction including ROI size, shape, and orientations, the proposed network model effectively handles different edge directions and specific ambiguous attributes of the ROI and non-ROI.  . As mentioned previously, the beating activity before ROI extraction showed to be noisy with extra wrong peaks due to the OPD variance in the non-ROI section that can cause difficulties for beating profile quantification. It has been demonstrated that the ROI section can reflect a clean and less noisy beating activity with removed wrong peaks and more informative for CM dynamic characterization.

Single cardiomyocyte beating profile quantification
During the first step of our approach, the cardiac cells ROI is extracted and the resulting mask image is multiplied by each QPI image in the sequence. Finally, the spatial variance between successive images is calculated to reconstruct the beating profile as explained in [28]. This parameter is sensitive to the redistribution of dry mass within cardiomyocytes to monitor the characteristics of the cardio beating over time. The corresponding equation is as follows: where opd i and opd i−1 are the i th and i-1 th images. Specifically, opd var (OPD variance) reflects the time course of the cell dry mass redistribution while CM beating activity. The opd var signal contains information about the redistribution of dry mass within cardiomyocytes namely contraction and relaxation durations. After ROI extraction, we can monitor single CM's beating activity and measure different parameters. Descriptions of quantification parameters are explained in Table 1.

Contraction
Beat rate The total number of contraction peaks in one minute (number of red points in Fig. 8(b)). The average time between Start-of-Relaxation to the End-of-Relaxation points.

Beating interval AVG [see #1 in
Relaxation period STD The standard deviation of the relaxation beating period.

(c)]
The average time between the End-of-Relaxation to the next Start-of-Contraction points.
Resting period STD The standard deviation of the resting beating period.
To further examine the use of the proposed method for single-cell level characterization, six individual cells are extracted from the whole slide QPI of cardiomyocyte shown in Fig. 8(a). The corresponding beating profile is reconstructed from Eq. (2) (see Fig. 8(b)). We measured several parameters related to the beating activity profile for each individual cell. In order to characterize the single cells dynamic beating profile activity using ROI extraction, first, we detected two main peaks of contraction and relaxation using the Otsu thresholding method. The first peaks which in most cases have a larger amplitude value stands for contraction and the corresponding relaxation peak is the second peaks with lower amplitude value. Afterward, three auxiliary points 1) Start-of-Contraction, 2) End-of-Contraction and 3) End-of-Relaxation showed in Fig. 8(c) are defined using contraction and relaxation peaks. Locating the auxiliary points are found as follows: The End-of-Contraction point (Start-of-Relaxation) (see Fig. 8(c) blue color points) is obtained by finding the smallest amplitude value between contraction and corresponding relaxation peaks. The auxiliary Start-of-Contraction point and the End-of-Relaxation are detected using a search strategy around the contraction and relaxation peaks. Then the time difference between End-of-Contraction (Start-of-Relaxation) and End-of-Relaxation is considered as relaxation period. The time difference between Start-of-Contraction and corresponding End-of-Contraction point specifies the contraction period. The time difference between two consecutive contraction peaks determines the beating intervals. The resting period is the time difference between End-of-Relaxation and the next Start-of-Contraction point. The extracted features of the beating  Table 1. profile and the corresponding descriptions are presented in Table 1. The beating activity profile reconstructed from cells #1, cell #2, cell #3, cells #4, cell #5 and cell #6 with detected contraction and relaxation peaks and auxiliary point for precise single CM characterization are shown in Fig. 9.
The dynamic beating profile quantification of single CMs show that the contraction period is shorter than the relaxation period due to the presence of different ion channels and transporters expressed in cardiomyocytes and the mechanisms by which their activities are sequentially orchestrated while cell contract and relax (see Fig. 10). The average beating rate, average beating intervals, and resting time are similar for all cells as shown in Fig. 10.

Network performance accuracy measurement
The proposed method was developed using MATLAB 2018a on a personal computer equipped with Intel Core i7-7700, 3.60 GHz, 16 GB RAM, and NVIDIA GeForce GTX 1050. A set of evaluations is carried out to compare the proposed method's performance against the U-Net model in terms of training process performance, pixel classification accuracy, and Dice coefficient. The following network configuration was found to be optimal. Learning rate: 1.000e-03, momentum: 0.9000, mini-batch size: 5, and max epoch: 55. The learning curve evaluation shows the proposed method convergence and corresponding loss is faster when compared with the U-Net model (see Figs. 11(a), (b)). The confusion matrix of pixel classification demonstrates the accuracy percentage of correctly classified pixels versus the misclassification rate. As shown in Fig. 11(c), the classification accuracy rate using the proposed FCN-based method 99.76% and 99.28%, respectively while the same evaluation for the U-Net model shows 93.69% and 91.25% accuracy (see Fig. 11(d)). The experimental results indicate the robustness of the proposed method by accurate pixel classification. In order to evaluate the proposed method's segmentation performance of the proposed method against the U-Net model from the statistics point of view, for each cardiomyocyte sample, we used the well-known Dice coefficient analysis (see Table 2). The Dice coefficient is calculated using the following equation: where TP, FP, and FN are the numbers of true positive, false positive, and false negative detections, respectively.

Discussion
Single CM characterization in vitro is crucially important for drug compound testing and cardiotoxicity screening. We proposed a method for single CM characterization by using nucleus extraction (ROI) and the other CM sections are considered as non-ROI (surrounding cytoplasm and membrane). In our proposed method, the hiPS-CMs beating activity profile obtained by DHM is reconstructed by using the OPD variance calculation between the successive QPI frames.
The experimental results show that the CM beating profile reconstructed from the extracted nucleus is less noisy and more informative for further characterization. On the other hand, the OPD variance in the non-ROI section leads to create a noisy beating profile with multiple wrong peaks causing difficulties for further CM characterization. Due to the specific attributes of the nucleus section, we proposed a novel FCN-based method for nucleus extraction combined with further analysis which allows cardiac cell characterization at single-cell levels. The proposed platform can be integrated into other technologies for precise single CM characterization. The proposed FCN-based architecture showed a better segmentation performance when compared to the one of the popular deep learning-based biomedical image segmentation U-Net model. The overall goal of this approach is to facilitate the cardiac cells beating profile characterization both in multiple and single-cell levels. We found that segmentation was necessary for a single CM characterization. To further examine the potential of the proposed method, we extracted multiple isolated CMs from segmentation results and multiple parameters regarding the beating profile of every single cell are measured. All findings from the experimental results demonstrated that single CMs can be effectively characterized by the cell's nucleus section extracted by the proposed method. The combined measurements showed the robustness of the proposed FCN-based method for CM nucleus extraction and consequent characterization in a single-cell level suitable for screening compounds cytotoxic effects.
To validate the proposed method for single cardiomyocyte characterization using the nucleus section and to examine whether the segmentation area includes nuclei, one sample is stained with Hoechst dye. We applied our segmentation method prior to and after nuclei staining and ROI and non-ROI sections are marked for the visual comparison. As one can see from Fig. 12, the ROI section mainly includes the nuclei section (nuclei is marked in blue color).

Conclusion
In this paper, for the first time, we proposed an automated FCN-based platform for CM's nucleus extraction and beating pattern characterization at the single-cell level. We demonstrated that the cardiac cell nucleus section (denoted as ROI) from QPI can effectively reflect the beating pattern suitable for CM characterization at the single-cell level. We proposed a novel FCN-based method to discriminate CM's nucleus section from other sections of cardiac cells using pixel classification techniques. The predicted mask is used for further characterization of dynamic contraction-relaxation of single CM. The proposed FCN architecture outperformed U-Net for segmentation mask prediction by over 99% pixel classification accuracy. We evaluated the proposed model's segmentation mask prediction using pixel classification accuracy along with network training accuracy and dice similarity coefficient metrics. To validate our FCN-based platform, we used multiple quantitative phase images of cardiomyocytes containing multiple cells under different circumstance including shape, size, and, orientations. In the next step, we extracted several individual cardiac cells for beating profile characterization at the single-cell level and multiple parameters associated with the dynamic beating profile of each cell after removing non-ROI are measured. The Single CM beating profile quantification is precisely performed using contraction-relaxation peak detection and multiple auxiliary points. The experimental results showed that the proposed method is robust for cardiomyocytes ROI extraction and subsequent cell dynamic beating profile characterization at the single-cell level.