A Curvature-Norm Based Centroid Initialized Distance Regularized Level Sets for Nuclear Segmentation in Histopathological Images

Nuclear pleomorphism is considered one of the most significant shape-based features used in grading cancer through pathological study of H&E stained tissue slides. Extracting this feature manually under the microscope is highly laborious and can mislead pathologists during grading. Digitization of the slides has given rise to various segmentation approaches that extract the nuclear shape in order to assess the degree of pleomorphism. This paper presents a novel approach to initializing and evolving distance regularized level sets (DRLS) for the detection and segmentation of nuclei. The work achieves two major objectives. First, a novel geometric approach is devised to detect the centroid of each nucleus in an occluded region. Second, a shape prior model is presented for extracting gradient information through morphological operations. The DRLS contours, implemented as multiple level sets, are initialized at the detected centroids and driven by the computed gradient. The proposed method has been tested on images of benign and malignant breast cancer tissue obtained from the BreakHis dataset. A quantitative analysis of the results shows that the proposed model achieves 97% object detection accuracy and 78% overlap resolution. A comparative study against geodesic active contours indicates an improvement of 9-10 pixels in the segmentation accuracy measure.

Extraction of significant biomarkers plays an important role in the diagnosis and prognosis of cancer. Pathologists consider nuclear pleomorphism an important shape-based biomarker in staging the disease. Pleomorphic features are obtained by manually delineating the nuclear boundary during microscopic observation of the histopathological slides. This is a laborious and time-consuming activity because of the heterogeneity of the tissue objects observed on the slides, and it can mislead the diagnosis and result in wrong staging. Errors may also arise from inter- and intra-observer variability among pathologists.
Many automated staging systems have been proposed following the advent of slide digitization. The digitized image is considered as a 2-D scene $I_s$, represented as a matrix $M$ of pixel intensity values of the RGB components. It is defined as $I_s = (M, \omega)$, where $\omega = \chi(x, y)$ is the pixel intensity function giving a vector $\omega \in M$ of the intensity levels of the red, green and blue components. Various imaging techniques presented in the literature address object detection, segmentation and subsequent classification. Segmentation of nuclei is an important phase in extracting the pleomorphic features. Since the boundary to be extracted is irregular and discontinuous, many low-level approaches fail to segment the nuclei to the required accuracy. Hence, because of the complex morphological features and the heterogeneity of the image, segmentation of nuclei is considered a challenging task.

Existing Literature
Most works in the literature highlight the significance of various features and similarity measures in detecting objects in overlapped regions and segmenting irregular boundaries. An unsupervised segmentation based on features computed from magnitude and spectra in the frequency domain has been presented [1]. A morphologically seeded watershed method has been proposed to extract overlapped nuclei [2]. A Gaussian-based hierarchical voting and repulsive balloon model has been used for cell segmentation [3]. An improved hybrid active contour model driven by both boundary and region information has been used for effective nuclear segmentation [4]. In [5,6], multiscale radial line scanning has been proposed to delineate the boundaries of nuclei detected using Laplacian of Gaussian kernels. An integrated model of adaptive morphology and curvature scale space has been used to segment overlapped cells [7]. Color decomposition based active contours and level sets based on a sparse shape prior and an occlusion constraint have also been proposed for robust nuclei segmentation [8,9]. An AdaBoost classifier built on a large feature set has been used for nuclear detection [10]. Methods based on deep convolutional networks have also been applied to nuclear segmentation [11,12]. It has also been shown that edge-based approaches are inefficient because of irregular and missing boundary information, whereas region-based approaches suffer from over- and under-segmentation.

System overview
Two major challenges need to be addressed when segmenting nuclei in digitized H&E images: first, detecting the number of nuclei present in an occluded region, and second, computing the boundary information accurately enough to segment the nuclear shape that presents the pleomorphic features. The work presented in this research addresses both issues efficiently. The main objectives achieved by the proposed methodology are as follows.
First, the computation of the geometric centroids $C_g = \{c_i : M(x_i, y_i) \in I_s\}$ of the nuclei present in the occluded region of interest (ROI), which is extracted through the proposed shape prior based morphological enhancements. Various methods of centroid detection have been proposed in the literature. Reviews of centroid detection techniques based on the Euclidean distance map, the Hough transform and the H-maxima transform have been presented [13,14]. Nuclear size has been considered as a biomarker representing the ground truth for the detection of centroids [15,16]. Detection schemes based on support vector machines (SVM) and deep learning have also been presented [17,18]. This research presents a novel approach to generating the centroids of an irregular curve. The centroids are obtained from the intersection points $X_i = \{x_1, x_2, \ldots, x_m\}$ of the norms $N_c = \{n_1, n_2, \ldots, n_l\}$, which are orthogonal to the tangents $T_c = \{t_1, t_2, \ldots, t_l\}$ drawn at each boundary point $B_c(x, y)$ on the curve. Using the Euclidean distance function over the $m$ intersection points, $k$ clusters are obtained by k-means clustering. The value of $k$ is computed for each region from the area of the overall region expressed in units of the average nuclear area, as given by Eq. 1.
... (1)
where $A_{region}$ is the area of the occluded region and $A_{avg}$ is the mean area of the nuclei computed using the shape prior model. This approach yields an approximate centroid for each nucleus present in the occluded region and hence detects the existence of the nuclei.
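The curvature-norm procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: it assumes that Eq. 1 reduces to the rounded ratio of $A_{region}$ to $A_{avg}$, that the boundary is available as an ordered closed list of points, and that tangents are approximated by central differences. The function names, the `min_sin` cut-off for nearly parallel norms and the bounding-box filter are hypothetical choices made for the example.

```python
import numpy as np


def estimate_k(area_region, area_avg):
    """Assumed reading of Eq. 1: k = A_region / A_avg, rounded, at least 1."""
    return max(1, int(round(area_region / area_avg)))


def boundary_normals(boundary):
    """Unit normals of a closed, ordered boundary given as an (N, 2) array."""
    # Central-difference tangent at each point, wrapping around the curve.
    tangents = np.roll(boundary, -1, axis=0) - np.roll(boundary, 1, axis=0)
    # Rotating a tangent by 90 degrees gives a vector orthogonal to it.
    normals = np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)
    return normals / np.linalg.norm(normals, axis=1, keepdims=True)


def normal_intersections(boundary, normals, min_sin=0.2):
    """Intersection points of all pairs of normal lines p + s * n."""
    i, j = np.triu_indices(len(boundary), k=1)
    p, q, n, m = boundary[i], boundary[j], normals[i], normals[j]
    cross = n[:, 0] * m[:, 1] - n[:, 1] * m[:, 0]
    keep = np.abs(cross) > min_sin  # skip nearly parallel normals
    d = (q - p)[keep]
    s = (d[:, 0] * m[keep, 1] - d[:, 1] * m[keep, 0]) / cross[keep]
    points = p[keep] + s[:, None] * n[keep]
    # Keep only intersections that fall within the region's bounding box.
    lo, hi = boundary.min(axis=0), boundary.max(axis=0)
    return points[np.all((points >= lo) & (points <= hi), axis=1)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means with Euclidean distance; returns k centres."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return centers


def detect_centroids(boundary, area_region, area_avg):
    """Cluster normal-line intersections into k approximate centroids."""
    boundary = np.asarray(boundary, dtype=float)
    points = normal_intersections(boundary, boundary_normals(boundary))
    return kmeans(points, estimate_k(area_region, area_avg))


# Toy check: for one circular "nucleus" every normal passes through the centre.
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
circle = np.stack([30.0 + 10.0 * np.cos(theta), 40.0 + 10.0 * np.sin(theta)], axis=1)
centroids = detect_centroids(circle, area_region=np.pi * 10.0**2, area_avg=np.pi * 10.0**2)
```

For a single circular boundary all norms meet at the centre, so the one cluster centre coincides with it. For an occluded region formed by several nuclei, the intersections are expected to concentrate near each nuclear centre, and k-means then separates them.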
Second, the segmentation of the nuclear boundary through the evolution of contours implemented as multiple level sets of the distance regularized level set (DRLS) function. Active contours, originally proposed as energy-minimizing deformable models [19], are considered among the most effective methods for segmenting irregular boundaries. The basic idea is to evolve the contour $u$, which is represented as a polynomial function in a level set functional model. The level set function ... (2) where $f_G$ is the function that computes the gradient information. The evolution of the contour towards the object boundary is controlled by the gradient information, obtained as a function of $E_I$ and $E_u$, the energy gradients of the image and the contour, respectively. As the contour evolves towards the boundary, the energy difference reduces to zero. Various active contour models in the literature emphasize the importance of gradient computation in driving the contours effectively. An active shape model based on a statistical approach constrained by a point distribution [20] has been presented to drive the active contours effectively [21]. A multiple level set implementation based on both region and edge gradients has also been presented [22]. Geodesic active contours [23] have been shown to be quite an effective segmentation approach [24]. Novel methods that compute an adaptive energy and integrate shape, region and boundary features have been presented [25,26]. A region gradient based active model has also been proposed [27], which builds on an energy-minimizing model [28].
In this research, an improved edge-based distance regularized level set (DRLS) [29] active contour model has been adopted. Here, two important terms, viz., the forward and backward diffusion effects of contour evolution, are integrated in a distance regularization model. The term $D = \mu h_{pv}(D f)$ is the diffusion rate controlled by a positive or negative potential value $pv$, indicating forward or backward diffusion. The second term is the derivative of the external energy.
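To make the role of the two terms concrete, the following sketch evolves a level set with the commonly published edge-based DRLS update: a double-well distance regularization term plus edge and area terms weighted by an edge indicator $g$. It is an illustration under stated assumptions and may differ from the improved variant adopted here. The parameter values (`mu`, `lam`, `alpha`, `eps`, `dt`), the function names, the unsmoothed edge indicator and the synthetic test image are all hypothetical.

```python
import numpy as np


def divergence(fx, fy):
    """Divergence of the vector field (fx, fy) by central differences."""
    return np.gradient(fx, axis=1) + np.gradient(fy, axis=0)


def laplacian(f):
    """Five-point Laplacian with replicated (Neumann-like) borders."""
    p = np.pad(f, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * f


def dirac(phi, eps):
    """Smoothed Dirac delta supported on |phi| <= eps."""
    return (np.abs(phi) <= eps) * (1.0 + np.cos(np.pi * phi / eps)) / (2.0 * eps)


def distance_regularization(phi):
    """div(d_p(|grad phi|) grad phi) for the double-well potential."""
    gy, gx = np.gradient(phi)
    s = np.sqrt(gx**2 + gy**2)
    # p'(s): sin(2*pi*s) / (2*pi) for s <= 1, and s - 1 otherwise.
    dp = np.where(s <= 1.0, np.sin(2.0 * np.pi * s) / (2.0 * np.pi), s - 1.0)
    d_p = np.where(s > 1e-10, dp / np.maximum(s, 1e-10), 1.0)
    # d_p > 0 diffuses forward, d_p < 0 diffuses backward.
    return divergence((d_p - 1.0) * gx, (d_p - 1.0) * gy) + laplacian(phi)


def edge_indicator(image):
    """g = 1 / (1 + |grad I|^2): near 0 on strong edges, 1 in flat areas."""
    gy, gx = np.gradient(image.astype(float))
    return 1.0 / (1.0 + gx**2 + gy**2)


def seed_level_set(shape, center, radius, c0=2.0):
    """Binary step: -c0 inside a disk around center = (x, y), +c0 outside."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    inside = (xx - center[0])**2 + (yy - center[1])**2 <= radius**2
    return np.where(inside, -c0, c0)


def drls_evolve(phi, g, mu=0.2, lam=5.0, alpha=-3.0, dt=1.0, eps=1.5, iters=60):
    """Explicit DRLS iterations; alpha < 0 expands the region where phi < 0."""
    gy_g, gx_g = np.gradient(g)
    phi = phi.astype(float)
    for _ in range(iters):
        gy, gx = np.gradient(phi)
        mag = np.sqrt(gx**2 + gy**2) + 1e-10
        nx, ny = gx / mag, gy / mag
        delta = dirac(phi, eps)
        edge = delta * (gx_g * nx + gy_g * ny + g * divergence(nx, ny))
        area = delta * g
        phi = phi + dt * (mu * distance_regularization(phi) + lam * edge + alpha * area)
    return phi


# Synthetic example: one dark disk (a stand-in nucleus) on a bright background.
yy, xx = np.mgrid[:64, :64]
image = np.where((xx - 32)**2 + (yy - 32)**2 <= 15**2, 50.0, 200.0)
phi0 = seed_level_set(image.shape, center=(32, 32), radius=5)
phi = drls_evolve(phi0, edge_indicator(image))
mask = phi < 0  # segmented region
```

With a negative `alpha`, the contour seeded at the centroid expands, and the small values of $g$ on strong edges are expected to slow it near the disk boundary. One level set per detected centroid would be evolved in the same way for a multiple level set implementation.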
The basic idea of this work is to obtain the region of interest by adopting a shape prior model to morphologically extract the foreground regions $F_{roi}$. A novel geometrical approach is adopted to compute the set of centroids $C_g$, which represent the geometrical centers of the nuclei present in the region: it obtains the set of norms $n_i \in N_c$ to the curve, each orthogonal to the tangent $t_i$ drawn at the $i$-th boundary point. The result of the morphological processing based on the proposed shape prior is used to compute the energy gradient $I_g$, which represents the external driving force for the contour evolution. Hence, the DRLS contours, implemented as multiple level sets, are initialized at the detected centroids and evolved using the shape prior based gradient. Subsequent sections describe the data set used and discuss the various stages of the methodology in detail, followed by the experiments and the analysis of results.

MATERIALS AND METHODS
This section describes the data set used, followed by a detailed discussion of each stage of the methodology.

Description of the dataset
The digitized images of H&E stained histopathological slides have been obtained from the standard collection provided by the BreakHis dataset [30] ... following section, have been applied to achieve the objectives listed above. The notation used throughout this research is listed in Table 1.

Proposed methodology
In this section, the various stages of the proposed method are presented in detail. Algorithm 1 gives a complete illustration of the phases involved.

Image enhancement
The H&E stained image inherently contains both low- and high-frequency noise components, due to staining and zooming errors. The Wiener filter is therefore considered a promising technique for eliminating both components. The frequency sub-bands $s_f$ are separated from the noise components $N = (s_{low}, s_{high})$, and a denoised signal coefficient $D_{cf}$ can be computed as shown in Eq. ... (8) ... (9) Integrating the Wiener filter with the discrete wavelet transform (DWT) results in efficient filtering of the noise components [31]. It has also been shown that the DWT results in reduced over-segmentation [32].
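One way to combine a Wiener-type filter with the DWT is to shrink the wavelet detail coefficients by a Wiener gain. The sketch below illustrates that idea only and is not a reconstruction of Eq. 8 and Eq. 9. It assumes a single-level Haar transform, a single image channel with even height and width, and a noise variance estimated from the diagonal sub-band. All function names are hypothetical.

```python
import numpy as np

R2 = np.sqrt(2.0)


def haar_dwt2(x):
    """One-level orthonormal 2-D Haar DWT (height and width must be even)."""
    lo, hi = (x[0::2] + x[1::2]) / R2, (x[0::2] - x[1::2]) / R2
    ll, lh = (lo[:, 0::2] + lo[:, 1::2]) / R2, (lo[:, 0::2] - lo[:, 1::2]) / R2
    hl, hh = (hi[:, 0::2] + hi[:, 1::2]) / R2, (hi[:, 0::2] - hi[:, 1::2]) / R2
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    rows, cols = ll.shape
    lo, hi = np.empty((rows, 2 * cols)), np.empty((rows, 2 * cols))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / R2, (ll - lh) / R2
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / R2, (hl - hh) / R2
    x = np.empty((2 * rows, 2 * cols))
    x[0::2], x[1::2] = (lo + hi) / R2, (lo - hi) / R2
    return x


def wiener_gain(band, noise_var):
    """Wiener-style gain sigma_s^2 / (sigma_s^2 + sigma_n^2) for one sub-band."""
    signal_var = max(float(np.mean(band**2)) - noise_var, 0.0)
    return signal_var / (signal_var + noise_var + 1e-12)


def dwt_wiener_denoise(channel):
    """Shrink the detail sub-bands of one channel; keep the approximation."""
    ll, lh, hl, hh = haar_dwt2(channel.astype(float))
    # Robust noise estimate from the diagonal band (median absolute deviation).
    noise_var = (np.median(np.abs(hh)) / 0.6745) ** 2
    lh, hl, hh = (b * wiener_gain(b, noise_var) for b in (lh, hl, hh))
    return haar_idwt2(ll, lh, hl, hh)


# Toy example: a flat patch corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((64, 64), 120.0)
noisy = clean + rng.normal(0.0, 10.0, clean.shape)
denoised = dwt_wiener_denoise(noisy)
```

Because the Haar transform used here is orthonormal, shrinking the detail sub-bands can only reduce the noise energy they carry, while the approximation sub-band is left unchanged. A deeper decomposition or a different wavelet family would follow the same pattern.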

Computation of Shape prior
In this phase, the foreground region of interest $F_{roi}$ is extracted using a suitable shape prior model representing the structural elements $b$ and $g$, which are used to compute the area of the nuclei objects to be detected. The shape prior model is shown in Eq. 10 and Eq. 11.
... (10) ... (11)
The parameters $\mu_D$ and $\sigma_D$ are the mean and standard deviation of the nuclear diameter, derived from the mean area of the foreground objects obtained from the clusters generated by the k-means algorithm. The areas computed by $b$ and $g$ are used in the erosion and dilation processes, respectively. The parameter ± corresponds to the thresholding factor for dilation. These morphological operations generate the foreground scene from which the image gradient $I_g$ is computed. Binarizing this scene yields the ROI.
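The erosion, dilation, gradient and binarization steps can be sketched as below. Eq. 10 and Eq. 11 are not reproduced in this sketch: the radii `r_erode` and `r_dilate` are hypothetical stand-ins for the structural elements derived from $\mu_D$ and $\sigma_D$, and `threshold` is a hypothetical binarization level. The example also assumes a single-channel image in which nuclei are brighter than the background (for instance after inversion).

```python
import numpy as np


def disk(radius):
    """Boolean disk-shaped structuring element of the given integer radius."""
    r = int(radius)
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    return x * x + y * y <= r * r


def _morph(image, footprint, reduce_fn, pad_value):
    """Apply min or max over the footprint neighbourhood of every pixel."""
    r = footprint.shape[0] // 2
    padded = np.pad(image, r, mode="constant", constant_values=pad_value)
    h, w = image.shape
    shifted = [padded[dy:dy + h, dx:dx + w] for dy, dx in zip(*np.nonzero(footprint))]
    return reduce_fn(np.stack(shifted), axis=0)


def erode(image, footprint):
    """Grey-scale erosion; borders are padded with the image maximum."""
    return _morph(image, footprint, np.min, image.max())


def dilate(image, footprint):
    """Grey-scale dilation; borders are padded with the image minimum."""
    return _morph(image, footprint, np.max, image.min())


def extract_foreground(image, r_erode, r_dilate, threshold):
    """Erode, dilate, then derive a morphological gradient and a binary ROI."""
    foreground = dilate(erode(image, disk(r_erode)), disk(r_dilate))
    gradient = dilate(foreground, disk(1)) - erode(foreground, disk(1))
    roi = foreground > threshold
    return foreground, gradient, roi


# Toy scene: one nucleus-sized blob and one isolated speck of noise.
yy, xx = np.mgrid[:40, :40]
image = np.zeros((40, 40))
image[(xx - 20)**2 + (yy - 20)**2 <= 8**2] = 1.0
image[3, 3] = 1.0
foreground, gradient, roi = extract_foreground(image, r_erode=2, r_dilate=2, threshold=0.5)
```

In this toy scene the erosion removes the speck, which is smaller than the structuring element, while the blob survives and is restored by the dilation. The gradient is non-zero only along the blob boundary.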

Centroid detection
After the ROI is obtained from the previous stage, the existence of nuclei is detected in this stage. As presented earlier, the novel curvature-norm technique is adopted to extract the centroids of the nuclei. Fig. 2 shows the outcome of the method on an occluded region: the norms generated at the boundary points of the curve and, finally, the centroid points for the nuclei present in the region.
The centroids are computed by applying k-means clustering to the intersection points of the norms, with the value of k computed as shown in Eq. 1. Finally, the DRLS contours are initialized as multiple level set functions at the detected centroids and evolved using the shape prior based gradient. Fig. 3 shows the results of each stage and the final outcome of the proposed segmentation approach.

Quantitative analysis
The efficacy of the proposed methodology has been studied using two kinds of quantitative measures: first, object detection and occlusion resolution measures, and second, segmentation accuracy based on boundary error metrics. The object detection measures are sensitivity (SN), specificity (SP), positive predictive value (PPV) and overlap resolution (OR). Based on the ground truth, these measures are computed from the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) [24]. Since manual delineation by the pathologist is tedious, only 40 samples have been considered for the quantitative analysis. A comparison of DRLS segmentation with and without curvature-norm initialization is presented in Table 2. The chart in Fig. 4 shows the other two measures of object detection and overlap resolution, viz., actual count (AC) and detected count (DC) [24]. These measures are computed by taking the average over 20 randomly chosen objects.
Second, segmentation accuracy is measured using the following two metrics [33]: the Hausdorff distance (HD) and the mean absolute distance (MAD), as shown in Eq. 12 and Eq. 13.
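The three detection measures follow directly from the confusion counts. The sketch below uses the standard definitions of sensitivity, specificity and positive predictive value. The counts in the example are hypothetical and are not results from this study, and the overlap resolution measure is omitted because its formula is not given here.

```python
def detection_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and positive predictive value from counts."""
    return {
        "SN": tp / (tp + fn),   # fraction of true nuclei that were detected
        "SP": tn / (tn + fp),   # fraction of non-nuclei correctly rejected
        "PPV": tp / (tp + fp),  # fraction of detections that are true nuclei
    }


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=90, tn=80, fp=10, fn=20)
```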
... (12) ... (13)
The key factor in computing these measures is the distance, in pixels, between the manual delineation of the object boundary and the final contour. Since manual delineation is a tedious task, the pathologists randomly chose 20 objects for ground truth generation. The measures are plotted in the charts shown in Fig. 5 and Fig. 6 for the proposed DRLS with curvature-norm initialization (DRLS-CN), in comparison with geodesic active contours driven by curvature-norm initialization (GAC-CN). DRLS-CN shows a small pixel difference of at most 4 pixels, in contrast to GAC-CN, which ranges from 2 to 14 pixels. The results show that DRLS-CN outperforms GAC-CN in terms of segmentation accuracy.
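Both boundary error metrics can be computed from the distance of each contour point to its nearest point on the reference boundary. The sketch below uses the common symmetric form of the Hausdorff distance and a one-directional mean absolute distance. The exact forms of Eq. 12 and Eq. 13 may differ (for example, a directed Hausdorff distance), so this is an assumption. The contours are taken to be arrays of (x, y) points, and the example coordinates are hypothetical.

```python
import numpy as np


def nearest_distances(contour, reference):
    """Distance from each contour point to its closest reference point."""
    diff = contour[:, None, :] - reference[None, :, :]
    return np.sqrt((diff**2).sum(axis=2)).min(axis=1)


def hausdorff_distance(contour, reference):
    """Symmetric Hausdorff distance: the worst nearest-point distance."""
    return max(nearest_distances(contour, reference).max(),
               nearest_distances(reference, contour).max())


def mean_absolute_distance(contour, reference):
    """Mean nearest-point distance from the contour to the reference."""
    return nearest_distances(contour, reference).mean()


# Hypothetical contours: two points against a single reference point.
final_contour = np.array([[0.0, 0.0], [4.0, 0.0]])
manual_contour = np.array([[0.0, 0.0]])
hd = hausdorff_distance(final_contour, manual_contour)
mad = mean_absolute_distance(final_contour, manual_contour)
```

Here the nearest-point distances from the final contour are 0 and 4 pixels, so the Hausdorff distance is 4 and the mean absolute distance is 2.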

CONCLUSION
This work has addressed the importance of extracting pleomorphic features for the diagnosis and prognosis of cancer. An improved active contour technique has been adopted to perform the challenging task of segmenting nuclei from occluded regions of digitized H&E stained images. A novel curvature-norm technique is devised to compute the geometric centroids of the occluded objects, and the DRLS contours are initialized at those centroids and evolved towards the object boundary to segment the nuclear shape, from which the pleomorphic features are obtained. Initially, the region of interest is extracted using a novel shape prior model, which is also used to compute the image gradient that represents the external energy driving the DRLS contour efficiently towards the object boundary. Hence, this work presents two novel techniques: first, the shape prior model, and second, the curvature-norm technique for centroid detection. The segmentation results have been compared with other techniques and found to be quite promising in terms of object detection, overlap resolution and segmentation accuracy. Further, the results can be extended to the post-segmentation classification process.