Retinex model based stain normalization technique for whole slide image analysis

Medical imaging provides the means for diagnosing many of the medical phenomena currently studied in clinical medicine and pathology. The variations of color and intensity in stained histological slides affect the quantitative analysis of the histopathological images. Moreover, stain normalization utilizing color for the classification of pixels into different stain components is challenging. The staining also suffers from variability, which complicates the automatization of tissue area segmentation with different staining and the analysis of whole slide images. We have developed a Retinex model based stain normalization technique in terms of area segmentation from stained tissue images to quantify the individual stain components of the histochemical stains for the ideal removal of variability. The performance was experimentally compared to reference methods and tested on organotypic carcinoma model based on myoma tissue and our method consistently has the smallest standard deviation, skewness value, and coefficient of variation in normalized median intensity measurements. Our method also achieved better quality performance in terms of Quaternion Structure Similarity Index Metric (QSSIM), Structural Similarity Index Metric (SSIM), and Pearson Correlation Coefficient (PCC) by improving robustness against variability and reproducibility. The proposed method could potentially be used in the development of novel research as well as diagnostic tools with the potential improvement of accuracy and consistency in computer aided diagnosis in biobank applications.


Introduction
Cancer is a major life-threatening health problem around the world and biomedical imaging is one of the crucial data modalities in cancer research. Radiology and Radiomics provide means of quantitative study of cancer properties at the macroscopic scale. The histopathologic examination of tissue, on the other hand, reveals the effects and properties of cancer at the microscopic level (Kurc et al., 2020). The new digital technologies can provide means for attaining digital images from biological tissue samples in a faster and more cost-effective way without compromising quality or patient care. A trend prevalent in many areas of healthcare over the past decade has also entered into the area of pathology. In digital pathology, the histology slides are digitized to produce whole slide images with microscopic scanners (Gurcan et al., 2009a). The new scanning technologies are enabling researchers to capture various types of morphological information at very high resolutions. Whole slide image analysis technology provides a way to diagnose huge numbers of medical conditions and phenomena currently studied in clinical medicine and pathology (Stathonikos et al., 2013;Al Janabi et al., 2011). There are different possibilities to apply image analysis techniques into whole slide images. Common tasks, where pathologists locate and recognize tissue components can be automized with the aid of image analysis algorithms, such as image segmentation and recognition algorithms. Mapping and quantification of cell/nuclei in the tumor area is another significant direction.
In histopathology, different color of stains or dyes are used to identify and visualize the tissue components and structures as originally the tissue looks transparent under the microscope (Ghaznavi et al., 2013). In absence of the stain the entire light passes through, as a result, it appears bright white (McCann et al., 2014a,b). Accordingly, mixtures of stains with different spectral absorption characteristics are used. Two or three different forms of colored stains are commonly used (Thaina et al., 2019). Hematoxylin and eosin are the most common routine stains to evaluate disease morphology where the hematoxylin specifies as a purple or blue color appearance (e.g. cells, nuclei) and eosin provides eosinophilic structures (e.g., muscle fibers, cytoplasm, and collagen) as a pinkish hue appearance (Altsubaie et al., 2017).
However, the concentration of the stain present in the pixel determines the intensity of each pixel. But the use of different scanner, slide preparation, different stain coloring, concentration of the stain, specimen thickness, temperature, batches of stains, and activities from different manufacture causes color variation in whole slide images. In general, a whole slide image implies scanning sections of tissue that are mounted on a glass slide with high resolution (normally 20X or 40X resolution). It has often a pyramidal structure, with many levels of varying resolution. Therefore, handling whole slide images with the highest resolution is challenging for the lack of available computer memory and display characteristics (Gurcan et al., 2009b).
The Biological Stain Commission in USA regulates the synthesis of biological stains that promoted the awareness of the issue of staining variability and encouraged efforts to reduce it. Also, the expert assessors (histology professionals and pathologists) in UK survey slides submitted for quantitative analysis, visually looking for staining characteristics. They examined each histochemical stain and assigning a semiquantitative score to the slide (Gray et al., 2015). Such semiquantitative scoring technique is well established for the assessment of stain quality and intensity in every aspect of pathology and histotechnology. However, it is relatively expensive and time consuming in routine daily practice due to the manual assessment of slides by specialists.
From the existing research, it can be stated that image quality is highly important. The resulting image quality is a sum of several operations, starting from tissue preparation, and how the sections are handled in the laboratories to the selection of scanner devices and selected parameters. Management of image quality consistency is particularly important in biobank research where cumulative data combined from several biobanks will enable large enough sample sizes, multidisciplinary research, and innovations. Sharing knowledge is important in evaluating diseases and digital pathology provides a means for sharing high resolution digitalized information about the samples. The challenges in histological image analysis include the need for a wide variety of different methods to identify medical problems. Different tissues have different biological properties (biological factors) and staining protocol inconsistencies causing a high visual appearance variability (Arevalo and Cruz-Roa, 2014;Kothari et al., 2013;Hoque, 2019;Wei and Simpson, 2014).
It is important to study how color variability affects image analysis in computer-aided diagnostics in histopathology (Azevedo et al., 2019). Color normalization techniques for stained and digitalized pathology samples have been proposed to minimize the color variations. Color normalization is a process to realize color transformation from one image to another for the minimization of color variations. In this paper, a novel method with key elements of stain normalization is described. Color variability in H&E (hematoxylin and eosin) stained and digitalized pathology samples of 3D organotypic model for carcinoma cell invasion studies (Nurmenniemi et al., 2009) are considered for examination, and color deconvolution under illumination estimation of sample images is considered as a test case for the developed color normalization algorithm.
Section 2 describes the related work. Section 3 describes the proposed method. Section 4 shows the simulation results and the performance comparison of the developed method with selected state of the art methods. Finally, Section 5 and Section 6 summarize our main findings.

Related work
Stain normalization is a significant research topic with immense literature (Onder et al., 2014). Currently, there are various color normalization methods based on stain specific color deconvolution (Ruifrok and Johnson, 2001) which requires prior reference stain vectors for the dyes present in whole slide image. Macenko et al. (2009) introduced an algorithm to find the fringe of pixel distribution in the optical density space and determine the stain vectors in presence of strong staining variations. The pathologist needs to identify not only the stain components but also the incorporation of the spatial dependency of tissue structures (e.g. find the near elliptic shape of cell nuclei from stained hematoxylin and eosin tissue structures). The negligence of spatial dependency of tissue structures limits the robustness of the method. A linear color transform method introduced by (Wang et al., 2007) only works with transforming the color to a template image using linear projection in color space. Zheng et al. (2018) introduced a normalization method that works in HSV space. Janowczyk et al. (2017) introduced a stain normalization method that clusters the pixels in histogram by sparse autoencoders. These histogram transformation based methods can utilize staining characteristics of histological images but have difficulties in analyzing different complex tissue areas due to confusion after normalization (Magee et al., 2009). Different multiple transformation methods were proposed to analyze different colored independent stains (Bejnordi et al., 2016;Khan et al., 2014). Typically, Khan et al. (2014) introduced a nonlinear mapping approach to utilize a pretrained relevance vector machine classifier to classify the stain pixel distributions which is comparatively good under noise. Zheng et al. (2019) proposed an adaptive color deconvolution method that avoids artifacts, but the overall failure rate of this method is 4 %. Bejnordi et al. (2016) proposed hue saturation density-based color normalization method which has good performance in normal stain variations but comparatively bad performance in strong stain variations. Reinhard et al. (2001aReinhard et al. ( , 2001b introduced a technique that aligns the color channels with LAB color model (Hunter, 1948). However, dye has its specific reaction pattern with an independent contribution of color so that a single transformation function leads to improper color mapping for the stained components. To overcome this problem separate transformations are needed in different stained variations (Magee et al., 2009;Bejnordi et al., 2014;Basavanhally and Madabhushi, 2013). The Gaussian mixture model introduced in Basavanhally and Madabhushi (2013) has less robustness in strong stain variations.
The method described in Bejnordi et al. (2016) has also less accuracy due to the considerable imbalance in stained components in tissue images. The stain variations caused by different staining protocols have various problems (Bejnordi et al., 2014) where most of the published algorithms were enhanced the performance not typically evaluated in an existing computer-aided system.

Methods
Our proposed algorithm first takes the whole slide image, with multi resolution pyramid structure as input, and a group of pixels is separated and converted into optical density space. The source image is transformed into optical density space to determine the color appearance matrix from the measurement of relative color for each pixel in RGB channels. Color stains act linearly in optical density space according to Beer's Law (Ruifrok and Johnson, 2001) and color deconvolution method is applied. Depth matrix of stain is evaluated by taking the inverse of the color appearance matrix. The normalized hematoxylin and eosin components are obtained with multiscale Retinex model that estimates the illumination map of hematoxylin and eosin components. This Retinex method is different from the traditional multiscale Retinex model based method which partitions a digital image into reflectance and illumination factors (Fu et al., 2016). Our method estimates illumination factors individually by finding the maximum intensity of pixel in three color channels. Further, our Retinex model refines the illumination map by exploiting the structure of the illumination. We manipulate the illumination map with the help of Gamma correction in the range of [0,1] (Zheng et al., 2019). Finally, we can get back to the normalized original image by the multiplication of optical density space with color appearance matrix and the recombination of hematoxylin and eosin components. The flow chart of the proposed illumination estimation and Retinex model based technique is introduced in Fig. 1.

Stain separation
The tissue samples are generally stained with multiple dyes, where the first preprocessing step is to separate the areas stained with different dyes. Ruifrok and Johnson (2001) developed a color deconvolution method to separate a combination of two or three colors. To make this possible separation the color formation is presented as where, I 0 represents the intensity of light entering the specimen and I is the light detected both represented in the RGB color space. Moreover, A is the staining intensity and c is the matrix related to the absorption factors of the stains. The color deconvolution transforms RGB color space to another color space defined by the stains used for staining the tissue. The method needs an estimation of a stain matrix. In practice, the color conversion matrixes can be worked out from single-stained control slides or ROIs.
In here, our proposed algorithm estimates the color deconvolution vectors based on selecting representative sample pixels for each stain class which is tricky as the stain concentration varies over the sample. It works on image-specific color deconvolution (Ruifrok and Johnson, 2001) for unmixing different stains from that specific image. The absorption factors (Ruifrok and Johnson, 2001) of the stains affect the unmixing accuracy represented by the following equation, where, A is the amount of stain and n is the pixel, and C is the color deconvolution matrix. We calculated the mean color of regions of interest for each stain of n (eq. 3).
Finally, we estimated colors of interest for each stain of three color channels. In here we used the approximate vectors forming where the matrixes have been specified for different stain combinations like hematoxylin and eosin, and hematoxylin and DAB.

Pixel distribution in optical density space
The intensities I Red , I Green , and I Blue are obtained with the help of the scanner for each pixel in RGB model. The relative intensity depends on the stain concentration which is non-linear and cannot directly be used in stain separation calculation. Whereas, the optical density of different channels is linear with the concentration of absorbing factors. For more accurate quantitative analysis and measurement we used the optical density space (OD) as following, where, I Input is the RGB color vector. Our proposed algorithm separates a group of pixels of whole slide image and converts it into optical density space to find the fringe of pixel distribution in the optical density space.

Multiscale retinex model
Multiscale Retinex model is used to estimate illumination map constructed by finding maximum intensity of RGB color channels (Fu et al., 2016;Joze et al., 2012). The formula of the multiscale Retinex model is following, where, I In is the input image, and I Out is the desired output result, M E is the operator of element-wise multiplication and T Map is the illumination map. Our designed method only estimates global illumination. For the non-uniform illumination, we adopt the following estimation formulation, where, x is the individual pixel. This operation helps to illuminate the maximum value of the three channels. For the structural refinement of the illumination map, we set the weights as following, where, W h and W v are the horizontal and vertical weights. We used the gradient of the illumination map as the weights of the following sequel formulations, To characterize the relative total variation, we set the weights as following, where, G(x, y, σ) is the Gaussian factor and σ is the standard deviation.
Here, G(x, y, σ) is denoted as, where, D(x,y) is the measurement of the spatial Euclidean distance. This convolution with Gaussian factors can be helpful to reduce the noise in the images and emphasize the contrast of color structures. Finally, the multiscale Retinex (Starck et al., 1998) estimates the component at scale, s with Gaussian low pass filter. The image component R is obtained with the subtraction of the log of estimated illumination from the log of the original image I as the following equation, The refined illumination map T Map can be recovered with the help of the (eq. 5) and manipulated through gamma transformation as following, For the avoidance of unbalance processing e.g. dark, brighter, or over smoothed tissue area, we can use the following operation, where, I d is the result after the estimation process and I f is the result after recombination.
The whole process of our Retinex model based stain normalization algorithm is described in Algorithm 1.

Tissue collection and organotypic culturing
The collection of tissues was approved by the ethical committee of the Northern Ostrobothinia Hospital District. The oral tongue squamous carcinoma cells were cultured on top of the myoma tissues as described earlier (Nurmenniemi et al., 2009). Samples were digitalized using Leica Aperio AT2 whole slide scanner.
Descriptive information of an example whole slide image (.svs format) scanned by Leica Aperio AT2 scanner are shown in Table 1.

Experimental setup
For our experiment, we used ImageScope (Aperio ImageScope) to annotate the whole slide image, and our method analyzed by taking that annotated image automatically. We developed our algorithm using Matlab and Python separately to provide both Matlab and Python users in support of the digital pathology and cancer analysis.
We used ImageJ (Schneider et al., 2012) software for the ground truth labeling, which is the manual stain normalization and segmentation method, e.g. (Marek et al., 2019) used for their analysis. Based on selected regions from the datasets the pixel characteristics were labeled including hematoxylin, eosin, and background.
Details of the implementation parameters including their values are listed in Table 2.

Performance evaluation criterion
In this experiment, the normalized median intensity (NMI) (Basavanhally and Madabhushi, 2013) is used to evaluate color constancy and compare the statistics of intensity over whole slide image. We used the following formula for the normalized median intensity measurement to utilize the color constancy of normalization (Basavanhally and Madabhushi, 2013).
where, I M is the processed whole slide image, u(i) is the average of the RGB channels, Median () is the median value and P 95 is the 95th percentile (Bejnordi et al., 2016). The image quality metrics including Quaternion Structure Similarity Index Metric (QSSIM) (Kolaman and Yadid-Pecht, 2011), Structural Similarity Index Metric (SSIM) (Wang et al., 2004), and Pearson Correlation Coefficient (PCC) (Wang and Bovik, 2002) are calculated in order to measure performance. For the quaternion structural similarity index metric (QSSIM), numerical value lies between 0-1 where the value closer to 1, indicates the better color normalization method. The mathematical formula is given in eq. 16.
where, μ qef and μ qnorm are the sample mean and normalized image; σ qref and σ qnorm are the standard deviation of source and normalized images. σ qref,qnorm is the correlation coefficient between source and normalized images. Similarly, the numerical value lies between 0-1 for structural similarity index metric (SSIM), where the value closer to 1, indicates the better color normalization method. The SSIM can be defined as follows, From the eq. 17 we can see that SSIM depends on three factors such as, After combining eq. 18,19, and 20 the mathematical formula of SSIM is derived as follows, where, μ x and μ y are the means of the source and normalized images; σ x and σ y are the standard deviation of source and normalized images; c 1 , c 2 and c 3 are the constants. For the pearson correlation coefficient (PCC), we measured the linear correlation between source and normalized images in a range, 0-1 where 0 denotes not any similarity between two images. However, a value greater than 0 indicates the correlation be-  tween two images (Roy et al., 2018a). The mathematical formula is given in eq. 22.
where, x i and y i are the source and normalized image; μ x and μ y are the means of source and normalized images. We calculated the relative square error (rSE) using the formula (Salvi et al., 2020) given in eq. 23.
where, W GT is the ground truth stain color appearance matrix, F is the Frobenius norm of the matrix (Salvi et al., 2020) and W is the estimated color appearance matrix. Finally, we calculated standard deviation and coefficient variation of normalized median intensity for all myoma samples where the lower value of these denotes more consistent normalization.

Experimental results
The aim of the experiment was to study stain normalization by utilizing color from the stained whole slide images. Currently developed algorithms and software e.g. SPCN (Vahadane et al., 2016), do not have support to read whole slide image formats directly. In majority of cases, for example Aperio (.svs) and Hamamatsu (.ndpi) formats need to be converted into tagged image file format (.tiff) in where we lose some important information. Sometimes we need to use an extra library which is expensive and makes the system more complex and comparatively slow in performance e.g. OpenSlide (Goode et al., 2013a,b). So, another aim was to develop such an algorithm that can directly read these types of whole slide image formats without any extra library requirements,   where there is not any issue to lose the information. In our experiment, we first utilized the staining difference and color normalization of hematoxylin and eosin (HE) stained whole slide images. Then we examined the effectiveness and the fluctuation of the results by calculating hematoxylin and eosin stain vectors in the presence of various color intensities.
Finally, we compared the effectiveness and robustness of our proposed Retinex model based method with state of art methods. The detailed architecture of our proposed algorithm is shown in Fig. 2.
It shows that the individual stain components of the histochemical stains are effectively quantified and separated. Moreover, the normalized output image is free from artifacts and apparent color discontinuity because of less difficulties to identify the color distributions from different stains (Zheng et al., 2019). In here, the variability was examined more closely, with normalized density histograms and visual inspection.
The normalized stain vectors using 3D color gamut as a point cloud in RGB color space are shown in Fig. 3 and three normalized stain density maps are shown in Fig. 4.
The histogram of RGB color space and the hue channel of the normalized image is shown in Fig. 5.
It is seen from Fig. 3 that, the 3D color gamut as a point cloud in RGB color space represents the existence of hematoxylin and eosin staining where the blue color denotes hematoxylin staining and the purple color denotes the eosin staining. Figs. 4 and 5 help to visualize the variations of staining vectors using density maps, histogram of RGB color space, and hue channel.

Comparison with different methods
We studied the means for objective measurement of the differences in color deconvolution results. To evaluate the performance of our proposed algorithm, comparative evaluation results are shown in calculating stain vectors and achieving color constancy. To compare the efficiency and robustness of different algorithms the correct stain vectors are measured based on the mean and standard deviation of the Euclidean distances. Table 3 represents the mean and standard deviation of the Euclidean distance between stain vectors from the annotated image and derived by each algorithm for the comparison of efficiency and robustness. Comparatively high mean value and low standard deviation of the Euclidean distances indicate better results.
It is seen that our proposed method has comparatively low standard deviation of the Euclidean distances which is considerably better compared to other methods. Table 4 represents the normalized median intensity measure to evaluate the color constancy of different regions.
The overall performance of our proposed method is considerably better compared to other methods in calculating normalized median intensity for the hematoxylin and eosin stains. Statistically, the standard deviation (SD) and the coefficient of variation (CV) of normalized median intensity results of different methods are utilized in Figs. 6 and 7.
We used boxplot representation for comparison of the distributions e.g. shape of the distribution, central value, and variability.
We also used violin plot (a combination of boxplot and density plot) to visualize the distribution and probability density of the data. The performance of color normalization is shown in Fig. 8, which represents different normalization results of the state of art methods in terms of comparison.
It is clear from Figs. 6, 7, and 8 that, Zheng et al. (2019) and Macenko et al. (2009) methods are good in color normalization but failure occurred due to color artifacts and variance. The result of Janowczyk et al. (2017) method is good but the results are not smooth due to various artifacts that appeared in normalized images. These methods have a typical failure rate in color distribution of the monochromatic whole slide image. Whereas, compared to all seven methods our proposed method removes the color artifacts with superior performance and effectively estimates the hematoxylin, eosin, and background distributions and classifies the pixels with better performance in color distribution of the monochromatic whole slide image. Our method provides effective normalization results that help to overcome the color variance challenges. Table 5 represents the detailed evaluation results of staining separation (mean ± standard deviation), relative square error of hematoxylin and eosin respectively, time complexity, and average running time of the reference methods.
Here, our algorithm consistently has the smallest standard deviation and low relative square error. The time complexity is another important application where Table 5 also represents the comparison of time complexity for the pixel number, n, and the average running time. It is specified that the pixel-wise classification based method, Khan et al. (2014) is linearly related to the number of pixels, n so that it has the time complexity as σ(n). Bejnordi et al. (2016) has the time complexity as σ(n 3 ) due to the utilization of Hough transform. Zheng et al. (2018) is the parameter estimation method that involved sorting algorithm of n value so that this method has time complexity as σ(nlog 2 n). Other methods estimate the parameters in terms of pixel-wise operation where the time complexity is σ(n). The average normalization parameters estimation running time of our proposed method and (Zheng et al., 2019) method are 2.11 s and 2.89 s which are comparatively short time compare to the other reference methods.
We evaluated and compared the quality performance of the reference  Table 6). The mean, standard deviation (STD), minimum (min), maximum (max) along p-value are shown in Table 7. The QSSIM, SSIM, and PCC of our proposed method on different myoma samples are comparatively higher than other reference methods (Table 6). On the other hand, a small p-value (p-value ≤ 0.01 means highly significant) presents the proof against the null hypothesis testing (Altman and Bland, 2011). Table 7 shows that the p-values of our method and (Zheng et al., 2019) method are lower than other reference methods that means the results are highly significant.
The experimental results (Tables 3, 4, 5, 6 and 7) indicate that our proposed algorithm is consistently better for the stain normalization with less apparent artifacts in normalized whole slide images.

Discussion
Our proposed algorithm effectively removes the unwanted variability and makes the analysis more consistent. The scope of possibilities and alternatives to histopathological image analysis is wide (Arevalo and Cruz-Roa, 2014). Additionally, there has been an increasing interest in quantifying the tumor microenvironment and correlating the structural information with tumor invasion and patient prognosis (Wang et al., 2019).
In particular cases, the image analysis techniques can provide the supports to help the pathologists in disease grading and decision making. These techniques can be used to visualize the tissue morphology, stromal or fatty adipose tissue findings, cancer epithelium area detection, tubule counting, mitosis quantification, tissue segmentation, and classification (Aeffner et al., 2019). In that context, the myoma organotypic model (Nurmenniemi et al., 2009) is a particularly attractive tool combining real human stromal tissue with the possibility for various  experimental cell culture set-ups with e.g. anti-invasive drug candidates.
Recently different convolutional neural networks and deep learning based color normalization methods have emerged (Shafiei et al., 2020;Zanjani et al., 2018;Bug et al., 2017). They come with high computational cost which limits the applicability of these methods in practical whole slide image analysis systems. Our proposed method has low computational cost and is easy to use in effective computer aided whole slide image analysis systems. Most recently developed adaptive color deconvolution method proposed by Zheng et al. (2019) is effective in stain normalization with 4% failure rate due to color artifacts and variance which is relatively low. Our proposed method is more effective in stain normalization without any specific failure rate due to color artifacts. On the other hand, it is difficult to analyze the color distribution of the monochromatic whole slide image (Zheng et al., 2019). Here the red boxes represent the results without artifacts or apparent color discontinuity (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

Table 5
Evaluation results of staining separation (mean ± standard deviation), relative square error (rSE Hematoxilyin and rSE Eosin are the stain estimation relative square error of hematoxylin and eosin respectively), time complexity, and average running time of the reference methods.

2.11
Md.Z. Hoque et al. Hence, our proposed method ideally removes the color artifacts and the normalization result is consistent with color appearance in analyzing the color distribution of the monochromatic whole slide image. The evaluation of the state of art methods showed color discontinuity in normalized whole slide images. This is one of the major issues that appear on hematoxylin and eosin boundary. However, our proposed method only estimates illumination map of the whole slide image and utilizes more general color pixels without any apparent color discontinuity. So, the normalized whole slide images are consistent in color appearance containing almost zero percent color artifacts.
Besides, the measure of hematoxylin and eosin (HE)-ratio is vulnerable to being dependent on image content (the actual amount of hematoxylin and eosin stained areas) and our method can be effective also in this case. Therefore, our Retinex model based method is more robust and effective compared to the state of art methods to quantify the individual stain components of the histochemical stains for the perfect expulsion of variability and area segmentation from hematoxylin and eosin stained tissue images.

Conclusion
It is evident that digital pathology is taking vast steps towards realization in clinical work in addition to being part of research methodology. Applications, both to replace conventional time-consuming work routines and the new methods are appealing to the community of pathologists as well as researchers. In this work, we studied the staining difference and color normalization of hematoxylin and eosin stained whole slide images. We examined the performance by calculating the stain vectors. Our Retinex model based stain normalization method separates the individual stain components of the histochemical stains and the normalized images are achieved via illumination estimation utilizing more general color pixels without any apparent color discontinuity. It helps to remove the color artifacts with superior performance and effectively estimates the hematoxylin, eosin, and background distributions of whole slide images.
Experimental evaluation showed that compared to the selected state of art methods, our method consistently has the smallest standard deviation, skewness value, and coefficient of variation in normalized median intensity measurements. The quality performance of our method in terms of QSSIM, SSIM, and PCC is considerably better for the stain normalization with less apparent artifacts in normalized whole slide images. Many recent studies are focused on medical image standardization. Moreover, the effect of staining variation has been noticed as a challenge when developing advanced tools in the area of computational pathology. Ideally, computational pathology would provide new innovations and tools for diagnostic purposes, as well as methods for automated analysis in biobank and other multi-center studies where the large database of digitalized samples are retrieved from multiple laboratories.
Our proposed method can provide a potential tool to remove the color variability to overcome the heterogeneity in the digitalized slides towards automating and comparable analysis in histopathology.

Ethical approval
The Ethical Committee of Northern Ostrobothinia Hospital District approved the research and the use of myoma tissue samples.

Declaration of Competing Interest
The authors have no conflict of interest that could have appeared to influence the work reported in this paper.

Acknowledgments
The research work of this paper was conducted with Physiological Signal Analysis Group at Center for Machine Vision and Signal Analysis (CMVS) in the Faculty of Information Technology and Electrical Engineering (ITEE) at University of Oulu, Finland. This research has been financially supported by Academy of Finland 6 Genesis Flagship (Grant 318927) and Academy of Finland Identifying trajectories of healthy aging via integration of birth cohorts and biobank data (Grant 309112). We are thankful to Professor Tuula Salo group (University of Oulu and University of Helsinki) for selction and access to the myoma organotypic slides. We thank Sanna Juntunen, Eeva-Maija Kiljander, Maija-Leena Lehtonen, and Merja Tyynismaa for technical assistance in preparing the organotypic cultures and slides. Table A1 and Table A2 represent the detailed statistics derived from the separated channels of twenty different myoma samples.

Table A2
Comparisons with different optimal mean and standard deviation (STD) values for eosin channel. Md.Z. Hoque et al.