Detection of pancreatic tumor cell nuclei via a hyperspectral analysis of pathological slides based on stain spectra

Hyperspectral imaging (HSI) provides more detailed information than red-green-blue (RGB) imaging, and therefore has potential applications in computer-aided pathological diagnosis. This study aimed to develop a pattern recognition method based on HSI, called hyperspectral analysis of pathological slides based on stain spectrum (HAPSS), to detect cancers in hematoxylin and eosin-stained pathological slides of pancreatic tumors. Hyperspectral cubes covering 420–750 nm were acquired from tissue microarray (TMA) samples. In HAPSS experiments with a support vector machine (SVM) classifier, we obtained a maximal accuracy of 94%, a 14% improvement over the widely used RGB images. Thus, HAPSS is a suitable method to automatically detect tumors in pathological slides of the pancreas.


Introduction
Pancreatic cancer is pathologically diagnosed via histopathological analysis with hematoxylin and eosin (HE) staining [1,2], and immunohistochemical analysis of other tissues is required to confirm the diagnosis. Various visualization methods such as electron and fluorescence microscopy have been proposed for a more precise diagnosis [3].
Numerous studies have conducted computer-aided diagnoses using pathological specimens owing to the development of a virtual microscope-based technology called whole slide imaging (WSI) [4]. In particular, intranuclear changes are prominent during tumorigenesis, and numerous studies have attempted to detect cell nuclei from red-green-blue (RGB) pathological images [5,6]. Cell nuclei are also used in computer-aided diagnosis (CAD) of prostate cancer [7]. A texture analysis method for cell nuclei has also been proposed for cancer detection [8]. Since RGB images of pathology specimens provide insufficient information for cancer diagnosis, other methods have been proposed in quantifying steatosis [9] and cell cord besides cell nuclei [10]. A texture analysis method that takes advantage of multifractal properties has also been proposed [11]. However, changes in dye amount upon histopathological staining greatly depend on the facility and the timing of staining and on whether the tissue is cancerous or non-cancerous, and it is difficult to obtain consistent analytical accuracy via RGB imaging.
Hyperspectral imaging (HSI) [12] provides rich information, captured over a wide range of the electromagnetic spectrum, for each pixel and typically has hundreds of narrow contiguous bands. HE staining is performed to stain the nucleus and cytoplasm and is appropriate for histological analysis. Furthermore, the color and texture of cell nuclei are altered in cancer [13]. HSI, which provides more detailed information than naked-eye observation, is anticipated as a new image sensing method for pathological diagnosis [14]. Thus, several studies have attempted to analyze different pathological slides using HSI. Many studies have reported the use of HSI for classifying tissues; however, most of them performed macroscopic pathological analysis [15,16,17] for diagnostic endoscopy of tissues [18] or the skin [19,20,21,22,23], where the tissue is thick and displays an adequate light absorption coefficient. For pathological specimens, studies have reported brain cancer diagnoses [24], live fluorescence imaging of specimens [25], mitotic cell detection and segmentation [26], red blood cell counting [27], and tissue classification of unstained specimens [28]. Pancreatic cancer has a low 5-year survival rate and is known worldwide as a refractory cancer. The conventional pathological classification is insufficient to predict the prognosis and therapeutic effect. Therefore, development of a new sensing method is desired to accurately determine the state of the tissue [5,29]. Most studies thus far have evaluated differences in the absorption of light of a specific wavelength and the waveform of HSI. However, the spectral waveform of a pathological specimen is dominated by staining, with lesser light absorption by the tissue itself [30]. Therefore, to classify pathological tissue specimens, detailed analysis of spectral waveforms is necessary.
The present study analyzes HSI data obtained from HE-stained pathological human pancreatic tumor tissue microarray (TMA) slides, with the primary aim of classifying cancerous and non-cancerous pathological slides through detailed analysis of HE staining data. To achieve this objective, the spectral signature of the slides was analyzed in terms of the amount of pigments, spectral absorption coefficient, and other components (e.g., light absorption by the tissue itself and the noise component) based on "darkness of color" and the Lambert-Beer law, and the slides were classified as cancerous or non-cancerous using machine learning methods. "Darkness of color" refers to a state of high absorbance by tissue densely stained with hematoxylin and eosin.

Material and methods
Experiments were performed using HSI of pathological TMAs of human pancreatic tissue, using a combination of a microscope and an HS camera. HE staining generates images with differences in light absorption depending on nuclear thickness. Therefore, we generated high dynamic range (HDR) [31] images using MATLAB to obtain accurate spectral signatures. HDR means that the range of measurable brightness is extended by combining images obtained with long and short exposure times. HDR is widely used to reduce clipped whites and crushed shadows in RGB cameras and smartphones. By using HDR with HSI, it is possible to obtain a more accurate spectrum than by using a single HSI image. Thereafter, spectral signatures were analyzed with respect to the dye amount, spectral absorption coefficient, and other components based on the Lambert-Beer law to classify and identify the tissue samples. Hyperspectral analysis of pathological slides based on stain spectrum (HAPSS) to distinguish tissue samples was performed as detailed herein. In particular, HAPSS (134 bands) distinguishes between the waveforms of individual pigments, re-estimated from the pigment content and pigment absorbance, after classifying cell nuclei into dark and light cell elements. By decomposing the spectra for input to a machine learning algorithm, a higher classification accuracy can be achieved than when using only transmittance.

Biological samples
The biological samples used herein comprised a TMA. Figure 1 shows the TMA of pancreatic cancer (PA721) obtained from US Biomax, Inc. (Rockville, MD, USA) used in the experiment. The tissue specimens in the green box in Fig. 1 are non-cancerous tissue and the specimens in the red box are cancerous tissue. Non-cancerous tissue from 6 patients and cancerous tissue from 6 patients in the TMA slide were used for computer-assisted evaluation. The orange boxes in Fig. 1 denote the cancerous tissue used in the experiment. Each tissue specimen in the TMA was diagnosed by a pathologist.
Fig. 1. Tissue microarray. Diagnosed pathological slides with the cancerous tissue, non-cancerous tissue, and cancerous tissue used in the experiment surrounded by red, green, and orange boxes, respectively.

Acquisition system
To obtain HSI from slides of pathological specimens, an HSI acquisition system was constructed. The acquisition system coupled an NH-3 HS camera [32] (EVA JAPAN CO., LTD., Japan) with a BX-53 microscope (Olympus, Japan) (Fig. 2). The NH-3 is functional in the spectral range of 350 to 1100 nm, has a spectral resolution of 5 nm, and is capable of sampling 151 spectral channels. The NH-3 is a push-broom camera with an internal mechanical/optical shifting system, and it yields 2D spatial images (752 × 480 pixels) with a single exposure. The optical system of the NH-3 is based on a transmission diffraction grating for spectral measurement. The acquisition time for one NH-3 HSI scan is 3-7 s. Using this microscope, it is possible to acquire images in transmission or reflection mode at a magnification of 5×, 10×, and 20×. Herein, HSIs were acquired at 20× magnification. In the present system, when a pathological slide was imaged at 20× magnification, one pixel had dimensions of 0.25 × 0.2758 µm², and the size of one HSI was 120 × 207.4 µm², or 480 × 752 pixels. The XY stage has a resolution of 0.3 µm. A halogen lamp (spectral range: 400-800 nm) served as the light source for the microscope.

Processing framework
Nuclei are generally more likely to become enlarged during cancer. Furthermore, Ki67 expression is strongly associated with tumor cell proliferation and growth, and is widely screened in routine pathological investigations as a proliferation marker. Nuclear morphological analysis has already proven beneficial in cancer detection. In addition, spectral signatures can detect tumor tissue in pathological slides [24]. The present study aimed to develop a detailed analytical method of acquiring spectral signatures via HAPSS. Figure 3 provides an overview of the processing framework used herein. The first stage of the proposed framework comprises a pre-processing pipeline to compensate for the effects of HSI tailing, construct an HDR image, carry out manual detection of nuclei, and estimate the dye amount. Thereafter, spectral signatures were decomposed into dye amount, spectral absorbance coefficient, and other elements (noise, tissue absorbance, etc.). Finally, the performance of the classifiers was evaluated via leave-one-subject-out cross-validation (LOSOCV).

1) Automatic imaging of pathological slides:
Tissue needs to be assessed thoroughly during pathological diagnosis, since findings may vary even for the same tissue specimen.
Although the size of one tissue in the TMA is 1867 × 1920 µm², only a very small region of 120 × 207.4 µm² can be imaged in a single shot of the spectral camera used in this study. Therefore, it is necessary to capture multiple HSIs while moving the XY stage in order to image the entire tissue. In addition, the focus along the Z axis must be varied since the thickness of the pathological sample varies depending on the tissue site. Therefore, a system was developed wherein the entire tissue specimen is scanned automatically during 2D optical microscopy while adjusting the Z focus and the XY stage. Specifically, the Z focus was translated in 2 µm increments, an edge image was obtained at each position using a Sobel filter, and the in-focus position was determined from the total number of edge pixels. The in-focus position has the largest number of edge pixels. A 600-nm image with sufficient contrast was used to detect the edges. Figure 4 shows an RGB image reconstructed from the HSI for an entire tissue specimen obtained with this system. The HSI used in Fig. 4 corresponds to one of the TMA tissues spanning 1867 × 1920 µm². Figure 4 shows the actual scan path, which was followed to enable easy focusing; however, it is not a perfectly combined image with no gaps. A TMA is a collection of tiny pieces of patient tissues, some of which have more fiber than cells. Therefore, one image per TMA tissue was selected from the tile-scanned TMA, in which cell nuclei are observable and fiber is scarce, and a total of 12 tissues (6 non-cancerous tissues and 6 cancerous tissues) was used for the experiment. Figure 5 shows the images selected for the experiment. Images #1-6 are cancerous tissues while #7-12 are non-cancerous tissues. Each tissue was subjected to HSI analysis in triplicate (total number of HSIs, 36). This system requires 3-4 hours to image an entire tissue specimen.
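The focus search described above can be sketched as follows. This is a minimal illustration, not the acquisition software used in the study: the function names, the edge threshold, and the toy images are assumptions, and each image stands for the 600-nm band captured at one Z position.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]


def sobel_edge_count(img: Image, edge_threshold: float) -> int:
    """Count interior pixels whose Sobel gradient magnitude exceeds edge_threshold."""
    h, w = len(img), len(img[0])
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses (3 x 3 kernels).
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 > edge_threshold:
                count += 1
    return count


def best_focus_index(z_stack: Sequence[Image], edge_threshold: float) -> int:
    """Return the index of the Z position whose image has the most edge pixels."""
    counts = [sobel_edge_count(img, edge_threshold) for img in z_stack]
    return max(range(len(counts)), key=counts.__getitem__)


# Toy Z stack: a defocused ramp, a sharp step edge, and a featureless frame.
blurred = [[0.0, 0.2, 0.4, 0.6, 0.8, 1.0] for _ in range(5)]
sharp = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
flat = [[0.5] * 6 for _ in range(5)]
in_focus = best_focus_index([blurred, sharp, flat], edge_threshold=2.0)
```

In this toy stack the sharp step edge yields the largest edge count, so the second Z position is selected. The edge threshold is a free parameter here; the text does not state the value used.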

2) Construction of an HDR image (calibration):
Typically, cell nuclei are darkly stained with hematoxylin. However, some pixels of cell nuclei may not acquire a darker appearance owing to the effect of cancer and other illnesses. If hematoxylin absorbance is strong, an accurate spectral waveform cannot be obtained because the amount of transmitted light decreases, consequently decreasing the signal-to-noise ratio. Therefore, we acquired one image under short exposure and another under prolonged exposure, and combined the two. This means that, starting from the short-exposure HSI, only pixels corresponding to strong absorption are replaced with long-exposure pixels. This method is a simple extension of HDR to HSI.
In particular, when the maximum pixel value of the spectrum of each pixel for short-exposure images is higher than the predetermined threshold value (3800), the spectrum of pixels for long-exposure images is replaced with short-exposure pixels (the pixel value obtained using the camera is 12 bit). Threshold processing eliminates pixel saturation due to the long exposure time. Although the maximum value of HSI is 4095, the transmitted light becomes too intense when the threshold is set at this largest value, so the threshold was empirically set to a lower value of 3800. Then, the long-exposure image is converted to a transmittance image, using Eq. (1), to reduce the effect of the exposure period.
The image with a short exposure period was converted to a transmittance image, using Eq. (2).
Here, LE is a long exposure image (transmittance), SE is a short exposure image (transmittance), RawL (13.1 to 14.9 [s]) is an HS image acquired with a longer exposure, RawS (7.9 to 8.8 [s]) is an HS image acquired with a shorter exposure, and RawW (7.9 to 8.8 [s]) is a white image produced by scanning glass. RawD is an HS image obtained after turning off the microscope light to prevent exposure to the HS camera. Additionally, ExpW is the exposure duration for scanning a white image, ExpL is the duration of a longer exposure, ExpS is the duration for a shorter exposure, and b is the band number of HSI. The final HSI − HDR is calculated using Eq. (3).
For pathological slides, HDR is useful in pixels with strong absorption. Figure 6 shows the state of nuclei with strong absorption, showing the RGB image reconstructed from the HSI for the nuclei with strong absorption (Fig. 6(a)); the red arrow points to the strong absorption pixel.   Figure 7 shows the spectrum of strong absorption pixels. The waveform for the HE-stained specimens is considered to be smooth. The waveform for short exposure in Fig. 7 is disrupted in the region of 590 to 670 nm; however, the waveform for long exposure in Fig. 7 is smooth. An HDR is considered to acquire a waveform with high accuracy.
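The per-pixel HDR merge can be sketched as below. Because Eqs. (1)-(3) are not reproduced in this text, the transmittance formula (dark-subtracted signal divided by exposure time, relative to the white reference) is an assumed form, and all function and parameter names are hypothetical. The saturation check is applied here to the long-exposure raw values, which is one reading of the threshold rule; it should be adapted if the check is meant to use the short-exposure values.

```python
from typing import List, Sequence


def to_transmittance(raw: Sequence[float], raw_dark: Sequence[float],
                     raw_white: Sequence[float], exp_sample: float,
                     exp_white: float) -> List[float]:
    """Exposure-normalised transmittance per band (assumed form of Eqs. (1)-(2))."""
    return [((r - d) / exp_sample) / ((w - d) / exp_white)
            for r, d, w in zip(raw, raw_dark, raw_white)]


def hdr_pixel(raw_long: Sequence[float], raw_short: Sequence[float],
              raw_dark: Sequence[float], raw_white: Sequence[float],
              exp_long: float, exp_short: float, exp_white: float,
              threshold: float = 3800.0) -> List[float]:
    """Merge one pixel spectrum from a long and a short exposure.

    Assumption: a long-exposure spectrum whose maximum raw value exceeds
    the threshold is treated as saturated and replaced by the
    short-exposure spectrum; otherwise the long exposure is kept.
    """
    if max(raw_long) > threshold:
        return to_transmittance(raw_short, raw_dark, raw_white, exp_short, exp_white)
    return to_transmittance(raw_long, raw_dark, raw_white, exp_long, exp_white)


# Two-band toy spectra: a bright pixel (long exposure near saturation)
# and a strongly absorbing pixel (long exposure gives the better signal).
dark = [0.0, 0.0]
white = [4000.0, 4000.0]
bright_px = hdr_pixel([4000.0, 3900.0], [2000.0, 1950.0], dark, white, 2.0, 1.0, 1.0)
dark_px = hdr_pixel([800.0, 400.0], [400.0, 200.0], dark, white, 2.0, 1.0, 1.0)
```

The threshold of 3800 on a 12-bit scale is the value stated in the text; the toy exposure times and raw values are illustrative only.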
3) Band reduction: Microscopes generally contain a filter that passes only visible light, whereas an HS camera can also capture images in the ultraviolet and infrared. The HS camera used in this study is functional in the spectral range of 350 to 1100 nm, has a spectral resolution of 5 nm, and is capable of sampling 151 spectral channels. Figure 8 shows the transmission of light from the microscope light source through the filter of the microscope. This study used 67 bands for imaging, ranging from 420 nm to 750 nm, at 5-nm intervals.
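The band counts follow directly from the stated ranges and the 5-nm spacing. A small sketch (function names are illustrative) that selects the 420-750 nm subset from the full camera range:

```python
from typing import List


def band_wavelengths(start_nm: int = 350, stop_nm: int = 1100,
                     step_nm: int = 5) -> List[int]:
    """Centre wavelengths of all camera bands, inclusive of both ends."""
    return list(range(start_nm, stop_nm + 1, step_nm))


def select_band_indices(wavelengths: List[int], lo_nm: int = 420,
                        hi_nm: int = 750) -> List[int]:
    """Indices of the bands that fall inside [lo_nm, hi_nm]."""
    return [i for i, wl in enumerate(wavelengths) if lo_nm <= wl <= hi_nm]


all_bands = band_wavelengths()            # 350, 355, ..., 1100 nm
kept = select_band_indices(all_bands)     # bands from 420 to 750 nm
```

With these defaults, `all_bands` has 151 entries and `kept` has 67, matching the numbers given in the text.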
4) Manual nuclear detection: Cell nuclei are known to undergo prominent changes once cancer develops, and cell growth increases in tissues. For example, a special stain for Ki67 is often used to determine whether a cell has become cancerous, as the stain can mark nuclei of cells with a robust proliferative ability. This study limited its analysis to cell nuclei in anticipation of a difference in nuclear structure between cancerous and non-cancerous cells. Cell nuclei were detected in images manually to validate the effectiveness of the proposed method. Figure 9 shows the manual detection process (green pixels). Only cell nuclei that could be reliably identified were marked. The size of one HSI is 120 µm × 207.4 µm. Table 1 shows the number of manually detected nuclei and the number of pixels.

HAPSS
Cell nuclei were determined to be either cancerous or non-cancerous by analyzing the spectral characteristics of nuclei obtained from HE-stained pathological slides of pancreatic tissue. Cell nuclei of tumorigenic tissues tend to stain darker than those of non-cancerous tissues [11]. However, the dye amount changes with the duration of staining, the effects of illnesses other than cancer, and differences between staining facilities. Therefore, this study serves as a proof of concept for determining whether cell nuclei are cancerous or non-cancerous via analysis of the spectral characteristics obtained through HAPSS.

1) Estimation of dye amount
Pathological slides become transparent when pigments such as hemoglobin are eliminated during tissue preparation for histological analysis. Cell nuclei are then stained with hematoxylin and the cytoplasm is stained with eosin to enhance the visibility of the tissue structure. Consequently, a spectral signature dominated by the spectra of HE is obtained by scanning the HSI of HE-stained slides. Figure 10 shows an average of the normalized spectral signature of 25 pixels (5 × 5) for the absorbance of red blood cells, hematoxylin, and eosin in non-cancerous tissue. The hematoxylin and eosin spectra were acquired from HSI of only hematoxylin-stained or only eosin-stained slides, respectively. The spectrum for red blood cells was acquired from HSI of unstained slides. Figure 11 shows the RGB image reconstructed from the HS image of single-stained specimens and red blood cells. The red arrow in Fig. 11 points to the pixels sampled for the normalized spectral signature. The maximum absorption bands were observed at 540 nm for eosin and 610 nm for hematoxylin. Hematoxylin is also absorptive over a wide range of 465-710 nm. Figure 11 shows an average of a normalized spectral signature of 25 pixels (5 × 5) for the absorbance of cell nuclei and cytoplasm of non-cancerous tissue. The absorption coefficients of hematoxylin and eosin vary depending on the tissue and staining procedure.
Therefore, it is necessary to re-measure the spectral absorption coefficient when the tissue and the staining procedure are changed. In addition, the spectral absorption coefficient is calculated from pixels corresponding to strong light absorption by the cell nuclei of the normal tissue in Fig. 11, and the spectrum may differ slightly depending on the state of cancer/non-cancer and the tissue. Therefore, it should be noted that the calculation results of 2.3.2 include differences depending on the state of cancer/non-cancer and tissue. Hematoxylin absorbs more light than eosin in cell nuclei, while hematoxylin absorbs less light relative to eosin in the cytoplasm. A method based on the principle of HE-stained slides, wherein dye amounts for hematoxylin, eosin, and red blood cells are estimated from HSI, has been proposed [30]. In accordance with the Lambert-Beer law, the absorbance for HE-stained pathological slides is defined by Eq. (4), where a is the absorbance, T is the transmittance, X is the spectral absorption coefficient of the HE-stained slides (hematoxylin, eosin, red blood cells), c is the dye amount, and n refers to other absorbing components. Absorbance is calculated using Eq. (5). Since the absorbance and spectral absorption coefficient are known at this point, c can be estimated by multiplying a on the left by the pseudoinverse of X, that is, $c = X^{+} a$ (Eq. (6)).

2) Re-estimation of absorbance and calculation of difference value
Absorption by the staining fluid is dominant in the spectral signature of the absorbance for HE-stained slides, but HSI may also include additional information such as absorbance by the tissue itself and camera noise. Therefore, we calculated the difference value for components other than the dye using Eq. (7): $n = a - Xc$.
Here, c is the estimated dye amount based on Eq. (8), and Xc is the re-estimated absorbance based on the estimated dye amount. Figure 10 shows the absorbance and re-estimated absorbance values of one pixel for non-cancerous tissue. Figure 13 shows a graphical representation of the difference value n obtained using Eq. (7).
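The dye-amount estimate and the difference value can be illustrated with a small least-squares sketch. It assumes the linear model a = Xc + n described above and uses the fact that, when the columns of X are linearly independent, applying the pseudoinverse is equivalent to solving the normal equations (XᵀX)c = Xᵀa. The base-10 logarithm in `absorbance`, the helper names, and the toy numbers are assumptions; the text does not state the logarithm base.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def absorbance(transmittance: Sequence[float]) -> List[float]:
    """Absorbance from transmittance (assumed a = -log10 T)."""
    return [-math.log10(t) for t in transmittance]


def solve_linear(A: Matrix, b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def estimate_dye_amount(X: Matrix, a: Sequence[float]) -> List[float]:
    """Least-squares dye amounts c; equals pinv(X) @ a for full-column-rank X."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xta = [sum(row[i] * ai for row, ai in zip(X, a)) for i in range(k)]
    return solve_linear(XtX, Xta)


def difference_value(X: Matrix, a: Sequence[float], c: Sequence[float]) -> List[float]:
    """Residual n = a - Xc, i.e. absorbance not explained by the dyes."""
    return [ai - sum(x * ck for x, ck in zip(row, c)) for row, ai in zip(X, a)]


# Toy example: 3 bands, 2 dyes (rows are bands, columns are dyes).
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_toy = [2.0, 3.0, 5.0]                    # generated from c = [2, 3], n = 0
c_hat = estimate_dye_amount(X_toy, a_toy)
n_hat = difference_value(X_toy, a_toy, c_hat)
```

In practice X would have 67 rows (bands) and three columns (hematoxylin, eosin, red blood cells), and a library pseudoinverse routine would normally replace the hand-written solver.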

3) Classification of nuclei pixels
In HSI, nuclei contain dark and bright pixels with different spectral signatures. If all pixels were classified by a single machine learning model, the model would need to account for the dye amount in addition to whether the pixels correspond to cancerous or non-cancerous cells, which complicates the task. This study therefore first separates pixels into dark and bright pixels on the basis of transmittance before applying machine learning. The following three types of standards were considered to identify pixels as dark or bright. Figure 14 shows the transmittance, absorbance, and ratio of absorbance of eosin and hematoxylin (REH) of nuclei for 25 pixels (5 × 5):

1) Transmittance threshold for eosin (515-565 nm)
This threshold classifies cell nuclei as bright if the minimum transmittance within 515-565 nm in Fig. 14 exceeds the threshold, and dark if the minimum value is below the threshold.

2) Transmittance threshold for hematoxylin (585-635 nm)
This threshold classifies cell nuclei as bright if the minimum transmittance within 585-635 nm in Fig. 14 exceeds the threshold, and dark if the minimum value is below the threshold.

3) Threshold for the ratio of absorbance of eosin and hematoxylin
First, the absorbance is normalized with the maximum value of absorbance. The resulting ratio is the ratio of absorbance of eosin and hematoxylin (REH). Because eosin has the stronger absorbance, normalizing with the maximum value yields the ratio of the absorbance at each wavelength to the maximum absorbance of eosin. Thereafter, cell nuclei are classified as dark if the minimum value of the REH within 585-635 nm in Fig. 14 exceeds the threshold, and bright if the minimum value is below the threshold.
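The three darkness criteria can be sketched as follows for a 67-band spectrum sampled at 420-750 nm. The function names and the toy spectra are illustrative, and treating a value exactly equal to the threshold as dark (for transmittance) or bright (for REH) is an assumption, since the text does not specify the tie case.

```python
from typing import List, Sequence

WAVELENGTHS = list(range(420, 751, 5))  # 67 bands, 5-nm spacing


def window_min(spectrum: Sequence[float], lo_nm: int, hi_nm: int) -> float:
    """Minimum of a spectrum inside the wavelength window [lo_nm, hi_nm]."""
    return min(v for v, wl in zip(spectrum, WAVELENGTHS) if lo_nm <= wl <= hi_nm)


def is_dark_by_transmittance(transmittance: Sequence[float], lo_nm: int,
                             hi_nm: int, threshold: float) -> bool:
    """Criteria 1 and 2: bright if the window minimum exceeds the threshold."""
    return not window_min(transmittance, lo_nm, hi_nm) > threshold


def reh(absorbance: Sequence[float]) -> List[float]:
    """Absorbance normalised by its maximum value (REH)."""
    peak = max(absorbance)
    return [a / peak for a in absorbance]


def is_dark_by_reh(absorbance: Sequence[float], threshold: float) -> bool:
    """Criterion 3: dark if the minimum REH within 585-635 nm exceeds the threshold."""
    return window_min(reh(absorbance), 585, 635) > threshold


# Toy transmittance with a dip at 540 nm (eosin window) only.
T_toy = [0.8] * 67
T_toy[WAVELENGTHS.index(540)] = 0.15
dark_eosin = is_dark_by_transmittance(T_toy, 515, 565, threshold=0.2)
dark_hematoxylin = is_dark_by_transmittance(T_toy, 585, 635, threshold=0.45)

# Toy absorbance: eosin peak at 540 nm, raised plateau over 585-635 nm.
a_toy = [0.1] * 67
a_toy[WAVELENGTHS.index(540)] = 1.0
for wl in range(585, 640, 5):
    a_toy[WAVELENGTHS.index(wl)] = 0.8
dark_reh = is_dark_by_reh(a_toy, threshold=0.7)
```

Here the toy pixel is dark under the eosin criterion, bright under the hematoxylin criterion, and dark under the REH criterion, which shows that the three standards need not agree for the same pixel.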

Evaluation metrics
The results obtained via a random forest (RF) [33] or support vector machine (SVM) [34] (supervised classifiers) were evaluated using the standard sensitivity, specificity, and accuracy metrics. In the present analysis, which focused on cell nuclei, the numbers of dark and bright pixels differ between subjects depending on the image. Hence, 100 random pixels were considered per image to balance the data for LOSOCV-based evaluation. Predictive accuracy is the performance measure generally associated with machine learning algorithms. It is the sum of true positives (TP) and true negatives (TN) divided by the total number of examples, as expressed in Eq. (9), where FN and FP denote the numbers of false negatives and false positives. Sensitivity is related to the ability of the analysis to accurately identify a condition, and is expressed in Eq. (10). Specificity is related to the ability of the analysis to accurately exclude a condition, and is expressed in Eq. (11).
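For reference, the three metrics can be written out using their standard definitions in terms of TP, TN, FP, and FN. The counts in the example are illustrative and are not results from this study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """(TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN): fraction of positive (cancer) pixels correctly identified."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP): fraction of negative (non-cancer) pixels correctly excluded."""
    return tn / (tn + fp)


# Illustrative confusion counts for 50 cancer and 50 non-cancer pixels.
acc = accuracy(tp=45, tn=49, fp=1, fn=5)
sens = sensitivity(tp=45, fn=5)
spec = specificity(tn=49, fp=1)
```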

Experiment description
Discrimination accuracy of cancerous and non-cancerous tissue was assessed via HAPSS. The following experiments were conducted to determine the appropriate usage of the proposed method and to evaluate its accuracy.
• LOSOCV-based evaluation of accuracy: Accuracy was evaluated using RGB images calculated from HSI, classified HSI (C-HSI), and HAPSS. HAPSS as proposed herein is a combination of REH and a difference value. REH converts simple absorbance information into the ratio of hematoxylin and eosin. This is advantageous in that supervised machine learning can be carried out without directly considering the dye amount.
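The leave-one-subject-out protocol can be sketched as below: each subject is held out in turn, a classifier is trained on the remaining subjects, and the held-out accuracies are averaged. The nearest-centroid classifier is only a stand-in to keep the sketch self-contained; it is not the RF or SVM used in the study, and all names and the toy data are hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]          # (feature vector, label)
Predictor = Callable[[Sequence[float]], int]


def train_nearest_centroid(train: List[Sample]) -> Predictor:
    """Stand-in classifier: assign the label of the closest class centroid."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(features: Sequence[float]) -> int:
        return min(centroids, key=lambda y: sum(
            (a - b) ** 2 for a, b in zip(features, centroids[y])))

    return predict


def losocv_accuracy(data: Dict[str, List[Sample]],
                    train_fn: Callable[[List[Sample]], Predictor],
                    n_per_subject: int = 100, seed: int = 0) -> float:
    """Mean held-out accuracy over subjects, sampling pixels per subject."""
    rng = random.Random(seed)
    # Draw the same number of pixels from every subject to balance the data.
    sampled = {s: rng.sample(v, min(n_per_subject, len(v))) for s, v in data.items()}
    accuracies = []
    for held_out, test in sampled.items():
        train = [x for s, v in sampled.items() if s != held_out for x in v]
        predict = train_fn(train)
        correct = sum(predict(f) == y for f, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Toy data: two "cancer" subjects (label 1) and two "non-cancer" subjects (label 0).
toy = {
    "p1": [([1.0], 1)] * 5, "p2": [([0.9], 1)] * 5,
    "p3": [([0.0], 0)] * 5, "p4": [([0.1], 0)] * 5,
}
mean_acc = losocv_accuracy(toy, train_nearest_centroid)
```

Because no pixels from the held-out subject are seen during training, this protocol measures generalization to new patients rather than to new pixels of known patients.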

Determination of the threshold of pixel darkness
We evaluated three different threshold values to classify the darkness of nuclear pixels. When determining the darkness of nuclear pixels, the accuracy of the three methods described as follows was evaluated.
• Determination of the threshold for pixel darkness: In this study, separate classifiers were built for dark pixels and bright pixels to classify pixels from spectral information. Pixels were first classified as either bright or dark. A binary classifier was then used to classify the bright and dark pixels as either cancer or non-cancer. An experiment was performed to determine the threshold value that distinguishes whether a pixel is dark or bright. Specifically, accuracy was evaluated with LOSOCV while varying the threshold with each method. An appropriate threshold value was determined experimentally so that machine learning is carried out separately for each level of darkness of cell nuclei, as proposed in HAPSS. A random forest classifier was used, and 100 pixels were sampled from each tissue for LOSOCV analysis to prevent learning with unbalanced classes. The number of decision trees of the random forest was set to 500 as a hyper-parameter, and the depth of the random forest was 8 if the number of bands was 67 and 11 if the number of bands was 134. Here, 67 bands were used for HSI and 134 bands were used for HAPSS (REH + difference value).
The minimum transmittance at 585-635 nm for hematoxylin, which is used to stain cell nuclei, was determined from the obtained transmittance of the spectral signature, and pixels were assessed while changing the threshold after classification. Accuracy was evaluated for threshold values ranging from 0.3 to 0.7. The experimental results are shown in Fig. 15. The maximum accuracy for HAPSS was 90% when the threshold value was 0.45, and the maximum accuracy for C-HSI was 87% when the threshold value was 0.6.
The minimum REH at 585-635 nm was determined from the obtained REH of the spectral signature, and pixels were assessed while changing the threshold after classification. Accuracy was evaluated for threshold values within the range 0.4-0.7. The experimental results are shown in Fig. 15. The maximum accuracy for HAPSS was 88% when the threshold value was 0.7, and the maximum accuracy for C-HSI was 84% when the threshold value was 0.7.
The minimum transmittance at 515-565 nm was obtained for eosin staining, and the pixels were assessed while changing the threshold after classification. Accuracy was evaluated for threshold values ranging from 0.15 to 0.40. The experimental results are shown in Fig. 15. The maximum accuracy for HAPSS was 91% when the threshold value was 0.20, and the maximum accuracy for C-HSI was 88% when the threshold value was 0.2-0.35. Based on the aforementioned results, 0.2 was considered the most accurate threshold for eosin staining for C-HSI and HAPSS in this study.
Fig. 15. Accuracy of C-HSI and HAPSS versus the threshold value for the eosin, REH, and hematoxylin criteria: C-HSI with the hematoxylin threshold (blue), HAPSS with the hematoxylin threshold (orange), C-HSI with the REH threshold (yellow), HAPSS with the REH threshold (purple), C-HSI with the eosin threshold (green), and HAPSS with the eosin threshold (light blue).

Experimental results by average accuracy via LOSOCV analysis
Based on the threshold values for pixel darkness, the minimum value of transmittance at 515-565 nm (eosin) was used to classify pixels on the basis of darkness for HAPSS, and cell nuclei were classified at the threshold value of 0.2. Table 2 shows the average accuracy for SVM and RF for 36 HSIs of 12 patients (6 patients with cancer, 6 patients without cancer) assessed herein. Figure 16 shows a bar graph of the classification accuracy obtained using the support vector machine and random forest classifiers for cancerous and non-cancerous tissue. RF [33] and SVM [34] classifiers were used and 800 pixels were sampled from each tissue for LOSOCV analysis to prevent unbalanced classes. The kernel used for SVM was a Gaussian radial basis function (RBF), which was selected using hyper-parameter optimization in MATLAB (2018b). The "Box constraint" and "Kernel scale" hyper-parameters of the SVM were also optimized. The average of 10 LOSOCV analyses was used for evaluation [35,36]. LOSOCV was chosen over the commonly used 10-fold cross-validation because training and testing on the same subject can bias the results. In addition, the computational cost of each classifier is shown in Table 2 as a measure of the time required to train, and the performance of each classifier was evaluated on an Intel Core i7-6800k at 3.4 GHz. Reconstructed RGB images (3 bands), HSI (67 bands), HSI by HDR (67 bands), color-classified and learned spectral signature (67 bands) (C-HSI), HAPSS (134 bands) and HAPSS by HDR (134 bands) were used for the evaluation. HSI as denoted in Tables 2 and 3 is therefore equivalent to that in conventional studies [23]. The average SVM classification accuracy for dark pixels of HAPSS by HDR was 94.0%, an improvement of 14.5% over that of RGB images. In addition, the average SVM classification accuracy for only dark pixels was 94.0% for HAPSS, an improvement of 3.0% over conventional HSI.
The average random forest classification accuracy was 88.0% for HAPSS by HDR, an improvement of 9.3% over RGB images. In addition, the average accuracy for only dark pixels by random forest was 90.7% for HAPSS, an improvement of 4.5% over conventional HSI. Table 3 shows the SVM classification accuracy for each patient as well as the number of dark pixels and bright pixels. Comparisons among patients indicate that the accuracy of HSI was 81.0% for patient #12, whereas it improved significantly with HAPSS to 98.0%. The dye amount for patient #12 was higher than that of other non-cancer images, which may have complicated the classification based exclusively on dye amount and transmittance. The dye amount for patient #11 was lower than that of other standard images, and the accuracy was 95.0% for HSI and improved with HAPSS to 96.0%. The accuracies of C-HSI and HAPSS, both proposed herein, were higher than that of conventional HSI for most patients. Table 3 shows the random forest classification accuracy for each patient as well as the number of dark pixels and bright pixels. Comparisons among patients indicate that the accuracy of HSI was 49.0% for patient #12, whereas it improved significantly with HAPSS to 97.0%. For patient #11, whose dye amount was lower than that of other standard images, the accuracy was 90.0% for HSI and improved with HAPSS to 94.0%. The accuracies of C-HSI and HAPSS, both proposed herein, were higher than that of conventional HSI for most patients. Figure 17 shows pseudo-color images of the classification by the proposed method with SVM: cancer is shown with dark pixels in orange and bright pixels in yellow, and non-cancerous tissue is shown with dark pixels in blue and bright pixels in light blue. Orange or yellow pixels denote cancer classified via SVM. Blue and light blue pixels denote non-cancerous tissue classified via SVM. The distinction was accurate for many cell elements, despite partial differences.
Moreover, yellowish cell elements including other bright cell elements are dominant in the case of cancerous tissue, and blue or dark cell elements are dominant in non-cancerous tissues.

Discussion and conclusion
This study proposed a method of classifying non-cancerous and cancerous human pancreatic tissue using pathological slides for computer-assisted diagnosis via HAPSS. An optical microscope and an HS camera were used to build a system for obtaining HSI at 420-750 nm. HSI for various tissues from 12 subjects were scanned from a single TMA, 6 of which were cancerous (duct adenocarcinoma, grade 2). Cell nuclei were extracted manually from these HSI, and 656,440 spectral signatures (cancer, 355,187 pixels; non-cancerous tissue, 301,253 pixels) labeled as either non-cancerous or cancerous were obtained. HAPSS calculated from the obtained spectral signatures was evaluated via LOSOCV using a random forest and SVM.
While HAPSS learns by constructing separate classifiers and classifying target tissues into dark and bright pixels, the accuracy of the analysis improved most when the threshold value was determined based on the change in transmittance during eosin staining at this point in the process. Furthermore, in the evaluation using SVM, the accuracy improved by 14.0% compared to that for RGB images, which have been widely used. In addition, the accuracy for HSI was 11.0% higher than for RGB images. Moreover, if HSI are used to diagnose pathological slides, changes in the spectral signature would be significant owing to differences among patients in the dye amount of the tissue, and the accuracy would change among patients upon LOSOCV-based analysis. However, HAPSS was more resilient against the differences among patients, and the accuracy was more stable than that for RGB images and HSI. In many cases, it has been confirmed that the accuracy improves when using HDR. In addition, the difference in accuracy between HAPSS and HSI in SVM is considered to be smaller because feature selection is not included.
In summary, HAPSS is a suitable method to develop an automatic diagnostic tool for pathological slides by decomposing spectral signatures. HAPSS yielded better results than those obtained using only spectral signatures. However, the TMA tissue used herein was stained in the same environment. Therefore, non-cancerous tissues are not included when analyzing cancerous tissues, thus serving as a challenge for future studies. Therefore, further studies are needed to confirm the validity of HAPSS. Moreover, it will be possible to combine HAPSS with morphological assessments to improve the overall diagnostic accuracy. We consider that HAPSS based on nuclear pixels is applicable for imaging various cancers. Furthermore, light absorption of tissues in HAPSS is minimal, and noise is likely predominant. Therefore, a method of separating light absorption of tissues from noise would be required, along with a more precise spectral waveform. In the near future, it is expected that HSI will be utilized in a manner similar to that of special staining and electron microscopy, as diagnostic information to be used during pathological diagnosis, thereby ultimately improving the accuracy of pathological diagnoses.