Visualization of quantitative lipid distribution in mouse liver through near-infrared hyperspectral imaging.

Lipid distribution in the liver provides crucial information for diagnosing the severity of fatty liver and fatty liver-associated liver cancer. Therefore, a noninvasive, label-free, and quantitative modality is eagerly anticipated. We report near-infrared hyperspectral imaging for the quantitative visualization of lipid content in mouse liver based on partial least squares regression (PLSR) and support vector regression (SVR). The analysis results indicate that SVR with standard normal variate pretreatment outperforms PLSR, achieving a better root mean square error (15.3 mg/g) and a higher coefficient of determination (0.97). Quantitative mapping of the lipid content in the mouse liver is realized using SVR.


Introduction
Nonalcoholic fatty liver disease (NAFLD), including simple nonalcoholic fatty liver and nonalcoholic steatohepatitis (NASH), is one of the most common liver diseases worldwide, which can advance to liver cirrhosis and even hepatocellular carcinoma (HCC) [1][2][3]. The accumulation of excess lipids, such as triglycerides, in liver tissue is the distinctive feature of NAFLD [4,5]. Liver biopsy is the current reference standard for the definite diagnosis of NAFLD with a histology of intrahepatocellular fat accumulation [6]. The intrinsic limitations of biopsy, such as sampling error, difficulty in repeating, and invasiveness, necessitate the development of noninvasive and quantitative diagnosis. Ultrasonography and computed tomography can be used as noninvasive diagnostic tools for NAFLD; however, both lack the ability for the sensitive and fine detection of mild steatosis and slight changes in the fat content [7,8]. Magnetic resonance spectroscopy/imaging has been regarded as the most accurate quantitative method for measuring the fat content in the liver [9]. Although such diagnostic modalities have demonstrated in vivo feasibility for clinical use, a substantial need remains for the further development of noninvasive, quantitative, and rapid diagnostic modalities, with considerably higher spatial resolution, in particular, compared to magnetic resonance imaging (1-2 mm), computed tomography (0.5 mm) [10], and ultrasonography (∼0.5 mm) [11].
Spectroscopic and imaging techniques in the diffuse reflection mode provide a variety of clinically relevant information on the physiological composition, structure, and function of tissue [12], as well as sufficiently high spatial resolution down to the sub-micrometer level [13,14]. The visible and near-infrared region (VIS-NIR, approximately 400-1000 nm) is generally used to monitor the functional status of tissue by measuring the prominent absorption of oxygenated and deoxygenated hemoglobin [15][16][17]. For estimating the lipid concentration, the shortwave infrared region (SWIR, 1000-2000 nm) is more advantageous than the VIS-NIR region.
[...] diet (ND) (CE-2, CLEA Japan Inc.) or high-fat diet (HFD) (D12492, D17011206, D17011207, Research Diets Inc.). The three HFDs (5.2 kcal/gm) provided comparable amounts of protein (20% energy), carbohydrate (20% energy), and total fat (60% energy). All three HFDs contain the same total amount of fat (270 gm) but are composed of different types of oil: D12492 contains 245 gm of lard and 25 gm of soybean oil; D17011206 contains 250.4 gm of coconut oil and 19.6 gm of safflower oil; and D17011207 contains 151.6 gm of coconut oil and 118.4 gm of safflower oil. Feeding this variety of diets yielded mice with a wide range of lipid content in their livers. The livers isolated from the mice after blood removal were cut into approximately 3-mm-thick sections. These liver samples were first investigated using the NIR-HSI system. After image acquisition, the samples were processed to evaluate their lipid content.

Determination of the lipids extracted from mouse livers
The lipid content (LC; unit: mg/g) of each mouse liver sample, defined as the ratio of the extracted total lipid weight to the liver weight, was used for establishing the regression models. Folch extraction is a well-established protocol for isolating and purifying lipids from animal tissue [44]; this method was employed to extract the total lipids from each liver section, and the lipids dissolved in the organic solvent were then quantified. After the spectral images of a liver sample were acquired, the sample was weighed on an electronic scale. The sample was then placed in a 15-mL Falcon tube, and a 2:1 (v/v) chloroform-methanol mixture was added to the tube. The lipids were extracted by homogenizing the liver sample in the mixture. The lipid-containing mixture was then passed through filter paper into another 15-mL Falcon tube. After 2 mL of distilled water was added to the tube, centrifugation was performed at 4000 rpm for 5 min. The organic fraction, which separated as a distinct phase in the tube, was transferred to a glass petri dish and heated on a hotplate at 90°C until the organic solvent had completely evaporated. The lipids remaining in the dish were weighed on an electronic scale. Because the liver weight varied among samples, the LC was calculated by dividing the lipid weight by the liver weight for further analysis.
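The final normalization step can be illustrated with a minimal sketch. The function name and the weighings below are hypothetical and are not data from this study; the only relation taken from the text is LC = lipid weight / liver weight.

```python
def lipid_content_mg_per_g(lipid_weight_mg: float, liver_weight_g: float) -> float:
    """Lipid content (LC) in mg/g: extracted lipid weight divided by liver sample weight."""
    if liver_weight_g <= 0:
        raise ValueError("liver weight must be positive")
    return lipid_weight_mg / liver_weight_g


# Hypothetical weighings: 30 mg of dried lipid recovered from a 0.25 g liver section.
lc = lipid_content_mg_per_g(lipid_weight_mg=30.0, liver_weight_g=0.25)
print(f"LC = {lc:.1f} mg/g")  # LC = 120.0 mg/g
```

Expressing the result per gram of tissue is what makes sections of different sizes comparable in the regression models.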

Near-infrared hyperspectral imaging system and data treatment
A line-scanning NIR-HSI system was used to obtain the hyperspectral images of the mouse liver samples, as schematically shown in Fig. 1. The system comprises an NIR reflectance imaging spectrometer (ImSpector N17E, SPECIM, Finland) coupled to an NIR-sensitive camera (XEVA-1.7-320, Xenics, Belgium) and an NIR objective lens (SWIR series 83-160, focal length 25 mm, Edmund Optics, USA), a rotation mirror for scanning along the y-axis, a light source, and personal computer hardware and software (JFE Techno-Research Corp., Tokyo, Japan). The NIR camera includes a two-dimensional indium gallium arsenide (InGaAs) sensor array (256 × 320 pixels corresponding to the wavelength and position, respectively) and can detect NIR light from 900 to 1700 nm with a wavelength resolution of 2.2 nm. A halogen lamp (LA-150UE, Hayashi-Repic Inc., Tokyo, Japan) was employed as the light source and was attached to two flexible fiber light guides to irradiate the samples from both sides at 45°. Each liver sample was scanned using a 40 × 100 mm² gold-coated mirror placed on a motorized rotation stage (SGSP-60YAW-OB, OptoSigma, Tokyo, Japan), while the camera acquired x-λ data. The frame rates ranged from 60 to 340 Hz, and the exposure time was fixed at 30 ms. To improve the signal-to-noise ratio, 16 scans were performed for each sample, and the averages of the 16 full images at each wavelength were used for further analysis. The hyperspectral images were calibrated using white and dark reference images; a white reference plate was observed under the same conditions, and the dark images were obtained by turning off the illumination and covering the lens with a cap. Note that the power of the halogen lamp was controlled such that the sample temperature was stable.
The NIR reflectance images were normalized to determine the relative reflectance R(λ) using the following equation:
$$R(\lambda) = \frac{I_{\mathrm{raw}}(\lambda) - I_{\mathrm{dark}}(\lambda)}{I_{\mathrm{white}}(\lambda) - I_{\mathrm{dark}}(\lambda)}$$
where R(λ) is the calculated relative reflectance at each wavelength λ, I_raw(λ) is the raw intensity of a given pixel, I_dark(λ) is the intensity of the dark reference image, which is subtracted to remove the effect of the dark current when converting the intensity images into reflectance images, and I_white(λ) is the intensity obtained from the white reference image, which also includes the dark current. After eliminating the spikes in the images, the pixels belonging to each liver sample (1150 pixels on average) were selected by assigning a rectangular region in the image. Further, the mean reflectance spectrum for each liver sample was calculated from the component pixel reflectance spectra and used to represent the liver sample. In addition, the absorbance spectra corresponding to these reflectance spectra, defined as log (1/R(λ)), were used for analysis.
Fig. 1. Schematic of the near-infrared hyperspectral imaging (NIR-HSI) system. The illumination path is shown in yellow, and the imaging path in black. An imaging spectrometer, equipped with an InGaAs CCD camera and a short-wave infrared (SWIR) lens, is placed in front of a gold-coated rotation mirror that enables a linear scan of the sample of interest. The image acquisition and the scanning mirror are controlled through in-house data acquisition software. A slit installed in the imaging spectrometer selectively passes the light reflected at the central region of the rotation mirror.
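The calibration and absorbance conversion described in this section can be sketched as follows. This is an illustrative sketch only, not the in-house software: the function names and intensity counts are hypothetical, the standard white/dark normalization is assumed, and the logarithm in log(1/R) is assumed to be base 10, which the text does not specify.

```python
import math


def relative_reflectance(raw, dark, white):
    """Per-band relative reflectance R = (raw - dark) / (white - dark) for one pixel."""
    if not (len(raw) == len(dark) == len(white)):
        raise ValueError("spectra must have the same number of bands")
    return [(r - d) / (w - d) for r, d, w in zip(raw, dark, white)]


def mean_spectrum(pixel_spectra):
    """Band-wise mean over all pixel spectra selected for one liver sample."""
    n = len(pixel_spectra)
    return [sum(band) / n for band in zip(*pixel_spectra)]


def absorbance(reflectance):
    """Apparent absorbance log10(1/R) per band (base 10 is an assumption)."""
    return [math.log10(1.0 / r) for r in reflectance]


# Hypothetical two-band intensities (camera counts) for two pixels of one sample.
dark = [100.0, 100.0]
white = [1100.0, 1100.0]
pixels = [[600.0, 350.0], [700.0, 450.0]]

reflectances = [relative_reflectance(p, dark, white) for p in pixels]
sample_absorbance = absorbance(mean_spectrum(reflectances))
```

Following the order given in the text, the pixel reflectance spectra are averaged first and the mean spectrum is then converted to absorbance.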

Statistical modeling analysis
For quantitative regression modeling, PLSR and SVR models were developed using the kernel-based machine learning package Kernlab (version 2.5) with slight modification [45]. PLSR is an effective multiparametric linear statistical analysis, as demonstrated in various biochemical modeling studies with HSI [43,46], and is suitable for handling a reflectance spectrum with several collinear variables [47]. This model is generally the first choice for building a predictive model; however, it may generate complicated results when the data are nonlinear. On the other hand, SVR is a nonparametric supervised machine learning algorithm extended from the support vector machine (SVM) [48]. The SVR model can transform a nonlinear regression problem into a linear regression problem through the implementation of a kernel function [49]. This function implicitly projects the original feature space into a higher-dimensional space, in which a hyperplane is fitted such that as many data points as possible lie within a margin around it. Among the various kernel functions, such as the linear function, polynomial function, radial basis function (RBF), and sigmoid function, the RBF was used as the kernel because of its good performance and small number of parameters (C, ε, and γ). γ, the width parameter of the RBF kernel, was first determined so as to minimize the variance of the Gram matrix of the training dataset. The other two parameters, ε and C, representing the width of the error-insensitive zone and the cost of errors outside it, respectively, were determined through two-fold cross-validation. NIR reflectance spectra are affected by scattering on the sample surface and by the dark-current noise of the camera. Several combinations of preprocessing methods for the spectra, such as the standard normal variate (SNV) [50] and the first-order derivative, were therefore evaluated. SNV was performed on the individual spectrum of each liver sample using the mean and standard deviation of that spectrum.
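To make the roles of the three SVR parameters concrete, the following sketch shows the RBF kernel and the ε-insensitive loss in their standard textbook form. It is not the Kernlab implementation used here; the function names and the toy spectra are hypothetical, and the parametrization exp(-γ‖x − z‖²) is an assumption about how γ enters the kernel.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma sets the kernel width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def gram_matrix(spectra, gamma):
    """Matrix of pairwise kernel values over the training spectra."""
    return [[rbf_kernel(a, b, gamma) for b in spectra] for a in spectra]


def epsilon_insensitive_loss(measured, predicted, epsilon):
    """Zero inside the +/- epsilon tube, linear outside it.

    In the SVR objective, the sum of these losses is weighted by the cost C.
    """
    return max(0.0, abs(measured - predicted) - epsilon)


# Hypothetical three-band training spectra.
spectra = [[0.1, 0.4, 0.3], [0.2, 0.5, 0.3], [0.6, 0.9, 0.8]]
K = gram_matrix(spectra, gamma=1.0)
```

A larger γ makes the kernel more local, so the off-diagonal Gram entries shrink toward zero; a larger ε tolerates bigger residuals before any penalty applies.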
The modified spectrum is as follows:
$$X_{\mathrm{SNV}} = \frac{X - \bar{X}}{\sqrt{\sum_{j=1}^{m} \left(X_j - \bar{X}\right)^2 / (m - 1)}}$$
where X and X_SNV are the original and SNV-modified spectra, respectively, and X̄ is the mean of the m spectral values X_j over the entire wavelength range of X. Internal validation using the leave-one-out cross-validation method was performed to generate the PLSR and SVR calibration models. Training data were generated by matching the NIR spectral data with the lipid quantity determined through the Folch extraction method. The individual models were evaluated using the R² and RMSE of the cross-validation, defined as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$
where y_i and ŷ_i are the measured and predicted values for the i-th liver sample, respectively; n is the number of samples; and ȳ is the average of the measured values of all samples.
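The SNV transformation and the two evaluation metrics can be sketched in a few lines. The function names and example values are hypothetical; the sample standard deviation with an m − 1 denominator is assumed for SNV, following its usual definition.

```python
import math


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale by its own standard deviation."""
    m = len(spectrum)
    mean = sum(spectrum) / m
    # Sample standard deviation (m - 1 denominator) is an assumption.
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (m - 1))
    return [(x - mean) / sd for x in spectrum]


def rmse(measured, predicted):
    """Root mean square error between measured and predicted lipid contents."""
    n = len(measured)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination relative to the mean of the measured values."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. cross-validated lipid contents (mg/g).
measured = [50.0, 150.0, 250.0]
predicted = [60.0, 140.0, 250.0]
print(snv([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
print(rmse(measured, predicted), r_squared(measured, predicted))
```

Because SNV uses only the statistics of the spectrum being transformed, it can be applied identically to a sample mean spectrum or to a single pixel spectrum.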

Analysis of the lipid content and NIR absorption spectra
The LCs in mouse liver samples (n = 75), defined as the ratio of the extracted lipid weight to the liver weight, were determined through the Folch extraction method. The LCs in the liver samples ranged from 32.6 to 296.8 mg/g with a mean value of 166.5 ± 15.5 mg/g (mean ± standard deviation). The 75 liver samples were categorized into five groups depending on their lipid content: group A, 32.6-67.6 mg/g; group B, 71.3-115.0 mg/g; group C, 139.6-207.4 mg/g; group D, 207.9-250.9 mg/g; and group E, 251.0-296.8 mg/g. Table 1 lists the lipid content and weight of the liver samples for each group. Detailed information on all the liver samples is summarized in Table S1. The fatty acid profiles of HFD-fed mouse livers have been reported in previous studies: oleic acid (C18:1, 31%), palmitic acid (C16:0, 28%), linoleic acid (C18:2, 14%), and docosahexaenoic acid (C22:6, 8%) [51]. Assuming that the liver samples used here follow a similar trend, the differences in total LC among groups A-E can be related to the differences in C18:1 and C16:0. Figure 2(a) displays a representative hyperspectral montage of a mouse liver, composed of absorption images obtained from 1000 to 1600 nm in 50-nm steps. The visible and pseudocolored NIR images of the same liver sample are also shown in Fig. 2(b) for comparison. The mean NIR absorbance spectra of the liver samples, depicted in Fig. 2(c), exhibit remarkable changes in the NIR absorption peak of the water band at approximately 1450 nm, attributed to the first overtone of the O-H bond stretching [17]. Another distinctive increase in the peaks at 1160 and 1210 nm corresponds to the second overtone bands of the -CH and -CH2 groups, respectively [52,53]. The lipid absorption peak at 1210 nm is in good agreement with the measurements performed by Nachabé [18]. This absorption band is particularly important for discriminating the degree of saturation of the fatty acids.
Furthermore, slight changes at 1000-1050 nm and 1350-1390 nm were observed, attributed to the weak combination bands of the -CH2 group. The SNV-processed absorbance was significantly higher in group E than in group A at 1180-1220 nm and 1360-1420 nm (Fig. 2(d)). The downward shift in the absorption spectrum baseline in the 1300-1600 nm range indicates that the relative quantity of water at the sample surface decreases with the increase in lipid content, and that SNV preprocessing successfully removes such baseline drift due to the difference in light scattering at the sample surface.

PLSR analysis
After extraction of the mean NIR spectra and the corresponding lipid content measured as reference data, PLSR models were established to study the relationship between the spectra and lipid content and to select appropriate spectral preprocessing methods. Leave-one-out cross-validation was used to evaluate the PLSR models. The performance of the PLSR models for the prediction of the lipid content of individual liver samples was determined using various spectral preprocessing methods, which affected the resultant estimation errors (Table 2). SNV preprocessing provided the best cross-validation performance with R² = 0.953 and RMSE = 13.34 mg/g. Figure 3(a) shows the correlation between the lipid contents obtained using the PLSR model and the Folch extraction method. The model was analyzed using the regression coefficient (RC) curve to select the optimum wavelengths, namely those at peaks and troughs with large absolute values (Fig. 3(b)). A total of 27 optimal wavelengths were selected based on the optimal latent variables (LVs) of the model. Wavelengths with large absolute RC values included 1083, 1089, 1216, 1392, 1430, 1432, 1416, 1434, and 1533 nm, some of which corresponded to the overtone or combination vibrations mentioned above. For example, 1216, 1392, and 1416 nm are characteristic wavelengths that appear in the absorption spectrum of a lipid [18]. The prediction errors need to be reduced further for quantification purposes; the relative standard errors of all the PLSR models were greater than 11%, and the prediction errors in the 150-250 mg/g range were nonnegligible. The number of LVs for the PLSR model with SNV was remarkably high, leading to overfitting of the model. The PLSR model optimized with the 10 LVs nominated by the RC curve underperformed compared to the PLSR model with full wavelengths (R² = 0.832).
Therefore, alternative models are needed to improve the prediction accuracy; nonlinear models would be suitable because certain absorptions in the NIR range originate from combinational vibration, which has a nonlinear dependence on the wavelength.

SVR-based predictive models
Similar to the PLSR model development, four preprocessing methods were used to build the SVR models. As shown in Table 2, three SVR models exhibited better performances than those of the PLSR. The SVR model with SNV exhibited the best performance with R² = 0.968 and RMSE = 15.28 mg/g, with a relative standard error as low as 8.5% (Fig. 4(a)). To investigate the effect of the calibration sample size, the absolute prediction errors between the measured and predicted values were evaluated through leave-out-k cross-validation. The n images (n = 75 in this case) were divided into a subset of k images used for training, whereas the other n − k images were used as the validation set (k = 5, 10, 20, 30, 45, 60, and 74). The case of k = 74 corresponds to leave-one-out cross-validation. All the training sample combinations were validated. The absolute prediction error was calculated for each iteration, and the mean with the standard deviation was used as the result. Figure 4(b) displays the convergence of the absolute prediction error as a function of the calibration sample size (k). The dashed curve, fitted using the least-squares method, is proportional to k^(-1/2), where the relative prediction errors are assumed to obey a normal distribution, as displayed in Fig. S2. The absolute prediction error for the SVR decreased with the increase in the calibration sample size; however, the error obtained for k ≥ 30 was nearly constant. This is attributed to the fact that each hyperspectral image of the liver includes a large number of spectra (>1000) with a wide variety of corresponding lipid content, and therefore, the number of liver samples used, n = 75, is sufficient to establish a reliable SVR model.
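The inverse-square-root fit of the error against the calibration sample size can be sketched as a one-parameter least-squares problem. This is a sketch under assumptions: the model error ≈ a / sqrt(size) with no offset term is assumed, the function name is hypothetical, and the error values are synthetic, not the values plotted in Fig. 4(b).

```python
import math


def fit_inverse_sqrt(sizes, errors):
    """Least-squares amplitude a for the model error = a / sqrt(size).

    Minimizing sum_i (e_i - a * s_i**-0.5)**2 over a gives
    a = sum_i(e_i / sqrt(s_i)) / sum_i(1 / s_i).
    """
    numerator = sum(e / math.sqrt(s) for s, e in zip(sizes, errors))
    denominator = sum(1.0 / s for s in sizes)
    return numerator / denominator


# Calibration sizes from the text; errors are synthetic, generated with a = 50.
sizes = [5, 10, 20, 30, 45, 60, 74]
errors = [50.0 / math.sqrt(s) for s in sizes]

a = fit_inverse_sqrt(sizes, errors)
fitted_curve = [a / math.sqrt(s) for s in sizes]
```

With real measurements, comparing `fitted_curve` with the observed mean errors shows at which sample size the improvement levels off.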

Lipid visualization in mouse liver
We demonstrate the application of the SVR model for visualizing the distribution of the lipid content in mouse liver samples. Pixelwise estimation of the lipid content was performed using the SVR model with SNV pretreatment. Figure 5(a) shows the distribution map of the lipid content in 15 liver samples. Three characteristic samples were selected from each group listed in Table 1. As indicated by the scale bar in Fig. 5(a), the lipid content increases from blue to red; the numbers shown below each color-coded image correspond to the measured lipid content. Note that the white regions were excluded from the region of interest during SVR processing. For comparison, the conventional RGB images of the same samples acquired using a digital camera are shown in Fig. 5(b). In the RGB images, the color and shape of the samples can be easily discerned, but they offer no clues regarding the lipid content, and the visible color does not directly relate to the presence or quantity of lipids. In contrast, the quantity and distribution of lipids can be clearly recognized in the images obtained through SVR analysis, which show an uneven distribution of the lipid content in the samples. Such visualization was realized using the unique ability of HSI to capture both the spectral and spatial information of the samples [36,43,54]. Interestingly, the edges of certain liver samples showed higher predicted lipid content than the measured values, which may be due to the irregular shape of the surface. Visualization of the lipid content is the most crucial step in demonstrating the NIR-HSI results for evaluating the lipid content in a pixelwise manner. For lipid visualization, NIR-HSI is advantageous with respect to the spatial resolution, compared to existing diagnostic modalities such as MRI, computed tomography, and ultrasonography.
The prediction images generated by the SVR model have a spatial resolution of 0.25 mm, which is approximately half that of ultrasonography. As a label-free imaging method for quantitatively observing lipids in livers, this method provides one of the finest spatial resolutions. In our setup, the spatial resolution can be further improved by, for example, replacing the SWIR lens with a lens having a shorter working distance.
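The pixelwise mapping described above can be sketched as a loop that applies SNV and a trained regression model to every pixel spectrum inside the region of interest. Everything named here is hypothetical: `toy_predict` is a placeholder standing in for the trained SVR model, and the tiny image and mask are invented for illustration.

```python
import math


def snv(spectrum):
    """Standard normal variate of one spectrum (sample standard deviation assumed)."""
    m = len(spectrum)
    mean = sum(spectrum) / m
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (m - 1))
    return [(x - mean) / sd for x in spectrum]


def lipid_map(cube, roi_mask, predict):
    """Predict lipid content per pixel; pixels outside the region of interest become None.

    cube[row][col] is the absorbance spectrum of one pixel, and roi_mask[row][col]
    is True for pixels belonging to the liver sample.
    """
    return [
        [predict(snv(spectrum)) if inside else None
         for spectrum, inside in zip(spectra_row, mask_row)]
        for spectra_row, mask_row in zip(cube, roi_mask)
    ]


def toy_predict(snv_spectrum):
    """Placeholder for a trained model: any callable mapping a spectrum to mg/g."""
    return 100.0 + 10.0 * snv_spectrum[-1]


# Hypothetical 1 x 2 pixel image with three bands; the second pixel is background.
cube = [[[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]]
roi_mask = [[True, False]]
print(lipid_map(cube, roi_mask, toy_predict))  # [[110.0, None]]
```

The excluded pixels correspond to the white regions of the maps, and the remaining values can be passed to any color scale for display.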
In the future, studies should be performed to demonstrate the feasibility of the discriminative imaging of various fatty acids and of microscopic analysis through reflectance NIR-HSI. The degree of lipid saturation is identifiable in the SWIR absorption spectrum by, for example, the change at 1140 nm due to the overtone vibration of carbon-carbon double bonds. A quantitative classification model can be established by preparing an appropriate set of absorption spectra for each fatty acid as the training dataset. To realize this, more gas chromatography (GC) measurements will be required to identify and quantify the fatty acids contained in liver samples. Toward histopathological analysis, the approach described in this study can be applied to the label-free, microscopic histopathological investigation of lipids. Staining the lipids in tissue using lipophilic dyes such as Nile Red and Oil Red O is a standard protocol to provide contrast in histopathological slides, and the visible color is crucial. Animal models of liver diseases such as NASH and HCC are routinely evaluated by visualizing the tissue morphology with hematoxylin and eosin (H&E) or Masson's trichrome stains. Lipid infiltration and lipid droplets are consequently delineated as white voids, whereas H&E stains the nuclei blue and the cytoplasm and extracellular matrix in different shades of pink, and Masson's trichrome stains the nuclei black, collagen blue, and the muscles, erythrocytes, and cytoplasm red. Some studies have proposed digital staining of the histopathological images of the kidney [55] and lung [56], which is a machine vision approach to overlay artificial color contrast on unstained images; however, quantitative, label-free lipid visualization is yet to be realized. In principle, our NIR-HSI system can visualize the lipid distribution in histopathological slides without requiring physical staining.
Acquiring the very weak NIR absorption derived from the lipids in slides will be the key to developing a better regression model.

Conclusions
A near-infrared hyperspectral imaging (NIR-HSI) system was demonstrated for the ex vivo quantitative visualization of the lipid concentration and distribution in mouse liver. Hyperspectral images in the spectral range of 1000-1600 nm were analyzed using both linear partial least squares regression (PLSR) and nonlinear support vector regression (SVR) algorithms to predict the lipid content, whereas the actual lipid contents were measured through the Folch extraction method. Mice were fed normal and high-fat diets in order to create variation in the lipid content. The changes in the lipid content caused distinctive changes in the NIR absorption spectra. SVR models with standard normal variate preprocessing outperformed the PLSR models, with R² = 0.968 and a mean absolute error of 10.9 mg/g (8.5%). In addition, color mapping of the lipid content was realized using SVR analysis, which suggested uneven lipid distribution in each liver sample. Thus, NIR-HSI was shown to be a promising approach for the ex vivo detection of the lipid content in liver. This platform is expected to offer a new strategy for the noninvasive analysis of lipid localization in the liver, which can improve the diagnosis of nonalcoholic fatty liver diseases and contribute to a better understanding of the pathogenesis of steatosis.