Optica Publishing Group

Analyzing the serum of hemodialysis patients with end-stage chronic kidney disease by means of the combination of SERS and machine learning

Open Access

Abstract

The aim of this paper is a multivariate analysis of SERS characteristics of serum in hemodialysis patients, which includes constructing classification models (PLS-DA, CNN) by the presence/absence of end-stage chronic kidney disease (CKD) with dialysis and determining the most informative spectral bands for identifying dialysis patients by variable importance distribution. We found the spectral bands that are informative for detecting the hemodialysis patients: the 641 cm-1, 724 cm-1, 1094 cm-1 and 1393 cm-1 bands are associated with the degree of kidney function inhibition; and the 1001 cm-1 band is able to demonstrate the distinctive features of hemodialysis patients with end-stage CKD.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Today, one of the most accessible methods for primary assessment of the state of the human body is a blood test [1–4]. Fast and accurate analysis of a blood sample is the basis of clinical diagnosis. Human blood is a liquid connective tissue of the body's internal environment and consists of plasma and the formed elements suspended in it. Fibrinogen-free blood plasma is serum. Serum is more stable than plasma and is therefore of particular interest in laboratory diagnostics. The biochemical characteristics of blood and its components depend on the general condition of the body and on possible pathological processes [5]. Biochemical blood analysis is used to quantify the concentration of certain organic and mineral components, enzymes, and hormones, and to identify their deviations from the norm [6]. However, when identifying a specific pathological process in the body, the prognostic significance of a particular biochemical blood index or set of indices may be insufficient. The prognostic significance of a blood test can be improved by examining a complex of changes in the component composition of blood. Raman spectroscopy is a promising method for reaching this goal [7–9]. It is based on the inelastic scattering of incident electromagnetic radiation by the molecules of the investigated substance [10] and allows the features of the tested object to be analyzed at the molecular level. The main advantages of Raman spectroscopy in biological tissue analysis are high specificity, simple sample preparation, non-destructiveness, and the ability to analyze a small sample (about 1 µm) [11,12].

However, the possibilities of Raman spectroscopy in the study of multicomponent biological objects are limited by the low level of useful signal. Improving the quality of the Raman signal and increasing the information content of the analysis is possible using the surface-enhanced Raman spectroscopy (SERS) [13]. SERS is applied for practical developments in the field of biochemistry [14], forensic science [15], and laboratory medical diagnostics [16]. Recently, a number of research teams have published the results of their studies into the spectral characteristics of plasma, serum, and whole blood based on the SERS in the presence of: cervical cancer (Shangyuan Feng et al. [17]), hepatitis B (Lu Yudong et al. [18]), hepatitis C (Muhammad Kashif et al. [19]), cervical cancer and breast cancer (Ningning Gao et al. [20]), oral squamous cell carcinoma (Lili Xue et al. [21]), etc.

It should be noted that, in the analysis of multicomponent biological tissues by Raman spectroscopy and SERS, the obtained spectral characteristics are a superposition of the spectral contributions of all components in the test sample [22]. Consequently, the analysis and interpretation of the spectral characteristics of biological tissues are complicated by multicollinearity and multiple spectral overlaps. Multivariate analysis allows for identifying specific spectral features of human blood associated with a specific influencing factor or pathology [23,24]. Choosing a multivariate analysis method that matches the goal is important for obtaining a statistically reliable result. The rapid development of information technology and machine learning has led to attempts at applying artificial neural networks to the analysis of spectral data [25–27]. The growing popularity of approaches based on artificial neural networks stems from the fact that they are universal and represent a compromise between parametric and metric methods of analysis. However, to assess the capabilities of approaches that combine machine learning and Raman techniques in clinical tasks, one should carefully consider the problem of overfitting and analyze the resulting multivariate models [28].

The aim of this work is the in vitro analysis of the spectral features of the serum of hemodialysis patients with end-stage chronic kidney disease (CKD) using a combination of surface-enhanced Raman spectroscopy (SERS) in the near infrared range and machine learning methods. The combination of the SERS technique and machine learning for the analysis of biofluids in subjects with different stages of CKD is currently being widely investigated and described, providing a basis for screening methods and early detection of CKD [29–31]. However, dialysis patients are not included in such studies, and the SERS features of the blood serum of dialysis patients with end-stage CKD are poorly described. In our opinion, this kind of investigation could be useful as a basis for monitoring the health status of such patients. The current work is the first stage of an upcoming study of the SERS features of the serum of dialysis patients associated with pathological changes and markers of various complications. The approach is based on constructing a classification model for identifying dialysis patients and on determining the most informative SERS bands associated with differences between the group of dialysis patients and the group of patients with stages 1-3a of CKD, followed by comparison of the identified bands with the SERS bands correlated with the main markers of decreased kidney clearance (creatinine and urea).

2. Materials and methods

2.1 Colloidal silver nanoparticles solution

Our previous work [32] presented and explored an approach to preparing a silver SERS substrate for simple analysis of human serum. The prepared SERS substrates are characterized by stability and an enhancement factor of about 4 × 10⁵. Therefore, in this work, we use the previously proposed approach to preparing substrates based on silver structures. Silver structures based on dried silver colloid are used to achieve surface enhancement of Raman scattering in the near infrared range. The silver colloid was obtained by reducing silver nitrate in aqueous solution with sodium citrate at 95 °C for 20 minutes. First, 20 mL of distilled water was heated to boiling. Once boiling started, 3 mL of 1.8% AgNO3 and 6 mL of 1% trisodium citrate (Na3C6H5O7) were added to the boiling distilled water. The resulting solution was heated at 95 °C for 20 min until a yellow-green solution formed. The resulting colloidal solution was poured onto aluminum foil and left at room temperature until completely dry.

2.2 Serum sample preparation

In this study, the in vitro analysis of human serum was performed for 136 subjects: 78 subjects with stages 1-3a of CKD (staged by glomerular filtration rate) and 58 hemodialysis patients with end-stage CKD. The analyzed subjects were selected by stratified random sampling. A standardized sampling was carried out from patients of form. The serum samples were collected from the patients in the morning, in the fasting state, placed in sterile test tubes, and frozen at -16 °C. Blood sampling from dialysis patients with end-stage CKD was performed prior to dialysis (dialysis frequency: 3 treatments per week). Immediately before the analysis, the samples were thawed at room temperature. The study protocols were approved by the ethical committee of Samara State Medical University. All subjects involved in the study gave their written informed consent at the beginning of the study. The characteristics of the sample of subjects are presented in Table 1.


Table 1. Summary of the subjects

2.3 SERS spectral acquisition

For SERS analysis, 1.5 µl of each serum sample was dropped onto aluminum foil with a layer of silver structures and dried for 30 minutes. The serum spectral characteristics were analyzed using an experimental setup consisting of a spectrometric system (EnSpectr R785, Spektr-M, Chernogolovka, Russia) and a microscope (ADF U300, ADF, China). The spectra were excited in the near infrared range using a laser module with a center wavelength of 785 nm. A 50x LMPlan objective was used to focus the radiation on the sample and collect the scattered radiation. The diameter of the laser spot at the focus was 5 µm. Human serum was analyzed at a laser power of 10 mW. Each spectrum was recorded with an exposure time of 4 s, repeated 4 times, using the EnSpectr software (Spektr-M, Chernogolovka, Russia). Immediately before recording the spectrum of a serum sample, the ambient background signal was recorded. The background component was then automatically subtracted from the subsequently recorded serum spectra using the algorithm built into the EnSpectr software. Each of the obtained SERS spectra was a discrete set of 1700 parameters (predictors). The set of raw SERS spectra of serum used for subsequent preprocessing and multivariate analysis is presented in Figure S1 of Supplement 1.

2.4 Spectra processing

2.4.1 Preprocessing

The preprocessing of raw SERS serum spectra consisted of several sequential stages: noise smoothing, removal of the autofluorescent background, and normalization. The raw spectra were smoothed by a Savitzky-Golay filter with a window width of 15, a first-order smoothing polynomial, and a zero-order derivative (no derivative). The smoothed spectra were then subjected to removal of the autofluorescent background by the polynomial method [33]. According to Zhao et al. [33], polynomial fitting requires the polynomial degree to be chosen in each specific case, taking into account the features of the sample spectral characteristics in the approximation area. Therefore, the smoothed serum spectra were approximated with polynomials of degrees from 4 to 20. Comparative analysis of these approximations demonstrated that a 15th-degree polynomial fits best for our experimental conditions and sample characteristics. Finally, the serum spectra were normalized by the standard normal variate (SNV) method.
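The three preprocessing stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the wavenumber axis and the synthetic spectrum are hypothetical, the edge handling of the smoother is an assumption, and the background step is a single-pass least-squares fit, which is simpler than the procedure of Ref. [33]. It relies on the fact that a Savitzky-Golay filter with a first-order polynomial and no derivative is equivalent to a centered moving average at interior points.

```python
import numpy as np
from numpy.polynomial import chebyshev

def smooth(y, window=15):
    # Savitzky-Golay with polynomial order 1 and no derivative reduces to a
    # centered moving average at interior points. Handling the edges by
    # reflecting the signal is an assumption of this sketch.
    half = window // 2
    padded = np.pad(y, half, mode="reflect")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def remove_baseline(x, y, degree=15):
    # Single-pass least-squares polynomial fit (a simplification of [33]).
    # The axis is rescaled to [-1, 1] and a Chebyshev basis is used so that
    # a 15th-degree fit stays numerically well conditioned.
    u = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    coeffs = chebyshev.chebfit(u, y, degree)
    return y - chebyshev.chebval(u, coeffs)

def snv(y):
    # Standard normal variate: zero mean and unit standard deviation
    # for each individual spectrum.
    return (y - y.mean()) / y.std(ddof=1)

# Hypothetical spectrum: 1700 points, one Gaussian band on a sloping background.
rng = np.random.default_rng(0)
x = np.linspace(400.0, 1800.0, 1700)   # assumed wavenumber axis, cm-1
raw = (np.exp(-0.5 * ((x - 1001.0) / 8.0) ** 2)
       + 1e-3 * x
       + 0.02 * rng.standard_normal(x.size))

processed = snv(remove_baseline(x, smooth(raw)))
```

In practice the degree would be chosen by comparing fits over the range 4 to 20, as described above; the value 15 is simply the one reported for this dataset.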

2.4.2 Multivariate models and validation

Multivariate analysis of the processed spectral characteristics of serum was performed to accomplish the following tasks:

  • “hemodialysis end-stage CKD” group vs “1-3a stages of CKD” group: discriminating the hemodialysis patients with end-stage CKD from the patients with stages 1-3a of CKD;
  • regression between the serum spectral characteristics and the levels of creatinine and urea for the whole dataset.

Each test serum sample corresponds to the a priori information on belonging to a particular group and possessing particular biochemical characteristics. Thus, the data were analyzed through supervised learning. In this study, a basic solution without involving deep learning and a solution using deep learning are considered within the assigned classification tasks. To avoid overfitting, the analysis of the stability of the constructed models and the choice of the optimal parameters were implemented using the k-fold cross-validation (k = 10) [34]. After cross-validation and determining the optimal parameters of the model, the full data set was divided randomly: 80% of subjects for training the model (training set) and 20% of subjects for testing the model (verification set). It should be noted that when splitting the initial dataset into a training set and a verification set (both in cross-validation and in the final model construction), the split was performed by subjects in order to avoid allocating the spectral characteristics corresponding to one subject in different sets, which could lead to incorrect overestimation of the model characteristics. When constructing the models, the importance of predictors in accomplishing the classification task was assessed by means of the distribution of variable importance (VIP) in the constructed model. The VIP distribution analysis makes it possible to define which spectral bands and associated serum components are characterized by the differences in the accomplished classification task. Moreover, the estimation of the VIP distribution can circumvent the case when the model is a “black box” and is erroneously associated with noise.
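The subject-wise splitting described above can be illustrated with a short sketch. It assumes that each subject contributes several spectra; the figure of three spectra per subject and the function name `subject_kfold` are hypothetical and serve only the example.

```python
import random

def subject_kfold(subject_ids, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs in which all spectra of a given
    subject fall into the same fold."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    # Deal the shuffled subjects into k folds of nearly equal size.
    folds = [subjects[i::k] for i in range(k)]
    for fold in folds:
        held_out = set(fold)
        test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        yield train_idx, test_idx

# Hypothetical layout: 136 subjects, 3 spectra each (the count is an assumption).
subject_ids = [s for s in range(136) for _ in range(3)]
splits = list(subject_kfold(subject_ids, k=10))
```

Splitting by spectrum index instead of by subject would let nearly identical spectra of one person appear on both sides of the split, which is the source of the overestimation mentioned above.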

2.4.3 Basic solution based on PLS-DA

The basic non-deep learning solution is implemented using discriminant analysis with projection on latent structures (PLS-DA) for the classification task and projection on latent structures (PLS) for the regression tasks. In our comparative study [35], the PLS-DA method demonstrated stability and a potential to analyze spectral characteristics of whole blood. The SIMPLS algorithm [36] was used for PLS and for the PLS portion of the PLS-DA method; the multivariate analysis was carried out using the MDAtools package in RStudio. The optimal number of loading vectors was chosen as the first local minimum in the plot of RMSE calculated from cross-validation predictions for different numbers of components. The VIP distributions for the PLS and PLS-DA models were calculated by a standard algorithm as a weighted sum of the squared correlations between the PLS-DA components and the variable [37,38].
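For illustration, one commonly used formulation of the VIP score for a single-response PLS model is sketched below. This is an assumption about the form of the calculation, not a reproduction of the package used in this work: the sketch uses the NIPALS algorithm for brevity instead of SIMPLS, and the function names and synthetic data are hypothetical. Each variable's squared normalized weight is weighted by the amount of response variance that the corresponding component explains.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Minimal PLS1 (NIPALS). Returns weights W (p x A), scores T (n x A)
    and y-loadings q (A,)."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    W, T, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w = w / np.linalg.norm(w)      # unit-length weight vector
        t = X @ w                      # scores of this component
        tt = t @ t
        p = X.T @ t / tt               # X-loadings
        q_a = (y @ t) / tt             # y-loading
        X = X - np.outer(t, p)         # deflate X
        y = y - q_a * t                # deflate y
        W.append(w)
        T.append(t)
        q.append(q_a)
    return np.column_stack(W), np.column_stack(T), np.array(q)

def vip_scores(W, T, q):
    """VIP_j = sqrt(p * sum_a SS_a * w_ja^2 / sum_a SS_a), where SS_a is the
    response sum of squares explained by component a."""
    ss = q ** 2 * np.sum(T ** 2, axis=0)
    w_norm = W / np.linalg.norm(W, axis=0)
    return np.sqrt(W.shape[0] * (w_norm ** 2 @ ss) / ss.sum())

# Synthetic check: only variable 3 carries information about the response.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 3.0 * X[:, 3] + 0.1 * rng.standard_normal(200)

W, T, q = pls1_nipals(X, y, n_components=2)
vip = vip_scores(W, T, q)
```

With this normalization the squared VIP scores sum to the number of variables, so values above 1 mark variables that contribute more than average.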

2.4.4 CNN-based solution

The solution based on deep learning was implemented using a separate one-dimensional CNN for each task. The choice of the CNN architecture for recognition of the current SERS dataset consisted of several consecutive stages. At the first stage, the verified CNN configurations and advanced deep learning practices based on CNN were examined. Analysis of the work by other research teams has shown that the following CNN configurations are characterized by their possible abilities to recognize Raman spectra: sequential CNNs [39], CNNs containing the Inception module [40], CNNs with residual connections [27], ensemble CNNs [41], CNNs based on a combination of convolutional layers with recurrent layers [25].

One of the features of utilizing artificial neural networks is the need for an empirical choice of the network topology and optimization of the task hyperparameters. Therefore, the topology and hyperparameters were optimized empirically. The following criteria were chosen for evaluating the tested CNN configurations: the stability of the model, the magnitude of the error, the accuracy for the classification task, and the visualization of the variable importance distribution. 10-fold cross-validation was used as the evaluation protocol. Figure 1 demonstrates a schematic diagram of the CNN used for recognizing the serum SERS characteristics. The CNN is organized as a combination of two consecutive residual one-dimensional convolutional bases [42] and a fully connected classification level. The following loss functions were used as a feedback signal for training the weight tensors: 1) the categorical_crossentropy loss function for the task of discriminating the hemodialysis patients with end-stage CKD from the patients with stages 1-3a of CKD; 2) the mse loss function for the task of creatinine regression; and 3) the mse loss function for the task of urea regression. The CNN was trained using the adamax algorithm. The number of epochs was determined by the local minimum of the loss function during cross-validation. The CNN was analyzed using the KERAS package in RStudio [43]. The informativeness of individual predictors was visualized by means of the VIP package in RStudio [44]. For the CNN models, an algorithm for calculating permutation-based variable importance was used. The permutation-based method for assessing variable importance is used in traditional machine learning; the prospects of its application and its adaptation for analyzing deep learning models are described in Refs. [45–47].
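The idea behind permutation-based variable importance can be sketched independently of any particular package. The example below is a generic illustration, not the routine used in this work: the function name, the toy classifier, and the data are hypothetical. The importance of a predictor is taken as the average drop in a performance metric after the values of that predictor are shuffled across samples.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Mean drop in `metric` when one column of X is shuffled at a time."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between predictor j and the target
            # while keeping the distribution of the predictor unchanged.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - metric(y, predict(X_perm)))
        importance[j] = np.mean(drops)
    return importance

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

# Toy stand-in for a trained model: it looks only at predictor 2.
def toy_predict(X):
    return (X[:, 2] > 0).astype(int)

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 6))
y = toy_predict(X)

importance = permutation_importance(toy_predict, X, y, accuracy)
```

For spectra, neighboring predictors are strongly correlated, so shuffling a single point may understate the importance of a band; shuffling groups of adjacent points is one possible adaptation.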


Fig. 1. Schematic diagram of the architecture of one-dimensional CNN for the recognition of Raman spectra and serum classification, where: input is the input layer with the output shape (1700, 1); conv1 is the one-dimensional convolutional layer with padding = “same” and the output shape (1700, 8); pooling1 is the max pooling layer with pool_size = 5, strides = 5 and the output shape (340, 8); residual conv1 is the one-dimensional convolutional layer with strides = 5 and the output shape (340, 8); add 1 is the layer that adds residual tensor and convolutional base, the output shape (340,8); conv2 is the one-dimensional convolutional layer with padding = “same” and the output shape (340, 16); pooling2 is the max pooling layer with pool_size = 4, strides = 4 and the output shape (85, 16); residual conv2 is a one-dimensional convolutional layer with strides = 20 and the output shape (85, 16); add 2 is the layer that adds residual tensor and convolutional base, the output shape (85,16); flatten is a flatten layer; FC1 is a fully connected layer1; FC2 is a fully connected layer with parameters depending on the accomplished task (units = 1 and softmax activation function for discriminating the hemodialysis patients with end-stage of CKD and the patients with stages 1-3a of CKD; units = 1 and without activation function for regression between the serum spectral characteristics and the creatinine or the urea level).
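The output lengths quoted in the caption can be checked with simple arithmetic. The sketch below assumes Keras-style length rules (padding = “same” for the convolutions, no padding for the pooling layers) and, as one possible reading of the caption, that the second residual convolution with strides = 20 is applied to the 1700-point input; both points are assumptions, not statements from the caption.

```python
import math

def same_conv_length(n, stride=1):
    """Output length of a 1-D convolution with padding='same'."""
    return math.ceil(n / stride)

def pool_length(n, pool_size, stride):
    """Output length of 1-D max pooling without padding."""
    return (n - pool_size) // stride + 1

n_input = 1700
conv1 = same_conv_length(n_input)                    # 1700
pooling1 = pool_length(conv1, pool_size=5, stride=5) # 340
residual1 = same_conv_length(n_input, stride=5)      # 340, matches pooling1
conv2 = same_conv_length(pooling1)                   # 340
pooling2 = pool_length(conv2, pool_size=4, stride=4) # 85
# Assumption: the stride-20 shortcut starts from the 1700-point input.
residual2 = same_conv_length(n_input, stride=20)     # 85, matches pooling2
```

The two branches of each residual block must have equal lengths before they are added, which is why the shortcut strides mirror the pooling factors.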


3. Results and discussion

The main objective of our study is to identify the spectral features of the serum of hemodialysis patients with end-stage CKD. Therefore, the first stage of our study was to discriminate between patients with the end-stage CKD and patients with stages 1-3a of CKD using the serum SERS characteristics. Figure 2 presents the mean serum SERS spectra with standard deviation (SD) for the group of patients with stages 1-3a CKD and for the group of hemodialysis patients.


Fig. 2. Mean SERS spectra of serum for the group of patients with 1-3a stages of CKD and the group of hemodialysis patients.


Figure 2 demonstrates that the differences between the mean serum spectra of the discriminated groups are visible in the intensity of individual spectral bands. Without multivariate analysis, an increase in the intensity of the peaks at 1004 cm-1, 1240 cm-1, and 1395 cm-1 can be identified as a distinctive feature of the serum spectra of the hemodialysis patients compared with the 1-3a CKD patients. The peak in the 710-750 cm-1 band is characterized by increased intensity at 718 cm-1 and decreased intensity at 744 cm-1 for the hemodialysis patients compared with the 1-3a CKD patients. Note, however, the intragroup dispersion, the overlap of the spectral features of the studied groups, and the presence of multicollinearity. For this reason, the SERS database of serum spectra was subjected to multivariate analysis. Table 2 demonstrates the characteristics of the constructed classification models for the training dataset (referred to as “train”), for the verification dataset (referred to as “test”), and for the entire dataset (referred to as “overall”). Figure 3 presents the receiver operating characteristic (ROC) curves of the constructed models for detecting the hemodialysis patients with end-stage kidney failure among the patients with stages 1-3a of CKD on the verification dataset.


Fig. 3. ROC curves of the constructed models for the classification of “end-stage kidney failure” group vs “1-3a stages of CKD” group on the verification dataset.


Table 2. Specificity, Sensitivity, and Accuracy of Discriminating the End-Stage Kidney Failure Group and the 1-3a Stages of CKD Group

The results presented in Table 2 and Figure 3 demonstrate that, in classifying subjects by the presence/absence of end-stage kidney failure from the SERS characteristics of serum, the basic solution without deep learning is slightly inferior to the solution based on CNN. It should be noted that the basic solution model and the CNN model are both stable. For detecting the target subjects using the deep learning solution, the specificity of 0.95, the sensitivity of 0.92, and the accuracy of 0.94 are sufficient for clinical use.
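For reference, the three reported characteristics follow from the confusion matrix of a binary classifier. The sketch below uses a small set of made-up labels, not the study data, with the dialysis group taken as the positive class.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy of a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),   # share of positives detected
        "specificity": tn / (tn + fp),   # share of negatives rejected
        "accuracy": (tp + tn) / len(pairs),
    }

# Made-up labels: 1 = dialysis patient, 0 = patient with stages 1-3a of CKD.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
# sensitivity = 3/4, specificity = 5/6, accuracy = 8/10
```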

Figure 4 (a, b) demonstrates the VIP distributions of the SERS matrix of serum spectra when constructing discrimination models for the hemodialysis patients with the end-stage of kidney failure and the patients with the CKD stages 1-3a. For convenience, Figure 4 shows the mean serum spectra for the discriminated groups with the overlapping VIP distribution. The VIP distribution is represented by a gradient fill, where the purple color corresponds to the minimum information content, and the yellow – to the maximum information content (the scale is shown in the figure).


Fig. 4. VIP distribution of the serum SERS spectra matrix in constructing a model: a) based on PLS-DA; b) based on CNN.


Figure 4 demonstrates that the most informative spectral bands in the constructed models coincide with the visible differences in the mean spectra of the discriminated groups. The analysis of Pearson's pairwise correlation between the VIP distribution of the PLS-DA model and the VIP distribution of the CNN model demonstrates a correlation coefficient of 0.61 and no significant correlation. Peak assignments are presented as the bond with the type of vibration, with an indication of possible associated bioorganic substances, in accordance with Refs. [31,48–51]. It should be noted that assigning SERS bands in the serum spectrum to specific bioorganic substances is complicated by multicollinearity and by the possible spectral contribution of several substances to a particular band. Thus, the spectral contribution to a specific band can be due to the indicated substance, but is not limited to it. The model based on the basic solution and the model based on the CNN identified the following spectral bands as the most informative: the 720 − 750 cm-1 band with the peak maximum at 724 cm-1 (corresponds to purine ring breathing in hypoxanthine), the 990 − 1030 cm-1 band with the peak maximum at 1001 cm-1 (proteins, symmetric ring breathing of phenylalanine, ν(CO), ν(CC), δ(OCH)), the 1089 − 1110 cm-1 band with the peak maximum at 1095 cm-1 (C-N in carbohydrates), the 1220 − 1255 cm-1 band with the peak maximum at 1238 cm-1 (lipids, CN stretching), and the 1380 − 1415 cm-1 band with the peak maximum at 1393 cm-1 (δCH3 symmetric) [31,48–51]. The 630 − 650 cm-1 band with the peak maximum at 637 cm-1 (uric acid) carries significant information content in the PLS-DA model, while in the CNN model this band, though also informative, has less relative importance for classification.
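The comparison of two VIP distributions reduces to Pearson's correlation coefficient between two vectors of equal length, one value per spectral point. A minimal sketch is given below; the two short vectors are hypothetical stand-ins for the 1700-point VIP distributions.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical VIP values of two models at the same five spectral points.
vip_model_a = [0.2, 1.5, 0.4, 2.0, 0.3]
vip_model_b = [0.3, 1.2, 0.5, 1.8, 0.2]
r = pearson_r(vip_model_a, vip_model_b)
```

Because the coefficient is insensitive to scale and offset, the two VIP distributions do not need to be normalized to a common range before the comparison.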

The highlighted SERS bands, which are informative for determining the group of dialysis patients, partially coincide with SERS bands of serum presented in the work by Zong et al. [29] and are also informative in discriminating the group of CKD and the healthy volunteers. Thus, Zong et al. indicated 641 cm-1, 724 cm-1, 1094 cm-1, 1398 cm-1, among the informative spectral bands associated with the degree of kidney function inhibition in CKD, which coincides with the bands identified in this work as 641 cm-1, 724 cm-1, 1094 cm-1 and 1393 cm-1, respectively. It should be noted that in the work by Zong et al. [29] and in the work by Guo, J. et al. [31], the band at 1003 cm-1 is observed, but it does not demonstrate an informative difference between the CKD group and the control group, whereas in the current work, the 990 − 1030 cm-1 band with the peak maximum at 1001 cm-1 is informative for detecting the hemodialysis patients with the end-stage CKD. Thus, it may be concluded that in the current work, among the spectral bands identified as informative for the detection of hemodialysis patients the 641 cm-1, 724 cm-1, 1094 cm-1 and 1393 cm-1 bands are associated with the degree of kidney function inhibition in CKD; and the 1001 cm-1 band can demonstrate the distinctive features of hemodialysis patients with end-stage CKD.

In order to correlate the identified peak assignments with specific biochemical changes, it is possible to analyze the regression between the serum biochemical parameters and its spectral characteristics. In case of significant correlation between the SERS characteristics of serum and a specific biochemical indicator, the VIP distribution of the constructed regression model can be used to identify the bands reflecting the spectral contribution of that indicator. The main markers of decreased kidney clearance are elevated levels of creatinine and urea in the blood [52,53]. Therefore, the next stage of our study was to identify the SERS bands correlated with creatinine and urea in the serum spectra obtained under our experimental conditions. Regression models were constructed between the serum SERS characteristics and the levels of creatinine and urea determined by standard biochemical analysis. The coefficient of determination R2 of the creatinine level was 0.74 and 0.81 for the PLS and CNN models, respectively. The coefficient of determination of the urea level was 0.76 for the PLS model and 0.82 for the CNN model. In general, the obtained characteristics of the regression models of creatinine and urea are comparable with the results obtained by other authors. For instance, Zong et al. [29] investigated the SERS characteristics of serum and urine in patients with CKD and demonstrated that, in the PLS analysis of the SERS serum characteristics of 126 patients with CKD and 97 healthy people, the determination coefficient was 0.85 for both creatinine and urea. The combination of multivariate analysis and Raman-based techniques for the analysis of creatinine and urea levels in human urine has been demonstrated by Cassiano Jr Saatkamp et al. [54]. The authors presented a study of 54 volunteers without a history of kidney disease, with a correlation coefficient of 0.91 for creatinine and 0.90 for urea (which corresponds to determination coefficients of 0.83 and 0.81, respectively). Huang et al. demonstrated the correlation of creatinine and urea levels with the spectral characteristics of urine in 110 kidney transplant recipients with coefficients of determination of 0.81 and 0.78, respectively [55].
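The conversion between the correlation coefficients and the determination coefficients quoted above is a simple squaring, and the coefficient of determination itself can be computed from reference and predicted values. The sketch below shows both; the function name and the three-point test data are hypothetical, and the equality of r squared and R2 is assumed to hold in the usual sense of a least-squares linear relation between predicted and reference values.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Squaring the correlation coefficients reported for the urine study:
r2_creatinine = round(0.91 ** 2, 2)   # 0.8281 rounds to 0.83
r2_urea = round(0.90 ** 2, 2)         # 0.81

# Sanity checks on made-up values.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # exact predictions
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]) # predicting the mean
```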

The characteristics of the constructed models may not be as efficient for accurate estimation of creatinine and urea levels in blood as the standard biochemical method, nevertheless, our results demonstrate that these models are sensitive to the correlation between the serum SERS characteristics and the levels of creatinine and urea. Therefore, it is possible to consider the most informative spectral bands of the constructed PLS and CNN regression models as the spectral contribution correlated with creatinine and urea. In this work, the spectral bands that are specific for creatinine and urea were obtained through the results of a multivariate analysis of the base of the serum spectra that is characterized by overlapping spectral contributions of various components included in its composition. Thus, the highlighted bands represent only the spectral contribution “visible” to the model. This factor may complicate the comparison of the spectral bands associated with creatinine and urea obtained in this work with the results obtained by other authors. Figure 5 (a-d) presents the VIP distributions of the matrix of serum SERS spectra when constructing the PLS models and the CNN regression models of the creatinine and urea levels. For convenience, VIP distributions are superimposed on the mean SERS spectrum of serum.


Fig. 5. VIP-distribution of the matrix of serum SERS spectra when constructing: a) regression of the creatinine level by the PLS method (R2 = 0.74); b) regression of the urea level by the PLS method (R2 = 0.76); c) regression of the creatinine level by the CNN method (R2 = 0.81); and d) regression of the urea level by the CNN method (R2 = 0.82).


Let us consider the characteristics of the constructed PLS models. The results presented in Figure 5 demonstrate that when constructing the regression of the creatinine and urea levels based on the PLS model, the most informative bands are 720 − 750 cm-1, 990 − 1030 cm-1, 1220 − 1255 cm-1, 1380 − 1415 cm-1, which coincides with the most informative spectral bands of the PLS-DA model for detecting the dialysis patients with end-stage CKD. The Pearson pairwise correlation coefficient between the VIP distribution for the “dialysis end-stage kidney failure” group vs “1-3a stages of CKD” group PLS-DA model and the superposition of VIP distributions for regression of the urea and creatinine levels was 0.96, which indicates a strong correlation. Thus, when constructing PLS-based models, identification of the dialysis patients and the regression of creatinine and urea levels are implemented using the same set of bands, and the relative degree of information content of the identified bands for these models is the same.

At the next stage, the characteristics of the CNN models were considered. As in the task of classification by the presence/absence of end-stage CKD, the CNN model outperforms the PLS model in the regression of creatinine and urea levels. In the CNN-based regression, the most informative bands for determining the creatinine level are 630 − 650 cm-1 (which corresponds to skeletal ring deformation in creatinine [56]), 720 − 750 cm-1, and 1380 − 1415 cm-1 (which corresponds to creatinine [57]); for determining the urea level, they are 720 − 750 cm-1 and 990 − 1030 cm-1 (which corresponds to N-C-N stretching in urea [57]). The spectral band of 1380 − 1415 cm-1, which is the most informative in the CNN model for detecting the dialysis patients, coincides with one of the bands correlated with creatinine. Analysis of the correlation between the superposition of the VIP distributions of the regressions for the urea and creatinine levels and the VIP distribution of the dialysis patient identification model demonstrated no significant correlation (Pearson correlation coefficient of 0.43).

In general, the current work presents an approach based on the combination of the SERS technique in the near infrared range and multivariate analysis based on one-dimensional CNN and PLS for the study of pathology-associated changes in the serum of patients with end-stage CKD. In order to improve the interpretation of the SERS characteristics of serum and of pathology-associated spectral bands, the next stage of research will analyze the correlation between a number of biochemical parameters and spectral characteristics on an expanded sample. In the meantime, the presented study can become the basis for monitoring the health status of hemodialysis patients.

4. Conclusion

This article presents a study of the SERS characteristics of serum in dialysis patients with end-stage CKD. By means of multivariate analysis, the informative spectral bands associated with end-stage CKD during dialysis were identified. In addition, analysis of the correlation between the serum spectral characteristics and the main markers of decreased kidney clearance (urea, creatinine) made it possible to determine the spectral bands, within the complex spectral characteristics of serum, that correlate with the levels of creatinine and urea. At the next research stage, the described approach will be applied to an expanded sample to identify the spectral contribution of other biochemical components of serum (albumin, ferritin, cholesterol, glucose, etc.), which will improve the description of the pathology-associated spectral features of serum in dialysis patients with end-stage CKD. In general, the reported approach may form the basis for monitoring the health status of dialysis patients and may find application in studying other pathological conditions of the human body.

Funding

Russian Science Foundation (21-75-10097).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. J. Watson, I. de Salis, J. Banks, and C. Salisbury, “What do tests do for doctors? A qualitative study of blood testing in UK primary care,” Family Practice 34(6), 735–739 (2017). [CrossRef]  

2. T. Hasegawa, R. Yamaguchi, M. Kakuta, K. Sawada, K. Kawatani, K. Murashita, S. Nakaji, and S. Imoto, “Prediction of blood test values under different lifestyle scenarios using time-series electronic health record,” PLoS One 15(3), e0230172 (2020). [CrossRef]

3. D. Silva, C. G. G. Ponte, M. A. Hacker, and P. R. Z. Antas, “A whole blood assay as a simple, broad assessment of cytokines and chemokines to evaluate human immune responses to Mycobacterium tuberculosis antigens,” Acta Trop. 127(2), 75–81 (2013). [CrossRef]

4. M. Kalinich and D. A. Haber, “Cancer detection: Seeking signals in blood,” Science 359(6378), 866–867 (2018). [CrossRef]  

5. G. Basten, Blood Results in Clinical Practice, 2nd ed. (M&K Publishing, 2019).

6. D. Glick, Methods of Biochemical Analysis (John Wiley & Sons, 2009).

7. C. G. Atkins, K. Buckley, M. W. Blades, and R. F. B. Turner, “Raman spectroscopy of blood and blood components,” Appl. Spectrosc. 71(5), 767–793 (2017). [CrossRef]

8. M. Matthiae, X. Zhu, R. Marie, and A. Kristensen, “In-line whole blood fractionation for Raman analysis of blood plasma,” Analyst (Cambridge, U. K.) 144(2), 602–610 (2019). [CrossRef]  

9. G. McLaughlin, K. C. Doty, and I. K. Lednev, “Raman Spectroscopy of blood for species identification,” Anal. Chem. 86(23), 11628–11633 (2014). [CrossRef]  

10. R. R. Jones, D. C. Hooper, L. Zhang, D. Wolverson, and V. K. Valev, “Raman techniques: fundamentals and frontiers,” Nanoscale Res. Lett. 14(1), 231 (2019). [CrossRef]  

11. M. S. Bergholt, A. Serio, and M. B. Albro, “Raman spectroscopy: guiding light for the extracellular matrix,” Front. Bioeng. Biotechnol. 7, 303 (2019). [CrossRef]  

12. N. Kuhar, S. Sil, T. Verma, and S. Umapathy, “Challenges in application of Raman spectroscopy to biology and materials,” RSC Adv. 8, 25888–25908 (2018). [CrossRef]  

13. J. Langer, D. Jimenez de Aberasturi, J. Aizpurua, R. A. Alvarez-Puebla, B. Auguié, J. J. Baumberg, G. C. Bazan, S. E. J. Bell, A. Boisen, A. G. Brolo, J. Choo, D. Cialla-May, V. Deckert, L. Fabris, K. Faulds, F. Javier García de Abajo, R. Goodacre, D. Graham, A. J. Haes, C. L. Haynes, C. Huck, T. Itoh, M. Käll, J. Kneipp, N. A. Kotov, H. Kuang, E. C. Le Ru, H. Kwee Lee, J.-F. Li, X. Y. Ling, S. A. Maier, T. Mayerhöfer, M. Moskovits, K. Murakoshi, J.-M. Nam, S. Nie, Y. Ozaki, I. Pastoriza-Santos, J. Perez-Juste, J. Popp, A. Pucci, S. Reich, B. Ren, G. C. Schatz, T. Shegai, S. Schlücker, L.-L. Tay, K. George Thomas, Z.-Q. Tian, R. P. Van Duyne, T. Vo-Dinh, Y. Wang, K. A. Willets, C. Xu, H. Xu, Y. Xu, Y. S. Yamamoto, B. Zhao, and L. M. Liz-Marzán, “Present and future of surface-enhanced Raman scattering,” ACS Nano 14(1), 28–117 (2020). [CrossRef]  

14. A. Milewska, V. Zivanovic, V. Merk, U. B. Arnalds, Ó. E. Sigurjónsson, J. Kneipp, and K. Leosson, “Gold nanoisland substrates for SERS characterization of cultured cells,” Biomed. Opt. Express 10(12), 6172–6188 (2019). [CrossRef]  

15. C. Muehlethaler, M. Leona, and J. R. Lombardi, “Review of surface enhanced Raman scattering applications in forensic science,” Anal. Chem. 88(1), 152–169 (2016). [CrossRef]  

16. L. A. Lane, X. Qian, and S. Nie, “SERS nanoparticles in medicine: from label-free detection to spectroscopic tagging,” Chem. Rev. (Washington, DC, U. S.) 115(19), 10489–10529 (2015). [CrossRef]

17. S. Feng, D. Lin, J. Lin, B. Li, Z. Huang, G. Chen, W. Zhang, L. Wang, J. Pan, R. Chen, and H. Zeng, “Blood plasma surface-enhanced Raman spectroscopy for non-invasive optical detection of cervical cancer,” Analyst (Cambridge, U. K.) 138, 3967–3974 (2013). [CrossRef]

18. Y. Lu, Y. Lin, Z. Zheng, X. Tang, J. Lin, X. Liu, M. Liu, G. Chen, S. Qiu, T. Zhou, Y. Lin, and S. Feng, “Label free hepatitis B detection based on serum derivative surface enhanced Raman spectroscopy combined with multivariate analysis,” Biomed. Opt. Express 9(10), 4755–4766 (2018). [CrossRef]  

19. M. Kashif, M. Irfan Majeed, M. Asif Hanif, and A. ur Rehman, “Surface Enhanced Raman Spectroscopy of the serum samples for the diagnosis of Hepatitis C and prediction of the viral loads,” Spectrochim. Acta, Part A 242, 118729 (2020). [CrossRef]  

20. N. Gao, Q. Wang, J. Tang, S. Yao, H. Li, X. Yue, J. Fu, F. Zhong, T. Wang, and J. Wang, “Non-invasive SERS serum detection technology combined with multivariate statistical algorithm for simultaneous screening of cervical cancer and breast cancer,” Anal. Bioanal. Chem. 413(19), 4775–4784 (2021). [CrossRef]  

21. L. Xue, B. Yan, Y. Li, Y. Tan, X. Luo, and M. Wang, “Surface-enhanced Raman spectroscopy of blood serum based on gold nanoparticles for tumor stages detection and histologic grades classification of oral squamous cell carcinoma,” Int. J. Nanomed. 13, 4977–4986 (2018). [CrossRef]  

22. C. B. Saltonstall, T. E. Beechem, J. Amatya, J. Floro, P. M. Norris, and P. E. Hopkins, “Uncertainty in linewidth quantification of overlapping Raman bands,” Rev. Sci. Instrum. 90, 013111 (2019). [CrossRef]  

23. E. Lenzi, S. Dinarelli, G. Longo, M. Girasole, and V. Mussi, “Multivariate analysis of mean Raman spectra of erythrocytes for a fast analysis of the biochemical signature of ageing,” Talanta 221, 121442 (2021). [CrossRef]  

24. J. L. Pichardo-Molina, C. Frausto-Reyes, O. Barbosa-García, R. Huerta-Franco, J. L. González-Trujillo, C. A. Ramírez-Alvarado, G. Gutiérrez-Juárez, and C. Medina-Gutiérrez, “Raman spectroscopy and multivariate analysis of serum samples from breast cancer patients,” Lasers Med. Sci. 22(4), 229–236 (2007). [CrossRef]

25. P. Wang, L. Guo, Y. Tian, J. Chen, S. Huang, C. Wang, P. Bai, D. Chen, W. Zhu, H. Yang, W. Yao, and J. Gao, “Discrimination of blood species using Raman spectroscopy combined with a recurrent neural network,” OSA Continuum 4, 672–687 (2021). [CrossRef]  

26. J. Liu, M. Osadchy, L. Ashton, M. Foster, C. J. Solomon, and S. J. Gibson, “Deep convolutional neural networks for Raman spectrum recognition: a unified solution,” Analyst (Cambridge, U. K.) 142, 4067–4074 (2017). [CrossRef]  

27. C. S. Ho, N. Jean, C. A. Hogan, L. Blackmon, S. S. Jeffrey, M. Holodniy, N. Banaei, A. A. E. Saleh, S. Ermon, and J. Dionne, “Rapid identification of pathogenic bacteria using Raman spectroscopy and deep learning,” Nat. Commun. 10, 4927 (2019). [CrossRef]

28. R. Staritzbichler, P. Hunold, I. Estrela-Lopis, P. Werner Hildebrand, B. Isermann, and T. Kaiser, “Raman spectroscopy on blood serum samples of patients with end-stage liver disease,” PLoS One 16(9), e0256045 (2021). [CrossRef]  

29. M. Zong, L. Zhou, Q. Guan, D. Lin, J. Zhao, H. Qi, D. Harriman, L. Fan, H. Zeng, and C. Du, “Comparison of Surface-Enhanced Raman Scattering Properties of Serum and Urine for the Detection of Chronic Kidney Disease in Patients,” Appl. Spectrosc. 75(4), 412–421 (2021). [CrossRef]  

30. S. Feng, L. Zhou, D. Lin, J. Zhao, Q. Guan, B. Zheng, K. Wang, H. Li, R. Chen, H. Zeng, and C. Du, “Assessment of treatment efficacy using surface-enhanced Raman spectroscopy analysis of urine in rats with kidney transplantation or kidney disease,” Clin. Exp. Nephrol. 23, 880–889 (2019). [CrossRef]  

31. J. Guo, Z. Rong, Y. Li, S. Wang, W. Zhang, and R. Xiao, “Diagnosis of chronic kidney diseases based on surface-enhanced Raman spectroscopy and multivariate analysis,” Laser Phys. 28(7), 075603 (2018). [CrossRef]  

32. S. Z. Al-Sammarraie, L. A. Bratchenko, E. N. Typikova, P. A. Lebedev, V. P. Zakharov, and I. A. Bratchenko, “Silver nanoparticles-based substrate for blood serum analysis under 785 nm laser excitation,” J. Biomed. Photonics Eng. 8(1), 010301 (2022).

33. J. Zhao, H. Lui, D. I. McLean, and H. Zeng, “Automated autofluorescence background subtraction algorithm for biomedical Raman spectroscopy,” Appl. Spectrosc. 61(11), 1225–1232 (2007). [CrossRef]

34. P. Refaeilzadeh, L. Tang, and H. Liu, “Cross-Validation,” in Encyclopedia of Database Systems (Springer, 2009).

35. L. A. Bratchenko, I. A. Bratchenko, A. A. Lykina, M. V. Komarova, D. N. Artemyev, O. O. Myakinin, A. A. Moryatov, I. L. Davydkin, S. V. Kozlov, and V. P. Zakharov, “Comparative study of multivariative analysis methods of blood Raman spectra classification,” J. Raman Spectrosc. 51(2), 279–292 (2020). [CrossRef]  

36. S. Kucheryavskiy, “mdatools — R package for chemometrics,” Chemom. Intell. Lab. Syst. 198, 103937 (2020). [CrossRef]  

37. O. M. Kvalheim, R. Arneberg, O. Bleie, T. Rajalahti, A. K. Smilde, and J. A. Westerhuis, “Variable importance in latent variable regression models,” J. Chemometrics 28(8), 615–622 (2014). [CrossRef]  

38. M. A. B. Hedegaard, K. L. Cloyd, C. M. Horejsa, and M. M. Stevens, “Model based variable selection as a tool to highlight biological differences in Raman spectra of cells,” Analyst (Cambridge, U. K.) 139, 4629–4633 (2014). [CrossRef]  

39. J. W. Tang, Q. H. Liu, X. C. Yin, Y. C. Pan, P. B. Wen, X. Liu, X. X. Kang, B. Gu, Z. B. Zhu, and L. Wang, “Comparative analysis of machine learning algorithms on surface enhanced Raman spectra of clinical Staphylococcus species,” Front. Microbiol. 12, 696921 (2021). [CrossRef]

40. J. Hu, Y. Zou, B. Sun, X. Yu, Z. Shang, J. Huang, S. Jin, and P. Liang, “Raman spectrum classification based on transfer learning by a convolutional neural network: Application to pesticide detection,” Spectrochim. Acta, Part A 265, 120366 (2022). [CrossRef]  

41. H. Yan, M. Yu, J. Xia, L. Zhu, T. Zhang, Z. Zhu, and G. Sun, “Diverse region-based CNN for tongue squamous cell carcinoma classification with Raman spectroscopy,” IEEE Access 8, 127313–127328 (2020). [CrossRef]  

42. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), pp. 770–778.

43. D. Falbel, J. J. Allaire, F. Chollet, RStudio, Google, Y. Tang, W. Van Der Bijl, M. Studer, and S. Keydana, the Keras Package, Version 2.3.0.0 (2020).

44. B. M. Greenwell and B. C. Boehmke, “Variable Importance Plots—An Introduction to the vip Package,” The R Journal 12(1), 343–366 (2020). [CrossRef]  

45. Y. Date and J. Kikuchi, “Application of a deep neural network to metabolomics studies and its performance in determining important variables,” Anal. Chem. 90(3), 1805–1810 (2018). [CrossRef]  

46. A. McGovern, R. Lagerquist, D. John Gagne II, G. Eli Jergensen, K. L. Elmore, C. R. Homeyer, and T. Smith, “Making the black box more transparent: understanding the physical implications of machine learning,” Bull. Am. Meteorol. Soc. 100, 2175–2199 (2019). [CrossRef]  

47. J. B. Yang, K. Q. Shen, C. J. Ong, and X. P. Li, “Feature selection for MLP neural network: the use of random permutation of probabilistic outputs,” IEEE Trans. Neural Netw. 20(12), 1911–1922 (2009). [CrossRef]

48. X. Cao, Z. Wang, L. Bi, and J. Zheng, “Label-free detection of human serum using surface-enhanced Raman spectroscopy based on highly branched gold nanoparticle substrates for discrimination of non-small cell lung cancer,” Journal of Chemistry 2018(23), 1–13 (2018). [CrossRef]  

49. E. Ryzhikova, N. M. Ralbovsky, L. Halámková, D. Celmins, P. Malone, E. Molho, J. Quinn, E. A. Zimmerman, and I. K. Lednev, “Multivariate statistical analysis of surface enhanced Raman spectra of human serum for Alzheimer’s disease diagnosis,” Appl. Sci. 9(16), 3256 (2019). [CrossRef]

50. X. Yue, H. Li, J. Tang, J. Liu, and J. Jiao, “Rapid and label-free screening of echinococcosis serum profiles through surface-enhanced Raman spectroscopy,” Anal. Bioanal. Chem. 412(2), 279–288 (2020). [CrossRef]

51. A. Bonifacio, S. D. Marta, R. Spizzo, S. Cervo, A. Steffan, A. Colombatti, and V. Sergo, “Surface-enhanced Raman spectroscopy of blood plasma and serum using Ag and Au nanoparticles: a systematic study,” Anal. Bioanal. Chem. 406, 2355–2365 (2014). [CrossRef]  

52. J. H. Salazar, “Overview of urea and creatinine,” Lab. Med. 45(1), e19–e20 (2014). [CrossRef]  

53. D. Pandya, A. Kumar Nagrajappa, and K. S. Ravi, “Assessment and correlation of urea and creatinine levels in saliva and serum of patients with chronic kidney disease, diabetes and hypertension- a research study,” J. Clin. Diagn. Res. 10(10), ZC58–ZC62 (2016). [CrossRef]  

54. C. Junior Saatkamp, M. Liberal de Almeida, J. Aliana Martins Bispo, A. Luiz Barbosa Pinheiro, A. Barrinha Fernandes, and L. Silveira Jr., “Quantifying creatinine and urea in human urine through Raman spectroscopy aiming at diagnosis of kidney disease,” J. Biomed. Opt. 21(3), 037001 (2016). [CrossRef]

55. Z. Huang, S. Feng, Q. Guan, T. Lin, J. Zhao, C. Y. C. Nguan, H. Zeng, D. Harriman, H. Li, and C. Du, “Correlation of surface-enhanced Raman spectroscopic fingerprints of kidney transplant recipient urine with kidney function parameters,” Sci. Rep. 11, 2463 (2021). [CrossRef]

56. Y. Lu, C. Wu, R. You, Y. Wu, H. Shen, L. Zhu, and S. Feng, “Superhydrophobic silver film as a SERS substrate for the detection of uric acid and creatinine,” Biomed. Opt. Express 9(10), 4988–4997 (2018). [CrossRef]  

57. L. Parada Moreira, L. Silveira Jr., A. G. da Silva, A. B. Fernandes, M. T. T. Pacheco, and D. D. S. Rocco, “Raman spectroscopy applied to identify metabolites in urine of physically active subjects,” J. Photochem. Photobiol., B 176, 92–99 (2017). [CrossRef]

Figures (5)

Fig. 1. Schematic diagram of the architecture of the one-dimensional CNN for the recognition of Raman spectra and serum classification, where: input is the input layer with the output shape (1700, 1); conv1 is the one-dimensional convolutional layer with padding = “same” and the output shape (1700, 8); pooling1 is the max pooling layer with pool_size = 5, strides = 5 and the output shape (340, 8); residual conv1 is the one-dimensional convolutional layer with strides = 5 and the output shape (340, 8); add 1 is the layer that adds the residual tensor to the convolutional base, with the output shape (340, 8); conv2 is the one-dimensional convolutional layer with padding = “same” and the output shape (340, 16); pooling2 is the max pooling layer with pool_size = 4, strides = 4 and the output shape (85, 16); residual conv2 is the one-dimensional convolutional layer with strides = 20 and the output shape (85, 16); add 2 is the layer that adds the residual tensor to the convolutional base, with the output shape (85, 16); flatten is a flatten layer; FC1 is a fully connected layer; FC2 is a fully connected layer with parameters depending on the task (units = 1 and softmax activation function for discriminating the hemodialysis patients with end-stage CKD from the patients with stages 1-3a of CKD; units = 1 and no activation function for regression between the serum spectral characteristics and the creatinine or the urea level).
Fig. 2. Mean SERS spectra of serum for the group of patients with 1-3a stages of CKD and the group of hemodialysis patients.
Fig. 3. ROC curves of the constructed models for the classification of “end-stage kidney failure” group vs “1-3a stages of CKD” group on the verification dataset.
Fig. 4. VIP distribution of the serum SERS spectra matrix in constructing a model: a) based on PLS-DA; b) based on CNN.
Fig. 5. VIP distribution of the matrix of serum SERS spectra when constructing: a) regression of the creatinine level by the PLS method (R² = 0.74); b) regression of the urea level by the PLS method (R² = 0.76); c) regression of the creatinine level by the CNN method (R² = 0.81); and d) regression of the urea level by the CNN method (R² = 0.82).
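
The layer shapes listed in the Fig. 1 caption can be checked with simple length arithmetic. The sketch below uses the usual output-length rules for one-dimensional convolution and pooling ("same" padding gives ceil(L / stride); no padding gives floor((L − k) / stride) + 1). The kernel sizes are not given in the caption, so `kernel_size=3` for the main convolutions and `kernel_size=1` for the residual convolutions are assumptions, as are the helper names.

```python
import math

def conv1d_len(length, kernel_size, stride=1, padding="valid"):
    """Output length of a 1-D convolution along the spectral axis."""
    if padding == "same":
        return math.ceil(length / stride)
    if padding == "valid":
        return (length - kernel_size) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

def pool1d_len(length, pool_size, stride):
    """Output length of 1-D max pooling without padding."""
    return (length - pool_size) // stride + 1

n_in = 1700                                              # input shape (1700, 1)

conv1 = conv1d_len(n_in, kernel_size=3, padding="same")  # kernel size assumed
pool1 = pool1d_len(conv1, pool_size=5, stride=5)
res1 = conv1d_len(n_in, kernel_size=1, stride=5)         # kernel size assumed

conv2 = conv1d_len(pool1, kernel_size=3, padding="same") # kernel size assumed
pool2 = pool1d_len(conv2, pool_size=4, stride=4)

# The caption does not say which tensor feeds "residual conv2"; try both.
res2_from_input = conv1d_len(n_in, kernel_size=1, stride=20)
res2_from_add1 = conv1d_len(pool1, kernel_size=1, stride=20)

flat = pool2 * 16                                        # 16 filters after conv2

print(conv1, pool1, res1, conv2, pool2, res2_from_input, res2_from_add1, flat)
```

Under these assumptions the arithmetic reproduces the lengths 1700, 340 and 85 from the caption. A stride of 20 yields a length of 85 only when applied to a length-1700 tensor; applied to the length-340 output of add 1 it would give 17, so the listed stride is consistent with the second residual branch starting from the input, although the caption does not state this.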

Tables (2)

Table 1. Summary of the subjects

Table 2. Specificity, Sensitivity, and Accuracy of Discriminating the End-Stage Kidney Failure Group and the 1-3a Stages of CKD Group
