SERS mapping combined with chemometrics, for accurate quantification of methotrexate from patient samples

• Multivariate data analysis of SERS maps enabled MTX prediction in patient samples.
• More accurate quantification with multivariate vs. univariate data analysis.
• The model can be built directly in commercial serum and applied to patient samples.
• The genetic algorithm identified specific wavenumbers from complex SERS spectra.
• Image threshold segmentation was applied to select relevant map pixels.

Despite the technological development in Raman instrumentation that has democratized access to 2D sample scanning capabilities, most quantitative surface-enhanced Raman scattering (SERS) analyses are still performed by acquiring only a single or a few spectra per sample and performing univariate data analysis on those. This strategy can, however, reach its limits when analytes need to be detected and quantified in complex matrices. In that case, surface fouling and competition between the target analyte and interfering compounds can impair univariate SERS data analysis, underlining the need for a more in-depth data analysis strategy that exploits full-spectrum information. In this paper, a multivariate data analysis strategy was developed for analyzing SERS maps of methotrexate (MTX) from patient samples, including all steps from baseline correction, wavelength selection, and selection of the relevant pixels in the map (image threshold segmentation) to quantitative model construction based on partial least squares regression. Among the different baseline correction methods evaluated, standard normal variable transformation combined with Savitzky-Golay smoothing proved to be the most suitable, while the genetic algorithm wavelength screening method was able to screen out MTX-related SERS spectral regions more efficiently. Importantly, with the process developed here, it was sufficient to use MTX-spiked commercial serum when building quantitative models, removing the need to work with MTX-spiked patient samples and consequently enabling time- and resource-saving quantitative analyses. In addition, the developed multivariate data analysis approach showed superior performance compared with univariate analysis, with 30 % improved sensitivity (detection limit of 5.7 µM), 25 % higher reproducibility (average relative standard variation of 15.6 %), and 110 % better accuracy (average prediction error of −10.5 %).

Introduction
Surface-enhanced Raman scattering (SERS) has been reported to be a powerful analysis tool with applications notably in medicine [1,2], food safety [3,4], and environmental monitoring [5]. The main advantage of SERS compared to Raman is that the Raman signal of the target analyte, either bonded or adsorbed onto the surface of SERS-active substrates, is significantly enhanced, therefore enabling sensitive detection and quantification even in complex samples [6].
Both qualitative and quantitative information can be gathered from SERS spectra. A widespread practice in quantitative SERS is to combine colloids with the fast single-point measurement acquisition setting [7][8][9][10]. Thanks to the relative homogeneity of nanoparticle suspensions, collecting a single or a few spectra can give an overview of the analyzed sample within seconds and can be conducted even with handheld devices [7,8]. With the view of achieving more reproducible results and reducing batch-to-batch variability, the development of solid SERS substrates has expanded, as they could provide more defined hotspot geometries [11].
Compared with single-point analysis, SERS mappings of larger substrate areas can provide more representative, objective, and accurate information about the analyzed samples by accounting for spot-to-spot intensity fluctuations [12][13][14]. These fluctuations typically occur, especially with solid SERS substrates and at low analyte concentrations, as only a few molecules located in the hotspots contribute to the recorded SERS spectra [15]. The data collected during SERS mappings generally consist of a three-dimensional data cube, with two spatial (length × width, x × y) and one spectral (Raman shifts, λ) dimension, and are consequently rich in information. SERS mappings have been applied in various areas, including the detection of food contaminants [16,17], quantification of a bacterial secondary metabolite [18], or detection of drugs in human biological matrices [1,19]. However, in most of these cases, exhaustive exploitation of the hyperspectral data was not performed; on the contrary, solely a mean spectrum was extracted as output from the maps for univariate quantitative analysis, primarily based on peak height or area under the curve analysis of one or a few defined spectral features [1,16,18].
Furthermore, applying algorithms to the analysis of spectral data is also essential to improve the quantification performances of SERS, especially considering complex sample matrices [6,15]. SERS mapping data suffer from dimensionality issues where a high number of uninformative, undesired and/or interfering wavelength variables need to be removed. Therefore, selecting the most informative wavelengths is critical in constructing robust quantification models. Additionally, the number of samples is often much smaller than the number of wavelength variables collected, which can lead to overfitting, i.e., great prediction performances in training samples but poor prediction performances on newly introduced samples [15]. Both of the above aspects in data analysis of SERS spectra lead to a significant challenge in building a cost-effective and robust calibration model [33]. Wavelength selection algorithms used for this purpose enable retaining informative variables and removing uninformative or interfering ones, thereby improving the prediction performance of calibration models. Yet, applying these algorithms to quantitative SERS mapping data is still not a common practice, which could be due to the complexity of the required data analysis, which is not automated. However, algorithms such as genetic algorithm (GA) [8,9,28], competitive adaptive reweighted sampling (CARS) [7,8,10], uninformative variable elimination (UVE) [34], successive projection algorithm (SPA) [7], and synergy interval partial least squares (SI) [28] have been used to improve SERS quantitative models. The selection of variables is based on the model's performance, generally using partial least squares regression (PLSR) through iterative selection, which yields the subset of variables with the best model performance [7][8][9][10]28]. However, variable selection strategies have specific constraints, and the algorithms perform differently according to the data set. Comparison and screening of algorithm performances is hence required for each new type of data set.
Combining SERS mapping with multivariate spectral data analysis would notably benefit the field of SERS-based bioanalysis [35]. Biological matrices, like blood or serum, are complex, with variable composition and patient-to-patient differences. Even in the case of the same patient, the sample matrix composition could be influenced by physiological and pathological factors, such as sampling time, food intake, or the presence of an underlying condition [35]. This can entail significant spectral variations in terms of both peak intensity and peak (dis)appearance, making the reliability of a univariate quantitative model low [35].
Previously, Göksel et al. developed a nanopillar-assisted separation (NPAS) SERS-based detection method to quantify methotrexate (MTX) in the serum of leukemic pediatric patients receiving high-dose MTX therapy [1]. Due to the NPAS process, MTX was well localized in a specific area on the SERS substrates, and univariate data analysis, based on the peak intensity of a single spectral feature, was successfully implemented on the averaged SERS spectra when building calibration curves, even in commercial serum samples. However, when working with patient samples, the high intra- and inter-patient serum variability greatly influenced MTX peak intensity and led to spectral feature variations. To mitigate this high variability and enable quantification of MTX in patient samples, the serum of each patient, collected before MTX infusion, was used to build a calibration curve for each patient by spiking it with MTX. Still, despite this patient-specific calibration, the vast information included in the SERS maps was not fully utilized, leading to poor prediction ability, with an average prediction error of 21 %. Although a calibration curve was built for each patient, MTX peak intensity could be impacted by interfering serum species competing with MTX for hotspots. Moreover, matrix components could even produce interfering SERS peaks, decreasing the accuracy of MTX quantification. Additionally, constructing calibration curves for every patient is not a viable strategy in clinical practice. A single quantification model, ideally built using commercial serum, would instead be preferred to predict MTX concentrations from all patient samples.
Accordingly, the current work aimed to develop an accurate, rapid, reproducible, and easy-to-implement data analysis method for quantifying MTX in patient samples by fully exploiting the previously published SERS mapping dataset [1]. To that end, chemometrics algorithms were applied to reduce spectral noise (standard normal variable transformation (SNVT), multiplicative scatter correction (MSC), first derivative (FD), detrend, and Savitzky-Golay smoothing (SG)), to screen and select meaningful wavelength variables (CARS, GA and SI), and to select the regions of the SERS substrate carrying significant MTX information (image threshold segmentation). As indicated earlier, this in-depth data analysis is one of the few where spectral data collected in maps are used for building an analytical model for quantification.

Description of the used dataset
All the SERS data used in this work were collected and presented by Göksel et al. [1]. More information regarding experimental parameters can be found in Section 1 of the Supporting Information. The data set consisted of 2 quantitative models (i and ii) and 1 type of data used to test the models (iii): The SERS maps comprised around 43 × 43 spectra and 1686 wavelengths per spectrum. Each map was collected onto a new piece of SERS substrate from a different aliquot of the same sample (technical replicates, n).

Baseline correction methods
To reduce the noise in the SERS spectra and correct their baseline, the FD (window size of 17), MSC, detrend, SNVT and SG (window size of 15, derivative order of 2) algorithms were applied as pre-processing.
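As an illustration of the two pre-processing steps that are combined later in this work (SNVT followed by SG smoothing), a minimal Python sketch is given here. It is not the MATLAB implementation used for the analysis. The function names, the use of the sample standard deviation in SNVT, the reading of the SG setting as a quadratic smoothing polynomial, and the choice to leave the window edges unsmoothed are all assumptions made for the example.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample SD (n - 1); an assumption
    return [(v - mean) / sd for v in spectrum]

def sg_smooth(spectrum, window=15):
    """Savitzky-Golay smoothing with a quadratic polynomial (assumed order).

    Uses the closed-form quadratic/cubic smoothing weights for an odd window.
    Points closer than window // 2 to either end are returned unchanged.
    """
    if window % 2 == 0 or window < 3:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    denom = 4 * window * (window ** 2 - 4)
    offsets = range(-half, half + 1)
    weights = [3 * (3 * window ** 2 - 7 - 20 * i * i) / denom for i in offsets]
    smoothed = list(spectrum)
    for centre in range(half, len(spectrum) - half):
        smoothed[centre] = sum(
            w * spectrum[centre + i] for w, i in zip(weights, offsets)
        )
    return smoothed

def snvt_sg(spectrum, window=15):
    """Apply SNVT first, then SG smoothing, to a single spectrum."""
    return sg_smooth(snv(spectrum), window=window)
```

In a map, such a function would be applied to each pixel spectrum independently before any averaging or modelling.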

Variable selection methods
As the SERS spectra contain numerous wavenumbers that do not include useful information for MTX quantification, variable selection methods were employed to select the most meaningful variables to include in the PLSR model. Both discontinuous wavelength screening (CARS and GA) and continuous wavelength screening (SI) methods were tested.
In the CARS algorithm, the importance of wavelength variables was determined by the absolute value of PLSR model coefficients [36]. The CARS algorithm mainly included the following steps: (i) Monte Carlo sampling, with random selection of the calibration set samples; (ii) deletion of the wavelength variables with lower contribution rates for the PLSR model (with a maximum of 15 principal components) based on an exponentially decreasing function; (iii) adaptive reweighted sampling following the "survival of the fittest" principle to eliminate variables competitively and retain variables whose weight was greater than the threshold (20); and (iv) comparison of the cross-validation root mean square error (RMSECV) values acquired by 5-fold cross-validation in the subsets after 50 sampling runs, and selection of the subset with the smallest RMSECV value.
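To make step (ii) more concrete, the short Python sketch here computes how many variables would be retained after each sampling run under an exponentially decreasing function. The specific form r_i = a·exp(−k·i), with constants chosen so that all variables are kept in the first run and two in the last, is the form commonly given for CARS; whether these exact constants were used in this work is not stated, so they are an assumption, as is the function name.

```python
import math

def cars_retention_schedule(n_variables, n_runs):
    """Number of variables kept after each sampling run (exponential decay).

    Assumed form: ratio_i = a * exp(-k * i), with ratio_1 = 1 and
    ratio_N = 2 / n_variables.
    """
    a = (n_variables / 2) ** (1 / (n_runs - 1))
    k = math.log(n_variables / 2) / (n_runs - 1)
    return [
        round(a * math.exp(-k * i) * n_variables)
        for i in range(1, n_runs + 1)
    ]

# 1686 wavelength variables and 50 sampling runs, as described in the text.
schedule = cars_retention_schedule(1686, 50)
```

The steep early decay removes most uninformative variables quickly, while the later runs refine the selection more slowly.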
In the GA, the initial population (30 chromosome samples) of the subset was randomly generated by binary coding, where "1" represents the selection of the corresponding wavelength and "0" means non-selection [37,38]. The population evolved through crossover (probability of 0.5) and mutation (probability of 0.01) operations. The variables were screened based on the principle of "survival of the fittest", i.e., variables with high predictive ability for the PLSR model were included in the next population. Finally, after 100 iterations, the population converged to the wavelength variables containing the necessary information. The number of generations was used as the termination condition: when the set number of generations was reached, the search process in GA was terminated [38,39].
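The GA settings described above (binary chromosomes, population of 30, crossover probability of 0.5, mutation probability of 0.01, 100 generations) can be sketched in Python as follows. This is only an illustration under stated assumptions: the selection scheme (keeping the better half as parents), single-point crossover, and elitism are choices made for the example, and `toy_fitness` is a hypothetical stand-in for the PLSR-based fitness used in the actual analysis.

```python
import random

def run_ga(n_vars, fitness, pop_size=30, p_cross=0.5, p_mut=0.01,
           n_gen=100, seed=0):
    """Binary GA for variable selection; returns the best 0/1 mask found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(n_gen):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection (assumed)
        children = [best[:]]                # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = a[:]
            if rng.random() < p_cross:      # single-point crossover (assumed)
                cut = rng.randrange(1, n_vars)
                child = a[:cut] + b[cut:]
            # Flip each bit with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

def toy_fitness(mask):
    """Hypothetical fitness: reward 'informative' indices, penalise size."""
    informative = {3, 4, 5, 12}
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)

best_mask = run_ga(20, toy_fitness)
```

In practice the fitness would be replaced by a cross-validated error of a PLSR model built on the selected wavelengths, which is far more expensive to evaluate than the toy function used here.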
The SI algorithm divided the spectra evenly into N = 20 subintervals, removed one subinterval, and established a PLSR model with the remaining N-1 subintervals [40]. The first excluded subinterval replaced each subinterval in the PLSR model established by N-1 subintervals, and the stability and prediction accuracy of these sub-models were assessed through RMSECV. Among these, the most stable sub-model was used as the first SI-PLS model, and then the above steps were repeated for the remaining N-1 subintervals. Finally, the model with the smallest RMSECV was found from the established sub-models as the SI-PLS model, and this subinterval was also the optimal wavelength variable subset.
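As a small illustration of the first step of the SI procedure, the Python sketch here splits 1686 wavelength variables into 20 contiguous subintervals. Since 1686 is not divisible by 20, the way the remainder is distributed (one extra variable in each of the first few intervals) is an assumption, as are the function name and the use of half-open index ranges.

```python
from math import comb

def split_into_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs of near-equal contiguous intervals."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        size = base + (1 if i < extra else 0)  # spread the remainder (assumed)
        bounds.append((start, start + size))
        start += size
    return bounds

intervals = split_into_intervals(1686, 20)

# Number of candidate models if two intervals were combined at a time.
n_pairs = comb(20, 2)
```

Each candidate combination of subintervals would then be scored by the RMSECV of a PLSR model built on the corresponding wavelengths.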
All the above-described variable selection methods (CARS, GA and SI) were combined with PLSR. The samples were divided into training and prediction sets (sample ratio of 2/1). The RMSECV, root mean square error of prediction (RMSEP), and determination coefficient in the training set (Rt²) and the prediction set (Rp²) were calculated to assess the performance of the PLSR model. To further test the accuracy of the prediction results of the model, the predicted concentrations of patient samples were compared with their nominal values, and relative standard deviation (RSD) and prediction error (PE) were calculated.
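For readers less familiar with PLSR, a minimal single-response (PLS1) sketch in pure Python is given here. It is an illustration only, not the MATLAB implementation used in this work; the function names are hypothetical, the data are mean-centred without further scaling, and prediction is done by sequential deflation of the new spectrum, all of which are choices made for the example.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (NIPALS-style) on rows of X and responses y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, comps = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat

# Toy, noise-free example: y = 2*x1 + x2 + 3.
X_train = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y_train = [7, 8, 14, 14, 21]
model = pls1_fit(X_train, y_train, n_components=2)
prediction = pls1_predict(model, [6, 4])
```

With real spectra, the number of components would be chosen by cross-validation rather than fixed, and the rows of `X_train` would be the pre-processed spectra restricted to the selected wavelengths.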
Generally, reliable models have higher R² values and lower RMSE values, and the difference between Rt² and Rp² and between RMSECV and RMSEP should be as small as possible.
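The figures of merit named above can be written out as a short Python sketch. The function names are hypothetical; the sign convention for PE (predicted minus nominal, relative to nominal) and the use of the sample standard deviation for RSD are assumptions, since the exact definitions are not restated in this section.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean square error (RMSECV or RMSEP, depending on the data used)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rsd_percent(replicates):
    """Relative standard deviation of replicate predictions, in percent."""
    return 100 * statistics.stdev(replicates) / statistics.fmean(replicates)

def prediction_error_percent(predicted, nominal):
    """Signed prediction error relative to the nominal value, in percent."""
    return 100 * (predicted - nominal) / nominal
```

For example, three replicate predictions of 9, 10 and 11 µM give an RSD of 10 %, and a prediction of 90 µM for a nominal 100 µM gives a PE of −10 %.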

Optimization of the image threshold segmentation
As mentioned earlier, only certain parts of the SERS maps contain MTX-related information. Appropriate threshold segmentation can be applied to the maps to extract spectral information only from the regions where MTX molecules are located.
When MTX is present at a specific location of the SERS substrate, the SERS intensity at the characteristic MTX wavelengths should be relatively higher than the background signal. Therefore, threshold segmentation can isolate these areas with high SERS intensity and then calculate the average SERS intensity in these areas (Supporting Information). The PLSR models can subsequently be built with these averaged intensity data.
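A minimal Python sketch of this idea is shown here: pixels are ranked by their intensity at one characteristic band, the top fraction is kept, and the spectra of the kept pixels are averaged. The details of the actual procedure are in the Supporting Information and are not reproduced; ranking on a single band, the rounding of the pixel count, and the function name are assumptions made for the example.

```python
def top_fraction_mean_spectrum(pixels, band_index, fraction=0.2):
    """Average the spectra of the brightest pixels at a chosen band.

    pixels: list of spectra (one list of intensities per map pixel).
    band_index: index of the band used for ranking (assumed single band).
    fraction: share of pixels to keep, e.g. 0.2 for the top 20 %.
    """
    n_keep = max(1, round(len(pixels) * fraction))
    ranked = sorted(pixels, key=lambda s: s[band_index], reverse=True)
    kept = ranked[:n_keep]
    n_bands = len(kept[0])
    return [sum(s[j] for s in kept) / n_keep for j in range(n_bands)]

# Toy map: 10 pixels, 2 bands; band 0 intensities run from 1 to 10.
toy_pixels = [[float(i), 100.0 - i] for i in range(1, 11)]
mean_spectrum = top_fraction_mean_spectrum(toy_pixels, band_index=0)
```

For a map of about 43 × 43 pixels, a 20 % threshold would keep roughly 370 spectra per map under this rounding.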

Data analysis software
All the data pre-processing and chemometrics analysis were performed using MATLAB Version 2014b (Eigenvector Research Inc., Wenatchee, WA, USA).

Results and discussion
In our previous paper [1], when using univariate, single spectral feature-based data analysis on the same data set used in this work, a calibration curve had to be built for each patient serum to quantify the MTX concentration in the respective patient sample. This practice is very time- and sample-consuming, and thus not suitable for quantitative analysis in real life. On the contrary, the optimal solution would be to construct a single quantitative model, preferably with commercial serum, to accurately predict MTX concentration in patient samples. The model would ideally be able to extract relevant MTX features without being influenced by or sensitive to matrix variations and interferences.

Comparison of different baseline correction methods
When acquiring SERS maps, it is common that the baseline moves up and down or drifts (Figure S1A) [41]. These perturbations reduce the stability of the model, so it is necessary to pre-process the SERS spectra before a quantitative model is built. This work compared different baseline correction methods, namely SNVT, MSC, FD, detrend, and SG, to select the one most suitable for SERS maps.
In the spectra obtained using the MSC pretreatment method, several abnormal spectra were obtained, which did not occur in the raw spectra (Figure S1B). Many spectral features (peaks) obtained after FD pretreatment were negative (Figure S1C), which made FD pretreatment unsuitable for the subsequent image analysis.
Compared with the raw spectra, the detrend and SNVT methods produced pre-processed spectra with similar baselines (Figure S1D and E, respectively). These methods were thus better suited to SERS maps.
As the next step, PLSR models were built on the data pre-processed with these 4 methods (detrend, SNVT, MSC, FD) to compare them. The Rt², Rp², RMSECV and RMSEP are displayed in Table 1, and it can be seen that, among the models, the one based on SNVT-corrected spectra showed better predictions.
Although SNVT could correct the baseline effectively, noise can clearly be observed in the pre-processed spectra (Figure S1E). An additional SG smoothing step was therefore employed to reduce the noise in the SERS maps (Figure S1F), which resulted in improved model performances (Table 1, group 6); therefore, this SNVT-SG processing was used for further data analysis.
To further verify the effectiveness of the SNVT-SG pretreatment, it was applied before univariate data analysis using the MTX characteristic peak (679 cm⁻¹) and compared to non-baseline-corrected data [1] (Table S1). As can be observed in Table S1, univariate models based on the 2-point baseline-corrected spectra used in our previous work also produced better prediction results than non-baseline-corrected data [1].

Comparison of variable selection methods
While the SERS spectrum of MTX has several characteristic peaks (as indicated by pink arrows in Fig. 1, Figure S2 and locations specified in Table S2), in our previous work, we only focused on a single peak to perform univariate quantitative analysis [1]. In this work, we aimed to improve the prediction accuracy by considering several MTX characteristic peaks in the quantification model. As can be observed in Fig. 1, only a few wavenumbers contain relevant information assigned to MTX. Furthermore, despite implementing sample pretreatment (protein precipitation and NPAS) prior to detection, interferences originating from the complex serum matrix could remain in the pretreated sample and compete with MTX for the substrate surface. This matrix effect is further accentuated in the case of patient samples, where, due to the disease and the therapy used, the serum composition can vary broadly, and co-administered drugs are expected to be present in serum together with the target analyte.
In summary, it would be highly beneficial not to rely on only a single spectral feature of the SERS spectra for data analysis. Selecting the characteristic analyte peaks manually is time-consuming and can be subjective. An automatic wavelength screening method for finding these distinct peaks would be much more efficient.
As described in the Material and Methods section, 3 different variable screening algorithms, namely GA, CARS, and SI, were investigated in this work to select the most relevant wavelengths. PLSR models were then constructed using 128 spiked commercial serum samples based on these chosen variables. The variable screening algorithms were compared and their suitability was assessed by looking at the prediction accuracy of MTX from patient samples (Table 2).
GA is a method based on the theory of natural selection and biological evolution [42]. Each spectrum with wavelength variables in GA is regarded as a biological individual. The spectral group with the highest explained variance will be selected as the final selected wavelength variable [39]. CARS can also play a powerful and influential role in determining wavelength variables. CARS does not only reduce the influence of collinear variables on the model, but also further avoids over-fitting in the modeling process. It hence improves the prediction of the model [36]. GA and CARS algorithms screen out discontinuous variables. Nørgaard et al. introduced SI as a method for selecting continuous variables [40]. The accurate model is obtained by dividing the spectrum into smaller equidistant subintervals, then comparing the preferred combined spectral regions. As shown in Table 2 and Fig. 1, 63, 76 and 208 wavenumbers were selected from the whole SERS spectrum by GA, CARS and SI, respectively. Among these algorithms, GA selected the lowest proportion of uninformative variables (Table S2). The proportion of informative variables included in the model by GA was between CARS and SI, while more MTX characteristic peaks were selected by GA than by CARS and SI (Table S2). Therefore, among these 3 methods, GA was the most efficient for spectrum screening and for selecting MTX characteristic peaks.

Table 1
PLSR model results after spectra pre-processing with different baseline correction methods. The PLSR models were constructed using 128 spiked commercial serum samples, based on the full spectrum and the whole map pixels, and used to predict the same samples.
Table 3 and Figure S3 show the MTX prediction results from 3 patient samples (n = 3) based on PLSR models constructed with 128 spiked commercial serum samples, comparing the 3 different variable selection methods. When predicting MTX concentration from patient samples, the RMSECV of the 3 PLSR models were quite close. However, the RMSEP of the PLSR model based on GA-selected wavenumbers was lower than the ones based on CARS- and SI-selected wavenumbers. Based on these results, we concluded that the PLSR model constructed using variables screened by GA predicts MTX concentration better than models built using CARS- and SI-selected variables. The differences between Rt² and Rp², and between RMSECV and RMSEP, in the PLSR model based on GA-selected wavenumbers were also lower than those for the 2 other models (CARS and SI), which further supports the improved performance of the PLSR model constructed with GA-selected variables.
In addition, it can be noticed in Table 3 that the GA method yielded the lowest SD, RSD, and PE. Therefore, the model established on wavelengths selected by the GA algorithm provided the best prediction results, which was consistent with the fact that the proportion of informative variables included by GA was the highest.
When comparing the results of the PLSR model based on SNVT-SG pre-processing and GA-selected variables (Group 1 in Table 3) with the univariate model based on the 679 cm⁻¹ MTX peak intensity (Group 3 in Table S1), it could be noticed that the SD, RSD, and PE obtained with the multivariate model were lower than for the univariate model. Based on this finding, we concluded that the multivariate model was more accurate than the univariate one and therefore, the SNVT-SG-GA analysis method was selected for further data analysis.

Influence of the sample matrix on the model performance
The model's performance is affected by the data analysis method and the nature of the samples included in the model. As a general rule, the prediction results of PLSR models are the best when the samples used to build the model are similar to the ones that need to be predicted. In this work, serum from two different sources was used to build the models: commercial serum and serum collected from patients before the initiation of their MTX therapy. It is expected that serum originating from patients would have more variability (patient-to-patient and intra-patient) and be a more complex matrix than the somewhat standardized commercial serum due to the altered physiological condition of patients. Therefore, the effect of the sample matrix was also evaluated in this study.
When spiked patient samples were used to construct the PLSR model to predict the MTX concentration in patient samples undergoing MTX therapy (Group 1 in Table 4), the overall values of RSD, PE, RMSECV and RMSEP were the lowest. Furthermore, it can be seen from Figure S4A that the patient samples were predicted with higher accuracy in that case. On the contrary, when a mix of spiked commercial and patient serum samples was used to build the PLSR model (Group 2 in Table 4) to predict the MTX concentration, the overall values of RSD, PE, RMSEP and RMSECV were much higher. The model could only accurately predict the concentration of 1 of the 3 patient samples (Figure S4B).
The MTX concentration prediction results were the worst when only spiked commercial serum samples were used to build PLSR models (Group 3 in Table 4). The RMSEP, the difference between RMSEP and RMSECV, and the difference between Rt² and Rp² were quite large, and only 1 patient sample could be accurately predicted (Figure S4C).
By comparing these 3 groups of results, it was thus concluded that the PLSR model built with only spiked commercial serum samples resulted in the poorest prediction performances for patient samples. In contrast, as expected, the model built with only patient samples yielded the best prediction performances. When the samples used to create the models and for prediction are comparable, it is easier to eliminate similar spectral interferences, and the resulting model shows better prediction results. It is worth noting that models constructed and verified with the same batch of samples are prone to overfitting, meaning that the models can only be used to predict samples from this batch. The prediction results are usually poor when testing new samples of a different nature. This is why a large number of samples from a wide range of sources are generally collected in multivariate modeling [43,44]. Nevertheless, as samples from only a few different patients (3) were included in the model, the prediction accuracy of the latter could be further improved by including samples from a larger number of patients. In this way, the physiological and pathological variability of serum between patients and within the same individuals would be taken into account in the model.

Influence of the number of samples on the model performance
We also investigated the model's performance in correlation with the number of samples included. Especially for highly variable and complex matrices, such as serum and other biological matrices, it is essential to have enough samples to build the model, to account for this high variability and create a robust predictive model.
The performance of PLSR models constructed with 64, 82, 100 or 128 spiked commercial serum samples was thus compared to investigate the effect of sample number (Table 5 and Figure S5). While the Rt² and Rp² of models built with 64 and 82 samples were similar, increasing the number of samples reduced the RMSEP (Table 5 group 1 and 2). When the number of samples increased to 100, the difference between Rt² and Rp², and the RMSEP, were lower (Table 5 group 3). Further increasing the number of samples to 128 increased the Rp² and decreased the RMSEP, and the differences between Rt² and Rp², and between RMSECV and RMSEP, were also the smallest (group 4 in Table 5). We also found that this model could accurately predict the concentration of almost all patient samples (group 4 in Table 5 and Figure S5D).
It was thus concluded that increasing the number of samples included in the models could help mitigate sample-to-sample spectral variability by having additional matrix background information, resulting in improved prediction performance.

Optimization of the image threshold segmentation process
During the NPAS process, as explained by Göksel et al. [1], sample migration and analyte separation from matrix compounds occur on the SERS substrate due to a wicking effect.
Considering this, only a portion of the SERS substrates contains the relevant spectral information related to MTX, and the distribution of MTX on the SERS substrate is mainly localized in the migration region. This distribution can also vary slightly from sample to sample. Appropriate image threshold segmentation methods could thus be used to solely include pixels containing relevant information related to MTX in the constructed PLSR models. Including less noise would, as a result, benefit the models in terms of MTX concentration prediction performance due to improved signal-to-noise ratios.
Different percentages (top 10 %, 20 %, 30 %, 40 % and 100 %) of image threshold segmentation were investigated to build PLSR models with 128 spiked commercial serum samples, and the performance of these models was compared by predicting the MTX concentration in 3 patient samples (Fig. 2 and Figure S6). In Table 6, it can be seen that when the top 30 % or top 20 % pixels (group 3 and 4, respectively) of spectral intensity were used for threshold segmentation, the results of training and prediction were the best, probably due to the most effective noise reduction. In both cases, all patient samples were predicted accurately, with slightly better performances of the top 20 % model (Table 6 group 4 and Fig. 2). On the contrary, including more pixels in the model increased the noise and deteriorated prediction performances (Table 6 group 1 and 2, and Figure S6A and B). Moreover, further decreasing the amount of data included in the model also resulted in performance degradation (Table 6 group 5, and Figure S6D), which could be explained by the too-low amount of data included in the models.

Table 4
Influence of the sample type on PLSR results for the prediction of MTX concentration in 3 patient samples (n = 3). Spectra were pre-processed using SNVT-SG-GA. The top 20 % pixels of the maps, selected by image threshold segmentation, were used to build the models.

Table 5
Influence of the number of samples used to build the model on PLSR results for the prediction of MTX concentration in 3 patient samples (n = 3). Spectra were pre-processed using SNVT-SG-GA. The top 20 % pixels of the maps, selected by image threshold segmentation, were used to build the models.
The limits of detection and quantification (LOD and LOQ) of MTX using this data analysis strategy, for the model built with the top 20 % pixels, were found to be 5.7 µM (LOD) and 17.0 µM (LOQ). The LOD and LOQ values calculated with the multivariate method developed and optimized here, based on multiple spectral features, proved to be lower than the ones obtained with the previously published univariate model [1], where the LOD and LOQ were 8 µM and 26.5 µM, respectively. These results show that the data analysis method developed in this study, based on the combination of standard normal variable transformation, SG smoothing, GA and image threshold segmentation (SNVT-SG-GA-ITS-PLSR), can predict the concentration of MTX in patient samples based on a model solely constructed with spiked serum samples.

Conclusion
The developed data analysis strategy for SERS maps, based on the combination of SNVT-SG-GA-ITS-PLSR, enabled accurate prediction of MTX concentration in patient samples based on chemometrics models constructed with only commercial serum samples. Notably, the SNVT baseline correction and the GA screening method applied for data pre-processing showed better performance than other pre-processing and selection methods such as MSC, FD, detrend, CARS or SI. During optimization of the image threshold segmentation, we found that using the top 20 % pixels of spectral intensity led to better prediction performance.
Not only were the prediction results superior to those of the previously reported univariate, single spectral feature-based data analysis method, with lower RSD (15.6 % vs 20.8 % for the multivariate model with top 20 % pixels vs the univariate model, respectively) and PE (10.5 % vs 22.6 % on average, respectively), but the samples used for model construction were also more readily available (commercial serum instead of patient samples). In addition, avoiding constructing calibration models in each patient serum represents a time- and resource-saving benefit. The developed method is relatively user-friendly, could be automated, and the constructed models could be updated with upcoming new samples to make them more robust. Moreover, the method can quickly complete data extraction, analysis and prediction (2 min).
In conclusion, the main scientific advance of this work is the combination of multivariate data analysis with SERS mapping. The developed and optimized data analysis approach proved to be accurate, rapid, and easy to implement for quantifying MTX in patient serum by fully exploiting the information contained in SERS maps, and could be implemented in SERS-based assays in real-life settings. We believe that the presented methodology could be adapted to other quantitative SERS-based assays.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. SERS spectra of MTX in spiked commercial serum after SNVT-SG pre-processing. The characteristic SERS peaks of MTX are highlighted by pink arrows. The plain black and dashed red vertical lines display the wavelengths selected by the GA and CARS algorithms, respectively. The highlighted purple areas are the spectral regions selected by the SI algorithm.

Fig. 2. Scatter plots of the PLSR model constructed with an image threshold segmentation of top 20 % pixels. The blue stars represent the samples used to build the models while the red circles represent the predicted 24 h MTX therapy patient samples.

Table 2
Comparison of different variable selection algorithms in terms of number of selected wavenumbers.

Table 3
Comparison of PLSR model results for different variable selection methods, using the top 20 % pixels of the maps, obtained by image threshold segmentation.Spectra were pre-processed using SNVT-SG.

Table 6
PLSR model results based on different percentages of image threshold segmentation for the prediction of MTX concentration in 3 patient samples (n = 3).