Rapid and accurate identification of stem cell differentiation stages via SERS and convolutional neural networks

Monitoring the transition of cell states during induced pluripotent stem cell (iPSC) differentiation is crucial for clinical medicine and basic research. However, both identification category and prediction accuracy need further improvement. Here, we propose a method combining surface-enhanced Raman spectroscopy (SERS) with convolutional neural networks (CNN) to precisely identify and distinguish cell states during stem cell differentiation. First, mitochondria-targeted probes were synthesized by combining AuNRs and mitochondrial localization signal (MLS) peptides to obtain effective and stable SERS spectra signals at various stages of cell differentiation. Then, the SERS spectra served as input datasets, and their distinctive features were learned and distinguished by CNN. As a result, rapid and accurate identification of six different cell states, including the embryoid body (EB) stage, was successfully achieved throughout the stem cell differentiation process with an impressive prediction accuracy of 98.5%. Furthermore, the impact of different spectral feature peaks on the identification results was investigated, which provides a valuable reference for selecting appropriate spectral bands to identify cell states. This is also beneficial for shortening the spectral acquisition region to enhance spectral acquisition speed. These results suggest the potential for SERS-CNN models in quality monitoring of stem cells, advancing the practical applications of stem cells.


Introduction
Induced pluripotent stem cells (iPSC) hold immense potential for regenerative and transplantation medicine [1,2]. In the field of autologous cell therapy, neural progenitor cells (NPC) derived from iPSC are extensively utilized in treating various neurological disorders [3][4][5], such as age-related macular degeneration [6], amyotrophic lateral sclerosis [7] and Parkinson's disease [8]. However, these applications depend on the successful differentiation of iPSC. Incomplete differentiation or differentiation into unwanted cell subtypes can be tumorigenic or lead to treatment failure, limiting the safety and effectiveness of clinical application [9,10]. Notably, the formation of embryoid bodies (EB), a crucial initial step in directing iPSC differentiation pathways, has been widely employed to investigate differentiation strategies across diverse cell types [11,12]. Standardized cultivation of EBs can enhance their quality and improve differentiation outcomes [13][14][15]. Hence, accurate and detailed monitoring and identification of cellular states throughout targeted differentiation is essential to explore biochemical mechanisms and to improve therapeutic effect and clinical safety [16,17].
To identify the cellular state during differentiation, various techniques such as immunocytochemistry, flow cytometry, Western blotting, and polymerase chain reaction are employed in stem cell differentiation studies [18,19]. However, these methods are complex, time-consuming, and destructive, potentially compromising the integrity of stem cells and, consequently, their clinical applicability [20]. Surface-enhanced Raman spectroscopy (SERS) stands out as a label-free, non-destructive, and highly sensitive detection technology capable of operating at the single-cell level. By utilizing Au or Ag nanoparticles to amplify spectral signals that reflect the internal biochemical characteristics of cells, with improved signal-to-noise ratio and reduced collection time [21,22], SERS has emerged as a valuable tool for monitoring the stem cell differentiation process through spectral analysis of cells [23], nuclei [24], and mitochondria [25]. However, the accuracy of Raman spectroscopy is limited when it is applied to distinguish cells at different periods [26,27]. Given the significant involvement of mitochondria in cell differentiation processes [28], where they can actively promote stem cell differentiation [25,29], we aim to enhance the Raman signal of mitochondria by introducing mitochondria-targeting probes, thereby improving the quality and reliability of the spectra. The cell state during differentiation can then be identified accurately by collecting mitochondrial spectra and using SERS to analyze the related material changes.
SERS spectra provide a wealth of intricate cellular information, which requires advanced data processing techniques for accurate identification of the cell states [30]. In the analysis of Raman spectra, conventional algorithms such as k-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF) have been employed for feature extraction and spectral classification [31]. However, these algorithms show limited accuracy when dealing with large datasets and complex spectra [30,32]. To address this issue, convolutional neural network (CNN) models have emerged as effective deep learning methods extensively utilized in the processing of spectral data [27]. CNNs excel at extracting valuable information from complex spectra owing to their architecture and algorithmic capabilities [33]. Therefore, they are well suited to tasks such as feature extraction and spectral classification. For instance, the integration of SERS and CNN has demonstrated its ability to distinguish between normal and tumor cells [34], as well as its potential for identifying stem cell states [35]. However, the classification of cellular states in existing works is relatively coarse, focusing on the classification of different types of cells [32,34]. Moreover, to the best of our knowledge, no studies have addressed the identification of cellular states during the EB stage. Skvortsova et al. achieved a 95.9% accuracy in identifying mesenchymal stem cell states at different time points using SERS and CNN, but the refinement of classification categories remains to be improved [35]. Germond et al. attained an 88.6% accuracy in the reprogramming process of mouse stem cells using Raman spectroscopy and CNN, yet the classification categories and accuracy were inadequate [26]. In this study, we aim to enhance the spectral signals by introducing mitochondria-targeted probes and to further investigate mitochondrial activity during cell differentiation. Subsequently, we analyze the SERS spectra using CNN to achieve accurate identification of cell states and classification into more categories.
In this paper, a CNN-SERS combination method (CNN-S) was used to realize the rapid and accurate identification of cell states during iPSC differentiation. The differentiation process from iPSC to NPC was divided into six stages based on the timing of inducing agent addition; correspondingly, the cells and their SERS spectra were categorized into six classes. An accurate, effective, and sufficient dataset is a prerequisite for achieving high predictive accuracy. Hence, mitochondria-targeted probes were synthesized by combining AuNRs and mitochondrial localization signal (MLS) peptides to obtain effective and stable spectral signals. With over 6000 SERS spectra as input for the CNN, the trained model can achieve rapid and accurate identification of any unknown-state spectrum. In addition, the contributions of spectral features and spectral length to high-precision identification were investigated. Our work provides a reliable method for the precise identification and quality monitoring of cell states throughout the entire process of stem cell differentiation, expanding the development of artificial intelligence in cellular medicine.

Materials and methods
Cell culture
iPSCs were purchased from ICell Bioscience Inc. (iCell, Shanghai). The method for differentiating iPSC into NPC in this study was based on previous research [36]. First, iPSCs were treated with an iPSC cell digestion solution (iCell, Shanghai) to detach them from the culture flasks. The detached iPSCs were then seeded onto 6-well plates coated with matrix glue. Then, 10 µM Y-27632 and 3 ml M1 medium (Neurobasal-A Medium (1X), 1% vol/vol GlutaMAX Supplement, 1% vol/vol double antibiotic (penicillin/streptomycin), 1% vol/vol N-2 Supplement, 2% vol/vol B27 Supplement and 1.25 µM AMPK inhibitor) were added to the 6-well plates. When the cells reached the exponential growth phase, they were digested and transferred to ultra-low adsorption 6-well plates (CORNING). After 24 hours, the formation of EB became evident. The medium was refreshed daily, and on day 9, purified EB was obtained by filtration through a 40 µm cell filter to remove cellular debris. The cells were then transferred to culture flasks for adherent culture. On day 10, the medium in the flasks was replaced with 3 ml M2 medium (DMEM/F-12 medium supplemented with 1% vol/vol GlutaMAX Supplement, 1% vol/vol N-2 Supplement, 2% vol/vol B27 Supplement and 20 ng/mL FGF2). On day 15, neural rosettes could be observed, which were subsequently isolated and selected using STEMdiff, with subsequent passages for further purification of NPCs. Once the NPCs reached maturity and stability, the NPC Progenitor Medium (STEMdiff Neural Progenitor Medium) was chosen as the cell culture medium to replace the previous ones.

Preparation of the mitochondria-targeted nanoprobes
First, AuNRs were synthesized using a seed-mediated growth method [37]. To guide the SERS probes to mitochondria, the AuNRs were modified with MLS: 112 µL of 5.0 mM MLS solution was added to 1 ml of AuNR colloidal solution. The mixture was stirred at room temperature on a heating magnetic stirrer for 20 h. The cysteine at the end of the MLS carries a sulfhydryl group, which forms a stable Au-S bond with the AuNR surface. The mixture was then centrifuged once at 6000 rpm in a high-speed centrifuge and the supernatant (unbound MLS) was removed; ultrapure water was added, and the mixture was centrifuged once more at 6000 rpm to remove the supernatant again. The resulting AuNRs-MLS probes were then re-dispersed in 100 µl ultrapure water and stored at 4°C for subsequent use.

SERS measurement and pretreatment
At each stage of stem cell differentiation, we extracted a small number of cells and incubated them with AuNRs-MLS for 12 hours to enhance the Raman spectra of mitochondria in the cells. Following that, the cells were treated with Accutase and centrifuged at 1000 rpm. After centrifugation, the cells were fixed with 4% paraformaldehyde for 20 minutes at 4°C and centrifuged once more. The cells were then washed twice with phosphate buffer solution (PBS) and twice with ultrapure water before being resuspended in 50 µL ultrapure water. The SERS spectra of the stem cells were measured using a Raman spectrometer (Renishaw, UK) equipped with a 633 nm laser. The microscope objective had a 50x magnification and a numerical aperture of 0.5, and the laser spot size was 1 µm. The excitation power was set to 7.4 mW. The acquisition parameters were configured to 10 seconds per spectrum with 2 integrations, and the spectral range was set to 600-1800 cm⁻¹. The differentiation experiment was replicated three times, and for each cell state, spectra were collected from at least 20 cells. For each cell, 10 SERS spectra were obtained at randomly selected locations, encompassing both the central and peripheral regions of the cell. The spectral data can be divided into 12 categories, labeled as iPSC, EB 1d, EB 2d, EB 3d, EB 4d, EB 5d, EB 6d, EB 7d, EB 8d, EB 9d, EB 10d, and NPC, with each category containing at least 500 SERS spectra.
The SERS spectra were pre-processed before being input into the CNN-S model. First, the spectra were processed using WiRE software (version 4.3), involving the removal of narrow cosmic rays, baseline correction, and noise smoothing. Finally, the SERS spectra were normalized using MATLAB. The resulting preprocessed spectral dataset consisted of at least 6500 single-cell Raman spectra.
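The normalization step can be illustrated with a minimal Python sketch. The text does not state which normalization was applied in MATLAB, so min-max scaling to the range [0, 1] is an assumption made here purely for illustration, and the function name `min_max_normalize` and the example intensities are hypothetical.

```python
def min_max_normalize(spectrum):
    """Scale one spectrum to [0, 1].

    Assumption: min-max scaling; the source only says the spectra
    were normalized, without naming the method.
    """
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        # A flat spectrum carries no intensity contrast; map it to zeros.
        return [0.0 for _ in spectrum]
    return [(value - lo) / (hi - lo) for value in spectrum]


# Hypothetical intensities (arbitrary units) after baseline correction.
example = [120.0, 300.0, 210.0, 120.0]
normalized = min_max_normalize(example)
print(normalized)
```

Scaling each spectrum independently removes overall intensity differences between acquisition spots, so a classifier sees spectral shape rather than absolute signal strength.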

Fluorescent dyeing
First, both iPSCs and NPCs were seeded onto clean slides pretreated with Matrix. Then, the cells were fixed with 4% paraformaldehyde for 20 minutes at 4°C. Subsequently, the cells were washed with PBST (PBS with 0.1% Triton X-100). To permeabilize the cells, they were immersed in PBST for 10 minutes at 4°C. The cells were then blocked using 10% goat serum (Sigma-Aldrich). For immunostaining, primary antibodies including Anti-SOX2 antibody (Abcam, 1/200), PAX6 (Abcam, 1/350), Oct-3/4 antibody (Santa, 1/200) and Nestin antibody (R&D Systems, 1/40) were diluted in the blocking buffer and incubated with the cells at 4°C for 18∼24 hours. The cells were then washed with PBST. Secondary antibodies, including goat anti-rabbit IgG H&L (Abcam) and goat anti-mouse IgG H&L (Abcam), were diluted in the blocking buffer and incubated with the cells at room temperature in the absence of light for one hour. Finally, the nuclei were counterstained with Hoechst 33342. The samples were observed and imaged using an Olympus IX83 inverted microscope, and subsequent image processing was performed using ImageJ software.

Construction and training of the CNN-S model
As shown in Fig. 1(a), the CNN-S model comprises an input layer, four convolution layers, four maximum pooling layers, a flattening layer, and a fully connected layer. The four convolutional layers contain 16, 32, 64, and 128 convolution kernels, respectively, each with a kernel size of 1 × 3, a stride of 1, and no padding. The rectified linear unit (ReLU) served as the activation function, while maximum pooling with a window size of 1 × 3 and a stride of 3 was employed in the pooling layers to reduce model parameters [38]. The flattening layer flattened the multidimensional data into one-dimensional vectors while preserving the sequence relationships and spatial structures of the original data to facilitate input into the fully connected layer. For a single input spectrum, the fully connected layer has 1536 input neurons, and the number of output neurons is 6 or 12 depending on the task. The specific parameters of CNN-S are shown in Table 1. Feature mapping and model output were performed by the linear layer, where the highest probability among the outputs represented the predicted cell state for the current spectrum. During the training process, the Adam optimizer was utilized with a learning rate of 8 × 10⁻⁴ and a batch size of 36. The cross-entropy loss function was adopted.
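The link between the layer settings above and the 1536 input neurons of the fully connected layer can be checked with simple length arithmetic. The following Python sketch assumes the standard "no padding" output-length formula for both convolution and pooling; the number of points per spectrum is not stated in the text, so `ASSUMED_INPUT_LENGTH = 1100` is a hypothetical value chosen only because it is consistent with 1536 flattened features.

```python
def conv_out(length, kernel=3, stride=1):
    """Output length of an unpadded 1D convolution."""
    return (length - kernel) // stride + 1


def pool_out(length, window=3, stride=3):
    """Output length of an unpadded 1D max pooling."""
    return (length - window) // stride + 1


def cnn_s_flat_features(input_length, channels=(16, 32, 64, 128)):
    """Flattened feature count after four conv + max-pool blocks."""
    length = input_length
    for _ in channels:
        length = pool_out(conv_out(length))
    # The last block has channels[-1] feature maps of `length` points each.
    return channels[-1] * length


# Hypothetical number of points per spectrum (not given in the text).
ASSUMED_INPUT_LENGTH = 1100
print(cnn_s_flat_features(ASSUMED_INPUT_LENGTH))  # 1536
```

Under these assumptions, any input length from 1052 to 1132 points leaves 12 points per feature map after the fourth pooling layer, giving 128 × 12 = 1536 flattened features, which matches the stated size of the fully connected layer.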
To reduce the chance effects of a single partitioning into training and validation sets, the existing dataset was partitioned multiple times using 10-fold cross-validation, from which the optimal model was derived. As depicted in Fig. 1(b), the preprocessed stem cell Raman spectral dataset was randomly shuffled and divided into ten groups. In each experiment, one group of data (blue block) was selected as the test set to evaluate the accuracy of the CNN model, while the remaining nine groups were used for constructing and training the model. To mitigate the risk of overfitting, seven of these nine groups (green block) were dedicated to model building and training, while the other two groups (yellow block) were set aside for validating the model's classification performance. This entire process was repeated ten times to determine the best-performing model.
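The partitioning scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ten_fold_splits`, the fixed seed, the choice of which two non-test groups serve as the validation set, and the example size of 6500 spectra are hypothetical and are not taken from the authors' code.

```python
import random


def ten_fold_splits(n_samples, n_folds=10, n_val_folds=2, seed=0):
    """Yield (train, validation, test) index lists for each round.

    Each round uses one group as the test set, two groups for
    validation, and the remaining seven for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for test_id in range(n_folds):
        rest = [fold for i, fold in enumerate(folds) if i != test_id]
        # Assumption: the first two remaining groups are used for validation.
        val = [idx for fold in rest[:n_val_folds] for idx in fold]
        train = [idx for fold in rest[n_val_folds:] for idx in fold]
        yield train, val, folds[test_id]


splits = list(ten_fold_splits(6500))
for train, val, test in splits:
    print(len(train), len(val), len(test))
```

With 6500 spectra, each round trains on 4550 spectra, validates on 1300, and tests on 650, and every spectrum appears in the test set exactly once across the ten rounds.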

Assessment indicators
The evaluation indices employed in the experiment are as follows.
Accuracy: the proportion of correctly classified samples, providing an overall assessment of the model's performance:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP, FP, TN and FN represent the true positives, false positives, true negatives and false negatives of each type of cell sample, respectively.
Precision: the probability that a sample is truly positive among all those predicted as positive, indicating the model's ability to reject negative samples:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall: the probability of correctly predicting a positive sample out of all actual positive samples, reflecting the model's capability to identify positives:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
F1-score: the harmonic mean of precision and recall, demonstrating the stability of the model:
$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
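As a worked illustration of these indicators, the Python sketch below derives the per-class TP, FP, FN and TN counts from a multi-class confusion matrix in a one-vs-rest fashion and applies the standard definitions. The 3-class matrix is made-up toy data, not a result from this study, and the function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(confusion):
    """Per-class metrics from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[r][k] for r in range(n)) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        })
    return results


# Toy 3-class confusion matrix (illustrative only).
toy = [[8, 2, 0],
       [1, 9, 0],
       [0, 0, 10]]
metrics = per_class_metrics(toy)
overall_accuracy = sum(toy[i][i] for i in range(3)) / 30
print(metrics[0], overall_accuracy)
```

For the first class of the toy matrix, TP = 8, FN = 2 and FP = 1, so precision is 8/9, recall is 0.8, and the F1-score is 16/19.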

Characterization of mitochondria-targeted probes
Figure 2(a) shows the transmission electron microscopy (TEM) image of the synthesized AuNRs-MLS, functioning as a mitochondria-targeted probe to enhance cellular Raman signals. The probe consists of an AuNR covered with a 3-nm-thick MLS film, presenting an elliptical shape with a long axis of ∼104 nm and a short axis of ∼58 nm. The AuNRs used here exhibit prominent absorption bands at 500-540 nm and 600-700 nm (Fig. 2(b), black line), the latter of which overlaps with the excitation light (633 nm). This facilitates the excitation of the surface plasmon resonance of the AuNR, resulting in enhancement of the Raman spectra, i.e., SERS. The MLS is incorporated to direct the probe to mitochondria. Compared with pure AuNRs, the absorption spectrum of the probe is unchanged in shape but slightly shifted, as shown in Fig. 2(b) (red line). The shift can be attributed to the coupling effect of MLS, indicating that MLS was successfully attached to the AuNR surface.
To validate the enhancement effect of the probe on Raman signals, the probe was co-cultured with iPSCs, and the Raman signals of iPSCs with and without the probe were measured, as shown in Fig. 2(c). It can be seen that the original cellular Raman signal is relatively weak. The presence of the probe results in a significant enhancement of the cellular Raman signal, while the shape of the spectrum remains largely unchanged. This demonstrates that the probe facilitates the acquisition of stable and effective Raman spectra without compromising the accuracy of Raman signal analysis.
Further, to validate the targeting specificity of the probe for mitochondria, fluorescence experiments were conducted and the results are shown in Fig. 2(d-f). The probe and the iPSCs were labeled with fluorescein isothiocyanate (FITC) and Mito-Tracker Red CMXRos, respectively; the former exhibits green fluorescence under blue excitation, while the latter exhibits red fluorescence under green excitation. Figure 2(d) shows the fluorescence image of iPSCs with the probe under blue light excitation, where the positions of green fluorescence represent the locations of the probe. Similarly, when the excitation light is changed to green, the positions of red fluorescence indicate the locations of mitochondria (Fig. 2(e)). The merged image shown in Fig. 2(f) presents a high degree of overlap between green and red fluorescence, indicating the successful targeting of the probes to the mitochondria in iPSCs.

Classification and spectral analysis of the differentiation process of iPSC into NPC
Figure 3(a) illustrates the differentiation of iPSC into NPC using a single bone morphogenetic protein (BMP) inhibition method [36]. The differentiation process was divided into six stages based on the timing of inducing agent addition. Figure 3(b) presents the corresponding phase contrast images of cells at different differentiation stages. In the undifferentiated stage, iPSCs exhibited stable growth and reached the required size for differentiation. Additionally, they exhibited clear colony edges. This moment before differentiation is denoted as day 0, and the cellular state at this point is referred to as the first state, represented by iPSC. Then, the differentiation-inducing agent M1 was added to the iPSCs. As differentiation commences, iPSCs initiate the transition to the EB. The cellular state one day after the M1 addition is designated as EB1. Cells in this state do not exhibit significant morphological differences compared to the previous state. On the fourth day of cell differentiation, the cells cluster and overlap with each other, transitioning from an adherent to a suspended state, which is denoted as EB4. On the seventh day of cell differentiation, denoted as EB7, the cells within the EBs undergo further changes and maturation, although there is no significant morphological difference compared to the previous state. On the tenth day of differentiation, the cellular state is denoted as EB10, and the cells transition from suspension to adherence, marking the beginning of their transformation into NPC. From the corresponding phase contrast image, it can be observed that at this stage the cells can be mainly divided into two types: one is still in the EB stage and the other has differentiated into early-stage NPC. The first ten days of iPSC differentiation are collectively referred to as the EB stage. Next, the cells were treated with the differentiation inducer M2 to facilitate the transformation from EB into NPC and were subjected to adherent culture. On the fifteenth day of cell differentiation, the cells exhibit a stable morphology and become dispersed, and NPC can be obtained following repeated sorting and purification. The cellular state is designated as NPC, signifying the completion of iPSC differentiation into NPC.
Throughout the entire differentiation process, it can be seen that the EB stage spans a wide time range with complex morphological changes. However, relying solely on morphological features is insufficient for the detailed classification of cells in the EB stage. Hence, the development of a novel cell state identification method with high sensitivity holds significant practical implications.
To validate the success of differentiation, we conducted additional immunofluorescence (IF) experiments, and the results are shown in Fig. 3(c). IF staining was performed to detect the pluripotency markers Oct4 and SOX2 in iPSC, as well as the marker proteins Nestin and PAX6 in NPC. Both SOX2 and Oct4 are crucial genes involved in regulating iPSC self-renewal and pluripotency [39,40], while PAX6 and Nestin are critical regulators of NPC proliferation and differentiation [41][42][43]. Hoechst is employed to localize the nuclei of all cells. For the cells on day 0, SOX2 and Oct4 are expressed while Nestin is not, indicating that the cells are undifferentiated iPSCs. For the cells on day 15, SOX2 and Oct4 show weak or even no expression while PAX6 and Nestin are expressed, indicating the successful induction of iPSC into NPC.
Figure 4(a) presents the SERS spectra of the cells in the six stages during the differentiation process. Significant characteristic peaks at 776 (phosphatidylinositol), 1002 (phenylalanine), 1337 (CH2/CH3 swing torsion mode in tryptophan), 1440 (lipids), and 1660 cm⁻¹ (amide I) were identified. It can be seen that the differences in the SERS spectral shapes of cells at different differentiation stages were relatively minor. This observation can be attributed to several factors. Firstly, the biomolecular composition of the cells underwent only small changes throughout the differentiation process, resulting in less pronounced alterations in the SERS spectra. Secondly, the data were influenced by the normalization and averaging procedures. During spectrum collection, we noted substantial fluctuations in the spectra of the same cell at different locations. To mitigate the impact of intensity variations on classification, we normalized the spectral data and then averaged 20 spectra from each category. Normalization reduced the amplitude difference between spectra. Moreover, the averaging process diminished local peaks and fluctuations, making the spectra more representative. Although this might lead to a smaller amplitude change in the averaged SERS spectra compared to individual SERS spectra, it enhanced the statistical significance of our results. To visually explore the relationship between differentiation stages and SERS spectra, the peak intensities from the SERS spectra in Fig. 4(a) were extracted and replotted in a bar chart, as shown in Fig. 4(b). As the differentiation process progresses, the cellular SERS spectra change gradually. For example, the 1440 cm⁻¹ peak displayed an overall increasing trend during the differentiation process. In addition, in comparison to iPSC, NPC exhibited higher spectral intensities at 1002, 1337, and 1660 cm⁻¹, although the intensities fluctuate during the differentiation process. The results show that the cellular SERS spectra do change during the differentiation process. However, establishing a clear relationship between the two, or using SERS spectra as a basis for identifying differentiation stages, requires comprehensive analysis and organization of a large amount of SERS spectral data, which is precisely what deep learning excels at.

Identification of induced pluripotent stem cell differentiation stages by CNN-S
The CNN-S model was trained using SERS spectra for precise identification of cellular states. The x- and y-axes of the ROC curves represent the false positive rate (FPR, equal to 1 − specificity) and the sensitivity (true positive rate, TPR), respectively. The closer a ROC curve lies to the upper left corner, the higher the TPR and the lower the FPR of the model; the curves obtained here indicate that the trained model can accurately discriminate and distinguish cell states, regardless of the differentiation stage they are in. The confusion matrix in Fig. 5(c) displays the prediction accuracy of the CNN-S model for each cell type on the test datasets. It can be seen that the prediction accuracy for all cell types is above 96% and the average prediction accuracy is as high as 98.50%, which aligns with the results obtained from the ROC curves (Fig. 5(b)). The above results are calculated based on the Accuracy metric. In addition, a more comprehensive evaluation of the model was conducted using three other evaluation metrics: Precision, Recall, and F1-score. The results shown in Fig. 5(d) reveal that the CNN-S model achieves outstanding performance for each metric across all classes. Further, to explore the classification limits of this model, we closely monitored each day of the EB stage, providing a more detailed 12-class classification of the overall differentiation process, including iPSC, EB 1-EB 10, and NPC. The results shown in Fig. 5(e) demonstrate that, although the number of categories expanded from 6 to 12, the proposed method still achieves an average accuracy of 95.5%.
For comparison, traditional machine learning algorithms including KNN, SVM, and RF were used to accomplish the cell state identification, and the results are shown in Table 2. It can be seen that KNN achieved an average accuracy of 78.8%, while SVM with RBF and Sigmoid kernel functions achieved average accuracies of 61.3% and 47.8%, respectively. The RF algorithm had an average accuracy of 80.5%. In contrast, CNN-S outperformed all other methods with an average accuracy of 98.5%. The accuracy of CNN-S surpasses that of the other methods primarily due to two factors. Firstly, when processing one-dimensional spectral data, CNN efficiently captures local features through convolution operations, enabling the identification of specific patterns and structures and thereby enhancing the spectral analysis results. Additionally, the appropriate number of network layers and parameters of the CNN was determined through ten-fold cross-validation, further contributing to its superior performance. These results suggest that combining SERS spectra with CNN can achieve high sensitivity and precision for recognizing the stem cell differentiation state at the single-cell level.

Impact of spectral features on cell differentiation status identification
In addition, the impact of spectral features on the cell state identification task was further investigated. SERS spectra from different bands containing distinct characteristic peaks, rather than the full spectra, were input into the CNN-S model, and the average prediction accuracy for the six cell states is shown in Fig. 6(a). It can be seen that an identification accuracy of over 80% could be achieved by solely utilizing the SERS spectral peaks centered at 776 or 1002 cm⁻¹. In comparison, relying solely on the SERS spectral peaks centered at 1337, 1440, or 1660 cm⁻¹ results in lower prediction accuracy, but it still exceeds 70%. The results indicate that different SERS spectral features contribute differently to cell state identification, with a relatively greater contribution from the peaks centered at 776 and 1002 cm⁻¹. Based on the above results, SERS spectral datasets with various spectral lengths were constructed and the impact of spectral length on the cell state identification task was investigated. Fig. 6(b) shows the average prediction accuracy at different spectral lengths. All the spectra start at 600 cm⁻¹. It can be seen that the prediction accuracy increases with increasing spectral length. The average accuracy reaches 92% when the spectra range from 600 to 1000 cm⁻¹. By expanding the spectra to 600-1100 cm⁻¹, the average accuracy improves to approximately 95%, demonstrating the importance of the Raman peak centered at 1002 cm⁻¹. As the spectra extend to 600-1400 cm⁻¹, a high accuracy of 98% is achieved, and the accuracy remains stable as the spectral length increases further. The results show that the spectral feature differences in the Raman bands at 776, 1002, and 1337 cm⁻¹ played a crucial role in accurately classifying and identifying the six distinct cell states during the differentiation process. The spectra ranging from 600 to 1400 cm⁻¹ are sufficient for the identification task of the CNN-S model, which is beneficial for shortening the spectral acquisition region to enhance spectral acquisition speed.
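Building datasets of different spectral lengths amounts to cropping each spectrum to a wavenumber window before it is passed to the model. The Python sketch below illustrates this under stated assumptions: the function name `crop_spectrum`, the coarse 100 cm⁻¹ sampling grid, and the intensity values are hypothetical and serve only to show the operation.

```python
def crop_spectrum(wavenumbers, intensities, low=600.0, high=1400.0):
    """Keep only the points whose wavenumber lies in [low, high] cm^-1."""
    kept = [(w, i) for w, i in zip(wavenumbers, intensities)
            if low <= w <= high]
    return [w for w, _ in kept], [i for _, i in kept]


# Hypothetical coarse grid covering 600-1800 cm^-1 in 100 cm^-1 steps.
wavenumbers = [600.0 + 100.0 * k for k in range(13)]
intensities = [float(k) for k in range(13)]

cropped_w, cropped_i = crop_spectrum(wavenumbers, intensities)
print(cropped_w[0], cropped_w[-1], len(cropped_w))
```

Note that a shorter input changes the number of features reaching the fully connected layer, so a model trained on cropped spectra needs a correspondingly sized linear layer.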

Conclusion
In summary, an approach combining SERS and CNN was proposed, which demonstrated outstanding performance in identifying cell states during stem cell differentiation. The differentiation process from iPSCs to NPCs was divided into six stages, including the EB stage for the first time. Stable and effective mitochondrial Raman signals for cells at different stages were obtained by utilizing the SERS effect of mitochondria-targeted probes. By training with over 6000 SERS spectra, the CNN-S model achieved precise and rapid identification of cell states with a prediction accuracy of 98.5%, which far surpasses the results of traditional machine learning algorithms. Further, for the more detailed 12-class classification, the model achieved a high prediction accuracy of 95.5%. Here, accurate, effective, and sufficient Raman spectral datasets, along with an appropriate network architecture, are prerequisites for achieving high prediction accuracy. In addition, the contribution of spectral characteristics and spectral length to high-precision identification was investigated. The results show that the Raman peaks at 776, 1002, and 1337 cm⁻¹ play a crucial role in accurately classifying cell states during differentiation, and that the 600 to 1400 cm⁻¹ range of the SERS spectra is sufficient for the cell state identification task. These findings offer valuable insights into the selection of appropriate spectral bands to monitor cell states, presenting significant potential for the precise identification of cell states and a deeper understanding of cellular differentiation processes. In our study, the prediction accuracy of stem cell differentiation using SERS combined with CNN reached 98.5%, providing valuable insights into the characterization of iPSC, EB, and NPC differentiation stages. Additionally, the accuracy of the network in daily monitoring of the EB stage reached 95.5%. These findings underscore the potential of SERS-CNN integration as a powerful tool for precise and efficient monitoring of cell differentiation processes. However, it is important to note that CNN imposes strict requirements on the size of the dataset, and overfitting may occur when only limited sample data are available, which can reduce the generalization performance of the model. We anticipate that this study will enrich our understanding of cellular differentiation processes and make noteworthy contributions to related fields.

Fig. 2. (a) Transmission electron microscopy (TEM) image of the AuNRs-MLS probe. (b) UV-vis spectra of AuNRs and AuNRs-MLS. (c) Raman spectra of iPSC samples with (red line) and without (black line) the AuNRs-MLS probe. (d-f) Fluorescence images of iPSCs incubated with AuNRs-MLS probes. AuNRs-MLS probes were stained green with FITC (d) and mitochondria of iPSCs were stained red with Mito Tracker Red CMXRos (e). (f) Merged fluorescence image of (d) and (e).

Fig. 3. (a) Process and timeline of iPSC differentiation into NPC using a single inhibition method. (b) Phase contrast images of cells at different differentiation stages; the red line in EB10d is the dividing line between EB and early NPC. (c) Immunostaining with different protein markers in iPSCs and NPCs. The nuclei are stained blue with Hoechst. Scale bar: 100 µm.

Fig. 4. (a) SERS spectra of cells at six states during stem cell differentiation. Note: The average SERS spectra are shown as solid lines. (b) Evolution of the characteristic peak intensities across the six stages.
A 10-fold cross-validation mechanism was employed to obtain the optimal model. To prevent overfitting, the loss and accuracy on both the training and validation datasets were continuously monitored during the training process. The loss and accuracy curves of the training and validation sets across epochs are shown in Fig. 5(a). After 40 epochs, both the loss and accuracy curves of the training and validation datasets converged, and the curves remained stable in subsequent epochs, indicating that no overfitting occurred during the training process. The receiver operating characteristic (ROC) curves in Fig. 5(b) reflect the sensitivity and specificity of the model in distinguishing cells in different states.

Fig. 5. (a) Accuracy and loss function curves versus epochs. (b) ROC curves for the model. The average AUC values for each class are all greater than 0.99. (c) The confusion matrix of the CNN-S model for each cell state on the test dataset. (d) Performance of the CNN-S model under different evaluation metrics. (e) The confusion matrix of the CNN-S model for 12 cell states on the test dataset.

Fig. 6. (a) Prediction accuracy curve of the CNN-S model using SERS spectra including different spectral peaks as input. (b) Prediction accuracy curve of the CNN-S model for different spectral lengths. Note: Error bars in the curves are standard deviations obtained through tenfold cross-validation.