Machine learning-based LIBS spectrum analysis of human blood plasma allows ovarian cancer diagnosis

Early-stage screening and diagnosis of ovarian cancer represent an urgent need in medicine. The usual ultrasound imaging and cancer antigen CA-125 test, when prescribed to a suspected population, still require reconfirmation. Spectroscopic analyses of blood, at the molecular and atomic levels, provide useful supplementary tests when coupled with effective information extraction methods. Laser-induced breakdown spectroscopy (LIBS) was employed in this work to record the elemental fingerprint of human blood plasma. A machine learning data treatment process was developed combining feature selection and regression with a back-propagation neural network, resulting in classification models for cancer detection among 176 blood plasma samples collected from patients, also including ovarian cyst and normal cases. Cancer diagnosis sensitivity and specificity of respectively 71.4% and 86.5% were obtained for randomly selected validation samples.


Introduction
Ovarian cancer, one of the common gynecologic cancers [1], presents a high mortality rate when a patient is diagnosed at an advanced stage [2]. The absence of specific clinical symptoms, combined with the lack of a high-performance diagnostic method, delays effective treatment [3]. An early screening method is therefore expected by clinical medicine [4]. At the same time, the occurrence of ovarian cancer [1] does not yet justify systematic screening of a large population with a high-precision but invasive diagnostic technique such as tissue biopsy. Noninvasive techniques such as ultrasound imaging often require high expertise to distinguish between cancer and a benign abnormal case such as a cyst, while the blood cancer antigen CA-125 test generally presents insufficiently robust diagnostic accuracy [5]. Supplementary tests are thus extremely important for an accurate diagnosis of ovarian cancer. The idea is to develop an intermediate noninvasive technique with higher diagnostic performance than ultrasound imaging and the CA-125 test, to help practitioners decide on further diagnosis, with biopsy for example. Blood analysis at the molecular or atomic level could be an efficient way to satisfy this need, provided that it is coupled with a suitable and effective information extraction method.
Optical spectroscopy as an analytical technique is able to acquire the fingerprint of a blood sample. The obtained information is often complex in nature and implicit in expression, as is generally the case for analysis and diagnosis in biology and medicine. Modern data mining methods [6] developed within artificial intelligence (AI), such as machine learning and deep learning [7], are therefore both required and powerful for extracting suitable characteristic features of a sample. Recent progress in medical image treatment with the AI approach fully demonstrates the capability of the related algorithms for classification and identification of medical images [8,9]. Combined with machine learning algorithms, spectroscopic techniques have been implemented for cancer diagnosis, including attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy [10], Raman spectroscopy [11] and surface-enhanced Raman spectroscopy (SERS) [12]. Laser-induced breakdown spectroscopy (LIBS), as a multi-elemental detection technique [13,14], has demonstrated its potential in biomedical applications [15], especially for bacteria detection and identification [16][17][18], biological tissue mapping [19], neurodegenerative disease diagnosis [20] and cancer screening [21]. Combined with chemometrics or, more recently, machine learning-based regression models [22,23], LIBS is able to classify and identify various biological samples according to their LIBS spectra [24]. For cancer diagnosis with LIBS more specifically, early works often involved samples harvested from laboratory model animals and were designed to study spectral markers pertinent for identification and detection of the targeted diseases [25,26]. More recently, tests using human blood samples were reported with a quite limited number of samples, on the order of several tens [27][28][29], which may limit the significance of these studies for clinical applications given the large variability among patients.
It is also worth pointing out that the above-mentioned works on cancer detection with LIBS focused on the differentiation between normal and cancer cases; other intermediate or evolving abnormal cases, cysts for example, were not included in the studied samples. Furthermore, among the various operation modes with different types of biological samples, direct analysis of blood-related liquids is preferred for a clinical approach because it corresponds to an easy, cost-effective and commonly applied biomedical test, which is suitable for wide-coverage screening and can be incorporated in a routine physical examination.
The present work was designed according to a clinical application scenario where a set of blood plasma samples is collected from a population of female patients after an initial medical imaging examination intended for ovarian cancer screening, and their health situation needs to be further diagnosed to decide among the three cases of normal, ovarian cyst and ovarian cancer. LIBS analysis of blood plasma was thus chosen to provide supplementary diagnostic information. Although the sample collection was a time-consuming task, a significantly large set of 176 blood plasma samples was collected from a population of female patients examined by the hospitals, including the three finally diagnosed case-types of normal, ovarian cyst and ovarian cancer. The measurements were performed using the liquid sample preparation method of surface-assisted LIBS introduced in our previous works [30,31]. The recorded LIBS spectra were used to study classification and identification models. One third of the samples of each case-type was selected as the model validation samples, while the rest of the samples were used as the model training samples. The study of the classification and identification models included the steps of data pretreatment, feature selection, neural network training and model validation. In the following, we first present the samples, the experimental setup and the measurement protocol. The spectral data treatment method is then presented in detail. In the section "results and discussions", we present the way the classification model was optimized step by step, together with the principle of each optimization process and the role of several common elements in blood plasma, such as K, Na and Mg, before presenting and discussing the performance of the finally optimized model.

Samples, experimental setup, and measurement protocol
Blood samples were collected from 176 patients examined by Women's Hospital of the School of Medicine, Zhejiang University and Tongde Hospital of Zhejiang Province in China. The resulting plasma samples were stored at −80 °C before being prepared for the experiment. Plasma, the liquid component of blood, is known to play a vital role in an intravascular osmotic effect that keeps the electrolyte concentration balanced, and in protecting the body from infection and other blood disorders [32]. A disease in the body can influence its chemical composition, the analysis of which can therefore provide indications of the health state of the patient. The advantage of using plasma instead of whole blood for LIBS is to avoid spectral interference from lines emitted by iron, which is contained in red blood cells at a relatively high concentration. For the purpose of the present study, all the samples were labeled with the help of the usual medical diagnosis methods practiced in the hospitals. Thereby, 79 samples belonged to the healthy normal case (45%), 34 were diagnosed as cyst cases (19%) and 63 as ovarian cancer cases (36%). For an effective fingerprint measurement, the liquid sample preparation protocol of surface-assisted LIBS [30,31] was applied. As shown in Fig. 1(a) and (b), 150 µL of each liquid sample at room temperature was picked up with a pipette and dropped on a high-purity graphite plate (purity ≥ 99.92% according to the provider) with a polished and cleaned surface of 20 × 20 mm² (thickness 5 mm). The liquid was then spread uniformly over the whole surface of the plate with the tip of the pipette, which was changed for each sample to avoid cross contamination. The liquid-covered graphite surface was put under an infrared lamp and heated for about 10 minutes for drying. The sample was then left to cool down at ambient temperature for about 10 minutes.
The result was a thin and semi-transparent layer of residual of the blood plasma on the surface of a graphite plate.
The used experimental setup with its central part illustrated in Fig. 1(c) was the same as in our previous work [33] and the detailed description can be found elsewhere [34]. Briefly, a Q-switched Nd:YAG laser operating at its fundamental of 1064 nm delivered 7 ns and 30 mJ laser pulses to the samples after being focused by a lens of 50 mm focal length for the LIBS measurements. A sample was placed on a motorized 3-D displacement stage allowing its translation synchronized with the laser pulses in order to perform single-shot LIBS spectrum recordings. During the experiment with a sample, replicate measurements were performed on its surface and distributed in the form of a matrix of ablation craters. A center-to-center distance of 0.8 mm was left between neighboring craters to avoid their overlapping. Emission from laser-induced plasma was collected by a two-lens system and captured by an optical fiber connected to an echelle spectrometer with a wide spectral range from 230 nm to 900 nm and a spectral resolution power of λ/∆λ = 5000, equipped with an intensified charge coupled device (ICCD) camera (Mechelle 5000 and iStar from Andor Technology). The ICCD camera was triggered by laser pulses with a detection delay of 0.8 µs and gate width of 2.5 µs. Measurements were randomly performed with samples of different case-types in order to avoid systematic drift of the setup. For each sample, a measurement matrix of about 400 to 500 single-shot ablations together with the corresponding replicate LIBS spectrum recordings was performed on the substrate surface covered by a residue of blood plasma, yielding a total number of 85176 single-shot spectra for the 176 samples. In Fig. 1(d), typical replicate-averaged spectra are shown for the three case-types of blood plasma samples and for a substrate graphite. 
We can see that the prominent spectral features, on an intensity scale of 10^5 counts, correspond to the major metal and nonmetal elements in a biological material: K, Na, Ca, Mg and C, H, O, N. Notice that the lines from the last 4 nonmetal elements can also be contributed by the substrate for C, and by the ambient gas for H, O, N, as shown in Fig. 1(d). On a much smaller intensity scale of 10^3 counts, some minor elements, Fe, Si, P, Cu, can be identified in the blood plasma spectra, as shown in the insets in Fig. 1(d). A first glance at the spectra does not reveal any obvious difference between those of the blood plasma samples, indicating the need for a more sophisticated data processing method to reveal their specific characteristics.

Data treatment procedure and methods
The data treatment flowchart is shown in Fig. 2. It consisted of several steps: spectral pretreatment, organization of the training and validation samples and data sets, feature selection, standardization, model training and validation.

Data pretreatment
Data pretreatment included the following operations: i) Spectrum averaging, in order to reduce the fluctuations due to shot-to-shot laser pulse energy jitter and sample inhomogeneity. For a given sample, the raw single-shot replicate spectra were randomly arranged in a sequence, and a first average spectrum was generated by averaging over the first 30 spectra, from No. 1 to No. 30. A second average spectrum was then generated by shifting the averaging range by 10 spectra to include the spectra from No. 11 to No. 40. The operation was repeated until all the raw single-shot spectra of the sample were involved in the generation of average spectra. Between 37 and 47 average spectra were generated for each of the samples, resulting in 7887 average spectra for the 176 samples of the three case-types. ii) Normalization: each average spectrum was normalized by its total spectral intensity, calculated by integrating the spectral intensity over the whole spectral range. The above operations generated the pretreated spectra.
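The two pretreatment operations can be illustrated with a minimal sketch on synthetic data. The function name `pretreat`, the array shapes and the random seed are hypothetical; the sketch also assumes that an incomplete final window is simply dropped and that a plain sum over channels stands in for the spectral integration, neither of which is specified in the text.

```python
import numpy as np

def pretreat(single_shot, window=30, step=10, seed=0):
    """Average shuffled single-shot spectra in overlapping windows,
    then normalize each average spectrum by its total intensity."""
    rng = np.random.default_rng(seed)
    # Random arrangement of the replicate spectra of one sample
    shuffled = single_shot[rng.permutation(len(single_shot))]
    # Window starts: spectra 1-30, 11-40, 21-50, ... (0-based indices here)
    starts = range(0, len(shuffled) - window + 1, step)
    averaged = np.stack([shuffled[s:s + window].mean(axis=0) for s in starts])
    # Assumption: total intensity approximated by the sum over all channels
    return averaged / averaged.sum(axis=1, keepdims=True)

# Synthetic stand-in for one sample: 400 single-shot spectra, 50 channels
demo = np.random.default_rng(1).random((400, 50)) + 0.1
out = pretreat(demo)
print(out.shape)  # (38, 50)
```

With 400 single-shot spectra, this windowing yields 38 average spectra, which is consistent in order of magnitude with the 37 to 47 average spectra per sample reported above.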

Dataset organization
For the further steps in the study of identification and classification models, we needed to set aside a part of the samples as model validation samples in order to assess the prediction performance of the trained models. In a machine learning data treatment procedure, the validation samples do not take part in the model training process, which relies exclusively on the model training samples. It is however required that the training samples and the validation samples share the same feature space with similar distributions [35]. In the case of a regression model trained and validated for quantitative analysis, such a requirement can be satisfied by displaying the standard samples as a function of their known concentrations of the element to be analyzed, and selecting validation samples in such a way that their elemental concentrations are randomly and uniformly distributed within the concentration range covered by the training samples. Similarly, for our task of identifying and classifying among the 3 case-types of normal, cyst and cancer, the selection of the validation samples consisted in taking one third of the samples from each of the 3 case-types, in such a way that the selected samples are statistically equivalent to the remaining ones. Unlike the case of a regression model for quantitative analysis, our task of identification and classification did not rely on a clearly identified parameter, such as the concentration of an element to be determined. The classification of a sample into a given case-type depends on a rather undefined ensemble of parameters which can be extracted from the LIBS spectrum.
This is why we decided to select the validation samples according to principal component analysis (PCA) scores, although this method is not the only one that could be used, and the actual selection of the validation samples could slightly influence the final performance of the trained models.
The pretreated spectra of each sample were therefore further averaged, generating a mean spectrum to represent the sample in a PCA plot. The 2-D PCA plots of the samples belonging respectively to the 3 case-types are presented in Fig. 3(a), (b) and (c). For each case-type, we manually selected one third of the samples as validation samples in such a way that these samples were distributed uniformly over the different areas of the PCA plot occupied by the ensemble of the samples, as shown in Fig. 3(a), (b) and (c), where for each case-type the selected validation samples are represented by crosses, while the rest of the samples, used as the model training samples, are represented by circles. As a convention for visual recognition in this paper, we used green, orange and red respectively for the normal, cyst and cancer cases. The detailed composition of the validation and training sets of samples is given in Table 1. The training samples with their pretreated spectra provided the training data set, and the validation samples with their pretreated spectra provided the validation data set.
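As an illustration of how the 2-D PCA coordinates of the per-sample mean spectra might be obtained, the following sketch computes the first two principal component scores with a singular value decomposition. The function name `pca_scores` and the synthetic data are hypothetical, and the sketch assumes that the PCA is computed on mean-centered spectra of one case-type at a time; the manual selection of validation samples in the plot is not reproduced.

```python
import numpy as np

def pca_scores(mean_spectra, n_components=2):
    """Project per-sample mean spectra onto their first principal components."""
    centered = mean_spectra - mean_spectra.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in: 34 samples of one case-type, 200 spectral channels
rng = np.random.default_rng(0)
mean_spectra = rng.random((34, 200))
scores = pca_scores(mean_spectra)
print(scores.shape)  # (34, 2)
```

Each row of `scores` gives the (PC1, PC2) position of one sample, from which validation samples spread over the occupied area could then be picked.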

Feature selection
A feature selection process based on the SelectKBest algorithm [36] with a chi-squared test [37] was applied to the ensemble of pretreated spectra of the 118 training samples. In statistics, the chi-squared test is used to determine whether there are statistically significant differences among two or more distributions of data by calculating the distances separating the distributions. In our case, for each spectral channel, a chi-squared value was calculated for the intensities of the individual pretreated spectra with respect to the mean channel intensity over all the pretreated spectra. The resulting value represented the distances among the 3 ensembles of channel intensities associated with the 3 case-types. The calculation was carried out for the 23826 channels of the spectrum, resulting in channel scores used to rank the channels from the highest (large distances among the populations) to the lowest. The top 100 spectral channels were retained to determine the selected features in all the pretreated spectra of the training data set. These spectral features were used as the input variables to train the classification model. At the same time, the retained channels were applied to the pretreated spectra of the 58 validation samples in order to identify the 100 spectral features used for the assessment of the trained model in the model validation step.
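A minimal sketch of this selection step is given below, using scikit-learn's `SelectKBest` with the `chi2` score function on synthetic data. The array sizes, labels and injected class dependence are hypothetical, and scikit-learn's `chi2` scoring (which requires non-negative features) is assumed to stand in for the chi-squared calculation described above; it may differ in detail from the authors' implementation.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
n_train, n_val, n_channels = 300, 120, 2000  # synthetic sizes, not the paper's
X_train = rng.random((n_train, n_channels))  # non-negative intensities
y_train = rng.integers(0, 3, size=n_train)   # 0 = normal, 1 = cyst, 2 = cancer
X_train[:, :5] += 0.5 * y_train[:, None]     # make 5 channels class-dependent
X_val = rng.random((n_val, n_channels))

# Fit on the training spectra only, keeping the 100 highest-scoring channels
selector = SelectKBest(score_func=chi2, k=100)
selector.fit(X_train, y_train)

# The same retained channels are then applied to both data sets
X_train_sel = selector.transform(X_train)
X_val_sel = selector.transform(X_val)
kept = selector.get_support(indices=True)    # indices of retained channels
print(X_train_sel.shape, X_val_sel.shape)
```

Fitting the selector on the training data alone and reusing it for the validation data mirrors the separation between "selected" and "identified" features described in the text.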
The results of feature selection are shown in Fig. 4. The scores obtained by the 100 highest ranked spectral channels are shown in Fig. 4(a) as red circles, and in Fig. 4(b) the corresponding spectral features are indicated by red circles in an average spectrum of a normal sample to show their respective intensities. We can see that the most important spectral features for the identification and classification according to the 3 targeted case-types of normal, cyst and cancer belong to the K I 766.5 nm, K I 769.9 nm, Na I 589.0 nm, Na I 589.6 nm, Na I 819.5 nm, Mg II 279.6 nm, Mg II 280.3 nm, and C I 247.9 nm lines. This result can be understood by the fact that certain major metal elements, like sodium, potassium and to a lesser extent magnesium, are the most important electrolytes in living systems; their concentrations in blood plasma play a vital role in maintaining homeostasis in the body [38]. Concentration imbalance of the electrolytes can be the cause of abnormalities in the human body and should play an important role in the identification and classification of the case-types of normal, cyst and cancer in our study. The case of carbon is less straightforward to explain because of the additional contribution from the substrate, as mentioned above. We can also remark the absence of minor elements among the 100 highest ranked spectral features. A detailed look at the scores obtained by the detected minor elements reveals a ranking of 464th for Si and 4923rd, 5151st and 6465th respectively for Cu, P and Fe, far behind the major elements discussed above. This observation shows the marginal role of the minor elements detected in blood plasma in the diagnosis of ovarian cancer, most likely due to their very low line intensities.

Standardization
As a usual operation in a machine learning data treatment facilitating the gradient descent type model optimization, the standardization was implemented in our study for the selected and identified features respectively of the training and validation samples. The selected features of the training data set were first scaled with a linear transformation which brought their values into the range of [0, 1]. For a given selected spectral channel, the maximal (I max ) and minimal (I min ) values of the channel intensities were identified over all the pretreated spectra of all the samples of the training set. The standardized channel intensity of an actual spectrum was then calculated by (I − I min )/(I max − I min ). The pair of values I max and I min were then applied to the same spectral channel of the validation data set. The operation generated respectively for the training and the validation data sets the standardized selected features and the standardized identified features.
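The scaling can be sketched as follows, assuming the features are stored as NumPy arrays with one row per spectrum and one column per selected channel. The function names and the small arrays are hypothetical. Because I min and I max come from the training set only, standardized validation values may fall outside [0, 1].

```python
import numpy as np

def fit_minmax(train):
    """Per-channel I_min and I_max over all training spectra."""
    return train.min(axis=0), train.max(axis=0)

def apply_minmax(x, i_min, i_max):
    """Standardized intensity (I - I_min) / (I_max - I_min)."""
    return (x - i_min) / (i_max - i_min)

# Toy data: 3 training spectra and 1 validation spectrum, 2 channels
train = np.array([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]])
val = np.array([[4.0, 5.0]])

i_min, i_max = fit_minmax(train)
train_std = apply_minmax(train, i_min, i_max)  # values within [0, 1]
val_std = apply_minmax(val, i_min, i_max)      # 1.5 and -0.25: outside [0, 1]
print(train_std, val_std)
```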

Neural network training by cross-validations
The classification model training process was implemented according to our previous work initially devoted to quantitative analysis with LIBS spectra from soil samples with a regression model based on back-propagation neural network (BPNN) [23]. Since its introduction into LIBS data analysis, this method has been applied to various scenarios of LIBS analysis including laser pulse energy variation correction [33], chemical matrix effect correction in rock analysis [39], determination of carbon concentration in steel [40], and simultaneous determination of concentrations of water and potassium in potash online analysis [41]. In the present work, the method was adapted to the case of identification and classification of a collection of samples. The used neural network had 3 layers, with an input layer of 100 neurons corresponding to the 100 standardized selected features of each pretreated training spectrum, a hidden layer of 50 neurons, and an output layer of 3 neurons corresponding to the 3 output case-types. A 5-fold cross-validation optimization procedure was employed for neural network training with the pretreated spectra of the training samples. For each fold of the cross-validation, the identification of a sample among the 3 case-types of normal, cyst and cancer, was decided according to the majority of the individual identifications for the test spectra of the sample. In the end of the cross-validation, an ensemble of definitive identifications was assigned to all the training samples according to the majority of the 5 cross-validation identifications. The calibration performance of the trained models was then assessed by a comparison between the models-assigned case-types of the training sample and their label values, and presented in a confusion matrix for the training samples, together with the associated figures of merit. A more detailed description of the model training process can be found in the Supplement 1 associated to the paper.
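The network architecture and the spectrum-to-sample majority vote can be sketched as below. This is only an illustration under stated assumptions: scikit-learn's `MLPClassifier` (a feed-forward network trained by back-propagation) stands in for the authors' BPNN, the data are synthetic, the helper `majority_vote` is hypothetical, and the 5-fold cross-validation procedure detailed in Supplement 1 is not reproduced.

```python
import numpy as np
from collections import Counter
from sklearn.neural_network import MLPClassifier

def majority_vote(spectrum_preds, sample_ids):
    """Assign to each sample the most frequent prediction among its spectra.
    Ties are resolved in favor of the label encountered first."""
    votes = {}
    for pred, sid in zip(spectrum_preds, sample_ids):
        votes.setdefault(int(sid), []).append(int(pred))
    return {sid: Counter(v).most_common(1)[0][0] for sid, v in votes.items()}

# Synthetic stand-in: 30 samples, 40 pretreated spectra each, 100 features
rng = np.random.default_rng(0)
n_samples, spectra_per_sample, n_features = 30, 40, 100
sample_labels = np.arange(n_samples) % 3           # 3 case-types
sample_ids = np.repeat(np.arange(n_samples), spectra_per_sample)
y = sample_labels[sample_ids]
X = rng.random((len(y), n_features))
X[:, :10] += 0.3 * y[:, None]                      # class-dependent offset
X /= 1.6                                           # keep values within [0, 1]

# 100 input neurons, one hidden layer of 50 neurons, 3 output neurons
model = MLPClassifier(hidden_layer_sizes=(50,), max_iter=300, random_state=0)
model.fit(X, y)

sample_pred = majority_vote(model.predict(X), sample_ids)
print(len(sample_pred))
```

The vote turns many per-spectrum decisions into a single per-sample identification, which is the quantity reported in the confusion matrices.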

Model validation by the validation data set
The prediction performance of the trained models was assessed in this step by the validation data set which was excluded from the training process. The validation process was similar to the cross-validation tests in the model training step. The pretreated spectra of a given validation sample with 100 standardized identified features each, were used as an ensemble of data to successively test the 5 trained models,

Initial models leading to classification according to 3 case-types
The classification results with the above discussed initial models leading to classification according to 3 case-types are shown in Table 2 for the training samples and in Table 3 for the validation samples, together with the confusion matrix and the figures of merit. For calibration with the training samples as shown in Table 2, we can see that the identification of normal samples is satisfactory with a very low rate of wrong classification of 1.9% (1 over 53). At the same time, for the cyst and cancer samples, if they were considered as an ensemble, their wrong classification to the normal case remains limited with a rate of 4.6% (3 over 65). However, misclassification within the ensemble of cyst and cancer samples becomes quite important. The model therefore effectively explores the pertinent information to distinguish between the normal and the ensemble of cyst and cancer samples, while within the ensemble of cyst and cancer samples its effectiveness greatly decreased. For prediction with the validation samples as shown in Table 3, we can see that the performances are globally degraded as shown by a comparison between the figures of merit for the training data set in Table 2 and for the validation data set in Table 3. The robustness of the model is therefore not sufficient. This remains understandable because of the limited number of the training samples, which appears quite small with respect to the large variability of human plasma samples. The representability of the validation samples by the training samples cannot thus be ensured in an optimized way despite the precaution taken in the data organization into the training and validation sample sets. Besides the weak robustness of the models, a better performance for classification between the normal samples and the ensemble of cysts and cancer samples can still be observed for the validation samples. 
In addition, the substantial misclassification within the ensemble of cyst and cancer samples leads, for a cancer screening application, to a large false positive rate of 36.4% (8 over 22), mainly due to cyst samples misclassified as cancer, and a false negative rate of 33.3% (7 over 21), also due to cancer samples misclassified as cyst, corresponding to cancer diagnosis sensitivity and specificity of respectively 66.7% and 78.4%. These results show room for improvement for the initial classification models.
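The figures quoted above can be cross-checked with a short calculation. The counts below are inferred from the text rather than read from Table 3: 21 cancer validation samples of which 7 are missed, 8 non-cancer samples among the 22 samples identified as cancer, and 58 − 21 = 37 non-cancer validation samples in total. Note that the 8 over 22 ratio is taken over the samples identified as cancer, whereas the specificity is taken over the non-cancer samples. The function and key names are hypothetical.

```python
def cancer_metrics(tp, fn, fp, tn):
    """Binary cancer / non-cancer figures of merit from sample counts."""
    return {
        "sensitivity": tp / (tp + fn),        # cancer samples correctly found
        "specificity": tn / (tn + fp),        # non-cancer samples correctly cleared
        "missed_cancer_share": fn / (tp + fn),
        "wrong_cancer_call_share": fp / (tp + fp),
    }

# Initial 3 case-type models, validation samples (counts inferred from the text)
initial = cancer_metrics(tp=14, fn=7, fp=8, tn=29)
for name, value in initial.items():
    print(f"{name}: {100 * value:.1f}%")
```

The same function can be reused for any other confusion matrix once its cancer and non-cancer counts are known.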

Table 3. Confusion matrix and figures of merit for classification of the validation samples with the models of classification according to 3 case-types.
In order to figure out the way to improve the performances of the classification models, we investigated in detail the reasons for the mediocre performances with the ensemble of cyst and cancer samples. We first looked at the mean positions of the samples in a PCA plot (PC1 and PC2) determined by the respective mean values of the 100 standardized selected or identified features calculated over the pretreated spectra of a sample. Such plot is shown in Fig. 5(a), where the mean positions of the normal, cyst and cancer samples are respectively represented in green, orange and red, the training samples with crosses and the validation samples with circles.
We can see first that, globally, the normal samples are clustered together in an area separated from the cyst and cancer samples. PC1 carries most of the discriminative power for this separation. This can explain the satisfactory classification results for normal samples with respect to the ensemble of cyst and cancer samples and vice versa. There are however 2 normal validation samples wrongly classified as cancer in Table 3. In the PCA plot, these 2 samples correspond to the points in Fig. 5(a) surrounded by a green circle, which are located outside the area occupied by the major part of the normal samples and merged into the zone occupied by the ensemble of cyst and cancer samples. Table 3 also indicates 3 cancer validation samples wrongly identified as normal. We can find them in Fig. 5(a) surrounded by a red circle and merged into the zone of normal samples. Concerning the general situation of the cyst and cancer samples, we can see in Fig. 5(a) that their positions are merged in the same zone without distinct areas for each type. This can explain the unsatisfactory classification results between the cyst and cancer samples, for the training samples as well as for the validation samples. These observations suggest that the features selected for the 3 case-type classification do not express effective characteristics of the spectra for the distinction between cyst and cancer, although they provide a quite satisfactory distinction between the normal samples and the ensemble of cyst and cancer samples.
A look at the composition and the coefficients of PC1 reveals the main contributions from K lines (5.44), Na lines (−2.48) and Mg lines (0.17), while PC2 is mainly contributed by Na lines (5.97), K lines (1.97) and Mg lines (0.40). This means that the separation between the normal samples and the ensemble of cyst and cancer samples is mainly due to the K lines, which behave differently for these 2 types of samples. A plot of the relative intensity of the K I 766.5 nm line for all the samples in Fig. 5(b) clearly shows this difference. We can also see that the several wrongly classified samples have a K I line intensity that differs from that of the other samples of the same type, as indicated in Fig. 5(b) by the data points surrounded by a circle. At the same time, for the cyst and the cancer samples, the intensities of the K I line exhibit similar behaviors, in accordance with the PCA plot in Fig. 5(a), where the PC1 scores do not allow their separation. PC2, which is mainly contributed by the Na lines, does not allow a clear separation of the samples, as confirmed by the similar behaviors of the Na I 589.0 nm line for the 3 types of samples shown in Fig. 5(c). These observations suggest that the features selected for classification according to the 3 case-types of normal, cyst and cancer are dominated by K lines, as indicated by the scores shown in Fig. 4(a), which offers a satisfactory separation between the normal samples and the ensemble of cyst and cancer samples. However, this domination prevents other spectral characteristics from being sufficiently expressed, although they might help to better differentiate between cyst and cancer samples.
The idea was thus to split the classification task into 2 successive steps: a first step separating the normal samples from the ensemble of cyst and cancer samples, followed by a second step further separating the cyst and cancer samples with new spectral features selected without the dominance of the K lines.

Improved models of classification in 2 steps of 2 case-types
An improved model training process with a scheme of classification in 2 steps of 2 case-types was implemented according to the flowchart shown in Fig. 6. At the end of the first step, which was identical to the initial one-step model training, the resulting model 1 was validated with the validation samples, resulting in the separation of the "normal" samples and the ensemble of "cyst-cancer" samples. Here, the quotation marks express the fact that misclassification can happen with model 1, so that each of the 2 resulting classes of identified samples can contain individuals of the other type. The ensemble of validation samples identified as "cyst-cancer" was further processed in the second step as the new validation samples. The cyst and cancer samples in the training sample set of the first classification model were used as the new training sample set for the second classification model. The same feature selection process was applied to the new training data set for the purpose of classification according to 2 case-types. The 100 highest ranked spectral channels are shown in Fig. 7 for comparison with the results shown in Fig. 4. The scores of these channels are shown in Fig. 7(a) as red circles, and in Fig. 7(b) the corresponding spectral features are indicated by red circles in an average spectrum of a cancer sample to show their respective intensities. We can see that the most relevant spectral channels for classification according to the 2 case-types of cyst and cancer belong to the Mg II 279.6 nm, Mg II 280.3 nm and Mg I 285.2 nm lines, complemented by the Na I 589.0 nm, Na I 589.6 nm and Na I 819.5 nm lines, the C I 247.9 nm line, as well as the Ca II 393.4 nm and 396.8 nm lines and the H I 656.3 nm line. The K lines are not selected among the 100 channels as pertinent for differentiating cyst and cancer, confirming the expectation discussed in the above section. Minor elements were again absent from the 100 top features.
A detailed look at the scores obtained by the detected minor elements reveals a ranking of 2497th, 2634th, 3703rd and 3850th respectively for P, Fe, Cu and Si, showing their negligible contributions to the differentiation between cyst and cancer. A comparison between the scales of the vertical axis in Fig. 7(a) and Fig. 4(a) shows a large decrease of the scores of the 100 highest ranked spectral channels, which means a significant reduction of the difference between the 2 populations of data in the new training data set, in accordance with the results shown in Fig. 5. In other words, the number of features effective in distinguishing the 2 populations of data decreases as a consequence of the reduced distance between them. It was therefore justified to use fewer features for model training in order to avoid overfitting. After testing several options, the 30 highest ranked channels (including emission lines from Mg and Na) were used to select 30 features in the pretreated spectra of the new training samples and to identify the same number of features in the pretreated spectra of the new validation samples. The same model training process was performed using the new training spectra with the new selected features, to optimize a neural network with 30 neurons in the input layer, 20 neurons in the hidden layer and 2 neurons in the output layer, resulting in model 2 and confusion matrix 2, together with the corresponding figures of merit for classification of the training samples, as shown in Table 4. The trained model 2 was then validated using the new validation spectra with the new identified features. The obtained result was combined with that obtained in the first step of classification, leading to the final confusion matrix together with the corresponding figures of merit for classification of the validation samples, as shown in Table 5.
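The way the two steps are chained at prediction time can be sketched as follows. The function `two_step_predict` and the threshold-based stand-in models are hypothetical; in the procedure described above, model 1 and model 2 would be the trained neural networks, fed respectively with the 100 and 30 standardized features of the same samples.

```python
import numpy as np

def two_step_predict(x1, x2, model1, model2):
    """Cascade of two binary decisions.
    model1(x1) returns 0 for "normal" and 1 for "cyst-cancer";
    model2(x2) returns 0 for cyst and 1 for cancer."""
    step1 = np.asarray(model1(x1))
    final = np.full(len(step1), "normal", dtype=object)
    flagged = step1 == 1
    if flagged.any():
        # Only the samples flagged "cyst-cancer" reach the second model
        step2 = np.asarray(model2(x2[flagged]))
        final[flagged] = np.where(step2 == 1, "cancer", "cyst")
    return final

# Toy stand-ins for the trained models: simple thresholds on one feature
model1 = lambda x: (x[:, 0] > 0.5).astype(int)
model2 = lambda x: (x[:, 0] > 0.5).astype(int)

x1 = np.array([[0.1], [0.9], [0.8], [0.2]])  # step-1 features of 4 samples
x2 = np.array([[0.3], [0.7], [0.2], [0.9]])  # step-2 features of the same samples
print(list(two_step_predict(x1, x2, model1, model2)))
# ['normal', 'cancer', 'cyst', 'normal']
```

A consequence of this design is that a cyst or cancer sample wrongly labeled "normal" in the first step is never seen by the second model, so step-1 errors carry through to the final confusion matrix.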
For calibration with the training samples, a comparison between Table 4 and Table 2 shows that the performance for cyst identification is unchanged, with a wrong classification rate of 26.1% (6 out of 23). At the same time, the performance for cancer identification is slightly degraded: the wrong classification rate increases from 7.1% (3 out of 42) to 9.5% (4 out of 42). For prediction with the validation samples, a comparison between Table 5 and Table 3 shows that the identification performance is improved for both cyst and cancer. The wrong classification rate for cyst is greatly reduced, from 54.5% (6 out of 11) to 27.3% (3 out of 11). The wrong classification rate for cancer is reduced from 33.3% (7 out of 21) to 28.6% (6 out of 21). These results show that the improved 2-step classification models offer a better prediction performance for validation samples, even though the improvement for calibration remains unclear. For an application to cancer screening, a false positive rate of 25.0% (5 out of 20) and a false negative rate of 28.6% (6 out of 21) were obtained, which are clearly improved compared to the initial one-step classification models, allowing cancer diagnosis sensitivity and specificity of respectively 71.4% and 86.5%.
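The figures of merit quoted above follow from simple ratios of sample counts, as the short sketch below illustrates. The function names are hypothetical. The counts for the wrong classification rates and for the sensitivity are those quoted in the text. The number of non-cancer validation samples is not stated in this section: the value of 37 used for the specificity is an assumption, chosen because it reproduces the reported 86.5% with 5 false positives, and it should be checked against Table 5.

```python
def rate_percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 1)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP), in percent."""
    return rate_percent(tp, tp + fn), rate_percent(tn, tn + fp)


# Wrong classification rates for the validation samples, counts from the text.
cyst_one_step = rate_percent(6, 11)    # one-step models
cyst_two_step = rate_percent(3, 11)    # 2-step models
cancer_one_step = rate_percent(7, 21)
cancer_two_step = rate_percent(6, 21)

# Cancer screening with the 2-step models: 21 cancer samples, 6 of them missed.
# ASSUMPTION: 37 non-cancer validation samples (not stated in this section),
# of which 5 are wrongly identified as cancer.
sensitivity, specificity = sensitivity_specificity(tp=15, fn=6, tn=32, fp=5)

print(cyst_one_step, cyst_two_step, cancer_one_step, cancer_two_step)
print(sensitivity, specificity)
```

The sensitivity is simply the complement of the false negative rate (100% minus 28.6%), so it depends only on counts given in the text, whereas the specificity depends on the assumed number of non-cancer samples.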

Conclusion
In this work, we have developed a method for the identification and classification of blood plasma samples collected from patients and covering, among the 176 samples, the 3 case-types of normal, ovarian cyst and ovarian cancer. The method is based on LIBS spectrum recording coupled with spectral data treatment using a machine learning approach. A first classification model allowed a satisfactory classification between the normal samples and the ensemble of cyst and cancer samples, whereas numerous misclassifications happened between the cyst and cancer samples, leading to mediocre sensitivity and specificity for cancer identification of respectively 66.7% and 78.4% when the models were tested with independent validation samples. A detailed investigation of the spectral features selected for the model training revealed the dominance of the K lines in the LIBS spectrum, which was effective for separating the normal samples from the ensemble of cyst and cancer samples. Such dominance inhibited the expression of other features more suitable for the discrimination between cyst and cancer. A second ensemble of models was trained, in which the normal samples were separated from the ensemble of cyst and cancer ones in the first step of classification, while the second step focused on the discrimination between the cyst and cancer cases. A new feature selection demoted the K lines and brought forward other features, Mg and Ca lines for instance. The new models exhibited a better performance in differentiating between cyst and cancer samples, leading to improved cancer identification sensitivity and specificity of respectively 71.4% and 86.5% when the models were tested with independent validation samples. Emission lines from some minor elements in blood plasma, Fe, Si, P and Cu, were identified in our experiment.
Their contribution to the classification of the samples was found to be clearly negligible compared with that of the major metal elements, K, Na, Mg and Ca, which are considered the most important electrolytes in blood and play a vital role in maintaining homeostasis in the body. An imbalance in the concentrations of these electrolytes therefore indicates a state of abnormality in a patient [38].