Revealing hidden information in GC–MS spectra from isomeric drugs: Chemometrics based identification from 15 eV and 70 eV EI mass spectra

EI source yields more information-rich and thus more discriminating mass spectra for ring-isomeric, cathinone-type drugs. Through multivariate statistics using Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA), mass spectral data can be exploited to confidently distinguish isomeric classes. Including feature selection of the mass spectra further enhanced the discriminative power of these models. In this way, all examined classes of cathinone and fluoroamphetamine isomers could be robustly identified, even through their conventional 70 eV mass spectrum from a quadrupole-MS instrument. A characteristic Likelihood Ratio (LR) based indicator was developed to quantify the selectivity of the models, which proved to be useful for comparison, optimization and identification purposes. The potential of the method was demonstrated with six forensic case samples, providing 100% correct isomer identification. In general, this new approach enables robust classification of drug isomers that is currently not possible with conventional GC–MS methods without the use of additional spectroscopic analyses.


Introduction
A large number of closely related, novel synthetic drug compounds have been observed in the global drugs-of-abuse market over the last decade. This large and continually expanding group of Novel Psychoactive Substances (NPS) consists to date of over 700 compounds comprising a great variety of isomeric forms [1]. Legal control often lags behind and differs among countries. Also, a perpetual cycle is observed in which recently controlled substances are replaced by newly developed, related forms not yet put under legislation. This has initiated more generic legal approaches to ban entire classes of compounds in some countries such as New Zealand, USA and the United Kingdom [2] and currently a more generic framework is also considered in the Netherlands. However, this approach provides challenges and in specific cases the exact compound involved still needs to be reported by the forensic illicit-drug expert.
https://doi.org/10.1016/j.forc.2020.100225
Received 19 December 2019; Received in revised form 8 February 2020; Accepted 9 February 2020
⁎ Corresponding author at: Dutch National Police, Unit Amsterdam, Forensic Laboratory, Kabelweg 25, Amsterdam 1014 BA, The Netherlands. E-mail address: ruben.kranenburg@politie.nl (R.F. Kranenburg).
In forensic drug analysis laboratories, established GC-MS methods fall short when dealing with closely related NPS. As isomeric drug compounds can co-elute, have identical masses and can yield very similar electron ionization (EI)-MS fragmentation spectra, selectivity of GC-MS is currently insufficient in a forensic setting. This is especially true when one specific isomer is legally controlled while the other isomers are as yet uncontrolled, as is currently the case in many countries.
The standard ionization energy for routine GC-MS methods in EI mode is 70 eV. This is the optimal energy for this type of ionization in terms of ion-efficiency (sensitivity) and mass spectral robustness. EI at 70 eV is a relatively 'harsh' (i.e. hard) ionization technique leading to extensive fragmentation. Suitable bonds in molecules are ionized by disturbances in the electrical field induced by high-energy electrons [11]. In the case of synthetic drugs, which frequently contain an amine moiety, the most abundant fragment is the ion resulting from alpha-cleavage of the amine. This leads to mass spectra showing an uninformative low-mass base peak. Other, more informative, fragments are hardly visible.
Operating a traditional EI ion source at energies well below 70 eV significantly reduces the ion-efficiency and thus sensitivity, as molecule-electron interactions diminish and the process of guiding electrons into the ion-chamber becomes less efficient. Despite this loss of sensitivity, an earlier study operating GC-MS instrumentation at lower ionization energies reported less fragmentation and more information-rich mass spectra for pesticides [12].
To overcome the limitations of sensitivity, a dedicated high-efficiency ion-source recently became available from Agilent Technologies. This source, introduced as 'low energy EI', is designed with a modified lens geometry and a filament position in-line with the ion-beam, and is optimized for operation in the range of 10-70 eV. In an earlier study, this source was successfully applied in isotopologue distribution analysis, in which molecular ions of metabolites are more dominantly present in mass spectra due to reduced fragmentation [13]. Lau et al. studied the stability and reproducibility of low eV mass spectra for a set of ether, alcohol and terpene compounds, observing improved molecular ion intensity for most compounds at 12 eV [14]. Another ion source capable of low energy ionization is the variable ionization source from Markes. This source uses dedicated ion optics to reduce ion energies in the ion chamber between 10 and 70 eV. In this way the loss of sensitivity due to electron clustering around the filament was also addressed [15].
A novel way of producing more information-rich mass spectra in GC was developed by Amirav et al. by using supersonic molecular beams [16]. This ionization technique, often described as 'cold EI', was recently used for NPS drug analysis, yielding mass spectra with strongly reduced fragmentation [8]. These mass spectra were thus more suitable for NPS discrimination than traditional 70 eV EI spectra. Levitas et al. described one specific example of methylone positional isomers in which the cold EI mass spectra differed in intensity, in contrast to the visibly more similar 70 eV EI spectra [17].
Despite these developments to produce more information-rich mass spectra, NPS isomers and especially ring-positional isomers can still yield very similar mass spectra. This creates a need for more powerful methods of spectral comparison than traditional MS library searches against known spectra. Multivariate data analysis is a promising chemometric approach to extract and visualize small differences in mass spectra of drug isomers.
Principal Component Analysis (PCA) on raw spectral data was applied earlier on Direct Analysis in Real Time (DART)-MS spectra [18], Raman and FTIR spectra [19], VUV spectra [20] and paper spray MS spectra [21]. Chemometric approaches on GC-MS data for synthetic drugs were applied for the first time by Levitas et al. [17]. This group performed PCA to visualize specific mass spectral differences but did not use it as a tool for identification. Bonetti [22] did extensive work on PCA followed by Linear Discriminant Analysis (LDA) of EI-MS data to identify fluorofentanyl and fluoromethcathinone positional isomers and demonstrated the promising potential of these approaches on traditional 70 eV EI mass spectra.
In a number of very recent publications the potential of applying chemometric approaches on mass spectra was further demonstrated for synthetic drug isomers. Davidson and Jackson [23] applied PCA and Canonical Discriminant Analysis (CDA) on normalized data of the 15 most abundant ions in the mass spectrum for two sets of 2,5-dimethoxy-N-(N-methoxybenzyl)phenethylamine (NBOMe) drug isomers. Setser and Smith [24] analyzed mass spectra of benzofuran and tryptamine drug isomers by means of LDA, selected 13 highly abundant, diagnostic ions as input variables for the LDA model and compared this to PCA on the full mass spectrum as variable selection for LDA. Chikumoto et al. [25] used PCA on 4 major MS/MS fragment ions to distinguish three synthetic cannabinoid positional isomers.
To our knowledge no earlier studies have been published on low energy ionization as a specific tool for ring-isomeric drug differentiation. Additionally, this study is the first to combine multivariate data analysis with more information-rich, low energy mass spectra and the first to look more extensively into the concept of feature selection in mass spectral data to enhance isomer differentiation by ignoring the most abundant ions in the mass spectrum. This study demonstrates these concepts on ring-isomeric sets of methylcathinones and fluoroamphetamines, as both groups are frequently encountered in case samples, both in The Netherlands and worldwide. Correct identification is essential for these compound classes as legal control differs for the individual isomers in many countries [26]. Currently, only the 4-positional isomers of these classes are controlled substances in The Netherlands, leaving the 2- and 3-positional isomers uncontrolled [27].

Chemicals and reagents
Methanol (for analysis) and sodium bicarbonate monohydrate (for analysis) were obtained from Merck (Darmstadt, Germany). The case samples in this study were drug suspected materials seized by the Amsterdam Police.

Instruments and settings
For low energy EI experiments with the high efficiency ion source, a GC-Q-TOF MS system was used. This system consisted of a 7890B GC connected to a 7250 GC/Q-TOF mass spectrometer and was equipped with a split/splitless injector and a 7650A automatic liquid sampler. The Q-TOF MS contained a high efficiency EI ion source with low energy ionization functionality. Data was acquired with MassHunter GC/MS Acquisition version 10.0. All instrumentation and software were from Agilent Technologies (Santa Clara, CA, USA). The Q-TOF was operated in scan mode with the collision gas switched off. Mass resolution of the TOF was > 25,000 at m/z 271.9867. Automated mass calibration was performed after every 10 injections.
For regular EI experiments at nominal mass, a single quadrupole GC-MS system was used. This system consisted of a 7890B GC equipped with a split/splitless injector and connected to a 5977B single quadrupole mass spectrometer with a conventional stainless steel EI ion source, all from Agilent Technologies. A Combi PAL autosampler from CTC Analytics (Zwingen, Switzerland) was used for sample injection. Full scan MS data was acquired with MassHunter GC/MS Acquisition version B.07.
Samples were dissolved in methanol. To reduce peak broadening due to the presence of both the free base form and the hydrochloride salt form, samples were neutralized with sodium bicarbonate, when appropriate, in similar fashion as described in earlier work [7].
A volume of 1 µL of methanolic extract was injected at 250 °C in split-mode with a 1:50 split. On both systems an HP-5MS ultra inert column (30 m; 250 µm ID; 0.25 µm film thickness) from Agilent Technologies was used. The oven temperature program started at 100 °C for 1.5 min, followed by a temperature ramp of 30 °C/min to 300 °C. The column flow was 1.1 mL/min helium in constant flow mode.
For traditional EI ionization, an energy of 70 eV and a source temperature of 230 °C were used on both instruments. Low energy EI experiments were performed at 15 eV in combination with selected source temperatures. The following combinations were found to be optimal for each isomeric set: MEC, 15 eV at 150 °C; MMC, 15 eV at 180 °C and FA, 15 eV at 200 °C.

Data analysis
All acquired data from both systems was processed in MassHunter Qualitative Analysis version 10.0. Chromatograms were integrated and average mass spectra were extracted by including all individual spectra with a TIC (Total Ion Current) intensity above 15% of the peak height maximum and subtracting background spectra from the 6.0 to 6.2 min part of the chromatogram, a time range where no peaks or excessive column bleed were observed. Individual mass spectra above 100% of saturation were excluded in the TOF data.
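The averaging and background subtraction described above were carried out in MassHunter; the following is only a minimal Python sketch of the same idea, not the vendor's algorithm. The function names, the dictionary representation of a scan (m/z mapped to intensity) and the choice to drop ions that become non-positive after subtraction are assumptions made for illustration.

```python
from collections import defaultdict


def average_spectrum(scans, threshold=0.15):
    """Average all scans whose TIC exceeds `threshold` times the apex TIC.

    scans: list of dicts mapping m/z -> intensity, one dict per scan
    across a chromatographic peak (hypothetical data layout).
    """
    tics = [sum(scan.values()) for scan in scans]
    cutoff = threshold * max(tics)
    kept = [scan for scan, tic in zip(scans, tics) if tic > cutoff]
    summed = defaultdict(float)
    for scan in kept:
        for mz, intensity in scan.items():
            summed[mz] += intensity
    return {mz: value / len(kept) for mz, value in summed.items()}


def subtract_background(spectrum, background):
    """Subtract a background spectrum ion by ion.

    Assumption: ions that are zero or negative after subtraction are dropped.
    """
    corrected = {}
    for mz, intensity in spectrum.items():
        value = intensity - background.get(mz, 0.0)
        if value > 0:
            corrected[mz] = value
    return corrected
```

In this sketch the background spectrum would itself be an average over the scans in the 6.0 to 6.2 min window.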
The 200 most abundant masses for each individual mass spectrum were exported using the MassHunter built-in export option. For TOF, individual mass spectra contained between 190 and 200 ions. Quadrupole mass spectra contained between 160 and 200 ions. A data matrix suitable for PCA was created by binning all ions in 0.01 Da bins for high resolution data and 1 Da bins for nominal resolution data. Data was zero filled for ions not present in all individual spectra. Spectral data above 200 m/z was ignored as no ions of interest are expected in a mass range exceeding the molecular mass of the subjected molecules. Table S1 shows the number of features in each data matrix obtained this way.
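The binning and zero filling described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure actually used: the function names are hypothetical, bins are centered on multiples of the bin width, and intensities falling into the same bin are summed.

```python
def bin_spectrum(peaks, bin_width, mz_max=200.0):
    """Bin (m/z, intensity) pairs; returns {bin index: summed intensity}.

    Ions above mz_max are ignored, as in the text.
    """
    binned = {}
    for mz, intensity in peaks:
        if mz > mz_max:
            continue
        index = round(mz / bin_width)  # assumption: bins centered on multiples
        binned[index] = binned.get(index, 0.0) + intensity
    return binned


def build_matrix(spectra, bin_width, mz_max=200.0):
    """Build a zero-filled data matrix (rows: spectra, columns: m/z bins)."""
    binned = [bin_spectrum(peaks, bin_width, mz_max) for peaks in spectra]
    columns = sorted(set().union(*binned))
    labels = [round(index * bin_width, 2) for index in columns]
    matrix = [[b.get(index, 0.0) for index in columns] for b in binned]
    return labels, matrix
```

Calling `build_matrix(spectra, 0.01)` keeps ions such as 146.06 and 146.10 in separate columns, whereas `build_matrix(spectra, 1.0)` sums them into a single 146 column.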
Three different methods of data feature-selection were compared in this study: the full 40-200 m/z spectrum (method A); exclusion of selected masses (method B); and selection of only the 100-200 m/z part of the spectrum, the region where ions contributing to isomeric difference are expected (method C). For each isomeric set, the excluded masses for method B were selected by PCA loadings based on visible contribution of these masses to saturation or concentration effects, in combination with prior knowledge about selective ions resulting from the 15 eV experiments. These excluded masses are given in Table S2 in the Supplementary Data.
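The three feature-selection methods amount to column filters on the data matrix. The sketch below illustrates this in Python; the function name is hypothetical, and it assumes that method B starts from the full 40-200 m/z range and that excluded masses are matched at nominal mass. The excluded masses themselves are listed in Table S2 and are not reproduced here, so they are passed in as a parameter.

```python
def select_features(labels, matrix, method, excluded=()):
    """Keep only the columns selected by feature-selection method A, B or C.

    labels:   m/z value of each column
    matrix:   list of rows (one per spectrum)
    excluded: masses to drop for method B (supplied by the user)
    """
    excluded_nominal = {round(mz) for mz in excluded}
    if method == "A":    # full 40-200 m/z spectrum
        keep = [40 <= mz <= 200 for mz in labels]
    elif method == "B":  # full spectrum minus selected masses
        keep = [40 <= mz <= 200 and round(mz) not in excluded_nominal
                for mz in labels]
    elif method == "C":  # only the 100-200 m/z region
        keep = [100 <= mz <= 200 for mz in labels]
    else:
        raise ValueError("method must be 'A', 'B' or 'C'")
    new_labels = [mz for mz, k in zip(labels, keep) if k]
    new_matrix = [[v for v, k in zip(row, keep) if k] for row in matrix]
    return new_labels, new_matrix
```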
PCA and PCA-LDA analyses were performed in The Unscrambler X (Camo Analytics, Oslo, Norway), version 10.5.1, using total sum normalization and mean-centering as data pre-processing. No autoscaling was applied. The system's standard random cross validation, using 20 segments, was used. Separate LDA models were created for each isomeric class using all three above-mentioned feature-selection methods. The obtained models were assessed in this study for their suitability to distinguish individual isomers. To this end, the discriminant values for each class from the prediction matrix were used. These discriminant values are the relative natural logarithmic posterior probabilities obtained by the LDA algorithm used in The Unscrambler X [28] and express the probability of membership for a class.
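The two pre-processing steps named above can be written out explicitly. The sketch below is a plain Python illustration, not The Unscrambler X implementation; it assumes that total sum normalization scales each spectrum to a summed intensity of 1 and that mean-centering subtracts the column mean of the training set from each variable.

```python
def total_sum_normalize(matrix):
    """Scale every row (spectrum) so that its intensities sum to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Rows that are entirely zero are left unchanged to avoid division by zero.
        normalized.append([v / total for v in row] if total else list(row))
    return normalized


def mean_center(matrix):
    """Subtract the column mean from every variable; also return the means."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    centered = [[v - m for v, m in zip(row, means)] for row in matrix]
    return centered, means
```

The column means are returned because a new spectrum, such as a case sample, would have to be centered with the training-set means before projection onto the model.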
For each model, the training set of data consisted of 11 replicate spectra of each of the three isomers at a single concentration (200 μg/mL, Set 1 'same concentration') and 12 replicate injections of each of the three isomers at four different concentrations (100 μg/mL, 200 μg/mL, 500 μg/mL and 1000 μg/mL, Set 2 'concentration set'). The first set was used for comparison of both instrumental technique (15 eV vs 70 eV) and model (feature selection). The second set was used to further assess the concentration effect on mass spectra, subsequent feature selection methods, and PCA-LDA analysis. This set was only acquired on the single quadrupole GC-MS at 70 eV. The concentration effect was not further investigated for the 15 eV source and therefore the LDA models based on 15 eV data do not take variation from other sources, such as concentration, into account.

Low energy ionization of NPS drugs
To determine the potential benefit of low energy ionization for NPS compounds, several preliminary experiments were performed to investigate the source efficiency (i.e. sensitivity), and the resulting mass spectra, for both the dedicated low energy EI source and the conventional EI source. As expected, both sources showed optimal ion yields and signal intensities around 70 eV and a reduced signal was observed at lower ionization energies. The overlaid TIC chromatograms of the 4-MEC peak, obtained from 70 eV to 10 eV with both sources, can be found in Fig. S1 in the Supplementary Data. The dedicated low energy EI source gave adequate sensitivity at all ionization energies down to 15 eV. Signal intensities dropped to approximately 50% at 15 eV. However, noise intensities also decreased, so the overall loss in sensitivity was even smaller. As sensitivity is usually not a key issue in illicit-drugs analysis, the observed sensitivity was considered suitable and no further effort was put into optimizing or calculating LODs (Limits of Detection).
A remarkable finding was that the low energy source was more efficient in the 15-25 eV range than between 25 and 70 eV. A possible explanation is that the tuning procedure in the software is also specifically designed for low energy EI experiments and dedicated tuning parameters were optimized for low energy operation. The conventional EI source showed a totally different signal intensity curve, with a roughly constant signal intensity in the range of 70 to 40 eV, followed by a substantial decrease in sensitivity to less than 10% at 15 eV. For the low energy source a dedicated tune procedure for the 10-30 eV range was available in the software, whereas the conventional EI source had to be tuned at 70 eV before operation at lower energies. This could possibly explain the signal optimum at the higher energies.
Mass spectra were obtained at various ionization energies on both source types. Example spectra for 4-MEC can be found in Fig. S2 in the Supplementary Data. For the compounds in this study the EI mass spectrum predominantly shows one low-mass base peak resulting from the alpha-cleavage fragment of the amine group. This was the case for all drug components involved in this study and this effect was observed in mass spectra obtained from both sources at all ionization energies. Even at very low ionization energies, this fragment was clearly the most intense ion visible. As more information-rich and possibly discriminating fragments are expected in the higher m/z part of the mass spectrum, partial spectra were obtained by zooming in on the 100 m/z to 200 m/z part of the spectrum, thereby neglecting the most intense ions. In general, the low energy source gave more fragment ions of sufficient abundance in the mass spectra at values below 20 eV and a shift in intensity ratio towards more intense molecular ion peaks was observed for lower ionization energies. Fig. 2 shows an example for 4-MEC with the increasingly intense 191 m/z molecular ion visible from 20 to 10 eV. Preliminary experiments with the low-energy source operated at 150 °C, 180 °C and 200 °C did not reveal remarkable fragment ion ratio differences by visual inspection of the 100-200 m/z part of the mass spectrum. The source temperature was therefore found to have a rather minimal effect on the degree of fragmentation compared to the ionization energy. For the conventional EI source no major differences in fragmentation were observed at low energy. As stated before, sensitivity was compromised at these settings and this is also reflected in very low intensity mass spectra not suitable for compound characterization.

Ring-isomer differentiation through visual comparison and ion ratios of low eV mass spectra
Experiments at 15 eV were performed on the low energy EI source for isomeric differentiation. For both sets of cathinone drugs, spectra showing major intensity differences in the higher m/z part of the spectrum were obtained for the ring-isomers. Fig. 3 shows the mass spectra for MEC-isomers with clear differences in intensity-ratios of the 191 m/z (molecular ion, C12H17NO+), 176 m/z (C11H14NO+), 148 m/z (C10H14N+) and 119 m/z (C8H7O+) ions. Fragment formulas for all diagnostic ions in the 100-200 m/z part of the mass spectrum were derived from the high resolution MS data. These formulas are shown in Table S3. The proposed structures of the diagnostic ions for the MEC-isomers are given in Fig. S3. The fragmentation mechanism is at this moment not completely understood and may be subject to future research. However, the higher abundance of the 119 m/z fragment in the ortho-substituted isomer might be explained by the presence of a hydrogen-bond that can only be formed by 2-MEC, therefore making this fragmentation route more favorable. Both the 3-MEC and 4-MEC isomers gave less fragmentation and a more abundant molecular ion, which was most prominent for the para-substituted isomer. As demonstrated by Davidson and Jackson [23] for NBOMe-isomers, ion-ratios could be used to distinguish between isomers. The 119 m/z:191 m/z ion-ratio was determined for the MEC samples and resulting values were found to be 2.9 ± 0.5 for 2-MEC; 1.0 ± 0.1 for 3-MEC and 0.52 ± 0.04 for 4-MEC, based on 11 replicates. For the case sample this ratio was found to be 0.48 ± 0.02, exclusively falling in the 4-MEC ion ratio range. Fig. S4 in the Supplementary Data also depicts the full mass spectrum as well as the same data obtained on the conventional EI source, for which no visible differences were observed. The fragmentation pattern for the low energy spectra was found to be stable and repeatable, as demonstrated by the small deviation in the ion ratios. Example spectra from 5 replicates are shown in the Supplementary Data (Fig. S5).
In line with the cathinone type drugs, the fluoroamphetamines also showed more information-rich mass spectra in low energy EI. Unfortunately, the mass spectra for these ring-isomers were more similar and variation in ion-ratios was less pronounced. However, ion-ratio assessment could still be used, as demonstrated by the 138.07 m/z:118.07 m/z ion-ratios of 3.0 ± 0.1 for 2-FA; 3.5 ± 0.1 for 3-FA and 2.6 ± 0.1 for 4-FA from 11 replicate injections, with a case sample later identified as 2-FA showing 2.9 ± 0.1 and a case sample known to contain 4-FA giving 2.5 ± 0.1. Low energy 15 eV mass spectra of the MMC and FA sets of isomers are given in Fig. S6.
Several case samples, known to contain an isomeric form of the compounds included in this study, were analyzed. For both the MEC and MMC species, identification of the correct isomeric form was clearly possible by visually comparing the spectra in combination with assessment of the ion-ratios. An example of an exclusive case sample match with the 4-MEC library spectrum is also shown in Fig. 3.
In this study, low energy EI experiments were performed on a high mass resolution Q/TOF MS system as the novel low energy ion source is currently only available on this instrument. Therefore, special attention was given to the difference between high resolution and nominal resolution mass spectra. As can be seen in the fragment masses in Table S3, isobaric ions were observed at 146 m/z, 148 m/z, 160 m/z, 174 m/z and 175 m/z. These ions have at least 10 mDa mass difference and could easily be differentiated on the Q/TOF instrument. On a nominal mass instrument, such as a regular single quadrupole MS, these isobaric ions are all measured as one single ion with a summed intensity consisting of the individual ion signals.
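The ion-ratio assessment reported above for the MEC and FA isomers can be illustrated with a short Python sketch. The reference values are the ones quoted in the text; treating the ± values as standard deviations, and assigning a sample to the isomer with the smallest standardized distance, are assumptions made here for illustration and are not stated as the authors' decision rule.

```python
from statistics import mean, stdev


def ion_ratio(spectrum, mz_a, mz_b):
    """Intensity ratio of two ions in a spectrum given as {m/z: intensity}."""
    return spectrum[mz_a] / spectrum[mz_b]


def summarize(ratios):
    """Mean and sample standard deviation of replicate ion ratios."""
    return mean(ratios), stdev(ratios)


def closest_isomer(ratio, reference):
    """Return the isomer whose reference ratio is nearest in units of its spread.

    reference: {isomer: (mean ratio, spread)}; the spread is assumed to be
    a standard deviation.
    """
    return min(reference,
               key=lambda name: abs(ratio - reference[name][0]) / reference[name][1])


# Reference ion ratios as reported in the text (11 replicates each).
MEC_119_191 = {"2-MEC": (2.9, 0.5), "3-MEC": (1.0, 0.1), "4-MEC": (0.52, 0.04)}
FA_138_118 = {"2-FA": (3.0, 0.1), "3-FA": (3.5, 0.1), "4-FA": (2.6, 0.1)}
```

With these values, `closest_isomer(0.48, MEC_119_191)` selects 4-MEC, and the two FA case-sample ratios of 2.9 and 2.5 select 2-FA and 4-FA respectively, in agreement with the assignments in the text.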
Spectral data from replicate analysis was used to investigate whether these isobaric ions are selective in differentiating ring-isomers. Fig. 4 shows the distribution of the fragment intensities for the three MEC isomers in different colors. This clearly shows which fragments are selective indicators for a certain isomeric form through their intensity difference. Also, as can be seen for fragments 146.06 m/z (C9H8NO), 146.07 m/z (C10H10O) and 146.10 m/z (C10H12N), there is a difference in occurrence for the ring isomeric forms within the isobaric ions. The nitrogen-lacking C10H10O, 146.07 m/z fragment is much more likely to occur in the 2-MEC isomer than the other two 146 m/z fragments, which both occur more prominently in the 4-MEC isomer. As the most discriminating fragments are not compromised by isobaric ions, selectivity from low energy EI will still be visible at nominal mass; however, additional selectivity originating from individual isobaric ions will be lost.
As stated earlier, it was not possible to analyze samples using low energy EI on a regular single quadrupole MS. Instead, nominal mass low energy EI spectra were 'mimicked' by binning the m/z values from high resolution spectra to 1 Da resolution. This data was also included in the chemometric analysis.

Feature selection and data pre-processing
The three methods for feature selection explained in 2.3 were applied on the mass spectral data from the 15 eV experiments, both processed as nominal and high-resolution spectra. Also, 70 eV data from conventional quadrupole GC-MS was analyzed and processed in similar fashion. Fig. 5 shows an example of the differences observed by the three methods of feature-selection. In the full-spectral PCA score plot a certain degree of clustering for the isomeric groups was obtained, but there is also a high within-group variation, represented by linear shifts in all three groups. In some cases, the clustering was only achieved using the higher PCs. This is in line with results from Bonetti [22] obtained from 70 eV mass spectra for the fluoromethcathinone isomers.
The observed linear shifts could be attributed to the most abundant ions in the mass spectrum based on their PCA loadings. In this study, no auto-scaling was applied on the mass spectral data as this would place increased emphasis on low abundant ions, which could lead to increased noise [22]. A drawback of this decision is that the highly abundant, low m/z ions now strongly affect the PCA model. It is evident that these effects were caused by the highly abundant, mostly sub-100 m/z ions, as these undesirable effects almost disappear when using processing methods B and C. In general, processing methods B and C resulted in PCA data showing clearly isolated groups on the basis of the first two principal components. A surprising result was that clearly focused and isolated PCA clusters were obtained not only for the 15 eV mass spectra, but also for the 70 eV mass spectra in all NPS isomeric groups.
As PCA is an unsupervised approach, and thus does not take prior information about the isomeric class into account, the less convincing results of method A could be explained by more prominent variance in the dataset not related to isomeric variation. Such variance might result from fragmentation efficiency and yield in the source, as this predominantly affects the most abundant, low m/z ions.
For comparison between the instrumental techniques, similar datasets comprising 33 mass spectra were measured on all instruments. Technical details are given in paragraph 2.3 under 'set 1'. PCA analyses were performed on the spectra, revealing completely separated groups for all techniques (70 eV and 15 eV) when using both processing methods B and C. An example for both the MEC and FA isomers is given in Fig. 6.

PCA-LDA optimization for maximum selectivity
All PCA models that were used for LDA were assessed for suitability according to their Q-residuals and Hotelling's T² values. No outliers were observed. In the PCA models resulting from the 100 to 1000 µg/mL concentration sets, higher Q-residuals were noticed for the lower concentrations (e.g. 100 µg/mL) for all three isomeric forms. A logical explanation for this is the presence of additional noise at these lower signal-intensity levels. The Hotelling's T² and Q-residual plots for the FA and MMC isomer PCA models are shown in Fig. S7. An α = 0.05 (5%) limit was determined for diagnostic use in the case samples described under 3.4.2. For the reference standards used for the model, only the samples at around 100 µg/mL showed Q-residuals above the 5% limit.
As a next step, PCA-LDA was performed as this supervised approach focuses on maximum discrimination between the isomeric classes. For each instrumental method (15 eV high resolution, 15 eV nominal mass and 70 eV nominal mass) and each isomeric set (MEC, MMC and FA) a PCA-LDA model was built and optimized for the minimum number of principal components needed to provide sufficient discrimination between groups. In many cases only 3 or 4 PCs (corresponding to over 90% explained PCA variance) were required to obtain optimal discrimination. For all instrumental methods and for all isomeric classes, cross validation of the PCA-LDA models yielded 100% accuracy. Although 100% correct classification is a satisfactory result, especially for visually almost identical, conventional, 70 eV quadrupole mass spectra, additional testing is required for practical use in a forensic setting. As PCA-LDA is a hard-classification technique, an unknown sample will always be assigned to a certain group. Here, it is assumed on the basis of the mass spectrum that the unknown must correspond to one of the considered isomers. To reduce the risk of wrong classification, special attention needs to be given to model selectivity and thus robustness.
In the forensic field it is especially critical to minimize the risk of a false-positive result (i.e. identifying an uncontrolled isomer as an illicit substance), although a false-negative outcome is of course also highly undesirable. In that respect it is essential to optimize methods in such a way that maximum distance between the closest pair of a controlled and an uncontrolled isomer is ensured. The discriminant values obtained from the LDA training set are natural logarithmic posterior probabilities, assuming equal prior probabilities. The distances observed in the validation results could thus simply be subtracted and converted to derive a log Likelihood Ratio (LR) when considering the following set of hypotheses:
Hp (prosecution hypothesis): 'Sample X contains the controlled isomer Y, and not the uncontrolled isomer Z'
Hd (defense hypothesis): 'Sample X contains uncontrolled isomer Z, and not the controlled isomer Y'
The log LR-value is calculated by subtracting the log posterior probability of the false positive class (e.g. the predictions of 4-MMC on the 3-MMC class) from that of the true positive class. For the compounds included in this study there are two classes of uncontrolled isomers. Log LR-values were calculated for both uncontrolled isomeric forms (e.g. 4-MMC on the 2-MMC class and 4-MMC on the 3-MMC class). The most critical pair exhibits the smallest difference in true and false positive LDA values. Within the validation data set of the most critical pair, the lowest log LR value indicates the lowest evidential value to support the presence of an illicit substance. The best and most selective method exhibits the highest LR-value for its most critical pair and worst validation test result. By applying this worst case scenario the performance of the various MS data processing methods can be compared and their suitability for forensic practice can be assessed.
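The worst-case log LR described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the data layout (one dictionary of ln posterior values per validation spectrum of the controlled isomer) are hypothetical. With equal priors, the difference of two ln posteriors equals the ln of the likelihood ratio; the optional division by ln(10) to obtain a base-10 value is only one possible reading of "converted" and is labeled as such.

```python
import math


def log_lr(ln_post_true, ln_post_false, base10=False):
    """Log likelihood ratio from two ln posterior probabilities (equal priors).

    base10=True converts the natural-log difference to log10; whether the
    source uses this conversion is an assumption.
    """
    difference = ln_post_true - ln_post_false
    return difference / math.log(10) if base10 else difference


def worst_case_log_lr(validation, controlled, uncontrolled):
    """Lowest log LR over all validation samples and uncontrolled classes.

    validation:   list of {class name: ln posterior}, one dict per
                  validation spectrum of the controlled isomer
    controlled:   name of the controlled (true positive) class
    uncontrolled: names of the uncontrolled (false positive) classes
    Returns (lowest log LR, critical uncontrolled class).
    """
    worst = None
    for alternative in uncontrolled:
        for sample in validation:
            value = log_lr(sample[controlled], sample[alternative])
            if worst is None or value < worst[0]:
                worst = (value, alternative)
    return worst
```

The class returned alongside the lowest value identifies the most critical pair, which is the quantity used to compare instrumental techniques and data-processing methods.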
An overview of the most critical pair log LR-values for all three instrumental techniques and all three data processing methods is shown in Table 1. A complete overview with ratios for both the 2 vs 4 and 3 vs 4 isomeric pair is given in the Supplementary Data, Table S4. In most cases, the 3 vs 4 isomeric pair was the most critical pair, with the 2-isomer more distant from the others. A possible explanation is that the 2-isomer in general is less stable. Although not readily visible, this effect might also be reflected in the mass spectra due to different fragmentation routes. For the low energy spectra, this effect was less pronounced due to increased discrimination of the 3 vs 4 isomeric pair. The best results for the MEC-isomers were obtained for the 15 eV spectra with a log LR of up to 122. This is in line with the large visual differences in the mass spectra reported in paragraph 3.2. The expected additional benefits of the high mass resolution were not fully reflected in 'worst case' log LR-values for this group as they are for the MMC isomers, although the log LR-values in general (Table S4) were higher for the high resolution spectra. An explanation for this is the additional selectivity from isobaric ions, such as the ones shown in Fig. 4. This additional information is lost at nominal mass resolution. For both cathinone-classes, 15 eV ionization provides a clear benefit in isomeric selectivity. Contrary to the cathinones, the discrimination of fluoroamphetamine isomers does not benefit from 15 eV ionization nor from the high mass resolution. In this case, 70 eV mass spectra obtained on a conventional quadrupole-MS instrument and used as such already provided log LR-values of over 100 for processing method C. Another clear observation from these results is that the data preprocessing methods focusing on the higher m/z-values and excluding the most abundant lower m/z-values (i.e. methods B and C) provide much more selectivity for the 15 eV data for all groups of NPS-drugs.
Interestingly, when considering the full range of the mass spectra, the single quadrupole mass spectrometer shows better performance. However, in combination with smart mass selection, the added value of the new ion source in isomer differentiation is clearly demonstrated. A typical example of the PCA-LDA selectivity is shown in Fig. 7 as a 3D plot with ln posterior probabilities for all MMC-isomers and predictions from two case samples projected on it. In this plot, isolated groups of isomers can be seen. The large distance in the PCA-LDA space allows for unambiguous identification of the case samples.

Concentration effects and reproducibility
All results described under 3.3 in this study were obtained with samples analyzed at the same concentration. For case samples that have a known or estimated concentration, this is no problem, as samples can be prepared at the same concentration as the reference standards. However, in real-life forensic case samples, mixtures of multiple components at different concentrations occur. As concentration effects were observed in some PCA-plots, even after data normalization, this might negatively affect the robustness of this approach. Therefore, additional experiments were performed to determine the reproducibility and concentration dependency.

Table 1
Lowest observed log(LR)-values of PCA-LDA models from the comparison of the critical pair (values without an asterisk are 3 vs 4 isomers; values with an asterisk are 2 vs 4 isomers) at the given instrumental techniques and data-processing methods. All PCA-LDA models are based on 33 mass spectra, analyzed as set 1.

Fig. 7. PCA-LDA 3D prediction plot of 15 eV TOF spectra of the MMC-isomers, with projections of case samples 4 (identified as 3-MMC) and 9 (identified as 4-MMC). Method and data-processing: set 1, method C. All axis units are negative ln posterior probabilities.

As the results show promising PCA-LDA selectivity even for the 70 eV quadrupole-MS data, which is the current technique-of-choice in forensic drug-analysis laboratories worldwide [29,30], the additional robustness experiments were only performed on 70 eV quadrupole data. The concentration effect was investigated by analyzing a large set of reference samples, together with case samples, over a 10-fold concentration range. This range was found suitable as most case samples contain between 10 and 100% of active ingredient (very low active levels are currently only relevant for fentanyl isomers). Reference solutions of 100 µg/mL, 200 µg/mL, 500 µg/mL and 1000 µg/mL were analyzed in 12-fold with GC-MS for each individual isomer (set 2, 'concentration set'). As these GC-MS sequences consisted of 144 reference standards per set together with multiple case samples and blanks in between, an indication of the reproducibility over several days was also obtained. Mass spectral data was processed and PCA-LDA analyses with log LR calculations were performed in a similar fashion as described in paragraph 3.3. For all drug types, clearly separated groups were observed for each isomeric class. In line with earlier findings, the best selectivity was obtained for the fluoroamphetamine isomers, corresponding to a 'worst case' log LR of 61. The cathinones yielded worst case log LR-values of 11 and 32. The best results were obtained using processing method C (100-200 m/z selection). An interesting phenomenon is that, despite normalization, severe concentration effects resulting in linear shifts could be observed in the PCA-plots of the full mass spectrum (method A). With the other pre-processing methods, which exclude the low mass values, this concentration dependency almost disappeared. PCA-plots for all three processing methods on the FA concentration set are shown in Fig. S8.
Ion diagnostics for a specific isomeric form could be derived from the PCA-loadings. As expected, the ions that gave visual differences in the 15 eV mass spectra for MEC (i.e. 119 m/z, 148 m/z, 176 m/z and 191 m/z) also contributed strongly to the PCA-model. For the 70 eV fluoroamphetamine spectra, several low abundant fragment ions hardly visible in the full range mass spectrum appeared to be diagnostic for a certain isomeric form, such as 118 m/z and 117 m/z for 2-FA, 133 m/z for 3-FA and 69 m/z (next to the more abundant 109 m/z) for 4-FA. An overview of the most diagnostic ions per isomer selected from the PCA-loadings can be found in Table S5. A complete overview of the results, including explained variance, PCA-LDA accuracy, false positive and false negative rates and both mean and worst case log LR-values for all concentration sets can be found in Table S6 in the Supplementary Data.
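Deriving diagnostic ions from PCA loadings, as described above, amounts to ranking m/z channels by the absolute value of their loading on a discriminating component. The sketch below shows this on synthetic two-class data; the peak heights, the background level and the choice to inspect only the first component are assumptions for illustration and do not reproduce the selection behind Table S5.

```python
import numpy as np

rng = np.random.default_rng(2)
mz = np.arange(100, 201)

def simulate(intensities, n):
    """n synthetic unit-sum spectra with the given {m/z: intensity} peaks."""
    base = np.full(mz.size, 0.2)             # weak background on every channel
    for ion, height in intensities.items():
        base[mz == ion] = height
    x = base * rng.normal(1.0, 0.05, (n, mz.size))
    return x / x.sum(axis=1, keepdims=True)

# Hypothetical two-class example: m/z 119 is higher in class A, m/z 148 in class B.
spectra = np.vstack([
    simulate({119: 30.0, 148: 5.0, 176: 10.0}, 10),
    simulate({119: 5.0, 148: 30.0, 176: 10.0}, 10),
])

centred = spectra - spectra.mean(axis=0)     # mean-centring only, no autoscaling
_, _, vt = np.linalg.svd(centred, full_matrices=False)
pc1 = vt[0]                                  # loadings of the first component

order = np.argsort(np.abs(pc1))[::-1]        # largest absolute loading first
top_ions = [int(m) for m in mz[order[:2]]]
print(top_ions)
```

The ion at m/z 176 is equally abundant in both classes, so it receives a small loading even though it is a clear peak in each spectrum.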

Case samples
To further investigate the potential of this approach in the forensic laboratory, 8 actual case samples seized by the Amsterdam Police between 2017 and 2019 were analyzed. For each individual run, the observed mass spectrum corresponding to one of the NPS classes was extracted and processed. In this study it was decided to use the models on the 100-200 m/z part of the mass spectrum (method C). Although method B preprocessing gave slightly better results, method C was preferred as it is a more generic and thus easier method to apply in forensic casework laboratories. Mass spectra from case samples were projected on the reference PCA-model and Q-residuals and Hotelling's T² statistics were assessed to determine suitability. For all case samples at intensity levels between 200 and 1000 µg/mL, both Q-residuals and Hotelling's T²-values were below the 5%-limit for the model. Lower intensity samples (i.e. more diluted samples) returned Q-residuals around or above this limit. This is in line with the results observed for the lower concentration reference standards and is likely a result of the increased noise levels in the corresponding mass spectra. Q-residuals and Hotelling's T² plots for FA and MMC case samples are shown in Fig. S7. For all samples in this study, routine GC-MS analysis was inconclusive regarding the isomeric form, mainly due to MS match score criteria. FTIR analysis also could not provide a definite identification due to the presence of interfering substances. However, the identity of the case samples was known from earlier work on GC-VUV [7] and these findings were used to validate the predictions.
Results of the PCA-LDA prediction are given in Table 2. The correct isomeric form was identified in all individual samples, thus providing 100% correct classification. To provide additional information on method selectivity, log LR-values were calculated for the closest uncontrolled isomeric class based on the hypothesis set mentioned in 3.3.2. A higher absolute log LR serves as a stronger indication that the isomer assignment is correct and that the risk of a false-positive identification is low. In all cases, very convincing log LR-values, often above 50, were obtained. A remarkable observation for the FA-isomers is that at 70 eV the closest group to 4-FA is the 3-FA isomer, while at 15 eV 2-FA vs 4-FA is the critical pair. Although not directly visible in the mass spectra, this is probably caused by minor differences in fragmentation at 70 eV and 15 eV. In Table 2, log LR-values are given for the isomeric class closest to the controlled isomer class. For the 2-FA containing case sample 12, the log LRs are thus based on 2-FA vs 4-FA to verify that this sample is correctly identified as 'uncontrolled'. However, in this specific situation the closest class is not 4-FA but the 3-FA class, which is also uncontrolled. Dedicated LRs could be calculated for 2-FA vs 3-FA to determine whether the correct isomeric form is identified. These worst case log LRs are 209, 155, 69 and 25 for the 70 eV quadrupole concentration set, 70 eV quadrupole set, 15 eV 1 Da set and 15 eV 0.01 Da set, respectively.
Fig. 8 shows a 3D plot of the prediction values for the FA-isomers in the 70 eV quadrupole model based on reference standards at different concentrations. Predictions of case samples 10, 11 and 12 are projected unambiguously on the 4-FA, 4-FA and 2-FA group, respectively.
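The suitability check described above, in which a case-sample spectrum is projected on the reference PCA-model and its Q-residual and Hotelling's T² are compared with model limits, can be sketched as follows. The data are synthetic and the limits are taken as the empirical 95th percentile of the reference set, which is an assumption made here for simplicity; chemometric packages usually derive such limits from theoretical distributions, and the limit calculation used in this study is not specified in this section.

```python
import numpy as np

rng = np.random.default_rng(3)
n_ions = 101
template = rng.uniform(1.0, 10.0, n_ions)        # hypothetical reference spectrum
shapes = 0.2 * rng.normal(0.0, 1.0, (2, n_ions)) # two hypothetical variation patterns

def simulate(n, noise_sd):
    """n spectra: template plus two systematic patterns plus additive noise."""
    weights = rng.normal(0.0, 1.0, (n, 2))
    return template + weights @ shapes + rng.normal(0.0, noise_sd, (n, n_ions))

def fit_pca(x, n_components):
    """Mean-centred PCA; returns mean, loadings and the variance of each score."""
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    score_var = s[:n_components] ** 2 / (x.shape[0] - 1)
    return mean, vt[:n_components], score_var

def q_and_t2(x, mean, loadings, score_var):
    """Q = squared residual outside the model; T2 = scaled distance inside it."""
    centred = x - mean
    scores = centred @ loadings.T
    residual = centred - scores @ loadings
    return (residual ** 2).sum(axis=1), (scores ** 2 / score_var).sum(axis=1)

reference = simulate(40, noise_sd=0.05)
mean, loadings, score_var = fit_pca(reference, n_components=2)
q_ref, t2_ref = q_and_t2(reference, mean, loadings, score_var)

# Empirical 95th-percentile limits (an assumed stand-in for the 5%-limit).
q_limit, t2_limit = np.percentile(q_ref, 95), np.percentile(t2_ref, 95)

diluted = simulate(1, noise_sd=0.5)              # stand-in for a noisy, diluted sample
q_new, t2_new = q_and_t2(diluted, mean, loadings, score_var)
flag_q = bool(q_new[0] > q_limit)
print("Q above limit:", flag_q)
```

A high Q-residual indicates that the spectrum contains variation the model has not seen, such as noise or an interfering compound, whereas a high T² indicates an extreme position within the modelled variation.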

Discussion
The results show that distinctive mass spectra for ring-isomeric compounds could be obtained with low energy EI using a dedicated high efficiency source. However, the differences in these mass spectra mainly appear in lower abundant fragments (< 1%), as the low mass fragment resulting from alpha-cleavage of the amine moiety remains the dominant base peak. In this study, replicate analysis up to n = 17 for individual isomers gave highly similar spectra. These replicates were performed on samples at equal concentrations, analyzed in a short period of time. Long-term reproducibility was not investigated in this study. Optimization experiments did show some concentration dependency of the mass spectra despite normalization. This is in line with earlier findings from Lau et al. [14], who reported flow-rate and concentration based variation in low energy mass spectra. As with the low energy mass spectra, concentration and saturation effects were also observed at 70 eV and these effects were reflected in the PCA plots, both in this study and in prior work from other groups [22]. By applying PCA on only those parts of the mass spectrum that are not affected by concentration-induced saturation, selectivity significantly improves. Only one prior indication of such an approach on mass spectral data for isomeric differentiation was found in literature. Setser and Smith used an informed chemical approach to select specific indicator m/z-values for phenethylamine and tryptamine type NPS for subsequent use in LDA. However, their focus was on selecting only fragments resulting from the drug compound, thus excluding noise. No specific focus was set on ring-specific discrimination [24].
Davidson and Jackson [23] also gave special attention to the elimination of noise by selecting only the 15 most abundant ions. Interestingly, in our study low abundant ions were not excluded and data sets contained between 362 and 437 individual ions as features. No problems were caused by these ions as their contribution to the PCA-models was minimal. An explanation might be that autoscaling was not applied. Therefore, low abundant ions contribute to a lesser extent to the model. Autoscaling is suggested when data varies in range because of differences in units or scale. Mass fragmentation is a physical process and low intensity ions are the result of less favorable and possibly less reproducible fragmentation routes. Applying autoscaling on mass spectra will force unwanted emphasis on these ions. In our experience, autoscaling on mass spectra was found to worsen the performance of the PCA-models.
In this study, mass fragments selective for ring-isomer differentiation have been identified. A possible molecular formula for these fragments was proposed using the high-resolution MS data. An interesting next step in drug isomer identification using low energy MS spectra might be to better understand the ring-specific fragmentation mechanisms leading to the observed differences in ion intensities, in a similar fashion to what Davidson and Jackson [23] described for the NBOMe-isomers. This would add confidence by providing a fundamental basis for this approach.
A possible scenario in which the benefits of selective low energy EI spectra can be used in forensic drug analysis is as a rapid additional method on top of the regular 70 eV GC-MS screening that is the international standard for drug identification [29,30]. When this conventional GC-MS method identifies a synthetic drug compound known for various ring-isomeric forms (as indicated by several spectral matches above a certain threshold value), an additional injection of the same extract on the same instrument using the low energy EI option of the ion source could be performed. PCA-LDA models could then be applied on this data to identify the correct ring-isomer. In specific cases, this could eliminate the need for re-analysis on instrumentation like GC-FTIR or GC-VUV. Cases in which this approach could be beneficial are the more frequently occurring isomeric drugs, for which reference standards are readily available.
Thus far, this approach could be employed only on GC-Q/TOF instrumentation as the low energy option of the dedicated source is exclusively supported on this platform. However, this study demonstrated that applying PCA-LDA on 70 eV mass spectra also provides isomeric differentiation. Bonetti was the first to report this phenomenon, showing similar results using PCA-LDA models on a different class of NPS-isomers [22]. In addition to these findings, our study also provides a method to quantify isomeric selectivity and shows that this selectivity can be enhanced dramatically by selecting only specific parts of the mass spectrum for data analysis. As the impressive selectivity resulting from conventional 70 eV quadrupole mass spectra was initially unexpected, a closer inspection of the mass spectra was performed (Fig. S9) with specific focus on the 'discriminating 15 eV' ions. For the most part, this also revealed minor isomeric differences in the 70 eV mass spectra, yet much less pronounced than those observed in the low energy EI spectra.
The observation that, with the right statistical data analysis, NPS isomer differentiation is feasible using standard GC-MS methodology (without additional spectroscopic analyses) could be very beneficial for forensic case laboratories that have to deal with high caseloads and a limited budget for equipment investments. However, a limitation of the current study and the work by other groups is the use of a limited number of GC-MS instruments over a relatively short period of time. Therefore, to apply this approach on actual case samples, back-to-back analysis of reference standards and case samples needs to be performed and PCA-LDA models need to be created for every sequence. Even though performing PCA-LDA on mass spectra is a fast, reliable and repeatable procedure using current chemometric software platforms and automated data processing scripts, a validated model that could be used as a fixed basis might reduce complexity even further. Such a model could possibly be built and shared internationally, involving numerous drug analysis laboratories.
As a next step towards practical implementation of chemometric approaches in forensic drug analysis, more information on inter-system reproducibility and long-term stability of mass spectra is therefore needed. In addition, further practical experience and validation of the log LR-values and reproducibility of the chemometric models might provide certain threshold limits for Q-residual values and log LR-values that could serve as a criterion for definite identification. If these limits are not met, this could trigger the forensic drug expert to conduct additional chemical analyses to reveal the true nature of the sample. Also, equal prior probabilities were assumed as no reliable information about the occurrence of individual isomers was available. Training sets therefore contain an equal number of samples for each isomer. When prior knowledge about occurrence in forensic cases becomes available, this could also be used to improve the training set design and the corresponding PCA and LDA models.
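The argument about autoscaling can be made concrete with a small numerical sketch. It uses synthetic data with two abundant, informative ions and fifty low abundant ions that carry only noise (all values are hypothetical), and compares how much of the total variance seen by PCA comes from the noise ions after mean-centring alone and after autoscaling.

```python
import numpy as np

rng = np.random.default_rng(1)
n_spectra, n_noise_ions = 30, 50

# Hypothetical data: two informative ions that differ between two isomer groups,
# plus low abundant ions that only carry noise.
group = np.repeat([0, 1], n_spectra // 2)
informative = np.column_stack([
    100.0 + 20.0 * group + rng.normal(0.0, 1.0, n_spectra),
    80.0 - 20.0 * group + rng.normal(0.0, 1.0, n_spectra),
])
noise_ions = np.abs(rng.normal(0.5, 0.2, (n_spectra, n_noise_ions)))
spectra = np.hstack([informative, noise_ions])

def noise_share(x):
    """Fraction of the total variance carried by the noise-only ions."""
    var = x.var(axis=0, ddof=1)
    return var[2:].sum() / var.sum()

centred = spectra - spectra.mean(axis=0)              # mean-centring only
autoscaled = centred / spectra.std(axis=0, ddof=1)    # unit variance per ion

share_centred = noise_share(centred)
share_autoscaled = noise_share(autoscaled)
print(f"noise share, centred only: {share_centred:.3f}")
print(f"noise share, autoscaled:   {share_autoscaled:.3f}")
```

After autoscaling every ion has unit variance, so the noise ions carry 50 of the 52 units of variance regardless of their abundance, whereas with mean-centring alone the two informative ions dominate.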
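The role of the equal-prior assumption can be made explicit with Bayes' rule in odds form. The notation below (hypotheses H_1 and H_2 for two isomeric classes, x for the processed mass spectrum) is introduced here for illustration and is an assumption; it is not necessarily the formulation used in paragraph 3.3.2.

```latex
\frac{P(H_1 \mid x)}{P(H_2 \mid x)}
  = \underbrace{\frac{p(x \mid H_1)}{p(x \mid H_2)}}_{\mathrm{LR}}
    \times \frac{P(H_1)}{P(H_2)},
\qquad
P(H_1) = P(H_2) \;\Rightarrow\;
\log_{10}\mathrm{LR} = \log_{10} P(H_1 \mid x) - \log_{10} P(H_2 \mid x).
```

With unequal priors, the ratio of posterior probabilities would no longer equal the LR, and the prior odds would have to be divided out before the value is reported as a likelihood ratio.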

Conclusions
Table 2
GC-MS identification of drug isomers in case samples using PCA-LDA models on regular 70 eV EI mass spectra (100-200 m/z). The chemical identity was established from prior FTIR or GC-VUV analyses.

Low energy EI provides additional selectivity for ring-isomeric drug differentiation in routine GC-MS analysis by means of softer ionization and more information-rich mass spectra. Low energy 15 eV mass spectra acquired with a dedicated low energy EI source were clearly distinctive for two ring-isomeric sets of cathinone-type drugs. For fluoroamphetamines the benefit of low energy EI was less pronounced. The more selective mass spectra could be used for direct identification purposes, but could also serve as a tool to reveal fragment-ion ratios selective for ring-isomeric differentiation. PCA-LDA modeling of mass spectral data from both 15 eV and 70 eV mass spectra enabled the correct identification of the ring-isomeric form of MMC, MEC and FA-classes of drugs in case samples. Log LR-values calculated from the LDA posterior probabilities are objective and quantitative indicators for model selectivity. Using both this LR approach and prior knowledge about the selective parts of the mass spectrum, PCA-LDA data processing could be optimized and results from both 70 eV and 15 eV could be compared.
For the cathinones, the highest isomeric differentiation was obtained from 15 eV mass spectra using high resolution TOF-MS, as both 15 eV ionization and the high resolution contribute to selectivity. For the fluoroamphetamines, these features provided little added value, as 70 eV quadrupole spectra already gave convincing selectivity with worst case log LR-values of over 75.
In all cases studied, the PCA-LDA models based on the selective 100-200 m/z part of the mass spectrum yielded correct and selective classification, even for conventional 70 eV quadrupole MS spectra. The successful application of this approach was demonstrated for six forensic case samples using PCA-LDA models based on reference standards at multiple concentrations. This resulted in 100% correct classification with absolute log LR-values ranging from 9 up to over 100, meaning that the mass spectra fit considerably better to the reported class than to the adjacent, next most likely isomeric class. These findings show that chemometric models are a powerful analytical tool for isomeric drug differentiation, even on conventional 70 eV quadrupole GC-MS systems. In addition, low energy ionization could further enhance selectivity and provide diagnostic information with respect to mass spectral selectivity for more accurate characterization of NPS.

Declaration of Competing Interest
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Daniela Peroni and Sander Affourtit are employees of JSB Benelux, a value-added reseller of Agilent Technologies.

Fig. 8. PCA-LDA 3D prediction plot of 70 eV quadrupole spectra of the FA-isomers, with projection of case samples 10, 11 (identified as 4-FA) and 12 (identified as 2-FA). Method and data-processing: set 2 (concentration set, 100-1000 µg/mL), method C (100-200 m/z only). All axis units are negative ln posterior probabilities.

Fig. 1. Molecular structures of the compounds used in this study; compounds underlined and in red are controlled substances in The Netherlands.

Fig. 2. Fragmentation of 4-MEC in the 100-200 m/z part of the mass spectrum, analyzed on the low energy source at given energies.

Fig. 3. Mass spectra (m/z 100 to m/z 200) of the MEC-isomers and an unknown case sample clearly matching 4-MEC. All spectra were obtained with the low energy EI source at 15 eV on Q-TOF MS.