Urine Fingerprints of Stanozolol Treated Horses by Liquid Chromatography High Resolution Mass Spectrometry

Current detection methods for stanozolol are all based on a targeted approach. The present study aimed to assess the global biological effect of stanozolol treatment by means of chemometric models, after generating and comparing LC-HRMS urine fingerprints from control and stanozolol-treated horses. The animal study was conducted according to an ethically approved protocol at two sites in France: Chamberet and Coye la Forêt. The animal phase lasted seven months and only females were selected. The sixteen mares in this study were not actively racing, but were in good physical condition. SIMCA-P+ software and the free R software environment were used for multivariate data analysis. Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures-Discriminant Analysis (OPLS-DA) were applied to build descriptive and predictive models. The horse urine fingerprints, based on the 220 features selected after suppression of confounding factors, show changes in metabolic state after chronic stanozolol treatment. This proof-of-concept study confirms the power of the untargeted approach in doping control, since the changes are present over seven months after anabolic administration.


Introduction
Stanozolol is a synthetic derivative of testosterone in which the androgenic effects of the hormone are minimized, leaving the anabolic action unchanged, in order to improve its handling and tolerability even during prolonged treatment. Attempts have been made to obtain this decoupling between the androgenic and anabolic effect in more than one hundred testosterone derivatives [1]. However, of these, stanozolol stands out for its more favorable relationship between the two effects, with a more pronounced anabolic action that dominates the androgenic action (Table 1) [2].

Table 1: Anabolic/androgenic activity ratio (columns: generic name; relative biological activity).
The therapeutic indications of anabolic steroids therefore include muscular disorders (hypotonia, hypertrophy), fracture consolidation difficulties, bone marrow demineralization (osteoporosis), protein-dispersing pathologies (nephropathies), anaemias, growth disorders and, in skin diseases, promotion of tissue growth or stimulation of tissue repair, an action that is useful in inflammatory and degenerative diseases of the skin. Specifically, stanozolol is used in veterinary medicine to increase appetite, cause weight gain, and treat certain types of anaemia. Its use has recently been reported to improve the condition of cartilaginous tissue by increasing the production of collagen and other fundamental protein substances of the cartilage matrix. Stanozolol has also been identified as an effective therapeutic agent for tracheal collapse in dogs [3].
It has become one of the most commonly used anabolic steroids in the horse racing industry, and is classified as a class III drug by the International Association of Racing Commissioners. Recent reviews have discussed in great detail its therapeutic uses and its potential as a performance enhancer in the horse [7,8]. Although the use of stanozolol may be medically warranted for short periods, long-term use is associated with adverse side effects, particularly on the reproductive and musculoskeletal systems [7,9-13]. In addition, chronic use of stanozolol may induce neurochemical alterations centrally involved in depression and stress-related states [14].
Unlike most injectable anabolic steroids, stanozolol with its pyrazole ring cannot be esterified and is sold as an aqueous suspension, or in oral tablet form. The drug has a high oral bioavailability, due to a C17 α-alkylation which allows the hormone to survive first-pass liver metabolism when ingested. It is because of this that stanozolol is also sold in tablet form.
Stanozolol is rapidly metabolized in humans and animals to mono- and dihydroxylated metabolites that are mainly present in glucuronide form. The most abundant metabolites identified in human and animal urine are 16-OH stanozolol, 3'-OH stanozolol and 4-OH stanozolol. 3'-OH stanozolol was the main metabolite used in routine detection methods for human urine. 16β-OH stanozolol was the main metabolite after administration to cattle. It should also be noted that the identity of the metabolites can differ depending on the route of administration, oral or subcutaneous. Several years ago, the authors confirmed 15 metabolites of stanozolol in human urine and 4 more in the chimeric uPA-SCID mouse [15].
However, Wang et al., in their 2017 study seeking to extend the detection time of stanozolol through its unknown long-term metabolites in human urine, discovered 48 metabolites in total. In the horse, urinary metabolites of stanozolol have been investigated by atmospheric pressure chemical ionization (APCI) triple-quadrupole LC-MS [16] and were found to be hydroxylated at C3, C4, C6 and C16 following oral administration. 16β-hydroxystanozolol was established as a major equine urinary metabolite of stanozolol following administration by intramuscular injection. In addition, two other metabolites with additional hydroxylation were tentatively proposed by McKinney et al. [6] as 16α-hydroxystanozolol and a 15α or -hydroxystanozolol (Figure 1).
Knowledge of the metabolic transformation of stanozolol is necessary for the development of efficient analytical methods for identifying the parent drug and/or its metabolic products. For example, stanozolol behaves differently from other anabolic steroids, which are mainly analyzed by GC/MS. The need for derivatization when using GC-MS was a negative factor in the detection of 16β-OH stanozolol [17], but it can be sufficient for the detection of 3'-hydroxystanozolol [18]. The stanozolol metabolites were detected more effectively using LC/MS. Van de Wiele et al. reported the optimization of the detection of stanozolol and its major metabolite 16β-OH stanozolol in faeces and urine from cattle by LC-MS. The authors discussed two different LC-MS-MS detection methods: a first approach in ESI mode, in which the final extract was detected without derivatization, and a second approach in APCI mode, which included a derivatization step with phenylboronic acid (PBA) for 16β-OH stanozolol. Each approach has its advantages and drawbacks. LC-MS-MS with APCI in positive mode was also applied in [19], where the protonated molecules [M+H]+ at m/z 329 for stanozolol and m/z 345 for 16β-OH stanozolol were the precursor ions for collision-induced dissociation (CID) in selected reaction monitoring (SRM). The presence of interferences was consistently reported, but the problem was finally resolved by means of LC-FAIMS-MS/MS [20].
This short overview presents the current detection methods for stanozolol, all of which are based on a targeted approach. Nevertheless, given the limits of the targeted approach [21] and the confirmed potential of the untargeted approach [22], the present study aimed to assess the global biological effect of stanozolol treatment by means of chemometric models, after generating and comparing LC-HRMS urine fingerprints from control and stanozolol-treated horses. Furthermore, the possibility of extending the detection window of stanozolol by means of a metabolomics approach is considered.

Experimental design and sampling strategies
Experimental design: The animal study was conducted according to an ethically approved protocol at two different places: Chamberet and Coye la Forêt. The total duration of the animal phase was seven months and only females were selected to partake. The mares in this study were not actively racing horses, but were in good physical condition.
Fourteen 4-year-old Anglo-Arab females, weighing 520 ± 60 kg, were involved in the study conducted in Chamberet. At this site, the horses were sheltered from mid-November until mid-April and were at pasture for the rest of the year. During the winter period, these horses were fed twice per day with hay and manufactured feed.
The study conducted at Coye la Forêt involved two 6-year-old thoroughbred females, weighing 450 kg and 550 kg. These horses stayed in boxes bedded with straw during the entire experiment. They had the same diet throughout: hay and manufactured feed twice per day. The horses were moderately exercised for about one hour each day. Water was provided ad libitum.
Animal administration: The in-house preparation (9 mL) of stanozolol for chronic intramuscular administration consisted of stanozolol dissolved in a mixture of sesame oil/isoamyl alcohol (7/2). The vehicle alone was injected into 5 control horses. The stanozolol treatment consisted of 4 injections, one every 4 days, at 0.12 mg/kg or 0.31 mg/kg per injection. The low-dose treatment (0.48 mg/kg in total) was given to five mares located in Chamberet. The high-dose treatment (1.24 mg/kg in total) was given to six mares (five in Chamberet and one in Coye la Forêt). Since the experiment took place at the beginning of March, all mares were housed in boxes during drug administration.
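The dosing arithmetic above can be checked with a short sketch. The per-injection doses, the number of injections and the group sizes come from the text; the function names and the 520 kg example body weight (the Chamberet group mean) are illustrative choices, not part of the study protocol.

```python
# Dosing figures taken from the text; names and the example weight are illustrative.
N_INJECTIONS = 4
DOSE_PER_INJECTION_MG_PER_KG = {"low": 0.12, "high": 0.31}
GROUP_SIZES = {"control": 5, "low": 5, "high": 6}


def cumulative_dose(dose_mg_per_kg, n_injections=N_INJECTIONS):
    """Total dose per kg of body weight over the whole treatment."""
    return round(dose_mg_per_kg * n_injections, 2)


def absolute_dose_mg(dose_mg_per_kg, body_weight_kg):
    """Mass of stanozolol (mg) delivered to one horse by one injection."""
    return dose_mg_per_kg * body_weight_kg


# Cumulative dose (mg/kg) for each treatment arm.
totals = {arm: cumulative_dose(d) for arm, d in DOSE_PER_INJECTION_MG_PER_KG.items()}

# One high-dose injection for a horse of 520 kg (example weight only).
example_injection_mg = absolute_dose_mg(DOSE_PER_INJECTION_MG_PER_KG["high"], 520)
```

The group sizes sum to the sixteen mares enrolled in the study.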
Sample collection: Sampling consisted of collecting spontaneously voided urine.
In order to assess seasonal and other environmental factors that influence metabolomic profiles, samples were collected over one year. Urine samples (about 200 mL) were collected from all mares every 2 weeks, in the morning at the same time. The pH of each sample was measured (8-9) before the sample was frozen and stored at −20°C until required for analysis. Every individual veterinary treatment (e.g., antibiotic) or other special event was recorded.
Urine samples were collected before the beginning of the administration. During the treatment, urine samples were collected before each administration. After the end of the treatment, urine samples were collected every 4 days for 16 days then once per week for 7 months. As previously mentioned, pH was measured and samples stored at −20°C.
Sample preparation: 400 μL of each urine sample was centrifuged at 12 000 rpm for 30 min at 20°C; a 200 μL aliquot of the supernatant was collected, and a 50 μL aliquot of each sample was pooled to obtain a representative quality control (QC) sample. Imipramine (5 μg·mL⁻¹) was added as internal standard to the biological samples; metformin, amiloride, imipramine, prednisone, colchicine and 2-aminoanthracene (5 μg·mL⁻¹) were added to the QC samples, and the samples were vortexed. 1 mL of acetonitrile was added and the supernatant was transferred to tubes and evaporated to dryness at 60°C. Finally, 200 μL of an acetonitrile/water mixture (50/50; v/v) was added and the solution was transferred to LC vials.

Analytical platforms
Liquid chromatography: For the MS fingerprinting analysis, chromatographic separation was performed with an Ultimate 3000 (Dionex, Sunnyvale, USA) pump on a reversed-phase Uptisphere Strategy C18 NEC column (2.1 mm × 100 mm, 2.2 μm particle size, Interchim). The analytes were eluted by a 25-min gradient, which started at 100% A (water + 0.1% formic acid) for 2 min, changed to 100% B (acetonitrile + 0.1% formic acid) over 18 min, was maintained at 100% B for 5 min, and then returned to the initial conditions for a 2-min equilibration. The flow rate was 0.25 mL·min⁻¹ and the column temperature was 25°C. The autosampler was set at 4°C for the duration of the analysis. 15 μL of each sample was injected.
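The elution program can be written as a piecewise function of time, which makes the timing of each segment explicit. This is a minimal sketch: the segment durations are those given above, while the linear shape of the ramp and the instantaneous return to initial conditions are assumptions, since the text does not specify them.

```python
def percent_b(t_min):
    """Percentage of mobile phase B (acetonitrile + 0.1% formic acid) at time t_min."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 2.0:
        return 0.0                               # initial hold at 100% A
    if t_min <= 20.0:
        return 100.0 * (t_min - 2.0) / 18.0      # ramp to 100% B over 18 min (assumed linear)
    if t_min <= 25.0:
        return 100.0                             # hold at 100% B for 5 min
    return 0.0                                   # re-equilibration at initial conditions


# Composition at a few time points along the run.
profile = {t: percent_b(t) for t in (0, 2, 11, 20, 25, 26)}
```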
ESI-HRMS: High-resolution mass fingerprints were acquired on a quadrupole-time of flight analyzer (MicroToF Q II, Bruker, Bremen, Germany) in positive ESI mode. The mass spectrometer parameters, namely capillary voltage, capillary temperature, nebulizer gas pressure and dry gas flow, were set as follows: -4.5 kV, 180°C, 2.4 bar and 8 L·min⁻¹, respectively. External mass calibration of the instrument was performed using a lithium cluster solution (16 mM lithium formate in isopropanol/water) introduced at the beginning of the chromatographic gradient using a divert valve and a separate pump. Mass accuracy of the m/z calibration standard was below 3 ppm for positive mode and below 2 ppm for negative mode. Centroid mass spectra were acquired in the m/z 50-1000 range. Hystar (Bruker) software was used for system control and data acquisition.

Data pre-processing:
Data files generated by LC-HRMS analysis were converted to the more exchangeable NetCDF format (.cdf) using a conversion function of the Data Analysis software program (Bruker). The converted data were imported into the open-source XCMS software, implemented in the R statistical language, for subsequent data processing in several steps: peak picking, peak grouping and retention time alignment [23]. The XCMS matched filter algorithm was used with default values for all parameters except fwhm, step, steps, mzdiff, mzwid and minfrac, which were set to 10, 0.1, 5, 0.1, 0.1 and 0.6, respectively, for both group functions. Features extracted from all samples were combined in a single dataset with the following characteristics: exact mass, retention time and peak intensity.
This dataset was then annotated using the spectral database developed at CEA-Saclay [24], which contained over 400 compounds at the time of this study, together with bioinformatics tools that automatically query public metabolic and metabolomic databases with the measured accurate masses ± 20 ppm.
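The ± 20 ppm query window used for annotation can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the two reference m/z values are illustrative entries, not records from the CEA-Saclay database or from any tool used in the study.

```python
def ppm_error(measured_mz, reference_mz):
    """Signed mass error of a measured m/z relative to a reference m/z, in ppm."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def mz_window(reference_mz, tol_ppm=20.0):
    """Lower and upper m/z bounds of a +/- tol_ppm window around a reference m/z."""
    delta = reference_mz * tol_ppm * 1e-6
    return reference_mz - delta, reference_mz + delta


def match_candidates(measured_mz, database, tol_ppm=20.0):
    """Names of database entries whose m/z lies within tol_ppm of the measured value."""
    return [name for name, mz in database.items()
            if abs(ppm_error(measured_mz, mz)) <= tol_ppm]


# Illustrative reference list (hypothetical entries for demonstration only).
toy_database = {"compound_A": 329.2587, "compound_B": 345.2537}
hits = match_candidates(329.2600, toy_database)
low, high = mz_window(500.0)
```

At m/z 500, a 20 ppm tolerance corresponds to a window of about ± 0.01, so the number of candidate annotations per feature grows with the tolerance and with the size of the database queried.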

Statistical analysis
SIMCA-P+ (v. 12.0, Umetrics, Sweden) software and the free R software environment (http://www.r-project.org/) were used for multivariate data analysis. Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures-Discriminant Analysis (OPLS-DA) were applied to build descriptive and predictive models. The m/z peaks constituting the mass fingerprints (i.e., pairs of chromatographic retention time and m/z ratio) were considered as independent variables. All variables were UV scaled (i.e., centered and divided by the standard deviation) prior to multivariate analyses.
The horse urine fingerprints were obtained through an unbiased sample preparation method previously optimized on a "non-targeted approach" basis [25]. The analytical information acquired from the profiles is transformed by the XCMS software into coordinates based on mass, retention time and response intensity. These coordinates are then compared across metabolic patterns with the SIMCA statistical software to highlight dissimilarities. However, before relevant differences can be concluded, the successful use of metabolomics methodology depends on a number of critical analytical and statistical issues. One aspect is the stability of the analytical measurements. Retention time fluctuation in the chromatogram and MS signal variation were assessed by monitoring QC samples, prepared from pooled biological samples and spiked with several internal standards chosen for their different chemical structures, molecular masses or polarities. QC samples were injected repeatedly (every ten runs) throughout the analytical batch, so that any system variability or source fouling could be monitored.
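The UV scaling and PCA steps described in this section can be sketched with NumPy. The study itself used SIMCA-P+ and R; the code below is only an illustration of the underlying operations, applied to a random toy matrix rather than real fingerprints, and the function names are hypothetical.

```python
import numpy as np


def uv_scale(X):
    """Unit-variance scaling: centre each feature and divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0          # constant features simply become zero after centring
    return (X - X.mean(axis=0)) / std


def pca(X_scaled, n_components=2):
    """PCA by singular value decomposition of an already centred/scaled matrix."""
    U, S, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates (score plot)
    loadings = Vt[:n_components].T                    # feature contributions
    explained = (S ** 2) / np.sum(S ** 2)             # fraction of variance per component
    return scores, loadings, explained[:n_components]


# Toy intensity table: 12 samples x 30 features (random numbers, not real data).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=5.0, sigma=1.0, size=(12, 30))
X_uv = uv_scale(X)
scores, loadings, explained = pca(X_uv, n_components=2)
```

Plotting the two columns of `scores` against each other gives the kind of score plot that is inspected visually for clustering.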
To follow injection reproducibility over time, all samples were spiked with imipramine as internal standard, selected for its stable response in positive ionization mode. The stability of the signal intensity throughout the analytical batch was improved by several consecutive injections of the QC sample (ten injections in the present study) before the biological samples, at the beginning of the acquisition sequence.
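One common way to summarize the reproducibility of an internal standard across injections is the relative standard deviation (RSD) of its peak area, together with the spread of its retention time. The text does not state which statistics were actually used, so the sketch below is an assumption, and the numbers are hypothetical values, not measurements from the study.

```python
import numpy as np


def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean), in percent."""
    v = np.asarray(values, dtype=float)
    return float(100.0 * v.std(ddof=1) / v.mean())


def retention_time_spread(rt_minutes):
    """Difference between the latest and earliest retention time, in minutes."""
    rt = np.asarray(rt_minutes, dtype=float)
    return float(rt.max() - rt.min())


# Hypothetical imipramine peak areas and retention times in successive QC injections.
qc_areas = [90.0, 100.0, 110.0]
qc_rt = [10.02, 10.05, 10.03]
area_rsd = rsd_percent(qc_areas)
rt_spread = retention_time_spread(qc_rt)
```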
After the validation of this stage of the workflow, the XCMS output of aligned data, in an Excel-type format, was subjected to an additional filtering step, since the number of extracted ions was very large (5649, 2382 and 3582 for the three analytical experiments, respectively) and it has already been demonstrated for LC/ESI/MS experiments that about 3/4 of ionized molecules derive from contaminants, chemical noise or the analytical system. To eliminate signals that occur in both blank and biological samples at similar intensity and to select only analytically relevant features, a minimum biological-to-blank intensity ratio of 5:1 was applied. Another criterion was based on serial dilutions of QC samples (1/2; 1/4; 1/8; 1/16): only the features whose intensity levels correlate with the dilution factor (absolute correlation value above 0.5) were retained. Final data sets with 1935, 1222 and 1147 ions for the first, second and third analytical experiments, respectively, were thus submitted to multivariate statistical analysis.
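The two filtering criteria can be sketched as boolean masks over the feature table. The 5:1 ratio, the dilution series and the 0.5 correlation threshold come from the text; comparing mean intensities and using the Pearson coefficient are assumptions, as is every name in the code. The intensities are toy values.

```python
import numpy as np


def blank_filter(sample_intensity, blank_intensity, min_ratio=5.0):
    """Keep features whose mean sample intensity is at least min_ratio x the mean blank."""
    s = np.asarray(sample_intensity, dtype=float).mean(axis=0)
    b = np.asarray(blank_intensity, dtype=float).mean(axis=0)
    return s >= min_ratio * b


def dilution_filter(qc_dilution_intensity, dilution_factors, min_abs_r=0.5):
    """Keep features whose intensity correlates (Pearson) with the QC dilution factor."""
    Y = np.asarray(qc_dilution_intensity, dtype=float)   # rows: dilutions, cols: features
    x = np.asarray(dilution_factors, dtype=float)
    xc = x - x.mean()
    Yc = Y - Y.mean(axis=0)
    denom = np.sqrt((xc ** 2).sum() * (Yc ** 2).sum(axis=0))
    # Features with constant intensity have no defined correlation; treat it as 0.
    r = np.divide(xc @ Yc, denom, out=np.zeros(Y.shape[1]), where=denom > 0)
    return np.abs(r) > min_abs_r


# Toy data for three features (columns); rows are injections.
samples = [[1000.0, 50.0, 300.0], [1200.0, 60.0, 280.0]]
blanks = [[100.0, 40.0, 10.0]]
factors = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
qc_dilutions = [[500.0, 55.0, 20.0],
                [250.0, 55.0, 30.0],
                [125.0, 55.0, 10.0],
                [62.5, 55.0, 25.0]]

keep_blank = blank_filter(samples, blanks)
keep_dilution = dilution_filter(qc_dilutions, factors)
keep = keep_blank & keep_dilution
```

Only features passing both masks would be carried forward to the multivariate analysis.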

Multivariate data analysis (unsupervised and supervised)
To obtain a large number of urine fingerprints from the stanozolol-treated population, two experiments were carried out over several months, with the aim of analyzing as many samples as possible after stanozolol administration. The first investigation encompassed samples from all horses in the two experimental centers that participated in this study (Chamberet and Coye la Forêt), with both dose levels (0.48 and 1.24 mg/kg in total) and over a wide time range (from T0, before administration, until T36, more than seven months after administration). The second one included more sampling times but over a shorter period after administration (about three and a half months after treatment) and with no low-dose administration. In this batch, one 16β-hydroxystanozolol positive case from our routine screening was also added to the experimental samples. The datasets were imported into the SIMCA software and PCA was performed as a first, global approach. A qualitative visual inspection of the clustering patterns in the PCA score plots showed similar results for both analyses. As already underlined in a previous study [26], the sample data for different metabolic states are clearly separated into distinct clusters according to certain environmental factors. Along the 1st principal component, which accounts for the greatest possible variance in the data set, there is a clear separation between samples collected from horses at pasture and horses housed in boxes (Figure 2A). The 2nd principal component distinguishes samples from the two experimental centers (Figure 2B). In the second analytical batch, besides these two factors, namely the different nutrition regimens and geographical origin (detailed in the previous section), one more group of samples, collected two months after stanozolol administration (T13), behaves as outliers. A possible explanation is that the horses' metabolome was strongly affected by environmental changes.
Indeed, the T13 urine samples were collected just a few days after the horses moved from boxes to pasture (Figure 2C). Once the horses had adapted, their metabolic states were similar at all other sampling times from the "pasture" period (T16, T19, T20, T36). Visual inspection of the PCA showed that current environmental conditions influence the urine metabolome more than stanozolol treatment, whose effect is imperceptible in the PCA scatter plots. To extract and explore the biological variation due to stanozolol administration, it is necessary to remove, or at least reduce, these confounding factors, which probably mask very discreet metabolic changes after stanozolol treatment. To highlight and then remove the variables responsible for the discrimination of the two populations ("box" vs. "pasture"; "Chamberet" vs. "Coye la Forêt"), supervised OPLS-DA was applied.
This multivariate regression method is used principally to separate systematic variation (e.g., batch order, drift in the system) from the variation related to specific responses (e.g., drug intake). In addition, with its S-plot function and VIP ion list, OPLS-DA permits the selection of the features that drive the discrimination. The first visualization tool, the so-called "S-plot", displays the contribution of each ion to the model and highlights the ions most correlated with the first discriminant component. The second tool is the VIP (Variable Importance in Projection). Features with a VIP score larger than 1 are considered statistically relevant for the discrimination between groups and could thus be taken into account. This step yielded a PCA model devoid of the confounding factors mentioned above, based on 614 and 546 features for the two analytical batches, respectively (Figure 3).
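For reference, the VIP score is commonly defined for PLS-type models as shown below. This is the textbook PLS form, given here as an aid to reading; the exact variant computed by SIMCA for OPLS-DA models may differ in detail, and the symbols are introduced for this illustration only.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\,\frac{p \,\sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert w_a \rVert \right)^{2}}{\sum_{a=1}^{A} SS_a}\,}
```

Here $p$ is the number of features, $A$ the number of model components, $w_{ja}$ the weight of feature $j$ in component $a$, and $SS_a$ the sum of squares of Y explained by component $a$. Because $\sum_j (w_{ja}/\lVert w_a \rVert)^2 = 1$ for every component, the mean of $\mathrm{VIP}_j^2$ over all features equals 1, which is why a score above 1 is read as above-average importance.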
Step-by-step removal of confounding-factor variables: (A) Model of the 1st PCA based on MS intensities of 1935 ions. The first discriminant axis corresponds to the environmental factor and the second one is related to the geographical difference between the 2 experimental centers; (B) Model of the 2nd PCA based on MS intensities of 1168 ions, after removal of the variables responsible for the discrimination between horses living at pasture and horses living in boxes. The most important principal component is the separation between the 2 locations, Coye la Forêt and Chamberet; (C) Model of the 3rd PCA based on MS intensities of 614 ions, after removal of the variables responsible for the discrimination between horses living in Coye la Forêt and horses living in Chamberet. (T0, T2, T8, T16, T36)
However, the principal advantage of supervised OPLS-DA (or PLS-DA) over PCA is its use of the knowledge of class membership. Discriminant analyses are well suited to treated vs. untreated classification, and OPLS-DA maximizes the covariance between X, the matrix constituting the fingerprints, and Y, the class assignment. OPLS-DA also shows the differences between preselected classes. The fraction of the variation of the Y variables "explained" by the selected components (R2Y), along with the fraction of the variation of the Y variables that can be "predicted" by a component according to cross-validation (Q2Y), is calculated to plot and validate the model. The two main significant components are orthogonal and were selected so that most of the association with the dummy Y variables can be explained by the variation in X. High values of R2Y and Q2Y indicate good discrimination. Both models demonstrated excellent separation between urines collected from control and stanozolol-treated animals (Figure 4).
Orthogonal Partial Least Squares (OPLS) was applied to build descriptive and predictive models. Models built with OPLS attempt to explain a Y variable, which in the present study represents the animal status, stanozolol-treated (red) or control (green), from the X matrix of all the features constituting the fingerprint.
The robustness of the models was then investigated in several steps. First, a cross-validation was performed. It consists of building a new OPLS model on the basis of 2/3 of the original dataset. The remaining 1/3 of the dataset is used as a validation set and is submitted to the model for prediction.
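The two-thirds/one-third validation scheme and the R2Y/Q2Y-type statistics it yields can be sketched as follows. The model itself (OPLS-DA in SIMCA) is not reproduced: the fitted and predicted values below are made-up numbers standing in for model output, and the function names are hypothetical.

```python
import numpy as np


def split_two_thirds(n_samples, seed=0):
    """Random split of sample indices into a 2/3 training set and a 1/3 validation set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = (2 * n_samples) // 3
    return np.sort(order[:n_train]), np.sort(order[n_train:])


def explained_fraction(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares about the mean of y_true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


train_idx, valid_idx = split_two_thirds(16)

# Dummy class variable: 0 = control, 1 = stanozolol-treated.
y = [0, 0, 0, 1, 1, 1]
y_fitted = [0.1, 0.0, 0.1, 0.9, 1.0, 0.9]      # values for samples used to fit the model
y_predicted = [0.3, 0.2, 0.4, 0.6, 0.8, 0.7]   # values for samples held out of the fit

r2y = explained_fraction(y, y_fitted)       # goodness of fit
q2y = explained_fraction(y, y_predicted)    # goodness of prediction
```

The prediction statistic is normally lower than the fit statistic; a large gap between the two is a warning sign of overfitting.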
The results were satisfactory, since they show that reliable descriptive models can be built (R2(Y)=0.935; Q2(Y)=0.789 for the first analytical batch and R2(Y)=0.906; Q2(Y)=0.631 for the second one) in which the control and stanozolol-treated populations are discriminated. In addition, permutation tests, in which the status of the animals (non-treated vs. stanozolol-treated) was randomly reallocated, were performed to ensure that these results are not due to chance.
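The principle of the permutation test can be illustrated with a simple stand-in statistic. In SIMCA the permuted models are compared through their R2Y and Q2Y values; here, as an assumption made to keep the example self-contained, the statistic is the distance between class centroids, and the data are simulated.

```python
import numpy as np


def centroid_separation(X, y):
    """Euclidean distance between the mean fingerprints of the two classes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    return float(np.linalg.norm(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)))


def permutation_p_value(X, y, statistic, n_permutations=999, seed=0):
    """Fraction of random label reallocations giving a statistic >= the observed one."""
    rng = np.random.default_rng(seed)
    observed = statistic(X, y)
    count = 0
    for _ in range(n_permutations):
        if statistic(X, rng.permutation(y)) >= observed:
            count += 1
    # The +1 terms count the observed labelling itself among the permutations.
    return (count + 1) / (n_permutations + 1)


# Simulated data: 10 controls and 10 treated samples, 10 features,
# with a shift applied to 5 features in the treated group.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 10))
y = np.array([0] * 10 + [1] * 10)
X[y == 1, :5] += 3.0

p_value = permutation_p_value(X, y, centroid_separation)
```

A small p-value means that random reallocation of the animal status rarely reproduces a separation as large as the one observed with the true labels.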
Finally, a CV-ANOVA (Cross Validation-Analysis of Variance) was calculated. It attributes a degree of significance to the permutation test. The CV-ANOVA results indicate that the models are significant, with P values of 1.9e-27 and 1.8e-20 for the two models, respectively. All these results suggest that the predictive ability of the models is high. This was confirmed by the classification, without misallocation, of all samples from Chamberet when the dataset related to each horse was in turn used as a validation set and submitted to the model for prediction (Figure 5).
However, the classification of the samples from Coye la Forêt and of the positive sample from routine laboratory screening was not fully satisfactory. Indeed, the Chamberet population is small (N=14) and not representative of the wider horse population. To improve the model, it was therefore necessary to incorporate a larger number of subjects and a wider variety of subject populations in the study design. With the help of the Italian antidoping laboratory for horse races, 18 past cases of stanozolol abuse were obtained and added to a new, third analytical batch. Besides these positive samples, 18 samples from our laboratory declared negative for prohibited substances were also added. The same workflow was followed as for the two previous experiments. The PCA score plots showed a similar grouping of the samples from the two experimental centers, with an additional cluster corresponding to the samples from race horses (Figure 6).

Selection of common features in three different analytical batches
Automatic selection by metaXCMS analysis: At this stage of the work, in order to include all the information available in the three analytical batches, the metaXCMS software was used [27]. XCMS is well suited to the analysis of large numbers of samples, but it can directly compare only two sample groups. Meta-analysis is an approach that compares the results of two or more independently performed studies to identify data points that are unique to, or shared among, all or some of the experimental groups [14]. However, some experimental conditions must be respected, such as the same metabolite-extraction method and the same column and chromatographic method. The three analytical sequences met these criteria, since the samples were prepared by the same sample preparation method and analyzed under the same analytical conditions. The pairwise comparisons of each model with its respective control resulted in 5649, 2382 and 3582 features, respectively, but after the metaXCMS data reduction strategy, we obtained 1218 features with significant differences. Unfortunately, the model based on the 33 features failed to separate the two populations, since the features selected by the second-order metaXCMS analysis are not sufficiently discriminatory.
Manual selection: To increase the statistical power of the model, all analyzed samples from the three analytical batches were pooled together. The manual selection of features common to the three analytical batches required a complex step of inter-batch peak alignment based on the XCMS intra-batch peak alignment already accomplished; 220 features emerged in this way. However, a well-known problem with analytical sequences acquired in different time periods is fluctuation in mass spectrometer sensitivity and, consequently, fluctuation in the signal abundance of the monitored ions. To remove this disparity in mass spectrometer sensitivity between acquisitions from different time periods, the data from each sequence were mean-centered and scaled to unit variance separately and then merged into one dataset for further multivariate data analysis. As the common features were selected from datasets already filtered for the variables related to confounding factors, sample grouping according to these factors (e.g., box/pasture or Chamberet/Coye la Forêt) is not apparent in the resulting PCA scatter plot (Figure 9).
The predictive model is able to classify correctly the majority of analyzed samples. In Figure 11, all samples collected from a stanozolol-treated horse before stanozolol administration (T-7, T-4, T0) and after stanozolol treatment (T2, T3; T5, T6; T7, T8; T9, T10; T13, T16; T19, T20) were correctly classified. Only T36 (more than seven months after stanozolol treatment) could not be discriminated from the control population, which may be explained by the fact that the metabolic perturbations associated with stanozolol administration are no longer detectable with this predictive model.
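The per-batch scaling used before merging the three sequences can be sketched as follows. The 220 common features come from the text; the batch sizes, the simulated sensitivities and the function names are assumptions made for illustration, and the data are random.

```python
import numpy as np


def scale_batch(X):
    """Mean-centre each feature and scale it to unit variance within one batch."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    return (X - X.mean(axis=0)) / std


def merge_batches(batches):
    """Scale every batch separately, then stack them into a single dataset.

    All batches must share the same features in the same column order.
    """
    return np.vstack([scale_batch(b) for b in batches])


# Three simulated batches acquired with different instrument sensitivities.
rng = np.random.default_rng(0)
n_features = 220
sizes = [8, 6, 7]
sensitivities = [1.0, 0.4, 2.5]
batches = [s * rng.lognormal(mean=5.0, sigma=1.0, size=(n, n_features))
           for s, n in zip(sensitivities, sizes)]

merged = merge_batches(batches)
```

Scaling each sequence on its own removes the batch-wide intensity offset, at the cost of assuming that the sample populations of the batches are comparable; a batch containing only one class would have its class effect centred away.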
All samples in the scatter plot are correctly classified.
However, the predictive ability of the existing statistical model is not sufficient for application in routine screening, since the 0.646 value means that about a third of the population is predicted with uncertainty, with too high a probability of false negative/positive classification.
To increase the robustness of the statistical model, a larger population is needed. This requirement is straightforward to meet for negative samples, which are easily obtained in a routine laboratory, but more demanding for positive samples. Nevertheless, the discrimination of the control and stanozolol-treated populations is efficient, and remains so almost 4 months after anabolic steroid administration, which is significantly longer than the detection window of stanozolol obtained by the various targeted analyses. Although the detection window is greatly improved by screening for stanozolol long-term metabolites [28], further work is necessary.

Conclusion
The analyzed horse urine fingerprints show changes in metabolic state several months after chronic stanozolol treatment, which demonstrates the potential of such an approach as a screening tool. However, the model obtained cannot be incorporated as a routine screening method in its present form because of its limited predictive capacity. It is important to mention that, for the moment, there are no established criteria regarding the acceptable rate of false positives or false negatives. In clinical diagnosis, a false negative is much more important to avoid than a false positive, since a false negative may be life-threatening if the metabolite is a disease marker. In doping control, it depends on the type of analysis, which can be screening or confirmatory. In screening, a false positive is unwanted but acceptable, given that a further complementary analysis confirms the presence of the prohibited substance. In confirmatory analysis, the rate of false positives must be zero. In any case, the predictive ability of the present model needs to be improved.
In order to retain the larger sample size obtained by including the three analytical batches in the same statistical model, other normalization methods can be applied (e.g., LOESS signal correction using QC samples). Furthermore, besides the PCA, PLS and OPLS analyses performed here, alternative multivariate statistical techniques need to be considered. Nevertheless, when the identification of specific biomarkers of drug abuse is required, the solution is to integrate a wide variety of subject populations in the study design in order to minimize the effects of non-relevant metabolic variation. This can be done through another animal phase conducted on a larger number of animals (very expensive, and thus difficult) or, more simply, through the integration of new samples originating from laboratory routine (declared negative, or stanozolol positive when possible).
Once the metabolic pattern of stanozolol administration is defined, it will be compared with the metabolic patterns of other anabolic steroid administrations, with the aim of deriving a common one. Fundamentally, these patterns should share some invariant properties of their biological signatures, whatever the anabolic steroid (mis)used. The identification of the selected metabolites remains the ultimate step needed to explain their biological significance.