The robustness criteria investigated in the present work were proposed by Dervilly-Pinel et al. (Dervilly-Pinel, 2018) for the validation of screening methods derived from non-targeted metabolomics studies.
3.1 Method selectivity
In a previous study, a model was established based on an equation that combines the urinary levels of 4 candidate biomarkers and predicts the status of urine samples in the context of testosterone esters administration (Cloteau, 2022). The present study aims at probing the relevance of these findings regarding testosterone esters abuse and, more broadly, the misuse of alternative substances of interest in the context of doping control. The method selectivity was thus assessed based on additional in vivo studies involving different administration routes, doses, and prohibited substances. Inter-individual variability (e.g., gender, age) was also included in the study design. Because the selectivity of the classification model is directly related to the scope of the method, the model’s response was evaluated across the following protocols:
1- administration of testosterone esters in several in vivo studies covering inter-individual variability (e.g., age, gender) as well as different doses and administration routes (Exp. B to E)
2- administration of alternative anabolic agents (Exp. F to L)
3- administration of alternative prohibited substances such as non-steroidal anti-inflammatory drugs (NSAID) or steroidal anti-inflammatory drugs (SAID) (Exp. M to Q).
3.1.1 Selectivity towards AAS administration
Challenge test. An in vivo study (Exp. B, Fig. 2B), based on the experimental design of the initial in vivo study (Exp. A, Fig. 2A), was performed. It aimed at confirming the capacity of the method to highlight the use of low-dose testosterone esters cocktails (0.12 to 0.4 mg/kg per testosterone ester). As depicted in Figs. 2A and 2B, the kinetics of the Y value are comparable. Although the amplitude between the points before and after administration differs between the two in vivo studies, the model exhibits a similar pattern and the predictive value remains impacted throughout the kinetics.
Additional testosterone related studies. Following the successful completion of the challenge test involving low-dose mixtures of testosterone esters, the specificity of the model was further evaluated by investigating the impact on the biomarkers, and thus on the prediction value (Y), in urine samples collected from horses given a single formulation of testosterone esters. In vivo studies involving testosterone propionate (Exp. C) and testosterone enanthate (Exp. D) were hence selected and the associated urine samples characterized. These additional studies enabled further assessment of the effect of different administration doses (1 and 0.6 mg/kg) and different genders (mare vs gelding). As illustrated in Figs. 2C and D, administration of a single testosterone ester also affects the biomarkers. The lower predictive value response following administration of testosterone enanthate might result from the lower administered dose (0.6 mg/kg).
Finally, method selectivity was assessed regarding the administration route through the per os administration of Clostebol (Exp. E). Clostebol is the 4-chloro derivative of testosterone. As presented in Fig. 2E, a response of the model can be observed starting from the 5th day after administration, demonstrating a disruption of the biomarker levels after oral administration of Clostebol. This preliminary study of the selectivity of the model, involving various protocols dealing with the use of testosterone esters, demonstrated the effectiveness of the model in highlighting a wide range of practices in terms of administered doses, ester chain lengths, chemical structures, or administration routes, independently of the animals’ physiological characteristics (gender, age). In comparison to the initial Exp. A, higher predictive values were observed for some urine samples collected before administration. As the identity of the biomarkers is still unknown, the factors impacting these compounds are not yet fully understood. No explanation for the disruption of the biomarkers prior to administration can therefore be drawn from the available information. Further toxicological and biological analyses could be carried out on these samples to identify a potential confounding factor explaining these observations.
Selectivity towards boldenone undecylenate administration. As the classification model was found to be effective in demonstrating testosterone esters and Clostebol administration independently of inter-individual variability, dose and route of administration, the next step was to test a possible extension of the method scope. Another AAS of interest, boldenone, in its main ester form (undecylenate), was then considered through 4 additional in vivo studies relying on either IM or oral administration (Exp. F to I) (Table 1). As depicted in Fig. 3A, the kinetic profile across the samples collected before administration, after administration and at the end of the study is similar to those observed after IM administration of testosterone esters (Fig. 2A and B). Indeed, the predictive value (Y) increases after administration and then decreases to a basal level. In the second in vivo study, no return to a basal level is observed because fewer urine samples were collected. However, the predictive values (Y) are comparable between the two studies (Y > 7) (Fig. 3B). Regarding the prediction of urine samples collected after oral administration of boldenone undecylenate, this route of administration does not seem to be an obstacle to using the model either, since kinetics are observed with a progressive increase of the prediction value (Fig. 3C and D). Finally, biomarker levels seem to be slightly altered after each repeated administration of boldenone undecylenate at low dose (Exp. H) (Fig. 3C).
As explained in § 3.1.1.2, although the predictive values of some samples collected before administration are high, a disturbance of the signal throughout the kinetics could nevertheless be observed. Such results could thus allow extending the application of the method to the prediction of boldenone undecylenate administration.
Selectivity towards nandrolone administration. Together with boldenone and testosterone, nandrolone is one of the major doping agents. Therefore, in view of the promising results reported above, the selectivity of the method was assessed after IM administration of nandrolone decanoate to a stallion (Exp. J). The particularity of this experiment is the selection of a stallion, while the previous in vivo studies involved either mares or geldings. The associated results (Fig. 3E) demonstrate a strong disruption of the candidate biomarkers after administration of nandrolone and, in particular, the capacity of the model to highlight this disruption over the long term. These results demonstrate the relevance of the 4 selected biomarkers to assess the administration of AAS.
3.1.2 Selectivity towards other anabolic agents or veterinary drugs
To further assess the method selectivity and biomarkers’ response towards other veterinary drugs and anabolic agent administration, additional steps in the validation process have been included as follows.
SARMs. SARMs (Selective Androgen Receptor Modulators), which exert an anabolic effect associated with low androgenic effects (Hansson, 2015), are recognized as being of potential interest for doping purposes. An in vivo study involving oral administration of GW1516 (0.14 mg/kg) was selected to study the ability of the model to highlight such practice in the horse (Exp. L). The characterization of urine samples showed deregulation of the 4 biomarkers of interest with a significant increase of the predictive Y value after GW1516 administration (Figure S1B). This preliminary result suggests that the model may also be applicable to screen for SARMs administration. Deeper investigation including more in vivo studies involving SARMs should be performed to confirm these findings.
β-agonists. The capacity of the strategy to highlight per os administration of clenbuterol was also included in the validation scheme. While clenbuterol is authorized in equine veterinary medicine to prevent bronchospasm, this substance also acts as a repartitioning agent at higher doses and is not authorized for such use (Garcia, 2011). The characterization of urine samples collected in the frame of Exp. K revealed a disruption of the biomarkers and thus of the predictive value (Y) of the model (Figure S1A). However, this result should be considered preliminary at this stage because of the concomitant injection of Diazepam, an anxiolytic of the benzodiazepine family, which could be involved in the alteration of the biomarkers. In this context, it would be advisable to characterize urine collected from in vivo studies involving the administration of β-agonists alone.
Specificity towards non-anabolic compounds. Finally, selectivity of the method regarding other non-anabolic veterinary drugs was evaluated. In that respect, urine samples collected in the frame of in vivo studies involving non-anabolic compounds, i.e. steroidal and non-steroidal anti-inflammatory drugs (SAID and NSAID respectively), were characterized (Exp. M to Q). No kinetics between the samples collected before and after administration were observed, and the predictive values remained low (Y < 6) (Figure S2). Such results indicate that the biomarkers do not seem to be affected by the administration of non-anabolic compounds, and specifically of SAID and NSAID.
The biological relevance of the 4 candidate biomarkers as well as the adequacy of the parameters of the equation and the efficiency of the model were proven. In summary, 349 urine samples were analyzed to define the scope of the classification model. The different experimental designs confirmed that the abundance of the candidate biomarker signals is impacted by anabolic agents’ administration, with promising initial results also towards SARMs and β-agonists. In contrast, the biomarker profile presented neither a disruption nor a high predictive value following the administration of non-anabolic compounds.
3.1.3 Selectivity towards general equine population
Following the establishment of the application field of the strategy, it is advisable to verify the behavior of the biomarkers when characterizing “off-the-shelf” samples, representative of the equine population. To this end, 342 compliant urine samples were characterized and predicted on the model using the predictive value (Y), in order to account for inter-individual variability and refine the scope of the method. Prediction of the 342 compliant samples in the model, using the targeted metabolomics protocol, revealed two distinct sub-populations of samples. Further investigations led to the conclusion that one of the sub-populations corresponds to the samples for which at least one of the 4 candidate biomarkers is not detected. It represents 46 % of the whole compliant sample set (n = 158). As no correlation could be made with the metadata associated with these particular samples, it was not possible to conclude on the origin of this phenomenon (biological variability, signal sensitivity, etc.). Furthermore, in the context of this study, the partial or total absence of signal associated with the biomarkers (whether of analytical or biological origin) is taken to indicate a compliant status for these samples. Since the aim of the population-based study is to establish a baseline mean predictive value (Y) and to assess whether the confounding factors included in the study have an impact on biomarker abundance, only samples containing all biomarkers were selected for further study (n = 184).
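The partition of the compliant samples described above can be sketched as follows. This is a minimal illustration only: the biomarker names `b1` to `b4` are hypothetical placeholders (the identities are not given), and treating a missing or zero signal as "not detected" is an assumption rather than the rule used in the study.

```python
from typing import List, Mapping, Optional, Tuple

# Hypothetical placeholder names: the actual biomarker identities are unknown.
BIOMARKERS = ("b1", "b2", "b3", "b4")

Sample = Mapping[str, Optional[float]]


def all_detected(sample: Sample) -> bool:
    """Return True when every candidate biomarker has a positive signal.

    Assumption: a missing key, None, or a non-positive value means
    'not detected'.
    """
    return all(
        sample.get(name) is not None and sample[name] > 0
        for name in BIOMARKERS
    )


def split_by_detection(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into (all biomarkers detected, at least one missing)."""
    complete = [s for s in samples if all_detected(s)]
    partial = [s for s in samples if not all_detected(s)]
    return complete, partial
```

Applied to the figures reported here, the second group holds 158 of the 342 samples (about 46 %) and the first group holds the remaining 184 samples retained for the population study.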
Figure 1 depicts the physiological and environmental parameters of the selected samples. Considering the normal distribution of the resulting prediction values (Y) (n = 184) (Shapiro-Wilk test, p-value = 0.35), further characterization of the control population was performed using Student’s t-test to assess the impact of different factors, including origin (IFCE samples or racing samples), gender and racing practice. A significant difference in the predictive value (Y) is observed between the urine samples from racehorses (harness and gallop) and those from the IFCE experimental facilities (p-value = 0.0032) (Fig. 4). The distinction between the two populations can be associated with their respective environment and breeding conditions. Indeed, physical activity differs between them: while racehorses (Racing Horses, n = 128) have a high physical activity in training and competition, horses from stud farms have light to moderate physical activity (IFCE horses, n = 56). However, the available metadata on living conditions did not allow further investigation of the associated parameters. A more in-depth population study is underway with a wider range of factors (e.g., seasons, age) to identify a potential effect explaining these preliminary observations.
Following a deeper examination, the racing population appeared homogeneous in its response to the classification model, since no significant differences were observed between practices (harness, gallop; p-value = 0.118) (Fig. 5A) or between genders (mare, stallion, gelding; p-values > 0.3) (Fig. 5B). Moreover, Fig. 5C shows the distribution of prediction values for the control samples (racing and IFCE) and for the samples collected after anabolic agents’ administration (post-administration samples) (Exp. A to L). The mean prediction value of post-administration samples is visibly and significantly higher than that of control samples (p-value = 7.07e-11); thus, the biomarkers are not disrupted in control samples, even when inter-individual variability is taken into account. Furthermore, despite a significant difference between the IFCE and racing control samples (Fig. 4), the predictive values remain lower than those observed for post-administration samples. Thus, further investigating the animals’ characteristics (e.g., racing practice, gender) on more than a hundred samples (n = 184) allowed refining the scope of the method, which can be considered to cover the entire equine population, including different genders and racing practices but also different ages and environments, provided that all biomarkers are detected.
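The group comparisons in this section rely on Student’s t-test. As a minimal sketch of the underlying statistic, assuming the classical pooled-variance (equal variance) form, the t value and degrees of freedom for two groups of prediction values can be computed as below. The function name is hypothetical, and the sketch stops short of the p-value, which requires the t distribution (available, for instance, through `scipy.stats.ttest_ind`).

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes both groups are roughly normal with equal variances and that
    each group holds at least two values.
    """
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, dof
```

In this setting `a` and `b` would be the Y values of two sub-populations, for example racing and IFCE control samples.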
3.4 Method implementation
3.4.1 Classification model
Implementing a classification model would require setting a suspicion limit (SL), corresponding to the mean predictive value (Y) of a selected set of compliant urine samples plus an uncertainty factor; it could be defined based on previously published work (Dervilly-Pinel, 2015; Cloteau, 2022). The suspicion limits could for instance be determined on the basis of the racing control samples (n = 128) characterized above (§ 3.1.3). Based on the urine samples selected to represent the control population, and given their normal distribution, a mean control prediction value of 4.13 with a standard deviation of 1.06 would be obtained, which would set the suspicion limits at Y = 5.87 with 95% confidence (SL1) and at Y = 6.59 with 99% confidence (SL2) (Figure S3). Setting these limits is a way of estimating the order of magnitude of the possible detection window for each of the experiments in our study (Table 2). Such an approach would allow suspecting the abuse in Exp. A over 89 days. As can be observed from Table 2, the suspicion windows depend mainly on the route of administration and the chain length of the administered ester. Thus, the suspicion windows associated with the use of single ester formulations were observed to be 2 to 4.5 times shorter than with cocktails. This observation may be related to the long-term release of each ester contained in the testosterone esters cocktail. In addition, the faster return to basal levels for testosterone propionate compared to testosterone enanthate, despite a higher dose, may correspond to the rapid release and thus rapid metabolization of testosterone propionate (Minto, 1997; Zaro, 2015). The suspicion windows also depend on the route of administration of the compound.
Indeed, the results obtained after IM administration of a single testosterone or boldenone ester showed comparable suspicion windows of up to 30 days, whereas those observed after oral administration suggest a delayed response of the biomarkers, especially for the in vivo studies involving boldenone undecylenate. This observation may be related to the ester chain, as the drug first transits through the lymphatic system before reaching the systemic circulation (Shackleford, 2003). The shorter detection window observed upon oral administration of Clostebol acetate may be the consequence of the low administered dose and of the hepatic first-pass effect, resulting in low bioavailability of the active substance and therefore a weaker response of the biomarkers (Pond, 1984; Abuhelwa, 2017). It is interesting to note that, despite the strong response observed after administration of nandrolone, the proposed suspicion limit would unfortunately not be conclusive as to the suspect status of these urine samples. This indicates that either the scope of the classification model should be refined by excluding nandrolone as a target, or the suspicion limit value should be further refined before implementing this strategy as a screening tool.
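The suspicion limits quoted in this section can be illustrated with a short sketch. The text does not state the formula, so the code assumes a one-sided limit of the form mean + z × SD, with z the standard normal quantile for the chosen confidence; under that assumption the reported values of 5.87 and 6.59 are reproduced to within about 0.01. The window function is likewise a hypothetical reading of "suspicion window" as the first and last sampling days on which Y exceeds the limit.

```python
from statistics import NormalDist
from typing import Optional, Sequence, Tuple


def suspicion_limit(mean_y: float, sd_y: float, confidence: float) -> float:
    """One-sided upper limit: mean + z * SD (assumed form, not from the source)."""
    z = NormalDist().inv_cdf(confidence)  # e.g. about 1.645 for 0.95
    return mean_y + z * sd_y


def suspicion_window(
    days: Sequence[float], y_values: Sequence[float], limit: float
) -> Optional[Tuple[float, float]]:
    """First and last sampling day with Y above the limit, or None if never."""
    flagged = [day for day, y in zip(days, y_values) if y > limit]
    if not flagged:
        return None
    return min(flagged), max(flagged)


# Control population figures reported in the text (racing samples, n = 128).
sl1 = suspicion_limit(4.13, 1.06, 0.95)
sl2 = suspicion_limit(4.13, 1.06, 0.99)
```

With an invented kinetic such as days `[0, 1, 4, 10, 20]` and Y values `[4.0, 5.0, 7.0, 6.2, 4.5]`, `suspicion_window` returns `(4, 10)` at SL1 and `(4, 4)` at SL2, which mirrors how the stricter limit narrows the windows in Table 2.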
3.4.2 Longitudinal profiling
Given the limitations mentioned previously, linked in particular to inter-individual variability, and the aim of targeting a wide range of substances, an alternative to the use of such a tool could be the biological passport. Longitudinal profiling was initially introduced by the World Anti-Doping Agency (WADA) with the concept of the Athlete Biological Passport (ABP). This approach has been extended to the field of equine anti-doping with the Equine Biological Passport (EBP), which consists of following horses over time and monitoring performance enhancement by measuring biological changes based on several effect biomarkers (Cawley, 2017, 2022; Loup, 2013; Duluard, 2010). In this context, the model would be adapted to each individual subject: the basal level of the biomarkers would be well characterized, and a significant deviation of the prediction value (Y) on the model would trigger an investigation.
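As a rough illustration of the individual-baseline idea, the sketch below flags a new prediction value that exceeds a horse’s own baseline mean by a chosen number of standard deviations. Both the rule and the default multiplier `z = 3.0` are assumptions made for illustration; the ABP itself relies on an adaptive Bayesian model, which this sketch does not attempt to reproduce.

```python
from statistics import mean, stdev
from typing import Sequence


def deviates_from_baseline(
    baseline_y: Sequence[float], new_y: float, z: float = 3.0
) -> bool:
    """Return True when new_y exceeds the individual baseline by z SDs.

    baseline_y: earlier Y values from the same horse (at least two values).
    z: hypothetical multiplier, not a value taken from the source.
    """
    threshold = mean(baseline_y) + z * stdev(baseline_y)
    return new_y > threshold
```

A call such as `deviates_from_baseline([4.0, 4.2, 3.9, 4.1], 6.0)` returns `True`, whereas a new value of 4.2 for the same invented baseline does not trigger a flag.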
Table 2
Suspicion windows obtained after analysis of in vivo studies involving administration of anabolic and non-anabolic agents, using direct screening and the classification model. Suspicion windows from direct screening were established using the IFHA testosterone thresholds of 20 ng/mL and 55 ng/mL for geldings and mares respectively. Testosterone and boldenone concentrations in urine were obtained by LC-Q-Exactive analysis in PRM mode. The suspicion windows from the classification model were calculated based on the suspicion limit with 95% or 99% confidence, after semi-quantification of the 4 potential biomarkers by LC-Q-Exactive analysis in PRM mode and calculation of the prediction value (Y).
| Compound(s) | Type | Response | Suspicion window (days), direct screening | Suspicion window (days), classification model (SL1) | Suspicion window (days), classification model (SL2) |
|---|---|---|---|---|---|
| Testosterone cocktail-1 | AAS | Yes | 1–20 | 4–89 | 6–45 |
| Testosterone cocktail-2 | AAS | Yes | 1–26 | 1–58 | 4–26 |
| Testosterone propionate | AAS | Yes | 1–25 | 1–20 | 1–15 |
| Testosterone enanthate | AAS | Yes | 2–8 | 2–30 | 8–16 |
| Clostebol acetate | AAS | Yes | 1 | 4–6 | n.a |
| Boldenone undecylenate-1 | AAS | Yes | 1–82 | 1–23 | 2–17 |
| Boldenone undecylenate-2 | AAS | Yes | 1–10 | 2–30 | 2–30 |
| Boldenone undecylenate-3 | AAS | Yes | 1–3 | 1–10 | 7–10 |
| Boldenone undecylenate-4 | AAS | Yes | 1–5 | 9–14 | 9–14 |
| Nandrolone decanoate | AAS | Yes | n.a | n.a | n.a |
| Clenbuterol + Diazepam | β-agonists | Yes | n.a | 10–16 | 12–16 |
| GW1516 | SARMs | Yes | n.a | 6–30 | 6–30 |
| Meloxicam-1 | NSAID | No | n.a | n.a | n.a |
| Meloxicam-2 | NSAID | No | n.a | n.a | n.a |
| Dexamethasone | SAID | No | n.a | n.a | n.a |
| Beclomethasone | SAID | No | n.a | n.a | n.a |
| Cortivazol | SAID | No | n.a | n.a | n.a |
3.5 Direct screening comparison
In the context of equine doping control, targeted mass spectrometry analysis of anabolic agents in urine remains a method of choice in combination with blood analysis (Moreira, 2021). Considering the emergence of metabolomics as a new tool for suspecting the abuse of prohibited substances, and the desire to implement it, the performances of both types of screening were compared. According to previous research, boldenone and testosterone are excreted in horse urine almost exclusively in sulfated form (Teale, 2010; Dumasia, 1983). As a result, testosterone, boldenone and clostebol were semi-quantified by monitoring the following transitions: 367.1585 > 177.0227, 365.1428 > 177.0227 and 401.1194 > 177.0227 for testosterone sulfate, boldenone sulfate and clostebol sulfate respectively. The transitions 368.1428 > 180.0415 and 370.1795 > 180.0415 were chosen for testosterone sulfate d3 and boldenone sulfate d3. The fragment at m/z 177.0227 is formed by fragmentation of the sulfated D-ring of the AAS (Ho, 2004). The abundance of each molecule as well as the prediction value Y were then normalized on a scale of 0 to 1 (0 when the response of a screening method is null and 1 for the maximum response obtained over the time scale of each in vivo study). Direct screening suspicion windows were estimated based on the IFHA thresholds for total urinary testosterone, set at 20 ng/mL and 55 ng/mL for geldings and mares respectively (IFHA). Currently, no threshold is set for boldenone in geldings and females. As shown in Fig. 6, the Tmax of the classification model, corresponding to the time when the maximum abundance is reached, appears to be delayed compared to the direct screening method, especially after oral administration (Clostebol, boldenone undecylenate 3 and 4).
The suspicion windows presented in Table 2 are comparable between both methods, except for oral administration where Tmax differs by several days. In addition, the prediction value (Y) remains higher over a longer time period. Even if the samples are not classified as suspicious, particular attention can be given to the horse in the framework of longitudinal monitoring. Although targeted methods monitoring steroid esters in plasma or conjugated and free steroids in urine are efficient and sensitive, especially after oxime derivatisation for plasma analysis (De la Torre, 2021), confirming the administration of ester cocktails or micro-doses can be a real challenge considering the rapid turnover of the compounds (Schönfelder, 2016; Delanghe, 2014). Indeed, the administered concentrations are low, which shortens the detection windows of targeted methods. As observed in Fig. 6, the comparison of the two types of screening shows that the approaches are complementary. In these particular cases, monitoring the biomarkers in urine may be of interest because of its long-term suspicion window.
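The 0 to 1 normalization and the Tmax comparison used in this section can be sketched as follows. The sketch assumes that normalization simply divides each response by the maximum observed over the study, so that a null response maps to 0 and the peak maps to 1; the function names and any example values are hypothetical.

```python
from typing import List, Sequence


def normalize_to_max(values: Sequence[float]) -> List[float]:
    """Scale responses so that 0 stays 0 and the study maximum becomes 1."""
    peak = max(values)
    if peak <= 0:
        # No positive response anywhere in the study: everything maps to 0.
        return [0.0 for _ in values]
    return [v / peak for v in values]


def t_max(days: Sequence[float], values: Sequence[float]) -> float:
    """Sampling day on which the response reaches its maximum (first if tied)."""
    best_day, _ = max(zip(days, values), key=lambda pair: pair[1])
    return best_day
```

Computing `t_max` once on the steroid sulfate abundance and once on the prediction value (Y) of the same study gives the two time points whose offset is discussed for oral administration.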