Urinary Sex Steroids and Anthropometric Markers of Puberty - A Novel Approach to Characterising Within-Person Changes of Puberty Hormones

Background/Aims: The longitudinal relationships of within-individual hormone and anthropometric changes during puberty have never been fully described. The objectives of this study were to demonstrate that three-monthly urine collection is feasible in young adolescents and to utilise liquid chromatography-tandem mass spectrometry assay methods for serum and urine testosterone (T), estradiol (E2) and luteinizing hormone (LH) in adolescents by relating temporal changes in urine and serum hormones over 12 months to standard measures of pubertal development. Methods: A community sample of 104 adolescents (57 female) was studied over 12 months, with annual anthropometric assessment, blood sampling and self-rated Tanner staging, and with urine collected every 3 months. Serum and urine sex steroids (T, E2) were measured by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and LH by immunoassay. Results: A high proportion (92%) of scheduled samples were obtained, with a low attrition rate of 6.7% over the 12 months. Urine hormone measurements correlated cross-sectionally and longitudinally with age, anthropometry and Tanner stage. Conclusion: We have developed a feasible and valid sampling methodology and measurements for puberty hormones in urine, which allow a sampling frequency by which individual pubertal progression in adolescents can be described in depth.


Introduction
The circulating gonadotropin, testosterone (T), and estradiol (E2) changes that drive the external manifestations of puberty are well described from cross-sectional studies according to chronological age or Tanner staging interpreted quasi-longitudinally [1,2]. Such cross-sectional analysis artificially smooths longitudinal data because of the low resolution of temporal sampling, markedly underestimating the underlying within-subject variability. Combined with this variation in individual hormones is the normal variability in both the time of onset and the tempo of completing puberty. Both the timing and tempo of hormone change might be important intermediate factors in the marked behavioural and psychological changes of adolescence, but they are challenging to study. Previous studies that have considered how individual variability in puberty hormone change might influence the dramatic psycho-bio-behavioural changes in adolescence relied upon older methods, often direct (unextracted) sex steroid immunoassays. This technology is less accurate, especially at low circulating steroid concentrations [3][4][5], and is now being supplanted by more sensitive and specific liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based steroid assays [3]. Similar, more sensitive LC-MS/MS-based methods have recently been described for serum sex steroids in pre-pubertal children [6].
In clinical settings the usual methods to appraise pubertal development and, by implication, its variability, comprise hormone measurements and anthropometry, with emphasis on timing of the height growth spurt, clinical inspection and rating of secondary sexual characteristics and bone age [7,8]. In epidemiological and community studies the definition of puberty has to be simplified to adolescent self-report or parental report against either Tanner stage line drawings [9,10] or Petersen's Pubertal Development Scale [1]. Self-rated Tanner staging is less intrusive and is often the only Ethics Board approved method to assess puberty as defined by secondary sexual characteristics in epidemiological and community studies involving adolescents who remain legally minors. This sacrifices some reliability, especially in early puberty and with the limited number of developmental stages available to select [11][12][13][14]. Recent work has demonstrated that self-rating is of insufficient accuracy to be of use in the clinical setting, but it may be acceptable in research settings where clinician assessment is not possible [15]. Menarche (a late pubertal event), spermarche [16,17], which is difficult to evaluate [18], and voice breakage have all been described in relation to age and/or Tanner stage [19,20]. However, as single time points these provide minimal information on either the timing of onset or the tempo of puberty. To date there have been no reports in the literature describing individual puberty hormone change with sufficient measurement frequency to adequately capture individual variation. Unless this is achieved, it is not possible to determine the true effects of puberty hormones on adolescent mood, wellbeing and behaviour.
This study aims to examine the feasibility of frequent three-monthly urine collection in a large Australian adolescent cohort, as well as the utility of LC-MS/MS assays for urine and serum sex hormones. Additionally, we describe changes in E2 and T (using these LC-MS/MS assays) and luteinizing hormone (LH) over 12 months in adolescents to compare results for urine and serum hormones. Finally, we aim to examine the association between changes in anthropometry and self-reported Tanner stage and changes in hormones in both urine and serum.

Setting and Participants
The study was set in two regional towns in the state of New South Wales (NSW), Australia. Adolescents between the ages of 10 and 12 years were recruited from local schools. Fasting morning blood samples were collected at 0 and 12 months for the measurement of E2, T and LH, and first morning (fasting) urine was collected three-monthly for the same measures. No participants had an endocrine disorder or were on any type of gonadal steroid hormone therapy.

Anthropometry and pubertal staging
Height was measured using a portable stadiometer (to 0.1 cm). Weight was measured in light clothing using a Tanita TBF-300 Pro Body Composition Analyzer [21]. Body mass index (BMI) was calculated using these measures. The adolescents provided a self-rating of puberty using line drawings based on the Tanner stages [9,10]. Females provided self-rating of breast stage and males provided self-rating of genitalia stage. Self-report of Tanner stage [11][12][13][14] was the only feasible and ethically acceptable measure of pubertal staging available to the investigators. This situation is now a common limitation, as both Institutional Review Boards and parents are reluctant to permit direct physical examination of undressed healthy children for a research study of normal puberty.
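The BMI calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the conventional definition (weight in kilograms divided by the square of height in metres); the function name and the example participant are hypothetical and are not taken from the study data.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Return body mass index (kg/m^2) from weight in kg and height in cm.

    Height is taken in centimetres because the stadiometer readings
    are reported to 0.1 cm.
    """
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # convert cm to m
    return weight_kg / (height_m ** 2)


# Hypothetical participant (illustrative values only): 45.0 kg, 150.0 cm
example_bmi = bmi(45.0, 150.0)
print(f"BMI = {example_bmi:.1f} kg/m^2")
```

Age- and gender-standardized z-scores additionally require a growth reference, which is not specified here and is therefore not sketched.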

Hormone measurements
Following a 12-hour fast, urine was collected at home before blood samples were collected between 7:00 am and 8:30 am in order to minimise the effects of diurnal hormone variation [1,22]. The urine sample was collected as a first morning void into a lidded container and immediately placed into an insulated carry bag on an ice brick. This was stored in a freezer at -20 °C within 3 hours. Blood and urine samples were then transported on dry ice to the research laboratory, where they were kept at -80 °C until analysed approximately 12 months later. Postmenarcheal girls provided urine and blood specimens in the mid-follicular phase (Day 7-10) of their menstrual cycle in order to standardise collection time. Urine and serum steroids were measured by liquid chromatography-tandem mass spectrometry (LC-MS/MS), as modified from a previously described method for serum [23] and adapted for urine specimens following enzymatic deconjugation. The conjugated steroids underwent hydrolysis using β-glucuronidase enzyme from Escherichia coli K12 (Roche Diagnostic, Mannheim, Germany), which removes the glucuronide moiety from steroids. The developed LC-MS/MS method measures unconjugated steroids (i.e. originally unconjugated plus deconjugated). The details of the novel urine assays have been published [24]. Briefly, urine and serum specimens were separated by liquid chromatography using a Shimadzu Nexera UHPLC system (Shimadzu Scientific Instruments, Columbia, MD). Following this, tandem mass spectrometry analysis was performed on samples using an API-5000 triple-quadrupole mass spectrometer (Applied Biosystems/MDS SCIEX, Ontario, Canada) [24]. A thorough validation of the LC-MS/MS method for measuring E2 and T was carried out according to standard FDA/EMEA analytical validity criteria.
The calibration curves of the urinary steroid LC-MS/MS method ranged from 0.025 to 32 ng/mL for T and from 0.05 to 32 ng/mL for E2, and fitted quadratic functions with r of 0.999 or better. Within-day and between-day accuracies and precision at all levels of quality control ranged from 95-105% and 2.6-9.7%, respectively. Using a 500 μL urine sample, the limits of detection (LOD) and limits of quantitation (LOQ), respectively, were 5 pg/mL and 25 pg/mL for T and 25 pg/mL and 50 pg/mL for E2. Matrix effects were negligible, with no significant ion suppression or enhancement for either analyte and recovery values between 102-108%. Extraction recovery and process efficiency were between 93-103% for both analytes at all levels of QC. The β-glucuronidase enzyme hydrolysis (deconjugation) efficiency was 100-102% after an overnight incubation at room temperature.
Specificity for both analytes was tested against structurally related compounds that could potentially interfere with the method, including estrone, epitestosterone, androsterone, etiocholanolone, epietiocholanolone, 3-α androstanediol, 3-β androstanediol, and dehydroepiandrosterone. None of the steroids listed interfered with the studied analytes. Blood and urine samples with values less than the lower limit of quantitation (LLOQ) for E2 and T were taken as half the LLOQ. A total of 27 urine (E2: 16, T: 11) and 2 serum (T) samples were below the LLOQ. Serum and urine LH were measured by Immulite 1000 LH (Siemens), which detects intact LH and the LHβ subunit and provided reproducible measurements in frozen stored urine [25]. The within-assay coefficients of variation were <10%. Serum and urine LH values below the detection limit (0.1 IU/L) were set at zero (10 serum and 5 urine samples). Urine FSH assays (Immulite, Delfia) did not pass validity tests (dilutional linearity, quantitative spike recovery) and were therefore not used in this study. All urine hormone concentrations were adjusted for urine specific gravity (SG), measured by reagent strip (ChoiceLine 10, Roche Diagnostics), to a standard SG of 1.020.
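As an illustration of how accuracy and precision figures of this kind are commonly derived from quality control (QC) replicates, the sketch below assumes the usual definitions: accuracy as the mean measured concentration expressed as a percentage of the nominal value, and precision as the coefficient of variation (CV). The replicate values and function names are hypothetical and do not reproduce the validation data of the assay.

```python
from statistics import mean, stdev


def accuracy_percent(measured: list[float], nominal: float) -> float:
    """Mean measured concentration as a percentage of the nominal QC value."""
    return 100.0 * mean(measured) / nominal


def cv_percent(measured: list[float]) -> float:
    """Precision as coefficient of variation: sample SD divided by mean, in %."""
    return 100.0 * stdev(measured) / mean(measured)


# Hypothetical QC replicates (ng/mL) for a nominal 1.00 ng/mL control
qc_replicates = [0.98, 1.02, 1.00, 1.05, 0.95]
qc_accuracy = accuracy_percent(qc_replicates, nominal=1.00)
qc_precision = cv_percent(qc_replicates)
print(f"accuracy = {qc_accuracy:.1f}%, CV = {qc_precision:.1f}%")
```

Within-day figures would use replicates from a single run and between-day figures replicates pooled across runs; the same two functions apply in either case.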
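The handling of values below the quantitation or detection limits and the specific gravity adjustment can be sketched as below. The correction formula is an assumption: the text states only that concentrations were adjusted to a standard SG of 1.020, and the sketch uses the widely applied ratio (SG_ref - 1) / (SG_sample - 1). Function names and example readings are hypothetical, and the order in which substitution and adjustment were applied in the study is not stated, so the two steps are kept separate.

```python
REFERENCE_SG = 1.020  # standard specific gravity stated in the text


def substitute_below_lloq(conc: float, lloq: float) -> float:
    """Replace a steroid value below the LLOQ with half the LLOQ."""
    return lloq / 2.0 if conc < lloq else conc


def zero_below_detection(conc: float, detection_limit: float = 0.1) -> float:
    """Set an LH value (IU/L) below the detection limit to zero."""
    return 0.0 if conc < detection_limit else conc


def adjust_for_specific_gravity(conc: float, sample_sg: float,
                                reference_sg: float = REFERENCE_SG) -> float:
    """Scale a urine concentration to the reference SG (assumed formula)."""
    if sample_sg <= 1.0:
        raise ValueError("specific gravity must be greater than 1.000")
    return conc * (reference_sg - 1.0) / (sample_sg - 1.0)


# Hypothetical urine T reading of 10 pg/mL against an LLOQ of 25 pg/mL
t_value = substitute_below_lloq(10.0, lloq=25.0)
# Hypothetical dilute sample: 100 pg/mL measured at SG 1.010
adjusted = adjust_for_specific_gravity(100.0, sample_sg=1.010)
print(t_value, round(adjusted, 1))
```

A dilute sample (SG below 1.020) is scaled up and a concentrated sample is scaled down, so that concentrations from different first morning voids are comparable.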

Statistical analyses
Anthropometric measurements and hormone levels (blood and urine) were summarised by gender over time using means and standard deviations for continuous variables and frequencies and proportions for categorical variables. Linear mixed effects models were used to assess longitudinal changes in hormones, with gender and collection time as covariates. A random effect for child was included in these models. Interaction between gender and collection time was tested for each hormone. Linear regression was used to analyse the association between changes in anthropometry (height, weight, BMI) and self-rated Tanner stage and changes in urine and serum hormones. These models were adjusted for the baseline anthropometric measurement and baseline hormone concentration. All p-values were calculated using Wald tests. Statistical analyses were conducted using Stata 12.1 (StataCorp, Texas, USA). Statistical significance was set at 0.05.
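One way to write down the models described above is sketched here. The notation is an assumption introduced for illustration: the text does not state whether collection time entered as a continuous or categorical covariate, whether hormones were transformed, or which variable was treated as the outcome in the change regressions. Here $y_{ij}$ is the hormone level of child $i$ at collection $j$, $t_{ij}$ the collection time, $u_i$ the random effect for child, $\Delta A_i$ the 12-month change in an anthropometric measure (or Tanner stage), and $\Delta H_i$ the 12-month change in a hormone.

```latex
\begin{align}
  y_{ij} &= \beta_0 + \beta_1\,\mathrm{sex}_i + \beta_2\,t_{ij}
            + \beta_3\,(\mathrm{sex}_i \times t_{ij}) + u_i + \varepsilon_{ij},
  \qquad u_i \sim N(0,\sigma_u^2),\quad \varepsilon_{ij} \sim N(0,\sigma^2) \\
  \Delta A_i &= \gamma_0 + \gamma_1\,\Delta H_i + \gamma_2\,A_{i0}
            + \gamma_3\,H_{i0} + e_i \\
  z &= \frac{\hat{\beta}_3}{\mathrm{SE}(\hat{\beta}_3)}
  \qquad \text{(Wald test of } H_0\colon \beta_3 = 0\text{)}
\end{align}
```

Under this formulation the gender-by-time interaction test corresponds to $\beta_3$, and the adjusted association between hormone change and anthropometric change corresponds to $\gamma_1$, with $A_{i0}$ and $H_{i0}$ the baseline values.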

Ethical considerations
The study has ethical approval from the Human Research Ethics Committee, University of Sydney (HREC 13094) within the National Health and Medical Research Council Guidelines for Human Experimentation, which is consistent with the Declaration of Helsinki. All participants assented, and a parent provided written informed consent prior to commencing the study.

Cohort Characteristics
One hundred and four participants were recruited. The mean ages (SD) for the study participants at baseline were 12.5 (0.93) years for males and 11.8 (0.98) years for females. At follow-up, the ages were 13.5 (0.94) years for males and 12.9 (0.97) years for females. Anthropometric characteristics for this cohort are shown in Table 1. For the females, 22 (39%) had menarche prior to the study and one additional girl experienced menarche during the follow-up year. Post-menarcheal girls were significantly older than their pre-menarcheal counterparts (12.9 years vs. 11.9 years, p<0.001). A high proportion of scheduled samples were collected for urine (484, 92%) and serum (194, 93%). There was a low loss to follow-up (7, 6.7%). For those not lost to follow-up, 16 females and 12 males did not provide at least one urine specimen. There was no difference in age between those who provided a specimen and those who did not (mean difference 0.35 years; 95% CI -0.21-0.92; p = 0.23), nor was there a difference in self-rated Tanner stage (p = 0.53).
Mean anthropometric measurements (Table 1) increased significantly over 12 months, whereas age- and gender-standardized z-scores did not change over the 12-month follow-up. Pre-menarcheal girls significantly increased their age-standardized weight (p<0.001) and height z-scores (p = 0.002), whereas these z-scores did not change for post-menarcheal girls. Rates of change in anthropometric measurements were similar between genders (interaction p>0.05).
Self-rated Tanner staging increased with fewer in stage 1 and more in stage 5 at 12-month follow-up (Table 2). During the year, 14 participants (13.5%) progressed two Tanner stages, 43 (41.3%) progressed one stage and 41 (39.4%) did not change in their self-rated Tanner stage. Six (5.8%) participants (three boys) provided a lower self-rated Tanner stage at follow-up than baseline. One participant (1%, 1 boy) did not provide baseline Tanner staging and seven (6.7%, 1 boy, 6 girls) did not provide follow-up Tanner staging.

Hormone measurements
Urine hormone levels versus time. For the 484 urine collections, 14 (3%) E2, nine (2%) T and five (1%) LH assays were below the LLOQ. For LH, these participants were younger than the rest of the cohort (mean difference 1.35 years; 95% CI 0.41-2.30; p = 0.005), but there was no age difference between those with urine E2 or T samples below or above the limits of detection. In five such cases (one urine E2, one urine E2 and T, one urine T, one urine and serum and one urine LH) the participant provided a Tanner stage 1 at baseline. Table 3 shows the mean three-monthly values for urinary E2, T and LH over 12 months. There was a significant increase and a clear upward pattern in mean urine E2, T and LH from baseline to follow-up for all subjects. The same pattern was seen in females when stratified by menarcheal status (data not shown). Urine hormone levels were not strictly progressive and in some instances decreased over time, though there was an overall increase in mean levels of all three hormones for males and females. For example, there was a 22.2% decrease in mean male and a 40.0% decrease in mean female E2 between six and nine months, a 14.6% decrease in male T between baseline and three months, and a 24.6% decrease in male and a 16.7% decrease in female T between six and nine months. For LH, there was a 3.4% decrease for males between baseline and three months, and decreases of 14.3% and 23.3% for males and females, respectively, between six and nine months. At all other times there was an increase in urine hormones. For serum hormones, a decline from baseline to follow-up was observed in 22 adolescents (11 female) for E2, 19 (13 female) for T and in 26 adolescents   Table 4), and increased in each Tanner stage.
Serum hormone levels versus time. Mean serum hormone levels at baseline and one-year follow-up are shown in Table 5. Of the 194 serum collections, two (1%) T and nine (5%) LH assays (all in separate individuals) were below the LLOQ. No E2 samples were below the LLOQ. Those with LH samples below the limits of detection were significantly younger than their peers (mean difference 1.47 years; 95% CI 0.81-2.14; p<0.001). No difference in age was observed for those with serum E2 or T samples below the limits of detection.
Hormone values all increased significantly over the 12-month period and differed significantly between genders, but the rate of change did not differ significantly between genders (interaction p>0.05), except for T.
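The percentage changes in mean urine hormone levels between consecutive three-monthly collections can be computed as in the following sketch. The mean values used are hypothetical placeholders chosen for illustration; they are not the values in Table 3, and the function names are likewise assumptions.

```python
def percent_change(previous: float, current: float) -> float:
    """Percentage change from one collection mean to the next.

    Negative values indicate a decrease.
    """
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return 100.0 * (current - previous) / previous


def consecutive_changes(means: list[float]) -> list[float]:
    """Percentage change between each pair of consecutive collection means."""
    return [percent_change(a, b) for a, b in zip(means, means[1:])]


# Hypothetical mean hormone levels at 0, 3, 6, 9 and 12 months
hypothetical_means = [10.0, 12.0, 15.0, 9.0, 18.0]
changes = consecutive_changes(hypothetical_means)
for label, change in zip(["0-3", "3-6", "6-9", "9-12"], changes):
    print(f"{label} months: {change:+.1f}%")
```

In this hypothetical series the level falls between six and nine months yet the overall trend from baseline to 12 months is upward, which is the kind of non-monotonic pattern described for the urine means.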

Association between Anthropometry Measurements and Urine and Serum Hormones
Urinary E2, T and LH all correlated positively with Tanner staging at baseline and 12-month follow-up (p<0.001 for all). Table 6 shows the associations between anthropometry changes and changes in sex hormones. Change in height was associated with changes in serum T, and serum and urine LH in females and both serum and urine T in males. Change in weight was associated with changes in urine E2, serum T and serum LH in females. Serum T and LH were associated with self-rated Tanner stage in males, but the corresponding associations for urine were not statistically significant. No other significant associations were observed between serum or urinary hormones and changes in anthropometry over 12 months.

Discussion
There are two major findings of this study. First is the demonstration that it is feasible to collect urine samples from a community-based adolescent cohort at three-monthly intervals, with high compliance (93% serum and 92% urine collections completed) and a low follow-up attrition rate. Second is the finding that urine levels both correlate with serum testosterone and oestradiol and have the potential to provide a more subtle and nuanced description of individual puberty hormone progression. In order to demonstrate the specific effects of puberty hormones on any biological aspect of adolescent development and health, a methodology that allows more frequent biological sampling than has previously been reported for community-based field studies is essential. A single reading of testosterone or oestradiol in any biological sample is of limited use. It is change that is relevant, hence the importance of longitudinal studies with frequent sampling. Urine samples have the advantage over blood samples of being more acceptable to both research ethics committees and adolescents. Each overnight sample also provides a more time-integrated hormone measure. This is particularly true in early puberty, when pubertal hormones commence pulsatile secretion nocturnally [1], so that a first morning urine sample may be more informative than a serum sample at any single time point. Urine collections also allow more frequent sampling than would be tolerated with repeated venepuncture [26]. Salivary samples are potentially easier to collect, but blood contamination and the influence of flow rate on measurements seriously limit validity and accuracy, and these samples may be subject to non-compliance by children [27].
Urine steroid measurements have been dramatically improved by the introduction of LC-MS/MS technology. Urine steroid LC-MS/MS measurements have been used exclusively as the basis for anti-doping tests worldwide for the last few decades [3] and were required to have high specificity. This led to the recognition of the lower specificity of steroid immunoassays and resulted in major changes to clinical endocrinology research [3]. Steroid mass spectrometry ensures the accuracy and specificity necessary for the measurement of low levels of E2 and T and the detection of subtle changes in these gonadal steroids over time. Anthropometric and serum hormone changes in our study revealed the anticipated increases over the course of one year in a cohort of young adolescents. The urine data also revealed the anticipated hormone increases over the 12 months of observation. However, mean and individual urinary hormone changes were not strictly progressive, suggesting within-subject variability in early and mid-pubertal hormone levels, which may contain novel and hitherto unexploited information on determinants of biological aspects of pubertal progression. As these urine data are novel, we are unable to compare them with any previously published data, notably regarding the degree of individual variability. Mouritsen and colleagues have shown that serum testosterone is variable in a longitudinal study of 20 adolescents (10 male) with sample collection biannually for five years [28]. Individual hormone curves showed that testosterone levels fluctuated during the study, though the overall pattern demonstrated an increase in serum testosterone (as assayed by LC-MS/MS and immunoassay) from recruitment to the end of follow-up.
Biro's work in 252 females followed every six months for six years similarly measured androstenedione, estrone, E2 and T by LC-MS, with levels that increased over the transition from pre-puberty to six months after reaching Tanner stage 2 [29]; however, neither study provides inter-individual variability or mean inter-individual hormone levels. Both previous studies measured only serum steroids in a single gender. We also identify the need for studies of longer duration in order to comment on what implications this variability (or what could be interpreted as instability) of hormone change may have on more distal responses to gonadal hormone change patterns in puberty, be these physical, neurobehavioral or mood. This finding also suggests that three-monthly urine collections over an extended period have the capacity to provide new insights into the biology of puberty.
We have shown considerable overlap between hormone levels at each age and self-reported Tanner stage, which emphasises the need for better descriptors of individual puberty hormone changes. Such overlap is consistent with other data using LC-MS/MS and clinician rating [30][31][32] and indicates the complexity and dynamic nature of puberty. Both Tanner stage and anthropometric change lag behind hormonal change. Using these as surrogates for puberty hormone change will less accurately describe the relationship between hormone changes and the resultant physical changes, as well as any other adolescent health or developmental change of interest, such as mood or behaviour. More broadly still, this study has potential importance to the understanding of normal puberty because there is a paucity of frequently sampled longitudinal studies of the hormonal changes during puberty. This situation leads to over-interpretation of how adolescent mood, behaviour and wellbeing relate to puberty hormones, going well beyond available data [33,34]. Our study methodology, with frequent sampling of an easily accessible biological fluid, is likely to provide new insights into the dynamics of pubertal hormonal changes, especially in the tempo and stability of puberty change. These data in turn may allow an understanding of how puberty hormone change relates to the behavioural aspects of adolescence.
Previous work has questioned the validity of self-rated Tanner staging [1][11][12][13][14]. In this study a high proportion of adolescents completed self-assessment, with a stable or advanced Tanner stage at 12-month follow-up in all but 6% of adolescents (who went backwards a stage). Self-rated Tanner staging also corresponded well with conventional anthropometric measures of puberty change.
In conclusion, our work has established a feasible method for intensive urine sampling of community-dwelling adolescents and used a robust methodology of urine sex steroid hormone measurement, liquid chromatography-tandem mass spectrometry [3], which displays the high sensitivity and specificity needed to detect low levels of sex hormones, a particular challenge in the study of pubertal progression [6,35]. Based on previously recorded longitudinal growth data [36,37], it is anticipated that frequently measured urine samples over the two- to three-year window of normal puberty might well provide a firmer biological basis for clinically observed patterns of puberty, such as rapid, slow or variable tempo, which may over the longer term help explain some of the observed differences in adolescent mood and behaviour.