Altered pubertal timing in 7q11.23 copy number variations and associated genetic mechanisms

SUMMARY
Pubertal timing, including age at menarche (AAM), is a heritable trait linked to lifetime health outcomes. Here, we investigate genetic mechanisms underlying AAM by combining genome-wide association study (GWAS) data with investigations of two rare genetic conditions clinically associated with altered AAM: Williams syndrome (WS), a 7q11.23 hemideletion characterized by early puberty; and duplication of the same genes (7q11.23 Duplication syndrome [Dup7]) characterized by delayed puberty. First, we confirm that AAM-derived polygenic scores in typically developing children (TD) explain a modest amount of variance in AAM (R2 = 0.09; p = 0.04). Next, we demonstrate that 7q11.23 copy number impacts AAM (WS < TD < Dup7; p = 1.2x10−8, η2 = 0.45) and pituitary volume (WS < TD < Dup7; p = 3x10−5, ηp2 = 0.2) with greater effect sizes. Finally, we relate an AAM-GWAS signal in 7q11.23 to altered expression in postmortem brains of STAG3L2 (p = 1.7x10−17), a gene we also find differentially expressed with 7q11.23 copy number (p = 0.03). Collectively, these data explicate the role of 7q11.23 in pubertal onset, with STAG3L2 and pituitary development as potential mediators.


INTRODUCTION
Menarche, when girls begin menstruation, is a pivotal physical developmental milestone that not only holds cultural and social importance in many communities, signifying the onset of reproductive ability, but is also medically meaningful with a range of concurrent and future health implications. Menarche is also a highly important but under-recognized public health consideration.[3][4] Pubertal maturation of the hypothalamic-pituitary-ovarian axis underlying menarche is complex, and age at menarche (AAM) varies widely across individuals. This variability is associated with subsequent risk for a number of illnesses later in life, including various oncologic, cardiometabolic, gynecological/obstetric, gastrointestinal, musculoskeletal, and neuropsychiatric diseases.4 For instance, early menarche has been associated with increased breast cancer risk,5,6 likely due, at least in part, to longer cumulative estrogen exposure throughout reproductive life.7,8 [10][11][12][13] Some evidence suggests that, even many decades past menarche, Alzheimer's disease risk may be elevated in those with later AAM,14,15 though confirmatory work is needed.16,17 [19][20][21][22] This genetic component raises the possibility that some of the epidemiologic associations between AAM and subsequent disease risk could be mediated not just by the direct sequelae of exposure to a certain cumulative number of lifetime menstrual cycles, but also by underlying pleiotropic genetic variation, though mechanistic evidence from biological experimentation is needed to elaborate this hypothesis. A recent genome-wide association study (GWAS) of ~370,000 women identified 389 significant independent signals that contribute to AAM,23 indicating a highly polygenic architecture guiding this important developmental trait. This complexity is consistent with studies of puberty-related disorders in both human and animal models, which point to a multifaceted network of signaling molecules and pathways regulating hypothalamic-pituitary-gonadal (HPG) maturation.24,25 In addition to the abundance of common, GWAS-defined variants that have been statistically

Polygenic contributions to AAM
The degree to which common genetic variation cumulatively influences a phenotype can be estimated as a single value with polygenic scoring methods.43 We used these methods, in conjunction with previously reported summary statistics from a GWAS of retrospectively self-reported AAM that included ~370,000 adult women,23 to confirm and quantify the genetic contribution to pubertal timing in a reference sample of TD girls participating in our longitudinal studies. Based on these GWAS summary statistics,23 we calculated AAM polygenic scores (PGS-AAM) for 47 TD girls (whose date of first menstruation was recorded soon after it occurred; Table 1). PGS-AAM was positively related to clinician-ascertained AAM such that individuals with a polygenic proclivity toward later AAM were observed to have later actual AAM (r = 0.3, p = 0.04, Figure 1). Common genetic variation accounted for approximately 9% of the variance in AAM within this TD reference group, an effect size consistent with prior reports.23

Rare variant (7q11.23 CNVs) contributions to AAM
Clinical observations have suggested that individuals with WS (hemideletions yielding one copy of ~25 genes at chromosomal locus 7q11.23) have early puberty, whereas those with Dup7 (having three copies of these same genes) have delayed puberty. We sought to confirm these observations empirically and ascertained AAM through clinical interviews of eight girls with WS, 47 TD girls, and nine girls with Dup7 (Table 1). In our rare cohorts of people with 7q11.23 CNVs, we found that AAM significantly differed across copy number groups such that greater gene dosage was associated with older AAM (Average AAM: WS = 10.96 ± 1.1 years, TD = 12.6 ± 1.25, Dup7 = 15.8 ± 2.7; F = 24.99, η2 = 0.45, p = 1.2x10−8, Figure 2), with 7q11.23 copy number accounting for 45% of the variance in AAM. Post-hoc Tukey's honest significant difference tests showed that AAM in the WS group was significantly earlier than AAM in both TDs (p = 0.01) and Dup7 (p = 5.7x10−7), and AAM in the Dup7 group was significantly later than in TDs (p = 2.3x10−8).
Because common genetic and environmental factors could influence the timing of puberty, we also collected AAM data on typically developing full sibling girls of our participants with WS (8 siblings, AAM = 12.7 ± 1.1) and Dup7 (8 siblings, AAM = 12.2 ± 0.8). These sibling participants (who have the typical two copies of 7q11.23 genes) live and have grown up in the same households as the CNV participants. Therefore, their development could have been affected not only by similar genetic factors (outside of the 7q11.23 region), but also by similar environmental factors that may affect the timing of puberty, including nutrition and socioeconomic status. We found no AAM differences between the unrelated TD group and the 16 siblings of our patients (unrelated TD AAM = 12.6 years, sibling AAM = 12.45 years; F = 0.6, p = 0.5). Additionally, we repeated our analysis of AAM as a function of copy number status, substituting these siblings for unrelated TDs in the comparison with the two 7q11.23 CNV groups, and confirmed the stepwise copy number effect on AAM while controlling for these shared genetic and environmental factors (F = 17.8, η2 = 0.5, p = 8x10−5).
Because one individual with Dup7 included in this analysis was treated with growth hormone during childhood, we repeated the analysis testing for association of AAM with 7q11.23 copy number excluding this individual, and the results were largely unchanged and remained highly significant (F = 20.55, η2 = 0.41, p = 1.6x10−7).
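As a consistency check on the effect sizes quoted above, η2 for a one-way ANOVA can be recovered from the F statistic and its degrees of freedom, since η2 = (F × df_between) / (F × df_between + df_within). The sketch below is illustrative only: the degrees of freedom are not reported in the text and are assumed here from the stated group sizes (8 WS, 47 TD, 9 Dup7), and the function name is hypothetical.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    scaled_f = f_value * df_between
    return scaled_f / (scaled_f + df_within)


# Assumption: degrees of freedom inferred from the group sizes given in the text
# (8 WS + 47 TD + 9 Dup7 = 64 girls in 3 groups); they are not reported values.
n_total, n_groups = 8 + 47 + 9, 3

# Primary analysis: F = 24.99
primary = eta_squared_from_f(24.99, n_groups - 1, n_total - n_groups)

# Sensitivity analysis excluding one participant with Dup7: F = 20.55
sensitivity = eta_squared_from_f(20.55, n_groups - 1, (n_total - 1) - n_groups)

print(round(primary, 2), round(sensitivity, 2))  # prints: 0.45 0.41
```

Under these assumed degrees of freedom, both recovered values agree with the reported η2 of 0.45 and 0.41.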

Pituitary volume in individuals with 7q11.23 CNVs
The pituitary gland is a key brain region in regulating endocrine function and is central to the HPG axis, producing a surge of hormones at the onset of puberty. Additionally, this region is enriched with genes underlying AAM-related genetic loci,23 suggesting that the epidemiologic association between AAM and diseases later in life may be attributed, at least in part, to variability in exposures to sex hormones related to menarche. Prior evidence points to pituitary size increasing in puberty, with potentially larger glands in individuals with early puberty.41,42 To test whether individuals with rare 7q11.23 CNVs, who (as noted above) showed differential timing of AAM, also have pituitary glands of differing size, raters blinded to participants' copy number status manually segmented the pituitary from structural brain MRIs of 30 individuals with WS, 50 TD children, and 16 individuals with Dup7, matched for age (Table 1). Total volume of the segmented region was calculated and compared across groups. Pituitary size was significantly associated with 7q11.23 copy number, such that individuals with WS had the smallest pituitary glands, while individuals with Dup7 had the largest (F = 11.7, ηp2 = 0.2, p = 3x10−5, Figure 3). Post-hoc pairwise comparisons showed that the pituitary size of participants with WS was significantly smaller than that of both TDs (p = 1.1x10−4) and those with Dup7 (p = 4.5x10−5); pituitaries of TDs were nominally, but not significantly, smaller than those of participants with Dup7 (p = 0.17). The directionality of this across-group relationship was opposite to that expected: as noted above, individuals with precocious puberty have larger pituitary volumes,41,42 but here, individuals with Dup7, who have later AAM, had larger pituitary volumes. In reviewing the images after raters blinded to copy number status had manually performed pituitary segmentation, it was noted that glands of several individuals with Dup7 had a more cystic appearance, potentially suggesting a buildup of unreleased hormone.44,45
Next, to examine developmental patterns of pituitary volume longitudinally within 7q11.23 copy number groups, investigators, again blind to diagnosis, segmented pituitaries from 226 longitudinal visits of these same participants, many of whom return approximately every two years. This analysis included scans from 87 visits of 30 participants with WS, 102 visits of 50 TDs, and 37 visits of 16 participants with Dup7. There were significant age (p = 2x10−16) and group (p = 2x10−4) effects and an age-by-group interaction (F = 10.02, p = 2.5x10−6), wherein pituitary volumes of the three groups were similar until after the start of puberty, but then began to diverge, and the reduced volumes in WS and the increased volumes in Dup7 became apparent (Figure 4).
Additionally, to ensure that these analyses were not driven by treatment with any hormonal medications, we performed sensitivity analyses for both the cross-sectional and longitudinal assessment of pituitary volumes after excluding participants who had any history of receiving treatment with hormonal medications. The results of these sensitivity analyses were largely unchanged from those of the primary analyses, both for the cross-sectional analysis (22 participants with WS, 50 TDs, and 15 participants with Dup7; F = 8.22, ηp2 = 0.17, p = 5.5x10−4) and for the longitudinal age-by-group interaction (F = 6.02, p = 4.5x10−4, based on 186 total visits).

7q11.23 genetic signals impacting AAM
To determine which, if any, of the 389 genetic signals previously reported to be significantly associated with AAM in a large-scale GWAS23 are located at the 7q11.23 locus, we referenced the reported summary statistics from that study. One genome-wide significant signal spanning the telomeric end of the WS critical region was identified, extending into the low-copy repeat region flanking the WS 7q11.23 locus (peak SNP rs2267812, p = 1.7x10−17, Figure 5). To further characterize this significant locus, we searched for eQTLs in data derived from postmortem brains of healthy individuals from the Brainseq database (http://eqtl.brainseq.org/phase1/eqtl/). Though this peak SNP, rs2267812, is located within the GTF2I gene, we found that its variation was most significantly associated with expression of two transcripts of the STAG3L2 gene (ENST00000448772 and ENST00000380775, p bonf = 0.002 and p bonf = 0.01, respectively), suggesting that the latter gene may be responsible for the association of this locus with AAM. Importantly, the ENST00000380775 transcript of STAG3L2 spans 194 kb, nearly the entire locus seen in Figure 5, and includes an exon underlying the GTF2I gene in the telomeric portion of the 7q11.23 region hemideleted in WS (Figure 6). In contrast, the ENST00000448772 transcript lies exclusively in the flanking region that is not typically involved in the classic WS hemideletion (Figure 6), further emphasizing the importance of ENST00000380775 for this work. Though one exon of the ENST00000380775 transcript of STAG3L2 extends into the classic WS critical region, the typical reporting of this gene lies outside of the WS locus, in the flanking low-copy repeat region (Figure 6, bottom). Therefore, to test whether individuals with WS and Dup7 have altered expression of STAG3L2, we analyzed RNASeq data from blood lymphocytes of 23 children with WS, 40 TD children, and 13 children with Dup7 (Table 1), and found that expression of STAG3L2 was related to CNV dosage (p = 0.03; Figure 7). Post-hoc t-tests showed that individuals with WS had greater STAG3L2 expression than both TDs (p = 0.03) and individuals with Dup7 (p = 0.04), but the greater expression in TDs compared to Dup7 was not significant (p = 0.2). Surprisingly, the directionality of this finding was such that expression of STAG3L2 was highest in individuals with WS (who have hemideletions) and lowest in individuals with Dup7 (who have an extra copy of the ~25 genes in the WS 7q11.23 locus), suggesting that hemideletion of this region may include the removal of regulatory element(s) that inhibit STAG3L2 transcription. We also examined RNASeq data regarding the expression of the other STAG3-like genes (which are not located in the 7q11.23 CNV region) and found no associations between 7q11.23 copy number status and expression of these genes. These include STAG3 (p = 0.98), STAG3L1 (p = 0.2), STAG3L3 (p = 0.96), STAG3L4 (p = 0.42), STAG3L5P (p = 0.99), and STAG3L5P-PVRIG2P-PILRB (p = 0.98).

DISCUSSION
Consistent with prior clinical observations, we empirically confirmed that individuals with 7q11.23 CNVs have differential pubertal timing, such that greater gene dosage (i.e., in participants with Dup7) was associated with older AAM, whereas hemideletion (i.e., in WS) was associated with earlier pubertal timing. In support of our focus on these rare neurodevelopmental conditions in order to better understand genetic sources of AAM variability, we found that CNV status explained approximately 45% of our sample's total variance in AAM, reflecting both the wide range of AAM in our CNV cohorts and the strength of the highly penetrant CNV effect. Additionally, our observation is likely an underestimation of the actual CNV effect, as the girls with WS who were treated with hormonal medication to delay their early onset of puberty could not be included in this analysis because their AAM was pharmacologically altered. In contrast to this robust copy number effect, the cumulative effects of GWAS-identified common genetic variation, while present, accounted for ~9% of the variance in AAM in our reference sample of TD children, which, importantly, is similar to the previously reported ~7.4% of the population variance estimated from prior large-scale GWAS data.23
Though there is a significant genetic component driving AAM, environmental factors, including nutrition and socioeconomic status, could also affect development. Our analysis of AAM comparing patients with 7q11.23 CNVs to full siblings (who have the typical two copies of 7q11.23 genes and live and have grown up in the same households as the patients) was highly significant. In fact, the effect size estimate increased (from η2 = 0.45 to 0.5) even though the sibling cohort was substantially smaller than the cohort of unrelated TD individuals with two copies of 7q11.23 (Ns = 47 TDs versus 16 sibling girls), suggesting that taking the shared genetic (outside of the 7q11.23 region) and environmental factors into consideration brought the observed copy number effect into even clearer focus.
In addition to AAM, pituitary size was also associated with 7q11.23 copy number, such that individuals with WS had the smallest pituitary glands while individuals with Dup7 had the largest. This result indicates that pituitary development is likely subject to genetic regulation originating, at least in part, from within the 7q11.23 locus. The pituitary gland, a central endocrine hub governing sex hormone homeostasis and playing a crucial role in the pubertal transition, shows volumetric changes associated with sexual development and sex hormone milieu46,47 and is enriched for AAM-associated gene transcripts.23 Moreover, SNPs near genes involved in hormone synthesis, bioactivity (e.g., ESR1, PGR), and pituitary function (e.g., TACR3, LGPR4) have been found to be associated with menarche timing.48 However, the directionality of the association between pituitary size and 7q11.23 CNVs, with individuals with WS having smaller pituitary volumes (and earlier puberty) and individuals with Dup7 having larger pituitaries (and later puberty), suggests that the molecular mechanisms driving earlier puberty in WS may be different from those for clinical disorders associated with formal diagnoses of central precocious puberty, wherein pituitary volumes have been found to be enlarged.41,42 Additionally, our qualitative observation of more cystic-appearing pituitary glands in individuals with Dup7 will require further investigation, including determining whether this observation might indicate a buildup of unreleased hormone. Regardless, because of the pituitary's important role in pubertal development and concomitant morphological changes mirroring pubertal timing abnormalities, it is unlikely that the parallel 7q11.23 CNV associations with both AAM and pituitary size are independent phenomena. Future work will be needed to delineate potential biological pathways leading from specific WS critical region gene hemideletion/duplication to pituitary structural and functional change resulting in AAM alterations.
Given the results we obtained from combining large-scale GWAS-level information with incisive studies of individuals with rare 7q11.23 CNVs, one candidate gene deserving of such investigation is STAG3L2. We identified one genome-wide significant signal from the Day et al. findings23 that spans the telomeric end of the 7q11.23 CNV region, extending into the low-copy repeat region flanking the WS locus (peak SNP rs2267812). Expression data from postmortem brains suggest that, though the peak SNP is physically located in an intron of the GTF2I gene, variation at rs2267812 is associated with differential expression of a different gene, STAG3L2. This gene is ubiquitously expressed, particularly in testes, ovaries, and pituitary,37 and is similar in structure to at least five other "STAG3-like" genes, all of which lie outside the 7q11.23 WS critical region in other low-copy repeat regions on chromosome 7.
Interestingly, the existence of these multiple "STAG3-like" genes is believed to be due to segmental duplications that first occurred in a common ancestor to all hominids after diverging from macaques 12-19 Mya, though the emergence of the STAG3L2 gene of interest here likely occurred during a much more recent segmental duplication in evolution, estimated to have occurred 2.55-2.89 Mya.49 [57][58][59][60] Indeed, the findings reported here suggest that the structurally similar STAG3L2 gene is also an important mediator in endocrine function and has the potential to also influence the onset of puberty. The observation from our blood lymphocyte RNASeq analyses of individuals with 7q11.23 CNVs that an increase in STAG3L2 RNA expression is linked to WS is surprising, as one might expect hemideletion to result in a decrease in expression. This unexpected result suggests that hemideletion of the WS critical region may lead to the removal of regulatory elements that normally inhibit transcription. These findings are particularly intriguing in light of the reported links between STAG3 loss-of-function mutations and primary ovarian insufficiency in women, which is also characterized by delayed puberty. This association of the ancestral STAG3 gene with pubertal timing is consistent with our observations in individuals with Dup7, in whom STAG3L2 expression is decreased and puberty is delayed, versus in WS, where STAG3L2 expression is increased and puberty is early. Thus, it seems plausible that the altered STAG3L2 expression observed in WS and Dup7 and the altered STAG3L2 expression associated with common variation linked to AAM (i.e., rs2267812) could reflect a unitary genetic mechanism impacting the timing of puberty. Understanding the precise mechanisms underlying these associations will require further research, but these findings may ultimately shed light on the complex interplay between genetics and pubertal timing, and the role of STAG3L2 in these processes.
While much of our focus on the GWAS signal studied here has been on the STAG3L2 gene, the physical location of the peak SNP in the GTF2I gene offers a possible alternative candidate for the gene of interest underlying this locus. However, given the associations between STAG3 and ovarian insufficiency, the eQTL linking rs2267812 with STAG3L2 expression in the postmortem brain, and the alternative STAG3L2 transcript that spans the entire significant GWAS locus, STAG3L2 appears to be the more likely candidate responsible for the differential impacts on pubertal timing observed in individuals with WS and Dup7. Nonetheless, GTF2I codes for a general transcription factor (TFII-I) that regulates methylation of genes outside of the WS critical region,61 and its targets include regulators of phosphorylation and WNT signaling.62 Additionally, variation in this gene has been associated with the hypersocial phenotype seen in WS,63 and recent evidence has linked this gene to oligodendrocyte function and myelination in the human and murine brain.64 Therefore, this important gene warrants further study, but may be the less likely candidate for the pubertal effects reported here.
Overall, the findings presented here provide valuable insights into the genetic mechanisms underlying the timing of puberty and the role of the 7q11.23 locus, particularly the STAG3L2 gene, in this pivotal developmental process. Specifically, the study confirmed that individuals harboring AAM-associated common polymorphisms and those with 7q11.23 CNVs have differential pubertal timing, with relatively modest versus large effect sizes, respectively. Furthermore, alongside AAM, the size of the pituitary may be modulated by 7q11.23 CNVs and merits investigation as a potential neural substrate for relevant pubertal events. Additionally, we suggest that STAG3L2 may yoke both rare CNV and common-SNP findings in this arena, potentially representing a key mediator of altered pubertal onsets in 7q11.23 CNV syndromes that may also generalize to normative variation in the timing of puberty. Further research will be necessary in order to understand the precise mechanisms underlying these associations, but collectively, the present data provide valuable insights into, and pathways for further discovery of, the complex interplay between genetics and puberty timing. They also present a potential avenue to investigate the adverse health consequences associated with the timing of puberty, which present substantial public health concerns.

Limitations of the study
Several caveats warrant mention regarding the present work. First, because of the relatively small sample sizes of these rare CNVs, we are unable to directly assess the effects of race or early stress, which, in addition to genetics, might influence AAM. Also, there are relatively limited data assessing age at voice breaking in boys, a corresponding pubertal milestone in males that shares many similar genetic signals with AAM in girls.23 Though there may be multiple reasons for these limited data, including the fact that our cohort of individuals with WS has a sampling bias toward girls, the timing of voice breaking in boys has proved to be more difficult to ascertain than AAM. In our experience, individuals and parents tend to remember the exact date of the discrete event of menarche in girls but are less likely to remember even the approximate date of the relatively slower process of voice breaking in boys. Lastly, mechanistic interpretation of these findings related to sex steroids is limited by the lack of information regarding circulating hormone levels in these participants. Therefore, the presence of any such association with these data will require further study.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:
... further analyses, based on previously described methods.67,68 The relation between the resulting PGS-AAM scores and clinician-ascertained AAM was determined using Pearson correlation in SPSS.
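To make the two steps named above concrete, the following minimal sketch shows a polygenic score in its simplest form (a sum of allele dosages weighted by GWAS effect sizes) and a Pearson correlation between scores and an observed trait. It is not the authors' pipeline: the effect sizes, genotypes, and AAM values are invented for illustration, the function names are hypothetical, and any SNP selection or quality-control steps in the cited methods are not represented.

```python
import math


def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1, or 2) by per-SNP effect size."""
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical inputs: three SNP effect sizes (years per allele) and four girls.
effect_sizes = [0.10, -0.05, 0.20]
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
aam_years = [12.9, 12.1, 12.6, 13.0]

scores = [polygenic_score(g, effect_sizes) for g in genotypes]
r = pearson_r(scores, aam_years)
r_squared = r ** 2  # proportion of AAM variance accounted for by the score
```

The squared correlation is the variance-explained figure quoted in the Results: r = 0.3 corresponds to R2 = 0.09.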

Association of AAM and 7q11.23 CNVs
AAM was ascertained, as above, through clinical interview of participants and their parents, including 17 girls with 7q11.23 CNVs (eight with WS and nine with Dup7) and 47 typically developing (TD) girls. AAM was also ascertained from 8 siblings of individuals with WS and 8 siblings of individuals with Dup7. Seven additional girls with WS had been excluded from analysis as they were treated with hormonal medications to pharmacologically delay the onset of puberty. No participants with Dup7 received hormonal treatment to alter the timing of puberty, and therefore none were excluded for this reason. A one-way ANOVA was used to calculate group differences in AAM in SPSS across WS, TD, and Dup7 participants. Post-hoc Tukey's honest significant difference tests were performed between each group to assess whether AAM significantly differed between WS and TD groups and between Dup7 and TD groups. Additionally, a one-way ANOVA was also used to examine other common genetic and environmental factors that may influence the timing of puberty across WS siblings, TD, and Dup7 siblings.
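For readers less familiar with the test named above, the sketch below works through a one-way ANOVA from raw values: the total variability is partitioned into between-group and within-group sums of squares, from which F and η2 follow. The analyses in this paper were run in SPSS; this is only an illustration of the calculation, and the three small groups of AAM values are invented.

```python
def one_way_anova(groups):
    """Return (F, eta squared) for a list of groups of numeric observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared


# Hypothetical AAM values (years) for three copy number groups.
ws = [10.0, 11.0, 12.0]
td = [11.0, 12.0, 13.0]
dup7 = [14.0, 15.0, 16.0]

f_value, eta_squared = one_way_anova([ws, td, dup7])
print(f_value, eta_squared)
```

Post-hoc pairwise tests such as Tukey's honest significant difference are a separate step and are not shown here.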

Pituitary Volume Size Differences with 7q11.23 CNVs
For age-matched groups of 30 individuals with WS, 50 TD children, and 16 individuals with Dup7, investigators blinded to diagnostic group determined the volume of the pituitary by manual segmentation on T1-weighted ME-MPRAGE structural scans (TR/TE: 10.5/1.8 ms, flip angle: 7°, voxel size: 1 × 1 × 1 mm, 176 slices) collected on a GE 3T MRI scanner. Group differences in pituitary volumes across 7q11.23 copy number groups were assessed in SPSS using a general linear model, controlling for age at scan and sex. Post-hoc pairwise comparisons of estimated marginal means were also reported from the general linear model in SPSS. To further examine the developmental patterns of the pituitary volume across 7q11.23 CNVs, 226 longitudinal visits (1-5 visits per participant, age range: 5-24 years old) of these same participants (all children returned approximately every two years for the study; WS: N = 30, 87 visits; TD: N = 50, 102 visits; Dup7: N = 16, 37 visits) were used to evaluate the development of pituitary size. R's gamm4 statistical package was used to carry out a mixed-effects penalized-spline analysis to model age-dependent developmental trajectories within these three diagnostic groups.
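The volume of a manually segmented structure is conventionally the number of labeled voxels multiplied by the volume of one voxel. The sketch below illustrates that arithmetic with the 1 × 1 × 1 mm voxel size stated above; the nested-list mask and the function name are hypothetical, and the segmentation software actually used is not specified in this section.

```python
def segmented_volume_mm3(mask, voxel_dims_mm=(1.0, 1.0, 1.0)):
    """Volume in cubic millimeters of the nonzero voxels in a 3D binary mask.

    mask is a nested list indexed as mask[slice][row][column]; voxel_dims_mm
    gives the edge lengths of one voxel in millimeters.
    """
    dx, dy, dz = voxel_dims_mm
    voxel_volume = dx * dy * dz
    n_voxels = sum(1 for plane in mask for row in plane for value in row if value)
    return n_voxels * voxel_volume


# Hypothetical 2 x 2 x 2 mask with five voxels labeled as pituitary.
toy_mask = [
    [[1, 1], [0, 1]],
    [[0, 1], [1, 0]],
]
volume = segmented_volume_mm3(toy_mask)  # 5 voxels at 1 mm^3 each
```

With isotropic 1 mm voxels the volume in cubic millimeters equals the voxel count, which is why volumes in Figure 4 are reported in cubic millimeters.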

Differential expression analysis of STAG3L2 in 7q11.23 CNVs
RNA was extracted from lymphocyte cell lines for 23 children with WS, 40 TD children, and 13 children with Dup7, and RNA sequencing was performed at the NIH Intramural Sequencing Center. Stranded Poly-A selected mRNA libraries were constructed from 1 μg total RNA for each sample using the TruSeq Stranded mRNA Kit (Illumina) according to the manufacturer's instructions. Amplification was performed using 10 cycles to minimize over-amplified product. Unique dual-indexed barcode adapters were applied to each library. Libraries were pooled in an equimolar ratio for sequencing. The pooled libraries were sequenced on an S4 flow cell on a NovaSeq 6000 using version 1.0 chemistry to achieve a minimum of 49 million 150 base read pairs. The data were processed using RTA version 3.4.4. Alignment of resulting RNASeq fastq files was performed using STAR version 2.6.1 to the GRCh37 genome build. Aligned reads were further processed using QoRTs version 1.3.6, and expression values of STAG3L2 were analyzed using DESeq2 to test for stepwise group differences. Post-hoc pairwise t-tests of extracted normalized count values were conducted in SPSS to test for differences in STAG3L2 expression between groups.
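The "normalized count values" mentioned above come from DESeq2, whose default normalization is the median-of-ratios method: each sample's size factor is the median, across genes, of that sample's count divided by the gene's geometric mean across samples, and raw counts are divided by the size factor. The sketch below is a simplified pure-Python illustration of that idea, not a substitute for DESeq2 and not the authors' code; the count matrix is invented and the function names are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts is a list of genes, each a list of raw counts (one per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    usable = [gene for gene in counts if all(c > 0 for c in gene)]
    log_geo_means = [sum(math.log(c) for c in gene) / n_samples for gene in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(gene[j]) - lgm
                      for gene, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw_counts = [[10, 20], [100, 200], [50, 100], [0, 3]]
factors = size_factors(raw_counts)
normalized = normalize(raw_counts)
```

In this toy matrix the depth difference is removed: after normalization the first three genes have equal values in both samples, so remaining differences between groups would reflect expression rather than sequencing depth.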

QUANTIFICATION AND STATISTICAL ANALYSIS
For the analysis of AAM in TD children, the relation between PGS-AAM scores and clinician-ascertained AAM was determined using Pearson correlation in SPSS. For the analyses of AAM in 7q11.23 CNVs, one-way ANOVAs were used to calculate group differences in AAM in SPSS, and post-hoc Tukey's honest significant difference tests were performed. For the cross-sectional analyses of pituitary volume, a general linear model was performed in SPSS, controlling for age at scan and sex. For the longitudinal pituitary volume data, mixed-effects penalized-spline modeling was carried out using R's gamm4 package. For the differential expression analysis, expression values were analyzed using DESeq2 to test for stepwise group differences, and post-hoc pairwise t-tests of extracted normalized count values were conducted in SPSS.

ADDITIONAL RESOURCES
This work is part of two clinical trials, NCT01434368 and NCT01132885.

Figure 1. Correlation between PGS-AAM and AAM in typically developing girls
PGS-AAM was positively related to clinician-ascertained AAM such that individuals with a polygenic proclivity toward later AAM were observed to have later actual AAM (R2 = 0.09, p = 0.04). Abbreviations: PGS: polygenic scores; AAM: age at menarche.

Figure 4. Developmental trajectories of 226 longitudinally acquired pituitary volumes (in cubic millimeters) graphed by 7q11.23 copy number status
Pituitary volumes from T1-weighted MRIs of 30 participants with Williams syndrome ([WS], 87 visits), 50 typically developing children ([TD], 102 visits), and 16 participants with 7q11.23 Duplication ([Dup7], 37 visits) over 1-5 visits for each participant across ages 8-20 years. The three developmental trajectory curves (WS, yellow; TD, orange; and Dup7, green) were longitudinally calculated with mixed-effects spline modeling; gray ribbons depict 95% confidence intervals for each group's curve. There were significant age (p = 2x10−16) and group (p = 2x10−4) effects and an age-by-group interaction (p = 2x10−6): pituitary volumes of the three groups were similar until after the start of puberty, but then began to diverge, with largest pituitary volumes in individuals with Dup7 and smallest volumes in WS, a pattern that continued throughout the studied age range; volume increased at a faster rate in the Dup7 group.

Figure 5. SNP associations with AAM in the expanded 7q11.23 region
A "mini-Manhattan" plot of SNP associations with AAM across the expanded WS critical region. One genome-wide significant signal spanning the telomeric end of the WS critical region was found, extending into the low-copy repeat region that telomerically flanks the WS 7q11.23 locus (peak SNP rs2267812, p = 1.7x10−17). Abbreviations: SNP: single nucleotide polymorphism; AAM: age at menarche; WS: Williams syndrome.

Figure 6. STAG3L2 alternate transcripts
The figure highlights the genes in the WS region and STAG3L2 transcripts significantly related to AAM-associated SNP rs2267812. Top shows WS critical region at the 7q11.23 locus; colored lines and gene names indicate affected individual genes, and the red gradient bar represents the classic WS hemideletion/Dup7 duplication. Bottom shows locations of introns for STAG3L2 transcripts as reported in Ensembl (http://www.ensembl.org). The first two transcripts, highlighted by the red box, represent those showing significant eQTLs with rs2267812 in data derived from postmortem brains of healthy individuals from the Brainseq database (http://eqtl.brainseq.org/phase1/eqtl/; ENST00000448772 and ENST00000380775, p bonf = 0.002 and p bonf = 0.01, respectively). Importantly, the ENST00000380775 transcript spans 194 kb, nearly the entire identified significant region, and includes the telomeric portion of the 7q11.23 region hemideleted in WS. In contrast, the ENST00000448772 transcript lies exclusively in the flanking region and is not typically involved in the classical WS deletion.

Figure 7. 7q11.23 CNVs and STAG3L2 expression
RNASeq analysis showed that expression of STAG3L2 was highest in individuals with WS (who have hemideletions) and lowest in individuals with Dup7 (who have an extra copy at 7q11.23; p = 0.03). Y axis shows normalized STAG3L2 counts from DESeq2. X's represent mean of each group, lines represent median, boxes represent the range of the central 50% of data, and whiskers represent maximum and minimum values. Abbreviations: WS: Williams syndrome; Dup7: 7q11.23 Duplication syndrome; TD: typically developing children.

Table 1
a For "AAM polygenic scores in TD girls" and "Association of AAM and 7q11.23 CNVs", age refers to AAM; for "Differential Expression Analysis of STAG3L2 in 7q11.23 CNVs", age refers to age at blood draw; for "Pituitary Volume Size Differences with 7q11.23 CNVs", age refers to age at scan.

TABLE
RESOURCE AVAILABILITY
  Lead contact
  Materials availability
  Data and code availability
EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS
METHOD DETAILS