The NIMH Intramural Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty: Protocol and rationale for methods and measures

Delineating the relationship between human neurodevelopment and the maturation of the hypothalamic-pituitary-gonadal (HPG) axis during puberty is critical for investigating the increase in vulnerability to neuropsychiatric disorders that is well documented during this period. Preclinical research demonstrates a clear association between gonadal production of sex steroids and neurodevelopment; however, identifying similar associations in humans has been complicated by confounding variables (such as age) and the coactivation of two additional endocrine systems (the adrenal androgenic system and the somatotropic growth axis) and requires further elucidation. In this paper, we present the design of, and preliminary observations from, the ongoing NIMH Intramural Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty. The aim of this study is to directly examine how the increase in sex steroid hormone production following activation of the HPG-axis (i.e., gonadarche) impacts neurodevelopment, and, additionally, to determine how gonadal development and maturation is associated with longitudinal changes in brain structure and function in boys and girls. 
To disentangle the effects of sex steroids from those of age and other endocrine events on brain development, our study design includes 1) selection criteria that establish a well-characterized baseline cohort of healthy 8-year-old children prior to the onset of puberty (e.g., prior to puberty-related sex steroid hormone production); 2) temporally dense longitudinal, repeated-measures sampling of typically developing children at 8-10 month intervals over a 10-year period between the ages of eight and 18; 3) contemporaneous collection of endocrine and other measures of gonadal, adrenal, and growth axis function at each timepoint; and 4) collection of multimodal neuroimaging measures at these same timepoints, including brain structure (gray and white matter volume, cortical thickness and area, white matter integrity, myelination) and function (reward processing, emotional processing, inhibition/impulsivity, working memory, resting-state network connectivity, regional cerebral blood flow). This report of our ongoing longitudinal study 1) provides a comprehensive review of the endocrine events of puberty; 2) details our overall study design; 3) presents our selection criteria for study entry (e.g., well-characterized prepubertal baseline) along with the endocrinological considerations and guiding principles that underlie these criteria; 4) describes our longitudinal outcome measures and how they specifically relate to investigating the effects of gonadal development on brain development; and 5) documents patterns of fMRI activation and resting-state networks from an early, representative subsample of our cohort of prepubertal 8-year-old children.



Adolescence versus gonadal development and puberty
Adolescence, defined as the period of transition between childhood and adulthood, is a time of physical, emotional, and cognitive development. Adolescence is also a time when multiple neuropsychiatric illnesses emerge (Costello et al., 2011; Green et al., 2005; Ma et al., 2019). This window of development has, therefore, received considerable attention in the search for neurobiological substrates and mechanisms underlying the instantiation not only of adult human characteristics but also of risks for these disorders (for review see: Meyer and Lee, 2019). However, "adolescence" is very broadly defined in terms of chronological age rather than by biological benchmarks, and it encompasses numerous and varied emotional, psychosocial, physiological, and anatomical events that are difficult to disambiguate mechanistically. In contrast to "adolescence," gonadal development (e.g., gonadarche), while embedded in the adolescent period, is more clearly and specifically defined and measurable as the development of mature reproductive function, including characteristic physiological changes (Witchel and Topaloglu, 2019). It is important to note that, in parallel with gonadal development, adolescence also involves the coactivation of two additional endocrine systems, the adrenal zona reticularis (i.e., adrenarche) and the somatotropic growth axis (Witchel et al., 2019), both of which, in theory, could also impact brain development. In this longitudinal study, our primary aim is to investigate the neurobiological correlates of puberty-related gonadal development, including its initiation at gonadarche. In addition, we explore neurodevelopmental effects of accompanying endocrine and metabolic events related to the adrenal androgenic system and somatotropic growth axis.
Our primary focus on the specific relationship between gonadal development and neurodevelopment is important for several reasons. First, the increased onset of a wide range of psychiatric illnesses during adolescence and the emergence of sex differences in many of these conditions suggest a specific role for differential gonadal sex steroid exposure in the ontogeny of these conditions (Cohen et al., 1993; McGee et al., 1992). Second, organizational effects of gonadal sex steroids have been well documented in the brain during this critical time period in preclinical research (for review see: Juraska and Willing, 2017). However, despite the clear importance of gonadal sex steroids during adolescence, studies of children during this time period could be complicated by general effects of chronologic aging on the brain or by effects related to the concurrent maturation of both adrenal androgen secretion and the somatotropic axis (for review see: Blakemore et al., 2010). Thus, the specific impact of gonadal development and gonadal sex steroid production on human neurodevelopment requires further elucidation.

Neuroendocrine events of puberty
Puberty involves a complex cascade of endocrine events that result in visible end-organ effects on reproductive and, potentially, neural substrates. Although the hypothalamic-pituitary-gonadal (HPG) axis is activated for limited times during fetal and neonatal periods (e.g., "mini-puberty"), throughout the remainder of childhood there is a yet-to-be fully characterized neurobiological "brake" on the HPG axis that slowly begins to release during puberty (Plant and Witchel, 2006). The reactivation of the HPG axis (i.e., gonadarche) that defines puberty includes an increase in pulsatile secretion of gonadotropin-releasing hormone (GnRH) from the hypothalamus, resultant release of luteinizing hormone (LH) and follicle-stimulating hormone (FSH) from the pituitary, and subsequent production of sex steroid hormones, including testosterone, estradiol, and progesterone, from the testes and ovaries. The first apparent manifestations of gonadarche on physical examination are the estrogen-dependent onset of breast bud development (i.e., thelarche) in the majority of girls, and the FSH-dependent expansion of the testes above 3 cc in boys (Bordini and Rosenfield, 2011a; Bordini and Rosenfield, 2011b). These physical developments and the underlying endocrinological events are important to document in order to associate neurodevelopment with pubertal onset and progression.

Effects of puberty on neurobiology: basic and human studies
Early studies in rodents and non-human primates offered crucial clues of an association between puberty-related events and brain development. For example, beginning in the 1970s, work in rhesus monkeys demonstrated that orbitofrontal and dorsolateral prefrontal cortices become involved in key neural and behavioral functions during late adolescence, roughly around the time of puberty, and also showed that this maturation was under the control of gonadal steroids (Clark and Goldman-Rakic, 1989; Goldman et al., 1974). It was also demonstrated that adolescence (i.e., as defined by chronological age) is marked by decreased synaptic density in the prefrontal cortex of non-human primates (Bourgeois et al., 1994) as well as in human post-mortem tissue (Huttenlocher et al., 1997).
Recent preclinical research in rodents has allowed more specific demonstrations of the association between gonadarche, gonadal sex steroids, and neurodevelopment, such as changes in dendritic and synaptic structure, axon myelination, and neuronal function (for reviews see: Juraska et al., 2013; Juraska et al., 2017; Sisk and Zehr, 2005). These changes include synaptic pruning in the medial prefrontal cortex (mPFC) in both sexes (Drzewiecki et al., 2016), neurogenesis and apoptosis in the mPFC in female but not male rats (Koss et al., 2015), and increased inhibitory input to the mPFC in female rodents (Piekarski et al., 2017). Thus, observed changes in rodent brain structure accompanying gonadarche and attendant changes in brain function both suggest sex differences in the impact of sex steroids. These convergent studies indicate that gonadarche is associated with the commencement of a "critical developmental window" for organizational changes similar to the critical windows seen in fetal brain development (Schulz et al., 2009).
In humans, neuroimaging studies also document gross morphological changes in the brain during adolescence (i.e., age-related), with gray matter volume increasing in late childhood, reaching a peak during early adolescence, and declining afterward (Giedd et al., 1999; Gogtay et al., 2004). It has been suggested that this time course reflects puberty-specific synaptic remodeling, but neuroimaging studies that specifically relate changes in gonadal hormones per se to changes in the brain during puberty are necessary. Puberty-related studies across neuroimaging modalities, including structural magnetic resonance imaging (sMRI), functional MRI (fMRI), and diffusion tensor imaging (DTI), have made considerable progress, but their findings are marked by variability in study designs and analytic approaches and require replication (for reviews see: Blakemore et al., 2010 (fMRI); Herting and Sowell, 2017 (sMRI); Ladouceur et al., 2012 (DTI)). Additionally, the lack of consensus in many human neuroimaging studies likely reflects, at least in part, the difficulty of isolating and dissociating the impact of the events of puberty from those of age and of non-puberty-related endocrine events on the brain (for review see: Vijayakumar et al., 2018).

Clinical relevance of puberty for psychiatric illness
The importance of delineating a relationship between human neurodevelopment and the endocrine events accompanying puberty is emphasized by the increased vulnerability (and in some children possibly the resilience) to neuropsychiatric disorders documented during this period. Epidemiological data demonstrate a peak onset of neuropsychiatric diseases during the pubertal transition, as well as the emergence of sex differences in their prevalence. For example, girls demonstrate a two- to three-fold increased prevalence of mood, anxiety, and eating disorders after puberty, while boys show no change in prevalence (Costello et al., 2011; Green et al., 2005; Ma et al., 2019). The instantiation of sex differences in neuropsychiatric disorders, particularly in female-biased rates of affective and anxiety disorders, has been inferentially linked to the production of gonadal steroids (Hankin et al., 2015; Joinson et al., 2012; Patton et al., 1996; Patton et al., 2008). Despite the suggestion of an association, there is neither formal documentation of a neurostructural or neurofunctional basis for this presumed relationship nor a delineation of its underlying mechanisms. Thus, an important first step for understanding the role of puberty, and more specifically the role of gonadal sex steroid production, in the emergence of these disorders and their neurobiological instantiation, is to establish evidence of a link between the time course of brain development and that of the well-characterized biologic events related to puberty (e.g., gonadarche or other physiologic events discussed below) in healthy, typically developing children and adolescents. The relationship between these endocrine events and brain development must therefore be investigated longitudinally.

The NIMH Intramural Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty
In this paper, we present the design of and preliminary observations from the ongoing NIMH Intramural Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty. The aim of this study is to directly examine how the increase in sex steroid hormone production at puberty impacts neurodevelopment and whether gonadal development tracks with longitudinal changes in brain structure and function in boys and girls. More specifically, we aim to identify which parameters of brain structure and function may be affected by changes in gonadal sex steroids across puberty and beyond in order to enable a more informed understanding of neurobiological mechanisms relevant for the ontogeny of neuropsychiatric disease and to provide predictors of disease as well as potential treatment targets. Finally, in addition to a careful assessment of gonadal development and sex steroid secretion, measures of concurrent physiologic events (e.g., adrenarche or growth spurt) will be obtained to rule out or assess the effects of these non-HPG axis events on brain development.
To disentangle the effects of gonadal steroids from those of age and other endocrine events on a range of brain development measures (including changes in cortical morphology, axon myelination, and functional activation), three critical methodological steps are necessary: 1) prospective mapping starting with a well-established prepubertal baseline, since only neurobiological changes relative to this baseline can meaningfully be considered to be caused by underlying changes in sex steroid hormones; 2) longitudinal, within-individual repeated-measures sampling at time intervals that are sufficiently frequent to document changes in each individual across each pubertal stage, throughout the entirety of the pubertal transition until adulthood; and 3) the collection at each timepoint of multimodal data that integrate endocrine and somatic measures together with a comprehensive array of neuroimaging techniques designed to access the various parameters of brain structure and function that could reflect different aspects of neurodevelopment. To address these objectives, this investigation aims to identify the specific impact of gonadarche and the longitudinal impact of gonadal development on neurodevelopment. In a longitudinal cohort starting prior to the onset of puberty and returning every 8-10 months for 10 years, we measure brain structure (gray and white matter volume, cortical thickness and area, white matter tracts and integrity, myelination) and function (impulse inhibition, reward, working memory, emotional processing, resting-state network connectivity, regional cerebral blood flow) using a temporally dense (research visits every nine months) and cross-modality design (extensive neuroimaging, behavioral measures, endocrine and metabolic evaluations, and sequential MRI measurements of ovarian and testicular growth).
This report reviews the overall protocol of our ongoing longitudinal study and the rationales for our approach, and it offers important considerations for developmental studies. First, we describe the basis for our participant selection criteria, with emphasis on the scientific underpinnings of these criteria and why they provide important endocrinological foundations for developmental studies. Next, we outline the overall longitudinal design and describe the study's outcome measures and how they relate to investigating the effects of endocrine events accompanying puberty on brain development. Finally, we show patterns of activation and resting-state networks associated with each of our fMRI scans in an early, representative subsample of our well-characterized cohort of prepubertal children to document the utility of our measures.

Participants
The children enrolled in this ongoing study represent a carefully selected, typically developing cohort. Participants are recruited through local advertisements and are paid for the time and inconvenience associated with study procedures according to National Institutes of Health (NIH) volunteer guidelines. The recruitment area is geographically centered around the NIH, located in Montgomery County, Maryland, and includes southern Maryland, Washington D.C., and northern Virginia. Target representativeness of our sample is based on the demographics of this region. Study protocols were approved by the NIH CNS Institutional Review Board, and all children provide written assent in combination with their parents' or legal guardians' written consent. Children are recruited at eight years of age and are screened by clinicians with expertise in pediatric endocrinology and in child psychiatry to identify those with healthy typical development. Children who meet inclusion criteria (see below) are enrolled to participate in the longitudinal protocol with regular time points of data collection every 8-10 months until they reach 18 years of age.
An initial sample of 100 participants (50 boys and 50 girls) will establish the effect sizes and variance for our endocrine and neuroimaging data and help determine the sample sizes necessary to answer our research questions. Extensive interim analyses are planned, and if more participants are required to investigate specific questions, we will expand our sample. Additionally, participants who do not complete the protocol will be replaced with new recruits. To reduce attrition rates, all efforts are made to maintain stability of clinical staff over the course of the study, and team members prioritize several factors: 1) engaging each family and child to establish a rapport, fostered by the high temporal density of the study design, 2) establishing frequent and open communication to maintain access to clinical staff, and 3) maximizing availability to participants and their families both during and between longitudinal visits.
To anchor this investigation and to establish a critical baseline reference point to which comparisons can be made longitudinally as each child progresses through puberty and beyond, we have focused on a set of standardized core selection criteria for study entry in order to establish a cohort of well-characterized 8-year-old children who are documented to be in the prepubertal state, as described below.

Defining the prepubertal cohort: core selection criteria
During an initial screening visit (see Table 1), we employ five core selection criteria to ensure a group of prepubertal boys and girls likely to have comparable endocrine exposures at enrollment into the study (see Table 2). Identifying the prepubertal state is essential for our goal of prospectively defining changes in brain developmental trajectories that are specifically related to the onset and progression of gonadal development. Our criteria include: 1) a consistent time point for study entry at eight years of age; 2) the absence of visible markers of puberty as assessed by a clinical exam; 3) a body mass index (BMI) between the 15th and 85th percentiles, because lower or higher values are associated with delayed and precocious pubertal onset, respectively (Wagner et al., 2012); 4) a normal tempo of growth as measured by hand and wrist skeletal bone age on X-ray; and 5) normal development including age-appropriate developmental milestones, typical cognitive development (IQ > 70), and no symptoms of psychopathology. These criteria and their scientific bases are described below.
In addition to the assessment of our core selection criteria, the screening visit includes the following: 1) collection of sociodemographic information and home environment including parental race, ethnicity, educational attainment, occupation, household income, and marital status; 2) assessment of participant handedness using the Edinburgh Handedness Questionnaire (Oldfield, 1971); 3) mock MRI scanner training (see Section 2.4.4 for more details); and 4) an educational session with the parents to review instructions for at-home sample collection for 24-h urine and salivary cortisol (see Section 2.4.3.5 for more details).

Standardized age for study entry at which the majority of children will be prepubertal
Age and pubertal development are highly correlated; however, a specific age does not predict a specific pubertal state. The onset of puberty typically occurs between the ages of eight and twelve, although there are sex, racial, and ethnic differences in this timing (Biro et al., 2008; Sun et al., 2002; Susman et al., 2010). Indeed, pubertal development prior to age eight in girls, and nine in boys, is considered precocious (Parent et al., 2003), although some girls, especially those of African-American descent, may have breast development before this age (Herman-Giddens et al., 1997). A cohort of 8-year-old children, clinically determined to be prepubertal (see next criterion, Section 2.2.2), permits the establishment of a baseline for following trajectories of typical pubertal development.

Clinical confirmation of prepubertal status: pubertal staging criteria
Because of the importance of establishing the prepubertal state as an anchor for our longitudinal measures, all 8-year-old participants are assessed with a physical exam, rather than with self- or parental report of pubertal markers, and are determined to be prepubertal (Bonat et al., 2002). The gold-standard method for the determination of a prepubertal state is the clinical assessment of testicular volume compared with orchidometer beads in boys (Prader, 1966) and breast development in girls (and pubic hair development in both boys and girls) using a normative five-point scale developed by Marshall and Tanner (Bordini et al., 2011a; Tanner, 1969, 1970); pubertal stage (PS) 1 is defined as prepubertal, whereas PS5 is full reproductive maturation. In boys, PS1 is defined as a testicular volume of 3 cc or less (Bordini et al., 2011b). In girls, PS1 is defined as the absence of breast buds in either breast (e.g., absence of thelarche; Dorn et al., 2006; Marshall et al., 1969).
Clinically, the onset of puberty can include the physical manifestations of gonadarche or those of adrenarche (e.g., pubarche or development of pubic hair; Counts et al., 1987; Sklar et al., 1980). Thus, to ensure a well-characterized prepubertal state, the absence of pubic or axillary hair is also required for designation of PS1 in both boys and girls in this study.
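The PS1 rules described above can be summarized as a small decision function. The following is only an illustrative sketch, not the study's software: the class and field names (`PubertalExam`, `testicular_volume_cc`, and so on) are hypothetical, and the thresholds are taken directly from the text (testicular volume of 3 cc or less in boys, no breast buds in girls, and no pubic or axillary hair in either sex).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PubertalExam:
    """Hypothetical record of one clinician-performed staging exam."""
    sex: str                                      # "M" or "F"
    testicular_volume_cc: Optional[float] = None  # boys only (orchidometer)
    breast_buds_present: Optional[bool] = None    # girls only (either breast)
    pubic_hair_present: bool = False
    axillary_hair_present: bool = False


def is_prepubertal(exam: PubertalExam) -> bool:
    """Return True if the exam meets the study's PS1 definition."""
    # Signs of adrenarche rule out PS1 in both sexes.
    if exam.pubic_hair_present or exam.axillary_hair_present:
        return False
    if exam.sex == "M":
        if exam.testicular_volume_cc is None:
            raise ValueError("testicular volume is required for boys")
        # PS1 in boys: testicular volume of 3 cc or less.
        return exam.testicular_volume_cc <= 3.0
    if exam.sex == "F":
        if exam.breast_buds_present is None:
            raise ValueError("breast bud assessment is required for girls")
        # PS1 in girls: no breast buds in either breast.
        return not exam.breast_buds_present
    raise ValueError(f"unknown sex code: {exam.sex!r}")
```

Requiring the sex-specific field, rather than defaulting it, is a design choice of this sketch so that a missing measurement cannot silently pass as prepubertal.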

Standardized Body Mass Index (BMI) range to limit confounding influence of adiposity on pubertal timing and trajectory
By controlling for BMI between the 15th and 85th percentiles for age and sex as defined by the US Centers for Disease Control growth standards (Kuczmarski et al., 2002), this study attempts to limit the confounding influences of adipose tissue on measures of endocrine function relevant for brain development. Converging data support the importance of this criterion. First, childhood obesity has been associated with an earlier onset of puberty in girls (Wagner et al., 2012). In contrast, children with low BMI, particularly girls, show a delay in pubertal onset consistent with the suggestion that a minimum body weight is necessary for girls with anorexia to resume menstrual function (Frisch and McArthur, 1974). Second, in addition to alterations in pubertal timing, high BMI may affect the accuracy of measures of prepubertal breast stage development in girls because adipose tissue can be mistaken for glandular breast bud development (Dorn et al., 2006). Finally and importantly, adipose tissue contains a range of steroid metabolic enzymes that could impact the sex steroid exposure of the brain, irrespective of the HPG axis, even among prepubertal children (Nokoff et al., 2019) and, therefore, could affect our study measures (also see Section 2.4.3.4).

Standardized skeletal maturation (bone age) at study entry to account for general health and pubertal timing and trajectory
Standardized measures of bone age are highly informative of normative developmental tempo as it relates to sex steroid hormone exposure (i.e., whether a child is exposed to a typical range of endocrine hormones for their age and sex; Dorn et al., 2006). Bone age measures reflect skeletal events associated with the cumulative actions of gonadal, adrenal, and growth hormones on bone tissue (Emons et al., 2011). Specifically, the chronological development of bone age provides a continuous longitudinal marker of exposure to estradiol, androgens, insulin-like growth factor (IGF-1), and growth hormone (GH). For example, an advanced bone age is often observed in children with premature adrenarche and early puberty (De Sanctis et al., 2014). Additionally, the closure of the growth plate (i.e., epiphyseal closure) is a discrete milestone marking the end of pubertal growth at the level of skeletal tissues (Emons et al., 2011) and largely reflects the actions of estradiol in both men and women (Shim, 2015; Weise et al., 2001). Bone age is also an integrative measure of past health, including nutrition, environmental exposures (e.g., exposures to endocrine-disrupting chemicals), genetics, and serious childhood disease states (Creo and Schwenk, 2017). In our study, an X-ray of the left hand and wrist is compared against standards for sex and age (Greulich and Pyle, 1959) to determine a participant's bone age (see also Section 2.4.2.3 for methods). Participants are entered into the study if their bone age is within the 90% confidence interval of the age- and sex-specific mean value. Thus, we standardize in each child an indication of the cumulative exposure to pubertal hormones (i.e., estradiol) as well as of general health and pubertal progression in order to ensure an endocrinologically homogeneous and typically developing sample for entry into our study.

Selection of children with typical cognitive and behavioral development
Because this cohort is chosen to serve as an archival sample for the community to address questions about the relationship between the endocrine and neurobiological events of puberty in healthy children with typical behavioral and cognitive development, participants are carefully screened by clinicians prior to study entry. As part of a comprehensive evaluation, each child is determined to have an IQ > 70 using the Test of Irregular Word Reading Efficiency (TIWRE; Reynolds and Kamphaus, 2007), no diagnosis of psychiatric disorders as assessed by the Kiddie Scale for Affective Disorders and Schizophrenia (K-SADS; Kaufman et al., 1997), no family history of DSM-5 psychiatric disorders in first-degree relatives, no use (current or past) of psychotropic medications, and hospital birth records and parental history that reflect normal pregnancy, delivery, birthweight, and attainment of developmental milestones.

Longitudinal study design
Children meeting the inclusion criteria detailed above are studied longitudinally for 10 years, every 8-10 months, from age 8 to age 18. The 8-10 month intervals were chosen as a frequency dense enough to capture individual transitions between pubertal stages in boys and girls along with potential inflection points in neurodevelopmental measures while balancing the logistical demands of longitudinal data collection. In the original methods papers for pubertal staging, children could spend on average between 5 months and 2 years within a single PS, although higher and lower durations are possible (Marshall et al., 1969, 1970). At study entry and at each longitudinal visit (Table 1), participants are characterized with deep phenotyping of endocrine, metabolic, behavioral, and neurobiological outcome measures. Neurobiological measures are captured through an extensive series of multimodal neuroimaging procedures targeting neural systems previously identified to undergo substantial structural and functional transformation during adolescence. These include: four structural scans (T1-weighted structural scans, diffusion tensor imaging (DTI), a T2-weighted structural scan, and a myelin water fraction scan), two resting-state functional scans (multi-echo and single-echo), an arterial spin labeling (ASL) scan, and four event-related tasks (impulse inhibition, reward, working memory, and emotional faces). Additionally, a clinical brain MRI is obtained once every calendar year (i.e., every other visit) and read by a neuroradiologist to ensure continued absence of any clinically relevant neuropathology.
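To make the sampling density concrete, the sketch below enumerates nominal visit ages for a fixed visit interval. It is a simplified illustration under stated assumptions: the function name `visit_ages_months` is hypothetical, the interval is held constant for a given child (in practice it varies between 8 and 10 months), and the baseline visit at age 8 is counted as the first timepoint.

```python
from typing import List


def visit_ages_months(start_age_years: int = 8,
                      end_age_years: int = 18,
                      interval_months: int = 9) -> List[int]:
    """Nominal ages (in months) at each visit, baseline included.

    Assumes a constant interval; real visits fall anywhere in the
    8-10 month window described in the protocol.
    """
    if interval_months <= 0:
        raise ValueError("interval_months must be positive")
    start = start_age_years * 12
    end = end_age_years * 12
    # range() excludes its stop value, so add 1 to allow a visit at age 18.
    return list(range(start, end + 1, interval_months))


# Number of timepoints per child at the extremes of the visit window.
fewest = len(visit_ages_months(interval_months=10))  # widest spacing
most = len(visit_ages_months(interval_months=8))     # densest spacing
```

Under these assumptions, a child who completes the protocol contributes between 13 and 16 timepoints, which can be compared with the reported 5-month to 2-year duration of a single pubertal stage.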
Additionally, every 8-10 months, each longitudinal visit also includes: 1) a comprehensive clinical assessment with physical exam, clinician-performed pubertal staging, history, and assessment of clinical or subclinical symptoms of psychopathology using the K-SADS (Kaufman et al., 1997); 2) behavioral assessment including Tier 1 NIMH toolbox measures for pediatric research (Barch et al., 2016); 3) venipuncture to measure sex steroid hormones (and other physiologically relevant metabolic/neurotrophic hormones); 4) repeated salivary samples and 24-h urine collection to assess the cortisol awakening response and urinary free cortisol levels, respectively; 5) left hand and wrist X-ray for bone age assessment; 6) anthropometric measures to document growth and body fat distribution; 7) full-body Dual X-ray Absorptiometry (DXA) scan to measure bone density, lean muscle, and fat mass (total, android, and gynoid); and 8) MRI of the abdomen to measure visceral fat, subcutaneous fat, and gonadal volumes (i.e., both ovaries and testes). During the course of the study, participants will be retrospectively excluded if an Axis I psychiatric illness has its onset in either the child or a first-degree relative. Additionally, participants will be monitored prospectively and excluded for use of psychiatric medication, or, for girls, if they become pregnant, are lactating, or begin using hormonal contraceptives. Finally, if a child's BMI falls out of our study range (below the 15th or above the 85th percentile), the child will be temporarily excluded (i.e., data up until the time of exclusion will be included) until their BMI falls within the normal range.
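The temporary BMI exclusion rule can be expressed as a per-visit filter. This is a minimal sketch of one possible reading of the rule, not the study's actual data pipeline: the function name `usable_visit_indices` is hypothetical, the percentile bounds are treated as inclusive (an assumption, since the text does not specify boundary handling), and each visit is judged only by the BMI percentile recorded at that visit.

```python
from typing import List, Sequence


def usable_visit_indices(bmi_percentiles: Sequence[float],
                         low: float = 15.0,
                         high: float = 85.0) -> List[int]:
    """Indices of visits whose BMI percentile lies within [low, high].

    Visits outside the range are dropped; visits before the excursion
    and after the return to range are retained.
    """
    return [i for i, pct in enumerate(bmi_percentiles) if low <= pct <= high]


# Hypothetical child: in range, out of range for two visits, then back in.
example = usable_visit_indices([50.0, 90.0, 88.0, 70.0])
```

In this hypothetical example, the first and fourth visits are retained and the two intervening visits are excluded.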
Additional supplemental visits ( Table 1 ) include one-time visits when the child is older for 1) collection of blood samples for genomic assessment, 2) collection of blood samples for future investigations using induced pluripotent stem cells (iPSCs), and 3) assessments of childhood trauma using the Childhood Trauma Questionnaire ( Bernstein et al , 1994 ) and Adverse Childhood Experience questionnaire ( Felitti et al , 1998 ). To assure broadly typical cognitive development across the duration of the study, we administer a screening battery of neuropsychological assessments at study entry and every three years thereafter, including the Wechsler Abbreviated Scale of Intelligence (WASI: vocabulary, block design, similarities, matrix reasoning) and tasks from the Delis-Kaplan Executive Function System (D-KEFS: trail-making test conditions 1-5, letter and category fluency, color-word interference conditions 1-4). Finally, after girls begin menstruating (i.e., menarche), monthly menstrual cycle self-reports are completed to enable standardized timing of assessments during the follicular phase and maintain a consistent endocrine environment in each girl at each study visit. Monthly reports include the timing of menses onset and duration of menstrual bleeding documented either online or via phone-call follow-up.

Behavioral measures
At each longitudinal visit, participants are administered a structured clinical interview (K-SADS) and a series of computer-based questionnaires that are completed by either the participant or by parents or legal guardians ( Table 3 ). These questionnaires include two NIMH Tier I core measures: the Columbia Impairment Scale ( Bird et al , 1993 ) and Strengths and Difficulties Questionnaire ( Bourdon et al , 2005 ) to assess symptomatology related to psychopathology as well as interpersonal peer relationships. Additional questionnaires similarly aim to assess changes in symptomatology of psychopathology as well as sleep and other behaviors that may change throughout puberty ( Achenbach and Ruffle, 2000 ;Buysse et al , 1989 ;Franko et al , 2004 ;Garner, 1991 ;Kovacs, 2004 ;Muris et al , 1999 ;Owens et al , 2000 ;Papay and Spielberger, 1986 ;Tellegen and Waller, 2008 ).

Measures of gonadal axis development
In the following sections, we describe the measures used throughout the study to assess and demarcate gonadarche and HPG axis development with a focus on what information each measure conveys about the underlying endocrine events that may impact brain development.

Gonadal hormone measures.
The secretion of the sex steroids estradiol, progesterone, and testosterone, the principal gonadal-derived hormones of puberty, increases at gonadarche. We obtain morning blood measures of these sex steroid hormones (in post-menarchal girls, during the follicular phase of the menstrual cycle) every 8-10 months, at each longitudinal visit. In accordance with recommendations for studying prepubertal and pubertal children, we use both a sensitive chemiluminescence assay and liquid chromatography-tandem mass spectrometry (LC-MS/MS) for quantification of estradiol and LC-MS/MS to measure testosterone in plasma ( Handelsman and Wartofsky, 2013 ;Rosner et al , 2007 ;Rosner et al , 2013 ). LC-MS/MS measures of sex steroids in blood samples (versus saliva) are preferable since most commercially available direct measures lack the accuracy to detect the low levels of steroid hormones during early puberty ( Courant et al , 2010 ;Keevil et al , 2014 ;Ketha et al , 2014 ;Lewis, 2006 ;Maskarinec et al , 2015 ;Rosner et al , 2007 ;Tivis et al , 2005 ). In our study, chemiluminescence immunoassay measures of estradiol and LC-MS/MS measures of testosterone are conducted on the day of sample collection, with lower limits of detectability of 5 pg/mL and 8 ng/dL, respectively. Additionally, banked blood samples of both serum and plasma are aliquoted and stored at -80 °C until the end of the study to permit all samples from each individual to be run under batched conditions in order to reduce inter-assay variability.
The stability of steroid hormone levels in blood has been extensively studied and the impact of freeze/thaw cycles over several years of storage is well-characterized ( Comstock et al , 2001 ;Handelsman et al , 2020 ). Here, we elected to employ blood-based measures in our longitudinal study as saliva samples have been shown to degrade following long-term storage and, therefore, could impact subsequent steroid hormone measures ( Toone et al , 2013 ). With the goal of identifying how gonadal steroid production alters brain development in boys and girls, we will identify trajectories of the blood concentrations of these sex steroid hormones within individual children as well as individual inflection points within these trajectories to examine their association with changes in brain structure and function.

Pubertal staging of breast and testes.
Standardized determinations of the physical manifestations of HPG axis development are critical outcome measures that can be related to neurodevelopment. In addition to their importance as prepubertal inclusion criteria for study entry, breast development and testicular volume are clinically assessed at every visit using a five-point scale from PS1 (prepubertal) to PS5 (reproductive maturity) ( Dorn et al , 2006 ;Marshall et al , 1969, 1970 ). In our study, PS is assessed by trained clinicians following standardized training by a single pediatric endocrinologist and single pediatric nurse practitioner and assessment of inter-rater reliability to ensure consistency of the measures over time ( Bonat et al , 2002 ;Slora et al , 2009 ). In boys, testicular pubertal stages are defined by palpation and comparison to Prader orchidometer beads ( Slora et al , 2009 ): for PS1, less than or equal to 3 cc; for PS2, > 3 cc to 8 cc; for PS3, > 8 cc to 12 cc; for PS4, > 12 cc to 20 cc; and for PS5 at full maturity, > 20 cc ( Emmanuel and Bokor, 2017 ).
In girls, pubertal stages are defined by breast bud development using palpation to aid differentiation between glandular and adipose tissue: for PS1, absence of breast buds; for PS2, subareolar breast bud; for PS3, elevation of the breast and enlargement of the areolae; for PS4, areolae that form a secondary mound above the contour of the breast; and for PS5, mature female breasts with recession of the secondary mound to the contour of the breast ( Bordini et al , 2011a ;Marshall et al , 1969 ). In addition to determining pubertal onset and progression, our temporally dense study is particularly well-suited to investigating inter-individual differences in pubertal development using constructs of pubertal timing and tempo, which are derived from longitudinal measures of PS using models of logistic growth ( Beltz et al , 2014 ;Marceau et al , 2011 ;Mendle et al , 2010 ;Vijayakumar et al , 2020 ) or based on timing of specific pubertal events (e.g., thelarche and menarche, Fassler et al , 2019 ). Pubertal timing describes an individual's relative age at a given stage in the pubertal process in comparison to other members of a given cohort, while pubertal tempo describes the rate of an individual's progression through each of the pubertal stages. Thus, our measures of PS are used to assess neurodevelopment in four ways: 1) to identify and document the prepubertal state (PS1), 2) to demarcate the onset of gonadarche (e.g., onset of PS2), 3) to measure the step-wise progress of reproductive maturity individually in each boy and girl, and 4) to assess inter-individual differences in pubertal timing and tempo.
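To illustrate how timing and tempo might be derived from repeated PS measures, the sketch below fits a logistic curve running from PS1 to PS5 to one child's visits. It assumes one possible parameterization in which timing is the age at the curve's midpoint and tempo is the logistic rate parameter; the cited papers differ in their exact models, and the coarse grid search, the function names, and the synthetic data are all hypothetical stand-ins for a proper nonlinear mixed-effects fit.

```python
import math

def logistic_stage(age, midpoint_age, rate):
    """Logistic curve rising from PS1 to PS5 (assumed form, for illustration)."""
    return 1.0 + 4.0 / (1.0 + math.exp(-rate * (age - midpoint_age)))

def fit_timing_and_tempo(ages, stages):
    """Coarse least-squares grid search over midpoint age (timing) and rate (tempo)."""
    best = None
    for i in range(101):            # candidate midpoint ages: 8.0 to 18.0 years
        midpoint = 8.0 + 0.1 * i
        for j in range(1, 61):      # candidate rates: 0.05 to 3.0 per year
            rate = 0.05 * j
            sse = sum((logistic_stage(a, midpoint, rate) - s) ** 2
                      for a, s in zip(ages, stages))
            if best is None or sse < best[0]:
                best = (sse, midpoint, rate)
    return best[1], best[2]

# Synthetic example: visits every 9 months from age 8, generated from known values.
ages = [8.0 + 0.75 * k for k in range(13)]
stages = [logistic_stage(a, 12.0, 1.2) for a in ages]
timing, tempo = fit_timing_and_tempo(ages, stages)
```

In practice, observed stages are ordinal (integers 1 to 5), so a real analysis would need a model suited to ordinal outcomes and to the full cohort rather than a per-child grid search.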

Skeletal bone age.
In addition to baseline bone age measures, which are used as an inclusion criterion to confirm the prepubertal state, bone age is also a measure of the cumulative actions of gonadal, adrenal, and growth hormones on bone tissue across puberty ( Emons et al , 2011 ). In the current study, X-rays of the left hand and wrist are rated by a trained pediatric endocrinologist and compared against age and sex norms to determine chronological bone age at each longitudinal visit ( Greulich et al , 1959 ). As described previously (see Section 2.2.4 ), bone age is a composite of both longitudinal bone growth (i.e., cumulative steroid hormone exposure) and closure of the growth plate (i.e., discrete estradiol-driven event). Thus, in this study, we can use both measures as relevant markers of sex steroid hormone exposure at a tissue level to examine against changes in neurodevelopment.

Menarche and the menstrual cycle.
Menarche is a milestone of puberty in girls, signaling the initiation of tissue exposure, including to the brain, to the cyclical production of estradiol and progesterone. The onset of ovulatory cycles results in biphasic secretion of estradiol with peaks prior to ovulation (i.e., follicular phase) and during the midluteal phase. In the luteal phase, episodic production of progesterone from the corpus luteum (and progesterone's neuroactive metabolite, allopregnanolone) results in blood levels of progesterone approaching a 100-fold increase (i.e., ng/dL rise to ng/mL) compared to levels prior to menarche (and compared with boys). Menarche and the accompanying onset of cyclic ovarian steroid production may have important functional consequences in the developing brain ( Keating et al , 2019 ;Shen et al , 2005 ). In our study, menarche is identified prospectively by self-report and is used as a milestone event to investigate changes in brain structure and function in combination with changes in gonadal steroids. To ensure a homogeneous endocrine environment in each girl over the course of the study, subsequent longitudinal visits are timed to coincide with the follicular phase of their menstrual cycle. Follicular phase assessments were chosen because many girls experience months or years of irregular or anovulatory cycles following menarche, thus increasing the variability of hormone exposure in the luteal phase (for review see: Rosenfield, 2013 ). In order to accurately assess timing of the follicular phase, all menstruating participants submit regular monthly reports of menses onset and duration of menstrual bleeding. We also chose to exclude participants for use of hormonal contraceptives as they can alter the typical exposure to circulating gonadal and adrenal steroids ( Lello and Cavani, 2014 ;Zimmerman et al , 2014 ).

MRI measures of gonadal volume.
In addition to traditional methods of measuring gonadal development, we have included a relatively innovative MRI technique for more precisely documenting gonadal volumes. Gonadal volumes are measured from an MRI of the abdomen, with both right and left testicular/ovarian volumes manually segmented by a trained radiologist using Vitrea Software (Vital Images Inc., Minnetonka, MN, USA). Direct measurement of gonadal volumes, targets of both pituitary gonadotropins, can reflect overall maturation of sex steroid secretory dynamics ( Bordini et al , 2011a, Bordini et al , 2011b ) and provide a non-invasive yet direct measure of gonadal pubertal development. The initial expansion of the testes from 3 to 4 cc (based on clinical assessment with Prader orchidometer) is a milestone signaling the earliest activation of the HPG axis and gonadarche in boys; however, relatively little is known about the relation between orchidometer and MRI measures in boys or about the association between ovarian volume and pubertal development in girls. Studies in women across the lifespan suggest that ovarian volumes (as measured by MRI and ultrasound) increase throughout early childhood and peak in late adolescence/early adulthood, and increases in ovarian volume have been identified between breast PS1 and PS2 ( Buzi et al , 1998 ;Haber and Mayer, 1994 ;Kelsey et al , 2013 ), although it remains to be determined if there is a well-defined volumetric expansion of the ovaries that heralds the onset of gonadarche, as there is in testicular volume in boys. Thus, our use of direct gonadal volume measurements by MRI in conjunction with sex steroid measurements not only adds to the armamentarium of approaches for assessing pubertal development, but also may provide fundamental knowledge about both testicular and ovarian development and may serve as a sensitive and direct end-organ measure to correlate with brain development.

Measures of adrenal and somatotropic axis development
In parallel with gonadarche, puberty is accompanied by the concurrent activation of two additional endocrine systems, the adrenal androgenic system (e.g., adrenarche) and the somatotropic growth axis. Identifying the impact of gonadal development on brain development is, therefore, not only complicated by its association with age, but also potentially by interactions with, and the neurotrophic impact of, adrenal and somatotropic hormones. Thus, longitudinal measures of these systems should be included in the characterization of each child. Adrenarche involves the production of several adrenal hormones that either directly or indirectly act on androgen receptors (or serve as substrates for neuroactive steroid metabolites) and estrogen receptors (both estrogen receptor alpha and beta; Kuiper et al , 1997 ) and can thus mimic the effects of gonadal hormones ( Engdahl et al , 2014 ).
In addition, the maturation of the somatotropic axis is contingent on adrenal and gonadal hormones, and encompasses the increased production of GH, IGF-1, and leptin ( Murray and Clayton, 2013 ). Importantly, the hormones produced by all three systems are potentially neurotrophic ( Folch et al , 2012 ;Maninger et al , 2009 ;McCarthy et al , 2017 ;McEwen and Milner, 2017 ;O'Kusky and Ye, 2012 ), and each could contribute to puberty-related brain development. However, the timing of these events relative to gonadarche and within individual children is subject to considerable individual variability. Thus, characterizing hormone trajectories and milestones for each of these endocrine systems within an individual is critical to disambiguate or rule out the effects of adrenal gland and somatotropic axis development from the direct effects of gonadal development on neurodevelopment. In the following sections, we describe the measures used at each longitudinal visit to assess and demarcate adrenal and somatotropic axis development with particular focus on what information each measure conveys about underlying endocrine events that may impact brain development.

Adrenal and somatotropic hormone measures.
The increased secretion of adrenal steroids and somatotropic hormones marks the onset and development of the adrenal zona reticularis and the somatotropic axis that accompany puberty. At each longitudinal visit, we obtain morning blood measures of both adrenal (dehydroepiandrosterone [DHEA], dehydroepiandrosterone sulfate [DHEA-S], androstenedione, and 17-OH progesterone) and somatotropic (GH, IGF-1, and leptin) hormones with the goal of dissociating their impact from that of gonadal steroid hormones. Similar to our measures of gonadal steroids, banked blood samples (both serum and plasma) are aliquoted and stored at -80 °C. Likewise, we focused on blood-based measures for adrenal androgens, as salivary measures of some androgens, particularly DHEA-S, can vary considerably from those in blood due to external factors such as salivary flow rates (for review see: Lewis, 2006 ). Importantly, these adrenal hormone measures can be used to identify pre-gonadarchal children with biochemical evidence of adrenarche (defined by a threshold of DHEA-S levels in blood ≥ 1 μmol/L [ ≥ 40 μg/dL]; Mäntyselkä et al , 2018 ) to rule in or out the possibility of adrenarche-specific changes in prepubertal brain structure and function. Longitudinally, we will identify 1) trajectories of adrenal steroids (e.g., DHEA-S) and somatotropic hormones within individual children and 2) milestones in adrenal androgen secretion and somatotropic axis development (e.g., biochemical adrenarche, pubarche, growth spurt), to examine their association with changes in brain structure and function (and attempt to distinguish those factors most directly linked to observed changes in brain function or structure).
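As a simple illustration of how the biochemical adrenarche threshold could be applied to a child's longitudinal record, the sketch below flags the first visit at which DHEA-S reaches 40 μg/dL, the cutoff quoted above. The function names and the example values are hypothetical and are not drawn from study data.

```python
ADRENARCHE_DHEAS_UG_DL = 40.0  # threshold quoted in the text (>= 40 ug/dL)

def has_biochemical_adrenarche(dheas_ug_dl: float) -> bool:
    """True when a DHEA-S value meets the threshold quoted in the text."""
    return dheas_ug_dl >= ADRENARCHE_DHEAS_UG_DL

def age_at_biochemical_adrenarche(visits):
    """Return the age at the first visit meeting the threshold, or None.

    `visits` is a list of (age_years, dheas_ug_dl) tuples; it is sorted by age
    here so that the earliest qualifying visit is returned.
    """
    for age, dheas in sorted(visits):
        if has_biochemical_adrenarche(dheas):
            return age
    return None

# Hypothetical record for one child (values invented for illustration).
example_visits = [(8.1, 22.0), (8.9, 35.5), (9.7, 48.0), (10.4, 71.0)]
example_age = age_at_biochemical_adrenarche(example_visits)
```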

Pubertal staging of pubic hair.
Onset of pubic hair growth (i.e., pubarche) is a physical manifestation of increased adrenal androgen production following adrenarche and/or gonadarche. Both the onset and progression of pubic and axillary hair growth are driven by circulating androgens (i.e., testosterone, DHEA, DHEA-S, androstenedione) and are, thus, dependent on the androgen receptor ( Griffin and Wilson, 1980 ) in both boys and girls. At each longitudinal visit, pubic hair is clinically assessed by visual inspection using ordinal measures of pubic and axillary hair growth from its absence (PS1, prepubertal) to the first appearance of sparse fine pubic hair in PS2 (i.e., pubarche) through to full adult-like, mature quality and distribution of pubic hair in PS5 ( Bordini et al , 2011b ;Dorn et al , 2006 ;Marshall et al , 1969, 1970 ). Deconstructing PS into its individual components (e.g., pubic hair versus breast bud development) in combination with hormone concentrations provides an expanded set of critical benchmarks to examine neurodevelopment, allowing us to 1) identify and document the prepubertal state (PS1), 2) dissociate progression of adrenal androgen-related development from that of gonadal development (e.g., testicular/breast PS), and 3) assess inter-individual differences in pubertal timing and tempo.

Peak height velocity.
Peak height velocity, or the midpoint of the growth spurt, is a critical milestone event in the development of the somatotropic axis. Pubertal-related growth is a result of the increased stimulation of the somatotropic axis by both adrenal and gonadal hormones ( Murray et al , 2013 ), but is also dependent on adequate nutrition, psychosocial environment, and the absence of disease ( Rogol, 2010 ). Age at peak height velocity is thought to be preceded by a sharp increase in IGF-1 secretion in both boys and girls ( Cole et al , 2015 ), a progressive rise in leptin in girls, and a sharp rise, and subsequent decline, in leptin in boys ( Rogol, 2010 ). In the current study, height is measured at each longitudinal visit, and growth velocity is calculated in each child as change in height over time using a derivative of the longitudinal curve to identify the slope at each time point. Peak height velocity is defined as the visit with the greatest height velocity (i.e., slope) that is followed by a decrease in height velocity in subsequent visits (i.e., point of maximum height velocity). Thus, an individual's age at peak height velocity can be employed to test for potential changes in brain neurobiology that may be associated with elevated somatotropic hormones and somatic growth.
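The velocity calculation described above can be sketched as follows. This is an illustrative approximation that uses finite differences between consecutive visits in place of a derivative of a fitted longitudinal curve; the function names and the example heights are hypothetical.

```python
def height_velocities(ages, heights):
    """Finite-difference height velocity (cm/year) between consecutive visits.

    Returns (interval_midpoint_age, velocity) pairs. This stands in for the
    derivative of a fitted growth curve, which the sketch does not estimate.
    """
    pairs = list(zip(ages, heights))
    return [((a0 + a1) / 2.0, (h1 - h0) / (a1 - a0))
            for (a0, h0), (a1, h1) in zip(pairs, pairs[1:])]

def age_at_peak_height_velocity(ages, heights):
    """Age at the maximum velocity, provided a later decrease has been observed."""
    velocities = height_velocities(ages, heights)
    if len(velocities) < 2:
        return None
    peak = max(range(len(velocities)), key=lambda i: velocities[i][1])
    # Require a subsequent, strictly lower velocity before calling it a peak.
    if peak == len(velocities) - 1 or velocities[peak + 1][1] >= velocities[peak][1]:
        return None
    return velocities[peak][0]

# Hypothetical child measured yearly (values invented for illustration).
example_peak_age = age_at_peak_height_velocity(
    [10.0, 11.0, 12.0, 13.0, 14.0], [140.0, 145.0, 153.0, 158.0, 160.0])
```

Returning `None` while velocity is still rising reflects the requirement that the maximum be followed by a decrease before it is treated as the peak.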

Distribution of adipose tissue with abdominal MRI and full-body Dual X-ray Absorptiometry (DXA) scans.
Adipose tissue can be considered a metabolically active endocrine organ due to its role in synthesizing hormones, including leptin, and in converting substrates to more active hormones, such as the conversion of weak androgens into estradiol or other estrogens (for review see: Kershaw and Flier, 2004 ). Changes in adipose tissue deposition are seen throughout puberty, with boys showing a relative decrease in fat percentage while girls show an increase concordant with rising leptin secretion ( Rogol, 2010 ). Additionally, changes in fat distribution are seen during puberty ( Staiano and Katzmarzyk, 2012 ), with girls showing increases in subcutaneous gynoid fat (around hips and thighs), while boys are more likely to show increases in visceral and android fat (around abdomen, chest, and shoulders). These changes in deposition and distribution result in increased subcutaneous fat in pubertal girls compared to prepubertal girls and to both prepubertal and pubertal boys ( Ahmed et al , 1999 ;Demerath et al , 1999 ;Staiano et al , 2012 ). Importantly, since steroid metabolic enzymes are differentially localized in subcutaneous and visceral adipose tissues, the relative amounts and type of adipose tissue could potentially modify the levels of (and thus brain exposure to) certain sex steroids (for review see: Kershaw et al , 2004 ). For example, subcutaneous fat has relatively higher levels of the aromatase enzyme, which converts androgens into estrogens, compared to visceral fat. Therefore, aromatase activity within subcutaneous fat may indirectly contribute to circulating estrogens, including estradiol (for review see: Kershaw et al , 2004 ).
To measure adiposity and fat distribution, we collect three integrated measurements of fat composition and deposition at each longitudinal visit. First, a clinical nutritionist evaluates each participant on a range of measures to characterize nutritional status including body height, weight, BMI, and anthropometric measures (e.g., skin fold thickness, waist-hip ratios). Second, participants complete an MRI of the abdomen to identify relative volumes of visceral and subcutaneous fat, as well as to provide a ratio of gynoid and android fat distributions. Third, participants complete a full-body DXA scan to measure visceral and subcutaneous fat mass (as well as lean muscle mass and bone mineral content). Together, these measures of fat deposition can provide a complementary index to address individual differences in relative androgen-versus-estrogen exposures across puberty and their potential effects on brain neurobiology.

Cortisol measures.
Peripubertal maturation of the adrenal gland is limited to the expansion and differentiation of the zona reticularis (i.e., adrenarche) ( Auchus, 2011 ) and does not directly impact the region of the adrenal cortex responsible for producing glucocorticoids (i.e., cortisol). However, preclinical evidence supports puberty-related changes and sex differences in hypothalamic-pituitary-adrenal (HPA) axis function ( Green and McCormick, 2016 ;Oyola and Handa, 2017 ) that could impact brain or behavior. In humans, several contextual factors, such as exposure to early life stress and obesity, also impact both age-related and sex differences in HPA function ( King et al , 2017 ;Leneman et al , 2018 ;Marakaki et al , 2018 ). To evaluate potential changes in HPA axis function across puberty, as well as to examine its possible associations with or impact on brain, we collect saliva samples at each longitudinal visit to measure the cortisol awakening response (CAR) at 0, 30, 45, and 60 minutes after waking, and 24 h urine samples to measure integrated free cortisol (UFC). Measures of CAR are stable and reliable markers of HPA activity ( Pruessner et al , 1997 ) and are thought to reflect, in part, circadian cortisol secretion, stress responsivity, and anticipation of future stressors. The CAR also has been reported to be a potential marker for certain disease states or stress conditions ( Kramer et al , 2019 ;Stalder et al , 2016 ). Measures of UFC reflect an integrated measure of the 24 h secretion of plasma-free cortisol ( Boner, 2001 ). In humans, the effects of age and context on HPA function have not yet been confidently separated from those of puberty and remain to be characterized.
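The protocol above specifies the CAR sampling times but not how the four samples are summarized. One commonly used summary is the area under the curve, computed with the trapezoid rule either with respect to ground or with respect to the waking value; the sketch below assumes that approach purely for illustration. The function names, the units, and the cortisol values are hypothetical.

```python
CAR_SAMPLE_MINUTES = [0, 30, 45, 60]  # sampling times stated in the text

def auc_ground(minutes, cortisol):
    """Trapezoid-rule area under the cortisol curve with respect to zero."""
    points = list(zip(minutes, cortisol))
    return sum((c0 + c1) / 2.0 * (t1 - t0)
               for (t0, c0), (t1, c1) in zip(points, points[1:]))

def auc_increase(minutes, cortisol):
    """Area under the curve relative to the waking (first) sample."""
    return auc_ground(minutes, cortisol) - cortisol[0] * (minutes[-1] - minutes[0])

# Hypothetical salivary cortisol values (nmol/L) for one morning.
example_cortisol = [10.0, 16.0, 14.0, 12.0]
example_auc_g = auc_ground(CAR_SAMPLE_MINUTES, example_cortisol)
example_auc_i = auc_increase(CAR_SAMPLE_MINUTES, example_cortisol)
```

The ground-referenced area reflects total cortisol output over the hour, whereas the increase-referenced area isolates the rise after waking; which summary suits a given analysis is a separate question that this sketch does not address.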

Neuroimaging measures
For the NIMH Intramural Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty, each child is scanned every 8-10 months, over a 10-year period, with a wide variety of neuroimaging methods designed to assess a variety of indicators of neurobiological development. During their screening visit, all children participate in a mock scanner training session, which has been shown to reduce in-scanner motion in children and adolescents ( de Bie et al , 2010 ). Our mock scanner is designed to expose children to the procedures and environment of an MRI, and is equipped with an operational patient table, a mock head coil, foam cushions, and speakers to reproduce the sounds of various scanner sequences. Then, at each longitudinal visit, neuroimaging measures include brain structure (gray matter volume, white matter integrity, and myelination), functional activation of key circuits, and basal (resting) brain function. We apply this inclusive array to investigate our archival cohort of children because pubertal development may have manifestations that are either widespread or specific to one or more of these key neurobiological parameters. Casting this "wide net" with a multiplicity of neuroimaging methods not only allows cross-comparison among these neurobiological measures over time, but also, critically, affords the opportunity to compare the various neurodevelopmental trajectories with the endocrine, somatic, and behavioral outcome measures that are also collected longitudinally. All brain imaging is conducted on a single designated scanner, a 3T GE MR750, using a 32-channel GE head coil. Scanning parameters for all neuroimaging protocols are listed in Table 4 .
Structural scans include: 1) T1-weighted images to measure changes in both cortical surface and volume across development, 2) DTI and a T2-weighted scan to measure changes in white matter integrity, and 3) multicomponent driven equilibrium single pulse observation of T1 and T2 (mcDESPOT) scans to quantify myelin water fraction. Functional scans include: 1) arterial spin labeling (ASL) to measure changes in blood flow at rest, 2) multi-echo and 3) single-echo resting state scans to measure spontaneous fluctuations in brain activation to explore whole brain functional organization, and 4) task-based scans to identify changes in brain activation during impulse inhibition, reward processing, working memory, and emotional processing. During functional scan acquisition, real time motion is monitored using in-house software developed for AFNI ( Cox and Jesmanowicz, 1999 ;Voyvodic, 1999 ).
Our fMRI tasks were chosen to examine changes in behavioral constructs that previously have been shown to mature during adolescence. Although each of our four tasks relies on widespread, distributed networks, they are well-documented to be related to particular brain regions; for example, impulse inhibition and working memory activate neural and behavioral functions relevant to the prefrontal cortices, namely cognitive control regions, while reward and emotional face processing are relevant to subcortical brain function, including the striatum and amygdala, respectively (for review see: Crone and Dahl, 2012 ). Age-related changes during adolescence have been identified for each of these tasks (for reviews see: Luna et al , 2010 ;Somerville et al , 2010 ), and several studies have identified potential impacts of puberty ( Vijayakumar et al , 2019 ), although with mixed findings (for review see: Goddings et al , 2019 ). Neuroimaging data suggest that both reward processing and working memory are characterized by an adolescent-specific peak in recruitment of core network areas that decreases with age (for reviews see: Casey, 2015 ;Luna et al , 2010 ). On the other hand, some studies have suggested a more linear development of brain regions associated with impulse inhibition (for reviews see: Casey, 2015 ;Casey et al , 2008 ) as well as working memory ( Kwon et al , 2002 ). Among each of these networks, the recruitment and utilization of core 'adult-like' processing regions emerges at some point during the age-continuum of adolescence. Importantly, the protracted development of brain function relevant to impulse inhibition in comparison to that of reward processing or working memory has often been linked to risk-taking behaviors in adolescence (for reviews see: Casey, 2015 ;Steinberg, 2010 ).
Dysregulation of the working memory and face processing systems has also been implicated in several neuropsychiatric illnesses that emerge during puberty, including depression and schizophrenia ( Arnone et al , 2012 ;Hall et al , 2008 ;Manoach et al , 2000 ). In the following sections, we provide details and a proof of principle of the fMRI scans that are carried out longitudinally every 8-10 months in a representative subsample of prepubertal children. Here, we identify preliminary resting-state networks and describe the paradigms and associated patterns of activation for each of our four fMRI tasks (impulse inhibition, reward, working memory, emotional faces). Presentation of these preliminary patterns of activation here is intended to document the consistency of the neural systems accessed by these scans with previous literature and to demonstrate the utility of our functional scans in this pediatric sample. Additional information is given in the supplemental methods.

Representative cohort to document fMRI approaches.
Data are reported for a preliminary cohort of 24 prepubertal children (12 boys, 12 girls) selected from a sample of our early participants and scanned during their first visit (i.e., at their prepubertal baseline). This sample was chosen to be representative of the demographics of our catchment area and, thus, of our larger cohort. Of the 24 children, 56% were White, 20% were Black or African American, 4% were Asian, 12% reported multiple racial identities, and 12% were Hispanic or Latino, consistent with the demographics of Montgomery County, Maryland. Each child met the inclusion/exclusion criteria described in Section 2.2 , including age at entry (8.7 ± 0.3 years), prepubertal status, standardized BMI (16.2 ± 1.3 kg/m², or 48 ± 22.5 percentile), standardized bone age (8.3 ± 0.7 years), and typical cognitive (IQ > 70) and behavioral (i.e., no symptoms of psychopathology) development.

Resting-state fMRI.
At each longitudinal visit, two resting-state T2*-weighted EPI scans are collected while subjects are instructed to rest as they would at home with their eyes open. To demonstrate the utility of our rs-fMRI scans, we conducted an independent component analysis (ICA) in FSL ( www.fmrib.ox.ac.uk/fsl ) using the MELODIC tool (for preprocessing and analysis see supplemental methods). Our goal was to determine if we were able to identify each of the seven adult networks identified in Yeo et al. (2011) in our representative cohort of prepubertal children. Here, we used FSL's spatial cross-correlation tool (fslcc) to identify components that had correlations greater than 0.30 with one of the Yeo networks. Cross-correlation analysis classified 13 components ( Fig. 1 ) as having high spatial overlap with one of the seven canonical networks identified in Yeo et al. (2011) . Overall, our preliminary data show highly similar components to those identified in adults, thus setting the stage for longitudinal and cross-modality analysis as the study progresses. Data from other studies in adolescents suggest that age is associated with development of functional network structure ( Fair et al , 2009 ;Power et al , 2010 ), including strengthening of within-network connections, particularly in the default mode and frontoparietal networks ( de Bie et al , 2012 ;Fair et al , 2008 ;Sato et al , 2014 ;Supekar et al , 2010 ). Tracking the development of these networks, by examination of both within- and between-network connections, will provide insight into the role of gonadal steroids in the development of functional network structure.
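The matching step described above can be illustrated conceptually. The sketch below is not the fslcc implementation; it simply computes a Pearson spatial correlation between flattened component and template maps and keeps, for each component, the best-matching template whose correlation exceeds 0.30. The function names and the toy six-voxel maps are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel values."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    return 0.0 if denom == 0 else sum(a * b for a, b in zip(dx, dy)) / denom

def match_components(components, templates, threshold=0.30):
    """Assign each component to its best template if the correlation exceeds threshold."""
    matches = {}
    for comp_name, comp_map in components.items():
        best_name, best_r = None, threshold
        for tmpl_name, tmpl_map in templates.items():
            r = pearson(comp_map, tmpl_map)
            if r > best_r:
                best_name, best_r = tmpl_name, r
        if best_name is not None:
            matches[comp_name] = (best_name, best_r)
    return matches

# Toy flattened maps (six "voxels" each), invented for illustration.
toy_templates = {"visual": [1, 1, 0, 0, 0, 0], "default": [0, 0, 0, 0, 1, 1]}
toy_components = {"ic1": [0.9, 0.8, 0.1, 0.0, 0.0, 0.1],
                  "ic2": [0.2, 0.1, 0.3, 0.2, 0.1, 0.3]}
toy_matches = match_components(toy_components, toy_templates)
```

Because several components can match the same template, a procedure of this kind can yield more components than networks, as with the 13 components matched to seven networks reported above.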

Impulse inhibition task.
The impulse inhibition task is based on a stop-signal task design ( Verbruggen and Logan, 2009 ) that requires participants to withhold a button press following a visual stimulus (e.g., a "stop signal "). In our impulse inhibition task ( Fig. 2 , supplemental methods), participants are presented with an image of a green traffic light superimposed with either a left or a right arrow (i.e., go trials) and they are asked to identify the direction of the arrow using a button box. In stop trials (25% of all trials), a red stop light is presented after the presentation of the arrow, indicating that the participant should not complete the button press. Thus, there are four possible types of trials: stop or go trials that are either successful or unsuccessful.
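The four trial outcomes can be made explicit with a small classification routine. This is a sketch under one assumption that the text does not state: a go trial is treated as successful only when the participant responds with the correct arrow direction. The function name and outcome labels are hypothetical.

```python
def classify_trial(is_stop_trial: bool, responded: bool, response_correct: bool = False) -> str:
    """Label a trial as one of the four outcome types described in the text.

    Assumption (not stated in the text): a go trial counts as successful only
    if a response was made and it matched the arrow direction.
    """
    if is_stop_trial:
        # On stop trials, success means the button press was withheld.
        return "unsuccessful_stop" if responded else "successful_stop"
    if responded and response_correct:
        return "successful_go"
    return "unsuccessful_go"
```

Labels of this kind are what a contrast such as Successful Stop versus Go would be built from when trial-level regressors are assembled.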
To evaluate the functionality of this task, we examined activation associated with response inhibition using the Successful Stop - Go contrast (supplemental methods). Group-level activation maps of this contrast (Fig. 2) demonstrated robust activation (p < 0.01, FDR-corrected) of canonical impulse inhibition regions including the ventrolateral (vl)PFC, dorsolateral (dl)PFC, middle frontal gyrus (MFG), fusiform gyrus (FG), insula, anterior cingulate cortex (ACC), and pre-supplementary motor area (preSMA). Negative activation was seen in the left motor cortex (not shown), consistent with positive activation associated with a button press in the go trials. Response inhibition has been shown to activate brain regions associated with executive control, including the right vlPFC and dlPFC, in both children and adults (Aron and Poldrack, 2006; Casey et al., 2018; Hampshire et al., 2010; Rubia et al., 2001). Our preliminary fMRI data suggest that our paradigm similarly activated these circuits and, thus, can be used to probe their developmental trajectories across puberty.
Fig. 1. Resting-state ICA Analysis. A total of thirteen independent components were identified as having high spatial overlap with one of the seven canonical networks identified in Yeo et al. (2011). For more information see supplemental methods.
2.4.4.4. Reward task. The reward task is based on a monetary incentive delay task in which a visual cue predicts a monetary reward a participant may receive for correctly responding to a subsequent target (Knutson et al., 2001). In our study, we modified a previously published reward task (Kohli et al., 2018) by dividing one long task (~15 min) into three runs of 5 min each to reduce the scanning burden on our child participants. Participants are presented with three reward trial types to measure anticipation and gain of monetary rewards based on varying probabilities of success (Fig. 3, supplemental methods). Monetary reward trials include two levels of difficulty associated with anticipatory cues (i.e., screen background) based on response duration: a high probability (green background, longer duration) or a low probability (red background, shorter duration) of reward. A control trial is also presented where no monetary gain is possible (gray background, intermediate duration) and participants can 'win' only pieces of paper. Three anticipation event types are possible: presentation of cues for high probability reward, low probability reward, or non-monetary control reward. Six outcome events are possible: successful (hit) or unsuccessful (miss) outcomes for each of the three trial types (high probability, low probability, or control). For both high and low probability of reward trials, hits are considered 'gains' and misses are considered 'losses'.
Fig. 3. Reward Task. Task includes monetary reward trials in two levels of difficulty associated with anticipatory cues (i.e., screen background) including a (A) high probability (green background) or (B) low probability (red background) of reward, and (C) control trials where no monetary gain was possible (gray background) and participants could 'win' pieces of paper. High probability trials have a 67% chance of being easy (longer response time), low probability trials have a 67% chance of being difficult (shorter response time), while control trials have a 50:50 distribution of required response times. For more details see supplemental methods and Kohli et al. (2018). Representative activation t-maps from the Gain - Loss contrast (high and low probability hits - misses) show robust task activation (p < .01, FDR-corrected) in the ventral striatum (VS), middle frontal gyrus (MFG), and the lateral occipital cortex (lOC).
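The trial structure of the reward task can be sketched as follows. The probabilities are taken from the Fig. 3 caption (a 67% chance of an easy response window on high probability trials, a 67% chance of a difficult window on low probability trials, and 50:50 on control trials). Everything else, including the function names, event labels, and the use of a seeded random generator, is a hypothetical illustration and not the task's actual implementation.

```python
import random

# Probability that a trial uses the easy (longer) response window.
P_EASY = {"high": 0.67, "low": 0.33, "control": 0.50}

def draw_difficulty(trial_type, rng):
    """Draw 'easy' or 'difficult' for one trial of the given type."""
    return "easy" if rng.random() < P_EASY[trial_type] else "difficult"

def outcome_event(trial_type, hit):
    """Label one of the six outcome events.

    Monetary hits are gains and monetary misses are losses; control
    trials are labeled as plain hits or misses (label names assumed).
    """
    if trial_type == "control":
        return "control_hit" if hit else "control_miss"
    return f"{trial_type}_{'gain' if hit else 'loss'}"

if __name__ == "__main__":
    rng = random.Random(0)
    for trial_type in ("high", "low", "control"):
        print(trial_type, draw_difficulty(trial_type, rng))
```

A Gain versus Loss comparison of the kind reported for this task would then pool `high_gain` and `low_gain` against `high_loss` and `low_loss`.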
To document the patterns of activation of this task, we examined activation associated with reward receipt using the Gain - Loss contrast in the high and low probability outcome trials combined (supplemental methods). There was robust activation (p < 0.01, FDR-corrected) in the ventral striatum (VS) and the lateral occipital cortex (lOC) in the Gain - Loss contrast (Fig. 3). These preliminary findings are similar to activations in the VS in response to the receipt and anticipation of reward that have been observed in children, adolescents, and adults (Bjork et al., 2010; Casey et al., 2018; Kohli et al., 2018; Somerville et al., 2018). However, adolescents have shown reduced striatum activation associated with reward processing compared to adults (Bjork et al., 2010). Our preliminary fMRI data suggest that our paradigm robustly activated the VS, even in prepubertal children, and thus can be used to investigate changes in reward-evoked VS activation across puberty.
Fig. 4. Working Memory Task. The N-back task includes the presentation of one of five blocks including (A) a letter (verbal memory), (B) a location (spatial memory) condition of either low (1-back) or high (2-back) memory load, and (C) a sensorimotor control condition (0-back). Participants are asked to identify by button press whether the current stimulus category (letter or location) is the same as one they saw 'N' steps earlier (e.g., letter 2-back, same letter as 2 steps back). For the control condition, subjects are asked to respond with a button press when they see a zero appear on the screen. For more information see supplemental methods. Representative activation t-maps from the Location 1-back - 0-back contrast show robust task activation (p < .01, FDR-corrected) in canonical working memory regions including the dorsolateral prefrontal cortex (dlPFC), anterior cingulate cortex (ACC), inferior parietal lobule (IPL), and lateral posterior parietal cortex (PPC).
2.4.4.5. Working memory task. The working memory task is based on the N-back task paradigm in which children are asked to monitor either the identity or location of a series of letters (Owen et al., 2005). In our study, the N-back working memory task (Fig. 4, supplemental methods) varies both memory load (e.g., 2-back versus 1-back) and task type (e.g., letter identity versus location monitoring) to activate different aspects of working memory circuitry (Callicott et al., 1999; Owen et al., 2005). Specifically, the task includes the presentation of five task types: a verbal (letter) or a spatial (location) condition of either low (1-back) or high (2-back) memory load, as well as a sensorimotor control condition (0-back).
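The N-back rule itself is simple to state in code. The sketch below marks, for a sequence of stimuli (letters or locations), the positions at which the current stimulus matches the one presented N steps earlier. The function name and the example sequence are hypothetical and are not taken from the task materials.

```python
def nback_targets(stimuli, n):
    """Return a list of booleans, True where stimuli[i] == stimuli[i - n].

    The first n positions can never be targets because there is no
    stimulus n steps earlier to compare against.
    """
    if n < 1:
        raise ValueError("n must be at least 1 for an N-back comparison")
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]

if __name__ == "__main__":
    letters = list("ABAACA")
    print(nback_targets(letters, 1))  # [False, False, False, True, False, False]
    print(nback_targets(letters, 2))  # [False, False, True, False, False, True]
```

The same function applies to the spatial condition if each stimulus is encoded as a location label rather than a letter.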
To document associated patterns of activation of this task, we examined activation in the location 1-back condition compared to the control condition using the Location 1-back - 0-back contrast (supplemental methods). Group-level activation maps (Fig. 4) demonstrated robust activation (p < .01, FDR-corrected) in canonical working memory regions including the dlPFC, ACC, inferior parietal lobule (IPL), and lateral posterior parietal cortex (PPC). These findings are similar to those in adults, whereby working memory is highly associated with activation of the dlPFC (Owen et al., 2005). The regions identified in our study have also been observed in association with working memory tasks in children, with greater recruitment with increasing age (Kwon et al., 2002), although some studies suggest an inverted-U across development with the greatest recruitment occurring during adolescence (Scherf et al., 2006). These preliminary fMRI data suggest that our paradigm can robustly activate working memory circuitry (particularly the dlPFC) in children, and may thus be used to investigate changes in activation associated with working memory load (1-back, 2-back) for either spatial (location) or verbal (letter) working memory across puberty.
2.4.4.6. Emotional faces task. The emotional faces task is based on a face and scene matching task design (Hariri et al., 2002) in which participants are asked to match negatively or positively valenced facial expressions or images depicting objects and situations. To avoid habituation effects between longitudinal visits (Plichta et al., 2014), we use three versions of the emotional faces task (Fig. 5, supplemental methods), so participants have at least 2 years between exposures to particular images. In each version, participants are presented with one of five blocks of images: faces or scenes of either high (aversive) or low (non-aversive) emotional valence, as well as sensorimotor control (scrambled) images. For each trial, participants are presented with three images and are then asked to identify the image on the bottom that matches the image on the top.
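The "at least 2 years" figure follows from the visit spacing. As a worked illustration, the sketch below assumes the three versions are rotated in a fixed order across consecutive visits (the text does not state the assignment scheme), so that a given version recurs every third visit. With visits every 8-10 months, the gap between repeated exposures is then 24-30 months. The function names are hypothetical.

```python
def version_for_visit(visit_index, n_versions=3):
    """Task version (0-based) used at a visit, assuming a fixed rotation."""
    return visit_index % n_versions

def months_between_repeats(interval_months, n_versions=3):
    """Months between two exposures to the same version of the task."""
    return n_versions * interval_months

if __name__ == "__main__":
    print(months_between_repeats(8))   # 24
    print(months_between_repeats(10))  # 30
```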
To demonstrate the utility of this task we examined activation in the Aversive Faces - Scrambled contrast (supplemental methods). Group-level activation maps (Fig. 5) demonstrated robust activation (p < 0.01, FDR-corrected) in the amygdala and fusiform gyrus (FG), including the fusiform face area (FFA). Negative activation was also observed in the parahippocampal gyrus, consistent with activation seen with the scrambled control condition. These data reflect similar robust amygdala activation for aversive faces as identified in adults (Hariri et al., 2002) and adolescents (Thomason and Marusak, 2017). In adolescents, amygdala activation in response to aversive faces has also been associated with increased depressive symptomatology (Forbes et al., 2011). Our preliminary fMRI data suggest that our emotional faces task can robustly activate the amygdala and may thus be used to track changes in amygdala reactivity to faces across the pubertal transition.

Analytical approaches
The primary aims of this study are, first, to examine how the increase in sex steroid hormone production at puberty impacts neurodevelopment and, second, to determine whether gonadal development tracks with longitudinal changes in brain structure and function in boys and girls. To investigate our aims, we will implement a two-pronged analytical approach including both hypothesis-driven and data-driven analyses. For each of our endocrine and neuroimaging outcome measures, events or trajectories will be analyzed both independently and in an integrated manner (e.g., using principal component analysis) depending on the specific questions being investigated. Our hypothesis-driven approaches will be based on the extant literature and will include assessments of the impact of discrete milestone attainment (e.g., pre/post onset of gonadarche, adrenarche, and the growth spurt) and longitudinal trajectories of endocrine/metabolic measures on our neuroimaging modalities. For example, we plan to test the specific impact of the attainment of adrenarche on brain development (for review see: Byrne et al., 2017) by investigating our prepubertal sample of 8-year-olds cross-sectionally and comparing children who meet endocrine criteria for adrenarche with those who are pre-adrenarchal, thus controlling for both age and gonadal development. We also plan to investigate the impact of estradiol on the maturation of prefrontal cortex gray matter volume and function (Clark et al., 1989; Goldman et al., 1974) by relating blood-based measures of estradiol to prefrontal cortex structure and function across a range of our neuroimaging measures using, for example, the non-linear mixed-model approaches discussed below. To complement this analysis, we can also test whether a specific pubertal marker of tissue estradiol exposure (e.g., circulating blood levels, bone age, breast PS) or a combination of these markers best predicts prefrontal brain development.
Further, we can test the specificity of the effects of estradiol on neurodevelopment as compared to other puberty-related endocrine markers (e.g., adrenal androgens) and age. Finally, our data-driven approaches will include techniques such as machine learning, multivariate statistical modelling, latent growth modelling, and methods of dimensional reduction with the goal of identifying linked patterns of covariation between brain development and the endocrine events of puberty.
Critical to addressing the complex interactions between brain, puberty-related endocrine events, and age, we will prioritize analytical tools that include the following: 1) the use of a mixed-models approach that treats subject effects as random effects and 2) the use of penalized splines to represent nonlinear trajectories (Wood, 2017). These analytic methods have been successfully used in longitudinal region of interest analyses to investigate the impact of pubertal tempo on cortical thickness, demonstrating effects of puberty independent of age-related development (Vijayakumar et al., 2019). These analytic methods have recently been adopted for whole-brain analysis in the program 3dMSS (Multi-level Smoothing Splines), part of the AFNI software suite (Chen et al., 2021), which would allow us to investigate the non-linear association between our endocrine and neuroimaging measures in a voxel-wise manner. Overall, the high temporal density of our longitudinal design, starting from a clear prepubertal baseline, in combination with currently available analytical tools may help to disentangle the impact of age from that of puberty and its related events on neurodevelopment. In our early sample of participants, we are already discovering natural dissociations between pubertal events (e.g., pubertal stage) and age. Finally, as more is learned about the biology of the pubertal transition and the critical events leading to the onset of gonadarche, we expect to be poised to address new questions about the role of puberty on neurodevelopment.
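The two ingredients named above, subject-level random effects and penalized splines, can be written as one generic model. The form below is an illustrative assumption and not the study's prespecified model: the choice of a random intercept only, the two smooth terms, and all symbols are introduced here for exposition.

```latex
% Generic sketch (an assumption, not the study's stated model).
% y_{ij}: imaging measure for participant i at visit j
% a_{ij}: age at that visit; h_{ij}: an endocrine measure (e.g., estradiol)
% b_i:    participant-specific random intercept
\[
  y_{ij} = \beta_0 + f(a_{ij}) + g(h_{ij}) + b_i + \varepsilon_{ij},
  \qquad b_i \sim \mathcal{N}(0, \sigma_b^2),
  \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
% Each smooth is a spline basis expansion whose wiggliness is penalized:
\[
  f(a) = \sum_{k=1}^{K} \gamma_k B_k(a),
  \qquad
  \lambda_f \int \bigl[ f''(a) \bigr]^2 \, da
  = \lambda_f \, \boldsymbol{\gamma}^{\top} S \, \boldsymbol{\gamma},
  \qquad
  S_{kl} = \int B_k''(a) \, B_l''(a) \, da
\]
```

In this form, the smoothing parameter controls how far the age trajectory may depart from a straight line, and the separate smooth of the endocrine measure is what allows a hormone effect to be estimated over and above the age trajectory. The smooth g would be expanded and penalized in the same way as f.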

Discussion
The NIMH Longitudinal Study of the Endocrine and Neurobiological Events Accompanying Puberty was designed to examine fundamental questions about the relationship between puberty and brain development. Specific goals are to: 1) identify the role of gonadarche (i.e., pubertal activation of the HPG axis) and gonadal steroids on brain development in healthy, typically developing boys and girls, 2) investigate sex differences in neurodevelopmental trajectories and characterize those factors that contribute to the emergence of sex differences, 3) distinguish those pubertal events most relevant to brain development across the multiple measures of brain structure and function that can be accessed with neuroimaging, 4) examine the effects of sex steroids on brain systems/networks implicated in psychiatric illnesses, and 5) serve as an archival sample for the community to ask questions about the impact of neurodevelopmental trajectories linked to pubertal maturation and their relationship to trajectories of endocrine measures.
This work brings together a number of incisive methodological approaches to achieve these goals. The study is constructed to control for several important, potentially confounding factors by way of carefully chosen selection criteria that exclude children with altered timing of puberty (e.g., precocious or delayed puberty), abnormally high or low BMI, a family history of psychiatric illness, or developmental disorders. The study is also constructed with several important design elements that enable us to address the broad goals listed above, including prospective, deep multimodal phenotyping starting from a well-documented prepubertal baseline, followed by longitudinal data collection with high temporal density (i.e., every 8-10 months for 10 years). The deep neural phenotyping is achieved using the full armamentarium of neuroimaging measures that can access different aspects of neuronal structure and function. The deep endocrine phenotyping is achieved using a wide array of hormonal, somatic, and metabolic measures that cast a wide net and yet are focused on forming specific associations and disambiguating chronologically concurrent and potentially neurotrophic mechanisms. Ultimately, the combination of these methodological approaches will allow us to create phenotypically rich longitudinal navigational maps of the endocrine and metabolic events of puberty within each participant to explore their specific impacts on both structural and functional neurodevelopment.
Our first goal, identifying the role of gonadarche and gonadal sex steroids in brain development, is facilitated by our prospective within-subjects, longitudinal evaluation of children starting from a clearly defined pre-gonadarchal baseline (i.e., pubertal stage = 1, prepuberty) at intervals sufficiently frequent to capture the onset and maturation of the HPG axis (i.e., every 8-10 months over the course of 10 years). The prepubertal baseline is critical for this goal, as changes relative to this developmental stage are necessary for determining the impact on neural systems of the emergent exposures to circulating sex steroid hormones during gonadal maturation. Thus, a clear and well-characterized baseline evaluation prior to any evidence of HPG axis activation during adolescence (i.e., gonadarche) is a key anchoring point, the data from which will be compared to any subsequent brain changes arising throughout the course of pubertal aging/maturation. Additionally, evaluating the biology of each of the three endocrine systems maturing during puberty is essential to dissociating events related to gonadarche and HPG axis functioning from those related to adrenal development (i.e., adrenal androgen production) and somatic growth (i.e., the growth spurt). Finally, the within-individual longitudinal design (versus accelerated longitudinal or repeated cross-sectional designs) is necessary to capture the onset and progression of puberty through the multiple physiologic milestones within each child. Thus, an individually-based roadmap of puberty-related endocrine and metabolic events will attempt to overcome the difficulty of documenting these events in cross-sectional studies due to the individual variability inherent in the timing of these events.
Convergent with our efforts to determine the role of gonadarche on brain development is the potential of this study to identify sex-specific developmental trajectories in the brain. Sex differences in brain structure and function have been reported in adult men and women (for review see: Gur and Gur, 2017) as well as in the maturation of brain structure and function across childhood, adolescence, and early adulthood (Kaczkurkin et al., 2019). There are several categories of sex differences that can be identified (Joel and McCarthy, 2017). For example, sex differences in the timing (i.e., onsets) of endocrine events during adolescence (e.g., gonadarche) can result in temporary or context-dependent sex differences in brain development, whereas other sex differences in pubertal brain development can be enduring and remain present in adult men and women (Joel et al., 2015). A transient sex difference may be as potentially relevant as a persistent sex difference to later psychosocial development if some systems are on-line earlier in one sex compared to the other. In addition to the impact of puberty, it is possible that sex differences in brain structure and function occur prior to puberty because of earlier sex-specific exposures to adrenal sex steroids or their metabolites (i.e., adrenarche) or non-hormonal mechanisms (e.g., dietary factors or health status) or even because of in utero or neonatal sex steroid exposure. Thus, having a prepubertal baseline can facilitate the identification of sex differences that are present prepubertally and are, thus, unrelated to gonadarche, as well as enable us to identify the impact of puberty-related endocrine events on the emergence of sex differences in brain development.
The third goal, identifying those pubertal events that are most salient to brain development, will be important in providing information to the community to guide the decision of which of the many endocrine measures available are most relevant and deserve focus. Our inclusion of multiple endocrine measures in this study is top-heavy by design since the focus is to be more inclusive of biological measures relevant to the physiology of puberty; however, future studies may be able to avoid endocrine measures that are either non-informative or redundant in order to reduce overall study costs and economize participants' time and effort. Thus, the comprehensive assessment of puberty-related endocrine and metabolic measures collected in this study can, hopefully, be pared down for future studies upon the identification of those markers of puberty most relevant for brain development. For example, this study could produce evidence suggesting the non-relevance of the growth axis in puberty-related brain changes, and, therefore, measures of those systems could be omitted in future studies without losing information. Our findings may also aid future research studies in making a more informed selection of pubertal measures relevant to specific research questions. Obviously, despite our careful selection of outcome measures across the three endocrine systems, there are as yet unknown factors such as 11-keto androgens (Turcu and Auchus, 2017), the impact of which on pubertal brain development has only recently been suggested, and assays for which remain in development. Thus, it is possible that there are neurophysiologically relevant measures we might miss (although the banked serum and plasma samples can be employed in the future to test the biological relevance of these measures).
The well-documented association between adolescence and puberty and the onset of psychopathology mandates the identification of the effects of HPG axis function on brain systems/networks implicated in psychiatric illness (e.g., mood and anxiety disorders, autism, schizophrenia). Thus, we will pursue our fourth goal by examining our data to assess the impact of HPG activation/puberty on the development of specific networks implicated in disease. Nonetheless, our data represent a normative sample, and it is possible that the development of those pathways associated with psychiatric illness is altered only in the context of disease risk, and thus not specifically impacted by sex steroid hormones or pubertal events.
Finally, this study will serve as an archival dataset to use for future research. We carefully include only typically developing, healthy children (i.e., normal IQ and no history of developmental disability) with normal pubertal development (i.e., not precocious or delayed), restricted BMI, normal bone age, and without symptoms or history of psychopathology. Increasingly, most psychiatric conditions are considered to have a developmental origin. Thus, our study provides well-characterized control data to compare with children who have developmental abnormalities (e.g., autism) or risk of psychiatric illnesses in order to identify alterations in neurodevelopmental trajectories. For example, our data could be used to compare against children with inherited risk factors for psychiatric illness, such as family history of Axis I disorders, genetic risk factors including copy number variations (e.g., 22q11.2 deletion syndrome; Zinkstok et al., 2019), exposure to early life stress (Anda et al., 2006), and high BMI (Quek et al., 2017). Overall, these data will serve as a resource for future studies to use as a comparative dataset to probe the impact of puberty and its associated endocrine events on neurodevelopmental differences in diseases.
In summary, the results of this study will provide phenotypically rich longitudinal navigational maps of pubertal events and brain function and structure in healthy typically developing children. These data will be important for several reasons. Methodologically, determining which of the many events that occur at puberty impact the maturation of specific brain systems will permit future studies to focus on these relationships in adolescent development. Thus, more efficient and economical studies can home in on the developmental roles of pubertal events in a wide range of behavioral illnesses. Mechanistically, the identification of puberty as a developmental stage in which reprogramming of risk for psychopathology occurs will have important consequences to our understanding of the developmental origins of multiple psychiatric illnesses. Lastly, from a clinical perspective, the identification in either boys or girls of specific pubertal events that instantiate the risk of specific psychiatric illnesses will focus our efforts to mitigate these risks during specific developmental windows and possibly identify a role for hormonally-based or other interventions to prevent the emergence of psychiatric illness in some children.