Longitudinal changes of ADHD symptoms in association with white matter microstructure: A tract-specific fixel-based analysis

Highlights
• HI symptom remission is associated with more follow-up lCST FD.
• Combined symptom remission is associated with more follow-up lCST FC.
• Altered white matter development may be moderated by preceding symptom trajectory.


Introduction
Although attention-deficit/hyperactivity disorder (ADHD) is (proto)typically considered a childhood syndrome, its clinical trajectories vary by individual. Many ADHD-affected adolescents exhibit improvement over time, but approximately two-thirds of them retain impairing symptoms into adulthood (Faraone et al., 2006; Sibley et al., 2016). The neural substrates that determine this variable clinical course of childhood ADHD have been increasingly investigated through the years, yet the dynamic nature of these mechanisms in relation to maturation remains unclear. Theoretically, symptom remission occurs via brain compensation-reorganization and/or normalization-convergence, with a possible fixed anomaly ('scar') or enduring neurological trait, all of which may concurrently arise in different brain regions (Sudre et al., 2018). In a double dissociative neurodevelopmental model of ADHD, the underlying neural mechanisms that control onset are distinct from those that drive remission (Halperin and Schulz, 2006). Thus, onset can be characterized by dysfunctional subcortical structures remaining static throughout life, while remission may be separately associated with brain (particularly prefrontal cortex) maturation and compensation (Cortese et al., 2013; Shaw et al., 2015; Sudre et al., 2018).
The theory that maturing frontal cortical regions compensate for initial childhood ADHD emergence via top-down regulatory processes, leading to eventual symptom remission, has been supported by magnetic resonance imaging (MRI) studies: reduced symptom severity throughout development appears to correlate with prefrontal cortex maturation. White matter (WM) in frontal-temporal areas subserving emotional and cognitive processes indeed continues to mature into early adulthood, coinciding with the typical age range of ADHD symptom remission (Cheung et al., 2015; Clerkin et al., 2013; Francx et al., 2015; Halperin and Schulz, 2006; Lebel and Deoni, 2018; Shaw et al., 2013). Based on this model, it should be possible to differentiate remitted from unaffected brains with MRI. Yet previous neuroimaging studies have reported inconsistent results, perhaps because of study-specific differences (e.g. analysis methods, cross-sectional cohorts, sample characteristics). Considering this disorder's neurodevelopmental component, sample age is especially important, making systematic longitudinal studies essential in deconstructing the etiological timeline of brain mechanisms in reference to remission.
Diffusion-weighted imaging (DWI) is an in vivo MRI method that measures the magnitude and direction of water molecules diffusing through brain tissue, reflecting the underlying architecture of axons and their ensheathing myelin. Diffusion tensor imaging (DTI) has been the most commonly used DWI method in ADHD studies, which have usually reported tensor-derived, voxel-wise measures like fractional anisotropy (FA). One follow-up case-control DTI investigation in only men suggested that ADHD is a lasting neurobiological trait irrespective of remission or persistence: compared to those who did not have childhood ADHD, probands with both remittent and persistent ADHD showed widespread reduced FA three decades post-diagnosis (Cortese et al., 2013). Others showed that children who exhibited symptom improvement had the most FA anomalies at follow-up. However, given ADHD's neurodevelopmental aspect, studying symptoms and brain tissues in late adolescence and early adulthood (as myelination continues) can produce more relevant information about how remission is intertwined with maturation. While valuable, the few follow-up DTI reports to date were limited by categorical participant groups, sample characteristics (populations that were either prepubertal or well into adulthood), and only a single follow-up MRI measure, underscoring the need for more studies beyond the cross-sectional perspective.
A longitudinal design reveals temporal dynamics of underlying neurobiological processes and increases statistical power by reducing inter-subject variability. The NeuroIMAGE study and its latest follow-up, DELTA, together form a longitudinal cohort of ADHD-affected probands, their siblings, and unaffected controls followed from childhood to adulthood (Damatac et al., 2020; Francx et al., 2015; Leenders et al., 2021; van Ewijk et al., 2014; von Rhein et al., 2015). We previously demonstrated that, at two different time-points and in two partly overlapping NeuroIMAGE samples, more improvement in combined ADHD and hyperactivity-impulsivity symptom scores over time was associated with lower FA at follow-up in an area where the left corticospinal tract (lCST) crosses the left superior longitudinal fasciculus (lSLF) (Francx et al., 2015; Leenders et al., 2021). In that same report, we also systematically demonstrated that symptom change is associated with neither baseline FA nor change in FA from baseline to follow-up. Symptom remission was counterintuitively and repeatedly associated with decreased FA later in life, whether from childhood to adolescence, or to early adulthood. Our longitudinal findings indicated divergent WM microstructure trajectories between individuals with persistent and remittent symptoms at a follow-up age range of 12-29 years. Now, we build on those previous results on the downstream effect of symptom progression on WM microstructure by asking whether the same relationship exists in the same brain areas at a later time window, when the cohort is aged 18-34 years (Fig. 1).
Although in vivo WM microstructure has been most commonly studied through tensor-derived metrics (e.g. FA), results from voxel-wise DTI-based methods can be unreliable or misleading in areas with complex fiber architecture (Seehaus et al., 2015). Given that tensor-based reconstructions are an average across an entire voxel and that approximately 90% of voxels contain multiple fiber populations, we applied a high angular diffusion model: constrained spherical deconvolution (Jeurissen et al., 2013). Voxel-based methods that model crossing fibers (e.g. BEDPOSTX) only represent a subset of the full range of possible fiber orientation distributions (FODs), whereas constrained spherical deconvolution represents FODs as spherical harmonics, free to distinguish more or less arbitrary shapes (Tournier et al., 2004). Fixel-based analysis (FBA) applies the constrained spherical deconvolution model and can more accurately reconstruct a continuous FOD in both single- and multiple-fiber voxels, characterizing properties of each "fixel," or specific fiber population in a voxel (Raffelt et al., 2017b; Schilling et al., 2018; Tournier et al., 2007, 2019). Fixels can be statistically analyzed for fiber-specific indices of underlying physiology: fiber density (FD), a microstructural measure of the within-voxel intra-axonal restricted compartment of a fiber population; fiber cross-section (FC), a macrostructural measure of the area perpendicular to the fiber orientation; and fiber density and cross-section (FDC), a combination of FD and FC (Raffelt et al., 2017b). Less FD can indicate axonal loss, while less FC can indicate macroscopic fiber atrophy (Gajamange et al., 2018; Mito et al., 2018; Rojas-Vite et al., 2019). FBA thus resolves crossing fibers more accurately and characterizes both the microstructural and the macrostructural (morphological) properties of specific fiber populations.
One cross-sectional FBA showed that ADHD-affected children who had reduced fine motor competence also had lower values on all three fixel metrics in the CST. These results suggest that cases had fewer and/or thinner CST axons, which may lead to reduced fiber bundle information transmission speed (Hyde et al., 2020). Likewise, compared to controls, children with ADHD had less FD in association and projection pathways subserving behavioral control and motor function. Despite the consistent clamor to resolve crossing fiber regions and FBA's evident advantages, there have been no other published FBA applications in people with ADHD to our knowledge. Furthermore, besides our prior research, there have been no other DWI studies with multiple follow-up waves in the longitudinal symptom course of ADHD.
In an extension of our previous work in overlapping samples, here we followed 139 people over approximately 15 years. We used a more recent multi-shell DWI fiber model and a new follow-up measurement at an older age range. Because our previous findings were detected in regions of crossing fibers, it is reasonable to expect that, by using more sensitive FBA metrics, we could find an opposite effect between symptom improvement and follow-up white matter microstructure. We aimed to assess the time-lag between the course of ADHD symptomatology and WM microstructure in a priori models and regions-of-interest. Because our smaller sample size at an older age range is not suitable for a data-driven search to discover any new relevant regions or tracts, the present analyses were intended to further understand the nature of our previous results. To compare FBA metrics to our previous FA findings, our first follow-up analysis used the exact same sample as our most recent longitudinal DTI study (Leenders et al., 2021). Our second analysis probed whether those same time-lag associations existed at the newly acquired later age range. For both analyses, we hypothesized that, like in our earlier findings, symptom improvement would be associated with lower FC, FD, and FDC at follow-up in the lCST and lSLF.

Participants
Clinical and MRI data were originally collected from probands with childhood ADHD, their first-degree relatives, and healthy families in one initial wave: NeuroIMAGE1 (W1) (von Rhein et al., 2015). After an average of 3.7 years (standard deviation [SD] = 0.5 years), those participants were invited back for a second acquisition: NeuroIMAGE2 (W2). After a mean of 5.1 years (SD = 1.4 years), some individuals returned for another wave, DELTA (W3), which included only people who fulfilled full ADHD diagnostic criteria in at least one previous wave (Table 1). For the analyses here, we only included participants who had clinical data from at least two of the three waves and DWI data from W2 and/or W3 (Fig. 1). For each time-point, there were no differences between the participants included in the current analyses and the complete sample in symptom severity, age, and sex (p > 0.12).
Given our longitudinal design, we did not split our participants into cases versus controls. Symptom scores and diagnoses varied over time, and participant characteristics changed from wave to wave (Fig. 2). Some individuals originally recruited as controls or unaffected siblings developed ADHD at a later time-point and others recruited as ADHD participants remitted, further highlighting the complex, variable course of ADHD. As an alternative to a case-control categorization, ADHD can be operationalized as a continuous trait (Lahey and Willcutt, 2010; Marcus and Barry, 2011). In a previous cross-sectional study, we systematically showed that, compared to categorical diagnoses, continuous symptom measures are more sensitive to diffusion-weighted brain features in this sample (Damatac et al., 2020). Thus, all models here used symptom scores, optimally capturing the dynamic and continuous nature of the ADHD spectrum throughout development in this longitudinal cohort.
Fig. 1. Schematic of how this study chronologically relates to previous studies, the samples included in each, relevant clinical and neuroimaging measurements, study sample age ranges, mean years (standard deviation) in between each acquisition wave, and the analysis methods used. The present study is a fixel-based analysis of W1 to W2 and W2 to W3, using only the models in which we found significant effects in a previous voxel-wise tract-based spatial statistical analysis of W1 to W2.

Clinical symptom measures
For continuous measures of symptom dimension severity and in accordance with our previous report, we used raw combined Conners' Parent Rating Scale (CPRS) scores from W1 and W2, and Conners' Adult ADHD Rating Scale (CAARS) scores from W3 for hyperactivity-impulsivity (HI) and inattention (IA) (Conners et al., 1998, 1999; Leenders et al., 2021). Here, we define symptom change (Δ) as the Conners' score difference: $\Delta\text{score} = \text{score}_{\text{follow-up}} - \text{score}_{\text{baseline}}$.
Baseline versus follow-up scores were always positively correlated with each other (Figure S2). A more positive Δ value indicates the worsening of symptoms, while a more negative Δ value indicates the improvement of symptoms over time. In this report, we refer to "symptom remission" dimensionally and not diagnostically, i.e. a decrease or improvement in symptom severity over time.
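The sign convention for Δscore can be illustrated with a minimal sketch. The function names and the three pairs of scores below are hypothetical and are not taken from the study data.

```python
def symptom_change(score_baseline, score_followup):
    """Return delta score = follow-up score minus baseline score."""
    return score_followup - score_baseline


def describe_change(delta):
    """Map the sign of delta score onto the dimensional interpretation used in the text."""
    if delta < 0:
        return "improvement"
    if delta > 0:
        return "worsening"
    return "no change"


# Hypothetical raw Conners' scores for three participants (not study data)
baseline = [22, 10, 15]
followup = [12, 16, 15]

deltas = [symptom_change(b, f) for b, f in zip(baseline, followup)]
labels = [describe_change(d) for d in deltas]
print(deltas)  # [-10, 6, 0]
print(labels)  # ['improvement', 'worsening', 'no change']
```

Under this convention, a negative association between Δscore and a follow-up fixel metric means that larger symptom improvement goes with a higher metric value.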
At W1 and W2, we assessed history of comorbid disorders with the Kiddie Schedule for Affective Disorders and Schizophrenia Present and Lifetime Version (K-SADS-PL) semi-structured interview (Donker et al., 2010; Kaufman et al., 1997). For children aged < 12 years, the child's parents or the researchers assisted in completing the self-report questionnaires. Participants with elevated scores on ≥ 1 of the K-SADS-PL screening questions had to complete a full supplement for each disorder. At W3 (all participants were aged ≥ 18 years), we recorded history of comorbidity using the Structured Clinical Interview for DSM-5 Disorders (SCID-V) (First et al., 2018). IQ was estimated using the vocabulary and block design subtests of the Wechsler Intelligence Scale for Children (WISC-III) or Wechsler Adult Intelligence Scale (WAIS-III). We excluded one whole dataset from a participant who had an estimated IQ < 70. Our final sample's demographic characteristics are summarized in Table 1.

Diffusion-weighted imaging acquisition, pre-processing, and quality control
At W2, single-shell DWI data were acquired with a 1.5-Tesla AVANTO scanner (Siemens, Erlangen, Germany) equipped with an 8-channel receive-only phased-array head coil using the following parameters: echo time/repetition time (TE/TR) = 97/8500 ms; GRAPPA acceleration factor 2; voxel size = 2 × 2 × 2.2 mm; b-values = 0 (5 volumes, interleaved) and 1000 (60 directions) s/mm²; twice refocused pulsed-gradient spin-echo EPI; no partial Fourier. More details of this MRI data acquisition have been described previously (Damatac et al., 2020; Leenders et al., 2021). Because our models only included follow-up neuroimaging data as a further investigation of the aforementioned analyses, we did not include W1 DWI.
W2 and W3 images were pre-processed with MRtrix3 (version 3.0.1, https://www.mrtrix.org/) according to recommended quality-control and FBA protocols for multi-shell data (Raffelt et al., 2017b; Tournier et al., 2019). Pre-processing included denoising and unringing, motion and distortion correction, and bias field correction (Smith et al., 2004; Andersson and Sotiropoulos, 2016; Kellner et al., 2016; Raffelt et al., 2012, 2017a; Tustison et al., 2010; Veraart et al., 2016; Zwiers, 2010). We visually inspected all corrected diffusion images and excluded whole datasets if any motion or distortion artefacts remained after pre-processing. After excluding 22 datasets, our final sample consisted of 154 total diffusion scans collected from 139 participants at W2 (N = 99) and W3 (N = 55). Our entire pipeline is detailed in Figure S1 and our scripts are available at: https://bit.ly/3iPAzQt.
Table 1. Demographic and clinical characteristics of participants at Wave 1 (W1), Wave 2 (W2), and Wave 3 (W3) with mean and standard deviation (or numerical count and percentage). W3 included only those who fulfilled full ADHD diagnostic criteria in at least one previous wave. Values reported here are for all participants in the final sample after all quality control (N = 139). Note that participants at W3 were selected on the basis of their history of ADHD diagnosis, so W3 tends to differ quite markedly from the other two waves, which also include never-affected controls. This conceals the typical pattern of average symptom remission that would be expected in a follow-up study without this selection criterion.

Fixel-based analysis
Following pre-processing, we computed two unique group average tissue response functions for W2: WM and cerebrospinal fluid (CSF) (Dhollander et al., 2016). B0 images can be utilized like a second shell to estimate a CSF-specific response function for each participant (Dhollander et al., 2016). By modeling distinct response functions for WM and CSF, we were able to enhance the signal from WM relative to CSF and include our single-shell data in the multi-shell FBA pipeline. For W3, we calculated three response functions: WM, gray matter, and CSF (Dhollander et al., 2016). We upsampled to 1.25 mm³ and performed multi-shell multi-tissue constrained spherical deconvolution on all images, resulting in a WM fiber orientation distribution (FOD) within each voxel (Jeurissen et al., 2014; Tournier et al., 2007). Afterwards, we performed joint bias field correction and global intensity normalization for each of the multi-tissue compartment parameters (Raffelt et al., 2017a). We then separately generated two study-specific FOD population templates for W2 and W3 using 40 unrelated participants from each wave per template. Symptom scores did not differ between the individuals included in the population templates versus those of the overall samples (p > 0.06).
For each population template, we calculated the FD, log(FC), and FDC (FDC = FD ⋅ FC) for each participant across all fixels. Instead of FC, we chose to calculate log(FC) so data would be centered around zero and normally distributed. The derivation of these fixel metrics, which is based on FOD lobe segmentation and subject-to-template registration warps, is described in detail elsewhere (Raffelt et al., 2015).
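The relationship between the three fixel metrics can be sketched numerically. This is a minimal illustration, not the MRtrix3 implementation: the function name and the per-fixel FD and FC values are hypothetical, and FC is assumed to be a positive ratio relative to the template, with FC = 1 meaning no difference in cross-section.

```python
import numpy as np


def fixel_metrics(fd, fc):
    """Return log(FC) and FDC = FD * FC for arrays of per-fixel FD and FC values."""
    fd = np.asarray(fd, dtype=float)
    fc = np.asarray(fc, dtype=float)
    log_fc = np.log(fc)  # centered on zero when FC == 1
    fdc = fd * fc        # combined density and cross-section
    return log_fc, fdc


# Hypothetical values for three fixels (not study data)
fd = np.array([0.50, 0.40, 0.60])
fc = np.array([1.00, 0.80, 1.25])

log_fc, fdc = fixel_metrics(fd, fc)
print(log_fc)
print(fdc)
```

The log transform makes a cross-section 1.25 times the template and one 0.8 times the template equidistant from zero, which is one reason log(FC) is better suited to linear modeling than raw FC.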
For each FOD template, we performed whole-brain probabilistic tractography (iFOD2) seeded from a whole-brain white matter mask to generate a tractogram of 20 million streamlines and a fixel-fixel connectivity matrix (Tournier et al., 2010, 2019). To reduce tractography biases in each whole-brain tractogram, we selected a subset of 2 million streamlines that best fit the diffusion signal using the SIFT algorithm (Smith et al., 2013).
To obtain each region-of-interest (left corticospinal tract: lCST; left superior longitudinal fasciculus: lSLFI, lSLFII, lSLFIII), we extracted the spherical harmonic peaks from each voxel of both FOD population templates. We then applied TractSeg, which is an automated convolutional neural network-based approach that directly segments tracts in fields of FOD peaks, circumventing any biases that may result from user-defined or atlas-based delineation (Wasserthal et al., 2018). To maintain consistency with our previous study and hypothesis, we concatenated the lSLFI, lSLFII, and lSLFIII track files into one lSLF tractogram. Finally, we converted the resultant tractograms to fixel maps used as masks to constrain our search space during connectivity-based fixel enhancement (Raffelt et al., 2015) (Fig. 3 and Figure S3).

Statistical analyses
To control for the lack of independence in our sample due to siblings, we designed multi-level exchangeability blocks per wave and used FSL PALM to generate a set of 5000 permutations per wave (Winkler et al., 2014, 2015). Our blocks did not allow permutation between all individuals; instead, we constrained permutations at both the whole-block level (i.e. permute between families of the same size) and within-block level (i.e. permute within families) (Figure S4). We used each set as an input for its respective wave to define permutations in data shuffling during nonparametric testing.
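The two permutation constraints described above can be illustrated with a simplified sketch. This is not PALM's algorithm, which works on a multi-level block tree; the function name and the family labels are hypothetical. The sketch only shows the idea that members may be shuffled within a family and that whole families may be swapped only with families of the same size.

```python
import numpy as np


def permute_with_blocks(family_ids, rng):
    """Return an index array perm, where position i receives the data of subject perm[i]."""
    family_ids = np.asarray(family_ids)
    # Group subject indices by family
    families = {}
    for idx, fam in enumerate(family_ids):
        families.setdefault(fam, []).append(idx)
    # Within-block step: shuffle the members inside each family
    shuffled = {fam: list(rng.permutation(members)) for fam, members in families.items()}
    # Whole-block step: swap entire families, but only among families of equal size
    by_size = {}
    for fam, members in families.items():
        by_size.setdefault(len(members), []).append(fam)
    perm = np.empty(len(family_ids), dtype=int)
    for fams in by_size.values():
        donors = rng.permutation(len(fams))
        for target_pos, donor_pos in enumerate(donors):
            target = families[fams[target_pos]]  # original positions of one family
            donor = shuffled[fams[donor_pos]]    # shuffled members of the donor family
            perm[target] = donor
    return perm


# Hypothetical sample: two sibling pairs, two singletons, one family of three
family_ids = ["A", "A", "B", "B", "C", "D", "E", "E", "E"]
rng = np.random.default_rng(0)
perm = permute_with_blocks(family_ids, rng)
print(perm)
```

Because siblings always move together, every permuted dataset preserves the within-family dependence structure of the observed data.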
We demeaned our design matrices using Jmisc in R (version 4.0.2) and applied connectivity-based fixel enhancement to the fixel-fixel connectivity matrices using smoothed fixel data (Raffelt et al., 2015). Using only models in which we previously found significant effects (i.e. not IA, but only HI and combined scores), for each fixel metric and each tract region-of-interest, we constructed general linear models (GLMs) to separately test whether combined ADHD or HI symptom score change (Δscore as the independent variable) is associated with fixel metrics at follow-up (as dependent variables) (Leenders et al., 2021). Our covariates were: symptom score (either combined or HI) at baseline, change in age ($\Delta\text{age} = \text{age}_{\text{follow-up}} - \text{age}_{\text{baseline}}$), age at baseline, sex, and head motion (framewise displacement) at follow-up. For the W2 FBA, follow-up was W2 and baseline was W1, while for the W3 FBA, follow-up was W3 and baseline was W2:
$$\text{fixel metric}_{\text{follow-up}} \sim \Delta\text{score} + \text{score}_{\text{baseline}} + \Delta\text{age} + \text{age}_{\text{baseline}} + \text{sex} + \text{head motion}_{\text{follow-up}}$$
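For a single fixel, the model above can be sketched as an ordinary least-squares fit with demeaned predictors. This is an illustration under stated assumptions rather than the fixel-wise pipeline: the function name `fit_glm`, the explicit intercept column, and all simulated values (including the planted slope of -0.01) are hypothetical.

```python
import numpy as np


def fit_glm(y, predictors):
    """Ordinary least squares with demeaned predictors plus an intercept."""
    names = list(predictors)
    X = np.column_stack([np.asarray(predictors[k], dtype=float) for k in names])
    X = X - X.mean(axis=0)                     # demean every predictor column
    X = np.column_stack([np.ones(len(y)), X])  # explicit intercept (an assumption)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return dict(zip(["intercept"] + names, beta))


# Simulated data for one fixel (hypothetical values, not study data)
rng = np.random.default_rng(0)
n = 99
predictors = {
    "delta_score": rng.normal(-3.0, 5.0, n),
    "score_baseline": rng.normal(12.0, 6.0, n),
    "delta_age": rng.normal(3.7, 0.5, n),
    "age_baseline": rng.normal(17.0, 3.0, n),
    "sex": rng.integers(0, 2, n).astype(float),
    "head_motion": rng.gamma(2.0, 0.1, n),
}
true_slope = -0.01  # planted negative association between delta score and follow-up FD
fd_followup = 0.5 + true_slope * predictors["delta_score"] + rng.normal(0.0, 0.001, n)

coefs = fit_glm(fd_followup, predictors)
print(coefs["delta_score"])
```

With demeaned predictors, the intercept equals the sample mean of the fixel metric, and the coefficient on Δscore is the effect of interest after adjusting for the covariates.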
We performed statistical analyses using connectivity-based fixel enhancement, which exploits local connectivity information (derived from probabilistic fiber tractography) to enhance the test-statistic of each fixel based on the support lent to it by other structurally connected fixels (Raffelt et al., 2015). Local connectivity thus acts as a neighborhood definition for threshold-free enhancement of locally clustered statistic values. Fixels were considered statistically significant at family-wise error-corrected p < 0.05 (p_FWE < 0.05).
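Family-wise error correction in permutation testing is commonly obtained from the null distribution of the maximum statistic across all tested fixels. The sketch below shows only that final step and assumes the enhanced statistics are already available; it does not implement connectivity-based fixel enhancement itself. The function name, the add-one p-value convention, and the numbers are hypothetical.

```python
import numpy as np


def fwe_pvalues(observed, null_max):
    """FWE-corrected p-values from a permutation null of the maximum statistic.

    observed: enhanced statistic per fixel.
    null_max: maximum enhanced statistic across fixels, one value per permutation.
    """
    observed = np.asarray(observed, dtype=float)
    null_max = np.asarray(null_max, dtype=float)
    # Count permutations whose maximum is at least as large as each observed value
    exceed = (null_max[None, :] >= observed[:, None]).sum(axis=1)
    # The add-one convention counts the unpermuted data as one permutation
    return (1 + exceed) / (1 + null_max.size)


# Hypothetical enhanced statistics for three fixels and four permutations
observed = [0.5, 2.0, 3.5]
null_max = [1.0, 1.5, 2.5, 3.0]

p_fwe = fwe_pvalues(observed, null_max)
print(p_fwe)  # [1.  0.6 0.2]
```

Comparing every fixel against the distribution of the maximum controls the probability of any false positive across the whole search space, here the tract mask.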

Association between white matter at Wave 2 and the change in symptoms from Wave 1 to Wave 2
W2 fiber density (FD) in the lCST was significantly negatively associated with ΔHI score (t_max = 1.092, standardized effect [SE] = 0.044, p_FWE = 0.016; Fig. 4). There were no other significant associations between WM microstructure in the lSLF at follow-up and Δcombined or ΔHI score (all p_FWE > 0.12; Table S1). Compared to HI symptom persistence, HI symptom remission over time was correlated with more FD in the lCST at follow-up (Figure S5, top).

Association between white matter at Wave 3 and the change in symptoms from Wave 2 to Wave 3
W3 log of fiber cross-section (log[FC]) in the lCST was significantly negatively associated with Δcombined symptom score (t_max = 3.775, SE = 0.051, p_FWE = 0.019; Fig. 4). There were no other significant associations between WM microstructure in the lSLF at follow-up and Δcombined or ΔHI symptom score (all p_FWE > 0.25; Table S2). Compared to combined symptom persistence, combined symptom remission over time was correlated with more FC in the lCST at follow-up (Figure S5, bottom).

Discussion
We conducted a unique study of WM microstructure and longitudinal ADHD symptom development between ages 9 and 34 years. Using the FBA framework, we report two findings in the lCST: (1) HI symptom improvement was associated with axonal expansion at follow-up, and (2) combined ADHD symptom improvement was associated with a larger total cross-sectional area at follow-up at a slightly later age range. Initially, a previous voxel-wise analysis in an overlapping sample found that improved HI symptoms were associated with lower follow-up FA (W1, aged 9-26 years) (Francx et al., 2015). Subsequently, we extended this sample by adding a second DWI time-point (W2, aged 12-29 years), and systematically applied and excluded specific models, ultimately replicating the same effects on follow-up FA in the same WM region (Leenders et al., 2021). Given the counterintuitive nature of these previous highly consistent results, the present analysis aimed to further understand their physiological origins and dynamic nature in relation to maturation. Thus here, in the exact same sample (W1-W2) and including yet a third DWI acquisition (W3, aged 18-34 years), using the more advanced FBA method, and employing the same GLMs in which we previously found significant voxel-wise effects, we have found increased FD in relation to HI remission, and increased FC in relation to combined symptom remission, in only the lCST and not the lSLF. In contrast to our previous finding using DTI-based methods, our current finding using FBA is more intuitive in the direction of its effect: indices that are generally indicative of "stronger" fibers were associated with clinical improvements over time.
The fixel metrics we used for quantifying WM microstructure contain complementary information. FD is thought to be related to the microstructural properties of WM, whereas FC pertains to the macrostructural properties (cross-sectional area). In our W1-to-W2 analysis, greater lCST axonal density in individuals who became less hyperactive-impulsive over time suggests plasticity, or a greater ability to relay information, after symptom improvement. FD is an estimate of the intracellular volume of fibers oriented in a particular direction. Higher FD at follow-up could result from developmental processes like axon diameter growth, or more axons occupying a given space. Because FD is proportional to the total intra-cellular volume of axons along a fixel, we cannot distinguish between effects specific to axon count or axon diameter. Another explanation is a reduced exchange rate between intra- and extra-axonal spaces because of increased myelination, causing an apparent increase in the intra-axonal compartment and hence an increase in FD (Cohen and Assaf, 2012). However, FD is largely not sensitive to myelin, as myelin-associated water has a very short T2 relaxation time and therefore contributes little to the diffusion signal (Dhollander et al., 2021; Raffelt et al., 2012).
Fig. 4. Symptom change may precede lCST WM microstructure plasticity. Top: Improvement of HI score is associated with more follow-up fiber density (FD). Bottom: Improvement of combined score is associated with more follow-up fiber cross-section (FC). Streamline segments have been cropped from the template tractogram to include only streamline points that correspond to significant fixels for this tract (FWE-corrected p-value < 0.05). Significant streamlines are colored by the standard effect size of 'Δscore' on 'FD at W2' and 'log(FC) at W3' and displayed across coronal and sagittal slices of the study-specific white matter fiber orientation distribution templates.
Our second analysis found associations with FC, which measures the morphological macroscopic change in the cross-sectional area perpendicular to a fiber bundle (calculated during registration to the template image). In W2-to-W3, higher follow-up lCST cross-sectional area in individuals whose combined symptom score improved, again, suggests plasticity, greater myelination, or fiber bundle organization after symptom remission (Raffelt et al., 2017b).
Although the direction of effects in our fixel-based analyses is opposite to that of our aforementioned FA analyses, they are not incompatible. In some cases, crossing fiber complexity can have an inverse correlation with FA, wherein greater complexity occurs when more fixels in a voxel have the same fiber density (Grazioplene et al., 2018). An analogous inverse association exists in our previous W1-to-W2 voxel-wise analysis, wherein less follow-up FA was associated with improved HI symptom score. Notably, those voxel-wise results were in the approximate location of where the lSLF and lCST cross, while our present fixel-wise results in the lCST seem to be absent from where the lSLF crosses this tract. Therefore, as we previously suggested, our tract-based spatial statistics results may have been due to the neuroanatomical location of the effects, which, when labeled with an atlas, were in an area where these tracts cross. Compared to our voxel-wise study, we presently accounted for crossing fibers better through FBA, as well as the specific, FOD-based segmentation of these tracts as separate regions-of-interest. Accordingly, symptom improvement over time can conceivably be associated with increased CST fiber maturation, which by our previous DTI methods may have appeared as reduced FA in voxels where a more dominant SLF crosses those corticospinal fibers.
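The argument that a strengthening minority bundle can lower FA in a crossing-fiber voxel can be illustrated with a toy calculation. This is not the authors' analysis: it assumes that the voxel tensor is a volume-fraction-weighted average of two perpendicular single-fiber tensors, which is a simplification of the true diffusion signal, and the diffusivities and fractions are hypothetical.

```python
import numpy as np


def fractional_anisotropy(tensor):
    """FA computed from the eigenvalues of a 3x3 diffusion tensor."""
    evals = np.linalg.eigvalsh(tensor)
    md = evals.mean()
    return np.sqrt(1.5 * ((evals - md) ** 2).sum() / (evals ** 2).sum())


def fiber_tensor(direction, axial=1.7e-3, radial=0.3e-3):
    """Axially symmetric tensor for one fiber bundle (assumed diffusivities in mm^2/s)."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    return radial * np.eye(3) + (axial - radial) * np.outer(d, d)


def voxel_fa(minor_fraction):
    """FA of a voxel mixing a minority bundle with a dominant perpendicular bundle."""
    minor = fiber_tensor([0, 0, 1])     # e.g. an inferior-superior bundle
    dominant = fiber_tensor([0, 1, 0])  # e.g. an anterior-posterior bundle
    return fractional_anisotropy(minor_fraction * minor + (1 - minor_fraction) * dominant)


for fraction in (0.0, 0.2, 0.4):
    print(fraction, round(voxel_fa(fraction), 3))
```

In this toy model, FA falls as the minority bundle's share rises from 0 toward 0.5, even though nothing has degraded. A fixel-wise metric attributes the change to the bundle that actually changed, which is why the two methods can disagree in sign without contradicting each other.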
The lCST is the only tract in which we have consistently found longitudinal effects. However, a cross-sectional study of symptoms in an overlapping W1 sample found the most FA differences in the right cingulum-angular bundle (Damatac et al., 2020). Anatomically, this differs from that of the present and previous longitudinal effects, which suggests a dissociation in the WM tracts associated with cross-sectional differences versus those that are associated with symptom remission. This dissociation points to the neurodevelopmental models of remission, which all predict atypical neural features in adults with persistent ADHD, but have different predictions about those with remittent symptoms: If symptom remission occurred via WM normalization, convergence, or passive delayed maturation, then we would have observed neurological alterations at follow-up in participants with persistent symptoms, but no differences between those with remittent symptoms and healthy controls. If, regardless of symptom trajectory, this disorder imparted an indelible mark or scar on the brain, then we would not have observed follow-up neurological differences between those with persistent versus remittent symptoms, and only healthy controls would have been differentiated by our follow-up analyses. If symptom remission occurred via compensation or reorganization, then remitted brains would have differed from both the never affected and the persistent ADHD brains, but in different ways. The dissociation we observed in the tract that is important for symptom remission versus the tract that is important for symptom severity implies the last model of remission. 
According to our current findings, we tentatively suggest an interpretation consistent with our previous report: in a top-down fashion, remitters may have learned compensatory strategies to overcome symptoms as they aged, while persisters may have either learned disadvantageous strategies, other beneficial (but insufficiently effective) compensatory strategies, or none at all, leading to diverging WM development trajectories in specific brain regions in individuals with persistent ADHD symptoms (Figure S5). Nonetheless, we cannot presume that only compensatory mechanisms exist in ADHD remission. Voxel-based studies in adult remitters have reported results from white matter microstructural analyses that are more compatible with remission as the normalization of the neural alterations related to childhood ADHD (Cortese et al., 2013; Shaw et al., 2013; Sudre et al., 2018).
Based on our longitudinal design, we postulate that different WM alteration patterns are associated with symptom trajectory in a tract-specific manner: HI symptom remission preceded lCST plasticity at 20 years median age (range 12-29 years), and combined symptom remission preceded lCST plasticity at 26 years median age (range 18-34 years). Perhaps our sample at a slightly younger age, in response to HI symptom improvement or learning new skills, gained more lCST fibers over time. Tract expansion could have been a compensatory mechanism to improve motor control, followed by more myelination of those fibers. Then, as our participants became slightly older, improvement in both dimensions may have led to greater lCST WM macrostructure and improved motor control (Figure S6). We speculate that improved IA (and related executive control) could help suppress HI, leading to greater motor control evinced as larger FC at a later age. In our remitters, higher measures of lCST WM might also result from reorganization in other brain areas outside of the tracts we studied. Like our previous study, we have again found that alterations in WM microstructure appear to follow symptom improvement, suggesting that WM micro- and macrostructural changes may be a downstream effect of ADHD symptom remission.
A strength of the current study is its large sample size over three clinical and two DWI time-points. Our approach using two separate follow-up analyses further characterized the temporal dynamics of the interplay between ADHD and WM microstructure. Of particular concern given this disorder, we mitigated potential confounding effects of head motion through careful data screening, correction during preprocessing, and inclusion as a covariate in our models. Using multi-shell FBA, we demonstrated that WM-associated differences are fiber-specific even within regions of crossing fibers, and we were able to further characterize WM micro- and macrostructural properties. However, the age range of our W2 sample overlaps with that of W3 (Fig. 2, top left), and the acquisition methods differed between waves. Therefore, we refrain from drawing conclusions about the specificity of the relevance of micro- and macrostructural elements in specific waves or age intervals. Because we used cluster-based inference, effects were observed consistently across a large number of connected fixels; thus, maximum effect sizes at any specific fixel within a cluster can be much smaller than in standard statistical tests. Similar small effect sizes have been found in other MRI studies on the neurobiological correlates of ADHD (Hoogman et al., 2017; Zhang-James et al., 2021). Rather than a single large effect, it is more likely that small effects across many different regions and imaging modalities each contribute slightly to explaining individual differences in ADHD symptom trajectories. We can expect that the current rapid increase in neuroimaging sample sizes will yield meaningful results, not by identifying single brain traits of direct clinical utility but by describing the true effects of many traits across the brain with increasing accuracy, which could collectively describe individual differences even if individual effects are small. Our study adds to this scientific development, especially given that our effects replicate those in our previous studies.
Nonetheless, one limitation of this study is the quality of our Wave 2 imaging data. The first follow-up analysis included DWI data acquired with a single relatively low b-value at 1.5 T with anisotropic voxel dimensions, which may have precluded us from discovering effects in other tracts and/or symptom dimensions. Higher diffusion weighting has been shown to improve correspondence between FD estimates and intra-axonal signal fraction simulations by increasing extra-axonal signal suppression (Genc et al., 2020). While diffusion weighting was negatively associated with FD estimates, it was not a confounder for our effect of interest; the effect was balanced across both waves and b-values (Figure S7). Despite upsampling and tractogram filtering, anisotropic voxel dimensions may still have affected our tractography results (Neher et al., 2013). Second, our follow-up samples were prone to selection bias from attrition and our explicit selection criteria in W3, where we also used different instruments (CRPS vs. CAARS) and raters. Returning participants were different from those who participated only once. Finally, even in a longitudinal study, we cannot prove causality. ADHD symptom persistence is likely associated with many other factors in daily life, medication, or comorbid symptomatology, and any combination of these could also contribute to neurological differences at follow-up. Given our small sample size, we used a priori regions-of-interest and models, but a larger, systematic and data-driven whole-brain analysis would be a more adequate test for mechanisms of remission.
Our findings contribute to the growing body of evidence describing the progression of symptoms in relation to WM development. Defining the correlates and predictors of remission may eventually lead to an improved allocation of treatment resources for persistent or complicated ADHD. A better understanding of the neural mechanisms underlying these changes over time can contribute to the promotion of favorable future perspectives for children and adolescents with this disorder.

Funding, acknowledgments, and financial disclosures
The authors would like to thank all of the families who participated in this study and all of the researchers who collected the data. This study sample is from the DELTA and NeuroIMAGE projects. NeuroIMAGE is the longitudinal follow-up study of the Dutch part of the International Multisite ADHD Genetics (IMAGE) project, which was a multi-site, international effort. NeuroIMAGE was supported by a Dutch Research Council (NWO) Large Investment Grant (no. 1750102007010) and NWO Brain & Cognition an Integrative Approach Grant (no. 433-09-242 to J.K.B.), and grants from Radboud University Medical Center, University Medical Center Groningen and Accare, and VU University Amsterdam. DELTA was funded by a Hypatia Tenure Track Grant from Radboud University Medical Center (to E.S.). Funding agencies had no role in study design, data collection, interpretation or influence on writing. J.N. is supported by an NWO Veni grant (no. VI.Veni.194.032). B.F. has received educational speaking fees from Medice. J.K.B. has been in the past 3 years a consultant to / member of advisory board of / and/or speaker for Takeda/Shire, Roche, Medice, Angelini, Janssen, and Servier. He is not an employee of any of these companies, and not a stock shareholder of any of these companies. He has no other financial or material support, including expert testimony, patents, royalties. E.S. is supported by a Hypatia Tenure Track Grant (Radboudumc), Christine Mohrmann Fellowship (Radboud University), and a NARSAD Young Investigator Grant (Brain and Behavior Research Foundation, Grant No. 25034). All other authors report no biomedical financial interests or potential conflicts of interest.