Reproducible protocol to obtain and measure first-order relay human thalamic white-matter tracts

The "primary" or "first-order relay" nuclei of the thalamus feed the cerebral cortex with information about ongoing activity in the environment or the subcortical motor systems. Because of the small size of these nuclei and the high specificity of their input and output pathways, new imaging protocols are required to investigate thalamocortical interactions in human perception, cognition and language. The goal of the present study was twofold: I) to develop a reconstruction protocol based on in vivo diffusion MRI to extract and measure the axonal fiber tracts that originate or terminate specifically in individual first-order relay nuclei; and, II) to test the reliability of this reconstruction protocol. In left and right hemispheres, we investigated the thalamocortical/corticothalamic axon bundles linking each of the first-order relay nuclei and their main cortical target areas, namely, the lateral geniculate nucleus (optic radiation), the medial geniculate nucleus (acoustic radiation), the ventral posterior nucleus (somatosensory radiation) and the ventral lateral nucleus (motor radiation). In addition, we examined the main subcortical input pathway to the ventral lateral posterior nucleus, which originates in the dentate nucleus of the cerebellum. Our protocol comprised three components: defining regions-of-interest; preprocessing diffusion data; and modeling white-matter tracts and tractometry. We then used computation and test-retest methods to check whether our protocol could reliably reconstruct these tracts of interest and their profiles. Our results demonstrated that the protocol had nearly perfect computational reproducibility and good-to-excellent test-retest reproducibility. This new protocol may be of interest for both basic human brain neuroscience and clinical studies and has been made publicly available to the scientific community.


Introduction
The thalamus consists of two large gray matter masses located on both sides of the third ventricle in the dorsal part of the diencephalon. The thalamus receives orderly inputs from subcortical sensory, motor, and brainstem modulatory systems and it is massively interconnected with all areas of the ipsilateral neocortex in complex and highly specific patterns. Accordingly, different parts of the thalamus are involved in diverse roles, all of them essential for cortical function. These include the relay and gating of peripheral and reentrant signals to the cortex for sensation, arousal, affect, voluntary movement, or language perception and production, the dynamic control of cortical network activity in attention, perceptual decision-making and short-term memory, as well as the regulation of consciousness and sleep states ( Guillery & Sherman, 2002 ;Halassa & Sherman, 2019 ;Saalmann & Kastner, 2011 ).
The study of thalamic functions in humans is attracting increasing attention from the neuroimaging community ( Cunningham et al., 2017 ;Fig. 1. Scheme of the fiber tracts originating in or targeting the sensory / motor first-order relay nuclei of the thalamic tracts that were analyzed in this study. Color lines are used to indicate the topographic ordering of these fibers. A) Axial view of the right optic radiation. B) Coronal view of the right acoustic radiation. C) Coronal view of the right somatosensory radiation. D) Coronal view of the right motor radiation and dentatothalamic tract from the left dentate nucleus (in blue) (please note that these two tracts cannot be seen in their entirety in one single slice). matrix from sensory epithelia or motor pathways. There is also evidence that functions performed by first-order relay nuclei are modulated as a function of the nature of the sensorimotor information being processed (e.g., Mihai et al., 2021 ;O'Connor et al., 2002 ). In contrast, neurons in higher-order relay nuclei are mainly driven by the cerebral cortex and project back to it. Their axons typically show branched, point-tomultipoint motifs; creating circuits suitable for signal broadcasting and synchronization of separate cortical regions, such as those involved in attention, cognitive control, and short-term memory ( Aru et al., 2020 ;Clascá, 2022 ;Halassa, 2020 ;Mukherjee et al., 2021 ). In turn, each thalamic nucleus receives reciprocal connections from the cortical areas it innervates ( Briggs, 2020 ;Nieuwenhuys et al., 2008 ). These corticothalamic pathways are massive; about an order of magnitude more numerous that the ascending thalamocortical axons. Both axonal systems remain tightly intermingled as they extend along the internal capsule and the so-called thalamic radiations of the cerebral white matter ( Jones 2007a ;Nieuwenhuys et al., 2008 ).
Their well-known and orderly layout, robust myelination and limited divergence make the first-order relay nuclei input and output tracts ideal subjects for developing and testing a reproducible protocol to selectively obtain and measure thalamic white matter pathways. Here, we set out to investigate the reciprocal thalamocortical/corticothalamic fiber tracts originating in four separate first-order relay nuclei; namely, the fibers from the lateral geniculate (known as the optic radiation, OR; Fig. 1 A), from the medial geniculate (acoustic radiation, AR, Fig. 1 B), from the ventral posterior nucleus (which is part of the superior thalamic radiation and here referred to as "somatosensory radiation ", SR; Fig. 1 C), from the ventral lateral nucleus (likewise part of the superior thalamic radiation and here referred to as the "motor radiation ", MR; Fig. 1 D), as well as the main subcortical input tract of the ventral lateral posterior nucleus, which originates in the dentate nucleus of the cerebellum (known as the dentatothalamic tract, DT; Fig. 1 D).
Thalamocortical axons exiting the lateral geniculate nucleus (LGN) target the primary (striate) visual cortex along the banks and lips of the calcarine sulcus, on the medial surface of the occipital lobe and occipital pole ( Fig. 1 A). After leaving LGN from its rostral aspect, the fibers cross the retro-and infralenticular portions of the internal capsule to pass over the roof of the lateral ventricle temporal horn and then turn inferolaterally towards the pole of the temporal lobe. The inferior fibers of the OR make a sharp loop around the tip of the lateral ventricle temporal horn and atrium to return posteromedially forming what is called Meyer's loop ( Dayan et al., 2015 ;Ebeling & Reulen, 1988 ;Meyer, 1907 ;Wahlerliiek et al., 1991 ). Along their paths, OR fibers preserve the retinotopic ordering present in the LGN. Fibers carrying signals from both eyes related to the lower contralateral visual field quadrant radiate along a dorsal and more direct route to terminate in the upper lip of the calcarine fissure, with the fovea located at the occipital end of the fissure. Meanwhile, the fibers carrying inputs form the upper quadrant radiate more inferiorly and take increasingly looped trajectories, Axons with peripheral upper quadrant inputs take the longest and most anterior loop; lesion of these anterior fibers results in superior quadrantanopia (see Barton et al. 2005 , for a review). Corticothalamic axons from V1 reciprocate these pathways in strict topographic order ( Briggs, 2020 ).
Axons traveling from the ventral division of the medial geniculate nucleus (MGN) to the ipsilateral primary auditory cortex (A1, BA 41, and partly 42 in the first straight (Heschl's) gyrus of the temporal operculum) form the AR ( Fig. 1 B). These fibers exit from the rostral pole of the ventral MGN, turn laterally to cross the intralenticular portion of the internal capsule and finally fan dorsally to reach A1. The layout of these fibers preserves the orderly segregation present in the ventral MGN of signals related to the predominant excitation of specific regions of the cochlea ( "tonotopy ", see Cherches 2016 ). Functionally, the AR is the gateway for auditory information to the cerebral cortex; as such, these fibers are required for conscious sound perception and for the discrimination of speech signals ( Ojemann, 1991 ).
Neurons in the ventroposterior thalamic complex (VP) innervate the somatosensory areas in the posterior banks and lips of the central sulcus (Brodmann areas 3, 1 and 2, referred to collectively as primary somatosensory cortex, S1) in a highly topographic fashion ( Fig. 1 C). These pathways convey a diverse array of tactile, thermal, nociceptive and proprioceptive inputs, and each of these sensory submodalities is preferentially directed to one of the above areas ( Jones & Friedman, 1982 ). Axons from cells in the medial VP subnucleus relay trigeminal information from the face and mouth and target the lateral and opercular portions of S1, whereas axons originating in the lateral subnucleus carry information from the hands, arms, trunk and lower limbs and target dorsolateral and medial regions of S1 (reviewed in Jones 2007b ). The VP axons travel through the posterior limb of the internal capsule and the superior thalamic radiation in close vicinity to the axons of the ventrolateral (VL) nucleus. The descending corticothalamic axons originating in S1 innervate VP in the same topographic order ( Nieuwenhuys et al., 2008 ).
The posterior division of the ventral lateral nucleus (VLN) relays inputs from deep cerebellar nuclei, mainly the dentate nucleus, to the primary motor cortex (M1, BA4) via the MR ( Ilinsky & Kultas-Ilinsky, 2002 ;Fig. 1 D). The order of projections preserves the muscular somatotopy present in each of the deep cerebellar nuclei ( Jones, 2007b ). Neurons in the medial and ventral part of VLN innervate the opercular region of M1 controlling mouth and face muscles, while more latero-dorsally located neurons innervate progressively more dorsal and medial regions of M1 controlling the contralateral hand and trunk and lower limb muscles ( Kelly & Strick, 2003 ).
Inputs from the deep cerebellar nuclei to the VLN travel via the DT ( Fig. 1 D). This fiber bundle originates mainly in the dentate nucleus, with smaller contributions from the interpositus and fastigial ( Jones, 2007b ;Middleton & Strick, 1997 ; not shown in Fig. 1 D). This fiber tract exits the cerebellum in the superior peduncle ( brachium conjunctivum ) and decussates to the contralateral mesencephalic tegmentum to ascend medially to the red nucleus, where it issues numerous short collateral branches. The DT fibers finally enter the VLN from its inferolateral aspect ( Coenen et al., 2014 ;Kwon et al., 2011 ;Pelzer et al., 2017 ). At the functional level, the DT, VLN and MR together provide a low-convergence and fast-conduction pathway that is believed to feed M1 high-resolution error signals about the execution of motor commands. Such signals are crucial for learning and flexible adjustment of skilled motor behaviors, including speech ( Kawai et al., 2015 ;Ojemann, 1975 ). Lesions to the DT can produce abnormal movement, including ataxia, tremor, and dystonia ( Kwon et al., 2011 ).
Several studies have reported successful reconstruction of OR with diffusion data with a variety of approaches ( Bassi et al., 2008 ;Behrens et al., 2003 ;Benjamin et al., 2014 ;Sherbondy et al., 2008 ). The reconstruction methods and parameters adopted in these studies differ in several ways, such as the seeding strategy, fractional anisotropy threshold, streamline length and angle restriction. Studies trying to identify the AR have encountered difficulties when using a single fiber tractography model because the AR fiber group crosses with other fiber groups, which results in multiple-orientation signals in voxels. . For instance, Behrens et al's (2007) study defined the MGN as a cuboid medial to the LGN and started tracking from there to the primary auditory cortex using a probabilistic algorithm. In this study they failed to reconstruct the AR with single-fiber tractography, and only succeeded in doing so when they used multi-fiber tractography. In a more recent study, Maffei et al. (2019) explored how the DWI acquisition and tracking parameters can affect the reconstruction of the AR. They found that higher b-values and more gradient directions increased the accuracy of reconstruction for both probabilistic and deterministic tracking algorithms, but with low b-values ( ≤ 3000 s/mm 2 ) only the probabilistic algorithm was able to successfully reconstruct the AR.
The existing tractography studies on the SR have mainly focused on clinical and special populations, such as Parkinson patients with essential tremor ( Sudhyadhom et al., 2013 ), patients with brain lesions ( Staudt et al., 2006 ), patients with trigeminal neuralgia ( Rutland et al., 2020 ) or preterm born infants ( Berman et al., 2005 ). Similarly, most studies investigating the MR have focused on clinical populations, and often on patients undergoing deep brain stimulation (DBS) surgery ( Andrade et al., 2020 ;Pouratian et al., 2011 ;Riskin-Jones et al., 2021 ;Sammartino et al., 2016 ). These studies reconstructed only parts of the MR depending on the pathological conditions of those patients. For example, Andrade et al's (2020) study reconstructed part of the MR, using the ventral oralis internus nucleus as the seed, in patients who underwent DBS surgery due to Tourette syndrome. As the DT travels through deep and small nuclei it is difficult to identify with DWI techniques. There are only a handful of studies that have successfully reconstructed the DT from DWI data in healthy ( Kwon et al., 2011 ;Meola et al., 2016 ) or clinical populations ( Coenen et al., 2011( Coenen et al., , 2014Nowacki et al., 2019 ). Nowacki et al's (2019) study tested four different tracking protocols to identify the DT, which led to divergent results. All four methods were based on deterministic algorithms, but no probabilistic algorithms were tested. Some studies have also reconstructed the DT and MR as one single tract ( Ji et al., 2019 ;Vo et al., 2015 ).
The present study aimed to develop and test a reproducible protocol for obtaining all of these five first-order relay thalamic input and output white-matter tracts. The novelty of this protocol capitalizes on 4 aspects: (1) It is focused on well-known white-matter tracts constituted by myelinated axons that originate and/or target the first order relay nuclei of the thalamus, testing them within the same study, and using similar methods and reconstruction procedures across them. 2) Different from most previous studies, here we specifically investigated in a large dataset the reliability of the protocol in terms of both computational and testretest reproducibility.
3) The present protocol uses state-of-the-art MRI protocols (multiband, multi-shell) and tractography methods with the aim of developing an advanced protocol that can be applied to current ongoing studies and future research. 4) The protocol is designed to be reproducible, easy to use and automatized, which unfortunately has not been the norm in the past. Also, it builds on previous well-validated tools including the first probabilistic atlas of the thalamus based on combining high-resolution ex vivo MRI and histology ( Iglesias et al., 2018 ) and the reproducible-tract-profiles (RTP2) containerized tool which is based on state-of-the-art techniques implemented on top of Vistasoft's code, which have been tested and used in many publications over the last 15 years ( https://www.github.com/vistalab/vistasoft ; Lerma-Usabiaga et al., 2022 ). The ultimate goal of this work was to provide a reliable protocol for obtaining and estimating first-order relay thalamic pathways for basic research and clinical studies.
To this end, we first defined multiple parameters to optimally reconstruct the above-mentioned five thalamic pathways (i.e., OR, AR, SR, MR and DT; Fig. 1 ) in left and right hemispheres. Second, we tested the computational and test-retest reproducibility of our protocol by examining a range of white-matter proxies related to the microstructural and macrostructural properties of these tracts. To examine the reliability of the protocol we obtained tracts from diffusion-weighted images of 113 normal adults. The protocol consisted of three components: Defining the regions-of-interest (ROI); preprocessing diffusion-weighted images (DWI) data; modeling white-matter tracts and tractometry. Reproducibility was tested using two approaches: 1) Computational reproducibility , tested by identifying each tract using the same parameters 10 independent times for all 113 subjects, and 2) Test-retest reproducibility , tested by re-scanning a subset of 24 participants using the same MRI protocol twice within an average interval of 15 days. Our hypothesis was that we would obtain a high degree of reproducibility for the microstructural and macrostructural properties of these tracts. However, we expected some variability in specific tracts, and hypothesized that this variability would be higher for test-retest than for computational reproducibility.

Subjects
A total of 113 healthy volunteers (mean age = 24.5 years, SD = 4.33 years; 65 females) participated in the study. Twenty-four of the volunteers (mean age = 24.7 years, SD = 4.06 years; 13 females) returned for a second session in which they were scanned using exactly the same MRI protocol (mean interval = 15 days, SD = 21.82 days, range: 7-104 days). All participants were right-handed and had normal or corrected-to-normal vision. No participant had a history of major medical, neurological, or psychiatric disorders. The study protocol was approved by the Ethics Committee of the Basque Center on Cognition, Brain and Language (BCBL) and was carried out in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving human participants. Prior to their inclusion in the study, all participants provided informed writ-ten consent. Participants received monetary compensation for their participation.

Data acquisition
Whole-brain MRI data acquisition was conducted on a 3-T Siemens Prisma Fit whole-body MRI scanner (Siemens Medical Solutions) using a 64-channel whole-head coil. The MRI acquisition included one T1weighted structural image (T1w) and DWI sequences. High-resolution MPRAGE T1-weighted structural images were collected with the following parameters: time-to-repetition (TR) = 2530 ms, time-to-echo (TE) = 2.36 ms, flip angle (FA) = 7°, field of view (FoV) = 256 mm, voxel size = 1 mm isotropic, 176 slices. In total 100 diffusion-weighted images were acquired with the anterior to posterior phase-encoding direction and 50 isotropically distributed diffusion-encoding gradient directions. The 100 diffusion weighted images included 50 images with b-value of 1000 s/mm 2 and 50 images with b-value of 2000 s/mm 2 . Twelve images with no diffusion weighting (b-value of 0 s/mm 2 ) were obtained for motion correction and geometrical distortion correction, which comprised five images with the same phase-encoding direction as the DWI images and seven images with the reversed phase-encoding direction (posterior to anterior). Both DWIs and b0 images shared the following parameters: TR = 3600 ms, TE = 73 ms, FA = 78°, voxel size = 2 isotropic, 72 slices with no gap and a multiband acceleration factor of 3.

Tractography pipeline
Tract reconstruction was conducted using a custom containerized workflow called RTP2 (Reproducible Tract Profiles 2) pipeline ( Lerma-Usabiaga et al., 2019, which guarantees data provenance and reproducibility. The RTP2-pipeline divides this process into three main parts: (1) ROI definition, (2) DWI preprocessing, and (3) tract identification and tractometry. In step 3, we used the preprocessed diffusion data from step 2 to model streamlines based on the ROIs created in step 1. Details are given below. The code and parameters are available through GitHub (github.com/MengxingLiu/Thatract-paper) and Docker Hub ( https://hub.docker.com/u/garikoitz ).

ROI definition
The first step of the RTP2-pipeline (called RTP2-anatROIs) involves processing the subject's anatomical T1w image to obtain the ROIs to be used in tractography. The input for this step is the subject's T1w file and ROIs defined in MNI space; the output is a segmented T1w image and the ROIs of interest in individual subject T1w space.
The ROIs used for the identification of each fiber group (see Table 1 ) were obtained as follows. First, Freesurfer ( http://surfer.nmr. mgh.harvard.edu/ ) was used to perform cortical/subcortical segmentation and parcellation. Next, the thalamic nuclei were obtained by running the thalamic segmentation module implemented in Freesurfer on a probabilistic atlas built with histological and high-resolution ex vivo MRI data ( Iglesias et al., 2018 ). For this study, we only considered first-order relay nuclei as ROIs: the LGN, MGN, VP, VLN, and VLp. The VP was defined from the subdivision of lateral ventral posterior nucleus (VPL) in the thalamic segmentation because in Iglesias et al. (2018) this subdivision combines both lateral and medial ventral posterior nuclei. The VLN was created by combining the anterior and posterior ventral lateral nuclei (VLa and VLp). All tracts described in this paper travel from these thalamic nuclei to other locations in the brain. We next describe how we obtained the ROIs.
To parcellate the visual cortex we ran the Neuropythy ( Benson & Winawer, 2018 ; https://github.com/noahbenson/neuropythy ) tool on the Freesurfer results. A combination of the resulting V1 and V2 ROIs was used for our visual cortex ROI when defining the OR. The OR projections to V1 are well established (see Rokem et al., 2017 , for a review on OR tractography), whereas OR projections to V2 are relatively less studied. There is tracer evidence from non-human primates suggesting that LGN projects beyond V1 ( Garey & Powell, 1971 ;Maciewicz, 1975 ;Manger & Rosa, 2005 ), specifically to V2 (see Kennedy, 1983 , andBullier, 1985 ). In-vivo tractography evidence from humans reconstructing the OR and measuring the terminating points suggested that a considerable number of streamlines also terminated in V2 ( Alvarez et al., 2015 ;Arrigo et al., 2016 ). Thus, both V1 and V2 are included in the current protocol to reconstruct the OR. The remaining ROIs were obtained from atlases defined in MNI space. To convert them to individual subject space, we first performed a non-linear registration of a 1mm 3 MNI template using Advanced Normalization Tools (ANTs, http://stnava.github.io/ANTs/ ). A1, M1 and S1 were converted from the human connectome project (HCP) atlas ( Glasser et al., 2016 ). The S1 was defined by combining Areas 1, 2, 3a and 3b in the HCP atlas. The cerebellar dentate nucleus ROIs were obtained and transformed from the cerebellar parcellation proposed by Diedrichsen et al. (2011) .
To make sure the ROIs extended to the interface of gray and white matter, all ROIs were dilated by one cubic voxel. Also, we used inclusion or exclusion ROIs to improve neuroanatomical accuracy when obtaining both the OR and DT. For the OR, the inclusion ROI was a waypoint, first drawn in MNI space (coronal plane of y = -80 that transpasses the posterior limb of the internal capsule) and then transformed to native space, to select fibers passing through the internal capsule. In contrast, for the DT, two ROIs, the ipsilateral cerebellar cortex and the contralateral thalamus, were used with "NO " logic, that is, excluding fibers that passed through them. These were generated from one of the default parcellations of Freesurfer (i.e., aparc + aseg.mgz).

DWI data preprocessing
The second step in the RTP2-pipeline (called RTP2-preproc) consisted in preprocessing the diffusion data and registering it in anatomical space. This step was mainly based on MRtrix's ( Tournier et al., 2019 ) recommendations and used MRtrix tools, the ANTs tool described above and FSL . The data was preprocessed using several MRtrix functions in the following steps: first, data denoising based on random matrix theory, which exploits data redundancy in the patch-level principal component analysis domain ( Cordero-Grande et al., 2019 ;Veraart et al., 2016 ) using dwidenoise ; second, Gibbs Ringing correction ( Kellner et al., 2016 ) using mrdegibbs ; third, susceptibility induced distortions and motion correction with the FSL's topup and eddy tools ( Smith et al., 2004 ) called by dwifslpreproc ; fourth, B1 field inhomogeneity correction with dwibiascorrect and Rician background noise removal with mrcalc ; fifth, a rigid transformation matrix to align the DWI images to the corresponding T1w image using ANTs.
To make sure the DWI data from test and retest sessions were in the same space, DWI data from both sessions were aligned to the same T1w collected in the initial test session. Therefore, both test and retest sessions used the same ROIs; only DWI preprocessing and the streamline tracking were session-specific.

Tract identification and tractometry
In the third and final main step, the container RTP2-pipeline was used to obtain the final white-matter tracts. This container used the ROIs and preprocessed DWI data to systematically identify the tracts of interest bilaterally: OR, AR, SR, MR and DT. We initially ran this step with 4 random subjects. After carefully checking the reconstruction of each tract in line with the neuroanatomical and neurophysiological literature (see Nieuwenhuys et al. 2008 , for a review), the expert neuroanatomist (FC), who has more than 30 years of research experience in the study of thalamocortical projections, evaluated and guided the iteration of different parameters, such as inclusion/exclusion ROIs, tracking algorithm, streamline cleaning, etcetera. If the reconstructed fiber bundles were not compatible with the knowledge of the neuroanatomist (e.g., the pathway, thickness of the bundle, terminating location etcetera), we adjusted our parameters accordingly, until the bundle was as close as possible to the known neuroanatomy.
We first modeled the diffusion information to obtain a map of possible directions with weights, called fiber orientation distributions (FODs), for every voxel, using MRtrix3's multi-tissue constrained spherical deconvolution (CSD; Jeurissen et al. 2014 ). This tool can discern crossing fibers and provide more than one direction in each voxel. Next, streamline tractography was performed on the estimated fiber orientation distributions using a probabilistic algorithm (iFOD2; Tournier et al. 2010 ) with the following parameters: step size 1mm, maximum fiber length 200mm, minimum fiber length 20mm, FODs amplitude threshold 0.05, angle threshold 45 degrees. The ROIs used in the streamline tractography are described in Table 1 (please note that the dentatothalamic tract is a cross-hemispheric tract; in the present study we defined the hemisphere of this tract based on the thalamic hemisphere for the convenience of the description). The streamlines were seeded from ROI#1 and terminated in ROI#2 for all tracts except the OR, which combined streamlines from both directions to limit the impact of volume differences between seed and target. Matlab utilities developed in Vistasoft ( https://github. com/vistalab/vistasoft ) were used to remove the outlier streamlines from each tract generated in MRtrix and to obtain the main tract metrics.
We generated along-tract profiles using the tract metrics obtained from Vistasoft. Although CSD was used to model the fibers because it discerns crossing fibers, classical diffusion tensor (DTI) modeling was used to obtain the typical diffusion summary statistics, such as fractional anisotropy (FA), axial diffusivity (AD), mean diffusivity (MD), and radial diffusivity (RD). To generate tract profiles, we obtained the central location of all the streamlines in the tract and sampled this as 100 samelength segments. We then summarized the diffusion properties of each segment by taking a weighted average of the diffusion properties corresponding to a disc centered in the segment. Finally, an along-tract profile was generated for each tract using Vistasoft.

Reproducibility measurement
Reproducibility was measured in two different ways ( Fig. 2 ). First, to test computational reproducibility , we repeated the tractography reconstruction using the RTP2-pipeline 10 times. Our aim was to measure how consistently our pipeline generated the tracts of interest. Second, to check for test-retest reproducibility, we used a subset of 24 participants, who returned for a second acquisition session within a mean temporal interval of 15 days. We examined how consistently our pipeline generated their tracts at these two different time points.
To evaluate these two types of reproducibility at the microstructural scale, we performed pairwise correlations on tract profiles for all possible pairs from the 10 repeated computations to measure computational reproducibility, and across test and retest to measure test-retest reproducibility. For simplicity, we only show the correlations for the FA values (for results with other metrics please see the Supplementary material).
At the macrostructural level, we quantitatively analyzed tract volume overlap, streamline density and distance to check the reproducibility of tract shapes. These analyses included: (1) Dice similarity index to check for volume-based overlap of all tract pairs; (2) density correlation for the voxel-level streamline density of all tract pairs; and, (3) bundle adjacency, the average distance between streamlines from two tracts. The measurements used to examine computational reproducibility and rest-retest reproducibility were computed using the package scilpy (see details in Schilling et al., 2021 and https://github.com/scilus/scilpy ). These analyses were conducted across all possible pairs of computa-tional reproducibility and rest-retest reproducibility, as well as for each tract.

Results
In the present study, we obtained and measured fibers bundles connecting four first-order sensory (LGN, MGN), somatosensory (VP) and motor (VLN) thalamic nuclei with their main corresponding cortical target areas. In addition, we reconstructed the subcortical input pathway to VLp from the dentate nucleus of the cerebellum. These five tracts were identified as homologous tract pairs in the left and the right hemisphere. Figs. 3 and 4 show these tracts in a representative subject (see also supplementary videos showing 3D rotations of the main tracts studied, as well as supplementary Figure S1 for tracts from five more random subjects). To examine the reproducibility of our protocol, we followed a double analytical approach testing: (1) computational reproducibility by repeating the computation on the same diffusion data 10 times and quantifying changes from computation to computation for the same tract; and, (2) test-retest reproducibility , by obtaining DWI data from the same subjects and using the same MRI protocols across two different sessions to quantify test-retest changes in the same tracts.

Computational reproducibility
For the five pairs of white-matter fibers used in the established protocol, the repeated computations on same diffusion data resulted in almost    Each dot represents the correlation coefficient for a specific computation pair for one participant. C) Agreement indices distribution: bundle adjacency (top), Dice coefficient (middle), and density correlation (below) for all possible computation pairs for each tract and each subject (lighter color columns represent the left hemisphere, darker color columns represent the right hemisphere).
identical tract profiles and high agreement on streamlines. Mean correlations of the FA profile were above 0.99 for all tracts examined (see Table 2 ; Fig. 5 A). At an individual level, correlation coefficients were higher than 0.97 for each fiber across all possible computation pairs, except for left AR, which had a long tail that reached 0.82 ( Fig. 5 B). Bear in mind for 10 repeated computations, that will result in at least 9 relatively low coefficients, even if only one computation resulted in a different tract than all the others. Agreement indices also showed that the identified white-matter fibers had consistent shapes and density across repeated computations ( Table 2 ). Fig. 3 C shows agreement indices for each individual pair of computations. These three agreement indices revealed a similar pattern as that observed for correlation coefficients of the FA profile ( Fig. 5 B), with more variability in left AR. A similar pattern was also found for the homologous right AR. It is noteworthy that some of this variability in agreement indices derives from the same single subject (e.g., outlying clusters for bundle adjacency and Dice coefficient in right AR, and for Dice coefficient and density correlation in left OR). Bilateral DT also showed more variability in density correlation, but less variability in bundle adjacency and Dice coefficient. The correlation coefficients for AD, MD, and RD across all possible pairs of computations are described in the Supplementary material ( Figure S1).

Test-retest reproducibility
To examine test-retest reproducibility, 24 participants came back for a retest session where we used exactly the same MRI protocol. The mean of FA profile correlations was above 0.9 across the ten tracts of interest, although as expected the values were numerically lower than those observed in the computational reproducibility analyses. As in the computational reproducibility analysis, the left AR also showed higher variability in the test-retest reproducibility analysis, with lower values within the mean correlation coefficients (0.93, Table 2 and Fig. 6 A & B). Nevertheless, it is important to highlight that all of these values reflect a high degree of reproducibility.
Test-retest reproducibility was also confirmed by the agreement indices. The group averages for bundle adjacency were all under 0.4 for the ten tracts, indicating that the streamlines identified in the test were very close to the streamlines identified in the retest (see Table 2 and Fig. 6 C). High reproducibility was also reflected by the Dice index and streamline density correlation. Among the ten tracts, the AR and DT tended to show more variability bilaterally. The correlation coefficients between test and retest for AD, MD, and RD are shown in the Supplementary materials ( Figure S2).

Discussion
We present a reproducible protocol for tractography reconstruction of first-order human thalamocortical tracts, which play a critical role in sensory and motor information relay between the thalamus and cortex. We tested the reproducibility of our protocol for obtaining these tracts of interest by examining their microstructural tractometric properties and volume-based macrostructural similarity across repeated computations and test-retest sessions. Results showed nearly perfect computational reproducibility across ten repetitions and high-to-excellent test-retest reproducibility.
In terms of computational reproducibility, it is worth highlighting that across ten separate and independent computations using the same raw data and protocol, reproducibility was nearly perfect; for example, providing an average FA profile correlation of 0.99 (individual-subject values ranging from 0.82 to 0.99) and an average Dice similarity value of 0.91 (individual Dice index values ranged from 0.66 to 0.97) across all the bundles examined. We expected computational reproducibility would be high, but it was important to demonstrate that the protocol is reliable and appropriate for obtaining the tractography measures of interest. Concerns about this type of reproducibility, which we refer to as computational reproducibility, have grown in recent years (e.g., Theaud et al. 2020 ). One goal of our protocol is to offer neuroscientists and medical practitioners efficient and reproducible processing guidelines to reconstruct first-order thalamocortical tracts. Hence, we packed our solution in software containers that can be run using Singularity or Docker technologies. In both cases, exactly the same set of algorithms, associated libraries and operating systems can be run in a computationally reproducible manner. This allows researchers to run previously as well as recently acquired data using exactly the same software and configuration options many times.
Test-retest reliability, which ensures stability across time, is one of the most widely used measures for a protocol or tool. Since test-retest reproducibility entails inputting different data, we expected to obtain lower numerical tractometric and volume-based similarity values than those observed in our computational reproducibility calculations. At the microstructural level, our data showed an average 0.97 (individualsubject values ranged from 0.50 to 0.99) test-retest reproducibility for FA profile correlations after an average of 2 weeks. These correlation coefficient values for tract profiles are high and consistent with the values obtained in a previous study aimed at validating test-retest reliability in classic fiber bundles ( Kruper et al., 2021 ).
At the macrostructural level, the Dice index is commonly used to assess the overlap of bundles reconstructed at two time points ( Besseling et al., 2012 ;Boukadi et al., 2019 ;Cousineau et al., 2017 ). In the present study, Dice index values averaged 0.85 (individual-subject values ranged from 0.39 to 0.94) for all white matter bundles examined. Based on the Dice index values reported in previous studies focused on white matter bundles (e.g., minimum group-level Dice index of 0.70 in Besseling et al. 2012 andCousineau et al. 2017 ;0.71 in Boukadi et al. 2019 ), the range of the Dice index reported here (i.e., 0.76-0.90 across tracts at group level) indicates that all the white matter bundles we investigated had high test-retest reliability.
It is important to understand the nature of the variability observed in these two types of reproducibility measurements. Our results showed that computational reproducibility is nearly perfect, but there is some variance across repeating computations. The main source of this variance is random seed generation. Basically, there are steps in our reconstruction pipeline that involve non-deterministic processes, which will be different across computations. A fully reproducible pipeline can be achieved by fixing the random seed initialization, but we decided not to proceed in this way. We used a probabilistic algorithm to generate the streamlines, whose advantage has been discussed by many researchers (see for instance ( Bonilha et al., 2015 ;Grisot et al., 2021 ;Khalsa et al., 2014 ). Indeed, fixing random seed initialization would work against the probabilistic nature of the algorithm and the philosophy of probabilistic tractography. In addition, fixing the random seed is not compatible with using multi-threaded steps that allow for faster computation.
Multi-threading introduces randomness in terms of the order of step execution, which will generally affect the reproducibility of results. It is relevant to note that previous studies that focused on probabilistic tractography have not examined computational reproducibility. Together with random seed generation, there could possibly be other factors contributing to some extent to the computational reproducibility variance. Instead of assuming no or slight changes among computations, researchers should be aware that computational variance exists depending on the parameters chosen to conduct the tractography. It is important that future research further investigates, in a systematic manner, factors that might be associated with computational variance in probabilistic tractography beyond random seed generation.
The test-retest reproducibility reported here showed some variability especially for specific bundles. We observed overall more variability in test-retest reproducibility for the AR and DT tracts. This can be in part explained by specific characteristics of these tracts, for instance: i) the small seed used for reconstruction (i.e., MGN/A1, dentate nucleus); ii) long streamlines (i.e., DT); and iii) anatomical complexity (i.e., DT). Previous studies have related lower reproducibility to smaller seed size ( Bonilha et al., 2015 ;Buchanan et al., 2014 ;Zhang et al., 2019 ) and longer streamlines ( Bonilha et al., 2015 ;Mori & Van Zijl, 2002 ;Tsai, 2018 ). Fiber tracking from small seeds may be influenced by systematic errors and noise, leading to spurious findings. Moreover, smaller seeds are likely to generate less fibers and therefore decrease the likelihood of successful tracking. Similarly, streamlines with longer paths lead to more interruptions in fiber tracking, which can also lead to larger variation. Also, the anatomically complex DT connects small and deep nuclei, making it more vulnerable to partial volume effects between different tissues ( Mori & Van Zijl, 2002 ). Small ROIs, long fibers, and anatomical complexity affect computational reproducibility in the same manner as to the test-retest reproducibility. Nevertheless, our protocol was able to reconstruct the five pairs of tracts in all 113 subjects. This is an important achievement considering the complexity of the tracts, and the fact that we used a fairly standard modern scanner and acquisition sequences.
The present work has some limitations that could be addressed in future work. First, here we limited our study to first-order relay thalamic nuclei input and output pathways tracts. These pathways are of great basic and clinical interest, as they are pivotal elements of the forebrain networks supporting complex functions such as language and skilled movement. Future studies should implement protocols suitable for the study of the structural connectivity between higher-order relay nuclei of the thalamus and cerebral cortex. Second, the data used to measure the reproducibility were collected using a specific acquisition sequence. Follow-up studies should examine to what extent data reproducibility at the computational and test-retest levels is maintained for a range of DWI MRI protocols. Important variations may include: single-shell versus multi-shell DWI, multiband versus monoband protocols, the number of directions included, the types of MRI scanners, etcetera. Third, the reproducibility profiles of the present protocol focused on first-order relay human thalamic white-matter tracts in healthy adults. It is not clear how generalizable these results would be if the same protocol were applied to clinical or special populations. Testing different population groups would help determine whether this protocol successfully generalizes to, for instance, clinical samples and developmental samples, who undergo rapid neuroplasticity changes. Testing different DWI MRI protocols and populations was beyond the scope of the present work. The containerized implementation of the protocol facilitates the testing of other relevant aspects in future studies. We will provide support to researchers interested in using this protocol ( https://github.com/ garikoitz/RTP-pipeline/wiki/Parameter-recommendations ).

Conclusion
In sum, in this work we established a reliable protocol to obtain and measure the main first-order thalamic input and/or output white-matter fiber tracts. Our tests revealed this protocol had nearly perfect computational reproducibility and high-to-excellent test-retest reproducibility at both the microstructural and macro-scale levels. This protocol can be used to investigate the structural connectivity of these pivotal connection systems in development, learning or disease. We have made this protocol openly available to the scientific community.