XTRACT - Standardised protocols for automated tractography and connectivity blueprints in the human and macaque brain

We present a new toolbox and library of standardised tractography protocols devised for the robust automated extraction of white matter tracts both in the human and the macaque brain. Using in vivo data from the Human Connectome Project (HCP) and the UK Biobank and ex vivo data for the macaque brain datasets, we obtain white matter atlases, as well as atlases for tract endpoints on the white-grey matter boundary, for both species. We illustrate that our protocols are robust against data quality, generalisable across two species and reflect the known anatomy. We further demonstrate that they capture inter-subject variability by preserving tract lateralisation in humans and tract similarities stemming from twinship in the HCP cohort. Our results demonstrate that the presented toolbox will be useful for generating imaging-derived features in large cohorts, and in facilitating comparative neuroanatomy studies. The software, tractography protocols, and atlases are publicly released through FSL, allowing users to define their own tractography protocols in a standardised manner, further contributing to open science.


Introduction
Diffusion tractography is a unique tool for extracting white matter (WM) pathways non-invasively and in vivo. The virtual dissection of major WM tracts enables the study of brain organisation (Catani et al., 2013;Jbabdi et al., 2015) and offers a probe to brain development (Huppi and Dubois, 2006) and WM pathology (Ciccarelli et al., 2008;Griffa et al., 2013). It further allows explorations of individual variations (Assaf et al., 2017) and crossspecies variations (Mars et al., 2018b) in anatomy and connectivity. This information has functional relevance, as the pattern of extrinsic WM connections of each functional brain subunit to the rest of the brain are unique (Mars et al., 2018a;Passingham et al., 2002).
To be able to reliably study individual variability in WM pathways, tractography approaches often utilise protocols to extract a pre-defined set of WM tracts. Such protocols must be robust and reproducible, allowing reconstruction of WM tracts in a consistent manner across subjects, while respecting the underlying anatomical variation and individual differences. Tractography protocols are typically comprised of masks and rules used to impose prior anatomical knowledge to guide and constrain curve propagation, reducing the chance of false positives (Catani et al., 2002;Wakana et al., 2004). One approach that may be used is to define subject-specific tractography protocols (Conturo et al., 1996), considering the specific variations in individual anatomy. However, defining masks on a subject-wise basis is time-consuming and subjective (Jones, 2008;Nucifora et al., 2012), while for large cohorts these limitations become prohibitive. The alternative to this manual approach is to define a set of standardised masks in template space, which are then registered to the individual geometry and used in a consistent and automated manner for each subject.
This approach has proven powerful in the extraction of a range of tracts (Catani and Thiebaut de Schotten, 2008;de Groot et al., 2013;Thiebaut de Schotten et al., 2011b;Wakana et al., 2007;Wassermann et al., 2016;Zhang et al., 2008) (see Supplementary Table 1 for a summary). (Wakana et al., 2007) developed a set of standard masks for the extraction of 20 tracts (9 left/right, 2 commissural). They reported high inter-and intra-rater reproducibility and suggested that some tracts may display left-right asymmetry. Similarly, (Catani and Thiebaut de Schotten, 2008) defined standard space masks for the reconstruction of 19 tracts (7 left/right, 5 commissural). They demonstrated how the use of standardised mask-based protocols may aid in improving the reproducibility of tractography results and produced tract atlases. This work was furthered by (Thiebaut de Schotten et al., 2011b) through the extension of the tractography protocols to 31 tracts (14 left/right, 3 commissural), where good correspondence between their automated tractography technique and histological atlases was reported. (de Groot et al., 2013) compiled a library of tractography protocols, adapted from the literature, and used standardised mask-based automated probabilistic tractography to reconstruct 27 tracts (12 left/right, 3 commissural) in two datasets with varying quality. (Wassermann et al., 2016) proposed a framework for describing WM anatomy and tracts which uses subject-wise anatomical segmentation, clustering and a query language to extract 57 (25 left/right, 7 commissural) tracts from whole-brain tractography. This approach reduces the definition of tracts to sets of logical rules with reference to whether a tract is present or terminates in given brain parcels.
More recently the power of automated tractography has been illustrated within a different context. (Mars et al., 2018b) defined a set of tracts, derived from the same tractography protocols as in (de Groot et al., 2013), in both humans and macaques, in order to perform comparative anatomy. They defined connectivity blueprints as (Cortex) x (Tracts) maps, which can be used to identify functionally equivalent cortical regions between the two different species, in the absence of any geometrical similarity. This paper extends these previous efforts in devising an extended set of ROI-based tractography protocols and an automated tractography toolbox 1 . The contribution of the presented work is as follows: 1) we design tractography protocols for 42 tracts and we illustrate their robustness against data quality, using high-resolution data from the Human Connectome Project (HCP)  and more typical data from the UK Biobank , 2) we illustrate generalisability of the tractography protocols to the macaque brain, 3) we derive high-quality tract atlases using these protocols both for the human brain (1000 HCP subjects) and the macaque brain (high-resolution, ex vivo datasets from 6 animals), 4) we perform indirect validation by assessing lateralisation of the extracted tracts (in humans), 5) we illustrate that, despite being template-driven, reconstructed tracts preserve individual variability as assessed via twinship analysis, and 6) we offer an opensource flexible framework for publicly exchanging tractography protocols available within FSL. New standard space WM tractography protocols may be defined and "plugged into" the toolbox, allowing for further expansion and tract exchange, contributing to open science and reproducibility of results.

Tractography Protocol Definition
We devised tractography protocols for 42 WM tracts (19 bilateral and 4 commissural), in a generalisable manner that allows equivalent mask definitions to apply to both the human and the macaque brain. The full list of tracts that are currently supported is presented in Table  1. We further implemented a new cross-species tractography (XTRACT) toolbox, capable of reading the standard space tractography protocols and performing probabilistic tractography (Behrens et al., 2007), with the option of GPU acceleration (Hernandez-Fernandez et al., 2019). Figure 1 illustrates the main stages for a single tractography protocol. Each tract is reconstructed using a unique combination of masks, defined in standard space (MNI152 for humans and F99 for macaques). The masks include seeds (starting points of the tractography streamlines), targets/waypoints (regions through which a streamline should pass in order to be valid), exclusions (regions that serve to reject any streamline running through them) and stop/termination masks (regions that serve to stop any streamline running through them). Seeding strategies support: a) a standard single-ROI seed and b) a "reverse-seeding" approach, where a pair of seed-target masks exchange roles and the final path distributions are added. The protocol masks are transformed from standard space to subject's native space using a non-linear registration warp field. Tractography is performed in native space and results are directly resampled to standard space, allowing between-subject geometrical correspondence, necessary in certain contexts (e.g. atlasing).
The sections below describe in detail the protocol for each tract in consideration in the case of human tractography and any adjustments for the macaque brain. With the exception of the brainstem and commissural tracts all protocols include the midline sagittal plane as an exclusion mask to restrict fibres to the ipsilateral hemisphere.

Association fibres
Superior Longitudinal Fasciculus (SLF) 1/2/3: The three branches of the superior longitudinal fasciculus are reconstructed using an extension of the approach taken by (Thiebaut de Schotten et al., 2011a). In each case a coronal plane in the region of the central sulcus within the frontal/parietal cortex is used as a seed along with two target masks. Frontally, target masks for the first, second, and third branches of the SLF were coronal sections through the superior, middle, and inferior frontal gyri, respectively, placed at the level of the posterior end of the genu of the corpus callosum. Posteriorly, a large coronal target mask in the superior parietal lobule, immediately posterior to the margin of the cingulate gyrus is used for SLF1. For SLF2 and SLF3, the second target masks are placed in the angular gyrus and supramarginal gyrus respectively. An axial exclusion mask was placed underneath the parietal cortex and one blocking subcortical areas prevented leaking into ventrally oriented fibres. A final coronal exclusion mask through subcortical areas posterior to the caudal end of the genu of the corpus callosum prevented leaking into ventral longitudinal tracts.
Arcuate Fasciculus (AF): The arcuate fasciculus forms part of the system of dorsal longitudinal fibres, but in the human brain is distinguished by its posterior curve ventrally into the temporal cortex. The human AF was reconstructed with a seed in the supramarginal gyrus (SMG), a temporal target mask was in the WM encompassing the superior temporal gyrus (STG) and middle temporal gyrus (MTG), and an anterior target at the level of the ventral premotor cortex, posterior to the inferior frontal gyrus (IFG) and anterior to the precentral sulcus. Following the observation in the macaque that this tract runs along the fundus of the circular insular sulcus (Petrides et al., 2012) we placed a seed mask there, just posterior to the level of the central sulcus. An axial target mask was placed in the parietal-temporal WM posterior to the caudal end of the Sylvian fissure. An additional axial plane was placed in the IFG. This protocol was validated by (Eichert et al., 2019b).
Middle/Inferior Longitudinal Fasciculus (MdLF, ILF): Three tracts that course along the temporal lobe were reconstructed (MdLF, ILF, IFO). The middle and inferior longitudinal tracts stay within the lateral posterior cortex. MdLF was seeded in the anterior part of the SFG (Makris et al., 2009); ILF in the middle and inferior temporal gyri to account for the expansion of the temporal cortex in the human brain compared to the macaque (Latini et al., 2017;Roumazeilles et al., submitted). For the MdLF, large axial and coronal planes covering the WM in the temporo-parietal-occipital junction were used as targets, based on anatomical descriptions from (Makris et al., 2013). For ILF, an axial plane in middle and inferior temporal gyrus is used as a target. Exclusion masks where placed axially through the brainstem, coronally through the fornix, axially through the cingulum bundle posterior to the corpus callosum and through the entire frontal cortex. In addition, seed and target masks of MdLF served as exclusion masks for ILF and vice versa.
Inferior Fronto-Occipital Fasciculus (IFO): In contrast to MdLF and ILF, the inferior frontooccipital fasciculus, also termed the extreme capsule fibre complex (Mars et al., 2016), runs more medially and courses into the frontal cortex through the extreme capsule. Extending the recipe of (Wakana et al., 2007), the seed was a coronal plane through the anterior part of the occipital cortex, the target a coronal plane through the frontal cortex anterior to the genu of the corpus callosum. An exclusion mask just behind the anterior commissure excluded all fibres except those running through the extreme capsule.
Uncinate Fasciculus (UF): The bottom part of the extreme capsule contains fibres belonging to the uncinate fasciculus, curving from the inferior frontal cortex to the anterior temporal cortex. The tract was reconstructed using a seed in the STG at the first location where temporal and frontal cortex are separated, a target through the ventral part of the extreme capsule, and exclusion masks through the rest of the extreme capsule and a layer between the seed and the target to force the curve. An additional coronal exclusion mask prevented accidental leaking into the fibres running longitudinally through the temporal lobe.
Frontal Aslant Tract (FA): The frontal aslant is a short tract running in the frontal lobe between the posterior part of the inferior and superior frontal gyri (Catani et al., 2012). The seed was placed sagittally in the WM of the IFG, the target axially in that of the SFG. A posterior coronal exclusion mask prevented leakage into longitudinal fibres.
Vertical Occipital Fasciculus (VOF): The vertical occipital fasciculus (VOF) runs in a predominantly dorsal-ventral orientation in the occipital lobe. We used an adapted version of the recipe described by (Takemura et al., 2017). An axial seed mask was placed in the lateral part of the ventral occipital WM posterior to the anterior occipital sulcus (Petrides et al., 2012). A larger axial target mask was placed dorsally at the level of the lateral occipital sulcus. A coronal plane just posterior to the corpus callosum served as an exclusion mask to prevent leakage into anterior-posterior tracts.

Commissural fibres
Middle Cerebellar Peduncle (MCP): The middle cerebellar peduncle (MCP) was seeded in the cerebellar WM with a target in the opposite hemisphere (and their inverses). Exclusion masks were placed sagitally along the cerebellar midline and axially through the thalamus.
Corpus Callosum Splenium (FMA) & Genu (FMI): We reconstructed callosal connections to the occipital lobe via the splenium of corpus callosum (forceps major, FMA) and to the frontal lobe via the genu of corpus callosum (forceps minor, FMI) using recipes based on those defined by (Wakana et al., 2007). Seed and target masks (and their inverse) for the FMA were defined as coronal sections through occipital lobe at the posterior end of the parietal-occipital sulcus. The sagittal exclusion mask was confined to the occipital cortex and the subcortex. Additional exclusion masks though the inferior fronto-occipital WM and a coronal plane through the pons prevented leakages to longitudinal fibres. Seed and target masks (and their inverse) for the FMI were defined as coronal sections through the frontal lobe at the anterior end of the pregenual cingulate sulcus. The sagittal exclusion mask was interrupted at the level of the anterior third of the corpus callosum.
Anterior Commissure (AC): The anterior commissure connects the temporal lobes of the two hemispheres across the midline. It was seeded in the left-right oriented fibres on the midline, with a target mask covering the WM lateral to the globus pallidae. Stop masks were placed directly underneath and lateral to the two amygdalae. A large axial exclusion mask was placed dorsal to the seed through the entire subcortex.

Limbic fibres
Cingulum subsections (CBT, CBP, CBD): Recently, (Heilbronner and Haber, 2014) proposed a segmentation of the cingulum bundle into distinct sections based on the presence of fibres connecting specific cingulate, non-cingulate frontal, and subcortical targets. We therefore created protocols for three distinct subsections of the cingulum bundle. The temporal part (CBT) was seeded in the posterior part of the temporal lobe at a section where the fibres of the cingulum are mostly oriented in the anterior-posterior direction. The target was placed posteriorly to the amygdala and stop masks were placed posteriorly and anteriorly to the seed and target masks, respectively. An exclusion mask prevented leaking into the fornix. The dorsal segment (CBD) was seeded just above the posterior part of the corpus callosum and had a target at the start of the genu of the corpus callosum. A sagittal exclusion mask in the anterior limb of the internal capsule prevented leakage into the temporal lobe. Finally, the peri-genual part of the cingulum bundle (CBP) was seeded anteriorly above the corpus callosum and a target placed below the sub-genual callosum with a stop mask placed inferior and anterior to the target. A callosal plane at the level of the rostral end of the Sylvian fissure prevented leakage into the CBD.
Fornix (FX): The fornix connects the hippocampus with the mammillary bodies, the anterior thalamic nuclei, and the hypothalamus (Catani et al., 2013). The tract was reconstructed using a seed in the body of the fornix at the level of the middle of the corpus callosum and a target in the hippocampus. A callosal plane at the anterior end of the occipital cortex prevented leakage into posterior tracts and bilateral sagittal planes around the midline, at the level of the anterior tip of the thalamus prevented lateral propagation to the anterior limb of the internal capsule. We should point out that due to the relatively small size of the stria terminalis and its close proximity to the fornix, the fornix tracking may leak into the stria terminalis. This is a common issue in diffusion tractography and is yet to be overcome using approaches in line with those used in the current study (Kamali et al., 2015;Mori et al., 2017;Mori and Aggarwal, 2014;Pascalau et al., 2018).

Projections fibres
Corticospinal Tract (CST): The corticospinal, or pyramidal, tract extends from the spinal cord through the midbrain and distributes to motor cortex, premotor cortex and somatosensory cortex. The tract is seeded from the pons with a large target covering the motor, premotor and somatosensory cortices. An axial exclusion mask is used to restrict tracking to the cerebral peduncle of the midbrain. In addition, the exclusion mask includes two coronal planes, anterior and posterior to the target, to exclude tracking to the prefrontal cortex and occipital cortex respectively.
Anterior and Superior Thalamic Radiations (ATR, STR): The anterior and superior thalamic radiations connect the thalamus to the frontal lobe and pre-/post-central gyrus respectively. The anterior thalamic radiation is seeded using a coronal mask through the anterior part of the thalamus (Wakana et al., 2007) with coronal target mask at the anterior thalamic peduncle. In addition, the exclusion mask contains an axial plane covering the base of the midbrain, a coronal plane preventing leakage via the posterior thalamic peduncle and a coronal plane preventing leakage via the cingulum. A coronal stop mask covers the posterior part of the thalamus, extending from the base of the midbrain to the callosal sulcus. The superior thalamic radiation is seeded using a mask covering the whole thalamus and a target axial plane covering the superior thalamic peduncle. An axial plane is used as a stop mask ventrally to the thalamus. The exclusion mask includes two coronal planes, anterior and posterior to the target, to exclude tracking to the prefrontal cortex and occipital cortex respectively.
Acoustic Radiation (AR): The acoustic radiation connects the medial geniculate nucleus (MGN) of the thalamus to the auditory cortex. It was seeded from the transverse temporal gyrus with a target covering the MGN of the thalamus. The exclusion mask consists of two coronal planes, anterior and posterior to the thalamus, and an axial plane superior to the thalamus. In addition, the exclusion mask contains the brainstem and a horizontal region covering the optic tract.
Optic Radiation (OR): The optic radiation consists of fibres from the lateral geniculate nucleus (LGN) of the thalamus to the primary visual cortex. It was seeded in the LGN and the target mask consisted of a coronal plane through the anterior part of the calcarine fissure. Exclusion masks consisted of an axial block of the brainstem, a coronal block of fibres directly posterior to the LGN to select fibres that curl around dorsally, and a coronal plane anterior to the seed to prevent leaking into longitudinal fibres.

Adjustments for the macaque brain
Although the protocols described above are such that they allow for equivalent definitions in the macaque brain, some adjustments were required to ensure anatomical accuracy. For all macaque protocols, the reverse-seeding method was used, as this was found to increase robustness in the resulting tracts. In addition, the AF and MdLF protocols were adjusted to reflect the macaque brain. In the case of the AF, a seed is placed in the caudal STG, a target directly above the principal sulcus extending posterior to 8Ad (based on the tract-tracing data of (Schmahmann and Pandya, 2009)). In addition, a target placed in the caudal STG, immediately inferior and posterior to the seed ensured tracking occurred via caudal end of the lateral fissure. For the MdLF, a single axial plane in the posterior part of the STG was used as a target.

Data and Preprocessing
To assess robustness across varying data quality, we utilised very high-quality diffusion MRI data from the Human Connectome Project (HCP)  and data from the UK Biobank , which have overall quality closer to the one typically available through clinical scanners. To ensure generalisability of the protocols across species, we also utilised diffusion MRI data from the macaque brain. This data consisted of an extended set of animals used in (Eichert et al., 2019a;Mars et al., 2018b). In total, the datasets we considered consist of 1065 subjects from the HCP (all available HCP S1200 subjects that had diffusion MRI data), 1000 subjects from the UK Biobank and ex vivo high-resolution datasets from 6 macaques. For the HCP data, we removed 44 subjects with identified anatomical abnormalities from the statistical comparisons and group atlases (see the HCP quality control website for details 2 ), leaving us with a total of 1021 subjects from the 1065 subjects with diffusion data.
For both the HCP and UK Biobank, we utilised the pre-processed dMRI data (see Sotiropoulos et al., 2013) and (Alfaro-Almagro et al., 2018;Miller et al., 2016) for full descriptions respectively). Briefly, the HCP data have been acquired in a bespoke 3T Connectom Skyra (Siemens, Erlangen) with a monopolar diffusion-weighted (Stejskal-Tanner) spin-echo EPI sequence, an isotropic spatial resolution of 1.25mm, three shells (b-values=1000, 2000 and 3000 s/mm 2 ) and 90 unique diffusion directions per shell, acquired twice. The UK Biobank data have been acquired in a clinical 3T Skyra (Siemens, Erlangen), consist of two shells (b-values=1000 and 2000 s/mm 2 ) and 50 diffusion directions per shell, with an isotropic spatial resolution of 2mm. In both cases, data were motion, susceptibility distortion and eddy current distortion corrected (Andersson et al., 2003;Andersson and Sotiropoulos, 2016). Nonlinear transformations to standard space (MNI152) were obtained using the respective T1-weighted images, to which the distortion-corrected diffusion MRI data were also linearly registered. Concatenation of the diffusion-to-T1 and T1-to-MNI transforms allowed diffusion-to-MNI warp fields to be obtained.

Fibre orientation estimation and tractography
The crossing fibre model described in (Jbabdi et al., 2012) was applied to the diffusion data and used to estimate orientations to inform tractography. This is a parametric spherical deconvolution model that accounts for the non-monoexponential decay of the dMRI signal with higher b-values. Up to three fibre orientations were estimated in each voxel along with their uncertainty. The "XTRACT" toolbox read the standard space tractography protocols and performed probabilistic tractography (Behrens et al., 2007). As discussed before, tractography protocols were defined for each bundle using a unique combination of seed, target, exclusion and stop masks, along with a seeding strategy (see Table 1). A number of default tractography termination criteria were also used in all protocols (curvature threshold: ±80 degrees, max streamline steps: 2000, subsidiary fibre volume threshold: 1%, loopchecking and termination). A step size of 0.5mm and 0.2mm were used for human and macaque tractography respectively. As shown in Figure 1, the masks were warped to the subject's native space and after tractography, the tractography results are directly resampled to standard space. The resultant distributions are normalised with respect to the total number of valid streamlines generated.
In order to obtain tract atlases, in the form of population percentage overlap, we binarised each normalised path distribution at a threshold value. The binary masks were then cohort-averaged to give the percentage of subjects for which a given tract is present at a given voxel.

Connectivity blueprints
The estimated bundles were further used to estimate maps of "cortical termination" for each tract in consideration, using connectivity blueprints (Mars et al., 2018b). Specifically, a white/grey matter boundary (WGB) x tracts matrix CB was reconstructed for each subject. This was achieved by seeding from every WGB location and counting the number of visitations to the whole WM, giving a WGB x WM connectivity C1 matrix. The tracts obtained using the tractography protocols were vectorised and concatenated into a single WM x tracts C2 matrix. Multiplying the two matrices provides a connectivity "blueprint", i.e. a CB=C1xC2 (WGB x tracts) matrix. Columns of this matrix represent the termination points of the corresponding tract on the WGB surface, while rows illustrate the connectivity pattern of each cortical location (i.e. how each tract contributes to the overall connectivity of each cortical location). This process was performed for the HCP subjects and the macaque datasets. The results were then cohort-averaged to produce connectivity blueprint atlases.

Assessing tract lateralisation
In order to demonstrate whether our protocols produce tracts representative of the anatomical expectations, we investigate tract lateralisation using a large number of subjects. Based on the literature, it is expected that AF is left-lateralised (Eichert et al., 2019b;Nowell et al., 2016;Panesar et al., 2018), IFO, MdLF and SLF3 are right-lateralised (Hau et al., 2016;Howells et al., 2018;Menjot de Champfleur et al., 2013;Thiebaut de Schotten et al., 2011a;Zhao et al., 2016), while SLF1 is expected to be non-lateralised (Thiebaut de Schotten et al., 2011a). The literature suggests that the SLF2 is right-lateralised, however, findings are less conclusive as in some cases the reported lateralisation does not reach significance (Hecht et al., 2015;Thiebaut de Schotten et al., 2011a).
We assessed tract lateralisation using tract volume. Specifically, lateralisation (L) was calculated as the relative right-left volume (Vr and Vl) difference, after binarising the normalised tracts at 0.5% and taking the voxel count, i.e. L=(Vr-Vl)/(Vr+Vl).
Furthermore, we explored inter-hemispheric differences in the cortical termination maps. A connectivity blueprint CBL, including only the left-hemisphere tracts/columns and a CBR, including only the right-hemisphere tracts/columns, were obtained. Both matrices were row-normalised, so that the sum of all elements in each row was equal to 1. Subsequently, we calculated the Kullback-Leibler (KL) divergence (a measure of dissimilarity) between every pair of (CBR, CBL) rows. We assessed the right-left similarity in connectivity patterns in every hemispheric location i using the minimum KL-divergence value obtained between all possible pairs, i.e. min(CBRi, CBLj), with j spanning all WGB locations.

Respecting similarities stemming from twinship
Whilst we aimed for the automated tractography protocols to be robust against data quality, be reproducible and generalisable between species, we further tested whether they could respect features stemming from the inherent individual variability in WM anatomy across subjects. To demonstrate this, we explored the similarity of tract reconstructions within twin and non-twin sub-groups in the HCP cohort. We anticipate that monozygotic twin pairs will illustrate larger similarities than dizygotic twins and non-twin siblings, and subsequently than unrelated subject pairs, in line with the literature on the heritability of structural connections (Bohlken et al., 2014;Jansen et al., 2015;Shen et al., 2014) and as may be expected from the literature on sulcal similarities in twinship (Amiez et al., 2019). Using the 72 pairs of monozygotic twins available in the HCP cohort, 72 randomly chosen pairs of dizygotic twins, 72 randomly chosen pairs of non-twin siblings and 72 randomly chosen pairs of unrelated subjects, correlations between tract reconstructions were performed and used to assess whether our automated protocols respect the underlying tract variability across individuals.

Respecting individual differences due to atypical anatomy
We excluded a small number of subjects from the HCP cohort-derived atlases, due to identified anatomical abnormalities, following the HCP quality control recommendations. However, we explored the performance of our tractography protocols in a number of these cases, and particularly the ability to handle atypical anatomical features.

WM Tract and WGB termination atlases of the human brain
We applied the prescribed tractography protocols to ~1000 HCP and 1000 UK Biobank subjects. Figure 2 presents the tract atlases, obtained from the HCP datasets. To obtain these atlases, the subject-specific MNI-transformed tracts were binarised and subsequently averaged to produce the population percent coverage for every tract, shown in the Figure. The atlases obtained using the UK Biobank data are shown in Supplementary Figure 1.
Connectivity blueprints were also derived for each HCP subject and averaged to obtain an atlas. Examples of columns of this average connectivity blueprint across all HCP subjects are shown in Figure 3, representing atlases of termination points of each tract on the whitegrey matter boundary (WGB) surface.
We also investigated the effect of sample size on tract atlas creation. For each of the HCP and UK Biobank cohorts, we produced tract atlases with increasing numbers of subjects and correlated, tract-wise, each set of atlases to a 1000-subject atlas set. Supplementary Figure 2 shows the distributions of the tract-wise correlations for each of the sample size atlases. The top plot includes an atlas set with a sample size of 10 subjects which, whilst already showing high correlations, performs relatively poorly compared to using a sample size of 100 or greater.

Robustness against datasets
To explore robustness against varying data quality, we compared tract atlases and inter-subject variability of the tract reconstructions within and across cohorts. To compare atlases, each tract from the HCP atlas set was correlated with its corresponding UK Biobank tract atlas (population threshold of 30% applied to each tract atlas). The average correlation across tracts was 0.80 (standard deviation = 0.07).
Inter-subject correlations were obtained by correlating random subject pairs within and across cohorts. To avoid possible family structure-induced bias in the HCP, we restricted our subjects to the 339 unrelated subjects. We matched the number of subjects in the UK Biobank data by randomly selecting 339 (gender matched) subjects. The across-cohort comparison was made by correlating a random subject in the unrelated HCP subject pool with a random UK Biobank subject, giving a distribution of 339 correlations per tract. Withincohort comparisons gave average correlation values of 0.52 (standard deviation = 0.09) and 0.54 (standard deviation = 0.09) for the HCP and UK Biobank respectively, with no significant difference in the within-cohort correlations for the two cohorts (p = 0.06 -Mann-Whitney U test) (Figure 4). Across-cohort comparison gave an average correlation of 0.41 (standard deviation = 0.10) which is lower than within-cohort comparisons (p = 2x10^-5 and 3x10^-7), yet it is comparable enough, particularly given the age difference of subjects in the two cohorts (HCP:22-35 years old, UK Biobank: 40-69 years old, mean age for our chosen subjects in HCP = 28.6 (standard deviation = 3.7) and UK Biobank = 62.6 (standard deviation = 7.5)).

Generalisation across species -WM Tract and WGB termination atlases of the macaque brain
We repeated tractography as described above and obtained atlases from the macaque cohort. Our tractography protocol definitions are such that they allow the extraction of homologous tracts in both the human brain and macaque brain.
Tract atlases in the form of population percent were generated and are shown in Figure 5, while Figure 6 shows the averages of the connectivity blueprints derived using the macaque data. Figure 7 illustrates tract-lateralisation estimates for a subset of tracts in the human brain, for which lateralisation has been previously reported in the literature. As shown in Figure 7, the AF is left-lateralised and IFO, MdLF and SLF3 are right-lateralised in both the HCP and the UK Biobank cohorts. SLF1 is symmetric in the HCP cohort but reaches rightward significance in the UK Biobank cohort. SLF2 is also variable across cohorts with leftlateralisation in the HCP and right-lateralisation in the UK Biobank.

Tract Lateralisation
In addition to volume-based measures of lateralisation, inter-hemispheric differences on connectivity patterns were also assessed on the WGB surface using the tracts-derived connectivity blueprints. In an approach similar to (Mars et al., 2018b), Kullback-Leibler (KL) divergence was calculated to explore connectivity similarity between the two hemispheres of the human brain. For every location on the right hemisphere surface, the minimum KL divergence value assesses the most similar connectivity pattern on the left hemisphere. In doing so, we can probe cortical locations that demonstrate dissimilar connection patterns between left and right hemispheres and assess which tracts are contributing to these dissimilarities. In areas of high minimum KL divergence, we would expect to observe differences in the tract contribution profiles between the corresponding vertices. Figure 8a shows the minimum KL divergence values obtained for all WGB surface locations, overlaid with a subset of the Glasser parcellation (Glasser et al., 2016). Regions of high left-right dissimilarity in connectivity patterns were generally confined to frontal and temporo-parietal junction (TPJ) regions. Figure 8b shows examples of the tract contribution to the connection pattern of specific WGB locations on the right hemisphere and how these compare with the connection patterns of the best matching location on the left hemisphere. Three examples are shown corresponding to varying degrees of inter-hemispheric dissimilarity. It can be seen that high dissimilarity was mediated by tracts that were found to be lateralised. For example, inter-hemispheric differences for a selected voxel in IFSa were primarily driven by differences in how the lateralised SLF3 contributes to its connectivity pattern. For a mid-range dissimilarity vertex, selected in the temporo-parietal-occipital junction (TPOJ1), we can see that small inter-hemispheric divergence was driven primarily by the AF and SLF3. Conversely, regions of low dissimilarity, such as the fourth visual area (V4), show little inter-hemispheric difference in the tracts contributing to the connectivity profile.
In addition to exploring the similarity of tract reconstructions in twins, we further investigated whether the automated tractography respected individual variability in the case of anatomical abnormalities. A subset of subjects with gross anatomical abnormalities were identified using the HCP quality control. Figure 10 gives examples of these subjects and highlights the difference between the average tracts (as provided by the atlas) and the individual subject tractography results, which reflect the presence of cavernomas and cysts in WM.

Discussion
We have presented a new toolbox (XTRACT) for automated probabilistic tractography along with standardised protocols for extracting white matter bundles in the human and the macaque brain. We have demonstrated that the protocols are robust when applied to data of varying image quality and to data from a non-human primate species. We have generated human WM tract atlases using an order of magnitude more data than previous efforts, as well as macaque atlases using a small number of, however high-quality ex vivo, datasets. We have performed indirect validation illustrating that reconstructed tracts are left/right asymmetric, when they are expected to be based on prior literature. We have also shown that despite automatically generating tracts using standard-space protocols, the protocols respect the underlying individual variability, as reflected in twinship-induced similarities and in respecting anatomical abnormalities. The toolbox, tractography protocols and atlases are freely and openly available as a part of FMRIB's software library (FSL) (version 6.0.2 and later).
A current issue in the field of tractography is that protocol definitions in journal publications often lack detail or are designed without data-sharing in mind. Solutions have been proposed to resolve this issue, for example, the white matter query language (Wassermann et al., 2016). Here, we offer a platform for direct sharing of standardised protocol masks and tract atlases. Moreover, we made the protocol definitions generalisable across species to directly facilitate comparative anatomy studies.
We reconstructed tracts using imaging datasets of different quality and we generated atlases for both the Human Connectome Project (HCP) and the UK Biobank cohorts. Comparisons of tract reconstructions within and across the human cohorts demonstrate that the method and protocols are robust across subjects and against data quality. The HCP and UK Biobank cohorts provide examples of high-quality data and more typical quality data respectively. Within cohort comparisons reveal similar inter-subject tract correlations across the varying quality data, with greater inter-subject tract correlations observed in the UK Biobank. This may reflect a reduced level of detail in the lower resolution UK Biobank data compared to the HCP data, but also differences in the mean age of subjects in the two cohorts. In addition, we have generated atlases using a smaller cohort of macaques. To compensate for the small number of subjects, we used high-quality and high-resolution ex vivo data. The respective results demonstrate the generalisability of our method to the macaque brain. Recent efforts to obtain macaque data from larger cohorts (HCP-style protocols) are ongoing (Autio et al., 2019;Milham et al., 2018) and our tools will be a useful resource for these new initiatives for the non-human primate brain (Thiebaut de Schotten et al., 2019).
As a means to indirectly validate our results, we investigated left-right tract lateralisation. We compared our lateralisation results to a priori knowledge from the literature. For both human cohorts (HCP and UK Biobank), we found that reconstructed AF is strongly left-lateralised, while SLF3, IFO and MDLF were right-lateralised, as expected from the literature (Eichert et al., 2019b;Hau et al., 2016;Hecht et al., 2015;Nowell et al., 2016;Panesar et al., 2018;Thiebaut de Schotten et al., 2011a;Zhao et al., 2016). Results were less clear-cut for SLF1 and SLF2, where prior studies (with much fewer numbers of subjects) are inconclusive (Hecht et al., 2015;Howells et al., 2018;Thiebaut de Schotten et al., 2011a). This may be due to the large variance observed (in the case of SLF2), perhaps reflecting some underlying interaction, such as handedness (Howells et al., 2018).
We performed further sanity checks by investigating lateralisation using the connectivity blueprint that we obtained from the reconstructed tracts. By using the KL divergence between connectivity patterns to assess inter-hemispheric dissimilarity, we identified that regions associated with language, known to be lateralised (Hiscock and Kinsbourne, 2008), have dissimilar connectivity patterns across the two hemispheres ( Figure  8a).
Whilst being a robust automated method for the consistent reconstruction of tracts, our method also respected the underlying anatomical variation. We demonstrated this by assessing inter-subject tract similarity in monozygotic twins, dizygotic twins, non-twin siblings and unrelated subject pairs. Our results show greater similarity in twin pairs compared to unrelated pairs, as would be expected from the heritability literature (Bohlken et al., 2014;Shen et al., 2014). We further demonstrated that the automated method respects underlying anatomical variation by exploring how tractography results differ from the cohort-averaged results in the case of subjects with anatomical abnormalities.

Conclusions
In conclusion, we have developed and demonstrated a set of robust and standardised tractography protocols for cross-species automated delineation of white matter bundles, along with a platform to use them. The demonstrated toolbox (XTRACT) is freely available along with the tractography protocols and human/macaque tract atlases as a part of FMRIB's software library (FSL version 6.0.2 and later). Given the benefits with regards to data and protocol sharing, we expect that this toolbox will aid reproducibility in the field of tractography and facilitate comparative neuroanatomy studies.  Figure 1. Schematic of the stages for automated tractography, as implemented in the "XTRACT" toolbox, with an example of the left arcuate fasciculus (AF) for the human brain. 1) Tractography protocol masks are defined in standard space with seed (green), exclusion (black), waypoint (blue) and termination (orange) masks (see the "Protocols" section for full details of definitions).

Figures
2) The protocol masks are warped to the subject's native space using the subject-specific non-linear warp fields. 3) Probabilistic tractography is performed in the subject's native space using the crossing fibre modelled diffusion data. Notice that results are mapped directly into standard space, leading to a single interpolation step. 4) The resultant tract stored in standard space, overlaid on the FSL_HCP1065 FA atlas.        ) with the greatest similarity, it is possible to investigate how differences in tract contribution to location connectivity contribute to divergence. Black lines correspond to the tract contributions to the vertex on the right hemisphere and blue lines for the left hemisphere. For example, in V4 (middle), the minimum KL-divergence is small which is reflected by the almost identical underlying tract contributions. Regions with mid-to high-range dissimilarity -TPOJ1 (top) and IFSa (bottom) -are seen to have greater differences in their underlying tract contributions, primarily driven by differences in AF and SLFs. Figure 9. Twin/non-twin WM tract similarity using 72 subject pairs per group. Correlations are performed on normalised tract density maps with a threshold of 0.5%. µ is the group median across tracts and subjects and is the standard deviation. A Kruskal-Wallis test is used to determine whether the groups come from the same median: χ 2 = 13.1, p = 0.0043, mean ranks = 104.1 (monozygotic), 85.3 (dizygotic), 82.9 (non-twin siblings) and 65.7 (unrelated).