Temporal Diffusion Ratio (TDR) for imaging restricted diffusion: Optimisation and pre-clinical demonstration

Temporal Diffusion Ratio (TDR) is a recently proposed dMRI technique (Dell’Acqua et al., proc. ISMRM 2019) which provides contrast between areas with restricted diffusion and areas either without restricted diffusion or with length scales too small for characterisation. Hence, it has a potential for informing on pore sizes, in particular the presence of large axon diameters or other cellular structures. TDR employs the signal from two dMRI acquisitions obtained with the same, large, b-value but with different diffusion gradient waveforms. TDR is advantageous as it employs standard acquisition sequences, does not make any assumptions on the underlying tissue structure and does not require any model fitting, avoiding issues related to model degeneracy. This work for the first time introduces and optimises the TDR method in simulation for a range of different tissues and scanner constraints and validates it in a pre-clinical demonstration. We consider both substrates containing cylinders and spherical structures, representing cell soma in tissue. Our results show that contrasting an acquisition with short gradient duration, short diffusion time and high gradient strength with an acquisition with long gradient duration, long diffusion time and low gradient strength, maximises the TDR contrast for a wide range of pore configurations. Additionally, in the presence of Rician noise, computing TDR from a subset (50% or fewer) of the acquired diffusion gradients rather than the entire shell as proposed originally further improves the contrast. In the last part of the work the results are demonstrated experimentally on rat spinal cord. In line with simulations, the experimental data shows that optimised TDR improves the contrast compared to non-optimised TDR. Furthermore, we find a strong correlation between TDR and histology measurements of axon diameter. In conclusion, we find that TDR has great potential and is a very promising alternative (or potentially complement) to model-based approaches for informing on pore sizes and restricted diffusion in general.


Introduction
Characterising neural tissue structure at the microscopic scale can provide important information regarding development, plasticity, ageing, as well as inform on the impact of various diseases that affect the central and/or peripheral nervous systems ( Lawrence et al., 2021 ;Tian and Ma, 2017 ;Dubois et al., 2014 ;Sexton et al., 2014 ;Lebel et al., 2010 ).For example, axon diameter, along with myelin content, is an important property that influences the speed and efficiency of neural communication ( Hursh, 1939 ).Mapping axon diameter can large balloon cells are present in focal cortical dysplasia ( Taylor et al., 1971 ).
Because of these relationships between tissue microstructure and function in the nervous system, techniques for mapping restricted diffusion in the brain and the corresponding characteristic length-scales are of high potential clinical significance.Non-invasive techniques for measuring brain microstructure, such as those developed using diffusionweighted MRI (dMRI) are especially of interest, as they provide clinically relevant information whilst obviating the need for invasive biopsy and associated risk.dMRI sensitises the signal to the displacement of the water molecules in the tissue and thus provides indirect information about the microscopic tissue organisation ( Baliyan et al., 2016 ;Drake-Pérez et al., 2018 ;Le Bihan and Iima, 2015 ) on a scale much smaller than the voxel size.Since dMRI techniques can report on representations of diffusion, to enhance specificity, modelling strategies can be employed -under strict assumptions -to inform in a quantitative way on the underlying microstructure.Towards this goal, several methods have been developed in the past to characterise restricted diffusion in tissue and to map cellular sizes, both in white matter (WM) and grey matter (GM).
In white matter, mapping axon diameter has been the focus of several dMRI studies over the last decades ( Stanisz et al., 1997 ;Assaf et al., 2008 ;Alexander et al., 2010 ;Harkins et al., 2021 ;Veraart et al., 2020 ;Kakkar et al., 2018 ;Huang et al., 2020 ;Duval et al., 2015 ;Grussu et al., 2019 ;Sepehrband et al., 2016 ;Fan et al., 2020 ;Barazany et al., 2009 ;Zhang et al., 2011 ;Ong and Wehrli, 2010 ;Shemesh, 2018 ;Xu et al., 2014 ;Bar-Shir and Cohen, 2008 ;Shemesh et al., 2015 ).Some approaches employ biophysical models of the tissue, for example, representing intra-axonal signal as diffusion restricted in infinitely-long non-permeable cylinders, and extra-axonal signal as Gaussian hindered diffusion ( Assaf et al., 2008 ;Alexander et al., 2010 ).Then, the tissue model is fitted to a rich dMRI acquisition, usually with several diffusionweighting amplitudes ('shells') and diffusion times for each shell, to estimate summary statistics drawn from the apparent axon diameter distributions.Another recently proposed approach is to use diffusion measurements with ultra-high diffusion weighting (e.g.b > 6 ms/ m 2 ) to map axon diameter sizes based on the departure from the power law expected for diffusion inside cylinders with negligible diameters (i.e.'sticks'), however this approach requires at least one b-value much greater than 10 ms/ m 2 , leading to lower resolution data on account of longer TEs and the lower SNR at these high b-values ( Veraart et al., 2020 ).To improve sensitivity to axon diameter ( Drobnjak et al., 2016 ), other approaches go beyond the conventional single diffusion encoding acquisitions and also use oscillating gradient waveforms that can access much shorter diffusion times, either included in a modelling framework ( Kakkar et al., 2018 ;Siow et al. ) or used to contrast measurements with different diffusion times and frequencies ( Harkins et al., 2021 ;Does et al., 2003 ;Wu et al., 2019 ).It has been shown in multiple works ( Drobnjak et al., 2016 ;Nilsson et al., 2017 ;Paquette et al., 2021 ) that measurements of axon diameters and/or diameter distributions are extremely challenging, regardless of the dMRI sequence used, especially when considering standard gradient amplitudes available on clinical scanners.The biggest challenge is the low inherent sensitivity of the signal towards axons of small diameter, with the vast majority of contrast being produced by large axons from the tails of most biological axonal distributions ( Drobnjak et al., 2016 ;Nilsson et al., 2017 ).
In grey matter, mapping cell soma sizes is a more recent endeavour in dMRI, with several studies showing the benefits of including cell soma in microstructure models ( Palombo et al., 2020 ;Ianus et al., 2021 ;Schiavi et al., 2022 ;Afzali et al., 2020 ), suggesting mapping restricted spaces such as cell soma is a good additional target for new microstructure imaging techniques.However, it has also been recently shown that, compared to WM, GM may be characterised by faster multi-compartmental exchange of water molecules ( Jelescu et al., 2022 ;Olesen et al., 2022 ;Olesen et al., 2022 ), which represents an extra chal-lenge for the biophysical modelling of dMRI measurements and the unbiased estimation of restricted diffusion in GM.
Furthermore, studying the time dependence of the diffusion signal, as well as of estimated parameters such as diffusion and/or kurtosis tensors, can provide additional information about the tissue characteristics and inform about the type of diffusion processes.For instance, measuring the diffusion time dependence of the diffusion and kurtosis tensors can help to differentiate between the effects of restricted diffusion and inter-compartmental exchange ( Jelescu et al., 2022 ;Olesen et al., 2022 ;Olesen et al., 2022 ;Reynaud, 2017 ;Lampinen et al., 2021 ;Assaf et al., 2000 ;Jespersen et al., 2018 ;Lee et al., 2020 ;Aggarwal et al., 2020 ;Xu et al., 2016 ), with various applications for brain and body imaging.Although of great interest, characterising both the time dependence and b-value dependence (e.g. to estimate diffusion and kurtosis tensors) requires many measurements and cannot currently be performed in a practical amount of time for clinical applications.
A common feature of the vast majority of dMRI techniques developed for characterising both WM and GM microstructure is the use of biophysical models to describe the relationship between the measured dMRI signals and the underpinning tissue microstructure.Imaging markers of histologically relevant features, such as axon diameter, are then inferred by fitting such biophysical models.Although very successful for many applications ( Alexander et al., 2019 ;Novikov et al., 2018 ), this paradigm has some major limitations: e.g. it relies on strong assumptions on the relevant features characterising the underlying tissue structure, the models used are often overly simplistic, and it is prone to ambiguities of results interpretation due to inherent model degeneracies.Other research also exists, which aims to avoid modelling approaches and use diffusion time dependence to characterise tissues while keeping other diffusion parameters the same, such as the studies on diffusion dispersion rate ( Xu et al., 2016 ), ΔfADC ( Wu et al., 2014 ), ΔD ⊥ ( Harkins et al., 2021 ), and SSIFT ( Devan et al., 2022 ).However, these studies do not use large b-values and hence the signal is sensitive to both restricted and non-restricted tissues making it hard to apply for mapping axon or soma diameters.
To bypass some of these limitations, Dell'Acqua et al. ( 2019) recently in a preliminary study presented at ISMRM 2019, introduced the Temporal Diffusion Ratio (TDR), a model-free dMRI approach to characterise restricted diffusion inside large axons using two b-value shells with different gradient timings.Specifically, TDR employs dMRI measurements with a very high b-value (above 7 ms/ m 2 ) to suppress fast and assumingly extra-axonal diffusion, and contrasts the signal from two shells at the same b-value obtained with different diffusion times and gradient settings.The advantages of TDR are that it employs standard PGSE diffusion sequences widely accessible on commercially-available MRI systems without any specialist programming, that it does not make any assumptions on the underlying tissue structure, and that it does not require any model fitting, avoiding issues related to model degeneracy ( Paquette et al., 2021 ;Jelescu et al., 2016 ).
In this paper we introduce and validate for the first time the unique potential of the TDR contrast to inform on restriction in tissue, building on the preliminary work by Dell'Acqua et al. (2019) .We optimise the TDR methodology and demonstrate it in a pre-clinical study.We extend the TDR approach from characterising cylindrical restrictions (e.g.axons, as it was originally introduced in Dell'Acqua et al. ( 2019) ), to include both cylindrical and spherical restrictions (e.g., neurites and cellular bodies), allowing us to understand the TDR values observed in a wide range of tissue types.We employ simulations to optimise the gradient shapes to increase the TDR sensitivity to a wide range of axon diameter distributions, modelled as cylinders, as well as cell body sizes, modelled as spheres.We also investigate the effect of the Rician noise floor on the TDR values and present a strategy for maximising TDR contrast by using a subset of the full gradient directions.Finally, we demonstrate the optimised TDR strategies as well as the relationship between TDR and axon diameter using ex-vivo rat spinal cord data.

TDR approach
TDR is computed by contrasting two spherically-averaged dMRI signals with the same b-value, each collected with a different set of diffusion gradient parameters (e.g.diffusion time, gradient strength), following the expression: where the sequence parameters used to generate S 1 and S 2 are chosen so that S 2 > S 1 for restricted diffusion.Note that, given that the sequences have the same b-value, Gaussian diffusion would result in equal signal values and no TDR contrast, while in restricted diffusion (and any time dependent process) the specific values of S 1 and S 2 , as well as TDR contrast, depend on the pore size and diffusion time.In the original implementation of TDR, the two shells are acquired using Single Diffusion Encoding (SDE) sequences, with fixed and equal gradient pulse duration , and different diffusion times Δ, one short and one long ( Dell'Acqua et al., 2019 ).Then, the normalised spherical mean diffusion signals, S mean , are calculated either averaging over gradient directions uniformly sampled over a sphere (e.g. using a HARDI acquisition), or using spherical harmonics keeping order zero.TDR contrast is then calculated based on the spherical mean data, to diminish the effect of fibre orientation distribution: ( Callaghan et al., 1979 ) Here, we calculate the spherical mean signals S 1 mean and S 2 mean by averaging across gradient directions, and hence get TDR as following: where N is the total number of uniformly distributed gradient directions in the HARDI acquisition,  1 , is the signal acquired with the i th gradient direction and gradient waveform used to create  1 , while  2 , is the signal acquired with the i th gradient direction and gradient waveform used to create  2 .
To illustrate the rationale behind the TDR contrast, originally proposed for characterising white matter tissue, we consider a simple tissue model consisting of two compartments: intra-axonal signal is modelled as restricted diffusion inside cylinders and contributes to TDR contrast, and extra-axonal signal is modelled as Gaussian diffusion ( Veraart et al., 2019 ;Skinner et al., 2015 ) and does not contribute to the TDR contrast.
Fig. 1 shows the signal attenuation and TDR values for spins restricted inside a cylinder as a function of diameter when the diffusion gradients are perpendicular to the fibres.The signals S 1 and S 2 correspond to sequences with the same b-value (large enough to attenuate extra-cylindrical diffusion) but different gradient waveforms, each yielding a different attenuation for restricted diffusion.For the range of sizes considered here, the difference between S 2 and S 1 , and thus the TDR contrast (right) increases with the cylinder diameter, here proxy for the axon diameter.
In reality, each white matter voxel contains a range of axon diameters that contribute to the signals S 1 and S 2 .In this case the TDR contrast is effectively the integral over the distribution of axon diameters.Hence, given that the difference between S 1 and S 2 increases with axon diameter, TDR contrast will be higher in voxels containing many large axons.In other words, the TDR contrast is mostly sensitive to large axons and therefore has the potential to map regions where there are larger axons.
In this study, we propose to optimise the dMRI gradient waveform to maximise TDR contrast.The principle driving our optimization is that the best contrast (i.e.maximum TDR value) is found for the smallest S 1 and largest S 2 values, for a given b value and restriction size.Hence, we perform numerical simulations to find the actual optimal gradient waveform that maximises TDR contrast given specific hardware constraints for different substrates.We also analyse the TDR contrast in spherical pores, that have been included as a signal compartment when modelling grey matter ( Palombo et al., 2020 ) as well as other tissue types, for instance tumours ( Panagiotaki et al., 2014 ;Roberts et al., 2020 ) .

Simulations
In this study we perform simulations to design optimised SDE acquisitions that maximise TDR contrast.In the first set of simulations we optimise the gradient waveform, then we explore the intensity and the potential of TDR as an imaging contrast.
Simulations are generated using the Microstructure Imaging Sequence Simulation Toolbox (MISST, version 0.93) ( Ianu ş et al., 2016 ), a diffusion MRI simulator that calculates the signal attenuation for various restricted geometries and different gradient waveforms using the matrix formalism ( Callaghan, 1997 ;Drobnjak et al., 2010 ;Drobnjak et al., 2011 ).Code for MISST is available at http://mig.cs.ucl.ac.uk/index.php?n = Tutorial.MISST .Code written by the authors for the optimisation and analysis is also available and can be found at the link in the footnote.3

Simulation 1: optimising gradient waveforms for TDR
In our first simulation, we optimise the gradient waveform for each of the two SDE sequences to maximise TDR contrast.We consider substrate configurations for two different geometries: (a) distributions of parallel impermeable cylinders (mimicking axons), and (b) distributions of impermeable spheres (mimicking cell bodies), illustrated in Fig. 2 .For each of these two geometries, we consider two diameter distributions, one large and one small.For cylinders we use large and small gamma distributions to mimic axons typically found in the spinal cord ( Grussu et al., 2019 ;Duval et al., 2019 ) (mean = 5.33 m, std = 3.00 m) and corpus callosum ( Grussu et al., 2019 ;Aboitiz et al., 1992 ) (mean = 1.93 m, std = 0.81 m), respectively.For spheres we use large ( Rajkowska et al., 1998 ;Palombo et al., 2021 ) (mean = 15 m, std = 0.5 m) and small ( Palombo et al., 2021 ;Lackey et al., 2018 ;Houston et al., 2017 ) (mean = 7 m, std = 0.5 m) normal distributions to mimic cell somas typically found in mammalian brains.The intrinsic diffusivity for all pores is set to 2 m 2 /ms, a value typically used for tissue at 37 °C and close to the parallel diffusivity associated with the intra-axonal compartment in ex-vivo spinal cord ( Olesen et al., 2021 ).The signal from extra-axonal space, as well as any exchange effects are assumed to be negligible for the TDR contrast.
We assume the same b-value for both SDE sequences in order to ensure the diffusion contrast from Gaussian diffusion is the same.We set b to a very high value, specifically b = 8 ms/ m 2 as in the initial work introducing TDR ( Dell'Acqua et al., 2019 ), in order to eliminate the signal that comes from the extra-axonal space and free water.For comparison, in the Supplementary Material, we additionally include results for optimisations performed at b = 20 ms/ m 2 ( Veraart et al., 2020 ), which ensures an even higher signal attenuation in the extra-axonal space.TDR is calculated from normalised diffusion signals averaged over 60 gradient directions uniformly sampled over a sphere as in the original implementation of TDR ( Dell'Acqua et al., 2019 ).
Optimisation is done using a range of simulations performed with MISST and spanning a large space of sequence parameters, for four high performance animal and human scanner hardware constraints: (1) G < 600 mT/m and Δ+  < 45 ms corresponding to a typical pre-clinical scanner; (2) G < 2700 mT/m, Δ+  < 45 ms corresponding to a highgradient pre-clinical system; (3) G < 300 mT/m, Δ+  < 80 ms corresponding to the (human MRI) Connectom scanner (Siemens Healthineers); 4) G < 80 mT/m, Δ+  < 80 ms corresponding to a high performance clinical scanner.Optimisation is automated using the interior  point algorithm as implemented in the MATLAB fmincon function, with Δ,  and G values for each of the two sequences as linked variables under investigation.The interdependence of these variables means we are ultimately optimising over a 4-dimensional space [ Δ 1 ,  1 , Δ 2 ,  2 ] with Δ and  for each of the two sequences determining the appropriate G-values.Note that to keep consistent with previous TDR work and widely available pre-clinical and clinical settings, we only considered SDE waveforms (i.e.other waveforms such as double diffusion encoding, oscillating gradients or b-tensor encoding were not considered here).Moreover, the gradient slew rates are considered infinite, a good approximation for the preclinical scanner used in our experiments that has a fixed rise time of 0.1 ms (leading to slew rates up to 25,000 mT/ms).Other gradient settings for clinical scanners, as well as the effect of a finite slew rate (set to 200 mT/m to account for physiological constraints) are explored in Supplementary Material Sections S1 and S2.
We then evaluate the performance of the optimised gradient waveforms against a set of gradients chosen according to the previously published strategy for TDR ( Dell'Acqua et al., 2019 ), namely both sequences have the same, short gradient duration, and only the diffusion time is varied: S 2 has a long diffusion time and S 1 has a short diffusion time.The exact values of , Δ and G were chosen according to the scanner constraints (1-3), with the minimum possible  and Δ for S 1 and maximum possible Δ for S 2 .As these sequences were not determined based on full scale optimisation, we will refer to them as "non-optimised " waveforms/sequences.

Simulation 2: effect of restriction size on TDR
In the second simulation we use optimised sequences obtained from simulation 1 to evaluate the TDR contrast over a wide range of pore size distributions.
We consider both distributions of cylinders and spheres to model restrictions in both white matter and grey matter, with size distributions commonly found in tissue ( Grussu et al., 2019 ;Palombo et al., 2020 ;Aboitiz et al., 1992 ;Rajkowska et al., 1998 ;Palombo et al., 2021 ;Lackey et al., 2018 ;Houston et al., 2017 ).Cylinder sizes follow a gamma distribution ( Alexander et al., 2010 ;Grussu et al., 2019 ), whilst sphere sizes follow a normal distribution ( Rajkowska et al., 1998 ).For cylinders, the gamma distributions are truncated at 20 m to match realistic values from the tissue; thus, combinations of parameters where this truncation changes the mean by more than 10% are not considered.For all substrates, intrinsic diffusivity is set to 2 m 2 /ms and extra-cellular space is neglected.These simulations are run without noise to illustrate the maximum potential of TDR as an imaging contrast.

Simulation 3: effect of gradient directions and noise on TDR
In the third simulation, we investigate the effect of the Rician noise floor on TDR values.To this end, we consider different fibre configurations: one, two and three fibre bundles consisting of parallel cylinders with the same diameter distributions (Gamma distribution with mean = 5.33 m, std = 3.00 m) with separate fibre bundles crossing at right angles (in the case of two and three fibres), as well as spherical pores (Gaussian distribution with mean = 7 m, std = 0.5 m), and we simulate the signals using the optimised sequences from simulation 1.Then, for each measurement, we consider Rician noise at SNR levels of 20 and 50 in the non-diffusion weighted images, typical for clinical and pre-clinical acquisitions ( Tian et al., 2022 ;Ianu ş et al., 2022 ), as well as noise free signals, i.e.SNR ∞.Finally, we explore different numbers of gradient directions (30 and 60).
We also look at the effect of the orientation distribution of gradient directions included when calculating TDR.In the previous work ( Dell 'Acqua et al., 2019 ) S 1 and S 2 are calculated as the signal average over a set of uniformly distributed gradient directions.However, in white matter, due to the fast signal decay in certain directions (e.g.parallel to the fibres), the contribution to the direction-averaged signal of these measurements will carry little to no extra information about the axon diameter and could potentially introduce bias to the TDR estimation, due to the noise floor.Here we investigate this effect and its impact on TDR.

Preclinical experiments
All animal studies were approved by the competent institutional and national authorities, and performed according to European Directive 2010/63.
The preclinical experiments aim to investigate TDR contrasts in exvivo rat spinal cord, to validate our optimisation, both in terms of gradient waveforms as well as the number of HARDI directions employed, and the relationship between TDR and axon diameter in different WM ROIs.
The data and code from the preclinical experiments is available at https://github.com/andrada-ianus/TDR_study.git .

Data acquisition
One rat spinal cord was extracted via transcardial perfusion with 4% Paraformaldehyde (PFA).After extraction, the spinal cord was immersed in a 4% PFA solution for 24 h, and then washed in a Phosphate-Buffered Saline (PBS) solution for 24 h.Two sections of cervical spinal cord were cut and placed separately in 5 mm NMR tubes filled with Fluorinert (Sigma Aldrich, Lisbon, PT).The samples were imaged on a 16.4 T Bruker Aeon Ascend scanner (Bruker, Karlsruhe, Germany) equipped with a 5 mm birdcage coil and gradients capable of producing up to 3 T/m in all directions.

Imaging protocol
Diffusion MRI datasets for TDR were acquired using a SDE-EPI sequence with the following parameters: TE = 50 ms, TR = 2 s, 16 averages, slice thickness = 0.5 mm, 5 slices, in plane resolution = 0.09 × 0.09 mm 2 , matrix = 60 × 46, FOV = 5.4 × 4.15 mm 2 , Partial Fourier = 1.12.The EPI acquisition bandwidth was 400 kHz and data was acquired in a single shot using double sampling, with a total acquisition time of 1 h 50 m for each combination of G max and b-value.
In terms of diffusion weighting, the TDR acquisition was performed for two fixed b-values of 8 and 20 ms/ m 2 .The b-values were chosen to be similar to those proposed by Dell'Acqua ( Dell'Acqua et al., 2019 ) for TDR and by Veraart et al. (2020) for axon diameter imaging, respectively.For each b-value we consider two scenarios: (1) the sequence parameters were chosen so that the maximum gradient strength was limited to 600 mT/m, a value available on many preclinical systems, and (2) the maximum gradient strength was 2500 mT/m, close to the maximum available on this gradient system.For each scenario, we considered waveforms with the same gradient duration and different diffusion times, as originally proposed in ( Dell'Acqua et al., 2019 ), referred to as the non-optimised protocol, as well as the optimised waveforms proposed in this study.Each shell was acquired with 10 b0 values and 60 diffusion directions each, and the specific timing parameters for the non-optimised and optimised protocols are provided in Section 3.2 .
The data was acquired with an in-house implementation of the sequence which loops through the different diffusion times in order to avoid any potential signal differences caused by sequence adjustments.The sequences implemented in PV6.0.1 are available upon request.

Data analysis
Pre-processing: Complex data were denoised using the MP-PCA approach ( Veraart et al., 2016 ) following the steps described in ( Ianu ş et al., 2022 ) to account for the effect of Partial Fourier acquisition in the data, and the magnitude was computed.Then the data was normalised for each shell.

Effect of extra-axonal space and orientation dispersion
To further explore different confounding factors that might affect TDR, we employ simulations to study the effect of extra-axonal space and fibre dispersion.

Extra-axonal space
Monte Carlo simulations were performed in Camino ( Hall and Alexander, 2009 ) for the same sequences employed in the preclinical experiments and six substrates consisting of randomly packed cylinders with intra-axonal fraction of 0.6 and Gamma distributed diameters with the same mean and standard deviation as reported for the spinal cord ROIs: 4.47 ± 0.51 m (VST), 2.22 ± 0.21 m (ReST), 3.39 ± 0.47 m (RST), 3.73 ± 0.36 m (FC), 1.16 ± 0.1 m (dCST), 1.80 ± 0.13 m (FG).Simulations were run with spins distributed either uniformly (200,000 spins) or only in the intra-axonal space (120,000 spins).We fix the diffusivity to 2 m 2 /ms according to the study from Olesen et al. (2021) who estimated a diffusivity value ∼2 m 2 /ms associated with the intraaxonal compartment in the ex-vivo spinal (following the same fixing procedure as ours).We also perform simulations in more realistic substrates using the ConFIG framework ( Callaghan et al., 2020 ;Callaghan et al., 2021 ) and report these results in the Supplementary Material Section S5.

Fibre dispersion
Numerical simulations using MISST were run for a model of dispersed cylinders with a Gamma distribution of diameters and a Watson distribution of orientations ( Watson, 1965 ;Zhang et al., 2012 ).The simulations were performed for the same sequences and diameter distributions as above, and three different concentration parameters of the Watson distribution: k = 1 yields highly dispersed orientations, k = 6 yields an orientation dispersion that has been reported in WM, and k = 100 yields close to parallel cylinders.

Simulation 1: optimising gradient waveforms for TDR
This section presents the optimisation results of the TDR contrast.As described in Methods, both the first and the second shell of the "optimised protocol " were optimised to maximise TDR contrast across the whole space of possible waveform parameters such as gradient duration, diffusion time and gradient strength.The boundaries of the search space are fixed to pre-clinically achievable values and b-value is fixed.The two shells of the "non-optimised protocol " were adopted from the previously published preliminary work (Dell'Acqua et al., ISMRM 2019) and those were selected empirically and from theoretical intuition, but were not optimised.Here we evaluate which protocol provides stronger TDR contrast.
For the optimised protocol, the space of parameters we explore is typical for pre-clinical scanners, namely G < 600 mT/m and Δ+  < 45 ms -presented here.We also do optimisation for other scanner settings: G < 2700 mT/m, Δ+  < 45 ms (corresponding to a high-gradient preclinical system), G < 300 mT/m, Δ+  < 80 ms, (corresponding to the Connectome scanner), and G < 80 mT/m, Δ+  < 80 ms (corresponding to a high performance clinical scanner) -presented in Supplementary Material, Section S1.We provide optimisation results for two different b-values: b = 8 ms/ m 2 (results presented in this section and Supplementary Material Section S1) and b = 20 ms/ m 2 (results presented and discussed in Supplementary Material, Section S3).
Fig. 3 a illustrates diffusion weighted signal values averaged over 60 uniformly distributed gradient directions for G < 600 mT/m, Δ+  < 45 ms: and a range of different combinations of gradient durations and diffusion times and a substrate consisting of small cylinders.Signal values ranging from low (blue) to high (red) values are shown, and TDR is calculated for each pair.In order to find the combination of diffusion sequence parameters that maximises TDR, the optimisation looks for values for S 1 and S 2 that are most different from one another.The corresponding, optimal, pair of gradient waveform parameters (squares) has one gradient waveform with short duration and diffusion time and the other with long duration and diffusion time.The short duration waveform is the same as the short diffusion time sequence in the "non-optimised " approach (circles), while the second optimal waveform has much longer gradient duration than any sequence in the "nonoptimised " approach, where the diffusion duration is kept constant between the two waveforms and only diffusion time has changed ( Fig. 3 e).
Similar results are obtained for other substrates as well: large cylinders Fig. 3 b, small spheres Fig. 3 c and large spheres Fig. 3 d -although for spheres with large diameters the optimal duration of the long gradient sequence is slightly shorter.The exact values for all optimised waveform parameters as determined through non-linear optimisation are: • small cylinders: S 1 : Δ = 8.9 ms,  = 6.9 ms; S 2o : Δ = 28.5 ms,  = 16.5 ms • large cylinders: S 1 : Δ = 8.9 ms,  = 6.9 ms; S 2o : Δ = 31 ms,  = 14.1 ms • small spheres: S 1 : Δ = 8.9 ms,  = 6.9 ms; S 2o : Δ = 29 ms,  = 15 ms • large spheres: S 1 : Δ = 8.9 ms,  = 6.9 ms; S 2o : Δ = 35 ms,  = 9.9 ms In addition to the specific substrates shown in Fig. 3 , we have also performed the optimisation for other substrates, consisting of both cylinders and spheres.Whilst distributions that include very large pores can have different optimal sequence shapes, in all cases considered, increasing  for the S 2o shell improves the contrast compared to the nonoptimised version of TDR (data not shown).When we repeated the simulations above for b = 20 ms/ m 2 , we found extremely similar optimised sequence patterns and TDR values for the different substrates as presented in Fig. S3 in Supplementary Material.
Furthermore, we have done optimisation for a range of different hardware constraints and got very similar optimised sequence patterns (Supplementary Material Section S1).The main difference we found for different hardware constraints is that the optimal diffusion time and gradient duration both reduce as the maximum gradient strength increases, which is expected as the b-value is kept fixed at b = 8 ms/ m 2 .Furthermore, we found that the TDR values were drastically reduced as the maximum gradient strength is reduced.This, in particularly affects clinical gradient strength values ( G < 80mT/m) for which TDR was very low of 0.106, and in particular the normalised signal difference itself, S 2 -S 1, was at maximum 0.02, in the same order or less as the noise levels.Connectome scanner constraints on the other hand produced much larger TDR of 0.44 and signal difference S 2 -S 1 of 0.08.

Simulation 2: effect of restriction size on TDR
This section explores the TDR contrast over a wide range of pore size distributions, to explore the types of structures visible through TDR.Following the previous results which show that the optimised sequences are very similar across a range of small and large cylinders and spheres, in this simulation we chose the optimised parameters obtained for the large cylinder distribution, thus for S 1 we choose Δ = 8.9 ms,  = 6.9 ms and for S 2 we choose Δ = 31 ms,  = 14.08 ms.
Fig. 4 a illustrates TDR values across substrates consisting of cylinder distribution with a wide range of parameters (mean along the x-axis and standard deviation along the y-axis), calculated based on the optimised sequences above.The plot presents the overall link between the axon distribution sizes and TDR and shows that distributions with the mean below 3 m and standard deviation below 1 m -resolution limit for this gradient strength -have TDR values close to zero, and hence would not be detectable in the images.On the other hand larger axons have sufficiently large TDRs (TDR = 1 is a maximum value) which the approach will pick up, and so will stand out in the TDR map images.This plot shows where typical axon distributions found in the tissue would be: "small dist " matches a previous model of callosal white matter ( Grussu et al., 2019 ) which is undetectable with this approach (as expected based on the max gradient strength used ( Nilsson et al., 2017 )) and "large dist " replicates the white matter structure in the spinal cord, which is within the sensitivity of the TDR approach.
The isocolors in Fig. 4 a show existing ambiguities in associating a specific TDR value to a single size distribution.Multiple size distributions can provide the same TDR value: e.g., all the distributions characterised by mean and standard deviation along the green area in Fig. 4 would all contribute to a TDR value of ∼0.4.This is expected to be seen in the TDR approach whose main aim is to visualise areas with large, detectable restrictions rather than to map their exact size.
Fig. 4 b) shows results for spherical substrates.These are much larger as they represent the structures in the grey matter and their TDR values are much higher and, depending on the SNR, would be detectable in the TDR maps for the hardware parameters selected.The optimised sequences provide larger signal differences between S 1 and S 2o compared to between S 1 and S 2n , and higher TDR values.In each plot, colours are scaled so that a diffusion signal of 0 is blue, and the maximum diffusion signal obtained is red; the colour range displayed in each plot gives an idea of the maximum TDR possible.(e) Schematic representation of nonoptimized and optimised gradient shapes.These figures show that optimised TDR maximises the difference between the signal from the two acquisitions, using sequences with different pulse shapes.Equivalent figure for b = 20 ms/ m 2 is Fig.S2 presented in Supplementary Material.For b = 20 ms/ m 2 the TDR contrast is very similar to what we discussed above for b = 8 ms/ m 2 , with a difference that the TDR values are in general larger for b = 20 ms/ m 2 -except for very large cylinders when they are somewhat smaller.Results are presented in Figure S3 in Supplementary Material.

Simulation 3: effect of gradient directions and noise on TDR
It is well known that when imaging a single bundle of parallel fibres, the intra-axonal signal obtained depends on the angle between the orientation of the fibre bundle and the direction of the diffusion sensitising gradients: orthogonal gradients produce small signal attenuation (and thus higher overall diffusion-weighted signal), whilst parallel gradients create stronger signal attenuation and return lower signal.This effect generally increases with b-value ( Drobnjak et al., 2016 ) and would be strongly emphasised in the case of the TDR approach.A similar rationale applies to imaging more than one fibre bundle: the gradient directions returning the highest signals are those which are close to perpendicular to one or more fibre bundles; meanwhile gradient directions which are not close to perpendicular to any of the fibre bundles show very low signal.This concept is represented in Fig. 5 , which shows signal measurements for two crossing fibre bundles and 60 uniformly distributed gradient directions ( Fig. 5 a).Fig. 5 b shows all the 60 signal measurements colour coded according to the angle between the plane of the fibres and the gradient direction shown in a) (e.g.red are gradient di-rections most perpendicular to the fibres).Fig. 5 c shows all 60 signal measurements ordered from the highest to the lowest signal to emphasise the impact that gradient direction has on the signal itself.It can be seen how the signal measurements are highly distributed from very high, through medium intensity and finally some measurements with very low signal.
In order to investigate whether this effect affects TDR, we divide the signal measurements into subsets and evaluate TDR for each different subset.The idea is to explore whether using a subset of gradient directions that creates the highest signal is more optimal (i.e.maximises TDR) compared to when using the full set of 60 uniformly distributed directions which includes both the high and the low signals.Note that the data are already measured for all 60 uniformly distributed gradient directions, and this subset selection is happening at the post-processing stage, once the signal intensities have been ordered.
We use the definition of TDR outlined in Section 2.1 .and calculate it for different subsets using the following equation: where M is the number of gradient directions in the subset (N corresponds to the full set of uniformly distributed directions and 1 ≤ M ≤ N ) and j = 1..M is the index of the ordered signal measurements from the highest intensity to the lowest intensity (as shown in Fig. 5 c).The ordering is performed on the average of the two signals (  1 +  2 )∕2 , in order to reduce the effect of noise on the sorting process.In our simulations we used HARDI acquisitions of 60 uniformly distributed gradient directions (determined by electrostatic repulsion), hence N = 60 and   60 is equivalent to the full original formulation of TDR (ie.signal S 1 and S 2 are averaged across all 60 gradient directions acquired).To illustrate this further:   1 is calculated using only one gradient direction -the one that provides maximum signal intensity;   2 uses the two strongest signal measurements to calculate the average, the one used in   1 and another one in the ordering etc.The results for TDR calculated for every subset from   1 to   60 are provided in Fig. 6 (a-d).Fig. 6 a shows the TDR values for a substrate consisting of one fibre bundle of parallel cylinders.Results are presented both for the noise-free scenario (blue circles) and the scenario where Rician noise (SNR = 20) has been added (orange error bars present the standard deviation over the noise instances).In the scenario without noise, TDR remains equal to the ground truth regardless of how many gradient directions are used in the TDR calculation.However, once simulated Rician noise is added to the signals, due to the Rician floor effect, the results show a decrease in TDR and a reduction in the accuracy of the calculated TDR value as the number of gradient directions included in the calculation increases.In contrast, the precision of the calculation is improved with more measurements and hence for this substrate with one fibre bundle, using   12 which corresponds to the subset of ∼20% of ordered gradient directions that provide the highest signal values, yields an optimal balance between accuracy and precision of TDR estimates.
Figs. 6 b and c show that similar effects occur for substrates consisting of two and three fibre bundles (fibre bundles are mutually perpendicular to allow for testing of the most extreme cases).The accuracy of estimated TDR lowers as the number of fibre bundles increases, however the trend that the accuracy reduces with increasing the size of the ordered subset is equally present suggesting that optimisation in the presence of Rician noise would be very beneficial and would provide higher TDR contrast.
For a substrate consisting of spheres ( Fig. 6 d) the results are very different.Accuracy is pretty stable regardless of the number of signal measurements used for both the no-noise and noisy case.This is as expected since signal from spherical substrates is indifferent to gradient direction as opposed to the fibres which are highly sensitive to it.Similarly to the fibre simulations, the precision is improved with the number of measurements and hence the optimal solution for spherical substrates is that which maximises the number of measurements -i.e. the full set of HARDI acquisitions (   60 for our simulations).Hence, for isotropic pores, as well as uniformly orientated fibres (data not shown), using more gradient directions is optimal, with TDR values remaining accurate even when all directions are used.
We also note that the estimated TDR values for the substrates with one, two or three fibre bundles should ideally be the same, as the size distributions of the substrate cylinders are the same regardless of the fibre orientation distribution.When calculating the coefficient of variation of TDR values across the three substrates, for an SNR of 20 when 60 HARDI gradient directions have been acquired, we see a minimum for subsets with 27/60 gradient directions, as illustrated in Figure S6 in Supplementary Information.For other SNR levels and numbers of acquired gradient directions, the exact number of measurements in the optimal subsets might vary slightly, nevertheless, using a smaller subset of gradient directions appears optimal, also for SNR = 50 and 30 measured gradient directions, as illustrated in Figures S5-S6 in Supplementary Information.Overall, in all fibre scenarios considered, using ∼50% of the gradient directions acquired provides a more accurate TDR estimate than when all gradient directions are used; in the SNR 50 scenarios or when 60 HARDI gradient directions are acquired, using ∼33-50% of the directions also reduces the coefficient of variation between the three fibre scenarios compared to when all gradient directions are used.
Figs. 6 e and f show   12 and   60 , respectively, in the presence of noise, for the single fibre bundle scenario, across a wide range of different cylinder distributions.We can see that using all 60 directions results in the reduction of TDR compared to the ground truth (no-noise case, Fig. 4 a) for many large distributions: this also reduces the difference in TDR between large and small distributions, reducing the potential contrast of a TDR image.  12 is on the other hand considerably higher and closer to the ground truth: for example for the distribution of cylinders with a mean 5.33 m,   12 is 1.6% lower compared to the ground truth while   60 , is 24.5% lower -a greater than 15 times difference in percentage error -which in the real-world scenario could create contrast that is not sufficiently detectable.
For b = 20 ms/ m 2 the results are very similar to what we discussed above for b = 8 ms/ m 2 .The main difference is that for cylindrical substrates, especially those with larger cylinders, decrease in accuracy is even more pronounced, and the optimal subset size for calculating TDR is smaller.For spherical substrates the accuracy is the same however the precision is, as expected, much improved.Results are presented in Figure S4 in Supplementary Material.

Optimised and non-optimised acquisition protocols
Diffusion MRI measurements for TDR contrast in ex-vivo spinal cord were acquired following the non-optimised TDR approach, where the diffusion time is varied between the two shells, as well the optimised gradient waveforms proposed in this work.The measurements were repeated for a maximum gradient strength of 600 mT/m, a value widely available on pre-clinical systems as well as 2500 mT/m that is available on typical microimaging probes.The specific values for the nonoptimised and optimised protocols are given below: Non-optimised protocol: • Shell 1 consists of waveforms with short gradient duration and short diffusion time: The maximum diffusion time given the same echo time was chosen.TDR values computed from Shell 1 and Shell 2n are referred to as nonoptimised.
Optimised protocol: • Shell 1 is the same as for the non-optimised protocol.
• Shell 2o consists of waveforms with long gradient duration and long diffusion time: The timing parameters are adapted from the numerical optimization.TDR values computed from Shell 1 and Shell 2o are referred to as optimised.
The signal attenuation profiles for these gradient waveforms plotted against axon diameter are presented in Fig. 7 , illustrating that indeed the difference between Shell 2o and Shell 1 (optimised) is larger than between Shell 2n and Shell 1 (not-optimised).
TDR contrast in spinal cord Fig. 8 a illustrates the data acquired in the ex-vivo rat spinal cord for b = 8 ms/ m 2 , where WM and GM regions are outlined on the b0 image.The normalised diffusion weighted maps are shown for the three different waveforms when the gradient is either close to parallel or perpendicular to the spinal cord fibres.Indeed, there is a pronounced change in contrast between the different shells.For the perpendicular direction, the signal in white matter increases between shell 1 and shells 2n & 2o, while the signal in grey matter decreases.For the parallel direction, the change is less pronounced in white matter, nevertheless there is still a pronounced decrease in grey matter.Estimated SNR values in white and grey matter were 78 ± 28 and 148 ± 62, respectively.
Fig. 8 b presents TDR maps calculated based on all gradient directions for the two scenarios with b = 8 ms/ m 2 G max = 600 mT/m as well as for G max = 2500 mT/m.In spinal cord white matter, TDR contrast is positive.For G max = 600 mT/m, optimised TDR values in WM are between 0 and 0.4, matching a wide range of axonal distributions presented in Fig. 4 a for the same gradient strength.On the other hand, in grey matter, TDR values are negative, showing that a model of pure restriction (either in cylinders and/or spheres) does not accurately represent the diffusion time dependence in the tissue.
For both gradient strengths, we see that the TDR contrast provided by the optimised pair of gradient waveforms (i.e.Shell 1 and Shell 2o) is higher than the values from the non-optimised pair (Shell 1 and Shell 2n).Moreover, TDR values obtained with stronger gradients are higher, matching the simulation results presented in Supplementary Information in Figure S1, as well as previous results regarding estimating axon diameters ( Fan et al., 2020 ;Drobnjak et al., 2016 ;Nilsson et al., 2017 ).
Fig. 8 c illustrates the effect of using a subset of gradient directions in the computation of TDR.To assess the effect of gradient direction on the TDR estimate in the spinal cord, where white matter tracts are highly anisotropic, the gradient directions were voxelwise sorted based on the average signal intensity in the three shells.Then, TDR was computed from subsets of gradient orientations with an increasing number of directions, as detailed in Fig. 5 .We find that for WM, both the optimised and non-optimised TDR values decrease as more gradient directions are included in the signal average.Including only ∼ 20% of the data points (i.e.12/60 directions) already provides a good balance towards maximising TDR while minimising the effect of noise, corroborating the simulation results.In GM, there is also a dependence of TDR on the number of measurements, likely due to the directionality of the neurites ( Grussu et al., 2017 ), and we estimated negative TDR values.Similar results are observed for the second spinal cord segment.
Overall, for a given gradient strength and echo time, the highest TDR contrast in spinal cord white matter is obtained using the optimised pair of gradient waveforms and approx 12/60 directions from a fully acquired HARDI shell.( Dula et al., 2010 ) Figure S7 in the Supplementary Material shows equivalent results but for b = 20 ms/ m 2 .The TDR maps and the overall results are very similar between the two b-values in line with simulations presented in Section 3.1 .There is a small difference in the effect of the number of gradient directions -in WM for the larger b-value the accuracy decreases faster as the number of gradient directions increases.This is also consistent with simulation results presented in Section 3.1 .
Fig. 9 compares TDR with axon diameter values previously reported in 6 different WM ROIs .The TDR values are calculated voxelwise from the optimised waveforms using a subset of 12 gradient directions, to maximise the contrast.Then, the mean TDR value in each ROI is computed.Strong correlations between mean TDR values and axon diameter are observed for all the analysed sequences, both for G max = 600 mT/m, with a correlation coefficient of 0.85 ( p < 0.01) for both b-values, as well as for G max = 2500 mT/m with a correlation coefficient of 0.9 ( p < 0.01) for b = 8 ms/ m 2 and 0.87 ( p < 0.01) for b = 20 ms/ m 2 .The TDR values are also similar for the two different b-values.Moreover, the TDR values for the two spinal cord segments, that were mounted and imaged separately, are highly consistent.

Effect of extra-axonal space and orientation dispersion
Monte Carlo simulations with substrates consisting of parallel cylinders with an intra-axonal fraction of 0.6 were employed to assess the effect of extra-axonal space.
Fig. 10 a presents the signal contribution of the extra-axonal space, i.e. S total -S intra , as a function of mean axon diameter for the three waveforms and different combinations of b-value and maximum gradient strength.The shown signal is the average over the first 12 directions, that will later be used to compute the optimised TDR.In most cases the signal contribution of the extra-axonal space is less than 0.01 (on a scale from 0 to 1).For sequences with short diffusion time, b = 8 ms/ m 2 and G max = 2500 mT/m the differences are slightly larger, up to 0.03, for diameters > 3 m.
Fig. 10 b shows the effect of the extra-axonal signal propagated to the TDR values calculated from 12 directions and the optimised waveforms.The TDR values calculated from simulations with a uniform distribution of spins (i.e.including both intra and extra-axonal space) are lower than the values simulated only with an intra-axonal spin distribution.The maximum differences for the various sequences are: 0.041 (16%) and 0.053 (7.5%) for b = 8 ms/ m 2 with G max = 600 mT/m and 2500 mT/m, respectively, and 0.011 (3.8%) and 0.022 (2.4%) for b = 20 ms/ m 2 with G max = 600 mT/m and 2500 mT/m, respectively.Similarly to other results throughout the paper we see that the TDR is higher for G max = 2500mT/m compared to 600mT/m.Finally, as expected from results presented in Fig. 4 and Figure S3, the TDR is slightly different between two different b-values.For the substrates considered here, TDR for b = 20 ms/ m 2 is slightly higher, with differences up to 0.05 for G max = 600 mT/m and 0.017 for Gmax = 2500 mT/m, respectively.

Discussion
This work employs numerical simulations to optimise the dMRI acquisition for maximal TDR contrast.The simulation results are validated by ex-vivo diffusion MRI data from both WM and GM tissues in rat spinal cord.

TDR contrast and waveform optimisation
Through our investigation of pulse sequence shapes for standard single diffusion encoding experiments, we find that optimised TDR requires short-, high-G sequences contrasted with long-, low-G sequences.These results are consistent for a wide range of substrates (e.g.restricted diffusion in cylinders and spheres with different size distributions) and sequence constraints (maximum gradient strength, total gradient duration, etc.), and are in line with the optimisation results of previous studies ( Alexander et al., 2010 ;Drobnjak et al., 2016 ;Nilsson et al., 2017 ;Ianus et al., 2021 ).Intuitively, this can be understood by the following reasoning.Given the restriction length r, assuming the Gaussian phase approximation and the long-pulses limit for simplicity, the signal attenuation for the SDE sequence is ln S ∝ -r 4 g 2 , where g is the gradient strength amplitude ( Veraart et al., 2020 ;van Gelderen et al., 1994 ;Neuman, 1974 ).If we acquire both the signals S 1 and S 2 needed to compute the TDR (see Section 2.1 ) with the same b value, the condition g 2  2 ( Δ-/3) = const must be satisfied.In order to maximize TDR contrast given the above conditions, we want maximal attenuation for S 1 which leads to maximizing const / [ ( Δ-/3)], and minimal attenuation for S 2 which leads to minimizing const / [ ( Δ-/3)].Obviously, this can be achieved by acquiring the signal S 1 using  and Δ as short as possible (consequently, with stronger g given the condition of constant b value) and the signal S 2 using  and Δ as long as possible (consequently, with weaker g given the condition of constant b value).
The contrast from the optimised TDR is sensitive to a range of restriction sizes, as illustrated in Fig. 4 , and has the potential to distinguish distributions containing large axons from those containing small axons.As expected, size distributions with different combinations of mean and standard deviation can yield very similar TDR values, as illustrated by areas with similar colours in Fig. 4 .This effect is more pronounced for distributions with longer tails, such as the gamma distribution used for cylindrical substrates, compared to normal distributions used for spherical substrates, where we observe a better correspondence between the mean of the distribution and TDR values.This known tail-weighting effect is due to the fact that the signal contribution of each cylinder to the total measured signal is volume-weighted ( Alexander et al., 2010 ;Veraart et al., 2020 ).As reference, for axon diameter distributions typically found in human brain tissue (i.e.< 3 m), TDR contrast is on the order of 6% or less ( Dell'Acqua et al., 2019 ).
As illustrated both through simulations and experiments, the TDR contrast depends on the maximum gradient strength available on the scanner, and is sensitive to axon diameters that are above the resolution limit, i.e. the minimum diameter that can be distinguished ( Alexander et al., 2010 ;Drobnjak et al., 2016 ;Nilsson et al., 2017 ).For sizes below the resolution limit, TDR is zero.In regards to clinical imaging and maximum gradient strength needed for TDR mapping, we find that gradient strengths on typical clinical scanners (below 80 mT/m) are not suitable as they do not provide sufficient contrast, and stronger gradients are needed, e.g.such as 200 mT/m or more available on the Connectome scanner (results presented in Supplementary Material Section S1).

Effect of extra-axonal space
The TDR exploits ultra-high b-values to suppress the contribution from water diffusing in the extra-axonal space, and enhance the sensitivity of the dMRI measurements to water diffusing in the restricted intracellular space.Previous work demonstrated, by identifying the powerlaw signature specific of narrow cylinders, that b ≳ 7 ms/ m 2 is enough to assume that the observed signal originates from inside the axons only ( Veraart et al., 2019 ).Hence, in this study we investigate sequences at b = 8 ms/ m 2 , which is at the limit of that requirement, as these can be achieved on pre-clinical and high performance clinical scanners for a range of gradient waveforms.We find that for those, the extra-axonal signal is negligible (normalised signal below 0.01) for most configurations (including complex fibre geometries as seen in Supplementary Material Section S5) and protocols ( Fig. 10 a).Only the largest axonal configurations ( > 3um) for very short diffusion times ( ∼5 ms) reach nonnegligible but still very low extra-axonal signal of 0.03.This seems to originate from axonal configurations with geometrical repetitions, inducing slow diffusion or even restriction in some of the extra-axonal space, a feature we expect would be much smaller in the real data.
We found that the impact of this on TDR was not significant.We investigated b = 20 ms/ m, for which the extra-axonal signal is negligible (normalised signal below 0.01) for all axonal configurations and protocols we investigated and TDR values of intra-axonal space almost identical as that of the uniform space (intra + extra-axonal) [ Fig. 10 b].When compared in simulations TDR for b = 8 ms/ m 2 has differences slightly larger for certain axonal configurations and G = 2500mT/m but overall still very small.When compared in pre-clinical data, we find that the TDR maps between the two b-values are extremely similar and with matching patterns and values, suggesting that b = 8 ms/ m 2 is sufficiently large to satisfy the TDR assumptions.

TDR in the presence of noise
In the original TDR, it was proposed to use the direction-averaged signal to remove/mitigate the bias due to orientation dispersion.In this work we show that, when imaging structures consisting of one or several anisotropic fibre bundles (e.g.coherent white matter fibres, but also regions with crossing fibres), in the presence of Rician noise, the TDR contrast can be further improved by using only a subset of the highest-signal gradient directions.This happens because for some gradient directions the measured signal is very low and decays to the noise floor, therefore adding a bias in the powder averaged signal and subsequent TDR calculation.Thus, removing these measurements improves the accuracy of TDR estimation.For relatively coherent fibres, both simulations and experimental results suggest that using approx 20% of the measurements is optimal.For more complex fibre configurations (e.g. two or three crossing fibre bundles), larger subsets are more appropriate, with the exact optimal percentage of gradient directions depending on the SNR as well as the fibre configurations.If we are to choose a single percentage, for example to calculate TDR across the brain where there are voxels with different fibre orientations, our simulations suggest that considering a subset with e.g.∼50% of measurements, improves TDR accuracy and decreases its variance across fibre orientations compared to using the entire dataset.Nevertheless, the optimal percentage should be decided based on the SNR of the data, the number of HARDI directions and expected fibre orientations in the sample.
This bias in TDR can also be mitigated by employing real valued data ( Eichner et al., 2015 ) instead of magnitude data and removing the effect of the Rician noise floor.Indeed, simulations employing Gaussian noise result in similar TDR values for different numbers of gradient directions (data not shown).

TDR in ex-vivo spinal cord: white matter
The TDR contrast in the ex-vivo rat spinal cord closely follows the predictions from simulations, with higher values for the optimised waveforms and ∼20% of the directions.We have also observed very strong correlations between TDR and axon diameter values reported in the literature in different ROIs, with correlation coefficients above 0.85 both for a weaker gradient (600 mT/m) and a stronger gradient (2500 mT/m).As discussed above, TDR is sensitive to the volume weighted distribution of axons, and therefore the mean diameter calculated from electron microscopy might not be the best quantity for comparison, nevertheless, there is a clear trend, as illustrated in Fig. 9 .The trends of TDR are also consistent with previous MRI correlates of axon diameter, both using diffusion data as well as relaxometry ( Shemesh, 2018 ;Anaby et al., 2019 ;Fick et al., 2017 ;Nunes et al., 2017 ).
The curves presented in Fig. 1 show a slightly different dependence of TDR on axon diameter compared to the trends shown in Fig. 9 -namely that in simulations the TDR curve goes to zero for small axons and in fixed rat spinal cord it does not.Indeed, when considering straight cylinders ( Fig. 10 ), simulated TDR values for axons below 2 um are close to zero, while TDR values measured in the spinal cord are larger ( ∼0.1 for Gmax = 600 and 0.2 for Gmax = 2500).Nevertheless, the TDR values simulated in the more realistic substrates with undulation and varying axon diameter indicate higher values compared to straight cylinders, that are similar to the ones measured in the spinal cord in ROIs known to have similar axon sizes, see Table S1 in Supplementary material.Since TDR is not aiming to measure pore sizes quantitatively, these differences do not affect the interpretation of the results significantly.

TDR in ex vivo spinal cord: grey matter
In ex vivo spinal cord grey matter we measure negative TDR values.These results are not consistent with simulations of TDR when diffusion is influenced by restricting barriers in different scenarios: i) totally restricted diffusion in spheres and/or cylinders, ii) randomly packed parallel cylinders with a uniform distribution of spins, iii) packed complex fibre geometries featuring orientation dispersion, undulation and varying diameter along the fibre (see section S6 in SI) with a uniform distribution of spins.This suggests that effects, other than pure restriction, influence the diffusion time dependence in grey matter.These results show for the first time that for measurements with high diffusion weighting (i.e.≥ 8 ms/ m 2 ), the signal is showing a pronounced decrease with increasing diffusion time in spinal cord grey matter.This is consistent with recent preclinical studies focusing on brain grey matter, suggesting that inter-compartmental exchange is a highly plausible explanation for the diffusion time dependence, both in vivo and ex vivo ( Jelescu et al., 2022 ;Olesen et al., 2022 ;Ianus et al., 2022 ).In this case, TDR may reflect exchange between neurites and extra-cellular space, between soma and extra-cellular space and/or between soma and neurites ( Ianus et al., 2021 ;Jelescu et al., 2022 ;Olesen et al., 2022 ).Since TDR was originally proposed to provide information about restriction effects in the WM, here we focus our optimization and investigation on WM.The observed discrepancy between simulated and measured TDR values in the GM are of great interest, but require further dedicated investigations, which goes beyond the scope of this work.

Going beyond single diffusion encoding
Here we chose to look at only SDE sequences since this is the first paper to look into TDR following the preliminary results presented in Dell'Acqua, proc.ISMRM 2019) ( Dell'Acqua et al., 2019 ) and we felt that starting from the most standard sequence and exploring it at depth (both in simulation and pre-clinically) was sensible.We showed that the TDR can work well already with SDE, however the TDR contrast could be further improved by including other waveforms in the optimization, for example oscillating gradients ( Drobnjak et al., 2016 ), and a similar approach has been explored recently in simulations using such waveforms ( Harkins et al., 2021 ).If high frequency oscillating gradients cannot usually achieve the desired high b-values for a practical echo time, low frequency oscillations could potentially improve TDR contrast, as they were shown in the past to increase the sensitivity towards small axon diameters.Nevertheless, these sequences are not widely available, neither on clinical nor on pre-clinical scanners, therefore here we focused on standard single diffusion encoding acquisitions.For larger restriction sizes, using SDE sequences in a stimulated echo sequence rather than the standard spin-echo preparation might be beneficial as it allows to reach larger gradient separations, a period where the signal is governed by a slower T1 decay rather than the faster T2 decay.

Limitations
This work optimises TDR based on simple and ideal geometries (parallel cylinders and spheres), nevertheless, the results are corroborated for WM (where TDR works as expected) using both real data from spinal cord and simulations performed for more complex substrates including orientation dispersion, extra-axonal space and realistic axonal shapes.However, none of these substrates consider exchange between compartments, so a full exploration of negative TDR values in GM is not possible.The future exploration of complex substrates with a wider range of parameters describing dispersion, axon curvature ( Nilsson et al., 2012 ;Lee et al., 2020 ), beads ( Alves et al., 2022 ;Budde and Frank, 2010 ), and/or exchange might provide more insight into the behaviour of TDR across various possible microstructures.Another simulation limitation is that noise is simulated using the Rician distribution; strictly speaking, noise across a multi-channel receive coil (e.g., 32 channels for many clinical scanners) should have a non-central Chi distribution.Moreover, TDR contrast could be further improved by considering other gradient shapes, such as oscillating gradients or spin preparations, e.g.stimulated echo, nevertheless, this would require the use of new sequences that are not widely available at the moment.Finally, in our simulations we assume infinite slew rates in the sequence, which although are a very good approximation for the pre-clinical scanners (the scanner we use has a slew rate of 25,000 mT/m) are not so well suited for clinical scanners.Hence we tested the finite slew rate of 200 mT/m (typical of Connectome Scanner) that can accommodate the physiological constraints and found that this has not affected the optimisation results significantly (results in the Supplementary Material S2).
We did not estimate the diffusivity directly from our mouse tissue sample: the spinal cord segments did not have clear CSF-containing areas, and they were immersed in fluorinert, which is not MR visible.That said, we have checked the MD values in CSF from previously published mouse brain data acquired on the same system as ours, at the same temperature and following the same perfusion protocol: they were ∼2.8 m 2 /ms and it is very likely that our sample was in line with this.
TDR requires special care when interpreting the results.As TDR is a single-dimensional measure of restriction size, different size distributions can return identical TDR values.Examples of these distributions can be determined through examination of the isocolours in Fig. 4 .More generally, when moving beyond substrates consisting simply of cylinders (e.g. as a model of axons in WM) or spheres, there will be a range of microstructure compositions which we can expect to return similar values of TDR.
TDR is also affected by the inherent sensitivity of the acquisition and the diffusion signal to the pore sizes.As shown in previous work ( Drobnjak et al., 2016 ;Nilsson et al., 2017 ), the sensitivity of the diffusion signal to axon diameter strongly depends on the gradient strength and SNR.These can be used to calculate a size resolution limit below which the signal has minimal sensitivity.While its dependence on the resolution limit is very similar to model-based approaches, TDR has an advantage that it does not try to fit but rather just maps areas where the voxels are more likely to contain pores of sizes above the resolution limit.Hence, its robustness, sensitivity and interpretation will be much improved and more appropriate for some applications compared to the model-based approaches.However, it is also important to highlight that TDR values can be biased by properties of the tissue other than pore size, such as inter-compartment exchange (compartment-specific T2 and T1 relaxation should not bias to TDR values as long as high enough b values are used to suppress extra-cellular signal and TE/TR are kept the same for all the acquisitions).This could be particularly problematic in diseases, where for example demyelination and accumulation of reactive glia may change the exchange properties of the tissue, complicating the interpretation of the results.Future studies investigating TDR sensitiv-ity and specificity to changes in tissue pore sizes in disease might shed a light on the impact of said confounders.

Conclusion
This work employs simulations and pre-clinical data to show the potential of TDR contrast to characterise restricted diffusion for a wide range of tissue microstructures featuring cylindrical and/or spherical size distributions.Importantly, the TDR contrast can be enhanced by using optimised gradient waveforms, contrasting a short  + high G pulse and a long  + low G pulse, as well as a subset of gradient directions from the acquired shells.In ex-vivo experiments on rat spinal cord, TDR can successfully characterise spinal cord white matter microstructure, showing a strong correlation with axon diameter values from quantitative histology.Overall these results show that the recently proposed TDR approach has a great potential and is a very promising alternative (or potentially complement) to model-based approaches for informing on pore sizes in tissue.
MP is supported by the UKRI Future Leaders Fellowship MR/T020296/2 This research was funded in part by a Wellcome Trust Investigator Award (096646/Z/11/Z) and a Wellcome Trust Strategic Award (104943/Z/14/Z).For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Fig. 1 .
Fig. 1.An illustration of the TDR contrast, as proposed in the original study ( Dell'Acqua et al., 2019 ).Both signals (left) are simulated at a b value of 8 ms/ m 2 , with Δ = 21 ms,  = 9 ms, G = 276.8mT/m for S 1 and Δ = 55 ms,  = 9 ms, G = 162.9mT/m for S 2 , respectively.The relative difference between the two signals produces the TDR contrast on the right, that is monotonically increasing with pore size for the range considered here.

Fig. 3 .
Fig.3.Optimisation results maximising TDR for G < 600 mT/m, Δ+  < 45 ms and b = 8 ms/ m 2 .(a-d) Maps showing the diffusion weighted signal averaged over the 60 uniformly distributed directions for sequences with various / Δ combinations for the substrates illustrated in Fig.2.White markers indicate the non-optimised (circle) and optimised (square) sequences, respectively.The optimised sequences provide larger signal differences between S 1 and S 2o compared to between S 1 and S 2n , and higher TDR values.In each plot, colours are scaled so that a diffusion signal of 0 is blue, and the maximum diffusion signal obtained is red; the colour range displayed in each plot gives an idea of the maximum TDR possible.(e) Schematic representation of nonoptimized and optimised gradient shapes.These figures show that optimised TDR maximises the difference between the signal from the two acquisitions, using sequences with different pulse shapes.Equivalent figure for b = 20 ms/ m 2 is Fig.S2presented in Supplementary Material.

Fig. 4 .
Fig. 4. Noise-free TDR values calculated for sequences with b = 8 ms/ m 2 optimised parameters for G max = 600 mT/m: S 1 : Δ = 8.9 ms,  = 6.7 ms; S 2o : Δ = 30.9ms,  = 14.1 ms.The signal is simulated across a wide range of (a) cylinder and (b) spherical diameter distributions.For cylinders, we simulate Gamma distributions, and for spheres we simulate Gaussian distributions, which reflect the size distributions usually measured in the tissue.The typical large and small size distributions presented in Fig. 2 are shown using white markers.For cylinders, the gamma distributions are truncated at 20 m to match realistic values from the tissue; thus, combinations of parameters where this truncation changes the mean by more than 10% were not considered.Equivalent figure for b = 20 ms/ m 2 is Fig.S3 presented in Supplementary Material.

Fig. 5 .
Fig. 5. Effect of the gradient direction on the signal coming from two crossing bundles of parallel fibres.a) Gradient directions for an acquisition protocol with 60 measurements.b) Signal attenuation for each diffusion direction simulated for a substrate consisting of two perpendicular fibres / bundles each with Gamma diameter distributions of mean = 5.33 m, std = 3.00 m (the same as used in the 'large axon' simulations).c) Signal attenuations sorted in descending order and examples of subsets with different numbers of measurements.

Fig. 6 .
Fig. 6.TDR values for noise free (blue circles, SNR = inf) and noisy (orange crosses, Rician noise with SNR = 20) signals as a function of the number of gradient directions included in the analysis in different substrates: (a-c) bundles of cylinders with one, two or three fibre orientations crossing at 90°.All bundles have a Gamma distribution of sizes with mean = 5.33 m and std = 3.00 m; (d) a Gaussian distribution of spheres with mean = 7 m and std = 0.5 m.The orange error bars show 1 standard deviation of the estimated TDR over 100,000 noise instances.As illustrated in (a-c), for the different fibre configurations, TDR values estimated from noisy data are below the expected noise-free values.All simulations are performed using the optimised sequence parameters for large cylinders.Equivalent figure for b = 20 ms/ m 2 is Fig.S4 presented in Supplementary Material.
Fig. 10 c illustrates the effect of fibre dispersion on the optimised TDR values (12 directions, optimised waveform) based on numerical simulations performed in MISST for cylinders following a Watson dis-

Fig. 7 .
Fig. 7. Simulations for signal attenuation profiles as a function of cylinder diameter for the three shells used in the pre-clinical acquisition at b = 8 ms/ m 2 and b = 20 ms/ m 2 , both for G max = 600 mT/m (left) and G max = 2500 mT/m.The signals were simulated for single cylinders and diffusion gradients perpendicular to the fibre with intrinsic diffusivity of D = 2 m 2 /ms.

Fig. 8 .
Fig. 8. Optimisation of TDR acquisition including waveforms (a-b) and number of gradient directions (b).a) left: T2 weighted image of the rat spinal cord without diffusion gradients.White matter and grey matter regions are delineated with blue and orange contours, respectively.Right: schematic depiction of the gradient waveforms for the three shells (top) and the corresponding diffusion weighted maps for a perpendicular (middle) and a parallel (bottom) gradient direction.The maps are shown for denoised data.b) TDR maps for maximum gradient strength of 600 and 2500 mT/m for optimised and non-optimised gradient waveforms.c) left: TDR values as a function of how many gradient directions were included in the signal average for white matter (blue) and grey matter (orange) ROIs.right: TDR maps for measurement subsets with 12/60 and 60/60 directions.Equivalent figure for b = 20 ms/ m 2 is Fig.S7 presented in Supplementary Material.

Fig. 9 .
Fig. 9. Comparison between TDR values and axon diameters reported in the literature in 6 spinal cord WM ROIs (Vestibulospinal (VST) -blue, Reticulospinal (ReST) -orange, Rubrospinal (RST) -yellow, dorsal corticospinal (dCST) -green, Funiculus Cuneatus (FC) -purple and Funiculus Gracilis (FG) -light blue).The ROIs are manually delineated and colour coded as illustrated on the left.Each segment is represented with a different marker shape in the correlation plot.Strong correlations between mean TDR values and axon diameter are observed for all the analysed sequences, both for Gmax = 600 mT/m, with a correlation coefficient of 0.85 ( p < 0.01) for both b-values, as well as for Gmax = 2500 mT/m with a correlation coefficient of 0.9 ( p < 0.01) for b = 8 ms/ m 2 and 0.87 ( p < 0.01) for b = 20 ms/ m 2 .

Fig. 10 .
Fig. 10.(a) Signal contribution of extra-axonal space calculated from Monte Carlo simulations of parallel cylinders with a Gamma distribution of radii corresponding to the spinal cord ROI values.The signal is the average over the 12 gradient directions used for computing TDR 12 .The contributions are shown for the different waveforms (left to right), as well as for different b-values and maximum gradient strengths (markers).The largest differences observed for b = 8 ms/ m 2 do not exceed 0.03 for the simulated substrates.The dotted line represents 0.01 signal difference.b) Optimised TDR (12 gradient directions, optimised waveforms) as a function of axon diameter for different protocols, when particles in the MC simulation are distributed either uniformly ('x') or just inside the cylinders (' □').c) Optimised intra-axonal TDR (12 gradient directions, optimised waveforms) as a function of axon diameter for dispersed cylinders following a Watson distribution with different concentration parameters  = {1, 6, 100}.The TDR values overlap for the different  values.c) Comparison of optimised TDR for the data acquired with b = 20 ms/ m 2 and 8 ms/ m 2 for the two gradient strengths.