Joint modelling of diffusion MRI and microscopy

The combination of diffusion MRI (dMRI) with microscopy provides unique opportunities to study microstructural features of tissue, particularly when acquired in the same sample. Microscopy is frequently used to validate dMRI microstructure models, addressing the indirect nature of dMRI signals. Typically, these modalities are analysed separately, and microscopy is taken as a gold standard against which dMRI-derived parameters are validated. Here we propose an alternative approach in which we combine dMRI and microscopy data obtained from the same tissue sample to drive a single, joint model. This simultaneous analysis allows us to take advantage of the breadth of information provided by complementary data acquired from different modalities. By applying this framework to a spherical-deconvolution analysis, we are able to overcome a known degeneracy between fibre dispersion and radial diffusion. Spherical-deconvolution based approaches typically estimate a global fibre response function to determine the fibre orientation distribution in each voxel. However, the assumption of a ‘brain-wide’ fibre response function may be challenged if the diffusion characteristics of white matter vary across the brain. Using a generative joint dMRI-histology model, we demonstrate that the fibre response function is dependent on local anatomy, and that current spherical-deconvolution based models may be overestimating dispersion and underestimating the number of distinct fibre populations per voxel.


Introduction
Diffusion MRI (dMRI) is routinely used to study white matter microstructure and connectivity in vivo and non-invasively (Basser et al., 2000;Sporns et al., 2005;Jbabdi et al., 2015). dMRI microstructure models relate variations in the MR signal to microstructural features of interest. Such inference requires biophysical modelling of both the tissue architecture and diffusion process. Although many dMRI models have been proposed, few have been rigorously validated (Jelescu and Budde, 2017;Dyrby et al., 2018), and the link between the observed diffusion signal and the underlying white matter microstructure remains controversial (Lerch et al., 2017;Novikov et al., 2019).
Microscopy is often considered a gold standard technique for the validation of dMRI models. Crucially, microscopy tends to resolve a specific structure of interest (e.g. histological staining of astrocytes or polarised light imaging of myelinated axons) and thus typically provides specificity that is not guaranteed by MRI. In a typical validation study the dMRI and microscopy data are analysed separately, then dMRI-derived tissue parameters (e.g. fibre orientation, myelin density or axon diameter) are compared to microscopy equivalents which are taken to be the ground truth (Leuze et al., 2014;Bastiani et al., 2017;Mollink et al., 2017;Schilling et al., 2017). This is possible due to the complementary nature of the data: both modalities provide information about the same tissue parameters of interest, but each observe them through a different lens. However, by analysing the data separately (rather than simultaneously), such paradigms may not be exploiting the multimodal data to its full potential.
Here we suggest an alternative, data-fusion framework in which we combine dMRI and microscopy data from the same tissue sample into a single joint model. A joint model may be advantageous in three respects.
Firstly, by considering both datasets simultaneously, we have access to additional, complementary information about the tissue microstructure and may be able to accurately determine tissue parameters that are currently unobtainable from the diffusion signal alone. A secondary benefit of the data-fusion framework is that the joint model considers both dMRI and microscopy to be informative of the 'true' underlying microstructure, but also that both have sources of uncertainty (Fig. 1). Crucially, these are unique, modality-dependent sources of noise. Therefore, by using a data-fusion framework we can in theory obtain a higher-precision estimate of the underlying microstructure of interest. Finally, microscopy is typically 2D and may only be sensitive to a subset of the tissue compartments (e.g. myelinated axons or astrocytes). For example, histological staining of the tissue (a gold standard microscopy technique) typically produces 2D images of thin tissue sections, where only the stained microstructure is easily visualised. Thus, the information provided by microscopy only partially informs on the tissue microstructure. The joint model can overcome this limitation by considering the microscopy as a soft constraint on the model, as opposed to a hard constraint or ground truth in post-hoc validation. This framework is inspired by a similar data-fusion approach (Sotiropoulos et al., 2016) which demonstrated improved brain connectivity analysis when complementary 3T and 7T dMRI data was analysed jointly rather than separately. It should be noted that a similar joint modelling approach could be applied to co-analyse any two datasets which share a common parameter of interest, to obtain a higher-precision estimate of that parameter. The two datasets could be a) intra-modality, such as the two dMRI datasets in the above example by Sotiropoulos et al. (2016), b) from different MRI techniques (e.g dMRI-relaxometry) or c) inter-modality, such as the combination of MRI with positron emission tomography (PET), electro-or magneto-encephalography (E/MEG) (Daunizeau et al., 2007;Uluda and Roebroeck, 2014) or microscopy data, as in the approach presented here. This report considers one example of how dMRI-microscopy datafusion allows us to extract tissue parameters which are difficult to obtain from the dMRI data alone. Here we aim to separate out fibre orientation dispersion from radial diffusion i.e. the apparent diffusion coefficient perpendicular to the direction of the fibre. As is illustrated in Fig. 1 (bottom), a highly dispersed fibre population with low radial diffusivity may produce the same dMRI signal as a single, coherently oriented fibre population with high radial diffusion. This degeneracy is commonly overcome by assuming one can identify a region with a single, coherently oriented fibre population which is then used to define a global radial diffusivity. Alternatively, promising developments by Lasi c et al. (2014), Szczepankiewicz et al. (2015), and more recently by Cottaar et al. (2019) demonstrate that, by combining linear and spherical diffusion tensor encoding, it is possible to disentangle microscopic diffusion anisotropy from orientation dispersion. Nonetheless, current analyses of conventional dMRI data (based on single diffusion encoding and which do not assume global diffusivities) are unable to distinguish these two distinct fibre configurations.
Microscopy can be used to determine the fibre orientations at a much finer spatial resolution (typically $ micrometer or sub-micrometer per pixel) and so can estimate the fibre dispersion in each dMRI voxel. However, microscopy alone is typically uninformative of the diffusion properties of the tissue. In a joint dMRI-microscopy model (as illustrated in Fig. 1 bottom), once the amount of within-voxel fibre dispersion is constrained, dMRI can accurately estimate the radial diffusion. In this manner a joint model should be able to overcome the degeneracy and separate out fibre dispersion from radial diffusion. This could provide insights into how these two parameters vary across the brain in both health and disease.
There are many ways to formulate a joint model. In this report we focus on one approach based on constrained spherical deconvolution (CSD, Tournier et al., 2007). As a popular data-driven analysis, CSD estimates the underlying fibre orientations from a dMRI dataset. In CSD, the diffusion signal is considered to be the convolution of the fibre orientation distribution (FOD) with a fibre response function (FRF). Here the FRF describes the diffusion signal from a single, coherently oriented fibre population. In the model, the FRF is first estimated empirically (Dhollander et al., 2016;Jeurissen et al., 2014;Tax et al., 2014;Tournier et al., 2007) (typically from voxels with large fractional anisotropy) and subsequently used as a global deconvolution kernel to determine the underlying FOD. CSD is typically thus based around two main assumptions: that it is possible to identify a region which contains a single, non-dispersed fibre population, and that the FRF estimated from this region holds globally. In other words, it is possible to estimate a valid 'brain-wide' FRF. These assumptions may be challenged if we consider there to be non-zero orientation dispersion in typical MRI voxels, and the diffusion characteristics of white matter to be dependent on microstructural properties such as axonal diameter, packing and myelination that can vary across the brain (Walhovd et al., 2014;Aboitiz et al., 1992). Thus, the estimation of a brain-wide fibre response function could be unreliable and has been shown to produce spurious results when poorly estimated (Parker et al., 2013;Tax et al., 2014). To overcome this limitation, various attempts have been made to assume a more local (rather than global) response function. Anderson et al. Fig. 1. Top: Microscopy data can provide highly detailed information about specific microstructural features of the tissue at sub-micrometre resolutions. In comparison, the diffusion MRI signal is an indirect measure of the same microstructure of interest. Due to the complementary information provided by the two modalities, we propose a data-fusion framework which simultaneously analyses dMRI and microscopy data from the same tissue sample to drive a single, joint model. Bottom: (a) A highly dispersed fibre population with low radial diffusivity may produce the same diffusion MRI signal as a single, coherently ordered fibre population with high radial diffusivity. The blue line labelled 'Diffusion MRI' represents a simplistic but graphical representation of this degeneracy. (b) Microscopy can estimate and, in the joint analysis, constrain the fibre dispersion in each MRI voxel to overcome the degeneracy; the dMRI data can now provide accurate estimates of radial diffusion. Critically, as the microscopy acts as a soft constraint (rather than fixing the dispersion), both the dMRI and the microscopy data simultaneously inform on the fibre dispersion. Information about the radial diffusion comes from dMRI data alone. The shading represents noise in the data and uncertainty in the parameter estimates. A.FD. Howard et al. NeuroImage 201 (2019) 116014 (2005) derive a voxel-wise FRF from the estimated apparent diffusion coefficient, whilst Jeurissen et al. (2014) and Dhollander et al. (2016) estimate a tissue specific response function and perform multi-tissue CSD. Nonetheless, alongside the recent and complimentary publication by Schilling et al. (2017), this report constitutes some of the first work to actively demonstrate the extent of FRF variation across the brain.
In this work we combine dMRI and histology data from the same human tissue sample to investigate the diffusion properties of white matter under conditions where multiple modalities are informative of the 'true' fibre configurations. To highlight the benefits of the data-fusion framework, this report considers a joint model which is based on nonparametric CSD. We show that by including histology data into the joint model we are able to overcome the degeneracy of fibre dispersion and radial diffusion. As such, we can simultaneously estimate the diffusion profile of a single fibre bundle (the FRF) and the underlying fibre orientation distribution (the FOD) on a voxel-by-voxel basis. In our results we investigate how the FRF changes across several white matter regions, and consider the implications this may have on both the reliability of CSD-based analyses and our understanding of the white matter microstructure.

Methods
This section will first describe the principles of CSD, and how these were developed into our joint modelling approach (c.f. Joint modelling). As a generative model which spans multiple modalities, the joint model error balances three terms (a dMRI-data fidelity term, a microscopy-data fidelity term and a complexity penalty) with two regularisation factors. Thus, Simulations describes how simulated data (with a known ground truth) was used to determine appropriate values for the two regularisation factors. Finally, Postmortem data acquisition provides details of the coregistered high-resolution dMRI (0.4 mm isotropic) and myelin-stained histological data used in this study.

Constrained spherical deconvolution
In constrained spherical deconvolution (CSD, Tournier et al., 2007) the diffusion-weighted MR signal attenuation, S, measured along an orientation parametrised in spherical coordinates by angles ðθ 0 ; φ 0 Þ, is considered to be the convolution of the FOD with the single-fibre response function (FRF): FODðθ; φÞ FRFðθ 0 À θ; φ 0 À φÞ sinðθÞ dθ dφ (1) where both the FOD and FRF are defined on the unit sphere. The FRF describes the signal profile of a single coherently-oriented fibre bundle and both the FRF and the FOD can be expressed in terms of spherical harmonics (Tournier et al., 2007). After estimating an appropriate FRF, the FODs are reconstructed on a voxel-wise basis by deconvolving the FRF from the diffusion signal S. As the deconvolution is sensitive to noise, the spherical deconvolution is constrained to minimise physically impossible negative peaks in the reconstructed FOD (Tournier et al., 2007(Tournier et al., , 2008. To constrain the optimisation a modified Tikhonov regularisation method (Hansen, 1994) is typically employed to iteratively penalise negative amplitudes on the FOD and update the FOD estimation.
This report will compare results from the standard CSD model (a deconvolution) with those from the novel joint modelling approach (a generative model). In the standard CSD model, processing was performed using the MRtrix3 package (www.mrtrix.org, (Tournier et al., 2012)) and the FRF was determined empirically using Tournier's approach (Tournier et al., 2013). Briefly, the Tournier algorithm iterates between response function estimation and CSD to determine the 300 most likely 'single-fibre' voxels from which the FRF is estimated.

Joint modelling
The principles of constrained spherical deconvolution were extended to enable joint estimation of the FRF and FOD for datasets with both dMRI and microscopy data. Although the joint modelling approach could be applied to any type of microscopy data from which we can extract fibre orientations, in this work we analyse histological images which have been stained for myelin and co-registered to 0.4 mm isotropic dMRI data (c.f. 2.4 Postmortem data acquisition). Through structure tensor analysis of histological images (Bigun et al., 2004;Budde and Frank, 2012;Seehaus et al., 2015) (Appendix A), the primary fibre orientation was estimated per pixel after which 1400 Â 1400 orientations were combined into a frequency histogram to generate a 2D microscopy-derived FOD (FOD 2D;micro ) for a 'superpixel' comparable to the spatial resolution of dMRI. In the model, the microscopy-derived FOD acted as a soft constraint on both the fibre orientation and amount of dispersion in each MRI voxel. This allowed estimation of the diffusivities both parallel (d axial ) and perpendicular (d radial ) to the fibre.
The generative model is described in Fig. 2. In the joint model, the FRF and FOD were first estimated across a densely sampled sphere and then combined to predict the dMRI signal, S, along each gradient direction, kðθ 0 ; φ 0 Þ, using Eq. (1). Additionally, the FOD was projected onto the 2D plane (FOD 2D;joint ) and re-normalised to fit the microscopy-derived FOD 2D;micro ; for details see Appendix B. To retain simplicity and avoid overfitting, in the joint model the FRF was characterised by an axiallysymmetric diffusion tensor, whilst the FOD was defined on a spherical harmonic basis set of order 6. The model thus contained 30 free parameters: 2 diffusivities (which determined the FRF) and 28 spherical harmonics coefficients (which determined the FOD).
The model parameters were fitted by minimising a cost function, E, made of three terms, where E diff describes the dMRI-data fidelity term, E micro the microscopydata fidelity term, and E complex a complexity penalty. The predicted dMRI signal, S, was compared to the dMRI data, Y, along gradient direction k 2 ½1; N, using the mean squared error, Fig. 2. The generative joint model. Here the fibre response function (FRF) was considered an axially-symmetric diffusion tensor characterised by diffusivities (d) parallel and perpendicular to the fibre. The fibre orientation distribution (FOD) was defined on a spherical harmonic (SH) basis set of order l ¼ 6. In the joint model, the FRF and FOD were simultaneously fit to the diffusion data (through convolution) and the FOD projected onto a 2D plane for comparison with the microscopy-derived FOD.
The projected FOD (FOD 2D;joint ) was compared to the microscopyderived FOD 2D;micro using the symmetric Kullback-Leibler divergence D KL (Kullback and Leibler, 1951), where the Kullback-Leibler divergence of two discrete probability distributions P and Q is defined as, In the model, i indicates the discrete values of θ, the in-plane angle over which FOD 2D;micro and FOD 2D;joint are defined. As the Kullback-Leibler divergence becomes numerically unstable when PðiÞ or QðiÞ are close to zero, we set a lower bound such that PðiÞ; QðiÞ ! 2 Â 10 À16 . The lowest value in the real histology FODs was found to be 4 Â 10 À8 .
A third error term, the complexity penalty, minimised the presence of spurious peaks in the 3D FOD. The penalty consisted of two parts. The first penalised unphysical, negative peaks in the FOD. The second penalised small positive peaks and excessive undulations in the FOD shape, which are unlikely to be biologically meaningful. Firstly, positivity in the FOD was enforced in a manner similar to Tournier et al. (2007). The spherical harmonics (which describe the 3D FOD) were projected onto a densely sampled sphere, the magnitude of any negative peaks summed (creating E neg ) and then set to 0. Thus, the negative peaks did not contribute to either the predicted diffusion signal S nor FOD 2D;micro . To discourage the presence of small positive peaks or excessive undulations in the FOD, the complexity penalty minimised the L1-norm of the spherical harmonics coefficients x 2 ½1; 28 where, xn : The three error terms were combined into a single cost function (Eq. (2)), with regularisation factors λ micro and λ complex respectively, and minimised using MATLAB's non-linear solver fmincon (MathWorks, 2017). Optimisation constraints ensured that the FOD, d axial and d radial were positive and that d axial ! d radial . The joint model was initialised as follows: in each voxel, the spherical harmonics coefficients output from CSD (Tournier et al., 2007,Tournier et al., 2012 provided an initial estimation of the FOD and the FRF was initialised to d axial ¼ 0:25 μm 2 = ms; d radial ¼ 0:05 μm 2 =ms. MATLABs simulated annealing algorithm simulannealbnd (MathWorks, 2017) was used to generate three sets of starting parameters from the above initial conditions. The model was optimised for each set of starting parameters, and the solution with the lowest error chosen. With this procedure, the joint model was able to simultaneously optimise both the FRF and FOD on a voxel-by-voxel basis.

Simulations
This section will describe two sets of simulations which were used to determine appropriate regularisation factors λ micro and λ complex . dMRI and histology data was simulated for a fairly simple single-fibre configuration. The first set of simulations was used to determine an appropriate value of λ micro and did not include a complexity penalty (λ complex ¼ 0). We evaluated whether, through the inclusion of histology data (increasing λ micro ), the joint model was able to separate fibre dispersion from radial diffusion by assessing how faithfully the model could quantify the fibre orientation dispersion. Here the FOD was fully described by 3 parameters: the azimuth, θ, and inclination, φ, of the FOD peak, and the symmetric fibre dispersion around the peak, characterised by the orientation dispersion index (ODI, Eq. (7)). In the second set, λ micro was fixed and a cross-validation approach was taken to optimise λ complex . Here spherical harmonics were used to describe the FOD and we assessed how the residual error of the model changed with respect to λ complex .
In both sets of simulations a dispersed FOD was approximated by a Watson distribution with concentration parameter κ (Mardia, 2000;Zhang et al., 2012). As proposed by Zhang et al. (2012), an orientation dispersion index was defined as, whereby the ODI varies from 0 (no dispersion) to 1 (isotropic dispersion). As above, the diffusion signal was calculated as the convolution of the FOD with an axially symmetric FRF (Eq. (1)) after which zero-mean Gaussian noise was added. In a similar manner to Sotiropoulos et al. (2012), assuming S 0 ¼ 100, the SNR was defined as SNR ¼ S 0 =σ noise . Simulation parameters were fixed to values of ODI ¼ 0:25, d axial ¼ 0:2 μm 2 =ms, d radial ¼ 0:1 μm 2 =ms, SNR dMRI ¼ 15.
Structure tensor analysis (Bigun et al., 2004;Budde and Frank, 2012;Seehaus et al., 2015) of the histology data produced a single fibre orientation per pixel, where each orientation can be seen to represent the in-plane component of a (myelinated) fibre in the 3D FOD (as defined by MRI, assuming perfect co-registration). To simulate histology data, we emulated this process in reverse: fibre orientations were sampled from the Watson distribution (which describes the 3D FOD) using rejection sampling, projected onto a 2D histological plane and combined into a frequency histogram. Rejection sampling is an acceptance-rejection method which allows us to generate observations from the Watson distribution. Briefly, orientations ðθ; φÞ were randomly sampled across the sphere. For each orientation, the value of the normalised Watson distribution Pðθ; φÞ was evaluated and a number was randomly sampled from the uniform distribution Uð0; 1Þ. The orientation ðθ; φÞ was accepted if the randomly generated number was less than, or equal to, Pðθ;φÞ. In simulation, 1400 Â 1400 (accepted) samples were drawn from the Watson distribution to match the number of fibre orientations in each 'superpixel' of the real data. No additional noise was added to the histology data.
We first determined an appropriate value for the weighting of the histology data λ micro , whilst setting λ complex ¼ 0. As we were primarily interested in how accurately the model could separate fibre dispersion from radial diffusion, a wrapper function was used to simplify the joint model such that the model contained only 3 free parameters: ODI, d axial and d radial . To do this the fibre orientation was fixed to a specific direction, ðθ;φÞ, after which the Watson-like FOD could be fully characterised by the ODI alone (rather than the 28 spherical harmonics of the normal joint model). dMRI and histology data was simulated for a single-fibre population oriented along ðθ; φÞ and a given ODI. The simplified joint model was fit to the simulated data using a Markov chain Monte Carlo (MCMC) method, Metropolis Hastings (1970), to find optimal values of ODI, d axial and d radial . d axial and d radial were initialised using the same parameters as for the real data (d axial ¼ 0:25 μm 2 =ms; d radial ¼ 0:05 μm 2 =ms) whilst the dispersion was initialised to ODI ¼ 0:5. The optimisation was constrained such that the ODI, d axial and d radial were positive and d axial ! d radial . It was preferable here to use MCMC for optimisation (instead of fmincon above) for two reasons. Firstly, the multiple MCMC samples identify combinations of parameters which produce an equally good fit. This allowed us to investigate both the precision and accuracy of the model. Secondly, as there were fewer model parameters to fit (2 diffusivities and the ODI, rather than the 30 free parameters above), the optimisation was better suited to MCMC methods. In these simulations of a single fibre population, the number of model parameters was greatly reduced as we fit a single-fibre FOD of symmetric dispersion, characterised by the ODI. Due to the tight restrictions on the form of the FOD, this simplified model was not used on real data. In real data, we describe the FOD using spherical harmonics coefficients rather than the ODI of a Watson distribution, allowing for greater complexity in the FOD shape (e.g. multiple peaks with complex A.FD. Howard et al. NeuroImage 201 (2019) 116014 dispersion). The above procedure was first performed for a fibre population oriented along the histological plane (φ ¼ 0) where the histological data was most informative of the FOD shape, and then repeated for various inclinations, φ. Finally, the simplified model was used to investigate the robustness of the joint model to mis-alignment of the histology-dMRI data. Here the histology FOD was rotated with respect to the dMRI data and the model evaluated for its accuracy and precision in estimating the ODI and radial diffusivity.
For the second set of simulations, a cross-validation approach was used to determine an appropriate weighting of the model complexity penalty, λ complex . The complexity penalty aims to minimise spurious peaks in the FOD by penalising non-zero spherical harmonics which are unsupported by the data. If λ complex were too low it would be ineffective, but if set too high, the spherical harmonic coefficients would be unable to accurately describe the FOD. The latter would be identified in a crossvalidation study as it would result in a high error of the model with respect to a validation dataset. For the cross-validation study, data was simulated for a single-fibre population (as described above) and divided into training and validation datasets of equal proportions. That is, each set included half of the gradient directions, distributed fairly evenly across the sphere. For various λ complex , the model parameters d axial , d radial , and the spherical harmonic coefficients were fit to the training data, after which the out-of-sample residual error was calculated.

Postmortem data acquisition
This study analyses formerly obtained dMRI and histology data; for a detailed description of the samples, data acquisition and post processing see Mollink et al. (2017). Briefly, postmortem brain tissues, which had been immersion fixed in formalin, were obtained from the Oxford Brain Bank, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK. The postmortem tissues were from three male donors with no known neurological conditions who died of non-brain related disease. From each brain, a 5 mm thick coronal section was extracted at the level of the anterior commissure to include various anatomical regions of interest: the corpus callosum, centrum semiovale, corticospinal tract, and cingulum bundle.
dMRI data was acquired on a 9.4T preclinical MRI scanner equipped with a 100 mm bore gradient insert (Varian Inc, CA, USA) and with a maximum gradient strength of 400 mT/m. At a spatial resolution of 0.4 mm isotropic, a diffusion-weighting of b ¼ 5000 s/mm 2 was achieved with a single-line spin-echo sequence (TE ¼ 29 ms, TR ¼ 2.4s) over a total of 120 gradient directions. An additional eight images were collected with negligible diffusion weighting (b ¼ 8 s/mm 2 ). This equated to approximately 8 h 45 min scan time per tissue specimen.
After MRI scanning, the tissue was sectioned in half along the anterior-posterior direction. One half was processed for classic immunohistochemical staining of myelin and the other for polarised light imaging. As previous work (Mollink et al., 2017) determined that the myelin staining provided better predictions of dispersion (those more consistent with dMRI data), this study focused solely on these histological images. For immunohistochemical staining, the tissue was paraffin embedded and sectioned at 6 μm along the coronal plane. To visualise the myelin content, three sections of tissue were stained with antibodies against proteolipid protein (MCA839G; Bio-Rad; 1:1000) and imaged at a spatial resolution of 0.28 μm/pixel. Structure tensor analysis (Bigun et al., 2004;Budde and Frank, 2012;Seehaus et al., 2015) of the histological images estimated the primary fibre orientation per pixel (see Appendix A for details). 1400 Â 1400 fibre orientations were then combined into a frequency histogram to generate a 2D histology-derived FOD, FOD 2D;micro , at a spatial resolution comparable to the 0.4 mm dMRI data. Due to the symmetric nature of the dMRI-derived FOD, we calculated a symmetric (rather than asymmetric) FOD from the histology data, which was fully described over θ 2 ½ À π=2;π=2. The FOD 2D;micro was then normalised such that P θ FOD 2D;micro ðθÞ ¼ 1. To allow for voxel-wise comparison of the multimodal data, the dMRI and histology were co-registered using a 2D registration based on a Modality Independent Neighbourhood Descriptor (MIND) (Heinrich et al., 2012) algorithm. The MIND algorithm computes a local modality-indepdent similarity metric and so was highly applicable to the multimodal data in this study. Co-registration of MRI-microscopy data is typically challenging due to the large number of deformations (e.g. shearing or tearing) that may occur in the histology tissue preparation. Nonetheless, previous evaluation (Mollink et al., 2017) of the registered data found the tissue boudaries to be generally well aligned (within one 0.4 mm MRI voxel (Fig. 3)). All three specimen show especially good registration in the corpus callosum facilitating robust voxel-wise comparisons. However, slight mis-alignment was apparent at some tissue boundaries, particularly in specimen 3.

Model validation
Using simulated data we first assessed whether the joint model was able to separate fibre dispersion from radial diffusion as intended. To determine an appropriate weighting of the histological data, λ micro , we examined the model specificity and accuracy when estimating d axial , d radial . and the ODI. Fig. 4 considers a FOD oriented along the histological plane, where the histology is most informative. Fig. 4a shows samples from fitting a simulated FOD using MCMC where the ODI, d axial and d radial were optimised simultaneously. With only dMRI data included in the model (Fig. 4a, left), we found a clear degeneracy between fibre dispersion and radial diffusion as expected. A similar degeneracy exists between d axial and the ODI which is not shown. In comparison, Fig. 4a (right) depicts the same fit but with both histology and dMRI data included in the model. Here the histology data acted to constrain the within-voxel fibre dispersion. Even though the histology data was only informative of the fibre dispersion in 2D, the joint model was able to overcome the degeneracy and estimate both the fibre dispersion and radial diffusion simultaneously.
In the model, the histologically-derived error was weighted by a regularisation factor λ micro ; when λ micro ¼ 0, the histological data was excluded from the model, corresponding to the left part of 4a. Fig. 4b shows how the accuracy of estimating d radial and the ODI varies with respect to λ micro . When λ micro ¼ 1, we observe a break in the degeneracy between d radial and the ODI, demonstrated by the increase in precision Fig. 3. Previous evaluation of the dMRI-histology registration (reprinted with permission from Mollink et al., (2017)). Generally, the co-registered data shows good alignment, with many of the tissue boudaries within one dMRI voxel of each other (i.e 0.4 mm). There are also areas of poor alignment, perticularly in specimen 3, highlighted by the blue arrows. and accuracy.
In Fig. 4a and b the FOD was oriented along the histological plane, such that the 2D histological data was highly informative of the FOD shape. In contrast, the histological data should be minimally informative for FODs oriented out of the histological plane. Fig. 5 evaluates the performance of the model for fibre FODs with increasing inclination, φ, with respect to the histological plane. For λ micro ! 1, both d radial and the ODI were estimated with good accuracy for fibre inclinations of up to 60 . Accurate estimation of d radial and the ODI for fibres of higher inclination required increasing values of λ micro . Finally, for an FOD oriented perpendicular to the histological plane, the degeneracy between the ODI and d radial remained for all λ micro .
On occassion, in Fig. 5, the joint model appeared to estimate radial diffusivities higher (φ ¼ 15 ∘ ) or lower (φ ¼ 45 ∘ ) than the ground truth. Notably, the estimated radial diffusivity is consistent across λ micro ! 1. When evaluating noiseless data, the joint model repeatedly recovered the ground truth values of both radial diffusivity and ODI (data not shown). Therefore, we conclude that these deviations are an artefact of the dMRI noise, not the performance of the model. Fig. 5 suggests that it is optimal to set λ micro to a very high value (! 100). Since the simulated histology FOD is both noiseless and perfectly aligned to the dMRI data, this is to be expected. For a large λ micro , the histological FOD becomes similar to a 'ground truth' or a hard constraint. In real data however, there are various sources of noise in the histology images (e.g tissue deformation or tearing) and some degree of mis-registration of the dMRI-histology data. This will result in a lack of spatial overlap and/or a rotation of the histology data with respect to the dMRI data. Fig. 6 considers the robustness of the joint model to rotations, Δθ, of the histology FOD with respect to the diffusion data. As in Fig. 4, Fig. 6 considers simulated data for a single-fibre population oriented in the histological plane (φ ¼ 0). For large λ micro > 1 and rotations Δθ ! 12 ∘ , the model estimated both the ODI and radial diffusivity with Fig. 4. Overcoming the degeneracy between fibre dispersion and radial diffusion. Here, simulated data for a single-fibre FOD has been fit using MCMC. Note, the fibre orientation dispersion index (ODI) ranges from 0 (no dispersion) to 1 (isotropic dispersion) and the light grey lines represents the ground truth. Here we simulate data for a single fibre population oriented along the histological plane, where the histological data is most informative. a) Each point shows a sample from MCMC where the cost function of the joint model has been minimised with equally good fit. When the model only considers the dMRI data (left), we see a clear degeneracy as expected. However, the degeneracy may be overcome through the inclusion of histology data to the model (right). Part b) examines the effect of λ micro , the weighting of the histological data. When λ micro ¼ 1, the joint model was able to estimate both the ODI and radial diffusivity to a high degree of accuracy and precision. Fig. 5. How the accuracy of the joint model, for a given λ micro , is affected by the FOD inclination with respect to the histological plane, φ. As in Fig. 4 , data was simulated for a single-fibre FOD and fit using MCMC. The light grey line represents ground truth values. When λ micro ¼ 1, the joint model was able to faithfully estimate both the ODI and radial diffusivity for fibres of inclinations up to 60 ∘ out of the histological plane. For an inclination of φ ¼ 45 ∘ and when λ micro ! 1, the joint model estimated a value of radial diffusivity lower than the ground truth. The estimated value appears consistent across λ micro ! 1. When investigating noiseless data, the model consistently estimated radial diffusivities very similar to the ground truth (data not shown). Therefore we can conclude that such discrepancies between the estimated radial diffusivity and the ground truth (which are consistent for λ micro ! 1) are artefacts due to noise, not the performance of the model. increasing precision, but with reduced accuracy. For λ micro ¼ 1, the model could estimate the ODI and radial diffusivity with good precision and accuracy both for rotations Δθ 12 ∘ (Fig. 6) and fibres of inclination φ 60 ∘ (Fig. 5). Therefore, λ micro ¼ 1 was chosen for all further optimisations: the histology FOD was considered to a be a soft-constraint and the model estimates were less susceptible to mis-registration or histology artefacts.
With λ micro ¼ 1, a procedure of cross-validation was used to determine the weighting of the complexity penalty, λ complex . Fig. 7 plots the out-ofsample residual error with respect to λ complex . To remove spurious peaks from the FOD requires the highest complexity penalty supported by the data (i.e. one which does not detrimentally increase the residual error). Consequently, λ complex ¼ 1 Â 10 À3 was considered optimal. Fig. 8 compares the fit of the joint model and CSD to both the dMRI and histological data. For standard CSD, the FRF was estimated empirically from the most likely 'single-fibre' voxels using the Tournier algorithm (Tournier et al., 2012(Tournier et al., , 2013. Once estimated, the FRF was held constant across the sample and used as a deconvolution kernel to estimate, in each voxel, the FOD from the dMRI data. In comparison, the joint model acted as a forwards model, fitting to both the dMRI and histology data to estimate the FRF and FOD on a voxel-by-voxel basis. Fig. 8 shows the mean absolute percentage error between the predicted diffusion signal (convolution of the FOD and FRF, Eq. (1)) and the dMRI data (top), and the Kullback-Leibler divergence (Eq. (4) and (5)) between the histological data and the model FOD when projected into the histological plane (bottom).

Residual error
When compared to CSD, the joint model shows a slight increase in the dMRI-associated error which is concurrent with a largely improved fit to the histological data. This is as expected: CSD by definition minimises the error with respect to the dMRI data. In comparison, the joint model considers additional information about the 'true' fibre dispersion (from the histology data) to avoid overfitting.

Variations in axial and radial diffusivities
Standard CSD methods define a global FRF and thus assume that the diffusion characteristics of the tissue remain constant across the sample. In contrast, the joint model overcomes the degeneracy between fibre dispersion and radial diffusion to estimate the FRF on a voxel-by-voxel basis. In the joint model the FRF is considered an axially symmetric diffusion tensor described by d axial and d radial . So, the joint model can estimate how the axial and radial diffusivities (and thus the FRF) vary voxel-wise across the sample. Fig. 9 shows considerable variation in the estimated axial and radial diffusivities across all three specimens. Here we should again note the less accurate registration of dMRI and histology data in specimen 3 (Fig. 3). Previous work (Mollink et al., 2017) demonstrates the mis-alignment of the tissue around the grey matter tissue boundaries of Fig. 6. Evaluating the robustness of the joint model to mis-alignment (here rotation) of the histology FOD with respect to the diffusion data. Data was simulated for a single fibre population oriented in the histological plane, the histological FOD was rotated Δθ with repsect to the dMRI data, and optimised using MCMC. Again, the light grey line represents ground truth values. For λ micro > 1 and rotations Δθ ! 12 ∘ , the joint model estimated the ODI with high precision, but without accuracy: the ground truth no longer lies within the distribution of estimated values. For λ micro ¼ 1, the model could estimate the ODI and radial diffusivity with both good precision and accuracy for rotations Δθ 12 ∘ . Fig. 7. Determining λ complex , the weighting of the complexity penalty. A complexity penalty was added to prevent spurious peaks occurring in the FOD. Left: Having performed cross-validation on simulated data, the residual error of the out-of-sample data was estimated for various λ complex . λ complex ¼ 1Â 10 À3 was chosen for all future optimisations due to the unsubstantial increase in the residual error. Right: An example voxel from the postmortem dataset. The voxel was optimised both with a very low complexity penalty (λ complex ¼ 1 Â 10 À5 ) and with λ complex ¼ 1 Â 10 À3 . We see how the complexity penalty successfully minimises the presence of spurious peaks in the FOD which are unlikely to be biologically meaningful. A.FD. Howard et al. NeuroImage 201 (2019) 116014 specimen 3 as well as the more robust registration across the corpus callosum. Values of axial and radial diffusivity outside of the corpus callosum in specimen 3 should therefore be viewed with a sceptical eye. Nonetheless, in Fig. 9 (top) we generally see a similar pattern of variation in both axial and radial diffusivities, with particularly low values of diffusivity often found in the corpus callosum.
To assess how the diffusivities vary with anatomical regions of interest, Fig. 9 (bottom) compares the distribution of axial and radial The error of the model with respect to the histological data, defined as the symmetric Kullback-Leibler (KL) divergence. When KL ¼ 0, the 2D-FODs are identical. Compared to CSD, there is a slight increase in the residual error of the joint model with respect to the dMRI data. This is to accommodate the highly improved fit to the histological data. Fig. 9. Heatmaps (top) and raincloud plots (bottom, specimen 1 only) show how the axial and radial diffusivities (and thus the FRF) vary on a voxel-wise basis across the white matter. We find notably lower radial diffusivities in the corpus callosum compared to other white matter tracts: the centrum semiovale, corticospinal tract and cingulum bundle. Additionally, we find two apparent distributions of radial diffusivities in the white matter. These results challenge standard CSD methods where a global FRF is assumed constant across the sample. Note the change in scale bar between the axial and radial diffusivities, with the radial diffusivity generally lower. diffusion in all white matter voxels to those only in the corpus callosum, centrum semiovale, corticospinal tract and cingulum bundle. We see considerably lower diffusivities in the corpus callosum, as well as two distinct distributions of radial diffusivities across all white matter voxels. Fig. 10 compares three ways of estimating the FRF from the dMRI data alone: the two conventional approaches estimate a global FRF, whilst the joint model estimates the FRF on a voxel-by-voxel basis. The FRF was first estimated from the 300 highest FA voxels and from the most-likely single-fibre voxels defined by the iterative Tournier algorithm (Tournier et al., 2012(Tournier et al., , 2013. This was compared to the joint model estimates of the FRF for voxels in the corpus callosum, centrum semiovale and corticospinal tract. To consider local changes in the FRF (rather than the voxel-wise variations of Fig. 9), Fig. 10 shows the mean FRF across a 2 Â 2 neighbourhood of voxels in each anatomical region.

Fibre response function
Here both conventional methods estimated the FRF from voxels either in the corpus callosum or the cingulum bundle (data not shown): both of which are known to contain coherently oriented, single-fibre voxels. In Fig. 10, we see how the conventional FRFs most closely resemble those estimated by the joint model in the corpus callosum, as expected. Although the FRFs are similar, there are also noteable differences. In the corpus callosum (Fig. 10a), the joint model estimated an FRF with lower radial diffusivity and higher diffusion anisotropy (d axial À d radial ) when compared to the conventional FRFs. This would be consistent with standard methods conflating fibre dispersion with diffusivity in the voxels used to generate the FRF.
In the joint model, the FRF appeared to vary considerably with neuroanatomy. Compared to the corpus callosum, Fig. 10 depicts a higher radial diffusivity in the corticospinal tract as well as a more spherical FRF in the centrum semiovale.
Notably, Fig. 10 demonstrates FRF variability across the corpus callosum, which is typically considered a homogenous tract of coherently oriented fibres. Comparing the example FRF's in Fig. 10a and b, the midline of the corpus callosum is characterised by lower axial diffusivity, higher radial diffusivity and a subsequent decrease in diffusion anisotropy.

FODs
Notable changes to the estimated FODs were also observed. Fig. 11 (top) shows representative voxels from the corpus callosum of specimen 1 where, when compared to the CSD, the joint model estimated narrower FODs and a lower degree of dispersion in the underlying fibres. This finding was consistent across many (but not all) voxels in the corpus callosum of all three specimens.
Furthermore, in the joint model we sometimes observed the emergence of additional distinct fibre populations in the FOD. Fig. 11 shows an example from the centrum semiovale; a known region of crossing fibres. Fig. 11 right ('FOD 2D ') shows the improved fit of the FOD when projected into the histological plane and compared to the histologically derived FOD 2D;micro . This is quantified by the reduced KL divergence as shown in Fig. 8. We see here how the histological data acts as a soft constraint on both the fibre orientation and amount of dispersion in the MRI voxel.
3.6. Biasing the shape of the FOD: model limitations Fig. 12 demonstrates two cases where, due to limitations of the 2D histological data, the histology may bias the joint model prediction of the FOD. Firstly, the histology may underestimate, or totally omit, the volume fraction of a secondary fibre population (Fig. 12 top). We see how the joint model thus attempts to minimise the presence of a secondary fibre population, the resultant FOD being perhaps biased or biologically improbable.
Secondly, the histological data is most sensitive to fibre bundles in, or close to, the histological plane. This effect was first observed in simulated data (Fig. 5) where the joint model was able to accurately estimate the fibre dispersion and radial diffusion for fibres of inclination angles φ 60 ∘ , but struggled to faithfully reconstruct highly inclined FODs with good accuracy. As the fibre inclination was increased, the histology FOD became evermore circular and information about the FOD dispersion was lost: for highly-inclined fibres, the model again became degenerate. Fig. 12 (bottom) shows a single voxel from real data in two orthogonal Fig. 10. Variations in the observed fibre response function (FRF); here the FRF describes the diffusion profile of a single fibre bundle aligned vertically with respect to the page. The axial and radial diffusivities which describe the FRF are plotted below. A global FRF was first estimated from the 300 highest FA voxels and the Tournier algorithm (Tournier et al., 2013) which selects the most likely 'single-fibre' voxels (left). This was compared to the FRF estimated by the joint model in the corpus callosum, centrum semiovale and corticospinal tract. The joint model estimates the FRF on a voxel-wise basis. However, here we consider local changes in the FRF; the FRFs shown were characterised by the mean axial and radial diffusivities over a 2 Â 2 neighbourhood in each anatomical region. We see how the joint model predicts in general a more anisotropic (oblate) FRF in the corpus callosum and that the FRF appears to vary considerably with anatomy.
views: looking down on the histological plane (view 1), and perpendicular to the plane (view 2). In CSD, we see a single fibre population, primarily oriented out of the histological plane; the CSD-derived 2D FOD is consequently fairly circular. The histology-derived FOD however looks substantially different. This may be due to the histology-derived FOD over-emphasising either in-plane components of mostly through-plane fibres or minor fibre populations which lie in the histological plane.
Alternatively, there could be mis-registration of the dMRI and histology data such that the two datasets are incompatible. Consequently, as the joint model tries to fit the histological data, it assumes a much less biologically plausible FOD shape. Both examples in Fig. 12 were taken from the centrum semiovale on the left hand side of specimen one: a known region of complex fibre configurations and, in this specimen, a region of high dMRI-associated error in the joint model. Fig. 11. 3D fibre orientation distributions (FODs) derived from CSD and the joint model in both the corpus callosum (top) and centrum semiovale (bottom). Top: The joint model estimates narrower FODs and lower dispersion in the underlying fibres. Bottom: In the centrum semiovale, additional distinct fibre populations are evident in the joint model FODs when compared to CSD. Notably, these peaks are recognisable in both the CSD FOD and the histology independently, and are not introduced by the histology only. In the right-hand pane ('FOD 2D ') the CSD and joint model FODs have been projected onto the plane of the histological data and are compared to the histology-derived FOD. Here we see the greatly improved fit of the joint model FOD to the histology. Fig. 12. Two cases where the histological data might bias the 3D FOD predicted by the joint model. Histology may under-estimate the volume fraction of in-plane secondary fibre populations (top), or omit out-of-plane fibres from the histology-derived 2D FOD (bottom). Here the 3D FOD is seen from two orthogonal views: looking onto (view 1) and out of (view 2) the histological plane (denoted x/y).

Orientation of the FOD peak
In the joint model, the histological data constrains both the fibre orientation and amount of dispersion in each dMRI voxel. This is likely to affect the orientation of the estimated FOD peak in two respects. Here we define the peak orientation by the azimuthal angle in, and inclination out of, the histological plane. Firstly, any mis-registration of the dMRI and histology data may cause a rotation in the azimuthal angle. Secondly, as the histological data is most informative of in-plane fibres, the joint model may favour FODs of low inclination. Fig. 13 investigates these effects by comparing the orientation of the primary peaks of FODs estimated by both CSD and the joint model. Fig. 13a shows the angular deviation of the two peaks where the three samples have a median difference of 12:8 ∘ , 12:9 ∘ and 13:5 ∘ . Fig. 13b and c compare the azimuth and inclination of the primary peaks respectively. We see how all three specimen show very close correlation with the robust regression line close to the line of unity and r-values between r ¼ 0:93 À 0:98 . Note that here we use robust linear regression to limit our sensitivity to outliers and that the r-values of robust regression (here calculated using MATLABs fitlm (MathWorks, 2017)) are typically inflated when compared to standard linear regression.
In 13b specimen 1 we see a prevalence of joint model estimated FOD peaks at azimuthal angles of θ ¼ AE90 ∘ , i.e along the superior-inferior axis. This is due to an almost equal prevalence of histology-defined FODs with peaks at these angles and is probably caused by an artefact in the histology data. This however does not appear to greatly impact either the inclination (Fig 13c) or total angular difference (Fig. 13a) between the CSD and joint model derived FODs of specimen 1. Overall, Fig. 13 indicates a markedly stable relationship between the FOD peaks derived from CSD (dMRI analysis only) and the joint model (combined dMRI and histology analysis).

Discussion
In the joint model we utilised a data-fusion framework to combine co-registered dMRI and histology data from the same tissue samples, acquired from three postmortem human brains. This allowed us to test the validity of a brain-wide fibre response function as used in CSD (Tournier et al., 2007). With the inclusion of histological data, we were able to constrain both the fibre orientation and amount of dispersion within each MRI voxel, albeit in 2D. Consequently, the joint model could overcome the degeneracy between fibre dispersion and radial Fig. 13. Comparing the orientation of the primary FOD peak derived from both CSD and the joint model. The peak orientation is described by an azimuth and an inclination angle which is defined with respect to the histological plane (inclination ¼ 0 ∘ ). In the joint model, any mis-registration of the dMRI and histology data may result in a rotation of the peak azimuth. In addition, as histology is most informative of fibres lying in or close to the histological plane, the joint model might favour FODs of low inclination angles. In all 3 specimen there is a low angular deviation (a) between the CSD and joint model FOD peaks, with good correlation between the azimuthal (b) and inclination (c) angles: the robust regression lines are close to unity with r-values between 0.93 and 0.98. diffusion to estimate the FRF and FOD simultaneously on a voxel-by-voxel basis.
Using simulated data, we evaluated the model specificity and accuracy. For simulated data of single-fibre populations, the joint model provided particularly robust estimates of both orientation dispersion and radial diffusion for fibre populations in or close to the histology plane (up to inclinations of 60 ∘ ), where the histological data was most informative. However, in both simulation and post-mortem data the joint model was less effective at reconstructing out-of-plane fibre populations. This was due to an inherent limitation of the microscopy technique, where fibres oriented out of the plane aren't readily detected in the 2D histological data. The bias of the model could be reduced if the microscopy was able to reconstruct the FOD in 3D; 3D polarised light imaging developed by Axer et al. (2011), or 3D confocal microscopy from Schilling et al. (2016) being two examples of such techniques. In addition, future work could take a more heuristic approach where the regularisation factor λ micro could depend on fibre inclination, as a proxy of how informative the microscopy data is of the 3D FOD.
In this joint model, the FRF was characterised by axial and radial diffusivities, which appear to vary considerably across the human brain. Generally, both diffusivities seem to follow a similar pattern of variation, with particularly low values of radial diffusivity found in the corpus callosum. Contrary to the assumption of a brain-wide FRF in CSD, our results demonstrate that the FRF is dependent on the local anatomy. This supports recent findings by Schilling et al. (2019) who used 3D confocal microscopy to investigate FRF variation across the brain of a squirrel monkey. FRF variation is expected as tissue characteristics such as axonal density, packing and myelination are unlikely to hold across the entire brain (Walhovd et al., 2014). For example, areas of the corpus callosum have been reported to have only 30% myelination (Sturrock, 1980) which would significantly impact the FRF; decreased myelination has been shown to be associated with high radial diffusivity (Beaulieu, 2002;Song et al., 2002Song et al., , 2005. Indeed, characteristically different fibres should each have their own unique FRF, although in practice this would be difficult to model. Not only does FRF variation challenge the assumption of the brainwide FRF used in CSD, but also implies that current analyses may be biasing the shape of their reconstructed FODs throughout the brain. Indeed, in known regions of crossing fibres such as the centrum semiovale, the joint model estimated distinct secondary fibre populations which were often recognisable but not well defined in the CSD-derived FODs. Furthermore, the joint model often estimated FODs of lower dispersion in single-fibre voxels, particularly across the corpus callosum. Current algorithms which estimate the FRF directly from the data (Tournier et al., 2013;Tax et al., 2014) typically estimate the FRF from voxels in the corpus callosum which exhibit high fractional anisotropy. However, even these voxels are unlikely to contain a perfectly coherent fibre bundle. Indeed, recent studies of both histology (Budde and Annese, 2013), polarised light imaging and dMRI data (Mollink et al., 2017) have found significant levels of dispersion in the corpus callosum, particularly at the midline (Mollink et al., 2017). To obtain accurate estimates of the FOD, both this study and that by Schilling et al. (2019) support the development of analyses which estimate a local rather than global FRF. Both approaches are however limited as they require co-registered dMRI and microscopy data in postmortem tissue samples. Building on the work by Lasi c et al. (2014) and Szczepankiewicz et al. (2015), Cottaar et al. (2019) combine linear and spherical tensor diffusion encoding to overcome the degeneracy between fibre dispersion and diffusion anisotropy to estimate both on a voxel-wise basis, with highly consistent results. Similar techniques could provide exciting new avenues for local FRF estimation in in vivo data and more faithful FOD reconstruction in spherical-deconvolution based analyses.
Multimodal datasets also present a number of challenges: one being that the microscopy data is only representative of a subset of the tissue microstructure. Here the tissue sections were stained to visualise the myelin content of the tissue, yet only pixels above a certain staining intensity were recognised as myelinated fibres. Fibres that failed to meet the staining threshold (perhaps those with with less myelin and typically lower diameter), and other cell types (e.g. glia) may have therefore been underrepresented or omitted from the microscopy-derived FOD. Secondly, as the histological stain was only sensitive to myelin, other cell types (e.g. glia) were excluded from the microscopy FOD. If the unmyelinated axons or glia both contributed to the diffusion signal and acted to increase the dispersion of the true FOD, the FOD of the joint model would underestimate the amount of dispersion compared to that of the true contributing microstructure. Nonetheless, as seen in Fig. 11, the joint model did not always act to fit to the histological data perfectly. Indeed the joint model often estimated a slightly higher degree of dispersion than prescribed by the histology and on occasion assumed a substantially different FOD, if this was both driven and supported by the dMRI data. This is a result of the histology acting as a soft constraint, rather than fixing the fibre orientation and dispersion.
The postmortem dMRI data used here presents a second challenge. Although the data provides high spatial resolution (0.4 mm isotropic), it has relatively low diffusion contrast. This is likely due to the postmortem interval (time between death and fixation) and immersion fixation of the human brains from which our samples were taken. It is most likely that the meninges, specifically the pia mater and the components of the dura, were attached during immersion fixation of the postmortem brains in this study. As the samples originated from the centre of each brain, the uptake of fixative to this area will have been particularly slow as fixative will have had to diffuse from the cortical surface to reach the callosum, increasing the apparent postmortem interval. Prior to fixation, postmortem tissue may begin to decompose, wherein both the postmortem interval and fixation method have been shown to change the diffusion properties of the tissue (Shepherd et al., 2009;Miller et al., 2011). This likely explains both the inter-specimen variation in estimated diffusivities and why the diffusivities reported here are low when compared to in vivo data where, for comparison, the NODDI model assumes d axial ¼ 1:7 μm 2 =ms (Zhang et al., 2011(Zhang et al., , 2012. It is important to note that the joint model generalises to any dataset which includes dMRI alongside microscopy data from which we can estimate fibre orientations. This encompasses existing, open access, dMRI and histology datasets (Schilling et al., 2016(Schilling et al., , 2018Mollink et al., 2017), as well as data from alternative microscopy techniques such as the mesoscopic fibre orientations obtained from polarised light imaging (Axer et al., 2011;Leuze et al., 2014;Mollink et al., 2017). As such, future work aims to apply the CSD-based joint model to other datasets, of both human and monkey (perfusion-fixed at death) brain tissue.
Thirdly, the joint model simultaneously analyses co-registered dMRI and microscopy data from the same tissue sample. The accuracy with which the model estimates the FOD, axial and radial diffusivities is therefore dependent on the quality of the registration. dMRI-histology registration is particularly challenging as histological images suffer from various non-linear tissue deformations (e.g. due to sectioning, tissue shrinkage or tearing) or imaging artefacts such as dirt, air bubbles or uneven immunohistochemical staining. This study analyses previously published, pre-processed data where the dMRI and histology data was registered using a Modality Independent Neighbourhood Descriptor (MIND) algorithm (Heinrich et al., 2012). Previous evaluation (Mollink et al., 2017) of the tissue boundaries (Fig. 3) showed generally good alignment of the dMRI-histology data with the tissue boundaries mostly within one MRI voxel (0.4 mm). Nonetheless, there are areas which show less good alignment where the joint model will estimate the FOD and diffusivities with a higher degree of uncertainty. The development of reliable tools the for high-quality co-registration of microscopy and dMRI data (Alegro et al., 2016;Iglesias et al., 2018;Huszar et al., 2019) will greatly aid future joint-modelling work and the field of dMRI validation in general.
With very high-quality co-registered data, the joint modelling approach could further take advantage of the highly detailed fibre ori-entations obtained from histology. Here, we combine 1400Â 1400 histologically-derived fibre orientations into a frequency histogram to create a 2D FOD which is considered symmetric about the origin. The FOD is therefore essentially a summary measure, where ultra-highresolution information about the spatial distribution of the fibre orientations within the MRI voxel is lost. Instead, the histology data could describe an asymmetric FOD and the joint-modelling approach could be used to either validate or drive the modelling of FOD asymmetry (Bastiani et al., 2017). Alternatively, we could use the joint model to directly predict the diffusion signal (through the convolution of the FRF and FOD, c.f. Eq. (1)) in each histology pixel, or over a small local neighbourhood, much smaller than the MRI voxel. In this manner, a joint modelling approach may be able to differentiate crossing fibres from those in fanning, bending or kissing configurations which are known to produce the same diffusion signal .
Finally, this report looks through the lens of a convolution-based joint model to detail both the advantages and challenges of a dMRI and microscopy data-fusion framework. As such, we recognise this model to be only one example of a joint modelling approach. The microstructure model at the centre of the data-fusion framework can itself take a range of forms, or be extended to increase both its complexity and specificity. For example, future work will consider a multi-compartment model of the fibre response function, which distinguishes intra-from extra-axonal space, each characterised by a unique diffusion profile. In this manner, we hope to disentangle which tissue compartment is driving the FRF variation across the brain. Alternatively, the joint-modelling approach could be utilised to obtain high-precision estimates of fibre dispersion from, for example, a NODDI-based model (Zhang et al., 2011). Lastly, future joint models will benefit from a Bayesian framework where the datasets are combined optimally in the sense of precision-weighting, i.e. the shared parameters get information from the fused datasets while accounting for uncertainties inherent to the data, noise, and model structures. The data and code related to this work is publicly available at https://git.fmrib.ox.ac.uk/amyh/jointmodelling. The original data can be downloaded from http://www.fmrib.ox.ac.uk/DigitalBrainBank.

Conclusion
The joint model takes a data-fusion approach, combining dMRI and histology data from the same tissue sample to investigate the diffusion properties of white matter under conditions where multiple modalities are informative of the 'true' tissue microstructure. In the model, histology acted as a soft constraint on both the fibre orientation and amount of dispersion in each dMRI voxel. This allowed us to overcome the degeneracy between fibre orientation dispersion and radial diffusion, to estimate the FOD and FRF on a voxel-by-voxel basis. Our results demonstrate how the diffusion characteristics of a single fibre (here characterised by axial and radial diffusivities) vary considerably across the brain. Notably, the diffusivities were found to vary both between and within white-matter tracts; even in the corpus callosum where the microstructure is typically considered fairly homogenous. These findings contradict the assumption of a brain-wide FRF, currently used in many deconvolution-based analyses. Finally, our results suggest that current diffusion models may be overestimating dispersion in single-fibre voxels and underestimating the number of distinct fibre populations in known regions of crossing fibres; both of which have important implications for tractography (Behrens et al., 2007). producing FOD joint ðθ; φÞ. Due to the rotational symmetry of the 3D FOD, only a hemi-sphere (θ; φ 2 ½ À π=2; π=2) was required. The projection of FODðθ; φÞ onto the 2D plane was thus defined as, FOD 2D;joint ðθÞ ¼ X π=2 φ ¼Àπ=2 FOD joint ðθ; φÞcosðφÞΔφ: (B.1) The FOD 2D;joint ðθÞ was then normalised such that P θ FOD 2D;joint ðθÞ ¼ 1. As FOD 2D;joint and the microscopy-derived FOD 2D;micro were defined over the same values of θ, the discrete distributions were directly comparable.
Fig. 14. Using structure tensor analysis (Bigun et al., 2004;Budde and Frank, 2012;Seehaus et al., 2015) of histological images to derive a 2D FOD. Left: A section from specimen 3 was immunohistochemically stained with antibodies against proteolipid protein (MCA839G; Bio-Rad; 1:1000) and imaged at a spatial resolution of 0.28 μm/pixel. Right: The red colour channel was then extracted for structure tensor analysis after which fibre orientations were combined into histologically-derived 2D FODs (overlaid). In the corpus callosum (top) we see coherently oriented fibres and low dispersion in the FOD. In comparison, in the centrum semiovale (bottom) we see crossing fibres in the histology image which are reflected in the FOD as multiple, distinct fibre populations and more disperse fibre profiles. In addition, we see small areas of tissue tearing (at the tissue boundary, top right) and degradation (in the cingulum bundle), both of which are artfeacts commonly found in histology data.