Local volume fraction distributions of axons, astrocytes, and myelin in deep subcortical white matter

&NA; This study aims to statistically describe histologically stained white matter brain sections to subsequently inform and validate diffusion MRI techniques. For the first time, we characterise volume fraction distributions of three of the main structures in deep subcortical white matter (axons, astrocytes, and myelinated axons) in a representative cohort of an ageing population for which well‐characterized neuropathology data is available. We analysed a set of samples from 90 subjects of the Cognitive Function and Ageing Study (CFAS), stratified into three groups of 30 subjects each, in relation to the presence of age‐associated deep subcortical lesions. This provides volume fraction distributions in different scenarios relevant to brain diffusion MRI in dementia. We also assess statistically significant differences found between these groups. In agreement with previous literature, our results indicate that white matter lesions are related with a decrease in the myelinated axons fraction and an increase in astrocytic fraction, while no statistically significant changes occur in axonal mean fraction. In addition, we introduced a framework to quantify volume fraction distributions from 2D immunohistochemistry images, which is validated against in silico simulations. Since a trade‐off between precision and resolution emerged, we also performed an assessment of the optimal scale for computing such distributions. HighlightsAxon, myelin and astrocyte local volume fraction distributions are first reported.Elderly populations with various degrees of deep subcortical lesions were analysed.Statistical modeling allows accurate volume fraction estimation from slice microscopy.Significant demyelination and astrogliosis associate with deep subcortical Lesions.


Introduction
Brain tissue microstructural damage can result from neurodegenerative diseases such as amyotrophic lateral sclerosis, Parkinson's disease, and Alzheimer's disease (Stoessl, 2012;Tur et al., 2016;Pearson et al., 1985). These conditions produce gradual deterioration or even death of neurons with concomitant alterations in brain structure and function. Devising imaging techniques capable of characterising brain tissue microstructure in vivo is topical within neuroimaging. Key information about brain microstructure is provided by the volumetric densities of the different white matter (WM) structures (Horsfield and Jones, 2002). This knowledge might be valuable not only for research but also for its potential to help in developing early stage diagnosis of neurodegenerative diseases. The aim of this paper is to characterise the local volume fraction distribution of axons, astrocytes, and myelinated axons in deep white matter for different populations. These are important stereological parameters, but their distribution has not been previously identified.
Age-associated cerebral white matter lesions can be sub-classified into those within deep white matter (DWM) of the centrum semiovale (deep subcortical lesion, DSCL) and those close to the angles of the lateral ventricles (periventricular lesion, PVL). Each has its own clinical relevance (Park et al., 2011), but both are thought to be the consequence of small vessel-related vascular pathology such as vascular dementia. This work focus on DSCLs, which are associated with loss of myelin components (Wharton et al., 2015) and astrogliosis (Simpson et al., 2007(Simpson et al., , 2010. To this purpose, various subjects belonging to groups that represent healthy and diseased conditions were imaged. We analyse immunohistochemically stained sections of three populations of DWM samples: Control (no DSCLs were present in the subject), Lesion (the sample presented DSCLs), and Normal Appearing White Matter (NAWM, the subject presented DSCLs but not in the sampled tissue). Tens of thousands of structures such as axons, coexist in 1 mm 3 of brain tissue (Azevedo et al., 2009). Their arrangement varies between different subjects and also with the presence of disease. The information obtained from histological analysis has the potential to help in the description and understanding of healthy tissue, and also in a diverse range of conditions including multiple sclerosis (Peterson et al., 2001;Trapp et al., 1998), schizophrenia (Colon, 1972), and Alzheimer's disease (Stark et al., 2005). Volume fraction maps of the main white matter structures can further inform and validate magnetic resonance imaging (MRI) techniques. Prior distributions on the microstructural parameters of biophysical models can be generated from this kind of information.
MRI has become a clinical standard to diagnose brain diseases among other conditions in several body organs (Hollingworth et al., 2000). It has a spatial resolution considerably lower than histology. While MRI voxels are in the order of the millimetres, light microscopy can resolve structures smaller than a micron. While microscopy can discern individual structures, MRI can only detect the aggregate signal of the distribution of components within a voxel. However, MRI has the advantage of being a non-invasive imaging technique that can be used in vivo. Due to the limited resolution that can be achieved with MR scanners, a modality that has gained popularity is diffusion MRI (dMRI) (Assaf, 2008). Diffusion weighted images (DWIs) are sensitised to displacements in water molecules along pre-determined directions. By measuring across multiple orientations and processing the set of signals, this technique enables the extraction of information about the underlying tissue architecture within a voxel. A wide range of analysis methods have been developed in the dMRI literature to extract different information from the DWIs (Basser et al., 1994;Assaf and Cohen, 1999;Tournier et al., 2004;Jensen et al., 2005). Among them, a number of biophysical tissue models (Assaf et al., 2004;Jespersen et al., 2007;Alexander et al., 2010;Fieremans et al., 2011;Zhang et al., 2012;Jelescu et al., 2015;Reisert et al., 2017;Veraart et al., 2017) have been proposed that aim to describe degeneration processes with higher sensitivity and specificity than previous attempts to characterise tissue microstructure with Diffusion Tensor Imaging (DTI) or similar phenomenological models.
As in any other physical problem involving a model, the accuracy of the results relies on how representative the model is for the phenomenon under study (see recent review ). The validation of dMRI biophysical models is generally hindered by the complexity and unavailability of the ground truth. Some of the prominent dMRI biophysical models make unrealistic assumptions and hence renders the results of these models dubious (Lampinen et al., 2017). In addition, in absence of additional information, the precise estimation of the model parameters requires a huge amount of measurements. This is where the characterisation of V V distributions, or more generally information derived from histology, can play a key role. This information has the potential to improve the performance of existing tissue models and help in the validation of new ones. For example, Clayden et al. (2016) showed that by introducing structured prior information on model parameters, the accuracy in the estimation is improved. The interpretation of parameters from several existing dMRI techniques such as DTI or biophysical models has been previously validated using histological sections (cf. (Chenevert et al., 2000;Assaf et al., 2008;Jespersen et al., 2010;Xu et al., 2014;Sepehrband et al., 2015;Szczepankiewicz et al., 2016)). Additionally, combined analyses of histology and dMRI have been performed to further understand the development of certain diseases and the healthy brain (Budde and Frank, 2012;Kolasinski et al., 2012;Khan et al., 2016;Mollink et al., 2017). Information from histology can also help developing realistic in silico biomimetic phantoms of brain tissue (Cook et al., 2006;Beltrachini et al., 2015). Phantoms provide controlled ground truth that can test different dMRI acquisition schemes and post-processing methods.
Local volume fractions depend on the scale of the windows of observation. Previous works have only considered the global average V V of white matter structures for the whole brain or over complete regions (Tang et al., 1997;Xu et al., 2014;Sepehrband et al., 2015). There is little information on which scale should be considered for computing local volumetric density maps. As in other imaging fields, there is a trade-off here between precision and resolution (Chen et al., 2000;Kale et al., 2009). The choice of a small scale can lead to imprecise estimates due to the comparable size of the structures and the averaging window. Larger scales yield stable density estimates, but at the price of losing microstructural detail and hence be uninformative. To define a convenient scale of analysis, we computed the standard error in volume fraction estimates for windows of observation of various scales, together with the significant differences found between adjacent windows. In order to characterise different populations, we required histology data from a large cohort of subjects. The best option for this was immunohistochemistry. However, this modality produces slices with non-negligible thickness in comparison with the structures of interest. Thus, to recover the volume fraction from area fraction measures, we had to adapt and develop new stereological methods. These methods are an interesting additional contribution in themselves.
This paper addresses first the challenge of analysing the appropriate scale for computing local V V values. Second, the development of a method for an automatic computation of the V V intra-subject distributions from thin histology sections. Finally, this paper tackles the computation of local V V probability distribution functions in different populations of deep white matter.

Tissue sample selection
The tissue samples for this work came from the Cognitive Function and Ageing Study (CFAS) neuropathology cohort (Brayne et al., 2006;Cognitive Function and Ageing Studies (CFAS) Collaboration, 2017). Brains were removed with the consent of the next of kin and with multicentre research ethics committee approval, according to standard CFAS protocols (Fernando et al., 2004). Brains were removed within 60 h of death, one cerebral hemisphere was fixed in buffered formaldehyde and sliced into 10 mm thick coronal slices. These slices were: 1) immediately anterior to the temporal stem (anterior), 2) at the level of the pulvinar (middle), and 3) at the posterior most limit of the occipital horn of the lateral ventricle (posterior). These slices were scanned using T 1 and T 2 weighted MRI (details available in (Fernando et al., 2004)). The MR images were rated by three experienced observers (blind to clinical status) and given a score for DSCLs using a modified Scheltens' scale (Scheltens et al., 1993). Following this scoring, the coronal slices were stored in formalin until required for this study (at least four weeks). From every subject one block of approximately 20mm Â 20mm Â 10mm was sampled from one of the slices. Blocks were allocated in three groups: Control, NAWM, and Lesion. Control blocks were taken from cases where all three levels were scored as 0 on this scale or where only one slice had a score of a maximum of 1. Lesion blocks were taken from regions with a Scheltens' score of 4 or greater. NAWM blocks were taken from lesion free regions of deep white matter in which a DSCL of score 3 or greater was present elsewhere.
To decide the total number of samples for the study, we performed a pilot study using five samples for each group. We required that the standard error of the mean V V for every group needed to be below 0:5% for all structures. This resulted in the need of at least 25 samples from each group. To guarantee our requirement, we decided to run the complete experiment with 30 samples per group. Table 1 presents the main information of the selected patient cohort. Additionally, a baseline demographic analysis was performed to assess significant differences in the position of the samples or the sex of the subjects between the groups. No statistically significant differences were found using a Pearson's χ 2 -test.

Histology acquisition
The formalin-fixed blocks of tissue were processed to paraffin and embedded in paraffin wax using conventional protocols (Fernando et al., 2004). For each block, three sections of 5μm thickness were cut for immunohistochemistry (in plane dimensions of the samples were around 20 mm Â 20 mm, see Fig. 1). The sections were collected onto charged slides and underwent Ag retrieval with Access Revelation RTU (A. Menarini Diagnostics Ltd, Winnersh, UK) in a pressure cooker. Sections were immunostained for phosphorylated neurofilament (SMI31, an axonal marker), glial fibrillary acidic protein (GFAP, an astrocyte marker), and proteolipid protein (PLP, a myelin marker) using an intel-liPATH FLX system (A. Menarini Diagnostics Ltd, Winnersh, UK). These immunostaining markers were chosen due to being the best option for analysing ex vivo samples of these structures (Barker et al., 2013;Garwood et al., 2016;Sternberger and Sternberger, 1983). Immunohistochemistry was performed using a standard ABC method, visualised with diaminobenzidine tetrachloride (DAB), and the sections counterstained with hematoxylin. Prepared sections were scanned and digitised at 40X magnification using a Nanozoomer XR (Hamamatsu, Photonics Ltd., Hertfordshire, UK). The final resolution of the images was of 0:23μm= pix.

Histology segmentation
The area fraction (A A ) of a certain structure in a region of an image is defined as the ratio of the area occupied by the structure of interest against the total area of the region. We computed the A A maps using the area method (Ohser and Mucklich, 2000). To this effect, two segmentation methods were compared regarding their influence on the final results. The first step of the segmentations was to apply colour deconvolution (Ruifrok and Johnston, 2001) to change the representation of the RGB images to DAB, hematoxylin, and background channels. Only the DAB intensities were kept, leaving a single-channel image. The second step was local thresholding. The first segmentation was based on Otsu's method (Otsu, 1979), while the second one used k-means clustering (Arthur and Vassilvitskii, 2007) with two means. The DAB-channel images were divided into patches of 3001 by 3001 pixels, and the optimal thresholds were computed for each patch with the corresponding methods. These patch-wise thresholds were assigned to a cubic spline grid of control points centred at the middle of the corresponding patches. A smooth image with the original resolution was generated by this cubic spline interpolation, representing pixel-wise thresholds. Finally, they were applied to the corresponding pixels of the DAB-channel image resulting in the segmented binary image.
To assess the accuracy of the segmentation algorithms, we compared their performance against two experienced observers (JRH and JES). For each preparation (axons, astrocytes and myelin), 200 random pixels belonging to two ROIs of two samples were assessed. These were segmented first by the automatic algorithms, and twice by the two experts on separate sessions at least 4 weeks apart. Intra-rater and interrater reliability were evaluated for both experts. These are defined as agreement between repeated segmentations performed by the same person and between segmentations performed by different individuals, respectively. The agreement between the manual and the automatic segmentations was also quantified. All agreement scores were computed using Cohen's κ-statistic (Cohen, 1960), the variation of information (Meila, 2002), and Rand's index (Rand, 1971), between each pair of segmentations.

Scale dependency of the area fraction
Brain tissue has a heterogeneous V V spatial distribution. Our goal is to characterise V V local variations (i.e. intra-subject) for a population. Hence, we need not only accurate estimates but also to capture the changes across a sample. To define an appropriate scale for analysis, we investigated the trade-off between resolution and precision. First, we computed the standard error of the A A values. Second, we quantified our ability to find statistically significant differences between neighbouring regions. Both analyses were performed for five scales ranging from 118μm to 1884μm (512 pixels-8192 pixels). To this purpose, images were divided in squared windows of corresponding width s (Fig. 2). White matter regions of interest (ROIs) were drawn in each sample to avoid any bias in the results due to contributions from grey matter or artifacts in the images. Only windows that fitted entirely inside the ROI were considered for the analysis.
To compute the A A local variance at scale s, we subdivided each corresponding window into four equal sub-windows of width s=2 (see Fig. 2a). If the expected A A is homogeneous inside the window, the standard error of the local A A can be estimated by: where i indexes the window (width s) in the image, N is the total number of windows, k indicates the position of the sub-window, a i;k indicates the area fraction measurement of a sub-window, and a i ðsÞ is the mean of the four measurements inside a window, i.e. the area fraction measurement of the window. The global standard error for each scale is estimated by the pooled analysis of the local standard error of all windows by: Statistical hypothesis testing was used to assess the degree of differentiability between neighbouring regions. The null hypothesis (H 0 ) was formulated as a pair of neighbour windows having the same A A . We observed the percentage of rejections of H 0 for each scale over the total number of comparisons. This was tested for every pair of possible neighbours. For all scales, each window was divided into subwindows that were large enough to guarantee the independence of their A A values (i.e. much larger than the typical size of the structures within it). Sub-windows had fixed sizes of 29:44μm Â 29:44μm (128 Â 128 pixels) in axons and myelinated axons, and 58:88μm Â 58:88μm (256 Â 256 pixels) in astrocytes. A two-sample Student's t-test was performed between all A A values belonging to each pair of neighbouring windows to assess the validity of H 0 (see Fig. 2b). As the size of the sub-windows was constant, their number inside a window increased with the scale. Thus, incrementing the size of the windows provided more independent samples, which helped to distinguish smaller differences between the mean A A of neighbouring windows. An overall rule for the rejection of H 0 was defined by applying a False Discovery Rate (FDR) correction (Benjamini and Hochberg, 1995) over all comparisons made for each given scale.

Volume fraction from histology images
When assuming sections of negligible thickness in comparison to the microstructural elements within (infinitesimally thin planar cuts), it is straightforward to estimate V V from A A . In this case, it can be shown that the average V V is equal to the average A A following properties of the expectation of random variables (Ohser and Mucklich, 2000). Given we aimed to analyse relatively large samples from an extensive population, we used digitised whole slide images of paraffin-based histology. While this allows larger tissue expanses to be assessed efficiently in larger cohort sizes, it does not yield 2D infinitesimal planar cuts. The samples used in this study have non-negligible thickness (5μm) compared to the size of axons or astrocytes ($ 1 À 10μm). They are not infinitesimal planar cuts but thin sections and, A A and V V are unequal. The digitised images generated from these thin sections show projections of the structures within the slice (cf. Fig. 3). Complete 3D information from the projected image cannot be recovered. However, for basic characteristics, such as V V , we can find theoretical formulas relating the projected image to the thin section. These depend on the section thickness t and the structures' size, and are valid under specific assumptions.
The A A was computed as the number of positively segmented pixels over the total pixels in the window. Myelin sheaths surround some axons in white matter. In the images stained for myelin, the measured A A and hence the computed V V , correspond to the volume of myelinated axons (i.e. the outer myelin volume plus the inner axon volume). Based on the work of Miles (1976), Weibel et al. (Weibel and Paumgartner, 1978) derived factors for correcting the section thickness effect on volume density estimates. These factors apply to substrates composed of spheres and randomly oriented cylinders (see Eqs. (3) and (4)). We modelled axons and myelinated axons as randomly distributed cylinders. For astrocytes, we considered spheres with cylinders protruding from the surface (see Fig. 4b), representing the body and processes of the astrocyte, respectively.
The correction factors for cylinders and spheres are (Weibel and Paumgartner, 1978): where, λ ¼ L=d is the ratio between the length L of the cylinders with the diameter d, g ¼ t=d is the relative section thickness, and ξ ¼ V V =λ. EðR 2 Þ and EðR 3 Þ are the second and third moments of the spheres radius distribution. We considered the cylinders to be long for our scales of analysis (Alberts et al., 2003). In that case, K V;cyl;const was practically independent of λ, which we fixed to 100. ξ depends on the true V V , which is what we were trying to estimate. Hence, we implemented an iterative estimation of K V . We computed K 0 V as a function of ξðA A Þ, and repeated the process The main limitation in Eq.
(3) is that the cylinders are assumed to have an equal radius. This is not realistic if we plan to use them to model axons in brain tissue. To remove this assumption, we derived a more general correction factor (see Appendix A) that accounts for a size distribution in the cylinders' diameter. It is proportional to the factor with constant radius: K V;cyl ¼ K disp;cyl Â K V;cyl;const . The additional factor is where μ and σ are the mean and standard deviation of the diameter distribution. This factor is unitary when there is no dispersion in the cylinders' size.  We computed the volume fraction of astrocytes as the volume fraction of processes and bodies. Each was estimated independently from their corresponding area fractions: where, A A;proc and A A;bodies are the area fractions from the processes and bodies, respectively. These were computed by performing a segmentation of the astrocytes binary image that allowed us to separate the thin elongated structures (processes) from the larger and rounded ones (bodies). These correction factors depend on the size of the modeling structures. However, when analysing histological samples, the statistics of axons, myelinated axons, and astrocytes sizes are unknown. To overcome this issue, 100 axons or processes were manually selected from each binary image, and the mean and standard deviation of the structures' radii were estimated automatically based on structural tensor analysis (Bigun and Granlund, 1986). For the astrocytes' bodies, their comparable size to the slice thickness introduced an extra challenge. We found a correspondence between the distributions of the projected radii and the actual radii, by assuming the latter one to be a Gaussian distribution.

Validation of automated segmentation
The agreement scores computed with Cohen's κ-statistic, the variation of information, and Rand's index led to the same conclusions. Hence, we only present in Fig. 5 a summary of Cohen's κ-statistic for each structure.
For axons and astrocytes both methods had similar agreement scores with the experts. We chose the Otsu-based segmentation as it was the fastest for the computation. For myelinated axons, we chose the Otsubased segmentation as the score was significantly higher than using kmeans. For the three preparations the mean agreement score between the chosen algorithm and the experts was similar to the mean agreement between the two experts.
The segmentation accuracy was also qualitatively assessed in two randomly selected samples. For two ROIs from each sample, RGB images where overlaid with segmentation masks and visually examined by one expert. The expert confirmed that the segmentation results produced acceptable results. Supplementary Fig. S1 shows example images of the segmentation of the three structures.

Optimal scale selection
We computed the standard error of our A A estimates and assessed the significance of the differences between neighbouring regions for different scales. These analyses provided information on the changes in the area fraction we can detect at each scale, and their relation with the resolution-precision trade-off.
The statistical tests provided an answer to the question: "What proportion of adjacent windows present statistically significant differences at each scale?". Fig. 6b shows that for windows whose width is smaller than 236μm, we cannot distinguish neighbours. This is either because we have insufficient precision due to the small window size, or because the differences between windows was too small, or both. Fig. 6a shows that as resolution increases (i.e. decreasing the width of the windows), the standard error of the A A estimates is further increased. Our criteria was to select as an optimal scale the one which presented statistically significant differences between more than 5% of the possible neighbours. We found that neighbouring patches ($ 10% of them for all structures) presented statistically significant differences between them at a scale of 236μm and the accuracy of the A A estimation was reasonable. Thus, 236μm Â 236μm was the scale selected for the computation of A A maps for the three structures. Fig. 4. Confocal microscopy image of an astrocyte (Pascal, 2012) (a) with the sphere þ cylinders model (b). Randomly placed truncated spheres with constant radius r, and 2r ¼ t (c), and its corresponding projected image (d). Cohen's κ(Cohen, 1960) with their confidence intervals. The first two symbols represent the mean score corresponding to intra-rater and inter-rater reliability. The third and fourth symbols show the mean agreement between the four manual segmentations and the Otsu and k-means automatic segmentations.

Volume fraction distributions
Simulations of in silico substrates showed an excellent agreement between the true V V and that estimated from the A A of the projected image (see Appendix C). These comprised cylinders and spheres þ cylinders with size distributions similar to those in the structures of interest. The simulated slice thickness was the same as that of the histology images. The robustness of the full orientation dispersion assumption was also tested. We simulated cylinders which were not fully dispersed but with dispersion values typical of DWM (Chang et al., 2015). Results show that errors were low even in that scenario.
The local volume fractions, V V , were computed from the corresponding A A following the method described in Section 2.5. Then, these values were gathered for each group of subjects (control, lesion, and NAWM) to estimate the group-wise distribution of V V . Fig. 7 shows the estimated probability densities for the three structures. The group mean and the intra-group standard deviation are given in Table 2.
The group means and variances for axons across different tissue groups are remarkably similar. For astrocytes, however, the mean of the controls is 10% smaller than that of lesions. Finally, myelinated axons display the largest difference in means, especially for the lesion group. Fig. 8 shows the group means and the corresponding 95% confidence intervals. The differences in the group means were statistically tested with the Student's t-test. In axons, no significant differences were observed. In astrocytes, we found a significant increase in the lesion mean regarding control mean (p < 0:05). In myelinated axons, we found significant differences between lesion and control (p < 0:01), and between lesion and NAWM (p < 0:05).

Discussion
Nervous tissue comprises countless different structures such as neurons and glial cells. An accurate characterisation of their complex arrangement in healthy and lesional tissue can help in the understanding of neurological diseases. The focus of this work was to characterise the volume fraction distribution of axons, astrocytes, and myelinated axons in DWM, which has not been previously addressed in the literature. Populations with and without DSCLs were analysed, since this conditions can alter the distribution of the studied WM structures. The volume densities represented in Fig. 7 and their summary provided in Table 2 can be relevant prior information for a more efficient analysis of in vivo measurements of new patients.
When comparing the mean of the V V distributions we see an agreement with previous works on the differences observed between groups. We observe that DSCLs lead to a significant decrease in the myelinated axons fraction (Wharton et al., 2015). We also found an increase in the astrocytic fraction as reported in (Simpson et al., 2010). Unlike them, we detected statistical significant differences between lesion and control groups due to analysing a larger cohort. We did not find significant differences between groups in the mean fraction of axons, which agrees with the work in (Highley et al., 2014). These results demonstrate that DSCLs are associated with significant demyelination and astrogliosis, which may lead to axonal dysfunction and impact on central conducting pathways passing through the white matter. While control and normal appearing samples look similar on standard T2-weighted MRI scans, this analysis suggests potential (non-significant) astrogliosis and   Fig. 8. Mean V V for each group and structure with its corresponding 95% confidence intervals. * Indicates p < 0:05, ** indicates p < 0:01. demyelination in the NAWM which may be interpreted as reflecting lesion progression. Previous works have also focused on estimating the volume fraction of certain structures in the central nervous system. Histologically derived axonal mean volume fractions have been reported in the grey matter (Chklovskii et al., 2002), corpus callosum (Stikov et al., 2015), and spinal cord (Xu et al., 2014). Our results are considerably lower than the ones in these studies. The axonal mean volume fraction we obtained in DWM was around 9 À 12%, while the reported volume fractions in grey matter and corpus callosum were 30% approximately. This discrepancy is not so surprising since the axonal density is expected to be quite different in each of these regions. The corpus callosum for example, is characterized by a very compact tract of axons while the DWM contains shorter and thinner fibres. Additionally, this study was performed on subjects belonging to an ageing population, where decreased values of axonal (Calkins, 2013) and myelin (Peters, 2002) content have been reported. This age-related decrease agrees with further tests we performed on a corpus callosum sample belonging to the same age cohort (see Appendix D), observed in both transversal and tangential cuts. The potential relative errors introduced by the mapping from A A to V V are quite smaller (relative error < 10%, see Appendix C). Segmentation errors were around 7%;6%, and 10% for axons, astrocytes, and myelin, respectively. We were not able to quantify errors due to incomplete staining, however, images were subject to qualitative quality control check for artifacts, repeating any unsatisfactory image with an adjacent slice from the same sample block. Finally, complex axonal structures (i.e. curved axons with varying radius) may have different relations between A A and V V than the ones modelled, but they are expected to be negligible compared to the errors previously mentioned. In contrast, no significant changes with ageing in the astrocytic fraction were reported in (Sloane et al., 2000), which agrees with our results and previous measurements of astrocyte volume fractions in the corpus callosum, the centrum semiovale and the corticospinal tract (Mollink et al., 2017). Thus, with ageing, we seem to get closer proportions of axons and astrocyes.
We were interested in quantifying the local volume fraction across the tissue, as opposed to the global average volume fraction. Considering windows of observation of too large scales would not correctly capture the spatial variations. However, a too small scale would be dominated by statistical fluctuations reducing the precision of the V V estimations. The volume fraction cannot be defined for scales comparables to the size of the structures of interest. The appearing precision-resolution trade-off was analysed in a sequence of scales by computing the standard error of the A A estimations (precision) and the statistically significant differences observed between adjacent windows (resolution worth). The obtained insight led us to define an optimal scale (236μm Â 236μm) for computing A A distributions and, V V distributions.
The chosen histology technique enabled the acquisition of an extensive population of large samples for the current study. This had a considerable impact on the accuracy of the characterisation of the V V distribution of the population and in the detection of differences between groups. However, this adds the challenge of computing V V from A A . Many works that quantify volume fractions from histology images base their calculations on the stereological relationship V V ¼ A A . However, this is only valid if the thickness of the imaged sample is negligible when compared to the size of the structures of interest (Ohser and Mucklich, 2000). Fortunately, there exists some theory (Weibel and Paumgartner, 1978) tackling this problem when certain conditions are satisfied regarding the structures' geometry. This methodology was extended to allow the computation of the volume fraction of structures that present an unknown size distribution.
A first application of the results obtained in this work could be as benchmark for diffusion MRI techniques aiming to estimate the axonal volume fraction, such as those recently reviewed in , and newer approaches such as (Veraart et al., 2017;Reisert et al., 2017). Our results could also contribute to the definition of new biomarkers (Khan et al., 2016;Jelescu et al., 2016). Knowledge of the typical volume fractions of the compartments present in white matter is a must for generating realistic in silico phantoms. Most dMRI biophysical models compute compartments' water volume fractions weighted by the T 2 relaxation. Thus, their relationship needs to be appropriately modelled and care must be taken when comparing dMRI measures with the V V distributions. Finally, the characterisation of the V V distributions in the population could inform dMRI biophysical models as prior distributions to improve the accuracy of the estimated parameters (Clayden et al., 2016).
The framework introduced in this paper has the potential to aid diagnostic histopathologists and neuropathologists by providing a tool for automatic quantification of the volume fraction of specific WM structures. Digital pathology techniques can enhance pathologists precision in biomarker assessment and accelerate diagnosis. Hence, there is an increasing interest from this community in automated image analysis technologies that support histopathological assessment of tissue structure (see for example (Paulik et al., 2017;Xu et al., 2017;Barker et al., 2016)).
One of the limitations of this work is that the immersion of the samples in fixative may produce shrinking in the tissue. However, this effect would affect the estimation of the relative fractions only if differential shrinkage occurs of the different structures and extracellular space. We found no available literature quantifying this latter effect. The chosen histology technique enabled the acquisition of samples from 90 subjects for each preparation. But, one disadvantage of this methodology is this came at the cost of measuring the volume fractions indirectly. Projected statistics were computed and related to their 3D counterpart. Although we tested the accuracy of the volume fraction estimation in synthetic phantoms that slightly violated the models assumptions, the method could still introduce bias in the results if the considered models are not applicable to the real structures. Having thinner sections or high resolution 3D data would be useful to further benchmark this approach. Our results depend highly on the accuracy of the segmentation. However, the agreement between the automatic algorithm and the pathologists equaled the agreement between the two expert pathologists, which is considered satisfactory. Finally, another limitation was that it was not possible to separate the myelin fraction from the axon fraction in myelinated axons. When we look at myelin images, we see practically the same projection as if the cylindrical sheaths were filled. Hence, as it is not possible to compute and subtract the axonal fraction we report the total fraction (i.e. myelin plus axon volume).
Future work will focus on exploring the benefits of including these distributions into dMRI biophysical models as prior probabilities on the volume fraction of the intracellular compartment.

Conclusions
In this work, we have successfully computed the distribution of the local volume fraction of axons, astrocytes, and myelinated axons in DWM for the first time. These distributions were estimated specifically for an ageing population (CFAS), and are not generalisable to younger populations. Since these distributions depend potentially on DWM pathologies, control, normal appearing, and lesion groups were analysed. The optimal scale for performing this analysis was also assessed. As expected, results of this study show that DSCLs lead to a statistically significant decrease in the myelinated axons fraction and an increase in the astrocytic fraction. No statistically significant differences between the three groups were found for axons.
The obtained distributions of local volume fractions have multiple implications in developing accurate dMRI biophysical tissue models. They can inform models, benchmark their performance, or help to construct realistic in silico phantoms. Volume fraction distributions can also be useful for the neuropathological assessment of quantitative changes in healthy and diseased DWM.
We have also devised a framework that allows the computation of volume fraction maps from digitised whole slide images of paraffin-based histology. The optimal scale for applying such framework was also analysed as a balance between resolution and precision. Automatic image analyses that can support a histopathological assessment of tissue structure are a valuable tool for diagnostic histopathologists and neuropathologists.

Conflicts of interest
The authors declare no conflict of interest.

Acknowledgements
This work has been supported by the OCEAN project (EP/M006328/ 1) and MedIAN Network (EP/N026993/1) both funded by the Engineering and Physical Sciences Research Council and the European Commission FP7 Project VPH-DARE@IT (FP7- ICT-2011-9-601055 We would like to acknowledge the essential contribution of the liaison officers, the general practitioners, their staff, and nursing and residential home staff. We are grateful to our respondents and their families for their generous gift to medical research, which has made this study possible. The Nanozoomer XR scanner was funded by the bet365/ Denise Coates Foundation. The authors thank Dan Fillingham, Lynne Baxter and Rachel Waller from SITraN, and Leandro Beltrachini and Manu Raghavan-Sareena from CISTIB for their help in the acquisition and processing of the digitised histology images and in early discussions of this work.

Appendix A. Correction factor for cylinders' size dispersion
The relation between the projected area fraction and the volume fraction for randomly placed cylinders derived in (Weibel and Paumgartner, 1978) is given by: where the correction factor K V depends on the cylinders' radius r and the section thickness t. We will omit the dependence on t, as it will be constant for the remaining of our calculations. The main limitation of the equations provided in (Weibel and Paumgartner, 1978) is that cylinders are considered to have an equal radius. We are interested in substrates where the radius is given by a probability distribution. Let r 0 be the random variable that determines the cylinders' radii, A 0 A and V 0 V the corresponding area and volume fractions of such substrates. We would be interested in: where K 0 V is the quotient between the expected fractions, which are computed regarding the radius. Our goal was to derive a correction factor that considers not only the section thickness effect but also the contribution from the dispersion in the radii. The latter one is given by: We assumed that the projected area of a cylinder and its volume are approximately proportional to r and to r 2 , respectively. Hence, the expectation of the area and volume fraction regarding the radius will be approximately proportional to EðrÞ and Eðr 2 Þ: We consider that r 0 has mean μ and variance σ 2 , while r is constant and equal to μ. Hence, EðrÞ ¼ μ and Eðr 2 Þ ¼ μ 2 . Following this, the desired correction factor is given by: In the absence of dispersion in the cylinders' radius the size dispersion factor is equal to the unit. As σ 2 μ 2 increases the errors in V V estimations due to considering cylinders of equal radius become larger too. We tested the validity of this correction factor with simulations considering feasible ground truth values and results were satisfactory (see Appendix C for in silico validation).

Appendix B. Match between projected and true size statistics for spheres
We modelled astrocytes' bodies as spheres, where their radius R was a random variable. Each was considered randomly placed in an infinitely large thin section with thickness t (see Fig. 4 c). However, in the projected 2D images of this section (see Fig. 4 d), we observe circumferences with radii not drawn from R but from R proj : where Z is the coordinate of the centres of the spheres in the dimension of the slice thickness. We assumed that the radius distribution of the astrocytes bodies could be approximated by R $ N ðμ; σ 2 Þ, and that Z $ U ½ À r; t þ r. Then, we can compute the mean and variance of R proj : where ρ R ðrÞ and ρ Z ðzÞ are the probability density functions of R and Z, respectively. These two integrals provide the means to go from μ true ;σ true → μ proj ; σ proj . In the real scenario, we can only measure the projected values from segmented astrocyte images. Based on these assumptions, we computed numerically the inverse relation (i.e. μ proj ; σ proj → μ true ; σ true ) for obtaining the true distribution parameters from the projected ones. The Gaussian approximation was feasible because μ true =σ true ! 5.

Appendix C. In silico validation of volume fraction computation
We performed in silico experiments to test the validity of the methodology in (Weibel and Paumgartner, 1978) and the extensions in the methods section. Substrates composed of randomly placed spheres and cylinders (see Fig. C.1) were generated for various feasible A A values. From these, we computed their projected 2D binary images and then estimated the volume fractions. The dimensions of the synthetic 3D slices were 200μm Â 200μm Â 5μm. Figures C2 and C3 show the relative error in the estimation of spheres and cylinders volume fractions. The variance of the V V;sph estimates is larger than V V;cyl ones because the number of spheres simulated per section was much smaller than the number of cylinders. The feasible number of axons or astrocyte processes (cylinders) inside a window of observation are much larger than the typical number of astrocyte bodies (spheres). Hence, as the number of elements increases, the relative dispersion between the mean projected area and the mean volume gets smaller (see scatter plots in Fig. C.4 and C.5). The total volume fraction of astrocytes is mostly due to contribution from processes, hence, the variance in the error of the total volume is similar to the one for cylinders.
To test the robustness of the methodology we also performed synthetic experiments generating astrocytes that did not have straight processes. We used branching lengths equal to the bodies' radius, and the branching angles were drawn randomly from a uniform distribution in a cone with 20 degrees of aperture. Relative errors were also very small in this case. Relative error in the V V estimates for randomly placed spheres. The simulated radius distribution was Gaussian with μ sph ¼ 5μm and σ sph ¼ 1μm. Relative error in the V V estimates for randomly placed cylinders, assuming constant radius (blue boxes) and a radius distribution (red boxes). The simulated radius distribution was Gamma with μ cyl ¼ 0:5μm and σ cyl ¼ 0:1μm.  shown in green and red dots, respectively. The simulated radius distribution was Gamma with μ cyl ¼ 0:5μm and σ cyl ¼ 0:1μm.

Appendix D. Experiments on corpus callosum data
There are no previous works that measure volume fractions in human DWM. While this was one of the motivations for our work, it made validating our results very challenging. The lack of literature might be due to the almost isotropic nature of brain tissue orientation in DWM, which makes approaches like those in (Stikov et al., 2015;Xu et al., 2014) infeasible as they heavily rely on fibre orientation being perpendicular to the imaging plane. As an assessment of the consistency of our experiments, we performed some tests on a corpus callosum sample from the same ageing cohort. We cut two slices from the corpus callosum of one control subject. In the first one, axons were perpendicular to the slice plane, while on the second one they were parallel to it (see Fig. D.1). We computed the mean volume fraction (V V ) from the segmented image as explained in Section 2.5. However, as fibres in the corpus callosum cannot be assumed to be isotropically distributed, some adjustments were done to the V V estimation. Both slices were imaged following the procedure explained in the section 'Histology acquisition'. A semi-automatic segmentation was performed on both of them. Regions of interest (ROIs) were drawn in the middle of the corpus callosum, and the mean area fraction (A A ) was computed on both. For the slice where axons were perpendicular to the imaging plane, we used the stereological relation EðV V Þ ¼ EðA A Þ. To obtain the relation between V V and A A in the slice where axons were parallel to the imaging plane we performed an in silico experiment (like those shown in Appendix C). Parallel cylinders were randomly placed in a synthetic histological sample of 5μm thickness, and the paired values of (V V ,A A ) where computed for multiple realisations of varying densities (see Fig. D2). The computed V V value in the perpendicular slice was 0.226. For the parallel one, we obtained A A ¼ 0:699, which was mapped to V V ¼ 0:220 using a fourth order polynomial fit to the synthetic cases (see Fig. D.2). These results are in agreement between themselves and also show, as expected, a significant decrease with respect to the values reported in (Stikov et al., 2015). We attribute it to ageing as it has been previously reported in (Calkins, 2013).