Computational quantification of brain perivascular space morphologies: Associations with vascular risk factors and white matter hyperintensities. A study in the Lothian Birth Cohort 1936

Highlights • We assessed Perivascular Spaces (PVS) computationally in the centrum semiovale.• We measured total PVS volume and count, and individual PVS length, width, size.• Computational PVS measures correlated positively with PVS ratings.• PVS were associated with hypertension stroke and white matter hyperintensities.


Introduction
Perivascular spaces (PVS), sometimes known as Virchow-Robin spaces, are fluid-filled compartments surrounding the small perforating brain microvessels. They act as conduits for fluid transport, exchange between cerebrospinal fluid (CSF) and interstitial fluid (ISF), and clearance of waste products from the brain (Brown et al., 2018). PVS are seen on structural brain Magnetic Resonance Imaging (MRI) as thin linear or punctate structures (depending on scan orientation) of similar signal to CSF (Potter et al., 2015b;Ramirez et al., 2016), defined as having a diameter smaller than 3 mm Wardlaw et al., 2013).
PVS have been reported to increase in number on MRI, based on visual scores, with age, with other brain features of small vessel disease (SVD)  with vascular risk factors, especially hypertension, in common brain disorders including stroke, mild cognitive impairment, and dementia including of vascular subtype (Debette et al., 2019;Francis et al., 2019) Although many individual studies have reported associations between increased numbers of PVS and WMH, a recent meta-analysis (Francis et al., 2019) found no clear PVS-WMH association in adjusted analysis, possibly reflecting variation in populations, SVD lesion burden or PVS assessment methods (Debette et al., 2019;Francis et al., 2019) To date quantification of PVS on MRI to study associations with vascular risk factors and other imaging variables has mainly relied on qualitative visual scores (Potter et al., 2015a). Visual scores mostly categorise subjects into those with no, mild, moderate and abundant numbers of PVS in characteristic regions, namely basal ganglia, midbrain and centrum semiovale. While robust, these ordinal scales are inherently insensitive due to the limited number of categories, floor and ceiling effects, and may be affected by observer bias Potter et al. (2015a).
Tools for computational PVS quantification have been developed Ramirez et al., 2015;González-Castro et al., 2016;Wang et al., 2016;Ballerini et al., 2018;Boespflug et al., 2018;Dubost et al., 2019aDubost et al., , 2019b. Whereas some methods provide total PVS burden and/or count (Ramirez et al., 2015;Wang et al., 2016;Dubost et al., 2019b), depending on the detection method, it is possible to derive several additional metrics including the total count and total volume per subject's brain, plus the size, length, width, shape (Ballerini et al., 2018;Boespflug et al., 2018) and direction of each individual PVS (Ballerini et al., 2018), which can then be analysed as mean or median per subject or per brain region. The detailed size, shape and directionality metrics may increase sensitivity and/or specificity to detect PVS associations with vascular risk factors, brain disease and longitudinal change in brain lesions or structure. Furthermore, a reliable computational method may increase the efficiency and consistency of analysis in very large datasets, e.g. in population imaging studies. The computational method developed by Ballerini and colleagues (Ballerini et al., 2018) was able to assess PVS in the centrum semiovale in two small independent older age cohorts (age 64-72 years): individuals with a clinical diagnosis of dementia (n = 20), and patients who previously had minor stroke (n = 48), in whom there was good agreement between PVS visual rating and computational measures (Ballerini et al., , 2018. Here, we evaluate this PVS computational method in a large community-dwelling older age cohort scanned at age 73 years. We assess the agreement between the computationally-derived PVS metrics (total volume and count, individual size, length and width) and the 'reference standard' PVS visual score. We also evaluate associations between each of five new PVS measures and important vascular risk factors (hypertension, diabetes, plasma cholesterol), vascular disease history, and WMH burden on brain MRI.
All clinical and imaging acquisition methods, and the visual and computational assessment of WMH and PVS visual scores in this cohort have been reported previously (Deary et al., 2007;Wardlaw et al., 2011;Taylor et al., 2018). Briefly, structural brain MRI data were acquired using a 1.5-Tesla GE Signa Horizon HDx scanner (General Electric, Milwaukee, WI), with coronal T1-weighted (T1w), and axial T2-weighted (T2w), T2*-weighted (T2*w) and fluid-attenuated inversion recovery (FLAIR)-weighted whole-brain imaging sequences (details in Wardlaw et al. (2011)). Medical history variables (medically diagnosed hypertension, diabetes, hypercholesterolemia, cardiovascular disease history (CVD) and stroke) were assessed at the same age as brain imaging. A history of CVD included ischaemic heart disease, heart failure, valvular heart disease and atrial fibrillation. Stroke included clinically-diagnosed stroke and also those with any ischaemic or haemorrhagic stroke seen on MRI in subjects with no clinical history of stroke. All medical history variables were coded as a binary variables, indicating presence (1) or absence (0).
The validation of the PVS rating in this cohort, have been published previously Aribisala et al. (2014). The PVS rating scale was developed as a pragmatic visual categorisation tool in several different healthy and diseased populations of different ages, and following analysis of other published rating scores. It was tested and refined, and all details including the observer reliability have been published Potter et al. (2015a). An experienced neuroradiologist rated the PVS on T2w images in the whole sample, cross-checking against FLAIR and T1w to avoid rating lacunas or WMH as PVS, following the method of Potter et al. (Potter et al., 2015a). This rating identifies the closest category on the scale ranging from 0 (no PVS), 1 (mild; 1-10 PVS), 2 (moderate; 11-20 PVS), 3 (frequent; 21-40 PVS) or 4 (severe; >40 PVS) based on an estimate of the number of PVS seen in the slice considered to have more of them in the stated brain region (i.e. centrum semiovale). Another neuroradiologist, blind to these results, generated visual ratings from a random 20% of scans. In this subsample, intra-and inter-rater kappa statistics of these visual scores ranged from 0.68-0.90 as published in Aribisala et al. (2014).
WMH were also visually rated by a neuroradiologist, primarily in FLAIR, checking the T1-and T2w where necessary. Fazekas score was given for periventricular (PVH, 0-3) and deep white matter hyperintensities (DWMH, 0-3), then summed into a total WMH burden score (0-6) (Wardlaw et al., 2011. Another consultant neuroradiologist randomly cross-checked 20% of the WMH ratings, all scans with stroke lesions (n = 60) and any scans where the first rater was L. Ballerini, et al. NeuroImage: Clinical 25 (2020) 102120 uncertain (n = 50). The final scores were agreed after discussing the cases where discrepancies were found . Intracranial volume (ICV) and WMH volume were measured using a semi-automatic pipeline validated and published in full previously, which uses a multispectral data fusion of T1w, T2w, T2*w and FLAIR (Valdés-Hernández et al., 2010). All WMH masks were visually checked and edited. For this study we express WMH as percentage of ICV.
The PVS computational assessment used the T2w images acquired with: 11,320 ms repetition time, 104.9 ms echo time, 20.83 KHz bandwidth, 2 mm slice thickness, and 256 × 256 field-of-view. The images were reconstructed to a 256 × 256 × 80 matrix, 1 mm in-plane resolution. The binary mask of the centrum semiovale for each subject has been obtained by mapping the T2w MRI sequence of a representative case to the native T2w space of the subject under analysis using affine (linear) registration, and then applying the space transformation to the centrum semiovale mask of the representative case. PVS were segmented in the centrum semiovale using the computational method described in . Briefly, this method uses the three-dimensional Frangi filter to enhance and capture the 3D geometrical shape of PVS. The filter parameters were optimized using visual scores and ordered logit models to address the measurement uncertainty and the unequal class intervals of the rating scores. The MRI structural volumes were resliced to 1 mm isotropic voxels using an ad hoc interpolation that calculates the intensity of each new voxel as the average of the voxels directly above and below. The "vesselness" of each voxel was calculated at a given set of scales using the 3D Frangi filter. The responses of the filters were combined and thresholded. Details of the method, including optimized parameters (filter scales and threshold), were described in full Ballerini et al. (2016). PVS were identified using 3D connected component analysis as tubular structures with lengths between 3 and 50 mm. The Frangi filter, thanks to its optimal scales, mainly enhances structures whose width is between 0.5 and 2.5 voxels, and therefore impose a soft constraint on the PVS width. To distinguish PVS from WMH we also imposed a constraint on the maximum volume of each PVS of 1000 voxels, calculated as the approximate volume of a cylinder of radius 2.5 and length 50 voxels. Segmented images were saved as binary masks in the native T2w space for subsequent analysis.
PVS count was defined as the number of connected component objects in the segmented images, PVS volume was the total number of voxels classified as PVS. Individual PVS features (size, length, width) were also measured using connected component analysis. PVS 'size' was defined as the volume of each individual PVS to avoid confusion with PVS 'volume' which was the total volume of all the PVS in an individual subject. PVS 'length' was defined as the measure of the major axis in the ellipsoid and 'width' the second longest axis, perpendicular to the longest axis. See Fig. 1 for a schematic illustration on how these individual PVS metrics have been computed. Mean, median, standard deviation and percentiles of these features (i.e. length, width and size) were calculated for each subject. Prior to use in statistical analysis, the segmented binary masks, superimposed on the T2w images, were visually checked by a trained operator, and accepted or rejected blind to all other data. Acceptable images were those where the computational method was able to detect a reasonable amount of visible PVS, and did not detect too many artefacts as PVS (see Fig. 2). Other sequences (FLAIR and T1w) were checked in case of ambiguity and cases with WMH detected as PVS were excluded. A small amount of false positives and negatives was considered acceptable. A repeatability test was performed on a subset of the cases (n = 50). In this subsample, kappa statistics was 0.696 (std error 0.107, 95%CI [0.487 to 0.905]). Reasons for exclusions were: failed registration of the centrum semiovale (6%), noise (26%), and misclassified WMH (8%). All PVS measurements were calculated in the native T2w space.
WMH and PVS segmentations were separate procedures performed at different times by different operators blind to each other and to clinical variables, using MATLAB versions R2012b (WMH), and MATLAB R2014b (PVS). PVS width, shape and length were determined using the MATLAB function regionprops3 (version 1.3.0.0) from the MATLAB File exchange. WMH masks were visually checked on FLAIR and PVS masks on T2w, looking at other sequences as needed. The visual checking of segmented masks was performed separately and independently from visual rating. Examples of PVS and WMH segmentations are shown in Fig. 3.

Statistical analyses
First, we compared the proportion of the sample for which PVS computational measures were available to those who underwent MRI but did not have PVS data using Welch two sample t-tests and chisquared tests.
Next, univariate associations between each of the computational PVS measures and the visual rating scale were calculated. We also tested for differences between males and females in the computational PVS measures and visual ratings using Welch two sample t-tests and chisquared tests respectively.
Finally, we investigated the relationships between visually rated and computational PVS measures (volume, count, mean width, mean length & mean size) and a variety of outcome variables using generalized linear models. Specifically, we looked at the relation of each PVS measure with total WMH Fazekas visual rating scores, WMH volume as a percent of ICV, hypertension, hypercholesterolemia, diabetes, CVD history, previous stroke, and age.
For each outcome, we modelled the association with each PVS measure, controlling for key covariates, including age, sex, and hypertension (excluded from models where these were outcomes of interest). Given the measurement and observed distribution of the outcomes of interest, three different generalised linear models were applied. For Fazekas visual rating scores and age, a standard linear model was applied. For models with hypertension, hypercholesterolemia, diabetes CVD and stroke as outcomes, we fitted binary logistic regressions.
For WMH volume as a percent of ICV, which is both a proportion and heavily positively skewed, we applied beta regression with a logit link. The standard beta regression model does not allow values to be exactly zero or one. No values in the current data approached one, but several zero values were converted to small positive values using the Fig. 2. Examples of acceptable and not acceptable images. On the top row, blue arrows indicate missed PVS, probably due to a non-perfect registration of the centrum semiovale template; yellow arrows indicate possible false positive due to background texture and interface between white matter and grey matter. On the bottom row, images on the left and middle are rejected due to noise, right image due to small WMH identified as PVS.
following transform as per (Ferrari and Cribari-Neto, 2004): where n is the sample size. In order to compare the magnitude of the associations between each PVS measure and each outcome, we present point estimates and confidence intervals, and evaluate the models using the Akaike and Bayesian Information Criterion (AIC and BIC respectively). Each of the WMH measures was z-transformed prior to running of the models to ensure comparability of effects. Non-overlapping confidence intervals were taken as indicative evidence of significant difference in effects. Stronger effects were taken as evidence for the criterion validity of a particular WMH measure. Differences of approximately 10 for BIC, and smaller AIC values, between the visual rating models and any of the computational variables were taken as indicative of a practical improvement in the model (Raftery, 1995;Burnham and Anderson, 2002). We dealt with multiple comparisons as recommended by Perneger (1998). We transparently report all results, including those with borderline significance, the effect of adding or reducing covariates to the regression models, and discuss them.

Results
In total, PVS segmentation was classed as acceptable for 540/700 (77%) participants. The cases that could not be processed were mainly due to noise and to motion artefacts that appeared in the MRI data as parallel lines similar to PVS. See example of accepted and rejected images in Fig. 2. Whereas it is common to edit WMH masks, PVS are tiny and numerous making the masks nearly impossible to edit. Therefore we excluded 160 (23%) of the original sample in which artefacts were wrongly segmented as PVS. The participants with (n = 540) and without (n = 160) computational measures did not differ in the proportion of males ( All other data required for analyses were available for 533/540 participants. Therefore 533 was the final sample size for all subsequent analyses. Table 1 shows the descriptive statistics for each variable. WMH is shown as percentage of ICV. PVS volume is measured in voxels. Age is re-scaled to years. Of the 533, 249 (48%) were female, 254 (35%) had hypertension, 209 (39%) were hypercholesterolaemic, 51 (9.5%) had diabetes, 151 (28%) had CVD history and 94 (18%) had stroke.
The correlations between the visual rating and all computational PVS measures are provided in Table 2. The visual rating correlated positively with all computational PVS measures (r range = 0.47 to 0.61), with the highest correlation being with computational PVS volume (r = 0.61). Scatter plots of the agreements between visual rating and PVS count in one slice are shown in Fig. 4. The frequency of the visual scores and the computational PVS count in the axial slice having the highest number of PVS for each subject are shown in Fig. 5.
Segmented images are shown in Fig. 6. Distribution of PVS metrics are shown in Fig. 7.
There were no differences between males and females in computational PVS mean length (Welch t(528.62 . For both hypertension and stroke, although the other PVS measures showed consistent direction of effect (OR's ≥ 1.0), the 95% CI overlapped the line of no effect. There were no associations between PVS measures and cholesterol, diabetes or CVD. Further, there were no differences in AIC or BIC for any index beyond the noted threshold, a pattern that indicates no difference in the estimates across PVS measures. (See supplementary Table S1). Fig. 9 displays the standardized regression coefficients and 95% CIs for the models of PVS associations with age, Fazekas total WMH score and WMH volume. For age, within the very narrow age range of the cohort, all PVS measures had effects close to zero, with most CIs including zero except for a marginal association between PVS total volume and age (0.08, 0.02-0.14). For the associations between PVS with Fazekas total scores and WMH volume models, firstly there were clear associations for all PVS measures and WMH, with standardised betas ranging from 0.21 to 0.66 (Fazekas WMH score) and 0.14 to 0.43 (WMH volume). Secondly, the computational PVS total volume, mean length, mean width and mean size all showed significantly stronger associations with WMH Fazekas score (range β = 0.

Discussion
To our knowledge, this is the first study to compare multiple  measures of PVS enlargement, derived using a computational segmentation method on 1.5T MRI data, with vascular risk factors and WMH burden. The ability to derive multidimensional measures of PVS from conventional brain MRI represents a major advance for research in ageing, SVDs, and cognitive impacts. In this Scottish cohort of community-dwelling individuals at the beginning of their 8th decade of life, computational PVS measures and visual rating scores were moderately to highly correlated, but measures of individual PVS size were more strongly associated with stroke and WMH than number of PVS, with marginal to no associations with hypertension, diabetes, hypercholesterolemia or CVD. The cross-sectional association of increasing individual PVS mean width and size with more severe WMH is consistent with the hypothesis that PVS widening reflects small vessel endothelial dysfunction and impaired interstitial fluid drainage (Rasmussen et al., 2018) contributing to accumulating brain damage during ageing and in SVD (Brown et al., 2018), although further longitudinal studies are required to determine the direction of effect. The exact mechanisms of PVS enlargement are still unknown. Some hypotheses on the dysfunction of the interstitial fluid clearance and inflammation have already been presented, summarised in Brown et al.
; Rasmussen et al. (2018). Importantly, the detailed metrics provided by this computational PVS method facilitate analysis of large scale population studies as well as detailed focused interventional studies, both of which increase the scope for determining, in humans in vivo, how PVS and glymphatic system dysfunction contribute to age-related brain changes and common neurological disorders including stroke, SVD and dementia.
We found few to no associations between PVS metrics derived mainly in the centrum semiovale and several common vascular risk factors except very marginal associations of PVS width and size with hypertension. A systematic review (Francis et al., 2019), found associations between basal ganglia PVS rated visually and hypertension, but with significant between-study heterogeneity; in contrast, PVS in the centrum semiovale were not associated with hypertension (although the direction of effect was similar) Our findings, using computational PVS, agree with these findings, consistent with the observation that risk factors for PVS may differ by brain region. This could be due to regional variations in vessel and PVS anatomy with variations in fibrohyaline thickening, lipohyalinosis and cerebral amyloid angiopathy (Wardlaw et al., 2003), which in turn may affect vessel-brain fluid exchange and PVS morphology (Wardlaw et al., 2003;Hurford et al., 2014;Zhang et al., 2014).
Older age has been associated with increased PVS visual rating scores (Francis et al., 2019) but the narrow age range of the current community dwelling individuals may have restricted our ability to identify associations between PVS parameters and age.
The positive associations between increasing PVS metrics and WMH burden are in agreement with some previous studies (Doubal et al., 2010;Zhu et al., 2010;Aribisala et al., 2014;Potter et al., 2015b;Ramirez et al., 2015;Arba et al., 2016;Laveskog et al., 2018), although the association of PVS and WMH did not reach significance in the small subset of studies that could be included in a meta-analysis  (Francis et al., 2019). Contrary to the Rotterdam scan study (Dubost et al., 2019b) which found associations of PVS in basal ganglia and hippocampi, but not in the centrum semiovale, with WMH, we found positive associations in centrum semiovale. However, it is noteworthy that they used a visual and an automatic score whereas our computational measures include count but not a score. Indeed, the stronger PVS associations in our work were the measures that differ most from scores in that they reflect PVS geometry rather than count.
The study suggests that computational and visual rating methods both have strengths since they largely agree on rank order of PVS burden and thus the choice of PVS assessment method to use in future studies could be determined by the imaging characteristics. For instance, the computational method requires quasi isotropic T2w images, whereas the visual rating approach is more flexible and can be performed on 2D T2w imaging data acquired at low magnetic field strengths and with fewer slices. The computational method is less rater dependent and in principle should be more reproducible. Despite differences in computational methods and visual rating scales attempting to assess PVS, our results replicate the positive association between visual scores and computational methods found by other studies (Ramirez et al., 2015;Boespflug et al., 2018). This further supports the use of these methods in larger studies and provides an opportunity to quantify small changes in longitudinal studies.
Comparing our results to those obtained with previous computational methods, the overall PVS burden (total volume and count) of our cohort is higher than those reported by Boespflug et al. Boespflug et al. (2018) in an older population and by Ramirez et al. Ramirez et al. (2015) in individuals with a clinical diagnosis of dementia. However, subject-wise PVS mean width is comparable with that reported in a previous study Boespflug et al. (2018). Differences in reporting results prevents a full comparison and highlights the need for harmonization.
This study has some limitations. The visual rating puts the PVS burden in the region into one of five categories rather than providing a total count of PVS, while PVS detection by automated methods, although having the potential to turn categories into absolute total numbers, is still a relatively new technique which is far from perfect and subject to ongoing improvements, as are WMH detection methods.  L. Ballerini, et al. NeuroImage: Clinical 25 (2020) 102120 For instance, the segmentation of PVS in the centrum semiovale is not always accurate due to variation in gyral patterns that occasionally cause misclassification of PVS outside the centrum semiovale mask. The PVS segmentation method used works on the hemispheric white matter superior to the basal ganglia, and therefore the results only reflect associations with PVS in this major brain region but not the basal ganglia. Future extension of this method to other brain regions, such as the basal ganglia, may be possible. The second limitation was that it was only possible to obtain valid quantitative PVS measures in 77% of this sample of subjects, the main reasons for failure in the other 23% being image degradation due to movement artefact, a resulting small bias through loss of data from older subjects (on average subjects who did not contribute data were 51 days older) may have influenced the analysis of PVS morphology with age. The PVS method also could provide a measure of PVS shape and directionality, which we did not use in the present analysis in view of potential difficulties in interpretation. Also, for accuracy, measurements were done in native space. The non-isotropic nature of our images is also a limitation to the accuracy of the width and length measurements. While for bigger brain structures is common to convert from voxels to mm based on the voxel size, in the case of tiny structures as PVS such conversion to true measures would be an approximation. The resampling to isotropic images required to apply the Frangi filter could affect the reliability of the output. Different interpolation methods would have yielded different results, and would have perhaps produced different output, and the calculation of the volume occupied by PVS would have varied depending on the approach. The results, therefore, must be interpreted with caution. These limitations are partially due to the use of retrospective data, which were not optimized for PVS segmentation. A recommendation for future studies would be to acquire isotropic images to overcome these limitations. The region of interest selected also deserves reflection: although visual rating scales refer to the centrum semiovale (Francis et al., 2019), the clinical literature partly refers to the "white matter", which covers a more extensive region of deep white matter (Ramirez et al., 2016). More efforts in harmonising and validating a unified approach and its variations depending on the acquisition protocol to ensure reproducibility and consistency is needed. We used diagnosis of vascular risk factors rather than measures of blood pressure, blood glucose or lipids; it is possible that PVS metrics might show stronger associations with blood pressure and plasma measures in future analyses. The strengths include the large sample, the careful blinding of image analysis, the use of visual scores and computational metrics, and the robustness of the analyses of associations between imaging variables, accounting for relevant risk factors and vascular disease.
In conclusion, the metrics derived from this computational PVS segmentation could advance understanding the role of PVS. However, given limitations in the acquisition protocol, the PVS measurements used are only proxies of the PVS burden and characteristics in the centrum semiovale. Widening of PVS is thought to indicate stagnation of interstitial fluid drainage, deposition of cell and protein debris and increased blood brain barrier leakage, all contributors to white matter damage in SVD including amyloid angiopathy, and to secondary neurodegeneration. Knowledge of PVS is relevant in understanding the brain fluid dynamics underpinning dementia and stroke through the common denominator of SVD (Ramirez et al., 2016;Brown et al., 2018;Rasmussen et al., 2018). We thank the LBC1936 participants, the LBC1936 study research team, nurses at the Clinical Research Facility, radiographers and other staff at the Brain Research Imaging Centre (Edinburgh Imaging): a SINAPSE collaboration Centre.

Author's contributions
Sponsor's Role: The sponsors did not participate in the design, methods, subject recruitment, data collections, analysis or preparation of this manuscript

Declaration of Competing Interest
None. Fig. 8. Odds ratios and 95% confidence intervals for visually rated PVS and computational PVS count, volume, length, width and size associations with vascular risk factors, CVD (cardiovascular disease) history and stroke. The vertical dotted line indicates the line of no effect, i.e.1.